• sequential programming, VLIW concept, desire to emulate the real world with parallel threads, free-of-charge exploitation of multiple cores (eight per myth machine, eight per rice machine), pros and cons of **thread**ing versus **fork**ing.
  • C++ **thread**s, thread construction using function pointers, blocks, functors, join, detach, race conditions, mutex, IA32 implementation of lock and unlock, spin-lock, busy waiting, preemptive versus cooperative multithreading, yield, **sleep****_for**.
  • condition variables, rendezvous and thread communication, **lock_guard**, wait, notify_one, notify_all, deadlock, busy waiting.
  • semaphore concept and **s****emaphore** implementation, generalized counter, pros and cons of semaphore versus exposed condition variables, thread pools, cost of threads versus processes.
  • active threads, blocked threads, ready thread queue, high-level implementation details of the thread manager, mutex, and condition_variable_any.
  • pure C alternatives via **pthread**s, pros of **pthread**s over C++ thread package.

Logical control flows are concurrent if they overlap in time. This general phenomenon, known as concurrency, shows up at many different levels of a computer system. Hardware exception handlers, processes, and Linux signal handlers are all familiar examples.

Modern operating systems provide three basic approaches for building concurrent programs:

  • Processes
  • I/O multiplexing
  • Threads

Concurrent Programming with Processes

Multiprocessing overview

Program: code you write to execute tasks

Process: an instance of your program running; consists of program and execution state. Each process has a unique PID.

Multiprocessing : multiple processes can run the same program

fork() creates a clone of the current process, and they run concurrently

1
pid_t fork();
  • parent (original) process forks off a child (new) process
  • The child starts execution on the next program instruction. The parent continues execution with the next program instruction. The order in which process will execute from now on is up to the OS! Nondeterministic ordering of execution across processes, but a parent can wait for its children to terminate using waitpid().
  • fork() is called once, but returns twice.
    • In the parent, fork() will return the PID of the child (only way for parent to get child’s PID)
    • In the child, fork() will return 0 (this is not the child’s PID, it’s just 0)
    • if fork() returns < 0, that means an error occurred
  • A process can use getppid() to get the PID of its parent, getpid() returns the PID of the current process.
  • Everything is duplicated in the child process
    • File descriptor table (increasing reference counts on open file table entries)
    • Mapped memory regions (the address space)
    • Regions like stack, heap, etc. are copied

The parent process’ file descriptor table is cloned on fork and the reference counts within the relevant open file table entries are incremented. This explains how the child can still output to the same terminal!

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
int main(int argc, char *argv[]) {
    char str[128];
    strcpy(str, "Hello");
    printf("str's address is %p\n", str);
    
    pid_t pid = fork();
    
    if (pid == 0) {
        // The child should modify str
        printf("I am the child. str's address is %p\n", str);
        strcpy(str, "Howdy");
        printf("I am the child and I changed str to %s. str's address is still %p\n", str, str);
    } else {
        // The parent should sleep and print out str
        printf("I am the parent. str's address is %p\n", str);
        printf("I am the parent, and I'm going to sleep for 2 seconds.\n");
        sleep(2);
        printf("I am the parent. I just woke up. str's address is %p, and its value is %s\n", str, str);
    }

    return 0;
}
1
2
3
4
5
6
7
$ ./fork-copy
str's address is 0x7ffc8cfa9990
I am the parent. str's address is 0x7ffc8cfa9990
I am the parent, and I'm going to sleep for 2 seconds.
I am the child. str's address is 0x7ffc8cfa9990
I am the child and I changed str to Howdy. str's address is still 0x7ffc8cfa9990
I am the parent. I just woke up. str's address is 0x7ffc8cfa9990, and its value is Hello

How can the parent and child use the same address to store different data?

  • Each program thinks it is given all memory addresses to use
  • The operating system maps these virtual addresses to physical addresses
  • When a process forks, its virtual address space stays the same
  • The operating system will map the child’s virtual addresses to different physical addresses than for the parent

Isn’t it expensive to make copies of all memory when forking?

  • The operating system only lazily makes copies.
  • It will have them share physical addresses until one of them changes its memory contents to be different than the other.
  • This is called copy on write (only make copies when they are written to).

waitpid() is a function that a parent can call to wait for its child to exit.

1
pid_t waitpid(pid_t pid, int *status, int options);
  • pid: the PID of the child to wait on
  • status: where to put info about the child’s termination (or NULL)
  • options: optional flags to customize behavior (always 0 for now)
  • the function returns when the specified child process exits
  • the return value is the PID of the child that exited, or -1 on error (e.g. no child to wait on)
  • If the child process has already exited, this returns immediately - otherwise, it blocks
  • We can use WIFEXITED and WEXITSTATUS (among others) to extract info from the status.
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
#include <unistd.h>
#include <stdio.h>
#include <sys/wait.h>
#include "exit-utils.h"

static const int kForkFailed = 1;
static const int kWaitFailed = 2;

int main(int argc, char *argv[]) {
    pid_t pid = fork();
    exitIf(pid == -1, kForkFailed, stderr, "Fork function failed.\n");
    if (pid == 0) {
        printf("I'm the child, and the parent will wait up for me.\n");
        return 110; // contrived exit status (not a bad number, though)
    } else {
        int status;
        int result = waitpid(pid, &status, 0);
        exitIf(result != pid, kWaitFailed,
                   stderr, "Parent's wait for child process with pid %d failed.\n", pid);
        if (WIFEXITED(status)) {
            printf("Child exited with status %d.\n", WEXITSTATUS(status));
        } else {
            printf("Child terminated abnormally.\n");
        }
        return 0;
    }
}

Debugging Multiprocess Programs

How do I debug two processes at once? **gdb** has built-in support for debugging multiple processes

  • 1
    
    set detach-on-fork off
    
    • This tells **gdb** to capture any fork’d processes, though it pauses them upon the **fork**.
  • 1
    
    info inferiors
    
    • This lists the processes that **gdb** has captured.
  • 1
    
    inferior X
    
    • Switch to a different process to debug it.
  • 1
    
    detach inferior X
    
    • Tell **gdb** to stop watching the process, and continue it

examples:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
$ gdb basic-fork
The target architecture is assumed to be i386:x86-64
Reading symbols from basic-fork...done.
(gdb) set detach-on-fork off
(gdb) b 17
Breakpoint 1 at 0x400b61: file basic-fork.c, line 17.
(gdb) run
Starting program: /afs/.ir.stanford.edu/users/c/g/cgregg/cs110/winter-2019/lecture-examples/processes/basic-fork
Greetings from process 11750! (parent 11748)

Breakpoint 1, main (argc=1, argv=0x7fffffffea08) at basic-fork.c:17
17    pid_t pid = fork();
(gdb) info inferior
  Num  Description       Executable
* 1    process 11750     /afs/.ir.stanford.edu/users/c/g/cgregg/cs110/winter-2019/lecture-examples/processes/basic-fork
(gdb) n
[New process 11754]
Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.23.so...done.
Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/ld-2.23.so...done.
18    exitIf(pid == -1, kForkFailed, stderr, "fork function failed.\n");
(gdb) info inferior
  Num  Description       Executable
* 1    process 11750     /afs/.ir.stanford.edu/users/c/g/cgregg/cs110/winter-2019/lecture-examples/processes/basic-fork
  2    process 11754     /afs/.ir.stanford.edu/users/c/g/cgregg/cs110/winter-2019/lecture-examples/processes/basic-fork
(gdb) n
19    printf("Bye-bye from process %d! (parent %d)\n", getpid(), getppid());
(gdb) n
Bye-bye from process 11750! (parent 11748)
20    return 0;
(gdb) info inferior
  Num  Description       Executable
* 1    process 11750     /afs/.ir.stanford.edu/users/c/g/cgregg/cs110/winter-2019/lecture-examples/processes/basic-fork
  2    process 11754     /afs/.ir.stanford.edu/users/c/g/cgregg/cs110/winter-2019/lecture-examples/processes/basic-fork
(gdb) inferior 2
[Switching to inferior 2 [process 11754] (/afs/.ir.stanford.edu/users/c/g/cgregg/cs110/winter-2019/lecture-examples/processes/basic-fork)]
[Switching to thread 2.1 (process 11754)]
#0  0x00007ffff7ad941a in __libc_fork () at ../sysdeps/nptl/fork.c:145
145 ../sysdeps/nptl/fork.c: No such file or directory.
(gdb) where
#0  0x00007ffff7ad941a in __libc_fork () at ../sysdeps/nptl/fork.c:145
#1  0x0000000000400b66 in main (argc=1, argv=0x7fffffffea08) at basic-fork.c:17
(gdb) finish
Run till exit from #0  0x00007ffff7ad941a in __libc_fork () at ../sysdeps/nptl/fork.c:145
0x0000000000400b66 in main (argc=1, argv=0x7fffffffea08) at basic-fork.c:17
17    pid_t pid = fork();
Value returned is $1 = 0
(gdb) n
18    exitIf(pid == -1, kForkFailed, stderr, "fork function failed.\n");
(gdb) n
19    printf("Bye-bye from process %d! (parent %d)\n", getpid(), getppid());
(gdb) n
Bye-bye from process 11754! (parent 11750)
20    return 0;
(gdb) detach inferior 1
Detaching from program: /afs/.ir.stanford.edu/users/c/g/cgregg/cs110/winter-2019/lecture-examples/processes/basic-fork, process 11750
(gdb) info inferior
  Num  Description       Executable
* 1    <null>            /afs/.ir.stanford.edu/users/c/g/cgregg/cs110/winter-2019/lecture-examples/processes/basic-fork
  2    process 11754     /afs/.ir.stanford.edu/users/c/g/cgregg/cs110/winter-2019/lecture-examples/processes/basic-fork
(gdb) inferior 2
[Switching to inferior 2 [process 11754] (/afs/.ir.stanford.edu/users/c/g/cgregg/cs110/winter-2019/lecture-examples/processes/basic-fork)]
[Switching to thread 2.1 (process 11754)]
#0  main (argc=1, argv=0x7fffffffea08) at basic-fork.c:20
20    return 0;
(gdb) cont
Continuing.
[Inferior 2 (process 11754) exited normally]
(gdb)

Analysis fork()

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
int main(int argc, char *argv[]) {
    printf("Starting the program\n");
    pid_t pidOrZero1 = fork();
    pid_t pidOrZero2 = fork();
    
    if (pidOrZero1 != 0 && pidOrZero2 != 0) {
        printf("Hello\n");
    }

    if (pidOrZero2 != 0) {
        printf("Hi there\n");
    }

    return 0;
}
1
2
3
                                  Parent(pidOrZero1=nonzero, pidOrZero2=nonzero)
            Child1(pidOrZero1=0, pidOrZero2=nonzero)                         Child2(pidOrZero1=nonzero, pidOrZero2=0)
            Grandchild(pidOrZero1=0, pidOrZero2=0)

Inter-process communications

  • The waitpid function and signals are primitive IPC mechanisms that allow processes to send tiny messages to processes running on the same host.
  • The sockets interface is an important form of IPC that allows processes on different hosts to exchange arbitrary byte streams.
  • Examples include pipes, FIFOs, System V shared memory, and System V semaphores.

Signals

Race Conditions

Pros and Cons of Processes

  • Processes have a clean model for sharing state information between parents and children: file tables are shared and user address spaces are not.
  • It is impossible for one process to accidentally overwrite the virtual memory of another process, which eliminates a lot of confusing failures-an obvious advantage.
  • Separate address spaces make it more difficult for processes to share state information. To share information, they must use explicit IPC (interprocess communications) mechanisms.
  • Another disadvantage of process-based designs is that they tend to be slower because the overhead for process control and IPC is high.

A Concurrent Server Based on Processes

After accepting the connection request, the server forks a child, which gets a complete copy of the server’s descriptor table. The child closes its copy of Listening descriptor 3 , and the parent closes its copy of connected descriptor 4, since they are no longer needed.

1
gcc -o echoserverp -I . echoserverp.c csapp.c  echo.c
1
gcc -o echoclient -I . echoclient.c csapp.c
1
./echoserverp 9999
1
./echoclient 127.0.0.1 9999

Concurrent Programming with I/O Multiplexing

I/O multiplexing is not the only way to write an event-driven program. Event-driven programs also can be based on threads.

Pros and Cons of 1/0 Multiplexing

Concurrent Programming with Threads

Pros and Cons of 1/0 Threads

Reference