Synchronization: critical sections and mutexes
Implementing pipes so they work
First attempt:
#include <stddef.h>
enum { N = 8*1024 }; //8-kilobyte pipes
struct pipe {
size_t r, w; //read, write cursors
//r is at offset 0, w is at offset 4 (assuming a 4-byte size_t)
char buf[N]; //starts at offset 8
//assume we read and write 1 byte at a time
};
void writec(struct pipe *p, char c){
p->buf[p->w++ %N] = c;
}
char readc(struct pipe *p){
return p->buf[p->r++ %N];
}
Things that can go wrong
1) Read end can pass write end
2) Write end could wrap, pass read end
3) integer overflow in ++
4) multiple processes accessing the pipe simultaneously.
Overflow (3) is handled because size_t is unsigned: on overflow the cursor wraps around to 0, and it overflows "nicely" (cursor % N keeps advancing one slot at a time) if N is a power of 2.
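The overflow claim can be checked directly. The following is a minimal sketch, assuming N is a power of 2 as above; the helper names slot and used are made up for illustration and do not appear in the notes.

```c
#include <stddef.h>
#include <stdint.h>

enum { N = 8 * 1024 };  // must be a power of 2 for the wraparound to be harmless

// Buffer slot that a cursor value refers to.
static size_t slot(size_t cursor) {
    return cursor % N;
}

// Number of bytes currently in the pipe. Unsigned subtraction is done
// modulo SIZE_MAX + 1, so the result is still right after w has wrapped
// around to 0 while r has not yet wrapped.
static size_t used(size_t r, size_t w) {
    return w - r;
}
```

Because SIZE_MAX + 1 is itself a multiple of N, the slot after slot(SIZE_MAX) is slot(0), exactly as if the cursor had never overflowed. With an N that is not a power of 2, the slot sequence would jump at the wrap.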
First rewrite:
void writec(struct pipe * p, char c) {
while (p->w - p->r == N) //pipe is full
continue; //you wait
p->buf[p->w++ % N] = c;
}
char readc(struct pipe *p){
while (p->w - p->r == 0) //wait while the pipe is empty
continue;
return p->buf[p->r++ % N];
//still doesn't handle overflow
}
Second rewrite:
void writec(struct pipe * p, char c) {
while (p->w - p->r == N) //pipe is full
continue; //you wait
p->buf[p->w++ % N] = c;
if(p->w == 2*N){
p->w -= N;
p->r -= N;
}
}
Except for efficiency, problems 1) and 2) are handled.
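To exercise the full and empty checks without hanging a single-threaded test, here is a sketch of non-blocking variants. The names try_writec and try_readc are hypothetical; they return false where the versions above would spin. The renormalization step is the one from the second rewrite.

```c
#include <stdbool.h>
#include <stddef.h>

enum { N = 8 * 1024 };

struct pipe {
    size_t r, w;   // read, write cursors
    char buf[N];
};

// Like writec, but reports failure instead of busy-waiting when the pipe is full.
static bool try_writec(struct pipe *p, char c) {
    if (p->w - p->r == N)
        return false;              // full: writing would pass the read end
    p->buf[p->w++ % N] = c;
    if (p->w == 2 * N) {           // keep the cursors small; w - r <= N, so r >= N here
        p->w -= N;
        p->r -= N;
    }
    return true;
}

// Like readc, but reports failure instead of busy-waiting when the pipe is empty.
static bool try_readc(struct pipe *p, char *out) {
    if (p->w - p->r == 0)
        return false;              // empty: reading would pass the write end
    *out = p->buf[p->r++ % N];
    return true;
}
```

Subtracting N from both cursors changes neither w - r nor either cursor's value % N, so the pipe's contents are unaffected by the renormalization.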
Making things work with multiple processes accessing the pipe simultaneously.
- the situation with no readers works
Problem 1: the increment is safe if it compiles to a single instruction, but it typically takes several instructions, and those can be interleaved with instructions from other threads.
Possible actual machine code:
load 4(r1), r2 %get p->w from memory
add $1, r2, r3 %increment
store r3, 4(r1) %store new value of p->w
and $8191, r2, r2 %take old value % N
store r0, 8(r1, r2) %store character (buf starts at offset 8)
The way this might work, with p->w = 100 at the start (each row is one step; time runs downward):
Thread 1                 Thread 2
r2 = 100
r3 = 101
                         r2 = 100
                         r3 = 101
p->w = 101
                         p->w = 101
r2 = 100 (no effect)
                         r2 = 100 (no effect)
                         p->buf[100] = c2
p->buf[100] = c1
One of the characters is lost.
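The lost character can be reproduced deterministically by replaying one bad interleaving by hand. In this sketch each thread's registers are ordinary local variables, and the function name lost_update is made up for illustration. It simulates the schedule in which both threads load p->w before either stores it; it is not real concurrent code.

```c
#include <stddef.h>

enum { N = 8 * 1024 };

struct pipe {
    size_t r, w;
    char buf[N];
};

// Replays two writec calls whose instructions are interleaved so that both
// "threads" read the same old value of p->w.
static void lost_update(struct pipe *p, char c1, char c2) {
    size_t t1_old, t1_new, t2_old, t2_new;

    t1_old = p->w;            // thread 1: load p->w
    t1_new = t1_old + 1;      // thread 1: increment in a register
    t2_old = p->w;            // thread 2: load p->w (still the old value)
    t2_new = t2_old + 1;      // thread 2: increment in a register
    p->w = t1_new;            // thread 1: store new p->w
    p->w = t2_new;            // thread 2: store the same new p->w
    p->buf[t1_old % N] = c1;  // thread 1: store its character
    p->buf[t2_old % N] = c2;  // thread 2: overwrites thread 1's character
}
```

After the call, p->w has advanced by only 1 even though two characters were written, and only c2 is in the buffer.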
The core of the problem: the line:
p->buf[p->w++ % N] = c;
contains multiple side effects (the store into buf and the update of p->w), and the order in which they are carried out is undefined.
Rewrite 2:
void writec(struct pipe *p, char c) {
while(p->w - p->r == N)
continue;
size_t w = p->w;
size_t new_w = w + 1;
p->buf[w % N] = c; //tell the compiler "don't optimize this!" (e.g. gcc -O0)
p->w = new_w;
// overflow check code goes here!
}
Make analogous changes to readc.
If we make assumptions that:
1) loads and stores are carefully ordered (they take effect in the order the code wrote them)
2) loads and stores are atomic (READ-WRITE COHERENCE)
3) it's OK to busy-wait
Then the situation with 1 reader and 1 writer works.
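In C11, one way to state assumptions 1) and 2) explicitly, rather than relying on the compiler and hardware, is <stdatomic.h>. This is a swapped-in technique, not the approach taken in the notes. The sketch below assumes exactly one reader thread and one writer thread, and the spsc_ names are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

enum { N = 8 * 1024 };

struct spsc_pipe {
    _Atomic size_t r, w;   // atomic loads and stores (assumption 2)
    char buf[N];
};

// Called by the single writer only.
void spsc_writec(struct spsc_pipe *p, char c) {
    // Only the writer changes w, so a relaxed load of our own cursor is enough.
    size_t w = atomic_load_explicit(&p->w, memory_order_relaxed);
    while (w - atomic_load_explicit(&p->r, memory_order_acquire) == N)
        continue;                               // pipe is full: busy-wait
    p->buf[w % N] = c;
    // Release: the character is visible before the new w is (assumption 1).
    atomic_store_explicit(&p->w, w + 1, memory_order_release);
}

// Called by the single reader only.
char spsc_readc(struct spsc_pipe *p) {
    size_t r = atomic_load_explicit(&p->r, memory_order_relaxed);
    while (atomic_load_explicit(&p->w, memory_order_acquire) - r == 0)
        continue;                               // pipe is empty: busy-wait
    char c = p->buf[r % N];
    // Release: the slot is read before the writer is allowed to reuse it.
    atomic_store_explicit(&p->r, r + 1, memory_order_release);
    return c;
}
```

The cursors are left to wrap around on their own here, which is safe for unsigned values when N is a power of 2. With two writers or two readers this code is still broken, for the reason shown in the interleaving above.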
Attacking the problem of multiple readers and writers.
A critical section:
-a series of instructions that at most 1 processor should be executing at any given time.
We need to enforce this somehow.
This is a problem if
1) single processor, but preemptive multitasking
2) multiple processors, 1 thread each
If neither applies, there's no problem (assuming non-preemptive multitasking).
2 subproblems:
A)MUTUAL EXCLUSION
B)BOUNDED WAIT
if a thread wants in, it should get in quickly (<= 5 sec)
avoids starvation
Solution: make whole readc and writec a critical section.
Will prevent incorrect data (but it can loop forever if there's nothing in the pipe)
But... you don't want critical sections to be too large
- other threads can't do useful work
- you might even starve them.
You don't want critical sections to be too small
- you get races
MINIMAL CRITICAL SECTIONS:
- avoid races
- if you make them smaller, you don't avoid races
Make the following a critical section:
size_t w = p->w;
size_t new_w = w + 1;
p->buf[w % N] = c;
p->w = new_w;
There's still a problem:
while(p->w - p->r == N)
continue;
This check sits outside the critical section. Another thread may intervene after the check and fill the pipe, so there's no more room to write, but the function already made the check, so it writes anyway!
Rewrite 3:
void writec(struct pipe* p, char c) {
for (;;) {
disable_interrupts(); //critical section starts (must be fast)
if (p->w - p->r != N) {
p->buf[p->w++ % N] = c;
enable_interrupts(); //critical section ends
return;
}
enable_interrupts(); //pipe is full: let other threads run, then retry
}
}
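The notes show only writec. A sketch of the matching readc follows the same pattern. Real disable_interrupts and enable_interrupts are kernel and hardware specific, so they are replaced here by stand-in stubs that only track a flag. The flag, interrupts_on, is an assumption added so the pairing of the two calls can be checked.

```c
#include <stdbool.h>
#include <stddef.h>

enum { N = 8 * 1024 };

struct pipe {
    size_t r, w;
    char buf[N];
};

// Stand-in stubs (assumption): a real kernel would mask hardware interrupts.
static bool interrupts_on = true;
static void disable_interrupts(void) { interrupts_on = false; }
static void enable_interrupts(void)  { interrupts_on = true; }

char readc(struct pipe *p) {
    for (;;) {
        disable_interrupts();            // critical section starts
        if (p->w - p->r != 0) {          // check and read in the same critical section
            char c = p->buf[p->r++ % N];
            enable_interrupts();         // critical section ends
            return c;
        }
        enable_interrupts();             // pipe is empty: let a writer run, then retry
    }
}
```

Every path out of the critical section re-enables interrupts, including the retry path. Forgetting that one would leave the only CPU spinning with no way for a writer to run.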
This works for a single CPU case, with preemptive multitasking.
For multiple CPUs, use a MUTEX:
typedef ? mutex_t;
void lock (mutex_t *); //grab control of mutex, twiddling thumbs while waiting
void unlock( mutex_t *);
Implementing locks:
typedef int mutex_t;
void unlock(mutex_t* m) { *m = 0;}
void lock (mutex_t * m) {
while (*m)
continue;
*m = 1;
}
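However, this can get interrupted between the test (while (*m)) and the assignment (*m = 1), so two threads can both see 0 and both take the lock!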
The x86 processor has an instruction:
xchg %ebx, (%eax) atomic (but slow)
It lets us build test_and_set, which functions as follows, as a single atomic step:
int test_and_set(int *m, int n){
int old = *m;
*m = n;
return old;
}
Re-implement lock:
void lock (mutex_t * m) {
while (test_and_set(m,1) == 1)
continue;
*m = 1;
}
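In portable C11 the same idea is available without writing assembly: atomic_flag from <stdatomic.h> provides an atomic test-and-set. The sketch below assumes mutex_t may be an atomic_flag rather than the int used in the notes. try_lock is a hypothetical helper added only so the behavior can be checked without spinning.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef atomic_flag mutex_t;   // assumption: the notes use int

// Spin until the flag was previously clear, i.e. until we are the one who set it.
void lock(mutex_t *m) {
    while (atomic_flag_test_and_set_explicit(m, memory_order_acquire))
        continue;
}

void unlock(mutex_t *m) {
    atomic_flag_clear_explicit(m, memory_order_release);
}

// Non-blocking variant: returns true if the lock was acquired.
bool try_lock(mutex_t *m) {
    return !atomic_flag_test_and_set_explicit(m, memory_order_acquire);
}
```

A mutex declared as mutex_t m = ATOMIC_FLAG_INIT; starts out unlocked. No separate store is needed after the loop, because the successful test-and-set has already set the flag.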
Replace disable_interrupts() in readc and writec with
lock(&m);
and enable_interrupts() with
unlock(&m);
This locks at the wrong level (a single, global lock).
COARSE-GRAINED LOCK
Single lock that governs many resources;
Simpler, easier to program
FINER-GRAINED LOCKS
govern few resources
+ better utilization
Add to pipe structure a new field:
mutex_t m;
in readc, writec:
lock(&p->m);
unlock(&p->m);
For even finer-grained locks, we can have separate locks for reading and writing.
Add to pipe structure a new field:
mutex_t rm, wm;
in readc, writec:
lock(&p->wm); //in write
lock(&p->rm); //in read
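Put together, the two-lock version might look like the sketch below. Several things here are assumptions beyond the notes: the lock is a C11 atomic_flag spin lock, pipe_init is a hypothetical helper, and the cursors are made _Atomic because a writer reads r without holding rm and a reader reads w without holding wm.

```c
#include <stdatomic.h>
#include <stddef.h>

enum { N = 8 * 1024 };

typedef atomic_flag mutex_t;   // assumption: spin lock built on C11 atomic_flag
static void lock(mutex_t *m)   { while (atomic_flag_test_and_set(m)) continue; }
static void unlock(mutex_t *m) { atomic_flag_clear(m); }

struct pipe {
    _Atomic size_t r, w;   // each cursor is also read by the other side
    char buf[N];
    mutex_t rm, wm;        // rm serializes readers, wm serializes writers
};

void pipe_init(struct pipe *p) {
    atomic_init(&p->r, 0);
    atomic_init(&p->w, 0);
    atomic_flag_clear(&p->rm);
    atomic_flag_clear(&p->wm);
}

void writec(struct pipe *p, char c) {
    for (;;) {
        lock(&p->wm);                          // only other writers are excluded
        size_t w = atomic_load(&p->w);
        if (w - atomic_load(&p->r) != N) {     // room available
            p->buf[w % N] = c;
            atomic_store(&p->w, w + 1);        // publish after the character is stored
            unlock(&p->wm);
            return;
        }
        unlock(&p->wm);                        // full: release and retry
    }
}

char readc(struct pipe *p) {
    for (;;) {
        lock(&p->rm);                          // only other readers are excluded
        size_t r = atomic_load(&p->r);
        if (atomic_load(&p->w) - r != 0) {     // data available
            char c = p->buf[r % N];
            atomic_store(&p->r, r + 1);        // free the slot after reading it
            unlock(&p->rm);
            return c;
        }
        unlock(&p->rm);                        // empty: release and retry
    }
}
```

With this split a reader and a writer can work on the same pipe at the same time. Each lock only reduces its side to the 1-reader, 1-writer case, which then depends on the ordered, atomic loads and stores assumed earlier.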