
Synchronization: critical sections and mutexes


Implementing pipes so they work

 

First attempt:

#include <stddef.h>

enum { N = 8*1024 };  // 8-kilobyte pipe

struct pipe {
   size_t r, w;   // read, write cursors;
                  // r is at offset 0, w at offset 4
   char buf[N];   // starts at offset 8; must hold N bytes
                  // for the modular indexing below to work
   // assume we read and write 1 byte at a time
};

 

void writec(struct pipe *p, char c){

   p->buf[p->w++ %N] = c;

}

 

char readc(struct pipe *p){

   return p->buf[p->r++ %N];

}

 

Things that can go wrong

        1) Read end can pass write end

        2) Write end could wrap, pass read end

        3) integer overflow in ++

        4) multiple processes accessing the pipe simultaneously.

 

Overflow is solved by size_t being unsigned: on overflow it wraps around to 0, i.e. it overflows "nicely", so the difference w - r stays correct, provided N is a power of 2.

 

First rewrite:

void writec(struct pipe *p, char c) {
   while (p->w - p->r == N)  // pipe is full
      continue;              // you wait
   p->buf[p->w++ % N] = c;
}

 

char readc(struct pipe *p) {
   while (p->w - p->r == 0)  // wait while the pipe is empty
      continue;
   return p->buf[p->r++ % N];
   // still doesn't handle overflow
}

 

Second rewrite:

void writec(struct pipe *p, char c) {
   while (p->w - p->r == N)  // pipe is full
      continue;              // you wait
   p->buf[p->w++ % N] = c;
   if (p->w == 2*N) {        // shift both cursors back by N;
      p->w -= N;             // w - r is unchanged
      p->r -= N;
   }
}

 

Except for efficiency, problems 1) and 2) are handled.

 

Making things work with multiple processes accessing pipe simultaneously.

        -the situation with no readers works

 

Problem 1: the increment is OK if it is a single instruction, but typically it takes several instructions, which can be interleaved with another thread's.

 

Possible actual machine code:

load 4(r1), r2       % get p->w from memory
add $1, r2, r3       % increment
store r3, 4(r1)      % store new value of p->w
and $8191, r3, r3    % mask with N-1, i.e. take mod N
store r0, 8(r1,r3)   % store the character into buf

 

The way this might work:

Thread 1                      Thread 2
r2 = 100
r3 = 101                      r2 = 100
p->w = 101                    r3 = 101
                              p->w = 101
(and: no effect)
                              (and: no effect)
p->buf[101] = c1              p->buf[101] = c2

 

One of the characters is lost.

The core of the problem: the line

p->buf[p->w++ % N] = c;

performs several separate operations (load w, increment, store w, store the byte), and another thread can run between them.

 

Rewrite 2:

void writec(struct pipe *p, char c) {
   while (p->w - p->r == N)
      continue;
   size_t w = p->w;
   size_t new_w = w + 1;
   p->buf[w % N] = c;   // tell the compiler "don't optimize this!"
                        // (e.g., gcc -O0)
   p->w = new_w;
   // overflow check code goes here!
}

Make analogous changes to readc.

If we assume that:

        1) loads and stores are carefully ordered

        2) loads and stores are atomic (READ-WRITE COHERENCE)

        3) it's OK to busy-wait

 

Then the situation with 1 reader and 1 writer works.

 

Attacking the problem of multiple readers and writers.

A critical section:

        -a series of instructions that at most 1 thread should be executing at any given time.

 

We need to enforce this somehow.

This is a problem if

1)      there's a single processor, but preemptive multitasking

2)      there are multiple processors sharing the pipe

 

If neither applies (one processor, non-preemptive multitasking), there's no problem.

 

2 subproblems:

        A)MUTUAL EXCLUSION

        B)BOUNDED WAIT

                if a thread wants in, it should get in quickly (<= 5 sec)

                avoids starvation

 

Solution: make whole readc and writec a critical section.

Will prevent incorrect data (but it can loop forever if there's nothing in the pipe)

 

But... you don't want critical sections to be too large

-         other threads can't do useful work

-         you might even starve them.

You don't want critical sections to be too small

-         you get races

 

MINIMAL CRITICAL SECTIONS:

-         avoid races

-         if you make them any smaller, you no longer avoid races

 

Make the following a critical section:

size_t w = p->w;

size_t new_w = w + 1;

p->buf[w % N] = c;   

p->w = new_w;

 

There's still a problem:

while (p->w - p->r == N)

   continue;

Another thread may intervene after the check, so there's no more room to write, but the function has already made the check, so it writes anyway!

 

Rewrite 3:

void writec(struct pipe *p, char c) {
   for (;;) {
      disable_interrupts();      // critical section starts
      // critical section: must be fast
      if (p->w - p->r != N) {
         p->buf[p->w++ % N] = c;
         enable_interrupts();    // critical section ends
         return;
      }
      enable_interrupts();
   }
}

This works for a single CPU case, with preemptive multitasking.

 

For multiple CPUs, use a MUTEX:

typedef ? mutex_t;      // representation to be determined

void lock(mutex_t *);   // grab control of the mutex,
                        // twiddling thumbs while waiting

void unlock(mutex_t *);

 

Implementing locks:

typedef int mutex_t;

void unlock(mutex_t* m) { *m = 0;}

void lock (mutex_t * m) {

   while (*m)

      continue;

   *m = 1;

}

 

However, this can get interrupted between the test of *m and the store *m = 1, so two threads can both acquire the lock!

 

The x86 processor has the instruction:

        xchg %ebx, (%eax)          atomic (but slow)

 

the test_and_set instruction. It behaves as if the following ran atomically:

int test_and_set(int *m, int n) {

   int old = *m;

   *m = n;

   return old;

}

 

Re-implement lock:

void lock(mutex_t *m) {
   while (test_and_set(m, 1) == 1)
      continue;
   // no extra *m = 1 needed: test_and_set already stored the 1
}

 

Replace disable_interrupts() in readc and writec with

lock(&m);

and enable_interrupts() with

unlock(&m);

 

This locks at the wrong level (a single, global lock).

COARSE-GRAINED LOCK

        Single lock that governs many resources;

        Simpler, easier to program

FINE-GRAINED LOCKS

        govern few resources

        + better utilization

 

Add to pipe structure a new field:

mutex_t m;

 

in readc, writec:

lock(&p->m);

unlock(&p->m);

 

For even finer-grained locks, we can have separate locks for reading and writing.

Add to pipe structure a new field:

mutex_t rm, wm;

 

in readc, writec:

lock(&p->wm);      //in writec

lock(&p->rm);      //in readc