CS 111: Lecture 9

October 23, 2008

Written by James McEvoy


We want serializability so that threads don't step on each other's shared memory. For this we need a level of atomicity where certain commands cannot be interrupted or divided.

Running Example: Pipes

For this example, we have a program A reading from a file and writing to a pipe. Program B is reading from that pipe.

Lazy Evaluation: Run B as much as possible until it needs to read from the pipe and then run A.

Eager Evaluation: Run A as much as possible until the pipe fills up or the program finishes and then run B.

Unix uses neither exclusively, but rather a combination of the two.

Since pipes are bounded buffers, we need to implement some sort of flow control so that the programs don't step on each other. We have two possible solutions:

  1. Polling: Keep attempting to read and write until there are characters to read or space to write to.
  2. Blocking: Block on read and write until the data or space is available.

Read/Write Coherence

Read and write commands are only atomic when dealing in one-word chunks. Larger chunks can be moved with a single x86 instruction, but they aren't atomic. Even when working at the bit level there can be issues, because the entire word is loaded into a register, modified, and written back. That means if two threads change bits in the same word, there is a race condition over whose store of the word survives.
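For instance, two threads that each set a different bit in the same shared word can still lose an update, because each one performs a whole-word read-modify-write (a sketch; the thread names and the interleaving in the comment are illustrative assumptions):

int flags;                                // one shared word holding many independent bits

void thread_A(void) { flags |= 0x01; }    // load flags, set bit 0, store the whole word
void thread_B(void) { flags |= 0x02; }    // load flags, set bit 1, store the whole word

// Possible interleaving: A loads flags (0), B loads flags (0),
// A stores 0x01, then B stores 0x02 -- bit 0 is lost, even though
// the two threads touched different bits.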

Pipe Implementation with Polling

// N must be a power of 2 (so the indices still work when r and w wrap around)

struct pipe {
  char buf[N];
  size_t r, w;   // cumulative counts of chars read and written; w - r = chars in the buffer
};

// Write one char, polling (spinning) while the buffer is full
void writec(struct pipe* p, char c) {
  while (p->w - p->r == N)
    yield();
  p->buf[p->w++ % N] = c;
}

// Read one char, polling (spinning) while the buffer is empty
char readc(struct pipe* p) {
  while (p->w - p->r == 0)
    yield();
  return p->buf[p->r++ % N];
}
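As a quick sanity check of the index arithmetic, here is a minimal single-threaded driver (a sketch: it assumes N was defined as a small power of two, say 8, before the struct above, and stubs out yield() since there is no other thread):

#include <stdio.h>

void yield(void) { }   // stub: nothing to switch to in this single-threaded test

int main(void) {
  struct pipe p = { .r = 0, .w = 0 };   // buf is zero-initialized as well
  const char *msg = "hi";
  for (int i = 0; msg[i] != '\0'; i++)
    writec(&p, msg[i]);
  while (p.w - p.r != 0)
    putchar(readc(&p));                 // prints "hi"
  putchar('\n');
  return 0;
}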

Failure Scenario: This implementation cannot handle two writers or two readers on the same pipe. For example, the shell command (a & b &) | c gives the pipe two writers.

Critical Sections

A critical section is a set of instructions such that atomicity is preserved if no two threads are executing those instructions simultaneously on behalf of the same object.

We want an enforcement mechanism that provides mutual exclusion, so that only one thread executes the section at a time, and a bounded wait, so that we aren't running a long loop inside a critical section and preventing other threads from getting the processor.

Uniprocessor Critical Sections

With no interrupts and cooperative multitasking, we don't need to do anything because we know only the running program will be executing the critical section at any given time.

With interrupts in a preemptive multitasking environment, we could simply make readc and writec system calls; they can easily be made atomic by temporarily disabling interrupts.
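For example, the kernel-side write might be bracketed by interrupt disable/enable, roughly as follows (a sketch assuming x86 and kernel mode, since cli/sti are privileged instructions; the name sys_writec is invented for illustration):

// Uniprocessor sketch: with interrupts off, no context switch can occur,
// so the body runs atomically with respect to other threads.
void sys_writec(struct pipe* p, char c) {
  asm volatile ("cli");            // disable interrupts
  while (p->w - p->r == N) {
    asm volatile ("sti");          // re-enable so the reader can run and drain the pipe
    yield();
    asm volatile ("cli");
  }
  p->buf[p->w++ % N] = c;
  asm volatile ("sti");            // re-enable interrupts
}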

Multiprocessor Critical Sections

For multiprocessors, we create a new type which will control whether or not a critical section is locked. That type is named mutex_t for mutual exclusion.

typedef int mutex_t;

The basic idea is that we call a lock function on the mutex before the critical section, execute the code, and then call unlock on the mutex. The question is how many mutexes to use for a program. The two extremes are described as follows:

  1. Coarse-grained locking: Use a single lock on many objects/actions - i.e. one global mutex variable.
  2. Fine-grained locking: Many locks used - i.e. adding a mutex variable to the struct definition so a new one is in every instantiation.

In the case of pipes, possible implementations range from the coarsest to the finest granularity.
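As an illustration of the two extremes, here is a sketch of the write side only (the names writec_coarse, writec_fine, and global_lock are invented for this example; lock and unlock are the mutex operations just described):

// Coarse-grained: one global mutex protects every pipe in the system
mutex_t global_lock;

void writec_coarse(struct pipe* p, char c) {
  for (;;) {
    lock(&global_lock);
    if (p->w - p->r != N)
      break;                  // there is room: keep the lock and write
    unlock(&global_lock);     // pipe is full: release so the reader can drain it
    yield();
  }
  p->buf[p->w++ % N] = c;
  unlock(&global_lock);
}

// Fine-grained: give each pipe its own mutex (add "mutex_t m;" to struct pipe),
// so operations on different pipes never contend with each other
void writec_fine(struct pipe* p, char c) {
  for (;;) {
    lock(&p->m);
    if (p->w - p->r != N)
      break;
    unlock(&p->m);
    yield();
  }
  p->buf[p->w++ % N] = c;
  unlock(&p->m);
}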

Note that we should make critical sections as small as possible, since the CPU is tied up for their duration. However, be careful not to make them so small that race conditions remain. Here is a heuristic for designing good critical sections:

  1. Look for writes to shared data.
  2. Expand horizon to include dependent reads.

Implementing the Mutex Lock

void lock(mutex_t* m) {
  while (*m)      // spin until the mutex looks free...
    continue;
  *m = 1;         // ...but another thread can slip in between the test and this store
}

The problem with this implementation is that we basically introduced another critical section within the lock function! Since there is no easy way around this recursive problem, Intel decided to help us out. They added an instruction called xchg which exchanges two values completely atomically. The cost is that it is really slow, but the penalty is worth it in this case.

We created a C function called tas (test and set) which we can use to create a polling lock function.

// Specification of test-and-set: store new into *p and return the old value,
// all as one atomic step
int tas(int* p, int new) {
  int old = *p;
  *p = new;
  return old;
}

void lock(mutex_t* m) {
  while (tas(m, 1) == 1)   // keep setting *m to 1 until we see it was previously 0
    continue;
}
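Written as plain C, tas above only specifies the desired behavior; by itself it is not atomic. On x86 it would be built on the xchg instruction, for example as follows (a sketch using GCC inline assembly and ignoring compiler-reordering details, together with the matching unlock, which is just a store of 0):

// tas built on x86 xchg: atomically swap *p and new, return the old *p
static inline int tas(int* p, int new) {
  asm volatile ("xchgl %0, %1" : "+r" (new), "+m" (*p));
  return new;   // after the exchange, new holds the old value of *p
}

void unlock(mutex_t* m) {
  *m = 0;       // hand the lock back; a spinning lock() will now see 0
}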

Single Writer, Multiple Readers

The following example shows how to use a mutex to allow multiple readers but only one writer.

// Added to the struct being protected: a mutex and a count of active readers
mutex_t m;
int readers;

// Lock for read: just note that one more reader is active
lock(&p->m);
p->readers++;
unlock(&p->m);

// Lock for write: keep trying until there are no active readers,
// then return while still holding the mutex
for (;;) {
  lock(&p->m);
  if (!p->readers)
    return;
  unlock(&p->m);
}
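The corresponding release steps are a sketch along these lines (an assumption for illustration, using the same fields):

// Unlock for read: one fewer active reader
lock(&p->m);
p->readers--;
unlock(&p->m);

// Unlock for write: the writer still holds p->m from the loop above,
// so it simply releases it
unlock(&p->m);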

Blocking Mutexes

As always, blocking is the way to go so we don't waste CPU cycles looping as we do with polling. This blocking mutex implementation uses a new bmutex struct that contains a list of the currently blocked processes. This allows a process which is releasing the lock to reset all the blocked processes to a RUNNABLE state. The mutex variable inside the struct makes sure only one process is accessing it at a time, while the locked boolean allows for a quick check of the status.

struct bmutex {
  mutex_t m;              // spin lock protecting the fields below
  bool locked;            // is the blocking mutex currently held?
  proc_t* blocked_list;   // processes waiting for it to be released
};

Acquire function pseudo code:
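A sketch of the usual approach (assuming a current-process pointer cp whose proc_t has a state field with BLOCKED/RUNNABLE values and a next link, plus the scheduler's yield()):

void acquire(struct bmutex* b) {
  for (;;) {
    lock(&b->m);                   // protect the bmutex's own fields
    if (!b->locked)
      break;                       // it's free: grab it below
    // Someone else holds it: add ourselves to the blocked list and go to sleep
    cp->next = b->blocked_list;
    b->blocked_list = cp;
    cp->state = BLOCKED;
    unlock(&b->m);
    yield();                       // let another process run
  }
  b->locked = true;
  unlock(&b->m);
}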

The release function was left as an exercise. One hint is that all the processes in the blocked list need to be reset to RUNNABLE so that they can compete for the next chance to lock after the release.
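Following that hint, one possible shape for release (again only a sketch, with the same assumed process fields):

void release(struct bmutex* b) {
  lock(&b->m);
  b->locked = false;
  // Wake every blocked process so they can compete for the lock again
  for (proc_t* p = b->blocked_list; p != NULL; p = p->next)
    p->state = RUNNABLE;
  b->blocked_list = NULL;
  unlock(&b->m);
}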