Winter 2015, Lecture 9

Synchronization Primitives and Deadlock

Written by Arman Afghani and Timothy Portfolio


Table of Contents
  1. How to Implement Critical Sections
    1. A Novel Approach
    2. Making Adjustments
  2. Bank Transactions with Locks
  3. Piping with Locks
    1. Initial Implementation
    2. Using Blocking Mutexes
  4. Semaphores
  5. Condition Variables

1. How to Implement Critical Sections

a. A Novel Approach

Assume two conditions for a machine A:

  1. It has a uniprocessor.
  2. There are no interrupts.

Let's say we want to use the following:

typedef _______ lock_t;
void lock(lock_t* l);
void unlock(lock_t* l);

Because of the two conditions above, the lock and unlock functions become no-ops. In other words, if the two conditions above are satisfied, it is impossible for a machine to fall into deadlock.

For a machine B, let's remove the second condition above so that there are now interrupts. A scheduler can now give the CPU to another process. Let there be a cheap way to make interrupts become pending (e.g. based on a particular level of priority). However, this would not work if we were on a machine C that has a multiprocessor.

Let's try implementing some code that can address this.

typedef int lock_t;

void lock(lock_t* l)
{ 
	while (*l) 
		continue;
	*l = 1; 
}

void unlock(lock_t* l)
{ 
	*l = 0; 
}

This fails. In the lock function, we could have an interrupt right before setting the lock pointer equal to 1, so both calls to the function think they have a lock.

Another thing to consider: suppose lock_t is 32 bits, but the memory bus is 16 bits. This means there may be a race condition; we would have to assume loads and stores are atomic. In x86 architecture, working with variables over 4 bytes may lead to a value that may never be stored.


An additional issue is unaligned access, where the loads and stores may not be on an increment of 4 bytes.

Union Misalignment
union {
int a[2];
char c[8];
} u;
u.a[0] = 12;
u.a[1] = 97;
char *p = &u.c[0];
p++;
int *q = (int *) p;
//the pointer to q is no longer aligned

We also encounter this problem when dealing with an entity smaller than 4 bytes.

Struct s
struct {
bool a:1;
bool b:1;
char c1, c2, c3;
int x;
} s;

Say there is a thread 1 that sets s.a equal to 1, and a thread 2 sets s.b equal to 1. The first store is essentially overwritten.

Your natural assumption as a programmer should be that you're going to have bugs when dealing with multiple threads accessing the same piece of memory.


b. Making Adjustments

Something that does work on x86 architecture is aligned loads and stores of integers.

Bus, RAM, and CPUs

CPU 1 checks on whether CPU 0 is communicating with the RAM.

Let's adjust the lock function from before:

void lock(lock_t *l)
{
	while (*l++)
		continue;
}

This still doesn't work. Look at the assembly code generated by this function, specifically the usage of incl(%eax).

Fortunately, there are special atomic instructions to help on x86.

Sample C code for the two instructions is as follows:

void lock(int *l) 
{
	while (true)
	{
		int x = *l;
		lock_incl(&l);
		if (x==0)
			break; 
	}
}
//PRECONDITION: We don't own the lock

void lock(lock_t *l) 
{
	while (xchgl (l, 1))
		continue;
} //once we exit loop, we have the lock


//PRECONDITION: We own the lock
void unlock(lock_t *l) 
{
	*l = 0;
} 

2. Bank Transactions with Locks

Let's create a function that deals with withdrawing funds from a bank account.

int balance;

bool withdraw(int w) 
{
	if (balance < w || w < 0)
		return 0;
	w -= balance;
	return 1;
}

This function is not thread-safe. Let's add locking.

int balance;
lock_t ballock;

bool withdraw(int w) 
{
	lock(&ballock);
	bool ok = (0 <= w && w <= balance);
	if (ok)
		w -= balance;
	unlock(&ballock);
	return ok;
}

There is still a problem with this setup. The longer the critical section, the longer other entities have to wait. Fortunately, software developers have called Intel enough to get a resolution for this. The function below is the result.

bool compare_and_swap(int* p, int old, int new) 
{
	if (*p == old) 
	{
		*p = new;
		return 1; 
	}
	return 0;
}

Now we can safely withdraw money from our account.

int balance;

bool withdraw(int w)
{
	while (true)
	{
		int b = balance;
		if (w < 0 || b < w)
			return 0;
		if (compare_and_swap(&balance, b, b-w))
			break;
	}
	return 1; 
}

Overall, this provides nicer performance, but contention could lead to the loop running multiple extra times, chewing up CPU. Another downside is that we can only use a one-word object.

3. Piping with Locks

a. Initial Implementation

Let's create a pipe structure and an auxiliary function that uses it.

struct pipe
{
	char buf[1024];
	size_t r, w;
	lock_t l;
}

char readpipe(struct pipe *p)
{
	while (true)
	{
		lock(&p->l);
		if (p->r != p->w)
			break;
		unlock(&p->l);
	}
	char p = p->buf[p->r++ % 1024];
	unlock(&p->l);
	return c;
}

This strategy burns up CPU. We need a way to tell the operating system, "Please don't run me unless there is data in this pipe."

b. Using Blocking Mutexes

To do this, we need to use a blocking mutex. This is like a spinlock, except if the desired resource isn't available, the thread or process sleeps until the resource is available and gives up the CPU until then.

A basic setup for a blocking mutex:

struct bmutex
{
	lock_t l;
	int locked;
	struct proc* blocked;
} bmutex_t;

Let's write some functions that allow for acquiring and releasing the blocking mutex.

void acquire(struct bmutex_t *b)
{
	while (true)
	{
		lock(&b->l);
		if (!b->locked)
			break;
		cp->next = b->blocked; //cp = current process entry
		cp->status = BLOCKED;
		b->blocked = cp; 
		unlock(&b->l);
	}
	yield();
	b->locked = 1;
	unlock(&b->l);
}

void release(struct bmutex_t *b) 
{
	lock(&b->l);
	*b->locked = 0;
	struct proc *p = b->blocked;
	b->blocked = p->next;
	p->status = RUNNABLE;
	unlock(&b->l);
}

4. Semaphores

A semaphore is an abstract data type that allows for multiple threads to have a lock. This is often assisted by an integer variable that tracks the number of locks available for threads or processes.

The primitives for a semaphore are:

Some sample code using semaphores:

char readpipe(struct pipe *p)
{
	while (true)
	{
		acquire(&p->l);
		if (p->r != p->w)
			break;
		release(&p->l);
	}
	char p = p->buf[p->r++ % 1024];
	release(&p->l);
	return c;
}

5. Condition Variables

A condition variable serves as a combination of a blocking mutex and an extra boolean variable for a particular condition, which is known to the program but unimportant to the operating system.

The condition variable API:

//PRECONDITION: You have acquired b.
void wait(condvar_t* c, bmutex_t* b);

/*ACTION: Releases b, then blocks until
some other thread notifies us that the
condition may be true, then reacquires b
and returns. */

void notify(condvar_t* c);
/*ACTION: Tells OS to awaken a process
blocked on c. */

void broadcast(condvar_t* c);
/*ACTION: Like notify, but wakes up all
processes blocked on c. */

Modiying the code in the previous piping setup:

struct pipe
{
	char buf[1024];
	size_t r, w;
	lock_t l;
	condvar_t nonempty;
	condvar_t nonfull;
	bmutex_t b;
}

char readpipe(struct pipe *p)
{
	acquire(&p->b);
	while (true)
	{
		if (p->r != p->w)
			break;
		wait(&p->nonempty, &p->b);
	}
	char p = p->buf[p->r++ % 1024];
	release(&p->b);
	notify(&p->nonfull);
	return c;
}


All content above comes from lecture 9 of Computer Science 111, Winter 2015, taught by Professor Paul Eggert at UCLA.