CS111: Lecture 9
Synchronization primitives and Deadlocks
By Ishan Roy Choudhury and Nitin Gaddipati
Winter 2015

Synchronization Primitives
How to implement critical sections

void lock(lock_t *);
void unlock(lock_t *);
typdef (0 byte datatype) lock_t;


Let's make the following assumptions:

1) uniprocessor

2) no interrupts

 

In this case, lock, unlock are no-ops.

A = assumptions 1 + 2

B = assumptions 1 + 3 (cheap way to make interrupts go pending)

Assumption 3 can be implemented using fucntion splx(). For example, using LOCK splx(6)  (don't give me interrupts unless they are level 6+) and UNLOCK splx(4). These functions are extremely fast since they use only a register.

 


type def int lock_t;
void lock (lock_t *l){
    while (*l)
        continue;
    *l = 1;
}

void unlock(lock_t *l){
    *l = 0;
}


Problems with this:

1) two CPU's can both think they have the lock even though they don't

2) int = 32 bits, but memory bus is 16 bits

 

Atomicity in Loads & Stores

x86:

Larger than 4 bytes? Watch out! (may not work)

Unaligned access? Watch out!

For example:



union {
    int a[2];
    char c[8];
}
u.a[0] = 12;
u.a[1] = 97;
char *p = &u.c[0];
p++;
int *q = (int *) p;
//*q -- 97 << 24 (unaligned access)


Smaller than 4 bytes? Watch out!

For example:



struct{
    bool a; 
    bool b;
    char c;
    int x;
}s;

thread1: s.a = 1;
thread2: s.b = 1;


Always assume it won't work when more than one thread is acessing the same piece of memory!

 

Aligned loads & stores of int

Example: Figure 1a


Figure 1a


CPU 1 will get new value or old value and thus is atomic. For example if we changed lock() as follows:



void lock (lock_t *l){
    while (*l++ != 0)
        continue;
}


Even though race condition is eliminated at source code level, this is still WRONG!

Special instructions on x86 (made in order to deal with problems above)
Lock_incl

Benefit: atomic

Drawback: slower


Thus, lock() can now be implemented as:



void lock (lock_t *l){
    while (true){
        int x = *l;
        lock_incl(&l);
        if (x == 0)
            break;
    }
}
    

Problem: value of x may not be the same if it is changed in the middle of the process.


Xchgl

int xchgl(int *p, int new){
    int old = *p;
    *p = new;
    return old;
}
        

Thus, lock() and unlock() can now be implemented as:



void lock (lock_t *l){
    while (xchgl(l, 1)){
        continue;
    }
}

void unlock (lock_t *l){
    *l = 0;
}


Problem: Won't work if pre-conditions are violated by callers of lock() and unlock() For example, calling lock even when you already own lock.


Withdraw

int balance;
lock_t ballock;
bool withdraw (int w) {
    lock(&ballock);
    bool ok = (0  <= w && w  <= balance);
    if (ok)
        w -= balance;
    unlock(&ballock);
    return ok;
}


Compare and Swap

bool compare_and_swap(int *p, int old, int new){
    if(*p == old) {
        *p = new;
        return 1;
    } else
        return 0;
}


Then we can implement withdraw() in the following way:



bool withdraw (int w){
    while(true){
        int b = balance;
        if (w  < 0 || b  < w) return 0;
        if (compare_and_swap(&balance, b, b-w))
            break;
    }
    return 1;
}


Benefits over lock(), unlock(): Better performance

Drawback over lock(), unlock(): Contention can chew up CPU and assumes state fits in 4 bytes



struct pipe{
    char buf[1024];
    size_t r,w;
    lock_t l;
};

char readpipe(struct pipe *p){
    while(true){
        lock(&p->l);
        if(p->r != p->w)
            break;
        unlock(&p->l);
    }
    char p = p->buf[p->r++%1024];
    unlock(&p->l);
    return p;
}


Deadlocks
Blocking Mutex

We need a way to tell the OS, please don't run me unless there's data in the pipe. Thus, we use a Blocking Mutex, which is like a spin lock, except you don't spin.



typedef struct bmutex{
    lock_t l;
    bool locked;
    struct proc *blocked;

}bmutex_t;

void acquire(struct bmutex_t *b){
    while(true){
        lock(&b->l);
        if(!b->locked)
            break;
        cp->next = b->blocked;
        b->blocked = cp; //cp points to current process table entry
        cp->status = BLOCKED;
        unlock(&b->l);
        yield();
    }
    b->locked = 1;
    unlock(&b->l);
}

void release(struct bmutex_t *b){
    lock(&b->l);
    *b->locked = 0;
    struct proc *p = b->blocked;
    b->blocked = p->next;
    p->status = RUNNABLE;
    unlock(&b->l);
}


Making pipes work with blocking mutex:



struct pipe{
    char buf[1024];
    size_t r,w;
    bmutex_t b;
};

char readpipe(struct pipe *p){
    while(true){
        acquire(&p->b);
        wait_for(&p->b, are_we_there_yet, p);
        if(p->r != p->w)
            break;
        release(&p->b);
    }
    char p = p->buf[p->r++%1024];
    release(&p->b);
    return p;
}

bool are_we_there_yet(void *p){
    return p->r != p->w;
}
    

Condition variable = blocking mutex + extra boolean var for the condition

Condition var API

wait(condvar_t *c, bmutere_t *b)


PRE-CONDITION: you have acquired b

ACTION: releases b, then blocks until some other thread notifies us that condition may be true. Finally, reacquires b & returns.


notify(condvar *c)


Tells OS that c is true. OS will awaken a process blocked on c.


wait(condvar_t *c, bmutere_t *b)


Like notify, but wakes up EVERYBODY


Modify readpipe using Condition var API


struct pipe{
    char buf[1024];
    size_t r,w;
    bmutex_t b;
    condvar_t nonempty, nonfull;
};

char readpipe(struct pipe *p){
    acquire(&p->b);
    while(true){
        wait_for(&p->b, are_we_there_yet, p);
        if(p->r != p->w)
            break;
        wait(&p->nonempty, &p->b);
    }
    char p = p->buf[p->r++%1024];
    release(&p->b);
    notify(&p->nonfull);
    return p;
}


Readers of data structures vs writers

The reader will just inspect data structure (no modifications).  The writer will do read + write. All the primitives we have talked about up untill now are all writers!

 

Read-write lock allows many readers at most 1 wrtier (if any writers, no readers).