void lock(lock_t *); void unlock(lock_t *); typdef (0 byte datatype) lock_t;
Let's make the following assumptions:
1) uniprocessor
2) no interrupts
In this case, lock, unlock are no-ops.
A = assumptions 1 + 2
B = assumptions 1 + 3 (cheap way to make interrupts go pending)
Assumption 3 can be implemented using fucntion splx(). For example, using LOCK splx(6) (don't give me interrupts unless they are level 6+) and UNLOCK splx(4). These functions are extremely fast since they use only a register.
type def int lock_t; void lock (lock_t *l){ while (*l) continue; *l = 1; } void unlock(lock_t *l){ *l = 0; }
Problems with this:
1) two CPU's can both think they have the lock even though they don't
2) int = 32 bits, but memory bus is 16 bits
x86:
Larger than 4 bytes? Watch out! (may not work)
Unaligned access? Watch out!
For example:
union { int a[2]; char c[8]; } u.a[0] = 12; u.a[1] = 97; char *p = &u.c[0]; p++; int *q = (int *) p; //*q -- 97 << 24 (unaligned access)
Smaller than 4 bytes? Watch out!
For example:
struct{ bool a; bool b; char c; int x; }s; thread1: s.a = 1; thread2: s.b = 1;
Always assume it won't work when more than one thread is acessing the same piece of memory!
Aligned loads & stores of int
Example: Figure 1a
CPU 1 will get new value or old value and thus is atomic. For example if we changed lock() as follows:
void lock (lock_t *l){ while (*l++ != 0) continue; }
Even though race condition is eliminated at source code level, this is still WRONG!
Benefit: atomic
Drawback: slower
Thus, lock() can now be implemented as:
void lock (lock_t *l){ while (true){ int x = *l; lock_incl(&l); if (x == 0) break; } }
Problem: value of x may not be the same if it is changed in the middle of the process.
int xchgl(int *p, int new){ int old = *p; *p = new; return old; }
Thus, lock() and unlock() can now be implemented as:
void lock (lock_t *l){ while (xchgl(l, 1)){ continue; } } void unlock (lock_t *l){ *l = 0; }
Problem: Won't work if pre-conditions are violated by callers of lock() and unlock() For example, calling lock even when you already own lock.
int balance; lock_t ballock; bool withdraw (int w) { lock(&ballock); bool ok = (0 <= w && w <= balance); if (ok) w -= balance; unlock(&ballock); return ok; }
bool compare_and_swap(int *p, int old, int new){ if(*p == old) { *p = new; return 1; } else return 0; }
Then we can implement withdraw() in the following way:
bool withdraw (int w){ while(true){ int b = balance; if (w < 0 || b < w) return 0; if (compare_and_swap(&balance, b, b-w)) break; } return 1; }
Benefits over lock(), unlock(): Better performance
Drawback over lock(), unlock(): Contention can chew up CPU and assumes state fits in 4 bytes
struct pipe{ char buf[1024]; size_t r,w; lock_t l; }; char readpipe(struct pipe *p){ while(true){ lock(&p->l); if(p->r != p->w) break; unlock(&p->l); } char p = p->buf[p->r++%1024]; unlock(&p->l); return p; }
We need a way to tell the OS, please don't run me unless there's data in the pipe. Thus, we use a Blocking Mutex, which is like a spin lock, except you don't spin.
typedef struct bmutex{ lock_t l; bool locked; struct proc *blocked; }bmutex_t; void acquire(struct bmutex_t *b){ while(true){ lock(&b->l); if(!b->locked) break; cp->next = b->blocked; b->blocked = cp; //cp points to current process table entry cp->status = BLOCKED; unlock(&b->l); yield(); } b->locked = 1; unlock(&b->l); } void release(struct bmutex_t *b){ lock(&b->l); *b->locked = 0; struct proc *p = b->blocked; b->blocked = p->next; p->status = RUNNABLE; unlock(&b->l); }
Making pipes work with blocking mutex:
struct pipe{ char buf[1024]; size_t r,w; bmutex_t b; }; char readpipe(struct pipe *p){ while(true){ acquire(&p->b); wait_for(&p->b, are_we_there_yet, p); if(p->r != p->w) break; release(&p->b); } char p = p->buf[p->r++%1024]; release(&p->b); return p; } bool are_we_there_yet(void *p){ return p->r != p->w; }
Condition variable = blocking mutex + extra boolean var for the condition
wait(condvar_t *c, bmutere_t *b)
PRE-CONDITION: you have acquired b
ACTION: releases b, then blocks until some other thread notifies us that condition may be true. Finally, reacquires b & returns.
notify(condvar *c)
Tells OS that c is true. OS will awaken a process blocked on c.
wait(condvar_t *c, bmutere_t *b)
Like notify, but wakes up EVERYBODY
struct pipe{ char buf[1024]; size_t r,w; bmutex_t b; condvar_t nonempty, nonfull; }; char readpipe(struct pipe *p){ acquire(&p->b); while(true){ wait_for(&p->b, are_we_there_yet, p); if(p->r != p->w) break; wait(&p->nonempty, &p->b); } char p = p->buf[p->r++%1024]; release(&p->b); notify(&p->nonfull); return p; }
Readers of data structures vs writers
The reader will just inspect data structure (no modifications). The writer will do read + write. All the primitives we have talked about up untill now are all writers!
Read-write lock allows many readers at most 1 wrtier (if any writers, no readers).