Chris Rewinski
Lecture by Paul Eggert on February 8, 2012
typedef bool lock_t;

void lock(lock_t *l) {
  while (*l)
    continue;
  *l = true;
}

void unlock(lock_t *l) {
  *l = false;
}
Race condition here! Two threads can both see *l as false before either sets it to true, and then both enter the critical section. We need the test and the set to happen atomically. Let's turn to the x86 instruction set.
lock incl x
    Atomically increments x in hardware. Implementing locks this way is possible, but difficult. Not what we need.

xchgl r,x
    Atomically exchanges r and x. Known as "test and set."
int xchgl(int *p, int new) {
  /* This is what happens:
       int old = *p;
       *p = new;
       return old; */
  asm("xchgl ...");
}

typedef int lock_t;

// Precondition: we don't own the lock
void lock(lock_t *l) {
  while (xchgl(l, 1) != 0)
    continue;
}

// Precondition: we own the lock
void unlock(lock_t *l) {
  xchgl(l, 0);
}
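To see the test-and-set lock in action, here is a self-contained sketch: it protects a shared counter with the same spin loop, but substitutes GCC's __sync_lock_test_and_set builtin for the hand-written xchgl wrapper above (that substitution, and the names spin_lock, spin_unlock, and worker, are mine, not the lecture's).

#include <pthread.h>
#include <stdio.h>

static int spinlock = 0;        /* 0 = unlocked, 1 = locked */
static long counter = 0;

static void spin_lock(int *l)   { while (__sync_lock_test_and_set(l, 1) != 0) continue; }
static void spin_unlock(int *l) { __sync_lock_release(l); }   /* atomically stores 0 */

static void *worker(void *arg) {
  (void) arg;
  for (int i = 0; i < 1000000; i++) {
    spin_lock(&spinlock);
    counter++;                  /* critical section */
    spin_unlock(&spinlock);
  }
  return NULL;
}

int main(void) {
  pthread_t t1, t2;
  pthread_create(&t1, NULL, worker, NULL);
  pthread_create(&t2, NULL, worker, NULL);
  pthread_join(t1, NULL);
  pthread_join(t2, NULL);
  printf("%ld\n", counter);     /* prints 2000000; without the lock it usually wouldn't */
  return 0;
}

(Compile with cc -pthread.)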
What if we wanted to add 2 to a variable atomically?
Using lock incl x twice exposes an intermediate state (x incremented by only 1), which introduces a race condition.
We could lock, add 2, and unlock, but this is slow.
Instead we use another assembly instruction: cmpxchgl (compare and swap).
bool cmpxchgl(int *p, int old, int new) {
  /* This is what happens:
       if (*p == old) {
         *p = new;
         return true;
       } else
         return false; */
  asm("cmpxchgl ...");
}
Something to be careful of:
The call cmpxchgl(&v, v, v+2) introduces a race condition! The two reads of v during argument evaluation might not return the same value.
So, cache v locally (assuming atomic loads and stores, which is true on x86 for aligned ints):
int old;
do {
  old = v;
} while (!cmpxchgl(&v, old, old + 2));
The computation of the new value must be relatively fast, or the loop might go on forever because v changes faster than we can evaluate its new value.
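Putting that pattern together as a function, here is a minimal sketch that uses GCC's __sync_bool_compare_and_swap builtin as a stand-in for the cmpxchgl wrapper above (the builtin and the name atomic_add are my assumptions, not the lecture's):

void atomic_add(int *p, int delta) {
  int old;
  do {
    old = *p;                                  /* snapshot the current value */
  } while (!__sync_bool_compare_and_swap(p, old, old + delta));
  /* The loop retries only when some other thread changed *p between the
     snapshot and the compare-and-swap. */
}

Usage for the example above would simply be atomic_add(&v, 2).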
Now consider a read-write lock: many readers may hold it at once, but a writer needs exclusive access.

struct rwlock {
  lock_t l;      // This lock protects the struct's contents
  int readers;   // Number of readers currently holding the lock
};
We can use the spinlock as a write lock, since only one writer is allowed at a time.
void lock_for_write(struct rwlock *l) {
  for (;;) {
    lock(&l->l);
    if (!l->readers)
      break;            // keep holding l->l; it acts as the write lock
    unlock(&l->l);
  }
}

void unlock_for_write(struct rwlock *l) {
  unlock(&l->l);
}

void lock_for_read(struct rwlock *l) {
  lock(&l->l);
  l->readers++;
  unlock(&l->l);
}

void unlock_for_read(struct rwlock *l) {
  lock(&l->l);
  l->readers--;
  unlock(&l->l);
}
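A hypothetical usage sketch (shared_value, reader, and writer are illustrative names, not from the lecture):

struct rwlock rw = { 0, 0 };
int shared_value;

int reader(void) {
  lock_for_read(&rw);
  int v = shared_value;          /* many readers may hold the lock at once */
  unlock_for_read(&rw);
  return v;
}

void writer(int v) {
  lock_for_write(&rw);           /* spins until there are no readers */
  shared_value = v;              /* only one writer at a time */
  unlock_for_write(&rw);
}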
Remember our pipe implementation? We utilized spinlocks:
struct pipe {
  lock_t l;
  char buf[N];
  size_t r, w;
};

void writec(struct pipe *p, char c) {
  for (;;) {
    lock(&p->l);
    if (p->w - p->r < N)
      break;            // there is room in the buffer; keep holding the lock
    unlock(&p->l);
  }
  p->buf[p->w++ % N] = c;
  unlock(&p->l);
}
And similarly for readc.
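The notes leave readc to symmetry; a sketch of what it would look like (not from the original):

char readc(struct pipe *p) {
  for (;;) {
    lock(&p->l);
    if (p->w - p->r > 0)          /* at least one unread byte */
      break;                      /* keep holding the lock */
    unlock(&p->l);
  }
  char c = p->buf[p->r++ % N];
  unlock(&p->l);
  return c;
}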
The problem with spinlocks: we busy wait, which wastes CPU time.
A blocking mutex allows us to release control of the CPU to someone else while we wait for the lock.
struct bmutex {
  lock_t l;
  bool locked;
  proc_t *block_list;    // Linked list of process descriptors waiting for this lock
};

// Assume that there's a proc_t *self for the current process

void acquire(struct bmutex *p) {
  for (;;) {
    lock(&p->l);
    add(&p->block_list, self);
    self->blocked = true;    // Handwaving: tell the OS that we are blocked, so it doesn't reschedule us
    if (!p->locked)
      break;
    unlock(&p->l);
    schedule();              // Gives up CPU control
  }
  p->locked = true;
  remove(&p->block_list, self);
  self->blocked = false;
  unlock(&p->l);
}

void release(struct bmutex *p) {
  lock(&p->l);
  // Set 1st process in the blocked list to runnable
  if (p->block_list)
    p->block_list->blocked = false;
  p->locked = false;
  unlock(&p->l);
}
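A hypothetical usage sketch (balance_mutex, balance, and deposit are illustrative names): the point is that a blocked caller yields the CPU via schedule() instead of spinning.

struct bmutex balance_mutex;
long balance;

void deposit(long amount) {
  acquire(&balance_mutex);      /* may give up the CPU while waiting */
  balance += amount;
  release(&balance_mutex);
}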
We can use blocking mutexes in the pipe implementation, but we would still busy wait for the pipe to become readable or writable.
We want to block until the pipe is ready for us to read/write.
In other words, we want to unblock based on a specific boolean expression.
Condition variables consist of three parts:
1. An arbitrary boolean expression
2. A blocking mutex to protect the condition variable
3. Storage for the condition variable itself
API:
wait_for(condvar_t *c, bmutex_t *b);
    Precondition: b must have been acquired already.

notify(condvar_t *c);
    Notifies one waiting thread.

notifyAll(condvar_t *c);
    Notifies all waiting threads.
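A sketch of the standard usage pattern, assuming the usual semantics (wait_for releases b while the caller is blocked and reacquires it before returning): the waiter rechecks the boolean expression in a loop, since the condition may no longer hold by the time the waiter runs again. The names wait_until_ready and ready are illustrative.

void wait_until_ready(condvar_t *c, bmutex_t *b, bool *ready) {
  acquire(b);
  while (!*ready)               /* the arbitrary boolean expression */
    wait_for(c, b);             /* releases b, blocks, reacquires b */
  /* *ready is true here, and we still hold b */
  release(b);
}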
Now we have a final pipe implementation, using blocking mutexes and condition variables:
struct pipe {
  ...
  bmutex_t l;
  condvar_t nonfull, nonempty;
};

void writec(struct pipe *p, char c) {
  acquire(&p->l);
  while (p->w - p->r == N)
    wait_for(&p->nonfull, &p->l);
  p->buf[p->w++ % N] = c;
  notify(&p->nonempty);
  release(&p->l);
}
And similarly for readc.
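Again readc is left to symmetry; a sketch:

char readc(struct pipe *p) {
  acquire(&p->l);
  while (p->w - p->r == 0)
    wait_for(&p->nonempty, &p->l);
  char c = p->buf[p->r++ % N];
  notify(&p->nonfull);
  release(&p->l);
  return c;
}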
Deadlock: one or more processes waiting for events that will never occur.
A common task is copying data from one file descriptor to another:
for (;;) {
  ssize_t n = read(ifd, buf, sizeof(buf));
  write(ofd, buf, n);
}
For synchronization, read:
- Acquires the lock on ifd
- Copies data
- Releases the lock on ifd
And write:
- Acquires the lock on ofd
- Copies data
- Releases the lock on ofd
Why not have a function copy(ifd, ofd, num)?
For synchronization, copy would (see the sketch after this list):
- Acquire the lock on ifd
- Acquire the lock on ofd
- Copy data
- Release the lock on ofd
- Release the lock on ifd
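Here is a sketch of such a copy, with hypothetical lock_fd/unlock_fd helpers standing in for the per-descriptor locks described above; the detail that matters is the acquisition order.

#include <unistd.h>

void lock_fd(int fd);            /* hypothetical per-descriptor locking helpers */
void unlock_fd(int fd);

ssize_t copy(int ifd, int ofd, size_t num) {
  char buf[1024];
  if (num > sizeof buf)
    num = sizeof buf;
  lock_fd(ifd);                  /* acquire the lock on ifd ... */
  lock_fd(ofd);                  /* ... then the lock on ofd */
  ssize_t n = read(ifd, buf, num);
  if (n > 0)
    n = write(ofd, buf, n);
  unlock_fd(ofd);                /* release in reverse order */
  unlock_fd(ifd);
  return n;
}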
But what if we had two processes, P1 and P2:
P1: copy(ifd, ofd, ...);
P2: copy(ofd, ifd, ...);
If P1 acquires the lock on ifd and P2 then acquires the lock on ofd, both processes will block forever.
P1 won't give up ifd's lock until it gets ofd's lock, but P2 won't give up ofd's lock until it gets ifd's lock.
Deadlock is a race condition that occurs only when all 4 of the following conditions hold:
- Circular wait
- Mutual exclusion
- No preemption of locks
- Hold and wait (a process can hold a resource while waiting for another)
Eliminating any of the above conditions eliminates deadlock.
One approach is to detect deadlock dynamically: the system maintains a directed graph of processes and resources. The edge P1->R1 exists if P1 owns R1. The edge R1->P2 exists if P2 is waiting for R1.
If a requested blocking wait would create a cycle in this graph, the system refuses to allow that process to wait.
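A minimal sketch of the cycle check, under simplifying assumptions of my own: processes and resources share one node numbering 0..NNODES-1, and the graph is kept as an adjacency matrix. Before letting process p block on resource r, the system asks whether adding the edge r -> p would close a cycle, i.e. whether r is already reachable from p.

#include <stdbool.h>
#include <string.h>

#define NNODES 128
static bool edges[NNODES][NNODES];   /* edges[a][b] is true if edge a -> b exists */

/* Depth-first search: can we reach `to` from `from` by following edges? */
static bool reachable(int from, int to, bool visited[NNODES]) {
  if (from == to)
    return true;
  visited[from] = true;
  for (int next = 0; next < NNODES; next++)
    if (edges[from][next] && !visited[next] && reachable(next, to, visited))
      return true;
  return false;
}

/* Would letting process p wait for resource r (adding edge r -> p) create a cycle? */
bool wait_would_deadlock(int p, int r) {
  bool visited[NNODES];
  memset(visited, 0, sizeof visited);
  return reachable(p, r, visited);
}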
Another approach: processes always acquire resources in a fixed global order. This eliminates circular wait.
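One common way to impose a fixed order, sketched here with POSIX mutexes (my choice of mechanism, not the lecture's): always lock the mutex at the lower address first, so every thread follows the same global order and no circular wait can form.

#include <pthread.h>
#include <stdint.h>

/* Lock two distinct mutexes in a globally consistent order. */
void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
  if ((uintptr_t) a < (uintptr_t) b) {
    pthread_mutex_lock(a);             /* lower address first */
    pthread_mutex_lock(b);
  } else {
    pthread_mutex_lock(b);
    pthread_mutex_lock(a);
  }
}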
A third approach: if a process can't acquire a resource, it releases all the resources it holds and starts over.
This eliminates the condition of hold and wait.
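A sketch of this "release everything and start over" idea with POSIX mutexes (again my choice of mechanism, not the lecture's): if the second lock can't be taken immediately, drop the first and retry, so we never hold one resource while waiting for another.

#include <pthread.h>
#include <sched.h>

void lock_both(pthread_mutex_t *a, pthread_mutex_t *b) {
  for (;;) {
    pthread_mutex_lock(a);
    if (pthread_mutex_trylock(b) == 0)
      return;                          /* got both locks */
    pthread_mutex_unlock(a);           /* back off: give up what we hold */
    sched_yield();                     /* let the other thread make progress */
  }
}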