CS 111 Lecture 10: Synchronization Primitives and Deadlock
Scribe Notes for 10/28/2010
by Jason Tsao
1. Synchronization continued
A semaphore is a blocking mutex that allows up to N simultaneous locks:
1. It models a limited pool of multiple resources.
2. The resource is locked when slots == 0 and available when slots > 0,
   where initially slots = N.
- If N = 1, we have a standard blocking mutex. Therefore, a blocking mutex is
  simply a binary semaphore.
Let's try implementing a semaphore in C:
struct sema
{
    int slots;           // number of slots still available
    process_t *waiting;  // pointer to list of processes waiting for
                         // this semaphore
    lock_t l;            // spinlock protecting this structure
};
Dijkstra came up with these semantics:
P(sema) down: try to acquire a lock, decrementing the integer.
- P is short for "prolaag"
V(sema) up: release, done with lock, incrementing the integer.
- V is short for "verhoog"
// "Prolaag" is Dijkstra's contraction of "probeer te verlagen"
// (try to decrease); "verhoog" means "increase"
struct pipe
{
    bmutex_t m;
    char buf[N];
    size_t r, w;                  // read and write offsets
    condvar_t nonfull, nonempty;  // just condition variables
};
void writec(struct pipe *p, char c)
{
    for (;;)
    {
        acquire(&p->m);  // Note that "->" binds more tightly than "&",
                         // so &p->m is the same as &(p->m)
        if (p->w - p->r != N)
            break;
        release(&p->m);
    }
    p->buf[p->w++ % N] = c;
    release(&p->m);
}
// There is a problem with the above code on multi-core machines:
// it is correct, but it can chew up a lot of CPU time.
Here is the scenario in which it performs badly:
1. Tons of writers <-- these turn into CPU vampires!
2. One lackadaisical reader
We want:
void writec(struct pipe *p, char c)
{
    for (;;)
    {
        wait_for(&p->m, p->w - p->r != N);  // Problem: we can't do this in C,
        break;                              // because we can't wait for the
                                            // condition to become true.
        // Problem 1: C passes arguments by value, so the condition is
        //            evaluated once, before wait_for even runs.
        // Problem 2: We don't even hold the lock; the condition is false
        //            now, and we want to wait until it becomes true.
        release(&p->m);
    }
    p->buf[p->w++ % N] = c;
    release(&p->m);
}
Aside: GNU tar + Signals
There was a bug in GNU tar: its signal handling was timing dependent.
Many GNU utilities/commands broke after someone added a signal handler
to the GNU tar code. Consider:
    read(fd, buf, sizeof buf);
Question: If a signal goes off while the read is executing, should that be
an error?
Answer: It depends on what the signal handler does. The read can return
-1 with errno == EINTR when the handler fires:
    signal_handler fires
        do stuff  -> will this stuff call syscalls that affect my read?
        return;
2. Condition variables:
We need:
1. a boolean condition defining when to wake up
2. a blocking mutex protecting the condition
3. the condition variable itself
API:
wait(condvar_t *c, bmutex_t *b)
- releases b, blocks until some other thread notifies on c,
  then reacquires b, then returns
  PRECONDITION: we hold b
notify(condvar_t *c)     // notify one waiting thread
broadcast(condvar_t *c)  // notify all waiting threads
Using condition variables, the above code turns into:
void writec(struct pipe *p, char c)
{
    acquire(&p->m);
    while (p->w - p->r == N)
        wait(&p->nonfull, &p->m);  // we won't wake until the pipe changes
    p->buf[p->w++ % N] = c;
    notify(&p->nonempty);
    release(&p->m);
}
3. Synchronization gotchas
- If preconditions are not respected, bugs are hard to find:
  - we need debugging versions of the libraries, with bigger locks
- Suppose a parent process wants to sort data so big it can't fit into RAM.
  Therefore, it forks children and connects pipes to them, one in each
  direction:
      parent process
        |        ^
        V        |
        brzlfuz
- If the child is brzlfuz, this can deadlock when both pipes are full
- If the child is sort, it won't, because sort doesn't generate any output
  until it has gotten all of its input
4. Deadlock
- Deadlock is itself a race condition
- 4 conditions are necessary for deadlock:
  1. circular wait
  2. mutual exclusion
  3. no preemption of locks
  4. hold and wait
     - that is, a thread can hold one resource while waiting
       for another
- It's up to the coder to avoid deadlocks no matter how slow/fast the
  system they are running on is
Disciplines for avoiding deadlocks
1. acquire() checks at runtime for cycles, and fails if the acquisition
   would create a cycle
   (+) no deadlock
   (-) complicates the API and applications: a new errno value,
       EWOULDDEADLOCK, and acquire now returns a boolean saying whether
       it failed
   (-) run-time cost: O(length of the dependency chain) <-- in practice,
       this is short
2. Lock ordering
   Suppose you need to acquire locks l, m, n, where &l < &m < &n:
   - acquire(&l), then grab(&m), then grab(&n), where grab is a try-lock:
     it fails immediately instead of waiting if it can't acquire the lock
   - if any grab fails, release every lock acquired so far and loop back
     to acquire(&l)
   Because every thread takes locks in ascending address order and never
   waits while holding a later lock, no circular wait can form.
5. Priority Inversion
- For this to occur, one needs locks + scheduling priorities
- Only requires one lock
- Suppose we have 3 threads, and the scheduler always runs the
  highest-priority runnable thread:

  time    T_lo          T_mid       T_hi          <== threads
   |      runnable      waiting     waiting
   |      lock(&l);
   |      (preempted)   waiting     runnable
   |                                acquire(&l);  <-- blocks: T_lo holds &l
   V                    runnable
                        I HAVE THE CPU!
Problem: T_lo holds the lock and T_hi wants it, so when T_hi tries to
acquire, it is put to sleep. If T_mid becomes runnable before T_lo gets to
release its lock, the scheduler runs T_mid instead of T_lo (T_mid has the
higher priority), so T_lo never gets the CPU to release the lock, and T_hi,
the highest-priority thread, can wait indefinitely behind T_mid.
Solution: High priority waiters should lend their priorities
to the lock owners
- This is not deadlock, just feels like it
6. Livelock (nobody waits, but no work gets done!)
- Livelock is when several threads continue to run, but the system as a
  whole makes no progress, because it keeps repeating patterns of resource
  try-locks and failures.
Here is a diagram illustrating receive livelock, after
http://www.paulbridger.com/files2/livelock.png:
                       bounded buffer
                   ----------------------
  requests ------> | unhandled requests |
  (100 Gb/s link)  ----------------------
- the interrupt handler enqueues incoming requests into the buffer
- the CPU can process requests @ 10 Gb/s
When receive livelock occurs:

  handled
 requests/s
    |
 10 | -------------------------------- polling
    |      /---
    |     /    \ <-- interrupt handler has all the CPU!
    |    /      \
    |   /        \_ _ _ __ __ _ _
    +---------------------------------------
                incoming requests/s

This graph shows that although polling takes up more CPU, it gets more work
done in the case where livelock would otherwise occur: past the peak, the
interrupt-driven system spends all its time fielding interrupts and
enqueueing requests it never gets around to processing.
Therefore, here are the solutions:
1. disable interrupts when busy
2. polling
To take a more in-depth look, see http://www.stanford.edu/class/cs240/readings/mogul.pdf
7. Event-driven Programming (smaller apps)
- Has these key attributes:
  1. no locks
  2. no threads
  3. no scheduling
  4. no deadlocks
  5. no races
The main program basically runs as follows:
for (;;)
{
    wait for any event e;
    act on (e);  // must be fast, must not wait
    // If you have a complex action, split it up.
    // Each subaction sets up the event handler for the next.
}
8. File System Performance Issues
2 main storage devices:
1. disk  - larger, cheaper per byte
2. flash - faster
CPU speed grew exponentially, while hard disk speed increased only linearly.
Hard Disk Layout:
Example: Seagate Barracuda ES.2 1 TB drive:
    sector size: 512 or 4096 bytes
    7200 rpm = 120 rotations/second
Components of disk latency, to read a random sector:
1. Seek time: wait for the read head to get to the right track
   // average read seek is 8.5 ms; writes are normally 9.5 ms
   // many OSes require response time to be < 100 ms
2. Rotational latency: a full rotation takes 1/120 s = 8.33 ms, so the
   average rotational latency (half a rotation) is about 4.17 ms
   - this is slow