Scribe Notes for Lecture 8: Consistency; Critical Sections

By: Jason Yang, Nick Ma, Harrison Chang

From Last Lecture: SJF and FCFS required us to wait for the current job to finish before starting another. We can instead use a preemptive scheduler, which allows us to stop the current job in order to run a different job.

This Lecture: Preemption breaks long jobs into a series of shorter jobs. Let us apply preemption to our scheduling policy by introducing a new policy called Round Robin.

Round Robin

Round Robin Scheduling
Job  Arrival  Length  Wait
A    0        5       0
B    1        2       d
C    2        9       2d
D    3        4       3d

Resulting Job Scheduling
Time: 0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19
Job:  A  B  C  D  A  B  C  D  A  C  D  A  C  D  A  C  C  C  C  C

Upside: Average wait time = (0 + d + 2d + 3d)/4 = 1.5d, where d is the cost of a context switch: each newly arrived job must wait out the context switches ahead of it before it first runs.

Downside: Spending more CPU time doing context switches, leading to lower throughput and utilization.

This version of Round Robin also seems a little unfair: newer processes go to the head of the run queue and can starve older processes. What about a fair Round Robin? Newer processes can go to the tail of the run queue instead, so they cannot starve older processes.
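Here is a minimal sketch of such a fair run queue (the names and list structure are illustrative, not from the lecture): arriving processes are enqueued at the tail, the scheduler dispatches from the head, and a job that still has work left is re-enqueued at the tail after its quantum.

#include <stddef.h>

struct proc {
	int pid;
	struct proc *next;
};

struct proc *head, *tail;	/* the run queue */

void enqueue(struct proc *p) {	/* newly arrived process goes to the tail */
	p->next = NULL;
	if (tail)
		tail->next = p;
	else
		head = p;
	tail = p;
}

struct proc *dequeue(void) {	/* the scheduler runs the head next */
	struct proc *p = head;
	if (p) {
		head = p->next;
		if (!head)
			tail = NULL;
	}
	return p;
}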

Priority Scheduling

There are two kinds of priorities: internal priorities, which the scheduler computes on its own, and external priorities, which are set from outside the scheduler (e.g., by a user or administrator).

For example, SJF (which determines priority by job length) and FCFS (which determines priority by arrival time) are internal priority schedulers.

We need something more fair! How about Eggert scheduling? With Eggert scheduling, the priority is determined by both the length of the job and its arrival time. (NOTE: a lower priority number indicates a higher priority.)
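As a sketch of what that computation might look like (the exact combination is an assumption; the notes only say the priority depends on both quantities):

/* Hypothetical "Eggert scheduling" priority: one plausible rule is
   priority = arrival time + job length, so both early arrivals and
   short jobs win. Lower number = higher priority, as noted above. */
double priority(double arrival, double length) {
	return arrival + length;
}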

For example, if we run this in bash: $ ps -l -p ...
We can find information about the PRI and NI values of the process, where PRI represents the combined (internal plus external) priority and NI represents the external priority (niceness) set by the user.

The nice number is the user-set priority and denotes how "nice" a program is to the rest of the system; it can be negative or positive. The default nice number for all processes is 0. You can set the nice number of any command (you can even run nice make to build at lower priority), but only root can reduce a program's niceness.
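For instance, a process can raise its own niceness with the POSIX nice() call (a minimal sketch; the increment of 5 is arbitrary):

#include <errno.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
	errno = 0;
	int n = nice(5);	/* become "nicer": add 5 to our nice value */
	if (n == -1 && errno != 0)	/* -1 can be a legitimate nice value, so check errno */
		perror("nice");
	else
		printf("new nice value: %d\n", n);
	/* nice(-5) would reduce our niceness, but only root may do that */
	return 0;
}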

Problems: with preemption, a thread can be suspended between any two instructions, so threads that share memory can interleave in unpredictable ways.

This leads us to synchronization. Synchronization is the coordination of actions in a shared address space. We want to avoid race conditions, where behavior depends on the order in which different threads happen to execute.
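As a concrete illustration (a minimal sketch, not from the lecture; compile with -pthread), two POSIX threads incrementing a shared counter without synchronization usually lose updates:

#include <pthread.h>
#include <stdio.h>

long counter;	/* shared and unsynchronized */

void *bump(void *arg) {
	(void)arg;
	for (int i = 0; i < 1000000; i++)
		counter++;	/* read-modify-write; another thread can sneak in between */
	return NULL;
}

int main(void) {
	pthread_t t1, t2;
	pthread_create(&t1, NULL, bump, NULL);
	pthread_create(&t2, NULL, bump, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	printf("%ld\n", counter);	/* often less than 2000000: a race condition */
	return 0;
}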

Synchronization and Preventing Race Conditions

Atomicity

Time →
Thread 1:  AAAAAAAAAA  (no job)
Thread 2:  (waiting for A)  BBBBBB

Thread 2 cannot observe A partway through; it waits until the atomic action A has finished, then performs B.

Atomic actions are indivisible: no other thread can observe one partway through. They should also be small, since small atomic actions are easier to implement, and if they are small enough, the hardware itself can execute them. This leads to better performance and fairness.
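For example (using C11 atomics; the lecture does not name a specific mechanism), a counter increment is small enough to be a single hardware atomic action, which would eliminate a race like the one sketched earlier:

#include <stdatomic.h>

atomic_long counter;

void bump(void) {
	atomic_fetch_add(&counter, 1);	/* one indivisible hardware read-modify-write */
}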

How to Cheat in Implementing Atomic Actions

Assume we have a bank that supports three operations:

Job  Description      Thread
A    Deposit ($10)    Thread 1
B    Withdraw ($15)   Thread 2
C    Check balance    Thread 3

Initial balance: $100
Thread 1 does A
Thread 2 does B
Thread 3 does C
Final balance: $95 (100 + 10 - 15 = 95)

But what if we looked in between threads 1 and 2 and saw that the bank balance was $85? In other words, what if the bank withdrew the money before depositing it, even though we asked it to deposit first?

The truth is, it does not matter. This sort of internal reordering is allowed as long as the external behavior is valid. This is the principle of serializability, which is a special case of observability: we only look at the API for results.

API
#include <stdbool.h>
#include <limits.h>

long long balance;

bool deposit(long long pennies) {
	if (pennies < 0 || pennies > LLONG_MAX - balance)	/* reject negatives and overflow */
		return false;
	balance += pennies;
	return true;
}

bool withdraw(long long pennies) {
	if (pennies < 0 || pennies > balance)	/* cannot withdraw more than we have */
		return false;
	balance -= pennies;
	return true;
}

A problem can arise when some other thread deposits or withdraws in between these functions' check of balance and their update of it.
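Concretely, starting from the $100 balance in the example above, one bad interleaving looks like this (amounts shown in dollars for readability):

Thread 1 (deposit $10)           Thread 2 (withdraw $15)
reads balance, sees $100
                                 reads balance, sees $100
writes balance = $110
                                 writes balance = $85

The final balance is $85 rather than $95: Thread 1's deposit has been lost.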

A possible solution:

bool withdraw(long long pennies) {
	long long b = balance;	/* take one snapshot of the balance */
	if (!(0 <= pennies && pennies <= b))
		return false;
	balance = b - pennies;	/* compute from the snapshot, then write back */
	return true;
}

However, this presents yet another problem. We may wipe out a deposit that another thread completed between our snapshot of balance and our write-back of the new balance. This brings us to the topic of critical sections.

Critical Sections

A critical section is a set of instructions such that while one instruction pointer is executing in that set, no other instruction pointer is executing in a critical section for the same object.

This supports mutual exclusion and bounded wait. The mutual exclusion provides safety against race conditions, and the bounded waiting provides fairness for processes. However, there is a tension here: to provide mutual exclusion, we want to enlarge critical sections, but to provide bounded waits, we want to shrink them. This leads us to the Goldilocks principle: we want critical sections large enough to be safe, but no larger than they have to be.

Ex: implementing pipes in threads

#define PIPE_SIZE (1<<12)
struct pipe {
	char buf[PIPE_SIZE];
	unsigned r, w;	/* read and write counts; offsets are taken mod PIPE_SIZE */
};
char readc(struct pipe *p) {
	while (p->r == p->w)	/* pipe empty: busywait for a writer */
		continue;
	return p->buf[p->r++ % PIPE_SIZE];
}
void writec(struct pipe *p, char c) {
	while (p->w - p->r == PIPE_SIZE)	/* pipe full: busywait for a reader */
		continue;
	p->buf[p->w++ % PIPE_SIZE] = c;
}

This code can fail when one thread reads at the same time that another thread has incremented the write pointer but has not yet actually written the byte. Let us create a lock that can help us define critical sections and prevent this race condition.

lock_t l;	/* a single global lock */
void lock(lock_t *);
	//this function busywaits while another thread owns the lock;
	//we call this type of lock a spin lock
	//pre-condition: this thread does not own the lock
	//post-condition: this thread owns the lock
void unlock(lock_t *);
	//unlock never has to wait
	//the reverse of lock()
char readc(struct pipe *p) {
	lock(&l);
	while (p->r == p->w)
		continue;
	unlock(&l);
	return p->buf[p->r++ % PIPE_SIZE];
}
void writec(struct pipe *p, char c) {
	lock(&l);
	while (p->w - p->r == PIPE_SIZE)
		continue;
	p->buf[p->w++ % PIPE_SIZE] = c;
	unlock(&l);
}
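The notes do not show how lock and unlock themselves might be implemented; one common approach (a sketch, assuming C11 atomics) is a spin lock built from an atomic test-and-set flag:

#include <stdatomic.h>

typedef atomic_flag lock_t;	/* initialize with ATOMIC_FLAG_INIT */

void lock(lock_t *l) {
	while (atomic_flag_test_and_set(l))	/* one indivisible test-and-set */
		continue;	/* busywait (spin) until the owner clears the flag */
}

void unlock(lock_t *l) {
	atomic_flag_clear(l);	/* the lock is now free for the next spinner */
}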

There is a problem with this placement of locks, though: it can loop forever. If the pipe is empty, readc spins while holding the lock, waiting for something to be written, but no writer can ever acquire the lock to write it.

char readc(struct pipe *p) {
	lock(&l);
	while (p->r == p->w) {
		unlock(&l);	/* release so a writer can get in */
		lock(&l);	/* then immediately try to reacquire */
	}
	char c = p->buf[p->r++ % PIPE_SIZE];
	unlock(&l);
	return c;
}
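writec needs the same treatment; a matching sketch (not in the original notes) following the same unlock-then-relock pattern:

void writec(struct pipe *p, char c) {
	lock(&l);
	while (p->w - p->r == PIPE_SIZE) {	/* pipe full */
		unlock(&l);	/* let a reader drain the pipe */
		lock(&l);
	}
	p->buf[p->w++ % PIPE_SIZE] = c;
	unlock(&l);
}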

There are still two problems here. First, there is heavy spinning within the while loop, wasting CPU time as the lock bounces between threads. Is there a solution that does not spin so much? We could, for instance, implement two kinds of locks: one for reading and one for writing. Second, the lock is too coarse-grained: it excludes accesses that do not actually conflict.

Say we have the following processes connected by pipes:

A | B   (pipe P)
C | D   (pipe Q)

When A writes to pipe P, our global lock also locks pipe Q, preventing C and D from using their pipe even though there is no real conflict. The solution is a finer-grained lock: one per pipe.

struct pipe {
	lock_t l;	/* each pipe now carries its own lock */
	char buf[PIPE_SIZE];
	unsigned r, w;
};
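With the lock inside the pipe, readc would take that pipe's own lock instead of the global one (a sketch of the corresponding change):

char readc(struct pipe *p) {
	lock(&p->l);	/* locks only this pipe; other pipes stay usable */
	while (p->r == p->w) {
		unlock(&p->l);
		lock(&p->l);
	}
	char c = p->buf[p->r++ % PIPE_SIZE];
	unlock(&p->l);
	return c;
}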

But can a lock be too finely grained? Let us look at the following example:

bool isempty(struct pipe *p) {
	lock(&p->l);
	bool ise = p->r == p->w;
	unlock(&p->l);
	return ise;
}

This is still too slow; we will find out why next lecture!