Lecture 8: Consistency; Critical Sections

Recorded by: Stella Chung, Leslie Lam, and Lynn Tran

Jump to:

Scheduling Algorithms | Priority Scheduling | Synchronization | Critical Sections

Scheduling Algorithms - Round Robin


In the last lecture, we discussed two scheduling algorithms, Shortest Job First and First Come First Serve. Although these algorithms are significantly better than our previous approach, they are still too inefficient. What we want is to break the longer jobs into a series of shorter jobs and then schedule the shorter jobs (a special case of ordinary scheduling). This is called preemption.

Round Robin scheduling is an algorithm that splits the jobs into unit pieces, each of length 1 or less.

Job   Arrival Time   Length of Job   Wait Time
A     0              5               0
B     1              2               δ
C     2              9               2δ
D     3              4               3δ

We then schedule one second per process:

Time:     1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Process:  A  B  C  D  A  B  C  D  A  C  D  A  C  D  A  C  C  C  C  C

B is the first job to finish (at the end of its second time slice), then D, then A, until C is the only process left running. This schedule reduces the average wait time to 1.5δ, since each job waits only for the context switches (each costing δ) before its first time slice.

Disadvantages of Round Robin Scheduling:

  • More context switches, so more total overhead (each switch costs δ).
  • Long jobs take longer to finish than under Shortest Job First.

Priority Scheduling


In order to maximize the efficiency of our scheduler, we assign priorities: each process gets a priority, and the scheduler runs the highest-priority runnable process first.

Examples of priority scheduling:

  • Linux "niceness": a nicer process yields to less nice ones. Nice values range from -20 (greediest) to 19 (nicest).

The ps command (e.g., ps -l) shows the priority and niceness of processes.

Problems:

  • Starvation: a low-priority process may never run if higher-priority processes keep arriving.

Synchronization


Synchronization is the most difficult part of a multithreaded application. When coordinating actions in a shared address space, poor synchronization can lead to race conditions.

Race conditions occur when the behavior of a program depends on the execution order of different threads.

There are several ways to prevent race conditions:

  1. Sequence Coordination
    • Insist on a particular order: arrange for thread A to finish before thread B starts. Executes like the sequencing command (A;B).
    • Simple way: put A and B into the same thread.
    • Waitpid: Child = A, Parent = B
      • B calls waitpid to wait for child A to finish executing before continuing.
  2. Isolation
    • Two threads do not share the same objects.
    • A and B never access the same object, so they cannot collide.
  3. Atomicity
    • More complicated than sequence coordination and isolation.
    • Not isolated, but gives some kind of sequence coordination.
    • Arrange for A to be executed all before B or all after B (Acts as if A;B or B;A).
    • Kinds of Atomic Actions:
      1. Small
        • Better and easier to implement.
        • If sufficiently small, could possibly be executed with a single machine instruction (the hardware will do it).
        • Leads to better and more fair performance.
      2. Indivisible
        • Action cannot be split into pieces.
    • How to "cheat" in implementing atomic actions:
      • Observability
        • If the user can't see anything wrong, it's ok.
      • Principle of Serializability
        • If you execute out of order but can come up with a serial-order explanation, it can still work.
        • This is a special case of observability (all you have to do is look at the API; as long as the bug cannot be seen through it, everything is good).

Implementing Deposit and Withdrawal

API:

bool deposit(long long pennies); 
// Returns false if the deposit is invalid (negative, or would overflow the balance)

bool withdraw(long long pennies);
// Returns true if withdrawal was successful

Implementation:

#include <limits.h>
#include <stdbool.h>

long long balance;

bool deposit(long long pennies) {
	if (!(0 <= pennies && pennies <= LLONG_MAX - balance))
		return false;
	// CAUTION: some other thread could deposit here!
	balance += pennies;
	return true;
}

bool withdraw(long long pennies) {
	if (!(0 <= pennies && pennies <= balance))
		return false;
	// CAUTION: some other thread could withdraw here!
	balance -= pennies;
	return true;
}

There is a problem with this code, however. What if deposit and withdraw are run at the same time? If some other thread deposits after our check for overflow, then when we run balance += pennies, the state of balance is not what we checked, and the addition could overflow. The same applies to withdrawing: if another thread withdraws after our thread's check, we could withdraw more than what is actually in the balance!

Better version of withdraw:

bool withdraw(long long pennies) {
	long long b = balance;
	if (!(0 <= pennies && pennies <= b))
		return false;
	// CAUTION: could still lose a deposit here!
	balance = b - pennies;
	return true;
}

Note that there is still a problem with this. If there are multiple threads being run at the same time, you could lose a deposit when you run balance = b - pennies.
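One way to close this hole is to make the check and the update behave as one small indivisible action, using the hardware-supported atomic operations mentioned earlier. This sketch is not from the lecture; it assumes C11 atomics and retries whenever another thread changed balance between our read and our write:

```c
#include <stdatomic.h>
#include <stdbool.h>

_Atomic long long balance;

bool withdraw(long long pennies) {
    long long b = atomic_load(&balance); // snapshot of the balance
    for (;;) {
        if (!(0 <= pennies && pennies <= b))
            return false; // negative amount or insufficient funds
        // Atomically: if balance still equals b, set it to b - pennies.
        // If some other thread changed balance first, the call fails,
        // reloads b with the fresh value, and we retry the check.
        if (atomic_compare_exchange_weak(&balance, &b, b - pennies))
            return true;
    }
}
```

The compare-and-exchange is a single atomic hardware operation, so no update can be lost; a concurrent deposit simply forces this loop to retry with the new balance.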

Critical Sections


A critical section is a set of instructions such that while one thread's instruction pointer is within the section, no other thread's instruction pointer is within the same section for the same object.

In the case of the bank example, the balance object is critical.

A critical section supports:

  1. Mutual Exclusion
    • For safety: Enlarge your critical sections for easy mutual exclusion.
  2. Bounded Wait
    • If a thread is in a critical section, it must finish the job as quickly as possible!
    • Bounded waits are needed for fairness so that no one thread is preventing other threads from continuing.
    • To achieve: Shrink your critical sections so you don't lock out processes unless necessary.

To support both mutual exclusion and bounded wait, the critical sections must satisfy the Goldilocks Principle. A critical section cannot be too large or too small.

Implementing Pipes in Threads

Assume a multithreaded environment. We give up some of the safety of standard Linux pipes in exchange for performance.

#define PIPE_SIZE (1 << 12)
struct pipe {
	char buf[PIPE_SIZE];
	unsigned r, w; // Keep track of where we are in the pipe
};

char readc(struct pipe *p) {
	while (p->r == p->w) // While pipe is empty, wait.
		continue;
	// Note: mod only works because PIPE_SIZE is a power of 2
	return p->buf[p->r++ % PIPE_SIZE];
}

void writec(struct pipe *p, char c) {
	while (p->w - p->r == PIPE_SIZE) // While pipe is full, wait
		continue;
	p->buf[p->w++ % PIPE_SIZE] = c;
}

We have a problem! This doesn't work in multiple threads unless we have critical sections!

Spin Locks

lock_t is the type of a lock.

void lock(lock_t*);
// Busy wait while some other thread owns the lock
// Don't give up the CPU while waiting
// Precondition: This thread doesn't own the lock
// Postcondition: This thread owns the lock


void unlock(lock_t*);
// No waiting
// Precondition and Postcondition opposite from lock(lock_t*)
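The notes leave the implementation of lock and unlock abstract. One common implementation, sketched here with C11 atomics (an assumption, not the lecture's code), busy-waits on an atomic test-and-set:

```c
#include <stdatomic.h>

typedef atomic_flag lock_t; // initialize with ATOMIC_FLAG_INIT

void lock(lock_t *l) {
    // test-and-set is a single atomic hardware operation: it sets the
    // flag and returns its previous value. Spin (busy wait) until we
    // are the thread that flips it from clear to set.
    while (atomic_flag_test_and_set(l))
        continue;
}

void unlock(lock_t *l) {
    atomic_flag_clear(l); // no waiting
}
```

Because the test and the set happen in one indivisible step, two threads cannot both observe the lock as free and acquire it at the same time.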

Now let's add locks to the previous code:

char readc(struct pipe *p) {
	lock(&l); // Lock the pipe to read
	while (p->r == p->w)
		continue;
	return p->buf[p->r++ % PIPE_SIZE];
	unlock(&l); // Unlock the pipe
	// Note: We return before we unlock! This is a bug.
}

void writec(struct pipe *p, char c) {
	lock(&l); // Lock the pipe to write
	while (p->w - p->r == PIPE_SIZE)
		continue;
	p->buf[p->w++ % PIPE_SIZE] = c;
	unlock(&l); // Unlock the pipe
}

While this code is safe, it is not fair. No one can use the CPU while we are waiting on empty pipes and full pipes, because we are busy waiting and we are holding the lock on the pipe. Our critical section is too large. Our problem is that there is an unbounded amount of computation inside the critical sections. We have to make our critical sections smaller.

Let's change our code to fit the Goldilocks Principle.

char readc(struct pipe *p) {
	lock(&l);
	while (p->r == p->w) {
		// While pipe is empty, unlock and then relock
		unlock(&l);
		// We hope the scheduler will pre-empt between the
		// unlock and lock commands
		lock(&l);
	}
	char c = p->buf[p->r++ % PIPE_SIZE];
	unlock(&l);
	return c;
}

This code still has several problems:

  1. Spinning lock in while loop
    • The lock is still chewing up CPU in the while loop.
  2. Lock is too coarse grain
    • What if there are multiple pipes?
    • There is a single global lock. This excludes all other access to all other pipes! Most time is spent waiting. While anyone is accessing a pipe, everyone has to spin.
    • This lock excludes too much. It excludes accesses that are not really a problem.

To fix problem 2, instead of having a single global lock, have a finer grain lock: give each pipe its own lock. This lock is finer grain because it protects only one pipe. In general, finer grain locks help satisfy the Goldilocks Principle, so we want to implement them where possible. However, finer grain locks chew up memory and make the code more complicated.
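A sketch of the finer-grain design, assuming (as an illustration) a test-and-set spin lock embedded in each struct pipe:

```c
#include <stdatomic.h>

#define PIPE_SIZE (1 << 12)

typedef atomic_flag lock_t; // sketch: a test-and-set spin lock
void lock(lock_t *l)   { while (atomic_flag_test_and_set(l)) continue; }
void unlock(lock_t *l) { atomic_flag_clear(l); }

struct pipe {
    lock_t l; // per-pipe lock: protects only this pipe's buf, r, w
    char buf[PIPE_SIZE];
    unsigned r, w;
};

char readc(struct pipe *p) {
    lock(&p->l); // excludes only threads using this same pipe
    while (p->r == p->w) {
        unlock(&p->l); // pipe empty: let a writer in while we wait
        lock(&p->l);
    }
    char c = p->buf[p->r++ % PIPE_SIZE];
    unlock(&p->l);
    return c;
}

void writec(struct pipe *p, char c) {
    lock(&p->l);
    while (p->w - p->r == PIPE_SIZE) {
        unlock(&p->l); // pipe full: let a reader in while we wait
        lock(&p->l);
    }
    p->buf[p->w++ % PIPE_SIZE] = c;
    unlock(&p->l);
}
```

Threads using different pipes never contend for the same lock, so only accesses that could actually collide are excluded.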