CS111 Week 5 Monday 02/04/13

in order of contribution:
Ryan Mercer, Benjamin Hendricks, Misha Danes, and Michael Haley

Scheduling

Scheduling is needed when multiple processes compete for CPU time. The methods discussed below will be compared using these example processes:

Name  Arrival Time  Run Time*
A     0             5
B     1             2
C     2             9
D     3             4

*The BIG assumption is that run times are known in advance. The example mentioned in class was the time taken by the mainframe in Murphy Hall to calculate payroll: it's a pretty complex task, but since it happens every month, we have a good idea of how long it will take in the future.

First Come First Served (FCFS)

The most basic procedure is FCFS, which is simply a queue. FCFS is an example of fair scheduling, where fairness means that every process that wants to run is eventually guaranteed to run. The CPU schedule, in time units, is as follows:

|A|A|A|A|A|B|B|C|C|C|C|C|C|C|C|C|D|D|D|D|
 ^ ^ ^ ^
 A B C D   (arrival times)

Name  Wait Time  Turnaround Time
A     0          5
B     4          6
C     5          14
D     13         17
Avg.  5.5        10.5
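
As a sanity check, here's a minimal C sketch (not from lecture; the struct and names are ours) that replays the example jobs through FCFS and reproduces the table above:

#include <stdio.h>

struct job { char name; int arrival, runtime; };

int main(void) {
    /* the example processes, already sorted by arrival time */
    struct job jobs[] = { {'A',0,5}, {'B',1,2}, {'C',2,9}, {'D',3,4} };
    int n = sizeof jobs / sizeof jobs[0];
    int clock = 0, total_wait = 0, total_turnaround = 0;

    for (int i = 0; i < n; i++) {
        if (clock < jobs[i].arrival)        /* CPU idles until the job arrives */
            clock = jobs[i].arrival;
        int wait = clock - jobs[i].arrival;       /* time spent in the queue  */
        int turnaround = wait + jobs[i].runtime;  /* queue time + service time */
        printf("%c: wait %2d, turnaround %2d\n", jobs[i].name, wait, turnaround);
        clock += jobs[i].runtime;                 /* run the job to completion */
        total_wait += wait;
        total_turnaround += turnaround;
    }
    printf("avg wait %.2f, avg turnaround %.2f\n",
           (double)total_wait / n, (double)total_turnaround / n);
    return 0;
}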

The obvious issue with FCFS is that many short processes will have to wait for a few long processes to complete. This is equivalent to making a Costco run on a Tuesday morning with two of your roommates, each picking up a six-pack of Crest spearmint dental floss, but all having to wait in line for the bros of Tau Beta Pi to find their membership card and pay for their cartful of TI-84s, latex gloves, toner, etc.

Shortest Job First (SJF)

Now that we're concerned with getting the shorter processes through, we come to the next method, Shortest Job First (SJF). Its execution looks like:

|A|A|A|A|A|B|B|D|D|D|D|C|C|C|C|C|C|C|C|C|
 ^ ^ ^ ^
 A B C D   (arrival times)

Name  Wait Time  Turnaround Time
A     0          5
B     4          6
C     9          18
D     4          8
Avg.  4.25       9.25

As expected, the average wait time has decreased, and the average turnaround time decreased by the same amount, since the run times themselves are unchanged. But SJF can starve long jobs: suppose a short job takes 3 seconds and a longer job takes 4, and another 3-second job arrives before the first finishes; the new short job will be chosen next. If short jobs keep arriving, the 4-second process might wait arbitrarily long before it is chosen to execute, making this method unfair.
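
To make the selection rule concrete, here is a hedged sketch of the chooser SJF needs whenever the CPU goes idle; the struct and field names are invented (we add a done flag to the job struct), and per the BIG assumption above, run times are known:

struct job { int arrival, runtime, done; };

/* Return the index of the shortest job that has arrived and is not
 * finished, or -1 if none is ready.  Non-preemptive: this is called
 * only when the CPU goes idle. */
int pick_sjf(struct job jobs[], int n, int clock) {
    int best = -1;
    for (int i = 0; i < n; i++)
        if (!jobs[i].done && jobs[i].arrival <= clock &&
            (best == -1 || jobs[i].runtime < jobs[best].runtime))
            best = i;
    return best;
}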

Combo of FCFS and SJF

While we really would like the speed of SJF, we would also like the fairness of FCFS. Simply combining the two gives us both. Here the idea of "shortest job" is replaced by "lowest points," where a job's points are the sum of its arrival time and its estimated run time. On the example jobs A-D this produces the same schedule as SJF, but in general the average wait and turnaround times land somewhere between FCFS and SJF.
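
Only the comparison changes from the SJF sketch above; a one-line helper (our name, not the lecture's) captures the scoring:

/* 'points' for the FCFS+SJF hybrid: arrival time plus estimated run
 * time; the ready job with the lowest points runs next.  Plugging
 * this into the chooser above, in place of the raw runtime
 * comparison, yields the combined scheduler. */
int points(const struct job *j) { return j->arrival + j->runtime; }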

Round Robin (RR)

Another scheme introduces the concept of preemption: interrupting a program without its participation and resuming it at a later time. RR combines FCFS with preemption. With a time quantum of 1 unit, the example jobs run like this:

|A|B|C|D|A|B|C|D|A|C|D|A|C|D|A|C|C|C|C|C|
 ^ ^ ^ ^
 A B C D   (arrival times)

Name  Wait Time  Turnaround Time
A     0          15
B     0          5
C     0          18
D     0          11
Avg.  0          12.25

While the wait times (time until first run) disappear completely in this example, that time has effectively been transferred into the turnaround times, whose average rose to 12.25.

Applications: echoing typed characters to the screen, where short response times matter more than total throughput.
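
Here's a minimal sketch (ours, assuming a 1-unit quantum) that reproduces the RR diagram above by cycling through the job table and giving one time unit to every job that has arrived and still has work left; idle-CPU handling is omitted since this example never idles:

#include <stdio.h>

struct job { char name; int arrival, left; };

int main(void) {
    struct job jobs[] = { {'A',0,5}, {'B',1,2}, {'C',2,9}, {'D',3,4} };
    int n = 4, clock = 0, remaining = 5 + 2 + 9 + 4;

    /* keep cycling through the job list, handing out 1-unit quanta
     * to every job that has arrived and still has work left */
    while (remaining > 0)
        for (int i = 0; i < n; i++)
            if (jobs[i].left > 0 && jobs[i].arrival <= clock) {
                putchar(jobs[i].name);   /* this job gets the next time slot */
                jobs[i].left--;
                remaining--;
                clock++;
            }
    putchar('\n');   /* prints ABCDABCDACDACDACCCCC, matching the diagram */
    return 0;
}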

Priority Scheduling

The idea is that there is some scheme for assigning priority, and jobs with the highest priority run first. The previously described methods all use internal priorities assigned by the OS, but priority can also be assigned by external forces, referred to as external priorities. The following priority hierarchy is used on the SEASnet machines:
Students ≤ Professors ≤ System Administrators

Issues with Scheduling

Scheduling is hard in general because the scheduler usually lacks complete information about which process should be running when; at the same time, we cannot give the scheduler much more information, because we want it to be very fast and to use as few resources as possible. Therefore, schedulers are always made to suck a little at best, and be awful at worst. This tension matters most for real-time scheduling, which breaks down into two categories:

Hard Real Time Scheduling

In this type of scheduling, deadlines cannot be missed. For example, if the scheduler for a nuclear power plant or for the brakes in a car fails, then something VERY bad happens. In this case, we opt for predictability over performance. Methods for this include turning off the cache (so there are never cache misses, whose unpredictable timing could cause a missed deadline), using polling instead of interrupts, and disallowing preemption.
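
A sketch of the polling style, with an invented device register address and ready bit:

/* Hypothetical hard-real-time style I/O: instead of waiting for an
 * interrupt, repeatedly poll a device status register.  The register
 * address and ready bit are made up for illustration. */
#define DEV_STATUS ((int volatile *)0xFFFF0000)
#define DEV_READY  0x1

void wait_for_device(void) {
    while ((*DEV_STATUS & DEV_READY) == 0)
        continue;   /* predictable busy-wait: no interrupt latency to bound */
}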

Soft Real Time Scheduling

In this type of scheduling, deadlines can occasionally be missed. For example, a video player can drop a frame here and there, but overall it still gets the video onto the screen. Unlike in hard real-time scheduling, we keep all the performance boosts that were turned off for predictability's sake. Some soft real-time strategies include earliest deadline first, or EDF (ignoring jobs whose deadline has already passed), and rate-monotonic scheduling, which assigns higher priority to tasks that must run more frequently.
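
A hedged sketch of the EDF pick (the task fields are invented):

struct task { int deadline, done; };

/* Earliest Deadline First: among tasks whose deadline has not yet
 * passed, pick the one due soonest; tasks already past their deadline
 * are skipped (a frame we can no longer show isn't worth rendering). */
int pick_edf(struct task tasks[], int n, int now) {
    int best = -1;
    for (int i = 0; i < n; i++)
        if (!tasks[i].done && tasks[i].deadline >= now &&
            (best == -1 || tasks[i].deadline < tasks[best].deadline))
            best = i;
    return best;
}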

Synchronization

When many processes are being scheduled (and not just one), there will always be places where a series of jobs needs to be executed serially. For example, say we have two processors running: processor 1 has jobs A, B, and C, and processor 2 has jobs a, b, and c. If job B needs job a to finish before it can start, that's something the scheduler needs to keep track of! While jobs A and a can run at the same time, if A finishes before a does, then B must not start until a is done. But how do we enforce this? With atomicity, discussed below.

The main reason this is such an issue is the race conditions that occur without synchronization. The standard example is a bank account that supports withdrawals and deposits. If the same account is withdrawn from and deposited to at the same time, the balance can end up incorrect when the two actions overlap (a race condition). We therefore need the withdrawal to occur atomically: once a withdrawal starts, nothing else can touch the balance until it is done.
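
Here's a sketch (ours, with invented account variables) of the atomic withdrawal using POSIX mutexes; pthread_mutex_lock/unlock are the real-world version of the lock/unlock primitives built later in these notes:

#include <pthread.h>

long balance = 100;
pthread_mutex_t balance_lock = PTHREAD_MUTEX_INITIALIZER;

/* Withdraw atomically: the check and the update happen while holding
 * the lock, so no deposit or other withdrawal can slip in between. */
int withdraw(long amount) {
    int ok = 0;
    pthread_mutex_lock(&balance_lock);
    if (balance >= amount) {        /* without the lock, two threads can */
        balance -= amount;          /* both pass this check and overdraw */
        ok = 1;
    }
    pthread_mutex_unlock(&balance_lock);
    return ok;
}

void deposit(long amount) {
    pthread_mutex_lock(&balance_lock);
    balance += amount;
    pthread_mutex_unlock(&balance_lock);
}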

How To Implement Atomicity

Case 0:

There is only 1 CPU and there are no interrupts or signals. In this case we are already done: instructions run to completion without interleaving, so the bank balance is safe.

Case 1:

There is only 1 CPU but there are interrupts or, at the application level, signals.

For example:

int signal_arrived;                  /* set asynchronously by the handler */

void handle_sig(int sig) { signal_arrived = 1; }

for (int i = 0; i < 100000; i++)
    if (signal_arrived)
        break;
    else
        compute();

After the 'for' loop, we set signal_arrived = 0.

In this example, there are two instructions between testing signal_arrived and acting on it, which gives an incoming signal room to cause a race. That isn't a problem if handling two signals as one is acceptable, but it still needs improvement. Instead, we'll do it this way:

sigset_t set;
sigemptyset(&set);
sigaddset(&set, SIGALRM);            /* the signal we block and unblock */
for (int i = 0; i < 100000; i++) {
    pthread_sigmask(SIG_BLOCK, &set, NULL);    /* block SIGALRM  */
    if (signal_arrived) break;
    pthread_sigmask(SIG_UNBLOCK, &set, NULL);  /* unblock SIGALRM */
}

Blocking the signal here has the same effect as a lock: the signal handler cannot run (and possibly re-enter our code) in the middle of the test, which is what we don't want to break. We must find the critical sections in our code, because at most one thread of control may be active inside a critical section at any given time.

We must also watch out for compiler optimizations. For example:

int sa;
sa++;
print(sa);

In assembly:

movl  sa, %eax
movl  %eax, %ebx
incl  %ebx
pushl %eax
call  print
movl  %ebx, sa
In cases like this, compilers often cache memory in registers because they are very smart: here the increment happens in %ebx, and the store back to sa is delayed until after the call to print, so anything that looks at sa in the meantime sees a stale value. But how can we tell the compiler to be "stupid" for a given variable? By changing int sa into static int volatile sa.

The volatile keyword means the compiler will not optimize accesses to that variable (every read and write really goes to memory), which gives us proper serialization.

Case 2:

Atomicity with multiple CPUs

To lock around a specific section of code, we create a mutex_t and use the functions lock(mutex_t *m) and unlock(mutex_t *m), where the argument specifies which object to lock.

Every critical section for an object 'o' must call lock(&o->mutex) first and unlock(&o->mutex) afterwards. Also, if we try to lock while the object is already locked, lock waits until the lock is freed. A first attempt:

typedef int mutex_t;

void lock(mutex_t *m) {
    while (*m) continue;   /* spin until the lock looks free...       */
    *m = 1;                /* ...then take it (racy: see rule 4 below) */
}

void unlock(mutex_t *m) {
    *m = 0;                /* release the lock */
}
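
Hypothetical usage, following the lock(&o->mutex) / unlock(&o->mutex) pattern described above (the account struct is ours):

struct account { mutex_t mutex; long balance; };

void account_deposit(struct account *o, long amount) {
    lock(&o->mutex);        /* enter the critical section           */
    o->balance += amount;   /* no other CPU can touch balance here  */
    unlock(&o->mutex);      /* give the lock back as soon as we can */
}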

Rules of Use

  1. The owner of a lock must not lock it again
  2. Only the owner of a lock can unlock it
  3. Give up a lock as quickly as possible - the longer a lock is held, the worse overall performance gets
  4. A race is still possible - we need the test and the set inside lock() to happen as a single atomic step

Atomic instructions on x86 are designed for exactly this, although they are slow. For example, lock incl x: two calls to this will reliably add 2 to x. But the available atomic instructions are very specific, and we are looking for something more generic. All we want is a simple lock() and unlock() with some sort of safety measure in the middle.
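
As a sketch, GCC inline assembly can express that atomic increment (this is our example, not the lecture's):

int volatile x;

/* The lock prefix makes the read-modify-write one atomic step, so
 * two concurrent calls really do add 2 to x. */
void atomic_inc(void) {
    asm volatile ("lock incl %0" : "+m" (x));
}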

For example, the instruction xchgl %eax, mem exchanges the contents of %eax and mem atomically:

L1:
    movl  $1, %eax
    xchgl %eax, (%ebx)
    testl %eax, %eax
    jnz   L1

In this example, if %eax is 0 after the exchange, the lock was free and we now own it; otherwise the lock was already taken, so we retry. unlock can remain a plain store because stores of integers are atomic on x86 (provided the integer is aligned on a multiple of 4). This also means mutex_t needs to be volatile.
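
For reference, the same spin lock can be sketched in C with GCC's atomic builtins (__sync_lock_test_and_set compiles to an xchg on x86); this is our translation, not the lecture's code:

typedef int volatile mutex_t;

void lock(mutex_t *m) {
    while (__sync_lock_test_and_set(m, 1))  /* atomically swap in a 1     */
        continue;                           /* old value 1: lock was held */
}

void unlock(mutex_t *m) {
    __sync_lock_release(m);                 /* atomically store 0 */
}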