Access to time/CPU
Many applications want the CPU at the same time. If the number of applications is less than the number of CPUs, there is no problem, but when applications outnumber CPUs, issues start to arise.
Are applications cooperative?
Threads are like processes, except that they share memory, file descriptors, and access rights. Each thread has its own virtual CPU and stack; the idea is to give a thread only the minimal resources it needs. Threads are useful for breaking a large program down into smaller sequential programs without worrying about security boundaries between the pieces.
The standard method for creating a new thread is:
int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg)
Threads can be joined, so that the program may resume sequential execution (often to avoid race conditions), with the following function, which waits for the specified thread to terminate:
int pthread_join(pthread_t thread, void **retval)
Threads are terminated with the following function:
void pthread_exit(void *retval)
It is instructive to note the similarities between these functions and the system calls fork(), waitpid(), and _exit(), which perform the equivalent operations for processes.
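A minimal sketch tying the three calls together (the worker function and the values passed through it are illustrative):

#include <pthread.h>
#include <stdio.h>

/* Each thread runs this function; the argument and return value are
   passed as void pointers, matching the signatures above. */
static void *worker(void *arg) {
    long id = (long)arg;
    printf("thread %ld running\n", id);
    pthread_exit((void *)(id * 2));   /* terminate, handing back a value */
}

int main(void) {
    pthread_t t;
    void *result;
    if (pthread_create(&t, NULL, worker, (void *)1L) != 0)
        return 1;
    pthread_join(t, &result);         /* like waitpid(): block until it exits */
    printf("thread returned %ld\n", (long)result);
    return 0;
}

(Compile with -pthread on gcc or clang.)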
When scheduling the CPU, both a dispatch mechanism (often OS- or machine-specific) and scheduling policies (shareable among operating systems) are necessary.
Assuming for the time being that just one CPU is available, a typical picture of CPU usage looks like this: Thread 1 runs, then the scheduler, then Thread 2, then the scheduler again, and so on.
It is evident that the CPU is always doing something, even when neither thread is running; a small amount of overhead comes from running the scheduler. A transition between running threads is actually two transitions: one from Thread 1 to the scheduler, and another from the scheduler to Thread 2. The threads run in user mode while the scheduler runs in kernel mode, so the first transition resembles a system call, and the second, from the scheduler to Thread 2, resembles a return from interrupt (RTI) instruction.
A cooperative thread is one that periodically volunteers to give up the CPU so that another thread can run. This is done via yield(). One way to ensure cooperation is to call yield() in every loop iteration or recursive call. This method is unattractive, however, because it requires the user to modify all of their code to include these yields. A second method, which avoids code modification, is to force a thread to give up the CPU every time it makes a system call (i.e. every time there is a transition to kernel mode). This method runs into trouble with buggy applications and with code whose loops contain no system calls.
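On POSIX systems the yield described above is available as sched_yield(); a sketch of the yield-every-iteration approach (do_work is a hypothetical placeholder for the application's real computation):

#include <sched.h>
#include <stdio.h>

/* Hypothetical placeholder for one unit of the application's real work. */
static void do_work(int i) { printf("work unit %d\n", i); }

/* Cooperative loop: after each unit of work, volunteer the CPU so that
   other runnable threads get a chance to run. */
void cooperative_loop(int iterations) {
    for (int i = 0; i < iterations; i++) {
        do_work(i);
        sched_yield();
    }
}

int main(void) { cooperative_loop(3); return 0; }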
The best way to regain control from a thread that has taken over the CPU is with a clock. This clock serves as a timer interrupt and can be set to trap after a specific time interval has elapsed (e.g. 10 ms). The motherboard sends the CPU a signal that causes the CPU to trap as if an INT instruction had been executed; the kernel saves the thread's state, and the scheduler is allowed to run. Choosing the length of the interval involves a tradeoff.
With a longer interval:
+ The scheduler is called less often, so there is less overhead.
- Latency goes up.
With a shorter interval:
+ Latency goes down.
- There is more scheduling overhead.
How small can the interval usefully be? As small as the time it takes the scheduler itself to run.
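The hardware clock itself is visible only to the kernel, but a rough user-level analogue of a 10 ms timer interrupt can be sketched with setitimer() and SIGALRM (the tick counter and the busy loop are illustrative):

#include <signal.h>
#include <stdio.h>
#include <sys/time.h>

static volatile sig_atomic_t ticks;

/* Stand-in for the scheduler's entry point: in a real kernel, the trap
   saves the thread's registers before the scheduler runs. */
static void on_tick(int sig) { (void)sig; ticks++; }

int main(void) {
    struct sigaction sa;
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGALRM, &sa, NULL);

    /* Fire every 10 ms, like the periodic clock interrupt above. */
    struct itimerval tv = { { 0, 10000 }, { 0, 10000 } };
    setitimer(ITIMER_REAL, &tv, NULL);

    while (ticks < 100)   /* this loop makes no system calls... */
        ;                 /* ...yet control is still taken away periodically */
    printf("interrupted %d times\n", (int)ticks);
    return 0;
}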
Scheduling decisions are made at several levels:
Admission policy (long-term): which processes or threads will be admitted to the system in the first place?
Memory policy (medium-term): which processes will reside in RAM?
CPU scheduling (short-term): which runnable threads in RAM actually have a CPU?
A single scheduling algorithm won't do well in all situations, so multiple scheduling algorithms are necessary. One possibility is to use different algorithms for long-term and short-term scheduling. Another is to divide applications into classes (system apps, interactive apps, batch user apps, student apps, etc.) and schedule each class differently.
With real-time scheduling, timely behavior is part of correctness: a result that arrives too late is wrong, no matter how accurate it is.
In the case of hard real-time scheduling, it is crucial that applications meet their deadlines. A real-life example is car brakes: if they fail to act as soon as the pedal is pressed, the results can be catastrophic. Hard real-time scheduling prioritizes predictability over raw CPU performance; an application is expected to take the same amount of time every time it runs. Non-optimized bubble sort is an excellent example of such an application, since it does the same amount of work regardless of its input. Hard real-time scheduling requires that the cache be disabled, because caching makes applications faster in unpredictable ways, and it favors polling over traps, again for predictability.
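A sketch of what "polling, not traps" looks like in practice; the memory-mapped register address and ready bit are invented for illustration:

#include <stdint.h>

/* Hypothetical memory-mapped device status register (address invented).
   Polling it costs the same, predictable amount of time on every
   iteration, whereas a trap could arrive at an unpredictable moment. */
#define DEVICE_STATUS ((volatile uint32_t *)0x40000000u)
#define READY_BIT 0x1u

void wait_for_device(void) {
    while ((*DEVICE_STATUS & READY_BIT) == 0u)
        ;   /* busy-wait with a fixed per-iteration cost */
}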
In soft real-time scheduling, some deadlines can be missed, at some cost (similar to the lateness penalty in this course). Several scheduling policies fall into this category. One is earliest deadline first: run the application with the earliest deadline, unless that deadline has already passed, in which case it is not executed at all. One issue with this policy is that the application with the earliest deadline may take longer to execute than some of the others, and those applications may never run because their deadlines pass while the longer application is running. Another example is rate-monotonic scheduling, in which higher priority is assigned to tasks that run more frequently. Soft real-time algorithms tend to be more forgiving because they don't have to meet every deadline.
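A minimal sketch of the earliest-deadline-first selection step described above (the task structure and field names are assumptions for illustration):

#include <stddef.h>

/* Illustrative task record; not a real API. */
struct task {
    int  id;
    long deadline;   /* absolute deadline, e.g. milliseconds since boot */
};

/* Pick the task with the earliest deadline that has not already passed;
   tasks whose deadlines have expired are skipped entirely. */
struct task *edf_pick(struct task *tasks, size_t n, long now) {
    struct task *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].deadline < now)
            continue;                 /* missed: do not run it at all */
        if (best == NULL || tasks[i].deadline < best->deadline)
            best = &tasks[i];
    }
    return best;
}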
The metrics used to measure scheduler quality are borrowed from manufacturing: a job is like a factory order that arrives, waits, gets worked on, and finally ships. When quantifying the performance of a scheduling algorithm, it is common to look at average times as well as worst-case times. Some commonly used terms are:
Arrival time: when the job enters the system.
Wait time: from arrival until the job first starts running.
Response time: from arrival until the job produces its first output.
Turnaround time: from arrival until the job finishes.
Run time: the total CPU time the job requires.
The following are some examples of different scheduling algorithms.
First come first served (FCFS): the job with the earliest arrival time is chosen to run next.
This algorithm is very fair, but it suffers from the convoy effect: one slow job can make many little jobs stack up behind it.
Shortest job first (SJF): the job with the shortest run time is selected to run next, under the assumption that applications' run times are known in advance.
Shortest job first minimizes average wait time, but it has a fairness problem: if short jobs keep arriving, long jobs can starve.
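A small simulation (job lengths invented for illustration) showing both the convoy effect and why shortest job first minimizes average wait time, assuming all jobs arrive at time 0:

#include <stdio.h>
#include <stdlib.h>

static int cmp_long(const void *a, const void *b) {
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Average wait time when jobs run in array order: job i waits for
   jobs 0..i-1 to finish. */
static double avg_wait(const long *run, int n) {
    long t = 0, total_wait = 0;
    for (int i = 0; i < n; i++) {
        total_wait += t;
        t += run[i];
    }
    return (double)total_wait / n;
}

int main(void) {
    long fcfs[] = { 9, 1, 1, 1 };   /* a long job arrives first: convoy */
    printf("FCFS avg wait: %.2f\n", avg_wait(fcfs, 4));   /* 7.50 */

    long sjf[] = { 9, 1, 1, 1 };
    qsort(sjf, 4, sizeof sjf[0], cmp_long);   /* SJF: shortest first */
    printf("SJF  avg wait: %.2f\n", avg_wait(sjf, 4));    /* 1.50 */
    return 0;
}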
In priority scheduling, each job is assigned a priority. Higher-priority jobs have lower priority numbers (e.g. a job with priority 1 is the most important, while a job with priority 5 matters less).
In Linux, priority is expressed as 'niceness'. For example, nice emacs runs emacs with a higher niceness (i.e. lower priority) than normal. The adjustment can also be specified: nice -n 7 emacs adds 7 to emacs's niceness, and nice -n -7 emacs subtracts 7, running it 'meaner' (which requires root privileges). Niceness affects how the Linux scheduler allocates the CPU.
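A process can make the same adjustment on itself through the getpriority()/setpriority() system calls; a minimal sketch of adding 7 to the current niceness:

#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    /* who = 0 means "this process" */
    int before = getpriority(PRIO_PROCESS, 0);
    if (setpriority(PRIO_PROCESS, 0, before + 7) != 0)
        perror("setpriority");   /* lowering niceness would need root */
    printf("niceness: %d -> %d\n", before, getpriority(PRIO_PROCESS, 0));
    return 0;
}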
Priority scheduling falls into two categories: external priorities set by the user, and internal priorities computed by the scheduler. First come first served is actually a priority scheduler with arrival time as the priority; similarly, shortest job first is a priority scheduler with run time as the priority.
Preemptive scheduling allows the scheduler to switch contexts before the job the CPU is running has finished, so the CPU alternates among jobs in short slices rather than running each job to completion.
Preemptive scheduling can minimize wait time even further than the shortest job first algorithm (run preemptively, shortest job first becomes shortest remaining time first), but it introduces a significant amount of overhead due to context switching.
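Round robin, the simplest preemptive policy, makes the preempt-and-dispatch loop concrete; a toy simulation (job lengths and quantum invented for illustration):

#include <stdio.h>

#define NJOBS   3
#define QUANTUM 2

int main(void) {
    int remaining[NJOBS] = { 5, 3, 1 };   /* ticks of work left per job */
    int unfinished = NJOBS, cur = 0;

    while (unfinished > 0) {
        if (remaining[cur] > 0) {
            /* run the current job for at most one quantum */
            int slice = remaining[cur] < QUANTUM ? remaining[cur] : QUANTUM;
            remaining[cur] -= slice;
            printf("job %d ran %d ticks (%d left)\n", cur, slice, remaining[cur]);
            if (remaining[cur] == 0)
                unfinished--;
        }
        cur = (cur + 1) % NJOBS;   /* preempt: context switch to next job */
    }
    return 0;
}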