There are many ways to write a scheduler. The various policies that we will see later represent only a small fraction of the possible methods. What matters is that we balance their various trade-offs and choose the best method for our needs. But how do we quantify the effectiveness of a scheduling policy in the first place?
One way is to use well-defined metrics. Just as we have taken inspiration from the auto industry in designing schedulers, we can also model our policies and metrics similarly. In this case, we can think of a CPU like a single factory, and the processes like cars to be built.
First, some assumptions. In this model, we adhere to these rules:
Here is a pictorial representation of some metrics we can derive from this model:
The sequence of events proceeds as follows:
These metrics are used to measure efficiency for a single process. Modern CPUs must juggle hundreds of processes, so it would be more useful to look at some aggregate metrics instead. Some of these are:
A scheduler chooses what metrics to prioritize. We often see a waterbed effect develop when choosing policies. For instance:
Scheduling policies are created based on how much each metric is valued in the scheduler. A policy is a tactic by the scheduler to determine the order in which processes run. While it would be ideal to improve all metrics, the waterbed effect causes one metric to worsen as another improves. Depending on the needs of the processes, different policies are used to determine how a scheduler should be designed.
A scheduler should address the problem of starvation, in which some processes may be constantly denied resources from the CPU. It is possible that a process may never finish after it is started.
Some common policies are as follows:
Example:
Having the CPU switch processes also incurs an overhead time; this is denoted by δ.
What does each wait time look like?
Process A: first process, so its wait time is 0
Process B: total time from A + context switching; total wait time is 4 + δ
Process C: total time from B + context switching; total wait time is 5 + 2δ
Process D: total time from C + context switching; total wait time is 13 + 3δ
Average wait time: 5.5 + 1.5δ
Average turnaround time: 10.5 + 1.5δ
Average response time: 7.5 + 1.5δ
Pros:
There are different ways of gauging which item to schedule first. The convention by which one process is measured against another to determine order is its priority. A low priority value can mean that a process is scheduled first or that it is scheduled last; by convention, we will say that a lower priority value means it is scheduled first. Priority does not have to consist of a single factor, such as run time, that determines whether a process is scheduled before another; it can comprise several different aspects. For instance, the Linux scheduler computes priority with the following formula:
Priority = niceness + total CPU time − time since the last run
Niceness is a user-settable property used in determining the priority level to assign to a process, though it is not the only factor. Its default value is 0. A user can change a process's niceness through the nice command to influence how a specific process is scheduled. Because the scheduler also relies on the total CPU time and the time since the last run, it is not entirely a best-effort scheduler, and a process with a lower niceness will not necessarily beat out a process with a higher one.
For systems in which process turnaround may mean life or death rather than mere inconvenience (for instance, a nuclear power plant), a best-effort scheduler is not enough. Such systems often employ a hard real-time scheduler. The exact implementations vary and are not important here; in general, they have these characteristics:
As opposed to hard real-time, soft real-time systems can tolerate some missed deadlines, but at the expense of degraded user experience. An example usage is in smartphones. Here are some characteristics of soft real-time systems:
An example application would be watching a video on a smartphone. The phone must handle periodic frame rendering and background Wi-Fi scanning as follows:
Ideally, every frame and scan is performed on time. Occasionally missing one or two blocks may degrade the experience, but it is not destructive. Some policies might include:
Synchronization is the biggest problem in a multi-threaded application. We need a way for processes to join together at some point, and to make sure that they do not execute the same parts of a program at the same time.
Unfortunately, the default behavior for processes is unsynchronized, which leads to race conditions. This is an issue because race conditions are notoriously hard to debug; they are so rare and intermittent that it is difficult to catch the problem automatically. It is possible to run code a thousand times without failure, only for it to break on the 1,001st run due to a race condition. We want to be able to avoid race conditions, but we also want to do so using an efficient, non-intrusive, and clear method.
Consider an example in which Bill Gates and Melinda Gates are accessing their bank account simultaneously from different locations. Bill makes a deposit and Melinda withdraws as follows:
Bill:
balance += deposit;
Melinda:
balance -= withdraw;
There is a race condition here: it is possible for both to read the same balance at the same time, and depending on whose transaction occurs last, the other's will be nullified. For instance, the following logical sequence of events may be executed:
int bill_temp = balance + deposit;       // Bill reads the balance
int melinda_temp = balance - withdraw;   // Melinda reads the same, stale balance
balance = bill_temp;                     // Bill writes his result
balance = melinda_temp;                  // Melinda overwrites it; the deposit is lost
This is a problem with multi-threading. Even in a single-threaded application, however, signals can cause the same problems.
Now that we know the problem, we need some way to resolve it. One possible option is to use locks. Generally, locks can be used to guarantee that only a single agent can access a resource until that resource is unlocked for others to use. In multi-threaded applications, locks are used to protect critical sections: areas of code that access shared resources in a way that is vulnerable to race conditions.
Locks are nice conceptually, but they have certain pitfalls. Suppose a bank wants to audit its accounts, which may take a long time. If it obtains a global lock, then during that time no customers will be able to access their money. What if we instead lock on a per-account basis? Then actions like transfers, which require multiple locks, may give rise to deadlock, in which two agents each refuse to yield their locks to the other.
Many approaches to synchronization will be covered in the next lecture, Lecture 9. For now, we just want to define some terms. When we say we want synchronization, in essence we want to achieve an abstraction for serializability. Consider the following:
For an observable behavior to be serializable, we should be able to explain events as if they were done sequentially. A possible sequence of events for the above might be:
This explanation works, because it does not change the ultimate outcome of the events. Even if two events happened simultaneously, we can pretend that one happened before another as long as they don’t affect each other.
One way to obtain serializability is through atomicity. Atomicity is the assurance that an action either occurs in full or doesn't occur at all. This excludes actions that perform only a portion of their job, such as an interrupted download leaving behind a half-complete file.