CS111

Scribe Notes for 1/27/16

by Byron Chien, Nathan Chou, Nathaniel Chu, and Jason Liu

Signals

Motivating Example:
Last lecture we discussed gzip, which needs to do two things at once: do its normal work (compress foo into foo.gz, then remove foo) and respond gracefully to signals that may arrive at any time.

What if a signal arrives while creating the compressed file? We set up a signal handler that removes the partially built compressed file if a signal is received; once the compressed file is complete, we tell the signal handler not to remove it anymore.

But what if a signal arrives while we're removing the uncompressed file? We don't want to receive any signals until we're done with this critical section. If a signal arrived after compression finished but before ok_to_remove was cleared, the handler would remove foo.gz while the main program went on to unlink foo, leaving neither file in the directory. We need to inform the operating system that we do not want to receive any signals during that section.

// compress foo into foo.gz, and remove foo afterward
int ok_to_remove = 1;

void handle_interrupt(int sig) {
    // if we're interrupted mid-compression, clean up the partial output
    if (ok_to_remove)
        unlink("foo.gz");
}

signal(SIGINT, handle_interrupt); // install the handler
// gzip is compressing...
// gzip is done compressing!
ok_to_remove = 0;
// don't want a signal to arrive here
unlink("foo");
// ready for signals again

Blocking Signals

We can achieve this by blocking signals: when a signal arrives for a process that has blocked it, the OS marks the signal as "pending" and delivers it only once it is unblocked.

Relevant system calls include:

int sigprocmask(int how, const sigset_t *restrict set, sigset_t *restrict oset);
int pthread_sigmask(
    int how,                      // SIG_BLOCK, SIG_SETMASK, SIG_UNBLOCK
    const sigset_t *restrict set, // new signals to block/set/unblock
    sigset_t *restrict oset       // function will store previous signals here
);

(Note that the arguments use the restrict keyword. It tells the compiler that set and oset will not point to the same object; the pointers are assumed not to alias.)

sigset_t is essentially a word (a bit vector), with each bit representing whether the corresponding signal is in the set.

how indicates whether to treat set as specifying additional signals to block (SIG_BLOCK), the exact set of signals to block (SIG_SETMASK), or signals to unblock (SIG_UNBLOCK).

The previous set of blocked signals will be stored in oset, if it is not NULL. To help remember the difference between set and oset, note that set is declared const.

Note that as long as a signal is unblocked, it can arrive at any time, including just before the call that modifies the signal mask takes effect.

We might use pthread_sigmask to block signals during a critical section like this:

sigset_t ss;
sigemptyset(&ss);                     // clear the set
sigaddset(&ss, SIGINT);               // add SIGINT to the set
pthread_sigmask(SIG_BLOCK, &ss, 0);   // block the signals in the set (SIGINT)
// HANDLE CRITICAL SECTION HERE
pthread_sigmask(SIG_UNBLOCK, &ss, 0); // unblock them again

By blocking the interrupt signal, the program can now safely handle the critical section.

These two calls (sigprocmask and pthread_sigmask) do the same thing, but sigprocmask may only be used in single-threaded programs (in return, it is faster).

Let's see how signals are handled by a multi-threaded program. Recall that threads share memory (unlike processes), which allows them to potentially interfere with each other - it's like having actors (threads) wielding swords running around a stage. If we set up a signal handler (handle_interrupt), what happens when a signal for the process arrives? Do all threads receive the signal?

In some systems, we can directly designate a thread to receive all the signals, like having a stage manager for all the actors. This way the other threads (actors) don't have to worry about signals. But POSIX does not do this (by default); instead, it arbitrarily picks a thread (that didn't block the signal) to receive the signal.

Signal masks aren't shared across threads (and shouldn't be: one thread might be running a critical section while another is not), so each thread can use pthread_sigmask to block signals just for itself. However, if a thread only blocks signals around a critical section, the signal will still be delivered to it after the critical section, and the thread would have to decide how to handle it. This is why having a "stage manager" to handle all the signals would help.

To implement this "stage manager", we need to have "actor" threads pass their signals to the "stage manager" thread, which can be done with pthread_kill.

pthread_t stage_manager; // the thread designated to handle all signals

void handle_interrupt(int sig) {
    if (pthread_equal(pthread_self(), stage_manager)) {
        // we're the stage manager! handle the signal
    } else {
        // forward the signal to the stage manager
        pthread_kill(stage_manager, sig);
    }
}
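
One way to set this up (a sketch of our own, not from the lecture): record the main thread's ID as the stage manager before spawning any actors, and install the handler shown above.

#include <pthread.h>
#include <signal.h>

// stage_manager and handle_interrupt are as defined above

int main(void) {
    stage_manager = pthread_self();   // let the main thread be the manager
    signal(SIGINT, handle_interrupt); // install the handler for the process
    // ... spawn the "actor" threads with pthread_create here ...
    return 0;
}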

A Side Note

We must be careful when calling functions from a signal handler. If an "unsafe" function is called there, the signal may have interrupted that very function mid-execution. An example is calling fprintf in the signal handler: fprintf calls malloc, so if the signal arrived while malloc was running, malloc gets called reentrantly (i.e. called again from the signal handler before the first call has finished), corrupting the heap.

// DON'T DO THIS! fprintf IS NOT ASYNC-SIGNAL-SAFE!
void handle_interrupt(int sig) {
    fprintf(stderr, "Interrupted!!\n");
}

To be safe, the signal handler should only call async-signal-safe functions. Another example of an unsafe function is exit, which is unsafe because it flushes the stdio buffers. Instead of exit, we should use the safe function _exit in our signal handler. A list of functions deemed safe by POSIX can be found in the man pages for signal.

sig_atomic_t volatile globv; // flag set by the handler, checked by the rest of the program

void handle_interrupt(int sig) {
    globv = 1; // the only thing the handler does: record that a signal arrived
}

A more conservative option for handling signals in threads is to allow signal handlers to do nothing but set global variables (of type sig_atomic_t volatile, as above). The thread still receives the signal, but the program acts on it only when it is ready to. However, this requires checks of the global variable scattered throughout the program.
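
For example, the main loop might look something like this (a sketch; work_remaining and do_one_unit_of_work are hypothetical placeholders):

// main loop: do a unit of work, then check whether a signal arrived
while (work_remaining()) {   // hypothetical placeholder
    do_one_unit_of_work();   // hypothetical placeholder
    if (globv) {
        // we're outside the handler here, so ordinary cleanup is safe
        unlink("foo.gz");
        exit(1);
    }
}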

There is one more issue to consider, involving threads, signals, and devices. Say a signal arrives while we're executing a slow system call (e.g. read()ing from a storage device), which could hang for a long time. We jump to the signal handler and run it, but then what happens?

Ideally we want to resume reading where we left off, but this introduces complexity since, for example, we would have to save how much we had read before the signal. For simplicity, Unix has read() (or any slow system call) return -1 with errno set to EINTR if a signal arrives while it is still in progress. If we want to be sure that read() actually completes, we need code that calls read() again upon EINTR:

while (read(0, buf, 100) == -1 && errno == EINTR)
    continue;
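
This idiom is common enough to be worth wrapping in a helper. A minimal sketch (the name read_retry is our own):

#include <errno.h>
#include <unistd.h>

// retry read() as long as it fails only because a signal interrupted it
ssize_t read_retry(int fd, void *buf, size_t count) {
    ssize_t r;
    do
        r = read(fd, buf, count);
    while (r == -1 && errno == EINTR);
    return r;
}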

Scheduling

Scheduling refers to the task of dividing the CPU among processes. Processes request CPU time from the scheduler; if CPU time is available, the scheduler allocates it to the process.

Here are some questions to consider: Which process should get the CPU next? How long should it keep the CPU before someone else gets a turn?

These questions become more and more difficult to answer as there is more going on in the OS -- for example, when there are lots of threads and thread types, or when certain processes take a long time to run. This is an issue of scaling.

Different thread types can allow for a multi-level priority system.

A process with higher priority gets scheduled earlier. Priority may be external (set by the user) or internal (set by the operating system). An example of an external reason for high priority is a program with a close deadline.

To change a process's priority in Linux, there is the system call int nice(int incr), which adds incr to that process's niceness. Niceness is an int from -20 to 19, and by default it is 0. A lower niceness number means the process is less "nice" to others: it is more important and gets scheduled earlier.

Another way to change a process's niceness is the nice command: nice <program> runs <program> with its niceness increased by 10.
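
As a concrete sketch of the system-call route (our own example): nice() returns the new niceness, and since -1 is a legal return value, the idiom is to clear errno first to detect errors.

#include <errno.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    errno = 0;              // nice() can legitimately return -1,
    int newnice = nice(10); // so clear errno first to distinguish errors
    if (newnice == -1 && errno != 0)
        perror("nice");
    else
        printf("niceness is now %d\n", newnice);
    // ... run the low-priority work here ...
    return 0;
}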

Real-time scheduling is when meeting deadlines starts to matter.
A hard real-time scheduler has fixed deadlines that must always be met. Here predictability trumps performance: to consistently meet hard deadlines, we are willing to accept performance losses. For example, we might disable caching in a nuclear reactor's control software, because although caching can greatly increase performance, its timing is too unpredictable.
A soft real-time scheduler is allowed to miss deadlines, at a quality cost. An example of an application with soft deadlines is a YouTube video: it should put a new frame on the screen 30 times per second, and if it misses a frame once in a while, the video still plays, just at lower quality.

Let's examine some possible scheduling policies.

Earliest Deadline First: always run the job whose deadline is soonest.
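
The core decision is just a minimum over deadlines. A sketch (our own illustration; the job struct is hypothetical):

// pick the runnable job whose deadline is soonest
struct job { long deadline; /* ... other bookkeeping ... */ };

struct job *pick_next(struct job jobs[], int n) {
    struct job *best = &jobs[0];
    for (int i = 1; i < n; i++)
        if (jobs[i].deadline < best->deadline)
            best = &jobs[i];
    return best;
}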

But this isn't perfect. If there's a process that needs a ton of CPU time but has the closest deadline, it can tie up resources and starve all other jobs of CPU time.

Scheduling has two parts: policies (theory), which determine which job is scheduled next, and dispatch mechanisms (practice), which are how the scheduler actually takes control of the program counter.

The OS's dispatch mechanisms for CPU scheduling come in two flavors: cooperative multitasking and preemptive multitasking. The kernel gets an opportunity to schedule whenever a process makes a system call: it can then hand the CPU to another process.

Cooperative multitasking is when the operating system never needs to initiate a context switch, because every thread issues a system call every so often, which gives the kernel a chance to schedule another process. However, if a program doesn't issue any system calls, a context switch will never happen. One fix is to require programs to periodically issue system calls that don't do anything else, such as yield() or schedule(), just to give up the CPU to another process. Thus, cooperative multitasking only works well if you can trust programs to behave well and accept these restrictions.
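
For example, a CPU-bound task might cooperate like this (a sketch using the POSIX call sched_yield(); do_some_work is a hypothetical placeholder):

#include <sched.h>

// a CPU-bound task that cooperates by periodically yielding the CPU
void cooperative_work(long n) {
    for (long i = 0; i < n; i++) {
        do_some_work(i);   // hypothetical unit of work
        if (i % 1000 == 0)
            sched_yield(); // give the scheduler a chance to run someone else
    }
}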

Another method is preemptive multitasking, in which the operating system decides when to take control from one task and give it to another. One criterion that can be used is run time: for example, a hardware clock can trap into the OS after a certain interval so that it can switch control of the CPU. For this to work, we need a good time interval between context switches. If the interval is too short, a lot of processing time is wasted on switching; if it is too long, we have to wait too much (e.g. if the interval were 10 seconds, typing ^C could take that long to get a response). 10 ms works well in most cases.
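
To get a feel for the timer mechanism from user space (our own sketch, not from the lecture: a user-level threads library could preempt its threads this way), setitimer() asks the kernel for a SIGALRM every 10 ms:

#include <signal.h>
#include <sys/time.h>

// ask the kernel for a SIGALRM every 10 ms (the interval suggested above)
void install_tick(void (*tick)(int)) {
    struct itimerval it = {
        .it_interval = { 0, 10000 }, // 10,000 us = 10 ms, repeating
        .it_value    = { 0, 10000 }, // first tick after 10 ms
    };
    signal(SIGALRM, tick);           // 'tick' might switch user-level threads
    setitimer(ITIMER_REAL, &it, 0);
}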

Now let's consider cooperative processes that need to do I/O, particularly with slow devices. For example, a process that reads from and writes to disk may have to wait a long time -- milliseconds, even! In these cases, you'll often see code that looks like:

while(!ready())
    continue;

While waiting for the device to return we are wasting CPU time; we should let other processes run instead of just busy waiting.

One way to do this is polling: if ready() is false, give up the CPU and switch to another process, then check again intermittently until the device is actually ready.
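
A sketch of the polling version, assuming sched_yield() (POSIX) to give up the CPU and the ready() check from above:

while (!ready())
    sched_yield(); // let other processes run; check again when rescheduled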

We could also use blocking. Consider a function wait_until_ready(): this new system call arranges for execution to resume once the device is actually ready.

while (!ready())
    wait_until_ready(); // block; the OS wakes us when the device is ready