CS111

Scribe Notes for 10/16/2008

by Andrew Freer and Peter Freiling

Signals: a change to the abstract machine

Introduction

A signal can occur between any pair of instructions. When this happens, a special function is called, known as a handler. As the Operating System cannot always know the best action to take for a specific application, this handler is specified by the application itself.

There are two basic handler function models supported by Unix:

  • Simple Model: the handler is passed only one argument: an integer which corresponds to the signal type.
  • Complex Model: the handler is passed additional arguments which give the user info about the signal, e.g. the process id of the child that changed state (signal SIGCHLD).

A simple example of using the 'signal' function to specify signal-handling behavior is as follows:

#include <signal.h>
#include <unistd.h>	// for _exit()

void handle_power_failure(int sig){
	// ... save state ...
	_exit(2);
}

int main(void){
	signal(SIGPWR, handle_power_failure);
	// this will enable handle_power_failure() to run
	// between any pair of instructions once signal arrives
	// ...do other program stuff...
}
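
For the complex model, the usual route is the sigaction(2) interface with the SA_SIGINFO flag, which passes the handler a siginfo_t describing the signal. A minimal sketch (the handler name handle_child is ours, not from lecture):

#include <signal.h>

void handle_child(int sig, siginfo_t *info, void *ucontext){
	// info->si_pid identifies the child that changed state
	(void) sig; (void) info; (void) ucontext;
}

int main(void){
	struct sigaction sa = { 0 };
	sa.sa_sigaction = handle_child;
	sigemptyset(&sa.sa_mask);	// block no extra signals in the handler
	sa.sa_flags = SA_SIGINFO;	// ask for the three-argument form
	sigaction(SIGCHLD, &sa, 0);
	// ...fork children, do other program stuff...
}
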
Questions about Signals:
Q1) Does the handler apply only once, i.e. if the signal comes more than once will the specified signal handler continue to be called?

A1) In older versions of Unix (e.g. Version 7), the handler was reset to the default after each delivery, so it applied only once. This introduced race conditions, as applications scrambled to re-establish the signal handler once it had fired, to save the signal-handling state, and to take care of anything else important before the next signal could interrupt them.
void handle_power_failure(int sig){
	signal(SIGPWR, handle_power_failure);
	// renew the signal handler by hand -> race conditions
	// if another SIGPWR arrives before this line runs!

	// ...do other stuff...
}

This behavior was deemed undesirable in later versions of Unix, and handlers now typically stay in force, although you still have the option of specifying otherwise.
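
For example, sigaction(2) lets you ask for the old one-shot behavior explicitly via the SA_RESETHAND flag; a sketch of "specifying otherwise":

#include <signal.h>
#include <unistd.h>

void handle_once(int sig){ /* runs at most once */ }

int main(void){
	struct sigaction sa = { 0 };
	sa.sa_handler = handle_once;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESETHAND;	// revert to SIG_DFL after first delivery
	sigaction(SIGUSR1, &sa, 0);
	for (;;)
		pause();	// wait for signals
}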

Q2) Can signal handlers be interrupted?

A2) As hinted at in the previous answer, yes in old versions of Unix (Version 7, for example). Today, however, the signal being handled is typically blocked while its handler runs and is delivered afterwards.
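
With sigaction(2) you can also ask for additional signals to be held while the handler runs, via the sa_mask field. A sketch, assuming we want SIGTERM blocked for the duration of a SIGINT handler:

#include <signal.h>

void handle_int(int sig){ /* ... */ }

int main(void){
	struct sigaction sa = { 0 };
	sa.sa_handler = handle_int;
	sigemptyset(&sa.sa_mask);
	sigaddset(&sa.sa_mask, SIGTERM);	// also hold SIGTERM during the handler
	sa.sa_flags = 0;
	sigaction(SIGINT, &sa, 0);
	// ...do other program stuff...
}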

Q3) What happens to signal handlers after fork/execvp?

A3) For calls to fork, all signal handlers are cloned into the child. For calls to exec, caught signals revert to SIG_DFL, since the handler functions no longer exist in the new program image (signals set to SIG_IGN stay ignored).
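
A sketch illustrating both rules (/bin/date is just an example program):

#include <signal.h>
#include <unistd.h>

void handle_term(int sig){ /* ... */ }

int main(void){
	signal(SIGINT, SIG_IGN);	// ignored
	signal(SIGTERM, handle_term);	// caught

	pid_t p = fork();
	if (p == 0){
		// the child inherits both dispositions across fork
		char * const args[2] = {"/bin/date", 0};
		execvp(args[0], args);
		// in the exec'd image, SIGINT is still ignored, but SIGTERM
		// has reverted to SIG_DFL: handle_term no longer exists there
		_exit(1);
	}
}
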
Controlling Signals in your Application

Each signal has a default action. Common default actions are to dump core, to ignore the signal, or to terminate the application. You can change this action with a call to 'signal' as shown above. The function signature is:

sighandler_t signal(int signum, sighandler_t handler); 

The signal() system call installs a new signal handler for the signal with number signum. The signal handler is set to handler, which may be a user-specified function, or either SIG_IGN or SIG_DFL. The signal() function returns the previous value of the signal handler, or SIG_ERR on error.
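
Since signal() returns the previous handler, you can temporarily override a disposition and later put the old one back. A small sketch (on glibc the sighandler_t typedef requires _GNU_SOURCE):

#define _GNU_SOURCE	// assumption: glibc's name for the handler type
#include <signal.h>

// Temporarily ignore SIGINT around a delicate region, then restore
// whatever handler was in force before.
void critical_region(void){
	sighandler_t old = signal(SIGINT, SIG_IGN);
	// ...work that must not be interrupted by SIGINT...
	signal(SIGINT, old);
}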

As an application, you are also allowed to send signals to other processes, such as your children. This is done with the following system call:

int kill(pid_t pid, int sig);

Note: For a process to have permission to send a signal it must either be privileged (under Linux: have the CAP_KILL capability), or the real or effective user ID of the sending process must equal the real or saved set-user-ID of the target process. In the case of SIGCONT it suffices when the sending and receiving processes belong to the same session.

Signal Handling Mistakes

While there is no right way to handle signals, there are wrong ways. Take for example the following code:

#include <signal.h>
#include <stdio.h>

int ouch;
void handle_SIGINT(int sig){
	printf("Interrupted\n"); // Gets inconsistent state and dumps core
	ouch = 1;
}

int main (void){
	signal(SIGINT, handle_SIGINT);

	//...do other program stuff...

	printf("Howdy\n"); // can make a call to malloc() which handles delicate memory processes
}

In this segment of code, the application calls printf() within the signal handler. Unfortunately, printf() is not an "asynchronous-safe" function: it occasionally calls malloc(), which manipulates delicate memory data structures full of pointers. If printf() or any other unsafe function is interrupted partway through and then re-entered from a handler, the process is left in an inconsistent state, causing the program to crash. The general solution to this is to not do it (a safer pattern is sketched after the list below). "Asynchronous-safe" functions include but are by no means limited to:

  • chdir()
  • close()
  • dup()/dup2()
  • _exit()
  • getpid()
  • kill()
  • open()
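
A common safe pattern is to do almost nothing in the handler: set a flag of type volatile sig_atomic_t, or call write(), which is on the safe list, and leave the real work to the main program. A sketch of the earlier example rewritten this way:

#include <signal.h>
#include <unistd.h>

volatile sig_atomic_t got_sigint = 0;

void handle_SIGINT(int sig){
	got_sigint = 1;	// just record the event
	write(STDERR_FILENO, "Interrupted\n", 12);	// write() is async-safe
}

int main(void){
	signal(SIGINT, handle_SIGINT);
	while (!got_sigint) {
		// ...do other program stuff...
	}
	// now it is safe to call printf(), free resources, etc.
}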

Some common (and often incorrect) assumptions made about signals:

  • System calls are indivisible (if a signal arrives during one of the async-safe system calls above, the handler won't run until the call completes, since such calls finish "right away"; this does not hold for system calls in general)
  • Machine instructions are indivisible
  • Machine instructions are issued in the same order as source code (crucial assumption, often violated in practice).

For the last bullet, it is important to note that, in order to make programs run faster, compilers "optimize" your code by reordering some of your instructions. Take for example the following section of code:

int x, y;
int a[SIZE];

void f (void)
{
  int i;

  for (i = 0; i < SIZE; i++)
    a[i] = x + y;
}

Your compiler, in its infinite wisdom, will likely perform the following optimization:

int x, y;
int a[SIZE];

void f (void)
{
  int i;
  int temp;

  temp = x + y;                  /* incorrect code motion */
  for (i = 0; i < SIZE; i++)
    a[i] = temp;
}

If your code changed the values of x and y asynchronously, through a signal handler say, these two pieces of code would produce different results. Anticipating this problem, a keyword known as 'volatile' was added to the C language. When it is placed in front of a variable declaration, e.g. 'volatile int x;', the compiler may not optimize away or reorder accesses to that variable. Unfortunately, use of the 'volatile' keyword slows the code down, and in practice some compilers continue to violate this restriction.
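
For the example above, declaring the globals volatile forces the compiler to re-read x and y on every iteration, so the hoisted version is no longer a legal transformation:

volatile int x, y;
int a[SIZE];

void f (void)
{
  int i;

  for (i = 0; i < SIZE; i++)
    a[i] = x + y;   /* x and y must be fetched each time through */
}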

Signals in Action

Signals can be used to great advantage. For instance, take the following segment of code. It runs another program from within itself, but wants to allow only a set amount of time for that program to execute before it calls an error function. This is done using a user-defined signal. Two user-defined signals are allowed, designated SIGUSR1 and SIGUSR2. For a complete list of signals and their default behaviors, see the signal(7) man page.

#include <signal.h>
#include <unistd.h>

// error() is assumed to be defined elsewhere
void handle_signal(int sig){error();} // NOTE: error() never returns

void printdate(void){
	pid_t p = fork();
	if (p < 0)
		error();
	else if (p == 0){
		char * const args[3] = {"/bin/date", "-u", 0};
		signal(SIGUSR1, handle_signal);
		// (per Q3 above, execvp resets this handler to SIG_DFL;
		// SIGUSR1's default action still terminates the child)
		execvp(args[0], args);
		error();
	}
	sleep(5);  // Allow 5 seconds for the program to execute
	kill(p, SIGUSR1); // Send signal to the program in order to cut it off
	// this will fail if the child has exited, which is okay and can be ignored

	//...do other program stuff...
}

Notable problem in the above code: even if the child executes quickly and successfully, the parent process will still sleep for the full 5 seconds! One way around this is sketched below.
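
One option is to poll for the child's exit with waitpid() and the WNOHANG flag, sleeping in small increments. A hedged sketch (the helper name wait_with_timeout is ours):

#include <sys/wait.h>
#include <signal.h>
#include <unistd.h>

// Wait up to 5 seconds, but return as soon as the child exits.
void wait_with_timeout(pid_t p){
	int status, i;
	for (i = 0; i < 5; i++){
		if (waitpid(p, &status, WNOHANG) == p)
			return;	// child already finished
		sleep(1);
	}
	kill(p, SIGUSR1);	// still running after 5 seconds: signal it
}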

Signal Summary: Pros and Cons

[+] Can manage processes better
[+] Has better performance than polling
[+] Programs are more robust, as they can now respond to unusual events
[-] Processes are less isolated, violates system design principles
[-] Processes can be interrupted at any time
[-] Signal Handling is notoriously buggy

Scheduling (and Signals)

Cooperative Scheduling

In cooperative scheduling, every process yields the CPU now and then. This means a process must not loop for a long time, i.e. keep doing work that lets it hold on to the processor indefinitely. Events that allow other processes to be scheduled are:

  • _exit()
  • yield()
  • Almost any system call

This form of scheduling is fairly common in embedded applications, where the system has limited resources and all the apps are "trustworthy".
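
As an illustration, a cooperative worker might yield explicitly inside its long-running loop; on Linux the call is sched_yield(2), standing in for the generic yield() above:

#include <sched.h>

void worker(void){
	for (;;){
		// ...do a small, bounded chunk of work...
		sched_yield();	// give other processes a turn
	}
}
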
Preemptive Scheduling

This form of scheduling relies on a timer interrupt to gain control of the CPU. It acts like a call to yield() every 10ms or so, depending on the system.

[+] Need not trust the application to yield the system to us
[-] Applications have to be ready to lose control of the CPU at any time (same problem, to some extent, as signaling!)

Timer Interrupt Mechanism

The key to the timer interrupt mechanism is that the clock that causes the interrupt is separate from the CPU. It fires periodically, and at each firing the CPU acts as if the signal were a SIGINT: it transfers control from the application to a kernel trap. Now, as the kernel, we can restart the program or allow other applications to run as we see fit. When switching between applications, however, the internal state must be saved and restored as appropriate, much as registers are saved on the stack in a function call.
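
You can get a feel for this at user level with an interval timer: setitimer(2) arranges for SIGALRM to fire periodically, interrupting the program much as the hardware clock interrupts the kernel. A sketch assuming a 10ms period:

#include <signal.h>
#include <sys/time.h>

volatile sig_atomic_t ticks = 0;

void handle_tick(int sig){ ticks++; }

int main(void){
	struct itimerval it;
	signal(SIGALRM, handle_tick);
	it.it_value.tv_sec = 0;
	it.it_value.tv_usec = 10000;	// first tick after 10ms
	it.it_interval = it.it_value;	// then every 10ms
	setitimer(ITIMER_REAL, &it, 0);
	while (ticks < 5) {
		// ...compute; control is yanked away every 10ms...
	}
}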

This does not completely eliminate the infinite loop problem, e.g. if the looping application is currently the only application. Thus, signals are used to mitigate this:

  • The CTRL-C signal (SIGINT) allows users to terminate processes
  • The kill() function allows processes to terminate child processes
  • SIGXCPU is sent when the total time the processor has spent executing the process's instructions exceeds the allowed limit (see the sketch after this list)
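
For instance, a process can impose a CPU-time budget on itself with setrlimit(2); once the soft limit is exceeded, the kernel delivers SIGXCPU. A sketch:

#include <signal.h>
#include <sys/resource.h>
#include <unistd.h>

void handle_xcpu(int sig){
	_exit(1);	// over our CPU budget; exit via an async-safe call
}

int main(void){
	struct rlimit rl = { 1, 2 };	// soft limit 1s, hard limit 2s of CPU
	signal(SIGXCPU, handle_xcpu);
	setrlimit(RLIMIT_CPU, &rl);
	for (;;)
		continue;	// burn CPU until SIGXCPU arrives
}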

There is also the question of which process to run. This question arises whenever we have only N CPUs and want to run more than N processes simultaneously. The things to consider in scheduling are mechanisms and policies:

  • Mechanisms: for dispatches
  • Policies: priorities, constraints, rules

Luckily for us, scheduling is a well-studied issue in the real world. For instance, scheduling is a major issue in factory assembly lines. However, there is one difference between OS scheduling and all other forms of scheduling: the OS must arbitrate quickly, especially since the scheduling takes up time on the CPU that other processes could be using to run!

Scheduling Scale
Long Term:
      e.g. Which processes are admitted to the system?
Medium Term:
      e.g. Which processes reside in RAM?
Short Term:
      e.g. Which process gets CPU time?

Threads (Intro)

Threads are like processes, except they share the same memory space, share file descriptors, and share a common owner. There is no protection between threads, and they share everything they can. This poses a big synchronization issue: every thread must be written carefully to avoid data structure corruption. However, the efficiency bonus and the ability to break a big application into small sequential programs outweigh this challenge.

There is an alternative to threads: event-driven programming. In this form of programming, your program is divided into numerous small chunks. When the OS gets an event, it calls a chunk. These chunks are called "call-backs" and are triggered by external events. They never block (no sleep() or read() calls, for example) and are guaranteed to finish quickly. Event-driven programming is much simpler than threads to explain, and no synchronization primitives are needed. However, you must break up all long loops, and things get complicated when multiple CPUs are available.
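
A toy sketch of the event-driven style in C: a table of call-backs dispatched by a central loop. The event source here is faked with a fixed sequence, and the names are ours:

#include <stdio.h>

typedef void (*callback_t)(void);	// a call-back takes no arguments here

void on_timer(void){ printf("timer fired\n"); }
void on_input(void){ printf("input ready\n"); }

callback_t callbacks[] = { on_timer, on_input };

int main(void){
	// a real event source would be select()/poll(); we fake one.
	// each call-back runs to completion quickly and never blocks.
	int events[] = { 0, 1, 0 };
	int i;
	for (i = 0; i < 3; i++)
		callbacks[events[i]]();
}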

Cooperative Processes

Cooperative processes yield every now and then, allowing critical system resources to be freed for use. I/O models (API) become one of the following:

  • Busy waiting: while (device is busy) continue;
  • Polling: if (device is busy) yield();
  • Blocking: while (device is busy) yieldandwait(device); <-- this tells the OS not to schedule me while the device is still busy