Operating System Access to Time (Signals, Threads, Scheduling)

Class Notes for 01/28/2008

by Jagannath Coimbatore Premkumar, Michael Wilson, Kunal Bindal

Motherboard and Time

Motherboard uses a clock generator which produces the system clock signal to synchronize the various components on the board itself. This clock signal is fed into the CPU and it interrupts the CPU at regular intervals. When an interrupt occurs, the following happens:

  1. CPU traps the interrupt
  2. CPU transfers control to the kernel via a interrupt vector
  3. Kernel takes control and can do some or all of the following:
    1. Checks on the I/O and other devices
    2. Often, decides to schedule a process which forces the kernel to save the registers of the current process and restore the registers of another process.
  4. Kernel then returns from the interrupt.

The interval of time at which the system clock interrupts the CPU is adjustable. A smaller interval implies more overhead for the CPU, but the I/O devices get their statuses checked frequently. A larger intervals implies a sluggish kernel that takes long time to make decisions. Designers tradeoff between the two to decide on an optimum interval for a given application the system is targeted for. A 10ms interval is a typical value.

The methodology described above uses timer interrupt as a means of context switch (swapping processes). Other ways to swap processes are:

The two high-level views of context switches we saw can be classified as:

  1. Cooperative Scheduling - relies on every process yielding control to the kernel within a reasonable amount of time. Cooperative scheduling assumes that the processes will not loop.
  1. Preemptive Scheduling - relies on the timer interrupt

Kernel “Insight”

A high level view of how a kernel may handle process in a multi-process system is presented below:

for(;;) {

    Choose a process that is not blocked;

    Load that process state into the CPU;

    ret (run process);

    // continues until timer interrupt (or) any other interrupt occurs (or) a trap

    for each process ‘P’ that has become ready

        Unblock(P);

}

Data Structures that are required for this are:

  1. Process Descriptors (To store what the process P is blocked for)
  2. Table for process descriptors
  3. Auxiliary tables for resources that the processes are waiting for.

Signals

A signal is a form of interprocess communication in POSIX-compliant operating systems.

Some examples in Linux:

POSIX: A vendor-independent standard for Unix and Unix-like systems. It stands for "Portable Operating System Interface", and includes documentation regarding signals.

In C, the signal header file is <signal.h>.

Some signals contained in the specification:

            SIGINT   Terminal Interrupt

            SIGXCPU  CPU time limit exceeded

            SIGSEGV  Segmentation Violation (invalid address, out of range)

            SIGBUS   Bus error (e.g. trying to load a word not a multiple of a word)

            SIGILL   illegal instruction

            SIGFPE   Floating Point exception (also includes integer divide by zero, and -2^31/-1)

            SIGIO    I/O error interrupt

            SIGPIPE  Attempt to write on a pipe with no reader

            SIGHUP   Hangup (when logging out)

            SIGTSTP  Terminal stop signal

            SIGSTOP  Stop

            SIGCONT  Continue the process

            SIGUSR1  User defined signal 1

            SIGUSR2  User defined signal 2

            Example that will generate SIGPIPE:

            Program hello:

                  for(i=0; i<=10000000; i++)

                        printf("hello %d\n", i);

            In the terminal, type:

            $hello | head -n 10    

            'head' only reads & writes the first 10 lines, then exits.

Each signal has a default action:

Handlers

A handler is a function that is executed asynchronously when a signal arrives. Since it interrupts the normal flow of execution, it can be called between any pair of instructions. If a handler is not defined for a particular signal, a default handler is used. The only two signals for which a handler cannot be defined are SIGKILL and SIGSTOP.

An example function with a handler that prints the date:

  1.             void handle_USR1(intsignal){
  2.                         error();
  3.             }
  4.             void printdate(void){
  5.                   pid_t p = fork();
  6.                   if(p < 0)
  7.                         error();
  8.                   if(p==0){
  9.                         char* const args[2] = {“/bin/date”, 0};
  10.                         signal(SIGUSR1, handle_USR1);
  11.                         execvp(args[0], args);
  12.                         error();
  13.                   }
  14.                   sleep(5);
  15.                   kill(p, SIGUSR1);
  16.                   if(waitpid(p, &status, 0) <0) //”reaping” the child process
  17.                         error();
  18.                   if(!WIFEXITED(status) || SEXITSTATUS != 0)
  19.                         error();
  20.             }

In the example, the parent sleeps while waiting for the child. One way to avoid this is through a signal sent by the child to parent, called SIGCHLD. By default, the signal is ignored and a zombie is created.

Signal handlers are extremely vulnerable to race conditions, since another signal could come in at any time. In general, it is best to have a simple handler function. For maximum portability, it is suggested that a signal handler:

The signal function pairs the signal with its handler. The handler function may be set to SIG_DFL (default) or SIG_IGN (ignore):

            void* signal(int sig, void (*func)(int))

Zombie Processes

A zombie process is a dead process which has completed executed but still has an entry in the process descriptor table. The entry in the process descriptor table is cached so to allow the process that created the process can later read its exit status. The parent can read the child's exit status by executing the wait system call, at which stage the zombie is removed. The wait call can be executed in sequential code, but it is commonly executed in a handler for the SIGCHLD signal, which the parent is sent whenever a child has died.

After the zombie is removed, its process ID and entry in the process table can then be reused. However, if a parent fails to call wait, the zombie will be left in the process table. In some situations this may be desirable, for example if the parent creates another child process it ensures that it will not be allocated the same process ID. As a special case, under Linux, if the parent explicitly ignores the SIGCHLD (sets the handler to SIG_IGN, rather than simply ignoring the signal by default), all child exit status information will be discarded and no zombie processes will be left.

To remove zombies from a system, the SIGCHLD signal can be sent to the parent manually, using the kill command. If the parent process still refuses to reap the zombie, the next step would be to remove the parent process. When a process loses its parent, init becomes its new parent. Init periodically executes the wait system call to reap any zombies with init as parent.

Introduction to threads

A process is a program in execution. A process generally includes the processors registers, program counter, process stack, which contains temporary data (such as method parameters, return addresses, and local variables), and a data section, which contains global variables

Threads are a way for a program to fork (or split) itself into two or more simultaneously (or pseudo-simultaneously) running tasks. Threads and processes differ from one operating system to another but, in general, a thread is contained inside a process and different threads of the same process share some resources while different processes do not.


 

Process

Threads

Address Space

Different

Shared

File descriptor ( Diffe

Different

Shared

Process instructions

Different

Shared

Signals and signal handlers

Different

Shared

current working directories

Different

Shared

registers

Different

Different

State(Blocked, Ready, Zombie)

Different

Different

Stack

Different

Diffreent

Thread ID

Different

Different

Signal Mask

Different

Different

POSIX standard supports the following functions for thread creation

·       int pthread_create(pthread_t *restrict thread, const pthread_attr_t *restrict attr, void *(*start_routine)(void*), void *restrict arg);

·       int pthread_join(pthread_t thread, void **value_ptr);

The pthread_join function shall suspend execution of the calling thread until the target thread terminates, void pthread_exit(void *value_ptr);

The pthread_exit function shall terminate the calling thread and make the value value_ptr available to any successful join with the terminating thread.

SCHEDULING

Scheduling is a fundamental operating-system function. Good scheduling heuristics are important for optimal resource usage . For example, CPU is one of the primary computer resource. Thus, its scheduling is central to operating-system design.

Scheduling Criteria

Different CPU-scheduling algorithms have different properties and may favor one class of processes over another. In choosing which algorithm to use in a particular situation, we must consider the properties of the various algorithms. Many criteria have been suggested for comparing CPU-scheduling algorithms. The characteristics used for comparison can make a substantial difference in the determination of the best algorithm.

The criteria include the following:

·       CPU utilization

We want to keep the CPU as busy as possible. CPU utilization may range from 0 to 100 percent.

·       Throughput

If the CPU is busy executing processes, then work is being done. One measure of work is the number of processes completed per time unit, called throughput.

·       Latency or Turnaround time

From the point of view of a particular process, the a important criterion is how long it takes to execute that process. The interval from the time of submission of a process to the time of completion is the turnaround time Turnaround time is the sum of the periods spent waiting to get into memory, waiting in the ready queue, executing on the CPU, and doing I/O.

·       Waiting time

The CPU scheduling algorithm does not affect the amount of time during which a process executes or does I/O; it affects only the amount of time that a process spends waiting in the ready queue. Waiting time is the sum of the periods spent waiting in the ready queue. Response time: In an interactive system, turnaround time may not be a the best criterion.

·       Response time

Often, a process can produce some output fairly early, and can continue computing new results while previous results are being output to the user. Thus, another measure is the time from the submission of a request until the first response is produced. This measure, called response time, is the amount of time it takes to start responding, but not the time that it takes to output that response.

·       Turnaround time

The turnaround time is generally limited by the speed of the output device. We want to maximize CPU utilization and throughput, and to minimize turnaround time, waiting time, and response time. In most cases, we optimize the average measure.

However, in some circumstances we want to optimize the minimum or maximum values, rather than the average. For example, to guarantee that all users get good service, we may want to minimize the maximum response time. For interactive systems (such as time-sharing systems) minimizing the variance in the response time is more important than minimizing the average response time. A system with reasonable and predictable response time may be considered more desirable than a system that is faster on the average, but is highly variable. We will look in more details of the scheduling algorithms in the next class.

References

  1. Wikipedia: Signal(computing)
  2. Wikipedia: <signal.h>

3.     http://en.wikipedia.org/wiki/Zombie_process

4.     http://en.wikipedia.org/wiki/Thread_(computer_science)

5.     Operating System Concepts - Silberschatz Galvin