In a Unix system, everything is a file or a process. (In fact, processes are also files!)
On a physical machine running a bunch of processes, each process thinks it has total control of the machine; that is, it thinks it owns the ALU, a set of registers, the memory, and the I/O devices. But how can every process use all of the system's resources when those resources are limited?
In fact, each process runs on a virtual machine, rather than on the real physical machine.
| Virtual machine (pretended) | Physical machine (real) |
|---|---|
| ALU | ALU |
| Registers | Registers* |
| Memory (RAM) | Physical memory** |
| I/O devices (system call) | Emulated I/O (kernel implementation) |
A process can use the ALU however it wants, as if it owns the ALU. The ALU, as the unit in charge of arithmetic computation, does not require any information to be stored anywhere between two jobs. That is, the ALU is stateless. Every process can use the ALU by sending it a collection of instructions and waiting for the result.
Notice that if a context switch from process A to process B happens before the ALU returns the result to process A, the ALU simply discards all computation it has done, rather than trying to store the half-finished result. It is far simpler to throw everything away and restart than to save partial state!
*The set of registers a process sees is the very set of registers that resides inside the CPU, but only while the process is running. When a process is not running, its register values are stored somewhere in memory: the registers are loaded from memory into the CPU when the process starts running, and copied from the CPU back to memory when the process is paused.
**A process thinks it has all the memory on the physical machine; however, that is not the case. It is only true while the process is running and accessing the memory.
On the virtual machine, I/O devices are accessed via system calls. On the real machine, however, the behavior of those emulated I/O devices is determined by the kernel implementation.
The kernel is the liar behind the scenes that tells every process the whole system belongs to it alone. To be convincing enough that no process ever realizes it is being fooled, the kernel has to maintain a table of lies (or process descriptors) so that its lies are never noticed.
A process table is basically an array of process descriptors, each of which stores important data about one process. So what exactly is stored inside a process descriptor?
* Register states need to be saved per process in order to restore execution.
* ALU state does NOT need to be saved, for the reason mentioned above.
* Program memory does NOT need to be saved. Instead, a pointer to the memory location is saved in one register.
* The file descriptor table needs to be saved in order to keep track of the files of a process.
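Sketched as C, a process descriptor might look like the following; every field name here is hypothetical, and a real kernel's version (e.g., Linux's struct task_struct) is far more elaborate:

```c
/* Hypothetical sketch of a process descriptor; real kernels store
 * much more (scheduling info, signal state, memory maps, ...). */
struct fildes;                        /* sketched below */

struct process_descriptor {
    int pid;                          /* process ID */
    int state;                        /* runnable, waiting, zombie, ... */
    unsigned long registers[16];      /* register values saved when the
                                         process is paused, restored when
                                         it runs again */
    struct fildes *fd_table[1024];    /* file descriptor table */
    /* no ALU state: the ALU is stateless, so there is nothing to save */
};
```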
The file descriptor table is a structure stored in a process descriptor. In Linux/Unix, a file descriptor is of type "int"; in other operating systems, a typical file descriptor might be a "pointer to a struct fildes".
Each file descriptor actually points to another location that describes a file: a structure with a "tag" part and a pointer part. The "tag" tells what kind of object it is (for example, whether it is a pipe), and the pointer part points to the real location of the file buffer.
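In C terms, that tagged structure might look roughly like this sketch (the type and tag names are made up for illustration):

```c
/* Hypothetical sketch of the object a file descriptor refers to. */
enum file_tag { TAG_REGULAR_FILE, TAG_PIPE };

struct fildes {
    enum file_tag tag;   /* what kind of object is this? e.g. a pipe */
    void *buffer;        /* points to the real location of the file buffer */
};
```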
A pipe comes with a fixed-size buffer in the kernel. Once that buffer is full, the pipe cannot accept more data until a reader reads some data out of the buffer. What should the kernel do with the writer in the meantime?
Possible solutions:

* Make the buffer bigger. This causes another problem: what if we have a lunatic writer who never stops? Our memory can be exhausted!
* Discard new data / discard old data. Both approaches can silently cause trouble because data is lost. However, if we have to choose between the two, it is easier to throw away the new data!
* Suspend/hang the writer until the buffer frees up. This is the approach adopted by Linux/Unix. However, there is still a problem: what if nobody ever reads from this pipe? The writer will be suspended indefinitely.
* Kill the writer (SIGPIPE). An extreme approach! Optionally, we can ignore SIGPIPE, and then the write system call will fail with errno set to EPIPE.
The reader side is symmetric: if the buffer is empty, the reader will be suspended. If the kernel detects that nobody will ever write to this pipe again (every write end has been closed), the read system call returns 0, i.e., EOF.
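Both behaviors can be observed with a short program; a minimal sketch, assuming the Linux/Unix semantics described above:

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    int fds[2];

    /* Writer with no reader: ignore SIGPIPE so write() fails with
     * errno == EPIPE instead of killing the process. */
    signal(SIGPIPE, SIG_IGN);
    pipe(fds);
    close(fds[0]);                               /* no reader will ever exist */
    if (write(fds[1], "x", 1) < 0)
        printf("write failed: %s\n", strerror(errno));  /* EPIPE */
    close(fds[1]);

    /* Reader with no writer: read() returns 0, i.e., EOF. */
    pipe(fds);
    close(fds[1]);                               /* no writer will ever exist */
    char c;
    printf("read returned %zd (EOF)\n", read(fds[0], &c, 1));
    close(fds[0]);
    return 0;
}
```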
A named pipe is just like a regular pipe, but with a name.
It can be created by:

$ mkfifo /tmp/pipe

This creates an actual file, and the pipe can then be used by its name.
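For example, two unrelated processes can rendezvous at /tmp/pipe purely by name. A minimal sketch (the "w" command-line flag is an invention of this example):

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Run one copy with argument "w" (writer) and another with no
 * argument (reader); they find each other through the FIFO's name. */
int main(int argc, char **argv) {
    mkfifo("/tmp/pipe", 0666);                 /* no-op if it already exists */
    if (argc > 1 && argv[1][0] == 'w') {
        int fd = open("/tmp/pipe", O_WRONLY);  /* blocks until a reader opens */
        write(fd, "hello\n", 6);
        close(fd);
    } else {
        int fd = open("/tmp/pipe", O_RDONLY);  /* blocks until a writer opens */
        char buf[64];
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) fwrite(buf, 1, n, stdout);
        close(fd);
    }
    return 0;
}
```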
Create a pipe but never use it
When a user creates a named pipe, the kernel has no way to determine whether the user will ever use it. That can lead to a "file descriptor leak" (leaking a pipe).
Deadlock
A deadlock is a circular pattern of read/write operations that induces a situation where each process is waiting for some other process to finish first.
Assume we have a simple parent process that writes to sed via one pipe, p1, while at the same time reading from sed via another pipe, p2. For its part, sed sends three characters into the write end of p2 for each character it reads from p1. Since both pipes have fixed buffer sizes, a deadlock can occur:
First, as the parent writes, sed reads from p1 and, at the same time, writes the tripled output to p2. Notice that our parent process is not reading at this point.
Then, when the buffer in p2 is full, sed hangs and stops reading from p1.
Finally, both p1 and p2 have full buffers, producing a situation in which both processes are suspended. A deadlock has occurred.
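A hedged C sketch of this scenario; sed 's/./&&&/g' is one assumed way to make sed emit three characters per input character:

```c
#include <unistd.h>

/* Sketch of the deadlock: the parent pumps data into sed through p1
 * but never reads from p2. Once p2's buffer fills, sed blocks in
 * write() and stops draining p1; then p1 fills and the parent blocks
 * in write() as well: deadlock. */
int main(void) {
    int p1[2], p2[2];
    pipe(p1);                        /* parent -> sed */
    pipe(p2);                        /* sed -> parent */

    if (fork() == 0) {               /* child becomes sed */
        dup2(p1[0], 0);              /* stdin  = read end of p1 */
        dup2(p2[1], 1);              /* stdout = write end of p2 */
        close(p1[0]); close(p1[1]);
        close(p2[0]); close(p2[1]);
        execlp("sed", "sed", "s/./&&&/g", (char *)0);
        _exit(127);                  /* exec failed */
    }

    close(p1[0]);
    close(p2[1]);

    /* The bug: write everything first, read from p2 later (never). */
    for (;;)
        write(p1[1], "xxxxxxxx\n", 9);
}
```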
Suppose we have the following situation:
$ (rm bigfile && grep interesting) < bigfile
Will the grep command work?

The answer is yes. The reason is that, at a lower level, files are accessed via the file descriptor table rather than by filename. In other words, Linux/Unix allows a file to be accessed without a name (think about pipes). Also, the rm command removes only the filename of the file; all its data is still sitting on the disk.
The file is completely removed from disk only after the program exits, i.e., when the last file descriptor referring to it is closed.
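This is easy to verify with a short sketch: open a file, remove its name, and keep reading through the descriptor.

```c
#include <fcntl.h>
#include <unistd.h>

int main(void) {
    int fd = open("bigfile", O_RDONLY);
    if (fd < 0)
        return 1;
    unlink("bigfile");               /* the name is gone; the data is not */

    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)  /* still readable */
        write(1, buf, n);
    close(fd);                       /* now the disk space can be reclaimed */
    return 0;
}
```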
Signals are messy, complex and not so easy to use. However, signals are still a central part of an operating system, for the following reasons:
* Asynchronous I/O (aread(), get SIGIO)
* Errors in code (divide by zero, floating-point overflow, invalid instruction). In a big program that may well have problems, we don't want a total crash for every little bug!
* An impatient user typing Ctrl+C at an infinite loop (SIGINT)
* Impending power outage (SIGPWR)
* Notifying the parent process when a child exits (SIGCHLD), instead of polling with p = waitpid(-1, &status, WNOHANG) <- bad!
* The user left (SIGHUP)
* Terminating a nasty process (SIGKILL), e.g. one stuck in:

while (fork())
    continue;

* Reacting in a fixed amount of time: alarm(20); <- delivers SIGALRM in 20 seconds

Signals can also be sent manually from the shell:

$ kill -SIGKILL [process #]
$ kill -STOP [process #]
$ kill -CONT [process #]
Signals are received in the kernel. Upon delivery of a signal, the kernel may run a default subroutine, ignore the signal, or execute a user-defined procedure. That user-defined procedure is called a signal handler.
A user-defined handler can be attached to a signal via a system call:

typedef void (*sighandler_t)(int);
sighandler_t signal(int signo, sighandler_t handler);

This system call takes in an integer indicating a signal number and a function pointer to the signal handler function, and it returns the previously installed handler (also of type sighandler_t).
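For instance, a handler for SIGINT (Ctrl+C) might be installed like this minimal sketch:

```c
#include <signal.h>
#include <unistd.h>

static void handle_sigint(int signo) {
    (void) signo;                       /* unused */
    /* Only async-signal-safe functions belong in a handler; write() is one. */
    write(1, "caught SIGINT\n", 14);
}

int main(void) {
    signal(SIGINT, handle_sigint);      /* returns the previous handler */
    for (;;)
        pause();                        /* sleep until a signal arrives */
}
```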
At a lower level, a signal handler can be invoked between any two instructions in the assembly code. For example:

movl ...   <- if a signal arrives here, sighandler is called
addl ...   <- if a signal arrives here, sighandler is called
ret