CS 111 Lec 6 Scribe Note

01/25/16

by Zhanyang Li




Operating System Organization Continued

Processes and Files

In a Unix system, everything is a file or a process (in fact, processes are also files!).

On a physical machine that runs many processes, each process thinks it has total control of the machine: it believes it owns the ALU, a set of registers, the memory, and the I/O devices. How can every process use all of the system's resources when those resources are limited?

In fact, each process runs on a virtual machine, rather than the real physical machine.

Virtual machine (pretended)     Physical machine (real)
ALU                             ALU
Registers                       Registers *
Memory (RAM)                    Physical memory **
I/O devices (system calls)      Emulated I/O (kernel implementation)

ALU

A process can use the ALU however it wants, as if it owned it. The ALU, the unit in charge of arithmetic, does not need to store any information between two jobs. That is, the ALU is stateless. Any process can use the ALU by sending it instructions and waiting for the results.

Notice that if a context switch from process A to process B happens before the ALU has returned its result to process A, the ALU simply discards the computation in progress rather than trying to store the half-finished result. Saving a partial result would be clumsier than throwing everything away and restarting the computation when A runs again.

Registers

The registers of a process are the same registers that reside inside the CPU, *only while the process is running. When a process is not running, its registers are stored somewhere in memory. They are loaded from memory into the CPU when the process starts to run, and copied from the CPU back to memory when the process pauses.
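
The save and restore steps described above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the names `struct registers`, `struct process`, `cpu`, `NUM_REGS` and `context_switch` are hypothetical, and a real kernel does this in architecture-specific assembly rather than with a struct copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 8   /* hypothetical register count, for illustration only */

/* Hypothetical register file: the single physical copy inside the CPU. */
struct registers {
    uint64_t gpr[NUM_REGS];  /* general-purpose registers */
    uint64_t pc;             /* program counter */
};

/* Hypothetical process descriptor holding the in-memory copy. */
struct process {
    int pid;
    struct registers saved;  /* meaningful only while the process is NOT running */
};

static struct registers cpu;            /* simulated CPU registers */
static struct process *current = NULL;  /* process that owns `cpu` right now */

/* Pause the current process (if any) and resume `next`. */
void context_switch(struct process *next)
{
    assert(next != NULL);
    if (current != NULL)
        current->saved = cpu;   /* CPU -> memory: the running process pauses */
    cpu = next->saved;          /* memory -> CPU: the next process resumes */
    current = next;
}
```

Calling context_switch(&a), then context_switch(&b), then context_switch(&a) again leaves `cpu` holding exactly the values A had when it was paused, which is why A never notices that B ran in between.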

Memory

A process thinks it has all of the memory on the physical machine, but that is not the case. It is only true while the process is running and accessing that memory.

I/O

On the virtual machine, I/O devices are accessed via system calls. On the real machine, however, the behavior of these emulated I/O devices is determined by the kernel implementation.


Kernel Memory

The kernel is the liar behind the scenes who tells every process that the whole system belongs to it alone. To keep any process from realizing that it is being fooled, the kernel has to maintain a table of lies (the process descriptors) so that its lies are never noticed.

Process descriptor table

A process table is basically an array of process descriptors. Each process descriptor stores important data about one process. So what exactly is stored inside it?

File descriptor table (I/O)

The file descriptor table is a structure stored in a process descriptor. In Linux/Unix, a file descriptor is of type "int", used as an index into this table; in other operating systems, a typical file descriptor would be a "pointer to a struct fildes".

Each entry in the table points to a structure that describes a file. That structure has a "tag" part and a pointer part: the tag tells whether the file is a pipe, and the pointer points to the real location of the file's buffer.
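
The layout described above might look roughly like the sketch below. All names here (`struct open_file`, `struct proc_descriptor`, `MAX_FDS`, `fd_alloc`, `fd_lookup`) are hypothetical and chosen for illustration; real kernels keep considerably more state per entry.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FDS 16  /* hypothetical per-process limit */

enum file_tag { FILE_REGULAR, FILE_PIPE };

/* What a file descriptor ultimately refers to: a tag plus a pointer. */
struct open_file {
    enum file_tag tag;   /* tells the kernel whether this is a pipe */
    char *buffer;        /* where the file's buffered data lives */
};

/* Hypothetical process descriptor, reduced to the I/O part. */
struct proc_descriptor {
    int pid;
    struct open_file *fd_table[MAX_FDS];  /* indexed by the int fd */
};

/* Install `f` in the lowest free slot; return the new fd, or -1 if full. */
int fd_alloc(struct proc_descriptor *p, struct open_file *f)
{
    assert(p != NULL && f != NULL);
    for (int fd = 0; fd < MAX_FDS; fd++) {
        if (p->fd_table[fd] == NULL) {
            p->fd_table[fd] = f;
            return fd;
        }
    }
    return -1;
}

/* Translate an fd into its open-file structure; NULL if the fd is invalid. */
struct open_file *fd_lookup(const struct proc_descriptor *p, int fd)
{
    assert(p != NULL);
    if (fd < 0 || fd >= MAX_FDS)
        return NULL;
    return p->fd_table[fd];
}
```

Because the process only ever holds the small integer, the kernel stays free to change what sits behind it, and it can reject any fd that does not correspond to a valid entry.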




What can go wrong with pipes?

Named pipe

A named pipe is just like a regular pipe, but with a name.

It can be created by:
$ mkfifo /tmp/pipe      (this creates a file)
The pipe can then be opened and used by its name, like any other file.
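
The same thing can be done from C with the POSIX mkfifo() library call. The sketch below is a minimal illustration, not a definitive implementation: the helper name `fifo_roundtrip` is hypothetical, and it assumes a POSIX system where the caller may create files at `path`. A child process writes a message into the named pipe and the parent reads it back, with both sides finding the pipe purely by its name.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create a FIFO at `path`, let a child write `msg` into it, and read the
 * message back in the parent. Returns the number of bytes read, or -1. */
ssize_t fifo_roundtrip(const char *path, const char *msg, char *out, size_t cap)
{
    assert(cap > 0);
    if (mkfifo(path, 0600) == -1)        /* same effect as `mkfifo path` */
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {                      /* child: the writer */
        int wfd = open(path, O_WRONLY);  /* blocks until a reader opens */
        if (wfd == -1)
            _exit(1);
        ssize_t w = write(wfd, msg, strlen(msg));
        close(wfd);
        _exit(w == (ssize_t)strlen(msg) ? 0 : 1);
    }

    ssize_t total = -1;
    int rfd = open(path, O_RDONLY);      /* blocks until the writer opens */
    if (rfd == -1) {
        kill(pid, SIGKILL);              /* writer would otherwise block forever */
    } else {
        ssize_t n;
        total = 0;
        /* Read until EOF, which arrives when the writer closes its end. */
        while ((size_t)total < cap - 1 &&
               (n = read(rfd, out + total, cap - 1 - (size_t)total)) > 0)
            total += n;
        out[total] = '\0';
        close(rfd);
    }
    waitpid(pid, NULL, 0);
    unlink(path);                        /* remove the name when done */
    return total;
}
```

Note that each open() blocks until the other end has also been opened, so the reader and the writer have to live in different processes (or threads) for this to make progress.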

Problems with a pipe:




Orthogonality question

Suppose we have the following situation:
$ (rm bigfile && grep interesting) < bigfile
Will the grep command work?

The answer is yes. The shell opens bigfile as standard input before rm runs, and an open file is accessed through the file descriptor table, at a lower level than filenames. In other words, Linux/Unix allows a file to be accessed without a name (think about pipes). Also, rm removes only the filename; all of the file's data is still sitting on the disk. The storage is reclaimed only once no name and no open file descriptor refers to the file, which here happens when the programs exit.
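
The same behavior can be reproduced in a few lines of C. This is a minimal sketch under stated assumptions: the helper name `read_after_unlink` is hypothetical, and it assumes a POSIX system where the caller may create a file at `path`. It opens a file, removes its name with unlink() (the system call behind rm), and then reads the data through the descriptor that is still open.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write `data` to a new file at `path`, open it, remove its name, and then
 * read it back through the still-open descriptor.
 * Returns the number of bytes read into `out`, or -1 on error. */
ssize_t read_after_unlink(const char *path, const char *data,
                          char *out, size_t cap)
{
    assert(cap > 0);
    int wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (wfd == -1)
        return -1;
    size_t len = strlen(data);
    if (write(wfd, data, len) != (ssize_t)len) {
        close(wfd);
        unlink(path);
        return -1;
    }
    close(wfd);

    int rfd = open(path, O_RDONLY);   /* plays the role of `< bigfile` */
    if (rfd == -1) {
        unlink(path);
        return -1;
    }
    if (unlink(path) == -1) {         /* plays the role of `rm bigfile` */
        close(rfd);
        return -1;
    }

    /* The name is gone, but the descriptor still reaches the data. */
    ssize_t n = read(rfd, out, cap - 1);
    if (n >= 0)
        out[n] = '\0';
    close(rfd);                       /* last reference gone: space can be reclaimed */
    return n;
}
```

After the call returns, the path no longer exists, yet the read in the middle still succeeded.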




Signals

Signals are messy, complex and not so easy to use. However, signals are still a central part of an operating system, for the following reasons:

Uses of signals

Receive signals

Signals are received by the kernel. When a signal is delivered, the kernel may run a default action, ignore the signal, or execute a user-defined procedure. That user-defined procedure is called a signal handler.

A user-defined handler can be associated with a signal via a system call:
typedef void (*sighandler_t)(int);
sighandler_t signal(int signo, sighandler_t handler);

This system call takes an integer signal number and a pointer to the signal handler function. It returns a sighandler_t: the previous handler for that signal, or SIG_ERR on failure.

At a lower level, a signal handler can be invoked between any two instructions in assembly code. For example:

    movl ...
            <- if a signal arrives, call sighandler
    addl ...
            <- if a signal arrives, call sighandler
    ret
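
Installing and triggering a handler can be sketched in C as follows. This is a minimal illustration, assuming a POSIX system (SIGUSR1 is not part of ISO C); the names `on_sigusr1`, `got_signal` and `signal_demo` are hypothetical. Because the handler may run between any two instructions, it does nothing except store into a `volatile sig_atomic_t` variable.

```c
#include <assert.h>
#include <signal.h>

/* Written by the handler and read by normal code, so it must be a
 * volatile sig_atomic_t. */
static volatile sig_atomic_t got_signal = 0;

static void on_sigusr1(int signo)
{
    got_signal = signo;   /* record which signal arrived, nothing more */
}

/* Install the handler, send SIGUSR1 to ourselves, restore the old handler.
 * Returns the signal number the handler saw, or -1 on error. */
int signal_demo(void)
{
    void (*old)(int) = signal(SIGUSR1, on_sigusr1);
    if (old == SIG_ERR)
        return -1;

    got_signal = 0;
    if (raise(SIGUSR1) != 0) {
        signal(SIGUSR1, old);
        return -1;
    }
    /* raise() returns only after the handler has returned. */
    assert(got_signal != 0);

    signal(SIGUSR1, old);   /* put back whatever was installed before */
    return got_signal;
}
```

The value returned by signal() is kept so the previous disposition can be restored afterwards. New code often prefers sigaction(), which gives more precise control, but signal() is the interface named in these notes.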