CS 111 Scribe Note Lecture 4 (Fall 2014)

Goals

Ease of interface/use/simplicity: should be easy to use
Reliability/robustness/durability: should handle abnormal inputs, errors, and situations.
Efficiency: should be efficient
Mutability/flexibility: change in one module does not affect the other modules
Security: protect user and user's data

Fundamental Abstractions for Systems

1. Memory API

Memory is the system component that remembers data values for use in computation. All memory devices fit a simple abstract model that has two operations, named WRITE and READ:

e.g.
    *p = v //write
    (*p)   //read

Issues to be considered:
- total size
- word size (e.g., x86 memory is addressed in 8-bit bytes): Memory is often accessed in fixed-length units such as bytes, words (a small integer number of bytes, typically 2, 4, or 8), lines (several words), and blocks (a number of bytes, usually a power of 2)
- speed (latency & throughput)
- volatility: A volatile memory consumes energy to store information; if its power supply is interrupted, it forgets its stored information. A non-volatile memory does not require energy to keep information stored.
- linear vs associative addressing: Linear addressing names a cell by its location, while associative addressing names a cell by the data/content stored in it.
- coherence (atomic r/w access): Read/write coherence means that the result of the READ of a named cell is always the same as the most recent WRITE to that cell.
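
The READ/WRITE model and the two addressing styles above can be sketched in a few lines of C. This is only an illustration under assumptions: the sizes MEM_WORDS and KEY_LEN and the names mem_write, mem_read, assoc_write, and assoc_read are hypothetical and do not come from the lecture.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MEM_WORDS 16   /* total size in cells (hypothetical) */
#define KEY_LEN   7    /* longest key kept by the associative memory */

/* Linear memory: a cell is named by its location (an index). */
static uint32_t cells[MEM_WORDS];

void mem_write(size_t addr, uint32_t v) { cells[addr % MEM_WORDS] = v; }
uint32_t mem_read(size_t addr)          { return cells[addr % MEM_WORDS]; }

/* Associative memory: a cell is named by content, here a short key. */
struct assoc_cell { int used; char key[KEY_LEN + 1]; uint32_t value; };
static struct assoc_cell table[MEM_WORDS];

/* WRITE: update the cell holding this key, or claim a free cell.
   Returns 0 on success, -1 if the memory is full. */
int assoc_write(const char *key, uint32_t v) {
    int free_slot = -1;
    for (int i = 0; i < MEM_WORDS; i++) {
        if (table[i].used && strncmp(table[i].key, key, KEY_LEN) == 0) {
            table[i].value = v;
            return 0;
        }
        if (!table[i].used && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;
    table[free_slot].used = 1;
    strncpy(table[free_slot].key, key, KEY_LEN);  /* keys are truncated */
    table[free_slot].key[KEY_LEN] = '\0';
    table[free_slot].value = v;
    return 0;
}

/* READ: returns 0 and stores the value in *out if the key is present. */
int assoc_read(const char *key, uint32_t *out) {
    for (int i = 0; i < MEM_WORDS; i++) {
        if (table[i].used && strncmp(table[i].key, key, KEY_LEN) == 0) {
            *out = table[i].value;
            return 0;
        }
    }
    return -1;
}
```

Coherence in this single-threaded sketch is trivial: mem_read(3) after mem_write(3, 42) yields 42, and a later mem_write(3, 7) makes the next read yield 7.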

2. Interpreters API

Interpreters are the active elements of a computer system; they perform the actions that constitute computations. Examples: disk controller, Python.

e.g.
       v      =      f          (p)
       |             |           |
    answer     interpreter   program

Issues to be considered:
- Efficiency (hardware support coded)
- Interrupts: Interrupts catch the attention of the interpreter, rather than the program.
- They should be "safe"
Cons
- Require hardware support (virtualized processors)

3. Link API - Message Passing API

A communication link provides a way for information to move between physically separated components (e.g., an I/O bus).

    send(link_name, buffer of data)
    recv(link_name, buffer of data)
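
As a rough sketch of the send/recv interface above, the following C code builds a link out of a Unix socket pair. The struct link type and the link_open, link_send, link_recv, and link_close names are hypothetical wrappers chosen for illustration; a real link (such as an I/O bus) would look different.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* A hypothetical link: two connected endpoints backed by a socket pair. */
struct link { int end[2]; };

/* Create the link. Returns 0 on success, -1 on failure. */
int link_open(struct link *l) {
    return socketpair(AF_UNIX, SOCK_DGRAM, 0, l->end);
}

/* Send one message (a buffer of data) into an endpoint of the link. */
ssize_t link_send(int endpoint, const void *buf, size_t len) {
    return send(endpoint, buf, len, 0);
}

/* Receive one message from an endpoint; blocks until a message arrives. */
ssize_t link_recv(int endpoint, void *buf, size_t len) {
    return recv(endpoint, buf, len, 0);
}

void link_close(struct link *l) {
    close(l->end[0]);
    close(l->end[1]);
}
```

A message sent with link_send(l.end[0], ...) can be picked up with link_recv(l.end[1], ...). Note that the data is copied into and out of the kernel, which is the performance cost mentioned in these notes.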

Design OS as an Object-Oriented Program

Classes for I/O device, bus, memory, interpreter... one for each fundamental part, plus other functions (the road not taken)
- Downsides:
  - C++ has more pointer tracing and is more flexible, but it is also more complex for compilers, which may take a long time
Using Message Passing
- The network is abstracted away
- Kernel can be anywhere
- Downsides:
  - Buffer storage: needs serialization (tough on pointers)
  - Performance issues (needs copying)
Using Link API
- Downsides:
  - Requires HW support (virtualizable processor)

How Virtual Machines Work

ALU & Registers => user code accesses the actual ALU & registers at full speed
Primary RAM => want full-speed access to some physical memory and deny access to the rest
I/O device => very slow, so it is OK to make it a little slower
Other resource: time => may make interpreters give up after running for 1s

Layering

We divide OS into layers

Wedding Cake Diagram

Ring Diagram

A wedding cake/ring diagram shows the system's privilege layers.

Ring 0	kernel
Ring 1	devices
Ring 2	daemons
Ring 3	app

In Linux, Rings 0 through 2 are collapsed into Ring 0 (the kernel), and Ring 3 remains Ring 3 (applications).

Instructions

Instructions can be divided into two sets:

1. Privileged instructions

These are dangerous instructions. Only the kernel can execute them; applications cannot. Applications ask the kernel to execute these instructions on their behalf through system calls.

2. Unprivileged instructions

Normal instructions, can be executed by applications.
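
To illustrate how an application asks the kernel for a service, the sketch below obtains the process id twice: once through the raw system call interface and once through the usual library wrapper. It assumes Linux with glibc, where syscall() and SYS_getpid are available; the function names pid_via_raw_syscall and pid_via_wrapper are made up for this example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for our process id through the raw system call interface.
   (Assumes Linux: syscall() and SYS_getpid are not portable.) */
pid_t pid_via_raw_syscall(void) {
    return (pid_t)syscall(SYS_getpid);
}

/* The same request through the C library wrapper. */
pid_t pid_via_wrapper(void) {
    return getpid();
}
```

Both functions should return the same value, since the wrapper ends up making the same request to the kernel.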

Operating System & Virtualizable Processor

Together, these let you support a process (namely, a running program) in isolation, on a virtual interpreter. The program sees itself as a standalone program running on its own virtual machine.

In Unix-like systems, fork() is an operation whereby a process creates a copy of itself. It is usually a system call, implemented in the kernel. Fork is the primary (and historically, only) method of process creation on Unix-like operating systems.

To create a process:

  pid_t fork(void);       (in c/c++)

fork() clones the current process (registers, stacks, ...). It returns 0 in the child process, the child's process id in the parent process, or -1 if fork() fails.

To destroy a process: (2 ways)

  (1) _Noreturn void exit(int); // no return attribute
  //exit status is an 8-bit integer, 0~255
  //EXIT_SUCCESS = 0; EXIT_FAILURE = 1;
  (2) _exit(int);

(1) is a library function, it flushes output buffers.
(2) is a system call, it calls int 0x80 and kernel will take over and destroy the process.

Both terminate the current process, close all file descriptors belonging to the process, and report the exit status given by the integer argument.

Following is a code that shows how fork() works:

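#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void){
	pid_t p = fork();
	if (p == -1)
		perror("fork");              /* fork failed: no child was created */
	else if (p == 0)
		printf("I'm a child\n");     /* fork() returned 0: this is the child */
	else
		printf("I'm a parent\n");    /* fork() returned the child's pid */
	return 0;
}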

Suppose we have one physical CPU and more than one process. Each process considers itself the only process running on the system. How can the OS manage these processes at the same time? We use a process table (which lives in memory) to store information about each process, such as its process id, register values (eip, esp, eax, ebx, ...), exit status, and so on. When the OS wants to switch to another process, it saves the values in the registers into the process table. When the OS resumes a process, it first loads all the register values from the process table back into the registers.
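
The save-and-restore idea can be sketched as a toy model in C. Everything here is an assumption made for illustration: the struct layouts, the table size MAX_PROCS, and the context_switch function are hypothetical, and the global cpu variable merely stands in for the real registers, which a real kernel would save and restore in assembly.

```c
#include <stdint.h>

#define MAX_PROCS 4   /* hypothetical table size */

/* Hypothetical register file for a 32-bit x86-like CPU. */
struct regs { uint32_t eip, esp, eax, ebx; };

/* One process table entry: the state of a process that is not running. */
struct proc {
    int pid;
    int exit_status;
    struct regs saved;
};

static struct proc proc_table[MAX_PROCS];
static struct regs cpu;      /* stands in for the one physical CPU */
static int current = 0;      /* table index of the running process */

/* Switch from the running process to proc_table[next]. */
void context_switch(int next) {
    proc_table[current].saved = cpu;   /* save the outgoing registers */
    cpu = proc_table[next].saved;      /* restore the incoming registers */
    current = next;
}
```

After context_switch(1) followed by context_switch(0), the first process finds its registers exactly as it left them, which is why it can believe it is the only process on the machine.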

When fork() is called to create a child process, the kernel does the following:

Find an empty slot in the process descriptor table and get a pid.
Copy the parent's registers & stack into the child's registers & stack in the process descriptor.
Overwrite parent->regs->eax = child pid
Overwrite child->regs->eax = 0
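
These steps can be sketched as a user-space simulation. The table layout, the sizes NPROC and STACK_BYTES, and the sim_fork function are all hypothetical; the sketch also assumes that slot 0 is reserved, so that an eax value of 0 can only mean "I am the child".

```c
#include <stdint.h>

#define NPROC       8    /* hypothetical table size */
#define STACK_BYTES 64   /* hypothetical (tiny) per-process stack */

struct pregs { uint32_t eip, esp, eax, ebx; };

struct pentry {
    int in_use;
    struct pregs regs;
    unsigned char stack[STACK_BYTES];
};

static struct pentry ptable[NPROC];

/* Simulated fork. A pid is an index into ptable; slot 0 is reserved.
   Returns the child's pid, or -1 if the parent is invalid or the table
   is full. */
int sim_fork(int parent_pid) {
    if (parent_pid <= 0 || parent_pid >= NPROC || !ptable[parent_pid].in_use)
        return -1;

    int child_pid = -1;
    for (int i = 1; i < NPROC; i++) {      /* 1. find an empty slot */
        if (!ptable[i].in_use) {
            child_pid = i;
            break;
        }
    }
    if (child_pid < 0)
        return -1;                         /* process table is full */

    ptable[child_pid] = ptable[parent_pid];              /* 2. copy regs & stack */
    ptable[parent_pid].regs.eax = (uint32_t)child_pid;   /* 3. parent sees pid */
    ptable[child_pid].regs.eax = 0;                      /* 4. child sees 0 */
    return child_pid;
}
```

Since eax holds a function's return value on x86, the two overwrites are what make the single fork() call appear to return twice with different values.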

How process scheduling works

Suppose a process does a read()

e.g.
    read(fd, buffer, 1000);

The system call traps into the kernel: int 0x80
The OS saves the current process's registers in the process table
The OS resumes some other process from the table

How does a process access memory?

Each process has its own view of memory; they may share some part of memory if there is not a collision.

How exit() works in a program

  pid_t p = fork();
  _exit(2); //SYSCALL

  int main(void){
    return 0;
  } //  => exit(0);

pid_t waitpid(pid_t p, int *status, int flag); //destructor for process
                    |         |           |
                pid of      location     normally, 0,
                waited-for  to store     WNOHANG
                process     its status

Return value: on success, returns the process id of the child process; if WNOHANG was specified and one or more child(ren) specified by pid exist but have not yet changed state, 0 is returned; if an error occurs, -1 is returned.
Stores the exit status of the child process in *status.
For the flag, 0 means that the parent process waits until the child process has finished, while WNOHANG means do not wait.
Any child process that exits becomes a zombie process; exit() doesn't remove that process' entry from the process table. The finished process stays there until waitpid() is called on it; then its entry in the process table is cleaned up and can be reused by another process.
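
A minimal sketch of the fork/waitpid pairing follows. The helper name child_exit_status is hypothetical; WIFEXITED and WEXITSTATUS are the standard macros for decoding *status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that terminates with the given code. Returns the exit
   status that the parent observes, or -1 on error. */
int child_exit_status(int code) {
    pid_t p = fork();
    if (p == -1)
        return -1;
    if (p == 0)
        _exit(code);                      /* child: terminate at once */

    int status;
    if (waitpid(p, &status, 0) != p)      /* flag 0: block until child is done */
        return -1;                        /* (this also reaps the zombie) */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, child_exit_status(2) yields 2. Because the exit status is an 8-bit value, child_exit_status(259) yields 3 (259 mod 256).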

A program that calls fork() may proceed like this:

    fork();          //- parent process
      |
   run_child();      //- child process
      |
    exit();          //- child process
      |
   waitpid();        //- parent process
      |
   run_parent();     //- parent process
      |
    exit();          //- parent process

A fork() BOMB!

linear:

while (fork() == 0)
  continue;

parent -> child -> grandchild ->......

tree:

while(true)
  fork();

parent  -> child   ->  grandchild
                   ->     ...
        ->  child  ->     ...

fork() will return -1 and set errno once the process table is full.

Pipe

-> Bounded queue of bytes stored in the kernel memory

To invoke the pipe syscall:

int pipe(int fd[2])
              |
           two file descriptors (fd[0] for reading, fd[1] for writing)

The reading process blocks if no data is available yet.
A file descriptor (FD) is an abstract indicator for accessing a file; in Linux it is an integer.
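
As a sketch of how a pipe connects two processes, the function below forks a child that writes a string into the pipe while the parent reads it back. The name pipe_roundtrip and its parameters are hypothetical; the sketch assumes a POSIX system and a message small enough to fit in the pipe's buffer.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send msg (including its terminating NUL) from a child process to the
   parent through a pipe. Stores what arrives in out (at most cap bytes)
   and returns the number of bytes received, or -1 on error. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t cap) {
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    pid_t p = fork();
    if (p == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (p == 0) {                          /* child: the writer */
        close(fd[0]);                      /* not reading */
        size_t len = strlen(msg) + 1;
        ssize_t w = write(fd[1], msg, len);
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    close(fd[1]);                          /* parent: the reader */
    ssize_t total = 0;
    ssize_t n = 0;
    while ((size_t)total < cap &&
           (n = read(fd[0], out + total, cap - (size_t)total)) > 0)
        total += n;                        /* read() blocks until data or EOF */
    close(fd[0]);
    waitpid(p, NULL, 0);                   /* reap the child */
    return n < 0 ? -1 : total;
}
```

Each side closes the end it does not use. The parent's read() sees end of file only once every copy of the write end has been closed, so forgetting close(fd[1]) in the parent would leave it blocked forever.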

OS Organization

Prepared by Yanbin Ren, Yijia Liu, Qiuhan Ding, Zhaoxing Bu