CS 111 Lecture 4 Scribe Notes (Winter 2012)

Prepared by Mary Chau, Tarry Chen, and Jessica Kain for a lecture given by Professor Paul Eggert on January 23, 2012

Table of Contents

  1. Hard Modularity
    1. Client/Service Organization
    2. Virtualization
    3. Virtualizable Processor
  2. Application View of Virtualization
  3. Hardware Traps
  4. Virtualization is Only One-Way Protection
  5. Why Would a Process Not Run?
  6. Ways for Applications to Create and Destroy Processes

Hard Modularity

Hard modularity is obtained by breaking the system into separate modules where the modules cannot violate interface boundaries.

Unfortunately, we cannot simply use function calls to form interface boundaries. Instead, there are two commonly used methods for accomplishing hard modularity: Client/Service Organization and Virtualization.

Client/Service Organization

If properly implemented, this structure allows us to achieve parallelism by taking each module and placing it on a separate computer. With the modules completely separated from each other, this organization ensures interface protection as well as an improvement in performance. To exemplify this organization, we will use the factorial example introduced in previous lectures. Note, however, that there is no parallelism in the code below, since the client does not run another process while waiting for the server to return.

An example of a client/server version of Factorial:
Client code (Caller):

send(fact_name, (m) {"!", 5});  // m is the message type
a = get_response(fact_name);
if (a.response_code == OK)
        print(a.val);
else
        return error();

Server Code (Callee):

for (;;) {
        receive(fact_name, request);
        if (request.opcode == "!") {
                n = request.val;
                response = (m) {OK, factorial(n)};      // compute n!
        }
        else
                response = (m) {NG, 0};
        send(fact_name, response);
}

Summary of the advantages and disadvantages:

Advantages:
  + hard modularity: client and server are protected from each other
  + client and server can be on different hosts
Disadvantages:
  - performance is worse (i.e., more CPU cycles)
  - more complicated

While some operating systems are built with the client/service organization, most are actually built with method II, virtualization.

Virtualization

A virtual machine uses one real machine to create another machine and emulate it. This method is used to protect the trusted code in the primary machine by placing and executing untrusted code in the "virtual machine".

How do we go about this? Start by writing an x86 emulator, which allows us to emulate the actions of another computer using our own, right down to the assembly level. A quick note though: the source code of this emulator is not exact, and bits and parts of it are "hand-waved".
Source code to emulator:

int emul(int start_ip){
        int ip = start_ip;
        for(;;){
                char ins = mem[ip++];
                switch(decode(ins)){
                        case pop:
                                ...
                }
        }
}
We can then add the factorial instruction to our emulator and run the designated program inside the emulator. From here, the client would be located in the emulated program while the server would be located in the actual emulator. Due to this separation, the client is never in control. In C, this can be done as follows:
r = emul(fact's code addr);
switch(r){
        case STACK_OVERFLOW:
                ...
        case TOO_MANY_INSNS:
                ...
        case OK:
                get return value of 'fact'
}
emulation_diagram
Figure 1
A representation of the relationship between a real machine (R) and a virtual process (V), which confines the disastrous code and prevents it from messing up the real machine.

What are the advantages/disadvantages?

Advantages:
  + hard modularity: virtualized code is in a safe box
  + real and virtual machines do not need to have the same architecture
Disadvantages:
  - extremely slow due to the emulation process
  - more CPU cycles are necessary to run any virtualized process

Virtualizable Processor

We can solve some of the performance problems of emulation by using processors that support virtualization. These processors give you control over what an emulated processor can do. In this particular scenario, we assume the emulated machine's architecture is similar enough to the real machine's architecture that they can use the same processor. Some points about virtualizable processors:
  1. The real machine needs to take over when the virtual machine issues "privileged instructions" such as 'halt', 'inb', 'outb', 'int', etc.
    These "privileged instructions" would let the virtual machine break out into the real machine, which we don't want to happen!
  2. The real machine takes over after a time interval, so the virtual machine cannot monopolize the CPU.
  3. Memory access is limited to the locations assigned to the virtual machine.
Otherwise, the virtual machine runs at full speed. So far, we have been looking at things from the kernel's point of view. Now, let's look from the application's point of view.

Application View of Virtualization

An example of a normal computation is as follows:
        a = b*b - c*c;          // full speed

An example of a system call is as follows:
        write(1, "hello, world\n", 13);

When a program issues a system call, the following registers are loaded with the values shown below:
%eax    system call #
%ebx    argument 1
%ecx    argument 2
%edx    argument 3
%esi    argument 4
%edi    argument 5
%ebp    argument 6

Hardware Traps

In general, a hardware trap occurs when a privileged instruction is executed, which then passes control to the emulator. When the hardware trap is performed, the values above are pushed onto the stack. Using these hardware traps gives us protected transfer of control.
To implement the write function via a deliberate "crash", we need to call a privileged instruction! More specifically, we use the interrupt instruction, INT. For the implementation of the write function, we need to generate the necessary machine code. NOTE: The following implementation is "hand-waved".
ssize_t
write (int fd, char const *buf, size_t bufsize){
        ...
        asm("int $0x80");       // 0x80 = 128: trap into the kernel
}
INT 128 (0x80) causes a hardware trap and pushes the following values onto R's stack:
ss:             stack segment
esp:            stack pointer
eflags
cs:             code segment
eip:            instruction pointer
in_trap_diagram
Figure 2
Shows how the Interrupt Service Routine is called by INT 128.

RETI vs. RET

The code in the real machine will invoke RETI; that is, the interrupt service routine (ISR) returns with RETI rather than RET. RETI is a heavier-weight instruction, so RET runs much faster than RETI.

Important Notes:

Virtualization is Only One-Way Protection

Virtualization, with typical virtualizable hardware, is much faster than the client/server OS model. Modern operating systems are designed so that each process is independent of the others and has its own "virtual machine". The process sees its own view of the stack, memory, variables, etc. The OS then controls how these little virtual machines access the real machine's resources, such as the ALU, registers, primary memory (RAM), and I/O registers. The virtual machine has direct access to the ALU and registers, selective direct access to primary memory (RAM), and only indirect access to the I/O registers.

General OS Architecture
virtualization1

Figure 3
The User Level contains N isolated processes that communicate with the Kernel Level through instructions such as INT and RETI. At the Kernel Level is the Real Machine, which contains components such as ALUs, registers, RAM, and I/O registers. Each process has direct access to the ALUs and the registers, selective direct access to the Primary Memory (RAM), and indirect access to the I/O registers. Selective direct access means that the kernel decides whether the process can have direct access to a given part of the Primary Memory. Indirect access means that the data will never be directly accessed by any process at the User Level.

Traditional Linux Layering:
virtualization2

Figure 4a
A representation of the organization of instructions in relation to the application level, kernel level, and hardware level. Note that in the traditional Linux layering, unprivileged instructions such as LD, ADD, and ST are available to the application directly, while privileged ones like HALT, INB, and OUTB can only be reached by applications through the kernel using a system call (INT). Instructions that are available to the application are also available at the kernel level. The thickened line between the application layer and the kernel and hardware layers specifies where the ABI (Application Binary Interface) is defined.

Modern Linux Layering:
virtualization3

Figure 4b
A representation of the modern Linux layering, in which some instructions (such as INB and OUTB) that were only available at the kernel level can now be called at the application level as well. The thickened line between the application layer and the kernel and hardware layers specifies where the ABI (Application Binary Interface) is defined. In contrast to the ABI of the traditional Linux layering, the ABI of the modern Linux layering is more complicated due to the increased number of instructions available to the application layer.

We can have multiple layers of abstraction too! x86 has four layers (privilege rings).

virtualization4

Figure 5
A representation of x86 machines' multi-layering.

Why Would a Process Not Run?

process_table
Figure 6
A visualization of how the process table is represented in the kernel.

On a single-core machine, at most one process can run at a time, so we need to make sure that the functions involved in switching between process contexts are as efficient as possible. For example, the getpid() function only needs to store its result in the %eax register and does not need a full context switch (saving the state of the other registers, etc.), since it returns to the same process that called it.
The caveat is that functions that require memory protection are trickier; for now we will skip this and address it later in the quarter.

Ways for Applications to Create and Destroy Processes

To Destroy:
  1. A process can issue the privileged instruction 'halt'.
  2. A process can call exit(23).
This terminates the process and stores the exit status in the process descriptor so that other processes can view it.

To Create:

p = fork();
This clones the process; fork() returns different values in the parent and in the child. You can have one process inquire about another process's exit status as well, using waitpid:
pid_t waitpid (pid_t p, int *status, int options);

int i;
pid_t q = waitpid (p, &i, 0);
This code waits for the process with pid p to exit and stores its exit status in i.