CS111 Lecture 4 Scribe Notes

CS111 Spring 2014

By: Karan Kajla and Marko Vojvodic

How to Organize an OS

We can organize the OS as an Object Oriented (OO) system by choosing an OO language (Java, C++, Objective-C, etc.) to implement it.
- However, this is not a good way to create an OS. Why?
  - Such OO languages are too heavyweight and slow for an efficient OS.
Solution: The Linux kernel is written in C and applications are written in whatever language the developer chooses.
- This is because safety and security are the primary concerns when creating an OS.
- Additionally, using an OO design does not provide the OS with hard modularity, which leaves doors open for security vulnerabilities and bugs.
Two Approaches to Hard Modularity:
- The Client Server Approach
- Virtualization
  - Virtualization allows the OS to protect the hardware by running applications only inside virtual machines (sandboxes).
  - Sandbox analogy: Applications are like children, and the OS is like the parents. The children are only allowed to play in the sandbox, which is surrounded by a wall, preventing them from leaving into the harmful world.
Implementing Virtualization:
- The simplest approach to implementing virtualization in an OS is to use an interpreter.
  - The interpreter should implement all of the instructions (in most cases x86) that are needed to run any application. This means that the interpreter should implement just the basics.
- An example of a virtual instruction pointer implementation:
  
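  int main(void)
  {
      int ip = 0;                /* virtual instruction pointer */

      for (;;) {
          switch (mem[ip++]) {   /* fetch the next instruction, then advance */
          case ... :             /* one case per implemented instruction */
              .
              .
              .
          }
      }
  }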
- There are several advantages to using an interpreter:
  - Applications can be designed for different hardware and will still run properly on the OS.
  - Applications can't escape their sandboxes so the OS stays secure.
  - There can be several virtual machines running on a single physical machine (this complicates things in the future), and the interpreter is always in control of the applications.
- However, there is a trade-off: the main disadvantage of using an interpreter is speed.
  - Implementing an OS using an interpreter is slower (by up to 10x if implemented naively, and by roughly 1.2x if done properly and efficiently).
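The interpreter loop above can be fleshed out into a small runnable sketch. The four-instruction machine below (`OP_HALT`, `OP_PUSH`, `OP_ADD`, `OP_JMP`), the function name `vm_run`, and the step budget are all invented for illustration; they are not from the lecture and are far simpler than x86. The point is only to show why the guest cannot escape: every fetch and stack access is checked by the interpreter, and the interpreter decides when the guest stops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical instruction set, invented for this sketch. */
enum { OP_HALT = 0, OP_PUSH = 1, OP_ADD = 2, OP_JMP = 3 };
enum { MEM_SIZE = 256, STACK_SIZE = 16 };
enum { VM_OK = 0, VM_FAULT = -1 };

/*
 * Interpret the guest program stored in mem[0..MEM_SIZE).
 * The guest is stopped with VM_FAULT if it runs off the end of memory,
 * misuses the stack, uses an unknown opcode, or exceeds max_steps.
 * On OP_HALT the top of the stack is stored in *result.
 */
int vm_run(const uint8_t *mem, long max_steps, uint32_t *result)
{
    uint32_t stack[STACK_SIZE];
    size_t sp = 0;   /* number of values currently on the stack */
    size_t ip = 0;   /* virtual instruction pointer */

    assert(mem != NULL && result != NULL);

    for (long step = 0; step < max_steps; step++) {
        if (ip >= MEM_SIZE)
            return VM_FAULT;
        switch (mem[ip++]) {
        case OP_HALT:
            if (sp == 0)
                return VM_FAULT;
            *result = stack[sp - 1];
            return VM_OK;
        case OP_PUSH:            /* operand: one byte to push */
            if (ip >= MEM_SIZE || sp >= STACK_SIZE)
                return VM_FAULT;
            stack[sp++] = mem[ip++];
            break;
        case OP_ADD:             /* replace the top two values with their sum */
            if (sp < 2)
                return VM_FAULT;
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case OP_JMP:             /* operand: one-byte target address */
            if (ip >= MEM_SIZE)
                return VM_FAULT;
            ip = mem[ip];
            break;
        default:                 /* unknown opcode */
            return VM_FAULT;
        }
    }
    return VM_FAULT;             /* step budget exhausted */
}
```

A guest that loops forever (for example `OP_JMP 0`) is cut off by the step budget rather than hanging the host, which illustrates the claim that the interpreter is always in control. The per-instruction checks are also where the slowdown comes from.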
- One of the ways we can implement virtualization through hardware is by utilizing a Virtualizable Processor.
  - A virtualizable processor is a processor designed to allow virtualization.
  - The processor supports all of the non-privileged (ordinary) instructions (which are run at full speed).
  - The processor also supports the privileged instruction set:
    - These privileged instructions will not be able to run unless they are being run in privileged mode (kernel mode) and will result in a crash if not executed from privileged mode.
    - The way the execution modes are designated in the processor is through an extra register (default value: 1) which denotes whether the instructions are being executed in privileged mode or not.
  - Having this distinction between the processor's modes creates hard modularity in the OS.
  - How can we implement these processor modes in the software?
    - Can we partition the memory locations into privileged and non-privileged?
      - No, because non-privileged applications can simply jump to the privileged locations in order to perform privileged instructions.
    - Solution: We can have specific instructions that set and clear the processor register which denotes privilege.
    - In this case, we need to have Protected Transfer of Control between application and kernel.
      - This means that no application can execute privileged code.
      - We want to be able to securely and efficiently transfer control to the OS (kernel) whenever we want to execute privileged code.
    - What happens when an unprivileged application executes a privileged (or otherwise invalid) instruction?
      - The processor will trap: the instruction the application is attempting to execute is not executed, and control passes to the kernel.
      - There are 256 types of traps, stored as entries in a trap vector. Each entry designates the kind of trap and holds a pointer to the machine code that executes in privileged mode after the trap.
    - The convention in Linux is to execute a trap (execute an invalid instruction deliberately) in order to get the kernel's attention.
    - INT(char)
      - The INT instruction takes a single-byte argument, which allows you to generate any of the 256 available traps.
      - In Linux, 128 (0x80) is used to denote a non-malicious interrupt used to call the kernel's attention.
    - Executing an Interrupt:
      - Before the interrupt is executed, register %eax must contain the number designated to the system call you want to invoke.
      - %ebx must contain the first argument to the system call (%ecx, %edx, %esi, %edi, and %ebp, in that order, must contain the other arguments if the function has any).
      - The system call will eventually return a value in the %eax register.
    - Interrupt: A Hardware Perspective
      - When a trap is invoked, the hardware pushes the stack segment (ss), the user stack pointer (esp), eflags (which indicate privilege level), the code segment (cs), the instruction pointer (eip), and an error code.
      - When the kernel's trap handler finishes, it executes the privileged RETI instruction (named IRET on x86), which returns from the interrupt by doing the inverse of what is mentioned above and then resumes the process that was executing prior to the interrupt (returning from kernel mode to user mode).
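The idea that a system call is selected by number can be tried from user space without writing assembly. The sketch below uses the C library's `syscall()` function on Linux, which places the system call number and arguments where the kernel expects them and then traps into the kernel. The helper name `getpid_by_number` is made up for this example. Note the assumption: the exact registers and trap instruction depend on the architecture and C library, so this may not literally execute `INT 0x80` even on 32-bit x86.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>   /* SYS_getpid: the number assigned to getpid */
#include <sys/types.h>
#include <unistd.h>        /* syscall(), getpid() */

/*
 * Ask the kernel for this process's ID by system call number instead of
 * through the usual getpid() wrapper. (Hypothetical helper name.)
 */
pid_t getpid_by_number(void)
{
    /* The number selects the kernel routine; the result comes back
       as the return value, as described for %eax above. */
    long ret = syscall(SYS_getpid);

    assert(ret > 0);       /* getpid cannot fail */
    return (pid_t)ret;
}
```

Calling `getpid_by_number()` should give the same value as the ordinary `getpid()`, since both end up in the same kernel routine.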
Representing the Kernel Structure
- There are several levels of the kernel and two primary kernel structure types.
OS Implementation with a Virtualizable Processor
- We can use a process: a program in execution in an isolated domain.
  - Can a process be a stand-alone (running without an operating system) program running in a virtual machine (sandbox)?
  - What could go wrong if we implemented processes this way?
    - A malicious program could issue only non-privileged instructions and never run a system call, so the system will never trap and never surrender control to the kernel (the trusted program).
    - This would allow the program to essentially take control of the CPU forever.
    - Solution: We can change the hardware so that a trap occurs every so often (e.g., every 10 ms), so that a process cannot keep control of the CPU indefinitely.
    - The program could also access secrets (memory) of another program, which is dangerous.
    - Solution: We can use virtual memory to map the memory of different processes to different physical memory locations. This will prevent processes from accessing each other's memory.
    - The program could even access I/O devices (disk, memory, etc.) and corrupt or overwrite information.
    - Solution: We can make inb, insl, etc. privileged instructions so that they can only be executed in kernel mode. This will introduce the cost of context switching.
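The timer solution above can be illustrated with a user-space analogy. This is not kernel code: it uses POSIX signals (`sigaction`, `setitimer`, `SIGALRM`), which the lecture does not mention, to stand in for the hardware timer trap. The function name `spin_until_timer` is made up. The loop makes no system calls, so without the timer it would never give up the CPU.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Set by the handler; volatile sig_atomic_t is safe to share with it. */
static volatile sig_atomic_t timer_fired = 0;

static void on_timer(int signo)
{
    (void)signo;
    timer_fired = 1;
}

/*
 * Arm a one-shot 10 ms timer, then spin without making system calls.
 * Returns 0 once the timer has interrupted the loop, -1 on setup error.
 */
int spin_until_timer(void)
{
    struct sigaction sa;
    struct itimerval tv;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_timer;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    memset(&tv, 0, sizeof tv);       /* it_interval = 0: fire once */
    tv.it_value.tv_usec = 10000;     /* 10 ms from now */
    timer_fired = 0;
    if (setitimer(ITIMER_REAL, &tv, NULL) != 0)
        return -1;

    while (!timer_fired)
        ;                            /* only the timer can end this loop */

    assert(timer_fired == 1);
    return 0;
}
```

The loop ends only because control is taken away from it asynchronously, which is the same reason a periodic trap lets the kernel regain the CPU from a process that never makes a system call.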
  - Process Organization:
    - Each process will have its own virtual registers (%eax, %esp, etc.), which are the same as the real (physical) registers while the process is actually running and must be saved when the process is not running.
    - Each process will also need to have its own allocation of memory which is associated with it.
    - These process attributes are kept track of in a process table which lives somewhere in memory.
    (Figure: Process Table)
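The bookkeeping described above (saved registers plus a memory allocation per process, all kept in a table) might look roughly like the following. Every name here (`struct proc`, `proc_table`, `switch_to`, the table size, and the choice of registers) is hypothetical, and the global `cpu` variable merely stands in for the physical registers so that the sketch can run as an ordinary program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NPROC = 8 };              /* hypothetical table size */

/* Only a few registers are shown; a real entry would save all of them. */
struct regs {
    uint32_t eax, ebx, ecx, edx, esp, eip;
};

struct proc {
    int in_use;                  /* is this table slot occupied? */
    struct regs saved;           /* meaningful only while not running */
    uintptr_t mem_base;          /* start of the process's memory */
    size_t mem_size;             /* size of that allocation */
};

static struct proc proc_table[NPROC];
static struct regs cpu;          /* stand-in for the physical registers */
static int current = -1;         /* index of the running process, or -1 */

/* Stop the running process (if any) and resume process `next`. */
void switch_to(int next)
{
    assert(next >= 0 && next < NPROC && proc_table[next].in_use);

    if (current >= 0)
        proc_table[current].saved = cpu;   /* save outgoing registers */
    cpu = proc_table[next].saved;          /* load incoming registers */
    current = next;
}
```

While a process is running, its register values live only in `cpu`; its table entry is stale until the next switch saves them again.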
- Context Switching
  - The kernel can perform a context switch by saving the register values of the process being stopped into its process table entry in memory, and then loading the saved values of the process to be run from its table entry into the physical registers.
  - Applications should also be able to create a new process table entry (fork). This must be done using system calls.
  - POSIX/UNIX System Calls to Manipulate the Process Table:
    - _Noreturn void _exit(int): This function is called when the current process no longer needs to run. The process is deleted from the process table.
    - pid_t fork(void): This function creates a new entry in the process table which looks like the process that called the function.
      - Fork will return twice: once in the parent process context and once in the child process context.
      - On success, the return value is the pid of the child process in the parent process and 0 in the child process. If a child process cannot be forked, no child is created and -1 is returned in the parent.
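A minimal sketch of how `fork` and `_exit` fit together is shown below. The helper name `fork_and_wait` is made up, and the sketch also uses `waitpid`, a POSIX call that these notes do not cover, so that the parent can observe the child's exit status.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits immediately with status `code`, then wait for it.
 * Returns the child's exit status as seen by the parent, or -1 on error.
 */
int fork_and_wait(int code)
{
    int status;
    pid_t pid;

    assert(code >= 0 && code <= 255);   /* exit statuses are 8 bits wide */

    pid = fork();
    if (pid < 0)
        return -1;          /* fork failed: no child was created */
    if (pid == 0)
        _exit(code);        /* child: fork returned 0; _exit never returns */

    /* Parent: fork returned the child's pid. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both branches after `fork()` run the same code from the same point; the return value is the only thing that tells the parent and the child apart.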