Written by Theresa Tong
We can use a function call, to begin with. However, nothing forces the caller to honor the contract it was supposed to follow: a misbehaving caller can corrupt the callee's data as well as its own. A particularly bad call can crash the whole system.
Alternatively, instead of a call, we can make the function part of the instruction set at the hardware level. This solves the problem of unexpected function calls, since the function becomes a low-level instruction on the same level as addl or mov. While this could work on a system you design yourself, it would be impossible to use the instruction elsewhere unless you managed to convince the large companies that make hardware, like Intel, to add it. In short, not likely.
So if we want the ability to add this functionality to a wide range of machines, we likely have to choose another method. One such method is an interpreter.
An interpreter is, simply put, a computer program that executes instructions. In short, we are emulating a machine with software to execute hardware instructions. Using one has its advantages and disadvantages.
The fourth way is to virtualize the process completely. In short, to build a virtual machine. Software such as VMware does exactly this, allowing users to run different operating systems and instruction sets on a single physical machine.
With a virtual machine, even if the applications do something they aren't supposed to, they can't corrupt the physical machine emulating the virtual computer. In this way, we have a layer of protection. This is because the applications running in the virtual machine do not have access to privileged instructions.
There are two types of instructions: privileged and non-privileged. Non-privileged instructions can be executed by any application and don't allow access to the underlying structure of a system, so even if something goes wrong, the application can't damage the data of other applications or the OS. A privileged instruction, on the other hand, can only be executed in kernel mode, because it does have access to the data, scheduling, and any other functions the OS has.
Some examples of unprivileged instructions:
    mov
    add
    ret
    mult
And some privileged instructions:
    outb
    inb
    int
    insl
We can actually use this setup to let programs ask the kernel to execute privileged instructions on their behalf, as part of a function. When an application tries to execute an invalid instruction, it causes a trap. This is, in fact, how system calls work. A trap is an interrupt, which gives control to the kernel; the kernel then goes to the interrupt table to see what should be done. For example:
    pushl file
    int 0x80
    add $3, sp
The int 0x80 instruction generates an interrupt, and the kernel would go to the table and look up the information at 0x80. The table contains pointers to code to be executed depending on what kind of interrupt was sent.
In addition, after a trap is triggered, the current state of all the registers has to be saved. This is because the trap may be due to switching processes, in which case we need to save the registers of the current process for use later, when control returns to it.
    Register   Use
    %eax       system call #
    %ebx       argument 1
    %ecx       argument 2
    %edx       argument 3
    %esi       argument 4
    %edi       argument 5
    %ebp       argument 6
In short, the application gives control over to the kernel, in this way protecting the rest of the system from any bugs in the application. However, if the kernel were to have a bug, then there's trouble, as it is running at the highest level already. Undefined kernel behavior could crash the machine.
We can think of the system as layered, where each layer can only access things in the same layer. In order for them to communicate with the other layers, they use system calls or instructions.
The kernel itself is broken down into memory management and I/O devices, both of which have their own system calls and instructions to communicate with the hardware and applications. This layer structure is for the x64. The x86 is slightly different. It uses a ring-structure (microkernel) format.
Individual processes see only their own stacks, variables, etc., and each of them continues as if it were the only process running and had all of the resources to itself. This obviously isn't true, but with virtual memory each process can behave as if it were. In reality, the OS is in charge of switching between processes, storing registers and other state in cache or memory at each switch.
The processes talk to the kernel through calls with INT, and the kernel returns control with RETI. The kernel controls resources such as ALUs, registers, RAM, cache, I/O devices, and so forth. When an application runs, it has its own %eax register, but it shouldn't be able to see other applications' %eax registers.
So then, what happens when we switch processes? Something similar to what happens when a system call is used. The state of the current registers is saved and put into a process descriptor table.
       | %eip | %eax | registers...
    ---+------+------+-------------
     1 |      |      |
     2 |      |      |
     3 |      |      |
     4 |      |      |
     5 |      |      |
So after a process returns and it's time to switch back to the previous process, the kernel goes to the table and looks up what the registers were when that process was last running. Saving the state of all registers obviously guarantees correctness, but if we were in the middle of a time-consuming computation, such as one using floating point, saving all of the registers has a large overhead and eats up CPU time. As such, usually only the registers needed for that system call are saved.
So how do we create new processes or destroy old processes?
Let's start with destroying processes, because that is the simplest.
1. Use the privileged instruction "halt"
2. exit(23)
3. _exit() <-- this isn't recommended for use
This will kill the process and store the process descriptor in the table so that the exit status can be viewed by other processes.
To create a process, the parent process calls fork() to spawn a child process. The child process can then execute whatever is needed, store its result in a register, and return. In the parent process, fork() returns the ID of the child process. Inside the child process itself, fork() returns 0. If there was an error or there were not enough resources, fork() returns -1.
A process can also check on the exit status of other processes.
pid_t waitpid(pid_t pid, int *status, int options)
<- pid is the ID of the process to wait for; waitpid stores that process's exit status into status.
Now, the problem is when a process finishes but nothing ever calls wait on it. A process in such a state is called a zombie process; it has exited but still exists. The process continues to exist, in case someone needs to view its exit status, until waitpid is called on it. If waitpid is never called, it cannot be reclaimed and just consumes space. If the parent that spawned a process exits without ever waiting on its child, the child is re-parented to process 1. Process 1 is the init process, which is started automatically at boot. A loop in init calls waitpid(-1, &i, 0), where -1 is a special value that means wait for ANY child process. This is known as reaping the zombies: as the loop runs, every child that has exited is cleaned up properly.