CS 111
Scribe Notes for 4/13/10
by William Lu and Adam LeWinter

Orthogonality

Why orthogonality?

Orthogonality is important because we want an interface that is simple, complete (able to specify any point in 3D space), and combinable (any combination of interfaces can be mixed and the system still works). You can picture orthogonality as three perpendicular axes: the x-axis represents the file API, the y-axis represents the process API (fork, wait, etc.), and the z-axis represents memory. Any combination should make sense.

[Figure: three orthogonal axes for files, processes, and memory]

Choice 1: How Mechanisms Can Access (Model) OS Resources

A) You can think of an OS resource as an object. The application deals with references to those objects rather than the objects themselves.

In C:
    struct pte {...} //process description

The upside is that the structure is simple and fast. The downside is that there is no protection against bad user programs, which can trash the data structures. Race conditions also emerge because two programs can modify the same structure inconsistently.

B) The OS resource can be referenced by an integer.
    pid_t (some integer type)

This method uses opaque identifiers and is safer than A because the OS must interpret the identifiers. It is also more flexible because you don't need to change applications every time you make a change to the kernel. However, this method is slower and much more complex.

Example:
    int fd = open("/dev/null", O_RDONLY, 0);

The system call open has the following declaration:
    int open(const char *pathname, int oflag, mode_t mode);

open returns a file descriptor: an integer that is an index into the table of open files for the current process. The table entry, not the integer itself, holds the information about the file you just opened. The file descriptor can be modeled as follows:

[Figure: process table -> process descriptor -> file descriptor table -> file descriptor]

So, using the above open call and figure, our entry in the table has file descriptor number 17 and read-only permission.
That means that when we make any system call on the file, such as read(17, buf, bufsize);, the OS follows process table -> process descriptor -> file descriptor table -> file descriptor to find out what to do, and then does it. In this case read returns 0, meaning end of file, because "/dev/null" is a device that always reports end of file on a read. There are a few flags that can be used in the open call to specify the access permissions on the open file.
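As a minimal sketch of the open and read calls above, assuming a POSIX system: the helper read_from_dev_null is a name invented for this illustration. It opens /dev/null read-only and reports what read returns. The descriptor number the OS hands back will generally not be 17; it is whichever slot is free in the calling process's table.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: opens /dev/null read-only and tries to read.
 * Returns the byte count reported by read (0 means end of file),
 * or -1 if open or read fails. */
long read_from_dev_null(void)
{
    char buf[64];

    /* open returns a small non-negative integer: an index into this
     * process's table of open files. The OS picks the value. */
    int fd = open("/dev/null", O_RDONLY, 0);
    if (fd < 0)
        return -1;

    /* The kernel looks fd up in the table, finds the /dev/null device,
     * and that device reports end of file on every read. */
    ssize_t n = read(fd, buf, sizeof buf);

    close(fd);
    return (long)n;
}
```

The application never touches the kernel's table entry. It only passes the integer back to the OS, which is what makes the identifier opaque.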
All the flags can be combined with bitwise OR to form a unique integer pattern that tells the OS what to do with the file.

Side Note
    int open(char const *name, int flags, ...);

The ... means that the first two arguments must have the declared types, but after them the caller can pass whatever it wants. The caller still has to pass the types, in the order, that the callee expects, or the program may crash.

umask

When the caller asks for permissions in a system call, it does so with an octal number. The umask is stored in the process descriptor, so each process has its own. The OS bitwise ANDs the ones' complement of the umask with the octal number sent by the caller to set the file permissions. The umask can ONLY take away permissions, not add them. umask is an important protection mechanism because a file needs to be created with the correct permissions from day 1: if you change the permissions of the file later and the file is already open elsewhere, the permissions for that open file do not change. A process gets its default umask from the process that created it; the kernel sets the original value for process 1. A process changes its own umask with the umask system call, which updates the value in its process descriptor.

Example
    umask = 022; caller asks for 777;

The file will have permissions 755 (777 AND NOT 022), which amounts to rwxr-xr-x. This means the owner is the only one who can write to the file, and that group members and other users can only read and execute it.

Processes

Processes are also controlled by syscalls. Just as files have open to create them and close to destroy them, processes have fork to create them and exit to destroy them.
Process syscalls:
    pid_t getpid(void); //return your own pid
    pid_t getppid(void); //return your parent's pid
    pid_t waitpid(pid_t p, int *status, int option); //wait for your child process to finish

waitpid returns the child's process id, or -1 on failure. It takes p (the process id of the child), a pointer to where the exit status will be stored, and option (ex. WNOHANG).

Note: a process is only allowed to wait on its own children, to avoid deadlock.

    int execvp(char const *file, char *const *argv);

The argv argument is a NULL-terminated array of argument strings (ex. (char *[]) {"date", "-u", NULL}). If execvp returns at all, it returns -1, because returning means it has failed (for example, the file is not available or the argument list is too large). If it succeeds, it blows everything away (your global variables, local variables, registers, everything) and starts running the program named by file, with the arguments passed in argv, in the same process.

Now let's try to write an example function that sorts its input and writes the result to its output. Here's a start:
int sortio(void) {
    /* Replaces the calling program with sort; returns only on failure. */
    execvp("/bin/sort", (char *[]) {"sort", NULL});
    return -1;
}
This is not going to work, because execvp blows everything away and we still want to keep our current process. We conclude that the call to execvp belongs in a child process.
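One way to act on that conclusion is sketched below, assuming a POSIX system. The helper run_in_child is a name invented for this illustration, and the exit code 127 for a failed execvp is a convention borrowed from the shell, not something the lecture specified.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `file` with arguments `argv` in a child
 * process and waits for it. Returns the child's exit status, or -1 if
 * fork or waitpid fails or the child was killed by a signal. */
int run_in_child(const char *file, char *const argv[])
{
    pid_t p = fork();
    if (p < 0)
        return -1;              /* fork failed; no child was created */

    if (p == 0) {
        /* Child: replace this copy of the process with the new program. */
        execvp(file, argv);
        _exit(127);             /* reached only if execvp failed (assumed code) */
    }

    /* Parent: still intact; wait for that specific child. */
    int status;
    if (waitpid(p, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* sortio again, this time leaving the caller alive. */
int sortio(void)
{
    return run_in_child("/bin/sort", (char *[]) {"sort", NULL});
}
```

The child inherits the parent's standard input and output, so sort reads and writes the same files the caller would have. The child uses _exit rather than exit after a failed execvp so that it does not flush a copy of the parent's buffered output.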
Now that we have the sortio function, let's take a look at sort itself. When execvp starts sort, control does not go straight to the main of sort.c. It first lands in some assembly code that runs just before main; this area is called crt0. crt0 pulls in the arguments from the execvp call (this is why you never want to pass too many arguments: the kernel has to copy all of them) and then calls main.
int main(int argc, char**argv){
.
.
.
return 0; //this is actually the argument to the exit call
}
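The comment on the return statement can be illustrated with a rough sketch, assuming a POSIX system. The real crt0 is assembly supplied by the toolchain; fake_crt0 and user_main below are stand-ins invented for this illustration, and the value 7 is arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a program's main. */
static int user_main(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return 7;                   /* becomes the argument to exit */
}

/* Hypothetical stand-in for crt0: call main, pass its result to exit. */
static void fake_crt0(int argc, char **argv)
{
    exit(user_main(argc, argv));
}

/* Runs fake_crt0 in a child process and returns the exit status that
 * the parent observes, or -1 on failure. */
int status_seen_by_parent(void)
{
    char *argv[] = {"demo", NULL};

    fflush(NULL);               /* avoid duplicating buffered output in the child */
    pid_t p = fork();
    if (p < 0)
        return -1;
    if (p == 0)
        fake_crt0(1, argv);     /* never returns */

    int status;
    if (waitpid(p, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling status_seen_by_parent() should yield 7: the value returned by the stand-in main reaches the parent as the child's exit status.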
There are different ways to exit a process:
    _exit(n); //notice the underscore: this is a quick exit, don't clean up, just exit
    exit(n); //this is a clean exit: clean up, then exit (ex. flush output buffers)

Now on to forking. A fork clones the current process (from now on referred to as the parent process). The child has all the properties of the parent except for:
- pid
- ppid
- file descriptors (the descriptors are copied; the underlying file descriptions are shared)
- accumulated execution times
- file locks
- pending signals (ex. ctrl+c)

exec destroys the contents of the current process but keeps the items listed above.

fork and exec are essentially opposites of one another, yet exec is almost always called right after fork. This has spawned a school of thought that combines the two (fork+exec) into a single function, spawnvp. Windows adopted this school of thought, and this is one of the big reasons why porting code from Linux to Windows can drastically reduce performance.
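The pid and ppid differences in the fork list above can be checked with a short sketch, assuming a POSIX system. The function child_sees_new_ids is a name invented for this illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks once. The child checks that its pid
 * differs from the parent's and that its ppid is the parent's pid, then
 * reports the result through its exit status.
 * Returns 1 if both hold, 0 if not, -1 on failure. */
int child_sees_new_ids(void)
{
    pid_t parent = getpid();    /* this variable is cloned into the child */

    fflush(NULL);               /* keep buffered output out of the clone */
    pid_t p = fork();
    if (p < 0)
        return -1;

    if (p == 0) {
        int ok = (getpid() != parent) && (getppid() == parent);
        _exit(ok ? 0 : 1);      /* quick exit: no stdio cleanup in the child */
    }

    int status;
    if (waitpid(p, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

The local variable parent holds the same value in both processes because memory is cloned, while getpid and getppid give different answers in the child. The child reports back through its exit status because the parent can collect it with waitpid.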