Scribe: Christine Kuo
Sections: Orthogonality | Processes | Files
(directory-files-and-attributes "/usr/src/kernel/fs")
result = (("ufs" 209361271 "eggert") ("README" 209361284 "eggert"))
This is slow now (in Emacs 24.2).
Here's why: The way Emacs implements this primitive is to call the following functions at the C level.
    DIR *d = opendir("/usr/src/kernel/fs");   // ptr into directory
    struct dirent *dp;
    while ((dp = readdir(d)) != NULL) {       // while data structure is not null
        char buf[1024];
        strcpy(buf, "/usr/src/kernel/fs/");   // ending slash is important
        strcat(buf, dp->d_name);              // dp->d_name is e.g. "ufs"
        struct stat st;
        lstat(buf, &st);
        // convert st to Lisp form
    }
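What's wrong? Let D = the length of the string "/usr/src/kernel/fs" and N = the # of files in the directory.
Every iteration calls strcpy and strcat to rebuild the same D-character prefix, so the string handling alone costs O(D*N).
We can partially rectify this by hoisting the string functions above the loop.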
    DIR *d = opendir("/usr/src/kernel/fs");
    struct dirent *dp;
    char buf[1024];
    strcpy(buf, "/usr/src/kernel/fs/");
    size_t dirlen = strlen(buf);
    while ((dp = readdir(d)) != NULL) {
        strcpy(buf + dirlen, dp->d_name);     // overwrite only the part after the prefix
        struct stat st;
        lstat(buf, &st);
        // convert st to Lisp form
    }
The cost is still O(D*N) = length of directory name * # of files in directory, not O(N).
B/c each call to lstat passes in the full file name & the kernel interprets each directory in it; work proportional to D per call.
Instead of lstat, use fstatat, which interprets paths relative to the directory referred to by an extra file descriptor argument.
    DIR *d = opendir("/usr/src/kernel/fs");
    int fd = dirfd(d);                        // fd refers to the directory
    struct dirent *dp;
    while ((dp = readdir(d)) != NULL) {
        struct stat st;
        // name is resolved relative to fd, so no path buffer is needed
        fstatat(fd, dp->d_name, &st, AT_SYMLINK_NOFOLLOW);
        // convert st to Lisp form
    }
Orthogonality = ability to decompose problem into independent axes that do not interfere with each other.
Orthogonality problem in classic Unix design:
The primitives used to access the file content should be orthogonal to the primitives used to specify the file name. In Emacs, the orthogonality was breaking down b/c file names were read out of a directory; this can make code slow or hard to read.
Goal is an orthogonal operating system!
Tools:
Virtualizable processor
Classic Unix model
Process = program in execution, running in an isolated domain = a virtual computer with a subset of abilities of the host computer it is running on.
How do we specify how processes work?
API for dealing with processes (for application programs) + associated intuition (as "data structures")
    pid_t fork();          // pid_t is a handle for a process

    sys/types.h:
    typedef long pid_t;    /* This must be a signed integer type */
But if we just needed a process ID, we could just do:
    pid_t e = 97;
But this is different from:
    pid_t c = fork();   // clones current process
                        // yields the pid of the clone if parent process,
                        // 0 if you're the child, -1 if not enough resources
    if (c == 0) {
        // do what the child should do
    } else {
        // in the parent (c < 0 means the fork failed and there is no child)
    }
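To make the three possible return values of fork concrete, here is a minimal sketch. The function name fork_and_collect and the use of the exit code as a return value are assumptions made for the demo, not part of the lecture.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the child exits with child_code; the parent
   waits and returns the code it observed, or -1 on any failure. */
int fork_and_collect(int child_code)
{
    pid_t c = fork();
    if (c < 0)
        return -1;              /* not enough resources: no child exists */
    if (c == 0)
        _exit(child_code);      /* child: fork returned 0 */
    /* parent: c is the pid of the clone */
    int status;
    if (waitpid(c, &status, 0) != c)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit rather than exit so that it does not flush stdio buffers it inherited from the parent.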
    bool printdate(void) {
        pid_t p = fork();
        if (p < 0)
            return false;
        if (p == 0) {
            // in child, can run date command
            execvp("/usr/bin/date", (char *[]){"date", "-u", 0});   // only returns if it fails
            exit(126);   // nonzero exit status to indicate execution did NOT work
                         // exit so child does not run parent code if execvp fails
        }
        // back in parent
        // if we just return 1 here, parent & child run in parallel, printdate & date outputs interleaved
        int status;
        while (waitpid(p, &status, 0) < 0)
            continue;    // continues if interrupted
        return WIFEXITED(status) && WEXITSTATUS(status) == 0;   // true if normal exit
    }
One way is to create process IDs on our own:
    waitpid(97, &st, 0);
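A made-up process ID does not behave like one returned by fork: on POSIX systems waitpid only accepts the caller's own children and fails with ECHILD otherwise. The sketch below shows this; try_wait is a hypothetical helper name, and WNOHANG is used only so that the demo never blocks.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: try to wait for pid without blocking.
   Returns 0 if the kernel accepted the request, else the errno value. */
int try_wait(pid_t pid)
{
    int st;
    if (waitpid(pid, &st, WNOHANG) < 0)
        return errno;
    return 0;
}
```

For example, try_wait(getpid()) yields ECHILD, since a process is never its own child.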
Zombie processes
    init.c:
    // reap the zombies
    while (waitpid(-1, ~~~))
        continue;   // beware: -1 means wait for any child to exit
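A runnable sketch of the same reaping idea follows. Unlike init, which loops forever, this version stops once no children remain so that it can be called from an ordinary program; spawn_and_reap is a name invented for the demo.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: start n children that exit at once, then reap
   every one of them.  Returns how many children were reaped. */
int spawn_and_reap(int n)
{
    for (int i = 0; i < n; i++) {
        pid_t c = fork();
        if (c == 0)
            _exit(0);           /* child: exits, stays a zombie until reaped */
        /* c < 0 means fork failed: simply one child fewer to reap */
    }
    int reaped = 0;
    for (;;) {
        int status;
        pid_t w = waitpid(-1, &status, 0);   /* -1: any child */
        if (w > 0)
            reaped++;
        else if (errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        else
            break;              /* ECHILD: no children left */
    }
    return reaped;
}
```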
Signal alarm interrupt
    while (waitpid(p, &status, 0) < 0)
        continue;
    // if date loops forever, waitpid would hang and never return
    // want to insulate caller even if this happens in callee

    signal(SIGALRM, donothing);   // void donothing(int sig) {}
    alarm(5);   // please wake me up in 5 seconds & deliver a signal; the handler is called asynchronously
    // BUGGY! If the alarm goes off before waitpid is called, would hang forever...
    if (waitpid(p, &status, 0) < 0 && errno == EINTR) {
        // date is looping, alarm went off
        // must get rid of subsidiary process
        kill(p, SIGINT);          // sends Ctrl+C signal to child
        waitpid(p, &status, 0);   // politely reap the date zombie
        // if do not trust date (ill behaved), use SIGKILL, but avoid b/c extreme & prevents clean up
    }
    // usually bit vector keeps track of signals and processes
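The timeout idea can be written as a self-contained helper. This is a sketch under assumptions: wait_with_timeout and its return codes are invented here, and sigaction without SA_RESTART is swapped in for signal, because on some systems (glibc, for one) signal installs handlers that restart the interrupted waitpid instead of letting it fail with EINTR. The race flagged as BUGGY in the notes is still present.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void donothing(int sig) { (void)sig; }

/* Hypothetical helper: wait up to secs seconds for child p.
   Returns 0 if p finished on its own, 1 if it had to be interrupted,
   -1 on error.  *status receives the child's wait status. */
int wait_with_timeout(pid_t p, unsigned secs, int *status)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = donothing;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* no SA_RESTART: waitpid fails with EINTR */
    if (sigaction(SIGALRM, &sa, NULL) < 0)
        return -1;

    alarm(secs);
    /* Same race as in the notes: if the alarm goes off before
       waitpid starts, nothing interrupts the wait. */
    pid_t w = waitpid(p, status, 0);
    int saved = errno;
    alarm(0);                   /* cancel any pending alarm */
    if (w == p)
        return 0;
    if (saved != EINTR)
        return -1;

    kill(p, SIGINT);            /* ask the child to stop */
    while (waitpid(p, status, 0) < 0) {   /* reap it so no zombie remains */
        if (errno != EINTR)
            return -1;
    }
    return 1;
}
```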
Inside kernel, there is a process table indexed by process id that keeps process info:
exit status, zombie flag, parent pid, start loc of RAM, size of RAM, ip, registers, uid, gid
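One way to picture such a table is as an array of structs indexed by pid. The layout below is purely illustrative: the field names, the types, the table size NPROC, the in_use flag, and the register count are all assumptions, not taken from any real kernel.

```c
#include <stddef.h>
#include <stdint.h>

#define NPROC 10                /* assumed table size */

/* Hypothetical process table row; names and types are illustrative. */
struct proc {
    int       in_use;           /* is this row allocated? (assumed flag) */
    int       exit_status;
    int       zombie;           /* exited but not yet waited for */
    int       parent_pid;
    uintptr_t ram_start;        /* start location of RAM */
    size_t    ram_size;         /* size of RAM */
    uintptr_t ip;               /* saved instruction pointer */
    uintptr_t regs[16];         /* saved registers; 16 is arbitrary */
    int       uid, gid;
};

static struct proc proctab[NPROC + 1];   /* indexed by pid; row 0 unused */

/* Return the row for pid, or NULL if pid is out of range or unused. */
struct proc *proc_lookup(int pid)
{
    if (pid < 1 || pid > NPROC || !proctab[pid].in_use)
        return NULL;
    return &proctab[pid];
}
```

Indexing directly by pid keeps lookup O(1), at the price of a fixed upper bound on the number of processes.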
In kernel, to resume pid 97:
reti //exit kernel mode, enter user mode, set ip to 97's ip near top of stack
10 rows in process table all filled with info
If 3 is running, a lot of entries are junk (e.g. ip is not up to date)
If 10 is not running, it is a virtual process (info saved in process table)
How to represent inside a kernel?
Each process table entry includes a set # of file descriptors, each pointing to info about an open file
Want orthogonality. Does fork(); clone file descriptors?
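On POSIX systems the answer is yes: the child gets its own copies of the parent's file descriptors, referring to the same open files. A small experiment can confirm this. The function name child_inherits_fd is made up, and a pipe stands in for an arbitrary open file.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical experiment: returns 1 if a byte written by the child
   through its copy of the pipe descriptor reaches the parent,
   0 if it does not, -1 on setup failure. */
int child_inherits_fd(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;
    pid_t c = fork();
    if (c < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (c == 0) {
        /* child: fds[1] was opened by the parent before the fork */
        ssize_t n = write(fds[1], "x", 1);
        _exit(n == 1 ? 0 : 1);
    }
    close(fds[1]);              /* parent keeps only the read end */
    char ch = 0;
    ssize_t n = read(fds[0], &ch, 1);
    close(fds[0]);
    int status;
    waitpid(c, &status, 0);     /* reap the child */
    return n == 1 && ch == 'x';
}
```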