Scribe: Christine Kuo
Sections: Orthogonality | Processes | Files
(directory-files-and-attributes "/usr/src/kernel/fs")
result = (("ufs" 209361271 "eggert") ("README" 209361284 "eggert"))
This call is slow right now (as of Emacs 24.2).
Here's why: Emacs implements this primitive by calling the following functions at the C level.
DIR* d = opendir("/usr/src/kernel/fs"); //ptr into directory
struct dirent* dp;
while ((dp = readdir(d)) != NULL) { //while there is another directory entry
  char buf[1024];
  strcpy(buf, "/usr/src/kernel/fs/"); //ending slash is impt
  strcat(buf, dp->d_name); //dp->d_name is e.g. "ufs"
  struct stat st;
  lstat(buf, &st);
  //convert st to Lisp form
}
What's wrong? Let D = the length of the string "/usr/src/kernel/fs" and N = the # of files in the directory.
Every iteration calls strcpy and strcat on the directory name, so the string overhead alone is O(D) per file.
Can partially rectify this by hoisting the string functions above the loop.
DIR* d = opendir("/usr/src/kernel/fs");
struct dirent* dp;
char buf[1024];
strcpy(buf, "/usr/src/kernel/fs/");
size_t dirlen = strlen(buf);
while ((dp = readdir(d)) != NULL) {
  strcpy(buf+dirlen, dp->d_name); //only the entry name is copied each time
  struct stat st;
  lstat(buf, &st);
  //convert st to Lisp form
}
The cost is still O(D*N) = length of directory name * # of files in directory.
B/c each call to lstat passes in the full filename & the kernel interprets each directory in it; work per call is proportional to D.
Instead of lstat, use fstatat, which interprets paths relative to the directory referred to by an extra file descriptor argument. Each call then does work independent of D, so the total is O(N).
DIR* d = opendir("/usr/src/kernel/fs");
int fd = dirfd(d); //file descriptor for the open directory
struct dirent* dp;
//no path buffer needed any more: names are resolved relative to fd
while ((dp = readdir(d)) != NULL) {
  struct stat st;
  fstatat(fd, dp->d_name, &st, AT_SYMLINK_NOFOLLOW); //fd refers to the directory; do not follow symlinks (like lstat)
  //convert st to Lisp form
}
Orthogonality = ability to decompose problem into independent axes that do not interfere with each other.
Orthogonality problem in classic Unix design:
The primitives used to access the file content should be orthogonal to the primitives used to specify the file name. In Emacs, the orthogonality was breaking down b/c file names were read out of a directory; this can make code slow or hard to read.
Goal is an orthogonal operating system!
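A small sketch of what this orthogonality buys, under the assumption that content is accessed only through a file descriptor. The helper name count_bytes is hypothetical. It never sees a file name, so it works the same whether the descriptor came from open, from a directory-relative call, or from a pipe.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: count the bytes readable from fd until EOF.
   Returns the byte count, or -1 on a read error. */
long count_bytes(int fd) {
    assert(fd >= 0);
    char buf[4096];
    long total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0)
            total += n;
        else if (n == 0)
            return total;       /* end of file */
        else if (errno != EINTR)
            return -1;          /* real error; EINTR just retries */
    }
}
```

How the file was named is the caller's business; how its content is read is this function's business. Neither needs to know about the other.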
Tools:
Virtualizable processor
Classic Unix model
Process = program in execution, running in an isolated domain = a virtual computer with a subset of abilities of the host computer it is running on.
How do we specify how processes work?
API for dealing with processes (for application programs) + associated intuition (as "data structures")
pid_t fork(); //pid_t is a handle for a process
sys/types.h:
typedef long pid_t; /* This must be a signed integer type */
But if we just needed a process ID, we could just do:
pid_t e = 97;
But this is different from:
pid_t c = fork(); //clones current process
//yields the pid of the clone if parent process, 0 if you're the child, -1 if not enough resources
if(c == 0) {
//do what the child should do
} else {
//in the parent
}
bool printdate(void) {
  pid_t p = fork();
  if (p < 0) return false; //fork failed
  if (p == 0) {
    //in child, can run date command
    execvp("/usr/bin/date", (char*[]){"date", "-u", 0}); //only returns if it fails
    exit(126); //nonzero exit status to indicate execution did NOT work
    //exit so child does not run parent code if execvp fails
  }
  //back in parent
  //if we did "return true" here, parent & child would run in parallel, with printdate's & date's outputs interleaved
  int status;
  while (waitpid(p, &status, 0) < 0) continue; //retries if interrupted
  return WIFEXITED(status) && WEXITSTATUS(status) == 0; //true if normal exit with status 0
}
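The same fork / exec / wait pattern, generalized to any command, as a sketch. The name run_and_wait and its return convention are assumptions for illustration. It uses _exit rather than exit in the child so that stdio buffers inherited from the parent are not flushed a second time.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv and wait for it.
   Returns the child's exit status (0..255), or -1 if fork failed or the
   child did not exit normally (e.g. it was killed by a signal). */
int run_and_wait(char *const argv[]) {
    assert(argv != NULL && argv[0] != NULL);
    pid_t p = fork();
    if (p < 0)
        return -1;              /* not enough resources */
    if (p == 0) {
        execvp(argv[0], argv);  /* only returns if it fails */
        _exit(126);             /* same failure status as in the notes */
    }
    int status;
    while (waitpid(p, &status, 0) < 0)
        if (errno != EINTR)
            return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Usage: with char *cmd[] = {"date", "-u", NULL}, the call run_and_wait(cmd) == 0 plays the role of printdate().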
Could we instead make up process IDs on our own?
waitpid(97, &st, 0); //fails with errno ECHILD unless 97 is a child of the caller
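A sketch showing why a made-up pid is of little use: a process can only wait for its own children, so waitpid on any other pid fails with ECHILD. The helper name errno_of_waiting_on is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: try to wait for one specific pid.
   Returns 0 if a child was reaped, otherwise the errno from waitpid. */
int errno_of_waiting_on(pid_t pid) {
    assert(pid > 0);            /* one specific process, not a group */
    int st;
    if (waitpid(pid, &st, 0) >= 0)
        return 0;
    return errno;
}
```

Usage: errno_of_waiting_on(getpid()) yields ECHILD, since no process is its own child. Only a pid handed back by fork is guaranteed to name one of our children.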
Zombie processes
init.c:
//reap the zombies
while(waitpid(-1, ~~~)) continue; //beware: -1 means wait for any child to exit
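A sketch of a reaping loop in the spirit of the init.c fragment. This is not the actual init source; the name reap_all_children is hypothetical. Unlike init, it stops once no children remain instead of looping forever.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: wait for every child of the caller.
   Returns how many children were reaped, or -1 on an unexpected error. */
int reap_all_children(void) {
    int reaped = 0;
    for (;;) {
        int status;
        pid_t r = waitpid(-1, &status, 0);  /* -1: any child */
        if (r > 0) {
            reaped++;                       /* one zombie gone */
            continue;
        }
        assert(r < 0);  /* without WNOHANG, waitpid never returns 0 */
        if (errno == ECHILD)
            return reaped;                  /* no children left */
        if (errno != EINTR)
            return -1;
    }
}
```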
Signal alarm interrupt
while(waitpid(p, &status, 0)<0) continue;
//if date loops forever, waitpid would hang and never return
//want to insulate the caller even if this happens in the callee
signal(SIGALRM, donothing); // void donothing(int sig) {}
//caveat: where signal() restarts interrupted system calls (e.g. glibc), install the handler with sigaction and no SA_RESTART, else waitpid never sees EINTR
alarm(5); //please wake me up in 5 seconds & deliver a signal; the handler is called asynchronously
//BUGGY! If the alarm goes off before waitpid is called, waitpid would hang forever...
if(waitpid(p, &status, 0) < 0 && errno==EINTR) {
  //date is looping, alarm went off
  //must get rid of subsidiary process
  kill(p, SIGINT); //sends the same signal as Ctrl+C to child
  waitpid(p, &status, 0); //reap the date zombie
  //if we do not trust date (ill behaved), use SIGKILL, but avoid it b/c it is extreme & prevents clean up
  //the kernel usually keeps a bit vector per process to track pending signals
}
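One way to sidestep the alarm race, swapped in here for illustration and not the approach in the notes: poll with WNOHANG instead of relying on a signal to interrupt waitpid. The name wait_with_timeout, the 10 ms tick, and the use of SIGKILL are all assumptions of this sketch. The notes prefer SIGINT for well-behaved children; SIGKILL is used here only because it cannot be ignored.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: wait roughly timeout_ms for child p.
   Returns 0 if the child finished on its own, 1 if it had to be
   killed, -1 on error. *status receives the waitpid status. */
int wait_with_timeout(pid_t p, long timeout_ms, int *status) {
    assert(p > 0 && status != NULL);
    const struct timespec tick = {0, 10 * 1000 * 1000};  /* 10 ms */
    long waited_ms = 0;
    for (;;) {
        pid_t r = waitpid(p, status, WNOHANG);  /* never blocks */
        if (r == p)
            return 0;
        if (r < 0 && errno != EINTR)
            return -1;
        if (waited_ms >= timeout_ms)
            break;
        nanosleep(&tick, NULL);
        waited_ms += 10;
    }
    /* Timed out. The child is still unreaped, so its pid cannot have
       been reused and the kill cannot hit an unrelated process. */
    kill(p, SIGKILL);
    while (waitpid(p, status, 0) < 0)           /* reap the zombie */
        if (errno != EINTR)
            return -1;
    return WIFSIGNALED(*status) ? 1 : 0;
}
```

The price of avoiding the race is up to one tick of extra latency plus some wasted wakeups while the child runs.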
Inside kernel, there is a process table indexed by process id that keeps process info:
exit status, zombie flag, parent pid, start loc of RAM, size of RAM, ip, registers, uid, gid
In kernel, to resume pid 97:
reti //exit kernel mode, enter user mode, set ip to 97's saved ip (near top of stack)
Example: 10 rows in the process table, all filled with info
If process 3 is running, a lot of its entries are junk (e.g. the saved ip is not up to date)
If process 10 is not running, it is a virtual process (all of its info is saved in the process table)
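A sketch of how such a process table might be declared. Every name and field type here is hypothetical (including the register count and the in_use flag); a real kernel's layout differs. The fields mirror the list above, and the row index serves as the pid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

enum { NREGS = 16 };            /* assumed register count */

struct proc_entry {
    bool      in_use;           /* is this row allocated at all? */
    bool      zombie;           /* exited but not yet waited for */
    int       exit_status;
    pid_t     parent;           /* parent pid */
    uintptr_t ram_start;        /* start location of RAM */
    size_t    ram_size;         /* size of RAM */
    uintptr_t ip;               /* saved ip; stale while running */
    uintptr_t regs[NREGS];      /* saved registers; stale while running */
    uid_t     uid;
    gid_t     gid;
};

/* Count the zombie children of parent in a table of n rows. */
size_t count_zombie_children(const struct proc_entry *table, size_t n,
                             pid_t parent) {
    assert(table != NULL || n == 0);
    size_t count = 0;
    for (size_t pid = 0; pid < n; pid++)
        if (table[pid].in_use && table[pid].zombie &&
            table[pid].parent == parent)
            count++;
    return count;
}
```

A lookup like this is roughly what waitpid(-1, ...) needs: find rows whose parent is the caller and whose zombie flag is set, then hand back the exit status and free the row.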
How to represent inside a kernel?
Each process table entry includes a fixed # of file descriptors, each pointing to info about an open file
Want orthogonality. Does fork() clone file descriptors?
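A small experiment that answers the question on POSIX systems: the child gets copies of the parent's file descriptors. In this sketch the child never opens anything itself, yet it can write through the pipe descriptor it inherited. The function name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the byte the child wrote through its
   inherited copy of the pipe's write end, or -1 on failure. */
int byte_from_child_via_inherited_fd(void) {
    int fds[2];
    if (pipe(fds) < 0)
        return -1;
    pid_t c = fork();
    if (c < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (c == 0) {
        /* Child: fds[1] was cloned by fork, not opened here. */
        close(fds[0]);
        char x = 'x';
        _exit(write(fds[1], &x, 1) == 1 ? 0 : 1);
    }
    assert(c > 0);              /* only the parent gets this far */
    close(fds[1]);
    char got = 0;
    ssize_t n;
    do {
        n = read(fds[0], &got, 1);
    } while (n < 0 && errno == EINTR);
    close(fds[0]);
    int status;
    while (waitpid(c, &status, 0) < 0 && errno == EINTR)
        continue;               /* reap the child, no zombie left */
    return n == 1 ? got : -1;
}
```

Because the descriptors are cloned, process creation stays orthogonal to file access: fork does not need to know which files are open, and the file primitives do not need to know which process opened them.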