Scribe: Christine Kuo
(directory-files-and-attributes "/usr/src/kernel/fs")
result = (("ufs" 209361271 "eggert") ("README" 209361284 "eggert"))
This primitive is slow in Emacs 24.2.
Here's why: Emacs implements this primitive by making the following calls at the C level.
	DIR* d = opendir("/usr/src/kernel/fs"); //ptr into directory
	struct dirent* dp;
	while((dp = readdir(d)) != NULL) { //while there is another directory entry
		char buf[1024];
		strcpy(buf, "/usr/src/kernel/fs/"); //trailing slash is important
		strcat(buf, dp->d_name); //dp->d_name is e.g. "ufs"
		struct stat st;
		lstat(buf, &st);
		//convert st to Lisp form
	}
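What's wrong? Let D = the length of the string "/usr/src/kernel/fs" and N = the # of files in the directory.
Every iteration recopies the D-byte directory prefix with strcpy, and strcat rescans it to find the end of the string.
Can partially rectify this by hoisting the prefix copy above the loop.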
	DIR* d = opendir("/usr/src/kernel/fs");
	struct dirent* dp;
	char buf[1024];
	strcpy(buf, "/usr/src/kernel/fs/");
	size_t dirlen = strlen(buf);
	while((dp = readdir(d)) != NULL) {
		strcpy(buf+dirlen, dp->d_name); //overwrite only the part after the prefix
		struct stat st;
		lstat(buf, &st);
		//convert st to Lisp form
	}
The cost is still O(D*N) = length of directory name * # of files in directory; the goal is O(N).
B/c each call to lstat passes in the full file name, and the kernel must interpret every directory component in it; that work is proportional to D.
Instead of lstat, use fstatat, which interprets a path relative to the directory referred to by an extra file descriptor argument.
	DIR* d = opendir("/usr/src/kernel/fs");
	int fd = dirfd(d); //file descriptor for the open directory
	struct dirent* dp;
	while((dp = readdir(d)) != NULL) {
		struct stat st;
		//name is resolved relative to fd, so no full path needs to be built
		fstatat(fd, dp->d_name, &st, AT_SYMLINK_NOFOLLOW);
		//convert st to Lisp form
	}
Orthogonality = ability to decompose problem into independent axes that do not interfere with each other.
Orthogonality problem in classic Unix design:
The primitives used to access the file content should be orthogonal to the primitives used to specify the file name. In Emacs, the orthogonality was breaking down b/c file names read out of a directory had to be glued back onto the directory name before the files could be examined; this can make code slow or hard to read.
Goal is an orthogonal operating system!
Tools:
Virtualizable processor
Classic Unix model
Process = program in execution, running in an isolated domain = a virtual computer with a subset of abilities of the host computer it is running on.
How do we specify how processes work?
API for dealing with processes (for application programs) + associated intuition (as "data structures")
	pid_t fork(void); //pid_t is a handle for a process
	//in sys/types.h:
	typedef long pid_t; //this must be a signed integer type
If all we needed were a process ID, we could just write:
	pid_t e = 97;
But that only names a process; it is different from:
	pid_t c = fork(); //clones current process
	//returns the clone's pid in the parent, 0 in the child, -1 on failure (e.g. not enough resources)
	if(c == 0) {
		//do what the child should do
	} else {
		//in the parent
	}
	bool printdate(void) { //needs <stdbool.h>, <stdlib.h>, <unistd.h>, <sys/wait.h>
		pid_t p = fork();
		if(p<0) return 0; //fork failed
		if(p==0) {
			//in child, can run date command
			execvp("/usr/bin/date", (char*[]){"date", "-u", 0}); //only returns if it fails
			exit(126); //nonzero exit status to indicate execution did NOT work
			//exit so child does not run parent code if execvp fails
		}
		//back in parent
		//if we just returned 1 here, parent & child would run in parallel, and the caller's & date's outputs could interleave
		int status;
		while(waitpid(p, &status, 0) < 0) continue; //retries if interrupted
		return WIFEXITED(status) && WEXITSTATUS(status)==0; //true only if date exited normally with status 0
	}
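The same fork/exec/wait pattern generalizes beyond date. A minimal sketch, assuming a hypothetical helper run_and_wait (not part of any library) that takes an argv-style array and reports the child's exit status instead of a bool:

```c
#define _POSIX_C_SOURCE 200809L /* request POSIX.1-2008 declarations */
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program argv[0] with arguments argv.
   Returns its exit status (0-255), or -1 if fork or waitpid failed
   or the child did not exit normally (e.g. it was killed by a signal). */
int run_and_wait(char *const argv[])
{
    pid_t p = fork();
    if (p < 0)
        return -1;              /* not enough resources to clone */
    if (p == 0) {
        execvp(argv[0], argv);  /* only returns if it fails */
        _exit(126);             /* child must not fall into parent code */
    }
    int status;
    while (waitpid(p, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;          /* real error, not just an interruption */
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Unlike the loop in printdate, this one retries only on EINTR, so an unexpected waitpid error cannot spin forever. The sketch uses _exit in the child so that buffered stdio output inherited from the parent is not flushed twice.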
Could we instead make up a process ID on our own?
	waitpid(97, &st, 0); //fails (errno == ECHILD) unless process 97 is a child of the caller
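Zombie processes
Zombie = a process that has exited but that its parent has not yet waited for; its process table entry stays around until it is reaped.
init.c:
	//reap the zombies
	while(waitpid(-1, ~~~)) continue; //beware: -1 means wait for any child to exit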
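A minimal sketch of such a reaping loop follows. It is not the real init.c; the names reap_all_children and spawn_and_reap are hypothetical, and the loop stops once waitpid reports ECHILD (no children left) rather than running forever as init would.

```c
#define _POSIX_C_SOURCE 200809L /* request POSIX.1-2008 declarations */
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for every child of the caller; return how many were reaped,
   or -1 on an unexpected error. */
int reap_all_children(void)
{
    int reaped = 0;
    for (;;) {
        pid_t r = waitpid(-1, NULL, 0);  /* -1: any child */
        if (r > 0)
            reaped++;                    /* one zombie cleaned up */
        else if (errno == ECHILD)
            break;                       /* no children remain */
        else if (errno != EINTR)
            return -1;
    }
    return reaped;
}

/* Hypothetical demo: create n children that exit at once, then reap them.
   Returns the number reaped, or -1 if it differs from the number started. */
int spawn_and_reap(int n)
{
    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t p = fork();
        if (p == 0)
            _exit(0);                    /* child: becomes a zombie until reaped */
        if (p > 0)
            started++;
    }
    int reaped = reap_all_children();
    return reaped == started ? reaped : -1;
}
```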
Signal alarm interrupt
	while(waitpid(p, &status, 0)<0) continue;
	//if date loops forever, waitpid would hang and never return
	//want to insulate the caller even if this happens in the callee
	signal(SIGALRM, donothing); // void donothing(int sig) {}
	alarm(5); //please wake me up in 5 seconds by delivering a signal; the handler is called asynchronously
	//BUGGY! If the alarm goes off before waitpid is called, waitpid would hang forever...
	if(waitpid(p, &status, 0) < 0 && errno==EINTR) {
		//date is looping, alarm went off
		//must get rid of subsidiary process
		kill(p, SIGINT); //sends the same signal as Ctrl+C to the child
		waitpid(p, &status, 0); //reap the date zombie
		//if we do not trust date (ill behaved), use SIGKILL, but avoid it b/c it is extreme & prevents clean up
		//usually a bit vector keeps track of the pending signals for each process
	}
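One way to sidestep the race noted above is to avoid alarm entirely and poll with WNOHANG. This is a different technique from the SIGALRM approach in the notes, shown only as a sketch; the names poll_child, wait_with_timeout and spawn_sleeper, the 10 ms tick, and the 1 second grace period are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L /* request POSIX.1-2008 declarations */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Poll child p for up to ms milliseconds.
   Returns 1 if it was reaped, 0 if it is still running, -1 on error. */
static int poll_child(pid_t p, int ms, int *status)
{
    const struct timespec tick = {0, 10 * 1000 * 1000};  /* 10 ms */
    for (int waited = 0; ; waited += 10) {
        pid_t r = waitpid(p, status, WNOHANG);  /* returns 0 if not done yet */
        if (r == p)
            return 1;
        if (r < 0 && errno != EINTR)
            return -1;
        if (waited >= ms)
            return 0;
        nanosleep(&tick, NULL);
    }
}

/* Returns 0 if p exited within timeout_ms, 1 if it had to be signalled,
   -1 on error. In the first two cases *status holds the wait status. */
int wait_with_timeout(pid_t p, int timeout_ms, int *status)
{
    if (p <= 0)
        return -1;
    int r = poll_child(p, timeout_ms, status);
    if (r != 0)
        return r < 0 ? -1 : 0;
    kill(p, SIGINT);                     /* ask politely first */
    r = poll_child(p, 1000, status);     /* grace period for clean up */
    if (r < 0)
        return -1;
    if (r == 0) {                        /* ill behaved: force it */
        kill(p, SIGKILL);
        while (waitpid(p, status, 0) < 0 && errno == EINTR)
            continue;
    }
    return 1;
}

/* Hypothetical test helper: a child that sleeps, then exits with status 0. */
pid_t spawn_sleeper(unsigned seconds)
{
    pid_t p = fork();
    if (p == 0) {
        sleep(seconds);
        _exit(0);
    }
    return p;
}
```

The trade-off is that polling wakes up repeatedly instead of sleeping until the child exits, but no signal can arrive "too early" because no signal handler is involved.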
Inside kernel, there is a process table indexed by process id that keeps process info:
exit status, zombie flag, parent pid, start loc of RAM, size of RAM, ip, registers, uid, gid
In kernel, to resume pid 97:
reti //exit kernel mode, enter user mode, set ip to 97's ip near top of stack
E.g. a process table with 10 rows, all filled with info.
If process 3 is running, a lot of its entries are junk (e.g. the saved ip is not up to date), b/c the live values are in the CPU.
If process 10 is not running, it is a virtual process: all of its info is saved in the process table.
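The process table described above could be sketched as a C array of structs. Everything here is hypothetical (field names, types, NPROC, NREGS, proc_lookup); a real kernel's layout differs, and the fields simply follow the list in these notes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

enum { NPROC = 128, NREGS = 16 };   /* hypothetical table and register counts */

/* One row of the process table, indexed by process id. */
struct proc {
    bool in_use;            /* is this row describing a process at all? */
    bool zombie;            /* exited but not yet waited for */
    int exit_status;
    pid_t parent;           /* parent pid */
    uintptr_t ram_start;    /* start location of the process's RAM */
    size_t ram_size;        /* size of the process's RAM */
    uintptr_t ip;           /* saved instruction pointer; stale while running */
    uintptr_t regs[NREGS];  /* saved registers; stale while running */
    uid_t uid;
    gid_t gid;
};

static struct proc proc_table[NPROC];

/* Return the row for pid, or NULL if pid is out of range or unused. */
struct proc *proc_lookup(pid_t pid)
{
    if (pid < 0 || pid >= NPROC || !proc_table[pid].in_use)
        return NULL;
    return &proc_table[pid];
}
```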
How to represent open files inside the kernel?
Each process table entry includes a fixed # of file descriptors, each pointing to info about an opened file.
Want orthogonality. Does fork() clone file descriptors?
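On POSIX systems the answer is yes: the child gets copies of the parent's file descriptors, and each copy refers to the same open file, so the file offset is shared. A small sketch that checks this, assuming a hypothetical function name shared_offset_after_fork and a temporary file under /tmp:

```c
#define _POSIX_C_SOURCE 200809L /* request POSIX.1-2008 declarations */
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes 5 bytes through a descriptor inherited across fork;
   the parent then reports the file offset it sees on its own copy.
   Returns -1 on error. */
long shared_offset_after_fork(void)
{
    char path[] = "/tmp/forkfd-XXXXXX";
    int fd = mkstemp(path);          /* open before fork */
    if (fd < 0)
        return -1;
    unlink(path);                    /* file lives on until fd is closed */
    pid_t p = fork();
    if (p < 0) {
        close(fd);
        return -1;
    }
    if (p == 0) {
        ssize_t n = write(fd, "hello", 5);   /* child uses the cloned fd */
        _exit(n == 5 ? 0 : 1);
    }
    int status;
    while (waitpid(p, &status, 0) < 0 && errno == EINTR)
        continue;
    off_t off = lseek(fd, 0, SEEK_CUR);      /* parent's view of the offset */
    close(fd);
    return (long)off;
}
```

If the parent sees offset 5 even though only the child wrote, the two descriptors must share one underlying open-file entry rather than each having a private one.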