CS 111 Lecture 5 Scribe Notes

By Eric Shih and Mark Ketenjian

Definition of Orthogonality

What is orthogonality? According to Wikipedia, orthogonality is “a system design property which guarantees that modifying the technical effect produced by a component of a system neither creates nor propagates side effects to other components of the system.”

In other words, we should decompose our problem into independent axes which don’t interfere with each other.

One of our goals in creating an operating system is to maintain orthogonality as best as we can.


Problems with Orthogonality

GNU Emacs directory–files-and-attributes function:

In the newest version of the GNU Emacs file system, there is an inefficiency with how files are looked up (e.g. reading from "/usr/src/kernel/fs") using this function.

When we want to access the file, we call

DIR *d = opendir(“/usr/src/kernel/fs”);
char buf[1024];
strcpy(buf,”/usr/src/kernel/fs/”);
while(dp = readDir(d, … )) 
{
        // After readDir call, dp-> d_name = “README”;
        strcat(buf, dp->d_name);
        // buf now has our complete file name
        struct stat st;
        lstat(buf, &st);
        // convert st to Lisp form
}

Why is this slow? The bottleneck lies in the lstat call. The time complexity is O(D*N), where D is the length of the directory name and N is the number of the files in directory.

How can we optimize this and get O(N) complexity? The solution:

Replace

lstat(buf, &st);

with:

int fd = dirfd(d); // Get the file descriptor of current directory
fstatat(fd, dp->d_name, &st, O_SYMLINK); // Much faster!

This way we already start at the directory given by fd and don’t have to recursively determine the file location through lstat.

In this example, orthogonality between file names and file contents breaks down. Though they are supposed to be kept separate, in this case, file names become something you read out of a file. This causes many possible problems, including our inefficiency.

Tools for Orthogonality

The Classic Unix Model has the following setup for orthogonality:

In the remainder of these notes, we will deal primarily with processes and how they are implemented and also touch on files.


Orthogonality: Processes

In computing, a process is a program in execution in an isolated domain, meaning that it shouldn’t be able to mess with other processes. We can think of a process as a virtual computer with a subset of the total system. How do we specify how processes work and preserve orthogonality? We need

The API for Processes:

Fork

pid_t fork(); – spawns a process, pid_t is a handle for a process in sys/types.h: typedef long pid_t;

pid_t must be a signed integer type.

Fork clones the current process and

Examples using fork:

1. Basic use of fork()

pid_t c  = fork();
if(c == 0)
    // Do what the child should do
else
    // In parent

2. Parent keeps spawning child processes which immediately exit.

while(fork())
  continue;
exit();

3. Parent always spawns a child process and immediately exits. Creates a chain of child processes

while(!fork())
  continue;
exit();

4. Creates as many processes as you can if there is nothing to stop it

while(1)
    fork();

5. Subroutine to print the time of day using fork

bool printday(void)
{
  pid_t p = fork();
  if(p < 0)
      return false;
  if(p == 0)
  {
      execvp(“/usr/bin/date”, (char**) {“date”, “-u”, NULL});
      exit(126); // large numbers to indicate problems with exec by convention
  }
  else
  {
      // back in parent
      int status;
      while(waitpid(p, &status, …)  < 0)
        continue;    
      // We’ve waited for the child process to finish
      return WIFEXITED(status) && (WEXITSTATUS(status) == 0);
  }
}

How can we mess things up?

Data Structure for Processes: Process Table

How does this table work? When the kernel wants to resume some process, it uses a special machine instruction: reti (return from interrupt)

If we have ten processes on a single core machine, we have 10 filled rows in the process table. What if a process is currently running? Many of the entries in a currently running process are unknown because they are being updated as the process is running. The process table is an important part of how virtualization works.


Orthogonality: Files

The API for Files

For security and virtualization reasons, every process has their own set of file descriptors, and nobody else can see your file descriptor list.

Does fork clone file descriptors? Yes! This is an example of how our ideal situation of orthogonality does not work entirely, since file descriptors and processes interact in the process table.