CS111 Lecture 5 Scribe Notes - Winter '13

Alex Martinez, Nir Edelman


Table of Contents

GNU Emacs example

(directory-file-and-attributes "/usr/src/kernelt/fs") is a lisp primative that prints out a list of files and attributes in directory.

Result:

("ufs"      209361271 "eggert"
 "README"   20935     "eggert"
              .
              .
              .             )

The problem is that this is slow now. Here's why:

DIR *d = opendir("/usr/src/kernel/fs");
while(dp = readdir(d, ... )
{
      dp->d_name = "ufs";
      char buf[1024];
      strcpy(buf, "/usr/src/kernel/fs/");
      strcat(buf,dp->d_name);

      struct stat stb;

      lstat(buf, &st);

      (convert st to Lisp form) //NOT IMPORTANT
}

From the above code, we can start by asking two questions:

  1. How does lstat work?
  2. Is lstat our problem?

For our first question, lstat takes 2 arguments:

  1. The file path of the file path that is being inquired
  2. A structure where data about the file will be stored.
lstat returns a negative value on failure. [src]

The cost of the file path is as follows:

cost = O(D * N)

      D = length of directory name
      N = number of files in this directory

We want cost = O(N).

Orthogonality

xyz-axis

Orthogonality is defined as the relation of opposition between things at right angles. [src]

The idea is that we want systems that decompose a problem along "different axes" (where the axes don't interfere with one another). For example, we can have our x-axis be file names, and our y-axis be file content. In operating systems, our goal is to try to maintain orthogonality as best we can.


In the previous example, orthogonality is starting to break down. In order to preserve orthogonality, we could rewrite the code as such:

DIR *d = opendir("/usr/src/kernel/fs");
int fd = dirfd(d);
dirlen = strlen(buf);
while (dp = readdir(d, ... )
{
      dp->d_name = "ufs";
      strcpy(buf+dirlen,dp->d_name);
      struct stat stb;
      fstatat(fd, dp->d_name, &st, O_SYMLINK_YIELD);
      ...
}

where we have replaced lstat with a brand new function, fstatat(fd, dp->d_name, &st, O_SYMLINK_YIELD), where:

TOOLS

The following tools help with orthogonality:

Virtualizable processor

This portion is based off of the classic UNIX model.

We can start by defining our orthogonal axes:

Q: How do we specify how processes work?

A: API - for dealing with processes + associated intuition (as "data structures")
            - for application programs

pid_t fork(): takes no args, and returns a process id.

pid_t is a handle for a process.

Under sys/types.h:

typedef long pid_t;    //This must be a signed integer type

pid_t c = fork();

NOTE: we can call the original process the parent process and the clone a child process.

Here is an example of using fork():

pid_t c = fork();

if (c == 0)
{
     // do what the child wants
}
else
{
     // in parent
}

Q: What would the following 3 examples do?

while (fork())
      continue;
exit();

A: parent keeps creating children, and the children keep exiting

while(!fork())
      continue;
exit()

A: parent creates child and exit;
    child forks and exits;
    continues to fork and exit

while(1)
      fork();

A: Creates a giant tree of processes.


Here is a quicklist of functions:

Let's say we want to print the time of day using 'fork':

bool printdate(void)
{
      pid_t *p = fork();
      if (p < 0) return 0;
      if (p == 0)
      {
          execvp("/usr/bin/date", (char**) { "date",      // execvp is a sys call
                                              "-u",
                                              0 });
          exit(126);     // 126 and 127 deal with exec* failures
      }

      int status;

      while (waitpid(p, &status, 0) < 0)     // if child process fails, call it again
          continue;

      return WTFEXITED(status) && WEXITSTATUS(status)==0;
}

Possible errors

What are some things that can go wrong with virtualizable processes?

Example of a zombie creator:

while ((p = fork()) >= 0)
      if (!p)
          exit();

// over quota
exit();       // all zombie children are re-parented to process 1

The kernel keeps zombies until it knows no one will ever be interested in them again. If the parent exits, all zombie children are reparented to process 1, as handled in init.c:

init.c:
      // Reap the zombies
      wait(waitpid(-1, ...,...)) // wait for any of my children to exit
          continue;

Another possible error: Let's go back to the 'date' example. Suppose 'date' loops, and we want it to work even then?

One solution: give 5 seconds to loop, then return some sort of signal.

signal(SIGALRM, donothing);   // void donothing(int sig) {}
alarm(5);       // wake up in 5 sec, deliver a signal
while (waitpid(p, &status, 0) < 0)
     continue       //buggy! a race is here

if (waitpid(p, &status, 0) < t0 && errno==EINTR)
{
      // alarm went off
      kill(p, SIGINT);     // similar to hitting ctrl+c

      // function: int kill (pid_t, int), where int is a signal number

      waitpid(p, &status, 0);     // 'date' can defend itself from kill()! This waitpid makes sure 'date' stops
}

If it receives muliple signals to kill, the processor will queue them up.

Below is a data structure to implement this process table:

  <----pid

      12 exit status
      1 zombie?
      parent process id
      start location of RAM
      size of RAM
      register instruction pointer
      register
      register
      register
      register
      user id + group id
      ... ...
      ... ...

Suppose we have ten processes on a single core machine.
If process 3 is currently running, then many of the entries in the process table for process 3 are junk.

File system

-int open(char const *file_name, int flag, extra info) : creates a file descriptor upon success. From the users app's point of view, the file descriptor is just an nonnegative intger.

-int close(int) : closes the file descriptor passed to it.

-int dupe(int): clones a file descriptor

Every process has their own set of file descriptors. Going back to the process table:

  ...     ...     ...  
info about file fd[0]
info about file fd[1]
info about file fd[2]
  ...     ...     ...  

Let us finish with a question on orthogonality: does fork() clone file descriptors?

FIND OUT NEXT TIME!