CS 111 Lecture 6 Scribe Notes - Winter 2013

Scribes: Lily Bao, Yeon Joon Jin, David Meng, Matthew Kwong

Table of Contents

  1. Unix Files
  2. Pipes
  3. Interrupts and Signals

Unix Files

The BIG Idea of Unix: everything is a file

The fundamental way to get data is to use:

  1. read(fd, buf, sizeof buf)

Advantages of using read for everything:

  1. It's simple
  2. It's orthogonal
  3. It really simplifies a lot of things

Disadvantages of using read for everything:

  1. Certain operations may not make sense on certain devices,
     e.g. it doesn't make sense to call lseek on the keyboard

To solve this problem, there are two types of files: stream-oriented and random access

  Stream-Oriented (like keyboard)      Random Access (like disk)
  -------------------------------      -------------------------
  spontaneous data generation          request/response
  (in principle) infinite input        storage
                                       finite capacity

System calls like read work for both stream-oriented and random access files

System calls like lseek work only for random access files; they fail immediately if called on a stream-oriented file.
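
As a minimal sketch of this distinction, the function below probes a descriptor with lseek. The name is_seekable is ours, not from the lecture. It relies on the POSIX rule that lseek on a pipe, FIFO, or socket fails with errno set to ESPIPE; behavior on terminals varies between systems, so the sketch makes no claim about the keyboard itself.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd supports lseek, 0 if it is
   stream-oriented (pipe, FIFO, socket), and -1 on any other error. */
int is_seekable(int fd)
{
    /* Ask for the current offset without moving it. */
    if (lseek(fd, 0, SEEK_CUR) != (off_t) -1)
        return 1;
    return errno == ESPIPE ? 0 : -1;
}
```

A program could call is_seekable(0) before deciding whether it may rewind its standard input or must buffer the data itself.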

By convention, the first three file descriptors are reserved:

        0: stdin

        1: stdout

        2: stderr

Process Table

Remember the process table approach from last lecture. What can go wrong with this approach?

  1. You write to a closed file descriptor.

        close(1);
        i = write(1, "x", 1);
        //write fails and returns -1, with errno set to EBADF
        if (i < 0)
                fprintf(stderr, "%s\n", strerror(errno));

  2. If you close and open files multiple times, you can end up with aliases

        int fd = open("foo", O_RDONLY);
        close(fd);
        int fd1 = open("bar", O_RDONLY);
        //fd1 could be the same int as fd, since fd is closed
        read(fd, buf, sizeof buf);      //this could read from bar

  3. You open too many file descriptors, and have file descriptor leaks.
     Traditionally, the number of open file descriptors is limited to the number of columns in the process table.

        for (i = 0; i < N; i++) {
                int fd = open(file[i], O_RDONLY);
                if (fd < 0)
                        error();
                read_and_copy(fd);
                //no close(fd) here, so each iteration uses up another descriptor
        }

  4. The device vanishes

        fd = open("/dev/usb/flash01", O_RDONLY);        //open a flash drive in a USB port
        //e.g. the flash drive is physically removed, or you try to read from a wireless network
        read(fd, buf, sizeof buf);
        //read returns -1, with a special errno (not standardized)

  5. Race condition with unlink (a syscall that removes ordinary files).
     The kernel remembers that the file was opened and keeps a pointer to it. This way, unlinkers can't mess with readers. The point is to keep unlink orthogonal to read.

        fd = open("/tmp/foo", O_RDONLY);        //foo is an ordinary file
        unlink("/tmp/foo");
        read(fd, buf, sizeof buf);              //this succeeds!
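
The unlink case can be checked with a short experiment. The sketch below is an illustration under our own assumptions: the function name read_after_unlink and the path template are hypothetical, and the file is created with the standard POSIX mkstemp rather than anything from the lecture. It writes a message, unlinks the file, and then reads the data back through the descriptor that is still open.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: returns the number of bytes read back after the
   file's name has been removed, or -1 on error. */
ssize_t read_after_unlink(const char *msg, char *buf, size_t size)
{
    char path[] = "/tmp/unlink-demo-XXXXXX";   /* placeholder template */
    int fd = mkstemp(path);                    /* create and open a unique file */
    if (fd < 0)
        return -1;

    size_t len = strlen(msg);
    ssize_t n = -1;
    if (write(fd, msg, len) == (ssize_t) len
        && unlink(path) == 0                   /* the name is gone from /tmp */
        && lseek(fd, 0, SEEK_SET) == 0)        /* rewind to the start */
        n = read(fd, buf, size);               /* the data is still reachable */

    close(fd);                                 /* now the kernel reclaims the storage */
    return n;
}
```

Calling read_after_unlink("hello", buf, sizeof buf) should return 5 on a system where /tmp is writable, since the storage is only reclaimed once the last open descriptor is closed.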

You can make use of the fact that unlink and read are orthogonal when creating temporary files.

Approaches to creating a temporary file

  1. Use a system call

        fd = mktempfile();      //not visible in any directory
        write();
        read();
        close(fd);              //reclaims resources

  2. Use a library function

        int mktempfile(char *name, size_t size) {
                int fd;
                do {
                        generate_random_file_name(name, size);
                } while ((fd = open(name, O_CREAT|O_RDWR|O_EXCL, 0600)) < 0 && ok_errno(errno));
                //ok_errno makes sure you stop looking on a bad error
                return fd;
        }
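
The library approach can be filled in as a runnable sketch. The bodies of generate_random_file_name and ok_errno below are our assumptions, since the notes only name them: the file name is built from the process ID and rand(), and only EEXIST (a name collision) is treated as worth retrying. A production program would more likely use the standard mkstemp, which does the same job with better randomness.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed helper: fill name with a candidate path under /tmp. */
static void generate_random_file_name(char *name, size_t size)
{
    snprintf(name, size, "/tmp/scratch-%ld-%d", (long) getpid(), rand());
}

/* Assumed helper: only a name collision is worth another attempt. */
static int ok_errno(int e)
{
    return e == EEXIST;
}

/* Creates and opens a new temporary file.
   Returns the descriptor, or -1 with errno set on a bad error. */
int mktempfile(char *name, size_t size)
{
    int fd;
    do {
        generate_random_file_name(name, size);
        /* O_EXCL makes open fail if the name already exists. */
        fd = open(name, O_CREAT | O_RDWR | O_EXCL, 0600);
    } while (fd < 0 && ok_errno(errno));
    return fd;
}
```

After a successful call, the caller can unlink(name) right away and keep using the descriptor, which gives a temporary file that is not visible in any directory.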

Pipes

Advantages to using pipes

  1. two commands can run in parallel
  2. no space needed to hold temporary data
  3. we get much better caching

Disadvantages of using pipes

  1. no checkpoint or restart
  2. doesn’t work well for multiple readers
  3. the second command can’t lseek, since pipes are inherently stream-oriented

How does the pipe work if we have a | b?

In the file descriptor table,

a's right (output) points to the write end of the pipe object

b's left (input) points to the read end of the pipe object

the pipe object is like a bounded buffer
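
One way to see the bounded buffer is to fill a pipe that nobody reads. The sketch below is an illustration, not part of the lecture: the name pipe_capacity is ours, and it assumes the write end is switched to non-blocking mode with fcntl so that a full pipe makes write fail with EAGAIN instead of hanging. The number it returns depends on the operating system.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: returns how many bytes a fresh pipe accepts
   before it is full, or -1 on error. */
long pipe_capacity(void)
{
    int fd[2];
    if (pipe(fd) != 0)
        return -1;

    long total = -1;
    int flags = fcntl(fd[1], F_GETFL);
    if (flags >= 0 && fcntl(fd[1], F_SETFL, flags | O_NONBLOCK) == 0) {
        char byte = 'x';
        total = 0;
        /* Nobody reads, so the buffer eventually fills up. */
        while (write(fd[1], &byte, 1) == 1)
            total++;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            total = -1;         /* stopped for some other reason */
    }
    close(fd[0]);
    close(fd[1]);
    return total;
}
```

With the default blocking mode, the same loop would simply stop making progress once the buffer is full, which is the "a hangs and waits" case.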

[Figure: Process Table with Pipe]

What can go wrong?

  1. b reads, but the pipe is empty
        1. in this case, read will just wait
  2. a writes, but the pipe is full
        1. a hangs and waits
  3. a writes, but b has closed its end of the pipe
        1. different things can happen
                1. write returns -1 and sets errno=EPIPE (only if a ignores or handles SIGPIPE)
                2. a gets the SIGPIPE signal and is killed (this is the default behavior)
  4. b reads, but a has closed its end of the pipe
        1. read returns 0 (EOF)
  5. a’s done generating output, but forgets to close its end of the pipe
        1. b's read will hang forever
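
Two of these cases can be observed inside a single process. The sketch below uses function names of our own choosing. The first function closes the write end and shows read reporting end of file. The second closes the read end and ignores SIGPIPE for the duration of the call, so that write fails with errno set to EPIPE instead of killing the process.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns what read reports on an empty pipe whose write end is closed. */
ssize_t read_after_writer_closed(void)
{
    int fd[2];
    char c;
    if (pipe(fd) != 0)
        return -1;
    close(fd[1]);                       /* no writers remain */
    ssize_t n = read(fd[0], &c, 1);     /* 0 means end of file */
    close(fd[0]);
    return n;
}

/* Returns the errno from writing to a pipe whose read end is closed,
   0 if the write unexpectedly succeeds, or -1 if pipe itself fails. */
int errno_after_reader_closed(void)
{
    int fd[2];
    if (pipe(fd) != 0)
        return -1;
    void (*old)(int) = signal(SIGPIPE, SIG_IGN);   /* survive the write */
    close(fd[0]);                                  /* no readers remain */
    int err = 0;
    if (write(fd[1], "x", 1) < 0)
        err = errno;
    close(fd[1]);
    signal(SIGPIPE, old);                          /* restore the old disposition */
    return err;
}
```

Without the SIG_IGN line, the second function would not return at all under the default disposition, because the process would be terminated by SIGPIPE.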

To implement a pipe:

Given three processes sh (the shell), b, and a, where the shell runs a | b

The forks are nested in the following way:

[Figure: Pipe Forking]

SH forks B, then B forks A

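int fd[2];
pid_t bp = fork();
if (bp == 0)
{
        //in B, the child of the shell
        pipe(fd);
        pid_t ap = fork();
        if (ap == 0)
        {
                dup2(fd[1], 1); //stdout of A now writes to the write end of the pipe
                close(fd[0]);   //A no longer needs the original pipe descriptors
                close(fd[1]);
                /*run code for A*/
        }
        else
        {
                dup2(fd[0], 0); //stdin of B now reads from the read end of the pipe
                close(fd[0]);
                close(fd[1]);   //B must close the write end, or its read never sees EOF
                /*run code for B*/
        }
}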
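
For a version that can be compiled and run, the sketch below wraps the same nested-fork structure in a function. The name run_pipeline, the use of execvp to "run code" for A and B, and the exit code 127 on failure are our assumptions, not part of the lecture. The calling process plays the role of the shell and waits for B.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs "a | b": the caller forks B, and B forks A. Both argument
   vectors are NULL-terminated. Returns B's exit status, or -1 on error. */
int run_pipeline(char *const a[], char *const b[])
{
    pid_t bp = fork();
    if (bp < 0)
        return -1;
    if (bp == 0) {
        int fd[2];
        if (pipe(fd) != 0)
            _exit(127);
        pid_t ap = fork();
        if (ap < 0)
            _exit(127);
        if (ap == 0) {
            dup2(fd[1], 1);     /* A's stdout is the write end */
            close(fd[0]);
            close(fd[1]);
            execvp(a[0], a);
            _exit(127);         /* reached only if exec fails */
        }
        dup2(fd[0], 0);         /* B's stdin is the read end */
        close(fd[0]);
        close(fd[1]);           /* otherwise B never sees EOF */
        execvp(b[0], b);
        _exit(127);
    }
    int status;
    if (waitpid(bp, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with a set to { "echo", "hello", NULL } and b set to { "grep", "-q", "hello", NULL }, the call behaves like the shell command echo hello | grep -q hello and returns grep's exit status. Because the pipe is created inside B, the shell itself never holds either end.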

Interrupts and Signals

How should the kernel deal with running out of power?

  1. Take a snapshot of RAM and registers (copy to disk)
        1. When power is restored, continue programs as if power never went out
        2. Problems with this
                1. Network connections won’t work
                2. Takes a long time to take a snapshot of RAM, potentially inconsistent snapshots
                3. Time sensitive transactions won’t work
                4. Certain sensitive sessions shouldn’t be restored - security issues
  2. Need to inform user processes of this situation
        1. Polling method
                1. use a file “/dev/power”
                2. If you read “!”, power is low. If you read “[_]”, power ok.
                3. Problems:
                        1. A checking mechanism needs to be added to every single program that cares about power
                                1. Pain to program
                                2. Chews up CPU time
        2. Signals method
                1. Kernel “magically” inserts a check into your program when power is low (through Signal Delivery)
                2. E.g. “fix_situation” function would be a signal handler, delivered to the program
