CS 111 Lecture 6 Scribe Notes - Winter 2013

Scribes: Lily Bao, Yeon Joon Jin, David Meng, Matthew Kwong

Unix Files
Pipes
Interrupts and Signals

Unix Files

The BIG Idea of Unix: everything is a file

The fundamental way to get data is to use:

read(fd, buf, sizeof buf)

Advantages of using read for everything:

It's simple
It's orthogonal
It really simplifies a lot of things

Disadvantages of using read for everything:

Certain devices may not make sense with read
e.g. it doesn't make sense to call lseek on the keyboard

To solve this problem, there are two types of files: stream-oriented and random access

Stream-Oriented (like keyboard)    Random Access (like disk)
spontaneous data generation        request/response
(in principle) infinite input      storage
                                   finite capacity

Commands like read work for both stream-oriented and random access files

Commands like lseek work only for random access; they immediately fail if called with a stream-oriented file. On a pipe, for example, lseek returns -1 with errno set to ESPIPE.
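A minimal sketch of how a program could tell the two kinds of file apart, assuming a POSIX system. The helper name is_seekable is made up for illustration; it only probes with lseek and does not move the file offset.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if fd supports lseek (random access), 0 if it is a
   stream such as a pipe, FIFO, or socket, and -1 on any other error. */
int is_seekable(int fd)
{
    /* Seeking 0 bytes from the current position changes nothing. */
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return 1;
    return errno == ESPIPE ? 0 : -1;
}
```

A program that wants to skip ahead in its input could call is_seekable(0) first and fall back to reading and discarding bytes when the answer is 0.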

By convention, the first three file descriptors are reserved:

0: stdin

1: stdout

2: stderr

Process Table

Remember the process table approach from last lecture. What can go wrong with this approach?

You write to a closed file descriptor.

close(1);

i = write(1, "x", 1);

//write fails and returns -1, with errno set to EBADF

if (i < 0)

fprintf(stderr, "%s\n", strerror(errno)); //report on stderr, because stdout is closed

If you close and open a file multiple times, you can end up with aliases

int fd = open("foo", O_RDONLY);

close(fd);

int fd1 = open("bar", O_RDONLY);

//fd1 could be the same int as fd, since fd is closed

read(fd, buf, sizeof buf); //this could read from bar
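The aliasing can be observed directly. The sketch below assumes a POSIX system where /dev/null exists and a single-threaded program; the function name fd_number_is_reused is made up for illustration. It relies on the rule that open returns the lowest-numbered descriptor that is free.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if a descriptor number is handed out again right after it
   is closed, 0 if not, and -1 if /dev/null cannot be opened. */
int fd_number_is_reused(void)
{
    int fd = open("/dev/null", O_RDONLY);
    if (fd < 0)
        return -1;
    close(fd);                       /* fd is now a stale number */

    int fd1 = open("/dev/null", O_RDONLY);
    if (fd1 < 0)
        return -1;
    int reused = (fd1 == fd);        /* open picks the lowest free slot */
    close(fd1);
    return reused;
}
```

One common defensive habit is to overwrite a descriptor variable with -1 right after closing it, so that a later accidental use fails instead of touching another file.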

You open too many file descriptors, and have file descriptor leaks

traditionally, the number of open file descriptors is limited to the number of columns in the process table

for(i=0; i<N; i++) {

int fd=open(file[i], O_RDONLY);

if (fd<0)

error();

read_and_copy(fd); //fd is never closed, so each iteration leaks one descriptor

}
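The loop above never closes fd, which is the leak. A hedged sketch of the repaired pattern follows; open_close_many is a hypothetical name, and the body of the loop is left as a comment where read_and_copy would go.

```c
#include <fcntl.h>
#include <unistd.h>

/* Opens and closes path n times; returns how many opens succeeded.
   Because every descriptor is closed before the next open, n may
   exceed the per-process descriptor limit. */
int open_close_many(const char *path, int n)
{
    int ok = 0;
    for (int i = 0; i < n; i++) {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            break;          /* a leaky loop would end up here quickly */
        /* ... use fd here, e.g. read_and_copy(fd) ... */
        close(fd);          /* the step missing from the leaky loop */
        ok++;
    }
    return ok;
}
```

With the close in place, the loop uses one descriptor slot at a time no matter how large n is.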

The device vanishes

fd = open("/dev/usb/flash01", O_RDONLY);

//opens a flash drive in a USB port

//suppose the device then vanishes, e.g. the flash drive is physically removed

//(the same can happen when reading over a wireless network)

read(fd, buf, sizeof buf); //returns -1, with a special errno (not standardized)

Race condition with unlink (a syscall that removes ordinary files)

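The kernel remembers that the file is open and keeps a pointer to it; unlink removes only the name, and the file's storage is reclaimed after the last open descriptor is closed. This way, unlinkers can't mess with readers. The point is to keep unlink orthogonal to read.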

fd = open("/tmp/foo", O_RDONLY); //foo is an ordinary file

unlink("/tmp/foo");

read(fd, buf, sizeof buf); //this succeeds!
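A self-contained sketch of the same sequence, assuming a writable /tmp. The function name read_after_unlink and the scratch file name are made up for illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Creates a scratch file, unlinks it while it is still open, then
   writes and reads through the open descriptor. Returns the number of
   bytes read back (5 on success) or -1 on error. */
ssize_t read_after_unlink(void)
{
    char path[64];
    char buf[16];

    /* Hypothetical scratch name; the pid keeps it from colliding. */
    snprintf(path, sizeof path, "/tmp/unlink_demo_%ld", (long)getpid());

    int fd = open(path, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    if (unlink(path) != 0) {         /* the name is gone after this */
        close(fd);
        return -1;
    }

    ssize_t n = -1;
    if (write(fd, "hello", 5) == 5 && lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, buf, sizeof buf);   /* still succeeds */
    close(fd);   /* last reference: the kernel can free the storage */
    return n;
}
```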

You can make use of the fact that unlink and read are orthogonal:

Approaches to creating a temporary file

Use a system call (mktempfile here is hypothetical)

fd = mktempfile(); //not visible in any directory

write();

read();

close(fd); //reclaims resources

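Use a library function

int mktempfile(char* name, size_t size) {

int fd;

do {

generate_random_file_name(name, size);

} while((fd = open(name, O_CREAT|O_RDWR|O_EXCL, 0600)) < 0 && ok_errno(errno));

//O_EXCL makes open fail if the name is already taken, so we retry with a new name

//ok_errno must reject other errors: make sure you'll stop looking on a bad error

return fd;

}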
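The two approaches can be combined with calls that already exist. The sketch below assumes POSIX mkstemp, which does the random-name-plus-O_EXCL loop for you, followed by unlink so that the file has no name in any directory. The function name anonymous_temp_fd and the /tmp/scratch prefix are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>

/* Returns a descriptor for a fresh temporary file that has no name in
   any directory, or -1 on error. */
int anonymous_temp_fd(void)
{
    char name[] = "/tmp/scratchXXXXXX";  /* mkstemp fills in the Xs */
    int fd = mkstemp(name);              /* creates and opens exclusively */
    if (fd < 0)
        return -1;
    unlink(name);                        /* drop the name, keep the file */
    return fd;
}
```

The caller reads and writes through the returned descriptor as usual; closing it reclaims the storage.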

Pipes

Advantages to using pipes

two commands can run in parallel
no space needed to hold temporary data
we get much better caching

Disadvantages of using pipes

no checkpoint or restart
doesn’t work well for multiple readers
the second command can’t lseek

pipes are inherently stream-oriented

How does the pipe work if we have a | b?

In the file descriptor table,

a's right (output) points to the write end of the pipe object

b's left (input) points to the read end of the pipe object

the pipe object is like a bounded buffer

[Figure: process table with pipe]

What can go wrong?

b reads, but pipe is empty

in this case, read will just wait

a writes but the pipe is full

a hangs and waits

a writes, but b closed its end of the pipe

different things can happen

write returns -1, sets errno=EPIPE (only if a ignores or handles SIGPIPE)
a gets SIGPIPE signal and gets killed

this is the default behavior

b reads, but a has closed its end of the pipe

read returns 0 (EOF)

a’s done generating output, but forgets to close its end of the pipe

b's read will hang forever, because read only returns 0 (EOF) once every write end of the pipe is closed
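Two of these cases are easy to reproduce inside a single process. The sketch below assumes a POSIX system; both function names are made up for illustration. The second function temporarily ignores SIGPIPE so that write reports the error (errno EPIPE) instead of the process being killed.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Reader's view: once every write end is closed, read returns 0 (EOF). */
ssize_t read_after_writer_closed(void)
{
    int fd[2];
    char c;
    if (pipe(fd) < 0)
        return -1;
    close(fd[1]);                    /* no writers remain */
    ssize_t n = read(fd[0], &c, 1);  /* returns 0 instead of blocking */
    close(fd[0]);
    return n;
}

/* Writer's view: with SIGPIPE ignored, write to a pipe with no readers
   fails. Returns the errno seen, 0 if write succeeded, or -1 if the
   pipe could not be created. */
int write_after_reader_closed(void)
{
    int fd[2];
    if (pipe(fd) < 0)
        return -1;
    void (*old)(int) = signal(SIGPIPE, SIG_IGN);  /* avoid being killed */
    close(fd[0]);                                 /* no readers remain */
    int err = write(fd[1], "x", 1) < 0 ? errno : 0;
    signal(SIGPIPE, old);                         /* restore disposition */
    close(fd[1]);
    return err;
}
```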

To implement a pipe:

Given 3 processes sh (shell), b, and a, where the shell runs a | b

The forks are nested in the following way:
[Figure: pipe forking]

sh forks b, then b forks a

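int fd[2];

pid_t bp = fork();

if (bp == 0)

{

    if (pipe(fd) < 0) //create the pipe before forking A, so A and B share it

        error();

    pid_t ap = fork();

    if (ap == 0)

    {

        dup2(fd[1], 1); //stdout of A is now writing to

                        //the write end of the pipe

        close(fd[0]); //A needs neither original descriptor any more

        close(fd[1]);

        /*run code for A*/

    }

    else

    {

        dup2(fd[0], 0); //stdin of B is reading from the

                        //read end of the pipe

        close(fd[0]);

        close(fd[1]); //if B kept a write end open, it would never see EOF

        /*run code for B*/

    }

}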
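For something that can be run as is, here is a reduced sketch of the same dup2 technique with a single fork, assuming a POSIX system. The parent reads the pipe directly instead of becoming b, and the child writes a fixed string instead of running a real program; capture_child_stdout is a made-up name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child whose stdout is redirected into a pipe, then reads
   everything the child wrote. Returns the byte count, or -1 on error. */
ssize_t capture_child_stdout(char *buf, size_t size)
{
    int fd[2];
    if (pipe(fd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                 /* child plays the role of a */
        dup2(fd[1], 1);             /* stdout now feeds the pipe */
        close(fd[0]);
        close(fd[1]);
        if (write(1, "hello\n", 6) != 6)   /* stands in for a's output */
            _exit(1);
        _exit(0);
    }

    close(fd[1]);                   /* parent must not hold a write end */
    size_t total = 0;
    ssize_t n;
    while (total < size && (n = read(fd[0], buf + total, size - total)) > 0)
        total += (size_t)n;
    close(fd[0]);
    waitpid(pid, NULL, 0);          /* reap the child */
    return (ssize_t)total;
}
```

The close(fd[1]) in the parent matters: without it the read loop would never see EOF, which is the "forgets to close its end" failure described earlier.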

Interrupts and Signals

How should the kernel deal with running out of power?

Take a snapshot of RAM and registers (copy to disk)

When power is restored, continue programs as if power never went out
Problems with this:

Network connections won’t work
Takes a long time to take a snapshot of RAM, potentially inconsistent snapshots
Time sensitive transactions won’t work
Certain sensitive sessions shouldn’t be restored - security issues

Need to inform user processes of this situation

Polling method

use a file “/dev/power”
If you read “!”, power is low. If you read “[_]”, power is ok.
Problems:

A checking mechanism needs to be added to every single program that cares about power

Pain to program
Chews up CPU time
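A rough sketch of what each program would have to carry under the polling method. The file /dev/power is the hypothetical device from the notes, not a real one, so the helper below takes an already-open descriptor. It assumes one status byte per read and treats anything other than '!' as "power is ok"; the name power_is_low is made up.

```c
#include <unistd.h>

/* Reads one status byte from fd. Returns 1 if power is low ('!'),
   0 if it is fine, and -1 if nothing could be read. */
int power_is_low(int fd)
{
    char c;
    if (read(fd, &c, 1) != 1)
        return -1;
    return c == '!';
}
```

Every long-running loop in every program would need a call like this on each pass, which is where both the programming pain and the wasted CPU time come from.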

Signals method

Kernel “magically” inserts a check into your program when power is low (through Signal Delivery)
E.g. the kernel delivers a signal to the program, which interrupts it and runs a signal handler such as a “fix_situation” function
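A minimal sketch of installing such a handler with POSIX sigaction. SIGUSR1 is used here only as a stand-in for a power-failure signal, and the names power_low and install_power_handler are made up; fix_situation is the handler name from the notes. The handler just sets a flag, since only async-signal-safe work should be done inside a handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler; the main program checks it at a convenient time. */
volatile sig_atomic_t power_low = 0;

/* Plays the role of fix_situation: runs when the signal is delivered. */
static void fix_situation(int sig)
{
    (void)sig;
    power_low = 1;   /* only async-signal-safe work belongs here */
}

/* Installs the handler for SIGUSR1, standing in for a power signal.
   Returns 0 on success, -1 on failure. */
int install_power_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = fix_situation;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}
```

Unlike the polling version, the program pays nothing until the signal actually arrives; it can be tried out with raise(SIGUSR1) after calling install_power_handler().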
