Scribe notes brought to you by Alina Chen & Matthew DeCoste

The Big Idea of Unix: the File Model

Table of Contents

› What can go wrong with reading and writing files
› Creating temporary files
› Locking Files
› Pipes a | b
› Interrupts and Signals
› Summary

File: an array or stream of bytes
Files are represented by file descriptors.
You can read from it (e.g., read(fd,buf,sizeofbuf))
Apparently read isn't used enough, but let's move on.

Pros:	Cons:
simple	not everything is conveniently modeled as a file
orthogonal: doesn't interfere with other structures	e.g., keyboard: `read(kb,buf,sizeofbuf)` makes sense - read bytes from what was typed & put into buffer `lseek(kb,10000,SEEK_SET)` doesn't make sense - position the offset of the keyboard 10000 bytes?

As a result, there are 2 kinds of files:

Stream-oriented (e.g., I/O, keyboard)	Random access: (e.g, disks)
spontaneous data generation	request/response approach
infinite input	local storage
`lseek` doesn't work here	finite capacity

The read function works on either type of file.

What can go wrong with reading and writing files

Writing to a closed file descriptor:

Example code	What happens
`close(1); int i = write(1,"x",1); if (i < 0)` `print(strerror(errno));`	write fails, returns -1 to `i` `strerror(errno);` stores details

Reading from a closed file descriptor (closing, opening, then reading):

Example code	What happens
`int fd = open("foo", O_RDONLY); close(fd); int fd1 = open("bar", O_RDONLY); read(fd, buf, sizeofbuf);`	`read` is actually reading from `bar` this is known as aliasing

File descriptor leaks:

Example code	What happens
`1 for (i=0; i<N; i++) 2 { 3 int fd = open(file[i], O_RDONLY); 4 if (fd < 0) 5 { 6 error(); 7 read_and_copy(fd); 8 } 9 }`	file descriptors were never closed your process will run out of resources/memory

Removable devices (USB flash drive):

Example code	What happens
`int fd = open("/dev/usb/flash01", O_RDONLY);` physically remove flash drive at this point in execution `read(fd, buf, sizeofbuf);`	`read` returns -1 with special errno (about lack of flash drive)

Accessing unlinked file (not visible in file system directory):

Example code	What happens
`int fd = open("/tmp/foo", O_RDONLY); unlink("/tmp/foo"); read(fd, buf, sizeofbuf);`	this succeeds: the kernel still has the file open, even though it was removed from the file system the file is still on disk, so it is still possible to read from it

Creating temporary files

NOTE: multithreading will mess this all up so we're assuming this is a single-threaded process.

Let's try using a system call...ideally it would work like this:

int fd = mktempfile(); //file won't be visible to file system/directory
...
close(fd); //reclaim memory

...Unfortunately, this system call doesn't actually exist.

We can try using a library function instead, but it won't be safe.
Here's a first version of the code:

1	 int mktmpfile(void)
	
2  {
	
3      int fd = open("/tmp/foo", O_RDWR | O_CREATE | O_TRUNC, 0600);	//trunc replaces old files with this file name; 0600 is "-rw-r-----"
	
4      if (fd >= 0) //as opposed to 0666 ("-rw-rw-rw-") due to security issues
	
5          unlink("/tmp/foo");
	
6      return fd;
	
7  }

The unsafe features are the O_TRUNC call (because it deletes any possible previous /tmp/foo file and the fact /tmp/foo does appear in the directory for a brief moment. This can cause a race condition with multiple function calls since the file would already be in existence during one of the calls.

Knowing these issues, here's a safer version of the code:

1  int mktempfile(void)
	
2  {
    
3      struct stat st;
    
4      if (stat("/tmp/foo", &st) == 0)
	
5      {
    
6          errno = EEXIST;    //EEXIST means the file already exists
    
7          return -1;
    
8      }
    
9      char foo[sizeof "/tmp/foo" + sizeof(pid_t)*8];
    
10     sprintf(foo, "/tmp/foo%lld", (long long) getpid());    //attaches pid to name of temporary file
    
11     int fd = open ("/tmp/foo", O_RDWR | O_CREATE | O_TRUNC, 0600);
    
12     if (0 <= fd)    //code in the real world will leave this line out to optimize speed
    
13         unlink("foo");
    
14     return fd;
	
15 }

The race condition still exists between the first condition checking and the rest of the code, but it is narrower than before. The code also attaches the pid to the name of the file, which means multiple processes can call this function. Also, processes can create more than 1 temporary file even though they have the same pid because the name is unlinked after each file is created.
However, there is a problem, from the point of operations. Since the files are "invisible," we cannot see where the disk space is allocated to after unlinking. This means you don't know how much disk space you have left.

How is this relevant? The scenario given in class was something like this: You're writing a research paper while this function is being called multiple times in the background. When you finally click the Save button, a popup appears stating "No more disk space," although your directory hasn't changed from when you first started this paper. This occurs because the invisible files exist in your directory, and thus are taking up space, even if you can't see them.

Well, we can think of a solution to that. We remove the unlink code, so now all temporary files are visible.

Proceed to remove lines 12-13:


	    if (0 <= fd)
    
        unlink("foo");


	from the previous code.

Well now, what wasn't a problem before is now a problem! A process now cannot have multiple files, because they will have the same name.

Let's try fixing this...how about a random name generator?

1  int mktempfile(char *name, size_t size)
	
2  {
    
3      struct stat st;
    
4      do {
    
5          generate_random_file_name(name, size); // doesn't exist, but you can create it pretty easily
    
6      } while(stat(name, &st) == 0);
    
7      int fd = open(name, O_CREAT | O_RDWR | O_TRUNC, 0600);
    
8      return fd;
	
9  }

Welp. Now we have the same race condition as our first attempt...again. This time they appear during the while loop condition and open function call (line 6-7).

Our current set of primitives can't solve this. There's this new flag called O_EXCL. Basically, if the file exists, the call will fail.

We'll also replace the while loop with a different one so that we can avoid this failed call scenario.

1  int mktempfile(char *name, size_t size)
	
2  {
    
3      struct stat st;
    
4      int fd;
    
5      do {
    
6          generate_random_file_name(name, size);
    
7      } while((fd = open(name, O_CREAT | O_RDWR | O_EXCL, 0600)) < 0 && errno == EEXIST); // can also check other errnos for this loop
    
8      return fd; // we can unlink later
	
9  }

As you can see, it's difficult to find a robust solution, due to the waterbed effect - when you fix 1 problem, another appears in a different area.

There might be another way to solve this, but it's tricky and not generally used for this type of problem. In any case, this system is called locks. Here's a brief section about it.

Locking files

Locking primitives exist in UNIX/POSIX, but are advisory and rarely used. Shells don't use these. Databases do. If you control almost everything, you can use this.

Use fcntl(fd, cmd, p), where cmd is replaced by:

FSETLK: get a lock, fails if locked
F_SETLKW: set a lock, wait for unlocking
F_GETLK: checks if it is locked

The placeholder p is for:

struct float *
short, l type (regions)


			l_len
			l_short

There's more locks. These are called leases, where the process holding the lease (use F_SETLEASE(arg) to set it) is notified when a file tries to open or change the file size (truncate). The arg can be one of the following:

F_RDLOCK: calling process is notified when the file is opened for writing or is truncated; can only be placed on a read-only fd
F_WRLOCK: calling process is notified when the file is opened for writing or is truncated; can only be placed on a write-only fd
F_UNLOCK: remove lease from file

The rest are on the man page.

Pipes a | b

"It-sa me! Mario!"

Pipes are bounded buffers. If they were unbounded, they could potentially take up the whole disk space.

Pros:	Cons:
a and b run in parallel	no checkpoints or restarts
no space required to hold temporary data	multiple read-ins are a pain
better caching: doesn't need to go into disk (by redirection) and then be pulled off later	b can't use `lseek`

Here's what they might look like in a diagram:

Another thing to note: redirection (< >) precedence is higher than pipe precedence.

What can go wrong when using a pipe

Scenario:	What happens:
b reads an empty pipe	b waits
a tries to write to full pipe	a waits
a writes, but b closed its end of the pipe	any of the following is possible: `write` returns -1, errno is set to `ESPIPE` default: a gets a `SIGPIPE` signal: works for lazy programs non-default: some applications will use up CPU if written in a loop (e.g., `while(...) printf("error");` - `printf` keeps failing
b reads, but a closed its end of the pipe	any of the following is possible: `read` returns 0 (EOF) this is normal behavior - it's how the closed read signifies that read is done
a is done generating output, but forgets to close its end of the pipe	`read` hangs forever

There are 3 configurations we can consider a pipe a|b is implemented in the shell. They are described by the tree diagrams below. Each node is the parent of the one below it.

Pipes are implemented something like this, which rules out the first possibility.

1  int fd[2];
	
2  int ap = fork();
	
3  pipe(fd);
	
4  int bp = fork();

Also, we want the exit status of the pipe to be set to the exit status of b. Doing this with the 2nd possible configuration would be too much work, so we use the 3rd configuration. The rewritten code looks like this:

1  int fd[2];
	
2  int bp = fork();
	
3  if (bp == 0) //child
	
4  {
	
5      pipe(fd);
	
6      int ap = fork();
	
7      if (ap == 0) //grandchild
	
8          ...run a...; //clone fd[1] into STDOUT (1)
	
9      else
	
10         ...run b...; //clone fd[0] into STDIN (0)

Use dup2(i,j) to clone. Remember to close your pipes after you're done using them.
For example, if a exits and b reads(0,...), then the program can hang if b forgets to close fd[1].

Interrupts and Signals

Signals are ways to force code to respond to run-time situations.

Here's an example: you're up late one night, writing some last-second code, when the power goes out. Luckily, your desktop is attached to a UPS, so it hasn't completely lost power yet. Unfortunately, your electron-guzzler of a PC will drain the UPS in a matter of minutes. Ideally, this is where your computer comes in to save the day. The kernel needs to take a snapshot of RAM and store it on disk (for recovery after power is restored). When the power does eventually return, you would hope the kernel would restore everything to its rightful place, and that your various programs (and precious code) will return in exactly the same state as before.

So, will it? Well, not exactly. Obviously, network conditions and connections will not be the same upon revival, which could mess up some programs ~~(cough EA games cough)~~. There are also security implications if too much sensitive information is restored onto the screen in a new, insecure context. However, it's not the kernel's responsibility to address each and every possible consequence of this necessary action. At some point, user processes must be informed of the issue and allowed to take appropriate action.

So how do you alert user processes?

Essentially the question is how to place new code into the already-existing user code to make it handle a new situation. There are two ways to do this:

Pulling Solution (continuously reading from status file)	Pushing Solution (receiving, handling signals)
Continuously reads from file. May run code depending on load	Handles signals sent to it by kernel with `signal_handler(int sig)`
Very inefficient: continuous loads by programs interested in status	Efficient: only runs extra code when told to
Pain to program repetitive loads	Pain to keep track of possible race conditions
Eats up CPU time	Less taxing on CPU

Signals are essentially like the kernel inserted the pulling (polling) check for you, but only when the value is actually of use.

Summary

› A file is an array or stream of bytes. They are represented by integer identifiers called file descriptors. Files can be either stream-oriented (like receiving input from the keyboard) and random access (like those read off disk).
› Creating a temporary file is a process that must be handled with care to account for a variety of possible errors.
› Files can be locked, but it is purely a suggestive system that is not really enforced.
› Pipes are bounded buffers that allow you to redirect output from one process into the input of another.
› Interrupts are used to alert code to a change that urgently requires its attention.