Scribe Notes Lecture 6

Scribes: Jayendra Jog, Adam Jones, Kaitlyn Cason, Yilei Zhang

Operating System Revisited: Processes and Files

We want to organize an operating system that supports processes and files. How do we implement it?

We want to build programs/processes that operate on fake machines.

Last time, we saw what this organization looks like from the user's view. But what about in kernel mode?
The files we show to applications are pretend files, and the processes are pretend machines.
This is because the underlying hardware doesn't natively provide processes or files, so the kernel has to simulate both.
From the user's point of view, a process is a program that's running on a machine with the following properties. In reality, the actual machine doesn't work exactly how the user thinks it works.
Property: ALU
  What the user might think happens: it does basic arithmetic at a very low level (adds, subtracts, bit-shifts, etc.)
  What actually happens: same as what the user thinks

Property: Registers
  What the user might think happens: registers keep track of a small amount of data for very fast access
  What actually happens: the kernel simulates registers, saving the ones not currently needed in memory

Property: Memory
  What the user might think happens: all data goes to the RAM and comes back as efficiently as possible
  What actually happens: commonly accessed data is kept in RAM, and less commonly accessed data may instead be kept on disk. This tradeoff can decrease efficiency, but it lets you store more data than physically fits in RAM. At a high level, this is how virtual memory works.

Property: Input/Output
  What the user might think happens: input and output are requested through syscalls
  What actually happens: the kernel handles everything and controls how all input and output occurs

Kernel Memory

How does memory in the kernel work?

Kernel memory is well protected and typically inaccessible to user code. The kernel needs to keep track of every process it is simulating, so that future steps of each simulation stay consistent with past ones. Generally, a process table is used to keep track of all of this information.

Process Table: This is arranged as an array, indexed by process. Each entry is called a process descriptor. The table keeps track of all the processes that are trying to run, with enough information about each one that the kernel can respond to it intelligently. The process descriptor itself is a fairly long data structure that contains a copy of the user-visible registers (including the rsp), as well as a file descriptor table with 1024 entries.

The kernel keeps a copy of each simulated process's user registers. When it hands control back to a process (for example via the x86 iret instruction, or the newer sysexit), it grabs that register copy out of RAM and then resumes the correct process.

The process table contains the things about a process that you need to know in order to execute it. There are, however, some things that the process descriptor doesn't contain:
  1. ALU state: the ALU has no persistent state, so nothing about it needs to be stored in the process table. When a context switch occurs, it's easier to simply throw away in-flight calculations and redo them later than to store them.
  2. Program memory: this is too big to store in kernel memory. It is instead encapsulated by a single 64-bit pointer, saved as a register value in the process descriptor.
The process descriptor also has a file array of 1024 entries, with a pointer in each slot that can be accessed indirectly as long as the index is specified.
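Putting that together, a process descriptor might look roughly like the following C sketch. The field names here are illustrative guesses, not the layout of any real kernel's descriptor (Linux's actual version, task_struct, is far larger):

    #define NFDS 1024

    struct filedes;                     /* the kernel's open-file object */

    struct process_descriptor {
        unsigned long regs[16];         /* copy of the user-visible registers */
        unsigned long rsp;              /* saved stack pointer */
        unsigned long memory;           /* 64-bit pointer encapsulating the
                                           process's program memory */
        struct filedes *fds[NFDS];      /* file descriptor table: 1024 slots */
    };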

Handles

Introduction

Linux/Unix uses an ‘int’ as the handle for an open file: you need a file descriptor in order to access the file. Other OSes use a ‘struct filedes *’ pointer instead. The advantage of having an ‘int’ as a file descriptor is that the kernel has a greater ability to move things around behind it. The advantage of using a struct pointer is that it provides more information, better performance, better control, and type checking. It is, however, less portable and violates orthogonality.

Definition of a file descriptor

A file descriptor is, in effect, a pointer to a file description: the kernel object that records the state of an open file, such as the current file offset and status flags. File descriptors can be passed between processes, for example by being inherited across fork.
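As a small sketch of both points (the handle is just a small int, and it can be passed to another process), consider the following; /etc/passwd is chosen arbitrarily. The child inherits the parent's descriptor and shares the same file description, including the file offset:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("/etc/passwd", O_RDONLY);   /* fd is a small int, e.g. 3 */
        if (fd < 0)
            return 1;
        if (fork() == 0) {
            /* Child: inherits the same descriptor, sharing one file
               description (and hence one file offset) with the parent. */
            char buf[64];
            ssize_t n = read(fd, buf, sizeof buf);
            printf("child read %zd bytes via fd %d\n", n, fd);
            _exit(0);
        }
        wait(NULL);
        return 0;
    }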

Pipe

What can go wrong here?

1. Writing to a pipe that is full, because all the readers are busy doing something else. The solution is for the kernel to suspend the writer until a reader drains the pipe; pipes are designed for applications that normally run quickly, so this wait is usually short.

2. Writing to a pipe that has no reader. Here the previous solution fails: we cannot just hang the writer, because the pipe will never be emptied, so the process would be suspended forever.

$ cat /etc/passwd | :

In this example “:” does nothing but exit with status 0, leaving cat to write into a pipe with no reader. Rather than hanging cat forever, Linux sends the writer a SIGPIPE signal (which kills it by default) and makes the write fail. The defensive solution on the writer's side is to check the return value of output calls (see the sketch after this list):

        if (printf("Hello") < 0)
            /* handle the failed write */

3. Reading from an empty pipe. If some writer still has the pipe open, the reader simply waits for data to arrive. If there's no writer, waiting would hang forever; the convention instead is for read to return 0, which stands for EOF.
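The sketch promised in item 2, assuming the writer wants to survive a vanished reader: ignore SIGPIPE so that writing to a readerless pipe returns -1 with errno set to EPIPE instead of killing the process, then check the return value. Depending on timing, running it as ./writer | : can take the failure path:

    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void) {
        /* Without this, a write to a readerless pipe kills us via SIGPIPE;
           with it, write returns -1 and sets errno to EPIPE instead. */
        signal(SIGPIPE, SIG_IGN);

        const char *msg = "Hello\n";
        if (write(STDOUT_FILENO, msg, strlen(msg)) < 0) {
            if (errno == EPIPE)
                fprintf(stderr, "reader went away\n");
            return 1;
        }
        return 0;
    }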

Named Pipe Demo

So far we have been talking about ordinary pipes. There are also named pipes, which live in the file system: any process with permission to open the file can open the pipe and read from or write to it.

$ mkfifo /tmp/pipe
$ cat /tmp/pipe > out &
$ echo Hello > /tmp/pipe
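The same demo can be done from C with mkfifo(3). This sketch plays the role of the echo command above, using the same /tmp/pipe path; open blocks until some reader (such as the backgrounded cat) opens the other end:

    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        /* Create the named pipe; this fails harmlessly if it already exists. */
        mkfifo("/tmp/pipe", 0666);

        /* Opening a FIFO for writing blocks until a reader opens it too. */
        int fd = open("/tmp/pipe", O_WRONLY);
        if (fd < 0)
            return 1;
        write(fd, "Hello\n", 6);
        close(fd);
        return 0;
    }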

Pipe Leakage

Kernel resources can leak if a pipe's file descriptors are never used or closed. Suppose we have a program that creates a pipe and doesn't use it:

int fd[2];
int ok = pipe(fd) == 0;
for (;;) {
    // something else happens; fd[0] and fd[1] are never used or closed
}

What would happen? Nobody else can access the pipe; only your file descriptor table mentions it. So you've essentially created a buffer in the kernel, but the kernel doesn't know that you're never going to use it. You've wasted system resources (this is akin to a memory leak). This is called a file descriptor leak.
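The fix is simply to close both ends as soon as you know you won't use them, which releases the kernel buffer. A minimal sketch:

    #include <unistd.h>

    int main(void) {
        int fd[2];
        if (pipe(fd) != 0)
            return 1;

        /* We changed our minds about using the pipe: give the kernel
           its buffer back by closing both descriptors. */
        close(fd[0]);
        close(fd[1]);

        for (;;) {
            // something else happens, with nothing leaked
        }
    }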

You can also create a pipe and mistakenly set it up so that there is a writer that will never write anything to the pipe:

$ while :; do :; done | cat

The loop spins forever without producing any output, so cat waits forever for input that never comes. The same trap exists in C: after a fork, a parent process needs to close its own copy of the pipe's write end once it's done with it; otherwise, even after the child closes its end, a reader will never see EOF and will wait in an infinite loop. The same goes for unused read ends.
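A minimal sketch of the rule: each process closes the pipe ends it doesn't use, so the reader's read returns 0 once the lone writer finishes:

    #include <unistd.h>

    int main(void) {
        int fd[2];
        char buf[64];
        ssize_t n;

        if (pipe(fd) != 0)
            return 1;

        if (fork() == 0) {            /* child is the only writer */
            close(fd[0]);             /* close the read end it won't use */
            write(fd[1], "hi\n", 3);
            close(fd[1]);
            _exit(0);
        }

        close(fd[1]);                 /* parent MUST close its write end... */
        while ((n = read(fd[0], buf, sizeof buf)) > 0)
            write(STDOUT_FILENO, buf, n);
        /* ...or this loop would never see EOF and would hang forever. */
        close(fd[0]);
        return 0;
    }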

Sorting Example

The parent process talks to and from a sort process over two pipes. The setup: fork(), pipe(), pipe(), a bunch of dup2() calls, then execvp() to run sort, as sketched below. The parent sends a copy of its data to sort; sort then ships the sorted copy back to the parent. This works because sort doesn't write any data until it has read everything it needs to.
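A minimal sketch of that setup, with most error checking omitted; the sample data is made up for illustration:

    #include <string.h>
    #include <unistd.h>

    int main(void) {
        int to_sort[2], from_sort[2];
        char buf[4096];
        ssize_t n;

        pipe(to_sort);                 /* parent -> sort */
        pipe(from_sort);               /* sort -> parent */

        if (fork() == 0) {
            /* Child: wire the pipe ends onto stdin/stdout, close the
               leftover descriptors, then become sort. */
            dup2(to_sort[0], 0);
            dup2(from_sort[1], 1);
            close(to_sort[0]);
            close(to_sort[1]);
            close(from_sort[0]);
            close(from_sort[1]);
            char *argv[] = { "sort", NULL };
            execvp("sort", argv);
            _exit(127);                /* only reached if exec failed */
        }

        /* Parent: close the ends it won't use. */
        close(to_sort[0]);
        close(from_sort[1]);

        /* Ship the data, then close so sort sees EOF and starts writing. */
        const char *data = "banana\napple\ncherry\n";
        write(to_sort[1], data, strlen(data));
        close(to_sort[1]);

        while ((n = read(from_sort[0], buf, sizeof buf)) > 0)
            write(STDOUT_FILENO, buf, n);
        close(from_sort[0]);
        return 0;
    }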

For a program like sed, reads and writes are interleaved, and both processes can hang: you'd end up with two full pipes, each process waiting for the other to make space so that it can write, which is a deadlock. (Whether you actually hit it depends on timing, so this is also a race condition.) If sed is writing more than it's reading, the pipe going to the parent fills up, then the pipe going to the child fills up, and both processes freeze.


Orthogonality Question

The idea in question is: you've got a big file. You remove it, but then grep something. The grep command will still find the interesting data even though bigfile was removed.

$ rm bigfile | grep "interesting" < bigfile;

Why? File descriptors access files at a lower level than file names, so you can access a file that has no name. (Pipes do this! A pipe is a nameless file.) Here the shell opens bigfile for grep before the commands run, so grep keeps reading through its descriptor even after rm removes the name. By the end of the ‘;’, the last descriptor has been closed, and only then is the data removed.

A file's storage won't be reclaimed until its name is removed from the file system and all the file descriptors pointing at it have gone away; all the readers must leave first. In this respect, files are treated a bit like pipes.
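The same idea in C, as a small sketch (it assumes a file named bigfile exists in the current directory):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("bigfile", O_RDONLY);
        if (fd < 0)
            return 1;

        unlink("bigfile");                       /* the name is gone... */

        char buf[64];
        ssize_t n = read(fd, buf, sizeof buf);   /* ...but the data isn't */
        printf("read %zd bytes from the now-nameless file\n", n);

        close(fd);    /* last descriptor closed: storage can be reclaimed */
        return 0;
    }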

Signals

It is worth noting that there are operating systems in which signals have been discarded. Why? Because signals can be troublesome.

What's the point?

Signals exist to notify a process of unusual events. Typical uses:

Asynchronous Events in I/O: the kernel can send SIGIO when input becomes ready.

Error in Your Code: e.g. SIGSEGV for a bad memory reference, or SIGFPE for dividing by zero.

Impatient User or Infinite Loop: typing ^C sends SIGINT.

Impending Power Outages: SIGPWR gives the process a moment to react.

Children Management: SIGCHLD tells a parent that a child has finished; the parent can then reap it without blocking:

p = waitpid( -1, &status, WNOHANG );

User Went Away: SIGHUP says the controlling terminal hung up.

Shell Bombs: a runaway process group can be killed all at once; a negative ID tells kill to signal the whole process group:

$ kill -KILL -23916

Alarms: the library call alarm(20) asks the kernel to send this process a SIGALRM signal in 20 seconds.

In general, signals can also be used to control processes:

To suspend a process (where 29 is the PID):

$ kill -STOP 29

To resume a process (where 29 is the PID):

$ kill -CONT 29
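The same thing can be done from C with the kill(2) system call; the PIDs below are just the ones from the examples above:

    #include <signal.h>

    int main(void) {
        kill(29, SIGSTOP);        /* suspend process 29 */
        kill(29, SIGCONT);        /* resume it */
        kill(-23916, SIGKILL);    /* negative pid: signal every process
                                     in process group 23916 */
        return 0;
    }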

Note that there is no SIGEOF signal. This event (an EOF) isn't unusual enough for Linux, so we don't see a signal for it: you should ordinarily be able to deal with a finished file in the normal flow of your code. However, determining what counts as an "unusual" event is really just a judgement call.

Receiving Signals

We have a function for this!

sighandler_t signal( int, sighandler_t )

This function takes in an integer (the signal number) and a pointer to a function that will handle that signal, and it returns a pointer to the old handler.

sighandler_t is defined as:

typedef void (*sighandler_t)(int);

In other words, a handler is a function that takes an int (the signal number) and returns void. This is a sort of callback mechanism.

Once a signal handler has been registered, it can be called at any time in the process, whenever the system feels like it.

For example, somewhere in our code we want to handle signal 29 with function f:

signal(29, f);
x = y + 1;
x = x + 1;

It's possible that right before the second piece of arithmetic, the machine will call f(29).

Basically, in Eggert's words, we see something happening at the machine level. If you have a bunch of instructions, between any pair of instructions there is some extra secret (up until now) code that essentially says, "If I feel like it, let's call f(29)."

When we consider this, then a signal handler can be invoked between any pair of instructions.

Back to our example above: if our handler f modifies the variable x, then x will be modified in the middle of an expression!

Signal handling is a great way to introduce race conditions into your code!
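To make the race concrete, here is a self-contained sketch. It uses SIGALRM and alarm() in place of the lecture's signal 29 so it can actually run, and sig_atomic_t because that is the only type guaranteed safe to modify from a handler:

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    /* sig_atomic_t is the only type safe to write from a signal handler */
    volatile sig_atomic_t x;

    static void f(int sig) {
        x = 100;                       /* can run between ANY two instructions */
    }

    int main(void) {
        signal(SIGALRM, f);            /* SIGALRM stands in for signal 29 */
        alarm(1);                      /* deliver it about one second from now */

        for (long i = 0; i < 500000000L; i++) {
            x = x + 1;                 /* if f fires between this line... */
            x = x + 1;                 /* ...and this one, its x = 100 is
                                          immediately clobbered: a race */
        }
        printf("x = %d\n", (int)x);    /* the answer depends on when f ran */
        return 0;
    }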

Atomicity of Functions

We want gzip to act as an atomic operation (all or nothing).

Here's the command:

$ gzip foo

Here's some standard code for gzip:

    gzip.c
    fd = open("foo", O_RDONLY);
    fo = open("foo.gz", O_WRONLY | O_CREAT, 0666);
    while (compress(fd, fo))
        continue;
    close(fo);
    unlink("foo");

This is pretty standard. But this isn't atomic.

If you were to ^C in the middle of this process, you'd end up with a partially created foo.gz file that wasn't properly compressed.

How do we fix this? We need to account for interrupt signals and do some cleanup.

Our first guess might look something like this:

    gzip.c
    static void cleanup(int sig);   /* declared up front so signal() can use it */

    fd = open("foo", O_RDONLY);
    signal(SIGINT, cleanup);
    fo = open("foo.gz", O_WRONLY | O_CREAT, 0666);
    while (compress(fd, fo))
        continue;
    close(fo);
    unlink("foo");

    static void cleanup(int sig) {
        unlink("foo.gz");
        _exit(1);
    }

This would indicate a failure and do the cleanup. But this is incomplete. Why? We have a race condition.

If you ^C right after registering the signal handler, but before foo.gz is created, cleanup might be removing a foo.gz that doesn't exist!

And what about when the signal comes after you've successfully made and compressed foo.gz? We don't want cleanup to delete that copy of the data!

We have a proposed solution (right as Eggert was wrapping up):

      gzip.c

      static void cleanup(int sig) {
          unlink("foo.gz");
          _exit(1);
      }

      int main(void) {
          fd = open("foo", O_RDONLY);
          signal(SIGINT, cleanup);
          // CTRL-C RACE HERE
          fo = open("foo.gz", O_WRONLY | O_CREAT, 0666);
          while (compress(fd, fo))
              continue;
          close(fo);
          signal(SIGINT, SIG_DFL);   // foo.gz is complete: stop cleaning up
          unlink("foo");
          // RACE HERE fixed?
      }

Note: This scenario will be continued in Wednesday's lecture.