Signals, Scheduling, and Threads
Last time:
- O_TRUNC - Used to truncate an existing file
- O_CREAT - Used to create file if it doesn't exist
In the last lecture, the professor mentioned two flags used with open: O_TRUNC and O_CREAT. O_TRUNC truncates an existing file, while O_CREAT creates the file if it doesn't exist.
Unix solution: O_EXCL
O_EXCL is a flag that causes open to fail if the file already exists, so two processes can never both believe they created the same file. O_EXCL is an example of a non-orthogonal flag: it only makes sense when combined with O_CREAT, and its behavior without O_CREAT is generally undefined.
Example:
fd = open("/tmp/foo", O_RDWR | O_CREAT | O_EXCL, 0666);
If fd == -1, open has returned an error. errno == EEXIST if the file already exists.
A sort program would use something like this:
while( (fd = open("/tmp/foo", O_RDWR | O_CREAT | O_EXCL, 0666)) == -1 && errno == EEXIST)
    sleep(1);
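The retry loop above can be wrapped in a small helper. This is only a sketch: the name create_exclusive and the max_tries limit are assumptions added here for illustration, not part of the lecture.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create `path` only if it does not exist yet.
 * While the failure is EEXIST, retry up to `max_tries` times, sleeping
 * one second between attempts. Returns the new fd, or -1 with errno set. */
int create_exclusive(const char *path, int max_tries)
{
    assert(path != NULL && max_tries >= 1);
    for (int attempt = 0; attempt < max_tries; attempt++) {
        int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0666);
        if (fd != -1)
            return fd;              /* we are the process that created it */
        if (errno != EEXIST)
            return -1;              /* some other error: give up at once */
        if (attempt + 1 < max_tries)
            sleep(1);               /* someone else owns the name; wait */
    }
    errno = EEXIST;                 /* the file never went away */
    return -1;
}
```

A caller that wants the lecture's behavior of waiting indefinitely would pass a large max_tries; a bounded count just keeps the sketch from hanging forever.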
Sidenote: temporary files
If we only want to use temporary files, why not have a syscall for creating them? Here's a suggested syscall:
fd = opentmp(flags);
Benefits of opentmp:
- OS can determine the file location
- OS can put the temporary files on a faster device (this improves performance)
- Since temporary files are nameless, the OS can clean up afterwards automatically
- OS does not need to make the file persistent
However, it is harder for related processes to share temporary files. It is also harder to manage resources, because the usual tools for managing files do not apply to nameless temporary files.
Unix/Linux approach:
- opentmp() is a library call implemented on top of open()
- This is an orthogonal approach
- mkstemp(): Unix's way of making temporary files
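As a rough sketch of how mkstemp() can give most of the benefits listed for opentmp, the helper below creates a temporary file and removes its name right away, so the file is cleaned up automatically when the descriptor is closed. The name open_scratch_file is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: open a nameless scratch file.
 * `template_path` must be a writable buffer ending in "XXXXXX". */
int open_scratch_file(char *template_path)
{
    assert(template_path != NULL);
    /* mkstemp fills in the XXXXXX and opens the file with O_CREAT | O_EXCL */
    int fd = mkstemp(template_path);
    if (fd == -1)
        return -1;
    /* Remove the name now: the data lives until the last fd is closed */
    if (unlink(template_path) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Note that the template must be a char array, not a string literal, because mkstemp writes the generated name back into it.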
File locking (in Unix):
A function for locking files:
fcntl(fd, cmd, P)
- fd is a file descriptor.
- cmd is the lock operation to perform. Here are three possible commands for cmd:
  - F_SETLK: Attempt to obtain a lock. fcntl fails immediately if a conflicting lock is held
  - F_GETLK: Find out about the existing lock(s)
  - F_SETLKW: Attempt to obtain a lock. If a conflicting lock is held, wait until it is released
- P is of type struct flock*. struct flock has the following fields:
  - l_type is the type of lock. Possible values are:
    - F_RDLCK: read lock
    - F_WRLCK: write lock
    - F_UNLCK: no lock (unlock)
  - l_start is the starting offset of the locked region
  - l_len is the length of the locked region
  - l_whence says how to interpret l_start (SEEK_SET, SEEK_CUR, SEEK_END)
  - l_pid is the process ID of the lock owner (filled in by F_GETLK)
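Putting the fields together, a call might look like the sketch below. The helper name lock_region and its parameter order are assumptions made for this example; only fcntl and struct flock come from the system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: apply `type` (F_RDLCK, F_WRLCK, or F_UNLCK) to the
 * bytes [start, start + len) of fd, measured from the start of the file.
 * len == 0 means "through end of file". If `wait` is nonzero, block until
 * the lock is granted (F_SETLKW); otherwise fail at once (F_SETLK).
 * Returns 0 on success, -1 with errno set on failure. */
int lock_region(int fd, short type, off_t start, off_t len, int wait)
{
    assert(type == F_RDLCK || type == F_WRLCK || type == F_UNLCK);
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;     /* l_start is an absolute offset */
    fl.l_start = start;
    fl.l_len = len;
    return fcntl(fd, wait ? F_SETLKW : F_SETLK, &fl);
}
```

For instance, lock_region(fd, F_WRLCK, 0, 10, 0) tries to write-lock the first ten bytes without waiting, and lock_region(fd, F_UNLCK, 0, 10, 0) releases them.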
File locking properties:
- Write locks are exclusive, meaning only one process can hold a lock on a given section of a file.
- Read locks on the same part of a file can coexist.
- Locks are advisory: they don't affect whether a read or write succeeds. Only processes that ask for a lock are held back.
- Another solution is to make locks mandatory:
  - Pro: This catches buggy programs
  - Con: Slows down read/write a bit
  - Con: Makes standard access a pain
  - Con: Complicates all programs
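To see what "advisory" means in practice, the sketch below has a parent hold a write lock on a whole file while a child first asks for the same lock and then simply writes anyway. The function name advisory_lock_demo and the bit-mask return convention are assumptions for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to write-lock the whole file without blocking. */
static int try_write_lock(int fd)
{
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;     /* l_start = 0, l_len = 0: whole file */
    return fcntl(fd, F_SETLK, &fl);
}

/* Hypothetical demo. Returns a bit mask reported by the child:
 *   bit 0: the child's F_SETLK was refused (the parent holds the lock)
 *   bit 1: the child's plain write() succeeded anyway
 * Returns -1 on setup failure. */
int advisory_lock_demo(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd == -1)
        return -1;
    if (try_write_lock(fd) == -1) {
        close(fd);
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {             /* child: locks are not inherited by fork */
        int result = 0;
        int cfd = open(path, O_RDWR);
        if (cfd == -1)
            _exit(0);
        if (try_write_lock(cfd) == -1)
            result |= 1;        /* a polite process would stop here */
        if (write(cfd, "x", 1) == 1)
            result |= 2;        /* an impolite one is not prevented */
        _exit(result);
    }
    int status;
    pid_t done = waitpid(pid, &status, 0);
    close(fd);                  /* closing the fd drops the parent's lock */
    if (done < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

On a system where the locks are advisory, the expected result is 3: the lock request is refused, yet the write goes through.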
When using F_SETLKW, there is a risk of running into a deadlock.
An example of a deadlock involves two files, A and B, and two processes, 1 and 2:
Process 1:
Step 1. lockw( file A )
Step 3. lockw( file B )
Process 2:
Step 2. lockw( file B )
Step 4. lockw( file A )
The system will catch this potential error. When process 2 reaches step 4, the kernel notices that waiting would complete a cycle, so the call fails: the syscall returns -1 with errno = EDEADLK.
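The four steps can be reproduced with fork and a pipe, as in the sketch below. It rests on some assumptions: the names lockw and deadlock_demo are made up here, deadlock detection is something POSIX permits but does not require (Linux does it for fcntl locks), and the alarm(5) call is only a safety net so the demo cannot hang on a system that does not detect the cycle. Whichever process makes the second blocking request is the one that gets EDEADLK.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Block until this process holds a write lock on the whole file. */
static int lockw(int fd)
{
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;     /* l_start = 0, l_len = 0: whole file */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Hypothetical demo: returns 1 if either process was refused with EDEADLK,
 * 0 if no deadlock was reported, -1 on setup failure. */
int deadlock_demo(const char *path_a, const char *path_b)
{
    assert(path_a != NULL && path_b != NULL);
    int ready[2];
    int a = open(path_a, O_RDWR | O_CREAT, 0600);
    int b = open(path_b, O_RDWR | O_CREAT, 0600);
    if (a == -1 || b == -1 || pipe(ready) == -1)
        return -1;
    if (lockw(a) == -1)                     /* step 1: process 1 locks A */
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                         /* process 2 */
        alarm(5);                           /* safety net: die if stuck */
        if (lockw(b) == -1)                 /* step 2: process 2 locks B */
            _exit(2);
        if (write(ready[1], "x", 1) != 1)   /* tell process 1 that B is held */
            _exit(2);
        int r = lockw(a);                   /* step 4: process 2 wants A */
        _exit(r == -1 && errno == EDEADLK ? 1 : 0);
    }

    close(ready[1]);
    char byte;
    int saw_deadlock = 0;
    if (read(ready[0], &byte, 1) != 1)      /* wait until step 2 is done */
        saw_deadlock = -1;
    else if (lockw(b) == -1 && errno == EDEADLK)  /* step 3: wants B */
        saw_deadlock = 1;
    close(a);                               /* drop A so process 2 can finish */
    close(b);
    close(ready[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0 || saw_deadlock == -1)
        return -1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 1)
        saw_deadlock = 1;                   /* process 2 got EDEADLK */
    return saw_deadlock;
}
```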
Another example:
gzip FOO (gzip and temporary files)
1. creates an empty file FOO.gz
2. compresses (reading from FOO, writing to FOO.gz)
3. closes FOO.gz
4. removes FOO
If there is an error on write, FOO.gz is removed and an error is reported.
Possible problems:
- Other processes could access FOO or FOO.gz
  - Solution: assume programs do the locking properly
- Two gzips running in parallel on the same file(s)
- Asynchronous events (e.g. power outage) leave junk
  - Solution: use signals to tell gzip to clean up
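The last solution could look roughly like the sketch below: a gzip-like program registers a handler that removes its half-written output before exiting. This is not gzip's actual code. The names install_cleanup and cleanup_demo, the fixed-size name buffer, and the 128 + signal exit status are all assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static char partial_output[256];    /* file to remove if a signal arrives */

/* Only async-signal-safe calls (unlink, _exit) are used in the handler. */
static void cleanup_handler(int sig)
{
    unlink(partial_output);
    _exit(128 + sig);
}

/* Hypothetical helper: remember `output_path` and arrange for `sig`
 * to delete it. Returns 0 on success, -1 on failure. */
int install_cleanup(const char *output_path, int sig)
{
    assert(output_path != NULL);
    if (strlen(output_path) >= sizeof partial_output)
        return -1;
    strcpy(partial_output, output_path);
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = cleanup_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}

/* Demo: a child creates `path`, installs the cleanup, then receives SIGTERM.
 * Returns 1 if the child exited via the handler and the file is gone. */
int cleanup_demo(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd == -1 || install_cleanup(path, SIGTERM) == -1)
            _exit(1);
        raise(SIGTERM);             /* stands in for an external kill */
        _exit(2);                   /* not reached if the handler ran */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 128 + SIGTERM
        && access(path, F_OK) == -1;
}
```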
Interrupts vs. Signals
Interrupts:
- Hardware signals handled at a very low level
- CPU consults interrupt vector
- Kernel does its thing:
  - Snapshots current process
  - Handles interrupt
  - Restarts process
Signals: A particular way the kernel notifies a process of an asynchronous event.
Possible Signal mechanisms:
- stick it into a "file": /dev/power
  - the file has a size of 1 byte:
    - 0 means there is no power
    - any value from 1 to 254 indicates the number of milliseconds of power left
    - 255 means that power is okay
  - applications must poll every 100 ms
- "power fail" thread
  - each process has a "power fail" thread that tries to read /dev/power
  - read hangs if you have power
  - read returns if power has failed
  - when the read returns, the process knows the power has failed
  - this creates more threads to manage
  - there will be a problem of thread synchronization within processes
- Signal handling in Unix
  - major change to our abstract machine
  - between any pair of instructions, a signal can occur
  - when a signal arrives while a signal handler is itself running, there are two choices:
    - signal handlers can be written to be reentrant (See "What is Reentrant?" below)
    - signals can be blocked while a signal handler is running
    - the second option is the default
  - a signal causes one of your program's functions to be called, and that function runs before the next instruction
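A minimal sketch of the Unix approach, using sigaction: the handler only records which signal arrived, and because SA_NODEFER is not set, the signal being handled stays blocked until the handler returns (the default choice described above). The names install_flag_handler and last_signal_seen are assumptions for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* sig_atomic_t can be written safely from a handler */
static volatile sig_atomic_t last_signal = 0;

/* Runs between two instructions of the interrupted code. */
static void record_signal(int sig)
{
    last_signal = sig;
}

/* Hypothetical helper: install record_signal for `sig`.
 * Returns 0 on success, -1 on failure. */
int install_flag_handler(int sig)
{
    assert(sig > 0);
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = record_signal;
    sigemptyset(&sa.sa_mask);   /* no extra signals blocked in the handler */
    sa.sa_flags = 0;            /* no SA_NODEFER: `sig` itself is blocked */
    return sigaction(sig, &sa, NULL);
}

/* Returns the most recent signal number recorded, or 0 if none. */
int last_signal_seen(void)
{
    return last_signal;
}
```

After install_flag_handler(SIGUSR1), a call to raise(SIGUSR1) returns only once the handler has run, so last_signal_seen() then reports SIGUSR1.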
Example:
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

void error(void); // assumed to be defined elsewhere: reports the failure and exits

void handle_child_signal( int sig ) { error(); }

void printdate( void ) {
    int status;
    pid_t p = fork();
    if(p < 0) error();
    if(p == 0) {
        char* const args[] = {"/bin/date", NULL};
        signal(SIGUSR1, handle_child_signal); // the only problem handled here is a very slow disk
        execvp(args[0], args);
        error(); // reached only if execvp failed
    }
    sleep(5); // normally, the child will have exited by now:
              // it's a zombie (data structure representing a dead child for the parent to waitpid)
    kill(p, SIGUSR1);
    if(waitpid(p, &status, 0) < 0) error();
}
What happens to /bin/date when it gets a signal?
- associated with each process is a table of signal handlers:
  - PWR: exit(226) // depends on signal
  - INT: exit and dump core
  - USR1: handle_child_signal()
  - CHLD: ignore
After the execvp, the signal handlers for USR1 and CHLD are changed back to their default actions, because the handler functions no longer exist in the new program image.
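The reset can be observed with a sketch like the one below. The parent installs a do-nothing handler for SIGUSR1, the child inherits it and then execs a program, and a SIGUSR1 sent after the exec kills the child because the default action is back in force. Assumptions: the function name is made up, a sleep utility is available on the PATH, and a close-on-exec pipe is used only to make sure the signal is sent after the exec has happened.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A handler that lets the process survive SIGUSR1. */
static void survive_usr1(int sig)
{
    (void)sig;
}

/* Hypothetical demo: returns 1 if the exec'ed child was terminated by
 * SIGUSR1 (handler reset to default), 0 otherwise, -1 on setup failure. */
int exec_resets_handler_demo(void)
{
    int gate[2];
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = survive_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1 || pipe(gate) == -1)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                         /* child inherits the handler */
        close(gate[0]);
        fcntl(gate[1], F_SETFD, FD_CLOEXEC);    /* closes on successful exec */
        execlp("sleep", "sleep", "10", (char *)NULL);
        if (write(gate[1], "x", 1) != 1)    /* exec failed: tell the parent */
            _exit(126);
        _exit(127);
    }

    close(gate[1]);
    char byte;
    ssize_t n = read(gate[0], &byte, 1);    /* 0 bytes: the exec happened */
    close(gate[0]);
    if (n != 0) {
        waitpid(pid, NULL, 0);
        return -1;
    }
    assert(kill(pid, SIGUSR1) == 0);        /* new program has no handler */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGUSR1;
}
```

This matches what happens to a slow /bin/date in printdate: the handler set before execvp never runs, and the kill from the parent simply terminates the child.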