Scribes: Jason Chen
Lecture Date: February 29, 2012
Last lecture, we left off by talking about using indirection to accomplish various tasks. Most notably, this is helpful in establishing symbolic links. However with indirection, we sacrifice speed. For example, lseek(fd_____) and read(_____) both suffer from speed issues when indirection is introduced. Three seeks will result in the program running three times slower. Furthermore, if bad things were to occur it would be difficult to fix the problem.
In order to deal with this issue, we attempt to design systems that are fault avoidant.
//this is an example of a bad boot script
mknod("/dev/null", magic numbers) //only root can run it
open("file", O_CREAT|O_RDWR|O_CONFIG,0644, 1024*1024*1024)
>/dev/null
chmod 666 /dev/null
//this is an example of a good boot script
mknod("/dev/null, c, 3 19) //this creates a special file
ls -l /dev/null
//which prints out
crw-rw-/.../dev/null
We also use named pipes
$mkfifo /temp/mypipe
$ls -l /tmp/mypipe
prw-rw-rw.../tmp/mypipe
$cat /tmp/mypipe > foo &
$sed s/a/b/.../etc/passwd >/tmp/mypipe
This results in a relation with a read only file system in /usr/. However, this leads to some problems.
1. Link counts don't work in a read only file system.
2. And all filesystems are yoked together.
In order to deal with this problem, we consider the use of a mount table which divides the structures into various categories. Mounted tables are typically used to record currently active file systems.
Ultimately, in order for a file system to be considered robust, we have a few goals:
1.Durability - the ability to survive limited hardware failure
(e.g. loss of power)
2. Atomicity - changes to files are either made or not made
Performance - (has two competing goals)
3. High throughput
4. Low latency
It is very tempting to cheat and often very lucrative to do so as you can often save a lot of money.
Say we are attempting to create a text editor that will either save the changes we made to it or return the old information when it crashes. We first have some underlying assumptions about the structure of the program.
GOLDEN RULE OF ATOMOCITY states that you NEVER overwrite your only copy on disk.
Our test phrase that we enter:
Fore score and seven years
should have the Fore replaced by Four in this situation
Four
We also have a few simple assumptions.
Lampson-Sturgis assumptions
1. Storage writes may fail but a later read will detect the bad block (a failed write to block x may corrupt block y)
2. Stored blocks can decay spontaneously (later reads will detect)
3. Errors are rare.
A way we are able to solve the problem is by using three copies of the write. Although two can speed up the process, we use three to be safer.
However a problem exists in the fourth row with the NEW, ?, and the OLD. Since there are two possibilities, we impose a few set of rules:
1. We use majority rule to repair data
Basically if there is unknown data we use the data that is the majority out of the three copies at the time.
2. We use #1 if all three disagree
Such as in row four.
Suppose we want rename("d/a", "e/b") to be atomic.
1.Increment link count - write inode of file
2. Write dest dir data block
3. Write ssrc dir data block
(potential to crash at this point if there is a storage leak)
4. Clear link count
In order to prevent crash we use the function detect fsck which we pass occasionally to catch leaks.
File system correctness invariants
(Disaster if compromised) 1. Every block is used for exactly one purpose.
(Disaster if compromised) 2. All reference blocks are initialized to data appropriate for its type.
(Disaster if compromised) 3. All referenced data blocks are marked used in bitmap.
(Just a mere leak) 4. All unreferenced data blocks are marked free in bitmap.