Performance
We want to come up with a model for failures; this model is crucial because it will affect our design. We want the model to be simple and corresponds to reality.
For now, let us worry about a simple model that has no battery and the only failure it has is losing power.Suppose you have a write_sector, what happens if you lose power in the middle of writing write_sector?
1. It might still work!
2. Bad sector results because it is writing in process.
To model this:
To read:
if F1 = F2
return F1
else
F3
We want less than 3 copies
Lampson-Sturgis Assumptions
Rename system call: rename("a","b")
1. Read inode 357 into memory
2. Read data block 952736 into RAM
3. Change "a" to "b" in RAM
4. Write data block out to disk
5. Update inode time stamp
6. Write inode to disk
What can go wrong?
Suppose there is a crash between steps 4 and 5, after reboot, data will change but inode will not be updated.
CS111 Assumptions
Single data block writes are atomic; either all blocks write successfully or none gets written, all-or-nothing.
rename("d/a", "e/a"): Renaming a file that used to be in directory d to a file in directory e.
We want to do step 1 before step 2 because if we do step 2 first and it crashes, the file "a" would disappear.
The advantage of doing step 1 before step 2 is that we can keep track of the file, but the disadvantage is if it crashes between steps 1 and 2, it will lose the link to the data, which is bad.
The solution would be to do something between steps 1 and 2.
1. Write 1012
2. Write "921 with a link count of 2"
3. Write 952736
4. Write "921 with a link count of 1"
This solution would cause memory leak sometimes and we can still lose data if it crashes between the new steps 1 and 2.
The solution to the case above would be to switch the order of steps 1 and 2.
1. Write "921 with a link count of 2"
2. Write 1012
3. Write 952736
4. Write "921 with a link count of 1"
If it crashes between the new steps 1 and 2, it would cause memory leak but will not lose data; having memory leak is better than losing the whole data.
Invariants for our system
(1) Every block in the file system is used for exactly one purpose.
(2) All referenced blocks are properly initialized with data appropriate for their type.
(3) All referenced data blocks are marked used in bitmap.
(4) All unreferenced data blocks are marked as free in bitmap.
Consequences of Violation
Violating (1), (2) or (3) would cause a disaster. Violating (4) would cause memory leak, so (4) is the only candidate to be removed.
Two ideas for Building Robust File Systems
1. Commit record
A separate record that is used to mark whether an atomic action is done or not done.
File system code for doing commits, which appears to be atomic from a user's point of view:
BEGIN
Pre-commit phase (invisible to upper level)
COMMIT or ABORT (make changes visible)
Post-commit phase (invisible to upper level)
END
2. Journaling
Append-only file system: The downside of this system is that it wastes storage, but the advantages are that there is no link count problem, it solves many inconsistency problems and it can avoid seeks if it is mostly writing.
Write-head log
Write-behind log