CS 111: Operating Systems Lecture 13 - File System Robustness
By Steven Pham
Lecture Data: February 27, 2013
For a robust file system, we have a few goals:
1. Performance: System with high throughput and low latency
2. Durability: Data survives failure and is enduring
3. Atomicity: Correctness of data - changes made to a file system are completed or
not completed.
Durability Issues:
When considering durability, many possible issues could occur. In some cases, the keyboard, CPU, RAM, etc. could die or screw up the underlying failure.
In an ideal world, we would want to address all such possible issues to ensure the complete durability of our system. Realistically, however, we must limit our analysis to a scope of limited failure. Mainly, we only want to consider the problems most likely to occur that will cause damage.
Battery Example
Must consider a model for failure that corresponds to reality and is simple:
- No battery
- Only failure to worry about: loss of power
Say we want to edit a file:
What if we lose power when attempting to edit the file and write to the disk sector?
In the event of a loss of power, a trap occurs telling processes to stop immediately.
Possibilities at this point:
1. The power supply has
a small amount of power remaining to finish task
2. Bad sector results
since task cannot finish
Under these assumptions, we can assume the data contents to follow this model:
Atomicity Solutions?
- More file space
- Never overwrite your only or last copy
- Keep many backups
Backups example
In a file system with backups, we write to F2 first then overwrite the old F1. Using markers, we can keep track of which file is currently being modified.
F2 = Y
Marker = F2
F1 = Y
Marker = F1
With markers:
Using this approach, we can assure atomicity by checking the file marker and ensuing status of each file. While this method works, we have to utilize too many write sectors
and might want a more efficient approach.
Time append example:
Alternatively, we can try appending the time stamp:
Using this approach, we write the data to the file and the nappend the current time.
The only problem is that the current time value could be completely garbage or a wrong value from a bad write and the resulting value could be incorrect.
Classic Solution:
In the classic solution approach, we use three file copies for safety and complete atomicity with staggered writing to all 3 copies:
If a crash happens, we run this code:
if F1 = F2
return F1
else
return F3
Utilizing this code, should a crash happen before fully copying the data to all 3 files, the files can be successfully reverted to old or new data accordingly.
This approach requires 3 copies to be held on disk, however, and this can be a lot of memory space just to simply assure atomicity for one file on the file system.<
Lampson-Sturgis Assumptions:
1. Storage writes may
fail and can corrupt written or nearby sectors
2. Storage can decay
spontaneously
3. A later read can
detect the bad sector
4. Errors are
relatively rare
5. Repairs can be done
in time
Example: rename(“a”, “b”)
1. Read inode 357 into
memory
2. Read data block 52
into RAM
3. Change “a” to
“b” in RAM
4. Write data block out
to disk
5. Update inode time
stamp
6. Write inode to disk
(Doesn’t exactly follow Lance-Sturgis assumptions)
Say instead we wanted to rename file a to bb - rename("a", "bb"):
In this case, if we can't find directory memory in allocated space, we have to go find empty space to copy the entire directovery over.
Rename ("d/a", "e/a")
Actions:
Read inodes 357 and 362 as well as data blocks 52 and 1012 into RAM
Write link count 2
Write updated data block 1012
Write link count 1
Write updated data block 52
Write updated inode 362
Write updated inode 357
If we crash between writes to data blocks, link count will not be updated. In this case, files must be removed or dangling pointers will remain.
Invariants for file systems:
1. Every block in the file system is used for exactly one purpose
2. All referenced blocks are properly initialized with data appropriate for their type
3. All referenced data blocks are marked as used in bitmap
4. All unreferenced data blocks are marked as free in bitmap
Consequences for
violation of invariants:
1. Disaster
2. Disaster
3. Disaster
4. Memory leak