Performance - High throughput (ability to handle lots of requests per second) and low latency
Atomicity - When you make a change to the file system, changes are either done or not done, never partially done.
Durability - Data survives failure of the underlying system (only limited failure)
What could go wrong with durability?
What failures are we worried about?
We want limited failure, within reason
Errors that are most likely and/or most damaging
We need a model for failures
Suppose during write_sector, we lose power
To model this, we can assume this is how write works:
Using 4 writes:
Using 3 writes:
Using timestamps
Classic Solution
e.g. rename("a","b")
What can go wrong?
Crashes after steps 4 and 5 may cause problems
Single data block writes are atomic (all or nothing)
e.g. rename("a","bb")
This can copy an entry and move to a bigger entry in data blocks (only works if block is not full)
e.g. rename("d/a","e/a")
read 357, 362, 952736, 1012 into RAM, and assume there is enough room in 1012 for a new entry
We do 1 first so we don't lose track of the file. If we crash between 1 and 2, then remove 362 or 357, we lose 921, which results in a dangling pointer.
To fix this, after step 1, write "921 link count to 2".
After step 2, change the count back to 1
The problem here is, if we crash between 3 and 4, we will have memory leaks because the link count will be too high
If we crash between 1 and 2, we'll have this problem as well
To fix this, change order to:
These are conditions that should always be true for your system
e.g. for BSD-style file system
Consequences for violating the corresponding invariant:
If we needed to modify an invariant, we should choose 4, since it has the least serious consequence.
Keep a separate record used to mark whether an atomic action is done or not done
Code for doing commits:
BEGIN
pre-commit phase (invisible to upper level)
COMMIT or ABORT
post-commit phase (invisible to upper level)
END
From the user's point of view, the whole commit process is atomic
Recovery must be idempotent (because recovery can crash too!)
Recover must handle aborts as well
Write-ahead log:
Write-behind log:
Both methods work, but with different performance