CS 111 Winter 2013

Lecture 13: File System Robustness - February 27, 2013

by Yoshnee Raveendran & David Mao

Table of Contents

Ideal File System Goals

  1. Performance
  2. Durability
  3. Atomicity

What can go wrong with a file system?

Overview

A simple model - Loss of power

Situation: a laptop is so old that its battery no longer works well. Let's assume that the only failure that can happen is that this laptop suddenly loses power. Now, let's say that we are writing (at a low level, write_sector tells the disk controller to replace the old contents with the new) to a pre-existing file F as follows:

[Figure: Files]


Suppose that the laptop loses power right in the middle of write_sector; then one of two things can occur:
  1. The power supply has a charged capacitor with enough reserve energy to finish the task despite the loss of power, so the write completes and results in an accurate F'.
  2. The sector is only partially written, so F' is not accurate.
To model this:
Attempt 1:

    time:  0   1   2
    F:     X   ?   Y

While writing, if the power loss happened at time 1, then the data could be anything (?).

Attempt 2
An improvement on attempt 1: instead of keeping just one copy of file F, we keep a backup copy, so we never overwrite our only copy. We will use a marker to record which copy (the original or the backup) is the current, valid one. Let's call the original file F1 and the backup file F2.

How we will write to the files:
    time:    0   1   2
    F1:      X   X   Y
    F2:      ?   Y   Y
    marker:  1   1   2

However, the downside to this approach is that each update requires 3 writes (the backup copy, the marker, and then the original copy), which is a lot of overhead.
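For concreteness, here is a minimal C sketch of the marker scheme. The sector numbers, the 512-byte sector size, and the read_sector/write_sector primitives (assumed to be atomic, per the model above) are all made up for illustration:

    /* Attempt 2 sketch: two copies of the file plus a marker sector.
       read_sector/write_sector are assumed atomic primitives. */

    enum { F1_SECTOR = 100, F2_SECTOR = 101, MARKER_SECTOR = 102 };

    void read_sector(int sector, void *buf);          /* assumed primitive */
    void write_sector(int sector, const void *buf);   /* assumed primitive */

    void update_file(const char data[512])
    {
        char marker[512];
        read_sector(MARKER_SECTOR, marker);

        int current = marker[0];
        int stale = (current == 1) ? F2_SECTOR : F1_SECTOR;
        int live  = (current == 1) ? F1_SECTOR : F2_SECTOR;

        write_sector(stale, data);            /* 1st write: the non-current copy  */
        marker[0] = (current == 1) ? 2 : 1;
        write_sector(MARKER_SECTOR, marker);  /* 2nd write: flip the marker       */
        write_sector(live, data);             /* 3rd write: catch the old copy up */
    }

A crash before the marker flip leaves the old version current; a crash after it leaves the new version current; either way a reader sees a consistent copy.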

Attempt 3
Another way to successfully recover the file F is to keep three copies: F1, F2, and F3. Each update is written first to F1, then to F2, and lastly to F3.

    time:  0   1   2   3   4   5   6
    F1:    X   ?   Y   Y   Y   Y   Y
    F2:    X   X   X   ?   Y   Y   Y
    F3:    X   X   X   X   X   ?   Y

To recover the file, i.e. to read it at any given time, we use the following procedure:

    if F1 == F2
        then return F1
        else return F3

The above procedure guarantees that we will always either get the old version of the file (X) or the newly-written updated version (Y), so we will never get garbage (?).
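A C sketch of this recovery rule, using the same assumed read_sector primitive and made-up sector numbers as before:

    #include <string.h>

    enum { F1_SECTOR = 200, F2_SECTOR = 201, F3_SECTOR = 202 };

    void read_sector(int sector, void *buf);   /* assumed primitive */

    void recover(char out[512])
    {
        char f1[512], f2[512], f3[512];
        read_sector(F1_SECTOR, f1);
        read_sector(F2_SECTOR, f2);
        read_sector(F3_SECTOR, f3);

        /* A crash can leave at most one copy as garbage (the one being
           written).  If F1 == F2, both are old or both are new, so F1 is
           safe to return.  Otherwise the crash hit F1 or F2, which means
           the write to F3 never started and F3 still holds the old data. */
        if (memcmp(f1, f2, sizeof f1) == 0)
            memcpy(out, f1, sizeof f1);
        else
            memcpy(out, f3, sizeof f3);
    }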

However, just like before, this method has too much overhead, since we keep 3 copies of the file and each update requires multiple writes.

Lampson-Sturgis Assumptions

  1. Storage writes may fail, and a failed write may corrupt other blocks.
  2. A later read can detect that a block has gone bad.
  3. Storage blocks can spontaneously decay (a good block can go bad on its own).
  4. Errors are rare.
  5. Repairs can be done in time (before additional errors make recovery impossible).

Another Atomic Write Example: Changing File Names

Suppose we have a file "a" that we want to rename to "b":
[Figure: Rename_1]
The procedure to do that is as follows:
  1. Pull inode 357 into memory
  2. Read data block 952736 into RAM
  3. Change 'a' to 'b' in RAM
  4. Write data block onto disk
  5. Update inode time stamp
  6. Write inode to disk
What can go wrong with this?
Suppose that a crash occurs just after step 4 (before step 5): the rename did in fact happen, but the timestamp was not updated accordingly. If a crash were to occur after step 2 (before step 3), there would be no loss, since nothing had been written to disk and the rename was never executed.

For the purpose of this class, we will assume that single data block writes are always atomic.
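Under that assumption, the six steps might look like the following C sketch. The read_sector/write_sector primitives, the inode layout, and the directory-entry parsing are simplifications for illustration, not a real kernel's API:

    #include <string.h>
    #include <time.h>

    struct inode { long mtime; /* other fields omitted */ };

    void read_sector(long sector, void *buf);          /* assumed primitive */
    void write_sector(long sector, const void *buf);   /* assumed primitive */

    void rename_a_to_b(void)
    {
        struct inode dir_inode;
        char block[512];

        read_sector(357, &dir_inode);    /* 1. pull inode 357 into memory     */
        read_sector(952736, block);      /* 2. read data block 952736 to RAM  */

        char *entry = memchr(block, 'a', sizeof block);
        if (entry)                       /* 3. change 'a' to 'b' in RAM       */
            *entry = 'b';                /*    (entry lookup much simplified) */

        write_sector(952736, block);     /* 4. write the data block to disk   */
        dir_inode.mtime = time(NULL);    /* 5. update the inode timestamp     */
        write_sector(357, &dir_inode);   /* 6. write the inode to disk        */
    }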

Now let's say we want to rename ("d/a", "e/a") instead:

[Figure: Rename_2]
The procedure for this is as follows:
  1. read inode 357 into memory (d)
  2. read data block 952736 into RAM
  3. read inode 362 into memory (e)
  4. read data block 1012 into RAM
  5. update d's data in RAM
  6. update e's data in RAM
  7. write d's data (a in d is removed)
  8. write e's data (a in e is created)
We can see a downside to this approach: if the power were to suddenly go off just after step 7 (before step 8), we would lose file a forever. If we flip steps 7 and 8:

    7. write e's data (a in e is created)
    8. write d's data (a in d is removed)

Now, if we were to lose power just after step 7 (before step 8), then after reboot there would be two directory entries for a. However, the link count of this file would still be 1, so it would be disastrous if the user deleted one of the entries (freeing the inode) and tried to access the other at a later time. To solve this, we should also update the file's link count: increment it before creating the second entry, and decrement it after removing the old one, as sketched below.
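A C sketch of this safe ordering, reusing the hypothetical primitives and block numbers from the steps above; write_inode is an assumed helper that writes an inode back to disk:

    struct inode { int nlink; /* other fields omitted */ };

    void write_sector(long sector, const void *buf);   /* assumed primitive */
    void write_inode(struct inode *ino);               /* assumed helper    */

    void rename_d_a_to_e_a(struct inode *a,
                           const char *d_block, const char *e_block)
    {
        a->nlink = 2;                  /* 1. bump a's link count and write it  */
        write_inode(a);

        write_sector(1012, e_block);   /* 2. write e's data ("a" in e created) */
        write_sector(952736, d_block); /* 3. write d's data ("a" in d removed) */

        a->nlink = 1;                  /* 4. restore the true link count       */
        write_inode(a);
    }

A crash between any two steps leaves, at worst, a link count that is too high (a storage leak that a repair tool can fix), never a directory entry pointing at an inode with too few links.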

Models of Underlying Systems

  1. Every block in a file system is used for exactly 1 purpose.
  2. All referenced blocks are properly initialized with data appropriate for their type.
  3. All referenced data blocks are marked as used in the bitmap.
  4. All unreferenced blocks are marked as free in the bitmap.
Consequences of violating the above 4 rules:
  1. Disaster - if a block is used for 2 purposes, writes for one purpose may corrupt the data needed by the other (e.g. clobbering pointers)
  2. Disaster - an uninitialized block contains garbage that will not point us to the right data
  3. If a referenced block is marked as free in the bitmap, it can be allocated again, and multiple files will think that they own the same data block
  4. If an unreferenced block is marked as used in the bitmap, it will never be allocated, causing a storage leak
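These invariants are exactly what a consistency checker such as fsck verifies. Here is a minimal C sketch of checking rules 3 and 4, where the bitmap layout, NBLOCKS, and the block_is_referenced helper are assumptions for illustration:

    #include <stdio.h>

    #define NBLOCKS 1024

    int block_is_referenced(int b);    /* assumed: walks every inode's block list */

    int bitmap_used(const unsigned char *bitmap, int b)
    {
        return (bitmap[b / 8] >> (b % 8)) & 1;
    }

    void check_bitmap(const unsigned char *bitmap)
    {
        for (int b = 0; b < NBLOCKS; b++) {
            if (block_is_referenced(b) && !bitmap_used(bitmap, b))
                printf("block %d: referenced but marked free (rule 3)\n", b);
            if (!block_is_referenced(b) && bitmap_used(bitmap, b))
                printf("block %d: unreferenced but marked used (rule 4)\n", b);
        }
    }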

Two Ideas for a Robust File System

  1. Commit Record (see the sketch after this list)
BEGIN
     pre-commit phase (invisible to upper level)
COMMIT or ABORT
     post-commit phase
END

  2. Journaling
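To make the commit-record idea concrete, here is a minimal C sketch; the commit sector number, the write_sector primitive, and the contents of each phase are hypothetical:

    enum { COMMIT_SECTOR = 300 };

    void write_sector(long sector, const void *buf);   /* assumed primitive */

    void atomic_update(void)
    {
        char record[512] = { 0 };

        /* BEGIN: pre-commit phase.  Write the new data into blocks that
           nothing references yet; a crash here is harmless because the
           visible state is untouched. */

        /* COMMIT: a single atomic sector write is the commit point.
           Before this write, the operation "never happened"; after it,
           the operation "happened". */
        record[0] = 1;
        write_sector(COMMIT_SECTOR, record);

        /* Post-commit phase: install the new blocks and free the old
           ones.  A crash here is fixed on reboot by redoing this phase,
           since the commit record says COMMIT. */

        /* END */
    }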