Central Concern with File Robustness

What happens to our data if we are in the middle of writing to our storage disk when we get a power failure?

It is important that when our system reboots we are not left with scrambled data at the location we were writing, with no idea of whether the write finished or not. Ideally, once the system is back up and running, every write will appear either to have been applied completely or not to have been applied at all. We call such changes atomic, meaning they happen fully or do not happen at all. We explore journaling as a means of improving file system robustness by adding atomicity to file writes, as well as a history of the state of the file system.


What is the journal?

A circular buffer containing the data we want to write, together with commit records indicating whether those writes have successfully completed.
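As a rough sketch (not the on-disk format of any real file system), the journal can be pictured in C as a fixed-size circular array of records, where each record is either a block of data to be installed, a commit marker, or a completion marker. All names, sizes, and record types below are made up for illustration.

  /* A minimal sketch of the journal as a circular buffer of records.
     The names, sizes, and record types are illustrative only. */
  #include <stddef.h>
  #include <stdint.h>

  #define JOURNAL_SLOTS 1024
  #define BLOCK_SIZE    4096

  enum record_type {
      JR_DATA,    /* a block of data destined for cell memory          */
      JR_COMMIT,  /* all data records for this write are fully logged  */
      JR_DONE     /* the data has been installed into cell memory      */
  };

  struct journal_record {
      enum record_type type;
      uint64_t write_id;            /* which write this record belongs to */
      uint64_t target_block;        /* where the data will be installed   */
      uint8_t  data[BLOCK_SIZE];    /* used only by JR_DATA records       */
  };

  struct journal {
      struct journal_record slots[JOURNAL_SLOTS];
      size_t head;    /* next slot to append to                          */
      size_t tail;    /* oldest record whose data is not yet installed   */
  };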

Journaling Protocol

  1. Log the data to be updated in the journal, before physically storing it into cell memory
  2. Write a commit record to the journal indicating that the previous entries (the data we want to write) are complete
  3. Install the changes into physical cell memory
  4. Log to the journal that the entire process is complete (a rough C sketch of these steps follows the list)
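In this sketch, "cell memory" and the journal are modeled as in-RAM character arrays; the record strings and function names are made up. A real implementation would also have to force each journal record to stable storage in order (e.g., with a write barrier or flush) before moving on to the next step.

  /* Sketch of the four-step journaling protocol for a single write.
     Cell memory and the journal are simulated with in-RAM arrays. */
  #include <stdio.h>
  #include <string.h>

  #define NBLOCKS 16
  #define BLKSZ   64

  static char cell_memory[NBLOCKS][BLKSZ];   /* the "real" storage  */
  static char journal_log[1024];             /* append-only journal */
  static size_t journal_end = 0;

  static void journal_append(const char *record) {
      journal_end += snprintf(journal_log + journal_end,
                              sizeof journal_log - journal_end,
                              "%s\n", record);
  }

  /* One atomic update of block blk with the given data. */
  static void journaled_write(int blk, const char *data) {
      char rec[BLKSZ + 32];
      snprintf(rec, sizeof rec, "DATA blk=%d %s", blk, data);
      journal_append(rec);                         /* 1. log the data to be written       */
      journal_append("COMMIT");                    /* 2. log that the entry is complete   */
      strncpy(cell_memory[blk], data, BLKSZ - 1);  /* 3. install into cell memory         */
      journal_append("DONE");                      /* 4. log that the whole write is done */
  }

  int main(void) {
      journaled_write(3, "hello journaling");
      printf("block 3: %s\n", cell_memory[3]);
      printf("journal:\n%s", journal_log);
      return 0;
  }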

If we lose power in the middle of a write, the journal will tell us what we were trying to do and whether or not it was completed. The following summarizes the state of the file system and the recovery process if power fails after each step of the protocol (a small recovery sketch in C follows).

  1. Power fails after step 1 (cell memory unchanged): We won't see the commit from (2), so we will know that we need to re-collect the data that was supposed to be written. The data in cell memory will not have changed.
  2. Power fails after step 2 (committed, not yet installed): We will know that the data to be written is all in the journal but has not been physically written to cell memory. We can install it and finish the journaling protocol for this write immediately.
  3. Power fails during or just after step 3 (installation interrupted): Upon reboot we might have garbage in cell memory, but all of the data to be written (as confirmed by the commit from (2)) is stored in the journal and can be installed immediately.
  4. Power fails after step 4 (write fully complete): From the journal we will see that the entire write completed successfully, and there is no more work to do for this write.
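Here is that recovery sketch. Using the same made-up record strings as the earlier protocol sketch, the recovery routine only needs to scan the journal for COMMIT and DONE markers to decide what to do (a real system would track this per write, not globally).

  /* Sketch of recovery after a crash: scan the journal and decide whether
     to discard, replay, or do nothing for the most recent write. */
  #include <stdio.h>
  #include <string.h>

  /* A journal as it might look after a crash between steps 2 and 3:
     the data and commit are logged, but no DONE record follows. */
  static const char *journal[] = {
      "DATA blk=3 hello journaling",
      "COMMIT",
      /* crash happened here: no "DONE" record was appended */
  };

  int main(void) {
      size_t n = sizeof journal / sizeof journal[0];
      int have_commit = 0, have_done = 0;

      for (size_t i = 0; i < n; i++) {
          if (strcmp(journal[i], "COMMIT") == 0) have_commit = 1;
          if (strcmp(journal[i], "DONE") == 0)   have_done = 1;
      }

      if (have_done)
          puts("write finished before the crash: nothing to do");
      else if (have_commit)
          puts("commit logged but not installed: replay the DATA records");
      else
          puts("no commit: discard the partial entry; cell memory is untouched");
      return 0;
  }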

NOTE on journaling buffer

The buffer may fill up with data entries and commits before any of the recorded data is actually installed into cell memory. In this case, the OS must prioritize installing data into cell memory before any further write commands can be journaled; otherwise the circular buffer would wrap around and overwrite entries, losing critical information.
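A toy sketch of that constraint: when the circular buffer is full, the append path must first install (checkpoint) the oldest entry to free its slot before logging anything new. The functions and sizes here are invented for illustration.

  /* Sketch of the "journal full" case: install before logging more. */
  #include <stdio.h>

  #define SLOTS 4

  static int ring[SLOTS];
  static int head = 0, tail = 0, used = 0;

  /* Pretend to install the oldest journaled entry into cell memory,
     which frees its slot for reuse. */
  static void checkpoint_oldest(void) {
      printf("installing entry %d to cell memory, freeing its slot\n", ring[tail]);
      tail = (tail + 1) % SLOTS;
      used--;
  }

  static void journal_append(int entry) {
      if (used == SLOTS)          /* journal is full: install before logging more */
          checkpoint_oldest();
      ring[head] = entry;
      head = (head + 1) % SLOTS;
      used++;
      printf("journaled entry %d\n", entry);
  }

  int main(void) {
      for (int e = 1; e <= 6; e++)
          journal_append(e);
      return 0;
  }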

NOTE on accessing the Journal History on SEASnet

On the SEASnet machines you can use the following command:

$ cd ~/.snapshot

to access a directory of clones of your user directory taken at various time intervals in the past. It appears that SEASnet keeps journals of the exact state of our user directory each hour, day, and week. We are able to retrieve each past hour for the previous 8 hours, each past day for the previous 7 days, and each past week for the previous 2 weeks!


Performance concerns with Journaling

The major bottleneck for "naive journaling" is moving the disk arm. If the journal is kept in one contiguous region on the same disk as the data being changed, the arm may be constantly moving back and forth between cell memory and the journal, spending most of its time seeking rather than transferring data.
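A quick back-of-the-envelope calculation makes the point; the numbers (roughly 10 ms per seek, about 0.1 ms to transfer a 4 KiB block) are assumptions, not measurements of any particular drive.

  /* Back-of-the-envelope: if every journal/data write is preceded by a
     seek, how much time is spent moving the arm? Numbers are assumed. */
  #include <stdio.h>

  int main(void) {
      double seek_ms = 10.0;   /* assumed average seek time       */
      double xfer_ms = 0.1;    /* assumed time to transfer 4 KiB  */
      double frac_seeking = seek_ms / (seek_ms + xfer_ms);
      printf("fraction of time spent seeking: %.1f%%\n", 100.0 * frac_seeking);
      return 0;
  }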

Performance Improvement Ideas

  1. Have the journal and the data on two separate disks/devices. This way, the two devices communicate through the CPU while arm movement is minimized: the data disk's arm can stay near the data currently being accessed, and the journal disk's arm can stay at the end of the journal, where entries are being appended. Since the journal disk only ever appends, its arm never has to seek during writes.
  2. Don't use a physical disk for cell storage at all; instead, cache the needed pieces in RAM (a sketch follows this list)
    • This method is a big win if the file system is small enough to fit entirely into RAM. The journal disk arm would never need to seek since it is only appending entries, and access to the file system data would be extremely fast since it is stored in RAM.
    • The major downsides here are space (we are fastest when the file system fits in RAM, but RAM is much smaller than a disk) and reboot speed. Rebooting will be slower because the file system must be reconstructed entirely from the journal, since the live copy existed only in RAM.
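Here is a toy sketch of idea 2: the live file system lives in RAM, the "disk" holds only an append-only journal, and a reboot rebuilds RAM by replaying the journal from the beginning (which is exactly why reboot gets slower). Everything here is simulated with in-memory arrays and made-up names.

  /* Sketch of a RAM-resident file system backed by an append-only journal. */
  #include <stdio.h>
  #include <string.h>

  #define NBLOCKS 8
  #define BLKSZ   32

  struct entry { int blk; char data[BLKSZ]; };

  static char ram_fs[NBLOCKS][BLKSZ];      /* the in-RAM file system         */
  static struct entry journal[64];         /* stands in for the journal disk */
  static int journal_len = 0;

  static void fs_write(int blk, const char *data) {
      struct entry e = { .blk = blk };
      strncpy(e.data, data, BLKSZ - 1);
      journal[journal_len++] = e;               /* append-only: no seeking */
      strncpy(ram_fs[blk], data, BLKSZ - 1);    /* update the RAM copy     */
  }

  /* After a reboot RAM is empty, so replay the whole journal to rebuild it.
     This replay is why reboot is slower with this design. */
  static void rebuild_from_journal(void) {
      memset(ram_fs, 0, sizeof ram_fs);
      for (int i = 0; i < journal_len; i++)
          strncpy(ram_fs[journal[i].blk], journal[i].data, BLKSZ - 1);
  }

  int main(void) {
      fs_write(2, "old contents");
      fs_write(2, "new contents");
      rebuild_from_journal();
      printf("block 2 after replay: %s\n", ram_fs[2]);
      return 0;
  }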

Interaction between disk scheduling (future) and Robustness (present)

Consider the following region of disk, where we have 4 writes indexed in the order in which they were requested.

As mentioned before, disk arm movements are a major bottleneck of disk performance. The disk device has its own internal scheduler, and as a result, often re-orders writes to reduce the total distance the disk arm has to seek. With the above example, it is plausible the disk scheduler would re-order the writes as follows:

w2 -> w1 -> w4 -> w3
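One made-up layout that produces this order: suppose the head sits near block 50 and w1 through w4 target blocks 30, 40, 90, and 70 respectively. A greedy shortest-seek-first pass (one common scheduling policy) then services them as w2, w1, w4, w3:

  /* Sketch of a disk scheduler reordering requests by shortest seek first.
     The head position and block numbers are invented for illustration. */
  #include <stdio.h>
  #include <stdlib.h>

  struct req { const char *name; int pos; int done; };

  int main(void) {
      struct req reqs[] = {
          { "w1", 30, 0 }, { "w2", 40, 0 }, { "w3", 90, 0 }, { "w4", 70, 0 },
      };
      int n = 4, head = 50;                   /* assumed current head position */

      printf("service order:");
      for (int served = 0; served < n; served++) {
          int best = -1;
          for (int i = 0; i < n; i++) {       /* pick the closest pending request */
              if (reqs[i].done) continue;
              if (best < 0 || abs(reqs[i].pos - head) < abs(reqs[best].pos - head))
                  best = i;
          }
          printf(" %s", reqs[best].name);
          head = reqs[best].pos;
          reqs[best].done = 1;
      }
      printf("\n");
      return 0;
  }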

Problems with Disk Scheduling


Unreliable Processes: Bad Memory References and Our Quest for Virtual Memory