CS 111 Scribe Notes

Lecture 13: File System Robustness - Feb 29 2012

By Minhan Xia

Real Time Unix Files

Problem:
Reading a block of a large file can take multiple disk seeks before the actual data block is reached, because the inode locates that block through multi-level indirect blocks. See the example below.

Figure 1. Inode of a Large File

In this example, it may take 3 seeks to reach the actual data block:
1. Seek to the doubly indirect block indicated by the inode.
2. Seek to the indirect block indicated by the doubly indirect block.
3. Seek to the actual data block.
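To make the seek count concrete, here is a minimal sketch in C that computes how many indirect blocks must be read before the data block for a given logical block number. The layout (10 direct pointers in the inode, 2048 pointers per indirect block) and the function name are assumptions for illustration; they are not taken from the lecture's figure.

```c
#include <stdint.h>

/* Hypothetical layout: these constants are assumptions, not the lecture's. */
enum { NDIRECT = 10, PTRS_PER_BLOCK = 2048 };

/*
 * Return how many indirect blocks must be read before reaching the data
 * block that holds logical block `n` of a file:
 *   0 = direct pointer in the inode, 1 = singly indirect,
 *   2 = doubly indirect, -1 = beyond what this layout can address.
 */
int indirection_levels(uint64_t n)
{
    const uint64_t single = PTRS_PER_BLOCK;
    const uint64_t dbl = (uint64_t)PTRS_PER_BLOCK * PTRS_PER_BLOCK;

    if (n < NDIRECT)
        return 0;
    n -= NDIRECT;
    if (n < single)
        return 1;
    n -= single;
    if (n < dbl)
        return 2;
    return -1;
}
```

Under these assumptions, logical block 5 needs no indirect reads, while logical block 5000 needs two, so reaching its data takes three disk accesses in total, as in the list above.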

Solution:
Create contiguous files. To do so, several fields need to be added to the inode:
1. A bit that indicates whether the file is contiguous.
2. The starting block index, which indicates where the file's first block is.
3. The number of blocks, which indicates how many blocks the file contains.
See the example below.

Figure 2. Inode of a Contiguous File
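The three added fields can be sketched as a C struct, together with the lookup they make possible: finding the block for a byte offset takes one addition instead of a walk through indirect blocks. The struct name, field names, and the 8192-byte block size are assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inode fields for the contiguous-file design. */
struct contig_inode {
    bool     is_contiguous; /* 1. the "contiguous" bit */
    uint32_t start_block;   /* 2. index of the file's first block */
    uint32_t nblocks;       /* 3. number of blocks in the file */
};

enum { BLOCK_SIZE = 8192 }; /* assumed block size */

/*
 * Map a byte offset in the file to a disk block index.
 * Returns true and stores the index in *block on success; returns false
 * if the file is not contiguous or the offset is past the last block.
 */
bool contig_block_for_offset(const struct contig_inode *ino,
                             uint64_t offset, uint32_t *block)
{
    uint64_t logical = offset / BLOCK_SIZE;

    if (!ino->is_contiguous || logical >= ino->nblocks)
        return false;
    *block = ino->start_block + (uint32_t)logical;
    return true;
}
```

Because the block index is computed directly from the inode, no indirect block has to be read first, which is what makes access time predictable.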

Here is the code to create a contiguous file (O_CONTIG is the flag this design would add; it is not a standard POSIX flag):

open("file", O_CREAT|O_RDWR|O_CONTIG, …)

File /dev/null

In Unix-like operating systems, /dev/null, or the null device, is a special file that discards all data written to it (but reports that the write operation succeeded) and provides no data to any process that reads from it (yielding EOF immediately).
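The behavior described above can be observed with a short C sketch that writes to /dev/null and then reads from it. The function name probe_dev_null is hypothetical; open, write, read, and close are the standard POSIX calls.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Write `len` bytes to /dev/null, then try to read from it.
 * On success stores the write() result in *written and the read() result
 * in *nread, and returns 0. Returns -1 if /dev/null cannot be opened.
 */
int probe_dev_null(const char *buf, size_t len,
                   ssize_t *written, ssize_t *nread)
{
    char scratch[16];
    int fd = open("/dev/null", O_RDWR);

    if (fd < 0)
        return -1;
    *written = write(fd, buf, len);             /* data is discarded */
    *nread = read(fd, scratch, sizeof scratch); /* 0 means end of file */
    close(fd);
    return 0;
}
```

Calling it with a 5-byte buffer should report 5 bytes written and 0 bytes read: the write "succeeds" and the read hits EOF immediately.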

Here is the call that creates an inode for a special file such as /dev/null, without creating a regular file:

mknod("/dev/null",...)

Note: Only root can run this. Misuse could cause real problems.

Named Pipes

In computing, a named pipe (also known as a FIFO for its behavior) is an extension to the traditional pipe concept on Unix and Unix-like systems, and is one of the methods of inter-process communication.

Instead of a conventional, unnamed, shell pipeline, a named pipeline makes use of the filesystem. It is explicitly created using mkfifo() or mknod(), and two separate processes can access the pipe by name — one process can open it as a reader, and the other as a writer.

See the following example.


		$ mkfifo /tmp/mypipe
		$ ls -l /tmp/mypipe
		prw-rw-rw-
		$ cat /tmp/mypipe > foo &
		$ sed 's/a/b/' /etc/passwd > /tmp/mypipe

In this example, when sed starts writing to the pipe, cat reads from it and writes to foo. The pipe has an inode.
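The same reader/writer arrangement can be sketched in C with mkfifo() and fork(). This is a minimal illustration, not the lecture's code: the function name fifo_roundtrip is hypothetical, and the caller chooses the FIFO's path.

```c
#include <signal.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Create a FIFO at `path`, fork a writer that sends `msg` through it,
 * and read the message back in the parent.
 * Returns the number of bytes read into `out`, or -1 on error.
 */
ssize_t fifo_roundtrip(const char *path, const char *msg,
                       char *out, size_t outsz)
{
    if (mkfifo(path, 0600) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        /* Child: the writer. open() blocks until a reader opens too. */
        int wfd = open(path, O_WRONLY);
        if (wfd < 0)
            _exit(1);
        ssize_t w = write(wfd, msg, strlen(msg));
        close(wfd);
        _exit(w < 0 ? 1 : 0);
    }

    /* Parent: the reader. read() returns 0 once the writer closes. */
    ssize_t total = -1;
    int rfd = open(path, O_RDONLY);
    if (rfd < 0) {
        kill(pid, SIGKILL); /* the writer would otherwise block forever */
    } else {
        ssize_t n;
        total = 0;
        while ((size_t)total < outsz &&
               (n = read(rfd, out + total, outsz - (size_t)total)) > 0)
            total += n;
        close(rfd);
    }
    waitpid(pid, NULL, 0);
    unlink(path); /* remove the FIFO's name from the file system */
    return total;
}
```

The two processes share nothing but the path name, which is the point of a named pipe: any two processes that can open the name can communicate.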

File System Robustness

Design Goals:

Durability: data survives limited hardware failure (e.g. loss of power)
Atomicity: changes to a file are either made completely or not made at all
Performance: high throughput and low latency

GOLDEN RULE OF ATOMICITY:

Never overwrite your only copy of a file on disk.

Solution: Always write to a copy!
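One common way to follow this rule on a POSIX system is to write the new contents to a temporary copy and then rename() it over the original. The lecture notes do not name this technique, so treat the sketch below as an assumption about how "write to a copy" might look in practice; the function name replace_file and the ".tmp" suffix are hypothetical.

```c
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Replace the contents of `path` with `data` without ever overwriting the
 * only copy: write a temporary copy, flush it, then rename() it over the
 * original. Returns 0 on success, -1 on error.
 */
int replace_file(const char *path, const char *data, size_t len)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path); /* hypothetical temp name */
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t done = 0;
    while (done < len) {
        ssize_t w = write(fd, data + done, len - done);
        if (w <= 0) {
            close(fd);
            unlink(tmp);
            return -1;
        }
        done += (size_t)w;
    }

    /* Make sure the copy is on disk before it replaces the original. */
    int ok = (fsync(fd) == 0);
    if (close(fd) != 0)
        ok = 0;
    if (!ok) {
        unlink(tmp);
        return -1;
    }
    /* rename() switches the name from the old file to the new one. */
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

If anything fails before the rename, the original file is untouched, so the old copy is never the one being overwritten.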

Simple Assumptions:

1. Low-level writes of a block are the building block used to achieve a higher-level atomic update of a block.

Lampson-Sturgis Assumptions:

Storage writes may fail, but a later read will detect the bad block by checksum
Storage blocks can decay spontaneously (later reads will detect)
Errors are rare
Other assumptions for process failures
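The first two assumptions rely on each stored block carrying a checksum that a later read can verify. A minimal sketch in C follows; the block layout, the names, and the choice of an Adler-32-style checksum are all assumptions made for illustration, and real storage devices use stronger codes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { PAYLOAD_SIZE = 508 }; /* assumed: 512-byte sector minus a 4-byte checksum */

/* Hypothetical on-disk block: payload followed by its checksum. */
struct stored_block {
    unsigned char payload[PAYLOAD_SIZE];
    uint32_t checksum;
};

/* Adler-32-style checksum: two running sums modulo 65521. */
uint32_t block_checksum(const unsigned char *p, size_t n)
{
    uint32_t a = 1, b = 0;

    for (size_t i = 0; i < n; i++) {
        a = (a + p[i]) % 65521;
        b = (b + a) % 65521;
    }
    return (b << 16) | a;
}

/* Record the checksum when the block is written. */
void seal_block(struct stored_block *blk)
{
    blk->checksum = block_checksum(blk->payload, PAYLOAD_SIZE);
}

/* A later read detects a failed write or a decayed block by recomputing it. */
bool block_is_good(const struct stored_block *blk)
{
    return blk->checksum == block_checksum(blk->payload, PAYLOAD_SIZE);
}
```

A block that was sealed and then altered, whether by a failed write or by decay, no longer matches its stored checksum, so the reader knows not to trust it.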

Write Model:

Figure 3. Write Model

High Level Update of a Block:

Figure 4. Achieve Atomicity by Keeping 3 Copies of the Data

In this design, if a crash occurs, the system scans all the blocks during the reboot that follows and repairs them:
1. Use the majority rule for repair.
2. If all three copies disagree, use copy #1.
Note: Nobody uses this design.
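The repair rule can be sketched in C as follows. The block size and the function name are assumptions; copy #1 is stored at index 0.

```c
#include <string.h>

enum { COPY_SIZE = 512 }; /* assumed size of one block copy */

/*
 * Repair three copies of a block after a crash.
 * If at least two copies agree, the odd one out is overwritten with the
 * majority value; if all three differ, copy #1 (index 0) wins.
 * Returns the index of a copy that held the chosen value.
 */
int repair_three_copies(unsigned char copies[3][COPY_SIZE])
{
    int winner = 0; /* default: all disagree, so use copy #1 */

    if (memcmp(copies[0], copies[1], COPY_SIZE) == 0 ||
        memcmp(copies[0], copies[2], COPY_SIZE) == 0)
        winner = 0;
    else if (memcmp(copies[1], copies[2], COPY_SIZE) == 0)
        winner = 1;

    for (int i = 0; i < 3; i++)
        if (i != winner)
            memcpy(copies[i], copies[winner], COPY_SIZE);
    return winner;
}
```

After the call all three copies are identical again. The cost is visible in the sketch: every logical block occupies three physical blocks.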

File System Correctness Invariants:

Every block is used for exactly 1 purpose
Consequence of Violation: DISASTER
All referenced blocks are initialized to data appropriate for their type
Consequence of Violation: DISASTER
All referenced data blocks are marked used in Bitmap
Consequence of Violation: DISASTER
All unreferenced data blocks are marked free in Bitmap
Consequence of Violation: not so bad (the block is leaked and its space is wasted, but no data is lost)
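The last two invariants can be checked by comparing the bitmap against the set of blocks that are actually referenced. The sketch below assumes a checker has already built two in-memory arrays, one from walking the inodes and one from the bitmap; the names and the representation are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Result of comparing the free-block bitmap against actual references. */
struct bitmap_report {
    size_t referenced_but_free;   /* third invariant violated: disaster */
    size_t unreferenced_but_used; /* fourth invariant violated: leaked space */
};

/*
 * referenced[i]  is true if some inode or indirect block points at block i.
 * marked_used[i] is true if the bitmap says block i is in use.
 */
struct bitmap_report check_bitmap(const bool *referenced,
                                  const bool *marked_used, size_t nblocks)
{
    struct bitmap_report r = { 0, 0 };

    for (size_t i = 0; i < nblocks; i++) {
        if (referenced[i] && !marked_used[i])
            r.referenced_but_free++;   /* may be allocated a second time */
        else if (!referenced[i] && marked_used[i])
            r.unreferenced_but_used++; /* wasted, but no data is lost */
    }
    return r;
}
```

The asymmetry in the consequences shows up in the comments: a referenced block marked free can be handed to another file, which also breaks the first invariant, while an unreferenced block marked used only wastes space.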