Lecture 11


File System Performance:

This is a continuation of Lecture 10

Prefetching... When is it bad?

Dallying Downsides

Say we issue the following.
write(...);
write(...);
write(...);
write(...);
write("done");
[crash]
These writes assume that we are writing to disk but in reality we have been writing to cache. And if the machine crashes before we can flush these changes to disk those changes will be lost. So we introduce a write barrier . A write barrier will flush out all the cached writes to disk on command.
We have multiple possible function calls to accomplish this:


File System Design

Story Time

IBM announced they are shipping a 120 petabyte file system to a nameless customer. Cheapest fast drives were 600GB and we assume they are 15K rpm drives. And we need ~200,000 drives.

IBM story
Problems

This all comes down to having as much parallelism as possible.

To take on this problem lets look at creating a simple file system first.

Simple File System

Our simple file system will just be considered an array of sectors.
If we want a file we just put it in the first available section of memory. Also have a region to hold file metadata. A directory is 256 bytes which is an array of dir entries which contain info about where the files are. There are no breaking of files, they are stored contiguously.

Problems

Advanced File System Technology

FAT File System

Goal

Want to kill the external fragmentation problem and the preallocating (atleast for data).

Details

File
Allocation
Table
So we have the following divisions:

Boot SectorSuper BlockFAT............ data ...........
Boot Sector

We didn't learn this piece of the filesystem but here is a link: Boot sector explanation

Super Block

This contains metadata about the file system. These include the size of the file system, the version number, and other assorted data.
The Super Block has a fixed size.

File Allocation Table

A FAT is an array of block numbers. For each 4Kib block in our file system we have an entry in the FAT. Each entry is either a 0 => EOF , -1 => Free Block, N : Block number of next block in the file we are looking at. This creates a linked list of storage. With the next pointers being the block numbers in the FAT.
This solves the problem of external fragmentation. Since we can string non-contigous blocks of memory together for a single file.

Data

This is just the actual file data.

Directory

A directory is a file whose contents are an array of directory entries. A directory entry contains a name, a size, and the index of the first block in the file.

...Name... Size Block index in FAT

We also need a root directory. This is simple just have it be the first entry in the FAT.

Issues

We can fix all this by introducing a new file system approach.

Inode Idea

Static Data Structures

As with the FAT we have a boot sector and a super block, but the difference is the inode table.
The inode table contains all the memory to hold all the files that will exist. This is a parameter set upon file creation.

Inode

An inode is an on disk data structure for each file in our file system. This is all preallocated, which means there is a max number of files our file system can have upon creation.
It has the following structure.

Improvements

lseek(...) is now O(1) because we can just find the block by offset into the file.
But we still need to fix the other 3 problems specified earlier.

As an aside, the Berkeley File System also includes a block bit map, which says if a block is free or not. This helps encourage contiguous allocation.