CS 111-Lecture 12: File System Implementation

By: Jimmy Vo


Removing Data from Disk

What actually happens when we run the command rm? Using rm on a file does not delete the data. It only unlinks the entries on the inode. The data is still left on the hard disk, which presents a security flaw, since anyone could read the data off the drive.

In order to completely wipe the data, you would need to overwrite the data. One command that does this is the shred command. The shred command overwrites the file multiple times, making it even harder to recover the data. Another way to overwrite the data is to use /dev/random, which contains a random bit sequence. For x86 machines, the command RDRAND will also return random numbers. For the paranoid, we can use both /dev/random and RDRAND for true randomness.

File Allocation Table System (1970)

The FAT system was invented in order to solve fragmentation problems in previous file systems. There were two types of fragmentations: internal and external. Internal fragmentation was when a file occupied multiple sectors, leading to unused bytes in the last sector. External fragmentation occurs when there is enough free space on the disk, but they do not occur contiguously. Allocations for a large file would fail since there wouldn't be any "room" for the data.

[ Boot sector ][ Superblock ][ File Allocation Table(FAT) ][ Data Blocks ]

The first sector is always the boot sector, which contains instructions on booting the system. The Superblock contains metadata about the file system, such as the FAT version, block size, file system size, etc. The FAT is an array of integers where each entry corresponds to a block number on the disk. The integers can contain one of the following numbers:

Value Meaning
-1 Free Block
0 EOF, or last block, has been reached
N Index of next block

Directories are stored in the data blocks as an array of directory entries. They consist of three parts.

File name File Size in bytes File's first block location
11 bits 32 bits 16 bits

The 11 bits for the filename is split into 8 bit namefield and a 3 bit extension.

So how does it work? Initially, all of the blocks are empty. When you want to add data and need a new block, you search the FAT until you find and entry with -1. Since each data block points to the next block, we follow the blocks until we reach EOF.

Since the data is stored in blocks, we can have the file spread across the entire disk. This fixes the problem of external fragmentation. However, performance suffers because the data is scattered about. In order to fix this, defragmenters will locate scattered blocks and attempt to reorder them. However, this is slow and will tie up the disk.

Another problem can occur when moving files between two direction. Two directory entries are modified one at a time. If power is cut off in the middle of the operation, we will end up with either duplicate entries or no entries.

BSD FFS

[ Boot sector ][ Superblock ][ Bitmap ][ Inode Table ][ Data Blocks ]

The BSD fast file system is an improvement on the FAT file system. We have two new blocks, a bitmap and an inode table. The bitmap contains either 0 or 1 to determine which blocks are in use (one bit per block). The inode table consists of an array of inodes that point to data residing in the blocks.

Inodes

The inode contains information such as the size, type, number of hard links, date, permissions, and an address to the data blocks. The inode does not contain information on the file name, parent directory, etc.

In order to prevent hard links from creating loops in a directory, we forbid hard links to directories. Why do we use hard links in the first place? They were made with the idea to make renaming files easier. In order to rename a file, we simply create a hard link and then unlink the original.

Something fun we can do. If we wanted to clone a directory, we would use the command:

$git clone dir1 dir2

This creates a new directory that links all the inodes. This operation is faster than copying over an entire directory. The only problem is that writing to a file would also affect the clone of the directory, which is not what we want. Therefore, we can only use this if it is read-only.

Ext2directory entries

Each entry in an ext2directory has the following format:

Inode number Entry length (e) Name length (n) File type Name Padding
32 bits 16 bits 8 bits 8 bits l bits (e-n-64) bits

For this entry, n ≤ 255 bits. We store the length of the entry to find out how much padding there is at the end of the file. There are many benefits to storing this. For example, if we unlink a file, all we need to do is increase the length of the file in the previous entry by the deleted entry length. Creation and deletion is very cheap.