Lecture 12: File System Design

Mahir Shah

November 17th, 2014

Table of Contents

  1. Introduction
  2. Simple File System
  3. FAT File System
  4. Inode-Based File System
  5. BSD File System
  6. Hard Links

1. Introduction

File System: A data structure (on disk and in RAM) that provides an abstraction representing a collection of files. A file system is a hybrid data structure, which lives on the disk and in memory.

2. Simple File System (1974)

The figure below describes the file system architecture:

simple file system diagram

This file system can be thought of as an array of files. Each file is allocated contiguously and at the beginning of the array, a file dictionary is stored. This dictionary stores important metadata regarding the files, including the filename, the location of the start of the file in the array, and how much free size is left in the sector.

This type of file system is similar to those found in RT-11 operating systems. This is because of the inherent reliability that comes with using a file system like this.

Pros Cons
Simple architecture and implementation the max number of files is determined by the directory size
fewer seeks if I/O is sequential file size (bounds) must be known in advance
Internal fragmentation is rampant. Since the files are allocated along sector boundaries, there is wasted space left at the end of a sector.
External fragmentation also occurs. You may have enough free space in the file system to store a file, but it may not be contiguous, so you cannot store the file.

3. FAT File System (~1977)

  1. goal is to lessen fragmentation
  2. eliminate external fragmentation
  3. lessen internal fragmentation

The figure below describes the file system architecture:

FAT file system overview
1. Boot Sector:
This is initially empty but can be loaded with the Master Boot Record.
2. Superblock:
This holds data about the system like the version, block size, and block number of the root directory
3. FAT (File Allocation Table):
An array of 16-bit block numbers, which can be 3 possible values:
  1. -1 indicates a free block
  2. 0 indicates EOF, or the next block has nothing
  3. A Block number i, indicates the next block in this file
4. Data Blocks
The files are stored here, but split amongst different blocks--each of size 4 KiB
It would be possible to put the next field inside the data block instead of having a FAT table, but there would be 2 main disadvantages:
  1. metadata is mixed in with the data
  2. The block sizes would no longer be powers of 2

In the FAT filesystem the directory entries are files! Below is an image describing a FAT directory entry:

FAT directory entry structure
Pros Cons
No external fragmentation Linked List implementation means that there can be slow sequential access
Files can grow dynamically Internal fragmentation still exists
Very hard to move files between directories

One possible solution to fix slow sequential access is to defragment, where same file blocks are moved to be sequential. But there are major disadvantages to this:

  1. It can be unreliable if the power goes out or if something unexpected happens
  2. The process can slow and intensive

4. Inode-Based File System (~1973)

Found in Unix v7 (1977). Below is a diagram of the file system:

Inode based file system overview
1. Boot Sector:
This is initially empty but can be loaded with the Master Boot Record.
2. Superblock:
This holds data about the system like the version, block size, and inode number of the root directory
3. Inode Table:
An array of metadata for each file, including which blocks the file is stored in.
  • The first part of the of the inode entry contains file information like the owner, group, permissions, time stamp, size, and the link count
  • The next ten numbers in the inode entry correspond to the block numbers in which the file is stored
  • The next section is a pointer to an "indirect block". This indirect block consists of pointers that point to blocks with block numbers.
  • If there is additional space needed, a double indirection can be created by creating an indirect block that points to other indirect blocks.
4. Data Blocks
The files are stored here, but split amongst different blocks--each of size 8 KiB

5. BSD File System (~1980)

Also called the fast file system-this is very similar to the Inode based file system. Below is a diagram:

BSD file system overview
1. Boot Sector:
This is initially empty but can be loaded with the Master Boot Record.
2. Superblock:
This holds data about the system like the version, block size, and inode number of the root directory
3. Block Bitmap
Each bit tells if the corresponding block is free
4. Inode Table:
An array of metadata for each file, including which blocks the file is stored in.
  • The first part of the of the inode entry contains file information like the owner, group, permissions, time stamp, size, and the link count
  • The next ten numbers in the inode entry correspond to the block numbers in which the file is stored
  • The next section is a pointer to an "indirect block". This indirect block consists of pointers that point to blocks with block numbers.
  • If there is additional space needed, a double indirection can be created by creating an indirect block that points to other indirect blocks.
5. Data Blocks
The files are stored here, but split amongst different blocks--each of size 8 KiB
Pros Cons
Finding free space is faster via the bitmap more internal fragmentation
hard links
lseek is faster becuase no need to go through linked list