General Parallel File System (GPFS)

Suppose we have a system that uses 120 PB of storage. We don't want to use a single hard drive because that would be super slow. So, let's use 200,000 drives, each 600 GB. Hard drives nowadays easily top 1 to 2 TB of storage, so why 600 GB? There are two main reasons: we can read more quickly from smaller hard drives and this system is several years old.

This begs the question, though: how do we manage 200,000 hard drives? We could use a single CPU attached to all of these drives via a single bus, but this would lead to horrible performance bottlenecks. Instead, we'll use the General Parallel File System (GPFS) developed by IBM. GPFS uses a network of computers, each of which is hooked up to a subset of hard drives, to efficiently parallelize I/O.

Features of GPFS

File Systems

All this being said, what actually is a file system?

A file system is an abstract organization of data on secondary storage that is big, persistent, and slow. It should survive program and OS crashes as well as power outages.

File systems shouldn’t be concerned with extremely low level details, such as disk sector sizes. But that raises the question: how abstract should it be? We need to be able to read, write, read-write, move, allocate, and delete data. Some file systems also support searching data, but this leads to greater complexity because there are so many ways to search through data - for example, most developers can’t even agree on a single regex format.

Professor Eggert's File System

When Professor Eggert was an undergraduate student, his Professor was developing a competitor to Unix. He tasked Professor Eggert with creating the file system for this Unix competitor. These are the machine and OS specifications Professor Eggert was given:

Keep in mind that Professor Eggert had no idea about how to write a file system at this time - he hadn't taken an operating systems class because... those didn't exist yet! Professor Eggert's approach for the file system consisted of dividing the disk into two parts.

  1. A table (arbitrarily made to be 2 sectors long), with each entry consisting of a:
    • Name (8 bytes)
    • Starting Sector (12 bytes)
    • Size in bytes (2 bytes)
    • Timestamps & other miscellaneous information (4 bytes)

    Some things worth noting about the table:
    • If an entry's name is shorter than 8 bytes (the allotted quantity for the name field in the table), the convention is to use null bytes to fill the empty bytes. Note that this means that names can be at most 8 bytes long.
    • A name field consisting completely of null bytes indicates that the table entry is not in use; inherently, empty file names are not allowed.
    • There is space for 26 total files.
  2. The file-data (taking up the remainder of the disk)

Now let's analyze some of the pros and cons of the file system that Professor Eggert came up with.

Pros

Cons

Surprisingly, some file systems were built this way: RT-ll (DEC 1970). The developers of RT-ll chose this paradigm because it makes working with real-time systems easier (sequential access is fast and predictable because files are stored in contiguous segments on disk).

FAT File System

Around the same time Professor Eggert was building his file system, engineers at Microsoft were also working on the same problem. They, however, did not want to pre-allocate or run into the above fragmentation issues. So, they came up with the FAT file system, with the following design:

  1. First, leave an empty sector at the beginning of the disk
    • This is used for the boot sector.
  2. The next section is called the super block.
    • The super block contains information about the file system - version number of the file system, size (how many blocks in the file system?), used block count, and other global information.
  3. Next, we have the File Allocation Table, which helps us keep track of where files start and end via a linked list.
    • Each entry in the FAT maps directly to a block on disk; if we have n blocks on disk, then we'll also have n entries in the FAT.
    • Each entry also contains an id that points to the next block to look at when we reach the end of the current cluster.
    • When we're reading a file and we reach the end of a block, we can follow the next pointer to find the next block that we need to read from until we reach an entry whose value is 0.
    • When we're writing to a file, we just need to find an empty entry in the allocation table (look for any entries whose value is -1).

We need at least one root directory in our filesystem - every other file is going to be a descendant of some sort. We can represent files as directory entries, where each entry keeps track of the file's name, where the file starts, and the file's size. Moreover, we can represent directories as files by flipping a few bits that tell us what kind of file we're working with. Regardless, let's say that we pick file i and, by looking at its directory entry, find that it starts at block number j on disk. We can then read from disk, starting at block j until we reach a block whose next value is set to EOF.

Let's say we wanted to list all the files in our file system. We'd first find the root directory and list all normal files directly in it. We'd then find all subdirectories and jump to the starting block specified in the directory entry. When we do so, we can do the same thing we did for the root directory - list all normal files, find all subdirectories and jump to their starting blocks, and so on.

FAT File system

Image courtesy of magic-partition.com.

Directories are just files!

A directory consists of an array of directory entries, where each entry is composed of a:

FAT Pros and Cons

Pros

Cons

Another Note on FAT Files

Let's say you wanted to rename a FAT file within the same directory. This is easy - the FAT file system would just simply
  1. Find the foo.c block.
  2. Load it into RAM.
  3. Then update the file entry's name and write the block again.
But say you wanted to move a FAT file between two distinct directories. This is difficult - the FAT file system would have to create a new block in the new directory and remove the old block from the previous directory, but several issues can arise if the process is interrupted (say, if the OS crashes). The old file could still hang around, for example, or it may even be deleted before the new file has been fully initialized. FAT solves this problem by prohibiting file renaming between multiple directories.