UCLA CS 111 - File System Implementation

Lecture 12 - November 17, 2014

Scribe Notes by Lawrence Becker, Jacob Moghtader, and Aaron Maughan

This is the third lecture in our study of file systems:

What are . and ..?

Let's open up a shell and type the following command:

$ ls -al

We know that ls lists the contents of the current directory and that the a and l options show the hidden files and file details, respectively. The output of the above command will be a list of the files in the current directory, even the hidden ones. Two of the hidden files in the list are of special interest: . and ...

. is a self pointer to the working directory.

.. points to the parent directory.

How do we implement these pointers in the file system?

Linking Directories

Consider this command:

$ ln / /usr/bin/oops

You can see that we're trying to hard link to a directory, but this isn't allowed. Why not?

Hard links to directories create loops

Linking directories would create cycles. Consider a program like find that uses Depth First Search (recursion). The find program in its current implementation would infinitely loop if it encountered a hard link to a directory. Let's take a look at some possible solutions.

Real-World Example: Cloning Emacs

Consider a git project with the following structure:

Directory before git clone

Let's trying running some commands:

$ cd src
$ git clone a b

Now our file system looks like this:

Directory after git clone

We've just copied the contents of a into a new directory called b. We were shocked to see that the clone operation took only 1.3 seconds to complete despite running on Professor Eggert's creaky old desktop machine that has a 5,400rpm hard disk drive. The project we've just cloned was the source code of Emacs, an open-source project with a 30-year history. How did we copy so much data that quickly?

This fast clone was possible because of hard links. The majority of the project (let's say a few hundred megabytes) are just history - historical files we won't ever need to change. Because these files are read-only, there is no harm in simply creating a hard link to their location. We really only needed to create brand new copies of the files we'll be modifying for our next commit.

Berkeley Fast File System: Symlink Implementation

In the Berkeley Fast File System, inodes are stored separately from the data to which they correspond. An inode contains information about a file, including its type (file, directory, or link), owner, timestamps, and data blocks.

How are symlinks treated differently?

For symbolic links, the data block points to the contents of the symlink. For example, if the link is pointing at /usr/bin, the data block of the inode simply contains the string "/usr/bin".

Symlink inode structure

Consider the namei command, which takes a file path and follows it until a terminal point is reached. It outputs the type and name of each level (a line of output, for example, would be d usr where "d" means "directory" and "usr" is the name of the directory being passed through). This program can be used for resolving confusion around the structure of symbloc links. When it sees a symlink, it substitutes its contents for teh name. We can see some potential problems with this: what if a link points to itself, or what if the file doesn't exist?

If the file doesn't exist, the program can simply fail. In a loop, the program will try to resolve the symbolic link 20 times before giving up: it sets errno to ELOOP. Since 20 is an arbitrary limit, a loop is not guaranteed, but there is a still a good chance that the loop exists.

How do we waste less space with symlinks? Short symlinks are stored in the inode itself. Data blocks are used to store large symlinks. The file system also uses two block sizes where short blocks are 1/4 the size of full blocks.

We can use the functions stat() and lstat() to access information about a file. These two functions are identical except when handling symbolic links: when stat() is given a symlink, it follows the link and fills in the information about the linked file. When lstat() is called on a symlink, the functions returns information about the link itself.

For a symlink "b", unlink("b") does not follow the link but deletes it, leaving the destination file intact.

rename("b", "c") also does not follow the symbolic link.


More File Types
Type Symbol
Directory d
Regular file -
Symbolic link l
File p
Character special c
Block special b
Contiguous file C

Layers of a Unix File System

From top to bottom:

Software File systems and mount
Software Inodes
Software File name components
Software Path names
Software Symlinks
Hardware Partitions
Hardware Blocks
Hardware Sectors

A file system must be mounted before its contents can be accessed by the system. For an everyday use case, consider what happens when you plug a USB drive into your computer. The kernel will mount the drive's file system so you can view and edit its contents, and then unmount it when you're done working. The kernel maintains a table of mounted file systems. An example mount table is as follows:

Pointers Dev, Inode Numbers
FS2 1,350
FS3 1,57
Multiple file systems are mounted to root (/)

To mount a disk, use the mount command:

$ mount /dev/dsk/03 /usr/bin

This invokes a system call to mount /dev/dsk/03 to /usr/bin. To prevent cycles, avoid duplicates in the mount table. Symlinks can work across file systems. The mount command requires root access because mounting a file system that already contains files will hide the existing files until the system is unmounted in the future. This poses many security risks.

shred and Erasing Disks

The shred command is useful for securely erasing files. It overwrites the data in a file with random numbers 3-7 times to prevent any party with physical access to the drive from seeing what was there. Because magnetic hard disks can leave traces of files on the platters, overwriting data multiple times helps to ensure that the data is in no way recoverable.

shred often has to write a lot of data, especially if it's overwriting large files. Let's look at a faster way to delete a file called "little-black-book".

$ rm little-black-book

This command is practically instant. The file won't show up when we ls inside the directory, so it's gone forever, right? Wrong! All we've done is erase the pointer to the file - the data is still physically present on the disk, and if somebody wanted to, they could recover it with little trouble. We've already discussed the security advantages of a command like this:

$ shred little-black-book

Should we just use shred every time? Which file deletion command you pick depends on your security needs and time contraints.

What if we're using a file system that uses redundant data? It's possible that the file we've just shreded is still present on disk, but possibly hidden by the operating system. We could shred the entire file system, but this could take a very long time. At this point the safest, most sure-fire why to remove the file from existence is to physically melt the drive. Indeed, some agencies possess furnaces for this exact purpose. In fact, a common process for permanetly deleting data is soft shred (using the shred command) -> hard shred (physically shredding the disk) -> demagnetizing (using magnets to erase the pieces of the disk) -> melting the pieces.

Supporting Different Types of File Systems in the Same Kernel

There are multiple types of file systems - EXT4, VFAT, and FFS are just a few examples - and we may want to mount them together in one place. We do this by abstracting the file systems using a VFS (Virtual File Systems) layer.

The VFS is implemented in the style of many object-oriented programs. For example, struct inode_operations contains pointers to functions such as link, unlink, and open. struct inode_operations contains pointers to read and write.

Sources

Valid HTML5! Valid CSS!