CS 111 Lecture 12: File System Implementation

Scribes: Troy Tabrilla and Trisha Liao

Lecture Date: 02/27/12

An Aside: A Bug in Grep
1. But We Still Want to Attack!
2. Preventing DOS Attacks
Levels of a Unix-like File System
Issues with Symlinks
An Example File System: Linux ext3 v2
Security Issues
1. Wiping Data
2. Shred Issues
Symlinks vs Hard Linkes
1. Differences
2. Name Traversal for Symlinks
Miscellaneous Details

An Aside: A Bug in Grep

The command

$ grep –r '^.*x\1' file

where 'file' contains a line with 2³¹+2 characters would lead to a bug. Inside grep, the following line of code is executed:

int len = strlen("...x");

where "...x" is the string matching the regular expression above with 2³¹+2 characters. len would overflow into a negative number, and attackers could use it to subscript into a "negative" offset of the array. Since an int on a 32-bit machine can only represent at most 2³¹-1, it is too small to contain the length of the string above. One way to fix the bug is to change len from an int to a size_t.

size_t len = strlen("...x");

But We Still Want to Attack!

We can launch a DOS (Denial of Service) attack and force grep to run for an extremely long time. This could easily be done by simply plopping a terabyte file onto the file system. However, this would waste too much disk space.

A smarter way to make grep take longer would be to abuse our knowledge of the file system's implementation and feed it specially prepared files. For example, we could use grep on a holey file, which contains a huge empty area (of zero bytes) and ends in the terminating character that grep is looking for. Representations of holey files are shown in Figures 1 and 2.

Figure 1. Array representation of a holey File Inode representation of a holey file.

Figure 2. Inode representation of a holey File

A holey file can be made with the following command:

$echo x | dd >holey bs=1MiB seek=2048

which skips the first 2048 blocks of the output file and writes out an x. To see how much space the file really takes up, use the -s option:

$ls –ls holey

Preventing DOS Attacks

DOS vulnerabilities can be fixed.

Grep is for text, and it can do whatever it wants with non-text.
Idea: skip 0’s and treat them as \n’s, so there will be no large buffers.
Grep should detect holey files and give up if the file is too big.
Idea: use $ls –ls holey and $stat(“holey”, &st) to get the nominal size of the file and the actual size of the file. Grep should give up if the difference between the two numbers exceeds internal fragmentation limits.

Levels of a Unix-like File System

From the highest level to the lowest level, a Unix-like file system is roughly organized as follows:

Symbolic links: files that contain file names. A kind of "shortcut" to a file. Highest level.
File names: “a/b”, “ foo.c” , and “/a/b///c/d” are all file names. Include the path of the file. File systems have to deal with mapping file names to file name components.
File name components: names of individiual directory entries, such as foo.c or usr. The components of the path not including any slashes.
Inodes: contain the metadata of files and pointers to the actual data blocks for files. Of type ino_t.
File systems: contain the superblock (the metadata for the file system), the free block bitmap, inode blocks, and data blocks.

Figure 3. Unix File System Layout
Partitions: isolated segments of the disk. Typically, each segment should not affect other segments. For example, partitions can be divisions of file systems in separate operating systems such as Ubuntu, Windows, etc.
Figure 4. Sample Partitions on Professor Eggert's Computer
Blocks: can be thought of as a linear array of sectors. Their size is dependent on the file system, e.g. a file system can define a block to contain 8192 bytes, or 16 sectors.
Sectors: can be thought of as a linear array of bytes on a disk. The number of bytes in a sector is dependent on the operating system, e.g. there can be 512 bytes per sector. Lowest level.

Issues with Symlinks

Names of symbolic links that are a terabyte long can be difficult to handle. Setting a name length limit could result in lots of internal fragmentation if it is too small, so the length should just be the size of a block. Symbolic links can be implemented in three ways:

An inode can be dedicated to a symbolic link, with the symbolic link’s contents in a data block. However, for a symbolic link of length 8 on a file system with 8192 byte data blocks, 8184 bytes are wasted. Figure 5. Symlink Type A
Symbolic link contents can be stored in the inode itself. There is a 96 byte limit (8 bytes x 12 block pointer slots) for the symlink path. A better solution could be to have a bit for switching between using data blocks for symbolic links (option A) or storing the symbolic link’s contents in an inode (option B). Figure 6. Symlink Type B
Symbolic links can be implemented as directory entries. This option is faster than the previous options, but it is not used for various reasons. There would be a varying length for directory entries and an extra bit used to account for symbolic links. Figure 7. Symlink Type C

An Example File System: Linux ext3 v2

A directory entry in this system would contain the following properties: a 32-bit inode, a 16-bit directory entry length, an 8-bit name length, an 8-bit file types, and an N-byte file name. The space after the file name is considered to be unused.

Figure 8. A Linux directory entry.
The 16-bit directory entry length and the 8-bit name length help ensure proper unlinking. To unlink a directory entry, the preceding directory entry's length is increased to include the unlinked entry. The unused space after the file name is eventually cleaned up by the operating system. The two lengths help the operating system align the directory entries. Unlinking a directory entry.

Figure 9. Unlinking a directory entry.

A problem with this file system is that the command “rm –rf” would take O(N²) time to execute. Another directory structure is needed. Hash tables, trees or other structures can be used, each with their own advantages and disadvantages.

For example, implementing a file system with a hashed directory and using the following commands:

$cd /my/file/system
$rm –rf *

would take O(N) time. However, the data of the files are still accessible. When unlinking, data blocks and inodes are untouched, which leads to security issues.

Security Issues

Since the data blocks and inodes are untounched after unlinking, a malicious character can still come around and snoop through your data. We want ways to ensure our data does not fall into the wrong hands.

Wiping Data

Creating new files (e.g. “>f1 >f2 >f3”) would not get rid of the data in inodes. The following are a few ways to delete the data on a disk:

$dd if=/dev/random of=X
$rm X

The above commands would overwrite the data with random bytes and then unlink the file, but /dev/random is way too slow for us. /dev/urandom would be faster, but it is still too slow.
$shred file

This command wipes the data by overwriting it 3 times with random bytes and then unlinks the file. It contains a faster random number generator than /dev/random or /dev/urandom. It will not work if disk controllers lie about saving data to disk when it is really saved to cache, so caching needs to be turned off first.
Melt your disk. The single surefire way to ensure that no one would ever be able to steal your data. User discretion is advised.

Shred Issues

Bad blocks: Sectors have checksums that let them repair simple errors and remap data to reserved space at the end of the disk before a block becomes bad. Shred doesn’t overwrite those bad blocks, so an attacker could still find data in them.
Figure 10. Bad blocks on a disk.
Shred is still too slow: Attackers can quickly steal data before shred is completed. So how can we quickly "shred" our disk? Encrypt it. If the contents of the disk need to be deleted immediately, toss the key. Problem solved.

Symlinks vs Hard Links

Hard links - can be made with the command:
$ln d/a e/b (corresponds to link(“a”, “b”)).
Symbolic links - can be made with the command:
$ln –s d/a e/c (corresponds to symlink(“a”, “c”)).

Figure 11 shows the links created by the above commands.

Figure 11. Example links

Differences

Symbolic links can dangle, but hard links cannot. Using the commands:
$ rm d/a
$ ls –l e/c
would show "... e/c -> d/a", but trying to follow e/c would lead to an error.
A symbolic link is relative to the containing directory.
$ ln –s foo d/x
$ ln d/x e/y
$ cat d/x is equivalent to $cat d/foo
$ cat e/y is equivalent to $cat e/foo
Symbolic links can loop.
$ ln -s a b
$ ln -s b a
An open system call on either a or b would loop, but the operating system finds loops when you follow a symbolic link. The command open(“a”, …) will fail. There is a loop condition: errno == ELOOP. The system will just give up after 16 path traversals. Hard links can’t loop because hard links to directories aren't allowed as a rule. It would just cause lots of headaches if they were.

Name Traversal for Symlinks

$ln -s a/b/c link would map the path "a/b/c" to an inode with the name "link". Traversing this symlink involves finding a's inode, checking to see if it's a directory, checking if the file b exists in it, finding b's inode, etc. until it finds the file c.

Figure 12. First Part of a Symlink Traversal for 'a/b/c'

Some other examples of symlink paths are:

“a//b/c” - the double slash (a//b) is treated as a single slash (a/b).
“a/b/c/” - only works if c is a directory; uses a as the working directory.
“/a/b/c” – starts from the root of the root directory.

Miscellaneous Details

chdir(“a/b”) changes the working directory mode. This is the function call version of the shell command $cd a/b.
chroot(“a, “b”) changes the root directory inode. You can’t go back to the original root directory (for the most part). Used to create a "chrooted jail" for sandboxing.

./ and ../ may or may not be inodes. This depends on how the file system is implemented.