CS 111 Lecture 12: File System Implementation
Scribes: Troy Tabrilla and Trisha Liao
Lecture Date: 02/27/12
Table of Contents
- An Aside: A Bug in Grep
- But We Still Want to Attack!
- Preventing DOS Attacks
- Levels of a Unix-like File System
- Issues with Symlinks
- An Example File System: Linux ext3 v2
- Security Issues
- Wiping Data
- Shred Issues
- Symlinks vs Hard Linkes
- Differences
- Name Traversal for Symlinks
- Miscellaneous Details
An Aside: A Bug in Grep
The command
$ grep –r '^.*x\(\)\1' file
where 'file' contains a line with 231+2 characters would lead to a bug.
Inside grep, the following line of code is executed:
int len = strlen("...x");
where "...x" is the string matching the regular expression above with 231+2 characters.
len would overflow into a negative number, and attackers could use it to subscript into a "negative"
offset of the array. Since an int on a 32-bit machine can only represent at most 231-1,
it is too small to contain the length of the string above. One way to fix the bug is to change len
from an int to a size_t.
size_t len = strlen("...x");
But We Still Want to Attack!
We can launch a DOS (Denial of Service) attack and force grep to run for an extremely long time.
This could easily be done by simply plopping a terabyte file onto the file system. However, this would
waste too much disk space.
A smarter way to make grep take longer would be to abuse our knowledge of the file system's implementation
and feed it specially prepared files. For example, we could use grep on a holey file, which contains a huge
empty area (of zero bytes) and ends in the terminating character that grep is looking for. Representations of holey
files are shown in Figures 1 and 2.
Figure 1. Array representation of a holey File
Figure 2. Inode representation of a holey File
A holey file can be made with the following command:
$echo x | dd >holey bs=1MiB seek=2048
which skips the first 2048 blocks of the output file and writes out an x. To see how much space the file
really takes up, use the -s option:
$ls –ls holey
Preventing DOS Attacks
DOS vulnerabilities can be fixed.
- Grep is for text, and it can do whatever it wants with non-text.
Idea: skip 0’s and treat them as \n’s, so there will be no large buffers.
- Grep should detect holey files and give up if the file is too big.
Idea: use $ls –ls holey and $stat(“holey”, &st) to get the nominal size of the file and the actual
size of the file. Grep should give up if the difference between the two numbers exceeds internal
fragmentation limits.
Levels of a Unix-like File System
From the highest level to the lowest level, a Unix-like file system is roughly organized as follows:
- Symbolic links: files that contain file names. A kind of "shortcut" to a file. Highest level.
- File names: “a/b”, “ foo.c” , and “/a/b///c/d” are all file names. Include the path of the file.
File systems have to deal with mapping file names to file name components.
- File name components: names of individiual directory entries, such as foo.c or usr. The components of the path not
including any slashes.
- Inodes: contain the metadata of files and pointers to the actual data blocks for files. Of type ino_t.
- File systems: contain the superblock (the metadata for the file system), the free block bitmap, inode blocks,
and data blocks.
Figure 3. Unix File System Layout
- Partitions: isolated segments of the disk. Typically, each segment should not affect other segments. For
example, partitions can be divisions of file systems in separate operating systems such as Ubuntu, Windows, etc.
Figure 4. Sample Partitions on Professor Eggert's Computer
- Blocks: can be thought of as a linear array of sectors. Their size is dependent on the file system, e.g. a file system
can define a block to contain 8192 bytes, or 16 sectors.
- Sectors: can be thought of as a linear array of bytes on a disk. The number of bytes in a sector is dependent on the
operating system, e.g. there can be 512 bytes per sector. Lowest level.
Issues with Symlinks
Names of symbolic links that are a terabyte long can be difficult to handle. Setting a name length limit could result in lots
of internal fragmentation if it is too small, so the length should just be the size of a block. Symbolic links can be implemented
in three ways:
- An inode can be dedicated to a symbolic link, with the symbolic link’s contents in a data block. However, for a symbolic
link of length 8 on a file system with 8192 byte data blocks, 8184 bytes are wasted.
Figure 5. Symlink Type A
- Symbolic link contents can be stored in the inode itself. There is a 96 byte limit (8 bytes x 12 block pointer slots) for the
symlink path. A better solution could be to have a bit for switching between using data blocks for symbolic links (option A) or
storing the symbolic link’s contents in an inode (option B).
Figure 6. Symlink Type B
- Symbolic links can be implemented as directory entries. This option is faster than the previous options, but it is not used for
various reasons. There would be a varying length for directory entries and an extra bit used to account for symbolic links.
Figure 7. Symlink Type C
An Example File System: Linux ext3 v2
A directory entry in this system would contain the following properties: a 32-bit inode, a 16-bit directory entry length, an 8-bit
name length, an 8-bit file types, and an N-byte file name. The space after the file name is considered to be unused.
Figure 8. A Linux directory entry.
The 16-bit directory entry length and the 8-bit name length help ensure proper
unlinking. To unlink a directory entry, the preceding directory entry's length is increased to include the unlinked entry. The unused
space after the file name is eventually cleaned up by the operating system. The two lengths help the operating system align the directory entries.
Figure 9. Unlinking a directory entry.
A problem with this file system is that the command “rm –rf” would take O(N2) time to execute. Another directory structure
is needed. Hash tables, trees or other structures can be used, each with their own advantages and disadvantages.
For example, implementing a file system with a hashed directory and using the following commands:
$cd /my/file/system
$rm –rf *
would take O(N) time. However, the data of the files are still accessible. When unlinking, data blocks and inodes are untouched, which leads
to security issues.
Security Issues
Since the data blocks and inodes are untounched after unlinking, a malicious character can still come around and snoop through your data. We want
ways to ensure our data does not fall into the wrong hands.
Wiping Data
Creating new files (e.g. “>f1 >f2 >f3”) would not get rid of the data in inodes. The following are a few ways to delete the data on a disk:
- $dd if=/dev/random of=X
$rm X
The above commands would overwrite the data with random bytes and then unlink the file, but /dev/random is way too slow for us.
/dev/urandom would be faster, but it is still too slow.
- $shred file
This command wipes the data by overwriting it 3 times with random bytes and then unlinks the file. It contains a faster random number generator
than /dev/random or /dev/urandom. It will not work if disk controllers lie about saving data to disk when it is really saved to cache, so caching
needs to be turned off first.
- Melt your disk. The single surefire way to ensure that no one would ever be able to steal your data. User discretion is advised.
Shred Issues
- Bad blocks: Sectors have checksums that let them repair simple errors and remap data to reserved space at the end of the disk before a block becomes bad. Shred
doesn’t overwrite those bad blocks, so an attacker could still find data in them.
Figure 10. Bad blocks on a disk.
- Shred is still too slow: Attackers can quickly steal data before shred is completed. So how can we quickly "shred" our disk? Encrypt it. If the
contents of the disk need to be deleted immediately, toss the key. Problem solved.
Symlinks vs Hard Links
- Hard links - can be made with the command:
$ln d/a e/b (corresponds to link(“a”, “b”)).
- Symbolic links - can be made with the command:
$ln –s d/a e/c (corresponds to symlink(“a”, “c”)).
Figure 11 shows the links created by the above commands.
Figure 11. Example links
Differences
- Symbolic links can dangle, but hard links cannot. Using the commands:
$ rm d/a
$ ls –l e/c
would show "... e/c -> d/a", but trying to follow e/c would lead to an error.
- A symbolic link is relative to the containing directory.
$ ln –s foo d/x
$ ln d/x e/y
$ cat d/x is equivalent to $cat d/foo
$ cat e/y is equivalent to $cat e/foo
- Symbolic links can loop.
$ ln -s a b
$ ln -s b a
An open system call on either a or b would loop, but the operating system finds loops when you follow a symbolic link.
The command open(“a”, …) will fail. There is a loop condition: errno == ELOOP. The system will just give up after 16 path traversals.
Hard links can’t loop because hard links to directories aren't allowed as a rule. It would just cause lots of headaches if they were.
Name Traversal for Symlinks
$ln -s a/b/c link would map the path "a/b/c" to an inode with the name "link". Traversing this symlink involves finding a's inode, checking
to see if it's a directory, checking if the file b exists in it, finding b's inode, etc. until it finds the file c.
Figure 12. First Part of a Symlink Traversal for 'a/b/c'
Some other examples of symlink paths are:
- “a//b/c” - the double slash (a//b) is treated as a single slash (a/b).
- “a/b/c/” - only works if c is a directory; uses a as the working directory.
- “/a/b/c” – starts from the root of the root directory.
Miscellaneous Details
chdir(“a/b”) changes the working directory mode. This is the function call version of the shell command $cd a/b.
chroot(“a, “b”) changes the root directory inode. You can’t go back to the original root directory (for the most part). Used to create a
"chrooted jail" for sandboxing.
./ and ../ may or may not be inodes. This depends on how the file system is implemented.