Lecture 12: File System Implementation
compiled by Danial Sajed & Paul Servino
Levels of Implementation
Layers of abstraction in a typical Unix-like file system
- Symbolic Links
- File Names
- ('/usr/bin/firefox') [absolute]
- ('../bin/sh') [relative]
- File Name Components
- directories map file name components to inode numbers
- Inodes
- numbers that represent files
- contain metadata for files
- File System
- Partitions
- Numbers that represent files
- Contain metadata for files
- Blocks
- typically 8192-bytes
- Sectors
- 512-bytes
Secure File Removal
Unlinking:
unlink('f');
Insecure because:
- File could be open.
- There could multiple hard links to file.
Erase data:
open(f,O_WRONLY | O_TRUNC );
Insecure because:
- Doesn't actually erase disc blocks.
Overwrite data:
open(f, O_WRONLY, );
fstat(fd, &st);
write(fd, zero_buffer,st.st_size);
Not quite secure because:
- Doesn't quite erase data.
- One can read from next to where the write head writes.
Solution: Write random data, multiple times:
Used in linux -- shred in many systems, this will still fail because write writes to a log, rather than the actual block of filespace. In this case, shred the entire partition
Most Secure Methods:
- Encryption. Encrypt entire disk. Commit key tomemory.
- Melt disk (completely secure).
In a RAID system:
A raid system contains two cloned disks (see diagram). This enables faster performance.
This make data harder to remove. If a drive fails and is replaced, the discarded drive will contain the crucial data until it is properly melted.
Directory Entries
Layout
Unix directory entry circa 1979:
Uniformly 16 bytes = [ 14 bytes(name) ][ 2 bytes (inode number) ]
Limit on file name size.
A more modern approach:
[ bytes in name (typically max 255) ][ name ][ file type ][ inode number]
Variable name length.
rm -rf method
open() - open directory for read
readdir() - read entries
unlink() as we go (won't work for subdirectories)
-use readdir to get next directory entry.
If it's a directory, recurse & rmdir()
else unlink
Finding Directory Entries
open('a/b/c', O_RDONLY);
Find 'a', directory entry inside working directory each process has a working directory stored in the process descriptor can be changed with chdir('')
looks for b inside a's entry
looks for 'c' inside b's entry
find's 'c'
if c doesn't exist, it creates it
Changing the root directory
We can use chroot to change root.
Chroot causes problems think of redefining (/etc/passwd).
Because of this chroot can only be run by root.
One can use chroot to create a chrooted jail.
Max Filename Length
char buf[1000000001];
for(i = 0; i < sizeof buf/2; i++)
strcpy(buf+i*2, '/a');
open(buf, O_RDONLY);
Linux: 1024 byte max file name (length on string passed to open).
e.g. 'a/a/a/a/a/a......./a/a/a/a/'.
Linking Directories
Unix Directories
'.' = (current dir's inode number)
'..' = (parent's inode number)
Therefore, every directory has a link count of at least two (it's parent and itself).
Problem
Cannot link directories to other directories.
Causes problems for deletion (can have nonzero link count, whilst directory is not accessible).
Solution
Cannot call link or unlink on directories.
Instead, use symbolic links.
Symbolic Links
Special file type: contents are file name, interpreted relative to a directory containing symlink.
Interpretted relative to the directory containing the symlink.
Can dangle and don't count for gardbage collecting.
Permissions on symlinks are ignored.
symlink('a','b');
File b contains the name 'a'. 'a' need not exist.
Aside: Random Numbers
Methods:
- A hardware random number generator
- Costly.
- Trust issues.
- Pseudo random number generator
- Can be cracked because keys are small and algorithm is too predictable.
- Low order bits of inter-arrival times from mouse & keyboard & packet arrival times...
- /dev/random (← exhausts entropy pool)
- /dev/urandom (← pool + pseudo random number generator)