CS 111 Review: Inodes and Indirect Blocks
Lecture Date: 2-23-15

By Davin Hau, Kevin Lu

Disk Space wasted in File System Storage:

   File Size: 0 Bytes ⇒48 Bytes (Size of Inode) wasted
   File Size: 1 Byte
   ⇒Bytes wasted in Block: 8192 - File Size = 8191 Bytes
   ⇒Bytes wasted in Inode: 48 - 1 File Entry (4 Bytes) = 44 Bytes
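
The arithmetic above can be sketched as two small helpers. The 8192-byte block, 48-byte inode, and 4-byte entry sizes come from these notes; the function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

enum { BLOCK_SIZE = 8192, INODE_SIZE = 48, ENTRY_SIZE = 4 };

/* Bytes left unused in the last data block of a file. */
size_t wasted_in_last_block(size_t file_size)
{
    size_t used = file_size % BLOCK_SIZE;
    return used == 0 ? 0 : BLOCK_SIZE - used;
}

/* Bytes of the inode not occupied by block entries. */
size_t wasted_in_inode(size_t entries_used)
{
    assert(entries_used * ENTRY_SIZE <= INODE_SIZE);
    return INODE_SIZE - entries_used * ENTRY_SIZE;
}
```

For a 1-byte file, wasted_in_last_block(1) gives 8191 and wasted_in_inode(1) gives 44, matching the figures above.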

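Holey files are files that contain few bytes of actual data but have a very large logical size. The holes themselves are not stored, but the indirect blocks needed to reach the data still take up disk space.
They can be generated by calling lseek on a file descriptor to seek far past the end of the file before writing, forcing the use of doubly and triply indirect blocks.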

Example: lseek creates a file of 1 trillion and 5 bytes; only the last 5 bytes of the file are used

   int fd=open("foo", O_CREAT|O_WRONLY|O_TRUNC, 0666);
   if (lseek(fd, 1000000000000, SEEK_SET) < 0) error();
   write(fd, "hello", 5);

   Data Written: 5 Bytes ("hello")

File Structure Component   Bytes Used   Bytes Wasted
Inode                      4            44
Triply Indirect Block      4            8188
Doubly Indirect Block      4            8188
Indirect Block             5            8187
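However, the block count reported by ls (e.g. ls -s) shows that the file does not take up much disk space, even though its size is listed as about 1 trillion bytes.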
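
The difference between logical size and allocated space can be checked with fstat. The following is a minimal sketch, assuming a file system that supports holes and a platform where st_blocks counts 512-byte units (true on Linux, not guaranteed by POSIX); make_holey_file is a hypothetical helper name.

```c
#define _XOPEN_SOURCE 700     /* request POSIX.1-2008 declarations */
#define _FILE_OFFSET_BITS 64  /* 64-bit off_t so a trillion-byte offset fits */
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create `path` with a hole of `hole` bytes followed by "hello".
   On success stores the logical size and the allocated bytes, returns 0. */
int make_holey_file(const char *path, off_t hole, off_t *logical, off_t *allocated)
{
    assert(path != NULL && hole >= 0);
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0666);
    if (fd < 0) return -1;
    struct stat st;
    int ok = lseek(fd, hole, SEEK_SET) == hole
          && write(fd, "hello", 5) == 5
          && fstat(fd, &st) == 0;
    close(fd);
    if (!ok) return -1;
    *logical = st.st_size;                   /* the size that ls -l lists */
    *allocated = (off_t)st.st_blocks * 512;  /* assumption: 512-byte units */
    return 0;
}
```

Calling it with a hole of 1000000000000 bytes reproduces the example above; on a file system with holes, the allocated figure should come out far smaller than the logical size.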

Map a relational database into a 10^12-byte file
Advantages:
  Cheap file to create, because the zeroes only act as placeholders and are not stored on disk
Disadvantages:
  The underlying file system could run out of disk space before the database reaches its nominal size

A disk is partitioned into multiple file systems so that the damage stays limited to one partition when a file system fills up.

File System Navigation

Mount Table

Implementation: Map Device Numbers and Inode Numbers to a File System
Each entry has a pointer to a file system. The user sees only one tree, since the directories underneath a mount point are hidden by the file system mounted on top of them.
Directory Tree   Number of Files   Pointer to FS
1                13                →fs3
2                40                →fs2
3                40                →fs4
4                132               →fs1
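
A mount-table lookup can be sketched as a linear search. This is only an illustration: the struct layout is hypothetical, and treating the first two columns of the table above as (device number, inode number) keys is an assumption based on the implementation note.

```c
#include <assert.h>
#include <stddef.h>

/* One mount-table entry: the directory identified by (dev, ino) is covered
   by the root of file system `fs_id`. All fields are illustrative. */
struct mount_entry { int dev; int ino; int fs_id; };

/* Return the file system mounted on (dev, ino), or -1 if none is. */
int mount_lookup(const struct mount_entry *tab, size_t n, int dev, int ino)
{
    assert(tab != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (tab[i].dev == dev && tab[i].ino == ino)
            return tab[i].fs_id;
    return -1;
}
```

During name resolution, a lookup that finds an entry would continue in the root directory of the returned file system instead of descending into the covered directory.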
PROBLEM: Loops may be possible in the file system
Example:
  map /etc → filesystem 3
  map /etc/foo → filesystem 4
  map /etc/foo/bar → filesystem 1 (root directory)
User would see an infinitely long tree: /etc/foo/bar/etc/foo/bar...

SOLUTION: Implement system call: mount(src,dest)
The mount system call fails if the file system is already mounted elsewhere, EXCEPT for loopback mounts, which reintroduce the looping problem.

Support for Multiple File System Types

Structure:
  Use object-oriented programming to support file systems
  File Systems are treated as objects
  Virtual File System layer implemented in C

Example:
  /usr ext4 (Journaling File System)
  /home vfat (File Allocation Table)
  /home/eggert nfs (implementation via network server)
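
The object-oriented style in C usually comes down to a struct of function pointers. The sketch below is not the kernel's actual VFS interface; fs_ops, file_system, toy_read, and vfs_read are hypothetical names used to show the dispatch pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "Class" of a file system: a table of function pointers (hypothetical ops). */
struct fs_ops {
    const char *type;                            /* e.g. "ext4", "vfat", "nfs" */
    long (*read)(void *self, char *buf, long n);
};

/* "Object": per-mount state plus a pointer to its operations. */
struct file_system {
    const struct fs_ops *ops;
    void *state;
};

/* Toy implementation standing in for a real driver: fills buf with 'x'. */
static long toy_read(void *self, char *buf, long n)
{
    (void)self;
    memset(buf, 'x', (size_t)n);
    return n;
}

static const struct fs_ops toy_ops = { "toyfs", toy_read };

/* Generic layer: callers never need to know which type they hold. */
long vfs_read(struct file_system *fs, char *buf, long n)
{
    assert(fs != NULL && fs->ops != NULL && n >= 0);
    return fs->ops->read(fs->state, buf, n);
}
```

Each mounted file system (/usr, /home, /home/eggert in the example) would carry its own ops table, and the generic layer dispatches through it.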

File Name Resolution

There is a function namei that loops over the components of a filename.
Starting from the inode of the current directory, it looks up the next file name component in that directory to obtain the next inode number.
Errors:
  If a file is missing: ENOENT
  If permission is denied: EACCES
Beginning of filenames
  With "/": Start at root directory
  Without "/": Start at working directory
Functions:
  chdir: Change working directory
  chroot: Change root directory
NOTE: Changing the root directory to a lower directory causes files outside the new root to disappear, since chroot("..") does not do what "cd .." does
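
The namei loop can be sketched over a toy in-memory directory table. This is not the kernel's implementation: dirent_toy and namei_toy are hypothetical, permission checks and symlinks are left out, and a path beginning with "/" is resolved from the root while any other path is resolved from the working directory.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy directory entry: in directory `dir`, `name` maps to inode `ino`. */
struct dirent_toy { int dir; const char *name; int ino; };

/* Walk `path` one component at a time. Absolute paths start at `root`,
   relative ones at `cwd`. Returns the inode number, or -ENOENT. */
int namei_toy(const struct dirent_toy *tab, size_t n,
              int root, int cwd, const char *path)
{
    assert(path != NULL);
    int cur = (path[0] == '/') ? root : cwd;
    while (*path) {
        while (*path == '/') path++;          /* skip separators */
        size_t len = strcspn(path, "/");      /* length of next component */
        if (len == 0) break;
        int next = -1;
        for (size_t i = 0; i < n; i++)
            if (tab[i].dir == cur && strlen(tab[i].name) == len
                && strncmp(tab[i].name, path, len) == 0)
                next = tab[i].ino;
        if (next < 0) return -ENOENT;         /* component not found */
        cur = next;
        path += len;
    }
    return cur;
}
```

In this model chdir amounts to changing the cwd argument and chroot to changing the root argument.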

Hacking Example:
  1) Set up playpen directory similar to root directory
  2) Set up password using su
  3) Use chroot to force system into thinking playpen has password
  Hacking Succeeds..?
Reality: Won't work because chroot requires root privileges

Safety Measures: Setting up chrooted jails
Allows us to run dangerous programs we don't completely trust
HOWEVER, experts can escape the jail...
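
The privilege requirement can be observed directly. The following sketch attempts the chroot in a child process so that the caller's own root never changes; try_chroot_in_child is a hypothetical helper, and a real jail would also drop privileges after the chroot.

```c
#define _DEFAULT_SOURCE   /* chroot() is not in current POSIX; ask glibc for it */
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to enter a chroot jail at `dir` in a child process. Returns 0 if the
   child was jailed, otherwise the errno value reported by the child
   (EPERM without privileges), or -1 if the child could not be run. */
int try_chroot_in_child(const char *dir)
{
    assert(dir != NULL);
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        /* chdir first so the working directory ends up inside the jail. */
        if (chdir(dir) != 0 || chroot(".") != 0)
            _exit(errno);
        _exit(0);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

Run as an ordinary user, the call is expected to report EPERM, which is why the hacking example above fails.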

Symbolic Links

  $ln -s foo/bar a/b
  $ls -l a/b
    lrwxrwxrwx eggert eggert 23 Jun 2015 b->foo/bar

namei replaces the symbolic link with the path it contains:
   a/b/c → a/foo/bar/c

Advantages                      Disadvantages
Cross file system boundaries    Loops are possible
Link to directories
Link to nothing

SOLUTION: At most 20 symlink expansions per name lookup, or fail with ELOOP
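
The ELOOP failure is easy to trigger with two symlinks that point at each other. A minimal sketch, assuming the given directory is writable and does not already contain the names loop_a and loop_b (both are hypothetical names); the exact expansion limit is up to the kernel.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Inside directory `dir`, make two symlinks that point at each other and
   try to open one. Returns the errno from open(), expected to be ELOOP,
   0 if open() unexpectedly succeeded, or -1 on a setup failure. */
int open_symlink_loop(const char *dir)
{
    char a[4096], b[4096];
    assert(dir != NULL);
    if (snprintf(a, sizeof a, "%s/loop_a", dir) >= (int)sizeof a) return -1;
    if (snprintf(b, sizeof b, "%s/loop_b", dir) >= (int)sizeof b) return -1;
    /* Relative targets are resolved against the directory holding the link. */
    if (symlink("loop_b", a) != 0) return -1;
    if (symlink("loop_a", b) != 0) { unlink(a); return -1; }
    int fd = open(a, O_RDONLY);
    int err = (fd < 0) ? errno : 0;
    if (fd >= 0) close(fd);
    unlink(a);   /* unlink removes the links themselves, no following */
    unlink(b);
    return err;
}
```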

Are there inode numbers for symlinks?

YES:
  Option 1
    Data = symlink contents
     $ln -s a b
     $ln b c
     $ls -l
     2 b -> a
     2 c -> a
     $mv c /some/other/dir
  Option 2
    Put information about link directly in inode
NO:
  Two types of directory entries
    Ordinary: Filename maps to inode number
    Symlink: Link name maps to string

For Option 1: After the move, the same symlink (inode 2) can refer to two different files, because the relative target a is resolved in each link's own directory!

Symlink permissions
   Symlink permissions are mostly junk, since symlinks can be read but not written to.
What is important, however, is the owner of the symlink: symlinks take up space, and that space needs to be attributed to some owner to count towards that owner's disk quota.
lstat("foo", &st); Does not follow symlinks; fills in st with info about the symlink itself.
stat("foo", &st); As opposed to lstat, stat does follow the symlink.
readlink("foo", buf, bufsize); Reads the contents of the symlink itself rather than following it to its destination.
unlink(); Does not follow the link; removes the symlink, not the file it points to.
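
The difference between these calls can be seen side by side. A minimal sketch, assuming both paths are writable and the link name does not yet exist; inspect_symlink is a hypothetical helper.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create regular file `target` and symlink `link` pointing at it, then
   report what each call sees. Returns the length of the link text stored
   in buf, or -1 on any failure. */
int inspect_symlink(const char *target, const char *link,
                    int *lstat_is_link, int *stat_is_regular,
                    char *buf, size_t bufsize)
{
    struct stat st;
    assert(bufsize > 0);
    int fd = open(target, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) return -1;
    close(fd);
    if (symlink(target, link) != 0) return -1;

    if (lstat(link, &st) != 0) return -1;   /* describes the symlink itself */
    *lstat_is_link = S_ISLNK(st.st_mode);
    if (stat(link, &st) != 0) return -1;    /* follows the link to the file */
    *stat_is_regular = S_ISREG(st.st_mode);

    /* readlink does not NUL-terminate, so leave room and terminate by hand. */
    ssize_t n = readlink(link, buf, bufsize - 1);
    if (n < 0) return -1;
    buf[n] = '\0';
    return (int)n;
}
```

Calling unlink on the link afterwards removes only the symlink; the target file stays in place until it is unlinked separately.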

Device Drivers

crw-rw-rw- 1, 3 /dev/null
$mknod /dev/null c 1 3
The numbers indicate the driver (major) number (1) and the device (minor) number (3).
/dev/null throws away everything written to it.
/dev/zero returns as many zeroes as you want.
/dev/full always reports that it is full: writes to it fail with ENOSPC.
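
The behavior of these devices can be probed from C. A minimal sketch, assuming a Linux-style /dev (some systems have no /dev/full); write_to_device and read_zeros are hypothetical helper names.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Write n bytes of `data` to the device at `path`. Returns 0 if the write
   did not fail, the errno value if it did, or -1 if the device cannot be
   opened. */
int write_to_device(const char *path, const char *data, size_t n)
{
    assert(path != NULL && data != NULL);
    int fd = open(path, O_WRONLY);
    if (fd < 0) return -1;
    ssize_t w = write(fd, data, n);
    int err = (w < 0) ? errno : 0;
    close(fd);
    return err;
}

/* Read up to n bytes from /dev/zero into buf; returns the count or -1. */
ssize_t read_zeros(char *buf, size_t n)
{
    assert(buf != NULL);
    int fd = open("/dev/zero", O_RDONLY);
    if (fd < 0) return -1;
    ssize_t r = read(fd, buf, n);
    close(fd);
    return r;
}
```

Writing to /dev/null is expected to succeed silently, while the same write to /dev/full is expected to report ENOSPC.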

Named Pipes

  $mkfifo foo   # named pipe
  $ls -l foo
  prw-rw-rw- <> foo

--In first terminal--

  $cat foo

--In second terminal--

  $echo blah >foo

--In first terminal--

  blah
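
The same two-terminal exchange can be sketched in one C program, with a child process playing the part of echo and the parent playing the part of cat. fifo_roundtrip is a hypothetical helper, and the path passed in must not already exist.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create a FIFO at `path`, have a child write "blah\n" into it while the
   parent reads it. Returns the number of bytes read into buf, or -1. */
ssize_t fifo_roundtrip(const char *path, char *buf, size_t bufsize)
{
    assert(path != NULL && buf != NULL);
    if (mkfifo(path, 0666) != 0) return -1;
    pid_t pid = fork();
    if (pid < 0) { unlink(path); return -1; }
    if (pid == 0) {
        int wfd = open(path, O_WRONLY);   /* blocks until a reader opens */
        if (wfd < 0) _exit(1);
        ssize_t w = write(wfd, "blah\n", 5);
        close(wfd);
        _exit(w == 5 ? 0 : 1);
    }
    int rfd = open(path, O_RDONLY);       /* blocks until the writer opens */
    ssize_t n = -1;
    if (rfd >= 0) {
        n = read(rfd, buf, bufsize);
        close(rfd);
    } else {
        kill(pid, SIGKILL);               /* do not leave the writer stuck */
    }
    waitpid(pid, NULL, 0);
    unlink(path);
    return n;
}
```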

Low-level Problems

PROBLEM: Need to purposely corrupt top-secret data
Solution 1: chmod 0
   Doesn't work: an attacker can mount the disk on a different system and chmod 444 the file
Solution 2: rm file
  File contents are still sitting on disk
Solution 3: Shred file
  Implementation: Overwrites file multiple times with gibberish bytes and zero bytes, then unlink

However, shredding the file itself is not sufficient: the file system may allocate new data blocks for the overwrites and free the old blocks, whose data is still there.
Instead, we need to shred at the file system level, e.g. shred /dev/dsk/03, but this process is really slow.
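
The file-level overwrite from Solution 3 can be sketched as follows. This is not how the real shred utility is implemented: shred_file and overwrite_pass are hypothetical, rand() is not a cryptographically strong source of gibberish, and, as noted above, the file system may still keep old copies of the blocks.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Overwrite the first `size` bytes of fd with random bytes or with zeros. */
static int overwrite_pass(int fd, off_t size, int use_random)
{
    char buf[4096];
    if (lseek(fd, 0, SEEK_SET) != 0) return -1;
    for (off_t done = 0; done < size; ) {
        size_t chunk = (size - done < (off_t)sizeof buf)
                     ? (size_t)(size - done) : sizeof buf;
        for (size_t i = 0; i < chunk; i++)
            buf[i] = use_random ? (char)rand() : 0;
        ssize_t w = write(fd, buf, chunk);
        if (w <= 0) return -1;
        done += w;
    }
    return fsync(fd);   /* push this pass to the device before the next */
}

/* Overwrite `path` with `passes` rounds of gibberish, then one round of
   zeros, then unlink it. Returns 0 on success, -1 on failure. */
int shred_file(const char *path, int passes)
{
    assert(path != NULL && passes >= 0);
    int fd = open(path, O_WRONLY);
    if (fd < 0) return -1;
    struct stat st;
    int rc = fstat(fd, &st);
    for (int i = 0; rc == 0 && i < passes; i++)
        rc = overwrite_pass(fd, st.st_size, 1);
    if (rc == 0) rc = overwrite_pass(fd, st.st_size, 0);
    close(fd);
    if (rc == 0) rc = unlink(path);
    return rc;
}
```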

Best solution: Encrypt file and throw away key
Most entertaining solution: Melt the physical disk