CS 111 Lecture 12
Scribe Notes for 5/12/2014
by Andrew Foglia and Anthony Rivera

Today's Lecture:
File System Implementation

Let's revisit the paranoid professor. He wants to reliably remove data from the file system but is afraid people may still be able to read it depending on the file system. The professor can try multiple approaches ("file" is the file we are attempting to remove):

$ rm file



$ rm file
$ reboot



Where is this remove function implemented?

FAT file system

UNIX file system



$ shred file



So how do we remove a file and satisfy our paranoid professor?

Can still use shred, but instead try erasing one file at a time and erasing the whole file system:
$ su
$ shred /dev/rst/0
But there is a catch!
- Disks are flakey
- Blocks can be bad on occasion
- Disk drive manufacturer reserves extra area to recover from a bad block (using a bad block map)
- For each bad block, the map will tell us where the good version is
- At the low level, we can recover data from looking at the bad blocks


Summary


Levels of a File System (UNIX)

This is the typical layout of a UNIX file system (from highest to lowest level):

Symbolic Links
File Names
File Name Components
inodes

---- FILE SYSTEMS BOUNDARY ----

Partitions
Blocks (81952 bytes)
Sectors (512 bytes)

What is all this stuff?


Sectors


Blocks


Partitions


Above the File System Boundary

inodes
- Representation of a file
- Talks about a file no matter how you get to that file
File Name Components
- All parts of the file name except the '/'
- e.g. user, bin, and grep in /user/bin/grep
File Names
- Also known as "pathnames"
Symbolic Links
- Operate above the file name
- Abstract

inodes

inodes are an integral part of any file system. What is in an inode and what isn't?

Free Space in UNIX File System

Look at all the inodes

- If a block is not being pointed at then it is free
- All blocks - {blocks in use} can be an expensive computation
- Not practical
Need an efficient way to detect free space and where it is

- FAT: If entry is -1 then it is free space
- UNIX: How do we tell...
A solution for UNIX: BSD File System

+ Fast to allocate storage
- I/O needed to allocate or free anything
- Cache in memory may not be same as what is on disk (constant checks are expensive)
- Want to minimize seek time (average seek time is ~10 ms + rotational latency of ~8 ms)
Note: When growing a file we want contiguous data which speeds sequential I/O if no other activity

What is NOT in an inode?

Bad performance:

Must go right to disk whenever a file is opened
Kernel must keep track of which processes have the file open in main memory
Reboot clears all data anyway (don't want to clear later)
Security issues

Hard Links

Many arguments are made against the use of hard links. Hard links can complicate the user model. Also, without them it is easy to compute the parent directory file name from the file. They can also cause some loops which could tamper with your system (except this is not possible in UNIX).

Question: How do we avoid hard-link loops?

Answer: We prohibit hard links to directories. No loop can be created since the link cannot point anywhere else.

Why bother with hard links?

In the Original UNIX File System

Instead of:
rename("a", "b")

Use:
link("a", "b")
unlink("a")




Fun Stuff with Hard Links

$ git clone glibc glibc-copy


A Typical Directory Entry (ext2)

Say we want to remove a directory entry...

Example Filetypes


Looking at the directory entry will give you some information about the file, however the inode also has the above information (which is also cached). Be careful about this cache coherence, where Linux addresses by saying a file cannot change its type.


The lecture was interrupted by a fire alarm, so notice shorter notes than expected.