CS 111 section 2

Scribe notes for 5/15/2007

by Keith Stevens and Collinn Nielsen

More on File Systems

Hard Links

Hard links occur when two or more directory entries map to the same inode number.

Example code regarding hard links:
  • To create: link("/a/d/n", "/b/m");
  • To remove: unlink("/b/m");

This example creates a directory entry named "m" in the directory "/b". Its inode number will be the same as the number found in "/a/d/" for the entry named "n".
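The effect of link() and unlink() can be observed through stat(). The following is a minimal sketch, assuming a POSIX system and a file system that supports hard links; Python's os.link and os.unlink wrap the system calls of the same names, and the file names and the function hard_link_demo are made up for illustration.

```python
import os
import tempfile

def hard_link_demo():
    """Create a file, hard-link it, and report what stat() sees."""
    with tempfile.TemporaryDirectory() as root:
        original = os.path.join(root, "n")   # hypothetical names
        alias = os.path.join(root, "m")
        with open(original, "w") as f:
            f.write("hello")

        os.link(original, alias)             # link(existing, new)
        st_original = os.stat(original)
        st_alias = os.stat(alias)
        same_inode = st_original.st_ino == st_alias.st_ino
        links_after_link = st_original.st_nlink

        os.unlink(alias)                     # remove the second name again
        links_after_unlink = os.stat(original).st_nlink
        return same_inode, links_after_link, links_after_unlink
```

Both names report the same inode number, and the link count rises by one with link() and falls by one with unlink().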

Possible issues with hard links, and their answers

  • What if you unlink the last hard link?
    *The link count drops to 0 and the OS reclaims the storage.
  • What if the link count becomes 0 while the file is still open?
    *The OS waits until the file is closed before reclaiming the storage.
  • What if you do link("/a/b/m", "/a/b/m");?
    *This is an error: the destination must not already exist.
  • Suppose you unlink the same destination twice.
    *The second unlink returns an error, because the name no longer exists.
  • Suppose you try to unlink a directory.
    *Unlinking a directory is not allowed.
  • Then how do you remove a directory?
    *Use rmdir("/a"); the directory must be empty first.
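Several of these cases can be tried directly. This is a rough sketch, assuming a POSIX system (reading a file after its last name is unlinked does not work the same way everywhere); the helpers try_call and hard_link_edge_cases and all path names are invented for this example.

```python
import os
import tempfile

def try_call(fn, *args):
    """Run fn(*args); return "ok" or the name of the OSError type raised."""
    try:
        fn(*args)
        return "ok"
    except OSError as e:
        return type(e).__name__

def hard_link_edge_cases():
    results = {}
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "m")
        with open(path, "w") as f:
            f.write("data")

        # link() onto a name that already exists is an error.
        results["link_onto_existing"] = try_call(os.link, path, path)

        # Unlink the last name while the file is still open:
        # the data stays readable until the file is closed.
        f = open(path)
        os.unlink(path)
        results["read_after_unlink"] = f.read()
        f.close()

        # Unlinking the same name a second time fails.
        results["second_unlink"] = try_call(os.unlink, path)

        # rmdir() only works once the directory is empty.
        d = os.path.join(root, "a")
        child = os.path.join(d, "child")
        os.mkdir(d)
        open(child, "w").close()
        results["rmdir_nonempty"] = try_call(os.rmdir, d)
        os.unlink(child)
        results["rmdir_empty"] = try_call(os.rmdir, d)
    return results
```

The exact exception type for a non-empty directory varies between systems, so the sketch only distinguishes success from failure there.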

Now suppose we have a directory "/usr/bin/" which contains a large number of files, and we want "/bin/" to contain all the same files. We could do this by creating a hard link for each of the files in "/usr/bin/", but this would take up a large number of directory entries, so we need a better solution. This is where symbolic links come in.

Symbolic Links

A symbolic link is a third type of file (alongside regular files and directories), which gives us another way to link files together. Instead of being a directory entry that references an already existing inode number, a symbolic link is a file whose contents are just the path we wish to link to. When a symbolic link is encountered during a lookup, its contents are read and substituted into the pathname to find the actual file. In our example, the root directory "/" would have a directory entry "bin" whose inode is marked as a symbolic link and whose contents are "/usr/bin". If we now request "/bin/ls", the contents of "/bin" are read and substituted into the requested pathname, giving "/usr/bin/ls", and that file is read.
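A small experiment makes the difference from hard links visible. This is a sketch only, assuming a POSIX system where unprivileged users may create symbolic links; the directory names stand in for "/usr/bin" and "/bin", and symlink_demo is a name made up here.

```python
import os
import tempfile

def symlink_demo():
    with tempfile.TemporaryDirectory() as root:
        real_dir = os.path.join(root, "usr_bin")   # stands in for /usr/bin
        os.mkdir(real_dir)
        with open(os.path.join(real_dir, "ls"), "w") as f:
            f.write("pretend binary")

        link = os.path.join(root, "bin")           # stands in for /bin
        os.symlink("usr_bin", link)                # the link's contents are a path

        stored_path = os.readlink(link)            # read the link itself
        is_link = os.path.islink(link)
        with open(os.path.join(link, "ls")) as f:  # lookup goes through the link
            via_link = f.read()
        # The link is its own file with its own inode (lstat does not follow it).
        own_inode = os.lstat(link).st_ino != os.stat(real_dir).st_ino
        return stored_path, is_link, via_link, own_inode
```

Unlike a hard link, the symbolic link has an inode of its own, and one link covers every file in the target directory.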

Issues with Symbolic Links

  • If the directory "/usr/bin" changes name, we get a dangling Symbolic link
    *If a file is requested the OS just returns that the file was not found
  • Symbolic links can loop, for example symlink("x", "x");
    *Most OSes limit how many symbolic links will be expanded while resolving one pathname, so the lookup eventually stops with an error
  • Can we have a hard link to a symbolic link?
    *Some Operating Systems allow this, others do not, but in principle yes we can
  • Can we change an already existing symbolic link?
    *In principle we could, since it is just a file, but operating systems do not allow it
  • Then how do we replace the contents of a symbolic link?
    *First remove the old symlink, then create a new one with the same name and the new pathname
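The dangling, looping, and replacement cases can be reproduced in a scratch directory. The sketch below assumes a POSIX system that reports symlink loops with ELOOP (Linux and macOS do); symlink_edge_cases and the file names are hypothetical.

```python
import errno
import os
import tempfile

def symlink_edge_cases():
    results = {}
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "target")
        moved = os.path.join(root, "moved")
        link = os.path.join(root, "link")
        open(target, "w").close()
        os.symlink(target, link)

        # Renaming the target leaves the link dangling.
        os.rename(target, moved)
        results["target_found"] = os.path.exists(link)    # follows the link
        results["link_present"] = os.path.lexists(link)   # looks at the link itself

        # A link that names itself loops until the OS gives up.
        loop = os.path.join(root, "x")
        os.symlink("x", loop)
        try:
            open(loop).close()
            results["loop_errno"] = None
        except OSError as e:
            results["loop_errno"] = e.errno

        # Replacing a link: remove the old one, then create a new one.
        os.unlink(link)
        os.symlink(moved, link)
        results["new_target"] = os.path.basename(os.readlink(link))
    return results
```

Note that between the unlink and the second symlink the name briefly does not exist at all.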

Multiple File Systems

Suppose we have two hard disks, each with its own file system, which we want to access at the same time. A simple solution would be to number each disk and include the disk number in every file name we request. Unfortunately this simple approach is not very flexible, and it gets difficult when the two file systems are not of the same type.

More Elegant Approach

A more elegant way to use multiple hard disks is to mount one of them onto the other. One hard disk must be specified as the main disk, and any others are mounted on a particular directory of the main disk. For example, we could say mount(disk2, "/home"); then when a user views the "/home" directory, they see the root of disk2.

Some more details about mounting
  • a mounted disk can be unmounted to remove it from view
  • a disk can only be mounted once
  • the entire file system must be mounted, not parts of it
  • no hard links between mounted file systems
  • the inode numbers will be local to each file system
  • we still need the inode number and device number to uniquely identify a file
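The last point can be made concrete with stat(), which reports both numbers. A minimal sketch, assuming a POSIX system; file_id and identity_demo are names invented here, and the demo runs within a single file system, so it only shows the comparison, not a cross-device case.

```python
import os
import tempfile

def file_id(path):
    """Identify a file by the pair (device number, inode number)."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)

def identity_demo():
    with tempfile.TemporaryDirectory() as root:
        a = os.path.join(root, "a")
        b = os.path.join(root, "b")
        c = os.path.join(root, "c")
        with open(a, "w") as f:
            f.write("one")
        os.link(a, b)              # second name for the same file
        with open(c, "w") as f:
            f.write("one")         # same contents, but a different file
        return file_id(a) == file_id(b), file_id(a) == file_id(c)
```

Comparing inode numbers alone would not be enough once a second file system is mounted, since each file system numbers its inodes independently.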

But what about dealing with two hard disks that have different file system types, such as FAT and ext3? Our more elegant solution of mounting doesn't fully solve that problem. Instead we need to apply object-oriented techniques to file accesses. This is where virtual file systems come in.

Virtual File Systems

A virtual file system requires each file system to conform to a generic interface, so that another layer in the kernel can issue the common read and write requests with the same parameters to every file system. The VFS layer determines which of the registered file systems should receive a particular request. Aside from conforming to the standard interface, each file system's implementation is free to do what it needs to, such as having inodes or not having inodes.

Before a file system is available through the virtual file system, it must register itself with the VFS layer.
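The object-oriented idea can be sketched in a few lines. This is a toy model, not how any real kernel is written: the class names (FileSystem, DictFS, UpperCaseFS, VFS) and the choice of dispatching by longest matching mount point are assumptions made for illustration.

```python
from abc import ABC, abstractmethod

class FileSystem(ABC):
    """Generic interface that every file system must implement."""
    @abstractmethod
    def read(self, path): ...
    @abstractmethod
    def write(self, path, data): ...

class DictFS(FileSystem):
    """Toy file system that keeps files in a dict."""
    def __init__(self):
        self.files = {}
    def read(self, path):
        return self.files[path]
    def write(self, path, data):
        self.files[path] = data

class UpperCaseFS(DictFS):
    """Toy file system with case-insensitive names (different internals)."""
    def read(self, path):
        return self.files[path.upper()]
    def write(self, path, data):
        self.files[path.upper()] = data

class VFS:
    """Dispatches each request to the file system registered for the path."""
    def __init__(self):
        self.mounts = {}                       # mount point -> FileSystem

    def register(self, mount_point, fs):
        self.mounts[mount_point.rstrip("/") or "/"] = fs

    def _resolve(self, path):
        best = None
        for mp in self.mounts:                 # pick the longest matching mount point
            prefix = mp if mp == "/" else mp + "/"
            if path == mp or path.startswith(prefix):
                if best is None or len(mp) > len(best):
                    best = mp
        if best is None:
            raise FileNotFoundError(path)
        rel = path if best == "/" else path[len(best):]
        return self.mounts[best], rel or "/"

    def read(self, path):
        fs, rel = self._resolve(path)
        return fs.read(rel)

    def write(self, path, data):
        fs, rel = self._resolve(path)
        fs.write(rel, data)
```

A caller registers one file system at "/" and another at "/home", then uses vfs.read and vfs.write without knowing which implementation ends up handling the request.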

example:

rename ("a","b")
  1. Get working directory's device # and inode # (struct task)
  2. Get inode into memory
  3. Get directory contents into memory
  4. change "a" to "b"
  5. write to disk

File operations that might cause trouble (continued)

rename("d/a", "e/b");
one possible solution:
  1. read the directory contents of d and e (call them d1 and d2) into memory (assume each fits into one block)
  2. clear "a" entry from d1's block
  3. write d1's block
  4. copy inode # into a new entry "b" in d2's block
  5. write d2's block

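If power fails after step 3 but before step 5 completes, the file disappears: "a" has already been removed from d1 on disk, but "b" has not yet been written to d2. The disk is left with an inode that no directory entry references.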

Description of a program that runs after reboot:

fsck (file system check)
  • inspects unmounted file system directly
  • looks for problems, repairs as best as it can
    *unreferenced blocks that aren't marked free are marked as free
    *unreferenced inodes are placed in a lost and found directory (must be fixed by hand)
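The two repairs listed above can be sketched on a toy in-memory model. This is only an illustration under made-up assumptions: toy_fsck, the dict-based layout, and the "#<inode>" naming in lost+found are all invented here and do not describe how the real fsck stores or scans anything.

```python
def toy_fsck(directories, inodes, bitmap):
    """
    directories: dict directory name -> dict entry name -> inode number
    inodes:      dict inode number -> list of block numbers it uses
    bitmap:      dict block number -> True if marked used
    Repairs the structures in place; returns (orphan inodes, freed blocks).
    """
    # Inodes that no directory entry refers to go into lost+found.
    referenced_inodes = {ino for entries in directories.values()
                         for ino in entries.values()}
    orphans = sorted(set(inodes) - referenced_inodes)
    lost_and_found = directories.setdefault("lost+found", {})
    for ino in orphans:
        lost_and_found["#%d" % ino] = ino

    # Blocks marked used that no inode refers to are marked free.
    referenced_blocks = {b for blocks in inodes.values() for b in blocks}
    freed = sorted(b for b, used in bitmap.items()
                   if used and b not in referenced_blocks)
    for b in freed:
        bitmap[b] = False
    return orphans, freed
```

The orphaned file keeps its data but loses its name, which is why someone still has to sort through lost and found by hand.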

As one can see, this first implementation causes problems after a power failure and reboot, even once fsck has run: the file ends up in lost and found and must be recovered by hand.

Here is a better way of doing it:

  1. read d1 & d2's directory contents into memory
  2. increment file's link count, write to disk
  3. copy inode # into a new entry "b" in d2's block
  4. write d2's block
  5. clear "a" entry from d1's block
  6. write d1's block
  7. decrement file's link count, write to disk

In the worst case, a power failure leaves the link count too high rather than losing data: the file is always reachable from at least one directory, and at worst its blocks stay marked as used after the last real link is removed. A link count that is too high is less severe than one that has been decremented too far, which would free data you weren't supposed to free.
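The two orderings can be compared by simulating a crash after every disk write. This is a toy model with invented names (INO, NAIVE, SAFER, summarize): each list entry stands for one write reaching the disk, and a crash is modelled as stopping after some prefix of the writes.

```python
INO = 7  # hypothetical inode number of the file being renamed

def initial_disk():
    return {"d": {"a": INO}, "e": {}, "nlink": 1}

def clear_a(disk): disk["d"].pop("a", None)   # write d1's block without "a"
def add_b(disk):   disk["e"]["b"] = INO       # write d2's block with "b"
def inc(disk):     disk["nlink"] += 1         # write incremented link count
def dec(disk):     disk["nlink"] -= 1         # write decremented link count

NAIVE = [clear_a, add_b]
SAFER = [inc, add_b, clear_a, dec]

def crash_states(writes):
    """Disk contents after 0, 1, ..., len(writes) completed writes."""
    states = []
    for done in range(len(writes) + 1):
        disk = initial_disk()
        for write in writes[:done]:
            write(disk)
        states.append(disk)
    return states

def references(disk):
    """Number of directory entries that point at the file's inode."""
    return sum(1 for name in ("d", "e")
               for ino in disk[name].values() if ino == INO)

def summarize(writes):
    states = crash_states(writes)
    file_lost = any(references(s) == 0 for s in states)
    count_too_low = any(s["nlink"] < references(s) for s in states)
    return file_lost, count_too_low
```

Under this model summarize(NAIVE) reports that some crash point loses the file, while summarize(SAFER) reports neither a lost file nor a link count below the true number of references.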

File System Correctness Invariants
  1. Every block is used for exactly 1 purpose: boot, superblock, bitmap, inode, data. (no two tables point to the same block)
  2. All referenced blocks are initialized to an appropriate value for their type.
  3. All referenced blocks are marked used in the bitmap.
  4. All unreferenced blocks are marked free in the bitmap. (relax this invariant)
Penalties for Violating the above 4 invariants (respectively):
  1. Disaster
  2. Disaster
  3. Disaster
  4. garbage that's uncollected, wasted space
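Invariants 1, 3 and 4 can be checked mechanically if we know who references each block. The sketch below uses a made-up representation (check_invariants, an owners table, a dict bitmap); invariant 2 is left out because this toy model has no block contents to inspect.

```python
def check_invariants(owners, bitmap):
    """
    owners: dict block number -> list of things that reference the block
    bitmap: dict block number -> True if the block is marked used
    Returns a list of (invariant number, block number, penalty).
    """
    problems = []
    for block in sorted(bitmap):
        refs = owners.get(block, [])
        if len(refs) > 1:                    # used for more than one purpose
            problems.append((1, block, "disaster"))
        if refs and not bitmap[block]:       # referenced but marked free
            problems.append((3, block, "disaster"))
        if not refs and bitmap[block]:       # unreferenced but marked used
            problems.append((4, block, "wasted space"))
    return problems
```

Only the invariant 4 violations are safe to leave until later, which is why that is the invariant to relax.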

Disk Performance Issues

Disk access is the real bottleneck, so that is what we optimize; the CPU is doing fine in comparison. This is why we worry about disk scheduling.

To read a sector on disk:

  1. seek: accelerate the heads in the desired direction, then decelerate as they reach the target track
  2. wait for proper sector to fall under read head (via rotation)
  3. perform a read

Some sample specs of a disk (Western Digital Caviar SE16):

7200 rpm (120 Hz), 16 MB buffer, 8.33 ms rotation time, 4.20 ms avg. rotational latency

  • 8.9 ms avg. read seek time
  • 10.9 ms avg. write seek time
  • 2.0 ms track-to-track seek
  • 21 ms full-stroke seek
  • 3.0 Gb/s transfer rate (between buffer and computer)
  • 0.97 Gb/s transfer rate (between buffer and disk)
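These figures combine into a rough estimate of what one random read costs. The arithmetic below is a back-of-the-envelope sketch under stated assumptions: Gb/s is taken to mean 10^9 bits per second, a sector is assumed to be 512 bytes, and the average rotational latency is computed as half a rotation (which gives about 4.17 ms, slightly under the quoted 4.20 ms).

```python
RPM = 7200
AVG_READ_SEEK_MS = 8.9
BUFFER_TO_DISK_GBPS = 0.97
SECTOR_BYTES = 512          # assumption, not from the spec list above

def rotation_time_ms(rpm):
    """Time for one full rotation, in milliseconds."""
    return 60_000.0 / rpm

def avg_rotational_latency_ms(rpm):
    """On average the desired sector is half a rotation away."""
    return rotation_time_ms(rpm) / 2

def transfer_time_ms(nbytes, gbps):
    """Time to move nbytes at gbps * 10^9 bits per second."""
    return nbytes * 8 / (gbps * 1e9) * 1000

def avg_random_read_ms():
    """Seek + rotational latency + transfer for one sector."""
    return (AVG_READ_SEEK_MS
            + avg_rotational_latency_ms(RPM)
            + transfer_time_ms(SECTOR_BYTES, BUFFER_TO_DISK_GBPS))
```

The total comes to about 13.1 ms per random sector, almost all of it seek and rotation; the transfer itself is only a few microseconds, which is why the order of disk accesses matters so much.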