CS 111

Scribe Notes for 11/06/08

by Anh Quyen and Young Eum

FILE SYSTEM IMPLEMENTATION

Inodes (index nodes, fixed-size file descriptor)

Inodes are contained within the inode table. Each one holds a file's size and type, permissions, owner, the number of links (hard links) to the file, and time stamps. There is one inode per file (regular files, directories, etc.). However, an inode does not hold the file's name or the directory containing the file. Names live in directory entries, each of which maps a name to an inode number, so finding a file's name means searching directories for its inode number.
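As an illustration of what an inode holds, here is a minimal sketch in Python, whose os.stat wraps the stat system call. The helper name describe_inode and the dictionary keys are made up for this example. Note that nothing in the result is a file name, and that two hard links report the same inode number.

```python
import os
import stat
import tempfile


def describe_inode(path):
    """Return the inode-level metadata that stat() reports for path."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                  # inode number
        "size": st.st_size,                  # file size in bytes
        "is_dir": stat.S_ISDIR(st.st_mode),  # file type
        "perms": stat.filemode(st.st_mode),  # type and permission bits
        "owner": st.st_uid,                  # owner's user id
        "links": st.st_nlink,                # number of hard links
        "mtime": st.st_mtime,                # last-modification time stamp
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first")
        second = os.path.join(d, "second")
        with open(first, "w") as f:
            f.write("hello")
        os.link(first, second)               # a second name for the same inode
        print(describe_inode(first))
        print(describe_inode(second))
```

Both printed dictionaries should show the same inode number and a link count of 2, since the two names refer to one inode.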

Problems:

  • Multiple hard links won't work.
  • The file system can run out of space in either the inodes or the data area.

Alternative design: Bitmap

File name look up

To look up a filename "a"

  1. Get the inode number of the working directory. Every process has its own working directory; it is recorded in the process descriptor.
  2. Read the directory's inode data into RAM. Check permissions: you must be able to search the directory.
  3. Scan the directory for an entry whose name is "a". This gives you the inode number of the file.
  4. Once you know the file's inode number, you can get at its metadata and at several pointers to its contents.
  • for stat, we just need the metadata;
  • for open/read, we need to follow the pointers;
  • chdir changes the working directory: it does steps 1-4 above, then sets the working directory to the inode number of the directory it found.

Note: With a linear scan, looking up a name among N entries takes O(N), which is usually fine when N is small (N < 1000). For faster look-up, use a hash table or tree.
e.g. open("a", O_CREAT)
To open "a" with O_CREAT when it does not yet exist, we need to allocate an inode, then modify the data in the parent directory.
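The look-up and O_CREAT steps above can be sketched with a toy in-memory model. This is not how any real file system lays out its data; ToyFS, Inode, lookup and open_creat are hypothetical names chosen for the example.

```python
import errno


class Inode:
    def __init__(self, kind):
        self.kind = kind      # "file" or "dir"
        self.entries = []     # directories only: list of (name, inode number)
        self.data = b""       # regular files only


class ToyFS:
    def __init__(self):
        self.inodes = {}                 # inode number -> Inode (the inode table)
        self.next_ino = 1
        self.root = self.alloc("dir")

    def alloc(self, kind):
        """Allocate a fresh inode and return its number."""
        ino = self.next_ino
        self.next_ino += 1
        self.inodes[ino] = Inode(kind)
        return ino

    def lookup(self, dir_ino, name):
        """Linear O(N) scan of one directory for name."""
        for entry_name, ino in self.inodes[dir_ino].entries:
            if entry_name == name:
                return ino
        raise FileNotFoundError(errno.ENOENT, "no such file or directory", name)

    def open_creat(self, dir_ino, name):
        """Like open(name, O_CREAT): reuse the inode, or allocate and link one."""
        try:
            return self.lookup(dir_ino, name)
        except FileNotFoundError:
            ino = self.alloc("file")                           # allocate an inode
            self.inodes[dir_ino].entries.append((name, ino))   # modify the parent
            return ino
```

Replacing the entries list with a dict would give the hash-table variant mentioned in the note, at the cost of no longer resembling a flat on-disk directory.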

How to look up "a/b"

  1. Look up "a".
  2. Make sure it is a directory. If not, fail with ENOTDIR (you are trying to use a file where a directory is expected).
  3. Recurse with "b", using a's inode as the working directory.
Notes:
  • Two characters are not allowed in a file name: '\0' and '/'.
  • "." means the current directory, e.g. "a/./b" is the same as "a/b".
  • Every directory has at least one link to itself (its "." entry).
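The recursion on "a/b" can be sketched as follows. The inode table INODES and the function resolve are hypothetical, and the sketch ignores permissions, symbolic links, and repeated slashes.

```python
import errno

# Hypothetical toy inode table: a dict value is a directory (name -> inode
# number), a bytes value is a regular file's contents.
INODES = {
    1: {".": 1, "a": 2, "f": 4},   # the working directory
    2: {".": 2, "b": 3},           # directory "a"
    3: b"contents of b",
    4: b"contents of f",
}


def resolve(path, dir_ino):
    """Return the inode number that path names, starting from dir_ino."""
    first, _, rest = path.partition("/")
    directory = INODES[dir_ino]
    if not isinstance(directory, dict):
        # A file was used where a directory is expected.
        raise NotADirectoryError(errno.ENOTDIR, "not a directory", path)
    if first not in directory:
        raise FileNotFoundError(errno.ENOENT, "no such file or directory", first)
    ino = directory[first]
    if not rest:
        return ino
    return resolve(rest, ino)      # recurse, using this inode as the directory
```

With this table, resolve("a/b", 1) and resolve("a/./b", 1) name the same inode, while resolve("f/b", 1) fails with ENOTDIR because "f" is a regular file.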

e.g. open("/a/b/c")
  • start from the root directory instead of the working directory;
  • every process has its own root directory;
  • use the chroot syscall to change the root (only root can do this);
  • this creates a jail (the process can't get out of its root), called a "chroot jail";
  • once chroot is used, there is no way to get out via "..": at the root, "/.." is the root itself.
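The claim that ".." leads nowhere at the root can be checked without calling chroot (which needs root privileges). A minimal sketch, where same_file is a helper name made up for this example:

```python
import os


def same_file(path_a, path_b):
    """True if both paths resolve to the same inode on the same device."""
    return os.path.samestat(os.stat(path_a), os.stat(path_b))
```

On a POSIX system, same_file("/", "/..") is expected to hold for whatever root directory the calling process has, chrooted or not.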


Notes:
In a chroot jail, suppose we do this:
$ ln /victim/file /home/eggert/root/tmp/psdata
$ ln /tmp /tmp/ouch
$ unlink /tmp -> this is not allowed. The directory would become garbage that is never reclaimed.
$ ls -R will loop!
$ pwd
To print the working directory, look in "..":
  • scan all names there, looking for one that maps to "."'s inode;
  • recurse on "../..", etc., until no more progress can be made (the root has been reached).
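The pwd procedure above can be sketched directly on top of stat and directory listing. The function name physical_path is made up for this example, and the sketch assumes every directory on the way up is readable; it also compares device numbers along with inode numbers, since inode numbers are only unique within one file system.

```python
import os


def physical_path(start="."):
    """Rebuild the absolute name of directory start by walking up through '..'."""
    names = []
    cur = start
    while True:
        here = os.stat(cur)
        parent = os.path.join(cur, "..")
        up = os.stat(parent)
        if (here.st_dev, here.st_ino) == (up.st_dev, up.st_ino):
            break                      # ".." is the same as ".": this is the root
        # Scan all names in "..", looking for the one that maps to our inode.
        for name in os.listdir(parent):
            try:
                st = os.lstat(os.path.join(parent, name))
            except OSError:
                continue               # entry vanished or is not accessible
            if (st.st_dev, st.st_ino) == (here.st_dev, here.st_ino):
                names.append(name)
                break
        else:
            raise FileNotFoundError("no entry in '..' names this directory")
        cur = parent                   # recurse one level up
    return "/" + "/".join(reversed(names))
```

Calling physical_path() with no argument plays the role of pwd for the current process. Real implementations usually ask the kernel instead, which avoids the repeated directory scans.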

Symbolic Links:

A symbolic link's contents are the name of another file; accessing the link accesses that other file.
Problems:

  • they can dangle (OS response: the file's not there)
  • they can loop (errno = ELOOP). Note: Linux keeps an internal counter and gives up once a look-up has gone through more than 20 symlinks.
  • logical vs. physical names for files
  • performance is slow
  • hard links to symlinks
    $ ln -s /bin/sh a
    $ ln a b
    $ ls -li a b
    19621 lrwxrwxr... a -> /bin/sh
    19621 lrwxrwxr... b -> /bin/sh
  • open(...), stat(...), execlp(...) follow symlinks
  • lstat(...), unlink(...), readlink(...), symlink(...), link(...) don't follow symlinks
  • the system doesn't care about the permissions on symlinks -> security problem.
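Several of the points above (following vs. not following, dangling, looping) can be observed from Python, whose os.stat, os.lstat, os.readlink and os.symlink wrap the system calls of the same names. A minimal sketch; symlink_demo and the file names it creates are made up for this example.

```python
import errno
import os
import stat
import tempfile


def symlink_demo(workdir):
    """Create a few symlinks under workdir and report how the calls behave."""
    results = {}
    target = os.path.join(workdir, "target")
    link = os.path.join(workdir, "link")
    with open(target, "w") as f:
        f.write("data")
    os.symlink("target", link)        # the link's contents are the name "target"

    results["readlink"] = os.readlink(link)                   # does not follow
    results["stat_follows"] = os.path.samestat(os.stat(link), os.stat(target))
    results["lstat_sees_link"] = stat.S_ISLNK(os.lstat(link).st_mode)

    os.unlink(target)                 # now the link dangles
    try:
        os.stat(link)
        results["dangling"] = False
    except FileNotFoundError:
        results["dangling"] = True    # the OS says the file is not there

    loop_a = os.path.join(workdir, "loop_a")
    loop_b = os.path.join(workdir, "loop_b")
    os.symlink("loop_b", loop_a)      # loop_a -> loop_b -> loop_a -> ...
    os.symlink("loop_a", loop_b)
    try:
        os.stat(loop_a)
        results["loop_errno"] = None
    except OSError as e:
        results["loop_errno"] = e.errno
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(symlink_demo(d))
```

On Linux the loop is expected to be reported as errno.ELOOP once the kernel's symlink limit is exceeded.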

Multiple file systems available simultaneously:

  1. (classic Windows) put the file system in the file name
        A:/etc/passwd
        B:/usr/bin/sh
  2. (Unix) mounting a file system on another file

Mounts:

Mounting is the process of making a file system ready for the operating system to use, typically by reading certain data structures from storage into memory ahead of time. The mount command attaches the file system found on some device to the big file tree, telling the operating system that the file system is ready for use. The umount command detaches it from the file hierarchy.

pic1

Advantages: Mounts support more than one device and multiple types of file systems.
Problems:
  • Hard links can't cross file system boundaries.
  • An open file in an "invisible" area (hidden under a mount point) stays open.
  • Mounts can create loops.
  • Open files (or working directories) on a mounted file system do not allow it to be unmounted.
  • Mounts allow bypassing permission restrictions by mounting strange file systems.
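File system boundaries are visible through stat: every file reports the device number of the file system it lives on. The sketch below uses that to guess whether a directory is a mount point and whether two paths could be hard-linked. The names is_mount_point and same_file_system are made up for this example, and the heuristic misses bind mounts within one file system.

```python
import os


def is_mount_point(path):
    """Heuristic: a directory whose '..' is on another device, or is itself."""
    here = os.stat(path)
    up = os.stat(os.path.join(path, ".."))
    if here.st_dev != up.st_dev:
        return True                       # crossed a file system boundary
    return here.st_ino == up.st_ino       # ".." is ".": the root directory


def same_file_system(path_a, path_b):
    """Hard links are only possible between paths on the same device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

Python's standard library offers os.path.ismount for the first check; the version here only spells out the comparison it relies on.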

VFS (Virtual File System):

Object-oriented access to file systems, written in C

pic2
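The object-oriented idea can be sketched as one interface with several implementations, plus a dispatcher that picks the implementation by mount point. In C this is typically done with structs of function pointers; the Python classes below (FileSystem, MemFS, VFS) are hypothetical stand-ins for illustration, not the kernel's actual interface.

```python
class FileSystem:
    """Operations every file system type must provide (hypothetical interface)."""

    def read(self, path):
        raise NotImplementedError


class MemFS(FileSystem):
    """A toy file system type that keeps whole files in a dictionary."""

    def __init__(self, files):
        self.files = dict(files)

    def read(self, path):
        if path not in self.files:
            raise FileNotFoundError(path)
        return self.files[path]


class VFS:
    """Dispatches each request to the file system mounted nearest the path."""

    def __init__(self):
        self.mounts = {}                  # mount point -> FileSystem object

    def mount(self, point, fs):
        self.mounts[point] = fs

    def umount(self, point):
        del self.mounts[point]

    def read(self, path):
        best = None
        for point in self.mounts:
            prefix = point.rstrip("/") + "/"
            if path == point or path.startswith(prefix):
                if best is None or len(point) > len(best):
                    best = point          # the longest matching mount point wins
        if best is None:
            raise FileNotFoundError(path)
        inner = "/" + path[len(best):].lstrip("/")
        return self.mounts[best].read(inner)   # same call, any file system type
```

For example, after mounting one MemFS at "/" and another at "/mnt/usb", a read of "/mnt/usb/song" is served by the second object, and callers never need to know which type of file system answered.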