CS111 - Lecture 12 - File System Impl...

File System Implementation

CS111 - Lecture 12 - May 11, 2010

Scribes: Advait Shinde, William Salinas, Thanh Nguyen

Lecture Topics:

Levels of a Traditional Unix filesystem
Unlinking and Security
Handling Symbolic Links
Mount and Umount
The Sticky Bit

1. Levels of a Traditional Unix File system

File System Work

{
Symbolic Links

File Names
File Name Components
Directories
Inodes
}
---------------------------------------
Block Level
{
Partitions
Blocks (8192 bytes)
Sectors (512 bytes)
}

Inodes: each inode represents one file and records which blocks hold its data.
Directories: there is an inode for each directory.
File name components: each division of the path is a file name component.

ex: /usr/bin/cat has the components usr, bin, and cat.

Special file name components

. means the current directory. ex: /usr/./bin/./././cat
.. means the parent directory. ex: /usr/bin/../bin/cat is the same as /usr/bin/cat (assuming no symbolic links)
An empty component means the same as . ex: /usr/bin//cat

File name: a group of name components forming a path. The previous path, /usr/bin/cat, is a file name.
Symbolic links: special files whose contents are a reference to another file, written as a path.

2. Unlinking and Security

ex: Suppose you don't want another person to read your file called foo. You would probably call the following function:

unlink("foo");

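Problem:

There may be an open file descriptor: a process that opened foo before the unlink can keep reading it.
Fix: reboot, so that no process still has the file open.
What's still wrong?
Opening and reading files through the file system is now OK.
But unlink does not erase the data blocks, so they can still be read straight from the partition.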

Possible solutions

shred file: looks at the size of the file (say 10 MB), overwrites the file with random data (repeating this about 30 times), then removes it. This assumes a traditional file system that overwrites blocks in place.

This does better than unlink alone: after a plain unlink the file's data is still on disk, because we only deleted the reference to its inode.

What can we do?

Shred the whole partition

Encrypt the file and forget the password (every bit has to be encrypted, which is hard)
Magnetic disk

3. Handling Symbolic Links

Can we shred a symbolic link?

ex: ln -s quaterly-report.doc my dear cutie.pie

Directory	Inode Table	Data Block

f = open("d",O_WRONLY)

write(f.......)

Operation on Symlinks

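open("l", O_WRONLY)              /* does not work on the symlink itself: open follows the link */
unlink("l")                      /* removes the symlink, not its target */
rename("l", "m")                 /* renames the symlink itself */
stat("l", &st)                   /* follows the link and reports on the target */

lstat("l", &st)                  /* reports on the symlink itself */

readlink("l", buf, sizeof buf)   /* copies the link's contents into buf */

link("l", "m")
ln -s /etc/passwd l
ln l m
ls -li l m // print the inode numbers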

................... eggert-eggert 12 date l-> /etc/passwd

...................
what happened?

ls -l /usr/bin/x11
........................./usr/bin/x11-> .
ls -l /usr/bin/x11/x11/.......{works till symbolic link limit}
what happened?
ex: diff -r /usr/bin /some/other/bin
Question: should diff -r follow symlinks?
Suppose the answer is "yes".
Then diff will loop forever on a link like the one above (GNU diff checks for these loops).

4. Mount and Umount

The commands 'mount' and 'umount'

Mounting is the process of placing a secondary partition that contains a filesystem within a parent filesystem, such that a directory in the parent acts as the root directory of the secondary. Consider the following command:

# mount /dev/sd3 /usr/local

This command assumes that /dev/sd3 is a partition containing a recognizable filesystem. It takes the /usr/local directory and replaces it with the root directory of the filesystem on /dev/sd3. The reverse of mount is known as umount:

# umount /dev/sd3

Note that mounting does not modify the contents of /usr/local. These contents are temporarily hidden behind the filesystem on /dev/sd3 and become visible again after umount is called.

Why root?

Also note that mount and umount must be run as root. There are several problems that could occur had regular users been allowed to execute them. For example, in a multiuser system (like seasnet), imagine a malicious user mounts his/her own filesystem to replace /usr/bin or /etc. This mount would affect all other users who would now be executing programs or reading the /etc/passwd file from the malicious user's filesystem.

Problems with mounted filesystems

Just as with symbolic links, there are several potential problems that can occur with mounted filesystems:

Inode numbers are per filesystem. Thus, in order to uniquely identify a file, one must have both an inode number (ino_t) and a device identifier (dev_t). The device id clarifies the context of the given inode number (i.e. on which device the inode exists). In the current Unix model, a hard link points only to an inode number (the implied device id is the device on which the hard link exists). For this reason, hard links cannot exist across mounts. Experiments were conducted in the past in which a device id was included along with the inode number in a hard link, but more problems arose when a filesystem that was the target of such hard links needed to be unmounted. In favor of simplicity and flexibility, hard link support across mounts was dropped.
Circular mounts? Just as with symbolic links, a user might create a circular chain of devices. Luckily this problem was easily solved by allowing a given device to be mounted at most once.
Umounts and open fds. Imagine a process that has an open file descriptor to a file in a mounted filesystem. What happens when this filesystem is umounted? There are three possibilities:
1. Umount requires that there are no open file descriptors to files on the target filesystem - This approach is taken by Unix.
2. The umount succeeds and subsequent I/O operations on the open file descriptors fail - This is what happens when you unplug a flash drive.
3. The umount is treated just like unlink. Subsequent I/O operations on the open file descriptors succeed and the filesystem is not actually umounted until the last file descriptor is closed.
Setuid mischief. Brief aside: Consider the unix command passwd. Its function is to change the current user's password:
$ passwd
Enter new UNIX password: abc123
This program modifies two files - /etc/passwd and /etc/shadow. The tricky thing here is that the file /etc/shadow has the permissions (rw-r-----) and is owned by root. So how does a user without root privileges modify the /etc/shadow file? The answer is with the setuid bit. Examine the ls -l of passwd:
$ ls -l /usr/bin/passwd
-rwsr-xr-x 1 root root 42856 2010-01-26 09:09 /usr/bin/passwd
Notice the s where the owner's x would normally be in the permissions! This is the setuid bit: the program runs with the privileges of the file's owner (here root), so all users running it temporarily get escalated privileges for the execution of that program. The Unix designers were careful to make sure that passwd only modifies the intended files. Put simply, the setuid bit allows program creators (typically root) to grant other users the creator's permissions during the execution of that program.
Although this idea is great on paper, when combined with mounting, a mischievous user can definitely do some damage. Imagine that a user manages to get his filesystem mounted on a server (e.g. he asks the sysadmin to plug in his flash drive). On the flash drive, the user can have malicious executables that are owned by root and have the setuid bit on. By executing these programs, the user can grant himself root access!

5. The Sticky Bit

Not known to many people is the fact that Unix files have 12 bits of permission flags (as opposed to 9). The traditional 9 are the read, write, and execute privileges for the user, the group, and all other users. The three other bits are the setuid, setgid, and sticky bits. Like setuid above, setgid gives executors the privileges of the file's group. However, the sticky bit is a little different. Here is an excerpt from the chmod man page:

The restricted deletion flag or sticky bit is a single bit, whose interpretation depends on the file type. For directories, it prevents unprivileged users from removing or renaming a file in the directory unless they own the file or the directory; this is called the restricted deletion flag for the directory, and is commonly found on world-writable directories like /tmp. For regular files on some older systems, the bit saves the program's text image on the swap device so it will load more quickly when run.