CS 111, W13

Prof. Eggert

Lecture 12

Week 8, Monday

Angela Navarro - 203817385

Bryan Yasukawa - 403775761

Daynan Lai - 003766382

Kevin Fongson - 603785957

Problem with Symbolic Links

Example:

  1. In emacs, let's say you're editing /etc/passwd
  2. While the emacs buffer contents != file contents, emacs creates an extra file (symbolic link) /etc/.#passwd that points to eggert@penguin.941 [host + pid]
  3. This is a locking mechanism, and the symbolic link contains info about emacs
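The locking step above can be sketched in C. This is a minimal sketch, not emacs's actual code: the names `take_lock` and `drop_lock` are hypothetical, and the link text simply follows the user@host.pid shape shown in the example. It relies on the fact that symlink() fails with EEXIST when the name already exists.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: try to take the lock `lockpath` (for example
   "/etc/.#passwd") by creating a symlink whose contents are
   "user@host.pid".  symlink() fails with EEXIST if the name is already
   taken, so creating the link doubles as an atomic test-and-set.
   Returns 0 on success, -1 with errno set on failure. */
int take_lock(const char *lockpath, const char *user, const char *host)
{
    char owner[256];
    assert(lockpath && user && host);
    int n = snprintf(owner, sizeof owner, "%s@%s.%ld",
                     user, host, (long) getpid());
    if (n < 0 || (size_t) n >= sizeof owner) {
        errno = ENAMETOOLONG;
        return -1;
    }
    return symlink(owner, lockpath);
}

/* Hypothetical helper: release the lock by removing the symlink. */
int drop_lock(const char *lockpath)
{
    assert(lockpath);
    return unlink(lockpath);
}
```

A second call to take_lock on the same path fails until drop_lock removes the link, which is the behavior the lock depends on.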

What can go wrong?

  1. Non-emacs editors, which know nothing about the lock
     1. emacs runs stat to check whether the file's timestamp has changed before rewriting
  2. Emacs crashes and exits before the lock can be deleted, preventing a new emacs from taking the lock
     1. Workaround: issue the kill() system call with the process ID and signal 0, which only tests whether the process exists - kill(941, 0)
     2. This workaround results in another issue: PIDs are reused, so by the time you call kill(), that PID could belong to another process
  3. Emacs loops while holding the lock
     1. Workaround: another emacs can steal the lock. This blows away the symbolic link and creates its own
  4. Suppose /etc/.#passwd already exists for some other reason
     1. Workaround: if a regular file exists by this name, then we skip all locking and charge ahead. We hope that this is a rare occurrence
  5. Another app removes the lock file (or changes what it points to)
     1. this messes up emacs!
  6. You haven't changed the buffer yet and someone else locks the file
     1. when you start editing the file, emacs stats the file again and checks the timestamp (same solution as #1)
  7. File name base after the last slash >= 254 bytes (adding the .# prefix goes over the 255-byte limit)
     1. Solution: same as #4, you just ignore locking since you can't create the lock file
  8. Different emacs instances on different hosts can interoperate
     1. lnxsrv01.seas.ucla.edu vs lnxsrv03.seas.ucla.edu
  9. Windows: doesn't like symbolic links
     1. You can't create symbolic links unless you have a special "Create Symbolic Links" privilege
     2. Workaround: use a regular file instead, i.e.

cat /etc/.#passwd

eggert@penguin.941

This workaround breaks the solution in #4, which treats a regular file with this name as a reason to skip locking.
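The kill(941, 0) workaround for a crashed emacs can be sketched as follows. This is only an illustration under stated assumptions: `lock_owner_pid` and `pid_in_use` are hypothetical names, and the lock text is assumed to end in ".pid" as in the example above. The pid-reuse caveat from the list applies to the result.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: extract the pid from lock contents of the form
   "user@host.pid".  Returns -1 if the text does not end in ".<digits>". */
long lock_owner_pid(const char *contents)
{
    assert(contents);
    const char *dot = strrchr(contents, '.');
    if (!dot || dot[1] < '0' || dot[1] > '9')
        return -1;
    char *end;
    errno = 0;
    long pid = strtol(dot + 1, &end, 10);
    if (errno != 0 || *end != '\0' || pid <= 0)
        return -1;
    return pid;
}

/* Signal 0 delivers nothing; kill() only reports whether `pid` exists.
   Returns 1 if some process has this pid, 0 if none does.
   Caveat from the notes: pids are reused, so 1 does not prove that the
   process is the emacs that took the lock.  The check is also only
   meaningful on the host named in the lock. */
int pid_in_use(long pid)
{
    if (kill((pid_t) pid, 0) == 0)
        return 1;
    return errno == EPERM;   /* exists, but we may not signal it */
}
```

A caller would read the lock, compare the host part with its own host name, and only then consider stealing the lock when pid_in_use returns 0.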

Alternatives to using symbolic links for lock files

  1. system call: fcntl(fd, F_SETLK, ...)
     1. POSIX only, won't work on Windows
     2. postdates emacs
     3. doesn't work with network file systems until NFSv4
     4. across machine boundaries, one machine cannot kill a process that belongs to another machine
  2. just using regular files instead of symlinks
     1. performance:

img4

img2

        1. with symlinks, instead of having the data in a separate block you can put the data into the inode itself
        2. this optimization is possible if the symlink length < 48 bytes, which avoids an extra seek
     2. getting contents:
        1. regular file: fd = open(...) + read(...) + close(...)
        2. symbolic link: readlink(".#file", buf, size)
        3. symbolic links use 2 fewer syscalls and are atomic, which means readlink always gets the whole contents of one version of the link
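The two ways of getting the lock contents compared above can be written side by side. This is a minimal sketch; the function names `lock_text_symlink` and `lock_text_file` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Lock stored as a symlink: one system call, and the kernel hands back
   the link contents as of a single moment. */
ssize_t lock_text_symlink(const char *path, char *buf, size_t size)
{
    assert(path && buf && size > 0);
    ssize_t n = readlink(path, buf, size - 1);
    if (n >= 0)
        buf[n] = '\0';          /* readlink does not NUL-terminate */
    return n;
}

/* Lock stored as a regular file: three system calls, and another
   process can rewrite the file between them. */
ssize_t lock_text_file(const char *path, char *buf, size_t size)
{
    assert(path && buf && size > 0);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, size - 1);
    if (n >= 0)
        buf[n] = '\0';
    close(fd);
    return n;
}
```

Both return the number of bytes placed in buf, or -1 on error, so a caller can swap one for the other when comparing the two designs.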

Example: Here symlinks are treated as regular files

img3

img0

$ ln -s 'eggert@27' foo

$ ln foo bar

  1. The commands above create two hard links to the same symlink
  2. Two different names for the same file
  3. Symlinks are read-only: you cannot change the contents of a symlink
  4. You can't change foo to change bar
  5. Suppose someone wants to open("/foo/bar")
     1. have to look at 4 places on disk
     2. slows down filename resolution

Example: Here symlinks are treated as directory entries

img5

Symbolic links are different types of directory entries

- varying amount of space

+ fewer disk accesses

- no hard links to symlinks

Consider the following exploit:

Attacker (eggert):

I know someone will put data into /tmp/foo so...

$ ln -s ~/eggert/data /tmp/foo

Victim:

umask 077

sort -o /tmp/foo

uniq /tmp/foo

rm /tmp/foo

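So now the attacker has a copy of the file but can't read it, because with umask 077 sort created it readable by the victim only. The attacker fixes this by creating the target file first, with permissive mode bits: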

Attacker (eggert):

I know someone will put data into /tmp/foo so...

$ ln -s ~/eggert/data /tmp/foo

$ touch ~/eggert/data

$ chmod 777 ~/eggert/data

Victim:

umask 077

sort -o /tmp/foo

uniq /tmp/foo

rm /tmp/foo

Now the attacker can look at the file!
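The mechanism behind the exploit is that an ordinary open() follows a symlink planted at the temporary name. The sketch below shows that, and for contrast what happens when O_EXCL is added. `write_tmp` is a hypothetical helper, not what sort actually does internally; the O_EXCL variant is included only as a point of comparison.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `data` to the temporary file `path`,
   creating it with mode 0600 if it does not exist.
   excl == 0: roughly what the victim above does; open() follows a
              symlink planted at `path` and writes into its target.
   excl != 0: O_EXCL makes open() fail with EEXIST if `path` already
              exists, even as a dangling symlink.
   Returns 0 on success, -1 on failure. */
int write_tmp(const char *path, const char *data, int excl)
{
    assert(path && data);
    int flags = O_WRONLY | O_CREAT | (excl ? O_EXCL : O_TRUNC);
    int fd = open(path, flags, 0600);
    if (fd < 0)
        return -1;
    size_t len = strlen(data);
    ssize_t n = write(fd, data, len);
    if (close(fd) != 0 || n != (ssize_t) len)
        return -1;
    return 0;
}
```

With excl set to 0 and a symlink at path, the data ends up in the file the link points to, with whatever permissions that file already had.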

File Name Resolution:

$ open("a/b/c/foo", O_RDONLY)

Steps:

  1. Get the process' working directory entry D from the process table:

[ ~~~~~~~~ | 3961 (working dir) | ~~~~~~~~~~~~~~~]

  2. Get the 1st file name component C
  3. Look up C in D's data
     1. if there is none, fail with errno = ENOENT
     2. if it's a symbolic link, substitute the symlink contents into the path
  4. Now we have inode I
  5. Set D = I and loop back to step 2 with the next component, until no components remain

The system call chdir uses the algorithm above to find D, then sets the process's current working directory to D.

Problems:

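  1. Suppose there exists a symlink a/b -> x/y. Then a/b/c/foo is resolved as a/x/y/c/foo, because a relative symlink is interpreted relative to the directory that contains it.
  2. If the symlink contents start with a slash, we have to discard the part of the path resolved so far and restart from the root directory.
  3. What if it's a symbolic link loop?
     1. keep a counter of the number of symlinks traversed. The limit is 20; beyond that, fail with errno = ELOOP
     2. this solution is a heuristic chosen for speed rather than exact loop detection
  4. If the path itself starts with a slash, resolution starts at the process's root directory instead of its working directory. The system call chdir("foo") changes the working directory.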
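The resolution steps and the problem cases above can be tried on a toy, in-memory file system. This is only a sketch: `struct entry`, `resolve`, `ROOT_INO`, the inode numbers, and the table layout are all hypothetical, the limit of 20 comes from the notes, and errors such as ENOTDIR are left out for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

enum kind { K_DIR, K_FILE, K_LINK };

/* One directory entry of a toy file system: in directory inode `dir`,
   `name` maps to inode `ino`.  For K_LINK, `target` is the link text. */
struct entry {
    int dir;
    const char *name;
    int ino;
    enum kind kind;
    const char *target;
};

#define ROOT_INO  1
#define MAX_LINKS 20    /* symlink limit from the notes */

static const struct entry *lookup(const struct entry *fs, size_t n,
                                  int dir, const char *name, size_t len)
{
    for (size_t i = 0; i < n; i++)
        if (fs[i].dir == dir && strlen(fs[i].name) == len
            && memcmp(fs[i].name, name, len) == 0)
            return &fs[i];
    return NULL;
}

/* Walk `path` starting from directory inode d.  Returns the final inode
   number, or a negative errno value. */
static int walk(const struct entry *fs, size_t n, int d,
                const char *path, int *links)
{
    if (*path == '/')
        d = ROOT_INO;                   /* absolute: start at the root */
    for (;;) {
        while (*path == '/')
            path++;
        size_t len = strcspn(path, "/");        /* next component C */
        if (len == 0)
            return d;                           /* nothing left */
        const struct entry *e = lookup(fs, n, d, path, len);
        if (!e)
            return -ENOENT;
        path += len;
        int i = e->ino;
        if (e->kind == K_LINK) {
            if (++*links > MAX_LINKS)
                return -ELOOP;
            /* substitute the link text, resolved relative to the
               directory that contains the link */
            i = walk(fs, n, d, e->target, links);
            if (i < 0)
                return i;
        }
        d = i;                                  /* D = I, next component */
    }
}

int resolve(const struct entry *fs, size_t n, int cwd, const char *path)
{
    int links = 0;
    assert(fs && path);
    if (*path == '\0')
        return -ENOENT;
    return walk(fs, n, cwd, path, &links);
}
```

With a table in which a/b is a link to x/y, resolve on "a/b/c/foo" ends at the same inode as "a/x/y/c/foo", and two links that point at each other produce -ELOOP once the counter passes 20.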

Sidenote:

#include <unistd.h>

int main (int argc, char** argv) {

  /* changes the working directory of this process only */
  if (argc < 2 || chdir(argv[1]) != 0)
    return 1;

  return 0;

}

--

$ gcc main.c -o mycd

$ ./mycd /tmp

$ cat foo

This won't work: chdir and chroot affect only the calling process. Here they change the directory or root of the child process running mycd, not of the parent shell that started it, which is why cd is a shell builtin.
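The same effect can be observed from inside one program by forking a child that plays the role of mycd. A minimal sketch; `parent_cwd_survives` is a hypothetical name.

```c
#include <assert.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that calls chdir(dir), the way the shell runs ./mycd.
   Returns 1 if the parent's working directory is unchanged afterwards,
   0 if it changed, -1 on error. */
int parent_cwd_survives(const char *dir)
{
    char before[4096], after[4096];
    int status;
    assert(dir);
    if (!getcwd(before, sizeof before))
        return -1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(chdir(dir) == 0 ? 0 : 1);     /* child: behaves like mycd */
    if (waitpid(pid, &status, 0) < 0
        || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    if (!getcwd(after, sizeof after))
        return -1;
    return strcmp(before, after) == 0;
}
```

The child's chdir succeeds, yet the parent still reports its old directory, because the working directory is per-process state.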

Link counts and hard links:

img7

Problems:

  1. removed a link but forgot to decrement the link count
  2. didn't remove a link but decremented the link count
  3. link count hits its maximum and overflows
  4. loops of hard links: not allowed, no hard links to directories! See diagram below.

img6
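Link counts can be watched from user space through the st_nlink field that stat() returns. The sketch below is illustrative only; `link_count`, `link_count_demo`, and `try_dir_link` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Link count of `path` as recorded in its inode, or -1 on error. */
long link_count(const char *path)
{
    struct stat st;
    assert(path);
    return stat(path, &st) == 0 ? (long) st.st_nlink : -1;
}

/* Create `a`, add a second name `b`, then remove `b`, recording the
   link count after each step in counts[0..2].  Returns 0 on success. */
int link_count_demo(const char *a, const char *b, long counts[3])
{
    assert(a && b && counts);
    int fd = open(a, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    counts[0] = link_count(a);
    if (link(a, b) != 0)
        return -1;
    counts[1] = link_count(a);
    if (unlink(b) != 0)
        return -1;
    counts[2] = link_count(a);
    return unlink(a);
}

/* Try to hard-link a directory.  Returns 0 if the kernel allowed it,
   otherwise the errno it reported (EPERM on Linux). */
int try_dir_link(const char *dir, const char *newname)
{
    assert(dir && newname);
    return link(dir, newname) == 0 ? 0 : errno;
}
```

The recorded counts go up by one when the second name is added and back down when it is removed, and the attempt to link a directory is refused.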

Brief look at other problems in FS:

Example: GPFS (a big machine file system)

120 PB across 200,000 hard drives, ~600 GB each

Some features:

  1. Striping: a file's blocks are spread over multiple disks

img1

  2. Parallel I/O
  3. Distributed metadata - directory lives in file system
     1. ex. /usr/bin (big, widely used)
     2. Several copies of common directories
  4. Efficient directory indexing - faster than O(N) (say, B-tree structure)
  5. Distributed locking
  6. File system stays live during maintenance

magic-gpfs-clone /gpfs /gpfs-feb-25

cd /gpfs-feb-25

tar -cf /dev/tape .
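The striping feature listed above can be illustrated with a simple round-robin mapping from a file's logical block number to a disk. This is an assumption made for illustration, not GPFS's actual placement policy; `struct placement` and `stripe` are hypothetical names.

```c
#include <assert.h>

/* Where one logical block of a file lands. */
struct placement {
    unsigned disk;                      /* which disk holds the block */
    unsigned long long block_on_disk;   /* position within that disk's share */
};

/* Round-robin striping (an assumed policy): logical block b goes to
   disk b mod N, at position b div N on that disk, so N consecutive
   blocks sit on N different disks and can be read in parallel. */
struct placement stripe(unsigned long long block, unsigned ndisks)
{
    struct placement p;
    assert(ndisks > 0);
    p.disk = (unsigned) (block % ndisks);
    p.block_on_disk = block / ndisks;
    return p;
}
```

With 4 disks, blocks 0 through 3 land on disks 0 through 3, and block 5 lands on disk 1 as that disk's second block.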