CS 111
Scribe Notes for 5/14/08
by Jeremy Burgess, George Fangwen Wang
File System
Unix/BSD file system
Having an extra level of indirection (directory -> inode), pros (+) and cons (-):
- (-) performance: slower indexing
- (+) renaming is easier
- (+) allows hard links
- (-) inode table must be preallocated
Here is an overview of inodes being linked together with multiple levels of indirection
Procedure for looking up a file (inside the kernel), e.g. "a"
1. start with the working directory
   NOTE: this is per-process; each process descriptor has a working
   directory slot. It points to the inode for the working directory (the main-memory inode: a copy of the disk inode plus extra information), which tells us where the directory's blocks are
2. read those blocks
3. scan for a directory entry whose name is "a". This gives you an inode # for the file
4. you now know the metadata info + pointers to the content
- for stat, we just need the metainfo;
- for open/read, we need to follow the pointers too;
- we can use chdir to change directory
  It does steps 1-4 above, then sets the working directory to be the inode # of the directory
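To make the stat vs. open/read distinction concrete, here is a small user-level sketch using Python's os module, which wraps the same system calls. The helper names and the scratch directory are made up for this illustration; the kernel-side lookup itself is not visible from user code.

```python
import os
import tempfile

def metadata_only(path):
    """stat: needs only the inode's metadata, not the data blocks."""
    st = os.stat(path)
    return st.st_ino, st.st_size, st.st_nlink

def read_contents(path):
    """open/read: must also follow the inode's pointers to the data."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, 4096)
    finally:
        os.close(fd)

def demo():
    old_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as scratch:
        with open(os.path.join(scratch, "a"), "wb") as f:
            f.write(b"hello\n")
        os.chdir(scratch)          # the working directory slot now names scratch
        try:
            # The relative name "a" is looked up from the working directory.
            ino, size, nlink = metadata_only("a")
            data = read_contents("a")
        finally:
            os.chdir(old_cwd)      # leave before the scratch directory is removed
    return ino, size, nlink, data
```

Calling demo() returns the inode number, size, and link count obtained from stat alone, plus the bytes obtained by actually reading the file.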
e.g.
- open("a/b/c")
  we need to do steps 1-4 for directory 'a' to get a's inode #;
  then do steps 1-4 for 'b' starting at a's inode #, and likewise for 'c' starting at b's inode #.
- open("/a/b/c")
  start from the root directory instead;
  every process has its own root directory;
  there is a chroot syscall to change the root;
  this creates a jail (the process can't get out of its root), called a "chrooted jail".
  If chroot is used, there is no way to get out through '/..':
  in the root directory, '..' refers to the root itself
- open("a/b", O_RDWR | O_CREAT, 0644)
  open with O_CREAT:
  if the file exists, open it as usual (the call fails only if O_EXCL is also given)
  if not,
  create a new inode and add a dirent to the directory pointing to the new inode
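The path examples above can be sketched with a toy in-memory model. Everything here (Inode, ToyFS, Process, namei, open_creat) is a hypothetical illustration, not kernel code: inodes live in a dictionary, directories map names to inode numbers, and each process carries its own root and working directory.

```python
class Inode:
    def __init__(self, ino, is_dir, parent, mode):
        self.ino, self.mode = ino, mode
        # A directory maps names to inode numbers; a regular file has no entries.
        self.entries = {".": ino, "..": parent} if is_dir else None

class ToyFS:
    """Hypothetical inode table; inode 1 is the root directory."""
    def __init__(self):
        self.inodes = {1: Inode(1, True, 1, 0o755)}

    def create(self, parent, name, is_dir, mode):
        ino = len(self.inodes) + 1
        self.inodes[ino] = Inode(ino, is_dir, parent, mode)
        self.inodes[parent].entries[name] = ino      # add the dirent
        return ino

class Process:
    def __init__(self, fs, root=1, cwd=1):
        self.fs, self.root, self.cwd = fs, root, cwd

    def step(self, dir_ino, name):
        """Look up one path component in one directory."""
        if name == ".." and dir_ino == self.root:
            return dir_ino                           # "/.." is "/": no way out
        entries = self.fs.inodes[dir_ino].entries
        if entries is None:
            raise NotADirectoryError(name)
        if name not in entries:
            raise FileNotFoundError(name)
        return entries[name]

    def namei(self, path):
        # Absolute names start at this process's root, others at its cwd.
        ino = self.root if path.startswith("/") else self.cwd
        for name in path.split("/"):
            if name:
                ino = self.step(ino, name)
        return ino

    def chdir(self, path):
        ino = self.namei(path)
        if self.fs.inodes[ino].entries is None:
            raise NotADirectoryError(path)
        self.cwd = ino

    def open_creat(self, path, mode=0o644):
        name = path.rsplit("/", 1)[-1]
        parent = self.namei(path[:len(path) - len(name)])
        entries = self.fs.inodes[parent].entries
        if entries is None:
            raise NotADirectoryError(path)
        if name in entries:
            return entries[name]                     # already exists: just open it
        return self.fs.create(parent, name, False, mode)
```

A process created with Process(fs, root=a, cwd=a) plays the role of a chrooted process: namei("/..") returns a again, so nothing above a is reachable by name.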
Symbolic links
problems with hard links
- replacing a file is tricky (you need to find every directory entry pointing at the file and modify all of those directories)
- no hard links are allowed to directories (they would mess up reference counts and create loops)
  e.g. sometimes you want /bin = /usr/bin
- hard links cannot cross file system boundaries
A symbolic link is a new type of file: the contents of the symlink are always treated as a file name, and interpreting the
link almost always resolves to the name stored in the contents of the link
$ cd /
$ ln -s usr/bin bin
(This produces the following symbolic link)
$ ls -l bin
lrwxrwxrwx root root 7 (time) bin -> usr/bin

Here usr/bin is interpreted relative to the directory containing the link, not the current working directory
$ ls -l bin/sh
This shows the entry for /usr/bin/sh
Note:
In this case, /bin/sh names the same file as /usr/bin/sh
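A small sketch of the point about relative link contents, using Python's os module in a scratch directory. The tree layout mirrors the usr/bin example; the function names are made up for this illustration.

```python
import os
import tempfile

def build_tree(top):
    """Create top/usr/bin/sh and a relative symlink top/bin -> usr/bin."""
    os.makedirs(os.path.join(top, "usr", "bin"))
    with open(os.path.join(top, "usr", "bin", "sh"), "w") as f:
        f.write("shell\n")
    os.symlink("usr/bin", os.path.join(top, "bin"))    # like: ln -s usr/bin bin

def demo():
    with tempfile.TemporaryDirectory() as top:
        build_tree(top)
        link = os.path.join(top, "bin")
        target = os.readlink(link)                     # the link's contents
        # The working directory is not `top`, yet bin/sh still resolves:
        # "usr/bin" is looked up from the directory that contains the link.
        same = os.path.samefile(os.path.join(top, "bin", "sh"),
                                os.path.join(top, "usr", "bin", "sh"))
        is_link = os.path.islink(link)
    return target, same, is_link
```

Here readlink returns the stored name unchanged, and samefile confirms that both paths reach the same inode.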
Sidebar Note: How does pwd work?
- traditionally there were no symbolic links. Here is how pwd worked:
  - stat(".", &st); --> which gives you info on the current directory,
    including its inode #
  - then open("..", O_RDONLY, ...);
    read its entries with readdir()
    look for the one that has your inode #
  - repeat until ".." == "." --> (which
    indicates that you are at the root)
  - print the names out in reverse order
- symbolic links throw this off
  - this pwd will never return names that are symlinks
  - currently the shell remembers where it came from..... it's a mess
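The traditional pwd algorithm can be sketched at user level as follows. This is an approximation under stated assumptions: physical_pwd is a made-up name, it compares (device #, inode #) pairs rather than inode numbers alone so that it keeps working across mount points, and it needs read permission on every ancestor directory.

```python
import os
import tempfile

def physical_pwd(start="."):
    names = []
    path = start
    here = os.stat(path)
    while True:
        parent_path = os.path.join(path, "..")
        parent = os.stat(parent_path)
        if (parent.st_dev, parent.st_ino) == (here.st_dev, here.st_ino):
            break                        # ".." is ".", so this is the root
        with os.scandir(parent_path) as entries:
            for entry in entries:
                try:
                    st = os.lstat(os.path.join(parent_path, entry.name))
                except OSError:
                    continue             # entry vanished or cannot be examined
                if (st.st_dev, st.st_ino) == (here.st_dev, here.st_ino):
                    names.append(entry.name)
                    break
            else:
                raise FileNotFoundError("directory not found in its parent")
        path, here = parent_path, parent
    return "/" + "/".join(reversed(names))

def demo():
    with tempfile.TemporaryDirectory() as top:
        nested = os.path.join(top, "x", "y")
        os.makedirs(nested)
        return physical_pwd(nested), os.path.realpath(nested)
```

Because the walk only ever follows "..", the result is a physical name: no component of it is a symlink, which is why it is compared against realpath in demo().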
Symbolic link downsides
- you can have a symbolic link to a symbolic link
  - this can create loops
    $ ln -s loop loop
    $ cat loop
    fails: open fails (errno == ELOOP)
  - the kernel saves you from an infinite loop: Linux keeps an internal counter and
    fails after ~20 times through a symlink in a single path name
- "logical" vs. "physical" names for files
- performance: symlinks slow things down
- hard links to symlinks
  $ ln -s /usr/bin bin
  $ ln bin b
  $ ls -li bin b
  9176 lrwxrwxrwx 2 root root (time) bin -> /usr/bin
  9176 lrwxrwxrwx 2 root root (time) b -> /usr/bin
- how do you fix/change a symlink? you can't modify it in place; you have to remove it and create a new one
- how do you remove a symlink? unlink("sym") doesn't follow the symlink;
  neither does rename() or lstat()
- the system doesn't care about the permissions on a symlink
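Several of these points can be observed from user code. The sketch below uses Python's os module in a scratch directory; the file names (loop, old, new, sym) are arbitrary, and replacing a link by creating a new one and renaming it over the old one is just one possible way to "change" a symlink.

```python
import errno
import os
import stat
import tempfile

def demo():
    r = {}
    with tempfile.TemporaryDirectory() as top:
        # A symlink that points at itself: opening it must fail.
        loop = os.path.join(top, "loop")
        os.symlink("loop", loop)                     # like: ln -s loop loop
        try:
            os.close(os.open(loop, os.O_RDONLY))
            r["loop_errno"] = None
        except OSError as e:
            r["loop_errno"] = e.errno

        for name in ("old", "new"):
            with open(os.path.join(top, name), "w") as f:
                f.write(name)
        sym = os.path.join(top, "sym")
        os.symlink("old", sym)

        # lstat describes the link itself; stat follows it to the target.
        r["lstat_is_link"] = stat.S_ISLNK(os.lstat(sym).st_mode)
        r["stat_is_file"] = stat.S_ISREG(os.stat(sym).st_mode)

        # A link's contents cannot be edited; build a new link and rename it
        # over the old one. rename() acts on the link, not on its target.
        tmp = os.path.join(top, "sym.tmp")
        os.symlink("new", tmp)
        os.rename(tmp, sym)
        r["now_points_to"] = os.readlink(sym)
        with open(sym) as f:
            r["contents"] = f.read()

        # unlink removes the link and leaves both targets alone.
        os.unlink(sym)
        r["targets_survive"] = all(
            os.path.exists(os.path.join(top, n)) for n in ("old", "new"))
    return r
```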
Supporting multiple file systems
simple case:
all file systems are of the same type, e.g. ext3
DOS:
  A:/etc/passwd
  B:/etc/passwd
UNIX:
  /mnt/a/etc/passwd
  /some/other/place/etc/passwd
  (the administrator sets these names using mount)
mount table:
- the inode numbers of mount points are special
- maps inodes to file systems
- now to uniquely identify a file, you need the inode # and the device # (file
system number)
- hard links can't cross file system boundaries
- slows things down (don't mount things trivially)
- can create loops by mounting the same file system twice, but the mount
syscall checks for this by default
- makes files vanish (mounting on a non-empty directory hides the files underneath it)
- causes bugs
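A user-level sketch of the (device #, inode #) idea, using the st_dev and st_ino fields that stat reports. The helper names are made up; is_mount_point is only a heuristic (it compares a directory with its parent) and would miss, for example, two mounts of the same file system.

```python
import os
import tempfile

def file_id(path):
    """A file is identified by (device #, inode #), not the inode # alone."""
    st = os.stat(path)
    return st.st_dev, st.st_ino

def is_mount_point(path):
    """Heuristic: the parent is on another device, or the directory is its
    own parent (which is the case for the root)."""
    here = os.lstat(path)
    parent = os.lstat(os.path.join(path, ".."))
    return here.st_dev != parent.st_dev or here.st_ino == parent.st_ino

def demo():
    with tempfile.TemporaryDirectory() as top:
        original = os.path.join(top, "f")
        with open(original, "w") as f:
            f.write("x")
        link = os.path.join(top, "g")
        os.link(original, link)      # hard link: same file system, same inode
        sub = os.path.join(top, "sub")
        os.mkdir(sub)
        return (file_id(original) == file_id(link),
                is_mount_point(sub),
                is_mount_point("/"))
```

The two hard-linked names report the same pair, a freshly made subdirectory is not a mount point, and the root is.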
Supporting multiple kinds of file systems in Linux: you have multiple file
system modules, each implementing a common
"filesystem" base class: ext2, ext3, iso9660, NFS, VFAT
Virtual File System (VFS)
Linux VFS
- struct task_struct -> struct files_struct -> struct file -> struct inode -> struct inode_operations -> open/unlink;
- struct file -> struct file_operations -> read/write;
- struct task_struct is the process descriptor
- struct files_struct is the set of file descriptors for open files
- struct file is an open file
- struct inode is a file (open or not)
  - directories always point to these (working/root)
- struct inode_operations tells how to perform open/unlink-style operations on a specific file
- struct file_operations tells how to perform read/write-style operations on a specific file
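The object-oriented flavor of this design can be sketched in a few lines. This is a toy model, not the Linux data structures: the class names (FileOperations, RamFileOps, ReadOnlyFileOps, Inode, File) are invented for the example, and only read/write dispatch is shown.

```python
class FileOperations:
    """Hypothetical per-file-system table of read/write-style operations."""
    def read(self, inode, offset, count):
        raise NotImplementedError
    def write(self, inode, offset, data):
        raise NotImplementedError

class RamFileOps(FileOperations):
    """Toy read-write file system that keeps contents in a bytearray."""
    def read(self, inode, offset, count):
        return bytes(inode.data[offset:offset + count])
    def write(self, inode, offset, data):
        inode.data[offset:offset + len(data)] = data
        return len(data)

class ReadOnlyFileOps(RamFileOps):
    """Toy read-only file system (think of a CD-ROM image)."""
    def write(self, inode, offset, data):
        raise PermissionError("read-only file system")

class Inode:
    """A file, open or not; it records which operations apply to it."""
    def __init__(self, file_ops, data=b""):
        self.file_ops = file_ops
        self.data = bytearray(data)

class File:
    """An open file: an inode plus a current offset."""
    def __init__(self, inode):
        self.inode, self.offset = inode, 0
    def read(self, count):
        # Generic code: dispatch through the inode's operations table.
        data = self.inode.file_ops.read(self.inode, self.offset, count)
        self.offset += len(data)
        return data
    def write(self, data):
        n = self.inode.file_ops.write(self.inode, self.offset, data)
        self.offset += n
        return n
```

The File methods never look at which file system they are talking to; they only call through the operations object attached to the inode, which is the role the operations structs play in the VFS.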
Virtual memory
problem: we have buggy programs with bad memory references
solutions:
- hire better programmers -- costs money, trust
- runtime checking in user programs -- costs money, trust
- partition memory;
  each running program gets a region;
  enforce that in hardware with base and bounds registers;
  an access outside the region causes a TRAP
  a. problem: a program cheats by changing the base or bounds register
     answer: make the instructions that change them privileged
  downsides:
  - fixed size allocation
  - no way for programs to share memory
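The base/bounds scheme can be simulated in a few lines. This is a sketch under assumptions that the notes do not spell out: programs issue physical addresses, the valid region is base <= addr < bounds, and the class and method names (CPU, Trap, set_region, run_user) are invented for the example.

```python
class Trap(Exception):
    """Raised where real hardware would trap into the kernel."""

class CPU:
    def __init__(self, memory_size):
        self.memory = bytearray(memory_size)
        self.base, self.bounds = 0, memory_size
        self.kernel_mode = True

    def set_region(self, base, bounds):
        # Changing the registers is a privileged instruction.
        if not self.kernel_mode:
            raise Trap("privileged instruction")
        self.base, self.bounds = base, bounds

    def _check(self, addr):
        # Every user-mode memory reference is checked against the registers.
        if not self.kernel_mode and not (self.base <= addr < self.bounds):
            raise Trap(f"bad memory reference: {addr}")

    def load(self, addr):
        self._check(addr)
        return self.memory[addr]

    def store(self, addr, value):
        self._check(addr)
        self.memory[addr] = value

    def run_user(self, base, bounds, program):
        """Kernel side: set up the region, drop privilege, run the program."""
        self.set_region(base, bounds)
        self.kernel_mode = False
        try:
            return program(self)
        finally:
            self.kernel_mode = True     # back in the kernel, e.g. after a trap
```

A program run with run_user(64, 128, ...) can load and store inside its own 64-byte region, but a store to address 500 or a call to set_region raises Trap. Both downsides are visible here too: the region size is fixed when the program starts, and two programs with disjoint regions have no address they can both use.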