In computer disk storage, a sector is a subdivision of a track on a magnetic disk. Sectors are the units in which reads and writes are performed for the file system. Back in 1956, sectors started off at 512 bytes long. It is anticipated that hard drives will move to the "Advanced Format", which uses sectors of 4096 bytes. (A write must be at least one sector in size.)
A group of sectors is known as a block. Blocks are traditionally 8192 bytes, or 16 sectors, long. Bigger block sizes increase internal fragmentation, since small files waste more of the space in their last block. Some file systems allow small files to share a single block, which helps reduce this waste. Smaller blocks offer the benefit of flexibility. For applications that issue many sequential I/O requests, such as scientific calculations, larger block sizes increase efficiency.
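The space cost of a larger block size can be made concrete with a little arithmetic. The sketch below is only an illustration: the function names are made up for these notes, and it assumes each file occupies whole blocks with no block sharing.

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole blocks needed to hold file_size bytes (hypothetical helper). */
uint64_t blocks_needed(uint64_t file_size, uint64_t block_size) {
    assert(block_size > 0);
    return (file_size + block_size - 1) / block_size;   /* round up */
}

/* Bytes that are allocated but unused in the file's last block. */
uint64_t wasted_bytes(uint64_t file_size, uint64_t block_size) {
    return blocks_needed(file_size, block_size) * block_size - file_size;
}
```

For a 100-byte file, wasted_bytes(100, 8192) is 8092, while wasted_bytes(100, 512) is 412.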
Partitions create smaller "virtual" disks out of one larger physical disk. It is possible to use a different file system in each partition. It is also possible to create one large "virtual" disk out of several smaller physical disks.
Say /usr and /home use different file systems.
$cp /usr/bin/sh /home/eggert/junk/sh
//cp calls open
int ifd = open("/usr/bin/sh", O_RDONLY);
int ofd = open("/home/eggert/junk/sh", O_WRONLY|O_CREAT, 0666);
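After the two open calls, cp has to move the bytes across. A minimal sketch of that loop follows, assuming a plain read/write copy. The name copy_file and the 8192-byte buffer are choices made for this example, and O_TRUNC is added here so that an existing destination is overwritten cleanly.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Copy src to dst. Returns 0 on success, -1 on failure (errno is set). */
int copy_file(const char *src, const char *dst) {
    assert(src != NULL && dst != NULL);
    int ifd = open(src, O_RDONLY);
    if (ifd < 0)
        return -1;
    int ofd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (ofd < 0) {
        int open_errno = errno;
        close(ifd);
        errno = open_errno;
        return -1;
    }
    char buf[8192];
    ssize_t n;
    int status = 0;
    while (status == 0 && (n = read(ifd, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {                       /* a write may be partial */
            ssize_t w = write(ofd, buf + off, (size_t)(n - off));
            if (w < 0) {
                status = -1;
                break;
            }
            off += w;
        }
    }
    if (status == 0 && n < 0)
        status = -1;                            /* read failed */
    int saved = errno;
    close(ifd);
    if (close(ofd) != 0 && status == 0) {
        status = -1;
        saved = errno;
    }
    errno = saved;
    return status;
}
```

Nothing in this loop depends on which file system holds either file; the kernel routes each read and write to the right one.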
In one traditional scheme, the file system is designated by a letter in front of the file name. Given that there are only 26 letters in the alphabet, the system can keep track of at most 26 different file systems at a time, in a table stored in kernel memory.
$cp A:/usr/bin/sh B:/home/eggert/junk/sh
/usr/bin is a directory stored in an inode, say inode number 3762. sh is stored as a directory entry in the directory held by inode 3762. That directory entry points to the inode that stores the file, say inode number 263. Inode number 1 is reserved for the root directory, designated by "/".
If there are multiple slashes in a path name, the extra slashes are ignored.
$cd /usr/bin///sh
#This is interpreted as
$cd /usr/bin/sh
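The slash rule can be written as a small string routine. This is only a sketch of the idea, since real kernels skip the extra slashes while walking the path rather than rewriting the string. The name collapse_slashes is made up here, and the routine leaves a path that starts with exactly two slashes alone, since that case may be treated specially.

```c
#include <assert.h>
#include <stddef.h>

/* Collapse runs of '/' in place, e.g. "/usr/bin///sh" becomes "/usr/bin/sh".
   A path that starts with exactly two slashes keeps both of them.
   Returns the new length of the string. */
size_t collapse_slashes(char *path) {
    assert(path != NULL);
    size_t r = 0, w = 0;
    if (path[0] == '/' && path[1] == '/' && path[2] != '/')
        r = w = 2;                      /* keep exactly-two leading slashes */
    while (path[r] != '\0') {
        if (path[r] == '/' && w > 0 && path[w - 1] == '/') {
            r++;                        /* skip a repeated slash */
            continue;
        }
        path[w++] = path[r++];
    }
    path[w] = '\0';
    return w;
}
```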
There is an exception when a path starts with exactly two slashes: how such a path is interpreted is platform dependent. If a directory entry is not found, the open function returns -1 and sets errno to ENOENT. If the path does not start with a slash, resolution starts at the working directory. The kernel stores the inode number of the current working directory in the process table. The chdir system call changes the working directory by updating that inode number in the calling process's entry.
chdir("/bin");
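The failure case and the role of the working directory can both be observed from user code. The helper below is a small sketch with a made-up name (open_status); it reports the errno value that open leaves behind.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Try to open path read-only.
   Returns 0 on success, otherwise the errno value set by open. */
int open_status(const char *path) {
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}
```

A relative name such as "sh" is looked up starting from whatever directory the last successful chdir selected, so the same call can succeed in one working directory and return ENOENT in another.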
One of the first bugs found in UNIX, in 1972, involved chdir.
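//chdir.c
#include <stdio.h>
#include <unistd.h>
int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: chdir directory\n");
        return 1;
    }
    if (chdir(argv[1]) != 0) {
        perror(argv[1]);   //report why the directory could not be entered
        return 1;
    }
    return 0;
}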
There were no bugs in the actual implementation of chdir: the correct inode number was written into the process table. The bug was in how the command was run. The shell ran chdir as a separate program in a child process, so only the child's working directory changed, and the shell's own working directory stayed the same. The fix was to have the shell recognize the change-directory command and call chdir itself, so that its own working directory is updated.
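A toy version of that fix might look like the following. This is a sketch, not the historical shell code; try_builtin_cd is a hypothetical helper that a shell's command loop would call before deciding to fork.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* If argv is a "cd" command, run it inside the shell's own process.
   Returns 1 if cd was handled, 0 if the caller should fork and exec
   the command as usual, and -1 if cd was requested but failed. */
int try_builtin_cd(char *const argv[]) {
    assert(argv != NULL);
    if (argv[0] == NULL || strcmp(argv[0], "cd") != 0)
        return 0;
    if (argv[1] == NULL)
        return -1;              /* this sketch requires an explicit directory */
    return chdir(argv[1]) == 0 ? 1 : -1;
}
```

Because chdir runs in the shell's process, the shell's process table entry is the one updated, and later commands inherit the new working directory.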
chroot changes the root directory of the current process.
//Change root directory to /home/eggert/junk
chroot("/home/eggert/junk/");
The process then cannot access files above the new root directory.
//This would execute /home/eggert/junk/usr/bin/sh
char *args[] = {"sh", NULL};
execvp("/usr/bin/sh", args);
Without restrictions, chroot would present a security issue by allowing users to pose as root. On login, two files are checked to verify the user's password: /etc/passwd and /etc/shadow. These files can be spoofed by creating alternate files /home/eggert/junk/etc/passwd and /home/eggert/junk/etc/shadow, and then calling chroot.
chroot("/home/eggert/junk");
The user could then execute a command as root using sudo, because sudo will check /home/eggert/junk/etc/passwd
rather than /etc/passwd. To prevent this, chroot is a privileged system call that can only be executed by root.
Chrooted jails are subset images of the root directory created by a superuser. They are often used by web
hosts to create multiple virtual hosts on a single server. One virtual host can have an Apache web server
and all relevant libraries stored in its chrooted jail. That server then cannot modify any files
outside of its jail, which keeps the virtual hosts from interfering with one another. Chrooted jails are created by:
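//This creates an Apache server that can only modify its own files.
if (fork() == 0) {            //in the child
    chroot("/a/b");           //must be running as root for this to succeed
    chdir("/");               //move the working directory inside the jail
    setuid(apache_uid);       //apache_uid: numeric uid of the "apache" user, looked up beforehand
    execlp("/usr/bin/apache", "apache", (char *) NULL);
}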
chrooted jails cannot be escaped by using “..”. If the working directory is already the root directory, “..” is treated as “.”.
With multiple file systems, the user just sees file names. The mount table records which file systems are in use and where they are attached.
Mount tables are stored in kernel memory. Inode numbers are local to the file system that they're in.
To uniquely identify a file, we need a pair (dev_t, ino_t): the file system (device) number and the inode number.
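From user code, this pair is available through stat as the st_dev and st_ino fields. The sketch below compares the pairs for two path names; same_file is a name made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Returns 1 if both paths name the same file, 0 if they do not,
   and -1 if either path cannot be examined. */
int same_file(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    struct stat sa, sb;
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    /* Inode numbers alone are not enough: they repeat across file systems. */
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

For example, same_file("/", "/..") reports that both names refer to the root directory, which matches the rule that ".." at the root is treated as ".".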
Directory layout (UNIX 1977)
- 16 bytes per entry (14 for the name, 2 for the inode number).
Linux ext4 directory entry, version 2
- 32-bit inode number, 16-bit directory entry length, 8-bit name length, 8-bit file type, followed by the name.
For small Linux directories, the entries are simply concatenated.
For large Linux directories, a hash table is used.
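The two layouts above can be written down as C structs. These are sketches for the notes, not the kernel's own declarations: the struct and field names are made up here, and the rounding of entry lengths to a 4-byte boundary is an assumption about how ext4 pads entries.

```c
#include <assert.h>
#include <stdint.h>

/* 1977-style entry: 2-byte inode number plus a 14-byte name, 16 bytes total.
   The name is NUL-padded, so a 14-character name has no terminator. */
struct old_dirent {
    uint16_t ino;
    char     name[14];
};

/* Fixed-size header of a variable-length ext4-style entry.
   The name bytes follow the header directly. */
struct ext_dirent_header {
    uint32_t ino;        /* 32-bit inode number */
    uint16_t rec_len;    /* length of this whole entry, in bytes */
    uint8_t  name_len;   /* length of the name that follows */
    uint8_t  file_type;  /* for example regular file or directory */
};

/* Smallest entry length for a name of the given length,
   rounded up to a multiple of 4 (assumed padding rule). */
uint16_t min_rec_len(uint8_t name_len) {
    assert(name_len > 0);
    return (uint16_t)((sizeof(struct ext_dirent_header) + name_len + 3u) & ~3u);
}
```

The explicit entry length is what lets names vary in size: a reader steps from one entry to the next by adding rec_len, instead of assuming 16 bytes per entry.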
Hard links are two different directory entries that point at the same file.
To create a hard link, we use the ln command:
//This creates a hard link at /home/eggert/junk/pass for /etc/passwd
$ln /etc/passwd /home/eggert/junk/pass
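Underneath, ln uses the link system call. The sketch below creates a file, gives it a second name, and checks that both names lead to one inode with a link count of 2. The function name hard_link_demo is made up here, and the caller is assumed to pass two unused file names on the same file system.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create the file `first`, add `second` as another name for it with link(),
   then compare what stat reports through each name.
   Returns 1 if both names share one inode with link count 2,
   0 if they do not, and -1 if any step fails. */
int hard_link_demo(const char *first, const char *second) {
    assert(first != NULL && second != NULL);
    int fd = open(first, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    if (link(first, second) != 0)
        return -1;
    struct stat a, b;
    if (stat(first, &a) != 0 || stat(second, &b) != 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino
        && a.st_nlink == 2 && b.st_nlink == 2;
}
```

Removing one of the names afterwards only drops the link count to 1; the file's contents stay reachable through the other name.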
Hard links cannot be created for directories; they can only be created for non-directories.
Allowing them would create situations where certain operations break. The following
example, if it were allowed, would break pwd.
$ln /home/eggert/junk /home/eggert/junk/j
In this example, j is a hard link inside junk that points back to the junk directory itself. pwd would not
work, since it relies on calls to open and readdir to walk up through parent directories.
//Code for pwd (sketch)
readdir("/home/eggert/junk") //look at names in parent directory
fd = open("/home/eggert/junk", ....)
Because of this cycle, junk contains j, which is junk again, so paths can recurse without end: /home/eggert/junk/j/junk/j/junk/j
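One step of the pwd walk can be sketched as follows: find the name that the parent directory uses for the current directory by matching (device, inode) pairs. This is an illustration of the classic approach, not the code of any particular pwd; name_in_parent is a made-up name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Look through ".." for the entry that refers to the current directory
   and copy its name into name (capacity n, including the terminator).
   Returns 0 on success and -1 on failure. */
int name_in_parent(char *name, size_t n) {
    assert(name != NULL && n > 0);
    struct stat here, entry;
    if (stat(".", &here) != 0)
        return -1;
    DIR *parent = opendir("..");
    if (parent == NULL)
        return -1;
    int found = -1;
    struct dirent *e;
    while ((e = readdir(parent)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        /* stat each entry relative to the parent, without following symlinks */
        if (fstatat(dirfd(parent), e->d_name, &entry, AT_SYMLINK_NOFOLLOW) != 0)
            continue;
        if (entry.st_dev == here.st_dev && entry.st_ino == here.st_ino) {
            if (strlen(e->d_name) < n) {
                strcpy(name, e->d_name);
                found = 0;
            }
            break;
        }
    }
    closedir(parent);
    return found;
}
```

pwd repeats this step, moving up one directory each time, until it reaches the root. The walk assumes every directory has exactly one parent and one name in it, which is the property that a hard link to a directory would break.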
Operations on the BSD FFS layout include creating a file, writing some data, and extending a file. In the block bitmap, each block
is represented by one bit, with 0 indicating that it's free and 1 indicating that it's allocated. The BSD FFS layout may not be optimally
efficient, since it requires 3 lseeks and 3 writes to do a single write. This, however, may not be all that bad, since many applications run in
parallel, and blocks are generally allocated next to each other, which helps spatial locality. There is a correctness issue, however, when the power is pulled
after only some of the writes have finished; this topic will be continued in the next lecture.
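The block bitmap itself is easy to sketch in memory. The code below is a toy version, not the BSD FFS implementation: the function names are made up, the bit order within each byte is an assumption, and a real file system would also have to write the changed bitmap back to disk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit i is 1 if block i is allocated and 0 if it is free. */
int block_is_allocated(const uint8_t *bitmap, size_t i) {
    return (bitmap[i / 8] >> (i % 8)) & 1;
}

/* Find the first free block among nblocks, mark it allocated,
   and return its index. Returns -1 if every block is in use. */
long allocate_block(uint8_t *bitmap, size_t nblocks) {
    assert(bitmap != NULL);
    for (size_t i = 0; i < nblocks; i++) {
        if (!block_is_allocated(bitmap, i)) {
            bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
            return (long)i;
        }
    }
    return -1;
}

/* Mark block i as free again. */
void free_block(uint8_t *bitmap, size_t i) {
    bitmap[i / 8] &= (uint8_t)~(1u << (i % 8));
}
```

Picking the lowest-numbered free block is the simplest policy; it tends to place a growing file's blocks near each other when neighboring blocks are free.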