Operating System Principles
CS 111 - Spring 2013
Instructor: Prof. Paul Eggert
Scribe Notes - Lecture 13 (5/16/2013)
By: Deepak Ananth Rama
Topics:
Consider the case of updating a virus checker software on an file system. Assume that the current version of the software is placed in the file system as below.
What we want during the update:
Suppose the following simple steps are performed:
$ wget http://new-version.tar.gz //1 The problem with the above steps is that between steps (4) and (5) no virus checker software exists on the system, leaving it open to attacks. In addition, a power failure after (4) may lead to inconsistencies. Therefore, we would ideally want to perform steps (4) and (5) together as an atomic operation, but no such enabling construct exists currently. A workaround would be to use symbolic links.
Symbolic links are directory entries whose value is a filename. In other words a symbolic link has a filename as its contents they are files with filename for contents, i.e. their content refers to another file. The destination file may or may not exist, thus symbolic links may often be used to store string values.
Here foo is the destination, and bar is the symbolic link. The file type l indicates that it is a symbolic file. System call symlink() is used to create a symbolic link.
symlink(“foo”, “bar”)With an assumption that creation of symbolic link is an atomic operation, it can be used to do the virus checker update. Suppose v1 is the old version of the software, and virus-checker is created as a symbolic link to v1. It would have been simple to change the value of virus-checker to point to the new version. However, symbolic links are read only. Creation of symbolic links is atomic, but deleting and re-creating it is not.
$ ln -s v1 virus-checkerA workaround is to create a new symbolic link, say new-virus-checker as a symbolic link to the new version v2, and then rename the new symbolic link new-virus-checker to the old link virus-checker. This works, as rename is always atomic. This ensures that the effective location of the current active virus-checker is maintained.
$ ln -s v2 new-virus-checker $ mv new-virus-checker virus-checkerHow to implement symbolic links? Traditional answer is to use inodes, with the destination filename stored in a data block pointed to by the symbolic file’s inode’s first direct pointer as shown below.
Hard links can be created to symbolic links.
$ ln -s a b
$ ln b x
$ mkdir d
$ ln b d/y
$ ls -l d/y
lrwxrwxrwx ...... d/y -> a
There are arguments that this wastes memory. A short circiuit answer is to use short symbolic links. Here the destination file name is stored in the inode itself, in place of the direct pointers. This limits the length of the destination file name, while preventing memory wastage.
Symbolic links are always in the context of the directory. The below cat command fails as d/y points to a, which is expected to be in the same directory as y. d/y => d/a.
$ cat d/ySuppose you have opened a file in write mode to write the contents from a buffer into the file.
fd = open(“file”, O_WRONLY);The above steps violate the golden rule of durability - “Never overwrite the only copy of your data”. The reasoning behind this is to prevent loss or corruption of data in case of a power failure. Since large writes are not atomic, a power failure midway through the write operation could possibly corrupt the existing data on the file, as well as cause loss of the data in the buffer.
One possible solution is to use two files, say file1 and file2. During each write operation, overwrite file1 and file2 alternately. This ensures that at least the existing data in the file is recoverable. This approach should work in a simple environment, but there would be no easy way of identifying the old and new versions of the file. Let’s look at the various steps below, where A and B are the two copies.
At any point of time we have either A or B on the disk. The problem here is that on a reboot case 2 and 3 are bad as we don’t know which of the two versions is correct. Timestamps would not help identify the correct version, as the junk data file might have a more recent timestamp.
A more feasible solution involves having three copies of the open file. The steps are shown below. In any case, the copy with the majority is chosen as a winner and identified as the correct file to be considered. Since three copies are maintained, more storage is required and hence rarely implemented in operating systems.
Consider a rename() system call as below, it is not an atomic operation. The current process’s process descriptor is suggested as well.
The steps in the rename() system call are:
Steps (d) to (f) are critical for successful rename() operation. Steps (e) and (f) if performed before (c) would result in wrong timestamp being recorded in the inode. Another issue wit with the length of the new filename. The directory entry would have been compacted to store the current filename a. If the new filename’s length is longer say bbbbbbbbbbbbbbb, the directory entry will have to be moved elsewhere. This results in an additional write() operation, update of pointer etc. More such writes only increase the risk of data corruption in case of a system crash or power failure midway.
The rename() operation gets much more complicated if both the file and the containing directory are to be renamed.
There are two options to handle this, but both have issues if a system crash or power failure after the first step.
In the first case, midway we may have a situation as shown with both filenames pointing to the same inode. Though they would be similar to hard links, the link count in the inode is 1, as expected. On deleting the old filename the link count goes down to 0, resulting in the file data being deleted by the kernel, causing data loss. The fix for this is to perform the below steps. This however requires over four separate write operations and is thus expensive.
fsck() can be performed without a reboot as well. The file system is first unmounted and then the fsck() system call is run. I also claimed that a fsck() can be performed without requiring the file system to be unmounted. We use the concept of invariants to handle issues such as loss of data and corruption of file system.