Continued from last lecture, by Qijun Chen and Sibo Liu.
First come first serve:
In first come first serve (FCFS), we process requests in the order they arrive, regardless of where the disk head currently is.
Assume the accesses are random, so the blocks we want come in a totally random order. Then what is the average distance the head must travel to access the next block? Modeling the disk as the interval [0,1], the average seek distance is 1/3 of the disk. We want an algorithm whose average performance is better than 1/3.
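As a sanity check on the 1/3 figure (a standard calculation, written out here rather than taken from the lecture): if the head position x and the requested block y are independent and uniform on [0,1], the expected seek distance is

    E|x - y| = \int_0^1 \int_0^1 |x - y| \, dy \, dx
             = \int_0^1 \left( \frac{x^2}{2} + \frac{(1-x)^2}{2} \right) dx
             = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}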
First come first serve is fair, but its cost is high.

Shortest seek time first:
It moves the head the shortest distance it can to satisfy a pending request: go to the closest block first, maximizing throughput. In each unit of time the head gets as much work done as it can. However, starvation is possible: an old request may keep being postponed in favor of newer, closer requests, so this policy is not fair.
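A minimal sketch of the selection rule (hypothetical code, not from the lecture): among the pending requests, always dispatch the one closest to the current head position.

    #include <stdlib.h>

    /* Shortest seek time first: pick the pending request nearest the head.
       Returns the index of the chosen request, or -1 if none are pending. */
    int sstf_pick(const int *pending, int npending, int head)
    {
        int best = -1, bestdist = 0;
        for (int i = 0; i < npending; i++) {
            int d = abs(pending[i] - head);
            if (best == -1 || d < bestdist) {
                best = i;
                bestdist = d;
            }
        }
        return best;  /* a far-away request can starve while closer ones keep arriving */
    }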
The elevator algorithm:
We always go to the nearest request, but only if it lies in the direction the disk head is currently moving, just like a real elevator.
This is not as fair as first come first serve, but there is no starvation, and it should perform better. Disadvantage: a little bit unfair.
Blocks in the middle of the disk are serviced more promptly than blocks at either end, but we can put important, frequently accessed files in the middle.
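A sketch of the elevator's selection rule, under the same hypothetical setup as above ('dir' is +1 or -1, the head's current direction):

    /* Elevator: serve the nearest request in the current direction;
       reverse direction only when no request lies ahead of the head. */
    int elevator_pick(const int *pending, int npending, int head, int *dir)
    {
        for (;;) {
            int best = -1, bestdist = 0;
            for (int i = 0; i < npending; i++) {
                int d = (pending[i] - head) * *dir;  /* distance ahead of the head */
                if (d >= 0 && (best == -1 || d < bestdist)) {
                    best = i;
                    bestdist = d;
                }
            }
            if (best != -1 || npending == 0)
                return best;
            *dir = -*dir;  /* nothing ahead: reverse, like a real elevator */
        }
    }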
The circular elevator algorithm:
The head moves in only one direction; when it reaches the end of the disk, it magically jumps back to the beginning (like rewinding a tape).
advantage: fairer, since blocks at the ends wait no longer than blocks in the middle
disadvantage: the seek back from the end to the beginning has a cost, so it is a bit slower due to this rewind
If requests keep alternating between two distant regions of the disk, moving the head back and forth is very costly, and performance is lousy no matter which algorithm we use.
Solution: wait. If the next request follows a different pattern, hold the head in place for a short time; if no further request comes in, then move to the other position.
The kernel remembers each process's access pattern and speculates about what the process will do next, delaying in the hope that another nearby request is coming. For example, wait 1 ms after each of a process's requests finishes.
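A sketch of that waiting decision (hypothetical structures; the 1 ms window follows the lecture's example):

    #include <stdbool.h>

    struct request { int sector; int pid; };

    #define ANTIC_WAIT_US 1000   /* 1 ms anticipation window */

    /* Decide whether to dispatch a far-away request now, or keep stalling
       in the hope that the process we just served issues a nearby request.
       'now' and 'last_done' are timestamps in microseconds. */
    bool should_dispatch(const struct request *next, int last_pid,
                         long now, long last_done)
    {
        if (next->pid == last_pid)
            return true;                          /* same process: no need to wait */
        return now - last_done >= ANTIC_WAIT_US;  /* window expired: stop waiting */
    }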
In practice, we should make use of all of these algorithms, combining their ideas when designing a disk scheduler.
Flash (SSD):
It is more expensive, but it is faster: there are no seeks, and you access the data directly.
Flash can wear out on writes: if you keep writing to the same part of the flash, it might wear out.
Solution: the flash needs a controller that does wear leveling. The controller keeps track of which parts of the flash have been written too often and spreads new writes across the device, limiting the number of writes to any one part so the flash lasts longer. The extra bookkeeping slows performance a little, but it is still very fast.
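A sketch of one way a controller might level wear (entirely illustrative; real flash translation layers are more elaborate). It keeps a logical-to-physical block map plus a per-block erase counter, and on each write redirects the logical block to the least-worn free physical block. Assume map[] starts out mapping every logical block to some physical block.

    #define NBLOCKS 1024

    int map[NBLOCKS];     /* logical block -> physical block */
    int erases[NBLOCKS];  /* erase count per physical block  */
    int in_use[NBLOCKS];  /* 1 if some logical block maps to this physical block */

    /* Redirect a write so that no single physical block wears out first. */
    int remap_for_write(int logical)
    {
        int best = -1;
        for (int p = 0; p < NBLOCKS; p++)
            if (!in_use[p] && (best == -1 || erases[p] < erases[best]))
                best = p;             /* least-worn free physical block */
        if (best == -1)
            return map[logical];      /* no free block: write in place */
        in_use[map[logical]] = 0;     /* old physical block becomes free */
        in_use[best] = 1;
        erases[best]++;
        map[logical] = best;
        return best;
    }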
File systems:
A file system is a data structure that lives on the disk or SSD; it provides the abstraction of a set of files.
    +-----------+
    | Directory |
    | File1     |
    | File2     |
    | Unused    |
    | File3     |
    | Unused    |
    | File4     |
    | File5     |
    | Unused    |
    +-----------+
Imagine the disk as an array of sectors:
1. Each file starts on a sector boundary.
2. How do we find out where the data are? There is a table at the beginning of the array: a fixed-size directory holding an array of directory entries.
3. Each entry contains the file's name, the index of its start sector (a 16-bit number), and the file's size (see the sketch below).
Since each file starts on a sector boundary, the start location of a file cannot be an arbitrary byte offset; its byte offset is always a multiple of 512.
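A sketch of what such a directory entry might contain (field names and widths are illustrative, not the exact RT-11 on-disk format):

    #include <stdint.h>

    #define NAME_LEN 10   /* illustrative fixed-width name field */

    /* One entry in the fixed-size directory at the front of the disk. */
    struct dir_entry {
        char     name[NAME_LEN];  /* file name, padded */
        uint16_t start_sector;    /* index of the file's first sector */
        uint16_t nsectors;        /* file length, in sectors */
    };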
RT-11 ('RT' for Real Time) was a small, single-user real-time operating system for the Digital Equipment Corporation PDP-11 family of 16-bit computers. RT-11 was first implemented in 1970 and was widely used for real-time systems, process control, and data acquisition across the full line of PDP-11 computers.
Eggert's example system is similar to the RT-11 file system. What are some properties of this kind of approach?
When you create a file, you must specify its size.
We get fragmentation issues, specifically external fragmentation: we have enough free space, but it is scattered all over the place. We also get internal fragmentation: because each file starts on a sector boundary, (-size)%512 bytes are wasted at the end of each file. For example, a 1000-byte file occupies two sectors (1024 bytes) and wastes 24 of them.
advantages:
It is really simple; we give up some performance to keep the code clean.
Sequential reads and writes are fast.
disadvantages:
no subdirectories
a limit on the number of files it can hold (the directory is fixed size)
You have to know the length of a file in advance, or at least give a good estimate.
possibility of a corrupt directory: if the directory is corrupted, it is a disaster.
fragmentation: occurs when you have enough space, yet you cannot store the data.
Internal fragmentation:
The tail of each file's last sector is unused (about ~256 bytes wasted per file, on average).
External fragmentation:
You have enough free space, but it is not available all in one place; it is not contiguous.
The FAT file system:
It is designed to solve these fragmentation problems. What does the FAT file system fix relative to the previous design?
1. The previous design had no subdirectories; FAT supports them.
2. The previous design suffered external fragmentation; FAT eliminates it, since a file's blocks no longer need to be contiguous.
Problems of FAT:
performance issue:
Performance suffers when a file's blocks are fragmented across the disk. So there is a defragmenter, which reads files at a low level and rewrites them into sequential blocks (FAT[i] = i+1) as much as possible.
However, defragmenting tends to be slow and ties up the drive, and, frankly, its power consumption can be considerable.
moving a file:
Renaming within a directory is easy. But suppose you want to move a file from one directory to another; if you try to do that, you get problems.
You cannot update both directories at the same time: either you erase one entry and then write the other, or you add the new entry and then erase the old one. (The problem: for a moment, two entries, or none, point directly to the same file.) If you then remove one of the duplicate entries, you free the file's blocks even though the other entry still points at them, which can cause disaster.
Solving the problem of moving a file from one directory to another:
The inode idea: we want a data structure on disk that describes the file. An inode is a separate piece of data structure, independent of any directory. It has a hierarchy of block pointers, which makes it fast to find a file's data and lets large files store extra data through blocks that hold pointers to more data blocks.
Boot sector: information for booting from the disk and loading up the OS.
Superblock: contains the version number, size, amount of free space, and the location of the root directory of the file system.
Bitmap blocks: a 1 bit indicates a used block, a 0 bit a free block.
Inode table: stores all the inodes, which hold the relevant per-file information.
Data blocks: contain the file data; depending on the size of the file, some of these blocks instead hold pointers that extend the file to even more blocks.
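A sketch of an inode with this pointer hierarchy (the field names and the number of direct pointers are illustrative; classic Unix inodes use around 10-12 direct pointers):

    #include <stdint.h>

    #define NDIRECT 10   /* illustrative count of direct block pointers */

    /* One inode: a file's metadata, independent of any directory. */
    struct inode {
        uint16_t mode;             /* file type and permissions */
        uint16_t linkcount;        /* number of directory entries pointing here */
        uint32_t size;             /* file size in bytes */
        uint32_t direct[NDIRECT];  /* block numbers of the first data blocks */
        uint32_t indirect;         /* block holding pointers to more data blocks */
        uint32_t double_indirect;  /* block holding pointers to indirect blocks */
    };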
Implementing lseek with FAT:
You have to walk through a linked list to get to the right block number, so with FAT, lseek can be slow.
With an inode, however, which contains an array of block pointers, you have direct access to the block with that number, so lseek is faster than the FAT implementation.
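A sketch of the difference (hypothetical in-memory structures, assuming 512-byte blocks and a table fat[] where fat[i] is the block that follows block i in the file):

    #define BLKSIZE 512
    #define FAT_EOF (-1)

    /* FAT: find the block holding 'offset' by chasing the chain: O(offset). */
    int fat_block_of(const int *fat, int first_block, long offset)
    {
        int b = first_block;
        for (long n = offset / BLKSIZE; n > 0 && b != FAT_EOF; n--)
            b = fat[b];   /* one hop per block between the start and the offset */
        return b;
    }

    /* Inode: index straight into the pointer array: O(1) for direct blocks. */
    int inode_block_of(const int *direct, int ndirect, long offset)
    {
        long n = offset / BLKSIZE;
        return n < ndirect ? direct[n] : -1;  /* -1: would need an indirect block */
    }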
There is no system call to rename a file.
Instead, use link and unlink:
link(old, new);
unlink(old);
link writes the new entry into the destination directory; unlink then erases the old entry.
Every time you call unlink, the file's link count is decremented, and when it reaches zero the file's space is freed: if (--f->linkcount == 0), free the file's blocks.
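A minimal sketch of renaming with these two real system calls (error handling kept simple; note the window during which both names exist):

    #include <stdio.h>
    #include <unistd.h>

    /* Rename 'old' to 'new' the pre-rename(2) way: link, then unlink.
       Between the two calls, both names refer to the same file, which is
       why the link count matters: data is freed only when it reaches 0. */
    int rename_via_link(const char *old, const char *new)
    {
        if (link(old, new) != 0) {   /* add the new directory entry */
            perror("link");
            return -1;
        }
        if (unlink(old) != 0) {      /* remove the old directory entry */
            perror("unlink");
            return -1;               /* file remains reachable by both names */
        }
        return 0;
    }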
FAT
+ can be defragmented
Unix/BSD
+ can repair the file system after rebooting (e.g., with fsck)
For the lost+found directory, where the repair tool puts recovered orphan files, you can choose either to remove the files or to save them anyway.
to be continued...