Most programs need a place to access, process, and store data. Although RAM provides fast working memory, it is volatile and too small to meet the demands of most applications. File systems are implemented to manage information on a much larger, persistent scale. The following properties characterize the benefits provided by any adequate file system:
CPU performance has increased non-linearly over time, while disk performance has improved at only a steady, almost constant rate. As a result, CPU performance has far surpassed disk performance, and file systems must be designed to hide disk latency.
A hard disk contains several platters that hold data. An arm with a read/write head for each platter surface accesses the data on the disk. As the disk rotates about its axis, sectors of the platters spin under the heads and are read into a buffer. Because the heads are aligned, a cylinder of data can be read for each disk rotation. The arm selects which cylinder of data is read by moving the heads across the platters. To read a specific sector from the disk, the arm must move the heads to the correct cylinder (a seek) and then wait for the sector to rotate under the head.
A hard disk drive is a storage device that uses disks to store data. The drive contains a disk controller for managing data on a hard disk. Many of the metrics of the hard disk are influenced by the spinning disk.
| Device Metrics | Seagate Barracuda LP 2 |
| --- | --- |
| Total Storage | 2 TB |
| Cost | $150 |
| External Transfer Rate | 300 MBits/s |
| Sustained Transfer Rate | 90 MBits/s |
| Average Latency | 5.5 ms |
| Average Seek Time | 10 ms |
| Cache Size | 32 MB |
| Disk Rotation Speed | 5900 rpm (98.33 Hz) |
| Contact Starts and Stops | 50,000 |
| Power (Operating) | 6.8 W |
| Power (Idle) | 5.5 W |
| Read Error Rate | 10^-14 nonrecoverable read errors per bit |
| Annualized Failure Rate (AFR) | 0.32% |
| Operating Shock | 70 G |
| Non-Operating Shock | 300 G |
| Noise (Operating) | 2.6 Bels |
| Noise (Idle) | 2.5 Bels |
An SSD is a storage device that uses integrated circuits to store data. Data is stored electronically in the circuits instead of on a disk. SSDs contain no moving mechanical components, so they produce less noise, have lower access times, and are less fragile than hard disks.
| Device Metrics | Corsair Force GT |
| --- | --- |
| Total Storage | 60 GB |
| Cost | $90 |
| External Transfer Rate | 3 GBits/s |
| Sequential Read Rate | 280 MBits/s |
| Sequential Write Rate | 270 MBits/s |
| Mean Time Between Failures | 1 million hours |
| Shock Capacity | 1000 G |
| Maximum Power | 20 W |
| Device Metrics | Seagate Barracuda LP 2 | Corsair Force GT |
| --- | --- | --- |
| Total Storage | 2 TB | 60 GB |
| External Transfer Rate | 300 MBits/s | 3 GBits/s |
| Cost | $150 | $90 |
A file system uses speculation to improve performance by predicting requests ahead of time and processing them before the application makes them. The file system finishes work ahead of time to mask delay.
Suppose an application reads blocks based on a predictable pattern. The file system could predict future blocks that the application will request and read them ahead of time. For example, if an application read adjacent blocks, the file system could read the next adjacent block before the application requests it.
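A minimal sketch of this read-ahead idea in C. The disk, block sizes, cache, and counters here are simulated stand-ins, not a real device interface; in a real file system the prefetch would overlap with the application's computation rather than run synchronously.

```c
#include <string.h>

#define BLOCK_SIZE 4
#define NUM_BLOCKS 8

/* Simulated disk: NUM_BLOCKS blocks of BLOCK_SIZE bytes each. */
static char disk[NUM_BLOCKS][BLOCK_SIZE];

static char cache[BLOCK_SIZE];   /* one-block read-ahead cache */
static int  cached_block = -1;   /* block number currently in the cache */
static int  last_block   = -1;   /* most recently requested block */
static int  cache_hits   = 0;    /* requests served without a device read */

/* Slow path: fetch one block from the simulated device. */
static void device_read(int block, char *buf) {
    memcpy(buf, disk[block], BLOCK_SIZE);
}

/* Read a block, speculatively prefetching the next one when the
 * access pattern looks sequential. */
static void fs_read(int block, char *buf) {
    if (block == cached_block) {
        memcpy(buf, cache, BLOCK_SIZE);      /* the prefetch paid off */
        cache_hits++;
    } else {
        device_read(block, buf);
    }
    if (block == last_block + 1 && block + 1 < NUM_BLOCKS) {
        device_read(block + 1, cache);       /* speculate on the next block */
        cached_block = block + 1;
    }
    last_block = block;
}
```

Reading blocks 0 through 3 in order, only the read of block 0 misses; blocks 1 through 3 are served from the prefetched copy.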
Suppose an application makes many requests to read data from a given location. Instead of reading one of these blocks each time the application requests one, the file system could read a large batch of blocks at once to prepare for future requests.
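A sketch of this batching idea in C, assuming a hypothetical byte-addressed device where each fetch is one expensive operation; the device contents and batch size are made up for illustration.

```c
#define BATCH 64

static char batch_buf[BATCH];
static long batch_start    = -1;  /* first offset covered by batch_buf */
static int  device_fetches = 0;   /* count of expensive device operations */

/* Hypothetical device: fill buf with BATCH bytes starting at offset. */
static void device_fetch(long offset, char *buf) {
    for (int i = 0; i < BATCH; i++)
        buf[i] = (char)((offset + i) % 128);  /* stand-in for real data */
    device_fetches++;
}

/* Serve one byte, going to the device only when the request falls
 * outside the batch fetched for a previous request. */
static char read_byte(long offset) {
    if (batch_start < 0 || offset < batch_start ||
        offset >= batch_start + BATCH) {
        long base = (offset / BATCH) * BATCH;  /* align to a batch boundary */
        device_fetch(base, batch_buf);
        batch_start = base;
    }
    return batch_buf[offset - batch_start];
}
```

Reading 128 consecutive bytes this way costs two device operations instead of 128.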
When given a request by the application, the file system may wait (dally) instead of processing the request immediately. Later requests can then be processed together with the delayed one rather than individually; time is saved when combined processing avoids repeating similar work for each request.
Write data is placed in a buffer before being written to the disk. Instead of performing a disk write for every write request, the file system delays writing until the buffer is full. Batching increases the amount of data per write but reduces the number of disk writes, which saves time.
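A sketch of write buffering in C; `device_write` and the buffer size are stand-ins for one physical disk write and a real cache.

```c
#define WBUF_SIZE 8

static char wbuf[WBUF_SIZE];     /* pending write data */
static int  wbuf_len    = 0;
static int  disk_writes = 0;     /* count of actual device writes */

/* Stand-in for one physical write of len bytes to the disk. */
static void device_write(const char *data, int len) {
    (void)data; (void)len;       /* a real driver would transfer the bytes */
    disk_writes++;
}

/* Buffer each write request; go to the disk only when the buffer fills,
 * turning WBUF_SIZE requests into a single device write. */
static void fs_write(char c) {
    wbuf[wbuf_len++] = c;
    if (wbuf_len == WBUF_SIZE) {
        device_write(wbuf, wbuf_len);
        wbuf_len = 0;
    }
}
```

Sixteen one-byte write requests reach the disk as only two device writes.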
Suppose an application requests a read from a given location and then requests another read from a different location. The file system could delay processing the second request and instead wait for any future read requests from the current location. This reduces the amount of seeking the file system must perform.
In practice, data requests have patterns which can be analyzed to predict the application's next request.
The file system can take advantage of spatial locality if the application repeatedly requests data at adjacent locations. Speculation can be used to read ahead and fetch data from adjacent locations before the application requests it, and dallying can be used to wait for all requests to the current location before accessing the disk.
An application may also exhibit temporal locality, making several requests for data in a small region during a given time interval. Dallying can be used to gather the requests for the current location and serve them together.
Consider a machine with the following properties:
- Sending a command to the disk takes 5 microseconds.
- Disk latency is 50 microseconds.
- Each programmed I/O (PIO) transfers one Byte and takes 1 microsecond.
- Processing one Byte takes 0.125 microseconds.
The following code reads and processes one character at a time:
for (;;) {
    char c;
    if (sys_read(&c) < 0) {   /* read one character; negative means EOF/error */
        break;
    }
    process(c);
}
The latency per read is:
5 microseconds (send command) + 50 microseconds (disk latency) + 1 microsecond (inbyte) + 0.125 microseconds (compute) = 56.125 microseconds
A machine that uses this simple strategy for reading and processing bytes experiences high latency and low throughput. Utilization is also very low.
Instead of processing only one Byte at a time, batch reads of 40 Bytes to increase throughput.
for (;;) {
    char buf[40];
    if (sys_read(buf, 40) < 0) {   /* read a 40-Byte batch in one request */
        break;
    }
    process(buf, 40);
}
The latency per read is:
5 microseconds (send command) + 50 microseconds (disk latency) + 40 microseconds (1 microsecond/Byte * 40 Bytes) + 5 microseconds (0.125 microseconds/Byte * 40 Bytes) = 100 microseconds
A machine that uses batching for reading and processing bytes experiences a lower latency and higher throughput for each requested byte in comparison to the simple strategy. Utilization also increases but is still small.
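The arithmetic above generalizes to any batch size. A small helper in C (the function and parameter names are ours; the values come from the example machine) makes the per-byte comparison explicit:

```c
/* Machine parameters from the example, in microseconds. */
static const double CMD_US  = 5.0;    /* send command to the disk */
static const double DISK_US = 50.0;   /* disk latency */
static const double PIO_US  = 1.0;    /* one PIO, transferring one Byte */
static const double CPU_US  = 0.125;  /* computation per Byte */

/* Total latency to read and process one n-Byte batch. */
static double batch_latency(int n) {
    return CMD_US + DISK_US + n * PIO_US + n * CPU_US;
}

/* The batch cost amortized over its n Bytes. */
static double per_byte_latency(int n) {
    return batch_latency(n) / n;
}
```

A single batch takes longer (100 microseconds for 40 Bytes versus 56.125 microseconds for 1 Byte), but the amortized cost per Byte drops from 56.125 to 2.5 microseconds.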
The elapsed time for processing a request can be improved by implementing device interrupts. Interrupt handlers can handle PIOs and computations in parallel.
The following pseudocode handles device interrupts:
do {
    block until there is an interrupt
    handle_interrupt()
} while (more data);

handle_interrupt() {
    issue PIOs for next request
    compute
}
The latency per read is:
5 microseconds (send command) + 50 microseconds (disk latency) + 40 microseconds (PIO) + "5 microseconds" (computation) (done in parallel with PIO ==> = 0 microseconds) = 95 microseconds
With a larger batch of 400 Bytes, the latency per read is:
5 microseconds (send command) + 50 microseconds (disk latency) + 400 microseconds (PIO) + "50 microseconds" (computation) (done in parallel with PIO ==> = 0 microseconds) = 455 microseconds
Direct memory access (DMA) allows the disk controller to transfer data to memory without involving the CPU. With DMA, the file system can perform computations while the DMA controller completes the transfer of data.
The latency per read is:
5 microseconds (DMA setup) + 50 microseconds (disk latency) + "40 microseconds" (PIO) (done in parallel with other work ==> = 0 microseconds) + "5 microseconds" (computation) (done in parallel with disk latency ==> = 0 microseconds) + 5 microseconds (overhead of interrupt handler) = 60 microseconds
To avoid the extra overhead of interrupts, a polling strategy is used instead of a blocking strategy. The file system sets up the data transfer for the DMA controller to complete. Then the file system performs other work while the DMA controller finishes the transfer. However, the file system polls the disk controller after finishing its work instead of having the DMA controller interrupt it.
The latency per read is:
5 microseconds (DMA setup) + 50 microseconds (disk latency) + "40 microseconds" (PIO) (done in parallel with other work ==> = 0 microseconds) + "5 microseconds" (computation) (done in parallel with disk latency ==> = 0 microseconds) = 55 microseconds
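In the pseudocode style of the interrupt example, the DMA-with-polling strategy might look like this (the operation names are illustrative, not a real driver interface):

```
do {
    set up DMA transfer for the next request
    compute on the previously transferred data   // overlaps the transfer
    while (disk controller status != done)       // poll instead of blocking
        do other useful work
} while (more data);
```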
Multitasking allows processes to run concurrently through a scheduler. If the current thread cannot obtain the resources it needs to do work, it invokes the scheduler and transfers control of the CPU to another thread that can use it.
while ((inb(0x1f7) & 0xc0) != 0x40) {   /* not ready: BSY set or DRDY clear */
    yield();                            /* give the CPU to another thread */
}