CS111 Fall 2008

File System Organization (Nov 4)

By Xiao Lei

What is a File System?

-          an on-disk data structure that provides an abstraction of a collection of files

 

We look at some file system examples below:

 

RT-11 (Real Time) Contiguous Allocation

[Figure: RT-11 disk layout – each file stored as one contiguous run of blocks]

Each directory entry stores the file name, the location of the file's first block, and the size of the file
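
A minimal sketch of such an entry in C (the field names and sizes are illustrative assumptions, not RT-11's actual on-disk format):

#include <stdint.h>

/* Hypothetical RT-11-style directory entry: because each file is one
   contiguous run of blocks, a start location and a size are enough to
   locate every byte of the file. */
struct rt11_dirent {
    char     name[12];     /* file name */
    uint32_t first_block;  /* block where the file starts on disk */
    uint32_t size;         /* file length, in blocks */
};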

 

Pros:

Simple to implement

Predictable

Fast read/write/seek times

 

Cons:

Directory fills up if there are lots of tiny files

External fragmentation

Disk defragmentation is required periodically (to coalesce the fragmented free space)

Files cannot easily grow (Serious Drawback!)

 

How do we fix the file growth problem?

Use a Block Structured File System

-          Fixed-size blocks

-          Whenever we need to grow a file we simply allocate a new block

 

FAT (File Allocation Table) File System (late 1970s)

[Figure: FAT disk layout – boot sector, superblock, FAT, then the data blocks]

 

The first block is the boot sector (we typically do not touch this block)

The second block is the superblock, which stores information about the disk, such as its size and where the root directory begins

The FAT section stores an array with one entry per data block; entry i holds the block number of the block that follows block i in its file (with special values marking end-of-file and free blocks)

The data section consists of 4KiB blocks

 

For example, a file that takes up 4 blocks of data is represented in the FAT file system in the following way:

[Figure: a 4-block file as a chain of FAT entries ending in an end-of-file marker]
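
Reading such a file means chasing the chain through the FAT. A minimal sketch, assuming the FAT has already been loaded into memory as an array of 16-bit entries and that 0xFFFF marks end-of-file (both are assumptions for this example):

#include <stdint.h>
#include <stdio.h>

#define FAT_EOF 0xFFFFu   /* assumed end-of-file marker */

/* Print every block number in one file's chain, in order. */
void print_chain(const uint16_t *fat, uint16_t first_block)
{
    for (uint16_t b = first_block; b != FAT_EOF; b = fat[b])
        printf("block %u\n", (unsigned)b);
}

Note that random access is expensive: reaching a block deep in the file means walking the chain from the start, one FAT entry at a time.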

 

How do we know where the first block of the file starts?

We use a directory – a file that contains directory entries, with the following structure for each entry:

[Figure: FAT directory entry – file name, first block number, file size]
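
In C, such an entry might look like this (a sketch; the real on-disk layout differs in detail):

#include <stdint.h>

struct fat_dirent {
    char     name[11];     /* file name */
    uint16_t first_block;  /* index of the file's first block; the FAT chain takes it from there */
    uint32_t size;         /* file size in bytes */
};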

 

Pros:

Files can now grow

No external fragmentation

 

Cons:

Internal fragmentation

Slow and unpredictable performance: reaching the Nth block of a file means walking the FAT chain one entry at a time (especially costly if the FAT is not cached), and a file's blocks may be scattered anywhere on disk

 

 

Taking a look at the Abstraction Layers of a Unix File System

 

[Figure: the abstraction layers, split into a software level (upper layers) and a hardware level (lower layers)]

 

 

What are some problems with these abstractions?

We look at some security issues:

$ rm file – does not actually erase the file's data (the blocks are merely marked free; their contents remain on disk)

We can have a new command, "$ shred file", that overwrites the file's contents with random data X times, then removes it.

(The reason for multiple overwrites is that writes on magnetic disks are imprecise; old information can still be deciphered if the new data does not completely overlap the old)

-          A problem with this approach is that if the file is cached, the overwrites are done only in RAM, and the data on disk is effectively overwritten just once

-          We can fix this by having a special system call (e.g., fsync) to make sure each write is done to the disk – see the sketch below
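
A minimal sketch of such a shred, assuming the file system overwrites blocks in place (fsync is a real system call; error handling is trimmed and the rest of the design is illustrative):

#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

void shred(const char *path, int passes)
{
    struct stat st;
    char buf[4096];

    if (stat(path, &st) != 0)
        return;
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return;
    for (int p = 0; p < passes; p++) {
        lseek(fd, 0, SEEK_SET);                  /* start each pass at the beginning */
        for (off_t done = 0; done < st.st_size; ) {
            size_t n = sizeof buf;
            if ((off_t)n > st.st_size - done)
                n = (size_t)(st.st_size - done); /* don't write past the old end */
            for (size_t i = 0; i < n; i++)
                buf[i] = (char)rand();           /* pseudo-random fill */
            write(fd, buf, n);
            done += (off_t)n;
        }
        fsync(fd);   /* force this pass out of the RAM cache onto the disk */
    }
    close(fd);
    unlink(path);    /* finally remove the directory entry */
}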

However, this assumes writes always overwrite the old blocks in place

Some file systems allocate new blocks on the fly while writing (to optimize performance), which means the old info still remains on disk!

(We can always shred the entire partition as the safest method to protect our data…)

 

Another way to protect data is to use an Encrypting File System

All the bytes on disk are encrypted with a secret key, which is kept in RAM only

Pros:

Shredding is trivial – Simply lose the key!

Cons:

Slow

If the key is accidentally lost, the data is, for all practical purposes, unrecoverable

 

Other Problems to Consider:

If the disk fries, we lose all the data!

 

We want a File System that can cross partition/disk boundaries!

Why?

  1. Reliability (RAID – write data to multiple disks)
  2. A Big File System (If we want to manage say a 1PB File System, we cannot just buy a 1PB hard drive as current technology stands)
  3. More speed and throughput!

 

BSD Fast File System (FFS) (~1980)

[Figure: FFS disk layout – boot sector, superblock, block bitmap, inode table, data blocks]

 

The boot sector and superblock are as before

(Some systems keep multiple copies of the superblock scattered across the disk for redundancy)

The block bitmap is an array of bits that tells you which blocks are free

The inode table contains an array of fixed-size inodes, each of which describes a file
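
A small sketch of the bitmap bookkeeping, assuming an in-memory byte array with one bit per data block and the convention that 1 means free (the polarity is an assumption; real systems pick one and stick to it):

#include <stdbool.h>
#include <stdint.h>

bool block_is_free(const uint8_t *bitmap, unsigned blk)
{
    return (bitmap[blk / 8] >> (blk % 8)) & 1;        /* test one bit */
}

void mark_block_used(uint8_t *bitmap, unsigned blk)
{
    bitmap[blk / 8] &= (uint8_t)~(1u << (blk % 8));   /* clear one bit */
}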

 

Big Inode Approach:

[Figure: a "big" inode holding a pointer to every one of its file's data blocks]

 

This method is inefficient: every inode must be big enough to map the largest possible file, so most of that space goes unused for small files.

 

Extensible Inode Approach:

[Figure: inode with 10 direct block pointers, an indirect pointer, and a doubly indirect pointer]

 

The first 10 pointers point directly to data blocks (direct data blocks)

 

If the file is bigger than 10 blocks, the 11th pointer points to an indirect block, which consists of pointers to the rest of the data blocks

 

Unix File Systems have doubly indirect data blocks (for even bigger files)

The 12th pointer points to a block of pointers, each of which points to another block of pointers, which in turn point to the actual data blocks.

 

We can grow this system of representing files to include triply indirect data blocks!
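
To see the scale: with 4KiB blocks and 4-byte block pointers, one block holds 1024 pointers, so the 10 direct pointers cover 40KiB, the indirect block adds 4MiB (1024 blocks), and the doubly indirect block adds 4GiB (1024 × 1024 blocks). A sketch of deciding how many levels of indirection a given logical block number needs, under exactly those assumptions:

#define NDIRECT   10
#define NINDIRECT 1024   /* 4096-byte block / 4-byte pointer */

/* 0 = direct, 1 = singly indirect, 2 = doubly, 3 = triply */
int indirection_level(unsigned long n)
{
    if (n < NDIRECT)
        return 0;
    n -= NDIRECT;
    if (n < NINDIRECT)
        return 1;
    n -= NINDIRECT;
    if (n < (unsigned long)NINDIRECT * NINDIRECT)
        return 2;
    return 3;
}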

 

Pros:

No external fragmentation

Cons:

Internal fragmentation

 

 

Directory entries for 1977 Unix:

Directories are just files whose contents are tables of entries

[Figure: 1977 Unix directory entry – a 2-byte inode number followed by a 14-byte file name]

 

Linux ext3 v2 Directory entry:

[Figure: ext3 directory entry – inode number, entry length, name length, file type, variable-length name]
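
Roughly, in C (this follows the ext2/ext3 on-disk entry as commonly described; treat the details as a sketch):

#include <stdint.h>

struct ext2_dir_entry {
    uint32_t inode;      /* inode number of the file */
    uint16_t rec_len;    /* total length of this entry, used to skip to the next one */
    uint8_t  name_len;   /* length of the name */
    uint8_t  file_type;  /* regular file, directory, symlink, ... */
    char     name[];     /* the name itself, name_len bytes, not NUL-terminated */
};

The variable-length name is why each entry must record its own length.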

 

Say we call mkdir("d") in our working directory

We allocate an Inode (say #1059)

Our working directory will now contain a directory entry where

Filename = “d”

Filetype = directory

InodeNum = 1059

 

Hardlinks

Each inode's metadata also includes a LINK COUNT field, which counts the number of directory entries pointing to that inode.

 

For example:

$ cat > a

$ ln a b

$ echo foo > a

$ cat b

OUTPUT: foo

 

At this point there are two directory entries, with file names "a" and "b", both holding the same inode number – they effectively name the same file.

(Hard links can even be made from different directories!)
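
The same experiment in C, using the link and stat system calls (assumes a file "a" already exists in the working directory):

#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    struct stat st;

    link("a", "b");                    /* second directory entry, same inode */
    if (stat("b", &st) == 0)
        printf("inode %lu, link count %lu\n",
               (unsigned long)st.st_ino,
               (unsigned long)st.st_nlink);   /* link count is now 2 */
    return 0;
}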

 

Why have hardlinks?