File System Implementation Issues
The paranoid professor is at it again, and now wants to remove data from his file system. What does he do?
rm a file? Nope, some other process might have the file open and still be using it. Also, rm does not actually remove the data from the file system. It just makes the data no longer visible, but with some tools the data can be recovered by people trying to steal the paranoid professor's work.
So what can be done? $ shred file! It overwrites the contents of the file with random data.
Why random data? Because if the file were overwritten with zeroes, it could leave hints as to what data was written there beforehand. This is because of a hard disk's microscopic imperfections: the drive does not write data in exactly the same place on disk as before.
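A minimal sketch of the overwrite idea in Python, not how shred itself is implemented. The function name overwrite_with_random, the passes default, and the chunk size are assumptions made for illustration, and os.urandom stands in for the random source. Whether the new bytes land on the same physical blocks as the old ones depends on the file system and the device.

```python
import os

def overwrite_with_random(path, passes=3, chunk_size=64 * 1024):
    """Overwrite a file's bytes in place with random data, `passes` times.

    Hypothetical helper for illustration; returns the number of bytes
    overwritten per pass.
    """
    size = os.path.getsize(path)
    # "r+b" opens for update without truncating, so the existing bytes are
    # written over rather than replaced by a new, empty file.
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))  # random bytes supplied by the OS
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push this pass to the device
    return size
```

Calling overwrite_with_random("notes.txt") leaves the file at its original length but with random contents; the name is still in its directory until it is removed with rm.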
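Where does the random data come from? /dev/random, which gets random data from "random" external events. This entropy is a limited resource that can run out, and traditionally reads from /dev/random block when it does.
/dev/urandom never blocks. When entropy runs low, it keeps producing output from a pseudorandom generator seeded from the entropy pool.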
You can use the RDRAND instruction with x86... But can you trust it?
Follow the FIPS standard for deleting data:
- melt device //breaks the drive!
- physically shred device //breaks the drive!
- degauss //breaks the drive!
- overwrite with random data (3 times?)
To really be able to destroy data, you have to know all the levels of a data system.
Levels of UNIX File System
File Name Components
ex: "/usr/bin/sh". Each individual section of the path name (usr, bin, and sh) is a file name component; the leading "/" names the root directory.
File Names (Path names)
Combining the file name components gives the complete path name of the file.
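As a small illustration, a path name can be broken back into its components by splitting on "/". The helper name path_components is an assumption, not something from the lecture.

```python
def path_components(path):
    """Split a path name into its file name components (hypothetical helper)."""
    # Empty strings come from the leading "/" and from repeated slashes,
    # so they are dropped.
    return [c for c in path.split("/") if c]

print(path_components("/usr/bin/sh"))  # ['usr', 'bin', 'sh']
```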
Inode Number
The inode number is the index that identifies a file's inode. The inode contains the essential metadata of the file.
Symbolic Links
A symbolic link contains a reference to another file or directory in the form of an absolute or relative path.
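A short sketch of this on a Unix-like system, using Python's os.symlink and os.readlink. The function name symlink_demo and the file names are made up for the example.

```python
import os
import tempfile

def symlink_demo():
    """Create a symbolic link and inspect it (illustrative only)."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target.txt")
        link = os.path.join(d, "link.txt")
        with open(target, "w") as f:
            f.write("hello")
        # The link stores the path text "target.txt", a relative path that
        # is resolved from the directory containing the link.
        os.symlink("target.txt", link)
        stored_path = os.readlink(link)   # the path kept inside the link
        with open(link) as f:             # opening the link follows it
            content = f.read()
        is_link = os.path.islink(link)
        # lstat looks at the link itself, stat follows it to the target.
        own_inode = os.lstat(link).st_ino != os.stat(link).st_ino
        return stored_path, content, is_link, own_inode
```

The last value shows that the link is a file system object of its own, separate from the file it refers to.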
Partition
A partition is a logical part of a disk. This region of the disk holds the file system. It is possible to have multiple file systems on one disk by creating multiple partitions.
Blocks
Blocks are a collection of sectors. A block is a sequence of bytes with a fixed maximum length, known as the block size. Blocks are typically 8192 bytes, or 16 sectors.
Sectors
A sector is the smallest storage unit addressable by a hard drive. The drive carries out the file system's read and write commands in units of sectors. Sectors are typically 512 bytes.
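The arithmetic relating bytes, sectors, and blocks can be sketched as follows, using the typical sizes quoted above. The function name locate is an assumption, and the byte offset is taken to be measured from the start of the region being addressed.

```python
SECTOR_SIZE = 512                               # bytes per sector
BLOCK_SIZE = 8192                               # bytes per block
SECTORS_PER_BLOCK = BLOCK_SIZE // SECTOR_SIZE   # 16

def locate(byte_offset):
    """Map a byte offset to (block, sector within block, byte within sector)."""
    block = byte_offset // BLOCK_SIZE
    offset_in_block = byte_offset % BLOCK_SIZE
    sector_in_block = offset_in_block // SECTOR_SIZE
    byte_in_sector = offset_in_block % SECTOR_SIZE
    return block, sector_in_block, byte_in_sector

print(SECTORS_PER_BLOCK)  # 16
print(locate(10000))      # (1, 3, 272)
```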
BSD File System
It is now very simple to find free blocks, so allocating storage is very fast. A big advantage of the BSD file system is that it keeps seek time to a minimum. The disk is laid out as:
- Boot Sector: Stores machine code to be loaded into RAM to boot the operating system
- Super Block: Contains the file system metadata and defines the file system type, size, status, and information.
- Block Bitmap: Used to track allocated blocks. Usually a block of bits that would indicate whether a particular disk block is free or in use.
- Inode Table: Contains a listing of all of the inodes, described below, of a file system.
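The block bitmap idea can be sketched as one bit per disk block, where 1 means in use and 0 means free. The class name BlockBitmap and its first-fit allocate policy are assumptions made for this example, not the BSD implementation.

```python
class BlockBitmap:
    """One bit per disk block: 1 = in use, 0 = free (illustrative sketch)."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)  # all blocks start free

    def is_used(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self):
        """Mark the first free block as used and return its number, or None."""
        for block in range(self.num_blocks):
            if not self.is_used(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        return None  # no free blocks left

    def free(self, block):
        """Clear the block's bit so it can be allocated again."""
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
```

Finding a free block is just a scan for a zero bit, and a bitmap for a whole disk is small enough to keep in memory, which is why allocation is cheap.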
Inodes
An index node, or inode, is a data structure used to represent a file system object. Inodes are commonly stored in an inode table, and the inode number of a file can be found with the "ls -i" command. The inode stores the attributes and the disk block locations of the file system object's data. The information in an inode is known as the "metadata".
What's in an inode? Metadata contained in an inode:
- Size: Usually in bytes
- File type: Directory, regular file, symbolic link, etc.
- Permissions: Describes user/group/other access to the file
- Link count: Keeps track of the number of hard links to the file
- Timestamp: Contains the time last modified and time last accessed.
- Address of Data Blocks: Pointers to the disk blocks that store the file's actual contents
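Most of the fields listed above can be read from user space with a stat call. The sketch below uses Python's os.lstat; the function names file_type and describe_inode are assumptions. Note that stat does not expose the data block addresses, so that field is absent here.

```python
import os
import stat

def file_type(mode):
    """Translate the type bits of st_mode into a readable name."""
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"

def describe_inode(path):
    """Return the inode metadata that stat makes visible (illustrative)."""
    st = os.lstat(path)  # lstat: do not follow a symbolic link
    return {
        "inode": st.st_ino,
        "size": st.st_size,                         # in bytes
        "type": file_type(st.st_mode),
        "permissions": stat.filemode(st.st_mode),   # e.g. "-rw-r--r--"
        "links": st.st_nlink,                       # hard link count
        "modified": st.st_mtime,                    # seconds since the epoch
        "accessed": st.st_atime,
    }
```

For example, describe_inode("/usr/bin") would report the type as "directory" on a system where that path exists.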
Inodes do not contain all of the metadata of a file. Some metadata must be stored in another location to allow specific features to be implemented.
What is not in an inode?
- File Name: There is no file name entry in the inode. Instead, the name is kept in a directory entry that sits alongside the inode and points to it. The reason for separating the file name from the other metadata is to support links to files. Thus, various file names can point to the same inode.
- Parent Directory: This is for the same reason file names are not included in the inode. Multiple directories can have directory entries that point to the same file. Therefore, there should not be one single parent directory in the inode.
- Processes that have the file open: The list of processes would be implemented as a linked list. This leads to bad performance and security issues.
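Because names live in directory entries rather than in the inode, two names can refer to the same inode. A minimal sketch on a Unix-like file system, using Python's os.link; the function name hard_link_demo and the file names are made up for illustration.

```python
import os
import tempfile

def hard_link_demo():
    """Give one file two names, then remove one of them (illustrative)."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")
        with open(a, "w") as f:
            f.write("data")
        os.link(a, b)  # second directory entry for the same inode
        same_inode = os.stat(a).st_ino == os.stat(b).st_ino
        links_before = os.stat(a).st_nlink
        os.remove(a)   # removes one name; the inode and its data remain
        with open(b) as f:
            survived = f.read()
        links_after = os.stat(b).st_nlink
        return same_inode, links_before, survived, links_after
```

The link count drops by one when a name is removed, and the data stays reachable through the remaining name.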
Arguments against hard links
- Complicate the user model: Hard links are criticized as a "high maintenance design". They can complicate the design of a program that handles directory trees.
- It is also easy to compute the parent directory or the file name from a file (with some help). Therefore, it is not necessary to keep a hard link count.