CS 111 -- Operating Systems
Scribe Notes for 5/13/2013
by Robert Roizen
Outline
Problems with NFS
Media Faults
Redundancy to Handle Media Faults
Concatenation
Striping
Mirroring
Implementations of Redundancy
RAID0
RAID1
RAID2-5
NFS Performance
Sun ZFS Storage 7320
NFS Security
Network introduces new problems (e.g., snooping)
Physical protection
VPN (Virtual Private Network)
Individual Authentication
Security
Traditional vs. Computer
Against Privacy
Against Integrity
Against Service
Threat Modeling & Classification
General Functions Used for Defense
Problems with NFS
Media Faults
A media fault is when the disk or SSD dies. We also want the drive to crash reliably (stop cleanly, without writing garbage) when power fails
SSDs are notoriously bad at this
Some SSDs will write the wrong data but the right checksum
Some drives just stop working altogether
One solution
Uninterruptible power supply (UPS)/backup battery
These have to be maintained properly
If not maintained, they can actually bring down the system
Let's assume that we crash reliably
Can we use journaling/commit records to solve the consistency problem?
In practice, no. Journaling assumes that the disk holding the journal survives, and a media fault can destroy the journal itself
Can't you use main storage if the journal fries?
No, main storage is inconsistent without the journal
Redundancy to Handle Media Faults
RAID - Redundant Array of Independent (originally Inexpensive) Disks
A 10TB drive is around $1500
A 1TB drive is around $80
Ten 1TB drives cost $800 vs. $1500 for one 10TB drive; you pay extra for the higher density of information on a single disk
Because of the price discrepancy, the temptation is to build a big drive out of smaller ones
Concatenation
To use multiple small drives as a single big one, you must write a special disk driver
The OS thinks that you have a single large disk, but the driver knows it's made up of smaller drives
Performance Problem
A single small drive may be busy, but the other 9 aren't doing anything
This can happen due to locality/access patterns
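The hot-drive effect can be seen in a minimal sketch of the address mapping a concatenating driver might use. The names concat_map, DRIVE_BLOCKS and NUM_DRIVES and the sizes are assumptions for illustration, not part of any real driver.

```python
# Hypothetical concatenating driver: maps a virtual block number to
# (drive index, block offset within that drive). Sizes are illustrative.
DRIVE_BLOCKS = 1000   # blocks per small drive (assumed)
NUM_DRIVES = 10       # number of small drives (assumed)

def concat_map(vblock):
    """Virtual block -> (drive, offset) under concatenation."""
    if not 0 <= vblock < DRIVE_BLOCKS * NUM_DRIVES:
        raise ValueError("virtual block out of range")
    return divmod(vblock, DRIVE_BLOCKS)

# A run of 100 consecutive virtual blocks lands on a single drive,
# so that drive is busy while the other 9 sit idle.
drives_touched = {concat_map(b)[0] for b in range(2000, 2100)}
print(drives_touched)  # {2}
```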
Striping
You can optimize for access patterns by splitting a contiguous write across different internal disks
This has its own problems
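The reliability of the combined small drives (whether concatenated or striped) is lower than that of one big drive.
Intuitively, this is because all it takes is a single small drive failing for the whole system to fail
In fact, if the MTTF (mean time to failure) of each drive is 1 year, then a set of 10 drives will see its first failure after ~1.2 months on average (1/10 of a year, assuming independent failures)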
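The estimate above can be checked with a small calculation. This is a sketch under the usual simplifying assumption (not stated in lecture) that drives fail independently at a constant rate, so the mean time until the first of N drives fails is the single-drive MTTF divided by N. The function name array_mttf_years is hypothetical.

```python
def array_mttf_years(drive_mttf_years, num_drives):
    """Mean time to first failure of an array with no redundancy.

    Assumes independent drives with constant failure rates, so the
    failure rates add and the mean time to first failure is MTTF / N.
    """
    return drive_mttf_years / num_drives

months = array_mttf_years(1.0, 10) * 12
print(round(months, 1))  # 1.2
```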
Mirroring
Mirroring solves the increased failure rate of striping by storing each block on two devices
But this increases the cost (and power) by a factor of 2!
Actual implementations, like RAID4, make tradeoffs between reliability and cost, as described below
Actual Implementations of Redundancy
RAID 0
Uses concatenation and/or striping, with no redundancy
RAID 1
Uses Mirroring
Performance of RAID 1
Every write has to go to both disks, but the two writes can be done in parallel
The time it takes to do a write is therefore that of the slower of the two writes
Read performance is better: only one copy is needed, so read from whichever disk can respond faster
Power is ~2x!
Cost is ~2x!
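A minimal sketch of the timing argument, assuming the two disks work fully in parallel and ignoring controller overhead. The function names and the example latencies are made up for illustration.

```python
def mirrored_write_time(t_a, t_b):
    """A write finishes only when both copies are on disk."""
    return max(t_a, t_b)

def mirrored_read_time(t_a, t_b):
    """A read can be served by whichever copy responds first."""
    return min(t_a, t_b)

# Illustrative per-disk latencies in milliseconds (assumed values).
print(mirrored_write_time(8.0, 5.0))  # 8.0
print(mirrored_read_time(8.0, 5.0))   # 5.0
```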
RAID 2,3,4,5
These configurations deal with the fact that the 2x cost increase of mirroring is sometimes prohibitive
RAID 4
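As presented in lecture, uses concatenation rather than striping (the standard RAID 4 definition stripes blocks across the data disks)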
Let's assume we have N=5 disks (call them A, B, C, D, Parity)
The virtual disk is the concatenation of disks 1...(N-1): A, B, C, D in this case.
The final Nth disk is called the parity disk
Each block on the parity disk contains the exclusive or of the corresponding blocks on the other disks: Parity = A^B^C^D
Let's say we lose drive B
Can we recover it from the other 4 working drives?
Sure: A^B^C^D = Parity
XOR both sides with A^C^D: A^A^B^C^C^D^D = A^Parity^C^D
Thus: B = A^Parity^C^D
This is because:
A^A = 0
0^X = X
A^B = B^A
(A^B)^C = A^(B^C)
These are just the mathematical properties of the exclusive or operator
Notice that RAID1 is simply RAID4 with N=2
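The recovery argument above can be tried on real bytes. This is a minimal sketch: the helper xor_blocks and the two-byte block contents are made up for illustration, and real arrays do this per block across whole disks.

```python
from functools import reduce

def xor_blocks(*blocks):
    """Bytewise XOR of one or more equal-length blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

# Illustrative two-byte "blocks" on the four data disks.
A, B, C, D = b"\x0f\x10", b"\xf0\x22", b"\x33\x44", b"\x55\x66"
parity = xor_blocks(A, B, C, D)

# Drive B is lost: rebuild it from the surviving drives and the parity.
recovered_B = xor_blocks(A, parity, D, C)
print(recovered_B == B)  # True

# With N=2 the parity of a single data disk is just a copy of it (RAID 1).
print(xor_blocks(A) == A)  # True
```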
Problems?
When a drive fails there is a recovery period
Notification Time (How long until someone realizes that a drive has failed)
Physical Replacement Time (How long it takes to replace the drive once someone is aware of it failing)
Copy time (Compute XOR and store data on new disk, can take a LONG time)
What if we try to use the virtual drive while copy is going on?
It's going to be very (probably unusably) slow
RAID 5
RAID 5 uses striping
In RAID 5 the parity blocks are themselves striped across all the drives
This avoids the hot spot of the parity drive
In RAID 4, every write to any disk also has to write to the parity disk
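The hot spot can be illustrated by counting which drive receives the parity update when one block in each of 100 rows is written. This is only a sketch: the simple rotation row % N is an assumed layout for illustration (real RAID 5 layouts vary), and the function names are hypothetical.

```python
from collections import Counter

N = 5  # total drives, as in the RAID 4 example

def raid4_parity_drive(row):
    """RAID 4: parity for every row lives on the last drive."""
    return N - 1

def raid5_parity_drive(row):
    """RAID 5: parity rotates across drives (simple rotation, assumed)."""
    return row % N

def parity_writes(parity_drive, rows):
    """Count parity updates per drive when each row is written once."""
    return Counter(parity_drive(r) for r in rows)

rows = range(100)
print(parity_writes(raid4_parity_drive, rows))  # all 100 updates hit drive 4
print(parity_writes(raid5_parity_drive, rows))  # 20 updates on each drive
```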
In general, Concat vs Striping
Say you need to grow your virtual space.
With concatenation, you just add a new drive. The parity drive is still correct (assuming the new drive starts out all zeros, since X^0 = X)
With striping, you need to reorganize data on existing drives
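A rough sketch of why growing is cheap with concatenation and expensive with striping: count how many existing virtual blocks change physical location when a fifth data drive is added to four. The mapping functions and sizes are assumptions for illustration, not a real driver's layout.

```python
DRIVE_BLOCKS = 1000  # blocks per drive (assumed)

def concat_map(vblock, num_drives):
    """Concatenation: fill drive 0, then drive 1, and so on."""
    assert vblock < num_drives * DRIVE_BLOCKS
    return divmod(vblock, DRIVE_BLOCKS)          # (drive, offset)

def stripe_map(vblock, num_drives):
    """Striping: consecutive blocks go round-robin across drives."""
    assert vblock < num_drives * DRIVE_BLOCKS
    offset, drive = divmod(vblock, num_drives)
    return drive, offset

def blocks_moved(mapper, old_drives, new_drives):
    """Existing blocks whose (drive, offset) changes after growing."""
    old_total = old_drives * DRIVE_BLOCKS
    return sum(mapper(b, old_drives) != mapper(b, new_drives)
               for b in range(old_total))

print(blocks_moved(concat_map, 4, 5))  # 0
print(blocks_moved(stripe_map, 4, 5))  # 3996
```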
Performance for NFS
Sun ZFS Storage 7320
2x storage controllers (2x for reliability)
2x 10Gb Ethernet adapters (2x for reliability)
8x 512GB SSD read accelerators (cache reads)
8x 73GB SSD write accelerators (cache writes)
136x 300GB 15kRPM disk drives
Note that the lifetime of an SSD depends on the operations it performs.
Writes wear it out, so the write cache uses more expensive, more durable SSDs
37TB exported capacity
NFS Security
Mimics Unix traditions
Say you have an open file descriptor
fd = open("f1", O_RDONLY)
But some other process does: chmod("f1",0000)
You don't notice because permissions are checked on open, not on every read
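The open-then-chmod behavior can be reproduced locally. This is a sketch assuming a POSIX system and a local file system; it uses a temporary file in place of "f1" and does the chmod in the same process for simplicity.

```python
import os
import tempfile

def read_after_chmod():
    """Open a file, revoke all permissions, then read via the open fd."""
    # Scratch file standing in for "f1".
    tmp_fd, path = tempfile.mkstemp()
    os.write(tmp_fd, b"hello")
    os.close(tmp_fd)
    try:
        fd = os.open(path, os.O_RDONLY)  # permissions are checked here
        os.chmod(path, 0o000)            # access revoked after the open
        data = os.read(fd, 5)            # no permission check on read
        os.close(fd)
        return data
    finally:
        os.chmod(path, 0o600)            # restore so cleanup is allowed
        os.remove(path)

print(read_after_chmod())  # b'hello'
```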
Initially, NFS had the client kernel do permission checking
This means you have to trust the client kernel...
This is never a good idea
Bad client kernels can pretend to be good users
Attackers can also snoop on the network
Solutions
1) Use physical protection
all servers/clients in a room on a closed network
2) VPN (virtual private network)
Set up keys and encryption; essentially you recreate physical protection
3) Individual Authentication
Kerberos tickets
Not used that often because of performance loss
Security
Traditional vs Computer
Traditional security focused on securing against force and fraud
Computer security doesn't really have to worry about force. Fraud is a huge problem, though
Main Forms of Attack
(i) -- Against Privacy (unauthorized data release)
(i) -- Against Integrity (modify info you don't have access to)
(ii) -- Against Service (prevent system from responding to valid requests, e.g., DoS)
You want to disallow unauthorized access (i)
You want to allow authorized access (ii)
Threat Modeling & Classification
Insiders -- authorized people who misuse the system; you have to mistrust even people with access
Social Engineering -- Outsiders pretending to be an insider
Network Attacks -- DoS, viruses, drive-by downloads, phishing
Device Attacks -- USB virus
General Functions Used for Defense
Authentication
prove you are who you say you are
e.g., passwords, RSA keys...
Integrity
timestamps, checksum
Authorization
decide what an authenticated user is allowed to do (e.g., file permissions)
Auditing
Log of what changes were made and by whom
Efficiency
Security must not put undue strain on resources
Correctness
Monitoring + Maintenance