CS111 Scribe Notes for Lecture 17 (May 30th, 2013)
by Zach North
Media faults (disk, SSD, or other media dies.)
We want "reliable" crashes -- when the system dies, it should fail in a predictable way:
- disk doesn't lose any data except for block write in progress
- disk still works on restart
- no wrong data written, just lost the block in progress
How do we accomplish this?
- One common technique: battery backup for system
- A battery supplies extra power so the system can finish (or safely stop) the writes in progress if main power dies
- Problems: batteries have to be replaced, and the batteries themselves can fail
- Another: journaling / commit records
- Keep a log of everything written to disk so we can replay or roll back writes after a crash
- What if the journal itself is corrupted? Then the cell data may be corrupt too, and we have no way to tell
- We could store the journal on a different disk... but what if that disk fails?
- Helps with power outages, but not with media faults (see the commit-record sketch below)
The problem with both of these techniques is, while they are good ways to deal with
the problem of power failure, they aren't designed to deal with media failure.
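To make the commit-record idea above concrete, here is a minimal sketch (everything here -- the in-memory "disk", the entry format, the function names -- is invented for illustration, not the exact scheme from lecture): write the new data to the journal first, then the commit record, and only then update the cell in place. Recovery replays only committed entries.

    /* Minimal sketch of journaling with a commit record.  The "disk"
     * is simulated with in-memory arrays; on real hardware each step
     * would be a separate durable disk write. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    #define NCELLS 16
    #define BLKSZ  32

    static char cells[NCELLS][BLKSZ];     /* the "cell" (main) data   */

    struct journal_entry {
        int  cell;                        /* target cell number       */
        char data[BLKSZ];                 /* new contents             */
        bool committed;                   /* commit record written?   */
    };
    static struct journal_entry journal[64];
    static int jlen;

    void atomic_update(int cell, const char *data)
    {
        struct journal_entry *e = &journal[jlen++];
        e->cell = cell;
        strncpy(e->data, data, BLKSZ - 1);    /* step 1: log the new data    */
        e->committed = true;                  /* step 2: write commit record */
        memcpy(cells[cell], e->data, BLKSZ);  /* step 3: update the cell     */
    }

    /* Crash recovery: replay committed entries, ignore uncommitted ones. */
    void recover(void)
    {
        for (int i = 0; i < jlen; i++)
            if (journal[i].committed)
                memcpy(cells[journal[i].cell], journal[i].data, BLKSZ);
    }

    int main(void)
    {
        atomic_update(3, "hello");
        recover();                            /* replaying is harmless (idempotent) */
        printf("cell 3 = %s\n", cells[3]);
        return 0;
    }

The key design point is the ordering: if the crash happens before the commit record is written, recovery ignores the entry; if it happens after, recovery can finish the update.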
The standard technique for dealing with media faults: redundancy.
RAID (redundant arrays of independent disks) was invented in a famous Berkeley paper.
The naming of RAID is a little confusing: the "I" originally stood for "inexpensive" but was later changed to "independent."
Some sample economics:
A 10TB drive costs say $1500.
A 1 TB drive costs say $80.
If you could buy ten of the 1TB drives and combine them somehow, you would get the same capacity for roughly half the price.
Using RAID, we can configure it so users see only 1 drive.
This configuration is called concatenation. The Berkeley people invented a special disk driver to do this.
But there is a performance problem with concatenation: because of locality of reference in access patterns,
a lot of the time one drive is doing all the work while the rest of the system just sits there.
Because of this they came up with a different way of "gluing" drives together.
Instead of just laying out the data end to end, split all the different parts up
among different drives (block-level striping.)
This gives much better performance because multiple disk arms can run at once.
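As a rough sketch of the difference (the disk count and blocks-per-disk figure are made-up numbers): concatenation fills one drive before moving to the next, so consecutive logical blocks land on the same drive, while striping rotates consecutive blocks across drives so sequential access keeps every arm busy.

    /* Sketch: mapping a logical block number to (disk, physical block)
     * for concatenation vs. block-level striping. */
    #include <stdio.h>

    #define NDISKS          5
    #define BLOCKS_PER_DISK 1000000

    struct location { int disk; long block; };

    /* Concatenation: fill disk 0, then disk 1, ... */
    struct location concat_map(long logical)
    {
        struct location loc = { (int)(logical / BLOCKS_PER_DISK),
                                logical % BLOCKS_PER_DISK };
        return loc;
    }

    /* Striping: consecutive logical blocks go to consecutive disks. */
    struct location stripe_map(long logical)
    {
        struct location loc = { (int)(logical % NDISKS),
                                logical / NDISKS };
        return loc;
    }

    int main(void)
    {
        for (long b = 0; b < 5; b++) {
            struct location c = concat_map(b), s = stripe_map(b);
            printf("block %ld: concat -> disk %d, stripe -> disk %d\n",
                   b, c.disk, s.disk);
        }
        return 0;
    }

Running this shows blocks 0-4 all mapping to disk 0 under concatenation, but spreading across disks 0-4 under striping.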
Another problem now: the reliability of the big combined drive is lower than the reliability of a single small drive.
If any one of the small drives fails, the whole system now fails.
Mirroring: all data is written to two separate physical disks.
Each block gets written twice. This will halve our available storage space, but
will greatly improve reliability.
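A small sketch of that policy (the in-memory "disks" and the arm-position model are invented for illustration): every write goes to both copies, and a read is served by whichever copy looks cheaper to reach.

    /* Sketch of RAID 1 mirroring with two simulated in-memory disks. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define NBLOCKS 100
    #define BLKSZ   32

    static char disk[2][NBLOCKS][BLKSZ];
    static long arm[2];                     /* pretend head positions */

    /* Every logical write becomes two physical writes; it is only
     * complete when the slower of the two finishes. */
    void mirror_write(long block, const char *buf)
    {
        for (int d = 0; d < 2; d++) {
            memcpy(disk[d][block], buf, BLKSZ);
            arm[d] = block;
        }
    }

    /* A read can be served by whichever copy is cheaper to reach. */
    void mirror_read(long block, char *buf)
    {
        int d = labs(arm[0] - block) <= labs(arm[1] - block) ? 0 : 1;
        memcpy(buf, disk[d][block], BLKSZ);
        arm[d] = block;
    }

    int main(void)
    {
        char msg[BLKSZ] = "mirrored block", buf[BLKSZ];
        mirror_write(7, msg);
        mirror_read(7, buf);
        printf("%s\n", buf);
        return 0;
    }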
Aside: there's no law saying the individual hard drives can't be virtual drives...
So we can "layer" different schemes on top of each other.
Mirror at the bottom level, stripe the level above that, etc.
The different techniques are just sort of a toolkit for building a system that is large + reliable + performant.
Can be "tuned" to get what you want.
Different forms of RAID
- RAID 0: concatenation and/or striping, with no redundancy.
- Better performance
- Worse reliability
- Requires 2x the power.
- RAID 1: just mirroring.
- Slightly slower write performance (must wait for the slower of the two writes, since every block is written twice.)
- Better read performance (whichever disk arm is closer can read.)
- Better reliability.
- Takes 2x the power.
- Costs twice as much for the same amount of storage. Defeats the point of "inexpensive."
- RAID 4: parity drive with concatenation. Our example is assuming 5 disks:
- The virtual disk is the first 4 disks (size n-1 drives.)
- The final (fifth) disk is the parity disk. Each block on the parity disk
contains the xor of the other disks.
- In this case a block on the parity disk contains A^B^C^D, where A-D are the corresponding blocks on the first 4 disks (see the parity sketch after this list.)
- If one disks fails you can recover it with data from the other three disks
and the xor (parity) disk.
- If the parity disk fries you can just recompute it.
- If more than one disk fries, you are still going to lose data.
- You can add a second parity disk to increase the number of failed disks you can handle. (This is roughly the idea behind RAID 6, which uses two independent parity blocks.)
- RAID 5: like RAID 4 but uses striping instead of concatenation.
- Parity blocks are distributed evenly across all the disks.
- Idea is to avoid hotspots. In RAID 4 the parity drive is a hotspot -- every write has to touch it.
- Advantages over RAID 4: fewer hotspots, so better performance.
- Disadvantages over RAID 4: growing the array is a pain -- adding a disk means shuffling data and parity around.
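Here is the parity sketch promised in the RAID 4 bullet above (block size and disk count are toy values): the parity block is the byte-wise XOR of the data blocks, and XOR-ing the surviving blocks with the parity block rebuilds any single lost block.

    /* Sketch of RAID 4/5 parity: parity = A ^ B ^ C ^ D (byte-wise),
     * and any one lost data block can be rebuilt from the others plus
     * the parity block. */
    #include <stdio.h>
    #include <string.h>

    #define NDATA 4      /* data disks */
    #define BLKSZ 16

    /* Compute the parity block from the data blocks. */
    void compute_parity(char data[NDATA][BLKSZ], char parity[BLKSZ])
    {
        memset(parity, 0, BLKSZ);
        for (int d = 0; d < NDATA; d++)
            for (int i = 0; i < BLKSZ; i++)
                parity[i] ^= data[d][i];
    }

    /* Rebuild the block on failed disk `lost` by XOR-ing everything else. */
    void rebuild(char data[NDATA][BLKSZ], const char parity[BLKSZ], int lost)
    {
        memcpy(data[lost], parity, BLKSZ);
        for (int d = 0; d < NDATA; d++)
            if (d != lost)
                for (int i = 0; i < BLKSZ; i++)
                    data[lost][i] ^= data[d][i];
    }

    int main(void)
    {
        char data[NDATA][BLKSZ] = { "block A", "block B", "block C", "block D" };
        char parity[BLKSZ];
        compute_parity(data, parity);

        memset(data[2], 0, BLKSZ);       /* pretend disk 2 died      */
        rebuild(data, parity, 2);        /* recover it from the rest */
        printf("recovered: %s\n", data[2]);
        return 0;
    }

The same math shows why two simultaneous failures lose data: with two unknown blocks, one XOR equation isn't enough to recover them.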
If you're managing a RAID4 system and a disk fails, you replace the disk. What happens then?
- The system realizes it has a new blank drive.
- It starts to copy all the missing data into the new drive. The time it spends doing this is called the recovery period.
- If we try to access the system during this period it's going to be super slow.
It could take hours to repair, and performance will suck this whole time.
- What if another disk fails during the recovery period? Data is lost.
Disk failure rates: high initially (manufacturing defects), then low for a long period,
then begins to rise again (the graph looks like a bathtub).
Compared to a single drive, a RAID 4 system has much better reliability at low t and much worse reliability at high t.
If you have someone around to do repairs it's great, because you can keep replacing parts and stay in "low t."
But you wouldn't send a RAID system to Mars: with no one to swap out failed drives, the odds that enough drives
have failed to lose data keep growing, and eventually the array is less reliable than an individual drive.
The overall reliability depends on the human factor, and also the recovery period.
- If the recovery period is 1 hour, the overall reliability approaches 100%.
- If the recovery period is 2 weeks, the system could eventually fail.
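A back-of-the-envelope calculation of why the recovery window matters (the 3%/year per-disk failure rate and the 4 surviving disks are assumptions for illustration, not numbers from lecture): if each surviving disk fails independently at rate lambda, the chance of a second failure during a window of length T is about 1 - e^(-4*lambda*T).

    /* Back-of-the-envelope: probability that a second disk fails while a
     * RAID 4 array (4 surviving disks) is still rebuilding.  The 3%/year
     * failure rate is an assumed number for illustration only. */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double lambda = 0.03 / (365.0 * 24.0);       /* per-disk failures per hour */
        int survivors = 4;
        double windows[] = { 1.0, 24.0, 14 * 24.0 }; /* 1 hour, 1 day, 2 weeks */

        for (int i = 0; i < 3; i++) {
            double p = 1.0 - exp(-survivors * lambda * windows[i]);
            printf("recovery window %6.0f h: P(data loss) ~= %.5f%%\n",
                   windows[i], 100.0 * p);
        }
        return 0;
    }

Each individual rebuild is fairly safe, but those per-rebuild risks add up over years of operation, which is why a two-week recovery period can eventually sink the system.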
NFS
This is the example benchmark Prof. Eggert gave in class.
Details:
Sun ZFS Storage 7320 appliance
- 2x storage controllers
- 2x 10GbE adapters
- 8x512GB SSD read accelerators (caches)
- 8x736GB SSD write accelerators
- 136x 300GB 15,000 RPM disk drives
- ~37TB exported capacity.
Throughput: 134140 operations per second (avg 1.51msec response time.)
This means a serious NFS server can be 4-5x faster than a local disk (!) and can handle a huge number of requests.
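One way to read those benchmark numbers (an aside, not from lecture): by Little's law, the average number of requests in flight is throughput times average latency, which for the figures above works out to roughly 200 outstanding requests at any instant.

    /* Little's law applied to the published benchmark numbers:
     * in-flight requests = throughput * average response time. */
    #include <stdio.h>

    int main(void)
    {
        double ops_per_sec = 134140.0;
        double avg_latency = 1.51e-3;          /* seconds */
        printf("average requests in flight: %.0f\n", ops_per_sec * avg_latency);
        return 0;
    }

That level of concurrency helps explain why the box needs so many spindles and SSD accelerators.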
NFS was discussed in more detail last lecture.
NFS Security
What can go wrong with a network file system?
For one thing, permissions problems -- what if we're reading a file and another user makes it unreadable?
Traditionally, the client kernel deals with permissions for NFS files, just like regular files.
But this indicates a security problem: trusting the client kernel.
An attacker with a "bad" client can give a fake user id and get access to other files.
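To make the trust problem concrete, here is a simplified credential in the spirit of the classic AUTH_SYS (AUTH_UNIX) RPC flavor -- the field names are approximate and this is a sketch, not the exact NFS wire format: the uid is whatever the client chooses to send, and the server has no way to check it.

    /* Simplified AUTH_SYS-style credential, as a sketch of why NFS servers
     * that trust the client kernel are vulnerable: the uid/gid fields are
     * simply asserted by the client. */
    struct nfs_cred {
        unsigned int stamp;        /* arbitrary client-chosen value  */
        char         machine[64];  /* client hostname (unverified)   */
        unsigned int uid;          /* claimed user id  -- unverified */
        unsigned int gid;          /* claimed group id -- unverified */
    };

    /* A malicious client can fill in any uid it likes (e.g. a victim's uid),
     * and a server relying only on this credential will grant that user's
     * access rights. */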
There are a couple solutions to this in the NFS world:
- Use physical protection. Run all client kernels in a physically secure room.
- Most NFS traffic in the world is sent over private networks like this.
- Best performance because no encryption needed.
- Virtual private networks. Set up keys on each trusted machine; use them to set up a virtual subnetwork
across the internet.
- If you send encrypted, authenticated packets you basically reproduce physical protection.
- But something like SEASnet can't use this method -- thousands of connections, all must
be trusted.
- Individual authentication.
- Each client-server request must contain more info than just the userid.
- NFSv4 specifies Kerberos tickets for this (details beyond the scope of this class)
- Not used very much in the real world.
Security
What's the difference between traditional and computer security?
Well, for one, attacks via fraud are more of a problem than attacks via force.
DDoS attacks can take you offline, but at least they don't compromise your data.
Main forms of attack:
- against privacy (unauthorized data release)
- against integrity (tampering with other's data)
- against service (DDoS)
We want a system that both 1. disallows unauthorized access and 2. allows authorized access.
How to test 1? Try fake users, obviously bad clients, etc... but you won't really know
if you are safe until your system is compromised.
How to test 2? A lot simpler, just make sure everyone can log in. People will tell you if they can't.
How to test against DDoS? Well... a lot of the time you don't. MyUCLA doesn't, because
they take the attitude of "who would DDoS us?"
Which leads to the next point:
We have to think about threat modeling and classification.
Threats, ordered by severity:
- Insiders
- Most common form of security breach: authorized users doing things they shouldn't.
- Social engineering (Mitnick)
- Mitnick was a famous hacker who broke into systems by pretending to be a repairman.
- Actually, "smooth talkers" getting into systems is a big problem.
- Network attacks
- DDoS
- drive-by downloads (browser vulnerabilities)
- viruses
- phishing
- Device attacks
General functions used for defense:
- Authentication (proving you are who you say you are)
- Integrity
- Authorization
- access control list
- root access
- Auditing
- log of who changed system and when
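A minimal sketch tying authorization and auditing together (entirely illustrative; not how any particular OS stores its ACLs): each object carries a list of (uid, permissions) entries, every request is checked against that list, and the outcome is written to an audit log.

    /* Toy access-control-list check plus an audit log entry.
     * All names and the log format are invented for illustration. */
    #include <stdio.h>
    #include <stdbool.h>
    #include <time.h>

    #define PERM_READ  1
    #define PERM_WRITE 2

    struct acl_entry { int uid; int perms; };

    struct object {
        const char *name;
        struct acl_entry acl[8];
        int nacl;
    };

    /* Authorization: is `uid` allowed to perform `op` on `obj`? */
    bool authorized(const struct object *obj, int uid, int op)
    {
        for (int i = 0; i < obj->nacl; i++)
            if (obj->acl[i].uid == uid)
                return (obj->acl[i].perms & op) != 0;
        return false;                      /* no entry: deny by default */
    }

    /* Auditing: record who tried what, when, and whether it was allowed. */
    void audit(const struct object *obj, int uid, int op, bool ok)
    {
        printf("%ld uid=%d op=%d object=%s result=%s\n",
               (long)time(NULL), uid, op, obj->name, ok ? "allow" : "deny");
    }

    int main(void)
    {
        struct object grades = { "grades", { { 1000, PERM_READ | PERM_WRITE },
                                             { 1001, PERM_READ } }, 2 };
        bool ok = authorized(&grades, 1001, PERM_WRITE);
        audit(&grades, 1001, PERM_WRITE, ok);   /* denied and logged */
        return 0;
    }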