Scribe Notes: Lecture 17

CS 111, Spring 2014
Date: June 2, 2014

What can go wrong with NFS/CIFS

Media Faults
Security holes

Media Faults

Lampson-Sturgis

Log-structured file system.
It is slow because every item is written twice: once to the log and once to its final location.
In principle, log is infinite. But in practice, it holds "just enough."

RAID: Redundant Array of Inexpensive Disks

Like many other discoveries in computer science, RAID was initially designed for entirely wrong reasons. Computer scientists at Berkeley needed more storage, but felt they were being ripped off by disk suppliers for large disks. They figured maybe they could connect several smaller disks, and treat them as one big drive via software.

RAID 0

Concatenation (1 big drive, concatenated from little ones)
Striping. Assumption: locality of reference (reliability problem)
- + seek in parallel
- - possibly lose data blocks from many files
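
The difference between concatenation and striping can be sketched as two block-mapping functions. This is only an illustration: the function names, the 4-disk array, and the 1000-block disk size are assumptions, not something from the lecture.

```python
def concat_map(block, blocks_per_disk):
    """Concatenation: fill disk 0 completely, then disk 1, and so on.

    Returns (disk index, block offset on that disk).
    """
    return block // blocks_per_disk, block % blocks_per_disk


def stripe_map(block, num_disks):
    """Striping: consecutive logical blocks rotate across the disks.

    Returns (disk index, block offset on that disk).
    """
    return block % num_disks, block // num_disks


# Hypothetical array: 4 disks of 1000 blocks each, and a file that
# occupies logical blocks 0..3.
for b in range(4):
    print(b, concat_map(b, 1000), stripe_map(b, 4))
```

With concatenation all four blocks land on disk 0. With striping each block lands on a different disk, so the seeks can proceed in parallel, but losing any one disk loses a block of this file (and of many others).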

RAID 1

Mirroring: every write goes to all copies (typically 2); a read can be served by any one copy.
- + lowers the failure rate => more reliability
- + Read seeks are faster
- - Write seeks can be slower
- + 2x read throughput
- - 2x storage cost

RAID 4

Has one special parity drive (E), and all writes are done so that the following property holds true:
- E = A^B^C^D if A, B, C, and D are the data drives.
Reads are like RAID 0: Concatenation
- A negative is worse read performance than RAID 0 with striping
Writes like RAID 1
- A negative is that we read E before writing it
A positive is that if one disk crashes (say C), we can reconstruct it as C = E^A^B^D
- Every drive is the exclusive OR of all the others
Cost = N/(N-1)
- N is the number of drives
- Make N bigger to reduce the cost
Failure rate = (FR)*(FR)*(window of repair)*N
- Window of repair should be as short as possible.
- However, often, it is to the tune of hours, even days, depending on how much data is contained in the drive.
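
The parity property, the reconstruction rule, and the read-before-write of E can be sketched as follows. The 4-byte block contents and the helper name xor_blocks are made up for illustration; real drives work on much larger blocks.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Hypothetical 4-byte blocks on the data drives A..D.
A = b"\x01\x02\x03\x04"
B = b"\x10\x20\x30\x40"
C = b"\xaa\xbb\xcc\xdd"
D = b"\x0f\x0e\x0d\x0c"
E = xor_blocks(A, B, C, D)  # parity drive: E = A^B^C^D

# Drive C crashes: rebuild its block from the parity and the survivors.
C_rebuilt = xor_blocks(E, A, B, D)
assert C_rebuilt == C

# Small write to B: read the old B and the old E, then write the new B
# and the new E. The other data drives are not touched.
B_new = b"\xff\x00\xff\x00"
E_new = xor_blocks(E, B, B_new)
assert E_new == xor_blocks(A, B_new, C, D)
```

The last two lines show why a write costs extra reads: the new parity is computed from the old parity and the old data, which both have to be read first.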

Assumptions in RAID 4

we're notified of write failures
we're notified of read failures
little light goes on
disk replaced quickly (relative term: 1 hour - never)
during replacement, run in degraded mode
after replacement: rebuild the drive (few hours).
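
To see why the replacement and rebuild time matter, the approximate failure-rate formula given earlier for RAID 4, (FR)*(FR)*(window of repair)*N, can be evaluated for a few repair windows. This is a rough sketch: it assumes independent drive failures, takes FR = 2%/year from the example in these notes, and uses a hypothetical 5-drive array.

```python
HOURS_PER_YEAR = 24 * 365


def raid4_loss_rate(fr_per_year, repair_hours, num_drives):
    """Approximate data-loss rate per year, using the formula in the notes:
    FR * FR * (window of repair) * N, with the window converted to years."""
    window_years = repair_hours / HOURS_PER_YEAR
    return fr_per_year * fr_per_year * window_years * num_drives


# Repair windows of one hour, one day, and one week.
for hours in (1, 24, 24 * 7):
    print(hours, raid4_loss_rate(0.02, hours, 5))
```

The estimate grows linearly with the repair window, so a drive that stays unreplaced for a week is 168 times riskier than one replaced within the hour.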

Advantages of RAID 4

+ use XOR to restore blocks
+ cheaper than RAID 1
- extra cost is 1/(N-1) compared to no parity
- total cost is 1 + 1/(N-1)
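
The cost figures above can be checked with a few lines of arithmetic. The function name is an assumption; N counts all drives, including the parity drive.

```python
def raid4_cost(num_drives):
    """Total storage cost relative to no parity: N / (N - 1)."""
    return num_drives / (num_drives - 1)


for n in (2, 3, 5, 10):
    print(n, raid4_cost(n))
```

With N = 2 the cost is 2x, the same as mirroring in RAID 1; every additional data drive brings the overhead closer to 1.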

Disadvantages of RAID 4

- complexity
- parity drive is I/O bottleneck on writing
- writes are more expensive (like RAID 1), and even more so because of the extra reads

RAID 5

RAID 4 with the parity striped across all the drives
- RAID 4 beats RAID 5 in one respect: it is easier to add a new drive

RAID 4 vs. RAID 5

If you want to add an additional drive:

RAID 4 -> add it, put 0 in it -> Done
RAID 5 -> add it, re-arrange parity areas in ALL the drives -> expensive and slow.
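
A small sketch of why the two cases differ. Blocks are modelled as small integers, and the rotating layout in raid5_parity_drive (stripe s keeps its parity on drive s % N) is a hypothetical one chosen for illustration; real RAID 5 layouts vary.

```python
from functools import reduce


def parity(blocks):
    """XOR of one block (modelled here as a small integer) per data drive."""
    return reduce(lambda x, y: x ^ y, blocks, 0)


def raid5_parity_drive(stripe, num_drives):
    """Hypothetical rotating layout: stripe s keeps its parity on drive s % N."""
    return stripe % num_drives


# RAID 4: a new zero-filled data drive leaves the parity unchanged,
# because x ^ 0 == x.
data = [0b1010, 0b0110, 0b1111]
assert parity(data + [0]) == parity(data)

# RAID 5: growing from 4 to 5 drives changes where most stripes keep parity.
moved = [s for s in range(20)
         if raid5_parity_drive(s, 4) != raid5_parity_drive(s, 5)]
print(len(moved), "of 20 stripes would need their parity relocated")
```

In this toy layout only the first four stripes keep their parity in place; everything else has to be rewritten, which is the expensive and slow part.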

Disk drive reliability

Failure rate (eg. 2%/year)
- Annualized failure rate: Since manufacturers want to get the disks to market as soon as possible, they test them for, say, 1 month and annualize the results.
  Note: This metric is an approximation.
MTTF: Mean time to Failure
- Typically about 300,000 hours or 34 years
- However, in reality, disks should be replaced every 5 years.
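
The two metrics can be related with a little arithmetic. The sketch below assumes a constant failure rate (an exponential lifetime model), which is a simplification, and the linear extrapolation in annualized is only one plausible reading of how a short test gets annualized; the function names and the test figures in the usage line are made up.

```python
import math

HOURS_PER_YEAR = 24 * 365


def mttf_years(mttf_hours):
    """Convert an MTTF in hours to years."""
    return mttf_hours / HOURS_PER_YEAR


def annual_failure_rate(mttf_hours):
    """Probability of failing within one year, assuming a constant
    failure rate (exponential lifetime model)."""
    return 1 - math.exp(-HOURS_PER_YEAR / mttf_hours)


def annualized(failed, tested, test_days):
    """Linear extrapolation of a short test to a yearly failure rate."""
    return (failed / tested) * (365 / test_days)


print(mttf_years(300_000))           # roughly 34 years
print(annual_failure_rate(300_000))  # a little under 3% per year
print(annualized(2, 1000, 30))       # hypothetical: 2 of 1000 drives fail in 30 days
```

Under this model an MTTF of 300,000 hours corresponds to an annual failure rate of a few percent, in the same range as the 2%/year example above. It does not mean an individual disk lasts 34 years.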

PDF

PDF for RAID 4

Security holes

Real world security defends us against:

force eg. Police, army.
fraud eg. Impersonators <- big deal because of the internet

Main forms of attacks

attack against privacy: unauthorized information release
attack against integrity: tampering with victim's data
attack against service: denial of service

Defense goals

Deny unauthorized access (defends against privacy and integrity attacks)
- not natural to test for
Allow authorized access (defends against privacy and integrity attacks)
- natural to test for
Be able to handle lots of "bogus" requests (defends against service attacks)
- relatively easy to test

Supreme Council of Virtual Space

reports directly to the Supreme Leader of Iran
controls "all" information flow in Iran
launched a DoS attack on BBC's website and satellite feed
blocked more than half of the world's websites in Iran

Threat modeling and classification

insiders (eg. system admins)
social engineering (eg. Kevin Mitnick)
network attacks
- DDoS
- Drive-by download (DBD)
virus
device attacks (e.g. virus on USB flash drive)

Kerckhoffs's design principle (for cryptographic systems)

Minimize what needs to be kept secret. Assume bad guys will learn your design (or any global key).
Counterexample: the DVD Content Scramble System (CSS), which relied on keeping its design secret

General functions needed for almost any security mechanism

Authentication (e.g. password)
Integrity - that data hasn't been tampered with. (e.g. checksum)
Authorization (e.g. access control list [ACL])
Auditing - defending against insider attacks. (e.g. logs)
-------------We want the above four functions to be orthogonal-------------
Correctness
Efficiency

How to authenticate

who the principal is. eg. retinal scan
something the principal has. eg. smart card
something the principal knows. eg. password
