Scribe Notes: Lecture 17

CS 111, Spring 2014
Date: June 2, 2014
Prepared by Shobhit Garg

Table of Contents

  1. What can go wrong with NFS/CIFS
  2. Media faults
  3. Disk drive reliability
  4. Security holes

What can go wrong with NFS/CIFS

  1. Media Faults
  2. Security holes

Media Faults

Lampson-Sturgis

  • Log-structured file system.
  • It is slow because writing a single item takes roughly twice the work.
  • In principle, log is infinite. But in practice, it holds "just enough."

RAID: Redundant Array of Inexpensive Disks

Like many other discoveries in computer science, RAID was initially designed for entirely wrong reasons. Computer scientists at Berkeley needed more storage, but felt they were being ripped off by disk suppliers for large disks. They figured maybe they could connect several smaller disks, and treat them as one big drive via software.

RAID 0

  • Concatenation (1 big drive, concatenated from little ones)
  • Striping. Assumption: locality of reference. Downside: a reliability problem.
    • + seek in parallel
    • - possibly lose data blocks from many files
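
The difference between concatenation and striping comes down to how a logical block number is mapped to a drive. The following is a minimal sketch, assuming one block per stripe unit; the function names and drive sizes are made up for illustration and are not from the lecture.

```python
def concat_map(block, blocks_per_drive):
    """Concatenation: fill drive 0 completely, then drive 1, and so on."""
    return block // blocks_per_drive, block % blocks_per_drive


def stripe_map(block, num_drives):
    """Striping: consecutive logical blocks rotate across the drives."""
    return block % num_drives, block // num_drives


# Four hypothetical drives of 1000 blocks each; map logical blocks 0..3.
concat = [concat_map(b, 1000) for b in range(4)]
striped = [stripe_map(b, 4) for b in range(4)]

print(concat)   # (drive, offset) pairs, all on drive 0
print(striped)  # (drive, offset) pairs, one block per drive
```

With striping, a sequential read of these four blocks can seek on all four drives in parallel, but losing any one drive loses a block of this file.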

RAID 1

  • Mirroring: every write goes to all copies; a read can be served from any one copy.
    • + lowers the failure rate => more reliability
    • + Read seeks are faster
    • - Write seeks can be slower
    • + 2x read throughput
    • - 2x storage cost

RAID 4

  • Has one special parity drive (E), and all writes are done so that the following property holds true:
    • E = A^B^C^D if A, B, C, and D are the data drives.
  • Reads are like RAID 0 with concatenation
    • A negative is worse read performance than RAID 0 with striping
  • Writes are like RAID 1: each write updates both a data drive and the parity drive
    • A negative is that we must read E before writing it
  • A positive is that if one disk crashes, we can reconstruct it, e.g. C' = E^A^B^D
    • Every drive is the exclusive OR of all the others
  • Cost = N/(N-1)
    • N is the number of drives
    • Make N bigger to reduce the cost
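  • Failure rate = (FR)*(FR)*(window of repair)*N, where FR is the failure rate of a single drive: data is lost only if a second drive fails while the first is still being repaired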
    • Window of repair should be as short as possible.
    • However, often, it is to the tune of hours, even days, depending on how much data is contained in the drive.
Assumptions in RAID 4
  1. we're notified of write failures
  2. we're notified of read failures
  3. little light goes on
  4. disk replaced quickly (a relative term: anywhere from 1 hour to never)
  5. during replacement, run in degraded mode
  6. after replacement: rebuild the drive (few hours).
Advantages of RAID 4
  1. + use XOR to restore blocks
  2. + cheaper than RAID 1
    • extra cost is 1/(N-1) compared to no parity
    • total cost is 1 + 1/(N-1)
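
The cost figures above can be checked with a little arithmetic. This is a small sketch, taking N to be the total number of drives including the parity drive; the function name is made up for illustration.

```python
def raid4_cost_factor(n):
    """Storage bought per unit of usable storage, with n drives total
    (n - 1 data drives plus 1 parity drive)."""
    return n / (n - 1)


for n in (2, 5, 10):
    overhead = 1 / (n - 1)
    # The total cost is 1 (the data itself) plus the parity overhead.
    print(n, raid4_cost_factor(n), 1 + overhead)
```

With N = 2 the factor is 2, the same as mirroring; it shrinks toward 1 as N grows.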
Disadvantages of RAID 4
  1. - complexity
  2. - parity drive is I/O bottleneck on writing
  3. - writes are more expensive: two drives are written (like RAID 1), and extra reads are needed first
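
The parity property, the rebuild of a failed drive, and the extra reads on a write can all be shown with byte-wise XOR. This is a minimal sketch with made-up 4-byte blocks standing in for whole drives; `xor_blocks` is a hypothetical helper, not something from the lecture.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Hypothetical contents of the four data drives.
A = b"\x01\x02\x03\x04"
B = b"\x10\x20\x30\x40"
C = b"\xaa\xbb\xcc\xdd"
D = b"\x0f\x0e\x0d\x0c"

# The parity drive holds E = A^B^C^D.
E = xor_blocks(A, B, C, D)

# Drive C crashes: rebuild it from the parity and the surviving drives.
C_rebuilt = xor_blocks(E, A, B, D)

# Small write to B: read the old B and old E, then write the new B and new E.
B_new = b"\xff\x00\xff\x00"
E_new = xor_blocks(E, B, B_new)

print(C_rebuilt == C)                         # rebuild matches the lost data
print(E_new == xor_blocks(A, B_new, C, D))    # parity property still holds
```

The small write touches only B and E, but it needs two reads before its two writes, and every write in the array goes through E. That is why the parity drive becomes the bottleneck.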

RAID 5

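  • RAID 4 with the parity striped across all the drives, rather than kept on one dedicated parity drive
    • RAID 4 is better than RAID 5 in one respect: it is easier to add a new drive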

RAID 4 vs. RAID 5

If you want to add an additional drive:
  • RAID 4 -> add it, put 0 in it -> Done
  • RAID 5 -> add it, re-arrange parity areas in ALL the drives -> expensive and slow.
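
To see why adding a drive to RAID 5 is expensive, consider where the parity for each stripe lives. The rotation rule below is only one possible layout, chosen as an assumption for illustration; real implementations offer several layouts.

```python
def parity_drive(stripe, num_drives):
    """Assumed rotation: parity moves one drive to the left on each stripe."""
    return (num_drives - 1 - stripe) % num_drives


# Which drive holds the parity for stripes 0..7, before and after adding a drive.
layout_4 = [parity_drive(s, 4) for s in range(8)]
layout_5 = [parity_drive(s, 5) for s in range(8)]

# Stripes whose parity location changes when the array grows from 4 to 5 drives.
moved = [s for s in range(8) if layout_4[s] != layout_5[s]]

print(layout_4)
print(layout_5)
print(moved)
```

Under this layout the parity location changes for every one of these stripes, so growing the array means rewriting data on all the drives. In RAID 4 the parity stays on its dedicated drive, and a new drive full of zeros leaves every XOR unchanged.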

Disk drive reliability

  • Failure rate (eg. 2%/year)
    • Annualized failure rate: Since manufacturers want to get the disks to market as soon as possible, they test them for, say, 1 month and annualize the results.
      Note: This metric is an approximation.
  • MTTF: Mean time to Failure
    • Typically about 300,000 hours or 34 years
    • However, in reality, disks should be replaced every 5 years.
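
The numbers above can be related with simple arithmetic. This sketch assumes a constant failure rate over the drive's life, which is itself an approximation; the test-run figures (1000 drives, 30 days, 2 failures) are invented for illustration.

```python
HOURS_PER_YEAR = 24 * 365


def mttf_years(mttf_hours):
    """Convert a mean time to failure from hours to years."""
    return mttf_hours / HOURS_PER_YEAR


def annualized_failure_rate(mttf_hours):
    """Approximate fraction of drives failing per year, assuming a constant rate."""
    return HOURS_PER_YEAR / mttf_hours


def annualize(failures, drives, test_hours):
    """Scale the failures seen in a short test up to a yearly rate."""
    return failures / (drives * test_hours) * HOURS_PER_YEAR


print(mttf_years(300_000))               # a bit over 34 years
print(annualized_failure_rate(300_000))  # just under 3% per year
print(annualize(2, 1000, 30 * 24))       # hypothetical one-month test
```

A short test like this never observes drives wearing out with age, which is one reason the annualized figure is only an approximation.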

[Figure: PDF (probability density function) of disk failure]

[Figure: PDF for RAID 4]


Security holes

Real world security defends us against:
  • force, e.g. police, army
  • fraud, e.g. impersonators (a big deal because of the Internet)

Main forms of attacks

  • attack against privacy: unauthorized information release
  • attack against integrity: tampering with victim's data
  • attack against service: denial of service

Defense goals

  • Deny unauthorized access (defends against privacy and integrity attacks)
    • not natural to test for
  • Allow authorized access (defends against privacy and integrity attacks)
    • natural to test for
  • Be able to handle lots of "bogus" requests (defends against service attacks)
    • relatively easy to test

Supreme Council of Virtual Space

  • reports directly to the Supreme Leader of Iran
  • controls "all" information flow in Iran
  • launched a DoS attack on BBC's website and satellite feed
  • blocked more than half of the world's websites in Iran

Threat modeling and classification

  • insiders (eg. system admins)
  • social engineering (eg. Kevin Mitnick)
  • network attacks
    • DDoS
    • Drive-by download (DBD)
  • virus
  • device attacks (e.g. a virus on a USB flash drive)

Kerckhoffs's design principle (for cryptographic systems)

  • Minimize what needs to be kept secret. Assume bad guys will learn your design (or any global key).
  • Counterexample: the DVD Content Scramble System (CSS)

General functions needed for almost any security mechanism

  • Authentication (e.g. password)
  • Integrity - assurance that data hasn't been tampered with. (e.g. checksum)
  • Authorization (e.g. access control list [ACL])
  • Auditing - defending against insider attacks. (e.g. logs)
    -------------We want the above four functions to be orthogonal-------------
  • Correctness
  • Efficiency
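
As an illustration of the integrity function: a plain checksum catches accidental corruption, but an attacker who alters the data can simply recompute it. The sketch below swaps in a keyed checksum (HMAC from Python's standard library), so that only the key needs to be kept secret. The key and messages are made up for illustration.

```python
import hashlib
import hmac


def make_tag(key, data):
    """Keyed checksum over data; only holders of the key can produce it."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(key, data, expected_tag):
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, data), expected_tag)


key = b"hypothetical-shared-secret"
original = b"transfer $10 to alice"
tag = make_tag(key, original)

ok = verify(key, original, tag)
tampered_ok = verify(key, b"transfer $1000 to mallory", tag)

print(ok)           # untouched data verifies
print(tampered_ok)  # modified data does not
```

The algorithm is public and only the key is secret, which is in the spirit of minimizing what has to be kept secret.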

How to authenticate

  • something the principal is, e.g. retinal scan
  • something the principal has, e.g. smart card
  • something the principal knows, e.g. password
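
For the "something the principal knows" case, one common approach is to store a salted, deliberately slow hash of the password instead of the password itself. This is a minimal sketch using Python's standard library; the iteration count and the example passwords are assumptions chosen for illustration, not recommendations from the lecture.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, for illustration only


def hash_password(password, salt):
    """Derive a slow, salted hash so a stolen table is costly to brute-force."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


def check_password(password, salt, stored):
    """Re-derive the hash from the attempt and compare it in constant time."""
    return hmac.compare_digest(hash_password(password, salt), stored)


# Enrollment: keep only the salt and the derived hash, never the password.
salt = os.urandom(16)
stored = hash_password("correct horse", salt)

accepted = check_password("correct horse", salt, stored)
rejected = check_password("wrong guess", salt, stored)

print(accepted, rejected)
```

The salt is not secret. Its job is to make identical passwords hash differently, so one precomputed table cannot be reused against every account.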