CS111 Lecture 16 Scribe Notes

By: Eric Wei

Network File System

NFS uses a client-server architecture

Performance

  1. benchmarks obtainable through www.spec.org 
  1. companies run older benchmarks like SPECsfs2008_nfs.v3 to obtain better results
  1. Example NFS - Sun ZFS storage 7320 Appliance (To be released May 2012) specs
  1. 2 storage controllers
  2. 2 10 Gb Ethernet Adapters

  1. 8 512 GB SSDs (for read acceleration)
  2. 8 73 GB SSDs (for write acceleration)
  3. 136 300 GB 15 kRPM hard drives
  4. Split into 32 filesystems

  1. The system has no single point of failure because of redundancy: it has multiple copies of each component

RPC is part of NFS

* 2 ms isn’t too bad, but we want to speed this up. How can we do this?

* Let’s do multiple reads at once

 

* If the threads are independent, this works well
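A minimal Python sketch of issuing several reads at once. Everything here is an assumption for illustration: `fake_read` stands in for a remote READ by sleeping for the 2 ms round trip, and the worker count is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.002  # assumed 2 ms round trip per request


def fake_read(block_number):
    """Hypothetical stand-in for one remote READ: wait, then return data."""
    time.sleep(LATENCY)
    return f"block-{block_number}"


def read_sequential(blocks):
    # One request at a time: total time is roughly len(blocks) * LATENCY.
    return [fake_read(b) for b in blocks]


def read_parallel(blocks, workers=8):
    # Independent requests overlap their waiting time.
    # pool.map returns results in request order, even if they finish out of order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_read, blocks))
```

Both functions return the same data; the parallel version just spends less time waiting, provided the reads do not depend on each other.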

* Web browsers basically use RPC

  1. originally, web browsers (client) issued requests to servers sequentially
  2. now, web browsers issue multiple requests in parallel through HTTP pipelining
  1. This brings up new issues. The client must deal with requests that fail or complete out of order
  2. Also, what if the client issues multiple writes and some of them fail? Here are 2 solutions:

1. be slow: don’t pipeline; wait for response

2. be fast: pipeline; keep going. Lie to the user about whether write() worked. At some point, though, you need to fess up and report what really happened.

* Conventionally errors are reported on ‘close’

* ‘close’ now becomes slow because it needs to wait for all responses to come in, but files aren’t closed very often so this is usually acceptable.

* This is why you should always check the return value of close()!!! (since you only discover the truth then)
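A small Python sketch of checking close(). `write_and_close` is a hypothetical helper, not part of any library; it relies only on the fact that `os.write` and `os.close` raise `OSError` on failure.

```python
import os


def write_and_close(path, data):
    """Write data to path. Return True only if every write AND the close succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    ok = True
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
    except OSError:
        ok = False
    finally:
        try:
            os.close(fd)  # deferred write errors can surface here
        except OSError:
            ok = False
    return ok
```

A caller that ignores the result of the close step would treat a failed write as a success.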

Issues with RPC

(+ = the good, - = the bad)

+ hard modularity (client and server have different address spaces)

- messages are delayed

- messages can be lost

- messages can be corrupted

- the network might be down, or slow

- the server might be down, or slow

How do you tell the difference between being down and being slow? (big issue)

* We can usually deal with corruption by using checksums (use them liberally)

If the server detects a bad packet, it should send a response “huh???” and ask for retransmit
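A minimal sketch of checksumming a message, using CRC-32 from Python's `zlib` module. The packet layout (4-byte checksum followed by the payload) and the function names are assumptions for illustration, not the format NFS uses.

```python
import zlib


def make_packet(payload: bytes) -> bytes:
    """Prefix the payload with its CRC-32 (4 bytes, big-endian)."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def check_packet(packet: bytes):
    """Return the payload if the checksum matches, else None.

    None is the "huh???" case: the receiver should ask for a retransmit.
    """
    if len(packet) < 4:
        return None
    expected = int.from_bytes(packet[:4], "big")
    payload = packet[4:]
    if zlib.crc32(payload) != expected:
        return None
    return payload
```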

* If no response, we have a few options:

  1. at-least once RPC - we try again and keep trying until it succeeds
  1. okay for idempotent operations (read/write)
  1. at-most once RPC - return an error to caller. Let the caller choose how to handle it.
  1. for “dangerous” operations (like changing the balance of a bank account)
  1. exactly-once RPC - the ideal, but very hard to guarantee in practice
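The first two policies can be sketched as thin wrappers around a send function. `NoResponse` and the `send` callable are hypothetical names; a real client would raise something like this after a timeout.

```python
class NoResponse(Exception):
    """Hypothetical error raised by the transport when a request times out."""


def at_least_once(send, request):
    """Keep retrying until a response arrives.

    Only safe if the request is idempotent, since the server may
    end up executing it more than once.
    """
    while True:
        try:
            return send(request)
        except NoResponse:
            continue  # no answer: try again


def at_most_once(send, request):
    """Send once. On no response, the error propagates to the caller,
    who decides what to do (the request may or may not have run)."""
    return send(request)
```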

Robustness

* NFS assumes “stateless” server

  1. stateless - controller’s RAM doesn’t count as part of the state, so if the power cuts out, nothing vital is lost
  2. RAM is cache only

  1. This is essentially mounting
  2. NFS protocol goes over the wire
  1. READ(fh, data)
  2. WRITE(fh, data)
  3. LOOKUP(fh, name)
  4. REMOVE(fh, name)
  5. CREATE(fh, name, attr)
  1. fh = file handle
  2. What is a file handle?
  1. an integer (actually a little more than that) uniquely identifying a file
  2. these are like inodes in the actual file system
  1. To have the file system be fast, we need a module in the kernel that allows the file system to fiddle with files directly via inode numbers
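A toy in-memory model of the requests above, to show what "stateless" buys. This is only a sketch: `ToyServer` is hypothetical, file handles are plain integers, `disk` plays the role of stable storage, and the explicit `offset` and `size` arguments are an assumption added here (the request list above does not spell them out). The server keeps no per-client state such as open files or file positions, so every request carries everything needed to execute it.

```python
class ToyServer:
    """In-memory stand-in for an NFS-like server (illustration only)."""

    ROOT = 1  # file handle of the root directory

    def __init__(self):
        # fh -> dict (directory: name -> fh) or bytearray (file contents)
        self.disk = {self.ROOT: {}}
        self.next_fh = 2

    def create(self, dir_fh, name, attr=None):
        # attr is accepted but ignored in this sketch
        fh = self.next_fh
        self.next_fh += 1
        self.disk[fh] = bytearray()
        self.disk[dir_fh][name] = fh
        return fh

    def lookup(self, dir_fh, name):
        return self.disk[dir_fh][name]

    def write(self, fh, offset, data):
        f = self.disk[fh]
        if len(f) < offset:
            f.extend(b"\0" * (offset - len(f)))  # pad any gap with zeros
        f[offset:offset + len(data)] = data
        return len(data)

    def read(self, fh, offset, size):
        return bytes(self.disk[fh][offset:offset + size])

    def remove(self, dir_fh, name):
        fh = self.disk[dir_fh].pop(name)
        del self.disk[fh]
```

Because a write names its own offset, repeating it leaves the file unchanged, which is what makes retrying it safe.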

  1. NFS does not guarantee write-to-read consistency
  2. It does guarantee close-to-open consistency (because close is much slower)

Reliability

Main issues

  1. bad network
  2. bad client (operator powers off machine)
  3. bad server
  4. bad disk (Media Faults)

Let’s focus on Media Faults

  1. can we address this issue via logging?
  1. no, because the journal used for logging could be corrupted
  1. RAID (Redundant Array of Inexpensive/Independent Disks)
  1. the original purpose of RAID was to get a bunch of cheap, smaller disks to act like a larger disk because disk makers were overpricing larger disks (e.g. a 1 MB disk would be $100 but a 5 MB disk would be $2000)
  2. nowadays, the key feature of RAID stems from the R (Redundant)
  1. The various flavors of RAID
  1. RAID 0 - concatenation
  1. make a larger virtual disk by stringing together a bunch of smaller disks

  1. RAID 1 - mirror
  1. multiple physical drives for a single virtual one
  2. reads are faster

  1. Striping - a combination of RAID 0 and RAID 1
  1. interleaves regions of the virtual disk across the physical disks
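A small sketch of the address arithmetic behind concatenation and striping. The function names and the one-block stripe unit are assumptions for illustration; each function maps a virtual block number to a (disk index, block on that disk) pair.

```python
def concat_map(block, blocks_per_disk):
    """Concatenation: fill disk 0 completely, then disk 1, and so on."""
    return block // blocks_per_disk, block % blocks_per_disk


def stripe_map(block, n_disks):
    """Striping: consecutive virtual blocks rotate across the disks."""
    return block % n_disks, block // n_disks
```

With 4 disks, virtual blocks 0 through 3 all land on disk 0 under concatenation but on disks 0, 1, 2, 3 under striping, so a large sequential transfer can keep every disk busy.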

  1. There are more types of RAID, but we’re going to focus on RAID 4

  1. The XOR disk holds the bitwise parity of the other disks, which allows the data on any one of them to be recovered if it fails
  1. example: if disk B dies, to restore the bits on B, we use the following equation
  1. B = A ^ C ^ D ^ (A ^ B ^ C ^ D)
  1. you can lose any single disk and still run, but if you lose 2 disks, you won’t be able to recover their data anymore so MAKE SURE THE SERVER GUYS GET NOTIFIED IF A DISK FAILS
  2. Of all the drives, the XOR drive is the busiest
  1. every write to any of the other disks = a write to the XOR drive as well
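The recovery equation above can be checked with a few lines of Python. This is a sketch with hypothetical helper names, treating each disk as a short byte string; `update_parity` shows why every data write also touches the XOR drive.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_parity(data_disks):
    """Contents of the XOR disk: the parity of all data disks."""
    return xor_blocks(data_disks)


def recover(surviving_disks, parity):
    """Rebuild the single missing disk from the survivors plus parity."""
    return xor_blocks(list(surviving_disks) + [parity])


def update_parity(old_parity, old_data, new_data):
    """Parity after overwriting one data disk's block: the XOR disk
    must be rewritten on every data write."""
    return xor_blocks([old_parity, old_data, new_data])
```

For example, with data disks A, B, C, D and `parity = make_parity([A, B, C, D])`, calling `recover([A, C, D], parity)` returns B. Losing two disks leaves one equation with two unknowns, so nothing can be rebuilt.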