Lecture 16 - CS 111 Scribe Notes

Robustness, parallelism and NFS

Abstraction

Abstraction can be implemented in several ways. One way is to use virtualization.

Example: r = chmod("/etc/passwd", 0644)

  1. The call above is actually a syscall.
  2. Scaling is an issue.
  3. The bus (or locks) can become a bottleneck.

Abstraction via RPC (remote procedure call)

  1. Looks like a function call
  2. Scales better

Differences between RPC and function calls (or even syscalls)

  1. caller and callee don't share memory (+)
  2. no call by reference (-)
    • only call by value -- large values may be slow to send.
  3. hard modularity -- even better than syscalls (+)
  4. caller and callee may use different architectures, like x86, ARM, etc. (+-)
    • different data representations (big endian vs. little endian)

Solution for different architectures: a uniform network representation

  1. marshalling (serialization)
  2. caller marshals
  3. callee unmarshals
    • XML, JSON, etc.
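
The marshalling steps above can be sketched in a few lines. This is only an illustration: the wire format (a length-prefixed path followed by a mode, all in big-endian "network" byte order) and the names marshal_chmod and unmarshal_chmod are assumptions made for the example, not a format given in lecture.

```python
import struct

# Hypothetical wire format for a remote chmod(path, mode):
#   4-byte big-endian path length, the UTF-8 path bytes, 4-byte big-endian mode.

def marshal_chmod(path, mode):
    """Caller side: turn the arguments into a byte string."""
    raw = path.encode("utf-8")
    # "!" selects network (big-endian) byte order regardless of the host CPU.
    return struct.pack("!I", len(raw)) + raw + struct.pack("!I", mode)

def unmarshal_chmod(msg):
    """Callee side: recover the arguments from the byte string."""
    (n,) = struct.unpack_from("!I", msg, 0)
    path = msg[4:4 + n].decode("utf-8")
    (mode,) = struct.unpack_from("!I", msg, 4 + n)
    return path, mode

wire = marshal_chmod("/tmp/example", 0o644)   # illustrative path
print(unmarshal_chmod(wire))
```

Because both sides agree on one byte order, a little-endian caller and a big-endian callee decode the same values.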

Difference between an ordinary app and a distributed system

For an ordinary app, a little glue code connects the library to the application code. The library and the code are on the same machine.


RPC failure modes are different

  1. callee can't trash caller's data
  2. messages can get lost
    • TCP: the protocol resends
    • UDP: the app deals with it
  3. messages can get corrupted
  4. network might be down (or slow)
  5. server might be down (or slow)

What the glue code should do

  1. if a message is corrupted
    • resend
  2. if there is no response, choose one of:
    • retry, and keep trying until success -- at-least-once RPC
    • fail and return an error -- at-most-once RPC (works better for transactions)
    • exactly-once RPC -- hard to implement
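
The two retry policies can be contrasted with a small simulation. Everything here is hypothetical: FlakyTransport stands in for a network that loses the first few messages, and the max_tries cap is an assumption added so the sketch terminates.

```python
class FlakyTransport:
    """Hypothetical transport that gets no response for the first `drops` sends."""
    def __init__(self, drops):
        self.drops = drops
        self.sent = 0

    def send(self, request):
        self.sent += 1
        if self.sent <= self.drops:
            raise TimeoutError("no response")
        return "ok:" + request

def call_at_least_once(transport, request, max_tries=10):
    # Keep resending until a reply arrives. On a real network the request
    # may have been executed more than once, since a lost reply looks the
    # same to the caller as a lost request.
    for _ in range(max_tries):
        try:
            return transport.send(request)
        except TimeoutError:
            continue
    raise TimeoutError("gave up after %d tries" % max_tries)

def call_at_most_once(transport, request):
    # Send a single message; on timeout, report failure instead of retrying.
    try:
        return transport.send(request)
    except TimeoutError:
        return None

print(call_at_least_once(FlakyTransport(2), "chmod"))
print(call_at_most_once(FlakyTransport(1), "chmod"))
```

At-least-once is safe only when repeating the request is harmless (idempotent); at-most-once pushes the decision about what to do next back to the caller.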

RPC examples

  1. HTTP protocol
    • send "GET / HTTP/1.0\r\n"
      receive "HTTP/1.1 200 OK\r\n content-length: 10423"
  2. SOAP (Simple Object Access Protocol)
  3. X Window System

Performance problems with RPC

Coalesce requests: e.g. 4 set-pixel requests from an X Window program can travel in one message.
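
A minimal sketch of coalescing, assuming a hypothetical client that buffers set-pixel requests and ships them in one message once four have accumulated. BatchingClient, its send callback, and the batch size are illustrative names and values, not part of the X protocol.

```python
class BatchingClient:
    """Buffers small requests and sends them together as one message."""
    def __init__(self, send, batch_size=4):
        self.send = send              # hypothetical function that ships one message
        self.batch_size = batch_size
        self.pending = []

    def set_pixel(self, x, y, color):
        self.pending.append(("set_pixel", x, y, color))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Ship whatever is buffered as a single message, then start over.
        if self.pending:
            self.send(list(self.pending))
            self.pending.clear()

messages = []                         # stands in for the network
client = BatchingClient(messages.append)
for i in range(4):
    client.set_pixel(i, 0, "red")
print(len(messages), len(messages[0]))   # one message carrying four requests
```

The cost is latency: a request sits in the buffer until the batch fills or someone calls flush().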

Asynchronous calls are used. Each call is split into two parts:

  1. request
  2. notification

How to improve the performance of RPC

  1. HTTP pipelining
    • requests identify themselves
    • responses specify which request they answer
    • possible problem: responses may come back out of order
    • dependent requests may be problematic: create file xyz, then change the permissions of xyz
  2. Change the API to send bigger data chunks
  3. Cache recent answers to requests
    • requires collaboration with the server
    • the stale cache problem is very common when you deal with RPC programs
  4. Prefetch answers
    • only for read-only actions
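
The idea in item 1, that requests identify themselves so out-of-order responses can still be matched up, can be sketched as follows. The integer ids and the match_responses helper are assumptions for illustration; this is not a description of how any particular HTTP implementation tags messages.

```python
def match_responses(requests, responses):
    """Pair each response with its request using the id carried in both."""
    by_id = {req_id: payload for req_id, payload in requests}
    return {by_id[req_id]: answer for req_id, answer in responses}

# Three requests are sent back to back without waiting for replies.
requests = [(1, "GET /a"), (2, "GET /b"), (3, "GET /c")]
# The replies arrive in a different order, each tagged with its request id.
responses = [(3, "C"), (1, "A"), (2, "B")]

print(match_responses(requests, responses))
```

Matching by id fixes the bookkeeping, but not the dependency problem: if request 2 only makes sense after request 1 has completed, the caller still has to wait for the first reply before sending the second.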

NFS protocol


	MKDIR(dirfh, name, attr) ---> fh + attr
		name: no "/" allowed
	LOOKUP(dirfh, name) ---> fh + attr
	CREATE(dirfh, name, attr) ---> fh + attr
	REMOVE(dirfh, name) ---> status
	READ, WRITE: look like their Unix counterparts
	NFS was designed for Unix; CIFS was designed for Windows.
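
To make the shape of these calls concrete, here is a toy in-memory server with the same four operations. It is a sketch under loud assumptions: ToyServer, the integer file handles, root handle 0, and the "OK"/"ENOENT" status strings are all invented for illustration and say nothing about the real NFS wire format.

```python
class ToyServer:
    """In-memory stand-in for an NFS-like server; not the real protocol."""
    def __init__(self):
        self.next_fh = 1
        self.dirs = {0: {}}                  # directory fh -> {name: fh}; 0 is the root
        self.attrs = {0: {"type": "dir"}}    # fh -> attributes

    def _new(self, dirfh, name, attr, is_dir):
        if "/" in name:
            raise ValueError('name must not contain "/"')
        if name in self.dirs[dirfh]:
            raise FileExistsError(name)
        fh = self.next_fh
        self.next_fh += 1
        self.dirs[dirfh][name] = fh
        self.attrs[fh] = dict(attr, type="dir" if is_dir else "file")
        if is_dir:
            self.dirs[fh] = {}
        return fh, self.attrs[fh]

    def mkdir(self, dirfh, name, attr):
        return self._new(dirfh, name, attr, True)

    def create(self, dirfh, name, attr):
        return self._new(dirfh, name, attr, False)

    def lookup(self, dirfh, name):
        fh = self.dirs[dirfh][name]
        return fh, self.attrs[fh]

    def remove(self, dirfh, name):
        fh = self.dirs[dirfh].pop(name, None)
        if fh is None:
            return "ENOENT"
        del self.attrs[fh]
        self.dirs.pop(fh, None)
        return "OK"

srv = ToyServer()
home, _ = srv.mkdir(0, "home", {"mode": 0o755})
fh, attr = srv.create(home, "notes.txt", {"mode": 0o644})
print(srv.lookup(home, "notes.txt") == (fh, attr), srv.remove(home, "notes.txt"))
```

Every call names the directory by handle and passes everything else as arguments, so the server needs no per-client session to interpret a request.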

NFS design goals

  1. clients should survive server crashes nicely
    • stateless server (can be slow)
    • we can solve the slowness: use flash on the server to store pending requests (NVRAM)
    • writes don't wait for the server to respond; if a write fails, a later "close" will fail
    • can use "fsync" (flush the file's data and metadata) and "fdatasync" (flush the file's data only) to make sure data is written
  2. NFS by design doesn't have read/write consistency.
  3. It does have open/close consistency through fsync and fdatasync.

To keep file handles consistent, we add a serial number to each file handle.
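
One way to read the serial number, offered here as an assumption rather than the lecture's exact design: a handle is a pair (inode number, serial), and the serial is bumped whenever the inode is reused, so a client holding a handle to a deleted file can be detected. HandleTable and its methods are hypothetical names.

```python
class HandleTable:
    """Toy table mapping each inode number to its current serial number."""
    def __init__(self):
        self.serial = {}   # inode number -> current serial number

    def allocate(self, inode):
        # Reusing an inode bumps its serial, so older handles stop matching.
        self.serial[inode] = self.serial.get(inode, 0) + 1
        return (inode, self.serial[inode])

    def is_valid(self, fh):
        inode, serial = fh
        return self.serial.get(inode) == serial

table = HandleTable()
old = table.allocate(7)
new = table.allocate(7)    # inode 7 was freed and reused for a different file
print(table.is_valid(old), table.is_valid(new))   # False True
```

Without the serial, the old handle would silently refer to the new file that happens to occupy the same inode.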
