Lecture 16: Robustness, Parallelism, and NFS

Abstraction

Abstraction by Virtualization:
r=read(fd,buf,size)
Differences from ordinary function calls (or even syscalls):
pros:
+hard modularity: the caller and callee are in different address spaces, so they cannot interfere with each other
+-caller and callee may use different architectures, but then they may also use different data representations
cons:
-caller and callee are on different machines, so they do not share memory
-*server may be offline/slow
-*network may be offline/slow
-no call by reference
-only call by value; large values will be slow
*= From the client's perspective, these two events are indistinguishable.
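Since there is no call by reference and the two machines may represent data differently, the arguments have to be copied into an agreed byte format before they are sent. A minimal sketch, assuming a made-up wire format of three big-endian 32-bit integers; the names marshal_read and unmarshal_read and the offset field are hypothetical, not part of any real RPC library:

```python
import struct

# Hypothetical wire format for a read request: three unsigned 32-bit
# integers (fd, offset, size) in network (big-endian) byte order.
READ_REQUEST = struct.Struct("!III")

def marshal_read(fd, offset, size):
    """Caller side: copy the arguments by value into a byte string."""
    return READ_REQUEST.pack(fd, offset, size)

def unmarshal_read(message):
    """Callee side: recover the arguments regardless of host byte order."""
    return READ_REQUEST.unpack(message)

message = marshal_read(3, 4096, 512)
print(len(message), unmarshal_read(message))
```

Fixing the byte order in the format string is what lets a little-endian caller talk to a big-endian callee.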

RPC

RPC failure modes are different:
pros:
+callee can't trash caller's data
cons:
-messages can get lost
-network might be down/slow
-server might be down/slow
So how can the client help with these issues? To handle data corruption, we can use checksums to verify that each packet is uncorrupted. If we find a bad one, we ask the server to resend it. The server can do the same for packets received from the client. The main problem arises when no response arrives at all. In this case, we can do the following:
-keep trying/At least once RPC: keep trying until you get a response. This is a valid option for idempotent operations such as reads and non-appending writes.
-return error to the caller/At most once RPC: good for more volatile, dangerous operations such as transactions or renaming a file.
-Exactly once RPC: a single request is sent and executed successfully. This is the ideal case for the user.
In short:
if message corrupted: resend
if no response:
-retry, keep trying until successful: at least once RPC
-fail, return error: at most once RPC (works better for transactions)
HTTP protocol
connect to HTTP server
send GET / HTTP/1.0\r\n
receive HTTP/1.1 200 OK\r\n Content-Length: 1023
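The two retry policies can be sketched as client stubs. This is only an illustration under stated assumptions: LossyServer is a hypothetical simulated server that does the work but loses its first few replies, and max_tries is an added bound so the demo always terminates.

```python
class LossyServer:
    """Simulated server whose first `drops` replies are lost (demo assumption)."""
    def __init__(self, drops):
        self.drops = drops
        self.executed = 0

    def call(self, request):
        self.executed += 1                     # the server does the work...
        if self.drops > 0:
            self.drops -= 1
            raise TimeoutError("no response")  # ...but the reply is lost
        return "ok: " + request

def at_least_once(server, request, max_tries=10):
    """Keep retrying until a reply arrives; safe only for idempotent requests."""
    for _ in range(max_tries):
        try:
            return server.call(request)
        except TimeoutError:
            continue
    raise TimeoutError("gave up after %d tries" % max_tries)

def at_most_once(server, request):
    """Send once; on a missing reply, report the error instead of retrying."""
    try:
        return server.call(request)
    except TimeoutError:
        return None   # caller must treat None as "outcome unknown"

s1 = LossyServer(drops=2)
print(at_least_once(s1, "READ"), s1.executed)
s2 = LossyServer(drops=1)
print(at_most_once(s2, "RENAME"), s2.executed)
```

Note that with at least once the server may execute the request several times, which is why it only suits idempotent operations; with at most once the client cannot tell whether the request ran.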
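Performance Problems-RPC
Suppose we have several requests to make; waiting for each response before sending the next request is slow. Options:
1.Asynchronous/Pipelining: send the next request without waiting for the previous response
2.Change API to send bigger data chunks
3.Cache
4.Prefetch answers (only for read-only actions)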
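To see why bigger chunks and caching help, here is a rough sketch that counts round trips. CountingServer and CachingClient are hypothetical stand-ins, and the 4096-byte block size is an arbitrary assumption.

```python
BLOCK = 4096  # assumed fetch size; bigger chunks mean fewer round trips

class CountingServer:
    """Stand-in for a remote server; counts how many requests it receives."""
    def __init__(self, data):
        self.data = data
        self.requests = 0

    def read(self, offset, size):
        self.requests += 1
        return self.data[offset:offset + size]

class CachingClient:
    """Fetches whole blocks and serves repeat reads from a local cache."""
    def __init__(self, server):
        self.server = server
        self.cache = {}   # block number -> bytes

    def read_byte(self, offset):
        block = offset // BLOCK
        if block not in self.cache:
            self.cache[block] = self.server.read(block * BLOCK, BLOCK)
        return self.cache[block][offset % BLOCK]

server = CountingServer(bytes(range(256)) * 64)   # 16 KiB of sample data
client = CachingClient(server)
values = [client.read_byte(i) for i in range(5000)]
print(server.requests)
```

Reading 5000 bytes one at a time touches only two blocks, so only two requests reach the server. The cache in this sketch is never invalidated, so it is only safe while the data is read-only.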

Network File Systems (NFS)

A network file system hides the physical location of the disks holding the data from its users. In this way, multiple machines can contribute to the same file system, but there are various issues that must be dealt with from the operating system's perspective.
Implementation
A struct task points to a struct filestruct (which contains the file descriptors), which points to numerous struct files, each of which points to a struct inode. Note the VFS layer (virtual file system layer) dividing the two sets of structs. The purpose of the VFS layer is to hide the differences between file systems: each file system supplies a struct file_operations and a struct inode_operations, containing pointers to its file operation functions and inode operation functions, respectively.
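The dispatch through a table of operation pointers can be sketched as follows. This is a toy model, not kernel code: ramfs_read and upperfs_read are hypothetical file systems invented for the example, and only a read operation is modelled.

```python
class FileOperations:
    """Table of function pointers, in the spirit of struct file_operations."""
    def __init__(self, read):
        self.read = read

def ramfs_read(file, size):
    # Hypothetical in-memory file system: data lives in the inode itself.
    data = file.inode.data[file.offset:file.offset + size]
    file.offset += len(data)
    return data

def upperfs_read(file, size):
    # Second hypothetical file system, to show that dispatch really differs.
    return ramfs_read(file, size).upper()

class Inode:
    def __init__(self, data, f_op):
        self.data = data
        self.f_op = f_op        # which file system's operations apply

class File:
    def __init__(self, inode):
        self.inode = inode
        self.offset = 0

def vfs_read(file, size):
    """Generic layer: knows nothing about the file system, just dispatches."""
    return file.inode.f_op.read(file, size)

a = File(Inode(b"hello", FileOperations(ramfs_read)))
b = File(Inode(b"hello", FileOperations(upperfs_read)))
print(vfs_read(a, 3), vfs_read(b, 3))
```

An NFS client plugs in at the same point: its read operation would send a READ request to the server instead of touching local data.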
NFS Protocol
MKDIR(dirfh, name, attr) //Parameters: parent, name of dir, permissions/ownership
LOOKUP(dirfh, name) -> fh+attrs
CREATE(dirfh, name) -> fh+attrs
REMOVE(dirfh, name) -> status
READ(fh, offset, size) -> data
These are a few of the functions in the NFS RPC protocol, and they are quite similar to Unix syscalls. The major difference is that system calls use file descriptors and NFS does not. Instead, NFS uses file handles, which uniquely identify files. Because of this, there is no close function in the NFS protocol.
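A toy sketch of why no close is needed: each READ names the file handle and offset explicitly, so the server keeps no per-client open-file state. StatelessServer and its add helper are hypothetical, and LOOKUP is simplified here to return only the handle, without attributes.

```python
class StatelessServer:
    """Toy server: keeps file contents but no per-client open-file state."""
    def __init__(self):
        self.files = {}          # file handle -> bytes
        self.names = {}          # (directory handle, name) -> file handle

    def add(self, dirfh, name, fh, data):   # setup helper, not an NFS call
        self.names[(dirfh, name)] = fh
        self.files[fh] = data

    def LOOKUP(self, dirfh, name):
        return self.names[(dirfh, name)]

    def READ(self, fh, offset, size):
        # Every request carries everything needed to serve it.
        return self.files[fh][offset:offset + size]

server = StatelessServer()
server.add(dirfh=1, name="notes.txt", fh=42, data=b"lecture sixteen")

fh = server.LOOKUP(1, "notes.txt")
offset = 0                        # the client, not the server, tracks position
first = server.READ(fh, offset, 7)
offset += len(first)
again = server.READ(fh, 0, 7)     # a resent READ returns the same bytes
rest = server.READ(fh, offset, 100)
print(first, again, rest)
```

Because READ carries its own offset, resending it is harmless, which fits the at least once retry style.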
Design Goals
-Stateless server (can be slow; NVRAM helps)
-Clients should survive server crashes nicely
file handle = inode# + device# + serial#
// REMOVE((inode#, device#, serial#), ...)
REMOVE((3, 12, 70), ...) //client 1
CREATE(...) -> (3,12) //client 2
//client 1 resends
REMOVE((3, 12, 70), ...) -> ...
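The role of the serial number in this scenario can be sketched as below. It is a simplified model under assumptions: the Server class is hypothetical, remove takes a file handle directly as in the example above, and the string "ESTALE" stands in for a stale-handle error reply.

```python
class Server:
    """Toy server that reuses inode numbers but bumps a serial number each time."""
    def __init__(self, device):
        self.device = device
        self.serial = {}      # inode# -> serial# of its current (or last) file
        self.live = set()     # inode numbers currently in use

    def create(self, inode):
        self.serial[inode] = self.serial.get(inode, 0) + 1
        self.live.add(inode)
        return (inode, self.device, self.serial[inode])   # the file handle

    def remove(self, fh):
        inode, device, serial = fh
        if (device != self.device or inode not in self.live
                or self.serial[inode] != serial):
            return "ESTALE"   # handle refers to a file that no longer exists
        self.live.remove(inode)
        return "OK"

server = Server(device=12)
old = server.create(3)            # client 1's file
r1 = server.remove(old)           # client 1 removes it; suppose the reply is lost
new = server.create(3)            # client 2's file reuses inode 3
r2 = server.remove(old)           # client 1 resends with the old handle
print(old, r1, new, r2)
```

Without the serial number, the resent request would match inode 3 on device 12 and delete client 2's new file.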

Synchronization Issues

write(fd, buf, bufsize) //process 1 @12:00
read(fd, buf, bufsize) //process 2 @12:01
NFS lacks read-after-write synchronization: process 2's read may not see process 1's write ('buf' is effectively copied). To fix:
write(fd, buf, bufsize) //process 1 @12:00
close(fd) //process 1
open(...) //process 2
read(fd, buf, bufsize) //process 2 @12:01
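close can fail with EIO or EDQUOT, because buffered writes may only reach the server at close, so its return value should be checked.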
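A minimal sketch of checking for a failed close. In Python os.close raises OSError instead of returning -1, and the quota error is spelled errno.EDQUOT. The helper close_checked and the simulated failing_close are hypothetical, introduced only so the failure path can be shown without a real NFS mount.

```python
import errno
import os

def close_checked(fd, close=os.close):
    """Close fd and report whether buffered writes may have been lost.

    `close` is injectable only so the failure path can be demonstrated.
    """
    try:
        close(fd)
    except OSError as e:
        if e.errno in (errno.EIO, errno.EDQUOT):
            return "write may be lost: " + errno.errorcode[e.errno]
        raise
    return "ok"

def failing_close(fd):
    # Simulates a server that rejects the flushed data for lack of quota.
    raise OSError(errno.EDQUOT, os.strerror(errno.EDQUOT))

r, w = os.pipe()
os.close(r)
print(close_checked(w))
print(close_checked(-1, close=failing_close))
```

A program that ignores this error can believe its data is safely stored when the server never accepted it.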