CS 111 Week 9 Lecture 16

Wednesday, May 28, 2014

Lecturer: Paul Eggert | Scribe: Caitlin Torsney

Table of Contents

Remote Procedure Calls

What Can Go Wrong

Techniques to Address Problems

Sample Protocols

HTTP

X Window System

Network File System

Scheduling I/O

Disk Scheduling Algorithms

Remote Procedure Calls (RPC)

One important benefit that RPC's introduce to a system is that they allow the caller and callee to be isolated from each other. This means that the two entities cannot trash each other's memory. This is an important benefit, but there are a number of problems that RPC's have. As engineers, it is always important to think about problems and what could potentially go wrong within a system. This section examines scenarios in which things go wrong with RPC's. It also examines general techniques used to avoid these scenarios.

What Can Go Wrong

Lost Messages

Lost requests or lost responses can lead to problems within a system. For example, a server can respond correctly but the client may never get it. This is problematic because the client will not know that the server has responded and the server will not know that its response has been lost.

Duplicated Messages

Duplicated messages, including responses or requests, can also lead to problems. For example, if a request to withdrawl money from a bank account is duplicated N times the system will incorrectly withdrawl (N-1) too many times from the account. This will leave the user to wonder what happened to all their money!

Corrupted Messages

Corrupted responses and requests can be accidental or purposeful. For example, a request can maliciously ask for A instead of B or a request can accidentally ask for A instead of B. Either option will lead to unwanted or unexpected behavior in a system.

Network Problems

There are 3 types of network problems:

1) The network is too slow

2) The network is overloaded

3) The network is down

Each of these problems will slow down communication between the client and the server. Even worse, there is no way to tell the difference between these problems. For example, if a server's response is delayed due to a slow network the client that is waiting for the response has no way of knowing the delay is caused by a slow network. To the client the delay could be attributed to a slow, overloaded, or down network.

Server Problems

There are 2 types of server problems:

1) The server is too slow

2) The server is down

These problems also slow down communication between the client and the server. And similar to network problems, it is difficult for the client to tell which problem is affecting the server when responses are delayed.

Techniques to Address Problems

Checksums

Problem Addressed: Corrupted Messages

Checksums are be used to check for data corruption. They can verify that data has not changed from one state to another. This not a new idea, TCP Protocols generally use this technique. While it is a reliable technique does not perfectly detect corruption. In most practice, when a corrupted message is found the message gets resent.

At Least Once, RPC

Problem Addressed: Lost Messages

This technique tries to ensure that a message is never lost. It does so by continuing to resend a message while there is no response. This is not good practice in situations where an action should only be executed once (i.e. transferring money).

At Most Once, RPC

Problem Addressed: Lost Messages

This technique also tries to ensure that a message is never lost, but this approach is better for actions that should only happen once. In the case that there is no response, an error will be reported. This ensures that the system will know a message has been lost but it will never execute an action more than once.

Exactly Once, RPC

Problem Addressed: Lost Messages

This technique also tries to ensure that a message is never lost, but it is extremely hard to implement. Its goal is to have each action executed exactly once. This requires all lost messages to be found and executed properly. Implementing this technique may be impossible because you can never really trust the network to transfer your message. Therefore it is hard to keep perfect communication between the server and client.

Sample Protocols

HTTP

HTTP is a widely supported protocol. One problem that it has is that it is bulkier than some of its alternatives. HTTP is largely used as a basis for higher-level protocols. SOAP is an example protocol that builds off of HTTP.

Extra Resource

X Window System

X Window System (aka X), is widely known as "pixrect on wheels". It involves running a client on a separate machine from a server and they communicate by sending messages to one another. The server then controls a separate device that is used to display information. This protocol is useful in situations where security is more important than other aspects of a system. The display device is so simple, and has such small memory, that it is extremely trustworthy.

Extra Resource

Network File System (NFS)

NFS can be thought of as "the unix file system on wheels". The goal of NFS is to allow you to have a file server running on some machine other than the one running your OS and still have the system work. This means that your applications will have to work with NFS as if they were working with local files. In other words, applications should be unaware of the separate file server. Today, we are up to the 4th version of NFS. Although version 3 is still widely in use.

In order to accomplish the goals of NFS the kernel will ship out read and write requests over the network to the file server. The server will then handle the requests and send back responses.

Differences From Unix

Even though NFS is "unix on wheels" there are some system calls in NFS that differ from Unix. Some examples are listed below.

NFS Request: lookup(dirfh, name)

dirfh: Directory file handler (a number specifying a unique ID for the directory that we're interested in)

name: File name (string)

The directory file handler is implemented via an inode number and device number.

NFS Response

A response to the NFS request described above will return a file handler (fh) and the attributes (attrs) of the file

Other System Calls and Returns

Create File -- Call: create(dirfh, name, attr) | Return: fh + attrs

Create Directory -- Call: mkdir(dirfh, name, attr) | Return: fh + attrs

Remove File -- Call: remove(dirfh, name, attr) | Return: status

Remove Directory -- Call: rmdir(dirfh, name, attr) | Return: status

NFS Version 1

Designed to be Stateless

Version 1 of NFS focused on reliability. The goal was to allow the clients to keep working correctly even if the server was being completely rebooted. This means there would have to be no part of the server state that cares who the clients are. This was accomplished by making the RAM in the file system just a cache. A problem with this approach is that it is really slow.

Going Faster by Cheating

Client Cheats: Have the kernel return immediately from a write request saying that it succeed. This eliminates the need to wait for writes. Later on, when you call close(fd) the writes will be confirmed as successful and if one of the writes did not work close() will return false.

Server Cheats: Have the server respond immediately without actually saving the data yet. This eliminates the need to wait for server responses.

It is important to note that both of these cheats assume smart applications. They assume that the system has "fat clients" with lots of memory and dumb servers.

Extra Resource

Scheduling I/O

Assuming we have just one disk in our system, we want to manage a file system that resides on one rotating disk. The biggest factor in wait time will be the disk arm. This section will examine some algorithms that can be used to schedule tasks in this one disk system.

Disk Scheduling Algorithm

First Come First Serve (FCFS)

This algorithm will not cause any misbehavior or starve any processes but the average cost will be 1/3.

Shortest Seek Time First (SSTF)

This algorithm maximizes throughout because it minimizes the time that the system spends moving the disk arm. The problem with this approach is that it allows for starvation. If a ton of requests come in that map to the same area of the disk, it's possible that a request that is far away will never be run.

Elevator Algorithm

This algorithm follows the principles of SSTF but always seeks in the same direction. By doing so it eliminates the possibility of starvation but does not achieve as high a throughput as SSTF. This approach also gives an unfair advantage to requests in the middle of the disk because they are passed when the disk arm is moving in either direction.

Circular Elevator

This algorithm is similar to the normal elevator algorithm except that it removes the unfair advantage for requests in the middle. It does so by making the disk arm move in only one direction.

Elevator + Dally

This algorithm is also similar to the regular elevator algorithm except that it adds a delay (a few microseconds) after a request is serviced. The idea is to take advantage of sequential requests. This is a case where slowing down may actually improve performance.