Scribes: Ze (Albert) Yuan and Jacob Lewe
Contents
Memory Hierarchy
Which part of the disk should we keep in RAM? The tables below trace several page-replacement policies on the reference string 0 1 2 3 0 1 4 0 1 2 3 4, using physical pages A, B, C (and D); a "*" marks a page fault and a "?" marks a still-empty page. A small FIFO simulation follows the tables.
FIFO replacement, 3 physical pages (A-C); 9 faults in total:
access | 0  | 1  | 2  | 3  | 0  | 1  | 4  | 0  | 1  | 2  | 3  | 4  |
A      | 0* | 0  | 0  | 3* | 3  | 3  | 4* | 4  | 4  | 4  | 4  | 4  |
B      | ?  | 1* | 1  | 1  | 0* | 0  | 0  | 0  | 0  | 2* | 2  | 2  |
C      | ?  | ?  | 2* | 2  | 2  | 1* | 1  | 1  | 1  | 1  | 3* | 3  |

FIFO replacement, 4 physical pages (A-D); 10 faults in total. More memory, yet more faults than with 3 pages: this is Belady's anomaly.
access | 0  | 1  | 2  | 3  | 0  | 1  | 4  | 0  | 1  | 2  | 3  | 4  |
A      | 0* | 0  | 0  | 0  | 0  | 0  | 4* | 4  | 4  | 4  | 3* | 3  |
B      | ?  | 1* | 1  | 1  | 1  | 1  | 1  | 0* | 0  | 0  | 0  | 4* |
C      | ?  | ?  | 2* | 2  | 2  | 2  | 2  | 2  | 1* | 1  | 1  | 1  |
D      | ?  | ?  | ?  | 3* | 3  | 3  | 3  | 3  | 3  | 2* | 2  | 2  |

Optimal ("oracle") replacement, 3 physical pages; 7 faults in total. Evict the page whose next use is furthest in the future:
access | 0  | 1  | 2  | 3  | 0  | 1  | 4  | 0  | 1  | 2  | 3  | 4  |
A      | 0* | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 2* | 2  | 2  |
B      | ?  | 1* | 1  | 1  | 1  | 1  | 1  | 1  | 1  | 1  | 3* | 3  |
C      | ?  | ?  | 2* | 3* | 3  | 3  | 4* | 4  | 4  | 4  | 4  | 4  |

LRU replacement, 3 physical pages; 10 faults in total. Evict the least recently used page:
access | 0  | 1  | 2  | 3  | 0  | 1  | 4  | 0  | 1  | 2  | 3  | 4  |
A      | 0* | 0  | 0  | 3* | 3  | 3  | 4* | 4  | 4  | 2* | 2  | 2  |
B      | ?  | 1* | 1  | 1  | 0* | 0  | 0  | 0  | 0  | 0  | 3* | 3  |
C      | ?  | ?  | 2* | 2  | 2  | 1* | 1  | 1  | 1  | 1  | 1  | 4* |
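The FIFO rows above can be reproduced mechanically. Here is a minimal sketch (not from the lecture) that replays the reference string under FIFO replacement and counts page faults; with NFRAMES set to 3 it reports the 9 faults of the first table, and with NFRAMES set to 4 it reports the 10 faults of the Belady's-anomaly table.

    #include <stdio.h>

    #define NFRAMES 3   /* number of physical pages: try 3 and 4 */

    int main(void) {
        int refs[] = {0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4};
        int nrefs = sizeof refs / sizeof refs[0];
        int frames[NFRAMES];
        int faults = 0, next_victim = 0;

        for (int i = 0; i < NFRAMES; i++)
            frames[i] = -1;                    /* -1 marks an empty frame ("?") */

        for (int i = 0; i < nrefs; i++) {
            int hit = 0;
            for (int j = 0; j < NFRAMES; j++)
                if (frames[j] == refs[i])
                    hit = 1;
            if (!hit) {                        /* page fault: evict in FIFO order */
                frames[next_victim] = refs[i];
                next_victim = (next_victim + 1) % NFRAMES;
                faults++;
            }
        }
        printf("%d page faults with %d frames\n", faults, NFRAMES);
        return 0;
    }

The round-robin victim pointer implements FIFO here because a page only enters a frame when the old occupant is evicted, so eviction order equals arrival order.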
Data near the disk's read/write head is relatively cheap to access ⇒ the order in which we service disk requests affects performance.
Disk Reading Algorithms
- For roughly the first half of the course, we have been talking about virtualization and using virtualization to build modular systems
- Another approach is to use distribution.
- The caller (client) and callee (server) don't share an address space
+ A distributed system provides hard modularity for free
- Data transfer is less efficient
- Hard modularity means that you can no longer pass arguments by reference
- Example: you can no longer do read(fd, buf, bufsize) across the client/server boundary, because buf points into the client's address space, which the server cannot see
- The caller and callee can also disagree about fundamental data formats, such as:
floating point formats, integer formats, big-endian vs. little-endian byte order, or word size
- The network protocol must therefore specify the data representation on the wire:
e.g. integers are 32-bit, big-endian, two's complement; floating point uses an agreed-upon format such as IEEE 754
- A process called marshalling (packing arguments into the agreed wire format) and unmarshalling (unpacking them on the other side) is used to resolve these differences in data representation
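As a concrete illustration (a sketch, not code from the lecture), this is how a 32-bit integer might be marshalled into the big-endian wire format described above and unmarshalled on the other side. The helper names marshal_u32/unmarshal_u32 are made up for this example; htonl()/ntohl() are the standard conversions between the host's native byte order and big-endian network order, so a little-endian client and a big-endian server still agree on what the bytes mean.

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>   /* htonl, ntohl */

    /* Marshal: host-order integer -> 4 bytes in big-endian wire order. */
    void marshal_u32(uint32_t value, unsigned char buf[4]) {
        uint32_t wire = htonl(value);
        memcpy(buf, &wire, 4);
    }

    /* Unmarshal: 4 big-endian wire bytes -> host-order integer. */
    uint32_t unmarshal_u32(const unsigned char buf[4]) {
        uint32_t wire;
        memcpy(&wire, buf, 4);
        return ntohl(wire);
    }

    int main(void) {
        unsigned char buf[4];
        marshal_u32(1234567890u, buf);
        printf("on the wire: %02x %02x %02x %02x\n", buf[0], buf[1], buf[2], buf[3]);
        printf("unmarshalled: %u\n", unmarshal_u32(buf));
        return 0;
    }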
- A common client/server pattern to do this is Remote Procedure Call (RPC)
- A Remote Procedure Call looks like an ordinary function call at the client
e.g: we call time_t tod("Casablanca")
- We want our tod function to look like:
time_t tod(char *loc) {
    // fancy code to calculate the time at loc
    return some_time;
}
- However, the actual client code will look more like:
client code: {
    package("Casablanca", hd)     // marshal the argument into a request buffer hd
    setup_IP(hd, serveraddr)      // fill in the server's network address
    send(hd)                      // transmit the request over the network
    receive()                     // wait for the reply, then unmarshal the result
}
- The server code is the mirror image: receive the request, unmarshal the argument, compute the result, marshal it, and send it back (a rough sketch follows)
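For concreteness, here is one way the server half of the tod RPC could look over UDP sockets. This is a sketch under assumptions the notes do not make: the port number 12345, the fixed-size request buffer, the choice of UDP, and using time(NULL) as the "fancy code" are all illustrative, not the actual implementation.

    #include <stdint.h>
    #include <string.h>
    #include <time.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(void) {
        int sock = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in addr, client;
        socklen_t clientlen;
        char request[64];

        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(12345);                    /* assumed port number */
        bind(sock, (struct sockaddr *) &addr, sizeof addr);  /* error checks omitted */

        for (;;) {
            /* receive(): the marshalled location string arrives from a client */
            clientlen = sizeof client;
            ssize_t n = recvfrom(sock, request, sizeof request - 1, 0,
                                 (struct sockaddr *) &client, &clientlen);
            if (n < 0)
                continue;
            request[n] = '\0';       /* unmarshal the argument (e.g. "Casablanca") */

            /* placeholder for the "fancy code to calculate the time" for request */
            uint32_t reply = htonl((uint32_t) time(NULL));   /* marshal the result */

            /* send(): return the big-endian result to the caller */
            sendto(sock, &reply, sizeof reply, 0,
                   (struct sockaddr *) &client, clientlen);
        }
    }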
- Writing this by hand is tedious and error-prone, especially if you have lots of functions
- Luckily, the process can be automated by stub-generator programs that produce all of the marshalling and unmarshalling code
- Messages can become corrupted in transit (RAM suffers from bit flips too); this is solved by adding a checksum to each packet and discarding packets whose checksum does not match (see the sketch after this list)
- Messages can be lost or dropped
- Messages can be duplicated
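To make the corruption point concrete, here is a sketch of the checksum idea with a deliberately simple byte-sum; the packet contents and the corruption step are just for illustration, and real protocols use stronger codes (e.g. the 16-bit Internet checksum or a CRC) that catch more errors.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Sum the payload bytes; the sender appends this, the receiver recomputes it. */
    uint16_t checksum(const unsigned char *data, size_t len) {
        uint32_t sum = 0;
        for (size_t i = 0; i < len; i++)
            sum += data[i];
        return (uint16_t) sum;       /* truncated: collisions possible, but cheap */
    }

    int main(void) {
        unsigned char packet[] = "unlink(\"/~/foo/hw5.txt\")";
        uint16_t sent = checksum(packet, strlen((const char *) packet));

        packet[3] ^= 0x01;           /* simulate a bit flip in transit */
        uint16_t received = checksum(packet, strlen((const char *) packet));

        puts(sent == received ? "packet accepted" : "packet dropped");
        return 0;
    }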
- Suppose the client sends unlink("/~/foo/hw5.txt") and never receives a response. The client can:
1) Keep retrying (e.g. with a 100 ms timeout) until an answer arrives. This implements At Least Once RPC.
2) Give up and return an error. This implements At Most Once RPC.
3) Hybrid: try a few times, then return an error.
What we would really like is Exactly Once RPC, but that is hard to guarantee when messages can be lost or duplicated. Option 1 is sketched below.
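A minimal sketch of option 1 (at-least-once): keep resending until a reply arrives. The 100 ms timeout comes from the notes; send_request() and wait_for_reply() are hypothetical stand-ins for the real client stub. Note the semantic consequence: if the reply (rather than the request) was lost, the server may execute the unlink more than once. For unlink a second attempt simply fails, but for a non-idempotent operation this would matter.

    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical stubs so the sketch is self-contained; a real client stub
     * would transmit the request and block on the network with a timeout. */
    static void send_request(const char *path) { printf("request: unlink(%s)\n", path); }
    static bool wait_for_reply(int timeout_ms) { (void) timeout_ms; return true; }

    /* At-least-once: retry forever until some reply comes back. */
    int at_least_once_unlink(const char *path) {
        for (;;) {
            send_request(path);            /* (re)transmit the RPC request */
            if (wait_for_reply(100))       /* wait up to 100 ms for a response */
                return 0;                  /* got an answer, report success */
            /* timed out: assume the message was lost and try again */
        }
    }

    int main(void) {
        return at_least_once_unlink("/~/foo/hw5.txt");
    }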
Example: X window Protocol
"pixrect on wheels"
- The X window protocol essentially draws pixels on a screen using RPC-style requests from client to server
ex: client: draw_pixel(x,y,r,g,b)
- The setup for Professor Eggert's X server looks like this: [diagram not reproduced]
- So what happens if we want to draw lots of pixels at once?
- We could send one pixel per request and wait for its response before sending the next, but paying a network round trip per pixel causes far too much delay
- Instead we want to send lots of requests simultaneously
- We'll get lots of responses back, possibly in a different order, but this doesn't matter (a batching sketch follows)
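A sketch of the batching/pipelining idea: the client queues many marshalled draw_pixel requests and ships them in one message instead of paying a round trip per pixel. The request layout (five 32-bit big-endian fields), the buffer size, and flush_requests() are illustrative assumptions, not the real X wire format.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <arpa/inet.h>

    static unsigned char reqbuf[4096];   /* outgoing batch of marshalled requests */
    static size_t reqlen = 0;

    /* In a real client this would write reqbuf to the X server's socket and
     * later collect whatever replies come back, in any order. */
    void flush_requests(void) {
        printf("sending %zu bytes of batched requests\n", reqlen);
        reqlen = 0;
    }

    /* Append one marshalled draw_pixel request to the outgoing batch. */
    void draw_pixel(uint32_t x, uint32_t y, uint32_t r, uint32_t g, uint32_t b) {
        uint32_t fields[5] = { htonl(x), htonl(y), htonl(r), htonl(g), htonl(b) };
        if (reqlen + sizeof fields > sizeof reqbuf)
            flush_requests();            /* buffer full: ship what we have */
        memcpy(reqbuf + reqlen, fields, sizeof fields);
        reqlen += sizeof fields;
    }

    int main(void) {
        for (uint32_t x = 0; x < 100; x++)
            draw_pixel(x, x, 255, 0, 0); /* queue 100 pixels without waiting */
        flush_requests();                /* one send for the whole batch */
        return 0;
    }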