Lecture 15

By Arjun Patel and Ming Liu

Hardware traps

Choosing A Page

Page Faults: Replacement Policies

- If we want to read a page into physical memory and physical memory is full, we must evict a page currently in physical memory to make room for the new one. Replacement policies decide which page to evict.

- Replacement policies work by trying to predict which pages currently in physical memory are unlikely to be accessed again in the near future, and selecting those pages as candidates for eviction.

- As such, if an application accesses pages entirely at random, then no replacement policy is better than any other, since the application's page usage cannot be predicted.

- However, applications often exhibit non-random behavior with spatial and/or temporal locality, and replacement policies exploit these patterns to increase performance.

- In some cases, the hardware may not support the bookkeeping a replacement policy needs. This can be worked around by turning off the r, w, and x bits for all pages and instead storing the real permissions in the kernel. Since the bits are off, every page access faults, and the kernel can then record the access and take appropriate action.
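Below is a minimal user-space analogy of this trick on Linux (a sketch of mine, not from the lecture): revoke all access to a page with mprotect(), so the first touch raises SIGSEGV; the handler records the access and restores permissions, just as a kernel would with the page-table bits. Assumptions: 4 KiB pages, and that calling mprotect() from a signal handler is acceptable (true in practice on Linux, though not guaranteed by POSIX).

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define PAGE_SIZE 4096                        /* assumed page size */

    static char *page;                            /* the page being tracked */
    static volatile sig_atomic_t accessed = 0;    /* simulated "reference bit" */

    static void on_fault(int sig, siginfo_t *info, void *ctx)
    {
        (void)sig; (void)ctx;
        char *addr = (char *)info->si_addr;
        if (addr >= page && addr < page + PAGE_SIZE) {
            accessed = 1;                                      /* note the access */
            mprotect(page, PAGE_SIZE, PROT_READ | PROT_WRITE); /* re-enable it */
        } else {
            _exit(1);                                          /* a genuine crash */
        }
    }

    int main(void)
    {
        struct sigaction sa = {0};
        sa.sa_sigaction = on_fault;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);

        /* PROT_NONE = no r, w, or x bits: every access to this page faults */
        page = mmap(NULL, PAGE_SIZE, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (page == MAP_FAILED)
            exit(1);

        page[0] = 'x';                            /* faults once, then succeeds */
        printf("page accessed? %d\n", accessed);  /* prints: page accessed? 1 */
        return 0;
    }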

One Possible Policy: F(irst)I(n)F(irst)O(ut)

- The first page read into physical memory is the first one to be evicted.

- We now run an iteration of FIFO, assuming physical memory can hold 3 pages and the following VPNs are requested: 0,1,2,3,0,1,4,0,1,2,3,4.

Reference String   start    0    1    2    3    0    1    4    0    1    2    3    4
Physical Page 1     \0     (0)   0    0   (3)   3    3   (4)   4    4    4    4    4
Physical Page 2     \0      -   (1)   1    1   (0)   0    0    0    0   (2)   2    2
Physical Page 3     \0      -    -   (2)   2    2   (1)   1    1    1    1   (3)   3

- (x) indicates a page has just been read into memory, i.e. a page fault. FIFO incurs 9 page faults in this instance.
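To double-check the table, here is a minimal FIFO fault-counting simulator in C; the code and names are mine, not from the lecture. With NFRAMES set to 3 it reports the same 9 faults.

    #include <stdio.h>

    #define NFRAMES 3   /* number of physical pages */

    int main(void)
    {
        int refs[] = {0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4};  /* the VPN requests */
        int n = sizeof refs / sizeof refs[0];
        int frames[NFRAMES];
        int next = 0;               /* FIFO hand: slot holding the oldest page */
        int used = 0, faults = 0;

        for (int i = 0; i < n; i++) {
            int hit = 0;
            for (int j = 0; j < used; j++)
                if (frames[j] == refs[i]) { hit = 1; break; }
            if (!hit) {
                faults++;
                if (used < NFRAMES) {
                    frames[used++] = refs[i];        /* fill a free slot first */
                } else {
                    frames[next] = refs[i];          /* evict the oldest page */
                    next = (next + 1) % NFRAMES;
                }
            }
        }
        printf("FIFO with %d frames: %d page faults\n", NFRAMES, faults);
        return 0;
    }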

FIFO: Better Hardware

- Typically, there are two main ways to improve performance: via software or via hardware. Here we try hardware: we run FIFO on the same page requests as before, but with four physical page slots instead of three.

Reference String   start    0    1    2    3    0    1    4    0    1    2    3    4
Physical Page 1     \0     (0)   0    0    0    0    0   (4)   4    4    4   (3)   3
Physical Page 2     \0      -   (1)   1    1    1    1    1   (0)   0    0    0   (4)
Physical Page 3     \0      -    -   (2)   2    2    2    2    2   (1)   1    1    1
Physical Page 4     \0      -    -    -   (3)   3    3    3    3    3   (2)   2    2

- More hardware managed to incur more page faults than before: 10 instead of 9. Typically this is not the case, but for particular inputs it can happen. This phenomenon is called Belady's anomaly: adding more page frames sometimes results in more page faults.
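In the FIFO simulator sketched above, changing NFRAMES from 3 to 4 reproduces this table's 10 faults, which is a quick way to see the anomaly for yourself.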

L(east)R(ecently)U(sed): a different software approach

- Instead of choosing a victim based on the time pages were read in, LRU chooses the page that was least recently used.

- We run an iteration with the same assumptions as before and three physical pages.

Reference String   start    0    1    2    3    0    1    4    0    1    2    3    4
Physical Page 1     \0     (0)   0    0   (3)   3    3   (4)   4    4   (2)   2    2
Physical Page 2     \0      -   (1)   1    1   (0)   0    0    0    0    0   (3)   3
Physical Page 3     \0      -    -   (2)   2    2   (1)   1    1    1    1    1   (4)

- Ten page faults this time, one more than FIFO. The takeaway is that the effectiveness of a replacement policy is largely determined by the order of page accesses.
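The same simulator adapted to LRU (again my own sketch): each frame remembers when its page was last touched, and the victim is the frame with the oldest timestamp. It reports the ten faults above.

    #include <stdio.h>

    #define NFRAMES 3

    int main(void)
    {
        int refs[] = {0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4};
        int n = sizeof refs / sizeof refs[0];
        int frames[NFRAMES];        /* which page each frame holds */
        int last[NFRAMES];          /* time each frame's page was last used */
        int used = 0, faults = 0;

        for (int t = 0; t < n; t++) {
            int j, hit = -1;
            for (j = 0; j < used; j++)
                if (frames[j] == refs[t]) { hit = j; break; }
            if (hit >= 0) {
                last[hit] = t;                 /* a hit refreshes recency */
            } else {
                faults++;
                if (used < NFRAMES) {
                    j = used++;                /* fill a free frame first */
                } else {
                    j = 0;                     /* evict the least recently used */
                    for (int k = 1; k < NFRAMES; k++)
                        if (last[k] < last[j]) j = k;
                }
                frames[j] = refs[t];
                last[j] = t;
            }
        }
        printf("LRU with %d frames: %d page faults\n", NFRAMES, faults);
        return 0;
    }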

The 'Oracle' Policy

- In some cases, page accesses can be very predictable. Suppose this were such an instance. How many page faults would we incur if we could 'predict' future page accesses perfectly and thereby take the optimal action at every turn?

Reference String   start    0    1    2    3    0    1    4    0    1    2    3    4
Physical Page 1     \0     (0)   0    0    0    0    0    0    0    0   (2)   2    2
Physical Page 2     \0      -   (1)   1    1    1    1    1    1    1    1   (3)   3
Physical Page 3     \0      -    -   (2)  (3)   3    3   (4)   4    4    4    4    4

- Only seven page faults. Much better than before, but a policy like this requires perfectly predictable inputs, which for most applications is quite rare.
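Since we have the whole reference string in hand, the oracle can be simulated offline: on a fault, evict the resident page whose next use lies furthest in the future. A sketch of mine, not the lecture's; it reports the seven faults above.

    #include <stdio.h>

    #define NFRAMES 3

    /* position at which page p is next referenced after time t, or n + 1
     * ("infinitely far away") if it is never referenced again */
    static int next_use(const int *refs, int n, int t, int p)
    {
        for (int i = t + 1; i < n; i++)
            if (refs[i] == p) return i;
        return n + 1;
    }

    int main(void)
    {
        int refs[] = {0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4};
        int n = sizeof refs / sizeof refs[0];
        int frames[NFRAMES];
        int used = 0, faults = 0;

        for (int t = 0; t < n; t++) {
            int hit = 0;
            for (int j = 0; j < used; j++)
                if (frames[j] == refs[t]) { hit = 1; break; }
            if (hit) continue;
            faults++;
            if (used < NFRAMES) { frames[used++] = refs[t]; continue; }
            /* evict the resident page whose next use is furthest away; ties
             * between pages never used again are broken arbitrarily, and the
             * fault count comes out the same either way */
            int victim = 0;
            for (int j = 1; j < used; j++)
                if (next_use(refs, n, t, frames[j]) >
                    next_use(refs, n, t, frames[victim]))
                    victim = j;
            frames[victim] = refs[t];
        }
        printf("oracle with %d frames: %d page faults\n", NFRAMES, faults);
        return 0;
    }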

Distributed Systems & R(emote)P(rocedure)C(alls)

- A distributed system is a number of independent machines linked via some kind of network.

- As for RPCs, an analogy best captures the definition: RPCs are to distributed systems as system calls are to the kernel.

- RPCs are generally provided via either an API or an ABI, abstracting away lower-level details and allowing both ease of use and relative robustness.

- Despite the similarities between system calls and RPCs, there are major differences: the caller and callee do not share memory, messages can be lost or corrupted in transit, the server can fail independently of the client, and the two machines may have different architectures, as discussed next.

Solutions for differing architectures

- Consider two machines in a distributed system, one running x86-64 and the other running SPARC64. How would we resolve the seemingly simple issue that x86-64 is little-endian and SPARC64 is big-endian? How are the two machines supposed to communicate?

- The two sides agree on a byte order for data on the wire (for Internet protocols, network byte order is big-endian) and convert on the way in and out. So we have the 'translating' down. But how is data actually sent from one machine to another?

- Marshalling: the client takes its message, however complex it may be, and converts it to a format that can be sent to the server sequentially (i.e., one piece at a time). The server then reconstructs the original message from the individual pieces it received.
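Here is a minimal sketch of marshalling, using a wire format I made up for illustration: flatten a rename("abc","def") request into a byte buffer, with all integers in network byte order so both architectures agree, then reconstruct it the way the server side would.

    #include <arpa/inet.h>   /* htonl, ntohl */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* append a 32-bit length in network byte order, then the string's bytes */
    static size_t put_string(unsigned char *buf, const char *s)
    {
        uint32_t len = htonl((uint32_t)strlen(s));
        memcpy(buf, &len, 4);
        memcpy(buf + 4, s, strlen(s));
        return 4 + strlen(s);
    }

    /* read the same encoding back out */
    static size_t get_string(const unsigned char *buf, char *out)
    {
        uint32_t len;
        memcpy(&len, buf, 4);
        len = ntohl(len);
        memcpy(out, buf + 4, len);
        out[len] = '\0';
        return 4 + len;
    }

    int main(void)
    {
        unsigned char buf[256];
        size_t off = 0;

        /* client side: marshal the two arguments, one piece at a time */
        off += put_string(buf + off, "abc");
        off += put_string(buf + off, "def");
        /* ...the stub would now ship buf[0..off) over the network... */

        /* server side: unmarshal in the same order */
        char from[64], to[64];
        size_t pos = 0;
        pos += get_string(buf + pos, from);
        pos += get_string(buf + pos, to);
        printf("rename(\"%s\", \"%s\")\n", from, to);
        return 0;
    }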

Stubs

How do we convert a word from little endian to big endian? Swap the two halves of the word, then recursively swap the halves of each half, down to single bytes; that takes log base 2 of the word's size in bytes rounds of swapping. Intel added the bswap instruction to its little-endian processors to do this in a single step.
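A sketch of that recursive halving for a 64-bit word, written out as three swap rounds (log2 of 8 bytes); the function name and the demo value are mine.

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Reverse the bytes of a 64-bit word: swap the 32-bit halves, then the
     * 16-bit halves within each, then the bytes within each. The x86 bswap
     * instruction performs this whole reversal in one step. */
    static uint64_t bswap64(uint64_t x)
    {
        x = (x >> 32) | (x << 32);                   /* swap 32-bit halves */
        x = ((x & 0xFFFF0000FFFF0000ULL) >> 16) |
            ((x & 0x0000FFFF0000FFFFULL) << 16);     /* swap 16-bit halves */
        x = ((x & 0xFF00FF00FF00FF00ULL) >> 8) |
            ((x & 0x00FF00FF00FF00FFULL) << 8);      /* swap adjacent bytes */
        return x;
    }

    int main(void)
    {
        uint64_t x = 0x0102030405060708ULL;
        /* prints: 0102030405060708 -> 0807060504030201 */
        printf("%016" PRIx64 " -> %016" PRIx64 "\n", x, bswap64(x));
        return 0;
    }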

Stubs are pieces of code that run in the client, gluing the client application to the network. Take rename("abc","def") as an example: the stub performs the data conversions, puts the call into a packet, and ships it to the server automatically. It is possible to write stubs by hand, but after 50-100 of them it gets boring, so they are generated automatically from protocol descriptions: you write a textual description of what goes over the wire, and a tool emits C code that implements the protocol. rpcgen, for example, generates the stubs that the client and server use to support a protocol. All of this is ordinary user code; it doesn't use the kernel.