Lecture 14: Scribe Notes for May 21, 2014
By Sagar Patel, Samir Choudary and Tanay Agrawal

----------------------------------------------------------------------------------------------------------

Table of Contents

1. Page Faults
2. Page Fault Optimizations
3. Choosing The Victim
4. Distributed Systems

-----------------------------------------------------------------------

Page Faults

A page fault is basically a trap to the kernel, raised by the hardware when the virtual page that the process is accessing is not present in memory. This can be due to 2 reasons:

  1. The virtual address is illegal. In this case the virtual page corresponds to a NULL entry in the Page Table.
  2. The virtual address is legal, and the virtual page maps to an entry on disk. This entry however does not exist in memory.

To deal with the first case the kernel sends a SIGSEGV signal to the offending process, to indicate that the address was out-of-bounds.
To deal with the second case, the kernel swaps the virtual page that the process is requesting into memory. Typically, it will need to evict another page from memory before it can do so (since memory is usually at full capacity!).

Difference between interrupts and page faults:
In a trap like a system call (e.g. open(), issued via a software interrupt), the kernel generally returns to the NEXT instruction of the program.
In a page fault, the kernel generally loads the corresponding page into memory and returns to the CURRENT instruction (i.e. the instruction that caused the fault) of the program.

The pseudocode for managing a page fault is shown below:

Key terms:
vpn - Virtual Page number
ppn - Physical Page number
v_ prefix - victim's attribute


// supplied by the kernel
long long swapmap(proc, vpn) {
    // return the disk address of the corresponding virtual page, or FAIL
}

void pfault(proc, vpn, access_type) {
    long long da = swapmap(proc, vpn);
    if (da == FAIL)        // not a valid page
        kill(proc);
    else {
        (v_proc, v_vpn, ppn) = page_removal_policy();  // pick a page in physical memory to swap out to disk
        // Since we need to save the data of the victim's virtual page that we are
        // swapping out of memory, this function returns the victim's PID (v_proc)
        // and the victim's VPN (v_vpn).  Using these two attributes of the victim,
        // we obtain the address on disk to save the data to.
        v_da = swapmap(v_proc, v_vpn);

        // 1. Save the victim's data to disk: from memory (ppn) to disk (v_da).
        // 2. Read our data from disk: from disk (da) to memory (ppn).
        // 3. Update the victim's page table to reflect that its page is now on disk.
        // 4. Update the current process's page table to reflect that its page is now in memory.
    }
}
Note however that, the way our program is written, there is a chance of a race condition!
What if steps 1 and 2 complete, and then there is a transfer of control to the victim?
The victim's page table still claims that v_vpn is in memory, even though the frame (ppn) now actually holds the current process's page.
The victim can then read another process's data, which is a major security loophole!
The solution to this race condition is to perform step 3 immediately after step 1 (i.e. before step 2), leaving step 4 after step 2, for the order 1, 3, 2, 4.
This way, before the running process's data is placed in the frame, the victim's page table already records that the page in memory is no longer its own.

Page Fault Optimizations:

Since page faulting is a very slow process, the following are methods to speed this up:

  1. Demand Paging:

    On start up, just read the program's 1st page into RAM. Don't wait to load all the remaining pages.
    + Improved wait time (since we get started really quickly).
    - More page faults during execution. This will likely reduce our throughput and turnaround time.

    Here are some quick calculations to test how useful demand paging is:
    N = # pages in program
    U = # pages actually used in this run. 0 <= U <= N
    C = cost of reading a page (excludes seek / rotational delay)
    F = cost of a page fault (trap and bookkeeping overhead)

                          Total Cost       Latency Cost
      Demand Paging       UC + (U-1)F      C
      No Demand Paging    NC               NC

    Thus we can see that the latency with demand paging is always lower than without, but the total cost can go higher when U and F are relatively high.
  2. Dirty Bit:

    Using a dirty bit, we keep track of whether a page in RAM differs from what's on disk. If it does, we set the dirty bit to 1; otherwise it is 0.
    Thus when kicking a victim out, we don't write the victim to disk if the dirty bit == 0.
    + Saves a write to disk in many cases.
    - Added Complexity.

    Implementing the dirty bit:
    Set the write bit 'w' of the page to 0, even if the page is writable. When a process tries to write to the page, the hardware raises an exception and the kernel gains control.
    The kernel keeps track in software of whether the page is really writable. If it is, it sets w to 1 and returns control.
    Therefore w == 1 exactly when the page has been written to, i.e. the dirty bit is set.
    This slows the first write substantially, but subsequent writes are all at full speed.

Choosing the Victim

Suppose the app accesses pages at random, then any policy will do! (even a policy like "Always choose page 1 as the victim page")
This is because the previous accesses tell us absolutely nothing of future accesses, which are completely random.

Typical programs however don't access pages at random. A more reasonable assumption is that a program is likely to access again a page it accessed in the recent past. With this assumption, we are going to test some replacement policies.
To compare the policies, assume that our system has 3 physical pages available in memory.
Also assume a trace (i.e. a sequence of virtual page accesses) as follows:
0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4.

First-In-First-Out (FIFO)
In this replacement policy, the first virtual page to be stored in memory will be the first to leave. Testing it on our sample trace, we get:

Access:           0  1  2  3  0  1  4  0  1  2  3  4
Physical Page 1:  0  0  0  3  3  3  4  4  4  4  4  4
Physical Page 2:     1  1  1  0  0  0  0  0  2  2  2
Physical Page 3:        2  2  2  1  1  1  1  1  3  3

Counting the number of page faults here, we see that there are 9 page faults.

How can we reduce the number of page faults?
- A standard hardware approach would involve adding more memory. Therefore, let's increase the number of physical pages from 3 to 4.
Testing it on our sample trace, we get:

Access:           0  1  2  3  0  1  4  0  1  2  3  4
Physical Page 1:  0  0  0  0  0  0  4  4  4  4  3  3
Physical Page 2:     1  1  1  1  1  1  0  0  0  0  4
Physical Page 3:        2  2  2  2  2  2  1  1  1  1
Physical Page 4:           3  3  3  3  3  3  2  2  2

Counting the number of page faults here, we see 10.

The number of page faults increased from 9 to 10, even though we added more memory! This is called Belady's Anomaly.

Let us now try another replacement policy. Instead of replacing the first page that entered memory, we will replace the page that was least recently used.

Least Recently Used (LRU)
As the name of the policy implies, we replace the least recently used physical page. Testing out this policy on our trace with 3 physical pages, we get:

Access:           0  1  2  3  0  1  4  0  1  2  3  4
Physical Page 1:  0  0  0  3  3  3  4  4  4  2  2  2
Physical Page 2:     1  1  1  0  0  0  0  0  0  3  3
Physical Page 3:        2  2  2  1  1  1  1  1  1  4

Counting the number of page faults here, we get 10 faults.

Even though in this case LRU gives us more page faults than FIFO, LRU in general gives fewer faults than FIFO.
LRU is however more complicated to implement than FIFO. We could implement it in hardware by keeping some bits that track which page was least recently used.
However the hardware guys most likely won't comply, so we should try and keep track of this in software. We can do this by following these steps:

  1. Keep a clock entry for each Physical Page in memory.
  2. Make it so that each access to a Virtual Page results in a page fault (by setting all permission bits to 0 in hardware, while keeping track of the real permissions in software).
  3. When the kernel receives control after the page fault, it checks whether the virtual page exists in a physical page in memory:
    a. If it does, it updates the physical page's clock entry, and gives the process access to the virtual page with the appropriate permissions.
    b. If it doesn't, it looks at the clock entries for all the physical pages, evicts the physical page that has the oldest clock entry, grants this physical page to the new virtual page, and updates the clock entry.

We've seen how these replacement policies perform compared to each other, but how do they perform in an absolute sense?
Let's compare these policies to an ideal policy, a policy which "consults the Oracle of Delphi" to see into the future and predict which page will be needed next.
At best, we can get:

Access:           0  1  2  3  0  1  4  0  1  2  3  4
Physical Page 1:  0  0  0  0  0  0  0  0  0  2  2  2
Physical Page 2:     1  1  1  1  1  1  1  1  1  3  3
Physical Page 3:        2  3  3  3  4  4  4  4  4  4
Counting the number of page faults, we see that there are 7 page faults.

Therefore for this particular case, the best our policy can do is have 7 page faults.

Distributed Systems & RPC

Remote Procedure Calls are to Distributed Systems as System Calls are to the Kernel.
For example, the system call open() corresponds to the INT 0x80 trap into the kernel.

We have a similar abstraction for RPCs & Distributed Systems, BUT there are some differences:

Resolving Endianness:

Marshaling:
In military terminology, marshaling is the process of organizing troops in an ordered manner.
Similarly in computer science terminology, marshaling is the process of organizing the bytes to be transmitted in a particular order. It is done at the client's end.
Unmarshaling refers to the process of reorganizing these ordered bytes into a format that the server understands. It is done at the server's end.

Almost all little-endian machines (e.g. Intel x86) have a byte-swap machine instruction (BSWAP on x86) that converts data between little and big endian (since big endian is the standard for transmitting data).

Stubs:
Stubs are pieces of code that are used to "glue together" the client app to the network.
They convert the parameters into a standardized form (e.g. big endian). They can be written manually but that's unnecessarily tedious, and are instead often generated automatically from the protocol description. Example: RPCGEN.
