Virtual Memory and Processes

CS 111 Lecture 16

Wednesday, March 2, 2016

Scribe: Emil Shallon

Virtual Page Tables

Suppose you are using virtual memory and are setting up a system to manage your pages. Addresses on our machine are 32 bits; how should we set up the addressing for data?

  1. Single Level Page Table

    We'll cut our address into two pieces: 20 bits as a virtual page number and the remaining 12 bits for the page offset. Each entry in the page table maps to a page in memory, and the offset shows where on that page our data is.

    Single Level Page Table Diagram

    So far, so good. Our pages are 2^12 bytes each, or 4 KiB. This is a good size for each page. This design also gives us 2^20 entries in our page table.
    Each process has its own page table. Each page table entry is 32 bits, so each process has to set aside 2^20 * 4 bytes = 4 MiB just for its page table. This is too much; we have to do better.

  2. Multi Level Page Table

    We can improve on the first design by decreasing the size of each page table. We'll attempt a multilevel approach: the first 10 bits index a top-level page table, whose entries each point to a lower-level table. The next 10 bits select the entry in that second table. As before, the last 12 bits are the offset within the page where the data is located (see the sketch after this list).

    Multi-Level Page Table Diagram

    Much less space now has to be set aside for each process: lower-level tables are allocated only for the regions of the address space the process actually uses, so a process with a sparse address space no longer pays the full 4 MiB.
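
To make the split concrete, here is a sketch of the two-level lookup in software (a minimal sketch: the PTE layout, the names, and the "present" bit are our assumptions, and real hardware performs this walk itself):

    #include <stdint.h>
    #include <stdlib.h>
    #include <stdio.h>

    /* Hypothetical 10/10/12 walk. A PTE's low bit means "present";
       its top 20 bits are the physical frame number. */
    static uint32_t *top_level[1 << 10];   /* 2^10 lower-level table ptrs */

    /* Returns the physical address for vaddr, or 0 if unmapped. */
    uint32_t translate(uint32_t vaddr) {
        uint32_t hi  = vaddr >> 22;            /* top 10 bits        */
        uint32_t lo  = (vaddr >> 12) & 0x3FF;  /* next 10 bits       */
        uint32_t off = vaddr & 0xFFF;          /* 12-bit page offset */

        uint32_t *lower = top_level[hi];
        if (lower == NULL)
            return 0;                      /* no lower table: page fault   */
        uint32_t pte = lower[lo];
        if (!(pte & 1))
            return 0;                      /* page not present: page fault */
        return (pte & ~0xFFFu) | off;      /* physical frame | offset      */
    }

    int main(void) {
        /* Map virtual page 0x12345 (hi = 0x048, lo = 0x345) to frame 7. */
        top_level[0x048] = calloc(1 << 10, sizeof(uint32_t));
        top_level[0x048][0x345] = (0x00007u << 12) | 1;
        printf("0x12345678 -> 0x%08x\n", translate(0x12345678));
        return 0;
    }

Note where the savings come from: the top-level array is small, and a lower-level table is only allocated once some address in its 4 MiB region is actually used.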

Page Faults

When the desired page is not in memory, the result is a page fault. A page fault can be triggered by an ordinary instruction like movq $0, (%rsp).
This instruction will trap and fall back into the OS. The kernel is informed of the faulting address and can take one of three approaches:

  1. Terminate the faulting process.
  2. Send the process a signal (such as SIGSEGV), which it may catch.
  3. Fix the problem: bring the missing page into RAM and retry.

Upon taking the third approach, the proper data page is pulled into RAM from swap space on the disk. The faulting instruction is then retried and the process continues.

Guard Pages

Guard pages are pages in memory that cannot be accessed. They have no content and exist solely to protect the stack: they are holes in virtual memory, and any access to them traps.
Guard pages are also placed at the ends of big arrays to protect the data around them.

Use of Guard Pages

While they are useful, guard pages come at a cost: each one is a page of the address space that can never be read or written, and is effectively thrown away.
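
A process can create such a hole by hand with mmap and PROT_NONE (a minimal sketch; the single-page size and the anonymous mapping are our choices, not from the lecture):

    #include <sys/mman.h>
    #include <stdio.h>

    int main(void) {
        /* Reserve one page with no permissions: any load or store into it
           traps, so it acts as a guard page. */
        void *guard = mmap(NULL, 4096, PROT_NONE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (guard == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        printf("guard page at %p\n", guard);
        /* *(char *)guard = 1;   uncommenting this would deliver SIGSEGV */
        return 0;
    }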

Handling Page Faults

Where does all of this happen? Page tables are held in kernel memory. The kernel has a function swapmap(process p, virtual page number vpn) that gives the address of the page's slot in swap space on disk.
If we get a page fault on a valid address, it means that we ran out of memory and this page happens to be on disk in swap space, not in physical memory. We can read in the page using swapmap, but in order to keep it in memory, we'll need to kick out another page and send it back to disk. To bring this new page into memory, we must:

  1. Choose a victim page
  2. Write the victim page back to swap space
  3. Update the victim's page table entry to be null
  4. Read in the new page from swap space on disk into the victim page's location in memory

This writing takes a lot of time, on the order of milliseconds. To avoid wasting that time, the kernel lets other threads run in parallel with the copying.
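
Putting the four steps together, the handler's logic looks roughly like the toy simulation below (entirely a sketch: one-byte "pages", an in-memory swap_space array standing in for the disk slots swapmap would locate, and a trivial victim++ policy):

    #include <string.h>
    #include <stdio.h>

    enum { NFRAMES = 3, NPAGES = 5 };

    char ram[NFRAMES];              /* physical memory                    */
    char swap_space[NPAGES];        /* what swapmap(p, vpn) would locate  */
    int  frame_of[NPAGES];          /* page table: frame number or -1     */
    int  page_in_frame[NFRAMES];    /* reverse map: vpn occupying a frame */

    int choose_victim(void) {       /* replacement policy goes here */
        static int next = 0;
        return next++ % NFRAMES;    /* victim++ for simplicity      */
    }

    void page_fault(int vpn) {
        int f = choose_victim();                 /* 1. choose a victim      */
        int victim = page_in_frame[f];
        if (victim != -1) {
            swap_space[victim] = ram[f];         /* 2. write victim to swap */
            frame_of[victim] = -1;               /* 3. null out its PTE     */
        }
        ram[f] = swap_space[vpn];                /* 4. read in the new page */
        frame_of[vpn] = f;
        page_in_frame[f] = vpn;
    }

    int main(void) {
        memset(frame_of, -1, sizeof frame_of);
        memset(page_in_frame, -1, sizeof page_in_frame);
        page_fault(0);
        page_fault(1);
        printf("page 0 lives in frame %d\n", frame_of[0]);
        return 0;
    }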

Page Replacement Policies

Now that we've figured out that we need to remove a page, we need to determine a policy to pick the victim page. Here are some options:

  1. victim++ policy (FIFO)
    • Advance the victim frame number by one every time we need to evict a page.
    • We sweep through physical memory in order, evicting each resident page in turn.
    • Every page takes its turn; this policy is extremely fair.
  2. Random number policy
    • Victim is chosen randomly by calling random(number_of_physical_pages).

These first two policies are workable but don't offer great performance: they can easily evict pages essential to other processes, creating a snowballing slowdown as those processes read their data back in. Let's consider these next policies instead.

  1. Oracle policy
    • Using an oracle, find the page that will be least needed in the future.
    • Unfortunately, operating systems cannot tell the future and cannot predict what pages exactly will be needed coming up, so this policy is unrealizable.
    • If the OS could tell the future, this problem, along with other problems like the halting problem, would become easy (and most of us would be out of jobs).
  2. madvise policy
    • madvise(void *start, size_t length, int advice) is a system call that processes can use to help the kernel.
    • The process gives the kernel hints about its usage so that the kernel can pick a better victim page (see the sketch after this list).
    • The arguments are the starting virtual address of a region, the length of that region, and the advice flag to be given.
    • Some advice flag options include:
      • MADV_SEQUENTIAL means that the program will run through the data sequentially
      • MADV_RANDOM means that the program will access the data in a random order. This gives the kernel little to go on.
      • MADV_NORMAL means no special advice applies (the default).
  3. Least Recently Used policy
    • Guess which page will be least needed in the future by choosing which was least recently used.
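
Here is what the madvise hint from item 2 looks like in practice (a minimal sketch; the anonymous mapping is just a stand-in for a large data set the process owns):

    #include <sys/mman.h>
    #include <stdio.h>

    int main(void) {
        size_t len = 64 * 4096;
        /* An anonymous region standing in for a large data set. */
        void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) { perror("mmap"); return 1; }

        /* Hint: we will walk this region from start to end, so the kernel
           can prefetch ahead of us and evict behind us. */
        if (madvise(buf, len, MADV_SEQUENTIAL) != 0)
            perror("madvise");

        for (size_t i = 0; i < len; i++)
            ((char *)buf)[i] = 1;           /* the promised sequential pass */
        return 0;
    }
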
Page Replacement Policy Example

5 virtual pages: 0, 1, 2, 3, and 4
3 physical pages: A, B, and C
Pages are referenced in the order: 0 1 2 3 0 1 4 0 1 2 3 4
Δ denotes a page fault

  1. FIFO
    Access   0   1   2   3   0   1   4   0   1   2   3   4
    Page A  Δ0   0   0  Δ3   3   3  Δ4   4   4   4   4   4
    Page B      Δ1   1   1  Δ0   0   0   0   0  Δ2   2   2
    Page C          Δ2   2   2  Δ1   1   1   1   1  Δ3   3
    Total: 9 page faults
  2. Oracle
    Access   0   1   2   3   0   1   4   0   1   2   3   4
    Page A  Δ0   0   0   0   0   0   0   0   0  Δ2   2   2
    Page B      Δ1   1   1   1   1   1   1   1   1  Δ3   3
    Page C          Δ2  Δ3   3   3  Δ4   4   4   4   4   4
    Total: 7 page faults
  3. Least Recently Used
    Access   0   1   2   3   0   1   4   0   1   2   3   4
    Page A  Δ0   0   0  Δ3   3   3  Δ4   4   4  Δ2   2   2
    Page B      Δ1   1   1  Δ0   0   0   0   0   0  Δ3   3
    Page C          Δ2   2   2  Δ1   1   1   1   1   1  Δ4
    Total: 10 page faults
  4. FIFO with 4 physical pages
    Access   0   1   2   3   0   1   4   0   1   2   3   4
    Page A  Δ0   0   0   0   0   0  Δ4   4   4   4  Δ3   3
    Page B      Δ1   1   1   1   1   1  Δ0   0   0   0  Δ4
    Page C          Δ2   2   2   2   2   2  Δ1   1   1   1
    Page D              Δ3   3   3   3   3   3  Δ2   2   2
    Total: 10 page faults

There are some interesting results in this example. As expected, the Oracle policy had the fewest page faults and thus performed best. Second best was FIFO, then LRU. We would expect LRU to perform better, since it uses past information to predict the future, but it does not in this example. On larger workloads and real machines, LRU almost always outperforms FIFO.
The fourth example is also interesting: although we give the system more memory to use, there are actually more page faults. This is called Belady's Anomaly. It can happen with most policies, but most often with FIFO. Question for the reader: "Why can this anomaly happen with software and virtual memory caching, but not for hardware caching?"
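
These counts are easy to check mechanically. The small simulator below (our own code, not from the lecture) reproduces the 9 faults for FIFO with 3 frames; setting NFRAMES to 4 reproduces the anomaly's 10:

    #include <stdio.h>

    #define NFRAMES 3   /* set to 4 to see Belady's anomaly */

    int main(void) {
        int refs[] = {0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4};
        int n = sizeof refs / sizeof refs[0];
        int frames[4] = {-1, -1, -1, -1};
        int next = 0, faults = 0;

        for (int i = 0; i < n; i++) {
            int hit = 0;
            for (int f = 0; f < NFRAMES; f++)
                if (frames[f] == refs[i]) { hit = 1; break; }
            if (!hit) {
                frames[next] = refs[i];   /* the pointed-to frame always
                                             holds the oldest page        */
                next = (next + 1) % NFRAMES;
                faults++;
            }
        }
        printf("%d page faults\n", faults);
        return 0;
    }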

Implementing Least Recently Used

Least Recently Used is one of the most common policies used. It is easy to understand and works well. So how do we implement it? The first thought is to store the time of the most recent access in each page table entry and simply pick the smallest value. While this is a good idea, updating a timestamp on every reference and searching through all PTEs on every eviction would slow memory accesses considerably.
Instead, the following approach is taken: most hardware sets an "accessed" bit in a page's PTE whenever the page is referenced, and the kernel periodically clears these bits. A page whose bit is still clear has not been used recently and makes a good victim.

An alternative approach is to keep an additional array (independent of the hardware) with an entry for each page: if the page has been accessed since the last reset, mark it with a 1; if it hasn't, mark it with a 0. When we need to pick a victim, select one of the pages marked 0, since we know it hasn't been accessed since the last reset.
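
A minimal sketch of that bookkeeping (the names, the frame count, and the reset timing are all our choices; real kernels refine this idea into the "clock" algorithm):

    #include <stdio.h>

    enum { NFRAMES = 8 };

    unsigned char accessed[NFRAMES];   /* 1 = touched since last reset */

    /* Called (conceptually) on every reference to the page in frame f. */
    void touch(int f) { accessed[f] = 1; }

    /* Called periodically: forget the old access history. */
    void reset(void) {
        for (int f = 0; f < NFRAMES; f++)
            accessed[f] = 0;
    }

    /* Pick a frame whose page has not been accessed since the last reset;
       fall back to frame 0 if every page has been touched. */
    int choose_victim(void) {
        for (int f = 0; f < NFRAMES; f++)
            if (!accessed[f])
                return f;
        return 0;
    }

    int main(void) {
        reset();
        touch(0); touch(1); touch(3);
        printf("victim: frame %d\n", choose_victim());  /* frame 2 */
        return 0;
    }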

Virtual Memory Optimizations

The following techniques can be used to optimize our approach to Virtual Memory.

Distributed Systems

The next big concept in this course is distributed systems. So far, we have only considered examples where our entire system was on the same machine. Now we expand to consider examples where components are distributed on servers across the world.
We also introduce Remote Procedure Calls (RPCs). These are just like the typical system calls we've seen before, but extended to this new scenario: the network connects the caller and the callee. These new calls differ from ordinary calls in a few main areas:

  • There is no shared memory, so there is no call by reference; arguments must be copied ("marshaled") into messages.
  • The caller and callee may run on different machines, with different architectures and data formats.
  • Either side, or the network between them, can fail independently of the other.

To applications, RPCs look like ordinary system calls: there is an additional RPC layer underneath the system call layer. All RPCs therefore incur additional overhead for shipping requests and responses over the network.

RPC Layer beneath system calls
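
To make the layering concrete, here is a toy sketch of a client-side stub (entirely our own illustration, not a real RPC system): it marshals a hypothetical add(a, b) call into bytes the way the RPC layer would before shipping them over the network, and a server-side routine unmarshals and executes it:

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    /* Wire format for the hypothetical add(a, b) call. */
    struct request { uint32_t opcode; int32_t a, b; };

    /* Client-side stub: looks like an ordinary call, but marshals its
       arguments into a byte buffer. In a real system the buffer would be
       written to a socket here. */
    size_t marshal_add(int32_t a, int32_t b, unsigned char *buf) {
        struct request r = { 1 /* ADD */, a, b };
        memcpy(buf, &r, sizeof r);
        return sizeof r;
    }

    /* Server-side dispatch: unmarshals the buffer and performs the call. */
    int32_t serve(const unsigned char *buf) {
        struct request r;
        memcpy(&r, buf, sizeof r);
        return r.a + r.b;      /* opcode 1: add */
    }

    int main(void) {
        unsigned char wire[sizeof(struct request)];
        marshal_add(2, 3, wire);          /* ...network transfer here... */
        printf("add(2, 3) = %d\n", serve(wire));
        return 0;
    }

Note that copying the struct verbatim only works because both "ends" here share byte order and struct layout; real RPC layers define a machine-independent wire format precisely because caller and callee may be different machines.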

There are a lot of ways that these RPCs can fail: messages can be lost, messages can be corrupted, the network could be down, the network could be slow, the server you're trying to contact might be down, or that server might be slow. The worst part is that if an RPC fails, there's (so far) no way to determine why.