Scribes: David Woo and Jonathan Tengco
Problem: Buggy programs with bad memory references.
Solutions:
* Hire better programmers.
* Check programs for bad memory references at runtime.
Unfortunately, both of these solutions incur costs for people developing applications, either in terms of money to hire better programmers or in terms of performance lost from checking programs for bad memory references at runtime.
Problems:
Solution: Make the base/bounds registers privileged; any attempt by a user program to change them results in a trap.
Solution: Use multiple base/bounds pairs (segments) for each process. The number of pairs is usually small and is limited by the hardware.
Example: The traditional (circa 1978) Unix model of memory.
Each process gets a segment table, and each entry describes a region of memory that the process can access, along with the permissions that process has for that segment (r/w/x). Some combinations of permissions (r, rx, rw, rwx) are useful; some (w, wx, x, none) are not. Base/bounds pairs can be remapped using the system call map_domain(base, size, permissions...). Note that not all hardware supports all combinations of permissions: before 2003, Intel chips had no separate x permission, and having read permission implied having execute permission.
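A minimal sketch in C of what one segment-table entry and its permission check might look like (the struct, field, and macro names here are illustrative assumptions, not part of the lecture):

#include <stddef.h>
#include <stdint.h>

#define PERM_R 4    /* read */
#define PERM_W 2    /* write */
#define PERM_X 1    /* execute */

/* One entry in a process's segment table (illustrative layout). */
struct segment {
    uintptr_t base;      /* start of the region in physical memory */
    size_t    bounds;    /* size of the region in bytes */
    unsigned  perms;     /* some combination of PERM_R | PERM_W | PERM_X */
};

/* Does this segment allow an access of the given kind at addr? */
int segment_allows(const struct segment *seg, uintptr_t addr, unsigned needed)
{
    return addr >= seg->base
        && addr - seg->base < seg->bounds
        && (seg->perms & needed) == needed;
}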
Downsides:
* Usually.
Advantages:
The last advantage was the original selling point of virtual memory: physical memory was used as a cache for disk. Pages that couldn't fit in RAM were stored on disk until they were needed; when the program tried to access a page that wasn't in RAM, a trap occurred and the needed page was loaded from disk. However, this led to a phenomenon called "thrashing" when a high proportion of memory accesses needed to load a page off of disk. When this occurred, the program would slow to a halt because of the long wait times involved in accessing disk. This led to the phrase "If you thrash, you're dead."
We need hardware help in order to implement virtual memory.
The simplest implementation: linear page table hardware support.
Page table lookup (as modeled in C; vpn = virtual page number)
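A minimal sketch of that lookup, assuming PAGE_TABLE is an array indexed by virtual page number, OFFSET_BITS is the width of the page offset, and FAULT is the special "unmapped" value (all three names are assumptions):

#include <stddef.h>

#define OFFSET_BITS 12                /* assuming 4 KiB pages */
#define FAULT ((size_t) -1)           /* special value for an unmapped page */

size_t PAGE_TABLE[1 << 20];           /* one entry per virtual page number */

size_t pmap(size_t vpn)
{
    size_t entry = PAGE_TABLE[vpn];
    if (entry == FAULT)
        return FAULT;                 /* page fault: nothing is mapped at vpn */
    return entry << OFFSET_BITS;      /* shift by the offset width to get the
                                         physical address of the page frame */
}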
Note: the number of bits that PAGE_TABLE[vpn] is shifted depends on how large the offset is. This function should also return a special value when it tries to look up an unmapped page number (when it page faults).
Problems:
Solutions:
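One common fix is a multilevel page table, which is also what these notes assume later when they mention 2-level page tables under fork(). A sketch of a two-level lookup; the 10/10/12 bit split and the names are assumptions:

#include <stddef.h>

#define LO_BITS 10                    /* bits that index a level-0 (leaf) table */
#define FAULT ((size_t) -1)

size_t *LEVEL1_TABLE[1 << 10];        /* level-1 entries point to level-0 tables */

size_t pmap2(size_t vpn)
{
    size_t hi = vpn >> LO_BITS;                  /* index into the level-1 table */
    size_t lo = vpn & ((1u << LO_BITS) - 1);     /* index into a level-0 table */
    size_t *l0 = LEVEL1_TABLE[hi];
    if (l0 == NULL)
        return FAULT;        /* the whole region is unmapped: no level-0 table exists */
    return l0[lo];           /* the entry itself, or FAULT if that one page is unmapped */
}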
When the virtual memory manager gets a request involving a virtual address that does not resolve to a physical address in RAM (e.g. the data it refers to is still on disk), a trap will occur. The kernel will get the virtual page number from the request, as well as the type of request and the context in which the request was made. It will then do one of two things: kill the process (if the address is simply invalid), or swap in the needed page and install a mapping for it.
Installing a mapping:
int swapmap(proc, va) {
    return disk address of proc's page va, or FAIL if va is not valid for proc;
}

void pfault(va, atype, context) {
    if (swapmap(current, va) == FAIL)
        kill(current, SIGSEGV);              // invalid access: no backing store for va
    else {                                   // select a page to swap out
        (p, va') = removalpolicy();
        pa = p->pmap(va');                   // physical page currently holding the victim
        write pa to disk at swapmap(p, va');
        p->pmap(va') = FAULT;                // the victim's owner will now fault on va'
        read pa from disk at swapmap(current, va);
        current->pmap(va) = pa;              // install the new mapping
    }
}
Problems:
Some parts of the kernel (the page-fault handler itself, for example) are never paged out; they are pinned in physical memory. Pinning pages can be useful for real-time applications as well.
Which page should removalpolicy() choose? It depends. If access is random, then any algorithm will do. However, if we take advantage of locality of reference (closeness in time/space; see the past lecture about file system efficiency for examples), then we can use clever algorithms to reduce the number of times we page fault in the future.
Example: FIFO, with three physical pages A, B, and C (a Δ indicates that a page fault occurred while processing the request)
Requested page | 0 | 1 | 2 | 3 | 0 | 1 | 4 | 0 | 1 | 2 | 3 | 4 |
A | Δ0 | 0 | 0 | Δ3 | 3 | 3 | Δ4 | 4 | 4 | 4 | 4 | 4 |
B | - | Δ1 | 1 | 1 | Δ0 | 0 | 0 | 0 | 0 | Δ2 | 2 | 2 |
C | - | - | Δ2 | 2 | 2 | Δ1 | 1 | 1 | 1 | 1 | Δ3 | 3 |
Result: 9 page faults.
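A small C program (a sketch, not from the lecture) that replays this reference string through FIFO replacement with three frames and counts the faults; it reports the same 9:

#include <stdio.h>

int main(void)
{
    int refs[] = {0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4};
    int nrefs = sizeof refs / sizeof refs[0];
    int frames[3] = {-1, -1, -1};       /* -1 means the frame is still empty */
    int next_victim = 0;                /* FIFO: evict frames in the order they were filled */
    int faults = 0;

    for (int i = 0; i < nrefs; i++) {
        int hit = 0;
        for (int f = 0; f < 3; f++)
            if (frames[f] == refs[i])
                hit = 1;
        if (!hit) {
            frames[next_victim] = refs[i];          /* load the page over the oldest frame */
            next_victim = (next_victim + 1) % 3;
            faults++;
        }
    }
    printf("%d page faults\n", faults);             /* prints 9 */
    return 0;
}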
Example: Buy more memory (a fourth physical page, D) and still use FIFO.
Requested page | 0 | 1 | 2 | 3 | 0 | 1 | 4 | 0 | 1 | 2 | 3 | 4 |
A | Δ0 | 0 | 0 | 0 | 0 | 0 | Δ4 | 4 | 4 | 4 | Δ3 | 3 |
B | - | Δ1 | 1 | 1 | 1 | 1 | 1 | Δ0 | 0 | 0 | 0 | Δ4 |
C | - | - | Δ2 | 2 | 2 | 2 | 2 | 2 | Δ1 | 1 | 1 | 1 |
D | - | - | - | Δ3 | 3 | 3 | 3 | 3 | 3 | Δ2 | 2 | 2 |
Result: 10 page faults. This is an example of Belady's anomaly: adding RAM actually increased the number of page faults.
Example: Optimal page replacement algorithm (replace the page that won't be needed for the longest time; requires prescience)
Requested page | 0 | 1 | 2 | 3 | 0 | 1 | 4 | 0 | 1 | 2 | 3 | 4 |
A | Δ0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Δ2 | 2 | 2 |
B | - | Δ1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | Δ3 | 3 |
C | - | - | Δ2 | Δ3 | 3 | 3 | Δ4 | 4 | 4 | 4 | 4 | 4 |
Result: 7 page faults.
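A sketch (again not from the lecture) of the victim choice the optimal algorithm makes: among the pages currently in frames, evict the one whose next reference lies furthest in the future, or that is never referenced again:

/* Returns the index of the frame whose page is needed furthest in the future. */
int choose_victim(int frames[], int nframes, int refs[], int nrefs, int now)
{
    int victim = 0, furthest = -1;
    for (int f = 0; f < nframes; f++) {
        int next = nrefs;                           /* "never referenced again" */
        for (int t = now + 1; t < nrefs; t++)
            if (refs[t] == frames[f]) {
                next = t;
                break;
            }
        if (next > furthest) {
            furthest = next;
            victim = f;
        }
    }
    return victim;
}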
Example: Least recently used (this is a commonly-used algorithm)
Requested page | 0 | 1 | 2 | 3 | 0 | 1 | 4 | 0 | 1 | 2 | 3 | 4 |
A | Δ0 | 0 | 0 | Δ3 | 3 | 3 | Δ4 | 4 | 4 | Δ2 | 2 | 2 |
B | - | Δ1 | 1 | 1 | Δ0 | 0 | 0 | 0 | 0 | 0 | Δ3 | 3 |
C | - | - | Δ2 | 2 | 2 | Δ1 | 1 | 1 | 1 | 1 | 1 | Δ4 |
Result: 10 page faults.
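A sketch of LRU on the same reference string (not from the lecture): record the time of each frame's last use and evict the frame whose last use is oldest. Replaying the string with three frames gives the 10 faults shown above:

#include <stdio.h>

int main(void)
{
    int refs[] = {0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4};
    int nrefs = sizeof refs / sizeof refs[0];
    int frames[3] = {-1, -1, -1};       /* -1 means the frame is still empty */
    int last_used[3] = {-1, -1, -1};    /* time of each frame's most recent use */
    int faults = 0;

    for (int t = 0; t < nrefs; t++) {
        int target = -1;
        for (int f = 0; f < 3; f++)
            if (frames[f] == refs[t])
                target = f;                          /* hit */
        if (target < 0) {                            /* miss: evict the least recently used frame */
            target = 0;
            for (int f = 1; f < 3; f++)
                if (last_used[f] < last_used[target])
                    target = f;
            frames[target] = refs[t];
            faults++;
        }
        last_used[target] = t;
    }
    printf("%d page faults\n", faults);              /* prints 10 */
    return 0;
}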
Other optimizations
When a program starts, only bring in the first page that it needs. If the program doesn't need to use all of its pages right away, it can start doing useful work immediately (this improves response time).
Keeping track of which pages have been modified (a dirty bit) can save you from having to do unnecessary writes back to disk. There are several ways to implement this:
fork() copies page tables, which can be slow, so don't copy them until they are needed. In a 2-level page table, mark the level-0 page tables read-only in both the child and the parent. This way, both the child and the parent will fault if they try to write, which gives the kernel an opportunity to copy the needed page (a rough sketch of such a copy-on-write fault handler follows below).
vfork() works in a similar manner, except the parent is frozen until the child exits or calls exec().
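A rough sketch, in the same pseudocode style as pfault() above, of what the kernel might do on a write fault to one of these copy-on-write pages (the details are assumptions, not from the lecture):

void cowfault(va) {
    old = current->pmap(va);             // physical page currently shared with the other process
    new = allocate a free physical page;
    copy the contents of old into new;
    current->pmap(va) = new, with write permission restored;
    // the other process keeps the original page; once only one process
    // still references it, its read-only marking can simply be dropped
}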