CS 111 Lecture 14: Virtual Memory
March 4, 2013
by Sidney Chan
Our Next Problem: Unreliable processes with bad memory
references
- subscript errors
- dereferencing bad pointers (null, uninitialized, freed, etc.)
We've had these problems since the 1950s.
Some possible solutions include:
- Fire the programmers! (This doesn't scale.) Nobody should be
writing code like this so don't do it. Unfortunately, these
problems are hard to solve.
- Add runtime checks to your code, such as runtime subscript
checking. (This is fairly popular; the JVM does subscript
checking and throws an exception if you reference outside the
array bounds. A sketch of the idea in C follows.)
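As a sketch of the idea in C (checked_get is a hypothetical
helper shown only for illustration; a real checking runtime, or
the JVM, inserts the equivalent test automatically):

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper a bounds-checking runtime might insert before
   every a[i]; 'n' is the array's known length. */
static int checked_get(const int *a, size_t n, size_t i)
{
    if (i >= n) {
        fprintf(stderr, "subscript error: index %zu, length %zu\n", i, n);
        abort();               /* the JVM would throw an exception instead */
    }
    return a[i];
}

int main(void)
{
    int a[4] = {1, 2, 3, 4};
    printf("%d\n", checked_get(a, 4, 2));   /* fine */
    printf("%d\n", checked_get(a, 4, 9));   /* caught at run time */
    return 0;
}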
Why can't we just do runtime checks?
- Performance issues: We may want to play with pointers for
performance reasons, and also runtime checking is slow.
- Trust issues: We have to know if a program is written in
machine code, or if runtime checking was turned off by
attackers.
- A workaround: If the compiler is part of the OS, we can run
only programs that the OS itself compiled (as in the Burroughs
B5500 and its successors).
We need help from the hardware to do subscript checking, so the
hardware will check the memory references.
Simple approach: 2 extra registers, base and bounds
- base points at the start of the part of memory you're allowed
to use right now
- bounds points to the byte just after the last one you can use
We have a hardware check for each memory access. If you access
outside your bounds you trap and the kernel can decide what to do.
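In software terms, the check the hardware makes on every access
might look like the following sketch; base, bounds, and trap()
here are only stand-ins for the real registers and trap
mechanism.

#include <stdlib.h>

static unsigned long base, bounds;    /* stand-ins for the two registers */

static void trap(void) { abort(); }   /* in reality, control enters the kernel */

/* The check the hardware applies to every memory access. */
static void check_access(unsigned long addr)
{
    if (addr < base || addr >= bounds)
        trap();    /* out of bounds: the kernel decides what to do */
}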
Problems with this simple approach:
- You must have just 1 region. (Possible solution: extend our
machine to have multiple base-bounds pairs.)
- Programs must fit into RAM; we can't have virtual memory.
Suppose you're running a sort program and you want to run two
instances at the same time:
- ip1 is at address 0x10000, ip2 at 0x12000
- Programs cannot contain absolute addresses: the second instance
is a copy of the same sort code, so it would still contain a jump
to 0x10000, which is inside the first instance. Relative
addresses will work.
To fix the absolute-address pointer, you can:
- Change the way address arithmetic works (hardware) so that the
base is added to every address before it is used. This will work
(see the sketch after this list).
- Mutate the program as you load it into RAM: the copy on disk
has absolute addresses but you change them to relative addresses
as you read the code.
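A sketch of the hardware fix, combining relocation with the
bounds check (the names are again stand-ins, not real hardware
interfaces):

#include <stdlib.h>

static unsigned long base, bounds;    /* privileged base-bounds registers */

static void trap(void) { abort(); }   /* stand-in for entering the kernel */

/* Treat every address the program issues as an offset from base,
   and refuse anything that lands at or past bounds. */
static unsigned long translate(unsigned long vaddr)
{
    unsigned long paddr = base + vaddr;   /* relocation: add the base */
    if (paddr >= bounds)
        trap();
    return paddr;
}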
Suppose the program changes the base or bounds.
- Solution: Have these registers set as privileged either by
setting special privileged instructions or by making accesses to
these registers be privileged.
Suppose you want processes to share memory.
If you have only 2 processes you can pull this off, but this won't
scale to larger numbers.
- Solution: have multiple base-bounds pairs.
- add a system call to create a "domain" (a sketch of a possible
interface appears after the permission notes below)
map_domain( [size], [permissions] )
x (execute)
rx (read-execute)
w (we can have this, but it's unusual)
_ (for kernel use only)
rwx
...
- These permissions need not be supported by hardware.
Traditional x86 does r, w, but not x.
- One former problem: your stack was readable and writable, and
since there was no separate execute permission, you could also
execute your stack, making buffer overflow attacks easy. Modern
x86 can give the stack r and w but not x, so executing the stack
traps instead of running the attacker's code.
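Here is a sketch of the per-domain bookkeeping the kernel might
keep and of a map_domain interface; the struct layout, the
PERM_* flag names, and the sizes are illustrative assumptions,
not a real API.

#include <stddef.h>

/* Hypothetical permission bits. */
#define PERM_R 1
#define PERM_W 2
#define PERM_X 4

/* What the kernel might record for each domain. */
struct domain {
    void  *base;     /* where the region starts        */
    size_t size;     /* its length in bytes            */
    int    perms;    /* some combination of PERM_R/W/X */
};

/* Stub for the map_domain(size, permissions) call from the notes; a
   real one would trap into the kernel and set up a base-bounds pair. */
int map_domain(size_t size, int perms)
{
    (void) size; (void) perms;
    return -1;    /* placeholder */
}

/* Example: a read-execute domain for code, a read-write one for the heap. */
void example(void)
{
    int text = map_domain(1 << 20, PERM_R | PERM_X);
    int heap = map_domain(1 << 20, PERM_R | PERM_W);
    (void) text; (void) heap;
}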
Problem: Even with several base-bounds pairs, say 4, that will
not be enough domains. There are many possible uses for a
domain: the text of a program (r), the data of a program (rw),
the heap (rw), the stack (one per thread), I/O buffers, shared
memory. We have many uses for base-bounds pairs.
Segmentation (segmented memory)
We choose a base-bounds pair from a segment table. Privilege is
needed to change the segment table. The segment number is used to
identify the segment, along with an offset within the segment. Each
entry in the table includes information such as its base, bounds,
permissions, and so on.
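In code, the translation might look like the following sketch;
the field names and the 16-bit segment number / 16-bit offset
split are assumptions made only for illustration.

#include <stdlib.h>

static void trap(void) { abort(); }   /* stand-in for entering the kernel */

/* One segment-table entry. */
struct segment {
    unsigned long base;     /* physical address where the segment starts */
    unsigned long bounds;   /* first byte past the end of the segment    */
    int perms;              /* r/w/x permissions                         */
};

static struct segment segtable[1 << 16];  /* only the kernel may change this */

/* Split a virtual address into (segment number, offset) and translate it. */
static unsigned long translate(unsigned long vaddr)
{
    unsigned segno       = (vaddr >> 16) & 0xffff;  /* top bits pick the segment */
    unsigned long offset = vaddr & 0xffff;          /* low bits are the offset   */
    const struct segment *s = &segtable[segno];
    if (s->base + offset >= s->bounds)              /* outside the segment       */
        trap();
    return s->base + offset;
}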
Problems
- What if you run out of segments?
- Segmentation increases hardware complexity.
- Growing a segment is expensive. This is the killer problem for
segmented memory.
- The kernel should arrange that programs never overlap.
- More problems: growing collides with neighbors. Suppose we
allocate a new object (say with new X() or malloc(N)) when the
heap's segment is already full. The segment has to grow, but
another segment's data sits right after it in memory, so that
data must be pushed somewhere else first. This is a pain.
Page-based Memory Management: the solution for segmented memory
- Now, we fix all the pages to the same size, say 8 KiB. This
way, a process's data does not all have to sit next to each
other in memory, which makes growing and moving easier. It also
solves the problem of external fragmentation.
- We have a magic box that stores information about the mapping
between physical and virtual memory.
- Each process has its own page table, with a (privileged) page
register that points to a process's page table. It is up to the
OS to determine how to implement a page table.
Linear page tables
- (in x86, page sizes are 4 KiB, with 20 bit page numbers + 12
bit offsets)
With linear page tables, we have a page table that holds multiple
page table entries (PTE). In our example, the table holds 2^20
entries of 4 bytes each, meaning that each process has a page table
of 4 MiB.
- We must write a function pmap that takes as its argument a
virtual page number and returns the page table entry for that
page number. This example function is very simple:
int pmap (int v)
{
    // PAGE_TABLE's base address is stored in the privileged register %cr3
    return PAGE_TABLE[v];
}
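Putting it together: the virtual address splits into a 20-bit
page number and a 12-bit offset, and the physical address
combines the frame number with that offset. A sketch, assuming
the PTE returned by pmap is simply the physical page-frame
number:

/* Full translation for 4 KiB pages, using pmap() as defined above. */
unsigned long translate(unsigned long vaddr)
{
    unsigned vpn    = (vaddr >> 12) & 0xfffff;   /* 20-bit virtual page number */
    unsigned offset = vaddr & 0xfff;             /* 12-bit offset within page  */
    return ((unsigned long) pmap(vpn) << 12) | offset;
}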
- With linear page tables, a small process will have many unused
PTEs in its page table, so we waste a lot of RAM. To solve this,
we use 2-level page tables.
2-level page tables
Here, instead of using the full 20 bits for page numbers, we use the
first 10 bits for the first level page table and the next 10 bits
for the next level. With two levels, we now have a more complicated
pmap function.
int pmap (int v)
{
    int hi = v >> 10;              // top 10 bits index the level-1 table
    int lo = v & ((1 << 10) - 1);  // low 10 bits index the level-2 table
    pt *p = PAGETABLE[hi];         // pointer to the second-level table, if any
    if (!p)
        return 0;
    return p[lo] >> 8;
}
- There are now more memory accesses per translation, so it's
slower, but a small process needs only about 3*4*2^10 bytes =
12 KiB of page tables (one top-level table plus a couple of
second-level tables), which is much smaller. However, this won't
help us for large processes. To deal with those, we may want to
"lie" to the hardware: put into the page table entries the image
of a smaller process than the process actually is.
- It is actually fairly common for a process to think it has more
memory than it really does (e.g. it believes it has 3.5 GiB when
only 100 MiB of RAM backs it). In this scheme, only a portion of
a process's pages are kept in main memory, with the rest out on
disk (swap space). When the process touches a page that's not in
RAM, the hardware traps (since the access hits a page marked
invalid), and the kernel retrieves the page from disk and fixes
the page table.
- Downsides: Very slow! If we page fault a lot it will be as if
we had no RAM at all. Also, when we bring in the new page, what
happens to the useful data in the frame we overwrite? We have to
save that old data to disk first, so a fault can cost a disk
write plus a disk read.
Mechanism for page faulting
When we page fault, we trap and enter the kernel. The kernel
inspects the faulting address.
void
pfault(int v, int access, ...)
{
    if (swapmap(current_process, v) == FAIL)
        // bad address, e.g. dereferencing a null pointer or a subscript error
        kill(current_process);
    else    // valid address: pick a victim and reuse its physical frame
    {
        (p, va) = removal_policy();       // p: victim process, va: victim's virtual address
        pa = p->pmap(va);                 // pa: the victim's physical frame
        p->pmap(va) = FAULT;              // the victim page is no longer resident
        write(pa, swapmap(p, va));        // save the victim's data to swap space
        read(pa, swapmap(current_process, v));  // load the faulting page into that frame
        current_process->pmap(v) = pa;    // map the faulting page to its new frame
    }
}
- In the above code, p is the process holding the victim page
and va is the virtual address of the victim. removal_policy()
picks a victim physical page; its contents get pushed out to
swap space, which is what makes heavy faulting so slow for the
processes involved.
What can we do wrong?
- Suppose the code for pfault is held in a page that becomes the
victim. Memory manager pages themselves must be "wired down":
never selected as victims.
- Can we make secondary tables victims? Sure, if the hardware
will allow us to write a 0 in the entry for that table in the
top level table. You can't make the top level table a victim,
however.
Virtual Memory Tuning
- Demand paging (quite popular): when you start a program, load
just its first page. The advantage of this is that you start up
quickly, but if you only load pages just as they're needed you
can cause the disk arm to move more.
- Dirty bit: in your page table, you keep track (1 bit per page;
any store instruction can set it) of whether the page has been
stored into since it was read in. Now, if we need to get rid of
a page, we know we don't have to write it back to disk if it has
not been modified (see the sketch below).
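The payoff at eviction time looks roughly like this sketch;
is_dirty, write_frame_to_disk, and reuse_frame are placeholder
names, not real kernel routines.

#include <stdbool.h>

bool is_dirty(int frame);            /* placeholder: consult the dirty bit */
void write_frame_to_disk(int frame); /* placeholder: write to swap space   */
void reuse_frame(int frame);         /* placeholder: hand the frame over   */

/* Pay for a disk write only if the victim was modified since it was read in. */
void evict(int frame)
{
    if (is_dirty(frame))
        write_frame_to_disk(frame);  /* the copy on disk is stale */
    reuse_frame(frame);              /* a clean frame can simply be dropped */
}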
How can we handle the dirty bit cheaply?
- We don't need to compute a dirty bit at all for pages where we
don't have write access.
- We can store this bit in the page table entry since we have
some extra room.
- Even if the hardware doesn't give you a dirty bit, the OS can
tell the hardware that a page has fewer write permissions than
it really does by clearing the w bit in the PTE. The first store
to such a page traps; the OS records the dirty bit in its own
area, restores the w bit, and returns, and the process continues
(a sketch follows).
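A sketch of that trick, with made-up names for the OS's own
bookkeeping; the page starts out with its w bit cleared, and the
first store to it lands here.

#include <stdbool.h>

/* Hypothetical per-page state kept by the OS in its own area. */
struct page_info {
    bool dirty;       /* the OS's dirty bit                    */
    bool writable;    /* the permission the process really has */
};

static struct page_info pages[1 << 20];

void set_pte_write_bit(int vpn, bool w);   /* placeholder: edit the real PTE */

/* Fault handler for a store to a page whose w bit the OS cleared only
   so that it could notice the first write. */
void write_fault(int vpn)
{
    if (pages[vpn].writable) {
        pages[vpn].dirty = true;        /* this page is now dirty             */
        set_pte_write_bit(vpn, true);   /* restore w; later stores don't trap */
        /* return from the trap; the faulting store is retried and succeeds */
    } else {
        /* a genuine protection violation: signal or kill the process */
    }
}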