Virtual memory efficiency - policy for page replacement
--> It is the kernel's responsibility to choose an effective algorithm for
picking a victim page when a page fault occurs.
Some proposals:
1. Choose the victim at random
==> fast to decide, but bad performance, and vulnerable to an
attacker who tries to hog the system by controlling page faults
2. Physical page 0, then 1, then 2, ... (cycle through frames in order)
3. FIFO
==> Example: 5 virtual pages, 3 physical frames (A, B, C)
reference string: 0 1 2 3 0 1 4 0 1 2 3 4
(^ marks a page fault)

ref:  0   1   2   3   0   1   4   0   1   2   3   4
A:   ^0   0   0  ^3   3   3  ^4   4   4   4   4   4
B:       ^1   1   1  ^0   0   0   0   0  ^2   2   2
C:           ^2   2   2  ^1   1   1   1   1  ^3   3
==> 9 page faults
Let's increase RAM, so we'll have 4 physical frames:

ref:  0   1   2   3   0   1   4   0   1   2   3   4
A:   ^0   0   0   0   0   0  ^4   4   4   4  ^3   3
B:       ^1   1   1   1   1   1  ^0   0   0   0  ^4
C:           ^2   2   2   2   2   2  ^1   1   1   1
D:               ^3   3   3   3   3   3  ^2   2   2
==> 10 page faults
=====> We increased RAM and got more page faults!! A paradox!
(This is Belady's anomaly.)
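A short FIFO simulator (a sketch, not from the lecture) reproduces the anomaly on this reference string:

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults under FIFO replacement."""
    frames = deque()          # oldest resident page at the left
    faults = 0
    for page in refs:
        if page in frames:
            continue          # hit, FIFO order is unchanged
        faults += 1
        if len(frames) == nframes:
            frames.popleft()  # evict the oldest page
        frames.append(page)
    return faults

refs = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
print(fifo_faults(refs, 3))   # 9
print(fifo_faults(refs, 4))   # 10 -- more frames, more faults
```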
How about the best, optimal solution?
Belady's algorithm: use an oracle who knows everything, even the future
reference string, and evict the page whose next use is farthest away.
ref:  0   1   2   3   0   1   4   0   1   2   3   4
A:   ^0   0   0   0   0   0   0   0   0  ^2  ^3   3
B:       ^1   1   1   1   1   1   1   1   1   1   1
C:           ^2  ^3   3   3  ^4   4   4   4   4   4
==> 7 page faults
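The oracle's decision rule can be sketched directly (an illustrative simulator, where the "oracle" is just a lookahead into the reference string):

```python
def optimal_faults(refs, nframes):
    """Belady's MIN: evict the page whose next use is farthest in the future."""
    frames = []
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) < nframes:
            frames.append(page)
            continue
        # For each resident page, find when it is next used (inf if never).
        def next_use(p):
            future = refs[i + 1:]
            return future.index(p) if p in future else float("inf")
        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults

refs = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
print(optimal_faults(refs, 3))  # 7
```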
However, this is impossible in practice because there is no oracle.
Let's try the Least Recently Used (LRU) algorithm: choose the page that was
least recently referenced.
ref:  0   1   2   3   0   1   4   0   1   2   3   4
A:   ^0   0   0  ^3   3   3  ^4   4   4  ^2   2   2
B:       ^1   1   1  ^0   0   0   0   0   0  ^3   3
C:           ^2   2   2  ^1   1   1   1   1   1  ^4
==> 10 page faults?!
=====> In reality, LRU usually works better; this reference string is a
pathological case for it.
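The same harness works for LRU (a sketch; recency is tracked by keeping the most recently used page at the end of a list):

```python
def lru_faults(refs, nframes):
    """Count page faults under LRU replacement."""
    frames = []               # most recently used page at the end
    faults = 0
    for page in refs:
        if page in frames:
            frames.remove(page)   # refresh recency on a hit
        else:
            faults += 1
            if len(frames) == nframes:
                frames.pop(0)     # evict the least recently used page
        frames.append(page)
    return faults

refs = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
print(lru_faults(refs, 3))   # 10 -- worse than FIFO on this string
```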
How to implement LRU:
Normally the kernel does not know which pages were least recently used.
How to fix:
1. HW support: hardware sets a reference bit in the page table entry
each time the page is referenced; the kernel clears the bits after a
page fault.
2. Use the clock interrupt to sample pages
(often good enough)
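The reference-bit idea leads to the classic clock (second-chance) approximation of LRU. A minimal sketch, where the simulator itself plays the role of the hardware setting the bit:

```python
def clock_faults(refs, nframes):
    """Clock (second-chance) LRU approximation using reference bits."""
    frames = [None] * nframes
    refbit = [0] * nframes
    hand = 0                      # the clock hand
    faults = 0
    for page in refs:
        if page in frames:
            refbit[frames.index(page)] = 1   # "hardware" sets the bit on use
            continue
        faults += 1
        # Sweep the hand, clearing bits, until an unreferenced frame is found.
        while refbit[hand]:
            refbit[hand] = 0
            hand = (hand + 1) % nframes
        frames[hand] = page
        refbit[hand] = 1
        hand = (hand + 1) % nframes
    return faults

refs = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
print(clock_faults(refs, 3))   # 9 on this string
```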
Reminder:
The OS is willing to spend extra cycles deciding which page to swap out,
because the cost of a page fault is large.
Contrast cache vs RAM: a hardware cache miss is cheap enough that we
don't want to spend much effort on an optimal algorithm there.
Some other common optimizations:
1. Demand paging:
Don't wait until we have loaded a bunch of pages into RAM.
Instead, load main's page and jump to main.
N = # of pages in the program
U = # of pages actually used
C = cost to read a page
F = cost of handling a page fault
Assume we have plenty of RAM:
w/o demand paging:
cost of VM = N*C
latency (until the first instruction runs) = N*C
with demand paging:
cost of VM = C + (U-1)*(C + F)
latency = C
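Plugging illustrative numbers into these formulas (the values of N, U, C, F below are made up for the sake of the arithmetic):

```python
# Illustrative numbers, not from the lecture.
N = 1000   # pages in the program
C = 10     # cost to read one page
U = 100    # pages actually used
F = 1      # extra overhead of taking a page fault

no_demand_cost = N * C                  # load everything up front
no_demand_latency = N * C               # can't run until it's all in
demand_cost = C + (U - 1) * (C + F)     # first page, then U-1 faulted pages
demand_latency = C                      # run after loading just main's page

print(no_demand_cost, no_demand_latency)   # 10000 10000
print(demand_cost, demand_latency)         # 1099 10
```

Demand paging wins on both total cost (it never reads the N-U unused pages) and, dramatically, on latency.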
Writing pages to disk:
Victims get written back (but only if they've changed!)
How to tell whether a page has changed:
1. HW support: a "dirty bit" in the page table entry.
2. Keep a copy of each page, and compare against it when we need to
swap the page out
==> huge waste of space
2.1. Keep a checksum instead (turns out it has the same problem as
before)
3. Tell the HW the page is read-only. Assume HW support for r/w/x/k
permissions; take an interrupt on the first write and mark the page
dirty in software (one interrupt per first write).
Fast forking vs VM:
Consider Emacs.
It uses lots of memory and multiple levels of page tables.
///** missed a bunch **///
copy-on-write
what to do about the parent:
1. mark the parent's pages copy-on-write as well
2. dally the parent until (a) the child execs or exits, or (b) a
timer expires
vfork is like fork, however:
(1) the parent is frozen until the child exits or execs
(2) parent & child share RAM until the child execs
--> if the system supports vfork, use it!!
malloc in UNIX:
Consider Unix circa 1977; the process address space looked like:
----------------------------------------------------------
| txt area | data | bss (+ heap) |  ........  | stack |
----------------------------------------------------------
<-------------------->
      initialized
malloc + VM:
malloc(12)
It will look through the free list in the bss/heap area to find a
memory block of size 12
==> this might page fault if malloc has not been called for a long
time (the free-list pages may have been swapped out).
We can fix that by changing malloc to keep its bookkeeping in a
smaller, denser block of memory.
How about allocating a big chunk of memory?
Nowadays we use a different system call, mmap().
example:
mmap(fd, offset,   base, size,   options)
    (in file)     (in RAM)
We allocate 10 MB, offset 8 kB from the beginning of the file,
and some 10 MB region of RAM will have the same content as the file.
There's also munmap() as well; free() can use it too...
example:
mmap("/dev/zero", 0, 0x00003, 1024);
Mapping /dev/zero gives a zero-filled region; this is also a way to
create a temporary shared area when forking.
Also, instead of swapping to swap space, we can page directly from the
mapped file.
/* look up more info on mmap */
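A small sketch of mapping memory, using Python's mmap module (which wraps the same system call; the file setup here is illustrative). Passing fd = -1 requests an anonymous zero-filled mapping, much like mapping /dev/zero:

```python
import mmap
import tempfile

# Map a 4 KB file into memory and write through the mapping.
with tempfile.TemporaryFile() as f:
    f.write(b"\0" * 4096)             # the mapped region must exist in the file
    f.flush()
    m = mmap.mmap(f.fileno(), 4096)   # length 4096, offset 0, shared by default
    m[0:5] = b"hello"                 # stores go to the shared page cache
    f.seek(0)
    print(f.read(5))                  # the file sees the mapped write
    m.close()

# An anonymous mapping (like mapping /dev/zero): zero-filled, no backing file.
anon = mmap.mmap(-1, 1024)
print(anon[:4])                       # four zero bytes
anon.close()
```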
Final remark:
VM is a way to manage memory efficiently, not a way to run big apps on a
small-RAM machine.