Lecture 15 Scribe Notes

Apache

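Assume we run a large program (e.g., Apache). Apache calls malloc(), which uses lots of virtual memory. It then forks a child for each request, so fork() copies all of that memory even though the child immediately discards it by calling exec: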

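for (;;) {
    r = get_a_request();
    if (fork() == 0) { // child: fork() cloned the parent's virtual memory
        execlp("python", "python", (char *) NULL); // process r
        _exit(127); // reached only if exec fails; keeps the child out of the loop
    }
}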

Solution

We can share pages between parent and child, marking them read-only in the hardware.
When either process writes to such a page, the kernel clones the page and makes both copies writable. This speeds up fork(), but we still need to copy the page tables.
We can clone just the page tables (not the data), but this is tricky.
- We copy a page only when the parent or child writes to it.
- This is called copy-on-write.
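A minimal sketch of the semantics that copy-on-write has to preserve: after fork(), a write by the child must not be visible to the parent. The helper name is hypothetical, and the program cannot tell whether the kernel copied the page lazily or eagerly; it only shows the behavior both processes observe.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that overwrites a heap value, then
 * return the value as the parent sees it afterwards (or -1 on error). */
int value_seen_by_parent_after_child_write(void)
{
    int *value = malloc(sizeof *value);
    if (value == NULL)
        return -1;
    *value = 1;

    pid_t pid = fork();      /* child gets a logical copy of the heap */
    if (pid < 0) {
        free(value);
        return -1;
    }
    if (pid == 0) {
        *value = 2;          /* write: with copy-on-write, the kernel clones this page now */
        _exit(0);
    }

    waitpid(pid, NULL, 0);   /* wait, so the child's write has definitely happened */
    int seen = *value;       /* the parent's page was never modified */
    free(value);
    return seen;
}
```

Calling value_seen_by_parent_after_child_write() returns 1: the child's store went to its own copy of the page.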
We can also use vfork().
- vfork() acts like fork(), EXCEPT:
  1. The parent is frozen until the child exits or calls exec.
  2. The child runs in the parent's virtual memory, so if the child changes RAM, this
    affects the parent's RAM.
- Parent and child share the same page table until exec.
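The vfork() rules above suggest a usage pattern: the child does nothing except exec (or _exit if exec fails), because every write it makes lands in the parent's memory. A sketch, assuming a POSIX system where vfork() is available; run_with_vfork is a hypothetical helper name.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes vfork() under strict -std modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `prog` with no arguments and return its exit
 * status, or -1 if the child could not be created or waited for. */
int run_with_vfork(const char *prog)
{
    pid_t pid = vfork();     /* parent is suspended; child borrows its memory */
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: touch as little as possible before exec. */
        execlp(prog, prog, (char *) NULL);
        _exit(127);          /* exec failed; _exit() leaves the parent's stdio buffers alone */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_with_vfork("true") returns 0 on a system that has the true utility on its PATH, and a program that cannot be found yields 127.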

Ways to speed up virtual memory

- vfork()
- Multithreading: use pthreads to create a thread for each request. Generally we don't use vfork() in this case because it may freeze other threads.
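A sketch of the thread-per-request idea, assuming POSIX threads. All threads share one address space, so nothing is cloned per request. The request type, the "processing" step (squaring a number), and the MAX_REQUESTS limit are made up for illustration.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_REQUESTS 64   /* arbitrary limit for the sketch */

/* Hypothetical request: an input and a slot for the result. */
struct request {
    int input;
    int output;
};

/* Thread body: "process" one request (here, square the input). */
static void *handle_request(void *arg)
{
    struct request *r = arg;
    r->output = r->input * r->input;
    return NULL;
}

/* Start one thread per request, wait for all of them, and return the
 * number of requests handled, or -1 on failure. */
int serve_with_threads(struct request *reqs, size_t n)
{
    pthread_t tids[MAX_REQUESTS];
    if (n > MAX_REQUESTS)
        return -1;

    size_t started = 0;
    for (; started < n; started++) {
        if (pthread_create(&tids[started], NULL, handle_request, &reqs[started]) != 0)
            break;
    }
    for (size_t i = 0; i < started; i++)
        pthread_join(tids[i], NULL);   /* each thread wrote only its own request */

    return started == n ? (int) n : -1;
}
```

Depending on the toolchain, this may need to be compiled with -pthread.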

Page Fault

A page fault is a trap to the software raised by the hardware when a program accesses a page that is mapped in the virtual address space, but not loaded in physical memory.

Page Fault Mechanism: how a page fault is handled inside the kernel

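When physical memory is full, the kernel picks a resident page as the "victim" to evict. Before the victim's frame is reused, the victim is saved to disk in case its owner needs it again later. The requested page is then read into the freed frame and the page tables are updated.
The page fault mechanism is illustrated by the figure in [1].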
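The mechanism can be sketched as a toy simulation. This is not how a real kernel is written: the struct layout, the tiny page size, and the round-robin choice of victim are all assumptions made to keep the example short.

```c
#include <string.h>

#define NPAGES   5      /* virtual pages */
#define NFRAMES  3      /* physical pages */
#define PAGESIZE 4      /* tiny pages keep the sketch readable */

struct vm {
    char ram[NFRAMES][PAGESIZE];   /* physical memory */
    char disk[NPAGES][PAGESIZE];   /* backing store, one slot per virtual page */
    int  frame_of[NPAGES];         /* page table: frame number, or -1 if not resident */
    int  page_in[NFRAMES];         /* reverse map: page held by each frame, or -1 */
    int  next_victim;              /* placeholder policy: round-robin over frames */
    int  faults;
};

void vm_init(struct vm *vm)
{
    memset(vm, 0, sizeof *vm);
    for (int p = 0; p < NPAGES; p++)
        vm->frame_of[p] = -1;
    for (int f = 0; f < NFRAMES; f++)
        vm->page_in[f] = -1;
}

/* Fault handler: pick a frame, save its current page (if any), load `page`. */
static int handle_fault(struct vm *vm, int page)
{
    int f = vm->next_victim;
    vm->next_victim = (f + 1) % NFRAMES;
    vm->faults++;

    int victim = vm->page_in[f];
    if (victim >= 0) {
        memcpy(vm->disk[victim], vm->ram[f], PAGESIZE);  /* save the victim to disk */
        vm->frame_of[victim] = -1;                       /* victim is no longer resident */
    }
    memcpy(vm->ram[f], vm->disk[page], PAGESIZE);        /* read the wanted page */
    vm->frame_of[page] = f;
    vm->page_in[f] = page;
    return f;
}

/* Translate (page, offset) to a byte in RAM, faulting the page in if needed. */
char *vm_access(struct vm *vm, int page, int offset)
{
    int f = vm->frame_of[page];
    if (f < 0)
        f = handle_fault(vm, page);
    return &vm->ram[f][offset];
}
```

After vm_init(), touching pages 0, 1, 2, 3 causes four faults and evicts page 0; touching page 0 again faults it back in from disk with its contents intact.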

Page Fault Policy: how to choose the victim

If memory is accessed at random, then any policy works equally well, so the victim can be chosen at random.

- FCFS (First Come First Served)

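We evict the page that has been in memory for the longest amount of time (this policy is also known as FIFO).
Assume the trace is:
0 1 2 3 0 1 4 0 1 2 3 4
Also assume 5 virtual pages and 3 physical pages. In the tables below, each row is a physical page, each column is one access in the trace, and ^ marks a page fault.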

	0	1	2	3	0	1	4	0	1	2	3	4
A	^0	0	0	^3	3	3	^4	4	4	4	4	4
B		^1	1	1	^0	0	0	0	0	^2	2	2
C			^2	2	2	^1	1	1	1	1	^3	3

There are 9 page faults in total.
If we increase the number of physical pages to 4, the number of page faults increases as well, even though more memory is available.
This situation is called Belady's anomaly:

	0	1	2	3	0	1	4	0	1	2	3	4
A	^0	0	0	0	0	0	^4	4	4	4	^3	3
B		^1	1	1	1	1	1	^0	0	0	0	^4
C			^2	2	2	2	2	2	^1	1	1	1
D				^3	3	3	3	3	3	^2	2	2

10 page faults in total.
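The two tables above can be checked with a small simulator. This is a sketch: the function name and the MAX_FRAMES limit are choices made for the example.

```c
#include <stddef.h>

#define MAX_FRAMES 16   /* arbitrary limit for the sketch */

/* Count page faults for `trace` under FCFS (FIFO) replacement with
 * `nframes` physical pages. Returns -1 if nframes is out of range. */
int fifo_faults(const int *trace, size_t len, int nframes)
{
    int frames[MAX_FRAMES];
    int used = 0;      /* frames filled so far */
    int oldest = 0;    /* index of the frame that was loaded longest ago */
    int faults = 0;

    if (nframes < 1 || nframes > MAX_FRAMES)
        return -1;

    for (size_t i = 0; i < len; i++) {
        int hit = 0;
        for (int f = 0; f < used; f++) {
            if (frames[f] == trace[i]) {
                hit = 1;
                break;
            }
        }
        if (hit)
            continue;                     /* a hit does not change the load order */

        faults++;
        if (used < nframes) {
            frames[used++] = trace[i];    /* a free frame is still available */
        } else {
            frames[oldest] = trace[i];    /* evict the oldest resident page */
            oldest = (oldest + 1) % nframes;
        }
    }
    return faults;
}
```

On the trace 0 1 2 3 0 1 4 0 1 2 3 4 this returns 9 for 3 frames and 10 for 4 frames, matching the tables.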

- LRU (Least Recently Used)

The victim is the entry that has not been accessed for the longest time.

	0	1	2	3	0	1	4	0	1	2	3	4
A	^0	0	0	^3	3	3	^4	4	4	^2	2	2
B		^1	1	1	^0	0	0	0	0	0	^3	3
C			^2	2	2	^1	1	1	1	1	1	^4

Cost is 10 page faults, one more than FCFS on this trace.
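A sketch of an LRU simulator for checking the table. It records the time of the last access to each frame and evicts the frame with the smallest one; the function name and MAX_FRAMES are choices made for the example.

```c
#include <stddef.h>

#define MAX_FRAMES 16   /* arbitrary limit for the sketch */

/* Count page faults for `trace` under LRU replacement with `nframes`
 * physical pages. Returns -1 if nframes is out of range. */
int lru_faults(const int *trace, size_t len, int nframes)
{
    int    page[MAX_FRAMES];
    size_t last_use[MAX_FRAMES];   /* trace index of the most recent access */
    int used = 0;
    int faults = 0;

    if (nframes < 1 || nframes > MAX_FRAMES)
        return -1;

    for (size_t i = 0; i < len; i++) {
        int slot = -1;
        for (int f = 0; f < used; f++) {
            if (page[f] == trace[i]) {
                slot = f;
                break;
            }
        }
        if (slot < 0) {                    /* page fault */
            faults++;
            if (used < nframes) {
                slot = used++;             /* a free frame is still available */
            } else {
                slot = 0;                  /* find the least recently used frame */
                for (int f = 1; f < nframes; f++) {
                    if (last_use[f] < last_use[slot])
                        slot = f;
                }
            }
            page[slot] = trace[i];
        }
        last_use[slot] = i;                /* hits and faults both count as a use */
    }
    return faults;
}
```

On the trace 0 1 2 3 0 1 4 0 1 2 3 4 with 3 frames this returns 10, matching the table.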

- Oracle of Delphi

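Look into the future and evict the page that will not be needed for the longest time. This gives the fewest possible page faults, but it requires knowing the future accesses.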

	0	1	2	3	0	1	4	0	1	2	3	4
A	^0	0	0	0	0	0	0	0	0	^2	2	2
B		^1	1	1	1	1	1	1	1	1	^3	3
C			^2	^3	3	3	^4	4	4	4	4	4

7 page faults in total.
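Given the whole trace in advance, the oracle policy can be simulated directly. A sketch, with the function names and MAX_FRAMES chosen for the example; when several pages are never used again it may pick a different victim than the table does, which does not change the fault count.

```c
#include <stddef.h>

#define MAX_FRAMES 16   /* arbitrary limit for the sketch */

/* Index of the next access to `page` at or after `from`, or `len` if none. */
static size_t next_use(const int *trace, size_t len, size_t from, int page)
{
    for (size_t i = from; i < len; i++) {
        if (trace[i] == page)
            return i;
    }
    return len;
}

/* Count page faults for `trace` when the victim is always the page whose
 * next use lies furthest in the future. Returns -1 if nframes is out of range. */
int oracle_faults(const int *trace, size_t len, int nframes)
{
    int page[MAX_FRAMES];
    int used = 0;
    int faults = 0;

    if (nframes < 1 || nframes > MAX_FRAMES)
        return -1;

    for (size_t i = 0; i < len; i++) {
        int hit = 0;
        for (int f = 0; f < used; f++) {
            if (page[f] == trace[i]) {
                hit = 1;
                break;
            }
        }
        if (hit)
            continue;

        faults++;
        if (used < nframes) {
            page[used++] = trace[i];       /* a free frame is still available */
            continue;
        }
        int victim = 0;
        size_t furthest = next_use(trace, len, i + 1, page[0]);
        for (int f = 1; f < nframes; f++) {
            size_t n = next_use(trace, len, i + 1, page[f]);
            if (n > furthest) {
                furthest = n;
                victim = f;
            }
        }
        page[victim] = trace[i];
    }
    return faults;
}
```

On the trace 0 1 2 3 0 1 4 0 1 2 3 4 with 3 frames this returns 7, matching the table.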

Demand Paging

Pages are brought into memory only when they are demanded. When a program starts, only the first page is loaded into RAM.
Pros and Cons:
+: Less wait time (Program starts faster)
-: More page faults during execution

Cost Comparison:
N: Number of pages in the program
U: Number of pages used (0 < U <= N)
C: Cost of reading 1 page
F: Cost of faulting

Options	Total Cost	Latency Cost
No demand paging	N*C	N*C
Demand paging	U(C+F)	C+F
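As a worked comparison using the symbols above, the break-even points follow directly from the table. The numbers in the last line (N = 1000, U = 100, C = 10 ms, F = 1 ms) are made up purely for illustration.

```latex
\begin{align*}
\text{total cost:}\quad & U(C+F) < NC \iff \frac{U}{N} < \frac{C}{C+F} \\
\text{latency:}\quad & C+F < NC \iff F < (N-1)\,C \\
\text{example:}\quad & 100 \cdot (10+1) = 1100\ \text{ms} \;<\; 1000 \cdot 10 = 10000\ \text{ms}
\end{align*}
```

So demand paging lowers the total cost whenever the fraction of pages actually used is below C/(C+F), and it lowers the latency whenever a fault costs less than reading the other N-1 pages.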

Dirty Bit

- A dirty bit keeps track of which pages are dirty. That is, it records whether a page in RAM has been modified and so differs from its copy on the disk.
- If a victim's dirty bit is 0, we can discard it immediately and save the time needed to write it to the disk.
- We can keep the dirty bit in the page table next to the r, w, x bits, which is easy with hardware support.
- Or we can use the write bit to simulate the dirty bit (software support): map the page read-only at first, so that the first write traps to the kernel, which marks the page dirty and then makes it writable.
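The software approach in the last bullet can be sketched as a toy model. The struct and function names are hypothetical, and the "trap" is just an if statement; a real kernel would also have to tell this trick apart from pages the process is genuinely not allowed to write.

```c
#include <stdbool.h>

/* Hypothetical page-table entry for the sketch. */
struct pte {
    bool writable;   /* write-permission bit as seen by the hardware */
    bool dirty;      /* kept by software: RAM copy differs from the disk copy */
};

/* A freshly loaded page matches the disk: start clean and write-protected. */
void page_loaded(struct pte *p)
{
    p->writable = false;
    p->dirty = false;
}

/* Simulated store. A store to a write-protected page "traps" to the kernel,
 * which records the page as dirty and then allows writes. */
void store_to_page(struct pte *p)
{
    if (!p->writable) {      /* protection fault on the first write */
        p->dirty = true;
        p->writable = true;  /* later stores proceed without trapping */
    }
    /* ... the store itself would happen here ... */
}

/* Evict a page; returns the number of disk writes needed (0 or 1). */
int evict_page(struct pte *p)
{
    int disk_writes = p->dirty ? 1 : 0;   /* clean pages are simply discarded */
    p->dirty = false;
    p->writable = false;
    return disk_writes;
}
```

A page that was only read is evicted with 0 disk writes, while a page that was stored to at least once costs 1.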

[1] Galvin, Operating System Concepts, Chapter 9.1

Computer Science 111
Lecture 15: VM and Processes

