Virtual Memory and Processes

Uses of Virtual Memory

  1. Allocation and Freeing Memory

    Memory allocation and freeing operations (e.g., malloc/new, free/delete) need to be fast.

    Simplest way: allocate/distribute memory by increasing address

    Pros:

    • Allocation is very fast. Just need to keep a pointer to the start of the next free block and a pointer to the end of memory.

    Cons:

    • Limited number of allocations, because memory freed by processes is never reused.

    This algorithm can be extended to virtual memory to get "atomic" updates to large sections of memory:

    • Allocate pages in sequence.
    • Replace end of memory pointer with a forbidden page (not readable/writable).
    • Touch the next page before allocating from it. Touching the forbidden page causes a fault, indicating that the limit has been reached.

    Example of adding garbage collection (a copying collector):
    Start with an old arena whose objects may point to one another.
    Copy the surviving (live) objects into a new arena, compacting them at the beginning. Pointers into moved objects must be updated to their new addresses.


  2. Example: Apache

    The Apache server has to respond to requests independently when possible. This can be done via fork(). Each child process may call malloc.

    
    for (;;) {
    	r = get_a_request();
    	if (fork() == 0) {
    		process(r);
    		_exit(0);	/* without this, the child would loop and accept requests too */
    	}
    }

    fork() will clone the virtual memory (page table+data) each time! This can be very inefficient since requests will typically only touch a portion of the data.

    Solution 1: vfork()

    vfork() behaves like fork() EXCEPT

    1. child takes control and parent is frozen until child finishes
    2. child runs with parent's virtual memory
    As a result, the virtual memory of the parent no longer has to be cloned for each fork.

    Issues with vfork():

    The child runs in the parent's address space, so anything the child writes is also written in the parent. The child is expected to call exec() or _exit() almost immediately; modifying other data or returning from the calling function gives undefined behavior. (POSIX has since marked vfork() obsolescent.)

    vfork()'s main feature is performance, and it is still used.
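A sketch of the one safe vfork() pattern (the helper name run_in_child is hypothetical): the child borrows the parent's address space, touches nothing, and immediately execs a new program, or _exits if the exec fails.

```c
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `prog` in a vfork()'d child and return its exit status.
 * The parent is frozen until the child execs or exits. */
int run_in_child(const char *prog) {
    pid_t pid = vfork();
    if (pid == 0) {
        /* Child: exec immediately -- on success this replaces the
         * borrowed address space with a fresh one. */
        execlp(prog, prog, (char *)NULL);
        _exit(127);                        /* exec failed */
    }
    int status;
    waitpid(pid, &status, 0);              /* parent resumes here */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The assignment to pid is the one write the child is allowed to make; everything else in the parent's memory must be left untouched.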

    Solution 2: changing fork()

    Another solution is to change fork() to keep its current behavior while avoiding the expensive clone: copy only the page table, mark the pages read-only in both processes, and copy an individual page only when one side writes to it (copy-on-write).

    This optimization is still costly if requests (our Python scripts) are very small compared to our parent process, as the overhead of copying the page table dominates.

    Solution 3: Integrate Python with Apache

    Since the Python code is now part of our program, each request becomes a function call within our code. However, this makes execution sequential and breaks our original intention of handling requests in parallel. To restore parallelism, we can use multithreading.

    We can use pthreads to create a thread for each request. Creating threads has overhead, so a faster implementation is to create a fixed number of threads up front (e.g., 100 threads). Each thread blocks while no request is available and wakes up to handle the next one.
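A sketch of such a fixed-size pool (process_request and the queue layout are hypothetical stand-ins): workers sleep on a condition variable rather than busy-polling, and wake only when a request is submitted.

```c
#include <pthread.h>
#include <stdatomic.h>

#define NTHREADS 4
#define QCAP     128

static int queue[QCAP];
static int head, tail, shutting_down;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;
static atomic_int handled;               /* count of completed requests */

static void process_request(int r) {     /* stand-in for real work */
    (void)r;
    atomic_fetch_add(&handled, 1);
}

static void submit(int r) {
    pthread_mutex_lock(&lock);
    queue[tail++ % QCAP] = r;            /* assumes < QCAP pending requests */
    pthread_cond_signal(&nonempty);      /* wake one sleeping worker */
    pthread_mutex_unlock(&lock);
}

static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (head == tail && !shutting_down)
            pthread_cond_wait(&nonempty, &lock);  /* block; don't busy-poll */
        if (head == tail) {              /* woken only to shut down */
            pthread_mutex_unlock(&lock);
            return NULL;
        }
        int r = queue[head++ % QCAP];
        pthread_mutex_unlock(&lock);
        process_request(r);              /* work happens outside the lock */
    }
}
```

Because workers keep draining the queue even after shutdown is signaled, no submitted request is dropped.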

  3. Fast Message Sending Among Processes

    So far, we have used pipes to pass data to/from processes. A pipe transfer involves two copies: write() copies the data from the sender into a kernel buffer, and read() copies it from that buffer into the receiving process.

    Alternatively, we can use virtual memory to share data among processes. By adjusting the page tables of the sender's and recipients' virtual memory accordingly, they can have access to the same physical page.

  4. Page Eviction Policies

    Now that we have mechanisms to detect when we need to load a virtual page to physical memory, we need a way to choose what page to evict.

    Examples:

    Simple Example: FCFS/FIFO

    Consider the reference trace: 0 1 2 3 0 1 4 0 1 2 3 4

    In the case of 3 physical pages:

    Trace:           0   1   2   3   0   1   4   0   1   2   3   4
    Physical page 0: 0*  0   0   3*  3   3   4*  4   4   4   4   4
    Physical page 1: -   1*  1   1   0*  0   0   0   0   2*  2   2
    Physical page 2: -   -   2*  2   2   1*  1   1   1   1   3*  3

    (* = page fault)

    The result is 9 page faults.

    Let's try again with 4 physical pages:

    Trace:           0   1   2   3   0   1   4   0   1   2   3   4
    Physical page 0: 0*  0   0   0   0   0   4*  4   4   4   3*  3
    Physical page 1: -   1*  1   1   1   1   1   0*  0   0   0   4*
    Physical page 2: -   -   2*  2   2   2   2   2   1*  1   1   1
    Physical page 3: -   -   -   3*  3   3   3   3   3   2*  2   2

    (* = page fault)

    The result is 10 page faults.

    Therefore, increasing the number of physical pages does NOT always reduce page faults. This is known as Belady's anomaly (Belady's paradox). This is a contrived case, however, and increasing the number of physical pages typically does reduce the occurrence of page faults.
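The fault counts above can be checked with a small simulator. This sketch implements FIFO replacement (replacing frames in round-robin order is exactly FIFO, since the next frame in rotation always holds the oldest resident page).

```c
#include <string.h>

/* Count FIFO page faults for a reference trace of length n
 * with nframes physical pages (nframes <= 16). */
static int fifo_faults(const int *trace, int n, int nframes) {
    int frames[16], next = 0, faults = 0;
    memset(frames, -1, sizeof frames);   /* -1 = empty frame */
    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int f = 0; f < nframes; f++)
            if (frames[f] == trace[i])
                hit = 1;
        if (!hit) {
            frames[next] = trace[i];     /* evict the oldest resident page */
            next = (next + 1) % nframes;
            faults++;
        }
    }
    return faults;
}
```

On the trace 0 1 2 3 0 1 4 0 1 2 3 4 it reports 9 faults with 3 frames and 10 with 4, reproducing Belady's anomaly.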

    Example 2: LRU

    Assuming 3 physical pages

    Trace:           0   1   2   3   0   1   4   0   1   2   3   4
    Physical page 0: 0*  0   0   3*  3   3   4*  4   4   2*  2   2
    Physical page 1: -   1*  1   1   0*  0   0   0   0   0   3*  3
    Physical page 2: -   -   2*  2   2   1*  1   1   1   1   1   4*

    (* = page fault)

    The result is 10 page faults.

    While LRU is a popular policy, it is not always better than FIFO.
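LRU can be simulated the same way; this sketch tracks, per frame, the time of last use and evicts the frame used least recently (empty frames, initialized to -1, are naturally filled first).

```c
/* Count LRU page faults for a reference trace of length n
 * with nframes physical pages (nframes <= 16). */
static int lru_faults(const int *trace, int n, int nframes) {
    int frames[16], last_use[16], faults = 0;
    for (int f = 0; f < nframes; f++) {
        frames[f] = -1;      /* empty frame */
        last_use[f] = -1;    /* never used, so chosen first as victim */
    }
    for (int i = 0; i < n; i++) {
        int hit = -1, victim = 0;
        for (int f = 0; f < nframes; f++) {
            if (frames[f] == trace[i])
                hit = f;
            if (last_use[f] < last_use[victim])
                victim = f;  /* least recently used frame so far */
        }
        if (hit >= 0) {
            last_use[hit] = i;           /* refresh recency on a hit */
        } else {
            frames[victim] = trace[i];   /* evict the LRU page */
            last_use[victim] = i;
            faults++;
        }
    }
    return faults;
}
```

On this trace with 3 frames it reports 10 faults, one more than FIFO -- matching the table above.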

VM Optimizations

  1. Anticipatory Paging vs Demand Paging

    In anticipatory paging, the OS loads more than the requested page into memory. The idea is that pages near the requested page will likely be needed soon, so preloading them reduces the number of page faults.

    In demand paging, a page is not read into memory until the program needs it.

    Some calculations:

    N = size of program in pages
    U = # of pages used (U <= N)
    C = cost of reading one page
    F = cost of fault

                                  Total cost   Latency
    Demand paging                 U(C+F)       C+F
    No demand (load everything)   NC           NC

    Demand paging has the lower total cost when U(C+F) < NC, i.e. when the fraction of pages actually used, U/N, is less than C/(C+F).

  2. Dirty page

    We can keep track of which pages have been modified (written) — "dirty" pages — and write a page back to disk only when it is evicted; clean pages can be discarded without a write.

    This does not have to be implemented in hardware. We can use an implementation similar to the copy-on-write technique used near the start of lecture: keep the page read-only, and on the first write fault record a software dirty bit and re-enable writes.
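A Linux-flavored sketch of that software dirty bit (names hypothetical; calling mprotect from a signal handler is not strictly async-signal-safe, but it illustrates the mechanism): the page starts read-only, the first store faults into the handler, which records the dirty bit and re-enables writes so the store can retry.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *tracked_page;
static size_t pagesz;
static volatile sig_atomic_t page_dirty;    /* software dirty bit */

static void on_write_fault(int sig, siginfo_t *si, void *ctx) {
    (void)sig; (void)si; (void)ctx;
    page_dirty = 1;                          /* must flush this page on evict */
    /* Re-enable writes; the faulting store restarts and succeeds. */
    mprotect(tracked_page, pagesz, PROT_READ | PROT_WRITE);
}

static void track_page(void) {
    pagesz = (size_t)sysconf(_SC_PAGESIZE);
    tracked_page = mmap(NULL, pagesz, PROT_READ,   /* writes will fault */
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = on_write_fault;
    sigaction(SIGSEGV, &sa, NULL);
}
```

Only the first write to the page pays the fault cost; subsequent writes run at full speed, which is why this trick is cheap enough to be practical.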

Next topic: Distributed systems...