Bad memory references
- Subscript errors
- Dereferencing null, uninitialized or freed pointers
Possible Solutions:
Fire the Programmers!
- small amounts of code is easy to search through for errors but this doesn't scale
- large software is extremely difficult to rid of all bad memory references
Add runtime subscript checking
- Java is an example of this. JVM is designed to do subscript checking and throw exceptions instead of allowing access
- This introduces performance issues due to the overhead of checking
- Also introduces trust issues
-A program must have subcript checking enabled for this to be effective
Solution:
-Issue can be worked around if compiler is part of the operating system thus guarantying that the subscript checking is enabled.
-Example: Burrough B5500 and its successors
Another approach to fixing the bad memory references is to use two extra registers known as the base register and the bound register to create a fixed location in memory that the process can access.
However, this requires that the hardware must be built to check for each memory access.
Problems: Must have just 1 region defined by this pair. So the process can not access outside of the memory defined by the addresses in the registers.
Possible Soltion: Multiple base-bound pairs
The programs must fit in RAM
Suppose that two instances of sort exist in memory
///// | Sort s1 | ///// | Sort s2 | ///// |
| 0x10000 | | 0x16000 | |
The hard address instructions will only work for one of these two sort codes.
For examples, the sort code has an instruction to jump to 0x16000. Due to the base-bound restrictions whenever sort s1 is ran, the instructions
will cause it to try to read outside its bounds causing an error and the sort will not work.
This means that programs can't contain absolute addresses.
Solutions: They must instead contain relative addresses.
1) Change the hardware, to always add the base register to any address in the program
2) Mutate the program as you load it into RAM changing the addresses
3) Prohibit absolute addresses: Position Independent Code (PIC)
Example of code: gcc -fpic or gcc -fPIC
Problems that we can face with base-bound pairs
1) Suppose Program changes base or bounds
Solution: Make the registers privileged or have special instructions to mutate registers that are privileged.
2) Shared Memory between multiple processes doesn't scale up due to boundary between adjacent blocks limits it to two processes able to share at maximum.
Solution: Multiple base-bound pairs
Use one or more pairs to section off the private area in memory for a process and another pair to point to the shared memory.
Could also add a system call to create a domain for process to work with.
map_domain(size, permissions)
This needs to be supported by the hardware:
Bits | Permissions |
--x | Execute |
r-x | Read + Write |
rwx | Read + Write + Execute |
-w- | Write Only (unusual) |
--- | Is this useful at all |
In traditional x86: There is no x-bit or execute bit.
Instead the x-bit is a special case of read. This allows for everything in the stack in x86 to be controlled
by read and write bits. This allowed for easy buffer overflow attacks but now the architecture is built to trap
instead of executing the code inputted from a buffer overflow attack.
Is it enough to have, for example: 4 Pairs of base-bounds for 4 domains of a process? Probably not.
Possible uses for base-bound pairs (domains) for a single process:
text of a program (r)
data of a program (rw)
heap
stack (1 per thread)
I/O buffers
Shared memory
Segmentation (Segmented Memory) : x86 Supports This
The Segmentation Table is under control of the hardware so a process must be privileged to be able to change it.
This table lives on RAM. For the example in class we chose a base-bound pair from a segment table with 2^16 = 65,536 entries
organized with three columns representing the base register, bound register, and the permissions.
Usually the number of entries in the table is 2^8 not 2^16 since x86 only uses 8-bits to represent the index into the table.
What does the RAM look like?
--A-- | -B- | -A- | B | -A- | ---C--- | ----C---- | ----A---- |
This shows Programs A, B, and C located in RAM.
Growing a segment is expensive and a major pain. This is why segmentation is not too popular.
Remind you of something else? - Similar to file systems
Solution: Fix with page-based memory management
All pages are the same size. In this example, they are 8KB.
Virtual Addresses | ---> | Magic Box (Page Table) | --->Physical Address |
Page # + Offset | | Hardware: converts virtual to physical | |
Page Table -- indexed by virtual page number
Downsides:
1) Page Registers must be privileged so that it can not point to its own page table and allow it additional control
2) Can map two different page tables to one page with different pointers
3) Easier to leave page tables in main memory
Linear Page Tables (x86 - page size 4KB)
Example of simple page table mapping function for linear page tables.
int pmap(int v) // v - virtual page number
{
return PAGE_TABLE[v];
}
A Linear Page Table has 20 bits dedicated to the page number making the page table have 2^20 or 1,048,576 entries.
An Indirect Page Table has 3*2^10*4 = 2^12 = 12KB
(size based on 3 bits for top level, text, and stack)
When deciding what to put as your implementation you must decide whether to use a linear or 2-level page table
Example of a simple page table mapping function for 2 level page tables.
int pmap(int v) {
int hi = v >> 10;
int lo = v & (1 << 10)-1;
pt *p = PAGE_TABLE[hi];
if(!p) return 0;
return p[lo] >> 8;
} // %cr3 -- register that points to hi level page table
As the OS Kernel, we want to "lie" to hardware, tell 'it', current process is smaller than it really is.
Fairly common, actually, Program thinks it has 3.5 GB but actually only has 100MB of RAM. This is done with virtual memory by loading in the parts of the program that it needs as it executes into RAM.
There is a region of disk called "swap space" (on disk) where a copy of all virtual memory of all processes exists.
Each time the process tries to use a piece of memory but it is not in RAM is causes a pault fault. When a page fault occurs the memory needed must be read from disk into RAM before it can continue.
Page Faulting (counts as a trap) Really SLOW!!!
-- Have to put old data onto disk and save it before reading the page fault
Performance can become absolutely terrible especially if 2:1 ratio or higher
Mechanisms for page faulting -> Trap, Enter Kernel; it inspects the faulting address (should it have faulted?)
Example of Page Fault Functions
void pfault(int v, int access, ...) {
if( swapmap(v, current_process) == FAIL)
kill(current_process);
else {
(p, va) = removal_policy(); /* pick a victim physical page */
p -> pmap(va) = FAULT;
write(pa, swapmap(p, va);
read(pa, swap(current_process,v));
current_process -> pmap(v) = pa;
} }
What can we do wrong?
Clock interrupt occurs
Suppose victim page is holding code for memory management
-- Have a special case for their own memory
Memory Manager process themselves are "Wired down" -they're never selected as victims
In 2 level page tables, can a page table in the second level be a victim?
-- Sure, as long as hardware puts zeros into table automatically
-- can't make 1st level as a victim
Virtual Memory Tuning
Advantages: You Start quickly
Disadvantages: Disk arm moves much more
1) Demand paging: When you start a program , load just its 1st page
(quite popular) - most do hybrids of 1+2
2) Dirty Bit in a page table track whether modified since read from disk (1 bit/page)
needed only for pages with write bit enabled
(every store instruction might change it) --Performance if have to store 2 places each time
in page table entries (PTE) reserved entry bit for this
Trick to implement: Clear the w bit in the page table entry (PTE)
Trap, OS sets the dirty bit and sets w-bit
OS keeps track of which writes you should really have
Joseph Towe
Computer Science 111