Computer Science 111 Scribe Notes
Lecture 14: File System
Performance
Week 9 Monday, March 5, 2012
By Marc Abuel
Two Ideas for surviving power failures
Idea #1: having a commit record
> do several different writes to file system, but only one counts
> commit record = single low level write that "matters"
> (ex) A. write the blocks from RAM to copy area
          file is unchanged, in original location
       B. wait for these blocks to hit disk
       C. write separate commit record
          > commit record tells us where file resides
            file contents changed, location moved
          > if power fails during A & B, the file has not been written yet
       D. copy from copy area to original area
       D1. wait to hit disk
          > step not really needed, because we are already committed to the change
          > ensures we have two copies of the file on disk
            file contents are new, location is unchanged
       E. clear the commit record
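The A–E steps can be sketched in C, simulating "disk" with in-memory arrays. All names here (original, copy_area, commit) are made up for illustration, and the atomicity of the commit write is assumed rather than enforced; a real file system does this with block writes and explicit waits for the disk.

```c
// Toy model of the commit-record protocol (steps A-E above).
#include <assert.h>
#include <string.h>

#define BLKSZ 16
static char original[BLKSZ];   // the file's real location on "disk"
static char copy_area[BLKSZ];  // scratch area for pending writes
static int  commit = 0;        // the single low-level write that "matters"

// newdata must be exactly BLKSZ bytes (including NUL) in this toy model.
static void write_file(const char *newdata) {
    memcpy(copy_area, newdata, BLKSZ);   // A. write blocks to copy area
                                         // B. (wait for blocks to hit disk)
    commit = 1;                          // C. write commit record (assumed atomic)
    memcpy(original, copy_area, BLKSZ);  // D. copy from copy area to original area
                                         // D1. (wait to hit disk)
    commit = 0;                          // E. clear the commit record
}

// After a crash: if the commit record is set, redo step D; otherwise the
// write never "counted" and the original data stands untouched.
static void recover(void) {
    if (commit) {
        memcpy(original, copy_area, BLKSZ);
        commit = 0;
    }
}
```

Note that `recover` only redoes step D, so running it more than once is harmless.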
> BEGIN
> pre-commit phase
-during this time, any writes done to disk are not permanent, so you can back out with no change
-can run into trouble: run out of memory, I/O errors
> COMMIT or ABORT
-either commit or abort should be atomic
> post-commit (post-abort) phase
-typically done for (long-term) efficiency
-making overall file system layout nice
-hurts short-term efficiency (extra writes slow you down)
-if aborted, counts as a no-op
> END
*** the whole BEGIN-to-END sequence counts as a transaction
Idea #2: having a journal
> an efficient way of organizing the file system that makes commit records natural and easy to do
> partition file system into 2 areas: cell data, journal
-cells = blocks to be updated
-journal = big long tape that records all changes to cell data
-size: if too small, stall a lot; if too big, waste space
-write commit records into journal
-if crash occurs, can look at journal to see which commit records actually happened
> assuming Idea #1 works in order to use Idea #2
+ writes can be contiguous during bursts of activity
+ journal can be on a separate device
+ no need to put cell data on disk at all
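The size tradeoff above (too small means writers stall; too big wastes space) can be sketched by treating the journal as a fixed-size circular tape. The size and names here are invented for illustration:

```c
// Toy circular journal: appends wrap around, and once unreclaimed
// entries fill the tape, writers must stall until post-commit cleanup
// (journal_reclaim) frees space.
#include <assert.h>

#define JSIZE 4                  // deliberately tiny, to show stalling
static int tape[JSIZE];
static int head = 0, used = 0;   // next slot to write; entries still in use

// Returns 0 on success, -1 if the journal is full (caller must stall).
static int journal_append(int rec) {
    if (used == JSIZE)
        return -1;
    tape[head] = rec;
    head = (head + 1) % JSIZE;
    used++;
    return 0;
}

// Post-commit cleanup retires the oldest entry, freeing one slot.
static void journal_reclaim(void) {
    if (used > 0)
        used--;
}
```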
> 2 different ways of doing logs
(1) write-ahead logs
-log all writes you plan to do
-COMMIT, write commit record
-install new values into cell data
+ from user's point of view: able to recover the new data after crash
(2) write-behind logs
-log all old values into journal (write down values you are about to erase)
-install new values into cell data
-COMMIT
*if you crash in middle of installing new data, can still recover old values into cell data
+ from implementer's point of view: replays journal backwards after crash, so often faster, b/c interesting stuff is at the end
*repairs must be idempotent
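A toy write-behind log in C can illustrate the backwards replay and why the repair is idempotent. The cell/journal layout and all names are invented for this sketch:

```c
// Toy write-behind log: old values are logged before cell data is
// overwritten; repair after a pre-COMMIT crash replays the journal
// backwards to restore them.
#include <assert.h>

#define NCELLS 8
static int cells[NCELLS];                   // the cell data
static struct { int cell, oldval; } journal[64];
static int jlen = 0;                        // journal grows forward
static int committed = 0;

static void log_and_write(int cell, int newval) {
    journal[jlen].cell = cell;              // 1. log the OLD value we erase
    journal[jlen].oldval = cells[cell];
    jlen++;
    cells[cell] = newval;                   // 2. install the new value
}

// Repair after a crash: if not committed, replay backwards, restoring old
// values. Running it twice gives the same result, so it is idempotent --
// safe even if we crash again in the middle of recovery.
static void repair(void) {
    if (committed)
        return;                             // committed: new values stand
    for (int i = jlen - 1; i >= 0; i--)
        cells[journal[i].cell] = journal[i].oldval;
}
```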
============================================================================================================================================================================
A more complicated approach
> let user applications "look inside" pending transactions
+ better performance, better parallelism / interleave, lower latency
- need a way to notify apps that a transaction they peeked inside was aborted (e.g. a symbol)
-> cascading aborts
-> compensating actions (undo your mistake once you figure out an abort has happened)
============================================================================================================================================================================
VIRTUAL MEMORY
*Problem: an unreliable program, if tickled incorrectly, will make bad memory references
e.g. GNU grep 2.10 on 64-bit host
if run on a bare machine (weensyos) => whole system fails
*Solutions:
(1) hire better programmers, write better software (works but it's expensive)
(2) runtime checking in compiler generated code (works but runtime cost)
(3) base + bounds register pair
> hardware checks that all memory references are in range, traps otherwise
- must pre-allocate a fixed size of memory (can grow the RAM allotment, but it's awkward)
- amount of RAM limits how many processes you can run efficiently
- memory gets fragmented
- apps can change base + bounds registers (solution: make these registers privileged)
- shared memory not practical
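The hardware's in-range check can be sketched as follows; the register values are illustrative, and real hardware traps rather than returning a boolean:

```c
// Toy model of the base + bounds check: every memory reference must
// land in [base, base + bounds), otherwise the hardware traps.
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uintptr_t base   = 0x10000;  // privileged registers (illustrative values)
static uintptr_t bounds = 0x4000;   // size of the allowed region

// Returns true if the access is allowed; a false result models a trap.
static bool check(uintptr_t addr) {
    return addr >= base && addr - base < bounds;
}
```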
(4) multiple base + bound pairs
1 for text (readonly)
1 for data (read-write)
- when you compile + link 'cat' you must tell it where to run (solution: design app so that it does not care where it is run)
> compile app without using absolute address = position independent code (PIC)
> OR dynamically link or patch
> divide virtual address into two pieces: (1) segment # (2) offset within segment
> segments are varying sizes
(5) break up memory into pages, use fixed size pages (e.g. 4KiB)
> solves the fragmentation problem (segments can't grow/shrink easily; fixed-size pages fit anywhere)
> virtual address split into 2 parts: (1) page # (2) offset within page
> this requires hardware assist
*page tables implemented in hardware:
master tables -> intermediate tables -> actual tables
virtual address split into 3 parts: offset into master table (10 bits), offset into intermediate table (10 bits), offset within page (12 bits)
common situation: a flat one-level table would need ~4 MiB of page table for 1 process; two levels avoid allocating unused intermediate tables
*software emulation of 2-level page tables:
// maps virtual address to physical address
size_t pmap(size_t va) {
    int offset = va & ((1 << 12) - 1);      // low 12 bits: byte offset within page
    int lo = (va >> 12) & ((1 << 10) - 1);  // middle 10 bits: index into intermediate table
    int hi = va >> 22;                      // high 10 bits: index into master table
    size_t *l0page = PAGETABLE[hi];         // master table found via privileged register %cr3
    if (l0page == FAULT)
        return FAULT;
    size_t l1page = l0page[lo];             // entry holds the physical page's address
    if (l1page == FAULT)
        return FAULT;
    return l1page + offset;
}
> when the hardware faults due to a bad page table entry, traps in kernel
> kernel can:
-terminate process
-send signal to process (SIGSEGV) (i.e. kernel arranges for signal handler)
-arrange for invalid access to become valid; arrange for accessed page to be loaded into RAM
> modify page tables to reflect this
> can return from interrupt (RTI), back to failing instruction
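The kernel's choices above can be sketched as a toy fault handler. The `resident[]` array is a hypothetical stand-in for page-table state, and the termination option is elided:

```c
// Toy model of the kernel's page-fault decision: if the faulting page
// exists on disk, load it, fix the page table, and retry the faulting
// instruction (RTI); otherwise the access is invalid and the process
// gets a signal such as SIGSEGV.
#include <assert.h>

enum fault_action { TERMINATE, SIGNAL, SWAP_IN };

static int resident[1024];     // 1 = page is in RAM (toy page-table state)

static enum fault_action handle_fault(int vpn, int on_disk) {
    if (on_disk) {
        resident[vpn] = 1;     // "load" the page, update the page table
        return SWAP_IN;        // then RTI back to the failing instruction
    }
    return SIGNAL;             // invalid access: deliver SIGSEGV
}
```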
* use big programs even if you don't have enough RAM
> treat RAM as a cache for disk
> works if there is good locality of reference; thrashing if the working set doesn't fit in RAM