CS 111

Scribe Notes for 03/03/08

by Tharaka Somaratna and Tom Wright

RAID 4

Figure 1.

Every write goes to two disks: the data disk and the parity disk. The data and parity writes can proceed in parallel. When a disk crashes, its contents are recomputed via XOR of the remaining disks.

Disadvantages

  • (-) Tolerates only a single disk failure.
    (+) Solution: more parity disks.
  • (-) Needs at least 3 drives.
  • (-) Assumes failures are detectable.
  • (-) The parity disk can be a bottleneck (a hot spot; RAID 5 attempts to address this).
  • (-) Cost of the extra drive.

Disk Failure Notes

Figure 2.

Seagate’s quoted MTTF = 300,000 hours ≈ 34 years

Figure 3. Shows Google's annual failure rate (AFR) of disks.

Figure 5. Disk failure rate usually follows a bathtub curve. Comparing Google's AFR with the bathtub curve, the initial hump of the bathtub curve may be absorbed by the manufacturer's testing (burn-in) stage of disks.

Figure 6. RAID 4 will have a low AFR if failed disks are replaced quickly.

Disk Scheduling Algorithms

Want: high throughput (lots of data; keep the disk busy) and no starvation. These are competing goals.

Simple model: N blocks, numbered 0,...,N-1. The cost (latency + seek time) of moving from block i to block j is |i-j|.

Given: a set of blocks b0,...,bm-1 to write (each block number in the range 0,...,N-1), and the current head position h (0 <= h < N).

Questions and Answers
  • Q) Which block should we write next?
    A) If high throughput is the goal: choose the block closest to h (the shortest seek time first (SSTF) algorithm) -> leads to starvation.
  • Q) What if avoiding starvation is the goal?
    A) First come, first served (FCFS): choose b0. [avoids starvation] [no robustness problems]
Compromises
  • Answer 3)

    SSTF + FCFS in batches: take a batch, run SSTF on the batch, then take the next batch.

  • Answer 4)

    Elevator algorithm. Direction: +/-1. Pick the closest block in the current direction; switch direction if there is none. Unfair to requests at each end of the disk.

  • Answer 5)

    One-way elevator algorithm (sometimes called the circular elevator). Direction = +1 at all times. Does not bias toward the middle of the disk, but gives less throughput.

  • Answer 6)

    (Orthogonal suggestion) Anticipatory scheduling: guess what future requests will be. If future requests are near h (the current location of the disk head), wait for them.
    (-) Can guess wrong, which increases cost.
    (-) Complicates disk ordering and robustness.
    (-) Latency cost in some cases (delay between writes).
    (+) Can greatly increase throughput.
    (+) Does not cost much in the battle against starvation.

Virtual Memory

Addresses several problems at once

Programs address memory they shouldn't

Possible solutions

- Hire better programmers

- Use language with runtime checking

-- Speed problem

- Base+bounds register in hardware

-- Common in stripped down systems

-- Is this reliable?

--- Per-process base+bounds

--- Privilege is required to set those registers

-- Is this sharable?

--- Multiple threads within a single process? yes

--- Share read-only parts of program?

---- Requires multiple base-bounds pairs

----- (segments)

-- Problems with base bounds pairs

--- Forcing relocatable code costs a bit

--- Forces you to pre-allocate memory (fixed-size programs)

----- If we assume no sharing

------ Memory references are relative to base: hardware adds base

----- If we have sharing, things get trickier because it's hard to determine base

----- To avoid the problem:

------ all your code must use relative jumps

------ e.g. gcc -fpic (position independent code)

---- Solve this problem via an extra level of indirection

---- Physical memory references almost completely decoupled from logical ones

-- Typical model of program address space

Example code from a section I labeled "2-level page table"

 size_t p (size_t vpn) {
	size_t hi = vpn >> 10;
	size_t lo = vpn & ((1 << 10) - 1);
	size_t *lopg = PAGE_TABLE[hi];
	if (!lopg) return fault;	/* no lower-level table: page fault */
	return lopg[lo];
 }

How to prevent a process from cheating

(gaining access to RAM that it shouldn't)

1) setting %cr3 is privileged (this register points to the process's page table, i.e., its address space)

2) don't let processes see their own page tables

3) each page table entry contains a few spare bits (access permission bits)

Procedure that hardware follows when a page is absent

1) page fault

2) kernel takes control at a well specified location

3) Same rules apply for INT and other faults

What kernel can do

when informed of the faulting address it can:

1) kill process

2) schedule a read from disk, later resume process once page is in RAM

3) on first write: allocate disk space & RAM

----- this can fail if the OS over-allocates pages (AIX does this); the process can be killed due to lack of swap space