Valid HTML 4.0 Transitional

CS 111 Week 8

By Yu Peng (603444925)


Write (fd, buf, bufsize) during the write: crash + reboot

For files



10,000,000 -bufsize


ABCDE

DEFGH


ABC


MNOP

IJKL



What about directories?


Rename (“a”, “b”)

a:27


b:65

before

alpha beta

0


b:27

after I


b:27


0

After II



*whichever scenario possible of crash

alpha (crash) beta

0


6:65

= unlink (“a”)

// lost info



beta (crash) alpha

a:27


b:27

= unlink (“b”); link(“a”,”b”)

Problems: link FSCK count wrong




Proposed POSIX model

RAM in memory cache of:


Write(…) applies to cache.


.

Syscalls:

Slow move diskhead:

Fsync(fd) sync cache with (data(in data block)+attributes(in inode)) permanent store

=

Fdatasync(fd) same as fsync except cheaper


As a result, took syncout of Emacs unless it is interactive


hutinPOSIX sync()


idea for atomic rename() etc

1) Commit Record


Rename(a, bbbbbb):

BEGIN: (low level operations)

Pre-commit phase:

can back out,

leaving no trace

COMMIT (atomic) low level atomic write of one block

Make higher level action happen

Post-commit

Can not back out

Can improve performance for next time

Cleanup, optimization

END


2) JOURNAL

a) cell data, each cell is a block

b) simplest mode: infinitely long; append only for storage (for reads); random access (for reads)


one way to use journal

log the change you want to make

log & commit BEGIN

write the data 13 = “B”

Log END 37: “0”


*write-ahead log


Delicate place to crash COMMIT END

Recovery procedure: replays the log left to right


Problems:

What happens if a crash occurs during FSCK?

Should recover using idempotent operations

Radical Proposal: (log structure file system)

Keep journal on disk

Put cell data cache in RAM

+writes are fast w/c no seeking

-reads can take a while (if out of cache)


Write-behind lock (same example of wife-ahead)

13: “A” “B” BEGIN

37: “B”0 13: “A”

Fsck replays log 37: “B”

Backwards now, make changes to cells

(forward for write-ahead) COMMIT END


Example research file system:

120PB 200,000 hard drives (each 600 GB) fast drives

GPFS General Parallel File System

Distributed meta data (good performance –unlike fdfs) little power


Efficient directory indexing

Distributed locking

Partition awareness

System stays live during fsck


Next Problem: unreliable processes bad memory references

A[i]=27



Solutions:

-better programmers (doesn’t work)

-software check

(tell compiler to insert subscript checks e.g. Java Virtual Machine)

(performance problems)

-get help from hardware

(add SUBSCRIPT in to hardware)

-even this is too slow


*idea: base + bounds register to hardware


Hardware checks all memory access to fit in here ^

If the check fails, same as executing invalid instruction traps

Goes to kernel

Problem: you must know in advance how much memory needed fragmentation

Can’t grow a process

No absolute addresses in machine code must use position independent code (PIC)

Sometimes we want shared memory


Segments

Multiple base bond pairs, 1/segment

*requires hardware support

Each process has its own segment table

Segment have varying sizes

*use pages to solve segmentation/fragmentation problem


Pages

Like segments, but all segments are the same size (e.g. 4KiB)



Kernel

Give app page table that points to other memory, but not itself.