Using the gzip command to compress file(s) may result in data loss.
Example (previous lecture):
$ gzip foo will execute the following code:
1 inputfd = open("foo",O_RDONLY);
2 if (inputfd < 0)
3 { error; }
4 outputfd = open("foo.gz", O_WRONLY|O_CREAT|O_EXECL, 0600);
5 if(output(fd < 0))
6 { error; }
7 read(inputfd, buf, bufsize); // + check for I/O errors
8 write(outputfd, buf, bufsize); // + check for I/O errors
9 if(close(inputfd) != 0)
10 { error; }
11 if(close(outputfd) != 0)
12 { error; }
13 if(unlink("foo") != 0)
14 { error; }
// if the program crashes here, foo would be lost!
15 exit(0);
Solution (synchronous):
// error checking omitted
1 dfd = open(".", O_SEARCH);
2 fd = openat(dfd, "foo.gz", O_WRONLY|O_CREAT|O_EXECL, 0600);
3 fsync(fd); // save write (slow)
4 fdatasync(dfd); // save file creation (slow)
5 close(fd);
6 unlinkat(dfd, d"foo"); // unlink within contents of dfd
Use:
$ gzip --synchronous [arg]
Implementing the synchronous flag has two implications:
Two methods: commit record and journal
1. Commit Record: a tentative method of ensuring write jobs are done in completion (assumes individual sectors can be written atomically)
writes to sectors | 1 | 2 | 3 | ... | n (final write) | n+1 (commit record) |
---|
Writes to the sectors occur sequentially with "n" being the final write.
The "n+1" write is a commit record that indicates all of the write jobs have been completed.
If the write jobs cannot be completed (e.g. power out) before the commit record is written, the entire batch is discarded.
Reboot: If the system is rebooted after the commit record is written, the system will re-write from the beginning of the commit record to the end of the commit record.
2. Journal: memory is divded into a circular buffer (log) and an array of data blocks (cell memory)
log start | write 1 | write 2 | write 3 | ... | write n | commit | log end |
---|
cell data | data block 1 | data block 2 | data block 3 | ... | data block n |
---|
Pre-commit: intended writes are appended to a log that tracks which data blocks to write to/from, the log has a start and end point
Post-commit: once the commit is reached, the system will execute the writes (to/from cell memory) specified in the log
Recovery/Reboot:
the system must scan backwards (from log end) for the oldest completed log entry, then from that point begin re-executing until log end is reached
Pre-commit: old log values are stored into the log
Post-commit: any discrepancies between the log and cell memory are resolved by updating the cell memory to match the log
Recovery/Reboot:
the system does one scan backwards and updates cell data accordingly
Main-memory database (variant): all data and log entries kept in main memory (RAM) for speed and persistency
Example: Ext4 filesystem (journaling, fast read/writes, no fragmentation) vs. VFAT (windows compatibility)
How namei distinguishes between two filesystems: refer to an array of filesystem-inode pairs (dev_t filesystem, ino_t inodenumber) to ensure correct directory searching
How to Speed Up Journaling: store log (sequentially read) and cell memory on separate disks
How to prevent invalid memory references:
Hardware Method: use two kernel-level hardware registers as boundaries; every access to memory must fall between them
base (start of program) | accessible memory | bounds (end of program) |
---|
Context switching: requires switching the base and bounds registers
memory address: | 8-bit segment number (0-255) | 24-bit segment offset |
---|
segment number | base | bounds |
---|---|---|
0 | address | address |
1 | address | address |
2 | address | address |
... | ... | ... |
255 | address | address |
Per-process paged based memory addressing that only requires a base address
Divide 32-bit virtual addresses into a 22-bit virtual page number (VPN) and 10-bit virtual page offset (VPO)virtual address: | 22-bit VPN | 10-bit VPO |
---|
A 22-bit VPN is sent to VM hardware onboard that translates it into physical page, then the VPO is added to obtain a physical address
Uses for Virtual Memory:
How malloc works: malloc is implemented via mmap
mmap(fdToMmapFrom, offsetIntoFile, sizeOfRegion, VMaddr) allows a process to control its own page table
fdToMmapFrom can point to source /dev/zero to malloc zeroed-out space
Memory Layout
View RAM as cached swap space (split into pages) where page table entries contain pointers to pages in RAM and can be swapped out for pages in the RAM swap space
RAM: | swap page 1 | swap page 2 | swap page 3 | ... | swap page n | program data |
---|
page table |
---|
page 1 |
page 2 |
page 3 |
page 4 |
... |
page n |