Last time: Figured out how to boot the word counting program but couldn't get it to do anything.
void output_to_screen(long n) {
  /* WRONG - SEE NOTES FOR EXPLANATION
  long *p = (long *) 0xb8000;
  *p = n;
  */
  short *p = (short *) 0xb8000 + 80*25/2 - 80/2;  /* 0xb8000 is VGA text-mode display memory; start near the middle of the 80x25 screen */
  do {
    *--p = (7 << 8) | (n % 10 + '0');  /* high byte: gray-on-black attribute; low byte: ASCII digit, written right to left */
    n /= 10;
  } while (n != 0);
}
void main(void) {
  long nwords = 0;
  bool inword = 0;
  int s = 1000;
  char buf[512];
  for (;;) {
    read_ide_sector(s++, buf);
    for (int j = 0; j < sizeof(buf); j++) {
      if (!buf[j])
        done(nwords);  /* a NUL byte marks the end of the data */
      bool isletter = ('a' <= buf[j] && buf[j] <= 'z') ||
                      ('A' <= buf[j] && buf[j] <= 'Z');
      nwords += isletter & ~inword;  /* count a word when a letter follows a non-letter */
      inword = isletter;
    }
  }
}
void done(long nwords) {
  output_to_screen(nwords);
  halt();
}
Suppose our app is a cryptography app.
This requires lots of CPU power, especially if there is a 10,000-bit key.
The CPU's time is split between doing I/O and computing ECC (elliptic curve cryptography), so neither the CPU nor the disk is ever fully busy.
A solution to get the CPU to 100% utilization: double buffering.
Double buffering is the act of reading the next buffer's data while processing the current buffer.
This can almost double performance when the time needed to read a buffer and the time needed to process it are about the same.
Suppose we applied double buffering to our word count program. Will we get double the performance?
No, because our computation is very fast compared to our reading (reading from a hard disk drive is relatively slow).
It would go from:

  | read 0 | count 0 | read 1 | count 1 | ...          total time ≈ N * (R + C)

to:

  | read 0 | read 1  | read 2 | ...
           | count 0 | count 1 | ...                   total time ≈ N * max(R, C) ≈ N * R

which is not much of a performance gain, because the counting time C is tiny compared to the read time R.
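For concreteness, here is a rough sketch of what a double-buffered version of our main loop might look like. It assumes three hypothetical helpers we would have to build ourselves: start_read_ide_sector(), which starts a disk read and returns without waiting (e.g., built on DMA plus interrupts); wait_for_read(), which blocks until that read has finished; and count_buffer(), which runs the inner counting loop from main() above (updating nwords and the in-word state, and calling done() when it sees the terminating NUL byte):

void main(void) {
  char buf[2][512];                            /* two buffers instead of one */
  long nwords = 0;
  int s = 1000;
  int cur = 0;
  read_ide_sector(s++, buf[cur]);              /* fill the first buffer (blocking) */
  for (;;) {
    start_read_ide_sector(s++, buf[1 - cur]);  /* start reading the next sector... */
    count_buffer(buf[cur], &nwords);           /* ...while counting words in this one */
    wait_for_read();                           /* make sure the next buffer has arrived */
    cur = 1 - cur;                             /* swap buffers */
  }
}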
Would it ever make sense to do triple buffering?
Yes, if you had 3 different actions to be performed at the same time, e.g. read, write, and compute.
How else can we speed up?
Suppose instead of just having 1 file, we have 10 files on 10 disk drives, and we want to count all of them at once.
We can speed this up with multitasking, where we do the I/O in parallel and the computation in sequence. As long as the computation is at least 10x faster than the reading, the computation for all 10 files can be done during the parallel reading of the next 10 files.
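A rough sketch of this idea, combining it with double buffering; start_read(drive, sector, buf), which begins a non-blocking read on one drive, and wait_all_reads(), which waits for the whole batch, are again hypothetical primitives we would have to implement ourselves (count_buffer() is the helper from the earlier sketch):

#define NDRIVES 10

void count_all(void) {
  static char buf[2][NDRIVES][512];     /* a double buffer for each drive */
  long nwords[NDRIVES] = { 0 };
  int s = 1000;
  int cur = 0;
  for (int d = 0; d < NDRIVES; d++)     /* prefetch the first batch, one read per drive */
    start_read(d, s, buf[cur][d]);
  wait_all_reads();
  for (s++; ; s++) {
    for (int d = 0; d < NDRIVES; d++)   /* start the next batch of 10 parallel reads... */
      start_read(d, s, buf[1 - cur][d]);
    for (int d = 0; d < NDRIVES; d++)   /* ...and count the current batch meanwhile */
      count_buffer(buf[cur][d], &nwords[d]);
    wait_all_reads();
    cur = 1 - cur;                      /* swap buffer sets */
  }
}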
If we wanted all of these performance increases for our word count program, we would need to write the code for multitasking, DMA, etc. ourselves, which would be very difficult and time consuming.
Additionally, if we wanted the same tricks in other programs that we write, we would have to rewrite everything, so this approach is very hard to generalize and scale up.
How can we scale up these programs?
How can we get fancier performance tricks without rewriting every application?
How could we design a read function?
char *readline(FILE *f);
This is a BAD DESIGN for an OS interface: a line can be arbitrarily long, so the OS would need to allocate a buffer of unbounded size (and it is unclear who frees it), and it assumes every file is made up of lines.
Now let's try approaching it from the other direction: good performance, but not as simple:
void read_ide_sector(int s, char *buf);   /* simple, but no way to report failure */
bool read_ide_sector(int s, char *buf);   /* can report whether the read succeeded */
int read_ide_sector(int s, char *buf);    /* can also report an error code saying why it failed */
int read(int byte_offset, char *buf, int bufsize);

Since bufsize is an int, this function is limited to reading 2^31 bytes = 2 GiB at a time.

ssize_t read(off_t byte_offset, void *buf, size_t bufsize);

Here bufsize must be a value that can be returned as an ssize_t.

We want to generalize this so that it can handle any type of file.
UNIX read:
ssize_t read (int fd, void *buf, size_t bufsize);
There is no byte_offset in this implementation. The above read function treats every file as a stream of bytes so that it will work with any type of file. To do this, it gets rid of byte_offset, which inherently gets rid of random access.
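As an illustration, here is the word-counting loop rewritten on top of the UNIX interface. Because read() returns the number of bytes it actually read and returns 0 at end of file, we no longer need sector numbers or a terminating NUL byte. (A sketch; fd is assumed to be an already-open file descriptor.)

#include <stdbool.h>
#include <unistd.h>

long count_words(int fd) {
  char buf[512];
  long nwords = 0;
  bool inword = false;
  ssize_t n;
  while ((n = read(fd, buf, sizeof buf)) > 0) {  /* 0 means end of file */
    for (ssize_t j = 0; j < n; j++) {
      bool isletter = ('a' <= buf[j] && buf[j] <= 'z') ||
                      ('A' <= buf[j] && buf[j] <= 'Z');
      nwords += isletter && !inword;             /* a letter after a non-letter starts a word */
      inword = isletter;
    }
  }
  return nwords;
}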
To get back random access, we have another function:
off_t lseek(int fd, off_t offset, int whence);
So now you can move the disk arm (or flash read pointer) to a certain byte in the file with lseek and start reading from there, implementing random access (Performance++, Simplicity--).
One problem with this: if I want to do a lot of random-access reads, I have to make two system calls for each read.
Solution: another function
ssize_t pread(int fd, void *buf, size_t bufsize, off_t offset);
This combines lseek and read to implement random access reading (Simplicity++).
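A small sketch of the difference, assuming fd is an already-open file descriptor: both fragments fetch the same 512 bytes starting at byte offset 1024, but pread does it in one system call and leaves the file offset untouched:

#include <unistd.h>

void example(int fd) {
  char buf[512];

  /* random access with two system calls */
  lseek(fd, 1024, SEEK_SET);
  read(fd, buf, sizeof buf);

  /* the same read with one system call */
  pread(fd, buf, sizeof buf, 1024);
}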
So far we have talked about where to split things up, but not how to do it.
This working example of creating a word count program without an operating system emphasizes the benefits of having an OS. All of the suggested ways to speed up the program, such as DMA, double buffering, and multitasking, require a lot of code to make them function correctly; without an operating system, we would need to write all of this code ourselves, and possibly modify it for every program that we write. The solution to these problems is modularity, i.e. splitting up the program into smaller pieces, and abstraction, i.e. finding the natural divisions in a program. These two concepts are evident in most well-known operating systems, and Professor Eggert went on to describe the aspects by which one can measure the quality of modularity and abstraction in a given OS.
One aspect, simplicity, measures how easy an OS is to learn to use and to write code for. Another, robustness, describes how well a system behaves under harsh conditions. Performance measures how fast a program can run, and is usually what modularity costs us. Lastly, flexibility describes how well an OS can be used for many different tasks. Professor Eggert then iterated on the design of a read function, trying to balance these aspects as well as possible, ending up with the UNIX implementation of read, which he described as "perfect". He ended the lecture by introducing different mechanisms by which modularity can be achieved: function calls, client-server interaction, and virtualization.