CS111 Winter 2016 Lecture 2 - Abstractions and Bootstrapping

Notes by: Richard Min

Lecture Date: Wednesday, January 7 2015


Table of Contents

  1. Philosophy & Increase in Complexity
  2. How to not make an OS

Continuation on Philosophy

Computer systems have an issue that traditional systems do not: Complexity.

Computers are continuing to grow in complexity, preventing people from understanding the entire system.
Causes of complexity:

Moore's Law

Moore's Law states that the number of transistors in an mass-produced commercial circuit density doubles approximately every year, dramatically increasing our bits per chip.
MooreGraph
Note: this graph is using a log scale.


There has been roughly exponential growth for decades; however it has recently tailed off and we seem to be reaching the end of Moore's Law.

Eggert's comment regarding this:
"This is a relief ... You cannot put money into a bank, and expect to get 5% interest forever ... 1.051 billion is a big big number, and something's got to give."
Paul Eggert, January 7 2016

As we can see, we have recently hit 4 billion transistors commercially, which causes our complexity.

Classic Explanation of Moore's Law: The reasoning behind exponential growth

An explanation for Moore's law is illustrated through the first computer, UNIVAC I. Built using vacuum tubes, it was full of error checking circuitry that verified that the results were outputted correctly, slowing down the circuitry immensely. This computer was designed by hand, and was slow and safe.

After designing this computer, when designing the UNIVAC II they used the UNIVAC I to help simulate the hardware to adjust the tolerances and decide on when error correction circuitry is necessary based on the simulations they ran. This demonstrates the idea that the amount of technology we have at any given point in time helps us design the next device:

dtechdtlatex

Though this idea is kind of bogus, it demonstrates how our operating systems need to deal with increasing complexity.


Further Reading:
  1. Wikipedia Article
  2. Original Paper


Besides memory, another thing that also creates complexity is:

Kryder's Law

KryderGraph

Similarly to Moore's Law, Kryder's law describes how mass-produced commercial disk drive storage capacity doubles approximately every year.



However, while the complexity of our systems increase, our speeds are not increasing proportionally. Neither the access time of the disk nor the memory clock cycles have grown nearly as fast as the storage space and amount of memory available, respectively.
So, we have to take advantage of our more complex systems without computer speed to match: these constraints come from the hardware, and must be managed in our operating system.
We have to deal with greater and greater complexity, but the machine and disk drive aren't getting all that much faster.
This is incommensurate scaling, mostly from the complexity area.


Specific Example: How to NOT make an OS

Let's say we're trying to develop a system, and don't want to use an operating system (OS) to do it. Possible reasons we might want to do this are:
  1. Simplicity - Just having a program is simpler than having an operating system and a program
  2. Performance - Faster speed and less memory to directly interface with the hardware, than to interface with an OS.
  3. Reliability - Consequence of Simplicity: you know the entire scope of what happens.
  4. Security - Paranoia that someone has malicious code in the OS(or exploitable code)

Specifications

Let's assume that we have a paranoid English professor who is writing a grant paper and is afraid that his peers are trying to steal the ideas in this grant paper. Thus, he decides to hire you to write a machine for him, as he doesn't trust Windows to not have bugs that would cause security bugs. He only wants it to run your code: that way it's secure from any hacking attempts. You should also try to minimize the amount of code the program runs.

The feature that he wants in his program is a program that reads his file off of disc and shows the word count. This is to ensure that the program is under the size limit of 50,000 words. We want the program to run immediately on launch, and can assume that the file is smaller than the disk size.

Computer Specs:
The computer we're running the program on is a 10 year old desktop, with the following specs:
The file is written as an ASCII text file: each byte is a character, with the top bit off and no null bytes. The bytes range from \001 to \177 in hexadecimal. Each word is a series of alpha character followed by a non-alpha character.
The file is sitting on the disk drive in adjacent sectors on the drive.
The disk drive is a traditional disk drive, we can think of it as an array of 512 byte sectors. This means that to do the word count, we look at the sectors one at a time and figure out how many words are in the sector. The user interface for our program is as follows: we want to turn the power on, the program should count the number of words in the file, and display that onto the screen. To load the program, we need it be in RAM, but to load something into RAM from disk we need instructions on where to load it in the first place.

Bootstrapping

BIOS (Basic Input/Output System)

Located in the EEPROM, figures out what the computer does when the computer is turned on. This is accessed through a certain address on the ROM (say ... 0xf0000) that is run by default, and isn't actually RAM. Usually has instructions relating to basic I/O along with booting a Master Boot Record.
Cycling power causes the CPU and volatile memory to reset itself, and there's nothing in the CPU which knows the operating system and code that you want to run. We have to figure out a way to get our word count memory to load to memory.

EEPROM (Electrically Erasable Programmable Read-Only Memory)

Programmable Read Only Memory is one that you can change if you want to, though this usually involves great effort. This memory is non-volatile, and remembers what you've placed in it through power cycles, but is relatively small. We use this in the BIOS to get our programs to begin booting.
However, we don't put our word count program here because it's a hassle (if you mess up while overwriting you can greatly break your computer). Instead, we place it on disk.

MBR (Master Boot Record)

The program than proceeds to look for the Master Boot Record. This is 512 bytes by convention, and consists of the following components: 446 bytes of the Boot program in x86 code, 4*16 = 64 bytes of the partition table, and 2 bytes for the signature.
MBRgraph

This follows the little Endian convention. The signature must match 0xAA55 to be recognized as the MBR to function.

The BIOS will continaully look through the first disk it finds until it finds a volume with this boot record: note that if you have data that happens to match the signature, it will try to boot it in the format. To not have this issue, it's usually the first sector on the disk.

MBR instructions consist of checking RAM for parity errors, CPU checks, and making sure the hardware is functional; if so, it looks for the devices in order of the bus, using instructions to see if devices exist.

The x86 get copied into 0x7c00 memory address to be executed. The partition table tells information about the rest of disk by convention: it only consists of 4 entries because they assumed that nobody would ever partition their device into more than 4 parts.


Partition table entries consist of: After booting into memory, the BIOS looks at the partition table, looking for a bootable partition and boots off the first record in the partition. This process is called chain loading.

A summary of this process on normal computers is: the firmware boots reading the MBR, which reads the partition table into memory, boots into that Volume Boot Record (VBR) which is OS specific, and this VBR loads a kernel which loads programs.
In our case, instead of this standard path we want to reduce the levels of code as much as we can, so we try to boot from the firmware to the MBR to the WC.

Loading Data from Disk

We have the CPU talk over the bus with the device using the inb and outb instructions, where it reads bytes from their registers.

Using this, we define the following subroutine:
void read_ide_sector(int s, char* a)//sector s, mem address to read into
{
	inb(0x1f7); //goes to the disk controller at this location, model this call as subroutine
		    //in c, we could do this with something like static inline inb(int a) { asm }
		    //really turns into a single C instruction that the C compiler will generate, looks efficient

		    //even though this is one instruction, the instruction is slow
		    //because the CPU has to send signals over the wire to some device that is a long ways away
		    //has to then twiddle its thumbs and waiting for the disk controller to wait up
		    //relatively slow by machine instruction standards, not going to be executing billions of these per second
		    //Says to please retrieve the byte on the bus, the corresponding device address is that bus address
		    //Everything on the bus has an address that can load from and store into the bus addresses using the inb instructions,
		    //and the outb instruction is the reverse

		    //The disk controller is another computer, whose little computer is a program that you dont know what it is
		    //has some registers that it exposes to the bus, and has memory associated with it; talking directly to the disk drive
		    //read the manual, looked to see what this device is: 0x1f7 is the status register (one byte register)
		    //information about the current state of the disk controller, in particular if you look in that status register
		    //8 bits in the byte: looks this is ready, otherwise it's dead or busy


	//make sure that the disk drive is ready

	while((inb(0xf17) & 0xc0) != 0x40) //only care about top two bytes, make sure this is true, 0xf17
		continue; //do this for style for him lol

	//now the disk drive is ready
	//1f7 is the status, cmd register, read from it is status, write to it is doing a particular command
	//1f2 is the sector count, this is how we read a sector
	//(can read multiple sectors at once, by putting a 3 on the end)
	outb(0x1f2, 1); //read one sector

	//Another register 
	//1f0 read data (and written)
	//1f2 sector count
	//1f3 Sector number to read from or right to
	//1f4 Sector number to read from or right to
	//1f5 Sector number to read from or right to
	//1f6 Sector number to read from or right to
	//1f7 status, cmd
	//little Endian machine: low order byte is 3, high order byte is 6

	//we know the sector number to read from is copy the integer into that value there, we do that with outb instructions
	outb(0x1f2, s&0xff); //only bottom two bytes
	outb(0x1f4, s >> 8); //24 bit number, outbyte instruction extracts the bottom 8 bits of that
	outb(0x1f5, s >> 16);
	outb(0x1f6, s >> 24);

	//these are one byte registers on the bus
	//Told everything it needs to know for the reads, what to read from and how many to read
	outb(0x1f7, 0x20);

	//made the disk controller busy to do the read command, now we need to wait for it to actually read the data
	//0x20 stands for the READSECTOR command
	while((inb(0xf17) &  0xc0) != 0x40) continue;

	//Takes this loop put it into another loop and a function //can be called wait_for_ready
	//Ready because moved disk around, arm at right spot, data rotated around,
	//pulled data out and put it into the disk controller cache
	//but we want the data! We can't look at the data directly as its not in our RAM

	//We get it using:
	insl(0x1f0, a, 128);
	//insert long, copy it into our buffer (a) 
	//how many words we want to copy (128), (512/sizeof(int)
}



Now that we have defined this subroutine, we compile and place this into RAM so we can run it. We can place it in 3 locations: firmware, MBR to boot into memory, or our WC program. Normally this is placed into the firmware, because many programs will need this functionality, especially if we're memory starved.

However, this causes a problem for future machines: you still have to use this slow piece of code even if you come up with better ways of reading from disk, so normal OS have their own version of this to do other things in parallel. This approach is mostly left to small machines where I/O performance isn't a bottleneck.
Next lecture: completion of the wc program; how to implement and write it to screen.