Lecture 2 Scribe Notes, CS 111, Winter 16

Section
I. A bit more philosophy
II. Why not use an OS?
III. Word counting program

Tradeoffs and scaling are issues that have been with us for a long time (see lecture 1 for more details).

Complexity is yet another issue, and one that the professor didn't cover last lecture.

Moore's Law states that the number of transistors on a chip doubles approximately every two years (so it exhibits exponential growth). This trend has held for decades, but lately it has started to slow down. Note that Moore's Law doesn't describe the maximum number of transistors that could possibly be put on a chip, but rather the economic “sweet spot” for transistor density (the number of transistors at which it is reasonable to mass produce chips).

Kryder's Law (bytes/drive) observes that secondary storage capacity has also experienced exponential growth.

Note that Kryder's and Moore's laws do NOT address speed or performance. Historically, speed and performance haven't grown nearly as fast as storage space. This introduces a problem of incommensurate scaling in system design: operating systems need to deal with increasing complexity, but the hardware they run on isn't getting much faster.

Why exponential growth?

The UNIVAC I and UNIVAC II are one example. The UNIVAC I, one of the first computers, was hand designed, slow, and safe. There was a huge amount of error correction circuitry built into the computer to make sure it worked.

Engineers used the UNIVAC I to design the UNIVAC II by running simulations of the II on the I. Based on those simulations, they were able to determine where error correction was actually needed and eliminate unnecessary bulk elsewhere.

There are many reasons you might not want to use an operating system.

Simplicity

An operating system is a complicated piece of software containing a lot of components that may be unnecessary for what the user aims to do.

Performance

When you run an application on an OS, there are lots of things going on in the background that require processing power. Speed and memory are both heavily affected by the OS.

Reliability (as a consequence of simplicity)

Since there are many things happening at once on an OS, there is more opportunity for things to go wrong.

Security

Security is another important reason, as the following example shows.

Hypothetical situation: Eggert is writing a document. As a paranoid professor, he wants his application to be as simple as possible and doesn't want to use an OS to do the following task: count the number of words in an ASCII text file (bytes '\001' - '\177', 1 - 127)
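Once the file's bytes are in RAM, the counting itself is short. The following is a minimal sketch, not the program from lecture: it assumes that a word is a maximal run of ASCII letters (the notes do not define "word"), and the names count_words and is_letter are hypothetical.

```c
#include <stddef.h>

/* Assumption: a "word" is a maximal run of ASCII letters.
   The notes do not define it; this is only one possible choice. */
static int is_letter(unsigned char c)
{
    return ('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z');
}

/* Count the words in buf[0..len-1] (hypothetical helper). */
size_t count_words(const char *buf, size_t len)
{
    size_t words = 0;
    int in_word = 0;   /* nonzero while the previous byte was a letter */

    for (size_t i = 0; i < len; i++) {
        int letter = is_letter((unsigned char) buf[i]);
        if (letter && !in_word)
            words++;   /* a new word starts at this byte */
        in_word = letter;
    }
    return words;
}
```

Under this definition "don't" counts as two words. A version that reads the file one sector at a time would also need to carry in_word from one sector to the next.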

Computer specs
Core i3-4160 (3 MiB cache, 3.6 GHz)
4 GiB dual channel DDR3 1600 MHz SDRAM
1 TB hard drive, SATA, 7200rpm
Intel HD4400 graphics

Boot process

We run into the bootstrapping problem: how does the computer know to run the program when it starts up? We need something to tell the computer what to do in order to get everything started. When you turn a computer off, cycling power clears the CPU, cache, and RAM. When Eggert restarts the computer, it needs some way of getting his word count program into RAM.

When the computer starts, its instruction pointer is initially set to location 0xFFFF0 (2^20 - 16). That location is in a portion of the physical address space that simply redirects to a region of ROM (better idea → nonvolatile memory) containing instructions hardwired by the manufacturer. We could ask the manufacturer to put the word count program in the ROM, but that is expensive and slow.

A more modern solution is to store some information in EEPROM that will tell the computer what to do when it starts up. First, we need to store the location and size of the program on the hard disk (the manufacturer hard codes these constants). Second, we’ll need a program that loads and executes the word count program (it can do this by taking the word count program from the disk and putting it into RAM).

By convention, things are set up as follows: the first sector on the disk (512 bytes) is the master boot record (MBR). The first 446 of these 512 bytes are x86 code.

The program on the EEPROM performs hardware sanity checks (it examines the CPU and RAM). Then it checks for devices and identifies the first device with an MBR. It recognizes an MBR by the convention that the last two bytes of a master boot record are 0x55 0xAA (0xAA55 when read as a little-endian 16-bit value).
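The signature test can be sketched in a few lines of C. The function name has_mbr_signature and the constant SECTOR_SIZE are hypothetical; the sketch assumes the sector has already been read into a 512-byte buffer.

```c
#include <stdint.h>

enum { SECTOR_SIZE = 512 };

/* Return 1 if the 512-byte sector ends with the bytes 0x55 0xAA,
   which by convention mark a master boot record; otherwise return 0. */
int has_mbr_signature(const uint8_t sector[SECTOR_SIZE])
{
    return sector[SECTOR_SIZE - 2] == 0x55
        && sector[SECTOR_SIZE - 1] == 0xAA;
}
```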

The EEPROM program copies the x86 code section of the MBR to location 0x7C00 in RAM and jumps to it.

Between the 446 bytes of code and the 2-byte signature are 64 bytes, divided into four 16-byte entries. This is called the partition table (it only allows for 4 partitions due to poor design). Each 16-byte entry contains the offset of the partition and the size of the partition, both counted in sectors, plus other info, including a type byte and whether the partition is bootable or not.
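To make the layout concrete, here is a sketch that decodes one entry and looks for a bootable partition. The byte offsets within an entry (status at byte 0, with 0x80 meaning bootable; type at byte 4; starting sector at bytes 8-11; sector count at bytes 12-15, both little endian) follow the conventional MBR layout and are not spelled out in these notes. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Decoded view of one 16-byte partition table entry (hypothetical names). */
struct partition {
    int bootable;           /* nonzero if the status byte is 0x80 */
    uint8_t type;           /* partition type byte */
    uint32_t first_sector;  /* offset of the partition, in sectors */
    uint32_t sector_count;  /* size of the partition, in sectors */
};

/* Read a 32-bit little-endian value from 4 consecutive bytes. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t) p[0] | (uint32_t) p[1] << 8
         | (uint32_t) p[2] << 16 | (uint32_t) p[3] << 24;
}

/* Decode entry idx (0..3) of the partition table in a 512-byte MBR. */
struct partition read_partition(const uint8_t *mbr, int idx)
{
    const uint8_t *e = mbr + 446 + 16 * idx;  /* table starts after the code */
    struct partition p;
    p.bootable = (e[0] == 0x80);
    p.type = e[4];
    p.first_sector = le32(e + 8);
    p.sector_count = le32(e + 12);
    return p;
}

/* Return the index of the first bootable entry, or -1 if there is none. */
int find_bootable(const uint8_t *mbr)
{
    for (int i = 0; i < 4; i++)
        if (read_partition(mbr, i).bootable)
            return i;
    return -1;
}
```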

The firmware boots by reading the MBR & executing the code in the MBR. The MBR examines the partition table and finds a partition that's bootable, then boots that partition. This process is called chain loading.

Chain-loading: firmware -> MBR (OS-agnostic) -> VBR (OS-specific) -> kernel (many sectors) -> your apps

Reading data from the disk

At this point, it’s clear that we need some sort of subroutine for reading data from the disk. The firmware, MBR and word count program will all be able to use it.

The disk controller communicates over the bus through its registers, listed below:

0x1F0 ->        read data
0x1F2 ->        sector count
0x1F3 ->        sector number, low order byte
0x1F4 ->        sector number, second byte
0x1F5 ->        sector number, third byte
0x1F6 ->        sector number, high order byte
0x1F7 ->        status and command register, contains state information for disk controller

Reading a value whose top two bits are 01 from the status and command register (that is, (status & 0xC0) == 0x40) indicates that the disk is ready to receive a command.
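The readiness test used by the code later in these notes can be written as a small predicate. This is only a sketch: the names are illustrative, and reading bit 7 as "busy" and bit 6 as "ready" follows the usual IDE/ATA convention rather than anything stated in the notes.

```c
/* Top two bits of the status byte (illustrative names). */
enum { STATUS_BUSY = 0x80, STATUS_READY = 0x40 };

/* Return 1 if the controller is not busy and is ready for a command.
   The lower six bits of the status byte are ignored. */
int disk_ready(unsigned char status)
{
    return (status & (STATUS_BUSY | STATUS_READY)) == STATUS_READY;
}
```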

As the C library does not have built in functions for x86 I/O, we'll declare one and pretend the rest exist and are implemented.

static inline inb(int a) {
    asm("____");
    …
}

On this machine, int and char * are both 4 bytes.

void read_ide_sector(int sector_num, char* memory_addr) {

Wait until the disk controller reports that the disk is ready:

    while ((inb(0x1F7) & 0xC0) != 0x40)
        continue;

Write 1 to the sector count register, as we only want to read 1 sector:

    outb(0x1F2, 1);

outb writes only one byte, so we have to write each byte of the 4 byte integer, one at a time:

    outb(0x1F3, sector_num);
    outb(0x1F4, sector_num >> 8);
    outb(0x1F5, sector_num >> 16);
    outb(0x1F6, sector_num >> 24);

Issue the read command (0x20) by writing it to the status and command register:

    outb(0x1F7, 0x20);

Wait for the disk to be ready again:

    while ((inb(0x1F7) & 0xC0) != 0x40)
        continue;

Read 128 longs (128 * 4 = 512 bytes, one sector) from the controller's read data register (0x1F0):

    insl(0x1F0, memory_addr, 128);
}
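Since inb, outb, and insl need real hardware, one way to try out the subroutine's logic is to stub them against an in-memory disk. Everything in the sketch below other than read_ide_sector itself is an assumption made for testing: the arrays fake_disk and regs, the stub bodies, and the sizes are hypothetical and only imitate the controller behavior described above.

```c
#include <stdint.h>
#include <string.h>

enum { SECTOR_SIZE = 512, FAKE_SECTORS = 4 };

/* Hypothetical stand-ins for the disk and for registers 0x1F0..0x1F7. */
static uint8_t fake_disk[FAKE_SECTORS][SECTOR_SIZE];
static uint8_t regs[8];

/* Stub: the fake controller always reports "ready" (0x40). */
static uint8_t inb(uint16_t port)
{
    return port == 0x1F7 ? 0x40 : regs[port - 0x1F0];
}

/* Stub: remember the last byte written to each register. */
static void outb(uint16_t port, uint8_t value)
{
    regs[port - 0x1F0] = value;
}

/* Stub: copy count 4-byte units of the selected sector to addr. */
static void insl(uint16_t port, void *addr, int count)
{
    uint32_t sector = (uint32_t) regs[3] | (uint32_t) regs[4] << 8
                    | (uint32_t) regs[5] << 16 | (uint32_t) regs[6] << 24;
    (void) port;  /* the fake controller has only one data register */
    memcpy(addr, fake_disk[sector % FAKE_SECTORS], (size_t) count * 4);
}

/* The subroutine from the notes, unchanged apart from comments. */
void read_ide_sector(int sector_num, char *memory_addr)
{
    while ((inb(0x1F7) & 0xC0) != 0x40)   /* wait until ready */
        continue;
    outb(0x1F2, 1);                       /* read one sector */
    outb(0x1F3, sector_num);              /* sector number, low byte first */
    outb(0x1F4, sector_num >> 8);
    outb(0x1F5, sector_num >> 16);
    outb(0x1F6, sector_num >> 24);
    outb(0x1F7, 0x20);                    /* read command */
    while ((inb(0x1F7) & 0xC0) != 0x40)   /* wait for the data */
        continue;
    insl(0x1F0, memory_addr, 128);        /* 128 * 4 = 512 bytes */
}
```

Because the stubs never report "busy", the two wait loops exit immediately here; on real hardware they would spin until the controller finished.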

Let's say we compile our subroutine and are trying to figure out where to put it. We have three possible locations: firmware, MBR, and the word count program itself.

We could choose to have just one copy, but performance will be better if we have three.

Note: the rest of the word count program is finished in lecture 3.
