CS 111 Lecture Notes

Lecture 2 – Jan 9, 2008

By: Jonathan Yang, Bryan Higgins, and Zachary Slavis

 

Modularity

Modularity is defined by Wikipedia as "the property of computer programs that measures the extent to which they have been composed out of separate parts called modules." Essentially, modularity is the idea of splitting a job into separate sections, so that the same job can be performed efficiently in many different ways.

 

How can we measure the qualities of using a modular approach?

·       Performance – In general, modularity hurts performance somewhat.

·       Flexibility – Can you reuse the components in many ways?

·       Robustness – Are the interfaces well designed, so that bugs are prevented from spreading, and are there interfaces for debugging components?

·       Simplicity – Are the modules easy to learn and use (i.e., is the manual short)?

To demonstrate these qualities, a simple program (with little modularity) will be designed, and then we will keep increasing the modularity by making the program more versatile and flexible.

 

Our simple program has the following scenario: A paranoid English professor is doing top-secret research on Shakespearean literature. Her research includes counting the number of words in Shakespeare's plays and looking for patterns among the word counts. Since the professor does not want anyone to steal her research, she doesn't trust any commercially available operating system, such as Windows. She will design a standalone program that runs at boot. When the machine powers on, it should count the words in Hamlet and then output the word count to the screen. Her machine is an x86 platform with 1 GB of RAM, a 500 GB ATA disk, and a monitor. Hamlet will be stored (wherever you choose) on the hard drive as a contiguous ASCII text file. For our purposes, a word is any maximal run of characters in [A-Za-z]. The end of the file is marked by a null byte ('\0').

 

It is always a good idea to ask, "What future changes might you want?" We proposed the following possible future changes:

·       Fancier pattern matching

·       Fancier character set

·       Multiple processes

·       Better user interface

·       Multiple files √

·       New drives √

The professor decided that only the last two changes would be necessary (marked with √). These features will be added after the simple program is finished.

 

 

Issues

 

Now that we know what we need to design, here are the issues that we have:

·       Bootstrapping – How do we get the program to boot after powering on the machine?

·       Reading from disk – What commands do we use to access the disk (so we can retrieve the Hamlet file)?

·       Writing to display – How do we output text to the screen (so we can display the word count)?

 

Bootstrapping

 

Bootstrapping is the process of building something complicated from something simpler. On a computer, booting begins when the machine is powered on: the Program Counter (PC) initially points into an EEPROM (Electrically Erasable Programmable Read-Only Memory) chip that holds a small program called the BIOS (Basic Input/Output System). The BIOS can be changed by flashing the EEPROM with a new version, but it typically comes installed on the machine at purchase.

 

The BIOS, once started, tests the system, looks for devices, and (hopefully) finds a device that looks like a disk. It reads the first sector of that device, the Master Boot Record (MBR), which contains a table of four partition entries of 16 bytes each. Each entry records the partition's start, end, type, and whether it is bootable. Once a bootable partition is found, its first sector, called the Volume Boot Record (VBR), is loaded and executed.

 

Structure of a Master Boot Record

Address   Address   Description                                          Size in
(hex)     (dec)                                                          bytes
-------   -------   --------------------------------------------------   --------
0000      0         Code area                                            max. 446
01B8      440       Optional disk signature                              4
01BC      444       Usually nulls; 0x0000                                2
01BE      446       Table of primary partitions                          64
                    (four 16-byte entries, IBM Partition Table scheme)
01FE      510       MBR signature, first byte: 55h                       2
01FF      511       MBR signature, second byte: AAh
                    (the two bytes together read as 0xAA55)

MBR, total size: 446 + 64 + 2 = 512

(Table from Wikipedia: http://en.wikipedia.org/wiki/Mbr)
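
The table above can be turned into a small lookup routine. The following is a minimal sketch, not the BIOS's actual code: the function name find_bootable_partition is hypothetical, and the layout inside each 16-byte entry (a status byte at offset 0, where 0x80 marks a bootable partition, and a little-endian starting sector at offsets 8 to 11) is the conventional MBR entry layout rather than something stated in these notes.

```c
#include <stdint.h>

enum {
    MBR_SIZE = 512,
    MBR_TABLE_OFFSET = 446,   /* 0x01BE in the table above */
    MBR_ENTRY_SIZE = 16,
    MBR_ENTRY_COUNT = 4,
    MBR_BOOTABLE = 0x80       /* assumed status value for "bootable" */
};

/* Return the index (0-3) of the first bootable partition entry and store
   its starting sector in *start_lba. Return -1 if the signature is
   missing or no entry is marked bootable. */
int find_bootable_partition(const uint8_t mbr[MBR_SIZE], uint32_t *start_lba)
{
    /* The signature bytes 55h and AAh sit at offsets 510 and 511. */
    if (mbr[510] != 0x55 || mbr[511] != 0xAA)
        return -1;

    for (int i = 0; i < MBR_ENTRY_COUNT; i++) {
        const uint8_t *e = mbr + MBR_TABLE_OFFSET + i * MBR_ENTRY_SIZE;
        if (e[0] == MBR_BOOTABLE) {
            /* Bytes 8-11: first sector, least significant byte first. */
            *start_lba = (uint32_t)e[8] | (uint32_t)e[9] << 8 |
                         (uint32_t)e[10] << 16 | (uint32_t)e[11] << 24;
            return i;
        }
    }
    return -1;
}
```

Given a 512-byte copy of sector 0, the caller would read the VBR from the sector number stored in *start_lba.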

 

Disk Drive Basics

 

This section explains how the mechanics of a disk drive work. Let's assume we have a 500 GB disk drive that spins at 7200 RPM (notice that the rotational speed of a disk drive is still quoted in revolutions per minute; for some reason the convention never switched to Hz). The drive consists of several platters, circular disks that hold data magnetically, stacked on top of each other. Each platter surface is divided into concentric tracks and has its own read head, which moves across the surface to reach a given track. Data on the drive is divided into sectors of 512 bytes each.

 

[Figure: hard disk diagram (harddisk.png), from alasir.com/books/hards/01-01.png]

A typical transfer rate for a hard disk drive is about 500 Megabits per second. However, before transferring the data, the drive must first find the information on the tracks. Therefore, there will be an average wait time before receiving data. The rotational delay for our drive will be on average:

 

(60 seconds/minute) / (7200 revolutions/minute) * ½ revolution = 4.17 ms

The typical seek time is about 6 ms, so the average wait is the seek time plus the rotational delay: about 10 ms in our case.
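
The arithmetic above can be written as two small functions. This is only a sketch of the calculation; the function names are made up for illustration, and the 6 ms seek time is the typical figure quoted above rather than a measured value.

```c
/* Average rotational delay in milliseconds: on average the wanted sector
   is half a revolution away from the head. */
double rotational_delay_ms(double rpm)
{
    double ms_per_revolution = 60.0 * 1000.0 / rpm;
    return ms_per_revolution / 2.0;
}

/* Average wait before data starts flowing: seek time plus rotational delay. */
double average_access_ms(double seek_ms, double rpm)
{
    return seek_ms + rotational_delay_ms(rpm);
}
```

Calling average_access_ms(6.0, 7200.0) gives roughly 10.17 ms, which the notes round to 10 ms.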

 

Hard disk space costs about $0.20 per GB on average, whereas RAM costs about $15 per GB.

Notice that a GB of RAM is on the order of 100 times more expensive than a GB of hard disk space.

 

 

Accessing The Disk

 

High-level Overview:

  1. Wait for the disk to be ready by reading I/O port 0x1F7 (the status register)
  2. Store the number of sectors we want to access (in this case always 1) to 0x1F2
  3. Store the sector offset, one byte at a time, to 0x1F3 through 0x1F6
  4. Store the READ command to 0x1F7
  5. Wait for the hard drive to be ready again
  6. Read the resulting sector through the CPU and store it into RAM

 

 

Boot Loader for the VBR

 

We are assuming here that the program occupies only 19 sectors' worth of code.

 

 

int i;
for (i = 1; i < 20; i++)                                        // start from 1 because sector 0 is the VBR
            read_ide_sector (i, (char *) 0x100000 + (i - 1) * 512);   // cast the address to char * so the byte arithmetic type-checks
((void (*) (void)) 0x100000) ();                                // jump to our program's address by calling it as a function

 

 

 

Read IDE Sector Function

 

 

void read_ide_sector (int s, char *a) {
            // s is the sector number, a is the address to read into
            while ((inb (0x1F7) & 0xC0) != 0x40)     // "check for ready" loop: mask the top two status bits and
                        continue;                    // wait until busy (0x80) is clear and ready (0x40) is set
            outb (0x1F2, 1);                         // store the value 1 into the number-of-sectors register
            outb (0x1F3, s & 0xFF);                  // set the sector offset registers, low byte first
            outb (0x1F4, (s >> 8) & 0xFF);
            outb (0x1F5, (s >> 16) & 0xFF);
            outb (0x1F6, (s >> 24) & 0xFF);
            outb (0x1F7, 0x20);                      // issue the READ command (0x20)
            while ((inb (0x1F7) & 0xC0) != 0x40)     // again wait until the device is ready
                        continue;
            insl (0x1F0, a, 128);                    // 0x1F0 is the data register, a is the destination address,
                                                     // and 128 is the number of 4-byte words to read
                                                     // (1 sector = 512 bytes = 128 words)
}

 

 

 

Main Program

 

 

#include <ctype.h>       // isalpha
#include <stdbool.h>     // bool, true, false

void main(void) {
            int words = 0;
            bool inword = false;
            int s = 1000000;                         // we arbitrarily chose this sector as the start of the file
            for ( ; ; s++) {                         // move on to the next sector on each pass
                        char buf[512];               // create a buffer to read into
                        read_ide_sector (s, buf);    // read the IDE sector into our buffer
                        for (int j = 0; j < 512; j++) {
                                    if (buf[j] == '\0') goto eof;                              // at the end of the file, quit
                                    bool this_alpha = isalpha ((unsigned char) buf[j]) != 0;   // is this character part of a word?
                                    words += !inword & this_alpha;                             // count a word only at its first letter
                                    inword = this_alpha;                                       // update inword
                        }
            }
eof:
            display (words);                         // show the result (the number of words) on the display device
}
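
To check the counting logic without any disk hardware, the inner loop can be pulled out into a function that works on an ordinary buffer. This is a sketch under that assumption; count_words and its parameters are hypothetical names, not part of the lecture's program. The inword state is passed in and out so that a word split across two sectors is counted only once.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Count words in buf[0..len-1], stopping early at a null byte.
   *inword carries the "inside a word" state from one buffer to the next.
   Returns true if the null terminator was seen, false otherwise. */
bool count_words(const char *buf, size_t len, bool *inword, int *words)
{
    for (size_t j = 0; j < len; j++) {
        if (buf[j] == '\0')
            return true;
        bool this_alpha = isalpha((unsigned char)buf[j]) != 0;
        if (this_alpha && !*inword)
            (*words)++;              /* a new word starts here */
        *inword = this_alpha;
    }
    return false;
}
```

Feeding it "To be, or no" followed by "t to be" counts six words in total, because the state carried between the two calls keeps "not" from being counted twice.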

 

 

 

Memory Mapped Display

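The display is memory mapped: storing into a particular range of memory addresses changes what appears on the screen. Each character cell on the screen takes two bytes. The first byte is an ASCII text character. The second is the color of that cell.

The screen has 25 rows of 80 character cells at 2 bytes each, so there are 25*80*2 = 4000 bytes in total.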
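
The layout above implies a simple formula for where a given cell lives: (row * 80 + column) * 2 bytes from the start of the buffer. The sketch below illustrates that formula on an ordinary array; cell_offset and put_cell are hypothetical helper names, and on the real machine the buffer would start at 0xB8000.

```c
#include <stddef.h>
#include <stdint.h>

enum { TEXT_ROWS = 25, TEXT_COLS = 80, BYTES_PER_CELL = 2 };

/* Byte offset of the cell at (row, col) from the start of the text buffer. */
size_t cell_offset(int row, int col)
{
    return (size_t)(row * TEXT_COLS + col) * BYTES_PER_CELL;
}

/* Store one character and its color byte into a text buffer. Here `screen`
   can be any array of TEXT_ROWS * TEXT_COLS * BYTES_PER_CELL bytes. */
void put_cell(uint8_t *screen, int row, int col, char ch, uint8_t attr)
{
    size_t off = cell_offset(row, col);
    screen[off] = (uint8_t)ch;       /* first byte: ASCII character */
    screen[off + 1] = attr;          /* second byte: color */
}
```

For example, the cell in row 0, column 10 sits at byte offset 20, and the last cell (row 24, column 79) at offset 3998.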

 

 

Display Function

 

 

void display (int i) {
            unsigned char *screen = (unsigned char *) 0xB8000 + 20;   // starting position on screen (byte offset 20 = 11th cell)
            do {
                        screen[0] = (i % 10) + '0';   // lowest remaining digit as an ASCII character
                        screen[1] = 7;                // gray text on black
                        screen -= 2;                  // writing from back to front
                        i /= 10;
            } while (i != 0);
}

 

 

 

Programmed IO vs. Direct Memory Access (DMA)

 

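In programmed I/O, the disk must go through the CPU to access memory: the CPU reads each word from the disk controller and stores it into RAM itself, as insl does in read_ide_sector.

 

In DMA, by contrast, the disk controller can access memory directly, so data moves from the disk to RAM without passing through the CPU.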

 

 

Improving Performance of Program

 

1. Read multiple sectors at once

2. Increase activity (keep the CPU and the disk busy at the same time)

            Current activity involves: read, wait, count words, read, wait, count words, etc.

If we implement read-ahead, the CPU can count the words in one sector while the disk reads the next sector in advance.
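
A rough timing model shows how much read-ahead can help. This is an idealized sketch with made-up function names: it assumes every sector takes the same I/O time and the same counting time, and that I/O for one sector overlaps perfectly with counting the previous one.

```c
/* Total time to process n sectors when each read must finish before
   counting starts, and counting must finish before the next read. */
double serial_time(int n, double io, double cpu)
{
    return n * (io + cpu);
}

/* Total time with one-sector read-ahead: while the CPU counts sector k,
   the disk fetches sector k+1, so each middle step costs max(io, cpu).
   The first read and the last count cannot overlap with anything. */
double read_ahead_time(int n, double io, double cpu)
{
    if (n <= 0)
        return 0.0;
    double step = io > cpu ? io : cpu;
    return io + (n - 1) * step + cpu;
}
```

With 4 sectors, 3 time units of I/O and 1 of counting per sector, the serial version takes 16 units and the read-ahead version 13. The gain is largest when I/O and counting take about the same time.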

 

What's Wrong with the Current Program?

 

1. It's a pain to change.

2. It's hard to reuse in other programs.

3. Several programs cannot be run simultaneously.

4. Faults will break the program.

 

Improving Modularity of This Program

 

Function that needs to be improved:

void read_ide_sector (int s, char *addr);

 

What if we want to read from devices other than a hard drive?

void read_sector (int s, char *addr) {
            switch (device) {                   // device is assumed to be a global naming the current device
                        case IDE:               // current code
                                    break;
                        case CDROM:             // other code
                                    break;
            }
}

 

We can also do this:

void read_sector (int d, int s, char *addr);       // d selects which device to read from

 

What if we want to change the number of bytes read or where to start reading?

// the function is renamed from read_sector to read because the amount we read is no longer tied to sectors

void read (int d, int o, char *addr, int len);      // o is the byte offset into the device, len is the number of bytes to read

 

What if there was a hardware failure in reading the text?

int read (int d, int o, char *addr, int len);         // return type changed to int to report whether the read succeeded

 

An int is only 32 bits, so a signed byte offset can reach only about 2 GB into the device (about 4 GB even if unsigned). Our hard drive is 500 GB. How do we fix this?

int read (int d, off_t o, char *addr, size_t len); // assuming off_t and size_t are both 64-bit types
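
To see the final interface in action without real hardware, here is a minimal sketch in which each device is just an array of bytes in memory. Everything in it is an assumption made for illustration: the name dev_read (chosen to avoid clashing with the POSIX read function), the devices table, the use of uint64_t in place of off_t, and the convention of returning 0 for success and -1 for failure.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical device table: each device is just a byte array here. */
struct device {
    const unsigned char *data;
    uint64_t size;
};

enum { NDEVICES = 2 };
static struct device devices[NDEVICES];

/* Copy len bytes starting at byte offset o on device d into addr.
   Returns 0 on success, -1 on a bad device number or an out-of-range read. */
int dev_read(int d, uint64_t o, char *addr, size_t len)
{
    if (d < 0 || d >= NDEVICES || devices[d].data == NULL)
        return -1;
    if (o > devices[d].size || len > devices[d].size - o)
        return -1;
    memcpy(addr, devices[d].data + o, len);
    return 0;
}
```

The caller no longer needs to know about sectors or I/O ports: it names a device, a byte offset, and a length, and checks the return value to find out whether the read worked.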