James Chin
Michael Policarpio
Corey Harada
CS 111 Lecture Tuesday 9/30/08 Scribe Notes
Write a program to count the number of words in a file, from scratch in machine
code, can't use Eclipse or Microsoft.
1) Standalone program, when you boot machine, it boots right into your
program. Assume it is the only program that runs.
2) Must run fast.
3) We use only our own code: no help from strangers.
4) Turn on the PC, the program runs, and it prints out the number on the screen,
nothing else.
Assume file is a contiguous block of data sitting somewhere on the disk, our
choice of where.
What is a word? Anything matching this regex: [A-Za-z]+
Encoded in ASCII, one byte per character
CPU Architecture: Intel 32-bit x86 API (at least 3 years old)
ATA disk: 200GB (at least 3 years old)
1GB RAM
Filesize: special null byte marks the end of the file, '\0', all bits zero
Future enhancements: need to be able to run other programs too (e.g. print, word
processor), also newer, bigger, faster SATA drive, printer, keyboard, mouse.
Issues:
1) bootstrapping
2) disk input
3) screen output
1) Bootstrapping: x86 boot procedure
physical memory RAM (volatile, gone when power turned off) | ROM (contains BIOS,
still exists when powering off)
hardware points at first byte in ROM, starts BIOS
BIOS tests system, looks for devices, looks for a device that contains a Master
Boot Record (MBR)
MBR:
512 bytes long, divided into three pieces: 0 - 446 bytes, 64 bytes, 2 bytes (511
and 512).
last 2 bytes must look like this: 0xAA55 (55, then AA, because of little
endian)
446 bytes contains x86 machine code, instructions
64 bytes is partition table, divided into four 16-byte entries. Each entry
describes an area of this disk, says status (e.g. bootable), start, size.
Chain loading: BIOS reads MBR into a location 0x7c00 in RAM. MBR code
reads Volume Boot Record (VBR) off a bootable partition & executes it.
BIOS is OS independent. VBR is OS dependent/specific.
VBR will read our word-counting program into RAM.
Where is the OS (word counter) on disk? Not MBR, Not VBR. Let's say
sector 1000 of the booted partition through 3047 (adjustable)
Where is OS in RAM? Not BIOS. Into ROM at 0xF0000. Not 0x7c00.
Put it at 0x100000 (VBR machine code will put it here, both VBR and OS knows
this)
Memory:
0x7c00 is where MBR and VBR live (within 512 bytes)
From F0000 to FFFFF is where the BIOS lives.
From 0x100000 to 0x1100000 contains the OS.
Disk:
MBR partitions disk into four pieces (0, 1, 2, 3), MBR sitting at beginning of
disk.
Partition 2, first 1000 bytes set aside, followed by OS.
Send this to Courseweb, don't email to professor.
2) Disk Input:
7200 RPM drive, 120 Hz
Average Seek Time 8 ms
Average Rotational Latency 4 1/6 ms
Average Transfer Rate: 500 Mb/s
Cost: $0.50/GB
Disk controller is attached to the bus. We talk to the disk controller and
give it instructions and returns us data.
Location: CPU can:
0x1F1 read this to find out controller status (store into here to issue a
command)
0x1F2 # of 512 byte sectors to read
0x1F3-6 sector offset
VBR code:
for (i = 0; i < 2048; i++)
read_ide_sector(i + 1000, 0x100000 + i*512);
void read_ide_sector(int s, int a) {
while ((inb(0x1F7) & 0xc0) != 0x40)
continue; // waiting for disk controller to be ready
outb(0x1F2, 1);
outb(0x1F3, s & 0xff);
outb(0x1F4, (s >> 8) & 0xff);
outb(0x1F5, (s >> 16) & 0xff);
outb(0x1F6, (s >> 24) & 0xff);
outb(0x1F7, 0x20);
while ((inb(0x1F7) & 0xc0) != 0x40)
continue; // waiting for disk controller to be ready
insl(0x1F0, a, 128); // insert long, copies disk data into our RAM register, via
CPU
}
compile this and get the machine code
MAIN OS (w.c.)
#include <stdbool.h>
#include <ctype.h>
int main(void) {
int nwords = 0; // count of the number of words
bool in_word = false;
for(;;) {
char buf[512];
read_ide_sector(s, (int)buf); // convert from char to int
for (int j = 0; j < 512; j++) {
if (buf[j] == '\0') {
nwords += in_word;
write out (nwords); // print the answer out
while (true)
continue; // after answer is printed on the screen, loop forever
}
bool this_alpha = is_alpha((unsigned char)buf[j]);
nwords += ~this_alpha & in_word; //~ is bitwise negation (NOT)
in_word = this_alpha;
}
}
Screen is 80 x 25 characters
Memory mapped I/O for screen, starting at 0xB8000, 25 lines, each contains 80
ASCII character representations
ASCII char, followed by color
void write_out(int n) {
unsigned char * screen = 0x58000 + 40*25*2 + 40; // offset output by 40 chars on
the screen to the right
do {
screen[0] = n % 10 + '0';
screen[1] = 7; // gray color on black
screen -= 2; // working backwards, every char takes two bytes to represent
n /= 10;
} while(n);
Performance Note: Currently this program reads and counts, reads and
counts. Would be faster to read more into a buffer.
We could use DMA (direct memory access) instead of insert long (insl), to copy
data from the disk controller directly to RAM, without going through the CPU.