How NOT to enforce modularity

Typical Bootup sequence:

|---------------|
|---------------|
|---------------|
|-Boot Sector--| <- 7C00   This loads the Volume Boot Record
|---------------|
|---------------|
|-Vol Boot R.--| <- 8D00     The VBR, in turn, talks to the kernel, which loads the program
|---------------|
|---------------|
|----Kernel--- | <- C000
|---------------|


Once the program gets loaded, memory now looks like this

|---Reserved---| <-- this area is reserved for booting. It has: I/O Register, Boot Program, Kernel
|---------------|
|---------------|
|***Program**| <- 8000
|***********|
|***********|
|***********|
|---------------|
|Kernel------ | <- C000
|---------------|


x86 Machine Code (shown here as pseudocode)

// The for loop reads the VBR into main memory, one 512-byte sector at a time.
// The goto then executes the program in the VBR by jumping to 0x80000000.
for (i = 0; i < volume_boot_record_size; i++) {
        // read_sector (sector#, address); the sector size is 512 bytes
        read_sector (volume_origin + i, 0x80000000 + i*512);
}

goto 0x80000000;

Who needs to know how to read a sector? The boot sector, VBR, kernel, application, and EEPROM.

Each of these has its own copy of the implementation of read_sector.
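To make the boot loop concrete, here is a minimal sketch in C that runs on an ordinary machine. It assumes a hypothetical `struct fake_disk` (an array in memory standing in for the disk) and a `read_sector` that just copies bytes; a real boot sector would talk to the disk controller instead, and would finish by jumping to the load address. The names `fake_disk` and `load_vbr` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512

/* Hypothetical stand-in for a disk: an array of sectors in ordinary memory. */
struct fake_disk {
    const uint8_t *data;   /* sector 0 starts at data[0] */
    uint32_t nsectors;     /* total number of sectors on the "disk" */
};

/* Copy one sector into memory at dest. Returns 0 on success, -1 if the
   sector number is out of range. A real boot sector would program the
   disk controller here instead of calling memcpy. */
static int read_sector(const struct fake_disk *d, uint32_t sector, uint8_t *dest)
{
    assert(d != NULL && dest != NULL);
    if (sector >= d->nsectors)
        return -1;
    memcpy(dest, d->data + (size_t)sector * SECTOR_SIZE, SECTOR_SIZE);
    return 0;
}

/* The loop from the pseudocode: read vbr_sectors sectors, starting at
   volume_origin, into consecutive 512-byte slots beginning at load_addr. */
static int load_vbr(const struct fake_disk *d, uint32_t volume_origin,
                    uint32_t vbr_sectors, uint8_t *load_addr)
{
    for (uint32_t i = 0; i < vbr_sectors; i++) {
        if (read_sector(d, volume_origin + i,
                        load_addr + (size_t)i * SECTOR_SIZE) != 0)
            return -1;
    }
    return 0;   /* a real loader would now jump to load_addr */
}
```

Note that every stage that wants to read the disk needs its own `read_sector`, which is exactly the duplication described above.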


Example Code

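void write_int_to_console (int n)      // takes a number and writes it onto the console, digit by digit
{
        uint16_t *p = (uint16_t *) 0xb8014;   // a cell in the memory-mapped display buffer

        while (n) {
                *p-- = '0' + n%10;            // store the last digit, then move one cell to the left
                n /= 10;
        }
}


This is memory-mapped I/O: storing to these addresses puts characters on the screen. Potential bugs: negative numbers are not handled, and n == 0 prints nothing.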
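One way to deal with those bugs is sketched below. This is not the version from lecture: it writes into an ordinary array of 16-bit cells instead of the real display address, so it can run anywhere, and the name `write_int_to_cells` and its bounds arguments are assumptions made for the example. Like the original, it stores only the character code and ignores the attribute byte that a real text-mode display keeps in the upper half of each cell.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write n in decimal so that its last digit lands in cell `last`, moving
   left. `first` is the leftmost cell we are allowed to touch. Returns a
   pointer to the leftmost cell written, or NULL if the number did not fit
   (in which case some digits may already have been written). */
static uint16_t *write_int_to_cells(int n, uint16_t *first, uint16_t *last)
{
    assert(first != NULL && last != NULL && first <= last);

    /* Work with the magnitude; unsigned negation is safe even for INT_MIN. */
    unsigned int mag = n < 0 ? 0u - (unsigned int)n : (unsigned int)n;
    uint16_t *p = last;

    for (;;) {                      /* runs at least once, so 0 prints "0" */
        *p = (uint16_t)('0' + mag % 10);
        mag /= 10;
        if (mag == 0)
            break;
        if (p == first)
            return NULL;            /* out of room for the next digit */
        p--;
    }
    if (n < 0) {
        if (p == first)
            return NULL;            /* out of room for the sign */
        *--p = (uint16_t)'-';
    }
    return p;
}
```

For example, calling it with -120 and an 8-cell array fills the last four cells with '-', '1', '2', '0'.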

int fact(int n)
{
        if (!n)
                return 1;
        else
                return n*fact(n-1);
}

Assembly Language Translation

fact:
        pushl   %ebp                // save the caller's ebp
        movl    $1, %eax            // eax = 1 (the result if n is zero)
        movl    %esp, %ebp          // ebp = esp
        subl    $8, %esp            // allocate 8 bytes on the stack
        movl    %ebx, -4(%ebp)      // save the caller's ebx
        movl    8(%ebp), %ebx       // ebx = n
        testl   %ebx, %ebx          // is n zero?
        jne     .L5                 // if not, go to the recursive case
.L1:
        movl    -4(%ebp), %ebx      // restore the saved ebx
        movl    %ebp, %esp          // restore the saved esp
        popl    %ebp                // restore the saved ebp
        ret                         // pop the return address and jump to it
...


Visual representation of what this does

|///////////////////////////////|
|///////UNUSED////////|
|///////////////////////////////| <---esp
|-----your frame-----|
|---------------------|
|---------------------|<---ebp
|///////////////////////////////|
|///////////////////////////////|
|///////////////////////////////|


|///////////////////////////////|
|///////////////////////////////|
|///////////////////////////////| <---esp <---ebp
|-----your frame-----|
|---------------------|
|---------------------|
|///////////////////////////////|
|///////////////////////////////|
|///////////////////////////////|


|///////////////////////////////|
|------8 bytes--------|<---esp
|///////////////////////////////| <---ebp
|-----your frame-----|
|---------------------|
|---------------------|
|///////////////////////////////|
|///////////////////////////////|
|///////////////////////////////|

fact allocates 8 bytes on the stack as soon as it is called

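caller:
        pushl   $5                  // push the argument
        call    fact                // pushes the return address, then jumps to fact
        addl    $4, %esp            // pop the argument


.L5:
        leal    -1(%ebx), %eax      // eax = ebx - 1
        movl    %eax, (%esp)        // store the argument n-1
        call    fact
        imull   %ebx, %eax          // eax = n * fact(n-1)
        jmp     .L1                 // back to the epilogue (restore registers, ret)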

|----------3-----------|
|-------fact | ebx------|
|-------fact | ebp------|
|---------6000--------|
|----------4-----------|
|-------main (ebx)-----|
|-------main (ebp)-----|
|---------5000--------|
|----------5-----------|


caller/callee contract:

  1. Do not modify or use the stack outside of your frame
  2. When you're done, return to the return address (ra)
  3. Result should be put in %eax
  4. Don't mess with other registers (unless you restore them)

What happens if you don't follow these rules?
  1. You can mess with the caller's mind (corrupt its data)
  2. You may not return to the caller / go somewhere else
  3. Overflow the stack / set sp to garbage
  4. Callee can loop forever
  5. Callee can execute the HCF (halt and catch fire) instruction
  6. Callee can have a buffer overflow
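A toy model shows how little it takes to break the contract. The sketch below does not touch the real machine stack (that would be undefined behavior in C). Instead it assumes a made-up `struct toy_stack` that grows downward like the x86 stack, with the argument and a pretend return address pushed by the caller. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_WORDS 16

/* A toy stack that grows downward, like the x86 stack in the notes.
   sp is the index of the most recently pushed word. */
struct toy_stack {
    int words[STACK_WORDS];
    size_t sp;
};

static void toy_init(struct toy_stack *s) { s->sp = STACK_WORDS; }

static void toy_push(struct toy_stack *s, int v)
{
    assert(s->sp > 0);
    s->words[--s->sp] = v;
}

/* Polite callee: only touches its own frame, below its entry sp. */
static void polite_callee(struct toy_stack *s)
{
    size_t entry_sp = s->sp;
    toy_push(s, 111);            /* a local variable in the callee's frame */
    s->sp = entry_sp;            /* pop the frame before returning */
}

/* Rude callee: writes one word above its frame, which is where the
   return address sits in this toy layout. */
static void rude_callee(struct toy_stack *s)
{
    size_t entry_sp = s->sp;
    toy_push(s, 222);
    s->words[entry_sp] = 0xBAD;  /* clobbers the return address slot */
    s->sp = entry_sp;
}

/* Push an argument and a pretend return address, run the callee, and
   report whether the return address survived (1) or not (0). */
static int return_address_intact(void (*callee)(struct toy_stack *))
{
    struct toy_stack s;
    toy_init(&s);
    toy_push(&s, 5);             /* argument, like pushl $5 */
    toy_push(&s, 5000);          /* return address, like 5000 in the diagram */
    callee(&s);
    return s.words[s.sp] == 5000;
}
```

Nothing in the toy machine stops `rude_callee`, just as nothing in the real calling convention stops a real callee from doing the same thing.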

This is what we call "soft modularity". It is enforced only by politeness and convention, and relies on cooperation.

What we need is modularity that works regardless of the politeness of the person using it. This is known as hard modularity: the abstraction layers are enforced.


HARD MODULARITY


       client/service: multiple computers

       caller (client):
       send (fd, {"!", 5});       // a message from caller to callee; fd is the file/socket descriptor for the communications link
       receive (fd, response);    // response is the buffer holding the reply
       if (response.opcode == "ok")
                print (response.val);
       else
                print ("error");

       callee (service):
       while (1) {
             receive (fd, request);              // fd identifies the link, request is the buffer
             if (request.opcode == "!") {
                   int n = request.val;
                   int result = 1;
                   for (int i = n; i > 0; i--)
                         result *= i;
                   response = {"ok", result};
             } else {
                   response = {"bad", 0};
             }
             send (fd, response);
       }
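The request/response logic above can be tried out without a network. The sketch below is one possible rendering in C, under some assumptions that are not in the notes: a hypothetical `struct message` as the wire format, a `serve_one` function holding the body of the service loop, and a direct function call standing in for send and receive. The range check on the argument is also an addition, so that the result fits in a 32-bit int.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wire format for the factorial service. */
struct message {
    char opcode[4];   /* "!" for a request; "ok" or "bad" for a response */
    int val;
};

/* One iteration of the service loop, minus the send/receive calls:
   turn a request into a response. */
static struct message serve_one(struct message request)
{
    struct message response = { "bad", 0 };
    /* 12! is the largest factorial that fits in a 32-bit int. */
    if (strcmp(request.opcode, "!") == 0
        && request.val >= 0 && request.val <= 12) {
        int result = 1;
        for (int i = request.val; i > 0; i--)
            result *= i;
        strcpy(response.opcode, "ok");
        response.val = result;
    }
    return response;
}

/* The client side: build a request, "send" it, and interpret the reply.
   Returns 0 and stores the answer on success, -1 on error. */
static int remote_fact(int n, int *answer)
{
    assert(answer != NULL);
    struct message request = { "!", n };
    /* The direct call below stands in for send followed by receive. */
    struct message response = serve_one(request);
    if (strcmp(response.opcode, "ok") != 0)
        return -1;
    *answer = response.val;
    return 0;
}
```

The point of the design is that the client sees only messages: whatever the service does internally, the worst it can hand back is a bad response.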

What are the benefits of this scheme?
+ It uses hard modularity, which means you don't rely on the politeness of the user

Disadvantages?
- You assume there is a good link
- No recursion
- What if the service becomes overloaded?
- The callee can still loop forever; because of this, the caller's receive call should take a timeout: receive (fd, response, timeout)
- More resources are needed to implement it => not cheap enough!
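A receive with a timeout can be sketched with the POSIX `poll` and `read` calls. The function name `receive_with_timeout` and its return-value convention are assumptions made for this example, and it only works on systems that provide `poll`.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Wait up to timeout_ms milliseconds for data on fd, then read at most
   len bytes. Returns the number of bytes read, 0 if the peer closed the
   link, -1 on error, and -2 if nothing arrived before the timeout. */
static long receive_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    assert(buf != NULL);
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;      /* poll itself failed */
    if (ready == 0)
        return -2;      /* timed out: the service may be looping, or the link is down */
    return (long)read(fd, buf, len);
}
```

A timeout only tells the caller that no answer came in time. It cannot tell a looping service apart from a slow one or a broken link, so the caller still has to decide what to do next.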

HARD MODULARITY APPROACH #2


 

VIRTUALIZATION (you run the callee in a fake simple computer)

=> Too slow! (It can emulate a different kind of computer, though.) You can get hardware support for "very little" slowdown. For instance, in a pure interpreter every "load" operation is read and checked by the interpreter; with hardware support, the hardware actually performs the "load" itself.

Is there any way to solve this slowdown problem? Yes!

VIRTUALIZE PROCESSOR

- Special hooks let the kernel take control when the "emulated" program does something questionable:
      - HCF
      - Loop (timer interrupt)
      - Bad access
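The three hooks can be illustrated with a toy interpreter. This is the slow, pure-software form of the idea: a virtualizable processor would run ordinary instructions directly and only hand control to the kernel in these three cases. The instruction set, the `run` function, and the use of an instruction budget in place of a timer interrupt are all made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>

enum opcode { OP_LOAD, OP_ADD, OP_JMP, OP_HCF, OP_EXIT };
enum outcome { DONE, TRAP_HCF, TRAP_TIMER, TRAP_BAD_ACCESS };

struct insn {
    enum opcode op;
    int arg;   /* address for OP_LOAD, constant for OP_ADD, target for OP_JMP */
};

#define MEM_WORDS 8

/* Run prog on a toy one-register machine. The interpreter plays the role
   of the kernel's hooks: it regains control on a forbidden instruction,
   on an out-of-range access, or when the instruction budget (standing in
   for the timer) runs out. The register value is left in *acc. */
static enum outcome run(const struct insn *prog, size_t len,
                        const int mem[MEM_WORDS], long budget, int *acc)
{
    assert(prog != NULL && mem != NULL && acc != NULL);
    size_t pc = 0;
    *acc = 0;
    while (budget-- > 0) {
        if (pc >= len)
            return TRAP_BAD_ACCESS;      /* ran off the end of the program */
        struct insn in = prog[pc++];
        switch (in.op) {
        case OP_LOAD:
            if (in.arg < 0 || in.arg >= MEM_WORDS)
                return TRAP_BAD_ACCESS;  /* address outside the isolated domain */
            *acc = mem[in.arg];
            break;
        case OP_ADD:
            *acc += in.arg;
            break;
        case OP_JMP:
            if (in.arg < 0 || (size_t)in.arg >= len)
                return TRAP_BAD_ACCESS;
            pc = (size_t)in.arg;
            break;
        case OP_HCF:
            return TRAP_HCF;             /* questionable instruction */
        case OP_EXIT:
            return DONE;
        }
    }
    return TRAP_TIMER;                   /* "timer interrupt": ran too long */
}
```

A program that jumps to itself forever comes back as TRAP_TIMER instead of hanging the whole machine, which is the property the client/service scheme needed a timeout to approximate.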

Process = a program in execution in an isolated domain. Underneath the isolated domain there is a virtualizable processor (virtual computer).

|------------------------------------|
|   open/read          Application   |
|------------|                       |
| OS Kernel  |        add/mult.      |
|------------|-----------------------|
|              Hardware              |
|------------------------------------|

- The boundary between the Application and the OS Kernel, together with the boundary between the Application and the Hardware, forms the Virtual Computer Interface.

- Crossing the boundary between the Application and the OS Kernel is expensive, so the # of calls across it is kept small. The # of calls between the OS Kernel and the Hardware is large, but not as large as the # of calls between the Application and the Hardware.

for (;;) {
        char c;
        if (sys_read (0, &c, 1) == EOF)      // <-- this is a very slow call, due to disk access latency
                break;
        process(c);
}

Speed-Up Approaches:

      Speculation: Chew up otherwise-unused resources now, in hopes that the results will be needed shortly (for example, read more data than was asked for and keep it in a cache).

      Problem: Cache Coherence

      1. Making sure the cache agrees with the primary copy

      2. Deciding what to do when they don't agree
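Speculation applied to the one-byte-at-a-time read loop looks like read-ahead buffering: ask the slow primitive for a whole block and hand out bytes from the block afterwards. The sketch below assumes a hypothetical `slow_read_fn` callback with read-like semantics (it returns the byte count, 0 at end of input). The `mem_source` type exists only so the example can count how many slow reads were issued.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 512

/* Hypothetical slow primitive: reads up to len bytes and returns the
   count (0 at end of input, negative on error), like a read system call. */
typedef long (*slow_read_fn)(void *ctx, char *buf, size_t len);

struct reader {
    slow_read_fn slow_read;
    void *ctx;
    char buf[BUF_SIZE];   /* the cache of speculatively read bytes */
    size_t pos, end;      /* unread bytes are buf[pos] .. buf[end-1] */
};

static void reader_init(struct reader *r, slow_read_fn fn, void *ctx)
{
    r->slow_read = fn;
    r->ctx = ctx;
    r->pos = r->end = 0;
}

/* Return the next byte (0..255), or -1 at end of input or on error. */
static int reader_getc(struct reader *r)
{
    assert(r != NULL && r->slow_read != NULL);
    if (r->pos == r->end) {
        /* Speculate: fetch a whole block though only one byte is needed now. */
        long n = r->slow_read(r->ctx, r->buf, BUF_SIZE);
        if (n <= 0)
            return -1;
        r->pos = 0;
        r->end = (size_t)n;
    }
    return (unsigned char)r->buf[r->pos++];
}

/* A stand-in data source that counts how many slow reads were issued. */
struct mem_source {
    const char *data;
    size_t len, off;
    int calls;
};

static long mem_read(void *ctx, char *buf, size_t len)
{
    struct mem_source *m = ctx;
    size_t n = m->len - m->off;
    if (n > len)
        n = len;
    for (size_t i = 0; i < n; i++)
        buf[i] = m->data[m->off + i];
    m->off += n;
    m->calls++;
    return (long)n;
}
```

Reading 1000 bytes this way costs 3 slow reads (512 bytes, 488 bytes, then end of input) instead of 1001. The coherence problem shows up immediately, though: if the primary copy changes after a block has been read ahead, the bytes still sitting in `buf` are stale, and this sketch does nothing about it.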