CS111 Spring 2009

Modularity and virtualization

April 06, 2009 

Julio Ceballos, Lorena Topete, Cody Prestwood



MODULARITY - Split the program into pieces

ABSTRACTION - use "nice" boundaries between pieces
      L lines of code and M modules

Problems with out paranoid word-counter program

A. Performance issues

1. cache answer

2. so IO in parallel  with computation

* will help a bit for out OS but can be better

B. Flexibility

-Most Important ( common needs in Software Engineering)

1. to much of a pain to change // ex: recompile  and install

2. to much of a pain to reuse parts

-Least Important

3. to much of a pain to run several programs at the same time

4. to much of a pain to recover from crash


C. Complexity

1. Coping techniques

* Modularity - break programs into pieces

* Abstraction - use "nice" boundaries  between pieces 

2. Modularity-----> Find bugs Faster 

-assume L lines of codes and  B bugs

- assume B is proportional to L

- assume time to find bugs is proportional to L

-debug time: is proportional to B*L which is proportional to L

*Now assume M modules

-now Debug time is proportional to 

B/M * L/M * M => (L*L)/M

* this is assuming bugs don't cross module boundaries, and modules are the same size, 

  and bugs are evenly distributed 

* Obvious which Module owns the bug

* Modularity can greatly help you find the bugs in your code

* API = application programing interface


Ex: Modularity Example: Rewriting read_ide_sector

void read_ide_sector (int s, int a);  

  ^  ide is only good fro reading ide Disks


void read_sector (int s, int a);  *higher level of abstraction


void read_sector (int s, off_t s, int a);  

  ^   ^          ^  address

  ^   ^  Sector number

  ^ Disk number


ssize_t read_bytes(int s, off_t o, char* a, size_t n);  

^   ^   ^          ^  address

^   ^   ^  offset from Disk start, in bytes

^   ^ Disk number

^ signed, returns # of bytes read or - number for error


OFFICIAL UNIX

ssize_t read(int d, char* a, size_t n);  


off_t lseek( int d, off_t o, int whence);

^ ^ ^

^ ^ ^  0 from beginning; 1 from current; 2 from end

^ ^ signed 

^  returns new seek location


in Unix: 

ssize_t read(int d, char *a, size_t n)

-always read from last seek position

off_t lseek(int d, off_t o, int whence)

o - is positive or negative

off_t is signed int

whence: 0 from beginning

1 current position

2 from end

Unix design:

1. Advantage - can share lseek with read/write functions

2. Disadvantage - Takes more time if application does

 random access


Abstraction of function

read_ide_sector(int s, int a)

s-sector

a-returned result

change to: read_sector(int d, int s, int a)

d-disk ID to support multiple disks

generalize interface: read_sector(int d, off_t s, char *a)

-provides 64-bit support for disks larger that 4Gig

s-offset into disk

a-pointer to receiving buffer

1. OS needs to maintain mapping between device and 

sectors. So device number and secotr number are 

separate arguments

2. Assume secotr is 512 bytes and only one sector per 

read


Positives and Negatives

+ many apps don't care about current seek location

+ share lseek with read and write

+ encourages sequential reads

- take more time if apps generally do Random Access 


**How To enforce Modularity**

1. Do Nothing – use a single program, global variables, no functions

2. Function call modularity where functions are a black box

a. Put in BIOS and the program is called from the BIOS

b. Main is in RAM instead of BIOS

c. Stack has call – this approach has a problem


Partition of memory

-------------------------------------------------------------

| BootRecord | Main |       |Stack|       | BIOS |

-------------------------------------------------------------


Caller and Callee Contract

** Because the  callee can potentially overwrite the caller's memory, 

 a contract exists between the caller and callee preventing this.


Example:  recursive factorial function

int fact (int n)

{

If(n==0)

    return 1;

else

    return n* fact (n-1);

}


In x86 code:

fact:

pushl  %ebp                  # push caller's frame Pointer

movl   $1, %eax              # move constant 1 into ax reg. ( this is preparing the register to return 1)

movl   %esp, %ebf            # move sp into fp

subl   $8, %esp             # allocate our frame

movl   %ebx, -4 (%ebp)       # save caller's bx

movl   8(%ebp), %ebx         # bx: =n

tstl   %ebx, %ebx           # bx=0

jne    L5                    # Jump if not zero


.L1:

movl   -4(%ebp), %ebx      # restore bx

movl   %ebp, %esp           # restore sp

popl   %ebp                 #restore bp

ret                          #restore ip   

                                    

Now let's look at the caller                 

.L5:                 

leal   -1(%ebx), %eax        # ax =bx-1

movl   %eax, %esp            # store n-1 as argument  

call   fact                        # push "*+1" into Stack, go to fact

imull  %ebx, %eax            # ax * =n

jmp    .L1        

                           

Let's practice: 

what happens if we type:   fact (6);

pushl     $6                  # push arg into stack

call      fact

addl      $4, %esp            # discard the 6 we pushed in stack  

      # 4 because a word is 4 bytes long


We can san that we have a contract between the caller & callee.

Caller must:               

 top of stack =ra         

 2nd of stack =n         

 (ip) p.counter = fact   

Callee must:           

 put ans into ax         

 POP stack              

 Put result in ip     

What can go wrong with function call modularity?

callee: step into caller local variables to mess the program up

i.e movl $0, 12(%esp)         # set caller's r. a. to 0

i.e movl $0, 0(ebp)           # set caller's f p to 0

i.e callee can loop

i.e subl $ 1000000000, %esp   # stack overflow

i.e callee can halt


Here, we have SOFT MODULARITY 

 Upside :                         

 It is fast                          

Downside:                      

Does not prevent bugs  


We often want HARD MODULARITY because with this, failure in one module doesn't leak into  the other.

i.e  write an x86 simulator and run callee inside it.

HARD Modularity

Upside:                          

   This is robust and safe 

Downside:                     

   It is very slow       

         

What we want is a virtualizable processor that can do the following:

-full speed on normal instructions

-slow down only when it checks anything weird

-gets control on any "bad" instruction

-executes some software to execute this bad instruction