Jan 12th, 2015
by Xinran Chen, Ivy Wang and Yingjia Lee
Our original program worked by reading in a block of input, waiting for the read to finish, processing that block, and repeating. Double buffering overlaps the reading and processing tasks so that they happen concurrently. In the word count program, processing takes much less time than reading, so with double buffering the program would effectively always be reading.
If our program required more processing, for example if we had to decrypt the input, double buffering would let it read in the next block of input while it was still decrypting the block that had just been read.
Double buffering works best when reading and processing take about the same amount of time, because then neither stage spends much time waiting for the other before the next block of input can be dealt with.
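As a rough sketch (not the actual wc code), the loop below overlaps the two stages; start_read, wait_for_read, and process are hypothetical helpers standing in for the real device driver and word-counting code:

#define BUFSIZE 512

/* Hypothetical helpers assumed for this sketch:
   start_read    - begin an asynchronous read into buf
   wait_for_read - block until that read finishes; returns bytes read, 0 at EOF
   process       - consume the data, e.g. count the words in it */
void start_read(char *buf);
int  wait_for_read(char *buf);
void process(char *buf, int nbytes);

void run(void)
{
    char buf[2][BUFSIZE];
    int cur = 0;

    start_read(buf[cur]);                 /* kick off the first read        */
    for (;;) {
        int n = wait_for_read(buf[cur]);  /* wait for this buffer to fill   */
        if (n <= 0)
            break;                        /* end of input (or an error)     */
        start_read(buf[1 - cur]);         /* start reading the next block   */
        process(buf[cur], n);             /* ...while we process this one   */
        cur = 1 - cur;                    /* swap buffers and repeat        */
    }
}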
Increasing the buffer size to (say) 128 sectors means we pay the rotational latency (the time taken to bring the correct disk sector under the read-write head of the hard disk) once per 128-sector request instead of once per sector, so that cost is amortized over far more data.
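As a rough illustration with assumed numbers: a 7200 rpm disk completes a rotation in about 8.3 ms, so the average rotational latency is roughly 4 ms. Paying that 4 ms once per 128-sector (64 KiB) request instead of once per 512-byte sector cuts the latency overhead per byte by a factor of 128.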
Our original program uses the CPU's programmed input/output (PIO) to read from and write to the hard disk. This approach is slow because every byte of data has to pass through the CPU on its way between the disk and memory.
Direct memory access (DMA) lets the device copy data directly into memory, so the unnecessary time spent in the CPU is avoided.
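To see where the time goes, here is a rough sketch of what a PIO-style read_ide_sector might look like for the standard IDE controller ports; the inb/outb/insl helpers and their argument order are assumptions, and the drive-select details are glossed over. The key line is the insl at the end: the CPU itself copies every word of the sector, which is exactly the work DMA hands off to the device.

/* Hypothetical port-I/O wrappers (real systems supply these as tiny
   inline-assembly routines): */
unsigned char inb(unsigned short port);                 /* read one byte      */
void outb(unsigned short port, unsigned char value);    /* write one byte     */
void insl(unsigned short port, void *addr, int count);  /* read count 4-byte
                                                           words into addr    */

void read_ide_sector(int s, char *a)
{
    while ((inb(0x1f7) & 0xc0) != 0x40)   /* wait until the disk is ready      */
        continue;
    outb(0x1f2, 1);                       /* we want to read one sector        */
    outb(0x1f3, s & 0xff);                /* sector number, one byte at a time */
    outb(0x1f4, (s >> 8) & 0xff);
    outb(0x1f5, (s >> 16) & 0xff);
    outb(0x1f6, (s >> 24) & 0xff);        /* (glosses over drive-select bits)  */
    outb(0x1f7, 0x20);                    /* issue the READ SECTORS command    */
    while ((inb(0x1f7) & 0xc0) != 0x40)   /* wait until the data is ready      */
        continue;
    insl(0x1f0, a, 128);                  /* CPU copies 128 words = 512 bytes  */
}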
In order to improve the read/write process, we need to apply all three of these ideas (double buffering, larger reads, and DMA). Ideally, we'd have only one copy of the read/write function. In theory, this could be done by asking the company that makes the EEPROM to change read_ide_sector so that it supports DMA and larger reads.
We would then have every program use that routine in the BIOS. In fact, this is what the BIOS was originally intended for.
There is a standard set of locations for functions in the BIOS, and it is a pain to conform to what the BIOS allows.
The BIOS offers little flexibility, and asking for changes by contacting the company that makes the EEPROM is unreasonable.
Our program will not scale very well: there will be more bugs as the program gets bigger, and the growing amount of code will become harder to debug.
We can't reuse the parts of the word count program that would be applicable to other programs (like read_ide_sector), because sharing them would mean we couldn't change the wc program without worrying about how it affects those other programs.
It's hard for the word count program to recover from faults
We can't run multiple applications simultaneously.
* Waterbed Effect: Optimizing one metric often causes the other metrics to suffer. We need to determine whether or not these tradeoffs are worth it.
void read_ide_sector(int s, int a);   /* s: sector number, a: memory address to copy the sector into */
void read_sector(int diskno, int s, int a);
It no longer assumes that there's only one disk.
int read_sector(int diskno, int s, int a);
The original function returned void, so we wouldn't have known if it failed to work; returning an int lets the caller find out whether the read succeeded.
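A tiny call-site sketch of why this matters; the convention that a negative return value means failure, and the copy_block wrapper, are assumptions for illustration:

int read_sector(int diskno, int s, int a);   /* assumed: returns < 0 on failure */

void copy_block(int s, int dest_addr)
{
    if (read_sector(0, s, dest_addr) < 0) {
        /* The read failed; the caller can now retry or report the error.
           With the void version we would have silently processed garbage. */
    }
}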
int read_sector(int diskno, int s, int a, int nsecs);
We no longer assume that we only want to read one sector at a time.
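A small sketch under the same assumed conventions; fill_buffer and the 128-sector batch size are illustrative:

int read_sector(int diskno, int s, int a, int nsecs);   /* assumed from above */

void fill_buffer(int first_sector, int addr)
{
    /* One request for 128 consecutive sectors instead of 128 separate
       one-sector requests, so the per-request latency is paid only once. */
    if (read_sector(0, first_sector, addr, 128) < 0) {
        /* handle the failed read */
    }
}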
int read_sector(int diskno, int bytesoffset, int a, int nbytes);
Instead of assuming that 512 bytes is the sector size, we can just specify a byte offset and the number of bytes that we want to read.
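One way this byte-level interface could be layered on top of the sector-level call is sketched below; the read_bytes name, the error convention, and the use of a char * buffer in place of the lecture's int address are all assumptions:

#include <string.h>

#define SECTOR_SIZE 512

/* Assumed sector-level primitive from the previous step (char * buffer used
   here for clarity; returns sectors read, or a negative value on failure). */
int read_sector(int diskno, int s, char *buf, int nsecs);

/* Read nbytes bytes starting at byte_offset, with no 512-byte alignment
   required of the caller.  Returns bytes read, or -1 on failure. */
int read_bytes(int diskno, int byte_offset, char *a, int nbytes)
{
    char sector[SECTOR_SIZE];
    int copied = 0;

    while (copied < nbytes) {
        int pos = byte_offset + copied;
        int s = pos / SECTOR_SIZE;            /* which sector holds this byte  */
        int off = pos % SECTOR_SIZE;          /* where it sits inside that one */
        int chunk = SECTOR_SIZE - off;        /* bytes available in the sector */
        if (chunk > nbytes - copied)
            chunk = nbytes - copied;

        if (read_sector(diskno, s, sector, 1) < 0)
            return -1;                        /* propagate the failure         */
        memcpy(a + copied, sector + off, chunk);
        copied += chunk;
    }
    return copied;
}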
Function call modularity: we break the program up into functions and build the program by calling them.
Here is how function call modularity looks in a recursion example:
int fact(int n) {
return n ? n * fact(n - 1) : 1;
// if n is zero, return 1; otherwise return n * fact(n - 1)
// if we pass in -1, the base case is never reached, so the recursion
// keeps growing the stack until it runs into memory it can't use
// (e.g. read-only pages) and the program crashes
// overflow limits us to about 12! with a 32-bit int
// (about 20! with 64-bit arithmetic)
}
fact: pushl %ebp            # save the caller's frame pointer
      movl $1, %eax         # return value starts at 1 (the base case)
      movl %esp, %ebp       # set up this call's frame pointer
      subl $8, %esp         # allocate 8 bytes of stack space
      movl %ebx, -4(%ebp)   # save the callee-saved %ebx
      movl 8(%ebp), %ebx    # load the argument n into %ebx
      testl %ebx, %ebx      # is n zero?
      jne .L5               # no: go do the recursive case
.L1:  movl -4(%ebp), %ebx   # restore the caller's %ebx
      movl %ebp, %esp       # pop this stack frame
      popl %ebp             # restore the caller's frame pointer
      ret                   # return; the result is in %eax
.L5:  leal -1(%ebx), %eax   # compute n - 1
      movl %eax, (%esp)     # pass n - 1 as the argument...
      call fact             # ...to the recursive call; result lands in %eax
      imull %ebx, %eax      # multiply that result by n
      jmp .L1               # jump to the shared return code
We can trace through this in gdb, stepping one machine instruction at a time with stepi.