CS111

Scribe Notes for March 30, 2009

by Lief Andre and Vahab Pournaghshband

Administration

CS111 - Operating Systems Principles
Instructor:
Prof. Paul Eggert
Course webpage: http://cs.ucla.edu/classes/spring09/cs111/

Formal Prerequisites:

  • CS32 (Intro to Computer Science II)
    C++ is used in CS32, but we will be using C
  • CS33 (Intro to Computer Organization)
    MIPS is used in CS33, but we will be using x86
Actual Prerequisites: (in addition to formal prerequisites)
  • Computer Architecture (CS152B)
  • Programming Languages (CS131)
  • Networking (CS118)
Reading/Text:
Jerome H. Saltzer & M. Frans Kaashoek, Principles of Computer System Design, MIT 6.033 class notes, draft release 3.1 (August 2006).
The notes are available at Course Reader Material at 1081 Westwood Boulevard (Tel: +1 (310) 443-3303).

The reading assignment from the notes for Wednesday, 04/01/2009, is §1 and §2–§2.C. Reading assignments for later lectures are listed in the syllabus. Nearly half of the textbook will be assigned in this course.

Grading:

  • 1/3 Labs
    There are four labs (each worth 1/12 of the total grade), which may be completed in pairs

  • 1/3 Exams
    Midterm 1/9 Wed, 04/29/2009
    Final 2/9 Tue, 06/09/2009

  • 1/3 Other
    • Two Minilabs, each 1/15
      Unlike the labs, the minilabs are expected to be done individually
    • Design Problem + Oral Presentation 1/12 (a follow-up to one of the labs)
    • Scribe Notes 1/12
      Each lecture, 2-3 students will be assigned to work together to create validated HTML 4.01 notes for that lecture. The notes are intended to cover all of the main points of the lecture, both written and spoken. Guidelines are available on the course webpage.
    • Class Participation 1/75
    • One-page Report 1/50
      The report will be on a current topic in operating systems; it is assigned toward the end of the quarter since the topics are expected to be current!

Resources:

  • Suggest: Install & run Linux
    • UCLA Linux User Group Installfest, early in quarter
  • CS111 Linux Lab (SEASnet)
  • SEASnet Solaris
  • POSIX (Portable Operating System Interface for Unix)
    • Standard for Linux/Unix/etc.
    • IEEE Std. 1003.1 - 2008
  • GNU Linux
    • QEMU
  • See resources on the class webpage.

Lecture

We begin with formal definitions of system and operating system.

What is a System?

  • Oxford English Dictionary (1928 Edition)
    I. An organized or connected group of objects
    II. A set of principles, etc.; a scheme, method

    Originated from the Greek word σύστημα (systēma):

    An organized whole or government, or body of men or animals, or musical interval, or connected group of verses.

What is an Operating System?

  • American Heritage Dictionary (4th edition, 2000)
    Software designed to control the hardware of a specific data processing system in order to allow users and application programs to make use of it.

    (Note that the word "specific" in the definition reflects the lack of portability of older operating systems; nowadays, however, operating systems are more portable, e.g. Linux runs on many different machines.)

  • Encarta (2007)
    Master control program in a computer

  • Wikipedia (Ver. 280634790 2009-03-30 09:30 PST)
    An interface between hardware and user; it is responsible for the management and coordination of activities and the sharing of limited resources of the computer.

While all of the definitions presented above emphasize the concept of control, Wikipedia seems to offer the most concise one.

System: A Technical Definition

A system is an interface with an environment. Systems have nested subsystems (components), which may be nested in different ways. For example, the classroom can be divided by geographical location into the lecture area, containing the board and the podium, and the student area, containing the chairs. The classroom can also be divided into functional components such as the electrical system, the furniture, etc. In the real world, to maintain or build anything, the system must first be divided into subsystems.

Problems in Computer Systems (Things That Go Wrong)

I. Tradeoffs

Changing one area of a design often adversely affects other areas. Below are three examples where tradeoffs appear in a design:
  1. The Waterbed Effect: Pushing on one side of a waterbed raises the other side.

  2. Time vs. Space: Imagine we would like to sort 1 GB of random data containing 10 million records (so each record is ~100 bytes). Quick sort makes O(n log n) comparisons on average, so on random data it would make about 1e7 * log2(1e7) ≈ 2e8, i.e. on the order of 1e8 comparisons, and moving whole records would move on the order of 1e10 bytes. We can make the algorithm run faster by reducing the amount of data moved: allocate an array of pointers (4 bytes each), one per record, and move the pointers instead of the records. This cuts the data moved to roughly 4e8 bytes, a saving of about 96% (1 - 4e8/1e10), at the cost of about 40 MB of extra memory for the pointer array (see the sketch after this list).

  3. Passwords vs. Token-based Encryption: Encryption algorithms need keys to decrypt. In a particular system, the key could be derived from a password typed into the system. This is an inexpensive approach, but it can be vulnerable (e.g. a video camcorder could be used to steal the password as it is entered). Another approach is to use a hardware token whose value changes with time. This is more secure than a password (though the token itself can still be stolen), but hardware components are normally costly.
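
The following is a minimal C sketch of the pointer-based sorting approach from the time vs. space example above; the record layout, key field, and function names are hypothetical, made up for illustration.

    /* Sketch of the time vs. space tradeoff: sort an array of pointers to
       ~100-byte records instead of the records themselves, so each swap
       moves a pointer (4-8 bytes) rather than a whole record. */
    #include <stdlib.h>

    struct record {
        long key;
        char payload[92];                 /* pad the record to roughly 100 bytes */
    };

    static int cmp_record_ptr(const void *a, const void *b)
    {
        const struct record *ra = *(const struct record *const *)a;
        const struct record *rb = *(const struct record *const *)b;
        return (ra->key > rb->key) - (ra->key < rb->key);
    }

    /* Build an index of pointers and sort the index; the records never move.
       The extra memory cost is n * sizeof(pointer): the ~40 MB noted above
       for 10 million records and 4-byte pointers. */
    struct record **sort_by_pointer(struct record *recs, size_t n)
    {
        struct record **index = malloc(n * sizeof *index);
        if (index == NULL)
            return NULL;
        for (size_t i = 0; i < n; i++)
            index[i] = &recs[i];
        qsort(index, n, sizeof *index, cmp_record_ptr);
        return index;                     /* caller must free() the index */
    }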

II. Incommensurate Scaling

The key concept is that not everything scales at the same rate. The problem of scaling can be illustrated by comparing a mosquito with an elephant. A mosquito has no lungs; it breathes through its skin. The question, then, is why an elephant needs lungs, and why breathing through the skin does not scale up to work for an elephant. The answer is that surface area and volume do not scale up at the same rate: volume grows as the cube of linear size while surface area grows only as the square, so the ratio of volume to surface area is considerably larger in an elephant than in a mosquito, and the skin alone would be insufficient. In operating systems, people build mosquitoes and try to scale them up, but this does not work.
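
A small numeric sketch of the square-cube argument (the numbers are purely illustrative; a cube stands in for the animal's body):

    /* Square-cube law sketch: when linear size L grows by 10x, surface area
       (proportional to L^2) grows by 100x but volume (proportional to L^3)
       grows by 1000x, so volume/surface grows linearly with L. */
    #include <stdio.h>

    int main(void)
    {
        for (double L = 1; L <= 1000; L *= 10) {
            double surface = 6 * L * L;   /* surface area of a cube of side L */
            double volume  = L * L * L;   /* volume of that cube */
            printf("L = %4g  surface = %8g  volume = %10g  volume/surface = %g\n",
                   L, surface, volume, volume / surface);
        }
        return 0;
    }
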
  • Economies of Scale:
    • The Pin Factory: In a town in the 1700s, Adam Smith saw two options for the creation of pins. Option one: every person makes their own pin, which may take two or three hours. Option two: collect a few villagers to start a pin factory.
      • Advantages of the Pin Factory: you make many more pins per unit of effort as the factory scales up. Many pins!
      • Disadvantages of the Pin Factory: perhaps too many pins.

  • Diseconomies of Scale:
    • A Star Network does not scale well
    • Advantage: cheap entry; it is still possible to make money
    • Disadvantage: more clients (more scale) = worse service (things break)

The key concept is that when building a system, know how it will scale! In current systems, both economies and diseconomies of scale may exist.

III. Emergent Properties

These are properties of a system that are unnoticeable when the system is small and unanticipated when you scale it up. Here are some examples of emergent properties:
  • Tacoma Narrows Bridge: This bridge was built by scaling up a design that had worked on smaller bridges. On the smaller bridges the design worked great, without any problems. Once scaled up, when wind gusts around the bridge approached its resonant frequency, the entire bridge oscillated and then collapsed, which was caught on film.
  • Modern Example: Buffer Overflow:
    • Early computers were expensive, uncommon, and generally unnetworked
    • Modern computers are cheap, everywhere, and always connected
    • Result: exploits and unexpected attacks take advantage of buffer overflows in all sorts of software all the time, as recently as this week (see the sketch below).
There have been some unexpected good things that have resulted from “emergent properties”, such as the emergence of Napster from dorm networks.
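
Below is a minimal C sketch of the kind of buffer overflow referred to above; the function name and buffer size are hypothetical.

    /* Sketch of a classic stack buffer overflow: the fixed-size buffer
       assumes the input is short, so a longer input overwrites adjacent
       stack memory (possibly including the return address). */
    #include <string.h>

    void handle_request(const char *input)
    {
        char buf[64];
        strcpy(buf, input);               /* no length check: any input longer
                                             than 63 bytes overflows buf */
        /* ... process buf ... */
    }

    /* A safer version bounds the copy and guarantees NUL termination. */
    void handle_request_safely(const char *input)
    {
        char buf[64];
        strncpy(buf, input, sizeof buf - 1);
        buf[sizeof buf - 1] = '\0';
        /* ... process buf ... */
    }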

IV. Propagation of Effects (The Butterfly Effect)

Adding a feature to one part of the OS, where it works and makes sense, can cause a far-off and seemingly unrelated part of the system to break. As an example, to support Japanese text, Microsoft's encoding initially used two different forms of character representation. The original ASCII characters were represented in 8 bits, with the first bit set to zero as an indicator bit and the remaining bits representing the character. Japanese characters, however, were represented in 16 bits with the first bit set to 1. This seemingly simple and elegant solution later caused problems in other components of Microsoft Windows. One major problem occurred because of the way file names were stored: if a byte of the 15-bit JIS index happens to match the ASCII representation of the backslash, code that scans the file name byte by byte mistakes it for a directory separator and the OS gets confused (see the sketch below). The problem was later resolved when UTF-8 emerged.
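
The sketch below illustrates the file-name problem in C; the "high bit marks a two-byte character" rule is the lecture's simplified model, not the exact JIS/Shift-JIS encoding.

    /* Path-parsing sketch: code that scans a file name byte by byte can
       mistake the second byte of a two-byte Japanese character for the
       ASCII backslash (0x5C) directory separator. */
    #include <stddef.h>

    /* Naive scanner: treats every 0x5C byte as a directory separator,
       even when it is really half of a Japanese character. */
    int count_separators_naive(const unsigned char *name, size_t len)
    {
        int n = 0;
        for (size_t i = 0; i < len; i++)
            if (name[i] == 0x5C)          /* '\\' */
                n++;
        return n;
    }

    /* Encoding-aware scanner: a byte with its high bit set starts a
       two-byte character, so the byte that follows is character data,
       not ASCII, and must be skipped. */
    int count_separators_aware(const unsigned char *name, size_t len)
    {
        int n = 0;
        for (size_t i = 0; i < len; i++) {
            if (name[i] & 0x80)
                i++;                      /* skip the second byte */
            else if (name[i] == 0x5C)
                n++;
        }
        return n;
    }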

V. Complexity

  • Hardware is scaling faster than our knowledge/intelligence.
  • Moore's Law: The number of transistors per chip has increased exponentially, from about 2e4 in 1970 to about 2e9 in 2009 (a factor of 1e5 in roughly 40 years, i.e. doubling roughly every two years).
  • Kryder's Law: The capacity of disk drives has increased from 0.1 GB in 1990 to 1000 GB in 2007.