UCLA Computer Science 111
Operating Systems Principles - Lecture 1

Albert Wang


Class Information


Professor: Paul Eggert
Email: eggert@cs.ucla.edu
Office Hours: Boelter 4532J
M 10:00-11:00 A.M., R 2:00-3:00 P.M.

TA: ZhaoXing Bu TA: Tuan Le TA: Muhammad Taqi Mehdi
Email: zbu@ucla.edu Email: tuanle@cs.ucla.edu Email: taqi@cs.ucla.edu
Office Hours: Boelter 2432 Office Hours: Boelter 2432 Office Hours: Boelter 2432
TBD TBD TBD

Textbook


Jerome H. Saltzer and M. Frans Kaashoek, Principles of Computer System Design: An Introduction, Morgan Kaufmann (2009). ISBN 978-01-2374957-4

Assignments


Assignment Grade Fraction Details
Labs (4) 1/3 May be completed in teams of 2.
Midterm 1/9 In class on Wed, February 3.
Open book, open Notes.
Final 2/9 Location TBD Wed, March 16. 11:30 A.M.-2:30 P.M.
Open book, open Notes.
Minilabs (2) 2/15 No partners.
Design Problem 1/12 Check with TA on problem choice.
Scribe Notes 1/20 May work in groups up to 4. Sign up for the lecture.
Due 1 week from assigned lecture.
Report 1/15 2-to-3 page report on current OS topics (TBD).

Lateness Penalty: 2N% penalty for an assignment submitted between N and N+1 full days late. Assignments are not accepted after the last day of class.

Red Star 3.0


A while ago, it was announced that Red Star 3.0, a North Korean OS, had been reverse engineered. It was found to be a Fedora variant, Fedora being a Linux based Kernel sponsored by Red Hat. The interesting points to Red Star were the unique features that were implemented as part of the OS. The first of which is an automatic steganographic watermark applied to every file generated on the OS, allowing people to identify who and where a file that was created on Red Star originated from. In addition, a tamper resistant OS modification was implemented that would restrict what programs were allowed to run on it. The only programs permitted to execute on the kernel are the ones approved by the government.

What's a System?


"We don't offer a ready-made programme, but an entire Operating System."
~ German Pirate Party

Years ago no one would have ever thought that Operating Systems would find a place in politics, but this raises the question of what they really mean by all of this. To begin with, we look at what exactly a system is.

To start off, we take a look at the definitions given to us by the original Oxford English Dictionary (1928):

I. An organized or connected group of objects.

II. A set of principles, etc., a scheme, a method.

Along with that, we have the Greek root of the word to be σύστημα (systema). Systema itself is defined to be: organized whole, government, constitution, a body of people or animals, musical interval. Systema's roots themselves mean "set up with."

Our third definition for what a system is comes from our textbook. According to it, a system is defined as a set of interconnected components that has a specified behavior observed at the interface with its environment. An interesting point in addition to this definition is that systems are broken down differently based on a person's interpretation of said system. For example, if a plane is viewed as a flying machine, we can say that the body of the plane is the system, and it interacts with the surrouding air as the environment. However if viewed as a mode of transportation, we can interpret the system as the interior the plane, and the environment as the people it carries.

Between these definitions, we see that the trend tends to be that a system has to do with organization and connection amongst many parts into a whole. With this in mind, the German Pirate Party's message starts to become a bit clearer. They promise not just a single solution to a single problem, but an organized group that can tackle many issues.

Operating System


Now that we have a better understanding of what a basic system is, we move ahead to the turn of the century to figure out what an Operating System is. As before, we start our investigation with a dictionary entry, but this time using the American Heritage Dictionary (2000): software design to control the hardware of a specific data processsing system in order to allow users and application programs to make use of it. While the inclusion of the term "specific" is no longer truly applicable in the age of general computing, this definition gives us something to consider and compare other definitions to.

As we continue our search into the defintion from Operating Systems, from Microsoft's Encarta (2007) we have the definition "The master control program in a computer." This is rather brief definition, so we look even further to Wikipedia's page on Operating Systems (Page ID: 698216816, 2016-01-04): a collection of smaller programs and software that is used to control and operate the computer system. A concept that comes up in all three of these definitions is control, control over either hardware or software in order to perform actions. This is the defining feature of an Operating System, total control over the processes of a computer.

However, these three defintions of what an Operating System is fail to mention some of the most defining features outside control. A good OS handles resource management, what programs get what access time to the CPU and memory. Without this feature, we could never achieve the operating speeds that we have become so accustomed to today since there would be no way to minimize cpu downtime. Secondly, an OS helps ensure reliability in the execution of programs. It checks to make sure that a program runs the way it should, and if any problems arise during the execution, it handles that too. Without the OS to check a program's execution it could run amok with a computer's systems and hardware, whether by accident or not. Which brings us to the last thing these definitions are missing, security. When a program's execution goes wrong, it can start altering a portion of the stack that doesn't belong to it, rewriting data on the disk, and if it's a truly malicious program, steal your data. The OS restricts a program's access to the computer's system and helps minimize the chance that a program can get out of hand like this.

The Zeta File System


$ls -l big

-rw-rw-r--l eggert faculty 9223372306854775000 Oct 6 11:41 big

$grep x big

$time grep x big

real 0m 0.009s


In a quick case study of our grep function, we find something very odd about the way it operates. As given by "ls" command and its associated output, we have a file named "big" with read and write permissions created by Eggert that is 10 19 bytes long. We run this huge file through grep, searching for the character 'x', and discover two things. One, there is no character 'x' in our file "big", and two, grep finished its execution in 0.009 seconds. Think about that for a second, we just processed 10 19 bytes in 10 -2 seconds, which gives us a data processing rate of 10 21 B/s (Bytes/second) or 10 22 b/s (bits/second). To put this into perspective, we actually look at XKCD for a quick theoretical calculation of the bandwidth of the internet, which comes out to 167 TB/s, or 10 11b/s. This is still slower than our grep's supposed data processing rate by a power of two.

So how does Grep do it? How does it process 10 19 bytes in a mere 0.009 seconds? Well the fact of the matter is, it doesn't. Our "big" file doesn't actually hold 10 19 bytes of data, it's filled with null-bytes. Along with that, this is a special type file that belongs to the Zeta File System. A ZFS file works on the principle of intensional files, it is represented by a data structure that describes the contents of the file without actually holding the data. With a normal file, Grep does indeed search through a file linearly to see if it can find the term you're searching for. When Grep is called with the ZFS as an argument however, it instead consults the ZFS's data structure to see if it holds the character or word we're searching for. Grep can then avoid actually searching through the entire contents of the file, and instead just asks a yes/no question on whether or not the content exists. This is how Grep is able to process a file this big in so little time.

The principles that Grep applies to searching ZFS files has applications outside of a simple file search for a word or phrase. In the world of security, anti-virus scanners often work the same way. Malicious code often hides as a tiny section in a much larger program. If a virus scanner were to look through every program in its entirety to try to find this virus, it would take forever. SO instead, the scanners do the same the Grep does and consults the ZFS to check whether or not the file or program contains the characteristic code of a virus.

Problem Areas in Operating System Design


Unfortunately not all is perfect in the world of technology, and Operating Systems are certainly not exempt. A great deal of problems arise when programmers design Operating Systems, and great care must be taken to keep these issues to a minimum. In general, these problems are not new, they've existed for a long time in the world of engineering, but now they're being applied to the large scale programming of an OS.

Incommensurate Scaling

As a system grows, most of the time we begin to see that different parts grow at different rates. As an interesting example, if you were to scale a human to say 15 feet, you would find that a linear scaling of our bodies would not work. Our bones grow as a function of a square of their area, and as such would have to scale at a different rate than the rest of us if it were to support it. Two subconcepts that fall into this category are economy of scale and diseconomy of scale. In economy of sale, the cost per unit goes down as the size grows. While this sounds great at first, one has to be wary to not go overboard with increasing scale to the point of being wasteful with the newfound resources. A factory can produce more goods for a cheaper price than a single person can, but if no one is buying that much product then there's an extraordinary amount of waste. On the other hand, diseconomy of scale states that the cost per unit actually goes up as the size grows. For example, search algorithms that follow a O(N2) scale take increasingly more time as the number of search elements increases. These two concepts must be balanced in designing our Operating Systems to maximize efficiency.

Emergent Properties

The concept of emergent properties states that larger systems have properties that smaller ones didn't. Issues that were inconsequential and could be disregarded when a system was small become huge issues that can no longer be ignored. In the case of the Tacoma Narrows Bridge, the engineers forgot this this crucial fact. When the bridge was built they had designed it to withstand the natural movements of the earth and any shaking from a strong gust of wind and even earthquakes, it should have been able to stand for many years. What they forgot however, was that sometimes the wind isn't only just a strong gust, be a repetitive buffeting. The wind actually managed to strike the bridge at its resonant frequency, and soon it started vibrating at a huge magnitude before it finally collapsed. These winds do the same thing to other bridges as well, but due to their smaller size the issue is masked. Only at a bridge as large as the Tacoma Narrows would have displayed these symptoms. Because the engineers failed to forsee this possibility, a great deal of money was lost in this failed construction.

Propagation of Effects

Even the most robust and well designed Operating System has pathways where an error in one spot can propgate to another and cause all kinds of trouble. For example, Microsoft had an issue between their language encoding and file system. At one point, they introduced the SHIFT-JIS encoding for representing Japanese characters. In this style, the more significant half of the bit representation is what determines the character, and is signified by a 1 in the most significant position. The less significant half has the same context encoding as ASCII. Microsoft's file system read everything in as ASCII, so when someone tried to use a Japanese character in the filepath for a new file, it would read that less significant half in as an ASCII character, and say if it turned out to be a '\' character, this would completely ruin the filepath and prevent the user from creating the file properly.What were originally two independent processes that should never had had an effect on each other somehow ended up colliding and failing. In the end, Microsoft had to modify the behavior of its file system to handle this scenario.

Tradeoffs

There's something called the "waterbed effect" when it comes to trying to solve the problems presented in the previous sections. As we try to solve one, one of the others gets worse, just like how when you push down on one part of a waterbed, another part pops up. If you want to reduce the effects of one problem, you need to be prepared to accept the fact that another problem is going to get worse. For example, take the passwords that the professors at UCLA use to get into their accounts. If we want to make them more secure, you can use a two-fold system: a regular password combined with a special digit generated by an RSA key fob. The addition of the RSA key fob certainly does increase the security of your account, but it's cumbersome and something more to carry around. The professors might reject using it altogether simply because of the inconveniene of having to carry one around all the time.