CS 111: Operating Systems

Lecture 17 (5/31/2012) Scribe Notes

Prepared by Howard Lee and Shervin Rabizadeh

Table of Contents

NFS Performance
NFS Security
General Security Principles
Threat Modelling and Classification
General Security Features
Authentication

NFS Performance

In this lecture we finish up the discussion of NFS performance from last lecture. First we look at how a typical NFS installation is set up.

Sample NFS System

The figure below shows a sample NFS System.

In the above system, there is no single point of failure. If any of the disks dies, we can swap it out of its machine. If a backbone fails, we still have another one (since it is a dual-backbone system). If a controller fails, we also have another one as a backup.

Now we look at the specifications of an upcoming NFS server as an example.

Sun ZFS Storage 7420

2 x Storage Controllers
4 x 10 GbE adapters
8 x 512 GB SSDs
8 x 73 GB SSDs
280 x 15 kRPM hard drives
4 x 7.2 kRPM hard drives
- 37 TB total exported capacity

Performance Benchmark

Taken from SPECsfs2008 in www.specbench.org, the performance diagram looks something like this:

The response time in the diagram includes all network overhead. For example, if we issue a read request, this is the total response time. The throughput in the diagram is a measurement of the number of NFS operations, including read, write, and lookup.

Compared to a local disk, this is really quite good. It is achieved by clever use of caching.

NFS Security

In this section we will look at some security problems with NFS.

What could go wrong with NFS?

Say we have an NFS client issuing a lookup to an NFS server like the figure below. The client knows the existence of a directory and wishes to look up the files within that directory.

Suppose the client is looking up the file "sh" within "/usr/bin/". This request will probably be satisfied.

Now suppose the client is accessing the directory "/home/eggert/mail/", and Professor Eggert happens to have an email about his gambling debts in that folder. He will probably want the NFS server to decline the lookup request, so that other people won't even know that certain files exist.

For this reason, the NFS server needs to have the ability to reject certain operations (with a "Permission Denied").

The standard UNIX way of doing this is to handle requests based on the current user's ID. However, this can cause several problems.

Bad Clients

The main problem with this approach is that attackers could masquerade as any user to gain access to sensitive information.

In particular, if NFS runs on a UNIX-based system, a client could switch to user ID 0 (the "root" user) and would be able to do anything it wants.

A solution to this problem is that, in standard NFS, requests from user ID 0 are always treated as coming from "nobody", an unprivileged default user. This helps with some security problems, such as situations where people are mistakenly given root access.
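The mapping of root to "nobody" can be pictured with a minimal sketch. This is only an illustration, not how any particular NFS server is implemented; the function name squash_uid and the value 65534 for "nobody" are assumptions made for the example.

```python
ROOT_UID = 0
NOBODY_UID = 65534  # assumption: a common numeric ID for "nobody"


def squash_uid(claimed_uid: int) -> int:
    """Return the user ID the server will actually use for a request.

    A request claiming to come from root is demoted to "nobody";
    every other claimed ID is taken at face value.
    """
    if claimed_uid == ROOT_UID:
        return NOBODY_UID
    return claimed_uid
```

Note that every non-root ID is still trusted as claimed, which is exactly why this does not stop a client that lies about which ordinary user it is.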

However, this is still not particularly effective against determined attackers who can masquerade as other users. In the example above, an attacker could still use a command like "su eggert" on the client to gain access to the professor's mail.

Incompatible User/ID

Another problem with identifying requests by user ID is illustrated by the following picture.

On the first client, the user eggert has a different user ID than the same user on a different NFS client. Even within the same network, the same user will have to be re-authenticated to gain access through a different client. At UCLA, SEASnet and the CS department have this compatibility problem.

Packet Sniffing / Injecting

Even if we fix the above problems, there is still the problem of packet sniffing and packet injection, illustrated in the following figure.

A sniffing daemon could be installed within the network by taking control of a router. Therefore one security measure against this problem is to reinforce physical security (lock up the routers!).

Malicious Server

Suppose now we have a bad NFS server that is spoofing the real one, and clients unknowingly connect to the wrong server. Then the bad server could gain a lot of unauthorized information from the requests and data sent by the clients.

In order to solve these problems, NFSv4 was introduced. NFSv4 attempts to address all the above issues. Some examples include:

Identifying users by name instead of by number
ACLs (access control lists)
Encryption of packets / checksums
Authentication of clients as well as servers using different cryptographic techniques

Even though NFSv4 has been out for many years, the SEASnet servers are still using v3, and so is the CS department. The problem with v4 is efficiency. Authentication and encryption cost time, and the servers themselves cost money that we don't want to spend.

Instead, we just use the old system and lock the clients and servers in a room (substituting physical security). A lot of big companies do the same, because NFS is normally used in places with big data, so efficiency and cost are important.

General Security Principles

Examples of security issues:

Flame virus - it is still not clear what is happening
In March: DDoS attack launched on the BBC
- the BBC satellite was also jammed with old-fashioned electromagnetic interference.
- part of an Iranian effort to control information flow to Iran / the whole world.
- directed by the Supreme Council of Virtual Space. They are very worried about what happens on the internet, and they are willing to use security flaws to control what's going on.

We can see that in recent years, what used to be minor attacks by vandals and criminals are turning into major efforts funded by major organizations. These attacks exploit security problems in systems, so we have to consider how to deal with them when we are designing our own systems.

As defenders, how do we defend against security attacks? We start by thinking about real-world security issues (outside of computers). In real life, the two major kinds of attack we need to defend against are:

Force (a guy pointing a gun at you asking you for money)
Fraud (someone pretending to be another person to steal money from a bank).

In the real world, these two are about equally widespread. In the virtual world, however, fraud is more important. Although there are cases where force is used in the virtual world, fraud is by far the most popular mechanism. This is probably because fraud is so easy in the virtual world that attackers never have to resort to force. The virtual world is so new that there are still many security holes to be fixed.

If we take a closer look at virtual fraud based attacks, there are a few main forms of attacks:

Against privacy - unauthorized release of information
- e.g. WikiLeaks leaks
Against integrity - generating false information / tampering with others' information
Against service - interfering with the victim's ability to work
- e.g. DDoS attacks

The main goal of the defenders is to disallow unauthorized actions. For example, reads could be disallowed to protect victims from attacks against privacy, while writes could be blocked to stop attacks against integrity. With only this in mind, though, the most secure computer is a brick! The only problem is that we won't be able to get any of our work done. Therefore another important goal of the defenders is to allow authorized access, even though this potentially opens the door to DoS attacks.

Just a side note: it is much harder to test negative goals than positive goals, because there are fewer cooperative users in the former case. Think about who is going to report problems when the goal is not met. For example, if a system we have designed does not allow the department chair to see the students' grades, he will yell at us and let us know what the problem is. However, if our system allows any student whose name starts with a Z to change

Threat Modelling and Classification

To analyse a system, we normally use threat modelling and classification, where we come up with use cases in which the bad guys are trying to attack.

Common Security Threats

#1 Insiders.

For many systems, this is number 1 on the list of threats to design against. An example is a situation where the department chair's secretary is told the department chair's password and changes a student's grade. Every system needs to look out for this.

#2 Social Engineering.

Kevin Mitnick was one of the most famous hackers in LA history. He specialized in breaking into phone systems, stealing long-distance calls and listening to people's conversations. He would drive up to a telephone pole near his victims, climb the pole, call the phone company from the top of the pole, and ask for the password. Since the company could tell that he was calling from the top of the pole, they would just give it to him. This is an example of the classic "I forgot my password" attack.

#3 Network attack.

There are a few categories of network attacks.

Drive-by downloads - a carefully crafted HTML file / image / JavaScript attached to a website that you want to visit, designed to exploit known bugs in your browser. The image might contain an error that overruns a buffer somewhere, giving the attackers control of your browser. Professor Eggert estimates that about 10% of websites have drive-by downloads.
Computer viruses
DoS attacks

#4 Device attacks.

An example of this is a USB virus. The attacker writes a boot record onto a USB drive and waits for the victim to boot with the drive plugged in. If the computer checks USB first when it boots, it will boot the virus.

This is just an overview of some examples of security threats. There are many more threats out there.

Risk analysis is crucial. There are so many threats that we can't practically defend against all of them. Therefore we need to prioritize them and solve the more important ones.

General Security Features

The following features are needed for pretty much any security mechanism:

Authentication - this is what proves that someone is who they say they are.
Integrity - making sure that the data doesn't change due to a malicious attack (e.g. with a checksum).
Authorization - checking that what they do is what they are allowed to do.
Auditing - to some extent, this is the most important mechanism against insider attacks, since none of the above will work against those. People should not be able to cover their tracks.
Correctness - we need a system that actually works correctly.
Efficiency - slow systems are more susceptible to DoS attacks, and no one would use them.

In the rest of this lecture we will focus on authentication.

Authentication

There are 3 basic techniques for authentication.

1. Based on something that the principal knows.

The principal is the person in charge, and it really has to be a PERSON. A classic example is a password.

2. Based on something that the principal has.

For example, a physical key, or a security ID token with a number that changes in an unpredictable way. An attacker would have to physically take the token apart to see how it works.

3. Based on who the principal is.

In the physical world, if you tell your secretary that you lost your key, you will be let in because the secretary recognizes you. In the virtual world, this is biometric authentication, such as fingerprint / retina scans.

However, all of the above have their own problems. Attackers could still breach the security in the following ways:

For passwords, an attacker might guess. Another problem is snooping, such as using a camera to track the keystrokes a person types.
For keys, there is loss / theft.
For biometrics, there is the problem of fake biometrics. Someone could apply a thin layer of gelatin to their thumb with someone else's thumbprint on it (although fake retinal scans are a little harder to create).

We also have to consider the opposite problems, where authorized users are denied access:

Forgetting password
Losing keys
False Negatives.

To some extent, all of these have a bootstrapping problem (how the authentication gets started in the first place).
When we got an account on SEASnet, we got it by showing our student ID; but when we got our student ID card, how did they know who we were? We have to trust all the underlying authentication techniques to some extent, i.e. how people got their credentials in the first place.

For operating systems, since speed and efficiency are very important, authentication has another issue that is worth mentioning. It falls into two major categories (the metaphor is a military base):

1. External Authentication

2. Internal Authentication

This is similar to a military base, where we don't care about what happens outside the perimeter. Suppose we have a gate that lets people into and out of the base. External authentication is what you use when you want to enter the base in the first place. At the gate they check all your ID, and based on that they decide whether to let you in (unless you are the president, in which case they just let you in). If they decide to let you in, you will be given a badge of some sort to carry around. That is your internal authentication.

Some examples of external authentication in a computer system are passwords, tokens and biometrics. However, there are a lot of possible attacks, such as brute forcing, network snooping, and fraudulent servers (e.g. someone setting up a website that imitates another server and steals your password when you enter it to "log in").

External authentication is normally expensive, since the system has to check everything to make sure you are really authorized. Operating systems therefore also have internal authentication: reliable and cheap checks made within the system as you perform accesses. If we did not care about performance, we could just repeat the external authentication every time you do anything.

An example of this in UNIX: when we do something like "read(fd, buf, bufsize)", the system only needs to check whether fd was opened with read access (for instance, that it is not write-only). It does not need to re-authenticate your user information all over again. This is achieved with a process descriptor inside the OS, which holds a user ID that should reflect your external authentication. The check is cheap because this ID is normally just a number.
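The cheap per-access check can be observed from user space. The sketch below is only an illustration under assumptions: it uses Python's os module, which wraps the underlying open/read system calls, and the helper name read_from_write_only_fd is made up for the example. It opens a temporary file write-only and then tries to read from it.

```python
import errno
import os
import tempfile


def read_from_write_only_fd():
    """Open a file write-only, try to read from it, and return the errno.

    Returns None if the read unexpectedly succeeds.
    """
    fd, path = tempfile.mkstemp()      # create a scratch file
    os.close(fd)
    fd = os.open(path, os.O_WRONLY)    # reopen it write-only
    try:
        os.read(fd, 16)                # kernel checks the descriptor's access mode
    except OSError as e:
        return e.errno
    finally:
        os.close(fd)
        os.remove(path)
    return None
```

On a UNIX system the read should fail with errno.EBADF. No password or user lookup is involved; the kernel only consults the state it already holds for the descriptor.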

One thing to note is that internal authentication must be controlled by the kernel. A user should not be able to change the user ID without making a system call, and the default answer to that system call should be "no". On UNIX systems the call is setuid(), which normally fails unless the caller is root.

Network Authentication

Now we consider authentication from the network point of view.

When we are building a network-based system, we will have some building blocks that are used over and over again. There are three that we will talk about:

1) Cryptographic hash functions

An example is SHA1, which takes a message and outputs a 160-bit number, the hash. Knowing the 160-bit hash does not make it feasible to compute the message. However, SHA1 has design flaws, which is why SHA2 was introduced.
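As a small illustration, Python's standard hashlib module exposes SHA1. The helper name sha1_hex is an assumption made for the example.

```python
import hashlib


def sha1_hex(message: bytes) -> str:
    """Return the SHA1 hash of message as 40 hex digits (160 bits)."""
    return hashlib.sha1(message).hexdigest()


digest = sha1_hex(b"abc")
print(digest)             # 40 hex digits
print(len(digest) * 4)    # number of bits in the hash
```

Changing even one byte of the input gives a completely different hash, while the output length stays fixed at 160 bits no matter how long the message is.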

2) Symmetric Encryption

An example is Triple-DES (DES is the Data Encryption Standard), where we apply DES three times.

The sender and receiver share a secret key K. Given a message M, and a key K, it is easy to get the encrypted version of the message {M}^K. If given {M}^K, and the key K, it is also easy to get the message M. However, if we only have {M}^K, it should be very hard to get M or K.

Symmetric encryption tends to be relatively fast. However, it suffers from the bootstrapping problem of getting the key to both the sender and the receiver.
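To make the "same key on both sides" property concrete, here is a toy sketch. It is NOT Triple-DES and is not secure; it is an assumed stand-in that XORs the message with a keystream derived from the key using SHA-256, chosen only because it needs nothing beyond Python's standard library. The names keystream and crypt are made up for the example.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def crypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


K = b"shared secret"                                  # hypothetical shared key
ciphertext = crypt(K, b"Transfer money to account 1234")
plaintext = crypt(K, ciphertext)                      # same key recovers M
```

Anyone holding K can go in both directions, which is what makes the scheme symmetric, and also why K has to reach both parties safely before any of this works.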

3) Asymmetric Encryption

An example is RSA. Asymmetric encryption is relatively slow.

Here the sender and receiver don't have the same key. There is a public key P and a private key K. Given a message M and the public key P, it is easy to compute {M}^P. Given {M}^P and the private key K, it is easy to compute M. We also assume that given only {M}^P and P, both M and K are very hard to compute. Guessing K from P should also be hard.
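A toy numeric sketch of the idea, using textbook RSA with tiny primes. The specific numbers (61, 53, 17) are assumptions chosen for illustration; real RSA uses very large primes and padding, so this is not usable for actual security. It needs Python 3.8 or later for the modular inverse via pow.

```python
# Toy key generation (insecure: the primes are far too small).
p, q = 61, 53
n = p * q                    # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent: (n, e) plays the role of P
d = pow(e, -1, phi)          # private exponent: (n, d) plays the role of K


def encrypt(m: int) -> int:
    """Compute {M}^P using only the public key."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Recover M from {M}^P using the private key."""
    return pow(c, d, n)


message = 65                 # messages are integers smaller than n
assert decrypt(encrypt(message)) == message
```

With numbers this small, anyone can factor n and recover d; the hardness assumption in the text only holds when n is too large to factor.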

Since making these functions is another topic of its own, we will just assume we have these functions available.

Some examples:

Say we have sender Alice, and recipient Bob,

A sends to B: {I am Alice}^K, where K is a secret key known only to A and B.

When B receives the message, he is confident that A is Alice because they are the only people who know K, so B sends a reply to A: {OK.}^K

Suppose B later gets another message from A's IP address. There is no guarantee that the message is not spoofed (since senders fill in the return address and port in their packets themselves). So now we have to complicate the protocol.

One idea to solve this is a nonce, a random integer that is used only once. For example:

A sends to B: {I am Alice}^K
B sends to A: {nonce}^K
A sends to B: {nonce+1 Transfer money to bank account 1234}^K

Decrypting the nonce requires K. Therefore the rest of the message must have come from Alice.

Bob also arranges for each nonce to time out, so that old packets cannot be replayed.
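The exchange can be sketched in Python under some loudly stated assumptions. The standard library has no symmetric cipher, so this sketch proves knowledge of K with a keyed hash (Python's hmac module) over nonce+1 and the request, instead of encrypting with K; the nonce is sent in the clear. The class Bob, the function make_tag, and the key value are all hypothetical names for illustration, and a nonce is retired after one use rather than on a timer.

```python
import hashlib
import hmac
import secrets

K = b"secret known only to Alice and Bob"   # hypothetical shared key


def make_tag(key: bytes, nonce: int, request: bytes) -> bytes:
    """Bind nonce+1 and the request together under the shared key."""
    data = str(nonce + 1).encode() + b"|" + request
    return hmac.new(key, data, hashlib.sha1).digest()


class Bob:
    def __init__(self, key: bytes):
        self.key = key
        self.outstanding = set()            # nonces issued but not yet used

    def challenge(self) -> int:
        nonce = secrets.randbits(64)        # fresh random integer
        self.outstanding.add(nonce)
        return nonce

    def accept(self, nonce: int, request: bytes, tag: bytes) -> bool:
        if nonce not in self.outstanding:   # unknown or already used: replay
            return False
        expected = make_tag(self.key, nonce, request)
        if not hmac.compare_digest(tag, expected):
            return False
        self.outstanding.discard(nonce)     # each nonce works only once
        return True
```

Alice would call make_tag(K, nonce, request) after receiving Bob's challenge. Sending the same reply a second time fails, because Bob has already retired that nonce.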

A standard way to do this is HMAC (hashed message authentication code). In HMAC, we assume a shared key K. We compute HMAC(K, M) for a message M as follows:

SHA1((K xor pad1) o SHA1((K xor pad2) o M)) ("o" = concatenate)

where pad1 and pad2 are constants specified by the standard.

When we send the message we send M o HMAC(K,M). The HMAC is hard to forge, since you need to know K. The problem with this is that the two parties need to have a shared secret key.
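As a check on the construction, the sketch below builds HMAC by hand from SHA1, following the standard definition (RFC 2104) in which the hash is applied twice, once inside the other, and compares the result with Python's built-in hmac module. The function name hmac_sha1 is made up for the example; the pad bytes 0x5C and 0x36 are the standard's outer and inner pads.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA1 processes input in 64-byte blocks


def hmac_sha1(key: bytes, message: bytes) -> bytes:
    """Compute SHA1((K xor pad1) o SHA1((K xor pad2) o M)) by hand."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha1(key).digest()        # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")        # then zero-padded to a block
    outer = bytes(b ^ 0x5C for b in key)        # K xor pad1
    inner = bytes(b ^ 0x36 for b in key)        # K xor pad2
    inner_hash = hashlib.sha1(inner + message).digest()
    return hashlib.sha1(outer + inner_hash).digest()


K = b"shared secret"                            # hypothetical shared key
M = b"Transfer money to bank account 1234"
packet = M + hmac_sha1(K, M)                    # what goes on the wire: M o HMAC(K, M)
assert hmac_sha1(K, M) == hmac.new(K, M, hashlib.sha1).digest()
```

The receiver splits off the last 20 bytes, recomputes the HMAC over the rest with its own copy of K, and rejects the packet if the two values differ.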

What if shared secrets are hard to maintain?

One solution is to start with a public key system, as follows:

the recipient publishes P (a slow process).
the sender creates a nonce K, encrypts it with P, and sends it to the recipient.
the recipient decrypts it with his private key, and the two parties negotiate a private key K'.
both use K' for the rest of the session (HMAC(K',M)). Note that this is what SSH does, except that K' is refreshed periodically.
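These steps can be sketched end to end with toy numbers. Everything here is an assumption for illustration: the recipient's key pair is textbook RSA with tiny primes (insecure), K' is derived by simply hashing the nonce with SHA-256, which is one possible way to "negotiate" it, and all function names are made up. It is not a description of what SSH actually does. Python 3.8 or later is needed for the modular inverse.

```python
import hashlib
import hmac
import secrets

# Recipient's toy RSA key pair (insecure: primes are far too small).
N = 61 * 53
E = 17                                   # (N, E) is the published key P
D = pow(E, -1, (61 - 1) * (53 - 1))      # private key, kept by the recipient


def sender_make_nonce():
    """Sender picks a random nonce K and encrypts it with P."""
    k = secrets.randbelow(N - 2) + 2     # random integer in [2, N-1]
    return k, pow(k, E, N)


def recipient_recover(ciphertext: int) -> int:
    """Recipient decrypts the nonce with the private key."""
    return pow(ciphertext, D, N)


def derive_session_key(k: int) -> bytes:
    """Both sides turn the nonce into a session key K' (assumed rule)."""
    return hashlib.sha256(str(k).encode()).digest()


def tag(session_key: bytes, message: bytes) -> bytes:
    """HMAC(K', M) for the rest of the session."""
    return hmac.new(session_key, message, hashlib.sha1).digest()


k, wire = sender_make_nonce()
k_sender = derive_session_key(k)
k_recipient = derive_session_key(recipient_recover(wire))
assert k_sender == k_recipient
```

The slow public key operation happens once per session; every message after that is protected with the fast keyed hash.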