Lecture 18 - Computer Security
By Michael Koyama, Wan Choi
Functions of a Security Mechanism

A security mechanism needs the following five functions in order to be
complete:
- Authentication: The security mechanism should ask the principal* to
  prove who he is. (e.g. password)
- Integrity: The security mechanism should prevent the tampering of
  data. (e.g. checksum)
- Authorization: The security mechanism should keep track of who can
  do what within the system. (e.g. Access Control List)
- Correctness: The security mechanism should have no bugs, since
  security bugs are very likely to be exploited.
- Efficiency: The security mechanism should not seriously slow down
  the system.

* A principal is the actor "in charge," i.e. the user accessing the
system.
Authentication

How should the security mechanism authenticate the principal? There are
three methods of authentication:
- Based on something the principal knows. (e.g. password)
- Based on something the principal has. (e.g. physical key)
- Based on something the principal is. (e.g. retinal/fingerprint
  scanner)
Each method has its own pros and cons; choosing the best one depends on
the context in which the security mechanism will be used. Furthermore, a
combination of the three methods is always stronger than any single one.
For example, if the principal is required to have both a password and a
physical key to access the system, then there is less chance of an
unauthenticated user getting into the system, since an attacker would
need both the password and the physical key for access.
Issuing new authentication is essentially a bootstrapping problem: new
credentials are handed out to principals on the basis of old ones. For
example, in order to set up an account on Seasnet, you must
authenticate yourself with a photo ID, such as your Bruin Card or your
driver's license.
There are two types of authentication: external and internal. External
authentication deals with principals who want to come into the system,
and internal authentication handles principals already inside it. It
works like a fort: to get inside the fort, one must pass through one of
the gates, which act like external authentication. Once inside the
fort, to access one of the many important places, such as the barracks
or the medical center, one must get past guardsmen, who act like
internal authentication.
External authentication is mainly done by a login agent, which lives
inside the OS but talks to outsiders. An example of a login agent is
the Unix login command, which runs off of a password database. If the
principal can provide a correct password, then the login command grants
the principal access. External authentication is relatively expensive,
but it happens infrequently. There are a few possible attacks on the
login command, and each attack has a counter that can be employed by
the OS:
- Password guessing vs. throttling (i.e. counting login attempts and
  slowing down or locking out after repeated failures)
- Keylogging vs. [hard to counter]
- Brute force against /etc/passwd vs. moving the hashed passwords into
  /etc/shadow, which is readable only by root
  - Restricting access via /etc/shadow is more robust than relying on
    the password hashing alone, because breaking the hashes only gets
    easier as hardware improves.
- Password sniffing vs. encryption over the wire
- Social engineering / fraudulent server vs. [hard to counter]
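Throttling can be sketched in a few lines of C. This is a hypothetical
illustration, not the actual login implementation: the function names,
the attempt limit, and the plaintext comparison are all made up for the
example (a real login agent would compare salted hashes from
/etc/shadow and would also add delays between attempts).

```c
#include <stdbool.h>
#include <string.h>

#define MAX_ATTEMPTS 3  /* assumed lockout limit for this sketch */

static int failed_attempts = 0;

/* Hypothetical throttled check: refuses all further attempts once
 * MAX_ATTEMPTS consecutive failures have been counted. */
bool try_login(const char *password, const char *correct) {
    if (failed_attempts >= MAX_ATTEMPTS)
        return false;               /* throttled: account locked out */
    if (strcmp(password, correct) == 0) {
        failed_attempts = 0;        /* success resets the counter */
        return true;
    }
    failed_attempts++;              /* count the failed guess */
    return false;
}
```

After three wrong guesses, even the correct password is rejected, which
is what defeats unbounded password guessing.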
Internal authentication is done through the process descriptor, which
records the principal's identity (a uid_t value). These records are
consulted on each access decision and logged for future use. Since
internal authentication must happen every time a principal wants to
access something, it occurs very frequently; therefore, it must be
cheap. Only privileged code can change the process descriptor records.
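A process can inspect its own kernel-recorded identity through the
standard getuid()/geteuid() syscalls; a minimal sketch (the helper name
is made up for this example):

```c
#include <sys/types.h>
#include <unistd.h>

/* Returns nonzero when the effective uid differs from the real uid,
 * i.e. when the process is running with borrowed privileges, as a
 * setuid program like login does. Both values come from the process
 * descriptor, so reading them is a cheap syscall - which is why the
 * kernel can afford to consult them on every access. */
int privileges_elevated(void) {
    return geteuid() != getuid();
}
```

For an ordinary program run directly by a user, the real and effective
uids are equal; they differ only inside setuid programs.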
Case Study: savannah.gnu.org break-in over Thanksgiving

Over the past Thanksgiving, a break-in occurred on savannah.gnu.org.
The break-in was carried out using a method called SQL injection.
SQL injection involves taking a legitimate login query and smuggling an
"extra" query into the user-supplied string. The "extra" query can then
be used to obtain information that would not normally be available,
such as the password database.
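As an illustrative sketch (not the actual savannah code), a server that
builds its SQL by pasting the user's string directly into the query
text is vulnerable. The function below is hypothetical; it shows how
the classic tautology injection rides inside the user name:

```c
#include <stdio.h>

/* Hypothetical, VULNERABLE query builder: the user-controlled name is
 * concatenated straight into the SQL text with no escaping. */
void build_query(char *out, size_t outsize, const char *user_name) {
    snprintf(out, outsize,
             "SELECT * FROM users WHERE name = '%s';", user_name);
}
```

With the input `alice' OR '1'='1`, the resulting WHERE clause becomes a
tautology that matches every row, letting the attacker read data the
login query was never meant to return. The standard defense is to use
parameterized queries instead of string concatenation.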
The attackers were able to gain access to /etc/shadow, where the hashed
passwords were stored. By brute force, they were able to crack one of
the administrators' passwords and gained administrative privileges.
From there, they tampered with code stored on the server and inserted
security holes that they could exploit later.

In order to repair the damage done by the break-in, the administrators
at savannah.gnu.org did the following. The first thing they did was
reset all the passwords. Then, they went through the logs to see when
the attack occurred and restored from a backup taken right before the
break-in. Finally, the administrators asked all the users to re-submit
anything that had been submitted after the attack. This way, all traces
of the attack were erased.
Security Through Obscurity

One approach to security is security through obscurity. This approach
attempts to use the secrecy of the design, algorithm, or implementation
to provide security. For example, instead of storing our passwords in
/etc/passwd, we could instead store them in /etc/.ordinary or even in
/tmp/.sort9s. This approach attempts to keep both the design and the
data secret from the attacker. Historically, this approach has not
worked well: attackers will almost always be able to find out the
algorithm or the design of the system. This approach also raises the
issue of correctness in security: the more things we have to keep
secret, the harder it is for us to verify the correctness of all of
them.

Most operating systems follow Kerckhoffs's principle instead of
security through obscurity. Kerckhoffs's principle states that a system
should be secure as long as its keys are secure. Everything about the
system may be known to the enemy except what is most important: the
keys to the system. We try to keep our secrets as small as possible so
that we can manage them better. This ties into a subject we will
discuss later.
Controlling Access to Resources

There are two types of control for resource access: direct and
indirect.

Direct control
- Hardware checks each access to resources, so checks are fast but can
  only enforce simple rules.
- Access is mapped into the address space (e.g. virtual memory). The
  principal is only allowed to access addresses within its own address
  space.

Indirect control
- Access is obtained by issuing service requests (e.g. syscalls).
- Requests are handled by trusted code (e.g. the kernel).
- The OS checks each access to resources, so this is very flexible but
  slower than direct control.
We'll focus more on indirect control with an example illustrating the
authorization checks for unlinking a/b/c. First, the kernel retrieves
the principal's UID and GID from the process descriptor. Then, the
kernel checks that the principal has search permission (the x bits in,
say, rwxr-xr-x) on ., a, and a/b. Finally, the kernel checks that the
principal has write permission on a/b. The principal only needs write
permission on a/b, and not on . or a, because unlinking only modifies
the directory a/b (it removes c's entry from that directory).
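The kernel's per-component check can be sketched with a toy model of
the Unix mode bits. This is an illustration only, not the real kernel
code: a real check also handles root, supplementary groups, ACLs, and
more.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Toy access check: select the owner, group, or other permission
 * triple depending on who is asking, then test the requested bits.
 * `want` is a bitmask in the usual encoding: 4 = read, 2 = write,
 * 1 = execute/search. */
int may_access(mode_t mode, uid_t file_uid, gid_t file_gid,
               uid_t uid, gid_t gid, int want) {
    int perms;
    if (uid == file_uid)
        perms = (mode >> 6) & 7;   /* owner bits */
    else if (gid == file_gid)
        perms = (mode >> 3) & 7;   /* group bits */
    else
        perms = mode & 7;          /* other bits */
    return (perms & want) == want;
}
```

In these terms, unlinking a/b/c requires may_access(..., 1) (search) on
., a, and a/b, plus may_access(..., 2) (write) on a/b.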
Here's a scenario: a professor wants to create a directory that he and
his TAs can access but the students cannot. An easy way to do this
would be to create a group that includes the professor and the TAs, but
only root can create groups. This means that only the server
administrators can help the professor make this directory. Once the
group "cs111" has been created with the professor and the TAs in it,
the professor can run the following commands to create the directory:

mkdir d
chgrp cs111 d
chmod 770 d

This creates a new directory d and sets its permissions so that the
owner and the group "cs111" have rwx permission, while everyone else
has no access.
Access Control Lists

For the professor in the above example, it would have been easier to
use an access control list, or ACL. An ACL is a list of permissions
attached to an object that specifies which users or system processes
are granted access to the object and what operations are allowed on
it. getfacl and setfacl can be used to get and set a file's ACL,
respectively. ACLs are manipulated by syscalls, and they can only be
manipulated by the file's owner. The professor could have set ACLs on
the files in the directory so that only he and his TAs can access
them. This would not require any help from the server administrator,
since the professor is the owner of the files.

What, then, is the default ACL for a newly created file? In the
traditional permission scheme, the analogue is umask, which sets the
file-mode creation mask of the current process. Instead of using umask,
though, each directory carries a default ACL for the files inside it.
When a new directory or file is created within a directory that has a
default ACL, the newly created directory or file inherits that default
ACL.
[Image: a compact representation of an object's permissions]
Access Control Lists (ACLs) vs. Capabilities

ACLs and capabilities are both forms of credentials that can be used by
the operating system when accessing files.

Access Control Lists

Access control lists are associated with an object and are checked at
each access. At any moment in time, the access control list is a
centralized repository of who has rights to the object. An ACL's size
is unbounded, which makes checking it slow. Additionally, because the
size of ACLs is unbounded, hardware support for implementing them is
unlikely.
Capabilities

A capability is a convenient handle for an object. One advantage of
capabilities is that you can send them to others. The main parts of a
capability are a unique id number, the rights for the object, and a
checksum for authentication. Because of the checksum, these records
cannot be easily forged. Checking these credentials is fast compared to
checking access control lists.

Unix file descriptors are similar to capabilities. Like capabilities,
file descriptors carry one fixed set of rights for the file. File
descriptors can easily be given to another program, for example via
fork. Finally, access checks on file descriptors are faster than the
path-based checks on ordinary file accesses. Unix systems use both ACLs
and capabilities.
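A small sketch of the capability-like behavior of file descriptors:
once a process holds an open descriptor, it keeps its rights to the
file even after the name is removed. The function name and the path in
the test are made up for this example.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if an open descriptor keeps working after the file's name
 * is unlinked, demonstrating that the fd acts as a capability. */
int fd_survives_unlink(const char *path) {
    char buf[16] = {0};
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return 0;
    if (write(fd, "secret", 6) != 6) return 0;

    /* Remove the name: path-based (ACL-checked) access is now gone. */
    if (unlink(path) != 0) return 0;
    if (open(path, O_RDONLY) != -1) return 0;

    /* ...but the descriptor still grants access to the file. */
    if (lseek(fd, 0, SEEK_SET) != 0) return 0;
    if (read(fd, buf, sizeof buf - 1) != 6) return 0;
    close(fd);
    return strcmp(buf, "secret") == 0;
}
```

The path-based open() is checked against the file's permissions on
every call, while the descriptor was checked once, at open time, and
thereafter acts as an unforgeable ticket held by the process.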
Trusted Software

There are certain programs in Unix that are labeled as "trusted." For
example, the login program is trusted: it must run as root so that it
can read the password file. However, the cat program does not need to
run as root; it runs as the user who is currently logged in.

To differentiate which programs are "trusted," each executable's
permission bits include an extra setuid bit that tells the system
whether the program runs as the invoking user or as the file's owner
(typically root for trusted programs). If this bit is 0, the program is
run as the user currently logged in. If this bit is 1, the program runs
with the owner's privileges.
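The setuid bit lives in the file's mode alongside the rwx bits. The
sketch below sets and inspects it on a scratch file (the helper name
and path are made up; actually honoring the bit when the program is
executed is the kernel's job):

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Set the setuid bit on a scratch file and report whether stat()
 * sees it in the file's mode. */
int setuid_bit_visible(const char *path) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0755);
    if (fd < 0) return 0;
    close(fd);

    /* 04755: rwxr-xr-x plus the setuid bit (S_ISUID). */
    if (chmod(path, S_ISUID | 0755) != 0) return 0;

    struct stat st;
    if (stat(path, &st) != 0) return 0;
    int has_bit = (st.st_mode & S_ISUID) != 0;
    unlink(path);  /* clean up the scratch file */
    return has_bit;
}
```

When the kernel execs a file whose mode has S_ISUID set, it sets the
process's effective uid to the file's owner, which is how login gains
the privileges needed to read the password database.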
The correctness function of security is highly dependent on trusted
software. Since trusted programs run as root, bugs in trusted programs
are openings for attackers to break into your system. Bugs in programs
that are not trusted, such as cat, are less important to find, as there
is less of a chance of breaking in through these programs.
The discussion of trusted software brings us to the question, "What
software can we trust?" Which parts of the code do we have to trust so
that our secrets are not handed out? To answer this question we define
a "trusted computing base," which is the set of trusted software. This
base includes the kernel, the login and password code, and the C
libraries. One common program that is not in our trusted computing base
is chmod. We want our trusted computing base to be small so that we can
more thoroughly check this set of software for correctness.

However, how can we know that our copies of these programs are not
frauds? Trusted software also needs to be distributed by a trusted
method, such as verifying an SHA1 hash of the download. We also need to
trust the developers of the software, at Ubuntu for example, not to
write buggy or malicious code.
How to Break into Ubuntu (Ken Thompson, creator of Unix) -
Reflections on Trusting Trust

If Ken Thompson wanted to, he could have inserted a bug in his original
version of UNIX that would enable him to break into every UNIX-based
distribution today, as well as any future versions of UNIX based on his
original design.

In login.c we would just have to insert the following code:

if ( strcmp ( name, "kent" ) == 0 )
{
    uid = 0;      /* grant root to this special user */
    return OK;    /* skip the password check entirely */
}
However, if anyone attempted to look at the code for the login program,
they would see this suspicious section of code. An easy workaround
would be to have the compiler (gcc, for example) insert this code into
login.c while compiling it. Again, this runs into the same problem as
before: if anyone were to look at the source code for gcc, they would
be able to spot the suspicious code being added. A final step past this
is to insert the code-inserting code into gcc itself when compiling
gcc, so that the backdoor lives on in the compiler binary even after
every trace of it is removed from the sources. We could come up with
more and more convoluted schemes for hiding this bug in our login
program. Nevertheless, the point of the login-bug example is to show
that we must place our trust in something.
For the full article written by Ken Thompson:
http://cm.bell-labs.com/who/ken/trust.html