CS 111 Scribe Notes: Security
December 1, 2009
Authors:
Chen-Hsia Lui
Rajiv Makhijani
Jordan Mendler
Overview:
Computer security techniques often (if indirectly) make the computer harder to use or less useful.
Therefore we have to consider costs when trying to make computers more secure.
Understanding Threats (Know your enemy):
Threat modeling:
Build up a model of where/how your attackers will misbehave
Important to have a good model, or else you will spend your time and resources defending against the wrong types of attacks.
Threat classification:
Classify threats to come up with defense mechanisms for each
Come up with lots of ideas of how you would break into a computer
Look for patterns in these types of attacks
Come up with defense mechanisms for each type of attack
Example: Attacks on Courseweb
Types of threats:
Students (ordinary users) want to see each other's work to plagiarize
Outsiders want to gain students' personal info or copies of their assignments
Outsiders want to get assignments, and lecture materials
Modes of attack:
Guessing passwords of legitimate users
Exploit an Apache bug (or other infrastructure bug, e.g. SQL injection)
Denial of service (no one can use it, delay assignments)
Social Engineering (trick someone into giving you a password)
Take over a router (sniff packets, steal a session)
Break into the machine room, and steal a backup drive or tape. Or better yet, copy it so they don't know.
Set up a camera watching the terminal to snoop for people's PINs
Put a keylogger on the terminal
** Insider attack **
Not all of this is relevant to OS, but goal in OS is to provide an underlying infrastructure to help defend against these sorts of vulnerabilities.
General O.S. functions:
Lots of types of attacks, so have to come up with a general API and specialized tools, and hire dedicated security people.
Need general OS functions so developers can solve their own specific problems
Desired Functionality in our Operating System:
Authentication:
proving who you say you are
Integrity:
don't want to let attackers muck with data or metadata (especially security metadata)
Authorization:
Once we know who you are and have security metadata to know what you can do, are you allowed to do X?
Auditing (Logging):
So we can figure out what the bad guy did (including insiders).
So we can undo the changes they made.
Constraints:
Efficiency
we can't really log everything, or the system will be slow, and we essentially launch a DoS (Denial of Service) attack on ourselves.
Correctness
we want system to still work as advertised. Important because code/users are adversaries, so have to assume bugs will get found.
Authentication:
Authentication prevents masquerading (pretending you are someone else).
3 basic ways to authenticate:
based on who the principal (user in question) is.
retinal scan
based on what the principal knows
passwords
based on what the principal has
physical keys
These forms are often combined.
Some people have to authenticate with both a key and a password.
Can use one to bootstrap the other.
Use who I am, to get a new password
Techniques:
External: Accept connection from almost anyone, and have to authenticate
An example is the login program.
Passwords
Can be snooped or guessed.
Strong passwords are hard to remember and people write them down.
Restricting passwords also limits number of possibilities making them easier to crack.
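The point about restrictions shrinking the search space can be checked with a little counting. This is a minimal sketch under assumed parameters: the 8-character length, the 62-symbol alphabet, and the "at least one digit" rule are hypothetical, chosen only for illustration.

```python
import string

# Hypothetical policy parameters (assumptions, not from the lecture).
ALPHABET = string.ascii_letters + string.digits  # 52 letters + 10 digits
LENGTH = 8

# Every string of the given length over the alphabet.
unrestricted = len(ALPHABET) ** LENGTH

# Strings that contain no digit at all are forbidden by the rule,
# so an attacker who knows the rule never has to try them.
no_digit = len(string.ascii_letters) ** LENGTH
restricted = unrestricted - no_digit

print("candidates without the rule:", unrestricted)
print("candidates with the rule:   ", restricted)
```

The rule removes every all-letter candidate, so the attacker has strictly fewer passwords to try.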
Tokens
digital keys with number generators
RSA keys
Internal: Assume user has already gone through external authentication, so need to keep track of user throughout rest of system
Integrity must be respected
This will be consulted for authorization decisions
Logged during audits, so shouldn't reuse user IDs
Building Blocks:
Cryptographic hash functions
h(message) = hashvalue that is short (SHA-1 is 160 bits)
Given hashvalue, it is hard to guess message
There is a cat and mouse game. People come up with hash, and then people figure out some ways to possibly attack it. So these change over time.
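A short sketch of the hash properties above, using Python's standard hashlib module. The messages are made up for illustration. SHA-1 is used only because the notes mention it; it is one of the hashes that has since been attacked, so it should not be taken as a recommendation.

```python
import hashlib

message = b"attack at dawn"           # example input (assumption)
altered = b"attack at dusk"           # a slightly different input

digest = hashlib.sha1(message).digest()
other = hashlib.sha1(altered).digest()

# The output is short and fixed-size regardless of the input length.
print("digest bits:", len(digest) * 8)

# A small change in the message gives an unrelated-looking hash value.
print(digest.hex())
print(other.hex())
```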
Symmetric encryption:
Given a message (M) and key (K), it is easy to encrypt and decrypt the message with the key:
Given M, K easy to --> {M}_K
Given {M}_K, K easy to --> M
Given only a message and its encrypted form, it is hard to find K (Given M, {M}_K hard to --> K)
Given encrypted message and no key, it is hard to guess message
Problem:
both sender and receiver need to know the key. Communicating the key over network, it can get sniffed.
Asymmetric encryption:
Have public key (U) and private key (K), which come in a pair
Given public key, it is easy to encrypt a message with public key (Given M, U easy to --> {M}_U)
Given encrypted message and private key, it is easy to decrypt message (Given {M}_U, K easy to --> M)
If you have public key, can't find private key (U can't --> K)
Encrypted message requires private key to get message. ({M}_U can't --> M)
To receive messages, just publish your public key. Senders encrypt with your public key, but only you have the private key, so only you can decrypt
Problem:
harder to manage because there are 2 keys, and computationally more expensive, so don't want to overuse it.
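The public/private key relationship can be sketched with textbook RSA on tiny numbers. This is a toy under stated assumptions: the primes 61 and 53 and exponent 17 are illustrative values, there is no padding, and keys this small are trivially breakable. It needs Python 3.8+ for the modular inverse via pow.

```python
# Toy textbook RSA. NOT secure: tiny primes, no padding.
P, Q = 61, 53                 # secret primes (illustrative assumption)
N = P * Q                     # modulus, part of both keys
PHI = (P - 1) * (Q - 1)

E = 17                        # public exponent: U = (N, E)
D = pow(E, -1, PHI)           # private exponent: K = (N, D)

def encrypt(m):
    """Given M and U, easy to compute {M}_U."""
    return pow(m, E, N)

def decrypt(c):
    """Given {M}_U and K, easy to recover M."""
    return pow(c, D, N)

message = 65                  # must be smaller than N
ciphertext = encrypt(message)
recovered = decrypt(ciphertext)
print(message, ciphertext, recovered)
```

Finding D from (N, E) alone requires factoring N, which is what makes "U can't --> K" hold for realistically large primes.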
Combination:
Say A wants to talk to B:
A sends to B "hello, I am A: {NonceA}", encrypted with B's public key
B sends to A "hi, got your message, with {NonceA}, here is my {NonceB}", encrypted with A's public key
A sends to B "yep, it's me {NonceB}", with B's public key. Sends session key for symmetric encryption
Nonce is a random bit string.
A knows this Nonce was only sent to B, and only B has private key to decrypt, so B sending it back is valid.
Prevents malicious user from sending random packets pretending to be A or B
A does same with B's Nonce to do a mutual handshake
In reality, this conversation is as follows:
A -> B: {Hello, I'm A {Nonce(A)}_K(A)}_U(B)
B doesn't know A really sent the message
B -> A: {Hi, got your message {Nonce(A)}_K(A), {Nonce(B)}}_U(A)
Tells A we are really B cause only we know the Nonce it sent us
Gives a Nonce to A, so B can confirm it is really A
A -> B: {Yep, it's me {Nonce(B)}_K(B), session key}_U(B)
Tells B we are really A
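The three-message exchange can be simulated in one process. This is a rough sketch of the simplified version (nonces encrypted with the peer's public key, no private-key signing), built on toy textbook RSA. The helper names (make_keypair, encrypt, decrypt, handshake), the primes, and the nonce range are all assumptions for illustration, and none of this is secure or how SSH is actually implemented. Requires Python 3.8+.

```python
import secrets

def make_keypair(p, q, e=17):
    """Toy RSA key pair from two small primes (illustrative only)."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (n, e), (n, pow(e, -1, phi))   # (public U, private K)

def encrypt(m, public):
    n, e = public
    return pow(m, e, n)

def decrypt(c, private):
    n, d = private
    return pow(c, d, n)

def fresh_nonce():
    # Kept small so it fits under both toy moduli.
    return secrets.randbelow(3000) + 2

def handshake():
    a_pub, a_priv = make_keypair(61, 53)
    b_pub, b_priv = make_keypair(89, 97)

    # Message 1, A -> B: A's nonce, readable only by B.
    nonce_a = fresh_nonce()
    msg1 = encrypt(nonce_a, b_pub)

    # Message 2, B -> A: echo A's nonce and add B's own nonce.
    seen_a = decrypt(msg1, b_priv)
    nonce_b = fresh_nonce()
    msg2 = (encrypt(seen_a, a_pub), encrypt(nonce_b, a_pub))

    # A checks the echo: only B could have decrypted message 1.
    echo_a, seen_b = (decrypt(c, a_priv) for c in msg2)
    if echo_a != nonce_a:
        raise ValueError("peer is not B")

    # Message 3, A -> B: echo B's nonce and send the session key.
    session_key = fresh_nonce()
    msg3 = (encrypt(seen_b, b_pub), encrypt(session_key, b_pub))

    # B checks the echo: only A could have decrypted message 2.
    echo_b, key_at_b = (decrypt(c, b_priv) for c in msg3)
    if echo_b != nonce_b:
        raise ValueError("peer is not A")

    return session_key, key_at_b

key_a, key_b = handshake()
print("both sides share the session key:", key_a == key_b)
```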
This is how SSH does it
Client
.ssh/id_rsa.pub
Public to publish
.ssh/id_rsa
Private
If you view this file in a remote session, its content is sent over the Internet, which is bad.
.ssh/known_hosts
Server names, IP addresses, and public keys. The public key is there to foil masquerading: the client knows what key to expect.
Server
.ssh/authorized_keys
Public keys that can login, so long as they have the appropriate private key.
When SSH starts up, it sets up the "SSH transport layer"
Initial key exchange, which sets up session key as described above
Session key is renewed every so often
Packets are encrypted, to ensure privacy
Each packet is enclosed with a cryptographic checksum
This also ensures integrity
Send Message and HASH(Message + Key)
From Message and HASH, hard to figure out key
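A keyed checksum of this kind can be sketched with Python's standard hmac module. Note the substitution: this uses HMAC with SHA-256, a standard construction for keyed hashing, rather than a literal HASH(Message + Key). The key, the messages, and the function names are assumptions for illustration, not SSH's actual packet format.

```python
import hashlib
import hmac

def make_tag(message, key):
    """Keyed checksum the sender attaches to the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message, key, tag):
    """Receiver recomputes the checksum with the shared key."""
    return hmac.compare_digest(make_tag(message, key), tag)

session_key = b"shared-session-key"      # hypothetical shared key
packet = b"transfer 10 dollars"
tag = make_tag(packet, session_key)

print(verify(packet, session_key, tag))                   # untouched packet
print(verify(b"transfer 99 dollars", session_key, tag))   # tampered packet
```

An attacker who alters the packet cannot produce a matching checksum without the key, so the receiver detects the change.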
Authorization:
We know who users are, but want to keep track of what they are allowed to do
Access Control:
for each principal p; // User
for each resource r; // File, etc
for each access method m; // Read, write, execute, delete, etc
have a bit OK(p,r,m) that says yes or no
3-dimensional array of bits indexed by p,r,m
Simple model in principle, but not practical and too hard to monitor in practice
Instead, try to come up with a more complicated model to simplify the metadata so it is easier to maintain (e.g., user can access all files in a dir).
Downside is model becomes more complex.
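The OK(p,r,m) bit array can be sketched as a set of granted triples, which also shows why the dense form does not scale. The principals, resources, and grants below are made-up examples, and the function names are assumptions for illustration.

```python
# Access methods from the notes; the users and files are hypothetical.
METHODS = ["read", "write", "execute", "delete"]

# Sparse form of the 3-D bit array: store only the triples whose bit is 1.
GRANTED = {
    ("alice", "hw1.txt", "read"),
    ("alice", "hw1.txt", "write"),
    ("bob", "grades.db", "read"),
}

def ok(principal, resource, method):
    """The bit OK(p, r, m): yes or no."""
    return (principal, resource, method) in GRANTED

def dense_bits(num_principals, num_resources, num_methods):
    """Size of the full array if every bit were stored."""
    return num_principals * num_resources * num_methods

print(ok("alice", "hw1.txt", "write"))
print(ok("bob", "hw1.txt", "write"))
print(dense_bits(1_000, 1_000_000, len(METHODS)))
```

With 1,000 users, 1,000,000 files, and 4 access methods, the full array holds 4,000,000,000 bits, each of which someone would have to set correctly.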