Lecture 11 Scribe Notes
June 1, 2010
Scribe: Alen Zamanyan
How do you access resources?
direct
application has a pointer to OS object
access check done when pointer is given to application (often via VM)
+ fast access after check
- resource can be more easily corrupted (if hardware checking is valid, we're okay; otherwise can be problematic - eg application can write anything to screen buffer)
indirect
application gets an opaque handle (eg file descriptor)
- slower access: system call overhead
+ fine-grained access control (in software - eg revoke access to file descriptor; no need to rely on hardware for this)
Most operating systems use one of these two methods (depends on whether we want speed or flexibility)
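The trade-off above can be seen in a small sketch, assuming a POSIX system. The function names and the temporary file are made up for illustration; the point is only that the indirect path pays one system call per access, while the direct path is checked once when the mapping is set up.

```python
import mmap
import os
import tempfile

def read_indirect(path, offset, length):
    """Indirect access: each read goes through the kernel via an opaque handle."""
    fd = os.open(path, os.O_RDONLY)           # access check happens here
    try:
        return os.pread(fd, length, offset)   # one system call per access
    finally:
        os.close(fd)

def read_direct(path, offset, length):
    """Direct access: map the file once, then read it like ordinary memory."""
    with open(path, "rb") as f:               # access check happens here
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + length]  # plain memory read, no system call

def demo_access():
    # Hypothetical scratch file, used only for this illustration.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello, access control")
        path = tmp.name
    try:
        return read_indirect(path, 7, 6), read_direct(path, 7, 6)
    finally:
        os.unlink(path)
```

Both calls return the same bytes; what differs is where the checking cost is paid.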
Access Control Goals
Nongoal: preventing denial of service
a separate defense mechanism is needed for this (usually not part of access control; done at a different level)
A good access control method accomplishes all of this
To design:
threat analysis
security model
Example:
a network tech is updating the network software for a Republican senator's office
we look over his shoulder, get his password, and use it to access the files/emails of the Democrats
We need a way of keeping track of what accesses are allowed
What about the access control data itself (should not be tampered with)?
must be updatable (sensitive operation)
controlled operation needed
indirect access, due to sensitivity (it's rare that hardware will provide a mechanism for access control according to our specific needs)
There are two main ways to do this:
access control lists (ACLs) - list of people who can access each object (eg a guard that checks ID)
capabilities - no centralized access control list; distribute a key to each user with access (eg we need a key to open a lock)
In both cases
must be unforgeable (at least part must live "in" operating system)
must be consulted before access (no way to bypass)
hardware and/or operating system support needed
Trying to represent a 3D space
3D array of booleans - Can this principal access this object for this operation?
if principals = 1e4,
objects = 1e6,
operations = 1e2,
we need principals * objects * operations = 1e12 bits to store AC info!
this is too much metadata
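To get a feel for the size of the full 3D array, the arithmetic above can be checked directly. This is a throwaway calculation; the function name is made up, and the counts are the ones from the notes.

```python
def matrix_bits(principals, objects, operations):
    """Bits needed for a full principal x object x operation boolean array."""
    return principals * objects * operations

bits = matrix_bits(10**4, 10**6, 10**2)
gigabytes = bits / 8 / 10**9   # 8 bits per byte, 10**9 bytes per GB
```

With these counts the array needs 1e12 bits, which is 125 GB of access control metadata.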
On the other hand...
Unix Permissions Model
Each object has 9 permission bits
ls -l output
rwxrwxrwx ... <-------- 9 bits
we need 32 bits to store the owner and another 32 bits to store the group, so 64 + 9 = 73 bits/object
with 1e6 objects, we only need 73e6 bits of metadata
+ much more compact!
+ easy to check
- only 3 operations (read, write, exec)
Initially in Unix, a process could belong to only one group; the Berkeley folks changed it so a process can belong to several groups simultaneously (the limit is relatively small - up to 8)
Problems with groups:
1) this is a bit too generous in many cases
2) sysadmin (root) is in charge of group membership - inflexible
3) hard to maintain - lots of people, lots of roles
Also, say someone works for payroll and billing but shouldn't be able to mix the two up - there is no support for this
We can give users different userids for each set of objects they can access - but the user would have to log out of billing account and log back in to access payroll
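The 9-bit check is simple enough to sketch in a few lines. This is a simplified model, not the kernel's code: it ignores root and any extra mode bits, and the function and parameter names are made up. It does follow the Unix rule of picking exactly one rwx triple (owner first, then group, then other).

```python
def may_access(mode, owner, group, uid, gids, want):
    """Return True if a process (uid, set of gids) may do `want` ('r', 'w' or 'x')."""
    bit = {"r": 4, "w": 2, "x": 1}[want]
    if uid == owner:
        triple = (mode >> 6) & 7   # owner bits
    elif group in gids:
        triple = (mode >> 3) & 7   # group bits
    else:
        triple = mode & 7          # other bits
    return bool(triple & bit)
```

For example, with mode 0o754 (rwxr-xr--), a group member can execute but not write. With mode 0o074 the owner is denied even if the owner is also in the group, because only the owner triple is consulted.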
ACLs (widely introduced in Windows NT)
now in Linux, Solaris
more complicated representation of permissions - associated with each file (& operation) is a list of users (+ groups) that can access it
eg - for a given object, we have a list of users & associated operations and another list of groups & associated operations
$ getfacl object <--------- get file access control list
user::rwx
user:adnan:rwx
group::r-x
other::---
eg -
/u/class/spring10/cs111 <---------- has access control list that allows 2 TAs to read/write
There is also a setfacl command that allows us to set the file access control list for a given object.
This getfacl, setfacl pair of commands takes care of #2 above (ie it adds flexibility: the file's owner, not just the sysadmin, can decide who gets access)
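A per-object list of users and groups can be sketched as follows. This is a deliberately simplified model, not the POSIX ACL algorithm (it has no owner, other, or mask entries). The group name "cs111-ta" and the user names other than adnan are hypothetical.

```python
# Each entry: (kind, name, permitted operations)
EXAMPLE_ACL = [
    ("user", "adnan", "rwx"),
    ("group", "cs111-ta", "rw"),   # hypothetical TA group
]

def acl_allows(acl, user, groups, op):
    """Check one operation ('r', 'w' or 'x') for a user who is in `groups`."""
    for kind, name, perms in acl:
        if kind == "user" and name == user:
            return op in perms     # a named-user entry decides on its own
    return any(kind == "group" and name in groups and op in perms
               for kind, name, perms in acl)
```

Compared with the 9-bit model, the list can name any number of users and groups per object, at the cost of a variable-size lookup on every check.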
Role-based Access Control (RBAC)
used in Solaris, Active Directory
Users can assume roles
rights are associated with roles not with users
if you assume a role (employee, instructor-in-charge, student), you may lose rights of previous role
you can have >1 sessions with different roles
this is more appropriate for a large organization where roles are clearly defined
it often comes with fine-grained control over operations
eg - normally, calling unlink(d), where d is a directory, is not allowed (why? - answer below); however, role-based AC can grant selected users the right to unlink a directory
Why can't we normally unlink a directory?
unlink("bin") decrements the directory's link count (linkcount--)
in this case, linkcount goes from 2 to 1 (the directory's own "." entry still refers to it), but we have lost the only name that pointed to this directory and its storage is never reclaimed (leak)
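As a rough sketch of the role model: rights hang off roles, a session carries exactly one role, and switching roles drops the old role's rights. The role names come from the notes; the specific rights assigned to each role and the user name are made up for illustration.

```python
# Hypothetical rights per role; only the role names come from the lecture.
ROLE_RIGHTS = {
    "student": {"read", "submit"},
    "instructor-in-charge": {"read", "grade", "unlink_directory"},
}

class Session:
    def __init__(self, user, role):
        self.user = user
        self.role = role

    def assume(self, role):
        self.role = role            # rights of the previous role are lost

    def can(self, op):
        return op in ROLE_RIGHTS[self.role]
```

A user may hold several Session objects at once, each with a different role, which matches the ">1 sessions" point above.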
eg - (another example of fine-grained control in RBAC) - linking to a file that you don't own is normally allowed, but can be disallowed in role-based access.
Say you want to do the following:
link("/user/adnan/a", "b")
Why might we need to disallow this?
If user adnan grants me permission to '/user/adnan/a' and later wants to revoke it, I may already hold a hard link (namely 'b') to the file - if he had disallowed linking from the start (as can be done in RBAC), this problem would not arise
A hacker does the following:
$ ls -l /bin/passwd
-rwsr-xr-x root /bin/passwd
$ ln /bin/passwd $HOME/p
Suppose a bug is found in the passwd program (the hacker knows about this bug)
When we try to fix the security hole, we do the following
# cat fix > /bin/password.new <----------- create new file
# chmod 4755 /bin/password.new
# mv /bin/password.new /bin/passwd
The bug doesn't get fixed: mv only replaces the name /bin/passwd. The old file's link count was 2 (instead of the expected 1), so the buggy setuid-root program is not deleted; it lives on as $HOME/p, and the hacker is still able to exploit the bug
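The effect of the stray hard link can be reproduced safely in a scratch directory. This sketch assumes a POSIX filesystem that supports hard links; the file names and contents are stand-ins, and no setuid bit is involved.

```python
import os
import tempfile

def demo_stale_link():
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "passwd")      # stand-in for the buggy program
        stash = os.path.join(d, "p")            # stand-in for the attacker's link
        with open(target, "w") as f:
            f.write("buggy")
        os.link(target, stash)                  # attacker: ln passwd p
        links_before = os.stat(target).st_nlink
        new = os.path.join(d, "passwd.new")
        with open(new, "w") as f:
            f.write("fixed")
        os.rename(new, target)                  # admin: mv passwd.new passwd
        with open(target) as f:
            official = f.read()
        with open(stash) as f:
            stashed = f.read()
        return links_before, official, stashed, os.stat(stash).st_nlink
```

The official name now holds the fixed contents, but the old contents remain readable through the second link, whose link count has dropped from 2 to 1.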
Capabilities
Possession of this 'word' implies rights to an object (the 'word' may be 64 or 128 bits wide)
Grant permission to a process by passing it a key
But the process can email the key to anyone else, potentially granting access to users who weren't meant to have it!
How to implement
1) encryption - capabilities sent across network
in effect, process1 does 'fd = open();' and sends fd to process 2
2) index into operating system table (eg file descriptor)
Can we forge a file descriptor? - No, we must use 'open', which uses an access control method internally
Issues
1) 'words' must be wide enough that they can't be guessed (at least 64 bits)
2) containment - the capability can escape! (eg if you log it somewhere)
Capabilities are lightweight, easy to understand - as a result, they are popular in academia
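A minimal sketch of the idea, assuming the trusted side keeps a table from unguessable 128-bit words to objects and rights. The class and method names are made up, and a real system would keep this table inside the operating system rather than in a Python object.

```python
import secrets

class CapabilityTable:
    def __init__(self):
        self._table = {}                       # token -> (object, rights)

    def grant(self, obj, rights):
        token = secrets.token_bytes(16)        # 128 random bits: infeasible to guess
        self._table[token] = (obj, frozenset(rights))
        return token

    def access(self, token, op):
        # No identity check: whoever presents the token gets the rights.
        entry = self._table.get(token)
        if entry is None or op not in entry[1]:
            raise PermissionError("invalid capability")
        return entry[0]

    def revoke(self, token):
        self._table.pop(token, None)
```

Note that access() never asks who the caller is. That is both the appeal (lightweight, easy to delegate) and the containment problem listed under Issues.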
Denial of Service Attacks
defense methods
What if 1e6 attackers visit whitehouse.gov/feedback, add comment: 'Prez is dope!' - how can we defend against this?
use a captcha - a robot can't read it, but a human can
log IP addresses and keep track of (recent) bad guys
most botnets hard-wire the target's IP address for performance (they can't afford a DNS lookup) - so change the server's IP address
make your web server faster (enough capacity that DoS isn't really effective)
how do we make Apache faster?
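for (;;) {
    /* listen_fd: listening socket, assumed to be set up earlier */
    int fd = accept(listen_fd, NULL, NULL);
    char buf[BUFSIZ];
    ssize_t n = read(fd, buf, sizeof buf);   /* <-------- this hangs if there are no bytes! */
    if (n > 0)
        handle_request(fd, buf, n);          /* handle request */
    close(fd);
}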
Approach 1 - fork():
What if we try to fork off a child process to do the reading of bytes across the stream and handling of requests?
this is slow because Apache is a big process, and fork has to copy it
use vfork instead (the child shares the parent's memory rather than copying it)
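A sketch of the fork-per-connection idea, assuming a POSIX system. It uses plain fork, since Python has no vfork wrapper. The function names, the echo reply, and the loopback test client are made up for illustration.

```python
import os
import socket

def serve_one_with_fork(listener):
    """Accept one connection and hand it to a freshly forked child."""
    conn, _ = listener.accept()
    pid = os.fork()
    if pid == 0:                        # child: does the slow read and the work
        try:
            data = conn.recv(1024)
            conn.sendall(b"echo:" + data)
        finally:
            os._exit(0)                 # never fall back into the parent's code
    conn.close()                        # parent: drop its copy, free to accept again
    return pid

def demo_fork():
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
    listener.listen()
    client = socket.create_connection(listener.getsockname())
    client.sendall(b"ping")
    pid = serve_one_with_fork(listener)
    reply = client.recv(1024)
    os.waitpid(pid, 0)                  # reap the child
    client.close()
    listener.close()
    return reply
```

The parent never calls recv, so a client that connects and then stalls only ties up its own child.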
Approach 2 - go multithreaded:
all threads must share memory
it's a pain to maintain scripts from different authors
- bug in one handler can corrupt the whole system
+ faster
Approach 3 - preforked children:
Apache forks 20 children at startup
child handles request, but already exists
if it dies, fork off another one
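The preforking idea can be sketched like this, assuming a POSIX system. To keep the demo finite, each worker handles a single connection and exits, and the restart-on-death step is left out. All names and the reply format are made up for illustration.

```python
import os
import socket

def worker(listener):
    """Child: handle one connection (a real worker would loop forever)."""
    conn, _ = listener.accept()         # all workers wait on the same socket
    data = conn.recv(1024)
    conn.sendall(str(os.getpid()).encode() + b":" + data)
    conn.close()

def prefork(listener, n):
    """Fork n workers up front, before any request arrives."""
    pids = []
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            try:
                worker(listener)
            finally:
                os._exit(0)
        pids.append(pid)
    return pids

def demo_prefork(n=3):
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    pids = prefork(listener, n)
    replies = []
    for i in range(n):
        with socket.create_connection(listener.getsockname()) as c:
            c.sendall(b"req%d" % i)
            replies.append(c.recv(1024))
    for pid in pids:
        os.waitpid(pid, 0)              # a real parent would fork a replacement here
    listener.close()
    return pids, replies
```

The fork cost is paid once at startup instead of once per request, which is the point of this approach.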
Approach 4 - Event-based (fastest web servers):
+ nonblocking I/O
+ multithreaded (1/CPU) - threads never wait
Ultimately, this is an example of an application (namely, the web server) doing its own scheduling - because the Linux scheduler sucks
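A single-threaded sketch of the event-based approach, using Python's selectors module. It is a simplification: one thread rather than one per CPU, one request per connection, and replies small enough that sendall on a nonblocking socket does not fill the buffer. The function names and the echo protocol are made up for illustration.

```python
import selectors
import socket

def run_event_loop(listener, max_requests):
    """Serve until max_requests are handled, never blocking on one client."""
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    handled = 0
    while handled < max_requests:
        for key, _ in sel.select():             # sleep until some socket is ready
            sock = key.fileobj
            if sock is listener:
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            else:
                data = sock.recv(1024)          # socket is ready, so this won't hang
                if data:
                    sock.sendall(b"echo:" + data)
                    handled += 1
                sel.unregister(sock)
                sock.close()
    for key in list(sel.get_map().values()):    # clean up clients that never spoke
        sel.unregister(key.fileobj)
        if key.fileobj is not listener:
            key.fileobj.close()
    sel.close()
    return handled

def demo_event_loop():
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    addr = listener.getsockname()
    idle = socket.create_connection(addr)       # connects first, sends nothing
    busy = [socket.create_connection(addr) for _ in range(2)]
    busy[0].sendall(b"one")
    busy[1].sendall(b"two")
    handled = run_event_loop(listener, 2)
    replies = [c.recv(1024) for c in busy]
    for c in [idle, *busy]:
        c.close()
    listener.close()
    return handled, replies
```

In the demo the idle client connects first. The simple accept/read loop would hang on it, while the event loop just serves whichever sockets have data.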