Lecture 17 - NFS & Introduction to Security
Authors: Oliver Doan, Patrick Brown, Phillip Chiu
Network File System
Files on an NFS are located on a server and not on clients. Clients' files do not matter to other clients or the server.
How do you get files from an NFS server?
- Use nfs_open: nfsfd_t f = nfs_open("name",...)
- This is infeasible, since every application would have to be rewritten to call nfs_open
- Use standard file descriptors: FILE *f = fopen("name",...)
- NFS support is added inside fopen()
- This also falls short, because many applications bypass the standard I/O library and use lower-level system calls such as open() directly
- Solution: Rewrite system calls in the kernel to support NFS
- System calls will be implemented via Remote Procedure Call (RPC)
NFS is its own type of filesystem. It uses a mount table to connect other data areas on the server to the root. The user views this as one big data tree.
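The mount-table idea can be sketched as a longest-prefix match on path names. This is only a minimal illustration under simplifying assumptions, not how any particular kernel implements it; the struct, the table contents, and find_mount are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mount-table entry: a mount point and the file system found there. */
struct mount_entry {
    const char *prefix;   /* mount point, e.g. "/home" */
    const char *fs_name;  /* illustrative label for the mounted file system */
};

/* Example table; the paths and labels are invented for illustration. */
static const struct mount_entry mount_table[] = {
    { "/",          "root-fs" },
    { "/home",      "home-fs" },
    { "/home/proj", "proj-fs" },
};

/* Return the entry whose mount point is the longest prefix of path,
 * matching whole path components only. Returns NULL if nothing matches. */
const struct mount_entry *find_mount(const char *path)
{
    assert(path != NULL);
    const struct mount_entry *best = NULL;
    size_t best_len = 0;
    size_t n = sizeof mount_table / sizeof mount_table[0];

    for (size_t i = 0; i < n; i++) {
        const char *p = mount_table[i].prefix;
        size_t len = strlen(p);
        if (strncmp(path, p, len) != 0)
            continue;
        /* The root matches any absolute path; any other mount point
         * must end exactly at a component boundary in path. */
        int boundary = strcmp(p, "/") == 0 ||
                       path[len] == '/' || path[len] == '\0';
        if (boundary && len >= best_len) {
            best = &mount_table[i];
            best_len = len;
        }
    }
    return best;
}
```

With this table, find_mount("/home/alice/x") selects the "/home" entry, while find_mount("/homework") falls back to the root entry because "/home" does not end at a component boundary there.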
NFS protocol requests resemble UNIX system calls.
Request | Response |
LOOKUP(dirfh, name) | fh + attrs (fh = file handle) |
CREATE(dirfh, name, attrs) | fh + attrs |
MKDIR(dirfh, name, attrs) | fh + attrs |
REMOVE(dirfh, name) | status |
RMDIR(dirfh, name) | status |
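One way to picture how a client uses LOOKUP is to resolve a path one component at a time, starting from a root handle. The sketch below is a toy: handles are plain integers, the "server" is a small in-memory table, and toy_lookup, resolve, and FH_INVALID are hypothetical names. A real client would send each LOOKUP as an RPC.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FH_INVALID (-1)   /* hypothetical "no such entry" result */

/* Toy directory entry: (directory handle, name) -> file handle. */
struct toy_dirent {
    int dirfh;
    const char *name;
    int fh;
};

/* Illustrative namespace /usr/bin/ls, with the root directory at handle 0. */
static const struct toy_dirent entries[] = {
    { 0, "usr", 1 },
    { 1, "bin", 2 },
    { 2, "ls",  3 },
};

/* Stand-in for the LOOKUP(dirfh, name) request. */
int toy_lookup(int dirfh, const char *name, size_t name_len)
{
    size_t n = sizeof entries / sizeof entries[0];
    for (size_t i = 0; i < n; i++) {
        if (entries[i].dirfh == dirfh &&
            strlen(entries[i].name) == name_len &&
            memcmp(entries[i].name, name, name_len) == 0)
            return entries[i].fh;
    }
    return FH_INVALID;
}

/* Resolve a slash-separated path by issuing one LOOKUP per component. */
int resolve(int rootfh, const char *path)
{
    assert(path != NULL);
    int fh = rootfh;
    const char *p = path;
    while (fh != FH_INVALID) {
        p += strspn(p, "/");            /* skip separators */
        size_t len = strcspn(p, "/");   /* length of the next component */
        if (len == 0)
            break;                      /* no components left */
        fh = toy_lookup(fh, p, len);
        p += len;
    }
    return fh;
}
```

Here resolve(0, "/usr/bin/ls") issues three lookups and ends at handle 3, and a missing component stops the walk with FH_INVALID.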
A file handle is a unique identifier generated by the file server (filer). It persists through system crashes and is generally short (fixed number of bits). The natural implementation of file handles is to use the inode number. Because an NFS can have multiple file systems attached to it, the device/file system number is also needed to locate a file.
Unix has no efficient way to take a device/inode number pair and actually get the data located there at the system call level.
- Possible solution: Add a new system call
- openinode(dev, ino) -> fd
- Problem: security loophole. A user who knows (or can guess) an inode number could access the file directly, even without the required permissions
- It is therefore inadvisable to implement this system call
Advantages of having an inode/device number pair:
Client A opens a file for reading. If client B renames or moves the file, client A can still access it, because A's file handle refers to the inode and not to the filename. However, if client C removes the file from the server before client A executes a read, the read fails. This contrasts with standard UNIX semantics, where a removed file is not actually deleted as long as some process still has it open. The NFS server does not keep track of which files clients have open, so it cannot delay the removal. When this happens, the client's read fails with errno = ESTALE.
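The rename and remove cases can be traced in a toy model where a file handle is just a (device, inode) pair. Everything here is hypothetical: the names are invented, there is a single device 0, and rename and remove take the handle directly for brevity, whereas the real requests take (dirfh, name). The sketch also ignores what happens when a removed inode number is later reused.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define NFILES 4

struct fhandle { int dev; int ino; };   /* toy file handle */

struct toy_inode {
    int in_use;          /* cleared when the file is removed */
    char name[16];
    char data[16];
};

static struct toy_inode inodes[NFILES];  /* index = inode number */

/* Map a handle to its inode, or NULL if the handle no longer refers to a file. */
static struct toy_inode *find(struct fhandle fh)
{
    if (fh.dev != 0 || fh.ino < 0 || fh.ino >= NFILES || !inodes[fh.ino].in_use)
        return NULL;
    return &inodes[fh.ino];
}

int toy_create(const char *name, const char *data, struct fhandle *out)
{
    assert(out != NULL);
    for (int ino = 0; ino < NFILES; ino++) {
        if (!inodes[ino].in_use) {
            inodes[ino].in_use = 1;
            snprintf(inodes[ino].name, sizeof inodes[ino].name, "%s", name);
            snprintf(inodes[ino].data, sizeof inodes[ino].data, "%s", data);
            out->dev = 0;
            out->ino = ino;
            return 0;
        }
    }
    return -ENOSPC;
}

/* Renaming changes only the name; the handle stays valid. */
int toy_rename(struct fhandle fh, const char *new_name)
{
    struct toy_inode *f = find(fh);
    if (f == NULL)
        return -ESTALE;
    snprintf(f->name, sizeof f->name, "%s", new_name);
    return 0;
}

/* The server tracks no open files, so removal takes effect immediately. */
int toy_remove(struct fhandle fh)
{
    struct toy_inode *f = find(fh);
    if (f == NULL)
        return -ESTALE;
    f->in_use = 0;
    return 0;
}

/* Copy the file's contents into buf, or fail with -ESTALE for a dead handle. */
int toy_read(struct fhandle fh, char *buf, size_t size)
{
    struct toy_inode *f = find(fh);
    if (f == NULL)
        return -ESTALE;
    snprintf(buf, size, "%s", f->data);
    return 0;
}
```

A read through the handle keeps working after toy_rename, and returns -ESTALE once toy_remove has been called on the same handle.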
Motivation for having NFS
Using an NFS setup isolates clients from the system so that a single client failure does not affect the data.
- Strategy: Use a "stateless" server where server RAM is only used as a cache and does not affect the correctness of the data
- If the server crashes nobody loses any data
- However, since the server RAM is only used as a cache, it does not keep track of clients
NFS Performance
NFS clients execute asynchronous reads (read-ahead) to improve performance. They also cache recently accessed data on the client for future access, and they dally on writes (holding written data in a local buffer for a short time before sending it) to improve write performance.
As a result, read/write consistency is no longer guaranteed. If client A writes data and client B reads the same data a very short time afterward, client B could see the old value, either because A dallied on the write or because B had cached the old data.
Close-to-open consistency is still guaranteed, however. If client A writes and then closes the file, and client B subsequently opens and reads it, B always sees the newest version. This works because close requires all buffered writes to be sent to the filer, and open requires the file's contents to come from the filer rather than from the client's cache. This has a performance cost, since close must flush buffers and open cannot rely on cached data.
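The difference between the two guarantees can be seen in a small simulation. This is a sketch with invented names, where the "file" is a single integer on the server and each client keeps one cached copy. It models only the rules stated above: writes stay in the client's cache until close, and open always fetches from the server.

```c
#include <assert.h>
#include <stddef.h>

struct server { int value; };            /* the file's contents on the filer */

struct client {
    struct server *srv;
    int cached;                          /* client-side copy of the file */
    int dirty;                           /* 1 if a dallied write is pending */
};

/* open: discard any cached copy and fetch the contents from the server. */
void client_open(struct client *c, struct server *s)
{
    c->srv = s;
    c->cached = s->value;
    c->dirty = 0;
}

/* read: served from the client's cache, which may be stale. */
int client_read(const struct client *c)
{
    return c->cached;
}

/* write: dallied, so only the local copy changes for now. */
void client_write(struct client *c, int v)
{
    c->cached = v;
    c->dirty = 1;
}

/* close: flush any pending write to the server. */
void client_close(struct client *c)
{
    assert(c->srv != NULL);
    if (c->dirty) {
        c->srv->value = c->cached;
        c->dirty = 0;
    }
}
```

If A and B both open the file, A's write is invisible to B's reads, even after A closes, until B closes and reopens the file.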
When a client writes data to the server, the NFS server is not allowed to cache the data in RAM and respond with success before the data reaches disk. If the server crashed while the data was only in RAM, the data would be lost even though the client had been told the write succeeded. Some systems offer a "cheat-on-write" flag that allows this behavior anyway. A viable alternative is to buffer the data in non-volatile RAM.
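In POSIX terms, the "do not reply until the data is on disk" rule corresponds roughly to calling fsync() before reporting success. The helper below is a minimal sketch of that ordering using standard POSIX calls. stable_write is a hypothetical name, and a real server would do this inside its write handler rather than opening the file each time.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes to path and return 0 only after fsync() has succeeded,
 * i.e. only once it is safe to tell the client "success". Returns -1 on error. */
int stable_write(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0 && errno == EINTR)
            continue;                 /* interrupted: retry */
        if (n <= 0) {
            close(fd);
            return -1;                /* real error, or no progress */
        }
        p += n;
        left -= (size_t)n;
    }

    /* Without this step the data might still sit only in the kernel's RAM cache. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

The "cheat-on-write" behavior described above would amount to skipping the fsync() call and replying right after write() returns.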
Example NFS setup:
- From specbench.org SPECsfs2008_nfs.v3 (NFS benchmark)
- HP BL860c i2 4-node HA-NFS cluster
- Highly Available (HA) - No single point of failure, but performance may suffer if something fails
- Fault Tolerant (FT) - Same as HA but performance is not impacted by a single failure - more expensive than HA
- 4 servers
- 8 RAID controllers
- 4 FC switches
- 16 disk arrays (2 GiB cache each)
- 1472 drives (72 GB 15,000 RPM)
- ~2 ms response time for ~250,000 ops/second
- Server response time (2 ms) is approximately 10x shorter than that of a local disk (20 ms)
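As a rough sanity check on these figures, the arithmetic can be written out. The last line applies Little's law (average requests in flight = throughput x response time), which is an addition here and not part of the benchmark report. The per-drive figure assumes the load is spread evenly and ignores caching, so it is only an order-of-magnitude estimate.

```latex
\begin{align*}
\text{response-time ratio} &\approx \frac{20\ \text{ms}}{2\ \text{ms}} = 10 \\
\text{load per drive} &\approx \frac{250{,}000\ \text{ops/s}}{1472\ \text{drives}} \approx 170\ \text{ops/s per drive} \\
\text{requests in flight} &\approx 250{,}000\ \text{ops/s} \times 0.002\ \text{s} = 500
\end{align*}
```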
System Security
NFS setups can greatly improve performance, but security is an issue.
- Authentication - Traditionally solved by running NFS only in a trusted environment
- UID management - Suppose user "user5" has ID 1000 on client A but ID 1020 on client B - Traditionally solved by remapping IDs to match
- Current NFS setups use Kerberos authentication and will also remap UIDs across clients.
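UID remapping can be pictured as a lookup table keyed by (client, client-side UID). The sketch below reuses the user5 example above; the table layout, the choice of 1000 as the server-side ID, and the name remap_uid are assumptions made for illustration, not a description of any real NFS server's configuration.

```c
#include <assert.h>
#include <stddef.h>

enum { CLIENT_A, CLIENT_B };   /* hypothetical client identifiers */

struct uid_map {
    int client_id;
    unsigned client_uid;   /* UID as seen on that client */
    unsigned server_uid;   /* UID the server should use instead */
};

/* "user5" is 1000 on client A and 1020 on client B; both map to 1000 here. */
static const struct uid_map uid_table[] = {
    { CLIENT_A, 1000, 1000 },
    { CLIENT_B, 1020, 1000 },
};

/* Store the server-side UID in *server_uid and return 0,
 * or return -1 if the table has no mapping for this client and UID. */
int remap_uid(int client_id, unsigned client_uid, unsigned *server_uid)
{
    assert(server_uid != NULL);
    size_t n = sizeof uid_table / sizeof uid_table[0];
    for (size_t i = 0; i < n; i++) {
        if (uid_table[i].client_id == client_id &&
            uid_table[i].client_uid == client_uid) {
            *server_uid = uid_table[i].server_uid;
            return 0;
        }
    }
    return -1;
}
```

With this table, requests from user5 resolve to the same server-side ID whether they arrive from client A or client B, while an unmapped UID is rejected rather than trusted.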
Security needs to focus on the following items:
- Privacy - Unauthorized release of info
- Integrity - Tampering with data
- Service - Denial of service
- Must achieve general goals of:
- Disallow unauthorized access (negative goal, very difficult to test)
- Allow authorized access (positive goal, better tested in practice)
Threat Modeling - Likely Problems
- Network Attacks
- Denial of service attacks from outside network
- Exploit bugs to break in (buffer overrun)
- Packet sniffing for passwords
- Brute force password checking
- Virus (via email)
- Drive-by download
- Device Attacks
- USB devices carrying viruses
- CD-Rs and DVDs
- Insider Attacks
- Social Engineering