NFS servers have lots of storage either through hard disks or flash. The main motivation for using NFS servers is to have a lot of storage with the added ability of parallelism. The problems with NFS servers are the increased complexity as well as lower reliability as you add more drives.
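The reliability cost of adding drives can be made concrete with a little arithmetic. The sketch below assumes that drives fail independently and uses a hypothetical 2% chance of failure per drive over some period; neither assumption comes from these notes, and real drive failures are often correlated.

```python
def prob_any_drive_fails(n_drives, p_single):
    """Probability that at least one of n independent drives fails.

    Assumes independent failures with the same probability p_single
    per drive (an illustrative simplification).
    """
    return 1.0 - (1.0 - p_single) ** n_drives


# Hypothetical 2% failure probability per drive.
for n in (1, 10, 100):
    print(n, "drives:", round(prob_any_drive_fails(n, 0.02), 3))
```

Under these assumptions the chance that some drive in the array has failed grows quickly with the number of drives, which is the problem RAID tries to address.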

Reliability for NFS servers tends to follow the bathtub curve. The failure rate is high when the server is first set up, because some of the newly purchased drives were not tested enough and fail almost immediately. The failure rate then drops, because drives that survive this early period are unlikely to fail soon afterwards. As time goes on, the failure rate rises again as drives fail from ordinary wear and tear.

RAID was created to alleviate some of the problems associated with disk failures using techniques like concatenation, striping, and mirroring.

RAID0

RAID0 takes multiple disk drives and "concatenates" them so that they look like one large drive. For example, ten 1TB hard drives in a RAID0 configuration would essentially act as one 10TB hard drive. The motivation for RAID0 is lower cost, as buying ten 1TB hard drives is cheaper than buying one 10TB hard drive.

Another type of RAID0 uses "striping" to take advantage of parallelism. Assuming locality of reference, striping spreads consecutive data blocks across the disks. For example, if a file spanned five data blocks, striping would put the five data blocks on five different hard drives. The advantage of striping is parallelism: the blocks of a file can be read concurrently, which provides more throughput. The disadvantage of RAID0 is decreased reliability. If one disk fails, all of your files can become corrupt, as every file may lose a block of data.
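The difference between concatenation and striping comes down to how a logical block number is mapped to a disk. The following is a minimal sketch of the two mappings; the function names and the one-block stripe size are assumptions made for illustration, not details of any particular RAID implementation.

```python
def locate_concatenated(block, blocks_per_disk):
    """Concatenation: fill disk 0 completely, then disk 1, and so on.

    Returns (disk index, block offset within that disk).
    """
    return block // blocks_per_disk, block % blocks_per_disk


def locate_striped(block, n_disks):
    """Striping: consecutive logical blocks rotate across the disks.

    Returns (disk index, block offset within that disk).
    """
    return block % n_disks, block // n_disks


# A five-block file starting at logical block 0, on five disks.
file_blocks = range(5)
print([locate_concatenated(b, blocks_per_disk=1000)[0] for b in file_blocks])
print([locate_striped(b, n_disks=5)[0] for b in file_blocks])
```

With concatenation all five blocks land on the same disk, while striping places each block on a different disk so that they can be read in parallel.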

RAID0 Image

RAID1

RAID1 takes the drives and "mirrors" the data across all of the drives to provide redundancy. Each of the drives in the RAID1 configuration has a copy of the same data.

The main advantage of RAID1 is increased reliability. If one of the drives fails, a copy of the file still exists on one of the other drives. Another advantage of RAID1 is read throughput: with two mirrored drives, reads can run up to twice as fast, because the disk arms of the different drives can be in different locations and serve different requests. The obvious disadvantage of RAID1 is the storage cost. If you decide to have three drives in a RAID1 configuration, you incur three times the storage cost.

RAID1 Image

RAID4

RAID4 uses a combination of RAID0, for concatenation and striping, along with one disk as a dedicated parity disk. The dedicated parity disk provides fault tolerance by storing the XOR of the bits in the same block on the other drives. For example, if there were five disks in total, the data drives A, B, C, and D plus the parity drive, each block on the parity drive would contain A^B^C^D.

If the parity drive were to fail, the drive replacing it would simply be rebuilt by calculating the parity (A^B^C^D) for all of the blocks. If one of the data drives were to fail, the new drive could be rebuilt by calculating the XOR of the other drives, including the parity drive. For example, if data drive B were to fail, the value of each of its blocks would be A^C^D^P.
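The parity and rebuild steps can be sketched in a few lines. The two-byte blocks below are made-up sample data, and `xor_blocks` is a hypothetical helper name; a real array would do the same thing over much larger blocks.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# One block from each of the four data drives (sample values).
a, b, c, d = b"\x0f\x10", b"\xf0\x01", b"\x33\x44", b"\x55\x66"

# The parity drive stores A^B^C^D.
parity = xor_blocks([a, b, c, d])

# Drive B fails: rebuild its block as A^C^D^P.
rebuilt_b = xor_blocks([a, c, d, parity])
print(rebuilt_b == b)
```

The rebuild works because XOR is its own inverse: XORing the parity with every surviving data block cancels those blocks out and leaves the missing one.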

The advantage of RAID4 is a significantly lower cost compared to RAID1, while still providing redundancy. The disadvantage of RAID4 is the increased complexity. Writes become more expensive, as the parity has to be recomputed each time there is a write. In order to calculate the parity, the system has to do reads to get the bits of the corresponding blocks on all of the data drives. The parity drive in particular can become the I/O bottleneck, since every write has to update it.
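The paragraph above has the system read every data drive to recompute the parity. A commonly used shortcut, which these notes do not describe, is to read only the old data block and the old parity block and compute new parity = old parity ^ old data ^ new data. The sketch below checks that the two approaches agree; the helper names and sample blocks are assumptions for illustration.

```python
def xor_bytes(x, y):
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(p ^ q for p, q in zip(x, y))


def parity_full(data_blocks):
    """Recompute parity from scratch by reading every data block."""
    parity = bytes(len(data_blocks[0]))  # all-zero block
    for block in data_blocks:
        parity = xor_bytes(parity, block)
    return parity


def parity_after_write(old_parity, old_data, new_data):
    """Update parity using only the old parity and the block being replaced."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


data = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b", b"\xff\x00"]
old_parity = parity_full(data)

# Overwrite the block on the second data drive.
new_block = b"\x77\x88"
updated = parity_after_write(old_parity, data[1], new_block)
data[1] = new_block

print(updated == parity_full(data))
```

Either way, every write still touches the parity drive, which is why that drive tends to be the bottleneck in RAID4.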

RAID4 Image

RAID5

RAID5 is essentially RAID4 with the parity striped as well. The parity data is distributed across all of the drives so that all of the disks are "hot" and no single parity drive becomes the bottleneck. The parity block for each stripe is staggered across the disks. The main advantage of RAID4 over RAID5 is that it is easier to add a disk to RAID4.
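One way to picture the staggered parity is to print which disk holds the parity block for each stripe. The rotation rule used here (parity starts on the last disk and moves one disk to the left per stripe) is an assumption chosen for illustration; actual RAID5 implementations use several different layouts.

```python
def raid5_layout(n_disks, n_stripes):
    """Return one row per stripe: 'P' marks parity, 'D<k>' the k-th data block."""
    rows = []
    next_data = 0
    for stripe in range(n_stripes):
        # Assumed rotation: the parity position shifts by one disk per stripe.
        parity_disk = (n_disks - 1 - stripe) % n_disks
        row = []
        for disk in range(n_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append("D%d" % next_data)
                next_data += 1
        rows.append(row)
    return rows


# Four disks, four stripes: each disk ends up holding parity for one stripe.
for row in raid5_layout(4, 4):
    print(" ".join(cell.rjust(3) for cell in row))
```

Because the parity blocks are spread evenly, parity updates are shared among all of the disks instead of landing on one dedicated drive.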

RAID5 Image

RAID Combinations

A system can be a combination of different RAID formats. A rule of thumb for ordering RAID combinations is to put concatenation at the top level, because that makes it easier to expand the system when it runs out of storage space. In other words, striped mirroring is better than mirrored striping.

Security is the other major problem with NFS servers.

Problems

Here are some of the problems that can arise when working with NFS servers:

  • Clients spoofing themselves as other users
    • e.g. Client pretends to be Professor Eggert with user id 1010
  • Server spoofing (clients fooled)
  • DoS attacks on server
  • Man-in-the-middle attacks

Many real-world security problems take these forms. For example, Iran's Supreme Council of Virtual Space, which controls "all" information within Iran, allegedly performed one such attack. The council is suspected of involvement in a DoS attack on the BBC that targeted the BBC Persian Service. Iran has not taken responsibility for the attack, but it is the suspected party because the attack was closely related to an earlier effort to jam BBC satellite feeds into the country.

Because there are so many different forms of attack, a good model and checklist are needed to ease the design of secure systems.

Three Kinds of Attacks

There are three main types of attacks:

1. Against Privacy

Goal of Attack: Access unauthorized info

2. Against Integrity

Goal of Attack: Tamper with victim's data

3. Against Service

Goal of Attack: Denial of Service

When designing a system, these three attacks need to be accounted for. A common order of importance is: Integrity > Privacy > Service.

Defence Security Goals

To address these kinds of attacks, there are three goals:

  1. Deny unauthorized access
    • Directly addresses integrity and privacy
    • Hard to test, as users who have unauthorized access aren't likely to file a bug report
  2. Allow authorized access
    • Directly addresses integrity and privacy
    • Easy to test, as users who attempt to access their own files will immediately cry out and give notice that they don't have access
  3. Be able to handle LOTS of bogus requests
    • Directly addresses service
    • With a large enough budget, it is easy to test by automating fake requests from servers

The next step is developing a good model for what the bad guys are gonna do.

Threat Modeling and Classification

There are a number of different ways that someone may attack the system:

  • Insiders
  • Social Engineering
    • A notable social engineer who had minimal computer knowledge is Kevin Mitnick
  • Network Attacks
    • Buffer overruns
    • SQL injection attacks
    • Drive-by downloads
  • Physical device attacks
    • e.g. Virus on USB flash drives

Kerckhoffs's Design Principle

An important principle in the world of cryptography is Kerckhoffs's principle. The goal of Kerckhoffs's principle is to minimize the amount of data that needs to be kept secret. The assumption is that the bad guys will eventually learn the security design of the system and the only thing keeping the bad guys out is the key.

General functions needed for almost any security method

Focused on unauthorized access:

  • Authentication
    • e.g. password
  • Integrity
    • e.g. checksum
  • Authorization
    • e.g. access control list (ACL)
  • Auditing
    • e.g. logs
  • Efficiency
  • Correctness

The goal of authentication is to prevent masquerading.

There are generally two forms of authentication based on where the attacker is in the system. The first is external where the attacker is outside of the system and attempting to gain access into the system. The second is internal where the attacker is already inside the system and attempting to access components of the system.

External

To prevent attackers from accessing the system externally, there are a variety of methods of authentication. The main goal is to make sure that the user attempting to access the system is actually the person that they claim to be.

In the physical world, keys are used to keep attackers out of physical locations. In the same vein, the most popular method in the digital world is the username and password combination. For a while, password authentication proved to be effective. Today, however, passwords are proving insufficient for a variety of reasons, and other forms of authentication are being explored as viable options. Biometric methods such as fingerprint authentication are one candidate, but fingerprints are proving to be a poor authenticator because users leave them everywhere in the physical world. Currently, two-factor authentication is becoming increasingly popular as a supplement to passwords. Time-based token generation is being widely adopted, as phone applications like Google Authenticator make two-factor authentication easy to use and implement.
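Time-based token generation can be sketched with the standard library alone. The code below follows the general scheme standardized as TOTP in RFC 6238 (HMAC-SHA1 over a 30-second time counter, truncated to a short decimal code). It is a minimal illustration under those assumptions, not the implementation used by any particular phone application, and the shared secret is a made-up value.

```python
import hashlib
import hmac
import struct
import time


def totp(secret, now=None, step=30, digits=6):
    """Return a time-based one-time code as a zero-padded decimal string."""
    if now is None:
        now = time.time()
    # Number of whole time steps since the Unix epoch.
    counter = int(now // step)
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte window.
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# Hypothetical secret shared between the server and the user's phone.
shared_secret = b"12345678901234567890"
print(totp(shared_secret))
```

The server and the phone run the same computation on the same secret, so the server can check the submitted code without the secret ever crossing the network at login time.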

Internal

The goal of internal authentication is to prevent attackers who are already in the system from accessing certain restricted components of the system. One example would be to prevent the attacker from accessing /etc/passwd.

The basic method of restricting users is to use the user's userid to decide which components the user may access. By implementing permissions for the internal components, the system keeps the user from accessing the restricted ones. The Unix file system uses a basic form of this through file permissions. Another layer of internal authentication is virtualization. By giving users virtualized, stripped-down instances of the system, the system ensures that a user can access only the components in that virtualized instance. Another method is cryptographic authentication. By scrambling the data cryptographically, the data can hide in plain sight. The user would need the cryptographic key in order to unscramble the data. In Unix, the /etc/passwd file uses cryptographic authentication to hide the actual passwords. For example, the entry for user eggert may look like this: "eggert: xhQ01FgP10gNd5cX"
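The idea of storing a scrambled password instead of the password itself can be sketched as follows. This is not the algorithm Unix uses for its password file; PBKDF2 from Python's standard library is swapped in as a stand-in, and the function names, salt size, and iteration count are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Return (salt, digest); only these are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, iterations=100_000):
    """Re-derive the digest from the attempt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Someone who reads the stored entry sees only the salt and the digest, so the file can be readable without directly revealing the passwords.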