LECTURE 17: Introduction to security/authentication
What can go wrong with Network File System (NFS) and Common Internet File System (CIFS)

I. Media Faults (even in non-networked systems)
II. Security Holes

Media Faults
Lampson-Sturgis: An algorithm that guarantees reliable storage of data in a distributed system, even when different portions of the database, stored on separate machines, are updated as part of a single transaction. The algorithm is implemented by a hierarchy of rather simple abstractions, and it works properly regardless of crashes of the client or servers.
Log Structured File System






RAID solution? (Redundant Array of Independent Disks; the I originally stood for Inexpensive)
-Created at Berkeley to address the high cost of very large drives
-Cost per TB falls to a sweet spot as capacity grows, then rises again for the largest drives
-Ideal scenario would be to buy only the drives with "sweet spot" capacity

Example

Virtual: TTTTTTTTTT (represents a 10 TB HDD)
Physical: T T T T T T T T T T (represents 10 separate 1 TB HDDs)

This example shows what we are trying to accomplish:
We want a 10 TB virtual drive composed of ten 1 TB drives, because a 1 TB drive is more cost effective per TB of space
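
A minimal sketch of how a virtual drive like this could be laid over the physical ones, assuming plain concatenation. The helper name locate and the decimal TB constant are assumptions made for illustration, not something from the lecture:

```python
TB = 10**12  # bytes per terabyte (decimal, as drive vendors count)

def locate(virtual_offset, drive_size=TB, num_drives=10):
    """Map a byte offset on the virtual drive to (drive index, offset on that drive).

    Assumes plain concatenation: drive 0 holds the first drive_size bytes,
    drive 1 the next drive_size bytes, and so on.
    """
    if not 0 <= virtual_offset < drive_size * num_drives:
        raise ValueError("offset is outside the virtual drive")
    return divmod(virtual_offset, drive_size)

print(locate(0))            # first byte lives on drive 0
print(locate(3 * TB + 42))  # byte 42 of drive 3
```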

Drive Failure Rate

-Assume each drive in the above example has a failure rate of 2% per year
-Annualized failure rate (AFR) gives the estimated probability that a device or component will fail during a full year of use. It is a relation between the mean time between failures (MTBF) and the number of hours the devices are run per year.

Calculating Failure

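For the case where each drive's failure rate is 2% per year, the probability that at least one of the 10 drives fails within a year is 1 - 0.98^10, or about 18%
Assumption: Drives fail independently and no disks are replaced
Caveat: This is not true if you replace disks as you go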
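
The calculation above can be checked with a few lines of Python. This is only a sketch under the same assumptions (independent failures, no replacement); the function name p_any_failure is made up for illustration:

```python
def p_any_failure(per_drive_rate, num_drives):
    """Probability that at least one of num_drives fails within the year.

    Assumes independent failures and no replacement of failed drives.
    """
    return 1 - (1 - per_drive_rate) ** num_drives

p = p_any_failure(0.02, 10)
print(f"{p:.3f}")  # about 0.183
```

So splitting the data over 10 drives with no redundancy makes some failure in a given year roughly nine times as likely as with one drive.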

Key Terms

AFR: Annualized Failure Rate described above

MTTF: Mean Time to Failure: closely related to mean time between failures (MTBF). The difference is that MTBF is
used for products that can be repaired and returned to use, while MTTF is used for non-repairable products

PDF: Probability Distribution/Density Function: a function that describes the relative likelihood of a random variable taking on a given value. The
probability of the random variable falling within a particular range of values is given by the integral of the variable’s density over that range

The actual PDF for the above example resembles a bathtub curve: it starts high, dips low, and returns to a high point

RAID Types

RAID 0: Concatenation (or striping), no redundancy - low reliability
RAID 1: Mirroring (N=2 common) - very expensive to go higher
Uses more than one drive to represent the same data
eg: for a 1 TB virtual drive we use two 1 TB physical drives
+ read seeks are fast (any copy can serve a read)
- write seeks can be slower (every copy must be written)
.
.
.
RAID 4: Combines concatenation with redundancy, using one parity drive instead of full mirrors; explained below

RAID 4

May choose any number of drives you wish (N = ?)

Virtual Drive Size = (N - 1) x size of one physical drive

[][][][][][] (represents 6 drives, with N = 6 as our choice)

[A][B][C][D][E][PARITY] : The 6th drive is the parity drive

Parity Drive
Parity = A^B^C^D^E: the exclusive or of all the data blocks
If any one block is lost it can be recovered because of the parity block
For blocks A-E: take the exclusive or of the remaining data blocks and the parity block to get the original block back
For the Parity block: take the exclusive or of blocks A-E, as was done to compute the parity block originally
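
The recovery rule above can be sketched in Python. The block contents and the helper name xor_blocks are made up for illustration; real blocks would be full disk sectors:

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"EEEE"]  # made-up block contents
parity = xor_blocks(data)

# Simulate losing block C (index 2) and rebuild it from the survivors plus parity.
survivors = data[:2] + data[3:]
rebuilt = xor_blocks(survivors + [parity])
print(rebuilt == data[2])  # True
```

This works because XOR is its own inverse: every surviving block cancels out of the parity, leaving only the missing one.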

Cost of RAID 4: (N/(N-1)) x Original Cost

Failure Rate

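Rate of data loss is proportional to N x (FR) x (FR x Window of Repair), where FR is the failure rate of one drive
Data is lost only if a second drive fails during the window of repair of the first
Window of repair: hours? days? eg: rebuilding means transferring TBs of info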
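
A rough numeric sketch of the proportionality above, not a formula from the lecture. It assumes independent failures, a constant failure rate, and small probabilities, and it refines the second factor by counting the N - 1 surviving drives. The function name and the sample repair windows are made up for illustration:

```python
def annual_data_loss_estimate(num_drives, annual_failure_rate, repair_window_days):
    """Rough chance per year of losing data in a single-parity array.

    Assumes independent failures, a constant failure rate, and small
    probabilities so that products approximate the true values.
    """
    # Expected number of first failures per year across the array.
    first_failures = num_drives * annual_failure_rate
    # Chance that one of the surviving drives fails before the repair finishes.
    second_failure = (num_drives - 1) * annual_failure_rate * repair_window_days / 365
    return first_failures * second_failure

for days in (0.25, 1, 7):
    print(days, annual_data_loss_estimate(6, 0.02, days))
```

The estimate grows linearly with the window of repair, which is why a fast rebuild matters.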

Trip to Mars

The CDF of failure for RAID 4 is worse than for a single drive when failed drives can't be replaced, as on a Mars mission

Parity Drive (is hot): bottleneck for writes because every write must also update the parity drive,
so it gets very busy
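
A small sketch of why the parity drive sees every write. It uses the standard XOR identity new parity = old parity ^ old data ^ new data; the helper names and block contents are made up for illustration:

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def updated_parity(old_parity, old_data, new_data):
    """Parity after overwriting one data block, without reading the other blocks."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)

blocks = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]  # made-up data blocks
parity = b"\x00\x00"
for block in blocks:
    parity = xor_bytes(parity, block)

new_block = b"\xaa\xbb"
parity = updated_parity(parity, blocks[1], new_block)  # touches the parity drive
blocks[1] = new_block                                  # touches one data drive
```

Each write lands on a different data drive, but every one of them also lands on the single parity drive.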

RAID 5

A RAID 5 comprises block-level striping with distributed parity. Unlike in RAID 4, parity information is
distributed among the drives. It requires that all drives but one be present to operate. Upon failure of a
single drive, subsequent reads can be calculated from the distributed parity such that no data is lost.
RAID 5 requires at least three disks.
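
One possible way to distribute the parity is to rotate it by one drive per stripe. The sketch below assumes that rotation; actual RAID 5 implementations use several different layouts, and the function names are made up for illustration:

```python
def parity_drive(stripe, num_drives):
    """Drive holding parity for a given stripe, rotating by one drive per stripe."""
    return (num_drives - 1 - stripe) % num_drives

def layout(num_stripes, num_drives):
    """Rows of 'D' (data) and 'P' (parity), one row per stripe."""
    rows = []
    for stripe in range(num_stripes):
        p = parity_drive(stripe, num_drives)
        rows.append(["P" if d == p else "D" for d in range(num_drives)])
    return rows

for row in layout(4, 4):
    print(" ".join(row))
```

Because the P moves to a different drive on each stripe, parity updates are spread over all drives instead of hitting one hot drive.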

What can go wrong with NFS Part II

NFS Client ---> read/write (uid 1000: eggert, gid 1097: faculty) ---> network ---> NFS Server

Break In
Suppose we come up with a rogue NFS Client that claims user 1000 and group 1097
If its requests are sent to the NFS Server above, they would be recognized as coming from eggert with group faculty, and the server would give access
to eggert's files under the faculty group
How to Fix:
1) Trusted Clients Only (popular, simple, fast) used in machine rooms
2) Encryption of Data/Authentication of Users (slower, more complicated)
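
A minimal sketch of fix 2, assuming each user shares a secret key with the server and attaches an HMAC tag to every request. This is not how NFS itself authenticates; the key, the request string, and the function names are hypothetical:

```python
import hashlib
import hmac

def sign(key, uid, gid, request):
    """Tag binding the claimed identity to the request under a shared secret key."""
    message = f"{uid}:{gid}:{request}".encode()
    return hmac.new(key, message, hashlib.sha256).digest()

def server_accepts(key, uid, gid, request, tag):
    """Accept only if the tag matches; compare_digest avoids timing leaks."""
    return hmac.compare_digest(sign(key, uid, gid, request), tag)

eggert_key = b"secret shared by eggert and the server"  # hypothetical key
request = "read /home/eggert/grades"                     # hypothetical request

good_tag = sign(eggert_key, 1000, 1097, request)
forged_tag = sign(b"attacker guess", 1000, 1097, request)

print(server_accepts(eggert_key, 1000, 1097, request, good_tag))    # True
print(server_accepts(eggert_key, 1000, 1097, request, forged_tag))  # False
```

A rogue client can still claim uid 1000, but without the key it cannot produce a valid tag. The sketch ignores replayed requests, which a real protocol would also have to handle.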

Security

Real World Security defends us against:
Force
Computer Security deals with this as well
Fraud
Internet makes fraud a larger concern

Main Forms of Attack:
1) Against privacy (unauthorized release of data)
>Allow authorized access (positive goal)
>Prevent unauthorized access (negative goal, harder)
2) Against integrity (tampering with others' data)
3) Against service (denial of service)

Example: Supreme Council of Virtual Space
-Reports directly to Supreme Leader of Iran
-Denial of Service Attack on BBC
>took out website and satellite feeds
-Blocked more than 1/2 of the world's websites (in Iran)

Threat Modelling and Classification

1) Insiders (SEASNET admin?)
2) Social Engineering (outsider pretending to be insider)
-Kevin Mitnick posed as a telephone operator and put in tapping devices
3) Network Attacks
-Distributed Denial of Service (DDoS)
-Drive by Download (DBD)
-Virus
-Most often attack image libraries, because these are low security and written by amateurs
4) Device Attacks
-USB virus

General Design Problems

Kerckhoffs's Design Principle: Minimize what needs to be secret
Counter-Example:
Scrambling used for DVDs
Where's the secret?
Fixed set of keys that had to be kept secret - eventually they were leaked

General functions needed for computer based security system

1) Authentication eg Password
2) Integrity eg Checksum
3) Authorization eg ACL (access control list)
4) Auditing eg logs (against insiders, imposters)

Want 1-4 to be orthogonal

5) Correctness
6) Efficiency
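
A toy sketch of keeping authentication, authorization, and auditing orthogonal: each is its own function and none knows how the others work. Integrity checking is left out for brevity. All user names, passwords, paths, and function names are made up for illustration:

```python
import hashlib
import hmac
import os

salt = os.urandom(16)

def hash_password(password, salt):
    """Slow salted hash, so the stored value is not the password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

# All names, passwords, and paths below are made up for illustration.
password_db = {"eggert": hash_password("correct horse", salt)}
acl = {"/grades": {"eggert": {"read", "write"}}}
audit_log = []

def authenticate(user, password):
    """Authentication: is this really the user they claim to be?"""
    stored = password_db.get(user)
    return stored is not None and hmac.compare_digest(stored, hash_password(password, salt))

def authorize(user, path, action):
    """Authorization: does the ACL allow this user to do this action?"""
    return action in acl.get(path, {}).get(user, set())

def access(user, password, path, action):
    allowed = authenticate(user, password) and authorize(user, path, action)
    audit_log.append((user, path, action, allowed))  # auditing: record every attempt
    return allowed
```

Swapping the password check for a smart card, say, would change only authenticate; the ACL and the audit log would stay as they are.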

Authentication

How to Authenticate?

Example: the president of Warner Bros wants to drive up and have his car recognized so he doesn't need a badge
-Authentication is based on:
Who the principal is (retinal scan, thumbprint scan)
Something the principal has (smart card, say)
Something the principal knows (password)
-2 Factor Authentication:
IANA - Changing Internet Time Stamps