CS 111 Lecture 10 Scribe Notes

Scribes: Ken Ohhashi, Colin Fong, Stanway Liau

May 5, 2014


Conditions for Deadlock


  1. Mutual Exclusion: At least one resource must be held in a non-sharable way; only one process can use it at any given time.
  2. Hold and Wait: A process holds at least one resource while waiting to acquire another.
  3. No Preemption: The operating system cannot take a resource away from a process once it has been granted. The process must release the resource itself.
  4. Circular Waiting: A process is waiting for a resource held by another process, which is in turn waiting for the first process to release its own resource. (A minimal example that exhibits all four conditions is sketched below.)
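
As an illustration (not from lecture), here is a minimal C sketch of how the four conditions combine: pthread mutexes give mutual exclusion and are never preempted, each thread holds one lock while waiting for the other, and the two threads wait on each other in a circle.

	#include <pthread.h>

	static pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
	static pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

	static void *thread1(void *arg)
	{
		pthread_mutex_lock(&a);      /* holds a (mutual exclusion, no preemption) */
		pthread_mutex_lock(&b);      /* ... and waits for b (hold and wait)       */
		pthread_mutex_unlock(&b);
		pthread_mutex_unlock(&a);
		return arg;
	}

	static void *thread2(void *arg)
	{
		pthread_mutex_lock(&b);      /* holds b ...                               */
		pthread_mutex_lock(&a);      /* ... and waits for a (circular waiting)    */
		pthread_mutex_unlock(&a);
		pthread_mutex_unlock(&b);
		return arg;
	}

	int main(void)
	{
		pthread_t t1, t2;
		pthread_create(&t1, NULL, thread1, NULL);
		pthread_create(&t2, NULL, thread2, NULL);
		pthread_join(t1, NULL);      /* with unlucky timing, this never returns   */
		pthread_join(t2, NULL);
		return 0;
	}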


Resource Allocation Graph/ Wait-For Graph


[Figure: a resource allocation graph containing two deadlock cycles.]
[Figure: a resource allocation graph that contains a cycle but no deadlock.]

Deadlock Prevention


  1. Lock Ordering: all threads agree to acquire locks in the same global order, so a circular wait can never form.
  2. If you find a lock busy, give up all the locks you hold and start over (this breaks the hold-and-wait condition). Both approaches are sketched below.
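
A rough sketch of both fixes, reusing the two mutexes from the deadlock example above (illustrative code, not from lecture):

	#include <pthread.h>
	#include <sched.h>

	pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;   /* the two locks from the example above */
	pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

	/* Fix 1: lock ordering -- every thread acquires a before b, so no cycle can form. */
	void ordered_locking(void)
	{
		pthread_mutex_lock(&a);
		pthread_mutex_lock(&b);
		/* ... critical section ... */
		pthread_mutex_unlock(&b);
		pthread_mutex_unlock(&a);
	}

	/* Fix 2: if the second lock is busy, release everything and start over
	   (breaks hold-and-wait). */
	void give_up_and_retry(void)
	{
		for (;;) {
			pthread_mutex_lock(&a);
			if (pthread_mutex_trylock(&b) == 0)
				break;                   /* got both locks */
			pthread_mutex_unlock(&a);    /* b was busy: give up a and retry */
			sched_yield();
		}
		/* ... critical section ... */
		pthread_mutex_unlock(&b);
		pthread_mutex_unlock(&a);
	}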

Quasideadlock


A high-priority process ends up waiting on a lower-priority process that holds a resource it needs; the system is not technically deadlocked, but the high-priority process may never make progress.

EXAMPLE

There are 3 levels of priority for threads: Tlow (low), Tmed (medium), and Thigh (high).


		    *Runs from left to right and downward

		    |Tlow (runnable)  | Tmed (waiting)   | Thigh (waiting)
		  1.|acquire(&r)----->|-context switch-->| (runnable)
		  2.|                 |                  | acquire(&r)
		  3.|                 |   (runnable)     | (waiting)
		
  1. The low-priority thread acquires the resource; then a context switch runs the high-priority thread.
  2. The high-priority thread attempts to acquire the resource that the low-priority thread just grabbed, and blocks.
  3. The high-priority thread is now waiting, so the scheduler runs the next highest-priority runnable thread. Medium-priority threads can run indefinitely and starve the high-priority thread, because the low-priority thread never gets to run and release the resource.
Potential Fixes

The standard fix is priority inheritance: while a low-priority thread holds a resource that a high-priority thread is waiting for, it temporarily runs at the high priority, so it can finish its critical section and release the resource. A sketch using POSIX mutexes follows.
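
A minimal sketch of priority inheritance using the POSIX mutex protocol attribute (this API was not shown in lecture; thread creation and priority setup are omitted):

	#include <pthread.h>

	pthread_mutex_t r;

	void init_resource_lock(void)
	{
		pthread_mutexattr_t attr;
		pthread_mutexattr_init(&attr);
		/* While a low-priority thread holds r, it temporarily inherits the
		   priority of the highest-priority thread blocked on r. */
		pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
		pthread_mutex_init(&r, &attr);
	}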

Receive Livelock



		            BUFFER

		       INPUT            OUTPUT
		requests----->[=======================]---->CPU

		       100 Gb/s           10 Gb/s
		
  1. An interrupt service routine (ISR) grabs incoming data and puts it into the buffer; the ISR runs at a higher priority than the CPU's normal work routine.
  2. If requests arrive fast enough, the CPU spends all of its time servicing these higher-priority interrupts instead of doing the actual work of processing the buffered requests, so no useful work completes (see the sketch below).
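
A rough sketch of the structure that causes the problem (the buffer layout and function names are made up for illustration): the ISR always preempts the work loop, so if requests arrive faster than compute() can drain them, the work loop never makes progress.

	#include <string.h>

	#define NSLOTS   1024
	#define REQ_SIZE 40

	static char buffer[NSLOTS][REQ_SIZE];    /* ring buffer between ISR and work loop */
	static volatile int head, tail;

	static void compute(char *req) { (void)req; /* placeholder for the real work */ }

	void isr(char *req)                      /* runs at interrupt (higher) priority */
	{
		int next = (head + 1) % NSLOTS;
		if (next != tail) {                  /* drop the request if the buffer is full */
			memcpy(buffer[head], req, REQ_SIZE);
			head = next;
		}
	}

	void work_loop(void)                     /* runs at normal priority; starves under load */
	{
		while (tail != head) {
			compute(buffer[tail]);
			tail = (tail + 1) % NSLOTS;
		}
	}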

Data Races Are Evil


The basic problem we have is that there is more than one thread running at one time. What if we use just one thread? This will solve our problem of data races, but we still want to be able to wait for events. We can use event-driven programming to do this. In this type of programming, there is usually a main loop that listens for events. Below is a simple illustration of the concept:


		  main program {
		    while (live) {
		      wait for some event e; //only thing that waits for an event
		      e->handler(e); //handlers never wait and must complete quickly
		    }
		  }
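
As a more concrete illustration (not from lecture), here is the same pattern as a single-threaded C program that waits for input on standard input with poll(2); the handler is made up and simply echoes what it reads, but it never blocks.

	#include <poll.h>
	#include <stdio.h>
	#include <unistd.h>

	/* Handler: does a quick, bounded amount of work and never waits. */
	static int handle_input(int fd)
	{
		char buf[512];
		ssize_t n = read(fd, buf, sizeof buf);
		if (n > 0)
			fwrite(buf, 1, n, stdout);
		return n > 0;                        /* 0 means EOF or error: stop looping */
	}

	int main(void)
	{
		struct pollfd pfd = { .fd = 0, .events = POLLIN };  /* fd 0 is stdin */
		int live = 1;
		while (live) {
			/* The only place the program ever waits is here, in poll(). */
			if (poll(&pfd, 1, -1) > 0 && (pfd.revents & (POLLIN | POLLHUP)))
				live = handle_input(pfd.fd);
		}
		return 0;
	}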
		 

Upsides: only one thread runs, so there are no data races and no locks are needed, and the control flow is easier to reason about.

Downsides: handlers must never wait and must finish quickly, so long computations have to be broken into pieces; and a single event loop cannot, by itself, take advantage of multiple CPUs.


Although multithreading is out of the question, we can still scale by running multiple processes or multiple systems instead. Some supporters of event-driven programming claim that the scaling provided by multiple processes makes the scaling lost by giving up threads negligible, thus making that downside irrelevant.


Hardware Lock Elision


In multithreaded programming, too much CPU time is spent waiting for locks to be released. To reduce this waiting, we can get hardware support through Hardware Lock Elision (HLE).

Below are example Haswell instructions for HLE. Haswell is a codename for a processor microarchitecture, although this information is not critical to understanding the concept.


		lock:
		  movl $1,%eax                    # value we will try to swap into the mutex
		try:
		  xacquire lock xchgl %eax, mutex # atomically swap %eax with the mutex, eliding the lock
		  cmp $0, %eax                    # old value 0 means the mutex was free
		  jnz try                         # someone else held it: retry
		  ret

		unlock:
		  xrelease movl $0,mutex          # store 0; commit (or abort) the elided region
		  ret
		

HLE adds two instruction prefixes, xacquire and xrelease, to the architecture. xacquire lets the thread move ahead "optimistically", executing subsequent instructions as if it had acquired the lock. The thread modifies its own cache but doesn't flush the changes out to memory. Once it reaches the xrelease, the hardware checks the value that was traded into the lock: if the thread didn't actually get the lock, it discards all the changes in its cache and acts as if none of those instructions happened; if it did get the lock, the cached changes are flushed to RAM, since the instructions are valid.
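
For reference, the same elided spin lock can be written in C with GCC's HLE-aware atomic builtins; this sketch was not shown in lecture and assumes GCC 4.8 or later targeting a Haswell-class CPU (compiled with -mhle or -march=haswell).

	/* Spin lock with hardware lock elision via GCC builtins. */
	static int lockvar;

	static void hle_lock(void)
	{
		/* Start an elided critical section; a nonzero return means the lock
		   was already held, so spin and retry. */
		while (__atomic_exchange_n(&lockvar, 1,
		                           __ATOMIC_ACQUIRE | __ATOMIC_HLE_ACQUIRE))
			;
	}

	static void hle_unlock(void)
	{
		/* End the elided region; commits the speculative changes or aborts them. */
		__atomic_store_n(&lockvar, 0, __ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE);
	}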


File System Performance


Say we have a General Parallel File System (GPFS) with 120PB (~200,000 hard drives, each 600GB). Here are some performance ideas for such a file system:

1) Striping

Usually, processing long files requires enormous buffers. By distributing segments of the file across multiple devices and accessing the segments concurrently, we can increase total throughput and process the file more quickly.

[Figure: illustration of how striping works]
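
A sketch of the bookkeeping involved (the disk count and stripe size are made-up parameters, not GPFS's actual layout): round-robin striping maps each byte offset in the file to a drive and an offset on that drive.

	#define NDISKS      8          /* number of drives the file is striped across       */
	#define STRIPE_SIZE 65536      /* bytes of the file placed on one drive at a time   */

	struct stripe_loc {
		int disk;                  /* which drive holds this byte                       */
		long long disk_offset;     /* where on that drive the byte lives                */
	};

	struct stripe_loc locate(long long file_offset)
	{
		long long stripe = file_offset / STRIPE_SIZE;    /* which stripe unit           */
		struct stripe_loc loc;
		loc.disk = stripe % NDISKS;                      /* stripes rotate round-robin  */
		loc.disk_offset = (stripe / NDISKS) * STRIPE_SIZE + file_offset % STRIPE_SIZE;
		return loc;
	}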

2) Distributed Metadata

Metadata contains information about a file such as its location, owner, timestamps, etc. Each node in the file system can act as a data node that holds file data, or a meta node that holds metadata. There are no master/slave nodes; the system is just a cloud of data nodes and meta nodes.

3) Efficient directory indexing

Suppose a directory in the file system has 10,000,000 files. If the directory is stored as a simple linear list of entries, every lookup requires scanning on the order of all 10,000,000 entries. A better internal data structure is the B+ tree, an n-ary tree with an often large number of children per node. Because the large fan-out limits the depth of the tree, entries can be retrieved efficiently with minimal I/O operations; for example, with a branching factor of 256, a tree only three levels deep can index 256^3 (over 16,000,000) entries. A rough sketch of a node layout is shown below.
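
A very rough sketch of what one directory node might look like (a hypothetical layout and fan-out, not GPFS's actual on-disk format):

	#include <sys/types.h>

	#define FANOUT 256                           /* hypothetical branching factor         */

	struct dir_node {
		int is_leaf;
		int nkeys;                               /* number of names stored in this node   */
		char *name[FANOUT];                      /* file names, kept in sorted order      */
		union {
			struct dir_node *child[FANOUT + 1];  /* interior node: child subtrees         */
			ino_t inode[FANOUT];                 /* leaf node: inode number for each name */
		} ptr;
		struct dir_node *next_leaf;              /* leaves chained for sequential readdir */
	};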

4) Network Partition Awareness

When a network failure partitions the file system's nodes, the larger partition stays live (files can be read, written, removed, and added), and the smaller partition is frozen, or made read-only.

5) File system stays live during maintenance

Sometimes one can't connect to a server, like the SEASNet Linux servers or MyUCLA, because maintenance is being done on it. It would be more convenient for users if the server could stay live during maintenance, but this is hard to implement, since users' actions could interfere with some part of the maintenance in a way that is dangerous for the server/file system.

These are all features an ideal file system would provide, but alas, they are difficult to implement.


[Some] Performance Metrics for File Systems


We will analyze these three performance metrics for file systems:

  1. Latency: delay from request to response
  2. Throughput: rate of request completions
  3. Utilization: fraction of capacity doing useful work

Assume a system with the following characteristics:

[Table: characteristics of some secondary storage devices]


Busy Wait + Polling


	/* Busy wait + polling code */
	for (;;) {
		char buf[40];
		read-40-bytes-from-SSD-into-buf;	/* busy-waits until the SSD is ready */

		compute(buf);
	}
		
  1. Latency
  2. Throughput
  3. Utilization
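
As a sketch of how these metrics are computed (the lecture's device-timing table is not reproduced in these notes, so the numbers below are purely hypothetical: 5 μs to send a command, 50 μs of device latency, 1 μs per byte of programmed I/O, and 5 μs of computation per 40-byte request):

	latency     ≈ 5 μs (send) + 50 μs (device) + 40 μs (read 40 bytes) + 5 μs (compute) = 100 μs
	throughput  ≈ 1 request / 100 μs = 10,000 requests per second
	utilization ≈ 5 μs of useful computation / 100 μs = 5%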


Batching


	/* Batching code */
	for (;;) {
		char buf[840];	//larger batch: 21 requests of 40 bytes each
		read-840-bytes-from-SSD-into-buf;
		for (int i = 0; i < 21; i++)
			compute(buf + i*40);
	}
		
  1. Latency
  2. Throughput
  3. Utilization
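
Continuing with the same hypothetical timings, batching amortizes the per-command overhead over 21 requests:

	latency     ≈ 5 μs (send) + 50 μs (device) + 840 μs (read 840 bytes) + 105 μs (21 computations) = 1000 μs
	throughput  ≈ 21 requests / 1000 μs = 21,000 requests per second
	utilization ≈ 105 μs of useful computation / 1000 μs = 10.5%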


Avoid busy waiting using Interrupts


	/* Avoid busy waiting code */
	for (;;) {
		write-command-to-disk;
		do {
			block-until-interrupt;	/* the CPU can run other work while blocked */
			handle-interrupt;
		} while (!ready);
		read-data-and-compute;
	}
		
  1. Latency
  2. Throughput
  3. Utilization


Direct Memory Access (DMA)


	/* DMA code */
	while (1) {
		write-command-to-disk;
		block-until-interrupt;
		check-disk-is-ready;
		compute;
	}
		
  1. Latency
  2. Throughput
  3. Utilization
With DMA, the device transfers data directly into memory, so the data doesn't have to pass through the CPU one word at a time as it does with programmed I/O.


Polling + DMA


	/* Polling + DMA code */
	while (DMA-slots-not-ready)
		schedule();
		
  1. Latency
  2. Throughput
  3. Utilization