struct pipe { mutex_t l; ... };

void writec(struct pipe *p, char c) {
    lock(&p->l);
    /* write a byte */
    unlock(&p->l);
}
Suppose it takes a long time to write a byte; then every other process that wants the lock is stuck busy-waiting for it to be released, burning CPU time.
To keep processes from busy-waiting on a lock, let's try blocking mutexes...
struct pipe { bmutex_t b; ... };

void writec(struct pipe *p, char c) {
    acquire(&p->b);
    /* write a byte */
    release(&p->b);
}
Internally, acquiring a bmutex_t yields the CPU if someone else holds the lock. Even so, this is still a form of busy waiting: the waiting process keeps getting scheduled just to check the lock and yield again, which chews up the CPU.
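For intuition, an acquire that yields might look roughly like the sketch below. This is an assumption for illustration: it supposes bmutex_t contains a low-level spinlock l and a locked flag, and that lock, unlock, and yield exist as named here.

void acquire(bmutex_t *b) {
    for (;;) {
        lock(&b->l);
        if (!b->locked) {          /* free: grab it */
            b->locked = 1;
            unlock(&b->l);
            return;
        }
        unlock(&b->l);
        yield();                   /* give up the CPU, but the scheduler will keep
                                      running us again just to check and yield */
    }
}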
We need a way to tell the scheduler that a process is waiting, and what it is waiting for, so that the process isn't constantly rescheduled just to check...
We need to tell the kernel what a process is waiting for; this is what condition variables provide. For them to work you need: the condition itself (a boolean expression over your data, known only to the programmer), a blocking mutex protecting the data the condition depends on, and a condvar_t representing the condition.
An example pipe struct implemented with condition variables:
struct pipe {
    bmutex_t b;
    condvar_t nonfull;    /* condition: w - r < N, i.e. the pipe is not full */
    condvar_t nonempty;   /* condition: the pipe is not empty */
    size_t r, w;
    char buf[N];
};
wait(condvar_t *c, bmutex_t *b)
Precondition: the blocking mutex b is acquired.
Internally, this function releases b, waits, and then reacquires b.
notify(condvar_t *c)
Notifies one process waiting on condition c that the condition is now true.
notifyAll(condvar_t *c)
Similar to notify, except that it awakens all waiting processes instead of just one.
Example write function implemented using condition variables:
void write(struct pipe *p, char c) {
    acquire(&p->b);
    while (p->w - p->r == N)          /* pipe is full */
        wait(&p->nonfull, &p->b);
    /* do the write */
    notify(&p->nonempty);
    release(&p->b);
}
If the pipe is full, the process will wait until the pipe is not full (the condition behind wait(&p->nonfull, &p->b)).
After the write is made, the function notifies a waiting process that the pipe is not empty (notify(&p->nonempty)).
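The matching read side is not given in the notes, but it would follow the same pattern; the name readc and the emptiness test p->w == p->r below are assumptions:

char readc(struct pipe *p) {
    acquire(&p->b);
    while (p->w == p->r)              /* pipe is empty: wait for a writer */
        wait(&p->nonempty, &p->b);
    char c = p->buf[p->r++ % N];      /* do the read */
    notify(&p->nonfull);              /* a waiting writer now has room */
    release(&p->b);
    return c;
}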
A semaphore implements a blocking mutex with an int counter instead of a bool.
Locked when the counter is 0; unlocked when the counter is > 0.
P = down (acquire)
"prolaag" = "probeer te verlagen = try and decrease
V = up (release)
"verhoog" = increase
Deadlock is a race condition!
Hard to debug
Rare
Can be disastrous (we all want to keep our jobs, right?)
4 conditions that can cause deadlock:
Circular wait
There is a cycle of waiting: each process holds a resource that the next process in the cycle is waiting for.
Mutual exclusion
The objects involved allow only one process to access them at a time.
No preemption of locks
Nothing can force a process to give up a lock it holds.
Hold and wait
A process can hold one object while it tries to get another.
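As a concrete illustration (not from the notes; the account_t type and transfer function are made up), here is the classic way all four conditions show up at once:

typedef struct { bmutex_t lock; long balance; } account_t;

/* If one thread runs transfer(a, b, ...) while another runs transfer(b, a, ...),
   each can acquire its first lock and then wait forever for the other's:
   mutual exclusion, hold and wait, no preemption, circular wait -> deadlock. */
void transfer(account_t *from, account_t *to, long amount) {
    acquire(&from->lock);
    acquire(&to->lock);        /* possible deadlock right here */
    from->balance -= amount;
    to->balance += amount;
    release(&to->lock);
    release(&from->lock);
}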
Strategies for avoiding deadlock:
Write perfect code (a bit impractical!)
Threads hold at most one lock
Downside: you can't perform operations that need two locks at once (e.g., no balance transfers between two accounts)!
Lock preemption
Downside: complicates the code for the process whose lock was taken away.
Acquire function checks for circular wait, fails if found
Downside: slows down acquire function, complicates user code.
Lock ordering
If you need to acquire more than one lock, always acquire the locks in a fixed numeric order (the order could, for example, be determined by comparing the locks' addresses).
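A minimal sketch of this idea, ordering two distinct bmutex_t locks by address; the helper name acquire_both is made up for illustration:

void acquire_both(bmutex_t *x, bmutex_t *y) {
    /* Assumes x != y. Every caller grabs the lower-addressed lock first,
       so no two callers can ever wait for each other in a cycle. */
    if (x < y) {
        acquire(x);
        acquire(y);
    } else {
        acquire(y);
        acquire(x);
    }
}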
Example of acquiring multiple locks using try_acquire:
for (i = 0; i < 5; i++) {
    /* try_acquire fails (returns a negative value) if it couldn't get the lock */
    if (try_acquire(&l[i]) < 0) {
        for (j = 0; j < i; j++)        /* release everything acquired so far */
            release(&l[j]);
        i = -1;                        /* start over from l[0] on the next iteration */
    }
}
There was a time when disks were faster than CPUs, which forced file systems to be implemented with simple code. Nowadays, with CPU speeds far greater than disk speeds, we can have tricky file systems, because CPUs can handle non-trivial code.
| Barracuda XT 3TB ($200) | | Corsair Force 60GB SSD ($150) | |
| Cache | 64 MB | Data transfer rate | 3 Gb/s |
| # of sectors | 3,907,029,168 | Average operating power | 2.0 W |
| Spin speed | 7200 rpm = 120 Hz | Average idle power | 0.5 W |
| Data transfer rate | 6 Gb/s | Sustained data transfer rate (write) | 270 MB/s |
| Sustained data transfer rate | 138 MB/s | Sustained data transfer rate (read) | 280 MB/s |
| I/O data transfer rate | 600 Mb/s | IOPS (I/Os per second) | 50K |
| Average latency | 4.16 ms | | |
| Random read seek time | 8.5 ms | | |
| Random write seek time | 9.5 ms | | |
| Average operating power | 9.23 W | | |
| Average idle power | 6.39 W | | |
*Data transfer rate = how fast data can be transferred out of the cache.
*Average latency = rotational delay, assuming the arm is already over the correct track.
*Random read seek time = assumes the arm has to move to the correct track first.
*Random write seek time = a little longer than a read seek, because the head must settle precisely enough to avoid overwriting neighboring data.
**Compare the SSD's 50K IOPS with the hard drive's roughly 70 IOPS.
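(Each random I/O on the hard drive pays roughly an 8.5 ms seek plus 4.16 ms of rotational latency, i.e. on the order of 13 ms per operation, which works out to only about 70-80 I/Os per second.)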
Summary: SSD is fast, but small and pricey. Hard disk may be slower but way larger and cheaper per GB.
Strategies to get I/O from the disk faster:
Reading a single character from the disk
read(fd, &c, 1);
| Average read seek    | 8.5 ms   |
| Latency              | 4.16 ms  |
| Out of cache         | ~0.01 ms |
| Into cache off disk  | ~0.04 ms |
| CPU overhead         | ~0.01 ms |
| TOTAL                | 13.72 ms |
**This translates to reading characters at about 70 bytes/s. Not very impressive.
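To see where that comes from: each 1-byte read costs about 13.72 ms, so the rate is roughly 1 byte / 0.01372 s ≈ 73 bytes per second.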
Reading using a buffer (1KB)
char buf[1024]; read(fd,buf,1024);
**This turns out to read characters at about 70 KB/s, an improvement of about 1000 times!
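The per-read overhead (seek, rotational latency, CPU) is essentially unchanged, so each read still costs about 13.7 ms but now returns 1024 bytes: 1024 bytes / 0.0137 s ≈ 73 KB/s, roughly 1000 times the single-byte rate for the same number of disk operations.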
Reading using an even larger buffer (1GB)
char buf[1024*1024*1024]; read(fd,buf,1024*1024*1024);
**This turns out to read characters at about 130 MB/s, roughly another 2000-fold improvement over the 1 KB buffer!
Why not always just read with 1 GB buffer?
| Average read seek    | 8.5 ms   |
| Latency              | 4.16 ms  |
| Out of cache         | ~0.01 ms |
| Into cache off disk  | ~7000 ms |
| CPU overhead         | ~5 ms    |
| TOTAL                | 7017 ms  |
The compromise: the application can keep issuing small requests such as read(fd, &c, 1); while underneath, the kernel issues one large read, e.g. sys_read(disk, buf, 1024*1024*1024); and caches what it read, so later small reads are satisfied from the cache instead of the disk. An important thing to note is that as throughput increases, latency also increases:
| Throughput | Latency  |
| 70 bytes/s | 13.72 ms |
| 70 KB/s    | 13.72 ms |
| 130 MB/s   | 7017 ms  |
So there must be a balance between the size of the read and the length of time it takes to read.
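As a sketch of that caching compromise at user level (the helper name getbyte and the 8 KB buffer size are illustrative assumptions, and this version only handles a single file descriptor):

#include <unistd.h>

/* Pay the seek + rotational latency once per 8 KB block, then serve
   single-byte requests out of memory. Returns -1 on end of file or error. */
int getbyte(int fd) {
    static char buf[8192];
    static ssize_t filled = 0;    /* bytes currently in buf */
    static ssize_t next = 0;      /* index of the next unread byte */

    if (next >= filled) {         /* buffer exhausted: refill it with one big read */
        filled = read(fd, buf, sizeof buf);
        next = 0;
        if (filled <= 0)
            return -1;
    }
    return (unsigned char) buf[next++];
}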
We can model disks as a linear array:
Cost of accessing block b, when the read/write head is currently at position h, is |b - h|.
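For example, if the head is at h = 100 and we access block b = 250, the cost is |250 - 100| = 150.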