CS 111

Scribe Notes for 4/12/07

by Sona Chaudhuri, Stephen Oakley

Implementing Processes:

Efficiency:

The following list is a series of concepts that add to the efficiency of a given OS. As a side note, these techniques are not unique to the kernel; they can also be applied when programming efficient applications.

  • BATCHING
  • CACHING
  • PREFETCHING

Code Example 1

Consider the following code for reading an input buffer. How can it be optimized?


int getc(FILE *f)
{
    if (currentOffset < end)
        return buf[currentOffset++];  // fast path: serve from the user-space cache
    // slow path: refill the whole buffer with ONE read() system call,
    // instead of slow, repeated talks with the OS for every character
    end = read(fd, buf, bufsize);
    ...
}

NOTE: Most programs use getc(), NOT read(), so as to improve efficiency. Calling getc() reduces runtime because it includes caching and batching: many characters are fetched with one system call and then served from user-space memory. Once again, this concept can also be applied to applications.

Code Example 2

Consider the following code for reading an input buffer by sector. How is it different from the example above, and what optimizations, if any, would help the overall performance of this function?


in readsector():
// Put this in the kernel so that higher-level programs won't have to worry
// about whether the system is standalone or multitasked.
while ((inb(0x1f7) & 0xc0) != 0x40)   // while the disk is busy...
    // A busy wait (just `continue;`) repeatedly harangues the OS (like a
    // petulant child). Thus, use schedule() instead: schedule() looks for
    // some other runnable process and lets it use the CPU while we wait.
    schedule();

NOTE: The main purpose of this change is to improve performance; MULTITASKING is the key term. The point is to use the scarce resource of the CPU efficiently: an OS is all about resource management, i.e., efficiently running several applications on the same machine.

Stand Alone vs. Multitasking

Picture 1

GOAL: We want the program to be oblivious as to whether the OS is standalone or multitasked.


Outline for the rest of Lecture

Given that a process is a program running on a "virtual" machine:
  1. What can you build with them that you didn't already have with standalone applications?
  2. How do we implement processes?
  3. Look at a small subset of the UNIX API

A. What can you build with processes that you couldn't build before?

  • Virtualization
  • Client/Service

B. How do we implement processes?

Here, we look at a client/service process pair and different implementations of it:

Implementation #1:
Picture 2

(+) don't waste disk space
(+) less I/O overhead
(+) B can start generating output right away
(+) save time

Implementation #2:
Picture 3

(+) can fit more programs into main memory
(+) now that a checkpoint exists, other programs can access the 'file'
(+) can re-read the input if need be

How do we handle the data-flows-too-fast problem?
  • We could have no flow control and let the kernel record the data. However, this could be bad because you may run out of memory.
  • Instead, we need flow control in the form of a BOUNDED BUFFER in the link.
  • There are two ways of implementing this: POLLING versus BLOCKING

POLLING
Example: The send fails immediately if there is no room in the link, and A retries until the link has room to send to B.
An advantage of this is that the process can go do other things between retries. Thus, it is good for complex applications.
BLOCKING
Example: If the above situation occurs, the sending process simply blocks until there is room.
An advantage of this approach is that the OS is free to do other things while the process is stalled. Thus, it is better for simple applications.

*Use BLOCKING to implement a BOUNDED BUFFER

struct port {
    lock_t lock;       // allows exclusive access to the port
    size_t in, out;    // in = number of msgs sent; out = number of msgs received; assume no overflow
    message_t buf[N];  // the bounded buffer itself
} ports[NPORTS];

void send(int port, message_t m)
{
    struct port *p = &ports[port];  // get access to a port
    for (;;) {
        ACQUIRE(p->lock);           // now we have exclusive access to this port
        bool ok = false;
        if (p->in - p->out < N) {   // if true, there is room in the buffer
            p->buf[p->in++ % N] = m;
            ok = true;
        }
        RELEASE(p->lock);           // release the lock so others can have access
        if (ok)
            return;
        else
            YIELD();                // tells the OS to run other processes if needed
    }
}

Now, let's look at the receive() function:

message_t receive(int port)
{
    message_t m;
    struct port *p = &ports[port];  // get access to a port
    for (;;) {
        ACQUIRE(p->lock);           // now we have exclusive access to this port
        bool ok = false;
        if (p->in != p->out) {      // if true, something is in the buffer
            m = p->buf[p->out++ % N];
            ok = true;
        }
        RELEASE(p->lock);           // release the lock so others can have access
        // Between the ACQUIRE and RELEASE calls, send and receive cannot
        // both be running, as only one of them has access to the port.
        if (ok)
            return m;
        else
            YIELD();                // tells the OS to run other processes if needed
    }
}


A look from the kernel's side (in pseudocode):

for (;;) {
    choose a process p that isn't blocked;
    load p's state into the CPU (instruction pointer, registers);
    (p is now running)
    when it hits a yield/interrupt, control goes back to the kernel;
    for each process p that has become ready, unblock(p);
}


In kernel:
The following process descriptor table is reminiscent of those actually present in the kernel.
 _______
|___1___|
|___2___| <-saved program counter, other registers, link/file descriptors, etc.
|___3___|	//contains everything you need to know about the process to run it.
|___4___|

UNIX API for processes {subset}

  1. pid_t fork(void);
    *Clones the current process
    *Returns 0 if you are running in the child
    *Returns the child's process ID (> 0) if you are in the parent
    *Returns -1 if the fork failed
  2. int execvp(char const* file, char* const* argv);
    *Discards the current program, replacing it with the contents of the file
    *Invokes the main routine from that file
    *Whenever it returns, it returns -1 and sets 'errno'
    *This is because if the new program starts correctly, execvp never returns
  3. pid_t waitpid(pid_t pid, int* status, int options);
  4. void exit(int status);