Assignment 7. System call programming and debugging

Laboratory: Buffered versus unbuffered I/O

As usual, keep a log in the file lab.txt of what you do in the lab so that you can reproduce the results later. This should not merely be a transcript of what you typed: it should be more like a true lab notebook, in which you briefly note down what you did and what happened.

For this laboratory, you will implement transliteration programs tr2b and tr2u that use buffered and unbuffered I/O respectively, and compare the resulting implementations and performance. Each implementation should be a main program that takes two operands from and to that are byte strings of the same length, and that copies standard input to standard output, transliterating every byte in from to the corresponding byte in to. Your implementations should report an error if from and to are not the same length, or if from has duplicate bytes. To summarize, your implementations should act like the standard utility tr does, except that they have no options, characters like [, - and \ have no special meaning in the operands, operand errors must be diagnosed, and your implementations act on bytes rather than on (possibly multibyte) characters.

  1. Write a C transliteration program tr2b.c that uses getchar and putchar to transliterate bytes as described above.
  2. Write a C program tr2u.c that uses read and write to transliterate bytes, instead of using getchar and putchar. The nbyte arguments to read and write should be 1, so that the program reads and writes single bytes at a time.
  3. Use the strace command to compare the system calls issued by your tr2b and tr2u commands (a) when copying one file to another, and (b) when copying a file to your terminal. Use a file that contains at least 5,000,000 bytes.
  4. Use the time command to measure how much faster one program is, compared to the other, when copying the same amount of data.
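The operand checks and the byte-at-a-time loop described in the steps above can be sketched as follows. This is only one possible approach, not the required design: the function names build_table and translit_unbuffered are hypothetical, the main program and its diagnostics are omitted, and interrupted system calls (EINTR) are not retried.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Fill table[] so that table[b] is the byte that b is mapped to.
   Return 0 on success, -1 if the operands differ in length
   or if from contains a duplicate byte.  */
int
build_table (const char *from, const char *to, unsigned char table[256])
{
  size_t n = strlen (from);
  unsigned char seen[256] = {0};

  if (n != strlen (to))
    return -1;

  for (int i = 0; i < 256; i++)
    table[i] = (unsigned char) i;   /* bytes not in from map to themselves */

  for (size_t i = 0; i < n; i++)
    {
      unsigned char f = (unsigned char) from[i];
      if (seen[f])
        return -1;                  /* duplicate byte in from */
      seen[f] = 1;
      table[f] = (unsigned char) to[i];
    }
  return 0;
}

/* Copy in to out in the style of tr2u: one byte per read and per write.
   Return 0 at end of input, -1 on an I/O error.  */
int
translit_unbuffered (int in, int out, const unsigned char table[256])
{
  unsigned char c;
  ssize_t r;

  while ((r = read (in, &c, 1)) > 0)
    {
      c = table[c];
      if (write (out, &c, 1) != 1)
        return -1;
    }
  return r < 0 ? -1 : 0;
}
```

A buffered variant in the style of tr2b could reuse the same table and replace the loop body with getchar and putchar. Because the table is indexed by unsigned char, bytes with the high bit set are handled the same way as ASCII bytes.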

Homework: Encrypted sort revisited

Rewrite the sfrob program you wrote for Homework 5 so that it uses system calls rather than <stdio.h> to read standard input and write standard output. If standard input is a regular file, your program should initially allocate enough memory to hold all the data in that file all at once, rather than the usual algorithm of reallocating memory as you go. However, if the regular file grows while you are reading it, your program should still work, by allocating more memory after the initial file size has been read.
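The allocation strategy described above might look roughly like the following sketch. The function name read_all, the 8192-byte fallback size, and the doubling growth policy are assumptions made for illustration; the sketch does not retry interrupted reads or guard against size overflow.

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Read all of fd into a malloc'd buffer, store the byte count in *len,
   and return the buffer, or NULL on failure.  */
char *
read_all (int fd, size_t *len)
{
  struct stat st;
  size_t cap = 8192;   /* assumed fallback for pipes, terminals, etc. */
  size_t used = 0;

  if (fstat (fd, &st) != 0)
    return NULL;
  if (S_ISREG (st.st_mode) && 0 < st.st_size)
    cap = (size_t) st.st_size + 1;   /* one spare byte so EOF is seen
                                        without reallocating */

  char *buf = malloc (cap);
  if (!buf)
    return NULL;

  for (;;)
    {
      if (used == cap)
        {
          /* The file grew, or the input is not a regular file.  */
          char *p = realloc (buf, cap * 2);
          if (!p)
            {
              free (buf);
              return NULL;
            }
          buf = p;
          cap *= 2;
        }

      ssize_t r = read (fd, buf + used, cap - used);
      if (r < 0)
        {
          free (buf);
          return NULL;
        }
      if (r == 0)
        break;           /* end of input */
      used += (size_t) r;
    }

  *len = used;
  return buf;
}
```

For a regular file that does not change, the loop typically issues one read that returns the whole file and a second that returns 0. If the file grows in the meantime, the buffer fills and the realloc branch takes over.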

Your program should do one thing in addition to the Homework 5 program. If successful, it should use the fprintf function to output a line of the following form to standard error before finishing:

Comparisons: 23451

where the integer "23451" is replaced by the actual number of comparisons done by your program, and where a "comparison" is an invocation of frobcmp to compare two input lines. The line should be worded exactly as above: for example, it should contain exactly one space, and it should use base 10 without excess leading zeros. It should be terminated with a newline.
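One way to count comparisons is to increment a counter inside the function passed to qsort, as in the sketch below. Here compare_lines is only a hypothetical stand-in that uses strcmp; in sfrobu the call would go to your own frobcmp from Homework 5, whose signature may differ. The names counting_cmp and sort_and_report are likewise illustrative.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static unsigned long long comparisons;   /* number of comparator calls */

/* Hypothetical stand-in for frobcmp; the real comparator comes from
   your Homework 5 program.  */
static int
compare_lines (const char *a, const char *b)
{
  return strcmp (a, b);
}

/* qsort callback: count the invocation, then delegate.  */
static int
counting_cmp (const void *x, const void *y)
{
  comparisons++;
  return compare_lines (*(char *const *) x, *(char *const *) y);
}

/* Sort lines[0..n-1] and report the comparison count on standard error.  */
void
sort_and_report (char **lines, size_t n)
{
  comparisons = 0;
  qsort (lines, n, sizeof *lines, counting_cmp);
  fprintf (stderr, "Comparisons: %llu\n", comparisons);
}
```

The %llu conversion prints the count in base 10 with no leading zeros, and the format string contains exactly one space and a trailing newline, which matches the wording required above.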

Call the rewritten program sfrobu. Measure any differences in performance between sfrob and sfrobu using the time command. Run your program on inputs of varying numbers of input lines, and estimate the number of comparisons as a function of the number of input lines.

Also, write a shell script sfrobs that uses standard tr and sort to sort encrypted files using a pipeline (that is, your script should not create any temporary files). Use the time command to compare the overall performance of sfrobs to the other two programs. You do not need to count the number of comparisons that sfrobs makes.

Submit

Submit a compressed tarball assign7.tgz containing the files that the shell commands below check: lab.txt, sfrob.txt, tr2b.c, tr2u.c, sfrobu.c and sfrobs.

All files should be ASCII text files, with no carriage returns, and with no more than 200 columns per line. The C source files should contain no more than 132 columns per line. The shell commands

tar xf assign7.tgz
expand lab.txt sfrob.txt |
  awk '/\r/ || 200 < length'
expand tr2b.c tr2u.c sfrobu.c sfrobs |
  awk '/\r/ || 132 < length'

should output nothing.


© 2005, 2007, 2009–2011, 2013–2015 Paul Eggert. See copying rules.
$Id: assign7.html,v 1.26 2015/11/09 05:28:07 eggert Exp $