Assignment 2. Shell scripting

Laboratory: Spell-checking Hawai‘ian

Keep a log in the file lab2.log of what you do in the lab so that you can reproduce the results later. This should not merely be a transcript of what you typed: it should be more like a true lab notebook, in which you briefly note down what you did and what happened.

For this laboratory we assume you're in the standard C or POSIX locale. The shell command locale should output LC_CTYPE="C" or LC_CTYPE="POSIX". If it doesn't, use the following shell command:

export LC_ALL='C'

and make sure locale outputs the right thing afterwards.
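
As a quick check, the following sketch sets the locale and then shows only the relevant line of the locale report. On many systems the value is printed as "C" rather than "POSIX"; both names refer to the same locale.

```shell
export LC_ALL='C'
# Show only the LC_CTYPE line of the locale report.
locale | grep '^LC_CTYPE='
```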

We also assume the file words contains a sorted list of English words. Create such a file by taking the contents of the file /usr/dict/words on SEASnet, and putting it in your working directory. Be careful, though, as the SEASnet file is not entirely sorted, so you'll have to sort your copy.
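
The sorting step can be sketched on a small stand-in list. The file name words.unsorted and its four entries are hypothetical, chosen only to show how the C locale orders uppercase before lowercase; on SEASnet you would start from your copy of /usr/dict/words instead.

```shell
cd "$(mktemp -d)"     # work in a scratch directory

# Hypothetical stand-in for the unsorted system word list.
printf '%s\n' banana Apple cherry apple > words.unsorted

# Sort bytewise in the C locale, so 'A' (65) comes before 'a' (97).
LC_ALL=C sort words.unsorted > words

# sort -c exits with nonzero status if its input is not already in order.
LC_ALL=C sort -c words && echo "words is sorted"
```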

Start by taking a text file containing the HTML in this assignment's web page, and running the following commands with that text file being standard input. Describe generally what each command outputs (in particular, how its output differs from that of the previous command), and why.

tr -c 'A-Za-z' '[\n*]'
tr -cs 'A-Za-z' '[\n*]'
tr -cs 'A-Za-z' '[\n*]' | sort
tr -cs 'A-Za-z' '[\n*]' | sort -u
tr -cs 'A-Za-z' '[\n*]' | sort -u | comm - words
tr -cs 'A-Za-z' '[\n*]' | sort -u | comm -23 - words
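
Before running these commands on the full web page, it may help to try them on a tiny input whose output you can check by eye. This sketch assumes a hypothetical one-line file sample.txt and a two-word stand-in for words; the real files are of course much larger.

```shell
export LC_ALL=C
cd "$(mktemp -d)"     # work in a scratch directory

printf '%s\n' again world > words                   # hypothetical tiny dictionary
printf '%s\n' 'Hello, world! Hello again.' > sample.txt

# Every byte that is not an ASCII letter becomes a newline, so a run such
# as ", " leaves an empty line behind.
tr -c 'A-Za-z' '[\n*]' < sample.txt

# -s squeezes each run of newlines into one: one word per line.
tr -cs 'A-Za-z' '[\n*]' < sample.txt

# Sorted, duplicates removed, and then only the lines unique to the first
# input of comm, that is, words that are absent from the dictionary.
tr -cs 'A-Za-z' '[\n*]' < sample.txt | sort -u | comm -23 - words > misspelled
cat misspelled
```

With this sample the only line left in misspelled should be Hello, because the dictionary holds again and world but no capitalized entry.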

Let's take the last command as a crude implementation of an English spelling checker. Suppose we want to change it to be a spelling checker for Hawai‘ian, a language that has only the following letters (or their capitalized equivalents):

p k ' m n w l h a e i o u

(In this lab for convenience we use ASCII apostrophe (') to represent the Hawai‘ian ‘okina (‘); it has no capitalized equivalent.)

Create in the file hwords a trivial Hawai‘ian dictionary containing the following words:

'a'ole
'ae
'o
'oe
Hawai'i
O'ahu
akamai
aloha
kumu

Add to the file hwords all the lines in words that contain only valid Hawai‘ian characters. hwords, like words, should be sorted. How many lines are in hwords compared to words?
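
One possible way to do the filtering is sketched below on small stand-in files. The file hstart, the variable hchars, and the sample contents are all hypothetical; only the character set comes from the list of letters above, and other approaches (for example, with sed or tr) would work as well.

```shell
export LC_ALL=C
cd "$(mktemp -d)"     # work in a scratch directory

printf '%s\n' Hello aloha world zebra > words      # hypothetical sample, sorted
printf '%s\n' "'ae" "O'ahu" aloha kumu > hstart    # a few of the starter entries

# The Hawaiian letters in both cases, plus the apostrophe.
hchars="PKMNWLHAEIOUpkmnwlhaeiou'"

# Keep the lines of words made up entirely of those characters, merge them
# with the starter entries, and sort, dropping duplicates such as aloha.
{ cat hstart; grep -E "^[$hchars]+\$" words; } | sort -u > hwords

# Compare the line counts of the two files.
wc -l hwords words
```

Note that an English word such as Hello survives the filter, since each of its letters happens to be a valid Hawai‘ian letter.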

Modify the last command above so that it checks the spelling of Hawai‘ian rather than English, under the (ridiculous) assumption that hwords is a Hawai‘ian dictionary.

Check your work by running your Hawai‘ian spelling checker on this web page, and on the Hawai‘ian dictionary hwords itself. Count the number of "misspelled" English and Hawai‘ian words on this web page, using your spelling checkers. Are there any words that are "misspelled" as English but not as Hawai‘ian, or "misspelled" as Hawai‘ian but not as English? If so, give examples.
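
The comparison of the two checkers can be sketched with comm once more. Everything named here is an assumption made for illustration: the helper misspelled, the files page.txt, bad.en and bad.haw, the tiny dictionaries, and in particular the choice of "A-Za-z'" as the set of word characters for Hawai‘ian, which is only one of several defensible choices.

```shell
export LC_ALL=C
cd "$(mktemp -d)"     # work in a scratch directory

printf '%s\n' hello world > words        # hypothetical tiny English dictionary
printf '%s\n' aloha hello > hwords       # hypothetical tiny "Hawaiian" dictionary
printf '%s\n' 'aloha hello world zzz' > page.txt

# Hypothetical helper: list the words on standard input that are absent
# from the sorted dictionary $1, treating the tr set $2 as word characters.
misspelled() {
    tr -cs "$2" '[\n*]' | sort -u | comm -23 - "$1"
}

misspelled words  'A-Za-z'  < page.txt > bad.en
misspelled hwords "A-Za-z'" < page.txt > bad.haw

wc -l bad.en bad.haw          # number of "misspelled" words per checker
comm -23 bad.en bad.haw       # "misspelled" as English only
comm -13 bad.en bad.haw       # "misspelled" as Hawaiian only
```

Both lists come out of sort -u in the same locale, so they are valid inputs to comm without further sorting.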

Homework: DOS-style file renaming

Most of you don't remember when DOS was pretty cool, but one nice feature it had was that its rename command could rename from one pattern to another. For example, in DOS if you type ren *.txt *.doc, it renames all the files that end in .txt so that they end in .doc instead. The mv command in GNU/Linux doesn't work this way. You're going to write a shell script, fmv (Fancy MoVe), that does work this way.

To make things a bit simpler, we'll restrict the input that you have to deal with. Each of the two arguments will contain exactly one * and one or more letters, all of which come either before the * or after it.

For instance, fmv a* b* would rename all the files or directories whose names start with a so that they start with b instead.

  1. Assume that if the first argument has the * before the letters, then so does the second, and likewise if the * comes after the letters. Write fmv under this assumption.
  2. Now write fmv2 so that it works whether the * is at the front or at the end of each argument, even if the two arguments differ in this respect.
  3. Finally, write fmv3 so that it does everything that fmv2 does, except that it ignores directories if the option -f is given before the two arguments.
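
As a starting point, here is a minimal sketch of the renaming loop for just one case: both patterns have the letters first and the * last. The function name fmv_prefix is hypothetical, and the sketch deliberately leaves out the leading-* case, the -f option, argument checking, and any handling of name collisions.

```shell
# fmv_prefix 'OLD*' 'NEW*'
# Rename every entry in the current directory whose name starts with OLD
# so that it starts with NEW instead.
fmv_prefix() {
    old=${1%\*}      # strip the trailing * from each pattern
    new=${2%\*}
    # The glob is expanded once, before any renaming takes place.
    for f in "$old"*; do
        [ -e "$f" ] || continue          # the pattern matched nothing
        # ${f#"$old"} is the rest of the name after the old prefix.
        mv -- "$f" "$new${f#"$old"}"
    done
}
```

When calling it, quote the patterns (fmv_prefix 'a*' 'b*') so that the shell passes them through unexpanded; otherwise a* is replaced by the list of matching names before the function ever sees it.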

Submit

Submit the following files.

All files should be ASCII text files, with no carriage returns, and with no more than 80 columns per line. The shell command:

awk '/\r/ || 80 < length' lab2.log fmv fmv2 fmv3

should output nothing.
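
If the check does print something, a variant that reports where each offending line is may be easier to act on. This sketch is not part of the required check; ok.txt and bad.txt are hypothetical sample files standing in for your own.

```shell
cd "$(mktemp -d)"     # work in a scratch directory

printf 'short line\n' > ok.txt
printf 'carriage return\r\n' > bad.txt
# Append one line of 81 x characters to bad.txt.
awk 'BEGIN { while (n++ < 81) s = s "x"; print s }' >> bad.txt

# Print the file name and line number of every line that contains a
# carriage return or is longer than 80 characters.
report=$(awk '/\r/ || length($0) > 80 { print FILENAME ": line " FNR }' \
    ok.txt bad.txt)
printf '%s\n' "$report"
```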


© 2005 Paul Eggert and Steve VanDeBogart. © 2007 Paul Eggert. See copying rules.
$Id: assign2.html,v 1.6 2007/01/15 17:28:27 eggert Exp $