You are a reader for CS 132's Compiler Construction project, which asks students to write parsers for multiple languages, notably for a subset of Java. When you grade these parsers, you need to supply test inputs to them, but it is a pain to construct sample inputs by hand, and you'd rather automate the process. So your decide to write a program that generates random sentences from a given context free grammar, so that you can use the outputs of your program to torture-test parsers submitted by CS 132 students.
You've heard that OCaml is a good language for writing compilers and whatnot, so you decide to give it a try for this application.
However, we'll need to test your program too, so this assignment specifies exactly how you will generate "random" (actually, pseudorandom) sentences from a grammar. To test your program, we will supply a grammar, a pseudorandom number generator (PRNG), and a seed for the PRNG, all of which will be kept secret from you before the test. The desired output of your program is completely determined by these three inputs, so this will simplify our job of testing.
type ('nonterminal, 'terminal) symbol = | N of 'nonterminal | T of 'terminal
let dumb_chooser s n = s mod n, (s * 7919 + 39) mod 65521
Write a higher-quality chooser smarter_chooser, using OCaml's Random module. For efficiency the Random module is stateful, but choosers are supposed to be stateless and are supposed to return results based on their seed values only, so you'll need to write some interface glue that models the chooser's seed using the Random module's internal state. Your chooser should invoke Random.set_state, Random.int, and Random.get_state each exactly once and in that order, and these should be the only calls to any Random function in your code.
Write a curried function combine_choosers that takes as arguments two choosers and produces a combined chooser. The combined chooser should use as its seed a pair of seeds, one for each of the chooser arguments; it should operate by invoking the chooser arguments on their respective seeds to choose among n integers, and it should return a pair consisting of (1) the sum modulo n of the respective integer results, and (2) the pair of the respective seeds. Thus, the type of combine_choosers should be some generalization of:
(('a -> int) -> (int * 'a)) -> (('b -> int) -> (int * 'b)) -> (('a * 'b) -> int -> (int * ('a * 'b)))
Write a curried function random_sentence that accepts a grammar, a chooser, and a seed for that chooser, returning a list of terminal values corresponding to a randomly generated sentence for that grammar. Use preorder traversal to generate the parse tree corresponding to the random sentence; the order is important because we want the solution to be completely determined by the grammar, chooser, and seed, so that we can test your results. In the traversal, whenever you visit a node of the parse tree and have alternatives to choose from, invoke the chooser with the current seed and the number of integers, to obtain an integer value and a new seed. Use the returned integer value i to choose the ith alternative in the list (where the list items are numbered 0, 1, 2,...). Do not invoke a chooser except to make a choice among one or more alternatives.
Write two grammars for your random sentence generator, and supply three test cases for each grammar. Use the test cases below as a model for your grammars and test cases. One of your two grammars should be a transliteration of the Spiglet grammar; it should be as faithful as possible as to the original, e.g., the order of production rules should be the same. In your translation of the Spiglet grammar, use the string "<EOF>" to denote <EOF>, "0" to denote all <INTEGER_LITERAL>s, and "a" to denote all <IDENTIFIER>s. Call the Spiglet grammar spiglet_grammar and its set of nonterminals spiglet_nonterminals. Call the other grammar my_grammar and its set of nonterminals my_nonterminals. Call the test cases spiglet_test0 through spiglet_test2 and my_test0 through my_test2. You can supply more grammars and test cases if you like.
Your code may use the Random and List modules, but it should use no other modules other than your own code. Your code should be free of side effects, except for smarter_chooser's calls to Random functions. Simplicity is more important than efficiency, but your code should avoid using unnecessary time and space when it is easy to do so.
Assess your work by writing an after-action report that summarizes why you solved the problem the way you did, other approaches that you considered and rejected (and why you rejected them), and any weaknesses in your solution in the context of its intended application. This report should be a simple ASCII plain text file that consumes a page or so (at most 100 lines and 80 columns per line, please). See Resources for oral presentations and written reports for advice on how to write assessments; admittedly much of the advice there is overkill for the simple kind of report we're looking for here.
Submit three files via CourseWeb. The file hw1.ml should define smarter_chooser and random_sentence along with any auxiliary types and functions. The file hw1test.ml should contain your test cases. The file hw1.txt should hold your assessment. Please do not put your name, student ID, or other personally identifying information in your files.
let dumb_chooser s n = s mod n, (s * 7919 + 39) mod 65521 let dumb_2_chooser = combine_choosers dumb_chooser dumb_chooser let dumb_22_chooser = combine_choosers dumb_2_chooser dumb_2_chooser let combined_chooser_test1 = dumb_2_chooser (1, 2) 3 = (0, (7958, 15877)) let combined_chooser_test2 = (dumb_22_chooser ((1, 2), (3, 4)) 567 = (10, ((7958, 15877), (23796, 31715)))) (* An example grammar for a small subset of Awk, derived from but not identical to the grammar in <http://www.cs.ucla.edu/classes/winter06/cs132/hw/hw1.html>. *) type awksub_nonterminals = | Expr | Lvalue | Incrop | Binop | Num let awksub_grammar = (Expr, function | Expr -> [[T"("; N Expr; T")"]; [N Num]; [N Expr; N Binop; N Expr]; [N Lvalue]; [N Incrop; N Lvalue]; [N Lvalue; N Incrop]] | Lvalue -> [[T"$"; N Expr]] | Incrop -> [[T"++"]; [T"--"]] | Binop -> [[T"+"]; [T"-"]] | Num -> [[T"0"]; [T"1"]; [T"2"]; [T"3"]; [T"4"]; [T"5"]; [T"6"]; [T"7"]; [T"8"]; [T"9"]]) let awksub_test0 = (random_sentence awksub_grammar dumb_chooser 10 = ["++"; "$"; "4"]) let awksub_test1 = (random_sentence awksub_grammar dumb_chooser 100 = ["--"; "$"; "$"; "2"; "-"; "++"; "$"; "$"; "0"; "+"; "$"; "--"; "$"; "7"; "++"; "++"]) let awksub_test2 = (random_sentence awksub_grammar dumb_chooser 1000 = ["--"; "$"; "$"; "1"]) let awksub_test3 = (random_sentence awksub_grammar dumb_chooser 100000 = ["--"; "$"; "$"; "("; "$"; "8"; "+"; "("; "$"; "++"; "$"; "("; "--"; "$"; "$"; "$"; "$"; "--"; "$"; "$"; "7"; "--"; "+"; "$"; "$"; "$"; "("; "++"; "$"; "9"; ")"; "--"; "--"; "++"; "+"; "++"; "$"; "$"; "5"; "--"; "-"; "2"; ")"; ")"; "+"; "0"; "+"; "$"; "$"; "--"; "$"; "--"; "$"; "("; "9"; "+"; "$"; "--"; "$"; "$"; "$"; "("; "5"; ")"; "++"; "++"; "++"; "+"; "--"; "$"; "--"; "$"; "++"; "$"; "--"; "$"; "5"; "+"; "$"; "$"; "--"; "$"; "--"; "$"; "0"; "--"; "++"; "+"; "--"; "$"; "$"; "("; "8"; ")"; "-"; "--"; "$"; "$"; "4"; "+"; "$"; "--"; "$"; "++"; "$"; "$"; "$"; "1"; "--"; "--"; ")"; "+"; "3"; "++"; "-"; "("; "++"; "$"; "$"; "--"; "$"; "("; "3"; ")"; "++"; ")"; "+"; "--"; "$"; "("; "$"; "("; "--"; "$"; "$"; "$"; "$"; "7"; "--"; "--"; "-"; "4"; "++"; ")"; "--"; ")"; "++"; "-"; "("; "$"; "$"; "$"; "2"; "++"; "--"; ")"; ")"; "--"]) let awksub_test4 = (random_sentence awksub_grammar (combine_choosers dumb_chooser dumb_chooser) (300, 200) = ["--"; "$"; "("; "$"; "$"; "--"; "$"; "++"; "$"; "("; "("; "9"; ")"; "-"; "2"; ")"; "+"; "++"; "$"; "$"; "("; "$"; "$"; "$"; "1"; ")"; "-"; "++"; "$"; "$"; "--"; "$"; "$"; "4"; "-"; "6"; "++"; ")"; "+"; "++"; "$"; "$"; "$"; "--"; "$"; "4"; "-"; "$"; "$"; "$"; "$"; "++"; "$"; "$"; "$"; "$"; "7"; "--"; "++"; "++"]) type giant_nonterminals = | Conversation | Sentence | Grunt | Snore | Shout | Quiet let giant_grammar = (Conversation, function | Conversation -> [[N Snore]; [N Sentence; T","; N Conversation]] | Snore -> [[T"ZZZ"]] | Sentence -> [[N Quiet]; [N Grunt]; [N Shout]] | Quiet -> [] | Grunt -> [[T"khrgh"]] | Shout -> [[T"aooogah!"]]) let giant_test0 = (random_sentence giant_grammar dumb_chooser 11 = [","; "khrgh"; ","; "ZZZ"]) let giant_test1 = (random_sentence giant_grammar dumb_chooser 22 = ["ZZZ"]) let giant_test2 = (random_sentence giant_grammar dumb_chooser 33 = [","; "ZZZ"])
When testing on SEASnet, make sure /usr/local/cs/bin is at the start of your path, so that you get the proper verison of OCaml. To do this, append the following lines to your $HOME/.profile file if you use bash or ksh:
export PATH=/usr/local/cs/bin:$PATH
or the following line to your $HOME/.login file if you use tcsh or csh:
set path=(/usr/local/cs/bin $path)
The command ocaml should output the version number 3.11.0.
If you put the sample test cases into a file hw1sample.ml, you should be able to use it as follows to test your hw1.ml solution on the SEASnet implementation of OCaml. Similarly, the command #use "hw1test.ml";; should run your own test cases on your solution.
$ ocaml Objective Caml version 3.11.0 # #use "hw1.ml";; type ('a, 'b) symbol = N of 'a | T of 'b … # #use "hw1sample.ml";; val dumb_chooser : int -> int -> int * int = <fun> val dumb_2_chooser : int * int -> int -> int * (int * int) = <fun> val dumb_22_chooser : (int * int) * (int * int) -> int -> int * ((int * int) * (int * int)) = <fun> val combined_chooser_test1 : bool = true val combined_chooser_test2 : bool = true type awksub_nonterminals = Expr | Lvalue | Incrop | Binop | Num val awksub_grammar : awksub_nonterminals * (awksub_nonterminals -> (awksub_nonterminals, string) symbol list list) = (Expr, <fun>) val awksub_test0 : bool = true val awksub_test1 : bool = true val awksub_test2 : bool = true val awksub_test3 : bool = true val awksub_test4 : bool = true type giant_nonterminals = Conversation | Sentence | Grunt | Snore | Shout | Quiet val giant_grammar : giant_nonterminals * (giant_nonterminals -> (giant_nonterminals, string) symbol list list) = (Conversation, <fun>) val giant_test0 : bool = true val giant_test1 : bool = true val giant_test2 : bool = true #