Let's try doing Homework 2 over again, this time in a different language, namely Prolog. Maybe we'll have better luck this time.
This time we won't bother to generate a matcher; we'll just have a general predicate that does matching directly. Instead of asking for a derivation, we'll just ask the grammar to be annotated to tell us what to generate; real parsers do this far more often than they generate derivations, so this is more realistic. Also, instead of passing an acceptor explicitly to the matcher, we'll make it the caller's responsibility to decide whether to accept the match. If the caller doesn't like the match, the caller can backtrack into the matcher to get a different solution. Prolog has backtracking built-in, so this is easy to implement and explain.
Write a predicate parse_prefix/4 that accepts the following arguments:
parse_prefix(NT,G,RS,S) should succeed if an initial prefix of RS can be parsed according to G as an instance of NT, such that S is the remaining suffix of RS after the initial prefix is removed. parse_prefix/4 may need to backtrack if there are multiple answers. It can assume that RS is a ground term, that is, that RS contains no logical variables.
In this assignment a grammar is represented as a list of rules. Each rule is a term of the form rule(NT,RHS), which describes a grammar rule whose left hand side is the attributed nonterminal NT and whose right hand side is RHS. RHS is a list of marked symbols. A grammar need not be a ground term; each rule in the grammar is considered separately with respect to the logical variables in the grammar.
A marked symbol is a Prolog term of the form -T, which represents the token T, or of the form +NT, which represents the attributed nonterminal NT.
An attributed nonterminal is a Prolog structure whose function symbol is the nonterminal's name and whose arguments are the attributes. In our test case the attributes are parse trees, but in general they can be any Prolog term.
Here is a predicate awk_grammar/1 that succeeds on an example grammar. You can use this predicate for testing.. You might want to put it into a file named awkgram.prolog. The grammar is the same as Homework 2's awksub_grammar.
awk_grammar([ rule(expr(op(T,B,E)), [+term(T), +binop(B), +expr(E)]), rule(expr(T), [+term(T)]), rule(term(N), [+num(N)]), rule(term(L), [+lvalue(L)]), rule(term(pre(O,L)), [+incrop(O), +lvalue(L)]), rule(term(post(L,O)), [+lvalue(L), +incrop(O)]), rule(term(E), [-'(', +expr(E), -')']), rule(lvalue($(E)), [-($), +expr(E)]), rule(incrop(++), [- ++]), rule(incrop(--), [- --]), rule(binop(+), [-(+)]), rule(binop(-), [-(-)]), rule(num(0), [- 0]), rule(num(1), [- 1]), rule(num(2), [- 2]), rule(num(3), [- 3]), rule(num(4), [- 4]), rule(num(5), [- 5]), rule(num(6), [- 6]), rule(num(7), [- 7]), rule(num(8), [- 8]), rule(num(9), [- 9]) ]).
Your implementation of parse_prefix/4 should not use any side effects or cuts, and should avoid using built-in Prolog predicates as much as possible. As an exception, you may use the side-effect-free predicates defined in the Term processing and List processing sections of the GNU Prolog manual. Note that setarg has side effects and cannot be used.
Submit two files. First, submit a file hw4.prolog containing your predicate definitions. Make sure that your definitions work with gprolog, the Prolog implementation installed on SEASnet in /usr/local/cs/bin; if you type the shell command gprolog --version the first line of output should be Prolog top-Level (GNU Prolog) 1.3.0.
Second, submit an ASCII text file hw4.txt that briefly describes how you arrived at your solution, focusing on alternative approaches that you rejected and why. Also, describe what happens when your implementation of parse_prefix/4 is called in an attempt to implement Homework 1, and why. For example, what happens when you call awk_grammar(G), parse_prefix(expr(op(3,(+),$(4))), G, RS, S) or even awk_grammar(G), parse_prefix(A,G,B,C)?
Please do not put your name, student ID, or other personally identifying information in your files.
In all of these examples, the programmer repeatedly typed ";" in order to backtrack through all possible solutions. (The last example has too many solutions so only the first few are printed.) Your program does not need to return the answers in the same order as shown below. Also, your program's final response need not agree with the yes and ; no responses in the examples shown below.
$ eggert@day:~/junk$ gprolog GNU Prolog 1.3.0 By Daniel Diaz Copyright (C) 1999-2007 Daniel Diaz | ?- consult('awkgram.prolog'). compiling awkgram.prolog for byte code… yes | ?- consult('hw4.prolog'). compiling hw4.prolog for byte code… yes | ?- consult(user). compiling user for byte code... parse_awk(E,RS,S) :- awk_grammar(G), parse_prefix(E,G,RS,S). ^D user compiled,… | ?- parse_awk(expr(E), [ouch], S). no | ?- parse_awk(expr(E), [9], S). E = 9 S = [] ? ; no | ?- parse_awk(expr(E), [9,+,$,1,+], S). E = op(9,+,$(1)) S = [+] ? ; E = 9 S = [+,$,1,+] ? ; no | ?- parse_awk(expr(E), [9,+,$,1,+], []). no | ?- parse_awk(expr(E), [$,1,+,2], S). E = op($(1),+,2) S = [] ? ; E = $(op(1,+,2)) S = [] ? ; E = $(1) S = [+,2] ? ; no | ?- parse_awk(expr(E), [$,3,++,4], [S1|Srest]). E = $(3) S1 = ++ Srest = [4] ? ; E = post($(3),++) S1 = 4 Srest = [] ? ; no | ?- parse_awk(expr(E), [$,3,++,4], [S1|Srest]), integer(S1). E = post($(3),++) S1 = 4 Srest = [] ? ; no | ?- parse_awk( expr(E), ['(', $, 8, ')', -, $, ++, $, --, $, 9, +, '('], S). E = op($(8),-,$(pre(++,$(pre(--,$(9)))))) S = [+,'('] ? ; E = $(8) S = [-,$,++,$,--,$,9,+,'('] ? ; no | ?-