This webpage was adapted from Alex Thornton’s offering of CS 141
CompSci 142 / CSE 142 Winter 2018
Project #3: Symbol Table
Due date and time: Thursday, Feb 8, 11:59pm. No late submissions will be accepted.
Now that we have written a Recognizer for our language, we can proceed to add semantics that will give our programs meaning. As you may have noticed in the test cases for the previous lab, some of the input source test cases which passed the recognizer would not be considered valid Crux programs. For example, consider the following input source:
func main() : void
{
::printInt(x);
}
This input forms a valid sentence in the Crux grammar, and therefore passes the recognizer. However, we would not consider it a valid Crux program because no variable x has been declared. Our compiler should detect this "use before definition" error as a semantic constraint on the sentences that form Crux programs. To catch and report such errors we shall implement a SymbolTable.
We create a Symbol class, so that our compiler can model the variables and functions that Crux programmers declare in their programs. Every time the input source declares a variable or declares a function, the compiler creates a Symbol instance to represent it. A SymbolTable stores each of these Symbols for later retrieval, when the Crux program uses the variable or calls the function.
Semantically, Crux has a layer of scope for each function. According to the Crux grammar, curly braces only occur in a statement-block. However, the statement-block occurs in one of two different places:
Conceptually, each time the Parser encounters a statement-block it introduces a new scope to contain any newly declared Symbols (variables and functions). This rule is loosened somewhat because Crux's symbol semantics specify that parameters of a function are also scoped with the function body.
The Crux grammar allows curly braces to nest (an if-else inside a while loop inside a function). Therefore, our SymbolTable must model the nesting of scopes. Valid Crux source code contains balanced curly braces; for every open brace there is a close brace. We can therefore treat the nested scopes as a stack of SymbolTables, one table for each scope. The parser models the entry of a scope by calling the enterScope() helper method. Similarly, when a scope closes (a closing curly brace), the parser calls exitScope().
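The stack behavior of enterScope() and exitScope() can be sketched as follows. This is only an illustration under stated assumptions: the Scope and ScopeTracker classes and the depth field are hypothetical names, not part of the provided skeleton, and a real table would also store symbols.

```java
// Hypothetical stand-in for a per-scope symbol table; only the parent link
// is modeled here so the stack behavior is easy to see.
class Scope {
    final Scope parent; // enclosing scope, or null for the outermost scope
    final int depth;    // 0 for the outermost scope (illustration only)

    Scope(Scope parent) {
        this.parent = parent;
        this.depth = (parent == null) ? 0 : parent.depth + 1;
    }
}

// Hypothetical holder for the parser's scope bookkeeping.
class ScopeTracker {
    private Scope current = new Scope(null); // outermost scope

    // Called when a scope opens: push a fresh scope whose parent is the old one.
    void enterScope() {
        current = new Scope(current);
    }

    // Called when a scope closes: pop back to the enclosing scope.
    void exitScope() {
        if (current.parent == null) {
            throw new IllegalStateException("cannot exit the outermost scope");
        }
        current = current.parent;
    }

    int depth() {
        return current.depth;
    }
}
```

Because each scope keeps a reference to its parent, no separate stack container is needed: pushing allocates a child, and popping follows the parent link.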
Conceptually, the SymbolTable forms a linked list of Map<String, Symbol>s. Each table has a pointer to a parent table, representing the outer scope. The lookup(String name) method recursively walks the list of tables, proceeding from the innermost scope to the outermost scope. The lookup method returns the first Symbol matching name. Otherwise, it signals that no symbol with the given name exists in any of the scopes.
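A minimal sketch of this recursive lookup is shown below. It rests on several assumptions: the SymbolNotFoundException class and the put helper are hypothetical, and the skeleton may signal a failed lookup differently.

```java
import java.util.HashMap;
import java.util.Map;

class Symbol {
    private final String name;

    Symbol(String name) { this.name = name; }

    String name() { return name; }
}

// Hypothetical failure signal; the skeleton may use a different mechanism.
class SymbolNotFoundException extends RuntimeException {
    SymbolNotFoundException(String name) {
        super("Symbol " + name + " not found.");
    }
}

class SymbolTable {
    private final SymbolTable parent; // outer scope, or null for the outermost
    private final Map<String, Symbol> table = new HashMap<>();

    SymbolTable(SymbolTable parent) { this.parent = parent; }

    // Hypothetical helper for this sketch; it performs no clash checking.
    void put(Symbol symbol) { table.put(symbol.name(), symbol); }

    // Innermost scope first, then each enclosing scope in turn.
    Symbol lookup(String name) {
        Symbol found = table.get(name);
        if (found != null) {
            return found;
        }
        if (parent == null) {
            throw new SymbolNotFoundException(name);
        }
        return parent.lookup(name);
    }
}
```

Since the search stops at the first match, a symbol declared in an inner scope hides any symbol of the same name in an outer scope.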
Although we introduce Symbols in this lab, we will not be detecting type-invalid usage until a later lab. For this lab, we only check that symbol names do not clash (Redefinition Error) and that they exist (Unknown Symbol Error). If some Crux source code declares a variable and later uses that name to call a function, we shall consider it valid because the name exists in the table when the call is parsed.
Symbols are not available outside of the scope (set of curly braces) where they are declared. If some Crux source code attempts to use a variable or function name that has not been declared in the current scope or any outer scope then the Parser signals a "ResolveSymbolError".
We implement this functionality through a helper method in the Parser: Symbol tryResolveSymbol(Token ident). When the Parser encounters an identifier use, it calls tryResolveSymbol to perform a lookup in the currentSymbolTable. If the lookup signals a failure, then tryResolveSymbol logs a "ResolveSymbolError". Otherwise, the lookup succeeds and tryResolveSymbol returns the resulting Symbol so that the Parser may resume parsing.
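The resolve helper might look roughly like the sketch below. Everything except the names tryResolveSymbol and currentSymbolTable is an assumption: the Token fields, the ResolveSketch class, the error message format, and the placeholder Symbol returned on failure are all hypothetical. In this sketch lookup signals failure by returning null to keep the code short; an exception would work equally well.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical token: the identifier text plus its source position.
class Token {
    final String lexeme;
    final int line;
    final int column;

    Token(String lexeme, int line, int column) {
        this.lexeme = lexeme;
        this.line = line;
        this.column = column;
    }
}

class Symbol {
    final String name;

    Symbol(String name) { this.name = name; }
}

class SymbolTable {
    final SymbolTable parent;
    final Map<String, Symbol> symbols = new HashMap<>();

    SymbolTable(SymbolTable parent) { this.parent = parent; }

    // Walks from this scope outward; returns null when no scope has the name.
    Symbol lookup(String name) {
        for (SymbolTable t = this; t != null; t = t.parent) {
            Symbol s = t.symbols.get(name);
            if (s != null) {
                return s;
            }
        }
        return null;
    }
}

class ResolveSketch {
    SymbolTable currentSymbolTable = new SymbolTable(null);
    final List<String> errorLog = new ArrayList<>();

    Symbol tryResolveSymbol(Token ident) {
        Symbol found = currentSymbolTable.lookup(ident.lexeme);
        if (found != null) {
            return found;
        }
        // Message format is made up for this sketch.
        errorLog.add("ResolveSymbolError(" + ident.line + "," + ident.column + "): " + ident.lexeme);
        // Assumption: hand back a placeholder so the Parser can keep going.
        return new Symbol(ident.lexeme);
    }
}
```

Logging the error instead of aborting lets the Parser continue through the rest of the token stream after a failed lookup.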
In some other languages, lookup uses both type and name to find the correct symbol. For example, Java allows method overloading and can distinguish which method is meant by inspecting the argument types at the call site. Crux is a simple language, and uses only the name to distinguish among possible symbols.
Within each scope, all of the Symbols carry a unique name. If some Crux source code attempts to declare two variables, or two functions, or a function and a variable by the same name, in the same scope, then the Parser signals a "DeclareSymbolError".
We implement this functionality through a helper method in the Parser: Symbol tryDeclareSymbol(Token ident). When the parser encounters a variable declaration or function definition, it calls tryDeclareSymbol to perform an insertion in the currentSymbolTable. If the insertion signals a failure, then tryDeclareSymbol logs a "DeclareSymbolError". Otherwise, the insertion succeeds and tryDeclareSymbol returns the newly created Symbol so that the Parser may resume parsing.
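A sketch of the declare helper follows, again under assumptions: the Token fields, the insert method returning null on a clash, the DeclareSketch class, the message format, and returning the existing Symbol after an error are all hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical token: the identifier text plus its source position.
class Token {
    final String lexeme;
    final int line;
    final int column;

    Token(String lexeme, int line, int column) {
        this.lexeme = lexeme;
        this.line = line;
        this.column = column;
    }
}

class Symbol {
    final String name;

    Symbol(String name) { this.name = name; }
}

class SymbolTable {
    final SymbolTable parent;
    final Map<String, Symbol> symbols = new LinkedHashMap<>();

    SymbolTable(SymbolTable parent) { this.parent = parent; }

    // Only this scope is checked, so an inner scope may reuse an outer name.
    // Returns the new Symbol, or null if the name is already taken here.
    Symbol insert(String name) {
        if (symbols.containsKey(name)) {
            return null;
        }
        Symbol created = new Symbol(name);
        symbols.put(name, created);
        return created;
    }
}

class DeclareSketch {
    SymbolTable currentSymbolTable = new SymbolTable(null);
    final List<String> errorLog = new ArrayList<>();

    Symbol tryDeclareSymbol(Token ident) {
        Symbol created = currentSymbolTable.insert(ident.lexeme);
        if (created != null) {
            return created;
        }
        // Message format is made up for this sketch.
        errorLog.add("DeclareSymbolError(" + ident.line + "," + ident.column + "): " + ident.lexeme);
        // Assumption: fall back to the earlier declaration so parsing continues.
        return currentSymbolTable.symbols.get(ident.lexeme);
    }
}
```

Note the asymmetry with lookup: insertion consults only the current scope, while lookup walks outward through every enclosing scope.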
The Crux Specification contains a section describing certain predefined functions. We consider these functions to be "built-in", because the Crux programmer doesn't have any way to implement them. However, programming in Crux would be mighty useless without the simple abilities represented by these functions. Before parsing, the symbol table should be pre-loaded with symbols representing these functions.
Implement the Crux Symbol Semantics section of the Crux Specification. Your program should be able to correctly identify the two errors, DeclareSymbolError and ResolveSymbolError. After encountering one of these two errors, the Parser logs both the error and the current symbol table before proceeding to parse the rest of the token stream. When logged, the symbol table should report symbols in the order in which they were declared in the Crux input program.
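One way to get declaration-order output is to store each scope's symbols in a java.util.LinkedHashMap, which iterates in insertion order. The sketch below assumes a hypothetical declare method and a made-up print format (outer scopes first, two extra spaces of indentation per nested scope); check the provided test cases for the format actually expected.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class Symbol {
    private final String name;

    Symbol(String name) { this.name = name; }

    @Override
    public String toString() { return "Symbol(" + name + ")"; }
}

class SymbolTable {
    private final SymbolTable parent;
    private final int depth;
    // LinkedHashMap iterates in insertion order, i.e. declaration order.
    private final Map<String, Symbol> symbols = new LinkedHashMap<>();

    SymbolTable(SymbolTable parent) {
        this.parent = parent;
        this.depth = (parent == null) ? 0 : parent.depth + 1;
    }

    // Hypothetical helper for this sketch; it performs no clash checking.
    void declare(String name) { symbols.put(name, new Symbol(name)); }

    // Prints outer scopes first, then this scope, indented by nesting depth.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        if (parent != null) {
            sb.append(parent);
        }
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        for (Symbol s : symbols.values()) {
            sb.append(indent).append(s).append('\n');
        }
        return sb.toString();
    }
}
```

If the outermost table is pre-loaded with a built-in such as printInt before parsing begins, that symbol is listed ahead of anything the Crux program declares, since it was inserted first.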
For convenience, you may get a start on this lab by using a pre-made Lab3_SymbolTable.zip project, which contains skeletons for both the crux.SymbolTable and crux.Symbol classes, as well as a modified skeleton for crux.Parser which sketches how to handle the error reporting. As before, you are both allowed and encouraged to make your program easier to read and maintain by implementing helper functions with good names.
You may remove the enterRule and exitRule functions if you feel that they clutter your code.
Symbol tryDeclareSymbol(Token ident).
Symbol tryResolveSymbol(Token ident).
Symbol class, which (for now) only holds the name of a symbol.
SymbolTable class, which models a stack of scopes.
Test cases are available in this tests.zip file. The provided tests are not meant to be exhaustive. You are strongly encouraged to construct your own. (If Chrome gives you a warning that "tests.zip is not commonly downloaded and could be dangerous" it means that Google hasn't performed an automated virus scan. This warning can be safely ignored, as the tests.zip file contains only text files.)
A zip file, named Crux.zip, containing the following files (in the crux package):
We will release an AutoTester soon, so please make sure that your work meets our requirements. We reserve the right to assign 0 points to any submission which cannot be automatically unzipped and tested. Additionally, we reserve the right to assign 0 points to any submission which 'games' the automated testing by using a lookup which produces only outputs that correspond to the test cases we happen to use. Be sure to submit the version of the project that you want graded, as we won't regrade if you submit the wrong version by accident.
Enjoy!
Adapted from a similar document from CS 141 Winter 2013 by Alex Thornton.