JTB
JAVA TREEBUILDER
OTHER TOOLS

    JJTree

    JJTree is a parse tree builder written by the same folks who gave us JavaCC and is included with the JavaCC distribution.  JTB is similar in function to JJTree; however, there are significant differences both in how each tool works and in how to program against the tree each tool generates.  Some of these differences are listed below:
    • JTB's design concentrates on simplicity and ease of use while JJTree provides greater flexibility at the cost of added complexity and development time.
    • JTB's tree node classes implement the Visitor design pattern as described in the book Design Patterns: Elements of Reusable Object-Oriented Software.  See the page entitled "Why Visitors" for information on the benefits of this design pattern.  Note that this functionality has recently been added to JJTree.
    • JTB's nodes preserve type information by storing each child in a field of the child's actual type, whereas tree nodes created by JJTree store their children as a Vector of nodes.  This eliminates any ambiguity in the tree structure as well as the type casts that might be necessary to access specific node types in a JJTree syntax tree.
    • JTB operates on a standard, unmodified JavaCC grammar file.
      • This approach presents several advantages in our eyes:
        • A smaller learning curve since no additional grammar needs to be learned.
        • No additional work needs to be done on a grammar file to prepare it for processing with JTB.
        • A grammar will only have one possible tree structure.  Any programmer who sees a JavaCC grammar file will immediately know the structure of the JTB tree.
      • Although we consider the advantages to be important enough to take this approach, there are also possible disadvantages:
        • The tree generated by JTB may not be as flexible as the one by JJTree.  For example, with JTB, all nonterminals generate a class.  With JJTree, certain productions can be flagged so as not to generate a node class.
        • More memory will be required by a JTB tree as opposed to a JJTree tree which suppresses certain nonterminals from generating nodes.
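    The difference in child storage described above can be sketched as follows.  This is a hand-written illustration under stated assumptions, not the output of either tool: every class name here (Expression, Statement, TypedIfStatement, GenericNode) is hypothetical, and the classes actually generated for a given grammar will differ.

```java
import java.util.Vector;

// Hypothetical node types for illustration; not generated by JTB or JJTree.
class Expression {
    final String text;
    Expression(String text) { this.text = text; }
}

class Statement {
    final String text;
    Statement(String text) { this.text = text; }
}

// Typed style: each child is a field of its actual type.
class TypedIfStatement {
    final Expression condition;
    final Statement body;
    TypedIfStatement(Expression condition, Statement body) {
        this.condition = condition;
        this.body = body;
    }
}

// Vector style: children are untyped, so position and type are a convention.
class GenericNode {
    final Vector<Object> children = new Vector<>();
}

class ChildStorageDemo {
    static String conditionOf(TypedIfStatement n) {
        return n.condition.text;   // no cast; checked by the compiler
    }

    static String conditionOf(GenericNode n) {
        // Cast needed; fails at run time if the child is missing or of another type.
        return ((Expression) n.children.elementAt(0)).text;
    }

    public static void main(String[] args) {
        Expression cond = new Expression("x > 0");
        Statement body = new Statement("y = 1;");

        TypedIfStatement typed = new TypedIfStatement(cond, body);
        GenericNode generic = new GenericNode();
        generic.children.addElement(cond);
        generic.children.addElement(body);

        System.out.println(conditionOf(typed));
        System.out.println(conditionOf(generic));
    }
}
```

    Both calls return the same text, but only the typed version lets the compiler verify that the condition is present and is an Expression.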
    Like JJTree, JTB takes a JavaCC grammar file as input and outputs syntax tree node classes as well as an annotated grammar file which contains code to build the tree during parsing.  JTB also generates a default Visitor class whose methods visit the tree depth first.
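    A minimal sketch of a depth-first default visitor, and of a subclass that overrides only the method it needs, might look like this.  It is written by hand to mimic the pattern: the node classes (Program, Definition, Identifier) and the names Visitor, Node, DepthFirstVisitor and IdentifierCollector are assumptions for illustration, not the code JTB emits.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; hand-written to mimic the pattern.
interface Visitor {
    void visit(Program n);
    void visit(Definition n);
    void visit(Identifier n);
}

interface Node {
    void accept(Visitor v);
}

class Identifier implements Node {
    final String name;
    Identifier(String name) { this.name = name; }
    public void accept(Visitor v) { v.visit(this); }
}

class Definition implements Node {
    final Identifier name;
    Definition(Identifier name) { this.name = name; }
    public void accept(Visitor v) { v.visit(this); }
}

class Program implements Node {
    final List<Definition> definitions;
    Program(List<Definition> definitions) { this.definitions = definitions; }
    public void accept(Visitor v) { v.visit(this); }
}

// Default visitor: every method only walks the node's children, depth first.
class DepthFirstVisitor implements Visitor {
    public void visit(Program n) {
        for (Definition d : n.definitions) d.accept(this);
    }
    public void visit(Definition n) { n.name.accept(this); }
    public void visit(Identifier n) { /* leaf: nothing to walk */ }
}

// A task-specific visitor overrides only the method it cares about.
class IdentifierCollector extends DepthFirstVisitor {
    final List<String> names = new ArrayList<>();
    @Override
    public void visit(Identifier n) { names.add(n.name); }
}

class VisitorDemo {
    public static void main(String[] args) {
        List<Definition> defs = new ArrayList<>();
        defs.add(new Definition(new Identifier("x")));
        defs.add(new Definition(new Identifier("y")));
        Program program = new Program(defs);

        IdentifierCollector collector = new IdentifierCollector();
        program.accept(collector);
        System.out.println(collector.names);   // prints [x, y]
    }
}
```

    IdentifierCollector never mentions Program or Definition; the inherited methods carry the traversal down to the leaves.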

    JJTree: A Hands-On Comparison

    I recently attempted to compare the two tools further by using each of them to perform a small sample task: finding undeclared variables in programs written in a small subset of Scheme.  This example can be downloaded from the Examples page.

    In all fairness to JJTree, note that I am not a JJTree expert.  There may be ways of doing things of which I was not aware.  The JJTree examples included with JavaCC, which I used as a study model, probably didn't cover the full range of the tool's capabilities, tricks, workarounds, and good JJTree programming style.  In addition, working on a larger project would probably reveal more advantages and disadvantages of using JJTree.

    For this comparison, I used JJTree 0.3pre3 (included with JavaCC 0.7pre5) and JTB 1.0.  My attempt at objective observation produced the following: 
     

    Observation: Preparation Time
    • JJTree: High.
      • The grammar had to be analyzed and the parse tree planned.  A mistake in planning the parse tree could result in much time wasted if discovered after visitor programming has begun.
      • The grammar file also had to be annotated according to the planned parse tree.  This in turn had to be tested and debugged (for which I used the dump() method of Node).
    • JTB: None.
      • JTB's parse tree is fixed and cannot be manually altered.
      • The bare grammar can be used without modification.

    Observation: Visitor Programming
    • JJTree: About equal to JTB.
      • Slightly more time had to be spent providing visit() methods which did nothing but visit certain nodes' children.
    • JTB: About equal to JJTree.
      • JTB automatically provides a visitor whose methods visit each node's children.  Extending this class allows you to override only the necessary visit() methods in which some task needs to be performed.

    Observation: Parse Tree Nodes
    • JJTree:
      • The working directory got very cluttered after generating the tree classes.  There is no way to automatically generate tree classes into a specified directory.
      • The generated node classes had to be modified, which is common when dealing with parse trees.  This, however, gets somewhat messy when programming for a grammar with hundreds of potential tree node classes.
      • Children are stored in a vector.  There is no way to guarantee the type or number of children in a node.
    • JTB:
      • JTB automatically generates the syntax tree nodes into a package and subdirectory called "syntaxtree", keeping the grammar's directory clean.
      • JTB's parse tree classes don't need to be edited.  Once generated, they should be left alone.  All work should be done within visitors.  Should certain node classes require data to be stored in them, we have found that a good alternative solution is to store the data in a Hashtable (using the object itself as the hash key) which can be passed from visitor to visitor as needed.
      • JTB stores actual references to its children's types.  Children of nodes will always be where you expect them.

    Observation: Tokens
    • JJTree: Special productions and classes are needed to store tokens in the parse tree.
    • JTB: JTB automatically stores all tokens in the tree.
      • Advantage: All tokens are readily available for use in the visitors.  No additional work needs to be done to store them.
      • Disadvantage: Memory is wasted on unneeded tokens.

    Observation: General Observations
    • JJTree: I had to continuously refer back to my annotated grammar file to see how I had arranged the parse tree.
    • JTB: JTB inserts helpful comments above visit() methods, indicating precisely which children correspond to what part of the production.  In addition, since the tree will always look the same for a given grammar, there is less ambiguity.
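
    The Hashtable alternative mentioned in the comparison above (storing per-node data outside the node classes, keyed by the node object itself) can be sketched roughly as follows.  The names VarNode, DeclarationPass and ReportPass are hypothetical stand-ins for a generated node class and two visitors; the sketch assumes the node class does not override equals() or hashCode(), so each node object is its own distinct key.

```java
import java.util.Hashtable;

// Hypothetical node class; it does not override equals() or hashCode(),
// so the Hashtable distinguishes nodes by object identity.
class VarNode {
    final String name;
    VarNode(String name) { this.name = name; }
}

// First pass: records a fact about each node without touching the node class.
class DeclarationPass {
    void mark(VarNode n, boolean declared, Hashtable<VarNode, Boolean> info) {
        info.put(n, Boolean.valueOf(declared));
    }
}

// Second pass: reads what the first pass stored.
class ReportPass {
    boolean isUndeclared(VarNode n, Hashtable<VarNode, Boolean> info) {
        Boolean declared = info.get(n);
        return declared == null || !declared.booleanValue();
    }
}

class SideTableDemo {
    public static void main(String[] args) {
        VarNode x = new VarNode("x");
        VarNode y = new VarNode("y");
        Hashtable<VarNode, Boolean> info = new Hashtable<>();

        DeclarationPass first = new DeclarationPass();
        first.mark(x, true, info);
        first.mark(y, false, info);

        // The same table is handed to the next pass.
        ReportPass second = new ReportPass();
        for (VarNode n : new VarNode[] { x, y }) {
            if (second.isUndeclared(n, info)) {
                System.out.println("undeclared: " + n.name);
            }
        }
    }
}
```

    Because the table lives outside the tree, the node classes stay exactly as generated, and regenerating them after a grammar change does not discard any hand-written code.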

    ANTLR

    ANTLR is a tool descended from the Purdue Compiler Construction Tool Set (a.k.a. PCCTS).  It is not a tool for use with JavaCC; rather, it is a set of tools that includes a Java parser generator and a tree builder.  I have never used ANTLR or PCCTS, so I cannot comment on them.  However, you can find the tool at http://java.magelang.com/antlr
     

    NewJacc

    According to the NewJacc documentation, NewJacc is "a parser generator system built upon Sun Microsystems JavaCC tool and the Purdue University Java Tree Builder (JTB) tool. NewJacc's principle extension is to provide users with a way to associate rewrite rules with individual productions in the language grammar. These rules are used to describe how the parse tree should be traversed. Users can easily control what action is performed at each node in the tree during their traversals. This provides users with great leverage in the construction of a variety of source to source translation tools."

    I have not used NewJacc and am not thoroughly familiar with it, but I have tinkered with some of the examples provided with it, and it does look like a very interesting tool.  NewJacc can be found at http://hopper.cs.wvu.edu/software.html
     


    Maintained by Wanjun Wang, wanjun@purdue.edu.
    Created September 10, 1997.
    Last modified Aug. 6, 1999.