OSGalaxy

published on 2008-11-21 11:44:08
Davyd Madeley So in order to solve some parsing problems at work, I've recently been messing around with ANTLR, a language recognition tool. Doing things like writing grammars to parse some of our files into ASTs and then walking those ASTs using a regular programming language. However, reading and semantically validating that tree is a problem in itself. Checking if nodes have the right numbers and types of children; executing code based on whether or not quite complex chains of nodes exist; these are the sorts of problems that can come up when trying to comprehend the AST you spent so long generating.

It turns out that ANTLR has a pretty nifty solution to this problem in what it calls tree grammars. Tree grammar rules match the nodes of your AST to execute code (actions in ANTLR-speak) in your target language. Here is an example for a simple calculator.

First a grammar to lex and parse the text stream into a binary tree of operations: Calc.g. The rules are more or less in EBNF form (the ^ symbol means to make this the parent tree node) and implement the precedence and associativity for familiar, infix maths. It turns an expression (e.g. 2 * (1 + 2) / a + abs -4 / -2) into an AST:
Since we're creating an interpreter, what we want to do is put together something that will walk the tree we've generated and do some operations. We do this using our tree grammar: CalcWalker.g (notice how CalcWalker refers to Calc for its tokens). Each subrule here attempts to match a tree node and its children, if the rule matches it executes the provided code in our target language (in this case Java).

Making it run is just a matter of setting things up: Test.java (I thought about writing a Java-GTK calculator, but instead I choose to leave that up to someone else; I also thought about generating it all to Javascript and making a Javascript-based web-calculator, but I couldn't seem to build the Javascript support library).

If you want to have a play, here are the source files: Calc.g, CalcWalker.g and Test.java. You'll also need to download the latest ANTLRv3 .jar. Build it all with:
java -classpath antlr.jar org.antlr.Tool Calc*.g # generate Java source from grammars
javac -classpath antlr.jar Calc*.java Test.java # compile everything
and run it something like:
[davyd@frobisher Calc]$ java -classpath antlr.jar:. Test "2 + 2 + 4 * 2"
ans = 12.0
For extra fun, switch to the file backend to make registers work.

Confused? Don't stress, it took me a week or so to really get my head around it (and that was after seeing a tutorial on it at l.c.a in January and buying the book). Plus, I have like two degrees, so I'm probably a lot smarter than you :-P

double bass


> Read More... | Digg This!