rockit: a parser generator for ruby
DESCRIPTION
Discusses Rockit, a parser generator for the Ruby langauge. Touches on versions 0.3.8 and 0.7.2, which are GLR and PEG, respectively. Describes GLR and PEG. Given for http://www.cs.rit.edu/~ats/lp-2006-2/TRANSCRIPT
Rockit
Jason MorrisonLanguage Processors and Compiler Construction
Winter 2006-3
Ruby Rockit: Overview
• It used to be GLR.
• Maybe in the future it will be PEG.
• It’s kind of there.
• Let’s see!
It used to be GLR
– First home I found:• http://rockit.sourceforge.net/
– Version 0.3.8, updated October 2001
– What did it look like?
Ruby Rockit: Overview
• It used to be GLR.
• Maybe in the future it will be PEG.
• It’s kind of there.
• Let’s see!
Ruby Rockit: Overview
• It used to be GLR.– What’s that?
• Maybe in the future it will be PEG.
• It’s kind of there.
• Let’s see!
What’s GLR?
– Generalized LR
– Parse tables have multiple transitions
– When it hits S/R or R/R conflict:• Fork parse stack• Follow both paths• Got a rule that produces no transition?
Discard that stack.
How well does it work?
– Run time scales with degree of nondeterminism.
– O(n) on deterministic grammar.
– Optimize: share common stack prefix/suffix.(Leads to a state lattice rather than stack)
Ruby Rockit: Overview
• It used to be GLR.
• Maybe in the future it will be PEG.
• It’s kind of there.
• Let’s see!
Maybe in the future it will be PEG.
– Rockit 0.8.0
– C-based parser, Ruby production rules
– Unambiguous, lookahead,O(n) Packrat parsing
– Ruby and Java grammars, analyzers, pretty-printers
Ruby Rockit: Overview
• It used to be GLR.
• Maybe in the future it will be PEG.– …PEG? Packrat?
• It’s kind of there.
• Let’s see!
What’s PEG?
• “Parsing Expression Grammar”
• No separate tokenizing phase
• Unambiguous (one tree for valid parse)
• No left recursion (as with LL)
What’s PEG?
• Fixes “dangling else” via prioritization
• Infinite lookahead via syntactic predicates
• Let’s hear it from B ryan Ford (coined PEG)
http://www.bford.info/pub/lang/peg-slides.pdf
What’s Packrat?
• Essentially: recursive descent
• Memoization yields O(n) with unlimited lookahead and backtracking
• Packrat parsers:– Java: Rats! (from xtc, eXTensible C project)– Haskell: Pappy
Packrat: Since when?
• Theory has been formalized since the 70s
• Memoization was too heavy, then.
• Now, RAM is cheap; memoization often consumes less than subsequent compilation steps.
Ruby Rockit: Overview
• It used to be GLR.
• Maybe in the future it will be PEG.
• It’s kind of there.
• Let’s see!
Ruby Rockit: Overview
• It used to be GLR.
• Maybe in the future it will be PEG.
• It’s kind of there.– What do you mean by that?
• Let’s see!
Rockit 0.7.2
– Newest available– Deemed 0.8.0 preview– Pure Ruby (=> slow) recursive-descent parser– Does not support syntactic predicates– Examples are only documentation– Unorganized source– No error hooks
Ruby Rockit: Overview
• It used to be GLR.
• Maybe in the future it will be PEG.
• It’s kind of there.
• Let’s see!– Let’s see it anyways!
Conclusion
• 0.7.2 is interesting, but doesn’t have:– Syntactic predicates– Memoization
• Still denotes grammar in code.– Pros: Quick, familiar.– Cons: Difficult to analyze.
• If 0.8.0 is released as promised, should be great!
Resources
• These slides and I:– http://jayunit.net/rockit– [email protected]
• PEG ML:– https://lists.csail.mit.edu/mailman/listinfo/peg
• Bryan Ford:– http://pdos.csail.mit.edu/~baford/packrat
• Wikipedia, Google for:– GLR parser, PEG grammar, Packrat parser