TREETOP: Making Parsers Fun, Monday October 7th @ Capital Factory

Upload: patrick-ritchie

Post on 20-Jun-2015

TRANSCRIPT

Page 1: Treetop austinrb

TREETOP: Making Parsers Fun

Monday October 7th @ Capital Factory

Page 2: Treetop austinrb

Say what? Why do I need a Parser?

Argentina.blft  # real world example, Big Lever Gears Format

propertyUrlPrefix = "propiedad";
landingPageUrlPrefix = "alquileres-vacaciones";
propertyReviewsPrefix = "reviews";
propertyReviewsWritePrefix = "reviews/write";
propertyReviewsConfirmPrefix = "reviews/confirm";
propertyReviewsResponsePrefix = "reviews/response";

Page 3: Treetop austinrb
Page 4: Treetop austinrb

This could work…
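The code from this part of the deck is not in the transcript. As a rough sketch of the kind of quick approach that copes with the flat file above, assuming every statement has the form key = "value"; (the helper name parse_flat is hypothetical):

```ruby
# Naive approach: split the flat text on ';' and '='.
# Assumes every statement looks like: key = "value";
def parse_flat(text)
  text.split(';').each_with_object({}) do |statement, config|
    key, value = statement.split('=', 2).map(&:strip)
    next if key.nil? || key.empty? || value.nil?
    # Drop the surrounding double quotes, if present.
    config[key] = value.delete_prefix('"').delete_suffix('"')
  end
end

sample = 'propertyUrlPrefix = "propiedad";landingPageUrlPrefix = "alquileres-vacaciones";'
config = parse_flat(sample)
puts config['propertyUrlPrefix']   # prints "propiedad"
```

This is fine as long as the format stays flat and every value is a quoted string.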

Page 5: Treetop austinrb

But…

Locale = {
  es_AR;
}
BrandId = 68;
BrandConfig{
  OmnitureAccount {
    Dev = "homeawayardev";
  }
  linguaEnabled = false;
}

Called out on the slide: nesting (the braces), enum (es_AR), number (68), string ("homeawayardev"), boolean (false).

Page 6: Treetop austinrb

Ok… maybe not

# this just extracts the values, we haven’t even
# begun to set the correct type or handle the
# nesting
/([a-zA-Z0-9])+=([a-zA-Z]+[^;]*|'"'[^"]*'"'|[0-9]+|(true|false))/
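To see where the regex route stalls, here is a small sketch that runs a simplified pattern against a nested sample. The pattern is an assumption for illustration, not the expression from the slide. Every value comes back as a string, and the structure implied by the braces is gone.

```ruby
# Simplified key/value pattern (hypothetical, not the slide's regex).
pair_pattern = /([A-Za-z0-9]+)\s*=\s*("[^"]*"|[0-9]+|true|false)\s*;/

nested = <<~BLFT
  BrandId = 68;
  BrandConfig {
    OmnitureAccount {
      Dev = "homeawayardev";
    }
    linguaEnabled = false;
  }
BLFT

# scan returns one [key, raw_value] pair per match.
pairs = nested.scan(pair_pattern)

# The pairs are flat and untyped: 68 and false are still strings, and
# nothing records that Dev was declared inside OmnitureAccount.
pairs.each { |key, value| puts "#{key} => #{value.inspect}" }
```

Recovering the types and the nesting from here means more regexes and manual bookkeeping, which is the point where a real parser starts to look attractive.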

Page 7: Treetop austinrb
Page 8: Treetop austinrb

Ok, so we need a Parser

Page 9: Treetop austinrb
Page 10: Treetop austinrb

.tt (Treetop) Grammar Files
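The grammar shown on this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of what a grammar file for flat key = value; statements could look like. The grammar name Blft, the file name, and all rule names are assumptions made for illustration.

```treetop
# blft.treetop (hypothetical file name)
grammar Blft
  # The first rule is the root: it has to match the whole document.
  rule document
    (space? assignment)* space?
  end

  rule assignment
    key space? '=' space? value space? ';'
  end

  rule key
    [a-zA-Z] [a-zA-Z0-9]*
  end

  # '/' is ordered choice: alternatives are tried left to right.
  rule value
    string / number / boolean
  end

  rule string
    '"' [^"]* '"'
  end

  rule number
    [0-9]+
  end

  rule boolean
    'true' / 'false'
  end

  rule space
    [ \t\r\n]+
  end
end
```

Each rule names a parsing expression; a sequence is written by juxtaposition, and *, + and ? have their usual repetition meanings.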

Page 11: Treetop austinrb

Using your grammars in Ruby

$ tt foo.treetop bar.treetop
$ tt foo.treetop -o foogrammar.rb

Page 12: Treetop austinrb

Using Parsers in Ruby

Page 13: Treetop austinrb

Let’s try this again

Page 14: Treetop austinrb

What about nesting?
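Nesting is where a grammar pays off, because a rule can refer back to itself. The slide's grammar is not in the transcript, so the following is a sketch only, with hypothetical grammar and rule names, written against the nested sample from earlier in the deck (blocks with or without '=', plus bare enum entries such as es_AR;).

```treetop
grammar NestedBlft
  rule document
    (space? statement)* space?
  end

  # Ordered choice: try a block first, then an assignment, then an enum entry.
  rule statement
    block / assignment / enum_value
  end

  # block refers back to statement, so blocks can nest to any depth.
  rule block
    key space? '='? space? '{' (space? statement)* space? '}'
  end

  rule assignment
    key space? '=' space? value space? ';'
  end

  rule enum_value
    [a-zA-Z_]+ space? ';'
  end

  rule key
    [a-zA-Z] [a-zA-Z0-9]*
  end

  rule value
    '"' [^"]* '"' / [0-9]+ / 'true' / 'false'
  end

  rule space
    [ \t\r\n]+
  end
end
```

The recursion between statement and block is the part a single regular expression cannot express.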

Page 15: Treetop austinrb
Page 16: Treetop austinrb

But it gets better!

Page 17: Treetop austinrb

Use Ruby in your .tt files!
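The slide's example is missing from the transcript. A small sketch of the feature, with hypothetical grammar, rule, and method names: a Ruby block placed after a parsing expression adds methods to the syntax node it produces, which is a convenient place to turn matched text into typed values.

```treetop
grammar TypedValues
  rule value
    number / boolean / string
  end

  rule number
    [0-9]+ {
      # text_value is the matched input; convert it to an Integer.
      def to_ruby
        text_value.to_i
      end
    }
  end

  rule boolean
    'true' {
      def to_ruby
        true
      end
    }
    /
    'false' {
      def to_ruby
        false
      end
    }
  end

  rule string
    '"' content:[^"]* '"' {
      # content is a label for the characters between the quotes.
      def to_ruby
        content.text_value
      end
    }
  end
end
```

With a grammar along these lines, calling to_ruby on the node returned by parse would give an Integer, a boolean, or a String instead of raw text.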

Page 18: Treetop austinrb

Alternate Method

Page 19: Treetop austinrb

Bringing it all together

Page 20: Treetop austinrb

Bringing it all together Part II

Page 21: Treetop austinrb
Page 22: Treetop austinrb

Gotchas

• First rule must match the document

Page 23: Treetop austinrb

On Parsing Expression Grammars

Parsing expression grammars (PEGs) are an alternative to context free grammars for formally specifying syntax, and packrat parsers are parsers for PEGs that operate in guaranteed linear time through the use of memoization.

• Linear time, fast!
• Memory hog, storage proportional to the total input size
• Not suitable for natural language processing
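To make the time and memory trade-off concrete, here is a toy sketch of the memoization idea. It is not Treetop's actual implementation, and the class and method names are hypothetical. Results are cached per (rule, position), so no rule body runs twice at the same offset, at the cost of a cache that grows with the input.

```ruby
# Toy packrat-style recognizer for the PEG:
#   sum <- num '+' sum / num
#   num <- [0-9]+
class TinyPackrat
  attr_reader :evaluations

  def initialize(input)
    @input = input
    @memo = {}         # [rule, position] => end position, or nil on failure
    @evaluations = 0   # how many times a rule body actually ran
  end

  # True when the sum rule consumes the entire input.
  def match?
    sum(0) == @input.length
  end

  private

  # Run the rule body once per (rule, position); reuse the result afterwards.
  def memoize(rule, pos)
    key = [rule, pos]
    return @memo[key] if @memo.key?(key)
    @evaluations += 1
    @memo[key] = yield
  end

  def num(pos)
    memoize(:num, pos) do
      finish = pos
      finish += 1 while finish < @input.length && @input[finish].match?(/[0-9]/)
      finish > pos ? finish : nil
    end
  end

  def sum(pos)
    memoize(:sum, pos) do
      after_num = num(pos)
      after_sum = nil
      after_sum = sum(after_num + 1) if after_num && @input[after_num] == '+'
      # Ordered choice: "num '+' sum" first, then fall back to plain "num".
      # The fallback call to num(pos) is answered from the cache.
      after_sum || num(pos)
    end
  end
end

parser = TinyPackrat.new('1+22+333')
puts parser.match?        # true
puts parser.evaluations   # 6
```

The memo table holds an entry for every rule at every position tried, which is where the "storage proportional to the total input size" cost comes from.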

Page 24: Treetop austinrb

Further Reading

Parsing Expression Grammars
• http://en.wikipedia.org/wiki/Parsing_expression_grammar
• http://bford.info/packrat/

Treetop
• http://treetop.rubyforge.org/
• http://github.com/nathansobo/treetop/tree/master
• https://groups.google.com/forum/#!forum/treetop-dev

Page 25: Treetop austinrb

Thank You!

Patrick Ritchie
@pritchie