What we will talk about
Parsing...
what it is...
the tools to make it!
But not how to do it!
show some examples...
and compare their efficiency.
Alberto Simoes Parsing in Perl
Parsing
In computer science, parsing is the process of analyzing an input sequence (read from a file or a keyboard, for example) in order to determine its grammatical structure with respect to a given formal grammar. It is formally named syntax analysis. A parser is a computer program that carries out this task. The name is analogous with the usage in grammar and linguistics.
Parsing transforms input text into a data structure, usually a tree, which is suitable for later processing and which captures the implied hierarchy of the input. Generally, parsers operate in two stages, first identifying the meaningful tokens in the input, and then building a parse tree from those tokens.
Wikipedia (August 2006)
The Process
Lexical analysis is the processing of an input sequence of characters (such as the source code of a computer program) to produce, as output, a sequence of symbols called “lexical tokens”, or just “tokens”. For example, lexers for many programming languages convert the character sequence 123 abc into two tokens: 123 and abc (whitespace is not a token in most languages). The purpose of producing these tokens is usually to forward them as input to another program, such as a parser.
Syntax analysis is a process in compilers that recognizes the structure of programming languages. It is also known as parsing.
Wikipedia (August 2006)
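The lexical analysis step quoted above can be sketched in a few lines of plain Perl. This is only an illustration, not code from the talk: the function name tokenize and the token type names (Number, Word, Symbol) are made up here. It walks the input with \G and the /gc flags, so each pattern is tried exactly where the previous token ended.

```perl
use strict;
use warnings;

# Split a string into [type, text] pairs; whitespace is skipped, not returned.
# The name "tokenize" and the type labels are hypothetical.
sub tokenize {
    my ($input) = @_;
    my @tokens;
    pos($input) = 0;
    while (pos($input) < length $input) {
        if    ($input =~ /\G\s+/gc)      { next }
        elsif ($input =~ /\G(\d+)/gc)    { push @tokens, [ Number => $1 ] }
        elsif ($input =~ /\G([a-z]+)/gc) { push @tokens, [ Word   => $1 ] }
        elsif ($input =~ /\G(.)/gc)      { push @tokens, [ Symbol => $1 ] }
    }
    return @tokens;
}

# The example from the quote: "123 abc" yields two tokens.
printf "%s(%s)\n", @$_ for tokenize('123 abc');
```

A parser would then consume this token list instead of the raw characters.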
Approaches
Top-down parsing - A parser can start with the start symbol and try to transform it to the input. Intuitively, the parser starts from the largest elements and breaks them down into incrementally smaller parts. LL parsers are examples of top-down parsers.
Bottom-up parsing - A parser can start with the input and attempt to rewrite it to the start symbol. Intuitively, the parser attempts to locate the most basic elements, then the elements containing these, and so on. LR parsers are examples of bottom-up parsers. Another term used for this type of parser is Shift-Reduce parsing.
Wikipedia (August 2006)
What is Parsing?
to recognize portions of text:
detect tokens;
integers, reals, strings, variables, reserved words, etc.
analyze a specific token sequence:
detect syntax;
define the order in which tokens make sense;
interpret the sequence and perform an action:
perform semantic actions;
execute the code defined; generate code;
So, Regular Expressions?
yes!
RegExps are good for tokens;
RegExps are good for regular expressions :-)
no!
most real grammars can’t be parsed with RegExps;
Then?
Typically:
flex for lexical analysis (re2c for thread safety and reentrancy);
bison for syntactic analysis (lemon for thread safety and reentrancy);
but that is for C;
Perl 5 has lexical analysis (RegExps);
Perl 5 doesn’t have Grammar Support;
but we have CPAN!
Parse::RecDescent;
Parse::Yapp;
Parse::YALALR;
Perl 6 will have Grammar Support (Hurray!)
PGE — Parrot Grammar Engine;
What I’ve tested
flex + bison;
re2c + lemon;
Parse::RecDescent;
Parse::Yapp;
flex + Parse::Yapp;
Parrot Grammar Engine
flex+bison and re2c+lemon will appear only at the end, as a baseline of efficiency.
My Test Case (1/2)
a simple calculator;
sums, subtractions, variables, prints;
BNF:
Program    ← Statement Program
           | Statement
Statement  ← Variable '=' Expression ';'
           | 'print' Expression ';'
Expression ← Expression '-' Expression
           | Expression '+' Expression
           | Variable
           | Number
Number     ← /\d+/
Variable   ← /[a-z]+/
My Test Case (2/2)
automatic test generation;
randomly add, subtract and define variables;
randomly print variables;
example:
a = 10;
a = 150 - a + 350;
print a;
different test sizes:
10 lines;
100 lines;
1 000 lines;
10 000 lines;
100 000 lines;
1 000 000 lines;
2 000 000 lines;
4 000 000 lines;
6 000 000 lines;
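The generator itself is not shown in the slides. A minimal sketch of what such a test generator could look like follows, under several assumptions: the names generate_program and operand, the pool of five variables, the roughly 20% share of print statements, and the one to four operands per expression are all invented here for illustration.

```perl
use strict;
use warnings;

# Pick a random operand: a number below 1000 or a variable name.
sub operand {
    my ($vars) = @_;
    return rand() < 0.5 ? int(rand 1000) : $vars->[ rand @$vars ];
}

# Build $lines random statements in the calculator language.
# An optional $seed makes the output reproducible.
sub generate_program {
    my ($lines, $seed) = @_;
    srand($seed) if defined $seed;
    my @vars = ('a' .. 'e');                  # hypothetical variable pool
    my @out;
    for (1 .. $lines) {
        if (rand() < 0.2) {                   # assumed ratio of prints
            push @out, 'print ' . $vars[ rand @vars ] . ';';
            next;
        }
        # Assignment: one operand, then up to three "+ operand" / "- operand".
        my $expr = operand(\@vars);
        $expr .= (rand() < 0.5 ? ' + ' : ' - ') . operand(\@vars)
            for 1 .. int(rand 4);
        push @out, $vars[ rand @vars ] . " = $expr;";
    }
    return join("\n", @out) . "\n";
}

print generate_program(10, 42);
```

Redirecting the output to a file for each of the sizes listed above would give inputs of the same shape as the example.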
Parse::RecDescent ID
Author: Damian Conway
Latest Release: 1.94 (April 9, 2003)
Available from: CPAN
Parse::RecDescent rationale
⇑ full Perl implementation;
⇑ mixed lexical and syntactic analyzer in the same code;
⇓ slow;
⇓ only supports LL(1) grammars;
Parse::RecDescent
use strict;
use warnings;
use Parse::RecDescent;

our %VAR;    # variable store shared with the grammar actions

# Expression is right-recursive, so operators associate to the right.
my $grammar = q{
    Program:    Statement(s) /\Z/ { 1 }

    Statement:  Var '=' Expression ';'  { $main::VAR{$item[1]} = $item[3]; }
             |  /print/ Expression ';'  { print "> $item[2]\n"; }

    Expression: Number '+' Expression   { $item[1] + $item[3] }
             |  Number '-' Expression   { $item[1] - $item[3] }
             |  Var '+' Expression      { ($main::VAR{$item[1]} || 0) + $item[3] }
             |  Var '-' Expression      { ($main::VAR{$item[1]} || 0) - $item[3] }
             |  Var                     { $main::VAR{$item[1]} || 0; }
             |  Number                  { $item[1]; }

    Number:     /\d+/
    Var:        /[a-z]+/
};

my $parser = Parse::RecDescent->new($grammar);

undef $/;                   # slurp mode: read all of STDIN at once
my $text = <STDIN>;
$parser->Program($text) or die "** Parse Error **\n";
Problems
Unfortunately, the program does not respect left association of the operators. Couldn’t manage to solve that (didn’t try hard).
3 − 2 + 1 is evaluated as Number(3) − Expression(2 + 1), thus evaluating to 0 instead of the correct answer: 2
Well, I had a cheat version, but it made the test program a lot slower than it is at the moment.
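One common way to get left association in a top-down parser is to replace the right recursion with a loop that folds operands from left to right. The sketch below shows the idea in plain Perl, independent of Parse::RecDescent. It is not the "cheat version" mentioned above and was not part of the benchmark; the function names eval_expression and operand are invented for this illustration.

```perl
use strict;
use warnings;

our %VAR;    # variable store, named as in the slides

# Value of a single operand: a number, or a variable (0 if unset).
sub operand {
    my ($token) = @_;
    return $token =~ /^\d+$/ ? $token : ($VAR{$token} || 0);
}

# Evaluate "operand (('+' | '-') operand)*" strictly from left to right.
# Assumes well-formed input; there is no error handling in this sketch.
sub eval_expression {
    my ($text) = @_;
    my @tokens = $text =~ /\d+|[a-z]+|[-+]/g;
    my $value  = operand(shift @tokens);
    while (@tokens) {
        my $op  = shift @tokens;
        my $rhs = operand(shift @tokens);
        $value = $op eq '+' ? $value + $rhs : $value - $rhs;
    }
    return $value;
}

print eval_expression('3 - 2 + 1'), "\n";    # (3 - 2) + 1, prints 2
```

Each operator is applied as soon as its right operand is read, so 3 − 2 is computed before + 1.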
Parse::RecDescent timings
test size    spent time
10           0.104 s
100          0.203 s
1 000        1.520 s
10 000       87.310 s
Parse::RecDescent Memory Usage
[Figure: heap usage over time for "perl recdes.pl", 1,778,617,585,999 bytes x ms. Time axis ticks from 0 to 240 000 ms; heap axis ticks from 0M to 6M bytes. Series: heap-admin, x809F54B:Perl_safesysrea, x809F49D:Perl_safesysmal. Test file with 10 000 lines.]
Parse::Yapp ID
Author: Francois Desarmenien
Latest Release: 1.05 (Nov 4, 2001)
Available from: CPAN
Parse::Yapp rationale
⇑ full Perl implementation;
⇑ supports bison-like LR grammars;
⇓ you need to specify your own lexical analyzer;
⇓ slow for big input files...
if you do not prepare a good lexical analyzer;
Parse::Yapp
%left '+' '-'
%%
Program    : Statement
           | Program Statement
           ;
Statement  : Var '=' Expression ';'       { $main::VAR{$_[1]} = $_[3] }
           | Print Expression ';'         { print "> $_[2]\n" }
           ;
Expression : Expression '-' Expression    { $_[1] - $_[3] }
           | Expression '+' Expression    { $_[1] + $_[3] }
           | Var                          { $main::VAR{$_[1]} || 0 }
           | Number                       { $_[1] }
           ;
%%
our %VAR;
my $p = Calc->new();
undef $/;                   # slurp mode: read all of STDIN at once
my $File = <STDIN>;
$p->YYParse(yylex => \&yylex, yyerror => \&yyerror);
sub yyerror {
    if ($_[0]->YYCurtok) {
        printf STDERR ('Error: a "%s" (%s) was found where %s was expected' . "\n",
            $_[0]->YYCurtok, $_[0]->YYCurval, $_[0]->YYExpect)
    } else {
        print STDERR "Expecting one of ", join(", ", $_[0]->YYExpect), "\n";
    }
}
sub yylex {
    for ($File) {
        s!^\s+!!;                               # Advance spaces
        return ("", undef) if $_ eq "";         # EOF
        # Tokens
        s!^(\d+)!!    and return ("Number", $1);
        s!^print!!    and return ("Print", "print");
        s!^([a-z]+)!! and return ("Var", $1);
        # Operators (anchored; '-' first so it is not read as a range)
        s!^([-;+=])!! and return ($1, $1);
        die "Unexpected symbols: '$File'\n";
    }
}
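The lexer above consumes the input by deleting each matched prefix with s!^...!!, which rewrites the remaining string on every token. That is a plausible contributor to the slowdown on big files, although the talk did not measure it separately. A sketch of an alternative that only advances a match position with \G and /gc is shown below. The name make_lexer is invented here, and the \b after print is an addition so that a variable such as printer is not split. The return convention (token name and value, then an empty token name at end of input) follows what the lexer above already does.

```perl
use strict;
use warnings;

# Return a closure that yields (token, value) pairs from $text.
# The string is never modified; only its match position moves forward.
sub make_lexer {
    my ($text) = @_;
    pos($text) = 0;
    return sub {
        $text =~ /\G\s+/gc;                                 # skip whitespace
        return ('', undef) if pos($text) >= length $text;   # end of input
        return ('Number', $1) if $text =~ /\G(\d+)/gc;
        return ('Print',  $1) if $text =~ /\G(print)\b/gc;
        return ('Var',    $1) if $text =~ /\G([a-z]+)/gc;
        return ($1, $1)       if $text =~ /\G([-;+=])/gc;
        die "Unexpected input at offset " . pos($text) . "\n";
    };
}

# Print the token stream for a two-statement program.
my $next_token = make_lexer("a = 150 - a;\nprint a;");
while (1) {
    my ($token, $value) = $next_token->();
    last if $token eq '';
    print "$token\t$value\n";
}
```

Under the same assumptions, the closure could be passed to the parser in place of the named sub, for example yylex => make_lexer($File); any arguments the parser passes to it are simply ignored.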
Parse::Yapp timings
test size    Parse::RecDescent    Parse::Yapp
10           0.104 s              0.016 s
100          0.203 s              0.034 s
1 000        1.520 s              0.272 s
10 000       87.310 s             4.972 s
100 000      -                    2 253.657 s
Parse::Yapp Memory Usage
[Figure: heap usage over time for "perl Calc.pl", 74,532,562,124 bytes x ms. Time axis ticks from 0 to 60 000 ms; heap axis ticks from 0k to 1,200k bytes. Series: x809F54B:Perl_safesysrea, heap-admin, x809F49D:Perl_safesysmal. Test file with 10 000 lines.]
Parse::Yapp + flex ID
Idea by: Alberto Simoes
Latest Release: n/a
Available from: The Perl Review v0i3, 2002
Parse::Yapp + flex rationale
⇑ fast and robust for big input files;
⇑ supports bison-like LR grammars;
⇓ gluing Perl and C takes some work;
⇓ you need a C compiler;
⇓ you need to know a little C and flex;
Parse::Yapp + flex: the lexical analyzer
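%{
#include <string.h>
/* Assumption: the scanner is generated with a "perl_yy" prefix
   (for example flex -Pperl_yy), which maps yylex and yytext to
   the perl_ names used at the end of this file. */
#define YY_DECL char *yylex(void)
char buffer[15];    /* holds the token name handed back to Perl */
%}
%%
"print"   { return strcpy(buffer, "Print"); }
[0-9]+    { return strcpy(buffer, "Number"); }
[a-z]+    { return strcpy(buffer, "Var"); }
\n        { }
" "       { }
.         { return strcpy(buffer, yytext); }
%%
int perl_yywrap(void) { return 1; }
char *perl_yylextext(void) { return perl_yytext; }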
Parse::Yapp + flex: the syntactic analyzer
%left '+' '-'
%%
Program    : Statement
           | Program Statement
           ;
Statement  : Var '=' Expression ';'       { $main::VAR{$_[1]} = $_[3] }
           | Print Expression ';'         { print "> $_[2]\n"; }
           ;
Expression : Expression '-' Expression    { $_[1] - $_[3] }
           | Expression '+' Expression    { $_[1] + $_[3] }
           | Var                          { $main::VAR{$_[1]} || 0 }
           | Number                       { $_[1] }
           ;
%%
our %VAR;
Parse::Yapp + flex: just that?
NO!
you need XS glue code;
you need some Perl glue code;
you need a decent makefile;
Can you give details?
Check my article “Cooking Perl with Flex” in TPR v0i3, 2002;
http://alfarrabio.di.uminho.pt/~albie/publications/perlflex.pdf
Parse::Yapp + flex timings
test size    Parse::RecDescent    Parse::Yapp    Parse::Yapp + flex
10           0.104 s              0.016 s        0.034 s
100          0.203 s              0.034 s        0.049 s
1 000        1.520 s              0.272 s        0.174 s
10 000       87.310 s             4.972 s        1.168 s
100 000      -                    2 253.657 s    12.145 s
1 000 000    -                    -              122.377 s
2 000 000    -                    -              264.219 s
4 000 000    -                    -              530.527 s
6 000 000    -                    -              800.705 s
Parse::Yapp + flex Memory Usage
[Figure: heap usage over time for "perl parse.pl", 20,106,601,308 bytes x ms. Time axis ticks from 0 to 22 000 ms; heap axis ticks from 0k to 600k bytes. Series: x809F54B:Perl_safesysrea, x4032CAF:perl_yyalloc, heap-admin, x809F49D:Perl_safesysmal. Test file with 10 000 lines.]
Parrot Grammar Engine ID
Author: mostly, Patrick Michaud
Latest Release: not yet released
Available from: Parrot releases or Parrot SVN tree
PGE rationale
⇑ built into Perl 6;
⇑ includes constructs to simplify the LL(1) constraint;
⇕ not yet fast... but we are working on it;
⇓ mainly a top-down parser (although bottom-up should also be supported);
⇓ at the moment you need to write semantic actions in PIR;
PGE implementation
grammar Benchmark;
token program { <?statement>+ }
rule statement {
    | print <expression> ;   {{ $I0 = match['expression'];
                                print $I0; print "\n" }}
    | <var> = <expression> ; {{ $P0 = match['expression'];
                                $S0 = match['var']; set_global $S0, $P0 }}
}
rule expression { <value> [ <add> | <sub> ]* {{ $I0 = match['value']
                                # 25 lines removed...
                                .return($I0) }}
}
rule add { \+ <value> }
rule sub { \- <value> }
rule value { <number> {{ $I0 = match['number']; .return ($I0) }}
           | <var>    {{ $S0 = match['var'];
                         $P0 = get_global $S0; $I0 = $P0; .return($I0) }}
}
token number { \d+ }
token var { <[a..z]>+ }
PGE timings
test size    Parse::RecDescent    Parse::Yapp    Parse::Yapp + flex    PGE
10           0.104 s              0.016 s        0.034 s               0.124 s
100          0.203 s              0.034 s        0.049 s               0.253 s
1 000        1.520 s              0.272 s        0.174 s               1.463 s
10 000       87.310 s             4.972 s        1.168 s               16.189 s
100 000      -                    2 253.657 s    12.145 s              665.746 s
1 000 000    -                    -              122.377 s             -
2 000 000    -                    -              264.219 s             -
4 000 000    -                    -              530.527 s             -
6 000 000    -                    -              800.705 s             -
PGE Memory Usage
[Figure: memory profile of "../../../../parrot -j main.pir" (92,090,753,626 bytes x ms). x-axis: time, 0 to 12 000 ms; y-axis: bytes, 0M to 8M. Legend: x417A880:mem__sys_reallo, heap-admin, x417A82F:mem__internal_a, x417A73D:mem_sys_allocat, x417A7DF:mem_sys_allocat. Test file with 10 000 lines.]
Remember I had C implementations?
Let’s look into their memory usage.
Timings for C implementations
test size    Parse::RecDescent   Parse::Yapp   Yapp + flex   PGE         re2c + lemon   flex + bison
10           0.104 s             0.016 s       0.034 s       0.124 s     0.001 s        0.001 s
100          0.203 s             0.034 s       0.049 s       0.253 s     0.001 s        0.001 s
1 000        1.520 s             0.272 s       0.174 s       1.463 s     0.002 s        0.002 s
10 000       87.310 s            4.972 s       1.168 s       16.189 s    0.009 s        0.009 s
100 000      -                   2 253.657 s   12.145 s      665.746 s   0.089 s        0.103 s
1 000 000    -                   -             122.377 s     -           0.850 s        0.862 s
2 000 000    -                   -             264.219 s     -           1.896 s        1.891 s
4 000 000    -                   -             530.527 s     -           4.327 s        3.604 s
6 000 000    -                   -             800.705 s     -           5.681 s        5.665 s
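One way to read these timings is to ask how much the run time grows each time the input grows tenfold: a parser that scales linearly should stay near a factor of 10. The Python sketch below computes that factor from the figures in the table. The dictionary layout and the function name growth_factors are illustrative choices, and only the tenfold steps (10 to 1 000 000 lines) are included.

```python
# Timings in seconds, copied from the table (test size in lines -> time).
timings = {
    "Parse::RecDescent": {10: 0.104, 100: 0.203, 1_000: 1.520, 10_000: 87.310},
    "Parse::Yapp": {10: 0.016, 100: 0.034, 1_000: 0.272, 10_000: 4.972,
                    100_000: 2253.657},
    "Parse::Yapp + flex": {10: 0.034, 100: 0.049, 1_000: 0.174, 10_000: 1.168,
                           100_000: 12.145, 1_000_000: 122.377},
    "PGE": {10: 0.124, 100: 0.253, 1_000: 1.463, 10_000: 16.189,
            100_000: 665.746},
    "re2c + lemon": {10: 0.001, 100: 0.001, 1_000: 0.002, 10_000: 0.009,
                     100_000: 0.089, 1_000_000: 0.850},
    "flex + bison": {10: 0.001, 100: 0.001, 1_000: 0.002, 10_000: 0.009,
                     100_000: 0.103, 1_000_000: 0.862},
}


def growth_factors(series):
    """Map each pair of consecutive sizes to time(larger) / time(smaller)."""
    sizes = sorted(series)
    return {(a, b): series[b] / series[a] for a, b in zip(sizes, sizes[1:])}


for tool, series in timings.items():
    steps = ", ".join(
        f"{a}->{b}: x{factor:.1f}"
        for (a, b), factor in growth_factors(series).items()
    )
    print(f"{tool:20s} {steps}")
```

On these numbers, Parse::RecDescent grows by roughly a factor of 57 between 1 000 and 10 000 lines and Parse::Yapp by more than 400 between 10 000 and 100 000, while Yapp + flex and both C implementations stay close to 10 once start-up cost stops dominating. The sub-millisecond C timings at small sizes are at the limit of the table's precision, so their early factors say little.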
flex+bison Memory Usage
[Figure: memory profile of "parser" (16,427,193 bytes x ms). x-axis: time, 0 to 350 ms; y-axis: bytes, 0k to 60k. Legend: x401914F:posix_memalign, x40625FE:g_malloc0, x80492D9:yyalloc. Test file with 10 000 lines.]
re2c+lemon Memory Usage
[Figure: memory profile of "parser" (1,418,530 bytes x ms). x-axis: time, 0 to 300 ms; y-axis: bytes, 0k to 6k. Legend: heap-admin, x8048BD2:ParseAlloc, x401914F:posix_memalign, x40625FE:g_malloc0. Test file with 10 000 lines.]
Performance Comparison
[Figure: log-log plot of time (seconds, 0.001 to 10000) against test size (lines, 10 to 1e+07). Series: re2c+lemon, bison+flex, Parse::Yapp + flex, PGE, Parse::Yapp, Parse::RecDescent.]
Thanks!!
Luciano Rocha for the flex + bison and re2c + lemon implementations;
Ruben Fonseca for the PGE idea;
Patrick Michaud and Kevin Tew for the PGE implementation;
and, of course, Larry Wall, Gloria Wall, Leopold Toetsch, Chip Salzenberg, Allison Randal, Damian Conway, Anna Kournikova, Francois Desarmenien, Jerry Gay, Will Coleda, Simon Cozens, Vern Paxson, Jef Poskanzer, Kevin Gong, brian d foy, Santa Claus, Audrey Tang, Jose Joao Almeida, Batman, Jonathan Scott Duff, Nuno Carvalho, Marty Pauley, Leon Brocard, Josette Garcia, James Tisdall, Jose Castro, Michael Schwern, Pamela Anderson, Andy Lester, Abigail, Nicholas Clark, Magda Joana Silva, Matt Diephouse, Ilya Martynov, Wikipedia, Randal Schwartz, Dan Sugalski, Jon Orwant, Tom Christiansen, Johan Vromans, ...