antlr4 get the right tool for the job

© Zühlke 2015 Antlr4 Get the right tool for the job Antlr4 | Alexander Pacha 24. July 2015 Slide 1

Upload: alexander-pacha

Post on 08-Jan-2017

Category:

Engineering


TRANSCRIPT

Page 1: Antlr4   get the right tool for the job

© Zühlke 2015 Antlr4 | Alexander Pacha

Antlr4: Get the right tool for the job

24. July 2015 Slide 1

Page 2: Antlr4   get the right tool for the job


What is Antlr?

ANTLR stands for "ANother Tool for Language Recognition"

Page 3: Antlr4   get the right tool for the job

Basic Concepts

Languages and Parsers

• Every language has syntax and semantics
• A parser is a syntax analyzer
• Two steps:
  • Lexical analysis: grouping characters into words (tokens)
  • Actual parsing: recognizing the sentence structure and building a parse tree

Example input: int a = 42 + 3
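The lexical-analysis step can be sketched by hand for the example input. This is only an illustration in plain Java, not what Antlr generates: the class name TinyLexer and the token types INT, ID and OP are assumptions made for this sketch, and "int" comes out as an ordinary ID because no keyword rule is defined.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical hand-written lexer; Antlr would generate this step from lexer rules.
final class TinyLexer {
    // One named group per token type; alternatives are tried left to right.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<WS>[ \\t]+)|(?<INT>[0-9]+)|(?<ID>[a-zA-Z]+)|(?<OP>[=+\\-*/()])");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at index " + pos);
            }
            if (m.group("WS") == null) { // whitespace is recognized but skipped
                String type = m.group("INT") != null ? "INT"
                            : m.group("ID") != null ? "ID" : "OP";
                tokens.add(type + "(" + m.group() + ")");
            }
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [ID(int), ID(a), OP(=), INT(42), OP(+), INT(3)]
        System.out.println(tokenize("int a = 42 + 3"));
    }
}
```

The parser then works on this token list instead of on single characters.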

Page 4: Antlr4   get the right tool for the job

Antlr Features

• Parser generator from a specified grammar (in the Antlr meta-language)
• Generated parser in a selected target language (e.g. Java, C#, Python)
• High performance
• High flexibility (e.g. grammar islands, rewriting the input stream)
• Cool features (e.g. error handling, visitors, listeners)

a = (42 + 3

Page 5: Antlr4   get the right tool for the job

Building an application with Antlr

Grammar in Extended Backus-Naur Form (EBNF)

    grammar MyGrammar;
    rule1 : «stuff»;
    rule2 : «more stuff»;

Convenience operators: Optional (?), Zero-or-more (*), One-or-more (+)

Lexer rules

    APPLE: 'apple';
    INT: [0-9]+;

Parser rules

• Sequence:          decimal: INT '.' INT;
• Token dependence:  vector: '[' INT+ ']';
• Choice:            fruit: APPLE | ORANGE;
• Nested phrase:     breakfast: fruit JOGHURT;

Page 6: Antlr4   get the right tool for the job


Grammar Sample

    grammar LabeledExpr;

    prog: stat+ ;

    stat: expr NEWLINE              # printExpr
        | ID '=' expr NEWLINE       # assign
        | CLEAR NEWLINE             # clearCmd
        | NEWLINE                   # blank
        ;

    expr: expr op=('*'|'/') expr    # MulDiv
        | expr op=('+'|'-') expr    # AddSub
        | INT                       # int
        | ID                        # id
        | '(' expr ')'              # parens
        ;

    MUL : '*' ;
    DIV : '/' ;
    ADD : '+' ;
    SUB : '-' ;
    PRINT: 'print';
    CLEAR: 'clear' ;
    ID : [a-zA-Z]+ ;
    INT : [0-9]+ ;
    NEWLINE:'\r'? '\n' ;
    WS : [ \t]+ -> skip ;
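To show what the expr rule describes, here is a minimal hand-written evaluator in plain Java. It is a sketch under assumptions, not the code Antlr generates: the class name TinyCalc is made up, ID operands are left out, and the left-recursive rule is replaced by one method per precedence level so that '*' and '/' bind tighter than '+' and '-', mirroring the order of the MulDiv and AddSub alternatives.

```java
// Hypothetical recursive-descent evaluator for the INT-only part of expr.
final class TinyCalc {
    private static final char END = '\0';
    private final String src;
    private int pos = 0;

    private TinyCalc(String src) { this.src = src; }

    static int eval(String expression) {
        TinyCalc p = new TinyCalc(expression);
        int value = p.addSub();
        if (p.peek() != END) {
            throw new IllegalArgumentException("Unexpected input at index " + p.pos);
        }
        return value;
    }

    // addSub: mulDiv (('+'|'-') mulDiv)*
    private int addSub() {
        int value = mulDiv();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            int rhs = mulDiv();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

    // mulDiv: atom (('*'|'/') atom)*
    private int mulDiv() {
        int value = atom();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            int rhs = atom();
            value = (op == '*') ? value * rhs : value / rhs;
        }
        return value;
    }

    // atom: INT | '(' addSub ')'
    private int atom() {
        if (peek() == '(') {
            pos++;
            int value = addSub();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at index " + pos);
            }
            pos++;
            return value;
        }
        int start = pos; // peek() above already skipped leading whitespace
        while (pos < src.length() && src.charAt(pos) >= '0' && src.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at index " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    // Skips spaces and tabs (like the WS rule) and returns the next character.
    private char peek() {
        while (pos < src.length() && (src.charAt(pos) == ' ' || src.charAt(pos) == '\t')) {
            pos++;
        }
        return pos < src.length() ? src.charAt(pos) : END;
    }

    public static void main(String[] args) {
        System.out.println(eval("42 + 3"));       // prints 45
        System.out.println(eval("(10 - 5) * 2")); // prints 10
    }
}
```

With Antlr, none of this is written by hand: the grammar is the only input, and the tool produces the lexer and parser.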

Page 7: Antlr4   get the right tool for the job

Generated Tree

Sample program:

    a = 42 + 3
    b = (a - 5) * 2
    5 + 4
    clear
    b

Page 8: Antlr4   get the right tool for the job

Listener Sample

    package Sample1;

    public class SimpleListener extends LabeledExprBaseListener {

        @Override
        public void enterInt(LabeledExprParser.IntContext ctx) {
            System.out.println(ctx.getText());
        }

        @Override
        public void enterId(LabeledExprParser.IdContext ctx) {
            System.out.println("ID: " + ctx.getText());
        }
    }

Output:

    42
    3
    ID: a
    5
    2
    5
    4
    ID: b
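Why the listener output appears in source order can be sketched with a tiny depth-first walker. Everything here is a hypothetical stand-in written for illustration (Node, Listener and walk are not Antlr classes); the real walker and base listener come from the Antlr runtime and the generated code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical miniature of the listener mechanism; not the Antlr API.
final class ListenerSketch {
    /** Stand-in for a parse-tree node, labeled like the grammar alternatives. */
    static final class Node {
        final String label;
        final String text;
        final List<Node> children;

        Node(String label, String text, Node... children) {
            this.label = label;
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    interface Listener {
        void enter(Node node);
        void exit(Node node);
    }

    // Depth-first walk: enter fires before the children, exit after them.
    static void walk(Node node, Listener listener) {
        listener.enter(node);
        for (Node child : node.children) {
            walk(child, listener);
        }
        listener.exit(node);
    }

    // Collects the same lines that SimpleListener would print.
    static List<String> collect(Node root) {
        final List<String> out = new ArrayList<>();
        walk(root, new Listener() {
            @Override public void enter(Node n) {
                if (n.label.equals("int")) out.add(n.text);
                if (n.label.equals("id")) out.add("ID: " + n.text);
            }
            @Override public void exit(Node n) { /* nothing to do on exit */ }
        });
        return out;
    }

    // Hand-built tree for the expression "(a - 5) * 2".
    static Node sampleTree() {
        return new Node("MulDiv", "(a-5)*2",
            new Node("parens", "(a-5)",
                new Node("AddSub", "a-5",
                    new Node("id", "a"),
                    new Node("int", "5"))),
            new Node("int", "2"));
    }

    public static void main(String[] args) {
        System.out.println(collect(sampleTree())); // prints [ID: a, 5, 2]
    }
}
```

The walker decides the traversal; the listener only reacts to the events it cares about.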

Page 9: Antlr4   get the right tool for the job

Visitor Sample

    package Sample1;

    public class SimpleVisitor extends LabeledExprBaseVisitor<Object> {

        @Override
        public Object visitAssign(LabeledExprParser.AssignContext ctx) {
            System.out.println(ctx.getText());
            // Returning without visiting the children stops the descent here.
            return null;
            //return super.visitAssign(ctx);
        }

        @Override
        public Object visitAddSub(LabeledExprParser.AddSubContext ctx) {
            System.out.println(ctx.getText());
            return null;
        }
    }

Output:

    a=42+3
    b=(a-5)*2
    5+4

Page 10: Antlr4   get the right tool for the job

Quiz

Goal: Create a grammar to parse CSV files

Sample data:

2,34,1313,33,149,66,94

Bonus 1: Allow , or ; to be used as separator
Bonus 2: Allow integer and decimal values (e.g. 33.15)

Page 11: Antlr4   get the right tool for the job

Example Solution

    grammar CommaSeparatedValues;

    file: row+;
    row: field (',' field)* NEWLINE;
    field: INT;

    INT: [0-9]+;
    NEWLINE: '\r'? '\n';

    // Bonus 1 (replaces the row rule above):
    row: field ((','|';') field)* NEWLINE;

    // Bonus 2 (replaces the field rule above and adds a lexer rule):
    field: INT | DECIMAL;
    DECIMAL: INT '.' INT;
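For comparison, the same language (including both bonus rules) can be recognized by hand in plain Java. This is a rough sketch under assumptions, not generated code: the class name CsvSketch is made up, and fields are returned as BigDecimal so that INT and DECIMAL values share one type.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written equivalent of the CommaSeparatedValues grammar.
final class CsvSketch {
    static List<List<BigDecimal>> parse(String text) {
        // file: row+ and every row ends with NEWLINE ('\r'? '\n')
        if (!text.endsWith("\n")) {
            throw new IllegalArgumentException("Input must end with a newline");
        }
        // Limit -1 keeps empty strings, so blank lines are not silently dropped.
        String[] lines = text.split("\r?\n", -1);
        List<List<BigDecimal>> rows = new ArrayList<>();
        // The last element is the empty string after the final newline.
        for (int i = 0; i < lines.length - 1; i++) {
            List<BigDecimal> row = new ArrayList<>();
            // row: field ((','|';') field)*
            for (String field : lines[i].split("[,;]", -1)) {
                // field: INT | DECIMAL
                if (!field.matches("[0-9]+(\\.[0-9]+)?")) {
                    throw new IllegalArgumentException(
                        "Line " + (i + 1) + ": not an INT or DECIMAL: '" + field + "'");
                }
                row.add(new BigDecimal(field));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // prints [[2, 34], [33.15, 9]]
        System.out.println(parse("2,34\n33.15;9\n"));
    }
}
```

The grammar version is shorter, and it is also easier to extend, for example with quoted text fields, than the string splitting used here.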

Page 12: Antlr4   get the right tool for the job

PostScript Parser Demo

Lisual 2.025
