Binary Studio Academy PRO: ANTLR course by Alexander Vasiltsov (lesson 4)
ANTLR 4
Cool Features
by Alexander Vasiltsov
What features?
● Recursive rules
● Precedence
● Associativity
● Actions
● Attributes
● Semantic predicates
● Commands
Recursive rules
expr : expr '*' expr
     | expr '+' expr
     | INT
     | '(' expr ')'
     ;

expr : addExpr;
addExpr : multExpr ('+' multExpr)*;
multExpr : atom ('*' atom)*;
atom : INT;
Precedence
A rule or alternative that appears earlier in the grammar has higher priority than those that follow it
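A minimal sketch of how ordering sets precedence. The grammar name Calc and the lexer rules INT and WS are assumptions added for illustration. Because the '*' alternative is listed before the '+' alternative, the input 1+2*3 is grouped as 1+(2*3).

```antlr
grammar Calc;

// Alternatives listed first bind tighter: '*' comes before '+'.
expr : expr '*' expr
     | expr '+' expr
     | INT
     | '(' expr ')'
     ;

// Lexer rules follow the same idea: when two rules match the same text,
// the rule defined first wins.
INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```

Swapping the first two alternatives would make '+' bind tighter than '*'.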
Associativity
By default, ANTLR associates operators left to right
It’s possible to change the direction of associativity
expr : <assoc=right> expr '^' expr
     | INT
     ;

expr : expr '^' expr
     | INT
     ;
assoc option
expr : <assoc=right> expr '^' expr
     | expr '*' expr
     | expr '+' expr
     | INT
     | '(' expr ')'
     ;
Actions
rule  : subrule { <some code in target language> }
      ;
rule2 : alternative1 { <some code for alternative1> }
      | alternative2 { <some code for alternative2> }
      ;
rule3 : subrule1 { <some code after subrule1 matched> }
        TOKEN    { <some code after TOKEN matched> }
        subrule2 { <some code after whole rule matched> }
      ;
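A concrete sketch of an embedded action for the C# target. The grammar name Greeting and the rules greeting, NAME and WS are hypothetical. The action runs once NAME has been matched and prints the matched text.

```antlr
grammar Greeting;

@header { using System; }

// The action is placed after NAME, so it runs once NAME has been matched.
greeting : 'hello' NAME { Console.WriteLine("Hi, " + $NAME.text + "!"); } ;

NAME : [a-zA-Z]+ ;
WS   : [ \t\r\n]+ -> skip ;
```

The code inside the braces is copied into the generated parser, so it must be valid in the target language (C# here).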
Actions and attributes
<rulename> [<arguments list>]
    returns [<return values>]
    locals [<local members>]
@init {
    <init code>
}
@after {
    <finalization code>
}
    : <rule description>
    ;
Actions and attributes (2)
rule [int i, string s] returns [string[] vals, int max]
    locals [int loc=0]
@init {
    //some init code here
}
@after {
    //some post-processing code
}
    : 'TOKENS'+
    ;

public partial class RuleContext : ParserRuleContext {
    public int i;
    public string s;
    public string[] vals;
    public int max;
    public int loc;

    public RuleContext(ParserRuleContext parent, int invokingState, int i, string s)
        : base(parent, invokingState)
    {
        this.i = i;
        this.s = s;
    }
    ...
}

public RuleContext rule(int i, string s) {
    //some init code here
    ... //parsing logic here
    //some post-processing code
}
Custom attributes
parent_rule : rule[0, "test"] { Console.WriteLine("max: " + $rule.max); } ;

rule [int i, string s] returns [string[] vals, int max]
    locals [int loc=0]
@init {
    $ctx.vals = new[] {s};
    $ctx.loc = i + 10;
}
@after {
    $ctx.max *= 10;
}
    : (TOKEN { if ($ctx.loc > $TOKEN.text.Length) $ctx.max++; })+
    ;
Predefined token attributes
Attribute  Type    Description
text       string  Text matched
type       int     Token type
line       int     Line number (counting from 1)
pos        int     Position in line (counting from 0)
index      int     Index of the token in the token stream (counting from 0)
channel    int     Channel on which the token was emitted
int        int     Integer value of the token text (the text must be a valid number)
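A short sketch of reading token attributes inside an action (C# target). The grammar name Decl and the rules decl, ID and WS are assumptions made for illustration. The action reports the identifier together with the line and column where it was found.

```antlr
grammar Decl;

@header { using System; }

// $ID.text, $ID.line and $ID.pos refer to the ID token matched in this rule.
decl : 'var' ID ';'
       { Console.WriteLine($ID.text + " declared at " + $ID.line + ":" + $ID.pos); }
     ;

ID : [a-zA-Z_]+ ;
WS : [ \t\r\n]+ -> skip ;
```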
Predefined rule attributes
Attribute       Type               Description
$ctx            ParserRuleContext  Context object
$ctx.GetText()  string             Text matched
$ctx.start      IToken             First token
$ctx.stop       IToken             Last token
Where else to place code
grammar MyGrammar;
@header {
using System;
}
@members {
private int myField;
public int MyProperty {get; set;}
public void MyMethod() {
//do something
}
}
rule : subrule TOKEN+ ;
grammar MyGrammar;
@lexer::header { ... }
@parser::header { ... }
@lexer::members { ... }
@parser::members { ... }
rule : subrule TOKEN+ ;
Bonus for C#
● MyGrammar.g4.lexer.cs
namespace MyNamespace
{
    partial class SimpleLexer {}
}
● MyGrammar.g4.parser.cs
namespace MyNamespace
{
    partial class SimpleParser {}
}
Semantic predicates
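A semantic predicate is a boolean expression in the target language, written as { ... }?, that switches an alternative on or off at parse time. The sketch below is an assumption-based illustration for the C# target, not taken from the slides: the grammar name EnumDemo, the field EnumIsKeyword and all rule names are hypothetical.

```antlr
grammar EnumDemo;

// Hypothetical switch: decides whether the word enum is a keyword.
@parser::members {
    public bool EnumIsKeyword = true;
}

// This rule is viable only while the predicate is true.
enumDecl : {EnumIsKeyword}? 'enum' id '{' id (',' id)* '}' ;

// When the predicate allows it, the word enum is accepted as a plain identifier.
id : ID
   | {!EnumIsKeyword}? 'enum'
   ;

ID : [a-zA-Z]+ ;
WS : [ \t\r\n]+ -> skip ;
```

An alternative whose predicate evaluates to false is treated as if it were not in the grammar.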
Lexer commands
TokenName : alternative -> command-name[(parameter)] ;
skip         Skips the token; it is not sent to the parser
type(T)      Sets the token type to T
channel(C)   Sends the token to channel C
mode(M)      Switches the lexer to mode M
pushMode(M)  Switches the lexer to mode M and pushes the current mode onto the stack
popMode      Switches the lexer to the mode popped from the top of the stack
more         Matches the text but keeps searching; the match becomes part of the next token
type(T)
Sets the token type to T
channel(C)
Sends the token to the desired channel
Predefined channels:
● Token.DEFAULT_CHANNEL
● Token.HIDDEN_CHANNEL
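A small lexer sketch that uses type, channel and skip together. The grammar name CommandsDemo and the rule names are assumptions. Inside a grammar the hidden channel is written as HIDDEN.

```antlr
lexer grammar CommandsDemo;

// STRING has no rule of its own; it is only declared as a token type.
tokens { STRING }

// Both quote styles are reported to the parser as STRING.
DOUBLE  : '"' .*? '"'    -> type(STRING) ;
SINGLE  : '\'' .*? '\''  -> type(STRING) ;

// Comments stay in the token stream but travel on the hidden channel.
COMMENT : '//' ~[\r\n]*  -> channel(HIDDEN) ;

// Whitespace is discarded completely.
WS      : [ \t\r\n]+     -> skip ;
```

Tokens on the hidden channel are ignored by the parser but remain available to tools that read the full token stream.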
mode(M), pushMode(M), popMode
Switch the lexer to the given mode.
Each mode contains its own set of rules
Default mode: DEFAULT_MODE
more
Matches the text but continues searching; the matched text becomes part of the next token
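A sketch that combines the mode commands with more. The grammar name ModesDemo, the mode names TAG and STR, and all rule names are hypothetical. Modes are only available in lexer grammars, not in combined ones.

```antlr
lexer grammar ModesDemo;

// Default mode: plain text, with '<' and '"' opening the other modes.
OPEN   : '<'  -> pushMode(TAG) ;
LQUOTE : '"'  -> more, mode(STR) ;
TEXT   : ~[<"]+ ;

// Inside a tag: names and whitespace until the closing '>'.
mode TAG;
CLOSE  : '>'  -> popMode ;
NAME   : [a-zA-Z]+ ;
WS     : [ \t\r\n]+ -> skip ;

// Inside a string: keep collecting characters until the closing quote.
mode STR;
STRING   : '"' -> mode(DEFAULT_MODE) ;
STR_CHAR : .   -> more ;
```

For the input "abc" the lexer emits a single STRING token whose text includes both quotes, because every rule that ends in more hands its text over to the next match.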
For those who like hardcore
It’s possible to override any method in the generated lexer and parser classes. Choose one that fits your needs and act!