Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim Climbing the Abstract Syntax Tree James Titcumb PHP UK 2018

Upload: james-titcumb

Post on 18-Mar-2018

TRANSCRIPT

Page 1: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Climbing the Abstract Syntax Tree

James Titcumb
PHP UK 2018

Page 2: Climbing the Abstract Syntax Tree (PHP UK 2018)

$ whoami

James Titcumb

www.jamestitcumb.com

www.roave.com

@asgrim

Page 3: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

How PHP works

PHP code

OpCache

Execute (VM)

Lexer + Parser

Compiler

Page 4: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

The PHP Lexer

zend_language_scanner.l

Page 5: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

zend_language_scanner.l

<ST_IN_SCRIPTING>"exit" {
    RETURN_TOKEN(T_EXIT);
}

<ST_IN_SCRIPTING>"die" {
    RETURN_TOKEN(T_EXIT);
}

<ST_IN_SCRIPTING>"function" {
    RETURN_TOKEN(T_FUNCTION);
}
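
The effect of these scanner rules can be observed from userland. The sketch below assumes the bundled tokenizer extension is enabled (it usually is) and uses the built-in token_get_all() and token_name() functions; the sample script and the $names variable are illustrative only. It shows that both "exit" and "die" are reported as the same T_EXIT token.

```php
<?php
// Tokenize a tiny script with PHP's built-in tokenizer extension.
$tokens = token_get_all('<?php exit; die; function f() {}');

$names = [];
foreach ($tokens as $token) {
    // Multi-character tokens are arrays: [id, text, line].
    // Single characters such as ";" come back as plain strings and are skipped here.
    if (is_array($token)) {
        $names[] = token_name($token[0]) . ' ' . json_encode($token[1]);
    }
}

// One line per token, e.g. the token name followed by its quoted source text.
echo implode("\n", $names), "\n";
```

Both keywords map to one token name, which matches the two scanner rules that return T_EXIT.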

Page 9: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

zend_language_scanner.l

<ST_DOUBLE_QUOTES,ST_BACKQUOTE,ST_HEREDOC>"${" {
    yy_push_state(ST_LOOKING_FOR_VARNAME);
    RETURN_TOKEN(T_DOLLAR_OPEN_CURLY_BRACES);
}

<ST_LOOKING_FOR_VARNAME>{LABEL}[[}] {
    yyless(yyleng - 1);
    zend_copy_value(zendlval, yytext, yyleng);
    yy_pop_state();
    yy_push_state(ST_IN_SCRIPTING);
    RETURN_TOKEN(T_STRING_VARNAME);
}

Page 20: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

The PHP Lexer

zend_language_scanner.l -> re2c -> zend_language_scanner.c

Page 21: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

The PHP Parser

zend_language_parser.y

Page 22: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

zend_language_parser.y

if_stmt:
        if_stmt_without_else %prec T_NOELSE { $$ = $1; }
    |   if_stmt_without_else T_ELSE statement
            { $$ = zend_ast_list_add($1, zend_ast_create(ZEND_AST_IF_ELEM, NULL, $3)); }
;

if_stmt_without_else:
        T_IF '(' expr ')' statement
            { $$ = zend_ast_create_list(1, ZEND_AST_IF,
                  zend_ast_create(ZEND_AST_IF_ELEM, $3, $5)); }
    |   if_stmt_without_else T_ELSEIF '(' expr ')' statement
            { $$ = zend_ast_list_add($1,
                  zend_ast_create(ZEND_AST_IF_ELEM, $4, $6)); }
;

Page 33: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Using the rules to parse

if ($a == 1)
{
    a();
}
else if ($b == 1)
{
    b();
}
else
{
    c();
}

if_stmt_without_else (A)
if_stmt_without_else (B)
if_stmt
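
To make the grammar actions more concrete, here is a rough userland model of what they build. The functions astCreate(), astCreateList() and astListAdd() are hypothetical PHP stand-ins for the C helpers zend_ast_create(), zend_ast_create_list() and zend_ast_list_add(); conditions and bodies are plain strings rather than real sub-trees, and the middle branch is assumed to be written as "elseif" so that it lexes as T_ELSEIF.

```php
<?php
// Hypothetical stand-ins for the C helpers named in the grammar actions.
function astCreate(string $kind, ...$children): array
{
    return ['kind' => $kind, 'children' => $children];
}

function astCreateList(string $kind, array ...$children): array
{
    return ['kind' => $kind, 'children' => $children];
}

function astListAdd(array $list, array $child): array
{
    $list['children'][] = $child;
    return $list;
}

// if ($a == 1) { a(); }      -> first alternative of if_stmt_without_else
$ifStmt = astCreateList('ZEND_AST_IF', astCreate('ZEND_AST_IF_ELEM', '$a == 1', 'a();'));
// elseif ($b == 1) { b(); }  -> second alternative of if_stmt_without_else
$ifStmt = astListAdd($ifStmt, astCreate('ZEND_AST_IF_ELEM', '$b == 1', 'b();'));
// else { c(); }              -> second alternative of if_stmt (NULL condition)
$ifStmt = astListAdd($ifStmt, astCreate('ZEND_AST_IF_ELEM', null, 'c();'));

foreach ($ifStmt['children'] as $i => $elem) {
    [$cond, $body] = $elem['children'];
    printf("%d: cond=%s body=%s\n", $i, $cond ?? 'NULL', $body);
}
```

The result is one ZEND_AST_IF list with three ZEND_AST_IF_ELEM children, the last of which has no condition.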

Page 34: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

zend_language_parser.y (PHP 7.0.10)

if_stmt:
        if_stmt_without_else %prec T_NOELSE { $$ = $1; }
    |   if_stmt_without_else T_ELSE statement
            { $$ = zend_ast_list_add($1, zend_ast_create(ZEND_AST_IF_ELEM, NULL, $3)); }
;

if_stmt_without_else:
        T_IF '(' expr ')' statement
            { $$ = zend_ast_create_list(1, ZEND_AST_IF,
                  zend_ast_create(ZEND_AST_IF_ELEM, $3, $5)); }
    |   if_stmt_without_else T_ELSEIF '(' expr ')' statement
            { $$ = zend_ast_list_add($1,
                  zend_ast_create(ZEND_AST_IF_ELEM, $4, $6)); }
;

Page 35: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

zend_language_parser.y (PHP 5.6.26)

T_IF parenthesis_expr { zend_do_if_cond(&$2, &$1 TSRMLS_CC); }
    statement { zend_do_if_after_statement(&$1, 1 TSRMLS_CC); }

void zend_do_if_cond(const znode *cond, znode *closing_bracket_token TSRMLS_DC)
{
    int if_cond_op_number = get_next_op_number(CG(active_op_array));
    zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

    opline->opcode = ZEND_JMPZ;
    SET_NODE(opline->op1, cond);
    closing_bracket_token->u.op.opline_num = if_cond_op_number;
    SET_UNUSED(opline->op2);
    INC_BPC(CG(active_op_array));
}

Page 36: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

AST is new in PHP 7+

Page 37: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

How PHP works

PHP code

OpCache

Execute (VM)

Lexer + Parser

Compiler

Page 38: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Let’s simplify!

Page 39: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

First… WTF is AST?

Page 40: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

AST is just a data structure

Page 41: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

PHP code

<?php

echo "Hello world";

Page 42: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

An AST representation

Echo statement

`-- String, value "Hello world"

Page 43: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

PHP code

<?php

echo "Hello " . "world";

Page 44: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

An AST representation

Echo statement
`-- Concat
    |-- Left
    |   `-- String, value "Hello "
    `-- Right
        `-- String, value "world"

Page 45: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

PHP code

<?php

$a = 5;

$b = 3;

echo $a + ($b * 2);

Page 46: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

An AST representation

Assign statement
|-- Variable $a
`-- Integer, value 5
Assign statement
|-- Variable $b
`-- Integer, value 3
Echo statement
`-- Add operation
    |-- Left
    |   `-- Variable $a
    `-- Right
        `-- Multiply operation
            |-- Left
            |   `-- Variable $b
            `-- Right
                `-- Integer, value 2

Page 47: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Why?

Page 48: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Faster!*

Page 50: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

AST compilation: pre-order traversal

Statements
|-- Assign
|   |-- Variable, name: $a
|   `-- Scalar, value: (int)5
|-- Assign
|   |-- Variable, name: $b
|   `-- Scalar, value: (int)3
`-- Echo
    `-- Add op
        |-- Left operand
        |   `-- Variable, name: $a
        `-- Right operand
            `-- Multiply op
                |-- Left operand
                |   `-- Variable, name: $b
                `-- Right operand
                    `-- Scalar, value: (int)2

Page 51: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Pre-order traversal: Polish notation

Assign(Variable $a, Scalar 5)
Assign(Variable $b, Scalar 3)
Echo(
    Add(
        Variable $a,
        Multiply(Variable $b, Scalar 2)
    )
)
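
A pre-order walk that produces this kind of prefix output can be sketched in a few lines. The node layout used here (a plain array whose first element is the node type, with string leaves) and the preOrder() function are assumptions made purely for illustration; they are not how the PHP engine stores its AST.

```php
<?php
// Nodes are plain arrays: ['Type', child, child, ...]; leaves are strings.
// This layout is an assumption made for illustration only.
$ast = ['Echo',
    ['Add',
        'Variable $a',
        ['Multiply', 'Variable $b', 'Scalar 2'],
    ],
];

function preOrder($node): string
{
    if (is_string($node)) {
        return $node; // leaf: nothing below it to visit
    }
    $type = $node[0];                  // visit the node itself first...
    $children = array_slice($node, 1); // ...then its children, left to right
    return $type . '(' . implode(', ', array_map('preOrder', $children)) . ')';
}

echo preOrder($ast), "\n";
```

Because the parent is emitted before its children, the operator always appears ahead of its operands, which is what makes the output Polish notation.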

Page 55: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Order of precedence

1 + 2 * 3

= 1 + (2 * 3) = 7?
= (1 + 2) * 3 = 9?

+ 1 * 2 3

+ : operator, left operand 1, right operand (* 2 3)
* : operator, left operand 2, right operand 3

Page 56: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Reverse Polish Notation

1 2 3 * +

The stack after each symbol is read:

1 : push 1                         stack: 1
2 : push 2                         stack: 1 2
3 : push 3                         stack: 1 2 3
* : pop 3 and 2, push 2 * 3 = 6    stack: 1 6
+ : pop 6 and 1, push 1 + 6 = 7    stack: 7
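
The stack walk-through can be written as a small evaluator. This is a minimal sketch, not code from the talk: evaluateRpn() is a hypothetical name, symbols are assumed to be separated by whitespace, and only +, - and * are handled so that results stay integers.

```php
<?php
function evaluateRpn(string $expression): int
{
    $stack = [];
    foreach (preg_split('/\s+/', trim($expression)) as $symbol) {
        if (ctype_digit($symbol)) {
            $stack[] = (int) $symbol; // operands are pushed as-is
            continue;
        }
        // Operators pop their two operands; the right one was pushed last.
        $right = array_pop($stack);
        $left = array_pop($stack);
        switch ($symbol) {
            case '+': $stack[] = $left + $right; break;
            case '-': $stack[] = $left - $right; break;
            case '*': $stack[] = $left * $right; break;
            default:
                throw new InvalidArgumentException("Unknown symbol: $symbol");
        }
    }
    return array_pop($stack);
}

echo evaluateRpn('1 2 3 * +'), "\n"; // prints 7
```

No precedence table or parentheses are needed at this stage: the order of the symbols already encodes which operation happens first.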

Page 66: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Let’s write a compiler (!!!)
In three easy steps…

Page 67: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Warning: do not use in production

Page 68: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

View > Source
https://github.com/asgrim/basic-maths-compiler

Page 69: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Define the language

Tokens

● T_ADD (+)
● T_SUBTRACT (-)
● T_MULTIPLY (*)
● T_DIVIDE (/)
● T_INTEGER (\d+)
● T_WHITESPACE (\s+)

Page 70: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Step 1: Writing a simple lexer

Page 71: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Using regular expressions

private static $matches = [
    '/^(\+)/' => Token::T_ADD,
    '/^(-)/' => Token::T_SUBTRACT,
    '/^(\*)/' => Token::T_MULTIPLY,
    '/^(\/)/' => Token::T_DIVIDE,
    '/^(\d+)/' => Token::T_INTEGER,
    '/^(\s+)/' => Token::T_WHITESPACE,
];

Page 72: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Step through the input string

public function __invoke(string $input) : array
{
    $tokens = [];
    $offset = 0;

    while ($offset < strlen($input)) {
        $focus = substr($input, $offset);
        $result = $this->match($focus);

        $tokens[] = $result;
        $offset += strlen($result->getLexeme());
    }

    return $tokens;
}

Page 73: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

The matching method

private function match(string $input) : Token
{
    foreach (self::$matches as $pattern => $token) {
        if (preg_match($pattern, $input, $matches)) {
            return new Token($token, $matches[1]);
        }
    }

    throw new \RuntimeException(sprintf(
        'Unmatched token, next 15 chars were: %s', substr($input, 0, 15)
    ));
}
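
Put together, the regex table, the stepping loop and the matching method amount to something like the standalone function below. It is a sketch under stated assumptions rather than the repository's code: lex() is a hypothetical name, tokens are [type, lexeme] pairs instead of Token objects, and whitespace tokens are dropped here for brevity.

```php
<?php
function lex(string $input): array
{
    // Same patterns as on the slide; every pattern is anchored with ^ so it
    // can only match at the current offset.
    $patterns = [
        '/^(\+)/'  => 'T_ADD',
        '/^(-)/'   => 'T_SUBTRACT',
        '/^(\*)/'  => 'T_MULTIPLY',
        '/^(\/)/'  => 'T_DIVIDE',
        '/^(\d+)/' => 'T_INTEGER',
        '/^(\s+)/' => 'T_WHITESPACE',
    ];

    $tokens = [];
    $offset = 0;
    while ($offset < strlen($input)) {
        $focus = substr($input, $offset);
        $matched = false;
        foreach ($patterns as $pattern => $type) {
            if (preg_match($pattern, $focus, $m)) {
                if ($type !== 'T_WHITESPACE') { // assumption: whitespace is discarded
                    $tokens[] = [$type, $m[1]];
                }
                $offset += strlen($m[1]); // advance past the matched lexeme
                $matched = true;
                break;
            }
        }
        if (!$matched) {
            throw new RuntimeException('Unmatched token near: ' . substr($focus, 0, 15));
        }
    }
    return $tokens;
}

print_r(lex('1 + 2 * 3'));
```

An input containing anything outside the six patterns, such as a letter, ends in the RuntimeException rather than an endless loop, because the offset only advances on a successful match.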

Page 74: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Step 2: Parsing the tokens

Page 75: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Order tokens by operator precedence

/**
 * Higher number is higher precedence.
 * @var int[]
 */
private static $operatorPrecedence = [
    Token::T_SUBTRACT => 0,
    Token::T_ADD => 1,
    Token::T_DIVIDE => 2,
    Token::T_MULTIPLY => 3,
];

Page 76: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Order tokens by operator precedence

/** @var Token[] $stack */
$stack = [];

/** @var Token[] $operators */
$operators = [];

while (false !== ($token = current($tokens))) {
    if ($token->isOperator()) {
        // ...
    }

    $stack[] = $token;
    next($tokens);
}

Page 80: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Order tokens by operator precedence

if ($token->isOperator()) {
    $tokenPrecedence = self::$operatorPrecedence[$token->getToken()];

    while (
        count($operators)
        && self::$operatorPrecedence[$operators[count($operators) - 1]->getToken()]
            > $tokenPrecedence
    ) {
        $higherOp = array_pop($operators);
        $stack[] = $higherOp;
    }

    $operators[] = $token;
    next($tokens);
    continue;
}

Page 84: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Order tokens by operator precedence

// Clean up by moving any remaining operators onto the token stack
while (count($operators)) {
    $stack[] = array_pop($operators);
}

return $stack;

Page 85: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Order tokens by operator precedence

1 + 2 * 3

Token read    Output stack    Operator stack
(start)       (empty)         (empty)
1             1               (empty)
+             1               +
2             1 2             +
*             1 2             + *
3             1 2 3           + *
(end)         1 2 3 *         +
(end)         1 2 3 * +       (empty)
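
This reordering is essentially the shunting-yard algorithm. The sketch below is a swapped-in, conventional variant rather than the code from the slides: toRpn() is a hypothetical name, tokens are plain strings, + and - share one precedence level while * and / share a higher one, and operators of equal precedence are popped (>= instead of >) so that subtraction and division associate to the left.

```php
<?php
function toRpn(array $tokens): array
{
    // Conventional precedence (an assumption that differs from the slide's table).
    $precedence = ['+' => 1, '-' => 1, '*' => 2, '/' => 2];
    $output = [];
    $operators = [];

    foreach ($tokens as $token) {
        if (!isset($precedence[$token])) {
            $output[] = $token; // operand: goes straight to the output
            continue;
        }
        // Pop operators that bind at least as tightly as the incoming one.
        while ($operators && $precedence[end($operators)] >= $precedence[$token]) {
            $output[] = array_pop($operators);
        }
        $operators[] = $token;
    }

    // Clean up: move any remaining operators onto the output.
    while ($operators) {
        $output[] = array_pop($operators);
    }
    return $output;
}

echo implode(' ', toRpn(['1', '+', '2', '*', '3'])), "\n"; // prints 1 2 3 * +
```

With the input from the walk-through, the + stays on the operator stack while * is pushed above it, so * reaches the output first.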

Page 93: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Create AST

while ($ip < count($tokenStack)) {
    $token = $tokenStack[$ip++];

    if ($token->isOperator()) {
        // (figure out $nodeType)
        $right = array_pop($astStack);
        $left = array_pop($astStack);
        $astStack[] = new $nodeType($left, $right);
        continue;
    }

    $astStack[] = new Node\Scalar\IntegerValue((int)$token->getLexeme());
}

Page 97: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Create AST

Node\BinaryOp\Add (
    Node\Scalar\IntegerValue(1),
    Node\BinaryOp\Multiply (
        Node\Scalar\IntegerValue(2),
        Node\Scalar\IntegerValue(3)
    )
)
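
A standalone version of the tree-building step might look like this. It is only a sketch: buildAst() is a hypothetical name, the input is assumed to be a list of strings already in reverse Polish order, and nodes are nested arrays instead of the Node classes used in the talk's repository.

```php
<?php
function buildAst(array $rpn): array
{
    $nodeTypes = ['+' => 'Add', '-' => 'Subtract', '*' => 'Multiply', '/' => 'Divide'];
    $astStack = [];

    foreach ($rpn as $symbol) {
        if (isset($nodeTypes[$symbol])) {
            $right = array_pop($astStack); // pushed last, so popped first
            $left = array_pop($astStack);
            $astStack[] = ['type' => $nodeTypes[$symbol], 'left' => $left, 'right' => $right];
            continue;
        }
        $astStack[] = ['type' => 'IntegerValue', 'value' => (int) $symbol];
    }

    // For a well-formed expression exactly one node (the root) remains.
    return array_pop($astStack);
}

$ast = buildAst(['1', '2', '3', '*', '+']);
echo json_encode($ast, JSON_PRETTY_PRINT), "\n";
```

The loop has the same shape as evaluating reverse Polish notation, except that each operator pushes a new node instead of a computed number.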

Page 98: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Step 3: Executing the AST

Page 99: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Compile & execute AST

private function compileNode(NodeInterface $node)
{
    if ($node instanceof Node\BinaryOp\AbstractBinaryOp) {
        return $this->compileBinaryOp($node);
    }

    if ($node instanceof Node\Scalar\IntegerValue) {
        return $node->getValue();
    }
}

Page 100: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Compile & execute AST

private function compileBinaryOp(Node\BinaryOp\AbstractBinaryOp $node)
{
    $left = $this->compileNode($node->getLeft());
    $right = $this->compileNode($node->getRight());

    switch (get_class($node)) {
        case Node\BinaryOp\Add::class:
            return $left + $right;
        case Node\BinaryOp\Subtract::class:
            return $left - $right;
        case Node\BinaryOp\Multiply::class:
            return $left * $right;
        case Node\BinaryOp\Divide::class:
            return $left / $right;
    }
}
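
The same recursive idea can be tried without any classes. In this sketch, evaluate() and the array-based node shape are assumptions for illustration (PHP 7.4+ for the arrow function); unlike the slide code, an unknown node type raises an exception instead of silently returning null.

```php
<?php
function evaluate(array $node)
{
    if ($node['type'] === 'IntegerValue') {
        return $node['value']; // leaves evaluate to themselves
    }

    // Evaluate both children first, then combine them.
    $left = evaluate($node['left']);
    $right = evaluate($node['right']);

    switch ($node['type']) {
        case 'Add':      return $left + $right;
        case 'Subtract': return $left - $right;
        case 'Multiply': return $left * $right;
        case 'Divide':   return $left / $right;
    }
    throw new LogicException('Unknown node type: ' . $node['type']);
}

// Hand-built tree for 1 + (2 * 3).
$int = fn (int $v): array => ['type' => 'IntegerValue', 'value' => $v];
$ast = [
    'type' => 'Add',
    'left' => $int(1),
    'right' => ['type' => 'Multiply', 'left' => $int(2), 'right' => $int(3)],
];

echo evaluate($ast), "\n"; // prints 7
```

Precedence never appears in the evaluator: it was settled when the tree was built, so the multiplication is simply deeper in the tree and therefore computed first.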

Page 101: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

What does this mean for me?

Page 102: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

AST in userland

Page 103: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

php-ast extension
https://github.com/nikic/php-ast

Page 104: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

php-ast example usage

<?php

require 'path/to/util.php';

$code = <<<'EOC'
<?php
$var = 42;
EOC;

echo ast_dump(ast\parse_code($code, $version=35)), "\n";

// Output:
AST_STMT_LIST
    0: AST_ASSIGN
        var: AST_VAR
            name: "var"
        expr: 42

Page 105: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

astkit
https://github.com/sgolemon/astkit

Page 106: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

astkit example usage

$if = AstKit::parseString(<<<EOD
if (true) {
    echo "This is a triumph.\n";
} else {
    echo "The cake is a lie.\n";
}
EOD
);

$if->execute(); // First run, program is as-seen above

$const = $if->getChild(0)->getChild(0);

// Replace the "true" constant in the condition with false
$const->graft(0, false);
// Can also graft other AstKit nodes, instead of constants

$if->execute(); // Second run now takes the else path

Page 107: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

PhpParser
https://github.com/nikic/PHP-Parser

Page 108: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

PHP Parser

<?php

use PhpParser\ParserFactory;

$parser = (new ParserFactory)
    ->create(ParserFactory::PREFER_PHP7);

print_r($parser->parse(
    file_get_contents('ast-demo-src.php')
));

Page 109: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Better Reflection
https://github.com/Roave/BetterReflection

Page 110: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Better Reflection workflow

Reflector

Source Locator

PhpParser

Reflection

Page 111: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

PHP Reflection

$reflection = new ReflectionClass(
    \My\ExampleClass::class
);

$this->assertSame(
    'ExampleClass',
    $reflection->getShortName()
);
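
Native reflection only works on classes that are already loaded or can be autoloaded. The sketch below illustrates that with built-in functions only; ExampleClass, NotLoadedYet and the recording autoloader are made-up names for the demonstration, and the class is declared in the global namespace to keep the script self-contained.

```php
<?php
class ExampleClass {}

// Record every class name the autoloader is asked for (illustrative only).
$autoloadRequests = [];
spl_autoload_register(function (string $class) use (&$autoloadRequests): void {
    $autoloadRequests[] = $class;
});

// The class is already declared, so no autoloading is needed.
$shortName = (new ReflectionClass(ExampleClass::class))->getShortName();

// An unknown class triggers the autoloader and then fails.
try {
    new ReflectionClass('NotLoadedYet');
    $missingClassFailed = false;
} catch (ReflectionException $e) {
    $missingClassFailed = true;
}

echo $shortName, "\n";
var_dump($missingClassFailed, $autoloadRequests);
```

This is the gap that an AST-based reflector fills: it can describe a class from its source without the engine ever loading it.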

Page 112: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Better Reflection

$reflection = (new BetterReflection())
    ->classReflector()
    ->reflect(\My\ExampleClass::class);

$this->assertSame(
    'ExampleClass',
    $reflection->getShortName()
);

Page 113: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Class BetterReflection

public function sourceLocator() : SourceLocator
{
    $astLocator = $this->astLocator();

    return $this->sourceLocator
        ?? $this->sourceLocator = new MemoizingSourceLocator(new AggregateSourceLocator([
            new PhpInternalSourceLocator($astLocator),
            new EvaledCodeSourceLocator($astLocator),
            new AutoloadSourceLocator($astLocator),
        ]));
}

public function classReflector() : ClassReflector
{
    return $this->classReflector
        ?? $this->classReflector = new ClassReflector($this->sourceLocator());
}

Page 114: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Given a class structure...

<?php

class Foo
{
    private $bar;

    public function thing()
    {
    }
}

Page 115: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

… we get the AST!

Class, name Foo
|-- Statements
|   |-- Property, name bar
|   |   |-- Type [private]
|   |   `-- Attributes [start line: 7, end line: 9]
|   `-- Method, name thing
|       |-- Type [public]
|       |-- Parameters [...]
|       |-- Statements [...]
|       `-- Attributes [start line: 7, end line: 9]
`-- Attributes [start line: 3, end line: 10]

Page 116: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

What can I use Better Reflection for?

Page 117: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Monkey patching example

class MyClass
{
    public function foo()
    {
        return 5;
    }
}

Page 118: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Monkey patching example

use Roave\BetterReflection\BetterReflection;
use Roave\BetterReflection\Reflector\ClassReflector;
use Roave\BetterReflection\SourceLocator\Type\SingleFileSourceLocator;
use Roave\BetterReflection\Util\Autoload\ClassLoader;
use Roave\BetterReflection\Util\Autoload\ClassLoaderMethod\FileCacheLoader;

$loader = new ClassLoader(FileCacheLoader::defaultFileCacheLoader(__DIR__));

// Create the reflection first (without loading)
$classInfo = (new ClassReflector(
    new SingleFileSourceLocator(
        __DIR__ . '/MyClass.php',
        (new BetterReflection())->astLocator()
    )
))->reflect('MyClass');

$loader->addClass($classInfo);

Page 121: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

Monkey patching example

// Override the body...!
$classInfo->getMethod('foo')->setBodyFromClosure(
    function () {
        return 4;
    }
);

$c = new MyClass();
echo $c->foo() . "\n"; // prints 4, not 5

Page 122: Climbing the Abstract Syntax Tree (PHP UK 2018)

@asgrim

To summarise

● For PHP engine:
  ○ AST is an efficient data structure to represent code
  ○ AST means faster compilation (ignoring opcache)
  ○ Separation in PHP engine for parser and compiler
  ○ https://wiki.php.net/rfc/abstract_syntax_tree
● Concepts can be used in userland
  ○ PHP Parser library - https://github.com/nikic/php-parser
  ○ Better Reflection - https://github.com/Roave/BetterReflection
    ■ Reflect on not-yet-loaded files
    ■ Monkey patching in userland code (!)
  ○ Static analysis opportunities
    ■ Better Reflection
    ■ Exakat static analysis (uses own AST)
    ■ Phan (uses php-ast extension)

Page 123: Climbing the Abstract Syntax Tree (PHP UK 2018)

Any questions?

Please leave feedback!
https://joind.in/talk/63af9

James Titcumb
@asgrim