Go Compiler for COMP-520 Vincent Foley-Bourgon Sable Lab McGill University November 2014


Go Compiler for COMP-520

Vincent Foley-Bourgon

Sable Lab, McGill University

November 2014

Agenda

- COMP-520
- Go
- My implementation
- Lexer gotchas
- Parser gotchas
- Recap

Questions welcome during presentation

COMP-520

- Introduction to compilers
- Project-oriented
- Being updated
- One possible project: a compiler for Go
- Super fun, you should take it!


Go

- Created by Unix old-timers (Ken Thompson, Rob Pike) who happen to work at Google
- Helps with issues they see at Google (e.g. complexity, compilation times)
- Imperative with some OO concepts
- Methods and interfaces
- No classes or inheritance
- Focus on concurrency (goroutines and channels)
- GC
- Simple, easy to remember semantics
- Open source

Why Go for a compilers class?

- Language is simple
- Detailed online specification
- Encompasses all the classical compiler phases
- Allows students to work with a language that is quickly growing in popularity

Current work

My compiler

- Explore the implementation of Go
- Pinpoint the tricky parts
- Find a good subset
- Useful for writing programs
- Covered by important compiler topics
- Limit implementation drudgery

Tools

- Language: OCaml 4.02
- Lexer generator: ocamllex (ships with OCaml)
- Parser generator: Menhir (LR(1), separate from OCaml)

Why OCaml?

- Good lexer and parser generators
- Algebraic data types are ideal for creating ASTs and other IRs
- Pattern matching is great for acting upon the AST
- I like it!

Lexer

- Written with ocamllex
- ~270 lines of code
- Go spec gives all the necessary details
- One tricky part: automatic semi-colon insertion


Semi-colons

What you write:

    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        x := math.Sqrt(18)
        fmt.Println(x)
    }

What the parser expects:

    package main;

    import (
        "fmt";
        "math";
    );

    func main() {
        x := math.Sqrt(18);
        fmt.Println(x);
    };
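The inserted semicolons can be observed with Go's own tooling. The sketch below is not part of the OCaml compiler described in these slides; it uses the standard go/scanner package, which reports an automatically inserted semicolon as a token.SEMICOLON whose literal is "\n". The names countInsertedSemicolons and example are chosen here for illustration.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// countInsertedSemicolons scans src with the standard go/scanner package and
// counts the semicolons that the scanner inserted on its own. Such a token is
// reported as token.SEMICOLON with the literal "\n".
func countInsertedSemicolons(src string) int {
	fset := token.NewFileSet()
	file := fset.AddFile("example.go", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // mode 0: comments are skipped

	count := 0
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		if tok == token.SEMICOLON && lit == "\n" {
			count++
		}
	}
	return count
}

// example is the program from the slide, written without semicolons.
const example = `package main

import (
	"fmt"
	"math"
)

func main() {
	x := math.Sqrt(18)
	fmt.Println(x)
}
`

func main() {
	fmt.Println(countInsertedSemicolons(example))
}
```

For the program on the slide this counts seven inserted semicolons, one for each ";" in the right-hand column.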

Semi-colons

When the input is broken into tokens, a semicolon is automatically inserted into the token stream at the end of a non-blank line if the line’s final token is

- an identifier
- a literal
- one of the keywords break, continue, fallthrough, or return
- one of the operators and delimiters ++, --, ), ], or }

Solution

    rule next_token = parse
      (* ... *)
      | "break" { T_break }
      | '\n'    { next_token lexbuf }


Solution

    rule next_token = parse
      (* ... *)
      | "break" { yield lexbuf T_break }
      | '\n'    { if needs_semicolon lexbuf then
                    yield lexbuf T_semi_colon
                  else
                    next_token lexbuf
                }
      | "//"    { line_comment lexbuf }

    and line_comment = parse
      | '\n' { if needs_semicolon lexbuf then
                 yield lexbuf T_semi_colon
               else
                 next_token lexbuf
             }
      | _    { line_comment lexbuf }

Philosophical pause

Is Go lexically a regular language?

Lexer

Supports most of the Go specification

- Unicode characters are not allowed in identifiers
- No Unicode support in char and string literals
- Doesn’t support the second semi-colon insertion rule (a semicolon may be omitted before a closing ")" or "}"), e.g.

    func () int { return 42; }

Parser

Parser & AST

- Parser written with Menhir
- Parser: ~600 lines of code (incomplete)
- AST: ~200 lines of code
- Some constructs are particularly tricky!

Tricky construct #1: function parameters

    func substr(string, int, int)
    // unnamed arguments

    func substr(str string, start int, length int)
    // named arguments, long form

    func substr(str string, start, length int)
    // named arguments, short form

    func substr(string, start, length int)
    // Three parameters of type int

    func substr(str string, start int, int)
    // Syntax error


Tricky construct #1: function parameters

    func substr(string, int, int)
    // unnamed arguments

    func substr(string, start, length int)
    // Three parameters of type int

Tricky construct #1: function parameters

How to figure out named and unnamed parameters?

- Read a list of entries, each either "type" or "identifier type"
- Process the list to see whether every entry is a lone type or at least one entry is "identifier type"
- Generate the correct AST nodes (i.e. ParamUnnamed(type) or ParamNamed(id, type))

Only named parameters for the project.
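The three steps above can be sketched as a post-processing pass over the raw parameter list. This is a minimal Go illustration, not the Menhir grammar from the talk. It assumes every type is a single word, and Entry, Param and Resolve are hypothetical names; a Param with an empty Name stands for ParamUnnamed, one with a Name for ParamNamed.

```go
package main

import (
	"errors"
	"fmt"
)

// Entry is one comma-separated item as first read by the parser:
// a single word ("int") or two words ("start int").
type Entry struct {
	First  string // always present
	Second string // empty when the entry is a single word
}

// Param is the resolved node. Name is empty for an unnamed parameter.
type Param struct {
	Name string
	Type string
}

// Resolve decides whether the list is all-unnamed or named, then builds
// the parameters accordingly.
func Resolve(entries []Entry) ([]Param, error) {
	anyNamed := false
	for _, e := range entries {
		if e.Second != "" {
			anyNamed = true
		}
	}

	params := make([]Param, len(entries))
	if !anyNamed {
		// Every entry is a lone word, so every word is a type.
		for i, e := range entries {
			params[i] = Param{Type: e.First}
		}
		return params, nil
	}

	// At least one "identifier type" entry: each lone word is a name that
	// takes the type of the nearest two-word entry to its right.
	typ := ""
	for i := len(entries) - 1; i >= 0; i-- {
		e := entries[i]
		if e.Second != "" {
			typ = e.Second
		} else if typ == "" {
			return nil, errors.New("syntax error: mixed named and unnamed parameters")
		}
		params[i] = Param{Name: e.First, Type: typ}
	}
	return params, nil
}

func main() {
	// func substr(string, start, length int)
	ps, err := Resolve([]Entry{{"string", ""}, {"start", ""}, {"length", "int"}})
	fmt.Println(ps, err)

	// func substr(str string, start int, int)
	_, err = Resolve([]Entry{{"str", "string"}, {"start", "int"}, {"int", ""}})
	fmt.Println(err)
}
```

Walking the list from right to left means a single variable is enough to carry the current type, and a lone word with no type to its right is exactly the syntax-error case from the earlier slide.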


Tricky construct #2: Calls, conversions and built-ins

From the Go FAQ:

[...] Second, the language has been designed to be easy to analyze and can be parsed without a symbol table.

Tricky construct #2: Calls, conversions and built-ins

Type conversions in Go look like function calls:

    int(3.2)  // type conversion
    fib(24)   // function call ... probably

- It depends: is fib a type?
- How do we generate the proper AST node?
- We need to keep track of identifiers in scope, i.e. a symbol table
- More complex parsing:

    call ::= expr or type '(' expr* ')'

  e.g. []*int(z)
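For comparison, the standard go/parser package sidesteps the question at parse time: it represents conversions and calls with the same node type, *ast.CallExpr, and leaves the distinction to later type checking. The short check below illustrates that behaviour of the standard library, not the parser described in these slides; isCallExpr is a helper name made up here.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
)

// isCallExpr reports whether the standard go/parser represents src as a
// generic call node (*ast.CallExpr).
func isCallExpr(src string) bool {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return false
	}
	_, ok := expr.(*ast.CallExpr)
	return ok
}

func main() {
	for _, src := range []string{"int(3.2)", "fib(24)", "[]*int(z)"} {
		fmt.Printf("%-10s CallExpr=%v\n", src, isCallExpr(src))
	}
}
```

ParseExpr performs no type checking, so it never needs to know whether fib names a function or a type.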


Tricky construct #2: Calls, conversions and built-ins

Built-ins also look like function calls:

    xs := make([]int, 3)  // [0, 0, 0]
    xs = append(xs, 1)    // [0, 0, 0, 1]
    len(xs)               // 4

What’s different?

- The first parameter of a built-in can be a type
- call ::= expr or type '(' expr or type* ')'
- Very difficult to get right: expr and type conflict (i.e. identifier)
- Factor the type non-terminals (expr-term-factor)
- AST “pollution”

Tricky construct #2: Calls, conversions and built-ins

[Figure: two ASTs for fib(24). A specific node, FunCall(Id("fib"), Int(24)), next to a generalized node, Call(Id("fib"): Expr, Int(24): Expr).]

Tricky construct #2: Calls, conversions and built-ins

[Figure: generalized Call nodes whose children are tagged Expr or Type]

    Call(Id("make"): Expr, Slice(Id("int")): Type, Int(3): Expr)
    Call(Id("int"): Expr, Float(3.2): Expr)
    Call(Ptr(int): Type, Id("ptr"): Expr)

Philosophical pause

What does it mean to parse a language?

- For theorists: does a sequence of symbols belong to a language?
- For compiler writers: can I generate a semantically precise AST from this sequence of symbols?


Tricky construct #3: chan directionality

- chan int: channel of ints
- chan<- int: send-only channel of ints
- <-chan int: receive-only channel of ints

What is chan <- chan int?

    chan<- (chan int)
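The answer can be checked against the real Go toolchain with the standard reflect package. The snippet below is only a sanity check of how Go itself associates the arrow, not part of the compiler under discussion; sameType and describe are names made up for the example.

```go
package main

import (
	"fmt"
	"reflect"
)

// sameType reports whether the unparenthesized and parenthesized spellings
// denote the same type.
func sameType() bool {
	var a chan<- chan int
	var b chan<- (chan int)
	return reflect.TypeOf(a) == reflect.TypeOf(b)
}

// describe reports whether the outer channel is send-only and whether its
// element type is an ordinary bidirectional channel.
func describe() (outerSendOnly, innerBidirectional bool) {
	var a chan<- chan int
	t := reflect.TypeOf(a)
	return t.ChanDir() == reflect.SendDir, t.Elem().ChanDir() == reflect.BothDir
}

func main() {
	fmt.Println(sameType())
	fmt.Println(describe())
}
```

The arrow binds to the leftmost chan, so the outer channel is send-only and carries plain chan int values.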


Tricky construct #3: chan directionality

How do we implement this?

- Apply expr-term-factor factorization to types
- No PEDMAS for types
- Remember expr or type? Fun times :/
- Complicates parser

Semantics

Semantics - interfaces

A type implicitly implements an interface if it has the right methods.

    package main

    import "fmt"

    type Point struct {
        x, y int
    }

    type Summable interface {
        Sum() int
    }

    // Point never mentions Summable; having Sum() int is enough.
    func (p Point) Sum() int {
        return p.x + p.y
    }

    func Test(s Summable) {
        fmt.Println(s.Sum())
    }

    func main() {
        p := Point{3, 4}
        Test(p)
    }

Semantics - constants

Go does not perform automatic type conversions:

    var x int = 15      // OK
    var y float64 = x   // Error

Constants are “untyped” however:

    var x int = 15      // OK
    var y float64 = 15  // OK
    var z int = 3.14    // Error

Constants are high-precision:

    Pi = 3.1415926535897932384626433832795028841971693993751
    HalfPi = Pi / 2 // Also high-precision
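A small runnable illustration of both points, using names (huge, four, convert) invented for the example: variables need an explicit conversion, while untyped constant arithmetic is carried out exactly at compile time, even for values that fit in no built-in integer type.

```go
package main

import "fmt"

const (
	huge = 1 << 100   // untyped constant, too large for any built-in integer type
	four = huge >> 98 // still computed exactly at compile time
)

// convert shows the explicit conversion that a variable requires, and that
// the result of exact constant arithmetic can be assigned once it fits.
func convert() (float64, int) {
	var x int = 15
	var y float64 = float64(x) // "var y float64 = x" would not compile
	var z int = four
	return y, z
}

func main() {
	fmt.Println(convert())
}
```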

Status

Status - Lexer

- As complete as I want it at the moment
- Don’t intend to add Unicode support

Status - Parser

- Lacks support for many constructs (e.g. type switches, implicitly initialized consts, chan directionality, etc.)
- Some cheating to make productions easier (e.g. disallow parenthesized types in expr or type)

Status - AST

- In the process of simplifying the AST (e.g. generalized call node)
- Create a new, semantically richer AST that will be a result of type checking: type and scope information will be used to create appropriate nodes
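One way to picture that second pass: the parser emits a generalized Call node, and a later phase consults scope information to decide which precise node it should become. The sketch below is a toy Go model of that idea, not the author's OCaml design; Kind, Scope, Universe and Classify are all hypothetical names, and the classification is returned as a string to keep the example short.

```go
package main

import "fmt"

// Kind says what an identifier denotes in a scope.
type Kind int

const (
	KindFunc Kind = iota
	KindType
	KindBuiltin
)

// Scope is a toy symbol table: one map per scope, linked to its parent.
type Scope struct {
	parent *Scope
	names  map[string]Kind
}

func NewScope(parent *Scope) *Scope {
	return &Scope{parent: parent, names: map[string]Kind{}}
}

func (s *Scope) Declare(name string, k Kind) { s.names[name] = k }

// Lookup searches this scope first, then the enclosing ones.
func (s *Scope) Lookup(name string) (Kind, bool) {
	for sc := s; sc != nil; sc = sc.parent {
		if k, ok := sc.names[name]; ok {
			return k, true
		}
	}
	return 0, false
}

// Universe returns a scope holding a few predeclared names.
func Universe() *Scope {
	u := NewScope(nil)
	for _, t := range []string{"int", "float64", "string"} {
		u.Declare(t, KindType)
	}
	for _, b := range []string{"make", "append", "len"} {
		u.Declare(b, KindBuiltin)
	}
	return u
}

// Classify names the precise node a generalized Call with the given callee
// identifier should become in scope s.
func Classify(callee string, s *Scope) string {
	k, ok := s.Lookup(callee)
	switch {
	case !ok:
		return "error: undeclared " + callee
	case k == KindType:
		return "Conversion"
	case k == KindBuiltin:
		return "BuiltinCall"
	default:
		return "FunCall"
	}
}

func main() {
	pkg := NewScope(Universe())
	pkg.Declare("fib", KindFunc)
	fmt.Println(Classify("fib", pkg)) // fib is a function here

	inner := NewScope(pkg)
	inner.Declare("fib", KindType) // a local type shadows the function
	fmt.Println(Classify("fib", inner))
}
```

The shadowing case is the "... probably" from the earlier slide: the same text fib(24) is a call in one scope and a conversion in another.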

Status - Semantic analysis

- Ongoing, currently doing basic types
- Scrapped some phases that turned out to be unnecessary
- Cheating by simplifying the rules of constants (e.g. var x float64 = 3 is a type error)
- Haven’t started thinking about interface types yet; maybe leave them out

Status - Code generation

- Not started at the moment
- Probably going to target JS or C
- JS: easy support for closures and multi-value returns

Misc.

- How to support a mini stdlib?
- GC? Allocate and forget, that’s my motto!

Language subset

- No concurrency support (channels, goroutines, select)
- Simplify some of the syntax (unnamed parameters)
- Eliminate “exotic” features (complex numbers, iota)
- Simplify constants
- Eliminate methods and interfaces
- No GC


Questions?