parrot in detail1 parrot in detail dan sugalski [email protected]

61
Parrot in detail 1 Parrot in Detail Dan Sugalski [email protected]

Upload: deborah-myra-stone

Post on 28-Dec-2015

229 views

Category:

Documents


4 download

TRANSCRIPT

Parrot in detail 1

Parrot in DetailParrot in Detail

Dan [email protected]

Parrot in detail 2

What is ParrotWhat is Parrot

• The interpreter for perl 6• A multi-language virtual

machine• An April Fools joke gotten out

of hand

Parrot in detail 3

VMs in a nutshellVMs in a nutshell

• Platform independence• Impedance matching• High-level base platform• Good target for one or more

classes of languages

Parrot in detail 4

Platform independencePlatform independence

• Allow emulation of missing platform features

• Allows unified view of common but differently implemented features

• Isolation of platform-specific underlying code

Parrot in detail 5

Impedance matchingImpedance matching

• Can be halfway between the hardware and the software

• Provides another layer of abstraction

• Allows a smoother connection between language and CPU

Parrot in detail 6

High-level base platformHigh-level base platform

• Provide single point of abstraction for many useful things, like:• Async I/O• Threads• Events• Objects

Parrot in detail 7

Good HLL targetGood HLL target

• Provide features that map well to a class of language constructs

• Reduce the “thought” load for compiler writers

• Allow tasks to be better partitioned and placed

Parrot in detail 8

Parts and PiecesParts and Pieces

The chunks of parrot

Parrot in detail 9

Parts and piecesParts and pieces

• Parser• Compiler• Optimizer• Interpreter

Parrot in detail 10

ParserParser

• Turns source into an AST• $a = $b + $c becomes

$a$b$c=+

Parrot in detail 11

Overriding the ParserOverriding the Parser

• New tokens can be added• Existing tokens can have their

meaning changed• Entire languages can be

swapped in or out• All done with Perl 6 grammars

Parrot in detail 12

GrammarsGrammars

• Perl 6 grammars are immensely powerful

• Combination of perl 5 regexes, lex, and yacc

• Grammars are object-oriented and overridable at runtime

Parrot in detail 13

Useful side-effectsUseful side-effects

• Modifying existing grammars lexically is straightforward

• If you have a perl 6 rule set for a language you’re halfway to getting it up on Parrot

• Conveniently, we have a yacc and BNF grammar to Perl regex converter

Parrot in detail 14

CompilerCompiler

• Turns AST into bytecode• Like the parser, is overridable• Essentially a fancy regex

engine, with some extras• No optimizations done here

Parrot in detail 15

Mostly standardMostly standard

• The compiler’s mostly standard• Compiling to a machine, even a

virtual one, is a well-known thing• There’s not much in the way of

surprise here• Still a fair amount of work

Parrot in detail 16

OptimizerOptimizer

• Takes AST and bytecode, and produces better bytecode

• Levels will depends on how perl’s invoked

• Works best with less uncertainty

Parrot in detail 17

Optimizing is difficultOptimizing is difficult

• Lots of uncertainty at compile time

• Active data (tied/overloaded) kills optimization

• Late code loading and creation causes headaches

Parrot in detail 18

Optimizing is difficultOptimizing is difficult

When can$x = 0;

foreach (1..10000) {

$x++;

}

become$x = 10000;

Parrot in detail 19

Optimizing ironiesOptimizing ironies

• Perl may well end up one of the least-optimized languages running on Parrot

• Perl 6 has some constructs to help with that

• The more you tell the compiler, the faster your code will run

Parrot in detail 20

InterpreterInterpreter

• Bytecode comes in, and something magic happens

• End destination for bytecode• May not actually execute the

bytecode, but generally will

Parrot in detail 21

InterpreterInterpreter

• As final destination, may do other things• Save to disk• Transform to an alternate form

(like C code)• JIT• Mock in iambic pentameter

Parrot in detail 22

Interpreter Design Details

Interpreter Design Details

In which we tell you far more than anyone sane wants to know about the

insides of our interpreter.(And this is the short form)

Parrot in detail 23

Parrot in buzzwordsParrot in buzzwords

• Bytecode driven• Runtime extensible• Register based• Language neutral• Garbage collected• Event capable• Object oriented• Introspective• Interpreter• With continuations

Parrot in detail 24

Bytecode engineBytecode engine

• Well, almost. 32 bit words• Precompiled form of your

program• Generally loaded from disk

(though not always)• Generally needs no

transformation to run

Parrot in detail 25

Opcode functionsOpcode functions

• Opcode function table is lexically scoped

• Functions return the next opcode to run

• Most opcodes can throw an exception• Opcode libraries can be loaded in on

demand• Most opcodes overridable• Bytecode loader is overridable

Parrot in detail 26

Fun tricks with dynamic opcodes

Fun tricks with dynamic opcodes

• Load in rarely needed functions only when we have to

• Allow piecemeal upgrading of a Parrot install

• We can be someone else cheaply• JVM• .NET• Z machine• Python• Perl 5• Ruby

Parrot in detail 27

RegistersRegisters

• 4 Sets of 32: Integer, String, Float, PMC

• Fast set of temporary locations• All opcodes operate on

registers

Parrot in detail 28

StacksStacks

• Six stacks• One per set of registers• One generic stack• One call stack• Stacks are segmented, and

have no size limit

Parrot in detail 29

StringsStrings

Buffer pointerBuffer lengthFlags

Buffer usedString start

String Length

EncodingCharacter

set

Parrot in detail 30

StringsStrings

• Strings are encoding-neutral• Strings are character set neutral• Engine knows how to ask

character sets to convert between themselves

• Unicode is our fallback (and sometimes pivot) set

Parrot in detail 31

PMCsPMCs

vtableProperty hash

flagsData pointerCache data

Synchronization

GC data

Parrot in detail 32

PMCsPMCs

• Parrot’s equivalent of perl 5’s variables

• Tying, overloading, and magic all rolled together

• Arrays, hashes, and scalars all rolled into one

Parrot in detail 33

PMCs are more than they seem

PMCs are more than they seem

• Lots of behaviour’s delegated to PMCs• PMC structures are generally opaque to

the VM• Lots of the power and modularity of

Parrot comes from PMCs• Engine doesn’t distinguish between

scalar, hash, and array variables at this level

• Done with the magic of vtables and multimethod dispatch

Parrot in detail 34

VtablesVtables

• Table of pointers to required functions• Allows each variable to have a custom

set of functions to do required things• Removes a lot of uncertainty from the

various functions which speed things up• Allow very customized behaviour• All load, store, and informational

functions here

Parrot in detail 35

Vtables bring multimethodsVtables bring multimethods

• Core engine has support for dispatch based on argument types

• Necessary for proper dispatch of many vtable functions

• Support extends to language level

Parrot in detail 36

Multimethod dispatch coreMultimethod dispatch core

• All binary PMC operations use MMD• Relatively recent change• Makes life simpler and vtables

smaller• Things are faster, too, since

everything of interest did MMD anyway.

Parrot in detail 37

Aggregate PMCsAggregate PMCs

• All PMCs can potentially be treated as aggregates

• All vtable entries have a _keyed variant

• Up to vtable to decide what’s done if an invalid key is passed

Parrot in detail 38

Keys for AggregatesKeys for Aggregates

• List ofkey typekey valueuse key as x

• Also plain one-dimensional integer index

• Keys are inherently multidimensional• Aggregate PMCs may consume

multiple keys

Parrot in detail 39

Advantages of keysAdvantages of keys

• Multidimensional aggregates• No-overhead tied hashes and

arrays• Allows potentially interesting

tied behavior

Parrot in detail 40

PMCs even hide aggregationPMCs even hide aggregation

• @foo = @bar * @bazTurns into

mul foo, bar, baz

Parrot in detail 41

ObjectsObjects

• Parrot has a low-level view of objects

• Things with methods and an array of attributes

• Both of which are ultimately delegated to the object

• Except when we cheat, of course

Parrot in detail 42

ObjectsObjects

• Objects may or may not be accessed by reference

• The provided object type is class-based

• Handles mixed-type inheritance with encapsulation and delegation

• Base system handles mixed-type dispatch properly

Parrot in detail 43

ExceptionsExceptions

• An exception handler may be put in place at any time

• Exception handlers remember their state (they capture a continuation)

• Handlers may decline any exception• Exceptions propagate outward• Exception handlers may target

specific classes of exceptions

Parrot in detail 44

Exceptios are:Exceptios are:

• TypedInformationWarningSevereFatalWe’re Doomed

• ClassedIOMath

• LanguagedPerlRuby

Parrot in detail 45

Throwing an ExceptionThrowing an Exception

• Any opfunc that returns 0 triggers an exception

• The throw opcode also throws an exception

• The exception itself is stored in the interpreter

• Exceptions and exception handlers are cheap, but not free

Parrot in detail 46

Realities of memory management

Realities of memory management

• Memory and structure allocation is a huge pain

• Terribly error prone• We have full knowledge of

what’s used if we choose to use it

Parrot in detail 47

Arena allocation of core structures

Arena allocation of core structures

• All PMCs and Strings are allocated from arenas

• Makes allocation faster and more memory efficient

• Allows us to trace all the core structures as we need for GC and DOD

Parrot in detail 48

Pool allocation of memory

Pool allocation of memory

• All ‘random’ chunks of memory are allocated from memory pools

• Allocation is extremely fast, typically five or six machine instructions

• Free memory is handled by the garbage collector

Parrot in detail 49

Garbage CollectionGarbage Collection

• Parrot has a tracing, compacting garbage collector

• No reference counting• Live objects are found by

tracing the root set

Parrot in detail 50

Garbage CollectionGarbage Collection

• All memory must be pointed to by a Buffer struct (A subset of a String)

• All Buffers must be pointed to by PMCs or string registers

• All PMCs must be pointed to by other PMCs or the root set

Parrot in detail 51

DOD and GC are separateDOD and GC are separate

• DOD finds dead structures• GC compacts memory• Typically chew up more

memory than structures.

Parrot in detail 52

I/OI/O

• Fully asynchronous I/O system by default

• Synchronous overlays for easier coding

• Perl 5/TCL/SysV style streams• C’s STDIO is dead

Parrot in detail 53

I/O streamsI/O streams

• All streams can potentially be filtered

• No limit to the number of filters on a stream

• Filters may run asynchronously, or in their own threads

• Filters may be sources or sinks as need be

Parrot in detail 54

I/O Stream examplesI/O Stream examples

• UTF8->UTF32 conversion• EBCDIC->ShiftJIS conversion• Auto-chomping• Tee-style fanout• GIF->PNG conversion

Parrot in detail 55

Unified I/O & Event systemUnified I/O & Event system

• I/O requests and events are versions of the same thing

• Event generators are just autonomous streams

Parrot in detail 56

Subs and sub callingSubs and sub calling

• Several sub types• Regular subs• Closures• Co-routines• Continuations

• All done with CPS, though it can be hidden

• Caller-save scheme for easier tail-calls

Parrot in detail 57

Parrot has calling conventions

Parrot has calling conventions

• One standard set• All languages that want to

interoperate should use them• Only use them for globally exposed

routines• Terribly boring except when you

don’t have them

Parrot in detail 58

Common object interfaceCommon object interface

• The OO part of the calling conventions• Method calling and attribute

storage/retrieval is standardized• Method dispatch is ultimately delegated• Attribute storage is also ultimately

delegated• Support multimethod dispatch,

prototype checking, and runtime mutability

Parrot in detail 59

ThreadsThreads

• Three models• Shared dependent• Shared independent• Completely independent

• Built in interpreter thread safety• Primitives to allow for language-

level thread safety and inter-thread communication

Parrot in detail 60

IntrospectionIntrospection

• All language data structures are accessible from parrot programs• Scratchpads• Global variable tables• Stacks

• Interpreter level stuff is accessible• Statistics• Internal pools

Parrot in detail 61

Runtime MutabilityRuntime Mutability

• Full runtime mutability is supported

• Parser and compiler are generally available

• Needed for things like perl’s string eval