parrot in detail1 parrot in detail dan sugalski [email protected]
TRANSCRIPT
Parrot in detail 2
What is ParrotWhat is Parrot
• The interpreter for perl 6• A multi-language virtual
machine• An April Fools joke gotten out
of hand
Parrot in detail 3
VMs in a nutshellVMs in a nutshell
• Platform independence• Impedance matching• High-level base platform• Good target for one or more
classes of languages
Parrot in detail 4
Platform independencePlatform independence
• Allow emulation of missing platform features
• Allows unified view of common but differently implemented features
• Isolation of platform-specific underlying code
Parrot in detail 5
Impedance matchingImpedance matching
• Can be halfway between the hardware and the software
• Provides another layer of abstraction
• Allows a smoother connection between language and CPU
Parrot in detail 6
High-level base platformHigh-level base platform
• Provide single point of abstraction for many useful things, like:• Async I/O• Threads• Events• Objects
Parrot in detail 7
Good HLL targetGood HLL target
• Provide features that map well to a class of language constructs
• Reduce the “thought” load for compiler writers
• Allow tasks to be better partitioned and placed
Parrot in detail 11
Overriding the ParserOverriding the Parser
• New tokens can be added• Existing tokens can have their
meaning changed• Entire languages can be
swapped in or out• All done with Perl 6 grammars
Parrot in detail 12
GrammarsGrammars
• Perl 6 grammars are immensely powerful
• Combination of perl 5 regexes, lex, and yacc
• Grammars are object-oriented and overridable at runtime
Parrot in detail 13
Useful side-effectsUseful side-effects
• Modifying existing grammars lexically is straightforward
• If you have a perl 6 rule set for a language you’re halfway to getting it up on Parrot
• Conveniently, we have a yacc and BNF grammar to Perl regex converter
Parrot in detail 14
CompilerCompiler
• Turns AST into bytecode• Like the parser, is overridable• Essentially a fancy regex
engine, with some extras• No optimizations done here
Parrot in detail 15
Mostly standardMostly standard
• The compiler’s mostly standard• Compiling to a machine, even a
virtual one, is a well-known thing• There’s not much in the way of
surprise here• Still a fair amount of work
Parrot in detail 16
OptimizerOptimizer
• Takes AST and bytecode, and produces better bytecode
• Levels will depends on how perl’s invoked
• Works best with less uncertainty
Parrot in detail 17
Optimizing is difficultOptimizing is difficult
• Lots of uncertainty at compile time
• Active data (tied/overloaded) kills optimization
• Late code loading and creation causes headaches
Parrot in detail 18
Optimizing is difficultOptimizing is difficult
When can$x = 0;
foreach (1..10000) {
$x++;
}
become$x = 10000;
Parrot in detail 19
Optimizing ironiesOptimizing ironies
• Perl may well end up one of the least-optimized languages running on Parrot
• Perl 6 has some constructs to help with that
• The more you tell the compiler, the faster your code will run
Parrot in detail 20
InterpreterInterpreter
• Bytecode comes in, and something magic happens
• End destination for bytecode• May not actually execute the
bytecode, but generally will
Parrot in detail 21
InterpreterInterpreter
• As final destination, may do other things• Save to disk• Transform to an alternate form
(like C code)• JIT• Mock in iambic pentameter
Parrot in detail 22
Interpreter Design Details
Interpreter Design Details
In which we tell you far more than anyone sane wants to know about the
insides of our interpreter.(And this is the short form)
Parrot in detail 23
Parrot in buzzwordsParrot in buzzwords
• Bytecode driven• Runtime extensible• Register based• Language neutral• Garbage collected• Event capable• Object oriented• Introspective• Interpreter• With continuations
Parrot in detail 24
Bytecode engineBytecode engine
• Well, almost. 32 bit words• Precompiled form of your
program• Generally loaded from disk
(though not always)• Generally needs no
transformation to run
Parrot in detail 25
Opcode functionsOpcode functions
• Opcode function table is lexically scoped
• Functions return the next opcode to run
• Most opcodes can throw an exception• Opcode libraries can be loaded in on
demand• Most opcodes overridable• Bytecode loader is overridable
Parrot in detail 26
Fun tricks with dynamic opcodes
Fun tricks with dynamic opcodes
• Load in rarely needed functions only when we have to
• Allow piecemeal upgrading of a Parrot install
• We can be someone else cheaply• JVM• .NET• Z machine• Python• Perl 5• Ruby
Parrot in detail 27
RegistersRegisters
• 4 Sets of 32: Integer, String, Float, PMC
• Fast set of temporary locations• All opcodes operate on
registers
Parrot in detail 28
StacksStacks
• Six stacks• One per set of registers• One generic stack• One call stack• Stacks are segmented, and
have no size limit
Parrot in detail 29
StringsStrings
Buffer pointerBuffer lengthFlags
Buffer usedString start
String Length
EncodingCharacter
set
Parrot in detail 30
StringsStrings
• Strings are encoding-neutral• Strings are character set neutral• Engine knows how to ask
character sets to convert between themselves
• Unicode is our fallback (and sometimes pivot) set
Parrot in detail 31
PMCsPMCs
vtableProperty hash
flagsData pointerCache data
Synchronization
GC data
Parrot in detail 32
PMCsPMCs
• Parrot’s equivalent of perl 5’s variables
• Tying, overloading, and magic all rolled together
• Arrays, hashes, and scalars all rolled into one
Parrot in detail 33
PMCs are more than they seem
PMCs are more than they seem
• Lots of behaviour’s delegated to PMCs• PMC structures are generally opaque to
the VM• Lots of the power and modularity of
Parrot comes from PMCs• Engine doesn’t distinguish between
scalar, hash, and array variables at this level
• Done with the magic of vtables and multimethod dispatch
Parrot in detail 34
VtablesVtables
• Table of pointers to required functions• Allows each variable to have a custom
set of functions to do required things• Removes a lot of uncertainty from the
various functions which speed things up• Allow very customized behaviour• All load, store, and informational
functions here
Parrot in detail 35
Vtables bring multimethodsVtables bring multimethods
• Core engine has support for dispatch based on argument types
• Necessary for proper dispatch of many vtable functions
• Support extends to language level
Parrot in detail 36
Multimethod dispatch coreMultimethod dispatch core
• All binary PMC operations use MMD• Relatively recent change• Makes life simpler and vtables
smaller• Things are faster, too, since
everything of interest did MMD anyway.
Parrot in detail 37
Aggregate PMCsAggregate PMCs
• All PMCs can potentially be treated as aggregates
• All vtable entries have a _keyed variant
• Up to vtable to decide what’s done if an invalid key is passed
Parrot in detail 38
Keys for AggregatesKeys for Aggregates
• List ofkey typekey valueuse key as x
• Also plain one-dimensional integer index
• Keys are inherently multidimensional• Aggregate PMCs may consume
multiple keys
Parrot in detail 39
Advantages of keysAdvantages of keys
• Multidimensional aggregates• No-overhead tied hashes and
arrays• Allows potentially interesting
tied behavior
Parrot in detail 40
PMCs even hide aggregationPMCs even hide aggregation
• @foo = @bar * @bazTurns into
mul foo, bar, baz
Parrot in detail 41
ObjectsObjects
• Parrot has a low-level view of objects
• Things with methods and an array of attributes
• Both of which are ultimately delegated to the object
• Except when we cheat, of course
Parrot in detail 42
ObjectsObjects
• Objects may or may not be accessed by reference
• The provided object type is class-based
• Handles mixed-type inheritance with encapsulation and delegation
• Base system handles mixed-type dispatch properly
Parrot in detail 43
ExceptionsExceptions
• An exception handler may be put in place at any time
• Exception handlers remember their state (they capture a continuation)
• Handlers may decline any exception• Exceptions propagate outward• Exception handlers may target
specific classes of exceptions
Parrot in detail 44
Exceptios are:Exceptios are:
• TypedInformationWarningSevereFatalWe’re Doomed
• ClassedIOMath
• LanguagedPerlRuby
Parrot in detail 45
Throwing an ExceptionThrowing an Exception
• Any opfunc that returns 0 triggers an exception
• The throw opcode also throws an exception
• The exception itself is stored in the interpreter
• Exceptions and exception handlers are cheap, but not free
Parrot in detail 46
Realities of memory management
Realities of memory management
• Memory and structure allocation is a huge pain
• Terribly error prone• We have full knowledge of
what’s used if we choose to use it
Parrot in detail 47
Arena allocation of core structures
Arena allocation of core structures
• All PMCs and Strings are allocated from arenas
• Makes allocation faster and more memory efficient
• Allows us to trace all the core structures as we need for GC and DOD
Parrot in detail 48
Pool allocation of memory
Pool allocation of memory
• All ‘random’ chunks of memory are allocated from memory pools
• Allocation is extremely fast, typically five or six machine instructions
• Free memory is handled by the garbage collector
Parrot in detail 49
Garbage CollectionGarbage Collection
• Parrot has a tracing, compacting garbage collector
• No reference counting• Live objects are found by
tracing the root set
Parrot in detail 50
Garbage CollectionGarbage Collection
• All memory must be pointed to by a Buffer struct (A subset of a String)
• All Buffers must be pointed to by PMCs or string registers
• All PMCs must be pointed to by other PMCs or the root set
Parrot in detail 51
DOD and GC are separateDOD and GC are separate
• DOD finds dead structures• GC compacts memory• Typically chew up more
memory than structures.
Parrot in detail 52
I/OI/O
• Fully asynchronous I/O system by default
• Synchronous overlays for easier coding
• Perl 5/TCL/SysV style streams• C’s STDIO is dead
Parrot in detail 53
I/O streamsI/O streams
• All streams can potentially be filtered
• No limit to the number of filters on a stream
• Filters may run asynchronously, or in their own threads
• Filters may be sources or sinks as need be
Parrot in detail 54
I/O Stream examplesI/O Stream examples
• UTF8->UTF32 conversion• EBCDIC->ShiftJIS conversion• Auto-chomping• Tee-style fanout• GIF->PNG conversion
Parrot in detail 55
Unified I/O & Event systemUnified I/O & Event system
• I/O requests and events are versions of the same thing
• Event generators are just autonomous streams
Parrot in detail 56
Subs and sub callingSubs and sub calling
• Several sub types• Regular subs• Closures• Co-routines• Continuations
• All done with CPS, though it can be hidden
• Caller-save scheme for easier tail-calls
Parrot in detail 57
Parrot has calling conventions
Parrot has calling conventions
• One standard set• All languages that want to
interoperate should use them• Only use them for globally exposed
routines• Terribly boring except when you
don’t have them
Parrot in detail 58
Common object interfaceCommon object interface
• The OO part of the calling conventions• Method calling and attribute
storage/retrieval is standardized• Method dispatch is ultimately delegated• Attribute storage is also ultimately
delegated• Support multimethod dispatch,
prototype checking, and runtime mutability
Parrot in detail 59
ThreadsThreads
• Three models• Shared dependent• Shared independent• Completely independent
• Built in interpreter thread safety• Primitives to allow for language-
level thread safety and inter-thread communication
Parrot in detail 60
IntrospectionIntrospection
• All language data structures are accessible from parrot programs• Scratchpads• Global variable tables• Stacks
• Interpreter level stuff is accessible• Statistics• Internal pools