building languages for the jvm - startechconf 2011

Building Languages for the JVM Charles Oliver Nutter Friday, November 4, 2011

Upload: charles-nutter

Post on 15-May-2015


Category:

Technology



DESCRIPTION

Talk on how JRuby is constructed and what it gets out of the JVM from StarTechConf 2011 in Santiago, Chile.

TRANSCRIPT

  • 1. Building Languages for the JVM
    Charles Oliver Nutter
    Friday, November 4, 2011

2. Me
    Charles Oliver Nutter
    JRuby and JVM guy
    headius on most services
    [email protected]

3. Who Was I?
    Java EE architect
    Successfully!
    Never wrote a parser
    Never wrote a compiler
    But I wanted to learn how...

4. You
    Java?
    Ruby?
    Python?
    C#?
    Other?

5. Why Create Languages?
    Nothing is perfect
    New problems need new solutions
    Language design can be fun
    Fame and fortune?
    Not really.

6. Why Impl a Language?
    To learn it?
    Sort of...
    To learn the target platform
    Definitely!
    Fame and fortune?
    Well...getting there...

7. Challenges
    Community
    Platform
    Specifications
    Resources

8. Community
    Investment in status quo
    Afraid to stand out
    Known quantities
    Everything else sucks
    Gotta get paid!

9. Platform
    Matching language semantics
    JVM designed around Java
    JVM hides underlying platform
    Challenging to use
    Not bad...C/++ would be way worse
    Community may hate it ;-)

10. Specifications
    Incomplete
        Ruby had none for years
        ...and no complete test suites
    Difficult to implement
        Low level features
        Single-implementation quirks
    Hard or impossible to optimize

11. Resources
    You gotta eat
    Not much money in language work
    Some parts are hard
    OSS is a necessity

12. Why JVM?
    Because I am lazy
    Because VMs are *hard*
    Because I can't be awesome at everything

13. Ok, Why Really?
    Cross-platform
    Libraries
    Languages
    Memory management
    Tools
    OSS

14. Cross-platform
    OpenJDK: Linux, Windows, Solaris, OS X, xBSD
    J9: Linux, zLinux, AS/400, ...
    HP: OpenVMS, HP/UX, ...
    Dalvik (Android): Linux on ARM, x86

15. Libraries
    For any need, a dozen libraries
    And a couple of them are good!
    Cross-platform
    Leading edge

16. Selection of languages
    Java
    Scala
    Clojure
    JRuby
    Mirah
    Jython, Groovy, Fantom, Kotlin, Ceylon, ...

17. Memory management
    Best GCs in the world
    Fastest object allocation
    Safe escape hatches like NIO

18. Tools
    Debugging
    Profiling
    Monitoring
    JVM internals

19. Open source?
    FOSS reference impl (OpenJDK)
    Mostly OSS libraries
    Heavy OSS culture
    Strong OSS influence in OpenJDK core

20. Case Study: JRuby

21. Ruby on the JVM
    All of Ruby's power and beauty
    Solid VM underneath
    Just another JVM language

22. JVM Language
    Full interop with Java
        Tricky to do...
        Very rewarding for 99% case
    VM concerns solved
        No need to write a GC
        No need to write a JIT
        ...oh, but wait...

23. More than a JVM language
    Use native code where JDK fails us
    Paper over ugly bits like CLASSPATH
    Matching Ruby semantics exactly*
    Push JVM forward too!

24. Playing with JRuby
    Simple IRB demo
    JRuby on Rails - see Janos talk tomorrow
    JRuby performance
    PotC (???)

25. How did we do it?

26. JRuby Architecture
    Parser
    Abstract Syntax Tree (AST)
    Intermediate Representation (IR)
    Core classes
    Compiler

27. Parser
    Port of MRI's Bison grammar
    Jay parser generator for Java
    Hand-written lexer
    Nearly as fast as the C version
    ...once it gets going

28.
    system ~/projects/jruby $ jruby -y -e "1 + 1"
    push    state 0 value null
    reduce  state 0 uncover 0 rule (1) $$1 :
    goto    from state 0 to 2
    push    state 2 value null
    lex     state 2 reading tIDENTIFIER value Token { Value=load, Position=file:/Users/headius/projects/jruby/lib/jruby.jar!/jruby/kernel.rb:6}
    shift   from state 2 to 33
    push    state 33 value Token { Value=load, Position=file:/Users/headius/projects/jruby/lib/jruby.jar!/jruby/kernel.rb:6}
    lex     state 33 reading tSTRING_BEG value Token { Value=, Position=file:/Users/headius/projects/jruby/lib/jruby.jar!/jruby/kernel.rb:6}
    reduce  state 33 uncover 2 rule (487) operation : tIDENTIFIER
    goto    from state 2 to 62
    push    state 62 value Token { Value=load, Position=file:/Users/headius/projects/jruby/lib/jruby.jar!/jruby/kernel.rb:6}
    reduce  state 62 uncover 62 rule (252) $$6 :

29. You will never need this.

30.
    public class RubyYaccLexer {
        public static final Encoding UTF8_ENCODING = UTF8Encoding.INSTANCE;
        public static final Encoding USASCII_ENCODING = USASCIIEncoding.INSTANCE;
        public static final Encoding ASCII8BIT_ENCODING = ASCIIEncoding.INSTANCE;

        private static ByteList END_MARKER = new ByteList(new byte[] {'_', 'E', 'N', 'D', '_', '_'});
        private static ByteList BEGIN_DOC_MARKER = new ByteList(new byte[] {'b', 'e', 'g', 'i', 'n'});
        private static ByteList END_DOC_MARKER = new ByteList(new byte[] {'e', 'n', 'd'});

        private static final HashMap map;

        static {
            map = new HashMap();
            map.put("end", Keyword.END);
            map.put("else", Keyword.ELSE);
            map.put("case", Keyword.CASE);
            map.put("ensure", Keyword.ENSURE);
            map.put("module", Keyword.MODULE);
            map.put("elsif", Keyword.ELSIF);
            map.put("def", Keyword.DEF);
            map.put("rescue", Keyword.RESCUE);
            map.put("not", Keyword.NOT);
            map.put("then", Keyword.THEN);
            map.put("yield", Keyword.YIELD);
            map.put("for", Keyword.FOR);
            map.put("self", Keyword.SELF);
            map.put("false", Keyword.FALSE);

31.
    public enum Keyword {
        END ("end", Tokens.kEND, Tokens.kEND, LexState.EXPR_END),
        ELSE ("else", Tokens.kELSE, Tokens.kELSE, LexState.EXPR_BEG),
        CASE ("case", Tokens.kCASE, Tokens.kCASE, LexState.EXPR_BEG),
        ENSURE ("ensure", Tokens.kENSURE, Tokens.kENSURE, LexState.EXPR_BEG),
        MODULE ("module", Tokens.kMODULE, Tokens.kMODULE, LexState.EXPR_BEG),
        ELSIF ("elsif", Tokens.kELSIF, Tokens.kELSIF, LexState.EXPR_BEG),
        DEF ("def", Tokens.kDEF, Tokens.kDEF, LexState.EXPR_FNAME),
        RESCUE ("rescue", Tokens.kRESCUE, Tokens.kRESCUE_MOD, LexState.EXPR_MID),
        NOT ("not", Tokens.kNOT, Tokens.kNOT, LexState.EXPR_BEG),
        THEN ("then", Tokens.kTHEN, Tokens.kTHEN, LexState.EXPR_BEG),
        YIELD ("yield", Tokens.kYIELD, Tokens.kYIELD, LexState.EXPR_ARG),
        FOR ("for", Tokens.kFOR, Tokens.kFOR, LexState.EXPR_BEG),
        SELF ("self", Tokens.kSELF, Tokens.kSELF, LexState.EXPR_END),
        FALSE ("false", Tokens.kFALSE, Tokens.kFALSE, LexState.EXPR_END),
        RETRY ("retry", Tokens.kRETRY, Tokens.kRETRY, LexState.EXPR_END),
        RETURN ("return", Tokens.kRETURN, Tokens.kRETURN, LexState.EXPR_MID),
        TRUE ("true", Tokens.kTRUE, Tokens.kTRUE, LexState.EXPR_END),
        IF ("if", Tokens.kIF, Tokens.kIF_MOD, LexState.EXPR_BEG),
        DEFINED_P ("defined?", Tokens.kDEFINED, Tokens.kDEFINED, LexState.EXPR_ARG),

32.
    private int yylex() throws IOException {
        int c;
        boolean spaceSeen = false;
        boolean commandState;

        if (lex_strterm != null) {
            int tok = lex_strterm.parseString(this, src);
            if (tok == Tokens.tSTRING_END || tok == Tokens.tREGEXP_END) {
                lex_strterm = null;
                setState(LexState.EXPR_END);
            }
            return tok;
        }

        commandState = commandStart;
        commandStart = false;

        loop: for(;;) {
            c = src.read();
            switch(c) {

33.
            case '>':
                return greaterThan();
            case '"':
                return doubleQuote();
            case '`':
                return backtick(commandState);
            case '\'':
                return singleQuote();
            case '?':
                return questionMark();
            case '&':
                return ampersand(spaceSeen);
            case '|':
                return pipe();
            case '+':
                return plus(spaceSeen);

34.
    private int lessThan(boolean spaceSeen) throws IOException {
        int c = src.read();
        if (c == '<' && lex_state != LexState.EXPR_DOT && lex_state != LexState.EXPR_CLASS &&
                !isEND() && (!isARG() || spaceSeen)) {
            int tok = hereDocumentIdentifier();
            if (tok != 0) return tok;
        }
        determineExpressionState();
        switch (c) {
        case '=':
            if ((c = src.read()) == '>') {
                yaccValue = new Token("<=>", getPosition());
                return Tokens.tCMP;

35.
    %%
    program : {
                  lexer.setState(LexState.EXPR_BEG);
                  support.initTopLocalVariables();
              } top_compstmt {
                  // ENEBO: Removed !compile_for_eval which probably is to reduce warnings
                  if ($2 != null) {
                      /* last expression should not be void */
                      if ($2 instanceof BlockNode) {
                          support.checkUselessStatement($2.getLast());
                      } else {
                          support.checkUselessStatement($2);
                      }
                  }
                  support.getResult().setAST(support.addRootNode($2, support.getPosition($2)));
              }

36.
    stmt : kALIAS fitem {
               lexer.setState(LexState.EXPR_FNAME);
           } fitem {
               $$ = support.newAlias($1.getPosition(), $2, $4);
           }
         | kALIAS tGVAR tGVAR {
               $$ = new VAliasNode($1.getPosition(), (String) $2.getValue(), (String) $3.getValue());
           }
         | kALIAS tGVAR tBACK_REF {
               $$ = new VAliasNode($1.getPosition(), (String) $2.getValue(), "$" + $3.getType());
           }
         | kALIAS tGVAR tNTH_REF {
               support.yyerror("can't make alias for the number variables");
           }
         | kUNDEF undef_list {
               $$ = $2;
           }
         | stmt kIF_MOD expr_value {
               $$ = new IfNode(support.getPosition($1), support.getConditionNode($3), $1, null);
           }

37.
    public Object yyparse (RubyYaccLexer yyLex) throws java.io.IOException {
        ...
        if (yyTop >= yyStates.length) { // dynamically increase
            int[] i = new int[yyStates.length+yyMax];
            System.arraycopy(yyStates, 0, i, 0, yyStates.length);
            yyStates = i;
            Object[] o = new Object[yyVals.length+yyMax];
            System.arraycopy(yyVals, 0, o, 0, yyVals.length);
            yyVals = o;
        }
        yyStates[yyTop] = yyState;
        yyVals[yyTop] = yyVal;
        if (yydebug != null) yydebug.push(yyState, yyVal);

38.
    if (state == null) {
        yyVal = yyDefault(yyV > yyTop ? null : yyVals[yyV]);
    } else {
        yyVal = state.execute(support, lexer, yyVal, yyVals, yyTop);
    }

39.
    states[23] = new ParserState() {
        public Object execute(ParserSupport support, RubyYaccLexer lexer, Object yyVal, Object[] yyVals, int yyTop) {
            yyVal = new IfNode(support.getPosition(((Node)yyVals[-2+yyTop])), support.getConditionNode(((Node)yyVals[0+yyTop])), ((Node)yyVals[-2+yyTop]), null);
            return yyVal;
        }
    };
    states[24] = new ParserState() {
        public Object execute(ParserSupport support, RubyYaccLexer lexer, Object yyVal, Object[] yyVals, int yyTop) {
            yyVal = new IfNode(support.getPosition(((Node)yyVals[-2+yyTop])), support.getConditionNode(((Node)yyVals[0+yyTop])), null, ((Node)yyVals[-2+yyTop]));
            return yyVal;
        }
    };

    Never look at this.

40. AST
    Interpreted directly
    Specialized in places
    Large and rich

41.
    $ ast -e "a = true; if a; 2; else; 3; end"
    AST:
    RootNode 0
      BlockNode 0
        NewlineNode 0
          LocalAsgnNode:a 0
            TrueNode:true 0
        NewlineNode 0
          IfNode 0
            LocalVarNode:a 0
            NewlineNode 0
              FixnumNode 0
            NewlineNode 0
              FixnumNode 0

42.
    public class IfNode extends Node {
        private final Node condition;
        private final Node thenBody;
        private final Node elseBody;

        public IfNode(ISourcePosition position, Node condition, Node thenBody, Node elseBody) {
            super(position);

            assert condition != null : "condition is not null";
            //assert thenBody != null : "thenBody is not null";
            //assert elseBody != null : "elseBody is not null";

            this.condition = condition;
            this.thenBody = thenBody;
            this.elseBody = elseBody;
        }

43.
    @Override
    public IRubyObject interpret(Ruby runtime, ThreadContext context, IRubyObject self, Block aBlock) {
        ISourcePosition position = getPosition();

        context.setFile(position.getFile());
        context.setLine(position.getStartLine());

        IRubyObject result = condition.interpret(runtime, context, self, aBlock);

        if (result.isTrue()) {
            return thenBody == null ? runtime.getNil() : thenBody.interpret(runtime, context, self, aBlock);
        } else {
            return elseBody == null ? runtime.getNil() : elseBody.interpret(runtime, context, self, aBlock);
        }
    }

44. IR (future work)
    Control flow graph
    Ruby-specific instruction set
    Optimizing compiler
    Ruby-level optimizations

45.
    jruby -e '1 + 1'

46.
    2011-11-04T05:23:09.375-03:00: IR_Printer: instrs:
    0 %self = recv_self
    1 %block(0:0) = recv_closure
    2 file_name(-e)
    3 line_num(0)
    4 %v_0 = call(+, 1:fixnum, [1:fixnum])
    5 return(%v_0)
    2011-11-04T05:23:09.375-03:00: IR_Printer: live variables:
    %v_0: 4-5

47.
    a = 1; while a < 10; puts a; a += 1; end

48.
    2011-11-04T05:25:23.517-03:00: IR_Printer: instrs:
    0 %self = recv_self
    1 %block(0:0) = recv_closure
    2 file_name(-e)
    3 line_num(0)
    4 a(0:1) = 1:fixnum
    5 _LOOP_BEGIN_0:
    6 %v_1 = call([7], [8,2] BB [2:LBL_2]:>[3], [8],