Describes the construction of the Python compiler by taking apart the Unladen Swallow codebase.
An Insight into CPython Compiler Design
Ramkumar Ramachandra
FOSS.IN/2009
03 December 2009
Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 1 / 12
Short Discussion of the CPython Compiler
1 Short Discussion of the CPython Compiler
2 A Gentle Introduction to LLVM
3 Enter: Unladen Swallow
4 Conclusion
Short Discussion of the CPython Compiler
How Python is compiled
Do the boring grammar parsing
Compile the parse tree to bytecode
Apply optimizations
Interpret the bytecode
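The four steps above can be observed from Python itself. What follows is a minimal sketch, assuming a modern CPython 3 interpreter rather than the 2.x-era one the talk describes; the source string and variable names are illustrative only.

```python
import ast

source = "a, b = 1, 0\nresult = a or b\n"

# Step 1: parse the source text into an abstract syntax tree.
tree = ast.parse(source, filename="<demo>", mode="exec")

# Steps 2 and 3: compile the tree to a code object. Whatever
# optimizations the running CPython applies happen inside this call.
code = compile(tree, filename="<demo>", mode="exec")

# Step 4: hand the resulting bytecode to the interpreter loop.
namespace = {}
exec(code, namespace)

print(type(tree).__name__)   # Module
print(type(code).__name__)   # code
print(namespace["result"])   # 1
```

The raw bytecode lives in `code.co_code`; the parse tree that precedes the AST is not exposed by this route.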
Short Discussion of the CPython Compiler
The various stages of compilation
PyAST_FromNode() in Python/ast.c | Parse tree → AST
PyAST_Compile() in compile.c | AST → CFG → Bytecode
PyAST_Compile() calls PySymtable_Build() and compiler_mod() | AST → CFG
assemble() | Post-order DFS | CFG → Bytecode
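The symbol-table pass in this pipeline has a Python-level counterpart in the standard `symtable` module. The sketch below assumes CPython 3 and uses a made-up source snippet (the `greet` function is hypothetical); it only illustrates the kind of information that pass collects and does not reproduce the C code.

```python
import symtable

source = (
    "a, b = 1, 0\n"
    "def greet(name):\n"
    "    return 'Hello ' + name + str(a)\n"
)

# Build the symbol table for the module.
top = symtable.symtable(source, "<demo>", "exec")

module_names = sorted(sym.get_name() for sym in top.get_symbols())
print(module_names)  # ['a', 'b', 'greet']

# Each function body gets its own nested table.
greet_table = top.get_children()[0]
for sym in greet_table.get_symbols():
    kind = "parameter" if sym.is_parameter() else "non-parameter"
    print(sym.get_name(), kind)
```

The compiler needs this scope information before emitting code because it decides which load/store opcode each name gets.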
Short Discussion of the CPython Compiler
What the final bytecode looks like
a, b = 1, 0
if a or b:
    print "Hello", a
  1           0 LOAD_CONST               4 ((1, 0))
              3 UNPACK_SEQUENCE          2
              6 STORE_NAME               0 (a)
              9 STORE_NAME               1 (b)

  2          12 LOAD_NAME                0 (a)
             15 JUMP_IF_TRUE             7 (to 25)
             18 POP_TOP
             19 LOAD_NAME                1 (b)
             22 JUMP_IF_FALSE           13 (to 38)
        >>   25 POP_TOP

  3          26 LOAD_CONST               2 ('Hello')
             29 PRINT_ITEM
             30 LOAD_NAME                0 (a)
             33 PRINT_ITEM
             34 PRINT_NEWLINE
             35 JUMP_FORWARD             1 (to 39)
        >>   38 POP_TOP
        >>   39 LOAD_CONST               3 (None)
             42 RETURN_VALUE
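A listing like the one above can be produced with the standard `dis` module. This is a minimal sketch assuming CPython 3, so the `print` statement becomes a function call, and the opcode names and offsets will differ from the 2.x listing on the slide.

```python
import dis

source = "a, b = 1, 0\nif a or b:\n    print('Hello', a)\n"
code = compile(source, "<demo>", "exec")

# Human-readable listing, analogous to the one on the slide.
dis.dis(code)

# The same information as structured data, one entry per instruction.
opnames = [ins.opname for ins in dis.get_instructions(code)]
```

Because bytecode is an implementation detail, the exact instruction sequence varies between CPython releases.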
Short Discussion of the CPython Compiler
Execute the bytecode
PyObject *PyEval_EvalFrameEx(PyFrameObject *f, int throwflag) {
    PyObject *result;
    result = PyEval_EvalFrame(f);
    return result;
}

PyObject *PyEval_EvalFrame(PyFrameObject *f)
{
    register PyObject **stack_pointer;  /* Next free slot */
    register unsigned char *next_instr;
    register int opcode;                /* Current opcode */
    register int oparg;                 /* Current opcode argument, if any */
    PyObject *retval = NULL;            /* Return value */
    PyCodeObject *co;                   /* Code object */
    /* ... */
}
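The declarations above hint at the shape of the loop: fetch an opcode and its argument, then act on a value stack. The following toy interpreter is a sketch of that idea only. Its instruction format (a list of `(opcode, oparg)` pairs) and the function name `eval_frame` are assumptions made for illustration, not CPython's actual encoding or API.

```python
def eval_frame(instructions, consts):
    """Run a list of (opcode, oparg) pairs on a value stack."""
    stack = []        # plays the role of stack_pointer
    next_instr = 0    # index of the next instruction to fetch
    retval = None     # value handed back to the caller

    while True:
        # Fetch and decode.
        opcode, oparg = instructions[next_instr]
        next_instr += 1

        # Dispatch.
        if opcode == "LOAD_CONST":
            stack.append(consts[oparg])
        elif opcode == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "POP_TOP":
            stack.pop()
        elif opcode == "RETURN_VALUE":
            retval = stack.pop()
            break
        else:
            raise ValueError("unknown opcode: %r" % (opcode,))

    return retval


program = [
    ("LOAD_CONST", 0),
    ("LOAD_CONST", 1),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
print(eval_frame(program, consts=(40, 2)))  # 42
```

The real loop does the same dispatch with a C `switch` over numeric opcodes, which is the overhead a JIT aims to remove.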
A Gentle Introduction to LLVM
What is LLVM and why is it relevant?
Compiler infrastructure
Invents a new IR
Replaces lower levels of GCC
Provides static GCC-like compilation and JIT
Python frontend possible
Enter: Unladen Swallow
How Unladen Swallow started
Objective: To speed up CPython
Experiment with Psyco
Temporarily use VMgen for eval loop
Remove rarely used opcodes
Enter: Unladen Swallow
Compile Python bytecode to LLVM IR
extern "C" _LlvmFunction *
_PyCode_ToLlvmIr(PyCodeObject *code)
{
    _LlvmFunction *wrapper = new _LlvmFunction();
    /* fbuilder functions in llvm_fbuilder.cc */
    wrapper->lf_function = fbuilder.function();
    return wrapper;
}
Enter: Unladen Swallow
Changes to the eval loop
static int
mark_called_and_maybe_compile(PyCodeObject *co, PyFrameObject *f)
{
    int r;  /* Status of IR generation; error handling is not shown here. */
    co->co_hotness += 10;
    if (co->co_hotness > PY_HOTNESS_THRESHOLD) {
        if (co->co_llvm_function == NULL) {
            int target_optimization =
                std::max(Py_DEFAULT_JIT_OPT_LEVEL,
                         Py_OptimizeFlag);
            if (co->co_optimization < target_optimization) {
                // If the LLVM version of the function wasn't
                // created yet, setting the optimization level
                // will create it.
                r = _PyCode_ToOptimizedLlvmIr(co, target_optimization);
            }
        }
        if (co->co_native_function == NULL) {
            // Now try to JIT the IR function to machine code.
            co->co_native_function =
                _LlvmFunction_Jit(co->co_llvm_function);
        }
    }
    return 0;
}
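The control flow of the hotness check can be sketched in a few lines of Python. Everything here is hypothetical: the threshold value, the `CodeRecord` class, and `fake_jit` (which merely wraps the function) stand in for the real counters and the LLVM JIT. Only the increment of 10 per call is taken from the listing above.

```python
HOTNESS_PER_CALL = 10    # matches co_hotness += 10 in the listing
HOTNESS_THRESHOLD = 30   # hypothetical value chosen for the demo


class CodeRecord:
    """Hypothetical stand-in for the per-code-object JIT state."""

    def __init__(self, func):
        self.func = func
        self.hotness = 0
        self.native = None   # filled in once the function is "compiled"


def fake_jit(func):
    """Pretend to compile func; a real JIT would emit machine code."""
    def native(*args):
        return func(*args)
    return native


def call(record, *args):
    # Every call makes the function hotter.
    record.hotness += HOTNESS_PER_CALL
    # Compile once, the first time the threshold is exceeded.
    if record.hotness > HOTNESS_THRESHOLD and record.native is None:
        record.native = fake_jit(record.func)
    target = record.native if record.native is not None else record.func
    return target(*args)


square = CodeRecord(lambda x: x * x)
for n in range(1, 5):
    result = call(square, n)
    print(n, result, "native" if square.native else "interpreted")
```

With these numbers the first three calls stay interpreted and the fourth triggers compilation, so cold code never pays the compile cost.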
Enter: Unladen Swallow
Implement feedback-directed optimization
Optimize native code, not bytecode
Speed up builtin lookups / inline simple builtins
Don’t compile cold branches
Inline simple operators using type feedback
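The last item, inlining operators using type feedback, can be illustrated with a small Python sketch. It is not Unladen Swallow's implementation: the `AddSite` class, its warm-up count, and the bail-out policy are all assumptions chosen to show the pattern of recording operand types, specializing, and guarding.

```python
class AddSite:
    """One '+' site that records operand types and later specializes."""

    def __init__(self, warmup=3):
        self.seen = set()        # (left type, right type) pairs observed
        self.count = 0           # executions recorded on the generic path
        self.warmup = warmup     # hypothetical warm-up length
        self.specialized = False

    def run(self, left, right):
        if self.specialized:
            # Guard: the fast path is valid only for the observed types.
            if type(left) is int and type(right) is int:
                return int.__add__(left, right)
            # Guard failed: fall back to the generic path for good.
            self.specialized = False

        # Generic path: record feedback, then do the full dynamic add.
        self.seen.add((type(left), type(right)))
        self.count += 1
        if self.count >= self.warmup and self.seen == {(int, int)}:
            self.specialized = True
        return left + right


site = AddSite()
for i in range(3):
    site.run(i, 1)
print(site.specialized)      # True
print(site.run(2, 3))        # 5
print(site.run("a", "b"))    # ab
print(site.specialized)      # False
```

The guard is what keeps the specialization safe: the fast path assumes the types seen so far, and any other input falls back to the generic operation.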
Conclusion
References
[1] Abelson, H., Sussman, G. J., and Sussman, J. Structure and Interpretation of Computer Programs (SICP). The MIT Press, 1984.
[2] Aho, A. V., Sethi, R., and Ullman, J. D. Compilers: Principles, Techniques, and Tools. Pearson Education, Inc, 2006.
[3] Aycock, J. Compiling Little Languages in Python.
[4] Cannon, B. Design of the CPython Compiler, 2005.
[5] Ertl, M. A., Gregg, D., Krall, A., and Paysan, B. vmgen: a generator of efficient virtual machine interpreters. Software: Practice and Experience 32, 3 (2002), 265–294.
[6] Montanaro, S. A Peephole Optimizer for Python, 1998.
[7] Wang, D. C., Appel, A. W., Korn, J. L., and Serra, C. S. The Zephyr Abstract Syntax Description Language, 1997.
Conclusion
Contact information
Ramkumar Ramachandra
[email protected]
http://artagnon.com
Indian Institute of Technology, Kharagpur
Presentation source available on http://github.com/artagnon/foss.in