numba - a dynamic python compiler for science
TRANSCRIPT
-
8/13/2019 Numba - A Dynamic Python Compiler for Science
1/39
Numba: A dynamic Python compiler for Science (i.e. for
NumPy and other typed containers)
March 16, 2013
Travis E. Oliphant, Jon Riehl
Mark Florisson, Siu Kwan Lam
Saturday, March 16, 13
-
Where I'm coming from
After  Before
$\rho_0 (2\pi f)^2 U_i(a, f) = [C_{ijkl}(a, f)\, U_{k,l}(a, f)]_{,j}$
-
1,000,000 to 2,000,000 users of NumPy!
-
NumFOCUS --- blatant ad!
www.numfocus.org
501(c)3 Public Charity
Join Us! http://numfocus.org/membership/
-
Code that users might write
$x_i = \sum_{j=0}^{i-1} k_{i-j,j}\, a_{i-j}\, a_j$
$O = I \star F$
Slow!!!!
-
Why is Python slow?
1. Dynamic typing
2. Attribute lookups
3. NumPy get-item (a[...])
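A small plain-Python sketch may make the first two costs concrete. The function names here are illustrative and do not come from the talk: `add` shows that the interpreter must decide what `+` means at run-time, and the two `norm_*` functions differ only in whether the `math.sqrt` attribute lookup happens on every iteration.

```python
import math


def add(a, b):
    # The interpreter inspects the run-time types of a and b on every call
    # before it can choose an implementation of "+".
    return a + b


def norm_slow(values):
    total = 0.0
    for v in values:
        # math.sqrt is looked up on the module object in every iteration.
        total += math.sqrt(v * v)
    return total


def norm_hoisted(values):
    sqrt = math.sqrt  # attribute lookup done once, outside the loop
    total = 0.0
    for v in values:
        total += sqrt(v * v)
    return total
```

A compiler that knows the argument types up front can remove both kinds of run-time work; in pure Python only the attribute lookup can be hoisted by hand.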
-
What are Scientists doing Now?
Writing critical parts in C/C++/Fortran and wrapping with
  SWIG
  ctypes
  Cython
  f2py (or fwrap)
  hand-coded wrappers
Writing new code in Cython directly
  Cython is modified Python with type information everywhere.
  It produces a C-extension module which is then compiled.
-
Cython is the most popular
these days. But speeding up NumPy-based codes should be
even easier!
-
A NumPy array is a typed container
[Figure: NumPy array layout; only the label "shape" is recoverable]
-
Let's use this!
NumPy users are already using typed containers with regular storage and access patterns. There is plenty of information to optimize the code if we either:
Provide type information for function inputs (jit)
Create a call-site for each function that compiles and caches the result the first time it gets called with new types (autojit)
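The second option, compiling and caching per argument-type signature, can be sketched in plain Python. This is only an illustration of the caching idea: `specialize` and `compile_for` are hypothetical names, not Numba's implementation, and the "compilation" step simply reuses the Python function.

```python
def specialize(func):
    """Hypothetical stand-in for an autojit-style decorator (not Numba's code)."""
    cache = {}  # maps a tuple of argument types to a "compiled" callable

    def compile_for(signature):
        # A real compiler would generate machine code for this signature here.
        # This sketch just returns the original Python function.
        return func

    def dispatcher(*args):
        signature = tuple(type(a) for a in args)
        if signature not in cache:
            # First call with these argument types: compile once and remember it.
            cache[signature] = compile_for(signature)
        return cache[signature](*args)

    dispatcher.cache = cache
    return dispatcher


@specialize
def scale(x, factor):
    return x * factor
```

Calling `scale(2, 3)` and then `scale(4, 5)` fills one cache entry, while `scale(2.0, 3)` adds a second one. The per-call type check is the extra dispatch overhead that supplying an explicit signature avoids.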
-
Requirements Part I
Work with CPython (we need the full scientific Python stack!)
Minimal modifications to code (use type inference)
Programmer control over what and when to jit
Ability to build static extensions (for libraries)
Fall back to Python C-API for object types.
-
Requirements Part II
Produce code as fast as C (maybe even Fortran)
Support NumPy array-expressions and be able to produce universal functions (e.g. y = sin(x))
Provide a tool that could adapt to provide parallelism and produce code for modern vector hardware (GPUs, accelerators, and many-core machines)
-
Do we have to write the full compiler??
No!
LLVM has done much heavy lifting
LLVM = Compilers for everybody
-
Face of a modern compiler
[Figure: front-ends (C++, C, Fortran, ObjC) -> Parsing -> Intermediate Representation (IR) -> Code Generation -> back-ends (x86, ARM, PTX)]
-
Face of a modern compiler
[Figure: Python -> Numba (parsing front-end) -> Intermediate Representation (IR) -> LLVM (code generation back-end) -> x86, ARM, PTX]
-
Example
Numba
-
NumPy + Mamba = Numba
[Figure: Python Function -> Numba (LLVMPY on top of the LLVM Library) -> Machine Code. Labels around LLVM: Intel, Nvidia, Apple, AMD, ARM; OpenCL, ISPC, CUDA, CLANG, OpenMP]
-
Simple API
jit --- provide type information (fastest to call at run-time)
autojit --- detects input types, infers output, generates code if needed, and dispatches (a little more run-time call overhead)

    from numba import jit, autojit

    #@jit('void(double[:,:], double, double)')
    @autojit
    def numba_update(u, dx2, dy2):
        nx, ny = u.shape
        for i in xrange(1,nx-1):
            for j in xrange(1, ny-1):
                u[i,j] = ((u[i+1,j] + u[i-1,j]) * dy2 +
                          (u[i,j+1] + u[i,j-1]) * dx2) / (2*(dx2+dy2))

Comment out one of jit or autojit (don't use together)
-
Example
    import numba
    from numba import f8
    from math import sin, pi

    @numba.jit(f8(f8))
    def sinc(x):
        if x == 0.0:
            return 1.0
        else:
            return sin(x*pi)/(pi*x)
Numba
-
~150x speed-up
Real-time image processing (50 fps Mandelbrot)
-
Speeding up Math Expressions
$x_i = \sum_{j=0}^{i-1} k_{i-j,j}\, a_{i-j}\, a_j$
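The kind of loop a user might write for this expression is sketched below. It assumes the garbled formula on the slide is the sum x_i = sum over j from 0 to i-1 of k[i-j][j] * a[i-j] * a[j]; that reading, and the function name `compute_x`, are assumptions rather than something the transcript states. Nested loops with scalar indexing like this are the pattern that is slow in pure Python.

```python
def compute_x(k, a):
    """Evaluate x[i] = sum_{j=0}^{i-1} k[i-j][j] * a[i-j] * a[j] (assumed formula)."""
    n = len(a)
    x = [0.0] * n
    for i in range(n):
        total = 0.0
        for j in range(i):  # j runs over 0 .. i-1, so x[0] stays 0.0
            total += k[i - j][j] * a[i - j] * a[j]
        x[i] = total
    return x
```

With `a = [1.0, 2.0, 3.0]` and `k` filled with ones, the terms are easy to check by hand: x[1] = a[1]*a[0] and x[2] = a[2]*a[0] + a[1]*a[1].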
-
Image Processing
    from numba import jit

    @jit('void(f8[:,:],f8[:,:],f8[:,:])')
    def filter(image, filt, output):
        M, N = image.shape
        m, n = filt.shape
        for i in range(m//2, M-m//2):
            for j in range(n//2, N-n//2):
                result = 0.0
                for k in range(m):
                    for l in range(n):
                        result += image[i+k-m//2,j+l-n//2]*filt[k, l]
                output[i,j] = result

~1500x speed-up
-
Compile NumPy array expressions
    from numba import autojit

    @autojit
    def formula(a, b, c):
        a[1:,1:] = a[1:,1:] + b[1:,:-1] + c[1:,:-1]

    @autojit
    def express(m1, m2):
        m2[1:-1:2,0,...,::2] = (m1[1:-1:2,...,::2] *
                                m1[-2:1:-2,...,::2])
        return m2
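For reference, the sliced assignment in `formula` corresponds to the explicit loop nest below, written here for 2-D nested lists so that it runs without NumPy. The name `formula_loops` is illustrative; this is a sketch of what the expression means element by element, not a description of the code Numba generates.

```python
def formula_loops(a, b, c):
    """Loop form of a[1:,1:] = a[1:,1:] + b[1:,:-1] + c[1:,:-1] for 2-D nested lists."""
    rows, cols = len(a), len(a[0])
    for i in range(1, rows):
        for j in range(1, cols):
            # b[1:, :-1] and c[1:, :-1] are shifted one column to the left
            # relative to a[1:, 1:], hence the j - 1 index.
            a[i][j] = a[i][j] + b[i][j - 1] + c[i][j - 1]
    return a
```

A compiler that sees the whole expression can emit a single fused loop like this, instead of materializing a temporary array for each `+`.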
-
Fast vectorize
NumPy's ufuncs take kernels and apply the kernel element-by-element over entire arrays
Write kernels in Python!

    from numba.vectorize import vectorize
    from numba import f8, f4
    from math import sin, pi

    @vectorize([f8(f8), f4(f4)])
    def sinc(x):
        if x == 0.0:
            return 1.0
        else:
            return sin(x*pi)/(pi*x)
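What "apply the kernel element-by-element" means can be shown without Numba. In the sketch below, `apply_elementwise` is a hypothetical helper that mimics the idea for a flat list; it ignores broadcasting, output allocation, and per-type loops, which real ufuncs handle.

```python
from math import sin, pi


def sinc(x):
    # Scalar kernel: written for one element at a time.
    if x == 0.0:
        return 1.0
    return sin(x * pi) / (pi * x)


def apply_elementwise(kernel, values):
    """Hypothetical helper: call the scalar kernel once per element."""
    return [kernel(v) for v in values]


samples = apply_elementwise(sinc, [0.0, 0.5, 1.0])
```

The point of the slide is that the scalar kernel stays this simple, while the decorator supplies a compiled loop over whole arrays.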
-
Case-study -- j0 from scipy.special
scipy.special was one of the first libraries I wrote
extended umath module by adding new universal functions to compute many scientific functions by wrapping C and Fortran libs.
Bessel functions are solutions to a differential equation:
$$x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} + (x^2 - \alpha^2)\, y = 0$$
$$y = J_\alpha(x)$$
$$J_n(x) = \frac{1}{\pi} \int_0^{\pi} \cos(n\tau - x \sin\tau)\, d\tau$$
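The integral representation for integer order, J_n(x) = (1/pi) * integral from 0 to pi of cos(n*tau - x*sin(tau)) d tau, can be checked numerically in a few lines. This is a sketch using the trapezoid rule, with `bessel_j` and `steps` as illustrative names; it is not the cephes algorithm that scipy.special.j0 wraps.

```python
from math import cos, sin, pi


def bessel_j(n, x, steps=2000):
    """Approximate J_n(x) for integer n with the trapezoid rule on [0, pi]."""
    h = pi / steps
    # Endpoint terms (tau = 0 and tau = pi) carry weight 1/2.
    total = 0.5 * (cos(0.0) + cos(n * pi - x * sin(pi)))
    for k in range(1, steps):
        tau = k * h
        total += cos(n * tau - x * sin(tau))
    return total * h / pi
```

Known values make a quick sanity check: J_0(0) = 1, J_1(0) = 0, and J_0 vanishes at its first zero near x = 2.404825557695773.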
-
scipy.special.j0 wraps cephes algorithm
-
Result --- equivalent to compiled code
    In [6]: %timeit vj0(x)
    10000 loops, best of 3: 75 us per loop
    In [7]: from scipy.special import j0
    In [8]: %timeit j0(x)
    10000 loops, best of 3: 75.3 us per loop

But! Now code is in Python and can be experimented with more easily (and moved to the GPU / accelerator more easily)!
-
Laplace Example
    from numba import jit

    @jit('void(double[:,:], double, double)')
    def numba_update(u, dx2, dy2):
        nx, ny = u.shape
        for i in xrange(1,nx-1):
            for j in xrange(1, ny-1):
                u[i,j] = ((u[i+1,j] + u[i-1,j]) * dy2 +
                          (u[i,j+1] + u[i,j-1]) * dx2) / (2*(dx2+dy2))

Adapted from http://www.scipy.org/PerformancePython originally by Prabhu Ramachandran

    @jit('void(double[:,:], double, double)')
    def numbavec_update(u, dx2, dy2):
        u[1:-1,1:-1] = ((u[2:,1:-1]+u[:-2,1:-1])*dy2 +
                        (u[1:-1,2:] + u[1:-1,:-2])*dx2) / (2*(dx2+dy2))
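The update formula can be tried on a tiny grid without Numba or NumPy. The sketch below uses nested lists and the illustrative name `laplace_update`; it applies the same arithmetic as the loop version on the slide.

```python
def laplace_update(u, dx2, dy2):
    """One in-place sweep over the interior points of a nested-list grid."""
    nx, ny = len(u), len(u[0])
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            u[i][j] = ((u[i + 1][j] + u[i - 1][j]) * dy2 +
                       (u[i][j + 1] + u[i][j - 1]) * dx2) / (2 * (dx2 + dy2))
    return u


# 4x3 grid: top boundary row held at 1.0, everything else 0.0
grid = [[1.0, 1.0, 1.0],
        [0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0]]
laplace_update(grid, 1.0, 1.0)
```

After one sweep `grid[1][1]` is 0.25 and `grid[2][1]` is 0.0625. The second value is nonzero because the loop reads the already-updated neighbour from the same sweep. In plain NumPy, the sliced form evaluates the whole right-hand side from the old values before assigning, so the two variants need not agree after a single sweep.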
-
Results of Laplace example
Version         Time   Speed Up
NumPy           3.19   1.0
Numba           2.32   1.38
Vect. Numba     2.33   1.37
Cython          2.38   1.34
Weave           2.47   1.29
Numexpr         2.62   1.22
Fortran Loops   2.30   1.39
Vect. Fortran   1.50   2.13
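The speed-up column appears to be the NumPy time divided by each version's time. The short sketch below recomputes it from the times on the slide; the variable names are illustrative and the transcript does not state the unit of the times.

```python
# Times copied from the slide (unit not stated in the transcript).
times = {
    "NumPy": 3.19,
    "Numba": 2.32,
    "Vect. Numba": 2.33,
    "Cython": 2.38,
    "Weave": 2.47,
    "Numexpr": 2.62,
    "Fortran Loops": 2.30,
    "Vect. Fortran": 1.50,
}

baseline = times["NumPy"]
# Speed-up relative to the NumPy version: baseline time / version time.
speedups = {name: baseline / t for name, t in times.items()}
```

The recomputed ratios match the slide's column to two decimal places, which supports reading it as a speed-up relative to NumPy.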
https://github.com/teoliphant/speed.git
https://github.com/scipy/speed.git
-
Numba can change the game!
[Figure: front-ends (C++, C, Fortran, Python) -> LLVM IR -> back-ends (x86, ARM, PTX)]
Numba turns Python into a compiled language (but much more flexible). You don't have to reach for C/C++
-
Many More Advanced Features
Extension classes (jit a class --- autojit coming soon!)
Struct support (NumPy arrays can be structs)
SSA --- can refer to local variables as different types
Typed lists and typed dictionaries and sets coming soon!
pointer support
calling ctypes and CFFI functions natively
pycc (create stand-alone dynamic library and executable)
pycc --python (create static extension module for Python)
-
Uses of Numba
[Figure: Python Function -> Numba -> function pointer -> Framework accepting dynamic function pointers]
Ufuncs
Generalized UFuncs
Function-based Indexing
Memory Filters
Window Kernel Funcs
I/O Filters
Reduction Filters
Computed Columns
-
Accelerate/NumbaPro -- blatant ad!
Python and NumPy compiled to Parallel Architectures (GPUs and multi-core machines)
Create parallel-for loops
Parallel execution of ufuncs
Run ufuncs on the GPU
Write CUDA directly in Python!
Free for Academics
fast development and fast execution!
Currently premium features will be contributed to open-source over time!
-
Numba Development
1260 Mark Florisson
203 Jon Riehl
181 Siu Kwan Lam
110 Travis E. Oliphant
30 Dag Sverre Seljebotn
28 Hernan Grecco
19 Ilan Schnell
11 Mark Wiebe
8 James Bergstra
4 Alberto Valverde
3 Thomas Kluyver
2 Maggie Mari
2 Dan Yamins
2 Dan Christensen
1 timo
1 Yaroslav Halchenko
1 Phillip Cloud
1 Ondřej Čertík
1 Martin Spacek
1 Lars Buitinck
1 Juan Luis Cano Rodríguez
git log --format=format:%an | sort | uniq -c | sort -r
Siu
Mark
Jon
-
Milestone Roadmap
Rapid progress this year
Still some bugs -- needs users!
Version 0.7 end of Feb.
Version 0.8 in April
Version 0.9 June
Version 1.0 by end of August
Stable API (jit, autojit) easy to use
Should be able to write equivalent of NumPy and SciPy with Numba and memory-views.
http://numba.pydata.org
http://llvmpy.org
http://compilers.pydata.org
We need you:
your use-cases
your tests
developer help
-
Architectural Overview
[Figure: Python Source -> Python Parser -> Python AST -> Numba Stage 1 ... Numba Stage n -> Numba Code Generator -> LLVM. Other labels: Numba Environment, Numba AST]
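The staged flow in this overview (parse to an AST, run a sequence of AST stages, then generate code) can be illustrated with the standard `ast` module. Everything in the sketch is an assumption for illustration: `FoldConstants` and `run_pipeline` are made-up names, the single stage only folds additions of numeric literals, and the built-in `compile` stands in for Numba's LLVM code generator.

```python
import ast


class FoldConstants(ast.NodeTransformer):
    """Toy pipeline stage: replace `<number> + <number>` with its value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform children first
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, (int, float))
                and isinstance(node.right.value, (int, float))):
            folded = ast.Constant(node.left.value + node.right.value)
            return ast.copy_location(folded, node)
        return node


def run_pipeline(source, stages):
    tree = ast.parse(source)            # Python source -> Python AST
    for stage in stages:                # stage 1 .. stage n, each rewrites the tree
        tree = stage.visit(tree)
    ast.fix_missing_locations(tree)
    # Stand-in for code generation: compile to CPython bytecode, not LLVM.
    return compile(tree, "<pipeline>", "exec")


code = run_pipeline("def f(x):\n    return x + (1 + 2)\n", [FoldConstants()])
namespace = {}
exec(code, namespace)
result = namespace["f"](10)
```

Keeping each stage as a separate tree-to-tree transform is what makes it possible to add, reorder, or replace stages independently.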
-
Numba Architecture
Entry points: /numba/decorators.py
Environment: /numba/environment.py
Pipeline: /numba/pipeline.py
Code generation: /numba/codegen/...
-
Development Roadmap
Better stage separation, better modularity
Untyped Intermediate Representation (IR)
Typed IR
Specialized IR
Module level entry points
Better Array Specialization
-
Community Involvement
~/git/numba$ wc AUTHORS
25 88 1470 AUTHORS
(4 lines are blank or instructions)
Github https://github.com/numba/numba
Mailing list --- [email protected]
Sprints --- contact Jon Riehl
Examples:
Hernan Grecco just contributed Python 3 support (Yeah!)
Dag collaborating on autojit classes with Mark F.
We need you to show off your amazing demo!