EP2017 - Bonobo ETL
![Page 1: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/1.jpg)
bonobo Simple ETL in Python 3.5+
![Page 2: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/2.jpg)
Romain Dorgueil @rdorgueil
CTO/Hacker in Residence Technical Co-founder
(Solo) Founder Eng. Manager
Developer
L’Atelier BNP Paribas WeAreTheShops RDC Dist. Agency Sensio/SensioLabs AffiliationWizard
Felt too young in a Linux Cauldron Dismantler of Atari computers
Basic literacy using a Minitel Guitars & accordions
Off by one baby Inception
![Page 3: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/3.jpg)
STARTUP ACCELERATION PROGRAMS
NO HYPE, JUST BUSINESS
launchpad.atelier.net
![Page 4: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/4.jpg)
bonobo Simple ETL in Python 3.5+
![Page 5: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/5.jpg)
• History of Extract Transform Load
• Concept ; Existing tools ; Related tools ; Ignition
• Practical Bonobo
• Tutorial ; Under the hood ; Demo ; Plugins & Extensions ; More demos
• Wrap up
• Present & future ; Resources ; Sprint ; Feedback
Plan
![Page 6: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/6.jpg)
Once upon a time…
![Page 7: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/7.jpg)
Extract Transform Load
• Not new. Popular concept in the 1970s [1] [2]
• Everywhere. Commerce, websites, marketing, finance, …
[1] https://en.wikipedia.org/wiki/Extract,_transform,_load
[2] https://www.sas.com/en_us/insights/data-management/what-is-etl.html
![Page 8: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/8.jpg)
Extract Transform Load
[Diagram: rows foo, bar, baz pass through Extract → Transform → Load]
![Page 9: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/9.jpg)
Extract Transform Load
[Diagram: the foo, bar, baz pipeline with extra nodes: Extract, Transform, Load, Transform more, Join, DB, HTTP POST, log?]
![Page 10: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/10.jpg)
Data Integration Tools
• Pentaho Data Integration (IDE/Java)
• Talend Open Studio (IDE/Java)
• CloverETL (IDE/Java)
![Page 11: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/11.jpg)
Talend Open Studio
![Page 12: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/12.jpg)
![Page 13: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/13.jpg)
![Page 14: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/14.jpg)
Data Integration Tools
• Java + IDE based, for most of them
• Data transformations are blocks
• IO flow managed by connections
• Execution
GUI first, eventually code :-(
![Page 15: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/15.jpg)
In the Python world …
• Bubbles (https://github.com/stiivi/bubbles)
• PETL (https://github.com/alimanfoo/petl)
• (insert a few more here)
• and now… Bonobo (https://www.bonobo-project.org/)
You can also use amazing libraries including Joblib, Dask, Pandas, Toolz, but ETL is not their main focus.
![Page 16: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/16.jpg)
Other scales…
![Page 17: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/17.jpg)
Small Automation Tools
• Mostly aimed at simple recurring tasks.
• Cloud / SaaS only.
![Page 18: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/18.jpg)
Big Data Tools
• Can do anything. And probably more. Fast.
• Either needs an infrastructure, or cloud based.
![Page 19: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/19.jpg)
Story time
![Page 20: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/20.jpg)
Partner 1 Data Integration
![Page 21: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/21.jpg)
WE GOT DEALS !!!
![Page 22: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/22.jpg)
Partner 1 Partner 2 Partner 3 Partner 4 Partner 5
Partner 6 Partner 7 Partner 8 Partner 9 …
![Page 23: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/23.jpg)
Tiny bug there… Can you fix it ?
![Page 24: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/24.jpg)
![Page 25: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/25.jpg)
My need
• A data integration / ETL tool using code as configuration.
• Preferably Python code.
• Something that can be tested (I mean, by a machine).
• Something that can use inheritance.
• Fast & cheap install on laptop, thought for servers too.
![Page 26: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/26.jpg)
And that’s Bonobo
![Page 27: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/27.jpg)
It is …
• A framework to write ETL jobs in Python 3 (3.5+)
• Using the same concepts as the old ETLs.
• You can use OOP!
Code first. Eventually a GUI will come.
![Page 28: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/28.jpg)
It is NOT …
• Pandas / R Dataframes
• Dask (but will probably implement a dask.distributed strategy someday)
• Luigi / Airflow
• Hadoop / Big Data / Big Query / …
• A monkey (spoiler : it’s an ape, damnit french language…)
![Page 29: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/29.jpg)
Let’s see…
![Page 30: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/30.jpg)
Create a project
~ $ pip install bonobo
~ $ bonobo init europython/tutorial
~ $ bonobo run europython/tutorial
![Page 31: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/31.jpg)
TEMPLATE
~ $ bonobo run .
…demo
![Page 32: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/32.jpg)
Write our own

    import bonobo

    def extract():
        yield 'euro'
        yield 'python'
        yield '2017'

    def transform(s):
        return s.title()

    def load(s):
        print(s)

    graph = bonobo.Graph(
        extract,
        transform,
        load,
    )
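To make the data flow concrete, here is a minimal sketch in plain Python (no Bonobo import) of what running such a linear chain amounts to: every row produced by one node is handed to the next. The `run_chain` helper is hypothetical and is not Bonobo's API; it runs sequentially, whereas the slides describe a threaded executor.

```python
import inspect


def extract():
    yield 'euro'
    yield 'python'
    yield '2017'


def transform(s):
    return s.title()


def run_chain(first, *rest):
    """Hypothetical helper: feed each output of a node into the next node.

    Sequential illustration only, not Bonobo's executor.
    """
    rows = list(first())
    for node in rest:
        next_rows = []
        for row in rows:
            result = node(row)
            if inspect.isgenerator(result):
                next_rows.extend(result)   # a generator node emits many rows
            elif result is not None:
                next_rows.append(result)   # a plain function emits one row
        rows = next_rows
    return rows


# Use list.append as the "load" step so the output can be inspected.
collected = []
leftover = run_chain(extract, transform, collected.append)
```

Here `collected.append` plays the role of `load`: it returns nothing, so no rows leave the end of the chain.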
![Page 33: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/33.jpg)
EXAMPLE_1
~ $ bonobo run .
…demo
![Page 34: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/34.jpg)
EXAMPLE_1
~ $ bonobo run first.py
…demo
![Page 35: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/35.jpg)
Under the hood…
![Page 36: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/36.jpg)
graph = bonobo.Graph(…)
![Page 37: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/37.jpg)
BEGIN
CsvReader( 'clients.csv' )
InsertOrUpdate( 'db.site', 'clients', key='guid' )
update_crm
retrieve_orders
![Page 38: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/38.jpg)
Graph…

    class Graph:
        def __init__(self, *chain):
            self.edges = {}
            self.nodes = []
            self.add_chain(*chain)

        def add_chain(self, *nodes, _input=None, _output=None):
            # ...
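The slide elides the body of `add_chain`. As an illustration only, here is one way a graph could record nodes and edges so that a chain can later fork. `SketchGraph`, `BEGIN`, `add_node` and the index-based edge layout are assumptions made for this sketch, not Bonobo's actual implementation.

```python
BEGIN = '<begin>'  # hypothetical marker for the graph entry point


class SketchGraph:
    """Illustrative only: stores nodes in a list and edges as index sets."""

    def __init__(self, *chain):
        self.nodes = []
        self.edges = {BEGIN: set()}
        self.add_chain(*chain)

    def add_node(self, node):
        self.nodes.append(node)
        index = len(self.nodes) - 1
        self.edges[index] = set()
        return index

    def add_chain(self, *nodes, _input=BEGIN):
        # Link each node to the previous one, starting from _input.
        previous = _input
        for node in nodes:
            index = self.add_node(node)
            self.edges[previous].add(index)
            previous = index
        return previous


graph = SketchGraph(str.strip, str.title)
# Fork: node 0 (str.strip) now also feeds a second branch.
tail = graph.add_chain(str.upper, _input=0)
```

With this layout, a fork is just a second `add_chain` call that starts from an existing node index.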
![Page 39: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/39.jpg)
bonobo.run(graph)
or in a shell…
$ bonobo run main.py
![Page 41: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/41.jpg)
BEGIN
CsvReader( 'clients.csv' )
InsertOrUpdate( 'db.site', 'clients', key='guid' )
update_crm
retrieve_orders
[Diagram: each of the four nodes is labelled Context + Thread]
![Page 42: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/42.jpg)
Context…

    class GraphExecutionContext:
        def __init__(self, graph, plugins, services):
            self.graph = graph
            self.nodes = [
                NodeExecutionContext(node, parent=self)
                for node in self.graph
            ]
            self.plugins = [
                PluginExecutionContext(plugin, parent=self)
                for plugin in plugins
            ]
            self.services = services
![Page 43: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/43.jpg)
Strategy…

    class ThreadPoolExecutorStrategy(Strategy):
        def execute(self, graph, plugins, services):
            context = self.create_context(graph, plugins, services)
            executor = self.create_executor()

            for node_context in context.nodes:
                executor.submit(
                    self.create_runner(node_context)
                )

            while context.alive:
                self.sleep()

            executor.shutdown()

            return context
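The "one thread per node" idea can be sketched with the standard library alone. The queue wiring, the `END` sentinel and the `run_node` / `execute` names below are assumptions made for illustration; the slide does not show how Bonobo connects its node contexts.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

END = object()  # hypothetical sentinel meaning "no more rows"


def run_node(node, inbox, outbox):
    """Consume rows from inbox, apply node, push results to outbox."""
    while True:
        row = inbox.get()
        if row is END:
            outbox.put(END)  # propagate shutdown downstream
            return
        outbox.put(node(row))


def execute(source, *nodes):
    # One queue in front of each node, plus one collecting the final output.
    queues = [queue.Queue() for _ in range(len(nodes) + 1)]
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as executor:
        for i, node in enumerate(nodes):
            executor.submit(run_node, node, queues[i], queues[i + 1])
        for row in source:
            queues[0].put(row)
        queues[0].put(END)
    # Leaving the with-block waits for every node thread to finish.
    return list(iter(queues[-1].get, END))


output = execute(['euro', 'python'], str.title, lambda s: s + '!')
```

Each node gets its own worker, so a slow node does not stop upstream nodes from producing rows; the unbounded queues absorb the difference.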
![Page 44: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/44.jpg)
</ implementation details >
![Page 45: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/45.jpg)
Transformations
a.k.a nodes in the graph
![Page 46: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/46.jpg)
Functions

    def get_more_infos(api, **row):
        more = api.query(row.get('id'))

        return {
            **row,
            **(more or {}),
        }
![Page 47: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/47.jpg)
Generators

    def join_orders(order_api, **row):
        for order in order_api.get(row.get('customer_id')):
            yield {
                **row,
                **order,
            }
![Page 48: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/48.jpg)
Iterators

    extract = (
        'foo',
        'bar',
        'baz',
    )

    extract = range(0, 1001, 7)
![Page 49: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/49.jpg)
Classes

    class RiminizeThis:
        def __call__(self, **row):
            return {
                **row,
                'Rimini': 'Woo-hou-wo...',
            }

Anything, as long as it’s callable().
![Page 50: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/50.jpg)
Configurable classes

    from bonobo.config import Configurable, Option, Service

    class QueryDatabase(Configurable):
        table_name = Option(str, default='customers')
        database = Service('database.default')

        def call(self, database, **row):
            customer = database.query(self.table_name, customer_id=row['clientId'])
            return {
                **row,
                'is_customer': bool(customer),
            }
![Page 54: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/54.jpg)
Configurable classes

    query_database = QueryDatabase(
        table_name='test_customers',
        database='database.testing',
    )
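To show the general idea of class-level options that can be overridden at instantiation, here is a small stand-alone sketch. `Option` and `SketchConfigurable` below are simplified stand-ins written for this example; they are assumptions and do not reproduce `bonobo.config`.

```python
class Option:
    """Hypothetical stand-in for a declared option with a type and default."""

    def __init__(self, type_, default=None):
        self.type = type_
        self.default = default


class SketchConfigurable:
    """Illustrative only: copies declared options onto the instance."""

    def __init__(self, **kwargs):
        for name, value in vars(type(self)).items():
            if isinstance(value, Option):
                raw = kwargs.pop(name, value.default)
                # Coerce to the declared type unless no value is available.
                setattr(self, name, raw if raw is None else value.type(raw))
        if kwargs:
            raise TypeError('unknown options: %s' % sorted(kwargs))


class QueryDatabase(SketchConfigurable):
    table_name = Option(str, default='customers')


default_node = QueryDatabase()
test_node = QueryDatabase(table_name='test_customers')
```

The same class then serves production and tests; only the keyword arguments differ.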
![Page 55: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/55.jpg)
Services
![Page 56: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/56.jpg)
Define as names

    class QueryDatabase(Configurable):
        database = Service('database.default')

        def call(self, database, **row):
            return { … }
![Page 57: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/57.jpg)
Runtime injection

    import bonobo

    graph = bonobo.Graph(...)

    def get_services():
        return {
            'database.default': MyDatabaseImpl()
        }
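Name-based injection can be illustrated without Bonobo: a transformation declares which service name it needs, and the implementation is looked up in a dictionary just before the call. `FakeDatabase`, `inject` and `is_customer` are hypothetical names for this sketch; how Bonobo actually resolves services is not shown on the slide.

```python
class FakeDatabase:
    """Stand-in for a real database client (hypothetical)."""

    def query(self, table, **criteria):
        return [{'table': table, **criteria}]


def get_services():
    return {'database.default': FakeDatabase()}


def inject(func, services, **names):
    """Hypothetical helper: bind argument names to named services."""
    resolved = {arg: services[name] for arg, name in names.items()}

    def wrapper(**row):
        return func(**resolved, **row)

    return wrapper


def is_customer(database, **row):
    found = database.query('customers', customer_id=row['clientId'])
    return {**row, 'is_customer': bool(found)}


node = inject(is_customer, get_services(), database='database.default')
result = node(clientId=42)
```

Swapping the dictionary returned by `get_services` is enough to point the same transformation at a test database.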
![Page 58: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/58.jpg)
Bananas!
![Page 59: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/59.jpg)
Library

    bonobo.FileReader(…)
    bonobo.CsvReader(…)
    bonobo.JsonReader(…)
    bonobo.PickleReader(…)
    bonobo.ExcelReader(…)
    bonobo.XMLReader(…)

… more to come

    bonobo.FileWriter(…)
    bonobo.CsvWriter(…)
    bonobo.JsonWriter(…)
    bonobo.PickleWriter(…)
    bonobo.ExcelWriter(…)
    bonobo.XMLWriter(…)

… more to come
![Page 60: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/60.jpg)
Library

    bonobo.Limit(limit)
    bonobo.PrettyPrinter()
    bonobo.Filter(…)

… more to come
![Page 61: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/61.jpg)
Extensions & Plugins
![Page 62: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/62.jpg)
Console Plugin
![Page 63: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/63.jpg)
Jupyter Plugin
![Page 64: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/64.jpg)
SQLAlchemy Extension

    bonobo_sqlalchemy.Select(
        query,
        *,
        pack_size=1000,
        limit=None
    )

    bonobo_sqlalchemy.InsertOrUpdate(
        table_name,
        *,
        fetch_columns,
        insert_only_fields,
        discriminant,
        …
    )

PREVIEW
![Page 65: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/65.jpg)
Docker Extension
$ pip install bonobo[docker]
$ bonobo runc myjob.py
PREVIEW
![Page 66: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/66.jpg)
Dev Kit
PREVIEW
https://github.com/python-bonobo/bonobo-devkit
![Page 67: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/67.jpg)
More examples
?
![Page 68: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/68.jpg)
EXAMPLE_1 -> EXAMPLE_2
…demo
• Use filesystem service.
• Write to a CSV
• Also write to JSON
![Page 69: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/69.jpg)
EXAMPLE_3
Rimini open data
![Page 70: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/70.jpg)
~/bdk/demos/europython2017
Europython attendees
featuring… jupyter notebook
selenium & firefox
![Page 71: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/71.jpg)
~/bdk/demos/sirene
French companies registry
featuring… docker
postgresql sql alchemy
![Page 72: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/72.jpg)
Wrap up
![Page 73: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/73.jpg)
Young
• First commit : December 2016
• 23 releases, ~420 commits, 4 contributors
• Current « stable » 0.4.3
• Target : 1.0 early 2018
![Page 74: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/74.jpg)
Python 3.5+
• {**}
• async/await
• (…, *, …)
• GIL :(
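Two of these features appear throughout the transformation examples in the deck: dict unpacking inside a dict display (`{**a, **b}`, available from Python 3.5) and keyword-only arguments after a bare `*`. A small sketch, using a hypothetical `enrich` function:

```python
def enrich(row, *, defaults=None):
    # The bare * makes `defaults` keyword-only.
    # {**a, **b} builds a new dict; on duplicate keys the later one wins.
    return {**(defaults or {}), **row}


row = {'id': 1, 'name': 'foo'}
merged = enrich(row, defaults={'name': 'unknown', 'country': 'IT'})

# Passing defaults positionally is rejected.
try:
    enrich(row, {'name': 'unknown'})
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Because the merge builds a new dict, the input row is left untouched, which matters when one row is sent down several branches.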
![Page 75: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/75.jpg)
1.0
• 100% Open-Source.
• Light & Focused.
• Very few dependencies.
• Comprehensive standard library.
• The rest goes to plugins and extensions.
![Page 76: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/76.jpg)
Small scale
• 1 minute to install
• Easy to deploy
• NOT : Big Data, Statistics, Analytics …
• IS : Lean manufacturing for data
![Page 77: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/77.jpg)
Interwebs are crazy
![Page 78: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/78.jpg)
Data Processing for Humans
![Page 79: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/79.jpg)
www.bonobo-project.org
docs.bonobo-project.org
bonobo-slack.herokuapp.com
github.com/python-bonobo
Let me know what you think!
![Page 80: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/80.jpg)
Sprint
• Sprints at Europython are amazing
• Nice place to learn about Bonobo, basics, etc.
• Nice place to contribute while learning.
• You’re amazing.
![Page 82: EP2017 - Bonobo ETL · PDF file• CloverETL (IDE/Java) Talend Open Studio . Data Integration Tools • Java + IDE based, for most of them • Data](https://reader031.vdocuments.us/reader031/viewer/2022020302/5ab6f6587f8b9a2f438e428f/html5/thumbnails/82.jpg)
bonobo@monkcage