Data Science Apps: Beyond Notebooks
Natalino Busa
2 Natalino Busa - @natbusa
Linkedin + Twitter + Github: @natbusa
DBS
Teradata
Cognitive Finance
ING Group
O’Reilly
Philips
Icons made by Gregor Cresnar from www.flaticon.com are licensed under CC 3.0 BY
Learning: The Scientific Method
Ørsted's "First Introduction to General Physics" (1811) https://en.m.wikipedia.org/wiki/History_of_scientific_method
observation hypothesis deduction synthesis
Hans Christian Ørsted
experiment
The Jupyter Project
http://jupyter.org
Jupyter notebook: what is it?
The Jupyter Notebook
The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.
Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more.
Credit: Jupyter project, extracted from http://jupyter.org/index.html
Jupyter notebook: why?
Language of choice
The Notebook has support for over 40 programming languages, including those popular in Data Science such as Python, R, Julia and Scala.
Share notebooks
Notebooks can be shared with others using email, Dropbox, GitHub and the Jupyter Notebook Viewer.
Interactive widgets
Code can produce rich output such as images, videos, LaTeX, and JavaScript. Interactive widgets can be used to manipulate and visualize data in realtime.
Big data integration
Leverage big data tools, such as Apache Spark, from Python, R and Scala. Explore that same data with pandas, scikit-learn, ggplot2, dplyr, etc.
Credit: Jupyter project, extracted from http://jupyter.org/index.html
Text Cell
Code Cell
Cell Input
Cell Output
Edit, Run, Kernel, Widgets menus
Kernel Type
Cell output: ASCII, HTML, image, etc.
Architecture of a Jupyter Notebook
[Diagram: Web browser (Jupyter Notebook web app) ↔ HTTP and WebSockets ↔ Jupyter Notebook server ↔ ØMQ ↔ kernel; the server reads and writes the notebook files]
https://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html
Architecture of a Jupyter Notebook
• Modular architecture:
Web App, Server, Kernel
• Kernels:
Python, R, Scala, Bash, SQL
• Web App:
Asynchronous, rich editing, syntax highlighting, export and share
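To make the web app / server / kernel split more concrete, here is a minimal sketch of the kind of message that travels from the front end to a kernel when a code cell is run. It follows the general shape of an `execute_request` in the Jupyter messaging protocol; the helper name `make_execute_request`, the `username` default and the protocol version string are assumptions made for illustration, and no kernel is actually contacted.

```python
import json
import uuid
from datetime import datetime, timezone


def make_execute_request(code, session_id, username="demo"):
    """Build a dict shaped like a Jupyter execute_request message.

    Hypothetical helper for illustration only: it builds the message
    but does not send it to any kernel.
    """
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,       # unique id for this message
            "session": session_id,            # id of the client session
            "username": username,
            "msg_type": "execute_request",
            "version": "5.0",                 # assumed protocol version
            "date": datetime.now(timezone.utc).isoformat(),
        },
        "parent_header": {},                  # empty: this message starts a new exchange
        "metadata": {},
        "content": {
            "code": code,                     # source of the cell to execute
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
        "channel": "shell",                   # channel name used on the websocket
    }


message = make_execute_request("print(1 + 1)", session_id=uuid.uuid4().hex)
print(json.dumps(message, indent=2))
```

The server relays such messages between the browser's websocket and the kernel's ØMQ sockets, which is why the same kernel can serve front ends other than the notebook.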
Jupyter Notebook
● Narratives and Use Cases
Narratives are collaborative, shareable, publishable, and reproducible. We believe that Narratives help both yourself and other researchers by sharing your use of Jupyter projects, technical specifics of your deployment, and installation and configuration tips so that others can learn from your experiences.
From https://jupyter.readthedocs.io/en/latest/use-cases/content-user.html
Jupyter is more than Notebooks
“What if I told you that the notebook is NOT the only sort of narrative that you can create with the Jupyter project?”
Examples of Jupyter powered narratives
● O’Reilly Orioles
● Examples - build your own!
Geolocated clustering and prediction services with scikit-learn
Learn how to build a venue recommender and a geofencing alerting engine using geolocated data, ML clustering algorithms, and scikit-learn.
Build your own narrative!
What do you need?
1. Understand how to communicate with the Jupyter server
Two ways: WebSockets or HTTP API endpoints
2. Build your own web application
Many ways: e.g. Angular, Polymer, Dart, etc.
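As a rough illustration of the two communication routes, the sketch below builds the URLs a custom web app would typically talk to: the REST resource for listing and starting kernels, and the websocket channel of one running kernel. The paths `/api/kernels` and `/api/kernels/<id>/channels` are the ones used by the Jupyter server REST API; the function names and the example host are assumptions, and nothing is sent over the network here.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def kernels_url(base_url):
    """HTTP endpoint for listing (GET) or starting (POST) kernels."""
    return base_url.rstrip("/") + "/api/kernels"


def channels_url(base_url, kernel_id):
    """Websocket endpoint carrying the messages of one running kernel."""
    parts = urlsplit(base_url)
    # Switch http -> ws and https -> wss, keeping host, port and path prefix.
    scheme = "wss" if parts.scheme == "https" else "ws"
    path = (
        parts.path.rstrip("/")
        + "/api/kernels/"
        + quote(kernel_id, safe="")
        + "/channels"
    )
    return urlunsplit((scheme, parts.netloc, path, "", ""))


# Hypothetical local server and kernel id, for illustration only.
print(kernels_url("http://localhost:8888"))
print(channels_url("http://localhost:8888", "abc-123"))
```

A web app written in any framework can then issue plain HTTP requests against the first URL and open a websocket on the second.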
Demos: kernel gateway
Purpose:
- Understand how to expose API endpoints
- Build your own narrative!
- Productivity gain: faster app prototyping
Jupyter Gateway: expose API endpoints
Declare the endpoint
Declare MIME type, Headers, Status
GET http://localhost:8800/counters/my_counter
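A minimal sketch of what the two declarations can look like as notebook cells when the kernel gateway runs in notebook-http mode. In that mode a cell whose first line is a comment such as `# GET /counters/:name` becomes the handler for that route, the gateway injects the request as a JSON string named `REQUEST`, and whatever the cell prints becomes the response body; a companion `# ResponseInfo` cell prints the headers and status. The `counters` dictionary, its contents and the stand-in `REQUEST` value are assumptions added so that the snippet also runs outside the gateway.

```python
import json

# Stand-in for the REQUEST string that the kernel gateway injects before
# running an annotated cell (hypothetical values, for illustration only).
REQUEST = json.dumps(
    {"path": {"name": "my_counter"}, "args": {}, "body": "", "headers": {}}
)

# Hypothetical in-memory state; a real notebook would define this in an
# ordinary, non-annotated cell.
counters = {"my_counter": 3}

# --- Cell 1: declare the endpoint (the annotation must be the cell's first line) ---
# GET /counters/:name
req = json.loads(REQUEST)
name = req["path"]["name"]            # value bound to :name in the URL
body = json.dumps({"name": name, "value": counters.get(name, 0)})
print(body)                           # stdout becomes the response body

# --- Cell 2: declare MIME type, headers and status for the same route ---
# ResponseInfo GET /counters/:name
info = json.dumps(
    {"headers": {"Content-Type": "application/json"}, "status": 200}
)
print(info)
```

With cells like these in the seed notebook, the `GET` request above would return the JSON printed by the first cell with the content type and status declared in the second.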
Jupyter: docker stacks
Docker container: Jupyter Notebook + Apache Toree
https://github.com/jupyter/docker-stacks
Dockerize your Jupyter gateway API
IMAGE=demos/kernel_gateway_demo
docker build -t "$IMAGE" .
docker run -p 8888:8888 "$IMAGE" \
  jupyter kernelgateway --KernelGatewayApp.ip=0.0.0.0 \
  --KernelGatewayApp.port=8888 \
  --KernelGatewayApp.api=notebook-http \
  --KernelGatewayApp.seed_uri=/srv/notebooks/autoscience.ipynb
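Once the container is running, the exposed API can be called like any other HTTP service. The sketch below is a small standard-library client; it assumes that the seed notebook defines a `/counters/<name>` route like the one shown earlier and that it answers with JSON, neither of which is confirmed for `autoscience.ipynb`. The function names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def counter_url(base_url, name):
    """Build the URL of one counter, escaping the name for use in a path."""
    return f"{base_url.rstrip('/')}/counters/{quote(name, safe='')}"


def fetch_counter(base_url, name, timeout=5.0):
    """GET one counter from the gateway and decode the JSON response body."""
    with urlopen(counter_url(base_url, name), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, once the container is up and port 8888 is published:
#   fetch_counter("http://localhost:8888", "my_counter")
```

Because the gateway speaks plain HTTP, the same call could be made from a browser-side web app instead of Python.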
Big Data apps: Dockerize your Jupyter gateway API with Toree
[Diagram: Web browser (your own web app) ↔ HTTP REST API ↔ Jupyter Kernel Gateway ↔ ØMQ ↔ Toree kernel; the gateway, the kernel and the notebook files run in Docker containers]
One websession = one server on a cloud
Summary
• Jupyter notebook is a great way to create and share
data-driven use cases and projects
• Jupyter is more than notebooks
– gateway, kernels, hub, etc
• Narratives powered by Jupyter
– O’Reilly Orioles
– build your own narrative
Resources
Jupyter
http://jupyter.org/index.html
https://jupyter.readthedocs.io/en/latest/index.html#
Jupyter Kernel Gateway
https://github.com/jupyter/kernel_gateway
http://jupyter-kernel-gateway.readthedocs.io/en/latest/
JupyterCon (first of its kind!)
https://conferences.oreilly.com/jupyter/jup-ny
Apache Toree (Spark Kernel)
https://toree.apache.org/
Web application dev
https://angular.io/
https://www.polymer-project.org/1.0/
Docker
https://github.com/jupyter/docker-stacks
https://www.docker.com/