hands-on with google’s machine learning apis, 11/17/2016
TRANSCRIPT
Hands-On with Google’s Machine Learning APIs
Stephen Wylie
11/17/2016
@SWebCEO  +StephenWylie  mrcity
About me
WEBSITES: www.stev-o.us, goshtastic.blogspot.com, www.ledgoes.com, www.openbrite.com
[email protected]@openbrite.com
G+: https://plus.google.com/u/1/+StephenWylie
TWITTER: @SWebCEO
GITHUB: mrcity
Senior Software Engineer at Capital One
Test/QA Lead for Auto Finance Innovation
Successful Kickstarter (1000% funded - BriteBlox)
Intel Innovator, DMS Member, Vintage computer collector/restorer/homebrewer, hackathoner
Civic Hacking Cmte Chair @ DMS
@SWebCEO +StephenWylie #MachineLearning
Tonight’s Mission
Touch on ML API offerings
Explore Google’s RESTful ML APIs
  Cloud Vision & Natural Language API
  Prediction API
TensorFlow (Not a RESTful API but still cool)
Allocate time to play, develop ideas
Have good conversations, network
Find learning partner or group
Before We Start…
Hopefully you followed instructions on https://github.com/mrcity/mlworkshop/
  Get access to the APIs
  Get a REST Client
  Install TensorFlow
What is Machine Learning?
Who’s using ML?
Chat bots?
Self-driving cars?
Pepper the robot?
Uber for selfie ID?
Document recognition & field extraction
Machine Learning Tools Ecosystem
APIs you interface with
  HP, Amazon, Microsoft, IBM, Google, Facebook’s Caffe on mobile & Web
Software you use
  Orange (U of Ljubljana, Slovenia)
  Weka (U of Waikato, New Zealand)
Hardware you compile programs to run on
  nVidia GPUs with CUDA, DGX-1 supercomputer
  Can BTC hardware be used for ML?
Google’s ML APIs In Particular
Google Play Services
  Mobile Vision API
RESTful ML Services
  Cloud Vision API
  Cloud Natural Language API
  Prediction API
Local ML Services
  TensorFlow
  SyntaxNet
^ Pre-defined models
v User-defined models
Cloud Vision API
What does it look like to you?
Detect Faces, Parse Barcodes, Segment Text
Availability
Platforms: Native Android, Native iOS, RESTful API
APIs: FACE API, BARCODE API, TEXT API
What do you see in that cloud?
Breaks down into more features than just FACE, BARCODE, and TEXT:
From https://cloud.google.com/vision/docs/requests-and-responses
Feature Type            Description
LABEL_DETECTION         Execute Image Content Analysis on the entire image and return
TEXT_DETECTION          Perform Optical Character Recognition (OCR) on text within the image
FACE_DETECTION          Detect faces within the image
LANDMARK_DETECTION      Detect geographic landmarks within the image
LOGO_DETECTION          Detect company logos within the image
SAFE_SEARCH_DETECTION   Determine image safe search properties on the image
IMAGE_PROPERTIES        Compute a set of properties about the image (such as the image's dominant colors)
Cloud Vision APIs
Can simultaneously detect multiple features
Features billed individually per use on image
No Barcode feature as yet
Simple JSON request/response format
Submit image from Cloud Storage or in Base64
Returns 0 or more annotations by confidence
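To make the request format concrete, here is a minimal sketch of building one request body that asks for several features on a single Base64-encoded image. The helper name `build_vision_request`, the stand-in image bytes, and the `YOUR_API_KEY` placeholder are assumptions for illustration. Nothing is actually sent over the network.

```python
import base64
import json

# REST endpoint for image annotation; the key value is a placeholder, not a real key.
VISION_URL = "https://vision.googleapis.com/v1/images:annotate?key=YOUR_API_KEY"


def build_vision_request(image_bytes, feature_types, max_results=5):
    """Build the JSON body for one image and one or more feature types.

    Hypothetical helper: each requested feature is billed individually,
    so only list the features you actually need.
    """
    return {
        "requests": [{
            # Inline image data travels as Base64 text.
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [
                {"type": feature, "maxResults": max_results}
                for feature in feature_types
            ],
        }]
    }


# Stand-in bytes; a real call would read a JPEG or PNG file instead.
fake_image = b"not a real image"
body = build_vision_request(fake_image, ["LABEL_DETECTION", "FACE_DETECTION"])
print(json.dumps(body, indent=2))
```

The printed JSON is what you would paste into your REST client as the POST body.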
Response types
Feature           Returns
Label             Description of the picture’s contents
                  Confidence score
Text, Logo        Text contents or logo owner name
                  Bounding polygon containing the text or logo
                  [Logo only] Confidence score
Face              Bounding polygon and rotational characteristics of the face
                  Positions of various characteristics such as eyes, ears, lips, chin, forehead
                  Confidence score of exhibiting joy, sorrow, anger, or surprise
Landmark
Safe Search       Likelihood of the image containing adult or violent content, that it was a spoof, or contains graphic medical imagery
Image properties  Dominant RGB colors within the image, ordered by fraction of pixels
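As a rough illustration of reading a label response, the sketch below filters annotations by confidence score. The sample response is hand-written with made-up values, and `confident_labels` and its threshold are hypothetical names chosen for this example.

```python
import json

# Hand-written sample shaped like a LABEL_DETECTION response; the values are made up.
sample_response = json.loads("""
{"responses": [{"labelAnnotations": [
    {"description": "cloud", "score": 0.97},
    {"description": "sky", "score": 0.91},
    {"description": "dog", "score": 0.42}
]}]}
""")


def confident_labels(response, threshold=0.5):
    """Return (description, score) pairs whose score meets the threshold."""
    labels = []
    for result in response.get("responses", []):
        # An image with no matches simply has no labelAnnotations key.
        for annotation in result.get("labelAnnotations", []):
            if annotation["score"] >= threshold:
                labels.append((annotation["description"], annotation["score"]))
    return labels


print(confident_labels(sample_response))
```

Using `.get(..., [])` covers the "0 or more annotations" case without special handling.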
Demo
Mobile Vision vs. Cloud Vision
Mobile Vision is for Native Android
Handles more data processing
Can utilize camera video
Takes advantage of OpenGL (& hardware?)
Cloud Natural Language API
Making computers speak human
Natural Language API: Analyze Any ASCII
Parses text for parts of speech
Discovers entities like organizations, people, locations
Analyzes text sentiment
Use Speech, Vision, Translate APIs upstream
Works with English, Spanish, or Japanese
Sentiment analysis only available for English
Sample NL API Request
From https://cloud.google.com/natural-language/docs/basics
<- Optional, can be guessed automatically
<- Not required for Sentiment Analysis queries
<- Optional, defaults to Entities
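The request JSON itself did not survive in this transcript, so here is a minimal sketch of what such a body can look like. It assumes the first two callouts refer to the document's `language` field and the top-level `encodingType` field. The helper `build_nl_request` and the sample sentences are made up for illustration.

```python
import json


def build_nl_request(text, language=None, encoding_type=None):
    """Build a Natural Language API request body for plain text.

    Hypothetical helper. The optional fields are left out entirely when
    not supplied, so the service can fall back to its own defaults.
    """
    document = {"type": "PLAIN_TEXT", "content": text}
    if language is not None:
        # Assumed to be the "can be guessed automatically" field.
        document["language"] = language
    body = {"document": document}
    if encoding_type is not None:
        # Assumed to be the field that sentiment queries can omit.
        body["encodingType"] = encoding_type
    return body


sentiment_body = build_nl_request("Four score and seven years ago...")
entities_body = build_nl_request(
    "Google is based in Mountain View.", language="en", encoding_type="UTF8"
)
print(json.dumps(sentiment_body, indent=2))
print(json.dumps(entities_body, indent=2))
```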
Interpreting NL API Sentiment Responses
POLARITY: ranges from -1 to 1
MAGNITUDE: ranges from 0 to ∞ (scale marks at 1, 10, 10^2, 10^3)
Sample analyzeSentiment response for the Gettysburg Address:
{ "polarity": 0.4, "magnitude": 3.8 }
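One way to read the two numbers together is sketched below: polarity gives the direction, magnitude gives how much emotional content there is overall. The cut-off values (`neutral_band`, `strong_magnitude`) are arbitrary assumptions for illustration, not values defined by the API.

```python
def describe_sentiment(polarity, magnitude, neutral_band=0.25, strong_magnitude=3.0):
    """Turn a polarity/magnitude pair into a rough (direction, strength) label.

    The thresholds are illustrative assumptions; tune them for your own text.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must be between -1 and 1")
    if magnitude < 0:
        raise ValueError("magnitude cannot be negative")

    if polarity > neutral_band:
        direction = "positive"
    elif polarity < -neutral_band:
        direction = "negative"
    else:
        # Low polarity with high magnitude usually means mixed feelings.
        direction = "neutral or mixed"

    strength = "strong" if magnitude >= strong_magnitude else "mild"
    return direction, strength


# The Gettysburg Address figures from the slide.
print(describe_sentiment(0.4, 3.8))
```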
Demo
Google Prediction API
To further their conquest for all knowledge past, present, and future
Making Predictions With Google
Build “trained” model or use “hosted” model
Hosted models (all demos):
  Language identifier
  Tag categorizer (as android, appengine, chrome, youtube)
  Sentiment predictor
Trained models:
  Submit attributes and labels for each example
  Need at least six examples
  Store examples in Cloud Storage
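A minimal sketch of preparing training examples as CSV text, assuming the layout is one example per row with the label in the first column, the attributes after it, and no header row. The example sentences and the helper `to_training_csv` are made up. Uploading the result to Cloud Storage is a separate step not shown here.

```python
import csv
import io

# Made-up examples: (label, attribute). Six is the minimum mentioned above.
examples = [
    ("english", "the quick brown fox"),
    ("english", "good morning to you"),
    ("spanish", "buenos dias a todos"),
    ("spanish", "el zorro marron rapido"),
    ("french", "bonjour tout le monde"),
    ("french", "le renard brun rapide"),
]


def to_training_csv(rows, minimum_examples=6):
    """Serialize (label, attribute...) rows to CSV text, label first."""
    if len(rows) < minimum_examples:
        raise ValueError("need at least %d examples" % minimum_examples)
    buffer = io.StringIO()
    # Quote every text field so commas inside an attribute cannot break a row.
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for label, *attributes in rows:
        writer.writerow([label, *attributes])
    return buffer.getvalue()


training_csv = to_training_csv(examples)
print(training_csv)
```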
Don’t Model Trains; Train Your Model
Train API against data: prediction.trainedmodels.insert
Send prediction query: prediction.trainedmodels.predict
Update the model: prediction.trainedmodels.update
Other CRUD operations: list, get, delete
Don’t Model Trains; Train Your Model
Insert query requires:
  id
  modelType
  storageDataLocation
Don’t forget: poll for status updates
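The insert-then-poll pattern can be sketched as follows. The three body fields come from the slide. The model id, the bucket path, the `"classification"` model type, and the `"RUNNING"`/`"DONE"` status strings are assumptions. `get_status` is a hypothetical callable standing in for a real prediction.trainedmodels.get call.

```python
import time


def build_insert_body(model_id, model_type, storage_data_location):
    """Body for a trainedmodels.insert request, using the three required fields."""
    return {
        "id": model_id,
        "modelType": model_type,
        "storageDataLocation": storage_data_location,
    }


def wait_until_trained(get_status, interval_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll until training stops running, then return the final status.

    get_status is a hypothetical callable wrapping trainedmodels.get;
    "RUNNING" as the in-progress status is an assumption.
    """
    for _ in range(max_polls):
        status = get_status()
        if status != "RUNNING":
            return status
        sleep(interval_seconds)
    raise TimeoutError("model still training after the polling limit")


body = build_insert_body("language-model", "classification", "my-bucket/training.csv")

# Fake status source so the sketch runs without network access.
fake_statuses = iter(["RUNNING", "RUNNING", "DONE"])
final_status = wait_until_trained(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(body, final_status)
```

Passing `sleep` as a parameter keeps the polling loop testable without real delays.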
Demo
TensorFlow
All that Linear Algebra you slept through in college
About TensorFlow
Offline library for large-scale numerical computation
Think of a graph:
  Nodes represent mathematical operations
  Edges represent tensors flowing between them
Excellent at building deep neural networks
$\mathrm{ReLU}^{n} \approx f(x)$
Tense About Tensors?
Think about MNIST handwritten digits
Each digit image is 28 x 28 pixels (784 values when flattened)
There are 10 digit classes, 0-9
Tense About Tensors?
Define an input tensor of shape (any batch size, 784)
  x = tf.placeholder(tf.float32, shape=[None, 784])
Define a target output tensor of shape (any batch size, 10)
  y_ = tf.placeholder(tf.float32, shape=[None, 10])
Define weights matrix (784x10) and biases vector (10-D)
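To see why the weights are 784x10 and the biases 10-D, here is a plain-Python sketch of the shapes involved, with no TensorFlow required. The helpers `zeros` and `linear` and the zero initial values are assumptions for illustration only.

```python
def zeros(rows, cols):
    """A rows x cols matrix of zeros, stored as a list of lists."""
    return [[0.0] * cols for _ in range(rows)]


def linear(x_batch, W, b):
    """Compute x.W + b: (batch x 784) times (784 x 10) plus (10) gives (batch x 10)."""
    return [
        [sum(x_i * W[i][j] for i, x_i in enumerate(row)) + b[j] for j in range(len(b))]
        for row in x_batch
    ]


W = zeros(784, 10)       # one weight per (pixel, digit class) pair
b = [0.0] * 10           # one bias per digit class
x_batch = zeros(3, 784)  # a batch of three blank 28 x 28 images, flattened

logits = linear(x_batch, W, b)
print(len(logits), len(logits[0]))  # 3 10
```

The batch size is free to vary, which is what `None` expresses in the placeholder shapes.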
One-Hot: Cool To the Touch
Load the input data
  from tensorflow.examples.tutorials.mnist import input_data
  mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
One-hot?! Think about encoding categorical features:
  US = 0, UK = 1, India = 2, Canada = 3, …
  This implies ordinal properties and confuses learners
Break encoding into Booleans:
  US = [1, 0, 0, 0]
  UK = [0, 1, 0, 0]
  Etc…
This is where the 10-D target output tensor comes from
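A minimal sketch of one-hot encoding in plain Python, using the country example from the slide and a digit label. The helper name `one_hot` is an assumption.

```python
def one_hot(index, num_classes):
    """Return a list of num_classes zeros with a single 1 at position index."""
    if not 0 <= index < num_classes:
        raise ValueError("index out of range for the number of classes")
    vector = [0] * num_classes
    vector[index] = 1
    return vector


countries = ["US", "UK", "India", "Canada"]
encoded = {name: one_hot(i, len(countries)) for i, name in enumerate(countries)}

print(encoded["US"])   # [1, 0, 0, 0]
print(encoded["UK"])   # [0, 1, 0, 0]
print(one_hot(4, 10))  # the digit 4 as a 10-D vector: [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
```

No category ends up "bigger" than another, which is the ordinal problem the integer encoding had.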
TensorFlow Data Structures - Placeholders
Come from inputs prior to computation
x (input picture as vector), y_ (one-hot 10-D classification vector)
Input x:
  [0, 0, 0, 0, 0, 0, 0, 0, …
   0, 0, 0, 0, 0, 1, 1, 0, …
   0, 0, 0, 0, 1, 1, 1, 0, …
   0, 0, 0, 0, 1, 1, 0, 0, …
   0, 0, 0, 1, 1, 1, 0, 0, …
   ...
y_:
  [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
TensorFlow Data Structures – Variables
Values (i.e. model parameters) inside nodes in a graph
Used and modified by learning process
Need to be initialized with
  init = tf.initialize_all_variables()
W (weights to scale inputs by), b (bias to add to scaled value)
Training a Dragon, if the Dragon is a Model
Your Simple Model:
  y = tf.nn.softmax(tf.matmul(x, W) + b)
Cross-entropy: distance between guess & correct answer
  cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
Gradient descent: minimize cross-entropy
  train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
Learning rate: 0.5
$H_{y'}(y) = -\sum_i y'_i \log(y_i)$
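The softmax and cross-entropy pieces can be tried out by hand with a small plain-Python sketch. The sample logits are made up, and the small `eps` term is an added safeguard against taking the log of zero rather than part of the formula.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true, y_pred, eps=1e-12):
    """-sum(y'_i * log(y_i)) for a one-hot target y' and a prediction y."""
    return -sum(t * math.log(p + eps) for t, p in zip(y_true, y_pred))


y_true = [0, 0, 1]                    # correct class is the last one
good_guess = softmax([0.1, 0.2, 3.0])
bad_guess = softmax([3.0, 0.2, 0.1])

print(cross_entropy(y_true, good_guess))  # small: guess agrees with the answer
print(cross_entropy(y_true, bad_guess))   # large: guess disagrees with the answer
```

Because the target is one-hot, only the predicted probability of the correct class contributes to the sum.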
Dragon Get Wiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiings!
Start a Session
Run initialize_all_variables
Run training for 1000 steps
  sess = tf.Session()
  sess.run(init)
  for i in range(1000):
      batch_xs, batch_ys = mnist.train.next_batch(100)
      sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
Expensive to use all training data at once!
Pick 100 random samples each step
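To show what each training step is doing under the hood, here is a plain-Python sketch of mini-batch gradient descent on a softmax model. It uses a made-up two-class toy dataset instead of MNIST, and the step count and batch size are arbitrary choices. It illustrates the idea and is not what TensorFlow executes internally.

```python
import math
import random


def softmax(logits):
    largest = max(logits)
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_toy_data(count, rng):
    """Two well-separated clusters with one-hot labels (made-up data)."""
    data = []
    for _ in range(count):
        label = rng.randrange(2)
        center = (1.0, 0.0) if label == 0 else (0.0, 1.0)
        x = [center[0] + rng.gauss(0, 0.1), center[1] + rng.gauss(0, 0.1)]
        y = [1.0, 0.0] if label == 0 else [0.0, 1.0]
        data.append((x, y))
    return data


def forward(x, W, b):
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j] for j in range(len(b))]


def train(data, steps=200, batch_size=10, learning_rate=0.5, seed=0):
    rng = random.Random(seed)
    n_in, n_out = len(data[0][0]), len(data[0][1])
    W = [[0.0] * n_out for _ in range(n_in)]
    b = [0.0] * n_out
    for _ in range(steps):
        batch = rng.sample(data, batch_size)  # a fresh random mini-batch each step
        grad_W = [[0.0] * n_out for _ in range(n_in)]
        grad_b = [0.0] * n_out
        for x, y in batch:
            p = softmax(forward(x, W, b))
            for j in range(n_out):
                err = p[j] - y[j]  # gradient of cross-entropy w.r.t. logit j
                grad_b[j] += err
                for i in range(n_in):
                    grad_W[i][j] += err * x[i]
        # Step against the batch-averaged gradient.
        for j in range(n_out):
            b[j] -= learning_rate * grad_b[j] / batch_size
            for i in range(n_in):
                W[i][j] -= learning_rate * grad_W[i][j] / batch_size
    return W, b


data = make_toy_data(200, random.Random(42))
W, b = train(data)
hits = sum(
    forward(x, W, b).index(max(forward(x, W, b))) == y.index(1.0) for x, y in data
)
toy_accuracy = hits / len(data)
print(toy_accuracy)
```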
Test Flight Evaluation
Compare labels between guess y and correct y_
  correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
Cast each Boolean result into either a 0 or 1, then average it
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
Print the final figure
  print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
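The same argmax, compare, cast, and average steps can be checked by hand in plain Python. The predictions and labels below are made up, and the helper names are assumptions.

```python
def argmax(values):
    """Index of the largest entry (the first one if there is a tie)."""
    return max(range(len(values)), key=values.__getitem__)


def accuracy(predictions, labels):
    """Fraction of rows where the predicted class matches the one-hot label."""
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    # Cast each Boolean to 0.0 or 1.0, then take the mean.
    return sum(1.0 if c else 0.0 for c in correct) / len(correct)


predictions = [
    [0.1, 0.7, 0.2],  # guesses class 1
    [0.8, 0.1, 0.1],  # guesses class 0
    [0.3, 0.3, 0.4],  # guesses class 2
    [0.6, 0.3, 0.1],  # guesses class 0
]
labels = [[0, 1, 0], [1, 0, 0], [0, 1, 0], [1, 0, 0]]

print(accuracy(predictions, labels))  # 0.75: three of the four guesses are right
```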
Demo
Future Talks, Other Talks
Follow me if you want to hear these!
  Build a Neural Network in Python with NumPy
  Build a Neural Network with nVidia CUDA
Elsewhere,
  Mapmaking with Google Maps API, Polymer, and Firebase
  The Process Of Arcade Game ROM Hacking
More Resources
Google’s “Googly Eyes” Android app [Mobile Vision API]
  https://github.com/googlesamples/android-vision/tree/master/visionSamples/googly-eyes
Quick, Draw! Google classification API for sketches
  https://quickdraw.withgoogle.com/
Thank You