machine learning on azure - azureconf

Post on 24-May-2015


Category:

Software



DESCRIPTION

Machine learning can often be a daunting subject to tackle, much less utilize in a meaningful manner. In this session, attendees will learn how to take their existing data, shape it, and create models that can automatically make principled business decisions directly in their applications. The discussion will include explanations of the data acquisition and shaping process. Additionally, attendees will learn the basics of machine learning, primarily the supervised learning problem.

TRANSCRIPT

Machine Learning on Azure

Seth Juarez

Analytics Program Manager, DevExpress

@sethjuarez

Questions? #azureconf on Twitter

Agenda

1) data science
2) prediction
3) process
4) models
5) AzureML

data science

• key word: “science”
• try stuff
• it (might not | won’t) work the first time

• question: this might work…
• research: wikipedia time
• hypothesis: I have an idea
• experiment: try it out
• analysis: did this even work?
• conclusion: time for a better idea

machine learning

• finding (and exploiting) patterns in data
• replacing “human writing code” with “human supplying data”
• system figures out what the person wants based on examples
• need to abstract from “training” examples to “test” examples
• most central issue in ML: generalization

machine learning

• split into two (ish) areas
• supervised learning
  • predicting the future
  • learn from past examples to predict future
• unsupervised learning
  • understanding the past
  • making sense of data
  • learning structure of data
  • compressing data for consumption

neat applications

• spam catchers
• ocr (optical character recognition)
• natural language processing
• machine translation
• biology
• medicine
• robotics (autonomous systems)
• etc…

prediction

making decisions

making decisions

• what kinds of decisions are we making?
• binary classification
  • yes/no, 1/0, male/female
• multi-class classification
  • {A, B, C, D, F} (Grade), {1, 2, 3, 4} (Class), {teacher, student, secretary}
• regression
  • number between 0 and 100, real value

process

1) data
2) clean / transform / maths
3) model
4) predict

data

Class    Outlook   Temp.  Windy
Play     Sunny     Low    Yes
No Play  Sunny     High   Yes
No Play  Sunny     High   No
Play     Overcast  Low    Yes
Play     Overcast  High   No
Play     Overcast  Low    No
No Play  Rainy     Low    Yes
Play     Rainy     Low    No
?        Sunny     Low    No

• label (y): play / no play
• features: outlook, temp, windy
• values (x): [Sunny, Low, Yes]

Labeled dataset is a collection of (X, Y) pairs.
Given a new x, how do we predict y?
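One way to hold the table as (x, y) pairs is sketched below. The names `FEATURES`, `labeled`, `new_x`, `X` and `y` are illustrative choices, not something the slides define.

```python
# Each labeled example is an (x, y) pair: x holds the feature values, y the label.
FEATURES = ("outlook", "temp", "windy")

labeled = [
    (("Sunny", "Low", "Yes"), "Play"),
    (("Sunny", "High", "Yes"), "No Play"),
    (("Sunny", "High", "No"), "No Play"),
    (("Overcast", "Low", "Yes"), "Play"),
    (("Overcast", "High", "No"), "Play"),
    (("Overcast", "Low", "No"), "Play"),
    (("Rainy", "Low", "Yes"), "No Play"),
    (("Rainy", "Low", "No"), "Play"),
]

# The row whose class is "?" is the new x we want a prediction for.
new_x = ("Sunny", "Low", "No")

# Split the pairs into the feature rows X and the label column y.
X = [x for x, _ in labeled]
y = [label for _, label in labeled]
```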

clean / transform / maths

Class    Outlook       Temp.   Windy
Play     Sunny         Lowest  Yes
No Play  ?             High    Yes
No Play  Sunny         High    KindOf
Play     Overcast      ?       Yes
Play     Turtle Cloud  High    No
Play     Overcast      ?       No
No Play  Rainy         Low     28%
Play     Rainy         Low     No
?        Sunny         Low     No

• need to clean up data
• need to convert to model-able form (linear algebra)
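A minimal sketch of both steps on the messy table above, assuming a fixed vocabulary per column and a one-hot encoding. The slides only say "convert to model-able form", so the encoding choice, the `ALLOWED` table and the helper names are assumptions made for illustration.

```python
# Assumed vocabulary for each feature column (taken from the clean table).
ALLOWED = {
    "outlook": ["Sunny", "Overcast", "Rainy"],
    "temp": ["Low", "High"],
    "windy": ["Yes", "No"],
}

raw_rows = [
    ("Play", "Sunny", "Lowest", "Yes"),
    ("No Play", "?", "High", "Yes"),
    ("No Play", "Sunny", "High", "KindOf"),
    ("Play", "Overcast", "?", "Yes"),
    ("Play", "Turtle Cloud", "High", "No"),
    ("Play", "Overcast", "?", "No"),
    ("No Play", "Rainy", "Low", "28%"),
    ("Play", "Rainy", "Low", "No"),
]

def is_clean(row):
    """True when every feature value belongs to its column's vocabulary."""
    _, outlook, temp, windy = row
    return (outlook in ALLOWED["outlook"]
            and temp in ALLOWED["temp"]
            and windy in ALLOWED["windy"])

def one_hot(value, vocabulary):
    """Turn a categorical value into a 0/1 vector, one slot per known value."""
    return [1.0 if value == v else 0.0 for v in vocabulary]

def encode(row):
    """Map a clean row to (numeric feature vector, +1 for Play / -1 for No Play)."""
    label, outlook, temp, windy = row
    x = (one_hot(outlook, ALLOWED["outlook"])
         + one_hot(temp, ALLOWED["temp"])
         + one_hot(windy, ALLOWED["windy"]))
    return x, (1 if label == "Play" else -1)

needs_fixing = [r for r in raw_rows if not is_clean(r)]
clean_rows = [r for r in raw_rows if is_clean(r)]
encoded = [encode(r) for r in clean_rows]
```

Only one of the eight rows passes validation as written, which is why the cleanup step usually means repairing values rather than simply dropping rows.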

yak shaving

Any apparently useless activity which, by allowing you to overcome intermediate difficulties, allows you to solve a larger problem.

“I was doing a bit of yak shaving this morning, and it looks like it might have paid off.”

http://en.wiktionary.org/wiki/yak_shaving


model

predict

Class    Outlook   Temp.  Windy
?        Sunny     Low    No      →  PLAY!!!

models

how do we build them?

linear classifiers

• in order to classify things properly we need:
  • a way to mathematically represent examples
  • a way to separate classes (yes/no)
• “decision boundary”
• excel example
• graph example

MODELS

linear classifiers

• dot product of vectors
  • [ 3, 4 ] ● [ 1, 2 ] = (3 × 1) + (4 × 2) = 11
  • a ● b = | a | × | b | cos θ
  • When does this equal 0?
• why would this be useful?
  • decision boundary can be represented using a single vector
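The dot product arithmetic can be checked with a few lines of Python. This is a small sketch with illustrative helper names; it also shows the case the slide asks about: the product is 0 when cos θ is 0, that is, when the two vectors are perpendicular.

```python
import math

def dot(a, b):
    """Sum of the element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length of a vector, |a|."""
    return math.sqrt(dot(a, a))

def cos_angle(a, b):
    """cos θ recovered from a ● b = |a| × |b| cos θ."""
    return dot(a, b) / (norm(a) * norm(b))

example = dot([3, 4], [1, 2])         # (3 × 1) + (4 × 2)
perpendicular = dot([3, 4], [-4, 3])  # two vectors at a right angle
```

Because the sign of the dot product flips from one side of a perpendicular line to the other, a single vector is enough to describe a decision boundary.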

perceptron

…and other linear models

linear classifiers

• Frank Rosenblatt, Cornell 1957
• let’s make a line (by using a single vector)
• take the dot product between the line and the new point
  • > 0 belongs to class 1
  • < 0 belongs to class 2
  • == 0 flip a coin, we don’t know
• for each example, if we make a mistake, move the line
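The "move the line on every mistake" rule can be sketched as below. This is a minimal version under a few assumptions that are not in the slides: labels are +1 / -1, a dot product of exactly 0 is treated as class -1 instead of a coin flip, a constant 1.0 is appended to each point to act as a bias, and the toy data is made up.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def predict(w, x):
    # The sign of the dot product picks the class (0 counts as -1 here).
    return 1 if dot(w, x) > 0 else -1

def train_perceptron(examples, epochs=20):
    """Start from the zero vector and nudge it toward every misclassified point."""
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        mistakes = 0
        for x, y in examples:
            if predict(w, x) != y:
                # Move the line: add the point for class +1, subtract it for class -1.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # a full pass with no mistakes: the data is separated
            break
    return w

# Made-up, linearly separable points; the trailing 1.0 is the bias input.
examples = [
    ([2.0, 1.0, 1.0], 1),
    ([3.0, 2.0, 1.0], 1),
    ([-1.0, -2.0, 1.0], -1),
    ([-2.0, -1.0, 1.0], -1),
]
w = train_perceptron(examples)
```

On this toy set a single update is enough, and the learned vector then classifies all four points correctly.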

perceptron

point demo

perceptron

numerical features × (dot) learned vector → +1 / -1

what if….

kernel methods

models

kernel methods

= features….

perceptron

•minimize mistakes by moving w

subject to:

REMINDER

perceptron

•eventually this becomes an optimization problem

subject to:

REMINDER


dot product



kernel (one weird trick….)

• store dot product in a table
• call it the “kernel matrix” and “kernel trick”
• project into any space and still learn a linear model
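A small sketch of the "table of dot products" idea. The `kernel_matrix` helper and the sample points are illustrative; passing a different `kernel` function in place of the plain dot product is what lets the same linear learner work in another space.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def kernel_matrix(points, kernel=dot):
    """Table whose entry [i][j] is kernel(points[i], points[j])."""
    return [[kernel(a, b) for b in points] for a in points]

points = [[3, 4], [1, 2], [-4, 3]]
K = kernel_matrix(points)
```

The table is symmetric, so `K[i][j]` and `K[j][i]` always agree.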

support vector machines

• this method is the basis for SVMs
• returns a set of vectors (<< n) to make the decision
• essentially changes the space to make it separable

kernels

• polynomial kernel
• RBF kernel
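The formulas from this slide did not survive in the transcript. The sketch below uses the common textbook forms, (a ● b + c)^d for the polynomial kernel and exp(-γ |a - b|²) for the RBF kernel; the parameter names and default values are assumptions, not the speaker's.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def polynomial_kernel(a, b, degree=2, c=1.0):
    """(a ● b + c) ** degree: a dot product in a space of polynomial features."""
    return (dot(a, b) + c) ** degree

def rbf_kernel(a, b, gamma=0.5):
    """exp(-gamma * squared distance): 1 for identical points, near 0 for far ones."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)
```

Either function can stand in for the plain dot product wherever a model only needs dot products between examples.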

what if….

neural networks

models

neural networks

[network diagram: hidden units h1, h2, h3 and bias B1 feeding the output “Play?”]

LINEAR METHODS (linear?)

• perceptron (what if we can’t make a line?)
• svm – change the space
• neural networks – change the function
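To illustrate "change the space": no single line separates the XOR points in two dimensions, but adding one product feature makes them linearly separable. The `lift` function and the hand-picked weight vector below are illustrative assumptions, not something learned or taken from the talk.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# XOR: the label is +1 when exactly one input is 1.
xor_points = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]

def lift(x):
    """Map (x1, x2) to (x1, x2, x1*x2, 1): one extra feature plus a bias input."""
    x1, x2 = x
    return (x1, x2, x1 * x2, 1.0)

# Hand-picked weights for the lifted space (an assumption for this example).
w = (1.0, 1.0, -2.0, -0.5)

def classify(x):
    return 1 if dot(w, lift(x)) > 0 else -1
```

The classifier is still a single dot product; only the representation of the points changed.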

decision trees

models


decision trees

• how should the computer split?
• information gain (with entropy)
  • entropy measures how disorganized your answer is.
  • information gain says:
    • if I separate the answer by the values in a particular column, does the answer become *more* organized?

decision trees

• calculating information gain:
  • entropy of the answer – how messy is the answer
  • entropy of the answer given a – how messy is the answer if we know a?
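A sketch of the calculation on the play / no play table, using the usual definition: gain = entropy of the answer minus the weighted entropy of the answer within each value of a column. The function names and the `COLUMNS` lookup are illustrative.

```python
import math
from collections import Counter

# (class, outlook, temp, windy) for the eight labeled rows of the table.
rows = [
    ("Play", "Sunny", "Low", "Yes"),
    ("No Play", "Sunny", "High", "Yes"),
    ("No Play", "Sunny", "High", "No"),
    ("Play", "Overcast", "Low", "Yes"),
    ("Play", "Overcast", "High", "No"),
    ("Play", "Overcast", "Low", "No"),
    ("No Play", "Rainy", "Low", "Yes"),
    ("Play", "Rainy", "Low", "No"),
]
COLUMNS = {"Outlook": 1, "Temp.": 2, "Windy": 3}

def entropy(labels):
    """How messy the answer is, in bits (0 when every label agrees)."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, column):
    """Entropy of the answer minus its entropy once the column value is known."""
    labels = [r[0] for r in rows]
    groups = {}
    for r in rows:
        groups.setdefault(r[column], []).append(r[0])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

gains = {name: information_gain(rows, idx) for name, idx in COLUMNS.items()}
best = max(gains, key=gains.get)
```

On this table Outlook gives the largest gain, so it would be the first split.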

decision trees

demo

POPULAR MODELS

• support vector machines
• neural networks
• decision trees

do they work?

testing

how well is it doing?

• Train: use 80%
• Test: use 20%
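A minimal sketch of an 80 / 20 split using only the standard library. The function name, the fixed seed and the stand-in data are assumptions made for illustration.

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples, then hold out a fraction for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

examples = list(range(10))  # stand-in for ten labeled examples
train, test = train_test_split(examples)
```

The model is fit on `train` only; `test` is kept aside to check how well it generalizes to examples it has not seen.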

AzureML

putting it all together

process reminder (same on Azure)

1) data
2) clean / transform / maths
3) model
4) predict

experiments

putting it all together

confusion matrix

                  Truth: true      Truth: false
Guess: positive   true positive    false positive
Guess: negative   false negative   true negative
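The four cells can be counted directly from paired truth / guess values. The sketch below uses made-up booleans and illustrative names, and derives accuracy from the counts.

```python
def confusion_matrix(truth, guess):
    """Count true/false positives and negatives from paired booleans."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for t, g in zip(truth, guess):
        if g and t:
            counts["TP"] += 1   # guessed positive, truth is true
        elif g and not t:
            counts["FP"] += 1   # guessed positive, truth is false
        elif t:
            counts["FN"] += 1   # guessed negative, truth is true
        else:
            counts["TN"] += 1   # guessed negative, truth is false
    return counts

# Made-up test results for five examples.
truth = [True, True, False, False, True]
guess = [True, False, True, False, True]

counts = confusion_matrix(truth, guess)
accuracy = (counts["TP"] + counts["TN"]) / len(truth)
```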

AzureML WebServices

putting it all together

Get started with a free trial

Or, use your existing benefits…

http://aka.ms/AzureConf2014

http://aka.ms/AzureConf-MemberOffers

THANK YOU!!!

AND STAY TUNED FOR THE NEXT SESSIONS!!!!!


Seth Juarez

Analytics Program Manager, DevExpress

@sethjuarez

sethj@devexpress.com
