Download - Pycon 2012 Scikit-Learn
![Page 1: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/1.jpg)
Learning from the Past: with Scikit-Learn
ANOOP THOMAS MATHEWProfoundis Labs Pvt. Ltd.
![Page 2: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/2.jpg)
![Page 3: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/3.jpg)
●Basics of Machine Learning
● Introduction some common techniques
●Let you know scikit-learn exists
●Some inspiration on using machine learning in daily life scenarios and live projects.
Agenda
![Page 4: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/4.jpg)
How to draw a snake?How to draw a snake?
![Page 5: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/5.jpg)
How to draw a snake?How to draw a snake?
This is WEIRD!
![Page 6: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/6.jpg)
A lot of Data!
What to do???
Introduction
![Page 7: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/7.jpg)
What is
Machine Learning (Data Mining)?
(in plain english)
Introduction
![Page 8: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/8.jpg)
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E"
Tom M. Mitchell
Machine Learning
![Page 9: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/9.jpg)
●Supervised Learning - model.fit(X, y)●Unsupervised Learning - model.fit(X)
Machine Learning
![Page 10: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/10.jpg)
Supervised Learning
![Page 11: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/11.jpg)
from sklearn.linear_model import Ridge as RidgeRegressionfrom sklearn import datasetsfrom matplotlib import pyplot as plt
boston = datasets.load_boston()X = boston.datay = boston.targetclf = RidgeRegression()clf.fit(X, y)clf.predict(X)
For example ...
![Page 12: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/12.jpg)
Unsupervised Learning
![Page 13: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/13.jpg)
from sklearn.cluster import KMeansfrom numpy.random import RandomStaterng = RandomState(42)k_means = KMeans(3, random_state=rng)k_means.fit(X)
For example ...
![Page 14: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/14.jpg)
ClusteringClassificationRegression
What can Scikit-learn do?
![Page 15: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/15.jpg)
• Model the collection of parameters you are trying to fit
• Data what you are using to fit the model
• Target the value you are trying to predict with your model
• Features attributes of your data that will be used in prediction
• Methods algorithms that will use your data to fit a model
Terminology
![Page 16: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/16.jpg)
● Understand the task. See how to measure the performance.
● Choose the source of training experience.
● Decide what will be input and output.
● Choose a set of models to the output function.
● Choose a learning algorithm.
Steps for Analysis
![Page 17: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/17.jpg)
● Understand the task. See how to measure the performance. Find the right question to ask.
● Choose the source of training experience.● Keep training and testing dataset separate. Beware of overfitting !
● Decide what will be input and expected output.
● Choose a set of models to approximate the output function. (use dimensinality reduction)
● Choose a learning algorithm. Try different ones ;)
Steps for Analysis
![Page 18: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/18.jpg)
Principal Component Analysis
Some Common AlgorithmsSome Common Algorithms
![Page 19: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/19.jpg)
● Support Vector Machine
Some Common AlgorithmsSome Common Algorithms
![Page 20: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/20.jpg)
● Nearest Neighbour Classifier
Some Common AlgorithmsSome Common Algorithms
![Page 21: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/21.jpg)
● Decision Tree Learning
Some Common AlgorithmsSome Common Algorithms
![Page 22: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/22.jpg)
k-means clustering
Some Common AlgorithmsSome Common Algorithms
![Page 23: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/23.jpg)
DB SCAN Clustering
Some Common AlgorithmsSome Common Algorithms
![Page 24: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/24.jpg)
●Log file analysis●Outlier dectection●Fraud Dectection●Forcasting●User patterns
Some Example UsecasesSome Example Usecases
![Page 25: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/25.jpg)
●nltk is a good(better) for text processing
●scikit-learn is for medium size problems
● for humongous projects, think of mahout●matplotlib can be used for visualization
● visualize it in browser using d3.js● have a look at pandas for numerical analysis
A few commentsA few comments
![Page 26: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/26.jpg)
●This is just the tip of an iceberg.●Scikit-learn is really cool to hack with.●A lot of examples(http://scikit-learn.org/stable/auto_examples/index.html)
Conclusion Conclusion
![Page 27: Pycon 2012 Scikit-Learn](https://reader034.vdocuments.us/reader034/viewer/2022050808/54c6a5184a7959d9228b4578/html5/thumbnails/27.jpg)
pip install scikit-learn
Its all in the internet.
Happy Hacking!
Final words Final words