l 1 intro machine learning

Upload: prateek-singh

Post on 03-Jun-2018

TRANSCRIPT

  • 8/12/2019 l 1 Intro Machine Learning

    1/45

    Unit Coverage

    Machine Learning Basics

    Machine Learning Applications

    Supervised and Unsupervised learning

    Gradient Descent for learning

    6/4/2014 6:53 PM Copyright @ gdeepak.com

  • 8/12/2019 l 1 Intro Machine Learning

    2/45

    Machine Learning

    Machine learning, a branch of artificial intelligence, concerns the construction and study of systems that can learn from data. -Wikipedia

    It is a type of AI that provides computers with the ability to learn without being explicitly programmed. Machine learning focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data.

  • 8/12/2019 l 1 Intro Machine Learning

    3/45

    Machine Learning - Abstract Definition

    A machine (computer program) is said to learn from experience E with respect to some task T and performance measure P if its performance on T, as measured by P, keeps improving as E increases.

    Example: Someone writes a program to classify and filter your emails as spam or not, based on your marking of individual mails as spam or not. What are T, E and P in this example?

    - Your marking of an email as spam or not
    - Percentage of emails being true positive for spam
    - Recording of your labelling or classification of email as spam or not

  • 8/12/2019 l 1 Intro Machine Learning

    4/45

    Characteristics of Machine Learning

    - The way learning happens is very critical
    - Applies to tasks that cannot be defined well, except by examples
    - Finds relationships and correlations that can be hidden in the data
    - Learns in proportion to experience, e.g. becoming a better player after playing many games
    - Results may vary vastly if we apply different learning paths or different ML algorithms

  • 8/12/2019 l 1 Intro Machine Learning

    5/45

    Two Basic Types

    Supervised Learning

    In general it means that we are going to supervise the learning mechanism, or that we are going to supply some guidelines/parameters/labelling regarding the data.

    In supervised learning, training patterns giving inputs and the corresponding correct outputs are available.

  • 8/12/2019 l 1 Intro Machine Learning

    6/45

    Unsupervised Learning

    Learning happens automatically and the structures hidden in the data are recognised by the system.

    The system must find interesting and/or significant patterns in the data without any feedback as to what is right.

  • 8/12/2019 l 1 Intro Machine Learning

    7/45

    Grading Example

    [Figure: grades (A, B, C, D) plotted against marks]

  • 8/12/2019 l 1 Intro Machine Learning

    8/45

    Handwriting Recognition

    [Figure: the letter S written by different people]

  • 8/12/2019 l 1 Intro Machine Learning

    9/45

    Supervised Learning

    For each data point we will let the machine know whether it is a star or a smiley.

  • 8/12/2019 l 1 Intro Machine Learning

    10/45

    Classification or regression

    When we have discrete outputs, the problem is a classification problem.

    When we have continuous outputs, it is a regression problem.

  • 8/12/2019 l 1 Intro Machine Learning

    11/45

    Traffic Time Prediction

    You give 10 actual timings to reach Delhi from Ambala when starting at different times of the day, beginning at 9 A.M.

    Now you want to predict the timing at some other time.

    You may use different curve fittings.
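    One possible curve fitting for the travel-time question is a straight line fitted by least squares. The sketch below is only an illustration: the start hours and travel times are hypothetical numbers invented for the example, not the ten timings the slide refers to, and `fit_line` is a helper name chosen here. A polynomial or other curve could be fitted instead.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = a + b*x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical observations: start hour (24h clock) and travel time in minutes.
start_hours = [9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
travel_minutes = [240, 235, 225, 220, 215, 220, 230, 245, 260, 270]

a, b = fit_line(start_hours, travel_minutes)
predicted = a + b * 12.5  # predicted travel time when starting at 12:30
```

    Because the output (travel time) is continuous, this is a regression problem in the sense of the previous slide.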

  • 8/12/2019 l 1 Intro Machine Learning

    14/45

    Speech Recognition

    - Database of requests: the user speaks something, and you need to identify the request
    - You need to capture the individual recordings from a meeting of four people
    - You need to separate the conversation from background noise or music
    - You need to understand the speech in a particular language and do the text labelling
    - You need to convert the speech into some other language

  • 8/12/2019 l 1 Intro Machine Learning

    15/45

    Medical Diagnosis

    - Given a symptom and disease database: a new patient with some symptoms comes, and you need to identify the disease
    - To diagnose the disease from the test results and by analysing the images from the medical equipment
    - To recognise disability by looking at the photograph of the person
    - Different machine learning tests for various disabilities, e.g. hearing test

  • 8/12/2019 l 1 Intro Machine Learning

    16/45

    Unsupervised Learning

    Each data point is given but not labelled; the machine is supposed to find some structure in the data, in this example called clusters.
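    The slide does not name a clustering algorithm. As one illustration of finding clusters in unlabelled data, the sketch below uses plain k-means on one-dimensional values. The data points, the function name `kmeans_1d`, and the evenly spaced starting centres are all assumptions made for this example.

```python
def kmeans_1d(values, k, iterations=10):
    """Group 1-D values into k clusters with plain k-means (k >= 2)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centres spread evenly between min and max.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centre.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        # Update step: each centre moves to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical unlabelled data with two visible groups.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
centers, clusters = kmeans_1d(points, k=2)
```

    No labels are supplied anywhere; the grouping comes only from the distances between the points.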

  • 8/12/2019 l 1 Intro Machine Learning

    17/45

    News.google.com

    All similar stories are clustered in one place.

  • 8/12/2019 l 1 Intro Machine Learning

    18/45

    Participant Segmentation

    Suppose I give all the registration form data of IWMLDA to some unsupervised-learning-based program, and it comes out with some grouping based on the distinguishing features given in the registration form.

  • 8/12/2019 l 1 Intro Machine Learning

    19/45

    Social Network Analysis

    To find groups of a certain kind on the network based on their activity, cohesiveness, type of chatter, type of likes, etc.

  • 8/12/2019 l 1 Intro Machine Learning

    20/45

    Sentiment Analysis

    You want to buy a product and you want to know the sentiments of the public who have previously used or bought that product.

    There can be different kinds of sentiments that have been expressed online; they may be related to sports, politics, tragedy, agitation/revolution.

  • 8/12/2019 l 1 Intro Machine Learning

    21/45

    Semi-Supervised Learning

    Semi-supervised learning is a class of supervised learning tasks and techniques that also make use of unlabeled data for training, typically a small amount of labeled data with a large amount of unlabeled data.

    Which approach to use will depend upon the type and size of the data available.

  • 8/12/2019 l 1 Intro Machine Learning

    22/45

    Training Set-Old car price example

    Mileage    Car Price
    2000       300000
    20000      200000
    18000      220000
    100000     100000
    50000      150000
    80000      130000
    10000      250000

  • 8/12/2019 l 1 Intro Machine Learning

    23/45

    Some Terminology

    - m = number of training records
    - x = input features / input values of the variables (can be more than one)
    - y = output value (can be more than one)
    - (x, y) is one training record
    - $(x^{(i)}, y^{(i)})$ is the i-th training example
    - $\theta_i$ is the parameter of feature $x_i$

    What are $y^{(4)}$ and $x^{(2)}$ on the previous slide?
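    The notation can be made concrete with the old-car-price training set. The sketch below assumes that the superscript index counts rows from 1 in the order shown on that slide; `record` is a helper name introduced here for illustration.

```python
# Training set from the old-car-price slide: (mileage, price) pairs.
training_set = [
    (2000, 300000),
    (20000, 200000),
    (18000, 220000),
    (100000, 100000),
    (50000, 150000),
    (80000, 130000),
    (10000, 250000),
]

m = len(training_set)  # number of training records

def record(i):
    """Return (x^(i), y^(i)) using the slides' 1-based index i."""
    return training_set[i - 1]

x2, _ = record(2)  # input (mileage) of the second record
_, y4 = record(4)  # output (price) of the fourth record
```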

  • 8/12/2019 l 1 Intro Machine Learning

    24/45

    General Model

    [Figure: training records go into a learning algorithm, which outputs a hypothesis (h); h maps mileage (x) to car price (y)]

  • 8/12/2019 l 1 Intro Machine Learning

    25/45

    Hypothesis Parameters

    The only point to keep in mind while selecting the parameters $\theta_i$ is that they should give a hypothesis value as close as possible to y in the training records.

  • 8/12/2019 l 1 Intro Machine Learning

    26/45

    Cost Function using Squared Error Function
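    The formula on this slide is not part of the transcript. A common way to write a squared-error cost, assuming a straight-line hypothesis h(x) = theta0 + theta1 * x and the usual 1/(2m) scaling, is sketched below. Both the hypothesis form and the scaling are assumptions rather than the slide's confirmed definition, and the toy data is made up.

```python
def hypothesis(theta0, theta1, x):
    """Assumed straight-line hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = (1 / (2m)) * sum((h(x) - y)^2)."""
    m = len(xs)
    total = sum((hypothesis(theta0, theta1, x) - y) ** 2
                for x, y in zip(xs, ys))
    return total / (2 * m)

# Toy data lying exactly on y = 2x, so theta = (0, 2) fits perfectly.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
perfect = cost(0.0, 2.0, xs, ys)  # hypothesis matches every record
poor = cost(0.0, 0.0, xs, ys)     # hypothesis always predicts 0
```

    The closer the hypothesis values are to y in the training records, the smaller the cost, which is why the parameters are chosen to minimise it.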

  • 8/12/2019 l 1 Intro Machine Learning

    27/45

    Contour Plots

  • 8/12/2019 l 1 Intro Machine Learning

    28/45

    Contour Figure

  • 8/12/2019 l 1 Intro Machine Learning

    30/45

    Another Example

  • 8/12/2019 l 1 Intro Machine Learning

    31/45

    Gradient Descent

    To minimize the cost function:

    $\min_{\theta_0, \theta_1, \theta_2, \ldots, \theta_n} J(\theta_0, \theta_1, \theta_2, \ldots, \theta_n)$

    Start with some initial values of $\theta$.

    Keep applying gradient descent until we reach the minimum possible value, which may be the optimal value of the cost function.
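    The update formula is not reproduced in this transcript. For reference, the usual textbook form of one gradient descent step is given here as an assumption about what the slides show, with $\alpha$ denoting the learning rate:

```latex
\theta_j := \theta_j - \alpha \, \frac{\partial}{\partial \theta_j}
            J(\theta_0, \theta_1, \ldots, \theta_n),
\qquad j = 0, 1, \ldots, n
```

    Each step moves every $\theta_j$ a little in the direction that reduces J, and the step is repeated until J stops decreasing.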

  • 8/12/2019 l 1 Intro Machine Learning

    32/45

    Different Shape Bowls

  • 8/12/2019 l 1 Intro Machine Learning

    33/45

    For convergence

    In the update rule, alpha ($\alpha$) is the learning rate. The learning rate plays an important role in how slowly or quickly gradient descent converges, but there is always a trade-off. With a small learning rate, the algorithm may take many iterations and will be slow, while with a large learning rate, the algorithm may be fast but it may not converge at all, and we may skip or bypass the local or global minimum.

    All values of $\theta_j$, for j from 0 to n, should be updated simultaneously.
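    The trade-off can be seen on a toy bowl-shaped cost. The sketch below assumes J(theta) = theta**2, whose gradient is 2 * theta and whose minimum is at 0. The cost function, the starting point, and the three learning rates are chosen here purely for illustration.

```python
def descend(alpha, theta=1.0, steps=20):
    """Run gradient descent on J(theta) = theta**2 (gradient 2*theta)."""
    for _ in range(steps):
        theta = theta - alpha * 2 * theta
    return theta

small = descend(alpha=0.01)  # slow: still far from the minimum at 0
good = descend(alpha=0.1)    # steady convergence towards 0
large = descend(alpha=1.1)   # overshoots on every step and diverges
```

    After the same number of steps, the small rate has barely moved, the moderate rate is close to the minimum, and the large rate has jumped back and forth across the minimum with growing amplitude.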

  • 8/12/2019 l 1 Intro Machine Learning

    34/45

    Concept of learning rate on a bowl-shaped curve

  • 8/12/2019 l 1 Intro Machine Learning

    35/45

    Gradient Descent with linear regression

    We repeat the update expression until the function converges for all values of $\theta$.

    If each step of gradient descent uses all the training records, then the algorithm comes under the category of batch gradient descent.
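    A minimal sketch of batch gradient descent for a straight-line hypothesis h(x) = theta0 + theta1 * x follows. The hypothesis form, the learning rate, and the iteration count are assumptions. The data is the old-car-price training set, rescaled here (mileage in units of 10,000, price in units of 100,000) so that a simple fixed learning rate works.

```python
def batch_gradient_descent(xs, ys, alpha=0.05, iterations=5000):
    """Fit h(x) = theta0 + theta1*x; every step uses all training records."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients came from the old thetas.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

# Rescaled car data (rescaling is an assumption made for this example).
mileage = [0.2, 2.0, 1.8, 10.0, 5.0, 8.0, 1.0]
price = [3.0, 2.0, 2.2, 1.0, 1.5, 1.3, 2.5]

theta0, theta1 = batch_gradient_descent(mileage, price)
```

    The fitted slope comes out negative, matching the intuition that price falls as mileage rises.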

  • 8/12/2019 l 1 Intro Machine Learning

    36/45

    Dealing with multiple variables

    Mileage    Car Price    Engine Size (No. of Cylinders)    Original Price    Accessory Cost
    2000       300000       4                                 500000            40000
    20000      200000       6                                 600000            3000
    18000      220000       4                                 450000            100000
    100000     100000       4                                 400000            50000
    50000      150000       8                                 800000            110000

  • 8/12/2019 l 1 Intro Machine Learning

    37/45

    Feature scaling

    Since the range of values of raw data varies widely, in some machine learning algorithms objective functions will not work properly without normalization. For example, the majority of classifiers calculate the distance between two points by the Euclidean distance. If one of the features has a broad range of values, the distance will be governed by this particular feature. Therefore, the range of all features should be normalized so that each feature contributes approximately proportionately to the final distance.

  • 8/12/2019 l 1 Intro Machine Learning

    38/45

    How to do feature scaling

    $x := \dfrac{x - \mu}{s}$

    Here $\mu$ is the mean (average) of the training values of that feature, and s is the range (max - min) of that feature's training values. We try to get every feature into a similar range; with this formula every scaled value lies between -1 and +1. However, if the feature values are not too distorted, we may decide not to go for feature scaling.

  • 8/12/2019 l 1 Intro Machine Learning

    39/45

    Feature Scaling Example

    Average accessory cost: 303000/5 = 60600

    Max - Min = 107000

    Mileage    Car Price    Engine Size (No. of Cylinders)    Accessory Cost    After Scaling
    2000       300000       4                                 40000             -0.19
    20000      200000       6                                 3000              -0.54
    18000      220000       4                                 100000            +0.37
    100000     100000       4                                 50000             -0.10
    50000      150000       8                                 110000            +0.46
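    The scaling of the accessory cost column can be checked with a few lines of code. This is a minimal sketch of mean normalization, that is, subtracting the mean and dividing by the range (max - min). The function name `mean_normalize` is chosen here, and the result can be compared with the After Scaling column.

```python
def mean_normalize(values):
    """Scale each value as (x - mean) / (max - min)."""
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    return [(v - mean) / spread for v in values]

# Accessory cost column from the example table.
accessory_cost = [40000, 3000, 100000, 50000, 110000]
scaled = [round(v, 2) for v in mean_normalize(accessory_cost)]
```

    The same function can be applied to each of the other columns in turn so that no single feature dominates a distance calculation.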

  • 8/12/2019 l 1 Intro Machine Learning

    40/45

    Combining Features

    A few features may have the same values but may have been given in different units, for example height in cm and height in inches. Similarly, a few features have parallel values, e.g. length of the string, number of characters in the string, etc.

  • 8/12/2019 l 1 Intro Machine Learning

    41/45

    Other Important Points Regarding Convergence of Gradient Descent

    - For a small learning rate, $J(\theta)$ should decrease on every iteration of the algorithm.
    - Having the learning rate too small or too large will have its own issues, as discussed before.
    - The number of iterations may vary from two digits to many digits.
    - If $J(\theta)$ decreases by less than 0.001 in one iteration, we can declare convergence, since any further change will be very small.
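    The 0.001 stopping rule can be sketched as a loop that compares the cost before and after each step. The toy cost J(theta) = (theta - 3)**2, the learning rate, and the function names below are assumptions chosen for illustration.

```python
def cost(theta):
    return (theta - 3.0) ** 2  # toy cost with its minimum at theta = 3

def gradient(theta):
    return 2.0 * (theta - 3.0)

def run_until_converged(alpha=0.1, tolerance=0.001, max_iterations=1000):
    theta = 0.0
    previous = cost(theta)
    for iteration in range(1, max_iterations + 1):
        theta -= alpha * gradient(theta)
        current = cost(theta)
        if previous - current < tolerance:
            return theta, iteration  # decrease fell below the threshold
        previous = current
    return theta, max_iterations

theta, iterations = run_until_converged()
```

    Plotting or printing the cost at every iteration is a simple way to confirm that it really decreases each time for the chosen learning rate.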

  • 8/12/2019 l 1 Intro Machine Learning

    42/45

    Question

    Does the learning rate remain the same, or does it change over time? If yes, why? If no, why?

  • 8/12/2019 l 1 Intro Machine Learning

    43/45

    Question

    Should the test sample (green circle) be classified to the first class of blue squares or to the second class of red triangles using the k-NN technique?

    - If k = 3 (solid line circle)
    - If k = 5 (dashed line circle)
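    The k-NN technique itself can be sketched as a majority vote among the k nearest labelled points. The coordinates below are hypothetical and are not read off the slide's figure, and `knn_classify` is a name chosen here. They only show how the chosen class can change with k.

```python
import math
from collections import Counter

def knn_classify(query, labelled_points, k):
    """Return the majority label among the k points nearest to query."""
    by_distance = sorted(
        labelled_points,
        key=lambda item: math.hypot(item[0][0] - query[0],
                                    item[0][1] - query[1]),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical coordinates, not taken from the slide's figure.
points = [
    ((0.5, 0.5), "triangle"),
    ((-0.6, 0.3), "triangle"),
    ((0.2, -0.9), "square"),
    ((1.3, -0.6), "square"),
    ((-1.2, 0.8), "square"),
    ((3.0, 3.0), "triangle"),
    ((-3.0, -3.0), "square"),
]

label_k3 = knn_classify((0.0, 0.0), points, k=3)
label_k5 = knn_classify((0.0, 0.0), points, k=5)
```

    Using an odd k with two classes avoids tied votes.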

  • 8/12/2019 l 1 Intro Machine Learning

    44/45

    Question

    What will be your criteria to decide whether to use feature scaling or not?

  • 8/12/2019 l 1 Intro Machine Learning

    45/45

    Questions, Suggestions and Comments