AI Lecture 11 & 12 - Machine Learning

Upload: anum-khawaja

Post on 03-Apr-2018


TRANSCRIPT

  • 7/28/2019 AI Lecture 11 & 12 - Machine Learning

Machine Learning: Lecture 11 & 12

Artificial Intelligence, Spring 2013


INTRODUCTION

What is Machine Learning?

The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience (T. Mitchell).

Principles, methods, and algorithms for learning and prediction on the basis of past experience.

In the broadest sense, any method that incorporates information from training samples in the design of a classifier employs learning.


Our tendency is to view learning only in the manner in which humans learn, i.e. incrementally over time. This may not be the case where ML algorithms are concerned.



    A simple decision model



An overly complex decision model. This may lead to worse classification than a simple model.



Maybe this model is an optimal trade-off between model complexity and performance on the training set.



A classification problem: predicting the grades of students taking this course.

Key steps:
1. Data (what past experience can we rely on?)
2. Assumptions (what can we assume about the students or the course?)
3. Representation (how do we summarize a student?)
4. Estimation (how do we construct a map from students to grades?)
5. Evaluation (how well are we predicting?)
6. Model Selection (perhaps we can do even better?)



1. Data: The data we have available may be:
- names and grades of students in past years' ML courses
- academic records of past and current students

  Student  ML  Course X  Course Y
  Peter    A   B         A          (training data)
  David    B   A         A          (training data)
  Jack     ?   C         A          (current data)
  Kate     ?   A         A          (current data)



2. Assumptions:

There are many assumptions we can make to facilitate predictions:
1. The course has remained roughly the same over the years
2. Each student performs independently of the others



3. Representation:

Academic records are rather diverse, so we might limit the summaries to a select few courses. For example, we can summarize the i-th student (say Peter) with a vector

X_i = [A C B]

where the grades may correspond to numerical values.
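As a sketch, the grade-vector summary could be encoded as follows; the particular mapping A=4 ... F=0 is an assumption, since the slide only says the grades "may correspond to numerical values":

```python
# Hypothetical grade-to-number encoding for a student's course summary.
# The exact mapping below is an illustrative assumption.
GRADE_VALUE = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

def encode_student(grades):
    """Summarize a student as a numeric vector, e.g. ['A','C','B'] -> [4,2,3]."""
    return [GRADE_VALUE[g] for g in grades]

print(encode_student(["A", "C", "B"]))  # [4, 2, 3]
```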



3. Representation:

The available data in this representation is:

  Training data        Data for prediction
  Student  ML grade    Student  ML grade
  X1t      B           X1p      ?
  X2t      A           X2p      ?



4. Estimation

Given the training data

  Student  ML grade
  X1t      B
  X2t      A

we need to find a mapping from input vectors x to labels y encoding the grades for the ML course.



Possible solution (nearest neighbor classifier):

1. For any student x, find the closest student xi in the training set
2. Predict yi, the grade of that closest student
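The two steps of the nearest-neighbor rule can be sketched in a few lines of Python. The numeric grade encoding and the squared-distance measure are illustrative assumptions, not part of the original slides:

```python
# Minimal nearest-neighbor classifier over numeric grade vectors.
# Grade encoding and distance choice are illustrative assumptions.
GRADE_VALUE = {"A": 4, "B": 3, "C": 2}

def distance(x, y):
    # Squared Euclidean distance between two grade vectors.
    return sum((a - b) ** 2 for a, b in zip(x, y))

def nearest_neighbor_grade(x, training):
    """training: list of (grade_vector, ML_grade) pairs."""
    closest = min(training, key=lambda pair: distance(x, pair[0]))
    return closest[1]

# Training students as (Course X, Course Y) vectors -> ML grade:
training = [([GRADE_VALUE["B"], GRADE_VALUE["A"]], "A"),   # Peter
            ([GRADE_VALUE["A"], GRADE_VALUE["A"]], "B")]   # David

# Jack has C in Course X and A in Course Y; Peter is his nearest neighbor.
print(nearest_neighbor_grade([GRADE_VALUE["C"], GRADE_VALUE["A"]], training))  # A
```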



5. Evaluation

How can we tell how good our predictions are?
- we can wait till the end of this course...
- we can try to assess the accuracy based on the data we already have (the training data)

Possible solution:
- divide the training set further into training and test sets
- evaluate the classifier constructed on the basis of the training set on the test set
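A minimal sketch of this holdout evaluation; the majority-class "classifier" and the toy data are stand-ins for a real classifier and real records:

```python
import random

# Holdout evaluation sketch: split labeled data, fit on one part,
# measure accuracy on the held-out part.
def train_test_split(data, test_fraction=0.25, seed=0):
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

def majority_class(train):
    # Trivial "classifier": always predict the most common training label.
    labels = [y for _, y in train]
    return max(set(labels), key=labels.count)

def accuracy(test, predicted_label):
    return sum(1 for _, y in test if y == predicted_label) / len(test)

# Toy labeled data: 20 (features, label) pairs.
data = [((i,), "pass" if i % 4 else "fail") for i in range(20)]
train, test = train_test_split(data)
print(round(accuracy(test, majority_class(train)), 2))
```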



6. Model Selection

We can refine:
- the estimation algorithm (e.g., using a classifier other than the nearest neighbor classifier)
- the representation (e.g., base the summaries on a different set of courses)
- the assumptions (e.g., perhaps students work in groups), etc.

We have to rely on the method of evaluating the accuracy of our predictions to select among the possible refinements.



Types of Machine Learning

Data can be:
- Symbolic or categorical (e.g. "high temperature")
- Numerical (e.g. 45 °C)

We will be primarily dealing with symbolic data.

Numerical data is primarily dealt with by artificial neural networks, which have evolved into a separate field.



From the available data we can:
- model the system which generated the data
- find interesting patterns in the data

We will be primarily concerned with rule-based modelling of the system from which the data was generated.

The search for interesting patterns is considered to be the domain of Data Mining.



A complete pattern recognition (or classification) system consists of several steps.

We will be primarily concerned with the development of classifier systems.



Supervised learning, where we get a set of training inputs and outputs. The correct output for the training samples is available.

Unsupervised learning, where we are interested in capturing the inherent organization in the data. No specific output values are supplied with the learning patterns.

Reinforcement learning, where no exact outputs are supplied, but there is a reward (reinforcement) for desirable behaviour.



Why Use Machine Learning?

First, there are problems for which there exist no human experts.

Example: in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules.


Second, there are problems where human experts exist, but where they are unable to explain their expertise.

This is the case in many perceptual tasks, such as speech recognition, handwriting recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs.




Third, there are problems where the phenomena are changing rapidly.

Example: people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. The rules and parameters governing these behaviors change frequently, so a hand-written computer program for prediction would need to be rewritten frequently.



Fourth, there are applications that need to be customized for each computer user separately.

Example: a program to filter unwanted electronic mail messages. Different users will need different filters.



VERSION SPACE

Concept Learning by Induction

Learning has been classified into several types.

Much of human learning involves acquiring general concepts from specific training examples (this is called inductive learning).


Example: the concept of "ball":
* red, round, small
* green, round, small
* red, round, medium

Complicated concepts: e.g., situations in which I should study more to pass the exam.



Each concept can be thought of as a Boolean-valued function whose value is true for some inputs and false for all the rest (e.g. a function defined over all animals, whose value is true for birds and false for all the other animals).

The problem of automatically inferring the general definition of some concept, given examples labeled as members or nonmembers of the concept, is called concept learning, or approximating (inferring) a Boolean-valued function from examples.



Target concept to be learnt: days on which Aldo enjoys his favorite water sport.

The training examples present are (the standard EnjoySport data; the three positive days reappear verbatim on a later slide):

  Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
  1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
  2        Sunny  Warm     High      Strong  Warm   Same      Yes
  3        Rainy  Cold     High      Strong  Warm   Change    No
  4        Sunny  Warm     High      Strong  Cool   Change    Yes



The training examples are described by the values of seven attributes.

The task is to learn to predict the value of the attribute EnjoySport for an arbitrary day, based on the values of its other attributes.



Concept Learning by Induction: Hypothesis Representation

The possible concepts are called hypotheses, and we need an appropriate representation for them.

Let a hypothesis be a conjunction of constraints on the attribute values.


If sky = sunny ∧ temp = warm ∧ humidity = ? ∧ wind = strong ∧ water = ? ∧ forecast = same
then EnjoySport = Yes
else EnjoySport = No

Alternatively, this can be written as: {sunny, warm, ?, strong, ?, same}



For each attribute, the hypothesis will have either:
- "?" : any value is acceptable
- a single value : only that value is acceptable
- "∅" : no value is acceptable



If some instance (example/observation) satisfies all the constraints of a hypothesis, then it is classified as positive (belonging to the concept).

The most general hypothesis is {?, ?, ?, ?, ?, ?}. It would classify every example as positive.

The most specific hypothesis is {∅, ∅, ∅, ∅, ∅, ∅}. It would classify every example as negative.
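Checking an instance against a hypothesis can be sketched as follows; here the string "0" stands in for the empty constraint ∅:

```python
# A hypothesis is a tuple of per-attribute constraints: "?" accepts any
# value, "0" (standing in for the empty constraint) accepts no value,
# and any other constraint must match the instance's value exactly.
def matches(hypothesis, instance):
    return all(h == "?" or (h != "0" and h == x)
               for h, x in zip(hypothesis, instance))

most_general  = ("?",) * 6
most_specific = ("0",) * 6
day = ("sunny", "warm", "normal", "strong", "warm", "same")

print(matches(most_general, day))    # True: classifies everything positive
print(matches(most_specific, day))   # False: classifies everything negative
print(matches(("sunny", "warm", "?", "strong", "?", "same"), day))  # True
```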



An alternate hypothesis representation could have been a disjunction of several conjunctions of constraints on the attribute values. Example:

{sunny, warm, normal, strong, warm, same} ∨ {sunny, warm, high, strong, warm, same} ∨ {sunny, warm, high, strong, cool, change}



Another alternate hypothesis representation could have been a conjunction of constraints on the attribute values, where each constraint may be a disjunction of values. Example:

{sunny, warm, normal ∨ high, strong, warm ∨ cool, same ∨ change}



Yet another alternate hypothesis representation could have incorporated negations. Example:

{sunny, warm, ¬(normal ∨ high), ?, ?, ?}



By selecting a hypothesis representation, the space of all hypotheses (that the program can ever represent and therefore can ever learn) is implicitly defined.

In our example, the instance space X contains 3 × 2 × 2 × 2 × 2 × 2 = 96 distinct instances.

There are 5 × 4 × 4 × 4 × 4 × 4 = 5120 syntactically distinct hypotheses. Since every hypothesis containing even one ∅ classifies every instance as negative, the number of semantically distinct hypotheses is 4 × 3 × 3 × 3 × 3 × 3 + 1 = 973.
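The counting argument can be verified directly. Sky has 3 possible values and the other five attributes have 2 each; every slot of a hypothesis additionally allows "?" and ∅:

```python
# Reproducing the slide's counting argument for the EnjoySport example.
domain_sizes = [3, 2, 2, 2, 2, 2]   # Sky has 3 values, the rest have 2

instances = 1
for d in domain_sizes:
    instances *= d                  # 3*2*2*2*2*2 distinct instances

syntactic = 1
for d in domain_sizes:
    syntactic *= d + 2              # each slot: d values, plus "?" and the empty constraint

semantic = 1
for d in domain_sizes:
    semantic *= d + 1               # d values plus "?"
semantic += 1                       # all hypotheses with an empty slot collapse into one

print(instances, syntactic, semantic)  # 96 5120 973
```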



Most practical learning tasks involve much larger, sometimes infinite, hypothesis spaces.



Concept Learning by Induction: Search in the Hypothesis Space

Concept learning can be viewed as the task of searching through a large space of hypotheses implicitly defined by the hypothesis representation.

The goal of this search is to find the hypothesis that best fits the training examples.


Concept Learning by Induction: Basic Assumption

Once a hypothesis that best fits the training examples is found, we can use it to predict the class label of new examples. The basic assumption while using this hypothesis is:

Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.


Concept Learning by Induction: General-to-Specific Ordering

If we view learning as a search problem, then it is natural that our study of learning algorithms will examine different strategies for searching the hypothesis space.

Many algorithms for concept learning organize the search through the hypothesis space by relying on a general-to-specific ordering of hypotheses.


Example: Consider
h1 = {sunny, ?, ?, strong, ?, ?}
h2 = {sunny, ?, ?, ?, ?, ?}

Any instance classified positive by h1 will also be classified positive by h2 (because h2 imposes fewer constraints on the instance). Hence h2 is more general than h1, and h1 is more specific than h2.
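The more-general-than-or-equal-to relation between two conjunctive hypotheses can be sketched as a slot-by-slot check (ignoring empty constraints for simplicity):

```python
# h2 is more general than or equal to h1 if, for every attribute slot,
# h2's constraint is satisfied whenever h1's is: either h2's slot is "?"
# or it imposes exactly the same value as h1's slot.
def more_general_or_equal(h2, h1):
    return all(g == "?" or g == s for g, s in zip(h2, h1))

h1 = ("sunny", "?", "?", "strong", "?", "?")
h2 = ("sunny", "?", "?", "?", "?", "?")

print(more_general_or_equal(h2, h1))  # True: h2 drops the wind constraint
print(more_general_or_equal(h1, h2))  # False: h1 adds a constraint
```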



    Consider the three hypotheses h1, h2 and h3



Neither h1 nor h3 is more general than the other. h2 is more general than both h1 and h3.

Note that the more-general-than relationship is independent of the target concept. It depends only on which instances satisfy the two hypotheses, and not on the classification of those instances according to the target concept.



Find-S Algorithm

How do we find a hypothesis consistent with the observed training examples? (A hypothesis is consistent with the training examples if it correctly classifies those examples.)

One way is to begin with the most specific possible hypothesis, then generalize it each time it fails to cover a positive training example (i.e. classifies it as negative).

The algorithm based on this method is called Find-S.
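A sketch of Find-S in Python. The four EnjoySport training examples used here are the standard version of this dataset (an assumption, since the slides' table is not reproduced in this transcript):

```python
# Find-S: start from the most specific hypothesis and minimally
# generalize it on each positive example; negative examples are ignored.
def find_s(examples, n_attrs):
    h = ["0"] * n_attrs                 # "0" stands in for the empty constraint
    for instance, label in examples:
        if label != "yes":
            continue                    # Find-S ignores negative examples
        for i, value in enumerate(instance):
            if h[i] == "0":
                h[i] = value            # first positive example: copy its values
            elif h[i] != value:
                h[i] = "?"              # conflicting value: generalize to "?"
    return h

examples = [
    (("sunny", "warm", "normal", "strong", "warm", "same"),   "yes"),
    (("sunny", "warm", "high",   "strong", "warm", "same"),   "yes"),
    (("rainy", "cold", "high",   "strong", "warm", "change"), "no"),
    (("sunny", "warm", "high",   "strong", "cool", "change"), "yes"),
]

print(find_s(examples, 6))  # ['sunny', 'warm', '?', 'strong', '?', '?']
```

The result is the most specific conjunctive hypothesis consistent with all the positive examples.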


We say that a hypothesis covers a positive training example if it correctly classifies the example as positive.

A positive training example is an example of the concept to be learnt; similarly, a negative training example is not an example of the concept.


[Slides 50-51: statement of the Find-S algorithm and a diagram of its search through the hypothesis space — not reproduced in this transcript.]


The nodes shown in the diagram are the possible hypotheses allowed by our hypothesis representation scheme.

Note that our search is guided by the positive examples, and we consider only those hypotheses which are consistent with the positive training examples.

The search moves from hypothesis to hypothesis, from the most specific to progressively more general hypotheses.



At each step, the hypothesis is generalized only as far as necessary to cover the new positive example.

Therefore, at each stage the hypothesis is the most specific hypothesis consistent with the training examples observed up to this point. Hence the name Find-S.
