Learning and Decision Trees



    Learning decision trees

Problem: decide whether to wait for a table at a restaurant, based on the following attributes:

    1. Alternate: is there an alternative restaurant nearby?

    2. Bar: is there a comfortable bar area to wait in?

3. Fri/Sat: is today Friday or Saturday?

4. Hungry: are we hungry?

5. Patrons: number of people in the restaurant (None, Some, Full)

    6. Price: price range ($, $$, $$$)

7. Raining: is it raining outside?

8. Reservation: have we made a reservation?

9. Type: kind of restaurant (French, Italian, Thai, Burger)

10. WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60)
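To make the domain concrete, here is a minimal sketch of one possible encoding in Python. The attribute names and value ranges come from the list above; the dictionary layout and the single labelled example are illustrative assumptions, not part of the slides.

```python
# Hypothetical encoding of the restaurant domain described above.
# Attribute names and value sets come from the slides; the layout
# and the single example below are illustrative assumptions.
ATTRIBUTES = {
    "Alternate":    [True, False],
    "Bar":          [True, False],
    "Fri/Sat":      [True, False],
    "Hungry":       [True, False],
    "Patrons":      ["None", "Some", "Full"],
    "Price":        ["$", "$$", "$$$"],
    "Raining":      [True, False],
    "Reservation":  [True, False],
    "Type":         ["French", "Italian", "Thai", "Burger"],
    "WaitEstimate": ["0-10", "10-30", "30-60", ">60"],
}

# A labelled training example: (attribute values, whether we waited).
example = ({"Alternate": True, "Bar": False, "Fri/Sat": False,
            "Hungry": True, "Patrons": "Some", "Price": "$$$",
            "Raining": False, "Reservation": True, "Type": "French",
            "WaitEstimate": "0-10"}, True)
```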


    Attribute-based representations

Examples described by attribute values (Boolean, discrete, continuous)

    E.g., situations where I will/won't wait for a table:

Classification of examples is positive (T) or negative (F)


    Learning Decision Trees

problem: find a decision tree that agrees with the training set

trivial solution: construct a tree with one branch for each sample of the training set
- works perfectly for the samples in the training set
- may not work well for new samples (generalization)
- results in relatively large trees

better solution: find a concise tree that still agrees with all samples
- corresponds to the simplest hypothesis that is consistent with the training set


    Constructing Decision Trees

in general, constructing the smallest possible decision tree is an intractable problem

algorithms exist for constructing reasonably small trees

basic idea: test the most important attribute first
- the attribute that makes the most difference for the classification of an example
- can be determined through information theory
- hopefully will yield the correct classification with few tests


    Decision Tree Algorithm

recursive formulation:
- select the best attribute to split positive and negative examples
- if only positive or only negative examples are left, we are done
- if no examples are left, no such examples were observed; return a default value calculated from the majority classification at the node's parent
- if we have positive and negative examples left, but no attributes to split them, we are in trouble: the samples have the same description but different classifications; this may be caused by incorrect data (noise), by a lack of information, or by a truly non-deterministic domain
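This recursive formulation translates almost directly into code. The sketch below is one possible rendering, assuming examples are (attribute-dict, label) pairs as in the earlier snippet; choose_attribute is a placeholder for the information-theoretic selection discussed on the following slides, and all names here are mine, not the slides'.

```python
from collections import Counter

def majority_value(examples):
    """Most common classification among the examples."""
    return Counter(label for _, label in examples).most_common(1)[0][0]

def dtl(examples, attributes, default, choose_attribute, domains):
    """Recursive decision-tree learning, case by case as on the slide."""
    if not examples:
        # No such examples were observed: return the default value
        # computed from the majority classification at the parent.
        return default
    labels = {label for _, label in examples}
    if len(labels) == 1:
        # Only positive or only negative examples are left: done.
        return labels.pop()
    if not attributes:
        # Positive and negative examples left but no attributes to split
        # them (noise, missing information, or a non-deterministic
        # domain): fall back to the majority classification.
        return majority_value(examples)
    best = choose_attribute(examples, attributes)  # e.g. highest info gain
    tree = {best: {}}
    for value in domains[best]:
        subset = [(x, y) for x, y in examples if x[best] == value]
        rest = [a for a in attributes if a != best]
        tree[best][value] = dtl(subset, rest, majority_value(examples),
                                choose_attribute, domains)
    return tree
```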


    Decision Tree Learning

    Aim: find a small tree consistent with the training examples

Idea: (recursively) choose "most significant" attribute as root of (sub)tree


    Information


    Entropy
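The formulas on the Information and Entropy slides did not survive the transcript, but the I(·,·) notation used on the information-gain slide is the standard information content of a distribution, I(p1, ..., pn) = -Σ pi log2 pi. A minimal sketch, assuming that standard definition:

```python
from math import log2

def information(*probs):
    """Information content I(p1, ..., pn) in bits; 0*log 0 is taken as 0."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Sanity check against the next slide: a 50/50 split carries one full bit.
assert abs(information(6/12, 6/12) - 1.0) < 1e-12
```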


    Information gain

For the training set, p = n = 6, so I(6/12, 6/12) = 1 bit

Consider the attributes Patrons and Type (and others too):

Patrons has the highest IG of all attributes and so is chosen by the DTL algorithm as the root

IG(Patrons) = 1 - [2/12 I(0,1) + 4/12 I(1,0) + 6/12 I(2/6, 4/6)] = 0.541 bits

IG(Type) = 1 - [2/12 I(1/2, 1/2) + 2/12 I(1/2, 1/2) + 4/12 I(2/4, 2/4) + 4/12 I(2/4, 2/4)] = 0 bits
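These two values can be checked mechanically. The sketch below reuses the information function from the previous snippet; the per-value (positive, negative) counts are read off the formulas above:

```python
from math import log2

def information(*probs):
    return -sum(p * log2(p) for p in probs if p > 0)

def gain(splits, total=12):
    """IG = I(6/12, 6/12) - expected information remaining after the split.

    splits: one (positives, negatives) pair per attribute value,
    over the 12-example training set with p = n = 6.
    """
    remainder = sum((p + n) / total * information(p / (p + n), n / (p + n))
                    for p, n in splits if p + n > 0)
    return 1 - remainder

print(gain([(0, 2), (4, 0), (2, 4)]))          # Patrons: ~0.541 bits
print(gain([(1, 1), (1, 1), (2, 2), (2, 2)]))  # Type:    0.0 bits
```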


    Example contd.

    Decision tree learned from the 12 examples:

Substantially simpler than the true tree: a more complex hypothesis isn't justified by such a small amount of data


    Expressiveness of Decision Trees

decision trees can also be expressed as implication sentences

in principle, they can express propositional logic sentences
- each row in the truth table of a sentence can be represented as a path in the tree

    often there are more efficient trees

some functions require exponentially large decision trees
- e.g. the parity function and the majority function
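The parity case can be seen numerically: for a Boolean parity (XOR) function every individual attribute has zero information gain, so no choice of test order helps, and the tree ends up branching on every attribute. A small check (names and setup are mine, not the slides'):

```python
from itertools import product
from math import log2

def information(*probs):
    return -sum(p * log2(p) for p in probs if p > 0)

# 3-bit parity: each single bit leaves a perfect 50/50 split behind,
# so its information gain is 0, yet the full tree needs 2^3 leaves.
rows = [(bits, sum(bits) % 2) for bits in product([0, 1], repeat=3)]
for i in range(3):
    remainder = 0.0
    for v in (0, 1):
        subset = [label for bits, label in rows if bits[i] == v]
        p = sum(subset) / len(subset)
        remainder += len(subset) / len(rows) * information(p, 1 - p)
    print(f"gain(bit {i}) = {1 - remainder}")  # 0.0 for every bit
```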


    Expressiveness

Decision trees can express any function of the input attributes.

E.g., for Boolean functions, truth table row → path to leaf:

Trivially, there is a consistent decision tree for any training set with one path to leaf for each example (unless f is nondeterministic in x), but it probably won't generalize to new examples


    Performance of Decision Tree Learning

quality of predictions
- predictions for the classification of unknown examples that agree with the correct result are obviously better
- can be measured easily after the fact
- can be assessed in advance by splitting the available examples into a training set and a test set: learn from the training set, then assess the performance on the test set (see the sketch below)

size of the tree
- a smaller tree (especially depth-wise) is a more concise representation
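A minimal sketch of that train/test protocol, assuming the labelled-example format and the hypothetical dtl learner from the earlier snippets:

```python
import random

def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle labelled examples and split into training and test sets."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def accuracy(predict, test_set):
    """Fraction of test examples whose prediction matches the true label."""
    return sum(predict(x) == y for x, y in test_set) / len(test_set)

# Usage sketch (dtl, choose_attribute and a tree-walking classify helper
# are the hypothetical pieces from the earlier snippets):
#   train, test = train_test_split(examples)
#   tree = dtl(train, list(ATTRIBUTES), None, choose_attribute, ATTRIBUTES)
#   print(accuracy(lambda x: classify(tree, x), test))
```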
