decision trees: representation - svivek · summary: decision trees • decision trees can represent...
TRANSCRIPT
Machine Learning
Decision Trees: Representation
Some slides from Tom Mitchell, Dan Roth and others
Key issues in machine learning
• Modeling: How to formulate your problem as a machine learning problem? How to represent data? Which algorithms to use? What learning protocols?
• Representation: Good hypothesis spaces and good features
• Algorithms
  – What is a good learning algorithm?
  – What is success?
  – Generalization vs. overfitting
  – The computational question: How long will learning take?
Coming up… (the rest of the semester)
Different hypothesis spaces and learning algorithms
– Decision trees and the ID3 algorithm
– Linear classifiers
  • Perceptron
  • SVM
  • Logistic regression
– Combining multiple classifiers
  • Boosting, bagging
– Non-linear classifiers
– Nearest neighbors
Important issues to consider
1. What do these hypotheses represent?
2. Implicit assumptions and tradeoffs
3. Generalization?
4. How do we learn?
This lecture: Learning Decision Trees
1. Representation: What are decision trees?
2. Algorithm: Learning decision trees
   – The ID3 algorithm: A greedy heuristic
3. Some extensions
Representing data
Data can be represented as a big table, with columns denoting different attributes

Name               Label
Claire Cardie      -
Peter Bartlett     +
Eric Baum          +
Haym Hirsh         +
Shai Ben-David     +
Michael I. Jordan  -

Name               Name has punctuation?  Second character of first name  Length of first name > 5?  Same first letter in two names?  Label
Claire Cardie      No                     l                               Yes                        Yes                              -
Peter Bartlett     No                     e                               No                         No                               +
Eric Baum          No                     r                               No                         No                               +
Haym Hirsh         No                     a                               No                         Yes                              +
Shai Ben-David     Yes                    h                               No                         No                               +
Michael I. Jordan  Yes                    i                               Yes                        No                               -
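The attribute columns can be computed directly from each name. The following is a minimal sketch, not part of the original slides: the function name `name_features` and the dictionary keys are made up for illustration, and it assumes that "two names" means the first and last whitespace-separated tokens of the full name.

```python
import string


def name_features(name):
    """Compute the four attributes from the table for one full name.

    Assumption: the "two names" are the first and last tokens of the name.
    """
    parts = name.split()
    first, last = parts[0], parts[-1]
    return {
        "has_punctuation": any(ch in string.punctuation for ch in name),
        "second_char": first[1].lower(),
        "first_name_longer_than_5": len(first) > 5,
        "same_first_letter": first[0].lower() == last[0].lower(),
    }


names = ["Claire Cardie", "Peter Bartlett", "Eric Baum",
         "Haym Hirsh", "Shai Ben-David", "Michael I. Jordan"]
for n in names:
    print(n, name_features(n))
```

Under these assumptions the function reproduces the six rows of the table, for example `Yes, i, Yes, No` for Michael I. Jordan.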
With these four attributes, how many unique rows are possible? 2 × 26 × 2 × 2 = 208
If there are 100 attributes, all binary, how many unique rows are possible? 2 × 2 × 2 × ⋯ × 2 (100 times) = 2^100
If we wanted to store all possible rows, this number is too large.
We need to figure out how to represent data in a better, more efficient way
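The counting argument above can be checked with a few lines of Python. This is only a sketch: the dictionary keys are illustrative names, and the count of 26 assumes the second character is always one of the letters a to z.

```python
from math import prod

# Number of distinct values each attribute can take (names are illustrative).
domain_sizes = {
    "name_has_punctuation": 2,
    "second_character_of_first_name": 26,  # assumption: letters a-z only
    "length_of_first_name_gt_5": 2,
    "same_first_letter_in_two_names": 2,
}

# The number of unique rows is the product of the domain sizes.
unique_rows = prod(domain_sizes.values())
print(unique_rows)

# With 100 binary attributes the table of all possible rows has 2^100 entries.
binary_rows = 2 ** 100
print(binary_rows)
```

The second number has 31 decimal digits, which is why enumerating every possible row is not a practical representation.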
What are decision trees?
A hierarchical data structure that represents data using a divide-and-conquer strategy
Can be used as a hypothesis class for non-parametric classification or regression
General idea: Given a collection of examples, learn a decision tree that represents it
• Decision trees are a family of classifiers for instances that are represented by collections of attributes (i.e. features)
• Nodes are tests for feature values
• There is one branch for every value that the feature can take
• Leaves of the tree specify the class labels
Let’s build a decision tree for classifying shapes
[Figure: example shapes grouped under Label=A, Label=C, Label=B]
Before building a decision tree: What is the label for a red triangle? And why?
What are some attributes of the examples? Color, Shape
[Figure: decision tree. The root tests Color? with branches Blue, Red, Green. One branch ends in the leaf B, one leads to a Shape? node with branches square, triangle, circle, and one leads to a Shape? node with branches circle, square. The leaves carry the labels A, B, C.]
1. How do we learn a decision tree? Coming up soon…
2. How to use a decision tree for prediction?
   • What is the label for a red triangle?
   • Just follow a path from the root to a leaf
   • What about a green triangle?
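Prediction by following a path from the root to a leaf can be sketched as below. The tree figure did not survive in this transcript, so the tree here is partly assumed: the Blue subtree follows the rules given on the expressivity slide, while the Red leaf and the Green subtree are guesses made only for illustration. The names `SHAPE_TREE` and `predict` are hypothetical.

```python
# Nested-dict tree: internal nodes have "feature" and "branches", leaves are labels.
SHAPE_TREE = {
    "feature": "color",
    "branches": {
        # Blue subtree: taken from the rules on the expressivity slide.
        "blue": {
            "feature": "shape",
            "branches": {"triangle": "B", "square": "A", "circle": "C"},
        },
        # ASSUMPTION: red goes straight to a leaf.
        "red": "B",
        # ASSUMPTION: green only has branches for circle and square.
        "green": {
            "feature": "shape",
            "branches": {"circle": "A", "square": "B"},
        },
    },
}


def predict(tree, example):
    """Follow one path from the root to a leaf; return None if no branch matches."""
    node = tree
    while isinstance(node, dict):
        value = example[node["feature"]]
        if value not in node["branches"]:
            return None  # the tree has no branch for this feature value
        node = node["branches"][value]
    return node


print(predict(SHAPE_TREE, {"color": "red", "shape": "triangle"}))
print(predict(SHAPE_TREE, {"color": "green", "shape": "triangle"}))
```

Under the assumed tree, the red triangle reaches a leaf after a single test, while the green triangle finds no matching branch. That is one way to read the question about the green triangle: the tree has to decide what to do for combinations it has no branch for.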
Expressivity of Decision trees
What Boolean functions can decision trees represent?
– Any Boolean function
((Color=blue AND Shape=triangle) ⇒ Label=B) AND ((Color=blue AND Shape=square) ⇒ Label=A) AND ((Color=blue AND Shape=circle) ⇒ Label=C) AND ….
Every path from the root to a leaf is a rule
The full tree is equivalent to the conjunction of all the rules
Any Boolean function can be represented as a decision tree.
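A small sketch of why this holds: given any Boolean function of n variables, a full tree that tests the variables one after another and stores the function value at each leaf represents it exactly. The helper names `build_tree`, `predict`, and `rules` are made up for this illustration, and the construction is the brute-force one (2^n leaves), not a learning algorithm.

```python
from itertools import product


def build_tree(f, n, assignment=()):
    """Full decision tree for a Boolean function f of n variables."""
    depth = len(assignment)
    if depth == n:
        return f(*assignment)  # leaf: the function value for this path
    return {
        "test": depth,  # index of the variable tested at this node
        False: build_tree(f, n, assignment + (False,)),
        True: build_tree(f, n, assignment + (True,)),
    }


def predict(tree, x):
    """Follow the path selected by the input x down to a leaf."""
    while isinstance(tree, dict):
        tree = tree[x[tree["test"]]]
    return tree


def rules(tree, path=()):
    """Yield one (conditions, label) rule per root-to-leaf path."""
    if not isinstance(tree, dict):
        yield path, tree
        return
    for value in (False, True):
        yield from rules(tree[value], path + ((tree["test"], value),))


def xor(a, b):
    return a != b


tree = build_tree(xor, 2)
for conditions, label in rules(tree):
    print(conditions, "=>", label)
```

Each printed line is one rule, and the tree agrees with `xor` on all four inputs.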
Decision Trees
• Outputs are discrete categories
• But real-valued outputs are also possible (regression trees)
• Well-studied methods for handling noisy data (noise in the label or in the features) and for handling missing attributes
  – Pruning trees helps with noise
  – More on this later…
Numeric attributes and decision boundaries
• We have seen instances represented as attribute-value pairs (color=blue, second letter=e, etc.)
  – Values have been categorical
• How do we deal with numeric feature values? (e.g. length = ?)
  – Discretize them or use thresholds on the numeric values
  – This example divides the feature space into axis-parallel rectangles
[Figure: points labeled + and - in the X-Y plane, with X-axis ticks at 1 and 3 and Y-axis ticks at 5 and 7, next to a decision tree with the tests X<3, Y<5, Y>7 and X<1, each with yes/no branches and + or - leaves]
Decision boundaries can be non-linear
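Threshold tests on numeric features can be sketched as nested comparisons. The exact tree from the figure is not recoverable from this transcript, so the arrangement of tests and leaf labels below is an assumption. Only the four tests (X<3, Y<5, Y>7, X<1) come from the slide, and the function name `classify` is hypothetical.

```python
def classify(x, y):
    """Hypothetical threshold tree over two numeric features.

    ASSUMPTION: the nesting of the tests and the leaf labels are invented
    for illustration; only the thresholds themselves appear on the slide.
    """
    if x < 3:
        if y > 7:
            return "+"
        if x < 1:
            return "-"
        return "+"
    if y < 5:
        return "+"
    return "-"


# Every point inside one axis-parallel rectangle gets the same label,
# e.g. the region x >= 3, y >= 5:
print({classify(x, y) for x in (3, 10, 100) for y in (5, 6, 50)})
```

Each leaf corresponds to one rectangle whose sides are parallel to the axes. The boundary between the + and - regions is a union of such edges rather than a single straight line.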
Summary: Decision trees
• Decision trees can represent any Boolean function
• A way to represent a lot of data
• A natural representation (think 20 questions)
• Predicting with a decision tree is easy
• Clearly, given a dataset, there are many decision trees that can represent it. [Exercise: Why?]
• Learning a good representation from data is the next question
Exercises
[Figure: example shapes grouped under Label=A, Label=C, Label=B]
1. Write down the decision tree for the shapes data if the root node was Shape instead of Color.
2. Will the two trees make the same predictions for unseen shape/color combinations?
3. Show that multiple structurally different decision trees can represent the same Boolean function of two or more variables. (Think about what it means for two trees to be structurally different.)