Chapter 14. Classification and Regression Trees
Chapter 14: CART
Decision Trees
Introduction
ANFIS training can be split into two parts:
• Structure identification
• Parameter identification
Introduction
Structure identification in fuzzy modeling includes the following basic items:
• Select the relevant input variables
• Select an initial structure for ANFIS:
  - Input partition
  - Number of membership functions for each input
  - Number of IF-THEN rules
  - Prerequisite part for fuzzy rules
  - Consequent part for fuzzy rules
• Initialization of parameters for membership functions
Introduction
Consider tree-partitioning in fuzzy modeling:
• The CART (Classification and Regression Tree) algorithm
• CART offers a fast method for structure identification
• CART does not have problems with the curse of dimensionality
• ANFIS based on CART is powerful in training and application because weight normalization is included in the procedure
The CART Algorithm
The CART algorithm is a powerful, nonparametric algorithm with the following characteristics:
• Based on a simple idea
• Computationally powerful
• Can be used in both classification and regression problems
• Based on a solid statistical foundation
• Suitable for high-dimensional data
• Can identify important variables
The CART Algorithm
Example:
IF x < a AND y < b THEN z = f1
IF x < a AND y > b THEN z = f2
IF x > a AND y < c THEN z = f3
IF x > a AND y > c THEN z = f4
See next slide
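The four hard rules above translate directly into nested threshold tests. A minimal Python sketch, where the thresholds a, b, c and the constant outputs f1..f4 are illustrative placeholder values, not given on the slides:

```python
def cart_predict(x, y, a=0.5, b=0.3, c=0.7):
    """Evaluate the four hard CART rules from the example.

    The thresholds a, b, c and the leaf outputs f1..f4 are
    illustrative placeholders, not values from the chapter.
    """
    f1, f2, f3, f4 = 1.0, 2.0, 3.0, 4.0
    if x < a:
        return f1 if y < b else f2   # IF x < a AND y < b THEN z = f1, else f2
    else:
        return f3 if y < c else f4   # IF x > a AND y < c THEN z = f3, else f4

print(cart_predict(0.2, 0.1))  # falls in the region x < a, y < b -> f1 = 1.0
```

Each input point lands in exactly one of the four exclusive regions, which is the "hard" partitioning that the fuzzy version later softens.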
Decision Trees
Decision Trees
A decision tree:
• Splits the input space into mutually exclusive regions
• Gives every region a label, a value, or an operation in order to characterize its data points
• Is easy to use in classification
• Is structured by inner nodes and outer nodes which are connected by branches
• Binary trees are the simplest
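The inner-node/outer-node structure can be sketched as a small recursive data type. This is an illustrative sketch, not code from the chapter; the two-level example tree and its labels are made up:

```python
class Node:
    """Binary decision-tree node: inner nodes test a variable against a
    threshold, outer (leaf) nodes hold a label."""

    def __init__(self, var=None, threshold=None, left=None, right=None, label=None):
        self.var = var              # index of the input variable tested here
        self.threshold = threshold
        self.left = left            # branch taken when x[var] < threshold
        self.right = right
        self.label = label          # set only on leaves

    def predict(self, x):
        if self.label is not None:  # outer node: return its label
            return self.label
        branch = self.left if x[self.var] < self.threshold else self.right
        return branch.predict(x)    # inner node: follow the matching branch

# Two-level binary tree: split on x[0], then on x[1] in the right branch
tree = Node(var=0, threshold=0.5,
            left=Node(label="A"),
            right=Node(var=1, threshold=0.5,
                       left=Node(label="B"),
                       right=Node(label="C")))

print(tree.predict([0.9, 0.1]))  # x[0] >= 0.5, x[1] < 0.5 -> "B"
```

Because every inner node sends a point down exactly one branch, the leaves induce the exclusive regions described above.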
Decision Trees
Groups of decision trees:
• Classification trees
• Regression trees
The CART Algorithm
Basic idea of CART:
• First, a tree is constructed based on the training data
• Second, the tree is pruned to minimize cost and computations (minimum cost-complexity principle)
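Minimum cost-complexity pruning scores a subtree T by R_alpha(T) = R(T) + alpha·|T|, where R(T) is the training error, |T| the number of leaves, and alpha a penalty per leaf. A toy illustration; the candidate subtrees and their error/leaf numbers are made up for the sketch:

```python
def cost_complexity(error, n_leaves, alpha):
    """R_alpha(T) = R(T) + alpha * |T|: training error plus a per-leaf penalty."""
    return error + alpha * n_leaves

# Candidate subtrees as (training error, number of leaves) -- illustrative numbers
candidates = {"full": (0.02, 8), "pruned": (0.05, 3), "stump": (0.20, 1)}

alpha = 0.01
best = min(candidates, key=lambda k: cost_complexity(*candidates[k], alpha))
print(best)  # "pruned" minimizes R_alpha at alpha = 0.01
```

At alpha = 0 the full tree always wins; as alpha grows, smaller subtrees become preferable, which is how pruning trades accuracy against complexity.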
The CART Algorithm
Classification trees use impurity functions, E(t):
• Entropy function
• Gini index
• Impurity functions take their minimum value (zero) when the data belong to only one class, and their maximum value when the data are uniformly distributed over all classes.
• The algorithm selects the partition that gives the maximum decrease in impurity.
• For binary trees it is possible to write ΔE(s,t) = E(t) − p_l E(t_l) − p_r E(t_r), where p_l and p_r are the fractions of the data sent to the left and right child nodes t_l and t_r by split s.
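The two impurity functions and the decrease ΔE(s,t) can be sketched directly from their definitions. A minimal sketch, assuming natural-log entropy (consistent with the numeric values on the node-partition slide) and two-class counts per node:

```python
from math import log

def entropy(counts):
    """Entropy impurity E(t) = -sum p_i ln p_i (natural log).
    Zero when a single class holds all the data."""
    n = sum(counts)
    return -sum(c / n * log(c / n) for c in counts if c > 0)

def gini(counts):
    """Gini impurity E(t) = 1 - sum p_i^2."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def impurity_decrease(parent, left, right, impurity=entropy):
    """Delta E(s, t) = E(t) - p_l E(t_l) - p_r E(t_r) for a binary split s."""
    n, nl, nr = sum(parent), sum(left), sum(right)
    return impurity(parent) - nl / n * impurity(left) - nr / n * impurity(right)

print(entropy([10, 0]))                           # zero: only one class present
print(impurity_decrease([5, 5], [5, 0], [0, 5]))  # ln 2 ≈ 0.693: a perfect split
```

A split that separates the classes perfectly removes all the impurity, so CART would prefer it over any partial separation.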
CART: Impurity Functions
CART: Node Partition
Entropy function is used:
E(t) = 0.6730
E(s1,t) = 0.2231
E(s2,t) = 0.0138
E(s3,t) = 0.2911
E(s4,t) = 0.1185
E(s5,t) = 0.2231
E(s6,t) = 0.0000
E(s7,t) = 0.2911
E(s8,t) = 0.1185
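The node value E(t) = 0.6730 matches natural-log entropy for a node with a 60/40 class split; the 6-versus-4 class counts are inferred from the number, not stated on the slide. A quick check:

```python
from math import log

def entropy2(p):
    """Natural-log entropy of a two-class node with class proportions p, 1 - p."""
    return -p * log(p) - (1 - p) * log(1 - p)

print(round(entropy2(0.6), 4))  # 0.673 -> matches E(t) = 0.6730 above
```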
CART: Tree Growth
CART: Tree Growth
CART: Error Measure
CART: Input/Output Relations when Outputs are Constant
CART: Input/Output Relations when Outputs are Linear
Structure Identification for ANFIS
CART-ANFIS processing is based on the following:
• CART is used to find the first solutions for hard rules (structure identification)
• The prerequisite part of the rules is made fuzzy
• ANFIS is used to fine-tune the parameters (parameter identification)
• Normalization of weights is implicit in the above procedure
Structural Identification for ANFIS
The CART algorithm makes hard decision boundaries.
Example:
IF x < a AND y < b THEN z = f1
IF x < a AND y > b THEN z = f2
IF x > a AND y < c THEN z = f3
IF x > a AND y > c THEN z = f4
We can use fuzzy membership functions to make the prerequisite parts fuzzy
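One common way to soften the hard test x < a is a sigmoidal membership function. A minimal sketch of the idea: the sigmoid slope, the constant outputs f1..f4, and the weighted-average defuzzification are illustrative assumptions in the spirit of the usual ANFIS/TSK scheme, not details given on the slides:

```python
from math import exp

def mu_less_than(x, a, slope=10.0):
    """Fuzzy version of the crisp test x < a: ~1 well below a, ~0 well
    above a, exactly 0.5 at a. The slope is an illustrative parameter."""
    return 1.0 / (1.0 + exp(slope * (x - a)))

def fuzzy_predict(x, y, a=0.5, b=0.5, c=0.5):
    """Weighted average of the four rule outputs f1..f4 (illustrative
    constants), with normalized rule firing strengths."""
    f = [1.0, 2.0, 3.0, 4.0]
    w = [
        mu_less_than(x, a) * mu_less_than(y, b),              # x < a AND y < b
        mu_less_than(x, a) * (1 - mu_less_than(y, b)),        # x < a AND y > b
        (1 - mu_less_than(x, a)) * mu_less_than(y, c),        # x > a AND y < c
        (1 - mu_less_than(x, a)) * (1 - mu_less_than(y, c)),  # x > a AND y > c
    ]
    total = sum(w)  # weight normalization, implicit in the CART-ANFIS procedure
    return sum(wi * fi for wi, fi in zip(w, f)) / total

# Far from the decision boundaries the fuzzy model agrees with the crisp rules:
print(round(fuzzy_predict(0.1, 0.1), 3))  # close to f1 = 1.0
```

Near the thresholds the output blends adjacent rule consequents smoothly instead of jumping, and the membership parameters become exactly the quantities ANFIS can fine-tune.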
CART: Two Types of Membership Functions
CART: Fuzzy Input/Output Relations when the Outputs are Constants
CART: Fuzzy Input/Output Relations when the Outputs are Linear