Artificial Intelligence Project #3: Analysis of Decision Tree
Learning Using WEKA
May 23, 2006
Introduction
Decision tree learning is a method for approximating discrete-valued target functions.
The learned function is represented by a decision tree.
A decision tree can also be re-represented as if-then rules to improve human readability.
An Example of Decision Tree
Decision Tree Representation (1/2)
Decision trees classify instances by sorting them down the tree from the root to some leaf node
Node
Specifies test of some attribute
Branch
Corresponds to one of the possible values for this attribute
Decision Tree Representation (2/2)
Each path corresponds to a conjunction of attribute tests
Instance: (Outlook=Sunny, Temperature=Hot, Humidity=High, Wind=Strong)
Path: (Outlook=Sunny ∧ Humidity=High), so the classification is No
Decision trees represent a disjunction of conjunctions of constraints on the attribute values of instances
(Outlook=Sunny ∧ Humidity=Normal)
∨ (Outlook=Overcast)
∨ (Outlook=Rain ∧ Wind=Weak)
Figure: decision tree
Outlook
  Sunny: Humidity
    High: No
    Normal: Yes
  Overcast: Yes
  Rain: Wind
    Strong: No
    Weak: Yes
What is the merit of tree representation?
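As a minimal sketch of the representation described on this slide, the tree can be written as nested data, an instance can be sorted from the root to a leaf, and every root-to-leaf path can be read off as an if-then rule. The tuple/dict encoding and the function names `classify`, `paths`, and `to_rules` are illustrative assumptions, not part of the project materials.

```python
# Hypothetical encoding of the tree on this slide: an internal node is
# (attribute, {value: subtree}); a leaf is the class label "Yes" or "No".
TREE = ("Outlook", {
    "Sunny": ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Rain": ("Wind", {"Strong": "No", "Weak": "Yes"}),
})


def classify(tree, instance):
    """Sort an instance down the tree from the root to a leaf."""
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[instance[attribute]]
    return tree


def paths(tree, tests=()):
    """Yield (conjunction_of_tests, label) for every root-to-leaf path."""
    if isinstance(tree, str):
        yield tests, tree
        return
    attribute, branches = tree
    for value, subtree in branches.items():
        yield from paths(subtree, tests + ((attribute, value),))


def to_rules(tree):
    """Re-represent the tree as if-then rules, one rule per path."""
    rules = []
    for tests, label in paths(tree):
        condition = " AND ".join(f"{a}={v}" for a, v in tests)
        rules.append(f"IF {condition} THEN {label}")
    return rules


instance = {"Outlook": "Sunny", "Temperature": "Hot",
            "Humidity": "High", "Wind": "Strong"}
print(classify(TREE, instance))
for rule in to_rules(TREE):
    print(rule)
```

The paths that end in Yes are exactly the three conjunctions in the disjunction above; Temperature is never tested, so it does not affect the classification.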
Appropriate Problems for Decision Tree Learning
Instances are represented by attribute-value pairs
The target function has discrete output values
Disjunctive descriptions may be required
The training data may contain errors
Both errors in classification of the training examples and errors in the attribute values
The training data may contain missing attribute values
Suitable for classification
Study
Treatment-specific changes in gene expression discriminate in vivo drug response in human leukemia cells, MH Cheok et al., Nature Genetics 35, 2003.
60 leukemia patients
Bone marrow samples
Affymetrix GeneChip arrays
Gene expression data
Gene Expression Data
# of data examples: 120 (60: before treatment, 60: after treatment)
# of genes measured: 12600 (Affymetrix HG-U95A array)
Task: Classification between “before treatment” and “after treatment” based on gene expression pattern
Affymetrix GeneChip Arrays
Use short oligos to detect gene expression level.
Each gene is probed by a set of short oligos.
Each gene expression level is summarized by
Signal: numerical value describing the abundance of mRNA
A/P call: denotes the statistical significance of signal
Preprocessing
Remove the genes having more than 60 ‘A’ calls
# of genes: 12600 → 3190
Discretization of gene expression level
Criterion: median gene expression value of each sample
0 (low) and 1 (high)
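The two preprocessing steps can be sketched as follows. This is an illustration under stated assumptions, not the script used to produce the project data: the row-per-sample list layout, the function names, and the tie rule (a value equal to the median is coded 0) are all hypothetical.

```python
from statistics import median


def filter_genes(calls, max_absent=60):
    """Return indices of genes with at most `max_absent` 'A' calls.

    calls[s][g] is the A/P call of gene g in sample s (assumed layout).
    """
    keep = []
    for g in range(len(calls[0])):
        absent = sum(1 for sample in calls if sample[g] == "A")
        if absent <= max_absent:
            keep.append(g)
    return keep


def discretize(signals):
    """Binarize each sample against its own median signal value.

    A value above the sample median becomes 1 (high), otherwise 0 (low).
    Coding ties as 0 is an assumption; the slide does not say.
    """
    result = []
    for sample in signals:
        m = median(sample)
        result.append([1 if x > m else 0 for x in sample])
    return result


# Toy data: 3 samples x 4 genes, with a threshold of 1 instead of 60.
calls = [["P", "A", "P", "A"],
         ["P", "A", "A", "P"],
         ["P", "A", "P", "P"]]
signals = [[10.0, 2.0, 7.0, 1.0],
           [3.0, 1.0, 2.0, 8.0],
           [5.0, 0.5, 9.0, 6.0]]
print(filter_genes(calls, max_absent=1))
print(discretize(signals))
```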
Gene Filtering
Using mutual information
$$I(G; C) = \sum_{G, C} P(G, C) \log \frac{P(G, C)}{P(G)\,P(C)}$$
Estimated probabilities were used.
# of genes: 3190 → 1000
Final dataset
# of attributes: 1001 (one for the class)
Class: 0 (after treatment), 1 (before treatment)
# of data examples: 120
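The mutual-information filter can be sketched like this, with probabilities estimated as relative frequencies in the data. The base-2 logarithm, the function names, and the tie-breaking by gene index are assumptions made for illustration; the slide specifies none of them.

```python
from collections import Counter
from math import log2


def mutual_information(gene, labels):
    """I(G;C) between one discretized gene and the class labels.

    Probabilities are estimated by counting. Combinations that never
    occur contribute nothing, following the 0 * log 0 = 0 convention.
    """
    n = len(gene)
    joint = Counter(zip(gene, labels))
    count_g = Counter(gene)
    count_c = Counter(labels)
    mi = 0.0
    for (g, c), count in joint.items():
        p_gc = count / n
        mi += p_gc * log2(p_gc / ((count_g[g] / n) * (count_c[c] / n)))
    return mi


def top_genes(columns, labels, k):
    """Indices of the k genes with the highest mutual information."""
    scored = [(mutual_information(col, labels), idx)
              for idx, col in enumerate(columns)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


labels = [0, 0, 1, 1]
columns = [[0, 1, 0, 1],   # independent of the class
           [0, 0, 1, 1],   # identical to the class
           [1, 1, 0, 0]]   # inverse of the class
print([mutual_information(col, labels) for col in columns])
print(top_genes(columns, labels, k=2))
```

In the project, `k` would be 1000 and `columns` would hold the 3190 genes that survive the A-call filter.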
Final Dataset
[Figure: data matrix of 120 data examples × 1000 genes]
Materials for the Project
Given
Preprocessed microarray data file: data2.txt
Downloadable
WEKA (http://www.cs.waikato.ac.nz/ml/weka/)
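WEKA's native input format is ARFF. The layout of data2.txt is not described here, so the sketch below starts from rows already loaded into memory and only shows what a matching ARFF file could look like. The attribute names `gene1`, `gene2`, ..., the relation name, and the class-in-last-column layout are assumptions.

```python
def to_arff(rows, relation="leukemia"):
    """Render rows of 0/1 values as ARFF text.

    Each row is assumed to hold the gene values followed by the class
    label in the last position. Attributes are declared nominal {0,1}
    so that tree learners restricted to nominal attributes accept them.
    """
    n_genes = len(rows[0]) - 1
    lines = [f"@relation {relation}", ""]
    for i in range(n_genes):
        lines.append(f"@attribute gene{i + 1} {{0,1}}")
    lines.append("@attribute class {0,1}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Two toy examples with two genes each, class label last.
print(to_arff([[0, 1, 1], [1, 0, 0]]))
```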
Analysis of Decision Tree Learning
Submission
Due date: June 15 (Thu.), 12:00 (noon)
Report: Hard copy (301-419) & e-mail
ID3, J48, and another decision tree algorithm with learning parameters.
Show the experimental results of each algorithm. Except for ID3, try to find better performance by changing the learning parameters.
Analyze what makes the difference between the selected algorithms.
E-mail : [email protected]
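One possible way to organize the parameter search for J48 is to generate one WEKA command line per setting and compare the cross-validation results. The sketch below only builds the commands. It assumes WEKA's command-line interface with J48's `-C` (pruning confidence) and `-M` (minimum instances per leaf) options and the general `-t` (training file) and `-x` (folds) options; the jar path, the ARFF file name, and the value grids are hypothetical.

```python
def j48_commands(arff_path, weka_jar="weka.jar",
                 confidences=(0.05, 0.25, 0.4), min_leaf=(2, 5), folds=10):
    """Build one J48 cross-validation command per parameter setting.

    Nothing is executed here; each command is a list of arguments that
    could be passed to subprocess.run on a machine with Java and WEKA.
    """
    commands = []
    for c in confidences:
        for m in min_leaf:
            commands.append([
                "java", "-cp", weka_jar, "weka.classifiers.trees.J48",
                "-t", arff_path, "-x", str(folds),
                "-C", str(c), "-M", str(m),
            ])
    return commands


for command in j48_commands("data2.arff"):
    print(" ".join(command))
```

The same settings can also be changed interactively in the WEKA Explorer; the grid here is only a starting point for the comparison the report asks for.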