Experimental Methodology
Rob Holte
University of Alberta
holte@cs.ualberta.ca
August 29, 2005
ICML’2003 Minitutorial on Research, ‘Riting, and Reviews
Experiments serve a purpose
• They provide evidence for claims; design them accordingly
• Choose appropriate test datasets; consider using artificial data
• Record measurements directly related to your claims
Establish a need
• Try very simple approaches before complex ones
• Try off-the-shelf approaches before inventing new ones
• Try a wide range of alternatives, not just the ones most similar to yours
• Make sure comparisons are fair
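One way to act on "try very simple approaches first" is to report a trivial baseline next to every result. The following is a minimal sketch only: the function name `majority_baseline_accuracy` and the toy labels are hypothetical and do not come from the tutorial.

```python
from collections import Counter

def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most frequent training label."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for y in test_labels if y == majority_label)
    return correct / len(test_labels)

# Hypothetical toy labels, for illustration only.
train = ["neg"] * 70 + ["pos"] * 30
test = ["neg"] * 15 + ["pos"] * 5
baseline = majority_baseline_accuracy(train, test)
print(f"majority-class baseline accuracy: {baseline:.2f}")
```

A new method that cannot clearly beat this number on the same test split has not yet established a need.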
Explore limitations
• Under what conditions does your system work poorly? When does it work well?
• What are the sources of variance?
  • Eliminate as many as possible
  • Explain the rest
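Random seeds are one common source of variance that can be controlled. The sketch below assumes a hypothetical `run_experiment` function standing in for a full train/test run; it shows how to make each run repeatable and how to report the spread that remains across seeds.

```python
import random
import statistics

def run_experiment(seed, n_trials=200, true_accuracy=0.8):
    """Hypothetical stand-in for one full train/test run.

    All randomness flows through a local generator built from `seed`,
    so the same seed always reproduces the same result.
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_trials) if rng.random() < true_accuracy)
    return hits / n_trials

seeds = list(range(10))
scores = [run_experiment(s) for s in seeds]
print(f"mean = {statistics.mean(scores):.3f}, "
      f"std = {statistics.stdev(scores):.3f} over {len(seeds)} seeds")
```

Fixing the seed eliminates run-to-run noise for debugging; reporting the standard deviation over several seeds explains the variance that cannot be eliminated.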
Explore anomalies
• Superlinear speedup: IDA* on N processors is more than N times faster than on 1 processor
“…we were surprised to obtain superlinear speedups on average… our first reaction was to assume that our sample size was too small …” - V. Rao & V. Kumar
Superlinear Speedup in Parallel State-Space Search, Technical Report AI88-80, 1988, CS Dept., U of Texas - Austin
Look at your data
4 x-y datasets, all with the same statistics. Are they similar? Are they linear?
• mean of the x values = 9.0
• mean of the y values = 7.5
• equation of the least-squares regression line: y = 3 + 0.5x
• sum of squares of x about its mean = 110.0
• regression sum of squares = 27.5
• residual sum of squares (about the regression line) = 13.75
• correlation coefficient = 0.82
• coefficient of determination = 0.67
F.J. Anscombe (1973), "Graphs in Statistical Analysis," American Statistician, 27, 17-21
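The shared statistics can be checked directly. The sketch below uses the four datasets as published in Anscombe (1973); the helper name `summarize` is an illustrative choice, and only the standard library is used.

```python
# Anscombe's quartet, values as published in Anscombe (1973).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def summarize(xs, ys):
    """Means, least-squares line, and correlation for one x-y dataset."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "sxx": sxx,
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
        "r": sxy / (sxx * syy) ** 0.5,
    }

for name, (xs, ys) in QUARTET.items():
    s = summarize(xs, ys)
    print(f"{name:>3}: mean_x={s['mean_x']:.1f} mean_y={s['mean_y']:.2f} "
          f"y = {s['intercept']:.2f} + {s['slope']:.2f}x  r={s['r']:.2f}")
```

The printed summaries agree to two decimal places, which is exactly why the numbers alone cannot tell the four datasets apart; only a plot can.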
Anscombe datasets plotted
Look at your data, again
• Japanese credit card dataset (UCI)
• Cross-validation error rate is identical for C4.5 and 1R
• Is their performance the same?
Closer analysis reveals…
The error rate is the same only at the dataset's class distribution
• ROC curves
• Cost curves
• REC curves
• Learning curves
[Figure: plot with curves labeled C4.5 and 1R]
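To see how two classifiers can tie at one class distribution and differ elsewhere, consider the error rate of a binary classifier as a function of the fraction of positive cases. The sketch below uses made-up true-positive and false-positive rates; they are not the measured values for C4.5 or 1R.

```python
def error_rate(pos_fraction, tpr, fpr):
    """Expected error rate when a fraction `pos_fraction` of cases is positive."""
    return pos_fraction * (1 - tpr) + (1 - pos_fraction) * fpr

# Hypothetical (TPR, FPR) operating points, not the measured C4.5 / 1R values.
clf_a = {"tpr": 0.9, "fpr": 0.3}
clf_b = {"tpr": 0.7, "fpr": 0.1}

for p in (0.2, 0.5, 0.8):
    ea = error_rate(p, **clf_a)
    eb = error_rate(p, **clf_b)
    print(f"P(positive) = {p:.1f}: A = {ea:.2f}, B = {eb:.2f}")
```

With these numbers the two classifiers have the same error rate when the classes are balanced, but each is clearly better than the other at one of the skewed distributions. A single error rate hides this; a curve over class distributions or costs shows it.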
Test alternative explanations
Combinatorial auction problems
CHC = hill-climbing with a clever new heuristic
Solution Quality (% of optimal)
problem type    CHC
path            98
match           99
sched           96
r75P            83
r90P            90
r90N            89
arb             87
Is CHC better than random HC?
Percentage of CHC solutions better than random HC solutions
problem type    % better
path            100
match           100
sched           100
r75P            63
r90P            7
r90N            6
arb             20
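A "percent better" figure of this kind can be computed from paired results on the same problem instances. The sketch below assumes a per-instance pairing, which the slides do not spell out; the function name `percent_better` and the scores are hypothetical.

```python
def percent_better(candidate_scores, baseline_scores):
    """Percentage of paired instances where the candidate strictly beats the baseline."""
    if len(candidate_scores) != len(baseline_scores):
        raise ValueError("scores must be paired, one per problem instance")
    wins = sum(1 for c, b in zip(candidate_scores, baseline_scores) if c > b)
    return 100.0 * wins / len(candidate_scores)

# Hypothetical solution qualities (% of optimal) on the same five instances.
heuristic_hc = [96, 91, 88, 97, 90]
random_hc = [95, 93, 89, 92, 90]
print(f"{percent_better(heuristic_hc, random_hc):.0f}% of instances improved")
```

Comparing against the plain random variant tests the alternative explanation that hill-climbing itself, rather than the new heuristic, accounts for the high solution quality.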
Avoid “Overtuning”
Overtuning = using all your data for system development. The final system likely overfits.
• David Lubinsky, PhD thesis: held out ~10 UCI datasets for final testing
• Chris Drummond, PhD thesis: fresh data used for final testing
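One way to guard against overtuning is to fix the held-out datasets before any development starts. The sketch below is an illustration under assumed names: `split_for_final_test` and the dataset names are hypothetical, and it is not a description of how either thesis made its split.

```python
import random

def split_for_final_test(dataset_names, n_held_out, seed=0):
    """Set aside `n_held_out` datasets that are never touched during development."""
    names = sorted(dataset_names)  # fixed order, so the split is reproducible
    rng = random.Random(seed)
    held_out = set(rng.sample(names, n_held_out))
    development = [n for n in names if n not in held_out]
    return development, sorted(held_out)

# Hypothetical dataset names, for illustration only.
all_datasets = [f"dataset_{i:02d}" for i in range(30)]
dev_sets, final_test_sets = split_for_final_test(all_datasets, n_held_out=10)
print("develop on:", len(dev_sets), "datasets; final test on:", len(final_test_sets))
```

The held-out list should be recorded once and consulted only for the final evaluation, after all design and tuning decisions are frozen.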
Too obvious to mention? (no)
• Debug and test your code thoroughly
• Keep track of parameter settings, versions of program and dataset, etc.
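Record keeping is easier when every run writes its own description. The sketch below shows one possible shape for such a record; the function `make_run_record`, its field names, and the example values are all hypothetical.

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone

def make_run_record(params, dataset_bytes, code_version):
    """Bundle what is needed to identify and repeat one experimental run."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "params": params,
        "code_version": code_version,  # e.g. a version-control revision id
        "dataset_sha256": hashlib.sha256(dataset_bytes).hexdigest(),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }

# Hypothetical values, for illustration only.
record = make_run_record(
    params={"learning_rate": 0.1, "seed": 42},
    dataset_bytes=b"x,y\n1,2\n3,4\n",
    code_version="rev-unknown",
)
print(json.dumps(record, indent=2, sort_keys=True))
```

Saving this JSON next to each result file makes it possible to tell later which program version, dataset version, and parameter settings produced a given number.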