
Page 1: MCS 2005 Round Table In the context of MCS, what do you believe to be true, even if you cannot yet prove it?

MCS 2005 Round Table

In the context of MCS, what do you believe to be true, even if you cannot yet prove it?

Page 2

Plan of Attack

• Three rough categories of response:
– Performance Claims (8)
– Design and Data Principles (6)
– Predictions for the Field (3)

• Plan: for each category
– Quickly present a hypothesis. Show of hands to see how many also believe that claim.
– Discuss the most contentious. Why the disparity? What experiments or evidence would help?
– Discuss the most supported. Why still unproved? What experiments or evidence would help?

• (Disclaimers: we re-wrote a couple of entries to make them into hypotheses. Categories are indeed “rough”, and we may have misrepresented your claim. I know we’re in America, but please don’t sue us.)

Page 3

Performance Claims

Page 4

Performance Claims/1

• Combiners will generally perform better than dimensionality reduction

Page 5

Performance Claims/2

• One can always find an ensemble of classifiers that is more accurate than a single classifier.
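As a back-of-the-envelope sketch (ours, not the contributor's) of why such an ensemble can often be found: if base classifiers err independently, a majority vote already lifts accuracy above any single member. The 70%-accurate, three-classifier numbers below are purely illustrative.

```python
from math import comb

def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that a majority of n independent classifiers,
    each correct with probability p, is correct (n odd)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))

# Three independent 70%-accurate classifiers under majority vote:
print(majority_vote_accuracy(0.70, 3))   # ~0.784, better than 0.70
```

The independence assumption is the crux: correlated errors shrink or erase this gain, which is exactly why the claim remains a belief rather than a theorem.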

Page 6

Performance Claims/3

• Multiple types of classifiers (e.g. DT, NN, SVM) can be used in an ensemble/expert-systems approach to achieve less overall error. Due to their different biases, the different approaches would hopefully make errors on different examples than any one approach would.

Page 7

Performance Claims/4

• In complex real applications, the combination of a small set of carefully designed and engineered classifiers can always outperform any coverage-optimization-based MCS, such as bagging and boosting.

Page 8

Performance Claims/5

• Classifier selection can improve bagging performance, that is, a small subset of bagged classifiers can perform better than a large one.
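A minimal sketch of the claim, using hypothetical 1-D threshold "stumps" of our own choosing as base classifiers: bag a pool of stumps, then greedily keep only the members that improve held-out accuracy.

```python
import random

random.seed(0)

# Toy 1-D data: label is 1 when x > 0.5, with 10% label noise.
def make_data(n):
    xs = [random.random() for _ in range(n)]
    ys = [int(x > 0.5) ^ (random.random() < 0.1) for x in xs]
    return xs, ys

def train_stump(xs, ys):
    """Threshold classifier: pick the cut that best fits the sample."""
    return max(((sum((x > t) == bool(y) for x, y in zip(xs, ys)), t)
                for t in xs))[1]

def vote(stumps, x):
    """Majority vote of the stumps (predict 1 on a strict majority)."""
    return int(2 * sum(x > t for t in stumps) > len(stumps))

def accuracy(stumps, xs, ys):
    return sum(vote(stumps, x) == y for x, y in zip(xs, ys)) / len(xs)

train_x, train_y = make_data(200)
val_x, val_y = make_data(200)

# Bagging: each stump is trained on its own bootstrap sample.
bag = []
for _ in range(15):
    idx = [random.randrange(len(train_x)) for _ in range(len(train_x))]
    bag.append(train_stump([train_x[i] for i in idx],
                           [train_y[i] for i in idx]))

# Greedy forward selection on the validation set: keep adding the
# stump that helps most, and stop as soon as nothing helps.
selected = []
while True:
    remaining = [s for s in bag if s not in selected]
    if not remaining:
        break
    best = max(remaining,
               key=lambda s: accuracy(selected + [s], val_x, val_y))
    if selected and (accuracy(selected + [best], val_x, val_y)
                     <= accuracy(selected, val_x, val_y)):
        break
    selected.append(best)

print(len(selected), "of", len(bag), "stumps kept")
```

The selection typically stops well short of the full bag, which is the hypothesis in miniature; whether the small subset also generalizes better on fresh data is the part that resists proof.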

Page 9

Performance Claims/6

• The output of an ensemble is “more stable” than that of a single classifier. Although there is quite a bit of literature regarding this, the notion of stability may be context-specific, hence it may not be possible to make a general statement about it. If an ensemble is indeed “more stable” and hence more trustworthy, that is a good argument for designing an ensemble rather than a single well-trained classifier, even if the single classifier provides good accuracy and the ensemble does not improve accuracy significantly.

Page 10

Performance Claims/7

• Fixed combiners perform well compared to trainable combiners. But:
– Fixed combiners are based on a number of a priori assumptions
– Trainable combiners should be able to learn real patterns in the outputs of base classifiers

• Still, fixed combiners do surprisingly well. (Why?)
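The contrast can be made concrete with a toy sketch (the posteriors and weights below are invented for illustration): the mean rule needs no training data, while a weighted rule must fit its weights on held-out data, and can then exploit the fact that one base classifier is more reliable.

```python
# Two base classifiers emit posteriors for class 1; combine them.
def mean_rule(posteriors):
    """Fixed combiner: a simple average, no parameters to fit."""
    return sum(posteriors) / len(posteriors)

def weighted_rule(posteriors, weights):
    """Trainable combiner: weights would be fitted on held-out data."""
    return sum(w * p for w, p in zip(weights, posteriors)) / sum(weights)

# Toy case: classifier A is reliable, classifier B is near-random.
sample = [0.9, 0.55]                  # posteriors for one point of class 1
print(mean_rule(sample))              # 0.725
print(weighted_rule(sample, [4, 1]))  # 0.83: trained weights trust A more
```

The catch, and one plausible answer to the slide's "Why?", is that the trained weights are themselves estimated from limited data, so the fixed rule's zero estimation variance can outweigh its bias.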

Page 11

Performance Claims/8

• We have shown experimentally, but cannot yet prove theoretically, that ensemble systems can be used for incremental learning. The difficulty lies with the fact that the data distribution changes --- particularly if new classes are introduced.
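A sketch of the idea, in the spirit of incremental ensemble schemes such as Learn++ (the toy threshold learner and the weighting are our own stand-ins): each new batch trains one new member, old members are never retrained, and prediction is a weighted vote.

```python
# Incremental learning via an ensemble: never retrain old members,
# just add one new classifier per data batch and take a weighted vote.
class IncrementalEnsemble:
    def __init__(self, train_one):
        self.members = []          # list of (classifier, weight) pairs
        self.train_one = train_one

    def add_batch(self, xs, ys):
        clf = self.train_one(xs, ys)
        # Weight each member by its accuracy on its own batch.
        acc = sum(clf(x) == y for x, y in zip(xs, ys)) / len(xs)
        self.members.append((clf, max(acc, 1e-6)))

    def predict(self, x):
        votes = {}
        for clf, w in self.members:
            votes[clf(x)] = votes.get(clf(x), 0) + w
        return max(votes, key=votes.get)

# Toy base learner: threshold halfway between the two class means.
def train_stump(xs, ys):
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    t = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: int(x > t)

ens = IncrementalEnsemble(train_stump)
ens.add_batch([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])   # first batch
ens.add_batch([0.3, 0.4, 0.6, 0.7], [0, 0, 1, 1])   # later, shifted batch
print(ens.predict(0.65))   # 1
```

Note that the vote dictionary accepts labels no earlier member ever saw, which is how this style of ensemble can absorb new classes, and also why the slide's theoretical difficulty (a shifting distribution) is real: old members keep voting under the old distribution.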

Page 12

Performance Claims

1. Contentious:
   1. Combiners > dim. reduction 9/10
   2. MCS more accurate 17/13
   3. Small engineered better 12/7
   4. Small ensembles better 16/5

2. Not contentious:
   1. Small ensembles better 16/5
   2. Diverse MCS more accurate 17/3
   3. MCS more stable 30/3
   4. Fixed combiners > trained 0/large
   5. MCS good for incremental 24/0

Page 13

Design and Data Principles

Page 14

Design and Data Principles/1

It is possible to map data complexity to the success of a particular multiple classifier system. That is, we can deduce the type of MCS (boosting, bagging, random forests, bites, etc.) to use based on a pre-analysis of the properties of the data.

Page 15

Design and Data Principles/2

• The performance gain of ensembles can be quantified as a function of the classifiers’ diversity, the single classifiers’ base performance, etc.

• Bonus question: will we ever be able to quantify this, even with knowledge of the data?

Page 16

Design and Data Principles/3

• The optimal level of diversity for an ensemble can be directly determined from the input data, independent of the classification model used.

Page 17

Design and Data Principles/4

• (Not just for MCS, but for pattern classification algorithms in general: the question of generalization/scaling.) Conjecture: with clever MCS, the error probability of an algorithm goes up as the logarithm of the data size.

Page 18

Design and Data Principles/5

• Consider Wolpert’s stacked generalizer. It is an empirical observation, seen many times, that the “crispness” (that is, the fraction of samples that are classified with high confidence in one of the classes) of the level 1 classifier is invariably greater than that of the individual level 0 classifiers. The accuracy of the level 1 classifier may or may not be greater than the accuracy of the best level 0 classifier.
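A small sketch of the "crispness" observation. The product-rule combiner below is only a stand-in for a trained level-1 classifier, and the posteriors are invented, but it shows the mechanism: combining mildly confident level-0 outputs can yield sharply confident level-1 outputs even when the predicted classes do not change.

```python
# "Crispness" as described above: fraction of samples whose top
# posterior clears a confidence threshold.
def crispness(posteriors, threshold=0.9):
    return sum(max(p) >= threshold for p in posteriors) / len(posteriors)

# Level-0 outputs for five samples (two classes), mildly confident.
level0_a = [[0.7, 0.3], [0.6, 0.4], [0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
level0_b = [[0.8, 0.2], [0.7, 0.3], [0.7, 0.3], [0.2, 0.8], [0.3, 0.7]]

# Hypothetical level-1 stand-in: product rule, renormalized.
def product_combine(pa, pb):
    raw = [a * b for a, b in zip(pa, pb)]
    return [r / sum(raw) for r in raw]

level1 = [product_combine(a, b) for a, b in zip(level0_a, level0_b)]

print(crispness(level0_a, 0.75), crispness(level1, 0.75))   # 0.2 vs 1.0
```

This separation of confidence from accuracy is exactly the slide's point: the level-1 outputs are crisper here by construction, whether or not they are more often correct.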

Page 19

Design and Data Principles/6

• Good ECOC codes are not random

• Random feature selection is good

• Both are linked.
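For reference, a minimal ECOC sketch (the 3-class code matrix is illustrative, not claimed to be a good code): each column corresponds to one binary classifier, and decoding picks the class whose codeword is nearest in Hamming distance, so isolated base-classifier errors get corrected.

```python
# Error-correcting output codes: each class gets a codeword, each bit
# position is one binary classifier.  This code has minimum Hamming
# distance 3, so any single bit-flip is still decoded correctly.
CODE = {
    "cat":  (0, 0, 1, 1, 0),
    "dog":  (1, 0, 0, 1, 1),
    "bird": (0, 1, 0, 0, 1),
}

def decode(bits):
    """Return the class whose codeword is nearest in Hamming distance."""
    dist = lambda a, b: sum(x != y for x, y in zip(a, b))
    return min(CODE, key=lambda c: dist(CODE[c], bits))

print(decode((1, 0, 0, 1, 1)))   # "dog", clean outputs
print(decode((1, 1, 0, 1, 1)))   # "dog", one flipped bit corrected
```

The slide's claim lives in how CODE is chosen: a random matrix gives no guarantee on the minimum distance between codewords, whereas a designed code buys error correction deliberately, and random feature selection plays the complementary role of decorrelating the per-bit classifiers.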

Page 20

Design and Data Principles

1. Use complexity to pick MCS 13/5

2. Predict perf. from base prop. 12/13

3. Predict best diversity from data. 0/lots

4. Error is logarithm of data size 1/6

5. Crisp classifier not the accurate one 1/0

6. Good ECOC is not random 0/6

Page 21

Predictions for the Field

Page 22

Predictions for the Field/1

• There is still substantial room for improvement in Machine Learning for supervised learning problems.

Page 23

Predictions for the Field/2

• The time will come when every pattern classification system will be designed as an MCS, as people will come to see MCS as the best solution even for tasks where MCS cannot outperform single classifiers.

Page 24

Predictions for the Field/3

• A “unifying” theory of MCS is less than five years away.

• Bonus questions: What is a “unifying theory”? How will we know if and when we get it?

Page 25

Predictions for the Field

1. Substantial room for improvement 11/1?

2. All PR systems will be MCS 6/5

3. GUT in < 5 years. 1/17