Boosting Neural Networks


Page 1: Boosting Neural Networks

Boosting Neural Networks

Published by Holger Schwenk and Yoshua Bengio, Neural Computation, 12(8):1869-1887, 2000.

Presented by Yong Li

Page 2: Boosting Neural Networks

Outline

1. Introduction
2. AdaBoost
3. Three versions of AdaBoost for neural networks
4. Results
5. Conclusions
6. Discussions

Page 3: Boosting Neural Networks

Introduction

• Boosting – a general method for improving the performance of a learning algorithm.

• AdaBoost is a relatively recent boosting algorithm.

• Many empirical studies of AdaBoost use decision trees as base classifiers (Breiman, 1996; Drucker and Cortes, 1996; and others).

• There is also growing theoretical understanding (Schapire et al., 1997; Breiman, 1998; Schapire, 1999).

Page 4: Boosting Neural Networks

Introduction

• At the time, however, applications had all been to decision trees; there were none to multi-layer artificial neural networks.

• The questions this paper tries to answer:
– Does AdaBoost work as well for neural networks as for decision trees?
– Does it behave in a similar way?
– And more?

Page 5: Boosting Neural Networks

AdaBoost (Adaptive Boosting)

• It is often possible to increase the accuracy of a classifier by averaging the decisions of an ensemble of classifiers.

• Two popular ensemble methods: Bagging and Boosting.
– Bagging improves generalization performance through a reduction in variance while maintaining or only slightly increasing bias (a minimal sketch follows).
– AdaBoost constructs a composite classifier by sequentially training classifiers while putting more and more emphasis on certain patterns.
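As a point of reference, here is a minimal bagging sketch in Python/NumPy. The `train_clf` callback and its score-returning interface are assumptions for illustration, not the paper's code.

```python
import numpy as np

def bagging_predict(X, y, train_clf, X_new, n_models=10, seed=0):
    """Minimal bagging sketch: each model is fit on a bootstrap sample and
    the ensemble averages the models' class scores. `train_clf` is a
    hypothetical callback that fits a classifier on (X, y) and returns a
    function h(X) -> (n, n_classes) array of class scores."""
    rng = np.random.default_rng(seed)
    m = len(y)
    total = 0.0
    for _ in range(n_models):
        idx = rng.integers(0, m, size=m)  # bootstrap: m draws with replacement
        h = train_clf(X[idx], y[idx])
        total = total + h(X_new)          # accumulate scores (averaging decisions)
    return np.argmax(total, axis=1)       # class with the highest average score
```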

Page 6: Boosting Neural Networks

AdaBoost

• AdaBoost.M2 is used in the experiments.
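For concreteness, here is a minimal sketch of the AdaBoost.M2 loop (Freund and Schapire's multiclass variant). The `train_weak` callback and its interface are hypothetical, and the sketch assumes the pseudo-loss stays below 1/2.

```python
import numpy as np

def adaboost_m2(X, y, n_classes, train_weak, T=10):
    """Minimal AdaBoost.M2 sketch. `train_weak(X, y, D)` is a hypothetical
    callback: it fits a weak learner under the mislabel distribution D and
    returns h(X) -> (m, n_classes) scores in [0, 1]."""
    m = len(y)
    # D[i, l] weights the mislabel pair (i, l); true-label entries stay 0.
    D = np.full((m, n_classes), 1.0 / (m * (n_classes - 1)))
    D[np.arange(m), y] = 0.0
    models, betas = [], []
    for _ in range(T):
        h = train_weak(X, y, D)
        s = h(X)                          # h_t(x_i, l) for every label l
        c = s[np.arange(m), y]            # h_t(x_i, y_i), score on the true label
        eps = 0.5 * np.sum(D * (1.0 - c[:, None] + s))   # pseudo-loss
        beta = eps / (1.0 - eps)          # assumes eps < 1/2
        # Shrink the weight of pairs the hypothesis already separates well.
        D = D * beta ** (0.5 * (1.0 + c[:, None] - s))
        D[np.arange(m), y] = 0.0
        D /= D.sum()
        models.append(h)
        betas.append(beta)
    def predict(X_new):
        # Final hypothesis: weighted vote with weights log(1/beta_t).
        votes = sum(np.log(1.0 / b) * h(X_new) for h, b in zip(models, betas))
        return np.argmax(votes, axis=1)
    return predict
```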

Page 7: Boosting Neural Networks

Applying AdaBoost to neural networks

• Three versions of AdaBoost are compared in this paper (sketches of the reweighting step follow this list):
– (R) Training the t-th classifier with a fixed training set
– (E) Training the t-th classifier using a different training set at each epoch
– (W) Training the t-th classifier by directly weighting the cost function of the t-th neural network
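To make (E) and (W) concrete, here are minimal sketches of the two reweighting mechanisms, assuming the mislabel distribution has already been collapsed to per-example weights w (summing to 1); the function names are hypothetical.

```python
import numpy as np

def resample_epoch(X, y, w, rng):
    """(E) sketch: draw a fresh training set for one epoch by sampling
    examples with probability proportional to their boosting weights w."""
    idx = rng.choice(len(y), size=len(y), p=w)
    return X[idx], y[idx]

def weighted_loss(probs, y, w):
    """(W) sketch: weight each example's negative log-likelihood by its
    boosting weight, so gradient descent on the network emphasizes hard
    examples directly. probs: (m, n_classes) softmax outputs."""
    m = len(y)
    return -np.sum(w * np.log(probs[np.arange(m), y] + 1e-12))
```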

Page 8: Boosting Neural Networks

Results

• Experiments are performed on three data sets.
– The online data set collected at Paris 6 University
• 22 attributes (in [-1, 1]^22), 10 classes
• 1200 examples for learning and 830 examples for testing
– UCI Letter
• 16 attributes and 26 classes
• 16000 for training and 4000 for testing
– Satimage data set
• 36 attributes and 6 classes
• 4435 for training and 2000 for testing

Page 9: Boosting Neural Networks

Results of online data

Page 10: Boosting Neural Networks

Results of online data

• Some conclusions:
– Boosting is better than Bagging.
– AdaBoost is less useful for very big networks.
– The (E) and (W) versions are better than (R).

Page 11: Boosting Neural Networks

Results of online data

• The generalization error continues to decrease after the training error reaches zero.

Page 12: Boosting Neural Networks

Results of online data

The number of examples with a high margin increases as more classifiers are combined by boosting.

Note: there are conflicting results in the literature about the cumulative margin distribution.
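For readers who want to reproduce such plots, here is a minimal sketch of the margin computation, assuming `votes` holds the ensemble's nonnegative weighted votes (e.g. the sum of log(1/beta_t) * h_t(x) over rounds).

```python
import numpy as np

def margins(votes, y):
    """Margin of each example: normalized vote for the true class minus the
    largest vote for any wrong class, so values lie in [-1, 1]. Assumes
    `votes` is an (m, n_classes) array with strictly positive row sums."""
    m = len(y)
    v = votes / votes.sum(axis=1, keepdims=True)
    true_v = v[np.arange(m), y]
    v_wrong = v.copy()
    v_wrong[np.arange(m), y] = -np.inf    # mask the true class
    return true_v - v_wrong.max(axis=1)

# Cumulative margin distribution: fraction of examples with margin <= theta.
# cdf = lambda theta: np.mean(margins(votes, y) <= theta)
```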

Page 13: Boosting Neural Networks

Results of online data

Bagging has no significant influence on the margin distribution

Page 14: Boosting Neural Networks

The results for the UCI Letter and Satimage data sets

• Only the (E) and (W) versions are applied. They obtain the same results.

• The same conclusions are drawn as for the online data. (Some results are omitted.)

Page 15: Boosting Neural Networks

Conclusion

• AdaBoost can significantly improve neural classifiers.
– Does AdaBoost work as well for neural networks as for decision trees?
• Answer: Yes
– Does it behave in a similar way?
• Answer: Yes
– Overfitting?
• Still there
– Other questions?
• Short answers

Page 16: Boosting Neural Networks

Discussions

• Empirically shows that AdaBoost works well for neural networks.

• The algorithm description is misleading:
– D_t(i) vs. D_t(i, y)