Incremental Risk Minimization Algorithm (EAIS 2014)

Upload: andreas-buschermoehle

Posted on 13-Oct-2015


DESCRIPTION

This is a presentation from the 2014 IEEE Conference on Evolving and Adaptive Intelligent Systems in Linz, Austria.

TRANSCRIPT


Reliable Localized On-line Learning in Non-stationary Environments
Andreas Buschermöhle, Werner Brockmann
Institute of Computer Science, Smart Embedded Systems Group
Linz, 06/03/2014

Outline

On-line Learning

State of the Art

IRMA Approach

Investigation

Application

Conclusion

Smart Embedded Systems Group, University of Osnabrück

Motivation

On-line learning of LIP models for regression on a data sequence:
Continuous adaptation to changes (shift and drift)
Learning from a single datum at a time
Fixed low computational effort
Fixed low memory demand

But: a single datum contains only local information
In input space
In time
A change affects the global output


LIP Models
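The formulas on this slide did not survive the transcript, but a LIP (linear-in-parameters) model predicts through a fixed feature map, y_hat = w · phi(x), so only the weight vector has to be stored. A minimal sketch; the polynomial basis is an illustrative assumption, not necessarily the one used in the talk:

```python
import numpy as np

def poly_features(x, degree):
    """Polynomial basis: phi(x) = (1, x, x^2, ..., x^degree)."""
    return np.array([x**k for k in range(degree + 1)])

class LIPModel:
    """Linear-in-parameters model: y_hat = w . phi(x).
    Only the weight vector is stored, so memory demand is fixed,
    and one prediction costs a single dot product."""
    def __init__(self, n_features):
        self.w = np.zeros(n_features)

    def predict(self, phi):
        return float(self.w @ phi)
```

Because the model is linear in w, any on-line update only has to adjust one vector, which is what keeps the per-example computational effort fixed.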

On-line Learning Regression

On-line Learning Algorithms: Basics

Learning Algorithms: First Order
Advantages: permanent adaptation
Drawbacks: prone to noise; big global changes

Learning Algorithms: Second Order
Advantages: robust to noise; small changes in the long run
Drawbacks: less adaptation to changes; big changes in between
With forgetting: permanent adaptation, but highly unstable

Incremental Risk Minimization Algorithm (IRMA)
The risk combines the loss on the current example with a local stiffness term.

IRMA for LIP Models

Limit Case: Vanishing Stiffness
The stiffness vanishes, but the limit exists.

Hypothesis: Increasing Stiffness
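The first- and second-order families contrasted above can be illustrated with textbook updates — a gradient (LMS-style) step and recursive least squares with a forgetting factor (cf. [Rose1958], [Hunt1986]). This is a generic sketch of the two families, not the authors' exact formulation:

```python
import numpy as np

def lms_update(w, phi, y, eta=0.1):
    """First-order (gradient) update: cheap and permanently adaptive,
    but prone to noise and it changes the model globally."""
    return w + eta * (y - w @ phi) * phi

def rls_update(w, P, phi, y, lam=0.99):
    """Second-order recursive least squares with forgetting factor lam.
    More robust to noise; with lam < 1 it keeps adapting, but it can
    become unstable when the input is poorly excited."""
    k = P @ phi / (lam + phi @ P @ phi)   # gain vector
    w = w + k * (y - w @ phi)
    P = (P - np.outer(k, phi @ P)) / lam  # covariance update
    return w, P
```

The extra state P is what buys the second-order method its noise robustness, and it is also the source of the instability the slide mentions once forgetting keeps inflating it.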

(Figure: stiffness as a function of the number of examples.)

Investigation: Increasing Stiffness
Target: 3rd-order polynomial, normal noise of 5%, shift after 1200 examples
Model: 15th-order polynomial (overfitting)
Different increase functions for the stiffness
Comparison: fixed, additive increase, multiplicative increase, sigmoidal increase (high), sigmoidal increase (low)

Growth           | L(1200) | L(2400)
Fixed            | 7.41    | 24.47
Additive         | 6.54    | 73.32
Multiplicative   | 6.63    | 478.80
Sigmoidal (high) | 6.72    | 29.60
Sigmoidal (low)  | 7.31    | 24.36

Improvement through increasing stiffness is possible, but only small.

Application: Load Forecasting
Power companies rely on accurate electricity load forecasting:
Minimize financial risk
Optimize operational efficiency and reliability
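The stiffness growth schedules compared in the investigation above (fixed, additive, multiplicative, sigmoidal) could be sketched as follows; the functional forms and constants are illustrative assumptions, not the values used in the experiment:

```python
import numpy as np

def stiffness(t, mode, base=1.0):
    """Illustrative stiffness growth over the example count t.
    The schedule names follow the comparison in the slides; the
    exact forms and constants here are assumptions."""
    if mode == "fixed":
        return base
    if mode == "additive":
        return base + 0.01 * t          # grows linearly, unbounded
    if mode == "multiplicative":
        return base * 1.001 ** t        # grows exponentially, unbounded
    if mode == "sigmoidal":
        # saturates at `base`; midpoint and slope are assumptions
        return base / (1.0 + np.exp(-(t - 600) / 100.0))
    raise ValueError(f"unknown mode: {mode}")
```

The table's L(2400) column suggests why boundedness matters: the unbounded additive and multiplicative schedules leave the model too stiff to track the shift at example 1200, while the saturating sigmoidal schedules stay close to the fixed baseline.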

Typical non-stationary environment:
Continuous changes, e.g. due to weather
Rapid changes, e.g. due to holidays

Scenario:
Measurements every 15 minutes
Prediction of the next step
Prediction of the next 24 hours
Comparison of PA-II, RLS with forgetting, and IRMA on GLT and polynomial models
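The step-ahead part of the scenario amounts to a test-then-train protocol: predict the next measurement, record the error, then learn from the true value. A minimal sketch with a placeholder data stream and update rule — not the Kiel data or any of the compared algorithms:

```python
import numpy as np

def prequential_error(stream, update, w0):
    """Test-then-train evaluation of an on-line learner.
    `stream` yields (phi, y) pairs; `update` is any on-line rule
    (e.g. a gradient step); returns the mean one-step-ahead error."""
    w, errs = w0.copy(), []
    for phi, y in stream:
        errs.append(abs(y - w @ phi))  # predict BEFORE seeing y's effect
        w = update(w, phi, y)          # then learn from the true value
    return float(np.mean(errs))
```

Evaluating this way matches the scenario's setup: every prediction is made with a model that has only seen strictly earlier measurements.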

Power feed-in of the city of Kiel at medium voltage level for 2008 and 2009.

Results: Load Forecasting
Hyper-parameter setup

Relative prediction error:

                 | Steady | PA-II | RLS   | IRMA | IRMA (sig)
GLT, step ahead  | 1.04   | 2.83  | 12.39 | 2.30 | 2.30
Poly, step ahead | 1.04   | 19.75 | 12.92 | 1.79 | 1.79
GLT, 24h ahead   | 9.66   | 8.92  | 12.97 | 8.34 | 8.34
Poly, 24h ahead  | 9.66   | 19.77 | 48.69 | 7.85 | 7.85

Reference steady prediction:
Step ahead: last value
24h ahead: same day last week

Conclusion

References
[Rose1958] Rosenblatt, F.: The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review 65(6), 386-408 (1958)
[Cram2006] Crammer, K., Dekel, O., Keshet, J., Shalev-Shwartz, S., Singer, Y.: Online passive-aggressive algorithms. The Journal of Machine Learning Research 7, 551-585 (2006)
[Hunt1986] Hunt, K.J.: A survey of recursive identification algorithms. Transactions of the Institute of Measurement and Control 8(5), 273-278 (1986)
[Cram2009] Crammer, K., Kulesza, A., Dredze, M., et al.: Adaptive regularization of weight vectors. Advances in Neural Information Processing Systems 22, 414-422 (2009)
[Cram2010] Crammer, K., Lee, D.D.: Learning via Gaussian herding. Advances in Neural Information Processing Systems 23, 451-459 (2010)
[Busc2013a] Buschermoehle, A.; Huelsmann, J.; Brockmann, W.: UOSLib - A Library for Analysis of Online-Learning Algorithms. In: Proc. 23. Workshop Computational Intelligence, Karlsruhe: KIT Scientific Publishing, accepted for publication (2013)
[Busc2013b] Buschermoehle, A.; Schoenke, J.; Rosemann, N.; Brockmann, W.: The Incremental Risk Functional: Basics of a Novel Incremental Learning Approach. In: Proc. Int. Conf. IEEE Systems Man and Cybernetics (SMC): IEEE Press, 1500-1505 (2013)

Worst Case Minimization
(Figure: worst case of the global loss of approximation — a fixed part, and a part minimized by IRMA for local improvement.)

Choosing an On-line Learning Algorithm

Learning Algorithms: Basics