Task 1 of PP Interpretation

Post on 07-Jan-2016

TRANSCRIPT

Page 1: Task 1 of PP Interpretation

Federal Department of Home Affairs FDHA
Federal Office of Meteorology and Climatology MeteoSwiss

Task 1 of PP Interpretation

1.1 Further applications of boosting: This talk

1.2 Publication on boosting: Paper of Oliver Marchand submitted, but not yet published

Page 2: Task 1 of PP Interpretation

Thunderstorm Prediction with Boosting:

Verification and Implementation of a new Base Classifier

André Walser (MeteoSwiss)

Martin Kohli (ETH Zürich, Semester Thesis)

Page 3: Task 1 of PP Interpretation

3 Andre Walser

Overview

• Boosting Algorithm

• Impact of learn data

• Verification results

• Mapping to probability forecast

• New base classifier: decision tree

Page 4: Task 1 of PP Interpretation

Supervised Learning

[Diagram: Historic Data → Learner → Rules (Classifier); New Data → Classifier → yes/no]

Page 5: Task 1 of PP Interpretation

Learn data

COSMO-7 assimilation cycle
• Data for 79 SYNOP stations in Switzerland
• At least one year, every hour
• e.g. SI, CAPE, W, date, time

Label data
• A thunderstorm is labelled "yes" if
  • an appropriate ww-code was reported in the SYNOP, or
  • at least 3 lightning strokes were registered within 13.5 km of the station

[Figure: station with a 13.5 km radius]
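The labelling rule on this slide can be sketched in a few lines. This is only an illustration: the function names, the use of great-circle distance, and the input layout (station and strokes as latitude/longitude pairs, a boolean for the ww-code) are assumptions and are not taken from the talk.

```python
import math

RADIUS_KM = 13.5   # search radius around the station (from the slide)
MIN_STROKES = 3    # minimum number of lightning strokes (from the slide)

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance; an assumed distance measure."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def thunderstorm_label(station, strokes, ww_thunderstorm):
    """Return True ("yes") if the SYNOP ww-code indicates a thunderstorm
    or at least MIN_STROKES strokes fall within RADIUS_KM of the station."""
    near = sum(
        1 for lat, lon in strokes
        if distance_km(station[0], station[1], lat, lon) <= RADIUS_KM
    )
    return ww_thunderstorm or near >= MIN_STROKES
```

For example, three strokes about 5 km from the station give a "yes" even without a matching ww-code, while two such strokes do not.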

Page 6: Task 1 of PP Interpretation

AdaBoost Algorithm

Input
• Weighted learn samples
• Number of base classifiers M

Iteration
1. determine base classifier G
2. calculate error, weights w
3. adapt the weights of falsely classified samples
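The three iteration steps can be illustrated with a minimal sketch of discrete AdaBoost using threshold classifiers as base classifiers. The slides do not give the exact formulas, so the weight update, the classifier weight `alpha`, and the normalised vote returned by `certainty` follow the textbook algorithm and are assumptions; all function names are hypothetical.

```python
import math

def train_stump(xs, ys, w):
    """Step 1: choose the threshold classifier with the lowest weighted error."""
    best = None
    for j in range(len(xs[0])):
        for thr in sorted({x[j] for x in xs}):
            for pol in (1, -1):
                err = sum(
                    wi for x, y, wi in zip(xs, ys, w)
                    if (1 if pol * (x[j] - thr) > 0 else 0) != y
                )
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best  # (weighted error, feature index, threshold, polarity)

def stump_predict(stump, x):
    _, j, thr, pol = stump
    return 1 if pol * (x[j] - thr) > 0 else 0

def adaboost(xs, ys, M):
    """Train M base classifiers on samples xs with 0/1 labels ys."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(M):
        stump = train_stump(xs, ys, w)
        # Step 2: weighted error (clamped to avoid log(0)) and classifier weight
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Step 3: increase the weights of falsely classified samples, renormalise
        w = [wi * math.exp(alpha) if stump_predict(stump, x) != y else wi
             for x, y, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def certainty(ensemble, x):
    """Normalised weighted vote in [0, 1]; an assumed stand-in for C_TSTORM."""
    total = sum(a for a, _ in ensemble)
    if total == 0:
        return 0.5
    return sum(a for a, s in ensemble if stump_predict(s, x) == 1) / total
```

A sample would then be classified as "yes" when `certainty` exceeds a chosen threshold, which is the quantity varied in the verification later in the talk.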

Page 7: Task 1 of PP Interpretation

Output of the Learn process

• M base classifiers
• Threshold classifier:

Page 8: Task 1 of PP Interpretation

AdaBoost Algorithm

Input
• Weighted learn samples
• Number of base classifiers M

Iteration
1. determine base classifier G
2. calculate error, weights w
3. adapt the weights of falsely classified samples

Classifier:

Page 9: Task 1 of PP Interpretation

Output of the Classifier: C_TSTORM

[Figure: C_TSTORM output at 17, 18 and 19 UTC; panels annotated "Biased!"]

Page 10: Task 1 of PP Interpretation

Reason: Inappropriate learn data…

• SYNOP messages contain events and non-events, but are only available every 3 hours (most messages for 6, 12, 18 UTC).

• Lightning data only contains events

Page 11: Task 1 of PP Interpretation

New learn data sets

• B (biased): SYNOP messages; only events from lightning data

• F (full): SYNOP messages; all missing values are considered as non-events

• AL1 (at least 1): SYNOP messages; when lightning data show at least 1 event, all non-missing values are considered as non-events
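The difference between the B and F data sets comes down to how an hour without any observation is treated. The sketch below is one possible reading of the slide, not the authors' implementation: the function name and the encoding of a missing SYNOP report as `None` are assumptions, and AL1 is left out because its rule is not spelled out clearly enough here to encode.

```python
def label_hour(synop_event, lightning_event, variant):
    """Label one station-hour.

    synop_event:     True/False from a SYNOP report, or None if no report exists
    lightning_event: True if the lightning criterion was met
    variant:         "B" (biased) or "F" (full)
    Returns True (event), False (non-event) or None (excluded from the learn data).
    """
    if lightning_event or synop_event is True:
        return True
    if synop_event is False:
        return False
    # No SYNOP report and no lightning event for this hour
    if variant == "B":
        return None   # stays missing: non-events exist only at SYNOP hours
    if variant == "F":
        return False  # missing values are considered as non-events
    raise ValueError(f"unknown variant: {variant}")
```

Under this reading, set B contains non-events only at SYNOP reporting hours but events at any hour, which would explain the hour-dependent bias in the classifier output.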

Page 12: Task 1 of PP Interpretation

Without bias…

[Figure: C_TSTORM output at 17, 18 and 19 UTC]

Page 13: Task 1 of PP Interpretation

Verification

• POD and FAR for different C_TSTORM values between 0.3 and 0.6

FAR = False Alarms / #Alarms

• Learn data:
  Model: COSMO-7 assimilation cycle, Jun 06 – May 07
  Obs: B / AL1 / F

• Verification data:
  Model: COSMO-7 forecasts, July 06 and May/June 07
  Obs: F
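As an illustration of how POD and FAR depend on the C_TSTORM threshold, the following sketch computes both scores from a list of classifier outputs and observations. FAR follows the definition on the slide (false alarms divided by all alarms); POD is taken as hits divided by observed events, the usual definition. The function name and input layout are assumptions.

```python
def pod_far(scores, observed, threshold):
    """Return (POD, FAR) when an alarm is issued for scores >= threshold.

    scores:   classifier outputs (e.g. C_TSTORM values)
    observed: booleans, True where a thunderstorm was observed
    """
    hits = misses = false_alarms = 0
    for s, o in zip(scores, observed):
        alarm = s >= threshold
        if alarm and o:
            hits += 1
        elif alarm and not o:
            false_alarms += 1
        elif o:
            misses += 1
    events = hits + misses
    alarms = hits + false_alarms
    pod = hits / events if events else float("nan")
    far = false_alarms / alarms if alarms else float("nan")
    return pod, far

# Scanning thresholds between 0.3 and 0.6, as on the slide
scores = [0.2, 0.35, 0.5, 0.7]
observed = [False, True, False, True]
for thr in (0.3, 0.4, 0.5, 0.6):
    print(thr, pod_far(scores, observed, thr))
```

Lowering the threshold raises POD but usually raises FAR as well, so the two scores have to be read together.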

Page 14: Task 1 of PP Interpretation

Verification: earlier results

• Results reported last year for 2005:

POD = 72%, FAR = 34%

• Unfortunately not realistic: the verification was done with obs data B

Page 15: Task 1 of PP Interpretation

[Figure: verification results for July 2006; annotations: "~7% events", "Random forecast"]

Page 16: Task 1 of PP Interpretation

[Figure: verification results for 18 May – 24 June 2007]

Page 17: Task 1 of PP Interpretation

Comparison with another system

• DWD Expert-System:
• Period April 2006 - September 2006:

POD = 0.346, FAR = 0.740

Page 18: Task 1 of PP Interpretation

Mapping to a probability forecast

Polygon fit in a reliability diagram:

[Figure: reliability diagram of P against C_TSTORM]

Page 19: Task 1 of PP Interpretation

Mapping to a probability forecast

P(C_TSTORM), with x = C_TSTORM:

$$
P(x) =
\begin{cases}
0 & \text{if } x \le 0.4 \\
a x^2 + b x + c & \text{if } 0.4 < x < 0.6 \\
a \cdot 0.6^2 + b \cdot 0.6 + c & \text{if } x \ge 0.6
\end{cases}
$$

Limited resolution: The system predicts probabilities only between 0 and ~40%
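The piecewise mapping can be written as a small function. The slide does not give the fitted coefficients, so `a`, `b` and `c` are parameters here, and the values in the usage example are purely hypothetical, chosen only so that the output saturates at 0.4.

```python
def probability(x, a, b, c):
    """Map a classifier output x (C_TSTORM) to a probability.

    Zero up to 0.4, a quadratic between 0.4 and 0.6, and constant
    (the quadratic evaluated at 0.6) from 0.6 upwards.
    """
    if x <= 0.4:
        return 0.0
    if x < 0.6:
        return a * x ** 2 + b * x + c
    return a * 0.6 ** 2 + b * 0.6 + c

# Hypothetical coefficients, NOT the fitted values from the talk
a, b, c = 0.0, 2.0, -0.8
for x in (0.3, 0.5, 0.7):
    print(x, probability(x, a, b, c))
```

Because the mapping is constant above 0.6, no forecast probability can exceed the value of the quadratic at 0.6, which is the limited resolution noted on the slide.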

Page 20: Task 1 of PP Interpretation

New Base Classifier: Decision Tree

[Diagram: a single threshold classifier (threshold classifier 1) with outputs 1 and 0]

Page 21: Task 1 of PP Interpretation

New Base Classifier: Decision Tree

[Diagram: decision tree with threshold classifier 1 at the root, branching to threshold classifiers 2 and 3; each of these outputs 1 or 0 (class 1 / class 0)]
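A two-level tree of threshold classifiers, as drawn on this slide, might be sketched as follows. The structure (the root selects which second-level threshold classifier gives the final class) is one reading of the diagram; the function names, the dictionary input, and all thresholds are hypothetical and carry no meteorological meaning.

```python
def threshold_classifier(feature, threshold):
    """Return a classifier that outputs 1 if x[feature] > threshold, else 0."""
    return lambda x: 1 if x[feature] > threshold else 0

def two_level_tree(root, on_one, on_zero):
    """Combine three threshold classifiers into a depth-2 decision tree.

    The root's output selects which second-level classifier decides the class.
    """
    return lambda x: on_one(x) if root(x) == 1 else on_zero(x)

# Hypothetical thresholds, for illustration only
tree = two_level_tree(
    threshold_classifier("CAPE", 500.0),  # threshold classifier 1
    threshold_classifier("W", 0.5),       # threshold classifier 2
    threshold_classifier("SI", 2.0),      # threshold classifier 3
)
print(tree({"CAPE": 800.0, "W": 1.0, "SI": 0.0}))
```

Unlike a single threshold classifier, such a tree can express an interaction between two indicators, which is the possible benefit raised in the conclusions.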

Page 22: Task 1 of PP Interpretation

Decision Tree: Example

Page 23: Task 1 of PP Interpretation

Conclusions & Outlook

• Boosting
  • is a simple, efficient and effective machine learning method for model post-processing
  • is completely general
  • can employ a number of redundant indicators
  • computes a certainty of the classification (mapped to a probability forecast)

• First verification results promising, extended verification required

• Benefit of decision trees?