presenter - sas institute group presentati… · alex has over 25 years of diverse industrial,...

Post on 27-Sep-2020


Presenter: Alex Glushkovsky, Principal Analyst, BMO Financial Group

Alex has over 25 years of diverse industrial, consulting, and academic experience.

He holds a PhD in mathematical modeling and optimization of technological processes and an Honours MSEE.

Alex is a Fellow Member of The Royal Statistical Society, a certified Professional Risk Manager (PRM) by PRMIA, a Senior Member of the American Society for Quality, a CQE and CRE.

He has been awarded for outstanding instruction of the Economics for Managers course, Ellis MBA, NYIT.

Alex has published or presented over 30 research papers on statistical analysis and analytical management in international publications.

He has used SAS® for more than 20 years.

Outline Outliers: Adding a Business Sense

Alex Glushkovsky, BMO Financial Group

• Is the red dot an outlier?

[Figure: two panels with axes X1 and X2]

[Figure: the same question revisited with business context; labels: In Spec Range, Out of Spec, Small Segment; axis X1]

• Classification of Outliers

[Diagram labels: Rare, Unusual, Uncommon, Unexpected, Violated; Error, Inherent; Impact on Business; Correct & Prevent]

Outliers = Assignable Variance (Statistical Process Control Concept)

• Types of Outliers

Analytical Sense:

Univariate (Distance, p-Value, Time Series)

Multivariate (Regressions, Clusters)

Distributions, Patterns

Combinations of items (baskets)

Sequences of events

Models

Business Sense:

Efficient frontier, which is a set of outliers

Optimization points of the objective functions (pricing that maximizes profit)

Extraordinary features of products, services, or processes

Non-dominated strategies that form Nash equilibrium
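The idea that an efficient frontier is itself a set of outliers can be illustrated with a small sketch. It assumes each candidate is described by two objectives to be maximized; the `(profit, volume)` pairs and the function names are hypothetical and are not taken from the presentation. A point belongs to the frontier when no other point is at least as good on every objective and strictly better on one.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and
    strictly better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def efficient_frontier(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (profit, volume) pairs, for illustration only.
candidates = [(1, 9), (3, 7), (2, 6), (5, 5), (4, 4), (7, 2), (6, 1)]
frontier = efficient_frontier(candidates)
print(frontier)
```

This brute-force check compares every pair of points, which is adequate for a handful of candidates; larger sets would likely need a sort-based approach.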

• Outlier Based Business Improvement Cycle

Outlier detection

Classification

Identification of root causes

Improvement Action

Monitoring

• Capability Maturity Model (CMM(SM)) Levels of Business Interaction with Outliers

• Ignorance (pay no attention, no detections, no isolations)

• Detection (ad hoc removal or replacement)

• Control (implementation of special tools such as SPC)

• Management (establishment of improvement actions)

• Optimization (systematic and integrative improvement processes toward a business goal in a constrained environment)

• Utilization of “Outliers” Is a Natural Growing Mechanism of the Climbing Plant

A continuous improvement strategy: spread outliers in different directions, check the outcomes, and select the most beneficial one for future growth.

[Figure: climbing plant; label: Outliers]

• Relationships between Outliers and Models

[Diagram labels: Models, Outliers; connections: Root causes, Identification, Detection, Impact]

• Outliers and Models: Lift Charts Comparing Different Models of the Same Problem

[Figure: lift chart with both axes from 0% to 100%; one curve is labeled Outlier Model]

This observation should be validated against an overfitting effect
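One way to check an unusually strong lift curve against overfitting is to compute the same curve on training and validation data and compare them. The sketch below is a minimal illustration under that assumption; the scores, event flags, and function names are hypothetical, and the curve computed is a cumulative gains curve (share of events captured in the top-ranked part of the population).

```python
def cumulative_gains(scores, events, steps=5):
    """Share of all events captured in the top k/steps of the population,
    ranked by descending score, for k = 1..steps."""
    ranked = [e for _, e in sorted(zip(scores, events), key=lambda t: -t[0])]
    total = sum(ranked)
    n = len(ranked)
    return [sum(ranked[: round(n * k / steps)]) / total for k in range(1, steps + 1)]


def max_gap(gains_a, gains_b):
    """Largest vertical distance between two gains curves."""
    return max(abs(a - b) for a, b in zip(gains_a, gains_b))


# Hypothetical data: the model ranks training events perfectly,
# but spreads validation events across the ranking.
train_events = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
train_scores = [0.95, 0.90, 0.85, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
valid_events = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
valid_scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]

gains_train = cumulative_gains(train_scores, train_events)
gains_valid = cumulative_gains(valid_scores, valid_events)
gap = max_gap(gains_train, gains_valid)
print(gains_train, gains_valid, gap)
```

A large gap between the two curves suggests that the outlying performance may not generalize; what counts as "large" is a judgment call that this sketch does not settle.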

• Outliers and Distributions: Empirical Cumulative Distributions with Examples of Outliers:

a) Significant Shift in Mean and Median

b) Bimodal Distribution

[Figure: two panels of empirical cumulative distributions, a) and b), each with x-axis from -3 to 3 and y-axis from 0% to 100%; one curve in each panel is labeled Outlier]
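A distribution that stands apart from the others can be found by comparing empirical cumulative distributions directly. The sketch below assumes a simple approach that the slides do not prescribe: for each group, measure the largest vertical gap (the Kolmogorov-Smirnov distance) between its ECDF and the ECDF of all other groups pooled. Group names and values are hypothetical.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical cumulative distribution function of a sample."""
    xs = sorted(sample)
    n = len(xs)
    return lambda x: bisect_right(xs, x) / n


def ks_distance(a, b):
    """Largest vertical gap between the ECDFs of samples a and b."""
    fa, fb = ecdf(a), ecdf(b)
    return max(abs(fa(x) - fb(x)) for x in set(a) | set(b))


def distance_to_rest(groups):
    """KS distance between each group and the pooled remaining groups."""
    out = {}
    for name, sample in groups.items():
        rest = [x for other, values in groups.items() if other != name for x in values]
        out[name] = ks_distance(sample, rest)
    return out


# Hypothetical groups; "C" is shifted relative to "A" and "B".
groups = {
    "A": [-1, 0, 1, 0.5, -0.5],
    "B": [-0.8, 0.2, 0.9, -0.1, 0.4],
    "C": [1.5, 2.0, 2.5, 1.8, 2.2],
}
distances = distance_to_rest(groups)
outlier = max(distances, key=distances.get)
print(outlier, distances)
```

This distance reacts to shifts in location; a bimodal group with the same median as the others would need a closer look at the shape of its curve.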

• Outliers and Patterns: Illustrative Example of Sequential Patterns Control

[Figure: six sequential pattern panels, numbered 1 to 6, plus their Average]

Ref: Glushkovsky, A. and Billard, T. 1998. “Pattern Control: Correlation Analysis Combined with SPC.” Quality Engineering, Volume 11, Issue 2.

Comparison with the average: O(N). Pairwise comparison: O(N²).
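The difference in cost can be sketched as follows: instead of correlating every pair of patterns, each pattern is correlated once with the average pattern, which needs N correlations rather than N(N-1)/2. This is a simplified stand-in and not the method of the cited paper; the six patterns, the 0.5 threshold, and the function names are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def correlation_to_average(patterns):
    """Correlate each pattern with the point-wise average pattern (N correlations)."""
    average = [mean(col) for col in zip(*patterns)]
    return [pearson(p, average) for p in patterns]


# Six hypothetical sequential patterns; the last one runs against the others.
patterns = [
    [1, 2, 3, 4, 5],
    [1, 2, 3, 5, 4],
    [2, 1, 3, 4, 5],
    [1, 3, 2, 4, 5],
    [1, 2, 4, 3, 5],
    [5, 4, 3, 2, 1],
]
correlations = correlation_to_average(patterns)
# Report 1-based pattern numbers whose correlation falls below the threshold.
flagged = [i + 1 for i, r in enumerate(correlations) if r < 0.5]
print(flagged)
```

Note that an outlying pattern also pulls the average toward itself, so with few patterns or several outliers a more robust reference (for example a point-wise median) might be preferable.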

• Outliers and Clustering: Scatterplot of Cluster Profitability Versus Sample Size (%)

[Figure: scatterplot of cluster profitability ($0 to $45) versus sample size (0% to 25%); two clusters are labeled Outlier I and Outlier II]

The Outlier I cluster has a small sample size and a target-variable value that differs significantly from those of its neighboring clusters.

The Outlier II cluster has a small sample size but a target-variable value similar to that of its neighboring cluster.

Connectors link some neighboring clusters.
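The distinction between the two kinds of outlying cluster can be expressed as a simple rule. The sketch below assumes that neighborhood is given as an explicit list of links (standing in for the connectors), and that "small" and "significantly different" are set by thresholds; the cluster names, sizes, profits, and thresholds are all hypothetical.

```python
def classify_small_clusters(clusters, links, max_size_pct=2.0, min_gap=10.0):
    """Label small clusters by how far their target value is from linked neighbors.

    clusters: name -> (sample size in %, target value such as profit)
    links:    pairs of neighboring cluster names
    """
    neighbors = {name: set() for name in clusters}
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    result = {}
    for name, (size_pct, profit) in clusters.items():
        if size_pct > max_size_pct or not neighbors[name]:
            continue  # only small clusters with at least one neighbor are assessed
        linked = neighbors[name]
        neighbor_mean = sum(clusters[n][1] for n in linked) / len(linked)
        gap = abs(profit - neighbor_mean)
        result[name] = "Outlier I" if gap >= min_gap else "Outlier II"
    return result


# Hypothetical clusters: (sample size %, profit in $).
clusters = {
    "c1": (20.0, 12.0),
    "c2": (15.0, 15.0),
    "c3": (1.0, 40.0),
    "c4": (1.5, 14.0),
    "c5": (10.0, 18.0),
}
links = [("c1", "c2"), ("c2", "c3"), ("c2", "c4"), ("c4", "c5")]
labels = classify_small_clusters(clusters, links)
print(labels)
```

Under this rule, an Outlier I cluster is a candidate for a special segment, while an Outlier II cluster is a candidate for merging with its neighbor; either decision would still need business review.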

• Outliers and Association Rules:

• Rare type of event occurs (rare item in the basket)

• Rare association of events per customer or session (combinations of items in the basket)

• Rare sequence of events per customer or session
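The first two cases can be sketched by counting support, that is, the share of baskets containing an item or a pair of items. The example below assumes a support threshold of 10% and reports rare pairs only when both items are individually common, so that a rare combination is not simply the echo of a rare item. The baskets, the threshold, and the function name are hypothetical; rare sequences could be handled along the same lines by counting ordered pairs of events.

```python
from collections import Counter
from itertools import combinations


def rare_patterns(baskets, max_support=0.1):
    """Return (rare items, rare pairs of individually common items)."""
    n = len(baskets)
    items = Counter(i for b in baskets for i in set(b))
    pairs = Counter(p for b in baskets for p in combinations(sorted(set(b)), 2))

    rare_items = sorted(i for i, c in items.items() if c / n <= max_support)
    rare_pairs = sorted(
        p for p, c in pairs.items()
        if c / n <= max_support and all(items[i] / n > max_support for i in p)
    )
    return rare_items, rare_pairs


# Hypothetical baskets, one list per customer or session.
baskets = [
    ["bread", "milk"],
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["eggs", "tea"],
    ["eggs", "tea"],
    ["bread", "tea"],
    ["milk", "eggs"],
    ["bread", "milk"],
    ["eggs", "tea"],
    ["milk", "caviar"],
]
rare_items, rare_pairs = rare_patterns(baskets)
print(rare_items, rare_pairs)
```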

• Outliers and Regression Models: Scatter Plot Representing Event Rate of the Target Variable vs Logarithmically Scaled Percentage of Population

[Figure: scatter plot of event rate (0 to 60) versus percentage of population on a logarithmic axis (0.01 to 100); annotations: "Outliers with elevated values of the target variable" and "Outlier with an average value of the target variable"]

Variable  Bin                      Group  Scorecard Points  Weight of Evidence  Event Rate (TARGET_BIN)  Percentage of Population  Coefficient
D4        0                        1       0                -0.54               21.252                   30.64                     -0.09
D4        1, _MISSING_, _UNKNOWN_  2       2                 0.28               11.425                   69.36                     -0.09
D16       1                        1       3                -0.49               19.565                   28.19                     -0.14
D16       0, _MISSING_, _UNKNOWN_  2       0                 0.2                 3.384                   71.81                     -0.14
V17       V17< 2                   1      -2                 0.31                5.823                    8.11                     -0.42
V17       2<= V17                  2      -2                 0.25               12.418                    7.17                     -0.42
V17       _MISSING_                3       2                -0.05               15.673                   84.72                     -0.42
V12       V12< 2                   1       0                 0.26                4.474                    9.72                     -0.18
V12       2<= V12< 3               2      -1                 0.4                10.948                    5.55                     -0.18
V12       3<= V12< 4               3      -2                 0.53               12.437                    3.67                     -0.18
V12       4<= V12< 10              4      -1                 0.42               15.417                   10.32                     -0.18
V12       10<= V12, _MISSING_      5       2                -0.14               26.604                   70.73                     -0.18

• Outliers and Decision Tree: Aspects for Consideration

• Sample size of the final node

• Estimated values of the target variable of the final node

• Interaction effects of the branches

• Split cut-offs of the nodes

[Diagram labels: Modeling Data, Outlier Node, Overfitting?; Modeling Data, Robust Model, Special Segments]

1. Select inputs for deterministic special segment splits by using business objectives and judgement. They can be any type of variable: nominal, ordinal, binary, or interval numeric.

2. Run univariate decision trees, allowing a very small final node size. Explore setting the maximum number of branches to, say, 5 to 7, i.e., more than the default of two.

3. Based on sound statistics, such as purity measures for binary targets, identify candidates for the special segments.

4. Isolate the special segments obtained in step 3 only if they make strong business sense.

5. Run conventional decision tree, gradient boosting, random forest, scorecard, or regression models on the rest of the population. Step 3 can provide additional guidance on the minimum node size to allow in the modeling.

Assemble the results of steps 4 and 5.

Defend or reject the obtained modeling results.
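Steps 2 to 5 can be sketched in a few lines for a single nominal input. The example below is a simplified stand-in for a univariate decision tree: it treats each level of the input as a final node, uses Gini impurity as the purity measure for a binary target, and keeps small, nearly pure nodes as candidates. The data, the thresholds, the function name, and the `approved` set (which stands for the business judgement in step 4) are all hypothetical.

```python
from collections import defaultdict


def segment_candidates(levels, targets, max_share=0.05, max_gini=0.2, min_count=3):
    """Return (level, count, event rate, Gini impurity) for small, nearly pure levels."""
    stats = defaultdict(lambda: [0, 0])  # level -> [count, events]
    for level, y in zip(levels, targets):
        stats[level][0] += 1
        stats[level][1] += y

    n = len(levels)
    candidates = []
    for level, (count, events) in stats.items():
        rate = events / count
        gini = 2 * rate * (1 - rate)  # Gini impurity of a binary node
        if count >= min_count and count / n <= max_share and gini <= max_gini:
            candidates.append((level, count, rate, gini))
    return candidates


# Hypothetical input with three levels and a binary target.
levels = ["A"] * 60 + ["B"] * 36 + ["C"] * 4
targets = [1] * 6 + [0] * 54 + [1] * 9 + [0] * 27 + [1] * 4

# Steps 2 and 3: find small, pure candidate segments.
candidates = segment_candidates(levels, targets)

# Step 4: keep only candidates that pass business review (assumed here).
approved = {"C"}

# Step 5: the remaining population goes to the conventional model.
rest = [(lv, y) for lv, y in zip(levels, targets) if lv not in approved]
print(candidates, len(rest))
```

The minimum node size observed among the candidates (4 records here) is the kind of guidance that step 3 can pass on to the conventional model in step 5.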

• Outliers and Stability: Assignable Instabilities of Time Series:

• Isolated single outlier

• Shift in mean or in variance

• Trend in mean or in variance

• Autocorrelation

• Cycling (seasonality)
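Two of these instabilities, an isolated outlier and a shift in mean, can be detected with basic control chart rules. The sketch below assumes limits at three standard deviations around a baseline mean and a run rule of eight consecutive points on one side of the center line, which is one common convention; it is not the stability index of the paper cited on this slide. The baseline, the series, and the function names are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Lower limit, center line, and upper limit from an in-control baseline."""
    center = mean(baseline)
    sigma = stdev(baseline)
    return center - k * sigma, center, center + k * sigma


def beyond_limits(series, lcl, ucl):
    """Indices of points outside the control limits (isolated outliers)."""
    return [i for i, x in enumerate(series) if x < lcl or x > ucl]


def runs_on_one_side(series, center, run_length=8):
    """Indices at which a run of run_length points on one side of center completes."""
    hits, run, side = [], 0, 0
    for i, x in enumerate(series):
        s = (x > center) - (x < center)  # +1 above, -1 below, 0 on the line
        if s != 0 and s == side:
            run += 1
        else:
            run = 1 if s != 0 else 0
            side = s
        if run == run_length:
            hits.append(i)
    return hits


# Hypothetical in-control baseline and a monitored series.
baseline = [0.1, -0.2, 0.3, -0.1, 0.0, 0.2, -0.3, 0.1, -0.1, 0.0]
series = [0.1, -0.2, 0.2, 5.0, 0.1, -0.1, 0.3, 0.2, 0.4, 0.1, 0.3, 0.2, 0.5, 0.3]

lcl, center, ucl = control_limits(baseline)
outliers = beyond_limits(series, lcl, ucl)
shifts = runs_on_one_side(series, center)
print(outliers, shifts)
```

Trends, autocorrelation, and cycling would need additional rules or time series models that this sketch does not cover.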

[Figure: control chart of a time series over observations 1 to 97, with values from -4 to 6; labels: Upper Control Limit, Lower Control Limit, Single Outlier, Recovery, “Immunization” Effect]

Ref: Glushkovsky, A. 2006. “Stability Index of Stochastic Processes: The Statistical Process Control Approach.” Economic Quality Control, Vol 21, No. 1, 87 – 111

• Conclusion

• Outliers are not just statistical or data quality issues but business matters. Do not ignore outliers. On the contrary, hunt for them! Use the climbing plants strategy to improve business in the right direction by using outliers as anchors to climb business peaks.

• Dealing with outliers should not be just an analyst’s ad hoc practice when preparing reports or training models, but a systematic approach that goes beyond standard data quality and integrity assurance.

• Systematically implementing special applications to detect outliers, to estimate their performance, to identify their root causes, and to control them will improve business.

• Disclaimer

• The paper represents the views of the author and does not necessarily reflect the views of the BMO Financial Group. All charts and data are simulated for illustrative purposes only and do not reflect the actual business state.

• Acknowledgments

• The author would like to thank Matthew Fabian and Lori Bieda of the BMO Financial Group for their support.
