Lean Six Sigma

DMAIC Process: Common Mistakes and Misconceptions During Data Collection and Analysis

Hans Vanhaute, 04/08/2014

Goal of tonight’s presentation

Give you a few examples of common mistakes made during the “Measure” phase of DMAIC projects.

Draw more widely applicable lessons and conclusions that may benefit you (so you don’t make the same mistakes).

Hopefully provide you with some interesting insights (and not put you to sleep).

DMAIC and “Projects”

“A problem scheduled for a solution.”

Management decides the problem is important enough to provide the resources it needs to get the problem solved.

“DMAIC Projects”

Eliminates a chronic problem that causes customer dissatisfaction, defects, costs of poor quality, or other deficiencies in performance.

Six Sigma DMAIC Project

DEFINE MEASURE ANALYZE IMPROVE CONTROL

Very Data-Intensive

M – Measure

Define a high-level process map.

Define the measurement plan.

Test the measurement system (“Gauge Study”).

Collect the data to objectively establish current baseline.

Typical tools:

- Capability analysis

- Gage R&Rs

The DMAIC steps

Y = f (unknown Xs)

Initial Analysis

Capability Analysis Conundrums

Black Box Process

Unknown Xs

Cpk values that inaccurately predict process performance

Non-normal data

Instability of the process over time

Case 1: Inherent non-normality of the process output.

Some physical, chemical, transactional processes will produce outcomes that “lean” one way:

- Time measurements

- Values close to zero, but that are always positive (surface roughness RMS…)

- …

Process experts or careful analysis of the metric should be able to help with understanding.


Good news: Capability Analysis of Non-Normal data is possible.

Bad news: This situation doesn’t happen very often.
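One common way to handle inherently skewed, always-positive data such as cycle times is to transform it before computing capability. The sketch below assumes the data are roughly lognormal and that only an upper spec limit applies. The function names, the sample times and the limit of 30 are hypothetical and not taken from the presentation.

```python
import math
import statistics

def ppk_upper(values, usl):
    """One-sided capability index (USL - mean) / (3 * s), which assumes normal data."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation
    return (usl - mean) / (3 * s)

def ppk_upper_lognormal(values, usl):
    """Same index computed on log-transformed data and a log-transformed limit.

    Assumption: the values are strictly positive and roughly lognormal.
    """
    logs = [math.log(v) for v in values]
    return ppk_upper(logs, math.log(usl))

# Hypothetical cycle times in minutes (right-skewed, always positive).
times = [4.1, 5.3, 3.8, 6.9, 4.7, 12.5, 5.0, 7.8, 4.4, 9.6, 5.9, 3.5, 6.2, 18.0, 5.1]
usl = 30.0  # hypothetical upper spec limit

ppk_normal = ppk_upper(times, usl)
ppk_log = ppk_upper_lognormal(times, usl)
print("Index assuming normality: ", round(ppk_normal, 2))
print("Index after log transform:", round(ppk_log, 2))
```

For this made-up sample the index that assumes normality comes out noticeably higher than the transformed one, which is the kind of inaccurate prediction the Cpk slide warns about.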

Example

Case 2: Problematic measurement systems

(we’ll come back to that one when we discuss GR&R…)


Case 3: Failure to stratify the data.

Stratification is the separation of data into categories. It means to “break up” the data to see what it tells you.

Its most frequent use is when diagnosing a problem and identifying which categories contribute to the problem being solved.


This is the big one!

[Diagram: four process streams (Stream 1 to Stream 4), each with its own capability (Cpk1 to Cpk4). Overall: Cpk ???]


1 Cpk value?

2 Cpk values?

What is a Cpk value supposed to tell us?

Expected future performance of the process(es) assuming statistical stability over time.


Over-estimating variation of the process. (Why?)

Under-estimating process capability.

Leading to all sorts of non-value-added activity for your organization.

Recognize that two of the four streams are the main drivers of overall capability.

Correct estimation of the two most important process capabilities.

Points to appropriate improvement activities.
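The effect of failing to stratify can be sketched with simulated data. The example below assumes four streams with the same spread but different means. The spec limits, stream means and sample sizes are hypothetical. It uses the usual definition Cpk = min(USL - mean, mean - LSL) / (3 * s).

```python
import random
import statistics

def cpk(values, lsl, usl):
    """Cpk = min(USL - mean, mean - LSL) / (3 * s), using the sample standard deviation."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return min(usl - mean, mean - lsl) / (3 * s)

random.seed(1)  # fixed seed so the sketch is repeatable
LSL, USL = 94.0, 106.0  # hypothetical spec limits

# Hypothetical streams: same within-stream spread, different means.
stream_means = {"Stream 1": 97.0, "Stream 2": 99.0, "Stream 3": 101.0, "Stream 4": 103.0}
streams = {name: [random.gauss(mu, 1.0) for _ in range(200)]
           for name, mu in stream_means.items()}

# Not stratified: all streams thrown into one dataset.
pooled = [x for values in streams.values() for x in values]
pooled_cpk = cpk(pooled, LSL, USL)
print("Pooled Cpk:", round(pooled_cpk, 2))

# Stratified: one Cpk per stream.
stream_cpk = {name: cpk(values, LSL, USL) for name, values in streams.items()}
for name, value in stream_cpk.items():
    print(name, "Cpk:", round(value, 2))

# Why the pooled variation is over-estimated: it contains the spread between
# stream means on top of the spread within each stream.
pooled_s = statistics.stdev(pooled)
within_s = statistics.mean(statistics.stdev(v) for v in streams.values())
print("Pooled s:", round(pooled_s, 2), "vs. average within-stream s:", round(within_s, 2))
```

In this sketch the pooled Cpk is lower than the Cpk of every individual stream, and the two streams furthest from center (Stream 1 and Stream 4) are the ones that limit overall capability.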


[Figure: Time Series Plot (Data vs. Index) comparing prediction and actual data, stratified and not stratified. Cpk values shown: 0.67, 1.15, 1.50, 1.50, 1.15.]


Example

Problematic measurement systems:

2a: Limiting factors to “how well” you can measure something.

2b: I passed my GR&R but I’m still getting “weird” results.

2c: Time effects


Case 2a: Limits to measurements

Game: Identify the dataset with the highest resolution.

Resolution:

a: The process or capability of making distinguishable the individual parts of an object, closely adjacent optical images, or sources of light.

b: A measure of the sharpness of an image or of the fineness with which a device can produce or record such an image.

Case 2a: Limits to measurements

 9.9  10.2   9.6   9.9
10.5   9.9   9.0   9.9
11.4  10.2   9.3  10.2
 9.0   9.6  10.2   9.6
10.2  10.5  10.2   9.6
10.5   9.9  10.2   9.9
10.2  11.1  10.5  10.8
10.2  10.8   9.3  10.2
10.5   9.3   9.9   9.9
 9.3   9.9  10.8  11.1
10.8   9.6  10.8  10.8
 9.9   9.3  10.2   9.3
 9.9  10.5   9.9   9.9
 9.9   9.9  10.2   9.6

 9.6  10.2   9.6   9.6
10.2   9.6   9.0   9.6
11.4  10.2   9.6  10.2
 9.0   9.6  10.2   9.6
10.2  10.2  10.2   9.6
10.2   9.6  10.2  10.2
10.2  10.8  10.2  10.8
10.2  10.8   9.0  10.2
10.2   9.6   9.6   9.6
 9.0   9.6  10.8  10.8
10.8   9.6  10.8  10.8
 9.6   9.0  10.2   9.6
10.2  10.2   9.6   9.6
 9.6   9.6  10.2   9.6

Which dataset has the highest resolution?

? ?

Measurement Resolution:

a: The process or capability of making distinguishable the individual parts of a dataset or closely adjacent data points.

b: A measure of the sharpness of a set of data or of the fineness with which a measurement device can produce or record such a dataset.

Limiting factors to “how well” you can measure something.

The same four readings at progressively less resolution (panels 1 to 4):

1: 9.9397 10.5944 11.4290 9.0401
2: 9.9 10.5 11.4 9.0
3: 9.6 10.2 11.4 9.0
4: 10 11 11 9

[Figure: four histograms (Frequency) of the same data, panels 1 to 4, with progressively less resolution: unlimited decimal places, 1-in-10 rule, 1-in-5, 1-in-3.]


[Figure: four probability plots (Percent) of the same data with progressively less resolution: unlimited decimal places, 1-in-10 rule, 1-in-5 rule, 1-in-3 rule.]


Standard deviation S of the same data, panels 1 to 4, with progressively less resolution:

S = 1.000
S = 1.005 (0.5% over)
S = 1.040 (4% over)
S = 1.150 (15% over)
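The inflation of S with coarser resolution can be reproduced in spirit with simulated data. The sketch below assumes a normally distributed process with a standard deviation of 0.5, so that the nominal 6-sigma range is 3.0 and the three resolutions correspond to 1/10, 1/5 and 1/3 of that range. All numbers are hypothetical, and the resulting ratios will not match the S values from the presentation, which come from a different dataset.

```python
import random
import statistics

def record(value, resolution):
    """Round a reading to the nearest multiple of the measurement resolution."""
    return round(value / resolution) * resolution

random.seed(7)  # fixed seed so the sketch is repeatable
SIGMA = 0.5                      # hypothetical process standard deviation
NOMINAL_RANGE = 6 * SIGMA        # assumed data range used to size the resolution
true_values = [random.gauss(10.0, SIGMA) for _ in range(5000)]
s_true = statistics.stdev(true_values)

ratios = {}
for divisor in (10, 5, 3):
    resolution = NOMINAL_RANGE / divisor
    recorded = [record(v, resolution) for v in true_values]
    s_recorded = statistics.stdev(recorded)
    distinct = len(set(round(r, 6) for r in recorded))
    ratios[divisor] = s_recorded / s_true
    print(f"1-in-{divisor}: resolution={resolution:.1f}, "
          f"distinct values={distinct}, S ratio={ratios[divisor]:.3f}")
```

The coarser the resolution, the fewer distinct values remain and the more the recorded standard deviation overstates the true one.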


Limiting factors to “how well” you can measure something:


Why?

- “Always done it that way, never given it any thought.”
- Focus on “meeting specs”, not on controlling the process.
- “Always” round to x decimal places.
- Nobody told me how many decimals were needed.
- …

The old “1 in 10” rule of thumb seems to make sense:

- Resolution must be at least 1/10th of the data range.
- Resolution must be at least 1/10th of the spec range.
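The “1 in 10” rule of thumb is easy to check before any data is collected. A minimal sketch, in which the function name, the readings and the spec limits are all hypothetical:

```python
def meets_one_in_ten(resolution, low, high):
    """True if the resolution is no coarser than 1/10th of the range [low, high]."""
    return resolution <= (high - low) / 10

# Hypothetical gauge that reports to one decimal place.
resolution = 0.1
readings = [9.9, 10.5, 11.4, 9.0, 10.2, 9.6]
lsl, usl = 9.5, 10.3  # hypothetical, fairly tight spec limits

ok_vs_data = meets_one_in_ten(resolution, min(readings), max(readings))
ok_vs_spec = meets_one_in_ten(resolution, lsl, usl)
print("Good enough vs. data range:", ok_vs_data)
print("Good enough vs. spec range:", ok_vs_spec)
```

Here the same gauge passes against the data range but fails against the spec range, which is why it helps to check both.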

Case 2b: “Weird” Stuff

I passed my GR&R but I’m still getting “weird” results.

Distribution of Measurements

Distribution of measurement variability

GR&R 101: “Metrics”

P/TV ratio expresses the total measurement variability as a percentage of the total historical process variation.

Here P/TV ~ 14%


Distribution of measurement error

GR&R 101: “Metrics”

P/T expresses the total measurement variability as a percentage of the tolerance width of the process.

Here P/T ~ 12.5%

Spec. limits


GR&R 101: “Metrics”

                     P/TV        P/T
Very good            < 10%       < 10%
Marginal             10 – 30%    10 – 30%
Needs Improvement    > 30%       > 30%
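The two ratios and the rating table above can be sketched in a few lines. This assumes the common convention of comparing standard deviations for P/TV and using a 6-sigma measurement spread for P/T (some references use 5.15 instead). The function names and input values are hypothetical, chosen only so the results land near the percentages quoted earlier.

```python
def p_to_tv(sigma_ms, sigma_total):
    """Measurement std. dev. as a percentage of total (historical) process std. dev."""
    return 100 * sigma_ms / sigma_total

def p_to_t(sigma_ms, lsl, usl, k=6.0):
    """Measurement spread (k * sigma) as a percentage of the tolerance width.

    Assumption: k = 6 standard deviations; some references use 5.15.
    """
    return 100 * k * sigma_ms / (usl - lsl)

def rate(percent):
    """Apply the rating table: <10% very good, 10-30% marginal, >30% needs improvement."""
    if percent < 10:
        return "Very good"
    if percent <= 30:
        return "Marginal"
    return "Needs Improvement"

# Hypothetical inputs: measurement sigma, total process sigma, spec limits.
sigma_ms, sigma_total = 0.5, 3.5
lsl, usl = 88.0, 112.0

ptv = p_to_tv(sigma_ms, sigma_total)
pt = p_to_t(sigma_ms, lsl, usl)
print(f"P/TV = {ptv:.1f}% ({rate(ptv)})")
print(f"P/T  = {pt:.1f}% ({rate(pt)})")
```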

Simple, right? Not so fast…


[Figure: Gage R&R study report, NaOH Titration. Gage name: NaOH titration. Date of study: 11/25/2008. Reported by: Jim Smith. Tolerance: 0.01. Panels: Components of Variation (% Contribution, % Study Var, % Tolerance for Gage R&R, Repeat, Reprod, Part-to-Part); R Chart by Tester (R-bar = 0.2333, UCL = 0.7624, LCL = 0); Xbar Chart by Tester (X-double-bar = 35.21, UCL = 35.65, LCL = 34.77); Result by Parts; Result by Tester; Tester * Parts Interaction.]

R chart by operator:

- Points inside control limits indicate that the operator is consistent between repeat measurements made on the same sample (GOOD).
- Points outside control limits indicate that the operator is not consistent between repeat measurements made on the same sample (BAD).
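The R chart check described above can be sketched as follows, assuming each sample is measured twice by the same operator (subgroup size 2, for which the standard R-chart constants are D4 = 3.267 and a lower limit of 0). The measurements are hypothetical. The R-bar of 0.2333 and UCL of 0.7624 shown in the study are roughly consistent with this subgroup size.

```python
D4_N2 = 3.267  # standard R-chart constant for subgroups of size 2 (lower limit is 0)

def r_chart(pairs):
    """Return (ranges, r_bar, ucl) for repeat measurements taken in pairs."""
    ranges = [abs(a - b) for a, b in pairs]
    r_bar = sum(ranges) / len(ranges)
    return ranges, r_bar, D4_N2 * r_bar

# Hypothetical data: one operator measuring five samples twice each.
pairs = [(35.1, 35.3), (34.9, 35.0), (35.4, 35.2), (35.0, 36.2), (35.2, 35.1)]

ranges, r_bar, ucl = r_chart(pairs)
# Sample numbers (1-based) whose range falls outside the upper control limit.
flagged = [i + 1 for i, r in enumerate(ranges) if r > ucl]

print(f"R-bar = {r_bar:.3f}, UCL = {ucl:.3f}, LCL = 0")
print("Samples with inconsistent repeats:", flagged)
```

A flagged sample is a prompt to look at how that particular repeat measurement was made, not proof that the operator is at fault.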


Example

P/T = 22%

P/TV = 16%


Example

P/T = 70%, P/TV = 40%

P/T = 10%, P/TV = 6%


Example

So… what caused this?

Camera

Lens

Ring Light

Pin Tip Position

Case 2c: Time Effects

The speed of information is finite.

Information can come from different distances.



- Moon: 1.2 light-seconds away
- Sun: 8 light-minutes away
- Mars: 12.5 light-minutes away
- Pluto: 5.5 light-hours away
- Proxima Centauri: 4.2 light-years away



Just because you “observe” (measure / see) several events “at the same time”, doesn’t mean they all occur(red) at the same time.

Case 2c: Time Effects

Example

Arranged by order of occurrence

Arranged by order of observation

Case 2c: Time Effects

What can you do?

Collect the data as close as possible to the origin of the event you are observing.

“Traceability” of the events you are observing.

“De-convolution” of the data.

In mathematics, de-convolution is an algorithm-based process used to reverse the effects of convolution on recorded data.
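The simplest form of tracing observations back to their origin is to subtract a known delay from each observation time. The sketch below assumes every source has a known, constant delay, which is far simpler than a true de-convolution. The class and field names are hypothetical, and the delays reuse the light-travel times quoted earlier, converted to seconds.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    source: str
    observed_at: float  # time (s) at which the data reached the observer
    delay: float        # known travel or reporting delay (s) for this source

    @property
    def occurred_at(self) -> float:
        """Estimated time at which the event actually happened."""
        return self.observed_at - self.delay

# Three events that are all observed "at the same time" (t = 1000 s).
observations = [
    Observation("Moon", 1000.0, 1.2),
    Observation("Sun", 1000.0, 8 * 60.0),
    Observation("Mars", 1000.0, 12.5 * 60.0),
]

# Arranged by order of occurrence instead of order of observation.
by_occurrence = sorted(observations, key=lambda o: o.occurred_at)
for o in by_occurrence:
    print(f"{o.source}: observed at {o.observed_at:.1f} s, occurred at {o.occurred_at:.1f} s")
```

In a process setting the delay would be things like lab turnaround time or the transport time between the process step and the measurement station.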

So… What did we learn?

Blind reliance on some index value (Cpk, Cp, P/T, P/TV, …) to tell you what is going on might get you in trouble. Always:

- Make sure you understand how the index is calculated
- Use the approach fully, not half-way
- Verify that all assumptions were met

Data stratification opportunities abound. Identify them early on in your project.

A few simple rules of thumb will quickly help you determine if you have a chance of having a good measurement system.

So… What did we learn?

Further analysis of the Gage R&R data can provide you with some great insights into and improvement opportunities for your measurement process.

Data has a finite speed. Being aware of this and planning for it during your measure phase will help keep you on the right track.

Parting Thoughts

My organization doesn’t use Six Sigma. Do these insights benefit me as well?

Thank You

Questions?
