national bureau of standardsreport...nationalbureauofstandardsreport nmmject nbsreport 1103-40-5146...

NATIONAL BUREAU OF STANDARDS REPORT

6263

Draft of

Part IV (Miscellaneous Topics)

Manual on Experimental Statistics

for Ordnance Engineers

A REPORT TO

OFFICE OF ORDNANCE RESEARCHDEPARTMENT OF THE ARMY

U. S. DEPARTMENT OF COMMERCE

NATIONAL BUREAU OF STANDARDS

for

THE NATIONAL BLKEAU OF STANDARDS

Functions and Activities

The functions of the National Bureau of Standards are set forth in the Act of Congress, March

3( 1901, as amended by Congress in Public Law 619, 1950. These include the development and

maintenance of the national standards of measurement and the provision of means and methods*** '

for making measurements consistent with these standards; the determination of physical constants

and properties of materials; the development of methods and instruments for testing materials,

devices, and structures; advisory services to Government Agencies on scientific and t

NATIONAL BUREAU OF STANDARDS REPORTNMmJECT NBS REPORT

1103-40-5146 13 January 1959 6263

Draft of

Part IV (Miscellaneous Topics)

for

Manual on Experimental Statistics

for Ordnance Engineers

Prepared by

Statistical Engineering Laboratory

A iieport to

Office of Ordnance ResearchDepartment of the Army

IMPORTANT NOTICE

NATIONAL BUREAU OF STAlfittiKl«d for vto wltklA tlio C

to tdditlofiat ovoluatlOA Md rcllitlog of llili RdP^’di oltkor If

llid OfNco of tlia OIroctor, Nat

how««or, b| Ike GoveromoAt ito reproduce addltloAtl coplei

Approved for public release by the

director of the National Institute of

Standards and Technology (NIST)

on October 9, 2015

irogreM aceouAtlAg documeota

ffnalljr publleked It Is subjected

reproductloAi or opsA-llterature

ilOA Is obtaloed Ia wHIlAg from

Suck permlisloA Is AOt oeededi

prepared If tkat ageocy wishes

U. S. DEPARTMENT OF COMMERCE

NATIONAL BUREAU OF STANDARDS

(i)

;n.tC:'pi -I vii

fi#/

'• r

V 'J

:' -. , \ . i ^ : ‘imfv .,•' r...-'; ...,• r .V „ .? = 'i., .].'• ,;’'..'-j

i' - •• •.,, . • . .. : ,. : '-f!,? .. :•>'

*

\'

:.';•' ?: ' •

y. ^ ^ •;.> .• r ,< 7 :. V ; ' : , . r . i' •' '

•. i^ ' ) i

. r•

• ' ' •,'• . 7 ';

I.,'"! ' ?

'.

.. i

•s .i

^

;. H., !

,

Vi:

i'

'*,

^'

NOTICE

This report is a preliminary draft of Part IV

(Miscellaneous Topics) for the Manual on Experimental

Statistics for Ordnance Engineers.

No known inaccuracies exist in the present

draft, but improvements in arrangement and exposition

of some of the material may be made at a later date.

(ii)

V I' r

*1 ;

4

4

! r n; * ,r

Ir-ib Lj! :I‘.)-

Table of Contents

Page

1. Rejection of Observations 1

2. The Place of Control Charts in Experimental Work, 9

3. Statistical Techniques for Analyzing Extreme-

Value Data 13

4. Transformations 22

5. Notes on Statistical Computations

5.1 Coding. 2 5

5.2 Rounding 29

(iii)

4 '

I

. I

*.r; 1)'*

;' '/*/ ''j'tJ'ia ’'f

'

’"-O •iT’* J '^>Zi t i-

j n/ -.y.•.-«. ^

1

.

».'., -f- .:i

' 9i ' lui ^5^ !• -i-fjoi- :• ' .'i J ?5 -rr

' vt’

'.' %

» t .^

.

•

3 £ a- f./ i .B *i d .-r;’ sr 4 -V;

. i,c

V.:

1:1 . G

/

1 . Rejection of Observations

Every experimenter has at some time obtained a set

of observations, purportedly taken under the same conditions,

in which one observation was widely different, or an outlier

from the rest.

The problem that confronts the experimenter is whether

he should keep the suspect observation in computation, or

whether he should discard it as being a faulty measurement.

The word reject will mean reject in computation , since every

observation should be recorded. A careful experimenter will

want to make a record of his "rejected” observations and

where possible detect and analyze carefully their cause (s).

It should be emphasized that we are not discussing

the case where we know that the observation differs because

of an assignable cause, i.e. a dirty test-tube, or a change

in operating conditions* We are dealing with the situation

where, as far as we are able to ascertain, all the observa-

tions are on approximately the same footing. One observation

is "suspicious" however, in that it seems to be set apart from

the others. We wonder whether it is not so far from the others

that we can reject it as being caused by some assignable but

thus far unascertainable cause.

-1 -

X

'IKK, i i-'

i'l

-T‘

V,

1

•

"' k'a .1 V. "1 "4 Ziti -fi- , iiK ;.i'

i.'.i,

j.. i> ’-- ' ; 'i 'r

'.>

''!' >' •

^ ,.v- ;-v ,. ti: .- J- .n . . .

j;. .• J ; • K -'ev; = 4’*^ ; ; n„ m .x .i' , ^

x'i fftc( M . i .»'• . . . *f

.

..• ( ; A t : . :r

T ,. , . r, t < -f,,** “ f “

i:-' 1 •'XixC .. ' '

•, . O't"! • ' >; i i- - •' ^ B . JI .

'

i .' L '. .X-:. .,xx-yi..\X;ryl. V‘- > ^ * /;/:

' J - A . ' ^> '-.

V- U; .r ,': .'

•'' X" ;• : >:=, ; .' i :i‘ ..i - 'i ." . -;v-/v. .^;, . , v' ';'

.

''

'

j' ' v’ i ' ,,* J .0 ' ii / > \ )

''

,

' '•

. V / .. .V,

; ii J

Whether a measurement far-removed from the great ma-

jority of a set of measurements of a quantity^ and thus

possibly reflecting a gross error, should have a full vote,

a diminished vote, or no vote in the final average - and in

the determination of precision - is a very difficult question

to answer completely in general terms. Common sense dictates

that, if on investigation, a trustworthy explanation of the

discrepancy is found, the value concerned should be excluded

from the final average and from the estimate of precision,

since these presumably are intended to apply to the unadulte-

rated system. If, on the other hand, no explanation for the

apparent anomalousness is found, then common sense would seem

to indicate that it should be included in computing the final

average and the estimate of precision. Experienced investiga-

tors differ in this matter. Some, e.g., J.W. Bessel, would

always include it. Others would be inclined to exclude it,

on the grounds that it is better to exclude a possibly *good*

measurement than to include a possibly ’bad* one. The argu-

ment for exclusion is that when a ’good’ measurement is ex-

cluded we simply lose some of the relevant information, with

consequent decrease in precision and the introduction of some

bias (both being theoretically computable) • whereas when a

truly anomalous measurement is included it vitiates our results.

-2 -

^ b'7v..r;i5f •ji -•iti*'' txi*: ^-.' 'C.’tvti.i.v-IJ ::!: *tS

Z:'

r/ J' 1-; •' '• T ' V>'. -~ i, J-iiM V- > si ^ i •• _>4i at* ».• I t jU v7 *?>

l4*.-7 • ft . ;p,t'V •:- ^:i"' '-'1 t Jt v-»j r d-

s:.^^r',t cisvai; l>'.'>ou&-. .ceîiiii ' • 'ti'^ r .toa-iq 5-«j ••. «:3-.r '7'.:: f: .*? ao • i7‘ • r.

V .. i. V . W. (* . ;3’ .^!t^ I Sf:io.?M .:‘;i‘>tr.!' 'airlt iijfe •stî’^.tti 'L-‘'4:r:t

* ..’;'.f;a*'

; .vJ‘

' bep J .. ai i'liiO’tV' a*t*'j.d jb , . jL ®b,ijr:>nx Jrjffx/.'fu

i)

'>^1*^ i d l5iaoq -.ii :-j j j j : c:bivt>o*^?v cîirt :ib

#-

; u ;• > .I.r; .T . w-i6,< M i'f'w j 7/* >.. . ; UiibJ t-r^rsX-j •?ii's.^*0/;.« ! .'I , •

-. aOiExxCox‘r‘î‘*o'^

:. 't-w.'. 'i:0 "?-«?'- x 4-*.«0.r .'\ Xo^fui; c. •• c.f

.

1''

'

^raO*3t Ttwo ftlvf-'l-i^' b.£>b*j rO:S?i;- S'JC

•

;t,i ;jfy:

^^

biasing both the final average and the estimate of precision

by unknown, and generally unknowable, amounts.

There have been many criteria proposed for guiding

the rejection of observations. For an excellent summary and

critical review of the classical rejection procedures, and

some more modern ones, see ref. [1]. One of the more famous

classical rejection rules is ”Chauvenet*s criterion*’, which

is not recommended. This criterion is based on the normal

distribution and advises rejection of an extreme observation

if the probability of occurrence of such deviation from the

mean of the n measurements is less than l/2n. Obviously

for small n, such a criterion rejects too easily.

A review of the history of rejection criterion, and

the fact that new criteria are still being proposed, leads

one to realize that no completely satisfactory rule can be

devised for any and all situations. One cannot devise a cri-

terion that will not reject a predictable amount from endless

arrays of perfectly good data. The amount of data rejected

of course depends on the rule. This is the price one pays

for using any rule for rejection of data. There is actually

nothing better to use than the judgment of an experienced

investigator who is thoroughly familiar with his measurement

process. For an excellent discussion of this point, see ref.

[ 2 ], Statistical rules are given largely for the benefit of

-3 -

; ;.v*' a.

^

..v'S.?-tiup#ni-' •"

*i .r 'a.»i.:û ' \ t ^...' • . -y\ tnr . • -a'

' - ’'

'-..r*-.*'

.-lofti t r. vrr\ • o j mfl

’^.,.v.Jl^•'•vc;xî* ;4 i8 ': . anotl /rv% -;>- • .• uiJfOyi.r, * w'J

'.r: ^ ^ ^6 Ctf!Jt [* .il/ivT *.^v> •,.,. nj : ?‘ t,'* . ,;*'*? orSr-jr;

iOffiivf , t'f.-ri y ; h: r a " =

..isv -);J . ;!^'v ,- i.>>' S t ' '3 J I'l. > A 1“^^’ J , h-'rb ',' 'Mmi.' . '^T ^ C .7

•> • .^ K ^ ..yi-'

r lA •rrcAr:' *,::*

îij ;'?0''/ .71 r; , Aiv?'- • V> ? ^»o/if»'i.'X£»’v )/• r, -^ tjtiiovido'xq li

•;.^vv.

' .i '/f1 n'^4* fci .4 r^JS©,‘T' ., .-fî’f . lo ni'^P-

*.v;I texid > .' eiJ", v,.t-‘ 'i . >"0 .in^::'

' .. «; doi: :*:/.

uJ ,i-. • u.iv^vf,t

'

^

‘

\B’X ?crr^;/o.‘dlixt . y; 7.r+*;.7j;i:.a xfi* / -. f.b

’{ I •o*’ i

'

•; c 'i >;; jon ; A rroi

*

i.a,

,i'-*.1 • -

.• ;• \ •./ • i *i .

.

V» gc.- .; v ic

^.. ‘ v

inexperienced investigators, those working with a new process,

or those who simply want justification for what they would

have done anyway.

Whatever rule is used, it must bear some resemblance

to the experimenter’s feelings about the nature and possible

frequency of errors. For an extreme example - if the ex-

perimenter feels that he may make one blunder in twenty, and

he uses a rejection rule that throws out 3 in 10, obviously

his reported data will be ’’clean”. The one sure way, and

the only sure way, to avoid publishing any ’’bad” results is

to throw away all results.

With the foregoing reservations, the next two sections

give some suggested procedures for judging outliers. In

general the rules to be applied to a single experiment reject

only what would be rejected by an experienced investigator

anyway. (See section 1.2)

1.1. Rejection of observations in routine experimental work.

Probably the best technique for detection of ’’errors”

(e.g., systematic errors, gross errors) in routine work is

the use of control charts for the mean and range. Detailed

methods of application of control charts have been described

in many places and will not be repeated here - see section 2

for appropriate references.

-4 -

. ' vV

»iOV

: I

kifir.

r .iMJt;- . j-:;

o'd:’ • -..'V

.^». i A X '

-;a .-f V

• '/ ,'(-f • -- V • > . > 'i - •;,

::t Hkv.S7.S- -r:.K.H ;

-. .• ..v:

>; -•ri ;=

; 'v" .: ?: - -

. .-, 7 • . i i . ; L

’-..fi-•, £iV, , ’ '.d:s iijx*"

ai.:a C: •; D : C, od r^:-" i

V,; -i od '* "fjr ,/ .

I i i?" ." c ->S

. o:?.;0 r J .:; i

.V . “f< r . - . .

i3 O'? Hi

-.«• 7 xO ’;S. r'-•«*<

;,V'’t 'ilX- .. •

•-'

.' 'iT >/i f':c’iT-' or

^iio Ic

^ AOJk&

1.2 Rejection of observations in a single experiment.

We shall assume that we have n observations ranked in

order from lowest to highest (X^ < ^2 * * * — ^n^ *

observations are from a population which has a normal dis-

tribution with mean |x and standard deviation a. We shall

discuss four cases.

Case (1) - We do not know anything whatever about the

population from which the data were taken, and must make our

decision on the basis of the n observations.

Case (2 ) - We do not know or a, but have an estimate

s of a which is independent of our present sample. The value

of s is estimated with f degrees of freedom.

Case (3) - We are willing to assume a value for a, the

standard deviation of the population from which the observa-

tions were taken.

Case (4) - The population from which the observations

have been taken is well known, and in particular we know the

values of }j, and a,

CASE (1) : No prior information about the mean and standarddeviation

i) Choose a, the probability or risk we are willing to

take of rejecting an observation which really belongs

in the group.

ii) Look up ^2-o/2Table XI. (NOTE: If the procedure

is such that we would consider rejection of obser-

vations which were extreme in one direction only -

- 5-

*

i:'\ V.

e.g., where we would reject values which were too

large but never those which were too small - then

we should replace a/2 by a. Our choice as to whe-

ther to use a/2 or a must be made on a priori grounds

and never on the basis of the data to be analyzed)

.

iii) If 3 < n < 7

8 < n > 10

11 < n < 13

14 < n < 25

Compute r^Q,

Compute r^^.

Compute ^21 ^

Compute

where r. . is computed as follows

:

îjX

îs suspect X

îs suspect

^10 (X2-Xi)/(X„-Xi)

^11 (X2-Xi)/(X„_1-Xi)

^21 (X -X „)/(X -X„)' n n-2 n 2 (X3-Xi)/(X^_1-Xi)

^22 „)/(X^-X„)n n-z n o (X3-Xi)/(Xi^_2-Xi)

iv) If > r^_ a/2>reject the suspect observation^

otherwise retain it.

CASE (2) : Mean unknown, estimate of a available from previoussample

i) Choose a, the probability or risk we are willing to

take of rejecting an observation which really belongs

in the group.

ii) Look up (n,f) in Table IV. (See Note in step ii

of Case (1) ) .

-6 -

lA.

f*

di

Aw K ;

%

. '"-'A'. • j•• ^ .

I

‘.*

4^ /

i»'t

iii) Compute w =

iv) If w < reject the observation which is

suspect, otherwise retain it.

CASE (3) : Mean unknown, a known

i) Choose a, the probability or risk we are willing

to take of rejecting an observation which really

belongs in the group.

ii) Look up (n,oo ) in Table IV. (See Note in step ii~ cx

of Case (1)).

•H•H•H Compute w * q, a.

iv) If w < reject the observation which is

suspect, otherwise retain it.

CASE (4) : Mean known, o known

i) Choose a, the probability or risk we are willing

to take of rejecting an observation which really

ii)

belongs in the group.

X 11Compute a* “ (l-a/2)‘^"" , (See note in step ii of

Case (1)).

iii) Look up z, , in Table Ib,1-a

iv) Compute a * a + az, ,1-a

v)

b = - crz^_^.

Reject any observation which does not lie in the

interval from a to b.

-7 -

References

[1] Paul R. Rider. ’’Criteria for Rejection of Observations”.Washington University Studies - New Series, Science andTechnology - No. 8, October 1933.

[2] E.B. Wilson, Jr. ”An Introduction to Scientific Research”,McGraw-Hill Book Co., Inc., New York, 1952.

Additional Reading

[3] W.J. Dixon. ’’Processing Data for Outliers”, Biometrics,Vol. 9 (1953) pp. 74-89,

[4] A. Hold. ’’Statistical Theory with Engineering ApplicationsJohn Wiley and Sons, 1952, pp. 333-336,

[5] F. Proschan. ’’Testing Suspected Observations”, IndustrialQuality Control, Vol. XIII, No. 7, January 1957,

i'.

'; '.',

ixjLJtii:5,A08 .0-^ no*:4c‘uI^^Tj

,*’p..a'-.)l J- •'. ;in-i.-;.to jallS|fi A rf'^i.w xtooilT-f ih.il » '"Vl’d 'J % b>'IpB ..A.

d€'; : -•€t£

' . qq ^^cP I ^ 3 .o b aii , v o CX'

' n do L.

i.r ' i.T'. -.-^ f!VT:-i'..ad

2. The Place of Control Charts in Experimental Work

2.1 Primary objective of control charts

Control charts have very important functions in experi-

mental work, although their use in laboratory situations has

been discussed only briefly by most textbooks. Control charts

can be used as a form of statistical test in which the primary

objective is to test whether or not the process is in ^’statis-

tical control”. The process is in ’’statistical control” when

repeated samples from the process behave as random samples

from a stable probability distribution; thus the underlying

conditions of a process ”in control” are such that it is poss-

ible to make predictions in the probability sense.

The control limits are usually computed by using for-

mulae which utilize the information from the samples themselves.

These limits are placed on the specific chart and the decision

is made that the process was ”in control” if all points fall

within the line. If all points are not within the limits

then the decision is made that the process is not ”in control”.

The basic assumption underlying most statistical tech-

niques is that the dataare a random sample from a stable prob-

ability distribution, which is another way of saying that the

process is in ’’statistical control”. It is the validity of

this basic assumption which the control chart is designed to

test. The control chart is used to demonstrate the existence

of statistical control, and to monitor a controlled process.

-9-

*-•• .•"r~’‘T!?! ’ .**'

*5_

'u.' ; f •' :; ,V . X’:' -• ’•'i ^ * •' >i" • .' 1' ;• • V

.T^qxo liJt stivi:i:)atix v"i^>/ iVJiii .'. nr: . ‘c‘ fao

mJSr . >iJ Rt.t iiJc bzaj *i ' >il j ' -..-o-i > - *‘. ^•*.‘‘.v I^Ja5^'••'•= >

'

'•i' 'j' ’)!'

.

^ j'.

Vi' < '• */'• '

.^' Kfc s . -:m' ' - .'

'

.

:,c:rj-jio"> .mTivoiif^xtxf Jpooi • ,4 '^.1 Jic i b -i.

'

//f .l:.U wjiirlw :; ;' •

'i';rei -t > ' r f • 1 r • : ; -M* '^rnv

~i- - y '> M ' - t ;•- rf"' >;^ :. -.-1. ' -• . .

' tjto -^o . ' t -xvtiJv riy

-w>:v/ . - ••'.> ;. •* nl: px pfiMC>-tq •. >..

.X '!; j v; .70/1 o'*p '-'tniio.- J. ir, If J-xh^l\7

f '**

'

,

'' io I * 7or; 7o;.i co ^ o'57 - oo/jrj al , no ; ofliJ r;*jc!l.

•'

» ’ Jl•

’ i’

. :-

-*•.>91 I li.— * ^ i j - 7ev. > ‘ifj i; v i -n next jo • o'. •'•' ,;'i

.. i(*.•• •

:< L' ; Cl jsiv.c* fr^twurfJi n o^-i J'.'-.tn •£>n?>.ff;..qr.i-,c' fr^lwurfj’i n Ov-i

;il j :t j O, ''•('ii.f oi' •» a 1: il ^ i 1 7 ,;Xk:» I j u i

'

q:i:xb\'."7 ->iii cii II . ” to't. :'->n J • ' t .t . .1,4'-7

L i r

c H

r.*.: j V/’ vb u ’’O'f.t n

As a monitor, a given control chart indicates a particular

type of departure from control.

2.2. Information provided by control charts

Control charts provide a running graphical record of

small subgroups of data taken from a repetitive process.

Control charts may be kept on any of various characteristics

of each small subgroup — e.g. the average, standard deviation,range, proportion defective. The chart for each particular

characteristic is designed to detect certain specified de-

partures in the process from the conditions assumed. The

process may be a measurement process as well as a production

process. The order of groups is usually with respect to time,

but not necessarily so. The grouping is such that the members

of a group are more likely to be alike than are different groups.

Primarily control charts can be used to demonstrate whe-

ther or not the process is in statistical control. When the

charts show lack of control, they indicate where or when the

trouble occurred. Often they indicate the nature of the trouble,

e.g. trends or runs, sudden shifts in the mean, increased vari-

ability, etc.

The control charts, besides serving as a method of test-

ing for control, also provide additional and useful information

in the form of estimates of the characteristics of a controlled

- 10 -

., . f.

’ -V

'',* > \\"

V, ,.

r 4, ,^-^iJ.Ar'; ',

« ^ • '4

: >T- i.-. X» ' v>L>

£*,' - .V.a *•••»»'* *

V': ’'J ^ Y ja- .a-

- . •

f' '• r i

f'

! ''r > >j>irT

•t * '=K

Vt,

S . ^

u«>T^ ;

r .’i urt •

i ''

t I' r»-v/ ; rlKi^ ^

-1. i J

/ 'iw ^'-

^ ».. Vo i j- 'i I-. *..y. J‘

f - ii*

:,o>.

> 'J 1 *>

;

- '

«

a 'A " \

‘' ' ' *•

ihr^B ',4i';-T -

• '

y'.;-'

;^V fxi V

.

:^- •a r • V

- ’? « / jr6"-

C, v..

* •.

' :7 *T Xi-’. =."

rrx . ft ' Ti J^iVv^

j \ -V«. 'Vr._

Kii!v»ai. a: t . i '

t.' c-

,.4 T'a’M 1 \ 7 ['‘

-fBrniTiu

O'iQ ^*d 1 ajiT */c '' . .

•^- •? :Av'»

,. :p ,,.-4 V-

8iL«it

K;‘ -

-.i'ae.i r-

•\.i :>tu-

1

ii;

i. ^ ' L-

i•'*:

MiO-T j )^16

. i tii,,

•ib:» v^.’

i o’^J ifTo:>

1

«

dLe^

I. V ^ -> '. . •'*

'V ;'

. i

;.: a*

^'"r'l

''1

\.* M *

,.. S':

process. This is information which is altogether too fre-

quently overlooked. For example, one very important piece of

information which can be obtained from a control chart for

the range or standard deviation is an estimate of the varia-

bility (a) of a routine measurement or production process. It

will be remembered that many of the techniques of Part I are

given in parallel for known a and unknown a. Most experimental

scientists have very good knowledge of the variability of their

measurements, but hesitate to assume ’’known a” without addi-

tional justification. Control charts can be used to provide

the justification.

Finally, as was pointed out in section 1, a control

chart is the most satisfactory criterion for rejection of

observations in a routine laboratory operation.

2.3 References

The basic references for the use of control charts are:

1. For the underlying philosophy of statistical con-trol - Shewhart, W.A., ’’Economic Control of Quality ofManufactured Product”, Van Nostrand Inc., New York, N.Y, 1931.(The first and still a very significant book in the field).

2. For the details of application - ”ASTM Manual onQuality Control of Materials”, available from American Societyfor Testing Materials, 1916 Race Street, Philadelphia 3, Pa.,1951. (This manual gives examples of various kinds of charts).

Some recommended general texts on quality control are:

3. Cowden, D.J., ’’Statistical Methods in QualityControl”, Prentice-Hall, Inc., Englewood Cliffs, N.J.,1957.

- 11 -

i 'f'.'-f =\> fv V ^ -t ^ ,.. ' ^ - L':

JJ '’ J r:-*' ’.'7.VL Oi7‘. . Z:- r^-^‘ '..=?, ;.. :0^ ».»,' <

jJ > , ••'> Y t ^ " *

Z''

* V *• -’ •X-. „'r;:^0 ^

r

* •''"i

'f i '- '•*

•‘

, ., ‘-r: t‘ , 1 "i

., Tl, •> ” 5 -'. - c 11^. . .- :: ", 'J - . f> . • -nf' ... • ..

.“ ' ..-:iA ...: - ,. y»

, r‘H *>’.(.4 'i 'i..\ J :. ^ ‘,:j .t. ..it

j.,. f. jS; . ,'•

ar^X V - ^*,. -• :,-.y "--i"' .‘U'y

n ,: y) -„ ..- Of /.. '" > i i . \ i

c.I'-iirf 'i', ..r-r}iv Kii; X.-r v ,:• u . ’^^

•';;s.;^ ,. Jr- 1 v ’"-i Krt'o^L.; ^

C' Z-:\ >.:: ' 'V . :,t,'‘"

*. 1 5 > LJ r

t < i :f»*‘ . .‘ "X ^ X;

u It* _,.X y.-.^ X : *?:. .aI j iv^'- X.-Jr^Ak '., •' •'•. , '{ I T » •

'

•.; tt*'- \ 'nl 7m':>

‘ •;*«'‘‘^ I;V;f .• > ^*vX n .-•: .: :'#>••

.

M.^ ‘ • A »T ^ ^ • t ...

:• X. i>y\*n:rio vu ^.-^u f ^ ' .:•'••-: . '• ;^.. 'Nv Y/qi ::V-^ X': • .I.-'’'’ . V -:.'•

. ,f, w.; .'i- f«,

.. .'a•’ r.; r' .>v^ ^ »

;X. V ^ ." .= : i',-. . . % , ' A

‘ < t1 y. i. S'

' - V r- .

:

•.:-X

rt i 1. : 1 r.

:

•• •

f.

' • - • • -i ?

'

t %i/ *• Vi:.' ^ ?r'X'“ c : .s-’

a .i ^ tiw^•. .s'ili.JtO ho • - •• ' >; ;: t 2 , . • , . . : \ 'J i. . • . -

'

J^Xc-'-. :'5'

4o Duncan^ A.J. ’’Quality Control and IndustrialStatistics”, Richard D. Irwin, Inc., Chicago, 1952.

5. Grant, E.L. ’’Statistical Quality Control”, SecondEdition, McGraw-Hill Book Co., New York, 1952.

6. Actual examples of laboratory applications in thechemical field can be found in a series of comprehensive bib-liographies published in Analytical Chemistry:

a) Wernimont, G., ’’Statistics Applied to Analysis”,Anal. Chem. 115, (1949).

b) Hader, R.J. and Youden, W.J., ’’ExperimentalStatistics”, Anal. Chem. 120, (1952).

c) Mandel, J. and Linnig, F.J., ’’Statistical Methodsin Chemistry”, Anal. Chem. 2S, 770 (1956).

d) Mandel, J. and Linnig, F.J., ’’Statistical Methodsin Chemistry”, Anal. Chem. 30, 739, (1958).

These 4 articles are excellent review articles, successivelybringing up to data the recent developments in statisticaltheory and statistical applications which are of interest inchemistry. They contain extensive bibliographies, divided bysubject matter, and thus provide means for locating articleson control charts in the laboratory. They are not limited tocontrol chart applications, however.

e) For a special technique with ordnance examples,see: Grubbs, F.E., ’’The Difference Control Chartwith Example of its Use”, Industrial Quality ControlVol. Ill, No. 1, July 1946.

Industrial Quality Control , the monthly journal of theAmerican Society for Quality Control, is the most comprehensivepublication in this field.

- 12 -

3 . Statistical Techniques for Analyzing Extreme-Value Data*

Classical applications of statistical methods, which

frequently concern average values and other quantities fol-

lowing the symmetrical normal distribution, are inadequate

when the quantity of interest is the largest or the smallest

in a set of magnitudes. This is the situation in a number

of fields, in many of which applications of methods for

dealing with extremes have already been made. Meteorological

phenomena that involve extreme pressures, temperatures, rain-

falls, wind velocities, etc., may be treated by extreme-value

techniques. The techniques are also applicable in the study

of floods and droughts.

Other examples occur in the fracturing of metals, tex-

tiles, and other materials under applied force, and in fatigue

phenomena [1,5,6]. In these instances, the observed strength

of a specimen often differs from the calculated strength, and

depends, among other things, upon the length and volume. An

explanation is to be found in the existence of weakening flaws

assumed to be distributed at random in the body and assumed

not to influence one another in any way. The observed strength

is determined by that of the weakest region - just as no chain

is stronger than its weakest link. It is thus apparent that

whenever extreme observations are encountered it will pay to

consider the use of extreme-value techniques

.

*) This section is adapted, with permission, from [10] and [14].

- 13 -

y*'.y t i ''* '. >«S ••

“ . ^ --V >S .^;: "M.a ii

*5*; ;'i ?.?.. :»

- -:

, ,V- - • r-^' ‘

i3^f f 'J-K 3:j^.,

^

.-v-X

7

* '» "ir'fx’ 'T*‘ i '^ I'-t'«

1 ^

Xn* i xl ..’^ . J 3, ' • . ,• [ }

.,' - t - .v/W'-ix

3,

i2 . -r. -'iC>'< XO'X"’ .," iy.’r^ xf, ciCii'S 07 ''

"i M • . V ‘.*

? ->-.(.

. .> t » >i'* V* ^-r- ,- 1,'. V,

'i-,,'T7 s-,;i I

:

i i i

j.-» je

f.*•

A

Use of Ext reme-»Value Technique - Largest Values

A simplified account is given here. For the detailed/

theory and methods^ consult the primary references^ [l]^ [2],

and [3], References [4] thru [15] give some selected appli-

cations. More extensive lists of references can be found in

[l]j [3]^and [9].

Figure 1 illustrates the frequency form of a typical

curve for the distribution of largest observations.

Co

c

CQ

TO ' ,

''•

•I,

'''

‘i u...I J>4 (

a1

n? t [•-i; stf - B'^-I nao -ot^M ''' ' J";

3tytu- I

'

i.,>i| ^

.";'• .',• ' V ' .V .

Bi"? h ’X' *:"> ,ncJ'ff*dii':Kd i.iytj edi Jk J

'

' .j ' V

’'’

'/,

£'[ ':r}s*r cf;&a

while very high ones do occasionally occur. Theoretical consi-

derations lead to a curve of this nature, called the distri-

bution of largest values or the extreme -value distribution.

In using the extreme-value method, all the observed

maxima, such as the largest wind velocity observed in each

year during a fifty-year period, are first ranked in order

of size from the smallest to the largest. These values are

given ranks 1 to n and are then transformed into cumulative

frequencies by the relation = i/ (n + 1), where i is the

rank of the ith observation counting from the smallest. Thus

a plotting position is obtained for each observation. The

data are plotted on a special graph paper, called extreme-value

probability paper, designed so that the "ideal** extreme-value

distribution will plot exactly as a straight line. Consequent-

ly, the closeness of the plotted points to a straight line is

an indication of how well the data fit the theory.

Extreme-value probability paper has a uniform scale

along one axis, usually the vertical, which is used for the

observed values, as in Figure 2. The horizontal axis then

serves as the probability scale and is marked according to

the doubly-exponential formula. Thus, in Figure 2, the space

between 0.01 and 0.5 is much less than the space between 0.5

and 0.99. The limiting values 0 and 1 are never reached, as

-15-

•If -J-ri'-.: .•, i>*i ') . t . • .' *

ly'•

•,> -i .

-

I d ‘.r-l.t X ' *t • C ' vV

‘'’•j.v>-. n/ . X'' -'

^i«>t>T. . r,r

y ;\^y '• . ‘ ; . T.(,’ r-''. . \::.x thi

i v:H^; ., /f*

»,wi< * '

"'^

^:t

' »i r :? 1 '

I f 1 ’. ' . . .- y* y'j ':% . .'

/v.-. ; .-•'/"t . ,- '.0 ' Hi'

-_ U ' yy'fsiJiq J G

'

• '-,i

‘

V >i-:

'

1 »

• .j

: ViX ! r.

'

X* -^y-

:tj. it? jrfo*! •. : ,.• \ :' X .^'-k : ^••*'v'i,i' :?

i>fi't -vZ^CiJ \ .,;

.< ^jd i >s.;

J'A.? .,"

':• * .-V-i,';.

^:fr':r- • : ^ , ,.T..;;,;>

-ritfj'«?d - : - y , . . .- V,

« ! .>a« 0 ^ .* f' ’

'

'i

i f IT-'

' 1. . .?;.' ’ «. • -

".fj '’ . V •'. L&a-fi’J .' .. ,' !'\v

/-t ^ o v . .

< ' ..

^

^ ,. c X ^ . i jdXxift

'H: .*.« .• . i • .} “ -vfy'afv^ .. }; u-ty ••••£•

is true of any probability paper designed for an unlimited

variate

.

An extreme-value plot of the maximum atmospheric

pressures in Bergen, Norway for the period between 1857 and

1926 (Figure 2) showed by inspection that the observed data

satisfactorily fitted the theory. Fitting the line by eye

may be sufficient. Details of fitting a computed line can

be found in [l]. From the fitted straight line it is poss-

ible to predict, for example, that a pressure of 793mm corres-

ponds to a probability of 0.994, that is, pressures of this

magnitude have less than 1 chance in 100 of being exceeded in

any particular year.

In studies of the normal acceleration increments ex-

perienced by an airplane flying through gusty air [9, page

394], an instrument was employed that indicated only the max-

imum shocks. Thus, only one maximum value was obtained from

a single flight. A plot representing 26 flights of the same

aircraft indicated that the probability that the largest recorded

gust will not be exceeded in any other flight was 0.96, i.e.,

a chance of four in 100 of encountering a more severe gust

than any recorded. A more recent study [13] presents refine-

ments especially adapted to very small samples of extreme data

and also to larger samples where it is necessary to obtain the

-16 -

I::-

:c«> li • * .' - •“ ' • Vr ‘

^' "

'

' .' 5 . .

» iX .'* I- ' '* >.-'!

.f--; . ’ xwsintTrXi .1

:

i . .•

V •

,

'• ' t' '* . •” !C t ' ./ v , j ^ A.

* ySf'^

\ 0 f.r% # ih-*;. / i 'I O'- / r*

t

Pressure

in

mm

greatest amount pf information from a limited set of costly

data

.

Return period

Fig* 2—Annual maxima of atmospheric pressures, Bergen, Norway, 1857-1926,

-17 -

A'.'.'vH

I'jl f J/V,!lV f I .. .^J7

• f*...,, 0i>. i ? 5'*^ a I:

J

•r-

''N>,,

'-..

i

1

\

V.'

i

" r.,.' :: •• iT .•

,

V- » .V i$: i a.i.

\

J1

-I

!

)

\

it

( jj r.’

I

#

. JSt,.;.

^ fi, \

\b t.

V

)

•'i

('

;?^V f.ri-«« JL^f,

a * ^

Smallest Values.

Extreme-value theory can also be used to study the

smallest observations, since the corresponding limiting dis-

tribution is simply related to the one previously given.

The steps in applying the ’’smallest-value” theory are very

similar to the largest-value case. For example, engineers

have long been interested in the problem of predicting the

tensile strength of a bar or specimen of homogeneous material.

One approach is to regard the specimen as being composed of

a large number of pieces of very short length. The tensile

strength of the entire specimen is obviously limited by the

strength of the weakest of these small pieces. Thus the

tensile strength at which the entire specimen will fail is

a smallest-value phenomenon. The smallest-value approach

can be used even though the number and individual strengths

of the ’’small pieces” are unknown.

This method has been applied with considerable success

by Shigeo Kase in studying the tensile testing of rubber [11].

Using 200 specimens obtained so as to assure as much homogeneity

as possible, he found that the observed distribution of their

tensile strengths could be fitted remarkably well by the extreme-

value distribution for smallest values. The fitted curve given

by the data indicates that one-half of a test group of specimens

-1 oo —

'i Ot •-P. \i

.ci --

X] ,>*40

.; -'.

- j

\ ..> K ;.;:u ..

.-:. .•. r*HT .9

^ : i .- j..' X"y<

1- - %-". 'X-QQ . a •

;.^".J' •> ' ; ...-

. ;

’i O :;. fX; t .

aa:^toi^€ i:

‘ v' 7 -:'.

'

.-

-r :•

..«• ' . .' .i '. '-•.!

• ' lo

•".•. *ll

ioiiiw Ti

f. t'-.

.IJ ij!

7 'c*: >' irJ r B.'l- "

^ i.jA J. jC i-

»OJj f«OV®

f^-T -*!>i q i Xiwr

! j fy.

ic': i >/', '*\r K

>c iiSfw h

[ ."J-'i i< : 7 -^

.: ;, • ?-• -3 v r

may be exi>ected to break under a tensile stress of 105 kg. /cm. 2

or more, while only 1 in a thousand will survive a stress ex-

ceeding 126 kg. /cm. 2 .

Missing Observations .

It has been found that fatigue life of specimens under

fixed stress can be treated in the same manner as tensile

strength — by using the theory of smallest values. An ex-tensive application of this method is given in [15].

In such cases and in other situations, tests must be

stopped before all specimens have failed. This results in

a sample with missing values — a ’’censored^’ sample. Methodsfor handling such data are included in [15].

- 19 -

References

Primary references .

Jl] E.J, Gumbel, ’’Statistical Theory of Extreme Values andSome Practical Applications”, National Bureau ofStandards Applied Mathematics Series 33, 1954, (U.S.Government Printing Office, Washington, D.C.).

12] National Bureau of Standards, "Probability Tables for theAnalysis of Extreme-Value Data”, Applied MathematicsSeries 22, 1953. (U.S. Government Printing Office,Washington, D.C.).

[3] E.J. Gumbel, ’’Statistics of Extremes”, Columbia UniversityPress, New York, 1958.

Selected references on specific applications .

[4] American Society for Testing Materials, ’’Symposium onFatigue with Emphasis on Statistical Approach - II,”ASTM publication 137, Philadelphia, 1952. Discussionsby E.J. Gumbel on pp. 22 and 56.

[5] B. !Epstein and H. Brooks, ”The Theory of Extreme Valuesand Its Implications in the Study of the DielectricStrength of Paper Capacitors”, J. Applied Physics,Vol. 19, pp. 544-550 (1948).

[6] A.M. Freudenthal and E.J. Gumbel, ”0n the StatisticalInterpretation of Fatigue Tests”, Proceedings of theRoyal Society A, 216, pp, 309-332 (1953)

.

[7] A.M. Freudenthal and E.J. Gumbel, "Minimum Life inFatigue”, Journal of the Am. Stat. Assoc., Vol. 49,pp. 575-597, (September 1954).

[8] Y.C. Fung, ’’Statistical Aspects of Dynamic Loads”, Journalof the Aeronautical Sciences, Vol. 20, pp, 317-330,(May 1954).

{9] E.J. Gumbel and P.G. Carlson, ”Extreme Values inAeronautics”, Journal of the Aeronautical Sciences,Vol. 21, No. 6, pp, 389-398 (June 1954),

-20 -

1

^ ' 1^ •» :

.

I /^n‘- f __ ''

...;- V-. v>

i. .S . •••*'

'V'

v;;,'•. i$ j > 1 T -f ’ .f ^qq . ^ .- . pv

?- >

•. i

•

•’ -' ••'

,

*"..-! H r

.

.»

.. ' «*• ^^4 ? ' •' -' •*

.

.. a- ; , ?»

* .• .'

..r!

-. ? •• iaiU -' ^'.fy;

^.;a. _.^.Ky vi/'V-u ‘'-oi!, ’':

i ,?>• .-jXs\- .. »;'r- -•.^: JL 0 'i i, >^l

'

• ;rvTir'>f,i.- I'.'pP'*'.

,' /.,0 ?- ' '. v'- _•

#’

X •

yf.'» *-’

3 C'

V*

[1

#

[10] E.J. Gumbel and J. Lieblein, ^’Sorae Applications ofExtreme“Value Methods’% The American Statistician,Vol. 8, No. 4, December 1954.

[11] Shigeo Kase, ’’A Theoretical Analysis of the Distributionof Tensile Strength of Vulcanized Rubber”, Journalof Polymer Science, Vol. 11, No. 5, pp. 425-431,November 1953.

[12] Julius Lieblein, ”0n the Exact Evaluation of the Variancesand Covariances of Order Statistics in Samples fromthe Extreme-Value Distribution”, Annals of MathematicalStatistics, No. 2, pp. 282-287^ June .1953,

[13] Julius Lieblein, ”A New Method of Analyzing Extreme-ValueData,” NACA Technical Note 3053, National AdvisoryCommittee for Aeronautics, January 1954.

[14] Anonymous: ”Extreme-Value Methods for Engineering Problems”,National Bureau of Standards Technical News Bulletin,38 , No. 2, pp. 29-31, February 1954.

[15] Julius Lieblein and Marvin Zelen, ’’Statistical Investigationof the Fatigue Life of Deep Groove Ball Bearings”,Journal of Research of the National Bureau of Standards,57, No. 5, November 1956.

-21 -

f

,h < -I

.,. , .‘'•/ J i ;•, • : -jV

,.. jq: J: .. ..

J*

‘ ;'v:ji% , ,1:: - \/• »•"

!{fe: ^vv' ''i jbiJ ' .‘Ji’A 'y « itU^. f :

'

r;:/ :{^>, 'SXj}.‘

.' ^'

.. t'

,( . . < tr‘: . **

-*jLi ^^

^Jt •

. >. j.^

. ^er1,

'',> ••

. V-/*" '-A . . . » • 1̂

• X, . td y.t.’j ,y%v '! 1 '"» 'r-iilO,' .y’.' / ?»

.v*y•'l_^,Vv,' ,y ^ 'A- "'' 'i' It i J t y* * ,a•‘H "-^,4i

.“''ty. 1 ^'"

.

V |i f '• •

/ .... .. n.

* 'i..>i jU .'. : '’•f .* * «

'

,., -.1, ’ » '

,CX''}4i»

r..' . .'?vy#:T =j^ tQ ' ,:' X -.. ,' *' r ;V> •

' i

m

.

+ :

4. Transformations

Statistical techniques have to be based on some

assumptions. There is often the assumption of normality;

in some techniques there is the assumption of equal variance

among groups; etc. Analysis of variance tests (in Part III)

depend on assumptions of normality^ additivity^ indepen-

dence and equality of variance. Real-life data does not

always fit the assumptions required for commonly used methods

of analysis. Where this is the case, a transformation (change

of scale) applied to the raw data may put the data in such

form that a conventional analysis can be validly performed.

Many physical measurement processes produce normally-

distributed data. Some do not. If the data are definitely

known to be non-normal, there are distribution-free tech-

niques available (see Part I, section 2.7). If, however, a

simple transformation is known to normalize the particular

kind of data, one might prefer to use normal techniques on

the transformed data, rather than use distribution-free

methods. Certain kinds of data, for example, are pretty

definitely known to be approximately normal in logs, and the

use of a log transformation in these cases may become routine.

-22 -

-.-fW

'V' \KK'.

' ,.y.

;

'

f .» . uff.

jtiK i-j: }:.(.. I-' 9»i Cit. -»•. *-,

V tV •i" Ai>I

„ . •di:;'".' frcO

t ^vt

>*:3X*'X‘v - ^ Tb. ft‘0':r'4.o;r .••:«. etS'^ -s-i

> •-'' ,',u ..ij v-.x, ., ;«< >.7t.. -- ..c's;:

, r i :.r‘/ .Iv.- ; : tt. I ^ '%'^tjkv

:'*.' .J8 .f

iti-t lMai» . ' ...;

< I !:

aO h'VI:• •:

i".

i^.'i 1 f’

'l .

’S^ u

,\ • ‘I :'.K ^'. . ., j

1 ). I*:' A

' I. tv',

f .K v!'i' ,^ :V..• J • i . tf}' '':l ,

' 0 ^'‘ ., . V ./Tj 4 ' • «

'J ':'' i'•" -

'

••'•'i "r.i •\.if:ii' -

.

'

• ..Tv— -J-

. L‘

J.*

‘.'^>

~ ' “» "T \'

< ,) ' , . --rvY :

:. »s ->:'f jj'XJ'Ti€-j, V^' .'''-.

r

) \:,'i;, :_ f. >-. ;*;•;••. a>.\ '

1‘

h^: ' ‘ Vu . 1

t -:' iji- -s 1 ,

W f. y,4.. ^

'. ’,

.

,;* ' * JL

Transformations are also used to achieve equality of

variance among groups, and quite often both equality of

variance and approximate normality are achieved simultane-

ously. Transformations are not recommended for uncritical

use by the amateur. From small amounts of data it is often

not clear which transformation to use and it is not always

clear whether or not the transformation has helped. Graph-

ical methods can be used to examine the effect of transfor-

mation see, for example [2]. The table following, taken

from a larger table of Bartlett [1], shows several frequently

used variance stabilizing transformations.

[1] M.S. Bartlett, "The Use of Transformations", Biometrics,Vol. 3, No. 1, 1947 (pp. 39-52).

[2] A. Hald, "Statistical Theory with Engineering Applications"John Wiley and Sons, New York, 1952.

-23 -

^ \*

'*>^^,U.-ti-^lH: '5'JJ4^i'^jiV’

-'?k£C3B,1 I U4tt r ^ r^ 1t'3n;«

of

Transformations

-24 -

5 . Notes on Statistical Computations .

5.1 Coding in statistical computations . Coding is the term

applied to arithmetical operations on the original data in-

tended to make the numbers easier to handle in computation.

The possible coding operations are:

(a) multiplication (or its inverse, division) to

change the order of magnitude of the recorded

numbers for computing purposes.

(b) addition (or its inverse, subtraction) of a

constant - applied to recorded numbers which are

nearly equal, to reduce the number of figures

which need be carried in computation.

Particularly when the recorded results contain non-significant

zeros, (e.g., numbers like .000121 or like 11,100) coding is

clearly desirable. There is obviously no point in copying

these zeros a large number of times, or in adding additional

useless zeros when squaring, etc. Of course, these results

-4 3could have been given as 121 x 10 or 11.1 x 10 in which

case coding for order of magnitude would not be necessary.

The purpose of coding is to save labor in computation.

On the other hand, the process of coding and decoding the re-

sults introdaces more opportunities for error in computation.

The decision of whether to code or not must be carefully

-25-

jxl 0jj>rtr-ft *’. ?*fft r-;. ;-.». -.jT t'vhao:-

L" (•'uc.X'-. ' ,;. it’% i-:, *ioJ; Ic; -* * r jui ( •?)

bci(J3‘'j.w".‘ ' "5.0 '‘:-b V.- j" ^ i

'

'if-' .;.f.^ *’i fbiiJ''

.

'•

,,q ;-./l '.' vT

s t .'•. : j\i.i: ^ j A tc^'

9T.S rl:’»..{^', r-,'i-£u/r-»iii, I';...' '.,' i .“ ' ..

cu>':ij;*r 1 ,t..' 4^^‘tsit sciS . ,

^1 ^ ^ iir.'n

t/i :l>JL •; r:-:'; ,.y. ••:>'> £• ' ’ ' . " ’ '’Ci ' ' ^ '. '..I'txiii t

' *

1 /^ :

aM'OO^' . v'.ti

»V^ ( • lo \ ‘ 1. j

ioqlOT • . •• VO ,..y^. 1 -r :. r;

rri:.'rrf^r r-.,v .. * ,. .. ,>

"

> vi iTv>v>^ t

. 'J,sj5»c^ 15 ’ .' * / 0 , '1 'leb’io ..>0 0 ^.VJJSO

. i rc: j :J, . ; ;• i !r\u(o :? * *• .t \

considered, weighing the advantage of saved labor against

the disadvantage of more likely mistakes. With this in mind

the following rules are given for coding and decoding:

1. The whole set of observed results must be treated alike.

2. The possible coding operations are the two general types

of arithmetic operations - (a) addition (or subtraction) and

(b) multiplication (or division) . Either, a err b or both to-

gether may be used as necessary to make the original numbers

more tractable

.

3. Needless to say, keep careful note of how the data have

been coded.

4. The desired computation is carried out on the coded data.

5. The process of decoding a computed result depends on the

computation that has been carried out, and therefore is shown

below separately for several common computations.

a. The mean is affected by every coding operation.

Therefore, apply the inverse operation and reverse

the order of coding to put the coded mean back into

original units. For example, if the data have been

coded by first multiplying by lOpOO and then sub-

tracting 120, decode mean by adding 120 and then

dividing by 10,000

-26 -

>

'

/

-‘^yirya:^ : ,-:'Ur% ^£ij:u-ci' :.vi ,,.:7

. e#, r^f^'THojfado ^:;.- r---s^/ '^^^^r^r^ ^cft

£'0-qt^ y-Vi v.:^' '4;... -; u •. y /

(noll:'-yrry:tif:y^ ,. i :» ir^f: ^'x mL^ Xii'- . (fsoi:R.r vi4 -io) -ioj:.^..- .^\ l:j ii.yi fy

vi: I'^fl. lUii^n / y T^-V^, \{ "T X r. .'j £•;_{'': .( .*> *:'., '; 4r! i^:^:JRi> ;-;d »*

', ‘I'

' r

. n.4 -.;L «.* • ‘ ,i£0 "% ' i‘. ' ' 'x^O'

^:-:" ivz'

.

Cl ' £:' ,>^ffOO

R AR rMJ4

4r A, ') a

>

I

.. #*

v-... OS I' £!.£::#.£

»v0^0

1

' h

'

'^\v L'lk*iiii4Jt''•4dl^.'.;

Decoding

:

Example

Obs. results Coded.0121 1.0130 10• 0125 5 Mean

Mean .0125 codedmean 5

Coded mean + 12010,000

125"ia;oo(T

.0125

b. A standard deviation computed on coded data is

affected only by multiplication or division. The

standard deviation is a measure of dispersion, like

the range, and is not affected by adding or sub-

tracting a constant to the whole set of data. There

fore, if the data have been coded by addition or

subtraction only, no adjustment is needed in the com

puted standard dev^iation. If the coding has in-

volved multiplication (or division) the inverse

operation must be applied to the computed standard

deviation to bring it back to original units.

c. A variance computed on coded data will have to be

multiplied by the square of the coding factor if

division has been used in coding; divided by the

square of the coding factor if multiplication was

used in coding.

d. ’’Coding” which involves loss of significant figures:

Notice that the kind of coding discussed so far has

not involved any loss in significant figures. There

is a process, however, called ’’coding” that involves

both coding and rounding . This usually arises where

the data are considered to be too-finely recorded

for the purpose. It is sometimes a desirable pro-

-27 -

1. ;' i 0 ; '

Mci 0.

•'3 .,'i bti-J i..q iJXV'=«L i, A

*f;T.: 0. \.) ^v'.^ +-. sj' S(.r^.' - ' ' ! r x‘j

^ '_> fj^ j.> i .*'

' '

* ^ .'1 ,;f< , ; i.f îV'rb’' lyî^yr f. G.; ^

, r 1 ; .g>•V. - «•» 'iO riC-; .! ,. ' ->rf

'

:', ; •i ei:v. ' X i> r.'XJ

< , ; -, ; r' 't: ?• 1f

•C>0 > '

»

‘-.•'*'(^1 i

^ > ' • i ...-'

.^ :• 'I #5 fd"- % '.'Xivr D'Cl;J:';/':X * ‘J L.'?

>i i;. • .* ij r ..V 11 ;-•• •' < ^1

r / . 4 •."•V\-.. )(b. : - tXf j C' ' . ‘ ' : •:>!.?

'.

s .'• '-^r>-3

.

*,^ •_ 1.W I'lV . > »qo

. ' i 'i < :tO .a:i .rfr‘

., s

•JU, • '-- . i x.T'

*

f

TV-r.

- i. 1]

'

^^ b^rboo • ; . •; v:* ' V^ '\X . .i xo:ôy ^ 1

'

. qi->^ I'!: t:

jyNiv-Ji. ''-C'i' ; -; . i i>OCf- ilA . - '•Oiiw.tl. .w^« Cfi '. 2 i ^

.

•’

. v;... ., r -X j*iio,

^ ••.

'iV; 'P.Citc! . •.’ ^ •*, y? i*>n.^: f^'t 0.i

cedure but one should realize that there is more here involved

than just arithmetic. This type of coding involves a loss of

information and therefore its application requires more judgment.

For example suppose the data consisted of weights of

shipments (in pounds) of some bulk material:

7,123

10,056

100,310

5,097

543

#

etc

.

If the interest is in the average, and the range is

large, as here, one may decide to save trouble by working with

weights to the nearest hundred pounds. The ’’coded'* data would

then be

:

71

101

1003

51

5

etc

.

-28 -

}

:

' ,-

r,- - ,..Ar,„ vf6C :-.rd - .3

>*

1 /'J .-) '/» >' ' «.-'Q •'^'- „ 'C)^ s > j

.’'

•.^1

r-4i^:> *j»w ^h-£4'^ vj:”' Jitr 4si> :^i>J ,w'*' 3 - 'iVi

1 1 ;^.r'-

.'^ rr ^

• ? . '

•. V t.1

ios

ftOOj

Ui

o

•

.o;^o

Whether this is sufficient information will depend on the

range of the data and the uses to which the results will

be put. The only caution is that a higher order of judgment

is required than was required in the previous examples. The

grouping of data that is done in making a frequency distri-

bution is coding of this nature and does involve rounding.

5 . 2 Rounding in statistical computations

5.2.1 Rounding of numbers : Rounded numbers are inherent to

the process of reading and recording data. The readings of

an experimenter are rounded numbers to start with since all

measuring equipment is of limited accuracy. Often he records

the results to even less accuracy than is attainable with the

available equipment simply because they are completely ade-

quate for his immediate purpose. Computers are often required

to round numbers either to simplify the arithmetic calculations

or because it cannot be avoided, as when 3.1416 is used for II

or 1.414 is used for l/T~ . When a number is to be rounded to

a specific number of significant figures, the rounding proce-

dure should be carried out in accordance with the following

rule

:

-29-

* t

. K ’i

'vi'j 1. ^ L? • r: ., ' 1 • .: rtf:>'>iri ' a -.-r ’.’ 1 . . • i' K

UfjT Vp .;: .;i‘.

i» /. • V i flv.v. ^ >;?fiT . c» y

..Jl/. ; 1

1

::,/’

• - - :*'t> . - 4 “. C' i ';'

't.J 3 -ij, .^.JT ' ' -^rk •,; :> • , -

-. . r ^kn.K' ^3kfiu ,lo v=^t4?i: ns . : '>J: j j..-c;

ail 3 Ar- -r.-v ".1 * ~ r'

JSO V• f

ufi.-f^^.i -'..t juH

k •*:>fn . jSlifciy : *5,-'. 1 ’..*rO'’i'

. i. ]. ' 0

^kAs>iX54.i- -'i ;i:-;.; 'k ' v^>.:iTro-jn ^ '?i.o-

,S>i-'; ^ ; r . :^ « : atJr ii

' •"l.'lTi'v ..' ;H' .. . .. Vi - '. v . V

•s>^ na.v ' uqj5ToD .

acu- 4.'^ -'i.^ :j' 'va o:-.; ‘ ».'

T . .'O

n 'io^ -bj^r ’ rrii:'"^V i ' fi-. .fUii . -,u..«r V

uJ X/CfLi^: ri .1. k’ ' " ;cf.r-

; /ir, ^i;xixi.v/v. : . ,

^b TC'J .1'. :*•

u O c ^ *? - .0- . • ,X, •;.

rto\ ’i, ;

• > .f. i ' ' > / T

,

'*j c • •j. ^'‘ 1 * '. •,••• ! -I'/'i

' - •; t^vt

• •9^t\

7

i

(1) When the figure next beyond the last place to be re-

tained is less than 5 , the figure in the last place

retained should be kept unchanged.

(2) When the figure next beyond the last figure or place

to be retained is greater than 5, the figure in the

last place retained should be increased by 1.

(3) When the figure next beyond the last figure to be re-

tained is 5, and

(a) there are no figures or only zeros beyond this

5^ an odd figure in the last place to be retained

should be increased by 1; an even figure should

be kept unchanged;

(b) if the 5 is followed by any figures other than

zero, the figure in the last place to be re-

tained should be increased by 1 whether odd or

even

.

The final-recorded result should be rounded off in one

step to the most precise value available and not in two or

more steps of successive roundings.

5.2.2 Rounding the results of single arithmetic operations :

Nearly all numerical calculations arising in the problems of

everyday life are in some way approximate. The aim of the

-30 -

:'d of pfi ' fxov f>rv.hl\.i

.: 0i?l .^’. .'!-.i' S'

Lt S ^' '

r}jt Sri

f:v. -:-BV :/oi' ?. oo;.x »; ‘.--t

’/o } Jiisjs;' aiJ

ft

' > V

I• i V/[ CfO JJo rc ;. : •• i.. ; ; I r. «;

' i>j Khj.r.£ ,&tif • xsn , i n-ati .. ' '^'

5ii.i fj rjui.." ' :

;

j-ji:/''! \.lyx> ’^o nvgiX O/X '^i.s . y-'sjiij < ;i;

Ow ,si:.iii; i , . ; S" = .1 •i:T I'i r-,^M -I .^r,. -.

VikhCi^y n ;1,

M J,

:.>. '•.:. fi :i'

? V... 'fiir 1 yo ^v/oi

c.» ">'.."vv •• _£|' «!• ) or:., . .. .

'. .;/! I I .

. f \ \

'TO . \ v/r'o i At 03., ' i r jU :

.9£iw .IJ. Oidd' J3 r ' M’ J'^0 J JO,

'x,v om *i . on r:

^'•J. 'V . T

computer should be to obtain results consistent with the

data with a minimum of labor. He can be guided in the various

arithmetical operations by some basic rules regarding signi-

ficant figures and the rounding of data.

1. Addition : When several approximate numbers are to be added^

the sum should be rounded to the number of decimal places

(not significant figures) no greater than in the addend

which has the smallest number of decimal places.

Although the result is determined by the least

accurate of the numbers entering the operation,

one more decimal place in the more accurate

numbers should be retained thus eliminating

inherent errors in the numbers

.

2. Subtraction : When one approximate number is to be

subtracted from another, they must both be rounded off

to the same place before subtracting.

Errors arising from the subtraction of nearly

equal approximate numbers are frequent and

troublesome often making the computation prac-

tically worthless. They can sometimes be avoided

when the two nearly equal numbers can be approx-

imated to more significant digits.

-31 -

: Vf ->4 I ;• i ii"* , \e „. -f r" .f

/. . -r. -j V?:. ' v.

•OO.-* va V. ji- L ' ‘.

U3V. 3.r

i.-iil. t /. •-» 'j" _• :fi j : :' V

> :• .

> - - nt>dV

t> '?>>/! .’•**

tsfyiia uu'* .•»: ?

• i iii i r ; •

:2 * ;•

.,ni. -*..i •• •.. ••':) î tfl': ;; ' ''.'jiJV

.,; X.T jiAr.-/; ... '.:: I : l

^ *£' jr 1 nûo r » . r I \

r ^.o

•/ tiu'i &ftO

', ^

1

p ‘

' A ,

,;.. u .r.C'.X'- . -io A, :-i h-?-'=

.-.'4 j;! X* .t ....•' C .*

j :•? • • *;»jj .1 . •. ;. ,j»'i'

> * - ;/' . ;_

. o .C l ^.‘Oc.

.1 '^

;

: !/ . vô'itS

Ov:^' 1 ^1 j' -• .rx.-jf:;Jr ' • ^ « '!T7 ' lexj^ve-

3.

Multiplication

:

If the less accurate of two approximate

numbers contains n significant digits, their product can

be relied upon for n digits only, and should not be written

with more.

As a practical working plan, carry intermediate

computations out in full and round off the final

result in accordance with the rule stated above.

4.

Division : If the less accurate of either the dividend or

the divisor contains n significant digits, their quotient

can be relied upon for n figures only and should not be

written with more.

Carry intermediate computations out in full and

round off the final result in accordance with the

rule stated above

.

5. Powers and Roots : If an approximate number contains n

significant digits, its power can be relied upon for n

digits only; its root can be relied upon for at least

n digits.

6. Logarithms : If the mantissa of the logarithm in an

n-place log table is not in error by more than two units

in the last significant figure, the antilog is correct

to n-1 significant figures.

The foregoing statements are rules only. Fuller expla-

-32 -

.r-:., w.

'

^ " ;,v’^ .^ iT>

M . l( ij . { " ' *. **, '!' “--

1.1 m- „

fffiso^xlnê jx Bal,/z:fiiOo e^z^cftrajr.«f*

at..-aTOBT fl4 fw

•••. i'

,''-^^ v':f. •: • > .r

_v.* ,_

* ' • ' • ,.'•»• < r

^

'.w J'ftfi >iX 'îijQ aooXu jS-T-uaiiic^ -

" avoci ;; sgf i-

*«7*

;.,t«*. 1

3*0^^^ t npraivXOyT' 'ji ' ' -.

Jnz4 foup n eaj!:;$inof> TOaiîb c»if4'W:tôu bXuoxJfe yXao 1^0%, n poXis'T oJ aô

Vf «*tom rii XV X iw' B

^>uK ( L'sJt el ‘ûo aaoXjs:^pr^ f^jnii^^ta/rSftAi y rxti'J

^(iS tiXiw epnabTLOpo^ «r 4/w 'st .1^/1 jt aiu x'Jbo b:tiioi">

','

'

ir«r;_

' *•-,

'' '

** y^’^TL -""jT ,,. - .•^»v*:,’5.n 0;Xi/*x

i*>^ V " ‘ S'*

’ • n Prti^.*a4c* ^&^fn,ifu ^J/^!nj::KcrqqB ax: jrl ha^ »’34

nations of the rules, tests for determining the accuracy of

computations when it is desirable to attain or to state a

certain degree of precision, and the effects of rounding on

statistical analyses of large numbers of observations are

ably handled by Scarborough [ i ] and Eisenhart [ 2 1

•

5.2.3 Rounding the results of a series of arithmetic operations:

Most engineers and physical scientists well know the rules for

reporting a result to the proper number of significant figures.

From a computational point of view, they know these rules too

well. It is perfectly true, for example, that a product of two

numbers should be reported to the same number of significant

figures as the least accurate of the two numbers. It is not so

true that the two numbers should be rounded to the same number

of significant figures before multiplication. A better rule is

to round the more accurate number to one more figure than the

less accurate one and then round the product to the same number

as the less accurate one. The great emphasis against reporting

more figures than are reliable has led to a prejudice against

carrying enough figures in computation.

Assuming that the reader is familiar with the rules of the

preceding section regarding significant figures in a single

arithmetical operation, this section will stress the less wcil-

-33 -

I, ' n ' *1

>T'v'i''(

1^'.'Hr V|*

U ^

‘

4‘

,.v ... %r >

'

> : l.i z;.

•'

i- iV i «» -o# ?.o a. o : --.io lu it-r ^*:ts.Iv i...;, , ;r v.v:o' o

>v-..\'. cai#€>'l .feO H-t -J .-*Xla 3LU '• '• : . -rr; il/-

Jt'3;jKV%

’sacl 40 ’M7 :^v> J^.r.f»‘i4.: t7 is'^ P.

]C ] . l \ . r . *;.i; ,1 ' .'-i.d.

.ao -ii. j f>in,ii j v.r; lo a 3o ^ej ’ s-r ‘- 3 v ."• .’•

wv>rr>. 1 .%."» ••’i I •;:£> -.

1 ayii'o t. X i. $ a It’' ^

•

*,

' . » X * /* y :» .’

oom a'f/r •:.*£ won;-- ^.v fio taioq -r «-. .#-.v..

I'o •. ,» . ..,{»jL'c;i../.* c;. jajstJ.; ,nj«i ' v :; - .;

'.:-AX !9t1i beu?nuf' ^ o*\5 c i

i- i bj.iixxat i;'ii. .terfj . , :

o

x!0 ,^ . x-.J I'tvrw: .^: - •:, ^^3

w :.0 3ftr.Co% 9liS :*;? ;tw l

known difficulties which arise in a computation consisting of

a long series of different arithmetic operations. In such a

computation, strict adherence to the rules at each stage can

wipe out all meaning from the final results.

For example in computing the slope of a straight line

fitted to data with 3 significant figures, one would surely

not report the slope to 7 significant figures, but if one

rounds to 3 significant figures, after each necessary step

in the computation, one might end up with no significant

figures in the slope.

It is easily demonstrated by carrying out a few compu-

tations of this nature that there is real danger of losing all

significance by too-strict adherence to the rules, devised for

use at the final stage. The greatest trouble of this kind

comes where we must subtract two nearly equal numbers, and many

statistical computations involve such a step.

The rules generally given for rounding off were given

in a period when the average was the only property of interest

in a set of data. Reasonable rounding does little damage to

the average. Almost always now, however, we calculate the

standard deviation, and this statistic does suffer from too-

strict rounding. Suppose we have a set of numbers:3.13.23.3

Avg. * 3.2

-34 -

Ic . '>:•• . r »q n ir xi'liw SGi ^ L'- j''

a fil.’.'

. ,*:> "'T' .1 ;. .•..; i .' '..

•/ X a’Xc-: .P

x;'^ •'tr ... j: . ... «-;••' 1

'*

;v‘**. 'T'r ^

,eriw.-r>i T- ^ ' .-r '• .

jJ~,: -^

}

‘-.. 'I'.. .Ji '.>' ...ivt::-:>

rr' ^iird'^

-^./i .• o yP3 ' 'v- '^i; .

y 5 >; C' “.r.f t-rl r.*''"^/ -

:.C-.

Si 4 ?J

:inc' -Tf*-

0^1 i 1-

—'i '.V#tr 1

.-, .--

«'.--b»x/::-; •, *;?|'

• v> J; 5:iie '

. / . lo.eiiofi?.

•ioi; ';j SOiiC“5 :;:/^4i:e.:'•

•

i' r: ;aj*:,!' a€>Xv-

>.J V V.''T \V‘ -*i.; •

oi’;* > r:H i I..I. - .. X /-.iv5^ .'Xv xrri .x/x..

If the three numbers are rounded off to 1 significant figure,

they are then all the same. The average of the rounded

figures is the same as the rounded average of the original

figures, but all information about the variation in the

original numbers is lost.

The generally recommended procedure is to carry two

or three extra figures throughout the computation and then

to round off the final reported answer (e.g., standard devia-

tion, slope of a line, etc.) to a number of significant figures

consistent with the original data.

[1] J.B. Scarborough, Chapter I - Numerical MathematicalAnalysis , the Johns Hopkins Press, Baltimore (1930)

.

[2] C. Eisenhart, Chapter 4 - Techniques of StatisticalAnalysis, McGraw-Hill Book Co., New York (194'/).

-35-

./ /V\4- 4 I,' rwt': : >'. J

i. ,'3rf>„;

V.:;:.. ;:^i. ' /r: '^X..r , v/V

' 0(t*/ i tv : . A , .. ^ ,::> / ;r

^

^ r-y .:ts.^- r-::.\'^ r . -•^

1' > ' ^ .^^’iiIi !i X - JuO ; .;> i:.'\/-,...: !’'ir , I ; *^'r'XlL^a ^'t ?'

. .''ii 1'i!'

,-;„ V.- '.'.'.J j jiv»

- » ; » •4- .

U. S. DEPARTMENT OF COMMERCELbwis L. StfausB, Secretary

NATIONAL BURfeAU OF STANDARDSA. V. Ailih, Director

THE IVATIOIVAL R1JREA1J OF STANRARRSThe scope of activities of the National Bureau of Standards at its headquarters in Washington,

D. C., Snd its major laboratories in Boulder, Colo., is suggested in the following listing of the

divisions and sections engaged in technical Work. In general, each section carries out specialized

research, development, and engineering in the field indicated hy its title. A brief 'description ofthe activities, and of the resultant publications, appears on the inside front cover.

WASHINGTON, D. C.• Kl©45ti*ldty and KI©Ctl*onlcs« Resistance and Reactance. Electron Devices. Electrical In*

strumentsi Magnetic Measurements. Dielectrics.. Engineering Electronics. Electronic Instru*

mentation. Electrochemistry.

Optics and ]Hotrolo|ty« Photometry and Colorimetry. Optical Instruments. PhotographicTechnology. Length. Engineering Metrology.

Heat* Temperature Physics. Thermodynamics. Cryogenic Physics. Rheology. Engine Fuels.Free Radicals Research.

Atomic and Radiation Physics* Spectroscopy. Radiometry. Mass Spectrometry. SolidState Physics. Electron Physics. Atomic Physics. Neutron Physics. Radiation Theory.

Radioactivity. X-rays. High Energy Radiation. Nucleonic Instrumentation. Radiological

Equipment. '

Chemistry* Organic Coatings, ’^urface Chemistry. Organic Chemistry. Analytical Chemistry.Inorganic Chemistry. Electrodeposition. Molecular Structure and Properties of Gases. Physical

Chemistry. Thermochemistry. Spectrochemistry. Pure Substances.

IMechanics* Sound. Mechanical Instruments. ' Fluid Mechanics. Engineering Mechanics. Massand Scale. Capacity, Density, and Fluid Meters. Combustion Controls.

Organic and FIhrous .materials* Rubber. Textiles. Paper. Leather. Testing andSpecifications. Polymer Structure. Plastics. Dental Research,

metallurgy* Thermal Metallurgy. Chemical Metallurgy. Mechanical Metallurgy. Corrosion.Metal Physics.

mineral Products* Engineering Ceramics. Glass. Refractories, Enameled Metals. ConcretingMaterials. Constitution and Microstructure.

Building Technology* Structural Engineering. Fire Protection. Air Conditioning, Heating,and Refrigeration. Floor, Roof, and Wall Coverings. Codes and Safety Standards. Heat Transfer.

Applied mathematics* Numerical Analysis. Computation. Statistical Engineering. Mathe-matical Physics.

Rata Processing Systems* SEAC Engineering Group. Components and Techniques. DigitalCircuitry. Digital Systems. Anolog Systems. Application Engineering.

• Office of Basic Instrumentation. • Office of Weights and Measures.

BOrUiER, COLORAIIOI

I'ryogenic Engineering* Cryogenic Equipment. Cryogenic Pr«)cesses. P.’operlies of Mate-rials. Gas Liquefaction.

Rlldio Pr

.^ ' h.i ' : ^

i

national bureau of standardsreport...nationalbureauofstandardsreport nmmject nbsreport 1103-40-5146...

Documents