NATIONAL BUREAU OF STANDARDS REPORT
6263
Draft of
Part IV (Miscellaneous Topics)
for
Manual on Experimental Statistics
for Ordnance Engineers
A REPORT TO
OFFICE OF ORDNANCE RESEARCH
DEPARTMENT OF THE ARMY
U. S. DEPARTMENT OF COMMERCE
NATIONAL BUREAU OF STANDARDS
-
THE NATIONAL BUREAU OF STANDARDS
Functions and Activities
The functions of the National Bureau of Standards are set forth in the Act of Congress, March
3, 1901, as amended by Congress in Public Law 619, 1950. These include the development and
maintenance of the national standards of measurement and the provision of means and methods
for making measurements consistent with these standards; the determination of physical constants
and properties of materials; the development of methods and instruments for testing materials,
devices, and structures; advisory services to Government Agencies on scientific and t
-
NATIONAL BUREAU OF STANDARDS REPORT
NBS PROJECT 1103-40-5146          13 January 1959          NBS REPORT 6263
Draft of
Part IV (Miscellaneous Topics)
for
Manual on Experimental Statistics
for Ordnance Engineers
Prepared by
Statistical Engineering Laboratory
A Report to
Office of Ordnance Research
Department of the Army
Approved for public release by the
director of the National Institute of
Standards and Technology (NIST)
on October 9, 2015
-
NOTICE
This report is a preliminary draft of Part IV
(Miscellaneous Topics) for the Manual on Experimental
Statistics for Ordnance Engineers.
No known inaccuracies exist in the present
draft, but improvements in arrangement and exposition
of some of the material may be made at a later date.
-
Table of Contents
1. Rejection of Observations
2. The Place of Control Charts in Experimental Work
3. Statistical Techniques for Analyzing Extreme-Value Data
4. Transformations
5. Notes on Statistical Computations
5.1 Coding
5.2 Rounding
-
1. Rejection of Observations
Every experimenter has at some time obtained a set
of observations, purportedly taken under the same conditions,
in which one observation was widely different from the rest,
i.e., an outlier.
The problem that confronts the experimenter is whether
he should keep the suspect observation in computation, or
whether he should discard it as being a faulty measurement.
The word "reject" will mean "reject in computation", since every
observation should be recorded. A careful experimenter will
want to make a record of his "rejected" observations and,
where possible, detect and analyze carefully their cause(s).
It should be emphasized that we are not discussing
the case where we know that the observation differs because
of an assignable cause, e.g., a dirty test tube or a change
in operating conditions. We are dealing with the situation
where, as far as we are able to ascertain, all the observa-
tions are on approximately the same footing. One observation
is "suspicious" however, in that it seems to be set apart from
the others. We wonder whether it is not so far from the others
that we can reject it as being caused by some assignable but
thus far unascertainable cause.
-
Whether a measurement far removed from the great majority
of a set of measurements of a quantity, and thus
possibly reflecting a gross error, should have a full vote,
a diminished vote, or no vote in the final average - and in
the determination of precision - is a very difficult question
to answer completely in general terms. Common sense dictates
that, if on investigation, a trustworthy explanation of the
discrepancy is found, the value concerned should be excluded
from the final average and from the estimate of precision,
since these presumably are intended to apply to the unadulte-
rated system. If, on the other hand, no explanation for the
apparent anomalousness is found, then common sense would seem
to indicate that it should be included in computing the final
average and the estimate of precision. Experienced investigators
differ in this matter. Some, e.g., F. W. Bessel, would
always include it. Others would be inclined to exclude it,
on the grounds that it is better to exclude a possibly 'good'
measurement than to include a possibly 'bad' one. The argument
for exclusion is that when a 'good' measurement is excluded
we simply lose some of the relevant information, with
consequent decrease in precision and the introduction of some
bias (both being theoretically computable), whereas when a
truly anomalous measurement is included it vitiates our results,
biasing both the final average and the estimate of precision
by unknown, and generally unknowable, amounts.
There have been many criteria proposed for guiding
the rejection of observations. For an excellent summary and
critical review of the classical rejection procedures, and
some more modern ones, see ref. [1]. One of the more famous
classical rejection rules is "Chauvenet's criterion", which
is not recommended. This criterion is based on the normal
distribution and advises rejection of an extreme observation
if the probability of occurrence of such a deviation from the
mean of the n measurements is less than 1/(2n). Obviously,
for small n, such a criterion rejects too easily.
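The tendency of Chauvenet's criterion to discard good data can be illustrated with a small calculation. The sketch below is an idealization and not part of the original report: it assumes the population mean and standard deviation are known exactly (in practice both are estimated from the same n measurements), and the function names are hypothetical.

```python
from statistics import NormalDist


def chauvenet_cutoff(n):
    """Deviation, in standard deviations, beyond which the rule rejects.

    The rule rejects a value whose two-sided tail probability is below
    1/(2n), so each tail holds 1/(4n).
    """
    return NormalDist().inv_cdf(1.0 - 1.0 / (4.0 * n))


def chance_of_false_rejection(n):
    """Probability that at least one of n good observations is flagged.

    Idealized: assumes the true mean and standard deviation are known.
    """
    return 1.0 - (1.0 - 1.0 / (2.0 * n)) ** n


for n in (3, 5, 10, 25):
    print(n, round(chauvenet_cutoff(n), 2), round(chance_of_false_rejection(n), 2))
```

Under these assumptions the chance of discarding at least one perfectly good observation stays near 0.4 for every n tried, which is the price of the rule in endless arrays of good data.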
A review of the history of rejection criteria, and
the fact that new criteria are still being proposed, leads
one to realize that no completely satisfactory rule can be
devised for any and all situations. One cannot devise a cri-
terion that will not reject a predictable amount from endless
arrays of perfectly good data. The amount of data rejected
of course depends on the rule. This is the price one pays
for using any rule for rejection of data. There is actually
nothing better to use than the judgment of an experienced
investigator who is thoroughly familiar with his measurement
process. For an excellent discussion of this point, see ref.
[2]. Statistical rules are given largely for the benefit of
inexperienced investigators, those working with a new process,
or those who simply want justification for what they would
have done anyway.
Whatever rule is used, it must bear some resemblance
to the experimenter’s feelings about the nature and possible
frequency of errors. To take an extreme example: if the experimenter
feels that he may make one blunder in twenty, and
he uses a rejection rule that throws out 3 in 10, obviously
his reported data will be "clean". The one sure way, and
the only sure way, to avoid publishing any "bad" results is
to throw away all results.
With the foregoing reservations, the next two sections
give some suggested procedures for judging outliers. In
general the rules to be applied to a single experiment reject
only what would be rejected by an experienced investigator
anyway. (See section 1.2)
1.1 Rejection of observations in routine experimental work.
Probably the best technique for detection of "errors"
(e.g., systematic errors, gross errors) in routine work is
the use of control charts for the mean and range. Detailed
methods of application of control charts have been described
in many places and will not be repeated here - see section 2
for appropriate references.
-
1.2 Rejection of observations in a single experiment.
We shall assume that we have n observations ranked in
order from lowest to highest ($X_1 \le X_2 \le \dots \le X_n$), and that the
observations are from a population which has a normal
distribution with mean $\mu$ and standard deviation $\sigma$. We shall
discuss four cases.
Case (1) - We do not know anything whatever about the
population from which the data were taken, and must make our
decision on the basis of the n observations.
Case (2) - We do not know $\mu$ or $\sigma$, but have an estimate
s of $\sigma$ which is independent of our present sample. The value
of s is estimated with f degrees of freedom.
Case (3) - We are willing to assume a value for $\sigma$, the
standard deviation of the population from which the observations
were taken.
Case (4) - The population from which the observations
have been taken is well known, and in particular we know the
values of $\mu$ and $\sigma$.
CASE (1): No prior information about the mean and standard deviation
i) Choose $\alpha$, the probability or risk we are willing to
take of rejecting an observation which really belongs
in the group.
ii) Look up $r_{1-\alpha/2}$ in Table XI. (NOTE: If the procedure
is such that we would consider rejection of observations
which were extreme in one direction only,
e.g., where we would reject values which were too
large but never those which were too small, then
we should replace $\alpha/2$ by $\alpha$. Our choice as to whether
to use $\alpha/2$ or $\alpha$ must be made on a priori grounds
and never on the basis of the data to be analyzed.)
iii) If $3 \le n \le 7$, compute $r_{10}$.
If $8 \le n \le 10$, compute $r_{11}$.
If $11 \le n \le 13$, compute $r_{21}$.
If $14 \le n \le 25$, compute $r_{22}$.
where $r_{ij}$ is computed as follows:
$r_{ij}$     If $X_n$ is suspect                 If $X_1$ is suspect
$r_{10}$     $(X_n - X_{n-1})/(X_n - X_1)$       $(X_2 - X_1)/(X_n - X_1)$
$r_{11}$     $(X_n - X_{n-1})/(X_n - X_2)$       $(X_2 - X_1)/(X_{n-1} - X_1)$
$r_{21}$     $(X_n - X_{n-2})/(X_n - X_2)$       $(X_3 - X_1)/(X_{n-1} - X_1)$
$r_{22}$     $(X_n - X_{n-2})/(X_n - X_3)$       $(X_3 - X_1)/(X_{n-2} - X_1)$
iv) If $r_{ij} > r_{1-\alpha/2}$, reject the suspect observation;
otherwise retain it.
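The ratio chosen in step iii of Case (1) can be computed mechanically. The following is a minimal sketch, assuming the standard Dixon ratios $r_{10}$, $r_{11}$, $r_{21}$, $r_{22}$ for the sample-size ranges listed above; the function name `dixon_ratio` is hypothetical. The critical value $r_{1-\alpha/2}$ is not computed here and must still be read from Table XI.

```python
def dixon_ratio(observations, suspect="high"):
    """Return the ratio r_ij appropriate to the sample size n.

    suspect="high" tests the largest value X_n; suspect="low" tests X_1.
    """
    x = sorted(observations)
    n = len(x)
    if not 3 <= n <= 25:
        raise ValueError("ratios are given only for 3 <= n <= 25")
    if suspect not in ("high", "low"):
        raise ValueError("suspect must be 'high' or 'low'")
    # r_jk: numerator skips j places from the suspect end,
    # denominator drops k values from the opposite end.
    if n <= 7:
        j, k = 1, 0
    elif n <= 10:
        j, k = 1, 1
    elif n <= 13:
        j, k = 2, 1
    else:
        j, k = 2, 2
    if suspect == "high":
        numerator = x[-1] - x[-1 - j]
        denominator = x[-1] - x[k]
    else:
        numerator = x[j] - x[0]
        denominator = x[-1 - k] - x[0]
    if denominator == 0:
        raise ValueError("ratio undefined: denominator is zero")
    return numerator / denominator


# Five observations, largest value suspect: r_10 = (10 - 4)/(10 - 1)
print(dixon_ratio([1, 2, 3, 4, 10], suspect="high"))
```

The observation is rejected only if the returned ratio exceeds the tabulated critical value for the chosen $\alpha$.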
CASE (2): Mean unknown, estimate of $\sigma$ available from previous sample
i) Choose $\alpha$, the probability or risk we are willing to
take of rejecting an observation which really belongs
in the group.
ii) Look up $q_{1-\alpha}(n, f)$ in Table IV. (See Note in step ii
of Case (1).)
iii) Compute $w = q_{1-\alpha} s$.
iv) If $w < X_n - X_1$, reject the observation which is
suspect; otherwise retain it.
CASE (3): Mean unknown, $\sigma$ known
i) Choose $\alpha$, the probability or risk we are willing
to take of rejecting an observation which really
belongs in the group.
ii) Look up $q_{1-\alpha}(n, \infty)$ in Table IV. (See Note in step ii
of Case (1).)
iii) Compute $w = q_{1-\alpha} \sigma$.
iv) If $w < X_n - X_1$, reject the observation which is
suspect; otherwise retain it.
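Cases (2) and (3) differ only in where the measure of spread comes from. The sketch below assumes that the quantity compared with w is the sample range $X_n - X_1$ (the usual studentized-range form of the test); that reading, the function name `range_test`, and the numbers in the example are assumptions for illustration. In particular the critical value 3.0 is a placeholder, not an entry from Table IV.

```python
def range_test(observations, q_critical, spread):
    """Compare the sample range with w = q * spread.

    q_critical : value read from the table for (n, f), or (n, infinity)
                 when sigma is assumed known.
    spread     : s (Case 2) or the assumed sigma (Case 3).
    Returns (w, sample_range, reject).
    """
    x = sorted(observations)
    w = q_critical * spread
    sample_range = x[-1] - x[0]
    return w, sample_range, sample_range > w


# Placeholder numbers only; q_critical must come from the table.
w, r, reject = range_test([1.0, 2.0, 3.0, 9.0], q_critical=3.0, spread=2.0)
print(w, r, reject)
```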
CASE (4): Mean known, $\sigma$ known
i) Choose $\alpha$, the probability or risk we are willing
to take of rejecting an observation which really
belongs in the group.
ii) Compute $\alpha' = (1 - \alpha/2)^{1/n}$. (See Note in step ii of
Case (1).)
iii) Look up $z_{1-\alpha'}$ in Table Ib.
iv) Compute $a = \mu + \sigma z_{1-\alpha'}$,
$b = \mu - \sigma z_{1-\alpha'}$.
v) Reject any observation which does not lie in the
interval from a to b.
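The Case (4) limits can be checked numerically. This is a sketch under one reading of the procedure, namely $\alpha' = (1 - \alpha/2)^{1/n}$, $a = \mu + \sigma z_{1-\alpha'}$ and $b = \mu - \sigma z_{1-\alpha'}$. Because $1 - \alpha'$ is small, $z_{1-\alpha'}$ is negative, so a is the lower limit and b the upper. The normal percentage point is taken from Python's standard library rather than Table Ib, and the function names are hypothetical.

```python
from statistics import NormalDist


def case4_limits(n, alpha, mu, sigma):
    """Two-sided limits for n observations from a known normal population."""
    alpha_prime = (1.0 - alpha / 2.0) ** (1.0 / n)
    z = NormalDist().inv_cdf(1.0 - alpha_prime)  # negative for small alpha
    a = mu + sigma * z  # lower limit
    b = mu - sigma * z  # upper limit
    return a, b


def outside_limits(observations, a, b):
    """Observations that the rule would reject."""
    return [x for x in observations if not (a <= x <= b)]


a, b = case4_limits(n=10, alpha=0.05, mu=100.0, sigma=2.0)
print(round(a, 2), round(b, 2))
print(outside_limits([99.0, 101.0, 110.0], a, b))
```

With these limits the chance that all n good observations fall inside the interval is close to $1 - \alpha$, which is the stated risk.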
-
References
[1] Paul R. Rider, "Criteria for Rejection of Observations", Washington University Studies - New Series, Science and Technology - No. 8, October 1933.
[2] E.B. Wilson, Jr., "An Introduction to Scientific Research", McGraw-Hill Book Co., Inc., New York, 1952.
Additional Reading
[3] W.J. Dixon, "Processing Data for Outliers", Biometrics, Vol. 9 (1953), pp. 74-89.
[4] A. Hald, "Statistical Theory with Engineering Applications", John Wiley and Sons, 1952, pp. 333-336.
[5] F. Proschan, "Testing Suspected Observations", Industrial Quality Control, Vol. XIII, No. 7, January 1957.
-
2. The Place of Control Charts in Experimental Work
2.1 Primary objective of control charts
Control charts have very important functions in experi-
mental work, although their use in laboratory situations has
been discussed only briefly by most textbooks. Control charts
can be used as a form of statistical test in which the primary
objective is to test whether or not the process is in "statistical
control". The process is in "statistical control" when
repeated samples from the process behave as random samples
from a stable probability distribution; thus the underlying
conditions of a process "in control" are such that it is possible
to make predictions in the probability sense.
The control limits are usually computed by using formulae
which utilize the information from the samples themselves.
These limits are placed on the specific chart, and the decision
is made that the process was "in control" if all points fall
within the limits. If not all points are within the limits,
then the decision is made that the process is not "in control".
The basic assumption underlying most statistical techniques
is that the data are a random sample from a stable probability
distribution, which is another way of saying that the
process is in "statistical control". It is the validity of
this basic assumption which the control chart is designed to
test. The control chart is used to demonstrate the existence
of statistical control, and to monitor a controlled process.
As a monitor, a given control chart indicates a particular
type of departure from control.
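The decision rule described in this section, limits computed from the samples themselves and a verdict of "in control" only if every point falls inside them, can be sketched for charts of the mean and range. The example assumes subgroups of five measurements and uses the commonly tabulated Shewhart factors for that size (A2 = 0.577, D3 = 0, D4 = 2.114); the function names and the data are hypothetical.

```python
from statistics import mean

# Commonly tabulated chart factors for subgroups of size 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """Control limits for the mean and range charts, from the data themselves."""
    if any(len(g) != 5 for g in subgroups):
        raise ValueError("this sketch carries factors for subgroups of 5 only")
    means = [mean(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = mean(means)
    mean_range = mean(ranges)
    return {
        "means": means,
        "ranges": ranges,
        "mean_limits": (grand_mean - A2 * mean_range, grand_mean + A2 * mean_range),
        "range_limits": (D3 * mean_range, D4 * mean_range),
    }


def out_of_control(subgroups):
    """Indices of subgroups whose mean or range falls outside the limits."""
    lim = xbar_r_limits(subgroups)
    lo_m, hi_m = lim["mean_limits"]
    lo_r, hi_r = lim["range_limits"]
    return [
        i
        for i, (m, r) in enumerate(zip(lim["means"], lim["ranges"]))
        if not (lo_m <= m <= hi_m and lo_r <= r <= hi_r)
    ]


data = [
    [10, 11, 9, 10, 10],
    [9, 10, 11, 10, 10],
    [10, 10, 11, 9, 10],
    [10, 9, 10, 11, 10],
    [15, 16, 14, 15, 15],  # a sudden shift in the mean
]
print(out_of_control(data))
```

An empty list corresponds to the decision "in control"; any index returned shows where the trouble occurred.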
2.2. Information provided by control charts
Control charts provide a running graphical record of
small subgroups of data taken from a repetitive process.
Control charts may be kept on any of various characteristics
of each small subgroup, e.g., the average, standard deviation, range, or proportion defective. The chart for each particular
characteristic is designed to detect certain specified de-
partures in the process from the conditions assumed. The
process may be a measurement process as well as a production
process. The order of groups is usually with respect to time,
but not necessarily so. The grouping is such that the members
of a group are more likely to be alike than are members of different groups.
Primarily control charts can be used to demonstrate whe-
ther or not the process is in statistical control. When the
charts show lack of control, they indicate where or when the
trouble occurred. Often they indicate the nature of the trouble,
e.g. trends or runs, sudden shifts in the mean, increased vari-
ability, etc.
The control charts, besides serving as a method of test-
ing for control, also provide additional and useful information
in the form of estimates of the characteristics of a controlled
process. This is information which is altogether too fre-
quently overlooked. For example, one very important piece of
information which can be obtained from a control chart for
the range or standard deviation is an estimate of the variability
($\sigma$) of a routine measurement or production process. It
will be remembered that many of the techniques of Part I are
given in parallel for known $\sigma$ and unknown $\sigma$. Most experimental
scientists have very good knowledge of the variability of their
measurements, but hesitate to assume "known $\sigma$" without additional
justification. Control charts can be used to provide
the justification.
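One common way of turning a range chart into an estimate of $\sigma$ is to divide the average subgroup range by the tabulated factor d2 for the subgroup size. The sketch below assumes that approach; it is not spelled out in this section. The d2 values are the standard published ones for subgroups of 2 to 6, and the function name is hypothetical.

```python
from statistics import mean

# Standard d2 factors relating the mean range to sigma for a normal population.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534}


def sigma_from_ranges(subgroups):
    """Estimate sigma as (average range) / d2 for equal-sized subgroups."""
    sizes = {len(g) for g in subgroups}
    if len(sizes) != 1:
        raise ValueError("all subgroups must have the same size")
    n = sizes.pop()
    if n not in D2:
        raise ValueError("no d2 factor stored for this subgroup size")
    mean_range = mean(max(g) - min(g) for g in subgroups)
    return mean_range / D2[n]


groups = [[10, 11, 9, 10, 10], [9, 10, 11, 10, 10], [10, 10, 11, 9, 10]]
print(round(sigma_from_ranges(groups), 3))
```

The estimate is meaningful only when the chart shows the process to be in control; otherwise the ranges do not describe a single stable distribution.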
Finally, as was pointed out in section 1, a control
chart is the most satisfactory criterion for rejection of
observations in a routine laboratory operation.
2.3 References
The basic references for the use of control charts are:
1. For the underlying philosophy of statistical control - Shewhart, W.A., "Economic Control of Quality of Manufactured Product", Van Nostrand Inc., New York, N.Y., 1931. (The first and still a very significant book in the field.)
2. For the details of application - "ASTM Manual on Quality Control of Materials", available from American Society for Testing Materials, 1916 Race Street, Philadelphia 3, Pa., 1951. (This manual gives examples of various kinds of charts.)
Some recommended general texts on quality control are:
3. Cowden, D.J., "Statistical Methods in Quality Control", Prentice-Hall, Inc., Englewood Cliffs, N.J., 1957.
4. Duncan, A.J., "Quality Control and Industrial Statistics", Richard D. Irwin, Inc., Chicago, 1952.
5. Grant, E.L., "Statistical Quality Control", Second Edition, McGraw-Hill Book Co., New York, 1952.
6. Actual examples of laboratory applications in the chemical field can be found in a series of comprehensive bibliographies published in Analytical Chemistry:
a) Wernimont, G., "Statistics Applied to Analysis", Anal. Chem. 21, 115 (1949).
b) Hader, R.J. and Youden, W.J., "Experimental Statistics", Anal. Chem. 24, 120 (1952).
c) Mandel, J. and Linnig, F.J., "Statistical Methods in Chemistry", Anal. Chem. 28, 770 (1956).
d) Mandel, J. and Linnig, F.J., "Statistical Methods in Chemistry", Anal. Chem. 30, 739 (1958).
These 4 articles are excellent review articles, successively bringing up to date the recent developments in statistical theory and statistical applications which are of interest in chemistry. They contain extensive bibliographies, divided by subject matter, and thus provide means for locating articles on control charts in the laboratory. They are not limited to control chart applications, however.
e) For a special technique with ordnance examples, see: Grubbs, F.E., "The Difference Control Chart with Example of its Use", Industrial Quality Control, Vol. III, No. 1, July 1946.
Industrial Quality Control, the monthly journal of the American Society for Quality Control, is the most comprehensive publication in this field.
-
3. Statistical Techniques for Analyzing Extreme-Value Data*
Classical applications of statistical methods, which
frequently concern average values and other quantities fol-
lowing the symmetrical normal distribution, are inadequate
when the quantity of interest is the largest or the smallest
in a set of magnitudes. This is the situation in a number
of fields, in many of which applications of methods for
dealing with extremes have already been made. Meteorological
phenomena that involve extreme pressures, temperatures, rain-
falls, wind velocities, etc., may be treated by extreme-value
techniques. The techniques are also applicable in the study
of floods and droughts.
Other examples occur in the fracturing of metals, tex-
tiles, and other materials under applied force, and in fatigue
phenomena [1,5,6]. In these instances, the observed strength
of a specimen often differs from the calculated strength, and
depends, among other things, upon the length and volume. An
explanation is to be found in the existence of weakening flaws
assumed to be distributed at random in the body and assumed
not to influence one another in any way. The observed strength
is determined by that of the weakest region - just as no chain
is stronger than its weakest link. It is thus apparent that
whenever extreme observations are encountered it will pay to
consider the use of extreme-value techniques.
*) This section is adapted, with permission, from [10] and [14].
-
Use of Extreme-Value Technique - Largest Values
A simplified account is given here. For the detailed
theory and methods, consult the primary references [1], [2],
and [3]. References [4] through [15] give some selected applications.
More extensive lists of references can be found in
[1], [3], and [9].
Figure 1 illustrates the frequency form of a typical
curve for the distribution of largest observations.
[Figure 1. Frequency curve for the distribution of largest values.]
... while very high ones do occasionally occur. Theoretical considerations
lead to a curve of this nature, called the distribution
of largest values or the extreme-value distribution.
In using the extreme-value method, all the observed
maxima, such as the largest wind velocity observed in each
year during a fifty-year period, are first ranked in order
of size from the smallest to the largest. These values are
given ranks 1 to n and are then transformed into cumulative
frequencies by the relation $P = i/(n + 1)$, where i is the
rank of the ith observation counting from the smallest. Thus
a plotting position is obtained for each observation. The
data are plotted on a special graph paper, called extreme-value
probability paper, designed so that the "ideal" extreme-value
distribution will plot exactly as a straight line. Consequently,
the closeness of the plotted points to a straight line is
an indication of how well the data fit the theory.
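The ranking and plotting steps can be carried out without special paper. The sketch below computes the plotting positions i/(n + 1) and, on the assumption that the probability scale follows the doubly-exponential form P = exp(-exp(-y)), converts each to a reduced variate y = -ln(-ln P) so that an ideal extreme-value sample falls on a straight line. The least-squares line is a simple stand-in used here for illustration; it is not the fitting method of the primary references. All function names are hypothetical.

```python
import math


def plotting_positions(maxima):
    """Pair each ranked observation with its cumulative frequency i/(n + 1)."""
    x = sorted(maxima)
    n = len(x)
    return [(value, i / (n + 1)) for i, value in enumerate(x, start=1)]


def reduced_variate(p):
    """Position on the probability scale: y = -ln(-ln p), for 0 < p < 1."""
    return -math.log(-math.log(p))


def fit_line(maxima):
    """Least-squares line x = intercept + slope * y through the plotted points."""
    points = plotting_positions(maxima)
    xs = [value for value, _ in points]
    ys = [reduced_variate(p) for _, p in points]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((y - y_bar) * (x - x_bar) for x, y in zip(xs, ys))
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    slope = s_xy / s_yy
    return x_bar - slope * y_bar, slope


print(plotting_positions([3, 1, 2]))
```

Points that scatter closely about the fitted line indicate that the data fit the theory; marked curvature suggests they do not.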
Extreme-value probability paper has a uniform scale
along one axis, usually the vertical, which is used for the
observed values, as in Figure 2. The horizontal axis then
serves as the probability scale and is marked according to
the doubly-exponential formula. Thus, in Figure 2, the space
between 0.01 and 0.5 is much less than the space between 0.5
and 0.99. The limiting values 0 and 1 are never reached, as
is true of any probability paper designed for an unlimited
variate.
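The unequal spacing of the probability scale can be verified directly. Assuming the doubly-exponential formula takes the usual form P = exp(-exp(-y)), a probability P is marked at the distance y = -ln(-ln P) along the axis; the function name `scale_position` is hypothetical.

```python
import math


def scale_position(p):
    """Distance along the probability axis at which p is marked."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return -math.log(-math.log(p))


lower_gap = scale_position(0.5) - scale_position(0.01)
upper_gap = scale_position(0.99) - scale_position(0.5)
print(round(lower_gap, 3), round(upper_gap, 3))
```

The upper gap is more than twice the lower one, and the positions run off to infinity as P approaches 0 or 1, which is why those limiting values never appear on the paper.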
An extreme-value plot of the maximum atmospheric
pressures in Bergen, Norway for the period between 1857 and
1926 (Figure 2) showed by inspection that the observed data
satisfactorily fitted the theory. Fitting the line by eye
may be sufficient. Details of fitting a computed line can
be found in [1]. From the fitted straight line it is possible
to predict, for example, that a pressure of 793 mm corresponds
to a probability of 0.994, that is, pressures of this
magnitude have less than 1 chance in 100 of being exceeded in
any particular year.
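The arithmetic behind the Bergen statement is short. The sketch below assumes the usual definition of return period, T = 1/(1 - P), where P is the probability that the annual maximum does not exceed the given value; the function names are hypothetical.

```python
def exceedance_probability(p):
    """Chance that the value is exceeded in any one year."""
    return 1.0 - p


def return_period(p):
    """Average number of years between exceedances, T = 1/(1 - p)."""
    return 1.0 / (1.0 - p)


p = 0.994
print(round(exceedance_probability(p), 3), round(return_period(p), 1))
```

For P = 0.994 the chance of exceedance in a given year is 0.006, which is indeed less than 1 in 100.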
In studies of the normal acceleration increments ex-
perienced by an airplane flying through gusty air [9, page
394], an instrument was employed that indicated only the max-
imum shocks. Thus, only one maximum value was obtained from
a single flight. A plot representing 26 flights of the same
aircraft indicated that the probability that the largest recorded
gust will not be exceeded in any other flight was 0.96, i.e.,
a chance of four in 100 of encountering a more severe gust
than any recorded. A more recent study [13] presents refine-
ments especially adapted to very small samples of extreme data
and also to larger samples where it is necessary to obtain the
greatest amount of information from a limited set of costly
data.
[Figure 2. Annual maxima of atmospheric pressures, Bergen, Norway, 1857-1926. Vertical axis: pressure in mm; horizontal axis: return period.]
-
Smallest Values.
Extreme-value theory can also be used to study the
smallest observations, since the corresponding limiting dis-
tribution is simply related to the one previously given.
The steps in applying the "smallest-value" theory are very
similar to the largest-value case. For example, engineers
have long been interested in the problem of predicting the
tensile strength of a bar or specimen of homogeneous material.
One approach is to regard the specimen as being composed of
a large number of pieces of very short length. The tensile
strength of the entire specimen is obviously limited by the
strength of the weakest of these small pieces. Thus the
tensile strength at which the entire specimen will fail is
a smallest-value phenomenon. The smallest-value approach
can be used even though the number and individual strengths
of the "small pieces" are unknown.
This method has been applied with considerable success
by Shigeo Kase in studying the tensile testing of rubber [11].
Using 200 specimens obtained so as to assure as much homogeneity
as possible, he found that the observed distribution of their
tensile strengths could be fitted remarkably well by the extreme-
value distribution for smallest values. The fitted curve given
by the data indicates that one-half of a test group of specimens
may be expected to break under a tensile stress of 105 kg/cm$^2$
or more, while only 1 in a thousand will survive a stress exceeding
126 kg/cm$^2$.
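The two figures quoted for the rubber data are enough to illustrate the form of the smallest-value distribution. The sketch below assumes the usual survival function S(x) = exp(-exp((x - u)/b)) and back-calculates the location u and scale b from the statements that half the specimens survive 105 kg/cm$^2$ and 1 in 1000 survives 126 kg/cm$^2$. These are illustrative values implied by the quoted figures, not the parameters actually fitted in [11], and the function names are hypothetical.

```python
import math


def smallest_value_survival(x, u, b):
    """Probability that strength exceeds x: S(x) = exp(-exp((x - u)/b))."""
    return math.exp(-math.exp((x - u) / b))


def parameters_from_two_points(x1, s1, x2, s2):
    """Solve for (u, b) from two (stress, survival probability) pairs."""
    y1 = math.log(-math.log(s1))
    y2 = math.log(-math.log(s2))
    b = (x2 - x1) / (y2 - y1)
    u = x1 - b * y1
    return u, b


u, b = parameters_from_two_points(105.0, 0.5, 126.0, 0.001)
print(round(u, 2), round(b, 2))
print(round(smallest_value_survival(115.0, u, b), 3))
```

The same plotting procedure as for largest values applies after changing the sign of the observations, since the smallest of a set is the negative of the largest of the negated set.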
Missing Observations.
It has been found that fatigue life of specimens under
fixed stress can be treated in the same manner as tensile
strength, by using the theory of smallest values. An extensive application of this method is given in [15].
In such cases and in other situations, tests must sometimes be
stopped before all specimens have failed. This results in
a sample with missing values, a "censored" sample. Methods
for handling such data are included in [15].
References
Primary references.
[1] E.J. Gumbel, "Statistical Theory of Extreme Values and Some Practical Applications", National Bureau of Standards Applied Mathematics Series 33, 1954 (U.S. Government Printing Office, Washington, D.C.).
[2] National Bureau of Standards, "Probability Tables for the Analysis of Extreme-Value Data", Applied Mathematics Series 22, 1953 (U.S. Government Printing Office, Washington, D.C.).
[3] E.J. Gumbel, "Statistics of Extremes", Columbia University Press, New York, 1958.
Selected references on specific applications.
[4] American Society for Testing Materials, "Symposium on Fatigue with Emphasis on Statistical Approach - II", ASTM publication 137, Philadelphia, 1952. Discussions by E.J. Gumbel on pp. 22 and 56.
[5] B. Epstein and H. Brooks, "The Theory of Extreme Values and Its Implications in the Study of the Dielectric Strength of Paper Capacitors", J. Applied Physics, Vol. 19, pp. 544-550 (1948).
[6] A.M. Freudenthal and E.J. Gumbel, "On the Statistical Interpretation of Fatigue Tests", Proceedings of the Royal Society A, 216, pp. 309-332 (1953).
[7] A.M. Freudenthal and E.J. Gumbel, "Minimum Life in Fatigue", Journal of the Am. Stat. Assoc., Vol. 49, pp. 575-597 (September 1954).
[8] Y.C. Fung, "Statistical Aspects of Dynamic Loads", Journal of the Aeronautical Sciences, Vol. 20, pp. 317-330 (May 1954).
[9] E.J. Gumbel and P.G. Carlson, "Extreme Values in Aeronautics", Journal of the Aeronautical Sciences, Vol. 21, No. 6, pp. 389-398 (June 1954).
[10] E.J. Gumbel and J. Lieblein, "Some Applications of Extreme-Value Methods", The American Statistician, Vol. 8, No. 4, December 1954.
[11] Shigeo Kase, "A Theoretical Analysis of the Distribution of Tensile Strength of Vulcanized Rubber", Journal of Polymer Science, Vol. 11, No. 5, pp. 425-431, November 1953.
[12] Julius Lieblein, "On the Exact Evaluation of the Variances and Covariances of Order Statistics in Samples from the Extreme-Value Distribution", Annals of Mathematical Statistics, No. 2, pp. 282-287, June 1953.
[13] Julius Lieblein, "A New Method of Analyzing Extreme-Value Data", NACA Technical Note 3053, National Advisory Committee for Aeronautics, January 1954.
[14] Anonymous, "Extreme-Value Methods for Engineering Problems", National Bureau of Standards Technical News Bulletin, 38, No. 2, pp. 29-31, February 1954.
[15] Julius Lieblein and Marvin Zelen, "Statistical Investigation of the Fatigue Life of Deep Groove Ball Bearings", Journal of Research of the National Bureau of Standards, 57, No. 5, November 1956.
4. Transformations
Statistical techniques must rest on some assumptions. Often
there is an assumption of normality; in some techniques there
is an assumption of equal variance among groups; and so on.
Analysis of variance tests (in Part III) depend on assumptions
of normality, additivity, independence, and equality of
variance. Real-life data do not always fit the assumptions
required by commonly used methods of analysis. Where this is
the case, a transformation (change of scale) applied to the
raw data may put the data in such form that a conventional
analysis can be validly performed.
Many physical measurement processes produce normally
distributed data. Some do not. If the data are definitely
known to be non-normal, distribution-free techniques are
available (see Part I, section 2.7). If, however, a simple
transformation is known to normalize the particular kind of
data, one might prefer to use normal techniques on the
transformed data rather than distribution-free methods.
Certain kinds of data, for example, are well known to be
approximately normal in logs, and the use of a log
transformation in these cases may become routine.
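A small numerical sketch may make the effect of a log transformation concrete. The data below are made-up values chosen so that their logarithms are evenly spaced; the skewness coefficient (third central moment divided by the 3/2 power of the second) is used here only as a convenient indicator of asymmetry and is not a test of normality.

```python
import math

def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (zero for symmetric data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical measurements with a long upper tail.
raw = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
logged = [math.log10(v) for v in raw]

print("skewness of raw data:   ", skewness(raw))
print("skewness of logged data:", skewness(logged))
```

The raw values are strongly skewed to the right, while their logarithms are symmetric about their mean, so normal-theory methods are more plausible on the log scale for data of this kind.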
Transformations are also used to achieve equality of
variance among groups, and quite often both equality of
variance and approximate normality are achieved simultane-
ously. Transformations are not recommended for uncritical
use by the amateur. From small amounts of data it is often
not clear which transformation to use and it is not always
clear whether or not the transformation has helped. Graphical
methods can be used to examine the effect of a transformation;
see, for example, [2]. The table following, taken
from a larger table of Bartlett [1], shows several frequently
used variance-stabilizing transformations.
[1] M.S. Bartlett, "The Use of Transformations", Biometrics, Vol. 3, No. 1, 1947 (pp. 39-52).
[2] A. Hald, "Statistical Theory with Engineering Applications", John Wiley and Sons, New York, 1952.
Table of Transformations (table body not legible in the source)
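To illustrate what "variance stabilizing" means in practice, the sketch below uses three made-up groups whose spread grows in proportion to their level. The choice of the log transformation for this situation is an assumption made for the illustration (it is the standard choice when the standard deviation is proportional to the mean); it is not a reconstruction of any particular row of Bartlett's table.

```python
import math
import statistics

# Hypothetical groups: each is ten times the previous one, so the
# standard deviation is proportional to the group mean.
groups = {
    "A": [1.0, 2.0, 4.0],
    "B": [10.0, 20.0, 40.0],
    "C": [100.0, 200.0, 400.0],
}

def group_variances(data, transform=lambda v: v):
    """Sample variance of each group after applying the same transformation."""
    return {name: statistics.variance([transform(v) for v in values])
            for name, values in data.items()}

raw_var = group_variances(groups)
log_var = group_variances(groups, math.log10)

for name in groups:
    print(name, "raw variance:", raw_var[name], " log variance:", log_var[name])
```

On the original scale the group variances differ by a factor of ten thousand; on the log scale they are equal, which is the condition the analysis of variance assumes.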
5. Notes on Statistical Computations.
5.1 Coding in statistical computations. Coding is the term
applied to arithmetical operations on the original data in-
tended to make the numbers easier to handle in computation.
The possible coding operations are:
(a) multiplication (or its inverse, division) to
change the order of magnitude of the recorded
numbers for computing purposes.
(b) addition (or its inverse, subtraction) of a
constant - applied to recorded numbers which are
nearly equal, to reduce the number of figures
which need be carried in computation.
Coding is clearly desirable when the recorded results contain
non-significant zeros (e.g., numbers like .000121 or 11,100).
There is obviously no point in copying these zeros a large
number of times, or in adding further useless zeros when
squaring, etc. Of course, these results could have been given
as 1.21 x 10^-4 or 11.1 x 10^3, in which case coding for order
of magnitude would not be necessary.
The purpose of coding is to save labor in computation.
On the other hand, the process of coding and decoding the
results introduces more opportunities for error in computation.
The decision of whether to code or not must be carefully
considered, weighing the advantage of saved labor against
the disadvantage of more likely mistakes. With this in mind
the following rules are given for coding and decoding:
1. The whole set of observed results must be treated alike.
2. The possible coding operations are the two general types
of arithmetic operations: (a) addition (or subtraction) and
(b) multiplication (or division). Either (a) or (b), or both
together, may be used as necessary to make the original numbers
more tractable.
3. Needless to say, keep careful note of how the data have
been coded.
4. The desired computation is carried out on the coded data.
5. The process of decoding a computed result depends on the
computation that has been carried out, and therefore is shown
below separately for several common computations.
a. The mean is affected by every coding operation.
Therefore, apply the inverse operation and reverse
the order of coding to put the coded mean back into
original units. For example, if the data have been
coded by first multiplying by 10,000 and then subtracting
120, decode the mean by adding 120 and then
dividing by 10,000.
Example:
Obs. results     Coded
.0121             1
.0130            10
.0125             5
Mean .0125       Coded mean 5
Decoding:
Mean = (Coded mean + 120) / 10,000 = 125 / 10,000 = .0125
b. A standard deviation computed on coded data is
affected only by multiplication or division. The
standard deviation is a measure of dispersion, like
the range, and is not affected by adding a constant to,
or subtracting it from, the whole set of data. Therefore,
if the data have been coded by addition or
subtraction only, no adjustment is needed in the
computed standard deviation. If the coding has
involved multiplication (or division), the inverse
operation must be applied to the computed standard
deviation to bring it back to original units.
c. A variance computed on coded data will have to be
multiplied by the square of the coding factor if
division has been used in coding; divided by the
square of the coding factor if multiplication was
used in coding.
d. "Coding" which involves loss of significant figures:
Notice that the kind of coding discussed so far has
not involved any loss in significant figures. There
is a process, however, called "coding" that involves
both coding and rounding. This usually arises where
the data are considered to be too finely recorded
for the purpose. It is sometimes a desirable procedure,
but one should realize that more is involved here
than just arithmetic. This type of coding involves a loss of
information, and therefore its application requires more judgment.
For example, suppose the data consisted of weights of
shipments (in pounds) of some bulk material:
7,123
10,056
100,310
5,097
543
etc.
If the interest is in the average, and the range is
large, as here, one may decide to save trouble by working with
weights to the nearest hundred pounds. The "coded" data would
then be:
71
101
1003
51
5
etc.
Whether this is sufficient information will depend on the
range of the data and the uses to which the results will
be put. The only caution is that a higher order of judgment
is required than was required in the previous examples. The
grouping of data that is done in making a frequency distri-
bution is coding of this nature and does involve rounding.
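The decoding rules 5a to 5c, and the rounding kind of coding in 5d, can be sketched as follows. The three observed values are illustrative only (chosen so that the coded mean is a whole number), and the helper names are hypothetical. Exact fractions are used so that the coded and uncoded routes can be compared without floating-point noise.

```python
from fractions import Fraction

def mean(values):
    return sum(values) / len(values)

def variance(values):
    """Sample variance with divisor n - 1."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

# Exact coding: multiply by 10,000, then subtract 120 (whole set alike).
observed = [Fraction(s) for s in ("0.0121", "0.0130", "0.0124")]
factor, shift = 10000, 120
coded = [v * factor - shift for v in observed]        # 1, 10, 4

# Rule 5a: the mean is affected by both operations; undo them in reverse order.
decoded_mean = (mean(coded) + shift) / factor
# Rule 5c: the variance is affected only by the factor, and by its square.
decoded_variance = variance(coded) / factor ** 2
# Rule 5b: the standard deviation is affected only by the factor.
decoded_sd = float(variance(coded)) ** 0.5 / factor

print("mean:", float(decoded_mean), " sd:", decoded_sd)

# Rule 5d: coding with rounding (weights to the nearest hundred pounds).
weights = [7123, 10056, 100310, 5097, 543]
coded_weights = [(w + 50) // 100 for w in weights]    # 71, 101, 1003, 51, 5
exact_mean = Fraction(sum(weights), len(weights))
approx_mean = Fraction(sum(coded_weights), len(coded_weights)) * 100
print("exact mean:", float(exact_mean), " from coded data:", float(approx_mean))
```

For the exact coding the decoded results agree with those computed directly from the observations; for the rounded weights the decoded mean is close to, but not equal to, the exact mean, which is the loss of information referred to in 5d.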
5.2 Rounding in statistical computations
5.2.1 Rounding of numbers: Rounded numbers are inherent in
the process of reading and recording data. The readings of
an experimenter are rounded numbers to start with, since all
measuring equipment is of limited accuracy. Often he records
the results to even less accuracy than is attainable with the
available equipment, simply because such results are completely
adequate for his immediate purpose. Computers are often required
to round numbers, either to simplify the arithmetic calculations
or because it cannot be avoided, as when 3.1416 is used for π
or 1.414 is used for √2. When a number is to be rounded to
a specific number of significant figures, the rounding procedure
should be carried out in accordance with the following
rule:
(1) When the figure next beyond the last place to be
retained is less than 5, the figure in the last place
retained should be kept unchanged.
(2) When the figure next beyond the last place to be
retained is greater than 5, the figure in the
last place retained should be increased by 1.
(3) When the figure next beyond the last place to be
retained is 5, and
(a) there are no figures or only zeros beyond this
5, an odd figure in the last place to be retained
should be increased by 1; an even figure should
be kept unchanged;
(b) the 5 is followed by any figures other than
zero, the figure in the last place to be
retained should be increased by 1 whether odd or
even.
The final recorded result should be rounded off in one
step from the most precise value available, and not in two or
more steps of successive rounding.
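The rule above is the one usually called "round half to even". A minimal sketch using Python's decimal module is given below; the function name round_sig is a hypothetical helper, and values are passed as strings so that the digits are exactly those written down.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_sig(value, n):
    """Round a decimal string (or Decimal) to n significant figures."""
    d = Decimal(value)
    if d == 0:
        return d
    # Power of ten of the last place to be retained.
    last_place = d.adjusted() - (n - 1)
    return d.quantize(Decimal(1).scaleb(last_place), rounding=ROUND_HALF_EVEN)

print(round_sig("2.344", 3))    # rule (1): next figure less than 5
print(round_sig("2.346", 3))    # rule (2): next figure greater than 5
print(round_sig("2.335", 3))    # rule (3a): odd figure is increased
print(round_sig("2.345", 3))    # rule (3a): even figure is kept
print(round_sig("2.3451", 3))   # rule (3b): 5 followed by a non-zero figure

# Rounding in one step versus two successive steps.
one_step = round_sig("8.3451", 3)
two_steps = round_sig(round_sig("8.3451", 4), 3)
print(one_step, two_steps)
```

The last two lines show why successive rounding is to be avoided: rounding 8.3451 first to four figures discards the trailing 1, so the second rounding treats the value as an exact 5 case and gives a different answer from the one-step result.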
5.2.2 Rounding the results of single arithmetic operations:
Nearly all numerical calculations arising in the problems of
everyday life are in some way approximate. The aim of the
computer should be to obtain results consistent with the
data with a minimum of labor. He can be guided in the various
arithmetical operations by some basic rules regarding
significant figures and the rounding of data.
1. Addition: When several approximate numbers are to be added,
the sum should be rounded to a number of decimal places
(not significant figures) no greater than that of the addend
which has the smallest number of decimal places.
Although the result is determined by the least
accurate of the numbers entering the operation,
one more decimal place in the more accurate
numbers should be retained, thus eliminating
inherent errors in the numbers.
2. Subtraction: When one approximate number is to be
subtracted from another, they must both be rounded off
to the same place before subtracting.
Errors arising from the subtraction of nearly
equal approximate numbers are frequent and
troublesome, often making the computation practically
worthless. They can sometimes be avoided
when the two nearly equal numbers can be approximated
to more significant digits.
3. Multiplication: If the less accurate of two approximate
numbers contains n significant digits, their product can
be relied upon for n digits only, and should not be written
with more.
As a practical working plan, carry intermediate
computations out in full and round off the final
result in accordance with the rule stated above.
4. Division: If the less accurate of either the dividend or
the divisor contains n significant digits, their quotient
can be relied upon for n figures only and should not be
written with more.
Carry intermediate computations out in full and
round off the final result in accordance with the
rule stated above.
5. Powers and Roots: If an approximate number contains n
significant digits, its power can be relied upon for n
digits only; its root can be relied upon for at least
n digits.
6. Logarithms: If the mantissa of the logarithm in an
n-place log table is not in error by more than two units
in the last significant figure, the antilog is correct
to n-1 significant figures.
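Rules 1 to 3 can be sketched with exact decimal arithmetic as follows. The helper names and the example numbers are hypothetical, and the sketch assumes that every digit written in an input (including trailing zeros) is significant.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_places(d, places):
    """Round to a given number of decimal places."""
    return d.quantize(Decimal(1).scaleb(-places), rounding=ROUND_HALF_EVEN)

def round_sig(d, n):
    """Round to n significant figures."""
    if d == 0:
        return d
    return d.quantize(Decimal(1).scaleb(d.adjusted() - n + 1),
                      rounding=ROUND_HALF_EVEN)

def decimal_places(d):
    return max(0, -d.as_tuple().exponent)

def sig_digits(d):
    return len(d.as_tuple().digits)

def add_approx(numbers):
    """Rule 1: keep one extra place in the more accurate addends,
    then round the sum to the places of the least accurate addend."""
    least = min(decimal_places(d) for d in numbers)
    trimmed = [round_places(d, least + 1) if decimal_places(d) > least + 1 else d
               for d in numbers]
    return round_places(sum(trimmed, Decimal(0)), least)

def mul_approx(a, b):
    """Rule 3: multiply in full, then round to the smaller digit count."""
    return round_sig(a * b, min(sig_digits(a), sig_digits(b)))

total = add_approx([Decimal("12.3"), Decimal("4.567"), Decimal("0.08912")])
product = mul_approx(Decimal("3.14"), Decimal("2.7183"))
# Rule 2: two five-figure numbers that nearly cancel.
difference = Decimal("1.2345") - Decimal("1.2344")
print(total, product, difference, sig_digits(difference))
```

The subtraction shows the difficulty noted under rule 2: two numbers each good to five figures leave a difference with a single significant figure.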
The foregoing statements are rules only. Fuller explanations
of the rules, tests for determining the accuracy of
computations when it is desirable to attain or to state a
certain degree of precision, and the effects of rounding on
statistical analyses of large numbers of observations are
ably handled by Scarborough [1] and Eisenhart [2].
5.2.3 Rounding the results of a series of arithmetic operations:
Most engineers and physical scientists well know the rules for
reporting a result to the proper number of significant figures.
From a computational point of view, they know these rules too
well. It is perfectly true, for example, that a product of two
numbers should be reported to the same number of significant
figures as the least accurate of the two numbers. It is not so
true that the two numbers should be rounded to the same number
of significant figures before multiplication. A better rule is
to round the more accurate number to one more figure than the
less accurate one and then round the product to the same number
as the less accurate one. The great emphasis against reporting
more figures than are reliable has led to a prejudice against
carrying enough figures in computation.
Assuming that the reader is familiar with the rules of the
preceding section regarding significant figures in a single
arithmetical operation, this section will stress the less
well-known difficulties which arise in a computation consisting of
a long series of different arithmetic operations. In such a
computation, strict adherence to the rules at each stage can
wipe out all meaning from the final results.
For example, in computing the slope of a straight line
fitted to data with 3 significant figures, one would surely
not report the slope to 7 significant figures; but if one
rounds to 3 significant figures after each necessary step
in the computation, one might end up with no significant
figures in the slope.
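The slope example can be tried numerically. The sketch below uses made-up data recorded to 3 significant figures and the usual least-squares formula, slope = (n Σxy - Σx Σy) / (n Σx² - (Σx)²); the function names and the keep parameter are hypothetical. With keep set, every intermediate result is rounded to that many significant figures.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_sig(d, n):
    """Round to n significant figures."""
    if d == 0:
        return d
    return d.quantize(Decimal(1).scaleb(d.adjusted() - n + 1),
                      rounding=ROUND_HALF_EVEN)

def slope(xs, ys, keep=None):
    """Least-squares slope; if keep is given, round after every step."""
    r = (lambda v: v) if keep is None else (lambda v: round_sig(v, keep))
    n = len(xs)
    sx = r(sum(xs, Decimal(0)))
    sy = r(sum(ys, Decimal(0)))
    sxy = r(sum((r(x * y) for x, y in zip(xs, ys)), Decimal(0)))
    sxx = r(sum((r(x * x) for x in xs), Decimal(0)))
    numerator = r(r(n * sxy) - r(sx * sy))
    denominator = r(r(n * sxx) - r(sx * sx))
    if denominator == 0:
        return None          # all significance lost
    return r(numerator / denominator)

# Hypothetical data, each value given to 3 significant figures.
xs = [Decimal(s) for s in ("100", "101", "102", "103", "104")]
ys = [Decimal(s) for s in ("5.00", "5.10", "5.20", "5.30", "5.40")]

print("full precision:      ", slope(xs, ys))
print("3 figures every step:", slope(xs, ys, keep=3))
print("6 figures, then 3:   ", round_sig(slope(xs, ys, keep=6), 3))
```

With rounding to 3 figures at every step, the two terms of the denominator both round to 260,000 and their difference vanishes, so no slope can be computed at all; carrying three extra figures and rounding only at the end recovers the correct value.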
It is easily demonstrated by carrying out a few computations
of this nature that there is real danger of losing all
significance by too-strict adherence to rules devised for
use at the final stage. The greatest trouble of this kind
comes where we must subtract two nearly equal numbers, and many
statistical computations involve such a step.
The rules generally given for rounding off were given
in a period when the average was the only property of interest
in a set of data. Reasonable rounding does little damage to
the average. Almost always now, however, we calculate the
standard deviation, and this statistic does suffer from
too-strict rounding. Suppose we have a set of numbers:
3.1
3.2
3.3
Avg. = 3.2
If the three numbers are rounded off to 1 significant figure,
they are then all the same. The average of the rounded
figures is the same as the rounded average of the original
figures, but all information about the variation in the
original numbers is lost.
The generally recommended procedure is to carry two
or three extra figures throughout the computation and then
to round off the final reported answer (e.g., standard devia-
tion, slope of a line, etc.) to a number of significant figures
consistent with the original data.
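The three-number example can be checked directly. This is a small sketch using Python's statistics module; the sample standard deviation (divisor n - 1) is assumed, since the text does not say which divisor is intended.

```python
import statistics

original = [3.1, 3.2, 3.3]
# Rounding each value to 1 significant figure (here, to the nearest unit).
rounded = [float(round(v)) for v in original]

mean_then_round = round(statistics.mean(original))
round_then_mean = statistics.mean(rounded)

sd_original = statistics.stdev(original)
sd_rounded = statistics.stdev(rounded)

print("averages:", mean_then_round, round_then_mean)
print("standard deviations:", sd_original, sd_rounded)
# Recommended procedure: compute with the full figures, round only at the end.
print("reported standard deviation:", round(sd_original, 1))
```

The average survives the early rounding, but the standard deviation computed from the rounded values is zero, so all information about the variation is gone before the computation starts.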
[1] J.B. Scarborough, Chapter I, Numerical Mathematical Analysis, The Johns Hopkins Press, Baltimore (1930).
[2] C. Eisenhart, Chapter 4, Techniques of Statistical Analysis, McGraw-Hill Book Co., New York (1947).