
Czechoslovak Academy of Sciences, Institute of Microbiology

Biom. J. 24 (1982), no. 6, 493-501

On Errors of Measurement

M. JÍLEK

Abstract

When any process of measuring is considered, one of the basic questions is how to assess the precision of measurement methods and/or instruments. In this paper, this question is formulated and solved as a problem of tolerance regions for absolute and relative normally distributed errors of measurement.

Key words: Precision, absolute errors of measurement, normal distribution, tolerance regions.

1. Introduction

One of the most frequent questions arising when the precision of any measurement method and/or measurement instrument is investigated in a laboratory is that of the largest possible or allowable error of measurement, absolute or relative. It seems that, so far, no adequate answer to this question has been suggested.

Most frequently the interval

(1.1)  $\bigl(-u((1+\beta)/2)\,\sigma,\ +u((1+\beta)/2)\,\sigma\bigr)$

is used, where $\beta$ is a given real number ($0 < \beta < 1$), $u(p)$ is the $p$-quantile of the standard normal distribution $N(0, 1)$ defined by the equation

$(2\pi)^{-1/2} \int_{-\infty}^{u(p)} e^{-t^2/2}\,dt = p$,

and $\sigma$ is the standard deviation of the distribution of random errors. Implicit in this approach is the assumption that these errors are normally $N(0, \sigma^2)$ distributed, which holds, at least approximately, very often, but not always. If the normality assumption was fulfilled and $\sigma^2$ was known as well, $100\beta$ percent of the population of random errors would fall in the interval (1.1) and the answer to the above question could be secured. Mostly, however, $\sigma^2$ is not known and must thus be estimated. Especially when complex measurement methods are implemented for the investigation of expensive or scarce materials, the experimental data to be used for this estimate are not very numerous, and it is well known that the variability of point estimates of variance is by no means negligible.
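As an illustration of interval (1.1) for the idealized case of a known standard deviation, the following minimal Python sketch may help. The function names `u` and `error_interval` and the input values are hypothetical and are not taken from the paper; only the standard library is assumed.

```python
from statistics import NormalDist
from typing import Tuple


def u(p: float) -> float:
    """p-quantile of the standard normal distribution N(0, 1)."""
    return NormalDist().inv_cdf(p)


def error_interval(beta: float, sigma: float) -> Tuple[float, float]:
    """Interval (1.1) for errors distributed N(0, sigma^2), with sigma known."""
    half_width = u((1.0 + beta) / 2.0) * sigma
    return (-half_width, half_width)


# Hypothetical input: beta = 0.95 and a known standard deviation of 1.8.
lower, upper = error_interval(beta=0.95, sigma=1.8)
print(f"({lower:.3f}, {upper:.3f})")
```

The sketch only covers the case in which the standard deviation is treated as known; the rest of the paper deals with the more usual case in which it must be estimated.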

2. Formulation of the problem

Let us assume that the $X$'s obtained with the use of the measuring method under consideration are i.i.d. normal random variables with a common mean $\mu$ and variance $\sigma^2$.

The absolute error of measurement $\Delta$ is usually defined as

(2.1)  $\Delta = X - \mu$,

while the relative error of measurement $\delta$ as

(2.2)  $\delta = \dfrac{X - \mu}{\mu} \quad (\mu \neq 0)$.

Then, clearly, $\Delta \sim N(0, \sigma^2)$ and $\delta \sim N(0, \sigma^2/\mu^2)$.

In the literature, various measures are employed to assess the precision of measurement, the variance, the standard deviation, and their estimates being those most often used. The employment of these statistics for the comparison of two or more methods or devices, or of the work of diverse laboratories (see, e.g., MAURICE and WIGGERS, 1959, MANDEL, 1971, GRUBBS, 1973) is appropriate. On the other hand, their application is not so clear-cut when the precision of one method is concerned, and they do not give a reply to the above-mentioned question of the experimenters. Nor can confidence intervals for these parameters be satisfactorily interpreted.

The only reasonable answer to the given question is provided by ($\beta$-content) tolerance regions for measurement errors: for given $\beta$ and $\gamma$ we try to find a region, based on $n$ independent observations of a random variable $X$, which contains with probability $\gamma$ at least $100\beta\,\%$ of all future (potential) random errors $\Delta$ or $\delta$. In other words, for a given positive integer $n \ge 2$ and real $\beta$, $\gamma$ ($0 < \beta, \gamma < 1$) we look for statistics $D$ and $d$ such that

(2.3)  $\Pr\{(X_1, \dots, X_n):\ \Pr\{\Delta: |\Delta| \le D(X_1, \dots, X_n)\} \ge \beta\} = \gamma$

or

(2.4)  $\Pr\{(X_1, \dots, X_n):\ \Pr\{\delta: |\delta| \le d(X_1, \dots, X_n)\} \ge \beta\} = \gamma$,

respectively.

Note 1: The mean of $n$ repeated measurements of the same value is sometimes presented as a result of some process of measurement, instead of individual measured values. In this case, the problem of finding the range of random errors is only a slight modification of the original problem.
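To make requirement (2.3) concrete, the following Monte Carlo sketch estimates the outer probability in (2.3) for a rule of the form D = multiplier times the sample standard deviation. It is an illustration only: the function names, the sample size n = 10, the seed and the number of replications are hypothetical choices, not part of the paper. The multiplier tried here is the plug-in value u((1 + β)/2), i.e. interval (1.1) with the estimate used in place of the true standard deviation.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def content(D: float, sigma: float) -> float:
    """Pr{|Delta| <= D} when Delta ~ N(0, sigma^2)."""
    return 2.0 * STD_NORMAL.cdf(D / sigma) - 1.0


def achieved_confidence(multiplier, n, beta, sigma=1.0, reps=20000, seed=1):
    """Monte Carlo estimate of Pr{content of [-D, D] >= beta}, D = multiplier * s."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(xs) / n
        s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
        if content(multiplier * s, sigma) >= beta:
            hits += 1
    return hits / reps


# Plug-in rule: use u((1 + beta) / 2) as if s were the true sigma.
naive = STD_NORMAL.inv_cdf(0.975)
estimate = achieved_confidence(naive, n=10, beta=0.95)
print(f"estimated gamma for the plug-in rule: {estimate:.2f}")
```

Under these assumptions the plug-in rule reaches the required content in well under half of the simulated samples, which is one way to see why a larger multiplier is needed when the variance is estimated from few data.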


3. Tolerance regions for absolute errors

It is easy to find tolerance regions fulfilling (2.3) for the absolute errors $\Delta$ defined by (2.1), because they are normally distributed random variables with mean 0 and variance $\sigma^2$.

If the variance $\sigma^2$ is known, then, with probability 1,

(3.1)  $\Pr\{\Delta: |\Delta| \le u((1+\beta)/2)\,\sigma\} = \beta$.

If the variance $\sigma^2$ is unknown (which is, in fact, usually the case), then it is easy to see that

(3.2)  $D(x_1, \dots, x_n; \beta, \gamma) = K_{n-1}(\beta, \gamma)\, s$

in the case of unknown $\mu$, or

(3.3)  $D(x_1, \dots, x_n; \beta, \gamma) = K_n(\beta, \gamma)\, s_0$

in the case of known $\mu$, respectively, where

$s = \bigl\{ (n-1)^{-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \bigr\}^{1/2}$,

$s_0 = \bigl\{ n^{-1} \sum_{i=1}^{n} (x_i - \mu)^2 \bigr\}^{1/2}$,

$\bar{x} = n^{-1} \sum_{i=1}^{n} x_i$,

and $K$ is the so-called "tolerance factor", which can be calculated using the formula

(3.4)  $K_f(\beta, \gamma) = u((1+\beta)/2)\, [\chi^2_f(1-\gamma)/f]^{-1/2}$,

where $\chi^2_f(1-\gamma)$ is the $(1-\gamma)$-quantile of the $\chi^2$-distribution with $f$ degrees of freedom (the factors $K$ have been tabulated, see, e.g., PROSCHAN, 1953, JÍLEK and LÍKAŘ, 1960, DIEM, 1962, MÜLLER, NEUMANN and STORM, 1973, or may be easily calculated by the use of common statistical tables).
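Formula (3.4) can also be evaluated numerically instead of being read from tables. The sketch below is one possible way to do so with the Python standard library only; the helper names `chi2_cdf`, `chi2_quantile` and `tolerance_factor` are hypothetical, and the chi-square quantile is obtained by bisection on a series expansion of the distribution function, which is a choice made here for self-containment and not a method described in the paper.

```python
import math
from statistics import NormalDist


def chi2_cdf(x: float, f: float) -> float:
    """Chi-square distribution function with f degrees of freedom
    (power series of the regularized lower incomplete gamma function)."""
    if x <= 0.0:
        return 0.0
    a, z = f / 2.0, x / 2.0
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total and k < 100000:
        k += 1
        term *= z / (a + k)
        total += term
    log_prefactor = a * math.log(z) - z - math.lgamma(a + 1.0)
    return min(1.0, math.exp(log_prefactor) * total)


def chi2_quantile(p: float, f: float) -> float:
    """p-quantile of the chi-square distribution, found by bisection."""
    lo, hi = 0.0, f + 10.0 * math.sqrt(2.0 * f) + 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, f) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def tolerance_factor(f: int, beta: float, gamma: float) -> float:
    """K_f(beta, gamma) according to formula (3.4)."""
    u = NormalDist().inv_cdf((1.0 + beta) / 2.0)
    return u / math.sqrt(chi2_quantile(1.0 - gamma, f) / f)


K = tolerance_factor(104, 0.95, 0.95)
print(f"K_104(0.95, 0.95) is approximately {K:.3f}")
print(f"K * 1.8642 is approximately {K * 1.8642:.2f}")
```

With f = 104 and β = γ = 0.95 the result should agree, to about three decimals, with the tabulated value 2.2145 quoted in Example 1; multiplying by the standard deviation 1.8642 given there yields the tolerance limit of about 4.13.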

Note 2: As we already mentioned at the end of Section 2, we are sometimes interested not in individual observed values but in the average of $m$ repeated observations. If we denote the absolute error of this average by $\Delta_m$ and the corresponding tolerance limits of $\Delta_m$ by $D_m$, then clearly $D_m = D\, m^{-1/2}$.

In the study of the precision of some methods the situation is more complicated: only a few repeated observations can be carried out under homogeneous conditions, but a greater number of such small samples is available. If we can suppose that a constant variance is maintained, and if we have $k$ random samples $(x_{i1}, \dots, x_{i n_i})$ with sample sizes $n_i$ ($\ge 2$) from normal distributions $N(\mu_i, \sigma^2)$, $i = 1, \dots, k$, then the problem is solved by the statistic

(3.2a)  $D_+(x_{11}, \dots, x_{k n_k}; \beta, \gamma) = K_{N-k}(\beta, \gamma)\, s_+$

in the case of unknown $\mu_i$'s, and

(3.3a)  $D_+(x_{11}, \dots, x_{k n_k}; \beta, \gamma) = K_N(\beta, \gamma)\, s_{+0}$

in the case of known $\mu_i$'s, respectively, where

$N = \sum_{i=1}^{k} n_i$,

$s_+ = \bigl\{ (N-k)^{-1} \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \bigr\}^{1/2}$,

$s_{+0} = \bigl\{ N^{-1} \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \mu_i)^2 \bigr\}^{1/2}$,

and

$\bar{x}_i = n_i^{-1} \sum_{j=1}^{n_i} x_{ij}, \quad i = 1, \dots, k$.
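The pooled estimate for the case of unknown means can be sketched as follows. This is a minimal illustration, assuming the usual pooled standard deviation with N - k degrees of freedom; the function name `pooled_std` and the two small samples are hypothetical and are not data from the paper.

```python
import math
from typing import List, Sequence, Tuple


def pooled_std(samples: Sequence[Sequence[float]]) -> Tuple[float, int]:
    """Pooled standard deviation of k samples and its degrees of freedom N - k."""
    k = len(samples)
    N = sum(len(sample) for sample in samples)
    sum_sq = 0.0
    for sample in samples:
        mean_i = sum(sample) / len(sample)  # sample mean of the i-th sample
        sum_sq += sum((x - mean_i) ** 2 for x in sample)
    return math.sqrt(sum_sq / (N - k)), N - k


# Hypothetical data: two small samples of sizes 3 and 4.
data: List[List[float]] = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0]]
s_plus, dof = pooled_std(data)
print(f"pooled s = {s_plus:.4f} with {dof} degrees of freedom")
```

The returned degrees of freedom are the index of the tolerance factor in (3.2a), so the tolerance limit is the pooled standard deviation multiplied by the factor K with N - k degrees of freedom.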

Example 1. The precision of radioimmunoassay is to be estimated. The immunological test makes it possible to follow the presence of antiimmunoglobulin antibodies (rheumatoid factor) in sera of pregnant women during the first trimester.

(a) We have chosen $\beta = \gamma = 0.95$.

(b) The data in Tab. 1 (percentage of radioactively labelled antibodies against human IgG or IgM) have been obtained from experiments carried out by an experienced worker (REJNEK, 1979). The first three data in the first line are results of parallel analyses of sera of one pregnant woman, the next three data are the same results in a second woman, while the last four data are the results of parallel analyses of a mixture of control sera of nonpregnant women. In the second line, the first three data were obtained in the third pregnant woman, etc.

Table 1. Experimental data

25.61 27.35 28.29 24.73 26.01 26.59 25.34 25.48
27.88 29.60 27.71 22.42 30.46 28.73 26.41 25.08
21.18 27.91 31.10 23.05 26.36 28.82 26.77 26.36
27.08 23.89 28.14 28.39 25.54 26.23 25.33 30.02 28.71 22.82 29.95 30.43 27.45 26.74 24.72 26.69 24.41 29.48 22.72 28.75 27.47 26.86 26.77 30.86
30.31 27.87
29.98 30.76 30.15 30.37 28.49 26.62 28.01 28.88
28.21 31.97 29.68 26.99 29.25 24.69 27.70 30.26
29.92 29.15 28.80 27.54 33.07 29.66 28.72 26.56

18.18 17.22 17.40 18.32 16.98 17.90 18.32 15.72
22.47 19.92 16.06 19.96 19.62 21.02 17.03 18.25
17.48 18.96 16.04 18.22 18.23 19.68 16.98 14.80
17.26 20.62 24.68 18.25 17.63 19.28 19.63 20.95 18.53 21.38 17.64 15.49 17.25 16.67 18.86 17.77 24.70 19.63 19.89 16.63 19.74 18.15 19.92 16.45
15.13 16.74
18.32 14.81 14.79 17.66 15.66 16.82 17.66 19.25
18.39 17.47 19.91 18.45 16.64 16.98 17.28 16.41
18.86 22.38 17.78 18.67 13.91 13.28 16.06 16.93

(c) A tentative test of normality (QUESENBERRY, WHITAKER and DICKENS, 1976) has not rejected the normality assumption.

(d) COCHRAN's test (LIKEŠ and LAGA, 1978) carried out for these samples (size 3 or 4) has not rejected the hypothesis of common variance at the level of significance 0.05.

(e) The estimate of the common variance, based on 104 degrees of freedom, equals 3.4754; the standard deviation is $s_+ = 1.8642$.

(f) According to (3.4) the tolerance factor was calculated: $K_{104}(0.95, 0.95) = 2.2145$.

(g) According to (3.2a),

$D_+(x_{11}, \dots, x_{k n_k}; 0.95, 0.95) = 4.1283$.

Conclusion: Under the given experimental conditions, at least 95% of observations of the random variables will differ from the actual values by less than 4.13. This estimate is valid with probability 0.95.

4. Tolerance regions for relative errors

The distribution of the random variable $\delta$, defined by (2.2), depends on the variance $\sigma^2$ of the random variable $X$ as well as on its mean $\mu$.

As a matter of course, a finite length is required of the interval which contains most relative errors with the chosen confidence. This requirement makes it difficult to achieve the same degree of generality as for absolute errors. To assure the finiteness of this interval, it is necessary to assume that $\mu \neq 0$; in what follows, we shall assume that $\mu > 0$. This assumption is, however, not sufficient when the value of $\mu$ is not known.

If both parameters $\mu$ and $\sigma^2$ are known, the solution is easy: the bounds of the interval containing $100\beta\,\%$ of all potential relative errors with probability 1 are

$\pm u((1+\beta)/2)\, \sigma/\mu$.

The solution is more difficult if at least one of the parameters is unknown: the statistic $d$ for the respective cases is given in Tab. 2, the formulae being derived in Section 5. Let us note that if $\mu$ is unknown it is necessary to know or to find, on the basis of a preceding analysis of the measurement procedure, a positive number $c$ such that $\mu \ge c > 0$, if the finiteness of $d$ is to be assured (irrespective of whether $\sigma^2$ is known or not).

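In Tab. 2 the following notation is used: for given $t$, $\xi$ and $\alpha$ ($\xi > 0$, $0 < \alpha < 1$),

(4.4)  $\tau(t; \xi; \alpha) = \begin{cases} 1/t_n^{-1}(t; \alpha), & \text{if } t \ge t_n(\xi; \alpha), \\ 1/\xi & \text{otherwise,} \end{cases}$

where $t_n(\xi; \alpha)$ is the $(1-\alpha)$-quantile of the non-central $t$-distribution with the parameter of non-centrality $\xi n^{1/2}$ and $(n-1)$ degrees of freedom, and $t_n^{-1}(t; \alpha)$ is such a value $\xi'$ that $t_n(\xi'; \alpha) = t$ for given $t$.

Table 2. Statistic $d(x_1, \dots, x_n)$ when $\mu$ and/or $\sigma^2$ are unknown

[Table body not recoverable from the transcript. It gave formula (4.1) for $\mu$ known and $\sigma^2$ unknown, formula (4.2) for $\mu$ unknown and $\sigma^2$ known, and formula (4.3) for $\mu$ and $\sigma^2$ both unknown, where $\gamma_1 + \gamma_2 = 1 - \gamma$ ($0 < \gamma_1 < 1/2$, $0 < \gamma_2 < 1$).]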

Note 3: The number $c$ affects the limits of relative errors only in cases when the sample values used to determine the tolerance regions are unexpectedly low. It is therefore not necessary to be too meticulous in its choice.

Note 4: Having in mind the mode of tabulation of the corresponding distributions in the available literature (e.g. JOHNSON and WELCH, 1939, OWEN, 1962), it is appropriate to use $\gamma_1 = \gamma_2 = (1-\gamma)/2$ if we do not know enough, in advance, about the method used. On the other hand, if we know in advance that the mean of the experimental data is many times greater than their standard deviation, it suffices to choose $\gamma_2$ small. We may, however, encounter some difficulties if commonly used values of $\gamma$, such as 0.90, 0.95, 0.99 etc., are to be met. If we do not need exactly these values, we can choose e.g. $\gamma = 0.94$, $\gamma_1 = 0.05$, $\gamma_2 = 0.01$, or $\gamma = 0.989$, $\gamma_1 = 0.01$, $\gamma_2 = 0.001$, etc.

Example 2. The amount of sulphur in a certain material is determined by the combustion method. In a homogeneous sample of the material 16 parallel measurements were performed and the following values (percentage of sulphur in the material) were obtained (ECKSCHLAGER, 1969): 8.60, 8.12, 8.36, 8.67, 8.96, 8.62, 8.61, 8.75, 8.69, 8.72, 8.74, 9.01, 8.50, 8.40, 8.03, 8.30 ($\bar{x} = 8.5675$, $s = 0.2713$), and the range of relative errors of this method is to be found. We assume a normal distribution of the measured variable, with both mean and variance unknown. Furthermore, we assume that, on average, the mean amount of sulphur is greater than or equal to 8%, i.e. $c = 8$, and we choose $\beta = 0.95$, $\gamma = 0.949$ ($\gamma_1 = 0.05$, $\gamma_2 = 0.001$). For $\gamma_1 = 0.05$ and $\xi = (c/s)\{\chi^2_{n-1}(0.001)/(n-1)\}^{1/2} = 14.208643$ we find $t_{16}(14.208643; 0.05) = 81.776038$; as $t = 126.317730 \ge 81.776038$, we have $\tau(126.3; 14.2086; 0.05) = 1/87.843892 = 0.011384$ according to (4.4); $u((1+0.95)/2) = 1.959964$, and therefore $d(x_1, \dots, x_n) = 1.959964 \cdot 0.011384 = 0.022312$.

Conclusion: It is estimated that the relative error of the combustion method for sulphur is (with the given probability levels $\beta = 0.95$, $\gamma = 0.949$) not greater than 0.0223 (or 2.23%).
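The summary statistics quoted in Example 2 can be checked directly from the sixteen listed values. The short sketch below does only that; it does not attempt the non-central t step. The relation t = n^{1/2} x̄ / s used in the last line is an assumption inferred from the quoted value t = 126.317730 (it reproduces that value when s is rounded to 0.2713), since the defining formula is not preserved in the text.

```python
import math
from statistics import mean, stdev

# Sulphur percentages listed in Example 2 (16 parallel measurements).
sulphur = [8.60, 8.12, 8.36, 8.67, 8.96, 8.62, 8.61, 8.75,
           8.69, 8.72, 8.74, 9.01, 8.50, 8.40, 8.03, 8.30]

n = len(sulphur)
x_bar = mean(sulphur)       # sample mean
s = stdev(sulphur)          # sample standard deviation (divisor n - 1)
s_rounded = round(s, 4)     # the example works with s rounded to 4 decimals

# Assumed relation (inferred from the quoted numbers, not stated in the text).
t = math.sqrt(n) * x_bar / s_rounded

print(f"n = {n}, mean = {x_bar:.4f}, s = {s_rounded}, t = {t:.4f}")
```

The standard deviation matches the quoted 0.2713 only with the divisor n - 1, which indicates that this is the estimator used in the example.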


5. Proofs of results given in Section 4

(a) Let us find the statistic $d(x_1, \dots, x_n)$ fulfilling (2.4) in the case of $\mu > 0$ known, $\sigma^2$ unknown. The left-hand side of (2.4) is in this case equal to

(5.1)  $\Pr\{(x_1, \dots, x_n):\ \mu\, d(x_1, \dots, x_n)/\sigma \ge u((1+\beta)/2)\}$,

which may be rewritten as

$\Pr\{(x_1, \dots, x_n):\ \sigma^2 \le [\mu\, d(x_1, \dots, x_n)/u((1+\beta)/2)]^2\}$.

This probability should be equal to $\gamma$; hence, if we set $[\mu\, d(x_1, \dots, x_n)/u((1+\beta)/2)]^2$ equal to the upper limit of the one-sided $\gamma$-confidence interval for $\sigma^2$, we obtain (4.1).

(b) In the case of known $\sigma^2$ but unknown $\mu$, we get from (2.4), under the assumption that $d(x_1, \dots, x_n) > 0$, that with probability $\gamma$ the following inequality holds:

(5.2)  $\mu \ge u((1+\beta)/2)\, \sigma / d(x_1, \dots, x_n)$;

thereby the original problem was replaced by the problem of determining the lower one-sided confidence limit for the mean value. It is well known that this equals

$\bar{x} - u(\gamma)\, \sigma n^{-1/2}$.

On the other hand, it is assumed that $\mu \ge c$, so

(5.3)  $\mu \ge \max[c;\ \bar{x} - u(\gamma)\, \sigma n^{-1/2}]$.

If the right-hand sides of inequalities (5.2) and (5.3) are compared, (4.2) is obtained.

(c) If neither $\mu$ nor $\sigma^2$ is known, then the problem of finding the statistic $d(x_1, \dots, x_n)$ fulfilling (2.4) leads to the problem of finding the upper confidence limit of the variation coefficient. However, as shown by KOOPMANS, OWEN and ROSENBLATT (1964, Theorem 1), a confidence interval of finite length for the variation coefficient can be ensured only when a sequential procedure is used. In many experimental situations, however, a sequential sampling scheme cannot be employed. If, however, the aforementioned assumption concerning $\mu$, viz. $\mu \ge c$ where $c > 0$ is known, can be made, then (according to KOOPMANS, OWEN and ROSENBLATT, 1964, Theorem 2) a confidence interval for the variation coefficient can be found which has a finite length with probability 1. The proof of this theorem makes it possible to construct such an interval. We shall use it when deriving the formula (4.3).

As in the case (a), (5.1) can be rewritten as

(5.4)  $\Pr\{(x_1, \dots, x_n):\ \sigma/\mu \le d(x_1, \dots, x_n)/u((1+\beta)/2)\}$.

As this probability is to equal $\gamma$, the original problem was actually replaced by the problem of finding upper one-sided confidence limits for the variation coefficient. If the constructive proof of Theorem 2 in (KOOPMANS, OWEN and ROSENBLATT, 1964) is followed, then the following expression is obtained:

(5.5)  $\Pr\{(x_1, \dots, x_n):\ \sigma/\mu\ \dots\}$

If (5.4) and (5.5) are compared, it can easily be seen that $d(x_1, \dots, x_n)$ given by (4.3) satisfies (2.4), if the equality in (2.4) is replaced by an inequality ($\ge$).
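Case (b), with the variance known and the mean unknown, lends itself to a short numerical sketch. It assumes that combining the lower one-sided confidence limit for the mean with the bound c gives d = u((1+β)/2) σ / max[c; x̄ - u(γ) σ n^{-1/2}]; this closed form is a reading of the argument in (b), not a formula quoted from Table 2. The function name and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def relative_error_bound_known_sigma(x_bar: float, sigma: float, n: int,
                                     c: float, beta: float, gamma: float) -> float:
    """Assumed form of d for known sigma and unknown mean (case (b))."""
    if c <= 0.0:
        raise ValueError("c must be a positive lower bound for the mean")
    norm = NormalDist()
    # Lower one-sided gamma-confidence limit for the mean.
    lower_mu = x_bar - norm.inv_cdf(gamma) * sigma / math.sqrt(n)
    # The mean is bounded below by the larger of c and the confidence limit.
    return norm.inv_cdf((1.0 + beta) / 2.0) * sigma / max(c, lower_mu)


# Hypothetical inputs: sample mean 8.5675 from n = 16 values, sigma treated
# as known and equal to 0.27, lower bound c = 8, beta = gamma = 0.95.
d = relative_error_bound_known_sigma(8.5675, 0.27, 16, 8.0, 0.95, 0.95)
# A lower sample mean for which the bound c becomes the active constraint.
d_low_mean = relative_error_bound_known_sigma(8.05, 0.27, 16, 8.0, 0.95, 0.95)
print(f"d = {d:.4f}, d with low sample mean = {d_low_mean:.4f}")
```

The second call illustrates Note 3: the bound c matters only when the sample mean is low enough for the confidence limit to fall below c.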

Summary

Among the fundamental questions in the study of measurement processes are those concerning the precision of the measurement methods and of the apparatus. In this paper, this question is formulated and solved as the problem of the tolerance region for absolute and relative normally distributed random errors of measurement.

References

DIEM, K. (ed.), 1962: Documenta Geigy. Scientific Tables, 6th ed. Basel: J. R. Geigy.

ECKSCHLAGER, K., 1969: Errors, Measurement and Results in Chemical Analysis. London: Van Nostrand.

GRUBBS, F. E., 1973: Errors of measurement, precision, accuracy and the statistical comparison of measuring instruments. Technometrics 15, 53-66.

JÍLEK, M., O. LÍKAŘ, 1960: Tolerance regions of the normal distribution with known $\mu$, unknown $\sigma^2$. Biom. Z. 2, 204-209.

JOHNSON, N. L., B. L. WELCH, 1939: Applications of the non-central t-distribution. Biometrika 31, 362-381.

KOOPMANS, L. H., D. B. OWEN, J. I. ROSENBLATT, 1964: Confidence intervals for the coefficient of variation for the normal and log normal distributions. Biometrika 51, 25-32.

LIKEŠ, J., J. LAGA, 1978: Basic Statistical Tables (in Czech). Prague: SNTL.

MANDEL, J., 1971: Repeatability and reproducibility. Materials Research and Standards 11 (8), 8-16.

MAURICE, M. J., B. G. WIGGERS, 1959: Über die Vergleichung der Genauigkeiten von zwei Analysenmethoden. Fresenius' Z. Anal. Chem. 168, 335-339.

MÜLLER, P. H., P. NEUMANN, R. STORM, 1973: Tafeln der mathematischen Statistik. Leipzig: VEB Fachbuchverlag.

OWEN, D. B., 1962: Handbook of Statistical Tables. Reading, Mass.: Addison-Wesley.

PROSCHAN, F., 1953: Confidence and tolerance intervals for the normal distribution. J. Amer. Statist. Assoc. 48, 550-564.

QUESENBERRY, C. P., T. B. WHITAKER, J. W. DICKENS, 1976: On testing normality using several samples: an analysis of peanut aflatoxin data. Biometrics 32, 753-759.

REJNEK, J., 1979: Unpublished data.

Manuscript received: 8. 1. 1981

Author's address:
Dipl.-Ing. MILOŠ JÍLEK, CSc.
Czechoslovak Academy of Sciences
Institute of Microbiology
Vídeňská 1083
CS - 14220 Prague 4
Czechoslovakia