outlier handout


Upload: hibonardo

Post on 04-Jun-2018


8/13/2019 Outlier Handout

http://slidepdf.com/reader/full/outlier-handout 1/2

Some simpler statistical tests for rejecting outliers in quantitative data*

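Large data sets (N>100):

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

For small data sets:

$$s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}$$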
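As a minimal sketch, the two standard deviations can be computed as below. The function names `population_std` and `sample_std` are illustrative and not from the handout, and the sketch assumes the large-set formula divides by N while the small-set formula divides by N - 1, with the mean estimated from the data in both cases.

```python
import math


def population_std(values):
    """Standard deviation with an N denominator (assumed form for large data sets)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def sample_std(values):
    """Standard deviation with an N - 1 denominator (assumed form for small data sets)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
```

For a handful of points the two values differ noticeably; as N grows the difference between dividing by N and by N - 1 becomes negligible.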

Rule of Huge Error: If you have a single outlier, then you can discard it with 98% confidence if any of the following conditions are met.

$$M = \frac{|x_{outlier} - \bar{x}|}{s}$$

5 ≤ N ≤ 8: M > 6

8 < N ≤ 14: M > 5

N ≥ 15: M > 4
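The check can be sketched in a few lines, under several assumptions that the handout does not state explicitly. The statistic is taken to be M = |x_outlier - mean| / s, with thresholds of 6, 5 and 4 for 5 to 8, 9 to 14, and 15 or more data points. The mean and s are computed with the suspect value left out, because with the suspect included M can never exceed (N - 1)/√N, which is below these thresholds for small N. N is assumed to count every point, suspect included. The names `THRESHOLDS`, `huge_error_m` and `can_discard` are hypothetical.

```python
import statistics

# Assumed reading of the conditions: (min N, max N or None, M threshold).
THRESHOLDS = [(5, 8, 6.0), (9, 14, 5.0), (15, None, 4.0)]


def huge_error_m(values, suspect_index):
    """M = |suspect - mean| / s, with mean and s taken over the other points."""
    rest = [x for i, x in enumerate(values) if i != suspect_index]
    mean = statistics.mean(rest)
    s = statistics.stdev(rest)  # sample standard deviation (N - 1 denominator)
    return abs(values[suspect_index] - mean) / s


def can_discard(values, suspect_index):
    """True if the suspect point meets the assumed threshold for this N."""
    n = len(values)  # assumption: N counts all points, suspect included
    m = huge_error_m(values, suspect_index)
    for low, high, threshold in THRESHOLDS:
        if n >= low and (high is None or n <= high):
            return m > threshold
    return False  # no condition is listed for N < 5


readings = [10.1, 10.2, 9.9, 10.0, 10.3, 12.5]
print(can_discard(readings, suspect_index=5))
```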

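Dixon’s Q-test: If you have a single outlier, and your data has a normal distribution, then you can discard the outlier if Q > Qcrit. Order the data values in increasing or decreasing order, such that the outlier is the final data point (x_N).

3 ≤ N ≤ 7:

$$Q = \frac{|x_N - x_{N-1}|}{|x_N - x_1|}$$

8 ≤ N ≤ 10:

$$Q = \frac{|x_N - x_{N-1}|}{|x_N - x_2|}$$

11 ≤ N ≤ 13:

$$Q = \frac{|x_N - x_{N-2}|}{|x_N - x_2|}$$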
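A minimal sketch of the Q-test follows. It assumes the three ratios are Dixon's usual ones: the gap to the nearest neighbour over the full range for 3 to 7 points, the same gap over the range excluding the opposite extreme for 8 to 10 points, and the gap to the second-nearest neighbour over that reduced range for 11 to 13 points. The critical values are the 5% risk column of the handout's Qcrit table; the names `Q_CRIT_5_PERCENT`, `dixon_q` and `q_test_discard` are illustrative.

```python
# Qcrit at a 5% risk of false rejection, copied from the handout's table.
Q_CRIT_5_PERCENT = {
    3: 0.941, 4: 0.765, 5: 0.642, 6: 0.560, 7: 0.507,
    8: 0.554, 9: 0.512, 10: 0.477,
    11: 0.576, 12: 0.546, 13: 0.521,
}


def dixon_q(values, suspect_is_largest=True):
    """Q statistic, with the data ordered so the suspect is the final point."""
    x = sorted(values, reverse=not suspect_is_largest)
    n = len(x)
    if 3 <= n <= 7:
        gap, spread = x[-1] - x[-2], x[-1] - x[0]
    elif 8 <= n <= 10:
        gap, spread = x[-1] - x[-2], x[-1] - x[1]
    elif 11 <= n <= 13:
        gap, spread = x[-1] - x[-3], x[-1] - x[1]
    else:
        raise ValueError("this sketch covers 3 to 13 data points only")
    return abs(gap) / abs(spread)


def q_test_discard(values, suspect_is_largest=True):
    """True if Q exceeds Qcrit at the 5% risk level."""
    q = dixon_q(values, suspect_is_largest)
    return q > Q_CRIT_5_PERCENT[len(values)]


print(q_test_discard([1.0, 2.0, 3.0, 4.0, 10.0]))
```

Passing `suspect_is_largest=False` sorts the data in decreasing order, so a suspiciously low value becomes the final point instead.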

 

*Compiled from:

- Personal webpage of Prof. James K. Hardy, Dept. of Chemistry, University of Akron, “Statistical Treatment of Data” at http://ull.chemistry.uakron.edu/analytical/Statistics/. This has good notes on basic statistics, refers to specific tests for the rejection of data, and discusses large and small sample sets.

- “Dixon's Q-test: Detection of a single outlier”, which includes an applet for doing Q-test calculations and a brief discussion on rejecting data from small data sets, on the University of Athens Department of Chemistry website at http://www.chem.uoa.gr/Applets/AppletQtest/Appl_Qtest2.html. Note: although much of the department’s website is in Greek, this page is in English.

- “Statistical Treatment of Analytical Data: Outliers (Chapter 6)” by Z.B. Alfassi, Z. Boger and Y. Ronen. CRC Press: 2005. This chapter is available for reading through Google Books if your library doesn’t have a copy.


Grubbs’ T-test: This test can be used to evaluate multiple possible outliers. Start with the furthest outlier, the point with the largest $|x_i - \bar{x}|$, and discard it if T > Tcrit.

$$T = \frac{|x_i - \bar{x}|}{s}$$

If you discard the outlier and suspect others, then recalculate $\bar{x}$ and s in order to evaluate the next furthest point.
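The repeated procedure described above can be sketched as a loop. This assumes T = |x_i - mean| / s, with the mean and sample standard deviation computed over all points currently kept, suspect included. The critical values are the 5% risk column of the handout's Tcrit table for N = 3 to 10 only; the names `T_CRIT_5_PERCENT` and `grubbs_reject` are illustrative.

```python
import statistics

# Tcrit at a 5% risk of false rejection, copied from the handout's table (N = 3..10).
T_CRIT_5_PERCENT = {
    3: 1.153, 4: 1.463, 5: 1.672, 6: 1.822,
    7: 1.938, 8: 2.032, 9: 2.110, 10: 2.176,
}


def grubbs_reject(values, t_crit=T_CRIT_5_PERCENT):
    """Repeatedly test the furthest point; return (kept, rejected) lists."""
    kept = list(values)
    rejected = []
    # Stop as soon as N falls outside the range covered by the lookup table.
    while len(kept) in t_crit:
        mean = statistics.mean(kept)
        s = statistics.stdev(kept)  # sample standard deviation (N - 1 denominator)
        if s == 0:
            break  # all remaining points are identical
        suspect = max(kept, key=lambda x: abs(x - mean))
        t = abs(suspect - mean) / s
        if t <= t_crit[len(kept)]:
            break  # furthest point is not an outlier at this risk level
        kept.remove(suspect)
        rejected.append(suspect)
    return kept, rejected


kept, rejected = grubbs_reject([10.1, 10.2, 9.9, 10.0, 10.3, 12.5])
print(kept, rejected)
```

Each pass recomputes the mean and s from the points still kept, so a point that looked ordinary next to a large outlier is re-evaluated once that outlier is gone.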

Qcrit Values for Dixon's Q-test for Outliers

Data points    Risk of false rejection (%)
N              0.5      1        5        10
3              0.994    0.988    0.941    0.886
4              0.926    0.889    0.765    0.679
5              0.821    0.780    0.642    0.557
6              0.740    0.698    0.560    0.482
7              0.680    0.637    0.507    0.434
8              0.725    0.683    0.554    0.479
9              0.677    0.635    0.512    0.441
10             0.639    0.597    0.477    0.409
11             0.713    0.679    0.576    0.517
12             0.675    0.642    0.546    0.490
13             0.649    0.615    0.521    0.467

Tcrit Values for Grubbs' T-test for Outliers

Data points    Risk of false rejection (%)
N              0.1      0.5      1        5        10
3              1.155    1.155    1.155    1.153    1.148
4              1.496    1.496    1.492    1.463    1.425
5              1.780    1.764    1.749    1.672    1.602
6              2.011    1.973    1.944    1.822    1.729
7              2.201    2.139    2.097    1.938    1.828
8              2.358    2.274    2.221    2.032    1.909
9              2.492    2.387    2.323    2.110    1.977
10             2.606    2.482    2.410    2.176    2.036
15             2.997    2.806    2.705    2.409    2.247
20             3.230    3.001    2.884    2.557    2.385
25             3.389    3.135    3.009    2.663    2.486
50             3.789    3.483    3.336    2.956    2.768
100            4.084    3.754    3.600    3.207    3.017