Vakgroep Fysiologie en Biometrie Design and analysis of animal experiments Prof. Dr. Ir. L. Duchateau DVM. Ir. B. Ampe 2009-2010 Master in Laboratory Animal Science



Vakgroep Fysiologie en Biometrie

Design and analysis of animal

experiments

Prof. Dr. Ir. L. Duchateau

DVM. Ir. B. Ampe

2009-2010

Master in Laboratory Animal Science


Preface

The use of statistics in animal experimentation is often restricted to the statistical analysis of the data. Although it is important to base the data analysis on a sound statistical footing to prevent over- or misinterpretation of the data, statistics has much more to offer than the correct analysis of the data. In this short course, we demonstrate how statistics can be used at each stage of experimentation: formulating the objectives of the experiment, setting the hypotheses, designing the experiment and, finally, actually performing the experiment.

Considering the statistical aspects of the experiment at each different stage prevents the setup of an experiment that is completely useless for the objectives studied.

The course notes therefore describe, in the different chapters, the different stages of experimentation and the statistical aspects relevant at each stage. Chapter 1 shows how the objective of an experiment can be translated into a hypothesis to be tested statistically. Chapter 2 explains power analysis and sample size calculations; power analysis prevents the investigator from running trials that are too small, but also too large, for the investigated objectives. Chapter 3 stresses the importance of good experimental design: an optimal design can reduce the required sample size substantially. Finally, once the experiment is designed and run, the investigator can perform the statistical analysis, the topic of Chapter 4.


Contents

1 Hypothesis setting
1.1 Scientific and statistical hypotheses
1.2 Generating and testing hypotheses
1.3 Specifying statistical hypotheses

2 Replication and sample size calculation
2.1 The necessity of replication
2.2 Replication versus repeated measures
2.3 Sample size calculation
2.3.1 Parameters determining the sample size
2.3.2 Sample size calculations for comparing µA and µB
2.3.3 Significant versus relevant differences

3 Experimental design
3.1 Randomized complete block design

4 Analysis
4.1 Reasoning behind statistical analysis
4.2 Statistical analysis based on the normal distribution
4.3 P-values, *** or confidence intervals
4.4 Analysis for the randomized complete block design
4.5 Multiple comparisons analysis
4.6 Absence of evidence is not evidence of absence

5 Tables


Chapter 1

Hypothesis setting

1.1 Scientific and statistical hypotheses

One of the keys to successful experimentation is to define clear-cut research questions at the start of the experimentation. One or more experiments can be set up in order to address these research questions. In order to evaluate an experiment for its appropriateness to address the relevant research question, the research question needs to be defined as a clearly formulated hypothesis in scientific terms. This hypothesis then needs to be translated into a hypothesis based on a statistical model, so that an adequate testing procedure can be set up. Once this statistical hypothesis is defined, different experiments can be evaluated and compared for their usefulness.

As you will see in the following example, an experiment can lead to many different, possibly relevant hypotheses, out of which one has to be chosen.

Example 1.1

In a mastitis experiment, 5 heifers are randomly assigned to a low inoculum dose of E. coli (10⁴ CFU) and 5 heifers to a high inoculum dose of E. coli (10⁶ CFU). In the 24 hours after inoculation, the milk somatic cell count (SCC) is measured every 3 hours. The general objective of the experiment is to study the differences in SCC between the two inoculation groups. This objective, however, is far too general and does not allow us to specify the hypothesis in statistical terms. It is of utmost importance to specify in detail what is to be expected if there are genuine differences between the two inoculation groups. In what follows we describe such specific hypotheses that can be converted into a statistical hypothesis.

• In general we expect that, over the 24 hours, SCC will be larger in the high dose inoculation group. We could thus take the average of all the SCC values measured on a specific cow and hypothesize that this average is larger if the cow has received the high dose inoculum. Remark that in this case, with equally spaced measurement intervals, the average corresponds to the area under the curve.


• We expect that the maximum value for SCC over the 24-hour period will be higher in the high dose inoculation group, and we would thus use only the maximum SCC for each individual heifer.

• We expect that SCC will increase more quickly in the high dose inoculation group, and we can thus compare the (linear) increase in the two inoculation groups.

• We expect, as in the previous hypothesis, that SCC will increase more quickly in the high dose inoculation group, but as an alternative hypothesis we could compare the time needed for an individual animal to have a particular (relative or absolute) increase of SCC.

• If we do not know at all in advance at which moment in time differences in SCC might appear, the only alternative is to test the differences in SCC at each measurement point separately, leading in total to 8 different tests. This, however, is not an efficient approach, as corrections for multiple comparisons need to be made, so that each test has to be performed at a more stringent significance level, thus decreasing the chance of finding a significant difference when there truly is one.

• We can also test in a general statistical model whether there are global differences between the two inoculum doses and, when that is the case, further investigate whether there is an interaction between time and inoculum dose, addressing the question whether the pattern over time differs according to inoculum dose. The disadvantage of this approach is that it consists of rather general hypotheses. It can therefore happen that an important difference at a particular time point is not picked up by such a general hypothesis. Assume, for instance, that SCC increases faster during the first 4 hours in the high inoculum group, but that after that SCC is much the same in the two groups. This difference would have been significant if the hypothesis concerned the increase of SCC over the first 4 hours, but not if the general hypothesis is used.

As demonstrated in the example, it is important to specify the hypothesis clearly. The more specific the hypothesis, the higher the probability that it will turn out significant if it is true.
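Each candidate endpoint above can be reduced to one summary value per animal before any testing is done. A minimal Python sketch, using invented SCC values for a single hypothetical heifer (these numbers are illustrative only, not data from the course experiment):

```python
from statistics import mean

# Hypothetical SCC values (x1000 cells/ml) for ONE heifer, measured every
# 3 hours over the 24 hours after inoculation; illustrative numbers only.
times = [3, 6, 9, 12, 15, 18, 21, 24]
scc = [50, 120, 400, 900, 1400, 1600, 1500, 1450]

avg_scc = mean(scc)                    # average; proportional to AUC here
max_scc = max(scc)                     # peak SCC
t_max = times[scc.index(max_scc)]      # time at which the peak is reached

# linear increase: least-squares slope of SCC against time
t_bar, y_bar = mean(times), mean(scc)
slope = (sum((t - t_bar) * (y - y_bar) for t, y in zip(times, scc))
         / sum((t - t_bar) ** 2 for t in times))

print(avg_scc, max_scc, t_max, round(slope, 1))
```

Whichever summary is chosen, the test then compares that single value per animal between the two dose groups.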

Translating the research question into a statistical hypothesis is not easy, but it always has to be done before the actual experiment takes place, so that it is clear which hypothesis will be tested once the data are generated. This procedure ensures that the relevant hypothesis is indeed testable with the data that the experiment will generate, and it further encourages us to think about the real objectives of the experiment.

1.2 Generating and testing hypotheses

As noted above, it is essential to clearly define the goals and corresponding hypotheses of an experiment. Often, investigators try to define and test as many hypotheses as possible, but, given that the statistical analysis is correctly done and reported, this often leads to indecisive experiments, as none of the planned hypotheses, when properly corrected for multiple comparisons, turns out to be significant.

Ideally, only one hypothesis, clearly defined before the start of the experiment, should be tested in the analysis. In the pharmaceutical industry, a statistician has to write a statistical analysis plan that is submitted to the Food and Drug Administration (FDA) before the experimental data are known. This statistical analysis plan must contain the hypothesis that will be tested and preferably also the program that will be run on the data to perform the statistical analysis. The same high standards should apply to more fundamental research if the objective of the trial is to prove or disprove a particular hypothesis.

With respect to this, we can make a distinction between hypothesis-generating or exploratory trials and hypothesis-testing or formal trials. In exploratory trials, the investigator does not know what to expect from the trial, and each difference that is found can be relevant. Such trials, however, should not be analyzed by statistical tests, as the hypotheses are only generated according to what one sees in the data and are thus entirely data driven; significant results could be a mere coincidence. If interesting results appear from an exploratory trial that warrant further research, a new confirmatory trial has to be set up with a clear-cut hypothesis obtained from the previous exploratory trial. Often the distinction between these two types of trials is not so sharp, and the best strategy is often to define, even if it is difficult, one hypothesis that will be formally tested and to consider all the other information generated by the trial as exploratory. This first hypothesis is often called the primary hypothesis and the other hypotheses the secondary hypotheses.

1.3 Specifying statistical hypotheses

The specification of a statistical hypothesis consists of both a null hypothesis and an alternative hypothesis. The null hypothesis is the status quo; it is actually the hypothesis we would like to reject. If the null hypothesis is rejected, we accept the alternative hypothesis. The alternative hypothesis is actually what we want to prove.

As an example, consider a new and a standard drug, where you want to prove that the new drug is better than the standard drug. In the null hypothesis we then state that the drugs are equal, whereas in the alternative hypothesis we state that the new drug is better than the standard drug.

Only if we can reject the null hypothesis can we draw conclusions. Non-rejection of the null hypothesis does not mean that the null hypothesis is true. Often, in practice, a non-rejection of the null hypothesis, for instance one stating that two drugs are equal, is seen as proof that there is equality, but that is a false conclusion. It would then be very easy to prove that two drugs have the same effect: take as few animals as possible in the experiment and you will never find a significant effect, regardless of whether there is one or not. If you want to prove that two drugs are equal, you have to use equivalence tests, but we will not consider this any further.

We will now specify the null and alternative hypotheses for the different examples seen in the previous section.

• With µh the average SCC in the population of heifers with the high inoculum dose and similarly µl the average SCC in the population of heifers with the low inoculum dose, the hypothesis is given by

H0 : µh − µl = 0 versus Ha : µh − µl > 0

This is called a one-sided hypothesis because only deviations from 0 in one direction (µh greater than µl) will lead to rejection of the null hypothesis.

• With maxh the maximum SCC (over the 8 time points) in the population of heifers with the high inoculum dose and similarly maxl the maximum SCC in the population of heifers with the low inoculum dose, the hypothesis is given by

H0 : maxh − maxl = 0 versus Ha : maxh − maxl > 0

• With βh the linear increase in SCC in the population of heifers with the high inoculum dose and similarly βl the linear increase in SCC in the population of heifers with the low inoculum dose, the hypothesis is given by

H0 : βh − βl = 0 versus Ha : βh − βl > 0

• With tmaxh the average time required to reach the maximum SCC in the population of heifers with the high inoculum dose and similarly tmaxl the average time required to reach the maximum SCC in the population of heifers with the low inoculum dose, the hypothesis is given by

H0 : tmaxh − tmaxl = 0 versus Ha : tmaxh − tmaxl > 0

• With µih the average SCC in the population of heifers with the high inoculum dose at time point i and similarly µil the average SCC in the population of heifers with the low inoculum dose at time point i, there are 8 different hypotheses, the hypothesis at time point i being given by

H0 : µih − µil = 0 versus Ha : µih − µil > 0

• With µih the average SCC in the population of heifers with the high inoculum dose at time point i and similarly µil the average SCC in the population of heifers with the low inoculum dose at time point i, the general hypothesis is given by

H0 : µih − µil = 0 for all i = 1, . . . , 8 versus Ha : µih − µil > 0 for at least one i

The difference between this hypothesis and the previous one is that here the 8 hypotheses are tested simultaneously.


Chapter 2

Replication and sample size calculation

2.1 The necessity of replication

If the objective of an experiment is to determine whether two treatments to be given to animals differ or not, it is necessary to assign each of the two treatments to at least two different animals. Assume, for instance, that the aim of an experiment is to compare two diets, A and B, with respect to weight gain. One chicken receives diet A and has a weight gain of 1.8 kg, whereas another chicken on diet B has a weight gain of 2 kg. We would thus estimate the treatment difference as 0.2 kg. It is, however, impossible to make a statement with respect to the true difference between these two diets. It could very well be that the observed difference is due to random variability in weight gain between chickens rather than to the difference between the two diets.

If two chickens are randomly assigned to each of the two diets, however, and we find for the two chickens on diet A weight gains of 1.75 and 1.85 kg and for the two chickens on diet B weight gains of 1.95 and 2.05 kg, then a statement can be made with respect to the difference between the two diets, as we can now also estimate the inherent variability of weight gain in chickens. Although the treatment difference is still estimated to be 0.2 kg, this treatment difference can now be compared against the variability between chickens that received the same diet.

2.2 Replication versus repeated measures

It is of utmost importance to make a distinction between a repeated measure and a replication. If a treatment is assigned at random to a particular entity or experimental unit, for instance an animal, then this is a genuine replication. However, if the same animal is measured several times, either at different locations or at different moments in time, and a treatment is assigned to the animal as a whole, then these measurements are repeated measures and not replications. Repeated measures allow us to make a more precise assessment of the response of the particular animal, but they do not give us any additional information on the variability between animals against which we have to test the treatment effect. Thus the statistical analysis should be based on genuine replications and not on repeated measures. As the concept of repeated measure versus replication is very important for a correct statistical analysis, we will discuss it further based on an example with repeated measures in time.

Example 2.1

Two cows were experimentally infected with trypanosomes and followed up for packed cell volume (PCV). One cow had previously received an experimental vaccine and the other had not. If different measurements are taken over time for each of the two cows, these measurements are repeated measures and not replications. Even very large differences between the measurements of the two cows are no proof that the vaccine is effective, because the differences can be merely due to cow differences, as there already exists a large variability in PCV among healthy animals. The only way to test the vaccine is to include more animals in the experiment so that genuine replicates are available.
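The distinction can be made concrete in code. In this sketch with invented PCV values (illustrative only, not the course data), the repeated measures of each cow collapse to a single summary value, so the effective sample size is the number of animals, not the number of measurements:

```python
from statistics import mean

# Hypothetical weekly PCV (%) series for the two cows of Example 2.1;
# illustrative numbers only.
pcv = {
    "vaccinated": [32, 31, 30, 29, 28],
    "unvaccinated": [30, 27, 25, 22, 20],
}

# Repeated measures on one animal collapse to ONE value per animal:
summaries = {cow: mean(series) for cow, series in pcv.items()}

# The experimental unit is the cow, so only 2 replicates exist in total,
# one per group; far too few to test the vaccine effect.
n_replicates = len(summaries)
print(summaries, n_replicates)
```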

2.3 Sample size calculation

The objective of most experiments is to draw a clear conclusion based on the generated data. If the null hypothesis is rejected, the conclusion is straightforward and the ultimate goal of the experiment has been reached. If, however, the null hypothesis cannot be rejected, this does not mean that the null hypothesis is correct. There could be different reasons why the null hypothesis could not be rejected:

1. The null hypothesis is correct.

2. By chance, the difference between the sample means is small and thus the data seem to support the null hypothesis, although there is a difference between the population means.

3. There is a large difference between the two sample means, but the sample size is too small to exclude that this large difference is due to random variation.

Point 1 does not cause a problem, and point 2 is inevitable, as we do not measure the whole population but merely a sample. The probability of making such an error, however, decreases with larger sample sizes and/or larger treatment differences.

Point 3, however, can be controlled by the investigator, as we can choose the sample size before the experiment starts.


The sample size must be chosen so that the experiment leads to a significant result with high probability (often set at 80%), given that a relevant difference between the population means, say ∆, exists. For instance, we are not interested in demonstrating that the difference in weight gain between the two diets equals only 10 grams, as such a difference is irrelevant in practice and it would require a lot of replications. But a difference of, for instance, 200 grams could be relevant.

2.3.1 Parameters determining the sample size

The probability that the null hypothesis is rejected for the comparison of the means of two populations depends on a number of parameters:

1. The type I error, α, the probability of falsely rejecting the null hypothesis.

2. The true difference between the population means, µ1 − µ2.

3. The variance between the experimental units, σ².

4. The sample size n.

This probability, called the power and often denoted 1 − β, needs to be sufficiently large and is typically set to 0.80. The parameter α is often set to 0.05. The true difference is often taken as the smallest difference that is relevant in practice. The only remaining parameter, apart from n, that we need to determine is σ². Sometimes the investigator has a good estimate of this parameter from previous trials. If not, the investigator must make an educated guess, but can also study the effect of different values of σ² on the required sample size in order to make a final decision.

A useful tool in sample size determination is the power function. The power function depicts the power (the probability of a successful experiment) as a function of the other parameters. Such graphs show the immediate effect of certain assumptions, as will be shown in the following example.

Example 2.2

Assume that the objective of the experiment is to demonstrate a difference in weight gain between two different diets, A and B. The power is first shown as a function of the variance σ² between animals on the same diet (figure 2.1.a) for fixed values of the other parameters, with ∆ set at 200 g, α at 0.05 and n, the number of animals per diet, at 10. Similarly, the power is depicted as a function of ∆ (figure 2.1.b), n (figure 2.1.c) and α (figure 2.1.d). It is clear from these figures that the power increases with decreasing variance, increasing ∆, increasing n and increasing α.

Based on these figures, the investigator can assess the required sample size. For instance, with σ = 200, ∆ = 200 and α = 0.05, we need at least 16 animals per diet in order to have a power of 80% and thus an adequate probability of running a successful experiment.
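Under the normal model, such a power function is easy to compute. The sketch below uses the standard large-sample approximation for a two-sided two-sample comparison; the exact curves of figure 2.1 may differ slightly, but with σ = 200, ∆ = 200 and α = 0.05 it reproduces the conclusion that 16 animals per diet give roughly 80% power:

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided(n, delta, sigma, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test with n animals
    per group, true difference delta and between-animal s.d. sigma."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)        # 1.96 for alpha = 0.05
    shift = delta / (sigma * sqrt(2 / n))    # true difference in s.e. units
    return 1 - z.cdf(z_crit - shift)

# Values from Example 2.2: sigma = 200, delta = 200, alpha = 0.05
for n in (10, 16, 20):
    print(n, round(power_two_sided(n, delta=200, sigma=200), 3))
```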


Figure 2.1: Power as a function of different parameters

2.3.2 Sample size calculations for comparing µA and µB

When comparing the means of two populations, µA and µB, with a normally distributed outcome, the sample size calculations are straightforward and can be presented in simple formulas. We will not derive these formulas, but we will explain from them the effect of the different parameters and also demonstrate their use in an example.

The sample size formula for a one-sided hypothesis test

H0 : µA − µB = 0 versus Ha : µA − µB > 0

is given by

n = 2 (zβ + zα)² σ² / ∆²

whereas the sample size formula for a two-sided hypothesis test

H0 : µA − µB = 0 versus Ha : µA − µB ≠ 0

is given by

n = 2 (zβ + zα/2)² σ² / ∆²


where zx corresponds to the xth percentile of the standard normal distribution. These values can be obtained from table 5.1.

It is clear from both equations that larger values for α and β (i.e., a lower power) lead to smaller sample sizes, as do smaller values for σ² and larger values for ∆.

Example 2.3

The aim of an experiment is to compare two diets for chickens. From previous experiments it can be assumed that the standard deviation equals 200, and we do not expect the standard deviation to differ between the two diets. We set α to 0.05 and we further want to test two-sided. We would like to take a sample size such that the relevant difference of 200 g will be significant with a power equal to 80%. Thus, zα/2 = z0.025 = −1.96 and zβ = z0.2 = −0.842.

Therefore, the sample size is given by

n = 2 (−1.96 − 0.842)² · 200² / 200² = 15.68

and we thus need to assign 16 animals to each of the two diets.
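The calculation of Example 2.3 can be checked directly. The sketch below uses the upper percentiles z(1−α/2) and z(1−β) (positive values), which is equivalent to the formula above since the term is squared; with the unrounded percentiles it gives n ≈ 15.70 rather than 15.68, and the same final answer of 16 animals per diet:

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_sided(delta, sigma, alpha=0.05, power=0.80):
    """Animals per group for a two-sided two-sample z-test."""
    z = NormalDist()
    z_alpha2 = z.inv_cdf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)            # 0.842 for 80% power
    return 2 * (z_alpha2 + z_beta) ** 2 * sigma ** 2 / delta ** 2

n = sample_size_two_sided(delta=200, sigma=200)
print(n, ceil(n))  # ~15.7, rounded up to 16 animals per diet
```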

2.3.3 Significant versus relevant differences

It is important to make a distinction in experimentation between significant andrelevant differences.

If a significant result is obtained that also has practical relevance, the experiment has been successful. If the observed difference between the two diets is 200 grams, this difference is practically relevant and the statistical analysis reveals a significant result, then the aim of the experiment has been reached. This, however, does not mean that the experiment was optimal. It might be possible that the same difference could have been demonstrated to be significant with far fewer animals. From an ethical point of view, an investigator has to strive for the smallest experiment, in terms of animals, that is capable of demonstrating relevant differences to be statistically significant.

If a significant result is obtained but the difference is practically irrelevant, then the sample size used was far too large and the experiment was inefficient.

The most frustrating result for an investigator occurs when the observed difference is large and relevant in practice, but no significant difference between the two population means appears. This type of result is often due to the fact that either no power calculations were done or the variance was underestimated in the sample size calculation. Such experiments are by themselves useless and warrant similar experiments with a larger sample size to prove that the observed difference between the two populations was genuine and not due to random variation. Again, this shows the importance of sample size calculations before the actual experiment is done. In experimentation, it is often better to perform one large trial with clear-cut conclusions than a number of small trials, each by themselves inconclusive.

Finally, it can happen that the observed difference is neither significant nor relevant in practice. The conclusion then seems straightforward, but it is not. Suppose, for instance, that the investigator has to decide, based on the results, either to continue research on a chemical compound or to dismiss the compound altogether. If the decision is taken to dismiss the compound based on an experiment with a small sample size, then a potentially interesting compound is lost forever. It is very well possible that no relevant difference is found in a small experiment due to variation, although the compound has high potential. Therefore, the power is often set at a large value (e.g. 95%) when investigating a new compound, to reduce the risk of dismissing a potentially interesting compound.

Page 14: Design and analysis of animal experiments · Design and analysis of animal experiments Prof. Dr. Ir. L. Duchateau DVM. Ir. B. Ampe 2009-2010 Master in Laboratory Animal Science. i

Chapter 3

Experimental design

In the previous chapter, we investigated how we should adapt the sample size in order to obtain an experiment with adequate power to reject the null hypothesis. We assumed that the variance σ² was an unknown but fixed quantity. Other strategies than increasing the sample size, however, can be followed to increase the power of an experiment. The variance σ² might be reduced by using a more precise measurement method, or by measuring the same experimental unit several times and using the mean of the different measurements. Another strategy is to use experimental design. The field of experimental design is vast, and many different experimental designs exist that have often been developed for particular circumstances. We can merely give a brief introduction to this important discipline by introducing the simplest but often used experimental design: the randomized complete block design.

3.1 Randomized complete block design

The idea behind the randomized complete block design is to make blocks of animals that are similar to each other. For instance, animals from the same litter share more genetic material than the overall population and will therefore be more similar to each other. Within a block of such animals, each of the treatments under study is randomly assigned to one of the animals. This way, the variability between the litters can be filtered out of the analysis, thus reducing the variance σ² against which the treatment effects have to be tested.
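The randomization step itself is simple to implement. A sketch with hypothetical litters of two animals each and two treatments, A and B:

```python
import random

random.seed(42)  # fixed seed only to make the example reproducible

# Five hypothetical litters, each contributing a block of two animals.
litters = {f"litter_{i}": (f"animal_{i}a", f"animal_{i}b") for i in range(1, 6)}

assignment = {}
for litter, animals in litters.items():
    order = random.sample(["A", "B"], k=2)   # random order within the block
    assignment[litter] = dict(zip(animals, order))

for litter, treatments in assignment.items():
    print(litter, treatments)
```

Every block receives each treatment exactly once, so litter differences cancel out of the treatment comparison.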

In the following example we discuss the simplest case, with blocks consisting of only two experimental units and only two treatments to be compared. The analysis of these data is postponed until the next chapter.

Example 3.1

Some zoologists hypothesize that the rear legs of deer are on average longer than the front legs. In order to test this hypothesis, the front and rear legs of different deer were measured, and the data are presented in table 3.1. Thus, the deer constitutes the block here, and there are two 'treatments' per deer: the front and the rear legs. It is obvious that this experimental design is much more efficient than the completely randomized design, in which we would measure the front legs of some animals and the rear legs of other animals. Large deer will have, on average and compared to the population, both long front and long rear legs, but also for these large deer it may be true that the rear legs are longer than the front legs. By blocking by deer, we filter out the variability due to the size of the deer.

Table 3.1: Length of rear and front legs of 10 deer

Deer number   Length rear legs   Length front legs   Difference
1             142                138                  4
2             140                136                  4
3             144                147                 -3
4             144                139                  5
5             142                143                 -1
6             146                141                  5
7             149                143                  6
8             150                145                  5
9             142                136                  6
10            148                146                  2
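The formal analysis is postponed to Chapter 4, but the gain from blocking can already be glimpsed: only the within-deer differences carry the comparison, with the deer-to-deer size variation removed. A sketch of the paired t statistic for the data of table 3.1 (the critical value 1.833 is the one-sided 5% point of the t distribution with 9 degrees of freedom):

```python
from math import sqrt
from statistics import mean, stdev

# Rear minus front leg length per deer (the difference column of table 3.1)
diff = [4, 4, -3, 5, -1, 5, 6, 5, 6, 2]

n = len(diff)
t = mean(diff) / (stdev(diff) / sqrt(n))  # paired t statistic
print(round(t, 2))                        # ~3.41

# One-sided test at the 5% level: compare with t(0.95; 9 df) = 1.833
print(t > 1.833)
```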


Chapter 4

Analysis

In this chapter, the fundamentals of statistical testing are explained in an intuitive way, based on the randomisation test, in Section 4.1. In Section 4.2 we extend this discussion to the normal distribution case. In Section 4.3 we discuss the different ways to present the results of the statistical analysis for a simple completely randomized experiment. Next, in Section 4.4, we demonstrate how more complex designs, such as the randomized complete block design, can be analyzed. In Section 4.5, we discuss the problem of multiple comparisons, one of the most frequently encountered errors in reporting results. Another frequently made error, stating that there is no difference if the null hypothesis is not rejected, is discussed in Section 4.6.

4.1 Reasoning behind statistical analysis

A basic course in statistics is often based on the assumption that the data are normally distributed, leading to test statistics such as the paired and unpaired t-test and the F-test in the analysis of variance.

Another class of test statistics, randomisation tests, makes no assumption with respect to the distribution of the data, and it is much easier to explain in an intuitive way how a statistical test works in the context of randomisation tests.

Randomisation tests are based on all possible permutations of the observations between the two treatment groups, and the P-value is given by the proportion of permutations that lead to a result as extreme as or more extreme than the observed one.

Example 4.1

In order to study the effect of the dose of a known carcinogen on the number of mutations, 5 litters of transgenic mice were treated with a control, 5 litters with a low dose and 5 litters with a high dose. The mean number of mutations for each litter is shown in Table 4.1.


Table 4.1: Litter-average number of mutations at three different doses of a carcinogen

Control   Low dose   High dose
  11.7      18.8       23.4
  10.3      16.8       17.9
   6.5      12.2       22.5
   5.4      11.8       18.9
  14.2      13.7       19.4

We now compare the low dose of mutagen with the control. We want to test the following one-sided hypothesis

H0 : µL − µC = 0 versus Ha : µL − µC > 0

with µL and µC the average number of mutations in the mice populations withrespectively low dose mutagen and no mutagen.

As a test statistic that expresses the difference between low dose mutagen and control, we will use the difference between the two means. The observed difference equals 5.04. If there is no difference between the two treatments, each permutation of these observations has the same probability. In total there are (10 choose 5) = 252 different possibilities to choose 5 observations out of 10 for the control group. Under the null hypothesis, each of these permutations has the same probability of occurring, namely 1/252. We then have to count the number of permutations that lead to the same or a more extreme test statistic than the one observed. One such permutation, for instance, is obtained by switching the observation 14.2 of the control group with 11.8 of the low dose mutagen group. In total, there are 8 such permutations. The P-value is given by the probability of finding the same or a more extreme test statistic than the one observed under the null hypothesis, and is thus given by 8/252 = 0.0317. This is a significant result at the 5% significance level.
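The permutation count can be verified mechanically. The software examples in this course use R, but as an illustrative cross-check the enumeration of all 252 permutations for the data of Table 4.1 can be sketched in a few lines of Python:

```python
from itertools import combinations

# Litter-average mutation counts from Table 4.1
control = [11.7, 10.3, 6.5, 5.4, 14.2]
low = [18.8, 16.8, 12.2, 11.8, 13.7]

observed = sum(low) / 5 - sum(control) / 5  # observed difference, 5.04

pooled = control + low
count = 0
# Every way of assigning 5 of the 10 observations to the low dose group
for idx in combinations(range(10), 5):
    perm_low = [pooled[i] for i in idx]
    perm_control = [pooled[i] for i in range(10) if i not in idx]
    diff = sum(perm_low) / 5 - sum(perm_control) / 5
    if diff >= observed - 1e-9:  # as extreme as or more extreme (one-sided)
        count += 1

p_value = count / 252
print(count, round(p_value, 4))  # 8 permutations, P = 0.0317
```

The small tolerance guards against floating-point rounding when a permutation yields exactly the observed difference.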

What is the P-value when comparing the high dose group with the control?

Often, the observations are replaced with their ranks and the test is then based on the sum of the ranks in one of the two groups. Using the same permutations as in the previous randomisation test, we can enumerate the number of permutations with a result as extreme as or more extreme than the one observed. The main advantage of this procedure is its robustness against outliers. This type of test for comparing two groups is known as the Wilcoxon rank sum test.

Example 4.2

For the previous example, the litter with the lowest number of mutations gets the lowest rank. Thus, litter 4 in the control group with 5.4 mutations gets rank 1. On the other hand, litter 1 in the low dose mutagen group has the


highest observed number of mutations, equal to 18.8, and thus gets rank 10. In the control group, we have ranks 1, 2, 3, 4 and 8. The test statistic therefore equals 18, the sum of the ranks in the control group. In total there are 7 permutations with rank sum smaller than or equal to 18, and therefore the P-value equals 7/252 = 0.0278.
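The rank-based count can be cross-checked in the same way as before. This is an illustrative Python sketch (the course's own software examples use R); it enumerates the possible rank sets for the control group:

```python
from itertools import combinations

# Ranks of the control observations among the 10 pooled values (Example 4.2)
control_ranks = [1, 2, 3, 4, 8]
observed_sum = sum(control_ranks)  # test statistic: 18

count = 0
# All 252 possible sets of 5 ranks out of 1..10 for the control group
for subset in combinations(range(1, 11), 5):
    if sum(subset) <= observed_sum:  # as extreme as or more extreme (one-sided)
        count += 1

p_value = count / 252
print(count, round(p_value, 4))  # 7 permutations, P = 0.0278
```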

4.2 Statistical analysis based on the normal distribution

The most often used (and misused as well) test in statistical analysis is the t-test. It is based on the assumption that observations are normally distributed and independent of each other. We typically have two groups that we want to compare, and we furthermore assume that the two groups share the same variance (the simplest version of the t-test). We thus have for the observations X1 taken from the first group or population that X1 ∼ N(µ1, σ²) and for the second group or population that X2 ∼ N(µ2, σ²), with µ1 and µ2 the respective population means and σ² the common variance of the observations in the two populations. The common variance is estimated by pooling the variance estimators in the two groups

s²p = [(n1 − 1)s²1 + (n2 − 1)s²2] / (n1 + n2 − 2)

with

s²1 = ∑(x1i − x̄1)² / (n1 − 1)  and  s²2 = ∑(x2i − x̄2)² / (n2 − 1)

where each sum runs over the observations of the corresponding group, and with n1 and n2 the number of observations taken from the two populations.

Assume that interest is in the following hypothesis

H0 : µ1 = µ2 versus Ha : µ1 > µ2

The statistical analysis is based on the difference of the means in the two groups, X̄1 − X̄2, and it can be proven that under the null hypothesis

(X̄1 − X̄2) / (sp √(1/n1 + 1/n2)) ∼ tn1+n2−2

where tn1+n2−2 is the t-distribution with n1 + n2 − 2 degrees of freedom.


The P-value is then given by

P( tn1+n2−2 > (x̄1 − x̄2) / (sp √(1/n1 + 1/n2)) )

with x̄1 − x̄2 the value observed in the experiment. The P-value can be determined from Table 5.2, which gives the cumulative t-distribution.

Example 4.3

We compare the low dose mutagen group again with the control group, but now based on the t-test. We first need to calculate the group means and variances. They are given by x̄C = 9.62 and s²C = 13.327 for the control group and by x̄L = 14.66 and s²L = 9.218 for the low dose mutagen group. Therefore, the pooled variance estimator is given by s²p = 11.273, the average of the two variances (check that taking the average corresponds to the formula of the pooled variance given before; in what circumstances can we take the average?).

The hypothesis is given by

H0 : µL − µC = 0 versus Ha : µL − µC > 0

and therefore the P-value is given by

P( t8 > (14.66 − 9.62) / (√11.273 · √(1/5 + 1/5)) ) = P(t8 > 2.37) < 0.025
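The numbers in this example can be verified numerically. As an illustration (the course's software examples use R; this sketch is in Python), the pooled variance and the t statistic for Table 4.1 are:

```python
import math

# Litter-average mutation counts from Table 4.1
control = [11.7, 10.3, 6.5, 5.4, 14.2]
low = [18.8, 16.8, 12.2, 11.8, 13.7]

def mean(x):
    return sum(x) / len(x)

def var(x):
    """Unbiased sample variance."""
    m = mean(x)
    return sum((v - m) ** 2 for v in x) / (len(x) - 1)

n1, n2 = len(low), len(control)
# Pooled variance: with equal group sizes this is just the average of the
# two group variances (13.327 and 9.218), giving about 11.273
sp2 = ((n1 - 1) * var(low) + (n2 - 1) * var(control)) / (n1 + n2 - 2)
t = (mean(low) - mean(control)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
print(round(t, 2))  # 2.37, to be compared with the t-distribution on 8 df
```

Comparing 2.37 with Table 5.2 for 8 degrees of freedom (2.306 at the .975 quantile) confirms P(t8 > 2.37) < 0.025.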

Example 4.4

Assume that there are two different formulations for a drug, tablet (T) and solution (S), and we set up two studies, a small one consisting of 3 subjects in each treatment arm and a bigger one with 6 subjects in each arm. We consider the area under the curve (AUC) of the drug concentration in each subject; the results are given in Table 4.2.

If we test the null hypothesis that the two drug formulations are the same against the alternative that the tablet formulation leads to a higher AUC, we find for the small trial an estimated difference between tablet and solution equal to 0.33, and the P-value is larger than 0.05, so that we cannot reject the null hypothesis that the two drug formulations are equal. For the bigger trial, we find that at the 5% nominal significance level the tablet leads to a higher AUC (P = 0.025), although the estimated difference was lower in the bigger trial. It is clear that we should not claim from the first trial that the two drug formulations are the same. If equivalence is at stake, the hypothesis must be stated in another way (see Section 4.6).


Table 4.2: AUC of drug concentration according to formulation in two studies

      Small study            Bigger study
Solution   Tablet       Solution   Tablet
   2.8       2.9           2.9       3.1
   2.7       3.1           2.5       2.9
   2.3       2.8           2.7       3.2
                           2.8       2.8
                           3.1       3.3
                           2.6       2.9
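The two t statistics behind Example 4.4 can be reproduced from the data of Table 4.2. This is an illustrative Python sketch; the helper name pooled_t is ours, not part of the course material:

```python
import math

def pooled_t(x, y):
    """Two-sample t statistic with pooled (equal) variances, testing mean(x) > mean(y)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    sp2 = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)
    return (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))

# AUC data from Table 4.2
tablet_small, solution_small = [2.9, 3.1, 2.8], [2.8, 2.7, 2.3]
tablet_big = [3.1, 2.9, 3.2, 2.8, 3.3, 2.9]
solution_big = [2.9, 2.5, 2.7, 2.8, 3.1, 2.6]

t_small = pooled_t(tablet_small, solution_small)  # 4 degrees of freedom
t_big = pooled_t(tablet_big, solution_big)        # 10 degrees of freedom
print(round(t_small, 2), round(t_big, 2))
# t_small = 1.89 < 2.132 (t4, .950 in Table 5.2): not significant
# t_big = 2.24 > 2.228 (t10, .975 in Table 5.2): P just below 0.025
```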

4.3 P-values, *** or confidence intervals

The results of a statistical analysis can be reported in different ways. Nowadays the significance of results is often demonstrated by the P-value. The P-value is the probability, computed under the assumption that the null hypothesis is true, of obtaining a result as extreme as or more extreme than the one observed. Thus, the smaller the P-value, the less credible the null hypothesis will be. The cut-off value is often chosen to be equal to 0.05, with P-values below this cut-off value considered to be significant results. Although the P-value informs us about the credibility of the null hypothesis, it does not contain any information with respect to the actual treatment effect. It cannot inform us, for instance, whether the treatment effect, although significant, is relevant as well.

A simpler, but less informative, version of the P-value is to make use of stars. Often, one star is used when a comparison is significant at the 5% level and three stars for significance at the 0.1% level. The disadvantage of this approach is that we do not have an idea about the non-significant results. For instance, a treatment effect with a P-value equal to 0.049 will be shown with a star, whereas a treatment effect with a P-value equal to 0.051 will not. These results, however, are very similar from the point of view of the P-value, and it is only because of the artificial cut-off value that they are qualitatively different in the stars presentation. Sometimes a system based on letters is used, whereby the different treatment means are given a particular letter combination and treatments that do not share any letter are significantly different from each other. This will be further demonstrated in the example.
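The cliff at the cut-off can be made concrete with a small helper. This function is a hypothetical illustration, not part of the course material, and the intermediate two-star cut-off at 1% is the usual convention rather than something stated in the text:

```python
def p_to_stars(p):
    """Map a P-value to the conventional star codes (assumed cut-offs 0.05/0.01/0.001)."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"  # not significant

# Two nearly identical P-values land in different categories:
print(p_to_stars(0.049), p_to_stars(0.051))  # * ns
```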

Taking into consideration the disadvantages discussed above, a better way of describing the results of a statistical analysis is based on confidence intervals. The 95% confidence interval for a particular parameter contains the true population parameter with 95% probability, in the sense that 95% of such intervals constructed over repeated experiments contain the true value; the confidence interval thus contains information about the treatment effect itself. Moreover, confidence intervals can also be used for testing. If the 95% confidence interval does not contain the value of the parameter under the null hypothesis, the null hypothesis is rejected at the 5% significance level, or in other words,


that the P-value is smaller than 5%.

The most adequate way to report the results of a statistical analysis is thus to make use of confidence intervals of the relevant parameters or functions thereof, and, if needed, to also add the P-value. This is an additional check of the results as well, because the rejection or non-rejection of the null hypothesis based on the P-value should not contradict the decision based on the confidence interval.

Example 4.5

Consider the study on the number of mutations for which the data are presented in Table 4.1. If we are interested in all pairwise comparisons, then the results of the statistical analysis can be presented in the different ways explained above. For the time being, we will not adjust for multiple comparisons; we will show in Section 4.5 how this should be done.

We can either base the presentation on the differences between the 3 doses or on the mean number of mutations per dose. Let us first consider the mean number of mutations per dose. The results are then reported in Table 4.3, where we make use of the letters. It is clear from this output that each of the three treatments is significantly different from the others, but the level of significance is unclear.

Table 4.3: Mean number of mutations at three different doses of a carcinogen

Treatment   Mean number of mutations
Control       9.62 a
Low dose     14.66 b
High dose    20.42 c

Means with different letters differ significantly with α = 0.05.

Alternatively, the results can be presented as the differences between the treatment means, as demonstrated in Table 4.4. It is clear that the most significant difference is between the control and the high dose, with a P-value lower than 0.001 for this comparison.

Table 4.4: Differences in mean number of mutations between three different doses of a carcinogen

Comparison             Difference
Control - Low dose       -5.04 *
Control - High dose     -10.80 ***
Low dose - High dose     -5.76 *

*** denotes significant difference at α = 0.001, * at α = 0.05.


Next we can produce the same table, but now replacing the stars with the P-values, as in Table 4.5. This table is more informative. It is now clear that the weakest evidence for a difference is found for the comparison between the control treatment and the low dose, with a P-value of 0.046, just below the 5% cut-off value.

Table 4.5: Differences in mean number of mutations between three different doses of a carcinogen, with P-values added

Comparison             Difference   P-value
Control - Low dose       -5.04      0.0460
Control - High dose     -10.80      0.0006
Low dose - High dose     -5.76      0.0104

Finally, we can further expand this table and include the confidence interval, as shown in Table 4.6. From this table, we can deduce between which values the true difference is situated with a certain confidence. For instance, we have 95% confidence that the true difference between the control and low dose treatment lies between -9.33 and -0.11. This interval almost contains the value 0, the parameter value under the null hypothesis, in accordance with the P-value being close to 0.05.

Table 4.6: Differences (with 95% confidence interval) in mean number of mutations between three different doses of a carcinogen

Comparison             Difference (95% confidence interval)   P-value
Control - Low dose       -5.04 (-9.33; -0.11)                 0.0460
Control - High dose     -10.80 (-15.29; -6.27)                0.0006
Low dose - High dose     -5.76 (-9.75; -1.77)                 0.0104

4.4 Analysis for the randomized complete block design

Statistical analysis of more complex experiments is often not straightforward. There is, however, a general model, called the mixed model, that enables us to analyze most properly designed experiments with normally distributed observations in a general framework. We will first describe the previous, simple experiment in terms of the mixed model, and then also the more complex design introduced in the previous chapter, the randomized complete block design.


The data described in Example 4.5 are generated in a completely randomized design: each litter is randomly assigned to one of the three treatments and the average number of mutations per litter is observed. The model can be written as

yij = µi + eij i = 1, 2, 3; j = 1, . . . , 5

where yij is the number of mutations in the jth litter of the group treated with dose i, µi is the mean number of mutations in litters treated with dose i, and eij is the random error term. This is also termed a fixed effects model because the only source of random variation is the error term eij. A further assumption on the error terms is that they are independent of each other with the same normal distribution and the same variance, eij ∼ N(0, σ²).

In the case of the randomized complete block design, we have to extend this model because of the presence of blocks, the blocks being random effects in which we are not interested. Rather, we incorporate the blocks to get rid of the variance linked to the blocks. The model for Example 3.1 then becomes

yij = di + µj + eij i = 1, . . . , 10; j = 1, 2

where yij is the length of the legs on the jth side (front or rear) of deer i, µj isthe mean length of the legs of side j, di is the random effect of deer i and eij

is the random error term. This is termed a mixed model because now we havean additional source of random variation due to the random effect of deer di.A further assumption of the random deer effects is that they are independentfrom each other and from the random error term with all the same normal dis-tribution and the same variance, di ∼ N(0, σ2

d).

Once the structure of the experiment is put in the mixed model format, the analysis follows, as demonstrated in Example 4.6.

Example 4.6

The analysis is performed with the freeware package R, which can be downloaded from http://www.r-project.org/. The following program can be submitted in R.

library(nlme)
# Data of Table 3.1: 10 deer, two sides per deer (1 = rear, 2 = front)
deer<-c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10)
side<-rep(c(1,2),10)
length<-c(142,138,140,136,144,147,144,139,142,143,
          146,141,149,143,150,145,142,136,148,146)
lengthdeer<-data.frame(deer=as.factor(deer),side=as.factor(side),
                       length=length)
# Mixed model: side as fixed effect, deer as random (block) effect
results.mixed<-lme(length~side,random=~1|deer,data=lengthdeer)
summary(results.mixed)


which leads to the following output

Linear mixed-effects model fit by REML
 Data: lengthdeer
       AIC      BIC    logLik
  105.8439 109.4054 -48.92197

Random effects:
 Formula: ~1 | deer
        (Intercept) Residual
StdDev:    3.040455 2.161538

Fixed effects: length ~ side
             Value Std.Error DF   t-value p-value
(Intercept)  144.7 1.1796871  9 122.65965  <.0001
side2         -3.3 0.9666693  9  -3.41378  0.0077
 Correlation:
      (Intr)
side2 -0.41

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-1.22864503 -0.46939984 -0.05032968  0.49376467  1.68594590

Number of Observations: 20
Number of Groups: 10

Thus there is a significant difference in length between the front and rear legs, the difference being estimated as -3.3 (standard error: 0.967; P-value = 0.0077). Furthermore, we observe that the variability between deer is quite substantial: the between-deer standard deviation equals 3.04 and is thus larger than the residual standard deviation of 2.16.
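With only two treatments per block, the randomized complete block analysis reduces to a paired t-test on the within-deer differences. As an illustrative cross-check of the R output above (a Python sketch, using the data of Table 3.1):

```python
import math

# Leg lengths per deer from Table 3.1 (rear and front)
rear = [142, 140, 144, 144, 142, 146, 149, 150, 142, 148]
front = [138, 136, 147, 139, 143, 141, 143, 145, 136, 146]

# Within-deer differences, front minus rear (the "side2" contrast)
d = [f - r for r, f in zip(rear, front)]
n = len(d)
mean_d = sum(d) / n                                  # -3.3, the side2 estimate
var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)  # variance of differences
se = math.sqrt(var_d / n)                            # about 0.9667, as reported
t = mean_d / se                                      # about -3.41, as reported
print(round(mean_d, 1), round(se, 4), round(t, 2))
```

The agreement with the mixed-model standard error is expected: blocking by deer removes the between-deer variability, exactly as differencing within each deer does.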

4.5 Multiple comparisons analysis

If multiple comparisons are tested in an experiment, an adjustment to the significance level is required. We will explain this based on the following example. Assume we experiment with a factor that has a levels and we want to know which levels are different from each other. It seems then straightforward to compare each of the a levels with the other a − 1 levels. This procedure, however, is not correct, because in testing multiple comparisons we want to ensure that the statements with respect to the different multiple comparisons are simultaneously correct with a confidence level of 1 − α. Thus, if we use a confidence level of 95%, we want 95 out of 100 replications of the experiment to show no significant results if in reality there are none. If, for instance, the experiment consists of 4 treatments and we compare treatment 1 with treatment 2 and treatment 3 with treatment 4 at the α significance level, then it can be proven that the probability that there will be either one or two


significant treatment differences is no longer α but 2α − α², so for the case α = 0.05 the type I error is no longer the required 5% but rather 9.75%.

In order to correct for this, we need to test each of the multiple comparisons at a significance level lower than α. Different techniques have been developed, each having certain advantages in particular situations. The most general one is due to Bonferroni and is the only one we will discuss, because of its simplicity. The Bonferroni rule tells us to test each individual comparison at a significance level equal to α/g, where g is the total number of comparisons. We now apply this rule to the previous Example 4.5.

Example 4.7

In the analysis done in Example 4.5 we did not yet take into account the fact that three multiple comparisons are made. Rather than testing at the 5% significance level, we should test, according to Bonferroni's rule, at a significance level equal to 5%/3 = 1.67%. Thus the first comparison, between control and low dose, is no longer significant, and the different tables need to be adapted accordingly; this is done for the first table as an example.

Table 4.7: Mean number of mutations at three different doses of a carcinogen, with Bonferroni adjustment

Treatment   Mean number of mutations
Control       9.62 a
Low dose     14.66 a
High dose    20.42 b

Means with different letters differ significantly with global α = 0.05 and comparison-wise α = 0.0167 (Bonferroni).
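The Bonferroni rule is easy to apply mechanically. As an illustration in Python (using the unadjusted P-values of Table 4.5):

```python
# Unadjusted P-values from Table 4.5: Control-Low, Control-High, Low-High
p_values = [0.0460, 0.0006, 0.0104]

alpha = 0.05
g = len(p_values)                 # number of comparisons
alpha_per_test = alpha / g        # Bonferroni: 0.05 / 3 = 0.0167
significant = [p < alpha_per_test for p in p_values]
print(significant)  # [False, True, True]: Control-Low no longer significant

# Why an adjustment is needed: with two independent tests at level alpha,
# the chance of at least one false positive is 2*alpha - alpha**2 = 0.0975
print(1 - (1 - alpha) ** 2)
```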

4.6 Absence of evidence is not evidence of absence

It is important to realize that a statistical analysis can never prove that two treatments lead to exactly the same results. When the null hypothesis of no difference between two treatments cannot be rejected, it does not mean that the two treatments are equal to each other. It could merely be due to the fact that the experiment included too few animals. Therefore, a statement that two treatments are equal based on a null hypothesis of no difference is incorrect and should never be put in a publication.

We can, however, demonstrate in a statistical analysis that two treatments are equivalent at a certain confidence level. In order to do so, we need to define a null hypothesis which, when rejected, leads to the equivalence conclusion. The null hypothesis then contains the values of the difference for which the two


treatments can no longer be considered to be equivalent, whereas the alternative hypothesis contains the values of the difference that are consistent with equivalence. The following example will clarify this.

Example 4.8

Assume that there are two different formulations for a drug, tablet (T) and solution (S), and we want to test their bioequivalence. When the difference in AUC between the two formulations is smaller than ∆ = 3 µg/ml, we consider these two formulations to be equivalent, because such a small difference is clinically irrelevant.

The hypothesis can then be formulated as

H0 : |µS − µT| ≥ 3 versus Ha : |µS − µT| < 3

We thus wish to reject the null hypothesis in order to state with a certain confidence that the difference is smaller than 3 µg/ml. This hypothesis is most easily tested by calculating the confidence interval of the difference between the two formulations. When the confidence interval is entirely contained in the equivalence interval [−∆; +∆], the null hypothesis can be rejected; otherwise it cannot.
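The decision rule is a simple containment check. As an illustration in Python, where the two confidence intervals below are hypothetical numbers chosen for the example, not results from the text:

```python
def equivalent(ci_low, ci_high, delta):
    """Declare equivalence (reject H0) when the confidence interval for the
    difference lies entirely inside the equivalence interval [-delta, +delta]."""
    return -delta < ci_low and ci_high < delta

# Hypothetical 95% confidence intervals for the AUC difference, delta = 3 µg/ml
print(equivalent(-1.2, 2.1, 3))  # True: interval inside [-3, 3], equivalence shown
print(equivalent(-1.2, 3.4, 3))  # False: interval extends beyond +3, no conclusion
```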


Chapter 5

Tables


Table 5.1: Standard normal distribution

Entries give the cumulative probability F(z) = P(Z < z); rows give the first decimal of z, columns the second decimal.

 z    .00   .01   .02   .03   .04   .05   .06   .07   .08   .09
0.0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
0.1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
0.2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
0.3 .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517
0.4 .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879
0.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
0.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
0.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
0.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
0.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
1.0 .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621
1.1 .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830
1.2 .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015
1.3 .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177
1.4 .9192 .9207 .9222 .9236 .9251 .9265 .9279 .9292 .9306 .9319
1.5 .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441
1.6 .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545
1.7 .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633
1.8 .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706
1.9 .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767
2.0 .9772 .9778 .9783 .9788 .9793 .9798 .9803 .9808 .9812 .9817
2.1 .9821 .9826 .9830 .9834 .9838 .9842 .9846 .9850 .9854 .9857
2.2 .9861 .9864 .9868 .9871 .9875 .9878 .9881 .9884 .9887 .9890
2.3 .9893 .9896 .9898 .9901 .9904 .9906 .9909 .9911 .9913 .9916
2.4 .9918 .9920 .9922 .9925 .9927 .9929 .9931 .9932 .9934 .9936
2.5 .9938 .9940 .9941 .9943 .9945 .9946 .9948 .9949 .9951 .9952
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
2.9 .9981 .9982 .9982 .9983 .9984 .9984 .9985 .9985 .9986 .9986
3.0 .9987 .9987 .9987 .9988 .9988 .9989 .9989 .9989 .9990 .9990


Table 5.2: t-distribution

Entries give the value t for which P(tdf < t) equals the column probability; rows give the degrees of freedom (df).

 df    .750   .900   .950    .975    .990    .995    .999
  1   1.000  3.078  6.314  12.706  31.821  63.656  318.289
  2   0.816  1.886  2.920   4.303   6.965   9.925   22.328
  3   0.765  1.638  2.353   3.182   4.541   5.841   10.214
  4   0.741  1.533  2.132   2.776   3.747   4.604    7.173
  5   0.727  1.476  2.015   2.571   3.365   4.032    5.894
  6   0.718  1.440  1.943   2.447   3.143   3.707    5.208
  7   0.711  1.415  1.895   2.365   2.998   3.499    4.785
  8   0.706  1.397  1.860   2.306   2.896   3.355    4.501
  9   0.703  1.383  1.833   2.262   2.821   3.250    4.297
 10   0.700  1.372  1.812   2.228   2.764   3.169    4.144
 11   0.697  1.363  1.796   2.201   2.718   3.106    4.025
 12   0.695  1.356  1.782   2.179   2.681   3.055    3.930
 13   0.694  1.350  1.771   2.160   2.650   3.012    3.852
 14   0.692  1.345  1.761   2.145   2.624   2.977    3.787
 15   0.691  1.341  1.753   2.131   2.602   2.947    3.733
 16   0.690  1.337  1.746   2.120   2.583   2.921    3.686
 17   0.689  1.333  1.740   2.110   2.567   2.898    3.646
 18   0.688  1.330  1.734   2.101   2.552   2.878    3.610
 19   0.688  1.328  1.729   2.093   2.539   2.861    3.579
 20   0.687  1.325  1.725   2.086   2.528   2.845    3.552
 21   0.686  1.323  1.721   2.080   2.518   2.831    3.527
 22   0.686  1.321  1.717   2.074   2.508   2.819    3.505
 23   0.685  1.319  1.714   2.069   2.500   2.807    3.485
 24   0.685  1.318  1.711   2.064   2.492   2.797    3.467
 25   0.684  1.316  1.708   2.060   2.485   2.787    3.450
 26   0.684  1.315  1.706   2.056   2.479   2.779    3.435
 27   0.684  1.314  1.703   2.052   2.473   2.771    3.421
 28   0.683  1.313  1.701   2.048   2.467   2.763    3.408
 29   0.683  1.311  1.699   2.045   2.462   2.756    3.396
 30   0.683  1.310  1.697   2.042   2.457   2.750    3.385
 40   0.681  1.303  1.684   2.021   2.423   2.704    3.307
 60   0.679  1.296  1.671   2.000   2.390   2.660    3.232
120   0.677  1.289  1.658   1.980   2.358   2.617    3.160
 ∞    0.674  1.282  1.645   1.960   2.326   2.576    3.090