chi-square analysis

Chi-Square Analysis Mendel’s Peas and the Goodness of Fit Test

Upload: zonta

Post on 15-Mar-2016


DESCRIPTION

Chi-Square Analysis: Mendel’s Peas and the Goodness of Fit Test. We will develop the use of the χ² distribution through an example from the history of biology.

TRANSCRIPT

Page 1: Chi-Square Analysis

Chi-Square Analysis

Mendel’s Peas

and the Goodness of Fit Test

Page 2: Chi-Square Analysis

We will develop the use of the χ² distribution through an example from the history of biology.

Page 3: Chi-Square Analysis

In Austria in the mid-1800s, an Augustinian friar, Gregor Mendel, studied the garden pea and seven of its traits, such as shape and color of the peas, position of flowers on the plant, etc. He is credited with discovering patterns of inheritance, the basis of the field of genetics.

Curiously, the pea has seven pairs of chromosomes, and it is often said that Mendel's seven traits lie one on each of them. Later gene mapping indicates that this is not quite so, but the pairs of traits he reported still assorted independently. His theory of the independent assortment of genes holds only when genes are on different chromosomes, or far apart on the same one.

Page 4: Chi-Square Analysis

Consider two different characteristics of peas, color and shape. The peas may be yellow or green, round or wrinkled.

We will use one of Mendel’s studies, and some of his original data, to explore the χ² test of significance.

Page 5: Chi-Square Analysis

If we cross a true-breeding plant with yellow round peas with a true-breeding plant having green wrinkled peas, and examine the progeny, we will discover a uniform F1 generation.

The traits yellow and round are each dominant, while green and wrinkled are recessive. We use the letter Y for color, and R for pea shape, so the alleles are Y, y, R, and r.

Page 6: Chi-Square Analysis

Yellow round pea × Green wrinkled pea

This is a Punnett square to illustrate this dihybrid cross.

Notice the uniformity among the offspring, as all are YyRr.

Page 7: Chi-Square Analysis

Now we cross the F1 among themselves to produce the F2:

gametes YR Yr yR yr

YR YYRR YYRr YyRR YyRr

Yr YYRr YYrr YyRr Yyrr

yR YyRR YyRr yyRR yyRr

yr YyRr Yyrr yyRr yyrr
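The F2 square above can also be generated mechanically. What follows is a minimal sketch in Python, not part of the original slides; the names `gametes`, `offspring`, and `square` are chosen here for illustration. It pairs every gamete of one YyRr parent with every gamete of the other.

```python
from itertools import product

# Gametes produced by a YyRr parent: one color allele plus one shape allele.
gametes = [c + s for c, s in product("Yy", "Rr")]  # ['YR', 'Yr', 'yR', 'yr']

def offspring(g1, g2):
    """Combine two gametes into a genotype, writing the dominant allele first."""
    # Uppercase letters sort before lowercase, so sorted() puts Y before y.
    color = "".join(sorted(g1[0] + g2[0]))
    shape = "".join(sorted(g1[1] + g2[1]))
    return color + shape

# One row per gamete of the first parent, one column per gamete of the second.
square = [[offspring(row, col) for col in gametes] for row in gametes]

print("gametes", *gametes)
for row_gamete, row in zip(gametes, square):
    print(row_gamete, *row)
```

The printed rows should match the table above, one line per gamete.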

Page 8: Chi-Square Analysis

Now we identify the yellow round peas, which carry at least one Y and at least one R (9 of the 16 cells). Cells with other phenotypes are shown as "-":

gametes  YR    Yr    yR    yr
YR       YYRR  YYRr  YyRR  YyRr
Yr       YYRr  -     YyRr  -
yR       YyRR  YyRr  -     -
yr       YyRr  -     -     -

Page 9: Chi-Square Analysis

Now we identify the yellow wrinkled peas, which carry at least one Y and are rr (3 of the 16 cells). Cells with other phenotypes are shown as "-":

gametes  YR    Yr    yR    yr
YR       -     -     -     -
Yr       -     YYrr  -     Yyrr
yR       -     -     -     -
yr       -     Yyrr  -     -

Page 10: Chi-Square Analysis

Next we identify the green round peas, which are yy and carry at least one R (3 of the 16 cells). Cells with other phenotypes are shown as "-":

gametes  YR    Yr    yR    yr
YR       -     -     -     -
Yr       -     -     -     -
yR       -     -     yyRR  yyRr
yr       -     -     yyRr  -

Page 11: Chi-Square Analysis

Finally, the last type of pea is green and wrinkled, which requires the genotype yyrr (1 of the 16 cells). Cells with other phenotypes are shown as "-":

gametes  YR    Yr    yR    yr
YR       -     -     -     -
Yr       -     -     -     -
yR       -     -     -     -
yr       -     -     -     yyrr

Page 12: Chi-Square Analysis

So now we have four phenotypes (different physical forms) of peas originating from the single phenotype of the F1 generation. They are, along with their genotypes and expected frequencies:

Phenotype        Genotypes                Expected frequency
Yellow round     YYRR, YYRr, YyRR, YyRr   9/16
Yellow wrinkled  YYrr, Yyrr               3/16
Green round      yyRR, yyRr               3/16
Green wrinkled   yyrr                     1/16
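The expected frequencies follow from counting cells of the F2 Punnett square by phenotype. Here is a small Python sketch of that count, offered as an illustration rather than as part of the original presentation; the helper `phenotype` and the variable names are assumptions made here.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The four gametes of a YyRr parent.
gametes = ["YR", "Yr", "yR", "yr"]

def phenotype(g1, g2):
    """Phenotype of the offspring of two gametes; a dominant allele masks a recessive one."""
    color = "Yellow" if "Y" in (g1[0], g2[0]) else "Green"
    shape = "round" if "R" in (g1[1], g2[1]) else "wrinkled"
    return f"{color} {shape}"

# Count how many of the 16 cells of the square show each phenotype.
counts = Counter(phenotype(a, b) for a, b in product(gametes, repeat=2))

# Each cell is equally likely, so the expected frequency is cells / 16.
expected_frequency = {p: Fraction(n, 16) for p, n in counts.items()}

for p, n in counts.items():
    print(f"{p}: {n}/16")
```

Counting this way gives the 9:3:3:1 ratio used for the expected frequencies.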

Page 13: Chi-Square Analysis

If Mendel’s understanding of genetics were correct, and the crosses made as he believed, the proportions of the four phenotypes should fit the calculations from the Punnett square. Using the χ² distribution, we are able to test to see if groups of individuals are present in the same proportions as expected. This is rather like conducting multiple Z-tests for proportions at once.

In this example Mendel carried out the dihybrid cross to produce an F1 generation, and as expected, the F1 were all of the same phenotype, yellow and round. Further, the F1 were crossed among themselves to produce the F2 generation. Mendel recorded the numbers of individuals in each category.

Page 14: Chi-Square Analysis

The following table gives the observed numbers of each category.

Phenotype        Observed  Expected frequency
Yellow round     315       9/16
Yellow wrinkled  101       3/16
Green round      108       3/16
Green wrinkled   32        1/16

Page 15: Chi-Square Analysis

To make a χ² test for “goodness of fit” we start, as with all other tests of significance, with a null hypothesis.

Step 1:

H0: The F2 generation consists of four phenotypes in the proportions predicted by Mendelian genetics.

Ha: The F2 generation does not consist of four phenotypes in the proportions predicted by Mendelian genetics.

Another way of saying this is that the null hypothesis claims the population fits our expected pattern, while the alternative hypothesis says it does not.

Page 16: Chi-Square Analysis

Step 2:

Assumptions: Our first assumption is that our data are counts. (We cannot use proportions or means.) With χ², we do not always have a sample of a population, and sometimes examine an entire population, as with this example. When working from a sample we must ensure that the sample is representative.

In order to check assumptions for this goodness of fit test we must calculate the expected counts for each category. Then we must meet two criteria:

1. All expected counts must be one or more.

2. No more than 20% of the expected counts may be less than 5.

Page 17: Chi-Square Analysis

We calculate the expected counts by finding the total number of observations (here 315 + 101 + 108 + 32 = 556) and multiplying that by each expected frequency.

Phenotype        Observed counts  Expected frequency  Expected counts
Yellow round     315              9/16                312.75
Yellow wrinkled  101              3/16                104.25
Green round      108              3/16                104.25
Green wrinkled   32               1/16                34.75
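The expected counts and the two criteria from Step 2 can be checked with a few lines of Python. This is a sketch for illustration only; the dictionary layout and variable names are assumptions made here, the observed counts are those given in the earlier table (315, 101, 108, 32), and the frequencies are the 9:3:3:1 ratio.

```python
from fractions import Fraction

# Observed F2 counts and the expected 9:3:3:1 frequencies.
observed = {
    "Yellow round": 315,
    "Yellow wrinkled": 101,
    "Green round": 108,
    "Green wrinkled": 32,
}
expected_frequency = {
    "Yellow round": Fraction(9, 16),
    "Yellow wrinkled": Fraction(3, 16),
    "Green round": Fraction(3, 16),
    "Green wrinkled": Fraction(1, 16),
}

total = sum(observed.values())

# Expected count = total number of observations x expected frequency.
expected = {p: float(total * f) for p, f in expected_frequency.items()}

# Criterion 1: every expected count is at least 1.
all_at_least_one = all(e >= 1 for e in expected.values())
# Criterion 2: no more than 20% of the expected counts are below 5.
share_below_five = sum(e < 5 for e in expected.values()) / len(expected)
conditions_met = all_at_least_one and share_below_five <= 0.20

for p, e in expected.items():
    print(f"{p}: observed {observed[p]}, expected {e}")
print("conditions met:", conditions_met)
```

Keeping the frequencies as exact fractions mirrors the later advice to enter them on the calculator as fractions rather than rounded decimals.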

Page 18: Chi-Square Analysis

As you can see, all expected counts are greater than 5, so all assumptions are met.

Step 3:

The formula for the χ² test statistic is:

$$\chi^2 = \sum \frac{(o - e)^2}{e}$$

where o = observed counts, and e = expected counts

This calculation can be made on a graphing calculator. Enter the observed counts in L1. Enter the expected frequencies in L2, as exact numbers. (Enter a number like 1/3 directly as a fraction; never round it to just .3 or .33.)

Page 19: Chi-Square Analysis

In L3 multiply L2 by 556, the total of the observed counts. (The sum of L1 can be found using 1-Var Stats.) This will give the expected counts. Now in L4, enter (L1-L3)²/L3; this will give you the χ² contribution for each category. Finally, χ² is the sum of L4.

For this problem, the χ² statistic is 0.4700. In χ², we always need to know and report the degrees of freedom. The degrees of freedom are the number of categories minus one. Here we have 4 - 1 = 3 degrees of freedom.
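The same list arithmetic can be sketched in Python, with one list standing in for each calculator list. This is an illustration under the assumption that the observed counts are 315, 101, 108, and 32 and the expected ratio is 9:3:3:1; the variable names are made up for this example.

```python
# L1: observed counts, in the order yellow round, yellow wrinkled,
# green round, green wrinkled.
observed = [315, 101, 108, 32]

# Expected 9:3:3:1 ratio; dividing by 16 gives the exact frequencies (L2).
ratio = [9, 3, 3, 1]

total = sum(observed)

# L3: expected counts = total x expected frequency.
expected = [total * r / sum(ratio) for r in ratio]

# L4: chi-square contribution (o - e)^2 / e for each category.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# The test statistic is the sum of the contributions.
chi_square = sum(contributions)

# Degrees of freedom: number of categories minus one.
df = len(observed) - 1

print(f"chi-square = {chi_square:.4f}, df = {df}")
```

The largest contribution comes from the green wrinkled category, which has the smallest expected count.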

Page 20: Chi-Square Analysis

Step 4:

Page 21: Chi-Square Analysis

Step 5:

Step 6:

Fail to reject H0, as p = 0.9254 > α = .05.

Step 7:

We lack evidence that the pattern of pea phenotypes is different from expected. That is, the data are consistent with the F2 phenotypes being present in the expected proportions, 9:3:3:1.

The area (the p-value) can also be found with χ²cdf(.4700, 10^99, 3).
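For readers without the calculator, the upper-tail area can be sketched in Python using only the standard library. The function below is an assumption introduced for illustration: it uses a closed form that is valid only for 3 degrees of freedom, not a general χ² routine. (With SciPy installed, `scipy.stats.chi2.sf(0.4700, 3)` should give the same area.)

```python
import math

def chi_square_sf_3df(x):
    """Upper-tail area P(X > x) for a chi-square variable with exactly 3 degrees of freedom.

    Closed form for df = 3 only: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

chi_square = 0.4700   # test statistic from Step 3
alpha = 0.05          # significance level used in Step 6

p_value = chi_square_sf_3df(chi_square)

print(f"p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

A p-value this close to 1 means that data fitting the 9:3:3:1 ratio at least this closely would be fairly unusual under chance variation alone.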

Page 22: Chi-Square Analysis

Gregor Mendel did not have modern statistics to rely on for his data analysis, but nonetheless analyzed data in a way that led to this major scientific discovery, important to this day. There has been speculation about his studies, or how he reported them, as the data fit the expected ratios almost better than chance variation would produce. He was, however, an Augustinian friar, so perhaps he had a little help…