Lecture 23: Quantitative Traits III
Date: 11/12/02. Topics: single locus backcross regression; single locus backcross likelihood; F2 regression, likelihood, etc.
![Page 1: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/1.jpg)
Lecture 23: Quantitative Traits III
Date: 11/12/02
Single locus backcross regression
Single locus backcross likelihood
F2 – regression, likelihood, etc.
![Page 2: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/2.jpg)
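Backcross Model

| Marker genotype | Count | Marg. freq. | QTL genotype QQ | QTL genotype Qq | Trait value |
|---|---|---|---|---|---|
| AA | $n_1$ | 0.5 | $1-\theta$ | $\theta$ | $(1-\theta)\mu_1 + \theta\mu_2$ |
| Aa | $n_2$ | 0.5 | $\theta$ | $1-\theta$ | $\theta\mu_1 + (1-\theta)\mu_2$ |

$\mu_1$ is the genotypic value of QQ; $\mu_2$ is the genotypic value of Qq; $\theta$ is the recombination fraction between the marker and the QTL.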
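As a quick numerical illustration of the backcross model, the expected trait value of each marker class can be computed directly. This is a minimal sketch: the function name and the example values are hypothetical and not part of the lecture.

```python
def backcross_marker_means(mu1, mu2, theta):
    """Expected trait value of marker classes AA and Aa in a backcross.

    mu1, mu2: genotypic values of QQ and Qq; theta: recombination fraction.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    mean_aa = (1 - theta) * mu1 + theta * mu2   # marker class AA
    mean_het = theta * mu1 + (1 - theta) * mu2  # marker class Aa
    return mean_aa, mean_het


# Hypothetical values: the class difference is (1 - 2*theta) * (mu1 - mu2).
mean_aa, mean_het = backcross_marker_means(mu1=10.0, mu2=6.0, theta=0.2)
print(mean_aa - mean_het)
```

With no linkage (theta = 0.5) the two marker classes have the same expected value, so the marker carries no information about the QTL.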
![Page 3: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/3.jpg)
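Backcross – t-test

$$t_M = \frac{\hat\mu_{AA} - \hat\mu_{Aa}}{\sqrt{\hat s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim t(\mathrm{df} = N - 2)$$

where $\hat\mu_{AA}$ and $\hat\mu_{Aa}$ are the observed means in each marker class,
where $\hat s^2$ is the pooled variance,
where $N = n_1 + n_2$ is the total number of individuals sampled.

| Gen. | Freq. | Phen. |
|---|---|---|
| AA | $n_1$ | $X_{11}, X_{12}, \ldots, X_{1n_1}$ |
| Aa | $n_2$ | $X_{21}, X_{22}, \ldots, X_{2n_2}$ |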
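The marker-class t statistic can be sketched in a few lines. This is only an illustration under the usual equal-variance assumption; the function name and the data are made up for the example.

```python
import math
from statistics import mean


def backcross_t(aa_values, het_values):
    """Two-sample t statistic with pooled variance for marker classes AA and Aa."""
    n1, n2 = len(aa_values), len(het_values)
    m1, m2 = mean(aa_values), mean(het_values)
    ss1 = sum((x - m1) ** 2 for x in aa_values)
    ss2 = sum((x - m2) ** 2 for x in het_values)
    df = n1 + n2 - 2                      # N - 2 degrees of freedom
    pooled = (ss1 + ss2) / df             # pooled variance estimate
    t = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical trait values for the two marker classes.
t_stat, df = backcross_t([10.0, 12.0, 11.0, 13.0], [8.0, 9.0, 7.0, 10.0])
print(t_stat, df)
```

The statistic would then be compared with a t distribution on `df` degrees of freedom.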
![Page 4: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/4.jpg)
Backcross – Linear Regression (BLG)
One may also test the data using a simple linear regression model:

$$y_j = \alpha + \beta x_j + \varepsilon_j$$

where $y_j$ is the trait value for the $j$th individual and $x_j$ is a dummy variable indicating marker genotype (AA or Aa).
The estimates of the coefficients are given by:

$$\hat\beta = \frac{\widehat{\mathrm{Cov}}(x, y)}{\widehat{\mathrm{Var}}(x)}, \qquad \hat\alpha = \bar y - \hat\beta\,\bar x$$

We seek the expectation of these coefficients under a genetic model.
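The estimators above are the ordinary least-squares formulas. The sketch below applies them to made-up data, assuming the dummy variable is coded 1 for AA and -1 for Aa (the coding the expectations on the following slide imply); the function name is hypothetical.

```python
def fit_line(x, y):
    """Least-squares fit: slope = Cov(x, y) / Var(x), intercept = ybar - slope * xbar."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical data: four AA individuals (x = 1) and four Aa individuals (x = -1).
y_aa = [10.0, 12.0, 11.0, 13.0]
y_het = [8.0, 9.0, 7.0, 10.0]
x = [1] * len(y_aa) + [-1] * len(y_het)
intercept, slope = fit_line(x, y_aa + y_het)
print(intercept, slope)
```

With equal class sizes and this coding, the slope is half the difference between the two marker-class means and the intercept is the overall mean.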
![Page 5: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/5.jpg)
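BLG – Expected Sample Statistics
To find the expected values under the genetic model, we need the expectations of the sample means, variance, and covariance. With $x_j = 1$ for AA and $x_j = -1$ for Aa:

$$\begin{aligned}
E(\bar x) &= \tfrac12(1) + \tfrac12(-1) = 0 \\
E(\hat s_x^2) &= \tfrac12(1)^2 + \tfrac12(-1)^2 = 1 \\
E(\bar y) &= \tfrac12\left[(1-\theta)\mu_1 + \theta\mu_2\right] + \tfrac12\left[\theta\mu_1 + (1-\theta)\mu_2\right] = \tfrac12(\mu_1 + \mu_2) \\
E(\hat s_{xy}) &= \tfrac12(1)\left[(1-\theta)\mu_1 + \theta\mu_2\right] + \tfrac12(-1)\left[\theta\mu_1 + (1-\theta)\mu_2\right] = \tfrac12(1-2\theta)(\mu_1 - \mu_2)
\end{aligned}$$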
![Page 6: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/6.jpg)
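BLG – Expected Coefficients
Recalling the coefficient estimators for the slope $\hat\beta$ and the intercept $\hat\alpha$:

$$E(\hat\beta) \approx \frac{E(\hat s_{xy})}{E(\hat s_x^2)} = \tfrac12(1-2\theta)(\mu_1-\mu_2), \qquad E(\hat\alpha) \approx E(\bar y) - E(\hat\beta)\,E(\bar x) = \tfrac12(\mu_1+\mu_2)$$

Finally, recalling our genetic models:

| QTL genotype | Model 1 | Model 2 |
|---|---|---|
| QQ | $a$ | $2a$ |
| Qq | $d$ | $(1+k)a$ |
| qq | $-a$ | $0$ |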
![Page 7: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/7.jpg)
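BLG – Hypothesis Testing
We conclude that the expected regression coefficient, written under each of the two genetic parameterizations, is:

$$E(\hat\beta) = \tfrac12(1-2\theta)(\mu_1-\mu_2) = \tfrac12(1-2\theta)(a-d) = \tfrac12(1-2\theta)(1-k)\,a$$

So, again, $H_0\colon \beta = 0$ holds, and failing to reject it is consistent with, any of the following:
$\theta = 0.5$ (NO LINKAGE)
$a = 0$ or $a = d = 0$ (NO VARIATION)
$k = 1$ or $a = d$ (COMPLETE DOMINANCE)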
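Because linkage and genetic effect enter the expected slope only through their product, different scenarios can give the same expectation. The sketch below illustrates this with hypothetical numbers; the function name is an assumption made for the example.

```python
def expected_slope(theta, mu1, mu2):
    """Expected backcross regression slope: 0.5 * (1 - 2*theta) * (mu1 - mu2)."""
    return 0.5 * (1.0 - 2.0 * theta) * (mu1 - mu2)


# Tight linkage with a small effect versus loose linkage with a large effect.
tight_small = expected_slope(theta=0.0, mu1=2.0, mu2=0.0)
loose_large = expected_slope(theta=0.25, mu1=4.0, mu2=0.0)
print(tight_small, loose_large)

# The slope vanishes with no linkage or with no difference between QQ and Qq.
print(expected_slope(0.5, 2.0, 0.0), expected_slope(0.1, 3.0, 3.0))
```

The two scenarios in the first print produce the same expected slope, which is why a single-marker regression cannot separate QTL effect from QTL position.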
![Page 8: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/8.jpg)
Backcross – Likelihood (BL)
One may also set up a likelihood function for backcross progeny.
Trait values are assumed approximately normal (lots of little effects added together).
The distribution of trait values within each marker class is assumed to be a mixture of two normals, one for each possible genotype at the QTL.
The mixing proportions are determined by the recombination fraction.
![Page 9: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/9.jpg)
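BL – Distributions Class AA
Suppose $\theta = 0.2$.
[Figure: trait distribution in marker class AA, a mixture of two normal densities along the genotypic value axis: 80% centered at $\mu_1$ (QQ) and 20% centered at $\mu_2$ (Qq).]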
![Page 10: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/10.jpg)
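BL – Distributions Class Aa
[Figure: trait distribution in marker class Aa for $\theta = 0.2$, a mixture of two normal densities along the genotypic value axis: 80% centered at $\mu_2$ (Qq) and 20% centered at $\mu_1$ (QQ).]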
![Page 11: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/11.jpg)
BL – Assumptions
Assume the trait variances for the two QTL genotypes in the backcross are equal.
Assume the traits are normally distributed.
Assume there is no marker / trait interaction, so the distributions remain unchanged in both marker classes (i.e. same variances).
![Page 12: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/12.jpg)
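BL – Likelihood
The likelihood function for the backcross is then:

$$L = \frac{1}{(2\pi\sigma^2)^{N/2}} \prod_{i=1}^{N} \sum_{j=1}^{2} P(Q_j \mid M_i)\, \exp\left[-\frac{(y_i-\mu_j)^2}{2\sigma^2}\right]$$

where $M_i$ is the marker genotype of individual $i$ and $Q_j$ is one of the two possible (unobserved) genotypes at the QTL.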
![Page 13: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/13.jpg)
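BL – Log Likelihood
Take the log of the likelihood to obtain:

$$l = \sum_{i=1}^{N} \log\left\{\sum_{j=1}^{2} P(Q_j \mid M_i)\, \exp\left[-\frac{(y_i-\mu_j)^2}{2\sigma^2}\right]\right\} - \frac{N}{2}\log\left(2\pi\sigma^2\right)$$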
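The backcross log likelihood can be evaluated directly for given parameter values. The following is a minimal sketch, not an optimized implementation; the function and argument names are hypothetical.

```python
import math


def backcross_loglik(y_aa, y_het, mu1, mu2, sigma2, theta):
    """Log likelihood of backcross trait data under the two-component normal mixture.

    y_aa, y_het: trait values in marker classes AA and Aa.
    P(QQ | AA) = 1 - theta and P(QQ | Aa) = theta.
    """
    def class_sum(values, p_qq):
        total = 0.0
        for y in values:
            mix = (p_qq * math.exp(-((y - mu1) ** 2) / (2.0 * sigma2))
                   + (1.0 - p_qq) * math.exp(-((y - mu2) ** 2) / (2.0 * sigma2)))
            total += math.log(mix)
        return total

    n = len(y_aa) + len(y_het)
    return (class_sum(y_aa, 1.0 - theta) + class_sum(y_het, theta)
            - 0.5 * n * math.log(2.0 * math.pi * sigma2))


# Hypothetical data and parameter values.
print(backcross_loglik([10.0, 12.0], [8.0, 9.0], mu1=11.0, mu2=8.5, sigma2=1.0, theta=0.1))
```

For trait values very far from both means the exponentials can underflow to zero; a production version would work on the log scale throughout.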
![Page 14: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/14.jpg)
BL – Null Hypothesis A
One null hypothesis of interest is that the mean genotypic values for the two distributions are not in fact different, so
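$H_0\colon \mu_1 = \mu_2 = \mu$.
In this case, the log likelihood becomes:

$$l = \sum_{i=1}^{N}\log\exp\left[-\frac{(y_i-\mu)^2}{2\sigma^2}\right] - \frac{N}{2}\log\left(2\pi\sigma^2\right) = -\frac{1}{2\sigma^2}\sum_{i=1}^{N}(y_i-\mu)^2 - \frac{N}{2}\log\left(2\pi\sigma^2\right)$$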
![Page 15: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/15.jpg)
BL – Null Hypothesis B
Another, perhaps more interesting null hypothesis, is that there is no linkage, so
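$H_0\colon \theta = 0.5$
Under this assumption, the log likelihood becomes

$$l = \sum_{i=1}^{N} \log\left\{\sum_{j=1}^{2} \tfrac12 \exp\left[-\frac{(y_i-\mu_j)^2}{2\sigma^2}\right]\right\} - \frac{N}{2}\log\left(2\pi\sigma^2\right)$$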
![Page 16: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/16.jpg)
BL – Statistical Test
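The G statistic that is commonly calculated to test for linkage is:

$$G = 2\left[\,l(\hat\mu_1, \hat\mu_2, \hat\sigma^2, \hat\theta) - l(\hat\mu_1, \hat\mu_2, \hat\sigma^2, \theta = 0.5)\right] \sim \chi^2_{\mathrm{df}=1}$$

However, this test is less powerful than the t test introduced earlier.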
![Page 17: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/17.jpg)
BL – LOD Scores
Again, LOD scores are commonly used for QTL detection:

$$\mathrm{lod} = \log_{10} L(\hat\mu_1, \hat\mu_2, \hat\sigma^2, \hat\theta) - \log_{10} L(\hat\mu_1, \hat\mu_2, \hat\sigma^2, \theta = 0.5)$$

As usual, a lod score of $l$ means the data are $10^l$ times as likely under the alternative hypothesis as under the null hypothesis.
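The G statistic and the LOD score are built from the same pair of maximized log likelihoods, so they differ only by a constant factor: lod = G / (2 ln 10). A small sketch with hypothetical function names and made-up log likelihood values:

```python
import math


def g_statistic(loglik_alt, loglik_null):
    """G = 2 * (l_alt - l_null), using natural-log likelihoods."""
    return 2.0 * (loglik_alt - loglik_null)


def lod_score(loglik_alt, loglik_null):
    """LOD = log10(L_alt) - log10(L_null), from natural-log likelihoods."""
    return (loglik_alt - loglik_null) / math.log(10.0)


# Made-up values: the alternative is 1000 times as likely as the null.
l_null = -100.0 - math.log(1000.0)
l_alt = -100.0
print(lod_score(l_alt, l_null), g_statistic(l_alt, l_null))
```

Dividing G by 2 ln 10 (about 4.605) converts it to the LOD scale.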
![Page 18: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/18.jpg)
BL – Likelihood Maximization
Analytic solutions are difficult to achieve.
Iterative approaches are generally used (expectation-maximization, EM; Newton-Raphson, NR).
Combinations of methods are also used. For example, the variance is commonly estimated with the pooled variance:

$$\hat\sigma^2 = \frac{\sum_{j=1}^{n_1}(y_{1j}-\bar y_1)^2 + \sum_{j=1}^{n_2}(y_{2j}-\bar y_2)^2}{n_1 + n_2 - 2}$$
![Page 19: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/19.jpg)
BL – Likelihood Maximization
To facilitate calculations even more, a grid of $\theta$ values with maximization on $\mu_1$ and $\mu_2$ can be used.
So suppose you have multiple markers with known map positions. Then, evaluate a G statistic or lod score for 3 possible locations of the QTL:

| Marker | 0 | $0.25\,m$ | $0.5\,m$ |
|---|---|---|---|
| 1 | $\theta = 0$ | $\theta = f(0.25\,m_{12})$ | $\theta = f(0.5\,m_{12})$ |
| 2 | | | |
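The grid approach can be sketched as follows: hold the variance at the pooled estimate, fix the recombination fraction at each grid value, maximize over the two means with EM, and report a LOD score relative to no linkage. This is an illustrative sketch only. It scans the recombination fraction directly rather than map positions (the map function is not specified here), the EM may stop at a local maximum, and all names and data are hypothetical.

```python
import math


def kernel(y, mu, sigma2):
    """Normal density kernel exp(-(y - mu)^2 / (2 sigma^2))."""
    return math.exp(-((y - mu) ** 2) / (2.0 * sigma2))


def loglik(data, mu1, mu2, sigma2):
    """Mixture log likelihood; data holds (y, P(QQ | marker)) pairs."""
    total = -0.5 * len(data) * math.log(2.0 * math.pi * sigma2)
    for y, p in data:
        total += math.log(p * kernel(y, mu1, sigma2) + (1.0 - p) * kernel(y, mu2, sigma2))
    return total


def em_means(data, mu1, mu2, sigma2, n_iter=200):
    """EM updates of mu1 and mu2 with theta (inside data) and sigma2 held fixed."""
    for _ in range(n_iter):
        # E-step: posterior probability that each individual is QQ.
        w = []
        for y, p in data:
            k1 = p * kernel(y, mu1, sigma2)
            k2 = (1.0 - p) * kernel(y, mu2, sigma2)
            w.append(k1 / (k1 + k2))
        # M-step: posterior-weighted means.
        w_sum = sum(w)
        mu1 = sum(wi * y for wi, (y, _) in zip(w, data)) / w_sum
        mu2 = sum((1.0 - wi) * y for wi, (y, _) in zip(w, data)) / (len(data) - w_sum)
    return mu1, mu2


def lod_profile(y_aa, y_het, thetas):
    """LOD score relative to theta = 0.5 for each theta on the grid."""
    m1 = sum(y_aa) / len(y_aa)
    m2 = sum(y_het) / len(y_het)
    ss = sum((y - m1) ** 2 for y in y_aa) + sum((y - m2) ** 2 for y in y_het)
    sigma2 = ss / (len(y_aa) + len(y_het) - 2)  # pooled variance, held fixed

    def maximized(theta):
        data = [(y, 1.0 - theta) for y in y_aa] + [(y, theta) for y in y_het]
        mu1, mu2 = em_means(data, m1, m2, sigma2)  # start from the class means
        return loglik(data, mu1, mu2, sigma2)

    null = maximized(0.5)
    return [(theta, (maximized(theta) - null) / math.log(10.0)) for theta in thetas]


# Hypothetical trait values for marker classes AA and Aa.
for theta, lod in lod_profile([10.0, 12.0, 11.0, 13.0], [8.0, 9.0, 7.0, 10.0], [0.0, 0.1, 0.25, 0.5]):
    print(theta, round(lod, 3))
```

At a grid value of zero the EM weights are exactly 0 or 1, so the estimated means reduce to the marker-class means.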
![Page 20: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/20.jpg)
BL – Sample Results
[Figure: LOD score (vertical axis, 0 to 7) plotted against chromosome location (horizontal axis, 0 to 2).]
![Page 21: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/21.jpg)
BL – Caveats
When there is more than one QTL in the same vicinity, the peaks in the LOD score plot may not correspond to QTLs.
Recall that these results are still based on single-locus analysis for which we cannot separate genetic effect from linkage. Thus, there is little good information about QTL location in such a plot, even though it looks like there should be.
![Page 22: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/22.jpg)
BL – Comments
Note that if marker density is high, then there is no need to evaluate at multiple levels of $\theta$ for each marker.
However, when marker density is low, information is gained when multiple QTL locations are considered.
When $\theta = 0$ is assumed, the estimates of $\mu_1$ and $\mu_2$ are simple means.
![Page 23: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/23.jpg)
Single Marker F2 (F2)
There are now three possible genotypes to consider for both the marker and the QTL locus.
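
| Marker genotype | $n_i$ | Marg. freq. | $P(QQ \mid M_i)$ | $P(Qq \mid M_i)$ | $P(qq \mid M_i)$ |
|---|---|---|---|---|---|
| AA | $n_1$ | 0.25 | $(1-\theta)^2$ | $2\theta(1-\theta)$ | $\theta^2$ |
| Aa | $n_2$ | 0.50 | $\theta(1-\theta)$ | $(1-\theta)^2+\theta^2$ | $\theta(1-\theta)$ |
| aa | $n_3$ | 0.25 | $\theta^2$ | $2\theta(1-\theta)$ | $(1-\theta)^2$ |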
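A small helper makes it easy to sanity-check the conditional probabilities: each row should sum to one, and weighting the rows by the marker frequencies should return the usual 1:2:1 QTL genotype frequencies. The function name below is hypothetical.

```python
def f2_qtl_given_marker(theta):
    """P(QQ), P(Qq), P(qq) conditional on each F2 marker genotype."""
    r, s = theta, 1.0 - theta
    return {
        "AA": (s * s, 2 * r * s, r * r),
        "Aa": (r * s, s * s + r * r, r * s),
        "aa": (r * r, 2 * r * s, s * s),
    }


table = f2_qtl_given_marker(0.2)
marker_freq = {"AA": 0.25, "Aa": 0.50, "aa": 0.25}

# Each row sums to one.
print({m: sum(row) for m, row in table.items()})

# Marginal QTL genotype frequencies, averaging over marker classes.
print([sum(marker_freq[m] * table[m][k] for m in table) for k in range(3)])
```

With complete linkage (theta = 0) the table becomes the identity: the marker genotype determines the QTL genotype.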
![Page 24: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/24.jpg)
F2 – Expected Trait Values
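With genotypic values $a$ for QQ, $d$ for Qq, and $-a$ for qq:

| Marker genotype | $n_i$ | Marg. freq. | Expected trait value |
|---|---|---|---|
| AA | $n_1$ | 0.25 | $\mu_{AA} = (1-\theta)^2 a + 2\theta(1-\theta)d - \theta^2 a = (1-2\theta)a + 2\theta(1-\theta)d$ |
| Aa | $n_2$ | 0.50 | $\mu_{Aa} = \theta(1-\theta)a + \left[(1-\theta)^2+\theta^2\right]d - \theta(1-\theta)a = \left[(1-\theta)^2+\theta^2\right]d$ |
| aa | $n_3$ | 0.25 | $\mu_{aa} = \theta^2 a + 2\theta(1-\theta)d - (1-\theta)^2 a = -(1-2\theta)a + 2\theta(1-\theta)d$ |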
![Page 25: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/25.jpg)
F2 – Dominant Marker
Similar tables can be derived for the case of a dominant marker.
In general, the procedure is as follows:
Derive the QTL genotype probabilities conditional on the marker phenotype.
Using the conditional probabilities, derive the expected trait value for each marker phenotype class.
![Page 26: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/26.jpg)
F2 – Regression (F2R)
The regression model is

$$y_j = \beta_0 + \beta_1 x_{1j} + \beta_2 x_{2j} + \varepsilon_j$$

where $y_j$ is the trait value of the $j$th individual in the population,
where $x_{1j}$ is the dummy variable for the marker additive effect, taking the value 1 for AA, 0 for Aa, and –1 for aa,
where $x_{2j}$ is the dummy variable for the marker dominance effect, taking the value 1 for AA, –1 for Aa, and 1 for aa.
![Page 27: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/27.jpg)
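F2R – Matrix Notation
The normal equations and their solution are

$$X'X\hat\beta = X'Y, \qquad \hat\beta = (X'X)^{-1}X'Y$$

With marker class frequencies 0.25, 0.5, 0.25 and expected marker class trait values $\mu_{AA}$, $\mu_{Aa}$, $\mu_{aa}$, the expected cross-products per individual are

$$\frac{1}{N}E(X'X) = \begin{pmatrix}1&0&0\\0&0.5&0\\0&0&1\end{pmatrix}, \qquad \frac{1}{N}E(X'Y) = \begin{pmatrix}0.25(\mu_{AA}+2\mu_{Aa}+\mu_{aa})\\ 0.25(\mu_{AA}-\mu_{aa})\\ 0.25(\mu_{AA}-2\mu_{Aa}+\mu_{aa})\end{pmatrix} = \begin{pmatrix}0.25(\mu_{AA}+2\mu_{Aa}+\mu_{aa})\\ 0.5(1-2\theta)a\\ -0.5(1-2\theta)^2 d\end{pmatrix}$$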
![Page 28: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/28.jpg)
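F2R – Expected Coefficients
The coefficient estimates have expectation:

$$E\begin{pmatrix}\hat\beta_0\\ \hat\beta_1\\ \hat\beta_2\end{pmatrix} = \begin{pmatrix}1&0&0\\0&2&0\\0&0&1\end{pmatrix}\begin{pmatrix}0.25(\mu_{AA}+2\mu_{Aa}+\mu_{aa})\\ 0.5(1-2\theta)a\\ -0.5(1-2\theta)^2 d\end{pmatrix} = \begin{pmatrix}0.25(\mu_{AA}+2\mu_{Aa}+\mu_{aa})\\ (1-2\theta)a\\ -0.5(1-2\theta)^2 d\end{pmatrix}$$

Here $\mu_{AA}$, $\mu_{Aa}$, $\mu_{aa}$ are the expected trait values of the three marker classes. The sign and factor of the last entry follow from coding $x_{2j}$ as 1 for AA and aa and –1 for Aa.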
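The expected coefficients can be checked numerically from the marker class frequencies, the expected class trait values, and the dummy coding stated for the regression model (x1 = 1, 0, -1 and x2 = 1, -1, 1 for AA, Aa, aa). Because both dummy variables have mean zero and are uncorrelated under these frequencies, each expected coefficient is a simple ratio of moments. This is a sketch with hypothetical names and values.

```python
def f2_expected_coefficients(theta, a, d):
    """Expected F2 regression coefficients (b0, b1, b2) under genotypic values a, d, -a."""
    r, s = theta, 1.0 - theta
    mu = {
        "AA": (s * s - r * r) * a + 2 * r * s * d,
        "Aa": (s * s + r * r) * d,
        "aa": -(s * s - r * r) * a + 2 * r * s * d,
    }
    freq = {"AA": 0.25, "Aa": 0.50, "aa": 0.25}
    x1 = {"AA": 1, "Aa": 0, "aa": -1}
    x2 = {"AA": 1, "Aa": -1, "aa": 1}

    e_y = sum(freq[g] * mu[g] for g in mu)
    e_x1y = sum(freq[g] * x1[g] * mu[g] for g in mu)
    e_x2y = sum(freq[g] * x2[g] * mu[g] for g in mu)
    e_x1x1 = sum(freq[g] * x1[g] ** 2 for g in mu)  # 0.5
    e_x2x2 = sum(freq[g] * x2[g] ** 2 for g in mu)  # 1.0
    # E(x1) = E(x2) = E(x1 * x2) = 0, so the moment matrix is diagonal.
    return e_y, e_x1y / e_x1x1, e_x2y / e_x2x2


print(f2_expected_coefficients(theta=0.1, a=2.0, d=1.0))
```

Under this coding the additive coefficient has expectation (1 - 2θ)a and the dominance coefficient has expectation -(1 - 2θ)²d/2, so both shrink toward zero as linkage loosens.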
![Page 29: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/29.jpg)
F2R – F Statistics
The F statistic is the ratio between the residual mean squares for the reduced model and the full model.
The full model has residual mean square:

$$S^2_{\text{full}} = (Y - X\hat\beta)'(Y - X\hat\beta)$$
![Page 30: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/30.jpg)
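F2R – Reduced Models
Reduced models of interest are:

$$\begin{aligned}
H_0\colon \beta_1 = 0: &\quad y_j = \beta_0 + \beta_2 x_{2j} + \varepsilon_j, && S^2_{\beta_1=0} \\
H_0\colon \beta_2 = 0: &\quad y_j = \beta_0 + \beta_1 x_{1j} + \varepsilon_j, && S^2_{\beta_2=0} \\
H_0\colon \beta_1 = 0,\ \beta_2 = 0: &\quad y_j = \beta_0 + \varepsilon_j, && S^2_{\beta_1=0,\,\beta_2=0}
\end{aligned}$$

And the F statistics are:

$$F_{\beta_1=0} = \frac{S^2_{\beta_1=0}}{S^2_{\text{full}}} \sim F_{\mathrm{df}=1,\,N-3}, \qquad F_{\beta_2=0} = \frac{S^2_{\beta_2=0}}{S^2_{\text{full}}} \sim F_{\mathrm{df}=1,\,N-3}, \qquad F_{\beta_1=0,\,\beta_2=0} = \frac{S^2_{\beta_1=0,\,\beta_2=0}}{S^2_{\text{full}}} \sim F_{\mathrm{df}=2,\,N-3}$$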
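In practice, a reduced model is usually compared with the full model through the extra-sum-of-squares F test, F = [(RSS_reduced - RSS_full) / q] / [RSS_full / (N - 3)], where q is the number of coefficients set to zero. The sketch below uses that standard form, which is an assumption about the intended test rather than a transcription of the formulas above. It also relies on the fact that the full model is saturated for three marker classes, so its fitted values are the class means. All names and data are hypothetical.

```python
from statistics import mean


def rss_one_predictor(x, y):
    """Residual sum of squares after regressing y on a single predictor plus intercept."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    return syy - sxy ** 2 / sxx


def f2_f_statistics(genotypes, y):
    """Extra-sum-of-squares F statistics for the additive, dominance, and joint tests."""
    x1 = [{"AA": 1, "Aa": 0, "aa": -1}[g] for g in genotypes]
    x2 = [{"AA": 1, "Aa": -1, "aa": 1}[g] for g in genotypes]
    n = len(y)

    # Full model: three parameters for three classes, so fitted values are class means.
    by_class = {}
    for g, v in zip(genotypes, y):
        by_class.setdefault(g, []).append(v)
    rss_full = sum(sum((v - mean(vals)) ** 2 for v in vals) for vals in by_class.values())
    ms_full = rss_full / (n - 3)

    rss_no_b1 = rss_one_predictor(x2, y)            # H0: beta1 = 0
    rss_no_b2 = rss_one_predictor(x1, y)            # H0: beta2 = 0
    rss_null = sum((v - mean(y)) ** 2 for v in y)   # H0: beta1 = beta2 = 0
    return {
        "additive": (rss_no_b1 - rss_full) / ms_full,        # df = 1, n - 3
        "dominance": (rss_no_b2 - rss_full) / ms_full,       # df = 1, n - 3
        "both": (rss_null - rss_full) / 2.0 / ms_full,       # df = 2, n - 3
    }


# Hypothetical data with purely additive class means (3, 2, 1).
genotypes = ["AA"] * 3 + ["Aa"] * 4 + ["aa"] * 3
y = [3.1, 2.9, 3.0, 2.0, 2.1, 1.9, 2.0, 1.0, 1.1, 0.9]
print(f2_f_statistics(genotypes, y))
```

The sketch assumes all three marker classes are present and that there is some within-class variation, otherwise the denominator is zero.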
![Page 31: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/31.jpg)
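F2R – Dominant Marker
If the marker locus segregates as a dominant trait, then:

$$y_j = \beta_0 + \beta_1 x_j + \varepsilon_j$$

$$E(\hat\beta_1) = \tfrac13\left[2(1-2\theta)a + (1-2\theta)^2 d\right]$$

(with $x_j$ coded 1 for the dominant marker phenotype and –1 for aa).
Thus, a significant regression coefficient tests for a confounded additive effect, dominance effect, and linkage.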
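The expected slope for a dominant marker can be checked from the F2 conditional expectations by pooling the AA and Aa classes. The sketch assumes the dummy variable is 1 for the dominant marker phenotype and -1 for aa; that coding and the function name are assumptions made for this example.

```python
def dominant_marker_slope(theta, a, d):
    """Expected regression slope when AA and Aa are indistinguishable (x = 1) versus aa (x = -1)."""
    r, s = theta, 1.0 - theta
    mu_AA = (s * s - r * r) * a + 2 * r * s * d
    mu_Aa = (s * s + r * r) * d
    mu_aa = -(s * s - r * r) * a + 2 * r * s * d
    # Expected trait value of the pooled dominant class (frequencies 0.25 and 0.50).
    mu_dom = (0.25 * mu_AA + 0.50 * mu_Aa) / 0.75
    # For a two-valued predictor the slope is the mean difference over the coding gap.
    return (mu_dom - mu_aa) / 2.0


theta, a, d = 0.1, 2.0, 1.0
closed_form = (2 * (1 - 2 * theta) * a + (1 - 2 * theta) ** 2 * d) / 3.0
print(dominant_marker_slope(theta, a, d), closed_form)
```

The two printed values agree, and the single slope mixes the additive effect, the dominance effect, and the recombination fraction.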
![Page 32: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/32.jpg)
F2 – Likelihood Approach (F2L)
Assume trait variances for the three QTL genotypes are equal.
For each marker class, the trait value is a mixture of three normal distributions with different means, equal variances, and expected proportions based on degree of linkage.
The expected proportions are given in slide #23.
![Page 33: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/33.jpg)
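F2L – Log Likelihood
The likelihood then becomes a sum over three normals:

$$L_{F_2} = \frac{1}{(2\pi\sigma^2)^{N/2}} \prod_{i=1}^{N} \sum_{j=1}^{3} P(Q_j \mid M_i)\, \exp\left[-\frac{(y_i-\mu_j)^2}{2\sigma^2}\right]$$

$$l_{F_2} = \sum_{i=1}^{N} \log\left\{\sum_{j=1}^{3} P(Q_j \mid M_i)\, \exp\left[-\frac{(y_i-\mu_j)^2}{2\sigma^2}\right]\right\} - \frac{N}{2}\log\left(2\pi\sigma^2\right)$$

where $\mu_1$, $\mu_2$, $\mu_3$ are the genotypic values of QQ, Qq, and qq.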
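The F2 log likelihood can be evaluated in the same way as the backcross one, with three mixture components per marker class. This is a minimal sketch with hypothetical names; it takes the trait values grouped by marker genotype.

```python
import math


def f2_loglik(y_by_marker, mu, sigma2, theta):
    """F2 mixture log likelihood.

    y_by_marker: dict mapping "AA", "Aa", "aa" to lists of trait values.
    mu: genotypic values (mu_QQ, mu_Qq, mu_qq); theta: recombination fraction.
    """
    r, s = theta, 1.0 - theta
    probs = {
        "AA": (s * s, 2 * r * s, r * r),
        "Aa": (r * s, s * s + r * r, r * s),
        "aa": (r * r, 2 * r * s, s * s),
    }
    total, n = 0.0, 0
    for marker, values in y_by_marker.items():
        for y in values:
            mix = sum(p * math.exp(-((y - m) ** 2) / (2.0 * sigma2))
                      for p, m in zip(probs[marker], mu))
            total += math.log(mix)
            n += 1
    return total - 0.5 * n * math.log(2.0 * math.pi * sigma2)


# Hypothetical data and parameter values.
data = {"AA": [3.1, 2.8], "Aa": [2.2, 1.9, 2.0], "aa": [0.9]}
print(f2_loglik(data, mu=(3.0, 2.0, 1.0), sigma2=0.25, theta=0.1))
```

Setting the three means equal collapses the mixture to a single normal, which gives a convenient check of the implementation.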
![Page 34: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/34.jpg)
F2L – Null Hypothesis A
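If the null hypothesis is
$H_0\colon a = 0$
then $\mu_1 = \mu_3$ (the QQ and qq means coincide), and the log likelihood becomes

$$l_{F_2}(a=0) = \sum_{i=1}^{N}\log\left\{\left[P(Q_1\mid M_i) + P(Q_3\mid M_i)\right]\exp\left[-\frac{(y_i-\mu_1)^2}{2\sigma^2}\right] + P(Q_2\mid M_i)\exp\left[-\frac{(y_i-\mu_2)^2}{2\sigma^2}\right]\right\} - \frac{N}{2}\log\left(2\pi\sigma^2\right)$$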
![Page 35: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/35.jpg)
F2L – Null Hypothesis B
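Suppose instead that the null hypothesis is
$H_0\colon d = 0$
so that the Qq mean is $\mu_2 = \tfrac12(\mu_1+\mu_3)$. Then

$$l_{F_2}(d=0) = \sum_{i=1}^{N}\log\left\{P(Q_1\mid M_i)\exp\left[-\frac{(y_i-\mu_1)^2}{2\sigma^2}\right] + P(Q_2\mid M_i)\exp\left[-\frac{\left(y_i-\tfrac12(\mu_1+\mu_3)\right)^2}{2\sigma^2}\right] + P(Q_3\mid M_i)\exp\left[-\frac{(y_i-\mu_3)^2}{2\sigma^2}\right]\right\} - \frac{N}{2}\log\left(2\pi\sigma^2\right)$$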
![Page 36: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/36.jpg)
F2L – Null Hypothesis C
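Suppose instead that the null hypothesis is
$H_0\colon a = 0,\ d = 0$
so that all three QTL genotypes share a common mean $\mu$. Then

$$l_{F_2}(a=0, d=0) = -\frac{1}{2\sigma^2}\sum_{i=1}^{N}(y_i-\mu)^2 - \frac{N}{2}\log\left(2\pi\sigma^2\right)$$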
![Page 37: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/37.jpg)
F2L – Null Hypothesis D
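When the null hypothesis is
$H_0\colon \theta = 0.5$
the mixing proportions are 1/4, 1/2, 1/4 in every marker class, and

$$l_{F_2}(\theta=0.5) = \sum_{i=1}^{N}\log\left\{\tfrac14\exp\left[-\frac{(y_i-\mu_1)^2}{2\sigma^2}\right] + \tfrac12\exp\left[-\frac{(y_i-\mu_2)^2}{2\sigma^2}\right] + \tfrac14\exp\left[-\frac{(y_i-\mu_3)^2}{2\sigma^2}\right]\right\} - \frac{N}{2}\log\left(2\pi\sigma^2\right)$$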
![Page 38: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/38.jpg)
F2L – Statistical Test
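The G statistic for testing linkage is

$$G = 2\left[\,l_{F_2}(\hat\mu_1, \hat\mu_2, \hat\mu_3, \hat\sigma^2, \hat\theta) - l_{F_2}(\hat\mu_1, \hat\mu_2, \hat\mu_3, \hat\sigma^2, \theta = 0.5)\right] \sim \chi^2_{\mathrm{df}=1}$$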
![Page 39: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/39.jpg)
F2L – Maximization
Iterative methods are required to find the maximum likelihood estimates.
Other approaches have been suggested, such as combining moment estimation with maximum likelihood approach. The resulting system of equations to solve for the estimators is given on the next slide.
![Page 40: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/40.jpg)
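F2L – Finding MLEs
With $m$ and $S^2$ denoting the observed trait mean and variance within each marker class, the moment equations are:

$$\begin{aligned}
m_{AA} &= (1-\theta)^2\mu_1 + 2\theta(1-\theta)\mu_2 + \theta^2\mu_3 \\
m_{Aa} &= \theta(1-\theta)\mu_1 + \left[(1-\theta)^2+\theta^2\right]\mu_2 + \theta(1-\theta)\mu_3 \\
m_{aa} &= \theta^2\mu_1 + 2\theta(1-\theta)\mu_2 + (1-\theta)^2\mu_3 \\
S^2_{AA} &= \sigma^2 + (1-\theta)^2(\mu_1-m_{AA})^2 + 2\theta(1-\theta)(\mu_2-m_{AA})^2 + \theta^2(\mu_3-m_{AA})^2 \\
S^2_{Aa} &= \sigma^2 + \theta(1-\theta)(\mu_1-m_{Aa})^2 + \left[(1-\theta)^2+\theta^2\right](\mu_2-m_{Aa})^2 + \theta(1-\theta)(\mu_3-m_{Aa})^2 \\
S^2_{aa} &= \sigma^2 + \theta^2(\mu_1-m_{aa})^2 + 2\theta(1-\theta)(\mu_2-m_{aa})^2 + (1-\theta)^2(\mu_3-m_{aa})^2
\end{aligned}$$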
![Page 41: Lecture 23: Quantitative Traits III](https://reader035.vdocuments.us/reader035/viewer/2022081519/5681364b550346895d9dca02/html5/thumbnails/41.jpg)
F2L – Dominant Marker Model
Modify the likelihood equations to use QTL genotype probabilities conditional on the marker phenotype for a dominant marker.
Modify the expected trait values for each marker genotype.
Done.