
The statistical models of genomic prediction

John M Hickey, Chris Gaynor, Gregor Gorjanc

www.alphagenes.roslin.ed.ac.uk

@hickeyjohn

Genomic selection

Goddard & Hayes Nat. Rev. Genet. 2009

“GS is the quantitative geneticists revenge on molecular genetics” - A. Archibald

Relationship within and between training and prediction individuals

Relationships between TP and selection candidates leveraged for prediction

[Figure: haplotypes and genomic relationship matrix, with individuals grouped into selection candidates and training population on both axes]

Useful things with matrices

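•  Counting how many animals pass over a scale

\[
x = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \qquad
x'x = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
= (1 \times 1) + (1 \times 1) + (1 \times 1) = 3
\]

•  Summing the animals' weights

\[
y = \begin{bmatrix} 10 \\ 15 \\ 20 \end{bmatrix}, \qquad
x'y = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 10 \\ 15 \\ 20 \end{bmatrix}
= (1 \times 10) + (1 \times 15) + (1 \times 20) = 45
\]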


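•  Averaging the animals' weights

\[
x'y = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 10 \\ 15 \\ 20 \end{bmatrix}
= (1 \times 10) + (1 \times 15) + (1 \times 20) = 45
\]

\[
x'x = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
= (1 \times 1) + (1 \times 1) + (1 \times 1) = 3
\]

\[
\frac{x'y}{x'x} = [x'x]^{-1} x'y = b, \qquad
\frac{45}{3} = \frac{1}{3} \times 45 = 15
\]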
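The three operations above (count, sum, average) can be sketched in a few lines of NumPy. The variable names are illustrative only and are not from the slides.

```python
import numpy as np

x = np.ones(3)                      # one entry per animal crossing the scale
y = np.array([10.0, 15.0, 20.0])    # the animals' weights

n_animals = x @ x                   # x'x counts the animals
total_weight = x @ y                # x'y sums the weights
mean_weight = total_weight / n_animals   # [x'x]^-1 x'y = b

print(n_animals, total_weight, mean_weight)
```

The same pattern, [x'x]^-1 x'y, is what the later least squares and mixed model equations generalise.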


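•  Summing total weight in males and females

•  Weight of average male and average female

\[
X = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
y = \begin{bmatrix} 10 \\ 15 \\ 20 \end{bmatrix}
\]

\[
X'y = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 10 \\ 15 \\ 20 \end{bmatrix}
= \begin{bmatrix} 25 \\ 20 \end{bmatrix}
\]

\[
\frac{X'y}{X'X} = [X'X]^{-1} X'y = b
\]

\[
\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 25 \\ 20 \end{bmatrix}
= \begin{bmatrix} \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{1} \end{bmatrix} \begin{bmatrix} 25 \\ 20 \end{bmatrix}
= \begin{bmatrix} 12.5 \\ 20 \end{bmatrix}
\]

\[
b_{11} = \left( \tfrac{1}{2} \times 25 \right) + (0 \times 20) = 12.5
\]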
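A minimal sketch of the same calculation in NumPy, assuming the first two animals are male and the third is female as in the X matrix above:

```python
import numpy as np

# Columns of X: male indicator, female indicator
X = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
y = np.array([10.0, 15.0, 20.0])

XtX = X.T @ X                  # animals per sex: diag(2, 1)
Xty = X.T @ y                  # total weight per sex: [25, 20]
b = np.linalg.solve(XtX, Xty)  # average weight per sex

print(XtX, Xty, b)
```

Solving the system with `np.linalg.solve` gives the same answer as forming the inverse explicitly, and is the usual choice numerically.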

Shrinkage – Random Wand

•  Ridge regression
•  BayesA
•  BayesB
•  BayesC
•  BayesLasso
•  BayesR
•  FnBayesB

•  All differ in the shrinkage parameter
–  Some measure of our belief

Let's put in a little bit of genetics

•  Diploid genomes

–  Markers are AA, Aa, aA, or aa

–  Label a=0 and A=1

–  Thus the dosage is:
•  AA=2
•  Aa=1
•  aA=1
•  aa=0
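The dosage coding can be sketched as a small helper. The function name `dosage` is hypothetical and is used here only for illustration.

```python
def dosage(genotype: str) -> int:
    """Count the copies of allele 'A' in a two-letter genotype (a=0, A=1)."""
    if len(genotype) != 2 or set(genotype) - {"A", "a"}:
        raise ValueError(f"expected a genotype such as 'Aa', got {genotype!r}")
    return genotype.count("A")

for g in ("AA", "Aa", "aA", "aa"):
    print(g, dosage(g))
```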

Mixed model equations

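•  Sample mean 0.75
•  True intercept is 0.19
•  True effect is 0.50

Data (8 animals, one SNP):

\[
y = \begin{bmatrix} 0.10 & 0.70 & 1.30 & 0.65 & 1.25 & 0.12 & 0.68 & 1.20 \end{bmatrix}'
\]
\[
Z = \begin{bmatrix} 0 & 1 & 2 & 1 & 2 & 0 & 1 & 2 \end{bmatrix}', \qquad
X = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}'
\]
\[
TBV = \begin{bmatrix} 0.0 & 0.5 & 1.0 & 0.5 & 1.0 & 0.0 & 0.5 & 1.0 \end{bmatrix}'
\]

Without shrinkage:

\[
\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z \end{bmatrix}^{-1}
\begin{bmatrix} X'y \\ Z'y \end{bmatrix}
= \begin{bmatrix} b \\ u \end{bmatrix}
\]
\[
LHS = \begin{bmatrix} 8 & 9 \\ 9 & 15 \end{bmatrix}, \qquad
RHS = \begin{bmatrix} 6 \\ 9.53 \end{bmatrix}, \qquad
b = \begin{bmatrix} 0.11 \\ 0.57 \end{bmatrix}
\]

With shrinkage, \lambda = 0.85:

\[
\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + I\lambda \end{bmatrix}^{-1}
\begin{bmatrix} X'y \\ Z'y \end{bmatrix}
= \begin{bmatrix} b \\ u \end{bmatrix}
\]
\[
LHS = \begin{bmatrix} 8 & 9 \\ 9 & 15.85 \end{bmatrix}, \qquad
RHS = \begin{bmatrix} 6 \\ 9.53 \end{bmatrix}, \qquad
b = \begin{bmatrix} 0.20 \\ 0.49 \end{bmatrix}
\]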
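The numbers on this slide can be reproduced with a short NumPy sketch. The helper name `solve_mme` is an assumption made for illustration; the data are the y, Z and X vectors of the slide.

```python
import numpy as np

y = np.array([0.10, 0.70, 1.30, 0.65, 1.25, 0.12, 0.68, 1.20])
Z = np.array([0, 1, 2, 1, 2, 0, 1, 2], dtype=float).reshape(-1, 1)  # SNP dosages
X = np.ones((8, 1))                                                 # intercept

def solve_mme(X, Z, y, lam):
    """Build and solve the mixed model equations with lambda added to Z'Z."""
    W = np.hstack([X, Z])
    lhs = W.T @ W
    q = Z.shape[1]
    lhs[-q:, -q:] += lam * np.eye(q)   # shrinkage only on the random (SNP) part
    rhs = W.T @ y
    return lhs, rhs, np.linalg.solve(lhs, rhs)

lhs0, rhs, b0 = solve_mme(X, Z, y, lam=0.0)    # no shrinkage
lhs1, _, b1 = solve_mme(X, Z, y, lam=0.85)     # lambda from the slide

print(lhs0, rhs, b0.round(2))
print(lhs1, b1.round(2))
```

With lambda = 0.85 the SNP solution moves from 0.57 towards the true effect of 0.50, and the intercept moves from 0.11 towards the true value of 0.19.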

A range of shrinkage values

•  If Lambda = 1000 the SNP solution = 0.00
•  And the solution for the intercept = 0.75

[Figure: BetaHat (y-axis, 0.00 to 0.60) plotted against Lambda (x-axis, 0 to 1.2)]
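A sketch of the sweep behind this plot, reusing the LHS and RHS of the 8-animal example. The grid of lambda values is an assumption; only the axis range and the lambda = 1000 case come from the slide.

```python
import numpy as np

xtx, xtz, ztz = 8.0, 9.0, 15.0      # elements of the LHS without shrinkage
rhs = np.array([6.0, 9.53])

def solutions(lam):
    """Return (intercept, SNP effect) for a given shrinkage parameter."""
    lhs = np.array([[xtx, xtz],
                    [xtz, ztz + lam]])
    return np.linalg.solve(lhs, rhs)

for lam in (0.0, 0.2, 0.4, 0.6, 0.85, 1.2, 1000.0):
    intercept, snp = solutions(lam)
    print(f"lambda={lam:8.2f}  intercept={intercept:.2f}  BetaHat={snp:.2f}")
```

As lambda grows, the SNP solution is pulled to zero and the intercept absorbs the sample mean.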

Shrinkage versus more data

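•  Two data sets
•  One with 8 animals, the other with 80 animals
•  Compare effect of Lambda in both

\[
\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + I\lambda \end{bmatrix}^{-1}
\begin{bmatrix} X'y \\ Z'y \end{bmatrix}
= \begin{bmatrix} b \\ u \end{bmatrix}
\]

No Lambda

8 animals:
\[
LHS = \begin{bmatrix} 8 & 9 \\ 9 & 15 \end{bmatrix}, \qquad
RHS = \begin{bmatrix} 6 \\ 9.53 \end{bmatrix}, \qquad
b = \begin{bmatrix} 0.11 \\ 0.57 \end{bmatrix}
\]
80 animals:
\[
LHS = \begin{bmatrix} 80 & 85 \\ 85 & 263 \end{bmatrix}, \qquad
RHS = \begin{bmatrix} 57.4 \\ 147.94 \end{bmatrix}, \qquad
b = \begin{bmatrix} 0.18 \\ 0.50 \end{bmatrix}
\]

Lambda = 5.0 (extremely high value)

8 animals:
\[
LHS = \begin{bmatrix} 8 & 9 \\ 9 & 20 \end{bmatrix}, \qquad
RHS = \begin{bmatrix} 6 \\ 9.53 \end{bmatrix}, \qquad
b = \begin{bmatrix} 0.43 \\ 0.28 \end{bmatrix}
\]
80 animals:
\[
LHS = \begin{bmatrix} 80 & 85 \\ 85 & 268 \end{bmatrix}, \qquad
RHS = \begin{bmatrix} 57.4 \\ 147.94 \end{bmatrix}, \qquad
b = \begin{bmatrix} 0.20 \\ 0.49 \end{bmatrix}
\]

With 8 animals the SNP solution halves under this Lambda; with 80 animals it barely moves.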
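The comparison can be checked with a small sketch that solves both systems with and without Lambda. The dictionary layout and the function name `snp_solution` are illustrative assumptions; the LHS and RHS values are those printed on the slide.

```python
import numpy as np

datasets = {
    "8 animals": (np.array([[8.0, 9.0], [9.0, 15.0]]),
                  np.array([6.0, 9.53])),
    "80 animals": (np.array([[80.0, 85.0], [85.0, 263.0]]),
                   np.array([57.4, 147.94])),
}

def snp_solution(lhs, rhs, lam):
    """SNP effect after adding lambda to the Z'Z element of the LHS."""
    shrunk = lhs.copy()
    shrunk[1, 1] += lam
    return np.linalg.solve(shrunk, rhs)[1]

change = {}
for name, (lhs, rhs) in datasets.items():
    before = snp_solution(lhs, rhs, 0.0)
    after = snp_solution(lhs, rhs, 5.0)
    change[name] = before - after       # how far Lambda = 5 pulls the estimate
    print(f"{name}: {before:.2f} -> {after:.2f}")
```

The same Lambda matters much less when Z'Z is large, which is the sense in which more data means less shrinkage.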

Mendelian sampling

[Figure: pedigree diagram of one sire with six progeny, Child 1 to Child 6]

In theory you can have sibs that are genetically unrelated

This is why I am different from my brother

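[Figure: distribution of additive genetic relationships for half sibs (HS) and full sibs (FS), m=4; x-axis: additive genetic relationship, 0 to 1]

\[
A = \begin{bmatrix}
1.00 & 0.50 & 0.50 & 0.50 & 0.50 \\
0.50 & 1.00 & 0.25 & 0.25 & 0.25 \\
0.50 & 0.25 & 1.00 & 0.25 & 0.25 \\
0.50 & 0.25 & 0.25 & 1.00 & 0.25 \\
0.50 & 0.25 & 0.25 & 0.25 & 1.00
\end{bmatrix}
\qquad
G = \begin{bmatrix}
1.00 & 0.50 & 0.50 & 0.50 & 0.50 \\
0.50 & 1.00 & 0.20 & 0.30 & 0.20 \\
0.50 & 0.20 & 1.00 & 0.20 & 0.30 \\
0.50 & 0.30 & 0.20 & 1.00 & 0.20 \\
0.50 & 0.20 & 0.30 & 0.20 & 1.00
\end{bmatrix}
\]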

And “hidden” relationships

Population version

10 animal example

•  10 animal example
–  2 unrelated sire families (FamA and FamB)
–  Dams are unrelated

•  In each family
–  2 half sibs used in prediction set
–  Sire and 2 half sibs used in training set
–  5 individuals from other family used in training set

•  Purpose
–  Show prediction due to parent average versus MS (Mendelian sampling)
–  Pedigree versus genomics
–  Close versus distant relatives
–  Show shrinkage

Pedigree

ID   Sire   Dam
 1    0      0
 2    1      0
 3    1      0
 4    1      0
 5    1      0
 6    0      0
 7    6      0
 8    6      0
 9    6      0
10    6      0

Genetic relationships

•  Captured in a matrix A, traditionally built using the pedigree
–  Holds the relationship between each pair of individuals

•  Values range from 0 to 2

–  A fully inbred individual has a relationship with itself of 2

–  A pair of completely unrelated individuals has a relationship of 0

–  Full sibs have a relationship of 0.5
•  If parents are not related

–  Half sibs have a relationship of 0.25
•  If parents are not related
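One common way to build A from a pedigree is the tabular method. The sketch below is an illustration under stated assumptions, not necessarily how the slides' matrices were produced: animals are numbered 1..n, parents appear before their offspring, and 0 codes an unknown parent. The function name is hypothetical.

```python
import numpy as np

def relationship_matrix(sires, dams):
    """Additive relationship matrix by the tabular method (0 = unknown parent)."""
    n = len(sires)
    A = np.zeros((n + 1, n + 1))      # row/column 0 is a dummy unknown parent
    for i in range(1, n + 1):
        s, d = sires[i - 1], dams[i - 1]
        A[i, i] = 1.0 + 0.5 * A[s, d]                 # 1 + inbreeding coefficient
        for j in range(1, i):
            A[i, j] = A[j, i] = 0.5 * (A[j, s] + A[j, d])
    return A[1:, 1:]

# The 10-animal pedigree: sires 1 and 6, four progeny each, dams unknown
sires = [0, 1, 1, 1, 1, 0, 6, 6, 6, 6]
dams = [0] * 10
A = relationship_matrix(sires, dams)
print(A[:5, :5])
```

For this pedigree the sire-progeny entries are 0.5, the paternal half sibs are 0.25, and every entry between the two families is 0.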

“some animals are more equal than others”…….. even if the additive genetic relationship is the same

[Figure: distribution of additive genetic relationships for half sibs (HS) and full sibs (FS), m=4; x-axis: additive genetic relationship, 0 to 1]

Genomic Relationships

e.g. actual relationship between HS can vary between 0.2 and 0.3

Let's do 5 animals first

ID   Sire   Dam
 1    0      0
 2    1      0
 3    1      0
 4    1      0
 5    1      0

\[
A = \begin{bmatrix}
1.00 & 0.50 & 0.50 & 0.50 & 0.50 \\
0.50 & 1.00 & 0.25 & 0.25 & 0.25 \\
0.50 & 0.25 & 1.00 & 0.25 & 0.25 \\
0.50 & 0.25 & 0.25 & 1.00 & 0.25 \\
0.50 & 0.25 & 0.25 & 0.25 & 1.00
\end{bmatrix}
\qquad
G = \begin{bmatrix}
1.00 & 0.50 & 0.50 & 0.50 & 0.50 \\
0.50 & 1.00 & 0.20 & 0.30 & 0.20 \\
0.50 & 0.20 & 1.00 & 0.20 & 0.30 \\
0.50 & 0.30 & 0.20 & 1.00 & 0.20 \\
0.50 & 0.20 & 0.30 & 0.20 & 1.00
\end{bmatrix}
\]

Pedigree tells:
–  Which family you belong to

Genomics tells:
–  Which family you belong to
–  Which sib you are more closely related to
–  And shows “hidden” relationships (we will see the last bit with 10 animals)

Linkage

10 animals for comparison

\[
A = \begin{bmatrix}
1.00 & 0.50 & 0.50 & 0.50 & 0.50 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 \\
0.50 & 1.00 & 0.25 & 0.25 & 0.25 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 \\
0.50 & 0.25 & 1.00 & 0.25 & 0.25 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 \\
0.50 & 0.25 & 0.25 & 1.00 & 0.25 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 \\
0.50 & 0.25 & 0.25 & 0.25 & 1.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 \\
0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 1.00 & 0.50 & 0.50 & 0.50 & 0.50 \\
0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.50 & 1.00 & 0.25 & 0.25 & 0.25 \\
0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.50 & 0.25 & 1.00 & 0.25 & 0.25 \\
0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.50 & 0.25 & 0.25 & 1.00 & 0.25 \\
0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.50 & 0.25 & 0.25 & 0.25 & 1.00
\end{bmatrix}
\]

\[
G = \begin{bmatrix}
1.00 & 0.50 & 0.50 & 0.50 & 0.50 & 0.02 & 0.02 & 0.02 & 0.02 & 0.02 \\
0.50 & 1.00 & 0.20 & 0.30 & 0.20 & 0.02 & 0.01 & 0.03 & 0.01 & 0.03 \\
0.50 & 0.20 & 1.00 & 0.20 & 0.30 & 0.02 & 0.03 & 0.01 & 0.03 & 0.01 \\
0.50 & 0.30 & 0.20 & 1.00 & 0.20 & 0.02 & 0.01 & 0.03 & 0.01 & 0.03 \\
0.50 & 0.20 & 0.30 & 0.20 & 1.00 & 0.02 & 0.03 & 0.01 & 0.03 & 0.01 \\
0.02 & 0.02 & 0.02 & 0.02 & 0.02 & 1.00 & 0.50 & 0.50 & 0.50 & 0.50 \\
0.02 & 0.01 & 0.03 & 0.01 & 0.03 & 0.50 & 1.00 & 0.20 & 0.30 & 0.20 \\
0.02 & 0.03 & 0.01 & 0.03 & 0.01 & 0.50 & 0.20 & 1.00 & 0.20 & 0.30 \\
0.02 & 0.01 & 0.03 & 0.01 & 0.03 & 0.50 & 0.30 & 0.20 & 1.00 & 0.20 \\
0.02 & 0.03 & 0.01 & 0.03 & 0.01 & 0.50 & 0.20 & 0.30 & 0.20 & 1.00
\end{bmatrix}
\]

Rows and columns follow the pedigree order, animals 1 to 10.

Labels on the matrices:
–  Family relationships
–  Segregation within family
–  Missing pedigree = “Unrelated”
–  Linkage
–  Linkage disequilibrium

5 animal example

A and G are the 5-animal relationship matrices shown above. Equations are ordered mean, sire, son 2, son 3, son 4, son 5.

\[
y = \begin{bmatrix} 0.00 \\ -2.00 \\ 2.00 \\ \text{Missing} \\ \text{Missing} \end{bmatrix}, \qquad
RHS = \begin{bmatrix} 0.00 \\ 0.00 \\ -2.00 \\ 2.00 \\ \# \\ \# \end{bmatrix}, \qquad
TrueValues = \begin{bmatrix} 0.00 \\ 0.00 \\ -2.00 \\ 2.00 \\ -2.00 \\ 2.00 \end{bmatrix}
\]

(# marks an animal without a phenotype.)

Pedigree:

\[
\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + A^{-1}\lambda \end{bmatrix}^{-1}
\begin{bmatrix} X'y \\ Z'y \end{bmatrix}
= \begin{bmatrix} b \\ u \end{bmatrix}
\]
\[
LHS^{-1} = \begin{bmatrix}
1.55 & -1.27 & -1.18 & -1.18 & -0.64 & -0.64 \\
-1.27 & 1.64 & 1.09 & 1.09 & 0.82 & 0.82 \\
-1.18 & 1.09 & 1.53 & 0.93 & 0.55 & 0.55 \\
-1.18 & 1.09 & 0.93 & 1.53 & 0.55 & 0.55 \\
-0.64 & 0.82 & 0.55 & 0.55 & 1.91 & 0.41 \\
-0.64 & 0.82 & 0.55 & 0.55 & 0.41 & 1.91
\end{bmatrix}, \qquad
Solutions = \begin{bmatrix} 0.00 \\ 0.00 \\ -1.20 \\ 1.20 \\ 0.00 \\ 0.00 \end{bmatrix}
\]

Genomic:

\[
\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + G^{-1}\lambda \end{bmatrix}^{-1}
\begin{bmatrix} X'y \\ Z'y \end{bmatrix}
= \begin{bmatrix} b \\ u \end{bmatrix}
\]
\[
LHS^{-1} = \begin{bmatrix}
1.52 & -1.26 & -1.15 & -1.15 & -0.63 & -0.63 \\
-1.26 & 1.63 & 1.07 & 1.07 & 0.81 & 0.81 \\
-1.15 & 1.07 & 1.49 & 0.88 & 0.58 & 0.50 \\
-1.15 & 1.07 & 0.88 & 1.49 & 0.50 & 0.58 \\
-0.63 & 0.81 & 0.58 & 0.50 & 1.90 & 0.32 \\
-0.63 & 0.81 & 0.50 & 0.58 & 0.32 & 1.90
\end{bmatrix}, \qquad
Solutions = \begin{bmatrix} 0.00 \\ 0.00 \\ -1.23 \\ 1.23 \\ -0.15 \\ 0.15 \end{bmatrix}
\]

5 animal example

•  When we do BLUP we get an estimated breeding value (EBV)

•  An EBV is simply a weighted average of all the phenotypic data available –  With simultaneous correction for all other effects

•  The weightings are determined by the inverse of the LHS –  This is primarily driven by the relationship matrix
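A sketch of the 5-animal example in NumPy, comparing pedigree (A) and genomic (G) BLUP. The value lambda = 0.5 is an assumption: the slides do not state it, and it is used here because it reproduces the solutions shown for this example. The function name `blup` is illustrative.

```python
import numpy as np

A = np.full((5, 5), 0.25)          # paternal half sibs
A[0, :] = A[:, 0] = 0.5            # sire with each son
np.fill_diagonal(A, 1.0)

G = np.array([[1.00, 0.50, 0.50, 0.50, 0.50],
              [0.50, 1.00, 0.20, 0.30, 0.20],
              [0.50, 0.20, 1.00, 0.20, 0.30],
              [0.50, 0.30, 0.20, 1.00, 0.20],
              [0.50, 0.20, 0.30, 0.20, 1.00]])

y = np.array([0.0, -2.0, 2.0])                   # records on sire, son 2, son 3
X = np.ones((3, 1))                              # overall mean
Z = np.hstack([np.eye(3), np.zeros((3, 2))])     # sons 4 and 5 have no record
lam = 0.5                                        # assumed shrinkage parameter

def blup(K):
    """Solve the mixed model equations with relationship matrix K."""
    W = np.hstack([X, Z])
    lhs = W.T @ W
    lhs[1:, 1:] += lam * np.linalg.inv(K)
    rhs = W.T @ y
    lhs_inv = np.linalg.inv(lhs)
    return lhs_inv, lhs_inv @ rhs    # each EBV is a row of LHS^-1 times the RHS

inv_A, sol_A = blup(A)
inv_G, sol_G = blup(G)
print("pedigree:", sol_A.round(2))
print("genomic: ", sol_G.round(2))
```

With A, sons 4 and 5 get the parent average of zero. With G, son 4 is pulled towards son 2 and son 5 towards son 3, because G records which sib each is more closely related to.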


5 animal example


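\[
u_{Son4} = \left( LHS^{-1}_{Son4,Mean} \times RHS_{Mean} \right)
+ \left( LHS^{-1}_{Son4,Sire} \times RHS_{Sire} \right)
+ \left( LHS^{-1}_{Son4,Son2} \times RHS_{Son2} \right)
+ \left( LHS^{-1}_{Son4,Son3} \times RHS_{Son3} \right)
\]

Genomic:
\[
u_{Son4} = (-0.63 \times 0.00) + (0.81 \times 0.00) + (0.58 \times -2.00) + (0.50 \times 2.00) \approx -0.15
\]

Pedigree:
\[
u_{Son4} = (-0.64 \times 0.00) + (0.82 \times 0.00) + (0.55 \times -2.00) + (0.55 \times 2.00) = 0.00
\]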


10 animals


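\[
\left[ Z'Z + A^{-1}\lambda \right]^{-1} \left[ Z'y \right] = \left[ u \right]
\qquad
\left[ Z'Z + G^{-1}\lambda \right]^{-1} \left[ Z'y \right] = \left[ u \right]
\]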

10 animal example

Animals are ordered 1, 2, 3, 6, 7, 8 (with phenotypes) then 4, 5, 9, 10 (without phenotypes).

Relationships in this order:

A     1     2     3     6     7     8     4     5     9     10
1   1.00  0.50  0.50  0.00  0.00  0.00  0.50  0.50  0.00  0.00
2   0.50  1.00  0.25  0.00  0.00  0.00  0.25  0.25  0.00  0.00
3   0.50  0.25  1.00  0.00  0.00  0.00  0.25  0.25  0.00  0.00
6   0.00  0.00  0.00  1.00  0.50  0.50  0.00  0.00  0.50  0.50
7   0.00  0.00  0.00  0.50  1.00  0.25  0.00  0.00  0.25  0.25
8   0.00  0.00  0.00  0.50  0.25  1.00  0.00  0.00  0.25  0.25
4   0.50  0.25  0.25  0.00  0.00  0.00  1.00  0.25  0.00  0.00
5   0.50  0.25  0.25  0.00  0.00  0.00  0.25  1.00  0.00  0.00
9   0.00  0.00  0.00  0.50  0.25  0.25  0.00  0.00  1.00  0.25
10  0.00  0.00  0.00  0.50  0.25  0.25  0.00  0.00  0.25  1.00

G     1     2     3     6     7     8     4     5     9     10
1   1.00  0.50  0.50  0.02  0.02  0.02  0.50  0.50  0.02  0.02
2   0.50  1.00  0.20  0.02  0.01  0.03  0.30  0.20  0.01  0.03
3   0.50  0.20  1.00  0.02  0.03  0.01  0.20  0.30  0.03  0.01
6   0.02  0.02  0.02  1.00  0.50  0.50  0.02  0.02  0.50  0.50
7   0.02  0.01  0.03  0.50  1.00  0.20  0.01  0.03  0.30  0.20
8   0.02  0.03  0.01  0.50  0.20  1.00  0.03  0.01  0.20  0.30
4   0.50  0.30  0.20  0.02  0.01  0.03  1.00  0.20  0.02  0.02
5   0.50  0.20  0.30  0.02  0.03  0.01  0.20  1.00  0.01  0.03
9   0.02  0.01  0.03  0.50  0.30  0.20  0.02  0.01  1.00  0.20
10  0.02  0.03  0.01  0.50  0.20  0.30  0.02  0.03  0.20  1.00

\[
RHS = \begin{bmatrix} 0 & -2 & 2 & 20 & 18 & 22 & \# & \# & \# & \# \end{bmatrix}', \qquad
TrueBreedingValues = \begin{bmatrix} 0 & -2 & 2 & 20 & 18 & 22 & -2 & 2 & 18 & 22 \end{bmatrix}'
\]

Pedigree:

\[
LHS^{-1} = \begin{bmatrix}
0.59 & 0.12 & 0.12 & 0.00 & 0.00 & 0.00 & 0.29 & 0.29 & 0.00 & 0.00 \\
0.12 & 0.62 & 0.02 & 0.00 & 0.00 & 0.00 & 0.06 & 0.06 & 0.00 & 0.00 \\
0.12 & 0.02 & 0.62 & 0.00 & 0.00 & 0.00 & 0.06 & 0.06 & 0.00 & 0.00 \\
0.00 & 0.00 & 0.00 & 0.59 & 0.12 & 0.12 & 0.00 & 0.00 & 0.29 & 0.29 \\
0.00 & 0.00 & 0.00 & 0.12 & 0.62 & 0.02 & 0.00 & 0.00 & 0.06 & 0.06 \\
0.00 & 0.00 & 0.00 & 0.12 & 0.02 & 0.62 & 0.00 & 0.00 & 0.06 & 0.06 \\
0.29 & 0.06 & 0.06 & 0.00 & 0.00 & 0.00 & 1.65 & 0.15 & 0.00 & 0.00 \\
0.29 & 0.06 & 0.06 & 0.00 & 0.00 & 0.00 & 0.15 & 1.65 & 0.00 & 0.00 \\
0.00 & 0.00 & 0.00 & 0.29 & 0.06 & 0.06 & 0.00 & 0.00 & 1.65 & 0.15 \\
0.00 & 0.00 & 0.00 & 0.29 & 0.06 & 0.06 & 0.00 & 0.00 & 0.15 & 1.65
\end{bmatrix}
\]
\[
Solutions = \begin{bmatrix} 0.00 & -1.20 & 1.20 & 16.47 & 14.09 & 16.49 & 0.00 & 0.00 & 8.24 & 8.24 \end{bmatrix}'
\]

Genomic:

\[
LHS^{-1} = \begin{bmatrix}
0.5853 & 0.1219 & 0.1219 & 0.0012 & 0.0017 & 0.0017 & 0.2926 & 0.2926 & 0.0040 & 0.0040 \\
0.1219 & 0.6247 & 0.0094 & 0.0017 & -0.0006 & 0.0053 & 0.0992 & 0.0225 & -0.0014 & 0.0128 \\
0.1219 & 0.0094 & 0.6247 & 0.0017 & 0.0053 & -0.0006 & 0.0225 & 0.0992 & 0.0128 & -0.0014 \\
0.0012 & 0.0017 & 0.0017 & 0.5853 & 0.1219 & 0.1219 & 0.0040 & 0.0040 & 0.2926 & 0.2926 \\
0.0017 & -0.0006 & 0.0053 & 0.1219 & 0.6247 & 0.0094 & -0.0014 & 0.0128 & 0.0992 & 0.0225 \\
0.0017 & 0.0053 & -0.0006 & 0.1219 & 0.0094 & 0.6247 & 0.0128 & -0.0014 & 0.0225 & 0.0992 \\
0.2926 & 0.0992 & 0.0225 & 0.0040 & -0.0014 & 0.0128 & 1.6380 & 0.0539 & 0.0167 & 0.0108 \\
0.2926 & 0.0225 & 0.0992 & 0.0040 & 0.0128 & -0.0014 & 0.0539 & 1.6380 & -0.0092 & 0.0367 \\
0.0040 & -0.0014 & 0.0128 & 0.2926 & 0.0992 & 0.0225 & 0.0167 & -0.0092 & 1.6380 & 0.0539 \\
0.0040 & 0.0128 & -0.0014 & 0.2926 & 0.0225 & 0.0992 & 0.0108 & 0.0367 & 0.0539 & 1.6380
\end{bmatrix}
\]
\[
Solutions = \begin{bmatrix} 0.09 & -1.09 & 1.35 & 16.58 & 13.90 & 16.34 & 0.25 & 0.37 & 8.16 & 8.41 \end{bmatrix}'
\]

10 animal example

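Pedigree:

\[
u_{SonA4} = \left( LHS^{-1}_{SonA4,SireA} \times RHS_{SireA} \right)
+ \left( LHS^{-1}_{SonA4,SonA2} \times RHS_{SonA2} \right)
+ \left( LHS^{-1}_{SonA4,SonA3} \times RHS_{SonA3} \right)
\]
\[
u_{Son4} = (0.29 \times 0.00) + (0.06 \times -2.00) + (0.06 \times 2.00) = 0.00
\]

Genomic:

\[
u_{SonA4} = \left( LHS^{-1}_{SonA4,SireA} \times RHS_{SireA} \right)
+ \left( LHS^{-1}_{SonA4,SonA2} \times RHS_{SonA2} \right)
+ \left( LHS^{-1}_{SonA4,SonA3} \times RHS_{SonA3} \right)
+ \left( LHS^{-1}_{SonA4,SireB} \times RHS_{SireB} \right)
+ \left( LHS^{-1}_{SonA4,SonB6} \times RHS_{SonB6} \right)
+ \left( LHS^{-1}_{SonA4,SonB7} \times RHS_{SonB7} \right)
\]
\[
u_{Son4} = (0.2926 \times 0.00) + (0.0992 \times -2.00) + (0.0225 \times 2.00)
+ (0.0040 \times 20.00) + (-0.0014 \times 18.00) + (0.0128 \times 22.00) = 0.18
\]

Phenotypes are missing for the animals with the other non-zero coefficients, so those terms drop out.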
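A sketch of the 10-animal example without a mean, as in the equations [Z'Z + K^-1 lambda]^-1 Z'y. Two assumptions are made: lambda = 0.5 (not stated on the slides, but it reproduces the pedigree solutions shown), and animals are kept in pedigree order 1 to 10 rather than the reordered layout. The genomic solutions are printed for comparison only.

```python
import numpy as np

A5 = np.full((5, 5), 0.25)
A5[0, :] = A5[:, 0] = 0.5
np.fill_diagonal(A5, 1.0)

G5 = np.array([[1.00, 0.50, 0.50, 0.50, 0.50],
               [0.50, 1.00, 0.20, 0.30, 0.20],
               [0.50, 0.20, 1.00, 0.20, 0.30],
               [0.50, 0.30, 0.20, 1.00, 0.20],
               [0.50, 0.20, 0.30, 0.20, 1.00]])
# Between-family ("hidden") genomic relationships, rows FamA, columns FamB
C = np.array([[0.02, 0.02, 0.02, 0.02, 0.02],
              [0.02, 0.01, 0.03, 0.01, 0.03],
              [0.02, 0.03, 0.01, 0.03, 0.01],
              [0.02, 0.01, 0.03, 0.01, 0.03],
              [0.02, 0.03, 0.01, 0.03, 0.01]])

A = np.kron(np.eye(2), A5)                 # families unrelated by pedigree
G = np.block([[G5, C], [C.T, G5]])

# Phenotypes for animals 1, 2, 3 and 6, 7, 8; the rest are selection candidates
y_full = np.array([0, -2, 2, 0, 0, 20, 18, 22, 0, 0], dtype=float)
has_record = np.array([1, 1, 1, 0, 0, 1, 1, 1, 0, 0], dtype=float)
lam = 0.5                                  # assumed shrinkage parameter

def ebv(K):
    lhs = np.diag(has_record) + lam * np.linalg.inv(K)   # Z'Z + K^-1 lambda
    rhs = has_record * y_full                            # Z'y
    return np.linalg.solve(lhs, rhs)

u_ped = ebv(A)
u_gen = ebv(G)
print("pedigree:", u_ped.round(2))
print("genomic: ", u_gen.round(2))
```

With A, animals 4 and 5 receive no information from the other family because every cross-family relationship is zero. With G, the small cross-family relationships let phenotypes from FamB contribute.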

Matrix Inversion

•  http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/lecture-3-multiplication-and-inverse-matrices/

•  Minute 37 of this video from Gilbert Strang

•  Gauss-Jordan Elimination

Inversion by Gauss-Jordan

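\[
A = \begin{bmatrix} 1 & 3 \\ 2 & 7 \end{bmatrix}, \qquad
A^{-1} = \begin{bmatrix} 7 & -3 \\ -2 & 1 \end{bmatrix}, \qquad
I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
A^{-1}A = I = AA^{-1}
\]

Start with A augmented by I:
\[
\left[ \begin{array}{cc|cc} 1 & 3 & 1 & 0 \\ 2 & 7 & 0 & 1 \end{array} \right]
\]

Move 1: Subtract 2 of Row 1 from Row 2
\[
\left[ \begin{array}{cc|cc} 1 & 3 & 1 & 0 \\ 0 & 1 & -2 & 1 \end{array} \right]
\]

Move 2: Subtract 3 of Row 2 from Row 1
\[
\left[ \begin{array}{cc|cc} 1 & 0 & 7 & -3 \\ 0 & 1 & -2 & 1 \end{array} \right]
\]

The right-hand block is now A^{-1}.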
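The two moves above generalise to any invertible square matrix. The sketch below is one possible implementation; it adds row swaps (partial pivoting) for numerical safety, which the 2 x 2 slide example does not need. The function name is hypothetical.

```python
import numpy as np

def gauss_jordan_inverse(a):
    """Invert a square matrix by row-reducing [A | I] to [I | A^-1]."""
    a = np.asarray(a, dtype=float)
    n = a.shape[0]
    aug = np.hstack([a, np.eye(n)])
    for col in range(n):
        # Pick the largest pivot in this column (partial pivoting)
        pivot = col + np.argmax(np.abs(aug[col:, col]))
        if np.isclose(aug[pivot, col], 0.0):
            raise ValueError("matrix is singular")
        aug[[col, pivot]] = aug[[pivot, col]]
        aug[col] /= aug[col, col]              # scale pivot row so the pivot is 1
        for row in range(n):
            if row != col:
                aug[row] -= aug[row, col] * aug[col]   # clear the rest of the column
    return aug[:, n:]

A = np.array([[1.0, 3.0], [2.0, 7.0]])
A_inv = gauss_jordan_inverse(A)
print(A_inv)
```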



Gauss Seidel Residual Update

•  Easy efficient way to solve and understand genomic prediction equations

•  Steps
–  Form X'X (diagonal)
–  Form X'y
–  Initialize values for the betas
–  Assume the current values of the other beta_i are correct
–  Form a new y vector (called e) based on the residuals
–  Estimate a new solution for beta_i (X_i'e divided by X_i'X_i)
–  Repeat until convergence

•  Simple extension to Bayesian model

Legarra and Misztal (JDS 2008)
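The steps above can be sketched as follows. This is a minimal illustration, not the published implementation: the function name, the fixed iteration count and the per-effect `lam` vector (0 for fixed effects, lambda for shrunken effects) are assumptions. The data are the 8-animal, one-SNP example with lambda = 0.85.

```python
import numpy as np

def gs_residual_update(W, y, lam, n_iter=200):
    """Gauss-Seidel with residual update for (W'W + diag(lam)) b = W'y."""
    n, p = W.shape
    b = np.zeros(p)
    e = y - W @ b                              # residuals given current solutions
    wtw = np.einsum("ij,ij->j", W, W)          # diagonal of W'W
    for _ in range(n_iter):
        for i in range(p):
            # Right-hand side for effect i: residuals with effect i added back
            rhs = W[:, i] @ e + wtw[i] * b[i]
            new = rhs / (wtw[i] + lam[i])
            e -= W[:, i] * (new - b[i])        # update residuals in place
            b[i] = new
    return b

y = np.array([0.10, 0.70, 1.30, 0.65, 1.25, 0.12, 0.68, 1.20])
W = np.column_stack([np.ones(8), [0, 1, 2, 1, 2, 0, 1, 2]])  # intercept, SNP
lam = np.array([0.0, 0.85])                    # no shrinkage on the intercept

b = gs_residual_update(W, y, lam)
direct = np.linalg.solve(W.T @ W + np.diag(lam), W.T @ y)
print(b.round(2), direct.round(2))
```

Only the diagonal of W'W is ever formed, which is what makes the approach practical with many markers.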

Excel Sheet

•  Lots of little examples with Excel

GBLUP versus other methods

•  Genomic BLUP is the simplest method to do genomic evaluations

•  Algebraically identical to ridge regression

•  Ridge regression treats each marker as a random effect

•  Ridge regression has the same shrinkage parameter for each marker

•  Other methods allow heterogeneous shrinkage parameters
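The equivalence between ridge regression on markers and GBLUP can be checked numerically. The sketch below uses simulated genotypes and an unscaled G = MM', both assumptions made for illustration; with the usual scaling of G the same result holds after rescaling lambda. No fixed effects are fitted.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 20                                   # animals, markers
M = rng.integers(0, 3, size=(n, m)).astype(float)   # dosages 0, 1, 2
M -= M.mean(axis=0)                            # centre each marker
y = rng.normal(size=n)
lam = 2.0

# Ridge regression: one shrunken effect per marker, then sum over markers
snp_effects = np.linalg.solve(M.T @ M + lam * np.eye(m), M.T @ y)
gv_ridge = M @ snp_effects

# GBLUP: one genomic value per animal, using G = MM'
G = M @ M.T
gv_gblup = G @ np.linalg.solve(G + lam * np.eye(n), y)

print(np.abs(gv_ridge - gv_gblup).max())
```

The ridge system has one equation per marker and the GBLUP system one per animal, so the smaller of the two is usually the one solved in practice.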

Brief description of SNP models

•  Genomic selection prediction models treat markers as random effects

•  MARS treats markers as fixed effects

•  Random effects have two benefits
–  Trick to allow all markers to be fitted simultaneously
–  Shrinkage

•  Fixed effect models overestimate marker effects

•  Random effects models correct for this overestimation by shrinking marker effects back towards the mean of all marker effects

•  Shrinkage is proportional to the uncertainty in the marker effect (and a statistical prior)
–  More uncertainty = more shrinkage towards the mean
–  More information to estimate effect = less shrinkage

•  Ridge regression
–  All SNPs in the model
–  All have equal shrinkage parameter
–  Shrinkage parameter is estimated or set a priori

•  BayesA
–  All SNPs in the model
–  Each SNP has unique shrinkage parameter
–  Each shrinkage parameter estimated
–  Shape and scale parameters are fixed
–  Problem is it cannot shrink to zero

•  BayesLasso
–  Similar to BayesA
–  Each SNP has unique shrinkage parameter
–  Each shrinkage parameter estimated
–  Shape and scale parameters are estimated
–  Uses inverse Gaussian distribution instead of inverse chi square, which allows greater shrinkage towards zero

•  BayesB
–  Similar to BayesA except that only a proportion 1-π of SNPs are in the model
–  Thus can shrink SNPs to zero
–  Mixture model

•  BayesCpi
–  Has similarities to SnpBlup and BayesB
–  Estimates π
–  All SNPs have equal shrinkage parameter
–  Estimates shrinkage parameter
–  Mixture model


Summary of SNP models

•  Ridge regression
–  All SNPs have EQUAL shrinkage parameter

•  BayesA/BayesLasso achieve shrinkage
–  UNEQUAL shrinkage parameter for all SNPs

•  BayesB achieves shrinkage
–  Only including a proportion of SNPs in each round
–  These SNPs have UNEQUAL shrinkage parameter

•  BayesCpi achieves shrinkage
–  Only including a proportion of SNPs in each round
–  These SNPs have EQUAL shrinkage parameter

•  Models have an equivalence to genomic relationship matrix G = MHM'

Non-linear models

•  Reproducing kernel Hilbert space

•  Neural networks

•  Basically these bend the relationships in the genomic relationship matrix

•  Close relatives get more weight

•  Distant relatives less weight

•  Capture epistatic interactions that may be shared by close relatives but not by distant relatives

•  Perhaps useful for advanced yield trials

•  I favour simpler models
–  Prevents a pointless debate about which model
–  Genetic improvement is an additive thing
