  • Variance Heterogeneity in Genetic Mapping

    Robert Corty

    Curriculum of Bioinformatics and Computational Biology

    Valdar lab

    UNC Chapel Hill

    May 29, 2018

    1 / 64

  • Table of Contents

    1. Introduction to Genetic Mapping (10 minutes)

    2. Genetic mapping with the double generalized linear model (30 minutes)

    3. Genetic Mapping with the weighted linear mixed model (10 minutes)

    2 / 64

  • Introduction to Genetic Mapping Background

    Table of Contents

    1. Introduction to Genetic Mapping (10 minutes)
       Background
       Motivating Concept

    2. Genetic mapping with the double generalized linear model (30 minutes)

    3. Genetic Mapping with the weighted linear mixed model (10 minutes)

    3 / 64

  • Introduction to Genetic Mapping Background

    The Need for Genetic Mapping

    I’ve tried everything in the book, but the patient is still suffering. What else can I try?

    I’m a world-renowned expert in the field, but I can’t answer your question.

    Would be very helpful to know which genes and pathways are involved in disease pathophysiology.

    Image credit: shutterstock.com

    4 / 64

  • Introduction to Genetic Mapping Background

    Approaches to Genetic Mapping

    QTL = “quantitative trait locus”
    A genomic region containing factors that influence a quantitative trait of interest

    First step toward finding causal genes and regulatory factors, which teach us about (patho)physiology and can provide drug targets.

    Human health and disease:
      human studies: find QTL for the relevant trait directly
      model organisms: find QTL that influence a model trait

    5 / 64

  • Introduction to Genetic Mapping Background

    Genetic Mapping is Ubiquitous and Productive

    relevant for every organism (first page of PubMed results includes cow, wheat, maize, barley, soybean, petunia, pundamilia, horse, peanut, eucalyptus, catfish)

    relevant for (nearly) every disease and characteristic

    mouse
      the Jackson lab lists 12,397 QTL identified
      R/qtl has been cited 2,558 times

    human
      NHGRI lists 5,168 QTL identified in human
      GCTA has been cited 2,008 times

    6 / 64

  • Introduction to Genetic Mapping Background

    Genetic Mapping Procedure (Generic)

    1. Measure the trait of interest in a population of genetically diverse organisms

    2. Measure genetic variants at an appropriate density

    3. Apply an appropriate statistical test to each locus to quantify the evidence that it is a QTL

    7 / 64

  • Introduction to Genetic Mapping Motivating Concept

    Example QTL

    [Figure: two panels plotting trait against allele (A, B)]

    9 / 64

  • Introduction to Genetic Mapping Motivating Concept

    Variance Heterogeneity

    [Figure: trait plotted against allele (A, B)]

    may be interesting as a QTL itself
    may be useful to “accommodate”

    10 / 64

  • Genetic mapping with the double generalized linear model

    Table of Contents

    1. Introduction to Genetic Mapping (10 minutes)

    2. Genetic mapping with the double generalized linear model (30 minutes)
       Introduction to Linkage Disequilibrium Mapping
       Standard Approach to LD Mapping
       DGLM-based Approach to LD Mapping
       Bailey Reanalysis Identifies vQTL
       Kumar Reanalysis Identifies “mQTL”
       Leamy Reanalysis Identifies mQTL

    3. Genetic Mapping with the weighted linear mixed model (10 minutes)

    11 / 64

  • Genetic mapping with the double generalized linear model Introduction to Linkage Disequilibrium Mapping

    Terms

    SLM = standard linear model
      a linear model with a Gaussian error term

    DGLM = double generalized linear model
      a combination of two linear models, where one models patterns of mean heterogeneity and the other models patterns of variance heterogeneity (Smyth, 1989, JRSSB)

    Both of these models assume that, conditional on the locus and covariates, the phenotypes are independent.

    13 / 64

  • Genetic mapping with the double generalized linear model Introduction to Linkage Disequilibrium Mapping

    Relevant Mapping Populations

    F2 intercross
    backcross

    [Diagrams: in each design, inbred strains are crossed to produce the study population]

    Collaborative Cross
    Drosophila Synthetic Population Resource
    Requires careful and intentional breeding

    14 / 64

  • Genetic mapping with the double generalized linear model Standard Approach to LD Mapping

    Standard Approach to Linkage Disequilibrium (LD) Mapping

    1. Breed a mapping population
    2. Measure the trait of interest in each organism
    3. Measure genotypes at a relevant set of markers (often sparse)
    4. Infer haplotype probabilities at a dense set of loci
    5. At each locus, test whether haplotype influences trait mean

    16 / 64

  • Genetic mapping with the double generalized linear model Standard Approach to LD Mapping

    Example QTL Mapping Result

    Suto, 2013, J. Vet. Med. Sci.

    17 / 64

  • Genetic mapping with the double generalized linear model Standard Approach to LD Mapping

    What do the Data look like?

    [Figure: two panels plotting trait against allele (A, B), one for a typical locus and one for a QTL]

    LOD score = logarithm of the odds
    logarithm of the ratio of the likelihood of the alternative model to that of the null model

    $$\mathrm{LOD} = \log\left( \frac{p(y \mid \text{locus is QTL})}{p(y \mid \text{locus is not QTL})} \right)$$
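A minimal sketch of the LOD calculation for a single locus may help make the ratio concrete. It assumes a Gaussian trait, a categorical genotype, no covariates, and the usual base-10 convention for LOD scores; the function names and the toy data are hypothetical and not taken from the talk.

```python
import math

def gaussian_max_loglik(residuals):
    """Gaussian log-likelihood at the ML variance estimate, given residuals."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n  # ML estimate of the variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

def lod_score(trait, genotype):
    """Base-10 log likelihood ratio: genotype-specific means vs. one shared mean."""
    grand_mean = sum(trait) / len(trait)
    null_resid = [y - grand_mean for y in trait]

    # Alternative model: one mean per genotype class.
    groups = {}
    for y, g in zip(trait, genotype):
        groups.setdefault(g, []).append(y)
    group_mean = {g: sum(ys) / len(ys) for g, ys in groups.items()}
    alt_resid = [y - group_mean[g] for y, g in zip(trait, genotype)]

    ll_null = gaussian_max_loglik(null_resid)
    ll_alt = gaussian_max_loglik(alt_resid)
    return (ll_alt - ll_null) / math.log(10)  # convert natural log to log10

# Toy data (made up): the trait differs by allele at one locus but not the other.
trait = [9.8, 10.1, 10.4, 9.9, 12.0, 12.3, 11.8, 12.1]
qtl_genotype = ["A", "A", "A", "A", "B", "B", "B", "B"]
typical_genotype = ["A", "B", "A", "B", "A", "B", "A", "B"]

lod_qtl = lod_score(trait, qtl_genotype)
lod_typical = lod_score(trait, typical_genotype)
print(lod_qtl, lod_typical)
```

With a shared variance, the score reduces to (n/2) log10(RSS_null / RSS_alt), so the locus whose alleles separate the trait values scores far higher than the locus whose alleles do not.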

    18 / 64

  • Genetic mapping with the double generalized linear model Standard Approach to LD Mapping

    Standard Linear Model (SLM)

    [Diagram: covariate (effect β) and locus (effect α) each point to the trait mean]

    $$y_i \sim N(m_i, \sigma^2)$$

    with

    $$m_i = x_i^T \beta + q_i^T \alpha$$

    Legendre, 1805

    19 / 64

  • Genetic mapping with the double generalized linear model Standard Approach to LD Mapping

    SLM Test for mQTL

    [Diagram: covariate and locus each point to the trait mean]

    $$y_i \sim N(m_i, \sigma^2)$$

    null model:
    $$m_i = x_i^T \beta$$

    alternative model:
    $$m_i = x_i^T \beta + q_i^T \alpha$$

    alternative model on permuted data:
    $$m_i = x_i^T \beta + q_{\pi(i)}^T \alpha$$

    Churchill and Doerge, 1994, Genetics
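The permutation step can be sketched as follows. This is an illustrative simplification, not the talk's implementation: it assumes categorical genotypes, no covariates, a genome of five unlinked simulated loci, and an add-one correction to the empirical p-value. All names and data are hypothetical.

```python
import math
import random

def lod_score(trait, genotype):
    """LOD for genotype-specific means vs. one shared mean (Gaussian errors)."""
    n = len(trait)
    grand_mean = sum(trait) / n
    rss_null = sum((y - grand_mean) ** 2 for y in trait)
    groups = {}
    for y, g in zip(trait, genotype):
        groups.setdefault(g, []).append(y)
    rss_alt = 0.0
    for ys in groups.values():
        m = sum(ys) / len(ys)
        rss_alt += sum((y - m) ** 2 for y in ys)
    return 0.5 * n * math.log10(rss_null / rss_alt)

def genomewide_permutation_pvalue(trait, genotypes_by_locus, n_perm=500, seed=0):
    """Compare the observed maximum LOD with maxima obtained from permuted genotypes."""
    rng = random.Random(seed)
    n = len(trait)
    observed = max(lod_score(trait, g) for g in genotypes_by_locus)
    exceed = 0
    for _ in range(n_perm):
        pi = list(range(n))
        rng.shuffle(pi)  # one permutation pi(i), applied to every locus
        permuted_max = max(
            lod_score(trait, [g[pi[i]] for i in range(n)])
            for g in genotypes_by_locus
        )
        if permuted_max >= observed:
            exceed += 1
    # Add-one correction (an assumed convention) keeps the estimate above zero.
    return observed, (exceed + 1) / (n_perm + 1)

# Simulated example: locus index 2 shifts the trait mean by 2 units.
rng = random.Random(42)
n = 40
genotypes_by_locus = [[rng.choice("AB") for _ in range(n)] for _ in range(5)]
trait = [
    rng.gauss(0.0, 1.0) + (2.0 if genotypes_by_locus[2][i] == "B" else 0.0)
    for i in range(n)
]
observed_max_lod, p_value = genomewide_permutation_pvalue(trait, genotypes_by_locus)
print(observed_max_lod, p_value)
```

Permuting the genotype rows together, rather than locus by locus, preserves the correlation among loci, so the maximum statistic gives a genome-wide threshold.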

    20 / 64

  • Genetic mapping with the double generalized linear model DGLM-based Approach to LD Mapping

    Types of QTL

    [Figure, shown in four builds, each plotting trait against allele (A, B): mQTL; vQTL; gene-by-gene interaction; mvQTL]

    22 / 64

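  • Genetic mapping with the double generalized linear model DGLM-based Approach to LD Mapping

    Double Generalized Linear Model (DGLM)

    [Diagram: covariate and locus each point to both the trait mean (effects β and α) and the trait variance (effects γ and θ)]

    $$y_i \sim N(m_i, \exp(v_i))$$

    with

    $$m_i = x_i^T \beta + q_i^T \alpha$$
    $$v_i = z_i^T \gamma + q_i^T \theta$$

    Smyth, 1989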
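To make the model concrete, the sketch below evaluates the DGLM log-likelihood for given parameter values and simulates data from it. It is only an illustration under assumed inputs: the covariate (a hypothetical binary sex indicator), the 0/1 locus coding, and all parameter values are made up, and no fitting routine is shown.

```python
import math
import random

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def dglm_loglik(y, X, Z, Q, beta, alpha, gamma, theta):
    """Log-likelihood of y_i ~ N(m_i, exp(v_i)) with linear predictors m_i and v_i."""
    total = 0.0
    for yi, xi, zi, qi in zip(y, X, Z, Q):
        m = dot(xi, beta) + dot(qi, alpha)   # mean predictor
        v = dot(zi, gamma) + dot(qi, theta)  # log-variance predictor
        total += -0.5 * (math.log(2 * math.pi) + v + (yi - m) ** 2 / math.exp(v))
    return total

# Simulate a locus that changes the trait variance but not the trait mean.
rng = random.Random(0)
n = 400
X = [[1.0, float(rng.randint(0, 1))] for _ in range(n)]  # intercept, hypothetical sex covariate
Z = [[1.0] for _ in range(n)]                            # intercept only for the variance
Q = [[float(rng.randint(0, 1))] for _ in range(n)]       # locus coded 0/1
beta, alpha, gamma, theta = [10.0, 0.5], [0.0], [0.0], [1.5]

# rng.gauss takes a standard deviation, hence exp(v / 2).
y = [
    rng.gauss(dot(x, beta) + dot(q, alpha),
              math.exp(0.5 * (dot(z, gamma) + dot(q, theta))))
    for x, z, q in zip(X, Z, Q)
]

ll_true = dglm_loglik(y, X, Z, Q, beta, alpha, gamma, theta)
ll_homoskedastic = dglm_loglik(y, X, Z, Q, beta, alpha, gamma, [0.0])
print(ll_true, ll_homoskedastic)
```

Setting θ to zero recovers a constant-variance model, so the gap between the two log-likelihoods reflects how much the variance effect of the locus matters for these simulated data.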

    23 / 64

  • Genetic mapping with the double generalized linear model DGLM-based Approach to LD Mapping

    Three DGLM-based Tests

    mQTL test
    vQTL test
    mvQTL test

    [Diagrams: one panel per test, each relating covariate and locus to trait mean and trait variance]

    24 / 64

  • Genetic mapping with the double generalized linear model DGLM-based Approach to LD Mapping

    DGLM Test for mQTL

    [Diagram: covariate and locus each point to trait mean and trait variance]

    $$y_i \sim N(m_i, \exp(v_i))$$

    null model:
    $$m_i = x_i^T \beta$$
    $$v_i = z_i^T \gamma + q_i^T \theta$$

    alternative model:
    $$m_i = x_i^T \beta + q_i^T \alpha$$
    $$v_i = z_i^T \gamma + q_i^T \theta$$

    alternative model on permuted data:
    $$m_i = x_i^T \beta + q_{\pi(i)}^T \alpha$$
    $$v_i = z_i^T \gamma + q_i^T \theta$$

    25 / 64

  • Genetic mapping with the double generalized linear model DGLM-based Approach to LD Mapping

    DGLM Test for vQTL

    [Diagram: covariate and locus each point to trait mean and trait variance]

    $$y_i \sim N(m_i, \exp(v_i))$$

    null model:
    $$m_i = x_i^T \beta + q_i^T \alpha$$
    $$v_i = z_i^T \gamma$$

    alternative model:
    $$m_i = x_i^T \beta + q_i^T \alpha$$
    $$v_i = z_i^T \gamma + q_i^T \theta$$

    alternative model on permuted data:
    $$m_i = x_i^T \beta + q_i^T \alpha$$
    $$v_i = z_i^T \gamma + q_{\pi(i)}^T \theta$$

    26 / 64

  • Genetic mapping with the double generalized linear model DGLM-based Approach to LD Mapping

    DGLM Test for mvQTL

    [Diagram: covariate and locus each point to trait mean and trait variance]

    $$y_i \sim N(m_i, \exp(v_i))$$

    null model:
    $$m_i = x_i^T \beta$$
    $$v_i = z_i^T \gamma$$

    alternative model:
    $$m_i = x_i^T \beta + q_i^T \alpha$$
    $$v_i = z_i^T \gamma + q_i^T \theta$$

    alternative model on permuted data:
    $$m_i = x_i^T \beta + q_{\pi(i)}^T \alpha$$
    $$v_i = z_i^T \gamma + q_{\pi(i)}^T \theta$$

    27 / 64
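The three tests share one alternative model and differ only in which locus effects the null model drops. The sketch below computes the three likelihood ratios as LOD scores for a single locus. It is a simplified illustration, assuming a categorical genotype, intercept-only covariates, and a simple alternating update in place of a full DGLM fitting routine; the permutation step is omitted and all names and data are hypothetical.

```python
import math
import random

def max_loglik(y, mean_group, var_group, n_iter=200):
    """Maximized Gaussian log-likelihood when the mean is shared within
    mean_group labels and the variance is shared within var_group labels."""
    var = [1.0] * len(y)
    for _ in range(n_iter):
        # Update means: precision-weighted average within each mean group.
        num, den = {}, {}
        for yi, g, s2 in zip(y, mean_group, var):
            num[g] = num.get(g, 0.0) + yi / s2
            den[g] = den.get(g, 0.0) + 1.0 / s2
        mean = [num[g] / den[g] for g in mean_group]
        # Update variances: mean squared residual within each variance group.
        ss, cnt = {}, {}
        for yi, mi, g in zip(y, mean, var_group):
            ss[g] = ss.get(g, 0.0) + (yi - mi) ** 2
            cnt[g] = cnt.get(g, 0) + 1
        var = [ss[g] / cnt[g] for g in var_group]
    return sum(
        -0.5 * (math.log(2 * math.pi * s2) + (yi - mi) ** 2 / s2)
        for yi, mi, s2 in zip(y, mean, var)
    )

def dglm_lods(y, genotype):
    """LOD scores for the mQTL, vQTL, and mvQTL tests at one locus."""
    shared = ["all"] * len(y)
    ll_alt = max_loglik(y, genotype, genotype)  # locus affects mean and variance
    null_ll = {
        "mQTL": max_loglik(y, shared, genotype),   # null keeps the variance effect
        "vQTL": max_loglik(y, genotype, shared),   # null keeps the mean effect
        "mvQTL": max_loglik(y, shared, shared),    # null drops both effects
    }
    return {test: (ll_alt - ll0) / math.log(10) for test, ll0 in null_ll.items()}

# Simulated vQTL: same mean, different variance by genotype.
rng = random.Random(1)
genotype = ["A"] * 100 + ["B"] * 100
y = [rng.gauss(10.0, 1.0) for _ in range(100)] + [rng.gauss(10.0, 3.0) for _ in range(100)]
lods = dglm_lods(y, genotype)
print(lods)
```

Because the mvQTL null model is nested inside the other two null models, its LOD score is never smaller than the mQTL or vQTL score at the same locus.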

  • Genetic mapping with the double generalized linear model DGLM-based Approach to LD Mapping

    Results

    1. vQTL in Bailey et al. 2008: pattern is not typically sought
    2. mQTL in Kumar et al. 2013: correct for variance heterogeneity across QTL alleles
    3. mQTL in Leamy et al. 2000: correct for variance heterogeneity across a background factor, F1 father

    28 / 64

  • Genetic mapping with the double generalized linear model Bailey Reanalysis Identifies vQTL

    Table of Contents

    1. Introduction to Genetic Mapping (10 minutes)
       Background
       Motivating Concept

    2. Genetic mapping with the double generalized linear model (30 minutes)
       Introduction to Linkage Disequilibrium Mapping
       Standard Approach to LD Mapping
       DGLM-based Approach to LD Mapping
       Bailey Reanalysis Identifies vQTL
       Kumar Reanalysis Identifies “mQTL”
       Leamy Reanalysis Identifies mQTL

    3. Genetic Mapping with the weighted linear mixed model (10 minutes)
       Background
       Heteroskedastic LMM
       Simulation Results

    29 / 64

  • Genetic mapping with the double generalized linear model Bailey Reanalysis Identifies vQTL

    Original Study

    Intercrossed C57BL/6J and C58/J, yielding 362 F2’s.
    Measured six behavioral traits that model aspects of anxiety and exploratory behavior.
    Reported 7 QTL, but none for rearing behavior.

    Bailey et al., GB&B, 2008

    30 / 64

  • Genetic mapping with the double generalized linear model Bailey Reanalysis Identifies vQTL

    New vQTL

    [Figure: genome scans for Rearing Events by chromosome, shown in three builds: LOD score; −log10(p); and −log10(p) for the mQTL, vQTL, mvQTL, and SLM tests with α = 0.05 and α = 0.01 thresholds]

    Corty et al., 2018 under revision at G3

    31 / 64

  • Genetic mapping with the double generalized linear model Bailey Reanalysis Identifies vQTL

    Understanding the vQTL

    [Figure: Rearing Behavior (counts) by genotype (B6, Het, C58) at the Chr 2, 65Mb marker, by sex (female, male)]

    Corty et al., 2018 under revision at G3

    32 / 64

  • Genetic mapping with the double generalized linear model Bailey Reanalysis Identifies vQTL

    Understanding the vQTL (2)

    [Figure: Mean and Variance Effect Estimates for Rearing Events. SD estimate +/− 1 SE plotted against mean estimate +/− 1 SE, by genotype (B6, Het, C58) at the chr 2, 65Mb marker and by sex (female, male)]

    Corty et al., 2018 under revision at G3

    33 / 64

  • Genetic mapping with the double generalized linear model Bailey Reanalysis Identifies vQTL

    This vQTL is Robust...

    ...to log transform
    ...to inverse normal transform
    ...to square root transform
    ...to Poisson regression

    [Figure: four genome scans across chromosomes 1 to 19 and X, one per analysis above, each showing −log10(p) for the mQTL, vQTL, and mvQTL tests with α = 0.05 and α = 0.01 thresholds]

    Corty et al., 2018 under revision at G3
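For readers unfamiliar with these transforms, the sketch below applies each one to a small vector of counts. The details are assumptions rather than the settings used in the reanalysis: the log transform adds 1 to avoid log(0), and the inverse normal transform uses average ranks for ties with a 0.5 rank offset. The counts are made up.

```python
import math
from statistics import NormalDist

def inverse_normal_transform(values, offset=0.5):
    """Rank-based inverse normal transform; tied values receive their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the block of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tie block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - offset) / n) for r in ranks]

counts = [3, 0, 12, 7, 7, 25, 1]                  # hypothetical rearing counts
log_counts = [math.log(c + 1) for c in counts]    # +1 is an assumed offset
sqrt_counts = [math.sqrt(c) for c in counts]
int_counts = inverse_normal_transform(counts)
print(int_counts)
```

All three transforms preserve the ordering of the observations; they differ in how strongly they compress the upper tail, which is why agreement across them suggests a variance signal is not merely a byproduct of the trait's scale.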

    34 / 64

  • Genetic mapping with the double generalized linear model Kumar Reanalysis Identifies “mQTL”

    Original Study

    Intercrossed C57BL/6J and C57BL/6N, two “sister strains”
    Measured cocaine response and circadian behavior traits
    Reported 1 QTL for cocaine response, identified QTN
    No QTL for circadian behavior traits by standard analysis

    Kumar et al., Science, 2013

    36 / 64

  • Genetic mapping with the double generalized linear model Kumar Reanalysis Identifies “mQTL”

    Replicate Published QTL

    [Figure: genome scan for 30 Minute Cocaine Response, showing −log10(p) by chromosome for the mQTL, vQTL, mvQTL, and SLM tests with α = 0.05 and α = 0.01 thresholds]

    Kumar et al., Science, 2013

    37 / 64

  • Genetic mapping with the double generalized linear model Kumar Reanalysis Identifies “mQTL”

    New “mQTL”

    [Figure: genome scan for Circadian Wheel Running Activity (revolutions/minute), showing −log10(p) by chromosome for the mQTL, vQTL, mvQTL, and SLM tests with α = 0.05 and α = 0.01 thresholds]

    Corty et al., 2018 under revision at G3

    38 / 64

  • Genetic mapping with the double generalized linear model Kumar Reanalysis Identifies “mQTL”

    Understanding the new QTL

    [Figure: Circadian Wheel Running Activity (revolutions/minute). Average wheel speed by genotype (C57BL/6J, Het, C57BL/6N) at chr6, rs30314218, by sex (female, male)]

    Corty et al., 2018 under revision at G3

    39 / 64

  • Genetic mapping with the double generalized linear model Kumar Reanalysis Identifies “mQTL”

    Understanding the new QTL (2)

    [Figure: Mean and Variance Effect Estimates for Circadian Wheel Running Activity (revolutions/minute). SD estimate +/− 1 SE plotted against mean estimate +/− 1 SE, by genotype (C57BL/6J, Het, C57BL/6N) at rs30314218 and by sex (female, male)]

    Corty et al., 2018 under revision at G3

    40 / 64

  • Genetic mapping with the double generalized linear model Kumar Reanalysis Identifies “mQTL”

    Domain-specific view of the Activity Trait

    [Figure: three panels, each with a time axis running from 0h to 48h]

    Corty et al., 2018 under revision at G3

    41 / 64

  • Genetic mapping with the double generalized linear model Leamy Reanalysis Identifies mQTL

    Original Study

    Backcrossed CAST/Ei into M16i
    350 mice in mapping population, 92 markers
    Skull morphometrics, limb bone lengths, organ and body weight
    Published many QTL, but none for body weight

    Leamy et al., Genet. Res., 2000, Physiol. Genom., 2002, Yi et al., Genetics, 2005

    43 / 64

  • Genetic mapping with the double generalized linear model Leamy Reanalysis Identifies mQTL

    Body Weight at Three Weeks

    [Figure: Residuals from Standard Linear Model of Bodyweight at Three Weeks. SLM residual plotted by father (1 to 9), with a DGLM weight legend running from 0.5 to 2.0]

    Corty et al., 2018 under revision at G3

    44 / 64

  • Genetic mapping with the double generalized linear model Leamy Reanalysis Identifies mQTL

    New mQTL

    [Figure: genome scan for Bodyweight at Three Weeks, showing −log10(p) by chromosome for the mQTL, vQTL, mvQTL, and traditional tests with α = 0.05 and α = 0.01 thresholds]

    Corty et al., 2018 under revision at G3

    45 / 64

  • Genetic mapping with the double generalized linear model Leamy Reanalysis Identifies mQTL

    Understanding the new QTL

    [Figure: bw3wk by genotype (AA, AB) at D11MIT11, by sex (female, male)]

    Corty et al., 2018 under revision at G3

    46 / 64

  • Genetic mapping with the double generalized linear model Leamy Reanalysis Identifies mQTL

    Understanding the new QTL (2)

    [Figure: Effects of father and D11MIT11 on Bodyweight at Three Weeks. SD estimate +/− 1 SE plotted against mean estimate +/− 1 SE, by father (1 to 9) and by D11MIT11 genotype (AA, AB)]

    Corty et al., 2018 under revision at G3

    47 / 64

  • Genetic mapping with the double generalized linear model Leamy Reanalysis Identifies mQTL

    Summary — What is a QTL and what’s not?

    *BVH

    non-QTL QTL

    48 / 64

  • Genetic Mapping with the weighted linear mixed model Background

    Motivating Observation

    [Figure: Total Distance (axis from 0 to 15000) plotted by Strain for a panel of inbred mouse strains]


    Relevant Populations

    SLM and DGLM are only appropriate when individuals are equally related, which is rare in populations not intentionally bred for linkage disequilibrium mapping.

    The linear mixed model (LMM) accommodates this “differential relatedness” with a random effect term whose covariance is patterned after genomic similarity.


    Linear Mixed Model

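    Consider an association mapping study with T = 4 strains. Let y be the vector of estimated strain means.

    $$y = x\beta + \alpha + \varepsilon$$

    with

    $$\alpha \sim N(0, K\tau^2), \qquad \varepsilon \sim N(0, I\sigma^2)$$

    where K is the genomic similarity matrix and I is the identity matrix, so that $I\sigma^2$ is the residual covariance matrix. The covariance of y and the heritability are

    $$V = K\tau^2 + I\sigma^2, \qquad h^2 = \frac{\tau^2}{\tau^2 + \sigma^2}$$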
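    As a rough illustration of the model above, the following sketch draws one vector of strain means from the LMM and computes V and h². The similarity matrix, genotype vector, and parameter values are made-up numbers for T = 4 strains, and `simulate_lmm` is a hypothetical helper, not code from the talk.

```python
import numpy as np

def simulate_lmm(x, K, beta, tau2, sigma2, rng):
    """Draw y = x*beta + alpha + eps with alpha ~ N(0, K tau2), eps ~ N(0, I sigma2)."""
    n = K.shape[0]
    V = tau2 * K + sigma2 * np.eye(n)        # V = K tau^2 + I sigma^2
    y = rng.multivariate_normal(mean=x * beta, cov=V)
    h2 = tau2 / (tau2 + sigma2)              # heritability
    return y, V, h2

# Hypothetical inputs for T = 4 strains (illustrative values only)
rng = np.random.default_rng(1)
K = np.array([[1.0, 0.5, 0.1, 0.0],
              [0.5, 1.0, 0.1, 0.0],
              [0.1, 0.1, 1.0, 0.3],
              [0.0, 0.0, 0.3, 1.0]])
x = np.array([0.0, 0.0, 1.0, 1.0])           # genotype at one locus
y, V, h2 = simulate_lmm(x, K, beta=2.0, tau2=1.0, sigma2=3.0, rng=rng)
```

    Sampling α and ε separately would give the same distribution; drawing directly from N(xβ, V) just makes the role of V explicit.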


    Fitting the LMM

    For each genetic locus, use Brent’s method to find the ML value of $h^2$. For each fixed value of $h^2$, the LMM can be fit by generalized least squares (GLS). For each $h^2$...

    1. Find a matrix M such that $M^T M = V^{-1}$.
    2. Use M to fit the model by GLS.
    3. Add in log |V| to “back out” the LMM solution.

    Very easy and fast if you know M!
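    The three steps can be sketched as follows for the homoskedastic case. This is a minimal sketch under stated assumptions, not the implementation used in the talk: it assumes V is written as (τ² + σ²)(h²K + (1 − h²)I), takes M from the eigendecomposition of K, profiles out β and the total variance, and uses SciPy's bounded scalar minimizer (a Brent-type method) for the search over h². The function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def whitening_matrix(eigvals, eigvecs, h2):
    """Step 1: M with M.T @ M = (h2*K + (1-h2)*I)^-1, where K = eigvecs @ diag(eigvals) @ eigvecs.T."""
    scale = 1.0 / np.sqrt(h2 * eigvals + (1.0 - h2))
    return scale[:, None] * eigvecs.T

def profile_neg_loglik(h2, y, X, eigvals, eigvecs):
    """Negative ML log-likelihood at a fixed h2, with beta and total variance profiled out."""
    n = y.shape[0]
    M = whitening_matrix(eigvals, eigvecs, h2)
    y_w, X_w = M @ y, M @ X                       # Step 2: GLS is OLS on whitened data
    beta, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    resid = y_w - X_w @ beta
    s2 = resid @ resid / n                        # ML estimate of tau^2 + sigma^2
    logdet = np.sum(np.log(h2 * eigvals + (1.0 - h2)))  # Step 3: log-determinant term
    return 0.5 * (n * np.log(2.0 * np.pi * s2) + logdet + n)

def fit_lmm(y, X, K):
    """Search over h2 in (0, 1); returns (h2_hat, maximized log-likelihood)."""
    eigvals, eigvecs = np.linalg.eigh(K)          # computed once, reused for every h2
    res = minimize_scalar(profile_neg_loglik, bounds=(1e-6, 1.0 - 1e-6),
                          method="bounded", args=(y, X, eigvals, eigvecs))
    return res.x, -res.fun
```

    In a genome scan, `fit_lmm` would be called once per locus with that locus's genotype as a column of `X`; because K does not change across loci, its eigendecomposition could be computed once and shared.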


  • Genetic Mapping with the weighted linear mixed model Heteroskedastic LMM

    Considering Heteroskedastic Residuals

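    Homoskedastic LMM, M is known (Kang, Genetics, 2008):

    $$\alpha \sim N(0, K\tau^2), \qquad \varepsilon \sim N(0, I\sigma^2)$$

    Heteroskedastic LMM, previously no M was known:

    $$\alpha \sim N(0, K\tau^2), \qquad \varepsilon \sim N(0, D\sigma^2)$$

    where D is a diagonal matrix whose entries need not be equal.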


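    $M_{\text{hom}}$ and $M_{\text{het}}$

    $$M_{\text{hom}} = \left(h^2 \Lambda_K + (1 - h^2) I\right)^{-1/2} U_K^T$$

    $$M_{\text{het}} = \left(h^2 \Lambda_L + (1 - h^2) I\right)^{-1/2} U_L^T D^{-1/2}$$

    where $\Lambda_X$ and $U_X$ are the eigenvalues and eigenvectors of X and:

    $$L = D^{-1/2} K D^{-1/2}$$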
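    A small numerical check of the heteroskedastic formula may help. The sketch below builds M_het from K, the diagonal of D, and h², assuming the covariance is proportional to h²K + (1 − h²)D. The function name `m_het` and the argument `d` (the diagonal of D as a 1-D array) are illustrative choices, not names from the talk.

```python
import numpy as np

def m_het(K, d, h2):
    """M_het = (h2*Lambda_L + (1-h2)*I)^(-1/2) @ U_L.T @ D^(-1/2), with L = D^(-1/2) K D^(-1/2).

    d holds the (positive) diagonal entries of D.
    """
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = d_inv_sqrt[:, None] * K * d_inv_sqrt[None, :]   # D^(-1/2) K D^(-1/2)
    lam, U = np.linalg.eigh(L)                          # eigenvalues and eigenvectors of L
    scale = 1.0 / np.sqrt(h2 * lam + (1.0 - h2))
    return (scale[:, None] * U.T) * d_inv_sqrt[None, :]
```

    Setting `d` to a vector of ones makes L equal to K, so the same function reproduces M_hom. Multiplying `M.T @ M` out and comparing it with the inverse of `h2 * K + (1 - h2) * np.diag(d)` is a quick way to confirm the construction on a small example.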


  • Genetic Mapping with the weighted linear mixed model Simulation Results

    Example QQ Plot

    Null QQ plot for GWAS with 100 strains and $h^2 = 0.5$

    [Figure: QQ plot of observed versus theoretical quantiles, both axes from 0.000 to 0.010; legend: LM, EMMA, ISAM, wISAM.]


    QQ Plots for a GWAS with 50 Strains


    Example ROC Plot

    [Figure: ROC curves, True Positive Rate versus False Positive Rate (both axes from 0.00 to 1.00), for the tests SLM, EMMA, ISAM, and wISAM.]


    Acknowledgments

    Valdar lab: Will Valdar, Greg Keel, Paul Maurizio, Dan Oreper, Wes Crouse, Yanwei Cai, Kathie Sun

    Tarantino lab: Lisa Tarantino, Sarah Schoenrock

    dissertation committee: Fernando, Will, Lisa, Yun, Jim

    MDPhD program: Gene Orringer, Mohanish Deshmukh, Toni Darville, Alison Regan, Carol Herion

    BCB program: Tim Elston, Cara Marlow, John Cornett

    friends and family: mom, dad, grandparents, Edward, Maithri, Mike, Brooke, Marni, Chris, Kelly, Patrick, Lee, Sarah

    funders: NIMH RC: F30; NIGMS WV: R01, MIRA; NIGMS TE, WV: T32; NIGMS GO, MD, TD: T32
