Copy Number Variations and Association Mapping

02-715 Advanced Topics in Computational Genomics

SNP and CNV Genotyping

• SNP genotyping assumes two copies at each locus (i.e., no CNVs)

• CNV genotyping assumes no SNPs in the region

• However, SNPs and CNVs coexist throughout the genome

• Ignoring either SNPs or CNVs will result in genotyping errors

– E.g., genotypes such as AAB (three copies) or A (one copy) cannot be called correctly

Birdsuite

• Joint estimation of SNP calls and CNV calls by combining SNP and CNV probe information

– Genotyping of known (previously catalogued) CNVs

– Discovery of novel and rare CNVs

• Association analysis that incorporates both SNP and CNV information

Birdsuite

• Canary: assigns copy number for known common copy number polymorphisms (CNPs)

• Birdseed: assigns genotypes for SNPs

• Birdseye: detects novel and rare CNVs

• Fawkes: integrates SNPs, CNPs and CNVs

Birdsuite

Detecting CNPs

• Probe intensities for CNPs across multiple individuals

Detecting CNPs

• Correlated CNP probe intensities across neighboring genome regions

Detecting CNPs

• One-dimensional mixture model

– Mean: intensity location of each copy number of the CNP

– Variance around each copy number

– EM algorithm to estimate the parameters
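The mixture-model fit described on this slide can be sketched as follows. This is a minimal illustration of EM for a one-dimensional Gaussian mixture, not Canary's actual implementation: the function names, the choice of one Gaussian component per copy-number class, and the starting variance heuristic are all assumptions made for the example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component for intensity x."""
    dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(dens)
    if total == 0.0:  # every density underflowed; fall back to uniform
        return [1.0 / len(dens)] * len(dens)
    return [d / total for d in dens]


def fit_mixture_em(intensities, init_means, n_iter=100, min_var=1e-6):
    """Fit a 1-D Gaussian mixture by EM, one component per assumed copy number."""
    n, k = len(intensities), len(init_means)
    grand_mean = sum(intensities) / n
    grand_var = sum((x - grand_mean) ** 2 for x in intensities) / n
    means = list(init_means)
    # Assumed heuristic: start each component narrower than the pooled spread.
    variances = [max(grand_var / k ** 2, min_var)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = [posteriors(x, weights, means, variances) for x in intensities]
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # empty component; keep its previous parameters
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, intensities)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, intensities)) / nj
            variances[j] = max(var_j, min_var)
    return weights, means, variances


def call_component(x, weights, means, variances):
    """Index of the most probable component (copy-number class) for intensity x."""
    post = posteriors(x, weights, means, variances)
    return max(range(len(post)), key=post.__getitem__)
```

Given per-sample summary intensities for one CNP and rough starting means for each copy-number class, `fit_mixture_em` returns the fitted parameters and `call_component` assigns a sample to its most probable class.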

Detecting CNPs

• Mixture models in different populations

Anomalous CNPs for YRI

Birdseed: Genotyping SNPs

• SNP probe intensities across samples at a locus

Birdseed: Genotyping SNPs

• Combining SNP and CNP probe data

• Two-dimensional mixture model for two-copy genotypes

Birdseed: Genotyping SNPs

• Mixture component for minor allele homozygous sites may be hard to detect if minor allele frequency is low

Birdseed: Genotyping SNPs

• Imputing mixture components for SNP genotypes of other CNPs

Birdseye: Detecting De Novo CNVs

• Combine information from Canary and Birdseed

[Figure panels: CNV probes, SNP probes]

Birdseye: Detecting De Novo CNVs

• Combine information from Canary and Birdseed

Birdseye: Detecting De Novo CNVs

[Figure panels: Parents, Child]

Evaluation

• No ground truth is available. However, the rate of Mendelian inconsistency (MI) in HapMap trio samples can be used for evaluation: a child's genotype should be explainable by its parents' genotypes.

Evaluation

• Birdsuite vs. Birdseed: the rate of Mendelian inconsistency (MI) in SNPs that overlap a known CNP, for 91 children
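To make the MI criterion concrete, here is a small sketch of a Mendelian-consistency check for one SNP in a trio. It assumes plain two-copy, unphased genotypes written as two-character strings such as "AB"; the function names and the data layout are hypothetical, and copy-number-aware genotypes (such as A or AAB) would need a richer representation than this.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """True if the child's genotype can be formed from one allele of each parent.

    Genotypes are unphased two-character strings, e.g. "AA", "AB", "BB".
    """
    possible = {"".join(sorted(p + m)) for p, m in product(father, mother)}
    return "".join(sorted(child)) in possible


def mi_rate(trios):
    """Fraction of (father, mother, child) genotype triples that are inconsistent."""
    n_bad = sum(1 for f, m, c in trios if not is_mendelian_consistent(f, m, c))
    return n_bad / len(trios)
```

One way a CNV-blind caller produces MI: a father carrying B on one chromosome and a deletion on the other is called BB, the mother is AA, and a child who inherits the deletion and one A is called AA. The called trio (BB, AA, AA) fails the check even though inheritance was normal.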

Association Analysis with SNPs and CNVs

• At each locus, we have both SNP and CNV information and want to incorporate both in the association test

– How can we disentangle the effects of SNPs and CNVs?

– If allele B has lower activity than allele A, genotypes A and BB may have similar phenotypes

Association Analysis with SNPs and CNVs

• Assuming A and B represent the copy numbers of the two SNP alleles, we fit a regression model

• A+B: total copy number

• A-B: SNP genotype

• b1: CNV effect

• b2: SNP effect
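The regression equation itself did not survive in this transcript. A model of the following form would be consistent with the term definitions listed on the slide; the phenotype symbol y, the intercept b_0, and the error term are assumptions added here for completeness.

```latex
y = b_0 + b_1 (A + B) + b_2 (A - B) + \varepsilon
```

Under this form, b_1 multiplies the total copy number and b_2 multiplies the allelic difference, which is what lets the two effects be estimated separately.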

Association Analysis with SNPs and CNVs

• Assuming A and B represent the copy numbers of the two SNP alleles, we fit a regression model

– When there is no copy number variation at the locus, the model reduces to a regression model with only SNP effects

– When there is no SNP genotype variation at the locus, the model reduces to a regression on CNVs only
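As an illustration of fitting such a model, the sketch below runs ordinary least squares on an intercept, A+B, and A-B. It assumes the model has the form y = b0 + b1(A+B) + b2(A-B), which is inferred from the term definitions rather than stated in the transcript; the function names are hypothetical, and the solver is a plain normal-equations implementation chosen only to keep the example dependency-free.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_snp_cnv_model(a_counts, b_counts, phenotypes):
    """Least-squares estimates [b0, b1, b2] for y = b0 + b1*(A+B) + b2*(A-B)."""
    rows = [[1.0, a + b, a - b] for a, b in zip(a_counts, b_counts)]
    p = 3
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, phenotypes)) for i in range(p)]
    return solve_linear_system(xtx, xty)
```

If every sample has exactly two copies, the A+B column is constant and collinear with the intercept, so this sketch raises `ValueError`. Dropping that column leaves the SNP-only regression, and the analogous reduction applies when A-B carries no information beyond A+B.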

Simulation Study

• Scenarios to be considered

– Deletion: genotypes {A, B, -} at candidate locus

– Duplication: genotypes {A, B, BB} at candidate locus

– Fix the frequency of B alleles and duplication/deletion events

• Different association tests
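The two scenarios could be simulated along the following lines. This sketch reads {A, B, -} and {A, B, BB} as the possible states of a single chromosome and draws two chromosomes independently per individual; that reading, the function name, and the frequencies in the usage note are assumptions, since the slides give neither the sampling scheme nor the phenotype model.

```python
import random

# Copies of allele A and allele B contributed by each chromosome state.
ALLELE_COPIES = {"A": (1, 0), "B": (0, 1), "-": (0, 0), "BB": (0, 2)}


def simulate_genotypes(state_freqs, n_individuals, seed=0):
    """Draw (A copies, B copies) per individual from two independent chromosomes.

    state_freqs maps chromosome states (keys of ALLELE_COPIES) to frequencies.
    """
    rng = random.Random(seed)
    states = list(state_freqs)
    weights = [state_freqs[s] for s in states]
    genotypes = []
    for _ in range(n_individuals):
        pair = rng.choices(states, weights=weights, k=2)
        a = sum(ALLELE_COPIES[s][0] for s in pair)
        b = sum(ALLELE_COPIES[s][1] for s in pair)
        genotypes.append((a, b))
    return genotypes
```

For example, `simulate_genotypes({"A": 0.6, "B": 0.3, "-": 0.1}, 1000)` gives a deletion scenario and `simulate_genotypes({"A": 0.6, "B": 0.3, "BB": 0.1}, 1000)` a duplication scenario. The resulting (A, B) pairs supply the A+B and A-B terms for whichever association tests are being compared.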

Association Analysis with SNPs and CNVs

• Simulation study results

SNP and CNV Associations

• CNVs have been implicated in rare genomic disorders

• CNVs have been implicated in only a few percent of the 2000 or more Mendelian diseases

• Complex diseases might be more susceptible to 'soft' forms of variation (variation in noncoding sequences and copy number variation)

• In an eQTL study, SNPs and CNVs were associated with 83% and 18% of the gene expression traits, respectively

– Potentially greater role of SNPs

– Possible underestimation of CNV effects: a more extensive catalogue of CNVs is needed

Association Studies with CNVs

• Gender artifact for dispersed duplications: males/females are not equally represented in case and control groups

Summary

• Birdsuite determines the genotypes based on both CNVs and SNPs

• The combined genotypes for CNVs and SNPs can be used to disentangle the effects of SNPs and CNVs on phenotypes