Applying Haplotype Models to Association Study Design. Natalie Castellana, June 7, 2005
Post on 19-Dec-2015
Background
Certain characteristics are linked to genetic factors.
By finding models for genetic variation, we can determine which genes render an individual susceptible to a certain disease.
Normal Allele
Mutant allele
1000011010110100000010
Background (2)
SNP
Haplotype
Association Test: Given that this sample has haplotype 01101, does it have the disease?
…1110101…
…1000011…
Genetic Variation
Mutation:
…1000001…
Recombination:
…1110011…
…1000101…
…1001001…
Because of recombination, similar genetic variation can be found within closely linked regions.
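The two sources of variation above can be sketched on bit-string haplotypes. This is a minimal illustration, not the presenter's code: the function names `mutate` and `recombine`, the single-crossover model, and the chosen positions are assumptions. The inputs are the two example sequences shown on this slide.

```python
from typing import Tuple

def mutate(haplotype: str, position: int) -> str:
    """Flip the allele at one SNP position (0 <-> 1)."""
    flipped = "1" if haplotype[position] == "0" else "0"
    return haplotype[:position] + flipped + haplotype[position + 1:]

def recombine(parent_a: str, parent_b: str, crossover: int) -> Tuple[str, str]:
    """Single crossover (an assumed model): swap the tails after `crossover`."""
    child_1 = parent_a[:crossover] + parent_b[crossover:]
    child_2 = parent_b[:crossover] + parent_a[crossover:]
    return child_1, child_2

a, b = "1110101", "1000011"
print(mutate(b, 5))        # 1000001
print(recombine(a, b, 4))  # ('1110011', '1000101')
```

Each recombinant child keeps an unbroken run of one parent's alleles on either side of the crossover, which is why nearby SNPs tend to carry similar variation.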
Generating Data
Generate genetic segments.
Isolate the disease-causing allele, and segregate the case (diseased) samples from the control (healthy) samples.
…1 0 1 1 1 0 1 1 1 1…
…1 1 1 0 0 1 1 0 1 0 …
…0 0 1 1 0 0 1 1 0 1…
…1 0 1 0 1 1 0 0 1 0…
Control Case
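A rough sketch of the data-generation step follows. The slides do not say how the segments were simulated, so everything here is an assumption: alleles are drawn independently at random (which ignores linkage between SNPs), and the names `generate_samples`, `split_case_control`, `disease_snp` and `disease_allele` are hypothetical.

```python
import random

def generate_samples(n_samples, n_snps, seed=0):
    """Draw random 0/1 haplotypes. Assumption: independent SNPs, no linkage."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n_snps)] for _ in range(n_samples)]

def split_case_control(samples, disease_snp, disease_allele=1):
    """Cases carry the disease allele at `disease_snp`; controls do not."""
    cases = [s for s in samples if s[disease_snp] == disease_allele]
    controls = [s for s in samples if s[disease_snp] != disease_allele]
    return cases, controls

samples = generate_samples(n_samples=20, n_snps=10, seed=1)
cases, controls = split_case_control(samples, disease_snp=3)
print(len(cases), "cases,", len(controls), "controls")
```

A more realistic generator would simulate mutation and recombination over generations so that neighboring SNPs are correlated; the split into cases and controls would stay the same.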
Testing individual SNPs
Test each SNP in turn to determine which SNPs accurately separate the samples that have the disease from those that do not.
Case Control
10 11….. 010 0…
01 10...... 100 1…
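One common way to score each SNP is a Pearson chi-square test on the 2x2 table of allele (0/1) against status (case/control). The sketch below assumes that statistic without continuity correction; the helper names and the toy data are illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0  # a degenerate table carries no evidence of association
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

def snp_chi_squares(cases, controls):
    """Score every SNP column by how well its allele separates cases from controls."""
    scores = []
    for j in range(len(cases[0])):
        case_1 = sum(s[j] for s in cases)      # cases carrying allele 1
        ctrl_1 = sum(s[j] for s in controls)   # controls carrying allele 1
        scores.append(chi_square_2x2(case_1, len(cases) - case_1,
                                     ctrl_1, len(controls) - ctrl_1))
    return scores

cases = [[1, 0], [1, 1], [1, 0]]
controls = [[0, 0], [0, 1], [0, 0]]
print(snp_chi_squares(cases, controls))  # [6.0, 0.0]
```

In the toy data the first SNP separates the groups perfectly and scores highest, while the second is equally common in both groups and scores zero.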
Haplotype block method
Instead of looking at each individual SNP, we can look at groups of contiguous SNPs.
1101000000…11…
1101100100…01…
0111000000…10…
1101100100…00…
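A block-level test can be sketched by treating each distinct haplotype within a block as a category and running a chi-square test on the haplotype-by-status table. The slides do not specify how block boundaries are chosen, so the `start` and `end` arguments are assumed to be given; `block_chi_square` is a hypothetical name.

```python
from collections import Counter

def block_chi_square(cases, controls, start, end):
    """Pearson chi-square on the (haplotype x case/control) table for SNPs [start, end)."""
    n_case, n_ctrl = len(cases), len(controls)
    if n_case == 0 or n_ctrl == 0:
        return 0.0  # nothing to compare against
    case_counts = Counter(tuple(s[start:end]) for s in cases)
    ctrl_counts = Counter(tuple(s[start:end]) for s in controls)
    total = n_case + n_ctrl
    chi2 = 0.0
    for hap in set(case_counts) | set(ctrl_counts):
        row = case_counts[hap] + ctrl_counts[hap]
        for observed, col in ((case_counts[hap], n_case), (ctrl_counts[hap], n_ctrl)):
            expected = row * col / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2

cases = ["110", "110"]
controls = ["010", "011"]
print(block_chi_square(cases, controls, 0, 2))  # 4.0
```

Because the test looks at whole haplotypes, it can pick up an association carried by a combination of alleles that no single SNP in the block shows on its own.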
Blocks vs. SNPs
[Figure: High Return (Bounded Blocks), C = 15. Chi-square value (y-axis, 0 to 1000) against SNP location (x-axis, 1 to 1065), with two series: Blocks and SNPs.]
Haplotype motif method
A sequence is treated as a concatenation of segments (as in the block method), but the segment boundaries need not be conserved across sequences.
1101000000…1100100100…0111000000…1101100111…
Approximation Algorithm
General idea:
…1 0 0 0 1 …………………………………
c c c cc c c c
Pick the best partition, minimizing the number of motifs needed to explain all the data.
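The slides do not give the approximation algorithm itself, so the sketch below covers only one ingredient under stated assumptions: given a candidate motif set, it uses dynamic programming to split a single sequence into the fewest motifs. Choosing the motif set that minimizes the total count over all the data is the harder problem and is not attempted here. The name `min_motif_parse` and the example motifs are hypothetical.

```python
def min_motif_parse(sequence, motifs):
    """Split `sequence` into the fewest segments drawn from `motifs`.

    Returns the list of segments, or None if the motifs cannot cover the sequence.
    """
    n = len(sequence)
    best = [None] * (n + 1)  # best[i] = fewest segments covering sequence[:i]
    back = [None] * (n + 1)  # back[i] = start index of the last segment used
    best[0] = 0
    for i in range(1, n + 1):
        for m in motifs:
            j = i - len(m)
            if m and j >= 0 and best[j] is not None and sequence[j:i] == m:
                if best[i] is None or best[j] + 1 < best[i]:
                    best[i] = best[j] + 1
                    back[i] = j
    if best[n] is None:
        return None
    segments = []
    i = n
    while i > 0:  # walk the back-pointers from the end to recover the split
        segments.append(sequence[back[i]:i])
        i = back[i]
    return segments[::-1]

print(min_motif_parse("110100", ["1101", "00", "11", "01"]))  # ['1101', '00']
```

Unlike fixed blocks, the cut points here depend on the sequence being parsed, which matches the idea that motif boundaries are not conserved.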