![Page 1: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/1.jpg)
How to Measure Genetic Heterogeneity
International Workshop on Statistical-Mechanical Informatics
2009/09/13-2009/09/16
Unit of Statistical Genetics
Center for Genomic Medicine
Kyoto University
Ryo Yamada
![Page 2: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/2.jpg)
What is genetic heterogeneity?
![Page 3: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/3.jpg)
![Page 4: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/4.jpg)
![Page 5: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/5.jpg)
Biological Strategies
Landcover map by Environmental Research and Teaching at the University of Toronto
Living things cover the land.
![Page 6: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/6.jpg)
Slime mold changes its shape and moves
around, but it uses spores to reproduce.
Wikipedia
![Page 7: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/7.jpg)
Slime mold keeps looking for new (better?) conditions.
Space is too big to be covered completely.
Therefore, multiple places are selected
and they are bridged without a break.
Each part seems to act independently.
![Page 8: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/8.jpg)
[Figure label: Food]
Slime mold is clever enough to find the shortest route in the labyrinth.
Its strategy is being investigated as a new model for parallel computing systems.
![Page 9: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/9.jpg)
Phylogenetic tree
![Page 10: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/10.jpg)
![Page 11: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/11.jpg)
![Page 12: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/12.jpg)
![Page 13: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/13.jpg)
![Page 14: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/14.jpg)
LIFE keeps looking for new (better?)
conditions.
Space is too big to be covered completely.
Therefore, multiple places are selected
and they are bridged without a break.
Each part seems to act independently.
Phylogenetic tree
![Page 15: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/15.jpg)
They are bridged without a break.
WE are here because WE are all offspring of the “no-break” family, sharing the features of continuous LIFE.
![Page 16: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/16.jpg)
?? Features of LIFE ??
• Keeps looking for something.
• Accepts multiple conditions as good ones.
• Stays contiguous with each other.
• Acts independently.
![Page 17: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/17.jpg)
Slime mold is distributed in physical space.
LIFE is distributed in genetic space.
Distributions ~ Heterogeneity
![Page 18: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/18.jpg)
LIFE is distributed in genetic space.
What is genetic space?
![Page 19: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/19.jpg)
DNA molecules
4 letters, {A, T, G, C}
$L = 3 \times 10^9$ in length (Homo sapiens)
Sequence variations
$4^L$; $L = 1, 2, \ldots$
![Page 20: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/20.jpg)
Biological space is a part of physico-chemical space.
![Page 21: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/21.jpg)
Biological space is far smaller than chemical space, but still enormously big.
![Page 22: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/22.jpg)
Environmental fluctuations change the width of pathways in biological space.
![Page 23: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/23.jpg)
Nature Reviews Genetics 3, 380-390 (2002); doi:10.1038/nrg795
Genealogical trees, coalescent theory and the analysis of genetic polymorphisms
Inter-species
Phylogeny
Intra-species
Recombination Graph
Inter-species heterogeneity
Intra-species heterogeneity
![Page 24: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/24.jpg)
Mutant
Letters are changed
(Mutation)
Combinations of letters are changed
(Recombination)
![Page 25: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/25.jpg)
Letters are changed: Mutation
Combinations of letters are changed: Recombination
![Page 26: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/26.jpg)
$4^L \to 2^L$
![Page 27: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/27.jpg)
Space is too big to be covered completely.
$L = 3 \times 10^9$
Variable sites ~ $10 \times 10^6$
Population size of Homo sapiens ~ $6 \times 10^9$
$2^{10,000,000} \gg 6 \times 10^9$
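The size comparison on this slide can be checked with a little arithmetic. The sketch below takes the slide's round figures (about 10 million variable sites, two alleles each, and about 6 billion people) and compares orders of magnitude; the variable names are illustrative only.

```python
import math

variable_sites = 10_000_000   # ~10 x 10^6 variable sites (figure from the slide)
population = 6 * 10**9        # ~6 x 10^9 people (figure from the slide)

# 2**variable_sites is too large to print usefully, so count decimal digits.
digits_sequences = math.floor(variable_sites * math.log10(2)) + 1
digits_population = len(str(population))

print("digits in 2^10,000,000:", digits_sequences)
print("digits in 6 x 10^9:", digits_population)
```

The number of possible sequences has millions of decimal digits, while the population size has ten, which is why the space cannot be covered.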
![Page 28: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/28.jpg)
k sites → $2^k$ sequence variations
00…000 $p(1)$
00…001 $p(2)$
00…010 $p(3)$
00…011 $p(4)$
…
11…111 $p(2^k) = 1 - (p(1) + \cdots + p(2^k - 1))$
• $2^k - 1$ parameters
• Flat and equal
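As a small illustration of this listing, the following sketch enumerates all $2^k$ sequences for a small k and assigns the "flat and equal" frequencies. The function name `all_haplotypes` and the choice k = 3 are assumptions made for the example.

```python
from itertools import product

def all_haplotypes(k):
    """Return every 0/1 sequence of length k, ordered from 00...0 to 11...1."""
    return ["".join(bits) for bits in product("01", repeat=k)]

k = 3  # hypothetical small example
haplotypes = all_haplotypes(k)

# "Flat and equal": every sequence has the same frequency.
flat = {h: 1 / len(haplotypes) for h in haplotypes}

# The last frequency is fixed by the others because all frequencies sum to 1.
free_parameters = len(haplotypes) - 1

for h in haplotypes:
    print(h, flat[h])
print("free parameters:", free_parameters)
```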
![Page 29: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/29.jpg)
Genetic heterogeneity
Dependency or association among variable sites
How should the heterogeneity be summarized,
and with how many parameters?
![Page 30: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/30.jpg)
Pairwise relation: $r^2$
Variance-covariance matrix
• describes the heterogeneity with $k(k-1)/2$ parameters for individual pairs.
• predicts test statistics of associated markers in association studies.
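A minimal sketch of the pairwise measure, assuming haplotypes are given as equal-length strings of 0 and 1 and that $r^2$ is the usual squared correlation $D^2 / (p_i(1-p_i)p_j(1-p_j))$. The sample data and function name are hypothetical.

```python
from itertools import combinations

def r_squared(haplotypes, i, j):
    """r^2 between sites i and j; both sites must be polymorphic in the sample."""
    n = len(haplotypes)
    p_i = sum(h[i] == "1" for h in haplotypes) / n
    p_j = sum(h[j] == "1" for h in haplotypes) / n
    p_ij = sum(h[i] == "1" and h[j] == "1" for h in haplotypes) / n
    d = p_ij - p_i * p_j  # linkage disequilibrium coefficient D
    return d * d / (p_i * (1 - p_i) * p_j * (1 - p_j))

sample = ["000", "011", "100", "111"]  # hypothetical sample of 4 sequences
k = len(sample[0])

# One value per unordered pair of sites: k(k-1)/2 entries in total.
pairs = {(i, j): r_squared(sample, i, j) for i, j in combinations(range(k), 2)}

for pair, value in pairs.items():
    print(pair, value)
```

In this sample, sites 1 and 2 (0-based) always carry the same allele, while site 0 varies independently of both.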
![Page 31: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/31.jpg)
Ψ
• The power set of {1,2,…,k} consists of $2^k$ subsets.
∅
{1},{2},…,{k}
{1,2},{1,3},…,{2,3},{2,4},…,{k-1,k} → Pairwise
{1,2,3},{1,2,4},…,{2,3,4},…,{k-2,k-1,k}
…
{1,2,…,k}
• Hierarchical parameters in full.
Hyper-cubes or lattice
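The hierarchy of subsets can be listed level by level. This sketch groups the power set of {1, …, k} by subset size, so that level 2 is the pairwise level; k = 4 and the function name are choices made for the example.

```python
from itertools import combinations

def power_set_by_size(k):
    """Group the subsets of {1, ..., k} by size; level 0 holds the empty set."""
    sites = range(1, k + 1)
    return [list(combinations(sites, size)) for size in range(k + 1)]

levels = power_set_by_size(4)  # hypothetical small k
for size, subsets in enumerate(levels):
    print(size, subsets)

total = sum(len(subsets) for subsets in levels)
print("total subsets:", total)
```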
![Page 32: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/32.jpg)
Subsets with tandem elements in Ψ: {1,2}, {1,2,3}, {1,2,3,4}, …, {1,2,…,k}
Pairwise relation $r^2$: {1,2}, {1,3}, {1,4}, …, {1,k}
Tandem pairs are elements of both.
![Page 33: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/33.jpg)
One parameter for heterogeneity (1)
• Entropy
$H = -\sum_i p(i) \ln p(i)$.
Effective number of sites to describe heterogeneity.
$H = k \ln 2$ when all sites are independent and both alleles at every site have frequency 1/2.
$H = 0$ for a clone (no variation).
• Entropy-based standardized measure of allelic association: ε
ε = 0 when all sites are independent.
ε = 1 when only 2 types of sequence exist.
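The two limiting values of H can be reproduced numerically. The sketch below computes the entropy of observed sequence frequencies; reading the "effective number of sites" as H / ln 2 is an assumption of this example, and the sample populations are hypothetical.

```python
import math
from collections import Counter
from itertools import product

def entropy(haplotypes):
    """Shannon entropy (natural log) of the observed sequence frequencies."""
    n = len(haplotypes)
    return -sum((c / n) * math.log(c / n) for c in Counter(haplotypes).values())

k = 3
# All 2^k sequences equally frequent: independent sites, allele frequency 1/2.
uniform = ["".join(bits) for bits in product("01", repeat=k)]
# A clone: a single sequence type, no variation.
clone = ["010"] * 8

h_uniform = entropy(uniform)
h_clone = entropy(clone)

print(h_uniform, k * math.log(2))
print(h_clone)
print("effective number of sites (assumed H / ln 2):", h_uniform / math.log(2))
```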
![Page 34: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/34.jpg)
One parameter for heterogeneity (2)
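• Entropy-based measure of allelic association: ε
When $k = 2$,
$\varepsilon = r^2 = \sum \frac{(\mathrm{obs} - \mathrm{exp})^2}{\mathrm{exp}} = \sum \frac{\mathrm{obs}^2}{\mathrm{exp}} - 1$
• $r_k$ keeps the shape of the equation of $r^2$ and takes values from 0 to 1 for any $k$:
$r_k = \sum \frac{\mathrm{obs}^{1 + 1/(k-1)}}{\mathrm{exp}^{1/(k-1)}} - 1$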
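A sketch of the $r_k$ formula, under the assumption that "obs" is the observed frequency of each sequence and "exp" is the frequency expected under independence (the product of the single-site allele frequencies). The slide does not spell out these definitions, so treat them as this example's reading; the function name is hypothetical.

```python
from collections import Counter
from math import prod

def r_k(haplotypes):
    """r_k = sum(obs^(1 + 1/(k-1)) / exp^(1/(k-1))) - 1 over observed sequences."""
    n = len(haplotypes)
    k = len(haplotypes[0])
    a = 1 / (k - 1)
    # Frequency of allele "1" at each site.
    p1 = [sum(h[s] == "1" for h in haplotypes) / n for s in range(k)]
    total = 0.0
    for hap, count in Counter(haplotypes).items():
        obs = count / n
        # Expected frequency if the sites were independent (assumed meaning of "exp").
        exp = prod(p1[s] if hap[s] == "1" else 1 - p1[s] for s in range(k))
        total += obs ** (1 + a) / exp ** a
    return total - 1

two_types = ["000", "111"]                              # only 2 sequence types
independent = [f"{i:03b}" for i in range(8)]            # all 8 types, equal frequency
two_sites = ["00", "01", "11", "11"]                    # k = 2 reduces to r^2

print(r_k(two_types), r_k(independent), r_k(two_sites))
```

Sequences that are never observed contribute zero to the sum, so iterating over observed types is enough.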
![Page 35: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/35.jpg)
Space is too big to be covered completely.
$2^{10,000,000} \gg 6 \times 10^9$
Every sequence is unique.
Frequency is not useful.
![Page 36: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/36.jpg)
Sparse graph
• Sequences can be placed at the nodes of a k-dimensional hypercube.
• Graph distance between sequences is the number of mutations.
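On the hypercube, two sequences are adjacent when they differ at exactly one site, so the graph distance is the Hamming distance. A minimal sketch, with sequences written as 0/1 strings and hypothetical function names:

```python
def hamming(a, b):
    """Number of sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def neighbors(seq):
    """All sequences one mutation away: the adjacent nodes on the hypercube."""
    flip = {"0": "1", "1": "0"}
    return [seq[:i] + flip[seq[i]] + seq[i + 1:] for i in range(len(seq))]

print(hamming("0010", "0111"))
print(neighbors("000"))
```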
![Page 37: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/37.jpg)
Graph distance between sequences is the number of mutations.
What is the distance for recombination?
Recombination is a three-term relation.
But a graph represents two-term relations.
A more informative tool is necessary.
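To make the three-term nature concrete, the sketch below models a single crossover: a child is defined by two parents together, not by its relation to either one alone. The single-crossover model and the function names are simplifying assumptions for illustration.

```python
def recombine(parent_a, parent_b, cut):
    """Child that takes sites before `cut` from parent_a and the rest from parent_b."""
    return parent_a[:cut] + parent_b[cut:]

def is_single_crossover_child(child, parent_a, parent_b):
    """True if one crossover between the two parents (in either order) gives child."""
    k = len(child)
    return any(
        child == recombine(x, y, cut)
        for x, y in ((parent_a, parent_b), (parent_b, parent_a))
        for cut in range(k + 1)
    )

child = recombine("0000", "1111", 2)
print(child)
print(is_single_crossover_child("0011", "0000", "1111"))
print(is_single_crossover_child("0101", "0000", "1111"))
```

Here `0011` and `0101` are both two mutations away from each parent, yet only `0011` is reachable by one crossover, which a pairwise graph distance does not capture.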
![Page 38: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/38.jpg)
• Biological meaning of heterogeneity:
• LIFE does not want to lose variations, even when a significant part of it cannot survive, because the variations might be useful at some time.
![Page 39: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/39.jpg)
Survival curve of variable sites when a fraction of the population goes extinct
• Each sequence set draws a different survival curve.
• A measure to represent curves:
• The area above the curve.
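One way such a curve could be simulated is sketched below. The slide does not give the procedure, so everything here is an assumption: sequences are removed uniformly at random, a site "survives" if both alleles are still present among the survivors, and the area is taken between the curve and the line y = 1 by the trapezoidal rule. The population must contain at least one variable site.

```python
import random

def variable_sites(haplotypes):
    """Indices of sites at which both alleles are present."""
    if not haplotypes:
        return set()
    k = len(haplotypes[0])
    return {s for s in range(k) if len({h[s] for h in haplotypes}) > 1}

def survival_curve(haplotypes, fractions, trials=200, seed=0):
    """Mean share of originally variable sites still variable after random loss."""
    rng = random.Random(seed)
    original = variable_sites(haplotypes)
    n = len(haplotypes)
    curve = []
    for f in fractions:
        keep = n - round(f * n)  # number of sequences that survive
        share = 0.0
        for _ in range(trials):
            survivors = rng.sample(haplotypes, keep)
            share += len(variable_sites(survivors) & original) / len(original)
        curve.append(share / trials)
    return curve

def area_above(fractions, curve):
    """Trapezoidal area between the curve and the line y = 1."""
    return sum(
        (fractions[i + 1] - fractions[i]) * (1 - (curve[i] + curve[i + 1]) / 2)
        for i in range(len(fractions) - 1)
    )

# Hypothetical population of 8 sequences with 4 variable sites.
population = ["0000", "0011", "0101", "0110", "1001", "1010", "1100", "1111"]
fractions = [0.0, 0.25, 0.5, 0.75]
curve = survival_curve(population, fractions)
print(curve, area_above(fractions, curve))
```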
![Page 40: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/40.jpg)
Various ways to measure
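• Pairwise relation $r^2$: $k(k+1)/2$ parameters ($k(k-1)/2$ pairs plus $k$ single sites)
• Power set Ψ: $2^k - 1$ parameters, hierarchical
• Entropy $H$: 1 parameter
• Entropy-based ε: 1 parameter
• $r^2$-generalization $r_k$: 1 parameter
• Graph: mutation distance
• Graph+α: mutation distance plus recombination distance
• Survival curve: 1 parameter; simulation; mutation and recombination distance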
![Page 41: How to Measure Genetic Heterogeneity](https://reader035.vdocuments.us/reader035/viewer/2022081419/56812cd8550346895d919825/html5/thumbnails/41.jpg)
Unit of Statistical Genetics, Center for Genomic Medicine
Graduate School of Medicine, Kyoto University
http://www.genome.med.kyoto-u.ac.jp/wiki_tokyo/index.php/