![Page 1: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/1.jpg)
Disease Gene Candidate Prioritization by Integrative Biology
Table of contents:
Where are we in the course pipeline?
Maturation of the project and the project description
Background
Networks – deducing functional relationships from PPI data networks
Protein interaction networks
Functional modules / network clusters
Phenotype association
Grouping disorders based on their phenotype.
Biological implications of phenotype clusters.
Method and examples
Integrating protein interaction data and phenotype associations in an automated large scale disease gene finding platform
![Page 2: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/2.jpg)
The DNA Array Analysis Pipeline
Question / Experimental Design
Array design / Probe design
Buy Chip/Array
Sample Preparation / Hybridization
Image analysis
Normalization
Expression Index Calculation
Comparable Gene Expression Data
Statistical Analysis / Fit to Model (time series)
Advanced Data Analysis: Clustering, PCA, Classification, Promoter Analysis, Meta analysis, Survival analysis, Regulatory Network
![Page 3: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/3.jpg)
Project description
Søren Brunak, Professor, center director, Dr. Phil, PhD
Niels Tommerup, Professor, Centre Director, Dr. Med.
Master's thesis, Sept. 2003
![Page 4: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/4.jpg)
Project description
Project: Find disease genes using bioinformatics
Søren Brunak, Professor, center director, Dr. Phil, PhD
Niels Tommerup, Professor, Centre Director, Dr. Med.
![Page 5: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/5.jpg)
Project description
Project: Disease gene candidate prioritization by integrating protein interaction and phenotype association data.
Søren Brunak, Professor, center director, Dr. Phil, PhD
Niels Tommerup, Professor, Centre Director, Dr. Med.
![Page 6: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/6.jpg)
Project description
Abstract
The availability of the first draft of the human genome in 2001 (Venter, Adams et al. 2001) led to an increase in the number of methods for disease gene identification. However, the number of candidates in most loci linked to a particular phenotype is typically in the hundreds (McCarthy, Smedley et al. 2003; van Driel, Cuelenaere et al. 2003), and the underlying genes in over 900 of the ~2550 loci associated with a phenotype in the “Online Mendelian Inheritance in Man” (OMIM) database have not yet been identified (Hamosh, Scott et al. 2005). Disease gene identification evidently remains a strenuous challenge, since mutational analysis of hundreds of candidates in a critical interval is extremely resource demanding with currently available methods. Prioritising the candidates based on different criteria, followed by an extensive investigation of the promising ones, is therefore a logical step in the disease gene finding process. With the advent of proteomics, we are now able to retrieve information on gene function on a large scale, thus bridging the gap between genotype and phenotype, a possibility of significant interest for disease gene candidate prioritization.
We propose that automated correlation of phenotype association networks with interolog data (the transfer of protein interactions between orthologous protein pairs in different organisms) is a powerful way of identifying good disease gene candidates in a large list of genes in loci associated with a phenotype. Our method automatically identifies potential functional modules consisting of protein components, where at least one of the components is a disease-related protein. When such incriminated modules are identified, the remaining protein components of the module are correlated with loci in the genome associated with a similar phenotype.
A hit is reported if other protein components of the incriminated module are the products of genes in loci associated with an identical or overlapping phenotype. Using this large-scale approach we show that a gene in a locus is a heavily incriminated candidate if its protein product interacts with a protein involved in a similar or identical phenotype, and we publish a list of 60 likely candidates in various disorders.
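The hit rule described in the abstract can be illustrated with a minimal Python sketch. It is not the project's implementation: the function name `find_hits`, the data layout, and the choice to treat a protein's direct interaction partners as its "module" are all assumptions made for illustration, and the similarity test is passed in as a placeholder function.

```python
from collections import defaultdict


def find_hits(interactions, disease_proteins, locus_genes, similar):
    """Report candidates that interact with a protein causing a similar phenotype.

    interactions     : iterable of (protein_a, protein_b) pairs (hypothetical IDs)
    disease_proteins : dict protein -> phenotype it is already known to cause
    locus_genes      : dict phenotype -> set of genes in its unresolved locus
    similar          : function (phenotype_a, phenotype_b) -> bool
    """
    # Build an undirected neighbour map from the interaction pairs.
    neighbours = defaultdict(set)
    for a, b in interactions:
        neighbours[a].add(b)
        neighbours[b].add(a)

    hits = []
    for seed, seed_phenotype in disease_proteins.items():
        # Assumption: the simplest possible "module" is the set of direct partners.
        module = neighbours[seed]
        for phenotype, genes in locus_genes.items():
            if not similar(seed_phenotype, phenotype):
                continue
            # A hit: a module member encoded in a locus with a similar phenotype.
            for candidate in sorted(module & genes):
                hits.append((phenotype, candidate, seed))
    return hits
```

Each returned tuple names the unresolved phenotype, the candidate gene in its locus, and the known disease protein that incriminates it.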
![Page 8: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/8.jpg)
Background
![Page 9: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/9.jpg)
Background
Finding genes responsible for major genetic disorders can lead to diagnostics, potential drug targets, treatments and large amounts of information about molecular cell biology in general.
![Page 10: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/10.jpg)
Background
Methods for disease gene finding in the post-genome era (>2001):
Microdeletions, translocations
http://www.med.cmu.ac.th/dept/pediatrics/06-interest-cases/ic-39/case39.html
http://www.rscbayarea.com/images/reciprocal_translocation.gif
Linkage analysis
Fagerheim et al. 1996.
1q21-1q23.1
chr1:141,600,000-155,900,000
![Page 11: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/11.jpg)
Background
Automated methods for disease gene finding in the post-genome era (>2001):
?
(Perez-Iratxeta, Bork et al. 2002) (Freudenberg and Propping 2002) (van Driel, Cuelenaere et al. 2005) (Hristovski, Peterlin et al. 2005)
Grouping:
Tissues, Gene Ontology, Gene Expression, MeSH terms, ...
![Page 12: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/12.jpg)
Disease Gene Finding.
Summary
Background
Why do we want to find disease genes, and how has it been done until now?
Networks – deducing functional relationships from network theory
Protein interaction networks
Functional modules / network clusters
Phenotype association
Grouping disorders based on their phenotype.
Biological implications of phenotype clusters.
Method and examples
Combining network theory and phenotype associations in an automated large scale disease gene finding platform: proof of concept.
Status of pipeline / infrastructure
![Page 13: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/13.jpg)
Networks and functional modules
Deducing functional relationships from protein interaction networks
![Page 14: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/14.jpg)
Networks and functional modules
Deducing functional relationships from network theory
Network theory is boooooooooring
![Page 15: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/15.jpg)
Networks
Text mining of full-text corpora, e.g. PubMed Central
http://www.biosolveit.de/ToPNet/screenshots/fig1.html
![Page 16: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/16.jpg)
Networks
Protein interaction networks of physical interactions (Barabasi and Oltvai 2004).
![Page 17: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/17.jpg)
Networks
Social networks: the CBS interactome (de Lichtenberg et al.)
[Figure legend: daily, weekly, monthly]
![Page 19: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/19.jpg)
Extracting functional data from protein interaction networks
InWeb
Homo sapiens
The ACh receptor involved in myasthenic syndrome.
Dynamic functional module:
E.g.:
Cell cycle regulation
Metabolism
![Page 20: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/20.jpg)
Trans-organism protein interaction network
Orthologs?
Orthologous genes are direct descendants of a gene in a common ancestor:
(O'Brien, Remm et al. 2005)
S. cerevisiae
D. melanogaster
H. sapiens
![Page 21: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/21.jpg)
Trans-organism protein interaction network
D. melanogaster: experimental
C. elegans: experimental
S. cerevisiae: experimental
H. sapiens: MOSAIC
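Transferring interactions from model organisms to human through ortholog pairs (interologs) can be sketched as below. This is an illustrative Python sketch only: the function name `transfer_interologs`, the identifiers, and the input layout are hypothetical, and the actual pipeline may apply filtering or scoring that is not shown on the slides.

```python
def transfer_interologs(model_interactions, orthologs):
    """Map model-organism interactions onto human proteins via orthology.

    model_interactions : iterable of (protein_a, protein_b) pairs observed
                         experimentally in a model organism
    orthologs          : dict model-organism protein -> set of human orthologs
    Returns a set of unordered human protein pairs (as sorted tuples).
    """
    human_pairs = set()
    for a, b in model_interactions:
        # An interaction is transferred only if both partners have human orthologs.
        for human_a in orthologs.get(a, ()):
            for human_b in orthologs.get(b, ()):
                if human_a != human_b:  # skip self-pairs created by shared orthologs
                    human_pairs.add(tuple(sorted((human_a, human_b))))
    return human_pairs
```

Storing each pair as a sorted tuple keeps the network undirected, so the same inferred interaction arriving from yeast, worm and fly is counted once.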
![Page 22: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/22.jpg)
Infrastructure status
Source databases: BIND, IntAct, DIP, MINT, HPRD, GRID, hand-curated sets, predicted PPIs
CBS Datawarehouse: download/reformat databases
Trans-organism PPI pipeline
InWeb (Homo sapiens): > 122,000 interactions, > 22,000 genes
Scoring: A) topological, B) number of publications
Extraction: Perl modules, direct SQL access, XML or SIF output
Access: web server (Opis), command line (Inweb.pl)
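As an illustration of the SIF output mentioned above, the sketch below writes interaction pairs in the simple interaction format read by Cytoscape, one `source relation target` line per interaction. The function name `to_sif` and the use of `pp` as the relation label are assumptions; the slides do not show what Inweb.pl actually emits.

```python
def to_sif(interactions, relation="pp"):
    """Serialise undirected interaction pairs as tab-separated SIF lines."""
    # Normalise each pair so (A, B) and (B, A) collapse into one edge.
    unique_pairs = sorted({tuple(sorted(pair)) for pair in interactions})
    return "\n".join(f"{a}\t{relation}\t{b}" for a, b in unique_pairs)
```

The returned string can be written to a `.sif` file and opened in a network viewer.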
![Page 24: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/24.jpg)
Phenotype association
![Page 25: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/25.jpg)
Phenotype association
Zellweger syndrome
Absent liver peroxisomes; Hepatomegaly; Intrahepatic biliary dysgenesis; Prolonged neonatal jaundice; Pyloric hypertrophy; Patent ductus arteriosus; Ventricular septal defects; Bell-shaped thorax; Small adrenal glands; Absent renal peroxisomes; Clitoromegaly; Cryptorchidism; Hydronephrosis; Hypospadias; Renal cortical microcysts; Failure to thrive; Abnormal electroretinogram; Abnormal helices; Anteverted nares; Brushfield spots; Cataracts; Corneal clouding;
Epicanthal folds; Flat facies; Flat occiput; Glaucoma; High arched palate; High forehead; Hypertelorism; Large fontanelles; Macrocephaly; Micrognathia; Nystagmus; Pale optic disk; Pigmentary retinopathy; Posteriorly rotated ears; Protruding tongue; Redundant skin folds of neck; Round facies; Sensorineural deafness; Turribrachycephaly; Upward slanting palpebral fissures; Hyporeflexia or areflexia; Hypotonia;
Polymicrogyria; Seizures; Severe mental retardation; Subependymal cysts; Pulmonary hypoplasia; Cubitus valgus; Delayed bone age; Metatarsus adductus; Rocker-bottom feet; Stippled epiphyses (especially patellar and acetabular regions); Talipes equinovarus; Transverse palmar crease; Ulnar deviation of hands; Wide cranial sutures; Heterotopias/abnormal migration; Hypoplastic olfactory lobes;
Autosomal recessive; Albuminuria; Aminoaciduria; Decreased dihydroxyacetone phosphate acyltransferase (DHAP-AT) activity; Decreased plasmalogen; Elevated long chain fatty acids; Elevated serum iron and iron binding capacity; Increased phytanic acid; Pipecolic acidemia; Breech presentation; Death usually in first year of life; Genetic heterogeneity; Infants occasionally mistaken as having Down syndrome; Agenesis/hypoplastic corpus callosum
![Page 26: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/26.jpg)
Phenotype association
Word vectors
Reference: Zellweger Syndrome (214100)
Phenotype Sim. Score
Adrenoleukodystrophy (202370) 0.781
Hyperpipecolatemia (239400) 0.703
Cerebrohepatorenal Syndr. (214110) 0.682
Refsum Disease (266510) 0.609
A relationship between the infantile form of Refsum disease and Zellweger syndrome was suggested by the observations of Poulos et al. (1984) in 2 patients. In the infantile form of Refsum disease, as in Zellweger syndrome, peroxisomes are deficient and peroxisomal functions are impaired (Schram et al., 1986). Clinically, infantile Refsum disease, ZWS, and adreno-leukodystrophy have several overlapping features (Stokke et al., 1984). (http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=266510)
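A similarity score between two phenotype descriptions can be computed from word vectors along the following lines. The slides do not state which measure produced the scores in the table; cosine similarity over raw word counts is assumed here purely for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter


def word_vector(description):
    """Turn a free-text phenotype description into a bag-of-words count vector."""
    return Counter(description.lower().split())


def cosine_similarity(u, v):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(u[word] * v[word] for word in u.keys() & v.keys())
    norm_u = math.sqrt(sum(count * count for count in u.values()))
    norm_v = math.sqrt(sum(count * count for count in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Descriptions that share many clinical terms score close to 1 and descriptions with no terms in common score 0. A real system would likely also weight rare terms more heavily than common ones, which this sketch does not do.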
![Page 27: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/27.jpg)
Phenotype association
Word vectors
Phenotype association network
[Figure: network linking Zellweger syndrome, cerebrohepatorenal syndrome, Refsum disease and adrenoleukodystrophy]
![Page 29: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/29.jpg)
Method – Proof of concept
![Page 30: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/30.jpg)
Method
InWeb
Homo sapiens
Word vectors
Phenotype clustering
![Page 31: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/31.jpg)
Results - Benchmark
MIM     RANK  GENE             P-value               TRUE
278800  1     ENSG00000032514  0.300326793109544     *
278800  2     ENSG00000188611  0.0125655342047565
278800  2     ENSG00000138297  0.0125655342047565
278800  2     ENSG00000165406  0.0125655342047565
278800  3     ENSG00000196693  0.0121357313793756
278800  3     ENSG00000185532  0.0121357313793756
278800  4     ENSG00000197910  0.00680983722337082
278800  4     ENSG00000165383  0.00680983722337082
278800  4     ENSG00000172538  0.00680983722337082
. . . .
278800  4     ENSG00000165511  0.00680983722337082
278800  4     ENSG00000182354  0.00680983722337082
278800  4     ENSG00000172661  0.00680983722337082
278800  4     ENSG00000165507  0.00680983722337082
278800  4     ENSG00000178440  0.00680983722337082
278800  4     ENSG00000138299  0.00680983722337082
278800  4     ENSG00000197704  0.00680983722337082
278800  4     ENSG00000012779  0.00680983722337082
278800  4     ENSG00000197354  0.00680983722337082
278800  4     ENSG00000189090  0.00680983722337082
278800  4     ENSG00000107551  0.00680983722337082
278800  4     ENSG00000126542  0.00680983722337082
278800  4     ENSG00000198364  0.00680983722337082
278800  4     ENSG00000185849  0.00680983722337082
278800  4     ENSG00000150165  0.00680983722337082
278800  4     ENSG00000128815  0.00680983722337082
278800  4     ENSG00000178645  0.00680983722337082
278800  4     ENSG00000138293  0.00680983722337082
278800  4     ENSG00000176833  0.00680983722337082
278800  4     ENSG00000179251  0.00680983722337082
278800  4     ENSG00000169826  0.00680983722337082
278800  4     ENSG00000172678  0.00680983722337082
278800  4     ENSG00000197752  0.00680983722337082
278800  5     ENSG00000107643  0.00412573091718715
278800  6     ENSG00000165733  0.000263885640603109
278800  7     ENSG00000169813  6,63E+07
DE SANCTIS-CACCHIONE SYNDROME
Gene map locus 10q11: > 12 Mb region, 103 ranked genes
CLINICAL FEATURES
De Sanctis and Cacchione (1932) reported a condition, which they called 'xerodermic idiocy,' in which patients had xeroderma pigmentosum, mental deficiency, progressive neurologic deterioration, dwarfism, and gonadal hypoplasia. (http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=278800)
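In the benchmark table, candidates with equal values share a rank and the next distinct value takes the next rank. A ranking of that kind (dense ranking) can be sketched as follows. The function name `dense_rank` is hypothetical, and the assumption that larger values rank first is read off the table ordering rather than stated on the slide.

```python
def dense_rank(scores):
    """Rank genes by score, highest first; genes with equal scores share a rank.

    scores : dict gene -> numeric score
    Returns a list of (rank, gene, score) tuples in ranked order.
    """
    # Sort by descending score; break ties by gene name for a stable listing.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    ranked = []
    rank = 0
    previous_score = None
    for gene, score in ordered:
        if score != previous_score:
            rank += 1  # a new distinct score opens the next rank
            previous_score = score
        ranked.append((rank, gene, score))
    return ranked
```

With this scheme the rank of the known disease gene within its critical interval gives one simple per-locus benchmark figure.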
![Page 32: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/32.jpg)
Results – Benchmarking
DE SANCTIS-CACCHIONE SYNDROME, ranked 1
P-value 0.300326793109544
DNA excision repair protein ERCC-6
Eukaryotic translation initiation factor 4E (eIF4E)
DNA excision repair protein ERCC-2
Eukaryotic initiation factor 4A-I (eIF4A-I)
*126340 DNA REPAIR DEFECT EM9 OF CHINESE HAMSTER OVARY CELLS, COMPLEMENTATION OF; EM9
#133540 COCKAYNE SYNDROME CKN2
#278730 XERODERMA PIGMENTOSUM, COMPLEMENTATION GROUP D
#278800 DE SANCTIS-CACCHIONE SYNDROME
#601675 TRICHOTHIODYSTROPHY
![Page 33: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/33.jpg)
Results – Benchmarking
DE SANCTIS-CACCHIONE SYNDROME, ranked 2
P-value 0.0125655342047565
![Page 35: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/35.jpg)
Method - status
In Silico:
Two in silico proof-of-concept hits:
BOR Syndrome
Dilated Cardiomyopathy
In Vitro:
Five candidates being tested in the lab by mutational screening in patient material:
BOR Syndrome
Sensorineural Deafness
Holoprosencephaly
Obesity
Anhidrosis
Benchmarking:
Benchmarking by unbiased prioritization of genes in ~1000 critical intervals where the actual disease gene is known.
![Page 37: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/37.jpg)
Project description
Project:
Søren Brunak, Professor, center director, Dr. Phil, PhD, physicist
Niels Tommerup, Professor, Centre Director, Dr. Med.
Find disease genes using bioinformatics
![Page 38: Disease Gene Candidate Prioritization by Integrative Biology Table of contents: Where are we in the course pipeline? Maturation of the project and the](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d125503460f949e59d1/html5/thumbnails/38.jpg)
Acknowledgments
Disease Gene Finding:
Olga Rigina
Olof Karlberg
Zenia M. Størling
Páll Ísólfur Ólason
Kasper Lage
Anders Gorm
Anders Hinsby
Yves Moreau
Søren Brunak