genomics and proteomics analysis
TRANSCRIPT
-
8/3/2019 Genomics and Proteomics Analysis
Dr J Boateng, BIOT 1011 Bioinformatics
The biotechnology/IT market will increase at a compound annual growth rate (CAGR) of 24% to nearly $38 billion by 2006.
Source: IDC Research
Biotech and pharmaceutical companies spent $10 billion on hardware, software, and services in 2002.
Source: Gartner
Reference: Prof. A.S. Kolaskar Vice Chancellor, University of Pune
-
GENOMICS
Genetics: the science of genes, heredity, and the variation of organisms. In modern research, genetics provides tools for investigating the function of a particular gene, e.g. analysis of genetic interactions.
Genomics: the study of large-scale genetic patterns across the genome for a given species. It deals with the systematic use of genome information to provide answers in biology, medicine, and industry.
-
The study of sequences, gene organization and mutations at the DNA level, i.e. the study of information flow within a cell.
Genomics has the potential to offer new therapeutic methods for the treatment of some diseases, as well as new diagnostic methods.
Major tools and methods related to genomics are bioinformatics, genetic analysis, measurement of gene expression, and determination of gene function.
-
GENOME COMPARISONS
Species            Chrom.   Genes           Base pairs
Human              46       28,000-35,000   3.1 billion
Mouse              40       22,500-30,000   2.7 billion
Puffer fish        44       31,000          365 million
Malaria mosquito   6        14,000          278 million
Fruit fly          8        14,000          137 million
Roundworm          12       19,000          97 million
E. coli            1        5,000           4.1 million
-
GENOMIC ANALYSIS
Many diverse studies require the determination of the abundance of large numbers of specific DNA or RNA molecules in complex mixtures, including, for example, the determination of the changes in mRNA levels of many genes.
Genome analysis entails the prediction of genes in uncharacterized genomic sequences.
The 21st century has seen the announcement of the draft version of the human genome sequence. Model organisms have been sequenced in both the plant and animal kingdoms.
-
GENOMIC ANALYSIS
However, the pace of genome annotation is not matching the pace of genome sequencing.
Experimental genome annotation is slow and time consuming. The demand is for computational tools for gene prediction.
Computational gene prediction is relatively simple for prokaryotes, where genes are continuous stretches of DNA that are transcribed into the corresponding mRNA and then translated into proteins.
The process is more complex for eukaryotic cells, where the coding DNA sequence is interrupted by non-coding intervening sequences called introns.
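To illustrate why the prokaryotic case is the simpler one, here is a minimal sketch of an open reading frame (ORF) scan. It is not a real gene finder: it only looks at the three forward reading frames, the function name `find_orfs`, the `min_codons` cutoff and the toy sequence are all assumptions made for this example, and it ignores introns entirely, which is exactly what breaks down for eukaryotes.

```python
# Minimal ORF scan on the forward strand only (illustrative, not a real gene finder).
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for each ATG...stop stretch in the three forward frames.

    start is 0-based, end is exclusive and includes the stop codon.
    min_codons counts the start codon but not the stop codon (hypothetical cutoff).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i  # remember the first start codon in this frame
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next start codon
    return sorted(orfs)

# Made-up toy sequence, not taken from any genome.
toy = "CCATGGCTGAAACCTGATTATGCCCTAA"
for start, end, seq in find_orfs(toy):
    print(start, end, seq)
```

Lowering `min_codons` reports shorter candidates as well, which shows how the length cutoff trades sensitivity against false positives.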
-
BIOLOGICAL QUESTIONS
Some of the questions biologists want to answer today are:
What part of a DNA sequence codes for a protein and what part of it is junk DNA?
Classify the junk DNA as introns, untranslated regions, transposons, dead genes, regulatory elements.
Divide a newly sequenced genome into the genes (coding) and the non-coding regions.
-
Biological Research in 21st Century
"The new paradigm, now emerging, is that all the 'genes' will be known (in the sense of being resident in databases available electronically), and that the starting point of a biological investigation will be theoretical." - Walter Gilbert
-
IMPORTANCE OF GENOME ANALYSIS
The importance of genome analysis can be understood by comparing the human and chimpanzee genomes.
The chimp and human genomes vary by an average of just 2%, i.e. just about 160 enzymes. A complete genome analysis of the two genomes would give a strong insight into the various mechanisms responsible for the differences.
-
COMPLEXITY IS AN UNDERSTATEMENT?
-
GENOMIC ANALYSIS: basics
Techniques used to estimate the relative abundance of two or more sets of mRNA:
differential screening of cDNA libraries,
subtractive hybridization,
differential display.
However, more advanced methods have been recently developed.
-
GENOMIC ANALYSIS: advances
Advanced methods are particularly amenable to organisms whose entire genome sequences are known, such as S. cerevisiae.
It is now practicable to investigate changes of mRNA levels of all yeast open reading frames (ORFs) in one experiment.
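A change in mRNA level between two conditions is commonly summarized as a log2 ratio. The sketch below assumes each experiment yields one signal value per ORF; the ORF identifiers, the intensities and the `pseudocount` parameter are all made up for illustration and do not come from any real dataset.

```python
import math

def log2_fold_changes(control, treated, pseudocount=1.0):
    """Return {orf: log2((treated + pc) / (control + pc))} for ORFs present in both samples."""
    changes = {}
    for orf in control.keys() & treated.keys():
        # The pseudocount (an assumption of this sketch) avoids division by zero.
        ratio = (treated[orf] + pseudocount) / (control[orf] + pseudocount)
        changes[orf] = math.log2(ratio)
    return changes

# Made-up signal intensities for three hypothetical ORF identifiers.
control = {"ORF_A": 99.0, "ORF_B": 399.0, "ORF_C": 49.0}
treated = {"ORF_A": 399.0, "ORF_B": 99.0, "ORF_C": 49.0}

for orf, lfc in sorted(log2_fold_changes(control, treated).items()):
    print(orf, round(lfc, 2))
```

A positive value means the mRNA is more abundant in the treated sample, a negative value means it is less abundant, and zero means no change.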
-
Advanced genomic analysis techniques
DNA sequencing
DNA microarray technology: analysis of gene expression profiles at the mRNA level
Bioinformatic tools to organize and analyze such data
Chip-based analysis of samples
Models of gene networks
-
Microarray Technology
-
Post-genomic Era: series of "omics"
Comparative genomics
Structural and functional genomics
Transcriptomics
Proteomics
Metabolomics
-
Bioinformatics tools needed for analysis of data from these "omics"
-
Data Mining
Development of new tools for data mining:
Sequence alignment
Genome sequencing
Genome comparison
Microarray data analysis
Proteomics data analysis
Small molecular array analysis
To derive information and gain knowledge from the data
-
COMPARATIVE GENOMICS
Analyzing and comparing genetic material from different species to study evolution, gene function, and inherited disease
Understand the uniqueness between different species
Comparative genomics involves the use of computer programs that can line up multiple genomes and look for regions of similarity among them.
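As a rough illustration of "looking for regions of similarity", the sketch below indexes every k-letter substring (k-mer) of one sequence and reports exact matches in another. Real genome aligners are far more sophisticated (they tolerate mismatches and gaps); the function names, the value of `k` and both sequences are assumptions made for this example.

```python
def kmer_positions(seq, k):
    """Map each length-k substring of seq to the list of positions where it starts."""
    index = {}
    for i in range(len(seq) - k + 1):
        index.setdefault(seq[i:i + k], []).append(i)
    return index

def shared_kmers(seq_a, seq_b, k=8):
    """Return sorted (pos_in_a, pos_in_b, kmer) for every exact k-mer match."""
    index_b = kmer_positions(seq_b, k)
    hits = []
    for i in range(len(seq_a) - k + 1):
        kmer = seq_a[i:i + k]
        for j in index_b.get(kmer, []):
            hits.append((i, j, kmer))
    return sorted(hits)

# Two made-up sequences that share the stretch "GATTACAGG".
genome_a = "TTTGATTACAGGCCC"
genome_b = "ACGATTACAGGTA"
for hit in shared_kmers(genome_a, genome_b, k=8):
    print(hit)
```

Consecutive hits on the same diagonal (here positions 3/2 and 4/3) indicate one longer shared region, which is the kind of seed that alignment programs then extend.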
-
When we BLAST a sequence, is that comparative genomics?
The difference is in scale and direction.
Other omics:
One or several genes compared against all other known genes.
Use the genome to inform us about the entire organism.
Comparative:
Entire genome compared to other entire genomes.
Use information from many genomes to learn more about the individual genes.
-
Background on Comparative Genomic Analysis
Sequencing the genomes of the human, the mouse and a wide variety of other organisms, from yeast to chimpanzees
Driving force for the development of a new field of biological research called comparative genomics.
-
BACKGROUND
Comparing the human genome with the genomes of different organisms helps to better understand gene structure and function and thereby develop new strategies in the battle against human disease.
Comparative genomics also provides a powerful new tool for studying evolutionary changes among organisms.
-
This helps to identify the genes that are conserved among species along with the genes that give each organism its own unique characteristics.
Using computer-based analysis to zero in on the genomic features that have been preserved in multiple organisms over millions of years, researchers will be able to pinpoint the signals that control gene function.
This should in turn translate into innovative approaches for treating human disease and improving human health.
-
BACKGROUND
The evolutionary perspective may prove extremely helpful in understanding disease susceptibility. For example, chimpanzees do not suffer from some of the diseases that strike humans, such as malaria and AIDS.
A comparison of the sequence of genes involved in disease susceptibility may reveal the reasons for this species barrier, thereby suggesting new pathways for prevention of human disease.
-
BACKGROUND
Although living creatures look and behave in many different ways, all of their genomes consist of DNA, the chemical chain that makes up the genes that code for thousands of different kinds of proteins.
Precisely which protein is produced by a given gene is determined by the sequence in which four chemical building blocks - adenine (A), thymine (T), cytosine (C) and guanine (G) - are laid out along DNA's double-helix structure.
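How the order of A, T, C and G determines the protein can be sketched with the standard genetic code: the coding sequence is read three letters at a time and each triplet (codon) maps to one amino acid. The compact table construction, the function name `translate` and the example sequence are choices made for this sketch; it assumes the input is already the coding strand in the correct reading frame.

```python
BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG codon order ("*" marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(coding_dna):
    """Translate a coding-strand DNA sequence codon by codon, stopping at the first stop codon."""
    coding_dna = coding_dna.upper()
    protein = []
    for pos in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("ATGGCCAAGTAA"))  # Met-Ala-Lys, then a stop codon
```

Changing a single base, for example GCC to GAC, changes the encoded amino acid, which is why sequence differences between genes translate into different proteins.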
-
BACKGROUND
In order for researchers to most efficiently use an organism's genome in comparative studies, data about its DNA must be in large, contiguous segments, anchored to chromosomes and, ideally, fully sequenced.
Furthermore, the data needs to be organized for easy access and high-speed analysis by sophisticated computer software.
Organisms that have been completely sequenced include: mouse (Mus musculus), human (Homo sapiens), fruit fly (Drosophila melanogaster); and ...
-
BACKGROUND
The fledgling field of comparative genomics has already yielded some dramatic results.
For example, a March 2000 study comparing the fruit fly genome with the human genome discovered that about 60 percent of genes are conserved between fly and human.
Simply put, the two organisms appear to share a core set of genes. Researchers have found that two-thirds of human cancer genes have counterparts in the fruit fly.
-
BACKGROUND
More surprisingly, when scientists inserted a human gene associated with early-onset Parkinson's disease into fruit flies, they displayed symptoms similar to those seen in humans with the disorder.
This raises the possibility that the tiny insects could serve as a new model for testing therapies aimed at Parkinson's.
-
Comparative Genomics
What should one look for?
Genomes: Human, P. falciparum, Mosquito
Proteins that are shared by:
All genomes
Exclusively by Human & P.f.
Exclusively by Human & Mosquito
Exclusively by P.f. & Mosquito
Unique proteins in:
Human
P.f. (targets for anti-malarial drugs)
Mosquito
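The categories listed above are the seven regions of a three-way Venn diagram, so they can be sketched with plain set operations. The protein-family identifiers below are hypothetical placeholders, not real annotation data, and the helper name `partition` is an assumption of this example; in practice the sets would come from a homology search across the three proteomes.

```python
# Hypothetical protein-family identifiers; not real annotation data.
human = {"fam1", "fam2", "fam3", "fam4"}
p_falciparum = {"fam1", "fam2", "fam5", "fam6"}
mosquito = {"fam1", "fam3", "fam5", "fam7"}

def partition(a, b, c):
    """Split three sets into the seven regions of a three-way Venn diagram."""
    return {
        "all": a & b & c,
        "a_and_b_only": (a & b) - c,
        "a_and_c_only": (a & c) - b,
        "b_and_c_only": (b & c) - a,
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
    }

regions = partition(human, p_falciparum, mosquito)
# Families found only in the parasite: the candidate drug-target category from the slide.
print(sorted(regions["b_only"]))
```

With the arguments in the order human, P. falciparum, mosquito, the `b_only` region holds the families unique to the parasite.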
-
Comparative Gene Prediction
GenScan: ab initio gene prediction.
GeneWise, Procrustes: homology guided.
Rosetta, SGP-1 (Syntenic Gene Prediction), CEM (Conserved Exon Method): gene prediction and sequence alignment are clearly separated.
GenomeScan: ab initio, modified by BLAST homologies.
SGP-2, TwinScan, SLAM, DoubleScan: modification of the GenScan scoring schema to incorporate similarity to known proteins.
-
Proteomics by the dictionary
Proteomics (practical): the study of the proteome using technologies of large-scale protein separation and identification.
Large-scale separation: 2-DE, liquid chromatography
Identification: MALDI MS, tandem MS/MS, FT-MS, ...
-
http:www.bio-itworld.com/archive/031704/horizons_horizons_comm.html
-
Proteomics according to Medline: development of proteomics
From 220 publications in the previous millennium (94-99) to 21,350 (!!!) publications in this millennium (00-05)
[Chart: proteomics papers and reviews per year, 1997-2004]
-
Proteomics by Google
THE REALISTIC TRUTH.
Proteomics: 886,000 hits (2004), 4,700,000 hits (2005)
Genomics: 2,070,000 hits (2004), 16,000,000 hits (2005)
-
Comparing Proteomics & Genomics
Genome (genomics analysis): DNA, nc-RNA, mRNA, coding DNA; linear; dynamic (up/down); completion; archived (EST, cDNA, GEO)
Proteome (proteome analysis): proteins, peptides, glyco and other modifications; 3D; dynamic (up/down, variants); no notion of completion; poorly archived
-
Proteomics vs Genomics: more differences
Dynamics: gene/RNA dynamic (genomics); protein dynamic (proteomics)
Handling, genomics: stable molecules; handling cheap/easy; minimal modification; works in isolation
Handling, proteomics: fragile molecules; handling dependent; labile modification; protein-interaction; localization dependent
HTP tech, genomics: sequencing (established); DNA array / genotyping / expression / CGH
HTP tech, proteomics: MS related (not yet); protein chip (not yet); antibody array (not yet)
-
Proteomics:
Original definition: study of the proteins encoded by the genome of a biological sample
Current definition: study of the whole protein complement of a biological sample (cell, tissue, animal, biological fluid [urine, serum])
Usually involves high-resolution separation of polypeptides at the front end, followed by mass spectrometry identification and analysis
-
Challenges facing Proteomic Technologies
Limited/variable sample material
Sample degradation (occurs rapidly, even during sample preparation)
Vast dynamic range required
Post-translational modifications (often skew results)
Specificity among tissue, developmental and temporal stages
Perturbations by environmental (disease/drugs) conditions
Researchers have deemed sequencing the genome easy, as PCR was able to assist in overcoming many of these issues in genomics.
-
Proteomic Technologies
The Proteomics Tool Kit:
technologies for separating and visualizing proteins and peptides
technologies for assessing protein-protein interactions
technologies for identifying proteins*
technologies for quantifying protein expression*
bioinformatic tools for assessment and communication
-
Proteomic Technologies
Amino Acid Composition
Array-based Proteomics
2D PAGE
Mass Spectrometry
Structural Proteomics
Informatics (and the challenges facing the Human Proteome Project)
-
Amino Acid Composition (Edman)
Pioneering method of obtaining information from proteins.
Cumbersome and tedious by today's standards.
Requires the use of terrible-smelling β-mercaptoethanol.
Not high-throughput by today's standards; hence, amino acid composition is no longer the most widely used technique.
-
Protein Sequencing: step 1, fragmenting into peptides
-
Protein Sequencing: step 2, sequencing the peptides by Edman degradation.
Separation by HPLC and detection by absorbance at 269 nm.
-
Array-based Proteomics
Employ two-hybrid assays
Use GFP, FRET, and GST
GFP = green fluorescent protein
FRET = fluorescence resonance energy transfer
GST = glutathione S-transferase, a well-characterized protein used as a marker protein.
-
Array-based Proteomics
-
Array-based Proteomics
Offer a high-throughput technique for proteome analysis.
These small plates are able to hold many different samples at a time.
Research is ongoing at ORNL in an attempt to interface array methodologies with mass spectrometry.
-
2D PAGE
2-D gel electrophoresis is a multi-step procedure that can be used to separate hundreds to thousands of proteins with extremely high resolution.
It works by separating proteins by their pIs in one dimension using an immobilized pH gradient (first dimension: isoelectric focusing) and then by their MWs in the second dimension.
The core technology of proteomics is 2-DE.
At present, there is no other technique that is capable of simultaneously resolving thousands of proteins in one separation procedure (cited in 2000).
-
Evolution of 2-DE methodology
In the past
Traditional IEF procedure:
Isoelectric focusing (IEF) is run in thin polyacrylamide gel rods in glass or plastic tubes.
Gel rods contain: 1. urea, 2. detergent, 3. reductant, and 4. carrier ampholytes (form the pH gradient).
Problems: 1. tedious. 2. not reproducible.
-
Evolution of 2-DE methodology
SDS-PAGE gel size:
The O'Farrell technique has been used for 20 years without major modification.
20 x 20 cm gels have become a standard for 2-DE.
Assumption: 100 bands can be resolved by a 20 cm long 1-DE gel.
Therefore, a 20 x 20 cm gel can resolve 100 x 100 = 10,000 proteins, in theory.
-
Evolution of 2-DE methodology
Problems with traditional 1st dimension IEF
Works well for native proteins, not good for denatured proteins, because:
1. Takes a longer time to run.
2. Techniques are cumbersome (the soft, thin, long gel rods need excellent experimental technique).
3. Batch-to-batch variation of carrier ampholytes.
4. Patterns are not reproducible enough.
5. Loss of most basic proteins and some acidic proteins.
OPERATOR DEPENDENT
-
2D PAGE
The 2-D gel electrophoresis process consists of these steps:
Sample preparation
First dimension: isoelectric focusing
Second dimension: gel electrophoresis
Staining
Imaging analysis via software
-
Challenges for 2-DE
1. Spot number:
10,000-150,000 gene products in a cell.
PTM makes it difficult to predict the real number.
Sensitivity and dynamic range of 2-DE must be adequate.
It is impossible to display all proteins in one single gel.
-
Challenges for 2-DE
2. Isoelectric point spectrum:
pI of proteins: ranges from pH 3-13 (by in vitro translated ORF).
PTM would not alter the pI outside this range.
A pH gradient from 3-13 does not exist.
Proteins with pI > 11.5 need to be handled separately.
-
Challenges for 2-DE
3. Molecular weights:
Small proteins or peptides can be analysed by modifying the gel and buffer conditions of SDS-PAGE.
Proteins > 250 kDa do not enter the second-dimension SDS-PAGE properly.
1-DE (SDS-PAGE) can be run in a lane at the side of 2-DE.
-
Challenges for 2-DE
4. Hydrophobic proteins:
Some very hydrophobic proteins do not go into solution.
Some hydrophobic proteins are lost during sample preparation and isoelectric focusing (IEF).
More chemical developments are required.
-
Challenges for 2-DE
5. Sensitivity of detection:
Low copy number proteins are very difficult to detect, even employing the most sensitive staining methods.
Sensitivity of staining methods:
1. Silver staining
2. Fluorescent staining
3. Dye binding staining (CBR)
-
Challenges for 2-DE
6. Loading capacity:
For detection of low-abundance proteins, more sample needs to be loaded.
A wide dynamic range of the SDS-PAGE is required to prevent merging of highly abundant proteins.
Loading capacity: IEF > SDS-PAGE.
-
Challenges for 2-DE
7. Quantitation:
The detection method must give reliable quantitative information.
Silver staining does not give reliable quantitative data.
-
Challenges for 2-DE
8. Reproducibility:
Highest importance in a 2-DE experiment.
Immobilized pH gradient strips have greatly improved first-dimension consistency.
Variation mostly comes from sample preparation.
-
A good-looking spot pattern, streak and smear free, is not a guarantee of the best 2-DE protocol.
-
Technologies for identifying proteins
Western blotting
Chemical (Edman) sequencing of proteins
mass spectrometry
peptide mass fingerprint
mass spec decay
databases and search engines
-
Mass Spectrometry
Mass spectrometry is another tool to analyze the proteome.
In general, a mass spectrometer consists of:
Ion source
Mass analyzer
Detector
Mass spectrometers are used to measure the mass-to-charge (m/z) ratios of substances.
From this measurement, a mass is determined, proteins are identified, and further analysis is performed.
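The step from a measured m/z back to a mass can be sketched as simple arithmetic. The example assumes positive-mode ions that are charged by added protons, which is a common but not universal case; the function names and the example mass are made up for illustration.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons

def mz(neutral_mass, charge):
    """Return m/z for a molecule of the given neutral mass carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge

def neutral_mass_from_mz(observed_mz, charge):
    """Invert mz(): recover the neutral mass from an observed m/z and an assumed charge."""
    return observed_mz * charge - charge * PROTON_MASS

peptide_mass = 1500.0  # made-up neutral mass in daltons
for z in (1, 2, 3):
    print(z, round(mz(peptide_mass, z), 4))
```

The same molecule appears at a different m/z for each charge state, so the charge has to be known or inferred before a mass can be assigned.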
-
MASS SPECTROMETRY
MORE DETAILED MASS SPECTROMETRY APPLICATIONS IN MORNING LECTURE ON 28TH NOVEMBER 2011
-
Application of bioinformatics in the fields of genomics and proteomics
-
What is Bioinformatics?
Conceptualizing biology in terms of molecules and then applying informatics techniques from math, computer science, and statistics to understand and organize the information associated with these molecules on a large scale.
-
How do we use Bioinformatics?
Store/retrieve biological information (databases)
Retrieve/compare gene sequences
Predict function of unknown genes/proteins
Search for previously known functions of a gene
Compare data with other researchers
Compile/distribute data for other researchers
-
Sequence retrieval:
National Center for Biotechnology Information
GenBank and other genome databases
Protein Structure:
3D modeling programs: RasMol, Protein Explorer
Sequence comparison programs:
BLAST, GCG, MacVector
-
Similarity Search: BLAST
-
Similarity Search: BLAST
A tool for searching gene or protein sequence databases for related genes of interest
The structure, function, and evolution of a gene may be determined by such comparisons
Alignments between the query sequence and any given database sequence, allowing for mismatches and gaps, indicate their degree of similarity
http://www.ncbi.nlm.nih.gov/BLAST/
-
% identity
MRCKTETGAR
MRCGTETGAR
90%
CATTATGATA
GTTTATGATT
70%
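The two percentages above can be reproduced by counting matching positions. This sketch assumes the sequences are already aligned and of equal length with no gaps, which is a simplification: BLAST itself computes identity over a local alignment that may contain gaps. The function name `percent_identity` is chosen for this example.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two equal-length sequences carry the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must already be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

print(percent_identity("MRCKTETGAR", "MRCGTETGAR"))  # protein pair from the slide
print(percent_identity("CATTATGATA", "GTTTATGATT"))  # DNA pair from the slide
```

The protein pair differs at one of ten positions and the DNA pair at three of ten, which gives the 90% and 70% shown on the slide.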
-
Strengths:
Accessibility
Growing rapidly
User friendly
Weaknesses:
Sometimes not up-to-date
Limited possibilities
Limited comparisons and information
Not accurate
-
Need for improved Bioinformatics
Genomics:
Human Genome Project
Gene array technology
Comparative genomics
Functional genomics
Proteomics:
Global view of protein function/interactions
Protein motifs
Structural databases
-
Data Mining
Handling enormous amounts of data
Sort through what is important and what is not
Manipulate and analyze data to find patterns and variations that correlate with biological function
-
Proteomics
Uses information determined by biochemical/crystal structure methods
Visualization of protein structure
Make protein-protein comparisons
Used to determine:
- conformation/folding
- antibody binding sites
- protein-protein interactions
- computer aided drug design
-
bioinformatics
students educators
researchers institutions