Interrelating Different Types of Genomic Data

From Proteome to Secretome: ‘Oming in on the Function

Objectives

Outline the goal of functional genomics

Describe the studies of H. Antelmann

Explain the “-Ome” and its purpose

Define the “-Omes”

Identify the different computational and experimental methods for defining the “-Omes”

Explain how “-Omes” are interrelated

Summarize the ultimate goal of genomics

The Goal of Genomics

To complement the genomic sequence by assigning useful biological information to every gene

To improve the understanding of how different biological molecules contained within the cell combine to make the organism possible

Additional Goals

To define the three-dimensional structures of the macromolecules, their sub-cellular localizations, intermolecular interactions, and expression levels

Haike Antelmann & Group

Institute for Microbiology and Molecular Biology, Ernst-Moritz-Arndt-Universität Greifswald, Greifswald, Germany

Previously the group used computational methods to predict all exported proteins

Now they aim to verify those predictions by experimentally characterizing the entire population of secreted proteins using 2D gel electrophoresis and mass spectrometry

They showed that about 50% of their original predictions were accurate

The new Lexicon of the “-Ome”

Antelmann's group coined the term “secretome”, a new addition to the lexicon of “-omes” used to define the varied populations and subpopulations in the cell

-Omes can be divided into two categories:

Those that represent a population of molecules

Those that define their actions

-Omes

Provides an inventory or “parts list” of molecules contained within an organism

Transcriptome: the population of mRNA transcripts in the cell, weighted by their expression levels

Glycome: the population of carbohydrate molecules in the cell

Secretome: the population of gene products that are secreted from the cell

Ribonome: the population of RNA-coding regions of the genome

-Omes Continued

Transportome: the population of the gene products that are transported; this includes the secretome

Functome: the population of gene products classified by their functions

Translatome: the population of proteins in the cell, weighted by their expression levels

Foldome: the population of gene products classified through their tertiary structure

-Omes Continued

Describes the actions of the protein products

Genome: the full complement of genetic information, both coding and noncoding, in the organism

Proteome: the protein-coding regions of the genome

Physiome: quantitative description of the physiological dynamics or functions of the whole organism

Metabolome: the quantitative complement of all the small molecules present in a cell in a specific physiological state

-Omes Continued

Morphome: the quantitative description of anatomical structure, biochemical and chemical composition of an intact organism, including its genome, proteome, cell, tissue and organ structures

Interactome: list of interactions between all macromolecules in a cell

Orfeome: the sum total of open reading frames in the genome, without regard to whether or not they code; a subset of this is the proteome

Phenome: qualitative identification of the form and function derived from genes, but lacking a quantitative, integrative definition

-Omes Continued

Regulome: genome-wide regulatory network of a cell

Cellome: the entire complement of molecules and their interactions within a cell

Operome: the characterization of proteins with unknown biological function

Pseudome: the complement of pseudogenes in the proteome

Unknome: genes of unknown function

Computational Methods

Algorithmic methods for predicting genes, protein structure, interactions, or localization based on patterns in individual sequences or structures.

Annotation transfer through homology. (Inferring structure or function based on sequence and structural information of homologous proteins.)

“Guilt-by-Association” method based on clustering where functions or interactions are inferred from clusters of functional genomic data, such as expression information.

Annotation Transfer through Homology

In SWISS-PROT, as in most other sequence databases, two classes of data can be distinguished: the core data and the annotation. For each sequence entry the core data consists of the sequence data, the citation information (bibliographical references), and the taxonomic data (description of the biological source of the protein), while the annotation consists of the description of the following items:

Functions of the protein

Post-translational modifications. For example carbohydrates, phosphorylation, acetylation, GPI-anchor, etc.

Domains and sites. For example calcium binding regions, ATP-binding sites, zinc fingers, homeobox, kringle, etc.

Secondary structure

Quaternary structure

Similarities to other proteins

Diseases associated with deficiencies in the protein

Sequence conflicts, variants, etc.
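The split between core data and annotation, and the idea of transferring annotation between homologs, can be sketched in a few lines. This is only an illustration: the `Entry` class, the accessions, the sequences, and the 40% identity cutoff are all assumptions made for the example, not SWISS-PROT's actual data model or rules.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    # Core data (citations and taxonomy omitted to keep the sketch short)
    accession: str
    sequence: str
    # Annotation: items such as function, domains, modifications
    annotation: dict = field(default_factory=dict)


def percent_identity(a: str, b: str) -> float:
    """Identity between two pre-aligned, equal-length sequences ('-' marks a gap)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def transfer_annotation(query: Entry, homolog: Entry, min_identity: float = 40.0) -> bool:
    """Copy the homolog's annotation to the query when identity is high enough.

    The cutoff is an arbitrary illustrative value.
    """
    if percent_identity(query.sequence, homolog.sequence) < min_identity:
        return False
    for key, value in homolog.annotation.items():
        # Mark transferred items so they are not mistaken for direct evidence
        query.annotation.setdefault(key, f"{value} (by similarity)")
    return True


# Hypothetical entries: 9 of 10 aligned positions match
known = Entry("HYPO_0001", "MKTAYIAKQR", {"function": "ATP binding"})
novel = Entry("HYPO_0002", "MKTAYLAKQR")
transferred = transfer_annotation(novel, known)
print(transferred, novel.annotation)
```

Tagging transferred items as "by similarity" keeps inferred annotation distinguishable from experimentally supported annotation.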

Guilt-by-Association

Used for assessing gene function. Although not logically precise, the approach works because genes already known to be related do, in fact, tend to cluster together based on their experimentally determined expression patterns. It is made more systematic and statistically sound by calculating the probability that the observed functional distribution of differentially expressed genes could have happened by chance.
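One common way to compute such a chance probability is a hypergeometric test. The slides do not name a specific statistic, so this choice is an assumption, and the gene counts below are hypothetical.

```python
from math import comb


def enrichment_p_value(total_genes: int, in_category: int,
                       in_cluster: int, overlap: int) -> float:
    """P(X >= overlap) when in_cluster genes are drawn at random from
    total_genes, of which in_category carry the functional label."""
    upper = min(in_category, in_cluster)
    favourable = sum(
        comb(in_category, i) * comb(total_genes - in_category, in_cluster - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(total_genes, in_cluster)


# Hypothetical numbers: 6000 genes, 100 in one functional category,
# a cluster of 50 co-expressed genes, 10 of which fall in that category.
p = enrichment_p_value(6000, 100, 50, 10)
print(f"p = {p:.3g}")
```

A small p-value suggests the cluster shares that function more often than random grouping would explain.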

Experimental Methods

The most prominent method is two-dimensional electrophoresis to isolate proteins, followed by mass spectrometry for protein identification.

Protein chip systems, capable of high-throughput screening of protein biochemical activity.

Transposon insertion methodology is also sometimes used.

2-D Electrophoresis

First introduced in 1975. Most commonly used method for protein separation in proteomics.

Proteins are first separated across a gel according to their isoelectric point, then separated in a perpendicular direction on the basis of their molecular weight.

Electrophoresis in which a second perpendicular electrophoretic transport is performed on the separate components resulting from the first electrophoresis. This technique is usually performed on polyacrylamide gels.
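The two separation steps can be pictured as assigning each protein a pair of gel coordinates. This is a minimal sketch under simplifying assumptions: a linear pH gradient in the first dimension and migration distance linear in log10 of molecular weight in the second. The proteins and their values are hypothetical.

```python
from math import log10

# Hypothetical proteins: (name, isoelectric point, molecular weight in kDa)
proteins = [("P1", 4.8, 66.0), ("P2", 9.1, 14.0), ("P3", 6.2, 45.0), ("P4", 6.2, 25.0)]


def spot_position(pI, mw_kda, ph_range=(3.0, 10.0), mw_range=(10.0, 200.0)):
    """Map a protein to normalized gel coordinates (x, y), each in [0, 1].

    x: first dimension, position along an assumed linear pH gradient
    y: second dimension, distance migrated; assumed linear in log10(MW),
       so smaller proteins travel farther
    """
    x = (pI - ph_range[0]) / (ph_range[1] - ph_range[0])
    span = log10(mw_range[1]) - log10(mw_range[0])
    y = (log10(mw_range[1]) - log10(mw_kda)) / span
    return x, y


positions = {name: spot_position(pI, mw) for name, pI, mw in proteins}
for name, (x, y) in positions.items():
    print(f"{name}: x={x:.2f}, y={y:.2f}")
```

P3 and P4 share an isoelectric point and would overlap after the first dimension alone; the second dimension resolves them by size.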

Mass Spectrometry

In a typical approach, this technique for measuring and analyzing molecules involves introducing enough energy into a target molecule to cause its disintegration. The resulting fragments are then analyzed, based on their mass/charge ratios, to produce a “molecular fingerprint.”
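The mass/charge ratio itself is simple arithmetic. This is a minimal sketch assuming positive-ion mode, where each unit of charge comes from one added proton. The slides do not specify an ionization method, and the fragment mass used here is hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def mass_to_charge(neutral_mass: float, charge: int) -> float:
    """m/z of an ion formed by adding `charge` protons to a neutral molecule."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical fragment with a neutral mass of 1000.0 Da
for z in (1, 2, 3):
    print(z, round(mass_to_charge(1000.0, z), 4))
```

The same fragment appears at a different m/z for each charge state, which is why charge must be known before a peak can be converted back to a mass.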

Transposon Insertion Methodology

A composite transposon is a segment of DNA that contains insertion elements at either end but can contain just about anything in the middle (genes, markers, etc.). These types of transposons tend to be very large, and many of them came about when the inner two insertion elements of two smaller transposons stopped working and only the two at the far ends continued to work, so that when the transposon moves, it takes everything in between the two original transposons with it. Some composite transposons are used in genetics experiments; Tn5 and Tn10 are two such composite transposons which have genes that encode resistance to certain antibiotics.

Noise

Functional genomics experiments give rise to very complicated data that is inherently hard to interpret

This data is often plagued with noise

Both factors can lead to inaccuracies and conflicting interpretations

Average or combine data to obtain more accurate results
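The benefit of averaging can be seen in a small simulation. This is a sketch that assumes independent Gaussian noise, which real functional genomics data may not follow; the "true" expression level and noise size are made-up values.

```python
import random
import statistics

random.seed(0)       # fixed seed so the run is repeatable
TRUE_LEVEL = 10.0    # hypothetical true expression level
NOISE_SD = 2.0       # hypothetical measurement noise


def measure() -> float:
    """One noisy measurement of the true level."""
    return random.gauss(TRUE_LEVEL, NOISE_SD)


# Spread of single measurements versus averages of four replicates
single = [measure() for _ in range(1000)]
averaged = [statistics.mean([measure() for _ in range(4)]) for _ in range(1000)]

spread_single = statistics.stdev(single)
spread_avg = statistics.stdev(averaged)
print(f"single: {spread_single:.2f}, mean of 4: {spread_avg:.2f}")
```

Under these assumptions the spread of an average of n replicates shrinks roughly as 1/sqrt(n), so four replicates about halve the noise.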

Interrelating Different -Omes

Fundamental approach in genomics is to establish relationships between the different ‘omes

By piecing the individual -omes together, researchers hope to build a full and dynamic view of the complex processes that support the organism

Example: How do the proteome and regulome combine to produce the translatome?

Interrelating Different -Omes Continued

Defining or assigning one ‘ome based on another

Comparing one ‘ome with another to better understand the processes that shift one population into its successor

Calculating “missing” information in one ‘ome based on information in another

Describing the intersection between multiple populations
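Because most -omes are populations, several of these operations map onto ordinary set operations. This is a minimal sketch using hypothetical gene identifiers and expression values; the subset relations follow the definitions given earlier (the secretome is part of the transportome, and the proteome is a subset of the orfeome).

```python
# Hypothetical gene identifiers, for illustration only
orfeome = {"g1", "g2", "g3", "g4", "g5", "g6"}
proteome = {"g1", "g2", "g3", "g4", "g5"}
transportome = {"g2", "g3", "g4"}
secretome = {"g3", "g4"}

# Translatome: proteins weighted by expression level (arbitrary units)
translatome = {"g1": 120.0, "g3": 15.0, "g4": 300.0}

# Subset relations stated in the definitions
proteome_in_orfeome = proteome <= orfeome
secretome_in_transportome = secretome <= transportome

# Intersection of populations: secreted proteins that are detectably expressed
expressed_secreted = secretome & set(translatome)

# Gaps in one -ome seen from another: proteome members with no measured level
unmeasured = proteome - set(translatome)

print(sorted(expressed_secreted), sorted(unmeasured))
```

Identifying which members lack data in one -ome is only the first step; estimating the missing values from another -ome would need a model that the slides do not describe.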

Final Thoughts

Ultimate goal of genomics: the clarification of the functome, but there are many intermediate steps

By viewing the cell in terms of a list of distinct parts, definition of each part is possible. Determine and categorize functional information for each gene.

Computational and experimental techniques are valuable and complementary

Genomic approaches result in inaccurate and noisy data that must be analyzed further for accuracy

Sources

Cambridge Healthtech Institute; www.genomicglossaries.com

EMBL-EBI: European Bioinformatics Institute; www.ebi.ac.uk/swissprot/Publications/ismb97.html

Greenbaum, Dov; Luscombe, Nicholas; Jansen, Ronald; Qian, Jiang; Gerstein, Mark. “Interrelating Different Types of Genomic Data, from Proteome to Secretome: ‘Oming in on Function”

Apweiler, Rolf, et al. “Protein Sequence Annotation in the Genome Era: The Annotation Concept of SWISS-PROT + TrEMBL.” Intelligent Systems in Molecular Biology, 1997
