session ii g1 overview genomics and gene expression mmc-good

Microarray Dataset: quick mining and gene profile analysis using online tools. Dr. Etienne Z. GNIMPIEBA, Sioux Falls, March 2013

Upload: usd-bioinformatics

Post on 11-May-2015


Page 1: Session ii g1 overview genomics and gene expression mmc-good

Microarray Dataset: quick mining and gene profile analysis using online tools

Dr. Etienne Z. GNIMPIEBA

Sioux Falls, March 2013


Page 2: Session ii g1 overview genomics and gene expression mmc-good

Plan
• Gene expression measurement
• Microarray process
• Gene expression data stores
• Data mining / querying
• Data analysis
• Example: ATP13A2 profile in stress conditions

Page 3: Session ii g1 overview genomics and gene expression mmc-good

Gene expression measurement

Higher-plex techniques:
• SAGE
• DNA microarray
• Tiling array
• RNA-Seq
• NGS

Low-to-mid-plex techniques:
• Reporter gene
• Northern blot
• Western blot
• Fluorescent in situ hybridization
• Reverse transcription PCR

Page 4: Session ii g1 overview genomics and gene expression mmc-good

What is a Microarray?

“A DNA microarray is a multiplex technology consisting of thousands of oligonucleotide spots, each containing picomoles of a specific DNA sequence.”

• Used to quantitate mRNA or DNA
• Many applications:
  ◦ mRNA or DNA levels
  ◦ SNP identification
  ◦ ChIP-on-chip

Page 5: Session ii g1 overview genomics and gene expression mmc-good

Hypotheses

• Microarrays are usually hypothesis-generating:
  ◦ They highlight specific genes or features that are particularly interesting for follow-up experiments
  ◦ There are many interesting exceptions: biomarkers, pathway analyses
• This does not reduce the importance of experimental design:
  ◦ The low statistical power of array studies makes good design even more important and very challenging

Page 6: Session ii g1 overview genomics and gene expression mmc-good

Microarray process (1/3)
• Image analysis (GenePix)
• Normalization (R)
• Pre-treatment
• Differential expression
• Clustering
• Data mining
• Annotation
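To make the normalization step concrete, here is a minimal sketch of quantile normalization, one common way to make intensity distributions comparable across arrays. The slides only say that normalization is done in R and do not name a method, so the choice of technique, the function name `quantile_normalize`, and the intensity values are all assumptions for illustration.

```python
from statistics import mean


def quantile_normalize(arrays):
    """Quantile-normalize equal-length intensity lists (one list per array).

    Simplification: tied values are ranked by position instead of sharing
    an averaged rank.
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must measure the same probes")

    # Sort each array, then average across arrays at each rank.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [mean(column) for column in zip(*sorted_arrays)]

    normalized = []
    for a in arrays:
        # order[rank] is the index of the probe holding that rank in this array.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical raw intensities: 4 probes measured on 3 arrays (made-up numbers).
raw = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(raw)
```

After this step every array contains the same set of values, assigned according to each probe's rank within its own array, so differences in overall brightness between arrays no longer dominate the comparison.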

Page 7: Session ii g1 overview genomics and gene expression mmc-good

Microarray process (2/3)

Page 8: Session ii g1 overview genomics and gene expression mmc-good

Microarray process (3/3)

High density filters (macroarrays)
• Size: 12 cm x 8 cm
• 2,400 clones per membrane
• Radioactive labelling
• 1 experimental condition per membrane

Glass slides (microarrays)
• Size: 5.4 cm x 0.9 cm
• 10,000 clones per slide
• Fluorescent labelling
• 2 experimental conditions per slide

Oligonucleotide chips
• Size: 1.28 cm x 1.28 cm
• 300,000 oligonucleotides per slide
• Fluorescent labelling
• 1 experimental condition per slide

Page 9: Session ii g1 overview genomics and gene expression mmc-good

Gene expression data management

Database | Microarray Experiment Sets | Sample Profiles | Date Reported
ArrayExpress at EBI | 24,838 | 708,914 | October 28, 2011
ArrayTrack™ | 1,622 | 50,953 | February 11, 2012
caArray at NCI | 41 | 1,741 | November 15, 2006
Gene Expression Omnibus - NCBI | 25,859 | 641,770 | October 28, 2011
Genevestigator database | 2,500 | 65,000 | January 2012
MUSC database | ~45 | 555 | April 1, 2007
Stanford Microarray database | 82,542 | Not reported | October 23, 2011
UNC Microarray database | ~31 | 2,093 | April 1, 2007
UNC modENCODE Microarray database | ~6 | 180 | July 17, 2009
UPenn RAD database | ~100 | ~2,500 | September 1, 2007
UPSC-BASE | ~100 | Not reported | November 15, 2007

• SAGE
• GEO
• GUDMAP (421)
• MGI
• BIOGPS

Page 10: Session ii g1 overview genomics and gene expression mmc-good

Data mining / querying

• Problem specification
• Query
• Extraction
• Storage Load
• Pretreat / prepare for analysis

Page 11: Session ii g1 overview genomics and gene expression mmc-good

Data analysis (1/3)
• Question-Answer
  ◦ Experimental condition profile: group comparison
  ◦ Annotation profile: biological systems involved
  ◦ Clustering profile: co-regulation
  ◦ Time course profile: time variation
  ◦ …
• Descriptive
  ◦ Boxplot (SD, mean, median, …)
  ◦ Scatter plot
• Predictive / inference (clustering)
• Modeling (machine learning, simulation)
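The "clustering profile: co-regulation" question above usually starts from a similarity measure between gene profiles. The sketch below computes the Pearson correlation between expression profiles, a common choice that the slides do not themselves specify. The gene names and values are hypothetical placeholders, not real data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


# Hypothetical profiles across four conditions (made-up numbers).
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],   # rises with gene_a
    "gene_c": [4.0, 3.0, 2.0, 1.0],   # falls as gene_a rises
}
r_ab = pearson(profiles["gene_a"], profiles["gene_b"])
r_ac = pearson(profiles["gene_a"], profiles["gene_c"])
```

Genes whose profiles correlate strongly (positively or negatively) are candidates for co-regulation; a clustering method would then group genes using a distance derived from this measure, such as 1 - r.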

Page 12: Session ii g1 overview genomics and gene expression mmc-good

Data analysis (2/3)

3 Questions
  ◦ What is the right dataset (experimental condition)?
  ◦ Is the dataset ready for analysis (quality)?
  ◦ What is the expression profile for a given gene?
  ◦ Significant differential expression in group comparison

Tools
  ◦ ArrayExpress (EBI)
  ◦ Boxplot
  ◦ GEO2R (limma, profile graph, …)
  ◦ …
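To illustrate what "significant differential expression in group comparison" measures, the sketch below computes a log2 fold change and a Welch t-statistic for one gene across two groups. This is a deliberate simplification: GEO2R relies on limma, which uses moderated t-statistics and multiple-testing correction, neither of which is reproduced here. The group names and expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def log2_fold_change(treated, control):
    """Difference of group means; assumes values are already log2-scaled."""
    return mean(treated) - mean(control)


def welch_t(treated, control):
    """Welch t-statistic (unequal variances) for one gene in two groups."""
    standard_error = sqrt(
        variance(treated) / len(treated) + variance(control) / len(control)
    )
    if standard_error == 0:
        raise ValueError("t-statistic is undefined when both groups are constant")
    return (mean(treated) - mean(control)) / standard_error


# Hypothetical log2 expression values for a single gene (made-up numbers).
control = [7.1, 6.9, 7.0, 7.2]
stress = [8.0, 8.3, 7.9, 8.2]

lfc = log2_fold_change(stress, control)
t_stat = welch_t(stress, control)
```

A large absolute t-statistic together with a sizeable fold change flags the gene for follow-up; with thousands of genes tested at once, the resulting p-values still need adjustment for multiple testing before calling anything significant.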

Page 13: Session ii g1 overview genomics and gene expression mmc-good

Data analysis (3/3)

Boxplot
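The numbers behind a boxplot can be computed directly, which is useful when checking whether the samples in a dataset have comparable distributions. This is a minimal sketch using Python's standard library; the function name `boxplot_summary` and the sample values are assumptions for illustration, and plotting tools may use slightly different quartile conventions.

```python
from statistics import mean, quantiles, stdev


def boxplot_summary(values):
    """Return the statistics a boxplot displays, plus mean and SD."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(values),
        "mean": mean(values),
        "sd": stdev(values),
    }


# Hypothetical expression values for one sample (made-up numbers).
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
summary = boxplot_summary(sample)
```

Computing this summary for every sample and comparing medians and interquartile ranges side by side gives a quick indication of whether the dataset looks normalized and ready for analysis.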

Page 14: Session ii g1 overview genomics and gene expression mmc-good

Example: ATP13A2 profile in stress conditions

• Specification: ATP13A2 profile in stress conditions
• Data querying:
  ◦ GEO
  ◦ ArrayExpress
  ◦ Gene Atlas
• Data analysis:
  ◦ Online: GEO2R, Genospace, …
  ◦ Desktop: R, ArrayTrack, …
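The GEO querying step can also be scripted. The sketch below only builds a search URL for the GEO DataSets database (`db=gds`) through the NCBI E-utilities `esearch` endpoint; it does not send the request. The search term combining "ATP13A2" and "stress" is an assumption, since the slides do not give the exact query used, and the helper name `build_geo_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(gene, keyword, retmax=20):
    """Build an esearch URL for GEO DataSets matching a gene and a keyword."""
    params = {
        "db": "gds",                       # GEO DataSets
        "term": f"{gene} AND {keyword}",   # assumed query, not from the slides
        "retmax": retmax,                  # maximum number of IDs to return
        "retmode": "json",
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = build_geo_search_url("ATP13A2", "stress")
print(url)
```

Opening the resulting URL returns the identifiers of matching GEO records, which can then be inspected in the GEO web interface or passed to GEO2R for the group comparison.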

Page 15: Session ii g1 overview genomics and gene expression mmc-good

Significant differential expression!

Kerry Bemis slides