Dr Paul Lewis, Lecturer in Bioinformatics, Cardiff University, Biostatistics & Bioinformatics Unit

Upload: abel-quinn

Post on 02-Jan-2016


Page 1

Dr Paul Lewis

• Lecturer in Bioinformatics

• Cardiff University

• Biostatistics & Bioinformatics Unit

Page 2

Biostatistics & Bioinformatics Unit (BBU)

• Bioinformatics resource for Institutions across Wales

• Backing of the Higher Education Funding Council for Wales - £1.5 million grant through the Research Capacity Development Fund

• UWCM, Cardiff University, Aberystwyth

• 13 new posts in statistics & bioinformatics

• MSc/Postgraduate Diploma/Postgraduate Certificate:

Bioinformatics

Genetic Epidemiology and Bioinformatics

Page 3

• Brief Overview of Microarray Bioinformatics

• Introduce My Microarray Research Interests

• My Microarray Analysis Software

Page 4

Bioinformatics in Microarray Experiment

• Experimental Design

• Differential Gene Expression

• Hybridisation

• Data

• Pattern Discovery

• Class Prediction

• Annotation

• Normalisation

Page 5

Normalisation

Remove non-biological influences on data (systematic variation)

3 categories of Normalisation

• Normalisation – transform data to make it more like a normal distribution
  log, lowess, linlog

• Standardisation – expand or contract the distribution so data from different experiments can be compared
  calculate Z-scores

• Centralisation – move the distribution so it is centred around the expected mean
  mean / median / trimmed-mean centring
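As a rough illustration of the three categories, the sketch below applies a log2 transform, median centring and Z-score standardisation to one array. The intensity values and function names are hypothetical, and lowess and linlog are not shown; this is only a minimal sketch, not the method used in the talk's software.

```python
import math
import statistics

def log2_transform(values):
    """Normalisation step: log-transform positive intensities."""
    return [math.log2(v) for v in values]

def median_centre(values):
    """Centralisation step: shift values so their median is zero."""
    m = statistics.median(values)
    return [v - m for v in values]

def z_scores(values):
    """Standardisation step: zero mean and unit sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical raw spot intensities for one array
raw = [120.0, 480.0, 960.0, 240.0, 1920.0]

logged = log2_transform(raw)
centred = median_centre(logged)
standardised = z_scores(logged)

print([round(v, 3) for v in centred])
print([round(v, 3) for v in standardised])
```

Z-scores put arrays from different experiments on a common scale, which is what makes them comparable.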

Page 6

Bioinformatics in Microarray Experiment (workflow diagram repeated)

Page 7

Find Differentially Expressed Genes
Is fold change significant?

With Replicates

Parametric tests

t-test (ANOVA) J. Comput. Biol. 2000 7: 817-838

Bayesian t-test Bioinformatics 2001 17: 509-519

Mixture modelling & bootstrapping (SAM) P.N.A.S. 2001 98: 5116-5121

Regression modelling Genome Res. 2001 11: 1227-1236

All give similar results but SAM reduces false positives

Non-parametric tests

Wilcoxon rank sum test Bioinformatics 2002 18: 1454-1461

Non-parametric t-test Bioinformatics 2002 18: 1454-1461

Ideal discriminator method Bioinformatics 2002 18: 1454-1461

Low false positive rate but less power
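To make the "with replicates" idea concrete, here is a minimal sketch of a per-gene Welch t-statistic alongside the log2 fold change. The gene names and expression values are invented for illustration, and the choice of the Welch variant is an assumption rather than something stated on the slide. Turning the statistic into a p-value would need a t distribution from a statistics library, which is left out here.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Return (t statistic, approximate degrees of freedom) for two replicate groups."""
    na, nb = len(group_a), len(group_b)
    va, vb = statistics.variance(group_a), statistics.variance(group_b)
    se2 = va / na + vb / nb
    t = (statistics.fmean(group_a) - statistics.fmean(group_b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical log2 expression values: gene -> (control replicates, treated replicates)
genes = {
    "gene_A": ([8.1, 8.0, 8.2, 7.9], [10.1, 10.3, 9.9, 10.0]),
    "gene_B": ([6.5, 7.5, 6.0, 7.0], [6.8, 7.2, 6.4, 7.1]),
}

results = {}
for name, (control, treated) in genes.items():
    log_fc = statistics.fmean(treated) - statistics.fmean(control)
    t, df = welch_t(treated, control)
    results[name] = (log_fc, t, df)
    print(f"{name}: log2 fold change={log_fc:.2f}, t={t:.2f}, df={df:.1f}")
```

In this toy data gene_A has a large fold change with tight replicates, while gene_B has noisy replicates and a small change, so its t-statistic is far smaller. That is the sense in which replicates let you ask whether a fold change is significant.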

Page 8

Bioinformatics in Microarray Experiment (workflow diagram repeated)

Page 9

Pattern Discovery & Class Prediction

Explore how genes or samples group:

Clustering

• HIERARCHY: Hierarchical Cluster Analysis

• PARTITION: K-Means, Self Organising Maps (SOM), Fuzzy ART

• REDUCTION: Principal Components Analysis (PCA), Multidimensional Scaling (MDS), Correspondence Analysis (CoA)

Assign genes to known groupings:

Classification

• logistic regression

• neural networks

• linear discriminant analysis

Page 10

Hierarchical Cluster Analysis
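The slide itself is only a figure, so as a hedged illustration here is a small agglomerative clustering sketch. Average linkage and Euclidean distance are assumptions made for the example, and the four gene profiles are invented. The returned merge list is the information a dendrogram would draw.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (members_a, members_b, distance)."""
    clusters = [[i] for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the two clusters
                d = sum(euclidean(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical expression profiles (rows = genes, columns = conditions)
profiles = [
    [1.0, 2.0, 3.0],   # gene 0
    [1.1, 2.1, 2.9],   # gene 1
    [5.0, 1.0, 0.5],   # gene 2
    [5.2, 0.9, 0.4],   # gene 3
]

merges = average_linkage(profiles)
for a, b, d in merges:
    print(a, b, round(d, 3))
```

Unlike the partitioning methods, nothing here needs the number of clusters in advance; the tree can be cut at whatever height is useful.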

Page 11

Partitioning Clustering Methods

• Need to Tell the Method the Number of Clusters

• Genes Partitioned into Clusters

• What are Relationships Between Clusters?

K-Means & SOM
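A minimal K-Means sketch shows why the number of clusters has to be supplied: it is fixed by the number of starting centroids. The data and the choice of starting centroids are hypothetical, and SOM is not shown.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm with caller-supplied starting centroids (k = len(centroids))."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each gene joins its nearest centroid
        labels = [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical expression profiles (rows = genes, columns = conditions)
points = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [5.0, 1.0, 0.5],
    [5.2, 0.9, 0.4],
]

# k = 2 is chosen by the user, here by seeding with two of the genes
labels, centroids = kmeans(points, [points[0], points[2]])
print(labels)
```

Each gene ends up with a single hard label, and the result says nothing about how the clusters relate to one another, which is the limitation the last bullet raises.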

Page 12

2D & 3D Mapping Methods

CoA

MDS

PCA

Data Projected onto 2 or 3 Dimensions

But… What are Cluster Boundaries?
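As one example of projecting data onto two dimensions, the sketch below computes the first two principal components with power iteration and deflation, using only the standard library. The function names, the fixed start vector and the data are assumptions for illustration. A numerical library's eigendecomposition or SVD would normally be used instead, and MDS and CoA are not shown.

```python
import math

def centre_columns(rows):
    """Subtract each column (condition) mean so the data are centred on the origin."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centred):
    n, d = len(centred), len(centred[0])
    return [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
            for i in range(d)]

def leading_component(cov, iterations=500):
    """Power iteration from a fixed start vector; returns (eigenvalue, unit vector)."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return eigenvalue, v

def project_2d(rows):
    """Project each row onto the first two principal components."""
    centred = centre_columns(rows)
    cov = covariance(centred)
    components = []
    for _ in range(2):
        eigenvalue, v = leading_component(cov)
        components.append(v)
        # Deflation: remove the variance already explained by this component
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(len(v))]
               for i in range(len(v))]
    return [[sum(x * c for x, c in zip(row, comp)) for comp in components]
            for row in centred]

# Hypothetical expression profiles (rows = genes, columns = conditions)
profiles = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [5.0, 1.0, 0.5],
    [5.2, 0.9, 0.4],
]

coords = project_2d(profiles)
for gene, (x, y) in enumerate(coords):
    print(gene, round(x, 3), round(y, 3))
```

The output is only a set of coordinates per gene. Nothing in it says where one group ends and the next begins, which is the cluster-boundary question the slide closes on.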

Page 13

Bioinformatics in Microarray Experiment (workflow diagram repeated)

Page 14

Annotation

Online Tools:

ARROGANT http://lethargy.swmed.edu/

DAVID http://apps1.niaid.nih.gov/david/

DRAGON http://207.123.190.10/dragon.htm

EASE http://apps1.niaid.nih.gov/david/

FANTOM http://www.gsc.riken.go.jp/e/FANTOM/

GoMiner http://discover.nci.nih.gov/gominer/

MatchMiner http://discover.nci.nih.gov/matchminer/

Onto-Express http://vortex.cs.wayne.edu/Projects.html

RESOURCERER http://pga.tigr.org/tigr-scripts/magic/r1.pl

Affymetrix GO http://www.affymetrix.com

Databases:

Gene Ontology http://www.geneontology.org/

OMIM http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM

LocusLink http://www.ncbi.nlm.nih.gov/LocusLink/

UniGene http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM

Page 15

My Research Interests

Pattern Discovery

Algorithm Development

Biologist-Friendly Software Tools

Take - 2D & 3D Mapping Methods

Methods - Define Cluster Boundaries

Make FUZZY

EAS-I

2D & 3D Visualisation Tools

Page 16

Cluster Boundaries

CoA

MDS

PCA

Page 17

Fuzzy Clustering

• Differs from standard clustering by assigning each gene a membership in every cluster

• Allows you to see the association of each gene within a cluster

• Can calculate the number of clusters in Partitioning methods (Fuzzy ART)

• Helps Combine Clusters

• Helps to clear Ambiguity
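The membership idea can be sketched with fuzzy c-means, which appears in the software's method list later in the talk. The fuzzifier m = 2, the starting centroids and the five gene profiles are assumptions chosen for illustration; the last gene is deliberately placed between the two groups.

```python
import math

def memberships(points, centroids, m=2.0):
    """Fuzzy c-means membership of every point in every cluster (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    table = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        if 0.0 in dists:
            # Point sits exactly on a centroid: give it full membership there
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            table.append([h / sum(hits) for h in hits])
            continue
        table.append([1.0 / sum((dj / dk) ** power for dk in dists) for dj in dists])
    return table

def fuzzy_c_means(points, centroids, m=2.0, iterations=50):
    """Alternate membership and centroid updates for a fixed number of iterations."""
    centroids = [list(c) for c in centroids]
    for _ in range(iterations):
        u = memberships(points, centroids, m)
        for j in range(len(centroids)):
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            centroids[j] = [sum(w * p[d] for w, p in zip(weights, points)) / total
                            for d in range(len(points[0]))]
    return memberships(points, centroids, m), centroids

# Hypothetical expression profiles; the last gene lies between the two groups
points = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [5.0, 1.0, 0.5],
    [5.2, 0.9, 0.4],
    [3.0, 1.5, 1.7],
]

u, centroids = fuzzy_c_means(points, [points[0], points[2]])
for gene, row in enumerate(u):
    print(gene, [round(v, 3) for v in row])
```

Genes near a cluster centre get a membership close to 1 in that cluster, while the in-between gene gets intermediate values in both. That is how memberships expose ambiguous genes and suggest which clusters might be combined.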

Page 18

Fuzzy Mapping

Add Membership values of each gene to clusters

Page 19

Fuzzy Partitioning

K-Means & SOM

Page 21

• Need for Comprehensive Pattern Discovery Software Suite

• Fuzzy Data Analysis Suite

• Visualisation Tools to explore data

• Easy to use

• Free

Microarray Pattern Discovery

BBUnit

• Web based version

• Service by BBU

• Increase traffic to BBU web site

• Establish BBU for microarray

• Cross platform

Page 22

INTERFACE

Normalisation
• Log
• Normalise
• Mean Centre
• Median Centre

Differential Gene Expression
• T test
• ANOVA
• Regression

Pattern Discovery
• Hierarchical Cluster Analysis
• SOM
• K-Means
• Fuzzy ART
• PCA
• MDS
• CoA
• Fuzzy C-Means

Utilities

Page 30

Contact

http://bbu.uwcm.ac.uk

[email protected]

Page 31

Acknowledgements

• Pete Kille

• Alan Clarke

• Gareth Hughes (EASI team)

• Karen Reed (Data)

• Lesley Jones (Data & EASI Collaborator)

• BBU