A blind search for patterns: Unravelling low replicate data

Upload: joelle

Post on 10-Jan-2016


TRANSCRIPT

Page 1: A blind search for patterns

A blind search for patterns: Unravelling low replicate data

Page 2: A blind search for patterns

ExSpec Pipeline

Page 3: A blind search for patterns

Data: Structure and variability

Structure: Between 500 and 10,000+ features

Each feature has an associated ion count for each aligned sample.

Data is not normally distributed.

Variability: Up to 30% technical variability

Each feature is affected differently

Page 4: A blind search for patterns

Data: Structure and variability

Page 5: A blind search for patterns

Data: Structure and variability

The majority of features that are detected are singletons.

Page 6: A blind search for patterns

Low Replicate data

“Suck it and see”: One-off projects

Pump-priming projects

Medical samples: Biopsy

Difficult to access

Ecological data: Resampling is difficult

Page 7: A blind search for patterns

Methods

Fingerprinting

PCA

Basic scoring

PDE model

Gradient search

Differential analysis

Page 8: A blind search for patterns

PCA

Very simple

Can be highly informative: depends on the data

Used in pipeline: data quality
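
As a rough illustration of the PCA step, the following is a minimal sketch and not the ExSpec pipeline's own code. It assumes a samples-by-features matrix of ion counts; the log transform, the function name `pca_scores` and the toy data are all assumptions made for this example.

```python
import numpy as np

def pca_scores(counts, n_components=2):
    """Project samples onto their leading principal components.

    counts: array of shape (n_samples, n_features) of non-negative ion counts.
    """
    # Assumption: a log transform to reduce the skew of raw ion counts.
    x = np.log1p(np.asarray(counts, dtype=float))
    # Centre each feature so the components describe variation around the mean.
    x = x - x.mean(axis=0)
    # SVD of the centred matrix; rows of vt are the principal axes.
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components].T          # shape (n_features, n_components)
    explained = s**2 / np.sum(s**2)         # fraction of variance per component
    return scores, loadings, explained[:n_components]

# Toy data (hypothetical): 6 samples, 50 features.
# The last 3 samples have 10 features with elevated counts.
rng = np.random.default_rng(0)
counts = rng.poisson(100, size=(6, 50)).astype(float)
counts[3:, :10] *= 5
scores, loadings, explained = pca_scores(counts)
```

Plotting the first two columns of `scores` gives the usual PCA scatter, which can be used both to look for sample groupings and as a quick data quality check.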

Page 9: A blind search for patterns

Brno Project

Samples: Human biopsy

Replication: biopsy cut into equal parts

PCA Analysis

Page 10: A blind search for patterns

N group: Non-cancer biopsy

T group: Cancer biopsy

Using PCA clustering we are able to distinguish between healthy and sick patients

PCA Analysis

Page 11: A blind search for patterns

PCA revealed profile similarity which correlated with biological evidence

PCA Analysis

Page 12: A blind search for patterns

PCA Analysis

Human Urine project
• 22 patients sampled
• 11 healthy and 11 sick patients
• Sample labels dropped

Page 13: A blind search for patterns

PCA Analysis

Ecological Data

Large number of samples without clear replication.

Page 14: A blind search for patterns

PCA Analysis

Cluster pattern: Find the features which hold the cluster pattern
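
One common way to find the features that hold a cluster pattern is to rank them by the size of their loading on the component that separates the clusters. The sketch below assumes that approach; the slides do not say which method was actually used, and `top_loading_features` and the toy data are hypothetical.

```python
import numpy as np

def top_loading_features(data, component=0, k=5):
    """Return indices of the k features with the largest absolute loading."""
    x = np.asarray(data, dtype=float)
    x = x - x.mean(axis=0)                      # centre each feature
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    weights = np.abs(vt[component])             # loading magnitude per feature
    return np.argsort(weights)[::-1][:k]        # largest first

# Toy data (hypothetical): 8 samples, 30 features, low background noise.
rng = np.random.default_rng(1)
data = rng.normal(0.0, 0.1, size=(8, 30))
data[4:, [2, 7, 11]] += 3.0                     # three features carry the split
top = top_loading_features(data, component=0, k=3)
```

Here the three shifted features dominate the first component, so they come out at the top of the ranking.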

Page 15: A blind search for patterns

PCA Analysis

Using PCA and profile similarity analysis, a subset of features of interest was found

Page 16: A blind search for patterns

Basic Scoring

Use Z-score to sort data

Use this to pull out important features.

Control – Exp

With a two-class problem we can use PDE modelling.
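
A minimal sketch of Z-score sorting for a control versus experiment comparison follows. It assumes one value per feature in each condition and standardises the per-feature difference across all features; that reading of "Control – Exp" is an assumption, as are the function name `zscore_rank` and the toy values.

```python
import numpy as np

def zscore_rank(control, experiment):
    """Rank features by the z-score of their experiment-minus-control difference."""
    diff = np.asarray(experiment, dtype=float) - np.asarray(control, dtype=float)
    sd = diff.std(ddof=1)
    if sd == 0:
        z = np.zeros_like(diff)                 # no spread: nothing stands out
    else:
        z = (diff - diff.mean()) / sd
    order = np.argsort(np.abs(z))[::-1]         # most extreme features first
    return z, order

# Toy ion counts (hypothetical): one value per feature in each condition.
control = np.array([100.0, 98.0, 105.0, 101.0, 99.0, 102.0])
experiment = np.array([101.0, 97.0, 104.0, 160.0, 100.0, 101.0])
z, order = zscore_rank(control, experiment)
```

The first entries of `order` index the features whose change is furthest from the typical change, which is one simple way to pull out candidates for closer inspection.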

Page 17: A blind search for patterns

Basic Scoring : PDE modelling

Multi-class problem

Plants: Wild type

act ko mutant

Treatments: Normal light

High light

Page 18: A blind search for patterns

Gradient Analysis

Use rate of change of abundance to mine data for specific trends

Find features of interest

Use PDE modelling of rates

Page 19: A blind search for patterns

Gradient Analysis

Mining for features which showed rapid increase due to a specific treatment
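
The rate-of-change idea can be sketched as follows, assuming abundances measured at known time points after a treatment. This is an illustration only: it uses finite differences via `numpy.gradient` and does not attempt the PDE modelling of rates mentioned in the slides. The function name `steepest_features` and the toy series are hypothetical.

```python
import numpy as np

def steepest_features(abundance, times, k=3):
    """Find the k features with the largest rate of increase.

    abundance: array of shape (n_timepoints, n_features).
    times: array of shape (n_timepoints,), may be unevenly spaced.
    """
    a = np.asarray(abundance, dtype=float)
    t = np.asarray(times, dtype=float)
    rates = np.gradient(a, t, axis=0)           # d(abundance)/d(time) per feature
    peak = rates.max(axis=0)                    # steepest rise seen for each feature
    return rates, np.argsort(peak)[::-1][:k]

# Toy time series (hypothetical): flat, slow rise, rapid rise, decline.
times = np.array([0.0, 1.0, 2.0, 4.0])
abundance = np.array([
    [10.0, 10.0,  10.0, 50.0],
    [10.0, 12.0,  40.0, 40.0],
    [10.0, 14.0,  90.0, 30.0],
    [10.0, 18.0, 200.0, 10.0],
])
rates, top = steepest_features(abundance, times, k=2)
```

Ranking by the peak rate favours features with a sharp rise at any point; ranking by the mean rate instead would favour sustained increases.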

Page 20: A blind search for patterns

Data Provided by:

Brno: Ted Hupp, Rob O’Neill

Urine study: Steve Michell, John Mcgrath

Ecological data: Dave Hodgson, Nicole Goody

Gradient analysis: John Love

Data scoring: Nicholas Smirnoff, Mike Page

Page 21: A blind search for patterns

Metabolomics and Proteomics Mass Spectrometry Facility @ The University of Exeter

Nick Smirnoff (Director of Mass Spectrometry) [email protected]

Hannah Florance (MS Facility Manager) [email protected]

Venura Perera (Bioinformatics and Mathematical Support) [email protected]

http://biosciences.exeter.ac.uk/facilities/spectrometry/

http://bio-massspeclocal.ex.ac.uk/

Page 22: A blind search for patterns

About me

Background: Applied Maths

Untargeted metabolite profiling

Research interests: Data driven modelling

Small molecule profiling

Gene regulatory network modelling

Application of mathematical methods

Metabolite identification using LC-MS/MS