Many Sample Size and Power Calculators Exist On-Line (1 Day 3 Sessions 16-21 JCF.pdf)

http://homepage.divms.uiowa.edu/~rlenth/Power/ Many Sample Size and Power Calculators Exist On-Line

Post on 30-May-2020


http://homepage.divms.uiowa.edu/~rlenth/Power/

Many Sample Size and Power Calculators Exist On-Line


Day 3, Session 16: Questions and follow-up…

James C. Fleet, PhD, Distinguished Professor, Department of Nutrition Science

Pete Pascuzzi, PhD, Assistant Professor, Purdue Libraries


Day 3, Session 17: Visualization III: Networks

James C. Fleet, PhD, Distinguished Professor, Department of Nutrition Science

Pete Pascuzzi, PhD, Assistant Professor, Purdue Libraries


Creixell et al. (2015) Nature Methods 12:615

Pathways vs. Networks

Pathway:
• Small scale
• Well-studied
• Known linear relationship
• Easily visualized and interpreted

Network:
• Large scale
• Integration of multiple studies
• Hard to visualize and interpret
• Contain novel information not covered in pathways

Both aggregate molecular events across multiple genes. This makes it more likely that a signal passes the statistical detection threshold, while reducing the number of hypotheses tested.
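One way to see the effect of testing fewer hypotheses is to compare per-test significance cutoffs. The sketch below uses a Bonferroni correction, which is an assumption made for illustration (the slide does not name a correction method), and the gene and pathway counts are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests

# Hypothetical counts, chosen only for illustration (not from the slides).
N_GENES = 20000
N_PATHWAYS = 300

gene_cutoff = bonferroni_threshold(0.05, N_GENES)
pathway_cutoff = bonferroni_threshold(0.05, N_PATHWAYS)

print(f"gene-level cutoff:    {gene_cutoff:.2e}")
print(f"pathway-level cutoff: {pathway_cutoff:.2e}")
```

With fewer tests, each pathway-level test faces a much less stringent cutoff than each gene-level test.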

Fleet2016


Patterns of Regulation in Genomic Data: Guilt by Association

• Human primary fibroblast cultures

• Serum starvation and refeeding

• 9600 transcripts, spotted cDNA array

• Hierarchical clustering

* Genes in common cluster = common molecular regulation?

Iyer et al. (1999) Science 283:83
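As a rough illustration of hierarchical clustering of expression profiles, here is a minimal pure-Python sketch. It assumes average linkage on a distance of 1 minus the Pearson correlation, which is a common choice but not necessarily the exact procedure used by Iyer et al. The gene names and values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (fails on constant profiles)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def hierarchical_clusters(profiles, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    names = list(profiles)
    # Pairwise distance between genes: 1 - Pearson r.
    dist = {
        frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
    clusters = [[name] for name in names]

    def average_linkage(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters

# Hypothetical time-course profiles: two genes rise together, two fall together.
profiles = {
    "geneA": [0.0, 1.0, 2.0, 3.0],
    "geneB": [0.1, 1.2, 1.9, 3.2],
    "geneC": [3.0, 2.0, 1.0, 0.0],
    "geneD": [2.9, 2.2, 0.8, 0.1],
}
result = hierarchical_clusters(profiles, 2)
print(result)
```

Genes that land in the same cluster share an expression pattern. Whether they also share molecular regulation is the open question the slide raises.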


Creixell et al. (2015) Nature Methods 12: 615

• Gene set enrichment: simple, but discards known biological network information

• Subnetwork construction and clustering

• Network-based modeling
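Gene set enrichment is often implemented as an over-representation test. The sketch below assumes a one-sided hypergeometric test, a generic choice that is not claimed to be the specific method in Creixell et al. All counts in the example are hypothetical.

```python
from math import comb

def enrichment_p_value(n_universe, n_in_set, n_deg, n_overlap):
    """P(overlap >= n_overlap) if n_deg genes were drawn at random from the universe.

    n_universe: all genes measured
    n_in_set:   genes annotated to the gene set (e.g. a pathway)
    n_deg:      differentially expressed genes
    n_overlap:  differentially expressed genes that are in the gene set
    """
    upper = min(n_in_set, n_deg)
    total = comb(n_universe, n_deg)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_deg - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: 20000 genes measured, a 100-gene set,
# 500 differentially expressed genes, 10 of them in the set.
p = enrichment_p_value(20000, 100, 500, 10)
print(f"enrichment p-value: {p:.2e}")
```

The test treats the gene set as an unordered list, which is why this approach is simple but ignores how the genes interact in a network.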


Networks Integrate Information

Dong and Han (2008) Cell Res 18:224


http://cytoscape.org/

……an open source software tool for integrating, visualizing, and analyzing data in the context of networks.

Cytoscape does not build the primary network from your dataset.


Data Format: *.txt or Excel (1st worksheet only)

Valid expression value type    Expected values
Ratio                          (0, +INF)
Fold Change                    (-INF, -1) (1, +INF)
Log Ratio                      (-INF, +INF)
p-value                        (0, 1)
FDR, q-value                   (0, 100)
Intensity                      (0, +INF)
RPKM/FPKM                      (0, +INF)

(INF = infinity)

Header row: Identifier, G1 Value, G1 FC, G1 FDR, G2 Value, G2 FC, G2 FDR

Up to 20 observations per treatment/group

* IPA can average

Stats done prior to IPA
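Because the statistics are done before upload, it can help to check values against the expected ranges first. The following is a minimal sketch of such a check, plus the usual conversion from a ratio to a signed fold change. The dictionary keys and function names are hypothetical and are not part of IPA.

```python
import math

# Expected ranges as listed on the slide; each entry is a list of open intervals.
# The keys are hypothetical labels, not IPA identifiers.
EXPECTED_RANGES = {
    "ratio": [(0, math.inf)],
    "fold_change": [(-math.inf, -1), (1, math.inf)],
    "log_ratio": [(-math.inf, math.inf)],
    "p_value": [(0, 1)],
    "fdr": [(0, 100)],
    "intensity": [(0, math.inf)],
    "rpkm_fpkm": [(0, math.inf)],
}

def is_valid(value_type, value):
    """True if value lies strictly inside one of the intervals for value_type.

    The slide shows open intervals and does not say whether endpoints
    (e.g. a fold change of exactly 1) are accepted, so they are rejected here.
    """
    return any(low < value < high for low, high in EXPECTED_RANGES[value_type])

def ratio_to_fold_change(ratio):
    """Convert a treatment/control ratio to a signed fold change.

    Ratios below 1 become negative fold changes (0.5 -> -2.0).
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else -1 / ratio

print(ratio_to_fold_change(0.5))
print(is_valid("fold_change", 0.5))
print(is_valid("fold_change", -2.0))
```

A fold change between -1 and 1 is flagged as invalid, which is a common sign that ratios were uploaded under the wrong value type.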


Ingenuity Network Analysis

Inputs: Expression Dataset, IPA Knowledge Base

• Network Generation: DEG placed into networks (genes in network)

• Network Scoring: probabilistic fit of DEG to networks (scored genes)

• Associated Functions: DEG involved in a biological function

DEG = differentially expressed genes


Wednesday

BREAK #1


Day 3, Session 18: Flexible time for reinforcement…

James C. Fleet, PhD, Distinguished Professor, Department of Nutrition Science

Pete Pascuzzi, PhD, Assistant Professor, Purdue Libraries


Day 3, Session 19: Tour of Data Center and Conte Cluster

Patrick Finnegan, Hardware Engineer, Purdue University


Day 3, Session 20: Introduction to NGS

James C. Fleet, PhD, Distinguished Professor, Department of Nutrition Science

Pete Pascuzzi, PhD, Assistant Professor, Purdue Libraries


Technology:
• Genomics: WGS, WES
• Transcriptomics: RNA-Seq
• Epigenomics: Bisulfite-Seq, ChIP-Seq

Data:
• Genomics: Indels, SNP, CNV, Structural
• Transcriptomics: DGE, Fusion, Splicing, Editing
• Epigenomics: Methyl DNA, Histones, TF binding

Analysis:
• Functional effect of mutation
• Network + pathway analysis
• Integrative analysis

Integration and interpretation:
• Discovery and Application

Modified from Shyr D, Liu Q. (2013) Biol Proced Online 15:4


NGS Sequencing Pipeline

• Library preparation: Input → Fragment → Add adapter → Fragment library

• Library amplification

• Parallel sequencing → Reads (Read 1, Read 2)

Voelkerding et al. (2010) J Mol Diagn 12:539-51


Sims et al. (2014) Nat Rev Genet 15:121

How Much Sequencing is Enough?

[Figure: # Reads/Rxn vs. Read Length]

Target             Coverage    # reads
DNA variation      10-30X      N/A
ChIP-seq           100X        N/A
RNA-seq (DEG)      N/A         20 million
RNA-seq (rare)     N/A         100+ million

https://genohub.com/next-generation-sequencing-guide/
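Coverage targets such as 10-30X can be turned into a read count with the standard estimate: coverage = number of reads × read length / target size. The sketch below applies that formula. The genome size is an assumed round figure for a human genome, and the estimate ignores duplicates, unmapped reads, and uneven coverage.

```python
import math

def expected_coverage(n_reads, read_length, target_size):
    """Average depth: total sequenced bases divided by the size of the target."""
    return n_reads * read_length / target_size

def reads_needed(coverage, read_length, target_size):
    """Number of reads required to reach the requested average coverage."""
    return math.ceil(coverage * target_size / read_length)

# Assumed values for illustration only.
HUMAN_GENOME_BP = 3_100_000_000  # approximate size, an assumption
READ_LENGTH_BP = 100

n_reads = reads_needed(30, READ_LENGTH_BP, HUMAN_GENOME_BP)
print(f"reads for 30X: {n_reads:,}")
print(f"coverage check: {expected_coverage(n_reads, READ_LENGTH_BP, HUMAN_GENOME_BP):.1f}X")
```

For paired-end runs, each mate counts as one read of the given length in this calculation.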


http://apps.bioconnector.virginia.edu/covcalc/

Stephen Turner, PhD, Director, University of Virginia Bioinformatics Core


Understanding RNA-seq


Modified from Wang et al. (2009) Nat Rev Genet 10:57

Correlation between RNA-seq and Microarray Analysis

Analysis from two different S. cerevisiae papers using the same growth conditions

[Figure: scatter plot, Tiling Array (log2) vs. RNA-seq (log2)]

Zhao et al. (2014) PLOS One, activated T cells

[Figure: scatter plot, Array (log2) vs. RNA-seq (log2)]
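A comparison like the ones in these scatter plots can be summarized with a Pearson correlation of log2-transformed values. The sketch below shows the calculation on hypothetical numbers. It does not use data from either study.

```python
from math import log2, sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical measurements for the same five genes on both platforms.
array_signal = [120.0, 480.0, 950.0, 4100.0, 15800.0]
rnaseq_fpkm = [1.5, 6.2, 11.0, 55.0, 190.0]

# Both platforms are compared on the log2 scale, as in the plots.
log_array = [log2(v) for v in array_signal]
log_rnaseq = [log2(v) for v in rnaseq_fpkm]

r = pearson_r(log_rnaseq, log_array)
print(f"Pearson r (log2 scale): {r:.3f}")
```

Values must be positive before taking log2, so zero counts need a pseudocount or filtering first.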


https://bioinfomagician.wordpress.com/2014/01/28/rna-seq-vs-microarray-what-is-the-take/

RNA-seq vs. Microarray: Which is “better”?

Issue                     Microarray                        RNA-seq
Reproducibility           High                              High
Dynamic Range             Modest                            Wide
Sensitivity               Low/Medium                        High
Accuracy                  High                              High (but better for FC)
Cost                      Low                               High
Complexity of analysis    Low                               High
Species                   Limited to available platforms    Any species possible


Wednesday

BREAK #2


Day 3, Session 21: Visualization IV: IGV

James C. Fleet, PhD, Distinguished Professor, Department of Nutrition Science

Pete Pascuzzi, PhD, Assistant Professor, Purdue Libraries


Data for Visualization Lives Here……


Types of Files Commonly Used

File type       Description
SAM             Tab-delimited text file of sequence alignment data (i.e. primary read data)
BAM             Binary version of the SAM file
bedGraph        Display of continuously valued data (e.g. transcriptome)
Wiggle (Wig)    Display of continuously valued data (e.g. transcriptome)
bigWig          Displays dense continuous data from Wig or bedGraph files for faster viewing
BED             Tiled data file that defines a feature track
TDF             Binary tiled data file that has been preprocessed for faster displays in IGV (e.g. for ChIP- and RNA-seq data)
narrowPeak      Called peaks of signal enrichment based on pooled, normalized data

http://www.broadinstitute.org/igv/FileFormats
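To make the SAM entry concrete, here is a minimal sketch that splits one alignment line into the 11 mandatory tab-delimited fields of the SAM format. The example read is made up, and optional tag fields after column 11 are ignored. Real pipelines would normally use an established library rather than hand-written parsing.

```python
from typing import NamedTuple, Optional

class SamRecord(NamedTuple):
    qname: str  # read name
    flag: int   # bitwise flag
    rname: str  # reference sequence name (e.g. chromosome)
    pos: int    # 1-based leftmost mapping position
    mapq: int   # mapping quality
    cigar: str  # alignment description
    rnext: str  # reference name of the mate
    pnext: int  # position of the mate
    tlen: int   # observed template length
    seq: str    # read sequence
    qual: str   # base qualities

def parse_sam_line(line: str) -> Optional[SamRecord]:
    """Parse one SAM line; header lines (starting with '@') return None."""
    if line.startswith("@"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    return SamRecord(
        fields[0], int(fields[1]), fields[2], int(fields[3]), int(fields[4]),
        fields[5], fields[6], int(fields[7]), int(fields[8]), fields[9], fields[10],
    )

# Made-up alignment line for illustration.
example = "read1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"
record = parse_sam_line(example)
is_mapped = not (record.flag & 0x4)  # flag bit 0x4 marks an unmapped read
print(record.rname, record.pos, record.cigar, is_mapped)
```

Note that SAM positions are 1-based, while BED coordinates are 0-based with an exclusive end, so converting between the two requires an offset.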


IGV Displays Various File Types

Mouse Large Intestine DNase-Seq Data from ENCODE

[Figure: IGV tracks showing chromosomal location, BroadPeaks, NarrowPeaks, bigWig, BAM, and RefSeq genes]


Visualizing RNA-seq Data

Cell line

Tissue

?