Michael Reich, GenomeSpace Workshop, fged_seattle_2013

Upload: functional-genomics-data-society

Post on 10-May-2015

DESCRIPTION

GenomeSpace Workshop: Intro, tools and recipes, Integrative analysis exercise, development

TRANSCRIPT

Page 1: Michael Reich, GenomeSpace Workshop, fged_seattle_2013
Page 2

Outline

• Introduction to GenomeSpace
• GenomeSpace Tools and Recipes
• GenomeSpace User Interface
• Integrative analysis exercise
• Other GenomeSpace Tools
• GenomeSpace development
• Q and A

Page 3

The vision: Integrative Translational Genomics

[Figure: a worked analysis scenario moving between GenePattern, Cytoscape, IGV/UCSC, and Genomica, drawing on network, compendium, expression, and alteration data (the example sequences atcgcgtttattcgataagg and atcgcgttttttcgataagg differ at one base) and on CMAP. The diagram numbers its steps 1-8 and i-vi. Recoverable step labels: Idea; Differentially Expressed Genes; GSEA test enrichment; Show network; Expand +1 (include neighbors); Show Chromosome; Add Transcription Factor track from UCSC; Looks close to p53 site; Test for similarity of p53 and gene location; Load compendium; Show module map; Extract module; Learn p53 site/score on promoter; Arrests G2/M; Pathway activation; Added to GenePattern; Conclusion.]

Page 4

Driving Biological Projects

lincRNAs
Cancer stem cells
Patient Stratification
Outreach to new DBPs

Seed Tools

Cytoscape, Galaxy
GenePattern, Genomica
IGV, UCSC Browser
Outreach to new tools

Online community to share diverse computational tools

www.genomespace.org

Page 5

GenomeSpace: a connection layer between integrative analysis tools

• Support for all types of resource: Web-based, desktop, etc.
• Automatic conversion of data formats between tools
• Easy access to data from any location
• Ease of entry into the environment
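As a rough illustration of what converting a data format between tools can involve, the sketch below turns a GenePattern GCT expression matrix into a plain tab-delimited table. It is not GenomeSpace's converter; the function name `gct_to_tsv` and the sample data are hypothetical, and the only assumption about the input is the GCT 1.2 layout (a `#1.2` line, a dimensions line, then a header row starting with Name and Description).

```python
def gct_to_tsv(gct_text):
    """Convert GCT 1.2 text to a tab-delimited table (hypothetical helper)."""
    lines = gct_text.strip().splitlines()
    if not lines or lines[0].strip() != "#1.2":
        raise ValueError("not a GCT 1.2 file")
    # Line 2 holds the number of data rows and the number of sample columns.
    n_rows, n_cols = (int(x) for x in lines[1].split("\t")[:2])
    header = lines[2].split("\t")
    rows = [line.split("\t") for line in lines[3:]]
    if len(rows) != n_rows or len(header) - 2 != n_cols:
        raise ValueError("dimensions do not match the GCT header")
    # Drop the Description column (index 1) and keep names plus values.
    out = ["\t".join([header[0]] + header[2:])]
    out += ["\t".join([row[0]] + row[2:]) for row in rows]
    return "\n".join(out) + "\n"


# Made-up two-gene, two-sample matrix, for illustration only.
sample_gct = (
    "#1.2\n"
    "2\t2\n"
    "Name\tDescription\tS1\tS2\n"
    "geneA\tdescA\t1.0\t2.0\n"
    "geneB\tdescB\t3.0\t4.0\n"
)
converted = gct_to_tsv(sample_gct)
print(converted)
```

The dimension check is there so that a truncated or mislabeled file fails loudly rather than being passed silently to the next tool.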

Page 6

GenomeSpace-Enabled Tools

Integrative Genomics Viewer, Cytoscape, Galaxy, GenePattern

GenomeSpace Components

GenomeSpace Server

Authentication and Authorization

Data Manager

Analysis and Tool Manager

GenomeSpace Project Data

Page 7

Register

www.genomespace.org

Page 8

Register

Page 9

Register

Page 10

Register

Page 11

Login

Page 12

Login

Page 13

GenomeSpace UI

Page 14

Tools and Recipes

Focus on Kitchen Skills

Page 15

Agenda

• Review of GenomeSpace tools in the first exercises
• Basic recipes for using GenomeSpace
  – Launching tools
  – Uploading data to GenomeSpace
  – Sending data to tools

Page 16

GenomeSpace Tools

ArrayExpress
Galaxy
Cistrome
Cytoscape
GenePattern
Genomica
ISAcreator
geWorkbench
Gitools
IGV
InSilicoDB
UCSC Table Browser
MSigDB

Page 17

Cytoscape

Cytoscape is an open-source bioinformatics software platform for visualizing molecular interaction networks and biological pathways, and integrating these networks with annotations, gene expression profiles, and other state data.

Page 18

Galaxy

Galaxy is an open-source, scalable framework for tool integration that allows users to analyze multiple alignments, compare genomic annotations, and profile metagenomic samples, among many possible analyses; workflows allow the linking together of analyses.

Page 19

Genomica

Genomica is an analysis and visualization tool for genomic data that can integrate gene expression data, DNA sequence data, and gene and experiment annotation information.

Page 20

GenePattern

GenePattern is a powerful genomic analysis platform that provides access to more than 150 tools for gene expression analysis, proteomics, SNP analysis, flow cytometry, RNA-seq analysis, and common data processing tasks. A web-based interface provides easy access to these modules and allows for the creation of multi-step analysis pipelines that enable reproducible in silico research.

Page 21

ArrayExpress

ArrayExpress is a repository of over 30,000 functional genomics experiments comprising nearly 1 million assays. Users can query and retrieve data in a number of different formats including the MIAME and MINSEQE standards.

Page 22

geWorkbench

geWorkbench is an open-source bioinformatics platform that offers a comprehensive and extensible collection of tools for the management, analysis, visualization, and annotation of biomedical data. For microarrays, there are tools for filtering and normalization, basic statistical analyses, clustering, network reverse engineering, as well as many common visualization tools.

Page 23

Cistrome

In addition to the standard Galaxy functions, Cistrome has 29 ChIP-chip- and ChIP-seq-specific tools in three major categories, from preliminary peak calling and correlation analyses, to downstream genome feature association, gene expression analyses, and motif discovery.

Page 24

Gitools

• Gitools is a framework for analysis and visualization of genomic data using interactive heatmaps.

Page 25

Integrative Genomics Viewer (IGV)

The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.

Page 26

InSilicoDB

InSilico DB is a web-based genomics data manager containing thousands of curated public datasets. The datasets can be exported to analysis tools and GenomeSpace.

Page 27

UCSC Table Browser

The Table Browser allows you to retrieve data associated with a track in text format, to calculate intersections between tracks, and to retrieve DNA sequence covered by a track. After you select the options for your output file, you can opt to send your output file to your GenomeSpace cloud storage.

Page 28

Basic GenomeSpace recipes

• Uploading data
• Launching tools
• Transitioning across tools

Page 29

Uploading Data

[Screenshot with numbered callouts 1, 2, and 3]

Page 30

Launching tools

Click on the tool’s icon

Page 31

Launching tools

Open the tool’s context menu

Then click on Launch (or Launch on File)

Page 32

Launching tools

Click the checkbox for one (or more) files

Then click on one of the files to get the Launch menu and pick your tool

Page 33

Launching tools

Then click the Launch button

Click and drag a file onto a tool icon

Page 34

Transitioning across tools

1. Launch Genomica
   - Load (shared) data from GenomeSpace
   - Save it back to a new folder
2. Launch GenePattern on your data
   - Do a simple processing step
   - Save it back to GenomeSpace
   - Send it to IGV
3. Visualize the processed data in IGV

Page 35

Launch Genomica

• Using one of the options you saw earlier
  – Click on the icon
  – or use the context menu
  – or use the launch menu
• Load data from GenomeSpace

Home ▸ Public ▸ SharedData ▸ Demos ▸ Scenario ▸ step3 ▸ 80_module.gxp

Or

Home ▸ Shared to <your id> ▸ mmr ▸ FGED ▸ 80_module.gxp

Page 36

Loading into Genomica

Home ▸ Shared to <your id> ▸ mmr ▸ FGED ▸ 80_module.gxp

Page 37

Saving Back to GenomeSpace

Page 38

Launching GenePattern

• You can do this from within Genomica or also from the GenomeSpace interface
• Select “PreprocessDataset” in the send to module

Page 39

Process the data

• Run PreprocessDataset with default parameters

Page 40

Save the result

Use the context menu for the file on either the job result page …

Page 41

Save the result

…or the context menu for the file on the GenePattern home page.

Page 42

Saving to GenomeSpace

Click “Save to GenomeSpace” from the context menu and then select a target directory

Page 43

Send to IGV

• In the GenomeSpace interface, launch IGV
  – Open the ‘GenomeSpace’ menu and select ‘Load from GenomeSpace’

Page 44

Select your file (from GenePattern)

Page 45

Visualize in IGV

Page 46

GenomeSpace UI

A detailed tour of the GenomeSpace User Interface

Page 47

Agenda

• File Management
• File operations
• Sharing with others
• Organizing your tools

Page 48

File Management

• Move a file or directory
• Copy …
• Deleting …
• Creating subdirectories
• Recent uploads

Page 49

File Operations

• Previewing a file
• Extracting rows and/or columns
• Format conversion

Page 50

File Preview

Page 51

Extracting Rows and/or Columns

Page 52

Extracting rows and/or columns

• Check the columns you want to include
• Provide a first (and optionally last) row index to include
• Edit the file name and ‘Save’
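The same selection can be sketched outside the UI. This is a minimal illustration only, assuming a tab-delimited file with one header row and 1-based, inclusive row indices over the data rows; the function `extract_rows_and_columns` and the sample table are hypothetical and are not part of GenomeSpace.

```python
import csv
import io


def extract_rows_and_columns(text, columns, first_row, last_row=None):
    """Keep the named columns and data rows first_row..last_row (1-based, inclusive)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    keep = [header.index(name) for name in columns]
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([header[i] for i in keep])
    for row_number, row in enumerate(reader, start=1):
        if row_number < first_row:
            continue  # before the requested first row
        if last_row is not None and row_number > last_row:
            break  # past the optional last row
        writer.writerow([row[i] for i in keep])
    return out.getvalue()


# Made-up three-gene table, for illustration only.
sample = "Name\tS1\tS2\tS3\ngeneA\t1\t2\t3\ngeneB\t4\t5\t6\ngeneC\t7\t8\t9\n"
subset = extract_rows_and_columns(sample, ["Name", "S3"], first_row=2)
print(subset)
```

Leaving `last_row` unset keeps everything from `first_row` to the end of the file, matching the "optionally last" wording on the slide.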

Page 53

Sharing with others

• Sharing files with
  – Individuals, groups
• Creating groups for sharing
• Sharing links
  – With other GenomeSpace users
  – To people without GenomeSpace accounts

Page 54

Organizing tools

Drag and drop tools in the list to reorder them

Uncheck the tool to remove it from the toolbar

Page 55

Other GenomeSpace Tools

ArrayExpress
Galaxy
Cistrome
Cytoscape
GenePattern
Genomica
ISAcreator
geWorkbench
Gitools
IGV
InSilicoDB
UCSC Table Browser
MSigDB

Page 56

ArrayExpress

• Repository of over 30,000 gene expression and other functional genomics experiments comprising nearly 1 million assays.
• Query and retrieve data in a number of different formats including MIAME and MINSEQE.

Page 57

Cistrome

29 ChIP-chip and ChIP-seq tools, including:
• Preliminary peak calling
• Correlation analyses
• Downstream genome feature association
• Gene expression analyses
• Motif discovery

Page 58

Cytoscape

• Visualize molecular interaction networks and biological pathways
• Integrate networks with annotations, gene expression profiles, and other data

Page 59

Galaxy

Galaxy is an open-source, scalable framework for tool integration that allows users to analyze multiple alignments, compare genomic annotations, and profile metagenomic samples, among many possible analyses; workflows allow the linking together of analyses.

Page 60

geWorkbench

Analysis, visualization, and annotation of biomedical data, including:
• Microarray filtering, normalization, clustering, network reverse engineering
• Basic and advanced statistical methods
• Regulator analysis
• Common visualization tools
• Links to databases

Page 61

Gitools

Analysis and visualization of genomic data, including:
• Interactive heatmaps
• Enrichment analysis (e.g. of Gene Ontology terms)
• Import from Web-based data sources (IntOGen, BioMart)

Page 62

InSilicoDB

Web-based genomics data portal containing thousands of curated public datasets, including all of the Gene Expression Omnibus (GEO).

Page 63

UCSC Table Browser

• Query and retrieve genomic sequence data in text format
• Send data to GenomeSpace and other analysis and visualization tools
• Calculate intersections between genome tracks
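To make "intersections between genome tracks" concrete, here is a small sketch that reports the overlapping segments of two tracks. It assumes each track is a list of (chromosome, start, end) tuples with BED-style half-open coordinates; the function `intersect_tracks` and the coordinates are hypothetical, and this is not how the Table Browser itself is implemented.

```python
def intersect_tracks(track_a, track_b):
    """Return sorted overlapping segments of two lists of (chrom, start, end)."""
    # Group the second track by chromosome so only same-chromosome pairs are compared.
    by_chrom = {}
    for chrom, start, end in track_b:
        by_chrom.setdefault(chrom, []).append((start, end))
    overlaps = []
    for chrom, a_start, a_end in track_a:
        for b_start, b_end in by_chrom.get(chrom, ()):
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:  # half-open intervals that merely touch do not overlap
                overlaps.append((chrom, start, end))
    return sorted(overlaps)


# Made-up coordinates, for illustration only.
genes = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 80)]
peaks = [("chr1", 150, 350), ("chr2", 80, 120)]
shared = intersect_tracks(genes, peaks)
print(shared)
```

The pairwise comparison is quadratic per chromosome, which keeps the sketch short; sorted sweeps or interval trees are the usual choice for large tracks.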

Page 64

MSigDB

Molecular Signatures Database

• Query and retrieve a large compendium of gene sets, including regulatory, metabolic, and genomic pathways, genomic position-based gene sets, etc.
• Send data to GenomeSpace and other analysis and visualization tools
• Calculate overlap statistics between gene sets
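The slide does not say which overlap statistic is computed. One common choice, shown here purely as an assumption, is the hypergeometric tail probability of seeing at least the observed number of shared genes by chance. The function `overlap_p_value`, the gene names, and the universe size are all hypothetical.

```python
from math import comb


def overlap_p_value(set_a, set_b, universe_size):
    """Return (shared gene count, P(overlap >= observed)) under a hypergeometric model."""
    k = len(set_a & set_b)          # observed overlap
    size_a, size_b = len(set_a), len(set_b)
    total = comb(universe_size, size_b)
    # Sum the probability of every overlap size from the observed one upward.
    tail = sum(
        comb(size_a, i) * comb(universe_size - size_a, size_b - i)
        for i in range(k, min(size_a, size_b) + 1)
    )
    return k, tail / total


# Made-up gene sets drawn from an assumed universe of 10 genes.
set_a = {"TP53", "CDKN1A", "MDM2"}
set_b = {"CDKN1A", "MDM2", "BAX"}
shared_count, p_value = overlap_p_value(set_a, set_b, universe_size=10)
print(shared_count, round(p_value, 4))
```

The universe size has a large effect on the result, so in practice it should reflect the set of genes that could actually have appeared in either list.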