VizBi 2014 - Visualizing Genomic Variation
DESCRIPTION
This talk was given at the VizBi 2014 conference. See vizbi.org/2014
TRANSCRIPT
Visualizing Genomic Variation
Prof Jan Aerts
Faculty of Engineering - ESAT/STADIUS
iMinds Medical ICT Department
KU [email protected]://visualanalyticsleuven.be
What is genomic variation?
Aerts & Tyler-Smith, In: Encyclopedia of Life Sciences, 2009
“copy number variation”
transitions
transversions
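A minimal sketch of the transition/transversion distinction: a transition exchanges two purines (A, G) or two pyrimidines (C, T), while a transversion exchanges a purine for a pyrimidine. The function name `classify_substitution` is illustrative and does not come from the talk.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("expected single bases from A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    # Both bases in the same chemical class means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

For example, `classify_substitution("A", "G")` gives a transition and `classify_substitution("A", "C")` a transversion.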
Effects of variation on phenotype
• change in protein abundance
  • level of transcription or translation (loss/gain)
  • stability
• change in protein structure (partly deleted, fusion genes, …)
What are we interested in?
• multiple samples
• show all affected genes (or functional units)
• cluster individuals
• functional effect of structural variation
• gene-centric instead of positionally ordered: coordinate-free view
• high-level annotations (pathways, GO-terms)
• uncertainty (statistical & positional) and underlying evidence
DNA sequencing
read mapping
variant calling
what is effect of variant?
check signal
QC QC
variant filtering
Single Nucleotide Polymorphisms
General approach: reference-based
UCSC
Ensembl
Ferstay et al, IEEE InfoVis, 2013
Variant View: sequence variants in gene context
Integrative Genomics Viewer (IGV)
Sequence logo
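A sequence logo scales each alignment column by its information content, so conserved positions stand tall and variable ones shrink. The sketch below shows that calculation for DNA under simplifying assumptions: equal-length sequences, only A/C/G/T, no gaps, and no small-sample correction. The function names are illustrative.

```python
import math
from collections import Counter


def column_information(column: str) -> float:
    """Information content in bits of one DNA alignment column (max 2 bits)."""
    counts = Counter(column.upper())
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(4) - entropy


def logo_heights(sequences):
    """Per-position letter heights: information content times base frequency."""
    heights = []
    for column in zip(*(s.upper() for s in sequences)):
        info = column_information("".join(column))
        counts = Counter(column)
        total = len(column)
        heights.append({base: info * n / total for base, n in counts.items()})
    return heights
```

A fully conserved column gets the full 2 bits for its single letter; a column with all four bases at equal frequency gets height 0.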
Sequence Diversity Diagram
Structural Variation
dotplot
Pevzner & Tesler, Genome Research, 2003
read depth information: arrayCGH and next-generation sequencing
Xie & Tammi, BMC Bioinformatics, 2009
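The read-depth idea can be sketched as follows: count reads per fixed-size window in a sample and in a control, normalize by library size, and look at the log2 ratio. This is a naive illustration, not the method of the cited paper; the window size, the pseudocount, and the function names are all assumptions, and positions are taken to be 0-based read starts.

```python
import math


def window_counts(read_starts, chrom_length, window_size):
    """Count reads whose 0-based start falls in each fixed-size window."""
    n_windows = math.ceil(chrom_length / window_size)
    counts = [0] * n_windows
    for pos in read_starts:
        counts[pos // window_size] += 1
    return counts


def log2_ratios(sample_counts, control_counts, pseudocount=0.5):
    """Library-size-normalized log2(sample/control) per window."""
    sample_total = sum(sample_counts)
    control_total = sum(control_counts)
    if sample_total == 0 or control_total == 0:
        raise ValueError("both libraries need at least one read")
    ratios = []
    for s, c in zip(sample_counts, control_counts):
        # The pseudocount avoids log(0) and division by zero in empty windows.
        s_norm = (s + pseudocount) / sample_total
        c_norm = (c + pseudocount) / control_total
        ratios.append(math.log2(s_norm / c_norm))
    return ratios
```

Windows with a clearly positive ratio are candidates for copy-number gain, clearly negative ones for loss; deciding what counts as "clearly" needs a statistical model that this sketch leaves out.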
next-generation sequencing: read-pair information
Medvedev, Nature Methods, 2009
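Read-pair methods look for pairs whose mapped distance or orientation disagrees with the sequencing library. A hedged sketch of the usual signatures, assuming a standard paired-end library where concordant mates map forward/reverse on the same chromosome; the `ReadPair` type, the 3-standard-deviation cutoff, and the use of the distance between mate positions as a stand-in for insert size are all simplifying assumptions.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str  # "+" or "-"


def classify_pair(pair, mean_insert, sd_insert, n_sd=3.0):
    """Return the candidate variant type suggested by one read pair."""
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"
    # Order the mates by mapped position.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if left_strand == right_strand:
        return "inversion"
    if left_strand == "-" and right_strand == "+":
        return "tandem duplication"  # mates point away from each other
    span = right_pos - left_pos
    if span > mean_insert + n_sd * sd_insert:
        return "deletion"  # mates map further apart than the library allows
    if span < mean_insert - n_sd * sd_insert:
        return "insertion"  # mates map closer together than expected
    return "concordant"
```

A single discordant pair is weak evidence; in practice callers cluster many pairs supporting the same event before reporting it.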
Stephens et al, Cell, 2011
Integrate read-depth and read-pair information
Pavlopoulos et al, Nucleic Acids Research, 2013
Stephens et al, Cell, 2010
Meander
From data generation to data interpretation: understanding the effect of structural variation
linearity of reference chromosome broken by structural variation, but still using the reference for comparison
=> domain expert needs to try and “wrap his head around” the data
=> need to lessen the cognitive load in interpretation: change a cognitive task into a perceptual one
UCSC Genome Browser
Nielsen & Wong, Nat Methods, 2012
represent the chromosome as it is in vivo (=~ FISH)
Feuk, Nature Reviews Genetics, 2006
reconstruct rearranged chromosome based on graph structure of segments
breakpoint graph
Pevzner & Tesler, Genome Research, 2003
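A full breakpoint graph is beyond a short example, but its basic ingredient, the breakpoint, is easy to show. The sketch below assumes the rearranged chromosome is given as a signed order of reference segments 1..n (a negative sign meaning the segment is inverted) and counts the adjacencies that differ from the reference. It does not handle deleted or duplicated segments, and the function name is illustrative.

```python
def count_breakpoints(signed_order):
    """Count adjacencies in a signed segment order that break the reference order."""
    n = len(signed_order)
    if sorted(abs(s) for s in signed_order) != list(range(1, n + 1)):
        raise ValueError("expected each segment 1..n exactly once")
    # Frame the chromosome with sentinel segments 0 and n + 1.
    extended = [0] + list(signed_order) + [n + 1]
    # An adjacency (a, b) matches the reference only when b - a == 1;
    # this also holds inside an inverted block such as (-3, -2).
    return sum(1 for a, b in zip(extended, extended[1:]) if b - a != 1)
```

For instance, the order `[1, -3, -2, 4]` (segments 2 and 3 inverted together) has two breakpoints, one at each end of the inversion.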
focus on functional impact - Pipit
Sakai et al, submitted
Challenges
• visual and interaction scalability
  • genome size: HSA1 = 240 Mb = 240,000 screens at 1 pixel/bp = 72 km
  • deep sequencing => very high depth per track
  • high-dimensional data: many tracks (n=98!)
  • compare multiple samples
• computational scalability
  • how to compute fast enough to make interactivity possible? (e.g. switching between data resolutions)
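The genome-size figure in the list above can be reproduced with a few lines of arithmetic. The screen width of 1,000 pixels and the physical width of 30 cm are assumptions inferred from the slide's numbers (240 Mb giving 240,000 screens and 72 km); they are not stated in the talk.

```python
def screens_needed(n_bases, pixels_per_base=1.0, screen_width_px=1000):
    """Number of screens placed side by side to show n_bases."""
    return n_bases * pixels_per_base / screen_width_px


def total_width_km(n_screens, screen_width_m=0.30):
    """Physical width of that row of screens, in kilometres."""
    return n_screens * screen_width_m / 1000


hsa1_bases = 240_000_000  # human chromosome 1, rounded as on the slide
n_screens = screens_needed(hsa1_bases)   # 240,000 screens
width_km = total_width_km(n_screens)     # roughly 72 km
```

Zooming out changes `pixels_per_base`: at 0.001 pixels/bp (1 kb per pixel) the same chromosome fits on 240 screens, which is why browsers must aggregate data across resolutions.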
Thank you
• Authors of papers mentioned
• Bioinformatics/Visual Analytics Leuven
  • Ryo Sakai
  • Raf Winand
  • Thomas Boogaerts
  • Toni Verbeiren
  • Georgios Pavlopoulos
• Data Visualization Lab (datavislab.org)
  • Erik Duval
  • Andrew Vande Moere
Questions?