The big data challenges of connectomics

JEFF W LICHTMAN, HANSPETER PFISTER, NIR SHAVIT

PRESENTED BY YUJIE LI, OCT 21ST, 2015


Connectomics

• The study of the structural and functional connections among brain cells.

• Product is the “connectome,” a detailed map of those connections.

• Significant for understanding the healthy and the diseased brain.

• “I am my connectome” -- Sebastian Seung

Neuron structures

http://www.ncbi.nlm.nih.gov/books/NBK21535/

http://science.kennesaw.edu/~jdirnber/Bio2108/Lecture/LecPhysio/PhysioNervous.html

How many neurons in a human brain?

100 billion neurons

How many neurons in a Drosophila?

100,000 neurons, ~10^7 synapses

A video to appreciate the challenge facing connectomics

Brainbow Technique

A Voyage Into the Brain http://ngm.nationalgeographic.com/2014/02/brain/voyage-video

Acquisition

Analytical problems stand between the acquired image and having access to the data in a useful form:

• Alignment

• Reconstruction

• Feature detection

• Graph generation

Alignment

Sections collected on a belt may rotate relative to one another.
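
The rotation between two consecutive sections can be estimated from matched landmarks. The following is a minimal sketch, assuming the landmark correspondences are already known; the slides do not say how sections are actually registered, and the names `estimate_rotation`, `rotate`, and the sample points are hypothetical.

```python
import math


def estimate_rotation(ref_pts, moved_pts):
    """Least-squares rotation angle (radians) mapping ref_pts onto moved_pts.

    Both arguments are equal-length lists of (x, y) landmarks that are
    assumed to be matched one-to-one. Translation is removed by centering.
    """
    n = len(ref_pts)
    cx_r = sum(x for x, _ in ref_pts) / n
    cy_r = sum(y for _, y in ref_pts) / n
    cx_m = sum(x for x, _ in moved_pts) / n
    cy_m = sum(y for _, y in moved_pts) / n
    dot = cross = 0.0
    for (xr, yr), (xm, ym) in zip(ref_pts, moved_pts):
        ax, ay = xr - cx_r, yr - cy_r
        bx, by = xm - cx_m, ym - cy_m
        dot += ax * bx + ay * by      # accumulates |a|^2 * cos(theta)
        cross += ax * by - ay * bx    # accumulates |a|^2 * sin(theta)
    return math.atan2(cross, dot)


def rotate(points, angle, shift=(0.0, 0.0)):
    """Rotate points about the origin by angle (radians), then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1])
            for x, y in points]


# Hypothetical landmarks on one section, and the same landmarks on the
# next section after it rotated by 12 degrees and shifted on the belt.
section_a = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (1.0, 5.0)]
section_b = rotate(section_a, math.radians(12.0), shift=(2.0, -1.0))

angle_deg = math.degrees(estimate_rotation(section_a, section_b))
print(f"estimated rotation: {angle_deg:.3f} degrees")
```

Rotating the second section back by the estimated angle brings it into register with the first, up to the remaining translation.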

Reconstruction

Challenges for automatic segmentation:

• Irregular neuron objects

• Lateral resolution is several-fold finer than section thickness

• Under/over segmentation

Goal: Obtain saturated reconstructions of very large (1 mm³) brain volumes in a fully automatic way, with minimal errors and in a reasonably short time.

Human tracers; cursive handwriting recognition

Feature detection

Subcellular features: mitochondria, synaptic vesicles, etc.

Difficult to find cell boundaries

Irregular shape

Reduce error and analysis time

Graph generation

• Data turned into a form that represents the wiring diagram.

• Data reduction step

• How much of the original data to retain?

• How to store the graph?

• Skip oct-trees.
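
The data-reduction step can be pictured as collapsing per-synapse detections into a weighted directed graph. The following is a minimal sketch under the assumption that each detected synapse is already labelled with its pre- and postsynaptic neuron; `build_wiring_graph` and the neuron IDs are hypothetical, and this is a plain adjacency map, not the skip oct-tree storage mentioned above.

```python
from collections import defaultdict


def build_wiring_graph(synapses):
    """Collapse per-synapse records into a weighted directed graph.

    synapses: iterable of (pre_neuron_id, post_neuron_id) pairs.
    Returns {pre: {post: synapse_count}}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for pre, post in synapses:
        graph[pre][post] += 1
    # Convert nested defaultdicts to plain dicts for a stable result.
    return {pre: dict(targets) for pre, targets in graph.items()}


# Hypothetical output of the feature-detection stage.
detected = [("n1", "n2"), ("n1", "n2"), ("n2", "n3"), ("n1", "n3")]
wiring = build_wiring_graph(detected)
print(wiring)
```

The graph keeps only who connects to whom and how often, which is why it is far smaller than the image data it was derived from.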

Common theme: Dehumanizing the pipeline

An irony is that humans are especially good at these tasks. If we knew how our brains are wired, it would be easier to develop tools to automate these processes.

Big data challenges of connectomics

• Data size

• Data rate

• Computational complexity

• Parallel computing

• Compute system

• A heterogeneous hierarchical approach

• Data management and sharing

Data size

1 mm³ rat cortex image = 2 million gigabytes = 2 petabytes

A complete rat cortex (500 mm³) = 1,000 petabytes

(The Walmart database manages a few petabytes of data)

A complete human cortex, ~1,000 times larger than a rodent's = 1,000 × 1,000 petabytes = 1 zettabyte

(All information recorded globally today)
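
The scaling above can be checked with a few lines of arithmetic. This sketch uses only the figures quoted on the slide and assumes decimal SI prefixes (1 PB = 10^15 bytes, 1 ZB = 10^21 bytes); the constant names are illustrative.

```python
# Figures taken from the slide.
PB_PER_MM3 = 2              # 1 mm^3 of rat cortex is about 2 petabytes
RAT_CORTEX_MM3 = 500        # volume of a complete rat cortex
HUMAN_TO_RAT_RATIO = 1000   # human cortex is ~1000 times larger

# Assumption: decimal SI prefixes.
BYTES_PER_PB = 10**15
BYTES_PER_ZB = 10**21

rat_cortex_pb = PB_PER_MM3 * RAT_CORTEX_MM3
human_cortex_pb = rat_cortex_pb * HUMAN_TO_RAT_RATIO
human_cortex_zb = human_cortex_pb * BYTES_PER_PB / BYTES_PER_ZB

print(f"rat cortex:   {rat_cortex_pb:,} PB")
print(f"human cortex: {human_cortex_pb:,} PB = {human_cortex_zb} ZB")
```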

Data rate

- Imaging task distributed to different labs

- Complete connectome of a human cortex is the goal!

- Maybe start with substructures.

Data management and sharing

- Assuming we have obtained the data, do we store it?

◦ Yes, both the images and the graph.

- How to move the data from the microscope to the computer system? (Transfer bandwidth)

◦ Place the computers near the microscope.

◦ 500 standard 4-core 3.6 GHz processors would suffice ($1 million).

- Where to store?

◦ Disk or tapes.

- How to share?

◦ Internet. Current achievable data rates: 300 megabits/second

◦ Central sharing sites

◦ The reconstructed layout graph is easier to deal with.
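
A quick estimate shows why sharing raw images over the internet is impractical. This is a rough sketch, assuming the quoted rate means 300 megabits (not megabytes) per second, a 2-petabyte volume as on the data size slide, and a link that sustains the full rate with no protocol overhead.

```python
# Assumptions, all labelled: decimal units and a perfectly sustained link.
DATASET_BYTES = 2 * 10**15           # 2 PB, the 1 mm^3 figure from the slides
LINK_BITS_PER_SECOND = 300 * 10**6   # 300 megabits/second (assumed unit)
SECONDS_PER_DAY = 86_400

seconds = DATASET_BYTES * 8 / LINK_BITS_PER_SECOND
days = seconds / SECONDS_PER_DAY

print(f"transfer time: {days:.0f} days ({days / 365:.1f} years)")
```

Under these assumptions a single 1 mm³ volume would take about 617 days to transfer, which is consistent with the slide's point that the much smaller reconstructed graph is easier to share.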

Computational complexity

The goal of many big data systems is more than simply to allow storage of and access to large amounts of data. Rather, it is to discover correlations within the data.

◦ Sampling

◦ Parallel computing

◦ Image segmentation and feature extraction are embarrassingly parallel.
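
"Embarrassingly parallel" means each block of the image can be processed without reference to any other block. The sketch below illustrates the pattern with a toy thresholding step standing in for real segmentation; `segment_block`, the threshold, and the sample data are hypothetical. It uses a thread pool only to stay self-contained; a real pipeline would more likely spread blocks across processes or cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def segment_block(block, threshold=128):
    """Toy stand-in for segmentation: mark dark pixels (1) vs. bright (0)."""
    return [1 if pixel < threshold else 0 for pixel in block]


def split_into_blocks(pixels, block_size):
    """Cut a flat pixel list into consecutive, independent blocks."""
    return [pixels[i:i + block_size]
            for i in range(0, len(pixels), block_size)]


def segment_parallel(pixels, block_size=4, workers=4):
    """Segment every block independently, then stitch results in order."""
    blocks = split_into_blocks(pixels, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input blocks.
        labelled = list(pool.map(segment_block, blocks))
    return [label for block in labelled for label in block]


image_row = [10, 200, 90, 255, 130, 127, 128, 0, 64, 240]
labels = segment_parallel(image_row)
print(labels)
```

Because no block depends on another, the parallel result is identical to processing the whole row in one call.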

A heterogeneous hierarchical approach

Combines bottom-up information from the image data with top-down information from the assembled layout graph to decide dynamically on the appropriate level of computational intensity to apply to a given sub-volume.

1) Initially, apply the lowest-cost computations to small volumes.

2) The resulting sub-graphs are tested for consistency.

3) If discrepancies are found, more expensive computation is used.

4) The process continues hierarchically, growing the volume of merged segments.
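
The four steps can be sketched as an escalation loop. This is a toy illustration only, not the authors' method: the segmenters, their costs, and the consistency rule ("no fragment shorter than two voxels") are all hypothetical stand-ins for bottom-up computation and the top-down check.

```python
def runs(block, threshold=100):
    """Lengths of consecutive bright (non-membrane) voxel runs in a 1-D row."""
    lengths, current = [], 0
    for voxel in block:
        if voxel >= threshold:
            current += 1
        elif current:
            lengths.append(current)
            current = 0
    if current:
        lengths.append(current)
    return lengths


def cheap_segment(block):
    """Hypothetical low-cost pass: threshold the raw voxels."""
    return runs(block)


def expensive_segment(block):
    """Hypothetical costlier pass: 3-voxel median filter, then threshold."""
    padded = [block[0]] + block + [block[-1]]
    smoothed = [sorted(padded[i:i + 3])[1] for i in range(len(block))]
    return runs(smoothed)


def is_consistent(segments):
    """Hypothetical top-down rule: one-voxel fragments are implausible."""
    return all(length >= 2 for length in segments)


# Methods ordered from cheapest to most expensive: (cost, function).
LEVELS = [(1, cheap_segment), (10, expensive_segment)]


def reconstruct(block):
    """Steps 1-3: start cheap, escalate only while the result is inconsistent."""
    cost = 0
    for level_cost, segment in LEVELS:
        cost += level_cost
        segments = segment(block)
        if is_consistent(segments):
            break
    return segments, cost


def reconstruct_hierarchically(blocks):
    """Step 4: reconstruct small blocks, merge neighbours, and repeat."""
    total_cost = 0
    while True:
        results = [reconstruct(block) for block in blocks]
        total_cost += sum(cost for _, cost in results)
        if len(blocks) == 1:
            return results[0][0], total_cost
        # Merge adjacent pairs of blocks into larger sub-volumes.
        blocks = [sum(blocks[i:i + 2], []) for i in range(0, len(blocks), 2)]


clean = [200, 200, 200, 50, 200, 200]       # passes the cheap level
noisy = [200, 200, 50, 200, 50, 200, 200]   # needs the expensive level
segments, cost = reconstruct_hierarchically([clean, noisy])
print(segments, cost)
```

The design point is that the expensive method runs only where the cheap result fails the check, so most of the volume is processed at the lowest cost.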

Prospects

- The field needs a significant investment to advance.

- Commercial value in connectomics

◦ Treating brain diseases

◦ Applying lessons learned to making computers smarter

- Challenges beyond the horizon: still a big data problem

Comments

The technical limitations of EM are not addressed:

• Samples are post-mortem, not in vivo

• Physical damage during sectioning, potential distortion

• Lack of functional information

No comparison with the currently popular approaches to the problem:

• Two-photon, confocal, and brightfield imaging

• Neuron-labelling approaches (physical dyes, genetic approaches)

Big data is not only about handling super-large datasets.

• It is also about finding a smart way to fuse data from different modalities and different sources to obtain a comprehensive understanding.