Cells that fell into wells with stories to tell
Christopher E. Mason
Associate Professor
Department of Physiology and Biophysics,
The Feil Family Brain and Mind Research Institute (BMRI) &
The Institute for Computational Biomedicine (ICB) & the Meyer Cancer Center of Weill Cornell Medicine (WCM)
Fellow of the Information Society Project, Yale Law School
March 28th, 2017
@mason_lab
(0)
Background
Genetic alterations can be selected for and potentially drive a tumor’s progression.
Alizadeh et al., “Toward understanding and exploiting tumor heterogeneity.” Nature Medicine, 2015
What else?
Li S, Garrett-Bakelman F, et al., “Distinct evolution and dynamics of epigenetic and genetic heterogeneity in AML.” Nature Medicine, 2016.
Li S, et al., “Dynamic Evolution of Clonal Epialleles Revealed by Methclone.” Genome Biology, 2014.
Mosaicism increases with age
Wang Y et al. Maternal mosaicism is a significant contributor to discordant sex chromosomal aneuploidies associated with noninvasive prenatal testing. Clinical Chemistry 2014; 60(1):251-9.
[Figure: % of cells with X Chromosome Loss (XCL)]
http://2014hs.igem.org/Team:TAS_Taipei/project/abstract
Epigenetic Drift in Twins
Mario F. Fraga et al. PNAS 2005;102:10604-10609
Horvath S. “DNA methylation age of human tissues and cell types.” Genome Biology. 2013;14(10):R115.
Prediction and Precision
Predictive Medicine
Disease
Precision Medicine
https://www.nasa.gov/content/nasas-journey-to-mars
ETA: 2035
(1)
Single Cell Revolution
Rapid and Efficient Microfluidics
• Partition 100-10,000+ cells per channel in < 7 minutes
• Run 1 to 8 channels in parallel
• No lower size limit on cells
• Recovers up to 65% of all loaded cells, including:
  – T cells, B cells, PBMCs and cell lines
  – FACS-isolated cells
  – MACS MicroBead-enriched cells
• Low doublet rate: 0.9% per 1,000 cells
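The recovery and doublet figures above lend themselves to a quick planning calculation. The sketch below is only an illustration: it assumes the 0.9% doublet rate scales linearly with the number of recovered cells and treats 65% as the recovery fraction (the slide gives it as an upper bound). The function names are hypothetical.

```python
import math

RECOVERY_FRACTION = 0.65       # "recovers up to 65% of all loaded cells" (upper bound)
DOUBLET_RATE_PER_1000 = 0.009  # "0.9% per 1,000 cells"


def cells_to_load(target_recovered: int, recovery: float = RECOVERY_FRACTION) -> int:
    """Cells to load so that roughly `target_recovered` cells are captured."""
    return math.ceil(target_recovered / recovery)


def expected_doublet_rate(recovered_cells: int,
                          per_1000: float = DOUBLET_RATE_PER_1000) -> float:
    """Expected doublet fraction, ASSUMING it grows linearly with cell number."""
    return per_1000 * recovered_cells / 1000.0


if __name__ == "__main__":
    for target in (1000, 5000, 10000):
        print(target, cells_to_load(target), f"{expected_doublet_rate(target):.1%}")
```

Under these assumptions, higher cell targets trade directly against a higher doublet fraction, which is worth checking before choosing how many cells to load per channel.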
Assay Scheme for 5’ Barcoding and V(D)J Enrichment
• RT enzyme and poly(dT) primer delivered to all GEMs as part of master mix
• Barcoded template switch oligo delivered to GEMs from Gel Beads
• RT reaction generates unbiased cDNA with a sequencing adapter, a cell barcode and a UMI on the 5’ end
• PCR with one primer for the 5’ adapter and one or more primers for the desired TCR/Ig constant regions.
• Fragmentation and sequencing optimized for assembly of the full V(D)J sequence (5’ UTR to constant regions) from short reads on a cell-by-cell basis.
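Because every cDNA carries a cell barcode and a UMI on its 5’ end, reads can be grouped per cell and PCR duplicates collapsed. The sketch below shows that bookkeeping in its simplest form. The barcode and UMI lengths are placeholder assumptions, the template switch sequence is ignored, and real pipelines also correct sequencing errors in barcodes and count per gene.

```python
from collections import defaultdict

BARCODE_LEN = 16  # hypothetical length, for illustration only
UMI_LEN = 10      # hypothetical length, for illustration only


def split_read(read5p: str):
    """Split the 5' end of a read into (cell barcode, UMI, cDNA insert)."""
    barcode = read5p[:BARCODE_LEN]
    umi = read5p[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
    insert = read5p[BARCODE_LEN + UMI_LEN:]
    return barcode, umi, insert


def count_molecules(reads):
    """Count distinct UMIs per cell barcode (PCR duplicates collapse to one)."""
    umis_per_cell = defaultdict(set)
    for read in reads:
        barcode, umi, _ = split_read(read)
        umis_per_cell[barcode].add(umi)
    return {bc: len(umis) for bc, umis in umis_per_cell.items()}


# Toy reads: two cells; the first read appears twice (a PCR duplicate).
example_reads = [
    "A" * 16 + "C" * 10 + "GGG",
    "A" * 16 + "C" * 10 + "GGG",
    "A" * 16 + "G" * 10 + "TTT",
    "T" * 16 + "C" * 10 + "GGG",
]
print(count_molecules(example_reads))
```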
Single cell (sc) options
Many options for single cells
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5359768/pdf/jbt.17-2801-006-jbt.17-2801-006.pdf
Composite measurements and molecular compressed sensing for highly efficient transcriptomics
http://biorxiv.org/content/early/2017/01/02/091926
Data = opportunities
1. Quantify heterogeneity
2. Visualize relationships between single cell transcriptomes
3. Identify signatures of response
4. Explore variable isoform expression
5. Variant calling from sc-RNAseq
How can we study the vast transcriptome?
Possible variants from 3 exons (7):
  exon1   exon2   exon3
  exon1-exon2   exon1-exon3   exon2-exon3
  exon1-exon2-exon3

Adding exon4 contributes 8 more (15 total):
  exon4   exon1-exon4   exon2-exon4   exon3-exon4
  exon1-exon2-exon4   exon1-exon3-exon4   exon2-exon3-exon4
  exon1-exon2-exon3-exon4

For n exons: Variants = 2^n - 1, Junctions = n(n-1)/2

Exons   Variants   Junctions
1       1          0
2       3          1
3       7          3
4       15         6
5       31         10
6       63         15
7       127        21
8       255        28
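The counts above can be checked by brute-force enumeration. The sketch below treats a variant as any non-empty subset of exons kept in genomic order, and a junction as any ordered exon pair; this is the same simplification the slide uses and it ignores alternative splice sites within an exon.

```python
from itertools import combinations


def exon_variants(n_exons: int):
    """All non-empty, order-preserving exon subsets: 2**n - 1 of them."""
    exons = [f"exon{i}" for i in range(1, n_exons + 1)]
    return [
        "-".join(subset)
        for size in range(1, n_exons + 1)
        for subset in combinations(exons, size)
    ]


def exon_junctions(n_exons: int):
    """All exon pairs (i < j) that could be spliced together: n(n-1)/2."""
    return list(combinations(range(1, n_exons + 1), 2))


if __name__ == "__main__":
    print("Exons  Variants  Junctions")
    for n in range(1, 9):
        print(n, len(exon_variants(n)), len(exon_junctions(n)))
```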
8 x 10^83 theoretical transcript combinations
8 x 10^80 atoms in the universe
(10^59 atoms/star, 10^11 stars/galaxy, 10^10 galaxies)
Li and Mason, “The Pivotal Regulatory Landscape of RNA Modifications.” Annual Review of Genomics and Human Genetics, 2014
What are the differences between cells?
[Figure: exon diagrams, Exon 1 to Exon 4]
Need a tool to characterize broken up reads…
Since we already had Jitterbug for Transposable Element Insertions (TEIs)…
DISCO: Distributions of Isoforms in Single Cell Omics
https://pbtech-vc.med.cornell.edu/git/mason-lab/disco/tree/master
Pipeline
1. Align reads to reference genome (STAR, two-pass)
2. Use MISO’s probabilistic framework to assign reads to isoforms and estimate relative abundance of each isoform in each cell
3. Run DISCO to
   – filter MISO results for coverage, presence of isoform in a minimum number of cells, etc.
   – compare 2 groups of single cells (or any RNA-seq samples) using Kolmogorov-Smirnov tests
   – visualize significant shifts
disco <miso_filelist.txt> <group1> <group2>
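To make the comparison step concrete, here is a minimal two-sample Kolmogorov-Smirnov sketch applied to per-cell PSI values for one isoform. It is not DISCO's code: the function names and PSI values are made up for illustration, and the p-value uses the standard asymptotic Kolmogorov series, which is only approximate for small groups of cells.

```python
import math


def ks_statistic(a, b):
    """Two-sample KS statistic D: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance past every value equal to x in both samples (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n1, n2, terms=100):
    """Asymptotic p-value from the Kolmogorov series (approximate)."""
    ne = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.1:
        return 1.0  # series converges slowly here; p is effectively 1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Toy PSI values for one isoform in two groups of cells (illustrative only).
group1_psi = [0.10, 0.15, 0.20, 0.12, 0.18, 0.11, 0.16, 0.14]
group2_psi = [0.80, 0.85, 0.90, 0.82, 0.88, 0.81, 0.86, 0.84]
d = ks_statistic(group1_psi, group2_psi)
print(d, ks_pvalue(d, len(group1_psi), len(group2_psi)))
```

The KS test compares whole distributions rather than means, which suits single cell PSI values that are often bimodal across cells.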
usage: disco [-h] [-v] [--outdir] [--pkldir] [--group1color]
             [--group2color] [--group1file] [--group2file]
             [--geneannotationfile] [--transcriptannotationfile]
             [--maxciwidth] [--mininfreads] [--mindefreads]
             [--minavgpsi] [--minnumcells] [--minmedianshift]
             [--stattest]
             SampleAnnotationFile Group1 Group2

DISCO: Distributions of Isoforms in Single Cell Omics

positional arguments:
  SampleAnnotationFile  filename of tab separated text, no header, with columns:
                        <path to miso summary file> <sample name> <group name>
  Group1                must match a group name in sample annotation file
  Group2                must match a group name in sample annotation file

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --outdir              Output directory (default: ./disco_output/)
  --pkldir              Directory to store intermediate data processing files (default: ./pkldir)
  --group1color         Color in plots for group 1; can be {y, m, c, r, g, b, w, k} or html code (default: r)
  --group2color         Color in plots for group 2; can be {y, m, c, r, g, b, w, k} or html code (default: b)
  --group1file          output file for sample group 1. If not specified, will save to <outdir>/<group1name>_alldatadf.txt (default: None)
  --group2file          output file for sample group 2. If not specified, will save to <outdir>/<group2name>_alldatadf.txt (default: None)
  --geneannotationfile  Mapping of Ensembl gene IDs to HGNC symbol and gene descriptions (default: None)
  --transcriptannotationfile
                        Mapping of Ensembl transcript IDs to isoform function (ex. protein coding, NMD, etc) (default: None)
  --maxciwidth          Maximum width of confidence interval of PSI estimate (default: 1.0)
  --mininfreads         Minimum number of informative reads to include PSI estimate (default: 0)
  --mindefreads         Minimum number of definitive reads to include PSI estimate (default: 0)
  --minavgpsi           Do not run statistical tests for isoforms with average PSI in both groups less than minavgpsi (default: 0.0)
  --minnumcells         Do not run statistical test for isoform if less than minnumcells have information (default: 0)
  --minmedianshift      Do not run statistical test for isoform if shift in median between the two groups is less than minmedianshift (default: 0)
  --stattest            Which test to run? options: {KS, T} (default: KS)
Disco, run-time options
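The sample annotation file is the one input that has to be prepared by hand, so a quick sanity check before running can save time. The sketch below is a hypothetical helper, not part of DISCO: it parses the three tab separated columns described in the help text and confirms that both group names are present. The file paths and sample names in the example are invented.

```python
import csv
import io


def read_sample_annotation(handle):
    """Parse rows of <miso summary path> <sample name> <group name> (tab separated, no header)."""
    groups = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"expected 3 tab-separated columns, got {len(row)}: {row}")
        path, sample, group = row
        groups.setdefault(group, []).append((sample, path))
    return groups


def check_groups(groups, group1, group2):
    """Group1 and Group2 must each match a group name in the annotation file."""
    missing = [g for g in (group1, group2) if g not in groups]
    if missing:
        raise ValueError(f"group(s) not in annotation file: {missing}")


# Invented example content; real paths point at MISO summary files.
example = io.StringIO(
    "miso/cell_001.miso_summary\tcell_001\tMDS\n"
    "miso/cell_002.miso_summary\tcell_002\tMDS\n"
    "miso/cell_101.miso_summary\tcell_101\tNormal\n"
)
groups = read_sample_annotation(example)
check_groups(groups, "MDS", "Normal")
print({name: len(members) for name, members in groups.items()})
```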
(2)
MDS
Myelodysplastic Syndromes (MDS)
• class of bone marrow failure disorders
• accumulation of abnormal hematopoietic stem cells (HSCs) --> ineffective hematopoiesis
  - Pang et al., 2013; Woll et al., 2014
  - HSC = Lin-CD34+CD38-CD90+CD45RA-
[Figure: HSCs and progenitors, Normal vs. MDS]
Open questions about MDS HSCs: heterogeneity? response to therapy? disease progression?
• 30% of patients progress to acute myeloid leukemia (AML)
• current therapies (e.g., decitabine) produce partial or complete remissions in some patients, but disease re-emerges in 100% of patients
Experimental Design
Groups: Decitabine responders, Decitabine non-responders, Untreated, Normal
Time points: Pre-Rx / Untreated serial 1, Post-Rx / Untreated serial 2
1. Purify HSCs with FACS (Lin- CD34+ CD38- CD90+ CD45RA-)
2. Single cell processing with Fluidigm C1 (max 96 cells per run)
3. RNA-seq! (2 x 100)
Samples
Number of cells per sample:

Patients   Pre   Post   Response
1          79    -      UT
2          63    85     UT
3          19    71     NR
4          56    31     R
5          68    -      R
6          33    27     NR
7          -     17     R
9          61    -      R
norm1      55    -      -
norm2      82    -      -

UT = Untreated, NR = Non-responder, R = Responder
Pre = pre-decitabine in R and NR, serial time point 1 in UT
Post = post-decitabine in R and NR, serial time point 2 in UT
differentially expressed genes between MDS (pre-treatment) and normal
Z-score of log2(FPKM+1) of DEGs at FDR 0.01
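For reference, the heatmap scale can be computed per gene as follows. This is a minimal sketch: it assumes the z-score is taken across cells for each gene and uses the population standard deviation, neither of which the slide specifies.

```python
import math
import statistics


def zscores_log2_fpkm(fpkm_by_cell):
    """Z-score of log2(FPKM + 1) for one gene across cells."""
    logged = [math.log2(v + 1.0) for v in fpkm_by_cell]
    mean = statistics.fmean(logged)
    sd = statistics.pstdev(logged)  # assumption: population SD
    if sd == 0:
        return [0.0] * len(logged)  # gene is flat across cells
    return [(v - mean) / sd for v in logged]


# Toy FPKM values for one gene in four cells (illustrative only).
print(zscores_log2_fpkm([0, 1, 3, 7]))
```

The +1 pseudocount keeps cells with zero FPKM defined on the log scale, which matters for sparse single cell data.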
Pathways enriched in DEGs
Lineage markers
Number of genes per lineage:

Lineage                    Negative   Positive
B-cell                     3          9
Erythroid/Megakaryocytic   0          5
HSC                        15         90
Lymphoid                   4          10
Myeloid                    0          7
T-Cell                     7          10

Genes that overlap with MDS v Norm DEGs: MEIS1, CXCR4, HLF, MECOM, NR4A2, MPO, CEBPA
Clouds of patient groups
Assign cells to stem cell or myeloid states
Semi-supervised pseudotime ordering based on 80 marker genes
monocle R package
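The intuition behind ordering cells along a stem-to-myeloid axis can be shown with a much simpler stand-in. The sketch below is not monocle's algorithm and not the analysis from the talk: it just ranks cells by the difference between mean myeloid marker expression and mean stem marker expression. The marker split (MEIS1 and HLF as stem, MPO and CEBPA as myeloid) and all expression values are assumptions for illustration.

```python
STEM_MARKERS = ["MEIS1", "HLF"]     # assumed stem cell markers
MYELOID_MARKERS = ["MPO", "CEBPA"]  # assumed myeloid markers


def marker_score(expr, markers):
    """Mean expression of the marker genes present in one cell's profile."""
    values = [expr[g] for g in markers if g in expr]
    return sum(values) / len(values) if values else 0.0


def order_cells(cells, stem_markers, myeloid_markers):
    """Sort cell ids from most stem-like to most myeloid-like."""
    def score(cell_id):
        expr = cells[cell_id]
        return marker_score(expr, myeloid_markers) - marker_score(expr, stem_markers)
    return sorted(cells, key=score)


# Toy log-scale expression values (made up for illustration).
toy_cells = {
    "c1": {"MEIS1": 8, "HLF": 7, "MPO": 0, "CEBPA": 1},
    "c2": {"MEIS1": 1, "HLF": 1, "MPO": 9, "CEBPA": 8},
    "c3": {"MEIS1": 4, "HLF": 5, "MPO": 4, "CEBPA": 4},
}
print(order_cells(toy_cells, STEM_MARKERS, MYELOID_MARKERS))
```

Monocle instead learns a trajectory from the expression data, with the marker genes guiding which genes drive the ordering, so it can recover branches that a single linear score cannot.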
Normal HSCs exclusively occupy the lower end of the pseudotime lineage tree
Differences in cell state distributions between disease groups
Effect of treatment on cell states
Normalized expression of genes differentially expressed with pseudotime (FDR 0.01)
Enriched pathways in genes differentially expressed with pseudotime
Pathways differentiating branch 1 and 2
Participatory medicine with twin astronauts
Planets are really just big cells
Conclusions
• Single cell sequencing reveals a complex and heterogeneous transcriptional landscape in hematopoietic stem cells
• Lineage ordering of HSCs based on stem cell and myeloid marker genes reveals distinct cell states between:
  – MDS and normal HSCs
  – Decitabine responders and non-responders
  – Pre- and post-treatment
• Decitabine does not eradicate all (or even most) MDS-specific cell states, suggesting therapies that target these cells may have better long-term success
• Relevant functional processes altered in these cells include ribosome function, p53 signaling, Rap1 signaling, B cell receptor signaling, etc.
• We need a planetary-size capture and sequencing system
These People are Awesome @mason_lab
Thanks to the Swabbing Teams! www.pathomap.org/people/
Deep Gratitude to Many People:
Illumina: Gary Schroth, Marc Van Oene
Univ. Chicago: Yoav Gilad
FDA/SEQC/Fudan Univ.: Leming Shi
NIH/UDP/NCBI: Jean & Danielle Thierry-Mieg
Baylor: Jeff Rogers
MSKCC: Danwei Huangfu, Christina Leslie, Ross Levine, Alex Kentsis
HudsonAlpha: Shawn Levy, Braden Boone
Mason Lab: Ebrahim Afshinnekoo, Sofia Ahsanuddin, Noah Alexander, Pradeep Ambrose, Daniela Bezdan, Marjan Bozinoski, Dhruva Chandramohan, Chou Chou, Tim Donahoe, Francine Garrett-Bakelman, Jonathan Foox, Elizabeth Hénaff, Alexa McIntyre, Cem Meyden, Niamh O’Hara, Rachid Ounit, Lenore Pipes, Jake Reed, Heba Shabaan, Priyanka Vijay, David Westfall
Cornell/WCM: Scott Blanchard, Selina Chen-Kiang, Olivier Elemento, Samie Jaffrey, Ari Melnick, Margaret Ross, Epigenomics Core
Duke: Stacy Horner, Nandan Gokhale
Icahn/MSSM: Eric Schadt, Andrew Kasarskis, Joel Dudley, Ali Bashir, Bobby Sebra
ABRF: George Grills, Don Baldwin, Charlie Nicolet
Miami: Maria E Figueroa
AMNH: George Amato, Mark Siddall
NYU: Martin Blaser, Jane Carlton, Julia Maritz, Chris Park
MIT Media Lab: Kevin Slavin, Devora Najjar, Regina Flores
Rockefeller: Jeanne Garbarino, Charles Rice
NASA: Aaron Burton, Sarah Castro-Wallace, Kate Rubins, Graham Scott, Craig Kundrot
Jackson Labs: Sheng Li
UCSF: Charles Chiu
XMP/MGRG: Scott Tighe, Ken McGrath, Russ Carmical, Scott Jackson