Third generation long-read sequencing of HIV-1 transcripts discloses cell type specific and
temporal regulation of RNA splicing
Frederic Bushman
International AIDS Meeting, Washington DC, 2012
Why Study HIV Splicing?
Splicing factors prominent in genome-wide siRNA screens
HIV RNAs spliced to yield at least 40 mRNAs
Sensitivity suggests unexploited opportunity for intervention?
Relevant ORFs remain to be discovered?
Bushman et al. 2009 PLoS Path
Approach
• Amplification: 18 primer pairs
• Canonical splicing
• Rare splicing
• New splicing
RainDance Technologies: Single Molecule Droplet PCR
Tewhey et al., Nature Biotechnology, 2009
a Primer Library
b cDNA prep from infected cells
Figure labels: Primer Library; cDNA Template Mix; PCR; Break Emulsion; Sequence
Overlapping primer pairs amplify cDNA maintaining ratios
Pacific Biosciences: Single molecule sequencing
Fixed polymerase
Phosphate-labeled nucleotides
High throughput single molecule real-time sequencing provides long reads, maintaining linkage between exons
Error mitigated by
1. Alignment to 10kb HIV genome
2. SMRTbell approach…
www.pacificbiosciences.com
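To illustrate why linkage between exons matters, here is a minimal Python sketch of how one long read can define a whole message structure. It assumes each read has already been aligned to the HIV genome and reduced to a list of (start, end) blocks. The function names and all coordinates are hypothetical (they are not real HIV splice-site positions), and this is not presented as the analysis pipeline used in the talk.

```python
from collections import Counter

def junctions(blocks):
    """Return (donor, acceptor) pairs between consecutive aligned blocks.

    The donor is taken as the end of one block and the acceptor as the
    start of the next block, after sorting blocks by genome position.
    """
    ordered = sorted(blocks)
    return tuple(
        (a_end, b_start)
        for (_, a_end), (b_start, _) in zip(ordered, ordered[1:])
    )

def count_structures(reads):
    """Count how many reads share each complete chain of junctions."""
    return Counter(junctions(blocks) for blocks in reads.values())

# Hypothetical aligned blocks for three long reads (illustrative only).
reads = {
    "read1": [(1, 100), (500, 600), (900, 1000)],
    "read2": [(1, 100), (500, 600), (900, 1000)],
    "read3": [(1, 100), (900, 1000)],
}

counts = count_structures(reads)
for structure, n in counts.items():
    print(n, structure)
```

Because each read carries its full junction chain, two messages that share a donor but differ downstream are counted as separate structures, which short reads covering a single junction could not resolve.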
Pacific Biosciences: Sequence Output
930,294 HIV sequences of up to 2629 bp

Cell Type                                      Mappable Reads   Median Raw Read-length   Longest HIV Sequence
HOS (18, 24, 48 hpi)                           88,975           3678 bp                  2105 bp
Primary CD4+ T (7 donors, triplicate, 48 hpi)  841,319          2595 bp                  2629 bp
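As a quick consistency check on the read counts above, the per-cell-type mappable reads can be summed and compared with the stated total. This is only a sketch: the dictionary keys are labels chosen here, and the numbers are copied from the slide.

```python
# Mappable reads per cell type, as reported on the slide.
mappable_reads = {
    "HOS": 88_975,
    "Primary CD4+ T": 841_319,
}

total = sum(mappable_reads.values())

# Share of all mappable reads contributed by each cell type.
shares = {cell: n / total for cell, n in mappable_reads.items()}

print(total)
for cell, share in shares.items():
    print(f"{cell}: {share:.1%}")
```

The sum matches the 930,294 HIV sequences quoted for the full data set, with the primary T cell samples contributing the large majority of reads.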
Novel Splice Sites
2 Novel Splice Donors
11 Novel Splice Acceptors
Scott Sherrill-Mix
Figure labels: Genetic Map; Exons
SD: Splice Donor; SA: Splice Acceptor; *: site does not adhere to consensus
Complete message population of HIV-1 89.6 in CD4+ T cells
• 77 complete message structures
• Evidence for 36 additional transcripts from partial reads
• Total: 113 mRNAs
• 19 novel transcripts including a new completely spliced class (~1kb)
Scott Sherrill-Mix
Novel Acceptor A8c
Novel splice acceptor A8c creates new ORFs in HIV-1 89.6
Dynamic Transcript Populations
Mutually exclusive acceptors:
Temporal, cell-type and intra-human variability
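Comparing usage of mutually exclusive acceptors across samples comes down to converting per-acceptor read counts into proportions within each sample. A minimal Python sketch follows, assuming reads have already been assigned to acceptors. The acceptor labels, sample names, and counts are all hypothetical placeholders, not data from the talk.

```python
def acceptor_fractions(counts):
    """Convert per-acceptor read counts into fractions summing to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads assigned to any acceptor")
    return {acceptor: n / total for acceptor, n in counts.items()}

# Hypothetical read counts per acceptor in two samples (illustrative only).
samples = {
    "sample_a": {"acc1": 50, "acc2": 30, "acc3": 20},
    "sample_b": {"acc1": 10, "acc2": 30, "acc3": 60},
}

fractions = {name: acceptor_fractions(c) for name, c in samples.items()}

for name, frac in fractions.items():
    print(name, {acceptor: round(f, 2) for acceptor, f in frac.items()})
```

Working with fractions rather than raw counts keeps samples with different sequencing depth comparable, which is what a comparison across time points, cell types, or donors would need.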
Conclusions
Long read single molecule sequencing works well to delineate HIV message populations
At least 113 different HIV-1 transcripts
1 kb class of RNAs prominent in HIV-1 89.6
Differential splicing by cell type, time after infection, and among cells from human subjects
Credits
Bushman laboratory: Troy Brady, Kyle Bittinger, Rohini Sinha, Scott Sherrill-Mix, Frances Male, Christian Hoffmann, Nirav Malani, Brendan Kelly, Young Hwang, Stephanie Grunberg, Serena Dollive, Alexandra Bryson, Sam Minot, Spencer Barton, Aubrey Bailey
Former Bushman Lab: Gary Wang, Brett Beitzel, Mary Lewinski, Astrid Schroder, Angela Ciuffi, Heather Marshall Rose, Jeremy Leipzig, Matt Culyba, Rick Mitchell, Tracy Diamond, Emily Charlson, Shannah Roth, Karen Ocwieja, Keshet Ronen, Greg Peterfreund, Rithun Mukherjee, Jennifer Hwang, Kristine Yoder, Rebecca Custers-Allen
Collaborators: Charles Berry, Sumit Chanda, John Young, Renate Koenig, Joe Ecker, Craig Hyde, Mark Yeager, Kushol Gupta, Greg Van Duyne, Masahiro Yamashita, Mike Emerman, Francis Collins, Philippe Leboulch, Alain Fischer, Marina Cavazzana-Calvo, Salima Hacein-Bey-Abina, Rik Gijsbers, Zeger Debyser
RNA in infected cells is 14% viral.
Ratios among HIV message forms
HIV infection associated with intron retention in cellular genes
Solexa/Illumina HiSeq
100 base paired-end reads
2 uninfected samples
3 infected samples: HIV-1 89.6 in human T cells
~1 billion sequence reads, both human and HIV