better computing for better bioinformatics › conferences › finishfuture › pdfs...8_ff0019...

19
THE WORLDS FIRST HYBRID-CORE COMPUTER. THE WORLDS FIRST HYBRID-CORE COMPUTER. Better Computing for Better Bioinformatics George Vacek, PhD, MBA Director, Convey Life Sciences [email protected] www.conveycomputer.com/lifesciences/

Upload: others

Post on 05-Feb-2021

0 views

Category:

Documents


0 download

TRANSCRIPT

  • THE WORLD’S FIRST HYBRID-CORE COMPUTER. THE WORLD’S FIRST HYBRID-CORE COMPUTER.

    Better Computing for Better Bioinformatics

    George Vacek, PhD, MBA Director, Convey Life Sciences [email protected]

    www.conveycomputer.com/lifesciences/

  • •  Convey –  Kirby Collins –  Wesley Hart –  Mark Kelly –  David Soper –  John Amelio –  Mike D’Jamoos –  Mike Ruff –  Tony Brewer

    •  BWA –  Heng Li, The Broad

    •  Velvet –  Daniel Zerbino, UCSC

    •  Virginia Bioinformatics Institute –  Bob Settlage –  Lauren McGiver

    •  Joint Genome Institute –  Alex Copeland –  Alex Sczyrba –  Zhong Wang

    •  Monsanto Company –  Yili Chen –  Xuefeng Zhou

    •  The Jackson Laboratory –  Glen Beane

    Acknowledgements

    9/6/12 2

  • •  Better Bioinformatics –  High Performance de novo Assembly –  Screening Reads Instead of Contigs –  High Throughput Resequencing

    •  Better Computing –  Convey Computers –  Hybrid-Core Computing

    Agenda

    9/6/12 3

  • Convey’s Hybrid-Core Server Delivers

    •  Higher Performance –  5x to 25x application gains

    •  Energy Saving –  Up to 90% power reduction

    •  Easy to use, program, manage –  Standard Linux ecosystem –  Management / Scheduling –  Programming environment

    9/6/12 4

    “Speed and power consumption were our top reasons for selecting the Convey system.”

    Dr. Guilherme Oliveira, Director Center for Excellence in Bioinformatics

  • HIGH PERFORMANCE DE NOVO ASSEMBLY

  • 0

    2,000

    4,000

    6,000

    8,000

    10,000

    12,000

    14,000

    Velvet 1.2.01 12C x86

    CGC/Velvetg HC-2

    Run

    Tim

    e (S

    econ

    ds)

    JGI 40Gbp Cow Rumen velvetg graph roadmap unify velveth

    11,623 sec 111 GB RAM

    864 w-hr

    2,139 sec 21 GB RAM 444 w-hr

    Reduced Memory Usage, Accelerated Performance - Enables Large Genomes •  5.4x speed up depends on –  Data set size

    –  Kmer space complexity •  RAM reduced 79%

    –  Data types / structures –  Automated roadmap

    partitioning •  1.9x Power Performance

    9/6/12 6

    HC-2: 2 Intel X5670 2.93GHz processors (12 cores total), stripe 4 @ 600GB SATA disks 96GB DDR3 (host), 16GB SG (coprocessor)

    X86: host only

    “Convey’s GraphConstructor offers a new approach to help researchers … to achieve better assemblies or look at bigger jobs such as metagenomic or mammalian genome samples”

    Daniel Zerbino, author of Velvet

  • Convey GraphConstructor for de novo Assembly •  Tackle previously

    impractical genomes •  Higher quality assemblies •  Lower cost

    •  Interface for Velvet/Oases •  Stability, ease of use,

    optimized workflow

    •  Very fast Kmer Counter –  parameter optimization based on roadmap statistics –  select best kmer length and coverage cutoff

    9/6/12 7

    “Convey is solving a big problem here – de novo assembly has been very difficult… Convey has made a significant accomplishment!”

    Dr. John Castle, head of Bioinformatics/Genomics,

    University of Mainz, TrOn

  • SCREENING READS INSTEAD OF CONTIGS

  • Quickly Identify Reads Associated with Proteins of Interest

    •  Translated Search with Smith-Waterman

    •  7.2x faster than BLASTx •  2.9x more matches

    –  SWSearch 1081219 hits –  BLASTx 372344 hits

    •  BLAST heuristic filter

    9/6/12 9

    153

    1095

    0

    200

    400

    600

    800

    1000

    1200

    12C x86 HC-2ex

    Read

    s Pe

    r Sec

    ond

    1M Illumina Reads Against 5K Patented Proteins (pataa)

    BLASTx (NCBI v2.2.25)

    SWSearch

    HC-2ex: 192GB (host), 64GB (coproc), stripe 4 @ 600GB SATA Dell r610: 2 Intel X5680 3.33GHz processors (12 cores total), 96GB of

    1333MHz DDR3 memory, stripe 3 @ 146GB SAS

    19 query: AARTPKPTAPDSPEMMRG ::: : :: : :: :: sbjct: AARLPAPTGPGSPFAGRG 161

    23 query: PGGTLFSSSP ::::::: : sbjct: PGGTLFSTXP 13

  • HIGH THROUGHPUT RESEQUENCING

  • Workflow Performance for Human

    •  BWA 0.5.10 workflow –  2 @ aln + sampe –  8.8x - 9.4x over x86 –  62 - 67 K Reads/Sec

    •  Reference G1k v37 –  3.1 G bases

    •  Reads –  HG00124 SRR189815_(1,2) –  242 M reads, ~100 bp –  24.7 G bases

    9/6/12 11

    X86: host only from HC-2ex; Intel X5670 2.93GHz processors (12 cores total), stripe 4 @ 600GB SATA disks

    HC-1: 128GB (host), 64GB (coproc), stripe 2 @ 1TB SATA disks HC-1ex: 128GB (host), 64GB (coproc), stripe 2 @ 1TB SATA HC-2: 96GB DDR3 (host), 16GB SG (coprocessor) HC-2ex: 192GB DDR3 (host), 64GB SG (coprocessor)

    7,114

    25,327 27,249

    62,323 66,869

    0

    10,000

    20,000

    30,000

    40,000

    50,000

    60,000

    70,000

    12C x86

    HC-1 HC-1ex HC-2 HC-2ex

    Read

    s Pe

    r Sec

    ond

    SRR189815 aligned to human reference

  • 0

    100

    200

    300

    400

    500

    600

    sam

    tool

    s -b

    S

    inte

    grat

    ed -b

    sam

    tool

    s -b

    S

    inte

    grat

    ed -b

    sam

    tool

    s -b

    S

    inte

    grat

    ed -b

    SRR002968 SRR002813 SRR002987

    Elap

    sed

    Tim

    e (S

    econ

    ds)

    Paired-End Data Mapped to Human Reference G1k v37

    samtools -bS sampe aln2 aln1

    Further Workflow Optimization Integrated BAM format generation

    9/6/12 12 HC-2ex: 192GB DDR3 (host), 64GB SG (coprocessor), stripe 4 @ 600GB SATA disks

    •  SAM to compressed BAM –  samtools vs. integrated

    sampe •  3.9 - 9.8x speed up •  Even greater savings for

    slow file systems

    •  Reference G1k v37 –  3.1 G bases

    •  Paired-end Reads –  SRR002813, SRR002987,

    SRR002968

  • The Jackson Laboratory

    •  Mutations vary in size –  E.g. translocation breakpoint –  Want reads that span

    breakpoint •  Run BWA with varying

    parameters –  Get more of these mutations

    •  Too slow on 32-core servers •  HC-2ex is 11.3x faster

    –  Afford to adjust parameters –  Quickly perform multiple runs –  Achieve better results

    9/6/12 13

    BWA 0.5.10 X86: 4 x 8-core AMD Magny Cours 2.4GHz Opteron HC-2ex: 2 x 6-core Intel X5670 2.93GHz, coprocessor

    “We found GPUs weren’t a good fit for alignment… the performance isn’t that compelling. Other FPGA system vendors didn’t have the number of tools Convey does or the system wasn’t as easy to use. Also a developer community is evolving around the Convey systems where we could share third-party tools.”

    Glen Beane The Jackson Laboratory

  • CONVEY HYBRID-CORE COMPUTING

  • BWA Personality

    • Implemented in hardware on coprocessor FPGAs • Highly parallel—up to 2,048 simultaneous alignment operations

    • 64 alignment units each operate on 32 sequences simultaneously • Leverages Convey HC highly parallel memory

    9/6/12 15

  • Bioinformatics Applications on HC

    9/6/12 16

    Organization Application Usage

    Conv

    ey B

    ioin

    f Sui

    te

    Convey Bioinformatics Suite BWA Reference Mapping

    Convey Bioinformatics Suite Velvet/CGC De Novo Assembly

    Convey Bioinformatics Suite Kmer Counter Read Analysis for Assembly

    Convey Bioinformatics Suite SWSearch Smith-Waterman Search

    Convey Bioinformatics Suite BLAST(p,x) Protein Database Search

    Avai

    labl

    e CLC bio CLC Genomics platform Analysis, workflows, visualization

    Michigan Technological University PCAP Overlap-based Assembly

    University of California San Diego InsPect Protein Assembly with PTMs

    Perfo

    rman

    ce P

    rove

    n

    BlueSpec Memocode Burrows-Wheeler Aligner

    Iowa State University RMAP, Shepard Short-read Mapping

    Technical University Crete BLASTn Nucleotide Sequence Search

    University of California Los Angeles Fluid Registration Medical Imaging

    University of California Riverside BowTie/FHAST Burrows-Wheeler Aligner

    University of South Carolina Mr Bayes Phylogenetics

    Proj

    ect I

    nitia

    ted

    Boston University BLAST(p,x) Protein Database Search

    Free University of Berlin SeqAn Sequence Analysis Library

    Technical University Darmstadt GROMACS Molecular Dynamics

    Bielefeld University SARUMAN Short-read Mapping

    University of Paderborn Suffix Tree Short-read Mapping

    University of Washington BFAST Short-read Mapping

    Virginia Bioinformatics Institute Various Mol Dynamics, Bioinformatics

  • •  In-house development and collaborations –  Customers and partners –  Software vendors –  Instrument manufacturers –  Cloud services

    •  Addressing many facets of bioinformatics –  primary analysis –  de novo assembly and reference mapping –  sequence alignment and search –  annotation, other downstream analysis

    •  www.conveycomputer.com/lifesciences/

    High Throughput Bioinformatics

    9/6/12 17

  • THANK YOU

  • END