
Page 1: 04/23/2003 Massively Parallel Solutions for Molecular Sequence Analysis Prabhakar R. Gudla CMSC 838T Presentation

04/23/2003

Massively Parallel Solutions for Molecular Sequence Analysis

Prabhakar R. Gudla

CMSC 838T Presentation

Page 2:

Outline

• Motivation
• Smith-Waterman Algorithm
  – Parallelization
• High Performance Computing
  – Hybrid Architecture
  – Fuzion 150
• Performance Evaluation
• Conclusions and Comments

Page 3:

Motivation

• Discovered sequences are analyzed by comparison with databases
• Complexity is proportional to the product of query size and database size

☞ Analysis is too slow on sequential computers

Page 4:

Sequence Alignment

Two possible approaches:
• Heuristics, e.g. BLAST and FASTA, but the more efficient the heuristics, the worse the quality of the results
• Parallel processing: get high-quality results in reasonable time

BLAST, FASTA, Smith-Waterman (S-W)

[Figure: BLAST, FASTA, and Smith-Waterman placed on two axes, search speed (slower to faster) and data quality (lower to higher)]


Page 6:

Parallelization of S-W

• Matrix cells along a single diagonal are computed in parallel
• Comparison is performed in l1 + l2 - 1 steps on l1 PEs

[Figure: animation of the DP matrix for ATCTCG (length l1, one character per PE, P1 to P6) against GTCTATC (length l2)]
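The diagonal scheme on this slide can be sketched in Python. This is only an illustration, assuming the simple scoring used elsewhere in the deck (match 2, mismatch -1, gap penalty 1); the function name `sw_by_antidiagonals` is made up for this sketch. All cells with the same i + j depend only on earlier diagonals, so each pass of the outer loop corresponds to one parallel step.

```python
def sw_by_antidiagonals(s1, s2, gap=1, match=2, mismatch=-1):
    """Fill the Smith-Waterman matrix one anti-diagonal at a time."""
    l1, l2 = len(s1), len(s2)
    H = [[0] * (l2 + 1) for _ in range(l1 + 1)]
    steps = 0
    for d in range(2, l1 + l2 + 1):  # d = i + j identifies one anti-diagonal
        # The cells of one anti-diagonal are independent of each other,
        # so on l1 PEs this inner loop would run in a single parallel step.
        for i in range(max(1, d - l2), min(l1, d - 1) + 1):
            j = d - i
            sbt = match if s1[i - 1] == s2[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j] - gap,
                          H[i][j - 1] - gap,
                          H[i - 1][j - 1] + sbt)
        steps += 1
    return H, steps


H, steps = sw_by_antidiagonals("ATCTCG", "GTCTATC")
best = max(max(row) for row in H)
print(steps, best)
```

For the two sequences of the slide (l1 = 6, l2 = 7) the loop runs 6 + 7 - 1 = 12 times.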

Page 7:

Parallel Architectures

Embedded Massively Parallel Accelerators

Fuzion 150: 1536 processors on a single chip

Other accelerators: Decypher, Biocellerator, GeneMatcher2, Kestrel, SAMBA, P-NAC, Splash-2, BioScan

Systola 1024: PC add-on board with 1024 processors


Page 9:

Previous Applications

• Volume Visualization [Schmidt '00]
• Automatic Visual Quality Control (Automobile Industry)
• Computer Tomography [Schmidt, Schimmler, and Schröder '98]
• Video Compression [Schmidt and Schimmler '99]
• Range of Transforms (Fourier, Wavelet, Hough, Radon) [Schmidt, Schimmler, and Schröder '99]
• Image Processing [Schimmler and Lang '96, Lenders and Schröder '90, Jiang, Edirisinghe, and Schröder '97]

Page 10:

Hybrid Architecture

[Figure: 16 Systola 1024 boards connected by a high-speed Myrinet switch]

• Hybrid Computer: combines the SIMD and MIMD paradigms within one parallel architecture

Page 11:

Architecture of Systola 1024

[Figure: block diagram of the board: ISA, interface processors, RAM NORTH, RAM WEST, controller, program memory, host computer bus]

• Instruction Systolic Array (ISA): 32 × 32 mesh of processing elements
• Wavefront instruction execution

Page 12:

Mapping onto Systola 1024

[Figure: query characters a0 to a1023 mapped one per PE onto the 32 × 32 array (a0 to a31 in the first row, a32 to a63 in the second, a992 to a1023 in the last); subject sequences bk ... b1 b0, followed by the next sequence ... c1 c0, stream into the array]

b: subject sequence
a: query sequence (length equal to 1024)

• Subject sequences can be pipelined with only a one-step delay
• k steps for a subject sequence of length k
• Efficient routing on the ISA: Row Ringshift and Broadcast

Page 13:

Fuzion 150 Architecture

• 0.25-µm, single-chip SIMD architecture
• 1536 PEs @ 200 MHz: 300 GOPS
• 600 GB/s on-chip, 6.4 GB/s off-chip bandwidth
• Multithreading (control units interact via semaphores)
• Developed by ClearSpeed Technology (UK) for graphics and network processing

[Figure: block diagram: linear SIMD array of 1536 PEs, each with 2 Kbytes DRAM; FUZION bus; 32-bit EPU (ARC); video I/O; display; instruction fetch; SIMD controller; local memory on 1, 2, or 4 Rambus channels (6.4 GB/s); host via AGP]

Page 14:

Fuzion 150 Architecture

[Figure: six blocks (Block 0 to Block 5) of 256 PEs each, PE(0,0) to PE(5,255), attached to the Fuzion bus and local memory. Each PE has an 8-bit ALU, a 32-byte register file, and 2 KByte of PE memory (DRAM), and is connected to its left and right neighbor PEs, the instruction stream, and the block I/O channel]

Page 15:

Mapping onto the Fuzion 150

[Figure: query characters a0 to a255 in Block 0, a256 to a511 in Block 1, and so on up to a1280 to a1535 in Block 5; subject sequences bk ... b1 b0, followed by the next sequence ... c1 c0, stream into the array]

b: subject sequence
a: query sequence (length equal to 1536)

• No fast global communication
• 2-step local communication
• Subject sequences can be pipelined with only a one-step delay


Page 17:

Performance Evaluation

Scan times in seconds for TrEMBL 14 (351,834 protein sequences) for various query sequence lengths

Query sequence length           256     512     1024    2048    4096
Fuzion 150, time (s)            12      22      42      82      162
  speedup to PIII 1 GHz         88      97      102     105     106
Systola 1024, time (s)          294     577     1137    2241    4611
  speedup to PIII 1 GHz         4       4       4       4       4
Cluster of 16 Systolas, time (s) 20     38      73      142     290
  speedup to PIII 1 GHz         53      56      58      60      59

• Parallel implementation scales linearly with sequence length
• Computing time dominates data transfer time
• Fuzion 150 is 25 times faster than a single Systola 1024; difference in CMOS technology (0.25 µm vs. 1.0 µm)
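As a quick plausibility check, the ratios behind these statements can be recomputed from the scan times on this slide. The snippet below is a small sketch; the variable names and the "efficiency" measure (single-board time divided by 16 times the cluster time) are assumptions made here, not definitions from the presentation.

```python
query_lengths = [256, 512, 1024, 2048, 4096]

# Scan times in seconds, read from the table on this slide.
fuzion = [12, 22, 42, 82, 162]
systola = [294, 577, 1137, 2241, 4611]
cluster16 = [20, 38, 73, 142, 290]

# How much faster the Fuzion 150 is than a single Systola 1024.
fuzion_vs_systola = [s / f for s, f in zip(systola, fuzion)]

# Fraction of the ideal 16-fold gain that the 16-board cluster achieves.
cluster_efficiency = [s / (16 * c) for s, c in zip(systola, cluster16)]

for n, r, e in zip(query_lengths, fuzion_vs_systola, cluster_efficiency):
    print(f"length {n}: Fuzion/Systola = {r:.1f}, cluster efficiency = {e:.2f}")
```

The Fuzion-to-Systola ratio comes out between roughly 24 and 29 over the listed query lengths, in line with the "25 times faster" summary.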

Page 18:

Performance Evaluation

Time comparisons for a 10 Mbase search on different parallel architectures with different query lengths

[Figure: bar chart of search time in seconds (logarithmic axis, 1 to 100) for SAMBA, Fuzion 150, Kestrel, and 16K-PE MasPar at query lengths 512, 1024, and 2048]

• 4× faster than 16K-PE MasPar
• 6× faster than Kestrel
• 5× faster than SAMBA (special-purpose 3-board architecture)

Page 19:

Performance Evaluation

USparc : Sun Ultrasparc 140 MHz

B-SYS: 470-PE ISA

Alpha: DEC Alpha – 433 MHz

1K MP2: 1K-PE MasPar

Paragon: 32-node Paragon

Decy-1: 1-board Decypher-II*

Merc1: 1-board Mercury+

Bcll-1: Biocellerator*

Samba: 2-board Samba+

16-MP2: 16K-PE MasPar

FDF-3: 5-Board Paracell FDF+

Kestrel: 1-board Kestrel

Decy-15: 15-board Decypher-II*

+ (single purpose); * (FPGA)

Source: Dahle et al., PDPTA, 1243-1249, 1999


Page 21:

Conclusions

Demonstrated how fine-grained and hybrid parallel architectures can be applied efficiently for Comparative Genomics

Significant runtime savings for full genome comparisons and database searching

Same systems can be used for accelerating other bioinformatics applications, e.g. Hidden Markov Models

Page 22:

Comments

☞ With hardware support, is S-W as fast as BLAST?

Time taken for the search (seconds), by sequence under test:

Search tool (against Swiss-Prot DB)   ELVIS (5)   Metr (276)   Arp_arath (536)
FASTA 3.3                             4.3         20.0         25.0
BLAST 2.2                             1.0         4.0          10.0
SSearch (SW)                          6.0         240.0        565.0
H'Ware Accl.                          3.2         16.8         29.7

Comparative search speeds on a 600 MHz 21264A Alpha machine (comparable MCUPS as Hybrid System and Fuzion 150)

* Source: Shane Sturrock, SCS, 2(1), April 2002

Page 23:

Comments

☞ Is it feasible to use S-W as the default?
• Currently offered as a default option at EBI (European Bioinformatics Institute), which handles 15K queries per month with a full implementation of S-W
• Depends on the "objectives" of the search

☞ Just how much more accurate is S-W?
• 5-10% more "sensitive" towards divergent matches than BLAST (Shpaer et al., Genomics 38, 179-191, 1996)
• BLAST will retrieve most biologically significant similarities, but will miss a few and will include some chance similarities

Page 24:

Comparison of S-W vs. BLAST

Source: Shpaer et al., Genomics 38(2), pp. 179-191, 1996

☞ Is there a real difference in the results? YES

Page 25:

Comparison of S-W, FASTA, and BLAST

Note: The numbers in the table show for how many protein SF the method in the column performed better than the one in the row

Page 26:

Acknowledgements

Dr. Bertil Schmidt

Dr. Chau-Wen Tseng

Page 27:

Q&A

Page 28:

Extra Slides

Page 29:

Full Genome Comparison

• Related organisms, but Tuberculosis causes a disease
• Find common and different parts
• 16 × 10^6 pairwise sequence comparisons

3918 protein sequences, 1,329,298 amino acids
4289 protein sequences, 1,359,008 amino acids
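The size of this comparison follows directly from the sequence and residue counts on the slide. A short sketch of the arithmetic (variable names are chosen here for illustration): the number of pairwise comparisons is the product of the two protein counts, and, because every protein of one genome is compared with every protein of the other, the total number of DP matrix cells is the product of the two amino-acid totals.

```python
# Counts taken from the slide.
proteins_a, residues_a = 3918, 1_329_298
proteins_b, residues_b = 4289, 1_359_008

# Every protein of one genome is compared with every protein of the other.
pairwise_comparisons = proteins_a * proteins_b

# Summing len(x) * len(y) over all pairs equals the product of the totals.
dp_cells = residues_a * residues_b

print(f"{pairwise_comparisons:,} comparisons, about {dp_cells:.2e} DP cells")
```

The product gives about 16.8 million comparisons, matching the order of magnitude quoted on the slide.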

Page 30:

Smith-Waterman Algorithm

• Optimal local alignment of two sequences
• Performs an exhaustive search for the optimal local alignment
• Complexity O(nm) for sequence lengths n and m
• Based on the 'dynamic programming' (DP) algorithm
  – Fill the DP matrix using a substitution (mutation) matrix
  – Find the maximal value (score) in the matrix
  – Trace back from the score until a 0 value is reached

Page 31:

Smith-Waterman Algorithm

Aligning S1 and S2 of length l1 and l2 using the recurrences:

$$H(i,j) = \max\{0,\ E(i,j),\ F(i,j),\ H(i-1,j-1) + Sbt(S1_i, S2_j)\}, \quad 1 \le i \le l1,\ 1 \le j \le l2$$

$$H(i,0) = E(i,0) = 0, \qquad H(0,j) = F(0,j) = 0$$

$$E(i,j) = \max\{H(i,j-1) - \alpha,\ E(i,j-1) - \beta\}$$

$$F(i,j) = \max\{H(i-1,j) - \alpha,\ F(i-1,j) - \beta\}$$

where α and β are the gap penalties.

Calculate three possible ways to extend the alignment:
• by one amino acid (AA) in each sequence
• by one AA in the first sequence, aligned with a gap in the second
• by one AA in the second sequence, aligned with a gap in the first
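A minimal Python sketch of these recurrences is given below. It is an illustration, not the presenter's implementation: the names `alpha` and `beta` stand for the penalties for opening and extending a gap, and the match/mismatch scores (2 and -1) are taken from the worked example later in the deck. Only the best local score is returned; traceback is omitted.

```python
def smith_waterman_score(s1, s2, alpha=1, beta=1, match=2, mismatch=-1):
    """Best local alignment score using the H, E, F recurrences."""
    l1, l2 = len(s1), len(s2)
    # Row 0 and column 0 hold the initial values H = E = F = 0.
    H = [[0] * (l2 + 1) for _ in range(l1 + 1)]
    E = [[0] * (l2 + 1) for _ in range(l1 + 1)]
    F = [[0] * (l2 + 1) for _ in range(l1 + 1)]
    best = 0
    for i in range(1, l1 + 1):
        for j in range(1, l2 + 1):
            # Extend by one character of the second sequence (gap in the first).
            E[i][j] = max(H[i][j - 1] - alpha, E[i][j - 1] - beta)
            # Extend by one character of the first sequence (gap in the second).
            F[i][j] = max(H[i - 1][j] - alpha, F[i - 1][j] - beta)
            # Extend by one character in each sequence.
            sbt = match if s1[i - 1] == s2[j - 1] else mismatch
            H[i][j] = max(0, E[i][j], F[i][j], H[i - 1][j - 1] + sbt)
            best = max(best, H[i][j])
    return best


score = smith_waterman_score("ATCTCGTATGATG", "GTCTATCAC")
print(score)
```

With `alpha == beta` the E and F terms reduce to a plain per-character gap penalty, which is the simplified form used in the worked example.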

Page 32:

Smith-Waterman Algorithm

Align S1 = ATCTCGTATGATG and S2 = GTCTATCAC with

$$Sbt(x,y) = \begin{cases} 2 & \text{if } x = y \\ -1 & \text{else} \end{cases}, \qquad \alpha = 1,\ \beta = 1$$

$$H(i,j) = \max\{0,\ H(i-1,j) - 1,\ H(i,j-1) - 1,\ H(i-1,j-1) + Sbt(S1_i, S2_j)\}$$

[Figure: filled DP matrix H for S1 and S2 with the traceback of the optimal local alignment]

S1: A T C T C G T A T G A T G
S2:     G T C - T A T C A C

Page 33:

Principles of the ISA

Page 34:

Principles of the ISA

Communication register

Page 35:

Interface Processors

[Figure: ISA with Interface Processors North along its northern border and Interface Processors West along its western border]

Page 36:

Instruction Systolic Array

[Figure: ISA with row selectors, column selectors, and a stream of instructions (+, *, -) moving through the array]

• Wavefront instruction execution
• Fast accumulation operations (e.g. row sum, broadcast, ringshift)

Page 37:

Advantage of ISA's: Performing Aggregate Functions

Contents of the communication register C in a row of four PEs, step by step:

• Row Broadcast, instruction C := C[WEST]
    C = 234, 0, 0, 0
    C = 234, 234, 0, 0
    C = 234, 234, 234, 0
    C = 234, 234, 234, 234

• Row Sum, instruction C := C + C[WEST]
    C = 1, 2, 3, 4
    C = 1, 3, 3, 4
    C = 1, 3, 6, 4
    C = 1, 3, 6, 10

• Row Ringshift, instructions C := C[WEST]; C := C[EAST]
    C = 1000, 1, 1, 1
    C = 1, 1000, 1, 1
    C = 1, 1, 1000, 1
    C = 1, 1, 1, 1000
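The row broadcast and row sum on this slide can be mimicked with a few lines of Python. This is a simplified model, assuming that a single instruction travels through the row from west to east and reaches one more PE per step; the helper name `wavefront` is hypothetical and the ringshift is left out.

```python
def wavefront(row, op):
    """Apply op(own value, western neighbor's value) to one PE per step.

    Returns the register contents of the whole row after every step.
    """
    cur = list(row)
    states = [list(cur)]
    for k in range(1, len(cur)):
        # At step k the instruction has reached PE k, which reads
        # the communication register of its western neighbor.
        cur[k] = op(cur[k], cur[k - 1])
        states.append(list(cur))
    return states


# Row broadcast: C := C[WEST]
broadcast = wavefront([234, 0, 0, 0], lambda c, west: west)
# Row sum: C := C + C[WEST]
row_sum = wavefront([1, 2, 3, 4], lambda c, west: c + west)

for state in broadcast + row_sum:
    print(state)
```

Both aggregate results are complete once the instruction has passed the last PE of the row.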

Page 38:

Data Transfer

• In Systola 1024: input of a new character (bj) into the lower western IP and, when l1 > 2048, the input of previously computed H, E, and F cells and the output of H, E, and F cells
• For Fuzion 150: during the computation of 16 new H-cells in each PE, one new character is input via the Fuzion bus

Page 39:

Instruction Counts

Instruction Count (IC) to update 2 and 16 H-cells in Systola 1024 and Fuzion 150, respectively:

Operations in each PE per iteration step                      Systola   Fuzion
Get H(i-1, j), F(i-1, j), bj, max_{i-1} from neighbor         20        22
Compute t = max{0, H(i-1, j-1) + Sbt(ai, bj)}                 20        576
Compute F(i, j) = max{H(i-1, j) - α, F(i-1, j) - β}           8         336
Compute E(i, j) = max{H(i, j-1) - α, E(i, j-1) - β}           8         448
Compute H(i, j) = max{t, E(i, j), F(i, j)}                    8         368
Compute max_i = max{H(i, j), max_{i-1}}                       4         184
Sum                                                           68        1934
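To compare the two machines per updated cell, the totals can be divided by the number of H-cells each PE updates per iteration step (2 on Systola 1024, 16 on Fuzion 150). A small sketch of that arithmetic, with list names chosen here for illustration:

```python
# Instruction counts per iteration step, row by row as listed on this slide.
systola_counts = [20, 20, 8, 8, 8, 4]
fuzion_counts = [22, 576, 336, 448, 368, 184]

# H-cells updated per PE in one iteration step.
systola_cells, fuzion_cells = 2, 16

systola_total = sum(systola_counts)
fuzion_total = sum(fuzion_counts)

systola_per_cell = systola_total / systola_cells
fuzion_per_cell = fuzion_total / fuzion_cells

print(systola_total, systola_per_cell)
print(fuzion_total, fuzion_per_cell)
```

The per-cell figures count instructions only; they say nothing about cycle counts or word widths of the two PE types.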

Page 40:

Maximum Characters/PE

• The memory per PE on Systola is 32 (16-bit) registers
  – 2 characters per PE is the maximum possible (2 chars × 20 AAs per substitution row × 8 bits per substitution value = 20 registers)
• The memory per PE on Fuzion is 2 KB
  – The maximum is 16 chars per PE, restricted due to "indirect addressing" per PE

Page 41:

Indirect Address

An addressing mode found in many processors' instruction sets where the instruction contains the address of a memory location which contains the address of the operand (the "effective address") or specifies a register which contains the effective address
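The difference between a direct and an indirect access can be shown with a toy memory model in Python. This is a generic illustration only; the list `memory` and the two helper functions are invented for this sketch and do not describe the Fuzion 150 instruction set.

```python
# Toy memory: cell 2 holds an address, cell 5 holds the operand.
memory = [0] * 8
memory[2] = 5    # the address stored in cell 2 points at cell 5
memory[5] = 42   # the operand itself


def load_direct(mem, addr):
    """The instruction carries the operand's address."""
    return mem[addr]


def load_indirect(mem, addr):
    """The instruction carries the address of a cell that holds
    the operand's (effective) address."""
    effective_address = mem[addr]
    return mem[effective_address]


print(load_direct(memory, 2), load_indirect(memory, 2))
```

A table lookup whose index is only known at run time is a typical use of this mode.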

Page 42:

Myrinet - Overview

Myrinet is a cost-effective, high-performance, packet-communication and switching technology that is widely used to interconnect clusters of workstations, PCs, servers, or single-board computers

Conventional networks (e.g., ethernet) can be used to build clusters, but do not provide the performance/features required for HPC or high-availability clustering

Page 43:

Myrinet - Characteristics

Full-duplex 2+2 Gigabit/second data rate links, switch ports, and interface ports

Flow control, error control, and "heartbeat" continuity monitoring on every link

Low-latency, cut-through, crossbar switches, with monitoring for high-availability applications

Switch networks that can scale to tens of thousands of hosts, and that can also provide alternative communication paths between hosts

Host interfaces that execute a control program to interact directly with host processes ("OS bypass") for low-latency communication, and directly with the network to send, receive, and buffer packets

Page 44:

lq processors: Hybrid

Length of query sequence = M, number of processors in ISA = N^2, assuming M = k × N:

1. k < N: Each k × N subarray computes the alignment of the same query sequence with different subject sequences
2. k ≥ N:
• k/N = 2: load 2 chars per PE
• k/N > 2: split the query sequence into k/2N passes and load 2N^2 chars in each pass
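The case distinction on this slide can be written down as a small planning function. The sketch below is an interpretation under stated assumptions: the function name `isa_plan` and the returned fields are invented here, the number of parallel subarrays is taken as N // k, and cases the slide does not list (for example k/N between 1 and 2) are handled by rounding up.

```python
import math


def isa_plan(query_len, n=32, max_chars_per_pe=2):
    """Decide how a query of length M = k * N is laid out on an N x N ISA."""
    if query_len % n != 0:
        raise ValueError("this sketch assumes M = k * N")
    k = query_len // n
    if k < n:
        # Several k x N subarrays work on different subject sequences.
        return {"subarrays": n // k, "chars_per_pe": 1, "passes": 1}
    capacity = max_chars_per_pe * n * n          # 2 * N^2 characters per pass
    passes = math.ceil(query_len / capacity)     # k / 2N when it divides evenly
    chars_per_pe = min(max_chars_per_pe, math.ceil(query_len / (n * n)))
    return {"subarrays": 1, "chars_per_pe": chars_per_pe, "passes": passes}


for m in (256, 1024, 2048, 4096):
    print(m, isa_plan(m))
```

For the 32 × 32 array of the Systola 1024 this gives a single pass up to a query length of 2048 and two passes for 4096.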

Page 45:

lq processors: Fuzion 150

Length of query sequence = M, number of processors = 1536:

1. k × M = 1536: k alignments of the same query sequence with different subject sequences are carried out in parallel
2. k × 1536 = M:
• Split into k passes, which requires I/O of intermediate results in each step
• Data transfers can be minimized by assigning k = M/1536 chars per PE; currently 16 chars per PE is the limit
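The same kind of planning sketch applies to the linear array of the Fuzion 150. Again this is an illustration with assumed names (`fuzion_plan` and its result fields); query lengths that do not divide or multiply evenly into 1536 are handled by rounding, which the slide does not specify.

```python
import math


def fuzion_plan(query_len, pes=1536, max_chars_per_pe=16):
    """Decide how a query of length M is laid out on a linear array of PEs."""
    if query_len <= pes:
        # Several copies of the query fit side by side; each copy is
        # aligned with a different subject sequence.
        return {"parallel_alignments": pes // query_len,
                "chars_per_pe": 1, "passes": 1}
    # Keep several query characters in each PE, up to the stated limit.
    chars_per_pe = min(math.ceil(query_len / pes), max_chars_per_pe)
    # Only queries beyond the per-PE limit need additional passes.
    passes = math.ceil(query_len / (chars_per_pe * pes))
    return {"parallel_alignments": 1,
            "chars_per_pe": chars_per_pe, "passes": passes}


for m in (256, 1536, 4096, 1536 * 32):
    print(m, fuzion_plan(m))
```

With the 16-character limit a single pass covers queries up to 16 × 1536 = 24576 characters.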

Page 46:

Concept of true and false hits

The following cases were distinguished:
• True positives: alignments between proteins of similar structure that fall above a given threshold (defined by the sequence alignment method)
• False positives: alignments between proteins of dissimilar structure that fall above a given threshold of the sequence alignment
• True negatives: alignments between proteins of dissimilar structure that fall below a given threshold
• False negatives: alignments between proteins of similar structure that fall below a given threshold
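The four cases map onto a two-way decision, sketched here in Python. The function name `classify_hit` is hypothetical, and treating a score exactly at the threshold as "below" is an assumption, since the slide does not say how ties are handled.

```python
def classify_hit(similar_structure, score, threshold):
    """Label one alignment according to the four cases on this slide."""
    above = score > threshold  # assumption: a tie counts as below
    if similar_structure:
        return "true positive" if above else "false negative"
    return "false positive" if above else "true negative"


# Hypothetical scores, for illustration only.
examples = [(True, 120, 100), (True, 80, 100), (False, 130, 100), (False, 40, 100)]
labels = [classify_hit(*e) for e in examples]
print(labels)
```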

Page 47:

Guidelines

When to use S-W?
• If you are looking for a protein distantly related to your query sequence (e.g., you have a known protein sequence and you want to find possible distant homologues)
• If you are looking for the protein encoded in your low-quality DNA query sequence (e.g., you have a badly sequenced cDNA clone)
• If you are looking for a DNA sequence corresponding to your protein query sequence (e.g., you want to identify potential homologues of your protein in the EST databases)

When to use BLAST?
• If you are looking for close matches and you don't mind missing lower homology sequences
• If you want a quick answer

Page 48:

Performance Evaluation of SAMBA

Time in seconds and speed-up relative to Samba:

Query sequence length             10     30     100    300    1000    3000    10000
Samba, time (s)                   25     25     26     30     40      77      210
DEC-Alpha 150 MHz, time (s)       57     120    350    1041   3468    11510   38450
  speed-up                        2.3    4.8    13.5   34.7   86.7    150     183
SUN-Sparc 5 110 MHz, time (s)     95     239    746    2215   7300    24269   80300
  speed-up                        3.8    9.5    28.6   74     183     315     382
DEC 5000/250 40 MHz, time (s)     182    548    1407   4054   12920   41169   131193
  speed-up                        7.3    22     54     135    323     534     625

Source: Jamet and Lavenier, CABIOS, 12(7), 609-615, 1997

☞ The longer the query length, the better the speed-up
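The speed-up rows are simply the workstation time divided by the Samba time for the same query length. The sketch below recomputes them for the DEC-Alpha row from the times on this slide; the list names are chosen here for illustration.

```python
query_lengths = [10, 30, 100, 300, 1000, 3000, 10000]

# Times in seconds, read from the table on this slide.
samba = [25, 25, 26, 30, 40, 77, 210]
dec_alpha = [57, 120, 350, 1041, 3468, 11510, 38450]

# Speed-up = workstation time / accelerator time.
speedups = [round(host / acc, 1) for host, acc in zip(dec_alpha, samba)]

for n, s in zip(query_lengths, speedups):
    print(f"query length {n}: speed-up {s}")
```

The Samba time stays almost flat for short queries while the workstation time grows with query length, which is why the speed-up rises steadily.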

Page 49:

Performance Evaluation of Kestrel

USparc : Sun Ultrasparc 140 MHz

B-SYS: 470-PE ISA

Alpha: DEC Alpha – 433 MHz

1K MP2: 1K-PE MasPar

Paragon: 32-node Paragon

Decy-1: 1-board Decypher-II*

Merc1: 1-board Mercury+

Bcll-1: Biocellerator*

Samba: 2-board Samba+

16-MP2: 16K-PE MasPar

FDF-3: 5-Board Paracell FDF+

Kestrel: 1-board Kestrel

Decy-15: 15-board Decypher-II*

+ (single purpose); * (FPGA)

Source: Dahle et al., PDPTA, 1243-1249, 1999

Page 50:

Performance Evaluation of Splash-2

Hardware          Specifics            MCUPS
Splash-2          Unidir; 16 boards    43,000
Splash-2          Bidir; 16 boards     34,000
Splash-2          Unidir; 1 board      3,000
Splash-2          Bidir; 1 board       2,100
Splash-1          Bidir; 746 PEs       370
SPARC 10/30 GX    gcc -O2              1.2
VAX 6620          VMS; CC              1.0
SPARC-1           gcc -O2              0.87
486DX-50 PC       DOS; gcc -O2         0.67

Source: Hoang, IEEE-CMM, 185-191, 1993