1
2002
BIOi Final Project
A Distributed DNA Search Database System
2
Motivation
• Bioinformatics will be an important field in the coming century.
• DNA sequence queries and comparisons are increasingly common.
• Because DNA sequences are long, each DNA query takes a long time.
3
Goal
• To reduce search time, we build a stable and scalable platform for distributed computing.
• Provide a simple graphical user interface for DNA search users.
• Support dynamic joining of new hosts.
4
Architecture
[Diagram: an end user, the coordinators, and the hosts grouped under each coordinator]
5
System Roles
• End user
• Web portal
• Coordinator
• Host
6
End User’s View
• Submit a validly formatted DNA query.
• Wait for a period of time.
• Receive result pages.
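The slides do not say what counts as a validly formatted query. As a minimal sketch, assuming a query is a non-empty string over the four bases A, C, G, and T, the portal could check input as follows. The class name `DnaQueryValidator` and the accepted alphabet are illustrative assumptions, not part of the project.

```java
import java.util.Locale;

/** Hypothetical validator; the accepted alphabet (A, C, G, T) is an assumption. */
final class DnaQueryValidator {
    private DnaQueryValidator() {}

    /** Returns the trimmed, upper-cased query, or throws if it is not a pure A/C/G/T string. */
    static String normalize(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("query is null");
        }
        String query = raw.trim().toUpperCase(Locale.ROOT);
        if (query.isEmpty()) {
            throw new IllegalArgumentException("query is empty");
        }
        for (int i = 0; i < query.length(); i++) {
            char base = query.charAt(i);
            if (base != 'A' && base != 'C' && base != 'G' && base != 'T') {
                throw new IllegalArgumentException(
                        "illegal base '" + base + "' at position " + i);
            }
        }
        return query;
    }
}
```

Rejecting malformed input at the portal would keep bad queries from reaching coordinators and hosts at all.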
7
Coordinator’s view
[Diagram: a coordinator linked to other coordinators and to its own hosts]
8
Coordinator’s View
• Accept an end user’s query to search a range of the database.
• Allocate a sub-range of the database to each host in its group.
• Communicate with the coordinators of other groups.
• Return its group’s result to the requesting end user.
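The allocation step can be sketched as splitting a range of record indices evenly across the hosts in a group. The slides do not describe the real allocation policy, so this is one plausible scheme under stated assumptions: `RangeAllocator`, `Range`, and the half-open index convention are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper a coordinator could use to divide a database range. */
final class RangeAllocator {
    private RangeAllocator() {}

    /** Half-open range [start, end) of record indices (an assumed representation). */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }

        long size() {
            return end - start;
        }
    }

    /** Splits [start, end) into hostCount contiguous sub-ranges whose sizes differ by at most one. */
    static List<Range> split(long start, long end, int hostCount) {
        if (hostCount <= 0 || end < start) {
            throw new IllegalArgumentException("invalid range or host count");
        }
        long total = end - start;
        long base = total / hostCount;   // minimum records per host
        long extra = total % hostCount;  // the first 'extra' hosts get one more record
        List<Range> parts = new ArrayList<>(hostCount);
        long cursor = start;
        for (int i = 0; i < hostCount; i++) {
            long size = base + (i < extra ? 1 : 0);
            parts.add(new Range(cursor, cursor + size));
            cursor += size;
        }
        return parts;
    }
}
```

For example, `RangeAllocator.split(0, 10, 3)` yields the sub-ranges [0, 4), [4, 7), and [7, 10). An even split assumes hosts of similar speed; a real coordinator might weight the split by host load.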
9
Host’s View
[Diagram: a coordinator and the hosts in its group]
10
Host’s View
• Each host belongs to a group and maintains a database partition.
• For each query, a host accepts its group coordinator’s allocation and searches the assigned range of the DNA database.
11
Single Host’s Job
1. Search one part of the database.
2. Share the alignment workload.
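A single host’s job can be illustrated with a deliberately simple stand-in. The slides do not specify the alignment algorithm, so this sketch uses exact substring matching instead of a real alignment, and the class `PartitionSearch` and its in-memory list of sequences are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical host-side search over one slice of an in-memory partition. */
final class PartitionSearch {
    private PartitionSearch() {}

    /**
     * Returns the indices i in [from, to) whose sequence contains the query
     * as an exact substring. Exact matching stands in for real alignment here.
     */
    static List<Integer> search(List<String> sequences, int from, int to, String query) {
        if (from < 0 || to > sequences.size() || from > to) {
            throw new IllegalArgumentException("range outside partition");
        }
        List<Integer> hits = new ArrayList<>();
        for (int i = from; i < to; i++) {
            if (sequences.get(i).contains(query)) {
                hits.add(i);
            }
        }
        return hits;
    }
}
```

Because each host scans only its own slice, a coordinator can combine the per-host hit lists without further coordination between hosts.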
12
Communication Protocol
1. Coordinating Module
2. Job Communicating Module
3. Query Processing Module
4. Database Module
5. Status Communicating Module
13
Communication Protocol
14
Robustness
• Fault Tolerance
  – Fail-safe mechanism for handling failures of both coordinators and hosts.
  – Allow a fixed number of failed hosts.
• Recovery
  – Recover the previous state of crashed coordinators and hosts.
  – Jobs of failed hosts can be taken over by other hosts.
• Scalability
  – New hosts can join the system to increase computing power.
  – For local area networks only.
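The takeover of a failed host’s jobs can be sketched as a reassignment over the coordinator’s bookkeeping. This is a minimal sketch, not the project’s mechanism: the class `JobReassigner`, the use of string host and job identifiers, and the round-robin policy are all assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical coordinator-side helper for redistributing work after host failures. */
final class JobReassigner {
    private JobReassigner() {}

    /**
     * Returns a new host-to-jobs map in which the jobs of failed hosts are
     * spread round-robin over the surviving hosts. The input map is not modified.
     */
    static Map<String, List<String>> reassign(Map<String, List<String>> assignment,
                                              Set<String> failedHosts) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        List<String> orphaned = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : assignment.entrySet()) {
            if (failedHosts.contains(entry.getKey())) {
                orphaned.addAll(entry.getValue());   // jobs that need a new owner
            } else {
                result.put(entry.getKey(), new ArrayList<>(entry.getValue()));
            }
        }
        if (result.isEmpty() && !orphaned.isEmpty()) {
            throw new IllegalStateException("no surviving host to take over jobs");
        }
        List<String> survivors = new ArrayList<>(result.keySet());
        for (int i = 0; i < orphaned.size(); i++) {
            String target = survivors.get(i % survivors.size());
            result.get(target).add(orphaned.get(i));
        }
        return result;
    }
}
```

Detecting that a host has failed in the first place (for example by a missed status message) is outside this sketch.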
15
Implementation
• Pure Java runtime environment
• Web site: Apache server + JSP
• Java RMI
• MySQL™
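To make the Java RMI choice concrete, here is a sketch of what a remote interface between coordinator and host might look like. The project’s actual interfaces are not shown in the slides, so the names `SearchHost`, `LocalSearchHost`, `ping`, and `countMatches` are hypothetical, and the in-memory partition stands in for the MySQL-backed database.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical remote interface a coordinator could call on each host. */
interface SearchHost extends Remote {
    /** Liveness probe, e.g. for status checks. */
    boolean ping() throws RemoteException;

    /** Counts sequences in [from, to) that contain the query as an exact substring. */
    int countMatches(String query, int from, int to) throws RemoteException;
}

/** Plain implementation; exporting it as a remote object is left to deployment code. */
final class LocalSearchHost implements SearchHost {
    private final List<String> partition;

    LocalSearchHost(List<String> partition) {
        this.partition = new ArrayList<>(partition); // defensive copy
    }

    @Override
    public boolean ping() {
        return true;
    }

    @Override
    public int countMatches(String query, int from, int to) {
        if (from < 0 || to > partition.size() || from > to) {
            throw new IllegalArgumentException("range outside partition");
        }
        int count = 0;
        for (int i = from; i < to; i++) {
            if (partition.get(i).contains(query)) {
                count++;
            }
        }
        return count;
    }
}
```

In a deployment, such an object would typically be exported with `UnicastRemoteObject.exportObject(host, 0)` and bound in a registry obtained from `LocateRegistry`, so that a coordinator can look it up and call it across machines.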
16
Potential Application
• Mathematics.
• Message Encryption & Decryption.
• Distributed data mining.
17
Implementation Phase
• Database Module: MySQL setup; DNA sequence data collection
• Job Module: job assignment
• Query Module: DNA comparison implementation
• Status Module: handle host failure exceptions; handle coordinator failure exceptions
• Coordinator Module: get global and group views
• Web Application Module: design the Web interface; GUI application interface; real-time response interface
18
Project participants
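• 蔡景祥 Chin Hseung Tsai, B88502119, Information Management, 4th year
• 徐蔚倫 Wei Lun Hsu, B87801009, Public Health, 4th year
• 張均合 Chun Ho Chang, R91725026, Graduate Institute of Information Management, 1st-year master's
• 羅文興 Wen Hsin Lo, R91725034, Graduate Institute of Information Management, 1st-year master's
• 劉智雄 Chi Hsiung Liu, R91725040, Graduate Institute of Information Management, 1st-year master's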
19
Questions?
20
• ThanQ