DETAILS ABOUT PARALLEL MOTIF FINDING ALGORITHMS FOR BIOINFORMATICS

WMU CS6260 Parallel Computations II, Spring 2013, Presentation #2
Professor: Dr. de Doncker
Name: Xuanyu Hu
March 11, 2013


OUTLINE

Quick Review About Last Presentation
More Details
    GenBank
        The Beginning Of Bioinformatics
        How To Use GenBank
        Problems
    Good Performance In Bioinformatics
        The Results From Real DNA
        The Results Of Our Project
        Future
Conclusion
Nothing Is Impossible
Questions
References


QUICK REVIEW - SEQUENCING DNA

Pattern in DNA
Function
Drug target identification and new drug discovery


REVIEW - GENE MUTATION


REVIEW - SOLUTION

Solution in a sequential way
    Greedy Algorithm
    Brute Force
    Branch And Bound
Parallelize the serial program with MPI
    Load balancing in the parallel program
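The slides do not show the search code itself. As a minimal sketch only, assuming the motif problem is posed as a median-string search (find the L-mer with the smallest total Hamming distance to a set of sequences), the brute-force approach and a block partition of its candidates might look as follows. All function names are hypothetical, and the per-block loop runs sequentially in place of MPI ranks so that the example stays self-contained.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(pattern, sequence):
    """Smallest Hamming distance between pattern and any window of sequence."""
    k = len(pattern)
    return min(hamming(pattern, sequence[i:i + k])
               for i in range(len(sequence) - k + 1))


def total_distance(pattern, sequences):
    """Sum of the best per-sequence distances for one candidate pattern."""
    return sum(min_distance(pattern, s) for s in sequences)


def candidate(index, length):
    """The index-th string of the given length over ACGT, in lexicographic order."""
    letters = []
    for _ in range(length):
        index, r = divmod(index, 4)
        letters.append("ACGT"[r])
    return "".join(reversed(letters))


def block_ranges(total, parts):
    """Split range(total) into contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(total, parts)
    ranges, start = [], 0
    for r in range(parts):
        stop = start + base + (1 if r < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def best_in_range(sequences, length, start, stop):
    """Brute-force search over candidates start..stop-1 (the work of one process)."""
    best = None
    for i in range(start, stop):
        p = candidate(i, length)
        d = total_distance(p, sequences)
        if best is None or (d, p) < best:
            best = (d, p)
    return best


def median_string(sequences, length, workers=4):
    """Search all 4**length candidates block by block, then keep the minimum.

    Assumption: each block stands in for one MPI rank; here they run in a loop.
    """
    partial = [best_in_range(sequences, length, a, b)
               for a, b in block_ranges(4 ** length, workers) if a < b]
    return min(partial)


if __name__ == "__main__":
    demo = ["TTACGTTT", "GGACGTGG", "CCACGTCC"]  # toy data, not from the slides
    print(median_string(demo, 4, workers=3))
```

In an MPI version each rank would call best_in_range on its own block and the partial results would be combined with a minimum reduction. Equal-sized blocks are the simplest form of load balancing here because every brute-force candidate costs the same amount of work; a branch-and-bound search prunes unevenly and would need a different scheme.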


THE BEGINNING OF BIOINFORMATICS

Over the past decade there has been a dramatic increase in the number of completely sequenced genomes resulting from the race of multibillion-dollar genome-sequencing projects.


GENBANK

These achievements have led to a flood of data in genome sequence databases such as GenBank and EMBL, which have been doubling in size almost every year.


THE BEGINNING OF BIOINFORMATICS

This flood of sequence data requires a system for representing, organising, manipulating, distributing, maintaining and finally using the information (Computer Simulation).

Bioinformatics (the bridge)
Computer Science works with Biology
Computer Science works for Biology


HOW TO USE GENBANK


Example: Protein consensus pattern to DNA RegEx
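The transcript keeps only the title of this example, so the following is a sketch under assumptions rather than the conversion shown on the slide. It assumes a PROSITE-like protein pattern syntax (residues joined by "-", "x" for any residue, "[KR]" for a choice, "(n)" for a repeat count) and the standard genetic code, and it expands every residue into an alternation of its codons. The function names and the sample pattern are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")

# Map each amino acid letter to the list of codons that encode it.
CODONS = {}
for bases, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS.setdefault(aa, []).append("".join(bases))

# One pattern element: x, a single residue, or a bracketed set, plus optional (n).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?")


def residue_regex(token):
    """DNA regex fragment matching one codon for the given pattern token."""
    if token == "x":
        return "(?:[ACGT]{3})"
    codons = []
    for aa in token.strip("[]"):
        if aa not in CODONS:
            raise ValueError("unknown amino acid: %r" % aa)
        codons.extend(CODONS[aa])
    return "(?:" + "|".join(sorted(codons)) + ")"


def protein_pattern_to_dna_regex(pattern):
    """Translate a PROSITE-like protein pattern into a regex over DNA."""
    parts = []
    for element in pattern.split("-"):
        m = ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError("unsupported pattern element: %r" % element)
        token, count = m.groups()
        piece = residue_regex(token)
        if count:
            piece += "{%s}" % count
        parts.append(piece)
    return "".join(parts)


if __name__ == "__main__":
    dna_regex = protein_pattern_to_dna_regex("M-x(2)-[KR]-W")
    print(dna_regex)
    print(re.search(dna_regex, "CCATGGCTGATAAGTGGTT"))
```

A regex built this way matches in any reading frame but only on the forward strand; searching the reverse complement as well would be a separate step.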


PROBLEMS

More and more DNA sequence data
Not enough memory to hold the DNA sequences
What if we don’t have a supercomputer with lots of processors?
Can we find the results on normal computers with the same performance as parallel computation?
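One common way around the memory problem is to stream the genome file instead of loading it whole. The sketch below is an illustration only, not the approach taken in the project: it assumes a FASTA file and an exact (non-degenerate) motif, and counts occurrences while holding just one line plus a short carry-over in memory. The function name is hypothetical.

```python
import io


def count_motif_streaming(handle, motif):
    """Count (possibly overlapping) exact occurrences of motif in a FASTA stream.

    Only the current line and the last len(motif) - 1 bases are kept in memory.
    """
    if not motif:
        raise ValueError("motif must not be empty")
    motif = motif.upper()
    k = len(motif)
    tail = ""   # last k - 1 bases seen so far in the current record
    count = 0
    for line in handle:
        if line.startswith(">"):
            tail = ""          # new record: never match across two records
            continue
        window = tail + line.strip().upper()
        start = window.find(motif)
        while start != -1:
            count += 1
            start = window.find(motif, start + 1)
        # The tail is shorter than the motif, so every match found in the next
        # window must include new bases and is therefore counted only once.
        tail = window[-(k - 1):] if k > 1 else ""
    return count


if __name__ == "__main__":
    toy = io.StringIO(">toy\nACGTAC\nGTACGT\n")  # toy data, not from the slides
    print(count_motif_streaming(toy, "ACGT"))
```

For a real file the handle would come from open(path) in text mode; the same carry-over idea applies when each process of a parallel program is given its own slice of the file.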


60 MB GENOME FILE: CHROMO20.FA


250 MB GENOME FILE: CHROMO1.FA


1072 MB GENOME FILE: CHROMO1-5.FA


THE RESULTS OF OUR PROJECT

[Plot: Time (Secs.) versus #Processes (1 to 9), for N = 55, T = 20, L = 9]


FUTURE (SOLUTION FOR PROBLEMS)

A far more practical and effective approach is to use parallel clusters of workstations.

Cloud Computing


CONCLUSION

Details about Parallel Motif Finding Algorithms for Bioinformatics
Quick Review About Last Presentation
More Details
    GenBank
        The Beginning Of Bioinformatics
        How To Use GenBank
        Problems
    Good Performance In Bioinformatics
        The Results From Real DNA
        The Results Of Our Project
        Future (Solution for problems)

Download
Transformation
Find the Pattern
Bioinformatics gives the information biologists need


THANK YOU FOR LISTENING


NOTHING IS IMPOSSIBLE


QUESTIONS?
