WMU CS 6260 Parallel Computations II Spring 2013 Presentation #1 about Semester Project Feb/18/2013 Professor: Dr. de Doncker Name: Sandino Vargas Xuanyu Hu Implementation and Analysis of Parallel Motif Finding Algorithms for Bioinformatics


WMU CS 6260 Parallel Computations II

Spring 2013

Presentation #1 about Semester Project

Feb/18/2013

Professor: Dr. de Doncker

Names: Sandino Vargas, Xuanyu Hu

Implementation and Analysis of Parallel Motif Finding Algorithms for Bioinformatics

Page 2: Outline

Our Team for This Semester Project
Project Topic Background: Bioinformatics
Parallel Algorithms in Bioinformatics
Problem We Want to Solve
Solution in a Sequential Way
Demo of Sequential Program
How to Parallelize It? (paper)
Conclusion
Reference
Questions?

Page 3: Our Team for the Semester Project

Members of our team:
  Sandino Vargas
  Xuanyu Hu

We have taken the same courses:
  WMU_CS6030_Bioinformatics (Summer II 2012)
  WMU_CS5260_Parallel Computation I
  WMU_CS6260_Parallel Computation II (Spring 2013)

Our professor, Dr. de Doncker, will be teaching the interesting course CS6030 "Biomedical Informatics" again next semester, Summer I (2013).

Page 4: Project Topic Background: Bioinformatics

Bioinformatics is an interdisciplinary field that develops and improves upon methods for storing, retrieving, organizing and analyzing biological data.

A major activity in bioinformatics is to develop software tools to generate useful biological knowledge.

Page 5: Subjects in Bioinformatics

Mapping DNA
Sequencing DNA
Comparing Sequences
Predicting Genes
Finding Signals
Identifying Proteins
Repeat Analysis
DNA Arrays
Genome Rearrangements
Molecular Evolution

Page 6: Sequencing DNA

After reading the materials and searching the internet, we both agreed that DNA sequencing is the best subject for the implementation and analysis of parallel computations.

There will be 3 major questions:
  What is DNA sequencing?
  Why do we need DNA sequencing?
  How is DNA sequenced?

Page 7: What Is DNA Sequencing?

DNA sequencing is the process of determining the precise order of nucleotides within a DNA molecule. It includes any method or technology that is used to determine the order of the four bases (adenine, guanine, cytosine, and thymine) in a strand of DNA.

The advent of rapid DNA sequencing methods has greatly accelerated biological and medical research and discovery.

Page 8: One Thing We Need to Know

This is a picture of a DNA model.
DNA has 2 strands and is made of 4 kinds of pairs.
A-T != T-A

A – T
T – A
G – C
C – G

Page 9: Why Do We Need DNA Sequencing?

DNA sequencing may be used to determine the sequence of individual genes, larger genetic regions, full chromosomes or entire genomes.

Depending on the methods used, sequencing may provide the order of nucleotides in DNA or RNA isolated from cells of animals, plants, bacteria, or any other source of genetic information.

Page 10: Why Do We Need DNA Sequencing?

The resulting sequences may be used by researchers in molecular biology or genetics to further scientific progress or may be used by medical personnel to make treatment decisions or aid in genetic counseling.

Function = DNA Pattern
DNA Function:
  Estimate the function of a new kind of virus
  Kill the virus's function by starvation

Page 11: Another Thing We Need to Know

In genetics, a mutation is a change of the nucleotide sequence of the genome of an organism, virus, or extra-chromosomal genetic element.

Mutations may or may not produce changes in the observable characteristics of an organism.

Mutations play a part in both normal and abnormal biological processes, including evolution, cancer, and the development of the immune system.

Page 12: Gene Mutation

Page 13: How Is DNA Sequenced?

There are many methods:
  Maxam-Gilbert sequencing
  Chain-termination methods
  Shotgun sequencing
  Bridge PCR
  Polony sequencing
  454 Pyrosequencing
  Ion semiconductor sequencing
  DNA nanoball sequencing

Page 14: Parallel Computing in Bioinformatics

Page 15:

Web site URL: http://www.gpugrid.net/

Page 16: Problem We Want to Solve

We have several DNA strands, and some of them might have mutations.

They have the same function, or they come from the same species.

We want to find the DNA pattern that gives them the same function.

DNA is made of 2 strands; each strand is made of A, T, G, and C. If we know one of the 2 strands, we can easily determine the other.
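The point that one strand determines the other can be illustrated with a short Python sketch. It assumes standard Watson-Crick pairing (A with T, G with C); the function names `complement` and `reverse_complement` are illustrative and do not come from the project code.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the complement reversed, i.e. the other strand read in its own direction."""
    return complement(strand)[::-1]


print(complement("ATGCAACT"))          # TACGTTGA
print(reverse_complement("ATGCAACT"))  # AGTTGCAT
```

Because the second strand carries no extra information, a motif search only needs to store one strand per sequence.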

Page 17: Examples

ATGCAACT is the DNA pattern we want to find.
Lowercase letters mark mutations.
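One way to make "pattern with mutations" concrete is to count mismatches (Hamming distance) between the pattern and each window of a sequence. The sketch below is a minimal illustration under that assumption; the sample sequence is hypothetical and is not taken from the slide, and the names `hamming` and `best_match` are illustrative.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def best_match(pattern: str, sequence: str) -> tuple[int, int]:
    """Return (start, mismatches) for the window of sequence closest to pattern."""
    k = len(pattern)
    candidates = (
        (i, hamming(pattern, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    )
    return min(candidates, key=lambda pair: pair[1])


pattern = "ATGCAACT"
# Hypothetical sequence; the lowercase "t" marks a mutated base.
sequence = "CCGTATGCtACTGGA"
print(best_match(pattern, sequence.upper()))  # (4, 1)
```

The window starting at index 4 differs from the pattern in exactly one position, the mutated base.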

Page 18: Solution in a Sequential Way

Greedy Algorithm
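The slide names the greedy approach without giving details. The sketch below shows one common greedy variant, offered as an assumption rather than as the demo program's method: seed with each k-mer of the first sequence, then for every following sequence keep the window that maximizes a consensus score. The names `score` and `greedy_motif_search` and the sample sequences are illustrative.

```python
from collections import Counter


def score(motifs):
    """Consensus score: for each column, count the most common base, then sum."""
    return sum(Counter(column).most_common(1)[0][1] for column in zip(*motifs))


def greedy_motif_search(seqs, k):
    """Seed with each k-mer of the first sequence and extend greedily."""
    best = None
    for i in range(len(seqs[0]) - k + 1):
        motifs = [seqs[0][i:i + k]]
        for seq in seqs[1:]:
            windows = (seq[j:j + k] for j in range(len(seq) - k + 1))
            # Keep the window that best agrees with the motifs chosen so far.
            motifs.append(max(windows, key=lambda w: score(motifs + [w])))
        if best is None or score(motifs) > score(best):
            best = motifs
    return best


# Hypothetical input: each sequence contains ATGCAACT unmutated.
seqs = ["TTATGCAACTGG", "CATGCAACTTAC", "GGGATGCAACTA"]
print(greedy_motif_search(seqs, 8))
```

A greedy search is fast because it never revisits a choice, but for the same reason it may miss the best overall set of motifs.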

Page 19: Solution in a Sequential Way

Brute Force
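A brute-force search can be sketched as trying every combination of start positions, one per sequence, and keeping the combination with the best consensus score. This is a minimal illustration under that assumption, not the demo program itself; `score`, `brute_force_motif_search`, and the sample sequences are illustrative.

```python
from collections import Counter
from itertools import product


def score(motifs):
    """Consensus score: for each column, count the most common base, then sum."""
    return sum(Counter(column).most_common(1)[0][1] for column in zip(*motifs))


def brute_force_motif_search(seqs, k):
    """Score every combination of start positions and return the best one."""
    ranges = [range(len(s) - k + 1) for s in seqs]
    best_score, best_starts = -1, None
    for starts in product(*ranges):
        motifs = [s[i:i + k] for s, i in zip(seqs, starts)]
        current = score(motifs)
        if current > best_score:
            best_score, best_starts = current, starts
    return best_starts, best_score


# Hypothetical input: each sequence contains ATGCAACT unmutated.
seqs = ["TTATGCAACTGG", "CATGCAACTTAC", "GGGATGCAACTA"]
print(brute_force_motif_search(seqs, 8))  # ((2, 1, 3), 24)
```

For t sequences of length n this examines (n - k + 1)^t combinations, which grows quickly. Each combination is scored independently of the others, so the loop is a natural candidate for splitting across processes.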

Page 20: Solution in a Sequential Way

Branch and Bound
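Branch and bound can be sketched on the median-string formulation of the problem: grow a candidate pattern one base at a time, and abandon a prefix as soon as its total distance to the sequences is already no better than the best full pattern found. This formulation is an assumption, since the slide gives no details; `total_distance`, `bnb_median_string`, and the sample sequences are illustrative.

```python
def total_distance(word, seqs):
    """Sum, over all sequences, of the fewest mismatches between word and any window."""
    m = len(word)
    return sum(
        min(
            sum(a != b for a, b in zip(word, s[i:i + m]))
            for i in range(len(s) - m + 1)
        )
        for s in seqs
    )


def bnb_median_string(seqs, k, alphabet="ACGT"):
    """Find a k-mer with minimum total distance, pruning hopeless prefixes."""
    best_word, best_dist = None, float("inf")

    def extend(prefix):
        nonlocal best_word, best_dist
        dist = total_distance(prefix, seqs)
        if dist >= best_dist:
            return  # Bound: extending a prefix can never lower its distance.
        if len(prefix) == k:
            best_word, best_dist = prefix, dist
            return
        for base in alphabet:  # Branch on the next base.
            extend(prefix + base)

    extend("")
    return best_word, best_dist


# Hypothetical input: the second sequence carries one mutated base (T for A).
seqs = ["TTATGCAACTGG", "CATGCTACTTAC", "GGGATGCAACTA"]
print(bnb_median_string(seqs, 8))  # ('ATGCAACT', 1)
```

The result is the same as an exhaustive search over all 4^k patterns, but whole subtrees are skipped once a good pattern is known.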

Page 21: Demo of Sequential Program

Page 22: Conclusion

Some materials about my semester project:
  Bioinformatics
  Sequencing DNA
  Parallel Algorithms in Bioinformatics
  Problem We Want to Solve
  Solution in a Sequential Way
  How to Parallelize It (paper)

Page 23: Questions?

I would be very happy to answer any questions you have.

Page 24: Thank You