Institute of Biomedical Sciences (ICB) Malaria Nucleus Institute of Mathematics and Statistics (IME) BIOINFO-USP Nucleus Latin American Course on Bioinformatics for Tropical Disease Research. University of São Paulo, Brazil Feb 18th – March 2nd - 2002

Institute of Biomedical Sciences (ICB)

Malaria Nucleus

Institute of Mathematics and Statistics (IME)

BIOINFO-USP Nucleus

Latin American Course on Bioinformatics for Tropical Disease Research

University of São Paulo, Brazil

Feb 18th – March 2nd, 2002

Latin American Course on Bioinformatics for Tropical Disease Research.

University of São Paulo, Brazil

Workshop Computer Room

DB lab (IME)

Conference Room (ICB)

Latin American Course on Bioinformatics for Tropical Disease Research.

University of São Paulo, Brazil

Philosophy

Join elements of different natures

Tropical Diseases + Molecular Biology + Quantitative Techniques

Biological Sciences + Mathematical Sciences

Organizers

Biological Sciences: 3 (Hernando del Portillo, Arthur Gruber, Bianca Zingales)

Mathematical Sciences: 2 (Alan Durham, Junior Barrera)

Students

10 Biological Sciences (e.g., doctors, biochemists, molecular biologists)

5 Mathematical Sciences (e.g., physicists, computer scientists, engineers, statisticians)

Instructors and Invited Speakers

17 Biological Sciences (e.g., doctors, biochemists, molecular biologists)

11 Mathematical Sciences (e.g., physicists, computer scientists, engineers, statisticians)

BIOINFORMATICS

Theory + Practice

Module 1: Perl + Unix + DB

Alan Durham

Staff

• Alan Durham (Perl)
• Marco Dimas Gubitoso (Linux)
• Joao Eduardo Ferreira (Databases)
• Marcio, Luciano (DB Workshop)

Issues addressed

most biologists have experience using the Microsoft environment but not a Unix one

there are many similarities between the Unix (Linux) shell and the DOS command line

Perl tutorials implicitly assume previous programming knowledge

databases are essential for handling lots of data but very hard to design

time limitation

Considerations

since it is not viable to teach programming in the time available, we need to convey a basic understanding of the programming activity

it is not possible to teach database design, but we can make students understand database structure

teaching Linux should concentrate on its similarities with DOS and on tools

Module objectives

Unix:
• teach a set of Linux commands that can enhance researcher productivity
• teach use of the command line for calling programs
• give the basics of Emacs

Perl:
• teach basic Perl commands
• teach use of Perl for elementary scripts
• enable students to develop 5-10 line programs
• give a basis for further studies using tutorials

Databases:
• understand the need for databases
• teach a basic understanding of DB structure
• enable students to make small SQL queries
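
To give a sense of the scale meant by "small SQL queries", here is a minimal sketch. It is not course material: the table `genes`, its columns, and the rows are hypothetical, and it uses Python's built-in sqlite3 module rather than the database system used in the workshop, which the slides do not name.

```python
import sqlite3

# Hypothetical rows, invented purely for illustration.
ROWS = [
    ("Plasmodium vivax", "geneA", 1200),
    ("Plasmodium vivax", "geneB", 450),
    ("Trypanosoma cruzi", "geneC", 980),
]


def long_genes(min_length):
    """Return (organism, name) pairs for genes at least min_length long."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(
            "CREATE TABLE genes (organism TEXT, name TEXT, length INTEGER)"
        )
        conn.executemany("INSERT INTO genes VALUES (?, ?, ?)", ROWS)
        # The "small query": one table, one filter, one sort key.
        cursor = conn.execute(
            "SELECT organism, name FROM genes WHERE length >= ? ORDER BY name",
            (min_length,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for organism, name in long_genes(900):
        print(organism, name)
```

Understanding which table and columns such a query touches is the "DB structure" objective; writing the SELECT itself is the query objective.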

Teaching approach

Style:
• class/workshop: students repeated the instructors' demonstrations and did extra exercises
• instructors + floaters

Timings:
• Unix: 4.5 hours
• Perl: 9.5 hours
• Databases: 5 hours

Quick Analysis

about 40% of students were in the projected profile (except in Databases)

approach proved effective

floaters proved to be essential

but... ideal timing should be increased to 8+ mornings:

• Unix: 7+ hours (2 mornings)
• Perl: 14+ hours (4 mornings)
• Databases: 7+ hours (2 mornings)

timeframe (2 weeks) restricts results
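
The size of the suggested increase can be checked with a few lines of arithmetic. This sketch uses only the hours quoted on the slides (delivered: 4.5, 9.5 and 5 hours; suggested minimums: 7, 14 and 7 hours). Treating each "N+" figure as exactly N is an assumption made for the calculation.

```python
# Hours per topic as given on the slides. The "+" on the suggested figures
# is dropped here, so the suggested totals are lower bounds.
DELIVERED_HOURS = {"Unix": 4.5, "Perl": 9.5, "Databases": 5.0}
SUGGESTED_HOURS = {"Unix": 7.0, "Perl": 14.0, "Databases": 7.0}
SUGGESTED_MORNINGS = {"Unix": 2, "Perl": 4, "Databases": 2}


def totals():
    """Return (delivered hours, suggested hours, suggested mornings)."""
    delivered = sum(DELIVERED_HOURS.values())
    suggested = sum(SUGGESTED_HOURS.values())
    mornings = sum(SUGGESTED_MORNINGS.values())
    return delivered, suggested, mornings


if __name__ == "__main__":
    delivered, suggested, mornings = totals()
    print("delivered hours:", delivered)
    print("suggested hours (at least):", suggested)
    print("suggested mornings:", mornings)
    print("hours per morning:", suggested / mornings)
    print("extra hours needed (at least):", suggested - delivered)
```

Under that assumption the module grows from 19 to at least 28 hours, spread over 8 mornings of 3.5 hours each.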

Building a Bioinformatics Linux Station

Alan Durham

Staff

• Alan Durham
• Chuong Huynh

Considerations

students are subjected to a great amount of information in 2 weeks

very little time to think about the subject from a global perspective

the notion of what should be used on the Internet and what should be kept local is important

most people will eventually need to install some software

Module objectives

to give students an understanding of how to set up an environment for their research

to think about bioinformatics in the context of their own research

to understand the advantages and difficulties they will face when going back home

to foster interaction between the students in general, and between students of different backgrounds in particular

Teaching approach

Style:
• students separated by interest in particular TDR organisms
• groups of 3 people, one with more informatics background
• each group asked to design an Internet page for research on a chosen topic in their organism

Timing: 6 hours (2-hour sessions)

Quick Analysis

pages: some groups were distracted by the Internet page issue

little time available

project needs a prior discussion session

Analysis Tools

Arthur Gruber

Latin American Course on Bioinformatics for Tropical Disease Research

São Paulo – February 17th to March 1st 2002

Faculty of Veterinary Medicine and Zootechny

University of São Paulo

BRAZIL

AG-FMVZ-USP

Teaching Strategy

Theoretical introductory lectures - fundamentals of the different tools available for sequence analysis

Workshops - practical sessions with real-life problems

Topic order - lectures and workshops followed the same order that is usually employed for sequence analysis

Teaching Strategy

• DNA assembly and finishing
• Similarity searching/databases
• Multiple sequence alignment
• Phylogenetic reconstruction
• Annotation, gene prediction
• Genome comparison

DNA assembly and finishing - Arthur Gruber
• Phred/Phrap/Consed package, Staden Package, CAP3

Workshop - Arthur Gruber and Jessica Kissinger
• Tutorial on Phred/Phrap/Consed using demo files, cosmids and viral genomes

Fundamentals of sequence analysis - Cathal Seoighe
• Fasta, Blast

Workshop - Cathal Seoighe and Jessica Kissinger
• Tutorial on similarity searching tools: Search33 (Smith-Waterman algorithm), Fasta and Blast
• Tutorial on PlasmoDB
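
For readers unfamiliar with the Smith-Waterman algorithm named above, the following is a minimal sketch of its scoring recurrence for local alignment. It is an illustration only, not the implementation used by any of the tools in the tutorial: the function name and the default scores (match +2, mismatch -1, linear gap -1) are assumptions chosen for the example, and it returns only the best score, not the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of a and b (linear gap penalty)."""
    # Only the previous row of the dynamic-programming matrix is needed
    # when the alignment itself is not reconstructed.
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                              # start a new local alignment
                previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous[j] + gap,               # gap in b
                current[j - 1] + gap,            # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACACACTA", "AGCACACA"))
```

The floor at zero is what makes the alignment local: a poorly matching prefix never drags down the score of a good region that follows it.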

Multiple Sequence Alignment - Cathal Seoighe
• Fundamentals of sequence alignment, ClustalX

Workshop - Cathal Seoighe
• Tutorial on ClustalX

Phylogenetic Reconstruction - Cathal Seoighe
• Parsimony, Maximum likelihood, etc.

Workshop - Cathal Seoighe
• Tutorial on Phylogenetic Reconstruction using ClustalX/Phylip

Artemis and gene prediction software - Neil Hall
• Genome annotation, public databases, Pfam, Artemis

Workshop - Neil Hall
• Tutorial on Artemis I

Gene Finding - Neil Hall
• Gene finding, protein domains, annotation pipeline

Workshop - Neil Hall
• Tutorial on Artemis II

Comparative Genomics - Neil Hall
• Artemis Comparison Tool

Workshop - Neil Hall
• Tutorial on Artemis Comparison Tool

Applications - All

Country      Applications
Argentina    2
Brazil       24
Chile        2
Colombia     5
Cuba         5
Mexico       3
Peru         4
Venezuela    2
Total        47

Selected Participants

Tropical Disease Scientists: 10. Computer Scientists: 5.

[Pie chart: selected participants by country. Legend: Brazil, Mexico, Venezuela, Colombia, Peru, Argentina. Slice labels: 59%, 7%, 7%, 13%, 7%, 7%.]

[Bar chart "linux, perl, db". Categories: Orga, Styl, Audi, Time, Lab, Know. Vertical axis from 0 to 6.]

TDR/USP Bioinformatics Course

Group Picture