Workshop in Bioinformatics 2010 Class # 75321 Class 8 March 2010

Post on 21-Dec-2015



(Molecular) Bioinformatics

Development (Computational Biology)

New algorithms and statistics to assess biological information

Tools for accessing and managing different types of information (databases, models, maps, ...)

User view (applied Bioinformatics)

Analysis: interpretation of various types of data (nt, aa, domains, RNAs, signatures, 3D-structures)

Statistical methods

Applying prediction methods (off the shelf)

Applying tools (web, programs…)

What do you mean by “data”…

Raw (primary)

Processed (analyzed)

Integrated (systems)

What do you mean by “raw data”…

Raw (primary) data: sequences

RNA (many types), DNA, genome

Variations: SNPs, mutations, copy number variations (CNV)

What do you mean by “raw data”…

Rate - Raw (primary) Data

HW1: Identify 2 areas with a similar (?) trend

The level of resolution: the ‘basic unit’ (example: DNA)

Nucleotide to genome level

NOTE: The technology does not always identify individual nucleotides in the DNA

HW2: provide an example and list the technology

The level of resolution: the ‘basic unit’ (example: DNA)

Nucleotide to Genome level

Technology: DNA / RNA SEQUENCING

Nucleotide level

DNA Sequencing

“Classical” (Sanger)

[Figure: Sanger chain-termination sequencing. A 5’ primer is extended along the 3’ template to synthesize G C A T G C. Four reactions each contain dATP, dCTP, dGTP and dTTP plus one dideoxynucleotide (ddATP, ddCTP, ddTTP or ddGTP). Terminated products: ddG, GddC, GCddA, GCAddT, GCATddG, GCATGddC.]
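The chain-termination idea in the figure can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names `sanger_fragments` and `read_gel` are hypothetical, every possible termination product is assumed to form exactly once, and fragment size is taken as the only thing the gel measures.

```python
def sanger_fragments(new_strand):
    """Group chain-terminated products by the ddNTP reaction that made them.

    Each position i of the synthesized strand yields one product: the first
    i bases followed by the dideoxy form of base i, which stops extension.
    """
    reactions = {"A": [], "C": [], "G": [], "T": []}
    for i, base in enumerate(new_strand):
        reactions[base].append(new_strand[:i] + "dd" + base)
    return reactions


def read_gel(reactions):
    """Recover the sequence by reading the lanes from shortest to longest."""
    bands = []
    for base, fragments in reactions.items():
        for fragment in fragments:
            n_nucleotides = len(fragment) - 2  # do not count the "dd" marker
            bands.append((n_nucleotides, base))
    return "".join(base for _, base in sorted(bands))


if __name__ == "__main__":
    lanes = sanger_fragments("GCATGC")
    for base, fragments in lanes.items():
        print("dd" + base + "TP reaction:", fragments)
    print("read from the gel:", read_gel(lanes))
```

Sorting the bands by length stands in for electrophoresis: the shortest product gives the first base, the longest gives the last.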

The level of resolution: the ‘basic unit’ (example: DNA)

TECHNOLOGY definitions:

Nucleotide (1)

Run (400-800): ABI, Sanger fluorescence capillary

Reads (35-400): SOLEXA and 454 (WGS), but millions in parallel

ABI 3700: 96 x 700 bases
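A quick back-of-the-envelope comparison of the numbers above, as a sketch. The capillary figures (96 x 700) and the read-length range (35-400) come from the slide; the value used for "millions in parallel" is an assumption (one million), chosen only to make the arithmetic concrete.

```python
CAPILLARIES = 96         # from the slide: ABI 3700, 96 x 700 bases
BASES_PER_CAPILLARY = 700

# Assumption: the slide only says "millions"; one million is a placeholder.
ASSUMED_PARALLEL_READS = 1_000_000


def bases_per_batch(n_reads, read_length):
    """Total bases produced when n_reads reads of read_length run in parallel."""
    return n_reads * read_length


sanger = bases_per_batch(CAPILLARIES, BASES_PER_CAPILLARY)
short_low = bases_per_batch(ASSUMED_PARALLEL_READS, 35)
short_high = bases_per_batch(ASSUMED_PARALLEL_READS, 400)

if __name__ == "__main__":
    print("capillary batch:", sanger, "bases")
    print("short reads, low end:", short_low, "bases")
    print("short reads, high end:", short_high, "bases")
    print("ratio at the low end:", round(short_low / sanger))
```

Even with the shortest reads, the parallelism outweighs the shorter read length by a wide margin under this assumption.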

Scale: capillary tech

RUNs

New technology

“454” technology

(not in this class)

[Figure: gel readout. Lanes G, C, T, A between the poles marked + and −; bands ordered from short to long read G C A T G C.]

The level of resolution: the ‘basic unit’ (example: DNA)

FUNCTIONAL / Assembly definition (more)

SNP (1)

Indel (1-1000s); human genome average: 5 bp

EST (300-700)

Gene (1,000 in bacteria; 100,000 in human; huge range)

Contig: 500,000, huge range / BacTig: >10^6

Chromosome: bacteria 4 x 10^6; human 50-250 x 10^6

Genome: HUGE range, from virus (5 x 10^3) to frog (5 x 10^10)
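To get a feel for these scales, the sizes listed above can be set against a single sequencing run. The sketch below uses the slide's values; the 700-base read length is taken from the ABI 3700 figure, and the helper names are hypothetical. It counts end-to-end reads with no overlap, which is a lower bound, not a realistic sequencing plan.

```python
import math

# Sizes in nucleotides, as listed on the slide.
SIZES = {
    "bacterial gene": 1_000,
    "human gene": 100_000,
    "bacterial chromosome": 4 * 10**6,
    "largest human chromosome": 250 * 10**6,
    "virus genome": 5 * 10**3,
    "frog genome": 5 * 10**10,
}

READ_LENGTH = 700  # assumption: one capillary run, per the ABI 3700 figure


def reads_to_span(size, read_length=READ_LENGTH):
    """Minimum number of non-overlapping reads needed to cover `size` once."""
    return -(-size // read_length)  # integer ceiling division


def orders_of_magnitude(small, large):
    """How many powers of ten separate two sizes."""
    return math.log10(large / small)


if __name__ == "__main__":
    for name, size in SIZES.items():
        print(name, size, "nt ->", reads_to_span(size), "reads")
    span = orders_of_magnitude(SIZES["virus genome"], SIZES["frog genome"])
    print("virus to frog:", round(span), "orders of magnitude")
```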

Shotgun Sequencing

Isolate chromosome

Shear DNA into fragments

Clone into sequencing vectors: phagemid, BAC

Sequence

Sequence chromatogram

Send to computer

Assembled sequence

“Classical” sequencing, 1995-2006
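The "send to computer, get an assembled sequence" step can be illustrated with a toy greedy overlap assembler. This is a minimal sketch, not how production assemblers work: the reads are error-free, the "shearing" uses fixed windows instead of random breaks, the sequence is a made-up example, and the names `shear`, `overlap` and `greedy_assemble` are hypothetical.

```python
def shear(sequence, read_length=10, step=5):
    """Cut a sequence into overlapping windows (real shearing is random)."""
    last_start = len(sequence) - read_length
    return [sequence[i:i + read_length] for i in range(0, last_start + 1, step)]


def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is a prefix of b, or 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left: the remaining reads are separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for n, r in enumerate(reads) if n not in (best_i, best_j)]
        reads.append(merged)
    return reads


if __name__ == "__main__":
    toy = "ATGGCGTACGTTAGCCTAGGATCCA"  # made-up sequence, not real data
    fragments = shear(toy)
    print(fragments)
    print(greedy_assemble(fragments))
```

When no overlap of at least `min_len` remains, the function returns several pieces, which corresponds to the contigs in the size list above.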

Deep Sequencing

New fast and cheap methods for sequencing

Replacing ‘classical’ Sanger sequencing?

(covered if time allows…)

The level of resolution: the ‘basic unit’ (example: DNA)

Is the genome the “largest unit”?

2008:

Metagenomics

The BIOME project: a community of genomes in the human body

SEARCH for numbers…

How many nucleotides are in human chromosome 7?

1. Search “Google-like”

2. Search “PubMed-like”

3. Your way

HW3: the value, the source
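One possible "your way" is to count the nucleotides directly once a sequence file has been downloaded. The sketch below assumes the sequence is available as FASTA text; the function name `sequence_lengths` and the toy record are hypothetical, and no real chromosome data or value is included here.

```python
def sequence_lengths(fasta_text):
    """Return {record name: number of sequence characters} for FASTA text."""
    lengths = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record name is the first word of the header line.
            name = (line[1:].split() or ["unnamed"])[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


# Hypothetical toy record, standing in for a downloaded chromosome file.
TOY_FASTA = """>toy_chromosome hypothetical example record
GCATGCGCAT
GCATGC
"""

if __name__ == "__main__":
    print(sequence_lengths(TOY_FASTA))
```

Note that this counts every character in the sequence lines, including any N placeholders for unsequenced gaps, so the source and version of the file still matter for the answer.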

The SOLUTION

Dedicated resources (= databases) focused on the genome ‘building blocks’

1. NCBI

2. EBI

3. more…

Centralized archive for databases, tools, protocols, books, drugs, education, etc.