from dna to genomics: the rise of bioinformatics - catherine abbott

Post on 16-Apr-2017


From DNA to genomics: the rise of bioinformatics

Catherine Abbott
Cathy.abbott@flinders.edu.au
Monday 2 December
BioInfoSummer 2013


NB. Most images in this presentation are via Google Images.

Outline of talk

• Introduce bioinformatics
• Very, very basic introduction to
  – DNA and molecular biology
  – Genes
  – Genomes
• Human Genome Project
• Genomics
• The challenges and the future

What is Bioinformatics?

• Bioinformatics is the field of science in which biology meets informatics: computer science, information technology, mathematics, statistics and other sciences

Central paradigm of Molecular Biology


DNA → RNA → Protein → Phenotype

DNA bases: Guanine (G), Adenine (A), Thymine (T), Cytosine (C)

RNA bases: Guanine (G), Adenine (A), Uracil (U), Cytosine (C)

Proteins: 20 amino acids, for example:

G Glycine Gly
P Proline Pro
A Alanine Ala
V Valine Val
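As a rough illustration of the DNA and RNA alphabets on this slide, the sketch below rewrites a DNA coding-strand sequence as RNA by swapping T for U. The function name `transcribe` and the example sequence are made up for illustration, and real transcription copies the template strand; this shortcut only works because the resulting RNA reads the same as the coding strand.

```python
# Hypothetical helper: rewrite a DNA coding-strand sequence as RNA.
def transcribe(dna: str) -> str:
    """Return the RNA sequence that matches the DNA coding strand (T becomes U)."""
    dna = dna.upper()
    unknown = set(dna) - set("GATC")
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    return dna.replace("T", "U")


# One-letter and three-letter codes for the four amino acids listed on the slide.
AMINO_ACID_CODES = {
    "G": ("Glycine", "Gly"),
    "P": ("Proline", "Pro"),
    "A": ("Alanine", "Ala"),
    "V": ("Valine", "Val"),
}

print(transcribe("GATTACA"))  # GAUUACA
```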


What is a gene?

• The gene is the basic physical unit of inheritance

http://www.bbc.co.uk/schools/gcsebitesize/science/add_aqa_pre_2011/celldivision/celldivision1.shtml


DNA sequences: three-base codons and stop codons

http://www.genome.gov/EdKit/bio2b.html
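To make the idea of three-base codons and stop codons concrete, here is a minimal sketch that reads a DNA sequence three bases at a time and stops at the first stop codon. It assumes the standard genetic code, written in the compact base order T, C, A, G; the name `translate` and the example sequence are invented for illustration.

```python
from itertools import product

# Standard genetic code, listed for codons in the order TTT, TTC, TTA, TTG, TCT, ...
# "*" marks the three stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base, three bases at a time, until a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon: translation ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGGTCCTGCTGTTTAA"))  # MGPAV
```

The example codes for Met, Gly, Pro, Ala and Val, followed by the stop codon TAA.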

Reading frames


http://www.genome.gov/EdKit/bio2e.html
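A small sketch of what reading frames mean in practice: the same sequence can be split into codons starting at base 1, 2 or 3, and the same applies to the opposite strand, giving six frames in total. The function names and the example sequence are hypothetical, and the six-frame view is an assumption about what the slide's figure covers.

```python
def codons_in_frame(dna: str, offset: int) -> list[str]:
    """Split a sequence into complete codons, starting at the given offset (0, 1 or 2)."""
    return [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]


def reverse_complement(dna: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frames(dna: str) -> dict[str, list[str]]:
    """Return codons for three forward (+) and three reverse (-) reading frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = codons_in_frame(seq, offset)
    return frames


for name, codons in six_frames("ATGCCCTAG").items():
    print(name, " ".join(codons))
```

Only frame +1 of this example reads as start codon, proline codon, stop codon; the other frames give unrelated codons, which is why choosing the correct frame matters.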

Exons and Introns


http://www.genome.gov/EdKit/bio2i.html
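The exon and intron idea can be sketched as a simple string operation: keep the exon segments of a gene and drop the intron between them. The toy gene, the coordinates and the function name `splice` below are all invented for illustration; real splicing is carried out on RNA by the cell's splicing machinery.

```python
def splice(gene: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments, given as 0-based (start, end) pairs with end excluded."""
    return "".join(gene[start:end] for start, end in sorted(exons))


# Toy gene: exon 1, then an intron starting with GT and ending with AG, then exon 2.
exon1, intron, exon2 = "ATGGCC", "GTAAGTTTTCAG", "GATTAA"
gene = exon1 + intron + exon2
exons = [(0, 6), (18, 24)]  # hypothetical exon coordinates

print(splice(gene, exons))  # ATGGCCGATTAA
```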

1977: Sanger Sequencing

• Used chemically altered "dideoxy" bases to terminate newly synthesized DNA fragments at specific bases (A, C, T, or G).

• Sanger was awarded two Nobel Prizes, in 1958 and 1980 (the latter shared with Gilbert and Berg)
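The chain-termination idea can be sketched with a highly idealized simulation: one reaction per dideoxy base, each producing fragments that end wherever that base occurs, and a "gel" that orders all fragments by length. The function names are hypothetical, and the model assumes every possible termination is observed exactly once, which real experiments only approximate.

```python
def sanger_lanes(synthesized: str) -> dict[str, list[int]]:
    """For each dideoxy base, list the fragment lengths that terminate at that base."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized.upper(), start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read the sequence by sorting all fragments from shortest to longest."""
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[length] for length in sorted(base_at_length))


lanes = sanger_lanes("GATTACA")
print(lanes)            # {'A': [2, 5, 7], 'C': [6], 'G': [1], 'T': [3, 4]}
print(read_gel(lanes))  # GATTACA
```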


Evolution of Sequencing Technology

1865: Mendel shows inheritance in peas
1953: Watson and Crick describe the structure of DNA
1977: Era of sequencing begins
1980: Term "shotgun sequencing" coined
1982: GenBank founded
1983: Kary Mullis and colleagues develop PCR
1986: First commercial ABI sequencer launched
1990: BLAST algorithm developed at NCBI
1991: EST strategy developed

Evolution of Sequencing Technology

1995: Cycle sequencing by Amersham; Applied Biosystems releases capillary electrophoresis system Prism 310; output 5,000 bases per day
1997: MegaBACE 1000 capillaries; output 250,000-500,000 bases per day
1998: Pyrosequencing developed
2001: Draft human sequence
2005: Launch of Genome Sequencer 20 System by 454 Life Sciences, based on pyrosequencing technology; output 20 million bases per run

Filho JS, Breast Cancer Research 2009


Traditional Sequencing vs 454 Technology

GenBank:


Nucleic Acids Res. 2011 Jan;39(Database issue):D32-7.


What is Genomics?

• An organism's complete set of DNA is called its genome

• Genomics is the study of the entire genome of an organism

• Genomics involves investigating the structure and function of very large numbers of genes simultaneously.

The Race


20th July 1969 26th June 2000

President Clinton 26th June 2000

• “We are here to celebrate the completion of the first survey of the entire human genome. Without a doubt, this is the most important, most wondrous map ever produced by humankind.”


Prime Minister Blair 26 June 2000

• “…a revolution in medical science whose implications far surpass even the discovery of antibiotics… And every so often in the history of human endeavor there comes a breakthrough that takes humankind across a frontier and into a new era. …a breakthrough that opens the way for massive advances in the treatment of cancer and hereditary diseases, and that is only the beginning.”

February 2001

$2.7 billion US (public project) vs $300 million US (private effort)

Cost of private effort, 13 years ago

• 300 machines running night and day for over a year
• $30,000,000 to buy
• $2 M a month in electricity
• $4 M a month in chemicals
• The resulting sequence fits on 5 CDs
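A back-of-the-envelope check of these figures, as a sketch only. It assumes exactly 12 months of running costs (the slide says "over a year"), one byte per base with no compression, a 700 MB CD, and the 3.2 billion base genome size quoted later in the talk; none of these assumptions come from the slide itself.

```python
import math

PURCHASE_USD = 30_000_000             # from the slide: cost to buy the machines
ELECTRICITY_USD_PER_MONTH = 2_000_000
CHEMICALS_USD_PER_MONTH = 4_000_000


def running_total_usd(months: int) -> int:
    """Purchase price plus monthly electricity and chemicals for the given period."""
    return PURCHASE_USD + months * (ELECTRICITY_USD_PER_MONTH + CHEMICALS_USD_PER_MONTH)


GENOME_BASES = 3_200_000_000  # genome size quoted later in the talk
CD_BYTES = 700_000_000        # assumption: a 700 MB CD


def cds_needed(bases: int, bits_per_base: int) -> int:
    """Number of CDs needed to hold the sequence at a given encoding density."""
    total_bytes = math.ceil(bases * bits_per_base / 8)
    return math.ceil(total_bytes / CD_BYTES)


print(running_total_usd(12))        # 102000000
print(cds_needed(GENOME_BASES, 8))  # 5 (one byte per base, as plain text)
print(cds_needed(GENOME_BASES, 2))  # 2 (two bits per base is enough for A, C, G, T)
```

With one character per base, the sequence comes to about 3.2 GB, which is consistent with the "5 CDs" figure on the slide.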


Human Genome Project

• The biggest bioinformatics project of its time
• So what have we learned so far?
  – 3.2 billion bases in the human genome
  – Just over 20,000 protein-coding genes
  – Humans vary at about 1 in 1,000 bp
    • 3.2 million differences between non-relatives
    • Almost as much information as in the entire genome of E. coli (4.6 million bases)
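The arithmetic behind these numbers can be checked in a few lines. This is only a sketch using the rounded figures from the slide; the variable names are made up.

```python
GENOME_BASES = 3_200_000_000   # bases in the human genome (from the slide)
BASES_PER_DIFFERENCE = 1_000   # humans vary at about 1 in 1,000 bp (from the slide)
ECOLI_GENOME_BASES = 4_600_000

# Expected number of differing positions between two unrelated people.
differences = GENOME_BASES // BASES_PER_DIFFERENCE

# How that compares with the size of the E. coli genome.
fraction_of_ecoli = differences / ECOLI_GENOME_BASES

print(differences)                  # 3200000
print(round(fraction_of_ecoli, 2))  # 0.7
```

So the differences between two people amount to roughly 70% of the length of an E. coli genome.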


Completed Human Genomes

• Craig Venter, 2001-2003
• James D. Watson, 2007
• Archbishop Desmond Tutu, 2010

James D. Watson
• June 2007
• 454 Sequencer
• Took 4 months
• Cost <$1 million

Richard Carson/Reuters



19 August 2011

• Baylor College of Medicine Human Genome Sequencing Center and the AGRF in Melbourne, Australia
• WGS and Sanger sequencing
• 2x coverage
• 5.9x coverage on ABI SOLiD
• 2,574 megabases
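To give a feel for what "2x" and "5.9x" coverage mean, the sketch below treats coverage as total sequenced bases divided by genome size, and uses the idealized Lander-Waterman (Poisson) model to estimate what fraction of the genome is hit by at least one read. It assumes the 2,574 megabase figure is the genome size and that reads land uniformly at random; both are simplifying assumptions, and the function names are invented.

```python
import math

GENOME_MB = 2_574  # assumption: the slide's figure read as genome size in megabases


def sequenced_megabases(genome_mb: float, coverage: float) -> float:
    """Total bases that must be sequenced to reach a given average coverage."""
    return genome_mb * coverage


def expected_fraction_covered(coverage: float) -> float:
    """Poisson estimate of the fraction of bases covered by at least one read."""
    return 1.0 - math.exp(-coverage)


for coverage in (2.0, 5.9):
    total = sequenced_megabases(GENOME_MB, coverage)
    covered = expected_fraction_covered(coverage)
    print(f"{coverage}x: {total:,.0f} Mb sequenced, about {covered:.1%} of bases covered")
```

Under this model, 2x coverage still leaves more than a tenth of the genome unsampled, which is one reason low-coverage assemblies are fragmented.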


Renfree et al. Genome Biology 2011, 12:R81

Complete Genomes, Nov 2010

• http://www.ncbi.nlm.nih.gov/Genomes/
• There are now over 1000 complete prokaryotic genomes available in Entrez Genome
• All three main domains of life (bacteria, archaea and eukaryotes) are represented, as well as many viruses and organelles
• The genomes of humans, mice, rats, worms and flies have been completed

http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/org.html


http://www.1000genomes.org/about

http://www.icgc.org/

DIY genomics

Summary and Challenges Ahead

• DNA sequencing is becoming faster and cheaper at a pace far outstripping Moore’s law (the rate at which computing gets faster and cheaper).

• The ability to determine DNA sequences is starting to outrun the ability of researchers to store, transmit and especially to analyze the data.

http://infoproc.blogspot.com/2011/11/dna-data-deluge.html


• Data handling is now the bottleneck

• It costs more to analyze a genome than to sequence a genome.

• The cost of sequencing a human genome (all three billion bases of DNA in a set of human chromosomes) plunged from $8.9 million in July 2007 to $10,500 last July
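The comparison with Moore's law can be made concrete by working out how quickly the cost halved. The sketch below assumes "last July" means July 2011, i.e. 48 months after July 2007, and takes a 24-month doubling period as a common reading of Moore's law; both are assumptions, and the function name is hypothetical.

```python
import math

COST_JULY_2007_USD = 8_900_000
COST_LAST_JULY_USD = 10_500
MONTHS_ELAPSED = 48          # assumption: "last July" is July 2011
MOORE_DOUBLING_MONTHS = 24   # assumption: a common reading of Moore's law


def halving_time_months(start_cost: float, end_cost: float, months: float) -> float:
    """Months needed for the cost to halve, assuming a steady exponential decline."""
    return months / math.log2(start_cost / end_cost)


fold_drop = COST_JULY_2007_USD / COST_LAST_JULY_USD
halving = halving_time_months(COST_JULY_2007_USD, COST_LAST_JULY_USD, MONTHS_ELAPSED)
moore_fold = 2 ** (MONTHS_ELAPSED / MOORE_DOUBLING_MONTHS)

print(f"Sequencing cost fell about {fold_drop:.0f}-fold")
print(f"Cost halved roughly every {halving:.1f} months")
print(f"Moore's law over the same period: about {moore_fold:.0f}-fold")
```

On these assumptions the cost of sequencing halved about every five months, while a Moore's-law pace would give only a 4-fold improvement over the same four years.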


• Storage and access to data causes issues
  – Not all data is in GenBank or in a format that can be easily accessed
• Demand from non-scientists for tools to visualize, understand and interpret their own genomic data

http://www.missionmassimo.com/?page_id=8

Personalized Medicine: the future

BioInfoSummer 2013 program

• Monday: Background to Biology and Statistics
• Tuesday: Evolution Biology
• Wednesday: Systems Biology
• Thursday: Next Generation Sequencing (NGS)
• Friday: Programming for Bioinformatics

•Thank You!
