Page 1: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

TOPICS IN (NANO) BIOTECHNOLOGY

Human Genome Project

Lecture 12

27th November, 2006

PhD Course

Page 2: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Celera Genomics

Public Consortium

Page 3: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

http://www.pbs.org/wgbh/nova/genome/program.html

Page 4: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12
Page 5: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Remember what the genome is?

Human Genome
  – Nuclear
  – Mitochondrial

• Human Genome organisation
  – Human genome contains ~40,000 genes
  – Nuclear genome: 3000 Mb
  – 30,000 to 40,000 structural genes
  – 24 different types of DNA duplex
  – 22 autosomes, 2 sex chromosomes

Page 6: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Let’s define it.

• DEFINITION: The entire genetic makeup of the human cell nucleus.
  – Includes the non-coding sequences located between genes, which make up the vast majority of the DNA in the genome (~95%)

Page 7: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

What is the Human Genome Project?

• DEFINITION: The Human Genome Project is a multi-year effort to find all of the genes on every chromosome in the human body and to determine their biochemical nature.

• SPECIFIC GOALS:
  – Identify all the genes in human DNA
  – Determine the sequence of the 3 billion base pairs (bps)
  – Save the information in databases
  – Improve tools for data analysis
  – Transfer related technologies to the private sector
  – Address the ethical, legal and social issues that may arise from the project

Page 8: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Sequencing the Human Genome

Page 9: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Sequence Genomes

Find Genes

Establish Function and Disease Mechanism

Genetic Mapping, Mutation Detection

Drug Candidates

Gene Therapy

Diagnostics/Prognostics

Cure

Page 10: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

DNA Sequence ‘Reads’ - Need 60,000,000 of them!!

“Sequence the 3 billion (+) base pairs of human DNA and identify all genes contained in the human genome”
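The 60,000,000 figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes 500 bp reads and 10x coverage; both values are assumptions picked from the ranges quoted later in the lecture (500-700 bp fragments, ~8-10x coverage), not numbers stated on this slide.

```python
def reads_needed(genome_bp: int, read_len_bp: int, coverage: float) -> int:
    """Reads required so that total sequenced bases = coverage x genome size."""
    return round(genome_bp * coverage / read_len_bp)


# Assumed inputs: 3 billion bp genome, 500 bp reads, 10x coverage.
print(reads_needed(3_000_000_000, 500, 10))  # 60000000
```

Longer reads or lower coverage reduce the count proportionally, so the slide's number is best read as an order-of-magnitude estimate.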

Page 11: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Importance and Impact

Why are genome projects important?
  – The key to continued development of molecular biology, genetics and molecular life sciences
  – A catalogue containing a description of the sequence of every gene in a genome is seen as immensely valuable, even if the function is not known
  – Aid in isolation and utilisation of new genes
  – Stretch technology to its limits

What is the potential impact?
  – Improved diagnosis/therapy of disease
  – Prokaryotic genomes: vaccine design, exploration of new microbial energy sources
  – Plant and animal genomes: enhance agriculture

Page 12: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

The primary HGP sequencing sites

• The Whitehead Institute for Biomedical Research (Eric Lander, Massachusetts, USA)
• The Sanger Centre (Cambridge, GB)
• Baylor College of Medicine (Richard Gibbs, Houston, USA)
• Washington University (Robert Waterston, St. Louis, USA)
• DOE’s Joint Genome Institute, JGI (Trevor Hawkins, Walnut Creek, California, USA)
• …and other genome centres worldwide...

Page 13: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

The Human Genome Project - Timelines

[Figure: timeline with a year axis running from 1985 to 2001. Milestones marked on it:]

• Conference on HGP Feasibility
• Congress Recommends 15 year HGP Project
• HGP Officially Begins
• Low Resolution Linkage Map of HG Published
• High Resolution Maps of Specific Chromosomes Announced
• S. cerevisiae Genome Completed
• E. coli Genome Completed
• Celera Genomics Formed
• C. elegans Genome Completed
• 1st Human Chromosome Sequenced
• Fly Genome Completed
• President announces genome working draft completed
• Human Genome Published: Nature (Feb. 15, 2001) - HGP; Science (Feb. 16, 2001) - Celera

Page 14: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

History of Human Genome Project

• 1983 Los Alamos Labs and Lawrence Livermore National Labs, both under the DOE, begin production of DNA cosmid libraries for single chromosomes

• 1986 DOE announces the Human Genome Project

• 1987 DOE advisory committee recommends a 15-year multi-disciplinary undertaking to map and sequence the human genome. NIH begins funding of genome projects

• 1988 Recognition of need for concerted effort. HUGO (Human Genome Organisation) founded to coordinate international efforts. DOE and NIH sign the Memorandum of Understanding outlining plans for co-operation

Page 15: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

History of Human Genome Project

• 1990 DOE and NIH present a joint 5-year Human Genome Project plan to Congress. The 15-year project formally begins

• 1991 Genome Database (GDB) established

• 1992 Low resolution genetic linkage map of the entire human genome published. High resolution maps of chromosomes Y and 21 published

• 1993 DOE and NIH revise 5-year goals
  – IMAGE consortium (Integrated Molecular Analysis of Genomes and their Expression) established to co-ordinate efficient mapping and sequencing of gene-representing cDNAs

Page 16: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

History of Human Genome Project

• 1994 Genetic-mapping 5-year goal achieved 1 year ahead of schedule
  – Genetic Privacy Act proposed to regulate collection, analysis, storage and use of DNA samples (endorsed by ELSI)

• 1994-98 Tons of stuff happens that continues to advance the project

• 1998 Celera Genomics formed
  – New 5-year plan by DOE and NIH

• 1999 First chromosome completely sequenced (Chromosome 22)

• 2000 June 26, HGP and Celera announce that they have completed ~97% of the human genome

• 2003 April 14, HGP finally completed

Page 17: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

People of Human Genome Project

• James Watson: original head of HGP

• Francis Collins

• Craig Venter

Page 18: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

DNA sequencing

• The Sanger dideoxy termination method (remember?)
  – Nucleotide analogs (ddNTPs) are incorporated into DNA during its synthesis together with normal nucleotides (dNTPs); when a ddNTP is inserted, the reaction stops = chain termination

• Radioactively labeled ddNTPs
  – Four different reactions are performed, each containing one of ddATP, ddGTP, ddCTP or ddTTP
  – Autoradiography enables analysis of the different fragment lengths, which correspond to different termination points

• Fluorescently labeled ddNTPs
  – One reaction is carried out; all four ddNTPs are incorporated, but each ddNTP is labelled with a different fluorescent dye
  – Automated DNA sequencers interfaced with computers determine the order of the dyes and hence the DNA sequence
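A toy model may help to see why fragment lengths give the sequence. The sketch below is an idealised simulation, not laboratory software: it assumes that termination happens exactly once at every position of the newly synthesised strand, and the function names are made up for illustration.

```python
def chain_termination_fragments(new_strand: str) -> dict:
    """For each base, list the lengths of the fragments ending in that ddNTP.

    new_strand is the strand being synthesised, written 5'->3'. A fragment
    of length i + 1 ends with the dideoxy form of new_strand[i].
    """
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for i, base in enumerate(new_strand):
        lanes[base].append(i + 1)
    return lanes


def read_sequence(lanes: dict) -> str:
    """Read the sequence by ordering all fragments from shortest to longest."""
    by_length = {n: base for base, lengths in lanes.items() for n in lengths}
    return "".join(by_length[n] for n in sorted(by_length))


lanes = chain_termination_fragments("GATTACA")
print(lanes)                 # one "lane" (or dye colour) per ddNTP
print(read_sequence(lanes))  # the order of fragment lengths gives the sequence
```

In the radioactive version each key of `lanes` corresponds to one of the four reactions run in its own gel lane; in the fluorescent version the four keys correspond to four dye colours in a single reaction.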

Page 19: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Mapping the Human Genome: Low Resolution Mapping

• The Gene Linkage Map

• Identifies the position of genes by locating marker base sequences associated with restriction fragment length polymorphisms (RFLPs)

• Based on how close together two genes are
  – the closer together two genes are, the less likely they are to separate during meiotic recombination in germ cells
  – the frequency of recombination between two genes can help to decipher the distance between them on a gene linkage map
  – genes separated by more than 50 cM (roughly 50 million bps) are not considered linked

• Studies of families affected by genetic disease have proven useful for genetic linkage analysis

http://library.thinkquest.org/20465/g_linkagemap.html
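The link between recombination frequency and map distance can be written as a one-line calculation. The sketch below uses the usual convention that 1% recombination corresponds to 1 cM; the offspring counts are invented for illustration, and the simple proportionality is only a reasonable approximation for genes that are fairly close together.

```python
def map_distance_cm(recombinant_offspring: int, total_offspring: int) -> float:
    """Recombination frequency expressed in centimorgans (1% = 1 cM)."""
    return 100.0 * recombinant_offspring / total_offspring


def considered_linked(distance_cm: float) -> bool:
    """Apply the 50 cM cut-off quoted on the slide."""
    return distance_cm < 50.0


# Hypothetical family data: 12 recombinant offspring out of 200 scored.
d = map_distance_cm(12, 200)
print(d, considered_linked(d))
```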

Page 20: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Mapping the Human Genome: High Resolution Mapping

• The Physical Map

• Provides the actual distances in bps between genes on a given chromosome

• Prepared by aligning the sequences of adjacent DNA fragments from small overlapping clones to form a contiguous map (a contig map)

• Sequence-tagged sites (STS) mark sites on chromosomes and help to locate adjacent segments of DNA
  – if two DNA fragments share an STS they overlap and are contiguous
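The rule "if two fragments share an STS they overlap" is enough to group clones into contigs. The sketch below is a minimal illustration with invented clone and STS names; it only groups clones and does not attempt to order them within a contig.

```python
def build_contigs(clone_sts: dict) -> list:
    """Group clones into contigs: clones sharing any STS are taken to overlap.

    clone_sts maps a clone name to the STS markers detected on that clone.
    """
    contigs = []  # list of (set of clone names, set of STS names)
    for clone, markers in clone_sts.items():
        names, sts = {clone}, set(markers)
        untouched = []
        for c_names, c_sts in contigs:
            if c_sts & sts:       # a shared STS joins the two groups
                names |= c_names
                sts |= c_sts
            else:
                untouched.append((c_names, c_sts))
        contigs = untouched + [(names, sts)]
    return sorted(sorted(names) for names, _ in contigs)


# Hypothetical clones and markers.
clones = {
    "cloneA": ["sts1", "sts2"],
    "cloneB": ["sts2", "sts3"],
    "cloneC": ["sts5"],
    "cloneD": ["sts3", "sts4"],
}
print(build_contigs(clones))
```

Here cloneA, cloneB and cloneD end up in one contig through the chain of shared markers sts2 and sts3, while cloneC stays on its own.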

Page 21: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Mapping the Human Genome: High Resolution Mapping

• Sequence Tagged Sites (STS)
  – Sequences occurring only once in the human genome
  – Help to map locations
  – 52,000 STS in humans: ~1 every 62,000 bases

Page 22: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Hierarchical (Clone-based) Approach

• Know location of 30,000 – 100,000 bp region
• Break into 500-700 bp fragments
• Sequence fragments
• Assemble based on similarity
• ~8-10x coverage

• Current Price: $0.09 / base
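Why aim for ~8-10x coverage? One way to see it is the idealised assumption that reads start at random positions, in which case the chance that a given base is missed by every read is about e^(-c) at coverage c. This Poisson model is an assumption brought in for illustration, not something the slide states, and real clones behave less ideally.

```python
import math


def fraction_uncovered(coverage: float) -> float:
    """Expected fraction of bases hit by no read, under a Poisson model."""
    return math.exp(-coverage)


def expected_gap_bases(region_bp: int, coverage: float) -> float:
    """Expected number of unsequenced bases in a region of the given size."""
    return region_bp * fraction_uncovered(coverage)


# A 100,000 bp clone (upper end of the range on the slide) at several coverages.
for c in (1, 8, 10):
    print(c, fraction_uncovered(c), expected_gap_bases(100_000, c))
```

At 1x roughly a third of the clone would be missed; at 8x the expectation drops to a few tens of bases and at 10x to a handful, which is consistent with the coverage range quoted above.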

Page 23: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Hierarchical (clone-based) approach

• generate overlapping set of clones
• select a minimum tiling path
• shotgun sequence each clone
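Selecting a minimum tiling path can be treated as an interval-cover problem once clone positions are known from the map. The sketch below uses a simple greedy rule (always take the clone that reaches furthest); the clone names and coordinates are hypothetical, and real pipelines also weigh clone quality and overlap size.

```python
def minimum_tiling_path(clones, region_start, region_end):
    """Greedy cover of [region_start, region_end) with as few clones as possible.

    clones is a list of (name, start, end) tuples in map coordinates.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    path, covered_to, i = [], region_start, 0
    while covered_to < region_end:
        best = None
        # Among clones starting at or before the covered point,
        # keep the one that extends furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered_to:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered_to:
            raise ValueError(f"gap in clone coverage at position {covered_to}")
        path.append(best[0])
        covered_to = best[2]
    return path


# Hypothetical BAC clones covering a 300 kb region (coordinates in kb).
bacs = [("BAC1", 0, 100), ("BAC2", 50, 120), ("BAC3", 80, 200),
        ("BAC4", 150, 260), ("BAC5", 190, 300)]
print(minimum_tiling_path(bacs, 0, 300))
```

Only the clones on the returned path need to be shotgun sequenced, which is where the approach saves sequencing effort compared with sequencing every clone.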

Page 24: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Hierarchical (clone-based) approach

• DISADVs
  – map generation requires resources, time and money
  – some regions not cloned

• ADVs
  – easier to assemble smaller pieces
  – less chance for assembly error

Page 25: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Determining genome sequences

• The aim, obviously, is to determine the entire genome sequence

• A sequence has to be constructed from a series of shorter fragments

• Shotgun technique
  – break molecule into smaller fragments
  – determine sequence of each one
  – use a computer to search for overlaps and build a master sequence
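The "search for overlaps and build a master sequence" step can be illustrated with a toy greedy assembler. This is a teaching sketch under strong assumptions (error-free reads, all from the same strand, no repeats longer than the overlaps); it is not how production assemblers such as PHRAP work internally.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a equal to a prefix of b (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates
    while True:
        # Discard any sequence wholly contained in another one.
        contigs = [r for r in contigs
                   if not any(r != s and r in s for s in contigs)]
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # nothing left to merge
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]


# Three overlapping reads from a made-up 14 bp "genome".
print(greedy_assemble(["GTTAGC", "ATGCGTAC", "GTACGTTA"]))
```

If the reads do not all overlap, the function returns several contigs rather than one master sequence.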

Page 26: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Shotgun Sequencing Approach

• Developed 1991, TIGR
  – Craig Venter, Hamilton Smith

• Break genome into millions of pieces
  – Sequence each piece
  – Reassemble into full genomes

Page 27: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Whole Genome Shotgun Approach

• reads generated directly from a whole-genome library

• assemble the genome all at once

Page 28: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Whole Genome Shotgun Approach

• DISADVs
  – more prone to assembly error
  – computationally intensive
  – cannot effectively handle repeats

• ADVs
  – less overhead time up front

http://www.teachersdomain.org/resources/tdc02/sci/life/gen/sequencingrace/index.html
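The trouble with repeats is easy to show: a read that lies entirely inside a repeated sequence fits the genome in more than one place, so an assembler working on the whole genome at once cannot tell which copy it came from. The sequences below are invented for illustration.

```python
def placements(read: str, genome: str) -> list:
    """All start positions at which the read matches the genome exactly."""
    hits = []
    start = genome.find(read)
    while start != -1:
        hits.append(start)
        start = genome.find(read, start + 1)
    return hits


repeat = "CCATGG"
genome = "TTG" + repeat + "AAG" + repeat + "GTA"  # the repeat occurs twice

print(placements("CCATGG", genome))  # read inside the repeat: ambiguous
print(placements("TGCCAT", genome))  # read reaching unique flanking sequence
```

Reads that extend into unique flanking sequence can be placed unambiguously, which is why repeats longer than a read are the hard case.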

Page 29: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Chromosome walking

• Analysis of DNA sequences of chromosomes by extending the sequenced region a little bit further each time until the tips of the chromosome are reached

• The next round of sequencing is based on the results of the previous round by synthesising appropriate DNA primers to extend further
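The walking idea (read, design a primer from the end of the new sequence, read again) can be sketched as a loop. The sketch below is a simulation only: the template string stands in for the unknown DNA, each slice stands in for one sequencing reaction, and the read and primer lengths are unrealistically short so the example stays readable.

```python
def primer_walk(template: str, first_primer: str, read_len: int, primer_len: int):
    """Extend a known sequence one read at a time until the template ends.

    Returns the walked sequence and the list of primers used.
    """
    site = template.find(first_primer)
    if site == -1:
        raise ValueError("starting primer not found in template")
    known = first_primer
    primers = [first_primer]
    pos = site + len(first_primer)  # first base after the primer
    while pos < len(template):
        read = template[pos:pos + read_len]  # stand-in for one sequencing reaction
        known += read
        pos += len(read)
        if pos < len(template):
            # The next primer is designed from the newest known sequence.
            primers.append(known[-primer_len:])
    return known, primers


sequence, primers = primer_walk("ATGGCGTACCTTGAGCATTCGA", "ATGG", read_len=8, primer_len=4)
print(sequence)
print(primers)
```

Each round depends on the result of the previous one, which is why walking is slow compared with shotgun approaches that sequence many fragments in parallel.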

Page 30: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Base calling and Assembly Software

• PHRED and PHRAP developed in the 1990s (PHRED published 1998)
  – PHRED: Base calling software
  – PHRAP: Assists in assembly of sequenced data
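As background to base calling: PHRED is known for attaching a quality value to each called base, defined as Q = -10 log10(P), where P is the estimated probability that the call is wrong. The slide does not go into this, so the sketch below is added context; the function names are illustrative and are not part of the PHRED program itself.

```python
import math


def phred_quality(p_error: float) -> float:
    """Quality value for a given probability that the base call is wrong."""
    return -10.0 * math.log10(p_error)


def error_probability(q: float) -> float:
    """Inverse of the above: error probability for a given quality value."""
    return 10.0 ** (-q / 10.0)


for q in (10, 20, 30, 40):
    print(q, error_probability(q))
```

A quality of 20 therefore means an estimated 1 in 100 chance of a wrong call, and 30 means 1 in 1000.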

Page 31: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Available Assemblers

• SEQAID (Peltola et al., 1984)

• CAP (Huang, 1992)

• PHRAP (Green, 1994)

• TIGR Assembler (Sutton et al., 1995)

• AMASS (Kim et al., 1999)

• CAP3 (Huang and Madan, 1999)

• Celera Assembler (Myers et al., 2000)

• EULER (Pevzner et al., 2001)

• ARACHNE (Batzoglou et al., 2002)

Page 32: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Growth of GenBank

• 1982: 600,000 Bases

• 2002: 28.5 Billion Bases

Image source: www.ncbi.nlm.nih.gov
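The two data points give a rough doubling time for GenBank. The sketch below assumes steady exponential growth between 1982 and 2002, which is a simplification made for illustration.

```python
import math


def doubling_time_years(start_bases: float, end_bases: float, years: float) -> float:
    """Doubling time implied by two sizes, assuming steady exponential growth."""
    doublings = math.log2(end_bases / start_bases)
    return years / doublings


print(doubling_time_years(600_000, 28_500_000_000, 2002 - 1982))
```

The result is roughly 1.3 years, i.e. the database doubled about every 15 months over that period.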

Page 33: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Results of Human Genome Project

• The International Human Genome Sequencing Consortium published their results in Nature, 409(6822):860-921, 2001
  – Initial Sequencing and Analysis of the Human Genome

• Celera Genomics published their results in Science, 291(5507):1304-1351, 2001
  – The Sequence of the Human Genome

Page 34: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Results of Human Genome Project

• The human genome contains 3146.7 million bases

• The average gene size is 3,000 bases

• The total number of genes is between 30,000 and 40,000

• The order of 99.9% of the nucleotides is the same in all people

• Of the discovered genes, the function of more than half is unknown

• More than 30 genes have already been associated with human disease (e.g. cancer, blindness)

Page 35: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Results of Human Genome Project

• About 2% of the genome encodes instructions for the synthesis of proteins

• Repeated sequences make up 50% of the genome

• There are gene-rich “urban centres”: stretches of repeated C and G bases (CpG islands) occur adjacent to gene-rich areas

• Chromosome 1 has 2,968 genes; the Y has 231

• Humans:
  – have only about twice as many genes as the fly
  – have 3 times as many proteins as fly or worm
  – share the same gene families as fly or worm
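To make "stretches of C and G" concrete, two simple statistics are often computed over a window of sequence: the GC content and the ratio of observed to expected CpG dinucleotides. The sketch below is added for illustration; the example sequences are invented and no particular threshold for calling a CpG island is implied by the lecture.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed CpG count divided by the count expected from C and G frequencies."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


for window in ("CGCGCGTACG", "ATATTAATGC"):
    print(window, gc_content(window), cpg_obs_exp(window))
```

A window with high GC content and a high observed/expected ratio is a candidate CpG island; in practice such statistics are computed over windows of a few hundred bases, not ten.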

Page 36: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Completed genomes

• Microbial genomes
  – Haemophilus influenzae
  – Escherichia coli
  – Bacillus subtilis
  – Helicobacter pylori
  – Streptococcus pneumoniae
  – Saccharomyces cerevisiae
  – Archaeoglobus fulgidus
  – Methanobacterium thermoautotrophicum
  – Methanococcus jannaschii
  – Mycobacterium tuberculosis
  – Staphylococcus aureus

• and more…..

• Plant and animal genomes
  – Arabidopsis thaliana
  – Drosophila melanogaster
  – Mus musculus

Page 37: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Results of Human Genome Project

Organism                              Genome Size (Bases)   Estimated Genes
Human (Homo sapiens)                  3 billion             30,000
Laboratory mouse (M. musculus)        2.6 billion           30,000
Mustard weed (A. thaliana)            100 million           25,000
Roundworm (C. elegans)                97 million            19,000
Fruit fly (D. melanogaster)           137 million           13,000
Yeast (S. cerevisiae)                 12.1 million          6,000
Bacterium (E. coli)                   4.6 million           3,200
Human immunodeficiency virus (HIV)    9700                  9

http://www.teachersdomain.org/resources/tdc02/sci/life/gen/hgp/index.html
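Dividing genome size by gene count shows how differently genes are packed in these organisms. The sketch below only re-uses the rounded figures from the table, so the results are rough.

```python
# (genome size in bases, estimated genes), taken from the table above
GENOMES = {
    "Human": (3_000_000_000, 30_000),
    "Mouse": (2_600_000_000, 30_000),
    "A. thaliana": (100_000_000, 25_000),
    "C. elegans": (97_000_000, 19_000),
    "D. melanogaster": (137_000_000, 13_000),
    "S. cerevisiae": (12_100_000, 6_000),
    "E. coli": (4_600_000, 3_200),
    "HIV": (9_700, 9),
}


def bases_per_gene(organism: str) -> float:
    """Average number of genome bases per estimated gene."""
    size, genes = GENOMES[organism]
    return size / genes


for name in GENOMES:
    print(f"{name}: {bases_per_gene(name):,.0f} bases per gene")
```

With these figures the human genome has about 100,000 bases per gene, against about 1,400 for E. coli, which reflects how much of the human genome is non-coding.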

Page 38: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Ethical, legal and societal issues

• The DOE and the NIH devote 3-5% of their annual HGP budgets to studying the ethical, legal and social issues (ELSI) associated with the availability of genetic information

• This is the world’s largest bioethics program, and it has become a worldwide model

• Examples of ELSI are:
  – privacy legislation
  – gene testing
  – patenting
  – forensics
  – behavioural genetics
  – genetics in the courtroom

Page 39: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Societal Concerns

• Who should have access to this information?
  – Employers
  – Insurers
  – Schools
  – Courts
  – Adoption agencies
  – Military

• Philosophical Implications
  – Human responsibility
  – Free will versus genetic determinism

• Who owns and controls genetic information?
  – How are privacy and confidentiality managed?

• Psychological impact and stigmatisation
  – Effects on the individual
  – Effects on society’s perceptions and expectations of the individual

Page 40: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Clinical Issues

• Clinical Issues
  – Growing demand to educate health care workers
  – Public needs to gain scientific literacy and understand the capabilities, limitations and risks
  – Standards need to be established, including quality controls to ensure accuracy and reliability
  – Regulations?

• Genetic Counselling
  – Informed consent for complex procedures
  – Counselling about risks, limitations and reliability of genetic screening techniques
  – Reproductive decision making based on genetic information
  – Reproductive rights

• Multifactorial diseases and environmental factors
  – Genetic predispositions do not mandate disease development
  – Caution must be exercised when correlating genetic tests with predictions

Page 41: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Commercialisation and patents

• Who owns genes and DNA sequences?
  – The person (or company) who discovered it, or the person whose body it came from?
  – Should genetic information be the property of humanity?
  – Is it ethical to charge someone for access to a database of genetic information?

• Is it time to raise the bar concerning patents?
  – Will patent protection slow the advance of research and be detrimental to society as a whole in the long run?

Page 42: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Benefits of Human Genome Project

Medicine

Bioinformatics

Biotechnology

DNA chip technology

Gene therapy applications

Diagnostic & therapeutic applications

Medicine & pharmaceutical industries

Agriculture & Bioremediation Industries

Microarray Technology

Proteomics

Pharmacogenomics

Preventative measures

Developmental Biology

Evolutionary & Comparative Biologists

Page 43: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Single nucleotide polymorphisms

• SNPs occur when a single nucleotide in the genome sequence is altered (1 bp difference)

• 66% of SNPs involve a C to T change, and they occur every 100-300 bases in either coding or non-coding regions

• Evolutionarily stable; there are between 2 and 3 million SNPs in the human genome

• Many SNPs have no effect on cell function, but:
  – some SNPs could be responsible for variations in how humans respond to disease, environmental factors, drugs and other therapies
  – SNPs may help identify multiple genes involved in complex diseases
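In computational terms, spotting SNPs between two already aligned sequences is a position-by-position comparison. The sketch below assumes the two sequences are the same length and aligned without gaps; the sequences are invented, and positions are reported 0-based.

```python
def find_snps(reference: str, sample: str) -> list:
    """Return (position, reference base, sample base) for every mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, r, s)
            for i, (r, s) in enumerate(zip(reference, sample))
            if r != s]


def fraction_c_to_t(snps: list) -> float:
    """Share of the SNPs that are C in the reference and T in the sample."""
    if not snps:
        return 0.0
    return sum(1 for _, r, s in snps if (r, s) == ("C", "T")) / len(snps)


ref = "ACGTCCGATC"
smp = "ACGTTCGATT"
snps = find_snps(ref, smp)
print(snps, fraction_c_to_t(snps))
```

Real SNP discovery compares many individuals and has to separate true variants from sequencing errors, which this sketch ignores.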

Page 44: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Single nucleotide polymorphisms

• SNPs are NOT the same things as alleles (or so we believe so far)

• Researchers have found that most SNPs are not responsible for a disease state
  – They serve as markers for pinpointing a disease on the human genome map, being located near a gene found to be associated with a certain disease
  – Occasionally, SNPs may actually cause a disease and can be used to search for and isolate the disease-causing gene
  – SNPs travel together, i.e. variations in DNA are linked

• To date, Celera & Orchid Biosciences have the largest databases

Page 45: TOPICS IN (NANO) BIOTECHNOLOGY Human Genome Project Lecture 12

Single nucleotide polymorphisms

• Goals:
  – Develop large scale technologies
  – Identify common variants in the coding regions
  – Create a SNP map of at least 100,000 markers
  – Develop the intellectual foundation for studies of sequence variation
  – Create public resources of DNA samples and cell lines

• SNP Consortium:
  – Ten large pharmaceutical companies and the UK Wellcome Trust
  – Headed by Arthur Holden
  – Find and map 300,000 common SNPs
  – Generate a widely accepted, high-quality, publicly available map

http://www.learner.org/channel/courses/biology/units/genom/images.html