genome biology


Upload: jonah

Post on 15-Jan-2016


TRANSCRIPT

Page 1: Genome biology

Genome biology

Page 2: Genome biology

Topics

1. Definitions

2. The structure of the genome

3. The function of the genome

4. Methods of genomics

Page 3: Genome biology

1. Definitions

Page 4: Genome biology

Definitions - 1

Genome:
definition 1. The information encoded in the hereditary material of an organism
definition 2. The haploid DNA (of a cell) of an organism

1. Nuclear genome
2. Mitochondrial and chloroplast genomes

Transcriptome:
1. Full transcriptome: the total set of mRNAs of an organism
2. Cellular transcriptome: the total set of mRNAs of a cell of an organism in an experimental situation

Proteome:
1. Full proteome: the total set of proteins of an organism
2. Cellular proteome: the total set of proteins of a cell in an experimental situation

Page 5: Genome biology

Definitions - 2

Genomics (genome biology)

1. Structural genomics, def: genetic mapping and comparison of individuals
a. determination of the genomic sequence (human, mouse, chimpanzee, etc.)
b. genome variability: intraspecific polymorphism
c. genome evolution: interspecific polymorphism

2. Functional genomics:
def 1: examination of the transcriptome
def 2: examination of the function of the genes

2/1 Functional genomics-I: transcriptomics
2/2 Functional genomics-II: proteomics

Scope:
- collecting cDNAs
- measuring the differences in mRNA expression: transcriptomics
- measuring the differences in protein expression: proteomics

Page 6: Genome biology

Definitions- 3

Alternative grouping:

• Genomics

• Functional genomics (transcriptomics)

• Proteomics

Page 7: Genome biology

Definitions - 4

Other „omics”:

• Phosphorylomics: the interactions between kinases and their substrates

• Methylomics: the methylation marks of the whole DNA (3-5% in mammals)

• Metabolomics: the interactions between enzymes and their substrates involved in metabolism

• Interactomics: interactions between genes

• Lipidomics: the collection of lipids

Omics: a systems biology approach

Page 8: Genome biology

The phosphorylome of the yeast

Red dots: kinases
Blue dots: substrates
Green lines: connections

Page 9: Genome biology

Metabolome

Page 10: Genome biology

2. The structure of the genome

2a. The structure of the DNA

2b. The variation of the DNA

2c. The evolution of the DNA

Page 11: Genome biology

2a. The structure of the DNA

Genome programs

The human genome

Page 12: Genome biology

Genome programs- history

1990 The genomes of the viruses

1995 The first prokaryotic genome – H. influenzae

1996 The first eukaryotic genome – yeast

1998 The first multicellular genome – C. elegans (roundworm)

2000 Drosophila melanogaster, Arabidopsis thaliana (thale cress)

2001 Human genome: draft version (90%): 30-35,000 genes

2002 Mouse genome: draft version

2004 Human genome: full version (99%): 20-25,000 genes

2005 Chimpanzee genome: draft version

Page 13: Genome biology

Genome programs - active (300)

a. Non-mammals:
Many viruses: small genomes
E. coli: model organism
Other bacteria: H. influenzae, etc.
Amoeba: different genome
Roundworm (C. elegans): model organism
Fruit fly: model organism
Bee: livestock, intelligent insect
3 wasp species
Trypanosoma + malaria mosquito: health care
Tribolium castaneum: pest, model animal of beetles
Sea star: model animal
Thale cress (Arabidopsis): model; rice + coffee: agriculture

b. Mammals:
human: vanity, self-study, health care
mouse, rat: model organisms
bovine: livestock
dog: huge number of genetic variants, homogeneous (inbred) breeds
chimpanzee: relative
orangutan
Rhesus monkey
Wallaby (kangaroo)
Marmoset (monkey)

Page 14: Genome biology

Genome programs – the competition

Francis Collins: director, NIH National Human Genome Research Institute

Craig Venter: head of Celera Genomics

Bill Clinton: President of the USA

Page 15: Genome biology

The organization of the human genome - what did they find?

I. Not 100,000 - 150,000 genes, but 20,000 - 25,000: barely more than the fruit fly and C. elegans, but the proteome is ~10x as big

II. The larger part of the genome is non-coding: junk or selfish DNA? Or maybe functional?

III. Nearly all insect and roundworm genes are present in humans as well.

IV. The reverse is not true: immunity genes (antibodies, MHC, cytokines), apoptotic genes, etc.

V. Numerous proteins from one gene: a human gene codes for an average of 2.6 proteins (alternative splicing)

VI. More transcription factors

VII. Huge enhancer regions

VIII. More complex domain structure

Page 16: Genome biology

The human genome

Total genome:
- 45% transposable elements
  - 21% LINE (non-LTR retrotransposons)
  - 13% SINE (non-LTR retrotransposons)
  - 8% LTR retrotransposons
  - 3% DNA transposons
- 25% introns + UTR
- 20.7% other intergenic sequences
- 5% large duplications
- 3% simple repeats (microsatellites; VNTRs)
- 1.2% protein-coding sequences

53% repetitive sequences

LTR: long terminal repeat (regulatory role); LINE: long interspersed elements; SINE: short interspersed elements
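The percentages on this chart can be cross-checked with a few lines of Python. This is only a sketch: the assignment of 5% to large duplications and 3% to simple repeats is read from the flattened chart and is an assumption, as is the grouping of classes counted as repetitive.

```python
# Genome composition percentages as read from the slide (assignment partly assumed).
composition = {
    "transposable elements": 45.0,
    "introns + UTR": 25.0,
    "other intergenic sequences": 20.7,
    "large duplications": 5.0,
    "simple repeats": 3.0,
    "protein-coding sequences": 1.2,
}

# Classes assumed to make up the "repetitive sequences" figure on the slide.
repetitive_classes = ["transposable elements", "large duplications", "simple repeats"]

total = sum(composition.values())
repetitive = sum(composition[name] for name in repetitive_classes)

print(f"total: {total:.1f}%")
print(f"repetitive: {repetitive:.1f}%")
```

Under these assumptions the classes add up to almost the whole genome, and the three repetitive classes reproduce the 53% quoted on the slide.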

Page 17: Genome biology

The human genome

Total genome:
- 45% transposable elements
  - Retrotransposons („copy and paste”)
    - Non-LTR retrotransposons (degenerated viral genes): 21% LINE (850,000 copies), 13% SINE (1,500,000 copies)
    - 8% LTR retrotransposons (retroviruses and other functioning retroposons; 450,000 copies)
  - 3% DNA transposons („cut and paste”)
- 25% introns + UTR
- 20.7% other intergenic sequences
- 5% large duplications
- 3% simple repeats
- 1.2% protein-coding sequences

LINE: long interspersed elements; SINE: short interspersed elements

Page 18: Genome biology

The human genome -transposable elements

[Figure: structure of the transposable elements.
Class I - retroviruses (1%): LTR - gag (CP, NC) - pol (Pr, RT, RNase H, Int) - env - LTR
Class I - LTR retrotransposons: LTR - gag (CP, NC) - pol (Pr, RT, RNase H, Int) - LTR
Class I - LINEs (e.g. L1): gag? - pol (RT, RNase H) - polyA
Class I - SINEs (Alu): A and B boxes - polyA
Class I - retrogenes: ORF - polyA
Class II - DNA transposons: IR - transposase - IR
Percentages shown in the figure: 3%, 33%, 7%]

CP: capsid; NC: nucleocapsid; Pr: proteinase; RT: reverse transcriptase; RNase H: ribonuclease H; Int: integrase; env: envelope; LTR: long terminal repeat

I. Class: retrotransposons
I/1. LTR transposons
I/2. Non-LTR transposons
I/2.1. LINEs
I/2.2. SINEs
I/2.3. Retrogenes

II. Class: DNA transposons

Page 19: Genome biology

The human genome-retroviruses

[Figure: retrovirus genome: LTR - gag (CP, NC) - pol (Pr, RT, RNase H, Int) - env - LTR]

CP: capsid; NC: nucleocapsid; Pr: proteinase; RT: reverse transcriptase; RNase H: ribonuclease H; Int: integrase

LTR: long terminal repeat

gag: capsid (structural elements)
pol: polymerase: reverse transcriptase, integrase, proteinase, RNase H
env: envelope (structural elements)

Low copy number (10-1000 copies): human endogenous retroviruses make up 1% of the genome

Page 20: Genome biology

The human genome- Retroviral infection

Page 21: Genome biology

The human genome - retrotransposons

[Figure: Class I elements, all spreading by reverse transcription.
LTR retrotransposons: LTR - gag (CP, NC) - pol (Pr, RT, RNase H, Int) - LTR
LINEs (e.g. L1): gag? - pol (RT, RNase H) - polyA
SINEs (Alu): A and B boxes - polyA
Retrogenes: ORF - polyA]

LTR retrotransposons: derived from human endogenous retroviruses, 10 - 1000 copies

LINEs: in humans L1 is the most common; it is present in 100,000 copies, but many of them are degenerated pseudogenes (imperfect reverse transcription). Of the 3,500 full-length (6.1 kb) L1s, 1% have a promoter and two intact ORFs. LINEs are mobilised in germ line cells and in somatic cells as well.

SINEs: 500,000 - 900,000 Alu copies (the most successful transposon in humans). All Alu elements derive from a 280 bp 7SL RNA gene containing a pol III promoter. They contain recognition sites for the AluI restriction enzyme.

I. Class: retrotransposons
I/1. LTR transposons
I/2. Non-LTR transposons
I/2.1. LINEs
I/2.2. SINEs
I/2.3. Retrogenes

II. Class: DNA transposons

Page 22: Genome biology

The human genome - DNA transposons

[Figure: Class II DNA transposon: IR - transposase - IR]

IR: inverted repeat

DNA transposons:

- The transposase is responsible for the jump; how does the element multiply?
- More than 60 families: Charlie, mariner, Tigger, THE1, etc.
- The mariner family is similar to the transposons present in insects: horizontal gene transfer?

Share of the genome: retrotransposons 21% LINE, 13% SINE, 8% LTR; DNA transposons 3%

I. Class: retrotransposons
I/1. LTR transposons
I/2. Non-LTR transposons
I/2.1. LINEs
I/2.2. SINEs
I/2.3. Retrogenes

II. Class: DNA transposons

Page 23: Genome biology

The human genome - microsatellites, minisatellites, macrosatellites

Total genome: 45% transposable elements; 25% introns + UTR; 20.7% other intergenic sequences; 5% large duplications (minisatellites and macrosatellites); 3% simple repeats (microsatellites; VNTRs); 1.2% exons

Satellites: highly repetitive sequences
Duplications: importance in evolution

Microsatellites: repeats of short units, 4 base pairs long or shorter; 1 - 15 kilobase pairs long
- CA/TG repeats in 0.5% of the genome; their function is not yet known; „replication slippage”
- runs of AAAA and TTTT
- trinucleotide repeats CAA (Gln), GCA (Ala): neuronal disorders; transcription factors in dogs

Minisatellites: 1 - 15 kb repeats, like the telomere: 15 kb of the TTAGGG hexamer, which telomerase adds to the ends of the chromosomes

Macrosatellites: repeats of several hundred kb
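Tandem repeats such as the CA dinucleotide or the telomeric TTAGGG hexamer can be located in a sequence with a regular expression. The sketch below is illustrative only: the function name, the `min_copies` threshold and the example sequence are all made up for this note.

```python
import re

def find_tandem_repeats(seq, unit, min_copies=3):
    """Return (start, copies) for every run of `unit` repeated at least `min_copies` times."""
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}")
    return [(m.start(), len(m.group()) // len(unit)) for m in pattern.finditer(seq)]

# Hypothetical sequence: a (CA)6 microsatellite followed by four telomere-like hexamers.
example = "GGAT" + "CA" * 6 + "TT" + "TTAGGG" * 4 + "CC"

ca_runs = find_tandem_repeats(example, "CA")
telomeric_runs = find_tandem_repeats(example, "TTAGGG")
print(ca_runs)
print(telomeric_runs)
```

Each hit is reported as the 0-based start position and the number of consecutive copies of the repeat unit.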

Page 24: Genome biology

The human genome - exons and introns

Total genome: 45% transposable elements; 25% introns + UTR; 20.7% other intergenic sequences; 5% large duplications; 3% simple repeats; 1.3% protein-coding sequences

Exons: protein-coding DNA sequences + UTRs

Introns:
- cut out
- alternative splicing, other alternative processes
- are they functional?

[Figure: pre-mRNA: leader (5’-UTR) - E1 - I1 - E2 - I2 - E3 - trailer (3’-UTR) with polyA signal; the coding sequence runs from AUG to Stop]

Page 25: Genome biology

The human genome - other intergenic sequences

Total genome: 45% transposable elements; 25% introns + UTR; 20.7% other intergenic sequences; 5% large duplications; 3% simple repeats; 1.3% protein-coding DNA sequences

1. Unidentifiable degenerated transposons
2. Pseudogenes: 2 types (reverse-transcribed RNA, duplicated DNA)
3. Regulatory elements: promoters, enhancers, silencers
4. Others

Page 26: Genome biology

2b. The variability of the DNA

- intraspecific variability

Human genome diversity programs

The genetic code of the phenotypic variability – coding vs regulatory sequences

Page 27: Genome biology

Human genome diversity programs

- From 1990: programs to map the polymorphism of the human genome. Importance: genealogical, medical

- From 2005: Genographic Project (National Geographic)

Genetic markers:
- mtDNA
- Y chromosome

The use of the data:

- Two theories on the origin of Homo sapiens (from Homo erectus): multiregional theory vs. African origin (mitochondrial Eve)

- The migrations of modern man.

Page 28: Genome biology

Inheritance

somatic chromosomes

Page 29: Genome biology

Inheritance

somatic chromosomes
Y chromosome
Mitochondrial DNA

Page 30: Genome biology

Inheritance

somatic chromosomes
Y chromosome
Mitochondrial DNA

Page 31: Genome biology

Genes STR*-s

Genetic markers on the Y chromosome

STR: short tandem repeats

Page 32: Genome biology

16,569 nucleotides

Genes on the mitochondrial DNA

Hypervariable region

Page 33: Genome biology

„Common” origin

[Figure: timeline from 1.8 million years ago to 100,000 years ago; labels: Homo erectus; African, Asian and European Homo erectus; Homo sapiens; Homo neanderthalensis]

Hypotheses: Multiregional Origin; Out of Africa

Page 34: Genome biology

African origin

[Figure: map of human migrations with dates: 130,000 yr; 67,000 yr; 40-60,000 yr; 40,000 yr; 20,000 yr; 13,000 yr]

Comparison of mitochondrial DNAs: the „Out of Africa” hypothesis wins over the „Multiregional Origin” hypothesis.

Page 35: Genome biology

The genetic base of the phenotypic variability: coding vs regulatory sequences

1. Genes and proteins - functional variance

2. The theory of neutrality

3. Intraspecific variability in the regulatory regions

4. Variance in the coding region of the regulatory genes

Page 36: Genome biology

Genes and proteins - functional variance

Traditional concept

Gene products that differ in efficiency and function are responsible for the phenotypic variance

Page 37: Genome biology

The theory of neutrality

The gene variants (alleles) are functionally the same!

Motoo Kimura

- The majority of nucleotide substitutions in genes do not cause amino acid changes (synonymous changes)

- The vast majority of amino acid substitutions do not change the function of the protein (chemically similar amino acid substitutions: conservative change)

- The function of the genes did not change through evolution; gene variability does not cause phenotypic variability. This is true for most genes, except for defective genes
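Whether a single-codon substitution is synonymous can be checked directly against the standard genetic code. The following is a minimal sketch: the function name is hypothetical, and it only separates synonymous from non-synonymous changes (it does not judge whether an amino acid change is chemically conservative).

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def classify_substitution(codon_before, codon_after):
    """Label a codon change as synonymous or non-synonymous."""
    aa_before = CODON_TABLE[codon_before]
    aa_after = CODON_TABLE[codon_after]
    if aa_before == aa_after:
        return "synonymous"
    return f"non-synonymous ({aa_before} -> {aa_after})"

print(classify_substitution("GCA", "GCG"))  # third-position change within the alanine codons
print(classify_substitution("CAA", "GAA"))  # first-position change, glutamine to glutamate
```

Most third-position changes leave the amino acid untouched, which is the first point of the neutrality argument above.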

Page 38: Genome biology

Intraspecific variability in the regulatory regions

Of Mice and Man

Significant polymorphism in the regulatory sequences
- variability: expression level and tissue specificity

Page 39: Genome biology

Intraspecific variability in the regulatory regions

[Figure: gene A in individuals 1-4, each drawn with its enhancers and promoter (P)]

The gene regulation theory: variability in gene expression

Variants of gene „A” with identical function but differing control regions

Page 40: Genome biology

Revival of the gene function theory (Harold Garner and John W. Fondon):

Variability in the sequence of transcription factors

Coding variance: number of triplet repeats (number of glutamine and alanine repeats)

runx-2 gene

. . . CAACAAGCACAAGCAGCA . . .
translates to Q Q A Q A A

Q: glutamine; A: alanine

Bull terrier: 1931: Q19A14; 1976: Q19A13

Intraspecific variability in the regulatory regions
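The triplet-repeat fragment on this slide can be translated codon by codon to see the glutamine/alanine pattern. A small sketch, with a deliberately reduced codon table (only the glutamine and alanine codons) and hypothetical function names:

```python
# Reduced codon table: only the glutamine (Q) and alanine (A) codons of the standard code.
CODON_TO_AA = {
    "CAA": "Q", "CAG": "Q",
    "GCA": "A", "GCC": "A", "GCG": "A", "GCT": "A",
}

def translate(dna):
    """Translate a DNA fragment in frame 1; codons outside the reduced table become '?'."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TO_AA.get(dna[i:i + 3], "?") for i in range(0, usable, 3))

fragment = "CAACAAGCACAAGCAGCA"  # the sequence shown on the slide
peptide = translate(fragment)
counts = {aa: peptide.count(aa) for aa in "QA"}

print(peptide)
print(counts)
```

Applied to a full repeat region, the same counts would give allele labels of the form used on the slide (for example Q19A14).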

Page 41: Genome biology

2c. The evolution of the DNA

- interspecific variability

Differences between genomes

The chimpanzee in us

Page 42: Genome biology

The differences between genomes - Genome size

Examples     Size (bp)      Length (m)
HIV-1        9.8 x 10^3     10^-6
λ phage      4.8 x 10^4     10^-5
T4 phage     1.7 x 10^5     10^-4
E. coli      4.6 x 10^6     10^-3
Drosophila   1.8 x 10^8     10^-1
Mouse        3.5 x 10^9     1
Dog          3.4 x 10^9     1
Horse        3.3 x 10^9     1
Human        3.4 x 10^9     1
Corn         5.0 x 10^9     1
Lily         3.6 x 10^10    10
Amoeba       2.9 x 10^11    100

The genome size differs between species and does not correlate with phenotypic complexity
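The length column follows from the genome size if each base pair of B-form DNA is taken to add about 0.34 nm to the double helix. The sketch below assumes that value (it is not stated on the slide) and uses a few of the sizes from the table.

```python
RISE_PER_BP_M = 0.34e-9  # assumed rise per base pair of B-DNA, in metres

# Genome sizes in base pairs, copied from the slide.
genome_sizes_bp = {
    "HIV-1": 9.8e3,
    "E. coli": 4.6e6,
    "Drosophila": 1.8e8,
    "Human": 3.4e9,
    "Amoeba": 2.9e11,
}

def contour_length_m(n_bp):
    """Length of the stretched-out double helix in metres."""
    return n_bp * RISE_PER_BP_M

for species, size in genome_sizes_bp.items():
    print(f"{species}: {contour_length_m(size):.2e} m")
```

The results agree with the order-of-magnitude lengths given in the table.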

Page 43: Genome biology

The differences between genomes - Number of genes compared with the total number of cells

             Number of genes    Number of cells
Human        20 - 25,000        10^14
Fruit fly    13,500             -
C. elegans   19,100             959

Page 44: Genome biology

Differences between genomes - Structural and functional differences

DNA similarity:
Human - chicken: 60%
Human - mouse: 88%
Human - chimpanzee: >98%

Same proteins:
Ascidian - human: 80%
Human - fruit fly: 40%
Fruit fly - human: 61%
C. elegans - human: 43%
Yeast - human: 46%

Same domains:
Human - fruit fly, C. elegans: >90%
Exon shuffling in human, 2x as many genes

„Copy and paste” rather than „cut and paste” mechanism

Page 45: Genome biology

The chimp in us

Human genome - chimp genome

1. Chromosome number: 23 vs. 24 (pairs)

2. Genetic alterations:
- Point mutations (complete genome): 1.23%
- Point mutations (coding sequences): 1%
- Duplications: 2.7%
- Insertions, deletions: 3.0%
- Several chromosomal rearrangements

3. More Alu and L1 sequences in human (short repetitive sequences)

4 theories about the differences:

1. Evolution of regulatory proteins
a. FOXP2 gene: mutation causes disorders of speech and articulation; 2 amino acid differences between human and chimp
b. ASPM and MCPH1 genes: mutation causes microcephaly; their expression levels are higher in neural precursor cells

2. Evolution of regulatory sequences
A general increase in the expression level of genes in the brain; it is difficult to detect at the level of DNA

3. Retention of juvenile characters
Lack of body fur, higher brain/body weight ratio, the shape of our skull is similar to that of a young chimp

4. Neoteny theory
Compared to the rest of the body, the development of the head accelerated

Page 46: Genome biology

3. The function of the genome

3a. Gene expression

3b. Genome expression

3c. An astonishing RNA world

Page 47: Genome biology

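Gene expression

[Figure: in the nucleus, RNA polymerase II (pol-II) transcribes DNA into pre-mRNA, which is processed into mRNA with a cap and a polyA tail; in the cytoplasm the ribosome translates the mRNA into protein; ER and Golgi are also shown]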

Page 48: Genome biology

Alternative gene usage - instrument of complexity

Alternative...

- ...promoter usage
- ...splicing
- ...polyadenylation
- ...gene expression

Page 49: Genome biology

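Alternative promoter usage - epidermal cells

[Figure: a gene with Promoter 1 (P1), Promoter 2 (P2), exons Ex1-Ex3, introns I1-I2 and terminator (T); the polymerase (pol) transcribes the coding region into pre-mRNA, which is processed into mRNA and translated by the ribosome into protein]

Ex: exon; I: intron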

Page 50: Genome biology

Alternative promoter usage - neuronal cells

[Figure: the same gene with Promoter 1 (P1), Promoter 2 (P2), exons Ex1-Ex3, introns I1-I2 and terminator (T); coding region, pre-mRNA, mRNA and protein are shown for neuronal cells]

Ex: exon; I: intron

Page 51: Genome biology

Alternative RNA splicing

[Figure, three panels, each showing the DNA (promoter P, exons E1-E4) and the resulting mRNAs:
- Alternative splicing: E1-E2-E4 or E1-E3-E4
- Alternative splicing + promoter usage (promoters P1, P2): E1-E3-E4 or E2-E3-E4
- Alternative splicing + polyadenylation (polyA sites PA1, PA2)]

Page 52: Genome biology

Cell-specific gene expression

[Figure: gene 1 (P1, Ex1-Ex3, I1-I2, T) with a skin cell-specific enhancer and gene 2 (P2, Ex2-Ex4, I2-I3, T) with a neuron-specific enhancer, drawn once for nerve cells and once for skin cells; transcription factors (TF) are shown at the enhancers]

Cell-specific gene expression: histone pattern!

1. Type of transcription factor expressed in a cell

2. The accessibility of the regulatory region of a gene

Page 53: Genome biology

The expression of the genome

Expression changes for a lot of genes:

1. Different tissues
2. Between humans
3. Diseased - healthy

Transcriptome

Page 54: Genome biology

Astounding RNA world - Two discoveries

1. More than half of the genome is transcribed

- The part of the genome coding for ncRNA is 50 times longer than the part that encodes coding RNA (genome tiling arrays, cDNA cloning)

2. There are regulatory antisense RNAs everywhere

- a significant portion of the genes are under the control of trans-asRNAs (miRNAs): one miRNA can regulate several genes, and one gene can be regulated by several miRNAs (transcriptome analysis, EST analysis)

Page 55: Genome biology

An astonishing RNA world - Two surprises

1. New functions of RNAs

- Traditional function: information transmission from DNA to proteins (mRNA), and contribution to this process (tRNA, rRNA)

- New functions:
● RNAs are independent information carriers
● RNAs regulate the expression of genetic information

2. Genetic regulation

- Traditional theory: genetic regulation is achieved by the interaction between transcription factors and the cis-regulatory elements (promoter, enhancer, silencer)

- Present theory: essential role of regulatory ncRNAs

Page 56: Genome biology

Coding – non coding

Complete genome
Species      Protein-coding genes   Genome size (Mb)   Coding (Mb)   Coding (%)   Non-coding (Mb)   Non-coding (%)   Non-coding : coding
Human        20-25,000              2851               34            1.2          1619              57               47.5 : 1
Mouse        20-25,000              2490               31            1.3          1339              54               42.5 : 1
Fruit fly    13,500                 120                22            18           53                44               2.4 : 1
C. elegans   19,000                 100                26            26           33                33               1.3 : 1

Non-repetitive sequences
Species      Genome size (Mb)   Coding (Mb)   Coding (%)   Non-coding (Mb)   Non-coding (%)   Non-coding : coding
Human        1455               33            2.3          867               60               26.1 : 1
Mouse        1422               29            2.0          811               57               28.5 : 1
Fruit fly    109                21            20           48                44               2.2 : 1
C. elegans   86                 25            29           26                33               1.1 : 1
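The last column of the table is simply the non-coding length divided by the coding length. A short sketch that recomputes it from the complete-genome rows (values in Mb copied from the slide; the function name is hypothetical):

```python
# (coding Mb, non-coding Mb) for the complete genomes, as given in the table.
complete_genome = {
    "Human": (34, 1619),
    "Mouse": (31, 1339),
    "Fruit fly": (22, 53),
    "C. elegans": (26, 33),
}

def noncoding_to_coding(coding_mb, noncoding_mb):
    """Ratio of non-coding to coding sequence length."""
    return noncoding_mb / coding_mb

ratios = {species: noncoding_to_coding(*lengths) for species, lengths in complete_genome.items()}
for species, ratio in ratios.items():
    print(f"{species}: {ratio:.1f} : 1")
```

The recomputed ratios differ slightly from the printed ones in some rows, presumably because the Mb values on the slide are rounded.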

Page 57: Genome biology

Types of RNAs

RNAs: coding RNA and non-coding RNA

Groups shown in the figure: Transcription RNA, Regulatory RNA

mRNA: messenger RNA (coding)
tRNA: transfer RNA
rRNA: ribosomal RNA
siRNA: small interfering RNA
aoRNA: antisense overlapping RNA
miRNA: micro RNA
snRNA: small nuclear RNA
snoRNA: small nucleolar RNA

Page 58: Genome biology

a. trans-antisense RNAs: imperfect homology

b. cis-antisense RNAs: perfect homology

gene

antisense RNA

cis trans

DNA

Regulating antisense RNAs

Page 59: Genome biology

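Micro RNAs (trans-antisense RNAs)

Discovered in 2000

[Figure: miRNA biogenesis, steps 1-5. In the nucleus, transcription gives the pri-miRNA, which Drosha cuts to pre-miRNA; exportin-5 carries the pre-miRNA to the cytoplasm, where DICER produces the mature miRNA; the miRNA is loaded into RISC and the target mRNA is blocked]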

Page 60: Genome biology

The mechanisms of miRNA action

[Figure: pre-miRNA is processed to miRNA, which leads either to degradation of the target mRNA or to a translation block]

The function of miRNAs: ontogenesis (timing), apoptosis, cell proliferation, oncogenesis

Page 61: Genome biology

cis-antisense RNAs - overlapping antisense RNAs

[Figure: DNA with the coding strand and the non-coding (antisense) strand, the mRNA and the overlapping RNAs, with 5’ and 3’ ends marked. Three arrangements of overlap: complete overlap, convergent, divergent]

Page 62: Genome biology

Overlapping RNAs (cis-antisense RNAs)

FUNCTION

1. Blocking of translation elongation

2. Blocking of translation initiation

3. RNA interference (?)

Factor: translation, regulation of half-life

[Figure: sense-antisense RNA pairs with 5’ and 3’ ends marked; labels: ribosome, RISC, RNase, Factor]

Page 63: Genome biology

4. Tools of Genomics

4a. The structure of the DNA
1. cloning – genome library construction
2. sequencing

4b. The expression of the DNA
1. cDNA library, EST library
2. DNA microarrays
3. Protein arrays

4c. Bioinformatics

Page 64: Genome biology

Gene libraries

Genome library: a collection of clones in which every piece of the genome of a particular organism can be found.

Usage: sequencing (genome projects), isolation of genes.

cDNA library (cDNA: copy DNA): the cDNA library contains a cDNA copy of each mRNA of an organism (tissue or cell type). It represents the transcriptome.

Usage: gene structure determination, isolation of cDNAs (intronless genes).

EST library (‘Expressed Sequence Tag’): an EST is either the 5’ or the 3’ end of a cDNA. The library also represents the transcriptome of an organism (tissue or cell type); however, because of the smaller clone sizes, it can be handled and sequenced more easily and faster.

Usage: determination of the transcription of a cell or tissue type (which genes are expressed in this tissue?).

EST libraries were used for the rapid sequencing of the „active” genome.

Page 65: Genome biology

Vectors

Several different vectors can be used for the construction of gene libraries, such as:

Plasmid
Bacteriophage (lambda)
Cosmid
BAC (Bacterial Artificial Chromosome)
YAC (Yeast Artificial Chromosome)

Vector          Maximal clone size   Number of clones required for a complete library
Plasmid         10 kb                10^7
Bacteriophage   20 kb                5 x 10^5
Cosmid          45 kb                2 x 10^5
BAC             500 kb               5 x 10^4
YAC             1 Mb                 10^4
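How the required number of clones shrinks with insert size can be estimated with the Clarke-Carbon formula, N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the genome size and P is the desired probability that a given sequence is in the library. This formula is not on the slide; the sketch below brings it in as an assumption, together with P = 0.99 and the 3.4 x 10^9 bp human genome size quoted earlier, so its output is not expected to reproduce the table's numbers exactly.

```python
import math

HUMAN_GENOME_BP = 3.4e9  # genome size taken from the genome-size slide

# Maximal clone sizes from the table, in base pairs.
max_insert_bp = {
    "Plasmid": 10_000,
    "Bacteriophage": 20_000,
    "Cosmid": 45_000,
    "BAC": 500_000,
    "YAC": 1_000_000,
}

def clones_required(genome_bp, insert_bp, probability=0.99):
    """Clarke-Carbon estimate of the library size needed to contain any given sequence."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log1p(-fraction))

library_sizes = {vector: clones_required(HUMAN_GENOME_BP, size) for vector, size in max_insert_bp.items()}
for vector, n in library_sizes.items():
    print(f"{vector}: about {n:.1e} clones")
```

The trend is the same as in the table: the larger the insert a vector can carry, the fewer clones a complete library needs.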

Page 66: Genome biology

Structure of vectors

Plasmid: replication origin, antibiotic resistance gene, multiple cloning site with many unique restriction sites.

BAC: Bacterial Artificial Chromosome. In fact, it is a plasmid with a replication origin that can be found on native giant plasmids (like the F-factor).

Page 67: Genome biology

Construction of genome libraries

Fragmenting the DNA

The aim is to produce overlapping DNA fragments of an appropriate length for the capacity of the vector. We can use a restriction endonuclease with a 4 bp recognition site, or we can shear the DNA physically (e.g. by pressing it through a thin capillary). The animation shows how the overlapping DNA fragments are generated by partial digestion with an endonuclease.
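Partial digestion gives overlapping fragments because, in different copies of the DNA, different subsets of the sites are cut. The sketch below enumerates every fragment that can arise between any two cut points; the GATC site, the toy sequence and the function names are illustrative assumptions, and the cut is placed at the start of the site for simplicity.

```python
def cut_points(seq, site):
    """0-based start positions of every occurrence of the recognition site."""
    points, start = [], seq.find(site)
    while start != -1:
        points.append(start)
        start = seq.find(site, start + 1)
    return points

def partial_digest_fragments(seq, site):
    """All fragments obtainable when any subset of the sites is cut."""
    bounds = [0] + [p for p in cut_points(seq, site) if p != 0] + [len(seq)]
    fragments = {seq[a:b] for i, a in enumerate(bounds) for b in bounds[i + 1:]}
    return sorted(fragments, key=lambda fragment: (len(fragment), fragment))

# In random sequence a 4 bp site is expected about once every 4**4 base pairs.
expected_spacing_bp = 4 ** 4

toy_dna = "AAGATCTTTGATCCC"  # hypothetical sequence with two GATC sites
fragments = partial_digest_fragments(toy_dna, "GATC")
print(expected_spacing_bp)
print(fragments)
```

A complete digest would give only the three shortest non-overlapping pieces; the longer fragments, which span an uncut site, are the ones that overlap their neighbours.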

Page 68: Genome biology

Constructing genome libraries

1.: partial digestion with restriction endonuclease
2.: ligation into plasmid vector

[Figure: fragments 1-6 produced by partial digestion, each ligated into a plasmid vector]

Page 69: Genome biology

Constructing genome libraries

2.: ligation into plasmid vector
3.: transforming into E. coli

[Figure: plasmids carrying fragments 1-6, each transformed into E. coli]

Page 70: Genome biology

Screening of gene libraries

Hybridization

probe:  5’ G T G C A C 3’
target: 3’ C A C G T G 5’

[Figure: the probe base-pairs with the complementary target strand]
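Hybridization relies on base pairing between antiparallel strands, so a probe binds wherever the target contains the probe's reverse complement. A minimal sketch with hypothetical function names and a made-up target sequence:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def probe_binding_sites(target, probe):
    """0-based positions on the target strand where the probe pairs perfectly."""
    site = reverse_complement(probe)
    return [i for i in range(len(target) - len(site) + 1) if target[i:i + len(site)] == site]

probe = "GTGCAC"        # the probe from the slide
target = "TTGTGCACAA"   # hypothetical target strand, written 5' to 3'

print(reverse_complement(probe))
print(probe_binding_sites(target, probe))
```

The probe on the slide happens to be its own reverse complement, which is why the two strands of the figure read the same in the 5’ to 3’ direction.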

Page 71: Genome biology

Southern blot

Agarose gel electrophoresis

Transfer

Nitrocellulose or plastic membrane

Hybridization with a labeled DNA probe

Page 72: Genome biology

Screening of gene libraries

Colony hybridization

Lysis

Transfer onto membrane

Hybridization with a labeled probe

Page 73: Genome biology

cDNA libraries

How can we produce cDNA?

[Figure: mRNA 5’ ... AAAAAAAAA 3’; cDNA first strand 3’ ... TTTTTTTTT 5’, extended with a CCCCC linker at its 3’ end; cDNA second strand primed with 5’ GGGGG]

1. RNA (mRNA) purification: we can use a total RNA or an mRNA extract

2. Reverse transcription: using oligo dT primers and reverse transcriptase (RNA-dependent DNA polymerase), the first strand of cDNA is synthesized

3. RNase treatment

4. Linker synthesis: terminal deoxynucleotidyl transferase (a DNA polymerase that does not require a template) adds the C linker to the 3’ end

5. Second strand synthesis: oligo dG primers are added and DNA polymerase synthesizes the second strand of cDNA.
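The two synthesis steps can be mimicked on strings: the first strand is the DNA complement of the mRNA, and the second strand is the complement of the first. This is a simplified sketch with hypothetical function names; it checks for a polyA tail (for oligo dT priming) but leaves out the C linker and the oligo dG primer.

```python
RNA_TO_DNA = str.maketrans("ACGU", "TGCA")
DNA_TO_DNA = str.maketrans("ACGT", "TGCA")

def first_strand_cdna(mrna, min_tail=5):
    """DNA strand complementary to the mRNA, written 5' to 3'; requires a polyA tail."""
    if not mrna.endswith("A" * min_tail):
        raise ValueError("no polyA tail: the oligo dT primer cannot anneal")
    return mrna.translate(RNA_TO_DNA)[::-1]

def second_strand_cdna(first_strand):
    """DNA strand complementary to the first strand, written 5' to 3'."""
    return first_strand.translate(DNA_TO_DNA)[::-1]

mrna = "AUGGCUAAAAAA"  # hypothetical, very short mRNA
first = first_strand_cdna(mrna)
second = second_strand_cdna(first)
print(first)
print(second)
```

The second strand has the same sequence as the mRNA with T in place of U, which is why a cDNA clone represents the intronless form of the gene.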

Page 74: Genome biology

The DNA chip (microarray)

It is used for measuring the expression pattern of a large number of genes at the same time. In effect, the DNA chip is an inverted Southern blot: the known probes are covalently bound to the glass surface. One chip contains 6,000-10,000 gene-specific probes. There are:

- cDNA,
- oligonucleotide and
- sequencing chips.

Page 75: Genome biology

1. Preparing the chip:
- printing
- in situ synthesis

2. Collection of tissue samples (control, treated)

3. RNA purification

4. Reverse transcription (fluorescent labeling)

5. Hybridization

6. Reading

Page 76: Genome biology

The DNA chip

Reading and evaluation of data
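A common way to evaluate a two-sample chip is to compare the treated and control fluorescence of each spot as a log2 ratio. The slide does not say which method was used, so the sketch below is an assumption throughout: the gene names, intensities, pseudocount and two-fold threshold are all invented for illustration.

```python
import math

def log2_ratio(treated, control, pseudocount=1.0):
    """log2 of treated/control intensity; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))

def call_change(treated, control, threshold=1.0):
    """Label a spot as up, down or unchanged using a log2 threshold (1.0 = two-fold)."""
    ratio = log2_ratio(treated, control)
    if ratio >= threshold:
        return "up"
    if ratio <= -threshold:
        return "down"
    return "unchanged"

# Hypothetical spot intensities: (treated, control).
spots = {"geneA": (399, 99), "geneB": (49, 199), "geneC": (120, 110)}
calls = {gene: call_change(*intensities) for gene, intensities in spots.items()}
print(calls)
```

In practice the raw intensities would also be background-corrected and normalized between the two channels before ratios are taken.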

Page 77: Genome biology

Protein chip

How does it work?

Samples: control, treated

Protein purification

Labeling

Immune reaction

Detection

Page 78: Genome biology

Protein chip

Labeling strategies

1. Direct labeling

2. Sandwich method