Gene expression analysis

BS312 Bioinformatics

Antonio Marco

School of Biological Sciences, University of Essex

10-Nov-15

Outline

1 Gene Expression

2 Measuring RNA expression levels

3 Data processing

4 Visualizing Gene Expression Patterns

5 Applications

6 Practical overview

The ‘central dogma’ of molecular biology

DNA makes RNA makes Protein

Francis Crick (1956)

“I just didn’t know what dogma meant”

A more realistic scenario of gene expression

Analyzing Gene Expression: overview

2 Measuring RNA expression levels

Northern blot

The ‘IS IT THERE?’ approach

Microarrays

The ‘ARE ANY OF THESE THERE?’ approach

RNA-Seq

The ‘WHAT’S IN THERE?’ approach

DNA sequencing is done in small fragments

RNA-Seq is a quantitative technique

3 Data processing

Assessing data quality

• Ideally, a sequencer would always report the true sequence of each read

• In reality, reads often contain errors

• The good news is that sequencers report how confident they are in each base call

Phred Score and FASTQ format

• A Phred score encodes the probability that a base call is wrong

• The FASTQ format stores one Phred score per base, each written as a single character

@SEQUENCE NAME

CATGGCTAGCTGCTAGCTAGCTAGACATTCATCGAAATCGCTAGCCTAGCTACGA

+

!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>C%%%
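
The quality line above can be turned back into numbers with a few lines of Python. This is only a minimal sketch: it assumes the common Phred+33 (Sanger) encoding, in which the score is the character's ASCII code minus 33, and it uses the standard definition Q = -10 log10(P). The function names are made up for this illustration.

```python
def phred_scores(quality_line, offset=33):
    """Convert a FASTQ quality string into integer Phred scores.

    Assumes Phred+33 encoding (offset=33); older files may use other offsets.
    """
    return [ord(ch) - offset for ch in quality_line]


def error_probability(q):
    """Invert Q = -10 * log10(P) to get the error probability P."""
    return 10 ** (-q / 10)


# Quality line from the example record (plain ASCII apostrophes).
quality = "!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>C%%%"

scores = phred_scores(quality)
probabilities = [error_probability(q) for q in scores]

print(scores[:5])  # [0, 6, 6, 9, 7]
print(f"mean Phred score: {sum(scores) / len(scores):.1f}")
print(f"worst base-call error probability: {max(probabilities):.2f}")
```

A score of 20 corresponds to a 1 in 100 chance of a wrong base call, and a score of 30 to 1 in 1000, so low scores such as those at the start of this read flag bases that should not be trusted.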

Read quality

Normalization: principles

• Bilbo believes his sword is big, five ‘hands’ in length

• Gandalf thinks Bilbo’s sword is rather short

• Who’s right?

Normalization: RPKM
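
RPKM stands for Reads Per Kilobase of transcript per Million mapped reads. It rescales a raw read count by gene length and by sequencing depth, so that counts measured with different 'rulers' become comparable. A minimal sketch of the standard formula follows; the gene names and numbers are hypothetical and chosen only for illustration.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = reads * 10^9 / (gene length in bp * total mapped reads)
    """
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical library with 20 million mapped reads.
total_reads = 20_000_000

# Hypothetical genes: name -> (reads mapped to the gene, gene length in bp).
genes = {
    "geneA": (500, 2_000),
    "geneB": (500, 500),
}

for name, (count, length) in genes.items():
    print(name, rpkm(count, length, total_reads))
# geneA 12.5
# geneB 50.0
```

Both genes collect the same number of reads, but geneB is four times shorter, so its RPKM is four times higher.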

Normalization: Smear plots

• ‘Smear plots’ show the average number of reads (X axis) against the fold difference between samples (Y axis)

• The average log fold difference (red line) should be about 0

• Normalization does this!
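
The coordinates of a smear plot can be sketched in a few lines. The code below is an illustration under stated assumptions, not the method used in the lecture: both axes are on a log2 scale, a pseudocount of 1 avoids taking the log of zero, and normalization is a simple rescaling of each sample to the same total read count. The counts are made up, with sample B sequenced exactly twice as deeply as sample A.

```python
import math


def smear_coordinates(counts_a, counts_b, pseudocount=1.0):
    """Return one (X, Y) point per gene.

    X = average log2 count across the two samples
    Y = log2 fold difference (sample B relative to sample A)
    """
    points = []
    for a, b in zip(counts_a, counts_b):
        log_a = math.log2(a + pseudocount)
        log_b = math.log2(b + pseudocount)
        points.append(((log_a + log_b) / 2, log_b - log_a))
    return points


def scale_to_total(counts, target_total):
    """Rescale counts so that they sum to target_total (assumed scheme)."""
    factor = target_total / sum(counts)
    return [c * factor for c in counts]


def mean_fold_difference(points):
    """Average Y value, i.e. the height of the 'red line'."""
    return sum(y for _, y in points) / len(points)


# Hypothetical counts for four genes.
sample_a = [10, 100, 1000, 50]
sample_b = [20, 200, 2000, 100]

raw = smear_coordinates(sample_a, sample_b)
target = 1_000_000
normalized = smear_coordinates(
    scale_to_total(sample_a, target), scale_to_total(sample_b, target)
)

print(f"mean log2 fold difference, raw:        {mean_fold_difference(raw):.2f}")
print(f"mean log2 fold difference, normalized: {mean_fold_difference(normalized):.2f}")
```

Before normalization every gene appears roughly two-fold higher in sample B simply because that sample has more reads; after rescaling, the average difference falls to about 0.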

4 Visualizing Gene Expression Patterns

Hierarchical clustering

• In previous lectures you learnt that phylogenetic trees reflect sequence similarity

• Likewise, trees can be built to reflect expression similarity

• The most common algorithm is UPGMA
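
UPGMA repeatedly merges the two closest clusters, where the distance between two clusters is the average of all pairwise distances between their members. The sketch below shows the idea in plain Python. It assumes Euclidean distance between expression profiles (other measures, such as correlation-based distances, are also used), and the gene names and expression values are hypothetical.

```python
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two expression profiles."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def upgma(profiles):
    """Cluster expression profiles with UPGMA (average linkage).

    profiles: dict mapping gene name -> list of expression values.
    Returns the tree as nested tuples of gene names.
    """
    # Pairwise distances between the original genes.
    base = {
        frozenset((a, b)): euclidean(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }
    # Each cluster is keyed by its (sub)tree and stores its member genes.
    clusters = {name: [name] for name in profiles}

    def average_distance(x, y):
        members_x, members_y = clusters[x], clusters[y]
        total = sum(base[frozenset((a, b))] for a in members_x for b in members_y)
        return total / (len(members_x) * len(members_y))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        x, y = min(combinations(clusters, 2), key=lambda pair: average_distance(*pair))
        merged_members = clusters.pop(x) + clusters.pop(y)
        clusters[(x, y)] = merged_members

    return next(iter(clusters))


# Hypothetical expression values for four genes across three samples.
profiles = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [1.1, 2.1, 2.9],
    "g3": [8.0, 1.0, 0.5],
    "g4": [7.5, 1.2, 0.4],
}

tree = upgma(profiles)
print(tree)  # (('g1', 'g2'), ('g3', 'g4'))
```

Genes with similar expression profiles end up as neighbours in the tree, in the same way that similar sequences end up as neighbours in a phylogenetic tree.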

HC and Heatmaps

5 Applications

Time-course series

Drosophila (fruit fly) development

Adapted from: Graveley et al. (2011) Nature 471:473

Cancer

MicroRNAs in different cancer cells

Volinia et al. (2012) PNAS 109:3024

Reconstruct transcripts

6 Practical overview

Characterize the transcription profile of the Down’s Syndrome Critical Region

• Check the quality of the reads

• Map reads to a reference genome (human)

• Assemble reads into transcripts

• Visualize annotated transcripts

Recommended readings

Gene expression analysis:

• Pevsner J (2009) Bioinformatics and Functional Genomics. John Wiley & Sons. Chapter 9

• Mutz K-O et al. (2013) Transcriptome analysis using next-generation sequencing. Curr Op Biotech 24:22-30

Web resources:

• RNA-Seq at http://en.wikipedia.org/wiki/RNA-Seq
