University of Auckland - Bioinformatics Institute

Base calling project

Gabriele Härtinger

Supervisors: David Bryant, Jari Kaipio

10/12/09

Guide

Introduction: PCR, Sanger method

Phred: Overview, Algorithms

Base calling project: Non-negative least-squares, Matlab

Problems and outlook

PCR

Sanger method

source: http://departments.oxy.edu/biology/Stillman/bi221/092200/lecture_notes.htm

source: http://dnasequencing.wordpress.com/

Chromatogram

Phred

- standard program for base calling

- reads DNA sequencer trace data

- calls bases

- assigns quality values to the bases

- writes the base calls and quality values to output files
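
The quality values mentioned above are conventionally expressed on a logarithmic scale, Q = -10 log10(P), where P is the estimated probability that the base call is wrong. The slides do not state this formula, so the sketch below is an illustration of the usual definition, and the function names are hypothetical.

```python
import math


def error_probability_to_quality(p_error):
    """Convert an estimated error probability into a Phred-style quality value."""
    return -10.0 * math.log10(p_error)


def quality_to_error_probability(quality):
    """Convert a Phred-style quality value back into an error probability."""
    return 10.0 ** (-quality / 10.0)


# An error probability of 1 in 1000 corresponds to a quality value of about 30.
print(error_probability_to_quality(0.001))
print(quality_to_error_probability(20))
```

On this scale each step of 10 in the quality value corresponds to a tenfold drop in the estimated error probability.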

Phred - Algorithms

- Phred examines the four base traces with simple Fourier methods.

- It checks the region surrounding each point in the data set in order to predict a series of evenly spaced peak locations.

- It examines each trace to find the centers of the actual, or observed, peaks.

- Each of the four traces is checked separately, because many peaks overlap.

- A dynamic programming algorithm matches the predicted peak locations with the observed peaks. Some peaks are excluded and some are split.

- Unmatched observed peaks are compared with the predicted peaks and inserted if a match is found.
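
The peak-finding and matching steps listed above can be illustrated with a much simplified sketch. It is not Phred's implementation: observed peaks are taken to be plain local maxima of one trace, the predicted locations are assumed to be exactly evenly spaced, and a greedy nearest-neighbour match stands in for the dynamic programming alignment. All function names and the `tolerance` parameter are hypothetical.

```python
def observed_peak_centres(trace):
    """Return the indices of local maxima in a single trace."""
    return [i for i in range(1, len(trace) - 1)
            if trace[i - 1] < trace[i] >= trace[i + 1]]


def predicted_locations(start, spacing, count):
    """Return evenly spaced predicted peak locations (assumed constant spacing)."""
    return [start + k * spacing for k in range(count)]


def match_nearest(predicted, observed, tolerance):
    """Greedily pair each predicted location with the nearest unused observed peak.

    This is a stand-in for the dynamic programming match, not a reimplementation.
    Returns the list of (predicted, observed or None) pairs and the observed
    peaks that were left unmatched.
    """
    unused = list(observed)
    matches = []
    for p in predicted:
        best = min(unused, key=lambda o: abs(o - p)) if unused else None
        if best is not None and abs(best - p) <= tolerance:
            matches.append((p, best))
            unused.remove(best)
        else:
            matches.append((p, None))
    return matches, unused


trace = [0, 1, 3, 1, 0, 2, 5, 2, 0, 1, 4, 1, 0]
observed = observed_peak_centres(trace)
predicted = predicted_locations(start=2, spacing=4, count=4)
matches, unmatched = match_nearest(predicted, observed, tolerance=1)
print(observed)
print(matches)
print(unmatched)
```

A predicted location paired with `None` marks a position where no observed peak was found within the tolerance; the leftover observed peaks are the candidates for the final insertion step.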

Base calling

Project: base calling in another way

Deconvolve the peaks with the non-negative least-squares method.
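
To make the deconvolution idea concrete, the sketch below builds a forward model in which the measured trace y is a sum of shifted peak shapes weighted by a non-negative signal x, so that y = A x. The Gaussian peak shape, its width, and the function names are assumptions made for illustration; the slides do not specify the peak model used in the project.

```python
import math


def peak_matrix(n_samples, width):
    """Build A so that column j holds a Gaussian peak centred at sample j.

    The Gaussian shape is an assumed peak model, chosen only for illustration.
    """
    return [[math.exp(-0.5 * ((i - j) / width) ** 2) for j in range(n_samples)]
            for i in range(n_samples)]


def simulate_trace(A, x):
    """Compute the noise-free trace y = A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


# Two nearby bases with different signal strengths produce overlapping peaks.
x_true = [0.0] * 20
x_true[5] = 1.0
x_true[8] = 0.7

A = peak_matrix(20, width=2.0)
y = simulate_trace(A, x_true)
print([round(v, 3) for v in y])
```

Deconvolution is the inverse task: given y and A, recover a spiky, non-negative x whose entries mark the base positions.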

Non-negative least-squares

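- least-squares

\[ y = A x + e \;\Rightarrow\; e = y - A x \]

\[ \min_{x} \|e\|^2 = \min_{x} \|y - A x\|^2, \qquad \text{equivalently minimize } \frac{1}{N} \sum_{k=1}^{N} e_k^2 \]

- non-negative least-squares method

\[ \min_{x \ge 0} \|e\|^2 = \min_{x \ge 0} \|y - A x\|^2, \qquad \text{equivalently minimize } \frac{1}{N} \sum_{k=1}^{N} e_k^2 \text{ subject to } x \ge 0 \]

The factor 1/N does not change the minimizer.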
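
A minimal sketch of the constrained problem above, using projected gradient descent: take a gradient step on ||y - A x||^2 and then clip negative entries of x to zero. This is one simple way to solve NNLS and is not necessarily the solver used in the project; the function names, the step-size choice and the iteration count are assumptions.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def transpose(A):
    """Return the transpose of A as a list of rows."""
    return [list(col) for col in zip(*A)]


def nnls_projected_gradient(A, y, iterations=5000):
    """Approximately minimize ||y - A x||^2 subject to x >= 0."""
    At = transpose(A)
    # The squared Frobenius norm of A bounds the largest eigenvalue of A^T A,
    # so 1 / lipschitz is a safe step size for the objective 0.5 * ||y - A x||^2.
    lipschitz = sum(a * a for row in A for a in row)
    x = [0.0] * len(At)
    for _ in range(iterations):
        residual = [r - yi for r, yi in zip(matvec(A, x), y)]  # A x - y
        gradient = matvec(At, residual)                        # A^T (A x - y)
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, xj - g / lipschitz) for xj, g in zip(x, gradient)]
    return x


# Unconstrained least-squares would give x = [1, -1] here; NNLS cannot go negative.
A = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, -1.0]
print(nnls_projected_gradient(A, y))
```

The projection step is what distinguishes this from ordinary least-squares: any component that the data would push below zero is held at zero instead.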

Matlab - LS

[Figure: x-axis 0 to 120, y-axis −0.2 to 1.2. Blue: true x. Green: chromatogram. Red: regularized LS.]

Matlab - NNLS

[Figure: x-axis 0 to 120, y-axis −0.2 to 1.2. Blue: NNLS. Green: chromatogram. Red: regularized LS.]

Problem

[Figure: x-axis 0 to 150, y-axis 0 to 4500. Title: Scaling of A: 1]

Outlook

[Figure: x-axis 0 to 150, y-axis 0 to 8000. Title: Scaling of A: 0.8]

Acknowledgement

A big THANKS to Allen, David, Jari, Peter

and to all of the Bioinformatics Institute.
