TRANSCRIPT
Introduction | Phred | Base calling project | Problems and outlook
University of Auckland - Bioinformatics Institute
Base calling project
Gabriele Härtinger
Supervisors: David Bryant, Jari Kaipio
10/12/09
Gabriele Härtinger, University of Auckland - Bioinformatics Institute, Base calling project
Guide
- Introduction: PCR, Sanger method
- Phred: Overview, Algorithms
- Base calling project: Non-negative least-squares, Matlab
Problems and outlook
PCR
Sanger method
source: http://departments.oxy.edu/biology/Stillman/bi221/092200/lecture_notes.htm
Sanger method
source: http://dnasequencing.wordpress.com/
Chromatogram
Phred
- standard program for base calling
- reads DNA sequencer trace data
- calls bases
- assigns quality values to the bases
- writes the base calls and quality values to output files
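The quality values mentioned above are conventionally reported on a logarithmic scale, Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The slides do not spell this out, so the following is only a minimal sketch of that conversion; the function names are hypothetical and not part of Phred itself.

```python
import math


def phred_quality(p_error):
    """Convert an error probability (0 < p <= 1) into a Phred-scaled quality value."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("error probability must lie in (0, 1]")
    return -10.0 * math.log10(p_error)


def error_probability(quality):
    """Inverse conversion: Phred-scaled quality value back to an error probability."""
    return 10.0 ** (-quality / 10.0)


if __name__ == "__main__":
    # Illustrative probabilities only; they are not taken from real trace data.
    for p in (0.1, 0.01, 0.001):
        print(p, round(phred_quality(p), 1))
```

On this scale, each additional 10 quality points corresponds to a tenfold lower estimated error probability.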
Phred - Algorithms
- Phred examines the four base traces using simple Fourier methods.
- It checks the region surrounding each point in the data set in order to predict a series of evenly spaced peak locations.
- It examines each trace to find the centers of the actual, or observed, peaks.
- Each of the four traces is examined separately, because many peaks overlap.
- A dynamic programming algorithm matches the predicted peak locations with the observed peaks. Some peaks are excluded and some are split.
- Unmatched observed peaks are compared with the predicted peaks and, if found, inserted into the sequence.
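The matching step can be illustrated with a small dynamic programming alignment. This is a simplified sketch, not Phred's actual algorithm: the cost function (absolute distance between positions plus a fixed penalty for leaving a peak unmatched), the function name `match_peaks`, and the parameter `skip_cost` are all assumptions made for illustration, and peak splitting is not modelled.

```python
def match_peaks(predicted, observed, skip_cost=3.0):
    """Align sorted predicted and observed peak positions in order.

    Returns a list of (predicted_index, observed_index) pairs that minimises
    the summed position differences plus skip_cost per unmatched peak.
    """
    n, m = len(predicted), len(observed)
    inf = float("inf")
    # cost[i][j]: cheapest alignment of the first i predicted and first j observed peaks
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                c = cost[i - 1][j - 1] + abs(predicted[i - 1] - observed[j - 1])
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "match"
            if i > 0:
                c = cost[i - 1][j] + skip_cost  # predicted location without a peak
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "skip_predicted"
            if j > 0:
                c = cost[i][j - 1] + skip_cost  # observed peak left unmatched
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "skip_observed"
    # Trace the optimal path back from the final cell.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if move[i][j] == "match":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i][j] == "skip_predicted":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    predicted = [10.0, 20.0, 30.0, 40.0]       # evenly spaced predictions
    observed = [9.5, 20.5, 25.0, 30.2, 40.1]   # 25.0 is an extra peak
    print(match_peaks(predicted, observed))    # -> [(0, 0), (1, 1), (2, 3), (3, 4)]
```

In this toy input the observed peak at 25.0 stays unmatched; in the procedure described above, such peaks are the ones examined again in the final step.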
Base calling
Project: base calling in another way.
Deconvolve the peaks with the non-negative least-squares method.
Non-negative least-squares
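- least-squares

$$y = A x + e \;\Rightarrow\; e = y - A x$$

$$\min \|e\|^2 = \min \|y - A x\|^2 = \frac{1}{N} \sum_{k=1}^{N} e_k^2 \;\leftarrow\; \text{minimize}$$

- non-negative least-squares method

$$\min_{x \ge 0} \|e\|^2 = \min_{x \ge 0} \|y - A x\|^2 = \frac{1}{N} \sum_{k=1}^{N} e_k^2 \;\leftarrow\; \text{minimize}$$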
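A tiny worked example may help show what the constraint x ≥ 0 changes. The sketch below minimises ||y − A x||² by plain gradient descent and, for the non-negative case, clips every iterate to x ≥ 0 (projected gradient). This solver is chosen only because it fits in a few lines; it is an assumption for illustration and not necessarily the method used in the project. The matrix, the data vector, and the name `solve_ls` are made up. The factor 1/N in the objective is omitted because it does not change the minimiser.

```python
def solve_ls(A, y, nonneg=False, iters=500):
    """Minimise ||y - A x||^2 by gradient descent; clip to x >= 0 if nonneg is set."""
    n_rows, n_cols = len(A), len(A[0])
    # Step 1 / trace(A^T A): the trace bounds the largest eigenvalue of A^T A,
    # so this step size is small enough for the iteration to converge.
    step = 1.0 / sum(A[i][j] ** 2 for i in range(n_rows) for j in range(n_cols))
    x = [0.0] * n_cols
    for _ in range(iters):
        # Residual e = y - A x
        e = [y[i] - sum(A[i][j] * x[j] for j in range(n_cols)) for i in range(n_rows)]
        # Descent direction A^T e (negative gradient up to a constant factor)
        g = [sum(A[i][j] * e[i] for i in range(n_rows)) for j in range(n_cols)]
        x = [x[j] + step * g[j] for j in range(n_cols)]
        if nonneg:
            x = [max(0.0, v) for v in x]  # projection onto x >= 0
    return x


if __name__ == "__main__":
    A = [[1.0, 0.0],
         [1.0, 1.0],
         [0.0, 1.0]]
    y = [2.0, 1.0, -1.0]
    print("LS:  ", [round(v, 4) for v in solve_ls(A, y)])               # about [2, -1]
    print("NNLS:", [round(v, 4) for v in solve_ls(A, y, nonneg=True)])  # about [1.5, 0]
```

Unconstrained least squares fits this data exactly but needs a negative coefficient; the non-negative solution sets that coefficient to zero and accepts a larger residual.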
Matlab - LS
[Figure: plot with x-axis from 0 to 120 and y-axis from −0.2 to 1.2. Blue: true x. Green: chromatogram. Red: regularized LS.]
Matlab - NNLS
[Figure: plot with x-axis from 0 to 120 and y-axis from −0.2 to 1.2. Blue: NNLS. Green: chromatogram. Red: regularized LS.]
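The comparison named in the two Matlab slides (regularized LS versus NNLS on a chromatogram) can be reproduced in spirit on synthetic data. The sketch below uses Python with NumPy in place of Matlab, and everything specific in it is an assumption: the Gaussian peak shape used to build A, the peak width, the spike positions, the noise level, the Tikhonov parameter `lam`, and the projected-gradient solver standing in for a dedicated NNLS routine. None of these values come from the slides.

```python
import numpy as np


def gaussian_blur_matrix(n, width):
    """Matrix A whose columns are Gaussian peaks centred on each sample index."""
    idx = np.arange(n)
    return np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / width) ** 2)


def regularized_ls(A, y, lam):
    """Tikhonov-regularized least squares: solve (A^T A + lam I) x = A^T y."""
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)


def nnls_projected_gradient(A, y, iters=5000):
    """Minimise ||y - A x||^2 subject to x >= 0 by projected gradient descent."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / largest eigenvalue of A^T A
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = np.maximum(0.0, x + step * (A.T @ (y - A @ x)))
    return x


# Synthetic data: a few non-negative spikes blurred into overlapping peaks.
rng = np.random.default_rng(0)
n = 120
x_true = np.zeros(n)
x_true[[20, 35, 41, 70, 95]] = [1.0, 0.8, 0.6, 1.0, 0.7]
A = gaussian_blur_matrix(n, width=3.0)
y = A @ x_true + 0.01 * rng.standard_normal(n)  # simulated chromatogram

x_reg = regularized_ls(A, y, lam=0.1)
x_nn = nnls_projected_gradient(A, y)

print("smallest regularized-LS coefficient:", x_reg.min())
print("smallest NNLS coefficient:", x_nn.min())
```

The NNLS estimate is non-negative by construction, whereas the regularized LS estimate is free to take negative values, which is the property the non-negative formulation is meant to exclude.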
Problem
[Figure: plot with x-axis from 0 to 150 and y-axis from 0 to 4500. Scaling of A: 1]
Outlook
[Figure: plot with x-axis from 0 to 150 and y-axis from 0 to 8000. Scaling of A: 0.8]
Acknowledgement
A big THANKS to Allen, David, Jari, Peter
and to all of the Bioinformatics Institute.