The Analysis of Proteomics Spectra from Serum Samples · 2004. 12. 30.
The Analysis of Proteomic Spectra from Serum Samples
Keith Baggerly
Biostatistics & Applied Mathematics
MD Anderson Cancer Center
PROTEOMICS 1
What Are Proteomic Spectra?
DNA makes RNA makes Protein
Microarrays allow us to measure the mRNA complement of a set of cells
Mass spectrometry allows us to measure the protein complement (or subset thereof) of a set of cells
Proteomic spectra are mass spectrometry traces of biological specimens
PROTEOMICS 2
Why Are We Excited?
Profiles at this point are being assessed using serum and urine, not tissue biopsies
Spectra are cheaper to run on a per unit basis than microarrays
Can run samples on large numbers of patients
PROTEOMICS 3
How Does Mass Spec Work?
PROTEOMICS 4
What Do the Data Look Like?
PROTEOMICS 5
Learning: Spotting the Samples
PROTEOMICS 6
What the Guts Look Like
PROTEOMICS 7
Taking Data
PROTEOMICS 8
Some Other Common Steps
Fractionating the Samples
Changing the Laser Intensity
Working with Different Matrix Substrates
PROTEOMICS 9
SELDI: A Special Case
www.ciphergen.com
Precoated surface performs some preselection of the proteins for you.
Machines are nominally easier to use.
PROTEOMICS 10
A Tale of Two Examples
Example 1 – Learning from the literature (SELDI)
Example 2 – Testing out our understanding (MALDI)
A story in pictures
PROTEOMICS 11
A SELDI Example: Feb 16 ’02 Lancet
• 100 ovarian cancer patients
• 100 normal controls
• 16 patients with “benign disease”
Use 50 cancer and 50 normal spectra to train a classification method; test the algorithm on the remaining samples.
PROTEOMICS 12
Their Results
• Correctly classified 50/50 of the ovarian cancer cases.
• Correctly classified 46/50 of the normal cases.
• Correctly classified 16/16 of the benign disease as “other”.
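The counts above translate directly into the usual test-set rates. A small worked calculation, assuming cancer is treated as the positive class (the variable names are illustrative, not taken from the paper):

```python
# Test-set counts as reported on the slide.
cancer_correct, cancer_total = 50, 50
normal_correct, normal_total = 46, 50

sensitivity = cancer_correct / cancer_total   # fraction of cancers called cancer
specificity = normal_correct / normal_total   # fraction of normals called normal

print(f"sensitivity = {sensitivity:.0%}, specificity = {specificity:.0%}")
```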
Data at http://www.ncifdaproteomics.com (used to be at http://clinicalproteomics.steem.com)
Large sample sizes, using serum
PROTEOMICS 13
The Data Sets
3 data sets on ovarian cancer
Data Set 1 – The initial experiment. 216 samples, baseline subtracted, H4 chip
Data Set 2 – Followup: the same 216 samples, baseline subtracted, WCX2 chip
Data Set 3 – New experiment: 162 cancers, 91 normals, baseline NOT subtracted, WCX2 chip
A set of 5-7 separating peaks is supplied for each data set.
We tried to (a) replicate their results, and (b) check consistency of the proteins found
PROTEOMICS 14
We Can’t Replicate their Results (DS1 & DS2)
PROTEOMICS 15
Some Structure is Visible in DS1
PROTEOMICS 16
Or is it? Not in DS2
PROTEOMICS 17
Processing Can Trump Biology (DS1 & DS2)
PROTEOMICS 18
We Can Analyze Data Set 3!
PROTEOMICS 19
Do the DS2 Peaks Work for DS3?
PROTEOMICS 20
Do the DS3 Peaks Work for DS2?
PROTEOMICS 21
Peaks are Offset
PROTEOMICS 22
Which Peaks are Best? T-statistics
Note the magnitudes: t-values in excess of 20 (absolute value)!
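As a rough illustration of ranking m/z values by t-statistic, here is a minimal sketch on synthetic data. The group sizes match DS3 (162 cancers, 91 normals), but the intensities are simulated, and the Welch (unequal-variance) form is an assumption; the slides do not say which variant was used.

```python
import numpy as np

def two_sample_t(cancer, normal):
    """Welch t-statistic at every m/z value (columns) for two groups of spectra (rows)."""
    m1, m2 = cancer.mean(axis=0), normal.mean(axis=0)
    v1, v2 = cancer.var(axis=0, ddof=1), normal.var(axis=0, ddof=1)
    return (m1 - m2) / np.sqrt(v1 / len(cancer) + v2 / len(normal))

rng = np.random.default_rng(0)
n_mz = 200                                   # hypothetical number of m/z values
cancer = rng.normal(size=(162, n_mz))        # simulated intensities, not real spectra
normal = rng.normal(size=(91, n_mz))
cancer[:, 10] += 5.0                         # plant one strongly separating m/z value

t = two_sample_t(cancer, normal)
best = int(np.argmax(np.abs(t)))             # index of the best single separator
print(best, round(float(abs(t[best])), 1))
```

A shift of five standard deviations with these group sizes already produces |t| well above 20, which gives a sense of how extreme the separation on the slide is.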
PROTEOMICS 23
One Bivariate Plot: M/Z = (435.46,465.57)
Perfect Separation. These are the first 2 peaks in their list, and ones we checked against DS2.
PROTEOMICS 24
Another Bivariate Plot: M/Z = (2.79,245.2)
Perfect Separation, using a completely different pair. Further, look at the masses: this is the noise region.
PROTEOMICS 25
Perfect Classification with Noise?
This is a problem, in that it suggests a qualitative difference in how the samples were processed, not just a difference in the biology.
This type of separation reminds us of what we saw with benign disease.
(Sorace and Zhan, BMC Bioinformatics, 2003)
PROTEOMICS 26
Mass Accuracy is Poor?
A tale of 5 masses...
| DS1 (Feb ’02) | DS2 (Apr ’02) | DS3 (Jun ’02) |
|---|---|---|
| -7.86E-05 | -7.86E-05 | -7.86E-05 |
| 2.18E-07 | 2.18E-07 | 2.18E-07 |
| 9.60E-05 | 9.60E-05 | 9.60E-05 |
| 0.000366014 | 0.000366014 | 0.000366014 |
| 0.000810195 | 0.000810195 | 0.000810195 |
PROTEOMICS 27
How are masses determined?
Calibrating known proteins
PROTEOMICS 28
Calibration is the Same?
M/Z vectors the same for all three data sets.
Machine calibration the same for 4+ months?
PROTEOMICS 29
What is the Calibration Equation?
The Ciphergen equation
$$\frac{m/z}{U} = a(t - t_0)^2 + b, \qquad U = 20\text{K}, \quad t = (0, 1, \ldots) \times 0.004$$
Fitting it here
$$a = 0.2721697 \times 10^{-3}, \quad b = 0, \quad t_0 = 0.0038$$
These are the default settings that ship with the software!
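The fitted constants can be checked against the five masses tabulated earlier. The sketch below evaluates the calibration equation at the first five clock ticks. One assumption is labeled in the code: a sign(t − t0) factor is included, because without it the first mass could not come out negative as it does in the table; the equation as transcribed on the slide shows no sign term.

```python
import numpy as np

U = 20_000.0          # "U = 20K" on the slide
a = 0.2721697e-3
b = 0.0
t0 = 0.0038

def mz_from_time(t):
    """Map time of flight to m/z with the quadratic calibration equation."""
    d = t - t0
    # Assumption: the sign factor makes times before t0 map to negative m/z.
    return U * (a * np.sign(d) * d**2 + b)

t = np.arange(5) * 0.004          # first five clock ticks, 0.004 apart
mz = mz_from_time(t)
for value in mz:
    print(f"{value:.6g}")
```

Under that assumption the five computed values agree with the tabulated masses to the precision shown there, which is consistent with the calibration being the software defaults.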
PROTEOMICS 30
Other issues
Prostate Cancer
Q-star data different
clinical trials?
PROTEOMICS 31
A MALDI Example: Proteomics Data Mining
41 samples, 24 with lung cancer∗, 17 controls.
20 fractions per sample.
Goal: distinguish the two groups; we know this can be done due to the “zip effect”.
Data used to be at
http://www.radweb.mc.duke.edu/cme/proteomics/explain.htm
but the site has been retired. Send email to Ned Patz or Mike Campa at Duke if interested ([email protected], [email protected]).
PROTEOMICS 32
Raw Spectra Have Different Baselines
Note the need for baseline correction and normalization.
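The slides do not say which baseline or normalization method was used. As one simple possibility, the sketch below subtracts a running-minimum baseline and then scales the spectrum to unit total intensity; the window length and the synthetic spectrum are assumptions chosen for illustration.

```python
import numpy as np

def subtract_baseline(y, window=101):
    """Subtract a crude baseline: the running minimum over a centred window (odd length)."""
    half = window // 2
    padded = np.pad(y, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return y - windows.min(axis=1)

def normalize(y):
    """Scale a spectrum so that its intensities sum to 1."""
    return y / y.sum()

# Synthetic spectrum: a decaying baseline plus one narrow peak at index 500.
x = np.arange(1000)
raw = 50.0 * np.exp(-x / 300.0) + 10.0 * np.exp(-0.5 * ((x - 500) / 3.0) ** 2)

cleaned = normalize(subtract_baseline(raw))
print(int(np.argmax(raw)), int(np.argmax(cleaned)))
```

In the raw trace the largest value sits at the start of the decaying baseline; after correction the largest value is the planted peak, which is the point of doing this step before comparing spectra.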
PROTEOMICS 33
Oscillatory Behavior...
Roughly half the spectra have sinusoidal noise.
We’re seeing the A/C power cord.
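One generic way to expose a periodic component like this is to look at the power spectrum of a noise-only stretch of the trace. This is a sketch on simulated data, not the procedure used for the talk; the period of 64 clock ticks and the amplitudes are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4096
x = np.arange(n)
period = 64                                   # hypothetical period, in clock ticks
signal = rng.normal(size=n) + 2.0 * np.sin(2 * np.pi * x / period)

# Power spectrum of the mean-centred trace.
power = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
freqs = np.fft.rfftfreq(n, d=1.0)             # cycles per clock tick

k = int(np.argmax(power[1:])) + 1             # skip the zero-frequency bin
estimated_period = 1.0 / freqs[k]
print(estimated_period)
```

A single dominant bin well above the rest indicates a sinusoid riding on the noise; its period in clock ticks, combined with the digitizer's sampling rate, gives the physical frequency.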
PROTEOMICS 34
Baseline Adj: Fraction Agreement, Before & After
PROTEOMICS 35
Fractionation is Unstable
PROTEOMICS 36
Unfractionating the Data
PROTEOMICS 37
The Overall Average Shows Spikes. Difference It.
PROTEOMICS 38
Computer Buffer?
Spike spacing has a wavelength of $4096 = 2^{12}$.
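The "difference it" step can be sketched as follows: differencing removes the smooth trend in the average spectrum and leaves isolated jumps where the spikes are. The curve, noise level, spike height, and threshold rule are all assumptions for illustration; only the 4096 spacing comes from the slide.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
x = np.arange(n)

# Simulated overall average: smooth decay plus small noise.
average = 100.0 * np.exp(-x / 5000.0) + rng.normal(scale=0.05, size=n)
average[np.arange(4096, n, 4096)] += 5.0      # one-point spikes every 4096 ticks

diff = np.diff(average)                        # first difference flattens the trend
threshold = 10 * np.median(np.abs(diff))       # crude robust cutoff (assumption)
jumps_up = np.flatnonzero(diff > threshold) + 1   # index where each spike sits
spacing = np.diff(jumps_up)
print(jumps_up, spacing)
```

A constant spacing that is an exact power of two points toward the acquisition hardware or software rather than anything in the sample.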
PROTEOMICS 39
Are We Done Cleaning Yet?
Give the problem a chance to be easy, try some simple clustering.
PROTEOMICS 40
PCA Splits off Half the Normals
PROTEOMICS 41
Peaks at Integer Multiples of M/Z 180.6!
This suggests a polymer. No Amino Acid dimers fit.
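Checking whether peak locations fall on integer multiples of a repeat unit is a one-line division. In the sketch below the peak list is hypothetical (made-up values for illustration, including one deliberate non-multiple), and the tolerance is an arbitrary choice; only the 180.6 unit comes from the slide.

```python
import numpy as np

unit = 180.6
# Hypothetical peak locations, for illustration only.
peaks = np.array([361.2, 541.8, 722.4, 903.0, 1000.0, 1083.6])

ratio = peaks / unit
nearest = np.rint(ratio)                       # closest integer multiple
is_multiple = np.abs(ratio - nearest) < 0.01   # tolerance is an assumption

for p, k, ok in zip(peaks, nearest, is_multiple):
    print(f"{p:8.1f}  ~ {int(k)} x {unit}  {'yes' if ok else 'no'}")
```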
PROTEOMICS 42
Cleaning Redux
• Baseline Correction and Normalization
• Inconsistent Fractionation
• Computer Buffers
• Polymers in some Normal Spectra
• Peak Finding (Use Theirs)
Data reduced to 1 spectrum/patient, with 506 peaks per spectrum.
PROTEOMICS 43
Find the Best Separators
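| Peaks | MD | P-Value | Wrong | LOOCV |
|---|---|---|---|---|
| 12886 | 2.547 | ≤ 0.005 | 11 | 11 |
| 8840, 12886 | 5.679 | ≤ 0.01 | 5 | 6 |
| 3077, 12886, 74263 | 9.019 | ≤ 0.01 | 3 | 4 |
| 5863, 8143, 8840, 12886 | 12.585 | ≤ 0.01 | 3 | 3 |
| 4125, 7000, 9010, 12886, 74263 | 23.108 | ≤ 0.01 | 1 | 1 |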
There are 9 values that recur frequently, at masses of 3077, 4069, 5825, 6955, 8840, 12886, 17318, 61000, and 74263.
P-values are not from table lookups!
PROTEOMICS 44
Testing Reality (Significance)
Generate a bunch of “random noise” data matrices, each 41 × 506 in size.
For each matrix, split the 41 noise “samples” into groups of 24 and 17.
Repeat our search procedure on the random data, and see how well we can separate things.
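The simulation described above can be sketched as follows. This is a simplified stand-in: the search is reduced to picking the single best peak by absolute t-statistic, whereas the actual search procedure and score are not spelled out on the slides. The number of simulations, the Gaussian noise model, and the function names are assumptions.

```python
import numpy as np

def best_single_peak_t(data, n_cancer=24):
    """Largest absolute Welch t-statistic over all peaks for a 24/17 split of the rows."""
    g1, g2 = data[:n_cancer], data[n_cancer:]
    se = np.sqrt(g1.var(axis=0, ddof=1) / len(g1) + g2.var(axis=0, ddof=1) / len(g2))
    return float(np.max(np.abs((g1.mean(axis=0) - g2.mean(axis=0)) / se)))

rng = np.random.default_rng(3)
n_sim = 200                                    # hypothetical number of noise matrices
null_best = np.array([
    best_single_peak_t(rng.normal(size=(41, 506))) for _ in range(n_sim)
])

def empirical_p(observed, null_values):
    """Fraction of noise runs that separate at least as well as the observed score."""
    return (1 + np.sum(null_values >= observed)) / (1 + len(null_values))

print(f"median best |t| on pure noise: {np.median(null_best):.2f}")
```

Because the best of 506 candidates is selected each time, even pure noise yields sizeable scores, so an observed score has to be compared against this simulated distribution rather than a standard table.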
PROTEOMICS 45
The Eyeball Test
We applied one last filtering step and actually looked at the regions identified. All 9 peaks listed above passed the eye test.
Blue lines = Cancers
Red lines = Controls
PROTEOMICS 46
Other Stuff
We were the only ones to notice the sinusoidal noise.
and the clock tick.
and they were looking at power cables and other stuff.
and they gave us a nice shiny plaque!
PROTEOMICS 47
The Deluge at MDA
Brain Cancer
Bladder Cancer
Leukemia
Pancreatic Cancer
Breast Cancer
Several show real structure, several show processing effects.
“If you’re not working on a proteomics project, you will be soon!”
Kevin Coombes to Bioinf section at MDA
PROTEOMICS 48
The Punchlines
• There is no magic bullet here. (Sorry.)
• Data preprocessing is extremely important with this type of data, and there is still much room for improvement.
• Use Simple Tests and Pictures
• Insist on Good Experimental Design
• There is structure in this data and it can be found!
PROTEOMICS 49
Our Own Reports
On the Lancet data:
Baggerly, Morris and Coombes (2004), Bioinformatics, 20(5):777-785.
On the Proteomics Data Mining Conference Data:
Baggerly, Morris, Wang, Gold, Xiao and Coombes (2003), Proteomics, 3(9):1677-1682.
More methodology:
Coombes et al (2003), Clinical Chemistry, 49(10):1615-1623.
pdf preprints are available.
PROTEOMICS 50
Partners in Crime
Kevin Coombes, Jeff Morris
——-
Jing Wang, David Gold, Lian-Chun Xiao
——-
Ryuji Kobayashi, David Hawke, John Koomen
PROTEOMICS 51
Index
Title Page
Intro to MALDI and SELDI
Ovarian SELDI Example
Proteomics Data Mining Example
Conclusions
Reports and Acknowledgements