Nature Methods doi:10.1038/nmeth.4334
Supplementary Figure 1
Details of PIQED automated qualitative and quantitative post-translational modification analysis workflow.
Modified peptides enriched from biological samples or peptides from proteome digestion without modification enrichment are analyzed by data-independent acquisition. PIQED supports data from SCIEX or Thermo instruments. PIQED automates all data analysis steps starting from instrument .wiff or .raw files, including: (1) file conversion and pseudo-MS/MS spectra generation using DIA-Umpire, (2) database searching by MS-GF+, X! Tandem, and/or Comet followed by results refinement and combination using PeptideProphet/iProphet/PTMProphet, (3) automated spectral library generation and fragment area extraction using SkylineRunner, and finally, (4) Skyline report generation, filtering for only localized PTMs with certain iProphet and PTMProphet scores, optional protein-level correction, and table reformatting for significance testing with mapDIA. All files produced during the analysis are output by the pipeline, including the raw search outputs, the various refined pep.xml files, the Skyline file containing extracted peaks for manual review, and the final output from mapDIA, which includes site-level fold change for all desired comparisons.
Supplementary Figure 2
Screenshot of the open-source PIQED GUI written in Java, Windows Batch, Python, and R.
The software has several optional modules that can be run, where each subsequent module can automatically use the output from the previous module. These modules are broadly divided into three sections: (1) file conversions and DIA-Umpire signal extraction, (2) database searches, and (3) post-processing, quantification, and significance testing. The leftmost section is used to specify parameters for the file conversion and DIA-Umpire signal extraction module, which generates pseudo-MS/MS spectra for database searches. The middle section is used to set the database search parameters for MS-GF+, X! Tandem, and Comet. The right section contains input for PeptideProphet, iProphet, PTMProphet, Skyline peak area report generation, and mapDIA significance testing. The user can save and load up to four default parameter sets.
Supplementary Figure 3
Example of a quantitative difference discovered using PIQED to identify phosphorylation sites and their changes in previously published Q-Exactive DIA data collected from urine samples without phospho-enrichment.
(A) Annotated spectra for the peptide TCVADEpSAENCDK, which contains phosphoserine pSer-82 from human serum albumin (HSA). (B) Example of extracted ion chromatograms from the precursor (MS1) and fragment (MS2) ions showing excellent coelution and high mass accuracy. (C) Plot of total MS1 and MS2 peak areas for all replicates measured among the three diagnosis groups: diagnosed urinary tract infections (UTI, n=11), diagnosed ovarian cyst (OC, n=11), or control (pain not diagnosed as UTI or OC, n=11). Group average of (D) raw site-level signal, (E) HSA protein-level, and (F) site-level corrected by protein level and local TIC. Error bars are standard error. Although the protein level of HSA is not significantly different between the diagnosis groups, we observed a statistically significant increase in abundance of pSer-82 comparing the UTI group with the undiagnosed pain group with and without the normalization options available in PIQED.
Supplementary Figure 4
Comparison of quantitative results for HSA and HSA pSer-82 using various normalization options in PIQED.
(A) Plot showing individual total site-level peak areas for the peptide TCVADEpSAENCDK containing phosphoserine pSer-82. (B) Protein-level areas computed with mapDIA using unmodified HSA peptides. (C) Site-level areas (from A) for the peptide TCVADEpSAENCDK containing phosphoserine pSer-82 corrected by the observed protein-level areas from (B). (D) Site-level areas for phosphoserine pSer-82 from the peptide in (A) corrected by local total ion chromatogram signal (TIC). (E) Site-level areas for phosphoserine pSer-82 from the peptide in (A) corrected by local total ion chromatogram signal (TIC) and by the protein-level area from (B). In all cases with or without normalizations, this phosphorylation site is determined to be statistically increased in urine from children diagnosed with UTI compared to the undiagnosed pain group.
Supplementary Figure 5
Examples of high-quality annotated pseudo-MS/MS spectra showing 4/34 phosphorylation sites in osteopontin identified using PIQED. Osteopontin is overexpressed in a variety of cancers.
(A) Annotated spectra for pSer-219 showing the presence of phosphate neutral loss on all y-ions and a prominent fragment ion corresponding to fragmentation N-terminal to proline. (B) Annotated spectra for a peptide containing pSer-310 from the osteopontin C-terminus, which has a nearly complete b-ion series. (C) Annotated spectra for a peptide containing pSer-308 and pSer-310 from the osteopontin C-terminus, which also contains a nearly complete b-ion series and neutral losses of one or two phosphoric acids starting at b7 and b9, respectively. The spectra in (B) and (C) also contain unlabeled fragment ions that correspond to a neutral loss of water from one of the unmodified serine residues.
Supplemental Information for
Automated Identification & Quantification of Protein Modifications from Data-Independent Acquisition
Jesse G. Meyer1, Sushanth Mukkamalla1, Hanno Steen2, Alexey I. Nesvizhskii3,4, Bradford W. Gibson1, Birgit Schilling1*
Affiliations
1 Buck Institute for Research on Aging, Novato, CA, USA.
2 Department of Pathology, Boston Children's Hospital and Harvard Medical School, Boston, Massachusetts, USA.
3 Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA.
4 Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA.
* Corresponding author
Birgit Schilling, [email protected]
SUPPLEMENTAL NOTE
DETAILED DESCRIPTION OF PIQED ANALYSIS WORKFLOW
Files from DIA are processed using our custom, open-source software PIQED, which provides a graphical user interface acting as a wrapper for all the computational steps of combined discovery and quantification of PTM peptides from DIA. The GitHub repository contains detailed installation and usage instructions (https://github.com/jgmeyerucsd/PIQEDia/blob/master/Tutorial_and_manual.pdf). Data from either SCIEX or Thermo Fisher Scientific instruments are currently supported by PIQED. Raw instrument files are first converted to .mzXML using msconvert.exe from ProteoWizard, and the DIA-Umpire signal extraction module is used to generate pseudo-MS/MS spectra. MGF files produced from DIA-Umpire are automatically converted to .mzXML, and the user can choose to specify parameters for database searches using any combination of MSGF+ [1], Comet [2], and/or X! Tandem [3] (see supplemental files “comet.Kac.DIA.params” and “xTandem_Kac_params.xml”). The database search results from X! Tandem and MSGF+ are automatically converted to pep.xml, whereas the Comet search output should be specified as .pep.xml in the Comet configuration file. All search outputs are processed separately through PeptideProphet [4] using the TPP command xinteract.exe, combined using iProphet [5], and refined for PTM localization using PTMProphet [6]. Peptide identifications are imported into a pre-setup Skyline [7] document template using SkylineRunner.exe to generate a spectral library, perform fragment ion-level signal extraction, and output a custom fragment ion-level peak area report.
The Skyline report is read into R and filtered for only peptides containing the modification of interest that meet a user-defined PTMProphet localization score, optionally corrected for changes in protein level, and reformatted for mapDIA. The optional protein-level correction uses protein-level intensity measurements from the same sample computed using unmodified peptides (either from a separate acquisition, or in the case of the urine data, from unmodified peptides identified in the same acquisition) and divides all fragment-level areas for each modification site by their protein-level intensities. The software automatically attempts protein-level correction if the directory contains the file “proteinlevels.txt”, an example of which is available on GitHub under the inputs folder. Only modification sites from proteins found in the separate protein-level quantification are included in the output, which is why the number of phosphorylation sites quantified from the urine dataset decreased after protein-level correction. See figure S4A-C for an example of protein-level normalization. The R step can also rename replicates to contain group names, which happens automatically if the file “namemapping.txt” is present in the output directory (see GitHub for an example of “namemapping.txt”). Finally, PIQED runs mapDIA [8] to perform interference filtering and to assess statistical significance of the desired comparisons. This step offers the option of total ion chromatogram normalization or local, retention-time-based normalization (see figure S4D-E and the mapDIA publication for details).
In summary, our software tool PIQED provides a high-throughput automated data processing pipeline that can be used not only to identify but also to quantify post-translational modifications, combining many individual data processing tools in an automated fashion that can be executed through a simple GUI. At the same time, by using a DIA-only data acquisition workflow, the PTM analysis can be performed using small amounts of biological material (as no extra DDA acquisitions are needed for library building), and mass spectrometric instrument time is reduced significantly. The implementation of protein-level normalization for PTM quantification allows for more precise results.
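The optional protein-level correction described above can be sketched as follows. This is a minimal illustration in Python, not the R code that PIQED actually runs. The dictionary layouts, the function name `protein_level_correct`, and the protein, site, and sample labels are all hypothetical. The sketch only reproduces the two behaviors stated in the text: fragment-level areas are divided by the protein-level intensity from the same sample, and sites from proteins absent from the protein-level table are dropped. Skipping entries with a missing or zero protein-level value is an added assumption.

```python
def protein_level_correct(fragment_areas, protein_levels):
    """Divide fragment-level areas by the parent protein's intensity.

    fragment_areas: {(protein, site, fragment, sample): area}   (hypothetical layout)
    protein_levels: {protein: {sample: intensity}}              (hypothetical layout)
    """
    corrected = {}
    for key, area in fragment_areas.items():
        protein, _site, _fragment, sample = key
        levels = protein_levels.get(protein)
        if levels is None:
            # Protein not quantified at the protein level: site is excluded.
            continue
        level = levels.get(sample)
        if not level:
            # Assumption: skip entries with a missing or zero protein level.
            continue
        corrected[key] = area / level
    return corrected


# Toy input with made-up numbers and labels.
fragment_areas = {
    ("ALBU_HUMAN", "S82", "y5", "UTI_01"): 2000.0,
    ("ALBU_HUMAN", "S82", "y5", "CTRL_01"): 1000.0,
    ("OSTP_HUMAN", "S219", "y4", "UTI_01"): 500.0,
}
protein_levels = {"ALBU_HUMAN": {"UTI_01": 4.0, "CTRL_01": 2.0}}

corrected = protein_level_correct(fragment_areas, protein_levels)
for key, value in sorted(corrected.items()):
    print(key, value)
```

In this toy input the second protein has no protein-level entry, so its site disappears from the output. The same effect explains why fewer phosphorylation sites were quantified from the urine dataset after correction.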
DESCRIPTION OF FILES ON GITHUB: INPUTS AND PARAMS
Diaumpire_se_orbi_strict.txt – parameters file used for DIA-Umpire signal extraction from the urine dataset
Diaumpire_se.params – parameters file used for the DIA-Umpire signal extraction module for the acetyl dataset, including the variable window definition
20150810.mouse.cc.iRT.fasta – database file used for MS-GF+ database searches of the acetyl dataset and for populating the Skyline document
20161213.human.fasta – database file used for MS-GF+ database searches of the urine dataset and for populating the Skyline document
20150810.mouse.cc.iRT_DECOY.fasta – database file used for Comet and X! Tandem database searches of the acetyl dataset
20161213.human_DECOY.fasta – database file used for the Comet and X! Tandem database searches of the urine dataset
Comet.Kac.DIA.params – Comet database search parameters file used for the acetyl dataset
Comet64.params.orbi.new – Comet database search parameters file used for the urine dataset
taxonomy.xml – file required for X! Tandem database searches specifying the location of the database file used for the acetyl dataset
human_taxonomy.xml – file required for X! Tandem database searches specifying the location of the database file used for the urine dataset
xTandem_Kac_params.xml – X! Tandem database search parameters file used for the acetyl dataset
xTandem_pSTY_orbi_params.xml – X! Tandem database search parameters file used for the urine dataset
Skyline\default_empty.sky – ‘empty’ Skyline document containing all appropriate settings for the tutorial dataset. For user data, the template document should be edited to reflect the instrument parameters used to collect your data.
Skyline\Orbi_empty.sky – ‘empty’ Skyline document containing example settings for data extraction from Orbitrap data.
DESCRIPTION OF FILES ON GITHUB, ACETYLLYSINE DATASET RESULTS FILES: OUTPUTS\ACETYL_MOUSE_LIVER\
fullDIA.final.interact.ptm.pep.xls.xlsx – iProphet-filtered peptide identification results
fullDIA.final.interact.ptm.pep.xml – final peptide identification results in pep.xml format
fullDIA.pt99.mProph.features.csv – complete list of mProphet feature scores produced within Skyline
2016_0826_mapDIA.skyr – custom Skyline report file
2016_0826_mapDIA.csv – Skyline report before mapDIA filtering for interferences and reformatting
mapDIA_Input.txt – filtered and reformatted Skyline report used for input to mapDIA
site_level_areas.txt – mapDIA site-level area report containing areas used for calculation of CV values in figure 1c.
mapDIA_analysis_output.txt – raw mapDIA output results with site-level fold changes and probabilities used to generate figure 1d.
DESCRIPTION OF FILES ON GITHUB, URINE PHOSPHORYLATION DATASET RESULTS FILES: OUTPUTS\URINE\
ptmProphet-output-file.ptm.pep.xml.zip – compressed PTMProphet output
iPro-output-file.pep.xml – combined iProphet results used for input to PTMProphet
noCor_analysis_output.txt – site-level mapDIA output using no normalization, used to produce supplemental figure 3A.
proteinlevels.txt – protein-level quantities used for protein-level correction of site-level changes, used to produce supplemental figure 3B.
protlvlCor_noTICcor_analysis_output.txt – site-level mapDIA output using protein-level normalization but not local TIC normalization, used to produce supplemental figure 3C.
TICcor_noProtCor_analysis_output.txt – site-level mapDIA output using local TIC normalization but not protein-level normalization, used to produce supplemental figure 3D.
bothCor_analysis_output.txt – site-level mapDIA output using both local TIC normalization and protein-level correction, used to produce supplemental figure 3E.
SUPPLEMENTAL METHODS
SAMPLE PREPARATION – ENRICHMENT OF ACETYLATED PEPTIDES FROM MOUSE LIVER
Animal studies were performed according to protocols approved by IACUC (the Institutional Animal Care and Use Committee). Sirt5-/- male mice on 129 background were maintained on a standard chow diet (5053 PicoLab diet, Purina) until they were sacrificed at 24 weeks of age for experiments. Liver tissue from a Sirt5 knockout mouse was lysed and 20 mg of protein lysate was digested as described previously [9]. Acetylated peptides were enriched using 1 tube of anti-acetyllysine antibody-bead conjugated PTMScan (Cell Signaling Technologies) according to the manufacturer’s instructions. Peptides eluted from the enrichment were desalted using C18 reversed phase StageTips, and resuspended in 0.2% formic acid for mass spectrometry analysis.
NANO-LIQUID CHROMATOGRAPHY-TANDEM MASS SPECTROMETRY
Peptide separations were carried out using mobile phase A consisting of 97.95% water/0.05% FA/2% acetonitrile, and mobile phase B consisting of 98% acetonitrile/1.95% water/0.05% FA. Samples were loaded onto a C18 pre-column chip (200 µm x 6 mm ChromXP C18-CL chip, 3 µm, 300 Å) using an Eksigent cHiPLC system for 10 minutes at a flow of 2 µL per min of mobile phase A. Separation was performed with a 75 µm x 15 cm ChromXP C18-CL analytical chip (3 µm, 300 Å) using a gradient from 95% to 60% mobile phase A over 80 minutes. The column chip was washed by an increase to 80% B over 5 minutes that was maintained for 8 minutes, followed by a return to 95% mobile phase A over two minutes that was maintained for 35 minutes to re-equilibrate the column. Eluting peptides were directly electrosprayed into an orthogonal quadrupole time-of-flight TripleTOF 5600 mass spectrometer (SCIEX) and analyzed by data-independent acquisition (SWATH). Every SWATH cycle consisted of a 250 ms precursor scan from 400-1,250 m/z followed by fragmentation of all ions between 400-1,200 m/z using 64 variable width precursor isolation windows (window definitions contained in DIA-Umpire signal extraction parameters file “diaumpire_se.params”) for 42 ms each, resulting in a total cycle time of approximately 3 sec. Fragment ion spectra were collected from 100-2,000 m/z.
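The stated cycle time follows directly from the scan settings above, as this short Python check shows. It assumes the cycle is simply one precursor scan plus one accumulation per isolation window, and ignores any instrument overhead between scans. The function and constant names are illustrative only.

```python
MS1_SCAN_MS = 250          # precursor (MS1) scan time per cycle
N_WINDOWS = 64             # variable-width precursor isolation windows
MS2_ACCUMULATION_MS = 42   # accumulation time per window


def swath_cycle_time_ms(ms1_ms, n_windows, ms2_ms):
    """Nominal cycle time: one MS1 scan plus one MS2 scan per window."""
    return ms1_ms + n_windows * ms2_ms


cycle_ms = swath_cycle_time_ms(MS1_SCAN_MS, N_WINDOWS, MS2_ACCUMULATION_MS)
cycle_s = cycle_ms / 1000.0
print(f"{cycle_ms} ms = {cycle_s:.3f} s")  # 2938 ms = 2.938 s
```

The nominal value of 2.938 s is consistent with the reported cycle time of approximately 3 sec.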
PIQED SETTINGS USED FOR 1X VERSUS 2X INJECTION EXPERIMENT
All parameter files used in this study are available for download from MassIVE (MassIVE ID: MSV000080189, ftp://massive.ucsd.edu/MSV000080189/raw/). The Skyline file output from PIQED has been uploaded to Panorama at https://panoramaweb.org/labkey/project/Schilling/PIQED/begin.view (use your general Panorama login to freely access the dataset). Pseudo-MS/MS spectra from DIA-Umpire were database searched using MSGF+, Comet, and X! Tandem against the UniProt database of all mouse proteins downloaded on August 10th, 2015 and reversed sequences to allow false discovery rate (FDR) estimation using the target-decoy approach. See the supplemental files for Comet and X! Tandem search parameters. Search parameters for MSGF+ were: number of tryptic termini = 2, peptide lengths from 7-40 amino acids, charge values from 1 to 6, 20 ppm precursor mass error, and no isotope error. Searches included fixed carbamidomethylation of cysteine and up to 3 variable modifications by methionine oxidation, lysine acetylation, peptide N-terminal pyroglutamate formation, and protein N-terminal acetylation. Database search results were converted from .mzid to .pep.xml using idconvert.exe, and results were refined using PeptideProphet. PeptideProphet processing used the following options: do not merge files into one analysis, use PPM mass error model, use the non-parametric model with decoy hits to pin down the negative distribution, report decoy hits with computed probability, enzyme trypsin, and clevel=0. After separate refinement with PeptideProphet, all database search results were combined into one file using iProphet with the default parameters. The iProphet output was then used as input to PTMProphet for PTM localization scoring. Finally, the peptide identification results in .pep.xml format were imported into Skyline to generate a spectral library. SWATH fragment ion peak areas from the three replicates of 0.5X and 1X injections were exported as the custom Skyline reports, and peptides containing lysine +42 (acetylation) were filtered based on a minimum localization score of 0.95. mapDIA, which filters fragment ions and uses well-established Bayesian modeling to compute the false discovery rates of changes, was then used to compare the two groups (i.e. “halfDIA” and “fullDIA”) with the default parameters using no normalization. The mapDIA model is quite robust with any distribution of input data; therefore, no assumptions are required for its use to test differential expression. The results presented here are representative of several similar experiments performed in our laboratory.
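The filtering step described above, which keeps only peptides carrying lysine +42 with a localization score of at least 0.95, can be illustrated with a small Python sketch. This is not the R code used by PIQED. The row layout, the field names `modified_sequence` and `localization_score`, the `K[+42]` notation for the modified residue, and the example peptides are all assumptions made for illustration.

```python
MIN_LOCALIZATION = 0.95   # minimum localization score stated in the text
ACETYL_TAG = "K[+42]"     # assumed notation for acetylated lysine


def keep_localized(rows, tag=ACETYL_TAG, min_score=MIN_LOCALIZATION):
    """Keep rows whose peptide carries the modification and passes the score cutoff."""
    return [
        row for row in rows
        if tag in row["modified_sequence"]
        and row["localization_score"] >= min_score
    ]


# Toy report rows with made-up sequences, scores, and areas.
report = [
    {"modified_sequence": "AEK[+42]LVR", "localization_score": 0.99, "area": 1200.0},
    {"modified_sequence": "AEK[+42]LVR", "localization_score": 0.80, "area": 300.0},
    {"modified_sequence": "GSDFLTR", "localization_score": 1.00, "area": 900.0},
]

filtered = keep_localized(report)
print(len(filtered), "of", len(report), "rows kept")
```

Only the first toy row survives: the second fails the score cutoff and the third does not contain the modification of interest.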
PIQED ANALYSIS OF URINE DATA FROM Q-EXACTIVE
Eleven instrument .raw files per diagnosis group were downloaded from the set of files available on PeptideAtlas (dataset identifier: PASS00706) published by Muntel et al. [10]. PIQED settings for this dataset were as described above unless noted otherwise here. Because these data were collected without phosphopeptide enrichment, we used more stringent signal extraction and post-quantification filtering parameters than those described in the previous section. The suggested DIA-Umpire signal extraction parameters included with the DIA-Umpire download were used except that CorrThreshold was set to 0.5, DeltaApex was set to 0.1, RTOverlap was set to 0.1, BoostComplementaryIon was set to FALSE, SE.estimateBG was set to TRUE, and SE.MassDefectOffset was set to 0.2. Database searches were performed against the human proteome downloaded from UniProt on December 13th, 2016, and used 15 ppm precursor tolerance. Searches allowed variable phosphorylation of STY and variable oxidation of methionine.
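The six DIA-Umpire overrides listed above could be applied to a parameter template along the following lines. This Python sketch assumes the signal extraction parameters file consists of `key = value` lines with `#` comments. The helper `apply_overrides` is hypothetical, and the template values are placeholders rather than the actual DIA-Umpire defaults. Only the six parameter names and values come from the text.

```python
# Overrides taken from the text; all values are written as strings.
URINE_OVERRIDES = {
    "CorrThreshold": "0.5",
    "DeltaApex": "0.1",
    "RTOverlap": "0.1",
    "BoostComplementaryIon": "FALSE",
    "SE.estimateBG": "TRUE",
    "SE.MassDefectOffset": "0.2",
}


def apply_overrides(template_text, overrides):
    """Rewrite matching 'key = value' lines; append keys not found in the template."""
    remaining = dict(overrides)
    out = []
    for line in template_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            key = stripped.split("=", 1)[0].strip()
            if key in remaining:
                line = f"{key} = {remaining.pop(key)}"
        out.append(line)
    # Any override missing from the template is appended at the end.
    out.extend(f"{key} = {value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


# Placeholder template; these are NOT the real DIA-Umpire default values.
template = "# toy template\nCorrThreshold = 0.0\nRTOverlap = 0.0\n"
params_text = apply_overrides(template, URINE_OVERRIDES)
print(params_text)
```

Keeping the overrides in one dictionary makes the deviations from the suggested parameters easy to audit against the methods text.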
STATISTICS
Acetylated peptide data used for the plots in the main text figure were from three technical replicates corresponding to repeat injections of 0.5X or 1X sample volume from the same sample. The acetylated peptide results presented in figure 1 are representative of 2 repetitions of the experiment. Phosphorylated peptide data come from reanalysis of biological replicates corresponding to urine collected from unique patients, eleven from each diagnosis group (see supplemental reference 10 for more sample details). For statistical details of each individual software program, see their respective publications. Each of the external programs (used as part of our pipeline) contains its own thorough statistical algorithms and significance testing. No additional statistical tests were implemented outside of the constituent programs.
CODE AVAILABILITY
The current PIQED software is available from GitHub (https://github.com/jgmeyerucsd/PIQEDia/releases/tag/v0.1.2) under the GNU General Public License v3.0. Updated and future software versions will be available from https://github.com/jgmeyerucsd/PIQEDia.
SUPPLEMENTAL REFERENCES
1. Kim, S. & Pevzner, P. A. MS-GF+ makes progress towards a universal database search tool for proteomics. Nat Commun 5, (2014).
2. Eng, J. K., Jahan, T. A. & Hoopmann, M. R. Comet: An open-source MS/MS sequence database search tool. PROTEOMICS 13, 22–24 (2013).
3. Craig, R. & Beavis, R. C. TANDEM: matching proteins with tandem mass spectra. Bioinformatics 20, 1466–1467 (2004).
4. Keller, A., Nesvizhskii, A. I., Kolker, E. & Aebersold, R. Empirical Statistical Model To Estimate the Accuracy of Peptide Identifications Made by MS/MS and Database Search. Anal. Chem. 74, 5383–5392 (2002).
5. Shteynberg, D. et al. iProphet: Multi-level Integrative Analysis of Shotgun Proteomic Data Improves Peptide and Protein Identification Rates and Error Estimates. Mol. Cell. Proteomics 10, (2011).
6. Shteynberg, D. et al. PTMProphet: TPP software for validation of modified site locations on post-translationally modified peptides. in 60th ASMS Conference on Mass Spectrometry, Vancouver, BC, Canada 20–24 (2012).
7. Egertson, J. D., MacLean, B., Johnson, R., Xuan, Y. & MacCoss, M. J. Multiplexed peptide analysis using data-independent acquisition and Skyline. Nat Protoc. 10, 887–903 (2015).
8. Teo, G. et al. mapDIA: Preprocessing and statistical analysis of quantitative proteomics data from data independent acquisition mass spectrometry. J. Proteomics 129, 108–120 (2015).
9. Rardin, M. J. et al. SIRT5 regulates the mitochondrial lysine succinylome and metabolic networks. Cell Metab. 18, 920–933 (2013).
10. Muntel, J. et al. Advancing Urinary Protein Biomarker Discovery by Data-Independent Acquisition on a Quadrupole-Orbitrap Mass Spectrometer. J. Proteome Res. 14, 4752–4762 (2015).