Post on 17-Aug-2020

Page 1: Evaluang Methods - Tandy Warnowtandy.cs.illinois.edu/Method-Evaluation.pdf · Method-Evaluation.pptx Author: Tandy Warnow Created Date: 1/3/2017 2:07:09 PM

Evaluating Methods

Tandy Warnow

You’ve designed a new method! Now what?

To evaluate a new method:
•  Establish theoretical properties.
•  Evaluate on data.
•  Compare the new method to other methods.

How do you do this?

General Issues

•  So far we have computed trees and we have computed alignments.
•  How can we quantify accuracy or error? What datasets should we use?
•  What are the issues?

Basic criteria

•  Sensitivity = true positive rate = recall rate = TP/(TP+FN)
•  Precision = positive predictive value (PPV) = TP/(TP+FP)
•  Specificity = true negative rate = TN/(TN+FP)
•  False Discovery Rate = 1 - PPV
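
The four criteria can be written directly as functions of the confusion-matrix counts. The following is a minimal sketch; the function name `basic_criteria` and the counts in the usage line are illustrative assumptions, not part of the slides.

```python
def basic_criteria(tp, fp, tn, fn):
    """Return the four basic criteria from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # A ratio with a zero denominator is undefined; report None.
        return numerator / denominator if denominator else None

    sensitivity = ratio(tp, tp + fn)   # true positive rate, recall
    precision = ratio(tp, tp + fp)     # positive predictive value (PPV)
    specificity = ratio(tn, tn + fp)   # true negative rate
    fdr = None if precision is None else 1 - precision  # 1 - PPV
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "specificity": specificity,
        "false_discovery_rate": fdr,
    }


# Illustrative counts only (not taken from the slides).
criteria = basic_criteria(tp=8, fp=2, tn=85, fn=5)
print(criteria)
```

Returning `None` for an undefined ratio is one possible design choice; raising an exception would be equally reasonable.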

True positives, false positives, etc.

•  For these criteria, we need to understand the concepts of
    –  true positive,
    –  false positive,
    –  true negative, and
    –  false negative
•  In other words, we need to have a “yes/no” classifier.

Simple example: HIV testing

•  Sample space: HIV tests (ELISA)
    –  True positive: the test comes out positive and the person does have HIV
    –  True negative: the test comes out negative and the person does not have HIV
    –  False positive: the test comes out positive but the person does not have HIV
    –  False negative: the test comes out negative but the person does have HIV

Hypothetical Example

•  The population is 1,000 samples
•  10 of them have the disease, 990 do not
•  The test is positive on 20: 9 of the 10 with the disease, and 11 of the 990 who do not have the disease
    –  TP = 9, FP = 11, TN = 979, FN = 1
    –  Sensitivity = TP/(TP+FN) = 9/10 = 90%
    –  Specificity = TN/(TN+FP) = 979/990 = 98.9%
    –  Precision = TP/(TP+FP) = 9/20 = 45%
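
The counts in this example can be reproduced by pairing each sample's true status with its test outcome and tallying the four cases. This is a sketch only; the ordering of the samples in the two lists and the helper name `confusion_counts` are arbitrary choices made for illustration.

```python
# True status and test outcome for the 1,000 samples in the example.
# 10 diseased samples come first, followed by 990 healthy ones.
has_disease = [True] * 10 + [False] * 990
# 9 of the diseased test positive, 1 tests negative;
# 11 of the healthy test positive, 979 test negative.
test_positive = [True] * 9 + [False] * 1 + [True] * 11 + [False] * 979


def confusion_counts(truth, predicted):
    """Tally (TP, FP, TN, FN) over paired truth/prediction labels."""
    tp = fp = tn = fn = 0
    for is_true, is_predicted in zip(truth, predicted):
        if is_predicted and is_true:
            tp += 1
        elif is_predicted and not is_true:
            fp += 1
        elif not is_predicted and not is_true:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


tp, fp, tn, fn = confusion_counts(has_disease, test_positive)
print(tp, fp, tn, fn)                            # 9 11 979 1
print(f"sensitivity = {tp / (tp + fn):.1%}")     # 90.0%
print(f"specificity = {tn / (tn + fp):.1%}")     # 98.9%
print(f"precision   = {tp / (tp + fp):.1%}")     # 45.0%
```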

Hypothetical Example

•  The population is 1,000 samples
•  10 of them have the disease, 990 do not
•  The test is positive on 20: 9 of the 10 with the disease, and 11 of the 990 who do not have the disease
    –  What is the false positive rate?
    –  What is the false negative rate?

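Hypothetical Example

•  The population is 1,000 samples
•  10 of them have the disease, 990 do not
•  The test is positive on 20: 9 of the 10 with the disease, and 11 of the 990 who do not have the disease
    –  What is the false positive rate?
    –  FP rate = # false positives divided by the number of total positives, so FP/(FP+TP) = 11/20 = 55%
    –  Note: this ratio equals the False Discovery Rate (1 - PPV = 1 - 45%). The more common definition of the false positive rate is FP/(FP+TN) = 11/990 ≈ 1.1%, which is 1 - specificity.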

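Hypothetical Example

•  The population is 1,000 samples
•  10 of them have the disease, 990 do not
•  The test is positive on 20: 9 of the 10 with the disease, and 11 of the 990 who do not have the disease
    –  What is the false negative rate?
    –  FN rate = # false negatives divided by the number of total negatives, so FN/(FN+TN) = 1/980 ≈ 0.1%
    –  Note: this ratio is usually called the false omission rate. The more common definition of the false negative rate is FN/(FN+TP) = 1/10 = 10%, which is 1 - sensitivity.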
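
The FP and FN rates on these slides divide by the number of positive and negative test outcomes, whereas the more widely used definitions divide by the number of truly negative and truly positive samples. A short sketch comparing both conventions on the example counts (TP = 9, FP = 11, TN = 979, FN = 1); the variable names are illustrative, and note that FN + TN = 980 because the test is negative on 1,000 - 20 samples.

```python
tp, fp, tn, fn = 9, 11, 979, 1

# Conditioned on the test outcome (the convention used on the slides).
false_discovery_rate = fp / (fp + tp)   # false positives among test positives
false_omission_rate = fn / (fn + tn)    # false negatives among test negatives

# Conditioned on the true status (the more common convention).
false_positive_rate = fp / (fp + tn)    # equals 1 - specificity
false_negative_rate = fn / (fn + tp)    # equals 1 - sensitivity

print(f"FP/(FP+TP) = {false_discovery_rate:.1%}")   # 55.0%
print(f"FP/(FP+TN) = {false_positive_rate:.1%}")    # 1.1%
print(f"FN/(FN+TN) = {false_omission_rate:.1%}")    # 0.1%
print(f"FN/(FN+TP) = {false_negative_rate:.1%}")    # 10.0%
```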

Performance criteria

•  Running time
•  Space
•  Statistical performance issues (e.g., statistical consistency and sequence length requirements)
•  “Topological accuracy” with respect to the underlying true tree, typically studied in simulation.
•  Accuracy with respect to a mathematical score (e.g., tree length or likelihood score) on real data
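
For the first two criteria, running time and space, a rough measurement can be taken from within Python. The sketch below uses the standard-library `time.perf_counter` and `tracemalloc`; the names `profile` and `toy_method` are hypothetical stand-ins for a real method under evaluation. `tracemalloc` only tracks memory allocated by Python and adds overhead, so the timing of a traced run is somewhat inflated.

```python
import time
import tracemalloc


def profile(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def toy_method(n):
    """Stand-in for a method being evaluated: sort n integers."""
    return sorted(range(n, 0, -1))


result, seconds, peak_bytes = profile(toy_method, 100_000)
print(f"wall time: {seconds:.4f} s, peak traced memory: {peak_bytes / 1e6:.1f} MB")
```

For external programs, wall-clock time and peak memory would instead have to come from the operating system, which this sketch does not cover.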

Statistical Consistency

[Figure: plot with axes labeled “error” and “Data”.]

•  FN: false negative (missing edge)
•  FP: false positive (incorrect edge)

[Figure: tree comparison with one edge marked FN and one edge marked FP; 50% error rate.]

Alignment Error/Accuracy

•  SPFN: percentage of homologies in the true alignment that are not recovered (false negative homologies)
•  SPFP: percentage of homologies in the estimated alignment that are false (false positive homologies)
•  TC: total number of columns correctly recovered
•  SP-score: percentage of homologies in the true alignment that are recovered
•  Pairs score: 1 - (avg of SPFN and SPFP)
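
These alignment criteria can be computed by treating each alignment as a set of homologies, that is, pairs of residues placed in the same column. The sketch below makes several assumptions: an alignment is a dict from sequence name to gapped string, gaps are written as "-", scores are returned as fractions rather than percentages, and TC counts every column of the true alignment that appears identically in the estimated one. The toy alignments are invented for illustration.

```python
from itertools import combinations


def columns(alignment):
    """Return each column as a frozenset of (sequence name, residue index)."""
    names = sorted(alignment)
    next_index = {name: 0 for name in names}
    result = []
    for i in range(len(alignment[names[0]])):
        column = []
        for name in names:
            if alignment[name][i] != "-":
                column.append((name, next_index[name]))
                next_index[name] += 1
        result.append(frozenset(column))
    return result


def homologies(alignment):
    """Return the set of residue pairs that share a column."""
    pairs = set()
    for column in columns(alignment):
        pairs.update(combinations(sorted(column), 2))
    return pairs


def alignment_scores(true_aln, est_aln):
    true_pairs, est_pairs = homologies(true_aln), homologies(est_aln)
    shared = len(true_pairs & est_pairs)
    spfn = 1 - shared / len(true_pairs)   # true homologies that were missed
    spfp = 1 - shared / len(est_pairs)    # estimated homologies that are false
    est_columns = set(columns(est_aln))
    tc = sum(1 for column in columns(true_aln) if column in est_columns)
    return {
        "SPFN": spfn,
        "SPFP": spfp,
        "SP-score": 1 - spfn,
        "Pairs score": 1 - (spfn + spfp) / 2,
        "TC": tc,
    }


# Toy alignments of the same three sequences (illustrative only).
true_aln = {"A": "AC-GT", "B": "ACTGT", "C": "A--GT"}
est_aln = {"A": "ACG-T", "B": "ACTGT", "C": "A-G-T"}
scores = alignment_scores(true_aln, est_aln)
print(scores)
```

In this toy case 8 of the 10 true homologies are recovered and 8 of the 10 estimated homologies are correct, so SPFN and SPFP are both 20%, and 3 of the 5 true columns are recovered exactly.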

Other Alignment Estimation Criteria

•  Tree topology error
•  Tree branch length error
•  Gap length distribution
•  Insertion/deletion ratio
•  Alignment length
•  Number of indels

Studying Methods

•  The point is to evaluate a new method in comparison to prior methods.
•  You need to do this on data, not just using theorems.
•  How do you do this?

Benchmarks

•  Simulations: can control everything, and true alignment is not disputed
    –  Different simulators
•  Biological: can’t control anything, and reference alignment and reference tree might not be correct. Alignment benchmarks are also somewhat problematic, for various reasons:
    –  BAliBASE, HomFam, Prefab
    –  CRW (Comparative Ribosomal Website)

Simulation Studies

[Figure 1.6 panels: “True tree and alignment” (leaf labels S1 S2 / S3 S4), “Unaligned Sequences”, and “Estimated tree and alignment” (leaf labels S1 S4 / S3 S2), with the true and estimated results joined by “Compare”.]

True alignment:
S1 = -AGGCTATCACCTGACCTCCA
S2 = TAG-CTATCAC--GACCGC--
S3 = TAG-CT-------GACCGC--
S4 = -------TCAC--GACCGACA

Unaligned sequences:
S1 = AGGCTATCACCTGACCTCCA
S2 = TAGCTATCACGACCGC
S3 = TAGCTGACCGC
S4 = TCACGACCGACA

Estimated alignment:
S1 = -AGGCTATCACCTGACCTCCA
S2 = TAG-CTATCAC--GACCGC--
S3 = TAG-C--T-----GACCGC--
S4 = T---C-A-CGACCGA----CA

Figure 1.6 A simulation study protocol. Sequences are evolved down a model tree under a process that includes insertions and deletions; hence, the true alignment and true tree are known. An alignment and tree are estimated on the generated sequences, and then compared to the true alignment and true tree.

[...] The Robinson-Foulds (RF) distance between two trees is the number of non-trivial bipartitions that are present in one or the other tree but not in both trees.

Each of these ways of quantifying error in an estimated tree can be normalized to produce a proportion between 0 and 1 (equivalently, a percentage between 0 and 100). For example, the FN error rate would be the percentage of the non-trivial model tree bipartitions that are not present in the estimated tree, and the FP error rate would be the percentage of the non-trivial bipartitions in the estimated tree that are not present in the model tree. Finally, the Robinson-Foulds error rate is the RF distance divided by 2n - 6, where n is the number of leaves in the model tree; note that 2n - 6 is the maximum possible RF distance between two trees on the same set of n leaves.

Figure 1.7 provides an example of this comparison; note that the model tree (called the true tree in the figure) is rooted, but the inferred tree is unrooted. To compute the tree error, we unroot the true tree, and treat it only as an unrooted tree. Since both trees are binary (i.e., each non-leaf node has degree three), there are only two internal edges. Each of the two trees has the non-trivial bipartition separating S1, S2 from S3, S4, S5, but each tree also has a bipartition that is not in the other tree. Hence, the RF distance between the two trees is 2, out of a maximum possible of 4, and so the RF error rate is 50%. Note also that there is one true positive edge and one false positive edge in the inferred tree, so that the inferred tree has FN and FP rates of 50%.
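
The FN, FP and RF error rates can be computed from the sets of non-trivial bipartitions of the two trees. The sketch below mirrors the five-leaf example: both trees share the bipartition S1, S2 | S3, S4, S5, and each has one bipartition the other lacks. Because Figure 1.7 is not reproduced here, the two unshared bipartitions used below are hypothetical choices, as are the function names.

```python
def bipartition(side, leaves):
    """Canonical form of a bipartition: the side holding the smallest leaf label."""
    side = frozenset(side)
    other = frozenset(leaves) - side
    return side if min(leaves) in side else other


def tree_error(true_bipartitions, est_bipartitions, n_leaves):
    """FN, FP and RF error for two binary trees on the same n_leaves leaves."""
    fn = len(true_bipartitions - est_bipartitions)   # missing edges
    fp = len(est_bipartitions - true_bipartitions)   # incorrect edges
    rf = fn + fp
    return {
        "FN rate": fn / len(true_bipartitions),
        "FP rate": fp / len(est_bipartitions),
        "RF distance": rf,
        "RF rate": rf / (2 * n_leaves - 6),
    }


leaves = {"S1", "S2", "S3", "S4", "S5"}
# Shared bipartition S1,S2 | S3,S4,S5 plus one hypothetical unshared
# bipartition per tree (the real ones are in Figure 1.7).
true_tree = {bipartition({"S1", "S2"}, leaves), bipartition({"S4", "S5"}, leaves)}
est_tree = {bipartition({"S1", "S2"}, leaves), bipartition({"S3", "S4"}, leaves)}

errors = tree_error(true_tree, est_tree, n_leaves=len(leaves))
print(errors)  # RF distance 2 of a possible 4; FN, FP and RF rates all 0.5
```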

Designing a simulation study

•  Consider the realism of the simulator.
•  Consider whether the conditions are too easy or too difficult to be helpful.
•  Consider the competing methods to explore.
•  Consider statistical significance.
•  Be concerned with repeatability.
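
One possible way to address statistical significance is a paired test over replicate datasets, where both methods are run on the same replicates. The sketch below uses an exact sign-flip permutation test, chosen here for illustration rather than named in the slides; the function name and the per-replicate error values are made-up placeholders, not results from any study.

```python
from itertools import product


def paired_sign_flip_pvalue(errors_a, errors_b):
    """Two-sided exact sign-flip test on paired per-replicate errors."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    # Under the null hypothesis each paired difference is equally likely
    # to have either sign, so enumerate all 2^n sign assignments.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        statistic = abs(sum(s * d for s, d in zip(signs, diffs)))
        if statistic >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Placeholder error rates on 8 replicate datasets (illustrative only).
new_method = [0.10, 0.12, 0.08, 0.11, 0.09, 0.10, 0.07, 0.13]
old_method = [0.14, 0.15, 0.10, 0.13, 0.12, 0.15, 0.09, 0.16]

p_value = paired_sign_flip_pvalue(new_method, old_method)
print(f"p = {p_value:.4f}")
```

Exhaustive enumeration is only practical for a small number of replicates; with many replicates, random sampling of sign assignments or a standard paired test would be the usual alternative.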

Data

•  Biological data:
    –  How reliable are the reference alignments and trees?
•  Simulated data:
    –  How realistic are the simulation conditions?

Simulators

•  Sequence evolution down a tree:
    –  Indels? If so, what lengths?
    –  Substitutions under what model?
    –  How many substitutions? How many indels?
    –  How are the tree topology and set of branch lengths defined?
    –  Is the tree ultrametric?
    –  How many leaves in the tree (i.e., # sequences)?
    –  How long are the sequences?
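
To make these design questions concrete, the following is a deliberately minimal simulator sketch that fixes one answer to each: no indels, substitutions under the Jukes-Cantor model, a hand-specified four-leaf tree, and sequences of length 50. All of these settings, the tree encoding as (name, branch length, children) tuples, and the seed are assumptions for illustration; realistic studies would normally rely on an established simulator.

```python
import math
import random


def evolve(sequence, branch_length, rng):
    """Evolve a sequence along one branch under Jukes-Cantor (no indels)."""
    # Probability that a site differs after branch_length expected
    # substitutions per site under the Jukes-Cantor model.
    p_change = 0.75 * (1 - math.exp(-4 * branch_length / 3))
    out = []
    for base in sequence:
        if rng.random() < p_change:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate(node, parent_sequence, rng, leaves):
    """Recursively evolve sequences down the tree; collect leaf sequences."""
    name, branch_length, children = node
    sequence = evolve(parent_sequence, branch_length, rng)
    if not children:
        leaves[name] = sequence
    for child in children:
        simulate(child, sequence, rng, leaves)
    return leaves


rng = random.Random(2017)  # fixed seed so the run can be repeated
root_sequence = "".join(rng.choice("ACGT") for _ in range(50))
model_tree = ("root", 0.0, [
    ("x", 0.1, [("S1", 0.05, []), ("S2", 0.05, [])]),
    ("y", 0.1, [("S3", 0.2, []), ("S4", 0.2, [])]),
])
leaf_sequences = simulate(model_tree, root_sequence, rng, {})
for name in sorted(leaf_sequences):
    print(name, leaf_sequences[name])
```

Because there are no indels in this sketch, the true alignment is simply the leaf sequences stacked without gaps; adding an indel process is what makes the alignment problem non-trivial.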

Methods

•  Are you picking the best competing methods?
•  Are you running them in the best way?

Criteria

•  Are you using criteria that are considered appropriate by the research community?
•  If you are using new criteria, justify these criteria (and probably use the standard criteria anyway).

Repeatability

•  Provide full details about how you ran the analyses so that the same experiment could be done by the person reading the paper.
•  Save your data and make them available to the readers.

Writing Papers

Read:
•  Appendix C in Computational Phylogenetics for guidelines about writing papers about computational methods.
•  “How to write your first paper” (on my homepage)
•  “Commonly encountered challenges in research ethics” (on my homepage)