EGASP 2005 Evaluation Protocol
Paul Flicek, EBI
EGASP 2005 Evaluations
Basics
The evaluations are probably wrong
GTF is not standard
There are hidden assumptions
  Filters, overlaps, clusters
Terminology varies
  Genes, exons, etc.
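To make the point about GTF and hidden assumptions concrete, here is a minimal sketch of a reader for the common nine-column, tab-separated GTF layout. It is not the parser used for the EGASP evaluation. The function name `parse_gtf_cds`, the choice to keep only `CDS` features, and the decision to reject lines without exactly nine fields are all assumptions made for illustration; each is exactly the kind of filter that can change the resulting scores.

```python
import re
from collections import defaultdict

# Matches attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\w+)\s+"([^"]*)"')

def parse_gtf_cds(lines):
    """Group CDS intervals by (gene_id, transcript_id).

    Assumes nine tab-separated columns per record and quoted
    gene_id / transcript_id attributes (hypothetical, strict reading).
    """
    transcripts = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) != 9:
            raise ValueError(f"expected 9 columns, got {len(fields)}: {line!r}")
        seqname, _source, feature, start, end, _score, strand, _frame, attrs = fields
        if feature != "CDS":
            continue  # hidden assumption: only coding exons are evaluated
        attr = dict(ATTR_RE.findall(attrs))
        key = (attr["gene_id"], attr["transcript_id"])
        transcripts[key].append((seqname, strand, int(start), int(end)))
    # Sort each transcript's exons by coordinate
    return {key: sorted(exons) for key, exons in transcripts.items()}

# Hypothetical records, for illustration only
EXAMPLE_GTF = [
    'chr1\tpred\tCDS\t300\t400\t.\t+\t0\tgene_id "g1"; transcript_id "g1.t1";',
    'chr1\tpred\tCDS\t100\t200\t.\t+\t0\tgene_id "g1"; transcript_id "g1.t1";',
    'chr1\tpred\tstart_codon\t100\t102\t.\t+\t0\tgene_id "g1"; transcript_id "g1.t1";',
]
print(parse_gtf_cds(EXAMPLE_GTF))
```

A file that uses spaces instead of tabs, or unquoted attribute values, would be rejected or silently misread by this sketch, which is one way two evaluators can disagree on the same submission.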
Evaluation Measures
Exons and introns
  Sensitivity (Sn)
  Specificity (Sp)
  Exon length
  Exons per transcript
Transcript
  Sn / Sp
  Overlap
Gene
  Sn / Sp
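The slides do not spell out the formulas for Sn and Sp. The sketch below assumes the usual gene-prediction convention: Sn = TP / (TP + FN), the fraction of annotated features that were predicted exactly, and Sp = TP / (TP + FP), the fraction of predicted features that are annotated (what other fields call precision). The function name `sn_sp` and the exon coordinates are hypothetical.

```python
def sn_sp(annotated, predicted):
    """Return (Sn, Sp) for two collections of hashable features.

    A feature counts as a true positive only if it matches exactly.
    An empty denominator yields 0.0 (an assumed convention).
    """
    annotated, predicted = set(annotated), set(predicted)
    true_positives = len(annotated & predicted)
    sn = true_positives / len(annotated) if annotated else 0.0
    sp = true_positives / len(predicted) if predicted else 0.0
    return sn, sp

# Hypothetical exons as (start, end) pairs on one sequence
annotated_exons = [(100, 200), (300, 400), (500, 600), (700, 800)]
predicted_exons = [(100, 200), (300, 400), (500, 650)]

sn, sp = sn_sp(annotated_exons, predicted_exons)
print(f"Exon Sn = {sn:.2f}, Exon Sp = {sp:.2f}")
```

The same function applies to introns if each intron is represented by its pair of splice-site coordinates.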
Definitions
Positive Transcript
  Correct translation start
  Correct translation stop
  Every splice site correct
Positive Gene
  At least one positive transcript
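These definitions can be sketched as two predicates. This is an illustration under stated assumptions, not the evaluation code itself: a transcript is given as a non-empty list of coding exon intervals `(start, end)` on a single sequence and strand, and the helper names `transcript_signature`, `is_positive_transcript` and `is_positive_gene` are hypothetical.

```python
def transcript_signature(cds_exons):
    """Reduce a coding transcript to the features the definition checks.

    Returns (leftmost coding base, rightmost coding base, splice sites).
    On either strand these two ends are the translation start and stop.
    Assumes a non-empty list of (start, end) intervals.
    """
    exons = sorted(cds_exons)
    splice_sites = tuple(
        (left[1], right[0]) for left, right in zip(exons, exons[1:])
    )
    return exons[0][0], exons[-1][1], splice_sites

def is_positive_transcript(predicted, annotated_transcripts):
    """True if start, stop and every splice site match one annotated transcript."""
    signature = transcript_signature(predicted)
    return any(
        signature == transcript_signature(annotated)
        for annotated in annotated_transcripts
    )

def is_positive_gene(predicted_transcripts, annotated_transcripts):
    """True if at least one predicted transcript is positive."""
    return any(
        is_positive_transcript(predicted, annotated_transcripts)
        for predicted in predicted_transcripts
    )

# Hypothetical annotated gene with two transcripts
annotated = [
    [(100, 200), (300, 400)],
    [(100, 200), (300, 400), (500, 600)],
]
exact = [(100, 200), (300, 400)]
shifted_acceptor = [(100, 200), (310, 400)]

print(is_positive_transcript(exact, annotated))
print(is_positive_transcript(shifted_acceptor, annotated))
print(is_positive_gene([shifted_acceptor, exact], annotated))
```

Note how strict the criterion is: a single splice site that is off by a few bases makes the whole transcript negative, even though most of its exons and nucleotides are correct.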
Examples
[Annotation and prediction diagrams not recovered; the scores shown on each example slide follow]
Example 1: Trans Sn = 0.5, Trans Sp = 1.0; Gene Sn = 1.0, Gene Sp = 1.0
Example 2: Trans Sn = 0.5, Trans Sp = 1.0; Gene Sn = 1.0, Gene Sp = 1.0
Example 3: Trans Sn = 0.0, Trans Sp = 0.0; Gene Sn = 0.0, Gene Sp = 0.0
Example 4: Trans Sn = 1.0, Trans Sp = 1.0; Gene Sn = 1.0, Gene Sp = 1.0
Example 5: Trans Sn = 0.5, Trans Sp = 0.5; Gene Sn = 1.0, Gene Sp = 1.0
Example 6: Trans Sn = 1.0, Trans Sp = 0.67; Gene Sn = 1.0, Gene Sp = 1.0
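Because the diagrams behind these examples are not available, the sketch below uses invented configurations that happen to reproduce some of the listed scores; they are not the configurations drawn on the slides. Transcripts are stood in for by opaque labels (two transcripts with the same label are taken to have identical start, stop and splice sites), and the function name `transcript_and_gene_scores` and the 0/0 = 0.0 convention are assumptions.

```python
def transcript_and_gene_scores(annotated, predicted):
    """Return (Trans Sn, Trans Sp, Gene Sn, Gene Sp), rounded to 2 decimals.

    annotated / predicted: dict mapping a gene name to a set of
    transcript labels. A gene is positive if it contains at least
    one correctly predicted transcript.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    annotated_tx = set().union(*annotated.values())
    predicted_tx = set().union(*predicted.values())
    correct = annotated_tx & predicted_tx

    trans_sn = ratio(len(correct), len(annotated_tx))
    trans_sp = ratio(len(correct), len(predicted_tx))
    gene_sn = ratio(sum(1 for tx in annotated.values() if tx & correct), len(annotated))
    gene_sp = ratio(sum(1 for tx in predicted.values() if tx & correct), len(predicted))
    return tuple(round(x, 2) for x in (trans_sn, trans_sp, gene_sn, gene_sp))

# One annotated gene with two transcripts, one of them predicted
print(transcript_and_gene_scores({"gene": {"A", "B"}}, {"pred": {"A"}}))
# → (0.5, 1.0, 1.0, 1.0)

# Both annotated transcripts predicted, plus one extra prediction
print(transcript_and_gene_scores({"gene": {"A", "B"}}, {"pred": {"A", "B", "C"}}))
# → (1.0, 0.67, 1.0, 1.0)

# No predicted transcript matches
print(transcript_and_gene_scores({"gene": {"A"}}, {"pred": {"X"}}))
# → (0.0, 0.0, 0.0, 0.0)
```

The first two cases show why gene-level scores can stay at 1.0 while transcript-level scores drop: one positive transcript is enough to make the gene positive.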
The winners are… (there are clear trends)
The most successful programs use expressed sequences
Programs using evolutionary conservation are more successful than those that do not
Exon and nucleotide measures are similar
We are improving
EGASP 2005 Evaluations, Block 1
Paul Flicek, EBI
Expressed Sequence Methods
Nucleotide
[Bar chart: Nuc Sn and Nuc Sp, axis 0 to 100, for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3]
Exon
[Bar chart: Exon Sn and Exon Sp, axis 0 to 100, for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3]
Intron
[Bar chart: Intron Sn and Intron Sp, axis 0 to 100, for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3]
Gene
[Bar chart: Gene Sn and Gene Sp, axis 0 to 80, for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3]
Number of Genes
[Bar chart: number of genes, axis 0 to 700, for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3, 9_100_4, 20_76_4, 37_83_4, 42_53_4, 44_97_4, 9_98_2, 27_95_2, 28_44_2, 9_101_1, 20_79_1, 36_46_1, 41_77_1; values printed on the chart: 1027, 1389]
Unique Exons
[Bar chart: unique exons, axis 0 to 5000, for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3, 9_100_4, 20_76_4, 37_83_4, 42_53_4, 44_97_4, 9_98_2, 27_95_2, 28_44_2, 9_104_7, 17_61_7, 9_101_1, 20_79_1, 36_46_1, 41_77_1]
Summary
[Four bar charts for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3: Gene Sn / Gene Sp (axis 0 to 80), Trans Sn / Trans Sp (axis 0 to 80), Nuc Sn / Nuc Sp (axis 0 to 100), Exon Sn / Exon Sp (axis 0 to 100)]
EGASP 2005 Evaluations, Block 2
Paul Flicek, EBI
Evolutionary Conservation (Dual/Multiple Genome) Methods
Summary
[Four bar charts for entries 9_100_4, 20_76_4, 37_83_4, 42_53_4, 44_97_4: Trans Sn / Trans Sp (axis 0 to 40), Gene Sn / Gene Sp (axis 0 to 40), Exon Sn / Exon Sp (axis 0 to 100), Nuc Sn / Nuc Sp (axis 0 to 100)]
EGASP 2005 Evaluations, Block 3a
Paul Flicek, EBI
Ab initio (single genome) and Exon-only Methods
Summary
[Four bar charts for entries 9_98_2, 27_95_2, 28_44_2, 9_104_7, 17_61_7: Gene Sn / Gene Sp (axis 0 to 30), Trans Sn / Trans Sp (axis 0 to 30), Exon Sn / Exon Sp (axis 0 to 100), Nuc Sn / Nuc Sp (axis 0 to 100)]
EGASP 2005 Evaluations, Block 3b
Paul Flicek, EBI
Open (Any) Methods
Summary
[Four bar charts for entries 9_101_1, 20_79_1, 36_46_1, 41_77_1: Nuc Sn / Nuc Sp (axis 0 to 100), Exon Sn / Exon Sp (axis 0 to 100), Trans Sn / Trans Sp (axis 0 to 80), Gene Sn / Gene Sp (axis 0 to 80)]