EGASP 2005 Evaluation Protocol

Paul Flicek, EBI

EGASP 2005 Evaluations

Basics

The evaluations are probably wrong
  GTF is not standard
  There are hidden assumptions
    Filters, overlaps, clusters
  Terminology varies
    Genes, exons, etc.
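
Since GTF is not strictly standardised, any evaluation has to decide how to read each submission. The following is a minimal sketch of a tolerant reader, assuming the widely used nine-column, tab-separated layout with `key "value";` attributes. The names `parse_gtf_line` and `parse_attributes` and the sample line are hypothetical and are not taken from the EGASP tooling.

```python
# Assumed layout: the common nine tab-separated GTF columns.
GTF_COLUMNS = ("seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attributes")


def parse_attributes(text):
    """Split the attribute column into a dict.

    Accepts quoted and unquoted values, because files differ on this point.
    Simplification: values containing ';' are not handled.
    """
    attrs = {}
    for chunk in text.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        key, _, value = chunk.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def parse_gtf_line(line):
    """Parse one feature line; return None for blank or comment lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != len(GTF_COLUMNS):
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GTF_COLUMNS, fields))
    record["start"] = int(record["start"])  # coordinates are 1-based, inclusive
    record["end"] = int(record["end"])
    record["attributes"] = parse_attributes(record["attributes"])
    return record


# Hypothetical sample line, for illustration only.
sample = 'chr1\tprog\tCDS\t100\t200\t.\t+\t0\tgene_id "g1"; transcript_id "g1.t1";'
print(parse_gtf_line(sample)["attributes"])
```

Raising on a wrong column count, rather than silently skipping the line, makes one class of hidden assumption visible early.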

Evaluation Measures

Exons and introns
  Sensitivity (Sn)
  Specificity (Sp)
  Exon length
  Exons per transcript
Transcript
  Sn / Sp
  Overlap
Gene
  Sn / Sp
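
The slide names the measures without giving formulas. In the gene-prediction literature, sensitivity is usually the fraction of annotated features that were predicted, and specificity is the fraction of predicted features that are annotated (what other fields call precision). The sketch below works under that assumption, with exons matched on exact coordinates. The function `sn_sp` and the toy coordinates are illustrative only.

```python
def sn_sp(annotated, predicted):
    """Return (Sn, Sp) for two collections of hashable features.

    Sn = TP / (TP + FN): share of annotated features that were predicted.
    Sp = TP / (TP + FP): share of predicted features that are annotated.
    An empty denominator yields 0.0 (an assumed convention).
    """
    annotated, predicted = set(annotated), set(predicted)
    true_positives = len(annotated & predicted)
    sn = true_positives / len(annotated) if annotated else 0.0
    sp = true_positives / len(predicted) if predicted else 0.0
    return sn, sp


# Toy exons as (start, end) on one strand of one sequence (hypothetical data).
annotated_exons = [(100, 200), (300, 400), (500, 600), (700, 800)]
predicted_exons = [(100, 200), (300, 400), (510, 600)]

exon_sn, exon_sp = sn_sp(annotated_exons, predicted_exons)
print(f"Exon Sn = {exon_sn:.2f}, Exon Sp = {exon_sp:.2f}")
```

In this toy case two of the four annotated exons are recovered exactly and two of the three predictions are correct, so Sn is 0.50 and Sp is 0.67. The third prediction misses its start by ten bases and counts as both a false positive and a false negative.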

Definitions

Positive Transcript
  Correct translation start
  Correct translation stop
  Every splice site correct
Positive Gene
  At least one positive transcript
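
These two definitions can be written as predicates. The sketch below assumes that a transcript is described by its translation start, translation stop and an ordered tuple of splice-site positions, and that "every splice site correct" also rules out extra or missing splice sites. The `Transcript` class and the coordinates are hypothetical. Strand and sequence name are ignored for brevity.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Transcript:
    gene_id: str
    translation_start: int
    translation_stop: int
    splice_sites: Tuple[int, ...]  # ordered donor/acceptor positions


def is_positive_transcript(predicted, annotated_transcripts):
    """True if some annotated transcript has the same start, stop and splice sites."""
    return any(
        predicted.translation_start == a.translation_start
        and predicted.translation_stop == a.translation_stop
        and predicted.splice_sites == a.splice_sites
        for a in annotated_transcripts
    )


def is_positive_gene(predicted_transcripts, annotated_transcripts):
    """True if at least one transcript of the predicted gene is positive."""
    return any(
        is_positive_transcript(p, annotated_transcripts)
        for p in predicted_transcripts
    )


# Hypothetical coordinates, for illustration only.
annotated = [Transcript("A", 100, 900, (200, 300, 400, 500))]
exact = Transcript("P", 100, 900, (200, 300, 400, 500))
shifted = Transcript("P", 100, 900, (200, 300, 410, 500))  # one acceptor off

print(is_positive_transcript(exact, annotated))      # True
print(is_positive_transcript(shifted, annotated))    # False
print(is_positive_gene([shifted, exact], annotated)) # True
```

A single wrong splice site is enough to reject a transcript, while one correct transcript is enough to accept the whole gene, which is why gene-level scores are never lower than transcript-level ones in the examples.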

Examples

[Six example slides, each showing an Annotation track and a Prediction track with the resulting scores; the diagrams are not preserved.]

1. Trans Sn = 0.5, Trans Sp = 1.0; Gene Sn = 1.0, Gene Sp = 1.0
2. Trans Sn = 0.5, Trans Sp = 1.0; Gene Sn = 1.0, Gene Sp = 1.0
3. Trans Sn = 0.0, Trans Sp = 0.0; Gene Sn = 0.0, Gene Sp = 0.0
4. Trans Sn = 1.0, Trans Sp = 1.0; Gene Sn = 1.0, Gene Sp = 1.0
5. Trans Sn = 0.5, Trans Sp = 0.5; Gene Sn = 1.0, Gene Sp = 1.0
6. Trans Sn = 1.0, Trans Sp = 0.67; Gene Sn = 1.0, Gene Sp = 1.0
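
The diagrams behind these scores are lost, but the arithmetic can be illustrated. The sketch below assumes that Gene Sn is the share of annotated genes with at least one correctly predicted transcript, and that Gene Sp is the share of predicted genes with at least one positive transcript. The transcript labels `t1` to `t3` stand in for full transcript structures and are hypothetical. They show one configuration consistent with the last set of values (Trans Sn = 1.0, Trans Sp = 0.67), not necessarily the one on the slide.

```python
def transcript_and_gene_scores(annotation, prediction):
    """Return (Trans Sn, Trans Sp, Gene Sn, Gene Sp).

    annotation / prediction: dict mapping gene id -> set of transcript
    structures (any hashable value; equal values mean an exact match).
    """
    annotated_tx = set().union(*annotation.values())
    predicted_tx = set().union(*prediction.values())
    matched = annotated_tx & predicted_tx

    def ratio(numerator, denominator):
        # Assumed convention: an empty denominator scores 0.0.
        return numerator / denominator if denominator else 0.0

    trans_sn = ratio(len(matched), len(annotated_tx))
    trans_sp = ratio(len(matched), len(predicted_tx))
    # An annotated gene is found if any of its transcripts was predicted.
    found_genes = sum(1 for txs in annotation.values() if txs & predicted_tx)
    # A predicted gene is positive if any of its transcripts is annotated.
    positive_genes = sum(1 for txs in prediction.values() if txs & annotated_tx)
    gene_sn = ratio(found_genes, len(annotation))
    gene_sp = ratio(positive_genes, len(prediction))
    return trans_sn, trans_sp, gene_sn, gene_sp


# One annotated gene with two transcripts; the prediction recovers both
# and adds a third, unannotated transcript (hypothetical configuration).
annotation = {"geneA": {"t1", "t2"}}
prediction = {"pred1": {"t1", "t2", "t3"}}

scores = transcript_and_gene_scores(annotation, prediction)
print([round(s, 2) for s in scores])  # [1.0, 0.67, 1.0, 1.0]
```

Dropping `t2` and `t3` from the prediction gives 0.5, 1.0, 1.0, 1.0, which matches the first set of values under the same assumptions.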

The winners are… (there are clear trends)

The most successful programs use expressed sequences
Programs using evolutionary conservation are more successful than those that do not
Exon and nucleotide measures are similar
We are improving

Spear Catching Time

EGASP 2005 Evaluations, Block 1

Expressed Sequence Methods

Nucleotide

[Bar chart: Nuc Sn and Nuc Sp, axis 0 to 100, for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3; bar values not preserved.]

Exon

[Bar chart: Exon Sn and Exon Sp, axis 0 to 100, for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3; bar values not preserved.]

Intron

[Bar chart: Intron Sn and Intron Sp, axis 0 to 100, for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3; bar values not preserved.]

Gene

[Bar chart: Gene Sn and Gene Sp, axis 0 to 80, for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3; bar values not preserved.]

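Number of Genes

[Bar chart: number of genes per entry, axis 0 to 700, for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3, 9_100_4, 20_76_4, 37_83_4, 42_53_4, 44_97_4, 9_98_2, 27_95_2, 28_44_2, 9_101_1, 20_79_1, 36_46_1, 41_77_1. The values 1027 and 1389 are printed on the chart, apparently for bars exceeding the axis range; other bar values not preserved.]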

Unique Exons

[Bar chart: number of unique exons per entry, axis 0 to 5000, for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3, 9_100_4, 20_76_4, 37_83_4, 42_53_4, 44_97_4, 9_98_2, 27_95_2, 28_44_2, 9_104_7, 17_61_7, 9_101_1, 20_79_1, 36_46_1, 41_77_1; bar values not preserved.]

Summary

[Four bar-chart panels for entries 5_66_3, 8_22_3, 9_99_3, 14_87_3, 20_78_3, 34_55_3: Gene Sn / Gene Sp (axis 0 to 80); Trans Sn / Trans Sp (axis 0 to 80); Nuc Sn / Nuc Sp (axis 0 to 100); Exon Sn / Exon Sp (axis 0 to 100); bar values not preserved.]

EGASP 2005 Evaluations, Block 2

Evolutionary Conservation (Dual/Multiple Genome) Methods

Summary

[Four bar-chart panels for entries 9_100_4, 20_76_4, 37_83_4, 42_53_4, 44_97_4: Trans Sn / Trans Sp (axis 0 to 40); Gene Sn / Gene Sp (axis 0 to 40); Exon Sn / Exon Sp (axis 0 to 100); Nuc Sn / Nuc Sp (axis 0 to 100); bar values not preserved.]

EGASP 2005 Evaluations, Block 3a

Ab initio (single genome) and Exon only Methods

Summary

[Four bar-chart panels for entries 9_98_2, 27_95_2, 28_44_2, 9_104_7, 17_61_7: Gene Sn / Gene Sp (axis 0 to 30); Trans Sn / Trans Sp (axis 0 to 30); Exon Sn / Exon Sp (axis 0 to 100); Nuc Sn / Nuc Sp (axis 0 to 100); bar values not preserved.]

EGASP 2005 Evaluations, Block 3b

Open (Any) Methods

Summary

[Four bar-chart panels for entries 9_101_1, 20_79_1, 36_46_1, 41_77_1: Nuc Sn / Nuc Sp (axis 0 to 100); Exon Sn / Exon Sp (axis 0 to 100); Trans Sn / Trans Sp (axis 0 to 80); Gene Sn / Gene Sp (axis 0 to 80); bar values not preserved.]