Analysis of Similarity Measures between Short Text for the NTCIR-12 Short Text Conversation Task
Kozo Chikai and Yuki Arase
Graduate School of Information Science and Technology, Osaka University
Introduction
Goal: Given an input post, search for appropriate replies in the pool.
Approach: We compare state-of-the-art methods for estimating text similarity to investigate their performance in handling short text, especially in the scenario of short text conversation.
System Design
We implemented the following five methods for Similarity Calculation.
① TF-IDF
② Weighted Text Matrix Factorization (WTMF)
③ Word2vec → TF-IDF
④ Random Forest → TF-IDF
⑤ Random Forest + TF-IDF
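To make method ① concrete, the following is a minimal sketch of TF-IDF similarity used to rank candidate replies. It is not the authors' implementation: the function names (`tfidf_vectors`, `cosine`, `rank_replies`), the smoothed IDF formula, whitespace tokenization, and scoring each candidate directly against the post are all assumptions made for illustration. The actual task data would need a proper tokenizer for its language.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of tokenized documents."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumed variant; other weightings are possible).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_replies(post, pool, top_k=5):
    """Return the top_k (score, reply) pairs from the pool for the given post."""
    docs = [reply.split() for reply in pool]  # assumed whitespace tokenization
    vectors, idf = tfidf_vectors(docs)
    # Terms unseen in the pool carry no weight in the query vector.
    query = {t: tf * idf[t] for t, tf in Counter(post.split()).items() if t in idf}
    scored = [(cosine(query, vec), reply) for vec, reply in zip(vectors, pool)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Calling `rank_replies("nice weather", pool)` on a small hypothetical pool returns the replies sharing the most heavily weighted terms with the post first. Methods ③ to ⑤ in the list above would, under this reading, combine such a score with another model.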
Results
[Figure: Result of experiments. Axis labels: Evaluation Method (ndcg-1, nerr-5, accuracy2-1, accuracy2-5, accuracy12-1, accuracy12-5) and Mean of labels; value axis from 0 to 0.4.]
[Figure: Mean of labels for each rank. Axis labels: Order (rank1 to rank5) and Mean of labels; value axis from 0.3 to 0.9.]
Legend (both figures): Oni-J-R3 (① TF-IDF), Oni-J-R5 (② WTMF), Oni-J-R4 (③ Word2Vec → TF-IDF), Oni-J-R1 (④ Random Forest → TF-IDF), Oni-J-R2 (⑤ Random Forest + TF-IDF)
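The "Mean of labels for each rank" quantity can be sketched as follows. This is an illustrative reading, not the authors' evaluation script: the function name `mean_label_per_rank`, the input layout (one list of relevance labels per input post, ordered by the system's ranking), and the numeric labels in the usage note are assumptions.

```python
def mean_label_per_rank(runs, depth=5):
    """Average the relevance label found at each rank position across posts.

    runs:  list of label lists, one per input post, ordered rank1, rank2, ...
    depth: number of rank positions to report (5 matches rank1 to rank5).
    """
    means = []
    for k in range(depth):
        # Only posts that actually have a reply at this rank contribute.
        labels = [labels_for_post[k] for labels_for_post in runs if len(labels_for_post) > k]
        means.append(sum(labels) / len(labels) if labels else float("nan"))
    return means
```

For example, with two hypothetical posts whose ranked replies are labeled `[2, 1, 0]` and `[0, 1, 2]`, `mean_label_per_rank(runs, depth=3)` averages position by position across the two posts.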