TRANSCRIPT
Tunc Gultekin
Eren Golge
Havva Gulay Gurbuz
Ahmet Iscen
Goal & Contribution
Motivation
Test Environment
New Approach
Error Rate Calculation
Error Rates
Conclusion
There are many evaluation measures; which one should we trust for comparing algorithms?
How do we interpret their results?
Are they stable?
A novel approach for quantifying errors of evaluation measures has been developed.
To compare the performance of different I.R. algorithms, experiments are performed on test collections.
The relative performance of the algorithms is expressed with evaluation measures. ◦ Precision, recall...
Evaluation of these measures relies on some rules of thumb; ◦ Experiments must use reasonable evaluation measures.
◦ Conclusions must be based on reasonable performance differences.
The meaning of «reasonable» varies from person to person.
An objective decision-making system is required.
Precision at a cutoff (P@n), Recall(1000), Precision at 0.5 Recall, R-Precision, and Average Precision were compared.
9 different I.R. algorithms from TREC-8 were used.
21 different query sets were used.
Each of the 21 query sets was run on all 9 I.R. algorithms.
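For reference, here is a minimal sketch of the main compared measures in Python; `ranked_docs` and `relevant_docs` are hypothetical inputs (a ranked result list and the set of judged-relevant documents), and Precision at 0.5 Recall is omitted for brevity.

```python
def precision_at(ranked_docs, relevant_docs, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for d in ranked_docs[:k] if d in relevant_docs) / k

def recall_at(ranked_docs, relevant_docs, k=1000):
    """Fraction of all relevant documents found in the top-k results."""
    return sum(1 for d in ranked_docs[:k] if d in relevant_docs) / len(relevant_docs)

def r_precision(ranked_docs, relevant_docs):
    """Precision at cutoff R, where R is the number of relevant documents."""
    return precision_at(ranked_docs, relevant_docs, len(relevant_docs))

def average_precision(ranked_docs, relevant_docs):
    """Mean of the precision values at the ranks where relevant documents appear,
    normalized by the total number of relevant documents."""
    hits, precision_sum = 0, 0.0
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant_docs:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_docs)
```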
An evaluation measure was chosen and a fuzziness* value was defined.
A query set was selected, and the mean of the evaluation measure over the set was computed for each of the 9 retrieval methods.
*a threshold value that defines whether the difference between two measure values is large enough to be discriminative
For each pair of retrieval methods, the better method was determined.
Another query set was then selected, and the comparisons were repeated for all query sets.
Results are presented in a 9×9 comparison matrix (sketched below).
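A minimal sketch of this procedure, assuming `scores[method]` holds the per-query-set means of the chosen measure for each of the 9 methods; the names, and the relative way the fuzziness threshold is applied, are assumptions.

```python
from itertools import combinations

def build_comparison_matrix(scores, fuzziness=0.05):
    """Count, over all query sets, how often each method beats each other.
    Differences within the fuzziness threshold count as ties (A == B)."""
    methods = list(scores)
    n_qsets = len(next(iter(scores.values())))
    better = {(a, b): 0 for a in methods for b in methods if a != b}
    ties = {frozenset(p): 0 for p in combinations(methods, 2)}
    for q in range(n_qsets):
        for a, b in combinations(methods, 2):
            diff = scores[a][q] - scores[b][q]
            # Assumed detail: a difference is a tie if it is within the
            # fuzziness fraction of the larger of the two scores.
            if abs(diff) <= fuzziness * max(scores[a][q], scores[b][q]):
                ties[frozenset((a, b))] += 1
            elif diff > 0:
                better[(a, b)] += 1
            else:
                better[(b, a)] += 1
    return better, ties
```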
For each cell of the matrix, the greater of the better-than and worse-than counts was accepted as the correct answer, and the lesser one was counted as error.
The lesser values of all cells were summed and divided by the total number of decisions. ◦ Error rate for the Average Precision matrix (36 method pairs × 21 query sets = 756 decisions):
16 / 756 = 0.021 = 2.1%
Error* = min(A>B, B>A) / (A>B + B>A + A==B)
*if the error rate of a comparison matrix approaches ~25%, its discrimination converges to randomness
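Continuing the sketch above, the error rate falls directly out of the matrix counts; the function names are still hypothetical, but the formula is the one given here.

```python
def error_rate(better, ties):
    """Overall error: sum of min(A>B, B>A) over all method pairs,
    divided by the total number of decisions (A>B + B>A + A==B)."""
    errors, total = 0, 0
    for a, b in {frozenset(pair) for pair in better}:
        ab, ba = better[(a, b)], better[(b, a)]
        errors += min(ab, ba)          # minority decisions count as errors
        total += ab + ba + ties[frozenset((a, b))]
    return errors / total  # e.g. 16 / 756 = 0.021 for Average Precision
```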
Error rates of evaluation measures are inversely proportional to the query (topic) set size.
Query sets should be carefully chosen; ◦ otherwise the results may be biased.
Recall(1000) is very stable, but it is appropriate only for limited environments.
Average Precision is a good general-purpose measure.
Precision at a cutoff level is appropriate for web search.