LINKING IMAGES ACROSS TEXT
REBECKA WEEGAR | KALLE ASTROM | PIERRE NUGUES
CS671A Paper Presentation by: Archit Rathore 12152
INTRODUCTION
Caption: a flat landscape with a dry meadow in the foreground, a lagoon behind it and many clouds in the sky
[Figure: annotated image with its caption]
GOAL
•Sky
•Cloud
•Landscape
•Lagoon
•Meadow
To correctly map image entities to caption entities
MOTIVATION
Improve image retrieval by complementing images with accompanying text
Mapping to translate knowledge and information across text and images
Deduce geometric/spatial relations in images using captions
DATASET
•Segmented and Annotated IAPR TC-12 Benchmark data set (Escalante et al., 2010), which consists of about 20,000 photographs with a wide variety of themes
•Each image has a short caption that describes its content, most often consisting of one to three sentences separated by semicolons
•Each image is segmented into regions, and each region is labelled with one of 275 predefined image labels
PREPROCESSING
•The Stanford CoreNLP pipeline is applied to the captions to extract the entities.
•The pipeline consists of a part-of-speech tagger, lemmatizer, named entity recognizer (Finkel et al., 2005), dependency parser, and coreference resolver.
•The cross product of all image entities with all caption entities is taken to get (IE_i, CE_j) pairs
•The test set is built by manually annotating 200 randomly selected images
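The cross-product step can be sketched as follows. This is a minimal illustration, not the authors' code: the image labels come from the GOAL slide, while the caption entity list is a hand-written stand-in for what a lemmatizing pipeline might extract from the example caption.

```python
from itertools import product

# Image entities: the region labels shown on the GOAL slide.
image_entities = ["sky", "cloud", "landscape", "lagoon", "meadow"]

# Caption entities: hypothetical lemmatized nouns from the example caption.
# In the presented system these would come from Stanford CoreNLP.
caption_entities = ["landscape", "meadow", "foreground", "lagoon", "cloud", "sky"]

def candidate_pairs(image_ents, caption_ents):
    """Return every (image entity, caption entity) combination."""
    return list(product(image_ents, caption_ents))

pairs = candidate_pairs(image_entities, caption_entities)
print(len(pairs))   # number of candidate pairs to be ranked
print(pairs[:3])    # first few candidates
```

Every pair is only a candidate at this point; the ranking step decides which pairs are actual links.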
RANKING ENTITY PAIRS
INITIAL RANKING
•Using semantic distance: eight metrics provided by WS4J (PATH, WUP, RES, JCN, HSO, LIN, LCH and LESK)
•Using statistical associations: co-occurrence counts, Pointwise Mutual Information (PMI) and a simplified Student's t-score
•All the pairs are ranked using the above 8+3 scoring functions
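As a rough sketch of the statistical association scores, the snippet below computes PMI and a t-score from raw counts. The t-score uses the common collocation-style simplification (observed minus expected co-occurrences, divided by the square root of the observed count); the slides do not give the exact formula, so treat that form, the base-2 logarithm, and all counts as assumptions.

```python
import math

def pmi(c_xy, c_x, c_y, n):
    """Pointwise mutual information from counts, in bits (log base 2 is a choice)."""
    if c_xy == 0:
        return float("-inf")  # never co-occur: lowest possible score
    return math.log2(c_xy * n / (c_x * c_y))

def t_score(c_xy, c_x, c_y, n):
    """Simplified t-score: (observed - expected) / sqrt(observed). Assumed form."""
    if c_xy == 0:
        return float("-inf")
    expected = c_x * c_y / n
    return (c_xy - expected) / math.sqrt(c_xy)

# Hypothetical counts over n images: how often an image label and a caption
# word occur, and how often they occur together.
n = 1000
counts = {
    ("sky", "sky"): (20, 100, 50),      # (c_xy, c_label, c_word)
    ("sky", "meadow"): (2, 100, 40),
}

# Rank candidate pairs by PMI, best first.
ranked = sorted(counts, key=lambda p: pmi(*counts[p], n), reverse=True)
print(ranked)
```

Any of the scoring functions can be plugged into the same sort to produce its own ranking of the candidate pairs.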
RERANKING
•The spatial features of the images (topographical, horizontal and vertical) are used both on their own and in combination with the corresponding syntactic features of the caption entities
Reranking Example
IMPROVING THE RESULTS
•The highest accuracy, 78.6%, is obtained with the HSO similarity metric
•Further reranking is done by creating an ensemble of the classifiers based on all scoring functions, using a hard voting heuristic
•The number of votes for each classifier was picked from [0,3], and all possible assignments of votes were tested
•The highest accuracy thus obtained was 88.76%
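The voting search can be sketched like this. It is a toy illustration under stated assumptions, not the paper's implementation: three hypothetical classifiers instead of eleven, made-up predictions and gold labels, and ties broken in favour of the label seen first.

```python
from itertools import product

def weighted_hard_vote(predictions, weights):
    """Each classifier casts `weight` votes for its predicted label."""
    tally = {}
    for label, w in zip(predictions, weights):
        if w > 0:
            tally[label] = tally.get(label, 0) + w
    if not tally:
        return None  # every classifier had zero votes
    # max() returns the first label with the top count (assumed tie rule).
    return max(tally, key=tally.get)

def grid_search_votes(all_predictions, gold, max_votes=3):
    """Try every assignment of 0..max_votes votes per classifier."""
    n_classifiers = len(all_predictions[0])
    best_weights, best_acc = None, -1.0
    for weights in product(range(max_votes + 1), repeat=n_classifiers):
        correct = sum(
            weighted_hard_vote(preds, weights) == g
            for preds, g in zip(all_predictions, gold)
        )
        acc = correct / len(gold)
        if acc > best_acc:
            best_weights, best_acc = weights, acc
    return best_weights, best_acc

# Hypothetical predictions of three classifiers for four caption entities.
all_predictions = [
    ("sky", "cloud", "sky"),
    ("meadow", "lagoon", "lagoon"),
    ("cloud", "cloud", "sky"),
    ("lagoon", "meadow", "meadow"),
]
gold = ["sky", "lagoon", "cloud", "meadow"]

weights, accuracy = grid_search_votes(all_predictions, gold)
print(weights, accuracy)
```

With 11 scoring functions and 4 vote values each, the full search covers 4^11 (about 4.2 million) assignments, which is still feasible to enumerate exhaustively.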
RESULTS
FIN
These slides are part of a paper review for the course CS671. All the work is authored by:
Weegar, Rebecka, Kalle Åström, and Pierre Nugues. "Linking Entities Across Images and Text." CoNLL 2015 (2015): 185.
This presentation in no way claims ownership of any of the content in the slides.