determining the importance of figures in journal articles to find representative images

16
Determining the importance of figures in journal articles to find representative images Henning Müller, Antonio Foncubierta-Rodriguez, Chang Lin, Ivan Eggel

Upload: university-of-applied-sciences-western-switzerland

Post on 04-Jun-2015

270 views

Category:

Technology


1 download

DESCRIPTION

When physicians are searching for articles in the medical literature, images of the articles can help determining relevance of the article content for a specific information need. The visual image representation can be an advantage in effectiveness (quality of found articles) and also in efficiency (speed of determining relevance or irrelevance) as many articles can likely be excluded much quicker by looking at a few representative images. In domains such as medical information retrieval, allowing to determine relevance quickly and accurately is an important criterion. This becomes even more important when small interfaces are used as it is frequently the case on mobile phones and tablets to access scientific data whenever information needs arise. In scientific articles many figures are used and particularly in the biomedical literature only a subset may be relevant for determining the relevance of a specific article to an information need. In many cases clinical images can be seen as more important for visual appearance than graphs or histograms that require looking at the context for interpretation. To get a clearer idea of image relevance in articles, a user test with a physician was performed who classified images of biomedical research articles into categories of importance that can subsequently be used to evaluate algorithms that automatically select images as representative examples. The manual sorting of images of 50 journal articles of BioMedCentral with each containing more than 8 figures by importance also allows to derive several rules that determine how to choose images and how to develop algorithms for choosing the most representative images of specific texts. This article describes the user tests and can be a first important step to evaluate automatic tools to select representative images for representing articles and potentially also images in other contexts, for example when representing patient records or other medical concepts when selecting images to represent RadLex terms in tutorials or interactive interfaces for example. This can help to make the image retrieval process more efficient and effective for physicians.

TRANSCRIPT

Page 1: Determining the importance of figures in journal articles to find representative images

Determining the importance of figures

in journal articles to find representative

images

Henning Müller, Antonio

Foncubierta-Rodriguez,

Chang Lin, Ivan Eggel

Page 2: Determining the importance of figures in journal articles to find representative images

Introduction

Images are essential for diagnosis and treatment planning

Diversity of modalities and protocols is growing

Medical images represent largest amount of medical information produced

Medical literature carries much of the medical knowledge

Images are essential for medical articles

Images can help understand content of an article quickly

Much quicker than reading a text to determine relevance

Search in the literature is frequent

2

Page 3: Determining the importance of figures in journal articles to find representative images

Motivation

Image retrieval and classification make medical visual content accessible for search

Images from PACS or from the medical literature (as

in the ImageCLEF image retrieval campaign)

When searching the image needs to be put into context

Text and visual data to understand context

Often more than a single image

Identification of important figures

Use of visual content to determine relevance quickly

Limit search to important figures

3

Page 4: Determining the importance of figures in journal articles to find representative images

Examples for using figure importance

Interfaces and visualization require choices

Mobile information search has small interface

Creation of ground truth and rules for importance ranking to select in the best way

Often the first N are taken

4

Page 5: Determining the importance of figures in journal articles to find representative images

Database used

ImageCLEF 2012 dataset

~75,000 articles of PubMed Central, ~300,000 images

Subset was used

~800 articles contained more than 8 images

Final choice of 50 articles containing 641 diagnostic

images

5

Page 6: Determining the importance of figures in journal articles to find representative images

Types of figure importance

Five categories – sorted by importance

Most important figure – 1 figure that best

describes the article

Essential figures – 0-3 figures

Important figures – 0-3 figures

Little importance figures – 0+ figures

Totally unimportant figures – 0+ figures

6

Page 7: Determining the importance of figures in journal articles to find representative images

User test

User test participant profile:

45-year-old physician, surgeon

Working with medical images on a daily basis

User test scenario:

Searching for relevant articles on the topic of the

text

Which figures contain the most important

information about the article?

Requires to read and understand article

All systematic findings or rules for the ranking were written down

7

Page 8: Determining the importance of figures in journal articles to find representative images

Results

All 50 articles were analyzed and all 641 images manually ranked by importance

Usually over 1 hour per article

Rules and findings were noted and discussed

Statistics on figure importance and places in the text were made

Compound figures were difficult to judge

Information content is high as they contain different

subfigures and links between them

Compound figure separation new sub task in

ImageCLEF 8

Page 9: Determining the importance of figures in journal articles to find representative images

Rules for determining importance

Diagnostic images are more important than graphs/system overviews

Diversity in the results set is important

Not several times the same modality if others exist

Size/resolution matters

Image quality is importance

Position in article

Most importance in the conclusions, then results

Common vs. rare modality

Common modalities are often easier to understand

9

Page 10: Determining the importance of figures in journal articles to find representative images

More rules for the importance

Magnification

Highest magnification most important

MRI for soft tissue diseases, CT for bone related ones

Modality adapted to the disease, can be derived

from RadLex terms

Compound vs. other

Compound figure contains much information

In non-diagnostic images, statistical graphs most important

10

Page 11: Determining the importance of figures in journal articles to find representative images

Results

Distribution of figure importance

11

Page 12: Determining the importance of figures in journal articles to find representative images

Results

Importance vs. position of the figure in the article based on relative position in the text

12

Page 13: Determining the importance of figures in journal articles to find representative images

Example for figure importance

13

Page 14: Determining the importance of figures in journal articles to find representative images

Conclusions

Relative importance of visual content in scientific articles is important for many applications

First study to investigate this in a user test

Can improve IR system performance

Rank images based on position or remove some

images to search a smaller sub space

Can improve visualization

For small interfaces such as mobile phones and

tablets

To combine text and images in an optimal way

14

Page 15: Determining the importance of figures in journal articles to find representative images

Future work

Repeat the same test with a second person to measure inter-rater disagreement

Quantify the subjectivity of the task

Model the rules and optimize an automatic selection strategy

Modality classification to filter out unwanted

images

Clustering and use of cluster centers to increase

diversity

Find the place of figures in the text and rank

based on this 15

Page 16: Determining the importance of figures in journal articles to find representative images

Thank you for your attention!

Further information:

[email protected]

http://medgift.hevs.ch/

http://www.khresmoi.eu/

http://www.mitime.org/width/

16