Addressing the Medical Image Annotation Task using Visual Words Representation
Uri Avni, Tel Aviv University, Israel
Hayit Greenspan, Tel Aviv University, Israel
Jacob Goldberger, Bar Ilan University, Israel
Outline
o Challenge description
o Proposed system
o Image representation
o Classification
o Results
o Parameters optimization
o Performance analysis
o Conclusion
ImageCLEF 2009 medical annotation challenge
12,677 classified x-ray images, 1,733 unknown images
Classification according to four labeling sets:
o 57 classes
o 116 classes
o 116 IRMA codes
o 196 IRMA codes
• Noisy images
• Irregular brightness, contrast
• Non-uniform class distribution
IRMA database
The IRMA group - Aachen University of Technology (RWTH), Germany
[Figure: histogram of training data categories; x-axis: category number (0–120), y-axis: frequency (0–2000)]
• Great intra-class variability
Category #1121-230-961-700: Sagittal, Mediolateral, Left hip
IRMA Database - samples
Category #1121-110-500-000: overview image, posteroanterior (PA)
Category #1123-112-500-000: high beam energy, posteroanterior (PA), expiration
Category #1123-121-500-000: high beam energy, anteroposterior (AP), inspiration
Category #1121-127-500-000: overview image, anteroposterior (AP), supine
IRMA Database - samples
• Great inter-class similarity
Outline
o Challenge description
o Proposed system
o Image representation
o Classification
o Results
o Parameters optimization
o Performance analysis
o Conclusion
Image representation
o Move from a 2D image to a vector of numbers
o Representation should preserve enough information about the image content
o Should not be sensitive to translation, artifacts and noise
o Compare and classify the compact representation
Image model
[Figure: visual-word histogram of an image; x-axis: word number, y-axis: frequency (0–0.04)]
Patch extraction
• Extract raw pixels from patches of fixed size
• Dense sampling, ~200,000 patches per image
• Normalize intensity, variance
• Ignore empty patches
• Sample several images – one collection with millions of patches
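The patch-extraction step above can be sketched in Python with NumPy (a minimal illustration only; the function name, the stride, and the variance threshold for "empty" patches are assumptions, not details from the talk):

```python
import numpy as np

def extract_patches(image, patch_size=9, stride=1, var_threshold=1e-4):
    """Densely sample fixed-size patches from a grayscale image,
    normalize each patch's intensity and variance, and skip
    near-empty (flat) patches. Returns (patch_vector, x, y) tuples."""
    h, w = image.shape
    patches = []
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            p = image[y:y + patch_size, x:x + patch_size].astype(float).ravel()
            var = p.var()
            if var < var_threshold:              # ignore empty patches
                continue
            p = (p - p.mean()) / np.sqrt(var)    # normalize intensity, variance
            patches.append((p, x, y))
    return patches
```

Pooling the patches from several images then yields the collection of millions of patches mentioned above.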
Feature space description
- Reduce the dimension of the collection: 9x9 pixels → PCA → 6 coefficients
- Add position (x, y) to the features; the position weight is important
- 8-dimensional feature vector
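A sketch of this dimensionality reduction, assuming PCA is computed via SVD of the centered patch matrix (the `xy_weight` knob is illustrative; the deck only notes that the position weight matters):

```python
import numpy as np

def build_features(patches, coords, n_components=6, xy_weight=1.0):
    """Reduce 9x9 = 81-dim patch vectors to 6 PCA coefficients and
    append the weighted (x, y) patch position -> 8-dim features."""
    X = np.asarray(patches, dtype=float)              # (N, 81)
    Xc = X - X.mean(axis=0)                           # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False) # PCA via SVD
    coeffs = Xc @ Vt[:n_components].T                 # (N, 6)
    xy = xy_weight * np.asarray(coords, dtype=float)  # (N, 2) positions
    return np.hstack([coeffs, xy])                    # (N, 8)
```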
Dictionary
Build dictionary
• Select k feature vectors as far apart as possible
• Run k-means clustering
• Result: cluster centers, with (x, y)
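The two-step dictionary build above (greedy farthest-point seeding, then k-means) can be sketched as follows; the iteration count and seeding details are assumptions for illustration:

```python
import numpy as np

def farthest_point_init(X, k, rng):
    """Greedily select k feature vectors as far apart as possible."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        # distance of each point to its nearest chosen center
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    return np.array(centers, dtype=float)

def kmeans(X, k, n_iter=20, seed=0):
    """Refine the seeded centers with standard k-means iterations."""
    rng = np.random.default_rng(seed)
    centers = farthest_point_init(X, k, rng)
    for _ in range(n_iter):
        # assign each feature vector to its nearest center
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers
```

The resulting cluster centers are the dictionary "words".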
Image representation
• Scan the image – translate patches to a word histogram
[Figure: image → dictionary → word histogram; x-axis: word number, y-axis: probability]
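Translating patches to words amounts to nearest-center quantization followed by a normalized count; a minimal sketch:

```python
import numpy as np

def word_histogram(features, dictionary):
    """Map each patch feature to its nearest dictionary word and
    return the normalized word histogram (a probability vector)."""
    d = ((features[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    words = d.argmin(axis=1)                     # nearest word per patch
    hist = np.bincount(words, minlength=len(dictionary)).astype(float)
    return hist / hist.sum()
```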
Image representation
• Use multiple scales
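One common way to use multiple scales is to compute a word histogram at each scale and concatenate them; a sketch under that assumption (the `describe` callback standing in for the single-scale pipeline is hypothetical, and block averaging is only an illustrative way to downscale):

```python
import numpy as np

def multiscale_histogram(image, scales, describe):
    """Concatenate word histograms computed at several image scales.
    `describe(img)` returns the single-scale word histogram."""
    hists = []
    img = image.astype(float)
    for s in scales:
        # downscale by block-averaging s x s blocks
        h, w = img.shape[0] // s, img.shape[1] // s
        small = img[:h * s, :w * s].reshape(h, s, w, s).mean(axis=(1, 3))
        hists.append(describe(small))
    return np.concatenate(hists)
```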
Classification
• Examine a kNN classifier, with different distance metrics
• Examine several SVM kernels:
  • Radial basis function: K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2))
  • Chi-square: K(x_i, x_j) = exp(-Σ_n (x_i^n - x_j^n)^2 / (x_i^n + x_j^n))
  • Histogram intersection: K(x_i, x_j) = Σ_n min(x_i^n, x_j^n)
• One-vs-one multiclass SVM classifier, with n(n-1)/2 binary classifiers
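The chi-square and histogram-intersection kernels above can be written directly on word histograms; a minimal sketch (the `gamma` scale and the `eps` guard against empty bins are illustrative additions):

```python
import numpy as np

def chi2_kernel(x, y, gamma=1.0, eps=1e-12):
    """Chi-square kernel on histograms:
    K(x, y) = exp(-gamma * sum_n (x_n - y_n)^2 / (x_n + y_n))."""
    num = (x - y) ** 2
    den = x + y + eps          # eps guards bins that are empty in both
    return np.exp(-gamma * np.sum(num / den))

def hist_intersection_kernel(x, y):
    """Histogram intersection kernel: K(x, y) = sum_n min(x_n, y_n)."""
    return np.minimum(x, y).sum()
```

Either function can be supplied as a precomputed kernel to an SVM library for the one-vs-one multiclass scheme.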
Outline
o Our objective
o Proposed system
o Image representation
o Retrieval & classification
o Results
o Parameters optimization
o Performance analysis
o Conclusion and future work
Selecting classifier type
Effect of histogram distance metric in k-nearest neighbors vs. SVM classifier
[Figure: accuracy of kNN under each distance metric, compared with SVM]
Symmetric Kullback–Leibler divergence:
SKL(P, Q) = Σ_i P_i log(P_i / Q_i) + Σ_i Q_i log(Q_i / P_i)
Jeffrey divergence:
JD(P, Q) = Σ_i P_i log(2 P_i / (P_i + Q_i)) + Σ_i Q_i log(2 Q_i / (P_i + Q_i))
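Both histogram distances are a few lines of NumPy; a sketch, with an illustrative `eps` added so empty histogram bins do not produce log(0):

```python
import numpy as np

def skl_divergence(p, q, eps=1e-12):
    """Symmetric Kullback-Leibler divergence:
    SKL(P, Q) = sum_i P_i log(P_i/Q_i) + sum_i Q_i log(Q_i/P_i)."""
    p = p + eps
    q = q + eps
    return np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p))

def jeffrey_divergence(p, q, eps=1e-12):
    """Jeffrey divergence:
    JD(P, Q) = sum_i P_i log(2P_i/(P_i+Q_i)) + Q_i log(2Q_i/(P_i+Q_i))."""
    p = p + eps
    q = q + eps
    m = p + q
    return np.sum(p * np.log(2 * p / m)) + np.sum(q * np.log(2 * q / m))
```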
Selecting feature space
Effect of parameters on classification accuracy, using 20 cross-validation experiments
[Figure: two panels – % correct vs. (x, y) scale (0–8), with x,y vs. no x,y; and % correct vs. number of PCA components (6–10)]
Selecting features – invariance / discriminative power tradeoff

Feature type        | Average % correct | Standard dev
Raw patches         | 88.43             | 0.32
SIFT*               | 90.80             | 0.41
Normalized patches  | 91.29             | 0.56

* Scale and rotation invariance are not desired
Running time
12,677 images, running on an Intel dual quad-core Xeon 2.33 GHz
[Figure: bar chart of running time in minutes per stage (build dictionary, extract features, train classifier, classify) for Raw vs. SIFT]

Method | Build dictionary | Extract features | Train classifier | Classification time per image | Total (train + classify)
Raw    | 6 min            | 96.8 min         | 6 min            | 0.54 sec                      | 126 min
SIFT   | 10 min           | 597 min          | 6 min            | 3.32 sec                      | 724 min
Selecting dictionary
[Figure: % correct (88.5–92) vs. dictionary size (number of words, 200–1600), for 1 scale vs. 3 scales]
Using multiple dictionaries for 3 scales increases classification accuracy by 0.5%
Classification results – effect of kernel
Effect of kernel function on SVM classifier, for optimal kernel parameters

Kernel type            | % Correct, 1 scale | % Correct, 3 scales
Radial basis           | 91.45              | 91.59
Histogram intersection | 91.29              | 91.89
Chi-square             | 91.62              | 91.95
Classification results – confusion matrix
Confusion matrix of 2000 random test images (2007 labels), 91.95% correct
[Figure: confusion matrix; x-axis: detected category, y-axis: true category]
Submission to ImageCLEF 2009 medical annotation task
o One run submitted
o Use the same classifier for the 4 label sets (2005, 2006, 2007, 2008)
o Ignore IRMA code hierarchy
o Don't use wildcards

Run & error score:
Run       | 2005 | 2006 | 2007 | 2008  | SUM
TAUbiomed | 356  | 263  | 64.3 | 169.5 | 852.8
Conclusion & future work
o Using visual words with simple features and dense sampling is efficient and accurate in general x-ray annotation
o We are applying the system to pathology classification of chest x-rays, together with Sheba Medical Center
Examples: Healthy, Enlarged heart, Lung infiltrate, Left+right effusion
Thank you.