
Mathematical Model for the Law of Comparative Judgment in Print Sample Evaluation

Mai Zhou, Dept. of Statistics, University of Kentucky

Luke C. Cui, Lexmark International Inc.

The Problem: When evaluating several print samples, pair-wise comparison experiments are often used. Two print samples at a time are judged by a human subject to determine which print sample is “better”.

This is repeated with different pairs and different subjects.

The resulting data will look like:

    /    5    4   37    6
    /    /    7   45   28
    /    /    /   46   40
    /    /    /    /    4

How to summarize the data?

Order the print samples in terms of “strength”.

Give a margin of error for the analysis/conclusion.

Predict the outcome of future comparisons.

Outline of talk

Introduction to the Thurstone/Mosteller model

New model, theoretical formulation: Var-Cor modeling, maximum likelihood estimation, likelihood ratio confidence interval

New model, application to experimental data

Comparisons with the classical model: how good is the fit?

Discussion

For pairwise comparisons of stimuli i and k, the observable outcomes are the signs of

$$(T_i - T_k),$$

and the outcomes from different pairs are independent (but within the pair, they may or may not be independent). Assume

$$(T_i - T_k) \sim N(s_i - s_k, \sigma_{ik}^2),$$

where N( , ) denotes the normal distribution.

If we observe the outcomes of many pairs, the log likelihood function is

$$llk = \sum_{pairs} \left[ W_{ik} \log P(win) + L_{ik} \log P(lose) \right],$$

where

$$P(win) = P\big(N(s_i - s_k, \sigma_{ik}^2) > 0\big) = 1 - \Phi\left(\frac{0 - (s_i - s_k)}{\sigma_{ik}}\right),$$

$P(lose) = 1 - P(win)$, and $\Phi(t)$ is the cdf of the standard normal distribution (available in many software packages).

Here $W_{ik}$ (or $L_{ik}$) is the number of times stimulus i is deemed better (or worse) than stimulus k in the pair-wise comparisons.
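The likelihood above can be written down directly in code. The following is a minimal Python sketch using only the standard library; the function names `p_win` and `llk`, the stimulus labels, and all win/loss counts are hypothetical and serve only to illustrate the formula.

```python
from math import log
from statistics import NormalDist

PHI = NormalDist().cdf  # cdf of the standard normal distribution


def p_win(s_i, s_k, sigma_ik):
    """P(T_i - T_k > 0) when T_i - T_k ~ N(s_i - s_k, sigma_ik^2)."""
    return 1.0 - PHI((0.0 - (s_i - s_k)) / sigma_ik)


def llk(s, sigma, counts):
    """Log likelihood summed over all compared pairs.

    s      : dict mapping stimulus label -> strength
    sigma  : function (s_i, s_k) -> standard deviation for that pair
    counts : dict mapping (i, k) -> (W, L), wins and losses of i against k
    """
    total = 0.0
    for (i, k), (w, l) in counts.items():
        p = p_win(s[i], s[k], sigma(s[i], s[k]))
        total += w * log(p) + l * log(1.0 - p)  # P(lose) = 1 - P(win)
    return total


# Hypothetical data: three stimuli, 50 judgments per pair.
counts = {("A", "B"): (30, 20), ("A", "C"): (40, 10), ("B", "C"): (35, 15)}
strengths = {"A": 0.5, "B": 0.2, "C": -0.4}

# Evaluate the log likelihood with a constant standard deviation of 1.
value = llk(strengths, lambda s_i, s_k: 1.0, counts)
print(value)
```

Passing the standard deviation as a function keeps the same `llk` usable for different variance models.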

The classical model assumes

$$\sigma_{ik}^2 = \sigma_i^2 + \sigma_k^2 - 2 r_{ik} \sigma_i \sigma_k.$$

The new model we propose assumes for the variances

$$\sigma_{ik} = f(s_i, s_k).$$

The motivation is that the human perceptual process is highly adaptive and is at its best when used as a null tester, i.e., it is more sensitive for closely matched stimuli. Thus the variances should be related to how closely the strengths are matched, e.g. $\sigma_{ik} = 1 + |s_i - s_k|$.
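To make the two variance assumptions concrete, here is a small Python sketch that compares the resulting win probabilities. The function names are hypothetical, and the scale factor `alpha` is an assumption introduced purely for illustration (with `alpha=1` the function reduces to the example form 1 + |difference in strengths|); the slides do not state how any additional parameter enters the model.

```python
from math import sqrt
from statistics import NormalDist


def sigma_classical(sd_i, sd_k, r_ik):
    """Pair standard deviation from individual sds and their correlation."""
    return sqrt(sd_i ** 2 + sd_k ** 2 - 2.0 * r_ik * sd_i * sd_k)


def sigma_new(s_i, s_k, alpha=1.0):
    """Pair standard deviation growing with the gap in strengths.

    alpha is a hypothetical scale factor, not taken from the slides.
    """
    return 1.0 + alpha * abs(s_i - s_k)


def p_win(s_i, s_k, sigma_ik):
    """P(N(s_i - s_k, sigma_ik^2) > 0)."""
    return 1.0 - NormalDist(mu=s_i - s_k, sigma=sigma_ik).cdf(0.0)


# Two stimuli whose strengths differ by 2 units (made-up values).
s_i, s_k = 2.0, 0.0
p_const = p_win(s_i, s_k, sigma_classical(1.0, 1.0, 0.5))  # sd = 1
p_new = p_win(s_i, s_k, sigma_new(s_i, s_k))               # sd = 3
print(p_const, p_new)
```

For closely matched stimuli the two standard deviations nearly coincide; they diverge as the strengths move apart, which is where the two models predict different win probabilities.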

Computation

Use software such as Splus (commercial), R (GNU), Mathcad (commercial), Matlab (commercial) or SAS (commercial).

1. Define the log likelihood function llk() as a function of the parameters.

2. Maximize the llk() or minimize the negative of llk() by using the optimization functions supplied.

In R the optimization functions are nlm( ) and optim( ).

In SAS/IML we could use the function nlptr( ).
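The two steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the counts are made up, the strength of the first stimulus is fixed at 0 to pin down the location (an assumed identifiability constraint), the classical model with standard deviation 1 is used, and a simple derivative-free coordinate search stands in for a library optimizer such as nlm( ) or optim( ).

```python
from math import log
from statistics import NormalDist

PHI = NormalDist().cdf

# Hypothetical counts: (i, k) -> (W, L), wins and losses of i against k.
COUNTS = {(0, 1): (30, 20), (0, 2): (40, 10), (1, 2): (35, 15)}


def neg_llk(free):
    """Step 1: negative log likelihood as a function of the free parameters.

    free = [s_1, s_2]; s_0 is fixed at 0 (assumed constraint).
    """
    s = [0.0] + list(free)
    total = 0.0
    for (i, k), (w, l) in COUNTS.items():
        # With sigma_ik = 1, P(win) = 1 - PHI(-(s_i - s_k)) = PHI(s_i - s_k).
        p = PHI(s[i] - s[k])
        total += w * log(p) + l * log(1.0 - p)
    return -total


def coordinate_search(f, x0, step=0.5, tol=1e-6):
    """Step 2: minimize f by trying +/- step moves along each coordinate,
    halving the step whenever no move improves the objective."""
    x, fx = list(x0), f(x0)
    while step > tol:
        improved = False
        for j in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[j] += delta
                f_trial = f(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            step /= 2.0
    return x, fx


est, fmin = coordinate_search(neg_llk, [0.0, 0.0])
print("estimated strengths (s_1, s_2):", est)
print("maximized log likelihood (max1):", -fmin)
```

In practice a quasi-Newton routine from the chosen package would replace `coordinate_search`; the search here is kept only so the sketch runs without extra libraries.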

The parameter values that achieve the maximum (max1) are the estimates of the parameters.

A confidence interval for a parameter can be obtained by temporarily fixing the value of that parameter at $t$ and maximizing over the remaining parameters. Suppose the maximum value achieved is max2. Those values of $t$ for which

max1 – max2 < 3.84/2

form the 95% confidence interval for the parameter (3.84 is the 95th percentile of the chi-square distribution with one degree of freedom).
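The likelihood ratio interval described above can be sketched as follows in Python. Everything specific here is an assumption made for illustration: the counts are invented, the first strength is fixed at 0, the classical model with standard deviation 1 is used, and a ternary search (valid because this log likelihood is concave) replaces a general-purpose optimizer.

```python
from math import log
from statistics import NormalDist

PHI = NormalDist().cdf

# Hypothetical counts: (i, k) -> (W, L), wins and losses of i against k.
COUNTS = {(0, 1): (30, 20), (0, 2): (40, 10), (1, 2): (35, 15)}


def llk(s1, s2):
    """Log likelihood with s_0 fixed at 0 and sigma_ik = 1."""
    s = (0.0, s1, s2)
    total = 0.0
    for (i, k), (w, l) in COUNTS.items():
        p = PHI(s[i] - s[k])
        total += w * log(p) + l * log(1.0 - p)
    return total


def argmax_1d(f, lo=-3.0, hi=3.0, iters=80):
    """Ternary search for the maximizer of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2.0


def profile(t):
    """max2: fix s_1 at t and maximize over the remaining parameter s_2."""
    s2_best = argmax_1d(lambda v: llk(t, v))
    return llk(t, s2_best)


# max1: the overall maximum, obtained by also maximizing over t.
t_hat = argmax_1d(profile)
max1 = profile(t_hat)

# Keep every grid value t whose profile likelihood is close enough to max1.
grid = [i / 100.0 for i in range(-200, 201)]
inside = [t for t in grid if max1 - profile(t) < 3.84 / 2.0]
ci = (min(inside), max(inside))
print("estimate of s_1:", t_hat)
print("95% confidence interval for s_1:", ci)
```

The grid resolution (0.01 here) limits the precision of the interval endpoints; a root finder on `max1 - profile(t) - 3.84/2` would give sharper endpoints.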

Example: Colorfulness data

Nine print samples were compared in a pairwise experiment with 50 subjects.

Models fitted are:

1. Classic model with equal variances.

2. New model with $\sigma_{ik} = 1 + |s_i - s_k|$.

[Figure: plot with axis label "Difference in scaled values"; both axes run from 0 to 3.]

Differences: (predicted – observed) Model 1

[Figure: differences plotted against Index (0 to 70); vertical axis from -0.2 to 0.2.]

Differences: (predicted – observed) Model 2 with one more parameter

[Figure: differences plotted against Index (0 to 70); vertical axis from -0.2 to 0.2.]

Differences: (predicted – observed) Model 1 vs 2

We also fitted the Bradley-Terry model to the data (using SAS); the fit is similar to that of the classic model.

References

1. Engeldrum, P. G. Psychometric Scaling: A Toolkit for Imaging Systems Development. Imcotek Press. (2000)

2. Torgerson, W. S. Theory and Methods of Scaling. John Wiley & Sons, Inc. (1958)

3. Bradley, R. A. and Terry, M. E. "Rank analysis of incomplete block designs. I. The method of paired comparisons." Biometrika 39, 324-345. (1952)

4. Hall, P. and La Scala, B. "Methodology and algorithms of empirical likelihood." International Statistical Review 58, 109-127. (1990)

R: http://cran.us.r-project.org

Updated manuscript: http://www.ms.uky.edu/~mai/research/

Acknowledgements: We would like to thank Dr. Shaun Love at Lexmark International Inc. for helpful discussions.
