Mathematical Model for the Law of Comparative Judgment in Print Sample Evaluation
Mai Zhou Dept. of Statistics, University of Kentucky
Luke C. Cui, Lexmark International Inc.
The Problem: When evaluating several print samples, pair-wise comparison experiments are often used. Two print samples at a time are judged by a human subject to determine which print sample is “better”. This is repeated with different pairs and different subjects.

The resulting data will look like:

/ 5 4 37 6
/ / 7 45 28
/ / / 46 40
/ / / / 4
How to summarize the data;
Order the print samples in terms of “strength”;
Give a margin of error for the analysis/conclusion;
Predict the outcome of future comparisons.
Outline of talk
Introduction to the Thurstone/Mosteller model
New model, theoretical formulation
Var-Cor modeling, maximum likelihood estimation, likelihood ratio confidence interval
New model, application to experimental data
Comparisons with classical model: how good is the fit?
Discussion
For pairwise comparisons of stimuli $i$ and $k$, the observable outcomes are the signs of $(T_i - T_k)$, and the outcomes from different pairs are independent (but within the pair, they may or may not be independent). Assume

$$(T_i - T_k) \sim N(s_i - s_k,\ \sigma_{ik}^2)$$

where $N(\cdot, \cdot)$ denotes the normal distribution.
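If we observed the outcomes of many pairs, the log likelihood function is

$$llk = \sum_{\text{pairs}} \left[ W_{ik} \log P(\text{win}) + L_{ik} \log P(\text{lose}) \right]$$

where $P(\text{lose}) = 1 - P(\text{win})$,

$$P(\text{win}) = P\big(N(s_i - s_k,\ \sigma_{ik}^2) > 0\big) = 1 - \Phi\!\left(\frac{0 - (s_i - s_k)}{\sigma_{ik}}\right)$$

and $\Phi(t)$ is the cdf of the standard normal distribution (available in many software packages).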
Here $W_{ik}$ (or $L_{ik}$) is the number of times stimulus $i$ is deemed better (or worse) than stimulus $k$ in the pair-wise comparisons.

The classical model assumes

$$\sigma_{ik}^2 = \sigma_i^2 + \sigma_k^2 - 2 r_{ik} \sigma_i \sigma_k$$

The new model we propose assumes for the variances

$$\sigma_{ik} = f(s_i, s_k)$$
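The log likelihood can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the names `phi`, `p_win`, `llk`, the dictionary layout of the counts, and the counts themselves are all hypothetical, and the default `sigma` simply fixes every pair's standard deviation at 1 as a stand-in for an equal-variance model.

```python
import math

def phi(t):
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def p_win(s_i, s_k, sigma_ik):
    """P(T_i - T_k > 0) when T_i - T_k ~ N(s_i - s_k, sigma_ik^2)."""
    return 1.0 - phi((0.0 - (s_i - s_k)) / sigma_ik)

def llk(s, counts, sigma=lambda s_i, s_k: 1.0):
    """Log likelihood summed over all observed pairs.

    s      : list of strengths, one per stimulus
    counts : dict mapping (i, k) to (W_ik, L_ik)
    sigma  : function giving sigma_ik from the two strengths
             (constant 1 here, an assumption for illustration)
    """
    total = 0.0
    for (i, k), (w, l) in counts.items():
        p = p_win(s[i], s[k], sigma(s[i], s[k]))
        total += w * math.log(p) + l * math.log(1.0 - p)
    return total

# Hypothetical counts for three stimuli, 50 judgments per pair.
counts = {(0, 1): (30, 20), (0, 2): (40, 10), (1, 2): (35, 15)}
print(llk([0.0, 0.0, 0.0], counts))
print(llk([0.0, -0.3, -0.8], counts))
```

Passing `sigma` as a function keeps the likelihood unchanged when a different variance model is plugged in.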
The human perceptual process is highly adaptive and is at its best when used as a null tester, i.e., it is more sensitive for closely matched stimuli. Thus the variances should be related to how closely the strengths are matched, e.g. $\sigma_{ik} = 1 + |s_i - s_k|$.
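A small Python sketch can show how a strength-dependent standard deviation changes the predicted win probability. The function names and the `slope` argument are hypothetical; `slope=1.0` reproduces the form $1 + |s_i - s_k|$, and a free slope is only an assumption about how an extra parameter might enter.

```python
import math

def phi(t):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def sigma_new(s_i, s_k, slope=1.0):
    """Standard deviation that grows with the strength difference.

    slope is a hypothetical parameter; slope=1 gives 1 + |s_i - s_k|.
    """
    return 1.0 + slope * abs(s_i - s_k)

def p_win(s_i, s_k, sigma_ik):
    """P(T_i - T_k > 0) for T_i - T_k ~ N(s_i - s_k, sigma_ik^2)."""
    return phi((s_i - s_k) / sigma_ik)

# Compare a constant sigma of 1 with the strength-dependent sigma.
for d in (0.25, 0.5, 1.0, 2.0):
    constant = p_win(d, 0.0, 1.0)
    adaptive = p_win(d, 0.0, sigma_new(d, 0.0))
    print(f"diff={d:4.2f}  constant sigma: {constant:.3f}  adaptive sigma: {adaptive:.3f}")
```

For closely matched stimuli the two versions nearly agree; as the difference grows, the adaptive version pulls the win probability back toward 0.5. With slope 1 the ratio $d/(1+d)$ stays below 1, so the predicted probability stays below $\Phi(1) \approx 0.84$.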
Computation
Use software such as S-Plus (commercial), R (GNU), Mathcad (commercial), Matlab (commercial), or SAS (commercial).
1. Define the log likelihood function llk() as a function of the parameters.
2. Maximize llk(), or minimize the negative of llk(), using the optimization functions supplied.
In R the optimization functions are nlm() and optim().
In SAS/IML we could use the function nlptr().
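The two steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in the talk: the counts are hypothetical, $s_0$ is fixed at 0 and every $\sigma_{ik}$ at 1 so that the scale is identified, and a crude derivative-free coordinate search stands in for a library optimizer such as nlm() or optim().

```python
import math

def phi(t):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def neg_llk(free, counts):
    """Step 1: negative log likelihood as a function of the free parameters.

    Assumptions: s_0 is fixed at 0 and sigma_ik = 1 for every pair.
    """
    s = [0.0] + list(free)
    total = 0.0
    for (i, k), (w, l) in counts.items():
        p = phi(s[i] - s[k])
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep both logs finite
        total += w * math.log(p) + l * math.log(1.0 - p)
    return -total

def coordinate_search(f, x0, step=0.5, tol=1e-6):
    """Step 2: minimize f by trying +/- step on one coordinate at a time."""
    x = list(x0)
    fx = f(x)
    while step > tol:
        improved = False
        for j in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[j] += delta
                ft = f(trial)
                if ft < fx:
                    x, fx = trial, ft
                    improved = True
                    break
        if not improved:
            step /= 2.0  # no move helped, so refine the step
    return x, fx

# Hypothetical counts for three stimuli, 50 judgments per pair.
counts = {(0, 1): (30, 20), (0, 2): (40, 10), (1, 2): (35, 15)}
est, min_val = coordinate_search(lambda v: neg_llk(v, counts), [0.0, 0.0])
max1 = -min_val
print("estimates of s_1, s_2:", est)
print("maximized log likelihood (max1):", max1)
```

In practice a library optimizer would be preferable; the hand-written search only keeps the sketch free of dependencies.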
The parameter values that achieve the maximum (max1) are the estimates of the parameters.

A confidence interval for a parameter can be obtained by temporarily fixing the value of that parameter at $t$ and maximizing over the remaining parameters. Suppose the maximum value achieved is max2. Those values of $t$ for which

max1 – max2 < 3.84/2

form the 95% confidence interval for the parameter. (3.84 is the 95th percentile of the chi-square distribution with 1 degree of freedom.)
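The likelihood ratio interval can be sketched in Python for a three-stimulus example. Everything specific here is an assumption made for illustration: the counts are hypothetical, $s_0$ is fixed at 0 with $\sigma_{ik} = 1$, the search bounds of ±3 and the grid over $t$ are arbitrary choices, and a golden-section search replaces a library optimizer.

```python
import math

def phi(t):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def llk(s1, s2, counts):
    """Log likelihood with s_0 = 0 and sigma_ik = 1 (assumptions)."""
    s = [0.0, s1, s2]
    total = 0.0
    for (i, k), (w, l) in counts.items():
        p = min(max(phi(s[i] - s[k]), 1e-12), 1.0 - 1e-12)
        total += w * math.log(p) + l * math.log(1.0 - p)
    return total

def golden_max(f, lo, hi, tol=1e-6):
    """Maximize a unimodal function on [lo, hi]; returns (argmax, max)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if f(c) > f(d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    x = 0.5 * (a + b)
    return x, f(x)

def profile(t, counts):
    """max2: fix s_1 at t and maximize over the remaining parameter s_2."""
    return golden_max(lambda s2: llk(t, s2, counts), -3.0, 3.0)[1]

# Hypothetical counts for three stimuli, 50 judgments per pair.
counts = {(0, 1): (30, 20), (0, 2): (40, 10), (1, 2): (35, 15)}

# max1: the overall maximum, found by maximizing the profile over t.
t_hat, max1 = golden_max(lambda t: profile(t, counts), -3.0, 3.0)

# Keep the grid values of t that satisfy max1 - max2 < 3.84/2.
grid = [-2.0 + 0.01 * j for j in range(401)]
inside = [t for t in grid if max1 - profile(t, counts) < 3.84 / 2.0]
print("estimate of s_1:", t_hat)
print("approximate 95% interval:", min(inside), max(inside))
```

The grid resolution (0.01 here) limits how precisely the endpoints are located; a root finder on max1 − max2(t) − 1.92 would sharpen them.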
Example: Colorfulness data
Nine print samples were compared in a pairwise experiment with 50 subjects.
Models fitted are:
1. Classic model with equal variances.
2. New model with $\sigma_{ik} = 1 + |s_i - s_k|$.
Models fitted are: 2. New model

[Figure: plot for the new model; axis label "Difference in scaled values", both axes running from 0 to 3]
Differences: (predicted – observed) Model 1

[Figure: differences for Model 1 plotted against pair index (0 to 70); vertical axis from -0.2 to 0.2]
Differences: (predicted – observed) Model 2 with one more parameter

[Figure: differences for Model 2 plotted against pair index (0 to 70); vertical axis from -0.2 to 0.2]
Differences: (predicted – observed) Model 1 vs 2
We also fit the Bradley-Terry model to the data (using SAS); the fit is similar to that of the classic model.
References
1. Engeldrum, P. G. Psychometric Scaling: A Toolkit for Imaging Systems Development. Imcotek Press (2000).
2. Torgerson, W. S. Theory and Methods of Scaling. John Wiley & Sons, Inc. (1958).
3. Bradley, R. A. and Terry, M. E. "Rank analysis of incomplete block designs. I. The method of paired comparisons." Biometrika 39, 324-345 (1952).
4. Hall, P. and La Scala, B. "Methodology and algorithms of empirical likelihood." International Statistical Review 58, 109-127 (1990).

R: http://cran.us.r-project.org
Updated manuscript: http://www.ms.uky.edu/~mai/research/
Acknowledgements: We would like to thank Dr. Shaun Love at Lexmark International Inc. for helpful discussions.