Prior Distribution Elicitation for Generalized Linear and Piecewise-Linear Models

Paul Garthwaite and Fadlalla Elfadaly

Open University



Why piecewise-linear models?

• Initial motivation for this model came from the need to model ecologists’ opinion about the presence/absence of rare and endangered animals.

• (A good example of where expert opinion is useful: the ecologists had sightings of rare species, but the data were not from a sampling frame and hence were hard to incorporate in a statistical analysis.)

• For most variables there was an optimum value for a species. E.g. too hot or too cold did not suit it; nor too wet or too dry, etc.


Sampling Model

Logistic model: y = ln(p/(1 − p)) = β0 + β1 x1 + … + βk xk.

GLM:

y = g(μ) = β0 + β1 x1 + … + βk xk.

Strategy: Elicit quantiles of p or μ and transform the assessments to quantiles of y.

Prior model: β ~ multivariate normal.
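
The strategy of transforming assessments can be sketched as follows. This is a minimal illustration, assuming a logistic link and three hypothetical quartile assessments of p; the function names are invented for the example and are not taken from any of the programs listed in these slides. It relies only on the fact that a strictly increasing link maps the q-th quantile of p to the q-th quantile of y.

```python
import math


def logit(p: float) -> float:
    """Logistic link: map a probability in (0, 1) to the log-odds scale."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def transform_quantiles(quantiles_p, link=logit):
    """Map elicited quantiles of p to quantiles of y = g(p).

    A strictly increasing link preserves ordering, so each quantile of p
    is carried to the same quantile of y.
    """
    return [link(p) for p in quantiles_p]


# Hypothetical assessments: lower quartile, median, upper quartile of p.
elicited_p = [0.2, 0.3, 0.45]
elicited_y = transform_quantiles(elicited_p)
```

For a GLM with a different link, the same function could be called with that link passed as `link`, provided the link is monotone increasing.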

Three software implementations of the method:

Garthwaite (1998: Visual Basic)

Kynn (2004: Pascal. Elicitor)

Elfadaly, Jenkinson, Garthwaite and Laney (2007/9: JAVA)

(The programs of Garthwaite and Kynn only handle logistic regression.)


Assessments at reference point

Scene-setting questions determine the number of variables and factors, their ranges and also a reference point.

The reference point is chosen by the expert and gives the origin of variables and the reference level of factors.

For a continuous variable it is assumed that opinion about slopes on one side of the reference point is independent of opinion about slopes on the other side.

With the methods of Garthwaite and Elfadaly et al., median, lower and upper quartiles of the response at the reference point are assessed.


Lower and upper quartiles have the advantage that they can be assessed by the method of bisection.

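[Figure: a scale from 0 to 1.0, with 0.3 marked, divided by the lower quartile (L), median (M) and upper quartile (U) into four intervals, each holding 25% of the probability.]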
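
One way the three quartiles at the reference point might be turned into a normal distribution is sketched below. If the reference point is the origin of every variable and the baseline level of every factor, the linear predictor there reduces to β0, so the quartiles of y describe β0. The matching rule used here (mean set to the median, standard deviation set from the interquartile range) is an assumption made for illustration, not necessarily what the programs do, and the numbers are hypothetical.

```python
from statistics import NormalDist


def normal_from_quartiles(lower, median, upper):
    """Return (mean, sd) of a normal distribution matched to three quartiles.

    Assumption (not from the slides): the mean is set to the median and the
    sd is chosen so that the normal interquartile range equals upper - lower.
    Asymmetric quartiles are therefore not reproduced exactly.
    """
    if not lower < median < upper:
        raise ValueError("quartiles must satisfy lower < median < upper")
    z75 = NormalDist().inv_cdf(0.75)  # upper quartile of a standard normal
    sd = (upper - lower) / (2.0 * z75)
    return median, sd


# Hypothetical quartiles of y (log-odds) at the reference point.
mean, sd = normal_from_quartiles(-1.39, -0.85, -0.20)
fitted = NormalDist(mean, sd)
```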


Elicitor is much more flexible.

For assessing the median, some techniques that can be used with logistic regression are available to the expert:

• Visual aids such as a probability wheel can be used.

• Probabilities can be given by first stating a (large) sample size and then assessing the number in that sample with the characteristic of interest.

• Scales marked in odds or log-odds can also be used.

For credible intervals, intervals other than 50% intervals can be specified and a form of fixed interval method is also advocated.


Median Assessments

Medians are assessed for one covariate at a time.

The expert is asked to assume that all other covariates are at their reference values and to consider how the response varies with the covariate of current interest.

The expert clicks on a graph to draw a curve for covariates or a bar chart for factors.

(Varying one covariate at a time is a poor approach to designing experiments but has clear benefits when eliciting expert opinion.)


• The number of knots does not seem crucial.

• Elicitor gives the option of fitting a linear or quadratic function to the medians.

• Garthwaite (1998) gave the option of superimposing graphs to help improve the expert’s internal consistency across covariates.

(In forming models we almost always adopt linear relationships as the building blocks. Elicited piecewise linear relationships could instead be used as the building blocks.)
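
As a rough sketch of how clicked medians could define a piecewise-linear relationship, assume the expert's clicks give median probabilities at a few knots of one covariate, which are moved to the logit scale and joined by straight lines. The knot positions, the probabilities and the function names are hypothetical, and the published software may construct the curve differently.

```python
import math


def logit(p):
    """Log-odds of a probability in (0, 1)."""
    return math.log(p / (1.0 - p))


def segment_slopes(knots_x, medians_y):
    """Slope of each linear segment between consecutive knots."""
    points = list(zip(knots_x, medians_y))
    return [(y1 - y0) / (x1 - x0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]


def piecewise_linear(x, knots_x, medians_y):
    """Evaluate the curve through the knots at x by linear interpolation."""
    if not knots_x[0] <= x <= knots_x[-1]:
        raise ValueError("x lies outside the assessed range")
    points = list(zip(knots_x, medians_y))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical covariate (say, temperature) with an optimum in mid-range.
knots = [5.0, 15.0, 25.0, 35.0]
median_probs = [0.05, 0.30, 0.40, 0.10]
medians = [logit(p) for p in median_probs]
slopes = segment_slopes(knots, medians)
```

With an optimum in the middle of the range, the slopes change sign across the knots, which a single linear term could not represent.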


Feedback

• Feedback is generally beneficial.

• It is useful to display the median estimate at design points other than those where all but one of the covariates are at their reference values.

• Mason (2008) used Elicitor to question an expert about non-random non-response in a longitudinal survey.

• The reference point corresponded to the best response rate. The worst-case setting of the covariates gave a response rate of only 1%. The expert revised his median assessments and the worst-case response rate increased to 9%, which the expert still thought was too low.

• The response rate diminishes rapidly as probabilities are multiplied.

• We intend to add this feedback option to the software.
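
The kind of feedback described above might be computed as in the following sketch. It assumes the additive logistic model with no interactions, so that each covariate's assessed shift from the reference value adds on the log-odds scale. The function names and the numbers are hypothetical and are not Mason's data.

```python
import math


def logit(p):
    """Log-odds of a probability in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(y):
    """Inverse of the logistic link."""
    return 1.0 / (1.0 + math.exp(-y))


def median_at_design_point(p_reference, p_one_at_a_time):
    """Implied median probability when several covariates leave reference.

    p_reference: assessed median with every covariate at its reference value.
    p_one_at_a_time: assessed medians with exactly one covariate moved and
        the rest held at reference (one entry per covariate that is moved).
    Assumes effects add on the logit scale (no interactions).
    """
    y_ref = logit(p_reference)
    y = y_ref + sum(logit(p) - y_ref for p in p_one_at_a_time)
    return inv_logit(y)


# Hypothetical: 80% at the reference point; each of four covariates, moved
# alone to its worst setting, lowers the median to 40%.
worst_case = median_at_design_point(0.80, [0.40, 0.40, 0.40, 0.40])
```

In this made-up case the implied worst-case median is 1/325, about 0.3%, even though no single assessment was below 40%, which illustrates why displaying such combined settings can prompt an expert to revise the one-at-a-time assessments.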


Examples

• O’Leary et al. (2009a) give an example where two experts assessed the probability of presence/absence for the brush-tailed rock-wallaby using Elicitor.

• Only two covariates:

(i) Aspect (northerly vs other)

(ii) Slope (0° – 90°).

• O’Leary et al. (2009b) also give an example where presence/absence for this wallaby is assessed, this time by only one expert but using four different methods, with aspect as the only covariate.

• Data: presence at 41 sites and absence at 9 (rare species? pest?)


Assessments of the two experts (O’Leary et al., 2009a)

Classification rates of four methods (O’Leary et al., 2009b)

Method                Predicted   Observed present   Observed absent
Elicitor              present     41                 9
                      absent      0                  0
Map-method            present     0                  1
                      absent      41                 8
Questionnaire         present     41                 9
                      absent      0                  0
Classification tree   present     35                 1
                      absent      6                  8


Kynn (2004) gives five case studies conducted during the development of Elicitor where ecologists used it to quantify their opinions about an endangered species. Two of the studies had sample data with which to evaluate models.

Ground parrot: 137 presences and 438 pseudo-absences.

80% of the data was used to fit models and 20% for testing.

Two continuous covariates, a factor with three levels and a second factor with four levels.

Three models were considered:

(a) Assessed prior + data

(b) “Relaxed” prior + data

(relaxed: variances were multiplied by 10)

(c) Classical logistic stepwise regression.


Classification rates for ground parrot (Kynn, 2005)

Stepwise does best – presumably variable selection helps. It used just the two continuous variables.

Method                  Predicted   Observed present   Observed absent
Assessed prior + data   present     28                 16
                        absent      1                  48
Relaxed prior + data    present     22                 11
                        absent      7                  53
Frequentist stepwise    present     27                 10
                        absent      2                  54
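
To compare the three methods numerically, the counts in the ground parrot table can be reduced to summary rates. The sketch below is one possible reading: the slides do not say which rates were compared, so sensitivity, specificity and overall accuracy are used here as an assumption, and the function name is illustrative.

```python
def classification_rates(tp, fp, fn, tn):
    """Sensitivity, specificity and overall accuracy from a 2x2 table.

    tp: predicted present, observed present
    fp: predicted present, observed absent
    fn: predicted absent,  observed present
    tn: predicted absent,  observed absent
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, accuracy


# Counts copied from the ground parrot table, in the order (tp, fp, fn, tn).
ground_parrot = {
    "Assessed prior + data": (28, 16, 1, 48),
    "Relaxed prior + data": (22, 11, 7, 53),
    "Frequentist stepwise": (27, 10, 2, 54),
}
rates = {method: classification_rates(*counts)
         for method, counts in ground_parrot.items()}
```

On these counts the stepwise model has the highest overall accuracy (81 of 93 test sites), in line with the remark that stepwise does best.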


Criterion for threshold: minimise $(1 - \text{sensitivity})^2 + (1 - \text{specificity})^2$.
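
A threshold search of this kind could look like the sketch below. It reads the criterion as (1 − sensitivity)² + (1 − specificity)², which is an interpretation of the slide; the fitted probabilities, outcomes and candidate thresholds are hypothetical, as are the function names.

```python
def criterion(sensitivity, specificity):
    """Squared distance from perfect classification (assumed reading)."""
    return (1.0 - sensitivity) ** 2 + (1.0 - specificity) ** 2


def best_threshold(probabilities, observed, candidates):
    """Return (threshold, criterion value) minimising the criterion.

    A site is predicted 'present' when its fitted probability >= threshold.
    observed holds 1 for presence and 0 for absence; both must occur.
    Ties are resolved in favour of the earlier candidate.
    """
    n_present = sum(observed)
    n_absent = len(observed) - n_present
    best = None
    for t in candidates:
        pairs = list(zip(probabilities, observed))
        tp = sum(1 for p, o in pairs if o == 1 and p >= t)
        tn = sum(1 for p, o in pairs if o == 0 and p < t)
        value = criterion(tp / n_present, tn / n_absent)
        if best is None or value < best[1]:
            best = (t, value)
    return best


# Hypothetical fitted probabilities and observed outcomes.
probs = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.90]
obs = [0, 0, 1, 0, 0, 1, 1, 1]
threshold, value = best_threshold(probs, obs, [0.1, 0.3, 0.5, 0.7])
```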


Stemmacantha (a thistle)

203 presences and 2741 absences.

Same three models; 80% of the data for fitting & 20% for testing.

[Photographs: Stemmacantha; ground parrot]


Classification rates for Stemmacantha (Kynn, 2005)

The numbers are inconsistent (the observed totals differ between methods), but there seems little to choose between the methods.

Method                  Predicted   Observed present   Observed absent
Assessed prior + data   present     33                 77
                        absent      11                 457
Relaxed prior + data    present     34                 83
                        absent      6                  461
Frequentist stepwise    present     32                 69
                        absent      15                 468


Garthwaite (1998) and Garthwaite & Al-Awadhi (2006) also quantify the opinion of ecologists about rare species in Queensland.

Central Government wanted State Government to estimate habitat distribution of rare and endangered species. Some sample data were gathered.

The aim was to link the data, ecologists’ knowledge and a GIS database to relate the probability of presence/absence to a large number of covariates.

Preliminary meeting with about a dozen ecologists indicated that non-linear relationships were needed to model their opinion (hence the piecewise linear models).

Little bent-wing bat. (5 variables and 8 factors, giving 57 regression coefficients.) Data: 42 presences in 375 sites.

Plumed frogmouth. (7 variables, 3 factors; 58 parameters.) Data: 31 presences in 324 sites.

Powerful owl. (1 variable, 5 factors; 24 parameters.) Data: 13 presences in 324 sites.

Greater glider. (7 variables, 4 factors; 60 parameters.) Data: 53 presences in 343 sites.

Common bent-wing bat. (4 variables, 7 factors; 59 parameters.) Data: 13 presences in 375 sites.


Various prior distributions were fitted to compensate for systematic biases in the expert’s assessments.

1. (β0 , β1 ,…, βk) multivariate normal.

2. β0 diffuse, (β1 ,…, βk) ~ MVN(b, Σ).

3. θ, β0 diffuse, (β1 ,…, βk) ~ MVN(θb, θ²Σ).

4. γ, θ, β0 diffuse, (β1 ,…, βk) ~ MVN(θb, γΣ).
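
The rescalings in priors 3 and 4 can be illustrated with a small sketch. In the slides θ and γ are given diffuse priors and so are learned from the data; here they are simply fixed numbers, which is a simplification for illustration. The vector b, the matrix Σ and the function names are hypothetical.

```python
def scale_prior(b, sigma, mean_scale, cov_scale):
    """Mean vector and covariance matrix of MVN(mean_scale*b, cov_scale*sigma)."""
    mean = [mean_scale * bi for bi in b]
    cov = [[cov_scale * s for s in row] for row in sigma]
    return mean, cov


def prior3(b, sigma, theta):
    """Prior 3: every elicited slope multiplied by theta.

    If beta ~ MVN(b, Sigma), then theta*beta ~ MVN(theta*b, theta^2 * Sigma).
    """
    return scale_prior(b, sigma, theta, theta ** 2)


def prior4(b, sigma, theta, gamma):
    """Prior 4: mean scaled by theta, covariance scaled separately by gamma."""
    return scale_prior(b, sigma, theta, gamma)


# Hypothetical elicited mean and covariance for two slopes.
b = [2.0, -1.0]
sigma = [[4.0, 1.0], [1.0, 9.0]]
mean3, cov3 = prior3(b, sigma, theta=0.5)
mean4, cov4 = prior4(b, sigma, theta=0.5, gamma=2.0)
```

A value of θ below 1 shrinks every assessed slope towards zero by the same factor, which is one way to compensate for an expert who systematically overstates effects; prior 4 lets the spread be adjusted independently of that shrinkage.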

Cross-validation: Repeatedly using 80% of the data for fitting and 20% for testing.

Squared error loss was used to measure performance.
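
The cross-validation scheme might be organised as in the sketch below. The number of repeats, the use of a summed (rather than averaged) squared error over the test cases, and the toy model that predicts the training proportion are all assumptions for illustration; the slides do not give these details.

```python
import random


def repeated_holdout(y, fit, predict, n_repeats=20, test_fraction=0.2, seed=0):
    """Average squared error loss over repeated random 80/20 splits.

    fit(train_indices) returns a fitted model; predict(model, i) returns the
    predicted probability of presence for observation i.
    """
    rng = random.Random(seed)
    n = len(y)
    n_test = max(1, round(test_fraction * n))
    losses = []
    for _ in range(n_repeats):
        indices = list(range(n))
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        model = fit(train)
        losses.append(sum((y[i] - predict(model, i)) ** 2 for i in test))
    return sum(losses) / n_repeats


def make_mean_model(y):
    """Toy stand-in for a real model: predict the training proportion."""
    def fit(train):
        return sum(y[i] for i in train) / len(train)

    def predict(model, i):
        return model

    return fit, predict


# Hypothetical presence (1) / absence (0) data.
y = [0, 1] * 10
fit, predict = make_mean_model(y)
loss = repeated_holdout(y, fit, predict)
```

In practice `fit` would combine the training data with one of the priors above (or run stepwise regression), and the same splits would be reused for every method so that the losses are comparable.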


Squared error loss:

Method                         Little bent-wing bat   Common bent-wing bat   Plumed frogmouth   Powerful owl   Greater glider
Prior 1                        36.74                  12.75                  28.76              13.61          43.90
Prior 2                        36.87                  12.73                  28.91              13.60          43.94
Prior 3                        36.11                  12.42                  25.99              13.17          42.35
Prior 4                        36.13                  12.75                  28.62              13.62          43.90
Stepwise logistic regression   41.07                  13.70                  30.91              14.68          44.16
Prior: no data                 41.12                  13.67                  29.54              15.07          48.81

Prior 3 (constant term given diffuse prior and all coefficients multiplied by a constant) is the best for each animal – noticeably better for the plumed frogmouth.

The prior with no data is comparable with stepwise regression except for the greater glider. There is quite limited data.


A second example: Air pollution in (Khaldiya) Kuwait City

• Khaldiya had a mobile laboratory station to monitor pollution for one year.

• Focus is on the probability of pollutants exceeding a harmful threshold level.

• There are two permanent fixed laboratory stations: 5 km north-east and 5 km south-west of Khaldiya.

• Aim is to use the data and the opinion of two scientists to relate Khaldiya pollution to the permanent laboratories.

• Pollutants: SO2, NO2 and n-CH4 (non-methane).

• Scientists quantified their opinions separately.

• Variables: pollution levels at the permanent labs, temperature, wind speed, humidity, height of the inversion line.


Method                         Expert A / SO2   Expert A / NO2   Expert B / NO2   Expert A / n-CH4   Expert B / n-CH4
Prior 1                        16.94            16.44            17.97            46.25              48.75
Prior 2                        16.99            16.42            17.98            44.49              48.77
Prior 3                        17.32            16.43            17.97            46.23              48.74
Prior 4                        16.99            16.45            17.95            46.27              48.84
Stepwise logistic regression   18.02            19.71            19.71            46.29              46.29
Prior: no data                 17.87            24.31            27.52            96.71              78.31

Non-methane: the priors seem poor, as the priors with no data do much worse than the other methods; stepwise logistic regression does better than using expert B’s prior but not expert A’s, especially with Prior 2.

For SO2 and NO2, the priors seem better and prior + data does better than stepwise logistic regression. Prior 2 is perhaps the best.


[Photograph, captioned “(Not Kuwait City)”]


A medical application

• The UK National Health Service (NHS) initiated a study to estimate the benefits of current bowel cancer services in England and examine costs and benefits of alternative developments in service provision.

• ScHARR developed a treatment pathway model that gave the possible sequences of presentation, diagnosis, treatment and outcomes that could be followed by a patient with suspected colorectal (bowel) cancer. Available information supplied most of the required numbers but expert opinion filled in gaps.

• The resulting report states, “Owing to a lack of empirical evidence in a number of areas, several of the model parameters and details of the model structure were elicited from experts.”


• For two quantities there were covariates. For these, the new version of the software was used to quantify consultants’ opinions.

• Choice of diagnostic test had level of fitness as a covariate.

• Choice of adjuvant chemotherapy had five covariates (mostly factors): age, tumor location, disease status, perforation/obstruction, and fitness for cytotoxic therapy.

• Results were validated where possible. Commenting on assessments about adjuvant chemotherapy the YHEC-ScHARR report notes that “The [pathways] model uses expert 1’s responses as part of a generalised linear model and is validated by expert 2’s responses.”

• The use of elicitation in the study is reported in Garthwaite, Chilcott, Jenkinson & Tappenden (2008).


• Al-Awadhi & Garthwaite (2006). Computational Statistics, 21, 121-140.

• Garthwaite (1998). Quantifying expert opinion for modelling habitat distributions. Sustainable Forest Management Tech. Report, Queensland Dept. Natural Resources.

• Garthwaite & Al-Awadhi (2006). Tech. Report 06/07. Dept. Statistics, Open University.

• Garthwaite, Chilcott, Jenkinson & Tappenden (2008). Int. J. Technology Assessment in Health Care, 24, 350-357.

• Kynn (2005). Eliciting expert knowledge for Bayesian logistic regression in species habitat modelling in natural resources. PhD thesis. Queensland University of Technology.

• Mason (2008). Methodological developments for combining data. www.Bias-project.org.uk/Papers/CombineDataAJM.pdf.

• O’Leary, Choy, Kynn, Denham, Martin, Mengersen & Murray (2009a). Environmetrics, 20, 379-398.

• O’Leary, Mengersen, Murray & Choy (2009b). Comparison of four expert elicitation methods. 18th World IMACS/MODSIM Congress.
