Model Selection for Selectivity in Fisheries Stock Assessments
André Punt, Felipe Hurtado-Ferro, Athol Whitten
13 March 2013; CAPAM Selectivity workshop
Overview
• What is the problem we want to solve?
• Can selectivity be estimated anyway?
• Fleets and how we choose them
• Example assessments
• Alternative methods:
  – fit diagnostics
  – model selection and model weighting
• What do simulation studies tell us?
• Final thoughts
Definitions of Selectivity
Selectivity: the relative probability of being captured by a fleet (as a function of age / length). Depends on how “fleet” is defined.

Selectivity is NOT:
• Gear selectivity
• Availability
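For concreteness, one standard way selectivity enters an age-structured model (a sketch using the Baranov catch equation with fleet-specific selectivity $S_{a,f}$):

$$ C_{t,a,f} \;=\; \frac{S_{a,f}\,F_{t,f}}{Z_{t,a}}\left(1-e^{-Z_{t,a}}\right)N_{t,a}, \qquad Z_{t,a} \;=\; M_a + \sum_f S_{a,f}\,F_{t,f} $$

The same numbers-at-age $N_{t,a}$ can therefore produce very different catch compositions depending on how the fleets, and hence the $S_{a,f}$, are defined.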
Some of the key questions-I
Should there be multiple fleets and, if so, how do we choose them?
• More fleets (may) make the assumption of time-invariant selectivity more valid.
• More fleets lead to more parameters (and potentially model instability).
Some of the key questions-II
Given a fleet structure:
• What functional form to assume?
• Should selectivity change with time?
• Parametric or non-parametric?
[Figure: an example selectivity-at-age curve (selectivity 0–1 against age 0–20), and a schematic selectivity surface over age and time]
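A minimal sketch of two common answers to the “what functional form” question: an asymptotic (logistic) curve and a dome-shaped curve built from two half-normals. All parameter values are illustrative, not from any assessment.

```python
import numpy as np

def logistic_selectivity(age, a50, a95):
    """Asymptotic selectivity: 50% selected at a50, 95% at a95."""
    return 1.0 / (1.0 + np.exp(-np.log(19.0) * (age - a50) / (a95 - a50)))

def dome_selectivity(age, peak, sd_asc, sd_desc):
    """Dome-shaped selectivity: two half-normals joined at the peak age."""
    sd = np.where(age <= peak, sd_asc, sd_desc)
    return np.exp(-0.5 * ((age - peak) / sd) ** 2)

ages = np.arange(0, 21)
print(np.round(logistic_selectivity(ages, a50=5.0, a95=9.0), 2))
print(np.round(dome_selectivity(ages, peak=8.0, sd_asc=2.5, sd_desc=4.0), 2))
```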
Some of the key questions-III
Given time-varying selectivity:
• Blocked or unblocked?
• Which parameters of the selectivity function (or all) should change?
[Figure: schematic selectivity surfaces over age and time, with the age-at-50% selectivity varying annually vs in five-year blocks]
Caveat – Can selectivity be estimated anyway-I?
Selectivity is confounded with:
• Trends in recruitment (with time)
• Trends in natural mortality (with age / time)
Caveat – Can selectivity be estimated anyway-II?
[Figure: two schematic age compositions. Left: low recruitment? low selectivity? Right (high F): declining recruitment? declining selectivity?]
Caveat – Can selectivity be estimated anyway-III?
Fit of various selectivity-related models to a theoretical age-composition.
[Figure: (a) catch and (b) numbers against year, with curves for a selectivity trend with age, an M trend with age, and a recruitment trend with time]
Caveat – Can selectivity be estimated anyway-IV?
The Solution: MAKE ASSUMPTIONS:
• Natural mortality is time- and age-invariant
• Selectivity follows a functional form
• Selectivity is non-parametric, but there are penalties on changes in selectivity with age / length
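A minimal sketch of the penalty idea in the third bullet: freely estimated selectivity-at-age with a penalty on second differences, so the curve changes smoothly with age. The penalty weight `lam` is an illustrative assumption, not a recommended value.

```python
import numpy as np

def smoothness_penalty(sel, lam=10.0):
    second_diff = np.diff(sel, n=2)        # curvature at each interior age
    return lam * np.sum(second_diff ** 2)  # added to the negative log-likelihood

sel = np.array([0.05, 0.20, 0.50, 0.90, 1.00, 0.95, 0.90, 0.85])
print(smoothness_penalty(sel))
```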
Example Stocks
• Pink ling
• Pacific sardine
Example Stocks (fleet structure)
[Figure: maps of the west coast of North America showing Pacific sardine fleet definitions — (a) 2011 assessment: MexCal and PNW fleets; (b) 2010 assessment: ENS, SCA, CCA and PNW fleets]
Example Stocks (fleet structure)
[Figure: map of southern Australia showing pink ling fishing zones (10–80)]
Pink Ling: one fleet or many?
Fleets:
• Trawl vs Non-trawl
• Zones 10, 20, 30
• Onboard vs port samples
Sensitivity to Assumptions
[Figure: estimated spawning biomass trajectories. (a) Pacific sardine SSB ('000 t), 1995–2010: base case; no time-varying selectivity; MexCal fleets with the same selex; all fleets with the same selex. (b) Pink ling SSB (t), 1970–2010: base case; all trawl mirrored; all mirrored; spatially aggregated; all asymptotic]
Largest impacts:
• Is selectivity time-varying or static?
• Number of fleets / treatment of spatial structure
• Is selectivity asymptotic or dome-shaped?
Selection of Fleets
Definition:
• Ideally – a group of vessels fishing in the same spatio-temporal stratum, using the same gear, and with the same targeting practices
• In practice – depends on data availability, computational resources, model stability, and trends in monitored data
Fleets as areas-I
It is common to represent “space” by “fleets” (e.g. pink ling):
• what does this assume?
• does it work?

Key Assumptions:
• The population is fully mixed over its range
• Differences in age / length compositions are due to differences in selectivity
Fleets as areas-II (does it work?)
In theory “no” – in practice “perhaps”!
Cope and Punt (2011) Fish Res. 107: 22-38
Clearly, the differences in length and age structure among regions are due to differences in population structure, not selectivity! Self-evidently, then, the approach is wrong.

Simulations suggest that treating fleets as areas can reduce bias (Hurtado-Ferro et al.), but that spatial models may perform better (if the data exist – and perhaps not). But M probably isn’t age- and time-invariant either!
The State of the Art (as I see it)
• Disaggregate data when including them in any assessment (it is easy to aggregate the data when fitting the model).
• Test for fleet structure early in the model development process.
• Apply clustering-type methods to combine areas / gear types (not statistical tests, which will lead to 100s of fleets); see the sketch below.
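A minimal sketch of the clustering idea: summarize each candidate stratum (area × gear) by its mean length composition and merge similar strata into fleets. The compositions and the distance cutoff are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

comps = np.array([
    [0.10, 0.30, 0.40, 0.20],   # e.g. trawl, zone 10
    [0.12, 0.28, 0.41, 0.19],   # e.g. trawl, zone 20 (similar -> same fleet)
    [0.40, 0.35, 0.20, 0.05],   # e.g. non-trawl, zone 10 (different)
])

tree = linkage(pdist(comps), method="average")        # hierarchical clustering
fleets = fcluster(tree, t=0.1, criterion="distance")  # cut the tree at a distance
print(fleets)   # strata sharing a label are candidates for a single fleet
```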
Residual Analysis
In principle this is easy:
• Plot the data
• Compute some statistics
• Compare alternative assumptions…

[Figure: EBS Tanner crab example]
• We know how to do this for index data (well)
• It gets trickier for compositional data (and hence selecting functional forms for selectivity)
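A minimal sketch of one such statistic: Pearson residuals for one year of multinomial composition data. Values well outside [-2, 2], or long runs of one sign across ages, point to a mis-specified selectivity form. The observed / expected proportions and sample size below are illustrative.

```python
import numpy as np

def pearson_residuals(obs, exp, n):
    return (obs - exp) / np.sqrt(exp * (1.0 - exp) / n)

obs = np.array([0.10, 0.25, 0.35, 0.20, 0.10])
exp = np.array([0.15, 0.25, 0.30, 0.20, 0.10])
print(np.round(pearson_residuals(obs, exp, n=100), 2))
```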
[Figure: fits to length data for pink ling, aggregated across time by fleet (sexes combined, retained), when selectivity is assumed to be independent of zone. Panels show proportion against length (cm) for the Trawl and NonTrawl fleets by zone and the Kapala survey, each annotated with the input (N) and effective (effN) sample sizes; effN is generally far below N]

[Figure: a second set of fits under an alternative configuration, in which the (reweighted) input and effective sample sizes are in much closer agreement]
BUT!
• Evaluating mis-specification for compositional data is usually not this easy:
  – The fit may be correct “on average” but there are clear problems.
  – It may not be clear whether the model is mis-specified.
Is this acceptable? And this?
BUT!
• Evaluating mis-specification for compositional data is usually not this easy:
  – The fit may be correct “on average” but there are clear problems.
  – It may not be clear whether the model is mis-specified.
• Comparing time-varying and static selectivity can be even more challenging because it depends on how much selectivity can vary [Maunder and Harley identify an approach based on cross-validation to help with this; a toy illustration follows].
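A toy, self-contained illustration of the cross-validation idea (not Maunder and Harley's actual implementation): leave out one year of composition data at a time, predict it from the remaining years, and score the prediction by its multinomial log density (constants dropped). The data are simulated and the "static" predictor is a stand-in for a fitted model's expected compositions.

```python
import numpy as np

rng = np.random.default_rng(1)
true_p = np.array([0.2, 0.3, 0.3, 0.2])
comps = rng.multinomial(100, true_p, size=10) / 100.0   # 10 years of comps

def log_pred_density(p_hat, obs, n=100):
    counts = np.round(obs * n)
    return np.sum(counts * np.log(np.clip(p_hat, 1e-8, 1.0)))

score = 0.0
for year in range(comps.shape[0]):              # leave-one-year-out
    train = np.delete(comps, year, axis=0)
    p_hat = train.mean(axis=0)                  # "static selectivity" stand-in
    score += log_pred_density(p_hat, comps[year])
print(score)   # repeat for a time-varying model and compare the scores
```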
Using profiles to identify mis-specification
[Figure: profiles of the difference in log-likelihood against logR0, by fleet — (a) spatially-disaggregated, (b) spatially-aggregated]
Plot the negative log-likelihood [compositional data only] for each fleet to identify fleets whose compositional data are “unduly” informative.

Note fleets 2 and 13 (left) and 2 and 5 (right): fleet 13 (a) and fleet 5 (b) are the same fleet and have only two length-frequencies… Should we learn this much?
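A minimal sketch of the bookkeeping behind such a plot. In practice each row of `profile_nll` comes from refitting the assessment with logR0 fixed at a grid value and recording each fleet's compositional negative log-likelihood; the random array below is a stand-in, and the 2.0-unit flag is an illustrative threshold.

```python
import numpy as np

logR0_grid = np.linspace(0.5, 1.5, 9)
rng = np.random.default_rng(2)
profile_nll = np.cumsum(rng.normal(0.0, 1.0, size=(9, 5)), axis=0)  # stand-in

rel = profile_nll - profile_nll.min(axis=0)   # each fleet relative to its minimum
influential = rel.max(axis=0) > 2.0           # fleets that move the profile a lot
print(np.where(influential)[0])               # candidate fleets to re-examine
```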
Automatic Residual Analysis
Punt & Kinzey: NPFMC crab modelling workshop

Two-sample Kolmogorov-Smirnov test applied to artificial data sets.
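A minimal sketch of the test: compare the distribution of residuals from the fit to the real data against residuals from fits to data sets simulated under the model. Both residual vectors below are random stand-ins.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
resid_real = rng.normal(0.3, 1.2, size=200)    # stand-in: fit to real data
resid_sim = rng.normal(0.0, 1.0, size=2000)    # stand-in: fits to simulated data

stat, pvalue = ks_2samp(resid_real, resid_sim)
print(stat, pvalue)   # a small p-value flags possible mis-specification
```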
The State of the Art (as I see it)-I
• Always:
  – examine plots of residuals
  – compare expected effective sample sizes with input values
• But:
  – Viewing plots of residuals can be difficult
  – How to define / test for time-varying selectivity is tough
  – Residual patterns in fits to compositions need not be due to choices related to selectivity
  – There is no automatic approach for evaluating residual plots for compositional data
  – No testing of methods based on residual plots has occurred (yet?)
The State of the Art (as I see it)-II
[Figures: aggregated compositions; observed vs expected compositions]
Model Selection

No one would say that model selection (and model averaging) are not part of the tool box of analysts, BUT do we know how well they work for stock assessment models?

Model selection methods used:
• Maximum likelihood: F-tests / likelihood ratio tests; AIC, BIC, AICc
• Bayesian: DIC
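A minimal sketch of the criteria above, computed from a maximized log-likelihood lnL, parameter count k, and sample size n. For compositional data the appropriate n is itself the effective-sample-size question raised later; the numbers are made up (e.g. static vs time-varying selectivity).

```python
import numpy as np

def aic(lnL, k):
    return 2 * k - 2 * lnL

def aicc(lnL, k, n):
    return aic(lnL, k) + 2 * k * (k + 1) / (n - k - 1)

def bic(lnL, k, n):
    return k * np.log(n) - 2 * lnL

static = (-1250.0, 35, 400)   # (lnL, k, n), static selectivity (made up)
tv     = (-1230.0, 60, 400)   # time-varying selectivity (made up)
for (lnL, k, n), label in [(static, "static"), (tv, "time-varying")]:
    print(label, aic(lnL, k), aicc(lnL, k, n), bic(lnL, k, n))
```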
Examples of Model Selection
• AIC:
  – Butterworth et al. [2003]: is selectivity for southern bluefin tuna time-varying?
  – Butterworth & Rademeyer [2008]: is selectivity for Gulf of Maine cod dome-shaped or asymptotic?
• DIC:
  – Bogards et al. [2009]: is selectivity for the North Sea spatially-varying or not?
Examples of Model Selection (Issues)
• AIC, BIC and DIC are too subtle:
  – Often fits for two models are negligibly different “by eye”, but highly “statistically significant” (ΔAIC > 200).
• All these metrics depend on getting the likelihood “right”, in particular the effective sample sizes for the compositional data.
Model Selection and weights
[Figure: scatterplot of example data with candidate model fits]

So which model fits the data best? And if we accidentally copied the data file twice?
Effective Sample Sizes-I
Many assessments:
• Pre-specify effective sample sizes (EffNs)
• Use the “McAllister-Ianelli” approach

But residuals are seldom independent.

An alternative is Chris Francis’ approach, but that may fail when there is time-varying selectivity.
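A minimal sketch of the "McAllister-Ianelli" effective sample size for one year of composition data: the ratio of the variance expected under multinomial sampling to the observed squared residuals. Inputs are illustrative.

```python
import numpy as np

def mcallister_ianelli_effn(obs, exp):
    return np.sum(exp * (1.0 - exp)) / np.sum((obs - exp) ** 2)

obs = np.array([0.10, 0.25, 0.35, 0.20, 0.10])
exp = np.array([0.15, 0.25, 0.30, 0.20, 0.10])
print(mcallister_ianelli_effn(obs, exp))   # compare against the input N
```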
Effective Sample Sizes-II
• Maunder [2011] compared various likelihood formulations including:
  – Multinomial
  – Fournier et al. (with observed rather than expected proportions)
  – Punt-Kennedy (with observed proportions)*
  – Dirichlet
  – Iterative (essentially the “McAllister-Ianelli” method)
  – Multivariate normal
* Punt-Kennedy form:

$$ \ln L(\text{Data}\,|\,\theta) \;=\; -\sum_{t,a}\left[\ln \sigma_{t,a} \;+\; \frac{\left(\ln P_{t,a} - \ln \hat{P}_{t,a}\right)^2}{2\sigma_{t,a}^2}\right], \qquad \sigma_{t,a}^2 \propto 1/P_{t,a} $$

(the variance parameter is estimable, yielding an estimated effective sample size)
AIC, BIC and Random Effects
Most (almost all) assessments use an “errors in variables” formulation of the likelihood function:

$$ \hat{\theta}, \hat{\eta} \;=\; \underset{\theta,\eta}{\operatorname{argmax}}\; L(D\,|\,\theta,\eta)\,P(\eta\,|\,\theta) $$

rather than the correct (marginal) likelihood:

$$ \hat{\theta} \;=\; \underset{\theta}{\operatorname{argmax}} \int L(D\,|\,\theta,\eta)\,P(\eta\,|\,\theta)\,d\eta $$

How this impacts the performance of model selection methods is unknown.
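For reference, one standard route to that marginal likelihood for random effects is the Laplace approximation (used, e.g., in ADMB's random-effects module); a sketch:

$$ \ln \int L(D\,|\,\theta,\eta)\,P(\eta\,|\,\theta)\,d\eta \;\approx\; \ln L(D\,|\,\theta,\hat{\eta}) + \ln P(\hat{\eta}\,|\,\theta) + \tfrac{1}{2}\ln\det\left(2\pi H^{-1}\right) $$

where $\hat{\eta}$ maximizes the joint log-likelihood for fixed $\theta$, and $H$ is the negative Hessian of the joint log-likelihood with respect to $\eta$ at $\hat{\eta}$.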
The State of the Art (as I see it)

• AIC, BIC, and DIC are commonly used.
• But:
  – Do we need an analogue to the “1% rule”, as is the case for CPUE standardization?
  – We need to get the effective sample sizes right! Using a likelihood function for which the effective sample size can be estimated is a good start!
  – Performance also depends on the treatment of random effects (recruitment, selectivity).
• What is the value of looking at retrospective patterns? Can we identify when the cause of a retrospective pattern is definitely selectivity?
Simulation Testing
[Diagram: operating models generate data; assessment methods 1…n are applied, with and without model selection, and performance measures are computed for each]
Simulation Testing
• Caveats before we start:
  – Simulations are only as good as the operating model:
    • Most simulation studies assume that the likelihood function is known (as is M)
    • Few simulation studies allow for over-dispersion
    • No simulation studies simulate the “meta” aspects of stock assessments (such as how fleets are selected)
  – Avoid too many generalizations – most properties of estimators will be case-specific
Overdispersion?

How often do the data generated in simulation studies look like this?
How much does it matter?
Overview of Broad Results
• Getting selectivity assumptions wrong matters! HOWEVER, other factors (data quality, contrast, M) may be MORE important.
• Estimating time-varying selectivity when selectivity is static is safer than ignoring it when selectivity is time-varying.
• Model selection methods can discriminate among selectivity functions very well (do I really believe this – why then does it seem so hard in reality?)
The State of the Art (as I see it)
• The structure of most (perhaps all) operating models is too simple and leads to simulated data sets looking “too good”.
  – André’s suggestion: if you show someone 99 simulated data sets and the real data set, could they pick it out?
• Future simulation studies should:
  – Include model and fleet selection.
  – Focus on length-structured models.
  – Examine whether selectivity is length- or age-based.
Final Thoughts
• Methods development
  – Non-additive models?
  – State-space models?
• Residuals and model selection
  – Weighting philosophy
• Simulation studies
  – Standards for what constitutes a “decent” operating model?
  – Compare methods for implementing time-varying selectivity (blocked vs annual)
  – Consider length-structured models
Final Thoughts
• Ignore “space” at your peril!
• What about model mis-specification in general?
Final Points to Ponder!
• Should guidelines be developed for when to:
  – downweight compositional data rather than modelling time-varying selectivity
  – fix selectivity and not estimate it!
  – use retrospective patterns in model selection / bootstrapping
  – conduct model selection when the selectivity pattern is “non-parametric”
  – apply time-varying selectivity
  – trump AIC, BIC and DIC using “by eye” residual patterns
• Model selection
• Fixing / estimating sigma
Questions?
Support for this paper was provided by NOAA:
• The West Coast Groundfish project
• Development of ADMB libraries
• Simulation testing of assessment models