

Chapter 7

Geostatistics and Their Applications to Fisheries

Survey Data: A History of Ideas, 1990–2007

Pierre Petitgas

7.1 Introduction

Geostatistical applications to fisheries survey data were first developed to estimate population abundance and its precision for survey designs such as the systematic design, where sample points are not independent of each other (Gohin 1985, Conan 1985, Petitgas 1993a, 1996). The International Council for the Exploration of the Sea (ICES) and its Fish Technology Committee (FTC) played a major role in promoting the application of geostatistics to fisheries by organising workshops (ICES 1989, 1992, 1993). Centre de Geostatistique (Fontainebleau) was also active and organised in 1992 a course for fisheries scientists, which was advertised by ICES (Armstrong et al. 1992). The course explained geostatistical modeling and provided illustrative fisheries case studies in acoustic and egg surveys. Petitgas and Lafont (1997) produced software specifically dedicated to the geostatistical estimation of global fish abundance and its precision for a variety of survey designs. Based on the experience gained, Rivoirard et al. (2000) presented the theory and a variety of demonstrative fisheries applications to acoustic and trawl surveys, and provided many guidelines for the application of geostatistics to fisheries survey data. Petitgas (2001) reviewed concepts in geostatistics and statistics as well as tools for estimating population abundance with different survey designs. Geostatistical applications have now flourished not only in the field of fisheries survey-based abundance estimation but also in marine science in general (e.g., method assumptions and ecological characteristics: Rossi et al. 1992; distribution of invertebrates: Rufino et al. 2006; variograms inside schools: Gerlotto et al. 2006; interpolation in a predator-prey space: Bulgakova et al. 2001). Methods and tools are now widely documented, available and used. It seems useful at this time to assemble the history of ideas and formulate challenging questions for the future.

Theoretical foundations of geostatistics can be found in Matheron (1971), Journel and Huijbregts (1978) and more recently in Chilès and Delfiner (1999).

P. Petitgas (*)
IFREMER, Department Ecology and Models for Fisheries, BP. 21105, 44311 cdx 9, Nantes, France

B.A. Megrey, E. Moksness (eds.), Computers in Fisheries Research, 2nd ed., DOI 10.1007/978-1-4020-8636-6_7, © Springer Science+Business Media B.V. 2009


Rivoirard et al. (2000) provide a useful guide to the geostatistical textbook literature. The purpose of geostatistics (Matheron 1971) is to model spatial variability in a variable of interest and use this model for the estimation in space of that variable or a function of it. The objective of the estimation can be mapping the variable (interpolation), estimating its mean over an area, performing a change of scale, or mapping the probability of passing a threshold. The methodology provides the estimation variance (i.e., the variance of the estimation error) for any type of survey design. Further, the geostatistical spatial structural model contains ecological information characterising aggregation patterns. Since the structural analysis is the cornerstone of geostatistics, the method is of interest both to the fisheries assessment scientist and the marine ecologist.

Geostatistical applications for the evaluation of marine resources have developed to consider more complex structural models than the variogram and more complex survey designs than random or systematic designs. They have also considered solutions to outliers, as well as the use of geostatistical simulations. Ecological applications have mainly dealt with ways to characterise spatial fish aggregation patterns. The spatial variographic structure has been used to characterise schooling aggregative patterns, density dependence in spatial organisation, border effects, or the geometry of the area. Tools other than the variogram have been used to characterise the spatial pattern and its link to covariates, including the inertio-gram, the D2-variogram or a point process approach. Indicators of spatial patterns have also been developed to monitor spatial distributions of fish stocks as an element of the ecosystem approach to fisheries. Each topic in this chapter is reviewed, with comments that focus on the concepts. For full theoretical descriptions, the reader is referred to the literature and in particular to Rivoirard et al. (2000) or Petitgas (1996, 2001).

7.2 Abundance Estimation and Mapping

7.2.1 Geostatistical Concepts and Basic Geostatistics

7.2.1.1 Random Functions

Geostatistics is applied in two steps (Matheron 1971). The first step is the structural analysis, in which a model is chosen and fitted that interprets the underlying spatial continuity in the data. The second step is that of estimation, which involves using the model to derive estimates of the variable and their estimation variances. The mathematical framework (Matheron 1971) is that of random functions: the sampled values are interpreted as the outcome of one realisation of a random function $Z$ ($Z(x_1), Z(x_2), \ldots, Z(x_n), \ldots$) within a defined domain. The structural model (e.g., the variogram) applies to the random function $Z$, not the particular realisation sampled. Inference is possible by making stationarity assumptions, which can be made at various spatial scales, e.g., for all distances in the domain (strict stationarity) or only at short distances (quasi-stationarity). The random function model is then $Z(x) = E[Z(x)] + Y(x)$, where the expectation $E$ is taken over all realisations. $E[Z(x)]$ is the drift (or trend), which due to its definition is not necessarily a smooth surface in space. $Y(x)$ are the residuals, which have some degree of stationarity in space. The random function is a mathematical representation. Thus the practitioner will usually prefer to estimate quantities of the sampled realisation (values at unsampled locations or means over blocks) rather than of the random function (drift values). Geostatistics allows the estimation of quantities of the sampled realisation as well as of the drift. In contrast, classical statistical theory (e.g., linear models) only allows the estimation of the drift values, i.e., those of the random function (Matheron 1989, Petitgas 2001).

The variogram $\gamma$ is the structural tool. It is defined as half the variance of increments of $Z$ between pairs of points separated by vector distance $h$ (Matheron 1971): $\gamma(h) = 0.5\, E[(Z(x) - Z(x+h))^2]$. The expectation $E$ is taken over all realisations of the random function. The intrinsic model of random function to which the variogram applies is more general than the strict stationary model: the increments $Z(x) - Z(x+h)$ are supposed to have zero mean and a stationary semi-variance (the variogram) depending on $h$ only. The inference for the variogram is possible by assuming a certain degree of stationarity in space, which allows one to estimate the expectation over different realisations by spatial averaging. Quasi-stationarity is sufficient in general, which in practice applies to distances smaller than a third of the sampled domain (Journel and Huijbregts 1978).
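As a brief worked illustration under an added assumption: if the random function is taken to be second-order stationary, with a constant mean and a covariance $C(h)$ depending on $h$ only (a stronger hypothesis than the intrinsic model), the variogram and the covariance are linked as sketched below. The symbol $C$ is introduced here for the illustration.

```latex
\begin{aligned}
\gamma(h) &= \tfrac{1}{2}\,E\big[(Z(x) - Z(x+h))^2\big] \\
          &= \tfrac{1}{2}\,\big(\mathrm{var}[Z(x)] + \mathrm{var}[Z(x+h)]\big)
             - \mathrm{cov}[Z(x), Z(x+h)] \\
          &= C(0) - C(h)
\end{aligned}
```

Under this assumption the variogram levels off at the value $C(0)$ once $C(h)$ vanishes, whereas intrinsic models such as the power model have a variogram but no covariance and no sill.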

7.2.1.2 Variances

When estimating the mean over a domain $v$ by the simple average of point values $Z(x_i)$, $Z_v^* = \frac{1}{n}\sum_{i=1}^{n} Z(x_i)$, the geostatistical measure of precision is the estimation variance. It is written as a function of the variogram only (Matheron 1971): $\sigma_E^2 = \mathrm{var}[Z_v - Z_v^*] = 2\bar\gamma(v,x) - \bar\gamma(v,v) - \bar\gamma(x,x)$, where $\bar\gamma(v,v)$ is the mean variogram value for all distances in $v$ (model dispersion variance in $v$), $\bar\gamma(x,x)$ is the mean variogram value for all distances between the sample points $x$ used in the estimation (sample dispersion variance) and $\bar\gamma(v,x)$ is the average variogram value for all distances between each sample point $x$ and all points of the domain $v$. The estimation variance depends on the geometry of the domain, the position of the samples relative to each other, and the position of the samples relative to the domain. The more continuous the spatial structure and the tighter the sampling, the smaller the estimation variance. It is noteworthy that because the estimation variance depends on the variogram and the configuration of the sampling, sampling schemes can be compared and optimised based on the variogram.
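The estimation variance formula above, 2*gbar(v,x) - gbar(v,v) - gbar(x,x), can be evaluated numerically by discretising the domain. The following is a minimal sketch and not the EVA software: the unit-square domain, the spherical variogram parameters and the two 16-point designs are all hypothetical choices made for the illustration.

```python
import math

def spherical(h, sill=1.0, rng=0.3):
    """Spherical variogram without nugget (sill and range are illustrative)."""
    if h >= rng:
        return sill
    return sill * (1.5 * h / rng - 0.5 * (h / rng) ** 3)

def mean_gamma(pts_a, pts_b, gamma):
    """Mean variogram value over all pairs of points taken from pts_a and pts_b."""
    total = 0.0
    for ax, ay in pts_a:
        for bx, by in pts_b:
            total += gamma(math.hypot(ax - bx, ay - by))
    return total / (len(pts_a) * len(pts_b))

def estimation_variance(samples, domain, gamma):
    """2*gbar(v,x) - gbar(v,v) - gbar(x,x) for the simple average of the samples."""
    return (2.0 * mean_gamma(domain, samples, gamma)
            - mean_gamma(domain, domain, gamma)
            - mean_gamma(samples, samples, gamma))

# Domain v: unit square approximated by the centres of a 20 x 20 grid of cells.
m = 20
domain = [((i + 0.5) / m, (j + 0.5) / m) for i in range(m) for j in range(m)]

# Design 1: regular 4 x 4 grid covering the domain.
regular = [((i + 0.5) / 4, (j + 0.5) / 4) for i in range(4) for j in range(4)]
# Design 2: the same number of points clustered in one corner of the domain.
clustered = [((i + 0.5) / 16, (j + 0.5) / 16) for i in range(4) for j in range(4)]

var_regular = estimation_variance(regular, domain, spherical)
var_clustered = estimation_variance(clustered, domain, spherical)
print("regular design:  ", var_regular)
print("clustered design:", var_clustered)
```

Both designs use the same variogram and the same number of samples, so the difference in estimation variance comes only from the sampling configuration, which is the sense in which designs can be compared on the basis of the variogram.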


Matheron (1971) defined two types of variances, the dispersion variance and the estimation variance. The dispersion variance over domain $v$ is the variance of the random function over the domain and equals $\bar\gamma(v,v)$. It can be estimated by using the variogram, $\bar\gamma(v,v)$, or alternatively with $n$ random samples, $\frac{1}{n-1}\sum_i (z_i - \bar z)^2$. The dispersion variance is the classical variance, but the spatial domain on which it applies is explicitly defined. The estimation variance is the variance of the error between the (true) value of the random function and its estimate, $\sigma_E^2 = \mathrm{var}[Z_v - Z_v^*]$, where the variance is taken over all realisations of the random function. The estimation variance differs conceptually from the variance of the estimate, $\mathrm{var}[Z_v^*]$, which is the classical measure of precision in statistics. Petitgas (2001) further discussed differences in estimation quantities when using geostatistics and generalised linear models (GLM: McCullagh and Nelder 1995), as GLMs have been used for estimation purposes in fisheries.

Geostatistical estimates of variance are model-based. In contrast to sampling theory (e.g., Cochran 1977), where variance estimates are design-based and require randomising the sample locations, here the sample locations may be fixed. Matheron (1989) further discussed the passage from randomisation of samples to the use of random functions (see also Petitgas 2001). When sample points are not positioned independently from each other and when the population sampled is spatially structured, the estimation of any variance requires a model of the spatial correlation in the population (Cochran 1977, Matheron 1971). Thus geostatistics solves the problem of the estimation of variance for survey designs that are not random, in particular grids of points as in ichthyoplankton surveys, or parallel or zig-zag transects as in acoustic surveys (ICES 1993).

7.2.1.3 Kriging

Kriging is a linear estimation procedure (Matheron 1971) that is unbiased and of minimum variance. By kriging one can estimate either point values (point kriging), mean values over blocks (block kriging) or the mean value over the entire domain (kriging the mean). Kriging provides not only estimates but also their estimation variance. Suppose we want to estimate the mean over block $v$ centred on $x_0$ by a linear combination of sample values known at points $x_i$: $Z_v^* = \sum_{i \in \nu} \lambda_i Z(x_i)$, where $\nu$ is the neighbourhood in which $n$ samples are considered. The estimation variance is written: $\sigma_E^2 = 2\sum_i \lambda_i \bar\gamma(v, x_i) - \bar\gamma(v,v) - \sum_i \sum_j \lambda_i \lambda_j \gamma(x_i, x_j)$. The kriging weights are those that minimise the estimation variance (named kriging variance at the minimum). The minimisation can be done under the constraint that the kriging weights sum to unity, $\sum_{i \in \nu} \lambda_i = 1$, which ensures that the estimate is unbiased. The estimation is then done with only those linear combinations of samples that filter a constant mean, which can remain unknown. In practice the constraint results in keeping the estimate close to the neighbourhood mean of the samples. Such an estimation procedure is called ordinary block kriging with a moving neighbourhood and is widely used for mapping purposes. A quasi-stationary variogram model is sufficient, in which stationarity of the mean and variance applies only for those distances involved in the neighbourhood.
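Minimising the estimation variance under the constraint that the weights sum to one leads to a small linear system. The sketch below solves it for the simplest case, ordinary point kriging at a single target location, using one Lagrange parameter `mu`; block kriging would replace the right-hand side by block-averaged variogram values. The sample coordinates, values, variogram parameters and the helper `solve` are hypothetical and serve only the illustration.

```python
import math

def spherical(h, sill=1.0, rng=15.0):
    """Spherical variogram without nugget (parameters are illustrative)."""
    if h >= rng:
        return sill
    return sill * (1.5 * h / rng - 0.5 * (h / rng) ** 3)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging(points, values, target, gamma):
    """Return (estimate, kriging variance, weights) at one target point."""
    n = len(points)
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    # Left-hand side: variogram between samples, bordered by the unbiasedness constraint.
    a = [[gamma(dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Right-hand side: variogram between each sample and the target, then the constraint.
    b = [gamma(dist(p, target)) for p in points] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]
    estimate = sum(w * z for w, z in zip(weights, values))
    # Point kriging variance: sum_i lambda_i * gamma(x_i, x_0) + mu, since gamma(0) = 0.
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance, weights

points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, 5.0)]
values = [2.0, 4.0, 1.0, 3.0, 5.0]
est, var, w = ordinary_kriging(points, values, (4.0, 6.0), spherical)
print("estimate:", est, "kriging variance:", var, "sum of weights:", sum(w))
```

A moving neighbourhood would simply restrict `points` and `values` to the samples closest to each target before calling the function.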

7.2.1.4 Comparing and Optimising Survey Designs

A software tool (EVA: Petitgas and Lafont 1997) was specifically designed to calculate the estimation variance of the global mean estimate for a variety of sampling schemes used in fisheries surveys. Assuming that survey designs other than the one performed would yield a similar variogram, different sampling schemes can be compared based on the estimation variance formula of the global mean estimate. In an illustrative example, Petitgas (1996) showed that a regular sampling design would have performed as precisely as the uneven design performed. Doray et al. (2008) compared different star acoustic surveys around a Fish Aggregative Device and defined the appropriate number of branches to the star. In acoustic surveys, a compromise in survey time allocation must be achieved between acoustic transect sampling for measuring fish density and trawl haul sampling for measuring fish length. Simmonds (1995) (also in Simmonds and McLennan 2005, Chap. 8) analysed the effort allocation between the number and spacing of acoustic transects and the number of trawl hauls and found that fine tuning between more acoustic transects or more trawl hauls was not necessary for the Scottish acoustic surveys on North Sea herring.

7.2.2 Variography

7.2.2.1 Inference and Model Choice

Three types of variograms can be distinguished (Matheron 1989): the regional, experimental and model variograms. The regional variogram is that of the sampled realisation if all values were known at all points. The experimental variogram is the estimate of the regional variogram based on the data samples. The model variogram is that of the underlying random function. When fitting a variogram model, one passes from the experimental variogram of one realisation to the model variogram of the random function. Matheron (1989) supplies theoretical proof that the variability between regional variograms of the same random function is small for short distances; thus it is possible in practice to infer the variogram model from one realisation of the random function (i.e., one data set). He also supplies experimental proof that models with similar parameters give similar kriging estimates and estimation variances. Variogram models are mathematically appropriate functions ensuring that calculated variances are positive. The behaviour at the origin of the variogram model (e.g., for distances smaller than the grid mesh size) has a major influence on the estimation variance (Matheron 1971, Petitgas 2001). Table 7.1 lists different model functions often used in fisheries applications. The choice of the model function corresponds to a physical interpretation of the spatial regularity of the underlying process from which the data have been sampled (constitutive assumption: Matheron 1989). Model fitting then has two steps: first the choice of the model function, then the fit of that function to estimate its parameters.

Table 7.1 Common 2D variogram functions and their physical characteristics: C is sill, r is range, h is distance

| Model name | Model formula | Behaviour at origin (h → 0) | Sill | Modeled irregularity |
|---|---|---|---|---|
| Spherical | $C(1.5\,h/r - 0.5\,h^3/r^3)$ if $0 \le h \le r$; $C$ if $h \ge r$ | Linear | Yes | Medium |
| Exponential | $C(1 - \exp(-h/r))$ | Linear | Asymptotic | Medium |
| Gaussian | $C(1 - \exp(-h^2/r^2))$ | Parabolic (horizontal tangent) | Asymptotic | Very smooth |
| Power | $h^a$ with $0 < a < 1$ | Increasingly vertical tangent as $a \to 0$ | No | Very irregular |
| Power | $h^a$ with $a = 1$ | Linear | No | Medium |
| Power | $h^a$ with $1 < a < 2$ | Increasingly horizontal tangent as $a \to 2$ | No | Smooth |
| Nugget | $C_0$ if $h > 0$; $0$ if $h = 0$ | Discontinuous | Yes | Purely random |
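The model functions of Table 7.1 translate directly into code. The sketch below is one possible transcription, with function and parameter names chosen for the illustration (`c` for the sill, `r` for the range parameter, `a` for the power exponent, `c0` for the nugget amplitude); the nested model at the end uses hypothetical parameter values.

```python
import math

def spherical(h, c, r):
    """Linear at the origin, reaches the sill c exactly at distance r."""
    if h >= r:
        return c
    return c * (1.5 * h / r - 0.5 * (h / r) ** 3)

def exponential(h, c, r):
    """Linear at the origin, approaches the sill c asymptotically."""
    return c * (1.0 - math.exp(-h / r))

def gaussian(h, c, r):
    """Parabolic at the origin (very smooth), approaches the sill c asymptotically."""
    return c * (1.0 - math.exp(-(h * h) / (r * r)))

def power(h, a):
    """No sill; the exponent a must lie strictly between 0 and 2."""
    if not 0.0 < a < 2.0:
        raise ValueError("power exponent must satisfy 0 < a < 2")
    return h ** a

def nugget(h, c0):
    """Discontinuity of amplitude c0 at the origin."""
    return c0 if h > 0.0 else 0.0

def nested(h):
    """Example of a nested model: nugget plus spherical (hypothetical parameters)."""
    return nugget(h, 0.2) + spherical(h, 0.8, 10.0)

for h in (0.0, 1.0, 5.0, 10.0, 20.0):
    print(h, spherical(h, 1.0, 10.0), exponential(h, 1.0, 10.0),
          gaussian(h, 1.0, 10.0), nested(h))
```

With these parameterisations the exponential model reaches about 95% of its sill at h = 3r and the Gaussian model at about h = 1.73r, so `r` is a distance parameter rather than the distance at which the sill is attained.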

7.2.2.2 Model Validation

The adequacy of the model to represent the spatial variability in the data can be checked in various ways. Goodness of fit criteria of the model to the experimental variogram can be helpful in fitting the variogram model (e.g., Fernandes and Rivoirard 1999, Rivoirard et al. 2000). Comparing the dispersion variance in the model to that of the data is also advised (Matheron 1971), as the model should contain as much variance as there is in the data: the model dispersion variance over the entire domain, $\bar\gamma(V,V)$, should be close to the data variance (if the variogram model is fitted for all distances in $V$: strict stationarity), or alternatively the model dispersion variance over a small block, $\bar\gamma(v,v)$, should be close to the data variance within such a block size (if the variogram model is fitted for only small distances: quasi-stationarity). Last, cross-validation of data values by kriging provides a way to measure how well the model and the kriging procedure reproduce the data (e.g., Journel and Huijbregts 1978).

7.2.2.3 Variogram Characteristics

Important characteristics of the variogram are the nugget, sill, range and anisotropy. The nugget is a discontinuity of amplitude $C_0$ at the origin of the variogram (Table 7.1). It has three interpretations, which cannot be distinguished in practice: a purely random component, a measurement error, and spatial structures with range smaller than the grid mesh size. The sill is the variance of the random function. Because the (dispersion) variance is a function of the domain on which it is computed ($\bar\gamma(v,v)$), the variogram sill and the data variance need not coincide. In general, the variogram sill will be greater than the data variance. They will be close in value when the variogram range is short relative to the domain studied (stationary case). The range is the distance at which correlation vanishes. It relates to the average dimension of patches of either low or high values. The anisotropy models directional differences in the spatial variation. In a geometric anisotropic variogram model, all directions have the same sill but the range varies elliptically with direction (e.g., fish aggregations are elliptical rather than circular). In zonal anisotropy, the sill varies with direction, meaning that the spatial distribution is more heterogeneous in certain directions (e.g., an in-shore off-shore gradient in fish density can be modelled with zonal anisotropy). A variogram model can be the sum of various variogram functions when the data show different nested structures (e.g., nugget + isotropic spherical variogram + linear directional variogram). A variety of case studies in modelling experimental variograms can be found in Journel and Huijbregts (1978) and Chilès and Delfiner (1999). Petitgas (1996) also documents the variety of variogram models of interest in fisheries applications.
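Geometric anisotropy is commonly handled by rotating and rescaling the lag vector so that an isotropic model can be applied to the transformed distance. The sketch below shows that standard device; the function name, the angle convention (counter-clockwise from the x axis) and the ratio convention (minor range divided by major range) are assumptions of this illustration.

```python
import math

def anisotropic_lag(dx, dy, angle_deg, ratio):
    """Equivalent isotropic distance for a lag (dx, dy) under geometric anisotropy.

    angle_deg: direction of the major (longest range) axis, counter-clockwise from x.
    ratio: minor range divided by major range, with 0 < ratio <= 1.
    """
    t = math.radians(angle_deg)
    u = dx * math.cos(t) + dy * math.sin(t)    # lag component along the major axis
    v = -dx * math.sin(t) + dy * math.cos(t)   # lag component along the minor axis
    return math.hypot(u, v / ratio)            # stretch the minor-axis component

def spherical(h, sill, rng):
    """Isotropic spherical model evaluated on the transformed distance."""
    if h >= rng:
        return sill
    return sill * (1.5 * h / rng - 0.5 * (h / rng) ** 3)

# Major axis along x with range 20, minor axis along y with range 10 (ratio 0.5).
along_major = spherical(anisotropic_lag(10.0, 0.0, 0.0, 0.5), 1.0, 20.0)
along_minor = spherical(anisotropic_lag(0.0, 10.0, 0.0, 0.5), 1.0, 20.0)
print(along_major, along_minor)
```

The same sill is reached in every direction, but at a distance of 10 along the minor axis and 20 along the major axis, which is the elliptical variation of the range described above. Zonal anisotropy is not covered by this transform, since there the sill itself differs between directions.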

7.2.2.4 Variogram Estimates

The classical experimental variogram (Matheron 1971) is: $\gamma^*(u,h) = \frac{1}{2\,n(u,h)} \sum_{|x-y| \sim h} (z(x) - z(y))^2$, where $n(u,h)$ is the number of pairs of points $(x, y)$ separated by distance class $h$ in direction class $u$ and $z(x)$ is the sample value at location $x$. Distance and direction classes are defined depending on the sampling scheme configuration to ensure a sufficient number of pairs in each class. It is recommended to compute the variogram for distances not exceeding half of the maximum dimension of the domain studied (Journel and Huijbregts 1978). In fisheries applications, sample point coordinates are usually in navigational units of degrees of latitude and longitude. To compute distances between sample points, a transformation of the coordinates to geographic distance units is necessary. Because the classical estimate of the variogram involves a squared difference between sample values, the experimental variogram may be erratic. Thus capturing the spatial structure with that estimate may be difficult even though there is spatial structure in the data. Alternative estimates of the variogram have been proposed for dealing with the effect of zeroes, high values, or inhomogeneity in the sampling locations (Rivoirard et al. 2000). Irregularity in the sampling design (e.g., clusters of points in particular areas) can affect the experimental variogram. Spatial weights can be given to sample points (e.g., area of influence), leading to a weighted variogram estimate: $0.5 \sum_{|x-y| \sim h} w_x w_y (z(x) - z(y))^2 \,/\, \sum_{|x-y| \sim h} w_x w_y$, where $w_x$ is the spatial weight of point $x$. Rivoirard et al. (2000) have used such an estimate in the case of zig-zag acoustic surveys. Also, in acoustic surveys the interaction between spatial and temporal variability may be such that the along-transect variogram is easier to interpret than the 2D variogram calculated along and across transects. Another situation is when the spatial distribution shows many zeroes with occasional patches of positive values. Then a variogram estimate based on the non-centred covariance can be better adapted to reveal the spatial structure: $\frac{1}{N} \sum_i z(x_i)^2 - \frac{1}{n(h)} \sum_{|x-y| \sim h} z(x)\,z(y)$, where $N$ is the total number of points. This covariance estimate of the variogram may underestimate the sill due to departure from strict stationarity; thus care should be taken in comparing the model dispersion variance with the data variance. For dealing with high values, transformations of the variable have been suggested based on distributional assumptions. Cressie (1991) proposed a variogram estimate robust to outliers from a Gaussian distribution. Guiblin et al. (1995) (also in Rivoirard et al. 2000) suggested log-transforming the data $Z$ into $L = \ln(1 + Z/b)$, estimating the variogram of the log-transform $L$ using the classical estimate, modelling this variogram $\gamma_L$, and back-transforming the variogram model to obtain the variogram model $\gamma$ of the original data: $\gamma(h) = \left[(m+b)^2 + \mathrm{var}Z\right]\left[1 - \exp\!\left(-\sigma^2 \gamma_L(h)/\mathrm{var}L\right)\right]$ with $\sigma^2 = \ln\!\left[1 + \mathrm{var}Z/(m+b)^2\right]$ and $m = E[Z]$. Such a procedure has been successfully used on Northern North Sea herring (Rivoirard et al. 2000) and its robustness tested by simulations.
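The classical and weighted experimental variograms described in this section can be sketched in a few lines. The code below is an omnidirectional illustration only: the lag classes `[k*lag, (k+1)*lag)`, the function names and the flat-earth conversion of degrees to nautical miles (one minute of latitude taken as one nautical mile, longitude scaled by the cosine of a reference latitude) are assumptions made for the example.

```python
import math

def to_nautical_miles(lon_deg, lat_deg, lat_ref_deg):
    """Approximate conversion of degrees to nautical miles around a reference latitude."""
    x = lon_deg * 60.0 * math.cos(math.radians(lat_ref_deg))
    y = lat_deg * 60.0
    return x, y

def experimental_variogram(coords, values, lag, n_lags, weights=None):
    """Omnidirectional experimental variogram by distance class.

    With weights=None every point has weight 1 and the classical estimate
    (1 / 2n) * sum (z(x) - z(y))^2 is obtained. Returns the variogram value
    per class (None when a class holds no pair) and the number of pairs.
    """
    if weights is None:
        weights = [1.0] * len(values)
    num = [0.0] * n_lags
    den = [0.0] * n_lags
    npairs = [0] * n_lags
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            d = math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            k = int(d // lag)            # index of the distance class
            if k >= n_lags:
                continue
            w = weights[i] * weights[j]  # product of spatial weights of the pair
            num[k] += 0.5 * w * (values[i] - values[j]) ** 2
            den[k] += w
            npairs[k] += 1
    gam = [num[k] / den[k] if den[k] > 0.0 else None for k in range(n_lags)]
    return gam, npairs

# Hypothetical transect of six equally spaced points with alternating values.
coords = [(float(i), 0.0) for i in range(6)]
values = [float(i % 2) for i in range(6)]
gam, npairs = experimental_variogram(coords, values, lag=1.0, n_lags=3)
print(gam, npairs)
```

Direction classes could be added by also binning the angle of each pair, and spatial weights such as areas of influence can be passed through `weights` to obtain the weighted estimate.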

7.2.2.5 Automated Fitting Procedures

Once the variogram model function is chosen, its parameters can be fitted by eye or by an automated algorithm using a least squares procedure. Cressie (1991) and Chilès and Delfiner (1999) document a variety of statistical fitting procedures. Fernandes and Rivoirard (1999) (also in Rivoirard et al. 2000) used weighted least squares to estimate variogram model parameters and compared models with a goodness of fit criterion. The function to be minimised was: $q(b) = \sum_j w_j \left[\gamma^*(h_j) - \gamma(h_j; b)\right]^2$, where $b$ is the set of model variogram parameters and $*$ denotes the variogram estimate. The weights $w_j$ can be proportional to the number of pairs of points in distance class $j$ or to an inverse power of the distance $h_j$. This second possibility will ensure a good variogram fit for small distances. The goodness of fit criterion was: $\mathrm{gof} = \sum_j w_j \left[\gamma^*(h_j) - \gamma(h_j; b_{\min})\right]^2 \,/\, \sum_j w_j \left[\gamma^*(h_j)\right]^2$, where $b_{\min}$ is the set of fitted variogram model parameters.

Yearly monitoring surveys result in a time series of spatial data in which the spatial structure of a given species shows both variability and some consistency across years. One would therefore like to model a variogram in each year as well as coherently across all years. Bellier et al. (2007) considered that all years had a similar variogram model function (e.g., spherical) but that the parameters varied between years. They fitted a spherical variogram model to a set of yearly experimental variograms using non-linear mixed-effects regression (Pinheiro and Bates 2000). They used fixed effects for the range parameter and random effects for the sill and nugget, which resulted in estimating a constant range across all years and yearly variable sill and nugget.
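A weighted least squares fit of this kind can be sketched without any optimisation library when the model is a spherical variogram without nugget: for a given range the model is linear in the sill, so the best sill has a closed form and the range can be scanned over a grid. This is an illustration of the principle and not the procedure of Fernandes and Rivoirard (1999); the squared-residual form of the criterion, the synthetic experimental variogram, the pair counts used as weights and the grid of candidate ranges are all assumptions of the example.

```python
def unit_spherical(h, r):
    """Spherical variogram with unit sill and range r."""
    if h >= r:
        return 1.0
    return 1.5 * h / r - 0.5 * (h / r) ** 3

def fit_spherical(lags, gamma_exp, weights, candidate_ranges):
    """Weighted least squares fit of sill and range; returns (sill, range, gof)."""
    best = None
    for r in candidate_ranges:
        g = [unit_spherical(h, r) for h in lags]
        # For a fixed range, the sill minimising the weighted sum of squares is explicit.
        sill = (sum(w * ge * gi for w, ge, gi in zip(weights, gamma_exp, g))
                / sum(w * gi * gi for w, gi in zip(weights, g)))
        q = sum(w * (ge - sill * gi) ** 2
                for w, ge, gi in zip(weights, gamma_exp, g))
        if best is None or q < best[0]:
            best = (q, sill, r)
    q, sill, r = best
    # Goodness of fit: weighted squared residuals relative to the weighted squared estimate.
    gof = q / sum(w * ge ** 2 for w, ge in zip(weights, gamma_exp))
    return sill, r, gof

# Synthetic experimental variogram generated from a known model (sill 2, range 30).
lags = [5.0 * k for k in range(1, 13)]
gamma_exp = [2.0 * unit_spherical(h, 30.0) for h in lags]
pairs = [40, 76, 110, 130, 150, 160, 158, 150, 140, 120, 100, 80]  # hypothetical counts
weights = [float(n) for n in pairs]
candidate_ranges = [10.0 + 5.0 * k for k in range(11)]

sill, rng, gof = fit_spherical(lags, gamma_exp, weights, candidate_ranges)
print(sill, rng, gof)
```

Replacing `weights` by an inverse power of the lag, for example `[1.0 / h ** 2 for h in lags]`, gives more influence to the short distances, in line with the second weighting option mentioned above.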

7.2.3 Multivariate Approaches

The correlation information between a target fish species and explanatory covariates may be used to improve the estimation of the target species. Multivariate geostatistics comprise a diversity of methods adapted to different situations, e.g., a particular covariation between covariate and target, differences in the sampling configurations between covariate and target, or the case where a drift surface is considered. A comprehensive presentation of multivariate geostatistics and related topics can be found in Wackernagel (1995) or Chilès and Delfiner (1999). Rivoirard (1994) provides a simple introduction to co-kriging and its relationship with non-linear geostatistics. Here we shall review the different situations that have been encountered so far in fisheries applications when considering an ancillary and a target variable. Two classes of covariates should be distinguished, spatial and non-spatial. Non-spatial covariates are explanatory variables controlling fish behaviour (e.g., time of day); the variation they induce in fish density is therefore independent of space. Spatial covariates are explanatory variables (e.g., bottom depth, river plume) that covary in space with fish density and therefore can explain the fish spatial distribution.

7.2.3.1 Universal Kriging

Large scale drifts in the data can result from the response of fish concentration to explanatory environmental parameters (e.g., a gradient in fish density from coast to off-shore depending on bottom depth). In the Universal kriging model the drift is separated from the residuals and explicitly modelled at large scale. Matheron (1971) showed that drift and residuals could not be estimated together using only one realisation of the random function (i.e., one data set from 1 year): the variability in the (estimated) residuals that results from the spatial smoothing procedure in estimating the drift is an underestimate of the (true) residual process variability. Rivoirard and Guiblin (1997) advocate considering a bias term coming from the estimation of the drift, which is needed when estimating the estimation variance of the mean estimate. Ancillary covariates have been used to estimate the drift by regression. Sullivan (1991) was confronted with a gradient in demersal fish density with bottom depth. He took advantage of the fact that the drift in the fish density developed in one geographical direction only, across the isobaths. The variogram of the residuals was then estimated along the isobaths and applied in all directions. The drift was estimated across the isobaths using a regression with depth. A Universal kriging procedure was then used for mapping. When the drift is consistent in time, repeated surveys have been used to estimate the drift directly as the mean over different realisations of the process. Using repeated surveys on the same grid of stations, Petitgas (1997) estimated the dome-shaped drift in sole egg distributions on a spawning ground by averaging egg density in time. The model developed was multiplicative, as the residual variance was proportional to the drift. Doray et al. (2008) worked on repeated surveys over a tuna aggregation around a FAD (fish aggregative device). The drift was first estimated by time averaging, then modelled using an advection diffusion equation. This equation formulated the balance between oriented movements toward the aggregation centre and dispersive non-oriented movements, and resulted in modelling the decrease of fish concentration from the FAD's head to the border of the aggregation. Residuals from the advection diffusion model were estimated. The variogram of the residuals was used to estimate the mean density around the FAD by kriging. The variogram was also used to optimise the star acoustic survey design.

7.2.3.2 External Drift

With the external drift procedure, the kriged estimates are constrained to conform to the shape of an 'external' variable in time or space. This is achieved by imposing constraints on the kriging weights. Constraints on the kriging weights can be extended to filter functions other than a constant mean. Suppose the drift is linearly related to an explanatory variable f(x): $E[Z(x)] = a f(x) + b$. Imposing the supplementary conditions $\sum_i \lambda_i = 1$ and $\sum_i \lambda_i f(x_i) = f(x_0)$ on the kriging weights $\lambda_i$ will result in constructing the estimate with only those linear combinations of the samples that filter the drift whatever the values of a and b. The result is that the kriged estimates will conform to the shape of the variable f(x). The price to pay is that the kriging variance increases with the number of constraints. The values of f(x) need to be known at all sample points as well as at all points or blocks to be estimated. The variogram to be used is that of the residuals Y(x), i.e., of Z(x) once the drift has been removed. The external drift procedure allows one to adjust the drift to conform to a particular functional relationship with a covariate, which plays the role of a guiding variable. The procedure is helpful when the target variable is undersampled in comparison to the covariate or for dealing with a drift that has a functional relationship with a covariate. Rivoirard et al. (2000) (also in Guiblin et al. 1996) and Petitgas et al. (2003b) used an external drift procedure to map fish length while accounting for a linear relationship between fish length and bottom depth. Rivoirard and Wieland (2001) accounted for the effect of time of day on trawl haul catch using an external drift procedure and mapped the fish spatial distribution at a given time using both day and night samples. Bouleau et al. (2004) investigated how densely sampled acoustic data between trawl haul stations could help the mapping of the sparsely sampled trawl haul data (heterotopic and undersampled configuration). An external drift procedure was used to map the trawl haul data while following the spatial distribution of the acoustically sampled fish density. The map of the acoustic data was first estimated by ordinary kriging and represented the shape surface to conform to. Its values were available at all trawl stations as well as at all points to be estimated. The external drift procedure was then used to guide the mapping of the trawl hauls with the map of the acoustic data.
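The two constraints on the kriging weights can be illustrated with a small numerical sketch. It assumes sample locations along a single axis, an exponential variogram for the residuals, and made-up values of the guiding variable; the function names `exp_variogram` and `external_drift_weights` are hypothetical and do not come from any of the cited studies.

```python
import numpy as np

def exp_variogram(h, sill=1.0, scale=10.0):
    """Exponential variogram of the residuals (illustrative model, not from the text)."""
    return sill * (1.0 - np.exp(-np.abs(h) / scale))

def external_drift_weights(xs, fs, x0, f0, variogram=exp_variogram):
    """Kriging weights subject to sum(w) = 1 and sum(w * f(x_i)) = f(x0)."""
    xs = np.asarray(xs, dtype=float)
    fs = np.asarray(fs, dtype=float)
    n = xs.size
    system = np.zeros((n + 2, n + 2))
    system[:n, :n] = variogram(xs[:, None] - xs[None, :])  # sample-to-sample variogram
    system[:n, n] = system[n, :n] = 1.0                    # unbiasedness constraint
    system[:n, n + 1] = system[n + 1, :n] = fs             # external drift constraint
    rhs = np.concatenate([variogram(xs - x0), [1.0, f0]])
    return np.linalg.solve(system, rhs)[:n]                # drop the Lagrange multipliers

# Made-up example: four samples, guiding variable f (e.g., bottom depth) known everywhere
xs = np.array([0.0, 5.0, 12.0, 20.0])
fs = np.array([10.0, 30.0, 45.0, 80.0])
x0, f0 = 8.0, 35.0
w = external_drift_weights(xs, fs, x0, f0)
a, b = 2.5, -3.0                                           # arbitrary drift coefficients
drift_error = w @ (a * fs + b) - (a * f0 + b)              # filtered whatever a and b
```

Because both constraints are rows of the linear system, any drift of the form a f(x) + b is reproduced by the weights, which is the filtering property described above.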

7.2.3.3 Co-kriging

In this section, the cross-variogram is the structural tool. It is an extension of the variogram for multivariate random functions: $\gamma_{ij}(h) = 0.5\,E[(Z_i(x) - Z_i(x+h))(Z_j(x) - Z_j(x+h))]$, where i and j are indices for different co-varying variables. The cross-variogram is a symmetrical function of h. Also both variables play a similar role. The coherent fitting of variograms and cross-variograms is not always an easy task (Wackernagel 1995) and simplifications of co-kriging have been developed depending on the structures of the variograms and cross-variograms. There are simplifying cases: factorisation of covariates and intrinsic correlation. Until now only intrinsic correlation has been used in fisheries applications. Intrinsic correlation applies when cross-variograms and variograms are all proportional to the same variogram: $\gamma_{ij}(h) = \sigma^2_{ij}\,\gamma(h)$. Co-kriging is helpful for improving the estimate of the target by using correlated covariates known at more locations than the target (undersampled configuration) or for ensuring coherence in the estimated values when variables are functionally related (the co-kriging estimate will conform to the relationship). In their analysis of acoustic and trawl data, and in addition to using an external drift procedure, Bouleau et al. (2004) also used an intrinsic co-kriging model to map trawl haul data using both acoustic and trawl data. The authors compared maps and estimation variances obtained by co-kriging and external drift to those obtained by kriging the trawl data alone. The co-kriging and external drift methods made use of the acoustic data between trawl hauls while the kriging of the trawl hauls did not make use of the acoustic information. Co-kriging and external drift provided similar maps. These had more detail than the univariate kriged map; in particular they were less smooth and areas of high abundance were more restricted. The external drift had a slightly increased estimation variance in comparison to the co-kriging, as is expected by theory. Petitgas (1991) proposed a co-kriging model accounting for the aggregation/disaggregation of fish schools between day and night.
The day and night samples were functionally related considering that the night fish density at point x was equal to the average of the day values over area v centred on x: $Z_{night}(x) = \frac{1}{v}\int_v Z_{day}(x+u)\,du$. The relationship allowed one to specify coherently day and night variograms as well as derive the cross-variogram model between day and night values. The co-kriging procedure allowed one to estimate a day (or night) map using both day and night samples.

In all, multivariate approaches allowed one to make full and coherent use of the available information, which resulted in more realistic maps. But the estimation variances were not dramatically decreased in comparison to the univariate case. The reason is perhaps that the mathematical dimension of the estimation problem was increased in the multivariate situation in comparison to the univariate case, as more sources of variation in the target variable were considered.
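The cross-variogram defined at the start of this section can be estimated from data with a few lines of code. The sketch below assumes two variables measured at the same regularly spaced points along a transect and an integer lag of at least 1; the function name `cross_variogram` and the numbers are hypothetical.

```python
import numpy as np

def cross_variogram(z1, z2, lag):
    """Empirical cross-variogram at an integer lag (>= 1) for two variables
    sampled at the same regularly spaced points."""
    z1 = np.asarray(z1, dtype=float)
    z2 = np.asarray(z2, dtype=float)
    d1 = z1[:-lag] - z1[lag:]          # increments of the first variable
    d2 = z2[:-lag] - z2[lag:]          # increments of the second variable
    return 0.5 * np.mean(d1 * d2)

# Made-up example in which the second variable is exactly twice the first
z1 = np.array([0.0, 1.0, 3.0])
z2 = 2.0 * z1
g11 = cross_variogram(z1, z1, 1)       # simple variogram of z1: 1.25
g12 = cross_variogram(z1, z2, 1)       # cross-variogram: 2.5
```

With the same variable passed twice, the function returns the ordinary variogram. In this toy case the cross-variogram is proportional to the simple variogram at every lag, which is the situation covered by the intrinsic correlation model.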

7.2.4 Simulations

The interest in simulations is to generate maps that contain all the variability that is in the data. In contrast to kriging, which results in a smoothed interpolated surface, a simulated field shows all the variability in the process while respecting the data variogram and the histogram. A non conditional simulation is one realisation of the random function model. A conditional simulation is one realisation that conforms to the sample point values. Simulations are an appropriate tool for evaluating the impact of spatial uncertainty on the result of complex procedures. They are useful when estimating derived variables that require variability in the surveyed variables (e.g., estimating the length of the sea floor to an island for designing a cable), or computing the estimation variance of estimates that are themselves a non-linear combination of the surveyed variables (e.g., combining fish length and acoustic backscatter for estimating fish abundance in acoustic surveys), or testing estimation procedures and survey design (e.g., testing different rules for adding samples in adaptive sampling). Chiles and Delfiner (1999) and Lantuejoul (2002) document comprehensively the many methods and algorithms for simulating spatially structured random functions with defined histogram and covariance, along a line (1D), on the plane (2D) or in 3D, as well as the conditioning to the data values. Among the variety of methods, the turning bands method due to Matheron (1973) is practical and efficient. The different steps for constructing a conditional simulation with the turning bands method are assembled as a flow chart in Fig. 7.1.

[Flow chart: Raw data → (1: Gaussian transform) → Gaussian data → (2: Joint variography) → Variogram of Gaussian → (3: Turning bands) → Non conditional simulation of Gaussian → (4: Conditioning by kriging) → Conditional simulation of Gaussian → (5: Back-transform) → Conditional simulation of raw data]

Fig. 7.1 Flow chart showing the steps for constructing a geostatistical conditional simulation that matches the histogram, the variogram and the data samples, using the turning bands method and conditioning by kriging. A non conditional simulation will match histogram and variogram. A conditional simulation will match histogram, variogram and data values. In Step 1, when transforming the raw data into Gaussian values, the use of a Gibbs sampler can be helpful for dealing with many zero values. Step 2 is in fact a joint analysis where variograms of raw and transformed data are modelled coherently. Steps 3 and 4 can be obtained directly by other methods, e.g., sequential Gaussian simulation, which requires a bi-Gaussian assumption on the transformed data [adapted from Chiles and Delfiner (1999)]


The anamorphosis and back-transform steps result in matching the simulated values to the histogram of the data. The turning bands method results in matching the covariance of the simulated values to that of the data. Last, kriging is used to match the simulated values to the data values, resulting in a conditional simulation.

7.2.4.1 Gaussian Anamorphosis

Because the methods simulate Gaussian random functions, it is useful to transform the data into Gaussian values. A Gaussian anamorphosis is used where the Gaussian value y is associated with the data value z that has the same cumulative probability: P(Z(x) < z) = P(Y(x) < y). The anamorphosis is in general invertible unless the histogram of Z presents spikes. In particular, the high proportion of zero values in fisheries survey data will generate a spike on the histogram of Z. Each strictly positive data value can be assigned one Gaussian value y > yc using an invertible monotone anamorphosis function, leading to a truncated Gaussian variable at cut-off yc (P[Z(x) = 0] = P[Y(x) < yc]). The question is then how to assign Gaussian values to the zeroes so as to match the covariance structure, knowing the Gaussian values Y(x) > yc. It is useful here to use a Gibbs sampler, otherwise the assignment of Gaussian values Y(x) < yc to the zero values is arbitrary, which impacts the covariance structure. Woillez et al. (2006a) used a Gibbs sampler to assign Gaussian values coherently to the data zeroes and estimated a Gaussian covariance for the full data set that could then be simulated. In contrast, Gimona and Fernandes (2003) did not assign Gaussian values to the zeroes adequately and could not appropriately control the simulated covariance.
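A minimal sketch of the anamorphosis of the strictly positive values is given below, assuming an empirical (rank-based) cumulative distribution and at least one positive value in the data. The zeroes are deliberately left unassigned (NaN), since their coherent assignment requires a Gibbs sampler that is not implemented here. The function name `anamorphosis_positive` and the data are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def anamorphosis_positive(z):
    """Empirical Gaussian anamorphosis of the strictly positive values.
    Returns the Gaussian values (NaN for the zeroes) and the cut-off y_c."""
    z = np.asarray(z, dtype=float)
    positive = z > 0
    p0 = 1.0 - positive.mean()                     # proportion of zeroes
    normal = NormalDist()
    y = np.full(z.shape, np.nan)
    zp = z[positive]
    ranks = np.argsort(np.argsort(zp)) + 1         # ranks 1..n among the positives
    p = p0 + (1.0 - p0) * (ranks - 0.5) / zp.size  # cumulative probabilities above p0
    y[positive] = [normal.inv_cdf(v) for v in p]
    # Cut-off such that P[Y < y_c] equals the proportion of zeroes
    y_c = normal.inv_cdf(p0) if 0.0 < p0 < 1.0 else -np.inf
    return y, y_c

z = np.array([0.0, 0.0, 3.0, 1.0, 0.0, 7.0])       # made-up densities, half of them zero
y, y_c = anamorphosis_positive(z)
```

The transform is monotone on the positives, so it can be inverted by interpolation for the back-transform step; all Gaussian values assigned to positive data lie above the cut-off.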

7.2.4.2 The Turning Bands Method

Chiles and Delfiner (1999) and Lantuejoul (2002) provide full and up-to-date documentation of the method. The turning bands method constructs a simulation in $\mathbb{R}^n$ of a Gaussian random function based on independently simulated processes in $\mathbb{R}^1$, taking advantage of the formal relationship between the covariances in $\mathbb{R}^n$ and $\mathbb{R}^1$. Consider a sequence of n lines with directions $\theta_d$, and $X^1_d$ the independently simulated 1D processes along the lines of directions $\theta_d$. The simulated Gaussian random function in $\mathbb{R}^n$ will be: $Y(x) = \frac{1}{\sqrt{n}}\sum_{d=1}^{n} X^1_d(\langle x, \theta_d\rangle)$, where $\langle x, \theta_d\rangle$ is the projection of point x of $\mathbb{R}^n$ on the line of direction $\theta_d$. Using a large number of lines n, the central limit theorem implies that Y is Gaussian. The directions $\theta_d$ can be random or can follow a quasi-random sequence (e.g., van der Corput sequence) that is more efficient in filling space evenly as n increases. In 3D the isotropic covariance $C_3(r)$ of Y is simply related to the 1D covariance $C_1(r)$ of $X^1$ as: $C_1(r) = \frac{d}{dr}\left[r\,C_3(r)\right]$. For each covariance model classically used (e.g., Table 7.1) the associated 1D covariance is known. The simulation of a random process along a line with a specified covariance $C_1$ can be constructed with a variety of methods, e.g., using autoregressive or dilution algorithms. Note that because the relationship between the 1D and the 2D covariances is more complicated than that between the 1D and the 3D covariances, it is easier to simulate in the 2D plane as if it were a section of the 3D space. In comparison to other simulation methods, the turning bands method offers flexibility in the implementation as well as control over the simulated covariance structure. Also, it allows one to simulate a large number of points in the simulated fields with computational efficiency.
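As a worked illustration of the relationship between the 3D and 1D covariances, consider an exponential covariance with unit sill and scale parameter a. The choice of model is only an example and is not taken from Table 7.1.

```latex
% Exponential covariance in 3D (illustrative choice)
C_3(r) = e^{-r/a}
% Associated 1D covariance along the lines
C_1(r) = \frac{d}{dr}\left[ r\, e^{-r/a} \right] = \left(1 - \frac{r}{a}\right) e^{-r/a}
% Check: integrating C_1 recovers r C_3(r)
\int_0^r C_1(u)\, du = \Big[ u\, e^{-u/a} \Big]_0^r = r\, e^{-r/a} = r\, C_3(r)
```

The 1D covariance becomes negative beyond r = a, so the processes simulated along the lines show a hole effect even though the 3D covariance decreases monotonically.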

7.2.4.3 Conditioning by Kriging

Some methods based on more assumptions than the turning bands method (e.g., sequential Gaussian simulation) allow one to simulate conditionally to the data directly, but the turning bands method does not. Conditioning is obtained by simulating a kriging error non conditionally. The steps are as follows: (i) perform a non conditional simulation of Z at the nodes x of the simulation grid, $S_{nc}(x)$, and at the data points $x_\alpha$, $S_{nc}(x_\alpha)$; (ii) perform kriging at the grid nodes x using the data $Z(x_\alpha)$, giving $Z^k(x)$; (iii) perform kriging at the grid nodes x using the simulated values at the data points $S_{nc}(x_\alpha)$, giving $S^k_{nc}(x)$. The conditional simulation at the grid nodes $S_c(x)$ is constructed as: $S_c(x) = Z^k(x) + [S_{nc}(x) - S^k_{nc}(x)]$. Since kriging is an exact interpolator, the conditional simulation conforms to the data values.
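The three steps above can be sketched on a 1D grid. This is a sketch under several assumptions that are not in the text: the data sit on grid nodes, the variable is already Gaussian with zero mean (so simple kriging is used), the covariance is exponential, and the non conditional simulation is obtained by a Cholesky factorisation standing in for the turning bands step. All names and values are hypothetical.

```python
import numpy as np

def cov(h, scale=5.0):
    """Exponential covariance with unit sill (illustrative model)."""
    return np.exp(-np.abs(h) / scale)

def simple_krige(x_data, values, x_grid):
    """Simple kriging (zero mean) of values known at x_data onto x_grid."""
    c_dd = cov(x_data[:, None] - x_data[None, :])   # data-to-data covariances
    c_dg = cov(x_data[:, None] - x_grid[None, :])   # data-to-grid covariances
    weights = np.linalg.solve(c_dd, c_dg)           # one column of weights per node
    return weights.T @ values

rng = np.random.default_rng(0)
x_grid = np.arange(0.0, 50.0, 1.0)                  # simulation grid nodes
data_idx = np.array([5, 18, 33, 42])                # data located on grid nodes
x_data = x_grid[data_idx]
z_data = np.array([1.2, -0.4, 0.3, 2.0])            # made-up Gaussian data values

# (i) non conditional simulation on the grid (Cholesky used instead of turning bands)
chol = np.linalg.cholesky(cov(x_grid[:, None] - x_grid[None, :]))
s_nc = chol @ rng.standard_normal(x_grid.size)

# (ii) kriging of the data, (iii) kriging of the simulated values at the data points
z_k = simple_krige(x_data, z_data, x_grid)
s_nc_k = simple_krige(x_data, s_nc[data_idx], x_grid)

# Conditional simulation: kriged data plus a simulated kriging error
s_c = z_k + (s_nc - s_nc_k)
```

At the data locations both kriged surfaces reproduce their inputs, so the simulated kriging error vanishes there and `s_c` equals the data values.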

7.2.4.4 Testing Homogeneous Survey Designs and Variogram Estimators

Since transects in acoustic surveys are sampled continuously, it has been suggested to sum the recorded fish density along the transect lines and estimate fish stocks using a one-dimensional procedure (Petitgas 1993a). Following on this idea, Simmonds and Fryer (1996) (also in Rivoirard et al. 2000, Chap. 5) tested a variety of survey designs applied to a variety of simulated one-dimensional processes. Designs considered were parallel transects that were randomly spaced, regularly spaced or randomly spaced within strata, and zig-zag transects. The 1D simulations were non conditional and used an autoregressive method. Simulated processes varied in their degree of correlation (nugget, range) as well as in the incorporation of a (linear) trend. Designs were ranked considering precision of the mean estimate, bias, and precision of the variance. The conclusion was that the systematic design (regularly spaced transects) was the best strategy for estimating abundance with highest precision and negligible bias. Random stratified designs with one or two transects per stratum were ranked as best strategies (but close to the systematic design) when the aim was to estimate both the mean and the variance. The interest in using a zig-zag transect design in comparison to a parallel transect design depended on the correlation range and transect spacing. Rivoirard et al. (2000, Chap. 5) also used simulations to compare variogram estimators. They used sets of simulations similar to the previous ones (nugget, range, linear trend), in addition to varying the skewness of the histogram. Three variogram estimators (classical, log back transform, non-centred covariance) were tested for their ability to infer the simulated variogram structure for a large range of skewness in the data. Relative bias in variogram parameters was less than 5% with all variogram estimators.
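The kind of simulation test described here can be sketched as follows. The sketch is not the procedure of Simmonds and Fryer (1996): it assumes a stationary first-order autoregressive process without nugget or trend, 200 possible transect positions, 10 transects per survey, and a systematic design with transects centred in their cells. All parameter values are arbitrary.

```python
import numpy as np

def simulate_ar1(n_real, n, rho, rng):
    """Non conditional simulations of a stationary AR(1) process with unit variance."""
    x = np.empty((n_real, n))
    x[:, 0] = rng.standard_normal(n_real)
    innovation_sd = np.sqrt(1.0 - rho**2)
    for t in range(1, n):
        x[:, t] = rho * x[:, t - 1] + innovation_sd * rng.standard_normal(n_real)
    return x

rng = np.random.default_rng(42)
n, n_samples, rho = 200, 10, 0.9
fields = simulate_ar1(2000, n, rho, rng)             # one row per simulated process
true_mean = fields.mean(axis=1)                      # mean of each realisation

# Systematic design: regularly spaced transects centred in their cells
spacing = n // n_samples
sys_idx = np.arange(spacing // 2, n, spacing)
err_sys = fields[:, sys_idx].mean(axis=1) - true_mean

# Random design: transect positions drawn without replacement in each realisation
rand_idx = np.array([rng.choice(n, n_samples, replace=False)
                     for _ in range(fields.shape[0])])
err_rand = np.take_along_axis(fields, rand_idx, axis=1).mean(axis=1) - true_mean

mse_sys = np.mean(err_sys**2)                        # precision of the systematic design
mse_rand = np.mean(err_rand**2)                      # precision of the random design
```

With a correlated process such as this one, the mean squared error of the systematic design comes out lower than that of the random design, in line with the ranking reported above for the precision of the mean.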

7.2.4.5 Investigating Adaptive Survey Designs

The sampling design is homogeneous when sample points cover the surveyed domain independently of the underlying spatial distribution being sampled. The position of sample points can be random, stratified random, or on a regular or irregular grid. By modelling the spatial covariance and weighting samples by kriging, geostatistics gives some flexibility in the design. When the design incorporates additional sample points in areas of higher abundance with the idea of gaining information in those areas, the design is heterogeneous or adaptive. Adaptive designs result in targeting the sampling effort in positive areas rather than dispersing the effort over all areas including empty ones. Simmonds and MacLennan (2005) describe various adaptive rules for fisheries acoustic surveys. Typically, an adaptive design is a two-stage sampling procedure. Level 1 samples are located according to a homogeneous sampling scheme. Then level 2 samples are added in the vicinity of level 1 samples conditionally on the values observed at those level 1 samples. Adaptive sampling carries a risk of bias in the design. A rich block will contain low and rich values. When sampling the rich block, if the level 1 sample is low, no additional sample will be added and the block will be considered low in abundance. In contrast, if the level 1 sample has a high value, additional samples will be added and lower values will be sampled. The result is systematic underestimation of rich areas. Clearly, the bias depends on the rule adopted to allocate additional samples, and simulations have been used to evaluate the bias associated with particular rules. With a design-based approach, Thompson and Seber (1996) proposed an unbiased adaptive sampling rule (adaptive cluster sampling) with corresponding estimators. The cluster around the rich value that triggers the addition of samples must be sampled entirely. Such a design and the corresponding mean estimates were applied by Lo et al. (1997) for a larval survey.
Conners and Schwager (2002) non conditionally simulated 2D fields of Gaussian values with various cases of spatial correlation to test adaptive cluster sampling against homogeneous designs. Gaussian simulated values were exponentiated. Correlation in the simulated values was obtained with an 'ad hoc' procedure (not allowing complete control of the simulated variogram). The adaptive cluster design performed with no more bias than homogeneous designs and had higher precision in cases of patchy distributions. A geostatistical approach to adaptive sampling is to model the relationship between point values and block means or to model the position of high values relative to lower ones. Non-linear geostatistics can be helpful in analysing data sampled with an adaptive rule. Petitgas (1997b) post-stratified ichthyoplankton survey data based on the correlation structure between high and low abundance areas. Petitgas (2004) used simulations to test bias in the design depending on the adaptive rule. 2D fields were non conditionally simulated using the turning bands method and the Gaussian values were exponentiated. The level 1 samples were positioned on a systematic grid of points with a mesh size equal to the variogram range. The rule for adding level 2 samples was based on the mean of 3 consecutive level 1 samples. That rule allowed one to obtain higher precision on the mean than a systematic design that had more points, while the bias in the mean stayed lower than 3%.

7.2.4.6 Variance Estimate of Combined Variables

The abundance of a given species derived from acoustic surveys is a combination of the species length and the acoustic backscatter assigned to that species, where each variable comes from a specific sampling process (e.g., Simmonds and MacLennan 2005). The fish length is measured by sampling fish schools with appropriate gear using (quasi) point sampling with pelagic trawl hauls or purse seine sets. The acoustic backscatter is (quasi) continuously recorded by the echosounder along the sailed transect line and integrated over depth and a unit sailed distance (one nautical mile). Fish density is then estimated by a non-linear combination of the two variables: $Z(x) = \dfrac{s_{A,x}}{\bar{l}^{\,2}_x\,10^{b/10}}$, where $s_A$ is the acoustic (calibrated) nautical area scattering coefficient (sensu Simmonds and MacLennan 2005), $\bar{l}^{\,2}$ is the mean fish length squared and b is the (known) coefficient of the species target strength to length relationship. Fish density Z(x) can be further assigned to ages using an age-length key giving the proportion of ages at any given length p(a,l): Z(a,x) = p(a,l)Z(x). In his study on optimising effort allocation between acoustic transects and trawl hauls, Simmonds (1995) considered variance terms for the different sampling processes but the correct variance of the mean estimate of Z was not established. The estimation variance of the mean fish density over the surveyed area is best obtained by geostatistical conditional simulations, allowing for the combination of the many possible maps (realisations) of length and acoustic backscatter that contain the correct spatial variability in each variable. Working with the Scottish acoustic surveys for North Sea herring, Gimona and Fernandes (2003) and Woillez et al. (2006a) estimated the error variance in the mean fish abundance by conditional simulations. Gimona and Fernandes (2003) used sequential Gaussian simulations and Woillez et al. (2006a) used the turning bands method. Conditional simulations of the fish length and the acoustic backscatter $s_A$ were performed on the same grid of points. Then the maps were combined to estimate the map of the fish density using the above acoustic formula: $Z(i,r) = f(s_{A\,i,r}, \bar{l}^{\,2}_{i,r})$, where i is the index for the grid points and r corresponds to the realisations. The maps of Z(i,r) were averaged over the domain V to obtain the mean fish abundance in each realisation $Z_V(r)$. For a large number of realisations, the histogram of $Z_V(r)$, its mean $E[Z_V(r)]$ and variance $Var[Z_V(r)]$ were estimated. The estimation variance of the mean estimate was $Var[Z_V(r)]$. The bias in the simulations was $E[Z_V(r)] - Z^*_V$, where $Z^*_V$ was the mean survey estimate. Simulating fish length caused no problem because the structure was well defined and the variable is close to Gaussian. In contrast, the simulation of the acoustic backscatter was more difficult because of the large number of zero values in the data. To solve that problem Woillez et al. (2006a) used a Gibbs sampler to coherently assign Gaussian values to the zero samples knowing the surrounding positive values and the covariance structure of the raw data. The resulting relative estimation error was close to 15% for the different surveys analysed. This is the order of magnitude that can be expected in acoustic surveys when the assignment of echo-traces to species is considered to be without error, as was the case in the present example.
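The combination of simulated maps into a variance of the mean can be sketched as below. The arrays of random numbers are placeholders standing in for conditional simulations of backscatter and length on a common grid; they have no spatial structure and are not survey data. The value of b and the function name `fish_density` are illustrative assumptions.

```python
import numpy as np

def fish_density(s_a, mean_length, b):
    """Acoustic formula of the text: backscatter divided by (length^2 * 10^(b/10))."""
    return s_a / (mean_length**2 * 10.0 ** (b / 10.0))

rng = np.random.default_rng(1)
n_real, n_grid = 500, 400                  # number of realisations and of grid points
# Placeholders for conditional simulations (rows: realisations r, columns: grid points i)
s_a_sim = rng.lognormal(mean=5.0, sigma=1.0, size=(n_real, n_grid))
length_sim = rng.normal(loc=25.0, scale=2.0, size=(n_real, n_grid))
b = -71.2                                  # illustrative target strength coefficient

z = fish_density(s_a_sim, length_sim, b)   # Z(i, r) on the common grid
z_v = z.mean(axis=1)                       # mean over the domain V in each realisation
estimation_variance = z_v.var(ddof=1)      # Var[Z_V(r)] across realisations
relative_error = np.sqrt(estimation_variance) / z_v.mean()
```

Given a survey estimate of the mean, the bias in the simulations would be obtained as `z_v.mean()` minus that estimate.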

7.3 Ecological Considerations

In contrast to the previous part in which spatial variation was modelled to serve the estimation process, in this part we focus on different ways by which geostatistical structural analysis tools can be used to reveal ecologically meaningful characteristics in the spatial variation and enhance ecological understanding.

7.3.1 Spatial Relationships Between Variables

There can be many ways by which two variables are related and therefore spatial relationships between variables can be investigated in several ways. A variety of tools have been suggested to characterise collocation, covariation and conditional variation. Collocation characterises the point-to-point agreement of two spatial distributions while covariation captures how the change in values as a function of distance (e.g., gradients) is correlated between two maps. Conditional variation looks at the variation of one variable relative to the other.

7.3.1.1 Overlap

The Global index of collocation (GIC: Bez and Rivoirard 2000a, Table 7.2) is a measure of how closely collocated two spatial distributions are. The index provides a measure of overlap between two spatial distributions and can serve as a simple distance between maps allowing for the classification of maps.


Table 7.2 Attributes of spatial distributions and indices to characterise them. Notations are as follows. i: index of samples; z(x): fish density at x (counts of fish per n.m.²); $s_i$: area of influence of sample i; $I_{z_i>0}$: equals 1 if sample $z_i > 0$ and 0 otherwise; Q: total abundance as defined by the spatial integral of z: $Q = \int z(x)\,dx$; g: geostatistical transitive covariogram: $g(h) = \int z(x)\,z(x+h)\,dx$; Q(a): summed abundance from richest values that stand on area a: $Q(a) = \sum_{i=p(a)}^{N} s_i z_i$, where a is a proportion of total domain: $a = \sum_{i=p(a)}^{N} s_i / A$; CG: centre of gravity; $h_0$: a lag distance chosen as appropriate (mean distance between nearest sample neighbours)

| Attribute | Index name | Index description | Formula | Reference |
|---|---|---|---|---|
| Occupation | Positive area | Area of non null values | $PA = \sum_i s_i\, I_{z_i>0}$ | Woillez et al. (2007) |
| Aggregation | Spreading area | Spatial concentration of abundance relative to a homogeneous distribution | $SA = 2\int_0^1 \left(1 - \frac{Q(a)}{Q}\right) da$ | Woillez et al. (2007) |
| Aggregation | Equivalent area | Integral range of the relative covariogram, also the inverse probability for two random individuals to be at same location | $EA = Q^2 / g(0)$ | Bez et al. (2001) |
| Location | Gravity Center (CG) | Weighted average of sample positions | $CG = \int x\,\frac{z(x)}{Q}\,dx$ | Bez et al. (2001) |
| Location | Number of Patches | Number of patches as defined using a distance threshold | Rule: rank z in decreasing order, start computing CG of richest values; if y too distant from CG of previous values, consider new patch; continue | Woillez et al. (2007) |
| Dispersion | Inertia (I) | Weighted variance of sample positions around a gravity centre | $I = \int (x - CG)^2\,\frac{z(x)}{Q}\,dx$ | Bez et al. (2001) |
| Dispersion | Anisotropy | Ratio of inertia for directions carrying minimal and maximal inertia | $\sqrt{I_{max}/I_{min}}$ | Woillez et al. (2007) |
| Correlation | Microstructure index | Decrease of correlation at short distance | $MI = \frac{g(0) - g(h_0)}{g(0)}$ | Woillez et al. (2007) |
| Correlation | Range | Distance beyond which correlation vanishes | First u for which g(u) = 0 | Matheron (1971) |
| Overlap between two distributions | Global index of collocation | Ratio of distance between gravity centres and random individuals | $GIC = 1 - \frac{\Delta CG^2}{\Delta CG^2 + I_1 + I_2}$ | Bez et al. (2000a) |


The GIC can be used to investigate inter-annual changes in the spatial distribution of the same fish or between ages to characterise differences in habitats across the life cycle (Woillez et al. 2007). It could also be helpful as a measure of spatial overlap between predator and prey. For investigating more complex relationships between spatial distributions other tools are helpful.
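The centre of gravity, inertia and GIC formulas of Table 7.2 can be sketched in discrete form as below. The sketch assumes equal areas of influence for all samples unless `s` is given; the function names and the two toy distributions are hypothetical.

```python
import numpy as np

def _weights(z, s=None):
    """Sample weights: density times area of influence (equal areas if s is None)."""
    z = np.asarray(z, dtype=float)
    return z if s is None else z * np.asarray(s, dtype=float)

def centre_of_gravity(xy, z, s=None):
    """Weighted average of the sample positions."""
    w = _weights(z, s)
    return (w[:, None] * np.asarray(xy, dtype=float)).sum(axis=0) / w.sum()

def inertia(xy, z, s=None):
    """Weighted variance of the sample positions around the centre of gravity."""
    w = _weights(z, s)
    d2 = ((np.asarray(xy, dtype=float) - centre_of_gravity(xy, z, s)) ** 2).sum(axis=1)
    return (w * d2).sum() / w.sum()

def global_index_of_collocation(xy1, z1, xy2, z2):
    """GIC = 1 - dCG^2 / (dCG^2 + I1 + I2)."""
    dcg2 = ((centre_of_gravity(xy1, z1) - centre_of_gravity(xy2, z2)) ** 2).sum()
    return 1.0 - dcg2 / (dcg2 + inertia(xy1, z1) + inertia(xy2, z2))

xy = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])  # three stations on a line
z_a = np.array([1.0, 2.0, 1.0])                      # distribution centred on station 2
z_b = np.array([0.0, 0.0, 1.0])                      # distribution confined to station 3
gic_same = global_index_of_collocation(xy, z_a, xy, z_a)  # 1.0
gic_diff = global_index_of_collocation(xy, z_a, xy, z_b)  # 1/3
```

The index equals 1 when the two centres of gravity coincide and decreases toward 0 as the distance between them grows relative to the dispersion of each distribution.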

7.3.1.2 Inertiogram

To investigate whether a fish distribution is constrained to particular domains of the distribution of an explanatory variable, Bez and Rivoirard (2000b) suggested the use of the inertiogram. The concept was illustrated with fish egg spatial distributions and temperature fields. The control of the fish egg spatial distribution by that of temperature was tested by translating the temperature field relative to the egg distribution. Temperature was weighted by egg abundance occurring at the same location, leading one to estimate the mean temperature per individual egg as well as the temperature variance per egg. The inertiogram was the temperature variance per egg as a function of the vector translation distance. The inertiogram map was constructed by considering translations in different directions. The inertiogram maps allowed one to visualise whether a particular temperature range or spatial domain controlled the fish egg distribution. If so, the inertiogram showed low values in those areas.

In the previous examples, the structural tools were based on the geostatistical transitive methodology (Matheron 1971, Petitgas 1993a, Bez 2002). The transitive method deals simply with the zero values without the need to delineate the domain of presence. It is appropriate for case studies where many zeroes occur in the data. The structure in the transitive covariogram then characterises both the intrinsic spatial structure of the variable (that characterised by the variogram) and the influence of the geometry of the domain (e.g., lower values near the borders). In intrinsic geostatistics (usually named geostatistics: Matheron 1971), the domain of study is assumed to be known with no influence on the spatial distribution of the variable of interest.

7.3.1.3 Cross-Variogram

The cross-variogram characterises the spatial covariation between two continuous variables. It is symmetrical as both variables play the same role in the analysis. It is the structural tool in multivariate linear geostatistics (Wackernagel 1995). Barange et al. (2005) used covariograms to analyse how sardine and anchovy were spatially organised one relative to the other for different years with different total abundances. The analysis revealed that sardine and anchovy spatially alternated in years of low abundance while they co-occurred in years of high abundance.


7.3.1.4 Ratio of Cross and Simple Indicator Variograms

The cross-variogram is also a structural tool in non-linear geostatistics (Rivoirard 1994) as a non-linear model is a multivariate model for all indicator cut-offs of the variable (the indicator $I_{z(x)>c}$ equals 1 when the value z(x) is greater than the cut-off c and 0 otherwise). Building from the properties of the non-linear model (e.g., diffusive or not), cross-variograms of indicators are helpful in ecology to investigate whether transitions are progressive or sharp when crossing a border. Indicators of the target variable spatially define geometrical sets or domains. The cross-variogram of two indicators ($I_{z(x)\ge c_1}$ and $I_{z(x)\ge c_2}$ with $c_1 \le c_2$) divided by the variogram of the indicator of the low cut-off $c_1$ represents the probability, as a function of distance h, of encountering values higher than the high cut-off $c_2$ when entering the domain of the low cut-off $c_1$ (one extremity of the vector h is outside the domain of $c_1$ and the other inside that of $c_2$): $\dfrac{\gamma_{I_{c_1} \times I_{c_2}}(h)}{\gamma_{I_{c_1}}(h)} = Prob[Z(x+h) \ge c_2 \mid Z(x) < c_1,\ Z(x+h) \ge c_1]$. Petitgas (1993b) analysed the spatial setting of high herring densities relative to low and medium densities. The analysis revealed that high densities could occur anywhere within the domain of medium values, which was a large domain, meaning that high densities were difficult to predict. One can also envisage using these tools when an explanatory variable is given as a polygon. For instance, polygons can represent a characteristic of the environment, e.g., the area of a river plume, the area where gyres develop, etc. The cross-variogram between the polygon indicator and the target variable will allow one to investigate how the target variable responds spatially to the polygon.
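The ratio of cross to simple indicator variograms can be computed empirically as in the following sketch. It assumes regularly spaced samples along a transect, an integer lag of at least 1, and indicators defined with "greater than or equal to"; the function name `indicator_ratio` and the data are hypothetical.

```python
import numpy as np

def indicator_ratio(z, c1, c2, lag):
    """Cross-variogram of the indicators of cut-offs c1 <= c2 divided by the
    variogram of the indicator of c1, at an integer lag (>= 1)."""
    z = np.asarray(z, dtype=float)
    i1 = (z >= c1).astype(float)
    i2 = (z >= c2).astype(float)
    d1 = i1[:-lag] - i1[lag:]
    d2 = i2[:-lag] - i2[lag:]
    cross = 0.5 * np.mean(d1 * d2)       # counts border crossings that reach c2
    simple = 0.5 * np.mean(d1 ** 2)      # counts all crossings of the c1 border
    return cross / simple if simple > 0 else np.nan

z = np.array([0.0, 1.0, 5.0, 0.0, 2.0, 0.0, 6.0])  # made-up densities along a transect
ratio = indicator_ratio(z, c1=1.0, c2=5.0, lag=1)  # 2 of the 5 border crossings: 0.4
```

In this toy series the border of the low cut-off domain is crossed five times at lag 1 and two of these crossings land directly on a high value, hence a ratio of 0.4. A ratio that stays low at short lags indicates that high values lie away from the border of the low cut-off domain.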

7.3.1.5 Constrained Variogram

In a similar way to the above, and to demonstrate change in the spatial continuity of a target variable when crossing the border of geometrical sets defined by the spatial setting of an ancillary variable, the variogram can be computed with a selected set of samples, resulting in constrained variograms. The selection can be for those pairs of points that are both inside or both outside defined spatial sets, or that each stand on one side of the border limit. The difference between the constrained variogram and the overall variogram computed on all pairs of points irrespective of the border limits will serve as a test to demonstrate the impact of a particular explanatory variable on the spatial structure of the target variable. Rivoirard et al. (2000, Chap. 4) investigated the influence of the shelf break contour on the variogram of blue whiting. They computed a ‘constrained’ variogram with only those pairs of points that fell close to the shelf break contour. The constrained variogram was lower than the overall variogram, meaning that the shelf break was associated with a difference in variance between along and across the contour.
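
The pair selection behind a constrained variogram can be sketched as follows. This is a minimal illustration, assuming a brute-force experimental variogram and a hypothetical boolean list `inside` that flags membership of each sample in the set defined by the ancillary variable; none of these names come from the cited work.

```python
import math

def variogram(points, z, lag, tol, keep_pair=lambda i, j: True):
    """Experimental variogram at distance lag +/- tol, using only pairs
    (i, j) for which keep_pair(i, j) is true."""
    total, n_pairs = 0.0, 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if not keep_pair(i, j):
                continue
            if abs(math.dist(points[i], points[j]) - lag) <= tol:
                total += (z[i] - z[j]) ** 2
                n_pairs += 1
    return 0.5 * total / n_pairs if n_pairs else float("nan")

# Hypothetical transect: the first three samples lie inside the ancillary set.
points = [(float(k), 0.0) for k in range(6)]
z = [1, 2, 1, 8, 4, 8]
inside = [True, True, True, False, False, False]

overall = variogram(points, z, 1.0, 0.1)
within = variogram(points, z, 1.0, 0.1, lambda i, j: inside[i] and inside[j])
outside = variogram(points, z, 1.0, 0.1, lambda i, j: not inside[i] and not inside[j])
across = variogram(points, z, 1.0, 0.1, lambda i, j: inside[i] != inside[j])
```

Comparing `within`, `outside` and `across` with `overall` gives the kind of contrast described above; here the pairs straddling the border carry most of the variability at lag 1.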

7.3.1.6 D2-Variogram

At sample point x the variable of interest may not always be one value z(x) but a p-component vector ($v_1(x), v_2(x), \ldots, v_p(x)$). For instance, when characterising schools with many parameters, one may construct a p-component vector at the spatial scale of a few nautical miles to characterise the acoustic image (echogram). What is then the spatial structure of the acoustic images? Petitgas (2003) suggested the D2-variogram: $D^{2}\gamma(h) = \frac{1}{2} E\left[\sum_{k} \left(V_k(x) - V_k(x+h)\right)^2\right]$. It showed good spatial structure of acoustic images for the Bay of Biscay echograms.
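
A minimal sketch of an experimental version of this quantity is given below. It assumes all pairs within a distance tolerance are pooled and that the vector components are used as supplied, with no scaling; the function name, the tolerance `tol` and the toy data are illustrative assumptions.

```python
import math

def d2_variogram(points, vectors, lag, tol):
    """Half the mean squared Euclidean distance between the p-component
    vectors of sample pairs separated by lag +/- tol."""
    total, n_pairs = 0.0, 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if abs(math.dist(points[i], points[j]) - lag) <= tol:
                total += sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))
                n_pairs += 1
    return 0.5 * total / n_pairs if n_pairs else float("nan")

# Hypothetical two-component descriptors at four points along a transect.
points = [(float(k), 0.0) for k in range(4)]
vectors = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (3.0, 2.0)]
d2_lag1 = d2_variogram(points, vectors, lag=1.0, tol=0.1)
```

With p = 1 the function reduces to the ordinary experimental variogram, which is a convenient check on any implementation.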

7.3.2 Indices of Spatial Pattern

Indices have been developed at various spatial scales to summarise various aspects of spatial distributions that cannot be explicitly characterised with the variogram. Such indices can then be used in a monitoring approach of spatial distributions. Indices have been developed to characterise the spatial distribution of fish density values over a few nautical miles, the schooling pattern and the clustering pattern of schools. The multi-scale organisation of spatial distributions will be discussed using density-dependence as making the link between scales of spatial organisation.

7.3.2.1 Indices for Density Values

Until now we have used tools that mainly characterise correlation. But correlation is only one aspect of spatial distributions. How then can one characterise a spatial distribution in its many aspects? For a full characterisation of the many properties of spatial distributions a variety of geostatistical indices have been developed. Woillez et al. (2007) proposed a list of 10 indices (Table 7.2) to characterise occupation, aggregation, location, dispersion, correlation and overlap. These notions are somewhat related (e.g., aggregation, dispersion and occupation) and formal relationships exist between indices (Woillez et al. 2007). The centre of gravity of a population with a measure of dispersion around it had been proposed already (Swain and Sinclair 1994, Atkinson et al. 1997, Bez and Rivoirard 2001). The occupation and aggregation indices are not truly spatial in the sense that they are sensitive to the histogram and not to the spatial location of values. Various indices to characterise aggregation have been suggested (area coverage: Swain and Sinclair 1994, Gini index: Myers and Cadigan 1995, spatial selectivity index: Petitgas 1998) which all relate to the area associated with the highest values. But the spreading index is more general in the sense that the number of zeroes does not affect this index. Therefore in calculation of the spreading index the delineation of the positive data domain is not necessary. Spatial indices are useful in characterising the spatial organisation of the life cycle. It can be demonstrated that young immatures, young matures and older matures differ in some aspects of their spatial distributions, in particular for location, aggregation and dispersion (Woillez et al. 2007). Also, these indices have the potential to be used in a monitoring system so as to detect changes in the spatial distributions, which could be helpful in the context of climate change or habitat conservation for fish stocks.
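
The location and dispersion indices mentioned above (centre of gravity and a measure of dispersion around it) can be sketched as density-weighted moments. The sketch assumes the usual discrete form, in which each sample carries an optional area-of-influence weight (equal weights by default) and the total weighted density is positive; function and variable names are illustrative.

```python
def centre_of_gravity(points, z, weights=None):
    """Density-weighted mean location of the samples."""
    if weights is None:
        weights = [1.0] * len(points)
    mass = sum(w * v for w, v in zip(weights, z))
    return tuple(
        sum(w * v * p[d] for w, v, p in zip(weights, z, points)) / mass
        for d in range(len(points[0]))
    )

def inertia(points, z, weights=None):
    """Density-weighted mean squared distance to the centre of gravity."""
    if weights is None:
        weights = [1.0] * len(points)
    cg = centre_of_gravity(points, z, weights)
    mass = sum(w * v for w, v in zip(weights, z))
    spread = sum(
        w * v * sum((p[d] - cg[d]) ** 2 for d in range(len(cg)))
        for w, v, p in zip(weights, z, points)
    )
    return spread / mass

# Hypothetical densities at three stations along a line.
points = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
z = [1.0, 2.0, 1.0]
cg = centre_of_gravity(points, z)
spread = inertia(points, z)
```

Because both quantities are weighted by density, adding stations with zero density leaves them unchanged, so no delineation of the positive area is needed for these two indices.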

7.3.2.2 Indices of Schooling Patterns

The previous spatial indices characterise how density values (numbers of fish per unit area) are organised geographically. The area unit of such samples is in general a few nautical miles. At a smaller spatial scale, fish are organised in schools or shoals or aggregations. For pelagic fish, a set of indices has been proposed to characterise the schools (ICES 2000) which relate to school geometry, internal density and vertical position. These parameters can be estimated by analysing, school by school, the digital acoustic records (echogram) using image analysis software. School size and abundance are in general related on a log-scale (Freon and Misund 1999, Chap. 4). The bivariate plot can be summarised by summing the schools' acoustic backscatter by ascending order of school size. Such (spectrum) curves indicate how biomass is distributed in classes of school size and their curvature indicates the higher contribution of particular school sizes. Curvature was characterised by geostatistical (schooling) indices (Petitgas 2000, p. 29: spectrum indices 1 and 2) which are an extension of the spatial selectivity index where occupation area is replaced by school size. Two indices were considered. Index spectrum 1 was defined as the area difference between the observed curve and the diagonal and characterised the curvature. It is sensitive to the skew in the distribution of fish biomass as a function of school size as well as to the skew in the distribution of school size. Index spectrum 2 was defined as the area difference between the observed curve and the curve obtained considering that all schools had equal density (equal to the ratio of the summed school biomass over the summed school sizes). This second index characterised the skew in the distribution of biomass as a function of school size irrespective of the distribution of school size. Inter-annual variation in the schooling pattern could be characterised using these indices: in the Bay of Biscay school biomass varied but not the distribution of school size while for northern North Sea herring schools, school biomass varied with school size distribution.

7.3.2.3 Indices of Clustering Patterns of Schools

At a higher spatial level of organisation, schools occupy habitats with particular spatial distributions as they generally occur in clusters of schools (Freon and Misund 1999, Chap. 4). Digital acoustic records can be replayed by school using image analysis software (ICES 2000) providing data sets of georeferenced school parameters. In such data, the distribution of the nearest school neighbour distance is in general skewed, which demonstrates that schools are aggregated in clusters of schools. Swartzman (1997) defined school clusters by grouping schools together based on a chosen distance threshold applied to the nearest school neighbour along the sailed acoustic transects. Petitgas (2003) suggested a procedure to define the threshold based on a spatial point process approach which maximised homogeneity of the school distribution within the clusters. School cluster indices were estimated to characterise the clustering pattern: number of clusters, number of solitary schools, dimension of clusters, number of schools per unit cluster length, skewness in the nearest neighbour distance within clusters. Resulting clusters of schools showed spatial scales of a few nautical miles (3–5 n.m.), while larger spatial components of tens of miles could also be present in the data sets (regional or meso scale structures). The pair correlation function, an analog of the variogram but for point processes (Stoyan and Stoyan 1994), was a helpful structural tool to demonstrate scale in the schools' spatial distribution.
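
The threshold-based grouping of schools along a transect, and some of the cluster indices listed above, can be sketched as follows. This is a simplified one-dimensional illustration, assuming school positions are given as distances along the sailed track and that the threshold is supplied by the user; it does not reproduce the point-process procedure used to choose the threshold, and all names are hypothetical.

```python
def cluster_schools(positions, threshold):
    """Group schools along a transect: a school joins the current cluster
    when its distance to the previous school is at most the threshold."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= threshold:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters

def cluster_indices(clusters):
    """A few clustering indices; groups of one school count as solitary."""
    multi = [c for c in clusters if len(c) > 1]
    lengths = [c[-1] - c[0] for c in multi]
    return {
        "n_clusters": len(multi),
        "n_solitary": len(clusters) - len(multi),
        "cluster_lengths": lengths,
        "schools_per_unit_length": [
            len(c) / length if length > 0 else float("nan")
            for c, length in zip(multi, lengths)
        ],
    }

# Hypothetical school positions (nautical miles along the track).
positions = [0.0, 0.5, 1.0, 6.0, 10.0, 10.4]
clusters = cluster_schools(positions, threshold=2.0)
indices = cluster_indices(clusters)
```

With a 2 n.m. threshold the six schools form two clusters and one solitary school; repeating the grouping over a range of thresholds shows how sensitive the indices are to that choice.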

7.3.2.4 Multiscale Organisation and Density-Dependence

How do we relate the different spatial scales at which fish populations are organised? Statistical relationships have been examined between global characteristics of the population and particular spatial indices. Relationships between population abundance and area occupied have been observed either in the form of a relationship between global abundance and local density (Myers and Stokes 1989, Fisher and Frank 2004) or between global abundance and spatial indices of occupation and dispersion (Swain and Sinclair 1994, Atkinson et al. 1997, Woillez et al. 2007). But abundance may not always vary with occupancy (Swain and Morin 1996). Petitgas et al. (2001) attempted a multi-scale analysis of the variation in the spatial distribution with global abundance, including schooling and clusters of schools. They found no relationship between global abundance and indices of schooling and clustering but found a relationship between total school number and clustering parameters. A variety of situations seem possible perhaps because the scales and controls in the population spatial organisation are difficult to identify clearly using fisheries survey data. Four scenarios (Fig. 7.2) have been suggested with a geostatistical procedure for testing them on data (Swain and Sinclair 1994, Petitgas 1998): proportionality between global abundance and local density; change in habitat occupancy with no change in average local density; and intermediate cases where lower density areas or specific sites are replenished first when global abundance increases. Based on ecological theory, an underlying mechanism has been proposed to explain abundance-occupancy relationships. It is that of the density-dependent suitability of habitats (MacCall 1990: ‘basin model’) which balances potential suitability of habitats with intra-specific competition. This ‘basin model’ falls into one of the intermediate scenarios. The way by which the spatial organisation of a fish population can vary as an ensemble in all its organisational scales (schools, clusters, regional) cannot be predicted as yet as no multi-scale integrative model of spatial distribution exists. Such development would not only require a behavioural spatial mechanism to pass from one scale to the other but also a relation between external forces (fishing, environment, predation) and habitat suitability and fish behaviour. Comprehensive data analyses using spatial indices at various spatial scales in different stock situations and other forcing parameters are needed for knowledge to progress. Further, the spatial distribution does not seem to relate to population abundance only: Woillez et al. (2006b) also showed correlation between spatial indices and population dynamics parameters such as mortality or recruitment, making spatial indices good candidates for indicator based monitoring of fish stocks.

7.3.3 Variation in the Spatial Structure

The way by which fish aggregate and occupy their habitats is the expression of fish behaviour and can therefore be expected to depend on a variety of ecological factors. Thus the spatial structure as characterised by the variogram can be expected to depend at least on particular constraints (e.g., light, habitat geometry) and vary in time (e.g., day and night, seasonally, inter-annually) as well as with fish length or total abundance. Also, because sampling across space requires a certain amount of time, the space-time interaction within the survey data potentially affects the variographic structure and this has been investigated as well.

7.3.3.1 Aggregative Behaviour

The size and anisotropy of the domain over which fish distribute constrain their spatial organisation. Giannoulaki et al. (2006) compared the variographic structure of sardine and anchovy in different areas and seasons and reported that the range of the variogram varied with the size of the areas over which fish distributed. Also variograms varied seasonally concomitantly with seasonal variation in fish length. Time of day is also a factor affecting the aggregative behaviour of fish. Pelagic fish tend to aggregate in schools during day time and disaggregate at night (Freon and Misund 1999). Rivoirard et al. (2000) reported day/night variographic difference on the Norwegian spring spawning herring when wintering in the fjord complex at Lofoten. Night variograms had longer range than day variograms, meaning that the fish aggregations were shorter during day than during night. The aggregative behaviour also varies seasonally (e.g., feeding vs. spawning or migratory schooling behaviour: Freon and Misund 1999). Mello and Rose (2003) reported different types of variographic structure depending on seasonally varying types of aggregating behaviour. They analysed which seasonal spatial structure had the best survey precision for different types of survey designs.

Fig. 7.2 Four scenarios of spatial distribution change with global abundance. The lines marked (1) and (2) represent the density surface when global abundance increases from year (1) to year (2). Abscissa x represents space. Ordinate Z(x,t) represents fish density at location x in year t [after Petitgas (1998)]

7.3.3.2 Inter-Annual Variation

Unless total abundance dramatically changes across years, the variographic structure shows good consistency across years, the range being the most consistent while sill and nugget show more variability. Analysing a large number of years and different species, Fernandes and Rivoirard (1999) (also in Rivoirard et al. 2000) used an automated procedure to fit a variogram model for each year and species. Different models (e.g., spherical, exponential, linear) were fitted in each year and a goodness of fit criterion was used to finally select the model. Nearly all years shared similar variographic models and range parameters except for a few years which were atypical and dominated by a few very high values. For these years, the average variogram across all years was used. In a similar situation with egg survey data, Bellier et al. (2007) estimated variogram model parameters in each year by fitting all variograms of all years simultaneously using non-linear mixed effects regression (Pinheiro and Bates 2000). Considering that all years shared a similar underlying variogram, all years were fitted with a spherical model with a constant range but varying nugget and sill across years. Fixed effects in the range, sill and nugget and random effects in the sill and nugget were estimated by the mixed effects regression procedure, which provided variogram parameters in each year.
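
An automated choice among candidate variogram models can be sketched along the following lines. The sketch is not the procedure of the cited studies: it assumes an unweighted sum of squared errors as the goodness of fit criterion, a user-supplied grid of candidate ranges, and ordinary least squares for the nugget and (partial) sill at each range. All names are illustrative.

```python
import math

def spherical(h, a):
    r = min(h / a, 1.0)
    return 1.5 * r - 0.5 * r ** 3

def exponential(h, a):
    return 1.0 - math.exp(-h / a)

def linear(h, a):
    return h / a  # unbounded; fitted sill divided by a is the slope

SHAPES = {"spherical": spherical, "exponential": exponential, "linear": linear}

def least_squares_line(xs, ys):
    """Fit y = intercept + slope * x; return (intercept, slope, sum of squares)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    sse = sum((intercept + slope * x - y) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse

def select_model(lags, gammas, candidate_ranges):
    """Return the shape, range, nugget and partial sill with the smallest error."""
    best = None
    for name, shape in SHAPES.items():
        for a in candidate_ranges:
            nugget, sill, sse = least_squares_line([shape(h, a) for h in lags], gammas)
            if best is None or sse < best["sse"]:
                best = {"model": name, "range": a, "nugget": nugget,
                        "sill": sill, "sse": sse}
    return best

# Synthetic experimental variogram generated from a known spherical model.
lags = [float(h) for h in range(1, 16)]
gammas = [1.0 + 4.0 * spherical(h, 10.0) for h in lags]
best = select_model(lags, gammas, candidate_ranges=[5.0, 10.0, 15.0, 20.0])
```

On the synthetic input the search recovers the spherical model with range 10, nugget 1 and partial sill 4. With real data one would typically weight lags by their number of pairs and constrain the nugget and sill to be non-negative, which this sketch does not do.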

7.3.3.3 Density-Dependence

Stock collapse is often associated with reduction in the spatial occupation and this has been reported using indices of spatial occupation (see above). Variographic structure is also expected to show density-dependence at particular abundance levels. Warren (1997) reported changes in the northern cod spatial structure associated with the stock collapse: variogram range decreased and nugget increased with collapsing stock abundance. Barange et al. (2005) reported little change in the indicator variograms for low and high cut-offs in contrasting years of high and low total abundance. But the variographic structure for intermediate cut-offs varied with total abundance. The range increased with abundance, meaning that when abundance increased, the intermediate values occupied larger areas. This spatial behaviour is coherent with that reported using other but related tools, e.g., by Swain and Sinclair (1994) using occupied areas at different cut-offs, by MacCall (1990) using the ‘basin model’ based on density-dependent habitat selection theory, by Petitgas et al. (2001) using clustering indices, or by Petitgas (1998) using geostatistical aggregation curves.

7.3.3.4 Multi-Survey Spatial Structural Models

Annual monitoring fisheries surveys result in a time series of spatial data. To model the space-time structure in all years coherently, and make best use of all the available information, 3D variogram models have been considered. In acoustic fisheries surveys, the number of trawl hauls is small in each year but when surveys are repeated on an annual basis with similar sampling design the number of trawl hauls available across years is larger. When the spatial distribution and variogram structure are consistent across years, the map in any given year could be more precise when using all samples from all years rather than just the samples of the current year. For that purpose Guiblin et al. (1996) (also in Rivoirard et al. 2000, Chap. 4) inferred a multi-year spatial structure. They worked on mapping herring length in the northern North Sea. The spatial structure of fish length was consistent across the years. Temporal variation was collapsed as either ‘in the same year’ or ‘in different years’ irrespective of the year lag. The space-time variogram model was additive: $\gamma(h, t) = \gamma_{spa}(h) + \gamma_{t}(t)$, where $\gamma_{t}$ was a nugget effect in time. The variogram in space irrespective of time $\gamma_{spa}(h)$ was estimated as the variogram for pairs of points belonging to the same year and averaged across all years. The 3D variogram was that for pairs of points separated in space by distance h and belonging to two different years. When kriging at a particular point in a particular year using a 3D neighbourhood, sample points of the given year will be considered as well as sample points from other years. A temporal nugget effect is thus added in the kriging system for those samples not belonging to the year considered. It should be noted that the full model contained a trend surface guided by depth and that the space-time structure applied to the residuals. A similar model was fitted to anchovy length in the Bay of Biscay although the structure was slightly different (Petitgas et al. 2003b): $\gamma(h, t) = I_{t=0}\,\gamma_{spa}(h) + I_{t>0}\,\gamma_{t}$, where $\gamma_{t}$ is a nugget term. In this case study, interannual variations were such that the location of patches changed between years resulting in a pure nugget structure between sample points belonging to two different years. Here, in the 3D neighbourhoods, there is a switch in the model (structure or nugget) depending on whether the sample points belong to the same year for which kriging is performed or not. The multi-year modelling approach used in both case studies could have a generic interest for analysing fisheries survey data in each year while using the multi-year information. Space-time models are also of interest when estimating total annual egg abundance using repeated egg surveys over the spawning season. The estimation problem can be solved in two steps, first the estimation of egg abundance at each time by spatial integration and second the integration under the egg abundance ogive to estimate total annual egg abundance. Petitgas (1997a) fitted a multiplicative space-time model on sole egg distributions and derived the estimation variance for the annual egg abundance with that model. In that model, the temporal structure was in the spatial mean while the spatial structure was constant in time.
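
The two multi-year structures described above (spatial variogram plus a temporal nugget, and a switch between spatial structure and pure nugget) can be written as small functions. This is a sketch under stated assumptions: the spatial component is taken to be a spherical model without nugget, time enters only through a same-year/different-year flag, and the parameter names and values are hypothetical.

```python
def spherical_variogram(h, sill, a):
    """Spherical variogram without nugget, used here as the spatial component."""
    if h <= 0:
        return 0.0
    r = min(h / a, 1.0)
    return sill * (1.5 * r - 0.5 * r ** 3)

def additive_model(h, same_year, g_spa, nugget_time):
    """Spatial structure everywhere, plus a nugget between different years."""
    return g_spa(h) + (0.0 if same_year else nugget_time)

def switching_model(h, same_year, g_spa, nugget):
    """Spatial structure within a year, pure nugget between different years."""
    return g_spa(h) if same_year else nugget

def g_spa(h):
    # Hypothetical spatial component: sill 1.0, range 10 distance units.
    return spherical_variogram(h, sill=1.0, a=10.0)

same_year_value = additive_model(5.0, True, g_spa, nugget_time=0.3)
other_year_additive = additive_model(5.0, False, g_spa, nugget_time=0.3)
other_year_switching = switching_model(5.0, False, g_spa, nugget=0.3)
```

In a kriging system built on a 3D neighbourhood, either function would supply the variogram value for a pair of samples once it is known whether the two samples belong to the same year.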

7.3.3.5 Space-Time Interaction Within a Survey

It is usual practice to consider survey data as synoptic and correlation in the data as spatial only. But because of fish movement, variation in aggregating behaviour and because sampling across space takes a certain amount of time, survey data will contain space-time interactions. The importance of the interaction will depend on biological variation, survey design, and boat speed. The influence of time cannot be reduced to the addition of a further dimension because sampling is not performed in 3D space but in a changing 2D space (e.g., Petitgas 2001). Rivoirard (1998) proposed space-time variogram models for different types of fish movements. For instance, in the case of Brownian fish movement and isotropic spatial structure, the space-time covariance can be written as a convolution of the underlying spatial covariance and diffusion due to movement. The effect of different fish movements (random, cyclical, migration) on the variogram was investigated on northern North Sea herring using simulations (Rivoirard et al. 2000, Chap. 5). An underlying average spatial distribution was estimated over the time series of surveys that represented the probability map of the fish distribution. A large number of patches of fish were considered for motion, which were located initially using the probability map. In the case of the random motion of patches, a constraint on the probability of motion was imposed so as to conform at any time and location to the underlying probability map of the fish distribution. The fish distribution was dynamically simulated while acoustic survey transects were simulated that sampled the fish. Variograms were estimated using the simulated data for the different types of fish movements. The result was that random and tidal motion had little influence on the variographic structure. In contrast the influence of migration was a concern when the survey transects were in the direction of the migration. If the survey transects crossed the migration direction, the effect was less important.

7.4 A Word on Software

A large variety of geostatistical computer software and libraries are available. Most include variography and kriging on a grid. Some offer a comprehensive list of geostatistical tools (e.g., Isatis, Gslib) and others are dedicated to particular aspects of geostatistics (e.g., Variowin, Eva). Updated information on computer software is available on the internet. The ai-geostats home page (http://www.ai-geostats.org/) provides, among other information about geostatistics, an inventory of software with descriptions of their functionalities and links to their home pages. The ai-geostats web resource is also a forum for discussion on geostatistics. Rivoirard et al. (2000) provide an appendix with guidance on geostatistical books and description of software tools. Some of this information is updated here. Only software packages that are the most comprehensive or address particular aspects that others do not and that are of interest in fisheries science are presented here (Table 7.3). The present selection also covers the range of different computer platforms.

Isatis and Gslib are the most comprehensive packages. Isatis is a complete geostatistical package offering all geostatistical methods (linear, non-linear, stationary, non-stationary, monovariate, multivariate, simulations). It is a commercial software package that runs under Unix or an emulated PC. Gslib (Deutsch and Journel 1992) is a suite of FORTRAN routines that also covers a wide range of geostatistical methods though perhaps less complete for non-linear geostatistics. The code of PC executables is free and downloadable from the internet. An interface for running Gslib (WinGslib) exists as a commercial product for Windows. The MATLAB kriging toolbox is a collection of MATLAB routines for kriging and co-kriging based on Gslib routines. The code is free and accessible from the internet. Gstat (Pebesma 2004) is an S-plus library as well as an R package, which covers multivariate geostatistics and simulations (the R package may offer slightly fewer functionalities than the S library). The code is free and is downloadable from the internet. Variowin (Pannatier 1996) is a PC software tool dedicated to the estimation and fitting of variograms. The code is not open source. The book and executable software can be downloaded from the internet. None of the previous software packages considers the question of global estimation over a domain. EVA (Petitgas and Lafont 1997) is a PC software tool that specifically targets global abundance and its estimation variance for a large variety of survey designs including adaptive design. It is the only software available that allows one to estimate the global estimation variance. The code is not open source but the software executable is freely available from the authors.

Table 7.3 Selected geostatistical software. An inventory of software is available at http://www.ai-geostats.org/software/

| Name | Access | Code | Internet | Reference |
| --- | --- | --- | --- | --- |
| Isatis | Commercial | No | http://www.geovariances.fr/ | |
| Gslib | Free: on internet | Yes | http://www.gslib.com/ | Deutsch and Journel (1992) |
| Matlab toolbox | Free: on internet | Yes | http://www.globec.whoi.edu/software/ | |
| Gstat | Free: on internet | Yes | http://www.gstat.org/ | Pebesma (2004) |
| Variowin | Free: on internet | No | http://www.sst.unil.ch/research/variowin/index.html | Pannatier (1996) |
| Eva | Free: from authors | No | contact: [email protected] | Petitgas and Lafont (1997) |

7.5 Future Challenges

Geostatistics formalizes the relationship between sample locations, correlation structure in the population and precision in the estimates. Geostatistics provides model-based variance estimates of global abundance as well as mapping by kriging. Therefore and in contrast to a design-based approach, geostatistics allows one to separate data analysis from survey design, giving more flexibility in the design. These have been the early contributions of geostatistics to fisheries science that resolved the problem of estimating the precision of global abundance over a domain for sampling designs in which the samples were not taken independently from each other. Because the cornerstone of geostatistics is the modelling of a spatial structure, the methodology also offers tools to characterise population aggregation patterns and address the key biological question of changes in the spatial organisation of fish populations under the controls of density dependence, environment and behaviour.

The need to understand the spatio-temporal variability in fish stocks and its controls, as well as to develop multiscale models of the spatial organisation of fish populations, remains. In the classical univariate geostatistical approach, variability in the data is interpreted as spatial variation only. But when using spatio-temporal and multivariate approaches, mathematical dimensionality is increased in order to properly consider the variability in the data. To solve that paradox and because biological variability originates from biological processes occurring at different scales, multiscale data collection schemes would be helpful. Multivariate geostatistics has the potential to assemble multiscale information as well as combine stochastic and deterministic approaches. Multiscale sampling can be achieved in various ways, by adaptive sampling designs or by combining surveys performed at different scales, e.g., from the large scale fisheries surveys to the fine scale survey on aggregation behaviour over time. Multivariate geostatistical approaches represent a large field with challenging fisheries applications, which in addition to linear multivariate geostatistics also includes non-stationary and non-linear geostatistics as well as conditional simulations. Software tools are now widely available that will allow for the development of many future applications using these more complex approaches.

Fisheries management issues have expanded to include population conservation issues as well as an ecosystem approach to fish stock diagnostics. Thus there is a need to relate fish population dynamics with population distribution and the occupation of essential habitats, trophic interactions and climate forcing on the habitats. In all these topics it is key to monitor the spatial distribution of a range of target fish species as well as their environment. Geostatistical spatial indices have shown potential in elaborating indicator based diagnostics of fish stocks. This approach can be extended to a variety of indices in the different components of the ecosystem. Also, an atlas of geostatistically derived fish distribution maps can be helpful in establishing regions of essential fish habitat on a multispecies and multi-year basis. Geostatistics will be useful in developing spatial indices to serve as indicators for management as well as in developing space-time methods to analyse large numbers of distribution maps with functional relationships at different scales.

References

Armstrong M, Renard D, Rivoirard J, Petitgas P (1992) Geostatistics for fish survey data. Course C-148, Centre de Geostatistique, Fontainebleau, France

Atkinson D, Rose G, Murphy E, Bishop C (1997) Distribution changes and abundance of northern cod (Gadus morhua), 1981–1993. Canadian Journal of Fisheries and Aquatic Sciences 54(Suppl. 1): 132–138

Barange M, Coetzee J, Twatwa N (2005) Strategies of space occupation by anchovy and sardine in the southern Benguela: the role of stock size and intra-specific competition. ICES Journal of Marine Science 62: 645–654

Bellier E, Planque B, Petitgas P (2007) Historical fluctuations in spawning location of anchovy (Engraulis encrasicolus) and sardine (Sardina pilchardus) in the Bay of Biscay during 1967–1973 and 2000–2004. Fisheries Oceanography 16: 1–15

Bez N (2002) Global fish abundance estimation from regular sampling: the geostatistical transitive method. Canadian Journal of Fisheries and Aquatic Sciences 59: 1921–1931

Bez N, Rivoirard J (2000a) Indices of collocation between populations. In: Checkley D, Hunter J, Motos L, van der Lingen C (eds) Report of a workshop on the use of Continuous Underway Fish Egg Sampler (CUFES) for mapping spawning habitat of pelagic fish. GLOBEC Report 14

Bez N, Rivoirard J (2000b) On the role of sea surface temperature on the spatial distribution of early stages of mackerel using inertiograms. ICES Journal of Marine Science 57: 383–392

Bez N, Rivoirard J (2001) Transitive geostatistics to characterise spatial aggregations with diffuse limits: an application on mackerel ichtyoplankton. Fisheries Research 50: 41–58

Bouleau M, Bez N, Reid D, Godo O, Gerritsen H (2004) Testing various geostatistical models to combine bottom trawl stations and acoustic data. ICES CM 2004/R:28

Bulgakova T, Vasilyev D, Daan N (2001) Weighting and smoothing of stomach content data as input for MSVPA with particular reference to the Barents Sea. ICES Journal of Marine Science 58: 1208–1218

Chiles JP, Delfiner P (1999) Geostatistics: modelling spatial uncertainty. Wiley, New York

Cochran W (1977) Sampling techniques. Wiley, New York

Conan G (1985) Assessment of shellfish stocks by geostatistical techniques. ICES CM 1985/K:30

Conners E, Schwager S (2002) The use of adaptive cluster sampling for hydroacoustic surveys. ICES Journal of Marine Science 59: 1314–1325

Cressie N (1991) Statistics for spatial data. Wiley, New York

Deutsch C, Journel A (1992) Geostatistical software library and user’s guide. Oxford University Press, Oxford

Doray M, Petitgas P, Josse E (2008) A geostatistical method for assessing biomass of tuna aggregations around Fish Aggregation Devices with star acoustic surveys. Canadian Journal of Fisheries and Aquatic Sciences 65: 1193–1205

Fernandes P, Rivoirard J (1999) A geostatistical analysis of the spatial distribution and abundance of cod, haddock and whiting in North Scotland. In: Gomez-Hernandez J, Soares A, Froidevaux R (eds) GeoENV II – Geostatistics for Environmental Applications. Kluwer Academic Press, Dordrecht. pp 201–212

Fisher J, Frank K (2004) Abundance distribution relationships and conservation of exploited marine fishes. Marine Ecology Progress Series 279: 201–213

Freon P, Misund O (1999) Dynamics of pelagic fish distribution and behaviour: effects on fisheries and stock assessment. Blackwell Science, Oxford

Gerlotto F, Bertrand S, Bez N, Gutierrez M (2006) Waves of agitation inside anchovy schools observed with multi beam sonar: a way to transmit information in response to predation. ICES Journal of Marine Science 63: 1405–1417

Giannoulaki M, Machias A, Koutsikopoulos C, Somarakis S (2006) The effect of coastal topography on the spatial structure of anchovy and sardine. ICES Journal of Marine Science 63: 650–662

Gimona A, Fernandes P (2003) A conditional simulation of acoustic survey data: advantages and pitfalls. Aquatic Living Resources 16: 123–129

Gohin F (1985) Planification des experiences et interpretation par la theorie des variables regionalisees: application a l’estimation de la biomasse d’une plage. ICES CM 1985/D:03

Guiblin P, Rivoirard J, Simmonds J (1995) Analyse structurale de donnees a distribution dissymetrique: exemple du hareng ecossais. Cahiers de Geostatistique 5: 137–159

Guiblin P, Rivoirard J, Simmonds J (1996) Spatial distribution of length and age for Orkney-Shetland herring. ICES CM 1996/D:14

ICES (1989) Report of the workshop on spatial statistical techniques. ICES CM 1989/K:38

ICES (1992) Acoustic survey design and analysis procedure: a comprehensive review of current practice. ICES Cooperative Research Report 187

ICES (1993) Report of the workshop on the applicability of spatial statistical techniques to acoustic survey data. ICES Cooperative Research Report 195

ICES (2000) Report on Echotrace Classification. ICES Cooperative Research Report 238

Journel A, Huijbregts Ch (1978) Mining geostatistics. Academic Press, London

Lantuejoul C (2002) Geostatistical simulations: models and algorithms. Springer-Verlag, Berlin

Lo N, Griffith D, Hunter J (1997) Using a restricted adaptive cluster sampling design to estimate hake larval abundance. CalCOFI report 38: 103–113

MacCall A (1990) Dynamic geography of marine fish populations. University of Washington Press, Seattle

Matheron G (1971) The theory of regionalised variables and their applications. Les Cahiers du Centre de Morphologie Mathematiques, Fascicule 5. Centre de Geostatistique, Fontainebleau

Matheron G (1973) The intrinsic random functions and their applications. Advances in Applied Probability 5: 439–468

Matheron G (1989) Estimating and choosing: an essay on probability in practice. Springer-Verlag, Berlin

McCullagh P, Nelder J (1995) Generalised linear models. Chapman and Hall, London

Mello L, Rose G (2003) Using geostatistics to quantify seasonal distribution and aggregation patterns of fishes: an example of Atlantic cod (Gadus morhua). Canadian Journal of Fisheries and Aquatic Sciences 62: 659–670

Myers R, Cadigan N (1995) Was an increase in natural mortality responsible for the collapse of northern cod? Canadian Journal of Fisheries and Aquatic Sciences 52: 1274–1285

Myers R, Stokes K (1989) Density dependent habitat utilization of groundfish and the improvement of research surveys. ICES CM 1989/D:15

Pannatier Y (1996) Variowin: software for spatial data analysis in 2D. Springer-Verlag, Berlin

Pebesma E (2004) Multivariate geostatistics in S: the gstat package. Computers and Geosciences 30: 683–691

Petitgas P (1991) Un modele de co-regionalisation pour les poissons pelagiques formant des bancs le jour et se dispersant la nuit. Note N33/91/G, Centre de Geostatistique, Fontainebleau

Petitgas P (1993a) Geostatistics for fish stock assessments: a review and an acoustic application. ICES Journal of Marine Science 50: 285–298

Petitgas P (1993b) Use of disjunctive kriging to model areas of high pelagic fish density in acoustic fisheries surveys. Aquatic Living Resources 6: 201–209

Petitgas P (1996) Geostatistics and their applications to fisheries survey data. In: Megrey B, Moksness E (eds) Computers in Fisheries Research. Chapman and Hall, London. pp 113–142

Petitgas P (1997a) Sole egg distributions in space and time characterized by a geostatistical model and its estimation variance. ICES Journal of Marine Science 54: 213–225

Petitgas P (1997b) Use of disjunctive kriging to analyse an adaptive survey design for anchovy eggs in Biscay. Ozeanografika 2: 121–132

Petitgas P (1998) Biomass dependent dynamics of fish spatial distributions characterized by geostatistical aggregation curves. ICES Journal of Marine Science 55: 443–453

Petitgas P (ed) (2000) Cluster: Aggregation patterns of commercial fish species under different stock situations and their impact on exploitation and assessment. Final report to the European Commission, contract FAIR-CT-96.1799. European Commission, DG-Fish, Brussels

Petitgas P (2001) Geostatistics in fisheries survey design and stock assessment: models, variances and applications. Fish and Fisheries 2: 231–249

Petitgas P (2003) A method for the identification and characterization of clusters of schools along the transects lines of fisheries acoustic surveys. ICES Journal of Marine Science 60: 872–884

Petitgas P (2004) About non-linear geostatistics and adaptive sampling. In: Report of the Workshop on Survey Design and Data Analysis (WKSAD). ICES CM 2004/B:07. Working Document 11

Petitgas P, Lafont T (1997) EVA2: Estimation variance version 2, a geostatistical software for the precision of fish stock assessment surveys. ICES CM 1997/Y:22

Petitgas P, Masse J, Beillois P, Lebarbier E, Le Cann A (2003a) Sampling variance of species identification in fisheries acoustic surveys based on automated procedures associating acoustic images and trawl hauls. ICES Journal of Marine Science 60: 437–445

Petitgas P, Masse J, Grellier P, Beillois P (2003b) Variation in the spatial distribution of fish length: a multi-annual geostatistics approach on anchovy in Biscay, 1983–2002. ICES CM 2003/Q:15

Petitgas P, Reid D, Carrera P, Iglesias M, Georgakarakos S, Liorzou B, Masse J (2001) On the relation between schools, clusters of schools, and abundance in pelagic fish. ICES Journal of Marine Science 58: 1150–1160

Pinheiro J, Bates D (2000) Mixed effects models in S and Splus. Springer-Verlag, Berlin

Rivoirard J (1994) Introduction to disjunctive kriging and non-linear geostatistics. Clarendon, Oxford

Rivoirard J (1998) Quelques modeles spatio-temporels de bancs de poissons. Note N12/98/G. Centre de Geostatistique, Fontainebleau

Rivoirard J, Guiblin P (1997) Global estimation variance in presence of conditioning parameters. In: Baafi E, Schofield N (eds) Geostatistics Wollongong ’96, Volume I. Kluwer Academic Publishers, The Netherlands. pp 246–257

Rivoirard J, Simmonds J, Foote K, Fernandes P, Bez N (2000) Geostatistics for estimating fish abundance. Blackwell Science, Oxford

Rivoirard J, Wieland K (2001) Correcting for the effect of daylight in abundance estimation of juvenile haddock (Melanogrammus aeglefinus) in the North Sea: an application of kriging with external drift. ICES Journal of Marine Science 58: 1272–1285

Rossi R, Mulla D, Journel A, Franz E (1992) Geostatistical tools for modeling and interpreting ecological spatial dependence. Ecological Monographs 62: 277–314

Rufino M, Maynou F, Abello P, Yule A (2006) Small-scale non-linear geostatistical analysis of Liocarcinus depurator (Crustacea: Brachyura) abundance and size structure in a western Mediterranean population. Marine Ecology Progress Series 276: 223–235

Simmonds J (1995) Survey design and effort allocation: a synthesis of choices and decisions for an acoustic survey. North Sea herring is used as an example. ICES CM 1995/B:09

Simmonds J, Fryer R (1996) Which are better, random or systematic acoustic surveys? A simulation using North Sea herring as an example. ICES Journal of Marine Science 53: 39–50

Simmonds J, MacLennan D (2005) Fisheries acoustics: theory and practice. Blackwell Science, Oxford

Stoyan D, Stoyan H (1994) Fractals, random shapes and point fields. Wiley, New York

Sullivan P (1991) Stock abundance estimation using depth-dependent trends and spatially correlated variation. Canadian Journal of Fisheries and Aquatic Sciences 48: 1691–1703

Swain D, Morin R (1996) Relationships between geographic distribution and abundance of American plaice (Hippoglossoides platessoides) in the southern gulf of St. Lawrence. Canadian Journal of Fisheries and Aquatic Sciences 53: 106–119

Swain D, Sinclair A (1994) Fish distribution and catchability: what is the appropriate measure of distribution? Canadian Journal of Fisheries and Aquatic Sciences 51: 1046–1054

Swartzman G (1997) Analysis of the summer distribution of fish schools in the Pacific Boundary Current. ICES Journal of Marine Science 54: 105–116

Thompson S, Seber G (1996) Adaptive sampling. Wiley, New York

Wackernagel H (1995) Multivariate geostatistics: an introduction with applications. Springer-Verlag, Berlin

Warren W (1997) Changes in the within-survey spatio-temporal structure of the northern cod (Gadus morhua) population, 1985–1992. Canadian Journal of Fisheries and Aquatic Sciences 54(Suppl. 1): 139–148

Woillez M, Poulard JC, Rivoirard J, Petitgas P, Bez N (2007) Indices for capturing spatial patterns and their evolution in time with an application on European hake (Merluccius merluccius) in the Bay of Biscay. ICES Journal of Marine Science 64: 537–550

Woillez M, Rivoirard J, Fernandes P (2006a) Evaluating the uncertainty of abundance estimates from acoustic surveys using geostatistical conditional simulations. ICES CM 2006/I:15

Woillez M, Petitgas P, Rivoirard J, Fernandes P, ter Hofstede R, Korsbrekke K, Orlowski A, Spedicato MT, Politou CY (2006b) Relationships between population spatial occupation and population dynamics. ICES CM 2006/O:05
