10980_2017_520_moesm1_esm.docx - springer …10.1007... · web viewa map of the probability of...

Appendix 1.

Brunei Model Description

The ten most important variables in the calibrated random forest Brunei model based on Model

Improvement Ratio were Aggregation Index at 10km radius, Edge Density at 10km radius,

Proportion of peatswamp forest at 40km radius, Proportion of water at 50km radius, Proportion

of plantation or regrowth at 1km radius, Proportion of lowland forest at 40km radius, Shannon

Diversity at 10km radius, Topographical Roughness at 40km radius, Elevation and Proportion of

lowland mosaic at 40km radius (Figure A1).
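
As a rough illustration of how a ranking of this kind can be derived, the sketch below scales raw importance scores by the largest score so that the top variable equals 1.0. This scaling is an assumption about how the model improvement ratio is obtained, and the function name and importance values are hypothetical rather than taken from the fitted Brunei model.

```python
def model_improvement_ratio(importance):
    """Scale raw importance scores so the top variable equals 1.0."""
    top = max(importance.values())
    if top <= 0:
        raise ValueError("at least one importance score must be positive")
    return {name: score / top for name, score in importance.items()}


# Hypothetical raw importance scores (NOT values from the Brunei model).
raw = {
    "AGGREGATION_INDEX_10k": 0.040,
    "EDGE_DENSITY_10k": 0.030,
    "ELEVATION": 0.010,
}
mir = model_improvement_ratio(raw)

# Rank variables from most to least important.
ranked = sorted(mir, key=mir.get, reverse=True)
print(ranked)
```

Scaling by the maximum keeps the ordering of the raw scores and places every variable on a common 0 to 1 axis, which is consistent with the axis range shown in Figure A1.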

Of the ten most influential variables in the prediction of forest loss in Brunei between 2000 and

2010, as judged by model improvement ratio, five had positive monotonic relationships, with

the probability of forest loss increasing with increasing

value of the variable (Edge Density at 10km radius, Proportion of peatswamp forest at 30km

radius, Proportion of water at 50km radius, Proportion of plantation or regrowth at 1km radius,

Shannon Diversity Index at 10km radius; Figure A2). Two variables, including the most

influential variable (Aggregation Index at 10km radius), had negative monotonic relationships

with deforestation risk, such that deforestation risk decreases as the value of these variables

increases. Two variables had unimodal relationships such that the frequency of deforestation

was maximum at intermediate values (Proportion of lower montane forest at 40km radius and

Distance to Population Centre).
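
The three response shapes named above can be made concrete with a small sketch. It assumes the response curve has already been sampled at increasing values of the predictor; the function name, tolerance, and sample curves are hypothetical and are not read from Figure A2.

```python
def classify_response(y, tol=1e-9):
    """Label a response curve sampled at increasing predictor values."""
    steps = [b - a for a, b in zip(y, y[1:])]
    if all(s >= -tol for s in steps):
        return "positive monotonic"
    if all(s <= tol for s in steps):
        return "negative monotonic"
    # Unimodal: rises to a single interior peak, then falls.
    peak = y.index(max(y))
    rising = all(s >= -tol for s in steps[:peak])
    falling = all(s <= tol for s in steps[peak:])
    if 0 < peak < len(y) - 1 and rising and falling:
        return "unimodal"
    return "other"


# Hypothetical curves of forest loss frequency along a predictor gradient.
print(classify_response([0.1, 0.2, 0.5, 0.6]))  # rises throughout
print(classify_response([0.9, 0.4, 0.2, 0.1]))  # falls throughout
print(classify_response([0.1, 0.6, 0.7, 0.2]))  # peaks at intermediate values
```

A perfectly flat curve is reported as positive monotonic by this rule, so a real analysis would likely add a check on the overall change in the response.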

We produced a visualization of the pattern of predicted forest loss probability across Brunei

(Figure A3), with zoomed-in views of two areas showing the pattern of observed loss in relation

to the predicted probability of loss (Figure A4). These maps show a very high association

between the predicted probability of loss and the actual pattern of forest loss and persistence

in Brunei.

The ten most important variables in the calibrated logistic regression Brunei model, based on

variable p-value, were Proportion of Plantation or regrowth at 1km radius, Proportion of water

at 50km radius, Proportion of peatswamp forest at 30km radius, Proportion of lowland forest at

40km radius, Proportion of upper montane forest at 50km radius, Edge Density at 10km radius,

Proportion of lowland mosaic at 40km radius, Focal Mean Population Density at 100km radius,

Topographical Roughness at 40km radius, and Aggregation Index at 10km radius (Table A1).

A map of the probability of forest loss across Brunei predicted by random forest and logistic

regression is shown in Figure A5. The two models differ in that the random forest model has

a much sharper spatial transition from areas with high predicted to low predicted forest loss

probability and a more complex, fine-scale pattern of predicted probability than the smoother

prediction produced by logistic regression.
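
One way to see why the logistic regression surface is smoother is to compare an inverse-logit curve with a piecewise-constant rule. The sketch below is illustrative only: the intercept, slope, threshold, and probability levels are hypothetical, and the single split is a stand-in for one tree partition rather than a random forest.

```python
import math


def logistic_probability(x, intercept=-2.0, slope=0.5):
    """Inverse-logit of a one-variable linear predictor (hypothetical values)."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def single_split_probability(x, threshold=4.0, low=0.05, high=0.90):
    """Piecewise-constant stand-in for one tree split (hypothetical values)."""
    return high if x >= threshold else low


# Walk along a hypothetical predictor gradient and compare the two curves.
for x in [3.0, 3.9, 4.0, 4.1, 5.0]:
    smooth = logistic_probability(x)
    stepped = single_split_probability(x)
    print(f"x={x:.1f}  logistic={smooth:.3f}  split={stepped:.2f}")
```

The logistic curve changes gradually across the threshold, whereas the split jumps between two levels. An ensemble of many such splits can therefore produce abrupt, fine-scale changes in predicted probability.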

Malaysian Borneo Model Description

The ten most important variables, based on Model Improvement Ratio, in the calibrated

Malaysian Borneo random forest model were Proportion of lowland mosaic at 20km radius,

Proportion of plantation or regrowth at 30km radius, Edge Density at 20km radius, Elevation,

Proportion of montane mosaic at 50km radius, Patch Density at 30km radius, Proportion of

lower montane forest at 50km radius, Shannon Diversity Index at 40km radius, Focal Mean

Population Density at 100km radius and Topographical Roughness at 40km radius (Figure A6).
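
Several of these predictors are class proportions or diversity indices summarised within a focal radius. The sketch below shows one way such values could be computed on a small categorical grid. It assumes the radius is measured in cells rather than kilometres and uses a hypothetical 3 x 3 land cover grid; it is not the software or data used for the models.

```python
import math
from collections import Counter


def focal_cells(grid, row, col, radius):
    """Class labels of cells whose centres lie within `radius` cells of (row, col)."""
    cells = []
    for r, line in enumerate(grid):
        for c, label in enumerate(line):
            if (r - row) ** 2 + (c - col) ** 2 <= radius ** 2:
                cells.append(label)
    return cells


def class_proportions(cells):
    """Proportion of the focal window occupied by each class."""
    counts = Counter(cells)
    return {label: n / len(cells) for label, n in counts.items()}


def shannon_diversity(proportions):
    """Shannon diversity: -sum(p * ln p) over the classes present."""
    return -sum(p * math.log(p) for p in proportions.values() if p > 0)


# Hypothetical grid: F = forest, W = water, P = plantation or regrowth.
grid = ["FFP", "FWP", "FFP"]
window = focal_cells(grid, row=1, col=1, radius=1)
props = class_proportions(window)
print(props, round(shannon_diversity(props), 4))
```

Repeating the calculation at several radii for every cell would give one layer per metric and scale, which is the form in which the predictors are named in this appendix.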

Of the ten most influential variables in the calibrated random forest model predicting

deforested vs. not-deforested cells in Malaysian Borneo between 2000 and 2010, six had

positive monotonic relationships wherein frequency of forest loss increased as the value of the

variable increased (Proportion of lowland mosaic at 20km radius, Proportion of plantation or

regrowth at 30km radius, Edge Density at 20km radius, Patch Density at 30km radius, Shannon

Diversity at 40km radius, and Focal Mean Population Density at 100km radius). All of these

showed strongly non-linear relationships, with a rapid initial rise followed by asymptotic flattening

(Figure A7). Four of the ten most influential variables in the calibrated Malaysian Borneo

random forest model showed monotonic negative relationships where frequency of forest loss

between 2000 and 2010 declined as the value of the variable increased. These mostly showed

what appear to be negative exponential shapes, with rapid decline at low values of x followed

by flattening as the x variable increased (Elevation, Proportion of montane mosaic at 50km

radius, Proportion of lower montane forest at 50km radius, and Topographical Roughness at a

40km radius; Figure A7).

We produced a visualization of the pattern of predicted forest loss probability across Malaysian

Borneo (Figure A8), with zoomed-in views of four areas showing the pattern of observed loss in

relation to the predicted probability of loss (Figure A9). These maps show a very high

association between the predicted probability of loss and the actual pattern of forest loss and

persistence in Malaysian Borneo.

The ten most important variables in the calibrated logistic regression Malaysian Borneo model,

based on variable p-value, were Edge Density at a 20km focal radius, Elevation, Proportion of

lowland open at a 50km focal radius, Proportion of water at a 40km focal radius, Distance to

Population Center, Proportion of montane mosaic at a 50km focal radius, Proportion of lower

montane forest at a 50km focal radius, Proportion of lowland forest at a 40km focal radius,

Topographical Roughness at a 40km focal radius, and Proportion of plantation or regrowth at a

30km focal radius (Table A2).

A map of the probability of forest loss across Malaysian Borneo predicted by random forest

and logistic regression is shown in Figure A10. The two models differ in that the random forest

model has a much sharper spatial transition from areas with high predicted to low predicted

forest loss probability and a more complex, fine-scale pattern of predicted probability than the

smoother prediction produced by logistic regression (Figure A10).

Kalimantan Model Description

The ten most important variables, based on Model Improvement Ratio, for Kalimantan were

Elevation, Patch Density at 40km radius, Proportion of lowland mosaic at 50km radius,

Proportion of lower montane forest at 20km radius, Proportion of plantation or regrowth at

50km radius, Edge Density at 10km radius, Focal Mean Population Density at 100km radius,

Topographical Roughness at 50km radius, Proportion of lowland open at 50km radius and

Shannon’s Diversity Index at 40km radius (Figure A11).

Of the ten most important variables in the calibrated random forest model predicting

deforested vs. not-deforested cells between 2000 and 2010, seven had positive monotonic

relationships (Patch Density at 40km radius, Proportion of lowland mosaic at 50km radius,

Proportion of plantation or regrowth at 50km radius, Edge Density at 10km radius, Focal Mean

Population Density at 100km radius, Proportion of lowland open at 50km radius, and Shannon

Diversity at 40km radius; Figure A12). As in the Malaysian Borneo model, many of these were

strongly nonlinear relationships. The remaining three top variables all had negative monotonic

relationships (Elevation, Proportion of lower montane forest at 20km radius, and Topographical Roughness

at 50km radius). As in the Malaysian Borneo model, these were strongly nonlinear, with most

showing a negative exponential shape (Figure A12).

We produced a visualization of the pattern of predicted forest loss probability across Kalimantan

(Figure A13), with zoomed-in views of four areas showing the pattern of observed loss in relation

to the predicted probability of loss (Figure A14). These maps show a very high association

between the predicted probability of loss and the actual pattern of forest loss and persistence

in Kalimantan.

The ten most important variables in the calibrated logistic regression Kalimantan model, based

on variable p-value, were Edge Density at a 10km radius, Elevation, Patch Density at a 40km

radius, Proportion of peatswamp forest at a 40km radius, Shannon Diversity Index at a 40km radius, Slope Position

at a 40km radius, Proportion of upper montane forest at a 40km radius, Proportion of lowland

forest at a 1km radius, Topographical Roughness at a 50km radius, and Proportion of plantation

or regrowth at a 50km radius (Table A3).

A map of the probability of forest loss across Kalimantan predicted by random forest and

logistic regression is shown in Figure A15. The two models differ in that the random forest

model has a much sharper spatial transition from areas with high predicted to low predicted

forest loss probability and a more complex, fine-scale pattern of predicted probability than the

smoother prediction produced by logistic regression.

Model Comparisons

Random Forest vs. Logistic Regression

We produced maps of several of the inset areas from figures A8 and A13 for Malaysian Borneo

and Kalimantan to visually display the differences in patterns of predicted forest loss risk

relative to actual forest loss between 2000 and 2010 for random forest and logistic regression

models. In general, the figures show that the random forest models appear to distinguish areas

where forest loss does and does not occur with higher spatial precision than the logistic regression

models, which tend to make smoother and less precise predictions (Figures A16-A19).

Random Forest with Landscape Metrics vs. Random Forest without Landscape Metrics

We produced maps of several of the inset areas from figures A8 and A13 for Malaysian Borneo

and Kalimantan visually displaying differences in predicted probability between random forest

models including landscape metrics and random forest models excluding landscape metrics

(Figures A20-A23). In general, these figures show that including landscape metrics improved

prediction of forest loss risk in a number of locations across the study area, with a higher

congruence of observed patterns of forest loss and predicted risk of forest loss.
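
A comparison of this kind can also be expressed as a cell-wise difference between two probability surfaces. The sketch below is a minimal example under stated assumptions: the two grids are hypothetical, cells are stored as nested lists, and `None` marks cells that were deforested before 2000 and so carry no prediction.

```python
def probability_difference(with_metrics, without_metrics):
    """Cell-wise difference of two probability grids; None marks excluded cells."""
    diff = []
    for row_a, row_b in zip(with_metrics, without_metrics):
        diff.append([
            None if a is None or b is None else a - b
            for a, b in zip(row_a, row_b)
        ])
    return diff


# Hypothetical predicted probabilities of forest loss for a 2 x 2 inset.
rf_with_metrics = [[0.9, 0.2], [None, 0.5]]
rf_without_metrics = [[0.4, 0.2], [None, 0.7]]

difference = probability_difference(rf_with_metrics, rf_without_metrics)
print(difference)
```

Positive values mark cells where the model including landscape metrics predicts higher risk. Overlaying observed loss on such a surface shows whether those cells coincide with actual deforestation.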

Table A1. Coefficients and p-values for the logistic regression predicting forest loss with the same predictor variables and same training data set as the calibrated Brunei random forest model.

Variable                        Coefficient  Standard Error  z value  P-Value
PLAND_PLANTATION/REGROWTH_1k    5.58E-02     6.34E-03        8.805    < 2e-16
PLAND_WATER_50k                 8.84E-02     2.60E-02        3.402    0.000669
PLAND_PEATSWAMP_FOREST_30k      -4.92E-01    1.58E-01        -3.122   0.001797
PLAND_LOWLAND_FOREST_40k        -1.77E-01    6.44E-02        -2.739   0.006155
PLAND_UPPERMONTANE_FOREST_50k   3.90E-01     1.96E-01        1.991    0.046525
EDGE_DENSITY_10k                5.78E-01     3.47E-01        1.668    0.095411
PLAND_LOWLAND_MOSAIC_40k        -1.70E-01    1.17E-01        -1.458   0.144886
(Intercept)                     -1.52E+01    1.36E+01        -1.119   0.263077
FOCAL_MEAN_POP_DENSITY_100k     -2.71E-02    2.51E-02        -1.08    0.280033
ROUGHNESS_40k                   -1.10E-01    1.16E-01        -0.948   0.343267
AGGREGATION_INDEX_10k           1.32E-01     1.40E-01        0.943    0.345471
DISTANCE_POP_CENTRE             1.81E-05     2.00E-05        0.902    0.366956
PLAND_LOWERMONTANE_FOREST_50k   -6.84E-01    8.18E-01        -0.837   0.402671
SHANNON_10k                     -9.09E-01    1.14E+00        -0.797   0.425579
ELEVATION                       -1.53E-03    2.65E-03        -0.575   0.565207
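
The z value and P-Value columns in Table A1 appear consistent with a Wald test, in which the coefficient is divided by its standard error and compared against a standard normal distribution. The sketch below checks this reading for the PLAND_WATER_50k row. The function name is hypothetical, and because the inputs are the rounded values printed in the table, the result only approximates the reported figures.

```python
import math


def wald_z_and_p(coefficient, standard_error):
    """Wald z statistic and two-sided p-value under a standard normal."""
    z = coefficient / standard_error
    # erfc(|z| / sqrt(2)) equals the two-sided tail probability 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Rounded coefficient and standard error for PLAND_WATER_50k from Table A1.
z, p = wald_z_and_p(8.84e-02, 2.60e-02)
print(round(z, 3), round(p, 6))
```

Ranking variables by this p-value is equivalent to ranking them by the absolute z value, which is the ordering used in the table.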

Table A2. Coefficients and p-values for the logistic regression predicting forest loss with the

same predictor variables and same training data set as the calibrated Malaysian Borneo random

forest model.

Variable                        Coefficient  Standard Error  z value  P-Value
(Intercept)                     -3.38E+00    4.01E-01        -8.42    < 2e-16
EDGE_DENSITY_20k                2.48E-01     1.39E-02        17.807   < 2e-16
ELEVATION                       -2.18E-03    1.08E-04        -20.092  < 2e-16
PLAND_LOWLAND_OPEN_50k          -3.41E-01    4.14E-02        -8.249   < 2e-16
PLAND_WATER_40k                 3.40E-02     3.78E-03        8.998    < 2e-16
DISTANCE_POP_CENTRE             -2.75E-06    3.41E-07        -8.053   8.10E-16
PLAND_MONTANE_MOSAIC_50k        4.22E-01     6.07E-02        6.954    3.55E-12
PLAND_LOWERMONTANE_FOREST_50k   2.43E-02     4.25E-03        5.702    1.19E-08
PLAND_LOWLAND_FOREST_40k        1.89E-02     3.58E-03        5.269    1.37E-07
ROUGHNESS_40K                   -3.74E-02    7.20E-03        -5.194   2.06E-07
PLAND_PLANTATION/REGROWTH_30k   1.76E-02     3.89E-03        4.521    6.15E-06
PLAND_PEATSWAMP_FOREST_40k      1.76E-02     4.33E-03        4.067    4.76E-05
PLAND_LOWLAND_MOSAIC_20k        1.55E-02     4.98E-03        3.1      0.00194
SHANNON_40k                     -2.52E-01    1.64E-01        -1.538   0.12401
FOCAL_MEAN_POP_DENSITY_100k     3.47E-03     2.32E-03        1.492    0.1356
PLAND_MONTANE_OPEN_50k          8.69E-03     1.16E-02        0.749    0.45367
PATCH_DENSITY_30k               2.29E-01     7.90E-01        0.289    0.77236

Table A3. Coefficients and p-values for the logistic regression predicting forest loss with the

same predictor variables and same training data set as the calibrated Kalimantan random forest

model.

Variable                        Coefficient  Standard Error  z value  P-Value
(Intercept)                     -2.13E+00    2.07E-01        -10.251  < 2e-16
EDGE_DENSITY_10k                1.97E-01     7.85E-03        25.126   < 2e-16
ELEVATION                       -6.00E-03    3.89E-04        -15.44   < 2e-16
PATCH_DENSITY_40k               -7.48E+00    8.24E-01        -9.078   < 2e-16
PLAND_PEATSWAMP_FOREST_40k      -3.81E-02    2.64E-03        -14.426  < 2e-16
SHANNON_40k                     1.04E+00     1.20E-01        8.724    < 2e-16
SLOPE_40                        3.33E-03     4.66E-04        7.156    8.28E-13
PLAND_UPPERMONTANE_FOREST_40k   3.36E-01     4.79E-02        7.02     2.22E-12
PLAND_LOWLAND_FOREST_1k         -5.49E-03    7.95E-04        -6.914   4.73E-12
ROUGHNESS_50                    6.45E-02     1.10E-02        5.857    4.72E-09
PLAND_PLANTATION/REGROWTH_50k   1.27E-02     2.23E-03        5.72     1.07E-08
PLAND_LOWERMONTANE_FOREST_20k   -2.70E-02    6.40E-03        -4.222   2.42E-05
DISTANCE_POP_CENTRE             -8.92E-07    3.42E-07        -2.609   0.00909
PLAND_LOWLAND_MOSAIC_50k        1.23E-02     5.06E-03        2.436    0.01484
PLAND_WATER_30k                 -7.75E-03    3.26E-03        -2.381   0.01727
PLAND_LOWLAND_OPEN_50k          1.39E-02     6.59E-03        2.107    0.03511
FOCAL_MEAN_POP_DENSITY_100k     -6.34E-05    1.82E-03        -0.035   0.97212

[Figure A1 bar chart; axis scale 0 to 1.2. Variables shown: Aggregation Index 10km, Edge Density 10km, Peatswamp Forest 40km, Water 50km, Plantation/Regrowth 1km, Lowland Forest 40km, Shannon Diversity 10km, Topographical Roughness 40km, Elevation, Lowland Mosaic 40km, Distance Population, Lower Montane Forest 50km, Population Density 100km, Upper Montane Forest 50km.]

Figure A1. Model improvement ratio of variable importance for retained variables for the

calibrated random forest Brunei model.

Figure A2. LOWESS splines showing the response curve for forest loss (1.0 on y-axis) and forest

persisting (0.0 on y-axis) for the ten variables with the highest model improvement ratio for

Brunei.
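
To illustrate how a smoothed response curve can be drawn through binary loss (1.0) and persistence (0.0) observations, the sketch below fits a tricube-weighted local linear regression at each predictor value. It is a simplified stand-in for LOWESS, since it omits the robustness iterations of the full procedure. The function name, the `frac` setting, and the observations are hypothetical.

```python
def lowess_fit(x, y, frac=0.5):
    """Tricube-weighted local linear fit at each x (no robustness iterations)."""
    n = len(x)
    k = max(2, int(round(frac * n)))  # number of neighbours per local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical predictor values and observed outcomes (1 = loss, 0 = persistence).
predictor = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
loss = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]
curve = lowess_fit(predictor, loss, frac=0.6)
print([round(v, 2) for v in curve])
```

Because each fitted value is a local weighted average of 0 and 1 outcomes adjusted for local trend, the curve can be read as an approximate frequency of forest loss along the predictor gradient.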

Figure A3. Map of calibrated predicted probability of forest loss between 2000 and 2010 for

Brunei. Black cells are those that were deforested before 2000. The predicted probability of

forest loss increases as a color-ramp linearly from 0 in dark blue to 0.881 in dark red. The two

inset white boxes are areas where the pattern of actual deforested and not-deforested points

are displayed overlain on the probability map in the next figure.

Figure A4. Display of the location of pixels observed to have been deforested, as green dots,

between 2000 and 2010 in Brunei in the two insets A and B shown in the previous figure,

overlain on the predicted probability of forest loss surface. Black pixels are areas that were

deforested before 2000. The color ramp indicates the predicted probability of forest loss

between 2000 and 2010, ranging from 0 in dark blue to 0.8811 in dark red.

Figure A5. Comparison of predicted probability of forest loss between 2000 and 2010 for Brunei

produced by (A) the calibrated random forest model, and (B) logistic regression with the same

input variables and training data set.

[Figure A6 bar chart; axis scale 0 to 1.2. Variables shown: Lowland Mosaic 20km, Plantation/Regrowth 30km, Edge Density 20km, Elevation, Montane Mosaic 50km, Patch Density 30km, Lower Montane Forest 50km, Shannon Diversity 40km, Population Density 100km, Topographical Roughness 40km, Lowland Forest 40km, Lowland Open 50km, Distance Population, Montane Open 50km, Water 40km, Peatswamp Forest 40km.]

Figure A6. Model improvement ratio of variable importance for retained variables for the

calibrated random forest Malaysian Borneo model.



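Figure A7. LOWESS splines showing the response curve for forest loss (1.0 on y-axis) and forest

persisting (0.0 on y-axis) for the ten variables with the highest model improvement ratio for

Malaysian Borneo.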


Figure A8. Map of calibrated predicted probability of forest loss between 2000 and 2010 for

Malaysian Borneo. Black cells are those that were deforested before 2000. The predicted

probability of forest loss increases as a color-ramp linearly from 0 in dark blue to 1.0 in dark

red. The four inset white boxes are areas where the pattern of actual deforested and not-

deforested points are displayed overlain on the probability map in the next figure.

Figure A9. Display of the location of pixels observed to have been deforested, as green dots,

between 2000 and 2010 in Malaysian Borneo in the four insets A, B, C and D shown in the

previous figure, overlain on the predicted probability of forest loss surface. Black pixels are

areas that were deforested before 2000. The color ramp indicates the predicted probability of

forest loss between 2000 and 2010, ranging from 0 in dark blue to 1.0 in dark red.

Figure A10. Comparison of predicted probability of forest loss between 2000 and 2010 for

Malaysian Borneo produced by (A) the calibrated random forest model, and (B) logistic

regression with the same input variables and training data set.

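[Figure A11 bar chart; axis scale 0 to 1.2. Variables shown: Elevation, Patch Density 40km, Lowland Mosaic 50km, Lower Montane Forest 20km, Plantation/Regrowth 50km, Edge Density 10km, Population Density 100km, Topographical Roughness 50km, Lowland Open 50km, Shannon Diversity 40km, Lowland Forest 1km, Water 30km, Peatswamp Forest 40km, Distance Population, Upper Montane Forest 40km, Slope Position 40km.]

Figure A11. Model improvement ratio of variable importance for retained variables for the

calibrated random forest Kalimantan model.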



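Figure A12. LOWESS splines showing the response curve for forest loss (1.0 on y-axis) and forest

persisting (0.0 on y-axis) for the ten variables with the highest model improvement ratio for

Kalimantan.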


Figure A13. Map of calibrated predicted probability of forest loss between 2000 and 2010 for

Kalimantan. Black cells are those that were deforested before 2000. The predicted probability

of forest loss increases as a color-ramp linearly from 0 in dark blue to 1.0 in dark red. The four

inset white boxes are areas where the pattern of actual deforested and not-deforested points

are displayed overlain on the probability map in the next figure.


Figure A14. Display of the location of pixels observed to have been deforested, as green dots,

between 2000 and 2010 in Kalimantan in the four insets A, B, C and D shown in the previous

figure, overlain on the predicted probability of forest loss surface. Black pixels are areas that

were deforested before 2000. The color ramp indicates the predicted probability of forest loss

between 2000 and 2010, ranging from 0 in dark blue to 1.0 in dark red.

Figure A15. Comparison of predicted probability of forest loss in the 2000-2010 time period for

Kalimantan produced by (A) calibrated random forest and (B) logistic regression modeling

produced using the same input variables and training data set.

Figure A16. Maps of the difference in predicted probability for inset A in figure x of

Malaysian Borneo for calibrated random forest models including landscape metrics (A), and

logistic regression models including landscape metrics (B). The predicted probability of forest

loss between 2000 and 2010 is shown as a color ramp from 0 in dark blue to 1 in dark red. Cells

in black were deforested prior to the year 2000. Green dots indicate pixels that were

deforested between 2000 and 2010. The white ovals are areas where the two analyses differ

substantially in predicted forest loss probability, and where actual forest loss occurred, with the

logistic regression model predicting high forest loss risk in places where no forest loss actually

occurred, while random forest predicted lower probability in these areas.

Figure A17. Maps of the difference in predicted probability for inset B in figure x of

Malaysian Borneo for calibrated random forest models including landscape metrics (A), and

logistic regression models including landscape metrics (B). The predicted probability of forest

loss between 2000 and 2010 is shown as a color ramp from 0 in dark blue to 1 in dark red. Cells

in black were deforested prior to the year 2000. Green dots indicate pixels that were

deforested between 2000 and 2010. The white ovals are areas where the two analyses differ

substantially in predicted forest loss probability, and where actual forest loss occurred, with the

logistic regression model predicting high forest loss risk in places where no forest loss actually

occurred, while random forest predicted lower probability in these areas.


Kalimantan for calibrated random forest models including landscape metrics (A), and logistic

regression models including landscape metrics (B). The predicted probability of forest loss

between 2000 and 2010 is shown as a color ramp from 0 in dark blue to 1 in dark red. Cells in

black were deforested prior to the year 2000. Green dots indicate pixels that were deforested

between 2000 and 2010. The white ovals are areas where the two analyses differ substantially

in predicted forest loss probability, and where actual forest loss occurred, with the logistic

regression model predicting high forest loss risk in places where no forest loss actually

occurred, while random forest predicted lower probability in these areas.


Malaysian Borneo for calibrated random forest models (A) including landscape metrics, (B)

excluding landscape metrics. The predicted probability of forest loss between 2000 and 2010 is

shown as a color ramp from 0 in dark blue to 1 in dark red. Cells in black were deforested prior

to the year 2000. Green dots indicate pixels that were deforested between 2000 and 2010. The

white ovals are areas where the two analyses differ substantially in predicted forest loss

probability, and where actual forest loss occurred, with the model including landscape metrics

(A) more accurately reflecting the higher forest loss risk in these areas than the model that

excluded landscape metrics (B).


Kalimantan for calibrated random forest models (A) including landscape metrics, (B) excluding

landscape metrics. The predicted probability of forest loss between 2000 and 2010 is shown as

a color ramp from 0 in dark blue to 1 in dark red. Cells in black were deforested prior to the

year 2000. Green dots indicate pixels that were deforested between 2000 and 2010. The white

ovals are areas where the two analyses differ substantially in predicted forest loss probability,

and where actual forest loss occurred, with the model including landscape metrics (A) more

accurately reflecting the higher forest loss risk in these areas than the model that excluded

landscape metrics (B).

Figure A24. Landsat RGB imagery of four inset landscapes, two in Malaysian Borneo (A,B), and

two in Kalimantan (C,D), showing extensive network of logging roads in the Malaysian

landscapes and lack of forest roads penetrating the unlogged forest in Kalimantan.
