Near-real time monitoring of habitat change using a neural network and
MODIS data: the PARASID approach
Andy Jarvis, Louis Reymondin, Jerry Touval
Contents
• The approach
• The implementation
• Some examples
• Comparison with other models
• Plans and timelines
Objectives of PARASID
HUman Impact Monitoring And Natural Ecosystems
• Provide near-real time monitoring of habitat change (<3 month turn-around)
• Continental – global coverage (forests AND non-forests)
• Regularity in updates
The Approach
The change in greenness of a given pixel is a function of:
• Climate
• Site (vegetation, soil, geology)
• Human impact
Machine learning
We therefore try to learn how each pixel (site) responds to climate; any anomaly is then attributed to human impact
Machine learning (here, a neural network) is a bio-inspired technology that emulates the basic mechanism of a brain.
It allows us:
– To find a pattern in a noisy dataset
– To apply these patterns to a new dataset
[Figure: NDVI versus time (steps 1 to 9), y-axis 4500 to 9000. Series: measurements, predictions, interval max, interval min]
NDVI Evolution and novelty detection
Novelty/Anomaly
NDVI Cleaning using HANTS
• Eliminate all short-term variations
• Uses NDVI quality information
• Iterative fitting of cleaned curve using Fourier analysis
• Least-squares fitting to good quality values
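The cleaning steps listed above can be sketched as an iterative harmonic least-squares fit. This is a simplified illustration in the spirit of HANTS, not the implementation used in PARASID: the function name, the number of harmonics, the tolerance `tol` (given here for NDVI on a 0 to 1 scale) and the synthetic series are all assumptions made for the example.

```python
import numpy as np

def harmonic_fit(t, y, good, period=23, n_harmonics=2, n_iter=5, tol=0.05):
    """Iteratively fit a Fourier series to the good-quality NDVI samples.

    t: sample indices, y: NDVI values, good: boolean quality mask.
    Samples lying more than `tol` below the fitted curve are treated as
    contaminated (e.g. by clouds) and dropped before the next fit.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = np.asarray(good, dtype=bool).copy()
    # Design matrix: constant term plus one cosine/sine pair per harmonic.
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k * t / period
        cols += [np.cos(w), np.sin(w)]
    A = np.column_stack(cols)
    fitted = y.copy()
    for _ in range(n_iter):
        # Least-squares fit using only the samples still considered good.
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        fitted = A @ coef
        outliers = keep & (y < fitted - tol)
        if not outliers.any():
            break
        keep &= ~outliers
    return fitted

# Synthetic two-year series (23 dates per year) with three cloud-like drops.
t = np.arange(46)
clean = 0.6 + 0.2 * np.sin(2 * np.pi * t / 23)
noisy = clean.copy()
noisy[[5, 20, 33]] -= 0.3
good = np.ones(46, dtype=bool)
good[5] = False  # pretend the quality layer flags only one of the drops
smoothed = harmonic_fit(t, noisy, good)
```

The quality mask removes the flagged sample up front, while the iteration removes the remaining low outliers that the quality layer missed.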
Methodology
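[Diagram: neural network. Inputs: NDVI(t-1), NDVI(t-2), …, NDVI(t-n), Precipitation(t), Temperature(t), …. Weights labelled w0, w1, w2, wp1, wp2, wp3, wo1, wo2, wo3. Output: NDVI(t)]
As required by the ARD algorithm, the weights of each input and of the hidden-layer output form a weight class with its own α (α0, …, αc)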
INPUTS: Past NDVI (MODIS), previous rainfall (TRMM 3B42), temperature (WorldClim)
OUTPUT: 16-day predicted NDVI
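As a rough illustration of how such inputs and outputs might be assembled for one pixel, the sketch below builds (features, target) pairs from three time series. The function name, the number of lags `n_lags` and the ordering of the features are assumptions for the example; the slides do not specify them.

```python
def build_samples(ndvi, rain, temp, n_lags=3):
    """Build (features, target) pairs for one pixel.

    features = [NDVI(t-1), ..., NDVI(t-n), rain(t), temp(t)]
    target   = NDVI(t)
    """
    samples = []
    for t in range(n_lags, len(ndvi)):
        past = [ndvi[t - k] for k in range(1, n_lags + 1)]
        features = past + [rain[t], temp[t]]
        samples.append((features, ndvi[t]))
    return samples

# Tiny made-up series, only to show the shape of the result.
pairs = build_samples([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], [0, 0, 0, 0, 1], n_lags=2)
```

Each 16-day date after the first `n_lags` dates yields one training sample.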
Methodology – Bayesian NN
• To detect novelties, Bayesian neural networks provide two indicators:
– The predicted value
– The probability distribution of where the value should be
• The first allows us to detect abnormal measurements
• The second tells us how confident we can be that a measurement is abnormal
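The two indicators can be combined into a single confidence score. The sketch below assumes, purely for illustration, that the predictive distribution is Gaussian with a known mean and standard deviation; the function name and the numbers in the usage lines are hypothetical.

```python
import math

def anomaly_confidence(measured, predicted_mean, predicted_std):
    """Confidence that a measurement is abnormal, between 0 and 1.

    Assumes a Gaussian predictive distribution (an illustrative choice).
    Returns P(|X - mean| < |measured - mean|): close to 0 when the
    measurement matches the prediction, close to 1 when it is far away.
    """
    z = abs(measured - predicted_mean) / predicted_std
    return math.erf(z / math.sqrt(2.0))

# A measurement four standard deviations below the prediction.
far = anomaly_confidence(6000, 8000, 500)
# A measurement exactly at the predicted value.
same = anomaly_confidence(8000, 8000, 500)
```

A threshold on this score (for example 0.95) then separates normal measurements from candidate anomalies.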
The Processing
• For South America alone, initial estimates put the processing time for NN learning at about 10 years:
– A map of 30720 by 37440 pixels
– 1,150,156,800 vectors (one per pixel)
– 23 dates per year (one 16-day composite each)
– 26,453,606,400 NDVI values to manage per year
– 9.5 years of data
– 251,309,260,800 individual data points
• Process improvements, optimizations and new hardware reduced NN learning time to 3 months
• Detection takes 2-3 days
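The data volumes quoted above follow directly from the map size, and the arithmetic can be checked in a few lines. Variable names are chosen for the example only.

```python
width, height = 30720, 37440   # map size in pixels
dates_per_year = 23            # 16-day composites per year
half_years = 19                # 9.5 years of data, counted in half-years

pixels = width * height                             # one vector per pixel
values_per_year = pixels * dates_per_year           # NDVI values per year
total_values = values_per_year * half_years // 2    # over 9.5 years

print(f"{pixels:,} vectors")
print(f"{values_per_year:,} NDVI values per year")
print(f"{total_values:,} individual data points")
```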
Sample novelty analysis
The Bottom-Line
• 250m resolution
• Latin American coverage (currently)
• 3-week turnaround once data are available; MODIS data take 4 weeks to reach the NASA FTP, so total latency is 3 + 4 = 7 weeks
• Report every 16 days
• Measurement of scale of habitat change (0-1) and probability of event
Some statistics
• 75% of deforestation occurs in December and January
• 50,000 ha deforested in Dec/Jan of 2008/2009, compared with 7,500 ha in 2004/2005
• During the 16-day period covering Christmas 2008, 16,000 ha were lost, compared with 500 ha in 2004 (3% of the 2008 figure)
Parasid Test cases
Introduction
• Different test cases with different vegetation and climate types
• All tests use the same parameters
– Training parameters
  • From 2000 to the end of 2003
– Detection parameters
  • From 2004 to May 2009
  • A detection map is created every 16 days within this period
• The process is close to fully automated
Colombia – Río Caquetá
• Size
– 480 * 300 [km2]
– 14400000 [ha]
• Vegetation type
– Tropical forest
Detection : See Caqueta-meta KML
• See http://www.youtube.com/watch?v=exGmzc70PrQ
• Pink: too many clouds to analyse
• Red: detected on 3 consecutive dates with more than 95% confidence
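The "red" rule above can be sketched as a run-length check on a pixel's per-date anomaly probabilities. The function name is hypothetical, and treating a cloudy date (marked `None`) as breaking the run is an assumption; the slides do not say how cloudy dates are handled.

```python
def confirmed_detection(probabilities, threshold=0.95, run_length=3):
    """Return the index of the date on which a pixel is confirmed, or None.

    probabilities: one anomaly probability per 16-day date, with None for
    dates that were too cloudy to analyse (assumed here to reset the run).
    """
    run = 0
    for i, p in enumerate(probabilities):
        if p is not None and p > threshold:
            run += 1
            if run >= run_length:
                return i
        else:
            run = 0
    return None

# Confirmed on the third consecutive date above the threshold.
first = confirmed_detection([0.50, 0.97, 0.99, 0.96, 0.20])
# Never three in a row, so never confirmed.
never = confirmed_detection([0.97, 0.99, 0.50, 0.96, 0.98])
```

Requiring several consecutive detections trades a delay of a few 16-day periods for fewer false alarms from single noisy dates.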
[Figure: Colombia – Río Caquetá. Panels: NDVI 2004.01.01, NDVI 2009.01.01, anomaly probability 2009.01.01]
[Figure: Colombia – Río Caquetá. Cumulative detections in hectares over time, 1/1/2004 to 4/1/2009; y-axis 0 to 200,000 ha]
Colombia – Rio Caquetá
• Comments
– 0.22% deforestation rate per year
– The model works well in this area, where deforestation seems to be accelerating
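To put an annual rate in absolute terms, it can be converted to hectares per year using the size of the test area. The slides do not state how the rate was derived, so the formula below (detected area divided by region area and by the number of years) is an assumption, as are the function names.

```python
def annual_rate_percent(detected_ha, region_ha, years):
    """Assumed definition: percent of the region detected as changed per year."""
    return 100.0 * detected_ha / region_ha / years

def hectares_per_year(rate_percent, region_ha):
    """Inverse conversion: area changed per year for a given annual rate."""
    return rate_percent / 100.0 * region_ha

# Río Caquetá test area: 14,400,000 ha at 0.22% per year,
# which works out to about 31,680 ha per year.
caqueta_ha_per_year = hectares_per_year(0.22, 14_400_000)
```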
Colombia – Serranía San Lucas
• Size
– 180*960 [km2]
– 4320000 [ha]
• Vegetation type
– Lowland tropical forest
– Montane forest
[Figure: Colombia – Serranía San Lucas. Panels: NDVI 2004.01.01, NDVI 2009.01.01, anomaly probability 2009.01.01, cumulative detection over time]
[Figure: Colombia – Serranía San Lucas. Cumulative detections in hectares over time, 1/1/2004 to 4/1/2009; y-axis 0 to 160,000 ha]
Colombia – Serranía San Lucas
• Comments
– 0.63% deforestation rate per year
– The main detected area, in the center, shows a very strong change that could be flooding
– A deeper analysis is needed to explain the different types of changes
– The model seems to work well even though the area is somewhat cloudy
Colombia – Sierra Nevada
• Size
– 120*120 [km2]
– 1440000 [ha]
• Vegetation type
– Dry forest
– Montane forest
[Figure: Colombia – Sierra Nevada. Panels: NDVI 2004.01.01, NDVI 2009.01.01, anomaly probability 2009.01.01, cumulative detection over time]
[Figure: Colombia – Sierra Nevada. Cumulative detections in hectares over time, 1/1/2004 to 4/1/2009; y-axis 0 to 700 ha]
Colombia – Sierra Nevada
• Comments
– 0.01% deforestation rate per year
– Difficult to detect changes in this area
– An analysis of the probabilities is needed to show places with anomalies
Bolivia – Santa Cruz
• Size
– 480*420 [km2]
– 20160000 [ha]
• Vegetation type
– Tropical forest
[Figure: Bolivia – Santa Cruz. Panels: NDVI 2004.01.01, NDVI 2009.01.01, anomaly probability 2009.01.01, cumulative detection over time]
[Figure: Bolivia – Santa Cruz. Cumulative detections in hectares over time, 1/1/2004 to 5/1/2009; y-axis 0 to 100,000 ha]
0.09% deforestation rate
Paraguay - Boquerón
• Size
– 240*240 [km2]
– 5760000 [ha]
• Vegetation type
– Savannah
[Figure: Paraguay - Boquerón. Panels: NDVI 2004.01.01, NDVI 2009.01.01, anomaly probability 2009.01.01, cumulative detection over time]
[Figure: Paraguay - Boquerón. Cumulative detections in hectares over time, 1/1/2004 to 4/1/2009; y-axis 0 to 300,000 ha]
Paraguay - Boquerón
• Comments
– 0.87% deforestation rate
– Savannah and tropical forest are totally different environments
– The model seems to work well even though the changes are more subtle
Chile – Region del Bio Bio
• Size
– 240*120 [km2]
– 2880000 [ha]
• Vegetation type
– Temperate forest
[Figure: Chile – Region del Bio Bio. Panels: NDVI 2004.01.01, NDVI 2009.01.01, anomaly probability 2009.01.01, cumulative detection over time]
[Figure: Chile – Region del Bio Bio. Cumulative detections in hectares over time, 1/1/2004 to 4/1/2009; y-axis 0 to 60,000 ha]
Chile – Region del Bio Bio
• Comments
– 0.31% deforestation rate
– The model seems to work with a temperate climate and non-tropical forests
And now the tough one…
OTCA (Amazon Cooperation Treaty)
• Size
– 4228.75*3498 [km2]
– 1479216750 [ha]
• Vegetation type
– Tropical forest
[Figure: OTCA (Amazon Cooperation Treaty). Cumulative detections in hectares over time, 1/1/2004 to 4/1/2009; y-axis 0 to 20,000,000 ha]
OTCA (Amazon Cooperation Treaty)
• Comments
– Average 0.22% deforestation rate
– Still a bit noisy in the center
  • Due to clouds undetected during the cleaning process
– Most of the detections are valid
– The system seems stable over large areas and over many consecutive dates (detections over 120 dates)
Time processing statistics
• For an area the size of OTCA, with
– One Dell server
  • 16 [GB] of RAM
  • 8 Intel Xeon X5365 processors, 3 [GHz]
• Cleaning process
– Cleaning 214 dates
– 12 hours
• Clustering process
– 6 clusters
– Clustered on the years 2000 to the end of 2003
– 12 hours
• Modeling process
– 3 models per cluster
– 2000 pixels as training dataset
– 5000 pixels as validation dataset
– 3 hours
• Detection process for 2004 to 2009
– 120 detection grids
– 70 hours
• Whole process
– Only 4 days of processing from the raw data
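The stage timings above can be summed to check the overall figure and to estimate the cost of a single detection grid. The dictionary keys are names chosen for the example.

```python
stage_hours = {
    "cleaning": 12,
    "clustering": 12,
    "modeling": 3,
    "detection": 70,
}

total_hours = sum(stage_hours.values())
total_days = total_hours / 24

# 120 detection grids were produced during the detection stage.
minutes_per_grid = stage_hours["detection"] * 60 / 120

print(f"{total_hours} hours in total, about {total_days:.1f} days")
print(f"{minutes_per_grid:.0f} minutes per detection grid")
```

Detection dominates the run, so each additional 16-day update adds roughly the per-grid cost rather than the full four days.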
Deforestation rates
Country    Region                  Deforestation rate
Paraguay   Boquerón, Chaco         0.87%
Colombia   Serranía San Lucas      0.63%
Chile      Region Bio Bio          0.31%
Colombia   Rio Caquetá             0.22%
Multiple   OTCA                    0.22%
Bolivia    Santa Cruz              0.09%
Colombia   Nevado de Santa Marta   0.01%
Model comparison
PARASID vs. FORMA
• PARASID detections: first detection in 2004
• FORMA probabilities: first detection in 2000
PARASID vs DETER
The PARASID model appears to detect fairly small, isolated events that DETER does not detect.
[Figure: comparison panels labelled 2006 and 2004]
Next Steps
– Fully functioning web interface (January 2010)
– Preliminary continental validation and calibration (January 2010)
– Global extent (2011)
– Additional models to identify type of change (drivers) (2011)
Analysis of three images between the years 2000 and 2009.
MATO GROSSO – BRAZIL
LAT: -10.1, LON: -51.3
• 10/10/2000: Landsat 7, SLC on
• 29/06/2009: Landsat 7, SLC off
[Figure: images classified in ERDAS. Legend: Forest, Uncoverage]
[Figure: change detection in ERDAS. Legend: Change 00-09, Unchanged]
SAMPLING POINTS IN LATIN AMERICA
SELECTION CRITERIA
1. Covering the whole of Latin America
2. Sampling of different land use types
a. Tropical forest
b. Andes
c. Savanna
d. Desert
3. Selection of areas with high risk of change
a. Near cities
b. Near roads
c. Near rivers
d. With existing crops
Conclusions
• Near-real time global monitoring is possible
• PARASID now functioning for Latin America
• Providing the first approximations of deforestation rates in over a decade for some parts of Latin America
Areas for improvement
• Detailed runs for specific zones
• Significant improvement in accuracy and coverage (limited by clouds) using raw daily MODIS data
• With calibration and validation in the field
• Joint IDEAM study with CIAT/TNC for national estimates
THANK YOU!