Processing and analysis of Earth Observation data
Carsten Brockmann, Brockmann Consult GmbH, ESA Climate Change Initiative Toolbox Science Lead
Big Data Analytics & GIS, Münster 20.-21. September 2017.
Earth Observation
Managing big EO data is increasingly complex.
But not just technically.
Collaboration. Culture. Organisation. Structure.
Lake MacKay, Australia
Tonga, Pacific
Arctic Ocean
Karavasta Lagoon, Albania
Satellite Images = Measurement Data
Turning image data into information
Satellite Images = Measurement Data
Generic tools
• Data selection & access
• Visualisation
• Analysis
• Processing
• Export
Instrument-specific tools
• System correction
• Data processing (L1 -> L2)
Thematic processing, synergy
• Programming tools
• API for Python and Java
• Scripting
• Graph model builder
SNAP Architecture
• SNAP layer: SNAP Desktop, SNAP Engine, Sentinel-1 Toolbox (S1TBX), Sentinel-2 Toolbox (S2TBX), Sentinel-3 Toolbox (S3TBX)
• 3rd-party library layer: NetBeans RCP, GeoTools, JAI, NetCDF, …
• Programming language layer: Java SE 8 Platform, Python
Any combination of toolbox add-ons is allowed, even none, as SNAP Desktop is already a useful stand-alone application for EO data exploitation.
SNAP Application Modes
Golden Age of Earth Observation
By the end of 2017, the operational Sentinel-1, -2, -3 and -5p satellites alone will continuously collect a volume of 27 Terabytes per day / 10 PB per year.
It could take around 2.5 years to download 1 Petabyte of Sentinel-1 data and a staggering 63 years to pre-process it on your own computer (Wagner, 2015).
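The volumes quoted above can be cross-checked with back-of-envelope arithmetic. The sketch below assumes decimal units (1 PB = 1000 TB = 10^15 bytes) and a 365-day year; neither assumption is stated on the slide, and the variable names are illustrative.

```python
# Back-of-envelope check of the data volumes quoted above.
# Assumptions (not from the slide): decimal units, 365-day year.
TB_PER_DAY = 27
DAYS_PER_YEAR = 365
SECONDS_PER_DAY = 86_400

# Yearly volume implied by 27 TB per day
pb_per_year = TB_PER_DAY * DAYS_PER_YEAR / 1000

# Sustained rate implied by "2.5 years to download 1 Petabyte"
bytes_per_pb = 10**15
download_seconds = 2.5 * DAYS_PER_YEAR * SECONDS_PER_DAY
implied_mbit_per_s = bytes_per_pb * 8 / download_seconds / 1e6

print(f"{pb_per_year:.1f} PB per year")
print(f"{implied_mbit_per_s:.0f} Mbit/s sustained download rate")
```

Under these assumptions the daily volume comes to a little under 10 PB per year, consistent with the quoted figure, and the 2.5-year download corresponds to a sustained link of roughly 100 Mbit/s.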
Data Local Processing
Optimising data transfer
Sharing input data
Sharing result
Rapid turn-around cycles
Hadoop approach
Concurrent data-local processing
Tasks are transferred over the network
Good scalability
Archive-centric approach
Network storage
Data are transferred over the network
Risk of network bottleneck
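The contrast between the two approaches can be illustrated with a toy model. In the sketch below, file names, sizes, node names and the task size are all made up for illustration; it only counts what travels over the network and is not how Hadoop itself is implemented.

```python
# Toy comparison of archive-centric and data-local processing.
# All names and numbers are hypothetical.
FILE_SIZE_MB = {"a.nc": 500, "b.nc": 700, "c.nc": 600}
# Which compute node stores each file on its local disk
BLOCK_LOCATION = {"a.nc": "node1", "b.nc": "node2", "c.nc": "node1"}
TASK_SIZE_MB = 1  # assumed size of a task description sent to a node


def archive_centric_traffic(files):
    """Every input file is pulled from network storage to a compute node."""
    return sum(FILE_SIZE_MB[f] for f in files)


def data_local_schedule(files):
    """Assign each task to the node that already holds the input file."""
    schedule = {}
    for f in files:
        schedule.setdefault(BLOCK_LOCATION[f], []).append(f)
    return schedule


def data_local_traffic(files):
    """Only the task descriptions travel over the network."""
    return TASK_SIZE_MB * len(files)


files = sorted(FILE_SIZE_MB)
print(data_local_schedule(files))
print(archive_centric_traffic(files), "MB vs", data_local_traffic(files), "MB")
```

In this toy setup the archive-centric approach moves all input data across the network, while the data-local approach moves only three small task descriptions.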
Data Local Processing
Calvalus Processing System for EO Data
• Calvalus On-demand Portal
• Calvalus Bulk Production
• Calvalus Adapters: SNAP GPF Operators and Graphs, SNAP Aggregators, Linux Executables
Apply Apache Hadoop to earth observation
Transfer the algorithm to the data
(data-local in a narrower sense)
Avoid an archive-centric approach
Add Calvalus software layers
for EO data processing and validation
EO processing workflows
Data processor plug-in framework
Bulk production control
Portal
Integrate data processors
Linux executables
SNAP & BEAM GPF operators and aggregators
Open for other frameworks
• Distributed file system HDFS on local disks of compute nodes
• Transparent, optimised data-local access
• Diagram (repeated for each input file): L1 File → L2 Processor (Mapper Task) → L2 File
• MERIS RR L1, North Sea, 3 days
• CoastColour NN L2 processor
• 6 minutes (22 nodes)
• Output: L2 files
• Only mapper tasks, no reduce step necessary
Data Local Processing - Level 2 Processing Workflow on Calvalus -
• Algorithm defined by Google employees Dean + Ghemawat in 2004
• Idea: partition data into chunks, compute chunks locally (map), concatenate intermediate results to final result (reduce)
• Allows for high degree of parallelisation
• Fits very well with principle of data locality
Map-Reduce on Calvalus
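The map and reduce steps described above can be sketched in a few lines of plain Python. This is a minimal single-process illustration, not Hadoop or Calvalus code; the function names and the choice of a mean as the final result are assumptions made for the example.

```python
from functools import reduce


def partition(values, chunk_size):
    """Split the input into chunks that could live on different nodes."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def map_chunk(chunk):
    """Map: compute a partial result (sum, count) for one chunk."""
    return (sum(chunk), len(chunk))


def reduce_partials(a, b):
    """Reduce: merge two partial results into one."""
    return (a[0] + b[0], a[1] + b[1])


values = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0]
# Each map call is independent, so the chunks could be processed in parallel
partials = [map_chunk(c) for c in partition(values, 3)]
total, count = reduce(reduce_partials, partials)
mean = total / count
print(partials, mean)
```

Because each map call touches only its own chunk, the calls can run where the chunk is stored; only the small partial results need to be brought together in the reduce step.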
Map-Reduce on Calvalus - Temporal and Spatial Integration -
• MERIS RR L1, global, 10-day
• SNAP "C2RCC" water processor
• 20 minutes (100 nodes)
• Output: 1 L3 product
• Diagram (mapper repeated for each input file): L1 File → L2 Processing + Spatial Binning (Mapper Task) → Spatial Bins → Temporal Binning (Reducer Task) → Temporal Bins → L3 Formatting → L3 File
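The chain of spatial binning (mapper), temporal binning (reducer) and L3 formatting can be sketched as follows. The regular 1-degree latitude/longitude grid, the mean as the only aggregate and all function names are assumptions for this example; the binning grid and aggregators actually used on Calvalus may differ.

```python
import math
from collections import defaultdict

CELL_DEG = 1.0  # assumed grid cell size in degrees


def spatial_binning(samples):
    """Mapper: bin one file's (lat, lon, value) samples into grid cells."""
    bins = defaultdict(lambda: [0.0, 0])
    for lat, lon, value in samples:
        cell = (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))
        bins[cell][0] += value  # running sum
        bins[cell][1] += 1      # running count
    return dict(bins)


def temporal_binning(spatial_bins_per_file):
    """Reducer: merge the spatial bins of all files in the period."""
    merged = defaultdict(lambda: [0.0, 0])
    for bins in spatial_bins_per_file:
        for cell, (total, count) in bins.items():
            merged[cell][0] += total
            merged[cell][1] += count
    return dict(merged)


def l3_formatting(temporal_bins):
    """Turn (sum, count) per cell into the mean value per cell."""
    return {cell: total / count for cell, (total, count) in temporal_bins.items()}


# Two hypothetical input files with (lat, lon, value) samples
day1 = [(54.2, 7.1, 1.0), (54.8, 7.9, 3.0), (55.5, 7.5, 2.0)]
day2 = [(54.5, 7.5, 5.0)]
l3 = l3_formatting(temporal_binning(spatial_binning(f) for f in (day1, day2)))
print(l3)
```

Each mapper only emits compact per-cell sums and counts, so the data passed to the reducer is much smaller than the input files.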
• MERIS RR L1, global, 3 months
• "CoastColour C2W" processor
• NOMAD in-situ dataset
• 1.5 minutes (100 nodes)
• Output: scatter plots and pixel extraction tables
• Diagram (mapper repeated for each input file): L1 File → L2 Processing & Matcher (Mapper Task) → Output Records → Input Records → Match-up Analysis (Reducer Task) → MA Report
Map-Reduce on Calvalus - Match-up analysis -
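A match-up analysis of this shape can be sketched as a mapper that pairs in-situ records with nearby satellite pixels and a reducer that summarises the pairs. The station records, the distance tolerance in degrees and the choice of bias and RMSE as statistics are assumptions for this example, not the Calvalus implementation.

```python
import math

# Hypothetical in-situ records: (station id, lat, lon, measured value)
IN_SITU = [("s1", 54.0, 7.0, 1.0), ("s2", 55.0, 8.0, 2.0)]
MAX_DIST_DEG = 0.1  # assumed match-up tolerance


def matcher(pixels):
    """Mapper: pair each in-situ record with the nearest pixel of one file."""
    records = []
    for station, lat, lon, measured in IN_SITU:
        nearest = min(pixels, key=lambda p: math.hypot(p[0] - lat, p[1] - lon))
        if math.hypot(nearest[0] - lat, nearest[1] - lon) <= MAX_DIST_DEG:
            records.append((station, measured, nearest[2]))
    return records


def matchup_analysis(records):
    """Reducer: summary statistics over all output records."""
    diffs = [satellite - measured for _, measured, satellite in records]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"n": len(diffs), "bias": bias, "rmse": rmse}


# Two hypothetical L2 files with (lat, lon, value) pixels
file_a = [(54.01, 7.02, 1.5), (60.0, 10.0, 9.0)]
file_b = [(55.03, 8.01, 1.5)]
records = [r for f in (file_a, file_b) for r in matcher(f)]
report = matchup_analysis(records)
print(report)
```

The same records that feed the statistics could also be written out as a pixel extraction table or plotted as a scatter plot.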
Supported by SNAP Graph Processing Framework
• Access to data via reader/writer objects instead of files
• Operator chaining to build processors from modules
• Tile cache and pull principle for in-memory processing
• Hadoop MapReduce for partitioning and streaming
Streaming on Calvalus - With SNAP -
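Operator chaining with a tile cache and the pull principle can be illustrated in plain Python. The class names SourceOp and PixelOp and the get_tile method are hypothetical and are not the SNAP GPF API; the sketch only shows that a request at the end of a chain pulls the needed tile through every operator, and that a cached tile is not recomputed.

```python
class SourceOp:
    """Stands in for a product reader; delivers tiles on request."""

    def __init__(self, data, tile_size):
        self.data, self.tile_size = data, tile_size
        self.reads = 0  # counts how often a tile was actually read

    def get_tile(self, index):
        self.reads += 1
        start = index * self.tile_size
        return self.data[start:start + self.tile_size]


class PixelOp:
    """Applies a per-pixel function to tiles pulled from its source."""

    def __init__(self, source, func):
        self.source, self.func = source, func
        self.cache = {}  # tile cache: tile index -> computed tile

    def get_tile(self, index):
        if index not in self.cache:
            pulled = self.source.get_tile(index)  # pull from upstream
            self.cache[index] = [self.func(v) for v in pulled]
        return self.cache[index]


reader = SourceOp(list(range(8)), tile_size=4)
scaled = PixelOp(reader, lambda v: v * 0.5)
flagged = PixelOp(scaled, lambda v: v > 1.0)

tile = flagged.get_tile(1)  # pulls tile 1 through the whole chain
flagged.get_tile(1)         # second request is served from the tile cache
print(tile, reader.reads)
```

Only the requested tile is read and processed, and no intermediate result is written to a file between the chained operators.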
EO Data & Data-Processing Platforms
European Space Agency & national Space Agencies
• Thematic Exploitation Platforms
• Mission Exploitation Platforms
European Commission:
• Copernicus Data and Information Access Services (DIAS)
Copernicus Collaborative Ground Segments
Private offers
• Google Earth Engine
• Amazon Web Services
The Urban Thematic Exploitation Platform
Visualisation & Analysis
Urban TEP Processing Centres
Urban TEP portal + gateway
gateway to ...
Datasets and services
Geo-browser
Processing request forms and result access
Portal functions
Analysis and visualisation
Combination of satellite products and socio-economic data
Derivation of new criteria
Processing request form
Istanbul, Moscow, Sao Paulo
Global binary raster mask showing location of human settlements (12m/75m)
GUF
▪ SAR4Urban (2015-2016)
Urban growth, Beijing: ERS-2 PRI & ASAR IMP VV, 2002-2003, 15 m spatial resolution, 48 scenes
Urban growth, Beijing: S1A IW GRDH VV, 2014-2015, 10 m spatial resolution, 31 scenes
Urban TEP is ...
• attractive high-quality datasets ...
... that meet space, time and feature dimensions of the domain
• the capability to generate them
• the facilities ...
... to access and use them,
... to generate more of them
Package
Upload
Local test processing
Deployment
Processing request
Concurrent processing
VM for download
Browser for request submission
Urban TEP portal
Urban TEP processing centres
Processor development model
Systematic or on-demand processing
• Datasets may be pre-generated, providing access to them as product
• for long-running processes
• for global datasets with high complexity/information reduction
• in order to be able to visualise them
• Example: GUF
• Datasets may be processed on-demand, providing a service instead
• for short-running processes
• for selected areas
• in case of user-defined parameterisation
• to avoid storage of large output datasets
• Example: Sentinel-2 timescan service (unless generated systematically)
Urban TEP processing centres
IT4Innovations
• Cluster (Salomon HPC)
• Geoserver WPS + own backend implementation
• Geoserver WMS
• Large-scale global Landsat timescan processing
• Fast internet access, HPC; host of portal and analysis/visualisation
Brockmann Consult
• Cluster (Calvalus/Hadoop), YARN scheduler
• Sentinel-2 (urban areas, Africa), OLCI, MERIS
• Calvalus WPS + Urban TEP config + extension
• Geoserver WMS
• GUF subsetting, Sentinel-2 timescan processing
• Distributed data-local processing and concurrent aggregation
DLR
• Virtualised environment (GeoFarm) + cluster (Calvalus/Hadoop)
• Sentinel-1 and other datasets
• GUF and other Urban datasets
• Systematic generation of datasets
Copernicus Data and Exploitation Platform – Deutschland
National entry point to the EU Copernicus Sentinel Satellite Systems, their data products and the products of the Copernicus Services
Processing facilities on the platform
EU DIAS
Confusing?
YES!
Climate Monitoring Data
Climate change is a global challenge.
Open climate data is crucial.
The objective of the Climate Change Initiative (CCI) is to realise the full potential of the long-term global Earth
Observation archives that ESA together with its Member States have established over the last 30 years, as a
significant and timely contribution to the Essential Climate Variable databases required by the United Nations
Framework Convention on Climate Change (UNFCCC).
ESA Climate Change Initiative (CCI)
16 projects
>300 scientists
>100 organisations
18 countries
Since 2009
7 years. X individuals. X organisations. X projects.
Climate Monitoring Data
An Overview of Climate Data Production.
Essential Climate Variables have been defined by the global
science community to support the United Nations
Framework Convention on Climate Change (UNFCCC).
Step 1. Deciding what to actually measure.
Criteria of Essential Climate Variables (ECV)
Relevance. Critical for climate monitoring.
Feasibility. Global measurement is feasible.
Cost effective. Using proven technology.
Satellites can help
Global. Observe entire Earth.
Uniformity. Same instrument everywhere.
Rapid Measurement. Constant watch.
Continuity. Long time series to monitor change in climate.
Step 2. Get the raw satellite data.
Current data.
Archived data.
Planning for the future.
Example
Step 3. Process the data.
Gridding, Homogenisation, Calibration & Validation, Quality.
Scientific processing - application of state-of-the-art algorithms distilled from the very latest scientific reasoning.
Step 4. Distribution of Climate Data Products.
“Just give me the data”.
ESA Climate Change Initiative (CCI)
Managing Complexity of Climate Data Production.
Open Data Challenges & Approaches
Meaningfulness & Community.
Managing Open Data Complexity
Ease of Data Access.
Bespoke Open Standards.
Machine & Human Readable Standards.
Interoperability & Collaboration.
Open-source Tooling
“Climate Analysis Toolbox for ESA”
Software to facilitate the processing and analysis of all data products generated by the ESA Climate Change Initiative Programme (CCI).
CATE
• Data sources
• Operations
• Workflows
Cate architecture (diagram):
• Desktop App (GUI)
• Command-Line App (CLI)
• Web Service (WebAPI), RESTful
• Python Core Lib (API) with Plugin 1, Plugin 2
• ESA Open Data Portal and other data services
• Process 1, Process 2, Process 3
Cate Desktop
• Browse datasets published by CCI Open Data Portal
• Download full datasets or just subsets
▪ Temporal subset
▪ Spatial subset
▪ Variable subset
• Also manage your local data sources
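The three kinds of subset listed above (temporal, spatial, variable) can be illustrated on a miniature in-memory dataset. The data structure, the subset function and its parameter names are hypothetical and chosen for this sketch only; they are not the Cate API.

```python
from datetime import date

# Hypothetical miniature dataset: variable -> list of (date, lat, lon, value)
dataset = {
    "sst": [(date(2010, 1, 1), 10.0, 20.0, 290.1),
            (date(2010, 2, 1), 50.0, 5.0, 280.4),
            (date(2011, 1, 1), 52.0, 8.0, 281.0)],
    "chl": [(date(2010, 2, 1), 51.0, 6.0, 0.8)],
}


def subset(ds, variables=None, time_range=None, bbox=None):
    """Return the part of ds matching the optional constraints.

    variables:  names to keep (variable subset)
    time_range: (first day, last day), inclusive (temporal subset)
    bbox:       (lon_min, lat_min, lon_max, lat_max) (spatial subset)
    """
    result = {}
    for name, records in ds.items():
        if variables is not None and name not in variables:
            continue
        kept = []
        for day, lat, lon, value in records:
            if time_range and not (time_range[0] <= day <= time_range[1]):
                continue
            if bbox and not (bbox[0] <= lon <= bbox[2] and bbox[1] <= lat <= bbox[3]):
                continue
            kept.append((day, lat, lon, value))
        result[name] = kept
    return result


europe_2010_sst = subset(dataset,
                         variables=["sst"],
                         time_range=(date(2010, 1, 1), date(2010, 12, 31)),
                         bbox=(0.0, 45.0, 15.0, 60.0))
print(europe_2010_sst)
```

Applying the constraints before download is what makes it possible to fetch just a subset rather than the full dataset.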
• Every operation is a new workflow step
• Workflows can be executed from Python or from the Command-Line Interface (CLI)
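The idea that every operation becomes a workflow step which can be replayed later can be sketched as follows. The Workflow class, the operation registry and the use of JSON for saving are assumptions made for this illustration; they do not reproduce Cate's own workflow format or API.

```python
import json

# Hypothetical operation registry; the names are illustrative only
OPERATIONS = {
    "scale": lambda values, factor: [v * factor for v in values],
    "offset": lambda values, amount: [v + amount for v in values],
    "mean": lambda values: sum(values) / len(values),
}


class Workflow:
    def __init__(self):
        self.steps = []

    def add_step(self, op, **params):
        """Record an operation call as a new workflow step."""
        self.steps.append({"op": op, "params": params})
        return self

    def run(self, values):
        """Execute all steps in order, feeding each result to the next."""
        for step in self.steps:
            values = OPERATIONS[step["op"]](values, **step["params"])
        return values

    def to_json(self):
        """Serialise the recorded steps so they can be replayed later."""
        return json.dumps(self.steps)

    @classmethod
    def from_json(cls, text):
        workflow = cls()
        workflow.steps = json.loads(text)
        return workflow


wf = Workflow().add_step("scale", factor=2.0).add_step("offset", amount=1.0).add_step("mean")
saved = wf.to_json()
# Replaying the saved workflow gives the same result as the interactive run
result = Workflow.from_json(saved).run([1.0, 2.0, 3.0])
print(result)
```

A saved workflow of this kind could be replayed from a script or a command-line wrapper, which is the batch-processing scenario the slide describes.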
Python Programming & Batch Processing
Exported CLI Calls
• Exported Python Code
Execution Scenarios
Cate Desktop
Processing and analysis of Earth Observation data
Managing big EO data is increasingly complex.
But not just technically.
Collaboration. Culture. Organisation. Structure.
Communication. Learning. Exchange.
Sentinel Toolbox SNAP: step.esa.int
CCI Toolbox: github.com/CCI-Tools/CCI-Tools.github.io, ect-core.readthedocs.io, cci-tools.github.io
Earth System Datacube: earthsystemdatacube.net/