Predicting Sentinel-2 optical data using
multitemporal Sentinel-1 radar data for
cloud gap reconstruction
Hamelberg, MA
12 May 2020
Geo-Information Science and Remote Sensing
Thesis Report GIRS-2020-25
Predicting Sentinel-2 optical data using multitemporal Sentinel-1 radar data for cloud gap reconstruction
Hamelberg, MA | 12 May 2020 | pg. 1
Author
Hamelberg, MA 1
Registration number: 910824-302-050 | [email protected]
1 MSc, Geo-Information Science and Remote Sensing, Wageningen University
Supervisors
Clevers, JGPW 2
Reiche, J 2
[email protected]
2 Laboratory of Geo-Information Science and Remote Sensing, Wageningen University and Research Centre
12 May 2020, Wageningen, Netherlands
A thesis report submitted in partial fulfilment of the degree of Master of Science at Wageningen University
and Research Centre, The Netherlands
Keywords
Thesis; WUR; Deep learning; Predicting; Sentinel-1; radar data; Sentinel-2; optical data; Cloud gap
reconstruction; Remote Sensing
Abstract
Dense and continuous land surface mapping and monitoring is hampered by cloud cover. The
Sentinel-2 satellite mission provides optical data that suffers from this problem, creating so-called cloud gaps. There
is a demand to predict missing optical data using alternative sources to reconstruct these cloud gaps. The
spatiotemporally similar Sentinel-1 satellite mission is one of these sources providing radar data that is
able to bypass clouds. The research objective is to predict incomplete regions of Sentinel-2 optical data
using multitemporal Sentinel-1 radar data from cloud free regions. Dissimilarities between the data types
render this a difficult task. A U-Net deep learning model is applied to provide an advanced solution. The
model is designed for semantic segmentation and image reconstruction and has been successfully applied
in previous research in the field of remote sensing. Initial preprocessing of the Sentinel data is performed
to prepare for an optimal training and prediction phase. Google Earth Engine and various Python libraries
are the tools of choice. A basic machine learning random forest regressor is implemented to form a
prediction performance baseline. The performance of both prediction models is tested on two study
areas with iterations of 10%, 20%, and 30% artificial cloud cover. The U-Net has a promising performance
with consistent results for both study areas (R² > 0.70; MSE ~0.02; RPD > 2.00; SSIM > 0.70). Significant
differences are observed between the prediction models (p-value < 0.05), favoring the U-Net. No
significant differences are observed in prediction performances between cloud coverages, suggesting
consistent performances when cloud cover varies. The U-Net retains the image structure between cloud
cover iterations, outperforming the baseline model, which generally yields noisy and subpar results. More
(artificial) study areas and cloud cover iterations need to be added for rigorous model performance testing
and methodological approaches need to be aligned to other research for valid comparisons.
Thesis code number: GRS-80436
Thesis Report: GIRS-2020-25
Wageningen University and Research Centre
Laboratory of Geo-Information Science and Remote Sensing
Contents
1 Introduction
1.1 The Digital Earth
1.2 Sentinel missions
1.3 Deep learning
1.4 Literature review
1.5 Problem statement
1.6 Research objective
Research questions
2 Methodology
2.1 Overview
2.2 Materials
Google Earth Engine and Python
Data specifications
2.3 Methods
Study areas
Preprocessing Sentinel data
Artificial clouds
Predicting with deep learning
U-Net performance testing
Comparison to basic machine learning
3 Results
3.1 U-Net predictions
3.2 Comparing prediction models
3.3 Comparing cloud cover predictions
3.4 Test results & visualizations
4 Discussion
4.1 Interpreting the U-Net results
Response labels
Cloud cover iterations
4.2 Literature comparison
4.3 Methodological approach
Study areas and preprocessing
Artificial cloud improvements
Prediction models and testing
5 Conclusion
6 Appendix
6.1 Extended tables
7 References
1 Introduction
1.1 The Digital Earth
Earth is constantly evolving, and these changes are captured by numerous remote sensing satellites (ESDS, 2019). Al Gore
coined the term ‘The Digital Earth’ in his 1998 speech, ushering in an age of spatial information built on a data-driven
and computer-generated twin Earth, which could greatly advance our understanding of the natural
processes and human influences on the planet (Gore, 1998). The representation of the surface in this digital Earth
should be dense, continuous, and contemporary. These goals are becoming a reality thanks to the
increasing amount of public and private satellite imagery, as well as the capabilities to store, process,
manage, and analyze this huge amount of data on a planetary scale (Gorelick et al., 2017; Mateo-García et
al., 2018; Stuhler et al., 2016). These satellites are equipped with many types of sensors, capturing different
aspects of Earth’s surface. A majority of the sensors capture visible and infrared sunlight reflected by the
surface. These so-called passive optical sensors register biophysical and chemical surface properties, making
them useful for a multitude of sectors, including agriculture (Gao et al., 2016; Wang et al., 2017; Zheng
et al., 2016) and ecology (Nagendra et al., 2013; Pettorelli et al., 2014). Data from these optical sensors
offer a human perspective of interpreting and understanding Earth’s surface (Wang and Patel, 2018).
Many tools have been developed that rely on optical data, for example navigation and visualization
applications (e.g. Google Maps), radiative transfer models (e.g. PROSAIL by Jacquemoud et al. (2009)),
and landcover classification libraries (Chen et al., 2017; Wu et al., 2017).
However, there are limitations to optical sensors, namely, atmospheric interferences obstructing
solar reflectance. One of these interferences is cloud cover, causing a lack of information forming ‘cloud
gaps’, hampering a dense and continuous optical perspective of Earth’s surface, especially in the tropics
(Loff, 2015). An alternative data source could be used to reconstruct these cloud gaps. A good candidate
is radar (radio detection and ranging), which uses radio waves that are able to pass through cloud cover.
Radar consists of an active sensor transmitting beams of radio waves that scatter on a surface and (partially)
reflect back to a receiver. This process is called backscatter. Radar looks at physical properties of a surface
and therefore differs from the chemically oriented optical data. The goal is to transform radar data
to pseudo-optical data, or in other words to predict optical data using radar data, and thereby reconstruct cloud gaps
using the relationship between the dissimilar datasets in nearby cloud free regions.
1.2 Sentinel missions
This thesis will focus on the publicly available Sentinel-1 (S1) and Sentinel-2 (S2) missions by the
European Space Agency. S1 has radar sensors and S2 optical sensors. Each mission has two satellites,
providing a fine spatiotemporal resolution with near global coverage. A spatiotemporal resolution has
two components. The first component is the spatial resolution, which is the smallest possible feature
detectable within a resolution cell (i.e. pixel) at surface level (Liang et al., 2012). The second component
is the temporal resolution, which is the revisit time of a satellite platform capturing the same area (Small
et al., 2018). S1 and S2 have spatial resolutions of 5-20m and 10-60m respectively and temporal resolutions
of 6 and 5 days at the equator respectively (2–3 days at mid-latitudes). In this thesis, a fine spatiotemporal
resolution is defined as the resolutions of S1 and S2. The data provided by the missions is freely available and
accessible from various geoportals, benefiting research and real-world applications.
The radar systems on board S1 are active sensors using the technique of synthetic aperture radar
(SAR) to register C-band electromagnetic radiation within the microwave spectrum. Roughness and water
content on a surface can be inferred from the backscatter values that make up radar data (Woodhouse,
2017). Surface roughness is measured by obliquely transmitted microwaves that either deflect away from
a transmitter or scatter back to a receiver close to the transmitter. For example, a flat surface (e.g. a calm
lake) deflects most microwaves resulting in lower backscatter values, whilst a rough surface (e.g. a forested
area) causes diffusion, scattering more microwaves towards the receiver resulting in higher backscatter
values. Backscatter is also affected by water content through the dielectric properties of water: higher dielectric
constants increase backscatter. These two processes differ slightly depending on the polarization of
microwaves when transmitted and received. Two common polarizations are vertical-vertical (VV) and
vertical-horizontal (VH), each giving particular cues about surface properties.
Objects in a resolution cell scatter transmitted microwaves in all sorts of directions before making
their way back to a receiver. This causes granular interference called speckle, appearing as spike noise (i.e.
“salt & pepper” noise), even on a seemingly flat surface. Another disadvantage of radar is that it is heavily
influenced by topography. To capture distances with a radar system, the sensor must be at an oblique angle,
resulting in slant-range scale distortion (features compress closer to the sensor), foreshortening (feature
slopes are compressed), layover (sloped features appear closer), and radar shadow effects (sloped features
obstruct radar beams). The speckle and topographic effects of radar data make it difficult to interpret,
preprocess, analyze, and geographically co-register to other satellite data (Mou et al., 2017; Schmitt et al.,
2017; Tzouvaras et al., 2019). These disadvantages limit the usage of radar data. However, the
aforementioned capability to bypass most atmospheric interferences and continuously capture the surface
remains a huge advantage over optical sensors.
The optical sensors on board S2 register visible (blue, green, red), near infrared (nir), and short-wave
infrared (swir) electromagnetic radiation. The sensors include the vegetation red edge spectral
domain, cloud screening, and atmospheric correction bands. Each band holds digital pixel values
for a section of the electromagnetic spectrum. Optical data from S2 is captured close to the sensors’ nadir,
mitigating angular effects. The data is easy to interpret, and spatial resolutions are higher due to the
shorter wavelengths. Speckle is not an issue, so the surface is represented more accurately than in radar data.
Figure 1 shows the difference between optical and radar data captured in the same area.
Figure 1 (left) S2 optical data displaying visible light (captured on 2019-08-26); (right) VV backscatter values of
descending polar orbit S1 radar data (captured on 2019-08-25).
1.3 Deep learning
Accurately predicting optical data using dissimilar radar data is challenging considering their
differences and limitations. Deep learning (DL), which is an advanced subset of machine learning (ML),
could provide a solution to this transformation problem. Even though DL is often considered as a ‘black
box’ approach, a considerable amount of research has been done using DL algorithms with superior
results compared to traditional ML algorithms. Nowadays, DL is making its way into the field of remote
sensing and is becoming more popular due to its high performance in satellite image analysis (Belgiu and
Stein, 2019; Ma et al., 2019; Zhu et al., 2018, 2017). A common DL algorithm used for image analysis is the
convolutional neural network (CNN), which is especially well-suited for image object recognition
(Krizhevsky et al., 2012). Its many iterations and improvements, such as convolutional
autoencoders (CAEs), are highly applicable to semantic segmentation and image reconstruction (Cresson et
al., 2019; Shelhamer et al., 2017).
1.4 Literature review
Reconstruction of missing information in optical data using radar data has received increasing attention in the past
decade (Gao et al., 2020; Schmitt et al., 2017; Shen et al., 2015). Eckardt et al. (2013) made initial strides
by transforming pixel values in extensive radar data to pseudo-optical data using spatial statistics. The
composites show promising performances with different percentages and distributions of artificial cloud
cover. The technique is developed for multitemporal and multifrequency radar data with very fine
resolutions. The approach may yield limited results on the simpler S1 data. Nevertheless, it did set a trend
in furthering this research field and should not be overlooked for potential alternative implementations.
It was followed by several ML approaches using dictionary learning with sparse representation for pixel-
based reconstruction (Huang et al., 2015; Li et al., 2014; Xu et al., 2016). With the advent of advanced DL
approaches in remote sensing (Ma et al., 2019; Zhu et al., 2017), the field shifted from earlier statistical and
basic ML algorithms towards these ‘deeper’ and more extensive algorithms. In late 2017,
Zhang et al. (2017) looked at the effective fusion of multimodal remote sensing data using a fully
convolutional neural network for semantic segmentation. This was followed shortly by research using this
technique for transforming and compositing radar and optical data. In early 2018, Scarpa et al. (2018) used a
compact CNN to transform radar data to vegetation indices, after which Wang and Patel (2018) used a cascade
architecture of CNNs extended by a generative adversarial network (GAN) to generate pseudo-optical data from
radar data for interpretation purposes. With the increased interest in radar to optical data transformation, a
preprocessed dataset of co-registered radar and optical data was offered by Schmitt et al. (2018) to advance
research. This dataset included complex urban areas that are heavily subject to topographic effects (Wang and Zhu, 2018).
At a conference in the same month, July 2018, Grohnfeldt et al. (2018) and Liu and Lei (2018) addressed
the transformation topic, suggesting further improvements to previous research and discussing future
directions, as well as the use of more advanced DL algorithms, such as conditional GANs
(cGANs). Bermudez et al. (2018) and He and Yokoya (2018) used a CNN and a cGAN for advanced optical
data prediction using radar data, where Bermudez et al. (2018) included a cloud gap reconstruction after
the prediction phase using DL algorithms, improving their method in further research (Bermudez et al.,
2019). A cGAN has limitations according to Cresson et al. (2019) due to its generative nature, suggesting
the use of CAEs that rely more on reconstruction by estimation. These CAEs often have a U-Net
architecture (Ronneberger et al., 2015) and are widely used in DL applications, reaping the
aforementioned benefits of high performances in semantic segmentation and image reconstruction.
During the writing of this thesis, Gao et al. (2020) published a paper describing the use of a U-Net for
predicting optical data from several radar data sources, introducing a novel method of iteratively refining the
reconstructed cloud gaps (which deviated slightly from the ground truth) with a cGAN before the
final fusion, obtaining prediction results with proper spectral information and fine textures.
1.5 Problem statement
Predicting optical data using radar data with the help of DL algorithms is still a young research field
whose potential has not yet been fully exploited. For example, Gao et al. (2020) suggest
using multitemporal radar data to incorporate changes over time as explanatory variables, whilst Cresson
et al. (2019) still rely on optical data after the prediction period. Research preceding these novel methods
does not use the powerful U-Net, risking suboptimal results. Most recent studies limit their reconstruction to a
single percentage of (artificial) cloud cover and do not systematically test over varying percentages.
Although previous research has addressed the reconstruction of cloud gaps, there is still room for improvement
in the proposed prediction methods. Current prediction models still do not perfectly predict
optical data using radar data. The physical obstruction by clouds in the optical range of the
electromagnetic spectrum remains a problem to sectors that rely on optical remote sensing data.
1.6 Research objective
The cloud penetrating capability of radar provides a solution to reconstruct cloud gaps in optical
data. This can be achieved by forming a relationship between the VV and VH bands of radar data
(hereafter named training features) and the visible and infrared bands of optical data (hereafter named
response labels) in cloud free regions. For this process, a U-Net is implemented. The S1 training features
are multitemporal and solely use information during and before the prediction time interval within
cloud free regions. The optical data prediction results, spanning over multiple study areas, are extensively
and systematically tested whilst iterating over varying artificial cloud cover percentages.
Research questions
The research objective can be summarized in one main research question with two sub-questions:
How accurately can a U-Net predict optical S2 data in cloud gaps using multitemporal S1 radar data
from cloud free regions?
1. Does the advanced U-Net model outperform a basic machine learning model?
2. To what degree do increasing cloud cover percentages affect prediction performances?
2 Methodology
2.1 Overview
Figure 2 Scheme of the core methodological steps.
2.2 Materials
Google Earth Engine and Python
S1 and S2 (S1/2) provide a large influx and a constant stream of globally and dynamically available
data preprocessed and accessible through Google Earth Engine (GEE). GEE is a spatial cloud computing
platform and is freely available for research, education, and nonprofit use (Google Earth Engine, n.d.).
When registered for GEE, a Python application programming interface (API) can be used in this
programming language’s coding environment with an authentication key. For this thesis, the freely
available ‘Jupyter’ notebook coding environment ‘Colaboratory’ (Google Colab, n.d.) is used with Python
3.x. Colaboratory allows access to virtual machines with graphics processing units that enable fast
training of DL algorithms. TensorFlow (TF), accessed through Keras, is the DL library of choice. Other libraries
include Folium and Matplotlib for visualization, NumPy and Pandas for data structuring, and SciPy,
Scikit-Learn, and Scikit-Image for statistics and ML processes. A single web-based notebook, auxiliary to the
thesis report, addresses the research question. To run the notebook, a Google account with a GEE registration
and an active Google Cloud Platform subscription are required.
Data specifications
S1 data stored in GEE is available as level-1 C-band SAR ground range detected images. These are
calibrated and orthorectified using the S1 Toolbox. S2 data is available as level-2A surface reflectance
images in the GEE database. These are atmospherically corrected and orthorectified. Other relevant
specifications are displayed in Table 1.
Table 1 Relevant S1/2 specifications.
Specification | Sentinel-1 (S1) | Sentinel-2 (S2)
Selected bands | VV (Vertical-Vertical polarized, ~5.404 GHz); VH (Vertical-Horizontal polarized, ~5.404 GHz); and 𝜇 (i.e. mean) of VV and VH (𝜇𝑉𝑉,𝑉𝐻) | B2 (blue ~493nm); B3 (green ~560nm); B4 (red ~665nm); B8 (nir ~833nm); B11 (swir1 ~1610nm); and B12 (swir2 ~2190nm)
Spatial resolution | VV and VH: 10x10m | B2-B8: 10x10m; and B11 and B12: 20x20m
Temporal resolution | 6 days at the equator (2-3 days at mid-latitudes) | 5 days at the equator (2-3 days at mid-latitudes)
Availability | 2014-10-03 to present; and near global coverage | 2017-03-28 to present; and near global coverage
Other | instrument mode: interferometric wide swath (IW); orbit pass: ascending and descending; and resolution: high | -
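As a hedged illustration of how the specifications in Table 1 could translate to the GEE Python API, the sketch below stores the filter settings as plain data and applies them in a function that imports `ee` lazily. The collection identifiers `COPERNICUS/S1_GRD` and `COPERNICUS/S2_SR`, the helper names `bounding_box` and `load_median_composites`, and the choice to pass corner coordinates as a list are assumptions of this sketch and are not taken from the thesis notebook.

```python
# Illustrative sketch only; helper names and collection IDs are assumptions.
S1_SPEC = {
    "collection": "COPERNICUS/S1_GRD",  # assumed GEE collection ID (level-1 GRD)
    "bands": ["VV", "VH"],
    "instrument_mode": "IW",
}
S2_SPEC = {
    "collection": "COPERNICUS/S2_SR",   # assumed GEE collection ID (level-2A)
    "bands": ["B2", "B3", "B4", "B8", "B11", "B12"],
}


def bounding_box(corners):
    """Return [xmin, ymin, xmax, ymax] from a list of (x, y) corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return [min(xs), min(ys), max(xs), max(ys)]


def load_median_composites(corners, start, end):
    """Load S1 and S2 median composites for the half-open interval [start, end).

    Needs the earthengine-api package and a prior ee.Authenticate() and
    ee.Initialize(); `ee` is imported lazily so the helpers above work
    without GEE access.
    """
    import ee

    region = ee.Geometry.Rectangle(bounding_box(corners))
    # Ascending and descending passes are both kept, so no orbit filter is set.
    s1 = (ee.ImageCollection(S1_SPEC["collection"])
          .filterBounds(region)
          .filterDate(start, end)
          .filter(ee.Filter.eq("instrumentMode", S1_SPEC["instrument_mode"]))
          .filter(ee.Filter.listContains("transmitterReceiverPolarisation", "VV"))
          .filter(ee.Filter.listContains("transmitterReceiverPolarisation", "VH"))
          .select(S1_SPEC["bands"])
          .median())
    s2 = (ee.ImageCollection(S2_SPEC["collection"])
          .filterBounds(region)
          .filterDate(start, end)
          .select(S2_SPEC["bands"])
          .median())
    return s1, s2
```

Keeping the specification as data separates what is selected from how GEE is queried, so the same settings can be reused for every time interval.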
2.3 Methods
Study areas
Two study areas (i.e. test sites) are selected based on a study time interval where zero to minimal
atmospheric interferences are present, providing clean reference data for testing. The study time interval
can be of arbitrary length as long as it contains S1/2 data; for this thesis, an interval of 7 days is used to
ensure complete coverage of S1/2 data. The size of the study area must be large enough to encompass at
least two distinct features, such as a forested patch and an urban area or various agricultural fields with
different growth stages. The terrain in a study area must be topographically simple, minimizing the
uncertainties added by the aforementioned limitations of optical and radar data. It is assumed that both datasets
are correctly preprocessed and devoid of atmospheric interferences. An in situ ‘laboratory’ situation is
created by meeting these assumptions as closely as possible. The study area images are visually inspected
for each study time interval to verify that the assumptions are met. It is difficult to achieve complete elimination of all
uncertainties, as preprocessing tools and atmospheric interference indications are not perfect. Table 2
provides specifications for the two selected study areas, which are partially visualized in Figure 4.
Table 2 Two study areas that meet the uncertainty assumptions as closely as possible. The study areas are in the
coordinate reference system of EPSG:4326.
Specification | Flevoland, NL (study area 1) | Amazon, BR (study area 2)
Description | Various rectangular agricultural fields with different growth stages. | Partially deforested area with rivers and a town.
Upper left (x, y) | (5.670, 52.765) | (-69.865, -6.650)
Lower left (x, y) | (5.670, 52.725) | (-69.865, -6.686)
Lower right (x, y) | (5.735, 52.725) | (-69.828, -6.686)
Upper right (x, y) | (5.735, 52.765) | (-69.828, -6.650)
Area / Perimeter | ~19.46 km² / ~17.68 km | ~16.36 km² / ~16.14 km
Prediction time interval | [2019-08-20, 2019-08-27) | [2019-07-25, 2019-08-01)
Preprocessing Sentinel data
S1/2 data is loaded from the GEE database and filtered based on metadata and spatial properties,
which include the required image bands, desired study time interval (i.e. 7 days), and the boundaries of
the study area. S1’s instrument mode, orbit pass, and resolution (as mentioned in Table 1) are filtered
based on the metadata as well. The filtered S1/2 data is each loaded in an image collection (stack of images)
for further processing using tools within GEE. The stacked images in each image collection are aggregated
to a median value at each pixel if two or more images overlap. Empirically, this reduces or
eliminates extreme values (e.g. clouds, haze, shadows, sensor artifacts, and speckle),
especially with larger time intervals, and it is common practice within the GEE community.
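The effect of the per-pixel median can be sketched outside GEE with NumPy. The function name `median_composite` and the small example stack are hypothetical; the sketch only assumes that masked pixels are stored as NaN.

```python
import numpy as np


def median_composite(stack):
    """Per-pixel median over axis 0 of a (time, rows, cols) stack, ignoring NaN."""
    stack = np.asarray(stack, dtype=float)
    return np.nanmedian(stack, axis=0)


# Three hypothetical acquisitions of a 2x2 scene. The second one contains an
# extreme value (e.g. a cloud or speckle spike) and one masked pixel (NaN).
stack = np.array([
    [[0.10, 0.20], [0.30, 0.40]],
    [[0.90, 0.20], [0.30, np.nan]],
    [[0.12, 0.22], [0.28, 0.42]],
])
composite = median_composite(stack)
```

In this example the outlier of 0.90 does not reach the composite, which illustrates why the median is preferred over the mean for this aggregation.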
Furthermore, S2 is resampled to 30x30m by a single focal pass sampling the median within a circular
kernel, whilst S1 retains its 10x10m spatial resolution. This means that for 9 training feature pixels, one
response label pixel value is available, reducing the amount of variation during training. The coarser
resample size of S2 data makes it heuristically more similar to the naturally fuzzier S1 data. It also reduces
spatial inaccuracies within the S2 data. In tests of prediction performance, the 30x30m resample size
outperforms the 10x10m resample size. However, when resample sizes become too large, say 60x60m,
spatial detail is lost. The final predicted pseudo-optical images have a spatial resolution of 30x30m.
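The 9-to-1 relation between 10x10m and 30x30m pixels can be illustrated with a block-wise median in NumPy. This is a simplification and not the thesis procedure: it uses square, non-overlapping 3x3 blocks instead of a focal pass with a circular kernel, and the function name `block_median` is hypothetical.

```python
import numpy as np


def block_median(image, factor=3):
    """Downsample a 2-D array by the median of factor x factor blocks.

    Rows and columns that do not fill a complete block are trimmed.
    """
    image = np.asarray(image, dtype=float)
    rows = (image.shape[0] // factor) * factor
    cols = (image.shape[1] // factor) * factor
    trimmed = image[:rows, :cols]
    # Split into blocks, then gather the pixels of each block on the last axis.
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    blocks = blocks.transpose(0, 2, 1, 3).reshape(rows // factor, cols // factor, -1)
    return np.median(blocks, axis=2)
```

With `factor=3`, a 6x6 grid of 10x10m pixels becomes a 2x2 grid of 30x30m pixels, each summarizing 9 input pixels.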
Additionally, the statistical Lee speckle filter (Lee, 1980) is applied to S1 data, reducing speckle whilst
retaining feature fidelity. This renders it more similar to the relatively smooth S2 data. Moreover, the S1
images taken by ascending and descending orbit passes are aggregated by their median at each pixel,
resulting in reduced speckle and denser representations within a time interval. Fuzziness may increase
when fusing the orbital passes, as they capture the surface from different angles and directions. This effect
is especially apparent in complex terrains where topographic effects distort the ground truth. Finally, extra
temporal datapoints are added to the S1 data by repeating the above-mentioned S1 preprocessing steps on
more than one time interval to indicate changes in backscatter values (VV and VH) over time, thus adding
a multitemporal component to the S1 training features (see Figure 3).
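The speckle filtering step mentioned above can be sketched as follows. This is a simplified, additive-noise form of the Lee filter written with NumPy only, and it is not the GEE implementation used in the thesis. The window size, the default noise variance estimate, and the function name `lee_filter` are assumptions of this sketch.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def lee_filter(image, window=5, noise_variance=None):
    """Simplified Lee filter: blend each pixel with its local mean.

    The weight is local_var / (local_var + noise_var), so flat areas are
    smoothed strongly while high-variance areas (edges) are preserved.
    `window` must be odd.
    """
    image = np.asarray(image, dtype=float)
    pad = window // 2
    padded = np.pad(image, pad, mode="reflect")
    windows = sliding_window_view(padded, (window, window))
    local_mean = windows.mean(axis=(-2, -1))
    local_var = windows.var(axis=(-2, -1))
    if noise_variance is None:
        # Assumption: use the average local variance as a crude noise estimate.
        noise_variance = local_var.mean()
    denom = local_var + noise_variance
    weight = np.divide(local_var, denom, out=np.zeros_like(local_var), where=denom > 0)
    return local_mean + weight * (image - local_mean)
```

The adaptive weight is what distinguishes this filter from a plain moving average: where the local variance is close to the noise level the output approaches the local mean, whereas strong local contrast keeps the original pixel value.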
Figure 3 Scheme of multitemporal S1 datapoints. The first time interval (t1) of S1 is the same as the time interval
of S2 and has a length of 7 days. This is the time interval where S2 optical data is predicted and contains the first S1
temporal training features. Extra features are extracted by a time interval (t2) one week prior to the prediction time
interval. Lastly, three consecutive features are extracted from time intervals of a month each (t3, t4, and t5).
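The interval scheme of Figure 3 can be written down as a small helper. The sketch assumes that a month is 30 days and that t3, t4, and t5 directly precede t2 and run backwards in time; neither detail is stated explicitly in the text. The function name `feature_intervals` is hypothetical.

```python
from datetime import date, timedelta


def feature_intervals(prediction_start, week_days=7, month_days=30, n_months=3):
    """Return half-open [start, end) intervals t1..t5 as a dict.

    t1 is the prediction interval, t2 the week before it, and t3..t5 are
    consecutive month-long intervals before t2 (assumed layout).
    """
    week = timedelta(days=week_days)
    month = timedelta(days=month_days)
    intervals = {
        "t1": (prediction_start, prediction_start + week),
        "t2": (prediction_start - week, prediction_start),
    }
    end = prediction_start - week
    for i in range(n_months):
        start = end - month
        intervals[f"t{i + 3}"] = (start, end)
        end = start
    return intervals


# Prediction interval of study area 1 (Table 2).
intervals = feature_intervals(date(2019, 8, 20))
```

Each interval can then be preprocessed with the same S1 steps, giving one set of training features per interval.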
Individual radar images are heavily subject to topographic effects and speckle. Aggregating
multiple images results in a median value that resembles spectral reality more accurately, as extremes
are reduced and recurring features dominate. However, surface changes are washed out over longer time
intervals. This is the reason why both long (see t3, t4 and t5 in Figure 3) and short (see t1 and t2 in Figure
3) time intervals are used as training features. This effect is visualized in Figure 4, where the left S1 images
(short time interval) are fuzzier and noisier than the center S1 images (long time interval). The
right S2 images are within the same prediction time interval as the left S1 images.
Figure 4 Top to bottom: Study area 1; Study area 2. Left to right: False color representation of S1 training
features in t1 (red: VV, green: VH, blue: μ(VV, VH)); False color representation of S1 training features in t3 to t5 (red:
VV of t3, green: VV of t4, blue: VV of t5); True color of S2 response labels in t1.
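The median aggregation over short and long time intervals can be sketched as follows. This is a minimal illustration on synthetic data, not the thesis workflow: the array layout, the function name `median_composites`, and in particular the day offsets chosen for t1 to t5 are assumptions made for the example.

```python
import numpy as np

# Hypothetical interval boundaries in days before the end of the prediction
# week. The exact offsets of t3 to t5 are an assumption, not from the thesis.
INTERVALS = {
    "t1": (0, 7), "t2": (7, 14), "t3": (14, 44), "t4": (44, 74), "t5": (74, 104),
}


def median_composites(stack, days_before, intervals=INTERVALS):
    """Return one per-pixel median image per time interval.

    stack:       array (n_images, height, width) of backscatter for one band
    days_before: array (n_images,) with the age of each image in days
    """
    composites = {}
    for name, (start, end) in intervals.items():
        in_interval = (days_before >= start) & (days_before < end)
        if not in_interval.any():
            raise ValueError(f"no images available for interval {name}")
        # nanmedian ignores pixels that are missing in individual images
        composites[name] = np.nanmedian(stack[in_interval], axis=0)
    return composites


rng = np.random.default_rng(0)
days = np.arange(0, 104, 3)                     # one synthetic image every 3 days
truth = rng.uniform(-20.0, -5.0, size=(8, 8))   # synthetic "true" backscatter (dB)
stack = truth + rng.normal(0.0, 2.0, size=(days.size, 8, 8))  # speckle-like noise
comps = median_composites(stack, days)

err_short = np.abs(comps["t1"] - truth).mean()  # 7-day composite, few images
err_long = np.abs(comps["t3"] - truth).mean()   # 30-day composite, more images
```

On a static synthetic surface the longer composite lies closer to the noise-free values, which mirrors the trade-off described in the text: long intervals suppress noise, while short intervals are needed to retain recent surface changes.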
Artificial clouds
Artificial clouds are generated using various magnitudes of gradient (Perlin) noise. The clouds can
vary in scatteredness and patch size. Shadows are not modeled for simplicity's sake, as only the final mask
matters. A threshold can be set to control the cloud cover percentage in a study area. For the
experiments in this thesis, an initial seed-driven pseudo-random state of artificial cloud dispersion is
generated that resembles a single layer of medium-sized cumulus cloud formations. This dispersion is
consistent, and the boundary of the initial state expands evenly when the cloud cover threshold increases
(see Figure 5). The artificial clouds allow extensive testing on reference data (i.e. ground-truth data) in
masked regions.
Figure 5 Artificial cloud cover in study area 1. Left to right: 10%; 20%; 30%.
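A seed-driven cloud mask with a cover threshold can be sketched as below. Note the swapped-in technique: the sketch uses multi-octave value noise (bilinear interpolation of a random lattice) as a simpler stand-in for Perlin gradient noise, and all function names and parameters (`cloud_field`, `base_cells`, `octaves`) are hypothetical. Thresholding one fixed noise field at a quantile gives the requested cover fraction, and a larger cover always contains the smaller one, which corresponds to the evenly expanding boundary described in the text.

```python
import numpy as np


def value_noise(shape, cells, rng):
    """Smooth 2D noise from bilinear interpolation of a coarse random lattice."""
    height, width = shape
    lattice = rng.random((cells + 1, cells + 1))
    ys = np.linspace(0, cells, height, endpoint=False)
    xs = np.linspace(0, cells, width, endpoint=False)
    y0, x0 = ys.astype(int), xs.astype(int)
    ty, tx = (ys - y0)[:, None], (xs - x0)[None, :]
    # Smoothstep easing avoids visible creases along lattice lines
    ty, tx = ty * ty * (3 - 2 * ty), tx * tx * (3 - 2 * tx)
    top = lattice[y0][:, x0] * (1 - tx) + lattice[y0][:, x0 + 1] * tx
    bottom = lattice[y0 + 1][:, x0] * (1 - tx) + lattice[y0 + 1][:, x0 + 1] * tx
    return top * (1 - ty) + bottom * ty


def cloud_field(shape, seed=0, base_cells=4, octaves=4):
    """Sum several octaves of noise; the seed fixes the cloud dispersion."""
    rng = np.random.default_rng(seed)
    field = np.zeros(shape)
    for octave in range(octaves):
        field += 0.5 ** octave * value_noise(shape, base_cells * 2 ** octave, rng)
    return field


def cloud_mask(field, cover):
    """Mark the highest `cover` fraction of the field as cloud."""
    return field > np.quantile(field, 1.0 - cover)


field = cloud_field((128, 128), seed=42)
masks = {cover: cloud_mask(field, cover) for cover in (0.10, 0.20, 0.30)}
```

Because every mask is derived from the same field, the 10% mask is a subset of the 20% mask, which is in turn a subset of the 30% mask.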
Predicting with deep learning
All bands (i.e. response labels) in the preprocessed S1/2 datasets are masked in regions with artificial
cloud cover. These partially masked datasets are used as input data to the U-Net (see Figure 6). The
model architecture of this U-Net is based on TF documentation (tensorflow/models, n.d.) and is similar
to the code provided in a notebook demo by Google (google/earthengine-api, n.d.).
Figure 6 A schematic overview of a U-Net architecture (Meados et al., 2019). The pixelwise relation between
training features (multitemporal S1 data) and response labels (S2 data) is encoded to a latent space where the
input (x) is spatially reduced to granular features. From this latent space the input is decoded (reconstructed) with
input and encoder layer concatenations to a predicted output (y).
The U-Net uses the process of semantic segmentation to transform continuous pixel values from
one image to another, in this case from multitemporal S1 to S2 data. It looks at the relation between
individual pixels, but also considers their spatial context by convolving each layer to a lower spatial
dimension using different convolutional filters. This extends its predictive power to a larger set of
granular features within the data. To be able to extract training features and the response label, a single
multidimensional image (width (in pixels) × height (in pixels) × depth (i.e. layers)) is created from the
preprocessed S1/2 image collections (11 layers in total, i.e. 10 training features and 1 response label). At each
pixel of this image a neighborhood array (i.e. patch) with a kernel size of 64 × 64 pixels is stored with a
depth equal to the number of training features and response labels (64 × 64 × 11). The kernel size should not be too small,
as spatial context may be lost due to limited training features. It should not be too large either, as cloud
masked regions may disrupt feature continuity within the patch, and the patch should not extend
too far beyond the boundaries of a study area.
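The patch extraction can be illustrated with a small NumPy sketch. It assumes the 11-layer image is held in memory as an array with the 10 training features first and the response label last, and it keeps every patch inside the image, which is a simplification of the per-pixel neighborhood arrays described above. The function name `sample_patches` is hypothetical.

```python
import numpy as np


def sample_patches(image, n_patches, kernel=64, n_features=10, seed=0):
    """Pseudo-randomly sample kernel x kernel patches from a stacked image.

    image: array (height, width, depth), features first and label last
    Returns (features, labels) with shapes
    (n_patches, kernel, kernel, n_features) and (n_patches, kernel, kernel, 1).
    """
    rng = np.random.default_rng(seed)
    height, width, _ = image.shape
    # Upper-left corners chosen so that each patch stays inside the image
    rows = rng.integers(0, height - kernel + 1, size=n_patches)
    cols = rng.integers(0, width - kernel + 1, size=n_patches)
    patches = np.stack(
        [image[r:r + kernel, c:c + kernel, :] for r, c in zip(rows, cols)]
    )
    return patches[..., :n_features], patches[..., n_features:]


# Synthetic stand-in for the stacked S1/2 image: 10 S1 features + 1 S2 label
image = np.random.default_rng(1).random((200, 300, 11))
features, labels = sample_patches(image, n_patches=5)
```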
The patches are pseudo-randomly sampled from the image and stored in a TF record, which is a
special file format of the TF library that optimizes DL workflows. The TF record is read into a TF dataset
in random access memory that shuffles the patches randomly up to a buffer size of 2000. This prevents
a bias when iterating over the training data during the training phase. The buffer has 2000 batches of 16
shuffled patches with the training features (16 × 64 × 64 × 10) and the response label (16 × 64 × 64 ×
1). These are fed into the U-Net as input nodes. The model runs for an optimal number of epochs (the loss
usually stabilizes between 30 and 50 epochs with 30% cloud cover) with 1000 steps per epoch.
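The idea behind a bounded shuffle buffer followed by batching can be shown in plain Python. This is a conceptual sketch only and not the TF implementation used in the thesis; the function names are hypothetical, and small numbers replace the buffer size of 2000 and the 64 × 64 patches to keep the example readable.

```python
import random


def buffered_shuffle(items, buffer_size, rng):
    """Yield items in random order using a fixed-size buffer.

    The buffer is filled first; afterwards each incoming item replaces a
    randomly chosen buffered item, which is emitted.
    """
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)
            continue
        index = rng.randrange(buffer_size)
        yield buffer[index]
        buffer[index] = item
    rng.shuffle(buffer)      # drain what is left once the input is exhausted
    yield from buffer


def make_batches(items, batch_size):
    """Group consecutive items into lists of at most batch_size elements."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


patch_ids = list(range(100))             # stand-ins for stored patches
rng = random.Random(0)
batches = list(make_batches(buffered_shuffle(patch_ids, 10, rng), 16))
flat = [pid for batch in batches for pid in batch]
```

A small buffer only mixes patches locally, so the buffer should be large relative to any ordering present in the stored patches.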
The U-Net provides faster results with normalization of input data, as this accelerates the process of
gradient descent. Before training, each input band for S1/2 is linearly normalized to a range between 0 and 1
using the 1st and 99th percentile of the original data range. This excludes extremes and often centers the
data around the median more evenly. The normalization can be reversed with the original band
ranges, but the extremes outside the percentile range are lost after denormalization. This can be
avoided by normalizing to the minimum and maximum value. However, retained extremes will have a
negative effect on training times and prediction performance.
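A minimal sketch of the percentile normalization and its inverse is given below. Clipping values outside the 1st to 99th percentile range to 0 and 1 is an assumption that matches the statement that extremes are lost after denormalization; the function names are hypothetical.

```python
import numpy as np


def fit_range(band, low=1.0, high=99.0):
    """Return the percentile bounds used to scale one band."""
    lo, hi = np.nanpercentile(band, [low, high])
    return float(lo), float(hi)


def normalize(band, lo, hi):
    """Scale linearly so that lo maps to 0 and hi to 1, clipping extremes."""
    return np.clip((band - lo) / (hi - lo), 0.0, 1.0)


def denormalize(scaled, lo, hi):
    """Map normalized values back to the original band range."""
    return scaled * (hi - lo) + lo


rng = np.random.default_rng(1)
band = rng.normal(0.2, 0.05, size=(50, 50))   # synthetic band values
lo, hi = fit_range(band)
scaled = normalize(band, lo, hi)
restored = denormalize(scaled, lo, hi)
inside = (band >= lo) & (band <= hi)          # pixels that survive unchanged
```

The same `lo` and `hi` should be reused for any additional input data of that band, so that new data is scaled consistently with the training data.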
U-Net performance testing
The performance of the U-Net is tested by relating cloud free optical reference data to the predicted
optical data. The U-Net is trained on input data with increasing percentages (10%, 20%, and 30%) of artificial
cloud cover. Each iteration uses a similar seed-driven pseudo-random cloud cover dispersion
to generate consistent results. The outputs are tested using several statistical metrics.
Basic metrics include the coefficient of determination (R2), the mean squared error (MSE) and the residual
prediction deviation (RPD). R2 values close to 1 indicate a strong correlation between datasets, lower MSE values
indicate a narrower fit around the identity line, and RPD values indicate prediction strength, where values
above 3.00 are preferred. These metrics, together with their associated scatter plots, provide insight into the
predictive power of the model for each response label and increasing cloud cover percentages. However,
these metrics are limited in assessing complex image characteristics, such as image structure. Two images
can have the same MSE (e.g. by auto cancellation of spike noise) whilst having a completely different
image structure (e.g. texture). The structural similarity index (SSIM) solves this problem (Wang et al.,
2004; Wang and Bovik, 2009). This advanced metric elaborates upon the preceding peak signal-to-noise
ratio (PSNR) metric that only assesses noise. The SSIM provides an index ranging from 0 to 1, where 1 is
an identical structural similarity between images and 0 indicates large differences in image structure. All
metrics assess differences in continuous data as found in the pixels from the predicted images and together
provide a comprehensive indication of the model performance.
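The three basic metrics can be written out in a few lines of NumPy. The formulas below are the common definitions and are an assumption, since the thesis does not state them explicitly: R2 as one minus the ratio of residual to total sum of squares (which can become negative, as seen for the random noise control), and RPD as the standard deviation of the reference divided by the root mean squared error.

```python
import numpy as np


def mse(reference, predicted):
    """Mean squared error between reference and predicted pixel values."""
    return float(np.mean((reference - predicted) ** 2))


def r2(reference, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((reference - predicted) ** 2)
    ss_tot = np.sum((reference - reference.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rpd(reference, predicted):
    """Residual prediction deviation: SD of the reference divided by RMSE."""
    return float(np.std(reference, ddof=1) / np.sqrt(mse(reference, predicted)))


rng = np.random.default_rng(2)
reference = rng.random(1000)                          # synthetic reference pixels
predicted = reference + rng.normal(0.0, 0.1, 1000)    # synthetic predictions

score_r2 = r2(reference, predicted)
score_mse = mse(reference, predicted)
score_rpd = rpd(reference, predicted)
```

Under these definitions the RPD is approximately 1 / sqrt(1 - R2) for large samples, so the two metrics carry closely related information.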
It should be noted that before applying the statistical metrics, the test images are dimensionally
reduced (2D to 1D) and encompass purely the regions within the 10% cloud cover mask. This tests the
absolute prediction performance of the models without interference from cloudless reference data or
additional data found in the larger cloud cover percentage masks. The image is flattened before testing
because the SSIM image quality model does not take empty pixels (i.e. masked regions) as inputs. The
null values of the masked regions are removed from the flattened image array.
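The flattening step and an SSIM computation on the resulting 1D arrays can be sketched as follows. The single-window (global) form of the SSIM formula of Wang et al. (2004) is used here as an assumption; the thesis does not state which SSIM implementation or window settings were applied, and the function names are hypothetical.

```python
import numpy as np


def masked_values(image, mask):
    """Flatten a 2D image and keep only pixels inside the test mask."""
    return image[mask]            # boolean indexing returns a 1D array


def global_ssim(x, y, data_range=1.0):
    """SSIM evaluated once over all samples (no sliding window)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mean_x, mean_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = np.mean((x - mean_x) * (y - mean_y))
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)


rng = np.random.default_rng(3)
reference = rng.random((64, 64))                     # synthetic reference band
mask = np.zeros((64, 64), dtype=bool)
mask[16:48, 16:48] = True                            # stand-in for the 10% mask
good = np.clip(reference + rng.normal(0.0, 0.05, reference.shape), 0.0, 1.0)
noise = rng.random((64, 64))                         # random noise control

ref_1d = masked_values(reference, mask)
ssim_good = global_ssim(ref_1d, masked_values(good, mask))
ssim_noise = global_ssim(ref_1d, masked_values(noise, mask))
```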
Comparison to basic machine learning
The basic ML method of random forest regression is used to compare the U-Net performance to a
baseline. The random forest regressor (RFR), by the Python library Scikit-Learn with default settings,
takes as training features the multitemporal S1 data and as response labels the S2 data per pixel. The
response label predictions are then tested using the same performance testing methods as applied to the
U-Net predictions. The RFR uses the same preprocessed S1/2 data as the U-Net, and the RFR predictions
are compared to the same reference data. Furthermore, as a control, both the RFR and U-Net predictions
are compared to random noise generated from the input data ranges.
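A per-pixel random forest baseline of this kind can be sketched with scikit-learn. The data below is synthetic and the rectangular cloud mask is only a placeholder; `random_state=0` is added for reproducibility and is the only deviation from the default settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
height, width = 32, 32

# Synthetic stand-ins: 10 multitemporal S1 features and one S2 response label
features = rng.random((height, width, 10))
label = features[..., :2].mean(axis=-1) + rng.normal(0.0, 0.02, (height, width))

cloud = np.zeros((height, width), dtype=bool)
cloud[8:16, 8:16] = True                      # placeholder cloud mask

# Each cloud free pixel is one training sample with 10 features
model = RandomForestRegressor(random_state=0)
model.fit(features[~cloud], label[~cloud])

# Predict only inside the cloud mask; cloud free pixels keep their values
prediction = label.copy()
prediction[cloud] = model.predict(features[cloud])

mse_cloud = np.mean((prediction[cloud] - label[cloud]) ** 2)
```

Since every pixel is treated as an independent sample, the model has no access to spatial context, which is the main conceptual difference from the U-Net.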
The comparison of each statistical metric derived from the RFR and U-Net predictions is
quantified by expressing metric value differences in percentages. The differences between
prediction performances are further quantified by applying an independent two-sample t-test between the
statistical metrics of the models, as well as between different cloud cover percentages.
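As an illustration of this comparison, the sketch below uses the rounded R2 values reported in Table 5 for study area 1 at 10% cloud cover (n=7 response labels). Two points are assumptions: the percent difference is taken relative to the RFR value, and the t-test uses the SciPy default of equal variances. Because the inputs are rounded, the resulting p-value need not match Table 3 exactly.

```python
import numpy as np
from scipy import stats

# Rounded R2 values from Table 5, study area 1, 10% cloud cover,
# ordered blue, green, red, nir, swir1, swir2, MSAVI
rfr = np.array([0.74, 0.67, 0.79, 0.86, 0.77, 0.81, 0.85])
unet = np.array([0.78, 0.70, 0.84, 0.85, 0.80, 0.88, 0.88])

# Percent difference of the U-Net relative to the RFR (assumed definition)
percent_diff = (unet - rfr) / rfr * 100.0

# Independent two-sample t-test (Student's t, equal variances assumed)
t_stat, p_value = stats.ttest_ind(unet, rfr)
```

With these inputs the p-value is well above 0.05, in line with the non-significant R2 difference reported for the 10% group of study area 1.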
3 Results
3.1 U-Net predictions
The statistical metrics as seen in Table 5 display consistent results for the U-Net predictions in both
study areas. These results apply for most response labels and cloud cover percentages. The R2 is usually
above 0.70, suggesting a correlation between the reference and prediction data. The MSE is mostly below
0.02 in a data range from 0.00 to 1.00. Study area 2 has similar MSE values for all response labels, while
study area 1 shows a lower performance at the response labels in the visible range of the electromagnetic
spectrum. This pattern is seen in the RPD values as well. The RPD stays below the desired value of
3.00 for every response label and averages around 2.20. The last metric, the SSIM, is stable and hovers around
0.70, indicating similarities in image structures even when other metrics show variation.
It is noticeable that response labels with long wavelengths, such as nir, swir1 and swir2, outperform
the other labels. The nir label has an especially good performance in study area 1 and swir1 performs
relatively well in study area 2. The MSAVI vegetation index has a consistent and decent performance in
both study areas.
A small difference, usually between 1% and 5%, is observed in most statistical metrics when
comparing cloud coverages. Counterintuitively, decreasing cloud cover percentages do not always follow
the expected pattern of increasing performance. For example, in study area 1, the blue label always
performs better with a 30% cloud cover than with a 20% cloud cover. This effect differs per
response label and study area.
3.2 Comparing prediction models
Overall, the U-Net outperforms the RFR. The statistical metric values between the predictions of
both models differ by around 5% to 50% for study area 1. Even higher differences are observed in study
area 2, ranging from around 10% to above 100%. Performance differences between increasing cloud cover
percentages are small for the U-Net predictions, whereas larger differences between increasing cloud cover
percentages are observed for the RFR predictions. Table 3 displays a significant difference (p-value < 0.05,
n=21) between the statistical metrics applied to the RFR and U-Net predictions when comparing all labels
and cloud coverages. The differences mostly favor the U-Net. An exception to the significant
difference concerns the nir label when grouping the statistical metrics by individual response labels
in study area 1. The same applies to the swir1 and MSAVI response labels in study area 1 when excluding the
SSIM metric, suggesting similar average pixel values whilst the image structure differs. Moreover, a
significant difference (p-value < 0.05, n=7) can be observed in the SSIM results when grouping by cloud
coverage between the RFR and U-Net predictions. This can be explained by an increase in noise seen in
the RFR predictions with higher cloud coverages (see Figure 13), as compared to the more consistent
predictions of the U-Net. Study area 2 shows significant differences in most groups, excluding the nir,
MSAVI, and 10% groups for the SSIM results.
The differences between the models are further visualized for certain response labels (see Figure 9
and Figure 11). The selected response labels of the U-Net prediction show a narrower fit around the
identity line, with fewer outliers compared to the RFR predictions. The narrow fit is especially noticeable
in the histograms, where the edges of the marginal distribution are more aligned. The histograms of the
U-Net also show fewer spikes in study area 2.
Lastly, visual inspection of Figure 13 and Figure 14 makes the differences between the RFR and U-Net
predictions clearer. RFR predictions are noisier, display fewer features (especially visible in the urban
region found in the west of study area 2), and have more outliers (mainly visible in the supposedly smooth
agricultural fields of study area 1). For both the RFR and U-Net, the prediction regions are fuzzier
compared to the reference data. This is because the predictions are based on the naturally fuzzy radar data.
Table 3 The p-values quantify the difference in performance between the RFR and U-Net predictions based on
(grouped) statistical metrics. The p-values are derived from results found in Table 5.
Type             Grouped  p-values | Study area 1           p-values | Study area 2           Sample
                          R2      MSE     RPD     SSIM    R2      MSE     RPD     SSIM    size (n)
Response labels  blue     0.07    0.07    0.06    < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  3
Response labels  green    0.08    0.08    0.07    < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  3
Response labels  red      < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  3
Response labels  nir      0.56    0.56    0.60    0.17    < 0.05  < 0.05  < 0.05  0.22    3
Response labels  swir1    0.16    0.16    0.17    < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  3
Response labels  swir2    < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  3
Response labels  MSAVI    0.12    0.12    0.13    < 0.05  < 0.05  < 0.05  < 0.05  0.41    3
Cloud cover      10%      0.31    0.23    0.25    < 0.05  < 0.05  < 0.05  < 0.05  0.10    7
Cloud cover      20%      0.20    0.13    0.17    < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  7
Cloud cover      30%      0.07    0.06    0.06    < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  7
Cloud cover      all      < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  < 0.05  21
3.3 Comparing cloud cover predictions
The differences between cloud coverage performances are not significant (see Table 4) considering
the statistical metrics found in Table 5. This is noticeable in the graphs visualizing certain response labels
of the U-Net predictions (see Figure 10 and Figure 12) and can be observed in the geographic
representations (see Figure 13 and Figure 14). As mentioned above, the U-Net displays an unexpected
pattern where lower cloud cover percentages do not necessarily produce better prediction results. For
study area 1, the 10% cloud cover predictions always outperform the higher coverages, while in study area
2, 30% cloud cover predictions sometimes outperform the lower cloud coverages (e.g. the red response
label). This contrasts with the RFR predictions, which follow the expected pattern of decreasing performance:
smaller cloud cover percentages always outperform the larger ones. The U-Net shows consistent and high
p-values between cloud cover variations, and the p-values are larger compared to the RFR predictions.
Table 4 The p-values between statistical metrics as shown in Table 5. These compare the RFR and U-Net prediction
performance between cloud cover percentages over all response labels.
Model  Grouped  p-values | Study area 1   p-values | Study area 2   Sample
                R2    MSE   RPD   SSIM    R2    MSE   RPD   SSIM    size (n)
U-Net  10%-20%  0.42  0.34  0.37  0.51    0.78  0.86  0.85  0.79    7
U-Net  20%-30%  0.96  0.86  0.84  0.97    0.40  0.69  0.40  0.91    7
U-Net  10%-30%  0.35  0.26  0.24  0.51    0.69  0.86  0.61  0.70    7
RFR    10%-20%  0.24  0.23  0.23  0.50    0.36  0.52  0.43  0.46    7
RFR    20%-30%  0.60  0.58  0.58  0.79    0.81  0.86  0.84  0.94    7
RFR    10%-30%  0.11  0.09  0.09  0.35    0.26  0.42  0.33  0.41    7
3.4 Test results & visualizations
Table 5 The statistical metrics of S2 optical data predictions within artificially cloud masked regions. Random
noise control metrics are included. Appendix 6.1 extends the table by displaying percent differences between the
metrics. Figure 7 and Figure 8 in this paragraph visualize the table. Figure 9 to Figure 12 visualize the regression
and histogram graphs of some metrics. Figure 13 and Figure 14 geographically display the prediction results.
Study  Response  Cloud  R2                      MSE                     RPD                     SSIM
area   label     cover  Noise   RFR     U-Net   Noise   RFR     U-Net   Noise   RFR     U-Net   Noise   RFR     U-Net
1      blue      10%    -0.45   0.74    0.78    0.1297  0.0231  0.0195  0.83    1.96    2.14    0.09    0.66    0.71
1      blue      20%            0.69    0.74            0.0275  0.0229          1.80    1.97            0.64    0.69
1      blue      30%            0.67    0.76            0.0296  0.0217          1.74    2.03            0.64    0.71
1      green     10%    -0.51   0.67    0.70    0.1085  0.0236  0.0213  0.81    1.74    1.83    0.10    0.68    0.72
1      green     20%            0.58    0.68            0.0298  0.0233          1.55    1.76            0.65    0.71
1      green     30%            0.54    0.69            0.0329  0.0223          1.48    1.79            0.64    0.71
1      red       10%    -0.33   0.79    0.84    0.1494  0.0239  0.0179  0.87    2.17    2.51    0.08    0.65    0.73
1      red       20%            0.74    0.83            0.0297  0.0192          1.95    2.42            0.63    0.73
1      red       30%            0.71    0.80            0.0329  0.0227          1.85    2.23            0.63    0.71
1      nir       10%    -0.92   0.86    0.85    0.1239  0.0093  0.0094  0.72    2.63    2.62    0.09    0.78    0.80
1      nir       20%            0.82    0.82            0.0113  0.0113          2.38    2.39            0.76    0.78
1      nir       30%            0.81    0.84            0.0123  0.0102          2.29    2.51            0.76    0.78
1      swir1     10%    -0.50   0.77    0.80    0.1086  0.0168  0.0142  0.82    2.07    2.26    0.10    0.73    0.78
1      swir1     20%            0.72    0.75            0.0203  0.0180          1.89    2.00            0.72    0.77
1      swir1     30%            0.69    0.76            0.0224  0.0171          1.80    2.06            0.72    0.78
1      swir2     10%    -0.39   0.81    0.88    0.1293  0.0174  0.0115  0.85    2.31    2.85    0.09    0.72    0.80
1      swir2     20%            0.77    0.85            0.0213  0.0142          2.09    2.56            0.70    0.79
1      swir2     30%            0.76    0.83            0.0227  0.0154          2.02    2.46            0.70    0.78
1      MSAVI     10%    -0.50   0.85    0.88    0.1551  0.0154  0.0122  0.82    2.59    2.91    0.08    0.74    0.79
1      MSAVI     20%            0.82    0.86            0.0188  0.0142          2.35    2.70            0.72    0.77
1      MSAVI     30%            0.80    0.84            0.0211  0.0167          2.21    2.49            0.71    0.76
2      blue      10%    -1.42   0.48    0.79    0.1763  0.0381  0.0155  0.64    1.38    2.17    0.05    0.54    0.66
2      blue      20%            0.39    0.78            0.0444  0.0160          1.28    2.13            0.50    0.68
2      blue      30%            0.37    0.77            0.0459  0.0165          1.26    2.10            0.51    0.67
2      green     10%    -1.14   0.45    0.78    0.1509  0.0386  0.0158  0.68    1.35    2.11    0.07    0.55    0.67
2      green     20%            0.37    0.79            0.0447  0.0150          1.26    2.17            0.52    0.67
2      green     30%            0.34    0.77            0.0464  0.0161          1.23    2.10            0.51    0.67
2      red       10%    -1.09   0.47    0.73    0.1713  0.0437  0.0222  0.69    1.37    1.92    0.06    0.50    0.58
2      red       20%            0.39    0.74            0.0498  0.0211          1.28    1.97            0.47    0.61
2      red       30%            0.37    0.75            0.0518  0.0206          1.26    2.00            0.47    0.65
2      nir       10%    -0.96   0.73    0.77    0.1051  0.0143  0.0122  0.72    1.94    2.10    0.11    0.67    0.67
2      nir       20%            0.70    0.76            0.0160  0.0129          1.83    2.04            0.63    0.66
2      nir       30%            0.70    0.75            0.0163  0.0134          1.82    2.00            0.63    0.67
2      swir1     10%    -0.64   0.62    0.81    0.0907  0.0210  0.0107  0.78    1.63    2.27    0.11    0.67    0.76
2      swir1     20%            0.57    0.80            0.0239  0.0112          1.52    2.22            0.64    0.76
2      swir1     30%            0.55    0.79            0.0247  0.0117          1.50    2.17            0.64    0.76
2      swir2     10%    -1.11   0.52    0.72    0.1243  0.0285  0.0167  0.69    1.44    1.88    0.08    0.65    0.72
2      swir2     20%            0.44    0.75            0.0328  0.0148          1.34    1.99            0.62    0.72
2      swir2     30%            0.42    0.72            0.0341  0.0166          1.31    1.88            0.62    0.72
2      MSAVI     10%    -0.80   0.69    0.79    0.1154  0.0198  0.0135  0.75    1.80    2.18    0.10    0.69    0.68
2      MSAVI     20%            0.65    0.79            0.0227  0.0133          1.68    2.20            0.66    0.68
2      MSAVI     30%            0.63    0.78            0.0238  0.0140          1.64    2.14            0.65    0.67
Figure 7 Visualizing statistical metrics of study area 1.
Figure 8 Visualizing statistical metrics of study area 2.
Figure 9 The regressions (top) and histograms (bottom) visualize the relationship between predictions and cloud
free reference data of the swir1 response label in study area 1 with a cloud coverage of 20%. Left to right: random
noise; RFR predictions; U-Net predictions.
Figure 10 The regressions (top) and histograms (bottom) visualize the relationship between the U-Net predictions
and cloud free reference data of the red response label in study area 1 with increasing cloud coverages. Left to
right: 10%; 20%; 30%.
Figure 11 The regressions (top) and histograms (bottom) visualize the relationship between predictions and cloud
free reference data of the green response label in study area 2 with a cloud coverage of 20%. Left to right: random
noise; RFR predictions; U-Net predictions.
Figure 12 The regressions (top) and histograms (bottom) visualize the relationship between the U-Net predictions
and cloud free reference data of the MSAVI response label in study area 2 with increasing cloud coverages. Left to
right: 10%; 20%; 30%.
Figure 13 Images visualize a subsection in study area 1 of the response labels red, green, and blue. Top to bottom:
cloud coverages: 10%; 20%; 30%. Left to right: Cloud free reference image; artificially cloud masked image; RFR
prediction composite; U-Net prediction composite.
Figure 14 Images visualize the complete study area 2 of the response labels nir, swir1, and swir2. Top to bottom:
cloud coverages: 10%; 20%; 30%. Left to right: Cloud free reference image; artificially cloud masked image; RFR
prediction composite; U-Net prediction composite.
4 Discussion
4.1 Interpreting the U-Net results
As of now, the experiments indicate similar results between the two study areas with different
features. However, there are some differences. Study area 2 has more consistent results compared to study
area 1, whereas the latter has a higher performance. The former could be explained by the lack of variation
in historic training data in study area 2 (as seen in the bottom-middle image of Figure 4). Too much
variation in historic training data could potentially cause completely covered response labels to be
misclassified in the prediction phase, because the prediction model was not fed with the
corresponding set of training features. The higher performance in study area 1 may be explained by the
relative smoothness of agricultural fields and their recurring patterns. Even though the study area is
heterogeneous, many fields are similar to each other. The major outliers are likely the cloud covered fields
with a unique set of historic training feature information. The U-Net outperforms the RFR in both study
areas. This is likely because the RFR only predicts on an individual pixel level, ignoring the spatial context.
This is visible in the noisier RFR predictions that are caused by the spike noise in the radar data. Reducing
this initial spike noise in the radar data while maintaining feature fidelity could improve prediction results.
Alternative DL algorithms could step in as denoisers to replace the currently used Lee filter, where GANs
are good candidates (Chen et al., 2018, 2020).
Although the prediction models are extensively tested, a lack of diverse and numerous study areas
still poses a major limitation to the rigor of the results. The results should therefore be seen as
experiments that provide an empirical approximation of the performance of the proposed U-Net. More
study areas of different sizes should be tested with variations in artificial cloud cover distributions and
percentages.
Response labels
A pattern is observed where some response labels consistently outperform other labels, mainly seen
in study area 1. The nir response label performs well compared to labels (i.e. bands) in the visible
part of the electromagnetic spectrum. This may be explained by its relatively high reflectance on cellular
structures (e.g. from foliage), possibly creating easier to distinguish classification features, as the structural
reflectance better represents physical surface properties. A similar principle contributes to the
decent performance of the MSAVI label, which additionally takes ratios between multiple bands to
eliminate non-vegetation elements, providing less variation in training features. The better performance
seen in the swir1/2 labels could be explained by effects similar to those of the nir label, as well as by the larger
resolution cell sizes of these response labels, which avoid geographic mismatches. The lower resolution makes the
swir1/2 response labels more similar to the fuzzier S1 data. This also means that larger resample sizes of
S2 data could lead to improved prediction performances, at a cost of spatial accuracy. Lastly, the R2, MSE
and RPD values often show more variation than the SSIM. For example, the MSE values of nir
outperform the swir2 prediction results in study area 1, while the SSIM is almost identical. This could mean
that the U-Net retains decent prediction image structures, even when pixel values differ as a whole. The
same could be said for the small performance differences between cloud cover percentages, especially seen
in study area 2.
Cloud cover iterations
Decreasing cloud cover percentages do not necessarily lead to improved prediction performances
regarding the U-Net. Several factors could be at play, including the neighborhood random sampling
method of the U-Net input data preparation, the variation in the study area (as this effect differs per
response label for each study area), the cloud cover distribution, or other undetected reasons. For example,
the blue response label in study area 1 always shows a decreased performance in the 20% cloud cover
predictions compared to 10% and 30%. This may be due to training features that are partially included in
the 20% cloud cover data, whilst they are completely excluded in the 30% cloud cover data and completely
included in the 10% cloud cover data. Certain outlier features could disturb prediction performance by
incorrectly predicting the response labels in the 20% cloud cover data. This effect may be apparent in
study area 1 as seen in Figure 13 (top right to bottom right images), where in the 20% cloud cover data some
agricultural fields are cut off by a cloud gap transition, whilst in the 30% cloud cover data this transition
happens to fall in a more convenient location. This also applies to study area 2 as seen in Figure 14, where
occasionally improved performances (e.g. red response label) are observed with higher cloud cover
percentages. In this case it could be explained by the less detailed prediction outputs at higher cloud
coverages, potentially reducing sharp outliers.
4.2 Literature comparison
The statistical metrics used to indicate performance of the model are used in other research as well
and provide insight into its prediction accuracy. Comparing these predictions to other research is
challenging, mainly because of differences in study areas, artificial cloud cover
variations and techniques, resample sizes, preprocessing steps, input data sources, and selected statistical
metrics. For example, Cresson et al. (2019) use a single statistical metric, the PSNR, to
determine prediction performance, which does not analyse their model performance extensively.
Furthermore, their preprocessing and testing methods differ, for example by using cloud cover consisting of
relatively small and similarly sized rectangles over larger study areas. Their method also differs by the
inclusion of posterior training data, instead of purely utilizing historic or contemporary data. This renders
their predictions less timely compared to the approach of this thesis. Posterior data points could be
added to the multitemporal training features suggested in this thesis to improve results, at a cost of
timeliness. Recent research by Gao et al. (2020) uses the R2 and SSIM metrics seen in this thesis. However,
they do not report prediction results for S1/2 data, rendering a direct comparison impossible. For the
advanced radar data they utilized, a decent performance is observed with R2 and SSIM values above 0.90,
potentially owing to their addition of a cGAN for finetuning U-Net prediction results as well as the selection of
relatively homogeneous study areas. This extra step of adding a cGAN could be added to the method of
this thesis to further improve prediction performances. Lastly, earlier research poses problems where
methods deviate significantly, for example Wang and Petal (2018), who did not use a U-Net for
prediction purposes and did not generate artificial cloud cover for testing. They did use a statistical
metric also seen in this thesis, namely the SSIM, which for their study areas resulted in a mean
value of 0.853. The bottom line is that direct comparisons with previous research are difficult. Therefore,
further research should align the proposed models, study areas, and testing strategies of this thesis and the
above-mentioned research to gain a comprehensive overview of their differences in performance.
4.3 Methodological approach
Study areas and preprocessing
The S1/2 data in the study areas need to be well preprocessed in order to get optimal results from
the U-Net. One of these preprocessing steps is input normalization. Prediction results without input
normalization had reduced performances and longer training durations. It must be noted that when
normalizing data, additional input data should be normalized using a similar range in order to work
properly. Another preprocessing step concerns multitemporal time interval selection of the S1 radar data.
Currently, these intervals last one week up to a month and are selected relatively arbitrarily under two
criteria: no posterior training data is used and total coverage of S1/2 data is assured. Additional time
intervals of various lengths could be added that may improve prediction performance. Alternatively,
individual images could be used as training features instead of the currently used median aggregations
over the mentioned time intervals. This results in temporally discrete training features sampled from
individual S1/2 images within a study area. These individual images provide more training features.
However, each training image is subject to more spike noise, topographic effects, and atmospheric
interference because extremes are not filtered out. It also often results in incomplete coverage of the study
area. These issues with individual images stem both from the nature of the datasets and from the
selected study area.
The quality of the training images varies highly within each study area, contributing to a set of
uncertainties. In practice, these uncertainties are never completely eliminated, even with longer time
intervals. When studying topographically complex terrains or clouded areas, prediction performances may
be severely hampered. For example, relating to topographic effects, current methods applying
orthorectification on S1 data and terrain shadow removal of S2 data are often insufficient. Masking clouds
in S2 data is not perfect either. Even with tools developed to deal with this problem, a perfect and
automated elimination of atmospheric interference still has limitations (Coluzzi et al., 2018; Zhu and
Helmer, 2018). The method suggested in this thesis could be expanded to study areas that incorporate
more uncertainties to further test the performance, such as mountainous or hazy areas (e.g. long lasting
forest fires). Even though the current study areas are empirically examined for uncertainties, a complete
elimination of all possible interferences and effects is unrealistic. This could possibly be solved by
artificially generating clean S1/2 data to train and test prediction models.
Artificial cloud improvements
Artificial cloud cover generation could be improved as well. Currently, a simple gradual noise is
used that roughly emulates one type of cloud cover. This could be extended by adding further layers of
cloud cover with variations in opacity, distribution, shape, etc. Cloud shadows should also be added
for a more realistic approach; these cloud shadows, together with parts of the artificial
clouds, could potentially be left in the training data to simulate real-world imperfections in cloud masking.
Another property arising from the cloud cover generation is the scatteredness of the clouds, which affects the
continuity within the training patches. Increased cloud scatteredness means less continuity in the training patches
fed into the U-Net. This is also the reason why generating artificial clouds was preferred over simply selecting
rectangular cloud-free training areas and cloud-covered prediction areas. By generating somewhat realistic
clouds within one study area, a real-world situation is created in which irregular parts of the study area
are covered by clouds and need to be reconstructed.
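The layering idea can be illustrated with a minimal sketch. It is not the generator used in this thesis: the function names, the two noise scales, the layer weights and the fixed shadow offset are all assumptions chosen for illustration. Smooth noise layers of different spatial scale are summed, thresholded at the quantile that yields the requested cloud cover, and a shifted copy of the cloud mask serves as a crude cloud shadow.

```python
import numpy as np


def smooth_noise(shape, sigma, rng):
    """White noise blurred with a Gaussian low-pass filter, rescaled to [0, 1]."""
    noise = rng.standard_normal(shape)
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    # Fourier transform of a spatial Gaussian with a standard deviation of `sigma` pixels.
    # Filtering via the FFT is circular, so the noise wraps around the image edges.
    lowpass = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    field = np.fft.ifft2(np.fft.fft2(noise) * lowpass).real
    field -= field.min()
    return field / field.max()


def cloud_opacity(shape, cover, sigmas=(24, 6), weights=(0.7, 0.3), seed=0):
    """Opacity map in [0, 1]; roughly a fraction `cover` of the pixels is non-zero.

    `sigmas` and `weights` are hypothetical settings: one coarse layer for the
    overall cloud distribution and one fine layer for irregular cloud edges.
    """
    rng = np.random.default_rng(seed)
    field = sum(w * smooth_noise(shape, s, rng) for w, s in zip(weights, sigmas))
    threshold = np.quantile(field, 1.0 - cover)
    # 0 outside clouds, rising towards 1 in the densest cloud cores
    return np.clip((field - threshold) / (field.max() - threshold), 0.0, 1.0)


def shadow_mask(clouds, offset=(5, 5)):
    """Crude shadow: the cloud mask shifted by `offset` pixels, minus the clouds.

    np.roll wraps around the image border, which is acceptable for a sketch only.
    """
    shifted = np.roll(clouds, shift=offset, axis=(0, 1))
    return shifted & ~clouds


opacity = cloud_opacity((256, 256), cover=0.2, seed=42)
clouds = opacity > 0
shadows = shadow_mask(clouds)
```

Changing the ratio between the coarse and the fine layer is one way to control the scatteredness discussed above: a larger weight on the fine layer breaks the cover into more and smaller patches.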
Prediction models and testing
For this thesis, a mildly modified pre-existing U-Net architecture was used. The main contributions
were hyperparameter optimization and custom input data preprocessing. The performance of the U-Net
may be improved by tuning its hyperparameters using Bayesian optimization and structured prediction
(Zhang et al., 2015). Further improvements can be made by adjusting its architecture and by basing the
training duration on empirically optimal loss values, instead of the fixed 30 epochs, to avoid over- or
underfitting. The U-Net model should also be compared with other models available in Python libraries (e.g.
scikit-learn and scikit-image), which can be applied without expert knowledge of the model, focusing mainly
on the application and its results. This is similar to applying the RFR model, which was developed by experts in
the field and is used by researchers in different domains. Nevertheless, prediction results of these
models should be scientifically assessed and tested in order to prove their validity. Having a solid test
scheme for prediction models is important to provide insight into their accuracy. The artificial cloud cover
provided clean reference data for prediction testing at pixel level within masked regions. This gives the
approach of this thesis an edge over prediction problems where reference data (i.e. verification or
ground-truth data) is hard to acquire, inaccurate, or incomplete. It also gives the freedom to expand to
other study areas, improving and testing the prediction models even further.
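A loss-based training duration can be sketched as a simple early-stopping rule. The function below is a hypothetical illustration, not code from this thesis: it assumes a list of per-epoch validation losses is available and stops once the loss has not improved for `patience` consecutive epochs. The names `patience` and `min_delta` are illustrative choices.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both counted from 1.

    Training stops at the first epoch where the validation loss has failed to
    improve on the best value by more than `min_delta` for `patience` epochs
    in a row. If that never happens, the last epoch is returned as stop_epoch.
    """
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch


# Illustrative loss curve (made-up numbers): improvement stalls after epoch 3.
stop, best = early_stopping_epoch([0.90, 0.70, 0.60, 0.61, 0.62, 0.63], patience=3)
print(stop, best)  # 6 3
```

In a Keras-based workflow the same behaviour is available through the `EarlyStopping` callback, so a hand-written loop like this is mainly useful for inspecting recorded loss curves afterwards.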
A final note on the test results is that all statistical metrics are based on flattened prediction
images, as described in the methodology. This dimension reduction was necessary to use the prediction
data in the SSIM tool. Not flattening the images could result in different metric values,
potentially higher than the current ones. Future research should build a custom SSIM tool that
can exclude non-numerical values (e.g. the unmasked training regions) in a 2D image.
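A possible starting point for such a tool is sketched below. It is a simplified variant and not the SSIM tool used in this thesis: it evaluates the SSIM expression of Wang et al. (2004) on non-overlapping square blocks instead of a sliding Gaussian window, and it assumes that excluded pixels are stored as NaN. The parameters `window`, `min_valid` and `data_range` are illustrative assumptions.

```python
import numpy as np


def masked_block_ssim(ref, pred, window=8, min_valid=16, data_range=1.0):
    """Mean SSIM over non-overlapping blocks of a 2D image, ignoring NaN pixels.

    Blocks with fewer than `min_valid` valid pixels are skipped entirely.
    Returns NaN when no block qualifies.
    """
    c1 = (0.01 * data_range) ** 2  # stabilising constants from Wang et al. (2004)
    c2 = (0.03 * data_range) ** 2
    valid = ~(np.isnan(ref) | np.isnan(pred))
    scores = []
    for r in range(0, ref.shape[0] - window + 1, window):
        for c in range(0, ref.shape[1] - window + 1, window):
            m = valid[r:r + window, c:c + window]
            if m.sum() < min_valid:
                continue
            x = ref[r:r + window, c:c + window][m]
            y = pred[r:r + window, c:c + window][m]
            mx, my = x.mean(), y.mean()
            vx, vy = x.var(), y.var()
            cov = ((x - mx) * (y - my)).mean()
            scores.append(
                ((2 * mx * my + c1) * (2 * cov + c2))
                / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
            )
    return float(np.mean(scores)) if scores else float("nan")


# Synthetic example: the top rows play the role of the excluded training region.
rng = np.random.default_rng(0)
reference = rng.random((64, 64))
reference[:16, :] = np.nan
prediction = reference + rng.normal(0.0, 0.05, reference.shape)
score = masked_block_ssim(reference, prediction)
```

Because the spatial arrangement within each block is preserved, this keeps the local structure that is lost when the masked pixels are flattened into a single vector. Values obtained this way are not directly comparable with those reported in this thesis.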
5 Conclusion
The prediction results of the U-Net model show promising performance for both study areas.
The performance is consistent across all response labels and cloud cover percentages. It significantly
outperforms the RFR model for most response labels. This suggests that advanced DL is preferred
over basic ML when prediction accuracy is important. A similar conclusion was reached by Gao et al.
(2020). This is further supported by the fact that most convolutional DL models, such as the U-Net, are
able to process large amounts of imagery data, which allows scaling to larger and more numerous study areas.
The discussed U-Net uses only historic and contemporary multitemporal training data, which makes the
approach, together with it being entirely cloud-computing based, appropriate for applications where
timely and dynamic predictions are required. Predicting optical data from radar data with the help of DL
is still relatively new, and this thesis hopes to contribute to maturing this research field.
6 Appendix
6.1 Extended tables
Table 6 Study area 1 statistical metrics of Table 5 extended with percent differences. Horizontal arrows with a
percentage in the middle indicate the difference between the RFR and U-Net performance for the respective metric.
Vertical arrows with a percentage on each side indicate the difference in performance between consecutive cloud
cover iterations.
Label   Cloud  R2                     MSE                         RPD                    SSIM
        cover  RFR → ∆% → U-Net       RFR → ∆% → U-Net            RFR → ∆% → U-Net       RFR → ∆% → U-Net
blue    Noise  -0.45                  0.1297                      0.83                   0.09
        10%    0.74 → +5.1% → 0.78    0.0231 → -18.3% → 0.0195    1.96 → +8% → 2.14      0.66 → +7.1% → 0.71
        ↓ ∆%   -7.1% ↓ -5.1%          +16% ↓ +14.8%               -9.1% ↓ -8.3%          -3.4% ↓ -3.4%
        20%    0.69 → +6.9% → 0.74    0.0275 → -20% → 0.0229      1.8 → +8.7% → 1.97     0.64 → +7.1% → 0.69
        ↓ ∆%   -3.5% ↓ +1.9%          +7.1% ↓ -5.8%               -3.8% ↓ +2.8%          -0.8% ↓ +3.1%
        30%    0.67 → +11.8% → 0.76   0.0296 → -36.6% → 0.0217    1.74 → +14.5% → 2.03   0.64 → +10.8% → 0.71
green   Noise  -0.51                  0.1085                      0.81                   0.10
        10%    0.67 → +4.6% → 0.7     0.0236 → -10.8% → 0.0213    1.74 → +5% → 1.83      0.68 → +5.2% → 0.72
        ↓ ∆%   -14.9% ↓ -4.1%         +20.9% ↓ +8.5%              -12.4% ↓ -4.5%         -4% ↓ -0.6%
        20%    0.58 → +13.5% → 0.68   0.0298 → -28.1% → 0.0233    1.55 → +11.7% → 1.76   0.65 → +8.2% → 0.71
        ↓ ∆%   -8% ↓ +1.9%            +9.5% ↓ -4.3%               -5.1% ↓ +2.1%          -1.4% ↓ +0%
        30%    0.54 → +21.5% → 0.69   0.0329 → -47.6% → 0.0223    1.48 → +17.7% → 1.79   0.64 → +9.5% → 0.71
red     Noise  -0.33                  0.1494                      0.87                   0.08
        10%    0.79 → +6.3% → 0.84    0.0239 → -33.3% → 0.0179    2.17 → +13.4% → 2.51   0.65 → +11.2% → 0.73
        ↓ ∆%   -7% ↓ -1.3%            +19.5% ↓ +6.4%              -11.4% ↓ -3.4%         -2.6% ↓ -0.7%
        20%    0.74 → +11.3% → 0.83   0.0297 → -55% → 0.0192      1.95 → +19.7% → 2.42   0.63 → +12.9% → 0.73
        ↓ ∆%   -4% ↓ -3.9%            +9.8% ↓ +15.6%              -5.3% ↓ -8.9%          -1.2% ↓ -2.1%
        30%    0.71 → +11.3% → 0.8    0.0329 → -44.9% → 0.0227    1.85 → +16.9% → 2.23   0.63 → +12.2% → 0.71
nir     Noise  -0.92                  0.1239                      0.72                   0.09
        10%    0.86 → -0.2% → 0.85    0.0093 → +1.1% → 0.0094     2.63 → -0.5% → 2.62    0.78 → +1.7% → 0.8
        ↓ ∆%   -3.8% ↓ -3.6%          +17.9% ↓ +16.8%             -10.4% ↓ -9.6%         -2.9% ↓ -2.4%
        20%    0.82 → +0.1% → 0.82    0.0113 → -0.3% → 0.0113     2.38 → +0.1% → 2.39    0.76 → +2.2% → 0.78
        ↓ ∆%   -1.8% ↓ +2%            +7.7% ↓ -10.3%              -4.1% ↓ +4.8%          -0.4% ↓ 0%
        30%    0.81 → +3.8% → 0.84    0.0123 → -19.9% → 0.0102    2.29 → +8.7% → 2.51    0.76 → +2.6% → 0.78
swir 1  Noise  -0.50                  0.1086                      0.82                   0.10
        10%    0.77 → +4.6% → 0.8     0.0168 → -18.7% → 0.0142    2.07 → +8.2% → 2.26    0.73 → +7.1% → 0.78
        ↓ ∆%   -6.6% ↓ -7.1%          +16.9% ↓ +21.4%             -9.7% ↓ -12.8%         -0.6% ↓ -1.2%
        20%    0.72 → +4.1% → 0.75    0.0203 → -12.3% → 0.018     1.89 → +5.7% → 2       0.72 → +6.6% → 0.77
        ↓ ∆%   -4.2% ↓ +1.8%          +9.4% ↓ -5.7%               -5.1% ↓ +2.8%          -0.9% ↓ +0.3%
        30%    0.69 → +9.6% → 0.76    0.0224 → -31.1% → 0.0171    1.8 → +12.7% → 2.06    0.72 → +7.7% → 0.78
swir 2  Noise  -0.39                  0.1293                      0.85                   0.09
        10%    0.81 → +7.3% → 0.88    0.0174 → -51.9% → 0.0115    2.31 → +18.9% → 2.85   0.72 → +10.6% → 0.8
        ↓ ∆%   -5.4% ↓ -3.5%          +18.3% ↓ +19.2%             -10.6% ↓ -11.3%        -1.9% ↓ -2%
        20%    0.77 → +9% → 0.85      0.0213 → -50.2% → 0.0142    2.09 → +18.4% → 2.56   0.7 → +10.6% → 0.79
        ↓ ∆%   -2.1% ↓ -1.6%          +6.4% ↓ +8%                 -3.4% ↓ -4.3%          -0.7% ↓ -0.2%
        30%    0.76 → +9.5% → 0.83    0.0227 → -47.6% → 0.0154    2.02 → +17.7% → 2.46   0.7 → +11% → 0.78
MSAVI   Noise  -0.50                  0.1551                      0.82                   0.08
        10%    0.85 → +3.5% → 0.88    0.0154 → -26.3% → 0.0122    2.59 → +11% → 2.91     0.74 → +5.6% → 0.79
        ↓ ∆%   -4% ↓ -2.2%            +17.8% ↓ +13.7%             -10.3% ↓ -7.6%         -2.9% ↓ -2.9%
        20%    0.82 → +5.2% → 0.86    0.0188 → -32.7% → 0.0142    2.35 → +13.2% → 2.7    0.72 → +5.6% → 0.77
        ↓ ∆%   -2.8% ↓ -3%            +11.1% ↓ +15.3%             -6.1% ↓ -8.7%          -1.9% ↓ -0.3%
        30%    0.8 → +5.1% → 0.84     0.0211 → -26.3% → 0.0167    2.21 → +11% → 2.49     0.71 → +7% → 0.76
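The percentages in Table 6 can be reproduced approximately from the printed values. The convention below is inferred from the table entries and is an assumption, since the caption does not state it: the difference appears to be expressed relative to the second value (the U-Net value for horizontal arrows, the higher cloud cover for vertical arrows). Small deviations from the printed percentages are expected because the table shows rounded metric values.

```python
def percent_difference(first, second):
    """Signed change from `first` to `second`, as a percentage of `second`.

    Assumed convention, inferred from the printed table values.
    """
    return (second - first) / second * 100.0


# Blue label, 10% cloud cover, R2: RFR 0.74 and U-Net 0.78 (values from Table 6)
print(round(percent_difference(0.74, 0.78), 1))  # 5.1
```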
Table 7 Study area 2 statistical metrics of Table 5 extended with percent differences. Horizontal arrows with a
percentage in the middle indicate the difference between the RFR and U-Net performance for the respective metric.
Vertical arrows with a percentage on each side indicate the difference in performance between consecutive cloud
cover iterations.
Label   Cloud  R2                     MSE                         RPD                    SSIM
        cover  RFR → ∆% → U-Net       RFR → ∆% → U-Net            RFR → ∆% → U-Net       RFR → ∆% → U-Net
blue    Noise  -1.42                  0.1763                      0.64                   0.05
        10%    0.48 → +39.3% → 0.79   0.0381 → -145.5% → 0.0155   1.38 → +36.2% → 2.17   0.54 → +19.1% → 0.66
        ↓ ∆%   -22.1% ↓ -0.9%         +14.2% ↓ +3.2%              -8% ↓ -1.6%            -6.1% ↓ +2.3%
        20%    0.39 → +49.9% → 0.78   0.0444 → -177.1% → 0.016    1.28 → +39.9% → 2.13   0.5 → +25.5% → 0.68
        ↓ ∆%   -5.7% ↓ -0.9%          +3.3% ↓ +3.1%               -1.7% ↓ -1.6%          +0.1% ↓ -1.4%
        30%    0.37 → +52.1% → 0.77   0.0459 → -177.9% → 0.0165   1.26 → +40% → 2.1      0.51 → +24.5% → 0.67
green   Noise  -1.14                  0.1509                      0.68                   0.07
        10%    0.45 → +41.6% → 0.78   0.0386 → -144.3% → 0.0158   1.35 → +36% → 2.11     0.55 → +17.7% → 0.67
        ↓ ∆%   -23.4% ↓ +1.4%         +13.6% ↓ -5.1%              -7.6% ↓ +2.5%          -6.4% ↓ +1.1%
        20%    0.37 → +53.3% → 0.79   0.0447 → -197.2% → 0.015    1.26 → +42% → 2.17     0.52 → +23.5% → 0.67
        ↓ ∆%   -7% ↓ -1.9%            +3.6% ↓ +6.5%               -1.9% ↓ -3.4%          -0.5% ↓ -1.1%
        30%    0.34 → +55.5% → 0.77   0.0464 → -188.3% → 0.0161   1.23 → +41.1% → 2.1    0.51 → +23% → 0.67
red     Noise  -1.09                  0.1713                      0.69                   0.06
        10%    0.47 → +35.8% → 0.73   0.0437 → -96.4% → 0.0222    1.37 → +28.6% → 1.92   0.5 → +13% → 0.58
        ↓ ∆%   -18.9% ↓ +1.8%         +12.2% ↓ -5.2%              -6.7% ↓ +2.5%          -6% ↓ +6%
        20%    0.39 → +47% → 0.74     0.0498 → -135.5% → 0.0211   1.28 → +34.8% → 1.97   0.47 → +22.9% → 0.61
        ↓ ∆%   -6.5% ↓ +0.9%          +3.8% ↓ -2.6%               -2% ↓ +1.3%            -0.3% ↓ +6%
        30%    0.37 → +50.7% → 0.75   0.0518 → -151.2% → 0.0206   1.26 → +36.9% → 2      0.47 → +27.8% → 0.65
nir     Noise  -0.96                  0.1051                      0.72                   0.11
        10%    0.73 → +5.1% → 0.77    0.0143 → -17.2% → 0.0122    1.94 → +7.6% → 2.1     0.67 → -0.3% → 0.67
        ↓ ∆%   -4.4% ↓ -1.7%          +10.3% ↓ +5.4%              -5.6% ↓ -2.8%          -6.5% ↓ -1.6%
        20%    0.7 → +7.5% → 0.76     0.016 → -23.6% → 0.0129     1.83 → +10% → 2.04     0.63 → +4.3% → 0.66
        ↓ ∆%   -0.8% ↓ -1.3%          +1.9% ↓ +3.8%               -1% ↓ -1.9%            -0.5% ↓ +0.8%
        30%    0.7 → +7.1% → 0.75     0.0163 → -21.3% → 0.0134    1.82 → +9.2% → 2       0.63 → +5.6% → 0.67
swir 1  Noise  -0.64                  0.0907                      0.78                   0.11
        10%    0.62 → +23% → 0.81     0.021 → -95.8% → 0.0107     1.63 → +28.5% → 2.27   0.67 → +11.9% → 0.76
        ↓ ∆%   -9.1% ↓ -1.1%          +12% ↓ +4.4%                -6.6% ↓ -2.3%          -4.7% ↓ +0%
        20%    0.57 → +28.6% → 0.8    0.0239 → -112.7% → 0.0112   1.52 → +31.4% → 2.22   0.64 → +15.8% → 0.76
        ↓ ∆%   -2.8% ↓ -1.2%          +3.5% ↓ +4.4%               -1.8% ↓ -2.3%          -0.8% ↓ -0.8%
        30%    0.55 → +29.7% → 0.79   0.0247 → -110.9% → 0.0117   1.5 → +31.1% → 2.17    0.64 → +15.8% → 0.76
swir 2  Noise  -1.11                  0.1243                      0.69                   0.08
        10%    0.52 → +28% → 0.72     0.0285 → -70.7% → 0.0167    1.44 → +23.5% → 1.88   0.65 → +9.4% → 0.72
        ↓ ∆%   -16.5% ↓ +4.4%         +13.1% ↓ -13%               -7.3% ↓ +5.9%          -4.4% ↓ +0.6%
        20%    0.44 → +40.9% → 0.75   0.0328 → -121.9% → 0.0148   1.34 → +32.9% → 1.99   0.62 → +13.7% → 0.72
        ↓ ∆%   -5.2% ↓ -4.3%          +3.8% ↓ +11%                -2% ↓ -6%              -0.7% ↓ +0.1%
        30%    0.42 → +41.5% → 0.72   0.0341 → -105.4% → 0.0166   1.31 → +30.2% → 1.88   0.62 → +14.4% → 0.72
MSAVI   Noise  -0.80                  0.1154                      0.75                   0.10
        10%    0.69 → +12.5% → 0.79   0.0198 → -46.9% → 0.0135    1.8 → +17.5% → 2.18    0.69 → -1.7% → 0.68
        ↓ ∆%   -7.1% ↓ +0.3%          +12.9% ↓ -1.2%              -7.2% ↓ +0.6%          -5% ↓ +0.1%
        20%    0.65 → +18.5% → 0.79   0.0227 → -70.8% → 0.0133    1.68 → +23.5% → 2.2    0.66 → +3.2% → 0.68
        ↓ ∆%   -2.7% ↓ -1.4%          +4.6% ↓ +5%                 -2.4% ↓ -2.6%          -1.1% ↓ -0.7%
        30%    0.63 → +19.6% → 0.78   0.0238 → -70.2% → 0.014     1.64 → +23.4% → 2.14   0.65 → +3.6% → 0.67
7 References
Belgiu, M. and Stein, A.: Spatiotemporal Image Fusion in Remote Sensing, Remote Sens., 11(7), 818,
doi:10.3390/rs11070818, 2019.
Bermudez, J. D., Happ, P. N., Oliveira, D. A. B. and Feitosa, R. Q.: SAR to Optical Image Synthesis for
Cloud Removal with Generative Adversarial Networks, ISPRS Ann. Photogramm. Remote Sens. Spat.
Inf. Sci., IV–1, 5–11, doi:10.5194/isprs-annals-IV-1-5-2018, 2018.
Bermudez, J. D., Happ, P. N., Feitosa, R. Q. and Oliveira, D. A. B.: Synthesis of Multispectral Optical
Images From SAR/Optical Multitemporal Data Using Conditional Generative Adversarial Networks,
IEEE Geosci. Remote Sens. Lett., 16(8), 1220–1224, doi:10.1109/LGRS.2019.2894734, 2019.
Chen, B., Huang, B. and Xu, B.: Multi-source remotely sensed data fusion for improving land cover
classification, ISPRS J. Photogramm. Remote Sens., 124, 27–39, doi:10.1016/j.isprsjprs.2016.12.008, 2017.
Chen, J., Chen, J., Chao, H. and Yang, M.: Image Blind Denoising with Generative Adversarial Network
Based Noise Modeling, in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.
3155–3164, IEEE, Salt Lake City, UT., 2018.
Chen, Z., Zeng, Z., Shen, H., Zheng, X., Dai, P. and Ouyang, P.: DN-GAN: Denoising generative
adversarial networks for speckle noise reduction in optical coherence tomography images, Biomed. Signal
Process. Control, 55, 101632, doi:10.1016/j.bspc.2019.101632, 2020.
Coluzzi, R., Imbrenda, V., Lanfredi, M. and Simoniello, T.: A first assessment of the Sentinel-2 Level 1-C
cloud mask product to support informed surface analyses, Remote Sens. Environ., 217, 426–443,
doi:10.1016/j.rse.2018.08.009, 2018.
Cresson, R., Ienco, D., Gaetano, R., Ose, K. and Ho Tong Minh, D.: Optical image gap filling using deep
convolutional autoencoder from optical and radar images, IEEE, Yokohama, Japan., 2019.
Eckardt, R., Berger, C., Thiel, C. and Schmullius, C.: Removal of Optically Thick Clouds from Multi-
Spectral Satellite Images Using Multi-Frequency SAR Data, Remote Sens., 5(6), 2973–3006,
doi:10.3390/rs5062973, 2013.
ESDS: Remote Sensors | Earthdata, [online] Available from:
https://earthdata.nasa.gov/learn/remote-sensors (Accessed 8 October 2019), 2019.
Gao, F., Anderson, M. C. and Xie, D.: Spatial and temporal information fusion for crop condition
monitoring, in 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp.
3579–3582., 2016.
Gao, J., Yuan, Q., Li, J., Zhang, H. and Su, X.: Cloud Removal with Fusion of High Resolution Optical
and SAR Images Using Generative Adversarial Networks, Remote Sens., 12(1), 191,
doi:10.3390/rs12010191, 2020.
Google Colab: Welcome To Colaboratory - Colaboratory, [online] Available from:
https://colab.research.google.com/notebooks/welcome.ipynb (Accessed 3 October 2019), n.d.
Google Earth Engine: FAQ – Google Earth Engine, [online] Available from:
https://earthengine.google.com/faq/ (Accessed 3 October 2019), n.d.
google/earthengine-api: google/earthengine-api, GitHub [online] Available from:
https://github.com/google/earthengine-api (Accessed 10 February 2020), n.d.
Gore, A.: The digital earth, Aust. Surv., 43(2), 89–91, doi:10.1080/00050326.1998.10441850, 1998.
Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, D., Thau, D. and Moore, R.: Google Earth Engine:
Planetary-scale geospatial analysis for everyone, Remote Sens. Environ., 202, 18–27,
doi:10.1016/j.rse.2017.06.031, 2017.
Grohnfeldt, C., Schmitt, M. and Zhu, X.: A Conditional Generative Adversarial Network to Fuse Sar And
Multispectral Optical Data For Cloud Removal From Sentinel-2 Images, in IGARSS 2018 - 2018 IEEE
International Geoscience and Remote Sensing Symposium, pp. 1726–1729., 2018.
He, W. and Yokoya, N.: Multi-Temporal Sentinel-1 and -2 Data Fusion for Optical Image Simulation,
ISPRS Int. J. Geo-Inf., 7(10), 389, doi:10.3390/ijgi7100389, 2018.
Huang, B., Li, Y., Han, X., Cui, Y., Li, W. and Li, R.: Cloud Removal From Optical Satellite Imagery With
SAR Imagery Using Sparse Representation, IEEE Geosci. Remote Sens. Lett., 12(5), 1046–1050,
doi:10.1109/LGRS.2014.2377476, 2015.
Jacquemoud, S., Verhoef, W., Baret, F., Bacour, C., Zarco-Tejada, P. J., Asner, G. P., François, C. and
Ustin, S. L.: PROSPECT+SAIL models: A review of use for vegetation characterization, Remote Sens.
Environ., 113, S56–S66, doi:10.1016/j.rse.2008.01.026, 2009.
Krizhevsky, A., Sutskever, I. and Hinton, G. E.: ImageNet Classification with Deep Convolutional Neural
Networks, in Advances in Neural Information Processing Systems 25, edited by F. Pereira, C. J. C. Burges,
L. Bottou, and K. Q. Weinberger, pp. 1097–1105, Curran Associates, Inc. [online] Available from:
http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
(Accessed 9 February 2020), 2012.
Lee, J.-S.: Digital Image Enhancement and Noise Filtering by Use of Local Statistics, IEEE Trans. Pattern
Anal. Mach. Intell., PAMI-2(2), 165–168, doi:10.1109/TPAMI.1980.4766994, 1980.
Li, X., Shen, H., Zhang, L., Zhang, H., Yuan, Q. and Yang, G.: Recovering Quantitative Remote Sensing
Products Contaminated by Thick Clouds and Shadows Using Multitemporal Dictionary Learning, IEEE
Trans. Geosci. Remote Sens., 52(11), 7086–7098, doi:10.1109/TGRS.2014.2307354, 2014.
Liang, S., Li, X. and Wang, J., Eds.: Chapter 1 - A Systematic View of Remote Sensing, in Advanced
Remote Sensing, pp. 1–31, Academic Press, Boston., 2012.
Liu, L. and Lei, B.: Can SAR Images and Optical Images Transfer with Each Other?, in IGARSS 2018 -
2018 IEEE International Geoscience and Remote Sensing Symposium, pp. 7019–7022., 2018.
Loff, S.: Cloudy Earth, NASA [online] Available from: http://www.nasa.gov/image-feature/cloudy-earth
(Accessed 12 February 2020), 2015.
Ma, L., Liu, Y., Zhang, X., Ye, Y., Yin, G. and Johnson, B. A.: Deep learning in remote sensing applications:
A meta-analysis and review, ISPRS J. Photogramm. Remote Sens., 152, 166–177,
doi:10.1016/j.isprsjprs.2019.04.015, 2019.
Mateo-García, G., Gómez-Chova, L., Amorós-López, J., Muñoz-Marí, J. and Camps-Valls, G.:
Multitemporal Cloud Masking in the Google Earth Engine, Remote Sens., 10(7), 1079,
doi:10.3390/rs10071079, 2018.
Meados, C., Nair, R. and Plascencia, D.: Pix2pix Utilizing the Deep Learning Reference Stack, 01.org
[online] Available from: https://01.org/blogs/2019/pix2pix-utilizing-deep-learning-reference-stack
(Accessed 8 February 2020), 2019.
Mou, L., Schmitt, M., Wang, Y. and Zhu, X.: A CNN for the Identification of Corresponding Patches in
SAR and Optical Imagery of Urban Scenes., 2017.
Nagendra, H., Lucas, R., Honrado, J. P., Jongman, R. H. G., Tarantino, C., Adamo, M. and Mairota, P.:
Remote sensing for conservation monitoring: Assessing protected areas, habitat extent, habitat condition,
species diversity, and threats, Ecol. Indic., 33, 45–59, doi:10.1016/j.ecolind.2012.09.014, 2013.
Pettorelli, N., Laurance, W. F., O’Brien, T. G., Wegmann, M., Nagendra, H. and Turner, W.: Satellite
remote sensing for applied ecologists: opportunities and challenges, J. Appl. Ecol., 51(4), 839–848,
doi:10.1111/1365-2664.12261, 2014.
Ronneberger, O., Fischer, P. and Brox, T.: U-Net: Convolutional Networks for Biomedical Image
Segmentation, ArXiv150504597 Cs [online] Available from: http://arxiv.org/abs/1505.04597 (Accessed 8
February 2020), 2015.
Scarpa, G., Gargiulo, M., Mazza, A. and Gaetano, R.: A CNN-Based Fusion Method for Feature Extraction
from Sentinel Data, Remote Sens., 10(2), 236, doi:10.3390/rs10020236, 2018.
Schmitt, M., Tupin, F. and Zhu, X. X.: Fusion of SAR and optical remote sensing data — Challenges and
recent trends, in 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp.
5458–5461., 2017.
Schmitt, M., Hughes, L. H. and Zhu, X. X.: The SEN1-2 Dataset for Deep Learning in SAR-Optical Data
Fusion, ArXiv180701569 Cs [online] Available from: http://arxiv.org/abs/1807.01569 (Accessed 16 January
2020), 2018.
Shelhamer, E., Long, J. and Darrell, T.: Fully Convolutional Networks for Semantic Segmentation, IEEE
Trans. Pattern Anal. Mach. Intell., 39(4), 640–651, doi:10.1109/TPAMI.2016.2572683, 2017.
Shen, H., Li, X., Cheng, Q., Zeng, C., Yang, G., Li, H. and Zhang, L.: Missing Information Reconstruction
of Remote Sensing Data: A Technical Review, IEEE Geosci. Remote Sens. Mag., 3(3), 61–85,
doi:10.1109/MGRS.2015.2441912, 2015.
Small, C., Okujeni, A., van der Linden, S. and Waske, B.: 6.07 - Remote Sensing of Urban Environments,
in Comprehensive Remote Sensing, edited by S. Liang, pp. 96–127, Elsevier, Oxford., 2018.
Stuhler, S., Leiterer, R., Joerg, P., Wulf, H. and Schaepman, M.: Generating a cloud-free, homogeneous
Landsat-8 mosaic of Switzerland using Google Earth Engine., 2016.
tensorflow/models: tensorflow/models, GitHub [online] Available from:
https://github.com/tensorflow/models (Accessed 10 February 2020), n.d.
Tzouvaras, M., Kouhartsiouk, D., Agapiou, A., Danezis, C. and Hadjimitsis, D. G.: The Use of Sentinel-1
Synthetic Aperture Radar (SAR) Images and Open-Source Software for Cultural Heritage: An Example
from Paphos Area in Cyprus for Mapping Landscape Changes after a 5.6 Magnitude Earthquake, Remote
Sens., 11(15), 1766, doi:10.3390/rs11151766, 2019.
Wang, P. and Patel, V. M.: Generating High Quality Visible Images from SAR Images Using CNNs,
ArXiv180210036 Cs [online] Available from: http://arxiv.org/abs/1802.10036 (Accessed 15 February 2020),
2018.
Wang, X., Jia, K., Liang, S., Li, Q., Wei, X., Yao, Y., Zhang, X. and Tu, Y.: Estimating Fractional Vegetation
Cover From Landsat-7 ETM+ Reflectance Data Based on a Coupled Radiative Transfer and Crop Growth
Model, IEEE Trans. Geosci. Remote Sens., 55(10), 5539–5546, doi:10.1109/TGRS.2017.2709803, 2017.
Wang, Y. and Zhu, X. X.: The SARptical Dataset for Joint Analysis of SAR and Optical Image in Dense
Urban Area, in IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium, pp.
6840–6843, IEEE, Valencia., 2018.
Wang, Z. and Bovik, A. C.: Mean squared error: Love it or leave it? A new look at Signal Fidelity Measures,
IEEE Signal Process. Mag., 26(1), 98–117, doi:10.1109/MSP.2008.930649, 2009.
Wang, Z., Bovik, A. C., Sheikh, H. R. and Simoncelli, E. P.: Image quality assessment: from error visibility
to structural similarity, IEEE Trans. Image Process., 13(4), 600–612, doi:10.1109/TIP.2003.819861, 2004.
Woodhouse, I. H.: Introduction to microwave remote sensing, CRC press., 2017.
Wu, B., Huang, B., Cao, K. and Zhuo, G.: Improving spatiotemporal reflectance fusion using image
inpainting and steering kernel regression techniques, Int. J. Remote Sens., 38(3), 706–727,
doi:10.1080/01431161.2016.1271471, 2017.
Xu, M., Jia, X., Pickering, M. and Plaza, A. J.: Cloud Removal Based on Sparse Representation via
Multitemporal Dictionary Learning, IEEE Trans. Geosci. Remote Sens., 54(5), 2998–3006,
doi:10.1109/TGRS.2015.2509860, 2016.
Zhang, W., Huang, H., Schmitz, M., Sun, X., Wang, H. and Mayer, H.: Effective Fusion of Multi-Modal
Remote Sensing Data in a Fully Convolutional Network for Semantic Labeling, Remote Sens., 10(1), 52,
doi:10.3390/rs10010052, 2017.
Zhang, Y., Sohn, K., Villegas, R., Pan, G. and Lee, H.: Improving Object Detection With Deep
Convolutional Networks via Bayesian Optimization and Structured Prediction, pp. 249–258. [online]
Available from:
http://openaccess.thecvf.com/content_cvpr_2015/html/Zhang_Improving_Object_Detection_2015_CVPR_paper.html
(Accessed 2 April 2020), 2015.
Zheng, Y., Wu, B., Zhang, M. and Zeng, H.: Crop Phenology Detection Using High Spatio-Temporal
Resolution Data Fused from SPOT5 and MODIS Products, Sensors, 16(12), 2099, doi:10.3390/s16122099,
2016.
Zhu, X. and Helmer, E. H.: An automatic method for screening clouds and cloud shadows in optical
satellite image time series in cloudy regions, Remote Sens. Environ., 214, 135–153,
doi:10.1016/j.rse.2018.05.024, 2018.
Zhu, X., Cai, F., Tian, J. and Williams, T. K.-A.: Spatiotemporal Fusion of Multisource Remote Sensing
Data: Literature Survey, Taxonomy, Principles, Applications, and Future Directions, Remote Sens., 10(4),
527, doi:10.3390/rs10040527, 2018.
Zhu, X. X., Tuia, D., Mou, L., Xia, G.-S., Zhang, L., Xu, F. and Fraundorfer, F.: Deep Learning in Remote
Sensing: A Comprehensive Review and List of Resources, IEEE Geosci. Remote Sens. Mag., 5(4), 8–36,
doi:10.1109/MGRS.2017.2762307, 2017.