
AN EXTENSIBLE AND EASY-TO-USE TOOLBOX FOR DEEP LEARNING BASED ANALYSIS OF REMOTE SENSING IMAGES

Raian Vargas Maretto, Thales Sehn Korting, Leila Maria Garcia Fonseca

National Institute for Space Research (INPE, Brazil)

ABSTRACT

Deep Learning (DL) methods are currently the state of the art in Machine Learning and Pattern Recognition. In recent years, DL has been successfully applied to Remote Sensing (RS) image processing for several tasks, from pre-processing to classification. This paper presents DeepGeo, a toolbox that provides state-of-the-art DL algorithms for RS image classification and analysis. DeepGeo focuses on providing easy-to-use and extensible methods, making DL more accessible to RS analysts without strong programming skills. It is distributed as a free and open source package and is available at https://github.com/rvmaretto/deepgeo.

Index Terms: Deep Learning, Convolutional Neural Networks, Remote Sensing, Semantic Segmentation

1. INTRODUCTION

With the growing accessibility of new generation Remote Sensing (RS) sensors, a large bulk of data has become available. Access to this amount of data has brought the opportunity to widen our ability to understand the Earth. At the same time, it has made traditional non-automatic analysis impracticable, increasing the focus on the ability to automatically extract valuable information from those images.

In recent years, Deep Learning (DL) has become a hotspot in the Machine Learning and Pattern Recognition communities. It is characterized by a set of Artificial Neural Networks (ANN) composed of multiple feature extraction layers, also called Deep Neural Networks (DNN). These layers are responsible for extracting features at different levels of abstraction, starting from the raw data [1]. Each level transforms the representation of the previous ones into a more abstract model, hierarchically combining them, and is thus able to model and explore intrinsic correlations in the data [2].

In RS image processing, DL methods have been successfully applied for many different purposes, such as pan-sharpening [3] and semantic segmentation (pixelwise classification) [4, 5, 6]. These methods have therefore been considered by [7] as crucial for the future of RS data analysis, especially in the age of big data.

Thanks to the World Bank and to the Sao Paulo Research Foundation (FAPESP) for funding through the FIP (Forest Investment Program) and project #2017/24086-2, respectively.

Several toolboxes are available for DL development, such as TensorFlow [8], Keras [9], Theano [10] and PyTorch [11]. Although very powerful, these toolboxes are hard to use for analysts without strong programming skills. Due to the complexity of the concepts involved, they demand a strong background in Computer Science from the analyst in order to implement a DNN and perform tasks like classification and analysis.

Focusing on facilitating access to DL techniques by RS analysts with as few lines of source code as possible, this paper presents the DeepGeo toolbox. It provides configurable building blocks to perform the entire cycle of DL based analysis of RS data.

2. DEEPGEO TOOLBOX

DeepGeo is a Python toolbox that provides, as configurable building blocks, tools to perform spatial and multi-temporal DL based analysis of RS images. It integrates tools to perform the following tasks: pre-process data; generate training, evaluation and validation datasets; train predefined DNNs; easily customize and implement new DNNs; apply DL classification based on a trained DNN; and analyse and visualize results.

DeepGeo is distributed as free and open source software under the terms of the GNU General Public License version 3.0 or later, and runs on multiple platforms, e.g., Windows, Mac OS X and Linux. The system works as a package for the Python programming language and provides a high-level, easy-to-use API (Application Programming Interface). The DeepGeo API was developed with a focus on making it easy to perform the entire DL analysis cycle with few lines of source code, taking advantage of TensorFlow parallelism to scale to large amounts of data while remaining flexible and easily extensible. We consider that the cycle of DL based classification and analysis of RS data, shown in Figure 1, has the following main steps:

1. defining the input data and DNN model;

2. preprocessing the input data;

3. generating the training dataset;

4. training the model;

5. evaluating the training results, repeating the training step, if necessary, until the results are satisfactory;

6. performing the classification;

7. visualizing and analyzing the classification results, repeating the training step if necessary.

978-1-5386-9154-0/19/$31.00 ©2019 IEEE, IGARSS 2019

Fig. 1. Cycle for Remote Sensing image classification and analysis using Deep Learning.

In the next sub-sections, we describe, based on this cycle, the conceptual modules of DeepGeo and present some examples of how to use its functionalities.

2.1. Preprocessing Module

When dealing with RS images, preparing them properly as input to a DNN is often a great challenge. This module provides easy ways to perform a wide range of preprocessing operations, such as building mosaics, cropping images, rasterizing vector layers of ground truth data and computing spectral indices.
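To make the spectral index step concrete, the sketch below computes NDVI and EVI2 with NumPy from their standard definitions. It is only an illustration and does not reproduce DeepGeo's implementation: the function names are hypothetical, and the bands are assumed to hold reflectance values as floating-point arrays.

```python
import numpy as np

def ndvi(red, nir):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    red = np.asarray(red, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    denom = nir + red
    out = np.zeros_like(denom)
    # Pixels with a zero denominator are left at 0 instead of dividing by zero.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

def evi2(red, nir):
    """Two-band Enhanced Vegetation Index: 2.5 * (NIR - Red) / (NIR + 2.4 * Red + 1)."""
    red = np.asarray(red, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)

# Tiny 2x2 example with made-up reflectance values.
red = np.array([[0.1, 0.2], [0.0, 0.3]])
nir = np.array([[0.5, 0.2], [0.0, 0.6]])
ndvi_img = ndvi(red, nir)
evi2_img = evi2(red, nir)
```

Both indices are computed per pixel, so the resulting arrays can be stacked onto the original bands as additional input channels.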

Normalizing input images is an important step for ANNs, since it accelerates the training process by making convergence faster [12]. This module also provides functions to automatically perform standardization or normalization of the input images with several strategies. The code snippet presented in Figure 2 shows the definition of a Preprocessor structure.

In RS classification tasks, the ground truth data is frequently provided as maps in vector format instead of raster. The Rasterizer type allows the user to easily convert vector ground truth data to raster so that it can be input as labels to the DNN. Figure 3 presents a code snippet that rasterizes an input vector file and saves it in a new raster file.
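As a minimal sketch of the two most common strategies, the NumPy code below standardizes each band (subtracting its mean and dividing by its standard deviation) and, alternatively, rescales each band to the range [0, 1]. The function names and the handling of no-data pixels are assumptions made for illustration and are not taken from DeepGeo.

```python
import numpy as np

def standardize(image, no_data=None):
    """Per-band standardization of an image shaped (height, width, bands)."""
    image = np.asarray(image, dtype=np.float64)
    if no_data is None:
        valid = np.ones(image.shape[:2], dtype=bool)
    else:
        # A pixel is treated as valid if at least one band differs from no_data.
        valid = np.any(image != no_data, axis=-1)
    pixels = image[valid]              # shape: (n_valid_pixels, bands)
    mean = pixels.mean(axis=0)
    std = pixels.std(axis=0)
    std[std == 0] = 1.0                # avoid dividing constant bands by zero
    out = image.copy()
    out[valid] = (pixels - mean) / std
    return out

def min_max_normalize(image):
    """Rescale each band of an image shaped (height, width, bands) to [0, 1]."""
    image = np.asarray(image, dtype=np.float64)
    lo = image.min(axis=(0, 1))
    hi = image.max(axis=(0, 1))
    span = np.where(hi > lo, hi - lo, 1.0)
    return (image - lo) / span

rng = np.random.default_rng(0)
img = rng.uniform(0, 10000, size=(8, 8, 3))
std_img = standardize(img)
norm_img = min_max_normalize(img)
```

After standardization every band has zero mean and unit standard deviation, which keeps the inputs of the first layer on a comparable scale across bands.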

    import deepgeo.dataset.preprocessor as prep

    raster_file = "my_raster.tif"
    # Define a Preprocessor for file "my_raster.tif"
    preproc = prep.Preprocessor(raster_file, no_data=0)
    # Compute vegetation indices (NDVI and EVI2)
    preproc.compute_indices({
        "ndvi": {"idx_b_red": 3, "idx_b_nir": 4},
        "evi2": {"idx_b_red": 3, "idx_b_nir": 4}})
    # Standardize the image: subtract the mean and then
    # divide by the standard deviation
    preproc.standardize_image("mean_std")
    preproc.save_stacked_raster("output.tiff")

Fig. 2. Defining and using a Preprocessor.

    import deepgeo.dataset.rasterizer as rast

    # Define the input data
    raster_file = "my_raster.tif"
    labels_shp = "my_labels.shp"
    # Define the column in the shapefile containing the classes
    class_column = "class"
    # Define the classes to be rasterized
    classes_of_inter = ["deforestation", "forest"]
    # Define the Rasterizer (argument order assumed: vector labels,
    # then the reference raster)
    rasterizer = rast.Rasterizer(labels_shp,
                                 raster_file,
                                 class_column=class_column,
                                 classes_interest=classes_of_inter)
    # Rasterize the data and save it as "my_labels.tif"
    rasterizer.rasterize_layer()
    rasterizer.save_labeled_raster_to_gtiff("my_labels.tif")

Fig. 3. Rasterizing vector ground truth data.
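To illustrate what rasterizing ground truth means, the sketch below burns polygons into a label grid with plain NumPy, marking every pixel whose centre falls inside a polygon (even-odd rule). It is a conceptual example only: a real workflow would normally rely on a geospatial library and work in map coordinates, whereas here the vertices are assumed to be already expressed in pixel coordinates. The function name, the class identifiers and the "no_class" background label are hypothetical.

```python
import numpy as np

def burn_polygon(labels, vertices, value):
    """Set `value` on every pixel of `labels` whose centre lies inside the polygon."""
    height, width = labels.shape
    rows, cols = np.mgrid[0:height, 0:width]
    px, py = cols + 0.5, rows + 0.5          # pixel-centre coordinates
    inside = np.zeros(labels.shape, dtype=bool)
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        if y0 == y1:
            continue                         # horizontal edges never cross the ray
        # Does this edge cross the horizontal ray cast to the right of the pixel?
        crosses = (y0 > py) != (y1 > py)
        x_at = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
        inside ^= crosses & (px < x_at)
    labels[inside] = value
    return labels

# Hypothetical mapping from class names to integer labels.
class_ids = {"no_class": 0, "deforestation": 1, "forest": 2}
features = [
    ("forest", [(0, 0), (6, 0), (6, 6), (0, 6)]),
    ("deforestation", [(1, 1), (4, 1), (4, 4), (1, 4)]),
]
labels = np.zeros((6, 6), dtype=np.uint8)
for class_name, polygon in features:
    burn_polygon(labels, polygon, class_ids[class_name])
```

Polygons burned later overwrite earlier ones, so the order of the features decides which class wins where geometries overlap.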

2.2. Dataset Generation Module

Due to their depth and complexity, DL models are usually computationally expensive, making it impossible to process an entire RS image at once. Due to this limitation, it is common to split the images into smaller processing units, called chips or patches, i.e., small windows in the original image. The Dataset Generation Module provides a simple API to sequentially or randomly split the image and save it into a training dataset, to sequentially split images for the classification process, and to reconstitute a classified image from a sequential set of classified patches.
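The three operations described above can be sketched with NumPy as follows. This is an illustrative outline under stated assumptions, not DeepGeo's API: the function names are hypothetical, chips are square and non-overlapping in the sequential case, and image borders are zero-padded so that the image divides evenly into chips.

```python
import numpy as np

def split_sequential(image, chip_size):
    """Split an (H, W, C) image into non-overlapping chips, zero-padding the borders."""
    height, width = image.shape[:2]
    pad_h = (-height) % chip_size
    pad_w = (-width) % chip_size
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")
    chips = []
    for row in range(0, padded.shape[0], chip_size):
        for col in range(0, padded.shape[1], chip_size):
            chips.append(padded[row:row + chip_size, col:col + chip_size])
    return np.stack(chips), padded.shape[:2]

def split_random(image, chip_size, n_chips, seed=0):
    """Draw n_chips windows at random positions, e.g. to build a training set."""
    rng = np.random.default_rng(seed)
    height, width = image.shape[:2]
    rows = rng.integers(0, height - chip_size + 1, size=n_chips)
    cols = rng.integers(0, width - chip_size + 1, size=n_chips)
    return np.stack([image[r:r + chip_size, c:c + chip_size]
                     for r, c in zip(rows, cols)])

def reconstruct(chips, padded_shape, original_shape):
    """Reassemble sequential chips into one image and crop the padding away."""
    chip_size = chips.shape[1]
    n_rows = padded_shape[0] // chip_size
    n_cols = padded_shape[1] // chip_size
    grid = chips.reshape(n_rows, n_cols, chip_size, chip_size, -1)
    mosaic = grid.transpose(0, 2, 1, 3, 4).reshape(padded_shape[0], padded_shape[1], -1)
    return mosaic[:original_shape[0], :original_shape[1]]

image = np.arange(10 * 7 * 2).reshape(10, 7, 2)
chips, padded_shape = split_sequential(image, chip_size=4)
restored = reconstruct(chips, padded_shape, image.shape[:2])
random_chips = split_random(image, chip_size=4, n_chips=5)
```

Because the sequential chips are stored in row-major order, the classified chips can be passed to `reconstruct` in the same order to rebuild the full classified image.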

2.3. Deep Learning Module

The Deep Learning module provides several DNN models already implemented, such as the U-Net [13] and the Fully Convolutional Networks (FCN) proposed by [14], and some adaptations of these networks for multi-temporal analysis. It also provides an easy way to define new models, using the predefined DeepGeo structure to train them and perform classification, without requiring extensive experience with the TensorFlow API. The code snippet presented in Figure 4 defines an FCN and performs the training process based on a previously generated dataset.

Although powerful, DL methods are highly prone to overfitting, so a large number of samples and some regularization techniques are needed to avoid it. Some RS applications, due to the difficulty of acquiring large amounts of labeled samples, are even more prone to this problem [1]. A common regularization technique to counteract overfitting is data augmentation, which artificially increases the size of the training dataset by synthetically modifying existing samples. It is also important for making the model more invariant to the position of the target object in the image. This module provides operations to perform data augmentation by rotating the samples by different angles and flipping them, substantially increasing the number of training samples. Taking advantage of the parallelism and the structure of the TensorFlow Data Input Pipeline, the data augmentation is applied while loading the images for the training process.
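The rotation and flipping operations can be illustrated with a short NumPy sketch. It is not DeepGeo's implementation, which applies augmentation inside the TensorFlow input pipeline; the operation names follow those listed in Figure 4, while the `augment` function and the interpretation of "flip_transpose" as a swap of the two spatial axes are assumptions.

```python
import numpy as np

# Each operation acts on the two spatial axes only, so it works for both
# (height, width, bands) chips and (height, width) label arrays.
AUGMENTATIONS = {
    "rot90": lambda x: np.rot90(x, k=1, axes=(0, 1)),
    "rot180": lambda x: np.rot90(x, k=2, axes=(0, 1)),
    "rot270": lambda x: np.rot90(x, k=3, axes=(0, 1)),
    "flip_left_right": lambda x: np.flip(x, axis=1),
    "flip_up_down": lambda x: np.flip(x, axis=0),
    "flip_transpose": lambda x: np.swapaxes(x, 0, 1),
}

def augment(chip, label, ops):
    """Return the original sample plus one transformed copy per operation."""
    samples = [(chip, label)]
    for name in ops:
        transform = AUGMENTATIONS[name]
        # The same transform is applied to the chip and to its label so that
        # every pixel stays aligned with its ground truth class.
        samples.append((transform(chip), transform(label)))
    return samples

chip = np.arange(4 * 4 * 3).reshape(4, 4, 3)
label = np.arange(16).reshape(4, 4)
samples = augment(chip, label, list(AUGMENTATIONS))
```

With six operations, each original sample yields seven samples in total, which is consistent with the 2000 patches growing to 14000 in the case study of Section 3.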

    import deepgeo.networks.model_builder as mb
    import deepgeo.dataset.utils as dsutils

    # Datasets are stored in TFRecords
    train_ds = "train_dataset.tfrecord"
    test_ds = "test_dataset.tfrecord"
    valid_ds = "valid_dataset.tfrecord"
    model_dir = "trained_model"
    test_img = "img.tif"
    output = "classif.tif"
    # Define some parameters of the DNN
    params = {"epochs": 600,
              "batch_size": 20,
              "learning_rate": 0.0001,
              "data_aug_ops": ["rot90", "rot180",
                               "rot270", "flip_left_right",
                               "flip_up_down", "flip_transpose"]}
    # Define an FCN8s model and train it
    model = mb.ModelBuilder("fcn8s")
    model.train(train_ds, test_ds, params, model_dir)
    # Test on the validation dataset
    model.validate(valid_ds, params, model_dir)
    # Perform the classification
    model.predict(test_img, params, model_dir, output)

Fig. 4. Defining and training a FCN8s model.

2.4. Visualization and Classification Analysis Module

The Visualization and Classification Analysis module focuses on providing tools to visualize and analyze the quality of the input dataset and the classification results. It provides several metrics to measure the classification accuracy, such as pixel-wise accuracy, the Receiver Operating Characteristic (ROC) curve, F1-score and cross-entropy. In addition, to make it possible to visually analyze the quality of the input dataset and the classified labels, this module provides tools to easily plot the image, ground truth labels, classified labels, histograms, confusion matrices, and the distribution of patches in the original image.
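As an illustration of how some of these metrics relate to each other, the sketch below derives pixel-wise accuracy and per-class F1-score from a confusion matrix using NumPy. The function names are hypothetical and the code is not taken from DeepGeo; labels are assumed to be non-negative integers smaller than `n_classes`.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are ground truth classes, columns are predicted classes."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    index = y_true * n_classes + y_pred
    return np.bincount(index, minlength=n_classes ** 2).reshape(n_classes, n_classes)

def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class matches the ground truth."""
    return np.trace(cm) / cm.sum()

def f1_per_class(cm):
    """F1 = 2 TP / (2 TP + FP + FN), computed independently for each class."""
    tp = np.diag(cm).astype(np.float64)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    denom = 2 * tp + fp + fn
    return np.divide(2 * tp, denom, out=np.zeros_like(tp), where=denom > 0)

# Toy ground truth and prediction maps with two classes.
truth = np.array([[0, 0, 1], [1, 1, 0]])
pred = np.array([[0, 1, 1], [1, 0, 0]])
cm = confusion_matrix(truth, pred, n_classes=2)
acc = pixel_accuracy(cm)
f1 = f1_per_class(cm)
```

Pixel-wise accuracy can be misleading when one class dominates the image, which is why per-class scores such as F1 are usually reported alongside it.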

3. EXPERIMENTAL RESULTS: MAPPING DEFORESTED AREAS IN BRAZILIAN AMAZON

In this section we present a case study to illustrate the effective use of the DeepGeo toolbox. The focus here is not to obtain a highly accurate classification model, but to exemplify the use of DeepGeo in a practical application. We used the system to produce a classification of deforested areas in a small area of the Brazilian Amazon for the year 2017, taking the PRODES data [15] as ground truth to train the Fully Convolutional Network FCN8s, proposed by [14]. PRODES is a program developed by INPE¹ that provides a large database of yearly maps of deforested areas in the Brazilian Amazon since 2000.

The area to be classified corresponds to one scene of the Landsat 8 OLI² sensor. Figure 5 shows this image in (a), the ground truth labels in (b) and the classification results in (c). Preprocessing steps were performed with the source code presented in Figure 2. The ground truth data was rasterized with the source code presented in Figure 3. Based on the input raster, 2000 patches were randomly generated as the input training dataset. Data augmentation was performed by rotating and flipping the generated patches, totaling 14000 input patches for the training process (each original patch plus its six transformed versions). The training process was performed with the source code shown in Figure 4.

Some advantages of using this system can be pointed out, such as the ease of performing preprocessing steps on the input image (standardizing it and computing spectral indices), generating training and classification datasets, and performing data augmentation and classification without much programming expertise.

4. CONCLUSIONS AND FUTURE WORK

The DeepGeo toolbox was presented in this paper. By providing DL methods, preprocessing, dataset generation and result analysis functionalities as easy-to-use building blocks, it allows users to easily perform the entire cycle of DL based analysis on RS data. In this way, DeepGeo can make DL methods more accessible to RS analysts without a strong background in computer science. It also provides easy ways to extend the current functionalities by adding new strategies for each step of the analysis cycle. Besides that, taking advantage of the flexibility and expressiveness of the Python programming language, DeepGeo can be easily extended and coupled to other tools.

The system can also deal with several geospatial data formats for raster and vector data, which can be used as ground truth. For now, it only provides tools based on convolutional encoder-decoders for semantic segmentation. Future work includes extending DeepGeo to provide more DL approaches and applications, such as Recurrent Neural Networks for time series analysis or autoencoders for pan-sharpening.

¹ National Institute for Space Research, Brazil
² Operational Land Imager

Fig. 5. Classification of deforested areas: a) Input Image; b) Ground truth; c) Classification output

5. REFERENCES

[1] X. X. Zhu, D. Tuia, L. Mou, et al., “Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources,” IEEE Geoscience and Remote Sensing Magazine, vol. 5, no. 4, pp. 8–36, 2017.

[2] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, pp. 436–444, 2015.

[3] W. Huang, L. Xiao, Z. Wei, H. Liu, and S. Tang, “A New Pan-Sharpening Method With Deep Neural Networks,” IEEE Geoscience and Remote Sensing Letters, vol. 12, no. 5, pp. 1037–1041, 2015.

[4] G. Fu, C. Liu, R. Zhou, T. Sun, and Q. Zhang, “Classification for high resolution remote sensing imagery using a fully convolutional network,” Remote Sensing, vol. 9, no. 5, pp. 1–21, 2017.

[5] R. Kemker, C. Salvaggio, and C. Kanan, “Algorithms for semantic segmentation of multispectral remote sensing imagery using deep learning,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 145, no. March, pp. 60–77, 2018.

[6] M. Zhai, Z. Bessinger, S. Workman, and N. Jacobs, “Predicting ground-level scene layout from aerial imagery,” Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, vol. 2017-January, pp. 4132–4140, 2017.

[7] L. Zhang, L. Zhang, and V. Kumar, “Deep learning for Remote Sensing Data: A technical tutorial on the state of the art,” IEEE Geoscience and Remote Sensing Magazine, vol. 4, no. 2, pp. 22–40, 2016.

[8] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, et al., “TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems,” arXiv preprint arXiv:1603.04467, 2016.

[9] F. Chollet et al., “Keras,” 2015.

[10] Theano Development Team, “Theano: A Python framework for fast computation of mathematical expressions,” arXiv e-prints, vol. abs/1605.02688, May 2016.

[11] A. Paszke, S. Chintala, R. Collobert, et al., “PyTorch: Tensors and dynamic neural networks in Python with strong GPU acceleration,” May 2017.

[12] Y. LeCun, L. Bottou, G. B. Orr, and K. R. Muller, “Efficient backprop,” in Neural Networks: Tricks of the Trade, Lecture Notes in Computer Science, vol. 7700, pp. 9–48. Springer, Berlin, Heidelberg, 2012.

[13] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” Medical Image Computing and Computer-Assisted Intervention, MICCAI 2015, vol. 9351, pp. 234–241, 2015.

[14] J. Long, E. Shelhamer, and T. Darrell, “Fully Convolutional Networks for Semantic Segmentation,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3431–3440. IEEE Xplore, 2015.

[15] INPE, PRODES Deforestation estimates in Brazilian Amazon, National Institute for Space Research, 2019.
