
DEEP LEARNING-DRIVEN GEOSPATIAL OBJECT DETECTION FOR REAL-WORLD INSIGHT

Geospatial Object Detection Overview

The broadening availability of geospatial data and advances in computational

ability are helping organizations in a wide range of industries leverage geospatial

data in new ways. As adoption of techniques based on artificial intelligence

(AI) continues to permeate virtually every application, tasks such as geospatial

object detection are becoming more automated than ever before. However,

distinct challenges across the entire AI workflow for geospatial object detection

are requiring computing architectures comprising heterogeneous compute

nodes, hybrid storage, and software that supports AI.

Cray’s geospatial reference architecture addresses the compute and storage

requirements of this sector by providing a highly flexible and effective way

to apply advanced analytics and AI methods to new problems involving

geospatial data.

This white paper explores the industry trends accelerating geospatial data

usage, the compute and storage requirements of the AI workflow for geospatial

object detection, and the systems-level approach to designing deep learning

platforms for geospatial object detection.

Turning Geospatial Data into Business Insight

Geospatial data is playing an increasingly critical role among both governmental

and commercial organizations for a range of applications, including

environmental monitoring, disaster support, land use planning, geographic

information system (GIS) updates, precision agriculture, urban planning, and

many more. The ability to ingest, store, process, analyze, and make predictions

using geospatial data is becoming ever more important to addressing real-world

concerns for a variety of organizations, such as weather centers, national labs,

defense agencies, local governments, and healthcare, agriculture, insurance, and

transportation companies.

In the past, geospatial data was difficult to access and limited largely to

capital-intensive industries like oil & gas or for governmental purposes such as

land-use planning, intelligence, and defense. But now, geospatial data is being


adopted on a much broader level and by any organization looking to derive

actionable intelligence from the ability to analyze the location of earth objects.

This shift is being driven by a dramatic increase in the availability of geospatial

data. In recent years, government entities have invested in making geospatial

data more available, and the popularity of private data set providers like Google

Earth has helped to expand its accessibility to anyone with a laptop and an

internet connection. Increasing commercial interest in bolstering private launch

capabilities has helped a number of private enterprises launch their own earth

observation systems or remote sensing technologies in order to augment publicly

available geospatial data. These trends are enabling a wide range of potential

use cases that leverage geospatial data to drive business value, and that extend

beyond the standard industries that are already accustomed to using it.

The explosion of geospatial data usage has also been supported by processor

technology advances, lower-cost storage architectures, and availability of

frameworks/toolsets for machine and deep learning. The ability to more easily

apply analytics and AI to geospatial data is accelerating time-to-value for

geospatial applications and driving the burgeoning use of AI-based techniques.

Object Detection: A Highly Complex Vision Task

Geospatial analysis has always been a true “big data” use case. Most earth

observation data consists of highly detailed imagery and time-series data in

large file sizes. In fact, some geospatial images can exceed 300 gigabytes,

quickly outgrowing the memory capacity of a single processor. As a result,

processing ever-expanding volumes of geospatial data has required advanced

processors and workload parallelization. For decades, the compute capabilities

offered by supercomputers and other large-scale systems have been well-suited

to handle the intense demands of these highly parallel big data workloads.
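One common way to handle an image that does not fit in memory is to process it as a grid of smaller windows that can be read and analyzed independently. The sketch below is illustrative only: the function name `tile_windows`, the tile size, the overlap, and the image dimensions are assumptions, not part of any tooling described in this paper.

```python
from typing import Iterator, List, Tuple

# (col_off, row_off, width, height), all in pixels
Window = Tuple[int, int, int, int]


def _offsets(length: int, tile: int, stride: int) -> List[int]:
    """Start offsets along one axis so that the tiles cover [0, length)."""
    offsets = []
    pos = 0
    while True:
        offsets.append(pos)
        if pos + tile >= length:  # this tile already reaches the edge
            break
        pos += stride
    return offsets


def tile_windows(width: int, height: int, tile: int, overlap: int = 0) -> Iterator[Window]:
    """Yield windows covering a width x height image with square tiles.

    Edge tiles are truncated so that no window extends past the image.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    stride = tile - overlap
    for row in _offsets(height, tile, stride):
        for col in _offsets(width, tile, stride):
            yield (col, row, min(tile, width - col), min(tile, height - row))


# Hypothetical image size; each window could be handed to a separate worker.
n_tiles = sum(1 for _ in tile_windows(100_000, 80_000, tile=1024, overlap=128))
print(f"{n_tiles} windows")
```

The overlap is there so that an object cut by one tile boundary appears whole in a neighboring tile; how much overlap is appropriate depends on the largest object of interest.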

Deep learning has emerged as a promising way to improve computer-driven

visual recognition tasks. In the geospatial industry, the availability of deep learning

tools has enabled the use of AI in the labor-intensive task of identifying features

or objects in detailed imagery. Today a variety of commercial organizations are

using it to drive improvements to their processes (like an insurance company


using satellite imagery analysis to improve actuarial modeling, or a logistics

company using geospatial data to visually track equipment and deliveries).

Object detection in optical remote sensing images has a wide range of applications.

Geospatial object detection with deep learning can not only determine

whether an aerial image contains one or more objects belonging to a class

of interest, but also pinpoint how many objects are present along with the

geographical position of each. In the images below, a deep learning model is

used to identify different objects, including airplanes, tennis/baseball courts, and

vehicles traveling along a highway.
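Reporting the geographical position of a detected object usually means converting a pixel-space bounding box into map coordinates. A minimal sketch follows, assuming the image carries a six-coefficient affine transform ordered the way GDAL orders its geotransform. The function names and the example coefficients are hypothetical.

```python
from typing import Tuple

# Affine coefficients in GDAL geotransform order:
# (x_origin, pixel_width, row_rotation, y_origin, column_rotation, pixel_height)
GeoTransform = Tuple[float, float, float, float, float, float]
# Pixel-space bounding box: (xmin, ymin, xmax, ymax)
PixelBox = Tuple[float, float, float, float]


def pixel_to_geo(gt: GeoTransform, col: float, row: float) -> Tuple[float, float]:
    """Map a (col, row) pixel position to (x, y) map coordinates."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def box_center_geo(gt: GeoTransform, box: PixelBox) -> Tuple[float, float]:
    """Map coordinates of the center of a pixel-space bounding box."""
    xmin, ymin, xmax, ymax = box
    return pixel_to_geo(gt, (xmin + xmax) / 2.0, (ymin + ymax) / 2.0)


# Hypothetical north-up image with 0.5 m pixels in a projected coordinate system.
transform = (500000.0, 0.5, 0.0, 4100000.0, 0.0, -0.5)
detections = [(100, 200, 140, 260), (900, 40, 960, 90)]  # made-up boxes

print(f"{len(detections)} objects detected")
for box in detections:
    print(box, "->", box_center_geo(transform, box))
```

The pixel height is negative because row numbers increase downward while northing increases upward in a north-up image.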

While techniques like deep learning have made extracting meaningful and

actionable information from images easier, detecting specific objects or features

within an image is still a highly complex vision task. Objects belonging to the

same category can vary widely in visual appearance, and aerial images can

sometimes be poor quality or contain visual obstructions such as trees or

large buildings. Objects can sometimes be masked by variations in the image

viewpoint, background clutter, illumination, or shadows. These factors can have

a substantial effect on model accuracy and must be considered when training a

deep learning model for geospatial object detection.

Figure 1. Examples of Deep Learning-Driven Geospatial Object Detection (Source: “Geospatial Object Detection in High Resolution Satellite Images Based on Multi-Scale Convolutional Neural Network,” Remote Sensing, 2018)

https://www.mdpi.com/2072-4292/10/1/131/htm

There are many examples of companies applying deep learning models to geospatial data to solve real-world problems. In 2017, to drive progress in geospatial object detection, the U.S. Intelligence Advanced Research Projects Activity (IARPA) sponsored the Functional Map of the World Challenge, calling for new algorithms to classify facility, building, and land use from satellite imagery. For this challenge, companies employed deep learning to create the fastest and most accurate classification algorithms possible, analyzing approximately 4 terabytes of data. The winner generated a total of 12 trained deep learning models from four variants of the dataset and three variant models.

https://www.iarpa.gov/challenges/fmow.html
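Twelve models is the count obtained by pairing each of four dataset variants with each of three model variants. The sketch below enumerates such a grid. The pairing is an assumption about how the count arose, since the winning method is not described in detail here, and all names are placeholders.

```python
from itertools import product

# Placeholder names; the actual dataset and model variants are not given in the text.
DATASET_VARIANTS = [f"dataset_variant_{i}" for i in range(1, 5)]   # four variants
MODEL_VARIANTS = [f"model_variant_{name}" for name in "abc"]       # three variants

# One training run per (dataset variant, model variant) pair.
training_runs = [
    {"dataset": dataset, "model": model}
    for dataset, model in product(DATASET_VARIANTS, MODEL_VARIANTS)
]

print(len(training_runs))  # 4 x 3 = 12
for run in training_runs[:3]:
    print(run)
```

Because the runs are independent of one another, a grid like this is a natural fit for training several models at the same time on separate nodes.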

The AI Workflow for Geospatial Object Detection

Organizations interested in applying advanced analytics and AI to new problems

using geospatial data require a system configuration that combines heterogeneous

compute nodes, hybrid storage, and software that supports AI. This

system configuration addresses the requirements of a workflow geared toward

developing a model that can successfully and accurately convert raw data into

meaningful insights.

To help customers quickly move from experimentation to production, Cray

examined the AI workflow for geospatial object detection and developed a

reference configuration matched to the workflow requirements. The AI workflow

for geospatial object detection is very similar to the standard AI workflow used

for computer-driven image analysis and object detection:

Figure 2. Bounding Box Used for Model Training

The AI workflow for geospatial object detection comprises four general

stages: data generation, data preparation, model development, and model

implementation. Each stage has distinct infrastructure requirements in terms

of both compute and storage, and a balanced system is required in order to

cost-effectively handle both data capture and preprocessing at-scale and model

training within a defined time frame.

In the data generation stage, the storage subsystem for preprocessing ingests

geospatial data generated by remote sensing systems. From a compute

perspective, ingest performance and sufficient throughput are important in this

stage to prevent bottlenecks while bringing massive amounts of data into the

system quickly. Hard disk drive (HDD) storage offers higher capacity per dollar

than flash, making it most appropriate for this stage of the workflow.
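To see why sustained throughput matters at ingest, it helps to estimate how long a dataset takes to land on storage at a given rate. The sketch below is simple arithmetic in decimal units. The 4 TB size is borrowed from the challenge dataset mentioned earlier, and the ingest rates are assumed values, not measurements of any system.

```python
def transfer_seconds(data_tb: float, throughput_gb_per_s: float) -> float:
    """Seconds needed to move data_tb terabytes at a sustained rate.

    Uses decimal units (1 TB = 1000 GB) and ignores protocol overhead.
    """
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_tb * 1000.0 / throughput_gb_per_s


# Assumed sustained ingest rates in GB/s; real rates depend on the whole data path.
for rate in (1.0, 5.0, 10.0):
    minutes = transfer_seconds(4.0, rate) / 60.0
    print(f"{rate:4.1f} GB/s -> {minutes:6.1f} minutes for 4 TB")
```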

The data preparation stage is the most labor-intensive portion of the geospatial object detection AI workflow. Raw data must first be converted into an appropriate format for inclusion in the model, bounding boxes must be constructed, and further preprocessing must occur before the data is in a suitable format for training and validating one or more models. Data preprocessing of this nature is almost always done on CPU nodes because they cost less and because this avoids tying up GPU nodes with CPU tasks. Additionally, most data preprocessing code has been written for CPUs and would require significant reworking to leverage a GPU. Achieving high input/output operations per second (IOPS) to and from the storage subsystem isn’t a primary concern during data preparation, which again makes cost-effective HDD storage the most appropriate storage technology for this stage of the workflow.

Figure 3. The AI Workflow for Geospatial Object Detection with Infrastructure Requirements

• DATA GENERATION: ingest from data sources (e.g., remote sensing, satellite images, maps); HDD storage
• DATA PREPARATION: CPU nodes (Cray CS500); HDD storage
• MODEL DEVELOPMENT: dense GPU nodes (Cray CS-Storm); flash storage
• MODEL IMPLEMENTATION, inference: CPU nodes (Cray CS500); HDD storage
• MODEL IMPLEMENTATION, model retraining: dense GPU nodes (Cray CS-Storm); flash storage

HDD storage: Cray ClusterStor L300N. Flash storage: Cray ClusterStor L300F.
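As an illustration of the bounding-box part of data preparation, the sketch below clips a labeled box to one image tile and converts it to a normalized center/size form that many detection frameworks accept. The target format, the function names, and the example values are assumptions. This is ordinary CPU-side Python, in keeping with the point above about where preprocessing usually runs.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]   # (xmin, ymin, xmax, ymax) in pixels
Tile = Tuple[int, int, int, int]          # (col_off, row_off, width, height)


def clip_box_to_tile(box: Box, tile: Tile) -> Optional[Box]:
    """Return the part of a full-image box inside the tile, in tile-local pixels.

    Returns None when the box does not overlap the tile.
    """
    col_off, row_off, width, height = tile
    xmin = max(box[0], col_off)
    ymin = max(box[1], row_off)
    xmax = min(box[2], col_off + width)
    ymax = min(box[3], row_off + height)
    if xmax <= xmin or ymax <= ymin:
        return None
    return (xmin - col_off, ymin - row_off, xmax - col_off, ymax - row_off)


def to_normalized_center(box: Box, width: int, height: int) -> Box:
    """Convert a tile-local box to (center_x, center_y, w, h), each scaled to 0..1."""
    xmin, ymin, xmax, ymax = box
    return (
        (xmin + xmax) / 2.0 / width,
        (ymin + ymax) / 2.0 / height,
        (xmax - xmin) / width,
        (ymax - ymin) / height,
    )


# Made-up label that straddles the left edge of the second tile in a row.
tile = (1024, 0, 1024, 1024)
local = clip_box_to_tile((1000, 100, 1100, 200), tile)
if local is not None:
    print(local, to_normalized_center(local, tile[2], tile[3]))
```

A real pipeline would typically also drop clipped boxes that retain only a small fraction of the original object; that threshold is a tuning choice not covered here.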

In the model development stage, the dataset is used to begin creating and

training deep learning models. This highly iterative stage involves a large

amount of trial and error, with models requiring repeated re-training until they

achieve the desired degree of accuracy. To achieve the highest effectiveness and

shortest time-to-results, this training stage runs best using dense GPU nodes.

Model training typically employs software tools that either can only run, or run

much faster, using a GPU node. GPU-based training is also I/O-bound, meaning

the ability to train models within a defined time frame is dependent on the

storage system’s ability to continually feed data into the model. These factors

make flash-based storage the most appropriate storage technology for this

stage of the workflow.
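A rough way to check whether storage can keep the GPUs fed is to multiply the number of GPUs by the per-GPU sample rate and the average sample size, then compare the result with the sustained read bandwidth of the storage tier. The sketch below does only that arithmetic. Every number in the example is a hypothetical placeholder and not a measured figure.

```python
def required_read_gb_per_s(
    num_gpus: int, samples_per_s_per_gpu: float, bytes_per_sample: float
) -> float:
    """Aggregate read bandwidth (GB/s, decimal) needed to keep all GPUs busy."""
    return num_gpus * samples_per_s_per_gpu * bytes_per_sample / 1e9


def is_storage_bound(required_gb_per_s: float, storage_read_gb_per_s: float) -> bool:
    """True when training would wait on storage rather than on the GPUs."""
    return required_gb_per_s > storage_read_gb_per_s


# Hypothetical workload: 16 GPUs, 500 image tiles/s per GPU, 150 kB per tile.
need = required_read_gb_per_s(16, 500.0, 150_000)
print(f"required read bandwidth: {need:.2f} GB/s")

# Hypothetical storage tiers with different sustained read rates (GB/s).
for name, bandwidth in (("slower tier", 0.8), ("faster tier", 10.0)):
    print(name, "-> storage-bound" if is_storage_bound(need, bandwidth) else "-> GPU-bound")
```

Random-access patterns during shuffled training can reduce the effective bandwidth well below a sequential figure, so a calculation like this gives a lower bound on what the storage tier must deliver.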

In the model implementation stage, the completed model is integrated into the application that will be used to solve a real-world problem. Model inference follows, where model results are compared to real-life data to test the accuracy of the model. During this stage, the finished model is applied continuously to new data to generate predictions (i.e., inference); the model itself does not learn from this data until it is retrained. CPU-based nodes and HDD storage are adequate for the less computationally intensive sub-phase of new data processing, while GPU nodes and flash storage are required for model retraining.
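The stage-by-stage requirements described above can be summarized as a small lookup table. The sketch below is one way to encode them. The class and key names are assumptions made for illustration, and the entries restate what the text says rather than adding new sizing guidance.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class StageResources:
    compute: str
    storage: str


# Keys and field values paraphrase the workflow description; names are illustrative.
WORKFLOW: Dict[str, StageResources] = {
    "data generation": StageResources("ingest from data sources", "HDD"),
    "data preparation": StageResources("CPU nodes", "HDD"),
    "model development": StageResources("dense GPU nodes", "flash"),
    "model implementation: inference": StageResources("CPU nodes", "HDD"),
    "model implementation: retraining": StageResources("dense GPU nodes", "flash"),
}


def resources_for(stage: str) -> StageResources:
    """Look up the compute and storage class the text pairs with a stage."""
    try:
        return WORKFLOW[stage.lower()]
    except KeyError:
        raise ValueError(f"unknown stage: {stage!r}") from None


for stage, res in WORKFLOW.items():
    print(f"{stage:34s} compute={res.compute:26s} storage={res.storage}")
```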

Infrastructure for the Entire AI Workflow

Cray’s geospatial reference architecture is designed for AI applications where

deep learning is used to prepare and interpret image data, develop complex

neural network models, and locate or classify objects, such as in geospatial

object detection. The Cray geospatial reference configuration is shown below.

Geospatial Reference Configuration

CAPABILITY: COMPONENT

AI deep learning node(s): 2 x Cray CS-Storm 500NX nodes, each with NVIDIA Tesla “Volta” 32GB V100 SXM2 GPU accelerators and 1 Intel Xeon Scalable “Skylake” processors

Data preparation and AI machine learning node(s): 8 x Cray CS500 2829XT nodes with 18-core Intel Xeon Scalable processors and 192GB memory

Management node(s): Cray 2828X 2U server with dual Intel Xeon processors

Storage:
• Cray ClusterStor HPC scalable storage for data preparation and model development, including 35 TB of high-performance L300F scalable flash storage and 640 TB of L300N hybrid (SSD/HDD) storage
• NetApp™ E-Series E2800 storage for general-purpose use (home directories, Jupyter notebooks, etc.)

Networking: 100 Gb/s InfiniBand EDR and 10 Gb/s Ethernet

Usability software: Jupyter Notebook, TensorBoard

AI frameworks for model development and training:
• TensorFlow, PyTorch, Keras, BigDL for Spark
• Cray Distributed Training Framework (Cray PE ML Plugin for distributed training, Horovod)
• Cray Hyperparameter Optimization (HPO)

Tools for data preparation: Apache Spark, Anaconda® Distribution (Conda® and Data Science Libraries), Python, distributed Dask, Programming Big Data with R (pbdR)

Rack contents (APC AR3300), in the order listed in the figure, with blank panels omitted:

• 48P GigE Switch (Operations)
• Switch External Power Supply
• 24P GigE Switch (Storage) (x2)
• 36P InfiniBand EDR Switch
• CS-Storm 500NX 8xGPU (x2)
• CS500 2820XT 4-node CPU Chassis (x2)
• KVM
• CS500 2828X Management/Login Node
• NetApp E2800 Shared Storage
• ClusterStor Management Unit (CMU)
• ClusterStor L300F with 38 TB Flash Storage
• ClusterStor Scalable Storage Unit (SSU) with 82 TB

Figure 4. The Cray Geospatial Reference Configuration
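The headline resources of the reference configuration can be totaled with simple arithmetic. The sketch below uses the node counts and memory sizes from the configuration table and assumes 8 GPUs per CS-Storm node, as the "8xGPU" label in the rack figure suggests. The function and constant names are illustrative.

```python
from typing import Dict

# From the configuration table.
GPU_NODES = 2
GPU_MEMORY_GB = 32          # per V100 GPU
CPU_NODES = 8
CPU_NODE_MEMORY_GB = 192

# Assumption: taken from the "8xGPU" label in the rack figure.
GPUS_PER_NODE = 8


def aggregate_resources() -> Dict[str, int]:
    """Total GPU count, GPU memory, and CPU-node memory for the configuration."""
    total_gpus = GPU_NODES * GPUS_PER_NODE
    return {
        "gpus": total_gpus,
        "gpu_memory_gb": total_gpus * GPU_MEMORY_GB,
        "cpu_nodes": CPU_NODES,
        "cpu_node_memory_gb": CPU_NODES * CPU_NODE_MEMORY_GB,
    }


for key, value in aggregate_resources().items():
    print(f"{key}: {value}")
```

Aggregate GPU memory is only a rough guide: a single model replica still has to fit within the memory of the GPUs it is placed on.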

The Cray geospatial reference architecture begins with heterogeneous

compute — specifically, a mix of Cray® CS500™ CPU nodes and Cray®

CS-Storm™ accelerated GPU nodes. Cray CS500 CPU nodes are highly scalable

and modular compute platforms based on the latest x86 processing technologies

from Intel and AMD. Cray CS-Storm GPU-accelerated nodes are deep

learning platforms featuring NVIDIA® Tesla® V100 GPUs, the NVIDIA® NVLink™

GPU-to-GPU interconnect, and a robust deep learning environment. Mixed use

of CPU- and GPU-oriented nodes addresses the varying requirements of the AI

workflow for geospatial object detection by providing the optimal processor for

each stage of the job. Cray’s geospatial reference architecture is highly flexible

and tailorable to the specific needs of each customer, allowing researchers,

developers, and data scientists to tailor their data preparation and training

approaches to take advantage of multi-node architecture and train multiple

models simultaneously.

Cray® ClusterStor® HPC storage is a hybrid storage solution that deploys the

right storage technology at the right time for each stage of the AI workflow

for geospatial object detection. The ability to match each storage technology’s

strengths to specific activities and data requirements within the workflow creates

a storage solution that maximizes performance at the lowest overall cost.

The Cray ClusterStor L300N storage system is a hybrid SSD/HDD solution with

flash-accelerated NXD software that redirects I/O to the appropriate storage

medium. This storage system excels at handling the raw geospatial data in the

data generation and data preparation stages of the workflow.

The ClusterStor L300F scalable storage system adds a flash storage pool to

create a truly hybrid system, a capability crucial to the model development and

model implementation stages of the workflow.

https://www.cray.com/products/computing/cs-series/cs500

https://www.cray.com/products/computing/cs-series/cs-storm


https://www.cray.com/products/storage/clusterstor

Below is a sample Cray ClusterStor configuration:

Finally, the Urika®-CS AI and analytics software suite helps IT administrators and

data scientists reduce the complexity of deploying new AI applications. For IT

administrators, the software simplifies the process of deploying open-source

AI frameworks and tools into their specific AI environment. For data scientists,

it comes pre-integrated with the right tools for each stage of the AI workflow,

including the most common data preparation tools (Anaconda Python tools), AI

frameworks (TensorFlow, PyTorch, etc.), and Cray-developed and/or integrated

tools for distributed machine or deep learning training.

REFERENCE CONFIGURATION

• IB EDR Network Switches
• Management Network Switches
• System Management Unit (SMU)
• Metadata Management Unit (MMU)
• ClusterStor L300F Flash Storage (2U): flash pool in Lustre file system; 32 TB usable capacity (with 3.2 TB SSD); 10/20 GB/sec read/write; IB EDR
• ClusterStor L300N Disk Storage (5U): HDD pool in Lustre file system; 760 TB usable capacity (with 12 TB HDD); 10/10 GB/sec read/write; IB EDR
• Single namespace
• Scales out linearly to hundreds of ClusterStor L300N/F units

Note: Home directories of the data scientists (for Jupyter notebooks, etc.) are provided by a separate NAS share.

Figure 5. Cray ClusterStor Configuration
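If capacity and bandwidth scale linearly with the number of storage units, as the sample configuration states, sizing a larger system reduces to a ceiling division. The sketch below uses the per-unit L300N figures from the sample configuration (760 TB usable, 10 GB/sec read). The target capacity is a hypothetical requirement, and perfectly linear scaling is an idealization.

```python
import math
from typing import Tuple

# Per-unit figures for ClusterStor L300N from the sample configuration.
L300N_USABLE_TB = 760
L300N_READ_GB_PER_S = 10


def units_needed(target: float, per_unit: float) -> int:
    """Smallest number of identical units whose combined figure meets the target."""
    if per_unit <= 0:
        raise ValueError("per_unit must be positive")
    return max(1, math.ceil(target / per_unit))


def scaled_l300n(units: int) -> Tuple[int, int]:
    """(usable TB, read GB/s) for a number of L300N units, assuming linear scaling."""
    return units * L300N_USABLE_TB, units * L300N_READ_GB_PER_S


# Hypothetical requirement: 3,000 TB of usable HDD capacity.
units = units_needed(3000, L300N_USABLE_TB)
capacity_tb, read_gb_per_s = scaled_l300n(units)
print(f"{units} units -> {capacity_tb} TB usable, {read_gb_per_s} GB/s aggregate read")
```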

https://www.cray.com/products/analytics/urika-cs

Helping Geospatial Customers Solve Complex Computing Challenges

As a longtime leader in the supercomputing industry, Cray has a solid track

record of helping researchers, businesses, and government organizations

solve highly complex computing challenges. We recognize that our customers

increasingly want to leverage AI-based techniques like deep learning, but we

also know that designing a computing architecture around deep learning can be

a complex and arduous task. Our design approach is geared toward intelligently

designing an entire solution that recognizes the entire workflow, not just a single

stage, for maximum cost-effectiveness and computational efficiency.

Our flexible and customizable geospatial reference architecture gives customers

a solution that is most appropriate for their specific geospatial data use. A

system configuration that teams heterogeneous compute nodes, hybrid storage,

and software that supports AI brings together the optimal components to tackle

the complexity and size of geospatial data, from data preparation to model

development at-scale. And employing a workflow approach makes it easier to

assign the right technology to the right job at the right time, ultimately resulting

in a solution that maximizes performance at the lowest overall cost. Additionally,

Cray’s strong presence in the government, defense/intelligence, and weather/climate

markets offers proof that customers using geospatial data every day

already rely on Cray systems and solutions.

Summary

Geospatial object detection turns data into insight that can be leveraged in a

broad range of real-world applications. But the complexity of ingesting, storing,

processing, and analyzing geospatial data requires a flexible and balanced

computing architecture. Cray’s geospatial reference architecture addresses the

needs of object detection applications by teaming heterogeneous compute,

hybrid storage, and AI software in a highly flexible and configurable architecture.

At Cray, our goal is to provide you with the right technology to solve your specific

challenges. We’re committed to designing and delivering fully configured and

tightly integrated storage and compute solutions that offer unmatched performance

at scale to enable any organization to apply advanced analytics and AI

to new problems involving geospatial data.

©2019 Cray Inc. All rights reserved. www.cray.com, Cray, the Cray logo, ClusterStor, and Urika are registered trademarks and CS500 and CS-Storm are trademarks of Cray Inc. All other trademarks mentioned herein are the properties of their respective owners. 20190522
