DEEP LEARNING-DRIVEN GEOSPATIAL OBJECT
DETECTION FOR REAL-WORLD INSIGHT
Geospatial Object Detection Overview
The broadening availability of geospatial data and advances in computational
ability are helping organizations in a wide range of industries leverage geospatial
data in new ways. As techniques based on artificial intelligence (AI)
permeate virtually every application, tasks such as geospatial object
detection are becoming more automated than ever before. However, distinct
challenges across the entire AI workflow for geospatial object detection
require computing architectures that combine heterogeneous compute
nodes, hybrid storage, and software that supports AI.
Cray’s geospatial reference architecture addresses the compute and storage
requirements of this sector by providing a highly flexible and effective way
to apply advanced analytics and AI methods to new problems involving
geospatial data.
This white paper explores the industry trends accelerating geospatial data
usage, the compute and storage requirements of the AI workflow for geospatial
object detection, and the systems-level approach to designing deep learning
platforms for geospatial object detection.
Turning Geospatial Data into Business Insight
Geospatial data is playing an increasingly critical role among both governmental
and commercial organizations for a range of applications, including
environmental monitoring, disaster support, land use planning, geographic
information system (GIS) updates, precision agriculture, urban planning, and
many more. The ability to ingest, store, process, analyze, and make predictions
using geospatial data is becoming ever more important to addressing real-world
concerns for a variety of organizations, such as weather centers, national labs,
defense agencies, local governments, and healthcare, agriculture, insurance, and
transportation companies.
In the past, geospatial data was difficult to access and limited largely to
capital-intensive industries like oil & gas or for governmental purposes such as
land-use planning, intelligence, and defense. But now, geospatial data is being
adopted on a much broader level and by any organization looking to derive
actionable intelligence from the ability to analyze the location of earth objects.
This shift is being driven by a dramatic increase in the availability of geospatial
data. In recent years, government entities have invested in making geospatial
data more available, and the popularity of private data set providers like Google
Earth has helped to expand its accessibility to anyone with a laptop and an
internet connection. Increasing commercial interest in bolstering private launch
capabilities has helped a number of private enterprises launch their own earth
observation systems or remote sensing technologies in order to augment publicly
available geospatial data. These trends are enabling a wide range of potential
use cases that leverage geospatial data to drive business value, and that extend
beyond the standard industries that are already accustomed to using it.
The explosion of geospatial data usage has also been supported by processor
technology advances, lower-cost storage architectures, and availability of
frameworks/toolsets for machine and deep learning. The ability to more easily
apply analytics and AI to geospatial data is accelerating time-to-value for
geospatial applications and driving the burgeoning use of AI-based techniques.
Object Detection: A Highly Complex Vision Task
Geospatial analysis has always been a true “big data” use case. Most earth
observation data consists of highly detailed imagery and time-series data in
large file sizes. In fact, some geospatial images can exceed 300 gigabytes,
quickly outgrowing the memory capacity of a single processor. As a result,
processing ever-expanding volumes of geospatial data has required advanced
processors and workload parallelization. For decades, the compute capabilities
offered by supercomputers and other large-scale systems have been well-suited
to handle the intense demands of these highly parallel big data workloads.
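One common way to work with an image that does not fit in memory is to process it as a grid of smaller, overlapping tiles that can be distributed across workers. The sketch below only illustrates that idea; the tile size, the overlap, and the function names are assumptions chosen for the example, not part of the Cray reference architecture.

```python
from typing import Iterator, List, Tuple

# (col_offset, row_offset, width, height), all in pixels
Window = Tuple[int, int, int, int]


def _offsets(length: int, tile: int, stride: int) -> List[int]:
    """Start positions along one axis; stops once a tile reaches the edge."""
    offsets = [0]
    while offsets[-1] + tile < length:
        offsets.append(offsets[-1] + stride)
    return offsets


def tile_windows(width: int, height: int,
                 tile: int = 1024, overlap: int = 128) -> Iterator[Window]:
    """Yield overlapping windows that together cover a width x height raster.

    Tiles on the right and bottom edges are cropped to the raster bounds.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    stride = tile - overlap
    for row in _offsets(height, tile, stride):
        for col in _offsets(width, tile, stride):
            yield (col, row, min(tile, width - col), min(tile, height - row))


# Hypothetical usage: enumerate the tiles of a 2048 x 1024 pixel raster.
windows = list(tile_windows(2048, 1024))
```

Each window can then be read and processed independently. The overlap is there so that an object lying across a tile boundary still appears whole in at least one tile, provided the object is smaller than the overlap.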
Deep learning has emerged as a promising way to improve computer-driven
visual recognition tasks. In the geospatial industry, the availability of deep learning
tools has enabled the use of AI in the labor-intensive task of identifying features
or objects in detailed imagery. Today a variety of commercial organizations are
using it to drive improvements to their processes (like an insurance company
using satellite imagery analysis to improve actuarial modeling, or a logistics
company using geospatial data to visually track equipment and deliveries).
Object detection in optical remote sensing images has a wide range of applications.
Geospatial object detection with deep learning can not only determine
whether an aerial image contains one or more objects belonging to a class
of interest, but also pinpoint how many objects are present along with the
geographical position of each. In the images below, a deep learning model is
used to identify different objects, including airplanes, tennis/baseball courts, and
vehicles traveling along a highway.
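The three outputs described above (whether a class is present, how many objects there are, and where each one is) can be illustrated with a small post-processing sketch. It assumes a detector that returns pixel-space bounding boxes with confidence scores, and an image georeferenced by a six-element affine geotransform in the GDAL convention. The `Detection` type, the 0.5 score threshold, and the sample coordinates are hypothetical and do not come from the paper.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Detection:
    label: str                               # object class, e.g. "airplane"
    score: float                             # detector confidence in [0, 1]
    box: Tuple[float, float, float, float]   # (col_min, row_min, col_max, row_max)


def pixel_to_geo(col: float, row: float, gt: Sequence[float]) -> Tuple[float, float]:
    """Map a pixel position to map coordinates with an affine geotransform."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def summarize(detections: List[Detection], gt: Sequence[float],
              min_score: float = 0.5):
    """Return per-class counts and the map position of each box center."""
    kept = [d for d in detections if d.score >= min_score]
    counts = Counter(d.label for d in kept)
    positions = []
    for d in kept:
        col_min, row_min, col_max, row_max = d.box
        center = pixel_to_geo((col_min + col_max) / 2, (row_min + row_max) / 2, gt)
        positions.append((d.label, center))
    return counts, positions


# Hypothetical north-up image with 0.5 m pixels.
geotransform = (500000.0, 0.5, 0.0, 4100000.0, 0.0, -0.5)
detections = [
    Detection("airplane", 0.90, (100, 200, 140, 240)),
    Detection("vehicle", 0.30, (10, 10, 20, 20)),   # below threshold, dropped
]
counts, positions = summarize(detections, geotransform)
```

A class is "present" when its count is non-zero, so the same summary answers all three questions.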
While techniques like deep learning have made extracting meaningful and
actionable information from images easier, detecting specific objects or features
within an image is still a highly complex vision task. Objects belonging to the
same category can vary widely in visual appearance, and aerial images can
sometimes be poor quality or contain visual obstructions such as trees or
large buildings. Objects can sometimes be masked by variations in the image
viewpoint, background clutter, illumination, or shadows. These factors can have
a substantial effect on model accuracy and must be considered when training a
deep learning model for geospatial object detection.
Figure 1. Examples of Deep Learning-Driven Geospatial Object Detection (Source: “Geospatial Object Detection in High Resolution Satellite Images Based on Multi-Scale Convolutional Neural Network,” Remote Sensing, 2018)
There are many examples of companies applying deep learning models to
geospatial data to solve real-world problems. In 2017, to drive progress in
geospatial object detection, the U.S. Intelligence Advanced Research Projects
Activity (IARPA) sponsored the Functional Map of the World Challenge, calling
for new algorithms to classify facility, building, and land use from satellite
imagery. For this challenge, companies employed deep learning to create the
fastest and most accurate classification algorithms possible, analyzing
approximately 4 terabytes of data. The winner generated a total of 12 trained deep
learning models from four variants of the dataset and three model variants.
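The count of 12 follows from pairing every dataset variant with every model variant (4 x 3). The short sketch below only enumerates such a grid of training runs; the variant names are placeholders, since the paper does not describe what the variants were.

```python
from itertools import product

# Placeholder names: the actual variants used by the winning entry are not given.
dataset_variants = [f"dataset_variant_{i}" for i in range(1, 5)]  # four variants
model_variants = [f"model_variant_{j}" for j in range(1, 4)]      # three variants

# One training run per (dataset variant, model variant) pair.
training_runs = list(product(dataset_variants, model_variants))
```

Because the runs are independent of one another, a grid like this is a natural fit for training several models at the same time on separate nodes.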
The AI Workflow for Geospatial Object Detection
Organizations interested in applying advanced analytics and AI to new problems
using geospatial data require a system configuration that combines heterogeneous
compute nodes, hybrid storage, and software that supports AI. This
system configuration addresses the requirements of a workflow geared toward
developing a model that can successfully and accurately convert raw data into
meaningful insights.
To help customers quickly move from experimentation to production, Cray
examined the AI workflow for geospatial object detection and developed a
reference configuration matched to the workflow requirements. The AI workflow
for geospatial object detection is very similar to the standard AI workflow used
for computer-driven image analysis and object detection.
Figure 2. Bounding Box Used for Model Training
The AI workflow for geospatial object detection consists of four general
stages: data generation, data preparation, model development, and model
implementation. Each stage has distinct infrastructure requirements in terms
of both compute and storage, and a balanced system is required in order to
cost-effectively handle both data capture and preprocessing at scale and model
training within a defined time frame.
In the data generation stage, the storage subsystem for preprocessing ingests
geospatial data generated by remote sensing systems. From a compute
perspective, ingest performance and sufficient throughput are important in this
stage to prevent bottlenecks while bringing massive amounts of data into the
system quickly. Hard disk drive (HDD) storage offers higher capacity per dollar
than flash, making it the most appropriate storage for this stage of the workflow.
Figure 3. The AI Workflow for Geospatial Object Detection with Infrastructure Requirements
Data generation: ingest from data sources (e.g., remote sensing, satellite images, maps); HDD storage (Cray ClusterStor L300N)
Data preparation: CPU nodes (Cray CS500); HDD storage (Cray ClusterStor L300N)
Model development: dense GPU nodes (Cray CS-Storm); flash storage (Cray ClusterStor L300F)
Model implementation: inference on CPU nodes (Cray CS500) with HDD storage (Cray ClusterStor L300N); model retraining on dense GPU nodes (Cray CS-Storm) with flash storage (Cray ClusterStor L300F)
The data preparation stage is the most labor-intensive portion of the
geospatial object detection AI workflow. Raw data must first be converted
into an appropriate format for the model, bounding boxes must be constructed,
and other preprocessing must occur before the data is suitable for training
and validating one or more models. Data preprocessing of this nature is almost
always done on CPU nodes because they cost less and doing so avoids tying up
GPU nodes with CPU tasks. Additionally, most data preprocessing code has been
written for CPUs and would require significant reworking to leverage a GPU.
Achieving high input/output operations per second (IOPS) to and from the
storage subsystem isn’t a primary concern during data preparation, which again
makes cost-effective HDD storage the most appropriate storage technology for
this stage of the workflow.
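As one illustration of the format conversion involved in data preparation, the sketch below turns a bounding box annotated in pixel corner coordinates into a normalized center/size form, which several detection frameworks accept as a training label. The paper does not specify a target format, so this convention and the function name are assumptions made for the example.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def to_normalized_center_box(box: Box, image_width: int, image_height: int) -> Box:
    """Convert (col_min, row_min, col_max, row_max) in pixels to
    (x_center, y_center, width, height), each as a fraction of the image size.

    The box is clipped to the image bounds first; a box with no area left
    after clipping is rejected.
    """
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    col_min, row_min, col_max, row_max = box
    col_min = min(max(float(col_min), 0.0), float(image_width))
    col_max = min(max(float(col_max), 0.0), float(image_width))
    row_min = min(max(float(row_min), 0.0), float(image_height))
    row_max = min(max(float(row_max), 0.0), float(image_height))
    if col_max <= col_min or row_max <= row_min:
        raise ValueError("bounding box has no area inside the image")
    x_center = (col_min + col_max) / 2 / image_width
    y_center = (row_min + row_max) / 2 / image_height
    width = (col_max - col_min) / image_width
    height = (row_max - row_min) / image_height
    return (x_center, y_center, width, height)


# Hypothetical annotation on a 1000 x 800 pixel training image.
label = to_normalized_center_box((100, 200, 300, 400), 1000, 800)
```

The work is simple per-box arithmetic repeated over a very large number of annotations, which is the kind of task the text above assigns to CPU nodes.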
In the model development stage, the dataset is used to begin creating and
training deep learning models. This highly iterative stage involves a large
amount of trial and error, with models requiring repeated re-training until they
achieve the desired degree of accuracy. To achieve the highest effectiveness and
shortest time-to-results, this training stage runs best using dense GPU nodes.
Model training typically employs software tools that either can only run, or run
much faster, using a GPU node. GPU-based training is also I/O-bound, meaning
the ability to train models within a defined time frame is dependent on the
storage system’s ability to continually feed data into the model. These factors
make flash-based storage the most appropriate storage technology for this
stage of the workflow.
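A rough way to reason about the I/O-bound behavior described above is to compare the read rate the GPUs demand with the rate the storage tier can sustain. The sketch below is a back-of-the-envelope estimate only. The GPU count, per-GPU sample rate, sample size, and storage bandwidth are illustrative assumptions rather than measurements, and the estimate ignores caching, compression, and data augmentation.

```python
def required_read_gb_per_s(num_gpus: int,
                           samples_per_s_per_gpu: float,
                           bytes_per_sample: float) -> float:
    """Aggregate read bandwidth (GB/s, decimal) needed to keep all GPUs fed."""
    if num_gpus < 0 or samples_per_s_per_gpu < 0 or bytes_per_sample < 0:
        raise ValueError("inputs must be non-negative")
    return num_gpus * samples_per_s_per_gpu * bytes_per_sample / 1e9


def storage_limited(required_gb_per_s: float, available_gb_per_s: float) -> bool:
    """True when storage, not the GPUs, caps training throughput."""
    return required_gb_per_s > available_gb_per_s


# Hypothetical workload: 16 GPUs, 500 image tiles per second per GPU,
# 1.5 MB per tile, against a storage tier sustaining 10 GB/s of reads.
demand = required_read_gb_per_s(16, 500, 1_500_000)
bottlenecked = storage_limited(demand, 10.0)
```

Under these assumed numbers the demand is 12 GB/s, so the GPUs would wait on storage. An estimate like this is one way to judge how much flash bandwidth a training workload needs.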
In the model implementation stage, the completed model is integrated into
the application that will be used to solve a real-world problem. Model
inference follows: the finished model is applied continuously to new data, and
its results are compared to real-life data to test the accuracy of the model.
Inference itself does not update the model; learning from new data happens
through model retraining. CPU-based nodes and HDD storage are adequate for the
less computationally intensive sub-phase of new data processing, while GPU
nodes and flash storage are required for model retraining.
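The stage-by-stage recommendations above can be captured as a small lookup table, for example to route jobs to the right partition of a cluster. This is a minimal sketch: the `Resources` type, the stage keys, and the helper function are hypothetical, and the compute entry for data generation is left generic because the text does not name a node type for that stage.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Resources:
    compute: str
    storage: str


# Compute and storage per workflow stage, as described in the text.
WORKFLOW: Dict[str, Resources] = {
    "data generation": Resources("ingest (node type not stated)", "HDD"),
    "data preparation": Resources("CPU nodes", "HDD"),
    "model development": Resources("dense GPU nodes", "flash"),
    "model implementation: inference": Resources("CPU nodes", "HDD"),
    "model implementation: retraining": Resources("dense GPU nodes", "flash"),
}


def resources_for(stage: str) -> Resources:
    """Look up the recommended resources for a workflow stage."""
    key = stage.strip().lower()
    if key not in WORKFLOW:
        raise ValueError(f"unknown workflow stage: {stage!r}")
    return WORKFLOW[key]


# Hypothetical usage: decide where a retraining job should run.
target = resources_for("Model implementation: retraining")
```

Keeping the mapping in one place makes it visible that model implementation is the only stage that spans both hardware tiers.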
Infrastructure for the Entire AI Workflow
Cray’s geospatial reference architecture is designed for AI applications where
deep learning is used to prepare and interpret image data, develop complex
neural network models, and locate or classify objects, such as in geospatial
object detection. The Cray geospatial reference configuration is shown below.
Geospatial Reference Configuration
CAPABILITY: COMPONENT
AI deep learning node(s): 2 x Cray CS-Storm 500NX nodes, each with NVIDIA Tesla “Volta” 32GB V100 SXM2 GPU accelerators and 1 Intel Xeon Scalable “Skylake” processors
Data preparation and AI machine learning node(s): 8 x Cray CS500 2829XT nodes with 18-core Intel Xeon Scalable processors and 192GB memory
Management node(s): Cray 2828X 2U server with dual Intel Xeon processors
Storage: Cray ClusterStor HPC scalable storage for data preparation and model development, including 35 TB of high-performance L300F scalable flash storage and 640 TB of L300N hybrid (SSD/HDD) storage
    NetApp™ E-Series E2800 storage for general purpose use (home directories, Jupyter notebooks, etc.)
Networking: 100 Gb/s InfiniBand EDR and 10 Gb/s Ethernet
Usability software: Jupyter Notebook, TensorBoard
AI frameworks for model development and training: TensorFlow, PyTorch, Keras, BigDL for Spark
    Cray Distributed Training Framework (Cray PE ML Plugin for distributed training, Horovod)
    Cray Hyperparameter Optimization (HPO)
Tools for data preparation: Apache Spark, Anaconda® Distribution (Conda® and Data Science Libraries), Python, distributed Dask, Programming Big Data with R (pbdR)
Figure 4. The Cray Geospatial Reference Configuration
Rack components shown in the figure (APC AR3300 rack):
48-port GigE switch (operations), with switch external power supply
2 x 24-port GigE switch (storage)
36-port InfiniBand EDR switch
2 x CS-Storm 500NX 8xGPU
2 x CS500 2820XT 4-node CPU chassis
KVM
CS500 2828X management/login node
NetApp E2800 shared storage
ClusterStor Management Unit (CMU)
ClusterStor L300F with 38 TB flash storage
ClusterStor Scalable Storage Unit (SSU) with 82 TB
The Cray geospatial reference architecture begins with heterogeneous
compute — specifically, a mix of Cray® CS500™ CPU nodes and Cray®
CS-Storm™ accelerated GPU nodes. Cray CS500 CPU nodes are highly scalable
and modular compute platforms based on the latest x86 processing technologies
from Intel and AMD. Cray CS-Storm GPU-accelerated nodes are deep
learning platforms featuring NVIDIA® Tesla® V100 GPUs, the NVIDIA® NVLink™
GPU-to-GPU interconnect, and a robust deep learning environment. Mixed use
of CPU- and GPU-oriented nodes addresses the varying requirements of the AI
workflow for geospatial object detection by providing the optimal processor for
each stage of the job. Cray’s geospatial reference architecture can be
tailored to the specific needs of each customer, allowing researchers,
developers, and data scientists to adapt their data preparation and training
approaches to take advantage of the multi-node architecture and train multiple
models simultaneously.
Cray® ClusterStor® HPC storage is a hybrid storage solution that deploys the
right storage technology at the right time for each stage of the AI workflow
for geospatial object detection. The ability to match each storage technology’s
strengths to specific activities and data requirements within the workflow creates
a storage solution that maximizes performance at the lowest overall cost.
The Cray ClusterStor L300N storage system is a hybrid SSD/HDD solution with
flash-accelerated NXD software that redirects I/O to the appropriate storage
medium. This storage system excels at handling the raw geospatial data in the
data generation and data preparation stages of the workflow.
The ClusterStor L300F scalable storage system adds a flash storage pool to
create a truly hybrid system, a capability crucial to the model development and
model implementation stages of the workflow.
Below is a sample Cray ClusterStor configuration:
Figure 5. Cray ClusterStor Configuration
Reference configuration (single namespace):
IB EDR network switches
Management network switches
System Management Unit (SMU)
Metadata Management Unit (MMU)
ClusterStor L300F Flash Storage (2U): flash pool in Lustre file system; 32 TB usable capacity (with 3.2 TB SSD); 10/20 GB/sec read/write; IB EDR
ClusterStor L300N Disk Storage (5U): HDD pool in Lustre file system; 760 TB usable capacity (with 12 TB HDD); 10/10 GB/sec read/write; IB EDR
Scales out linearly to hundreds of ClusterStor L300N/F units
Note: Home directories of the data scientists (for Jupyter notebooks, etc.) are provided by a separate NAS share.
Finally, the Urika®-CS AI and analytics software suite helps IT administrators and
data scientists reduce the complexity of deploying new AI applications. For IT
administrators, the software simplifies the process of deploying open-source
AI frameworks and tools into their specific AI environment. For data scientists,
it comes pre-integrated with the right tools for each stage of the AI workflow,
including the most common data preparation tools (Anaconda Python tools), AI
frameworks (TensorFlow, PyTorch, etc.), and Cray-developed and/or integrated
tools for distributed machine or deep learning training.
Helping Geospatial Customers Solve Complex Computing Challenges
As a longtime leader in the supercomputing industry, Cray has a solid track
record of helping researchers, businesses, and government organizations
solve highly complex computing challenges. We recognize that our customers
increasingly want to leverage AI-based techniques like deep learning, but we
also know that designing a computing architecture around deep learning can be
a complex and arduous task. Our approach is to design a complete solution
around the entire workflow, not just a single stage, for maximum
cost-effectiveness and computational efficiency.
Our flexible and customizable geospatial reference architecture gives customers
a solution that is most appropriate for their specific geospatial data use. A
system configuration that teams heterogeneous compute nodes, hybrid storage,
and software that supports AI brings together the optimal components to tackle
the complexity and size of geospatial data, from data preparation to model
development at scale. And employing a workflow approach makes it easier to
assign the right technology to the right job at the right time, ultimately resulting
in a solution that maximizes performance at the lowest overall cost. Additionally,
Cray’s strong presence in the government, defense/intelligence, and weather/
climate markets offers proof that customers using geospatial data every day
already rely on Cray systems and solutions.
Summary
Geospatial object detection turns data into insight that can be leveraged in a
broad range of real-world applications. But the complexity of ingesting, storing,
processing, and analyzing geospatial data requires a flexible and balanced
computing architecture. Cray’s geospatial reference architecture addresses the
needs of object detection applications by teaming heterogeneous compute,
hybrid storage, and AI software in a highly flexible and configurable architecture.
At Cray, our goal is to provide you with the right solution for your specific
challenges. We’re committed to designing and delivering fully configured and
tightly integrated storage and compute solutions that offer unmatched performance
at scale to enable any organization to apply advanced analytics and AI
to new problems involving geospatial data.
©2019 Cray Inc. All rights reserved. www.cray.com. Cray, the Cray logo, ClusterStor, and Urika are registered trademarks and CS500 and CS-Storm are trademarks of Cray Inc. All other trademarks mentioned herein are the properties of their respective owners. 20190522