
Large-Scale Spatial Query Processing

on GPU-Accelerated Big Data Systems

Jianting Zhang1,2 Simin You2

1 Department of Computer Science, CUNY City College (CCNY)

2 Department of Computer Science, CUNY Graduate Center

http://www-cs.ccny.cuny.edu/~jzhang/

Outline

• Introduction
  • Spatial data, GIS, Big Data and HPC
  • Taxi trip data in NYC and Global Biodiversity Applications
  • Spatial query processing on GPUs
• ISP-GPU
  • Architecture and Implementations
  • Experiment Results
• Alternative Techniques
  • SpatialSpark
  • Lightweight Distributed Execution (LDE) Engine
• Summary and Future Work

[Diagram: GIS at the intersection of computational geometry; spatial databases (data modeling, indexing, query processing); scientific data/information visualization; statistics/machine learning; computer graphics; and image processing/computer vision, with application domains spanning environmental modeling (air quality, remote sensing, hydrology, ecology) and social-economic modeling (urban planning, transportation, census/taxation, social studies).]

Big Geospatial Data Challenges

• Event locations, trajectories and O-D data
  • E.g., taxi trip records (GPS traces or O-D locations)
  • 0.5 million trips per day in NYC (medallion taxi cabs only) and 1.2 million per day in Beijing
  • From O-D locations to trajectories to frequent patterns
• Satellite imagery: e.g., from GOES to GOES-R (2015/2016) [$11B]
  • http://www.goes-r.gov/downloads/GOES-R-Tri-10-06-09_v7.pdf
  • Spectral (3X) × spatial (4X) × temporal (5X) = 60X
  • At 2 km × 2 km resolution, 5-minute cadence and 16 bands: (360×60) × (180×60) × (12×24) × 16 ≈ 1+ trillion pixels per day
  • Derived thematic data products (vector)
    • http://www.goes-r.gov/products/baseline.html
    • http://www.goes-r.gov/products/option2.html
• Species distributions
  • E.g., 400+ million occurrence records (GBIF)
  • E.g., 717,057 polygons with 78,929,697 vertices for 4,148 bird species' distribution data (NatureServe)

Cloud computing + MapReduce + Hadoop

[Diagram: a heterogeneous compute node: a CPU host (chip multiprocessor with per-core local caches, a shared cache, DRAM and HDD/SSD storage) connected via PCI-E to a GPU (SIMD cores organized into thread blocks, with GDRAM) and to a MIC coprocessor (many 4-thread in-order cores with local caches on a ring bus).]

16 Intel Sandy Bridge CPU cores + 128 GB RAM + 8 TB disk + GTX TITAN + Xeon Phi 3120A ≈ $9,994

ASCI Red (1997): the first system to sustain 1 teraflops, with 9,298 Intel Pentium II Xeon processors in 72 cabinets.

Nvidia GTX Titan (Feb. 2013):
• 7.1 billion transistors (551 mm²)
• 2,688 processors
• 4.5 TFLOPS SP and 1.3 TFLOPS DP
• Max bandwidth 288.4 GB/s
• PCI-E peripheral device
• 250 W (17.98 GFLOPS/W SP)
• Suggested retail price: $999

What can we do today using a device that is more powerful than ASCI Red was 16 years ago?

Geospatial Technologies and Environmental CyberInfrastructure (GeoTECI) Lab

Dr. Jianting Zhang
Department of Computer Science
The City College of New York

Affiliated institutions / collaborating institutions

Students: Simin You (Ph.D., 2009-), Siyu Liao (Ph.D., 2014-), Costin Vicoveanu (undergraduate, 2014-), Bharat Rosanlall (undergraduate, 2014), Jay Yao (MS thesis, 2011-2012), Chandrashekar Singh (MS, 2013), Agniva Banerjee (MS, 2012), Roger King (MS, 2012), Wahyu Nugroho (MS, 2011), Xiao Quan Cen Feng (MS, 2011), Chetram Dasrat (undergraduate, 2008)

GeoTECI@CCNY

HIGHEST-DB: HIgh-performance GrapHics units based Engine for Spatial-Temporal data
• Spatial and spatiotemporal indexing, query processing and optimization
• Trajectory data management on GPUs
  • Segmentation, simplification, compression, aggregation and warehousing
  • Map matching with road networks
  • Data mining (moving clusters, convoys, swarms, ...)

... when yellow cabs, green cabs and MTA buses meet with multi-core CPUs, GPUs and MICs in NYC ...

$449,845/4yr (08/01/2013-07/31/2017)

GeoTECI@CCNY

[Diagram: a zonal-statistics workflow: high-resolution satellite imagery, in-situ observation sensor data and global/regional climate model outputs feed data assimilation and zonal statistics over ecological, environmental and administrative zones (ROIs), producing temporal trends on a high-end computing facility.]

... when GOES-R satellites, extratropical cyclones and hummingbirds meet with TITAN ...

GeoTECI@CCNY — CCNY Computer Science LAN

• Microway: dual 8-core, 128 GB memory, Nvidia GTX Titan, Intel Xeon Phi 3120A, 8 TB storage
• DIY (×2): dual-core, 8 GB memory, Nvidia GTX Titan, 3 TB storage
• SGI Octane III: dual quad-core, 48 GB memory, Nvidia C2050 ×2, 8 TB storage
• Dell T5400: dual quad-core, 16 GB memory, Nvidia Quadro 6000, 1.5 TB storage
• Dell T5400: dual quad-core, 16 GB memory, Nvidia FX3700 ×2
• Dell T7500: dual 6-core, 24 GB memory, Nvidia Quadro 6000
• Dell T7500: dual 6-core, 24 GB memory, Nvidia GTX 480
• DIY: quad-core (Haswell), 16 GB memory, AMD/ATI 7970
• HP 8740w (×2): quad-core, 8 GB memory, Nvidia Quadro 5000m
• Lenovo T400s
• CUNY HPCC: KVM, "brawny" GPU cluster, "wimpy" GPU cluster
• Web server / Linux app server / Windows app server

... building a highly-configurable experimental computing environment for innovative Big Data technologies ...

Taxi trip data in NYC

[Figure: four histograms over the 2009 NYC taxi trip records, plotting trip counts (up to ~20-30 million per bin) against trip distance (miles, binned up to 20), trip time (minutes, binned up to 50+), speed (MPH, binned up to 50) and fare ($, binned up to 50).]

Overall distributions of trip distance, time, speed and fare (2009)

Taxi trip data in NYC

• How to manage taxi trip data?
  • Geographical Information Systems (GIS)
  • Spatial Databases (SDB)
  • Moving Object Databases (MOD)
• How good are they?
  • Pretty good for small amounts of data
  • But rather poor for large-scale data

Taxi trip data in NYC

• Example 1: loading 170 million taxi pickup locations into PostgreSQL
  UPDATE t SET PUGeo = ST_SetSRID(ST_Point("PULong","PuLat"), 4326);
  105.8 hours!
• Example 2: finding the nearest tax blocks for 170 million taxi pickup locations using the open-source libspatialindex + GDAL
  30.5 hours!

I do not have time to wait... Can we do better?

(Timings on Intel Xeon 2.26 GHz processors with 48 GB memory.)
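For reference, a minimal sketch of the kind of CPU baseline used in Example 2, built on the stock libspatialindex C++ API; the real pipeline's data loading via GDAL and exact distance refinement are omitted, and NNVisitor, the placeholder MBR and the grid of IDs are illustrative:

#include <spatialindex/SpatialIndex.h>
#include <vector>
using namespace SpatialIndex;

// Collects the identifier of the nearest indexed entry.
class NNVisitor : public IVisitor {
public:
    id_type nearest = -1;
    void visitNode(const INode&) override {}
    void visitData(const IData& d) override { nearest = d.getIdentifier(); }
    void visitData(std::vector<const IData*>&) override {}
};

int main() {
    // In-memory R*-tree over tax-block MBRs (2-D, fill factor 0.7, node capacity 100).
    IStorageManager* sm = StorageManager::createNewMemoryStorageManager();
    id_type indexId;
    ISpatialIndex* tree = RTree::createNewRTree(*sm, 0.7, 100, 100, 2,
                                                RTree::RV_RSTAR, indexId);
    double lo[2] = {0.0, 0.0}, hi[2] = {1.0, 1.0};   // one placeholder MBR
    tree->insertData(0, nullptr, Region(lo, hi, 2), /*id=*/42);

    // 1-NN query for a single pickup location; the real workload repeats
    // this for each of the 170 million points.
    double pt[2] = {0.5, 0.5};
    NNVisitor v;
    tree->nearestNeighborQuery(1, Point(pt, 2), v);

    delete tree;
    delete sm;
    return 0;
}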

Global Biodiversity Data at GBIF (http://gbif.org)

Zonal statistics over species range maps: for each (area of interest, species) pair, sum the areas of the intersections between species polygons and query-window polygons, keeping pairs whose total intersection area exceeds a threshold T:

SELECT aoi_id, sp_id, sum(ST_Area(inter_geom))
FROM (
    SELECT aoi_id, sp_id,
           ST_Intersection(sp_geom, qw_geom) AS inter_geom
    FROM SP_TB, QW_TB
    WHERE ST_Intersects(sp_geom, qw_geom)
) AS t
GROUP BY aoi_id, sp_id
HAVING sum(ST_Area(inter_geom)) > T;

Spatial Data Processing on GPUs

http://www-cs.ccny.cuny.edu/~jzhang/papers/gpu_spatial_tr.pdf

Spatial query processing on GPUs

Single-level grid-file based spatial filtering: vertices (of polygons/polylines) and points are both assigned to the cells of a uniform grid, so candidate pairs can be generated per cell with
• perfectly coalesced memory accesses, and
• full use of GPU floating-point computing power.

Nested-loop based refinement then tests each candidate pair exactly; a CPU sketch of this filter-and-refine pattern follows the reference below.

J. Zhang, S. You and L. Gruenwald, "Parallel Online Spatial and Temporal Aggregations on Multi-core CPUs and Many-Core GPUs," Information Systems, vol. 44, pp. 134-154, 2014.
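A minimal serial C++ sketch of that filter-and-refine pattern, under stated assumptions: a fixed cell size, cell indices that fit in 32 bits, and a ray-crossing point-in-polygon test as the refinement step. It illustrates the idea only and is not the paper's GPU kernel:

#include <cmath>
#include <unordered_map>
#include <utility>
#include <vector>

struct Pt { double x, y; };
struct Poly { std::vector<Pt> ring; double minx, miny, maxx, maxy; };

// Ray-crossing point-in-polygon test (the refinement step).
bool contains(const Poly& p, Pt q) {
    bool in = false;
    for (size_t i = 0, j = p.ring.size() - 1; i < p.ring.size(); j = i++) {
        const Pt a = p.ring[i], b = p.ring[j];
        if ((a.y > q.y) != (b.y > q.y) &&
            q.x < (b.x - a.x) * (q.y - a.y) / (b.y - a.y) + a.x)
            in = !in;
    }
    return in;
}

// Filter: bucket points by grid cell, expand each polygon MBR to the cells
// it overlaps, and refine only point/polygon pairs that share a cell --
// the single-level grid-file filtering described above.
std::vector<std::pair<size_t, size_t>>
gridJoin(const std::vector<Pt>& pts, const std::vector<Poly>& polys,
         double cell) {
    // Pack the (cx, cy) cell index into one 64-bit key (assumes 32-bit indices).
    auto key = [](long long cx, long long cy) {
        return (cx << 32) | (cy & 0xffffffffLL);
    };
    std::unordered_map<long long, std::vector<size_t>> buckets;
    for (size_t i = 0; i < pts.size(); ++i)
        buckets[key((long long)std::floor(pts[i].x / cell),
                    (long long)std::floor(pts[i].y / cell))].push_back(i);

    std::vector<std::pair<size_t, size_t>> out;  // (point id, polygon id)
    for (size_t j = 0; j < polys.size(); ++j) {
        const Poly& p = polys[j];
        for (long long cx = (long long)std::floor(p.minx / cell);
             cx <= (long long)std::floor(p.maxx / cell); ++cx)
            for (long long cy = (long long)std::floor(p.miny / cell);
                 cy <= (long long)std::floor(p.maxy / cell); ++cy) {
                auto it = buckets.find(key(cx, cy));
                if (it == buckets.end()) continue;
                for (size_t i : it->second)        // nested-loop refinement
                    if (contains(p, pts[i])) out.push_back({i, j});
            }
    }
    return out;
}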

Spatial query processing on GPUs

[Figure: spatially aggregated taxi data. Top: grid size 256×256, resolution 128 feet; right: grid size 8192×8192, resolution 4 feet.]

Spatial aggregation: 9,424 / 326 = ~30X speedup (8192×8192 grid)
Temporal aggregation: 1,709 / 198 = ~8.6X (by minute); 1,598 / 165 = ~9.7X (by hour)
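The spatial aggregation above is essentially a histogram over grid cells; a minimal serial C++ sketch may help fix the idea (the function name and grid parameters are illustrative, and the GPU version runs the loop in parallel with one thread per point and atomic increments):

#include <vector>

// Count pickups per cell of a uniform nx-by-ny grid anchored at (minx, miny).
std::vector<long long> spatialAggregate(const std::vector<double>& xs,
                                        const std::vector<double>& ys,
                                        double minx, double miny,
                                        double cell, int nx, int ny) {
    std::vector<long long> counts((size_t)nx * ny, 0);
    for (size_t i = 0; i < xs.size(); ++i) {
        int cx = (int)((xs[i] - minx) / cell);
        int cy = (int)((ys[i] - miny) / cell);
        if (cx < 0 || cx >= nx || cy < 0 || cy >= ny) continue;  // outside grid
        ++counts[(size_t)cy * nx + cx];
    }
    return counts;
}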

Spatial query processing on GPUs

Datasets joined against ~170 million taxi pickup locations:
• 147,011 street segments
• 38,794 census blocks (470,941 points)
• 735,488 tax blocks (4,698,986 points)

            P2N-D     P2P-T     P2P-D
CPU time    -         15.2 h    30.5 h
GPU time    10.9 s    11.2 s    33.1 s
Speedup     -         4,900X    3,200X

Sources of the speedup: algorithmic improvement (3.7X), main-memory data structures (37.4X) and GPU acceleration (24.3X).

Outline

• Introduction
  • Spatial data, GIS, Big Data and HPC
  • Taxi trip data in NYC and Global Biodiversity Applications
  • Spatial query processing on GPUs
• ISP-GPU
  • Architecture and Implementations
  • Experiment Results
• Alternative Techniques
  • SpatialSpark
  • Lightweight Distributed Execution (LDE) Engine
• Summary and Future Work

ISP-GPU: Scaling out Geospatial Data Processing to GPU Clusters

http://www.slideshare.net/hadooparchbook/impala-architecture-presentation

Attractive features of Cloudera Impala:
• SQL frontend: translates SQL queries into execution plans
• C/C++ backend with SSE4 support (for string operations)
• Efficient implementations of hash joins (partitioned and non-partitioned)
• LLVM-based JIT compilation
• ...

Extending it is challenging!

ISP-GPU: Scaling out Geospatial Data Processing to GPU Clusters

// Impala's BlockingJoinNode extended into a spatial join plan node:
class SpatialJoinNode : public BlockingJoinNode {
 public:
  SpatialJoinNode(ObjectPool* pool, const TPlanNode& tnode,
                  const DescriptorTbl& descs);
  virtual Status Prepare(RuntimeState* state);
  virtual Status GetNext(RuntimeState* state, RowBatch* row_batch, bool* eos);
  virtual void Close(RuntimeState* state);

 protected:
  virtual Status InitGetNext(TupleRow* first_left_row);
  virtual Status ConstructBuildSide(RuntimeState* state);

 private:
  boost::scoped_ptr<TPlanNode> thrift_plan_node_;
  RuntimeState* runtime_state_;
};

// GPU-side spatial primitives invoked by the node:
create_rtree(…)
pip_join(…)
nearest_join(…)
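To make the control flow concrete, here is a self-contained toy C++ sketch of the build-once/probe-per-batch pattern the node follows (ConstructBuildSide indexes the polygon side, GetNext joins each incoming batch of points); create_index and the box-based pip_join below are illustrative stand-ins, not ISP-GPU's actual primitives:

#include <cstdio>
#include <utility>
#include <vector>

struct Point { double x, y; };
struct Box { double minx, miny, maxx, maxy; };

// Stand-in for create_rtree: here the "index" is just the MBR list itself.
std::vector<Box> create_index(const std::vector<Box>& polys) { return polys; }

// Stand-in for pip_join: report (point, box) pairs where the point falls
// inside the box; a real implementation refines against exact polygons.
std::vector<std::pair<int, int>> pip_join(const std::vector<Box>& index,
                                          const std::vector<Point>& pts) {
    std::vector<std::pair<int, int>> out;
    for (int i = 0; i < (int)pts.size(); ++i)
        for (int j = 0; j < (int)index.size(); ++j)
            if (pts[i].x >= index[j].minx && pts[i].x <= index[j].maxx &&
                pts[i].y >= index[j].miny && pts[i].y <= index[j].maxy)
                out.push_back({i, j});
    return out;
}

int main() {
    // ConstructBuildSide(): index the polygon side once.
    std::vector<Box> polys = {{0, 0, 1, 1}, {2, 2, 3, 3}};
    auto index = create_index(polys);

    // GetNext(): join each probe batch of points against the built index.
    std::vector<Point> batch = {{0.5, 0.5}, {2.5, 2.9}, {9, 9}};
    for (auto& m : pip_join(index, batch))
        std::printf("point %d in polygon %d\n", m.first, m.second);
    return 0;
}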

ISP-GPU: Scaling out Geospatial Data Processing to GPU Clusters

Scalable and Efficient Spatial Data Management on Multi-Core CPU and GPU Clusters, IEEE HardBD'15 Workshop
http://www-cs.ccny.cuny.edu/~jzhang/papers/isp_gpu_tr.pdf

End-to-end runtimes (seconds):

              ISP-GPU   ISP-MC+   GPU-Standalone   MC-Standalone
taxi-nycb     96        130       50               89
GBIF-WWF      1,822     2,816     1,498            2,664

• taxi-nycb: ~170 million points, ~40 thousand polygons (9 vertices/polygon)
• GBIF-WWF: ~375 million points, ~15 thousand polygons (279 vertices/polygon)
• Single-node (standalone) results: 16-core CPU, 128 GB RAM, GTX Titan
• Cluster results: 2-10 nodes, each with 8 vCPU cores/15 GB RAM and 1,536 CUDA cores/4 GB (50 million species locations used due to the memory constraint)

Outline

• Introduction
  • Spatial data, GIS, Big Data and HPC
  • Taxi trip data in NYC and Global Biodiversity Applications
  • Spatial query processing on GPUs
• ISP-GPU
  • Architecture and Implementations
  • Experiment Results
• Alternative Techniques
  • SpatialSpark
  • Lightweight Distributed Execution (LDE) Engine
• Summary and Future Work

Alternative Techniques

// Assumed imports (Spark core, JTS and SpatialSpark; the exact SpatialSpark
// package paths for SpatialOperator and BroadcastSpatialJoin are omitted here):
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import com.vividsolutions.jts.io.WKTReader
import scala.util.Try

val sc = new SparkContext(conf)

// Read left-side data from HDFS, split fields, and key each record by index.
val leftData = sc.textFile(leftFile, numPartitions)
  .map(x => x.split(SEPARATOR)).zipWithIndex()

// Parse WKT geometries, dropping malformed records via Try.
val leftGeometryById = leftData
  .map(x => (x._2, Try(new WKTReader().read(x._1.apply(leftGeometryIndex)))))
  .filter(_._2.isSuccess).map(x => (x._1, x._2.get))

// ... similarly for right-side data ...

// Broadcast-based spatial join; NearestD can be applied similarly.
val joinPredicate = SpatialOperator.Within
var matchedPairs: RDD[(Long, Long)] =
  BroadcastSpatialJoin(sc, leftGeometryById, rightGeometryById, joinPredicate)

SpatialSpark: just open-sourced
http://simin.me/projects/spatialspark/

Large-Scale Spatial Join Query Processing in Cloud (comparison with ISP-MC), IEEE CloudDM'15 Workshop
http://www-cs.ccny.cuny.edu/~jzhang/papers/spatial_cc_tr.pdf

Alternative Techniques

Lightweight Distributed Execution (LDE) Engine for Large-Scale Spatial Join Query Processing
http://www-cs.engr.ccny.cuny.edu/~jzhang/papers/lde_spatial_tr.pdf

Summary and Future Work

• Designs and implementations of an in-memory spatial data management system on multi-core CPU and many-core GPU clusters, by extending Cloudera Impala for distributed spatial join query processing.
• Experiments on the initial implementations have revealed both advantages and disadvantages of extending a tightly-coupled big data system to support spatial data types and their operations.
• Alternative techniques are being developed to further improve efficiency, scalability, extensibility and portability.

Q&A

jzhang@cs.ccny.cuny.edu

http://www-cs.ccny.cuny.edu/~jzhang/
