Property graphs with time
Julia Stoyanovich, joint work with Vera Moffitt
Drexel University, Philadelphia, PA, USA
stoyanovich.org
openCypher Meetup, October 25, 2017
[Figure: panels labeled 2007, 2008, 2009, 2010, 2011]
https://www.kenedict.com/apples-internal-innovation-network-unraveled-part-1-evolving-networks/
https://arxiv.org/abs/1709.06176
Exploratory analysis of evolving graphs
• Which nodes are showing an increasing popularity trend?
• Have any changes in network connectivity been observed?
• At what time scale can interesting trends be observed?
• How can multiple data sources be used jointly to complement or corroborate information about network evolution?
Goal
Principled and systematic support for usable, scalable, and extensible analysis of evolving graphs
Are Alice and Bill connected?
TNGP
… by a path?
Snapshot reducibility
Are Alice and Bill connected?
extended snapshot reducibility
… by a journey?
… by a path that persists over >2 time instants
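The two questions above can be contrasted in a minimal sketch. It assumes a tiny invented evolving graph stored as a dict from time instant to a set of undirected edges; the node "Cathy", the data, and the function names are hypothetical and not part of Portal. The first function answers "connected by a path?" independently in each snapshot; the second asks whether some path has all of its edges present at every one of k consecutive instants.

```python
from collections import deque

# Hypothetical evolving graph: time instant -> set of undirected edges.
snapshots = {
    1: {("Alice", "Cathy")},
    2: {("Alice", "Cathy"), ("Cathy", "Bill")},
    3: {("Alice", "Cathy"), ("Cathy", "Bill")},
    4: {("Alice", "Cathy"), ("Cathy", "Bill")},
    5: {("Cathy", "Bill")},
}

def normalize(edges):
    """Represent each undirected edge as a frozenset so (u, v) == (v, u)."""
    return {frozenset(e) for e in edges}

def connected(edges, src, dst):
    """Breadth-first search over a set of undirected edges (no self-loops)."""
    adj = {}
    for e in edges:
        u, v = tuple(e)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def connected_per_snapshot(snaps, src, dst):
    """Evaluate the non-temporal query separately at every time instant."""
    return {t: connected(normalize(e), src, dst) for t, e in sorted(snaps.items())}

def persistent_path_windows(snaps, src, dst, k):
    """Start instants t where a path exists whose edges are all present
    at each of the k consecutive instants beginning at t."""
    times = sorted(snaps)
    starts = []
    for i in range(len(times) - k + 1):
        window = times[i:i + k]
        common = set.intersection(*(normalize(snaps[t]) for t in window))
        if connected(common, src, dst):
            starts.append(window[0])
    return starts
```

With k = 3 (more than 2 instants), only the window starting at instant 2 qualifies in this toy data, even though Alice and Bill are connected in three individual snapshots.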
TGraph: an evolving property graph
TGA: Temporal Graph Algebra
• Temporal variants of standard graph operators + novel time-specific operators
• Compositional: a TGraph (or a pair of TGraphs) as input, a TGraph as output
• Operations maintain model integrity
- graph integrity at each time instant: no dangling edges, a node/edge appears at most once
- temporal integrity: semantics of temporal operations are automatically enforced (formally: point semantics)
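The graph-integrity conditions above can be checked in a minimal sketch. It assumes a hypothetical record layout of (id, start, end) with half-open validity intervals [start, end) over integer time instants; this layout and the function names are illustrative assumptions, not Portal's API. Under point semantics each interval is simply read as the set of instants it covers.

```python
# Hypothetical records: (id, start, end), valid during the half-open interval [start, end).
nodes = [("a", 1, 5), ("b", 2, 4), ("b", 4, 6)]
edges = [(("a", "b"), 2, 4), (("a", "b"), 4, 5)]

def instants(start, end):
    """Point semantics: an interval stands for the set of instants it covers."""
    return set(range(start, end))

def appears_at_most_once(records):
    """True if no id has two records covering the same time instant."""
    seen = {}
    for key, start, end in records:
        pts = instants(start, end)
        if seen.setdefault(key, set()) & pts:
            return False
        seen[key] |= pts
    return True

def no_dangling_edges(node_records, edge_records):
    """True if, at every instant an edge exists, both of its endpoints exist."""
    alive = {}
    for node_id, start, end in node_records:
        alive.setdefault(node_id, set()).update(instants(start, end))
    for (u, v), start, end in edge_records:
        pts = instants(start, end)
        if not (pts <= alive.get(u, set()) and pts <= alive.get(v, set())):
            return False
    return True
```

An operator that preserves model integrity would return a result for which both checks still hold.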
TGA operations
• trim
• temporal versions of
- vertex-map, edge-map
- subgraph, path
- aggregate messages
- union, intersection, difference (binary)
• snapshot analytics
- PageRank, connected components, … (Pregel)
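A minimal sketch of trim and two of the binary operators under point semantics. It makes simplifying assumptions that are not from the slides: attributes are ignored, and a TGraph is reduced to a dict mapping each node id or edge (a pair of node ids) to the set of instants at which it exists. Each function takes such dicts and returns one, so calls compose.

```python
# Hypothetical TGraphs: element (node id, or edge as a pair) -> set of instants.
g1 = {"a": {1, 2, 3}, "b": {2, 3}, ("a", "b"): {2, 3}}
g2 = {"a": {3, 4}, "c": {4}}

def trim(g, start, end):
    """Keep only the instants in [start, end); drop elements that vanish."""
    out = {}
    for elem, pts in g.items():
        kept = {t for t in pts if start <= t < end}
        if kept:
            out[elem] = kept
    return out

def union(g1, g2):
    """An element exists at instant t if it exists at t in either input."""
    return {e: g1.get(e, set()) | g2.get(e, set()) for e in g1.keys() | g2.keys()}

def intersection(g1, g2):
    """An element exists at instant t only if it exists at t in both inputs."""
    out = {}
    for e in g1.keys() & g2.keys():
        common = g1[e] & g2[e]
        if common:
            out[e] = common
    return out
```

Because every operator maps TGraphs to a TGraph, an expression such as `trim(union(g1, g2), 3, 5)` is itself a valid input to further operators.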
TGA operations
• node creation
- based on temporal window: temporal zoom
- attribute-based: structural zoom
• edge creation
Structural zoom
add university nodes Drexel and CMU, and edges between students and these universities
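A minimal sketch of this kind of attribute-based node creation. The student names, the record layout (student, school, start, end) with half-open intervals, and the function name are invented for illustration. Each distinct value of the school attribute becomes a new node that exists whenever some student has that value, and an edge links each student to the school for the same instants.

```python
# Hypothetical student records: (student, school, start, end), valid in [start, end).
students = [
    ("Ann", "Drexel", 1, 4),
    ("Bob", "CMU", 2, 5),
    ("Cat", "Drexel", 3, 6),
]

def structural_zoom(records):
    """Create one node per school value and student-school edges.

    Returns (university_nodes, edges), each mapping an element to the
    set of instants at which it exists.
    """
    university_nodes = {}
    edges = {}
    for student, school, start, end in records:
        pts = set(range(start, end))
        university_nodes.setdefault(school, set()).update(pts)
        edges.setdefault((student, school), set()).update(pts)
    return university_nodes, edges

universities, enrollments = structural_zoom(students)
```

The Drexel node exists for the union of its students' validity periods, so no created edge is left dangling at any instant.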
Temporal zoom
coarsen taxi trip start-times into 10-min intervals
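A minimal sketch of this coarsening step, assuming hypothetical trip records of the form (start time, pickup zone, drop-off zone); the zone names and data are invented. Each start time is floored to the beginning of its 10-minute window, and trips that fall into the same window between the same zones are counted together.

```python
from collections import Counter
from datetime import datetime, timedelta

def window_start(ts, minutes=10):
    """Floor a timestamp to the start of its window (minutes must divide 60)."""
    return ts - timedelta(
        minutes=ts.minute % minutes,
        seconds=ts.second,
        microseconds=ts.microsecond,
    )

# Hypothetical trips: (start time, pickup zone, drop-off zone).
trips = [
    (datetime(2017, 10, 25, 9, 3, 12), "zoneA", "zoneB"),
    (datetime(2017, 10, 25, 9, 7, 59), "zoneA", "zoneB"),
    (datetime(2017, 10, 25, 9, 10, 0), "zoneA", "zoneB"),
    (datetime(2017, 10, 25, 9, 19, 30), "zoneB", "zoneC"),
]

def coarsen(trip_records, minutes=10):
    """Count trips per (window start, pickup zone, drop-off zone)."""
    counts = Counter()
    for start, src, dst in trip_records:
        counts[(window_start(start, minutes), src, dst)] += 1
    return counts

trip_counts = coarsen(trips)
```

Counting is only one possible aggregation; the same grouping key could carry an average fare or any other per-window summary.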
System architecture
[Figure: Portal architecture. Components: Interactive Shell, Query Parser, System Catalog, Portal Runtime (optimizer, operators, etc.), GraphX Data Structures, SparkSQL, Spark Runtime, and workers, each with Spark Runtime and HDFS]
Spark 2.0, interoperable with SparkSQL and with BigDatalog
Physical data representation
• On-disk: Apache Parquet
- vertex / edge files
- broken down into snapshot groups
- each file sorted on start time followed by node/edge id
• In-memory:
- nested relational (Vertex-Edge RDDs)
- GraphX-based: RepresentativeGraphs (RG), One Graph (OG), HybridGraph (HG)
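A minimal sketch of the idea of storing every node and edge once and tagging it with a bitset of the periods in which it exists, loosely in the spirit of a single-graph representation. This is an assumption-laden illustration in plain Python: the data, the integer-as-bitset encoding, and the function names are hypothetical, and Portal's actual GraphX-based structures are not shown here.

```python
def to_bitset(periods):
    """Encode a collection of period indices as bits of an integer."""
    bits = 0
    for p in periods:
        bits |= 1 << p
    return bits

def exists_in(bits, period):
    """True if the bit for the given period is set."""
    return bool((bits >> period) & 1)

# Hypothetical graph: each element is stored once with its period bitset.
nodes = {1: to_bitset([1, 2, 3, 4]), 2: to_bitset([2, 3, 4, 5]), 3: to_bitset([5])}
edges = {(1, 2): to_bitset([2, 3]), (2, 3): to_bitset([5])}

def snapshot(node_bits, edge_bits, period):
    """Recover the plain graph for one period by testing a single bit."""
    ns = {n for n, bits in node_bits.items() if exists_in(bits, period)}
    es = {e for e, bits in edge_bits.items() if exists_in(bits, period)}
    return ns, es

def edges_consistent(node_bits, edge_bits):
    """An edge may only exist in periods where both endpoints exist."""
    return all(
        bits & ~(node_bits[u] & node_bits[v]) == 0
        for (u, v), bits in edge_bits.items()
    )
```

The trade-off sketched here is that topology is stored once regardless of how many periods an element spans, at the cost of filtering by bit whenever a single snapshot is needed.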
[Figure: elements labeled 1, 2, 3, annotated with BitSet(p1,p2,p3,p4), BitSet(p2,p3,p4,p5), BitSet(p5), BitSet(p1,p2,p3,p4,p5), and BitSet(p2,p3)]
Performance highlights
• 16-node OpenStack cluster
• Apache Spark 2.0
• 4 cores, 16 GB RAM per node
PageRank on wiki-talk
PageRank on nGrams
PageRank on Twitter
Aggregate messages on wiki-talk
Vertex-subgraph on wiki-talk
Portal vs. G*
average node degree, wiki-talk
Take-aways
• TGraph: a logical model of property graphs with time
• TGA: a compositional temporal graph algebra under point semantics
• Portal: a library on top of Apache Spark, interoperable with SparkSQL
• Ongoing work on a declarative language, multi-operator query optimization, benchmarking
• Planned open source release this Fall
References
• Temporal Graph Algebra, Moffitt & Stoyanovich, DBPL 2017.
• Zooming in on NYC taxi data with Portal, Stoyanovich, Gilbride and Moffitt, DSSG 2017 (arXiv).
• Towards sequenced semantics for evolving graphs, Moffitt & Stoyanovich, EDBT 2017.
• Towards a distributed infrastructure for evolving graph analytics, Moffitt & Stoyanovich, TempWeb 2016.
• Vera Moffitt’s Ph.D. thesis.
Thank you!