graphx is the blue ocean for scala engineers @ scala matsuri 2014

GraphX is the blue ocean for Scala Engineers Scala Matsuri 2014 LT @teppei_tosa https://www.flickr.com/photos/exalthim/337922734

Post on 14-Jun-2015

TRANSCRIPT

Page 1: GraphX is the blue ocean for scala engineers @ Scala Matsuri 2014

GraphX is the blue ocean for Scala Engineers

Scala Matsuri 2014 LT

@teppei_tosa https://www.flickr.com/photos/exalthim/337922734

Page 2: GraphX is the blue ocean for scala engineers @ Scala Matsuri 2014

Who am I?

@teppei_tosa

Finance IT Engineer

Asakusa / Hadoop / Scala / PlayFramework / Spark / GraphX

Page 3: GraphX is the blue ocean for scala engineers @ Scala Matsuri 2014

• One of the Spark components

• A graph-parallel computation system

• Unifies graph-parallel and data-parallel computation in one system with a single composable API

Page 4: GraphX is the blue ocean for scala engineers @ Scala Matsuri 2014

Example graph computation: PageRank

Set each vertex to 1 divided by the number of vertices: 0.33, 0.33, 0.33

Divide the value of each vertex by its degree and send the result to its neighbors: 0.17, 0.17, 0.33, 0.33

Sum the values received from neighbors and set the sum as the new vertex value: 0.17, 0.50, 0.33

Repeat these steps until the values converge
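The PageRank steps on this slide can be sketched in plain Scala, without Spark. This is a minimal illustration, not the GraphX implementation: the object name `PageRankSketch`, its helper names, and the four directed edges used in the note below are assumptions, chosen so that the first step reproduces the values in the figure. Like the slide, the sketch omits the damping factor used in full PageRank, and it reads "degree" as out-degree.

```scala
object PageRankSketch {
  type Vertex = Int

  // One round: every vertex splits its value evenly over its outgoing edges,
  // then every vertex sums the values it received.
  // Simplification: a vertex with no outgoing edge sends nothing.
  def step(edges: Seq[(Vertex, Vertex)], ranks: Map[Vertex, Double]): Map[Vertex, Double] = {
    val outDegree: Map[Vertex, Int] =
      edges.groupBy(_._1).map { case (v, es) => v -> es.size }
    val messages: Seq[(Vertex, Double)] =
      edges.map { case (src, dst) => dst -> ranks(src) / outDegree(src) }
    val received: Map[Vertex, Double] =
      messages.groupBy(_._1).map { case (v, ms) => v -> ms.map(_._2).sum }
    ranks.map { case (v, _) => v -> received.getOrElse(v, 0.0) }
  }

  // Starts every vertex at 1 / (number of vertices) and repeats `step`
  // until no value changes by more than `tolerance`, or until
  // `maxIterations` rounds have run. Expects a non-empty vertex set.
  def run(
      edges: Seq[(Vertex, Vertex)],
      vertices: Set[Vertex],
      tolerance: Double = 1e-6,
      maxIterations: Int = 100
  ): Map[Vertex, Double] = {
    val initial: Map[Vertex, Double] =
      vertices.map(v => v -> 1.0 / vertices.size).toMap

    @annotation.tailrec
    def loop(ranks: Map[Vertex, Double], iteration: Int): Map[Vertex, Double] = {
      val next = step(edges, ranks)
      val delta = ranks.keys.map(v => math.abs(next(v) - ranks(v))).max
      if (delta < tolerance || iteration >= maxIterations) next
      else loop(next, iteration + 1)
    }

    loop(initial, 1)
  }
}
```

With the assumed edges 1->2, 1->3, 2->3 and 3->1, one call to `step` on the initial values gives roughly 0.33, 0.17 and 0.50 for vertices 1, 2 and 3, and `run` settles near 0.4, 0.2 and 0.4.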

Page 5: GraphX is the blue ocean for scala engineers @ Scala Matsuri 2014

Difficulty of graph-parallel computation

Because vertices are connected to each other, computing on them in a distributed way requires communication between nodes (Apache Giraph relies on ZooKeeper for coordination)

Page 6: GraphX is the blue ocean for scala engineers @ Scala Matsuri 2014

Unify graph-parallel and data-parallel computation

[Figure: a directed graph with vertices 1, 2, 3 holding the values 10, 20, 30, and edges 1->2 (100), 2->3 (110), 3->2 (120), 3->1 (200)]

Apache Giraph

[1,10,[2,100]]
[2,20,[3,110]]
[3,30,[1,200],[2,120]]

GraphX

ID  VAL
1   10
2   20
3   30

SRC  TGT  VAL
1    2    100
2    3    110
3    2    120
3    1    200

val graph = Graph.fromEdges
graph.joinVertices(…)
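As a hedged sketch of how the two calls on this slide might fit together, the following builds the example graph from the edge table and then joins in the vertex table. `Graph.fromEdges` and `joinVertices` are GraphX calls; the object name `GraphXSketch`, the local master setting, and the default vertex value 0 are assumptions made for illustration. Running it requires Spark with GraphX on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object GraphXSketch {
  // Builds the three-vertex example graph from the slide.
  def build(sc: SparkContext): Graph[Int, Int] = {
    // Edge table: SRC, TGT, VAL
    val edges: RDD[Edge[Int]] = sc.parallelize(Seq(
      Edge(1L, 2L, 100),
      Edge(2L, 3L, 110),
      Edge(3L, 2L, 120),
      Edge(3L, 1L, 200)
    ))
    // Vertex table: ID, VAL
    val vertexValues: RDD[(VertexId, Int)] =
      sc.parallelize(Seq((1L, 10), (2L, 20), (3L, 30)))

    // fromEdges creates every vertex mentioned by an edge and gives it
    // the default value (0 here, an arbitrary placeholder).
    val graph: Graph[Int, Int] = Graph.fromEdges(edges, defaultValue = 0)

    // joinVertices replaces the placeholder with the value from the table.
    graph.joinVertices(vertexValues)((_, _, value) => value)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("graphx-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Each triplet carries an edge together with both endpoint values.
      build(sc).triplets.collect().foreach { t =>
        println(s"${t.srcId}(${t.srcAttr}) -> ${t.dstId}(${t.dstAttr}): ${t.attr}")
      }
    } finally sc.stop()
  }
}
```

The vertex and edge tables stay ordinary RDDs, so the same data can be processed with data-parallel operations before or after the graph is built.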

Page 7: GraphX is the blue ocean for scala engineers @ Scala Matsuri 2014

Graph data around you

• Social network

• Train network

• Data network

Page 8: GraphX is the blue ocean for scala engineers @ Scala Matsuri 2014

What you will be able to do with graph data

• Evaluate vertices

• Clustering

• Graph shape

• Flow on a graph

• Link prediction

Page 9: GraphX is the blue ocean for scala engineers @ Scala Matsuri 2014

GraphX is still young

• Not much information on the web yet

• Far fewer functions than other graph libraries such as igraph for R

https://www.flickr.com/photos/katedot/8272997562

Page 10: GraphX is the blue ocean for scala engineers @ Scala Matsuri 2014

My work on GraphX

• Translated the GraphX documentation into Japanese

• https://gist.github.com/ironpeace/9306874

• Graph utility

• https://github.com/ironpeace/graph-web

Page 11: GraphX is the blue ocean for scala engineers @ Scala Matsuri 2014

Advantages for Scala Engineers

• Graph data is handled through an API similar to Scala's collections API

• Recursive computation is easy to implement

• Functions that process graph data iteratively are easy to implement
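To illustrate the collections-like feel, here is a small sketch using the GraphX operators `mapVertices` and `subgraph`, plus an ordinary RDD `map` and `reduce` on the vertex table. The object name `CollectionStyleSketch`, its method names, and the choice of `Int` values are assumptions for illustration, and it needs Spark with GraphX on the classpath.

```scala
import org.apache.spark.graphx.Graph

object CollectionStyleSketch {
  // Doubles every vertex value, much like `map` on a collection.
  def doubled(graph: Graph[Int, Int]): Graph[Int, Int] =
    graph.mapVertices((_, value) => value * 2)

  // Keeps only the edges whose value exceeds `threshold`, much like `filter`.
  def heavyEdges(graph: Graph[Int, Int], threshold: Int): Graph[Int, Int] =
    graph.subgraph(epred = triplet => triplet.attr > threshold)

  // Sums all vertex values; `graph.vertices` is an RDD of (id, value) pairs.
  // Expects a graph with at least one vertex.
  def totalValue(graph: Graph[Int, Int]): Int =
    graph.vertices.map { case (_, value) => value }.reduce(_ + _)
}
```

Each operator returns a new graph or RDD rather than mutating the input, so calls chain in the same way as transformations on Scala collections.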

Page 12: GraphX is the blue ocean for scala engineers @ Scala Matsuri 2014

GraphX is the blue ocean for YOU!

• GraphX is a good solution for graph-parallel computation

• Handling graph-structured data gives you the power to work out things you have never been able to before

• GraphX is still young

• Scala engineers have an advantage with graph data

Page 13: GraphX is the blue ocean for scala engineers @ Scala Matsuri 2014

Get the Graph Power!

Thank you!

@teppei_tosa