Graph Databases for SQL Server Professionals
Stéphane Fréchette
Thursday, September 18, 2014
Who am I?
My name is Stéphane Fréchette
SQL Server MVP | Consultant | Speaker | Database & BI Architect | NoSQL. Drums, good food and fine wine. Founder @ukubu, @GatineauOuverte, @TEDxGatineau
I have a passion for architecting, designing and building solutions that matter.
Twitter: @sfrechette
Blog: stephanefrechette.com
Email: [email protected]
Session Outline
• What is a Graph?
• What is Neo4j?
• Data Modeling – The Property Graph
• Cypher Query Language
• Importing Data…
• Use Cases
• Demos
• Resources
What is a Graph?
Are these Graphs?
This is a Graph
Node
Relationship
A Property Graph
Organization Project Graph
Twitter Social Graph
What is Neo4j?
An open-source graph database by Neo Technology. Neo4j stores data in nodes connected by directed, typed relationships, with properties on both. This model is also known as a Property Graph.
• Fully ACID compliant
• Massively scalable, up to several billion nodes/relationships/properties
• Highly available when distributed across multiple machines
• Accessible through a convenient REST interface or an object-oriented Java API
Data Modeling
From SQL Server to Graph
Property Graph
Example: Meetup Data In SQL Server
Member
ID  Member
1   Daniel
2   Stephane
3   John
4   Randy

Meetup
ID  Name
1   Ottawa SQL Server User Group
2   Ottawa JavaScript
3   Ottawa Visio User Group
4   Ottawa Tableau User Group
5   Dirty Dancing Ottawa

MeetupOrganizer
MemberID  MeetupID
2         1
1         2
3         3
2         4
3         5

MeetupMember
MemberID  MeetupID
3         1
3         2
4         2
4         4
1         5
Example: Meetup Data In a Graph
Node labels: Member, Meetup
Member nodes: name: ‘Stephane’, name: ‘John’, name: ‘Randy’, name: ‘Daniel’
Meetup nodes: name: ‘Ottawa Tableau User Group’, name: ‘Ottawa SQL Server User Group’, name: ‘Ottawa JavaScript’, name: ‘Dirty Dancing Ottawa’, name: ‘Ottawa Visio User Group’
Relationships: IS_ORGANIZER (5), IS_MEMBER (5)
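To make the table-to-graph mapping concrete, here is a minimal Python sketch that is not part of the original slides. Each Member and Meetup row becomes a node with a label and a name property, and each junction-table row becomes a typed relationship. It assumes the first MemberID/MeetupID table is MeetupOrganizer and the second is MeetupMember, and that relationships point from Member to Meetup. The names `build_graph` and `meetups_for` are hypothetical, and Neo4j itself is not involved.

```python
# Hypothetical in-memory sketch of the Meetup property graph (no Neo4j involved).
# Rows are copied from the Member and Meetup tables on the slide.
MEMBERS = {1: "Daniel", 2: "Stephane", 3: "John", 4: "Randy"}
MEETUPS = {
    1: "Ottawa SQL Server User Group",
    2: "Ottawa JavaScript",
    3: "Ottawa Visio User Group",
    4: "Ottawa Tableau User Group",
    5: "Dirty Dancing Ottawa",
}

# (MemberID, MeetupID) pairs. Assumption: the first junction table on the
# slide is MeetupOrganizer and the second is MeetupMember.
ORGANIZER_ROWS = [(2, 1), (1, 2), (3, 3), (2, 4), (3, 5)]
MEMBER_ROWS = [(3, 1), (3, 2), (4, 2), (4, 4), (1, 5)]


def build_graph():
    """Turn table rows into nodes and directed, typed relationships."""
    nodes = {}
    for member_id, name in MEMBERS.items():
        nodes[("Member", member_id)] = {"label": "Member", "name": name}
    for meetup_id, name in MEETUPS.items():
        nodes[("Meetup", meetup_id)] = {"label": "Meetup", "name": name}

    relationships = []
    # Assumed direction: (Member)-[:TYPE]->(Meetup).
    for member_id, meetup_id in ORGANIZER_ROWS:
        relationships.append(
            (("Member", member_id), "IS_ORGANIZER", ("Meetup", meetup_id))
        )
    for member_id, meetup_id in MEMBER_ROWS:
        relationships.append(
            (("Member", member_id), "IS_MEMBER", ("Meetup", meetup_id))
        )
    return nodes, relationships


def meetups_for(nodes, relationships, member_name, rel_type):
    """Follow relationships of one type from a named member to its meetups."""
    return sorted(
        nodes[end]["name"]
        for start, kind, end in relationships
        if kind == rel_type and nodes[start]["name"] == member_name
    )


nodes, relationships = build_graph()
print(meetups_for(nodes, relationships, "Stephane", "IS_ORGANIZER"))
```

The point of the sketch: the two junction tables disappear as separate structures and become the relationships themselves.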
Cypher Query Language
Cypher is a declarative graph query language that allows for expressive and efficient querying and updating of the graph store
• Pattern matching
• Declarative: what to retrieve, not how to retrieve it
• Inspired by other well-known languages (SQL, SPARQL, Haskell, Python)
• Aggregation, Ordering, Limit
• Update the Graph
Cypher and T-SQL
Cypher also has a number of keywords with direct equivalents in SQL, which makes it a curiously familiar language
• WHERE
• ORDER BY
• LIMIT
• SUM, COUNT, STDEVP, MIN, MAX etc…
• LTRIM, UPPER, LOWER, REPLACE, LEFT, RIGHT, SUBSTRING
• DISTINCT
• CASE
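To show how close the two languages feel, here is a hedged side-by-side sketch that is not from the original slides. It asks "which meetups is John a member of?" once as a SQL join and once as a Cypher pattern. SQLite (via Python's `sqlite3`) stands in for SQL Server so that the SQL part can run anywhere. The Cypher text is only held as a string and is not executed. The MeetupMember rows assume the second MemberID/MeetupID table in the Meetup example, and `run_sql_query` is a hypothetical helper name.

```python
import sqlite3

# Cypher: the relationship is matched directly as a pattern (not executed here).
CYPHER_QUERY = """
MATCH (m:Member)-[:IS_MEMBER]->(g:Meetup)
WHERE m.name = 'John'
RETURN g.name AS meetup_name
ORDER BY meetup_name
"""

# SQL: the same question needs two joins through the junction table.
SQL_QUERY = """
SELECT g.Name AS meetup_name
FROM Member AS m
JOIN MeetupMember AS mm ON mm.MemberID = m.ID
JOIN Meetup AS g ON g.ID = mm.MeetupID
WHERE m.Member = 'John'
ORDER BY meetup_name
"""


def run_sql_query():
    """Load the Meetup sample rows into an in-memory SQLite database and query it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Member (ID INTEGER PRIMARY KEY, Member TEXT);
        CREATE TABLE Meetup (ID INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE MeetupMember (MemberID INTEGER, MeetupID INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO Member VALUES (?, ?)",
        [(1, "Daniel"), (2, "Stephane"), (3, "John"), (4, "Randy")],
    )
    conn.executemany(
        "INSERT INTO Meetup VALUES (?, ?)",
        [
            (1, "Ottawa SQL Server User Group"),
            (2, "Ottawa JavaScript"),
            (3, "Ottawa Visio User Group"),
            (4, "Ottawa Tableau User Group"),
            (5, "Dirty Dancing Ottawa"),
        ],
    )
    # Assumption: these pairs are the MeetupMember rows from the slide.
    conn.executemany(
        "INSERT INTO MeetupMember VALUES (?, ?)",
        [(3, 1), (3, 2), (4, 2), (4, 4), (1, 5)],
    )
    rows = conn.execute(SQL_QUERY).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(run_sql_query())
```

WHERE and ORDER BY read the same in both queries. The difference is that the joins are replaced by the `-[:IS_MEMBER]->` pattern.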
(SQL Server Pros)-[:WILL_LOVE]->(Cypher)
Cypher - Meetup
Neo4j Browser
Demo (let’s query some data…)
Importing Data…
Some important considerations…
Different import scenarios
• Dataset size: 1,000s, 100,000s, 10,000,000s
• Dataset format (source): Database, File (CSV, Spreadsheet, GraphML, Geoff), Service, Other
• Import type: Initial Bulk Load, Incremental Load, Initial Bulk Load + Incremental Load
Different import tools
• Spreadsheet based
• Neo4j-shell based: (Cypher, neo4j-shell-tools, Cypher LOAD CSV)
• Command-line based: Batch Importer
• Neo4j Browser based
• ETL Tools: (Talend, Mulesoft, Pentaho Kettle)
• Custom software: (Java API, REST API, Spring Data Neo4j)
Many different mappings
It is not always clear which tool you should use. It depends on your skill set, dataset size… (and lots of other factors)
Choose wisely!
(Diagram: Import Scenarios mapped to Import Tools)
Demo (walkthrough on importing data…)
The Sample Dataset
Importing using Spreadsheets
Very small datasets (< 1,000), easy to use
• Format data in spreadsheet
• Generate Cypher statements with formulas
• Copy and execute Cypher in Neo4j browser
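The spreadsheet approach boils down to string concatenation: one formula per row stitches cell values into a Cypher statement. The Python sketch below is not from the original slides and mimics that idea outside a spreadsheet. The helper names (`cypher_string`, `node_statement`, `relationship_statement`) are hypothetical, and the Member and Meetup labels and the IS_MEMBER type come from the Meetup example.

```python
def cypher_string(value):
    """Quote a value as a Cypher string literal, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def node_statement(label, name):
    """Build one CREATE statement for a node, as a spreadsheet formula would per row."""
    return "CREATE (:%s {name: %s});" % (label, cypher_string(name))


def relationship_statement(member, rel_type, meetup):
    """Build one statement that looks up both nodes by name and connects them."""
    return (
        "MATCH (m:Member {name: %s}), (g:Meetup {name: %s}) "
        "CREATE (m)-[:%s]->(g);"
    ) % (cypher_string(member), cypher_string(meetup), rel_type)


# A few sample rows, as they might sit in spreadsheet columns.
member_names = ["Daniel", "Stephane", "John", "Randy"]
membership_rows = [("John", "Ottawa JavaScript"), ("Randy", "Ottawa Tableau User Group")]

statements = [node_statement("Member", name) for name in member_names]
statements += [
    relationship_statement(member, "IS_MEMBER", meetup)
    for member, meetup in membership_rows
]

for statement in statements:
    print(statement)
```

The relationship statements only find their nodes once the Member and Meetup nodes exist, so node statements would need to be generated and run first.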
Importing using neo4j-shell-tools
Small to medium size datasets
https://github.com/jexp/neo4j-shell-tools
• Format data in CSV files
• Create import-cypher commands for neo4j-shell-tools
• Execute commands from neo4j-shell
Importing using LOAD CSV
Native Cypher
• Format data in CSV files
• Create “LOAD CSV” commands
• Execute command from neo4j-shell or browser
• Additional “cleanup” for Labels and RelTypes
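As a rough sketch of the first two steps (not from the original slides), the Python below writes the Member rows to a CSV file and builds a matching `LOAD CSV WITH HEADERS` statement as text. The file name `members.csv`, the `id` and `name` column headers, and both helper names are assumptions for illustration. How the `file:` URL must be written for a given Neo4j installation is also an assumption, so the statement is printed rather than executed.

```python
import csv
import pathlib
import tempfile

# Rows copied from the Member table in the Meetup example.
MEMBER_ROWS = [(1, "Daniel"), (2, "Stephane"), (3, "John"), (4, "Randy")]


def write_members_csv(directory):
    """Write the Member rows to members.csv with a header line."""
    path = pathlib.Path(directory) / "members.csv"
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id", "name"])  # header names are an assumption
        writer.writerows(MEMBER_ROWS)
    return path


def load_csv_statement(file_url):
    """Build a LOAD CSV statement; the Member label is written literally in the text."""
    # Note: LOAD CSV reads every field as a string, so row.id stays a string here.
    return (
        "LOAD CSV WITH HEADERS FROM '%s' AS row\n"
        "CREATE (:Member {id: row.id, name: row.name});" % file_url
    )


with tempfile.TemporaryDirectory() as directory:
    csv_path = write_members_csv(directory)
    statement = load_csv_statement(csv_path.resolve().as_uri())
    print(statement)
```

Because the label appears literally in the statement rather than coming from a CSV column, one statement is needed per label or relationship type. This may be part of what the “cleanup” step above refers to.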
Importing using Batch Importer
Non-transactional import, suited for very large datasets
• Format data in TSV files
• Execute Batch Import command
• Copy store files to Neo4j Server directory
• Start Neo4j Server with generated store files
Use Cases
Principal uses of graph databases include:
• Network and Data Center Management (Queries: Impact Analysis, Root Cause Analysis, Quality-of-Service Mapping, Asset Management)
• Authorization and Access (Queries: Access Management, Interconnected Group Organization, Provenance)
• Social (Queries: Friend Recommendations, Sharing & Collaboration, Influencer Analysis)
• Geo (Queries: Routing, Logistics, Capacity Planning)
• Recommendations (Queries: Product, Social, Service, and Professional Recommendations)
• Fraud Detection
http://www.neotechnology.com/neo4j-use-cases/
Summary
(graphs)-[:ARE]->(everywhere)
Resources
• Neo Technology http://www.neotechnology.com/
• Neo4j.org (Learn, Develop, Downloads,…) http://www.neo4j.org/
• Neo4j on Vimeo http://vimeo.com/neo4j
• Neo4j on SlideShare http://www.slideshare.net/neo4j
• Neo4j on Github https://github.com/neo4j
• Neo4j Cypher Cheat Sheet http://docs.neo4j.org/refcard/2.1/
• Neo4j Graph Database as a Service http://www.graphenedb.com/
• Linkurious – The easiest way to explore graph databases http://linkurio.us/
• KeyLines – Visualize dynamic networks http://keylines.com/
• Experiments with NEO4J: Using a graph database as a SQL Server metadata hub http://bit.ly/V2PrxN
• Kenny Bastani http://www.kennybastani.com/
• Rik Van Bruggen http://blog.bruggen.com/
• Max de Marzi http://maxdemarzi.com/
• Better Software Development http://jexp.de/blog/
• Graph Databases (Free Book) http://graphdatabases.com/
• Neo4j GraphGist http://gist.neo4j.org/
• GraphConnect Conference http://graphconnect.com/
• Titan – Distributed Graph Database https://thinkaurelius.github.io/titan/
• InfiniteGraph http://www.infinitegraph.com/
• OrientDB http://www.orientechnologies.com/
• Cayley by Google https://github.com/google/cayley
What Questions Do You Have?
Thank You
For attending this session