A Seminar on Neo4j

Posted on 05-Jul-2015

DESCRIPTION

A seminar on Neo4j and Cypher

TRANSCRIPT

Welcome

School of Engineering, CUSAT

A SEMINAR ON

NEO4J

Presented by: Vishnu Sanker

Project guide: Dr. Sudheep Elayidom

Contents

• Trends in big data

• NoSQL

• Graphs

• Neo4j

• Brief introduction to Cypher

• Pros and Cons of Neo4j

TRENDS IN BIG DATA

1. Increasing data size (big data)

• “Every 2 days we create as much information as we did up to 2003”

- Eric Schmidt

2. Increasingly connected data (graph data)

• For example, the move from plain text documents to hyperlinked HTML

3. Semi-structured data

• Individualization of data, with a common subset

4. Architecture

• From monolithic to modular, distributed applications

NOSQL

• Carlo Strozzi used the term NoSQL in 1998 to name his lightweight, open-source relational database that did not expose the standard SQL interface

• A NoSQL database provides a mechanism for storage and retrieval of data that is modeled by means other than the tabular relations used in relational databases

BENEFITS OF NOSQL

• Large volumes of structured, semi-structured and unstructured data

• Agile sprints, quick iteration, and frequent code pushes

• Flexible, easy-to-use object-oriented programming

• Efficient, scale-out architecture instead of expensive, monolithic architecture

TYPES OF NOSQL

• Column

- a column is the lowest-level NoSQL object in a keyspace of a distributed data store. It is a tuple (a key-value pair) consisting of three elements:

Unique name: used to reference the column

Value: the content of the column

Timestamp: used to determine the valid content

• Document oriented

- designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data

• Key-value

- data stored as a collection of key-value pairs, each value looked up by its key

• Graph

- database that uses graph structures with nodes, edges, and properties to represent and store data
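The column element described above (unique name, value, timestamp) can be sketched in a few lines of Python. This is a minimal illustration, not any particular store's API: the `Column` class, the `resolve` helper, and the sample data are hypothetical, and the sketch assumes the common "latest timestamp wins" rule for deciding which content is valid.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str       # unique name used to reference the column
    value: str      # content of the column
    timestamp: int  # used to decide which version is valid


def resolve(versions):
    """Return the valid column, assuming the latest timestamp wins."""
    return max(versions, key=lambda c: c.timestamp)


# Two hypothetical writes to the same column at different times.
versions = [
    Column("email", "old@example.com", timestamp=100),
    Column("email", "new@example.com", timestamp=250),
]

current = resolve(versions)
print(current.name, current.value)
```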

GRAPHS

A GRAPH DATABASE...

NO: not for charts & diagrams, or vector artwork

YES: for storing data that is structured as a graph

Graphs Everywhere

๏Relationships in

•Politics, Economics, History, Science, Transportation

๏Biology, Chemistry, Physics, Sociology

•Body, Ecosphere, Reaction, Interactions

๏Internet

•Hardware, Software, Interaction

๏Social Networks

•Family, Friends

•Work, Communities

•Neighbours, Cities, Society

Good Relationships

๏The world's data is rich, messy and related

๏Relationships are at least as important as the things they connect

๏Complex interactions

๏Always changing, including changes to structure

๏Graph: Relationships are part of the data

๏RDBMS: Relationships are part of the fixed schema

HOW AN RDB IS REPRESENTED BY A GRAPH

[Figure: two panels, "RDB" and "Property graph"]
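One way to picture the RDB-to-graph mapping is the rough Python sketch below. The tables, column names, and the `to_property_graph` helper are all hypothetical and are not taken from the slides. The sketch assumes the usual convention: each entity row becomes a node with properties, and each join-table row becomes a relationship that can carry its own properties.

```python
# Hypothetical relational rows: two entity tables and one join table.
persons = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
companies = [{"id": 10, "name": "Acme"}]
works_at = [
    {"person_id": 1, "company_id": 10, "since": 2012},
    {"person_id": 2, "company_id": 10, "since": 2014},
]


def to_property_graph(persons, companies, works_at):
    """Turn entity rows into nodes and join rows into relationships."""
    nodes = {}
    for row in persons:
        nodes[("Person", row["id"])] = {"name": row["name"]}
    for row in companies:
        nodes[("Company", row["id"])] = {"name": row["name"]}
    # Each relationship is (start node, type, end node, properties).
    relationships = [
        (
            ("Person", r["person_id"]),
            "WORKS_AT",
            ("Company", r["company_id"]),
            {"since": r["since"]},
        )
        for r in works_at
    ]
    return nodes, relationships


nodes, relationships = to_property_graph(persons, companies, works_at)
for start, rel_type, end, props in relationships:
    print(nodes[start]["name"], rel_type, nodes[end]["name"], props)
```

The join table disappears as a separate structure: what was a row of foreign keys is now data attached directly to the connection between two nodes.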

NEO4J - A GRAPH DATABASE

Neo4j is a Graph Database

๏A Graph Database:

•a schema-free Property Graph

•perfect for complex, highly connected data

๏Why NEO4J:

•reliable with real ACID Transactions

•fast with more than 1M traversals / second

•Server with REST API, or Embeddable on the JVM

•scale out for higher-performance reads with High-Availability

DATA MODELING FOR NEO4J

SAMPLE CODE

CYPHER - QUERY LANGUAGE FOR NEO4J

• Declarative query language

• Describe what you want, not how

• Based on pattern matching

• declarative grammar with clauses (like SQL)

• aggregation, ordering, limits

• create, update, delete

Cypher: START + RETURN

๏START <lookup> RETURN <expressions>

๏START binds terms using simple look-up

•directly using known ids

•or based on an indexed property

๏RETURN expressions specify result set
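The two lookup styles can be mimicked in plain Python, as in the sketch below. It does not run Cypher: the node data, the `name_index` dictionary, and both helper functions are hypothetical stand-ins. The queries in the comments use the legacy START syntax this deck describes, and the index name `people` is an assumption.

```python
# Hypothetical node store: node id -> properties.
nodes = {
    1: {"name": "Alice", "age": 31},
    2: {"name": "Bob", "age": 27},
}

# Hypothetical index on the "name" property: property value -> node id.
name_index = {props["name"]: node_id for node_id, props in nodes.items()}


def start_by_id(node_id):
    """Bind a node directly from a known id."""
    return nodes[node_id]


def start_by_index(name):
    """Bind a node through the index on the name property."""
    return nodes[name_index[name]]


# Roughly: START n=node(1) RETURN n.name
n = start_by_id(1)
print(n["name"])

# Roughly: START n=node:people(name = "Bob") RETURN n.name, n.age
n = start_by_index("Bob")
print(n["name"], n["age"])
```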

Cypher: MATCH

๏START <lookup> MATCH <pattern> RETURN <expr>

๏MATCH describes a pattern of nodes+relationships

•node terms in optional parentheses

•lines with arrows for relationships

Cypher: WHERE

๏START <lookup> [MATCH <pattern>] WHERE <condition> RETURN <expr>

๏WHERE filters nodes or relationships

•uses expressions to constrain elements
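To make the MATCH and WHERE steps concrete, here is a small Python sketch of what such a query does conceptually. It is not how Neo4j executes Cypher. The graph data and the `match_outgoing` helper are hypothetical, and the query in the comment uses the legacy START syntax from these slides.

```python
# Hypothetical graph: node id -> properties, plus (start, type, end) triples.
nodes = {
    1: {"name": "Alice", "age": 31},
    2: {"name": "Bob", "age": 27},
    3: {"name": "Carol", "age": 45},
}
relationships = [(1, "KNOWS", 2), (1, "KNOWS", 3), (2, "KNOWS", 3)]


def match_outgoing(start_id, rel_type):
    """Yield end nodes for the pattern (start)-[:rel_type]->(end)."""
    for src, kind, dst in relationships:
        if src == start_id and kind == rel_type:
            yield nodes[dst]


# Roughly:
#   START a=node(1) MATCH (a)-[:KNOWS]->(b) WHERE b.age > 30 RETURN b.name
# MATCH expands the pattern, WHERE filters the bound nodes,
# and RETURN picks the expression to output.
result = [b["name"] for b in match_outgoing(1, "KNOWS") if b["age"] > 30]
print(result)
```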

Cypher: SET

๏SET [<node property>] [<relationship property>]

•update a property on a node or relationship

•must follow a START

Cypher: DELETE

๏DELETE [<node>|<relationship>|<property>]

•delete a node, relationship or property

•must follow a START

•to delete a node, all of its relationships must be deleted first
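The SET and DELETE rules, including the requirement to remove a node's relationships before the node itself, can be illustrated with a toy in-memory graph. The `Graph` class below is a hypothetical sketch, not Neo4j's API. The Cypher fragments in the comments follow the legacy START syntax used in this deck.

```python
class Graph:
    """Toy property graph that enforces the delete rule for nodes."""

    def __init__(self):
        self.nodes = {}          # node id -> property dict
        self.relationships = {}  # relationship id -> (start, type, end)

    def set_property(self, node_id, key, value):
        self.nodes[node_id][key] = value

    def delete_relationship(self, rel_id):
        del self.relationships[rel_id]

    def delete_node(self, node_id):
        attached = [
            rel_id
            for rel_id, (start, _, end) in self.relationships.items()
            if node_id in (start, end)
        ]
        if attached:
            raise ValueError(
                f"node {node_id} still has relationships: {attached}"
            )
        del self.nodes[node_id]


g = Graph()
g.nodes = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
g.relationships = {0: (1, "KNOWS", 2)}

g.set_property(1, "age", 31)  # roughly: START n=node(1) SET n.age = 31

try:
    g.delete_node(2)          # refused: relationship 0 still touches node 2
except ValueError as err:
    print("refused:", err)

g.delete_relationship(0)      # roughly: START r=relationship(0) DELETE r
g.delete_node(2)              # succeeds now that node 2 is disconnected
print(sorted(g.nodes))
```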

PROS AND CONS OF NEO4J

PROS

• Powerful data model, as general as an RDBMS

• Connected data is locally indexed

• Easy to query

CONS

• Sharding a graph across machines is difficult

• Needs a new way of thinking

Concluding...

• Neo4j is a property graph database

• It is scalable and flexible, and is written primarily in Java

• Cypher is the query language for Neo4j; it is declarative and flexible as well
