It’s all about me…
Prof. Mark Whitehorn
Emeritus Professor of Analytics
Computing
University of Dundee
Consultant
Writer (author)
m.a.f.whitehorn@dundee.ac.uk
© Whitehorn
Graph Databases
• Different database engines are built to be good at a specific set of operations.
Relational engines, for example, are typically optimised for transaction control and for protecting data from damage and loss during update.
They are typically not optimised for detecting fraud or making recommendations (“Customers who bought this book frequently bought…”). Graph databases are essentially the opposite: poor at transactions and good at tasks such as fraud detection and recommendations. The key to using graph databases effectively is understanding not only how they work but why they were designed that way – in other words, understanding what underpins their strengths and weaknesses. So this talk will explore their origins and how and why they work.
Graph Databases
• Origins
Kaliningrad, the city formerly known as Prince; errr, Königsberg.
In the 1700s, Königsberg had seven bridges.
Can you cross each bridge once and once only?
Solved by Leonhard Euler (1707 – 1783), Swiss mathematician.
Can you cross each bridge once and once only?
We can try (and fail) to trace a path, but failing doesn't prove that it can’t be done. Succeeding would prove that it can be done, but no one has ever succeeded.
Nodes
Edges
Consider a node with two edges. If you start on the node you must finish ….
Consider a node with two edges. If you start on the node you must finish on the node. This remains true no matter to what the edges connect.
Consider a node with two edges. If you start off the node you must finish ….
Consider a node with two edges. If you start off the node you must finish off the node. This remains true no matter to what the edges connect.
What further generalisation can we make?
Consider a node with an even number of edges. If you start on the node you must finish on the node; if you start off the node you must finish off the node. This remains true no matter to what the edges connect.
Consider a node with three edges. If you start off the node you must finish ? the node; if you start on the node you must finish ? the node.
Consider a node with three edges. If you start off the node you must finish on the node.
This remains true no matter to what the edges connect.
Consider a node with three edges. If you start on the node you must finish off the node.
This remains true no matter to what the edges connect. And this is true for all nodes with an odd number of edges.
So we have a set of rules that logic (or Euler) tells us is irrefutable. The table shows us where you must finish for a given set of starting conditions:

| | Even no. of edges | Odd no. of edges |
|---|---|---|
| Start on node | On | Off |
| Start off node | Off | On |
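The table can be written as a tiny function. This is only a sketch of the rule as stated on the slides; the function name `must_finish_on` and its parameters are invented for illustration, and it assumes the walk uses every edge of the node exactly once.

```python
def must_finish_on(edge_count: int, start_on: bool) -> bool:
    """Return True if a walk that uses every edge of a node exactly once
    must finish on that node.

    Every pass through the node uses two edges (one in, one out), so only
    the parity of edge_count and the starting position matter.
    """
    if edge_count % 2 == 0:
        # Even: you finish where you started (on -> on, off -> off).
        return start_on
    # Odd: you finish on the opposite side (on -> off, off -> on).
    return not start_on


for edge_count in (2, 3):
    for start_on in (True, False):
        start = "on" if start_on else "off"
        finish = "on" if must_finish_on(edge_count, start_on) else "off"
        print(f"{edge_count} edges, start {start} the node: finish {finish} the node")
```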
Suppose we have two nodes, both having two edges. We can start on and finish on node A, which agrees with the rules. We can start off and finish off node B, which also agrees with the rules.
So, in the Königsberg bridge problem, a really important general question to ask is: “How many nodes have an even number of edges and how many have an odd number?”
There are four nodes and they all have an odd number of edges.
What do we know about nodes with odd numbers of edges?
If you don’t start on a node, you must finish on that node.
Suppose we choose to start on node A. That means we don’t start on B, C or D. But the rules tell us that, if we don’t start on a node, we have to finish on it. So we have to finish on three nodes (B, C and D).
A walk can only finish in one place, so that is impossible and the Königsberg bridge problem is unsolvable.
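The same argument can be checked mechanically. The sketch below assumes the usual layout of the seven bridges and assumes that node A is the central island; the slides do not say which land mass carries which letter, so the edge list is illustrative. It relies on the general result that, in a connected graph, a walk using every edge exactly once exists only when zero or two nodes have an odd number of edges.

```python
from collections import Counter

# The seven bridges as (land mass, land mass) pairs.
# Assumption: A is the central island; B, C and D are the other land masses.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"),
    ("B", "D"),
    ("C", "D"),
]


def edge_counts(edges):
    """Count how many edges touch each node."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts


def odd_nodes(edges):
    """Return the nodes that have an odd number of edges."""
    return sorted(node for node, n in edge_counts(edges).items() if n % 2 == 1)


def walk_possible(edges):
    """True if every edge can be crossed exactly once (graph assumed connected)."""
    return len(odd_nodes(edges)) in (0, 2)


print(dict(edge_counts(BRIDGES)))
print(odd_nodes(BRIDGES))
print(walk_possible(BRIDGES))
```

With all four nodes odd, the check fails, which matches the conclusion on the slide.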
Not only did Euler induce the general rules, he developed an entire branch of mathematics from this: graph theory. In turn this led to the development of graph databases, which are a very important class of NoSQL database engines.
What is a Graph Database?
Slides courtesy of Gerry McNicol (@gerrymcnicol)
What is a Graph?
[Figure: a small graph. Nodes: Gerry, Tom, Tennis, Formula 1, Mouse. Edge labels: FRIENDS_WITH, LIKES (twice), DRIVES_IN, CHASES, IS_A.]
[Figure: the same graph with a Sport node added and two further IS_A edges.]
[Figure: a travel graph. Nodes: Exeter, London, S'hampton, Bristol, Taunton, each with a station-code property (stn: esd, trs, ssm, btm, lpad). Edges: TRAIN (five), BUS (two, each with busco: mega) and HORSE (name: buttercup), each with a time property (35, 120, 37, 34, 31, 65, 45, 453).]
What is a Graph?
• Made up of Nodes and Edges (Relationships)
• Nodes are connected by Edges
• Every Edge has:
    • a starting and an ending Node
    • a direction
• Both Nodes and Edges can have properties.
• Very flexible data structure
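The bullets above can be sketched as a minimal in-memory structure. This is an illustration rather than how any particular graph engine stores data; the class names, the `outgoing` helper, the `since` property and the exact pairing of Gerry with Tennis are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    properties: dict = field(default_factory=dict)  # nodes can have properties


@dataclass
class Edge:
    start: Node   # every edge has a starting node...
    end: Node     # ...and an ending node, which gives it a direction
    label: str
    properties: dict = field(default_factory=dict)  # edges can have properties too


def outgoing(node, edges):
    """Return the edges that start at the given node."""
    return [e for e in edges if e.start is node]


gerry = Node("Gerry")
tom = Node("Tom")
tennis = Node("Tennis")

edges = [
    Edge(gerry, tom, "FRIENDS_WITH", {"since": 2010}),  # "since" is a made-up property
    Edge(gerry, tennis, "LIKES"),
]

for e in outgoing(gerry, edges):
    print(f"{e.start.name} -[{e.label}]-> {e.end.name} {e.properties}")
```

Because properties are plain dictionaries, two nodes or two edges need not carry the same keys, which is where the flexibility comes from.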
[Figure: the travel graph again, with a Gerry node and a LIKES edge added.]
[Figure: the travel graph and the Gerry/Tom graph shown together as a single graph.]
Use Cases
• Very powerful and flexible data model
• Semantically rich - very descriptive
• Densely-connected data sets
• Variably Structured data sets
Graph – Database engines
Clearly there are multiple graph engines and they can differ. However, we can make generalisations that will apply to most.
The data is stored in both nodes and edges
• Both are equally important
There is no need for nodes (or edges) to store the same data
Graph – Database engines
•Data is typically stored as key value pairs (KVPs)
KVPs?
Going back to relational data for a moment
Car

| LicenceNo | Make | Model | Year | Colour |
|---|---|---|---|---|
| CER 162 C | Triumph | Spitfire | 1965 | Green |
| EF 8972 | Bentley | Mk. VI | 1946 | Black |
| YSK 114 | Bentley | Mk. VI | 1949 | Red |
All entities have the same set of attributes, and only one of each.
In practical terms we could also say that each row will have data for each column.
Going back to relational data for a moment
Car

| LicenceNo | Make | Model | Year | Colour |
|---|---|---|---|---|
| CER 162 C | Triumph | | 1965 | Green |
| EF 8972 | Bentley | Mk. VI | 1946 | Black |
| YSK 114 | Bentley | Mk. VI | 1949 | Blue/Red |
Nulls are tolerated, but frowned upon:
• All cars should have a model
Duplicates are not tolerated:
• A car cannot have more than one colour
Nulls can be common in big data
This particular data (sensor data) sits poorly in a table. But note that each reading can be identified by the column name and the row identifier.
So we could store, for each row, only the columns that do have data.
| SensorID | Manufacturer | TimeDate | Pressure | Humidity | Temp | Wind | Depth | And so on |
|---|---|---|---|---|---|---|---|---|
| 213342332 | 34 | 1/1/2016:11:23 | | | 23 | | | |
| 2-BSDEFF76 | 12 | 2016/1/1:11:34 | 1034 | | | | 12 | |
[
  {
    "SensorID": "213342332",
    "Manufacturer": "34",
    "TimeDate": "1/1/2016:11:23",
    "Temp": "23"
  },
  {
    "SensorID": "2-BSDEFF76",
    "Manufacturer": "12",
    "TimeDate": "2016/1/1:11:34:43",
    "Pressure": "1034",
    "Depth": "12"
  }
]
In each pair, the name before the colon (for example “SensorID”) is the Key and the data after it (for example “213342332”) is the Value.
Key Value Pairs
Key Value Pairs (KVPs) are a very effective way of storing sparse data (data where we expect a large number of nulls).
They are also excellent in cases where we know the data collected will vary over time.
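As a rough illustration of the idea, the sketch below turns the two sparse sensor rows from the earlier slide into key value pairs by keeping only the columns that hold data. The helper name `to_kvps` is invented for the example, and `None` stands in for a null.

```python
COLUMNS = ["SensorID", "Manufacturer", "TimeDate", "Pressure",
           "Humidity", "Temp", "Wind", "Depth"]

# The two sensor readings as table rows; None marks an empty cell.
ROWS = [
    ("213342332", "34", "1/1/2016:11:23", None, None, "23", None, None),
    ("2-BSDEFF76", "12", "2016/1/1:11:34", "1034", None, None, None, "12"),
]


def to_kvps(columns, row):
    """Keep only the columns that actually have a value for this row."""
    return {key: value for key, value in zip(columns, row) if value is not None}


for row in ROWS:
    print(to_kvps(COLUMNS, row))
```

A new kind of reading simply appears as a new key on the rows that have it, with no change needed to existing rows.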
Graph – Database design
Where should we put an attribute such as “occupation”, e.g. Data Scientist?
Three options:
• In the person node
• In an edge (hard!)
• In an occupation node
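The three options might look something like this with plain Python structures. Everything here is hypothetical: the WORKS_FOR and HAS_OCCUPATION edge labels, the company name Acme and the `people_with` helper are made up to show the shape of each choice, not taken from any graph engine.

```python
# Option 1: a property stored in the person node.
person = {"name": "Gerry", "occupation": "Data Scientist"}

# Option 2: a property stored on an edge (here a hypothetical WORKS_FOR edge).
works_for = {
    "start": "Gerry",
    "end": "Acme",
    "label": "WORKS_FOR",
    "properties": {"occupation": "Data Scientist"},
}

# Option 3: a separate occupation node, linked by a hypothetical
# HAS_OCCUPATION edge. Edges are (start, label, end) triples.
EDGES = [
    ("Gerry", "HAS_OCCUPATION", "Data Scientist"),
    ("Tom", "HAS_OCCUPATION", "Data Scientist"),
    ("Gerry", "FRIENDS_WITH", "Tom"),
]


def people_with(occupation, edges):
    """Follow HAS_OCCUPATION edges into the occupation node to find its people."""
    return sorted(start for start, label, end in edges
                  if label == "HAS_OCCUPATION" and end == occupation)


print(people_with("Data Scientist", EDGES))
```

With option 3, "who shares this occupation?" becomes a question about edges into one node, whereas option 1 means inspecting every person node.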
Size of Node = number of customers
Width of Edge = number of errors
SELECT *
FROM graphgen
(ON
    (SELECT DISTINCT dmt_act_dslam,
                     nra_id,
                     nbr_of_srvid,
                     errorspersrv,
                     nbr_of_dslam
     FROM wrk.srvid_dslam_err)
 PARTITION BY 1
 ORDER BY errorspersrv
 item_format('cfilter')
 item1_col('dmt_act_dslam')
 item2_col('nra_id')
 score_col('errorspersrv')
 cnt1_col('nbr_of_srvid')
 cnt2_col('nbr_of_dslam')
 output_format('sigma')
 directed('false')
 width_max(10)
 width_min(1)
 nodesize_max(3)
 nodesize_min(1));
Visualise as a Graph
ART OF ANALYTICS
Chris Hillman
Yasmeen Ahmad
https://community.teradata.com/t5/Learn-Data-Science/The-Art-of-Analytics-Poster/ta-p/80316
NoSQL database systems
•Document – Mexican Insurance
•Column Store – Sensor data
•Graph – Fraud detection
•ART OF ANALYTICS
“Eye of the Storm”: the data is from a recent “Twitter storm”, the 21st century playground bullying phenomenon where the “playground” is the social media space.
The eye shows the complete data set, where you can see two distinct groups: the core in the centre defending the victim, and the larger group outside that was in attack mode.
This data visualization is created using mobile phone subscriber calling patterns. Each dot (or node) represents a phone number that is called by a subscriber; the larger the node, the more often that number is called. The lines (or edges) between nodes represent a call from one number to another.
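A minimal sketch of how such a picture could be derived from call records follows. The call list is invented sample data and the helper names are hypothetical; the real visualisation was produced with other tooling.

```python
from collections import Counter

# Invented sample data: each record is (caller, number called).
CALLS = [
    ("0001", "0002"),
    ("0003", "0002"),
    ("0001", "0003"),
    ("0001", "0002"),
]


def node_sizes(calls):
    """Size of each node = how many times that number was called."""
    return Counter(callee for _caller, callee in calls)


def edge_list(calls):
    """One edge per distinct (caller, callee) pair."""
    return sorted(set(calls))


print(node_sizes(CALLS))
print(edge_list(CALLS))
```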
Graph – Neo4J
• Pros – excellent for examining relationships between objects, think:
    • Facebook
    • Travel problems
    • Customers
    • Fraud
• Cons – rubbish at anything else
• Tipping points – the need to track nodes and edges
Graph – Neo4J
• Schema applied when data stored
• But schema is light(ish) because not all of the nodes and edges have to store the same data
• Analytical rather than transactional (although ACID compliant).