high-performance database technology for rock-solid iot solutions
Post on 07-Jan-2017
High-performance database technology
for rock-solid IoT solutions
Gints Ernestsons, Clusterpoint founder
LATA conference, 28.01.2016.
Key facts about Clusterpoint
Founded: 2006
Team size: 32
Engineering: 25
Privately held, VC backed
4.8 m investments to date
Product: database software
Market share: 100s of installations
Partners: 7, Cloud partners: 2
Cloud DBaaS : from Q1/2015
Founder, Visionary
Gints Ernestsons
CTO, Founder
Jurgis Orups
DB Software Architect
Janis Sermulins
CEO
Zigmars Rasscevskis
Business Development Director
Peteris Janovskis
Key Personnel
15 years as CTO at Lursoft; 8 years as CEO of Clusterpoint; 25 years as a technology entrepreneur and investor
8 years at Google; engineering manager of the Web search backend (Zurich); IMO silver medal
9 years running the Clusterpoint core software engineering team; expert in C/C++, NoSQL, Big data search
5 years at Google; MSc from MIT; Intel Research (USA); IOI 2x gold medallist
12 years at Oracle; Alliance & Channel Director, Central and East Europe
Algorithms Architect
Martins Krikis
4 years at Intel (USA), 4 years at Tieto; PhD from Yale University; Lecturer on Algorithms
Selected list of our customers and partners
Ousting ORACLE, Microsoft SQL, MySQL and SEARCH platforms in 24/7 services
We operate cloud database infrastructure in Europe and the USA
Dallas, US
Riga, Europe
Already more than 5000 users after only 8 months in the program
Cloud DBaaS started in Q1/2015
Gartner, Inc. forecasts that 6.4 billion connected things will be in use worldwide in 2016, up 30 percent from 2015, and will reach 20.8 billion by 2020. In 2016, 5.5 million new things will get connected every day.
[Chart: Internet of Things Units Installed Base by Category (Millions of Units), 2014, 2015, 2016, 2020. Categories: Consumer; Business: Cross-Industry; Business: Vertical-Specific. Source: Gartner, Nov 2015]
[Chart: Internet of Things Endpoint Spending by Category (Billions of Dollars), 2014, 2015, 2016, 2020. Categories: Consumer; Business: Cross-Industry; Business: Vertical-Specific. Source: Gartner, Nov 2015]
Explosion of IoT data is inevitable: we are at the very beginning!
Product: hybrid operational database, analytics and search platform
Secure, high-performance, distributed data management at scale
Hyper converged platform that uses open standards
[Diagram: one platform for XML and JSON data, with ACID, TEXT and Hybrid SQL]
Use cases: SIEM, WEB, HPC, DWH, OLAP, OLTP
MB ► GB ► TB ► PB
We solve performance problems where relational databases fail
Blazing fast performance
Unlimited scalability
Bulletproof transactions, instant text search and security
Reduces your TCO by 80% over your database life-time
Up to 1000x faster
MB ► GB ► TB ► PB
ACID
Distributed architecture delivers high-performance computing
[Chart: CLUSTERPOINT vs. RDBMS, over time]
Reliability of legacy RDBMS without its complexity, at 1000x its speed
Simultaneous execution of
parallel computing tasks using
fast & secure transactions
All-in-one platform: DBMS, SEARCH, one API and one COST
Document database with JavaScript/SQL + high performance transactions
Search platform with data relevance ranking, including full-text & geospatial data
Scalable high-availability distributed computing (sharding, replication)
Real-time online web and mobile analytics in Big data (no need for map-reduce)
Bulletproof ACID transactions
(patent filed, US)
No systems integration required
Custom “stitching” all platforms
Kill complexity! Boost performance! Nail search! Cut your cost!
RDBMS with ACID transactions
ONE API:
JS/SQL
Cut 80% off your TCO
Up to 1000x faster
High availability shards, replicas
Online analytics platform
Search platform, full-text index
Tons of your integration efforts and application “spaghetti” code
Budget for a 100-user company, in $

Item | Commercial RDBMS + SEARCH | Open source RDBMS + SEARCH | Clusterpoint database
DBMS software license (enterprise edition) | 14 000 | 0 | 0
DBMS software maintenance (3 years) | 20% / 3 x 2 800 | DIY / 0 | 3 x 7 200
SEARCH PLATFORM (SEARCH) license | 10 000 | 0 | 0
SEARCH PLATFORM maintenance fee (3 years) | 20% / 3 x 2 000 | DIY / 0 | 0
DBMS client software access licenses (100 users) | 10 000 | 0 | 0
DBMS + SEARCH integration through custom application software code (developer months) | 3m / 15 000 | 6m / 30 000 | 1m / 5 000
DBMS high-availability clustering option or custom HA integration (developer months) | 2m / 10 000 | 4m / 20 000 | 0
Operate & scale integrated DBMS + SEARCH + custom application code (developer months) | 9m / 45 000 | 18m / 90 000 | 0
Replace 2 software platforms with 1 to decrease your TCO by 80%
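The 80% claim can be checked against the budget figures on the slide above. The sketch below simply sums each column; it assumes the "DIY / 0" entries count as 0 and that "3 x 2 800" means three yearly payments. The row labels are shortened and the function names are illustrative only.

```python
# Three-year budget for a 100-user company, in $, copied from the slide.
# Column order: commercial RDBMS + search, open source RDBMS + search, Clusterpoint.
# Assumption: "DIY / 0" entries are counted as 0.
BUDGET = {
    "DBMS license": (14_000, 0, 0),
    "DBMS maintenance (3 years)": (3 * 2_800, 0, 3 * 7_200),
    "Search license": (10_000, 0, 0),
    "Search maintenance (3 years)": (3 * 2_000, 0, 0),
    "Client access licenses": (10_000, 0, 0),
    "DBMS + search integration": (15_000, 30_000, 5_000),
    "High-availability clustering": (10_000, 20_000, 0),
    "Operate and scale": (45_000, 90_000, 0),
}


def column_totals(budget):
    """Sum every column of the budget table."""
    return tuple(sum(column) for column in zip(*budget.values()))


def saving(baseline, alternative):
    """Fraction of the baseline cost saved by the alternative."""
    return 1 - alternative / baseline


commercial, open_source, clusterpoint = column_totals(BUDGET)
print(commercial, open_source, clusterpoint)          # 118400 140000 26600
print(f"{saving(commercial, clusterpoint):.1%}")      # 77.5%
print(f"{saving(open_source, clusterpoint):.1%}")     # 81.0%
```

Under these assumptions the saving comes out at roughly 78% against the commercial stack and 81% against the open source stack, which is in line with the rounded 80% figure.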
MySQL Multiple Bugs Let Remote Users Access and Modify Data and Deny Service (Security Tracker)
Attackers targeting Elasticsearch remote code execution hole (The Register)
US Department of Homeland Security Calls On Computer Users to Disable Java (Forbes)
The Odd Couple: Hadoop and Data Security (ZDnet)
Major security alert as 40,000 MongoDB databases left unsecured on the internet (InformationAge)
Bash bug leaves Linux users shellshocked (WindowsSecurity)
By using multiple platforms, your security problems are snowballing
Manage all your data, indexes and replicas with solid security
Ordinary relational SQL database
Big data cluster, replicas, backups
All your mission-critical data in one DBMS, analytics and search platform
XML
JSON
ONE API:
JS/SQL ACID transactions
Search and analytics data/indexes
BLOB
Develop your application software code scalable from day-one
WRITE ONCE and decrease the lifetime cost of your web or mobile application
Save > 80%
[Chart: OPEX, TCO over the database life-cycle: Test, Year 1, Year 2, Year 3, Year 4, Year N]
replica 1
replica 2
replica 3
Why pay extra for high-end features? Use out-of-the-box!
LOAD BALANCING
FAULT-TOLERANCE HIGH-AVAILABILITY
SCALE OUT ABILITY
Why document-oriented database architecture? Flexibility!
Easily includes other data models: tables, text, pictures, graphs, links etc
Manage all your data in open industry standards:
XML and JSON
Ordinary RDBMS: cost of changes escalates with software stack
Relational database (ORM software model)
[Chart: cumulative cost and OPEX cost over the lifetime, starting at launch]
ORM 45 d
Search + 90 d
Analytics & Reporting + 6 months
High availability clustering + 1 year
Document database: cost of changes goes down to minimum
Document database (XML / JSON data model)
[Chart: cumulative cost and OPEX cost over the lifetime, starting at launch]
HA + 45 d
Search + 90 d
Analytics & Reporting + 6 months
Document model (de-normalization): 1 year (rebuild application)
Smart IoT meters: storing data in documents vs database rows

Ordinary database stores individual measurements (1000s per meter)
Millions of smart meters, billions of measurements: fast degrading performance

Meter | Time | Volts | Amps | Cost
1 | 10:00 | 220 | 0.25 | 0.05
2 | 10:00 | 230 | 0.50 | 0.10
3 | 10:00 | 180 | 0.30 | 0.03
... | ... | ... | ... | ...
1 | 10:15 | 240 | 0.65 | 0.10
2 | 10:15 | 230 | 0.50 | 0.10
3 | 10:15 | 180 | 0.30 | 0.03
... | ... | ... | ... | ...

Document database stores all data on individual meters as rich text objects
Instant search, top performance

Meter | A day, a month or a year(s) of data | Other fields
1 | 00:00 { ... } ... 10:00 { 220 0.25 0.05 } 10:15 { 240 0.65 0.10 } 10:30 { ... } ... 23:45 { ... } | address
2 | 00:00 { ... } ... 10:00 { 230 0.50 0.10 } 10:15 { 230 0.50 0.10 } 10:30 { ... } ... 23:45 { ... } ... | name
3 | 00:00 { ... } ... 10:00 { Not available } 10:15 { Signal loss } 10:30 { ... } ... 23:45 { ... } ... | photo
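The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration of the de-normalization idea, not Clusterpoint's storage format: the field names (`meter`, `readings`, `volts`, `amps`, `cost`) and the function name are assumptions, and the sample values are the row-table figures from the slide.

```python
import json
from collections import defaultdict

# Row-per-measurement layout, as an ordinary relational table would store it.
# Tuple layout: (meter, time, volts, amps, cost)
ROWS = [
    (1, "10:00", 220, 0.25, 0.05),
    (2, "10:00", 230, 0.50, 0.10),
    (3, "10:00", 180, 0.30, 0.03),
    (1, "10:15", 240, 0.65, 0.10),
    (2, "10:15", 230, 0.50, 0.10),
    (3, "10:15", 180, 0.30, 0.03),
]


def rows_to_documents(rows):
    """Group measurement rows into one document per meter, keyed by time."""
    docs = defaultdict(lambda: {"readings": {}})
    for meter, time, volts, amps, cost in rows:
        docs[meter]["meter"] = meter
        docs[meter]["readings"][time] = {"volts": volts, "amps": amps, "cost": cost}
    return dict(docs)


documents = rows_to_documents(ROWS)
print(len(ROWS), "rows become", len(documents), "documents")
print(json.dumps(documents[1], indent=2))
```

With this layout the number of stored objects grows with the number of meters rather than with the number of measurements, and extra attributes such as an address, a name or a photo can sit in the same document.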
Ordinary database indexing model
<id>
<title>
</title>
indexes
Full database content indexing
Automatically create and maintain fast full-text search index
Web-style free text SEARCH and analytical JS/SQL queries
Complex queries requiring steep learning curve
SQL query: tens of seconds
Our query: milliseconds
RANKING INDEX
Your original data in documents
Index tree is organized into a graph, enabling you to set up your own search ranking (weighting) rules
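As background for full content indexing, the sketch below builds a classic inverted index: every word of every document is indexed automatically, so a free-text query becomes a few set lookups instead of a scan. It is a textbook structure shown under stated assumptions, not Clusterpoint's ranking index; the sample documents and function names are hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map every lower-cased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample documents.
DOCS = {
    1: "Java developer in London",
    2: "Python developer in Riga",
    3: "Java architect in Riga",
}

index = build_index(DOCS)
print(search(index, "java developer"))  # {1}
print(search(index, "riga"))            # {2, 3}
```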
Distributed storage architecture
words
strings
numbers
dates
names tags
values
relations
XML & JSON
Ultra-fast database index for ranked search and online analytics
RANKING INDEX
Ranking index delivers endless scale-out ability to your data
Organized as a modular graph, it lets you distribute data and computing
MB ► GB ► TB ► PB
Ordinary databases overload and overwhelm users with data
Two main performance problems with ordinary databases
Disrupt your competition with fast and relevant full-text search
Use ranking
Relevance of search results
Free text queries at subsecond latency
Programmable filter that delivers superior search relevance
Have billions of data records?
Scientists: ranking is a game-changing technology in databases
Very Large Data Bases Conference, 7th International Workshop on Ranking in Databases, 2013
“the sheer amount of data makes it almost impossible to process queries in the traditional compute-then-sort approach”
“Facing explosion of data ... the user would be overwhelmed by too many unranked results”
Ranking delivers superior search experience in your database
Same data, different ranking rules for your free text search queries (think voice in future)
[Diagram: a map application and a shop application, each assigning ranking weights of 100%, 75% and 50% to the Address, Company and Product fields]
Easily configure your own ranking rules for your business needs
[Diagram: one slider per field, from 0% (least relevant) through 50% to 100% (most relevant), for your own data items (fields) in your XML or JSON database, such as Address, Company, Category and Product]
When free text search hits data with higher rankings, results are sorted up-front
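The idea of "same data, different ranking rules" can be sketched as follows. This is a minimal illustration, not Clusterpoint's ranking engine: the records, the two rule sets and their weights are hypothetical, and a record is scored by the highest-weighted field in which the search term occurs.

```python
# Hypothetical records; field names follow the slide (company, address, product).
RECORDS = [
    {"id": 1, "company": "Riga Coffee", "address": "Brivibas 10, Riga", "product": "espresso"},
    {"id": 2, "company": "Espresso Ltd", "address": "Main St 5, Dallas", "product": "grinders"},
    {"id": 3, "company": "Bean Shop", "address": "Espresso Lane 1, Riga", "product": "tea"},
]

# Illustrative ranking rules: a weight per field, 1.00 meaning 100%.
MAP_RULES = {"address": 1.00, "company": 0.75, "product": 0.50}
SHOP_RULES = {"product": 1.00, "company": 0.75, "address": 0.50}


def score(record, term, rules):
    """Score a record by the highest-weighted field that contains the term."""
    term = term.lower()
    weights = (w for field, w in rules.items() if term in record[field].lower())
    return max(weights, default=0.0)


def ranked_search(records, term, rules):
    """Return ids of matching records, highest score first (ties by id)."""
    scored = [(score(r, term, rules), r["id"]) for r in records]
    hits = [(s, i) for s, i in scored if s > 0]
    return [i for s, i in sorted(hits, key=lambda pair: (-pair[0], pair[1]))]


print(ranked_search(RECORDS, "espresso", MAP_RULES))   # [3, 2, 1]
print(ranked_search(RECORDS, "espresso", SHOP_RULES))  # [1, 2, 3]
```

The same query over the same three records returns the address hit first under the map rules and the product hit first under the shop rules.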
Simple, super-fast, user-friendly web-style SEARCH
Enjoy instantly relevant search in your data using only free text
Plain text: java developer London
Phrases: “John Smith”
Wildcards: Joh* Smi* or “John Smi*”
Patterns: John Sm?th
Substitutes: John Sm[iy]th
<query>
</query>
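The wildcard, pattern and substitute forms above use the same three metacharacters (`*`, `?`, `[...]`) as shell-style globbing, so their intent can be tried out with Python's standard `fnmatch` module. This is only an analogy under that assumption: `fnmatchcase` matches whole strings, whereas the database presumably applies such patterns inside a full-text index. The sample names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical values to search in.
NAMES = ["John Smith", "John Smyth", "John Smath", "Johanna Smirnova", "Jane Smith"]


def match(pattern, values):
    """Return the values that match a shell-style pattern, case-sensitively."""
    return [value for value in values if fnmatchcase(value, pattern)]


print(match("Joh* Smi*", NAMES))      # ['John Smith', 'Johanna Smirnova']
print(match("John Sm?th", NAMES))     # ['John Smith', 'John Smyth', 'John Smath']
print(match("John Sm[iy]th", NAMES))  # ['John Smith', 'John Smyth']
```

`*` stands for any run of characters, `?` for exactly one character, and `[iy]` for one character out of the listed substitutes.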
Two problems solved
w1^100% w2^+30% w3^-20%
integer 0 ... 2^32 (when tag weightings are equal)
With ranking you can implement ranked pagination: 1 2 3 . . .
RANKING INDEX
Real-time Big Data SEARCH
milliseconds
<id>
<title>
<document>
</title>
Ranking your database structure: Title 100%, Body 50%, Comments 10%
Ranking your documents
Ranking your search query terms: ..w1...w2 ........ w3 ........
Ranking density of context hits
Ranked pagination (1 2 3 ..) solves information overload problem
Limited screen estate
Limited network bandwidth
Limited waiting time
by users
Fast and relevant database search in your web and mobile applications
Page: 1 2 3 4 5 more
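Ranked pagination can be sketched as a top-k selection: only the best hits up to the requested page are ordered, and the client receives a single page. The sketch assumes hits arrive as `(doc_id, score)` pairs; the function name, the page size and the sample data are hypothetical.

```python
import heapq


def ranked_page(scored_hits, page, page_size=10):
    """Return one page of (doc_id, score) hits, best score first.

    Pages are numbered from 1. Only the top page * page_size hits are
    selected, so the rest of the result set is never fully sorted.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    top = heapq.nlargest(page * page_size, scored_hits, key=lambda hit: hit[1])
    return top[(page - 1) * page_size:]


# Hypothetical result set: 25 hits with scores 1..25.
HITS = [(f"doc{i}", i) for i in range(1, 26)]

print(ranked_page(HITS, 1)[:3])  # [('doc25', 25), ('doc24', 24), ('doc23', 23)]
print(len(ranked_page(HITS, 3)))  # 5
```

The user sees the most relevant ten results first and asks for further pages only if needed, which is what keeps screen space, bandwidth and waiting time bounded.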
Constant query latency enables real-time Big data search and analytics
[Chart: query latency as data grows from MB to GB, TB and PB]
Milliseconds for a JavaScript/SQL query in Clusterpoint database
Minutes ... hours for a SQL query in legacy RDBMS
Scale to billions of documents without search performance loss
RANKING INDEX
XML
JSON
Clusterpoint Cloud Database as a Service (DBaaS)
We safely and efficiently manage your databases for you AND
We instantly scale on-demand!
Our cloud computing uses a cost-efficient on-demand model
[Chart: DB resources over time; Conventional Provisioning Model, $ vs. Cost-Efficient Model, $]
Save 3x-10x