high performance query engine in clay
tachyon technologies
A High Performance Query Engine
Written in Clay
What is Clay?
Tachyon’s General Purpose programming language
Combines the advantages of scripting languages (easy to learn and code) with the advantages of system languages (fast execution)
Allows developers to write high performance, small footprint, reusable code effortlessly
Motivation behind the Query Engine
A traditional RDBMS is too slow for non-trivial queries; this hurts productivity and overuses computing resources
Database access is often the performance bottleneck in business applications
Designing high quality RDBMS based solutions is complex and needs expert DBAs
Querying an RDBMS doesn’t fit with the natural style of programming in any language
Designing a traditional RDBMS schema involves a trade-off between database size and performance
Normalized databases offer optimal size but suffer from poor query response time; denormalized databases offer reasonable query response time but are oversized
The Clay approach to Data Modeling
Uses memory mapped files and not RDBMS style tables
Paged data structures are allocated inside memory mapped files
Data that an RDBMS represents as a table is simply a persistent vector of records: a variable-size linear array that resides on the storage device
Creating, appending, deleting and modifying this persistent data is the same as handling any in-memory data (using variables) in Clay
No more complex than manipulating a contact list in memory; the Query Engine memory maps the data, so reading and writing on the storage device is transparent to the programmer
Supports natural, programming style logic for data access
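The persistent-vector idea above can be sketched outside Clay. The following is a minimal illustration in Python (not Clay, and not Tachyon's implementation) using the standard mmap and struct modules. The record layout (four integers), the 8-byte count header, the class name PersistentVector and the doubling growth policy are all assumptions made for the example.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: mission id, agent id, country id, kills.
RECORD = struct.Struct("<iiii")
# Hypothetical file header: number of records currently stored.
HEADER = struct.Struct("<q")


class PersistentVector:
    """A growable array of fixed-size records kept in a memory-mapped file."""

    def __init__(self, path, capacity=1024):
        is_new = not os.path.exists(path) or os.path.getsize(path) == 0
        self._file = open(path, "w+b" if is_new else "r+b")
        if is_new:
            # Reserve room for the header plus `capacity` records.
            self._file.truncate(HEADER.size + max(capacity, 1) * RECORD.size)
        self._map = mmap.mmap(self._file.fileno(), 0)
        if is_new:
            HEADER.pack_into(self._map, 0, 0)

    def __len__(self):
        return HEADER.unpack_from(self._map, 0)[0]

    def _offset(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        return HEADER.size + index * RECORD.size

    def __getitem__(self, index):
        return RECORD.unpack_from(self._map, self._offset(index))

    def __setitem__(self, index, record):
        RECORD.pack_into(self._map, self._offset(index), *record)

    def append(self, record):
        count = len(self)
        if HEADER.size + (count + 1) * RECORD.size > len(self._map):
            self._grow()
        RECORD.pack_into(self._map, HEADER.size + count * RECORD.size, *record)
        HEADER.pack_into(self._map, 0, count + 1)

    def _grow(self):
        # Double the record area: unmap, extend the file, map it again.
        new_size = HEADER.size + 2 * (len(self._map) - HEADER.size)
        self._map.close()
        self._file.truncate(new_size)
        self._map = mmap.mmap(self._file.fileno(), 0)

    def close(self):
        self._map.flush()
        self._map.close()
        self._file.close()


path = os.path.join(tempfile.mkdtemp(), "missions.dat")
missions = PersistentVector(path, capacity=2)
missions.append((1, 7, 44, 3))
missions.append((2, 7, 91, 0))
missions.append((3, 9, 44, 5))   # exceeds the initial capacity, so the file grows
missions[1] = (2, 7, 91, 1)      # modified in place, like an in-memory list item
missions.close()

reopened = PersistentVector(path)  # the data is still there after reopening
print(len(reopened), reopened[1])
reopened.close()
```

The calling code never issues explicit reads or writes against the file; it indexes and appends as it would with an in-memory list, and the operating system pages the data in and out.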
An example – for benchmarking
An intelligence agency wishes to track its agents and their missions (assassinations) around the world
Things to keep track of - missions, agents, locations, countries
Non-trivial Queries to be answered
2 million records loaded from a CSV dump into Clay, MySQL and Oracle
Example, Contd.
• Queries
• Agent with the most kills across all their missions
• Agent with the best success to failure ratio
• Countries with the most missions
• Countries with the most kills
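To make "programming style logic for data access" concrete, here is a sketch of the four queries written as ordinary loops and aggregations. It is in Python rather than Clay, and the Mission fields, the sample rows and the function names are hypothetical; the slides do not show the benchmark schema. One further assumption: an agent with no failures is given an infinite success to failure ratio, with ties broken by number of successes.

```python
from collections import Counter
from typing import NamedTuple


class Mission(NamedTuple):  # hypothetical schema, not the benchmark's
    agent: str
    country: str
    kills: int
    success: bool


missions = [
    Mission("Alpha", "Freedonia", 3, True),
    Mission("Alpha", "Sylvania", 0, False),
    Mission("Bravo", "Freedonia", 5, True),
    Mission("Bravo", "Freedonia", 1, True),
    Mission("Charlie", "Sylvania", 2, False),
]


def kills_by(missions, key, n=1):
    """Top n groups by total kills, grouped by key(mission)."""
    totals = Counter()
    for m in missions:
        totals[key(m)] += m.kills
    return totals.most_common(n)


def missions_by_country(missions, n=1):
    """Top n countries by number of missions."""
    return Counter(m.country for m in missions).most_common(n)


def best_success_ratio(missions):
    """Agent with the highest successes / failures ratio."""
    wins, losses = Counter(), Counter()
    for m in missions:
        (wins if m.success else losses)[m.agent] += 1

    def ratio(agent):
        # Assumption: no failures means an infinite ratio.
        if losses[agent] == 0:
            return float("inf")
        return wins[agent] / losses[agent]

    agents = set(wins) | set(losses)
    return max(agents, key=lambda a: (ratio(a), wins[a]))


print(kills_by(missions, lambda m: m.agent))    # agent with the most kills
print(best_success_ratio(missions))             # best success to failure ratio
print(missions_by_country(missions))            # country with the most missions
print(kills_by(missions, lambda m: m.country))  # country with the most kills
```

Each query is a single pass over the records with no joins or query language involved, which is the style of access the Clay approach is described as supporting.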
Results of Benchmarking
Engine      Time      Speed Factor
MySQL       18.62s    1.000
Oracle XE   6.00s     3.103
Clay        0.20s     93.100
Total time taken to answer all 4 queries
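The speed factors appear to be each engine's total time relative to MySQL's, that is, the MySQL time divided by the engine's time. A quick check in Python, using only the timings reported above (the variable names are illustrative):

```python
# Total times (seconds) for all 4 queries, as reported in the results.
timings = {"MySQL": 18.62, "Oracle XE": 6.00, "Clay": 0.20}

# Speed factor = baseline time / engine time, with MySQL as the baseline.
baseline = timings["MySQL"]
factors = {name: round(baseline / t, 3) for name, t in timings.items()}

for name, factor in factors.items():
    print(f"{name:<10} {timings[name]:>6.2f}s {factor:>8.3f}")
```

The computed values match the reported 1.000, 3.103 and 93.100.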
Advantages of the Clay approach
Faster by nearly 2 orders of magnitude (about 93 times faster than MySQL in the benchmark)
Easy to code
Developers can handle the data model and the data access in code
Markedly lower power consumption
Reduced turnaround time for mission critical operations
If querying is all that is needed, there is no need for a database or a DBA
Possible use cases
Improve the querying performance of applications by a huge margin
Replace complex ETL (extract, transform, load) scripts in data warehouses
Implement real time warehousing
Implement efficient mining algorithms
Note: Clay implements only a query engine and is not a replacement for all the functionality a database provides, such as transactions
Contact Details
KS Sreeram
CEO and Founder, Tachyon Technologies
Phone: +91-9900244074
Email: [email protected]
Bharath Rao
VP Marketing, Tachyon Technologies
Phone: +91-9902706060
Email: [email protected]
Thank You