tpc-h in mongodb

17
TPC-H in MongoDB Aung Thu Rha Hein(g5536871)

Upload: aung-thu-rha-hein

Post on 10-Nov-2014

1.442 views

Category:

Technology


2 download

DESCRIPTION

Run TPC-H queries in MongoDB and benchmark against MySQL RDBMS

TRANSCRIPT

Page 1: TPC-H in MongoDB

TPC-H in MongoDB

Aung Thu Rha Hein(g5536871)

Page 2: TPC-H in MongoDB

Agenda• Introduction to MongoDB• TPC-H Data Setup• Schema• Advantages and Disadvantages of New Schema• Queries

o Pricing Summary Recordo National Market Share Queryo Total Supplier Queryo Potential Part Promotion Queryo Suppliers who kept orders waiting queryo Global Sales Opportunity Query

• Benchmark result• Discussion • Demonstration

Page 3: TPC-H in MongoDB

Introduction to MongoDB

• Open source, document-oriented and schema-free• Store data in BSON format• Easy to understand• Flexible, Scalable & lightweight• Ease of use• No ‘join’ operation

• SQL to MongoDB Sample Query• Select * from users where status = “A” ORDER BY USER_ID DESC

• db.users.find( { status: "A" } ).sort( { user_id: -1 } )

Page 4: TPC-H in MongoDB

TPC-H Data Setup• Import data into MongoDB

o Use MongoVue to import from MySQLo Time consuming and difficult

• To achieve flexibility:o Embedded all tables into single collectiono Replace all foreign keys with objects from lineitem tableo Choose lineitem table because of

• No primary keys

Page 5: TPC-H in MongoDB

Schema• Final Schema of TPC-H in MongoDB

lineitemOrder CustomerNation Region PartsuppPart supplier N R

Page 6: TPC-H in MongoDB

Advantages and Disadvantages of New

Schema• Advantages

o Easier to understand than SQL schemao One document: one recordo No need to join tables

• Disadvantageso Higher memory usageo Update operation becomes more demandingo Converting to BSON takes time o Require lot of computational powero Only around 300,000(5%) count of lineitem able to

convert

Page 7: TPC-H in MongoDB

Queries• Select 6 queries to run on MongoDB with Map-

Reduce & Aggregation Framework• Compare the result with MySQL

PROBLEMS• Outputs are not the same because of failure

during converting data• Aggregation framework is still in development

Page 8: TPC-H in MongoDB

Q1: Pricing SummaryRecord Query

Page 9: TPC-H in MongoDB

Q8:National Market Share Query

Page 10: TPC-H in MongoDB

Q15:Top Supplier Query

Page 11: TPC-H in MongoDB

Q20:Potential part Promotion Query

Page 12: TPC-H in MongoDB

Q21:Supplier who kept order waiting

Page 13: TPC-H in MongoDB

Q22:Global Sales Opportunity

Page 14: TPC-H in MongoDB

Benchmark result • All benchmarks run on Intel Core i7-3610QM 2.30GHz 6MB

cache,4GB DDR3,750GB 7200 RPM,Win64 system

• Query1

• Query 8

• Query15

6.1 sec

0.2 sec

MongoDB

MySQL 0.1 sec

1.6 sec

0.4 sec

0.7 sec

MySQL

MongoDB

MongoDB

MySQL

Page 15: TPC-H in MongoDB

Benchmark result(cont.)

• Query 20

• Query 21

• Query 22

MongoDB

MySQL

1.1 sec

174.4 sec

MongoDB

MySQL

6.2 sec

5.5 sec

MongoDB

MySQL

7.6 sec

0.8 sec

Page 16: TPC-H in MongoDB

Discussion & Conclusion

• MongoDB left behind in all querieso Design problemo Aggregation framework problemo No standard Query Languageo Server side query processing is not the nature of NoSQLo Complex SQL cannot convert easily

• Only suitable for Applications:o Business card databaseo Web Blogo Applications without complex transactions

Page 17: TPC-H in MongoDB

Demonstration