Milestone 1

Workshop in Information Security – Distributed Databases Project

Access Control Security vs. Performance

By: Yosi Barad, Ainat Chervin and Ilia Oshmiansky

Project web site: http://infosecdd.yolasite.com

Our Plan:

Milestone 1:

• Install and run Cassandra
• Install and run YCSB++
• Initial testing of Cassandra
• Run benchmark tests
• Install Accumulo

Plan Step: Install and run Cassandra

We have installed Cassandra on the lab computers.

The Cassandra database is configured to run in two different modes:

1) One cluster consisting of a single node, which manages all the keys and values in the database.
2) One cluster consisting of two nodes, which share the keys and values; each node manages and stores 50% of the database.
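The 50/50 split in the two-node mode can be illustrated with a minimal sketch. It assumes a hash-partitioned token ring in the style of Cassandra's RandomPartitioner (MD5-based tokens in the range 0 to 2^127, with evenly spaced node tokens). The slides do not state which partitioner or tokens were actually used, and the key names below are hypothetical.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 127  # assumed token space (RandomPartitioner-style)


def initial_tokens(node_count):
    """Evenly spaced tokens, one per node."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def key_token(key):
    """Map a row key to a position on the ring via MD5 (an approximation)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner(key, tokens):
    """Index of the first node whose token is >= the key's token, wrapping around."""
    return bisect_left(tokens, key_token(key)) % len(tokens)


def ownership_share(tokens, sample_size=10000):
    """Fraction of hypothetical sample keys that lands on each node."""
    counts = [0] * len(tokens)
    for i in range(sample_size):
        counts[owner("user%d" % i, tokens)] += 1
    return [count / sample_size for count in counts]


print(ownership_share(initial_tokens(1)))  # single node owns everything
print(ownership_share(initial_tokens(2)))  # roughly half of the keys per node
```

With one node every key has the same owner; with two evenly spaced tokens each node ends up with about half of the sampled keys.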

Plan Step: Install and run YCSB++

• We have installed YCSB++ and built it from its source code.

• We ran YCSB++ with the supplied "basic" database configuration in order to test the benchmark framework.

Plan Step: Initial testing of Cassandra

• We used the Cassandra client shell to create keyspaces and column families, to add a column within a family, and to store and retrieve key names and values.

• Cassandra reports statistics for these manual operations, which gave us an idea of how much time each operation takes.
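The hierarchy exercised in these manual tests (keyspace, column family, row key, column, value) can be sketched as nested dictionaries. This is only an in-memory toy model for illustration, not the Cassandra client API; the class, the helper and all names in it are hypothetical, and the timings it prints describe the toy model only.

```python
import time
from collections import defaultdict


class ToyKeyspace:
    """In-memory stand-in: column family -> row key -> column -> value."""

    def __init__(self, name):
        self.name = name
        self.column_families = {}

    def create_column_family(self, cf_name):
        self.column_families[cf_name] = defaultdict(dict)

    def set(self, cf_name, row_key, column, value):
        self.column_families[cf_name][row_key][column] = value

    def get(self, cf_name, row_key, column):
        return self.column_families[cf_name][row_key][column]


def timed(operation, *args):
    """Run one operation and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = operation(*args)
    return result, (time.perf_counter() - start) * 1000.0


keyspace = ToyKeyspace("demo")
keyspace.create_column_family("Users")
_, write_ms = timed(keyspace.set, "Users", "alice", "email", "alice@example.com")
value, read_ms = timed(keyspace.get, "Users", "alice", "email")
print("wrote in %.3f ms, read %r in %.3f ms" % (write_ms, value, read_ms))
```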

Plan Step: Connect YCSB++ to Cassandra and run benchmark tests

• We used the Cassandra-10 client binding supplied with YCSB++ to connect to the Cassandra database.

• We ran some of the core benchmark tests; the results are detailed later in this document.

• First we ran the tests from one client PC against a Cassandra server consisting of a single node.

• Next we added another Cassandra node and repeated the same tests.

We ran the tests for these reasons:

1. Establish a baseline by which future results (after the implementation of cell-level ACL) will be judged.
2. Establish the maximal throughput of Cassandra on a single node.
3. Compare the performance of Cassandra with one node to Cassandra with two nodes.

Plan Step: Automate the testing procedure

• We created several scripts to automate the tests. For example, a script that would:

1) Run all the different workloads YCSB++ offers with different numbers of threads
2) Create an output file with the relevant results
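A script of this kind might look like the sketch below. It is not the team's actual script. It assumes the classic YCSB command-line client (`com.yahoo.ycsb.Client` with `-t`, `-db`, `-P`, `-p` and `-threads`) and YCSB's `[SECTION], Metric, value` result lines; the jar path, host name, thread counts and output file name are hypothetical, and the YCSB++ layout on the lab machines may differ.

```python
import subprocess

WORKLOADS = ["workloada", "workloadb", "workloadc",
             "workloadd", "workloade", "workloadf"]
THREAD_COUNTS = [1, 2, 4, 8, 16, 32, 64, 128]  # hypothetical: 8 runs per workload


def build_command(workload, threads,
                  db_class="com.yahoo.ycsb.db.CassandraClient10",
                  hosts="localhost", jar="build/ycsb.jar"):
    """Command line for one transaction-phase run (paths are assumptions)."""
    return ["java", "-cp", jar, "com.yahoo.ycsb.Client", "-t",
            "-db", db_class, "-P", "workloads/" + workload,
            "-p", "hosts=" + hosts, "-threads", str(threads)]


def parse_results(output_text, wanted=("Throughput(ops/sec)",)):
    """Pick the wanted metrics out of '[SECTION], Metric, value' lines."""
    results = {}
    for line in output_text.splitlines():
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 3 or not parts[0].startswith("["):
            continue
        section, metric, value = parts
        if metric in wanted:
            results[(section.strip("[]"), metric)] = float(value)
    return results


def run_all(output_path="results.csv"):
    """Run every workload at every thread count and write one CSV row per run."""
    with open(output_path, "w") as out:
        out.write("workload,threads,throughput_ops_per_sec\n")
        for workload in WORKLOADS:
            for threads in THREAD_COUNTS:
                done = subprocess.run(build_command(workload, threads),
                                      capture_output=True, text=True, check=True)
                parsed = parse_results(done.stdout)
                throughput = parsed.get(("OVERALL", "Throughput(ops/sec)"))
                out.write("%s,%d,%s\n" % (workload, threads, throughput))


plan = [(w, t) for w in WORKLOADS for t in THREAD_COUNTS]
print(len(plan), "runs planned; first command:")
print(" ".join(build_command(*plan[0])))
```

`run_all()` is deliberately not called here, since it needs a working YCSB build and a reachable Cassandra node.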

We used the core workloads that are included with the YCSB installation and ran each of them 8 times, increasing the number of threads on each run.

Workload A: Update heavy workload - a 50/50 mix of reads and writes.

Workload B: Read mostly workload - a 95/5 read/write mix.

Workload C: Read only - this workload is 100% reads.

Workload D: Read latest workload - new records are inserted, and the most recently inserted records are the most popular.

Workload E: Short ranges - short ranges of records are queried, instead of individual records.

Workload F: Read-modify-write - the client reads a record, modifies it, and writes back the changes.
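The operation mixes of the core workloads can be sketched as weighted random choices. The proportions for A, B and C come from the slides (the "writes" correspond to YCSB update operations); the proportions for D, E and F are not stated in the slides and are assumed here to match the stock YCSB workload files. The sampler itself is a simplification: it ignores key distributions such as "latest" and only draws operation types.

```python
import random

CORE_WORKLOAD_MIX = {
    "A": {"read": 0.50, "update": 0.50},
    "B": {"read": 0.95, "update": 0.05},
    "C": {"read": 1.00},
    "D": {"read": 0.95, "insert": 0.05},             # assumed, not from the slides
    "E": {"scan": 0.95, "insert": 0.05},             # assumed, not from the slides
    "F": {"read": 0.50, "read-modify-write": 0.50},  # assumed, not from the slides
}


def sample_operations(mix, count, seed=0):
    """Draw `count` operation types according to the mix and tally them."""
    rng = random.Random(seed)
    names = list(mix)
    weights = [mix[name] for name in names]
    drawn = rng.choices(names, weights=weights, k=count)
    return {name: drawn.count(name) for name in names}


for label, mix in sorted(CORE_WORKLOAD_MIX.items()):
    print(label, sample_operations(mix, 1000))
```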

Plan Step: Connect YCSB++ to Cassandra and run benchmark tests

• We noticed a general degradation in performance in the two-node Cassandra configuration.

• We assume it is due to the synchronization overhead between the two nodes.

• More work has to be done in order to explain these results (see the plans ahead).

Plan Step: Install Accumulo

• We have installed, configured and run Apache ZooKeeper and Apache Hadoop, as they are prerequisites for the Accumulo database.

• We verified that the installation works by performing several basic operations using the client shell.

Milestone 1: Progress Compared to Plan

Plan Step                                              Status
Install and run Cassandra                              Complete
Install and run YCSB++                                 Complete
Run some initial manual testing of Cassandra           Complete
Connect YCSB++ to Cassandra and run benchmark tests    Complete
Install Accumulo                                       Complete

Milestone 1: Plans ahead

1. Extend our Accumulo and Cassandra setups to include several clusters.
   This stage is critical for getting meaningful test results and for finding security holes in the later stages.

2. Improve our testing environment. This stage includes the following:
   a) Write our own workloads (with ACL)
   b) Run several clients simultaneously
   c) Edit the test configurations according to our test plan (technical details)
   d) Run diverse tests to understand the limiting factor in each test (it might be the testing equipment, CPU time, disk I/O, network limitations, synchronization overhead between nodes, and more) and, if possible, change the setup to eliminate this limiting factor.
   e) Analyze the CPU and disk usage of the machines to understand the results better.

3. Get into the Cassandra code and start the cell-level ACL implementation. There are two main options:
   a) Sending JSON strings as part of the HTTP requests and then storing them in Cassandra.
   b) Adding simple strings such as "(Alice, rx) (Bob, rwxo) (Charlie, rx) ...", which we can store in Cassandra as is; when Alice tries to read a file from Cassandra, we check that the ACL allows her to do so.
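The two ACL encodings in step 3 can be compared with a small sketch. It is not the planned Cassandra implementation: the function names are hypothetical, the JSON layout (user mapped to a permission string) is an assumption, and the permission letters are treated as opaque flags because the slides do not define them.

```python
import json
import re

# Matches one "(user, perms)" entry of the plain-string encoding.
ENTRY_PATTERN = re.compile(r"\(\s*([^,()\s]+)\s*,\s*([a-z]*)\s*\)")


def parse_acl_string(text):
    """Option b: parse '(Alice, rx) (Bob, rwxo)' into {user: set of flags}."""
    return {user: set(perms) for user, perms in ENTRY_PATTERN.findall(text)}


def acl_to_json(acl):
    """Option a: encode the same ACL as a JSON object (assumed layout)."""
    return json.dumps({user: "".join(sorted(perms))
                       for user, perms in sorted(acl.items())})


def acl_from_json(text):
    """Decode the assumed JSON layout back into {user: set of flags}."""
    return {user: set(perms) for user, perms in json.loads(text).items()}


def is_allowed(acl, user, permission):
    """Allow only if the user is listed and holds the requested flag."""
    return permission in acl.get(user, set())


stored = "(Alice, rx) (Bob, rwxo) (Charlie, rx)"
acl = parse_acl_string(stored)
print(is_allowed(acl, "Alice", "r"))    # True: Alice holds "r"
print(is_allowed(acl, "Alice", "w"))    # False: Alice has no "w"
print(is_allowed(acl, "Mallory", "r"))  # False: Mallory is not listed
print(acl_to_json(acl))
```

Both encodings carry the same information; the check applied on each read is identical once the stored value has been decoded.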

Milestone 1: Overall

• We managed to complete the milestone as planned.

• Moreover, we succeeded in extending the system to two nodes. This is quite a breakthrough given the difficulties we experienced with the installations, and it brings us that much closer to achieving the goal of milestone #2, which is running a system consisting of several clusters.