benchmarking interactive social networking actions
Post on 31-Dec-2015
33 Views
Preview:
DESCRIPTION
TRANSCRIPT
Benchmarking Interactive Social Benchmarking Interactive Social Networking ActionsNetworking Actions
Shahram GhandeharizadehShahram GhandeharizadehDirector of Database LabDirector of Database LabComputer Science DepartmentComputer Science DepartmentUniversity of Southern CaliforniaUniversity of Southern California
Outline
Motivation Research questions
Survey use cases BG Benchmark FORSEE Future research
Motivation Data Stores
Cloud Services
Person-to-person cloud services
Research Questions
What is the tradeoff between alternative data models? E.g., Is JSON superior to the relational
data model?
How do alternative architectures compare with one another? E.g., Is cache augmented SQL as good as
a document/extensible store?
Do NewSQL data stores scale as well as NoSQL data stores?
Survey Use Case
S. Barahmand and S. Ghandeharizadeh. BG: A Benchmark to Evaluate Interactive Social Networking Actions. CIDR ‘13, Asilomar, CA.
Data Model
Accounts
Friend
MembersPages
Follow
Resources Own
Share
Share
News Feed Displays Ownd
BG Architecture
Scalable
Emulates User Behavior
Service Level AgreementQuick and Efficient Rating
Visualization Tool
S. Barahmand and S. Ghandeharizadeh. Expedited Benchmarking of Social Networking Actions with Agile Data Loading Techniques. CIKM ‘13, SF, CA.
http://bgbenchmark.org
Good Benchmark = FORSEE
Focus on an important debate & provide relevant metrics to facilitate progress.
One number to describe alternative designs/solution.
Runs in a reasonable amount of time.
Scalable.
Effective abstraction with meaningful requests.
Extendible.
Good Benchmark = FORSEE
F
One number to describe alternative designs/solution.
Runs in a reasonable amount of time.
Scalable.
Effective abstraction with meaningful requests.
Extendible.
+ Unpredictable data
Good Benchmark = FORSEE
F
O
Runs in a reasonable amount of time.
Scalable.
Effective abstraction with meaningful requests.
Extendible.
+ Unpredictable data
SoAR
Good Benchmark = FORSEE
F
O
R
Scalable.
Effective abstraction with meaningful requests.
Extendible.
+ Unpredictable data
SoAR
4 months to rate =1 Week to rate =
Good Benchmark = FORSEE
F
O
R
S
Effective abstraction with meaningful requests.
Extendible.
+ Unpredictable data
SoAR
4 months to rate =1 Week to rate =
Good Benchmark = FORSEE
F
O
R
S
E
Extendible.
+ Unpredictable data
SoAR
4 months to rate =1 Week to rate =
Only when two members are NOT friends!
Good Benchmark = FORSEE
F
O
R
S
E
E
+ Unpredictable data
SoAR
4 months to rate =1 Week to rate =
Only when two members are NOT friends!
FORSEE = PREDICT
Good Benchmark = FORSEE
F
O
R
S
E
E
+ Unpredictable data
SoAR
4 months to rate =1 Week to rate =
Only when two members are NOT friends!
A good benchmark helps settle debates
quickly to enable its discipline to make rapid progress.
Future Research: Data Sciences
Challenge: Wide variety of science applications with diverse debates.
Hypothesis: A benchmark generator.
BenchmarkGenerator
ER diagram
Actions & their dependencies
Key Metrics
Application (data science)
SpecificBenchmark
Future Reseach Evaluate the hypothesis using BG.
Extend to other data science applications.
BenchmarkGenerator
Unpredictable data
Big Data: Operations
Simple Complex
Off-line
Interactive
Ad-hoc
Pre-specified
Big Data: Google Analytics
Simple Complex
Off-line
Interactive
Ad-hoc
Pre-specified
1. Gather click stream data: Optimized for writes, 2. Compute aggregated data: MapReduce/Hadoop
Objective:1. Advertising ROI2. Frequency of access to pages
Big Data: Google Analytics
Simple Complex
Off-line
Interactive
Ad-hoc
Pre-specified
1. Gather click stream data: Optimized for writes, 2. Compute aggregated data: MapReduce/Hadoop3. Enable users to view aggregated data.
Objective:1. Advertising ROI2. Frequency of access to pages
Big Data: Facebook
Show profile page of Farah FawcettFollow Barak ObamaFriend Lady Gaga
Simple Complex
Off-line
Interactive
Ad-hoc
Pre-specified
3 Vs: Facebook High Volume:
1.2 billion user profiles, 150 billion friend connections, 1.13 trillion likes, 17 billion tagged locations, 240 billion photos, ….
High Velocity: 700 million active users daily, 4.5 billion likes daily, 350
million photos uploaded daily, …
High Variety: Mix of data types: Structured records, multimedia content,
text.
Source: http://expandedramblings.com/index.php/by-the-numbers-17-amazing-facebook-stats/ posted on Oct 6, 2013.
Expertise/Contributions
BG Benchmark to evaluate performance of alternative data stores: SQL, NoSQL, NewSQL.
http://bgbenchmark.org
A high performance CASQL solution that minimizes software development life cycle.
KOSAR, a prototype of a CASQL solution.
Simple ComplexOff-line
Interactive
Ad-hoc
Pre-specified
BG, http://bgbenchmark.org
Joint work with Sumita Barahmand Benchmark for interactive social
networking actions. Consists of 11 actions:
CASQL
Joint work with Jason Yap. Key insight: Query result look up is
faster than query processing. Contribution is physical data
independence in CASQL systems: Transparent caching Serial schedules Detection of race conditions and
prevention of inconsistent states.
KOSAR
Joint work with Reihane Boghrati, Lakshmy Mohanan and Neeraj Narang.
A software prototype of CASQL Scalable Highly available Elastic
Boosts performance of a leading industrial strength RDBMS vendor from 2 actions per second to more than 300,000 actions per second.
BG Coordinator
Delta Analyzer
BGClient 2 BGClient NBGClient 1
Experiment
Load
Agile Data Loading Techniques
Experiment
…
Data Store Server
…
top related