TRANSCRIPT
© Blackboard, Inc. All rights reserved.
How We Size the Academic Suite: Benchmarking at Blackboard™
Speaker: Steve Feldman, Director, Software Performance Engineering, [email protected]
2
Agenda and Introductions
» Goals, Objectives and Outcomes
» Introduction and Methodology
» Results and Findings
» Working with the Sizing Guide
» References and Resources
» Total Time: 50 Minutes
3
Presentation Goals
» The goals of this presentation are:
» Explain the pre-process activities for preparing to execute a benchmark against the Blackboard Academic Suite.
» Present the results and findings from our most recent benchmark activities.
» Review how we size the Blackboard Academic Suite from these benchmark exercises.
4
Presentation Objectives
» Define the study of behavior modeling.
» Define the study of cognitive modeling.
» Define the study of data modeling.
» Introduce the concept of adoption profiling.
» Share the benchmark objectives and associated test cases.
» Present a case for using sessions per hour over concurrency as an acceptable performance metric.
» Review the differences between cost performance and high performance.
» Discuss techniques for monitoring and measuring workload and growth.
» Provide guidance around storage purchasing.
» Provide guidance around load-balancer purchasing.
5
Presentation Outcomes
» At the end of the session, administrators will be able to do the following:
» Put a plan together to determine current and future adoption profiles.
» Use the current sizing specification for upcoming hardware expenditures.
» Make recommendations back to the Blackboard Performance Engineering team for more effective information sharing.
6
Part 1: Introduction and Methodology
7
The Performance Lifecycle
[Diagram: the SPE Methodology presented as complete end-to-end performance engineering, layered as Strategy, Methodology and Best Practices; Data Collection & Usage Analysis; Modeling, Profiling and Simulation; End to End Performance Testing; and Refactoring and Optimizing.]
8
A First Look at SPE
» The Blackboard Performance Engineering team follows a strict methodology based on the principles of Software Performance Engineering (SPE):
  » Assess Performance Risk
  » Identify Critical Use Cases
  » Select Key Performance Scenarios
  » Establish Performance Objectives
  » Construct Performance Models
  » Determine Software Resource Requirements
  » Determine System Resource Requirements
» SPE is a methodology introduced by Dr. Connie Smith and Dr. Lloyd Williams
  » http://www.perfeng.com
  » Performance Solutions: A Practical Guide to Creating Responsive, Scalable Software
9
Behavior Modeling
» Behavior modeling is the study of user behavior within a system to determine workload and use case interaction.
» Develop Markovian Models to determine the probability of the following (a minimal sketch follows this list):
  » Use case interaction
  » Transactional execution
  » Session lengths
» Samples are taken based on the following:
  » Institutional type and profile
    » K-12
    » Higher Education: Small, Medium and Large (Private/Public)
    » Consortium
    » Corporate and Government
  » License Range: Basic, LS Only and Full Academic Suite
  » Periods of seasonality
    » Pre-Semester
    » Enrollment
    » General
    » Exams
    » Post-Semester
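Illustration only: the Python sketch below shows one way the Markovian modeling described above could be approximated from observed session traces. It is not Blackboard tooling, and the use-case names and traces are hypothetical.

```python
from collections import defaultdict

# Hypothetical session traces: the ordered use cases observed in each user session.
sessions = [
    ["login", "view_course", "view_content", "logout"],
    ["login", "view_course", "discussion_post", "view_content", "logout"],
    ["login", "take_assessment", "logout"],
]

# Count transitions between consecutive use cases.
counts = defaultdict(lambda: defaultdict(int))
for trace in sessions:
    for current, nxt in zip(trace, trace[1:]):
        counts[current][nxt] += 1

# Normalize into a first-order Markov transition matrix: P(next use case | current use case).
transitions = {
    current: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
    for current, nexts in counts.items()
}

avg_session_length = sum(len(trace) for trace in sessions) / len(sessions)

print(transitions["view_course"])  # e.g. {'view_content': 0.5, 'discussion_post': 0.5}
print(f"Average use cases per session: {avg_session_length:.1f}")
```

In practice the same counting would be run over sampled production logs for each institutional profile and period of seasonality listed above.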
10
Cognitive Modeling
» Cognitive modeling is the psychological study of systematic human behavior within a system to determine patterns of abandonment and adoption.
» Abandonment: Concept for explaining the patience of a given user and their willingness to wait for system responsiveness.
» Utility: Use cases can be sub-classed and organized based on importance.
» Uniform: Use cases are equally weighted.
» Adoption: Concept for explaining increased frequency of use and reliance on a given system.
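A hedged, purely illustrative sketch of the utility versus uniform distinction above; the use-case classes and threshold values are hypothetical, not published Blackboard figures.

```python
# Uniform model: every use case shares the same abandonment threshold.
UNIFORM_ABANDONMENT_SECONDS = 10.0

# Utility model: use cases are sub-classed by importance, and more critical
# use cases are given a stricter patience threshold (values are hypothetical).
UTILITY_ABANDONMENT_SECONDS = {
    "critical": 5.0,   # e.g. submitting an assessment
    "standard": 10.0,  # e.g. viewing course content
    "trivial": 15.0,   # e.g. browsing a portal module
}

def abandoned(response_time_seconds, utility_class=None):
    """Return True if a simulated user would abandon the request."""
    threshold = (UTILITY_ABANDONMENT_SECONDS[utility_class]
                 if utility_class else UNIFORM_ABANDONMENT_SECONDS)
    return response_time_seconds > threshold

print(abandoned(7.5))                            # uniform model: False
print(abandoned(7.5, utility_class="critical"))  # utility model: True
```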
11
Data Modeling
» Data modeling is the study of linear and latitudinal volumetric growth of data in a system.
» Linear growth refers to vertical growth in the form of increased record counts.
  » Factors Affecting Linear Growth
    » Increased adoption
    » Data management strategy (Need for Pruning and Archiving)
» Latitudinal growth refers to horizontal growth in the form of increased complexity and maturity of data.
  » Factors Affecting Latitudinal Growth
    » Increased adoption
    » Maturity of processes
» Samples are taken bi-annually from all willing clients.
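A hedged sketch of how the linear/latitudinal distinction could be measured, assuming per-course content record counts can be exported at each bi-annual sample; all figures below are hypothetical.

```python
# Hypothetical bi-annual samples: course_id -> number of content records in that course.
sample_january = {"bio101": 40, "eng201": 25, "mat150": 10}
sample_july = {"bio101": 95, "eng201": 60, "mat150": 30, "hist210": 20}

def linear_size(sample):
    """Linear growth: total record count across all courses."""
    return sum(sample.values())

def latitudinal_size(sample):
    """Latitudinal growth proxy: average records per course (complexity/maturity)."""
    return sum(sample.values()) / len(sample)

print(f"Linear: {linear_size(sample_january)} -> {linear_size(sample_july)} records")
print(f"Latitudinal: {latitudinal_size(sample_january):.1f} -> "
      f"{latitudinal_size(sample_july):.1f} records per course")
```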
12
Establish Performance Objectives
» Regression Comparisons
» Critical client-facing impacts
» Vendor sponsor requirements
» Implications of new features and sub-systems
» Technology Ports: Software-based
» Platform Changes: OEM components
» OEM tuning parameters
» Key Stakeholder Requirements
» Prototypes for system configuration changes
» Other: Client Requests
13
Part 2: Academic Suite Benchmark Review
14
Release 7.X Performance Objectives
» Performance Objective #1: Version 6.3 to 7.X Unicode conversion operational downtime minimization.
  » Small data models: Minutes
  » Moderate data models: Hours
  » Large data models: Under 3 days
» Performance Objective #2: Regression performance from 6.3 to 7.X cannot degrade more than 5%, and should instead improve by 5% without configuration (hardware/software) manipulation.
» Performance Objective #3: Complex domain analysis.
  » Need to change the data model to always support complex domains.
» Performance Objective #4: Technology port of Perl to Java for the Discussion Board sub-system.
  » Business case for the final technology port.
15
Release 7.X Performance Objectives
» Performance Objective #5: Intel Multi-Core Analysis
  » Vendor donations and expected coverage in hardware guide.
» Performance Objective #6: Dell Blade Technology
  » Vendor donations and expected coverage in hardware guide.
» Performance Objective #7: Sun Multi-Core Analysis
  » Vendor donations and expected coverage in hardware guide.
» Performance Objective #8: Sun Cost Performance and High Performance Server Comparison
  » Vendor lab time and expected coverage in hardware guide.
16
Release 7.X Performance Objectives
» Performance Objective #9: Tomcat Clustering
  » Exploratory analysis for technical feature change
» Performance Objective #10: Network Attached Storage for Databases and Database Server Binaries
  » ASP request for cost-efficient operational management
» Performance Objective #11: Windows Content Load-Balancing Solutions
  » Vendor request for technology change
  » Risk mitigation strategy
» Performance Objective #12: Persistence Cache (OSCache) Configuration
  » Exploratory analysis for configuration guidance
17
Performance Scenarios (Workload Summary/Description)
» Under-Loaded Learning System and Community System: Regression test case from 6.3 performing a mix of student viewing/activity, instructor authoring and minimal administrator management. Meant to be an under-loaded system. Response times < 5s.
» Calibrated Learning System and Community System: Regression test case from 6.3 performing a mix of student viewing/activity, instructor authoring and minimal administrator management. Meant to be a calibrated system. Response times < 10s.
» Over-Loaded Learning System and Community System: Regression test case from 6.3 performing a mix of student viewing/activity, instructor authoring and minimal administrator management. Meant to be an over-loaded system. Response times < 15s.
» Calibrated Academic Suite: Combination of Learning System, Community System and Content System use case interactions to reflect the budding adoption of the full Academic Suite. Response times calibrated to the under-loaded system comparison (~5s).
» Calibrated Learning System and Community System with Concurrency Model for Assessments: Combination of Learning System and Community System use case interactions with 40% of the workload in a controlled Assessment Concurrency Problem. Response times calibrated to the under-loaded system comparison (~5s).
» Calibrated Academic Suite with Complex Domains: Identical workload to the under-loaded Learning System with Community System model, but with the definition of 50 complex domain relationships. Response times calibrated to the under-loaded system comparison (~5s).
18
Performance Scenarios
» All scenarios are targeted for single, dual and triple workload evaluations.
  » Load-balanced servers (1 to N servers, typically 3)
  » Tomcat clusters (1 to N nodes)
» Performance calibration at the Application Server level is the main focus.
  » Pre-defined Application Configuration
» Calibrated for response time acceptance:
  » Under-Loaded: sub 5 seconds (~ under 1 second)
  » Calibrated: sub 10 seconds (~ under 2 seconds)
  » Over-Loaded: sub 15 seconds (~ under 10 seconds)
19
Performance Objective #1

Model Name | Benchmark #1 | Benchmark #2 (Min. Threads) | Improvement
Small Institution (Sun Microsystems) | 25 minutes | 16 minutes (4 threads) | 36%
Moderate Institution (Sun Microsystems) | 309 minutes | 130 minutes (4 threads) | 58%
Large Institution (Sun Microsystems) | 6360 minutes | 2650 minutes (4 threads) | 58%

Model Name | Benchmark #1 | Benchmark #2 (Min. Threads) | Improvement
Small Institution (Linux) | 21 minutes | 12 minutes (4 threads) |
Small Institution (Windows) | 9 minutes | Not Valid | NA
Moderate Institution (Linux) | 288 minutes | 107 minutes (4 threads) | 37%
Moderate Institution (Windows) | 196 minutes | Not Valid | NA
Large Institution (Linux) | 5389 minutes | 2120 minutes | 40%
Large Institution (Windows) | 989 minutes | Not Valid | NA
20
Performance Objective #2: Regression Comparison

[Chart: Sessions Per Hour by platform (Windows, Linux, Solaris) for releases 6.3, 7.0 and 7.1; vertical axis 0 to 30,000 sessions per hour.]
21
Performance Objective #3: Complex Domain Analysis

[Chart: Sessions Per Hour by test run]
Simple Single Server: 8080 Sessions Per Hour
Complex Single Server: 8102 Sessions Per Hour
Simple Dual Server: 13341 Sessions Per Hour
Complex Dual Server: 14113 Sessions Per Hour
22
Performance Objectives #7 and #9: T2000 Analysis and Clustering

[Chart: Sessions Per Hour for three configurations: Single Server (2 Nodes), Load Balanced (1 Node Each), Load Balanced (2 Nodes Each); vertical axis 0 to 20,000 sessions per hour.]
23
Performance Objective #9: Cluster Comparison

[Chart: Assessments Per Hour by platform and version (6.3 Windows, 7.0 Windows, 7.0 Linux, 7.0 Linux Cluster) for workloads of 120 and 240 concurrent assessment simulations; vertical axis 0 to 7,000 assessments per hour.]
24
Performance Objective #7

Workload R1 (120 possible concurrent simulations, Learning System/Community System):
» R7.1 Entry-Level: 7238 Sessions/Hr, 19 UPL/Second, 311,656 Bytes/Second, 51,888 Transactions
» R7.1 Mid-Level: 8080 Sessions/Hr, 22 UPL/Second, 480,824 Bytes/Second, 53,780 Transactions
» R7.1 High-Level: 8212 Sessions/Hr, 22 UPL/Second, 488,168 Bytes/Second, 54,049 Transactions
» R7.1 HL Clustered (2 Nodes): 10455 Sessions/Hr, 25 UPL/Second, 544,673 Bytes/Second, 59,239 Transactions

Workload R2 (240 possible concurrent simulations, Learning System/Community System):
» R7.1 Entry-Level: 12459 Sessions/Hr, 31 UPL/Second, 640,958 Bytes/Second, 87,433 Transactions
» R7.1 Mid-Level: 13341 Sessions/Hr, 34 UPL/Second, 729,616 Bytes/Second, 90,353 Transactions
» R7.1 High-Level: 14913 Sessions/Hr, 33 UPL/Second, 695,319 Bytes/Second, 94,181 Transactions
» R7.1 HL Clustered (4 Nodes): 16063 Sessions/Hr, 45 UPL/Second, 968,128 Bytes/Second, 106,659 Transactions

Workload R3 (360 possible concurrent simulations, Learning System/Community System):
» R7.1 Entry-Level: 17288 Sessions/Hr, 42 UPL/Second, 901,103 Bytes/Second, 118,754 Transactions
» R7.1 Mid-Level: 18455 Sessions/Hr, 50 UPL/Second, 1,102,667 Bytes/Second, 130,811 Transactions
» R7.1 High-Level: 20343 Sessions/Hr, 51 UPL/Second, 1,145,440 Bytes/Second, 145,287 Transactions
» R7.1 HL Clustered (6 Nodes): 24034 Sessions/Hr, 65 UPL/Second, 1,329,037 Bytes/Second, 157,629 Transactions
25
Performance Objective #7 (Cont.)

Workload R7 (200 possible concurrent simulations, Full Academic Suite):
» R7.1 Entry-Level: 5721 Sessions/Hr, 13 UPL/Second, 275,672 Bytes/Second, 37,313 Transactions
» R7.1 Mid-Level: 12548 Sessions/Hr, 33 UPL/Second, 728,082 Bytes/Second, 84,004 Transactions
» R7.1 High-Level: 12974 Sessions/Hr, 35 UPL/Second, 735,846 Bytes/Second, 84,970 Transactions
» R7.1 HL Clustered (4 Nodes): 13804 Sessions/Hr, 36 UPL/Second, 763,955 Bytes/Second, 90,941 Transactions

Workload R8 (400 possible concurrent simulations, Full Academic Suite):
» R7.1 Entry-Level: 11908 Sessions/Hr, 34 UPL/Second, 668,189 Bytes/Second, 77,553 Transactions
» R7.1 Mid-Level: 18857 Sessions/Hr, 53 UPL/Second, 1,157,486 Bytes/Second, 118,353 Transactions
» R7.1 High-Level (2 Nodes): 14668 Sessions/Hr, 32 UPL/Second, 676,802 Bytes/Second, 96,742 Transactions
» R7.1 HL Clustered (4 Nodes): 24034 Sessions/Hr, 65 UPL/Second, 1,392,037 Bytes/Second, 157,629 Transactions

Workload R9 (600 possible concurrent simulations, Full Academic Suite):
» R7.1 Entry-Level: 12652 Sessions/Hr, 25 UPL/Second, 451,975 Bytes/Second, 64,289 Transactions
» R7.1 Mid-Level: 23056 Sessions/Hr, 63 UPL/Second, 1,196,553 Bytes/Second, 149,709 Transactions
» R7.1 High-Level (3 Nodes): 20207 Sessions/Hr, 47 UPL/Second, 1,014,189 Bytes/Second, 130,907 Transactions
» R7.1 HL Clustered (6 Nodes): 27997 Sessions/Hr, 71 UPL/Second, 1,527,433 Bytes/Second, 181,121 Transactions
26
Performance Objectives #5, 6, 7, 8 and 10: Hardware Comparison

[Chart: Sessions Per Hour for single and distributed workloads across hardware configurations (ASP Blades; PerfEng 1850 3.0 GHz; PerfEng 6650 2.8 GHz; PerfEng T2000; Solaris V490 1.5 GHz); vertical axis 0 to 20,000 sessions per hour.]
27
Part 3: Working with the Sizing Guide
28
Determining My Adoption Profile
» Read the Blackboard Capacity Planning Guide
  » Capacity Planning Factors: A Must Read!
» Determine your current Performance Maturity Model position
» Put together a business plan with functional and technical stakeholders
  » Identify adoption goals for the coming calendar year.
  » Identify application initiatives (if they don't know, have them pursue them)
    » Feature Roll-Out
    » Sub-System Enablement
    » Change in adoption patterns
  » Draft Service Level Agreements (SLAs)
  » Rank use cases in the system
» Set a goal for your future Performance Maturity Model position
  » Probe your end-users and audience for acceptable response times.
29
Determining My Adoption Profile (cont.)
» Look into the past to see the future
  » Analyze the data that is available and digestible (a log-analysis sketch follows this list)
    » Application Log Files for usage by sub-system (trend analysis)
    » Application Log Files for site statistics: hits, files, pages, visits
    » Application Log Files or Network Statistics for bandwidth and utilization
    » Database Growth Statistics and Storage Usage
  » Within Blackboard
    » Write simple database functions to determine the linear and latitudinal state of data
    » If you have historical back-ups, restore and compare against the present state of data
    » Study critical use cases for behavior characteristics
» Work together with the greater Blackboard Community!
» Evaluate Enterprise Monitoring and Measurement Tools
  » Coradiant: True Turn-Key Solution
» Enlist the Statistics or Computer Science Department for support with this analysis.
» Analysis should be done seasonally
  » System vitals should be reviewed weekly and monthly
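Illustration only: the Python sketch below shows the kind of log-file analysis suggested above, deriving sessions per hour and page loads per second. The record format, session identifiers and URLs are hypothetical; adapt the parsing to your actual web or application server logs.

```python
from collections import defaultdict

# Hypothetical log records as (hour, session_id, url) tuples; in practice these
# would be parsed from your web/application server access logs.
records = [
    ("09:00", "sess42", "/webapps/portal/frameset.jsp"),
    ("09:00", "sess42", "/webapps/blackboard/content/listContent.jsp"),
    ("09:00", "sess77", "/webapps/portal/frameset.jsp"),
    ("10:00", "sess90", "/webapps/assessment/take.jsp"),
]

sessions_by_hour = defaultdict(set)
page_loads_by_hour = defaultdict(int)

for hour, session_id, url in records:
    sessions_by_hour[hour].add(session_id)  # distinct sessions observed in the hour
    page_loads_by_hour[hour] += 1           # each request counted as a page load

for hour in sorted(sessions_by_hour):
    sessions = len(sessions_by_hour[hour])
    loads_per_second = page_loads_by_hour[hour] / 3600
    print(f"{hour}: {sessions} sessions/hour, {loads_per_second:.4f} page loads/second")
```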
30
Light Adoption Profile
» Support peak workload of 1,000 to 20,000 sessions per hour based on configuration.
  » Roughly 10 to 60 Unique Page Loads per Second
  » Average page or download size of 100 KB
» Often used as an external complementary aid to the class.
» Low adoption institutionally
  » 15 to 35% of active courses/sections take advantage
  » Limited functionality
    » Mostly content sharing
    » Little collaboration
» Over-Loaded to Calibrated Configurations Used
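A back-of-the-envelope illustration of how the workload figures above relate: page-load rate multiplied by average page size approximates the network throughput to plan for. The conversion is rough arithmetic for illustration, not a published Blackboard formula.

```python
# Upper end of the Light Adoption Profile figures above.
unique_page_loads_per_second = 60
average_page_size_bytes = 100 * 1024  # ~100 KB average page or download

throughput_bytes_per_second = unique_page_loads_per_second * average_page_size_bytes
throughput_mbit_per_second = throughput_bytes_per_second * 8 / 1_000_000

print(f"~{throughput_bytes_per_second:,} bytes/second "
      f"(~{throughput_mbit_per_second:.0f} Mbit/s) at peak page-load rate")
```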
31
Moderate Adoption Profile
» Support peak workload of 10,000 to 30,000 sessions per hour based on configuration.
  » Roughly 30 to 90 Unique Page Loads per Second
  » Average page or download size of 500 KB
» Critical application on campus behind e-mail.
» Moderate adoption institutionally
  » Highest among students
  » 35 to 50% of active courses/sections take advantage
  » Extensive functionality
    » Advanced content sharing
    » Collaboration
    » In-Class models for assessment and content delivery
» Ideal target for Calibrated Environment
32
Heavy Adoption Profile
» Support peak workload of 30,000 to 50,000 sessions per hour based on configuration.
  » Greater than 100 Unique Page Loads per Second
  » Average page or download size of 100 KB to 1 MB
» Workload rivals some of the largest commerce sites.
» Heavy adoption institutionally
  » Institutional initiative to leverage Blackboard
  » Extensive functionality
    » Advanced content sharing
    » Heavy integration and Building Blocks
    » Extreme collaboration
    » In-Class models for assessment and content delivery
» Optimal for Under-Loaded Configuration
33
Choosing the Right Hardware
» Cost Performance Model
  » Cost-Conscious Institutions
  » Calibrated to 10 second abandonment policy
  » Mostly Level 1 and 2 Performance Maturity Model Institutions
» High Performance Model
  » Performance over cost (Institutional Goals for Adoption)
  » Calibrated to 5 second abandonment policy
  » Mostly Level 3 through 5 Performance Maturity Model Institutions
34
Reading Each Profile
» Workload Characteristics
  » Sessions Per Hour: Concurrency is not a valid identifier, and neither are FTE counts.
  » Unique Page Loads Per Second: Complementary metric based on concurrent workload, not users.
» Homogeneous configurations are presented, but there is a shift toward heterogeneous configurations.
» Web/Application Tier
  » 1, 2 and 4 socket systems presented based on CPU clock speed, RAM and server counts.
» Database Tier
  » 1, 2, 4 and 8 socket systems presented based on CPU clock speed, RAM and server counts.
  » Real Application Clusters offered
35
Light Adoption Profile: Cost Performance
» Resembles Calibrated to Over-Loaded Performance Configuration
» Requires a distributed configuration from start
  » Application and database system
» Recommend Load-Balancing from start
  » Blackboard scales best horizontally (consider clusters as well)
  » Each application server will support 5,000 to 7,000 unique sessions in an hour.
  » Blade or Pizza Box Model Most Efficient
» 10,000 sessions per hour ≠ 10,000 users logged in (see the sketch after this list)
  » Based on a queuing model in which about 250 unique users are authenticated at any time
  » Each session is roughly 90 seconds in length
  » Disposable, trivial use cases
» Systems utilized no greater than 65% at the application tier during peak workload and 30% at the database tier.
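A minimal sketch of the queuing arithmetic behind the figures above, assuming Little's Law applies (a steady arrival rate and roughly 90 second sessions). The per-server capacity comes from this slide; the rest is illustrative, not a Blackboard sizing formula.

```python
import math

sessions_per_hour = 10_000             # peak workload for this profile
avg_session_seconds = 90               # roughly 90 second sessions
per_server_sessions_per_hour = 7_000   # upper bound quoted per application server

# Little's Law: concurrent users = arrival rate x time in system.
arrival_rate_per_second = sessions_per_hour / 3600
concurrent_users = arrival_rate_per_second * avg_session_seconds

# Horizontal sizing: application servers implied by the peak workload.
app_servers_needed = math.ceil(sessions_per_hour / per_server_sessions_per_hour)

print(f"~{concurrent_users:.0f} users authenticated at any time")  # ~250
print(f"{app_servers_needed} application server(s) at "
      f"{per_server_sessions_per_hour} sessions/hour each")
```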
36
Light Adoption Profile: High Performance
» Resembles Calibrated Performance Configuration
» Requires a distributed configuration from start
  » Application and database system
» Requires Load-Balancing from start
  » Blackboard scales best horizontally (consider clusters as well)
  » Best performance when each application server supports 7,000 sessions or fewer.
  » Good Candidate for Clustering
  » Blade or Pizza Box Model Most Efficient
» 20,000 sessions per hour ≠ 20,000 users logged in
  » Based on a queuing model in which about 500 unique users are authenticated at any time
  » Distributed workload against a load-balanced configuration
  » Each session is roughly 90 seconds in length
  » Disposable, trivial use cases
» Systems utilized no greater than 65% at the application tier during peak workload and 30% at the database tier.
37
Moderate Adoption Profile: Cost Performance
» Resembles Calibrated Performance Configuration
» Requires a distributed configuration from start
  » Application and database system
» Requires Load-Balancing from start
  » Blackboard scales best horizontally (consider clusters as well)
  » Blade or Pizza Box Model Most Efficient
  » Quad Server Model can be as effective
  » Multi-Core Technologies just as effective
  » Good candidate for Clustering
» 20,000 sessions per hour ≠ 20,000 users logged in
  » More complex use cases
  » Maturity in how the product is being used
  » Robust execution models: concurrency and queuing models
» Consider RAC as a Cost Performance Alternative over a Large Monolithic Deployment
  » NAS-based storage just as effective and easy to manage.
  » Consider the investment now when it is manageable.
38
Moderate Adoption Profile: High Performance
» Resembles Calibrated to Under-Loaded Performance Configuration
» Requires a distributed configuration from start
  » Application and database system
» Requires Load-Balancing from start
  » Blackboard scales best horizontally
  » Clustering will assist
» Systems are not saturated or even close to being saturated.
  » Consistent utilization greater than 65% is the limit.
  » Quad Socket with multi-core DB is optimal, if not larger
  » Still a good candidate for RAC
» Consider RAC as early as possible
  » If not RAC, scale up at the database tier.
» 30,000 sessions per hour ≠ 30,000 users logged in
  » Each session is roughly 300 seconds in length
39
Heavy Adoption Profile: Cost Performance
» Resembles Calibrated to Under-Loaded Configuration
» Requires a distributed configuration from start
  » Application and database system
» Requires Load-Balancing from start
  » Blackboard scales best horizontally (consider clusters as well)
» 30,000 sessions per hour ≠ 30,000 users logged in
  » Each application server will support 7,000 sessions per hour
» RAC or Scale-Up Model on the Database
  » Windows clients need to make a decision around database support for scalability
  » Database capable of supporting 50,000, but best performance when only supporting 20k
40
Heavy Adoption Profile: High Performance
» Resembles Under-Loaded Configuration
» Requires a distributed configuration from start
  » Application and database system
» Requires Load-Balancing from start
  » Blackboard scales best horizontally (consider clusters as well)
» 50,000 sessions per hour ≠ 50,000 users logged in
» RAC or Scale-Up Model
41
Sizing Storage
» Determine the rate of growth of key tables (a projection sketch follows this list)
  » QTI Tables (ASI and Result)
  » Portal Extra Info
  » Discussion Board
  » Course Content
  » Users and Course Users
  » Activity Accumulator
» Determine the rate of growth from a data file perspective in the database.
  » Monthly projections with the goal of determining patterns
» Determine the rate of growth from a file system perspective.
  » Monthly projections with the goal of determining patterns
» Nail down a strategy for data retention and archival.
» Research the following:
  » http://www.spec.org/sfs97r1/results/sfs97r1.html
  » http://www.storageperformance.org
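A minimal sketch of the monthly projection step above, assuming database data-file and file-system usage are recorded once a month; the measurements and provisioned capacity below are hypothetical.

```python
# Hypothetical monthly measurements, in GB.
db_datafiles_gb = [38, 41, 45, 48, 52, 57]
file_system_gb = [180, 196, 215, 231, 250, 272]
provisioned_file_system_gb = 400  # assumed currently provisioned capacity

def monthly_growth(samples):
    """Average month-over-month growth across the sample window."""
    deltas = [later - earlier for earlier, later in zip(samples, samples[1:])]
    return sum(deltas) / len(deltas)

db_growth = monthly_growth(db_datafiles_gb)
fs_growth = monthly_growth(file_system_gb)
months_of_headroom = (provisioned_file_system_gb - file_system_gb[-1]) / fs_growth

print(f"Database data files growing ~{db_growth:.1f} GB/month")
print(f"File system growing ~{fs_growth:.1f} GB/month")
print(f"~{months_of_headroom:.0f} months until the provisioned file system is full")
```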
42
Storage

Number of Existing Courses | Number of Existing Users | File System Size | Ratio of File System to Database Storage
500 | 7,000 | 20 GB | 10:1
5,000 | 50,000 | 200 GB | 5:1
50,000 | 300,000 | 800 GB | 4:1
500,000 | 600,000 | Greater than 1 TB | 3:1
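A hedged reading aid for the table above: if the ratio is interpreted as file system size to database size, it can be used to back into a database storage estimate from a measured or projected file system size. Confirm the interpretation against the sizing guide before purchasing.

```python
# From the table above: around 5,000 courses / 50,000 users, plan for roughly
# a 200 GB file system with a 5:1 file system to database storage ratio.
file_system_gb = 200
fs_to_db_ratio = 5  # 5:1, read as file system size : database size

estimated_database_gb = file_system_gb / fs_to_db_ratio
print(f"Estimated database storage: ~{estimated_database_gb:.0f} GB")  # ~40 GB
```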
43
Load-Balancer Support
» The guide is fairly agnostic, as long as the load-balancing device supports session affinity.
» Blackboard as an organization advises on the following vendors in particular:
  » NetScaler (Used in ASP)
  » Juniper Networks (Used in Product Development)
  » F5 Big IP (Formerly used in ASP)
44
Part 4: References and Resources
45
References
» Blackboard Academic Suite Hardware Sizing Guide (Behind the Blackboard)
» Performance and Capacity Planning Guidelines for the Blackboard Academic Suite (Behind the Blackboard)
» http://www.perfeng.com
» http://www.spec.org/sfs97r1/results/sfs97r1.html
» http://www.storageperformance.org
» http://www.coradiant.com
» http://www.quest.com
» http://www.bmc.com
46
Past Presentations of Note» B2 2006: How We Size the Academic Suite, Benchmarking at Blackboard
» B2 2006: Deploying Tomcat Clusters in an Advanced Blackboard Environment
» 2006 BbWorld Presentation: Practical Guide to Performance Tuning and Scaling (2 Hour Workshop)
» B2 2005: Introduction to Load Testing, A Blackboard Primer
» B2 2005: Performance Testing Building Blocks
» Users Conference 2005: Managing Your Blackboard Deployment for Growth and Performance
» Users Conference 2005: Applied Software Performance Engineering
» B2 2004: Introduction to Software Performance Engineering
» B2 2004: Profiling Building Blocks for Performance Analysis
47
Questions?