071510 sun b_1515_feldman_stephen_forpublic

Scaling Blackboard for Large Scale Distance Learning Communities Steve Feldman, [email protected]

Upload: steve-feldman

Post on 10-May-2015


DESCRIPTION

2010 BbWorld presentation on Going Virtual with a 100% online presence.

TRANSCRIPT

Page 1: 071510 sun b_1515_feldman_stephen_forpublic

Scaling Blackboard for Large Scale Distance Learning

Communities

Steve Feldman, [email protected]

Page 2: 071510 sun b_1515_feldman_stephen_forpublic

online learning * Learning that takes place partially or entirely over the Internet.

Page 3: 071510 sun b_1515_feldman_stephen_forpublic

The Online Momentum Shift

•  66% of degree-granting post-secondary institutions in the US offer online, hybrid/blended online and other distance education courses.1

•  Over 4.6 million students were taking at least one online course during the fall 2008 term; a 17 percent increase over the number reported the previous year.2

•  The 17 percent growth rate for online enrollments far exceeds the 1.2 percent growth of the overall higher education student population.

•  By 2020, 50% of high school students will take an online course.1


Page 4: 071510 sun b_1515_feldman_stephen_forpublic

Communities are Getting Larger

•  State and County Initiatives

•  Consortium Programs and strategic alliances between institutions.

•  Content distribution networks

•  New sources of revenue to reach markets and students that were not historically accessible
    –  Non-traditional students are being marketed to

Page 5: 071510 sun b_1515_feldman_stephen_forpublic

Stakes are Getting Higher

•  Competition for funding by government

•  Competition for revenue by students

•  Learning modality changing with each technological innovation

•  User expectations and online behavior changing constantly

•  Hours of availability fighting toward mission critical
    –  VLEs are often identified as 24x7 mission-critical systems, but the resources to support them are more like 8x5

Page 6: 071510 sun b_1515_feldman_stephen_forpublic

What are we modeling…

Connected Learning Modality
•  Large Active Communities
•  Heavy Adoption of Advanced Tools
•  Extended/Frequent Time in System
•  Richer Content and User Experience

•  Hundreds to Thousands Concurrent Sessions
•  Emphasis on Asynchronous & Synchronous Collaboration
•  Longer ClickStreams & Disposable Access
•  Larger pages, graphics/video, client-side interactions

Performance, Availability, Scalability

Page 7: 071510 sun b_1515_feldman_stephen_forpublic

scalability* The ability for a distributed system to expand by accommodating greater levels of load while maintaining similar levels of performance.

Page 8: 071510 sun b_1515_feldman_stephen_forpublic
Page 9: 071510 sun b_1515_feldman_stephen_forpublic

Scalable Deployments

•  Emphasis on adoption of virtualization technologies
    –  Virtualization technology transparent to guest OS and application.
    –  Why: Take advantage of CPU and Memory expansion
•  Emphasis on fast provisioning
    –  Provisioning technology such as Dell AIM, VMware deployment technology and XenServer deployment technology
    –  Why: These are solved problems; they minimize human error and speed up deployment.
•  Emphasis on diskless systems
    –  Hardware is just “rented” space for CPU, Memory and Network.
    –  Why: Network and storage are now so fast, why be dependent on “wired” solutions?

Page 10: 071510 sun b_1515_feldman_stephen_forpublic

performance* The amount of useful work accomplished by a computer system compared to the time and resource used.

Alternative Definition: Response time plus latency.

Page 11: 071510 sun b_1515_feldman_stephen_forpublic
Page 12: 071510 sun b_1515_feldman_stephen_forpublic

Responsive Deployments

•  Large 64-bit address space…
    –  It’s cheaper today than 4 years ago
    –  Technology is heading this direction
    –  It’s not a bad thing…
•  Plentiful CPU worker threads…
    –  Use only what you need
    –  Take advantage of hyperthreading and MT technology
    –  Partition via virtualization
•  Many bigger…distributed environments
•  Continuous maintenance
    –  If you want your systems to remain fast, you have to “service” the roads. Lots of litter and potholes out there.

Page 13: 071510 sun b_1515_feldman_stephen_forpublic

What is Performance?

•  Performance is quantifiable and measurable
•  Performance is also perception
•  Mostly recognized from a cognitive perspective
    –  Instantaneous
    –  Immediate
    –  Continuous
    –  Captive

Response Time + Latency = Performance

Page 14: 071510 sun b_1515_feldman_stephen_forpublic

Realistic Approaches to Achieve Performance

•  Eliminate interface and resource contention.
    –  Better to have more capacity than queuing
•  Know your user behavior.
•  Optimize for the saturated and low-bandwidth network conditions.
    –  Enable Compression
    –  Optimize Images
    –  Cache Static Content
•  Large JVM memory allocations are not a bad thing, but rather something to expect with Java-based applications.
    –  Large JVM (4GB to 16GB) with aggressive options you understand.
•  Two keys to the database
    –  Continuous maintenance
    –  Understand the key queries and how the CBO handles
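As a rough illustration of why enabling compression helps on saturated or low-bandwidth links, the sketch below gzips a repetitive HTML fragment and compares sizes. The markup is a made-up stand-in for a course page, not real Blackboard output, and actual savings depend on the content being served.

```python
import gzip

# Hypothetical HTML payload standing in for a course listing page.
page = (
    "<li class='course-item'><a href='/course/123'>Course title</a></li>\n" * 500
).encode("utf-8")

compressed = gzip.compress(page)
ratio = len(compressed) / len(page)

print(f"original:   {len(page)} bytes")
print(f"compressed: {len(compressed)} bytes ({ratio:.1%} of original)")

# The round trip must be lossless, otherwise the browser would see a different page.
restored = gzip.decompress(compressed)
```

In a real deployment compression is normally switched on in the web server or load balancer rather than in application code; the point here is only the size difference for text-heavy responses.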

Page 15: 071510 sun b_1515_feldman_stephen_forpublic
Page 16: 071510 sun b_1515_feldman_stephen_forpublic

availability* The capability to service a functional request without issue under conditions of desired performance and workload scalability.

Page 17: 071510 sun b_1515_feldman_stephen_forpublic

What is Availability?

•  High-availability offerings mask the effects of a system failure in order to minimize the impact on access to, and functional use of, a system by a community of users.
•  Simple Definition:
    –  Percentage of time the system is in its operational state.
•  You will often hear the concept of 3x9’s, 4x9’s or even 5x9’s
    –  Planned versus Unplanned
•  Availability = (Total Units of Time – Downtime) / Total Units of Time
    –  8760 hours in a year
    –  Downtime = 10 hours
    –  Availability = (8760 – 10)/8760 ≈ 99.89%
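The availability formula on this slide can be checked with a few lines of Python. This is a minimal sketch; the function name and the default of 8760 hours are illustrative choices taken from the slide's own example.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours, as used on the slide


def availability(downtime_hours: float, total_hours: float = HOURS_PER_YEAR) -> float:
    """Return availability as a fraction: (total - downtime) / total."""
    if not 0 <= downtime_hours <= total_hours:
        raise ValueError("downtime must be between 0 and total_hours")
    return (total_hours - downtime_hours) / total_hours


# The slide's example: 10 hours of downtime in a year.
print(f"{availability(10):.4%}")  # 99.8858%
```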

Page 18: 071510 sun b_1515_feldman_stephen_forpublic

Quick View into Availability Statistics

Availability Percentage Model    Unexpected Downtime per Year
90%                              36.5 days
95%                              18.25 days
98%                              7.30 days
99%                              3.65 days
99.5%                            1.83 days
99.8%                            17.52 hours
99.9%                            8.76 hours
99.95%                           4.38 hours
99.99%                           52.6 minutes
99.999%                          5.26 minutes
99.9999%                         31.5 seconds
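The downtime figures in this table follow directly from the availability percentage. The sketch below recomputes them, assuming a 365-day year; the helper names are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000; leap years ignored


def downtime_seconds_per_year(availability_pct: float) -> float:
    """Return the downtime budget, in seconds per year, for a given availability."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return SECONDS_PER_YEAR * (100 - availability_pct) / 100


def describe(seconds: float) -> str:
    """Format a duration using the largest unit that gives a value of at least 1."""
    for unit, size in (("days", 86400), ("hours", 3600), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.2f} {unit}"
    return f"{seconds:.1f} seconds"


for pct in (90, 99, 99.9, 99.99, 99.999, 99.9999):
    print(f"{pct}% -> {describe(downtime_seconds_per_year(pct))}")
```

Each additional nine divides the yearly downtime budget by ten, which is why the step from 99.9% to 99.99% moves the target from hours to minutes.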

Page 19: 071510 sun b_1515_feldman_stephen_forpublic

Realistic Views of Availability

•  If the application is not functioning as expected, but you can log in, is it available?
    –  Perception versus Reality
    –  If it’s slow, do my users feel just as bad as if they received an error?
•  How do you plan for the unexpected?
    –  Practice really does make perfect
•  Do I treat the calendar from a date and time perspective differently from an availability perspective?
    –  Will my users cause problems if I take the site down during low usage periods/dates?
    –  Will the users even know that something happened?
    –  Can I recover fast enough?

Page 20: 071510 sun b_1515_feldman_stephen_forpublic

Realistic Approaches to Achieve Availability

•  Strategically picking redundancy in the architecture.
    –  Servers and storage make sense to a degree
    –  Monitoring makes sense
    –  Do advanced clustering architectures really make a difference?
    –  Do the costs of a dedicated DR facility and site make sense?
•  Choosing the right initiatives based on the resources available to manage
    –  Don’t set your administrators up to fail.
    –  If you don’t have the capabilities on-site, don’t be skeptical of outsourcing the problem.
•  Balance costs over goals
    –  Choose the right places to put your pennies.
    –  Make the business drive the decision…it’s their money!

Page 21: 071510 sun b_1515_feldman_stephen_forpublic

Deployment: Availability

•  VLEs are different beasts today than in the past.
    –  Communities are bigger
    –  Sessions last longer
    –  Content is richer
    –  Key point: Adoption is greater and users expect their sites up 24 x 7 x 365
•  Architecture is designed for many parallel instances of the product scaled in a horizontal fashion.
    –  Distributed physical deployments
    –  Virtualization is a key element
•  Database failover more important than horizontal database scalability.
    –  Emphasis on vertical database scalability

Page 22: 071510 sun b_1515_feldman_stephen_forpublic

Deployment: Advanced Monitoring

•  Measurement is the secret sauce for successful deployments.
    –  Most reliable and scalable deployments measure beyond the server infrastructure
•  Different types of measurements
    –  System/Environmental measurements
    –  Business measurements
    –  Synthetic measurements
•  Collecting is only part of the prize
    –  Need to analyze the data to drive business decisions from the data.

Page 23: 071510 sun b_1515_feldman_stephen_forpublic
Page 24: 071510 sun b_1515_feldman_stephen_forpublic

Lifecycle of Measurement

Define Metrics: Goal Setting

Identify Method of Gathering: Isolate Tools and Processes

Implement Instrumentation: Begin Measuring

Align to KPI/ROI: Share with Stakeholders

Recommend Changes: Show Business Value

Reset Expectations: New Initiatives

Page 25: 071510 sun b_1515_feldman_stephen_forpublic

Different Types of Monitoring

Synthetic Monitoring

Real User Monitoring

Performance Forensic Monitoring

Page 26: 071510 sun b_1515_feldman_stephen_forpublic

What is Synthetic Monitoring?

•  Automated monitoring technique to measure the functional behavior of a system, sub-system or component.

•  Typically a scheduled activity used to measure the availability, responsiveness and functional attributes of a common application scenario.

•  Can be executed from any access point to the system in question, either internal or external.

•  Also considered “Active” Monitoring of a system

•  Not intended to supply load, but rather perform sampling of performance and availability

•  Two methods: –  HTTP Simulation or Real Browser Emulation

Page 27: 071510 sun b_1515_feldman_stephen_forpublic

Tools for Synthetic Transactions

•  You can really use any form of HTTP emulation tool like JMeter, Grinder, MSTS, LoadRunner, SilkPerformer, SOASTA, etc…

•  Some monitoring software systems like Foglight, SiteScope, Nagios, CA IntroScope, Argent Defender

•  External services: Keynote, Gomez (Compuware), WebMetrics, AlertSite, Pingdom, SiteUpTime

•  Browser based solution: Selenium

Page 28: 071510 sun b_1515_feldman_stephen_forpublic

Strategies for Synthetic Transactions

•  Site and Host Ping Tests should run on a multi-second basis (15s to 30s)

•  Common, yet critical paths targeting functional systems for availability should run on a continuous interval (x < 5 minutes).

•  Complicated paths focusing on performance and availability should run every 30 to 60 minutes.

•  Repeat tests when the desired SLA or outcome is not achieved
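A synthetic check of the HTTP-simulation kind, including the repeat-on-miss behavior, could be sketched as follows. This is not how any of the listed tools work internally; `http_probe`, `run_check`, and the retry parameters are hypothetical names chosen for illustration, and the demo uses a stand-in probe so it runs without a network.

```python
import time
import urllib.request
from typing import Callable, Dict, List, Tuple


def http_probe(url: str, timeout: float = 10.0) -> Callable[[], bool]:
    """Build a probe that reports True when `url` answers with a 2xx status."""
    def probe() -> bool:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    return probe


def timed(probe: Callable[[], bool]) -> Tuple[bool, float]:
    """Run one probe; return (succeeded, elapsed seconds). Errors count as failure."""
    start = time.perf_counter()
    try:
        ok = probe()
    except Exception:
        ok = False
    return ok, time.perf_counter() - start


def run_check(probe: Callable[[], bool], sla_seconds: float,
              max_attempts: int = 3, retry_delay: float = 0.0) -> Dict[str, object]:
    """Repeat the probe until it succeeds within the SLA or attempts run out."""
    timings: List[float] = []
    passed = False
    for attempt in range(max_attempts):
        ok, elapsed = timed(probe)
        timings.append(elapsed)
        passed = ok and elapsed <= sla_seconds
        if passed:
            break
        if attempt < max_attempts - 1:
            time.sleep(retry_delay)
    return {"passed": passed, "attempts": len(timings), "timings": timings}


# Stand-in probe: fails twice, then succeeds.
outcomes = iter([False, False, True])
result = run_check(lambda: next(outcomes), sla_seconds=1.0, max_attempts=5)
print(result["passed"], result["attempts"])  # True 3
```

For a real target, the probe would be built with something like `http_probe("https://lms.example.edu/login")` (a placeholder URL) and the whole check scheduled at the intervals suggested on this slide.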

Page 29: 071510 sun b_1515_feldman_stephen_forpublic

What is Real User Experience Monitoring?

•  Passive web monitoring that observes web traffic to measure the user experience.

•  Provides both quality of service and responsiveness metrics in order to gauge service levels of performance and availability.

•  Typically a continuous activity watching silently in a parallel channel or as a pass through channel.

•  Able to capture characteristics about the entire HTTP stream to be used for forensics and user incidents.

•  Most vendors package as an appliance, but beginning to see the rise of “virtual” appliances.

•  Synthetic monitoring is just not enough…

Page 30: 071510 sun b_1515_feldman_stephen_forpublic

Tools for RUM Monitoring

•  Dominated by commercial vendors who have a niche in web performance and/or application performance management.
    –  Quest FxM
    –  Coradiant TrueSight
    –  Oracle Real User Experience Insight
    –  Tealeaf
    –  CA/NetQoS

•  Rise in new tools coming from network equipment vendors like Cisco, Opnet and Citrix/NetScaler

Page 31: 071510 sun b_1515_feldman_stephen_forpublic

Strategies for RUM Monitoring

•  Identify areas of dense usage in order to highlight performance, availability and functional experience in the most common components of the system.
•  Start with a wide lens of traffic watching and slowly narrow the area of focus to minimize the “purge” of data.
•  The “purge” of data is going to happen, so be prepared to move the data out of the system into an alternative repository.
    –  Some of the vendors have already solved this problem via an Enterprise Data Warehouse (e.g., Coradiant BI)
•  Most of these tools can show
    –  Time to First Byte, Host Latency, Network Latency and End-to-End (E2E)
•  Avoid the trap of focusing on Time to First Byte
    –  You are serving an entire application from client to server
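To see why time to first byte alone can mislead, the sketch below compares two page loads with identical first-byte times but very different end-to-end times. The `PageTiming` fields and the sample numbers are invented for illustration; RUM vendors define and capture these metrics in their own ways.

```python
from dataclasses import dataclass


@dataclass
class PageTiming:
    """Hypothetical timestamps for one page view, in seconds after the click."""
    request_sent: float
    first_byte: float
    last_byte: float
    render_done: float

    @property
    def time_to_first_byte(self) -> float:
        return self.first_byte - self.request_sent

    @property
    def end_to_end(self) -> float:
        return self.render_done - self.request_sent


# Same server responsiveness, very different user experience.
light_page = PageTiming(request_sent=0.0, first_byte=0.2, last_byte=0.5, render_done=0.8)
heavy_page = PageTiming(request_sent=0.0, first_byte=0.2, last_byte=2.5, render_done=6.0)

for name, timing in (("light", light_page), ("heavy", heavy_page)):
    share = timing.time_to_first_byte / timing.end_to_end
    print(f"{name}: TTFB={timing.time_to_first_byte:.1f}s "
          f"E2E={timing.end_to_end:.1f}s (TTFB is {share:.0%} of E2E)")
```

Both pages would look equally healthy on a first-byte dashboard, while the user of the heavy page waits several times longer for a usable screen.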

Page 32: 071510 sun b_1515_feldman_stephen_forpublic

What is Performance Forensic Monitoring?

•  Deliberate instrumentation approach to capture performance characteristics about an application deployment.
•  Measures resource and interface statistics not typically visible from the application directly.
•  Provides data points about application code execution that can be tied to the user and/or the application component.
•  Can’t measure everything, but can sample consistently.
    –  Certain data points can be captured on a continuous basis such as Java/J2EE container statistics

Page 33: 071510 sun b_1515_feldman_stephen_forpublic

Tools for Forensic Monitoring

•  Recommended tool sets tie the PFM tool with the RUM tool.
    –  Foglight FxM seamless integration with Foglight Application Cartridges and Database Performance Analysis
    –  Coradiant TrueSight integration with Dynatrace APM (Coradiant AV)
    –  CA NetQoS integration with CA Wily IntroScope
    –  Oracle RUE Insight with Oracle Enterprise Manager for Applications and Databases.
•  Limited supply of open source tools that can perform a fraction of the functionality.
    –  No known integrations with RUM tools
    –  Point based tools per container (not aggregators)
    –  Example tools: JConsole, Java VisualVM

Page 34: 071510 sun b_1515_feldman_stephen_forpublic

Strategies for Forensic Monitoring

•  Measure the essentials such as container interfaces and resources.
•  Most vendors have rule agents to begin sampling with a greater degree of instrumentation when certain rules are broken.
•  Retain statistics for extended periods of time (greater than 1 year) for annual, monthly, weekly, daily and hourly comparison purposes.
•  Construct trending thresholds for alert purposes to invoke a planning exercise in advance of an incident.
    –  Yes, application forensics can be used to trend toward future events, because they use past events as points of reference.
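One simple way to turn retained statistics into a trending threshold is to fit a straight line to past samples and estimate when it would cross a limit. The sketch below does this with a least-squares fit; the weekly heap-usage numbers and the 80% limit are invented for illustration, and a linear trend is an assumption that real workloads may not follow.

```python
from typing import Optional, Sequence, Tuple


def linear_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_threshold(samples: Sequence[float], threshold: float) -> Optional[float]:
    """Estimate how many more sampling periods pass before the trend hits threshold.

    Returns None when the trend is flat or falling, 0.0 if already past the limit.
    """
    xs = list(range(len(samples)))
    slope, intercept = linear_fit(xs, samples)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - xs[-1])


# Hypothetical weekly peak heap usage, in percent of the configured maximum.
weekly_heap_pct = [50, 52, 54, 56, 58]
remaining = periods_until_threshold(weekly_heap_pct, threshold=80)
print(f"weeks until 80% heap usage at the current trend: {remaining:.1f}")  # 11.0
```

An alert on "fewer than N periods remaining" gives administrators time to plan capacity before the limit is reached, rather than reacting after an incident.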

Page 35: 071510 sun b_1515_feldman_stephen_forpublic

Please provide feedback for this session by emailing [email protected].

The subject of the email should be title of this session:

Scaling Blackboard for Large Scale Distance Learning Communities