The Evolution of the Architecture of Web Applications Requires a Load Testing Revolution
DESCRIPTION
Long-held best practices of n-tier application architecture are being challenged. Old design patterns are new again. This session discusses the architecture evolution of Web applications, load testing approaches, and the future.
TRANSCRIPT
The Evolution of the Architecture of Web
Applications Requires a Load Testing Revolution
Imad Mouline – CTO – Compuware APM
@imadmouline
Where The Data Comes From
Synthetic monitoring and load testing scripts from 3,000 enterprises, across a broad number of verticals
Number of distinct scripts ranges from 45K to 68K, depending on the data mining exercise
150+ backbone / data center / cloud testing locations, and thousands of Last Mile / desktop testing locations
Real user monitoring measurements from 200+ sites
Users from around the world
Measurement samples range from 117 million to 526 million pages / user interactions, depending on the data mining exercise
Observation 1
Web applications are becoming more composite
The Client Is Becoming THE Integration Platform
* Source: Gomez 2010
By The Numbers – February 2010
Measurement city Number of hosts per user transaction
Hong Kong 5.51
Beijing 6.91
London 7.46
New York 8.17
Frankfurt 8.66
Number of hosts accessed directly by the browser, per user transaction, averaged across 3,000 companies
By The Numbers – June 2010
Number of hosts accessed directly by the browser, per user transaction, averaged across 3,000 companies
Measurement city Number of hosts per user transaction
Hong Kong 7.56
Beijing 8.57
London 8.59
New York 8.85
Frankfurt 8.87
By The Numbers – September 2010
Number of hosts accessed directly by the browser, per user transaction, averaged across 3,000 companies
Measurement city Number of hosts per user transaction
Hong Kong 6.82
Beijing 8.87
London 7.95
New York 9.82
Frankfurt 8.71
Paris 10.12
Stockholm 10.48
Helsinki 12.71
By The Numbers – November 2010
Number of hosts accessed directly by the browser, per user transaction, averaged across 3,000 companies
Measurement city Number of hosts per user transaction
Hong Kong 5.50
Paris 6.27
Amsterdam 6.90
London 7.25
Frankfurt 7.45
Beijing 9.10
Stockholm 9.61
New York 10.50
Helsinki 11.57
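One way to compare the snapshots above is to compute the percentage change in hosts per user transaction for the cities that appear in both the February and November 2010 tables. The following is a minimal sketch; the figures are copied from those two tables, while the function and variable names are illustrative assumptions rather than anything from the talk.

```python
# Hosts per user transaction, copied from the February and November 2010 tables.
feb_2010 = {"Hong Kong": 5.51, "Beijing": 6.91, "London": 7.46,
            "New York": 8.17, "Frankfurt": 8.66}
nov_2010 = {"Hong Kong": 5.50, "Paris": 6.27, "Amsterdam": 6.90,
            "London": 7.25, "Frankfurt": 7.45, "Beijing": 9.10,
            "Stockholm": 9.61, "New York": 10.50, "Helsinki": 11.57}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def changes(before, after):
    """Percent change for every city measured in both snapshots."""
    common = sorted(set(before) & set(after))
    return {city: round(percent_change(before[city], after[city]), 1)
            for city in common}


feb_to_nov = changes(feb_2010, nov_2010)
for city, pct in feb_to_nov.items():
    print(f"{city}: {pct:+.1f}%")
```

Only five cities are common to both tables, so the comparison says nothing about the locations added later in the year.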
Observation 1a
Enterprises Are Adopting the Cloud (with or without their knowledge)
Amazon EC2 Region Percentage
EC2 Asia Pacific - Singapore 0.002
EC2 US West - Northern California 0.659
EC2 EU - Ireland 2.733
EC2 US East - Northern Virginia 16.194
TOTAL 19.588
Web Applications Are Moving To The Cloud – June 2010
Percentage of web app transactions that include at least one object hosted on Amazon EC2
Amazon EC2 Region Percentage
EC2 Asia Pacific - Singapore 0.151
EC2 EU - Ireland 1.578
EC2 US West - Northern California 2.066
EC2 US East - Northern Virginia 24.144
TOTAL 27.938
Enterprises ARE Adopting Cloud Computing – Nov 2010
Percentage of web app transactions that include at least one object hosted on Amazon EC2
Amazon EC2 Region Percentage
EC2 Asia Pacific - Singapore 0.151
EC2 EU - Ireland 1.578
EC2 US West - Northern California 2.066
EC2 US East - Northern Virginia 24.144
TOTAL 27.938
Observation 2
Content is becoming increasingly dynamic and distributed
Geographic Distribution Of Content Sources
How many cities does content come from to form the average transaction?
1 city: 34%
2-5 cities: 36%
6-10 cities: 13%
11-20 cities: 11%
21-30 cities: 4%
>30 cities: 2%
Source: Gomez Active Backbone Monitoring; sample of 12,000 production monitoring scripts; multiple runs over 24 hours
Distribution of host cities by test (measured from multiple locations)
1 city: 27%
2-5 cities: 31%
6-10 cities: 15%
11-20 cities: 12%
21-30 cities: 7%
>30 cities: 8%
Distribution of host cities by test (measured from single location)
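The two distributions listed above can be summarized with a little arithmetic, reading each entry as a city-count bucket followed by its percentage share. The sketch below computes the share of tests whose content comes from more than five cities; the dictionary and function names are illustrative assumptions, and the distributions are labeled only by their order of appearance.

```python
# Percentage of tests per host-city bucket, in the order the two
# distributions appear in the slides.
first_distribution = {"1": 34, "2-5": 36, "6-10": 13,
                      "11-20": 11, "21-30": 4, ">30": 2}
second_distribution = {"1": 27, "2-5": 31, "6-10": 15,
                       "11-20": 12, "21-30": 7, ">30": 8}


def share_above_five_cities(distribution):
    """Sum the shares of every bucket beyond '1' and '2-5'."""
    return sum(pct for bucket, pct in distribution.items()
               if bucket not in ("1", "2-5"))


for name, dist in (("first", first_distribution),
                   ("second", second_distribution)):
    # Each distribution should account for all tests.
    assert sum(dist.values()) == 100
    print(f"{name}: {share_above_five_cities(dist)}% of tests "
          f"pull content from more than 5 cities")
```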
Observation 3
Content is becoming increasingly inter-dependent
Browser Impact on Response Time
Response time differences between Firefox and IE for a 6-step transaction
Internet Explorer 7
Firefox 3.5
Major Differences In IE And Firefox Waterfall Charts
[Figure: IE 7 Waterfall Chart and Firefox 3.5 Waterfall Chart, with per-connection labels (Connection 1 to Connection 12)]
Performance & Availability Issues Can Be Browser Specific
Internet Explorer 7
Firefox 3.5
Performance issue impacting Internet Explorer 7
Issue is 3rd party content blocking on IE only
Load Testing with HTTP Playback: Testing from NYC vs. Atlanta
The load order is the same using HTTP playback, whether testing from NY or Atlanta.
Some of the measurements are different.
Load Testing with IE Playback: Testing from NYC vs. Atlanta
The load order is different between Atlanta and NY
Observation 4
Processing is increasingly being pushed to the client
Significant Performance Differences Across Browsers
[Chart: Load Time and Perceived Render by browser; vertical axis 0 to 7]
Source: Gomez Real-User Monitoring (October 2010); real users around the world; broadband connections only; 526 million page measurements; 200+ sites
Browsers Are Evolving To Support Heavier Client-Side Code
HTML5 support
Application cache, canvas, audio, video, local storage, geolocation, web workers, etc.
CSS3 Support
Webfonts, animations, gradients, shadows, etc.
Performance improvements
Faster JavaScript processing
Parallel download of JS scripts
More parallel connections
Resource pre-fetching
Multi-threading in JS
Key Trend - more and more client-side processing
RIA Frameworks Adoption
25.18% of transactions surveyed depend on at least one of these frameworks
[Chart: Percentage of transactions that leverage each framework; axis 0.00% to 14.00%]
Source: Gomez Active Backbone Monitoring; ~3,000 enterprises; 48K+ distinct transactions active at least once during a 1-hour time period
Observation 5
Mobile Users Are Becoming Less Patient
Web & Mobile Site Performance Impacts User Behavior
[Chart: Abandonment Rate (%) by Page Load Time Band (sec.), 0 to 15 seconds; series: Abandonment Rate - All Browsers, Abandonment Rate - iPhone Safari]
Source: Gomez real user monitoring
Abandonment Rate Across 200+ Sites / 177+ Million Page Views over 1 week / All Browsers vs. iPhone Safari
Significant Performance Differences Across Browsers & Devices
[Chart: Load Time and Perceived Render; vertical axis 0 to 12]
Source: Gomez Real-User Monitoring (October 2010); real users around the world; broadband & wireless connections only; 526 million page measurements; 200+ sites
Recap of Observations
1. Web applications are getting more composite
   • The browser is becoming the integration platform of choice
   • Cloud-hosted app components are mainstream
2. Content is becoming increasingly dynamic
3. Content is becoming increasingly inter-dependent
4. Processing is moving to the client
5. Mobile users are becoming less patient
The Traditional View of Web Application Delivery
[Diagram: Users and the Web application; the traditional zone of control covers Storage, Web Servers, App Servers, DB Servers, Mainframe, Load Balancers, Mobile Components, Network]
Systems management tools: “OK” …user is happy
The Reality of Web Application Delivery
[Diagram: Users and the traditional zone of control (Storage, Web Servers, App Servers, DB Servers, Mainframe, Load Balancers, Mobile Components, Network)]
Slow response time
Transactions fail
Faulty display or operation
Geographic disparities: 4 sec. vs. 22 sec.
…user is NOT happy
The Application Delivery Chain
The Challenge of Managing Application Performance
Traditional Approach: Server, Network, DB
[Diagram: the application delivery chain. Customers, Employees (via WAN), Employees; Browsers, Devices; Local ISP, Mobile Carrier, Major ISP; Content Delivery Networks, 3rd Party/Cloud Services; Data Center, Virtual/physical environment, Web Services, Mobile Components, Virtual Desktops, Web Servers, App Servers, DB Servers, Load Balancers, Mainframe, Storage, WAN Optimization, Network]
The Application Delivery Chain
…user is NOT happy
The Challenge of Managing Application Performance
• Network peering problems
• Outages
• Inconsistent geo performance
• Bad performance under load
• Blocking content delivery
• Incorrect geo-targeted content
• Configuration issues
• Oversubscribed POP
• Poor routing optimization
• Low cache hit rate
• Network peering problems
• Bandwidth throttling
• Inconsistent connectivity
• Configuration errors
• Application design issues
• Code defects
• Insufficient infrastructure
• Poorly performing JavaScript
• Browser/device incompatibility
• Page size too big
• Too many objects
• Low cache hit rate
• Network resource shortage
• Faulty content transcoding
• SMS routing / latency issues
The Application Delivery Chain
Simplification of the Problem is Key
The Application Delivery Chain
Test Across the Entire Web Application Delivery Chain
Load Testing 1.0
Load Testing 1.5
Load Testing 2.0
[Diagram: Users; Browsers and devices; Local ISP, Mobile Carrier, Major ISP; Internet; Content Delivery Networks; 3rd Party/Cloud Services; Storage, Web Servers, App Servers, DB Servers, Mainframe, Load Balancers, Mobile Components, Network]
Company: Online presence for a popular TV show
• Following episodes of the TV show the web site sees high traffic spikes
• Goal was to achieve 1500 logins per minute
• Load tested DB to improve performance in anticipation of another traffic spike
Load Testing 1.0 Works For Specific Situations
• As users were added, the response time of step 3 (the login) climbed immediately
• The test bottlenecked at 160 logins per minute (Goal 1500)
• But quickly dropped off as users received server errors
• New login query was not optimized and was bottlenecking the database servers’ CPUs
Application Bottleneck Causes Response Time Issue
Summary:
• Problem found inside firewall
• Fixes made for application issue
• Retest shows second issue: bandwidth
[Charts: First test; Second test]
• After tuning, application performance improved.
• New bottleneck occurred at 1300 logins per minute.
• Bandwidth limit reached at 90 Mbps
Application Bottleneck – Re-test
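The re-test figures allow a rough estimate of the bandwidth the original goal would need. The sketch below assumes that bandwidth use scales linearly with the login rate and that the 90 Mbps link was saturated by the test traffic alone; both are simplifying assumptions, not claims from the talk, and all names are illustrative.

```python
# Figures from the case study: bandwidth limit reached at 90 Mbps while
# sustaining 1300 logins per minute, against a goal of 1500 per minute.
observed_bandwidth_mbps = 90.0
observed_logins_per_minute = 1300
target_logins_per_minute = 1500

# Assumption: bandwidth grows linearly with the login rate.
mbps_per_login_per_minute = observed_bandwidth_mbps / observed_logins_per_minute
required_mbps = mbps_per_login_per_minute * target_logins_per_minute

# Average data moved per login: Mbit/s * 60 s/min / logins per minute.
megabits_per_login = observed_bandwidth_mbps * 60 / observed_logins_per_minute

print(f"Estimated bandwidth for {target_logins_per_minute} logins/min: "
      f"{required_mbps:.1f} Mbps")
print(f"Average transfer per login: {megabits_per_login:.2f} Mbit")
```

Under these assumptions the goal needs roughly 15% more bandwidth than the link provided, which is consistent with the second test stalling short of 1500 logins per minute.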
1.0 1.5 2.0
Company: Online Gaming Site
Testing a new rollout in support of a new sports season
• Support anticipated traffic increases
• Load tested system using cloud and Last Mile to validate performance for real users in new geographies.
Load Testing from the Cloud misses end-user perspective
View from the Cloud
• For the first 20 minutes, Cloud testing shows acceptable performance
• After 2500 users, response time climbs, availability drops, and error rate climbs
Summary:
Cloud-only testing may give misleading availability data
Cloud starts with 100% availability
Less than 25% for the Last Mile
View from the Last Mile
• Last Mile shows a different story
• Availability is terrible even at minimal load for real users
1.0 1.5 2.0
Company: Regional Online News Source
• Began testing for the election season
• Goal was to validate overall performance, focusing on 2 key regions
Load testing 1.0 and 1.5 miss regional issues
1.0 or 1.5 load testing shows tests passed
Page response times stayed under 4 seconds, outside of one brief blip
There was only 1 page error and 11 errors total out of 60000+ transactions
Goal: increase and hold load without exceeding a response time of 4 seconds or falling below a Success Rate of 99%
No Performance Issues Detected From Data Center
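The pass criteria for this test (response time of at most 4 seconds, success rate of at least 99%) can be written as a small check. This is a minimal sketch: the error and transaction counts come from the slide (using 60,000 as a lower bound for "60000+"), while the 3.5-second average response time is a hypothetical placeholder, since the slides only say response times stayed under 4 seconds. The class and function names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class LoadTestSummary:
    transactions: int
    errors: int
    avg_response_time_s: float

    @property
    def success_rate_pct(self) -> float:
        """Share of transactions that completed without error."""
        return 100.0 * (self.transactions - self.errors) / self.transactions


def meets_goal(summary, max_response_s=4.0, min_success_pct=99.0):
    """True when both the response-time and success-rate goals are met."""
    return (summary.avg_response_time_s <= max_response_s
            and summary.success_rate_pct >= min_success_pct)


# 11 errors out of 60000+ transactions, as reported from the data center.
# avg_response_time_s=3.5 is hypothetical, not a figure from the talk.
data_center_run = LoadTestSummary(transactions=60000, errors=11,
                                  avg_response_time_s=3.5)

print(f"Success rate: {data_center_run.success_rate_pct:.2f}%")
print(f"Goal met: {meets_goal(data_center_run)}")
```

The same check applied to measurements taken from end-user locations can give a different verdict, which is the point of the case study.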
Summary:
Last Mile shows goal not reached
Cloud can’t detect the end user issue
Last Mile Case Study: Primary Geographies
Key geographies for this customer are New York and Pennsylvania.
The response time never met the 4 second average goal
Availability was less than 99%
1.0 1.5 2.0
Company: International Hotel chain
• New reservations system rollout
• New global server load balancing rolled out across multiple data centers
• Validate that system works globally
The Internet is global – where your customers are matters
Major Hotel Reservation System unavailable in 4 countries
0% availability in UK, Germany, Japan
99%+ availability in US, Canada, France
Summary:
Internal U.S. test looked good
Distributed testing fails in key locations.
1.0 1.5 2.0
?
Company: eRetailer fashion
• 100% virtual store
• Daily sales spike driving 90% of revenue stream
Load Testing 2.0 shows you what your customer will see
Load Testing with multiple browsers shows discrepancies
Availability vastly different between browsers
Comparison of Performance across the country - Firefox
Firefox browser: shows 100% availability for the website
Wide variations in response time based on geography
Comparison of Performance across the country – IE
IE browser: shows under 12 percent availability
Availability and performance tied to geography
Page Element Downloads: IE Versus Firefox, Order Varies
Summary:
Only full browsers in target locations can show what really happens.
1.0 1.5 2.0
Root Cause:
• Third party ad provider modifying the DOM
• Depending on the load order of the third party, the JavaScript in the ad would overwrite the DOM, but only on IE
Load Testing Approaches: Which one is best for you?
Criterion | HTTP: Behind the Firewall | HTTP: Data Centers | Browser: Data Centers | Real World Desktops (Last Mile)
Accuracy of End-User Response Time | Incomplete | Incomplete | Indicative | Most Accurate
Accuracy of Application Availability | Invalid | Indicative | Indicative | Most Accurate
Ability to drive large load volume | Yes (requires substantial hardware) | Best | Better | Good
Understand CDN Impact | No | Misleading | Misleading | Most Accurate
Understand 3rd Party (ads, feeds, etc.) | No | Minimal | Some | Most Accurate
Realistic object download | No | No (Static Only) | Yes | Yes
Visibility behind the firewall | Best | Good | Good | Good
Test type labels: Traditional Client/Server Test; Datacenter Testing
Generation labels: Load Test 1.0; Load Test 1.5; Load Test 2.0
So what? Now what?
Broaden the definition of your web application
Test early, test often
Know your end-users, and test from their perspective
Pay attention to all 4 buckets: Data Center, Internet, 3rd Parties, Client
Look for the breaking point of the end-user experience, not just the breaking point of the application infrastructure
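The last recommendation can be sketched as a search for the lowest load level at which the end-user experience violates its goals. The thresholds below reuse the 4-second and 99% goals from the news-site case study; the measurement tuples are hypothetical values invented for illustration, not data from the talk, and the function name is an assumption.

```python
def end_user_breaking_point(samples, max_response_s=4.0,
                            min_availability_pct=99.0):
    """Return the lowest load level that violates either goal, or None.

    Each sample is (concurrent_users, response_time_s, availability_pct).
    """
    for users, response_s, availability_pct in sorted(samples):
        if (response_s > max_response_s
                or availability_pct < min_availability_pct):
            return users
    return None


# Hypothetical measurements of the same test seen from two vantage points.
data_center_view = [(500, 1.2, 100.0), (1000, 1.5, 100.0),
                    (2500, 2.1, 99.8), (3000, 5.0, 97.0)]
last_mile_view = [(500, 2.8, 99.6), (1000, 4.6, 98.2),
                  (2500, 7.9, 91.0), (3000, 12.0, 80.0)]

print("Data center breaking point:", end_user_breaking_point(data_center_view))
print("Last Mile breaking point:", end_user_breaking_point(last_mile_view))
```

In this made-up example the infrastructure view holds up far longer than the end-user view, so sizing capacity from the data-center number alone would overstate what real users can tolerate.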
See you at CMG'11
Dec 5th - 9th, 2011
Washington, DC