Shopzilla - Performance By Design

Upload: tim-morrow

Post on 22-Apr-2015

Page 1: Shopzilla - Performance By Design

Performance By Design

A look at Shopzilla's path to high performance

Tim Morrow, Senior Architect

11/3/2009

Page 2: Shopzilla - Performance By Design

Performance By Design | 11/3/2009

Agenda

• Our Company and what we do

• Design approach

• Detailed site architecture

• Architecture evolution

• Front end performance techniques

Page 3: Shopzilla - Performance By Design

Shopzilla, Inc. - Online Shopping Network

100M impressions/day

20-29M unique visitors (UVs) per month

8,000+ searches per second

100M+ products

Page 4: Shopzilla - Performance By Design

Background

• In 2000, the business was simple - bizrate.com in the US

• Co-brands, API and shopzilla.com

• Multivariate testing, Europe and multiple data centers

Page 5: Shopzilla - Performance By Design

Growth

Page 6: Shopzilla - Performance By Design

Original Architecture

Page 7: Shopzilla - Performance By Design

Performance and Scalability Issues

• High latency due to sequential resource access

• Data fetching sometimes O(n)

• Lengthy Time To First Byte

• Memory constrained

• Lacked visibility into performance issues

Page 8: Shopzilla - Performance By Design

Development Issues

• Serving 9+ web site experiences

• Different look-and-feel; localization

• New features carry high risk

• Shared code-base limited team autonomy

Page 9: Shopzilla - Performance By Design

Our Build Approach

• Start over, stay simple

• Build 2 weeks at a time, deliver every page ASAP

• Manage risk by managing exposure

Page 10: Shopzilla - Performance By Design

New Design Principles

• Simplify layers

• Decompose architecture

• Define SLAs

• Continuous performance testing

• Utilize caching

• Apply best-practice UI performance techniques

Page 11: Shopzilla - Performance By Design

New Site Architecture

Page 12: Shopzilla - Performance By Design

Site Technologies

• Java 1.6, Tomcat 6

• Spring MVC

• TAL-like templating engine

• Services are JAX-RS implemented with Apache CXF

• Hibernate 3 with Ehcache

• Oracle 10g Database

• Oracle Coherence Data Grid

Page 13: Shopzilla - Performance By Design

Performance SLAs

Page 14: Shopzilla - Performance By Design

Web Application Tier

Service Calls

Page 15: Shopzilla - Performance By Design

Web Application Tier

Page 16: Shopzilla - Performance By Design

www.bizrate.com/digital-cameras

Page 17: Shopzilla - Performance By Design

Decomposed Services

http://…/service/v3/reviews?pid=419943686&show=30&sort=newestFirst

<ProdRevResponse>
  <ProductReviews>
    <ProductReview>
      <UserRating>8.0</UserRating>
      <Title>advanced and begining photographer alike will benefit</Title>
      ...
    </ProductReview>
  </ProductReviews>
</ProdRevResponse>
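A minimal Python sketch of consuming a response shaped like the one above. The slides use JAXB in Java for this step; the `ProductReview` dataclass, the `parse_reviews` helper and the completed sample document are assumptions made for illustration, not Shopzilla's code.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical complete document modelled on the fragment in the slide.
SAMPLE = """
<ProdRevResponse>
  <ProductReviews>
    <ProductReview>
      <UserRating>8.0</UserRating>
      <Title>advanced and begining photographer alike will benefit</Title>
    </ProductReview>
  </ProductReviews>
</ProdRevResponse>
"""


@dataclass
class ProductReview:
    user_rating: float
    title: str


def parse_reviews(xml_text):
    """Turn each <ProductReview> element into a typed Python object."""
    root = ET.fromstring(xml_text.strip())
    reviews = []
    for node in root.iter("ProductReview"):
        reviews.append(
            ProductReview(
                user_rating=float(node.findtext("UserRating", default="0")),
                title=node.findtext("Title", default=""),
            )
        )
    return reviews
```

Calling `parse_reviews(SAMPLE)` yields a one-element list whose `user_rating` is the float `8.0`.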

Page 18: Shopzilla - Performance By Design

Service Invocation

• Connection pooling

• Stale connection checking

• Hardware load balancers

• Connection and Socket Timeouts

• O(1) invocations on a single page

• JAXB XML->Java unmarshaling
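One way to picture the timeout and "O(1) invocations" bullets, sketched in Python rather than the Java stack the slides describe. The deck lists these concerns but shows no implementation, so the `fetch_page_data` helper, the two stand-in services and the use of a thread pool are all assumptions. Real connection and socket timeouts are configured on the HTTP client; here a bounded wait on a future stands in for them.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def fetch_page_data(services, timeout_s):
    """Call every service once, concurrently, waiting at most timeout_s for each result."""
    results = {}
    pool = ThreadPoolExecutor(max_workers=len(services))
    try:
        # One submission per service: the number of calls does not grow
        # with the number of items shown on the page.
        futures = {name: pool.submit(fn) for name, fn in services.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=timeout_s)
            except FutureTimeout:
                results[name] = None  # render the page without this component
    finally:
        # Do not block the page on a slow call; its thread finishes in the background.
        pool.shutdown(wait=False)
    return results


def fast_reviews():
    """Hypothetical service that answers immediately."""
    return ["review-1", "review-2"]


def slow_search():
    """Hypothetical service that misses its time budget."""
    time.sleep(0.5)
    return ["late-result"]
```

With `fetch_page_data({"reviews": fast_reviews, "search": slow_search}, timeout_s=0.1)`, the reviews come back and the search entry is `None`.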

Page 19: Shopzilla - Performance By Design

How Do We Performance Test?

• Highly Concurrent requests

• Dozens of services

• Database access

• Search Engine invocations

• Meta-data lookups

Page 20: Shopzilla - Performance By Design

Performance Testing Environment

Page 21: Shopzilla - Performance By Design

Analyzing Performance

• Emit server-side performance information

10.61.35.25 198.133.178.17 - - [15/Jun/2009:18:48:13 -0700] "GET /digital-cameras/ HTTP/1.1" 200 195063 - - www.bizrate.com unique_id=c52efc5b-740b-44d4-8693-587b6b756564!rst=22

start=1245116893247;elapsed=19;requestId=c52efc5b-740b-44d4-8693-587b6b756564;startDate=2009-06-15 18:48:13.247-0700;url=http:///services/content/v7/topsearchService/BR/US/402/0/21;

start=1245116893248;elapsed=24;requestId=c52efc5b-740b-44d4-8693-587b6b756564;startDate=2009-06-15 18:48:13.248-0700;url=http:///search/v5/US/12/product_search?keyword=digital+cameras

• Build server-side call graphs
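A rough Python sketch of a first step toward such a call graph: parse the `key=value;` timing records shown above and group them by `requestId`. The field names come from the slide; the helper names are assumptions, and the slides do not show how Shopzilla's tooling did this.

```python
from collections import defaultdict

# Two timing records copied from the slide; the host is elided there as well.
SAMPLE_LINES = [
    "start=1245116893247;elapsed=19;requestId=c52efc5b-740b-44d4-8693-587b6b756564;"
    "startDate=2009-06-15 18:48:13.247-0700;"
    "url=http:///services/content/v7/topsearchService/BR/US/402/0/21;",
    "start=1245116893248;elapsed=24;requestId=c52efc5b-740b-44d4-8693-587b6b756564;"
    "startDate=2009-06-15 18:48:13.248-0700;"
    "url=http:///search/v5/US/12/product_search?keyword=digital+cameras",
]


def parse_timing_line(line):
    """Split one record into a dict; only the first '=' separates key and value."""
    fields = {}
    for part in line.strip().split(";"):
        if not part:
            continue  # tolerate a trailing ';'
        key, _, value = part.partition("=")
        fields[key] = value
    return fields


def calls_by_request(lines):
    """Group service calls by request id, ordered by start time."""
    grouped = defaultdict(list)
    for line in lines:
        fields = parse_timing_line(line)
        grouped[fields["requestId"]].append(
            {
                "start": int(fields["start"]),
                "elapsed": int(fields["elapsed"]),
                "url": fields["url"],
            }
        )
    for calls in grouped.values():
        calls.sort(key=lambda call: call["start"])
    return dict(grouped)
```

Each request id then maps to its ordered list of service calls, which can be compared against the `rst` value in the access log line for the same `unique_id`.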

Page 22: Shopzilla - Performance By Design

Profiling

• YourKit Java Profiler

• Identify synchronization bottlenecks

Page 23: Shopzilla - Performance By Design

Production Performance Monitoring

• JMX MBeans

• Graphite graphing

Page 24: Shopzilla - Performance By Design

Caching

• Started with a simple replicated local cache

• Cached data stored in the service process

• Cannot scale to large data sets

• Read-through caching not always suitable

• Moved to Oracle Coherence

Page 25: Shopzilla - Performance By Design

Oracle Coherence

• Distributed data grid

• Dynamic cluster membership

• Automatic data partitioning

• Continuous availability

• Computation in the grid

Page 26: Shopzilla - Performance By Design

Case Study #1 – URL Builder

• Rule set describing the optimal URL structures

• High frequency access from site

• Backoffice system to compute rules

Page 27: Shopzilla - Performance By Design

Coherence Solution

Page 28: Shopzilla - Performance By Design

Results

• Continue to meet performance SLAs

• In production for over 6 months

• Very successful project

Page 29: Shopzilla - Performance By Design

Case Study #2 - Keyword Metadata

• Map of long IDs to object model describing landing page

• Over 600 million entries

• Entry point for the majority of our traffic

• Home-grown partitioned, distributed cache

Page 30: Shopzilla - Performance By Design

Challenges

Time-consuming and intricate

Restarts required

Page 31: Shopzilla - Performance By Design

Solution

Automatic read-through

Automatic Expiration

Scalable

Page 32: Shopzilla - Performance By Design

Results

• Faster acquisition of new paid placements

• No restarts

• Less software to maintain

• Great performance

Page 33: Shopzilla - Performance By Design

Cache Architecture

• 6 physical instances

• 8-way, 32 GB RAM

• 16 JVMs with 1.5 GB heap

• Distributed Cache

• Database read-through

• LRU expiry based on object count
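The last two bullets, read-through loading and LRU expiry by object count, can be illustrated with a small single-process Python sketch. This is not Coherence and not distributed; `ReadThroughLruCache` and its `loader` callback (standing in for the database lookup) are hypothetical names used only to show the behaviour.

```python
from collections import OrderedDict


class ReadThroughLruCache:
    """On a miss, load from the backing store; evict the least recently used entry when full."""

    def __init__(self, loader, max_entries):
        self._loader = loader            # stand-in for a database read
        self._max_entries = max_entries  # limit is an object count, not bytes
        self._entries = OrderedDict()

    def get(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        value = self._loader(key)           # read-through on a miss
        self._entries[key] = value
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # drop the least recently used entry
        return value
```

Callers only ever ask the cache; they never talk to the database directly, which is what makes population automatic.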

Page 34: Shopzilla - Performance By Design

Performance Testing Coherence

• Performance test application against the grid

• Test scenarios such as server loss and data population

Page 35: Shopzilla - Performance By Design

UI Performance Techniques

• 80% of time spent rendering the page

• Yahoo currently lists 34 best practices

Page 36: Shopzilla - Performance By Design

Minimize HTTP Requests

• Combined files

• CSS Sprites http://spriteme.org/

Page 37: Shopzilla - Performance By Design

Use a CDN

• Move your content closer to end users

• Reduce latency

• Every resource except for dynamic HTML

• Offloads 100s of gigabytes per day

Page 38: Shopzilla - Performance By Design

Expiry, Compression and Minification

• Expiry headers instruct the Browser to use a cached copy

• An expiry of more than 2 days is considered “far future”

• Use versioning techniques to allow forced upgrades

• Compressing reduces page weight

• Minifying may still reduce size by 5% even with compression
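A hedged Python sketch of the expiry and versioning bullets. The one-year lifetime, the content-hash naming scheme and both helper names are assumptions for illustration; the slides do not say which versioning technique Shopzilla used.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def versioned_url(path, content):
    """Embed a short content hash in the file name so changed content gets a new URL."""
    digest = hashlib.sha1(content).hexdigest()[:8]
    stem, dot, ext = path.rpartition(".")
    return f"{stem}.{digest}.{ext}" if dot else f"{path}.{digest}"


def far_future_headers(now_utc, days=365):
    """Build response headers that let browsers reuse a cached copy for `days` days."""
    expires = now_utc + timedelta(days=days)
    return {
        "Cache-Control": f"public, max-age={days * 86400}",
        # usegmt=True requires a timezone-aware UTC datetime.
        "Expires": format_datetime(expires, usegmt=True),
    }
```

Because the URL changes whenever the file content changes, a long expiry never pins users to a stale stylesheet or script: publishing a new version simply references a new URL.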

Page 39: Shopzilla - Performance By Design

Reduce DNS lookups

• Yahoo recommends 3 – 4 DNS lookups per page

• Base page: www.bizrate.com

• JavaScript & CSS: file01.bizrate-images.com

• Static images: img01.bizrate-images.com

• Dynamic images: image01.bizrate-images.com

• 3rd party Ads are a different story

Page 40: Shopzilla - Performance By Design

Avoid Redirects

• Redirects delay your ability to serve content

• We strive for zero redirects

• Exceptions:

– Redirect after POST

– Handling legacy-styled URLs

– Links off-site for tracking purposes

Page 41: Shopzilla - Performance By Design

Use a Cookie-free domain

• Don’t send cookies when requesting static resources

• Buy a separate domain name

– bizrate-images.com

• Saves many KB of upload bandwidth

• Revenue increased by 0.8%

Page 42: Shopzilla - Performance By Design

Do Not Scale Images in HTML

• Don’t request larger images only to shrink them

• We utilize a dynamic image scaling server

• CDN caches and delivers exact image size

Page 43: Shopzilla - Performance By Design

Make favicon.ico small and cacheable

• Can interfere with download sequence

• 2 KB+ multi-layered version:

• 318 byte version:

• We save 100s of megabytes per day

Page 44: Shopzilla - Performance By Design

Flush the Buffer Early

• Delivers content to users sooner

• By default Tomcat flushes every 8 KB of uncompressed content

• Investigate more proactive flushing
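The idea of flushing early, sketched as a Python WSGI application instead of the Tomcat setup the slides refer to. Everything here is an assumed analogue: a WSGI server transmits each yielded block as it is produced, so yielding the `<head>` before the slow body work lets the browser start fetching CSS and scripts sooner. `render_page_chunks`, `build_body` and `app` are hypothetical names.

```python
def render_page_chunks(head_html, build_body):
    """Yield the <head> first, then the body once it has been built."""
    # The head is sent before any expensive work for the body starts.
    yield head_html.encode("utf-8")
    # build_body is only invoked when the server asks for the next chunk.
    yield build_body().encode("utf-8")


def build_body():
    """Stand-in for the slow part of the page (service calls, templating)."""
    return "<body>search results</body></html>"


def app(environ, start_response):
    """Minimal WSGI entry point that streams the page in two chunks."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    head = "<html><head><link rel='stylesheet' href='/site.css'></head>"
    return render_page_chunks(head, build_body)
```

In a servlet container the comparable step would be an explicit flush of the response buffer after the head has been written.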

Page 45: Shopzilla - Performance By Design

Web Performance Measurement

• Continuous monitoring of full page load performance

Page 46: Shopzilla - Performance By Design

Can you spot our release dates?

shopzilla.com 10/13/08

bizrate.com 10/17/08

Page 47: Shopzilla - Performance By Design

Yeah, yeah… but did we make money?

Page 48: Shopzilla - Performance By Design

Site conversion rates increased 7-12%

Revenue = sessions x conversion % x CPC
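A worked example of the formula in Python. The session count, baseline conversion rate and CPC below are made-up round numbers, not Shopzilla figures; only the 7% and 12% lifts come from the slide. With sessions and CPC held constant, revenue moves in direct proportion to the conversion rate.

```python
def revenue(sessions, conversion_rate, cpc):
    """Revenue = sessions x conversion rate x cost per click."""
    return sessions * conversion_rate * cpc


# Hypothetical baseline: 1,000,000 sessions, 10% conversion, $0.50 CPC.
SESSIONS = 1_000_000
BASE_RATE = 0.10
CPC = 0.50

baseline = revenue(SESSIONS, BASE_RATE, CPC)
low_lift = revenue(SESSIONS, BASE_RATE * 1.07, CPC)   # conversion up 7%
high_lift = revenue(SESSIONS, BASE_RATE * 1.12, CPC)  # conversion up 12%
```

Under these assumed inputs the baseline is about $50,000, rising to about $53,500 and $56,000: the same 7% and 12% as the conversion lift.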

Page 49: Shopzilla - Performance By Design

Performance Impacts Abandonment

Page 50: Shopzilla - Performance By Design

Performance Penalties & bizrate.co.uk

~120% SEM Sessions

Page 51: Shopzilla - Performance By Design

Performance Summary

• Conversion Rate: +7% to +12%

• Page Views: +25%

• US SEM Sessions: +8%

• Bizrate.co.uk SEM Sessions: +120%

• Infrastructure Required (US): -50% (200 vs 402 nodes)

• Availability: from 99.71% to 99.94%

• Product Velocity: +225%

• Release Cost: from $1,000s to $80

Page 52: Shopzilla - Performance By Design

Is Performance Worth The Expense?

YES!

Page 53: Shopzilla - Performance By Design

Simplicity, quality, performance are design decisions

Page 54: Shopzilla - Performance By Design

Questions?

Page 55: Shopzilla - Performance By Design

Thank You!

Blog: tech.shopzilla.com

Email: [email protected]

Jobs: jobs.shopzilla.com

Page 56: Shopzilla - Performance By Design

References

• http://developer.yahoo.com/performance/

• http://www.oracle.com/technology/products/coherence/index.html

• http://www.evidentsoftware.com/products/clearstone.aspx

• http://www.yourkit.com/

• http://spriteme.org/

• http://www.keynote.com/

• http://jakarta.apache.org/jmeter/

• http://cxf.apache.org/

• http://graphite.wikidot.com/
