practical relevance measurement

Posted on 18-Aug-2015

Confidential Material – Chegg Inc. © 2005 - 2015. All Rights Reserved.

Practical Relevance Measurement

Walter Underwood
Principal Software Engineer

Data-Driven Improvement

• Measure
• Explain
• Diagnose
• Fix
• Repeat

Measure: What is a good metric?

• Actionable
• Common interpretation
• Accessible, credible data
• Transparent, simple calculation

Source: Juice Analytics, http://www.juiceanalytics.com/writing/choosing-right-metric

Measure: What is Success?

• Click-through rate (CTR)
  • Great for navigational search (one correct answer)
  • Great for simple search UI
• Success rate
  • With multiple clicks, identify the result that satisfied
  • Dwell time, conversion, …
  • Needed for informational search
  • Useful for complex UI: facets, filters, …

Explain: Overall Success

• Use these measures to explain search success to others
• 0.50 CTR or success rate is pretty good
• A one percentage point improvement is very good
• Do not over-promise
• Report CTR daily and graph it

Diagnose

• What are the problem queries?
• Per-query CTR reports

Query       CTR    Frequency
gorilla     0.45   10,000
orangutan   0.10   100

Diagnose: Click Residual

• What should the queries’ CTR be?
• How many clicks would that be?
• How many times were people less happy than expected?

Query       CTR    Frequency   Actual   Expected   Residual
TOTAL       0.50   100,000     50,000
gorilla     0.45   10,000      4,500    5,000      -500
orangutan   0.10   100         10       50         -40
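
The residual columns can be reproduced with a few lines of Python. This is a minimal sketch, assuming (as the table's numbers suggest) that the expected click count is the overall CTR times the query's frequency, and that the residual is actual clicks minus expected clicks. The names `QueryStats` and `click_residual` are illustrative and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class QueryStats:
    query: str
    frequency: int  # number of times the query was issued
    clicks: int     # number of those attempts that got a click


def click_residual(stats: QueryStats, overall_ctr: float) -> float:
    """Actual clicks minus the clicks expected at the overall CTR."""
    expected = stats.frequency * overall_ctr
    return stats.clicks - expected


# Figures taken from the table above.
OVERALL_CTR = 0.50
rows = [
    QueryStats("gorilla", 10_000, 4_500),
    QueryStats("orangutan", 100, 10),
]

residuals = {row.query: click_residual(row, OVERALL_CTR) for row in rows}

# Most negative residual first: these queries lost the most clicks.
worst_first = sorted(residuals, key=residuals.get)
```

Sorting by residual rather than by CTR puts gorilla ahead of orangutan: its CTR is only slightly below average, but its volume means it accounts for far more missing clicks.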

Fix: What is the cause?

• Customer uses different word than content
  • The Vocabulary Problem (Furnas et al., 1987)
  • Add synonyms: coat, jacket, parka, anorak, …
• Content doesn’t exist
  • We don’t sell that: add “no hits” page with recommendations
  • Add to site/inventory
• Misspellings (about 10% of queries)
  • Fuzzy search
  • Query suggestions
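
As one illustration of the query-suggestion idea for misspellings, the sketch below uses Python's standard `difflib` to propose close matches from a list of known terms. The slides do not say which fuzzy-matching technique was used, so `difflib` is a stand-in; the term list, the `suggest` function, and the 0.8 cutoff are all hypothetical.

```python
import difflib

# Hypothetical vocabulary: terms known to appear in the indexed content.
KNOWN_TERMS = ["gorilla", "orangutan", "chimpanzee", "jacket", "parka"]


def suggest(query, known_terms=KNOWN_TERMS, max_suggestions=3, cutoff=0.8):
    """Return close matches for a query that is not itself a known term."""
    normalized = query.strip().lower()
    if normalized in known_terms:
        return []  # spelled correctly, nothing to suggest
    return difflib.get_close_matches(
        normalized, known_terms, n=max_suggestions, cutoff=cutoff
    )


suggestions = suggest("gorila")
```

A production search engine would more likely rely on its own fuzzy-query or spell-check features; this only shows the shape of the lookup.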

Implementing Metrics

• Count attempts (queries)
• Count clicks or successes
• Associate clicks with queries
• Handle anomalies
• Hash size

Implementing: Count Attempts

• Log the query
• Log a hash or unique ID for this attempt
  • Will be used to link clicks to attempts
  • Random bits generated in server (session log)
  • Random bits generated in browser (URL param)
  • Hash calculated from user ID, query, and time
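
The last option, a hash calculated from user ID, query, and time, might look like the sketch below. The slides do not name a hash function or a log format, so SHA-256, the truncation to 6 bytes, the in-memory `attempts_log`, and the function names are all assumptions made for illustration.

```python
import base64
import hashlib
import time

# Stand-in for a real attempts log (file, table, or event stream).
attempts_log = []


def attempt_token(user_id, query, timestamp):
    """Derive an 8-character URL-safe token from user ID, query, and time."""
    material = f"{user_id}|{query}|{timestamp}".encode("utf-8")
    digest = hashlib.sha256(material).digest()
    # Assumption: keep 6 bytes (48 bits), which Base64-encode to 8 characters.
    return base64.urlsafe_b64encode(digest[:6]).decode("ascii")


def log_attempt(user_id, query, timestamp=None):
    """Record one search attempt and return its tracking token."""
    if timestamp is None:
        timestamp = time.time()
    token = attempt_token(user_id, query, timestamp)
    attempts_log.append({"tk": token, "query": query, "time": timestamp})
    return token


token = log_attempt("user-42", "gorilla", timestamp=1439856000.0)
```

Because the token is logged next to the query, later click records only need to carry the token to be joined back to the attempt.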

Implementing: Count Clicks

• Search results page will have decorated URLs on results
  • Tracking hash
  • Rank (1-based)
  • Query is optional, you can join with the attempts log
• Result links look like:
  /product-1234?tk=ABCD&r=1
  /product-5678?tk=ABCD&r=2
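
Decorating the result links can be sketched as follows. The parameter names `tk` and `r` come from the example links above; the function name is hypothetical, and the sketch assumes the result paths carry no query string of their own.

```python
from urllib.parse import urlencode


def decorate_result_urls(result_paths, token):
    """Append the tracking token and 1-based rank to each result link."""
    return [
        f"{path}?{urlencode({'tk': token, 'r': rank})}"
        for rank, path in enumerate(result_paths, start=1)
    ]


links = decorate_result_urls(["/product-1234", "/product-5678"], "ABCD")
```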

Implementing: Anomalies

• Multiple clicks from one search results page (SRP)
  • Opening results in new tabs
  • Cached SRP
  • Solution: Only count one, maybe the last one
• Clicks that don’t match a query attempt
  • Bookmarked results
  • Clicks from a page loaded a previous day
  • Solution: Ignore them
• Bots hitting SRP
  • Remove from stats
  • JavaScript-generated token helps filter bots
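
The first two rules (count one click per attempt, ignore clicks with no matching attempt) can be sketched as a join between the attempts log and the click log. The record shapes and function names here are assumptions, and bot filtering is left out because the slides do not describe how it is done.

```python
def last_click_per_attempt(attempt_tokens, clicks):
    """Map each known token to its last click as (rank, time).

    attempt_tokens: iterable of tracking tokens from the attempts log.
    clicks: iterable of (token, rank, time) tuples from the click log.
    """
    known = set(attempt_tokens)
    last_click = {}
    for token, rank, when in clicks:
        if token not in known:
            continue  # bookmarked result or stale page: ignore
        previous = last_click.get(token)
        if previous is None or when >= previous[1]:
            last_click[token] = (rank, when)
    return last_click


def click_through_rate(attempt_tokens, clicks):
    """Fraction of attempts with at least one matching click."""
    tokens = set(attempt_tokens)
    if not tokens:
        return 0.0
    return len(last_click_per_attempt(tokens, clicks)) / len(tokens)


attempts = ["A", "B", "C", "D"]
clicks = [("A", 1, 10), ("A", 3, 20), ("B", 2, 15), ("Z", 1, 5)]
counted = last_click_per_attempt(attempts, clicks)
ctr = click_through_rate(attempts, clicks)
```

Attempt "A" has two clicks but counts once (keeping the later one), and the click for unknown token "Z" is dropped.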

Implementing: Hash Size

• Birthday Paradox: you’ll need more bits than you think
• Chegg uses:
  • 48 random bits
  • Encoded in URL-safe Base64
  • Eight characters
  • Looks like: tk=NmAwoq5e
• 1% chance of collision with 2.4 million searches
• See “Birthday Problem” on Wikipedia
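
The token format and the collision figure can both be checked with a short sketch. It assumes the standard birthday-problem approximation p ≈ 1 − exp(−n(n−1) / (2·2^bits)); using Python's `secrets` module as the source of random bits is an illustrative choice, not something the slides specify.

```python
import base64
import math
import secrets

TOKEN_BITS = 48


def new_token():
    """48 random bits, URL-safe Base64: 6 bytes encode to 8 characters."""
    raw = secrets.token_bytes(TOKEN_BITS // 8)
    return base64.urlsafe_b64encode(raw).decode("ascii")


def collision_probability(n, bits=TOKEN_BITS):
    """Approximate chance that at least two of n random tokens collide."""
    # Birthday approximation: 1 - exp(-n(n-1) / (2 * 2^bits)).
    return -math.expm1(-n * (n - 1) / (2 * 2**bits))


token = new_token()
p = collision_probability(2_400_000)
```

With 2.4 million tokens of 48 bits each, the approximation gives a collision probability of about 0.01, in line with the 1% figure above.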

Questions?

Search Analytics for Your Site, by Louis Rosenfeld
http://rosenfeldmedia.com/books/search-analytics-for-your-site/

Thank you!
wunder@wunderwood.org
