A/B Testing Pitfalls - Working at Booking.com
TRANSCRIPT
WHO'S THIS DUDE?
Alejandro Pardo Lopez
Client Side Developer && Team Leader @ Booking.com
@apardolopez
WHAT IS A/B TESTING?
Test different versions of a website or feature by randomly assigning your users into two or more groups, each one exposed to a different variant of the website/feature, and comparing the impact of the variants against each other.
WHY A/B TESTING?
The ability to detect and measure the real impact of our changes on UX or performance.
USUAL SUSPECTS
- Conversion: successful sign-ups / purchases / clicks on a CTA per visitor
- Bounce rates
- Time spent
- Other user metrics (e.g. navigation times)
OTHER USEFUL METRICS
- Front-end performance (page load times, navigation times)
- Backend performance (CPU wallclock, SQL wallclock)
- Errors
- External impact (e.g. number of Customer Care tickets)
MULTIVARIANT TESTS (I.E. MORE THAN 2 VARIANTS)
- Test multiple variations of the same feature
- Compare each variant against the others
- The more variants, the bigger your user base needs to be to detect a change
- Use a power calculator to determine how many users you need to detect a certain amount of impact, e.g. http://www.evanmiller.org/ab-testing/sample-size.html
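The arithmetic behind such calculators can be sketched in a few lines. This is a minimal sketch of the standard two-proportion sample size estimate (not necessarily the exact formula the linked calculator uses), assuming a 5% two-sided significance level and 80% power:

```javascript
// Approximate sample size per variant needed to detect a change
// from baseline rate p1 to target rate p2 (two-sided test).
// zAlpha = 1.96 (alpha = 0.05), zBeta = 0.84 (80% power).
function sampleSizePerVariant(p1, p2, zAlpha, zBeta) {
    zAlpha = zAlpha || 1.96;
    zBeta = zBeta || 0.84;
    var pBar = (p1 + p2) / 2;
    var a = zAlpha * Math.sqrt(2 * pBar * (1 - pBar));
    var b = zBeta * Math.sqrt(p1 * (1 - p1) + p2 * (1 - p2));
    return Math.ceil(Math.pow(a + b, 2) / Math.pow(p1 - p2, 2));
}

// e.g. detecting a lift from 10% to 12% conversion:
var n = sampleSizePerVariant(0.10, 0.12);
```

Note how the required sample size per variant shrinks quickly as the effect you want to detect grows, and explodes for small effects; this is why many-variant tests need so much traffic.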
EXAMPLES OF MULTIVARIANTS
- CTA colour (Google's famous 40 shades of blue)
- Copy experiments (CTAs, email headlines)
REDUCED USER GROUP (E.G. 10% OR LESS OF TOTAL TRAFFIC)
- Expose experimental features to a reduced group of users for early feedback
- Early detection of errors
- Enabling potentially dangerous code (e.g. heavy DB queries)
GRACEFUL DEGRADATION - EMERGENCY SWITCHES
- Disable light actions
- Reduce the data shown, to reduce queries to an overloaded DB
- Hide buttons that lead to pages in trouble (e.g. in another datacenter that is under pressure)
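An emergency switch can be as simple as a named boolean that is checked before the risky code path runs, and that can be flipped without a deploy. A minimal sketch (the switch names and the in-memory `switches` store are hypothetical; in practice the values would come from a config service or admin panel):

```javascript
// Hypothetical in-memory switch store; in a real system this
// would be fed from a config service or admin panel.
var switches = {
    'show-map': true,              // light feature, first to go
    'related-hotels-query': true   // heavy DB query behind a guard
};

function isEnabled(name) {
    // Unknown switches default to disabled: fail safe.
    return switches.hasOwnProperty(name) && switches[name] === true;
}

// Under DB pressure, flip the heavy query off without a deploy:
switches['related-hotels-query'] = false;
```

Defaulting unknown switches to "off" means a typo in a switch name degrades gracefully instead of enabling dangerous code.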
SECONDARY METRICS FTW!!!
- Secondary signups
- Performance impact
- Errors
- An inconclusive result can also be the target impact (e.g. for a code refactor)
TRUSTWORTHY DATA
"When running online experiments, getting numbers is easy; getting numbers you can trust is hard"
TRUSTWORTHY DATA
Without data you can trust, you cannot make a decision. Basically, you know nothing about the results of your test.
ROBOTS
They can bias your results:
- Visitor numbers will be inflated
- Visitor numbers can be altered in just one variant, making distributions uneven
- Conversion rates can be affected as well, due to the increase in visitors
- But also clicks! Some robots parse JavaScript
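A first line of defence is filtering known crawlers by user agent before assigning a variant or recording a visit. A sketch of the idea (the pattern list is illustrative, not exhaustive; real bot filtering also uses IP lists and behavioural signals, and a misbehaving bot can always fake its user agent):

```javascript
// Illustrative patterns only; real lists are much longer
// and need regular updates.
var BOT_PATTERNS = /bot|crawler|spider|slurp|headless/i;

function looksLikeRobot(userAgent) {
    // Treat a missing user agent as suspicious too.
    return !userAgent || BOT_PATTERNS.test(userAgent);
}

// Only assign a variant / record a visit for real-looking visitors:
function shouldTrack(userAgent) {
    return !looksLikeRobot(userAgent);
}
```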
AVOID NOISE
Track only the people that are actually exposed to the change. Otherwise, spotting a change in the results is much harder, and the experiment has to run for longer.
e.g. tracking everyone visiting the website when the change is only on the product page
TRACK USERS ONLY WHEN THEY ARE EXPOSED TO CHANGE
- Lightboxes
- Change is actually viewed in the browser viewport
BUT WEAKER TOO
- Sensitive to JS errors
- Cookie overrides by HTTP requests (use server-side cookies instead)
$('.item').on('click', function( e ) {
    var title = 'Base title for lightbox';

    /* Do some stuff */

    showLightbox({ title: title });
});
$('.item').on('click', function( e ) {
    var title = 'Base title for lightbox';

    /* Do some stuff */

    if( track( featureA ) === 'b' ) {
        title = 'New title for lightbox';
    }

    showLightbox({ title: title });
});
$('.item').on('click', function( e ) {
    var title = 'Base title for lightbox',
        position;

    /* Do some stuff */
    position = $('#elem').offset().top;
    console.log( position );
    /* End do some stuff */

    if( track( featureA ) === 'b' ) {
        title = 'New title for lightbox';
    }

    showLightbox({ title: title });
});
$('.item').on('click', function( e ) {
    var title = 'Base title for lightbox',
        content;

    /* Do some stuff */
    // Synchronous call, returns default content or variant B content.
    content = readContentFromServer() || {};
    /* End do some stuff */

    if( content.useVariantBcontent && track( featureA ) === 'b' ) {
        title = 'New title for lightbox';
    }

    showLightbox({ title: title });
});
// Footer content is changed on the template, based on the variant
(function(){
    track.onView( '#selector', feature );
})();
// Simple onView implementation
track.onView = function( selector, feature ) {
    if( !selector || !feature ) return;

    var trackIfVisible = function( data ) {
        if( isVisible( data.selector ) ) {
            track.feature( data.feature );
            return true;
        }
        return false;
    };

    if( ! trackIfVisible( { selector: selector, feature: feature } ) ) {
        // We don't want to run all the function code on each scroll event
        throttle( trackIfVisible, { selector: selector, feature: feature } )
            .on( 'scroll' );
    }
};
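The `isVisible` helper is not shown on the slide. One way to implement it, assuming `getBoundingClientRect` is available, is to compare the element's rectangle against the viewport; the pure comparison is isolated here so it can run (and be tested) without a DOM:

```javascript
// Pure check: does a rect spanning [top, bottom] overlap a
// viewport of the given height? Extracted so it needs no DOM.
function rectInViewport(top, bottom, viewportHeight) {
    return bottom > 0 && top < viewportHeight;
}

// DOM wrapper (assumes a browser environment):
function isVisible(selector) {
    var el = document.querySelector(selector);
    if (!el) return false;
    var rect = el.getBoundingClientRect();
    return rectInViewport(rect.top, rect.bottom,
        window.innerHeight || document.documentElement.clientHeight);
}
```

A production version would usually also require the element to have non-zero size and no `display: none` ancestor; this sketch only checks viewport overlap.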
<p>Some content</p>
<p data-track-on-view="feature">New content added by variant B</p>
<p>Some other content</p>
<p>Some content</p>
<p>New content added by variant B</p>
<p data-track-on-view="feature">Some other content</p>
<p>Some content</p>
<div class="empty-visible-div" data-track-on-view="feature"></div>
<p>New content added by variant B</p>
<p>Some other content</p>
- Assume you might lose some visitors in the experiment
- Calling a tracking pixel or AJAX while the browser is loading another page is completely unreliable
- You can store the feature in localStorage or a cookie and track on the next page load (still not 100% reliable)
- Alternatively, pass a parameter in the URL for the server to do the tracking when rendering the page
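The localStorage approach can be sketched as a small queue: store the pending feature before the navigation happens, then flush it early on the next page load. The storage object is injected here so the logic also runs outside a browser; the function and key names are hypothetical:

```javascript
// Queue a feature to be tracked on the next page load.
// `storage` is anything with getItem/setItem/removeItem
// (window.localStorage in the browser).
var KEY = 'pendingTracks';

function queueTrack(storage, feature) {
    var pending = JSON.parse(storage.getItem(KEY) || '[]');
    pending.push(feature);
    storage.setItem(KEY, JSON.stringify(pending));
}

// Call early on the next page load, before the user navigates again.
function flushTracks(storage, trackFn) {
    var pending = JSON.parse(storage.getItem(KEY) || '[]');
    storage.removeItem(KEY);
    pending.forEach(trackFn);
    return pending.length;
}
```

Even this is not 100% reliable, as the slide notes: the user may never come back, or the next page load may be on a different device or browser.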