Pinterest Engineering Tech Talks - 4/29/14

Discover Pinterest Engineering

Upload: pinteresr

Post on 16-Oct-2015


DESCRIPTION

In April 2014, Pinterest engineers presented to members of the engineering community at a series of Tech Talks held at the Pinterest offices in San Francisco. Topics included:
- Mobile & Growth: Scaling user education on mobile, and a deep dive into the new user experience (with engineers Dannie Chu and Wendy Lu)
- Monetization & Data: The open sourcing of Pinterest Secor and a look at zero data loss log persistence services (with engineer Pawel Garbacki)
- Developing & Shipping Code at Pinterest: The tools and technologies Pinterest uses to build quickly and deploy confidently.
You can find more at: engineering.pinterest.com and facebook.com/pinterestengineering

TRANSCRIPT

  • Discover Pinterest Engineering

  • Agenda

    Mobile & Growth: Wendy & Dannie

    Monetization & Data: Pawel

    Deploying & Shipping Code: Chris & Jeremy

  • Scaling User Education on Mobile: The Experience Framework

    Wendy Lu, Mobile

    Dannie Chu, Growth

  • User Education

  • Motivation?

  • User Education leads to Engagement

  • First Pin Education

  • Challenges

  • 1. Preserve a great user experience

  • Conflicting Education

  • Never-ending Stream of Nags

  • Nag that never goes away

  • 2. Targeting

  • Are you a new user? Invite some friends!

  • Do you have an empty homefeed? Make it better!

  • 3. Rapid Experimentation

  • Experiment with messaging w/o requiring a release

  • 4. Keep the client code clean

  • Solution?

  • The Experience Framework

  • Conflicting Experiences

  • Experiences and Placements

    Nag Placement

    Home view Placement

    Pin Tutorial Card Placement

  • AB Experimentation, Simple Client Code

  • Experiences are delivered to the client at runtime

  • Experience Framework (architecture diagram)

    Framework components: Decision Engine, Configuration, Handlers, Admin Dashboard

    Connected systems: API Servers, AB Experiments Framework, Kafka (logging), HBase (User States, Experience States), Cascading/MapRed User State Batch Loader

    Clients use an SDK to ask the Experience Framework: What experience should the user see?

  • Deep Dive: Nags

  • What are nags?

    Important information which we occasionally display at the top of a user's feed

    Can be a call to action ("Confirm your email") or an announcement ("You can now add a map to any board!")

  • Enter: The Experience Framework

    Each time we reload the home feed, ask the experience framework: "What should I show in this nag?"

  • Nag Manager

    Single Nag Manager that relies on the experience framework to give it the contents to render

    Experiment with nag messaging, calls to action, images

    Add any new nag dynamically, controlled from the backend

    Cool-down management: a user should not see more than one nag in a set period of time
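The cool-down rule can be sketched in a few lines. This is only an illustration, not Pinterest's implementation: the class layout, the `fetch_nag` callable standing in for the experience framework, and the 24-hour default are all assumptions.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class NagManager:
    """Asks a backend callable for nag content and enforces a cool-down."""

    # Hypothetical stand-in for the experience framework call. It returns
    # a dict of display data, or None when there is nothing to show.
    fetch_nag: Callable[[], Optional[dict]]
    cooldown_seconds: float = 24 * 60 * 60  # assumed value, not from the talk
    clock: Callable[[], float] = time.time
    _last_shown_at: Optional[float] = field(default=None, init=False)

    def nag_for_feed_reload(self) -> Optional[dict]:
        """Return nag data to render, or None while the cool-down is active."""
        now = self.clock()
        if (self._last_shown_at is not None
                and now - self._last_shown_at < self.cooldown_seconds):
            return None
        nag = self.fetch_nag()
        if nag is not None:
            # Only start a new cool-down window when something was shown.
            self._last_shown_at = now
        return nag
```

Injecting the clock keeps the rule easy to test without waiting for real time to pass.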

  • What does that nag data look like?

    Handling actions

    "bg_img_url_2x" = "http://mobile-assets.pinterest.com/iphone/nags/[email protected]"

    "title_text" = "Pinspire your friends!"

    "detailed_text" = "Know someone who'd like Pinterest? Invite them along."

    "button1_text" = "No, Thanks!"

    "button1_uri" = ""

    "button2_text" = "Invite Friends"

    "button2_uri" = "pinterest://invite_friends"

  • Handling Actions

    All initialization and presentation of view controllers is handled through a central Navigation Manager.

    Centralizes code to create and present view controllers

    Consistency with other platforms for deep links

    Allows dynamic insertion of nags from the backend without having to write new client code and submit a new release

  • [[NavigationManager sharedManager] handleURL:[NSURL URLWithString:@"pinterest://invite_friends"]];

    Navigation Manager

    self.presentationDelegate

    presentInviteFriends
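The routing idea behind the Navigation Manager, a URI from the backend mapped to a presenter on the client, might look roughly like the following. It is a language-neutral sketch in Python rather than the Objective-C used on the slide; the `register` method and the route table are hypothetical.

```python
from typing import Callable, Dict
from urllib.parse import urlparse


class NavigationManager:
    """Maps deep-link URLs such as pinterest://invite_friends to presenters."""

    def __init__(self) -> None:
        # Route name -> function that presents the matching screen.
        self._routes: Dict[str, Callable[[], None]] = {}

    def register(self, route: str, presenter: Callable[[], None]) -> None:
        self._routes[route] = presenter

    def handle_url(self, url: str) -> bool:
        """Present the screen for `url`; return False if it is not handled."""
        parsed = urlparse(url)
        if parsed.scheme != "pinterest":
            return False  # empty button URIs and foreign schemes are ignored
        presenter = self._routes.get(parsed.netloc)
        if presenter is None:
            return False
        presenter()
        return True
```

Because the client only needs one registered presenter per route, a new nag whose button points at an existing route needs no new client code.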

  • Deep Dive: New User Experience

  • New User Experience

    What is NUX?

    The set of initial tutorials and education presented to a user after they register

  • Personalize a user's content immediately after signup

    Connect with Friends

    Choose Interests

  • Next, teach users how to Pin

  • Experiment between classic user ed and NUX

    When doing experiments where we need to call the network to get the user's treatment group, we need to make sure we're not adding perceived latency

    Structure view controllers in a way that you can asynchronously load in the modules dependent on the treatment group

    If we need to transition to different view controllers, set a timeout after which we fall back to the control treatment

    Be Fast or Fail Fast
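"Be Fast or Fail Fast" can be illustrated with a fetch that is abandoned after a deadline. This is a minimal sketch under assumptions: the function name, the `"control"` label, and the use of a thread pool are illustrative and not taken from the talk.

```python
import concurrent.futures
from typing import Callable

CONTROL = "control"  # assumed name for the control treatment


def treatment_or_control(fetch_treatment: Callable[[], str],
                         timeout_seconds: float) -> str:
    """Return the fetched treatment group, or CONTROL on timeout or error."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fetch_treatment)
    try:
        return future.result(timeout=timeout_seconds)
    except Exception:
        # Covers both a slow network call (timeout) and a failed one.
        return CONTROL
    finally:
        # Do not block the caller on a fetch that is still in flight.
        executor.shutdown(wait=False)
```

The caller always gets an answer within roughly `timeout_seconds`, so the experiment cannot add more perceived latency than that bound.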

  • Experiment with different versions of NUX

    1 step vs. 2 steps vs. 3 steps

    Backend controls all strings, allowing us to dynamically experiment with different text (messaging, titles, calls-to-action)

  • Enter: Experience Framework

    After signup, request all data for NUX

    steps = (
        {
            "continue_button_text" = Continue;
            "detailed_text" = "Tap a network to find people who share your interests.";
            "follow_button_text" = "Follow selected people";
            step = 1;
            "title_text" = "First things first";
            "total_steps" = 2;
        },
        {
            "completion_message" = "Finding Pins for you...";
            "continue_button_text" = "Tap at least {0} more to continue";
            "detailed_text" = "Tap whatever you're interested in these days.";
            "finish_text" = Finish;
            "num_interests" = 5;
            "skip_text" = "Pinterest is much more interesting when you tell us what you like.";
            step = 2;
            "title_text" = "Pick 5 interests";
            "total_steps" = 2;
        }
    );

  • Supports dynamic number and order of steps

    NUXViewController : UINavigationController

    Maps an array of display data to an array of view controllers

    Protocol method advanceToNextStep called by each child view controller

    Checks the array it keeps for the next view controller to push

    JSON dict for Intro -> NUXIntroViewController

    JSON dict for Friend Selector -> NUXConnectViewController

    JSON dict for Interests Selector -> NUXInterestsViewController
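The mapping from step data to screens can be sketched as below. The `"type"` field used to pick a screen is an assumption (the payload shown in the slides does not include one), and the screens are represented by their class names as plain strings.

```python
from typing import Dict, List, Optional

# Hypothetical mapping from an assumed "type" field to a screen name.
SCREEN_FOR_STEP_TYPE = {
    "intro": "NUXIntroViewController",
    "connect": "NUXConnectViewController",
    "interests": "NUXInterestsViewController",
}


class NUXFlow:
    """Turns backend step data into an ordered list of screens."""

    def __init__(self, steps: List[Dict]) -> None:
        # Unknown step types are skipped so an older client keeps working.
        self._screens = [
            SCREEN_FOR_STEP_TYPE[step["type"]]
            for step in steps
            if step.get("type") in SCREEN_FOR_STEP_TYPE
        ]
        self._index = 0

    def current(self) -> Optional[str]:
        if self._index < len(self._screens):
            return self._screens[self._index]
        return None

    def advance_to_next_step(self) -> Optional[str]:
        """Called by each screen when done; returns the next screen or None."""
        if self._index < len(self._screens):
            self._index += 1
        return self.current()
```

Since the order and number of screens come entirely from the step list, a 1-step, 2-step, or 3-step flow needs no client change.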

  • Wins from Experience Framework

    Single place in the backend that manages all experiences for all platforms

    Dynamically trigger display of content

    Conflict resolution for educations that touch the same views

    Experiment with flows, messaging, and images

  • engineering.pinterest.com

  • Pinterest Secor: Zero-data-loss log persistence service

    Pawel Garbacki, Software Engineer, Monetization

  • Pinterest is a data-driven company

    Data matters

    100+ experiments active at a given point in time

    1500+ tracked metrics

    200+ log types

  • We produce a lot of data

    PBs of data in S3, growing by tens of TB a day

    Hundreds of production Hadoop jobs, processing about half a PB of data each day

  • Data pipeline (diagram): 20B messages/day

    Components shown: App, Local Disk, Singer (logging agent), Kafka (log collector), Secor (log saver), S3, Storm (realtime stats), Hive (Hadoop analytics), Redshift (ad hoc queries)

  • Kafka 101

    Distributed pub-sub service

    Designed for high throughput

    (Diagram: producers publish to the Kafka cluster; consumers read from it)

  • Anatomy of a topic

    Topic is a category to which messages are published

    Partition is a shard of a topic controlling the level of consumption parallelism

    Messages are assigned unique identifiers called offsets

    (Diagram: writes are appended to Partition 0, Partition 1, and Partition 2, each an ordered sequence of numbered messages)
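A toy in-memory model makes the topic, partition, and offset vocabulary concrete. It is not the Kafka client API: the class, the CRC32 key hashing, and the zero-based offsets are simplifying assumptions for illustration.

```python
import zlib
from typing import List, Tuple


class Topic:
    """A named set of append-only partitions."""

    def __init__(self, name: str, num_partitions: int) -> None:
        self.name = name
        self.partitions: List[List[bytes]] = [[] for _ in range(num_partitions)]

    def publish(self, key: bytes, message: bytes) -> Tuple[int, int]:
        """Append a message; return its (partition, offset)."""
        # Messages with the same key always land in the same partition.
        partition = zlib.crc32(key) % len(self.partitions)
        log = self.partitions[partition]
        log.append(message)
        # The offset is the message's position within its partition.
        return partition, len(log) - 1

    def read(self, partition: int, offset: int) -> List[bytes]:
        """Return every message in `partition` from `offset` onward."""
        return self.partitions[partition][offset:]
```

An offset identifies a message only within its partition, which is why a consumer tracks one position per partition and why the partition count bounds consumption parallelism.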

  • Save the day

    Kafka is optimized for local writes

    Local disk capacity is good for a few days' worth of data

    Data needs to be saved (at least) daily to long term storage - Amazon S3

  • How soon is eventually

    Amazon S3 is a cloud file system

    Eventual consistency model

    No guarantees on when uploaded data will become visible to the readers

    No monotonicity - data available in the past may magically disappear

  • Secor design guidelines

    Objectives:

    Persist Kafka logs to S3

    Cause no data loss

    Work properly with eventual consistency model

    Properties:

    Horizontal scalability

    Fault tolerance

    Customizability

  • No-S3-reads principle

    Secor never reads data from S3

    Lightweight metadata is stored in a strongly consistent state repository

    Strategic choice of file names

    s3n://logs//

    __

    represents software compatibility

    version

    Inconsistencies introduced by consumer failures get fixed automatically by file overwrites
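Why overwrites repair failures can be shown with a small sketch. The exact path template is not legible in the slide, so the key layout below (topic, partition, first offset) is a hypothetical one chosen only to show the principle; `FakeStore` is a dict-backed stand-in for a write-only object store and is not the S3 API.

```python
from typing import Dict, List


def object_key(topic: str, partition: int, first_offset: int) -> str:
    """Hypothetical layout: the key depends only on where the batch starts."""
    return f"logs/{topic}/{partition}/{first_offset:020d}"


class FakeStore:
    """Stand-in for an object store that is written to but never read."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, body: bytes) -> None:
        self.objects[key] = body  # a second put to the same key overwrites


def upload_batch(store: FakeStore, topic: str, partition: int,
                 first_offset: int, messages: List[bytes]) -> str:
    key = object_key(topic, partition, first_offset)
    store.put(key, b"\n".join(messages))
    return key
```

A consumer that crashes after a partial upload restarts from its last committed offset, rebuilds the batch, and writes to the same key, replacing the partial object without ever listing or reading the store.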

  • Date clustering

    Data processing tools rely on a date-partitioned directory structure

    s3n://logs/event/dt=2014-04-04/

    Timestamps extracted from messages on the fly

    Support for pluggable parsers for Thrift, JSON, etc.
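Date clustering with a pluggable parser could look like this. It is a sketch under assumptions: the `"timestamp"` field name, epoch seconds as the unit, and the UTC day boundary are not stated in the slides, and the functions are not Secor's actual parser interface.

```python
import json
from datetime import datetime, timezone
from typing import Callable


def parse_json_timestamp(raw: bytes) -> float:
    """Example parser: reads an assumed "timestamp" field in epoch seconds."""
    return float(json.loads(raw)["timestamp"])


def partition_path(topic: str, raw: bytes,
                   parser: Callable[[bytes], float] = parse_json_timestamp) -> str:
    """Return the date-partitioned directory a message belongs in."""
    seconds = parser(raw)
    # Assumption: days are cut on UTC boundaries.
    day = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d")
    return f"s3n://logs/{topic}/dt={day}/"
```

Supporting another format, such as Thrift, would mean passing a different `parser` callable; the path logic stays the same.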

  • https://github.com/pinterest/secor

  • From Development to Production: The life of web features

    Chris Danford (@chrisdanford), Jeremy Stanley (@rouxbot), Web Team

  • Develop Review Deploy Measure

  • Developing - Ideal state

    able to iterate quickly

    easy errors caught automatically

    easy-to-understand and powerful abstractions

  • Fast iteration (Developing)

    Build tasks and dependencies modeled as a graph

    cumberbatch watches for changes of file contents

    orchestrator knows tasks and dependencies

    build the minimum tasks to heal damage in the graph

    maximize parallelization of tasks

    built on Grunt - access to large library of build tasks
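The "build the minimum tasks to heal damage in the graph" idea can be sketched with a small dependency table. This is not the cumberbatch or orchestrator code: the task wiring below is assumed for illustration, and the standard-library `graphlib` module (Python 3.9+) supplies the ordering.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set, Tuple

# Hypothetical wiring: task -> (input files, output files).
TASKS: Dict[str, Tuple[Set[str], Set[str]]] = {
    "sprite": ({"sprite/*.png"}, {"sprite.png", "sprite.scss"}),
    "sass": ({"sprite.scss", "component/*.scss"}, {"components.css"}),
    "imagemin": ({"sprite.png"}, set()),
    "autoprefixer": ({"components.css"}, set()),
}


def tasks_to_run(changed: Set[str],
                 tasks: Dict[str, Tuple[Set[str], Set[str]]] = TASKS) -> List[str]:
    """Return only the tasks affected by `changed`, in a valid build order."""
    dirty_files = set(changed)
    dirty_tasks: Set[str] = set()
    grew = True
    while grew:  # propagate damage until no new task becomes dirty
        grew = False
        for name, (inputs, outputs) in tasks.items():
            if name not in dirty_tasks and inputs & dirty_files:
                dirty_tasks.add(name)
                dirty_files |= outputs
                grew = True
    # A dirty task must wait for dirty tasks that produce its inputs.
    deps = {
        name: {other for other in dirty_tasks
               if tasks[other][1] & tasks[name][0]}
        for name in dirty_tasks
    }
    return list(TopologicalSorter(deps).static_order())
```

Editing a component stylesheet reruns only the Sass and Autoprefixer tasks, while editing a sprite image reruns everything downstream of the sprite task. Tasks with no dependency between them could run in parallel.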

  • Self-healing build graph (diagram, animated across two examples)

    Source files: sprite/*.png, component/*.scss

    Tasks: sprite task, Sass task, imagemin task, Autoprefixer task, CSSlint task

    Generated files: sprite.png, sprite.scss, components.css

  • Linting (Developing)

    catches easy bugs

    enforces consistent style

    pyflakes

    pep8

    jshint

    CSSLint

    custom RegEx linting

  • RegEx-based linter

    $('.item').addClass('selected');

    BoardPicker.js

    this.$('.item').addClass('selected');
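A regex rule of this kind might be written as follows. The rule itself is an assumption inferred from the two snippets on the slide (a bare `$(` selector versus a component-scoped `this.$(`); the actual patterns used at Pinterest are not shown.

```python
import re
from typing import List, Tuple

# Matches `$(` unless it is preceded by `.`, `$`, or an identifier
# character, so `this.$(` and `foo$(` are not flagged.
UNSCOPED_JQUERY = re.compile(r"(?<![\w.$])\$\(")


def lint(source: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that use an unscoped `$(` selector."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        if UNSCOPED_JQUERY.search(line):
            findings.append((number, line.strip()))
    return findings
```

A check like this is cheap to run on every file and catches selectors that could reach outside the component's own DOM.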

  • Static analysis / type safety (Developing)

    Google Closure Compiler

    template static analysis

    validation of template inheritance

    extract used option variable names

    Thrift clients for service calls (Python) and shared constants (JavaScript)

  • Closure Compiler - type safety

  • Generated constants from Thrift

  • Abstractions (Developing)

    component framework

    styles are scoped to the component

    DOM access is scoped to the component's DOM

    events up, methods down

    scaffolding script

    live component catalog - discovery of existing components

    autoprefixer, spriting - remove boilerplate

  • component catalog

  • spriting

    Usage:

    .thumbsUpButton {
      @include inline-image('sprites/main/thumbUp');
    }

  • Develop Review Deploy Measure

  • Reviewing - Ideal state

    the most-relevant person is reviewing changes

    integration-type issues are caught

  • Code Reviews (Reviewing)

    local test runner script runs Jasmine and Node tests parallelized

    PR watcher tool - visibility of relevant PRs

    code review process - R+, E+

  • Parallelized Jasmine tests in PhantomJS

  • GitHub Watcher tool

  • Integration problems are caught (Reviewing)

    PRs trigger a build and tests - 3 minutes

    latest.pinterest.com is continuously deployed from head

    Selenium integration tests run against every deploy

  • Pull request builds

  • Selenium

  • Develop Review Deploy Measure

  • Deploying - Ideal state

    code deploys are invisible to users

    frequent and non-disruptive to developers

    immediate rollback when there's a problem

  • User experience (Deploying)

    stickiness to a version

    flip nearly instantaneously between builds

    reduce version thrashing

    worry less about style mismatches and JS errors due to data format mismatch with the server

    asset versioning

  • Serving multiple application versions (diagram, four steps)

    1. Old sticky sessions stay on version A; new sessions get version B

    2. All sessions are on version B

    3. New canary sessions get version C; all non-canary sessions stay on version B

    4. Old sticky sessions stay on version B; new sessions get version C
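Session stickiness across deploys can be sketched as a small router. This is an assumption-heavy illustration, not Pinterest's serving layer: the class, method names, and in-memory session table are hypothetical, and the canary step is left out for brevity.

```python
from typing import Dict, Set


class VersionRouter:
    """Pins each session to the version it first saw while that version lives."""

    def __init__(self, live_version: str) -> None:
        self.live_version = live_version
        self.available: Set[str] = {live_version}
        self._pinned: Dict[str, str] = {}

    def deploy(self, version: str) -> None:
        """Make `version` the default for new sessions; old ones stay put."""
        self.available.add(version)
        self.live_version = version

    def retire(self, version: str) -> None:
        """Stop serving `version`; its sessions move to the live version."""
        if version == self.live_version:
            raise ValueError("cannot retire the live version")
        self.available.discard(version)

    def version_for(self, session_id: str) -> str:
        pinned = self._pinned.get(session_id)
        if pinned not in self.available:
            # New session, or its version was retired: pin to live.
            pinned = self.live_version
            self._pinned[session_id] = pinned
        return pinned
```

Keeping a session on one version avoids a page being served markup from one build and scripts or styles from another.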

  • Asset versioning

    logo.png -> hash file contents, rename -> logo.59aa9183.png

    bundle.css -> update references: background-image: logo.png becomes background-image: logo.59aa9183.png

    bundle.css -> hash file contents, rename -> bundle.907389d8.css
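The rename-then-update-references flow can be sketched as below. The hash function and the 8-character digest length are assumptions made to resemble the names on the slide, and the plain string replacement stands in for a real CSS rewriter.

```python
import hashlib
from typing import Dict


def versioned_name(name: str, contents: bytes) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.md5(contents).hexdigest()[:8]  # assumed hash and length
    stem, dot, ext = name.rpartition(".")
    return f"{stem}.{digest}.{ext}" if dot else f"{name}.{digest}"


def version_assets(assets: Dict[str, bytes]) -> Dict[str, bytes]:
    """Return assets under hashed names, with CSS references updated."""
    renamed: Dict[str, str] = {}
    output: Dict[str, bytes] = {}
    # Pass 1: hash and rename everything that CSS may refer to.
    for name, body in assets.items():
        if not name.endswith(".css"):
            renamed[name] = versioned_name(name, body)
            output[renamed[name]] = body
    # Pass 2: rewrite references first, then hash the rewritten CSS.
    for name, body in assets.items():
        if name.endswith(".css"):
            text = body.decode("utf-8")
            for old, new in renamed.items():
                text = text.replace(old, new)
            updated = text.encode("utf-8")
            output[versioned_name(name, updated)] = updated
    return output
```

Hashing the stylesheet after its references are rewritten means a changed image also changes the stylesheet's name, so clients never pair a cached stylesheet with an image it was not built against.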

  • When things go wrong (Deploying)

    experiments dashboard - turn off experiments instantaneously

    version rollback is nearly instantaneous

  • Develop Review Deploy Measure

  • Monitoring health

    A/B dashboard key metrics

    Sentry

    Stats dashboard

    Alarms - Monit, PagerDuty

  • A/B Dashboard

  • Sentry

  • Sentry error emails

  • Results

    5 engineers on web team

    all teams at Pinterest developing their own web features on our platform

    components re-used across teams

    2 scheduled deploys a day

    anomalies in key metrics surfaced immediately

    100s of simultaneous experiments