data science and engineering for marketers
TRANSCRIPT
@MicahHerstand
Software Engineer, User Advocate, Writer, Actor, Singer-Songwriter
@MICAHHERSTAND
ˈmikə
“Marketing has become a technology-powered discipline, and therefore, marketing organizations must infuse technical capabilities into their DNA.”
~Scott Brinker, MarTech Conference Program Chair
LESSON OBJECTIVES: Theory
Discover how data science enables marketing innovation: Measure, Metric, CSF, KPI; Customer segmentation; Big Data, Open Data, Linked Data; Growth Hacking
Ensure your org’s data engineering empowers marketers: Database; SQL (Relational), NoSQL (Document, Graph); Data warehouse
LESSON OBJECTIVES: Setup
Install a database manager: Sequel Pro for Mac; MySQL Workbench for Windows & Linux
Connect to your org’s database: Standard, SSH, SSL
Bookmark SQL helpers: SQLZoo (run SQL online!), Tutorials Point, Khan Academy
Google queries: “site:docs.oracle.com UNKNOWN TERM”
LESSON OBJECTIVES: Practice SQL
Create a mental model for what it’s like to query SQL using English first
Acquire the vocabulary to understand a SQL query
Encounter example SQL queries and see their results
Practice your knowledge through exercises
DATA SCIENCE
Measure, Metric, CSF, KPI
Customer segmentation
Big Data, Open Data, Linked Data
Growth Hacking
DATA SCIENCE: Measure, Metric, CSF, KPI
It’s the metrics, stupid!
“The price of light is less than the cost of darkness.”
~Arthur C. Nielsen, namesake of Nielsen TV ratings
“What gets measured, gets managed.”
“There is nothing so useless as doing efficiently that which should not be done at all.”
“Management is doing things right; leadership is doing the right things.”
~Peter Drucker, the founder of modern management
DATA SCIENCE: Measure
Definition: Anything that can be measured
Caveat: Must be a single-variable measure
E.g. № of users
E.g. № of active users
Challenges:
Definition of terms
E.g. Does account creation make someone a customer?
Measurement process
E.g. How frequently should data be collected?
DATA SCIENCE: Metric
Definition: Value derived from 2+ measures
Metric selection: Efficiency vs effectiveness
E.g. cost of customer acquisition vs customer lifetime value
Analysis: Information vs insights
E.g. customer value vs value of customers acquired through LinkedIn
Optimization: Source vs campaign
E.g. customers w/ expired CC vs customer bounce rate when CC expired
Caution: Vanity, engagement, and benchmark metrics
E.g. Facebook Likes, Time on Page, DVD sales
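Sketch: a metric is a value derived from two or more measures. The Python below is only an illustration of that idea; every figure and variable name is invented, and the lifetime-value formula is a deliberately simple assumption rather than a standard definition.

```python
# Measures: single-variable values (all figures are invented for illustration).
marketing_spend = 5000.00      # dollars spent on acquisition this month
new_customers = 125            # customers acquired this month
revenue_per_customer = 30.00   # average monthly revenue per customer
months_retained = 6            # average number of months a customer stays

# Metrics: each one combines two or more measures.
cost_of_acquisition = marketing_spend / new_customers              # efficiency
customer_lifetime_value = revenue_per_customer * months_retained   # effectiveness
value_to_cost_ratio = customer_lifetime_value / cost_of_acquisition

print(f"Cost of customer acquisition: ${cost_of_acquisition:.2f}")
print(f"Customer lifetime value: ${customer_lifetime_value:.2f}")
print(f"Lifetime value per acquisition dollar: {value_to_cost_ratio:.1f}")
```

Neither measure says much alone; the ratio is what tells you whether acquisition spend is paying off.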
DATA SCIENCE: CSF (Critical Success Factor)
Definition: What is required to achieve business objectives.
E.g. acquire new customers
Prerequisites: Business objectives
E.g. to obtain 10% market share (BO), must acquire new customers (CSF)
More on CSFs: bit.ly/sidata-csf
DATA SCIENCE: KPI (Key Performance Indicator)
Definition: A measurable value that demonstrates how effectively a company is achieving key business objectives
E.g. cost per lead, customer lifetime value, traffic-to-lead ratio, retweets of last ten tweets, landing page conversion rates
Prerequisites: Critical Success Factors
E.g. to acquire new customers (CSF), track those acquired per week (KPI)
Requisites: SMART (Specific, Measurable, Achievable, Relevant, Time-bound)
E.g. weekly rate of customer acquisition
Caution: Perverse incentives and unintended consequences
E.g. referral programs to increase customer acquisition
More on KPIs: bit.ly/sidata-kpi
DATA SCIENCE: CSFs vs KPIs
Graphic origin: bit.ly/sidata-kpi-vs-csf
DATA SCIENCE: Prioritization
“Never confuse motion with action.” ~Benjamin Franklin
Graphic Origin: bit.ly/sidata-metrics-graphic
DATA SCIENCE: Customer Segmentation
Definition: The practice of dividing a customer base into groups of individuals that are similar in specific ways relevant to marketing
E.g. SI grads, New Yorkers, users who have yet to purchase
Utility: One size does not fit all. Allows for novel KPIs.
Prerequisites: Business Objectives, Metrics
E.g. You want to gain 10% of the salon market (Biz Objective) and only 25% of total customers are men (metric), so target men as an under-saturated market
Types: A priori, Needs-based, and Value-based
Caution: Don’t break the law by targeting protected classes
E.g. Airbnb cannot offer Iranian-Americans discounts for Nowruz
More on KPIs: bit.ly/sidata-kpi
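Sketch: segmentation amounts to grouping customer records by a shared trait. The Python below is a minimal illustration, assuming a made-up list of customers; the field names (`gender`, `purchases`) and the records themselves are hypothetical.

```python
from collections import Counter

# Hypothetical customer records; names and fields are invented for illustration.
customers = [
    {"name": "Ana", "gender": "woman", "purchases": 3},
    {"name": "Ben", "gender": "man", "purchases": 0},
    {"name": "Cho", "gender": "woman", "purchases": 1},
    {"name": "Dee", "gender": "woman", "purchases": 0},
]

# A priori segmentation: group by an attribute that is known up front.
segment_sizes = Counter(customer["gender"] for customer in customers)
share_men = segment_sizes["man"] / len(customers)

# Another segment named on the slide: users who have yet to purchase.
yet_to_purchase = [c["name"] for c in customers if c["purchases"] == 0]

print(f"Share of customers who are men: {share_men:.0%}")
print("Yet to purchase:", yet_to_purchase)
```

The segment share is itself a metric, which is why segmentation lists metrics as a prerequisite.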
DATA SCIENCE: Big Data, Open Data, Linked Data
“Big Data will spell the death of customer segmentation and force the marketer to understand each customer as an individual.”
~Ginni Rometty, CEO, IBM
“Google only gives you answers for questions people have asked before.”
“A mark of a good site is realizing you're not the only site in the world.”
~Tim Berners-Lee, inventor of the World Wide Web
DATA SCIENCE: Big Data
Definition: Data sets that are so large or complex that traditional data processing applications are inadequate to deal with them.
Technical Challenges:
Volume (amount of data)
Velocity (speed of data in and out)
Variety (range of data types and sources)
Human Challenges: No magic bullets, easy to overstate current capabilities
Novel Opportunities: Real-time pricing, Sentiment analysis, Optimized offers
Designed by Forrester Research, accessed at bit.ly/sidata-bigdata
DATA SCIENCE: Open Data
Definition: Data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control. “Free as in speech, not beer.”
E.g. data.gov, census.gov
Alternate Definition: Public or private data stores available for integration into one’s own data system.
E.g. developer.nytimes.com, Thomson Reuters
Challenges:
Low cost, high quality, and large quantity (pick two)
Data normalization (e.g. gender and sex, China bowls vs China country)
DATA SCIENCE: Linked Data
Definition: A method of publishing structured data so that it can be interlinked and become more useful through semantic queries.
E.g. Facebook’s Open Graph, Google Rich Snippets, Twitter Cards
Novelty: Data sources share a schema, so no middleware is necessary
Challenges:
Comparatively few data sources
Data analysis tools less mature
Fewer trained developers
“Marketing department might want to dominate the Linked Data web.”
~Ralph Swick, COO of the W3C, organization responsible for World Wide Web standards
DATA SCIENCE: Linked Data
"When companies post data as Linked Data they can be held accountable. Regex has [fuzzy] responsibility.”
~Ralph Swick, COO of the W3C, organization responsible for World Wide Web’s technology standards
Accessed March 8th, 2017
DATA SCIENCE: Growth Hacking
Graphic origin: bit.ly/sidata-gh-cartoon-2
DATA SCIENCE: Growth Hacking
Graphic origin: bit.ly/sidata-gh-cartoon-3
DATA SCIENCE: Growth Hacking
“Growth hackers are a hybrid of marketer and coder.”
“[Growth hacking] requires a blurring of lines between marketing, product, and engineering, so that they work together to make the product market itself.”
~Andrew Chen, Head of Rider Growth at Uber
“The true unicorns are those who can go end-to-end designing, building, measuring, analyzing, and iterating with a combination of user intuition and deep analytics.”
~Matt Humphrey, sold his startup HomeRun for $100M+ after 18 months
DATA SCIENCE: Growth Hacking
Definition: A process of rapid experimentation across marketing channels and product development to identify the most effective, efficient ways to grow a business.
E.g. Airbnb cross-listing on Craigslist
Novelty: Interdisciplinary skills and knowledge
Prerequisites: Interdisciplinary teams, acceptance of failure, outside-the-box thinking
Requisites: Measurable, metric-based
DATA ENGINEERING: Database
Definition: A collection of structured data, organized for rapid search by an automated computer program.
Novelty: List or calculate data from various sources
E.g. How much revenue has been made by sales from customers whose first visit was referred by a Facebook ad?
E.g. How many customers (who have made at least $100 in purchases total) have used our referral program?
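Sketch: both example questions can be answered once customer and sales data sit in the same database. The Python below uses the built-in `sqlite3` module as a stand-in; the `customers` and `sales` tables, their columns, and every row are hypothetical, invented only to make the queries runnable.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, first_referrer TEXT, used_referral INTEGER);
CREATE TABLE sales (customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'facebook_ad', 1), (2, 'google_search', 1), (3, 'facebook_ad', 0);
INSERT INTO sales VALUES (1, 60.0), (1, 50.0), (2, 40.0), (3, 25.0);
""")

# Revenue from customers whose first visit was referred by a Facebook ad.
facebook_revenue = conn.execute("""
    SELECT SUM(sales.amount)
    FROM sales
    INNER JOIN customers ON customers.id = sales.customer_id
    WHERE customers.first_referrer = 'facebook_ad'
""").fetchone()[0]

# Customers with at least $100 in total purchases who used the referral program.
big_referral_customers = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT customers.id
        FROM customers
        INNER JOIN sales ON sales.customer_id = customers.id
        WHERE customers.used_referral = 1
        GROUP BY customers.id
        HAVING SUM(sales.amount) >= 100
    ) AS qualifying
""").fetchone()[0]

print("Revenue from Facebook-referred customers:", facebook_revenue)
print("Referral customers with $100+ in purchases:", big_referral_customers)
```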
DATA ENGINEERING: Relational Database
Definition: A type of database that organizes data into tables (think spreadsheet) and creates clearly defined relationships between those tables.
E.g. SQL databases (MySQL, PostgreSQL, SQLite, Oracle Database, MS SQL)
SQL is a programming language that lets people set up relational databases as well as add, update, delete, and look up data within them.
Novelty: Up-front schema, data integrity checks, transactions.
E.g. ensure a movie cannot be added without an associated director
Challenges: Large datasets and an evolving schema are difficult to manage.
E.g. you want to track customers’ age, then decide not to, then decide to track gender as a binary, then decide to make gender a free-text option…
bit.ly/sidata-sql-vs-nosql
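Sketch: how an up-front schema can enforce the "no movie without a director" rule. This uses Python's built-in `sqlite3` as a stand-in for any relational database; the `directors` and `movies` tables and their columns are assumptions made for the example, not the schema of a real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE directors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE movies (
    title TEXT NOT NULL,
    director_id INTEGER NOT NULL REFERENCES directors(id)
);
INSERT INTO directors VALUES (1, 'Jordan Peele');
""")

# A movie with a known director is accepted.
conn.execute("INSERT INTO movies VALUES ('Get Out', 1)")

# A movie with no director, or with a director that does not exist, is rejected.
rejected = []
for title, director_id in [("No Director", None), ("Unknown Director", 99)]:
    try:
        conn.execute("INSERT INTO movies VALUES (?, ?)", (title, director_id))
    except sqlite3.IntegrityError:
        rejected.append(title)

stored_titles = [row[0] for row in conn.execute("SELECT title FROM movies")]
print("Stored:", stored_titles)
print("Rejected:", rejected)
```

The database itself refuses the bad rows, so no application code has to remember the rule.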
DATA ENGINEERING: NoSQL Databases
Definition: A database that is not a relational database. (NoSQL is colloquial jargon, not a standard)
E.g. MongoDB, Redis, Couchbase, Neo4j
Novelty: No schema required to store data. Easily scalable. Super fast lookups.
E.g. easy to track customers’ age, then decide not to, then decide to track gender as a binary, then decide to make gender a free-text option…
Challenges: Data integrity, stable transactions.
E.g. cannot ensure a director is always included when adding a movie
bit.ly/sidata-sql-vs-nosql
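Sketch: what "no schema required" looks like in practice. Plain Python dictionaries stand in for documents here; this is not the API of MongoDB or any other real NoSQL product, and all records are invented.

```python
import json

# A plain list stands in for a document collection (not a real NoSQL client).
customers = []
customers.append({"name": "Ana", "age": 34})                # first, track age
customers.append({"name": "Ben"})                           # later, stop tracking age
customers.append({"name": "Cho", "gender": "F"})            # then track gender as a binary
customers.append({"name": "Dee", "gender": "genderqueer"})  # then allow free text

# Every shape of record was accepted; the fields simply vary per document.
fields_seen = sorted({key for document in customers for key in document})

# The flip side: nothing rejects an incomplete record.
movies = [{"title": "Get Out", "director": "Jordan Peele"}, {"title": "Untitled"}]
missing_director = [m["title"] for m in movies if "director" not in m]

print(json.dumps(customers, indent=2))
print("Fields seen:", fields_seen)
print("Movies stored without a director:", missing_director)
```

Flexibility and integrity trade off against each other: the store accepts any shape, so the checking moves into your own code.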
DATA ENGINEERING: Data warehouse
Definition: A computer system optimized for analytical and informational processing that is filled with data copied from both inside and outside the enterprise
E.g. a database with a sales table, a Google Analytics table, and a census table.
Novelty: Analyze business data without affecting day-to-day operations
E.g. you want to see employee clock-in times without preventing them from simultaneously clocking out.
Challenges: Large overhead and maintenance costs, which are wasted if a warehouse is not actually needed
DATABASE SETUP
Database manager application
Database Connections
SQL Helpers to Bookmark
DATABASE SETUP: DB Manager Application
Definition: A graphical user interface that simplifies database interactions for developers
Examples:
Sequel Pro for Mac: bit.ly/sidata-mac
MySQL Workbench for Windows & Linux: bit.ly/sidata-not-mac
phpMyAdmin for web access
DATABASE SETUP: Database connections
Unsecured Connections are often called “standard” and require no setup besides the application you just downloaded
Secured Connections can use SSH or SSL and require additional encryption technology to be installed on your computer.
Your company should have documentation on how to use these.
DATABASE SETUP: DB Connection Info
Server: www.herstand.com
User: sistudents
Password: Hf68S9CpK67RUDV3
Database: simovies
Port: 3306 (default MySQL port)
DATABASE SETUP: SQL Helpers to Bookmark
Learn: TutorialsPoint.com, KhanAcademy.com
Play: SQLZoo.net (run SQL online!)
Cheatsheet: bit.ly/sidata-sql-cheat-sheet
Cheatsheet with examples: bit.ly/sidata-cheat-with-examples
RTFM: bit.ly/sidata-mysql-rtfm
PRACTICE SQL: English queries
Questions SQL can answer: Who, What, Which, Where, When, How Many
E.g. Who directed the film Get Out?
E.g. Who acted in the film Get Out?
E.g. What films were released before Jan 1, 2000?
E.g. Where did the director of Get Out go to college?
E.g. Which colleges had the most graduates direct films since Jan 1, 2000?
E.g. When was Get Out released?
E.g. How many actors were in both Get Out and The West Wing?
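Sketch: three of the English questions above, translated into SQL. Python's built-in `sqlite3` stands in for the class MySQL server so the queries can run anywhere; the `director` column and the three sample rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table layout; the "director" column is an assumption.
conn.execute("CREATE TABLE movies (title TEXT, release_date TEXT, director TEXT)")
conn.executemany("INSERT INTO movies VALUES (?, ?, ?)", [
    ("Get Out", "2017-02-24", "Jordan Peele"),
    ("The Matrix", "1999-03-31", "The Wachowskis"),
    ("Toy Story", "1995-11-22", "John Lasseter"),
])

# Who directed the film Get Out?
who = conn.execute(
    "SELECT director FROM movies WHERE title = 'Get Out'").fetchone()[0]

# When was Get Out released?
when = conn.execute(
    "SELECT release_date FROM movies WHERE title = 'Get Out'").fetchone()[0]

# What films were released before Jan 1, 2000?
before_2000 = [row[0] for row in conn.execute(
    "SELECT title FROM movies WHERE release_date < '2000-01-01' ORDER BY title ASC")]

print(who, when, before_2000)
```

The question word tends to map onto the SELECT list (who, when, what), while the rest of the sentence becomes the WHERE condition.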
PRACTICE SQL: Vocabulary
Syntax: , . ; ( ) " " *
Verbs: SELECT, INSERT, UPDATE, DELETE
Query Parts: AS, FROM, WHERE, HAVING, ORDER BY, GROUP BY
Filters: LIKE, NOT, >, <, =, !=, >=, <=, AND, OR, IN, %
Sort: ASC, DESC
Aggregate Functions: MIN, MAX, SUM, AVG, COUNT
Advanced Functions: INNER JOIN, OUTER JOIN, REGEXP
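Sketch: several of the vocabulary words above in action (LIKE with %, IN, NOT, AND, ORDER BY with ASC and DESC, and the COUNT, MIN, MAX aggregates). Python's built-in `sqlite3` stands in for MySQL; the sample rows are an assumption, and REGEXP is left out because SQLite does not provide it by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, release_date TEXT)")
conn.executemany("INSERT INTO movies VALUES (?, ?)", [
    ("Star Wars", "1977-05-25"),
    ("Star Wars: The Force Awakens", "2015-12-18"),
    ("Rogue One: A Star Wars Story", "2016-12-16"),
    ("Get Out", "2017-02-24"),
])

def titles(query):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(query)]

# LIKE with % wildcards, newest first.
star_wars = titles(
    "SELECT title FROM movies WHERE title LIKE '%Star Wars%' ORDER BY release_date DESC")

# IN matches any value in a list.
picked = titles(
    "SELECT title FROM movies WHERE title IN ('Get Out', 'Star Wars') ORDER BY title ASC")

# NOT and AND combine filters.
recent_other = titles(
    "SELECT title FROM movies "
    "WHERE title NOT LIKE '%Star Wars%' AND release_date >= '2000-01-01'")

# Aggregate functions collapse many rows into one.
summary = conn.execute(
    "SELECT COUNT(title) AS num_of_titles, MIN(release_date), MAX(release_date) FROM movies"
).fetchone()

print(star_wars, picked, recent_other, summary)
```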
PRACTICE SQL: Anatomy of a Query
SELECT [columns] FROM movies WHERE [condition];
Columns:
*
title
title, release_date
COUNT(title) AS num_of_titles
title, MIN(release_date)
Conditions:
title = "Get Out"
title LIKE "%Star Wars%"
release_date > '2000-01-01'
release_date > '2000-01-01' AND title LIKE "%Star Wars%"
Result
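Sketch: filling the two blanks of the query template with different columns and conditions. Python's built-in `sqlite3` stands in for MySQL; the `run` helper and the sample rows are hypothetical, and building SQL by string formatting is fine for a classroom demo but should not be done with untrusted input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, release_date TEXT)")
conn.executemany("INSERT INTO movies VALUES (?, ?)", [
    ("Star Wars", "1977-05-25"),
    ("Star Wars: The Force Awakens", "2015-12-18"),
    ("Rogue One: A Star Wars Story", "2016-12-16"),
    ("Get Out", "2017-02-24"),
])

def run(columns, condition):
    """Fill the blanks of SELECT ... FROM movies WHERE ...; and return all rows."""
    query = f"SELECT {columns} FROM movies WHERE {condition};"
    return conn.execute(query).fetchall()

# Several columns are separated by commas, not AND.
exact = run("title, release_date", "title = 'Get Out'")

# An aggregate in the SELECT list, with a wildcard filter.
star_wars_count = run("COUNT(title) AS num_of_titles", "title LIKE '%Star Wars%'")

# Two conditions joined with AND.
recent_star_wars = run("title", "release_date > '2000-01-01' AND title LIKE '%Star Wars%'")

# With = the % is taken literally, so nothing matches; wildcards need LIKE.
literal_percent = run("title", "title = '%Star Wars%'")

print(exact, star_wars_count, recent_star_wars, literal_percent)
```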
PRACTICE SQL: Anatomy of a Query
SELECT * FROM movies [clause] [column] [sort];
Clause: GROUP BY, ORDER BY
Column: release_date, title
Sort: ASC, DESC
Result
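Sketch: ORDER BY sorts rows, while GROUP BY collapses rows that share a value so an aggregate can summarize each group. Python's built-in `sqlite3` stands in for MySQL, and the three sample rows are an assumption chosen so that two films share a release date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, release_date TEXT)")
conn.executemany("INSERT INTO movies VALUES (?, ?)", [
    ("Oppenheimer", "2023-07-21"),
    ("Get Out", "2017-02-24"),
    ("Barbie", "2023-07-21"),
])

# ORDER BY title ASC: alphabetical order.
by_title = [row[0] for row in conn.execute(
    "SELECT title FROM movies ORDER BY title ASC")]

# ORDER BY release_date DESC: newest first, with title as a tie-breaker.
newest_first = [row[0] for row in conn.execute(
    "SELECT title FROM movies ORDER BY release_date DESC, title ASC")]

# GROUP BY release_date: one row per date, with a count of titles on that date.
per_date = conn.execute(
    "SELECT release_date, COUNT(title) AS num_of_titles "
    "FROM movies GROUP BY release_date ORDER BY release_date ASC").fetchall()

print(by_title, newest_first, per_date)
```

GROUP BY is usually paired with an aggregate in the SELECT list; selecting every column with * alongside GROUP BY is rejected by stricter database settings.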