Leveraging Data: Building a Stable Platform
Ophir Cohen, Data Platform Lead, [email protected]
Amit Fainer, Data QA Lead, [email protected]
May 2013
Connection before content…
Who was the commander of whom in the army?
Who met his wife in India?
Agenda
Connection before content
LivePerson Is…
Data platform requirements
Quality challenges
Architecture
Development and production processes
Case study: LivePerson BI Reports
LivePerson Is…
Mission:
Creating Meaningful Customer Connections
Company
• Cloud-computing, SaaS pioneer since 1998
• IPO April 2000 (Nasdaq: LPSN); debt free
• 700+ employees
• LivePerson offers an extensive and rapidly growing partner network
Customers
• 8,500 customers around the globe have chosen LivePerson to create secure, reliable connections with their customers. LivePerson clients include:
• 8 of the top 10 Fortune 500 companies
• 10 of the top 15 commercial banks (Fortune 500)
• 4 of the top 5 telecommunication companies (Fortune 500)
• 4 of the top 7 of the Forbes Global 2000
• 5 of the top 6 software and services companies (Forbes 2000)
• 8 of the top 10 of Interbrand's Best Global Brands
Service Delivery
• 1.8 billion visitors monitored per month
• 20 million connections per month
• Analyzes over 1.2 million documents and chat transcripts per month
Live Chat and Click-to-Call Vendor 2012
Enterprise Customer Success & Domain Expertise
Finance
High–Tech
Retail
Telecom
Travel
Requirements
Massive data flow (a few TB a day)
Different Data types, Different Producers
Never Lose Data!
Varied latency needs – near real-time through offline
Data is accessible to everyone for Processing, in a standardized,
common paradigm, adopted by all consumers and producers
Quality Challenges
Large volumes of Data – Automate or Die
Bugs yield corrupted Data
Produced data stays Forever
Consumers need a standardized form to assure data integrity
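One way to read the "standardized form" point is as a shared event envelope that every producer fills in and every consumer checks before trusting the data. The sketch below is illustrative only: the field names (`event_type`, `producer`, `timestamp_ms`, `payload`) are hypothetical and are not taken from the LivePerson platform.

```python
# Hypothetical envelope: field names and types are assumptions for illustration.
REQUIRED_FIELDS = {
    "event_type": str,
    "producer": str,
    "timestamp_ms": int,
    "payload": dict,
}


def validate_event(event):
    """Return a list of problems; an empty list means the event conforms."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in event:
            problems.append(f"missing field: {name}")
        elif not isinstance(event[name], expected_type):
            problems.append(
                f"wrong type for {name}: expected {expected_type.__name__}"
            )
    return problems
```

A check like this can run automatically on every produced event, which fits the "Automate or Die" point: corrupted records are flagged at the door instead of being stored forever.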
Architecture
[Diagram components: Kafka, Data Tier, Application Tier, Storm, Hadoop, Pig, Java MR, Hive]
Architecture – Persistency Layer
Kafka (by LinkedIn)
• Queuing mechanism
• Persistency layer
• High availability layer
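The combination of queuing and persistency can be pictured as an append-only log in which each consumer keeps its own read position, so reading a message never removes it. The toy model below is a sketch of that idea in plain Python. It is not the Kafka client API, and the class and method names are hypothetical.

```python
class AppendOnlyLog:
    """Toy in-memory log: messages are kept after being read."""

    def __init__(self):
        self._messages = []
        self._offsets = {}  # consumer name -> next position to read

    def append(self, message):
        """Store a message and return its offset in the log."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, consumer, max_messages=10):
        """Return the next batch for this consumer and advance its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._messages[start:start + max_messages]
        self._offsets[consumer] = start + len(batch)
        return batch
```

Because offsets are tracked per consumer, a streaming reader and a batch reader can each see the full log at their own pace, which is one way to serve both near real-time and offline latency needs from the same data.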
Architecture – Streaming Processing Layer
Storm (by Twitter)
• Stream processing
• Pluggable framework
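Stream processing means handling each event as it arrives instead of waiting for a complete data set. In Storm terms, sources are called spouts and processing steps are called bolts. The sketch below imitates that shape with Python generators. It does not use Storm itself, and the function names and the `type` field are hypothetical.

```python
from collections import Counter


def event_source(events):
    """Plays the role of a spout: emits one event at a time."""
    for event in events:
        yield event


def count_by_type(stream):
    """Plays the role of a bolt: emits a running count per event type."""
    counts = Counter()
    for event in stream:
        counts[event["type"]] += 1
        yield event["type"], counts[event["type"]]
```

Steps are composed by passing one generator into the next, for example `count_by_type(event_source(events))`, so a result is available after every event.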
Architecture – Batch Processing Layer
Hadoop (an Apache Project)
• Reliable, scalable, distributed computing framework
• Rich ecosystem
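Batch jobs on Hadoop commonly follow the MapReduce pattern: map each record to key/value pairs, group the pairs by key, then reduce each group to a result. The sketch below runs the three phases in a single Python process to show the data flow only. It is not Hadoop code, and the `account` field is a hypothetical example.

```python
from collections import defaultdict


def map_phase(records):
    """Emit (key, 1) for every record, keyed by account."""
    for record in records:
        yield record["account"], 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values for each key."""
    return {key: sum(values) for key, values in groups.items()}
```

Chaining the phases as `reduce_phase(shuffle(map_phase(records)))` yields a count of records per account.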
Develop, Test and Deploy at Scale
Automated, continuously integrated, with built-in performance testing
Satisfying Monitoring and Auditing needs of Tiers 1 through 5
Ongoing production tests
Auditing mechanism
Scrum
Isolated production-mirrored environment for Testing
Case Study – LivePerson BI Reports
Source to target
Auditing tool as part of data integrity tests
Load tests in a real-data environment
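A source-to-target audit can be as simple as comparing what the producers report having sent with what arrived in the reporting store. The sketch below assumes both sides can supply per-key record counts. The function name and the shape of the inputs are hypothetical, not a description of the actual LivePerson auditing tool.

```python
def audit_counts(source_counts, target_counts):
    """Return {key: (source, target)} for every key whose counts differ."""
    mismatches = {}
    for key in sorted(set(source_counts) | set(target_counts)):
        source = source_counts.get(key, 0)
        target = target_counts.get(key, 0)
        if source != target:
            mismatches[key] = (source, target)
    return mismatches
```

An empty result means every key reconciles. Any entry points at data that was lost or duplicated somewhere between source and target, which is what the "Never Lose Data!" requirement asks the platform to detect.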