LinkedIn Links Analysts to Collaborate on Analysis · 2016. 9. 13.

#TDPARTNERS16 #datacatalog GEORGIA WORLD CONGRESS CENTER

Rohit Jonnalagadda, Business Operations, LinkedIn
Stephanie McReynolds, VP of Marketing @Alation



Page 2: A little about me & LinkedIn

• Investment banker turned data junkie
• Work as part of a cross-functional team to support our Marketing Solutions (advertising) business
• Expert in manipulating data with SQL but goal is always to deliver insights that actually drive business decisions

Page 3: Supported by a robust environment

• Mix of numerous home-grown, open source, & procured products
• Offline analytics starts with Kafka → Hadoop
• Distributed team of users (“anyone” can learn SQL)

• Primary DW Environment
• Used by hundreds of data analysts across Finance, Operations, Product, HR, etc.
• (Most) ETL in Hadoop

• Built at LinkedIn
• Hundreds of Billions of Messages Routed Daily

• Petabytes of storage across the grid
• Writing 75TB+ Daily
• Spread across 3 DCs
• Thousands of Nodes
• Hive, Presto, Spark

Page 4: The analytics team at LinkedIn

• Different types of “data discovery” happen in different teams of analysts
• All analysts need access to data, but individual workflows can be quite different
• Data catalogs are a central point of reference for all data consumers

1000s of Data Consumers:
• Executive Reporting: 10s
• Business Ops: 100s
• Ad Hoc Analysis: 1000s

Page 5: Linking Analysts to Collaborate

• 3 data industry trends are driving Data Catalogs in the enterprise
• Challenges of linking analysts & data
• How data cataloging helps
• LinkedIn example

Page 6: Trend #1: Data Proliferation

Data-driven organizations demand data proliferation:
• All new products released with new data structures
• A new data set every week
• Deeper and wider data is being produced than ever before
  - Typical weblog has hundreds of attributes/columns

Page 7: “Big” data’s challenge is human

Volume is not our challenge; the speed of analysis is
• Impossible for any one analyst to keep up with the continual stream of new data updates
• Documentation is often light by design
• Rough conclusions are easy, accurate insights are hard

Remember: insights come from analysis, not from keeping up with the data

Page 8: Trend #2: Data Discovery

36% of end-users are now preparing their own data
- Late-binding/discovery-oriented style of analysis wins over predictable/well-structured BI queries

Source: TDWI Best Practices Report, Improving Data Preparation for Business Analytics, Q3 2016

Page 9: What can be cataloged for re-use?

86% of organizations looking for re-use options to make data prep efficient – data catalogs help immensely with re-use & consistency

Source: TDWI Best Practices Report, Improving Data Preparation for Business Analytics, Q3 2016

Page 10: Trend #3: Collaboration

Analysis has become a team sport

“According to data we have collected over the past two decades, the time spent by managers and employees in collaborative activities has ballooned by 50% or more.”

Source: Harvard Business Review, Collaborative Overload, January 2016

Page 11: But Collaborative Overload is a Risk

Data on leaders across 20 organizations show that those regarded by colleagues as the best information sources & most desirable collaborators have the lowest career satisfaction.

Source: Harvard Business Review, Collaborative Overload, January 2016

Page 12: Challenges of the new era of analysis

Data-driven orgs drive Data Proliferation
• A new product, a new dataset
• Every product launches new datasets
• Unboxing process is often one of discovery without documentation

More ad-hoc analysis challenges Human Productivity & System Performance
• Performance: cost of trying a query out for the first time
• Analysts & tools must be productive cross-system

Analysis is now a team sport where Collaborative Efficiency & Overload must be managed
• Effective collaboration requires some organizing structure/documentation
• Best analysts are overloaded/burnt out
• New analysts take 6 months to learn LinkedIn data

Page 13: Scenario: Onboarding new users

Scenario: New employee needs to learn about our vast data footprint
• No single place to learn
• Unlike “source code”, queries are decentralized and live on a mix of desktops and servers
• Difficult to discern “source of truth” when questions can have multiple answers
• Need to come up to speed quickly due to rapid growth and constant product innovation

Page 14: Step 1: Build an Inventory

• What data sources exist?
• What data is available?
• What do the columns mean?
• Where does the data come from (ETL, lineage)?
• What is sensitive/protected?
• What promises do we make to our users about their private data and how we can use it for advertising purposes?
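
One way to picture the inventory questions above is as fields on a catalog record. The sketch below is an illustration only, not how Alation or LinkedIn store this information; the class names, field names, and the sample table are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ColumnEntry:
    name: str
    meaning: str = ""        # "What do the columns mean?"
    sensitive: bool = False  # "What is sensitive/protected?"


@dataclass
class TableEntry:
    source: str              # "What data sources exist?"
    name: str                # "What data is available?"
    columns: List[ColumnEntry] = field(default_factory=list)
    upstream: List[str] = field(default_factory=list)  # lineage: tables this one is built from
    usage_policy: str = ""   # promises made to users about how the data may be used

    def undocumented_columns(self) -> List[str]:
        """Columns that still have no description in the inventory."""
        return [c.name for c in self.columns if not c.meaning.strip()]

    def sensitive_columns(self) -> List[str]:
        """Columns flagged as sensitive/protected."""
        return [c.name for c in self.columns if c.sensitive]


# Hypothetical example entry; every name here is invented for illustration.
clicks = TableEntry(
    source="hadoop",
    name="ad_clicks",
    columns=[
        ColumnEntry("member_id", "Member who clicked the ad", sensitive=True),
        ColumnEntry("campaign_id", "Campaign the ad belongs to"),
        ColumnEntry("flag_x"),  # no description yet
    ],
    upstream=["raw_click_events"],
    usage_policy="Aggregate reporting only",
)

print(clicks.undocumented_columns())
print(clicks.sensitive_columns())
```

Listing undocumented and sensitive columns this way gives a simple checklist of which inventory questions remain unanswered for a table.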

Page 15: Step 2: Enrich the Catalog

An inventory without a sense of usage is not very informative; need to know:
• Who used it?
• How was it used?
• Why was that data helpful?

Samples of common queries:
• What is the growth rate in Country X?
• How is our sales pipeline tracking for the quarter?
• What customers are at risk for churning?
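
As a rough sketch of what "enriching" could mean in practice, usage facts (who, how often, for what) can be derived from a log of past queries. The log format, the analyst names, and the table names below are assumptions made for illustration; the slides do not describe the actual mechanism.

```python
from collections import Counter, defaultdict

# Hypothetical query log: (analyst, table, purpose) triples.
QUERY_LOG = [
    ("ana", "sales_pipeline", "quarterly pipeline tracking"),
    ("ben", "sales_pipeline", "quarterly pipeline tracking"),
    ("ana", "member_growth", "growth rate by country"),
    ("ana", "sales_pipeline", "churn risk"),
]


def summarize_usage(log):
    """Return, per table, who used it, how often, and for what purposes."""
    summary = defaultdict(
        lambda: {"users": set(), "queries": 0, "purposes": Counter()}
    )
    for analyst, table, purpose in log:
        entry = summary[table]
        entry["users"].add(analyst)      # who used it
        entry["queries"] += 1            # how much it was used
        entry["purposes"][purpose] += 1  # why the data was helpful
    return dict(summary)


usage = summarize_usage(QUERY_LOG)

# Rank tables by query count as a simple popularity signal.
ranked = sorted(usage, key=lambda t: usage[t]["queries"], reverse=True)
print(ranked)
```

A new analyst could then start from the most-queried tables and their most common purposes instead of from the full inventory.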

Page 16: Step 3: Support Human Adoption

• Training
• Support
• Adapting

Page 17: Value of Alation for LinkedIn Analysts

Productivity:
• Collaboration: Teams around the world can quickly share insights with one another

ROI:
• Teams are spending more time disseminating knowledge and less time writing queries. This shortens product release cycles, drives faster deal closings, and increases overall productivity.

Benefits:
• Onboarding has been greatly simplified as Alation has generated an organic repository of up-to-date knowledge

Page 18: Alation Delivers a GPS for Analysts

Data Catalog links data and analysts together for collaboration
• Automates the inventory
• Maintains a rich catalog based on actual analyst behaviors
• Reinforces best practices
  - SmartSuggest recommendations
  - Behavioral interventions for governance
  - Monitors wide & deep usage

Page 19: Table Explorer

Page 20: Popularity Indicators

Page 21: Data Profiling

Page 22: Lineage

Page 23: Articles to Collaborate on Definitions

Page 24: Data Catalogs address complexity

A platform for efficient & effective human collaboration
• Proactive recommendations
• Inline documentation
• Details to navigate Data proliferation
  - Table Explorer
  - Data Profiling
  - Interactive Query Editor
  - Lineage
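
To make the "Data Profiling" item above concrete, here is a minimal sketch of column profiling in general: counting rows, nulls, and distinct values and recording the value range. It is a generic illustration under assumed sample data, not a description of Alation's profiling feature.

```python
def profile_column(values):
    """Basic profile of one column: row count, nulls, distinct values, min and max."""
    present = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


def profile_table(rows):
    """Profile every column of a table given as a list of dicts."""
    columns = sorted({key for row in rows for key in row})
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Hypothetical sample rows, invented for illustration.
rows = [
    {"country": "US", "signups": 120},
    {"country": "IN", "signups": 95},
    {"country": "US", "signups": None},
]

profile = profile_table(rows)
print(profile["signups"])
```

Even a summary this small tells an analyst whether a column is populated and what range of values to expect before writing a first query against it.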

Page 25: Find out more about Data Catalogs

alation.com/resources
• TDWI Best Practices Report, Improving Data Preparation for Business Analytics, Q3 2016

Alation Booth #729

Page 26: Thank You

Questions/Comments
Email: [email protected] & [email protected]

Join Us At
Alation Booth #729

Follow Us
Twitter: @slangenfeld @alation

Rate This Session #739 with the PARTNERS Mobile App

Remember To Share Your Virtual Passes