welcome to today's web seminar · chief analytics officer, aspirent professor, georgia...
TRANSCRIPT
SPONSORED BY HOSTED BY
Welcome to Today's Web Seminar
Lenny Liebmann
Contributing Editor SourceMedia
Lenny Liebmann has spent his career studying how enterprises in key verticals such as healthcare and insurance leverage information technology in the context of their real-world business objectives and constraints. His 40-plus years of engagement with front-line practitioners make him a sought-after speaker and moderator for industry conferences and online events. He has also been widely published as a journalist, analyst, and featured columnist. Lenny wrote his first line of code in 1973, graduated from Yale in 1979, and began his career at AT&T Bell Laboratories.
Beverly Wright, PhD
Chief Analytics Officer, Aspirent
Professor, Georgia Institute of Technology
Dr. Wright is a leading analytics practitioner and sought-after speaker at industry events. In her current position, Dr. Wright leads the Analytics and Insights practice for Aspirent, a management consulting firm for Fortune 100 clients. She is also a professor at Georgia Institute of Technology’s Scheller College of Business, where she serves on the Board of Directors for the university’s Business Analytics Center. Dr. Wright previously held analytics, marketing, and decision-making leadership positions at Cox Communications, SunTrust Bank, Southern Company, and AGL Resources. She has also taught at Georgia State University, East Carolina University, Clayton State University, and Berry College. Dr. Wright earned her PhD in Marketing Science, MS in Analytical Methods, and BS in Decision Sciences from Georgia State University. She is a Certified Analytics Professional (CAP) and received Professional Research Certification from the Marketing Research Association.
Peggy Tsai
Vice President – Analytics & Data
Morgan Stanley Wealth Management
Peggy Tsai is Vice President in the Analytics & Data department at Morgan Stanley. She is responsible for managing the adoption of Data Governance standards and processes in Wealth Management. She works closely with IT, Business Stewards and Enterprise to identify and resolve data quality issues that impact the business. Peggy has over 15 years of experience in data management, stewardship and governance. Prior to joining Morgan Stanley, Peggy was Data Innovation Lead at AIG, where she worked in the enterprise data management practice to support Anti-Money Laundering, Solvency II and GDPR in Latin America and Europe. In addition, she led a winning Innovation Boot Camp team with a proof of concept on Natural Language Processing and Machine Learning aimed at improving the underwriting process. Peggy also worked at S&P Global, where she held various positions in the enterprise data group and in technology to drive the value of data between the business and IT. Peggy has a Master's in Information Systems from New York University and a Bachelor of Arts in Economics from Cornell University. She is a member of the Data Governance Professional Organization. In her spare time, she hosts externships for Cornell freshmen to increase awareness of data and analytics among undergraduates.
• Machine Learning
• Data Discovery
• Process
  • Going to the source
  • Making data usable
  • Developing solid insights
• Challenges
what is machine learning?
An application of artificial intelligence that automates the process of analyzing data, allowing computers to detect patterns, build and refine models, and make decisions, without being explicitly programmed by humans to do so. The “learning” term comes into play through iterative model refinement and optimization, converging on higher accuracy.
• Applications • Methods • Supervised vs Unsupervised • Prediction
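The "iterative model refinement" in the definition above can be illustrated with a small supervised-learning sketch. This is not from the presentation: the data points, the learning rate, and the helper names `fit_line` and `mse` are all assumptions chosen for illustration. It fits a straight line by gradient descent and records the error after each pass, so the convergence is visible.

```python
# Illustrative only: fit y ≈ w * x + b by gradient descent on made-up data.

def mse(points, w, b):
    """Mean squared error of the line w * x + b over the labeled points."""
    return sum((w * x + b - y) ** 2 for x, y in points) / len(points)


def fit_line(points, learning_rate=0.05, epochs=1000):
    """Refine w and b step by step; return them with the error history."""
    w, b = 0.0, 0.0
    n = len(points)
    history = []
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
        # Move both parameters a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        history.append(mse(points, w, b))
    return w, b, history


# Hypothetical labeled examples that follow y = 2x + 1 exactly.
points = [(1, 3), (2, 5), (3, 7), (4, 9)]
w, b, history = fit_line(points)
print(f"w={w:.3f}, b={b:.3f}, first error={history[0]:.3f}, last error={history[-1]:.6f}")
```

Because each example carries a known answer (`y`), this is the supervised case; an unsupervised method would look for structure in the `x` values alone.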
what is data discovery?
The process by which business users make sense of vast amounts of information contained in data warehouses, spreadsheets, flat files, or a mix of them all. Data discovery combines database architect (or other data “owner”) knowledge with analysis software and business intuition to turn raw data into a powerful tool for driving business decisions.
• Data Selection • Data Cleaning • Data Transformation • Data Mining
The Process: from data to insights
Even with the advancements made in data science and machine learning in recent years, businesses still face lofty hurdles upfront: to analyze the data, you have to first find the data.
Source of data
• Data in its most raw form, often in disparate sources and in different forms
Making it usable
• Interpretation, cleaning, outlier handling, forming metrics
Finding insights
• Data is ready – now make sense of it
Challenges
the Source | who, what and where is the data?
Nature of the data
• Transactions
• Customer information
• Business information
• Product information
• Sales pursuits
• External data appends
Data Capture Methods
• POS systems
• Surveys (digital or paper)
• Online vs in-person
• Booking systems
• Old-fashioned data entry
• Purchased
Storage Locations
• Flat files
• Spreadsheets
• Relational tables
• Full Enterprise data warehouse
• Unstructured vs structured
• Audio (e.g. customer call recordings)
Source: Location
Raw data may come from disparate sources, owned by different data stewards, and require tact and skill to access
Source: Types
Data is often inconsistent – even the same variable or metric may be measured differently from one source to the next
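As a minimal sketch of what reconciling such measurements might look like, the example below converts one variable recorded in different units to a common unit. The source names, the `weight` field, and the `TO_KG` table are hypothetical and not taken from the presentation.

```python
# Hypothetical conversion factors to a common unit (kilograms).
TO_KG = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}


def to_kilograms(value, unit):
    """Convert a weight to kilograms; fail loudly on units we do not know."""
    key = unit.strip().lower()
    if key not in TO_KG:
        raise ValueError(f"Unknown unit: {unit!r}")
    return value * TO_KG[key]


# Made-up records: the same variable captured three different ways.
records = [
    {"source": "warehouse_a", "weight": 2.5, "unit": "kg"},
    {"source": "warehouse_b", "weight": 2500, "unit": "g"},
    {"source": "warehouse_c", "weight": 5.0, "unit": "LB"},
]

normalized = [
    {"source": r["source"], "weight_kg": round(to_kilograms(r["weight"], r["unit"]), 3)}
    for r in records
]
for row in normalized:
    print(row)
```

Raising an error on an unrecognized unit, rather than passing the value through, is a deliberate choice here: a silent pass-through would reintroduce the inconsistency the step is meant to remove.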
making it usable | the data cleaning process
Understanding the data
• Metadata: data about data
• Data dictionary – definitions, data type, and possible values
• Join keys
• What does a missing value mean
• Level of granularity
• What is a “unit”
Data validation
• Missing value imputation
• Record validation
• Verification of uniqueness
• Joining/merging
• Deduplication
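Several of the validation steps listed above can be sketched in a few lines. The rows, the `customer_id` key, and the choice of mean imputation are assumptions made for illustration; the right fill strategy depends on what a missing value actually means in the data at hand.

```python
from statistics import mean

# Made-up rows: one duplicated customer and one missing age.
rows = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": None},
    {"customer_id": 2, "age": None},  # duplicate of the previous record
    {"customer_id": 3, "age": 50},
]


def is_unique(rows, key):
    """Verification of uniqueness: does every row have a distinct key value?"""
    values = [row[key] for row in rows]
    return len(values) == len(set(values))


def deduplicate(rows, key):
    """Keep the first row seen for each key value."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def impute_mean(rows, field):
    """Fill missing values with the mean of the known ones (one option of many)."""
    known = [row[field] for row in rows if row[field] is not None]
    fill = mean(known)
    return [{**row, field: fill if row[field] is None else row[field]} for row in rows]


clean = impute_mean(deduplicate(rows, "customer_id"), "age")
print(is_unique(rows, "customer_id"), is_unique(clean, "customer_id"))
print(clean)
```

Deduplicating before imputing matters in this sketch: otherwise a duplicated record would be counted twice when the fill value is computed.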
Cleansing and prep
• Outlier handling
• Standardization
• Incorrect data/invalid characters or out-of-range values
• Dummy coding
• Transformation
• Variable reduction
Usage form: Cleaning
Data is typically dirty – and even ‘perfect’ data is not usually collected and stored for analytics purposes
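Three of the cleansing steps named on this slide (outlier handling, standardization, dummy coding) are sketched below. The specific techniques are assumptions, since the presentation does not say which to use: outliers are clipped to 1.5 × IQR fences, standardization uses z-scores, and dummy coding drops the first level as the reference. The `spend` and `channel` values are made up.

```python
from statistics import mean, pstdev, quantiles


def clip_outliers(values):
    """Clip values to the 1.5 * IQR fences (one common rule of thumb)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [min(max(v, low), high) for v in values]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("Cannot standardize a constant column")
    return [(v - mu) / sigma for v in values]


def dummy_code(values):
    """One 0/1 indicator per level, dropping the first level as the reference."""
    levels = sorted(set(values))
    return [{f"is_{lvl}": int(v == lvl) for lvl in levels[1:]} for v in values]


# Made-up data: one extreme spend value and a three-level category.
spend = [10, 12, 11, 13, 12, 11, 10, 13, 12, 500]
channel = ["online", "store", "online", "phone"]

clipped = clip_outliers(spend)
z_scores = standardize(clipped)
dummies = dummy_code(channel)
print(max(spend), "->", max(clipped))
print(dummies)
```

Clipping is only one way to handle an outlier; whether to clip, drop, or keep the value depends on whether it is an error or a real, if unusual, observation.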
finding insights | learning from the data
Business Questions
• Understanding the reasons behind the analysis
• How the information will get used
• Not just the “why” but the “why-why” behind the reasons
Usability
• No real value if results are not consumable by the target audience
Relevance
• Managing expectations
• Results that are important to the business
• Must hold value to the questions asked
Finding insights: Uncovering the truths
Analytics professionals are truth seekers: extracting meaning from data is the fun part…
summary
• In today’s world, business decisions have to be data-inspired to stay competitive
• Those decisions are only as good as the data used to make them
• “Garbage in, garbage out”
• Inefficient data pipelines negatively impact IT (more processing, storage, and manpower are required) and the business (slower decision making, sharing of incorrect information, costly mistakes)
• Being data-savvy can translate to business benefits
• Revenues
• Reduced operating costs
• Market advantages
• Consumer loyalty
The Need for Data Governance in Analytics
Peggy Tsai Vice President, Morgan Stanley Wealth Management
February 27, 2018
Data Governance Lifecycle
• Scope • Define • Measure • Risk Impact • Monitor
• Data Stewards • Glossary • Data Quality Rules • Risk Assessment • Dashboard
Evolution of Data Governance
The biggest driver for data governance is no longer regulatory compliance
• BCBS 239 • CCAR • GDPR • MiFID • Solvency II
• Critical Data Elements • Governance Councils • Data Quality Monitoring • Data Sourcing/Lineage
Analytics is Driving the Data Evolution
Data Steward for Accounts
• Define Business Terms for Customer Name, Address, Email
• Define Data Quality Rules and Thresholds
• Provide requirements for business usage
• Define business logic or rules
Data Scientist
• Create models that predict customer behavior/patterns
• Find, join and create large sets of uncurated data
• Answer questions that quickly drive Sales or Marketing campaigns, for example
Case Study: FreshDirect
• Online grocer competes on analytics
• Predicts consumers’ shopping patterns to develop custom solutions
No organization can succeed in analytics without a strong foundation in data governance
Three major disciplines for Data Governance
What is different with Analytics Governance?
Machine Learning for Data Discovery
• A framework for the data is still required for structured and unstructured data
• Data Quality controls still need to exist to ensure completeness and accuracy of the data
• People still need to ensure the curation of the data is accurate in order to create the training set
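A data quality control of the kind described above could be sketched as a set of rules, each with a measure and a threshold. Everything here is hypothetical: the `email` field, the threshold values, and the use of a simple "contains @" check as a stand-in for accuracy. Real rules and thresholds would come from the data stewards.

```python
def completeness(rows, field):
    """Share of rows where the field is present and non-empty."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) not in (None, ""))
    return filled / len(rows)


def validity(rows, field, is_valid):
    """Share of non-empty values that pass a caller-supplied check."""
    values = [row.get(field) for row in rows if row.get(field) not in (None, "")]
    if not values:
        return 0.0
    return sum(1 for v in values if is_valid(v)) / len(values)


# Hypothetical rules: (name, measure, minimum acceptable score).
RULES = [
    ("email completeness", lambda rows: completeness(rows, "email"), 0.90),
    ("email validity", lambda rows: validity(rows, "email", lambda v: "@" in v), 0.95),
]


def run_rules(rows, rules):
    """Score every rule and flag whether it meets its threshold."""
    report = []
    for name, measure, threshold in rules:
        score = measure(rows)
        report.append({"rule": name, "score": score, "passed": score >= threshold})
    return report


# Made-up customer records with one empty and one malformed email.
customers = [
    {"name": "A", "email": "a@example.com"},
    {"name": "B", "email": ""},
    {"name": "C", "email": "c.example.com"},
    {"name": "D", "email": "d@example.com"},
]

report = run_rules(customers, RULES)
for line in report:
    print(line)
```

A report in this shape could feed a monitoring dashboard, and the same checks could gate a curated dataset before it is used as a training set.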
Key Differentiators in Analytical Governance
Self-Service
Curated Data
Visualization Usability
Crowd-sourcing
Speed
Analytics Data Governance
Drive value from data quickly and reliably
Thank You!
Any Questions?