TRANSCRIPT
Can Privacy & Data Sharing
Coexist?
A conversation with Scott Albin, Head of Ecosystem Services, Data Republic
Moderated by Mike Meriton
Co-Founder & COO, EDM Council
• Joined EDM Council full-time 2015 to lead Industry Engagement
• EDM Council Co-Founder & First Chairman (2005-2007) –
Finance Board Chair (2007-2015)
• Former CEO GoldenSource (2002-2014) – an original IBM Global
MDM Company
• Former President of CheckFree CFACS (Compliance &
Reconcilement Solutions)
• Former Executive for D&B Software and Oracle
• FinTech Innovation Lab – Executive Mentor (2011 – Present)
Hosted by Scott Albin, Head of Ecosystem Services
• Focused on growing the Data Republic ecosystem of banks,
telcos, airlines, retailers, and others contributing to the Data
Economy.
• Former Southeast Asia Data & Analytics Consulting Leader for
PwC Singapore, advising on data strategy, use of advanced
analytics, AI and data sharing and commercialization.
• Has worked across financial services (retail banking, wealth and
insurance), public sector, travel & transportation, retail, energy
and healthcare.
• He has lived and worked across a range of markets
including the US, Australia and Southeast Asia
Can Privacy & Data Sharing Coexist?
1. Foundations of the emerging data economy
2. The perceived data privacy vs innovation tradeoff
3. Real-world data innovation and collaboration case studies
4. Data management considerations for external collaboration
5. Scaling projects with secure technology from Data Republic
Get scanning! There are QR codes on slides throughout the presentation.
Download a slide: Scan the QR code using any smartphone camera app to download the slide or access references.
1 Foundations of the emerging data economy
The value of data is climbing exponentially, and quickly
● $65 million: A 10% increase in data accessibility can mean a $65M increase for Fortune 1000 companies. (Forrester)
● $274.3 billion: In 2022, data initiatives are predicted to generate $274.3 billion. (Forrester)
● But up to 73% of data never gets used or analyzed. (Forrester)
Data liquidity can turn data into a valuable resource
[Figure: maturity matrix. Axes: Privacy Law Foundation (Immature to Mature) and Data Portability / Utility (Immature to Mature)]
Competing agendas for strengthened privacy and open data regulation are being implemented around the world
Strengthened Privacy
● GDPR/PDPA/CCPA
● Data Sovereignty
● ACCC Unbundled Consent
Open Data
● EU Open Data Strategy
● Open Banking
● Data Portability
● Consumer Data Right
● $2.9 trillion: AI Augmentation will create $2.9 trillion of business value in 2021. (Gartner)
● 30%: AI investments are set to boost revenue by over 30% over the next four years. (Accenture)
One example of data utility: the application of technology and algorithms, which are getting more sophisticated at an incredible pace
AI and algorithms need data like a rocket needs fuel
Data liquidity + AI and ML = New business value
In order to innovate and create value, you need to connect your data to the outside world + keep it safe
Data
Skillsets
Algorithms
3rd parties
Data Products
2 The perceived data privacy vs. innovation tradeoff
The myth
Privacy vs. Innovation
Security, Identity, Trust, Governance, Consent, Transparency
Data sharing, Collaboration, Open innovation, Liquidity, Value, Social good
The reality today is very different
Privacy and Innovation
Security, Identity, Trust, Governance, Consent, Transparency
Data sharing, Collaboration, Open innovation, Liquidity, Value, Social good
There are five key problems to solve that bridge the ‘tech gap’ and enable scalable data collaboration.
Key technology requirements to enable enterprise organisations to balance privacy and innovation
Time to value
● Standardised governance workflows
● Common legal framework
Privacy and consent
● No PII
● Decentralised matching
● Models for Consent management
Access vs governance
● Configurable access controls
● Simplified project licensing
● Audit trails
Secure analytics
● Quarantined analytics workspaces
● Agnostic tooling
● Output governance
Scalability
● Secure data source integration
● Flexible workspace deployment
● Decentralised PII protection
3 Real-world data innovation and collaboration case studies
The data innovation and collaboration maturity curve
Innovation at scale
Internal optimization
Data collaboration
Data commercialization
Data ecosystem
Case study: Rapid evaluation of AI and ML solutions
Innovation at Scale
Who: Global Health Insurance Company
Goal:
● Collaborate and innovate
● Quickly engage and evaluate new capability providers, those with potentially valuable AI or ML solutions
● Host internal data challenges or datathons including their internal data science team
Solutions:
● Improved care delivery
● Operational cost reduction to mitigating disease recurrence
● Developing targeted treatment plans for at-risk populations
Impact:
● $5M+ in recognizable operational return from a single partner project by deployment of a Sandbox-built and maintained AI model
● Crowdsourced innovation spurs cognitive diversity and returns equivalent productivity of 1 FTE in as little as 35 days
● 48 hour turnaround from receipt of legal documents to data access in a Workspace
● 25+ organizations evaluated within 9 months
Case study: Safely match customers across datasets for new customer acquisition
Data Collaboration
Who: Financial Institution and Loyalty Program
Goal:
● New customer acquisition.
● Marketing initiatives to introduce Loyalty customers to the financial services offering.
Solution:
● De-identified, matched customer records analysed
● Boolean logic for suppression list of existing customers
● ML techniques applied to identify characteristics of a high value customer
● Marketing campaigns targeting new customers launched
Impact:
● Rapid analysis of a matched dataset that previously could not be combined
● Application of ML techniques surfaced opportunity to target customers better
● 1,700 new customers acquired in the first 2 weeks of launch of campaign
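The suppression step named in this case study can be illustrated with a minimal Python sketch. It assumes both parties already hold the same de-identified tokens for matched customers. The token values, the `build_campaign_audience` helper and the pre-computed `high_value` set (a stand-in for the ML step) are hypothetical and are not taken from the case study or from the Senate platform.

```python
def build_campaign_audience(loyalty_tokens, bank_tokens, high_value_tokens):
    """Return (audience, suppression_list) using set-based Boolean logic.

    audience = loyalty AND high_value AND NOT existing bank customer
    """
    loyalty = set(loyalty_tokens)
    # Suppression list: loyalty members who are already bank customers.
    suppression = loyalty & set(bank_tokens)
    # Target only the remaining members flagged as high value.
    audience = (loyalty - suppression) & set(high_value_tokens)
    return sorted(audience), sorted(suppression)


# Hypothetical de-identified tokens; no personal information is involved.
loyalty = ["t01", "t02", "t03", "t04"]
bank = ["t02", "t09"]
high_value = ["t02", "t03", "t04"]

audience, suppressed = build_campaign_audience(loyalty, bank, high_value)
print(audience)    # ['t03', 't04']
print(suppressed)  # ['t02']
```

Because the logic runs on tokens only, the campaign list can be built inside a governed workspace without either party seeing the other's customer identities.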
4 Data management considerations for collaboration
Data Republic has identified seven types of controls for governance of data collaboration projects
Legal, People, Use, Security, Data, Output, Audit
Additional information on Data Republic Seven Controls Framework available on request
Data with PII requires additional considerations and measures
Basic approaches to privacy preservation:
Separate PII and non-PII
Tokenize attribute data (“pseudonymisation”)
Transformation (e.g. aggregation, differential privacy)
Control (e.g. access, disclosure)
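The first two approaches on this slide, separating PII from non-PII and tokenizing the attribute data, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Data Republic's method: the field names in `PII_FIELDS`, the in-memory `token_vault` dictionary and the `split_record` helper are all hypothetical.

```python
import secrets

# Assumption: which columns count as PII is decided by the data owner.
PII_FIELDS = {"name", "email", "phone"}


def split_record(record, token_vault):
    """Split one record into a PII part and a tokenised attribute part.

    token_vault maps a normalised email to a random token and stays
    inside the data owner's environment.
    """
    pii = {k: v for k, v in record.items() if k in PII_FIELDS}
    attributes = {k: v for k, v in record.items() if k not in PII_FIELDS}
    key = record["email"].strip().lower()
    # Reuse the token for a known customer, otherwise mint a random one.
    token = token_vault.setdefault(key, secrets.token_hex(16))
    return {"token": token, **pii}, {"token": token, **attributes}


vault = {}
record = {"name": "A. Customer", "email": "a@example.com",
          "phone": "000", "segment": "gold", "spend_12m": 1200}
pii_part, attribute_part = split_record(record, vault)
print(sorted(attribute_part))  # ['segment', 'spend_12m', 'token']
```

Only the attribute part would be shared for analysis. The token is random, so it reveals nothing about the customer without access to the vault.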
Consent to use customer data has nuanced considerations
Consent considerations for customer data
Data use scenario | Category description | General consent required
Aggregate / non-matched | Receiving or providing data to a third party at an aggregate level (anonymised, non-personal, non-identifiable, no “pink ferrari” possibility) | Not required (but obtaining ‘Use’ consent is best practice)
Matched-in | Receiving data at a personal level against an identifiable individual | Required to obtain ‘Collection’ consent
Matched-out | Providing data to a third party at a personal level against an identifiable individual | Required to obtain ‘Disclosure’ consent
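The three scenarios in this table could be encoded as a simple lookup that a project intake workflow checks before data moves. The sketch below is only an illustration of the table: the scenario keys and the `consent_needed` helper are hypothetical, and real rules depend on jurisdiction and the exemptions noted on the slide.

```python
# Assumption: scenario keys are illustrative labels for the three table rows.
CONSENT_RULES = {
    # scenario: (consent required?, consent type named on the slide)
    "aggregate_non_matched": (False, "Use"),   # 'Use' consent is best practice
    "matched_in": (True, "Collection"),
    "matched_out": (True, "Disclosure"),
}


def consent_needed(scenario):
    """Return the consent position for a data use scenario from the table."""
    if scenario not in CONSENT_RULES:
        raise ValueError(f"Unknown data use scenario: {scenario!r}")
    required, consent_type = CONSENT_RULES[scenario]
    return {"required": required, "consent_type": consent_type}


print(consent_needed("matched_out"))
# {'required': True, 'consent_type': 'Disclosure'}
```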
Other considerations...
Regulatory Jurisdiction Anonymisation / De-identified Exemptions & the grey areas
Lessons learned
1 Start with a plan and clear use cases
2 Keep it simple (to start)
3 Importance of internal and external readiness
4 Lead, don’t follow
5 Scaling projects with secure technology from Data Republic
The safe, secure and scalable way to innovate and collaborate with data
Senate Platform: The trusted platform for governing inter-organisational data collaboration and innovation
Senate Matching: Privacy-preserving, patented matching technology that does not let Personal Information (PI) leave your secure systems
Data Republic Ecosystem: Join, create or expand an ecosystem of trusted data partners, consumers and providers.
Scan the QR code to watch a demo of the Senate platform
Governed projects are performed in two ways on the Senate platform
Innovation Sandbox
Secure sandbox for leaders to accelerate innovation agendas, without compromising data security or privacy.
Data Collaboration Suite
Confidently launch secure projects with data from multiple parties, with governance and security all under control.
● Create new data insights with your partners by matching customers and combining insights to tailor offers and personalise experience
● Develop new revenue streams by enabling ‘data apps’ to be created with your data and partners’ data
● Conduct M&A due diligence
● Host hackathons using your sensitive data without the risk of data exposure or PI leakage
● Invite startups, academics or high tech companies (AI/ML) to demonstrate their algorithms, IP or skillsets on your data
● Securely expose sensitive data to partners / suppliers to improve efficiencies
Data Republic
Trusted by 150+ organisations across Singapore, Australia and the USA
Top 5 US Insurer
Top 5 SG Insurer
Questions?
FOR MORE INFORMATION:
Scott Albin
Head of Ecosystem Services
Data Republic
Full presentation with appendix of additional information will be sent via email.
Appendix
1 Step-by-step guide for data innovation and collaboration with Data Republic
Innovation at Scale
Workflows to Access Data
Legals, Data Prep, Secure Workspace, Analytics & AI, Deploying ‘Into the wild’, Monitor, Iterate & Improve
Data Preparation: The right data, to the right people; at the right time.
Controlled Access: BYO tools and flexible infrastructure.
Monitoring and Optimising: Improving solutions over time.
Decision framework: The who, what, when, where, why, how.
[Diagram: One to Many. Enterprise Organisation A; AI and ML partners]
Data Collaboration
Workflows to Access Data
Legals, De-identify, Matching, Re-ID Risk, Solutions, Secure Workspace, Approved Output
Data Preparation: The right data, to the right people; at the right time.
Decision framework: The who, what, when, where, why, how.
Controlled Access: BYO tools and flexible infrastructure.
Insight Extracted: Approved under data license terms.
[Diagram: One to One. Enterprise Organisation A; Enterprise Organisation B]
2 Data innovation deep dive
The data innovation imperative
Assessing external technologies and tools is essential for innovation and transforming companies’ digital capabilities.
Many enterprises do not have the data expertise internally, and need to assess external capabilities and tools to drive their business forward.
84% of executives say that innovation is important to their growth strategy (McKinsey)
Yet for business owners or innovation managers, today’s solutions for testing data tools are costly, time-intensive and risky.
Your company data has an abundance of potential value
Realized value
Performance gap
Probable value
Potential value
Vision gap
Gartner
Senate Innovation Sandbox
Enable innovation analysis by deploying:
Partner, alternative or 3rd party data
AI/ML/deep learning solution companies
Bespoke tool sets & programs
External data talent
[Diagram labels]
Your org’s environment: Database; Contributor node; Customer data is de-identified using Senate Matching
Senate project setup: Data license approved; Workspace launched; Data uploaded as a package
Senate workspace: Approved data packages uploaded to Workspace; Data scientists analyze data in quarantined workspace
Approved output
Case study: Innovation
Who: Global Health Insurance Company
Goal:
• Collaborate with new partners and drive innovation.
• Quickly engage and evaluate new capability providers, those with potentially valuable AI or ML solutions (e.g. train a machine learning algorithm to determine which patients in a dataset are most likely to return to a hospital within 30 days of discharge).
• Host internal data challenges or datathons including their internal data science team.
Problem: Data privacy and security concerns.
Solution: Data Republic for neutral infrastructure and legal framework to facilitate a sandboxing solution, where the company’s de-identified raw data could be made available for secure discovery by external organizations.
How it worked:
• 10-20 TB of de-identified health data under strict infosec and privacy rules onboarded to Senate.
• Fast deployment of restricted views of this data within Data Republic’s secure workspace.
• Provision of Partner algorithm to the data under a permitted use license structure.
• Risk team access to Partner evaluation with minimal training and effort to test and review output.
Impact:
Successful ROI from new risk models.
Global engagement model.
Provided platform for expansion into next generation use cases.
Case study: Innovation
Who: Global Health Insurance Company
Goal: A health insurer with a global footprint leverages Data Republic’s Senate technology to innovate on 10+ years of certified, de-identified patient and clinical research data to solve real-world problems in healthcare.
Solutions:
● Improved care delivery
● Operational cost reduction to mitigating disease recurrence
● Developing targeted treatment plans for at-risk populations
Impact:
● $5M+ in recognizable operational return from a single partner project by deployment of a Sandbox-built and maintained AI model
● Crowdsourced innovation spurs cognitive diversity and returns equivalent productivity of 1 FTE in as little as 35 days
● 48 hour turnaround from receipt of legal documents to data access in a Workspace
● 25+ organizations evaluated within 9 months
Case study: Innovation
Who: Bank and AI Platform DataRobot
Goal: Evaluate capabilities of an AI platform. New technologies and techniques to transform the Bank’s model for evaluating consumer retail risk.
Problem: To bring the DataRobot software directly into the Bank would be a time-intensive and expensive exercise.
Solution: Data Republic for secure, neutral infrastructure and legal framework to facilitate a sandboxing solution.
How it worked: The DataRobot software was installed in Data Republic’s secure workspace, to be accessed and trialled by the Bank’s analysts.
The Bank used their consumer credit application data as input into the DataRobot tool to consider more sophisticated techniques to evaluate retail risk:
• Vast range of linear models.
• Hyperparameter tuning options.
DataRobot provided comprehensive model explanations, saving time for the Bank’s analysts.
Impact:
AI methods confidently and easily applied.
Insights developed in hours rather than months.
Case study: Innovation
Who: Bank
Goal: To use research data from Roy Morgan as sample-based validation of their value models.
• Secure, de-identified analysis of a matched dataset covering research data and bank data.
• Restriction of the data license to exclude any row level extracts.
• Conduct analytics to compare the value segments used by the bank for a sample of matched records, enriched by Roy Morgan research.
Solution: Secure analytics Workspace provided with attribute data from Roy Morgan and the bank, and matched data tables enabled for analysis through Senate Matching.
• Matched data tables de-identified and provided the key to delivering a multi-company view of each customer that exists within both datasets.
• Customer value model insights delivered to adhere to the data license agreed between two parties.
Impact:
Validation of the accuracy of the bank’s value segmentation.
New insights into how to make the bank approach more encompassing in the future, looking at fresh indicator variables.
Deepened collaboration framework between bank and research company.
Case study: Innovation
Who: Temasek and StartupX
Goal:
• To ideate and build solutions for a better and more sustainable future through the world’s first global sustainability hackcelerator.
• A Datathon segment allowed participants to access synthetic data and analytics toolkits to build innovative data-driven solutions focused on bettering health outcomes and financial well-being.
Solution: Data Republic provided strategic advisory services on Datathon event preparation, governance of Datathon licensing, dataset de-identification and loading, as well as participant preparation.
• A Datathon Project capturing all licensing terms for combined datasets and participants facilitated via the Senate Platform.
• Teams securely accessed and analysed synthetic datasets and designed innovative data-driven solutions through Senate secure workspaces.
• A dedicated Data Republic support team were on the ground at the event to give expert advice and guidance.
How it worked:
• 3 day event
• 3 months in the making
• 52 hours of hacking
• 60 Datathon participants
• 100+ Hackathon participants
Impact:
The winning team delivered a data-driven solution that predicts and identifies populations with ‘at risk’ health, based on their supermarket and banking transactions.
Case study: Innovation
Who: Bank
Partners: Artificial Intelligence, Machine Learning, Data Processing and Data Visualization firms
Goal: To provide granular data to external capability partners in a governed, safe and regulatory compliant way.
Solution: Data Republic provided strategic advisory services on Datathon event preparation, governance of Datathon licensing, dataset de-identification and loading, as well as participant preparation.
• A Datathon Project capturing all licensing terms for combined datasets and participants facilitated via the Senate Platform.
• Teams securely accessed and analysed synthetic datasets and designed innovative data-driven solutions through Senate secure workspaces.
• A dedicated Data Republic support team were on the ground at the event to give expert advice and guidance.
How it worked:
• Capability partner, a data visualisation firm, uses Data Republic to access banking credit transaction data to create insight and visualisation reports over a geospatial map for its customers in the government and retail sector.
• Bank used an AI-powered chatbot created by a capability partner, an AI firm, to reduce the number of enquiries coming into the call centres. The AI firm’s natural language technology automates the customer services process by allowing consumers to request information using natural language.
• Data Republic provides both the bank and its AI partner with a sandbox to expose actual customer queries to the AI model.
Case study: Innovation
Who: Top four Australian Bank
Partners: Hyper Anna, an AI platform
Goal: The Bank were looking to provide their Business Banking clients with access to data-driven insights, via an easy-to-use reporting tool.
Problem: Preserving customer data privacy and security were top-of-mind concerns for the Bank.
Solution: Trial the application of a natural language processing system in the provision of insights to clients, leveraging card transaction data.
Hyper Anna provided an enquiry interface for the Business Banking clients, while Data Republic provided the legal framework and bespoke infrastructure to enable Hyper Anna to deploy their application.
How it worked:
● Cleansed de-identified card transactional data was licensed from the Bank to Hyper Anna on a recurring basis.
● Bespoke infrastructure ensured that neither project governance nor the utility of the Hyper Anna application were compromised.
● The Hyper Anna enquiry system considered business performance, catchment analysis, benchmarking and customer demographic information, providing practical insights to clients, to enhance business performance and optimize productivity.
Impact:
● Successfully trialed for 20 clients as part of a pilot project.
● The business intelligence product could be applied to any other bank transactional data.
3 Data management considerations continued
Techniques for protecting privacy
De-identification and privacy preserving linkage
• Hashing
• Salting
• Tokenization
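As a rough illustration of how hashing and salting support privacy preserving linkage, the Python sketch below has two parties derive the same token from a normalised email using a shared secret salt, applied here as an HMAC key. This is a generic keyed-hash approach and is not a description of the patented Senate Matching technology. The `linkage_token` and `match` helpers, the salt value and the sample emails are assumptions made for the example.

```python
import hashlib
import hmac


def normalise(email):
    """Normalise before hashing so formatting differences do not break matches."""
    return email.strip().lower()


def linkage_token(email, shared_salt):
    """Keyed SHA-256 hash of the normalised email (salt used as the HMAC key)."""
    message = normalise(email).encode("utf-8")
    return hmac.new(shared_salt, message, hashlib.sha256).hexdigest()


def match(tokens_a, tokens_b):
    """Return tokens present in both parties' datasets."""
    return set(tokens_a) & set(tokens_b)


# Assumption: the salt is agreed out of band and never shared with analysts.
salt = b"example-shared-secret"
party_a = [linkage_token(e, salt) for e in ["Ann@example.com", "bob@example.com"]]
party_b = [linkage_token(e, salt) for e in ["ann@example.com ", "cy@example.com"]]
print(len(match(party_a, party_b)))  # 1
```

An unsalted hash of an email can be reversed by hashing guessed emails, which is why the salt is kept secret. Even so, hashed identifiers still carry re-identification risk and are normally combined with the access controls listed on this slide.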
Disclosure risk controls (Sensitive attributes)
• Access controls: Controlled environments, controlled data access
• Output checks
Re-identification risk management
• Transformations: Aggregation, generalization, perturbation (diff. privacy)
• Access controls: Controlled environments, controlled data access
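The perturbation technique mentioned under transformations can be sketched with the Laplace mechanism, a standard way to add differential privacy noise to an aggregate count. This is a minimal illustration rather than a production implementation: the `noisy_count` helper and the epsilon value are assumptions, and a count is taken to have sensitivity 1 (one person changes it by at most 1).

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_count(true_count, epsilon, rng=None):
    """Return a count perturbed with Laplace noise of scale 1 / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Smaller epsilon means more noise and stronger privacy protection.
released = noisy_count(true_count=412, epsilon=0.5)
print(round(released))
```

The released value differs from run to run. Choosing epsilon is a governance decision that trades accuracy against re-identification risk.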
Advanced techniques
• Homomorphic encryption
• Confidential computing
• Federated analytics
Watch this webinar for a deep dive into the topic: datarepublic.com/webinar-privacy-preserving-matching-techniques
Considerations for cloud
Importance of neutrality in low trust relationships; protection of data and IP
Data at rest versus data in motion
Regulatory restrictions on use of cloud
Future proofing and maximising interoperability
The role of cloud in data collaboration
4 How to get started
Setting yourself up for accelerated success requires new capabilities
Strategy and Alignment
● Business Value and Metrics Alignment
● Data Ecosystem Strategy
● Use Case and Partner Identification
● Use Case and Partner Sizing and Prioritization
Internal Readiness
● Operating Model
● Due Diligence
● Data Readiness
● System Set Up
● Solution Design
External Readiness
● Data Commercialization Framework
● Go-To-Market Planning
● Partner Value Proposition
● Partner Sourcing and Scaling
● Use Cases and Partner Plan Development
Execution
● Partner Data Sharing Onboarding Support
● Privacy preserving data license development
● Delivery management and tracking
● Governance adherence and privacy validation
Best Practice
● Business Value and Metrics Alignment
● Data Ecosystem Strategy
● Use Case and Partner Identification
● Use Case and Partner Sizing and Prioritization
Data Republic provides expertise for organisations throughout this process. We can help you get started.
Resources
● Download the Senate platform whitepaper https://www.datarepublic.com/whitepaper-senate
● Senate platform demo video https://www.datarepublic.com/senate-platform-demo
● Webinar: Understanding privacy preserving matching techniques https://www.datarepublic.com/webinar-privacy-preserving-matching-techniques
● Data Republic resources library https://www.datarepublic.com/resources/resources-insights