TRANSCRIPT
Session 16
Security Frameworks in Data Warehousing and Their Interplay with Healthcare Analytics
Patrick Nelli, Senior Vice President, Health Catalyst
Learning Objectives
• Discuss the balance between data utilization and security/privacy
• Share examples in key areas that impact this balance
  - Monitoring
  - Data de-identification
  - Cloud environments
  - User access
Why
We have an obligation to patients to make the best use of the data that we collect on their behalf
Data Utilization
Security / Privacy
Top Technology Initiatives Driving IT Investment
• Cloud Computing: 30%
• Security: 29%
• Data / Business Analytics: 27%
• Other: 14%
Source: 2016 State of the CIO – Survey. Exclusive Research from CIO (http://www.cio.com/)
Why
Security and Privacy
• Multiple layers of security and privacy
  - Physical Controls
  - Preventive Controls
  - Detective Controls
  - Administrative Controls
  - Many More (HITRUST – 14 Control Categories based on ISO 27001)
• For today, primarily focus on detective controls
Balancing Act #1
Monitoring
Data Utilization
Security / Privacy
Poll Question #2
What is the most prevalent security incident pattern in healthcare (by frequency of confirmed data breach incidents)?
a) Cyberespionage
b) Insider and privilege misuse
c) Stolen assets (e.g. laptops)
d) Web application attacks
e) Walking away with paper records
f) Unsure or not applicable
You Will Never
Catch Me!
Security Incident Patterns in Healthcare (% of total incidents, only confirmed data breaches)
• Privilege Misuse: 32%
• Misc. Errors: 22%
• Stolen Assets: 19%
• Point of Sale: 7%
• Web Apps: 3%
• Crimeware: 3%
• Cyberespionage: 3%
• Everything Else: 11%
** Higher Than Any Other Industry
Source: Verizon 2016 Data Breaches Investigations Report
Tools Implemented for Information Security By Acute Care Providers
• Antivirus/malware: 85%
• Firewalls: 78%
• Data encryption (data in transit): 68%
• Data encryption (data at rest): 61%
• Patch and vulnerability management: 61%
• Audit logs of access to patient records: 60%
• Intrusion detection systems (IDS): 57%
• Network monitoring tools: 55%
Source: 2016 HIMSS Cybersecurity Survey
Our Perspective
• Logs aren’t enough; they need monitoring
• Manual
  - Search and BI on top of logs
  - Human reviews
• Automated
  - Alerting rules (PagerDuty, Azure OMS, etc.)
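An automated alerting rule of the kind mentioned above could look roughly like the following. This is a minimal sketch, not how PagerDuty or Azure OMS define rules: the event tuple layout, the function name `failed_login_alerts`, and the default threshold and window are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from datetime import timedelta


def failed_login_alerts(events, threshold=3, window=timedelta(minutes=5)):
    """Return (user, timestamp) alerts for bursts of failed logins.

    `events` is an iterable of (timestamp, user, succeeded) tuples that is
    already sorted by timestamp (hypothetical log format).
    """
    recent = defaultdict(deque)  # user -> timestamps of recent failures
    alerts = []
    for ts, user, succeeded in events:
        if succeeded:
            continue
        failures = recent[user]
        failures.append(ts)
        # Drop failures that have fallen out of the time window.
        while failures and ts - failures[0] > window:
            failures.popleft()
        if len(failures) >= threshold:
            alerts.append((user, ts))
            failures.clear()  # avoid re-alerting on the same burst
    return alerts
```

In practice the returned alerts would be handed to whatever paging or notification tool is in use; that hand-off is deliberately left out here.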
Monitoring
Stack | Examples | Example Metrics
Analytical Applications / Reports | Web-based, Qlik, Tableau, BO | Usage, click paths, performance
Analytics Environments | Specialty focused environments for Predictive Analytics, NLP, Image Analysis | Performance, run times, model metrics (RMSE, accuracy)
Database / Data Store / ETL / Compute | SQL Server, Oracle, DataLake | Queries, Access (AD), ETL run times
VMs / Hardware | OS (Windows / Linux), Virtualization (HyperV, VMWare) | Event logs (installs, invalid logins, failed applications), performance logs
Network | Switches, Firewalls, Routers | Invalid logins, suspicious login patterns (IP-analysis)
Benefits: Security / Privacy, Performance / Efficiencies, Product Development
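The network-layer metric listed above (suspicious login patterns via IP analysis) might be approximated as follows. This is only a sketch under assumed inputs: the `(ip, account, succeeded)` event layout, the function name, and the `min_accounts` cutoff are hypothetical, and a real deployment would likely draw on firewall or directory logs instead.

```python
from collections import defaultdict


def suspicious_ips(login_events, min_accounts=3):
    """Return source IPs with failed logins against many distinct accounts.

    `login_events` is an iterable of (ip, account, succeeded) tuples
    (hypothetical format). One IP failing against several accounts is a
    common sign of password guessing.
    """
    targets = defaultdict(set)  # ip -> accounts it failed to log in to
    for ip, account, succeeded in login_events:
        if not succeeded:
            targets[ip].add(account)
    return sorted(ip for ip, accounts in targets.items()
                  if len(accounts) >= min_accounts)
```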
Triple Benefit of Monitoring Analytics Products
• Security / Privacy
• Performance / Efficiencies
• Product Development
Security / Privacy – Overview
• Aligns with Levels 4 and 5 of the HITRUST maturity model (Policy, Process/Procedures, Implemented, Measured, Managed)
• Enables streamlined re-certification (SOC 2, HITRUST)
• Enables audit of access and appropriate use
Security / Privacy – Ex. Appropriate Use
WHERE p.PersonNM = 'Pete Hess'
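The WHERE clause above filters on one named person, which is the kind of query an appropriate-use review would want to surface. A minimal sketch of such a check over a query log follows; the `(user, sql_text)` log format, the function name, and the table names in the usage note are assumptions, and only the `PersonNM` column and the example name come from the slide.

```python
import re

# Matches predicates such as: p.PersonNM = 'Pete Hess'
NAME_LOOKUP = re.compile(r"\bPersonNM\s*(?:=|LIKE)\s*'[^']*'", re.IGNORECASE)


def flag_name_lookups(query_log):
    """Return (user, sql_text) entries that look up a person by name.

    `query_log` is an iterable of (user, sql_text) pairs (hypothetical
    format). Flagged entries are candidates for human review, not proof
    of misuse.
    """
    return [(user, sql) for user, sql in query_log if NAME_LOOKUP.search(sql)]
```

For example, a log entry containing `SELECT * FROM Person p WHERE p.PersonNM = 'Pete Hess'` would be flagged, while an aggregate such as `SELECT COUNT(*) FROM Encounter` would not.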
Security / Privacy – Ex. Access
• Automate Access Review
  - Query access groups (Active Directory)
  - Query database access (SQL Server) or application access (Qlik, Tableau, Web)
  - Query SQL queries (IDERA) and application usage (Qlik, Tableau, Web)
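One way to automate the access review described above is to compare what each user has been granted against what they have actually used. The sketch below assumes both inputs have already been exported (for instance from Active Directory and from query or application logs); the data shapes and the function name are hypothetical.

```python
def unused_access(granted, usage):
    """Return (user, resource) grants with no recorded usage.

    `granted` maps each user to the set of resources they can access;
    `usage` is an iterable of (user, resource) events from query or
    application logs. Both formats are assumptions for illustration.
    """
    used = set(usage)
    return sorted(
        (user, resource)
        for user, resources in granted.items()
        for resource in resources
        if (user, resource) not in used
    )
```

Grants that never show up in the usage data are candidates for removal, subject to review by whoever owns the data.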
Performance / Efficiencies
Product Development – Overview (Think Lean)
Build-Measure-Learn loop: IDEAS -> BUILD -> CODE -> MEASURE -> DATA -> LEARN -> IDEAS
Minimize Total Time Through the Loop
Example measures:
• Session counts
• Distinct users
• Return users (cohort analysis)
• Click paths
• Selections
• Satisfaction survey (Net Promoter Score)
• A/B tests
Source: Eric Ries, The Lean Startup
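Several of the product measures listed above can be derived directly from usage logs. The following sketch assumes a session log of `(user, day)` pairs and survey ratings on a 0 to 10 scale; the function names and the definition of a return user (sessions on more than one distinct day) are assumptions, while the Net Promoter Score formula is the standard one.

```python
from collections import defaultdict


def usage_metrics(sessions):
    """Compute session count, distinct users, and return users.

    `sessions` is an iterable of (user, day) pairs, one per session
    (hypothetical format). A return user is assumed to be one with
    sessions on more than one distinct day.
    """
    sessions = list(sessions)
    days_by_user = defaultdict(set)
    for user, day in sessions:
        days_by_user[user].add(day)
    return {
        "session_count": len(sessions),
        "distinct_users": len(days_by_user),
        "return_users": sum(1 for days in days_by_user.values() if len(days) > 1),
    }


def net_promoter_score(ratings):
    """NPS: percent promoters (9-10) minus percent detractors (0-6).

    `ratings` must be a non-empty iterable of 0-10 scores.
    """
    ratings = list(ratings)
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)
```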
Product Development – Example
Balancing Act #2
Data De-Identification
Data Utilization
Security / Privacy
Safe Harbor
• 18 data elements removed/transformed
• Problematic areas
  - All elements of dates (except year) for dates directly related to an individual
  - All geographic subdivisions smaller than a state
  - “The covered entity does not have actual knowledge that the information could be used alone or in combination with other information to identify an individual who is a subject of the information”
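To make the cost of the two problematic elements above concrete, the sketch below reduces dates to the year and drops geography finer than state. It covers only those two elements, not the full list of 18 or the rule's exceptions, and the record layout, field names, and function name are hypothetical.

```python
from datetime import date


def generalize_record(record, date_fields, sub_state_geo_fields):
    """Return a copy of `record` with dates reduced to year and
    sub-state geography removed.

    Illustrative only: this is not a complete Safe Harbor
    de-identification, and the field names are supplied by the caller.
    """
    redacted = dict(record)
    for field in date_fields:
        value = redacted.get(field)
        if isinstance(value, date):
            redacted[field] = value.year  # keep only the year
    for field in sub_state_geo_fields:
        redacted.pop(field, None)  # drop geography finer than state
    return redacted
```

After this transformation, analyses that depend on length of stay, seasonality, or neighborhood-level variation are no longer possible, which is the utility loss the slide points to.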
Expert Determined
• No one-size-fits-all transformations
• Curse of dimensionality (k-anonymity)
• Tradeoff between anonymity and utility
• Hard to get right, restricts vast majority of analytical use cases
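The curse of dimensionality in k-anonymity can be shown with a small calculation: a dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k rows, and k shrinks as more columns are treated as quasi-identifiers. The sketch below uses made-up rows and a hypothetical function name.

```python
from collections import Counter


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group of rows sharing the same
    values for `quasi_identifiers`. `rows` must be a non-empty list of
    dicts containing those keys."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


# Made-up example rows, not real data.
rows = [
    {"birth_year": 1980, "state": "UT", "sex": "F"},
    {"birth_year": 1980, "state": "UT", "sex": "M"},
    {"birth_year": 1975, "state": "UT", "sex": "F"},
    {"birth_year": 1975, "state": "UT", "sex": "F"},
]

print(k_anonymity(rows, ["state"]))                       # 4
print(k_anonymity(rows, ["state", "birth_year"]))         # 2
print(k_anonymity(rows, ["state", "birth_year", "sex"]))  # 1
```

With only three columns one row is already unique; a clinical dataset with dozens of columns hits this point much sooner, which is why preserving anonymity tends to strip out analytical value.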
Source: [1] http://toddwschneider.com/posts/analyzing-1-1-billion-nyc-taxi-and-uber-trips-with-a-vengeance
Data Continuum
Category | Location | Analytical Use Cases
Full PHI (Untransformed) | Secure Environments | Ad hoc querying, analytical applications, reports, decision support, etc.
Redacted Data (Still PHI) | Secure Environments | Ad hoc querying, analytical applications, predictive analytics, image analysis, etc.
HIPAA De-Identified Datasets | Varies | Product development, summary aggregated metrics
Axes: Data Flow, Analytical Value of Data, Privacy & Security Risk
Balancing Act #3
Cloud
Data Utilization
Security / Privacy
Cloud Environments
Overview
Stack | Examples | Example Metrics
Analytical Applications / Reports | Web-based, Qlik, Tableau, BO | Usage, click paths, performance
Analytics Environments | Specialty focused environments for Predictive Analytics, NLP, Image Analysis | Performance, run times, model metrics (RMSE, accuracy)
Database / Data Store / ETL / Compute | SQL Server, Oracle, DataLake | Queries, Access (AD), ETL run times
VMs / Hardware | ?????
Network | ????? | Attempted sign-ons,
Benefits: Security, Performance / Efficiencies, Product Development
• Most of the analytics stack will eventually move to the cloud
• However, first cloud pressure will be for specific analytics use cases
Best Practices – Leverage Their Audits
Source: 13 Effective Security Controls for ISO 27001 Compliance When using Microsoft Azure
Best Practices – Monitoring
Best Practices – Alerting
Best Practices – Security Center
Topic We Are Contemplating
Data Utilization
Security / Privacy
User Access
• Streamline user permission granting process
  - Make select reports / applications available to everyone within certain roles
  - Involve data stewards
• Role-based security
  - Simplify roles
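Making select reports available to everyone within certain roles amounts to a role-to-report mapping that is checked at access time. The sketch below is one simple way to express that; the role names, report names, and function name are all hypothetical, and in practice the mapping would be maintained with input from data stewards.

```python
# Hypothetical role-to-report mapping, for illustration only.
ROLE_REPORTS = {
    "clinician": {"readmissions_dashboard", "sepsis_summary"},
    "finance_analyst": {"cost_per_case"},
}


def can_view(user_roles, report, role_reports=ROLE_REPORTS):
    """Return True if any of the user's roles grants access to `report`."""
    return any(report in role_reports.get(role, set()) for role in user_roles)
```

Keeping the number of roles small keeps this mapping easy to review, which is the point of simplifying roles.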
Lessons Learned
1. Data is useless if you don’t put it in the hands of analysts, operators, and clinicians. Need to strike a balance between security/privacy and data exposure.
2. Logging is not enough, need to make the data actionable through search and BI. This can lead to multiple benefits:
a. Security / privacy
b. Performance efficiencies
c. Better product development
3. Data de-identification is typically not a good balance of utilization and security.
4. Cloud environments, if set up properly, help with the balance of utilization and security.
Analytic Insights
Questions & Answers
What You Learned…
Write down the key things you’ve learned related to each of the learning objectives after attending this session
Thank You