Using Predictive Modeling and Public Records in Fraud Detection
Clint Fuhrman
National Director, Government Healthcare
Taxpayer Dollars Are Under Attack
Opportunities
• Eliminate the “Pay and Chase” status quo by looking to other industries and the private sector for successful approaches and technologies
  – Identity Proofing/Identity Management – Financial Services, Banking
  – Predictive Claims Analytics – Property and Casualty Insurance
  – Social Network Analysis – Intelligence and Law Enforcement
• Greater focus on the individuals and entities in the program
  • Are beneficiaries enrolling who they claim to be?
  • Have they disclosed all assets, income, correct state of residence, etc.?
  • What are the true backgrounds of the practitioners, officers, agents, etc.?
  • What is the risk profile of a provider based on background, associations, etc.?
  • What significant events are occurring between enrollment periods?
CMS Center for Program Integrity (CPI) “National Fraud Prevention Program” focused on prevention and detection that is integrated, risk‐based, and measurable; four areas of focus: Provider Screening; Predictive Modeling; Data Integration; Case Management
The Many Faces of Fraud
Over 80% of all suspected fraud cases involve provider fraud.
FALSIFICATION OF INFORMATION
False coding, altered claims
QUESTIONABLE PRACTICES
Upcoding, unbundling, cost‐shifting, prescribing practices, clustering, underutilization, invalid places of service, non‐contracted providers
OVERUTILIZATION
Medically unnecessary diagnostics, high frequency of office visits, unnecessary durable medical equipment, inappropriate diagnosis procedures
Note: Lists are not comprehensive.
Government
Background Screening
Collections
Legal
Insurance
Financial Services
Who are you?
Where are you?
Who are you related to, and how?
How much of a risk do you present?
Identity Analytics Health Care
We assess the risks and opportunities associated with people,
businesses and assets.
Data
250M+ unique individuals
1B unique business contacts
Analytics
30M transactions/hr
<500 millisecond avg search response time
~34 Terabytes in use
Computing
Real time analytics
Scores to support customer workflow for remote transactions
Scores around individual risk/opportunity
Linking
34 billion public records
1 million documents added every day
36,000 legal, business, news sources
Overview of a Data Aggregation/Risk Solution Provider
Health Care Solutions for Commercial Payers
ENTITY RESOLUTION
LINK ANALYSIS
CLUSTERING ANALYSIS
COMPLEX ANALYSIS
PUBLIC RECORDS
PROPRIETARY DATA
NEWS ARTICLE
UNSTRUCTURED RECORDS
STRUCTURED RECORDS
Utilizing Advanced Technology to Establish Identity and Risk
Claims Analytics
• Early detection of fraud, waste and abuse
• Prioritized results with fewer false positives, which enable more efficient use of investigative resources
• Alerts concerning adverse changes in the status of individuals or entities accessing benefits or networks
• Lower claims losses, better cash flow and higher ROI than traditional “post‐payment only” methods
• Consistent control over risk, quality and costs thanks to automated provider screening and monitoring
• Confidence in knowing that the right providers are being paid for the appropriate services on the appropriate members
Reducing Risk: Advantages of Enterprise Solutions
Analytics: The Value of Tips vs. FWA Software
[Chart: percent of respondents (0% to 70%) for Tips vs. Data Analysis, by most volume and most savings]
Predictive Modeling Adds a Score Plus More
Two Sanctions
Criminal Record
Significant Edits
Plus More
Bankruptcy
Sample Model Score: 985
Copyright © 2011 LexisNexis. All rights reserved.
Fraud Prevention: Predictive Claims Analytics
Claim Edits
Provider Data
Diagnosis Data
Treatment Data
Internal (Payer) Data
External Data
Claims Fraud Identification
“Provider of Interest” Identification
Subrogation Identification
Social Network Analytics
And more…
Edits
Public Records Data
Sanctions Data
Fee Schedules
PREDICTIVE MODELING
TEXT MINING
BUSINESS RULES
IDENTITY MATCHING
TEXT SEARCH
SOCIAL NETWORK ANALYTICS
VISUALIZATION
DATA SMART ORDER
USER INTERFACE
REPORTING ENGINE
SCORING ENGINE
DATA MART
DATA EXCHANGE
FUNCTIONAL COMPONENTS
STRUCTURAL COMPONENTS
Stopping abusive and fraudulent claims prior to payment will allow customers to devote
more resources to providing care to members.
Claim Arrives
License and sanctions data, criminal history, sexual offender, etc.
• Claim edits, usual and customary charges, and network pricing agreements stop only part of the impact of fraud, waste and abuse on healthcare payers
• Data‐driven analytics can produce additional claim edits that can significantly supplement the current claims adjudication process
• Claim‐level scoring can:
  • enhance identification of claims post‐pay for audit and potential recoveries
  • be tuned for use in pre‐pay to stop the most egregious abuses before payment is made
• Business rules, monitoring for specific treatment codes, and rules for claim routing pre‐pay or post‐pay improve workflow
• Claim‐level edit and scoring results can be supplemented by the identification of providers who consistently bill outside of normal patterns and practice
• While some providers are relatively easily identified, others exhibit much more subtle patterns that are nonetheless abusive
• Identifying these more subtle patterns can provide benefit to:
  • In some cases, the SIU
  • In some cases, the claim audit team, and
  • In some cases, the network management team
• Problem providers can also have their bills returned before payment to have medical records attached
Analytics for Claims Processing Workflow
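The pre‐pay and post‐pay routing described on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the vendor's implementation: the thresholds, queue names, treatment code, and provider ID are hypothetical placeholders, and the score is assumed to sit on a roughly 0 to 1000 scale where higher means more suspicious.

```python
from dataclasses import dataclass

# Hypothetical values, chosen only for illustration.
PREPAY_HOLD_SCORE = 950        # stop the most egregious claims before payment
POSTPAY_AUDIT_SCORE = 800      # pay, then queue for post-payment audit
WATCHED_TREATMENT_CODES = {"T-EXAMPLE-1"}     # placeholder treatment codes
RECORDS_REQUIRED_PROVIDERS = {"P-EXAMPLE-9"}  # placeholder problem providers


@dataclass
class Claim:
    claim_id: str
    provider_id: str
    treatment_code: str
    score: int  # suspicion score from a predictive model (assumed scale)


def route_claim(claim: Claim) -> str:
    """Return the name of the queue a claim should be sent to."""
    # Problem providers get the bill returned so medical records can be attached.
    if claim.provider_id in RECORDS_REQUIRED_PROVIDERS:
        return "return_for_medical_records"
    # Very high scores and watched treatment codes are reviewed before payment.
    if claim.score >= PREPAY_HOLD_SCORE:
        return "prepay_review"
    if claim.treatment_code in WATCHED_TREATMENT_CODES:
        return "prepay_review"
    # Moderately high scores are paid but flagged for audit and recovery.
    if claim.score >= POSTPAY_AUDIT_SCORE:
        return "postpay_audit"
    return "continue_adjudication"
```

Keeping the rules in one ordered function makes the trade-off explicit: lowering the pre‐pay threshold stops more abuse before payment but sends more valid claims to manual review.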
Claim Continues in Adjudication
Without Anything: Fraud is hidden in a sea of valid claims
With Predictive Modeling: Fraud is concentrated and prioritized for review and mitigation
Fraud Potential: High to Low
CLAIM NUMBER   SUSPICION SCORE
144618         993
138514         991
143949         989
145594         988
148531         986
152506         983
152787         982
146937         981
157651         976
141970         973
152271         970
138703         969
149491         968
139439         963
158952         950
149319         948
152602         945
Fraud Prevention: Claim Scoring Using Predictive Models
Predictive analytics provides a score for each claim, policy, etc., allowing activity to be
concentrated on areas that have the highest probability of financial return
With Rules: Some fraud is captured but much is missed
Create the target‐rich environment
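Concentrating review effort on the highest scores amounts to sorting claims by score and working down the list until investigative capacity runs out. The sketch below uses a few of the claim numbers and suspicion scores from the sample table on this slide; the function name and the `capacity` parameter are assumptions made for illustration.

```python
def build_review_queue(scores: dict[str, int], capacity: int) -> list[str]:
    """Return the claim numbers with the highest suspicion scores, best first."""
    ranked = sorted(scores, key=lambda claim: scores[claim], reverse=True)
    return ranked[:capacity]


# A handful of (claim number, suspicion score) pairs from the sample table.
scores = {
    "152602": 945,
    "144618": 993,
    "143949": 989,
    "138514": 991,
    "158952": 950,
}

queue = build_review_queue(scores, capacity=3)
print(queue)  # ['144618', '138514', '143949']
```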
• Models can help identify problem providers early that would not have been identified by other methods
• Looking at thousands of attributes about a provider or a claim to find a data pattern that makes a robust prediction
• Models use:
  • Diagnostic codes
  • Treatment codes
  • Provider types
  • Date stamps
• Identify treatment patterns associated with diagnoses that are characteristic of known problem providers and flag other providers that exhibit similar treatment patterns
Provider Models
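One simple way to picture "flag other providers that exhibit similar treatment patterns" is to summarize each provider as the share of claims falling in each (diagnosis, treatment) pair and compare that profile with a known problem provider. The sketch below uses cosine similarity as a stand‐in for whatever a production model would learn; the provider IDs, codes, and the 0.9 threshold are made up for illustration.

```python
from collections import Counter
from math import sqrt


def profile(claims):
    """Share of a provider's claims in each (diagnosis, treatment) pair."""
    counts = Counter(claims)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse profiles stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def flag_similar(providers, problem_id, threshold=0.9):
    """Return providers whose treatment pattern resembles the problem provider."""
    target = profile(providers[problem_id])
    return sorted(
        pid
        for pid, claims in providers.items()
        if pid != problem_id and cosine(profile(claims), target) >= threshold
    )


# Hypothetical claims: each entry is a (diagnosis code, treatment code) pair.
providers = {
    "known_problem": [("D1", "T9")] * 8 + [("D1", "T1")] * 2,
    "provider_a": [("D1", "T9")] * 7 + [("D1", "T1")] * 3,
    "provider_b": [("D1", "T9")] * 1 + [("D1", "T1")] * 9,
}

print(flag_similar(providers, "known_problem"))  # ['provider_a']
```

A real provider model would draw on far more attributes (provider type, date stamps, and so on); the point here is only that a pattern learned from known cases can be used to rank providers who have not yet been investigated.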
Algorithms
• Supervised vs. Unsupervised Learning
  • Supervised: have a specific outcome in historic data
  • Unsupervised: do not have an outcome; “cluster” like records together
• Decision Trees
  • Accurate, conceptually “understandable”, non‐linear, non‐parametric, robust with outliers and missing data, automatic interaction terms
• Neural Nets
  • Work best with pre‐transformed “smooth” data
  • Difficult training time
  • Black box
• Regression
  • Most established/widely used algorithm
  • Works well, but doesn’t have some of the advantages of trees
  • Works much better on linear data
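To make the supervised case concrete, the following sketch fits the smallest possible decision tree, a single split (a "stump"), on one numeric feature with a known historic outcome. It picks the threshold that minimizes weighted Gini impurity, which is one common splitting criterion; the feature, labels, and values are invented for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity."""
    best_threshold, best_impurity = None, float("inf")
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        if not left or not right:
            continue  # a split must put records on both sides
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best_impurity:
            best_threshold, best_impurity = threshold, weighted
    return best_threshold, best_impurity


# Hypothetical history: office visits per patient per year, and whether the
# provider was later confirmed as a problem provider (1) or not (0).
visits = [2, 3, 4, 5, 20, 25, 30]
confirmed = [0, 0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(visits, confirmed)
print(threshold, impurity)  # 5 0.0
```

A full tree repeats this search recursively on each side of the split, which is where the automatic interaction terms come from. An unsupervised method would receive the same `visits` data without the `confirmed` column and could only group similar records together.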
Social Network Analytics
Challenges Facing Health Care Enterprises
• Disparate data is spread across separate physical locations.
• Scale of data: BIG Data is getting BIGGER.
• Adding relationships exponentially expands the size of the BIG Data analytics challenge.
• LexisNexis has leveraged parallel‐processing computing platforms and large‐scale graph analytics for over a decade.
Technology advances are enabling a more proactive response
The emergence of open-source massive parallel-processing computing platforms opens new opportunities for enterprises to increase the agility and scale of solutions focused on addressing fraud and abuse.
– Effectively ingest and integrate massive volumes of disparate data.
– Process and analyze exponentially faster than traditional databases.
Large-scale graph analytics, generally thought to be the domain of companies like Google, offer new variables that provide relationship context between events, exposing patterns and outliers that otherwise would be hidden.
– Can be applied to many other areas beyond network analysis and social graph analysis, such as epidemiology and mathematics.
– Suited to revealing well organized fraud networks hidden within BIG Data and generating actionable results.
• Graph Analysis
  ‐ Twitter uses Graph Analysis to help the site determine who’s connected to whom in the Twittersphere.
  ‐ Google uses Graph Analysis to power its PageRank feature.
  ‐ LexisNexis uses Graph Analysis to resolve identities and establish relationships.
• Social Network Analysis
  ‐ Graph Analysis that specifically focuses on graphs built on social relationships.
Graph Analysis and Social Network Analytics
• Mixes First Party data with Public and Third Party data sources
• Adds fidelity to existing entities
• Adds new linkages into the analysis
• Adds new entities into the analysis
• Exposes ring leaders and brokers that don’t directly participate
Addition of External Data
Trends in Social Network Analytics
Reliance on “Created” Data
Transform “straw” into “gold”
• Process numerous discrete data points into high‐value data
Advanced Linking Technology
• Resolve numerous names, addresses, phones, and other info into a “Person ID”
• Better accuracy than other resolution techniques
• Resilient to name, address, and other info changes (i.e. stable over time)
Improves detection, simplifies processing, makes results easier to understand
Trends in Social Network Analytics
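The idea of resolving many names, addresses, and phones into one stable "Person ID" can be illustrated with a deliberately naive rule: two records belong to the same person if they agree on any two of the three fields, and links are transitive. This is a toy stand‐in, not the advanced linking technology the slide refers to; the matching rule, field names, and sample records are all assumptions.

```python
from itertools import combinations

FIELDS = ("name", "address", "phone")


def assign_person_ids(records):
    """Give each record an ID; records linked directly or transitively share one."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent pointers to the group's root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[max(root_i, root_j)] = min(root_i, root_j)

    first_seen = {}
    for index, record in enumerate(records):
        for f1, f2 in combinations(FIELDS, 2):
            v1, v2 = record.get(f1), record.get(f2)
            if not (v1 and v2):
                continue
            key = (f1, v1.strip().lower(), f2, v2.strip().lower())
            if key in first_seen:
                union(index, first_seen[key])
            else:
                first_seen[key] = index
    return [find(i) for i in range(len(records))]


# Hypothetical records: the same person moves, then changes phone number.
records = [
    {"name": "Pat Example", "address": "1 Main St", "phone": "555-0100"},
    {"name": "Pat Example", "address": "9 Oak Ave", "phone": "555-0100"},
    {"name": "Pat Example", "address": "9 Oak Ave", "phone": "555-0199"},
    {"name": "Lee Sample", "address": "1 Main St", "phone": "555-0111"},
]

print(assign_person_ids(records))  # [0, 0, 0, 3]
```

The first and third records share only a name, yet they end up with the same ID because the second record bridges them. That transitivity is what makes an ID stable across address and phone changes; it is also why production systems need much stricter matching evidence than this sketch uses.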
Powered by massive parallel‐processing open‐source computing platforms.
Graph/Network: 3 billion derived public data relationships between people, merged with risk indicators.
Graph analytics examine up to 20 billion data points to create variables that allow for predictive analysis incorporating relationship context and associated risk.
Targets fraud across all sectors including Health Care, Financial Services and Government.
Targeting fraud using large‐scale graph analytics
On June 6, 2008, the Department of Justice announced the arrest of Felcoranenda Estudillo
on charges of defrauding Medicare of approximately $12 million in an elaborate scheme
involving home health care services and kickbacks for referrals of patients who were not
eligible for services.
Estudillo was a registered nurse and operated Wescove Home Health Services from her home in West Covina, CA. Her husband, Oscar Estudillo, owned the business, as well as several others that used the same home address as their base. Mrs. Estudillo is the only person named in the indictment, but records show her husband was the legal owner of the business.
The link analysis chart on the following slide was constructed to show the complex array of relationships among Estudillo, her husband, and the various businesses they own and operate. Businesses were linked to the Estudillos that were not reflected in the indictment.
The identities linked to the Estudillos in the following slide have been masked but are an accurate representation of the relationships revealed by the link analysis.
Social Network Analytics
A top insurer flagged 7 claims as “collusion claims”
Using carrier data alone, we found a connection between 2 of the
7 claims.
Fraud Detection: Social Network Analytics
Collusion in Louisiana AFTER Advanced Linking Technology is Applied
Assigned unique IDs to all parties and HPCC added 2 additional degrees of relative data
Family 1
Family 2
Showed 2 family groups interconnected on the 7 original claims plus linked to 11 more.
Fraud Prevention: Social Network Analytics
Proof of Concept – NY Office of Medicaid Inspector General
Applied social network analytics to information provided by
the State of New York and public data supplied to identify
relationships between a group of New York Medicaid
recipients living in high‐end condominiums located within the
same complex and any links those individuals might have to
medical facilities or others providing care to New York
Medicaid recipients.
Purpose of Proof of Concept
• Derived Public Data Relationships are built from a +/‐ 50 terabyte database for the entire U.S. population. This is used to build a large‐scale network map of the Medicaid recipients and everyone associated within 2 degrees.
• Patented algorithms are used to cluster the network map and generate statistics to measure every cluster.
• The graph is queried for the clusters with the most significant statistics.
• For each cluster, if all these recipients are connected:
  • How many of them are living in expensive residences, owned expensive property or drive expensive cars?
  • How many recipients are contacts of medical businesses?
  • How many medical businesses are associated with any of the people in the cluster?
  • How many are currently receiving benefits?
Methodology
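The methodology can be pictured with a toy graph: expand outward two degrees from the recipients, group the resulting people into clusters, and compute per‐cluster statistics. The sketch below substitutes plain connected components for the patented clustering algorithms, and every node name and attribute in it is invented for illustration.

```python
from collections import deque


def within_degrees(graph, seeds, max_depth=2):
    """Return the seeds plus everyone reachable within max_depth links."""
    depth = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand beyond the allowed number of degrees
        for neighbor in graph.get(node, ()):
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return set(depth)


def clusters(graph, nodes):
    """Connected components of the graph restricted to the given nodes."""
    seen, result = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(n for n in graph.get(node, ()) if n in nodes and n not in component)
        seen |= component
        result.append(component)
    return result


def cluster_stats(cluster, attrs, flags=("recipient", "expensive_vehicle", "medical_contact")):
    """Count how many members of a cluster carry each flag."""
    stats = {"size": len(cluster)}
    for flag in flags:
        stats[flag] = sum(1 for node in cluster if attrs.get(node, {}).get(flag))
    return stats


# Hypothetical undirected relationship graph (each link listed in both directions).
graph = {
    "r1": ["p1"], "r2": ["p1"], "p1": ["r1", "r2", "p2"],
    "p2": ["p1", "p3"], "p3": ["p2"],
    "r3": ["p4"], "p4": ["r3"],
}
attrs = {
    "r1": {"recipient": True, "expensive_vehicle": True},
    "r2": {"recipient": True},
    "r3": {"recipient": True},
    "p1": {"medical_contact": True},
}

nodes = within_degrees(graph, ["r1", "r2", "r3"])
ranked = sorted(
    (cluster_stats(c, attrs) for c in clusters(graph, nodes)),
    key=lambda s: s["recipient"],
    reverse=True,
)
print(ranked[0])
```

In this toy data, `p3` sits three links away from any recipient and is left out of the map, and the largest cluster contains two recipients, one expensive‐vehicle owner, and one medical business contact.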
What is the list of preferred expensive vehicles?
Make Description   # Owned    Make Description   # Owned
Mercedes-Benz      46         Chevrolet          2
Lexus              41         Hummer             2
BMW                27         Jeep               2
Infiniti           13         Nissan             2
Acura              9          Toyota             2
Lincoln            8          Aston Martin       1
Audi               7          Bentley            1
Land Rover         7          Cadillac           1
Porsche            6          GMC                1
Jaguar             5          Honda              1
Mercedes Benz      3          Volkswagen         1
Saab               3          Volvo              1
City Walk Sample Vehicle Statistics
Name                 Deeds Held    Name            Deeds Held
Hudson Eight         78            Mike Greem      21
Hudson Five          74            Scott Hill      21
Hudson First         73            Betty Donaway   21
Hudson Nine          65            Al Clark        19
Harry Anderson       45            Dave Miller     17
Hudson Ten           41            Mark Walker     16
Hudson Seven         39            Mike Smith      16
Home Nationwide      33            Val Edwards     15
Hudson Three         33            Eric Garcia     14
Brian Smith          28            Dane Young      14
Alan Stevens         25            Bill Moore      14
Chris Doe            24            Karen Carter    14
Sophie Davis         23            Casey Baker     14
Washington Mutual    23            Art Nelson      14
Fleet Mortgage Co.   21            Cathy Parker    13
Dominant buyers and sellers at City Walk
Property Deed Reference Counts for City Walk
Cluster Visualization
A Comprehensive Approach
A Layered Solution with Fresh Insights
Business Rules
Predictive Modeling
Identity Match/ Claims Watch Intelligent Data Retrieval
Severity Analysis
External Evaluators
Medical Claims
Payer Watch List
Provider Bill
Prepayment ClaimFocusSM Evaluation
PAY
Contributory Data
IDV & Authentication
Evaluate
Appropriate Claims Handling Process
Claims Data History
Policy Data
Provider Billing History
Medical Bill Detail
Analytics Processing
Yes
No
Bringing it All Together
Clint Fuhrman
National Dir, Government Healthcare
LexisNexis Risk, Inc.
202‐503‐6639
Thank You!
[email protected]
LinkedIn Group: LexisNexis Health Care Solutions
Twitter: LexisHealthCare
Blog: http://blogs.lexisnexis.com/healthcare