How Fannie Mae Leverages Data Quality to Improve the Business

How Fannie Mae Leverages Data Quality to Improve the Business

April 23, 2015
Speaker: James Barrett

Federal National Mortgage Association, Washington, DC, USA

© 2015 Fannie Mae

About Fannie Mae

As the leading source of residential mortgage credit in the U.S. secondary market, Fannie Mae is supporting today's economic recovery and helping to build a sustainable housing finance system. We exist to provide reliable, large-scale access to affordable mortgage credit in all communities across the country at all times so people can buy, refinance, or rent homes.

We are working to establish and implement industry standards, develop better tools to price and manage credit risk, build new infrastructure to ensure a liquid and efficient market, and facilitate the collection and reporting of data for accurate financial reporting and improved risk management.

We are committed to being our customers’ most valued business partner and delivering the products, services, and tools our customers need to serve the entire market confidently, efficiently, and profitably.

SPEAKER

James Barrett is the Data Quality Manager in Enterprise Data, Operations & Technology at the Federal National Mortgage Association (Fannie Mae). His background includes architecture (enterprise and solutions), database administration, project management, and custom software development specializing in enterprise data stores, and, of course, data quality.

Note: The views expressed in this presentation are the speaker’s and do not necessarily represent those of Fannie Mae.

How Fannie Mae Leverages Data Quality to Improve the Business

1. Overview
2. Data Quality: who cares? why care? what is it? when to deploy? where to deploy?
3. Expectations & Experiences: centralized vs. federated vs. self-service models for DQ build-out; effective self-service DQ; DQ integration with enterprise architecture; cost reduction; DQ ownership
4. Data Quality – Next Steps

Who cares about Data Quality?
• Regulators
  – Enterprise vs. business “silos”
• Data Governance & Chief Data Officer
  – Responsible for DQ to senior management
• Data Owners
  – Need to be aware of DQ and fix it if necessary
• Data Managers
  – Governance and Owners look to EDM for DQ solutions
• Users of Data
  – Provide data used by decision-makers – tactical and strategic
• People affected by decisions made by users of data
  – Customers, policy-makers, planners

The Enterprise Data Quality Manager has many viewpoints and opinions to consider!

Why care about Data Quality?
• Because regulators care
• Data quality affects quality of work and life
  – Did you use DQ today?
  – Do your teams use DQ in their jobs?
• How can governance, data owners, and data management ever meet the enterprise DQ need?
  – Data keeps growing
  – Roles and responsibilities need definition, and change over time

Some sort of balancing act must be achieved.

What is Data Quality?
• Fit-for-use
  – Avoid overkill; use DQ to meet the purpose for which the data is used; not all data is critical for all purposes; global vs. local
• As many criteria as there are uses for data
  – My fatal error may be your trivial warning
• Measurement, Monitoring, Remediation
  – DQ business architecture
  – Data correction is tough
• Attribute validation vs. source-target reconciliation (sketched below)
  – Use cases enable fit-for-use analysis
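
To make the two check styles concrete, here is a minimal sketch in Python with pandas. It is not Fannie Mae's implementation: the loan columns, the "UPB must be positive" rule, and the sample frames are hypothetical stand-ins.

```python
# Minimal sketch: attribute validation vs. source-target reconciliation.
# All column names, values, and the positivity rule are hypothetical.
import pandas as pd

source = pd.DataFrame({"loan_id": [1, 2, 3], "upb": [250_000.0, 180_000.0, -50.0]})
target = pd.DataFrame({"loan_id": [1, 2], "upb": [250_000.0, 180_000.0]})

# Attribute validation: test each value against a rule on its own column.
attribute_failures = source[source["upb"] <= 0]

# Source-target reconciliation: compare the same facts across two stores,
# catching rows that were dropped or altered between them.
merged = source.merge(target, on="loan_id", how="left",
                      suffixes=("_src", "_tgt"), indicator=True)
missing_in_target = merged[merged["_merge"] == "left_only"]
value_mismatches = merged[(merged["_merge"] == "both") &
                          (merged["upb_src"] != merged["upb_tgt"])]

print(len(attribute_failures), "attribute failures;",
      len(missing_in_target), "rows missing in target;",
      len(value_mismatches), "value mismatches")
```

Note that the same record can pass one style and fail the other, which is the fit-for-use point: the rule set depends on how the data will be used.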

Start with a DQ Business Architecture
• DQ Rule Metadata Template (a hypothetical template is sketched below)
• DQ Functions
• DQ Use Cases
• DQ Relationships with Metadata Management
• DQ Relationships with Enterprise Logical Data Modeling
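
The presentation names the rule metadata template without defining it, so the following is only a guess at its shape, written as a Python dataclass. Every field name is an assumption made for illustration.

```python
# Hypothetical DQ rule metadata template; the real template is not shown
# in the presentation, so these fields are illustrative only.
from dataclasses import dataclass, field

@dataclass
class DQRuleMetadata:
    rule_id: str            # unique key, referenced by stored metrics/exceptions
    name: str               # human-readable rule name
    eldm_entity: str        # ELDM entity the rule applies to
    eldm_attribute: str     # ELDM attribute the rule applies to
    rule_type: str          # e.g. "attribute_validation" or "reconciliation"
    severity: str           # fit-for-use is local: one user's fatal error
                            # may be another's trivial warning
    exception_action: str   # "accept", "replace", or "reject"
    owner: str              # data owner accountable for the rule
    description: str = ""   # business meaning of the rule
    tags: list[str] = field(default_factory=list)

rule = DQRuleMetadata(
    rule_id="DQ-0001", name="UPB must be positive",
    eldm_entity="Loan", eldm_attribute="unpaid_principal_balance",
    rule_type="attribute_validation", severity="fatal",
    exception_action="reject", owner="loan-data-owner")
```

Capturing rules as structured metadata in this way is what makes the metadata-driven reports and rule lineage discussed later possible.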

When to deploy Data Quality?
• Reactive
  – After somebody notices
  – After somebody asks (with or without funding)
• Proactive
  – Before anybody notices
  – Before it spreads downstream
  – Use a pre-defined list of data attributes and standard rules
• Exceptions: accept, replace, or reject (sketched below)

Being proactive can be expensive; being reactive is risky.

Consider your consumers when defining exception rules.
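
The three exception dispositions lend themselves to a short sketch. This is a hypothetical policy function, not the speaker's design; the record shape and the hard-coded field are assumptions.

```python
# Sketch of the three exception dispositions: accept, replace, or reject.
# The record shape and the "upb" field are hypothetical.
from typing import Optional

def apply_exception_policy(record: dict, action: str,
                           replacement=None) -> Optional[dict]:
    """Decide what happens to a record that failed a DQ rule."""
    if action == "accept":
        return record                          # keep it, but flag the exception
    if action == "replace":
        return {**record, "upb": replacement}  # substitute a standard value
    if action == "reject":
        return None                            # drop it from the downstream flow
    raise ValueError(f"unknown exception action: {action}")

failed = {"loan_id": 3, "upb": -50.0}
print(apply_exception_policy(failed, "replace", replacement=0.0))
print(apply_exception_policy(failed, "reject"))
```

Whichever disposition a rule uses, downstream consumers need to know about it; a silent "replace" can surprise a consumer who expected the raw value.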

Where to deploy Data Quality?
• Application Build-Out – Centralized vs. Federated
  – While loading (“in flight”) OR after loading data (“at rest”) – sketched below
• Self-Service
  – Can be fast and cheap
  – Can't handle all DQ rules and requirements
• Areas of risk
  – How to identify?
  – How to quantify?
• At the source if possible
  – Need 20/20 hindsight OR green-field projects

Hybrid strategies seem the most robust.
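
"In flight" versus "at rest" is easy to show in miniature. The loader, store, and rule below are hypothetical placeholders, not a real pipeline.

```python
# In-flight vs. at-rest placement of a (hypothetical) DQ rule.

def load_in_flight(records, rule, sink, exceptions):
    """Validate each record while loading, before it lands in the store."""
    for rec in records:
        (sink if rule(rec) else exceptions).append(rec)

def scan_at_rest(store, rule):
    """Scan data that has already landed in the store."""
    return [rec for rec in store if not rule(rec)]

rule = lambda rec: rec["upb"] > 0      # hypothetical attribute rule
sink, exceptions = [], []
load_in_flight([{"upb": 100.0}, {"upb": -1.0}], rule, sink, exceptions)
print(len(sink), "loaded;", len(exceptions), "caught in flight")
print(scan_at_rest(sink, rule), "failures found at rest")
```

In-flight checks stop bad data before it spreads; at-rest scans can cover data that has already landed, which is why hybrids tend to win.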

DQ application build-out
• Centralized
  – Rules built/run by one application for other applications' data – “at rest”
  – A single application implies a single owner -> enterprise data governance
  – Initial DQ build-out: DQ standards, design patterns, CoE resources
• Federated
  – Rules built/run by each application for its own data – “in flight” or “at rest”
  – Rules and stored metrics/exceptions owned by each application
• Self-Service
  – Rules built/run by a data analyst/team – “at rest”
  – Rules and stored metrics/exceptions owned by the team/analyst
  – Tool-based (e.g., IDQ Analyst) rather than custom development

The federated model scales better than the centralized model; self-service has the lowest cost per rule, but its setup and support require a DQ CoE.

Effective Self-Service DQ
• Training and documentation
• Usage tracking after training
• Customer feedback
• Desk-side support after training
• Team-based access and change control
• Individual and shared rule folders

“Feed a man a fish, you feed him for a day. Teach him to fish, you feed him for a lifetime.” – attributed to Confucius

DQ <-> Enterprise Architecture
• Centralized DQ rule repository
• Data quality rule lineage
• Technical vs. business DQ rules
• Patterns for DQ rules in data flows from/to:
  – Transaction Data Store
  – Operational Data Store
  – Master Data Store
  – Data Warehouse
  – Data Mart
• BPM and BAM
  – Data exceptions and corrections imply (sketched below):
    • Alerts
    • Replaying corrections for downstream consumers
    • Re-calculation of derived attributes

Architecting DQ <-> EA: don't let the perfect be the enemy of the good.
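
The chain of consequences from one correction can be sketched as a single event handler. The event shape, the consumer callbacks, and the derivation hook are all hypothetical; a real implementation would sit on a BPM/BAM platform rather than plain functions.

```python
# Sketch of what one data correction implies, per the slide:
# (1) an alert, (2) a replay for downstream consumers,
# (3) re-calculation of derived attributes. All wiring is hypothetical.

def on_correction(record_id, field_name, old, new, downstream, recalc_derived):
    print(f"ALERT: {field_name} on {record_id} corrected {old} -> {new}")  # (1)
    for consumer in downstream:                                            # (2)
        consumer(record_id, field_name, new)
    recalc_derived(record_id)                                              # (3)

downstream = [lambda rid, f, v: print(f"  replayed to consumer: {rid}.{f} = {v}")]
recalc = lambda rid: print(f"  re-calculated derived attributes for {rid}")
on_correction("loan-3", "upb", -50.0, 0.0, downstream, recalc)
```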

Cost reduction
• Re-usable components
  – DQ rules: logical and physical design patterns
  – DQ rule results: data structures for summary and detail metrics (sketched below)
  – DQ reports: metadata-driven
• DQ Warranty
• DQ Metrics Common Message XSD
• Self-Service DQ framework

“Killing two birds with one stone” is a proverb made for DQ cost reduction.
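
The re-usable summary and detail result structures might look something like the sketch below; the presentation only names them, so every field here is an illustrative guess.

```python
# Hypothetical re-usable DQ result structures: one summary row per rule
# run, plus one detail row per failing record. Field names are illustrative.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class DQSummaryMetric:
    rule_id: str
    run_at: datetime
    rows_checked: int
    rows_failed: int

    @property
    def pass_rate(self) -> float:
        return 1 - self.rows_failed / self.rows_checked if self.rows_checked else 1.0

@dataclass
class DQDetailMetric:
    rule_id: str
    record_key: str       # key of the failing record
    observed_value: str   # offending value, kept as text for generality

summary = DQSummaryMetric("DQ-0001", datetime.now(), rows_checked=3, rows_failed=1)
print(f"pass rate: {summary.pass_rate:.2%}")
```

Because every rule writes results to the same shapes, one metadata-driven report can serve all of them, which is where the cost reduction comes from.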

DQ ownership
• Data Owners – defining DQ rules, by ELDM entity/attribute
• Application Owners – remediation of DQ exceptions
• Data Governance – DQ policies and standards
• Data Management – best practices for implementation of DQ standards
• Data Users – identifying and raising DQ issues to all of the above

DQ management requires good negotiation and persuasion skills to build teams.

Data Quality – Next Steps
• Define KPIs to manage the three DQ build-out models
• Integrate Self-Service and Federated DQ
• Quantify DQ risk at the rule level, and apply it to the DQ warranty value chain
• Integrate BAM and BPM with data corrections

“It ain't what you don't know that hurts you, it's what you know for certain that ain't so.” – attributed to Mark Twain