Data Architecture at CIA
Dave Roberts
Chief Technical Officer
Application Services, CIO
Topics
• Enterprise Data Architecture at the CIA
• Applicability across enterprises
Data Architecture Mission Statement
Enable the mission by enhancing the value of data.
A Framework for Data Architecture
Some Business Drivers:
– “make it easy to share within and outside the Agency”
– “centralized data repository”
– “enable more effective linguistic search and data manipulation”
– “reuse the data, so multiple entry of the same information is eliminated”
• Enterprise Data Model
– What things are important to us and how they are related
• Master Data Management
– Starting points and priorities
• Data Services Strategy
– Shielding applications from changes in data structure
• Repository Strategy
– Increasing data value by reducing data fragmentation
• Uniform Resource Identifier
– Telling different things apart
• Semantic Technology Strategy
– Knowing what we know and what it means
When We Started
• No enterprise data model to use for standardization
• Projects underway changing major data stores
• If we developed an EDM and then standardized, it would be too late to influence the major projects
• So we are aligning projects with the EDM as we build it
• We must address not just technical characteristics of data but also the data strategy for the enterprise.
• We are releasing a draft of our EDA every quarter in FY07; the final EDA is to be complete at the end of FY07.
EDM Subject Area Overview
[Figure: EDM subject areas. Labels shown: Security and Access Control; Finance Systems and Interfaces; Reference; Mission & Business Operations; Objective; Activity; Resource; Party; Information Object; Context]
Enterprise Data Strategy
• We are changing our data culture, from project-centric to enterprise-aware
• The scope of the change is huge--top management support is essential (and we have it)
• Enterprise Data Layer project provides public focus for our efforts to improve data value
• The business understands “cleaning up” data and putting it “into” the EDL
• Although data may not move to “enter” the EDL, the EDL is nevertheless a useful construct
Application View of the EDL
• Always available
• Subject to strong, consistent access control
• Discoverable
• Physically protected
• High data integrity
• Sharable
• Consistently represented
Enterprise Data Layer Defined
The Enterprise Data Layer is a collection of data of interest to the enterprise, the software used to access, manage and control it, and the hardware used to house and access it.
The Enterprise Data Layer
– is always available
– makes all data discoverable
– includes entity types, attributes and relationships in alignment with the Enterprise Data Model
– has duplicate entity instances resolved, with a unique enterprise identifier for each instance
– is accessed through a set of enterprise access interfaces
– has access to it controlled by enterprise access control
– is physically secure
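As a minimal sketch of what "duplicate entity instances resolved" with "a unique enterprise identifier for each instance" could look like, the Python below groups source records that share a normalized match key and gives each resulting entity one identifier. The match rule (case-insensitive name plus birth date), the `urn:example:person:` prefix, and all field and system names are assumptions made for illustration, not details from the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class EnterpriseEntity:
    enterprise_id: str
    name: str
    birth_date: str
    source_keys: list = field(default_factory=list)


def match_key(record):
    # Hypothetical match rule: ignore case and extra whitespace in the name.
    return (" ".join(record["name"].lower().split()), record["birth_date"])


def resolve(records):
    """Group records with the same match key under one enterprise identifier."""
    entities = {}
    for record in records:
        key = match_key(record)
        if key not in entities:
            # Hypothetical identifier scheme: a sequential URN per new entity.
            enterprise_id = f"urn:example:person:{len(entities) + 1}"
            entities[key] = EnterpriseEntity(
                enterprise_id, record["name"], record["birth_date"]
            )
        # Remember which source system and local key contributed to the entity.
        entities[key].source_keys.append((record["system"], record["local_id"]))
    return list(entities.values())


records = [
    {"system": "hr", "local_id": "17", "name": "Jane  Doe", "birth_date": "1970-01-01"},
    {"system": "finance", "local_id": "A-9", "name": "jane doe", "birth_date": "1970-01-01"},
    {"system": "hr", "local_id": "18", "name": "John Roe", "birth_date": "1980-05-05"},
]
resolved = resolve(records)
for entity in resolved:
    print(entity.enterprise_id, entity.name, entity.source_keys)
```

Real entity resolution usually needs fuzzier matching and human review; the exact-key rule here only illustrates the idea of one identifier per resolved instance.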
Enterprise Data Layer and Applications
[Figure: diagram showing Applications, Services, Enterprise Access Control, Middleware Data Mapping, and Enterprise Data Stores]
Enterprise Data Model
• The EDM is conceptual; it shows only entity types and principal relationships
• There’s a lot that it doesn’t show and doesn’t control
• For master data entities, a logical data model will be developed that will show attributes and details of relationships
• Master data entities include the objects of interest to the community (Person, Organization, etc.)
Use of Data Models in CIA Data Architecture
• Data models are not used to control storage structures
• Data model constraints apply to interfaces to the EDL, not to physical storage
• Data model constraints can be met logically through middleware or other multipurpose services, or at the storage level
• This flexibility is used to deal with legacy and with COTS
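The idea that data model constraints apply at the interface and can be met through middleware can be sketched as an adapter. In the Python below, a legacy store keeps its own physical layout while a thin mapping layer presents records in a model-aligned shape. The class names, column names (`NM_LAST`, `ORG_CD`), and output field names are all hypothetical; the presentation does not describe a specific implementation.

```python
from typing import Protocol


class PersonInterface(Protocol):
    """Hypothetical EDL-facing interface: the shape the data model constrains."""

    def get_person(self, enterprise_id: str) -> dict: ...


class LegacyPersonStore:
    """Stand-in for a legacy or COTS store whose physical layout is left unchanged."""

    def __init__(self):
        self._rows = {"P1": {"NM_LAST": "DOE", "NM_FIRST": "JANE", "ORG_CD": "42"}}

    def fetch_row(self, key):
        return self._rows[key]


class LegacyPersonAdapter:
    """Middleware mapping: presents legacy rows in the model-aligned shape."""

    def __init__(self, store, id_map):
        self._store = store
        self._id_map = id_map  # enterprise identifier -> legacy key

    def get_person(self, enterprise_id: str) -> dict:
        row = self._store.fetch_row(self._id_map[enterprise_id])
        return {
            "enterprise_id": enterprise_id,
            "family_name": row["NM_LAST"].title(),
            "given_name": row["NM_FIRST"].title(),
            "organization_code": row["ORG_CD"],
        }


def describe(service: PersonInterface, enterprise_id: str) -> str:
    # Application code depends only on the interface, not on storage details.
    person = service.get_person(enterprise_id)
    return f"{person['given_name']} {person['family_name']}"


adapter = LegacyPersonAdapter(LegacyPersonStore(), {"urn:example:person:1": "P1"})
print(describe(adapter, "urn:example:person:1"))
```

Because `describe` sees only the interface, the legacy store could later be restructured or replaced without changing the application, provided the adapter is updated.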
How We’re Getting There
• We work with projects in development
• Every project is given an Enterprise Data Model Maturity Assessment
• Assessment can be from 1 (project-centric data management) to 5 (entirely enterprise-aware and compliant)
• Assessments are carried along with project status data
• Maturity level assessment provides a management tool to set goals and track progress
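One way to picture carrying the 1-to-5 assessment alongside project status and using it to set goals and track progress is the short Python sketch below. The `ProjectStatus` record, the per-project target level, and the project names are illustrative assumptions; the presentation does not say how assessments are stored or reported.

```python
from dataclasses import dataclass

# 1 = project-centric data management, 5 = entirely enterprise-aware and compliant
MIN_LEVEL, MAX_LEVEL = 1, 5


@dataclass
class ProjectStatus:
    name: str
    maturity_level: int
    target_level: int  # hypothetical goal set by management

    def __post_init__(self):
        for level in (self.maturity_level, self.target_level):
            if not MIN_LEVEL <= level <= MAX_LEVEL:
                raise ValueError(
                    f"maturity level must be between {MIN_LEVEL} and {MAX_LEVEL}"
                )

    @property
    def gap(self):
        # Levels still to be gained before the project meets its goal.
        return max(self.target_level - self.maturity_level, 0)


def below_target(projects):
    """Return projects that have not reached their goal, largest gap first."""
    return sorted((p for p in projects if p.gap > 0), key=lambda p: p.gap, reverse=True)


projects = [
    ProjectStatus("Project A", maturity_level=2, target_level=4),
    ProjectStatus("Project B", maturity_level=3, target_level=3),
    ProjectStatus("Project C", maturity_level=1, target_level=4),
]
for project in below_target(projects):
    print(project.name, project.maturity_level, "->", project.target_level)
```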
Next Step—Legacy
• We are inventorying legacy data stores
• For each, we will develop an appropriate plan to bring it into the EDL
• Some will have storage structures changed; others will have storage structures emulated through middleware or other enterprise services
Duration of our Effort
• We can measure progress by EDAMM level and by amount of data “in the EDL”
• The effort won’t be completed in a year, or five, or ten
• We will deliver mission benefit in the current FY and on a continuing basis
• This is like a quality effort; you don’t stop
What’s Next
• We are constructing artifacts to make our EDAMM assessment as objective as possible
– Features checklist
– Mandatory content indicated on EDM
– Logical data models for master entity types
• We are writing our EDA to describe the whole process
– The document will be specific about EDAMM assessments
– To be completed September 30, 2007; drafts quarterly
– March 31 issue will be the first to include EDAMM
What About a Community?
• If you’re talking about the whole federal government or the whole Intelligence Community, what should you do?
• We believe that it is not practical to force compliance with storage structures even within a single enterprise, much less across a federation of enterprises
• We are standardizing on interfaces, even within the enterprise
• In a community, why not standardize on information exchange and ignore how it’s stored?
The Conglomerate Model
• Enterprise Technical Architecture uses a concept called the conglomerate model
• Individual pieces are separate businesses
• Standardization is intended to allow interchange of information, not common infrastructure
Data Architecture for a Conglomerate
• Information exchange formats are required
• XML is the obvious choice for exchange
• But shared semantics for the most important data are also needed, and XML alone does not convey them
• Agreement on conceptual data model for principal entities is needed
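The gap between an exchange format and shared semantics can be illustrated with a small sketch. In the Python below, two hypothetical community members send well-formed XML about the same kind of entity using different element names; the messages only become comparable once both are mapped to agreed concepts. The element names, the concept names (`Person.familyName`, `Organization.name`), and the mapping table are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Two hypothetical members describe the same kind of thing with different tags.
MESSAGE_A = "<person><surname>Doe</surname><employer>Org X</employer></person>"
MESSAGE_B = "<individual><lastName>Doe</lastName><org>Org X</org></individual>"

# Agreed mapping from each member's element names to shared concepts.
# This table stands in for agreement on a conceptual data model.
SHARED_CONCEPTS = {
    "a": ("person", {"surname": "Person.familyName", "employer": "Organization.name"}),
    "b": ("individual", {"lastName": "Person.familyName", "org": "Organization.name"}),
}


def to_shared(xml_text, member):
    """Parse a member's XML and re-key its fields by shared concept name."""
    root_tag, field_map = SHARED_CONCEPTS[member]
    root = ET.fromstring(xml_text)
    if root.tag != root_tag:
        raise ValueError(f"unexpected root element: {root.tag}")
    # Iterating an Element yields its direct children.
    return {field_map[child.tag]: child.text for child in root if child.tag in field_map}


shared_a = to_shared(MESSAGE_A, "a")
shared_b = to_shared(MESSAGE_B, "b")
print(shared_a == shared_b)
```

Both messages parse without any agreement in place, but nothing in the XML itself says that `surname` and `lastName` mean the same thing; that knowledge lives in the mapping.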
Thank you!