case study: taxonomies as a tool to increase discovery of intelligence community data assets

DESCRIPTION

Presented at the 10th annual Data Harmony Users Group meeting on Tuesday, February 11, 2014 by Marcie Zaharee of the MITRE Corporation. Describes how the NGA-DCGS Metadata Harmonization (MDH) project used Data Harmony Thesaurus Master to create and manage the Intelligence, Surveillance, and Reconnaissance (ISR) Operations taxonomy. The results of the project suggest that an ISR Operations taxonomy can be built, exported, and shared with the DoD and intelligence communities in a format that both users and systems can understand, and that the taxonomy can help populate metadata fields to increase discovery of data assets.

TRANSCRIPT

1

Taxonomy Development Revisited

Lessons Learned*

Marcie Zaharee, PhD

Data Harmony User Conference

February 2014

*Fahsi, A., Zaharee, M. (2013). Framework for Developing an Intelligence Reconnaissance and Surveillance (ISR) Operations Taxonomy. MITRE Technical Report.

Approved for Public Release

2

Overview

• Recap from last year

– ISR Operations Taxonomy Development effort and framework

• Lessons Learned

– Working in teams

– Working with SMEs

– Developing, Maintaining, Exporting, and Posting

• Summary

3

ISR Operations Taxonomy*

Research Questions:

• Can an adequate unclassified ISR Operations taxonomy be built from open source material?

• Can an unclassified ISR Operations taxonomy be designed in a way that:

– Is repeatable?

– Is easily accessible and understandable to the end user?

• How can an ISR Operations taxonomy…

– Provide terms for population of metadata?

– Be effectively exported to a machine-readable language and used to facilitate searches?

*Classification scheme to categorize ISR Operations data assets (i.e., platforms and sensors)
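
As a rough illustration of the third research question, the sketch below shows one way taxonomy terms could suggest values for a keyword metadata field. It is not the project's implementation: the lookup table, its entries, and the function name `suggest_keywords` are all assumptions made for this example, and the terms are not taken from the actual ISR Operations taxonomy.

```python
import re

# Hypothetical lookup: each label (preferred or non-preferred) maps to
# the preferred term that would be written into the metadata field.
SYNONYMS = {
    "uav": "Unmanned aircraft system",
    "drone": "Unmanned aircraft system",
    "unmanned aircraft system": "Unmanned aircraft system",
    "sar": "Synthetic aperture radar",
    "synthetic aperture radar": "Synthetic aperture radar",
}


def suggest_keywords(text, lookup=SYNONYMS):
    """Return preferred terms whose labels occur as whole words in text."""
    lowered = text.lower()
    found = []
    for label, preferred in lookup.items():
        # Word boundaries keep "sar" from matching inside a longer word.
        if re.search(rf"\b{re.escape(label)}\b", lowered) and preferred not in found:
            found.append(preferred)
    return found


print(suggest_keywords("Drone imagery collected with SAR"))
```

Mapping every variant to a single preferred term is what lets a search on one label find assets that were described with another.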

4

Taxonomy Development Framework

Approved for public release 13-140

5

6

Educate team members

7

Build consensus, not necessarily unanimity

8

Document decisions

9

A network of experts is essential

10

Establish guidelines when working with SMEs

11

Graphical representation is helpful when working with SMEs

12

Establish file naming conventions
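
The slide does not say which convention the team adopted. Purely as an illustration, the sketch below checks file names against a hypothetical pattern of the form `<name>_v<major>.<minor>_<YYYYMMDD>.<ext>`; the pattern, the allowed extensions, and the function name `parse_export_name` are assumptions, not the project's actual rules.

```python
import re
from datetime import datetime

# Hypothetical convention: <name>_v<major>.<minor>_<YYYYMMDD>.<ext>
PATTERN = re.compile(
    r"^(?P<name>[a-z0-9]+(?:-[a-z0-9]+)*)"
    r"_v(?P<major>\d+)\.(?P<minor>\d+)"
    r"_(?P<date>\d{8})"
    r"\.(?P<ext>xml|csv|xlsx)$"
)


def parse_export_name(filename):
    """Return the parsed parts of a conforming name, or None otherwise."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    try:
        # Reject impossible dates such as month 13.
        date = datetime.strptime(match.group("date"), "%Y%m%d").date()
    except ValueError:
        return None
    return {
        "name": match.group("name"),
        "version": (int(match.group("major")), int(match.group("minor"))),
        "date": date,
        "ext": match.group("ext"),
    }


print(parse_export_name("isr-operations_v1.2_20140211.xml"))
print(parse_export_name("ISR final FINAL.xml"))
```

Whatever the real convention is, encoding the version and date in the name makes it possible to sort and validate exports automatically.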

13

Aggregation can present a challenge

14

Authoritative Sources May Not Exist

15

Maintaining terms is complex and resource intensive

16

Export terms in a machine-readable language
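
The slide does not name the export format. As one possibility, the sketch below serializes a few terms as SKOS concepts in RDF/XML using Python's standard library. The choice of SKOS, the `Term` class, the base URI, and the sample terms are all assumptions for illustration and do not come from the ISR Operations taxonomy itself.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/isr-taxonomy/"  # hypothetical base URI


@dataclass
class Term:
    term_id: str
    preferred: str                                           # preferred term (PT)
    non_preferred: List[str] = field(default_factory=list)   # NPTs
    broader: Optional[str] = None                            # parent term_id


def to_skos_xml(terms):
    """Serialize terms as skos:Concept elements inside an rdf:RDF root."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("skos", SKOS)
    root = ET.Element(f"{{{RDF}}}RDF")
    for term in terms:
        concept = ET.SubElement(
            root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": BASE + term.term_id}
        )
        ET.SubElement(concept, f"{{{SKOS}}}prefLabel").text = term.preferred
        for alt in term.non_preferred:
            ET.SubElement(concept, f"{{{SKOS}}}altLabel").text = alt
        if term.broader:
            ET.SubElement(
                concept, f"{{{SKOS}}}broader", {f"{{{RDF}}}resource": BASE + term.broader}
            )
    return ET.tostring(root, encoding="unicode")


sample = [
    Term("platform", "Platform"),
    Term("uas", "Unmanned aircraft system", ["UAV", "Drone"], "platform"),
]
print(to_skos_xml(sample))
```

Here preferred terms become `skos:prefLabel`, non-preferred terms become `skos:altLabel`, and the hierarchy is carried by `skos:broader`, so a consuming system can rebuild the tree without the authoring tool.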

17

Understand posting requirements when working with repositories

18

User Feedback

• Human Readable

– Download metrics on the Data Services Environment reflect interest in the ISR taxonomies

– Opinions differ on the categorization of terms and on the overall hierarchy

• Machine Ingestible

– Longer preferred term (PT) and non-preferred term (NPT) names may produce unreadable user interface (UI) list labels and may also affect search
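
One way to catch the label-length issue before export is to flag long names during maintenance. The sketch below is only illustrative: the 40-character limit, the function names, and the sample labels are assumptions, since the slides do not state what length the consuming UI can display.

```python
def find_long_labels(labels, max_len=40):
    """Return the labels that exceed max_len characters."""
    return [label for label in labels if len(label) > max_len]


def shorten(label, max_len=40):
    """Truncate a label for display, marking the cut with an ellipsis."""
    if len(label) <= max_len:
        return label
    # Reserve one character for the ellipsis and drop trailing spaces.
    return label[: max_len - 1].rstrip() + "…"


labels = [
    "Sensor",
    "Airborne multi-sensor wide-area persistent surveillance platform",
]
for label in find_long_labels(labels):
    print(shorten(label))
```

Truncation only addresses display; the full name would still need to be kept for search, which the feedback suggests is also affected.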

19

Summary

• Successfully answered our research questions

– Can an adequate unclassified ISR Operations taxonomy be built from open source material?

– Can an unclassified ISR Operations taxonomy be designed in a way that it is repeatable and understandable?

– How can an ISR Operations taxonomy be exported to a machine-readable language and used to facilitate searches and provide terms for population of metadata?

• The nine-step framework was essential to our proof of concept, but not without its limitations
