Data Management: Managing Big Data
Briefing 10/2012
Will Graves, US-VISIT Chief Biometric Engineer
Chair of Biometric Domain
Established in 2003, US-VISIT was one of the initial programs at the Department of Homeland Security (DHS).
In 2007, US-VISIT became part of the DHS National Protection and Programs Directorate (NPPD).
US-VISIT AT A GLANCE
Vision – A more secure Nation through advanced biometric and biographic identification, information sharing, and analysis.
GUIDING PRINCIPLES
FACILITATE legitimate travel and trade
ENHANCE the security of our citizens and visitors
ENSURE the integrity of our borders
PROTECT the privacy of our visitors
INFORMATION REPOSITORIES
A DAY IN THE LIFE – BIOMETRICS*
*Data as of March 2011.
A DAY IN THE LIFE – BIOGRAPHIC OVERSTAY ACTIONS*
*Data as of March 2011.
Big Data
Big Data includes data sets whose size is beyond the ability of commonly used software tools to capture, curate, manage, and process within a tolerable elapsed time.
Big Data: datasets so large that they are difficult to work with using conventional toolsets (terabytes, exabytes, …).
Big Data as it relates to biometrics is the capture, collection, mapping, analysis, housing, and retrieval of biometrics-related data, with properties such as privacy, availability, reliability, maintainability, security, usability, performance, prediction, prevention, and detection.
Big Data - Factsheet
1996: A Teradata database becomes the world's largest database at 11 terabytes
1999: Teradata customer has world's largest database with 130 terabytes
2000: Oracle claimed the largest warehouse at ~140 TB, hosted on two massive Oracle instances
… today 140 TB is almost passé
• Cost of Big Data storage is dropping
• In 2000, a GB averaged $16.06; a 1 TB data warehouse was rare
• Today, a GB averages $0.0621; a terabyte can be had for < $100
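The per-gigabyte prices above can be turned into a quick back-of-the-envelope comparison. The sketch below is illustrative only: it assumes a decimal terabyte (1 TB = 1000 GB) and counts raw storage price alone, and the function and variable names are hypothetical.

```python
# Per-gigabyte prices quoted on the slide (US dollars).
PRICE_PER_GB_2000 = 16.06
PRICE_PER_GB_TODAY = 0.0621

GB_PER_TB = 1000  # assumption: decimal terabyte; use 1024 for binary units


def storage_cost(terabytes: float, price_per_gb: float) -> float:
    """Return the raw storage cost in dollars for the given capacity."""
    return terabytes * GB_PER_TB * price_per_gb


# Cost of a 1 TB warehouse, and of the 140 TB warehouse from the factsheet.
cost_1tb_2000 = storage_cost(1, PRICE_PER_GB_2000)
cost_1tb_today = storage_cost(1, PRICE_PER_GB_TODAY)
cost_140tb_2000 = storage_cost(140, PRICE_PER_GB_2000)
cost_140tb_today = storage_cost(140, PRICE_PER_GB_TODAY)

# How many times cheaper a gigabyte is today than in 2000.
price_drop_factor = PRICE_PER_GB_2000 / PRICE_PER_GB_TODAY

print(f"1 TB in 2000:    ${cost_1tb_2000:,.2f}")
print(f"1 TB today:      ${cost_1tb_today:,.2f}")
print(f"140 TB in 2000:  ${cost_140tb_2000:,.2f}")
print(f"140 TB today:    ${cost_140tb_today:,.2f}")
print(f"Price drop:      about {price_drop_factor:.0f}x")
```

Under these assumptions a terabyte comes in well under $100 at today's quoted price, consistent with the last bullet.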
Big Data Challenges
Storing unprecedented volumes of data
• Rate of biometric data in production is increasing
• Size of biometric data is increasing (e.g., DNA)
Extracting and visualizing the data in ways that are helpful to users
• Metadata and semantics for describing content
Finding what is needed, in the context of the mission
• Semantically enabled search engines that can use the context
Eliminating what is not needed
• Either after the data is used or before deciding to store it
Governing data collections within the COI
• Effective governance of data resources
• Quality control structures
Big Data - Taking Control
Employ a wide variety of data techniques and technologies in the areas of:
Data aggregation: query languages
Data manipulation: ETL
Data analysis: query plans, cost estimation
Data visualization: formatting
Data organization: data modeling, storing data
Data retrieval: query optimization
Data Integrity
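Several of the areas listed above (ETL, integrity checks, aggregation through a query language) can be shown together in a few lines. This is a minimal sketch using Python's built-in sqlite3 module. The record layout, the table name `encounter`, and the sample values are all hypothetical, not drawn from any US-VISIT system.

```python
import sqlite3

# Hypothetical raw records: (encounter_id, modality, site, size_kb).
RAW_RECORDS = [
    ("e1", " Fingerprint ", "airport", "512"),
    ("e2", "fingerprint", "seaport", "480"),
    ("e3", "IRIS", "airport", "300"),
    ("e4", "face", "airport", None),  # missing size: dropped during transform
    ("e5", "Face", "land", "150"),
]


def transform(row):
    """Normalize one raw record, or return None if it fails validation."""
    encounter_id, modality, site, size_kb = row
    if size_kb is None:
        return None
    return (encounter_id, modality.strip().lower(), site, int(size_kb))


# Extract and transform: keep only the records that pass validation.
clean_rows = [r for r in map(transform, RAW_RECORDS) if r is not None]

# Load: constraints in the schema act as a second integrity check.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE encounter ("
    " encounter_id TEXT PRIMARY KEY,"
    " modality TEXT NOT NULL,"
    " site TEXT NOT NULL,"
    " size_kb INTEGER NOT NULL CHECK (size_kb > 0))"
)
conn.executemany("INSERT INTO encounter VALUES (?, ?, ?, ?)", clean_rows)

# Aggregate: record count and total size per modality.
summary = conn.execute(
    "SELECT modality, COUNT(*), SUM(size_kb) "
    "FROM encounter GROUP BY modality ORDER BY modality"
).fetchall()

for modality, count, total_kb in summary:
    print(f"{modality:12s} records={count} total_kb={total_kb}")
```

Validation runs twice on purpose: once in the transform step, and again through the NOT NULL and CHECK constraints in the table definition.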
Steps In Preparing To Handle Big Data
Requirements Analysis – User needs, policy mapping, what the database must do
Conceptual Design – High-level description (often done with an ER model)
Logical Design – Translate the ER model into the DBMS data model
Schema Refinement – Consistency, normalization
Physical Design – Indexes, disk layout, conversions
Security Design – Who accesses what, and how
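The later steps above can be illustrated on a toy scale. The sketch below assumes a hypothetical two-entity ER model (a person has many biometric samples) and uses SQLite through Python's sqlite3 module. All table, column, index, and view names are invented for illustration. SQLite has no role-based grants, so a restricted view stands in for the access control that a production DBMS would enforce with roles and privileges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
-- Logical design and schema refinement: one table per entity,
-- with the one-to-many relationship expressed as a foreign key.
CREATE TABLE person (
    person_id   INTEGER PRIMARY KEY,
    family_name TEXT NOT NULL
);
CREATE TABLE biometric_sample (
    sample_id   INTEGER PRIMARY KEY,
    person_id   INTEGER NOT NULL REFERENCES person(person_id),
    modality    TEXT NOT NULL,
    captured_on TEXT NOT NULL
);
-- Physical design: index the column used for lookups.
CREATE INDEX idx_sample_person ON biometric_sample(person_id);
-- Security design (stand-in): a view that hides the link to the person.
CREATE VIEW sample_summary AS
    SELECT sample_id, modality, captured_on FROM biometric_sample;
""")

conn.execute("INSERT INTO person VALUES (1, 'Example')")
conn.execute(
    "INSERT INTO biometric_sample VALUES (10, 1, 'fingerprint', '2011-03-01')"
)

# Consistency: a sample that points at a non-existent person is rejected.
try:
    conn.execute("INSERT INTO biometric_sample VALUES (11, 99, 'iris', '2011-03-02')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Ask the planner how it would run the lookup; the last column is the plan detail.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM biometric_sample WHERE person_id = ?", (1,)
).fetchall()
plan_text = " ".join(row[3] for row in plan)

# Columns exposed through the restricted view.
view_columns = [d[0] for d in conn.execute("SELECT * FROM sample_summary").description]

print("orphan rejected:", orphan_rejected)
print("query plan:", plan_text)
print("view columns:", view_columns)
```

The exact wording of the query plan varies between SQLite versions, but it should name the index rather than report a full table scan.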
Harmonizing Big Data
Other Use Cases
• Availability and Visualization
• Maintainability
• Reliability
• Performance and Usability
• Prediction and Prevention
• Detection and Security
• System and biometric algorithm improvement
CONTACT INFORMATION
William Graves
Chief Biometric Engineer
US-VISIT Program
Information Sharing and Technical Assistance Branch
Tel: 202-298-5230