Data Management Needs and Challenges for Telemetry Scientists
Josh M London
Wildlife Biologist, Polar Ecosystems Program
National Marine Mammal Laboratory
NOAA NMFS Alaska Fisheries Science Center
Temptation to identify biologists as the source for the raw data
The Tip of a Complex Iceberg
hypothesis
agency needs/mandates
funding initiatives
opportunistic vs. planned
tag design/vendor
tag programming
Deployment of tags (location, age/sex, time)
Data Management
data quality control
synthesis
movement model
Publications
Contract reports
Status/Listing Review
derived products
Field Work and Study Design
Narrowing Bottleneck
Many biologists lack the skills and training for effective, scalable database design and data management practices
Field Work & Tag Deployment
When? Where? Which Tag/Vendor? Which Age? Which Sex? (Do we have a choice?)
Tag Programming
Deployment Length (attachment type)
Limited Tools for Managing Raw Telemetry Data
‘raw’ data
via Argos as CSV/Text
Process w/ Vendor Software (behavior data)
Typically output as CSV
Field data about animal (e.g. ID, species, sex, age, health)
needs
Explore ‘raw’ data
Address hypotheses
Visualize movement/use
Synthesize w/ dependent (e.g. health, age) and independent data (e.g. other animals, remote sensed)
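As a rough illustration of the "synthesize" need above, the sketch below joins location records to field data about each animal using only the Python standard library. The column names, deployment IDs, and species values are hypothetical and are not taken from any real Argos export or vendor format; a real export would need its own parsing rules.

```python
import csv
import io

# Hypothetical location records, shaped like a simplified CSV export.
LOCATIONS_CSV = """deploy_id,timestamp,latitude,longitude,quality
D001,2010-05-01T03:10:00,58.10,-170.25,3
D001,2010-05-01T09:42:00,58.31,-170.02,1
D002,2010-05-01T04:55:00,60.72,-172.40,2
D999,2010-05-02T00:00:00,61.00,-173.00,B
"""

# Hypothetical field data recorded when each tag was deployed.
ANIMALS_CSV = """deploy_id,species,sex,age_class
D001,ribbon seal,F,adult
D002,spotted seal,M,subadult
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def attach_field_data(locations, animals):
    """Join each location to its animal record; collect unmatched deploy IDs."""
    by_deploy = {row["deploy_id"]: row for row in animals}
    joined, unmatched = [], set()
    for loc in locations:
        animal = by_deploy.get(loc["deploy_id"])
        if animal is None:
            # Keep track of tags that have locations but no field record.
            unmatched.add(loc["deploy_id"])
            continue
        joined.append({**loc, **animal})
    return joined, sorted(unmatched)


joined, unmatched = attach_field_data(
    read_rows(LOCATIONS_CSV), read_rows(ANIMALS_CSV)
)
for row in joined:
    print(row["deploy_id"], row["species"], row["sex"],
          row["latitude"], row["longitude"])
print("locations with no field data:", unmatched)
```

Reporting the unmatched deployment IDs, rather than silently dropping them, is one simple way to surface gaps between the telemetry stream and the field records.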
Biologists Not Trained in Large Scale Data Management
Biologists
Excel and/or Access
ESRI ArcMap (shapefiles)
Google Earth
Mouse Click Interaction
Programming (Visual Basic, R, Python) recipe driven … not developers
Data Manager
Postgres/PostGIS, Oracle, MySQL, SQL Server
Normalization and Efficient Design
Scripting, Jobs, Transactions
Data Integrity
Automation, Reproducible
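To make the data manager's vocabulary more concrete, here is a minimal sketch of a normalized layout with integrity checks and a transaction, using SQLite from the Python standard library as a stand-in for the server databases listed above. The table and column names are assumptions made for illustration, not the schema of any existing system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# Normalized design: animal facts are stored once, deployments point to
# animals, and locations point to deployments (no large flat table).
conn.executescript("""
CREATE TABLE animal (
    animal_id INTEGER PRIMARY KEY,
    species   TEXT NOT NULL,
    sex       TEXT CHECK (sex IN ('F', 'M', 'U'))
);
CREATE TABLE deployment (
    deploy_id   TEXT PRIMARY KEY,
    animal_id   INTEGER NOT NULL REFERENCES animal(animal_id),
    deployed_on TEXT NOT NULL
);
CREATE TABLE location (
    deploy_id TEXT NOT NULL REFERENCES deployment(deploy_id),
    fix_time  TEXT NOT NULL,
    latitude  REAL NOT NULL CHECK (latitude BETWEEN -90 AND 90),
    longitude REAL NOT NULL CHECK (longitude BETWEEN -180 AND 180),
    PRIMARY KEY (deploy_id, fix_time)
);
""")

# "with conn" commits on success and rolls back if an exception is raised.
with conn:
    conn.execute("INSERT INTO animal VALUES (1, 'seal', 'F')")
    conn.execute("INSERT INTO deployment VALUES ('D001', 1, '2010-05-01')")

batch = [
    ("D001", "2010-05-01T03:10:00", 58.10, -170.25),
    ("D404", "2010-05-01T04:00:00", 60.00, -171.00),  # no such deployment
]
try:
    with conn:
        conn.executemany("INSERT INTO location VALUES (?, ?, ?, ?)", batch)
except sqlite3.IntegrityError as exc:
    # The whole batch is undone, including the valid first row.
    print("batch rejected:", exc)

n_locations = conn.execute("SELECT COUNT(*) FROM location").fetchone()[0]
print("locations stored:", n_locations)
```

Rejecting the whole batch is a design choice: a partially loaded file is harder to diagnose later than a load that clearly failed.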
My Perspective
To address complex questions related to marine mammal telemetry and understanding animal ecology, I had to become more of a data manager …
And, in the process, I’ve become less of a biologist
Start (2006)
Argos Monthly CDs
SatPack
Access Database
Excel Files (limited to 65k rows)
Large, Flat Tables
No Central Repository
Current System
Nightly FTP Argos Push
Nightly Data Processing
CSV/External Oracle Table
PL/SQL Procedures
Developed/Designed with Training via Google Search
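The general shape of a nightly processing step can be sketched as a job that loads any newly delivered CSV files into a staging table and is safe to rerun. This is not the Oracle external-table and PL/SQL system described above; it swaps in SQLite and plain Python, and the file layout, table names, and `load_new_files` function are all hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def load_new_files(conn, drop_dir):
    """Load each CSV in drop_dir not yet recorded in load_log; return rows added."""
    conn.execute("CREATE TABLE IF NOT EXISTS load_log (file_name TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging_location ("
        "deploy_id TEXT, fix_time TEXT, latitude REAL, longitude REAL, "
        "PRIMARY KEY (deploy_id, fix_time))"
    )
    added = 0
    for path in sorted(Path(drop_dir).glob("*.csv")):
        seen = conn.execute(
            "SELECT 1 FROM load_log WHERE file_name = ?", (path.name,)
        ).fetchone()
        if seen:
            continue  # file was handled by an earlier run
        # One transaction per file: the rows and the log entry commit together.
        with path.open(newline="") as handle, conn:
            for row in csv.DictReader(handle):
                cur = conn.execute(
                    "INSERT OR IGNORE INTO staging_location VALUES (?, ?, ?, ?)",
                    (row["deploy_id"], row["fix_time"],
                     float(row["latitude"]), float(row["longitude"])),
                )
                added += cur.rowcount  # 0 when a duplicate fix is ignored
            conn.execute("INSERT INTO load_log VALUES (?)", (path.name,))
    return added


# Demonstration with a temporary directory standing in for the FTP drop folder.
with tempfile.TemporaryDirectory() as drop_dir:
    Path(drop_dir, "2010-05-01.csv").write_text(
        "deploy_id,fix_time,latitude,longitude\n"
        "D001,2010-05-01T03:10:00,58.10,-170.25\n"
        "D001,2010-05-01T09:42:00,58.31,-170.02\n"
    )
    conn = sqlite3.connect(":memory:")
    first_run = load_new_files(conn, drop_dir)
    second_run = load_new_files(conn, drop_dir)  # same files: nothing new to load
    print(first_run, second_run)
```

Recording each processed file in a log table is what makes the job repeatable: a run that is started twice, or restarted after a failure, does not duplicate records.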
My Perspective
Current Limitations
Data access requires a minimum level of technical skills (basic SQL, Oracle framework, Oracle APEX, R spatial tools, ArcMap)
Single Point of Access/Failure (me)
Limited Documentation of Design
Design May Not be Optimal/Appropriate
Main Objective to Provide Data to Analysts – Not necessarily designed for providing data to public
My Perspective
Greatest Needs – Research Program
Data Management and Design Consultation
Data Design & Documentation Portal (user-friendly metadata)
Low Tech Exploration Tools
Database and Application Developers (data flow and data input)
Training Opportunities
My Perspective
Greatest Needs – External to Program?
Provide Meaningful Public Access to Data
A Clear Data Sharing Policy w/ Best Practices
Encourage/Facilitate Scientific Collaboration
Meet Agency Needs and Requirements
How to Communicate Scientific Knowledge in the Modern/Digital Age – sharing knowledge/expertise just as important as sharing data
Publish Data Once
My Perspective
Challenges / Road Blocks
Limited Funds and Priorities – appropriate resources for doing the priority analysis and science not available, let alone the resources to distribute data responsibly
Database design/management often in the hands of the least skilled users
IT Policies, Investments, and Infrastructure Varied Across Institutions
No standard(s) for communicating and sharing ‘raw’ animal telemetry data. What is ‘raw’ data?