![Page 1: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/1.jpg)
USGS Bioinformatics Activities
Ecoinformatics
January 2010
Gladys Cotter
Mike Frame
![Page 2: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/2.jpg)
Topics for Discussion
1. USGS Bioinformatics Activities
2. Potential areas of collaboration
3. Questions
![Page 3: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/3.jpg)
Bioinformatics
USGS NBII – addressing bioinformatics challenges through collaboration, content development, technology, and creating long-term infrastructure
Collecting
• Tools • Protocols • Standards
Linking
• Cross-referencing • Relationship of data
Storage
• DBMS • Central & Distributed • Security • Backups • Archival • Standards
Organization
• Structure • Governance • Standards • Policies
Integration
• Multi-levels • Difficult • Mashups • Standards
Analysis, Synthesis
• Tools • Standards • Usability • Training • Non-biased
Delivery
• Tools • Governance • Infrastructure • User analysis
Applications
• Tools • Protocols • Standards
• Fusion • Blending • Related Integration • Analysis • Models
for
• Research • Decision Making • Policies • Education • Outreach
Sustainable, Reliable, Outreach, Training
![Page 4: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/4.jpg)
Biological Spatial Infrastructure
NBII
• Over 72,000 records
• Based on FGDC BDP
• Training Program
• QA/QC Program
• Standards Cross-walks: EML, Dublin Core
• Establishing Administrative Tools
• Expanding internationally
• Embedding in-line visualization
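A standards crosswalk of the kind listed above can be pictured as a field-to-field mapping. The sketch below is only an illustration: the FGDC-style short names and their Dublin Core targets are assumed pairings chosen for the example, not the NBII crosswalk, and `taxon_note` is a hypothetical field with no mapping.

```python
# Illustrative crosswalk: FGDC-style short element names -> Dublin Core terms.
# These pairs are assumptions for this sketch, not the NBII crosswalk.
FGDC_TO_DC = {
    "title": "title",
    "origin": "creator",
    "abstract": "description",
    "pubdate": "date",
    "themekey": "subject",
}


def crosswalk(record, mapping=FGDC_TO_DC):
    """Return (converted, unmapped): mapped fields plus the source keys left over."""
    converted = {}
    unmapped = []
    for key, value in record.items():
        target = mapping.get(key)
        if target is None:
            unmapped.append(key)  # report gaps instead of silently dropping them
        else:
            converted[target] = value
    return converted, sorted(unmapped)


# Hypothetical record; the values are invented for the example.
sample = {
    "title": "Example bird survey",
    "origin": "Example Agency",
    "pubdate": "2009",
    "taxon_note": "Aves",
}
dc_record, leftover = crosswalk(sample)
```

Returning the unmapped keys alongside the converted record makes the information lost in a crosswalk visible, which is one reason a QA/QC step usually accompanies this kind of conversion.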
![Page 5: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/5.jpg)
World Data Center for Biodiversity & Ecology
• World Data System created through the International Council of Scientific Unions (ICSU) in 1957
• Currently 50 World Data Centers (WDC) in place internationally
• USGS National Biological Information Infrastructure (NBII) network designated as the WDC for Biodiversity & Ecology in 2002
![Page 6: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/6.jpg)
WDC Current Activities
• Renewable Energy Project Prequalification Demonstration project
– Goal: support rapid prequalification of sites across the nation that are potentially suitable for renewable energy (with an initial focus on federal lands).
– Data sets include, but are not limited to:
  • Land Cover (GAP)
  • Protected areas/Stewardship (GAP)
  • Species Distributions/Habitat Affinities (GAP)
  • Species Occurrences (US-GBIF Mirror Site and NBII)
  • Integrated Taxonomic Information System (ITIS)
  • Topography (USGS)
  • Landforms (USGS/GAM)
  • Soil Moisture (USGS/GAM)
  • Ecosystems (USGS/GAM)
  • Renewable Energy Potential (i.e., wind, solar, geothermal, and biofuels; NREL)
  • Infrastructure (i.e., power grid, projected smart grid, and roads; NREL and USGS)
• Protected areas – working with WDPA, USGS GAP
• Sponsoring WDC for Biodiversity & Human Health
– South Africa is hosting
– Providing workshops, training, demonstration projects
– Evaluating how to leverage ILTER activities
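The site prequalification idea in the renewable energy project amounts to screening candidate sites against several data layers at once. A minimal sketch, assuming each site has already been summarized into a few attributes: the `Site` fields, the thresholds, and the sample sites are all hypothetical and do not come from the project itself.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    protected: bool      # e.g. derived from a protected-areas layer
    wind_class: int      # hypothetical resource score, higher is better
    km_to_grid: float    # hypothetical distance to the nearest power line


def prequalify(sites, min_wind_class=3, max_km_to_grid=25.0):
    """Keep sites that pass every screen; the thresholds are illustrative only."""
    return [
        s.name
        for s in sites
        if not s.protected
        and s.wind_class >= min_wind_class
        and s.km_to_grid <= max_km_to_grid
    ]


# Invented candidates for the example.
candidates = [
    Site("A", protected=False, wind_class=4, km_to_grid=10.0),
    Site("B", protected=True, wind_class=5, km_to_grid=5.0),
    Site("C", protected=False, wind_class=2, km_to_grid=8.0),
    Site("D", protected=False, wind_class=3, km_to_grid=40.0),
]
passed = prequalify(candidates)
```

In practice each attribute would come from a spatial overlay of the listed data sets; the filter step itself stays this simple once those overlays are done.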
![Page 7: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/7.jpg)
Multilingual IABIN Catalog
IABIN TNMap interface
Ability to search by:
• Resource Type
• Language
• Taxonomy
• Multi-lingual thesaurus
Thesaurus web-services
• English
• Spanish
• Portuguese
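A multilingual thesaurus lets a search in one language match records described in another. The sketch below shows the idea with an in-memory lookup; the entries, the concept ids, and the `expand_term` helper are hypothetical and are not taken from the IABIN thesaurus web services.

```python
# Hypothetical thesaurus entries keyed by a concept id (not IABIN data).
THESAURUS = {
    "c1": {"en": "wetlands", "es": "humedales", "pt": "zonas úmidas"},
    "c2": {"en": "birds", "es": "aves", "pt": "aves"},
}


def expand_term(term, thesaurus=THESAURUS):
    """Return every language variant of `term`, or just the term if it is unknown."""
    needle = term.strip().lower()
    variants = set()
    for labels in thesaurus.values():
        if needle in labels.values():
            variants.update(labels.values())
    return sorted(variants) if variants else [needle]


# A Spanish query term expands to its English and Portuguese equivalents.
query_terms = expand_term("Humedales")
```

A catalog search would then run once per expanded term and merge the results, so a Spanish query can also find English and Portuguese records.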
![Page 8: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/8.jpg)
NBII Search
Unique Facets
Dynamic biological clusters
Refine Results
Biological images
Map Display
![Page 9: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/9.jpg)
Additional Unique Facets
Thesaurus integration
Publisher refinement
Diverse Sources: DBMS, Websites, Federation, Documents
Weighting of sources
![Page 10: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/10.jpg)
Integrated Taxonomic Information System
• Multi-agency partnership
• Primarily North American taxa
• Used globally
• Web services released Summer 2009
• Taxonomic Workbench 2010
![Page 11: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/11.jpg)
NBII Species Mashups
• Designed for
– One-stop shop for species information in SE
– Integrate diverse sources
• Content Type
• UI Presentation
![Page 12: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/12.jpg)
USGS Data Integration
3 Major Goals:
1. Establishing corporate data available via ESRI services
2. Improving access to modeling data, including water quality, stream, etc.
3. Providing easy-to-use “data upload”, “registry”, and “discovery tools”
![Page 13: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/13.jpg)
North American EOL
• Multi-agency partnership designed to develop a prototype for “species information” within the Great Lakes and Chesapeake Bay regions
![Page 14: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/14.jpg)
NSF DataNet Grant Background
• NSF solicitation to establish
– Long-term archives for science data
– Develop sustainable business model to support these activities
– Involve multi-disciplinary domains
– Develop various R&D needed to support effort
– Provide ongoing “operational” support
Funded 2:
DataONE
The Data Conservancy
![Page 15: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/15.jpg)
DataONE
Areas of emphasis
• Data loss: preserving all the work that has been done; by preserving at-risk (orphaned) biological, ecological, and environmental data from individual scientists
• Data dispersion: finding the needle in the haystack; by facilitating discovery and access of data through a single easy-to-use portal
• Data deluge: navigating the flood of increasingly heterogeneous data; by providing a toolbox that empowers scientists and organizations to more easily and effectively manage, analyze, and synthesize data
• Data Practice: using the best tools to do the job; by creating an informatics-literate workforce through innovative outreach and training efforts (e.g., best-practice videos, podcasts, on-line certificate programs, downloadable best practice guides and exemplars of data management plans)
![Page 16: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/16.jpg)
DataONE Technology Directions
• DataONE will enable new science and knowledge creation through universal access to data about life on earth and the environment that sustains it by:
– making the scientist an active member of the data preservation process,
– creating cyberinfrastructure that supports the full data life cycle,
– promulgating cultural changes that value data stewardship and data sharing,
– broadly promoting best practices,
– engaging citizens in science,
– domain-agnostic solutions
![Page 17: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/17.jpg)
Partnering organizations
• Libraries & digital libraries
• Academic institutions
• Research networks
• NSF- and government-funded synthesis & supercomputer centers/networks
• Governmental organizations
• International organizations
• Data and metadata archives
• Professional societies
• NGOs
• Commercial sector
![Page 18: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/18.jpg)
Why is this relevant to Ecoinformatics?
• Share similar cyberinfrastructure needs
– Architecture
– Portals
– Distributed approaches
– Replication
– Secure, controlled access
– Authentication methods
• Tools deployed and supported
– Data discovery & interoperability methods
– Standards developed, deployed
– Life Cycle Data Management tools (i.e., Investigator toolkit, CI)
• R&D activities in the areas of CS, IS, SS, GIS, Env., etc.
• Opportunity for broad Governmental & International Participation (i.e., working groups, tool evaluations, etc.)
• Complementary to several of our groups' goals, projects, activities
• Potential Microsoft-related projects (i.e., MS Excel)
![Page 19: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/19.jpg)
Potential areas of collaboration
• NBII Metadata Expansion
• Incorporation of additional species data into NA EOL, NBII Species Mashups, etc.
• USGS Data Integration activities
• NSF DataONE Grant
• Potential Microsoft tools
![Page 21: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/21.jpg)
Technical Architecture & Discussions
DataONE: Enabling Data-Intensive Biological and Environmental Research
![Page 22: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/22.jpg)
Existing biological data archives
ESA’s Ecological Archive
Long Term Ecological Research Network
Fire Research & Management Exchange System
National Biological Information Infrastructure
Distributed Active Archive Center
Knowledge Network for Biocomplexity
![Page 23: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/23.jpg)
Example data holdings
| Data Archive | Types of Data Managed | Metadata Standard(s) |
| --- | --- | --- |
| | Biodiversity, taxonomic, ecological | BDP, DwC, DC, OGIS |
| | Biogeochemical dynamics, terrestrial ecological, Earth observation imagery | DIF, BDP, ECHO |
| | Ecological, biodiversity, biophysical, social, genomics, and taxonomic | EML |
| | Avian populations and molecular biology | DwC |
| | Biological and taxonomic | DC subset |
| | Biophysical, biodiversity, disturbance, and Earth observation imagery | EML |
| | Biodiversity, biotic structure, function/process, biogeochemical, climate, and hydrologic | EML |

Metadata Interoperability Across Data Holdings
• EML = Ecological Metadata Language
• BDP = Biological Data Profile
• DwC = Darwin Core
• DC = Dublin Core
• DC subset = Dublin Core subset
• ECHO = EOS ClearingHOuse
• OGIS = OpenGIS
• DIF = Directory Interchange Format
![Page 24: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/24.jpg)
Distributed framework
Member Nodes
• diverse institutions
• serve local community
• provide resources for managing their data
Coordinating Nodes
• retain complete metadata catalog
• subset of all data
• perform basic indexing
• provide network-wide services
• ensure data availability (preservation)
• provide replication services
Flexible, scalable, sustainable network
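The division of labor between Member Nodes and Coordinating Nodes can be sketched in a few lines. This is an in-memory toy under stated assumptions, not the DataONE API: the class names, the `copies` parameter, and the title-only search are all hypothetical simplifications.

```python
class MemberNode:
    """Holds data objects for a local community (hypothetical, in-memory)."""

    def __init__(self, name):
        self.name = name
        self.objects = {}  # identifier -> bytes

    def store(self, identifier, data):
        self.objects[identifier] = data


class CoordinatingNode:
    """Keeps the network-wide metadata catalog and tracks where replicas live."""

    def __init__(self, members, copies=2):
        self.members = list(members)
        self.copies = copies  # assumed replication target
        self.catalog = {}     # identifier -> {"metadata": dict, "holders": [names]}

    def register(self, origin, identifier, data, metadata):
        origin.store(identifier, data)
        holders = [origin.name]
        # Replicate to other member nodes until the target number of copies is met.
        for node in self.members:
            if len(holders) >= self.copies:
                break
            if node is not origin:
                node.store(identifier, data)
                holders.append(node.name)
        self.catalog[identifier] = {"metadata": metadata, "holders": holders}

    def search(self, keyword):
        """Basic indexing stand-in: match the keyword against catalog titles."""
        kw = keyword.lower()
        return sorted(
            ident
            for ident, entry in self.catalog.items()
            if kw in entry["metadata"].get("title", "").lower()
        )


nodes = [MemberNode("node-a"), MemberNode("node-b"), MemberNode("node-c")]
cn = CoordinatingNode(nodes, copies=2)
cn.register(nodes[1], "obj-1", b"site,count\n1,12\n", {"title": "Stream survey"})
```

The coordinating node never needs to hold the data itself in this sketch; it only records the metadata and which nodes hold copies, which mirrors the "complete metadata catalog, subset of all data" split on the slide.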
![Page 25: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/25.jpg)
Supporting the data lifecycle
UCSB Node
UNM Node
ORC Node
The data lifecycle:
1. Deposition/acquisition/ingest
2. Curation and metadata management
3. Protection, including privacy
4. Discovery, access, use, and dissemination
5. Interoperability, standards, and integration
6. Evaluation, analysis, and visualization
![Page 26: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/26.jpg)
Use Cases, Architecture Planning
http://mule1.dataone.org/ArchitectureDocs/index.html
![Page 27: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/27.jpg)
Changing science culture
1. Education and training
2. Engaging citizens in science
3. Building global communities of practice
![Page 28: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/28.jpg)
Education and training
Career Long Learning:
• best practice guides
• exemplary data management plans
• podcasts, web-casts
• workshops and seminars
• downloadable curricula
Best Practice Guide: How to Cite Your Data (6 in a series)
Best Practice Guide: Using Metadata for e-research (5 in a series)
Gold Star Data Management Plan: Here’s How
![Page 29: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/29.jpg)
www.CitizenScience.org
Engaging citizens in science
![Page 30: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/30.jpg)
Building global long-lived communities of practice:
• Broad, active community engagement
– Involvement of library and science educators engaging new generations of students in best practices
– Existing outreach and education programs
• Transparent, participatory governance
• Adoption/creation of innovative and sustainable business and organizational models
![Page 31: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/31.jpg)
DataONE organization chart (box labels)
NSF
DataNet Partners
External Advisory Committee
DIUG
Principal Investigator
Executive Director
Leadership Team
DataONE Office
Director, Development & Operations
R&D
Operations
Core CI Team
Coordinating Nodes
Member Nodes
Director, Community Engagement & Outreach
Education and Outreach Team
Engagement Working Groups
• Sociocultural barriers to data sharing and preservation
• Long-term sustainability and governance
• Community engagement and education
• Citizen science and public outreach
• Usability and assessment
Infrastructure and Research Working Groups
• Data integration and semantics
• Data preservation, metadata, and interoperability
• Distributed storage
• Federated security
• Scientific workflows
• Exploration, Visualization, Analysis
• Usability and assessment
![Page 32: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/32.jpg)
Why is this relevant to Ecoinformatics?
• Share similar cyberinfrastructure needs
– Architecture
– Portals
– Distributed approaches
– Replication
– Secure, controlled access
– Authentication methods
• Tools deployed and supported
– Data discovery & interoperability methods
– Standards developed, deployed
– Life Cycle Data Management tools (i.e., Investigator toolkit, CI)
• R&D activities in the areas of CS, IS, SS, GIS, Env., etc.
• Opportunity for broad Governmental & International Participation (i.e., working groups, tool evaluations, etc.)
• Complementary to several of our groups' goals, projects, activities
• Potential Microsoft-related projects (i.e., MS Excel)
![Page 33: USGS Bioinformatics Activities Ecoinformatics January 2010 Gladys Cotter Mike Frame Ecoinformatics January 2010 Gladys Cotter Mike Frame](https://reader036.vdocuments.us/reader036/viewer/2022062315/5697bfe31a28abf838cb4cc9/html5/thumbnails/33.jpg)
Thanks!
Leadership Team:
Bill Michener – UNM, PI
Suzie Allard – UT
John Cobb – ORNL
Bob Cook – ORNL
Patricia Cruse – CDL
Mike Frame – USGS
Stephanie Hampton – UCSB
Viv Hutchison – USGS
Matt Jones – UCSB
Steve Kelling – Cornell
Kathleen Smith – Duke
Carol Tenopir – UT
Dave Vieglais – KU, DataONE
Bruce Wilson – Joint ORNL – UT