Using Biodiversity Data from the NBN Database for Research
Paula Lightfoot, NBN Trust Data Access Officer
Introduction to the NBN Database
1. Overview of available data
2. Finding and accessing data
3. Evaluating data quality
4. Using and referencing data
http://data.nbn.org.uk
Summary of Available Data
• 91 million georeferenced taxon occurrence records.
• 27 habitat datasets and 44 site boundary datasets to provide context and act as filters.
• 856 datasets from 150 data providers.
• Standard data format.
• Standard taxonomy from UK Species Inventory.
Data Providers
• A large proportion of data comes from skilled amateur naturalists.
• Data collated taxonomically and/or geographically.
• Some structured surveys, much ad hoc recording.
Records in the NBN Database by data provider type, January 2014 (n = 91,206,588)
Geographic Coverage and Sampling Effort:
• Recorder effort and data mobilisation are not evenly spread across the British Isles.
• New NBN Gateway (v.5) extends coverage to include the Channel Islands and offshore data.
• The National Biodiversity Data Centre is the repository for Republic of Ireland (ROI) data.
Sampling Effort
Collembola Recording Scheme
10,633 records of 336 species over 200 years
BTO Second Atlas of Breeding Birds in Britain and Ireland: 1988-1991
1,465,400 records of 272 species over 4 years
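The contrast between these two datasets can be made concrete with a little arithmetic. The sketch below uses only the figures quoted above; the function name is illustrative, and records per year is a crude proxy for sampling effort rather than a formal measure.

```python
def records_per_year(n_records: int, n_years: int) -> float:
    """Average number of records contributed per year of recording."""
    return n_records / n_years


# Figures as quoted on the slide.
collembola = records_per_year(10_633, 200)    # Collembola Recording Scheme
bird_atlas = records_per_year(1_465_400, 4)   # BTO atlas, 1988-1991

print(f"Collembola: {collembola:.1f} records/year")
print(f"Bird atlas: {bird_atlas:.1f} records/year")
print(f"Ratio: {bird_atlas / collembola:.0f}x")
```

A difference of this size suggests that record counts alone should not be compared across taxonomic groups without some allowance for recorder effort.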
http://tombio.myspecies.info/
Orchesella villosa (a springtail)
Taxonomic Coverage
Taxonomic breakdown of records in the NBN Database at January 2014 (n = 91,269,685)
Currency of Data
Number of records in the NBN Database by year of record, January 2014 (n = 89,091,428; 98% of total)
Data Attributes
Standard attributes in NBN Exchange Format:
Required: unique record key, taxon, date, date type, coordinates/grid reference/polygon, projection, precision (what? where? when?)
Optional: survey key, sample key, absent, sensitive, site key, site name, recorder, determiner
Other attributes are not (yet) standardised across datasets, e.g. abundance, life stage, sex, verification status, record type, depth. These have neither standard fields nor standard units / vocabularies.
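A quick completeness check on the required attributes might look like the sketch below. The column names are hypothetical stand-ins chosen for readability, not the actual NBN Exchange Format headers, so they would need to be mapped to the real headers before use.

```python
# Hypothetical column names for illustration only; the real NBN Exchange
# Format headers may differ.
REQUIRED = ["record_key", "taxon", "date", "date_type", "projection", "precision"]
# The location may be given in any one of these forms.
LOCATION_FIELDS = ["coordinates", "grid_reference", "polygon"]


def missing_required(record: dict) -> list:
    """Return the names of required attributes that are absent or empty."""
    missing = [field for field in REQUIRED if not record.get(field)]
    if not any(record.get(field) for field in LOCATION_FIELDS):
        missing.append("location")
    return missing


example = {
    "record_key": "R1",
    "taxon": "Orchesella villosa",
    "date": "2013-05-04",
    "date_type": "D",            # illustrative value
    "grid_reference": "SK123456",
    "projection": "OSGB36",      # illustrative value
    "precision": 100,
}
print(missing_required(example))
```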
Absence Data
• Zero abundance (T/F) is a standard attribute.
• Absence records are displayed on the NBN Gateway Interactive Map.
• The NBN Database currently holds 30,625 absence records across 26 datasets (Jan 2014).
10km Interactive Map of Sargassum muticum
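Because absence is carried as a true/false flag on an otherwise ordinary record, it is easy to count absences as presences by mistake. A minimal sketch of separating the two, assuming a hypothetical boolean field called `zero_abundance` (the real column name may differ):

```python
def split_presence_absence(records):
    """Separate records into (presence, absence) using a boolean flag.

    Records without the flag are treated as presences.
    """
    presence = [r for r in records if not r.get("zero_abundance", False)]
    absence = [r for r in records if r.get("zero_abundance", False)]
    return presence, absence


sample = [
    {"taxon": "Sargassum muticum", "zero_abundance": True},
    {"taxon": "Sargassum muticum", "zero_abundance": False},
    {"taxon": "Sargassum muticum"},
]
present, absent = split_presence_absence(sample)
print(len(present), len(absent))
```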
Effort-based Data
• The NBN Database holds some effort-based datasets (e.g. BTO Breeding Bird Survey, Shorewatch, Shore Thing, UK Butterfly Monitoring Scheme)
• The effort-based methodology should be described in the metadata.
• Effort data may be stored as attributes of the species observation e.g. number of observers, timespan of observation period.
• Effort data is not stored in a way that enables ‘per unit effort’ analysis. NBN Exchange Format is a flat file, not relational tables.
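Where a dataset does supply a sample key and effort attributes, sample-level effort can be rebuilt from the flat rows before any ‘per unit effort’ calculation. The sketch below assumes hypothetical field names (`sample_key`, `observers`, `hours`) and assumes the effort values are repeated identically on every row of a sample; many datasets will not meet these assumptions.

```python
from collections import defaultdict


def records_per_observer_hour(rows):
    """Group flat rows by sample and divide the record count by the effort."""
    samples = defaultdict(lambda: {"n": 0, "observers": None, "hours": None})
    for row in rows:
        sample = samples[row["sample_key"]]
        sample["n"] += 1
        # Effort is repeated on each row, so the last value seen is kept.
        sample["observers"] = row["observers"]
        sample["hours"] = row["hours"]
    return {
        key: s["n"] / (s["observers"] * s["hours"])
        for key, s in samples.items()
    }


flat_rows = [
    {"sample_key": "A", "taxon": "sp1", "observers": 2, "hours": 1.5},
    {"sample_key": "A", "taxon": "sp2", "observers": 2, "hours": 1.5},
    {"sample_key": "A", "taxon": "sp3", "observers": 2, "hours": 1.5},
    {"sample_key": "B", "taxon": "sp1", "observers": 1, "hours": 0.5},
]
print(records_per_observer_hour(flat_rows))
```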
Data Resolution
• The finest resolution currently available is 100m squares.
• Data providers can blur resolution of the ‘public’ version of the records to 1km, 2km or 10km, while granting full access to select users.
• ‘Full access’ includes recorder and determiner names and attributes (where available).
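Blurring a grid reference to a coarser square amounts to dropping trailing digits from the easting and northing. The sketch below is not the NBN Gateway's own code; it assumes British National Grid references with a two-letter prefix and handles 1km and 10km squares only (2km squares use a separate letter-suffix notation that is not covered here).

```python
def blur_grid_ref(grid_ref: str, digits_per_axis: int) -> str:
    """Truncate a grid reference to the given number of digits per axis."""
    ref = grid_ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if len(digits) % 2 != 0:
        raise ValueError("expected an even number of digits")
    half = len(digits) // 2
    if digits_per_axis > half:
        raise ValueError("cannot increase the resolution of a grid reference")
    easting, northing = digits[:half], digits[half:]
    return letters + easting[:digits_per_axis] + northing[:digits_per_axis]


print(blur_grid_ref("SK123456", 2))  # 100m square reduced to its 1km square
print(blur_grid_ref("SK123456", 1))  # 100m square reduced to its 10km square
```

Truncating rather than rounding keeps the record inside the coarser square that actually contains it.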
Access resolution of all records in the NBN Database (n = 91,206,588)
Access resolution of records of designated taxa in the NBN Database (n = 20,548,842)
Access resolution of vascular plant records in the NBN Database (n = 25,998,531)
Access resolution of dragonfly and damselfly records in the NBN Database (n = 1,486,554)
Exploring Data
NBN Gateway Interactive Map – create and query layers of species, habitats and site boundaries
Evaluating Data Quality and Accessibility
• Publicly accessible records have gone through quality control processes, e.g. checks by local and national experts.
• Some have also been checked using NBN Record Cleaner, based on:
  - Spatial distribution rules
  - Temporal rules: flight period or first/last year recorded
  - Identification difficulty / rarity / taxonomic uncertainty
• NBN Record Cleaner rules have been created by experts at national recording schemes for over 18,000 species including 77% of conservation priority species (NERC Act 2006).
• Nevertheless, erroneous records do occur. Always read the metadata.
http://www.nbn.org.uk/record-cleaner.aspx
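To illustrate what a temporal rule check involves, the sketch below tests a record date against a first/last year and a flight period. This is not the NBN Record Cleaner's rule format or code; the rule structure and all values are invented for illustration, and the flight period is assumed not to wrap around the end of the year.

```python
from datetime import date


def check_record(record_date: date, rule: dict) -> list:
    """Return warnings for a record date checked against a temporal rule."""
    warnings = []
    if not rule["first_year"] <= record_date.year <= rule["last_year"]:
        warnings.append("year outside known recording period")
    day = (record_date.month, record_date.day)
    if not rule["flight_start"] <= day <= rule["flight_end"]:
        warnings.append("date outside flight period")
    return warnings


# Hypothetical rule: recorded 1900-2014, on the wing 1 May to 30 September.
example_rule = {
    "first_year": 1900,
    "last_year": 2014,
    "flight_start": (5, 1),
    "flight_end": (9, 30),
}
print(check_record(date(2013, 12, 1), example_rule))
```

A warning flags a record for expert review; it does not by itself show that the record is wrong.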
Read the dataset’s metadata:
Requesting better access to data
For one-off use:
• Request access as an individual.
• Apply taxonomic / geographic / date and dataset filters to request access to the records you need across multiple datasets.
For repeated use (strongly recommended!):
• Register your organisation on the NBN Gateway (quick and free!).
• Apply as an organisation for access to all datasets and permission to use data for research purposes.
• Make colleagues and students members of the organisation.
Over 200 organisations have user accounts on the NBN Gateway, around 80% of which also share their own data.
Accessing and Using Data
Downloading data from the NBN Gateway
Who you are (individual / organisation)
Why you are downloading the data (dropdown list and free text description)
Include sensitive records
You will need to have been granted access to these records before downloading data
Geographic filter
10km square
Site boundary
‘Within’ or ‘overlapping and within’
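The difference between the two spatial options can be sketched with simple rectangles. The code assumes that ‘within’ means a record's grid square lies entirely inside the site boundary, and that ‘overlapping’ means the square and the boundary intersect; this is one plausible reading, not a description of how the NBN Gateway implements the filter. Real site boundaries are polygons, so the bounding boxes here are a deliberate simplification.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle in metres (eastings and northings)."""
    min_e: float
    min_n: float
    max_e: float
    max_n: float


def within(square: Box, site: Box) -> bool:
    """True if the square lies entirely inside the site."""
    return (site.min_e <= square.min_e and site.min_n <= square.min_n
            and square.max_e <= site.max_e and square.max_n <= site.max_n)


def overlaps(square: Box, site: Box) -> bool:
    """True if the square and the site share any area."""
    return (square.min_e < site.max_e and site.min_e < square.max_e
            and square.min_n < site.max_n and site.min_n < square.max_n)


site = Box(0, 0, 1000, 1000)
inside = Box(100, 100, 200, 200)
straddling = Box(900, 900, 1100, 1100)

print(within(inside, site), within(straddling, site), overlaps(straddling, site))
```

Under this reading, ‘overlapping and within’ returns more records, including low-resolution records whose squares extend beyond the site.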
Taxonomic filter
Taxon (up to Order)
Taxon reporting category (e.g. terrestrial mammals)
Designation
User-defined list
User-defined lists can cover, for example, species used as proxy indicators of climate change, habitat condition or ecosystem services. A list must be supplied and maintained (with metadata) by a named organisation, and must be relevant for repeated use, not just one-off use.
Year Range
e.g. restrict to recent records only
Dataset filter
You may wish to exclude some datasets e.g.
If they have not granted permission
If the metadata shows they are not suitable for your purpose
Download
Zip file containing:
Observations (CSV file)
Metadata (TXT file)
Download date, time and filters used (TXT file)
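Once downloaded, the observations can be read straight from the zip file without unpacking it. The sketch below takes the first CSV file it finds, since the file names inside the archive are not given here, and it assumes UTF-8 encoding, which may need adjusting.

```python
import csv
import io
import zipfile


def read_observations(zip_source):
    """Return the rows of the first CSV file in a download zip as dicts.

    zip_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise FileNotFoundError("no CSV file found in the archive")
        with archive.open(csv_names[0]) as raw:
            # Encoding is an assumption; change it if characters look wrong.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))
```

Keeping the metadata and filter files alongside the observations makes it easier to acknowledge data providers and to repeat the download later.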
Limitations: Filters are not ‘multi-select’. For data on 2 species at 5 sites, you have to do 10 downloads. You have to use a taxonomic, geographic or dataset filter – you can’t download everything!
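Because each download takes a single value per filter, the number of downloads is the product of the number of species and the number of sites. The species and site names below are placeholders.

```python
from itertools import product

species = ["species A", "species B"]
sites = ["site 1", "site 2", "site 3", "site 4", "site 5"]

# One download is needed for every (species, site) pair.
downloads = list(product(species, sites))

for taxon, site in downloads:
    print(f"Download: {taxon} at {site}")
print(len(downloads))  # 10
```

Listing the combinations in advance also gives a checklist for confirming that no download has been missed.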
Accessing data via the NBN REST API
• REST API available to view and download data
• Full documentation available by end March
• rNBN tool for release this year
https://data.nbn.org.uk/Documentation/Web_Services/Web_Services-REST/
Custom downloads from the NBN Database
• Filter by user-defined species list (one-off use)
• Filter by user-defined polygon
• ESRI shapefile download format
Custom downloads and REST API downloads are logged and reported to data providers, the same as downloads from the NBN Gateway.
NBN Gateway Terms and Conditions
• Require written permission for research use
• Require the data provider(s) to be acknowledged
• Require the recorder to be acknowledged if appropriate and possible
• Require a waiver statement to be included
• Require OS Map images to be acknowledged
https://data.nbn.org.uk/Terms
Referencing Data
Guidance on referencing data is available on the NBN Website
• DOIs are not currently generated from the NBN Database
• This is being considered, but the data access controls and the fact that data may be withdrawn by data providers pose a challenge.
Links and References
Data providers who contributed to maps used in this presentation:
10km interactive map of Sargassum muticum: https://data.nbn.org.uk/Taxa/NBNSYS0000188809
Collembola Recording Scheme dataset: https://data.nbn.org.uk/Datasets/GA000566
BTO Breeding Bird Atlas 1988-1991: https://data.nbn.org.uk/Datasets/GA000147
National Biodiversity Network: www.nbn.org.uk
NBN Gateway: http://data.nbn.org.uk
NBN Record Cleaner: http://www.nbn.org.uk/Tools-Resources/Recording-Resources/NBN-Record-Cleaner.aspx
Guidance on referencing data from the NBN Database: http://www.nbn.org.uk/Use-Data/Using-Maps-or-Data/Using-and-referencing-data-from-the-Gateway.aspx
GBIF: www.gbif.org
NERC guidance on DOIs: http://www.nerc.ac.uk/research/sites/data/doi.asp
Guide to the NBN Exchange Format on YouTube: http://www.youtube.com/watch?v=2WfOjQOaVFI#t=24