Andrew Treloar, Overview of ACEAS Data Workflow, ACEAS Grand 2014
Overview and summary – Dr Andrew Treloar – Australian National Data Service, ACEAS Grand 2014
Challenges of data management in synthesis projects
Summary of/reflection on afternoon session on Wed 7/5/14 at ACEAS Grand Synthesis Workshop
Caveats
• I am not an ecologist
• I see most things through a data lens
• And so apologies for what I have noted about presentations this arvo
Data identification and acquisition: Seagrass
• Challenges
• Lack of metadata – need corporate knowledge
• Limited data available for open access exchange
• Lack of info about how data was collected
• Identifying relevant data sets
• Hard to identify relevant variables in some data for particular questions
• Getting data at right spatial and temporal scale
• Implications of necessary assumptions
• Data (including layers) constrains spatial resolution
• Opportunity for map improvement
• But where does the improved map end up? (c.f. data synthesis, publication)
Data identification and acquisition: Aquatic
• Developing wetland plant database with range of traits
• Drawing on a number of different existing data sets
• Using a range of dispersal models
• Need for further data collection and modelling by researchers
• Data acquisition challenges
• Often sourced through personal contacts
• Populating the database with the right traits
Data collation and blending: Animal telemetry
• OzTrack platform provides a location to bring together tracking data across disciplines
• Analysis tools are the carrot to attract the data
• Obligation to make data available (because you may have degraded study animals QoL)
• Sourced datasets through TERN DDP ("It's awesome!")
• Challenges
• Reuse hard because original studies determine tag set up
• Raw data on its own not enough – need rich context from data custodians/collectors
• Who owns the data?
Data collation and blending: Northern Quoll
• Challenges
• Data mismatches between availability and study question (burned patches, rockiness)
• Studies set up for different purposes, and hence produce different data
Data analysis and synthesis
• Challenges – endemic genetics
• Lack of adequate metadata (stuff just missing – DNA, location)
• Inadequate response from authors
• Need for format conversion
• Challenges – phenology monitoring
• Need better data => protocols and standards for data capture
• Tools for managing and sharing 1000s of images
• No global standards for phenocams
• Challenges – drought induced mortality
• Data is often biased, incomplete and patchy (but it's all we've got sometimes)
Data publication and visualisation
• Challenges – aerobiology
• Different data capture technologies influence data collected
• Could only use 11 of the 17 possible data sets
• Getting the data online delayed publication of first paper
• Reluctance to release primary data (priority, errors/quality, journal policies)
• Ignorance of data value (commercial exploitation, value adding by others)
• Challenges – indigenous knowledge
• Interaction between cultural landscape scales and cultural infrastructure
Overall issues
• Fitness for purpose vs. It's all we have
• When synthesising, may be constrained by lowest quality data set
• E.g. spatial resolution for seagrass, existence of presence/absence only
• Need to capture context in metadata (seagrass, telemetry, endemics)
• Motivators for data exchange/availability
• Answer new questions through more data
• Use tools that are made available as carrot
• Data gets collected but doesn't always get published
• Some data owners are reluctant to share for understandable human issues
Overall issues
• Hard to find data (if cited in paywall journals)
• Role here for DDP, Research Data Australia
• Data quality (or purpose) mismatch
• Non-interoperable data
• Academic ethos
• Hierarchical structure incompatible with data sharing
• Academia selects for possessiveness
• Underfunding => overcontribution => protectiveness
Possible actions
• Anticipate Reuse: get groups who collect potentially combinable data to agree on minimum elements they will collect that will make datasets more reusable/recombinable
• More is More: concentrate on large long-term field projects with standardised instruments and data products
• Research Locally, Coordinate Globally: Research Data Alliance (rd-alliance.org) provides location for working groups to reduce barriers to data exchange
• Bribe, don't Bully: Provide tools with attractive functionality where data sharing is easier (than what they do now)
• Change the Norms: Discussion within discipline around data-sharing norms
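The "Anticipate Reuse" action could be made concrete with a small check that each dataset carries an agreed minimum set of metadata elements. The following is only a minimal sketch: the element names in MINIMUM_ELEMENTS and the example records are hypothetical placeholders, not a list agreed by any group at the workshop.

```python
# Hypothetical minimum elements; a real list would be agreed by the
# groups who collect the potentially combinable data.
MINIMUM_ELEMENTS = (
    "collector",
    "collection_method",
    "location",
    "date_collected",
    "spatial_resolution_m",
)


def missing_elements(record, required=MINIMUM_ELEMENTS):
    """Return the required elements that are absent or empty in a record."""
    return [name for name in required if record.get(name) in (None, "")]


def check_datasets(records, required=MINIMUM_ELEMENTS):
    """Split datasets into those meeting the minimum set and those with gaps."""
    complete = []
    incomplete = {}
    for dataset_id, record in records.items():
        gaps = missing_elements(record, required)
        if gaps:
            incomplete[dataset_id] = gaps
        else:
            complete.append(dataset_id)
    return complete, incomplete


# Illustrative (made-up) metadata records.
example_records = {
    "survey_a": {
        "collector": "Group A",
        "collection_method": "field transect",
        "location": "Site 1",
        "date_collected": "2013-11-02",
        "spatial_resolution_m": 10,
    },
    "survey_b": {
        "collector": "Group B",
        "location": "Site 2",
        "date_collected": "",
    },
}

complete, incomplete = check_datasets(example_records)
print("Meets minimum set:", complete)
print("Gaps:", incomplete)
```

A check like this only reports which agreed elements are missing; deciding what belongs in the minimum set remains a discussion for the groups involved.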
Thank you for the opportunity to come and listen
@atreloar
andrew.treloar.net