Andrew Treloar, overview of ACEAS Data Workflow, ACEAS Grand 2014

Challenges of data management in synthesis projects: Summary of/reflection on afternoon session on Wed 7/5/14 at ACEAS Grand Synthesis Workshop

Upload: aceas13tern

Post on 20-May-2015


DESCRIPTION

Overview and summary – Dr Andrew Treloar – Australian National Data Service, ACEAS Grand 2014

TRANSCRIPT

Page 1: Andrew Treloar, overview of ACEAS Data Workflow, ACEAS Grand 2014

Challenges of data management in synthesis projects

Summary of/reflection on afternoon session on Wed 7/5/14 at ACEAS Grand Synthesis Workshop

Page 2: Andrew Treloar, overview of ACEAS Data Workflow, ACEAS Grand 2014

Caveats

• I am not an ecologist
• I see most things through a data lens
• And so apologies for what I have noted about presentations this arvo

Page 3: Andrew Treloar, overview of ACEAS Data Workflow, ACEAS Grand 2014

Data identification and acquisition: Seagrass

• Challenges
  • Lack of metadata – need corporate knowledge
  • Limited data available for open access exchange
  • Lack of info about how data was collected
  • Identifying relevant data sets
  • Hard to identify relevant variables in some data for particular questions
  • Getting data at right spatial and temporal scale
  • Implications of necessary assumptions
  • Data (including layers) constrains spatial resolution
• Opportunity for map improvement
  • But where does the improved map end up? (cf. data synthesis, publication)

Page 4: Andrew Treloar, overview of ACEAS Data Workflow, ACEAS Grand 2014

Data identification and acquisition: Aquatic

• Developing wetland plant database with range of traits
  • Drawing on a number of different existing data sets
• Using a range of dispersal models
  • Need for further data collection and modelling by researchers
• Data acquisition challenges
  • Often sourced through personal contacts
  • Populating the database with the right traits

Page 5: Andrew Treloar, overview of ACEAS Data Workflow, ACEAS Grand 2014

Data collation and blending: Animal telemetry

• OzTrack platform provides a location to bring together tracking data across disciplines
  • Analysis tools are the carrot to attract the data
• Obligation to make data available (because you may have degraded the study animals' quality of life)
• Sourced datasets through TERN DDP ("It's awesome!")
• Challenges
  • Reuse is hard because original studies determine tag setup
  • Raw data on its own not enough – need rich context from data custodians/collectors
  • Who owns the data?

Page 6: Andrew Treloar, overview of ACEAS Data Workflow, ACEAS Grand 2014

Data collation and blending: Northern Quoll

• Challenges
  • Data mismatches between availability and study question (burned patches, rockiness)
  • Studies set up for different purposes, and hence produce different data

Page 7: Andrew Treloar, overview of ACEAS Data Workflow, ACEAS Grand 2014

Data analysis and synthesis

• Challenges – endemic genetics
  • Lack of adequate metadata (stuff just missing – DNA, location)
  • Inadequate response from authors
  • Need for format conversion
• Challenges – phenology monitoring
  • Need better data => protocols and standards for data capture
  • Tools for managing and sharing 1000s of images
  • No global standards for phenocams
• Challenges – drought-induced mortality
  • Data is often biased, incomplete and patchy (but it's all we've got sometimes)

Page 8: Andrew Treloar, overview of ACEAS Data Workflow, ACEAS Grand 2014

Data publication and visualisation

• Challenges – aerobiology
  • Different data capture technologies influence data collected
  • Could only use 11 of the 17 possible data sets
  • Getting the data online delayed publication of first paper
  • Reluctance to release primary data (priority, errors/quality, journal policies)
  • Ignorance of data value (commercial exploitation, value adding by others)
• Challenges – indigenous knowledge
  • Interaction between cultural landscape scales and cultural infrastructure

Page 9: Andrew Treloar, overview of ACEAS Data Workflow, ACEAS Grand 2014

Overall issues

• Fitness for purpose vs. "it's all we have"
  • When synthesising, may be constrained by lowest quality data set
  • E.g. spatial resolution for seagrass, existence of presence/absence only
• Need to capture context in metadata (seagrass, telemetry, endemics)
• Motivators for data exchange/availability
  • Answer new questions through more data
  • Use tools that are made available as carrot
• Data gets collected but doesn't always get published
• Some data owners are reluctant to share for understandable human issues
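The "constrained by lowest quality data set" point can be illustrated with a minimal sketch: when layers of different spatial resolution are combined, the coarsest cell size sets the resolution of the result. The dataset names and cell sizes below are hypothetical and were not part of the session.

```python
def common_resolution(resolutions_m):
    """Return the coarsest (largest) cell size, in metres, among the inputs."""
    if not resolutions_m:
        raise ValueError("need at least one dataset")
    return max(resolutions_m.values())


# Hypothetical cell sizes in metres; illustrative only, not workshop data.
datasets = {"seagrass_map": 250.0, "bathymetry": 30.0, "water_quality": 1000.0}

# The dataset with the largest cell size limits the whole synthesis.
limiting = max(datasets, key=datasets.get)
print(limiting, common_resolution(datasets))
```

In practice the finer layers would be resampled to that cell size before blending; the resampling method is a further assumption that should be recorded in the metadata.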

Page 10: Andrew Treloar, overview of ACEAS Data Workflow, ACEAS Grand 2014

Overall issues

• Hard to find data (if cited in paywall journals)
  • Role here for DDP, Research Data Australia
• Data quality (or purpose) mismatch
• Non-interoperable data
• Academic ethos
  • Hierarchical structure incompatible with data sharing
  • Academia selects for possessiveness
  • Underfunding => overcontribution => protectiveness

Page 11: Andrew Treloar, overview of ACEAS Data Workflow, ACEAS Grand 2014

Possible actions

• Anticipate Reuse: get groups who collect potentially combinable data to agree on minimum elements they will collect that will make datasets more reusable/recombinable
• More is More: concentrate on large long-term field projects with standardised instruments and data products
• Research Locally, Coordinate Globally: Research Data Alliance (rd-alliance.org) provides location for working groups to reduce barriers to data exchange
• Bribe, don't Bully: Provide tools with attractive functionality where data sharing is easier (than what they do now)
• Change the Norms: Discussion within discipline around data-sharing norms
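As one possible reading of "Anticipate Reuse", a group could agree on a short list of minimum metadata elements and check each record against it before sharing. The sketch below assumes a hypothetical set of required fields; the actual elements would be whatever the collecting groups agree on.

```python
# Hypothetical minimum elements; a real list would be agreed by the groups.
REQUIRED_FIELDS = ("collector", "collection_date", "location", "method")


def missing_elements(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in a metadata record."""
    missing = []
    for field in required:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Example record with two gaps (values are illustrative only).
record = {
    "collector": "A. Researcher",
    "collection_date": "2014-05-07",
    "location": "",
    "method": None,
}
print(missing_elements(record))
```

A check like this only confirms that the agreed elements are present; it says nothing about whether their content is accurate or fit for a particular question.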

Page 12: Andrew Treloar, overview of ACEAS Data Workflow, ACEAS Grand 2014

Thank you for the opportunity to come and listen

@atreloar

[email protected]

andrew.treloar.net