FAIR data in a UM Law study

Large-scale analysis of EU court decisions

Kody Moodley, Pedro Hernandez-Serrano, Marcel Schaper, Michel Dumontier, Gijs van Dijck
Post on 12-Jul-2020

Lambin et al. Radiother Oncol. 2013. 109(1):159-64. doi: 10.1016/j.radonc.2013.07.007

We need to build a social, ethical and technological infrastructure that facilitates the discovery and reuse of digital resources for people and machines

@micheldumontier::IDS-TRAINING:2018-10-30

An international, bottom-up paradigm for the discovery and reuse of digital content by and for people and machines

Improving the FAIRness of digital resources will increase their quality, their potential for reuse, and the ease of reusing them.

Give unique names for ‘things’ in your data:

Globally unique: not just unique in your dataset

Persistent: don’t keep changing these names

Resolvable: make ‘things’ in your data discoverable on the Web (e.g. a webpage with more information about it)
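The difference between a locally unique name and a globally unique one can be sketched in a few lines of Python. This is only an illustration: the two `example.org` namespaces and the local ID are made-up placeholders, and a real project would mint identifiers under a domain it controls and keeps stable.

```python
from urllib.parse import quote

# Hypothetical namespaces (assumptions, not from the slides).
COURT_A = "https://example.org/court-a/decision/"
COURT_B = "https://example.org/court-b/decision/"


def mint_uri(namespace: str, local_id: str) -> str:
    """Combine a namespace with a percent-encoded local ID to form a URI."""
    return namespace + quote(local_id, safe="")


# The same local ID used in two datasets collides as a bare string ...
local_a, local_b = "2014/952", "2014/952"
assert local_a == local_b

# ... but the namespaced URIs are distinct, and could be made to resolve
# to a web page describing each decision.
uri_a = mint_uri(COURT_A, local_a)
uri_b = mint_uri(COURT_B, local_b)
print(uri_a)
print(uri_b)
```

Persistence is not something code can guarantee; it depends on never changing the namespace or the local IDs once they are published.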

Make machine-readable descriptions of your data so we can use machines to index, search and filter it
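As a rough illustration of what a machine-readable description can look like, the sketch below builds a schema.org `Dataset` record as JSON-LD using only the Python standard library. The dataset name, description and keywords are placeholders invented for this example, and `has_keyword` is a hypothetical helper; the slides do not say which metadata schema the project used.

```python
import json

# Placeholder description (assumed values, for illustration only).
dataset_description = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example court decision citations",
    "description": "Placeholder description of a court decision dataset.",
    "keywords": ["case law", "citations", "court decisions"],
    "creator": [{"@type": "Person", "name": "Kody Moodley"}],
}


def has_keyword(description: dict, keyword: str) -> bool:
    """Return True if the description lists the keyword (case-insensitive)."""
    return keyword.lower() in (k.lower() for k in description.get("keywords", []))


# Serialised form that an indexer or search engine could harvest.
serialized = json.dumps(dataset_description, indent=2)
print(serialized)

# A machine can now filter datasets without a human reading them.
print(has_keyword(dataset_description, "Case Law"))
```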

Provide metadata describing your data that is accessible beyond its lifetime

Clearly define and communicate access and security protocols for your data

(FAIR != Open)

Represent your data and metadata using machine interpretable formats

Use common vocabularies for representing your data

Link your data to other related datasets

License: who can reuse your data, under what conditions, for what purpose?

Provenance: who generated the data? when and how did they do this?

Community standards: use the same data sharing and publishing platforms and data formats as your peers

Large-scale analysis of EU court decisions

(A CDDI pilot study)

Community for Data-Driven Insights (CDDI)

CDDI investigates how Maastricht University can become the first FAIR university (2025) by implementing eScience, Technology, Expertise, and Services.

Team for this pilot study

Prof. Michel Dumontier (IDS @ UM): Project partner

Prof. Gijs van Dijck (Faculty of Law): Project director

Team for this pilot study

Dr. Kody Moodley (IDS @ UM / Faculty of Law): Project manager

Pedro Hernandez-Serrano (IDS @ UM): Lead Data Scientist

Prof. Marcel Schaper (Faculty of Law): Court decision expert

Team for this pilot study

Elden van Delft (Faculty of Law): Court decision expert

Marion Meyers (DKE / Faculty of Law): Data Scientist

Bogdan Covrig (Faculty of Law): Data Scientist

Andreea Grigoriu (IDS @ UM, Faculty of Law): Data Scientist

Goal

Long term

To build a FAIR data infrastructure that supports empirical legal research at the Faculty of Law, and makes this kind of research accessible for legal scholars with limited data science expertise.

Short term

To build a (FAIR) software platform to analyse court decisions

Data sources

2.6 million court decisions

Updated daily, weekly and monthly with new decisions

Access via download links on website & API calls

Data

Metadata: case code, publication date, court

Citations: cited laws, cited cases
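One way to picture the fields listed on this slide is as a single record per decision. The sketch below is an assumption about how such a record could be modelled, not the project's actual schema; the class name, field names and all values are placeholders.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class CourtDecision:
    """Hypothetical record combining a decision's metadata and citations."""

    # Metadata
    case_code: str
    court: str
    publication_date: date
    # Citations
    cited_laws: list[str] = field(default_factory=list)
    cited_cases: list[str] = field(default_factory=list)

    def citation_count(self) -> int:
        """Total number of outgoing citations (laws plus cases)."""
        return len(self.cited_laws) + len(self.cited_cases)


# Placeholder values, not real data.
decision = CourtDecision(
    case_code="EXAMPLE-A",
    court="EXAMPLE-COURT",
    publication_date=date(2000, 1, 1),
    cited_laws=["EXAMPLE-LAW-1"],
    cited_cases=["EXAMPLE-B", "EXAMPLE-C"],
)
print(decision.citation_count())
```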

Data extraction

Data extraction & cleaning scripts

Metadata

Citations

Tested scripts on a sample of the 2.6 million decisions

Plans to scale up the entire data extraction in the cloud
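The slides do not show the extraction scripts themselves. As a minimal sketch of what citation extraction might involve, the snippet below pulls case numbers of the form `C-16/18` out of free text with a regular expression. The pattern and the sample sentence are assumptions made for illustration; real decisions use many more citation styles than this pattern covers.

```python
import re

# Assumed pattern: a C, T or F prefix, a serial number, a slash and a two-digit year.
CASE_NUMBER = re.compile(r"\b[CTF]-\d{1,4}/\d{2}\b")


def extract_case_numbers(text: str) -> list[str]:
    """Return the distinct case numbers found in text, in order of first appearance."""
    found: list[str] = []
    for match in CASE_NUMBER.findall(text):
        if match not in found:
            found.append(match)
    return found


# Invented sample sentence, not taken from a real decision.
sample = "As held in Case C-16/18 and confirmed in C-587/14, see again C-16/18."
print(extract_case_numbers(sample))
```

Deduplicating while preserving order keeps the result usable both as a set of cited cases and as a trace of where citations first occur.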

Data representation

Properties?

Entities?

Relations?

Data representation (common terms)

Ontologies / controlled vocabularies (community maintained):

HCLS Dataset Descriptions

Bioschemas.org

PROV-O

Dublin Core

PAV ontology

Data representation (common terms)

EU Vocabularies (EuroVoc)

Common Data Model (CDM) ontology

Data representation (global identifiers)

62014CJ0587 ?

IW/2 1968/2 ?

Case C-16/18 ?

Identifiers for cases can change depending on the organisation (court) or database

ECLI:NL:CRVB:2014:952

ECLI = European Case Law Identifier, NL = country, CRVB = court, 2014 = year, 952 = ID

Adopt the ECLI convention (uniquely identifies cases at EU level across organisations and databases)
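Because an ECLI has a fixed, colon-separated structure, it can be split into its parts mechanically. The sketch below is a simplified parser written for this example (the `Ecli` type and `parse_ecli` function are hypothetical names); it checks only the overall shape and does not validate country or court codes against any official list.

```python
from typing import NamedTuple


class Ecli(NamedTuple):
    country: str
    court: str
    year: int
    ordinal: str


def parse_ecli(identifier: str) -> Ecli:
    """Split an identifier of the form ECLI:country:court:year:ordinal."""
    parts = identifier.split(":")
    if len(parts) != 5 or parts[0].upper() != "ECLI":
        raise ValueError(f"not an ECLI: {identifier!r}")
    _, country, court, year, ordinal = parts
    if not (year.isdigit() and len(year) == 4):
        raise ValueError(f"invalid year in ECLI: {identifier!r}")
    return Ecli(country, court, int(year), ordinal)


# The example identifier from the slide.
parsed = parse_ecli("ECLI:NL:CRVB:2014:952")
print(parsed)
```

Identifiers in other conventions, such as `62014CJ0587`, are rejected with a `ValueError`, which makes it easy to spot records that still need mapping to an ECLI.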

Data representation (multiple formats)

[Figure labels: entity, attribute, relation, type, instance]

Publish our data in both relational AND graph database formats
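To make the two formats concrete, the sketch below stores the same citation facts once as rows in a relational table (using Python's built-in `sqlite3`) and once as subject-predicate-object triples, then asks both the same question. The table layout, the `cites` predicate and the case IDs are all placeholders chosen for this example, not the project's actual schema.

```python
import sqlite3

# Placeholder citation pairs: (citing case, cited case).
citations = [("EXAMPLE-A", "EXAMPLE-B"), ("EXAMPLE-A", "EXAMPLE-C")]

# Relational form: one row per citation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE citation (citing TEXT NOT NULL, cited TEXT NOT NULL)")
conn.executemany("INSERT INTO citation VALUES (?, ?)", citations)
rows = conn.execute(
    "SELECT cited FROM citation WHERE citing = ? ORDER BY cited", ("EXAMPLE-A",)
).fetchall()
relational_result = [cited for (cited,) in rows]
conn.close()

# Graph form: one (subject, predicate, object) triple per citation.
triples = {(citing, "cites", cited) for citing, cited in citations}
graph_result = sorted(o for s, p, o in triples if s == "EXAMPLE-A" and p == "cites")

# Both representations answer "which cases does EXAMPLE-A cite?" identically.
print(relational_result)
print(graph_result)
```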

Legal Knowledge Graph (long term vision)

Findability & Accessibility

Repositories vary according to the kinds of data they accept, how much free storage they offer, and some added features

Next steps

● Extract all citations & metadata for 2.6 million court decisions

● Convert information to graph (RDF) format using the Data2Services pipeline

● Publish data in FAIR-supporting repositories (Zenodo and OSF)
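For readers unfamiliar with RDF, the sketch below shows roughly what citation data looks like once serialised as N-Triples. It is hand-rolled with the standard library and is not the Data2Services pipeline, whose internals the slides do not describe. The `example.org` base IRI and the case IDs are placeholders; the two Dublin Core terms (`identifier`, `references`) are real properties, but choosing them here is an assumption of this example.

```python
DCTERMS = "http://purl.org/dc/terms/"
BASE = "https://example.org/decision/"  # hypothetical namespace


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def decision_to_ntriples(case_id: str, cited_ids: list[str]) -> str:
    """Serialise one decision and its outgoing citations as N-Triples lines."""
    subject = iri(BASE + case_id)
    lines = [f"{subject} {iri(DCTERMS + 'identifier')} {literal(case_id)} ."]
    for cited in cited_ids:
        lines.append(f"{subject} {iri(DCTERMS + 'references')} {iri(BASE + cited)} .")
    return "\n".join(lines)


ntriples = decision_to_ntriples("EXAMPLE-A", ["EXAMPLE-B"])
print(ntriples)
```

Reusing a community vocabulary such as Dublin Core for the predicates is what lets other datasets and tools interpret the links without project-specific documentation.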

Lessons learned

● FAIR is not binary (your data is not either FAIR or not FAIR)

● FAIR != open

● A little FAIRness goes a long way

● Findability and accessibility were easier for us

● Interoperability and reusability can be a challenge when there are few standards in your community

● Steps for making data FAIR may vary depending on the nature of the project and the data

Thank you!

@MoodleyKody