Research Data Sharing Framework to Enhance Open Science
Hideaki Takeda, National Institute of Informatics
[email protected] ORCID: 0000‐0002‐2909‐7163
Research Data Sharing Symposium, Hitotsubashi‐Hall, February 29, 2015
Why Open Science Now?
Science has been expected to be open but has not always been open, for many reasons. Some are political, and others are ethical, legal, economic or systemic.
Internet/Web technology changed the game. But how?
Internet/Web technology not only provides technical solutions for making Science open but also makes Science itself open.
Internet/Web changes the society
Law
Norm
Market
Architecture
four modalities of regulation (Lawrence Lessig)
[Figure: the four modalities (Law, Norm, Market, Architecture) under the Internet/Web, each labeled "Open, Flat"]
Internet/Web changes the society: open and flat
Internet/Web changes Science: open and flat
Open Science is possible and inevitable now
So Science is becoming Open
• Open science has been discussed from philosophical, political, methodological, and many other viewpoints.
• “Open Science NOW” is driven and realized by the Internet as Architecture
• Data sharing is a good first step toward Open Science
Data sharing
Researcher before Digital Age
[Figure labels: papers; data; research target; Survey; Paper working; Research & Writing]
Researchers now
[Figure labels: papers; data; research target; Survey; Paper working; Data use; Data publishing; Research, Writing & Data publishing]
Researcher in Future
[Figure labels: Data; Data use; Data publishing; Integration of papers & data]
Research = Data Supply‐chain
• A scholar is just a library's way of making another library
– Daniel Dennett, “Memes and the Exploitation of Imagination”, 1990
• A scholar is just data's way of making more data
Data sharing
Data Sharing? or Data Publication? or Open Data?
Data Life Cycle
• Data is created, shared, published, and archived
• But being just “published” is not enough; data should be “openly published” (open data)
Data: Create → Share → Publish → Archive
Research Phase: In Progress → Results
Open Data
• “A piece of data or content is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share‐alike.” http://opendefinition.org/
• Open data is data published under an open license
– An open license ensures the above condition
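The difference between “published” and “openly published” can be sketched in a few lines of Python. This is only an illustration: the `Dataset` class, the `Stage` enum and the short list of license identifiers are assumptions made for the example, not something taken from the slides. The three Creative Commons licenses listed ask for at most attribution and/or share‐alike, which is what the definition quoted above allows.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """Life-cycle stages named on the slide, in order."""
    CREATED = 1
    SHARED = 2
    PUBLISHED = 3
    ARCHIVED = 4


# Illustrative subset only: licenses that require at most attribution
# and/or share-alike. A real service would rely on a maintained list.
OPEN_LICENSES = {"CC0-1.0", "CC-BY-4.0", "CC-BY-SA-4.0"}


@dataclass
class Dataset:
    title: str
    stage: Stage
    license: Optional[str] = None  # SPDX-style identifier, None if unset


def is_open_data(ds: Dataset) -> bool:
    """True only if the data is published (or archived) AND openly licensed."""
    return ds.stage >= Stage.PUBLISHED and ds.license in OPEN_LICENSES


# Hypothetical datasets for illustration.
shared_only = Dataset("survey results", Stage.SHARED, "CC-BY-4.0")
published_closed = Dataset("survey results", Stage.PUBLISHED, "CC-BY-NC-4.0")
published_open = Dataset("survey results", Stage.PUBLISHED, "CC-BY-4.0")
```

Under these assumptions only `published_open` counts as open data: `shared_only` has not reached the publishing stage, and `published_closed` carries a non‐commercial restriction that goes beyond attribution and share‐alike.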
Data Life Cycle
• Different tools for different stages of life cycle
– Data sharing: generating, federating, …
– Data publishing: searching, harvesting, …
– Data archiving: migration, …
• The architecture CAN be shared
Data: Create → Share → Publish → Preserve
Research Phase: In Progress → Results
Stakeholder: Research Institute; Researcher/R. Group
Architecture of data sharing
Requirements for Research Data Sharing
• Understandability
• Findability
• Persistency
– Persistency for data
– Persistency for metadata
• Controllability
Architecture of data sharing
[Figure: layered architecture]
• Layers: Repository, Data, Format, Metadata, Metadata Schema, Access Control, Identifier
• Systematic integration across the layers
• Interoperability on each layer
• Issues noted in the figure
– Database, Search, Maintenance
– Description Language, Schema design, Registry, Interoperability
– Continuous Development, Community of Practice
Architecture of data sharing
[Figure: the same layers, annotated with the four requirements: Understandability, Findability, Persistency, Controllability]
• Access Control: authentication/authorization/audit, ID federation, security
• Identifier: organization, systems, ID federation
Architecture of data sharing
[Figure: the same layers, annotated with examples]
• Organization: DataCite, CrossRef, JaLC
• Schema: Dublin Core, DCAT
• System: CKAN, DSpace, Fedora, WEKO
• Technology: Linked Data
• Identifiers: DOI, ORCID, FundRef
• Coordination and Competition
ID (Identifier)
Research Activities and Related Entities
[Figure: research activities (Survey, Article Writing, Acquiring Data, Publishing Data) and related entities: Data, Digital Articles, Digital objects, Topics, Research Institutions (affiliated), Projects and Funding agencies (Supported), Academic Societies]
Research Activities and Related Entities
[Figure: the same diagram of activities and entities, with an ID attached to each entity]
Identifiers for research
• ID for
– Article
– Data
– Researcher
– Institutions, affiliation
– Funding agency, funded project
– Academic society
– Topic
– …
Identifiers for research
• A research activity is represented with a structure of identifiers
– Planned and submitted
– Organized and executed
– Concluded and evaluated
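What "a structure of identifiers" might look like can be sketched as a small data model. Everything here is illustrative: the class and field names are assumptions, and the project, funder, institution and DOI values are placeholders. The ORCID value is the one shown on the title slide.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Phase(Enum):
    """The three phases listed on the slide."""
    PLANNED = "planned and submitted"
    EXECUTED = "organized and executed"
    CONCLUDED = "concluded and evaluated"


@dataclass(frozen=True)
class Ref:
    """An identifier together with the scheme it belongs to."""
    scheme: str  # e.g. "DOI", "ORCID"
    value: str


@dataclass
class ResearchActivity:
    """A research activity described only by links between identifiers."""
    phase: Phase
    project: Ref
    funder: Ref
    researchers: List[Ref] = field(default_factory=list)
    institutions: List[Ref] = field(default_factory=list)
    articles: List[Ref] = field(default_factory=list)
    datasets: List[Ref] = field(default_factory=list)

    def all_refs(self) -> List[Ref]:
        """Flatten the structure into the list of identifiers it uses."""
        return [self.project, self.funder, *self.researchers,
                *self.institutions, *self.articles, *self.datasets]


# Placeholder values, except the ORCID taken from the title slide.
activity = ResearchActivity(
    phase=Phase.EXECUTED,
    project=Ref("ProjectID", "example-project-001"),
    funder=Ref("FunderID", "example-funder-001"),
    researchers=[Ref("ORCID", "0000-0002-2909-7163")],
    institutions=[Ref("InstitutionID", "example-institute-001")],
    articles=[Ref("DOI", "10.1234/example.article")],
    datasets=[Ref("DOI", "10.1234/example.dataset")],
)
```

Because each entity is only an identifier, the same structure can be read at any phase: a plan lists the project, funder and researchers, and articles and datasets are appended as the work concludes.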
DOI
DOI (Digital Object Identifier)
• Service that translates DOI names into URIs where the digital objects are located
• Service managed by the International DOI Foundation (IDF)
• Initially started by STM publishers to share identifiers for digital publications
• Distributed management
– Delegation of registration tasks to Registration Agencies (RAs)
Roles of DOI
• Provide resolvable, persistent, interoperable links
– Resolvable: standard syntax + mapping by handle system
– Persistent
• Technically: management of registry DBs
• Socially: organizational operations and duties for members
– Interoperable: sharing a data model
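The "standard syntax + mapping" part of resolvability can be illustrated with a short sketch. A DOI name has the form prefix/suffix, where the prefix starts with "10.", and the public resolver at https://doi.org/ redirects a DOI name to its current location. The regular expression below is a simplified approximation of the syntax, and the function names are chosen for this example only.

```python
import re
from typing import Tuple
from urllib.parse import quote

# Simplified pattern: "10." + numeric registrant code (optionally
# subdivided with dots), a slash, then a non-empty suffix.
DOI_PATTERN = re.compile(r"^(10\.\d+(?:\.\d+)*)/(.+)$")


def parse_doi(name: str) -> Tuple[str, str]:
    """Split a DOI name into (prefix, suffix), or raise ValueError."""
    match = DOI_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"not a DOI name: {name!r}")
    return match.group(1), match.group(2)


def resolver_url(name: str) -> str:
    """Build the URL that asks the doi.org resolver to redirect to the object."""
    prefix, suffix = parse_doi(name)
    # Percent-encode characters that are not safe in a URL path.
    return "https://doi.org/" + quote(f"{prefix}/{suffix}", safe="/")


# 10.1000/182 is the DOI of the DOI Handbook itself.
handbook = resolver_url("10.1000/182")
```

The sketch only builds the URL; the persistent part is that the registry behind the resolver is updated when the object moves, so the DOI name itself never has to change.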
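DOI in Architecture of data sharing
[Figure: the layered architecture (Repository, Data, Format, Metadata, Metadata Schema, Access Control, Identifier) with DOI‐related components]
• Identifier: DOI
• Metadata Schema: JaLC Metadata Schema, DataCite Metadata Schema, domain‐specific metadata schemata
• Organizations: DataCite, JaLC
• Members (data providers)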
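How a shared metadata schema sits alongside domain‐specific schemata can be sketched as a completeness check run before registration. The sketch assumes the six mandatory properties of the DataCite Metadata Schema 4.x (Identifier, Creator, Title, Publisher, PublicationYear, ResourceType). The dictionary keys are simplified names chosen for this example, not the schema's exact element names, and the record values are placeholders.

```python
from typing import Dict, List

# Simplified names for the properties assumed to be mandatory.
REQUIRED = (
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publication_year",
    "resource_type",
)


def missing_fields(record: Dict[str, object]) -> List[str]:
    """Return the required properties that are absent or empty."""
    return [key for key in REQUIRED if not record.get(key)]


# Placeholder record; "domain" holds fields from a domain-specific schema,
# which the common check deliberately ignores.
complete = {
    "identifier": "10.1234/example.dataset",
    "creators": ["Example Research Group"],
    "titles": ["Example observation data"],
    "publisher": "Example Institute",
    "publication_year": 2015,
    "resource_type": "Dataset",
    "domain": {"instrument": "example-sensor"},
}

incomplete = {"identifier": "10.1234/example.other", "titles": ["Untitled"]}
```

Keeping the common check small is one way to read the figure: the shared schema guarantees a minimum for every DOI, while each member remains free to carry richer domain‐specific metadata.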
Experiment Project to register DOIs for Research Data
• Goal
− Establish operation flows to register DOIs for research data
• Objectives
− Set policies in registering DOIs for research data
− Establish operation flows to register DOIs for research data with JaLC system.
− Test Data DOI registrations
• Period
− October 2014 – October 2015
Members of the project
9 projects with 14 organizations
Issues in Data DOI
• Flow of operations
• Persistent access
• Granularity of data in registration
• Dynamics of data
• Landing page
• Quantity of data
• Applications
Recommendations for Data DOIs
• Recognize the variety in the nature of data
• Minimal Commitment
– Persistency, Interoperability, Usability, Manageability
• Design your own DOI registration policy
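One way to make a registration policy concrete is to write it down as data and check candidate registrations against it. The sketch below is hypothetical throughout: the field names, the three granularity levels and the rules are assumptions chosen to mirror some of the issues listed earlier (granularity, dynamics of data, landing page). It is not the policy of JaLC or of any project member.

```python
from dataclasses import dataclass
from typing import List

GRANULARITIES = ("collection", "dataset", "file")  # hypothetical levels


@dataclass(frozen=True)
class DoiPolicy:
    granularity: str             # the unit that receives a DOI
    landing_page_required: bool  # every DOI must resolve to a landing page
    new_doi_per_version: bool    # dynamic data: mint a DOI for each version?


@dataclass(frozen=True)
class Candidate:
    unit: str                    # what the provider wants to register
    has_landing_page: bool
    is_new_version: bool         # a revision of already registered data


def problems(policy: DoiPolicy, c: Candidate) -> List[str]:
    """Return the reasons a candidate registration conflicts with the policy."""
    found = []
    if c.unit != policy.granularity:
        found.append(f"unit '{c.unit}' does not match granularity '{policy.granularity}'")
    if policy.landing_page_required and not c.has_landing_page:
        found.append("landing page is missing")
    if c.is_new_version and not policy.new_doi_per_version:
        found.append("policy keeps one DOI across versions; update the existing record")
    return found


# Hypothetical policy and candidates for illustration.
policy = DoiPolicy(granularity="dataset", landing_page_required=True,
                   new_doi_per_version=False)
ok = Candidate(unit="dataset", has_landing_page=True, is_new_version=False)
bad = Candidate(unit="file", has_landing_page=False, is_new_version=True)
```

A different data provider could reasonably choose the opposite settings, for example one DOI per version for frequently updated data, which is why the recommendation is for each provider to design its own policy.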
Metadata and Metadata Schema
Linked Data
• Network of metadata
• Sharing metadata among RAs
– CrossRef
– DataCite
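– (JaLC)
[Figure: Linked Data example, an artwork linked to its creator and to the museum where it is located]
• Work (WorkURI)
– Title: 真夏の太陽 (Midsummer Sun)
– Date: 1989
– Description: See “Walking in Silence” through “Midnight Sun”, a work that somehow makes you want to peek inside as you approach. You can encounter the works in a way different from usual.
– Creator: CreatorURI
– Is_located_in: MuseumPlaceURI
• Creator (CreatorURI)
– Name, E‐address: Isamu [email protected]
• Museum (MuseumPlaceURI)
– Label: Yokohama Museum
– Address: 3‐4‐1, Minato Mirai, Nishi‐ku, Yokohama
– Phone: 045‐221‐0300
• Other labels in the figure: URI, Category, Image, wikipedia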
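The museum example above can be read as a small network of metadata: each statement is a (subject, property, value) triple, and a value may itself be a URI that other statements describe. The sketch below uses plain Python tuples rather than an RDF library. The `example.org` URIs are placeholders, and only the property names and values visible in the figure are used.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Placeholder URIs standing in for WorkURI, CreatorURI and MuseumPlaceURI.
WORK = "http://example.org/work/1"
CREATOR = "http://example.org/creator/1"
MUSEUM = "http://example.org/place/1"

triples: List[Triple] = [
    (WORK, "Title", "真夏の太陽"),
    (WORK, "Date", "1989"),
    (WORK, "Creator", CREATOR),
    (WORK, "Is_located_in", MUSEUM),
    (MUSEUM, "Label", "Yokohama Museum"),
    (MUSEUM, "Address", "3-4-1, Minato Mirai, Nishi-ku, Yokohama"),
    (MUSEUM, "Phone", "045-221-0300"),
]


def objects(subject: str, prop: str) -> List[str]:
    """All values of one property for one subject."""
    return [o for s, p, o in triples if s == subject and p == prop]


def follow(start: str, *path: str) -> List[str]:
    """Walk a chain of properties, treating each value as the next subject."""
    nodes = [start]
    for prop in path:
        nodes = [o for n in nodes for o in objects(n, prop)]
    return nodes


# From the work, hop to the museum and read its label.
museum_label = follow(WORK, "Is_located_in", "Label")
```

Because the museum is referenced by URI rather than copied as text, its address and phone number are stated once and can be reused by every work located there. The same idea underlies sharing metadata among registration agencies.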
Summary
• Open Science becomes possible and inevitable through the Internet/Web
• Open Science is backed by data sharing
• Data‐sharing architecture
– Interoperability should be guaranteed
– Layers
• Metadata Schema/Metadata/ID/Data format/Data/Repository/Access Control
– Cooperation and Competition
• DOI is a promising ID for data, but its use differs from that for literature
– DOI registration policy is needed
• Lots of issues remain in other layers