Conditor

Towards a national reference repository for French scientific production

Valérie Bonvallot (CNRS-Inist) – Thierry Dautcourt (Inria)

Paris, 11 May 2015



2

Conditor: a national recommendation from the Digital Scientific Library

A multi-partner project in the French higher education and research area: Ministry, public institutions with a scientific and technical vocation, universities, agencies, etc.

Building a national reference repository for French scientific production, based on common reference repositories shared by universities and research organizations

3

Conditor: aims and scope

Building a bibliographic reference repository to:

• Share metadata describing French scientific production

• Pool inventories of scientific production

Archive: no full text

Decision-making tool: no indicator production

Portal: no browser interface for end users

Current Research Information System: no research management

Conditor: a reference repository with quality data allowing interoperability

4

Conditor: position in the French STI landscape

[Diagram. Recoverable labels: Conditor; international bibliographic databases (WoS, Scopus, PubMed, etc.); CRIS; archives (HAL); local databases; institutional identification databases; « STI » reference repositories; management reference repositories; common reference repositories; National Repertory of Research Structures (RNSR); IdRef; ISSN; ORCID; ISNI; addresses, themes, authors, journals, congresses, etc.; structures, staff, NRA projects, etc.; researchers, team leaders, information specialists; researchers, laboratory directors, research unit managers; management team.]

5

Conditor: experiment

Experimental principles: pragmatism

• Working with multi-skill volunteers

• Using resources we already have

• Assessing difficulties, benefits and involvement

Experimental group: representatives from 8 organizations and establishments

• National Center for Scientific Research (CNRS)

• National Institute for Agricultural Research (Inra)

• National institute dedicated to computational science (Inria)

• French Research Institute for Development (IRD)

• Bibliographic agency for higher education (Abes)

• Bordeaux University

• Paris Dauphine University

• Ministry of Higher Education and Research

6

Conditor: method for building a corpus

Data from 9 databases for the 2011 publication year, from: open archives, bibliographic database, bibliometric database, mini CRIS, library catalogue

Step 1: Metadata (MD) treatment and curation

Mapping, XML formatting, normalisation / homogenisation of:

• Identifiers
• Document titles
• Authors
• Sources
• Collations
• Addresses
• Document types

Step 2: Detection of duplicates

Several strict alignments of character strings

« Matching group »

Step 3: Enrichment using reference repositories

Named entities, search in addresses

Incorporation of identifiers for research structures and authors

Reference repositories used

• IdRef
• RNSR
• Reference system of CNRS structures

« Enriched » Conditor corpus
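Steps 1 and 2 above (normalisation, then duplicate detection by strict alignment of character strings) can be illustrated with a minimal sketch. It is not the project's actual implementation: the field names (`db`, `title`, `source`, `year`), the choice of key and the sample records are all assumptions made up for this example.

```python
import re
import unicodedata
from collections import defaultdict


def normalise(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", no_accents.lower()).strip()


def match_key(record):
    """Hypothetical strict key: normalised title and source, plus year."""
    return (normalise(record["title"]), normalise(record["source"]), record["year"])


def matching_groups(records):
    """Group records whose keys are strictly equal; keep groups of 2 or more."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]


# Made-up records standing in for exports from three databases.
records = [
    {"db": "db1", "title": "Étude des données ouvertes", "source": "Journal of Data", "year": 2011},
    {"db": "db2", "title": "ETUDE DES DONNEES OUVERTES.", "source": "Journal of data", "year": 2011},
    {"db": "db3", "title": "Another paper", "source": "Journal of Data", "year": 2011},
]

groups = matching_groups(records)
```

Here the first two records normalise to the same key and end up in one matching group, while the third stays on its own. A strict comparison of this kind misses records whose titles differ by more than case, accents or punctuation, which is one reason several alignments on different keys may be combined.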

7

Curation and enrichment

Record in 3 databases: BIRD, HAL, INRIA

No funding in database 1

No affiliation in database 1

8

Curation and enrichment

Record in 2 databases: HAL, Inist

No funding in database 2

1 affiliation missing in database 1

Record not in INRA database
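The two curation examples above show the same publication described in several databases, each with different gaps. A minimal sketch of how such a group could be audited and merged follows. The field names, the "first non-empty value wins" rule and the sample data are assumptions chosen for illustration, not the behaviour of Conditor itself.

```python
def missing_fields(group, fields=("affiliations", "funding")):
    """Report, for each database, which of the given fields are empty."""
    return {r["db"]: [f for f in fields if not r.get(f)] for r in group}


def merge_group(group, fields=("title", "affiliations", "funding")):
    """Build one enriched record: the first non-empty value wins (assumed rule)."""
    merged = {}
    for field in fields:
        merged[field] = next((r[field] for r in group if r.get(field)), None)
    merged["found_in"] = [r["db"] for r in group]
    return merged


# Made-up matching group: one publication seen in three databases.
group = [
    {"db": "db1", "title": "A sample title", "affiliations": [], "funding": None},
    {"db": "db2", "title": "A sample title", "affiliations": ["Lab A"], "funding": None},
    {"db": "db3", "title": "A sample title", "affiliations": ["Lab A"], "funding": "Grant X"},
]

gaps = missing_fields(group)
enriched = merge_group(group)
```

With this data, `gaps` shows that `db1` lacks both affiliations and funding and `db2` lacks funding, while `enriched` carries the affiliation and funding found elsewhere in the group. A real merge would need a validation step when sources disagree, rather than simply taking the first value.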

9

Conditor: conclusions

Aspects of corpus building to improve

◦ Detection of duplicates
◦ Incorporation of data from national reference systems for structures and authors

What we learned

◦ Conditor is « feasible »
◦ Fully automated treatment isn't sufficient
◦ A social structure is needed

Potential advantages

Sharing a common national warehouse of descriptive bibliographic records is essential to:

◦ Manage publications not found in the databases used for evaluation
◦ Avoid repeated manual data entry
◦ Improve information system interoperability
◦ Improve data quality through the use of common reference repositories, data dictionaries and persistent digital identifiers (national research structures, parent organizations, authors, journals, funding, congresses, etc.)

10

What next?

[Timeline. Recoverable labels: Project launch; Year N; Year N+1; 5 years corpus building; 3 years corpus; 5 years corpus; corpus]

Design and development of functionalities in an iterative way, with progressive implementation

Conditor service

Management functionalities
- Retrieval
- Modification
- Deletion
- Validation
- Dissemination

Treatment functionalities
- Duplicate identification
- Enrichment through reference repositories

11

Kiitos
Köszönöm
мерси
Hvala vam
Tänan
Efharisto
Paldies
Ačiū
Grazzi
Dank je
Dziękuję
Obrigado/a
Mulţumesc
Děkuji
Dakujem
Merci
Tak
Grazie
Gracias
Thanks
Danke

http://marie-aux-usa.skyrock.com/2966393337-Des-questions-des-reponses.html

http://www.bibliothequescientifiquenumerique.fr/?Conditor,65