Digital Author Identification, UKSG, 17 – 18 April 2007, Daniel van Spanje
TRANSCRIPT
2
DAI in DARE
• DARE: Digital Academic REpositories
– Universities + KNAW + NWO + KB
– Infrastructure for linking the IR
– Stimulate production of digital scientific output
– 2003 – 2006
• 2007 – 2010: SURFshare
3
Main issues in DAI
• Unique identifying number for researchers / authors
• National scale
• Benefits:
– Improve searching for electronic publications
– Integrate searching for electronic and non-electronic publications
– Link Library (Catalogue) and research environment (Metis)
4
Two projects
• Pilot in 2005 – 2006
– one university: Groningen
• Roll-out 2006 – 2007
– 13 academic research organizations
• Project leader: Anneloes Degenaar
• DAI website at University of Groningen:
– http://dai.weblog.ub.rug.nl/
– http://dai-uitrol.ub.rug.nl/
6
Systems involved
• Institutional repository / DAREnet
• Metis
• Dutch Union Catalogue (NCC/PiCarta)
15
Names and other issues
• Authors with the same name
• Use of one or more initials
• Changing names
• Spelling variants
• Diacritics
• Pseudonyms
• Name in religion
• Nicknames
• Collective names
• Different structure of names in other languages and cultures
• …..
• Discussions on standardization and unification started in the Netherlands in the Orion project (2003-2004)
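To illustrate why name strings alone are a weak identifier, here is a minimal sketch of a name-matching key that folds diacritics, case and initials. The functions `strip_diacritics` and `match_key` are hypothetical helpers for illustration, not part of the DAI project; the example names are made up.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters and drop the combining marks (e.g. ü -> u)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def match_key(surname: str, given_names: str) -> tuple[str, str]:
    """Return (normalized surname, first initial) as a crude matching key."""
    surname_key = strip_diacritics(surname).casefold().strip()
    given = strip_diacritics(given_names).strip()
    first_initial = given[0].casefold() if given else ""
    return surname_key, first_initial


# Spelling variants of one person collapse to the same key ...
print(match_key("Müller", "J.P."))
print(match_key("Muller", "Johan"))
# ... but so would two different people called J. Muller.
```

A key like this helps to find candidate matches, but it cannot tell two authors with the same name apart, which is the case a unique identifying number is meant to solve.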
16
Proposed solution
• Need established
• “External” requirements:
– use existing mechanisms
– local management
– national function
• Solution: use “collocation” mechanism of libraries and Metis as source
19
How did we link
• Mechanisms
– Initial load per organization
– Online input buttons (webtemplates)
– XML output
– Synchronization mechanisms
• Requirements
– No overwrite of library data!
– Deduplication (Matching/merging)
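The "no overwrite" requirement can be sketched as follows: the Metis name is stored next to the library data, per organization, and the library's own fields are never touched. The record layout and the function `attach_metis_name` are assumptions made for illustration; the slides do not describe the actual record format.

```python
from copy import deepcopy


def attach_metis_name(library_record: dict, organisation: str, metis_name: str) -> dict:
    """Return a copy of the record with the Metis name added for one organisation.

    Library fields are left exactly as they were; Metis data lives under
    its own (hypothetical) key, "metis_names".
    """
    updated = deepcopy(library_record)
    updated.setdefault("metis_names", {})[organisation] = metis_name
    return updated


# Hypothetical library authority record
library_record = {"name": "Rotteveel", "name_variants": []}
updated = attach_metis_name(library_record, "Groningen", "Rotteveel, R")

print(updated["name"])         # library name is unchanged
print(updated["metis_names"])  # Metis name stored alongside it
```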
20
Datamodel developed
• Datamodel copied from bibliographic model: three levels
• Metis name-information added to library data; no overwrite
• Affiliations and other fields added
21
Structure of bibliographic data
[Diagram: three levels, labelled general, local and copy]
• General: bibliographic metadata (YoP / LoP / Title / Author / Imprint / LCSH / DDC); linked authority record
• Local: Groningen bibdat (subject headings); Amsterdam bibdat (subject headings)
• Copy: location, holding, shelf number (one block per copy)
22
Structure of authority data
[Diagram: three levels, labelled library record, Metis and affiliation]
• Library record: thesaurus record (name of author, variant names); linked authority record
• Metis: Groningen data (Metis name); Amsterdam data (Metis name)
• Affiliation: begin and end date (one block per affiliation)
25
Datamodel: fields
Authority file
• Nationality
• Language
• Name (best known)
• Name (most complete)
• Maiden name
• Name variants
• Date of birth
• Date of death
• Profession / subject
• Link to pseudonyms
• Notes
• Entry date
• Update date
• Note: proper name field includes subfields for first name, middle name, last name, prefix, suffix
Added fields
• Local researcher number
• Metis name (preferred)
• Metis name
• Sex
• Code organisation
• Name organisation
• Start date employment
• End date employment
• Code function
• Description of function
• Code of employment
• Notes
• Entry date
• Update date
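As a rough sketch, the three levels (library record, Metis data per organization, affiliations) and a few of the fields listed above could be modelled like this. The class and attribute names are assumptions chosen for readability, the DAI value is a placeholder, and only a subset of the fields is shown; the sample values are taken from the example URL elsewhere in the slides.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Affiliation:
    organisation_code: str
    organisation_name: str
    start_date: str                 # kept as text, e.g. "01-01-2005"
    end_date: Optional[str] = None


@dataclass
class MetisData:
    local_researcher_number: str
    metis_name: str
    affiliations: list[Affiliation] = field(default_factory=list)


@dataclass
class AuthorityRecord:
    dai: str                        # the identifier itself
    name: str                       # library name, never overwritten
    name_variants: list[str] = field(default_factory=list)
    metis: dict[str, MetisData] = field(default_factory=dict)  # keyed by organisation


record = AuthorityRecord(dai="EXAMPLE-DAI", name="Rotteveel")
record.metis["Groningen"] = MetisData(
    local_researcher_number="00033",
    metis_name="Rotteveel",
    affiliations=[
        Affiliation("22020200", "Medical Microbiology", "01-01-2005", "01-01-2006")
    ],
)
print(record.metis["Groningen"].affiliations[0].organisation_name)
```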
26
Initial load
[Flowchart; boxes listed in the order they were extracted]
• Metis makes list of names
• Manual dedup of list
• Dedup in Metis
• Make Metis export
• Format conversion
• Load B-records (? Duplicates?)
• Export DAIs to Metis
• Manual dedup by library staff
• Load DAI in Metis
• Load new names (not found)
• Merge names with names found
• Match names with auth file
27
Initial load
• Data enrichment in Metis
• Export from Metis
• Conversion to cataloguing system
• Matching
• Merging: merge / new / B-record
• Results depend on quality of metadata
– 95% automatic / 5% manual
– 70% automatic / 30% manual
– 50% automatic / 50% manual
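The matching and merging step can be sketched as a simple classification. The rule used here (exactly one candidate: merge; none: new; several: manual review) is an assumption for illustration only; the slides do not state the actual matching criteria, and the B-record case is left out. The second and third names in the sample data are made up.

```python
def classify_matches(metis_names, authority_index):
    """Sort Metis names into merge / new / manual, based on candidate count."""
    outcome = {"merge": [], "new": [], "manual": []}
    for name in metis_names:
        candidates = authority_index.get(name, [])
        if len(candidates) == 1:
            outcome["merge"].append((name, candidates[0]))   # unambiguous hit
        elif not candidates:
            outcome["new"].append(name)                      # not found: load as new
        else:
            outcome["manual"].append((name, candidates))     # ambiguous: staff decides
    return outcome


def automatic_share(outcome):
    """Fraction of names handled without manual work."""
    total = sum(len(items) for items in outcome.values())
    automatic = len(outcome["merge"]) + len(outcome["new"])
    return automatic / total if total else 0.0


# Hypothetical authority index: normalized name -> authority record ids
authority_index = {"rotteveel r": ["A1"], "jansen j": ["A2", "A3"]}
outcome = classify_matches(["rotteveel r", "jansen j", "de vries p"], authority_index)
print(outcome)
print(f"{automatic_share(outcome):.0%} automatic")
```

The cleaner the names in the source system, the fewer ambiguous cases end up in the manual pile, which is consistent with the spread of automatic versus manual shares reported above.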
28
Online process
• DAI-button in Metis to create DAI-number
• Export DAI-button in NTA / cataloguing system to Metis
• DAI-button in IR to create DAI-number
• Separate DAI-http-request for online input
• Online input via current cataloguing tool
• + Offline synchronization mechanisms between Metis and NTA
30
URL link instead of button
• http://www.pica.nl/dai/dai_redirect.php?action=maak_dai&user=<usernumber>&metis_export_url=http://oras.service.rug.nl:1111/metisdad&p_onderzoekernummer=00033&p_naam_medewerker=Rotteveel&p_voorletter=R&p_voorvoegsel=&p_titulatuur=&p_voorkeur=J&p_geslacht=M&p_geboortedatum=01-07-1974&p_code_functie=20&p_functie=Universitair%20hoofddocent&p_code_organisatie=22020200&p_organisatie_a=Medical%20Microbiology&p_begin_aanstelling=01-01-2005&p_einde_aanstelling=01-01-2006
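A request like the one above could be assembled programmatically, roughly as follows. This is a sketch under assumptions: the parameter names and values are copied from the example URL, the user number `"0000"` is a placeholder, and `build_dai_url` is a hypothetical helper. Unlike the slide, this version percent-encodes the nested `metis_export_url`; whether the service expects that is not stated in the slides.

```python
from urllib.parse import urlencode, quote, urlsplit, parse_qs

DAI_ENDPOINT = "http://www.pica.nl/dai/dai_redirect.php"


def build_dai_url(user: str, metis_export_url: str, researcher: dict) -> str:
    """Build the 'maak_dai' (create DAI) request URL from researcher fields."""
    params = {
        "action": "maak_dai",
        "user": user,
        "metis_export_url": metis_export_url,
    }
    params.update(researcher)
    # quote_via=quote encodes spaces as %20, as in the example URL
    return DAI_ENDPOINT + "?" + urlencode(params, quote_via=quote)


researcher = {
    "p_onderzoekernummer": "00033",
    "p_naam_medewerker": "Rotteveel",
    "p_voorletter": "R",
    "p_voorvoegsel": "",
    "p_functie": "Universitair hoofddocent",
    "p_code_organisatie": "22020200",
    "p_organisatie_a": "Medical Microbiology",
    "p_begin_aanstelling": "01-01-2005",
    "p_einde_aanstelling": "01-01-2006",
}

url = build_dai_url("0000", "http://oras.service.rug.nl:1111/metisdad", researcher)
print(url)
```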
32
Results of the DAI project
• Now:
– 50% of the researchers have a DAI
– Procedure for initial load in place
– Start with online procedure
– Privacy statement
• Autumn 2007
– Online procedure in place
– Procedure for synchronization in place
– 100% of the researchers will have a DAI in 2007 (ca. 40,000)
33
Things to do
• Finalize the roll-out, develop services (passport …) and set up a user group
• Add DAI to metadata standards (DCX, MODS)
• International standardisation: ISPI
• Involve authors in checking and updating their data