NEEO Workpackage 5
NEEO Workpackage Leader Meeting - 3
Warwick, UK
3 September, 2009
Benoit PAUWELS
• Current release: 1.4.5 (on DEMO); changes since 1.3.1:
– Permalink to publication
– Link to scholar page of EO scholars near names in search results
– Publication lists + export to PDF, RTF, BibTeX and RIS possible
– Folder with selected publications and separate export facilities
– Can produce download statistics as a graph and as a table (not visible since backend service not ready)
– Multilingual search (MLIA) working
– Added translations for the site, not complete yet
– Advanced search possible, some issues
– Multilingual JEL searching facility
– RSS feed
– Portlet
– Contextual help
EO portal
EO portal – advanced searching
EO portal – multilingual JEL searching
EO portal - MLIA
EO portal - export
EO portal - portlet
• Outstanding
– Integration of datasets
  • Overview of all datasets
  • Reference from/to publication
– Full text searching
– Enriched metadata
– JTrac:
  • outstanding issues; bugs
  • new issues come along with new functionalities (advanced search, MLIA, statistics, …)
  • becomes public with release 1.5 of the EO portal
• Testing portal functionality
– Checklist and distribute workload over IWG members and other volunteers
– Reporting bugs and issues through [email protected], who feed into JTrac
– Solved by TU-technical team
EO portal
• Validation software
– available: http://www.economistsonline.org/validate
– described in TG – annex 3 – 1.5
• Partners visible in DEMO (31/8/2009)
– #14
– some visible on homepage, but IR not yet harvested
– some still with problems in admin file
– some still with problems in delivered metadata
• Outstanding
– Sciences Po: 20/9
– UCL: no URL yet for their admin file
– UCLouvain: ready, not yet added to EO
– Toulouse: internal reorganization of IR; ready early September
– Warwick: not before October 2009
– Geneva: will reuse solution of UCLouvain, no content yet
– Monash: reuse solution of UCLouvain, needs DC2MODS mapping
DIDL/MODS + Admin file
• Integration with Copyright Knowledge Bank
– Implementation on DSpace made available to partners
• « Version signposting » tool / Cover page
– Beta implementation on DSpace; to be made available to partners
• Allow for harvesting and integration of NEEO enriched metadata into local IR
– Original + enriched metadata in EO to be exposed as DIDL/MODS records over SRU
Other local adaptations
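As an illustration of how a local IR might query enriched records over SRU, here is a minimal sketch that only builds a searchRetrieve request URL. The parameter names (operation, version, query, recordSchema, startRecord, maximumRecords) are standard SRU; the base URL, the CQL index `dc.title` and the schema identifier `didl_mods` are assumptions made for the example, not values confirmed by the EO Gateway.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def sru_search_url(base_url, cql_query, record_schema,
                   start=1, maximum=10, version="1.1"):
    """Build an SRU searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,
        "recordSchema": record_schema,  # assumed schema identifier
        "startRecord": start,
        "maximumRecords": maximum,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and schema name, for illustration only.
url = sru_search_url(
    "http://gateway.example.org/sru",
    'dc.title = "inflation"',
    "didl_mods",
    maximum=5,
)
query_args = parse_qs(urlparse(url).query)
print(url)
```

A local IR would fetch this URL, read the returned records, and merge the enriched fields into its own metadata; that merge step depends on the local repository software and is left out here.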
• Harvest other resources in economics
– Quick wins =
  • OAI-PMH
  • XML md format with unambiguous indication of location of full text
– DOAJ – Business and economics
  • OAI-PMH/DoajArticle
– Econis - EconBiz
  • OAI-PMH/?
EO Gateway
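The "quick wins" above rely on OAI-PMH harvesting. A minimal sketch of the harvesting loop's two building blocks follows: building a ListRecords request and reading record identifiers plus the resumption token from a response page. The verb, the `metadataPrefix` and `resumptionToken` arguments and the `oai_dc` prefix are part of the OAI-PMH protocol; the base URL and the sample response are made up for the example, and no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix=None, set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: no other arguments allowed.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (record identifiers, resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    ids = [el.text for el in
           root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Made-up response page, for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("http://example.org/oai", metadata_prefix="oai_dc")
ids, token = parse_page(SAMPLE)
next_url = list_records_url("http://example.org/oai", resumption_token=token)
```

A harvester would repeat the request with each new token until a page comes back without one.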
• Usage metadata database in EO Gateway
– Prototype ready: DoDoCo
– Not yet based on real database storage
– Currently fed with false data visible through portal
• Usage stats for NEEO publications
– Start harvesting SWUP usage metadata from partners over PMH
– Registration through admin file
– Current data providers: EUR, LSE, TU, ULB
• Usage stats for RePEc publications
– Reuse LogEc data
Usage metadata
• Export of usage statistics per publication, author and institution
– Draft portal specifications are available
• Usage of portal
– AWStats report
Usage reports
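To make the three export views (per publication, per author, per institution) concrete, here is a small sketch that counts download events along each dimension. The event layout, field names and sample values are hypothetical; they do not reflect the SWUP format or the draft portal specifications.

```python
from collections import Counter


def aggregate_downloads(events, key):
    """Count download events per value of `key`.

    `key` may hold a single string or a list of strings (e.g. co-authors).
    """
    counts = Counter()
    for event in events:
        values = event[key]
        if isinstance(values, str):
            values = [values]
        for value in values:
            counts[value] += 1
    return counts


# Hypothetical download events, for illustration only.
events = [
    {"publication": "pub-1", "authors": ["A. Author", "B. Author"], "institution": "EUR"},
    {"publication": "pub-1", "authors": ["A. Author", "B. Author"], "institution": "EUR"},
    {"publication": "pub-2", "authors": ["C. Author"], "institution": "LSE"},
]

per_publication = aggregate_downloads(events, "publication")
per_author = aggregate_downloads(events, "authors")
per_institution = aggregate_downloads(events, "institution")
```

Counting a download once per co-author is a design choice made here for simplicity; a real export would need an agreed rule for multi-author publications.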
• Meresco supports full-text indexing and searching
• Appropriate Meresco software installed and ready for use at TU
• Not yet implemented (because of holidays)
• In essence:
– complete text is extracted from a PDF and added as an extra field
– extra search option in portal
• OCR?
• FT index of all OA RePEc publications?
• Optimization of search experience: noise, boosting, relevance ranking
Full-text search engine
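The "extra field" idea can be sketched independently of Meresco: the text extracted from a PDF is stored next to the existing metadata fields, and the search decides which fields to consult. This is a toy inverted index, not the Meresco API; PDF extraction is assumed to have happened already, and the record layout and field names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def add_fulltext(record, extracted_text):
    """Return a copy of the record with the extracted text as an extra field."""
    enriched = dict(record)
    enriched["fulltext"] = extracted_text
    return enriched


def build_index(records, fields):
    """Map (field, token) to the set of record ids containing that token."""
    index = defaultdict(set)
    for rec in records:
        for field in fields:
            for tok in tokenize(rec.get(field, "")):
                index[(field, tok)].add(rec["id"])
    return index


def search(index, query, fields):
    """Return ids of records in which every query token occurs in some field."""
    result = None
    for tok in tokenize(query):
        matches = set()
        for field in fields:
            matches |= index.get((field, tok), set())
        result = matches if result is None else result & matches
    return sorted(result or [])


# Hypothetical records; the full text stands in for text extracted from a PDF.
records = [
    add_fulltext({"id": "p1", "title": "Monetary policy rules"},
                 "We study inflation targeting under uncertainty."),
    add_fulltext({"id": "p2", "title": "Labour markets"},
                 "Wage bargaining in small open economies."),
]
index = build_index(records, ["title", "fulltext"])

metadata_only = search(index, "inflation", ["title"])
with_fulltext = search(index, "inflation", ["title", "fulltext"])
```

The example also shows why noise becomes a concern: every word of every document is now searchable, so ranking and boosting of metadata fields over full text matter more.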
• Is now supported
– search results screen
– partner ‘more info’ screen
– scholar publication list
• RSS feed = first 50 records of any search
– title of RSS feed contains information on search query
• RSS item
– title
– date of harvest in EO
– complete APA
– abstract
– permalink (to publication in the EO portal)
• Outstanding
– most recent first; based on harvest date: Meresco issue
– usability? only first 50 records given in feed; risk that user misses out on publications
RSS
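The feed described above can be sketched with the standard library. The mapping of the listed item fields onto RSS 2.0 elements (harvest date to `pubDate`, APA citation plus abstract to `description`, permalink to `link` and `guid`) is an assumption for this example, as are the record layout, the feed title wording and the example.org URLs. Sorting by harvest date shows the intended "most recent first" behaviour, which the slide lists as still outstanding.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
import xml.etree.ElementTree as ET

FEED_LIMIT = 50  # the feed carries only the first 50 records of a search


def build_feed(query, records, portal_url):
    """Return an RSS 2.0 document (as a string) for the given search results."""
    newest_first = sorted(records, key=lambda r: r["harvested"], reverse=True)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # The feed title carries information on the search query.
    ET.SubElement(channel, "title").text = f"Search results for: {query}"
    ET.SubElement(channel, "link").text = portal_url
    ET.SubElement(channel, "description").text = (
        f"First {FEED_LIMIT} records for the query '{query}'")
    for rec in newest_first[:FEED_LIMIT]:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = rec["title"]
        ET.SubElement(item, "link").text = rec["permalink"]
        ET.SubElement(item, "guid", isPermaLink="true").text = rec["permalink"]
        ET.SubElement(item, "pubDate").text = format_datetime(rec["harvested"])
        ET.SubElement(item, "description").text = (
            rec["apa"] + "\n\n" + rec["abstract"])
    return ET.tostring(rss, encoding="unicode")


# Hypothetical search results: 60 records harvested on consecutive days.
start = datetime(2009, 1, 1, tzinfo=timezone.utc)
records = [
    {
        "title": f"Publication {i}",
        "permalink": f"http://portal.example.org/publication/{i}",
        "harvested": start + timedelta(days=i),
        "apa": f"Author, A. (2009). Publication {i}.",
        "abstract": f"Abstract of publication {i}.",
    }
    for i in range(60)
]
feed_xml = build_feed("inflation", records, "http://portal.example.org/")
```

With 60 matching records, the ten oldest never reach the feed, which is the usability risk noted under "Outstanding".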
Title for the feed
Title of publication
Date of harvest
APA
Abstract
RSS
• JEL
– ready
– optimizable: correct JEL codes
• Extraction of bibliographic references
– No result
– NEEO publications: ? use CitEc service/software for extraction of citations
– RePEc publications: reuse CitEc information
• Integration into EO Gateway
– JEL: first tests underway - EUR has made enriched md records available according to agreed specifications
Enrichment
• EO RePEc archive implementation underway
• Still waiting for integration into EconPapers
• TK says:
"Nereus could have made it easier by just giving us the data, let it be displayed as working papers, and then have gathered reaction on how it should be more properly displayed, instead of insisting the documents have to shown as similar to something that RePEc considers they are not. It's as a result of this extra requirement that we are stuck here."
• We deliver our metadata according to the revised ReDIF format
– Doesn’t break current RePEc services
– Wait for reactions from researchers on wrong presentation in EconPapers and IDEAS
EO RePEc archive
• Solution based on Google Translate ready in portal
• Optimization: EXTRAKT software allows us to integrate our own dictionaries
MLIA
• Deliverable D5.5
• Finalize DIDL/MODS + Admin file implementation at all partners
• FT indexing and searching: NEEO, RePEc
• Extraction of bibliographic references: NEEO, RePEc (CitEc?)
• Integration of enriched metadata: NEEO, RePEc
• Allow for harvesting and integration of NEEO enriched metadata into local IR
• Exchange of usage metadata: NEEO, RePEc/LogEc
• Portal
– Systematic testing of functionalities and delivered metadata
– Some revision of specifications (and hence implementation) might be necessary
– Solving bugs and issues
– Usage reports
– Integration of datasets
• Publication lists (PDF, RTF)
– minor adjustments to layout
• EO RePEc archive
• Harvest non-NEEO/non-RePEc information sources: quick wins: DOAJ, Econis
• MLIA: optimization with EXTRAKT
• Portlet
• OCR
Activities for last 6 months