how to participate in the
Union Catalogue Project
Hussein Suleman [email protected]
Sivulile – Open Access South Africa
Advanced Information Management Laboratory
Department of Computer Science
University of Cape Town
September 2005
Overview
  Union Archive and Catalogues
  OAI Harvesting
  Users, Data Providers and Service Providers
  ETDMS
  IRs and National Projects
  Future Directions
What is the Union Catalogue Project?
  The NDLTD Union Archive is a single archive containing records from multiple NDLTD member sites and consortia.
  NDLTD collaborates with various Union Catalogues, each of which provides discovery services for ETDs based on data from the Union Archive.
Purpose of Union Catalogue Project
  Create an international catalogue of all ETDs (and TDs).
  Provide a one-stop shop for researchers searching for ETDs.
  Increase visibility for ETDs.
  Make it simpler for new service providers to incorporate ETD metadata.
  Make it simpler for new data providers to gain visibility outside their institutions.
  Serve as a testbed for cutting-edge DL technology.
Harvesting and the OAI-PMH
  The Open Archives Initiative Protocol for Metadata Harvesting is used to share metadata.
    Transfer of XML metadata records over the WWW.
  If an institution uses software that includes an OAI-PMH server, then others can harvest its metadata.
  An institution can also use an OAI-PMH client to harvest from the Union Archive (or other institutions).
FAQ: Harvesting
  If harvesting occurs often, won’t my server be overloaded?
    No. With OAI-PMH, a harvester can request only the records that changed since its last harvest.
  Won’t harvesting expose records I want to protect?
    No. You can expose the records you want and hide the rest.
  But I’ll lose control of my collection and its branding.
    Not really. You share only the metadata and provide URLs that lead people back to your institution’s website.
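The first answer relies on OAI-PMH's selective harvesting: a ListRecords request may carry a `from` datestamp so that only records changed since then are returned. A minimal sketch of building such a request follows; the base URL `http://repository.example.org/oai` and the helper name `list_records_url` are assumptions for illustration, not part of the slides.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords URL (hypothetical helper).

    from_date is an optional datestamp such as "2005-09-01"; when given,
    the repository is asked only for records changed on or after it.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical repository base URL, used only for illustration.
EXAMPLE_BASE = "http://repository.example.org/oai"

full_harvest = list_records_url(EXAMPLE_BASE)
incremental = list_records_url(EXAMPLE_BASE, from_date="2005-09-01")
```

A harvester would typically remember the date of its last successful run and pass it as `from_date` the next time, so repeat visits transfer only new or modified records.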
Union Archive and Catalogues
  [diagram: sources feed the Union Archive, which feeds the catalogues]
  sources: Virginia Tech, Uppsala, IBICT, Brigham Young, . . .
  Union Archive
  catalogues: OCLC SRU, Virginia Tech, Elsevier SCIRUS, VTLS Virtua, Google Scholar?, . . .
Who hosts the Union Archive/Catalogues?
  Union Archive
    OCLC
  Union Catalogues: production
    OCLC – SRU service
    VTLS
    Elsevier
    …
  Union Catalogues: experimental
    Virginia Tech
    …
Using the Catalogue: as a user 1/4
  To search for an ETD, start from the NDLTD website (www.ndltd.org):
Using the Catalogue: as a user 2/4
  Select a Union Catalogue and go to its site.
Using the Catalogue: as a user 3/4
Using the Catalogue: as a user 4/4
  Using the SRU interface provided by OCLC:
  Request:
    http://alcme.oclc.org/srw/search/NDL/SearchRetrieveService?query=dc.creator%3Dsuleman&maximumRecords=10&recordSchema=default&startRecord=1&xsl=http://alcme.oclc.org/ndltd/XNDL.xsl
  Response:
    <searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
      <version>1.1</version>
      <numberOfRecords lowestSetBit="2">4</numberOfRecords>
      <resultSetId>n42hu4</resultSetId>
      <resultSetIdleTime lowestSetBit="2">300</resultSetIdleTime>
      <records>
        <record>
          <recordSchema>info:srw/schema/1/dc-v1.1</recordSchema>
          <recordPacking>xml</recordPacking>
          <recordData>
            <oai_dc:dc …>
              <dc:title>Open digital libraries</dc:title>
              <dc:creator>Suleman, Hussein.</dc:creator>
              …
            </oai_dc:dc>
          </recordData>
          <recordPosition lowestSetBit="0">1</recordPosition>
        </record>
        …
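The request/response pair above can be sketched in Python. This is an illustration under stated assumptions rather than an OCLC client: the helper names `sru_url` and `parse_sru` are hypothetical, the optional `xsl` stylesheet parameter is left out, and `SAMPLE` is a trimmed, hand-completed copy of the response on the slide (the `oai_dc` and `dc` namespace URIs are the standard ones, which the slide elides).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base URL as shown on the slide.
SRU_BASE = "http://alcme.oclc.org/srw/search/NDL/SearchRetrieveService"
NS = {
    "srw": "http://www.loc.gov/zing/srw/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def sru_url(creator, maximum_records=10, start_record=1):
    """Build a searchRetrieve URL for a creator query (hypothetical helper)."""
    params = {
        "query": f"dc.creator={creator}",
        "maximumRecords": maximum_records,
        "recordSchema": "default",
        "startRecord": start_record,
    }
    return SRU_BASE + "?" + urlencode(params)


def parse_sru(xml_text):
    """Return (title, creator) pairs from a searchRetrieveResponse."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iterfind(".//srw:record", NS):
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        creator = record.findtext(".//dc:creator", default="", namespaces=NS)
        results.append((title, creator))
    return results


# Trimmed sample modelled on the slide; elided parts are filled in by assumption.
SAMPLE = """<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <numberOfRecords>1</numberOfRecords>
  <records>
    <record>
      <recordSchema>info:srw/schema/1/dc-v1.1</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Open digital libraries</dc:title>
          <dc:creator>Suleman, Hussein.</dc:creator>
        </oai_dc:dc>
      </recordData>
      <recordPosition>1</recordPosition>
    </record>
  </records>
</searchRetrieveResponse>"""

records = parse_sru(SAMPLE)
```

Calling `sru_url("suleman")` reproduces the query portion of the request shown on the slide; fetching that URL and passing the body to `parse_sru` would be the obvious next step against a live service.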
Using the Archive: as a service provider
  Anyone can be a service provider!
  Obtain search/browse software (check OAI website) or write your own.
  Harvest metadata regularly from:
    http://alcme.oclc.org/ndltd/servlet/OAIHandler
  Test your interface over a period of time.
  Request to be added to the NDLTD browse/search page.
    Send email to [email protected]
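For those writing their own harvester, the "harvest metadata regularly" step might look like the sketch below. It follows the OAI-PMH ListRecords flow: request a page, yield its records, and continue while the response carries a non-empty resumptionToken. The function names and the injectable `fetch` parameter are assumptions made for illustration; error responses from the repository are not handled here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI = "{http://www.openarchives.org/OAI/2.0/}"
# Union Archive base URL as given on the slide.
BASE_URL = "http://alcme.oclc.org/ndltd/servlet/OAIHandler"


def fetch(url):
    """Download one response body (default transport; replaceable for testing)."""
    with urlopen(url) as response:
        return response.read()


def harvest(base_url=BASE_URL, metadata_prefix="oai_dc", fetch=fetch):
    """Yield every <record> element, following resumption tokens page by page."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(base_url + "?" + urlencode(params)))
        for record in root.iter(OAI + "record"):
            yield record
        token = root.findtext(f".//{OAI}resumptionToken")
        if not token:
            # A missing or empty token marks the last page.
            break
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

Because `harvest` is a generator, a service provider can index records as they arrive instead of holding the whole archive in memory, for example `for record in harvest(): index(record)` with an indexing function of its own.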
Using the Archive: as a data provider
  To contribute your ETD metadata:
    Install/configure a software module to share metadata using the OAI-PMH.
    Test your OAI-PMH interface at the OAI’s website (www.openarchives.org) or the Repository Explorer (re.cs.uct.ac.za).
    Register your archive with NDLTD:
Software Options
  ETDdb+derivatives – Virginia Tech
  DSpace
    +enhancements for ETDs
  EPrints
  VTLS Valet software
  …
  Any OAI-compliant IR software
Metadata Formats: DC/MARC/ETDMS
  Dublin Core is required by OAI-PMH.
  MARC(21) is the internal format in many ILSes.
  ETDMS = DC + 4 extra fields:
    degree name e.g., PhD
    degree level e.g., doctoral
    degree discipline e.g., Computer Science
    degree grantor e.g., Virginia Tech
  ETDMS defines mappings to MARC and DC.
ETDMS
  Ideally, the Union Archive will harvest ETDMS records, otherwise DC.
  Why ETDMS?
    Dublin Core is not sufficient for ETDs.
    MARC is too complicated.
    MARC is not popular within the OAI community.
    ETDMS defines semantics of DC fields as well as mappings to MARC fields.
  http://www.ndltd.org/standards/metadata/current.html
ETDMS Sample
  <thesis xmlns="http://www.ndltd.org/standards/metadata/etdms/1.0/"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.ndltd.org/standards/metadata/etdms/1.0/ http://www.ndltd.org/standards/metadata/etdms/1.0/etdms.xsd">
    <title>Conceptual Development and Empirical Testing of an Outdoor Recreation Experience Model: The Recreation Experience Matrix (REM)</title>
    <creator>Walker, Gordon James</creator>
    …
    <degree>
      <name>PHD</name>
      <level>doctoral</level>
      <discipline>Forestry</discipline>
      <grantor>Virginia Polytechnic Institute and State University</grantor>
    </degree>
  </thesis>
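A record of this shape can be generated programmatically. The sketch below is an assumption-laden illustration, not a reference implementation: `build_etdms` is a hypothetical helper, it emits only the fields shown in the sample (title, creator and the four degree fields), and it does not validate the result against the ETDMS schema.

```python
import xml.etree.ElementTree as ET

ETDMS_NS = "http://www.ndltd.org/standards/metadata/etdms/1.0/"


def build_etdms(title, creator, degree_name, level, discipline, grantor):
    """Serialize a minimal ETDMS <thesis> record as a string (hypothetical helper)."""
    # Serialize ETDMS as the default namespace, as in the sample.
    # Note: this changes ElementTree's global prefix registry.
    ET.register_namespace("", ETDMS_NS)

    def q(tag):
        # Qualify a tag name with the ETDMS namespace.
        return f"{{{ETDMS_NS}}}{tag}"

    thesis = ET.Element(q("thesis"))
    ET.SubElement(thesis, q("title")).text = title
    ET.SubElement(thesis, q("creator")).text = creator
    degree = ET.SubElement(thesis, q("degree"))
    for tag, value in (
        ("name", degree_name),
        ("level", level),
        ("discipline", discipline),
        ("grantor", grantor),
    ):
        ET.SubElement(degree, q(tag)).text = value
    return ET.tostring(thesis, encoding="unicode")


record_xml = build_etdms(
    "Conceptual Development and Empirical Testing of an Outdoor Recreation "
    "Experience Model: The Recreation Experience Matrix (REM)",
    "Walker, Gordon James",
    "PHD",
    "doctoral",
    "Forestry",
    "Virginia Polytechnic Institute and State University",
)
```

In practice the OAI-PMH software listed earlier would normally produce ETDMS for you; a helper like this is mainly useful when exporting from a local database that has no OAI-PMH module.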
Institutional Repositories
  IRs can include ETDs, and register an OAI-PMH set (subset of metadata) with NDLTD.
    Do not register a general-purpose IR without specifying an ETD set.
  IRs can collect ETD metadata from departmental collections around campus.
    Register as either departments or campus or region or country.
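To see which set an IR would register, one can look at its ListSets response and then restrict ListRecords to the ETD set with the `set` argument. The sketch below assumes a repository at `http://repository.example.org/oai` whose sets are named `etd` and `preprints`; both the URL and the set names are invented for illustration, as are the helper names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def list_sets(xml_text):
    """Map setSpec to setName for every <set> in a ListSets response."""
    root = ET.fromstring(xml_text)
    return {
        s.findtext(OAI + "setSpec"): s.findtext(OAI + "setName")
        for s in root.iter(OAI + "set")
    }


def set_records_url(base_url, set_spec, metadata_prefix="oai_dc"):
    """Build a ListRecords URL limited to one set (hypothetical helper)."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    }
    return base_url + "?" + urlencode(params)


# Invented ListSets response for a general-purpose IR with one ETD set.
SAMPLE_SETS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>etd</setSpec><setName>Theses and Dissertations</setName></set>
    <set><setSpec>preprints</setSpec><setName>Preprints</setName></set>
  </ListSets>
</OAI-PMH>"""

sets = list_sets(SAMPLE_SETS)
etd_url = set_records_url("http://repository.example.org/oai", "etd")
```

Registering the base URL together with the ETD setSpec lets a harvester collect only theses and dissertations and skip the rest of the repository.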
National Projects
  Union Archives can be run on a National or Regional level (in addition to NDLTD’s).
    Register only the national archive with NDLTD.
  Advantages:
    National registry of ETDs with local search.
  Disadvantages:
    Researchers are restricted to local content.
    More effort in maintaining archive and services.
  Recommendation:
    Maintain national registry, but register each institution with NDLTD and use international discovery services.
Future Directions 1/2
  Union Archive
    Integration between NDLTD membership management and Union Archive – automatic registration of baseURLs.
  Union Catalogues
    Elsevier SCIRUS integration with NDLTD website.
    Google Scholar.
    More experimentation…
Future Directions 2/2
  ETDMS
    Update standard to use DC namespace.
    Direct object location information?
    Rights management?
    Subject headings?
    SURVEY!
  Preservation
    Multiple copies of metadata in Union Archive.
    Multiple copies of source files.
that’s all folks!