How ProQuest Handles Original Data Provided by the Publishers, and Presents It in the Full-Text Aggregator’s Database

Steven A. Knowlton, MLIS

ALCTS CRS Committee on Holdings Information, Holdings Update Forum, 7/11/09



Introduction

• Where or from whom does ProQuest obtain coverage data for each title in our databases?

• In what format does ProQuest receive raw holdings data?

• What are the data elements in these source data? (For instance, are enumeration and chronology both present in the title lists we receive from the source, or does ProQuest supply the data? If ProQuest supplies the data, how much research is involved, and how difficult is it to provide holdings data for content coverage?)

• How does ProQuest pass these coverage (holdings) data to Electronic Resources Access and Management Services like Serials Solutions, and in what format during data transfer?

• Is ProQuest implementing ONIX SOH standards in their coverage data for ejournals? What are the issues in implementation, if any?

About ProQuest

Based in Ann Arbor

Since 1938

ProQuest is an information partner, creating indispensable research solutions that connect people and information.

Through innovative, user-centered technology, ProQuest offers a depth and breadth of global content that includes historical newspapers, dissertations, and uniquely relevant resources for researchers of any age and sophistication--including content not likely to be digitized by others.

Inspired by its customers and end users, ProQuest is working toward a future that blends information accessibility with community to further enhance learning and encourage lifelong enrichment.

PRODUCTS:

Microforms

Dissertations

Databases

ProQuest and Serials Holdings

Examples of ProQuest databases with serials content:

ProQuest Science Journals

ABI/Inform

ProQuest Central

Accessing Serials Holdings in ProQuest

Within a database:

Accessing Serials Holdings in ProQuest

At the ProQuest website:

Accessing Serials Holdings in ProQuest

“Local Administrator” module:

- Holdings information customized for each subscriber

- Provides holdings data available within each and all of the ProQuest databases to which a customer subscribes

How ProQuest gathers holdings data

Content acquisition:

ProQuest acquires content from publishers through various techniques

1. Hard copies: chronology & enumeration captured by hand, embedded in file

2. Born-digital content: chronology & enumeration embedded in the file by content provider

How ProQuest gathers holdings data

Manufacturing: Preparing the content for access in the ProQuest platform

- ProQuest has filters to extract holdings data from the electronic content

- ProQuest populates only the holdings data supplied by providers

- Holdings data, along with other information, are populated in the database fields

Example of raw data files ProQuest receives from content providers

XML:

<XMD-entity REPOSITORY_FORMAT="1.0" PRXML_VER="2.2" ENTITY_TYPE="Article" PAGE_NO="1" ID="Ar00100" BOX="362 296 1177 764" LANGUAGE="English" CONTINUATION_TO="Ar00803" SNP="Ar00100S.png" SNP_WIDTH="350" SNP_HEIGHT="118">

<Meta NAME="Mother, son killed in one-car accident" DESCRIPTION="" SUBTYPE="" BASE_HREF="ISJ/2009/05/27" SOURCE_TYPE="PDF" PUBLICATION="ISJ" SECTION="Front Page" ISSUE_DATE="27/05/2009" WORDCNT="180" RELEASE_NO="ISJ20090527_0_0_0_Resized" PAGE_ID="ISJ20090527A01.PDF" PAGE_TYPE="Single" PAGE_WIDTH="399" PAGE_HEIGHT="824" DEFAULT_IMG_EXT="png" PDF_DESTINATION_MAPPED="OLV0_Entity_0001_0001"

Example of raw data files ProQuest receives from content providers

SGML:

<!DOCTYPE ARTICLE SYSTEM "MCB.DTD"><ARTICLE AID="2680070401.sgm" PDFID="2680070401.pdf" DOI="10.1108/14720700710820443" COPYRIGHT="M" ARTTY1="Research paper"><FM><PUBFM><JTI>Vexillology Today</JTI> <VOL>7</VOL><ISS>4</ISS><PBD>2007</PBD><PPF>355</PPF><PPL>369</PPL><ISSN>1472&hyphen;0701</ISSN><NAME>Gilbert Smith, Frank Jones, and Peter McCartney</NAME> </PUBFM><ATL>Integrating corporate responsibility principles and stakeholder approaches into mainstream strategy&colon; a stakeholder&hyphen;oriented and integrative strategic management framework</ATL><AUG>
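As a rough illustration of what a holdings filter might do with the SGML sample, the sketch below pulls the volume, issue, and year out of a fragment of it. This is not ProQuest's filter: the helper name `sgml_field` is hypothetical, and a regular expression is used only because the fragment is truncated and relies on DTD-defined entities such as `&hyphen;`, so a strict XML parser would reject it.

```python
import re
from typing import Optional

# Fragment copied from the SGML sample (publication front matter only).
SGML = (
    "<PUBFM><JTI>Vexillology Today</JTI> <VOL>7</VOL><ISS>4</ISS>"
    "<PBD>2007</PBD><PPF>355</PPF><PPL>369</PPL>"
    "<ISSN>1472&hyphen;0701</ISSN>"
)


def sgml_field(source: str, tag: str) -> Optional[str]:
    """Return the text of the first <tag>...</tag> pair, or None if absent.

    Hypothetical helper: only the two entities seen in the sample
    (&hyphen; and &colon;) are decoded.
    """
    match = re.search(rf"<{tag}>(.*?)</{tag}>", source, re.S)
    if match is None:
        return None
    return match.group(1).replace("&hyphen;", "-").replace("&colon;", ":")


# Enumeration (volume, issue) and chronology (year) as supplied by the publisher.
holdings = {tag: sgml_field(SGML, tag) for tag in ("JTI", "VOL", "ISS", "PBD", "ISSN")}
print(holdings)
```

Nothing is inferred here: a tag that the publisher did not supply simply comes back as `None`, which mirrors the point that only supplied holdings data are populated.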

Example of raw data files ProQuest receives from content providers

Tagged ASCII:

020209990629VUTVS01 00000136B32 no09 0001 090627 N S 0906270129 00007681{IT}N{SOURCETAG}0906270129{ACCESSION}000000{PUBLICATION}THE NEWS AND OBSERVER {DATE}090627{TDATE}Saturday, June 27, 2009{EDITION}FINAL{SECTION}NEWS{PAGE}A1
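The tagged-ASCII sample can be read in a similar way. The sketch below assumes that each field is a `{TAG}` followed by its value, running up to the next opening brace, and that `{DATE}` holds a YYMMDD date; both assumptions are inferred from the sample, not from a published specification. The function name `parse_tagged_ascii` is hypothetical.

```python
import re
from datetime import datetime
from typing import Dict

# Tagged portion of the sample record (the fixed-width prefix is omitted).
RAW = (
    "{IT}N{SOURCETAG}0906270129{ACCESSION}000000"
    "{PUBLICATION}THE NEWS AND OBSERVER {DATE}090627"
    "{TDATE}Saturday, June 27, 2009{EDITION}FINAL{SECTION}NEWS{PAGE}A1"
)

# Assumption: a field is "{TAG}" plus every character up to the next "{".
FIELD = re.compile(r"\{([A-Z]+)\}([^{]*)")


def parse_tagged_ascii(raw: str) -> Dict[str, str]:
    """Map each tag name to its value, with surrounding whitespace trimmed."""
    return {tag: value.strip() for tag, value in FIELD.findall(raw)}


fields = parse_tagged_ascii(RAW)

# Assumption: {DATE} is YYMMDD, which agrees with the spelled-out {TDATE}.
issue_date = datetime.strptime(fields["DATE"], "%y%m%d").date()
print(fields["PUBLICATION"], issue_date.isoformat())
```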

Converting Holdings Data

ProQuest does not add any holdings data

- Only converts what is supplied by the publisher

Formatting according to known publication frequency:

- Tag contains “20071201”

- Known to be a monthly serial

- ProQuest will display “December 2007”
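The conversion above can be sketched as a lookup from publication frequency to a display pattern. Only the monthly case comes from the slide; the `daily` and `annual` patterns, the table name `DISPLAY_PATTERNS`, and the function name `display_date` are assumptions added for illustration.

```python
from datetime import datetime

# Display pattern per known publication frequency.
# Only "monthly" is taken from the slide; the other entries are hypothetical.
DISPLAY_PATTERNS = {
    "daily": "%B %d, %Y",
    "monthly": "%B %Y",
    "annual": "%Y",
}


def display_date(tag_value: str, frequency: str) -> str:
    """Format a publisher-supplied YYYYMMDD tag for display.

    The date itself is never altered; only its presentation depends
    on the serial's known frequency.
    """
    parsed = datetime.strptime(tag_value, "%Y%m%d")
    return parsed.strftime(DISPLAY_PATTERNS[frequency])


print(display_date("20071201", "monthly"))
```

Month names come from `strftime`, so the output is English only under the default C locale.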

Fields in ProQuest databases

Unique identifiers for:

• the serial title

• the volume/issue/date of the serial

• the article

Other fields as shown on next slide

Fields in ProQuest databases

ProQuest document ID

Unique identifier for the article (e.g., 26615328)

Links to the unique identifiers assigned to the volume, issue, or date of the serial, and to the unique identifier assigned to the serial as a whole
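One way to picture the linked identifiers is as three record types, each pointing to its parent. This is a hedged sketch, not ProQuest's schema: the class and field names are hypothetical, and every value except the document ID 26615328 is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Serial:
    serial_id: int  # identifier for the serial as a whole (placeholder value below)
    title: str


@dataclass(frozen=True)
class Issue:
    issue_id: int   # identifier for the volume/issue/date (placeholder value below)
    serial: Serial  # link up to the serial
    date: str       # chronology as supplied by the publisher, YYYYMMDD


@dataclass(frozen=True)
class Article:
    document_id: int  # ProQuest document ID
    issue: Issue      # link up to the issue


serial = Serial(serial_id=1, title="Example Serial")
issue = Issue(issue_id=10, serial=serial, date="20071201")
article = Article(document_id=26615328, issue=issue)

# Walking the links from an article recovers the holdings context.
print(article.issue.serial.title, article.issue.date)
```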

ProQuest Title Lists

• Content is available in databases as soon as it is manufactured

• Title lists appear shortly afterward

• Content Control staff runs a report to verify that the information we present in the Title List is accurate

• It is double-checked by a second set of staffers

How ProQuest provides data to ERAMS

Electronic Resources Access and Management Services (ERAMS): organizations that work to convey holdings data from publishers or aggregators to library systems (“linking partners”)

ProQuest provides our holdings data in the SOH 1.0 format for ERAMS partners

Serials Online Holdings (SOH) standard

• Created by NISO in 2005 as part of the ONIX standards

• Intended for “communicating information about the holdings or coverage of online serial resources from a party that holds or supplies the resources to a party that needs this information in its systems”

• Coded in XML

Serials Online Holdings (SOH) standard

DATA ELEMENTS INCLUDED BY PROQUEST:

• Publisher

• Journal Title

• Journal Issue

• Format (HTML, PDF, etc.)

• Embargo period

Serials Online Holdings (SOH) standard

<HoldingsRecord>
  <RecordReference>PQPMID:83</RecordReference>
  <NotificationType>00</NotificationType>
  <SerialVersion>
    <SerialVersionIdentifier>
      <SerialVersionIDType>07</SerialVersionIDType>
      <IDValue>03841294</IDValue>
    </SerialVersionIdentifier>
    <SerialVersionIdentifier>
      <SerialVersionIDType>01</SerialVersionIDType>
      <IDTypeName>PMID</IDTypeName>
      <IDValue>83</IDValue>
    </SerialVersionIdentifier>
    <Title>
      <TitleType>02</TitleType>
      <TitleText>The Gazette</TitleText>
    </Title>
    <Publisher>
      <PublishingRole>01</PublishingRole>
      <PublisherName>CanWest Digital Media</PublisherName>
    </Publisher>
    <OnlinePackage>
      <OnlineServiceName>ProQuest</OnlineServiceName>
      <Website>
        <WebsiteRole>03</WebsiteRole>
        <WebsiteLink>http://proquest.umi.com/pqdweb</WebsiteLink>
      </Website>
      <HoldingsDetail>
        <JournalIssue>
          <JournalIssueRole>04</JournalIssueRole>
          <JournalIssueDate>
            <DateFormat>00</DateFormat>
            <Date>19850102</Date>
          </JournalIssueDate>
        </JournalIssue>
        <JournalIssue>
          <JournalIssueRole>06</JournalIssueRole>
          <JournalIssueDate>
            <DateFormat>00</DateFormat>
            <Date>20090626</Date>
          </JournalIssueDate>
        </JournalIssue>
        <EpubFormat>10</EpubFormat>
      </HoldingsDetail>
      <Embargo>
        <EmbargoType>02</EmbargoType>
        <EmbargoValue>2</EmbargoValue>
      </Embargo>
    </OnlinePackage>
  </SerialVersion>
</HoldingsRecord>
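A receiving system could read the coverage out of such a record with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on an abridged copy of the record. It assumes that `JournalIssueRole` 04 marks the first issue held and 06 the last, which is what the two dates in the sample suggest; the ONIX code lists should be checked before relying on that. The function name `coverage` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the SOH record shown above (identifiers and website omitted).
SOH_RECORD = """
<HoldingsRecord>
  <RecordReference>PQPMID:83</RecordReference>
  <SerialVersion>
    <Title><TitleType>02</TitleType><TitleText>The Gazette</TitleText></Title>
    <OnlinePackage>
      <HoldingsDetail>
        <JournalIssue>
          <JournalIssueRole>04</JournalIssueRole>
          <JournalIssueDate><DateFormat>00</DateFormat><Date>19850102</Date></JournalIssueDate>
        </JournalIssue>
        <JournalIssue>
          <JournalIssueRole>06</JournalIssueRole>
          <JournalIssueDate><DateFormat>00</DateFormat><Date>20090626</Date></JournalIssueDate>
        </JournalIssue>
      </HoldingsDetail>
      <Embargo><EmbargoType>02</EmbargoType><EmbargoValue>2</EmbargoValue></Embargo>
    </OnlinePackage>
  </SerialVersion>
</HoldingsRecord>
"""


def coverage(record_xml: str) -> dict:
    """Summarize title, coverage dates, and embargo from one HoldingsRecord."""
    root = ET.fromstring(record_xml)
    # Index issue dates by role code so the two ends of coverage can be looked up.
    dates_by_role = {
        issue.findtext("JournalIssueRole"): issue.findtext("JournalIssueDate/Date")
        for issue in root.iter("JournalIssue")
    }
    return {
        "reference": root.findtext("RecordReference"),
        "title": root.findtext("SerialVersion/Title/TitleText"),
        "first_issue": dates_by_role.get("04"),  # assumed meaning of role 04
        "last_issue": dates_by_role.get("06"),   # assumed meaning of role 06
        "embargo_value": root.findtext(".//Embargo/EmbargoValue"),
    }


summary = coverage(SOH_RECORD)
print(summary)
```

The embargo value is returned as the raw string because its unit depends on `EmbargoType`, which this sketch does not interpret.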

Serials Online Holdings (SOH) standard

<ProQuestDatabases>

<DatabaseID>3</DatabaseID>
<DatabaseName>ABI/INFORM Global</DatabaseName>
<DatabaseDesc>Most scholarly and comprehensive way to explore and understand business research topics. Search nearly 3000 worldwide business periodicals for in-depth coverage of business and economic conditions, management techniques, theory, and practice of business, advertising, marketing, economics, human resources, finance, taxation, computers, and more. Expanded international coverage. Fast access to information on 60,000+ companies with business and executive profiles. Now includes The Wall Street Journal.</DatabaseDesc>

<Titles>
<Title><RecordReference>PQPMID:6</RecordReference></Title>
<Title><RecordReference>PQPMID:8</RecordReference></Title>
<Title><RecordReference>PQPMID:7510</RecordReference></Title>
<Title><RecordReference>PQPMID:7539</RecordReference></Title>
<Title><RecordReference>PQPMID:7896</RecordReference></Title>
<Title><RecordReference>PQPMID:7921</RecordReference></Title>
<Title><RecordReference>PQPMID:7940</RecordReference></Title>
<Title><RecordReference>PQPMID:7976</RecordReference></Title>

How ProQuest customers can receive holdings information updates

Database Content

Mailing List

• Gregg Zajic

• Jessica Lehr

• Reed Lenz

Thank You