How ProQuest Handles Original Data Provided by the Publishers, and Presents It
in the Full-Text Aggregator’s Database
Steven A. Knowlton, MLIS
ALCTS CRS Committee on Holdings Information
Holdings Update Forum, 7/11/09
Introduction
• Where or from whom does ProQuest obtain coverage data for each title in our databases?
• In what format does ProQuest receive raw holdings data?
• What are the data elements in these source data? (For instance, are enumeration and chronology both present in the title lists we receive from the source, or does ProQuest supply the data? If ProQuest supplies the data, how much research is involved, and how difficult is it to provide holdings data for content coverage?)
• How does ProQuest pass these coverage (holdings) data to Electronic Resources Access and Management Services like Serials Solutions, and in what format during data transfer?
• Is ProQuest implementing ONIX SOH standards in their coverage data for ejournals? What are the issues in implementation, if any?
About ProQuest
Based in Ann Arbor
Since 1938
ProQuest is an information partner, creating indispensable research solutions that connect people and information.
Through innovative, user-centered technology, ProQuest offers a depth and breadth of global content that includes historical newspapers, dissertations, and uniquely relevant resources for researchers of any age and sophistication--including content not likely to be digitized by others.
Inspired by its customers and end users, ProQuest is working toward a future that blends information accessibility with community to further enhance learning and encourage lifelong enrichment.
PRODUCTS:
Microforms
Dissertations
Databases
ProQuest and Serials Holdings
Examples of ProQuest databases with serials content:
ProQuest Science Journals
ABI/Inform
ProQuest Central
Accessing Serials Holdings in ProQuest
“Local Administrator” module:
- Holdings information customized for each subscriber
- Provides holdings data available within each and all of the ProQuest databases to which a customer subscribes
How ProQuest gathers holdings data
Content acquisition:
ProQuest acquires content from publishers through various techniques
1. Hard copies: chronology & enumeration captured by hand, embedded in file
2. Born-digital content: chronology & enumeration embedded in the file by content provider
How ProQuest gathers holdings data
Manufacturing: Preparing the content for access in the ProQuest platform
- ProQuest has filters to extract holdings data from the electronic content
- ProQuest populates only the holdings data supplied by providers
- Holdings data along with other information populated in the database fields
Example of raw data files ProQuest receives from content providers
XML:
<XMD-entity REPOSITORY_FORMAT="1.0" PRXML_VER="2.2" ENTITY_TYPE="Article" PAGE_NO="1" ID="Ar00100" BOX="362 296 1177 764" LANGUAGE="English" CONTINUATION_TO="Ar00803" SNP="Ar00100S.png" SNP_WIDTH="350" SNP_HEIGHT="118">
<Meta NAME="Mother, son killed in one-car accident" DESCRIPTION="" SUBTYPE="" BASE_HREF="ISJ/2009/05/27" SOURCE_TYPE="PDF" PUBLICATION="ISJ" SECTION="Front Page" ISSUE_DATE="27/05/2009" WORDCNT="180" RELEASE_NO="ISJ20090527_0_0_0_Resized" PAGE_ID="ISJ20090527A01.PDF" PAGE_TYPE="Single" PAGE_WIDTH="399" PAGE_HEIGHT="824" DEFAULT_IMG_EXT="png" PDF_DESTINATION_MAPPED="OLV0_Entity_0001_0001"
Example of raw data files ProQuest receives from content providers
SGML:
<!DOCTYPE ARTICLE SYSTEM "MCB.DTD"><ARTICLE AID="2680070401.sgm" PDFID="2680070401.pdf" DOI="10.1108/14720700710820443" COPYRIGHT="M" ARTTY1="Research paper"><FM><PUBFM><JTI>Vexillology Today</JTI> <VOL>7</VOL><ISS>4</ISS><PBD>2007</PBD><PPF>355</PPF><PPL>369</PPL><ISSN>1472‐0701</ISSN><NAME>Gilbert Smith, Frank Jones, and Peter McCartney</NAME> </PUBFM><ATL>Integrating corporate responsibility principles and stakeholder approaches into mainstream strategy: a stakeholder‐oriented and integrative strategic management framework</ATL><AUG>
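A filter that pulls enumeration and chronology out of a record like the SGML sample might look roughly like the following. This is a minimal sketch, not ProQuest's actual filter: the function name `extract_holdings`, the field labels, and the regular-expression approach are assumptions made for illustration. Only the tag names (JTI, VOL, ISS, PBD, ISSN) come from the sample record.

```python
import re

# Tag names come from the sample record; the field labels are illustrative.
FIELDS = {"JTI": "title", "VOL": "volume", "ISS": "issue", "PBD": "date", "ISSN": "issn"}

def extract_holdings(sgml: str) -> dict:
    """Return whichever holdings fields are present in an SGML fragment."""
    found = {}
    for tag, label in FIELDS.items():
        # Non-greedy match between the opening and closing tag.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", sgml, flags=re.DOTALL)
        if match:
            found[label] = match.group(1).strip()
    return found

# Shortened copy of the sample record (hypothetical trimming for the example).
sample = (
    "<PUBFM><JTI>Vexillology Today</JTI> <VOL>7</VOL><ISS>4</ISS>"
    "<PBD>2007</PBD><ISSN>1472-0701</ISSN></PUBFM>"
)
holdings = extract_holdings(sample)
print(holdings)
```

A filter written this way only reports what the provider embedded in the file; a field that is absent from the record is simply missing from the result.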
Example of raw data files ProQuest receives from content providers
Tagged ASCII:
020209990629VUTVS01 00000136B32 no09 0001 090627 N S 0906270129 00007681{IT}N{SOURCETAG}0906270129{ACCESSION}000000{PUBLICATION}THE NEWS AND OBSERVER {DATE}090627{TDATE}Saturday, June 27, 2009{EDITION}FINAL{SECTION}NEWS{PAGE}A1
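The tagged ASCII record can be split into fields in a similar way. The sketch below is illustrative only: `parse_tagged_ascii` is a hypothetical helper, and reading {DATE} as YYMMDD is an assumption based on its agreement with the {TDATE} text in the sample.

```python
import re
from datetime import datetime

def parse_tagged_ascii(record: str) -> dict:
    """Split a '{TAG}value{TAG}value...' record into a dict of tag -> value."""
    pairs = re.findall(r"\{([A-Z]+)\}([^{]*)", record)
    return {tag: value.strip() for tag, value in pairs}

# Tagged portion of the sample record (the fixed-width prefix is omitted).
record = (
    "{IT}N{SOURCETAG}0906270129{ACCESSION}000000"
    "{PUBLICATION}THE NEWS AND OBSERVER {DATE}090627"
    "{TDATE}Saturday, June 27, 2009{EDITION}FINAL{SECTION}NEWS{PAGE}A1"
)

fields = parse_tagged_ascii(record)
# Assumption: {DATE} is YYMMDD, which matches the {TDATE} text in the sample.
issue_date = datetime.strptime(fields["DATE"], "%y%m%d").date()
print(fields["PUBLICATION"], issue_date.isoformat())
```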
Converting Holdings Data
ProQuest does not add any holdings data
- Only converts what is supplied by the publisher
Formatting according to known publication frequency:
- Tag contains “20071201”
- Known to be a monthly serial
- ProQuest will display “December 2007”
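The conversion in this example can be sketched as a small function. Only the monthly rule ("20071201" displayed as "December 2007") comes from the slide; the function name `display_date`, the frequency labels, and the formats for annual and other frequencies are assumptions added for illustration.

```python
from datetime import datetime

MONTHS = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)

def display_date(tag: str, frequency: str) -> str:
    """Format a YYYYMMDD tag for display according to publication frequency."""
    d = datetime.strptime(tag, "%Y%m%d")
    month = MONTHS[d.month - 1]
    if frequency == "annual":       # assumed rule: year only
        return str(d.year)
    if frequency == "monthly":      # rule from the slide: month and year
        return f"{month} {d.year}"
    # Assumed rule for daily or weekly titles: full date.
    return f"{month} {d.day}, {d.year}"

print(display_date("20071201", "monthly"))
```

The day component is dropped for a monthly title because it carries no information at that frequency; nothing is added that the publisher did not supply.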
Fields in ProQuest databases
Unique identifiers for:
• the serial title
• the volume/issue/date of the serial
• the article
Other fields as shown on next slide
ProQuest document ID
Unique identifiers for the article (e.g., 26615328)
Links to the unique identifiers assigned to the volume, issue, or date of the serial, and to the unique identifier assigned to the serial as a whole
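One way to picture the linked identifiers is as three record types, each pointing at its parent. The class and field names below are hypothetical, as are the serial and issue ID values; only the article ID 26615328 appears in the slide, and the real database schema is not described in this presentation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Serial:
    serial_id: int      # unique identifier for the serial title
    title: str

@dataclass(frozen=True)
class Issue:
    issue_id: int       # unique identifier for the volume/issue/date
    serial: Serial      # link to the serial as a whole
    volume: str
    number: str
    date: str

@dataclass(frozen=True)
class Article:
    document_id: int    # ProQuest document ID
    issue: Issue        # link to the volume/issue/date
    title: str

# Hypothetical values, except the document ID taken from the slide.
serial = Serial(serial_id=1001, title="Vexillology Today")
issue = Issue(issue_id=2002, serial=serial, volume="7", number="4", date="2007")
article = Article(document_id=26615328, issue=issue, title="Sample article")

print(article.issue.serial.title)
```

With links of this kind, the holdings for a serial can be derived by collecting the issues that point to it.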
ProQuest Title Lists
• Content is available in databases as soon as it is manufactured
• Title lists appear shortly afterward
• Content Control staff run a report to verify that the information we present in the Title List is accurate
• The report is double-checked by a second set of staffers
How ProQuest provides data to ERAMS
Electronic Resources Access and Management Services (ERAMS): organizations that work to convey holdings data from publishers or aggregators to library systems (“linking partners”)
ProQuest provides our holdings data in the SOH 1.0 format for ERAMS partners
Serials Online Holdings (SOH) standard
• Created by NISO in 2005 as part of the ONIX standards
• Intended for “communicating information about the holdings or coverage of online serial resources from a party that holds or supplies the resources to a party that needs this information in its systems”
• Coded in XML
Serials Online Holdings (SOH) standard
DATA ELEMENTS INCLUDED BY PROQUEST:
• Publisher
• Journal Title
• Journal Issue
• Format (HTML, PDF, etc.)
• Embargo period
Serials Online Holdings (SOH) standard
<HoldingsRecord>
  <RecordReference>PQPMID:83</RecordReference>
  <NotificationType>00</NotificationType>
  <SerialVersion>
    <SerialVersionIdentifier>
      <SerialVersionIDType>07</SerialVersionIDType>
      <IDValue>03841294</IDValue>
    </SerialVersionIdentifier>
    <SerialVersionIdentifier>
      <SerialVersionIDType>01</SerialVersionIDType>
      <IDTypeName>PMID</IDTypeName>
      <IDValue>83</IDValue>
    </SerialVersionIdentifier>
    <Title>
      <TitleType>02</TitleType>
      <TitleText>The Gazette</TitleText>
    </Title>
    <Publisher>
      <PublishingRole>01</PublishingRole>
      <PublisherName>CanWest Digital Media</PublisherName>
    </Publisher>
    <OnlinePackage>
      <OnlineServiceName>ProQuest</OnlineServiceName>
      <Website>
        <WebsiteRole>03</WebsiteRole>
        <WebsiteLink>http://proquest.umi.com/pqdweb</WebsiteLink>
      </Website>
      <HoldingsDetail>
        <JournalIssue>
          <JournalIssueRole>04</JournalIssueRole>
          <JournalIssueDate>
            <DateFormat>00</DateFormat>
            <Date>19850102</Date>
          </JournalIssueDate>
        </JournalIssue>
        <JournalIssue>
          <JournalIssueRole>06</JournalIssueRole>
          <JournalIssueDate>
            <DateFormat>00</DateFormat>
            <Date>20090626</Date>
          </JournalIssueDate>
        </JournalIssue>
        <EpubFormat>10</EpubFormat>
      </HoldingsDetail>
      <Embargo>
        <EmbargoType>02</EmbargoType>
        <EmbargoValue>2</EmbargoValue>
      </Embargo>
    </OnlinePackage>
  </SerialVersion>
</HoldingsRecord>
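A receiving system could read the coverage range out of a record like this one with a standard XML parser. The sketch below is an illustration under stated assumptions, not a reference implementation of SOH: the function name `coverage` is hypothetical, the record is a trimmed copy of the sample, and treating JournalIssueRole 04 as the earliest issue and 06 as the latest is an assumption inferred from the two dates in the sample (1985 and 2009). The meaning of the embargo codes is not interpreted.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample record (identifiers and publisher omitted).
SOH_RECORD = """<HoldingsRecord>
  <RecordReference>PQPMID:83</RecordReference>
  <SerialVersion>
    <Title><TitleType>02</TitleType><TitleText>The Gazette</TitleText></Title>
    <OnlinePackage>
      <OnlineServiceName>ProQuest</OnlineServiceName>
      <HoldingsDetail>
        <JournalIssue>
          <JournalIssueRole>04</JournalIssueRole>
          <JournalIssueDate><DateFormat>00</DateFormat><Date>19850102</Date></JournalIssueDate>
        </JournalIssue>
        <JournalIssue>
          <JournalIssueRole>06</JournalIssueRole>
          <JournalIssueDate><DateFormat>00</DateFormat><Date>20090626</Date></JournalIssueDate>
        </JournalIssue>
        <EpubFormat>10</EpubFormat>
      </HoldingsDetail>
      <Embargo><EmbargoType>02</EmbargoType><EmbargoValue>2</EmbargoValue></Embargo>
    </OnlinePackage>
  </SerialVersion>
</HoldingsRecord>"""

def coverage(record_xml: str) -> dict:
    """Extract title, coverage dates, and raw embargo value from one record."""
    root = ET.fromstring(record_xml)
    package = root.find("./SerialVersion/OnlinePackage")
    dates = {}
    for issue in package.iterfind("./HoldingsDetail/JournalIssue"):
        role = issue.findtext("JournalIssueRole")
        dates[role] = issue.findtext("JournalIssueDate/Date")
    return {
        "title": root.findtext("./SerialVersion/Title/TitleText"),
        "first": dates.get("04"),   # assumed: role 04 = earliest issue held
        "last": dates.get("06"),    # assumed: role 06 = latest issue held
        "embargo": package.findtext("./Embargo/EmbargoValue"),
    }

result = coverage(SOH_RECORD)
print(result)
```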
Serials Online Holdings (SOH) standard
<ProQuestDatabases>
<DatabaseID>3</DatabaseID>
<DatabaseName>ABI/INFORM Global</DatabaseName>
<DatabaseDesc>Most scholarly and comprehensive way to explore and understand business research topics. Search nearly 3000 worldwide business periodicals for in-depth coverage of business and economic conditions, management techniques, theory, and practice of business, advertising, marketing, economics, human resources, finance, taxation, computers, and more. Expanded international coverage. Fast access to information on 60,000+ companies with business and executive profiles. Now includes The Wall Street Journal.</DatabaseDesc>
<Titles><Title><RecordReference>PQPMID:6</RecordReference></Title><Title><RecordReference>PQPMID:8</RecordReference></Title>
<Title><RecordReference>PQPMID:7510</RecordReference></Title><Title><RecordReference>PQPMID:7539</RecordReference></Title><Title><RecordReference>PQPMID:7896</RecordReference></Title><Title><RecordReference>PQPMID:7921</RecordReference></Title><Title><RecordReference>PQPMID:7940</RecordReference></Title><Title><RecordReference>PQPMID:7976</RecordReference></Title>