
www.ohiolink.edu

The OhioLINK eBook Aggregator Study

Amy Pawlowski, Joanna Voss

Charleston Conference, November 5th, 2015

OhioLINK Vision & Mission

Vision: Provide Ohio students, faculty, & citizens with the best academic library content to achieve their goals and aspirations.

Mission: OhioLINK creates a competitive advantage for Ohio's higher education community by cooperatively and cost-effectively acquiring, providing access to, and preserving an expanding array of print and digital scholarly resources in order to advance teaching, learning, research, and the growth of Ohio's knowledge-based economy.

OhioLINK is… a member organization

• 121 member libraries

• Wide range of types of institutions: 5 ARL libraries, the Cleveland Clinic, State Library of Ohio, small theological unions, colleges of arts and music, law and medical libraries, etc.

• All public universities, two-year colleges, and technical schools in Ohio

• Members not only consume services – they fund, build, and sustain shared services

OhioLINK: Putting the “I” in “IT” for over 20 years

Shared Print Collections

• 121 libraries

• 50 million items

• 600,000 items delivered per year

Shared E-Resources

• Annual content purchase of $42,000,000

• Millions of articles in 10,000 journals

• More than 100 databases

• 110,000 ebooks

• And more

Technology Services

• E-journal, e-book, and database platforms

• Statewide publishing platform for Electronic Theses and Dissertations

• Management and support of major platforms: Central Catalog, Discovery Layer, Link Resolver

• E-book Cataloging (Metadata Services)

Quick History of eBooks at OhioLINK

• 2 longstanding packages, locally loaded in the EBC (OhioLINK Electronic Book Center)

– Springer

– Oxford

• 2011-2013 eBook ITN (eBook Pilot)

– Worked with YBP and ebrary on hybrid of a DDA & profile purchasing model

• Ashgate, Rowman & Littlefield, Cambridge

– Picked up a Wiley package (side effect of the ITN)

– Very staff intensive, both for central office and librarians assisting in the pilot

Shaping OhioLINK’s eBook Strategy

• Recognized that there is no “silver bullet”

• Our strategy would have to involve several approaches

• Devised a plan to start looking at data and talking with aggregators and publishers to make decisions

OhioLINK “eBook Bake-off”

• How we started to gather information from potential vendor partners

• Invited a group of 5 aggregators to present their platforms and content to CIRM (OhioLINK’s resource selection committee)

– Each vendor had an equal time slot over one day

– Also invited YBP to present last to talk about how they could augment or assist with the vendors who presented

Focused Data Analysis: Can We Answer Our Questions?

• After bake-off, we decided to focus on how best to pursue the purchasing of University Press eBooks for OhioLINK

• Had useful information for YBP (GOBI)

• We had many questions we wanted to answer, such as:

– Which aggregators provide the most UP titles?

– Which aggregators provide the highest percentage of frontlist titles?

– Is there any relation between what OhioLINK institutions are purchasing in print and what we can purchase in electronic?

Data Sources

• Limit analysis to university press content

• Look at 5 aggregators with university press content

– EBSCO

– JSTOR

– Project Muse

– ProQuest (ebrary)

– University Press Scholarship Online (Oxford)

• Focus on most-purchased content in print and already-owned electronic

Data Sources

Data Type | Source
Publisher availability per aggregator | Aggregator website / from reps
Title-level availability per aggregator | Aggregator website / from reps
Print output from selected publishers | GOBI search output
Publishers with content in EBC | EBC metadata export

To make it manageable:

• Limit title-level comparison to YOP 2014

• Limit publisher comparison to top 10 most-purchased university presses

• Request data in spreadsheet format

Methodology

• Phase 1: Publisher-level analysis

– Where is the publishers’ content available?

– Where might previously-purchased content be available?

• Phase 2: Title-level analysis

– Which titles are unique to each aggregator?

– How much of print is captured electronically by aggregators?

Publisher Analysis

• How do we handle text data that’s just a little bit different?

– Excel: Fuzzy Lookup Add-on

– http://www.microsoft.com/en-us/download/details.aspx?id=15011

• Compile master list of all publishers represented in aggregators

• Match up variations of same name (e.g. Oxford University Press, Oxford U.P., etc.)

• Result:

– Side-by-side table of publishers per aggregator
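The name-matching step above can be sketched in Python. This is only an illustration of the idea, not the Excel Fuzzy Lookup workflow used in the study: it swaps in the standard library's `difflib` for the similarity score, and the `ABBREVIATIONS` map, the `0.9` threshold, and the sample publisher lists are all assumptions made up for the example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation expansions; a real list would be built from the data.
ABBREVIATIONS = {"u": "university", "univ": "university",
                 "p": "press", "pr": "press", "up": "university press"}

def normalize(name):
    """Lowercase, drop punctuation, and expand common abbreviations."""
    text = re.sub(r"[^a-z0-9 ]", " ", name.lower().replace(".", ""))
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in text.split())

def group_variants(names, threshold=0.9):
    """Group raw names whose normalized forms are at least `threshold` similar."""
    groups = {}  # canonical (normalized) name -> list of raw names
    for raw in names:
        key = normalize(raw)
        match = next((k for k in groups
                      if SequenceMatcher(None, key, k).ratio() >= threshold), None)
        groups.setdefault(match if match is not None else key, []).append(raw)
    return groups

def side_by_side(lists_by_aggregator, threshold=0.9):
    """Build {canonical publisher: {aggregator: present?}} from raw name lists."""
    all_names = [n for names in lists_by_aggregator.values() for n in names]
    groups = group_variants(all_names, threshold)
    canonical = {raw: key for key, raws in groups.items() for raw in raws}
    table = {key: {agg: False for agg in lists_by_aggregator} for key in groups}
    for agg, names in lists_by_aggregator.items():
        for name in names:
            table[canonical[name]][agg] = True
    return table

# Made-up sample input, not data from the study.
sample = {
    "EBSCO": ["Oxford University Press", "MIT Press"],
    "JSTOR": ["Oxford U.P.", "Yale Univ. Press"],
}
table = side_by_side(sample)
```

Any threshold-based match can merge publishers with genuinely similar names, so grouped variants would still need a manual review pass.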


Publisher Comparison Table

Findings

• Very few publishers have content on all 5 aggregators

[Pie chart: Publishers on Multiple Platforms. Segments: One Platform, Two Platforms, Three Platforms, Four Platforms, Five Platforms. Data labels: 76%, 19%, 3%, 2%. Callout: 6 publishers on all 5 aggregators.]
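A breakdown like the one in this chart could be computed as follows, assuming publisher names have already been reconciled and each platform's publishers are held in a set. The function name and the toy data are hypothetical and do not reproduce the study's figures.

```python
from collections import Counter

def platform_count_distribution(publishers_by_platform):
    """Return {number of platforms: share of publishers on that many platforms}."""
    appearances = Counter()
    for publishers in publishers_by_platform.values():
        appearances.update(set(publishers))  # one count per platform per publisher
    histogram = Counter(appearances.values())
    total = len(appearances)
    return {n: histogram[n] / total for n in sorted(histogram)}

# Toy data for illustration only.
toy = {
    "Platform A": {"pub1", "pub2", "pub3", "pub4"},
    "Platform B": {"pub1"},
}
shares = platform_count_distribution(toy)
```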

Findings

• The top most-purchased university presses are not distributed evenly across aggregators

[Bar chart: Most-Bought Publishers per Platform. X-axis: EBSCO, Ebrary/ProQuest, JSTOR, Oxford (UPSO), Project Muse. Y-axis: # Publishers, 0 to 10.]

Findings

• Uniqueness measurement limited by specificity of publisher data

• Publishers representing 80% of EBC content represented on both EBSCO and ebrary

[Bar chart: Publishers by Platform. X-axis: EBSCO, Ebrary, Project MUSE, JSTOR, University Press Online. Y-axis: 0 to 1600. Series: Total Publishers, Unique Publishers. Data labels in extracted order: 1388, 560, 111, 78, 18; 1009, 176, 31, 19, 1.]
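The two series in this chart, total and unique publishers per platform, amount to a set difference against every other platform. A minimal sketch, assuming reconciled publisher names and using made-up data (the function name is hypothetical):

```python
def total_and_unique(publishers_by_platform):
    """Return {platform: (total publishers, publishers found on no other platform)}."""
    as_sets = {name: set(pubs) for name, pubs in publishers_by_platform.items()}
    result = {}
    for platform, publishers in as_sets.items():
        # Union of every other platform's publishers.
        others = set().union(*(p for name, p in as_sets.items() if name != platform))
        result[platform] = (len(publishers), len(publishers - others))
    return result

# Toy data for illustration only.
toy = {
    "Platform A": {"x", "y", "z"},
    "Platform B": {"x"},
    "Platform C": {"x", "y", "w"},
}
summary = total_and_unique(toy)
```

As the slide notes, a count like this is only as reliable as the publisher names: an unreconciled variant looks like a unique publisher.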

Lingering Questions

• We know we won’t get all content in one place

• Need to overcome publisher-level data issues

• How do we tell if these aggregators really cover the content we want?

Title-Level Analysis

• How do we match up LOTS of text data that’s just a little bit different?

• OpenRefine (formerly Google Refine)

– http://openrefine.org/

– Filter title lists by most-purchased university presses

– Limit to YOP 2014

– Reconcile not-quite-matching text & ISBN data

– Count occurrences per title


OpenRefine Methodology

• Combine title lists from 5 aggregators

– Reconcile not-quite-matching data

– Count the number of occurrences per title

• Unique ID based on ISBN and platform

• Not all provided lists had both print and eISBNs

– Link books using both print and e-ISBNs

– Number of occurrences counted across both

• Group publisher name variations
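Linking records that share either a print ISBN or an e-ISBN, then counting platforms per linked book, can be sketched outside OpenRefine with a small union-find. This is an assumed reimplementation of the idea, not the study's OpenRefine project; the record layout `(platform, print_isbn, e_isbn)` and the ISBNs below are invented for the example, and ISBN-10 to ISBN-13 conversion is not handled.

```python
def clean_isbn(value):
    """Strip hyphens and spaces; return None for blank or missing values."""
    if not value:
        return None
    cleaned = value.replace("-", "").replace(" ", "").upper()
    return cleaned or None

def count_platforms_per_book(records):
    """records: iterable of (platform, print_isbn, e_isbn); either ISBN may be None.
    Returns {representative ISBN: number of distinct platforms carrying the book}."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    seen = []
    for platform, print_isbn, e_isbn in records:
        isbns = [i for i in (clean_isbn(print_isbn), clean_isbn(e_isbn)) if i]
        if not isbns:
            continue  # nothing to match on
        find(isbns[0])
        for other in isbns[1:]:
            union(isbns[0], other)  # print and e-ISBN describe the same book
        seen.append((platform, isbns[0]))

    platforms = {}
    for platform, isbn in seen:
        platforms.setdefault(find(isbn), set()).add(platform)
    return {root: len(names) for root, names in platforms.items()}

# Invented ISBNs for illustration only.
records = [
    ("EBSCO", "978-0-00-000001-1", "9780000000028"),
    ("JSTOR", None, "978-0000000028"),
    ("ProQuest", "9780000000011", None),
    ("Project Muse", "9780000000035", None),
]
counts = count_platforms_per_book(records)
```

Here the first three records end up linked as one book even though JSTOR and ProQuest share no ISBN directly, because the EBSCO record carries both identifiers.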

OpenRefine Title Analysis

Findings

• Each aggregator has varying levels of content unique to its platform

[Stacked bar chart: Uniqueness of 2014 Top-10 UP Titles per Platform. X-axis: ProQuest, EBSCO, JSTOR, UPSO, Project Muse. Y-axis: 0% to 100%. Series: On 5 sites, On 4 sites, On 3 sites, On 2 sites, Unique on 1 site.]
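The stacked percentages in this chart can be derived from a title-to-platforms mapping: for each platform, tally how many of its titles appear on one site, two sites, and so on. A rough sketch with hypothetical names and toy data:

```python
from collections import Counter, defaultdict

def uniqueness_profile(platforms_by_title):
    """Return {platform: {number of sites: share of that platform's titles}}."""
    tallies = defaultdict(Counter)
    for platforms in platforms_by_title.values():
        for platform in platforms:
            tallies[platform][len(platforms)] += 1
    return {
        platform: {n: count / sum(counter.values())
                   for n, count in sorted(counter.items())}
        for platform, counter in tallies.items()
    }

# Toy data for illustration only.
toy = {
    "title 1": {"Platform A"},
    "title 2": {"Platform A", "Platform B"},
    "title 3": {"Platform A", "Platform B"},
    "title 4": {"Platform A"},
}
profile = uniqueness_profile(toy)
```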

Findings

• There is unique content from each publisher across multiple aggregators

[Bar chart: Titles Appearing on Only One Platform. X-axis (in alphabetical order): Cambridge University Press, Columbia University Press, Harvard University Press, MIT Press, New York University Press, Oxford University Press, Princeton University Press, University of California Press, University of Chicago Press, Yale University Press. Y-axis: 0 to 1400. Series: ProQuest, EBSCO, UPSO, JSTOR, Project Muse.]

Findings

• Focus on Oxford University Press

– Not all print content is available in electronic via these aggregators

– Academic content most represented in aggregators

[Stacked bar chart: 2014 Oxford Print Books in eBook Aggregators by Content Level. X-axis: EBSCO, ProQuest, UPSO, All Aggregators, All OUP in YBP. Y-axis: 0 to 4000. Series: Not Categorized, POP, JUV, PROF, GEN-AC, ADV-AC.]

Conclusions

• Problems:

– Capture existing purchased shared electronic content

– Transition (shareable) print buying patterns to (shareable) electronic

• Solution:

– Confirmed there is not a one-stop shop answer

• Investigating the use of multiple aggregators

– Need to keep in mind how this will affect cross-platform usability for our users

– Print is not going away

• We need to think not just about an eBook solution, but take the use of print into consideration. They are not mutually exclusive.


Questions

Amy Pawlowski

Deputy Director, OhioLINK

[email protected]

Joanna Voss

Collections Analyst, OhioLINK

[email protected]

