Digitizing Emerald’s Backlist
Anna Torrance (Backfiles Project Manager)
“The farther backward you can look, the farther forward you are likely to see”
Winston Churchill
Overview
About Emerald & the Backfiles
Rationale
Project processes and set-up
Salient points from last year’s session (workshop with Rebecca Goldthwaite & David Durand)
Big Questions
Lessons learned
About Emerald Group Publishing Limited
Academic Journal publisher, established in 1967
200 employees
Largest collection of management and LIS journals available today: 185+ business and library and information science titles
Recently acquired almost 2,000 Series, Serials and Books
Global business with customers in over 80 countries
Work with over 90% of the top business schools
“Research you can use”
Online Usage and Dissemination
20 million downloads in 2006
1.5 million articles downloaded each month on average
61,000 articles online, covering over 13 years of content
200,000 online abstracts
What is Emerald Backfiles?
•Digital archive back to Volume 1 Issue 1 with some articles dating back as far as 1899
•Over 120 journal titles providing over 60,000 articles on key management disciplines
•Each backfile transformed into a fully searchable PDF
•Contains early articles from seminal publications (British Food Journal, European Journal of Marketing, Journal of Documentation)
What is Emerald Backfiles?
[Chart: Backfiles coverage and Emerald Management Xtra coverage by subject and year. Year axis: 1899 to 2004 in five-year steps. Subjects: Accounting; Education & Healthcare; HRM; Innovation & Intl Business; Info and KM; Library Management; Management & Economics; Marketing; Quality & Operations; Strategy & General Management; Engineering.]
The Decision to Digitize
Users increasingly searching & accessing content online – Emerald received its 50 millionth article download in 2007
Over 12% of cited Emerald articles were not available online prior to the launch of Emerald Backfiles
Four of the five most cited papers in Emerald journals were published before 1994 (when we first started to capture content electronically)
Digitizing the archive will bring this knowledge to new readers and re-open historical scholarship
Contributing to preservation of content
Pre-project preparation
Research – is this what our customers want?
Content analysis – commissioned initial inventory from British Library. Confirmation that 97.5% of content could be sourced.
Extended negotiations – lengthened by 2-3 months to ensure obligations were spelled out for both parties and that the journal list was accurate.
Project Timeline
2007 - Apr project received board approval
May BL commissioned to carry out inventory analysis
Jun attended ToC workshop
Jul analysis complete & contract negotiations started
Aug sales teams training
Sept contract signed & work started
Oct announcements made to contributor community and wider press
2008 - Jan official launch
1st quarter delivery date (anticipated in March)
Benefits of Partnering with the British Library
Experts in their field – a trusted partner
Extensive networks from which to source content that they did not hold
Digitisation on site meant no unnecessary damage to the collection or transportation costs
BL’s relationship with Innodata Isogen ensured a total service solution – critical due to Emerald’s tight delivery deadline
BL holds both common and hard-to-find content – 98% of Emerald articles were provided by the BL
How was this project different from other British Library initiatives?
Co-ordination of the elements of the production team outputs
Emphasis on establishing content standards that could apply over the life of journals that in some cases have more than 100 years of production
Digitization Process
•Graphic creation as per specifications
•Graphic QC
•Renaming of graphics as per specifications
•Upload the package
•Make corrections
•Receipt of PDF / TIFF
•Double key (for 99.95% accuracy)
•Compare
•SGML tagging
•Visual QC
•SGML validation
•Quality checks (Yes / No)
•Final inspection
•Errors reported
•Rename the file as per the naming convention. Package SGML & graphics.
•Quality audit
BL sources content
Content scanned & TIFF file sent to Innodata
Pass to BL for final QA check
Package sent from BL to Emerald
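The "double key" and "compare" steps above can be illustrated with a minimal sketch. It assumes two independently keyed transcriptions of the same page are available as plain strings. The function name and the use of Python's difflib are illustrative choices, not a description of Innodata's actual tooling.

```python
from difflib import SequenceMatcher


def keying_disagreements(first_pass: str, second_pass: str):
    """Return the spans where two independent keyings of a page differ.

    Each result is (kind, text_in_first_pass, text_in_second_pass), where
    kind is 'replace', 'delete' or 'insert'. Hypothetical helper.
    """
    # autojunk=False keeps the comparison exact on long pages.
    matcher = SequenceMatcher(None, first_pass, second_pass, autojunk=False)
    return [
        (kind, first_pass[i1:i2], second_pass[j1:j2])
        for kind, i1, i2, j1, j2 in matcher.get_opcodes()
        if kind != "equal"
    ]


if __name__ == "__main__":
    # A classic keying confusion: "rn" read as "m".
    for difference in keying_disagreements("Journal", "Joumal"):
        print(difference)
```

Any span reported here would go to a human for resolution. Spans on which both keyers agree are accepted, which is what allows double keying to reach high accuracy targets such as the 99.95% quoted above.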
Project Process
Journal Control List
CHECK
CHECK
Hard copy journal
Scanning
Publishing/QA
Digitization
Content load onto website
Web pages
Website
CHECK
CHECK
Project Team
Initial pre-project workshop included members from all departments to discover interdependencies – issues identified but not all resolved
Emerald Project Board – cross-functional
British Library Project team
Maintained three-way communication throughout the project (Emerald, BL and Innodata). Regular face-to-face meetings were of great benefit when discussing content selection. BL also held weekly teleconferences with the Innodata production team
2 project managers – technical and non-technical
Project Manager x2 – a balancing act
• Coordination
• Balancing tensions between business requirements and IT milestones
• 2 parallel workstreams
• Shared ownership (possible confusion)
Lessons Learned from last year’s TOC presentation
Involved contributor community from the outset
Regular communication
Editorial network helped to source rare journal copies
Continuing involvement of editorial teams by linking Backfiles to other initiatives (leading journal campaigns)
“Editorial is key – these are the people with detailed content knowledge”
Lessons Learned from last year’s TOC presentation
Article audits
Time taken to ensure journal control list was accurate
BUT
Could have taken a much bigger cross-section of content going further back
“Even if you are only digitizing PDFs, think of them as XML just so that you have an in-depth knowledge of the content”
Lessons Learned from last year’s TOC presentation
Every SGML file checked
Spot checking by all 30+ members of the editorial department in addition to regular QA processes
Regular meetings
Time commitment emphasised from the outset
Training to engage sales teams who were tasked with selling a product they could not demonstrate
“Don’t short-change QA”
“Engage team members and make clear time commitment up front”
Lessons Learned from last year’s TOC presentation
Not every article from every issue will be present at go-live
Some articles may never be sourced
BUT
Fully searchable PDFs essential
SGML files must be accurate
“Ask yourself: what is the minimum required to declare victory?”
Big Questions
What is an article?
Older journals more like magazines
Problems differentiating article and non-article content (NAC)
At first, all pages scanned (adverts, NAC)
Production process stopped for 2-3 weeks to reassess content standards, which were largely based on the formats of current publications
Regular meetings with BL
Decided to scan whole issue of earlier journals
British Food Journal – 1899
Big Questions
What are the copyright implications?
Emerald retains copyright of the Backfiles product but not every article within the Backfiles
Network of almost 50,000 authors notified
Notice put on each article with option to contact Emerald if authors feel copyright breached
Policy established to remove any articles at author’s request
To date, no removal requests received
Big Questions
What should we do with articles without abstracts?
Where no abstract provided, first paragraph of the article used
How should we apply DOIs?
Could not apply in usual manner (including journal ISSN)
Decision made to apply sequentially without ISSNs
How should acquisitions be handled going forwards?
Cut-off point in July 2007
New acquisitions will not be included in the Backfiles product until a new version is released
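The abstract and DOI decisions on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not Emerald's implementation: the function names, the "bf" suffix pattern, the zero-padding width and the placeholder prefix 10.5555 are all hypothetical. The slide says only that DOIs were applied sequentially without ISSNs.

```python
from itertools import count


def abstract_or_first_paragraph(abstract, body):
    """Use the supplied abstract; otherwise fall back to the first
    non-empty paragraph of the article body (paragraphs are assumed
    to be separated by blank lines)."""
    if abstract and abstract.strip():
        return abstract.strip()
    for paragraph in body.split("\n\n"):
        if paragraph.strip():
            return paragraph.strip()
    return ""


def sequential_dois(prefix, start=1, width=6):
    """Yield DOIs whose suffix is a running number with no ISSN in it.
    The 'bf' marker and the padding width are hypothetical."""
    for number in count(start):
        yield f"{prefix}/bf{number:0{width}d}"


if __name__ == "__main__":
    dois = sequential_dois("10.5555")  # placeholder prefix, not Emerald's
    print(next(dois))
    print(abstract_or_first_paragraph(None, "Opening paragraph.\n\nSecond paragraph."))
```

A plain running number avoids depending on journal-level identifiers, which older titles may lack or may have changed over their lifetime.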
Big Questions
How should we sell the Backfiles?
By journal? By subject? As a single collection?
Regional pricing – consistency with Emerald’s core product
Regional launches
When should we sell the Backfiles?
Danger of missing 2008 cycle
Managing Expectations - externally
Clear web branding strategy:
Icons to identify content as Backfiles (different to archive content)
PDF-only stated on every abstract page
Inform the author community early on
Clear communication to customers about launch date and availability
Managing Expectations – internally
Address tensions between IT/Production and Sales & Marketing
Sales – want to provide customers with a delivery date ASAP
Production – want as much time as possible to source all articles and ensure quality
IT – do not want to commit to a date too early due to number of unresolved requirements
Marketing – want product features and specification early in order to create material
New Business Model
Providing flexible purchasing options for librarians who prefer to purchase access in perpetuity
1899-1993: Backfiles (access in perpetuity)
1994-2001: content rented – access as long as institution is a subscriber
2002-present: access in perpetuity
2002: Emerald sub starts
Offer librarians option to purchase access in perpetuity to this content
Lessons Learned
Choice of partner plays an important role – British Library made the whole process simple, efficient and cost-effective
Every journal is different – an in-depth knowledge of ALL content is essential
Obtain the widest/earliest possible cross-section of content
Don’t underestimate the importance of the content control list!
Lessons Learned
Spend time identifying all issues up front but do not try to answer them all at once
File naming conventions need to take into account all eventualities over the full content range – it’s difficult to change conventions mid project
If outsourcing digitization, take as much time as you need to get the first batch perfect
Be aware of tensions between project delivery date and business requirements – managing expectations is key
Conclusion
Official launch at American Library Association at end of January 2008 and product on schedule for delivery in 1st Quarter of 2008 as planned.
Pre-launch order list healthy – exceeded expectations.
New acquisitions in Series, Serials & Books could mean another Backfiles project in the not too distant future!
Thank You
Questions?