Taming the Wilde: Collaborating with Expertise for Faster, Better, Smarter Collection Analysis

Jackie Bronicki, Collections and Online Resources Coordinator
Cherie Turner, Chemical Sciences Librarian
Shawn Vaillancourt, Education Librarian
Frederick Young, Systems Analyst

Upload: charleston-conference

Post on 24-Jun-2015


DESCRIPTION

2014 Charleston Conference Thursday, Nov 6, 2:15 PM

TRANSCRIPT

Page 1: Taming the Wilde

Taming the Wilde

Collaborating with Expertise for Faster, Better, Smarter Collection Analysis

Jackie Bronicki, Collections and Online Resources Coordinator
Cherie Turner, Chemical Sciences Librarian
Shawn Vaillancourt, Education Librarian
Frederick Young, Systems Analyst

Page 2: Taming the Wilde

Outline of Presentation

Page 3: Taming the Wilde

Research Questions

Research Question 1: What are the best measurements for evaluating the current scope of the collection?

Research Question 2: What subject areas are not adequately covered in the current collection?

Page 4: Taming the Wilde

Methodology

• Influenced by the Cornell University Library Print Collection usage report

• No language analysis

• No patron analysis

• Limited formats

Page 5: Taming the Wilde

Results

• 889,825 total monograph items in final dataset

• 425,865 titles that have not circulated (48%)

• 787,590 titles circulated 5 or fewer times (88%)

• 861,910 titles that have not circulated in the last year (97%)
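The percentages on the Results slide follow from the counts. Here is a minimal sketch, assuming each percentage is a share of the 889,825-item dataset. The helper name `share_of_dataset` is illustrative and does not come from the presentation. The computed values agree with the slide figures only approximately, since the slide shows whole percentages.

```python
# Counts as printed on the Results slide.
TOTAL_ITEMS = 889_825

counts = {
    "never circulated": 425_865,
    "circulated 5 or fewer times": 787_590,
    "not circulated in the last year": 861_910,
}


def share_of_dataset(count: int, total: int = TOTAL_ITEMS) -> float:
    """Return `count` as a percentage of `total`."""
    return 100.0 * count / total


if __name__ == "__main__":
    for label, count in counts.items():
        print(f"{label}: {share_of_dataset(count):.1f}%")
```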

Page 6: Taming the Wilde

[Figure: Distribution by LC Class. Bar chart; horizontal axis: LC classes A, B, C, D, E, F, G, H, J, K, L, M, N, P, Q, R, S, T, U, V, Z; vertical axis: 0 to 250,000 in steps of 50,000.]
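A distribution by LC class can be tallied from the leading letter of each call number. The sketch below assumes call numbers are plain strings whose first letter is the LC class. The sample call numbers are made up for illustration, and the function names are hypothetical rather than part of the presenters' tooling.

```python
from collections import Counter
from typing import Iterable, Optional


def lc_class(call_number: str) -> Optional[str]:
    """Return the one-letter LC class, or None if the string does not start with a letter."""
    stripped = call_number.strip()
    if stripped and stripped[0].isalpha():
        return stripped[0].upper()
    return None


def class_distribution(call_numbers: Iterable[str]) -> Counter:
    """Count items per LC class, skipping call numbers with no usable class letter."""
    classes = (lc_class(cn) for cn in call_numbers)
    return Counter(c for c in classes if c is not None)


if __name__ == "__main__":
    # Made-up sample data, including two unusable entries.
    sample = ["BF121 .S55", "QD31.3 .C43", "bx4700 .T4", "", "123 ABC"]
    for letter, count in sorted(class_distribution(sample).items()):
        print(letter, count)
```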

Page 7: Taming the Wilde

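Comparing Circulation to ILL Usage

$$\mathrm{PEU} = \frac{\text{Percent Usage}}{\text{Percent of Holdings}}$$

$$\mathrm{PEU}_{B} = \frac{1.43\%}{1.32\%} = 1.08$$

If PEU > 1: Overused
If PEU < 1: Underused

$$\mathrm{RBH} = \frac{\text{Percent of ILL Borrowing}}{\text{Percent of Holdings}}$$

$$\mathrm{RBH}_{B} = \frac{0.79\%}{1.32\%} = 0.60$$

Mean RBH = 1.54 ± 5.18
If RBH > Mean RBH: Overused
If RBH < Mean RBH: Underused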
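The two ratios and their thresholds can be sketched in a few lines. This is an illustration under stated assumptions, not the presenters' code: the function names are hypothetical, the mean RBH of 1.54 is taken from the slide, and the "At threshold" label for a value exactly equal to its threshold is an assumption, since the slides do not define that case.

```python
MEAN_RBH = 1.54  # mean RBH reported on the slide


def peu(percent_usage: float, percent_holdings: float) -> float:
    """PEU: share of circulation divided by share of holdings."""
    return percent_usage / percent_holdings


def rbh(percent_ill_borrowing: float, percent_holdings: float) -> float:
    """RBH: share of ILL borrowing divided by share of holdings."""
    return percent_ill_borrowing / percent_holdings


def classify(value: float, threshold: float) -> str:
    """Label a ratio relative to its threshold (1 for PEU, mean RBH for RBH)."""
    if value > threshold:
        return "Overused"
    if value < threshold:
        return "Underused"
    return "At threshold"  # assumption: the slides do not cover equality


if __name__ == "__main__":
    # LC subclass B: 1.32% of holdings, 1.43% of usage, 0.79% of ILL borrowing.
    peu_b = peu(1.43, 1.32)
    rbh_b = rbh(0.79, 1.32)
    print(f"PEU_B = {peu_b:.2f} ({classify(peu_b, 1.0)})")
    print(f"RBH_B = {rbh_b:.2f} ({classify(rbh_b, MEAN_RBH)})")
```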

Page 8: Taming the Wilde

Comparing Circulation to ILL Usage

LC Subclass  Percent of Holdings  Percent Usage  PEU   Holdings Usage  Percent of ILL Borrowing  RBH   ILL Usage
B            1.32%                1.43%          1.08  Overused        0.79%                     0.60  Underused
BC           0.09%                0.08%          0.82  Underused       0.05%                     0.51  Underused
BD           0.24%                0.20%          0.84  Underused       0.24%                     1.01  Underused
BF           1.22%                1.78%          1.46  Overused        2.00%                     1.64  Overused
BH           0.07%                0.09%          1.29  Overused        0.05%                     0.68  Underused
BJ           0.22%                0.27%          1.21  Overused        0.18%                     0.79  Underused
BL           0.42%                0.65%          1.56  Overused        0.69%                     1.65  Overused
BM           0.10%                0.07%          0.67  Underused       0.09%                     0.95  Underused
BP           0.13%                0.26%          1.95  Overused        0.34%                     2.57  Overused
BQ           0.04%                0.10%          2.63  Overused        0.32%                     8.05  Overused
BR           0.36%                0.33%          0.91  Underused       0.70%                     1.96  Overused
BS           0.22%                0.16%          0.73  Underused       0.36%                     1.62  Overused
BT           0.16%                0.13%          0.85  Underused       0.40%                     2.53  Overused
BV           0.18%                0.15%          0.86  Underused       0.44%                     2.49  Overused
BX           0.52%                0.29%          0.56  Underused       1.69%                     3.23  Overused

If PEU > 1: Overused
If PEU < 1: Underused

Mean RBH = 1.54 ± 5.18
If RBH > Mean RBH: Overused
If RBH < Mean RBH: Underused

Page 9: Taming the Wilde

Comparing Circulation to ILL Usage

LC Subclass  Holdings Usage  ILL Usage  Action
B            Overused        Underused  No Changes
BC           Underused       Underused  Ease Off
BD           Underused       Underused  Ease Off
BF           Overused        Overused   Growth Opportunity
BH           Overused        Underused  No Changes
BJ           Overused        Underused  No Changes
BL           Overused        Overused   Growth Opportunity
BM           Underused       Underused  Ease Off
BP           Overused        Overused   Growth Opportunity
BQ           Overused        Overused   Growth Opportunity
BR           Underused       Overused   Change Purchasing
BS           Underused       Overused   Change Purchasing
BT           Underused       Overused   Change Purchasing
BV           Underused       Overused   Change Purchasing
BX           Underused       Overused   Change Purchasing
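The Action column is a lookup on the pair of labels (holdings usage, ILL usage). A minimal sketch of that lookup, with the four combinations read off the slide; the names `ACTIONS` and `recommend_action` are illustrative only.

```python
# (holdings usage, ILL usage) -> action, as shown on the slide.
ACTIONS = {
    ("Overused", "Underused"): "No Changes",
    ("Underused", "Underused"): "Ease Off",
    ("Overused", "Overused"): "Growth Opportunity",
    ("Underused", "Overused"): "Change Purchasing",
}


def recommend_action(holdings_usage: str, ill_usage: str) -> str:
    """Return the action for a subclass given its two usage labels."""
    try:
        return ACTIONS[(holdings_usage, ill_usage)]
    except KeyError:
        raise ValueError(
            f"Unknown label pair: {holdings_usage!r}, {ill_usage!r}"
        ) from None


if __name__ == "__main__":
    print("BF:", recommend_action("Overused", "Overused"))
    print("BX:", recommend_action("Underused", "Overused"))
```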

Page 10: Taming the Wilde

The More Important Question…

Page 11: Taming the Wilde

Initial Challenges – Research Team

• Sierra Infrastructure
  – What data existed where?
  – Title vs. Item
  – Call Number

• Defining Input/Output Variables
  – What we could output (circulation)

• MARC

• Scope of Project

• Building a proper sample

Page 12: Taming the Wilde

Challenges to Possibilities

• Understanding the question

• Does the System Provide an Answer?

• What can we do?

Page 13: Taming the Wilde

Data Mining Challenges – Research Team

• High Expectations

• Inconsistency of Data
  – Bad input
  – Batch overlay
  – Doesn’t exist

Page 14: Taming the Wilde

Data Mining Challenges – Systems Team

• Scaled Expectations

• Learning curve

• Piecing the Data Together

Page 15: Taming the Wilde

Research Questions

Research Question 1: What are the best measurements for evaluating the current scope of the collection?

Research Question 2: What subject areas are not adequately covered in the current collection?

Page 16: Taming the Wilde

Initial Output Criteria

Bibliographic Record
• Call Number
• Subject Headings
• Publication/Copyright Date
• ISBN
• Record Number
• Title

Item Record
• Copy Number
• Total Number of Checkouts
• Status

Order Record
• Order Date

Page 17: Taming the Wilde

Final Output Criteria

Bibliographic Record
• Call Number
• Publication/Copyright Date
• Record Number
• Title
• Publisher
• Catalog Date
• ISBN

Item Record
• Call Number
• Total Checkouts
• Last Year Checkouts
• Year to Date Checkouts
• Location

Page 18: Taming the Wilde

ILL Output Criteria

• Fields for our analysis
  – Call Number
  – Request Date
  – Filled Date
  – Format

• Fields for later analysis
  – Lending Library
  – Title
  – Author
  – Publication Date
  – Publisher
  – Language
  – Library Type
  – ISBN
  – OCLC Number

Page 19: Taming the Wilde

Got Data?

Except…

• What was MARC telling us?

• How were fields used?

Page 20: Taming the Wilde

Data Cleaning

• ISBN?

• Location: 143,823 records deleted

• Call numbers: 14,894 records deleted
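A cleaning pass of this kind can be sketched as a filter that counts what it drops. The slide does not say which rules triggered the deletions, so everything specific here is an assumption made for illustration: the field names `location` and `call_number`, the set of in-scope location codes, and the rule that a usable call number starts with one to three letters followed by a digit.

```python
import re

# Assumption: a usable LC call number starts with 1-3 letters, then a digit.
LC_CALL_NUMBER = re.compile(r"^[A-Z]{1,3}\s*\d")

# Hypothetical location codes considered in scope for the analysis.
IN_SCOPE_LOCATIONS = {"main", "stacks"}


def clean(records):
    """Split records into kept ones and per-reason drop counts."""
    kept = []
    dropped = {"location": 0, "call_number": 0}
    for record in records:
        call_number = (record.get("call_number") or "").strip().upper()
        if record.get("location") not in IN_SCOPE_LOCATIONS:
            dropped["location"] += 1
        elif not LC_CALL_NUMBER.match(call_number):
            dropped["call_number"] += 1
        else:
            kept.append(record)
    return kept, dropped


if __name__ == "__main__":
    # Made-up sample records.
    sample = [
        {"location": "main", "call_number": "BF121 .S55"},
        {"location": "offsite", "call_number": "QD31.3 .C43"},
        {"location": "stacks", "call_number": "THESIS 1987"},
        {"location": "stacks", "call_number": None},
    ]
    kept, dropped = clean(sample)
    print(len(kept), dropped)
```

Counting drops per reason, rather than filtering silently, makes it possible to report totals like those on the slide.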

Page 21: Taming the Wilde

Lessons Learned

• Understanding the infrastructure
  – Order records
  – Bib records
    • MARC
  – Item records

• Understanding local practice

• Experts provide guidance and practical solutions!

Page 22: Taming the Wilde

Print and Electronic Serials

• Challenges
  – Different systems store records
  – Different kinds of usage information available
  – Holdings based analysis
  – Subscription or Subscription + Aggregated
  – Vendor supplied records

Page 23: Taming the Wilde

Aguilar, W. (1986). The application of relative use and interlibrary demand in collection development. Collection Management, 8(1), 15-24.

Knievel, J. E., Wicht, H., & Connaway, L. S. (2006). Use of circulation statistics and interlibrary loan data in collection management. College & Research Libraries, 67(1), 35-49.

Ochola, J. N. (2003). Use of circulation statistics and interlibrary loan data in collection management. Collection Management, 27(1), 1-13. DOI:10.1300/J105v27n01_01

Mills, T. R. (1982). The University of Illinois Film Center Collection Use Study. http://files.eric.ed.gov/fulltext/ED227821.pdf

Report of the Collection Development Executive Committee Task Force on Print Collection Usage. (2012). Cornell University Library. http://staffweb.edu/system/files/CollectionUsageTF_ReportFinal11-22-10.pdf