crowd-sourcing the creation of "articles" within the biodiversity heritage library
DESCRIPTION
An analysis of crowd-sourced "article" creation and user-generated metadata for a digital repository of biodiversity literature
TRANSCRIPT
Crowd-sourcing the creation of “articles” within the Biodiversity
Heritage Library
Bianca [email protected]
Trish [email protected]
The BHL is…
• A consortium of 13 natural history and botanical libraries and research institutions
• An open access digital library for legacy biodiversity literature.
• An open data repository of taxonomic names and bibliographic information
• An increasingly global effort
BHLLITA 2011
Problem: Books vs. Articles
• Librarians manage books
• Users need articles
Solution: “Article-ization”
Creating articles manually, through the help of our users: BHL PDF Generator
Creating articles through automated means: BioStor http://biostor.org/issn/0006-324X
Page, R. (2011). Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library. BMC Bioinformatics, 12(187). Retrieved from
http://www.biomedcentral.com/1471-2105/12/187
Create-your-own PDF
What is an “article” anyway?
the Good, the Bad, the Ugly
Questions for Data Analysis
• What is the quality, or accuracy, of user-provided metadata?
• What kinds of content are users creating?
• How can we improve the PDF generator interface?
Stats
• Jan 2010-Apr 2011
  – Approx. 60,000 PDFs created from PDF Generator
  – 40% of those (approx. 24,000) were ingested into CiteBank (PDFs without user-contributed metadata excluded)
• 5 reviewers analyzed 945 PDFs (approx. 3.9% of the 24,000+ articles going into CiteBank)
**Thanks to reviewers Gilbert Borrego, Grace Costantino, and Sue Graves from the Smithsonian Institution
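The sample-size arithmetic on this slide can be checked with a short sketch. The inputs are the approximate figures quoted above; the variable names are illustrative and not taken from any BHL system.

```python
# Approximate figures quoted on the slide.
pdfs_created = 60_000    # PDFs made with the PDF Generator, Jan 2010-Apr 2011
ingested_share = 0.40    # share ingested into CiteBank
reviewed = 945           # PDFs rated by the reviewers

# About 40% of the created PDFs went into CiteBank.
ingested = round(pdfs_created * ingested_share)

# Reviewed sample as a percentage of the ingested set, to one decimal place.
sample_pct = round(100 * reviewed / ingested, 1)

print(ingested)
print(sample_pct)
```

Because the slide gives the ingested count as "24,000+", the true sample share is probably slightly below the figure computed here.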
Methodological approach
• Quantitative – numerical rating system
• Rated titles, authors, beg/end pages
• An item’s “findability” within CiteBank search often determined how it was rated
Ratings System
Title
• 1=has all characters in title letter for letter
• 2=does not have all characters in title letter for letter but still findable in CiteBank search
• 3=does not have all characters in title letter for letter and is NOT findable via the CiteBank search
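The title rubric above can be written as a small decision function. This is only a sketch: the reviewers applied the rubric by hand, so `findable_in_search` is a hypothetical flag recording a reviewer's judgment, not a call to any CiteBank API, and ignoring surrounding whitespace is an assumption.

```python
def rate_title(user_title: str, actual_title: str, findable_in_search: bool) -> int:
    """Return 1, 2, or 3 following the title rubric (1 is best)."""
    # Assumption: "letter for letter" means an exact match,
    # ignoring only leading and trailing whitespace.
    if user_title.strip() == actual_title.strip():
        return 1
    # Imperfect title: the rating depends on whether a reviewer
    # could still find the item through the CiteBank search.
    return 2 if findable_in_search else 3


print(rate_title("Notes on the birds of Panama", "Notes on the birds of Panama", True))
print(rate_title("Birds of Panama", "Notes on the birds of Panama", True))
print(rate_title("untitled", "Notes on the birds of Panama", False))
```

The example titles are invented for illustration.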
Ratings System
Author
• 1=has all characters in author(s) last name letter for letter
• 2=has at least one author’s last name spelled correctly
• 3=has no authors or none of the authors’ last names are spelled correctly
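A minimal sketch of the author rubric, assuming each record can be reduced to a list of last names and that "spelled correctly" means an exact string match. The function name and inputs are hypothetical; the reviewers rated by hand.

```python
def rate_authors(user_last_names: list[str], actual_last_names: list[str]) -> int:
    """Return 1, 2, or 3 following the author rubric (1 is best)."""
    entered = set(user_last_names)
    actual = set(actual_last_names)
    # 1: every author's last name is present and matches letter for letter.
    if actual and entered == actual:
        return 1
    # 2: at least one last name is spelled correctly.
    if entered & actual:
        return 2
    # 3: no authors supplied, or none spelled correctly.
    return 3


print(rate_authors(["Smith", "Jones"], ["Smith", "Jones"]))
print(rate_authors(["Smith", "Jonse"], ["Smith", "Jones"]))
print(rate_authors([], ["Smith", "Jones"]))
```

The example names are invented for illustration.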
Ratings System
Article beginning & ending pages
• 1=has all text pages for an article, from start to end
• 2=subset of pages from a larger article
• 3=a set of pages where the intellectual content has been compromised
Analysis steps
Results
Title average 1.68
Author(s) average 1.33
Beg/End pg average 1.41
Title & Author average 1.50
Overall average (combines first 3 above) 1.47
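As a check on how the combined figures relate to the three per-field averages, the sketch below recomputes them from the rounded values on the slide. Lower is better, since 1 is the top rating. The original figures were presumably computed from the individual ratings, so a difference in the last digit would not be surprising.

```python
from statistics import mean

# Per-field averages as reported on the slide (1 = best, 3 = worst).
averages = {"title": 1.68, "author": 1.33, "pages": 1.41}

# Overall average: the mean of the three per-field averages.
overall = round(mean(averages.values()), 2)

# Title and author combined, left unrounded because it is
# computed from already-rounded inputs.
title_author = mean([averages["title"], averages["author"]])

print(overall)
print(title_author)
```

The recomputed overall value agrees with the reported 1.47, and the title and author combination lands within rounding distance of the reported 1.50.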
What did we learn?
• Ratings were better than we expected
• Many users took the time to create decent metadata
• “good enough” is not great but is still “findable”
But of course… there’s always room for improvement
Other factors
BHL-Australia’s new portal http://bhl.ala.org.au/
Changes we made for UI so far
• Asking users if they want to contribute their article to CiteBank
• Making article title a required field and validating that it is at least 2 characters long
• Review button for users to review page selections and metadata (inspired by BHL-AUS)
• Reduced text and added more intuitive graphics (inspired by BHL-AUS)
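The title check described in the list above could look something like the following. This is a hypothetical sketch, not the PDF Generator's actual code: the function name is invented, and treating whitespace-only input as empty is an assumption.

```python
def is_valid_title(title: str, min_length: int = 2) -> bool:
    """Return True if a required title field has enough characters."""
    # Assumption: surrounding whitespace does not count toward the minimum.
    return len(title.strip()) >= min_length


print(is_valid_title("On the anatomy of Nautilus"))  # long enough
print(is_valid_title("A"))                           # too short
print(is_valid_title("   "))                         # whitespace only
```

A length check like this only rules out empty or near-empty titles; it says nothing about accuracy, which is what the ratings analysis measured.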
Brief survey of proposed changes
• Overwhelmingly positive response to proposed changes
But of course… there’s always room for improvement
Success Factors
• Monitor the creation of the metadata to look at user behavior and patterns
• Engage with your users
• Incentivize your users
@BioDivLibrary
/pages/Biodiversity-Heritage-Library/63547246565
/photos/biodivlibrary/sets/
/group/biodiversity-heritage-library
Bianca [email protected]
Trish [email protected]
http://biodiversitylibrary.org