Tagging MPLP: A Comparison of Novice & Expert Domain User Generated Tags in a Minimally
Processed Digital Photographic Archive
Edward Benoit, III
School of Information Studies, UW-Milwaukee
Introduction/Background
• Howard Zinn
• The postmodern archives
• Rising backlog problem
• Minimal processing/MPLP
• Minimally processed digital archives
Study Focus
• Supplemental metadata from social tags
• User prior domain knowledge as quality control
• Research questions:
– What are the similarities/differences between tags generated by experts and novices?
– In what ways do tags generated by expert/novice users correspond with full metadata?
– In what ways do tags generated by expert/novice users correspond with existing users’ query terms?
Methodology
• Mixed methods, quasi-experimental two-group design
• 60 participants (novices & experts) generate tags for 15 photographs & 15 documents
• Pre- and post-questionnaires
• Analysis:
– Open coding
– Descriptive statistics
Sample Collection: Groppi Papers
http://collections.lib.uwm.edu/cdm/landingpage/collection/march
Participants
• Scoring:
– Expert x̄ = 7.57
– Novice x̄ = 2.77
• Ages: 18-63, x̄ = 31.73
• Gender (M/F/O): 23.3%/75%/1.7%
Race              Frequency  % of Participants
White             44         73.3%
Black             9          15.0%
Hispanic/Latino   10         16.7%
American Indian   4          6.7%
Asian/Indian      2          3.3%
Pacific Islander  0          0.0%
Other             1          1.7%
• 48.3% from WI or IL
• 58.3% non-students
Participants’ Prior Use/Knowledge
Coding Scheme
• Replication of metadata
• Format focused
• General identification
• Specific identification
• Description
• Broader context
• Emotional
Wisconsin Historical Society, WHS-26541
• Image removed for copyright. Accessible at: http://www.wisconsinhistory.org/whi/fullRecord.asp?id=26541
Results: Number of Tags
          Total  Unique  Min  Max  x̄
Expert    1705   396     15   196  56.83
Novice    2142   291     15   577  71.4
Combined  3847   396     15   577  64.12
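The x̄ column can be checked from the totals with simple arithmetic. The sketch below assumes the 60 participants split evenly into 30 experts and 30 novices; the slides do not state the group sizes, but that split reproduces the reported means. The names `totals`, `group_size`, and `mean_tags` are illustrative only.

```python
# Totals taken from the "Number of Tags" slide.
totals = {"Expert": 1705, "Novice": 2142}

# Assumption: 60 participants split evenly, 30 per group (not stated in the slides).
group_size = {"Expert": 30, "Novice": 30}


def mean_tags(total, n):
    """Mean number of tags per participant, rounded to two decimals."""
    return round(total / n, 2)


means = {group: mean_tags(totals[group], group_size[group]) for group in totals}
combined = mean_tags(sum(totals.values()), sum(group_size.values()))

for group, value in means.items():
    print(group, value)
print("Combined", combined)
```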
Results: Types of Tags
        Replication  Format  Gen ID  Spec ID  Description  Broader  Emotion
Expert  17.54%       0.00%   20.12%  11.79%   31.91%       17.13%   1.52%
Novice  14.01%       3.08%   29.43%  12.42%   24.01%       15.74%   1.31%
Results: Matching Metadata
          % Matching  % Non-matching
Expert    34.17%      65.83%
Novice    25.18%      74.82%
Combined  36.69%      63.31%
Results: Matching Queries
          Match  Non-match  % Match  % Non-match  % of Q.T. matching Tags
Expert    248    97         71.88%   28.12%       0.58%
Novice    184    69         72.73%   27.27%       0.43%
Combined  312    147        67.97%   32.03%       0.73%
• Query log analysis for one month on existing collection resulted in 42,755 unique query terms
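The percentages in the matching-queries table follow from the raw counts. The sketch below reads Match and Non-match as counts whose sum is the denominator for the first two percentages, and the last column as Match divided by the 42,755 unique query terms. That reading is an inference from the numbers, not something the slides state, and the function name `match_rates` is hypothetical.

```python
# Unique query terms from the one-month query log (from the slide).
QUERY_TERMS = 42755

# (match, non_match) counts per group, from the table.
counts = {
    "Expert": (248, 97),
    "Novice": (184, 69),
    "Combined": (312, 147),
}


def match_rates(match, non_match, query_terms=QUERY_TERMS):
    """Return (% match, % non-match, % of query terms matching tags)."""
    total = match + non_match
    return (
        round(100 * match / total, 2),
        round(100 * non_match / total, 2),
        round(100 * match / query_terms, 2),
    )


rates = {group: match_rates(*pair) for group, pair in counts.items()}

for group, (pct_match, pct_non, pct_terms) in rates.items():
    print(f"{group}: {pct_match}% match, {pct_non}% non-match, {pct_terms}% of query terms")
```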
Results: Tagging Motivation
                                          Expert  Novice  Combined
How I would find the item                 4.27    4.60    4.43
How others would find the item            4.10    4.60    4.35
The content of the item                   4.50    4.67    4.58
The item’s format                         3.33    3.23    3.28
The connection between items              3.43    3.63    3.53
The accuracy of the provided information  3.50    3.70    3.60
The previous user’s tags                  3.63    3.90    3.77
My previous tags                          3.87    4.10    3.98
Results: Motivating Taggers
                                Expert  Novice  Combined
Account & login                 3.67    3.07    3.37
Newsletter/website recognition  3.30    3.23    3.27
Social media recognition        2.97    2.97    2.97
Non-monetary rewards            3.67    3.83    3.75
Anonymous submission            3.87    3.90    3.88
Monetary rewards                4.40    4.40    4.40
Conclusion/Future Directions
• Replication of presented metadata
• Benefits of domain expert tagging
• Benefits of including both domain expert and novice tags
• Further study needed on:
– Alternative factors
– How to motivate tag generation
Thanks for listening!
Please hold your questions for later