Audiovisual collections, the spoken word and user needs of scholars in the Humanities Observations based on related work in The Netherlands 2005-2012

Upload: roelandordelmannl

Post on 16-Jan-2017


TRANSCRIPT

Page 1: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Observations based on related work in The Netherlands 2005-2012

Page 2: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

E-research

• New and/or rapid ways to gain knowledge
• Digital resources and information technology
• Big data & data mining (social sciences)
• Digital Humanities / E-Humanities
• Digitization, Infra, Tools, Standards
• CLARIN.eu / DARIAH.eu

Page 3: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Emerging focus on audiovisual

• Multi-modal, multi-semiotic:
– multiple layers of meaning / interpretation
– E.g., “quote + intonation + images + discourse”
• New dimensions for scholarly research
• Large investments in digitization:
– Images for the Future: 200k hours of film, video and audio
– Various digitization projects for scientific collections

Page 4: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

METADATA RULES?

Page 5: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Metadata & Annotations

• Annotations:
– General (document level)
– Specific (segment level)
• Metadata: typically sparse / document level
• Requirements dependent on research field
• Annotation generation:
– Manual (Individual, Teams, Crowd)
– Automatic: (un/lightly) supervised

Page 6: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Monitoring radio transcriptions

INGEST SUPERVISION // ARCHIVIST SUPPORT:

Quickly assess quality of ASR

Page 7: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Spoken word search 2005-2012

• Wide range of projects in various domains
– Radio
  • Daily ingest: selection of programs
  • Woord.nl: public access to radio content
– Historical video collections with sparse data
– “Oral History”
• Development of an ASR service for cultural heritage institutions

Page 8: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

First experiment on ASR for the humanities: access to personal recordings of Dutch novelist W.F. Hermans

Page 9: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Access to an interview collection with World War II camp survivors

Page 10: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

FEMINIST MOVEMENT

Access to interview collections

Page 11: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

INTERVIEWS ON THE BOMBARDMENT OF ROTTERDAM

Alignment of transcripts for indexing
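The slide does not describe how the alignment was done. As a rough illustration only, the sketch below assumes two hypothetical inputs: a manual transcript (plain words) and ASR output with a start time per word. It matches the two word sequences with Python's standard `difflib.SequenceMatcher` and copies the ASR timestamps onto the matching transcript words, so the manual transcript can be indexed by time. A production system would more likely use acoustic forced alignment; this text-only matching is a swapped-in simplification.

```python
from difflib import SequenceMatcher


def transfer_timestamps(asr_words, transcript_words):
    """Copy ASR start times onto matching words of a manual transcript.

    asr_words: list of (word, start_seconds) pairs (hypothetical ASR output).
    transcript_words: list of words from the manual transcript.
    Returns a list of (word, start_seconds or None), one per transcript word.
    """
    ref_tokens = [w.lower() for w in transcript_words]
    asr_tokens = [w.lower() for w, _ in asr_words]
    matcher = SequenceMatcher(a=ref_tokens, b=asr_tokens, autojunk=False)

    # Words without an ASR counterpart keep None as their timestamp.
    timed = [(w, None) for w in transcript_words]
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            word = transcript_words[block.a + k]
            start = asr_words[block.b + k][1]
            timed[block.a + k] = (word, start)
    return timed


# Invented example data, for illustration only.
asr = [("the", 0.0), ("city", 0.4), ("was", 0.9), ("bombed", 1.2)]
transcript = ["The", "city", "centre", "was", "bombed"]
timed = transfer_timestamps(asr, transcript)
```

Words that the recognizer missed ("centre" in the example) keep `None`; an indexer could interpolate a time for them from their timed neighbours.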

Page 12: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Access to Radio interviews

Experiments with various types of access and result presentation: speaker changes, speaking rate, search strategies, word clouds

Page 13: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Access to Historical Speeches: Alignment & Linking

Page 14: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

ACCESS TO DISTRIBUTED ORAL HISTORY COLLECTIONS

• Infrastructure for searching collections at various institutes in The Netherlands

• Harvesting of Metadata (OAI-PMH)

• ASR as a service
• Evaluated with Oral Historians
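To make the metadata harvesting step more concrete, here is a minimal sketch of an OAI-PMH `ListRecords` exchange. The verb, the `metadataPrefix` and `resumptionToken` arguments, and the XML namespaces come from the OAI-PMH 2.0 specification; the base URL, the record identifier, and the sample response are invented for illustration and do not describe the actual infrastructure from the slides.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL.

    A follow-up request carries only the resumption token, because the
    protocol treats resumptionToken as an exclusive argument.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return ([(identifier, title), ...], resumption_token or None)."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        title = rec.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


# Invented sample response, so the sketch runs without network access.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:interview-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Interview 1</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("https://example.org/oai")  # hypothetical endpoint
records, token = parse_list_records(SAMPLE)
next_url = list_records_url("https://example.org/oai", resumption_token=token)
```

A harvester would repeat the request with each returned token until the response no longer contains one, collecting the records of every institute into a shared index.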

Page 15: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Observations on speech search

• Large variation in ASR performance
• Performance (and decisions on use) should be assessed in context of application: audiovisual search

• Usefulness in audiovisual search should be assessed in context of use scenarios

• Use scenarios require specific presentation/visualization requests
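The slides do not say which measure of ASR performance was used. A common choice is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the ASR output into a reference transcript, divided by the number of reference words. The sketch below computes it with a standard word-level edit distance; the example sentences are invented.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate of an ASR hypothesis against a reference transcript.

    Both arguments are plain strings; comparison is case-insensitive and
    splits on whitespace (no further text normalization is applied).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Invented example: one dropped word and one misrecognized word.
reference = "the queen addressed the dutch people from london"
hypothesis = "the queen addressed dutch people from londen"
wer = word_error_rate(reference, hypothesis)  # 2 errors / 8 words = 0.25
```

A single WER figure hides what matters for search: an error on a name or place affects retrieval far more than an error on a function word, which is one reason to assess performance in the context of the application.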

Page 16: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Usefulness of results

• Perception of usefulness
– Usefulness in context of search/data exploration
– Educate / Expectation management
– Guide searching
– Show why (errors, confidence, trust-levels, cut-offs)
– Focus on research needs
• Improve on ASR quality
– Educate: how to record an interview (Oral History)
– Use available textual resources (alignment, vocab optimization)
• Improve on search application
– Visualization
– Result presentation
  • documents versus segments
  • combination of information sources
  • cross/within-collection linking

Page 17: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Methodology (1)

• E-research is an intervention in current practices!
• Promise:
– increased efficiency, relevance, novelty
• Interest of scholars:
– tools that facilitate or simplify existing practice (RIN report, 2011)
• Co-development ICT-researchers & scholars to adjust expectations. Examples:
– Finding more in less time may not be a goal in itself for humanities researchers
– Deep engagement with primary texts versus results on the segment level

Page 18: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Methodology (2)

• 4 stages:
1. Preliminary archival search
  • Browsing as a general interest
  • Purpose driven (checking details, complementary resources)
  • Item-oriented (finding the first mention of something)
  • Collection-oriented (thematic, source, person, event)
2. Content analysis
  • Visualization, compression, aggregation
  • (optionally) go back to (1)
3. Presentation and dissemination
  • Enhanced publications (persistent identifiers on segment level)
4. Curation
  • Trusted digital repository
• (spoken) search scenarios: facilitate these stages

Page 19: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

ASR for research

• Triple-A: Accessible, Affordable, Accurate
• Individual researchers sending files to ASR?
• Embedded in suite of research tools?
• What about integration in search applications?
– Stagnation due to inadequate local infrastructures
• Variation across collections requires ‘tailor-made’ approaches: e.g., speaker adaptation, vocabulary adaptation, alignment, collection of related resources (information trail)

Page 20: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

ASR service

Model of use:
• Free test bundle (10h)
• Various small/medium/large bundles
• Reduced costs (only hardware and maintenance)
• Management by CH body
• Maintenance by industry partner

Upload: via http, ftp, api

Account information

Page 21: Audiovisual collections, the spoken word and user needs of scholars in the Humanities

Dutch Queen Wilhelmina addressing the Dutch people from London during WWII

Page 22: Audiovisual collections, the spoken word and user needs of scholars in the Humanities
Page 23: Audiovisual collections, the spoken word and user needs of scholars in the Humanities
Page 24: Audiovisual collections, the spoken word and user needs of scholars in the Humanities