
Simon Tucker www.amiproject.org

NLP Presentation

Efficient user-centred access to multimedia meeting content

Simon Tucker and Steve Whittaker

University of Sheffield

{s.tucker, s.whittaker}@shef.ac.uk


AMI Project

•Meetings are a critical way in which knowledge is created and shared within organisations

•Most of this knowledge is never recorded

•AMI provides Multimodal Access to Multimedia Records of Meetings

•16 Partners

•Follow-on project: AMIDA (real time)


Sheffield AMI Work

•User Requirements

•Temporal Compression of Speech

•Reducing the amount of time required to listen to a meeting recording but still getting the important information.

•Dynamic Visual Summarization Techniques

•A number of methods for dynamically presenting summary information interactively.

•Temporal Compression of Video

•Audio motivated video compression.


Meeting browsers

•The primary means of accessing meeting records is via a browser.

•In previous work we classified browsers into four categories according to their focus.

•The focus is the primary means of presentation or navigation that the browser uses.

•This classification gave us a clear picture of the current browser space.


Browser Examples

[Figure: example browsers for each of the four categories: Audio, Video, Artefact, Discourse]


User Requirements

•Two different methods can be used to collect user requirements:

•Practice-centric

•Examination of current practices.

•Collection through observation.

•Technology-centric

•Exposure to new technology.

•Collection through user opinion.


Practice-centric AMI study

•Meetings already generate a large amount of information exchange.

•Personal Notes.

•Minutes.

•Post-meeting email discussion.

•Informal meeting discussions.

•Approach taken is to record (where possible) and then analyse these records.

•Use this analysis to determine how meeting records are used and what problems are associated with them.


Study details

•We examined the meeting recording practices of two firms.

•We studied a core team over a series of meetings.

•Thus we can study the lifecycle of meeting documents.

•Meetings in both firms were task oriented rather than being about the generation of ideas.

•We obtained permission to make recordings from each meeting participant.

•We also allowed participants to request that the recordings be switched off.

•Names were removed from transcripts.


Existing Tools and Problems

Type of Record | Functions | Problems

Public Record (Minutes) | *Group Todos (actions/decisions); *Summary/Gist; Group Archive (history) | Not timely; Lacks context & completeness; Requires effort to produce

Private Record (Personal Notes) | *Personal Todos (actions/decisions) (context for actions); Briefing for non-attendees; Personal Archive | Esoteric; Detracts from ability to contribute


Analysis of State of the Art Tools

•Important to assess the state of the art.

•Assessed the efficiency of the first-generation AMI meeting browser in answering typical questions about a meeting.

•Generated a number of questions about a single meeting.

•Subjects were asked to answer these questions using the meeting browser.

•Think-aloud was encouraged and we examined the accuracy of the answers.

•The questions were either about specific information (what was the total budget?) or were more general (what was Ed’s contribution to the meeting?).


Tools Analysis Results

•Inefficient for access

•Too much low-level detail

•Assumption of large display

•Users need abstraction / summarisation tools


Efficient Access to Meeting Data

•There is a clear need for efficient access to meeting data.

•Meetings contain a lot of irrelevant information (both in general and for specific participants).

•Minutes and notes capture important information but lack contextual information.

•State-of-the-art tools lack abstraction: they generally present the raw, unfiltered recordings.

•We focus on lightweight components allowing for efficient access to meeting data.


Temporal Compression of Speech

•Intended for environments which necessitate speech-only access.

•e.g. on a mobile phone or when travelling in a car.

•Aim is to reduce the length of the recording but to retain the important content.

•Two techniques for reducing the length:

•Speed Up: Play the full clip back at a faster rate.

•Excision: Remove sections of the recording.


Speed Up

•Simplest approach is to directly alter the playback rate.

•Has the side effect of altering the pitch of the speakers.

•Use an overlap-add algorithm to speed up whilst keeping pitch constant.

•Has the problem of not reflecting how speakers naturally increase their speech rate.

•Use a variable playback rate to better match how human speakers alter their speech rate.
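The constant-pitch speed-up described above can be sketched with plain overlap-add (OLA), the simplest member of that family. The slides do not say which variant was used, so this is only an illustration: the function name, frame length and hop size are assumptions, and plain OLA can introduce phase artefacts that refinements such as WSOLA are designed to reduce. Frames are read from the input at a larger hop than they are written to the output, which shortens the signal without resampling it.

```python
import math

def ola_speed_up(samples, rate, frame_len=400, synth_hop=200):
    """Time-compress `samples` by `rate` (> 1 plays faster) using plain OLA.

    frame_len and synth_hop are illustrative defaults, not values from the source.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    # Read frames further apart than they are written: this shortens the signal.
    analysis_hop = max(1, int(round(synth_hop * rate)))
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]  # Hann window
    n_frames = max(1, 1 + (len(samples) - frame_len) // analysis_hop)
    out_len = (n_frames - 1) * synth_hop + frame_len
    out = [0.0] * out_len
    norm = [0.0] * out_len
    for i in range(n_frames):
        src, dst = i * analysis_hop, i * synth_hop
        for n in range(frame_len):
            if src + n >= len(samples):
                break
            out[dst + n] += window[n] * samples[src + n]
            norm[dst + n] += window[n]
    # Divide by the summed window so overlapping frames do not change the level.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]

# One second of a 220 Hz tone at an assumed 8 kHz sampling rate.
tone = [math.sin(2 * math.pi * 220 * n / 8000) for n in range(8000)]
fast = ola_speed_up(tone, rate=2.0)
print(len(tone), len(fast))  # 8000 4200
```

A variable playback rate could be approximated by changing `rate` from frame to frame, but that is not shown here.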


Excision

•Simple approach is to remove non-informational parts of the recording e.g. silence.

•Limited by the amount of silence.

•Derive measures of word importance and only play back the important words; missing words are mentally replaced.

•Far from “natural” speech.

•Use larger units of speech (utterances): locate the important utterances and play back only those.
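Utterance excision can be sketched as a selection problem: given utterances with importance scores, keep the highest-scoring ones that fit a target duration and play them in their original order. How importance was measured is not stated in the slides, so the scores here are simply inputs; the function name, tuple layout and greedy strategy are assumptions made for illustration.

```python
def excise_utterances(utterances, compression):
    """Keep the most important utterances that fit into total_duration / compression.

    utterances: list of (start_sec, end_sec, importance) tuples (assumed layout).
    compression: e.g. 2 means the result should last at most half as long.
    """
    if compression < 1:
        raise ValueError("compression must be at least 1")
    total = sum(end - start for start, end, _ in utterances)
    budget = total / compression
    kept, used = [], 0.0
    # Greedy choice: try the most important utterances first.
    for utt in sorted(utterances, key=lambda u: u[2], reverse=True):
        duration = utt[1] - utt[0]
        if used + duration <= budget:
            kept.append(utt)
            used += duration
    # Play back in chronological order so the meeting still reads forwards.
    return sorted(kept, key=lambda u: u[0])

# Hypothetical utterances with made-up importance scores.
meeting = [(0, 4, 0.2), (4, 6, 0.9), (6, 10, 0.5), (10, 12, 0.7)]
print(excise_utterances(meeting, compression=2))
# [(4, 6, 0.9), (10, 12, 0.7)]
```

The same selection applied to words instead of utterances gives word excision, at the cost of less natural-sounding output.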


Examples

Experimental Overview

•Initial Exploratory Experiment

•Gain an understanding of the space.

•Informally assessed a large number of techniques.

•Located promising directions for research.

•Follow-up detailed study

•Examined a subset of the techniques explored.

•Used a measure of gisting ability to assess success.

•Examined short and long meeting clips.

•Also examined effect of a user interface.


Measuring Gisting Ability

•A key facet of our techniques is that they support the discovery of gist rather than facts.

•Therefore the metrics we have used previously do not adequately capture the proposed usage of these tools.

•Key components of the performance metric:

•Must be quick to assess and to score (experimenter and subject time)

•Objective measure


Measuring Gisting Ability (2)

•Our solution was to use a hybrid gold standard scheme.

•We measure the importance of utterances from the transcript and select a number of utterances from the full range of importance.

•We then ask judges to rank these utterances in order of importance.

•Subjects then listen to the meetings and perform the same ranking.

•The objective score is then the difference between the gold standard and subject rankings
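The final comparison step can be sketched as a distance between two rankings. The slides only say "difference", so the measure used here, the sum of absolute rank differences (Spearman's footrule), is an assumption, as are the utterance labels and ranks in the example.

```python
def rank_difference(gold, subject):
    """Sum of absolute rank differences between two rankings (lower is better).

    Both arguments map an utterance label to its rank. The footrule distance
    is an assumed choice; the source does not name the measure.
    """
    if set(gold) != set(subject):
        raise ValueError("both rankings must cover the same utterances")
    return sum(abs(gold[u] - subject[u]) for u in gold)

# Hypothetical labels and ranks for five target utterances.
gold = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}
listener = {"A": 3, "B": 4, "C": 5, "D": 1, "E": 2}
print(rank_difference(gold, listener))  # 2 + 0 + 2 + 1 + 1 = 6
```

A listener who reproduces the gold standard exactly scores 0, so smaller values indicate a better grasp of the gist.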


Measuring Gisting Ability

[Figure: flow diagram of the evaluation. Stages shown: Speech Recording, Transcript, Utterance Identification, Judge Target Utterance Rankings, Gold Standard Target Utterance Ranking, Temporal Compression, Listener, Listener Ranking of Target Utterances, Comparison, Comprehension Efficiency. The transcript panels contain placeholder text; the example shows five target utterances ranked 1 to 5 by the gold standard and by a listener.]


Results

•Removing unimportant utterances performed better than speed up.

•Listeners understood the gist of a recording faster.

•All techniques performed better than applying no compression.

•With longer clips understanding was the same.

•Speed up required more interface interactions than excision.

[Figure: bar chart of Mean Comprehension Efficiency (axis from 0 to 0.004) by Compression Type: No Compression, Word Excision, Utterance Excision, Speed Up]

Dynamic Summarization

•Using summary information to locate points of interest within a meeting transcript.

•Traditional summaries can be customized but are largely presented statically.

•Underpinned by two concepts:

•User is able to dynamically alter the summarization level.

•Alteration shown in real time.

•Applying different presentation techniques.

Development Procedure

•Using the same process to evaluate as was used for the speech work.

•An initial lightweight evaluation of a number of UI concepts intended to find promising directions of research.

•A follow-up study examining the techniques in more detail with a more rigorous evaluation protocol.

Dynamic Summary Display

•Two unit levels examined:

•Words

•Utterances

•Two presentation techniques:

•Unit shading.

•Unit excision.

•Two hybrid techniques:

•Combining the four techniques into one

•An experimental fish-eye view
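The two presentation techniques can be sketched at the word level as follows. Both take words with importance scores and a summarization level between 0 and 1; excision drops the words below the cut, while shading keeps every word and only marks the unimportant ones. The scoring, the function names and the two-tone "black"/"grey" shading are assumptions for illustration; a real display could use continuous grey levels.

```python
import math

def important_indices(scored_words, level):
    """Indices of the top `level` fraction of words by importance."""
    if not 0 <= level <= 1:
        raise ValueError("level must be between 0 and 1")
    n_keep = math.ceil(len(scored_words) * level)
    order = sorted(range(len(scored_words)),
                   key=lambda i: scored_words[i][1], reverse=True)
    return set(order[:n_keep])

def excise_words(scored_words, level):
    """Unit excision: show only the important words, in original order."""
    keep = important_indices(scored_words, level)
    return " ".join(w for i, (w, _) in enumerate(scored_words) if i in keep)

def shade_words(scored_words, level):
    """Unit shading: show every word, greying out the unimportant ones."""
    keep = important_indices(scored_words, level)
    return [(w, "black" if i in keep else "grey")
            for i, (w, _) in enumerate(scored_words)]

# Hypothetical words with made-up importance scores.
words = [("the", 0.1), ("budget", 0.9), ("is", 0.2), ("fifty", 0.8), ("pounds", 0.7)]
print(excise_words(words, 0.5))  # budget fifty pounds
print(shade_words(words, 0.5)[0])  # ('the', 'grey')
```

Re-running either function whenever the user moves a level control gives the real-time alteration the display relies on.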

Examples

•Word Excision

•Word Shading


Initial results

•Shading works well.

•Operating at the word level is satisfactory.

•Fish-eye was not liked.

•The combinatorial approach did not really offer anything novel.


Follow Up Study

•Focus solely on the Word Excision and Word Shading techniques (highest rated in the previous experiment).

•Two questions (one specific, one general) about a number of meetings.

•Use the two interfaces (plus a control plain text transcript) to answer the questions (one question per meeting).

•Measure the time taken to answer, the accuracy and the number of interface actions used when answering the questions.

•Collect subjective preference data and user comments about each of the techniques.


Follow Up Study Results

•Subjects were largely accurate: there was no effect of interface type on accuracy.

•No effect of interface type on time taken to answer – i.e. there was no efficiency loss as a result of using the dynamic interfaces.


Preference and Process Results

•Subjects overwhelmingly preferred the Word Excision condition.

•Subjects scored the Word Excision and Plain Transcript conditions equally.

•The Word Shading condition required fewer interface actions than the Word Excision condition.

•Specifically, users spent more time changing compression levels in the Word Excision condition.


Video Compression

•The same techniques used for audio can also be applied to video.

•Compress the audio recording and use this compressed version to derive an audio-video recording.

•Informal evaluation indicates a different modality for video.
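Deriving the video from the compressed audio can be sketched as a mapping from the audio intervals that survive compression to the video frames that cover them. The function name, the interval format and the frame rate are assumptions for illustration; the slides do not describe the actual implementation.

```python
def frames_for_kept_audio(kept_intervals, fps):
    """Return the video frame indices covering the kept audio intervals.

    kept_intervals: list of (start_sec, end_sec) pairs, in playback order.
    fps: video frame rate (assumed constant).
    """
    frames = []
    for start, end in kept_intervals:
        first = int(round(start * fps))
        last = int(round(end * fps))  # exclusive
        frames.extend(range(first, last))
    return frames

# Two hypothetical kept intervals from an excised recording, at 25 frames per second.
frames = frames_for_kept_audio([(4.0, 6.0), (10.0, 12.0)], fps=25)
print(len(frames), frames[0], frames[-1])  # 100 100 299
```

Each jump between consecutive intervals becomes a visible cut in the resulting video.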


Video Examples

•Type of video being used

•Word excised video

•The cuts are now much more disconcerting.

•Sped Up video

•More comfortable to watch but disconcerting at high compression levels.

•Can also do non-linear compressed video

•Speed up only the non-silent parts.

•Can also e.g. speed up through unimportant parts


Summary

•Looking at Interfaces for Browsing Meeting Recordings

•Problems with abstraction in current meeting recording technology and automatic browsing systems

•Temporal Compression of Speech

•Reducing the time required to listen to a speech recording but keeping the important information.

•Utterance Excision.


Summary

•Dynamic presentation of meeting transcripts

•Real-time selection of summary level.

•Word Shading.

•Temporal Compression of Video

•Applying the above to video recordings.

•Speed up more effective.