TRANSCRIPT
G. Marchionini, Alicante, Nov. 2003
The Open Video Digital Library: Balancing Theory and Practice
JBIDI 2003
Gary Marchionini
University of North Carolina at Chapel Hill, USA
[email protected]
Alicante, Spain
November 10-12, 2003
Outline
• Challenge: Combining DL research with a production-level DL
• Overview of the Open Video DL
• The Production Process
• The Research Process
• Open Video redesign based on user studies
• Challenges of sustainability
Open Video Vision/Contributions
• An open repository of video files that can be re-used in a variety of ways by the education and research communities
– Encourages contributions
– A testbed for interactive interfaces
• An easy-to-use DL based upon the agile views interface design framework
– Multiple, cascading, easy-to-control views (pre, over, re, shared, peripheral)
– Views based upon empirically validated surrogates
– An environment for building theory of human information interaction
• A set of methods and metrics that reveal how people understand digital video through surrogates
Background & Status
• Begun 1995 with colleagues at UMD & BCPS
• Current funding: NSF #IIS-0099538
• Collaborators/Contributors: I2-DSI, ibiblio, CMU, UMD, NIST, Internet Archive, NASA
• ~0.5 TB of content
• ~2000 video segments
• ~1500 different titles
• ~4000 unique visitors per month (20,000 in Oct 03)
• I2-DSI video channel
• MPEG-1, MPEG-2, MPEG-4, QT
• OAI provider
• Ongoing user studies
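As an OAI provider, the repository exposes its metadata for harvesting over OAI-PMH. A minimal sketch of parsing Dublin Core records out of a ListRecords response; the sample XML is hand-written for illustration, not an actual Open Video record:

```python
# Sketch: extracting Dublin Core fields from an OAI-PMH ListRecords
# response. The sample document below is illustrative only.
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

sample = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <dc xmlns="http://purl.org/dc/elements/1.1/">
          <title>Sample segment</title>
          <format>MPEG-1</format>
        </dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def parse_records(xml_text):
    """Return one {element: text} dict per <record> in the response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI + "record"):
        fields = {}
        for el in rec.iter():
            # Keep only Dublin Core elements that carry text content.
            if el.tag.startswith(DC) and el.text and el.text.strip():
                fields[el.tag[len(DC):]] = el.text.strip()
        records.append(fields)
    return records
```

In a real harvester the XML would come from an HTTP GET against the provider's OAI-PMH endpoint, with `resumptionToken` handling for large result sets.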
Agile Views Interface Research
• Provide a variety of access representations (e.g., indexes) and control mechanisms
• Usual search and browse capabilities
• Leverage both visual and linguistic cues
• Create and test surrogates for overview, preview, shared, and history views
Open Video Release 2
• Incorporate more visual surrogates
• Improve search options
• Add recommendations
• Improve contributions forms
• Save results/partitions
• Provide user registration
• Provide help/descriptions
[Architecture diagram: a production system digitizes contributed video (MPEG etc., AVI), then performs segmentation, keyframe extraction, keyword extraction (text and audio), and surrogate and metadata creation. Results feed a MySQL database and distributed files behind the Open Video server, which clients (browsers and apps) reach through search, browse, and contribute functions.]
Acquisitions
• Contributors provide tapes (e.g., HCIL, Prelinger, NASA)
– Digitize
– Add metadata
• Contributors provide files & metadata (e.g., Informedia)
• Crawl trusted sites (Internet Archive, LoC)
• Individual contributions
Surrogate Creation
• Segmentation
– Manual
– Automatic (interframe grayscale correlation)
• Keyframe extraction
– MERIT (UMD)
– VAST (nth frames, export QuickTime fast forwards)
– Various scripts for managing GIFs, JPEGs, assembling storyboards
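The automatic segmentation above relies on interframe grayscale correlation: adjacent frames within a shot correlate strongly, so a sharp drop marks a segment boundary. A minimal sketch, with frames as flat lists of grayscale pixel values and an illustrative (not the project's tuned) threshold:

```python
# Sketch: shot-boundary detection by interframe grayscale correlation.
# Frames are flat lists of grayscale pixel values; the 0.5 threshold
# is illustrative only.
import math

def correlation(a, b):
    """Pearson correlation of two equal-length grayscale frames."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Flat frames: identical-variance pairs count as fully correlated.
        return 1.0 if var_a == var_b else 0.0
    return cov / math.sqrt(var_a * var_b)

def find_boundaries(frames, threshold=0.5):
    """Indices at which a new segment begins."""
    return [i for i in range(1, len(frames))
            if correlation(frames[i - 1], frames[i]) < threshold]
```

A production pipeline would of course decode real video frames and downsample them before correlating; the decision rule is the same.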
Web Database
• MySQL database
• Metadata tables
• PHP middleware
• Contribution forms
• Administrative tools
– Logging
– Demos
– Related web objects
– Backups
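The idea behind the metadata tables and middleware can be sketched with Python's built-in sqlite3 standing in for the MySQL/PHP stack; the table, columns, and sample rows are illustrative, not the project's actual schema:

```python
# Sketch: a metadata table plus a keyword search over it, using
# sqlite3 as a stand-in for MySQL. Schema and rows are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE segments (
        id INTEGER PRIMARY KEY,
        title TEXT,
        genre TEXT,        -- e.g. documentary, narrative
        duration_s INTEGER,
        format TEXT        -- e.g. MPEG-1, MPEG-2, MPEG-4, QT
    )""")
conn.executemany(
    "INSERT INTO segments (title, genre, duration_s, format) VALUES (?, ?, ?, ?)",
    [("Coral reefs", "documentary", 120, "MPEG-1"),
     ("Apollo 11 launch", "documentary", 95, "MPEG-2")])

def search(keyword):
    """Keyword match over titles, as a search form's backend might do."""
    cur = conn.execute(
        "SELECT title, format FROM segments WHERE title LIKE ?",
        ("%" + keyword + "%",))
    return cur.fetchall()
```

The parameterized queries mirror what the PHP middleware would issue against MySQL on behalf of the search and contribution forms.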
Other Tools in the DL Toolkit
• ISEE
– Asynchronous remote use of video
– Video player, chat tool, shared browser
– Linked via time codes
• VIVO (video indexing and visual organizer)
– Multi-level indexing with inheritance (collection, video, segment, scene, shot, frame)
– Manual frame extraction
• Peer-to-peer sharing
User Study Framework
[Framework diagram relating effort, tasks, video characteristics, individual characteristics, and surrogates/agile views to outcomes:]
• Effort
– Goals: learning, work, entertainment
– Time: time spent searching and viewing results
– Mental load: perceptual load, cognitive load
– Physical load: amount of muscle movement
• Tasks: select video for viewing, select scene for viewing, copy and use scenes, copy and use frames, other tasks?
• Video characteristics: genre (documentary, narrative), topic (literal, figurative), style (visual, audio, textual, place)
• Individual characteristics: domain experience, video experience, cultural experience, computer experience, info-seeking experience, metacognitive abilities, demographics
• Surrogates, agile views: display controls; keywords; storyboard w/ text, audio; slide show w/ text, audio; fast forward w/ audio; poster frames
• Outcomes
– Performance: retrieval (precision, recall), recognition (objects, action), gist comprehension (linguistic, visual)
– Satisfaction: perceived usefulness, perceived ease of use, flow, user satisfaction
The Surrogates
• Storyboard with text keywords (20-36 per board @ 500 ms)
• Storyboard with audio keywords
• Slide show with text keywords (250 ms, repeated once)
• Slide show with audio keywords
• Fast forward (~4X)
• Fast forwards at 32X, 64X, 128X, 256X
• Poster frames
• Real-time clips
• Text titles
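The fast forward surrogates above (4X through 256X) amount to keeping every Nth frame and playing the survivors at the normal frame rate. A minimal sketch of the index arithmetic:

```python
# Sketch: an N-times fast forward surrogate keeps every Nth frame,
# so 4X playback shows frames 0, 4, 8, ... at the normal frame rate.
def fast_forward_indices(frame_count, speedup):
    """Frame indices retained for an N-times fast forward."""
    return list(range(0, frame_count, speedup))

def surrogate_duration_s(frame_count, speedup, fps=30):
    """Playback length of the surrogate at the normal frame rate."""
    return len(fast_forward_indices(frame_count, speedup)) / fps
```

At 256X, a one-hour video collapses to roughly fourteen seconds, which is why the studies below ask how fast is too fast.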
Surrogate Examples
• Text surrogate: title, keyword, description, etc.
• Still image surrogate: poster frame, storyboard/filmstrip, slide show, video stream, key-frame-based table of contents, etc.
• Moving image surrogate: skim, fast forward, etc.
• Audio surrogate: spoken keywords, environmental sounds, music, etc.
• Multimodal surrogate: text surrogate + still image surrogate, still image surrogate + audio surrogate, etc.
Metrics (text / still image / action)
• Recognition: object recognition (text), object recognition (graphical), action recognition
• Inference: gist determination (free text), gist determination (multiple-choice), visual gist determination
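Retrieval performance in this framework is scored with standard precision and recall; a minimal sketch over sets of segment ids:

```python
# Sketch: precision and recall over retrieved vs. relevant segment ids,
# as used to score retrieval performance in the user studies.
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Recognition and gist-comprehension scores come from human judgments (object naming, free-text and multiple-choice gist) rather than set arithmetic, so they have no analogous one-liner.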
User Studies
• Study 1: Qualitative comparison of surrogates (ECDL 02)
• Study 2: Fast forwards (JCDL 03)
• Study 3: Narrativity (CHI 02)
• Study 4: Shared views and history views (Geisler dissertation)
• Study 5: Poster frames and text (eyetracking, CIVR 03)
• Study 6: TREC evaluation
• Current studies
– Hughes MP, Gruss MP
• Redesign effects and integration of surrogates in AV
The Challenges Ahead
• Sustain the system given usage and new contributions
• Preserve the files and behaviors
• Extend research on how people understand videos through surrogates
http://www.open-video.org
• Marchionini, G. & Geisler, G. (2002). The Open Video Digital Library. D-Lib Magazine, 8(12). http://www.dlib.org/dlib/december02/marchionini/12marchionini.html
• Marchionini, G. (2003). Video and learning redux: New capabilities for practical use. Educational Technology, March 2003.
• Slaughter, L., Marchionini, G. & Geisler, G. (2000). Open Video: A framework for a test collection. Journal of Network and Computer Applications, 23(3), 219-245.
• Wildemuth, B., Marchionini, G., Wilkens, T., Yang, M., Geisler, G., Fowler, B., Hughes, A., & Mu, X. (2002). Alternative surrogates for video objects in a digital library: Users' perspectives on their relative usability. Proceedings of the 6th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2002) (Rome, September 16-18, 2002). Berlin: Springer. 493-507.
• Geisler, G., Marchionini, G., Wildemuth, B., Hughes, A., Yang, M., Wilkens, T., & Spinks, R. (2002). Video browsing interfaces for the Open Video Project. Proceedings of CHI 02, Extended Abstracts (Minneapolis, MN, April 20-25, 2002). NY: ACM Press. 514-515.
• Wildemuth, B., Marchionini, G., Yang, M., Geisler, G., Wilkens, T., Hughes, A. & Gruss, R. (2003). How fast is too fast? Evaluating fast forward surrogates for digital video. JCDL 2003.
• Geisler, G. (2003). AgileViews: A framework for creating more effective information seeking interfaces. Unpublished doctoral dissertation, UNC-Chapel Hill.
• Nelson, M. L., Marchionini, G., Geisler, G., & Yang, M. (2001). A bucket architecture for the Open Video Project [short paper]. JCDL '01, ACM-IEEE Joint Conference on Digital Libraries (June 24-28, 2001, Roanoke, VA).
• Geisler, G. & Marchionini, G. (2000). The Open Video Project: A research-oriented digital video repository [short paper]. Digital Libraries '00: The Fifth ACM Conference on Digital Libraries (June 2-7, 2000, San Antonio, TX). New York: ACM. 258-259.