Audio and Video
Chris McConnell
Department of Radio-TV-Film
November 30, 2006
Overview
• Structured vs. Unstructured Data
• The Big Challenge
• Problems with Audio
• Some Audio Solutions
• Problems with Video
• Some Video Solutions
• Conclusion
Structured vs. Unstructured Data
• Data management folks often talk about two kinds of data in the enterprise.
• Structured data has a format that is enforced through software.
• Unstructured data is not easily understood by computers.
Structured Data
• Structured data is stored in a way that makes it easy for computers to index and search.
• The short answer is that structured data is in a database.
• Things like XML files lie in a sort of gray area between structured and unstructured.
Unstructured Data
• Unstructured data is just about anything else that doesn’t have a predictable structure.
• Text documents (email, HTML, Word)
• Images
• Audio
• Video
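The contrast between the two kinds of data can be sketched in a few lines of Python. This is only an illustration: the table layout, the sample records, and the function names are all invented here and do not come from the talk. The first function queries a database table whose columns are enforced by a schema; the second can only test whether a name appears somewhere in free text.

```python
import sqlite3

# Hypothetical sample records, invented for illustration.
TRACKS = [
    ("Song One", "Artist A", 2006),
    ("Song Two", "Artist B", 2005),
]

def structured_lookup(artist):
    """Query a table whose columns are enforced by the database schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tracks (title TEXT, artist TEXT, year INTEGER)")
    conn.executemany("INSERT INTO tracks VALUES (?, ?, ?)", TRACKS)
    rows = conn.execute(
        "SELECT title FROM tracks WHERE artist = ?", (artist,)
    ).fetchall()
    conn.close()
    return [title for (title,) in rows]

# The same kind of facts as free text: nothing marks which words are an artist.
NOTE = "Heard Song One by Artist A last night; Artist B covered it in 2005."

def unstructured_lookup(artist):
    """Substring test on free text; it cannot say what role the name plays."""
    return artist in NOTE
```

The structured query returns exactly the titles recorded for an artist. The text search reports that "Artist B" appears in the note, but software cannot tell from the match whether Artist B wrote, covered, or merely got mentioned alongside the song.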
Unfortunately, it gets more complicated
• It’s useful to break out this category in a few different ways.
• Text vs. “bitmap” data
• Text can be read by crawlers, discovery packages, etc.
• Good IA should make text easier to understand for automated tools.
Bitmap Data
• Bitmap data represents images, audio, or video as a series of numbers that represent each slice of a file.
• Darn near impossible for computers to extract meaning from the content itself.
• Many vector-based data types like Illustrator, Flash, MIDI, and maybe PDF suffer from a similar problem.
Metadata
• Some bitmap data formats like mp3 offer the ability to add metadata that provides more context or structure for the content.
• Video files, however, usually carry only operational metadata. Even where a format has space for metadata about the content, it is rarely used, which makes video an even greater IA challenge.
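As a concrete case of content metadata in an mp3, here is a minimal Python sketch that reads an ID3v1 tag, the older of the mp3 tag formats, which occupies the last 128 bytes of a file and starts with the marker "TAG". It deliberately ignores ID3v2, which sits at the start of the file and has a more involved layout. The function names and the placeholder file builder are assumptions made for this example.

```python
def read_id3v1(data):
    """Return ID3v1 fields from raw mp3 bytes, or None if no tag is present."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(start, end):
        # Fields are fixed width and padded with NUL bytes or spaces.
        return tag[start:end].split(b"\x00")[0].decode("latin-1").strip()

    return {
        "title": text(3, 33),
        "artist": text(33, 63),
        "album": text(63, 93),
        "year": text(93, 97),
    }

def make_fake_mp3(title, artist, album, year):
    """Build placeholder bytes ending in an ID3v1 tag (not real audio)."""
    def pad(value, width):
        return value.encode("latin-1")[:width].ljust(width, b"\x00")

    tag = (
        b"TAG"
        + pad(title, 30)
        + pad(artist, 30)
        + pad(album, 30)
        + pad(year, 4)
        + b"\x00" * 30   # comment field, left empty
        + b"\xff"        # genre byte; 255 is commonly used for "unknown"
    )
    return b"\x00" * 64 + tag
```

Calling read_id3v1 on the bytes of a tagged file returns a small dictionary of title, artist, album, and year, which is exactly the context a page or player could show next to the audio.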
Some Audio Problems
• Even if mp3s have metadata that reveal title, artist, and other information, this information is rarely used on the Web.
• Web browsers do not display this information.
• Web pages rarely provide context in a useful way.
Case Study: Slumber Party
• The Pitchfork record review site allows users to click through directly to mp3s on record label sites.
• When an mp3 is playing, users have little idea what they’re listening to.
• The record label has an even worse IA situation…
More Audio Problems
• To keep users from downloading mp3s, many labels and mainstream publications use Flash or JavaScript players.
• These players are often off-the-shelf solutions that do not display ID3 tags.
• Example: XXL
Some Solutions
• Use embedding to make an mp3 playable within a page, with appropriate labels.
• A direct mp3 link can also be provided, both for downloading and for users with older browsers.
• Working with Flash, add context to the Flash file.
• Example: Warner Bros. Flaming Lips site.
• However, it is difficult to provide a URI for embedded content.
Embedding mp3s
• This allows you to create contextual text that describes the file.
• In addition, it creates a URI for a contextual page for the song.
<EMBED src="file.mp3" autostart=true>
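One way to get both the contextual text and a page URI for each song is to generate a small page per mp3. The Python sketch below is an assumption-laden illustration: the function name, the page layout, and the sample values are invented, and it simply wraps the EMBED tag shown above with a title, an artist line, and a plain download link.

```python
from html import escape

def song_page(title, artist, mp3_url):
    """Return HTML for a page that labels an embedded mp3 and links to it."""
    # Escape every value so titles containing & or quotes stay valid HTML.
    t, a, u = escape(title), escape(artist), escape(mp3_url)
    return (
        "<html>\n"
        f"<head><title>{t} by {a}</title></head>\n"
        "<body>\n"
        f"<h1>{t}</h1>\n"
        f"<p>Artist: {a}</p>\n"
        f'<EMBED src="{u}" autostart="true">\n'
        f'<p><a href="{u}">Download the mp3</a></p>\n'
        "</body>\n"
        "</html>\n"
    )
```

Saving the returned string as, say, one HTML file per song gives each track its own address, and the heading and artist line tell the listener what is playing even though the browser ignores the tags inside the file.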
Problems with Video
• As with mp3s, many video files play in a browser window with little additional context. Example.
• Video is often difficult to describe or label, especially if it’s so-called “viral video.”
Medium-Specific Video Problems
• Time-based media are difficult to scan or “scrub” for particular content.
• Search engines cannot index video content, unless transcripts are provided on an HTML page.
• Users may not have the patience to get to the good stuff.
Tagging Solutions
• Tagging allows users to create descriptive labels for video. Example: YouTube
• YouTube’s tagging system still does not allow users to identify interesting content inside the video.
• “Deep Tagging” allows users to mark and label points in the video. Example: MotionBox
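The idea behind deep tagging can be sketched as a small data structure: each tag pairs a label with an offset into the video, so a search can return moments rather than whole files. The Python below is a hypothetical model for illustration only; the class, the function names, and the sample tags are invented and do not describe how MotionBox or YouTube actually store tags.

```python
from dataclasses import dataclass

@dataclass
class DeepTag:
    seconds: int  # offset from the start of the video
    label: str    # user-supplied description of that moment

def find_moments(tags, keyword):
    """Return tags whose label contains the keyword, earliest first."""
    keyword = keyword.lower()
    matches = [t for t in tags if keyword in t.label.lower()]
    return sorted(matches, key=lambda t: t.seconds)

def timestamp(seconds):
    """Format an offset in seconds as minutes:seconds for display."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"
```

With tags such as DeepTag(40, "first goal") and DeepTag(95, "goal celebration") on one video, find_moments(tags, "goal") returns both moments in playing order, and timestamp turns each offset into a label a player could show next to a jump link.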
References
• Weglarz, Geoffrey. “Two Worlds of Data – Unstructured and Structured,” DM Review September 2004.
• “Add Voice to your Site: The html EMBED Tag.” Accessed from http://www.world-voices.com/resources/addaud.html
• “Deep Tagging and the Embeddable Motionbox Player” Motionbox Blog. September 13, 2006.