the mysteries of metadata

The Mysteries of Metadata. Workshop at Content World 2001, Burlingame, CA. May 15, 2001. Amit Sheth, amit@taalee.com, Founder/CEO, Taalee (www.taalee.com) [Taalee is now Semagix: www.semagix.com]. Also Director, Large Scale Distributed Information Systems (LSDIS) Lab, University of Georgia (lsdis.cs.uga.edu). Metadata Extraction is a patented technology of Taalee, Inc. Semantic Engine and WorldModel are trademarks of Taalee, Inc.


DESCRIPTION

Amit Sheth, "The Mysteries of Metadata,"Workshop (Tutorial) at Content World 2001, Burlingame, CA. May 15, 2001

TRANSCRIPT

Confidential HP

The Mysteries of Metadata
Workshop at Content World 2001, Burlingame, CA. May 15, 2001

Amit Sheth, amit@taalee.com

Founder/CEO, Taalee (www.taalee.com) [Taalee is now Semagix: www.semagix.com]

Also Director, Large Scale Distributed Information Systems (LSDIS) Lab, University of Georgia (lsdis.cs.uga.edu)

Metadata Extraction is a patented technology of Taalee, Inc. Semantic Engine and WorldModel are trademarks of Taalee, Inc.

HP 2

Workshop Agenda

What is Metadata

Metadata Descriptions and Standards

Metadata StorageExchangeInfrastructure

(Automated) Metadata CreationExtractionTagging

Metadata UsageApplications

HP 3

What is Metadata?

Data about data
Statements, contexts
Recursive: data about "data about data"

Applications: content management, cataloguing, information retrieval, search, ...

"A Web content repository without metadata is like a library without an index" - Jack Jia, IWOV

HP 4

Information Interoperability: key metadata objective and benefit

System

Syntax

Structure

Semantics

Protocols, Metadata, Domain Modeling/Ontologies

HP 5

Semantics

Meaning, Understanding

Facts, Context, Reasoning

Related to exchange, usage, application

HP 6

A metadata classification

Data (heterogeneous types/media)

Content-independent metadata (creation date, location, type of sensor)

Content-dependent metadata (size, max colors, rows, columns)

Direct content-based metadata (inverted lists, document vectors, WAIS, Glimpse, LSI)

Domain-independent (structural) metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure)

Domain-specific metadata: area, population (Census); land-cover, relief (GIS); metadata concept descriptions from ontologies

Ontologies, classifications, domain models

User

Move in this direction to tackle information overload

HP 7

Types of Metadata for digital media

Media type-specific metadata, e.g., texture of images, font size, ...

Media processing-specific metadata, e.g., search, retrieval, personalized filtering

Content-specific metadata, e.g., rocket-related video and documents

HP 8

Metadata for Digital Data

Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al.] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
Topic Change Indices [Chen et al.] | Audio | Direct Content Based
Document Vectors [Deerwester et al.] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al.] | Text | Domain Independent
Contexts [Sciore et al., Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al.] | Structured | Domain Specific
User's Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific

HP 9

Types of Specs and Standards (or MetaModels)

Domain Independent: (MCF), RDF, MOF, Dublin Core

Media Specific: MPEG4, MPEG7, VoiceXML

Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)

Application Specific: ICE (Syndication)

Exchange/Sharing: XCM, XMI

Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)

HP 10

What RDF can do for metadata

Designed to impose structural constraints on syntax to support consistent encoding, exchange, and processing of metadata

Domain Independent Metadata standard

HP 11

RDF (Resource Description Framework)

Resource - Property - Value

• RDF data consists of nodes and attached attribute/value pairs

• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata

• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances
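
The node and attribute/value model described above can be sketched in a few lines of Python. This is only an illustration, not an RDF library: the `Triple` type, the helper functions, and the identifiers (`URI:TALK`, `URI:AMIT`, `bib:name`) are assumptions modeled loosely on the talk's examples.

```python
from collections import namedtuple

# One RDF statement: a resource (node), a named property, and a value.
Triple = namedtuple("Triple", ["resource", "prop", "value"])

# Hypothetical identifiers for illustration only; they are not real URIs.
triples = [
    Triple("URI:TALK", "dc:title", "Mysteries of Metadata"),  # atomic value
    Triple("URI:TALK", "dc:creator", "URI:AMIT"),             # value is another resource
    Triple("URI:AMIT", "bib:name", "Amit Sheth"),             # atomic value
]

def describe(resource, statements):
    """Return the attribute/value pairs attached to one node."""
    return {t.prop: t.value for t in statements if t.resource == resource}

def is_resource(value, statements):
    """Treat a value as a resource if it is itself described by some statement."""
    return any(t.resource == value for t in statements)

talk = describe("URI:TALK", triples)
```

Here `talk` holds the two properties of the talk node, and `is_resource("URI:AMIT", triples)` distinguishes a value that is itself a node from a plain string.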

HP 12

RDF Example 1

(Figure: URI:TALK has property dc:title with value "Mysteries of Metadata" and property dc:creator with value URI:AMIT)

<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>

HP 13

RDF Example 2

(Figure: RDF graph with nodes URI:TALK, URI:AMIT, and URI:LIB; properties dc:title, dc:creator, BIB:Name, BIB:Email, and BIB:Aff; literal values "Mysteries of Metadata", "Amit Sheth", and amit@taalee.com)

HP 14

RDFS (RDF Schema)

Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, ...)

Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties

HP 15

RDF Based Web

(Figure: HTML resources, RDF/XML descriptions, RDF schemas)

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International, inter-discipline, W3C community consensus

"Semantic" interface among resource description communities (very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<?xml:namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?xml:namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
     DC:Title = "I've Never Metadata I've Never Liked"
     DC:Creator = "Mary Crystal"
     DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content

NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries

Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE

HP 20

NewsML ...

It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device

Accurate, objective set of description tools which help qualify the information and make the search more precise

NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about

HP 21

Example of the end-to-end flow -NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary: basic DC; extensions; "controlled vocabularies", e.g., "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
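
One way to see why building on Dublin Core and RDF keeps PRISM descriptions easy to consume is to read the elements back with a generic XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the `SAMPLE` string is a trimmed copy of the description above, and the function name `dublin_core_fields` is an assumption made for this illustration, not part of PRISM.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Trimmed copy of the PRISM example, kept short for illustration.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""

def dublin_core_fields(rdf_xml):
    """Map each described resource to its Dublin Core element values."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        about = desc.get(f"{{{RDF_NS}}}about")
        fields = {}
        for child in desc:
            if child.tag.startswith(f"{{{DC_NS}}}"):
                name = child.tag[len(DC_NS) + 2:]  # strip the "{namespace}" prefix
                fields[name] = (child.text or "").strip()
        result[about] = fields
    return result

records = dublin_core_fields(SAMPLE)
```

The result is keyed by the `rdf:about` URI, so descriptions of several assets in one document stay separate.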

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source httpwwwvoicexmlorg

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing; one-number "find-me" services

Intranet - inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source httpwwwvoicexmlorg

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (e.g., yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main goal: efficient and extensible content syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define:

roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: a content aggregator and distributor
Subscriber: a content consumer
Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: the current content of a subscription
ICE Package: a delivery of commands to update a collection, such as the addition of content items
ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ...(ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

(Callout on the comic-strip element: Content (domain-specific metadata))
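
The split between the ICE vocabulary (content-independent) and the industry vocabulary (domain-specific) can be illustrated by walking a payload and sorting elements by name. The sketch below is an assumption-laden simplification: `PAYLOAD` is a shortened, well-formed stand-in that is not validated against the ICE DTD, and the rule "element names beginning with `ice-` belong to the ICE vocabulary" is a convenience for this example rather than something the ICE specification is claimed to define.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for an ICE payload; not validated against the ICE DTD.
PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" name="Cartoon" filename="demo.gif">
        <comic-strip title="Looney City" pubdate="20010515">...</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""

def split_metadata(payload_xml):
    """Separate syndication (envelope) elements from domain content elements."""
    root = ET.fromstring(payload_xml)
    envelope, domain = [], []
    for elem in root.iter():  # document order, root included
        record = (elem.tag, dict(elem.attrib))
        # Assumption: "ice-" prefixed names are protocol vocabulary, the rest is domain vocabulary.
        if elem.tag.startswith("ice-"):
            envelope.append(record)
        else:
            domain.append(record)
    return envelope, domain

envelope, domain = split_metadata(PAYLOAD)
```

A subscriber could route on the envelope records alone and hand the domain records to an application that understands the comic-strip vocabulary.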

HP 36

XCM (eXtended Content Management)

a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are

Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM

eXtended Content Management

Content DevelopmentManagement

Content DeliveryApplication ContentManagement

Content AuthoringDigital Asset Management

Software ConfigurationManagement

Document ProcessManagement

Metadata ManagementRecombinationPersonalization

Edge Network Delivery

Streaming Media DeliveryCaching

Source httpwwwvignettecom

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...

Kansas State
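
A common way to cope with two models that name the same data differently is a crosswalk table that maps each model's tag names onto shared field names. The sketch below is a hypothetical illustration: the common field names and the `normalize` function are invented for this example, and only a few of the tag pairs from the slide are included.

```python
# Hypothetical crosswalk: common field name -> tag name used by each model.
CROSSWALK = {
    "keywords": {"FGDC": "Theme keywords", "UDK": "Search terms"},
    "title": {"FGDC": "Title", "UDK": "Topic"},
    "spatial_method": {"FGDC": "Direct Spatial Reference Method",
                       "UDK": "Measuring Techniques"},
    "coordinate_system": {"FGDC": "Horizontal Coordinate System Definition",
                          "UDK": "Co-ordinate System"},
}

def normalize(record, model):
    """Rename model-specific tags to the common field names; unknown tags are dropped."""
    by_tag = {tags[model]: common for common, tags in CROSSWALK.items()}
    return {by_tag[tag]: value for tag, value in record.items() if tag in by_tag}

fgdc = {"Title": "Dakota Aquifer", "Direct Spatial Reference Method": "Vector"}
udk = {"Topic": "Dakota Aquifer", "Measuring Techniques": "Vector"}

same = normalize(fgdc, "FGDC") == normalize(udk, "UDK")
```

Once both records are normalized they compare equal, which is the syntactic half of interoperability; agreeing on what "Vector" or "Topic" means is the semantic half that ontologies address.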

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services Taalee Content Applications

Where is the content? Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate, Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction; types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web sites, content feeds, and private repositories
Types: text, graphics, audio, video, multimedia
Forms: unstructured text, semi-structured text, structured text (+ media); static or dynamic
Ingest: feed (push), Web (pull), repository/database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange
Feed handlers, crawlers, screen scrapers, bots, software agents

Centralized, distributed, mobile/migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr... The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
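
A layout rule of this kind translates almost directly into a regular expression. The sketch below is one possible reading of the rule `Date => day month int ',' int`, assuming the report keeps its original punctuation ("Friday, August 1, 1997"); the optional comma after the day name and the name `extract_date` are assumptions added for this illustration.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Date => day month int ',' int  (a comma after the day name is tolerated here)
DATE_RULE = re.compile(
    rf"(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+(?P<dom>\d{{1,2}}),\s*(?P<year>\d{{4}})"
)

def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return {
        "day": match.group("day"),
        "month": match.group("month"),
        "day_of_month": int(match.group("dom")),
        "year": int(match.group("year")),
    }

report_date = extract_date(
    "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
)
```

The rule captures where a date sits in the layout but says nothing about what the date means in the report, which is the limit of a purely syntactic approach.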

Traditional Text Categorization

(Figure: statistical/AI techniques classify each article from the customer article feed (place it in a taxonomy), using a customer training set, for routing/distribution)

Classification of Article 4715

Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

(Figure: knowledge-base and statistical/AI techniques classify each article from the feed (place it in a taxonomy), using the Taalee training set and a customer training set. Results feed the metadata catalog and content manager for precise syndication/filtering and routing/distribution, and can be mapped to another taxonomy. Components shown: Automated Content Enrichment (ACE), Taalee Enterprise Customization Suite.)

Article 4715 Metadata

Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000

Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News

Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

Map to another taxonomy
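
The difference between standard and semantic metadata can be illustrated with a toy knowledge-base lookup. This is a deliberately naive sketch and not Taalee's patented extraction method: the `KNOWLEDGE_BASE` contents come from the Article 4715 example, while the substring matching, the function name `enrich`, and the sample sentence are assumptions made for illustration.

```python
# Hypothetical miniature knowledge base: entity name -> known facts.
KNOWLEDGE_BASE = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}

def enrich(article_text, standard_metadata):
    """Add semantic metadata for every known company mentioned in the text."""
    # Naive substring matching stands in for real entity recognition.
    companies = [name for name in KNOWLEDGE_BASE if name in article_text]
    semantic = {
        "companies": companies,
        "tickers": [KNOWLEDGE_BASE[c]["ticker"] for c in companies],
        "exchanges": sorted({KNOWLEDGE_BASE[c]["exchange"] for c in companies}),
    }
    return {**standard_metadata, **semantic}

record = enrich(
    "France Telecom agreed to acquire Equant.",  # made-up sentence for the example
    {"feed_source": "iSyndicate", "posted_date": "11/20/2000"},
)
```

The ticker and exchange never appear in the text itself; they come from the knowledge base, which is what lets the article be routed to categories such as company analysis for FTE and ENT.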

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: dictionary, thesaurus, reference data, vocabulary

B. Facts with relationships: taxonomy (categories), ontology/domain modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical learning/AI (Bayesian, neural networks, HMM, ...)
Logic based (description logic)
Natural language/grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Word or phrase: statistical methods, cluster analysis; learning/AI and collaborative filtering

Reference data: concept-terms, dictionary, thesaurus; by topic/industry/subject/domain

Ontologies/domain models

Knowledge base: by entities and relationships

(Moving down this list gives deeper understanding)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (ontological commitment)

HP 58

Ontology

Description includes: attributes, domain rules, functional dependencies

HP 59

An Ontology

Example Interrelated ontologies

(Figure: interrelated ontologies. Node labels recovered from the diagram, grouped approximately:)

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked by "causes" relationships

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

Traditional queries based on keywords
Attribute-based queries
Content-based queries

HP 64

Oingo.com

Oingo Ontology - (ODP based?) the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the content

• Semantic metadata describing entities (e.g., Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" content can only be "found" if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the content
Semantic metadata describing entities (e.g., Company, Industry, etc.), and
Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering; link and other analysis (e.g., Google's Link Flux analysis); classification as context; ontologies; metadata; knowledgebases; ...

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage: search on "football touchdown"

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline...

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline...

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
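
The contrast between keyword search and attribute search can be made concrete with a small in-memory catalog. The sketch below is an illustration only: the two assets paraphrase the examples on this slide, and the field names (`league`, `teams`, `event`, `players`) and both search functions are assumptions, not the interface of Virage or Taalee.

```python
ASSETS = [
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},  # typical cataloging: free text only
    {"title": "Baltimore 31 Pit 24",
     "description": "Ismail and Banks hook up for a 76-yard touchdown",
     "metadata": {"league": "Professional", "teams": ["Ravens", "Steelers"],
                  "event": "Touchdown", "players": ["Quandry Ismail", "Tony Banks"]}},
]

def keyword_search(assets, word):
    """Match a word anywhere in the title or description."""
    word = word.lower()
    return [a["title"] for a in assets
            if word in (a["title"] + " " + a["description"]).lower()]

def attribute_search(assets, **criteria):
    """Match assets whose semantic metadata satisfies every criterion."""
    def matches(meta):
        for key, wanted in criteria.items():
            value = meta.get(key)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a["title"] for a in assets if matches(a["metadata"])]

by_keyword = keyword_search(ASSETS, "touchdown")
by_attribute = attribute_search(ASSETS, event="Touchdown", teams="Ravens")
```

The keyword query returns the interview that merely mentions a touchdown along with the actual play, while the attribute query returns only the asset whose event is a touchdown involving the Ravens.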

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and domain-specific attributes. Uniform metadata for content from multiple sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content
Personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more...

2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...

3 Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more...

4 Pink Floyd Collection
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more...

HOT Topics

1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more...

2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more...

3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with interactivity features, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection, Processing

Network Content

Affiliate Feeds

Public Sources Rich Data

Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) -- Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREAT USER EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Enhanced XML

Description

"Cisco Systems"

Node

Taalee Semantic Engine

"Cisco Systems"

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-rich Value-added Node

HP 93

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (such as COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
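The Roger Clemens case can be sketched in a few lines of Python. This is only an illustration of the idea, not Taalee's implementation: the `KNOWLEDGE_BASE` table, the relation names `plays_for` and `plays_sport`, and the helper functions are all assumptions made for the example.

```python
# Hypothetical knowledge base: (subject, relation) -> object.
# The entries mirror the slide's Roger Clemens example; the relation
# names are illustrative, not Taalee's.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Roger Clemens", "plays_sport"): "Baseball",
}


def enrich(metadata):
    """Return a copy of `metadata` with team and sport inferred from players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        for relation, field in (("plays_for", "teams"), ("plays_sport", "sports")):
            value = KNOWLEDGE_BASE.get((player, relation))
            if value is not None and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """Case-insensitive substring match of `query` against any metadata value."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values() for value in values)


extracted = {"players": ["Roger Clemens"]}  # what keyword extraction alone finds
enriched = enrich(extracted)

print(matches(extracted, "Yankees"))  # False: the keyword is not in the story
print(matches(enriched, "Yankees"))   # True: found through the added team name
```

The extracted metadata is left untouched and the inferred values are added alongside it, so an application can still tell what came from the asset and what was value-added.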

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of content provider that produced the asset

Posted Date: date of asset posting - extracted automatically

Player Names: names of players mentioned explicitly in the asset - extracted automatically

Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships

League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations

Sport: name of sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
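As a rough sketch of how typed associations let a search return more than what was asked for, the Commerce One example could be modeled as below. The `ASSOCIATIONS` list, the two placeholder story titles, and the `search` helper are assumptions for illustration; they are not the Semantic Content Model itself.

```python
# Hypothetical typed associations taken from the slide's example.
ASSOCIATIONS = [
    ("Commerce One", "IS_A", "COMPANY"),
    ("Ariba", "IS_A", "COMPANY"),
    ("Commerce One", "PARTICIPATES_IN",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
]

# Placeholder catalog: story title -> entities the story is tagged with.
STORIES = {
    "Story on Commerce One": {"Commerce One"},
    "Story on Ariba": {"Ariba"},
}


def related(entity, relation):
    """Objects linked to `entity` by `relation`."""
    return sorted(obj for subj, rel, obj in ASSOCIATIONS
                  if subj == entity and rel == relation)


def search(entity):
    """Return (direct hits, hits on competitors) for a company name."""
    direct = sorted(title for title, tags in STORIES.items() if entity in tags)
    competitors = set(related(entity, "COMPETES_WITH"))
    associated = sorted(title for title, tags in STORIES.items()
                        if tags & competitors)
    return direct, associated


direct, associated = search("Commerce One")
print(direct)      # ['Story on Commerce One']
print(associated)  # ['Story on Ariba']
```

A keyword index alone would return only the first list; the second list exists only because the COMPETES_WITH relation is stored as data rather than left implicit in the text.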

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1; Research; Internal Source 2; External feeds/Web (e.g., Reuters)

Extractor Agent 1, Extractor Agent 2, Extractor Agent 3

Components: Semantic Engine; World Model; Knowledge Base; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Story on Lucent

Story on Cisco

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What You Need to Know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

Technology, Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough:
we need a "logic layer" on top of RDF
some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: formal semantics and reasoning support - OIL, DAML-O

Schema Layer: definition of vocabulary - RDF Schema

Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
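A minimal Python sketch of this rule follows. It assumes the thresholds are 30 days and 10000 miles (the units named in the query later in this example), that `distance` is a great-circle distance, and that "after" excludes earthquakes that precede the test. The `Event` class and function names are hypothetical, and the coordinates in the usage lines are made up.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(a, b):
    """Great-circle distance between two events (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(
        math.radians, (a.latitude, a.longitude, b.latitude, b.longitude))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days_after = (quake.event_date - test.event_date).days
    return 0 <= days_after < max_days and distance_miles(test, quake) < max_miles


# Made-up events, used only to exercise the rule.
test = Event(date(2000, 1, 1), 37.0, -116.0)
quake_after = Event(date(2000, 1, 15), 36.0, -117.0)
quake_before = Event(date(1999, 12, 20), 36.0, -117.0)

print(could_have_caused(test, quake_after))   # True
print(could_have_caused(test, quake_before))  # False
```

Keeping both thresholds as parameters makes it easy to see how sensitive the set of matching pairs is to the choice of window and radius.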

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale, per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example ...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
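The grouping step in this query could look roughly like the Python below. It is a sketch under assumptions: each magnitude range includes its lower bound (the slide does not say which bound is inclusive), earthquakes arrive as `(year, magnitude)` pairs, and the sample records are invented purely to exercise the function.

```python
from collections import Counter

# Magnitude ranges from the slide; each bin is assumed to include its lower bound.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def bin_label(magnitude):
    """Label of the magnitude range, or None below 5.8 (outside the query)."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from (year, magnitude) pairs."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


# Invented sample records, not real catalog data.
sample = [(1960, 9.5), (1960, 6.1), (1960, 6.4), (1899, 7.0), (1961, 5.0)]
counts = counts_per_year(sample)
print(counts[(1960, "6-7")])  # 2
print(counts[(1960, ">9")])   # 1
```

Averaging each range over the years of interest then reduces to summing these counts per label and dividing by the number of years.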

Knowledge Discovery - Example ...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lay within a radius of 10,000 miles of the earthquake's epicenter.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998.


HP 6

A metadata classification

Data (heterogeneous types/media)

Content Independent Metadata (creation-date, location, type-of-sensor)

Content Dependent Metadata (size, max colors, rows, columns)

Direct Content Based Metadata (inverted lists, document vectors, WAIS, Glimpse, LSI)

Domain Independent (structural) Metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure)

Domain Specific Metadata: area, population (Census); land-cover, relief (GIS); metadata concept descriptions from ontologies

Ontologies, Classifications, Domain Models

User

Move in this direction to tackle information overload

HP 7

Types of Metadata for digital media

Media type-specific metadata, e.g., texture of images, font size, ...

Media processing-specific metadata, e.g., search, retrieval, personalized filtering

Content-specific metadata, e.g., rocket-related video and documents

HP 8

Metadata for Digital Data

Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al.] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
Topic Change Indices [Chen et al.] | Audio | Direct Content Based
Document Vectors [Deerwester et al.] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al.] | Text | Domain Independent
Contexts [Sciore et al.; Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al.] | Structured | Domain Specific
User's Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific

HP 9

Types of Specs and Standards (or MetaModels)

Domain Independent: (MCF), RDF, MOF, Dublin Core

Media Specific: MPEG4, MPEG7, VoiceXML

Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)

Application Specific: ICE (Syndication)

Exchange/Sharing: XCM, XMI

Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)

HP 10

What RDF can do for metadata

Designed to impose structural constraints on syntax to support consistent encoding, exchange, and processing of metadata

Domain-independent metadata standard

HP 11

RDF (Resource Description Framework)

Resource, Property, Value

• RDF data consists of nodes and attached attribute/value pairs

• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata

• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances

HP 12

RDF Example 1

URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT

<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>

HP 13

RDF Example 2

URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT
URI:AMIT --BIB:Name--> "Amit Sheth"
URI:AMIT --BIB:Email--> "amit@taalee.com"
URI:AMIT --BIB:Aff--> URI:LIB
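The graph in the two RDF examples can be sketched as plain (resource, property, value) triples. The Python below is only an illustration of the data model, not an RDF library: the `values` and `creator_names` helpers are hypothetical, and the punctuation inside names such as `URI:TALK` and `BIB:Name` is an assumption about how the slide labels were written.

```python
# Each statement is a (resource, property, value) triple. A value is either
# an atomic string or the identifier of another resource.
TRIPLES = [
    ("URI:TALK", "dc:title", "Mysteries of Metadata"),
    ("URI:TALK", "dc:creator", "URI:AMIT"),
    ("URI:AMIT", "BIB:Name", "Amit Sheth"),
]


def values(resource, prop):
    """All values of `prop` attached to `resource`."""
    return [v for r, p, v in TRIPLES if r == resource and p == prop]


def creator_names(resource):
    """Follow dc:creator to the creator node, then read its BIB:Name."""
    names = []
    for creator in values(resource, "dc:creator"):
        names.extend(values(creator, "BIB:Name"))
    return names


print(values("URI:TALK", "dc:title"))  # ['Mysteries of Metadata']
print(creator_names("URI:TALK"))       # ['Amit Sheth']
```

Because the creator is itself a resource rather than a string, further properties can be attached to it without touching the description of the talk, which is the point of the second example.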

HP 14

RDFS (RDF Schema)

Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, ...)

Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties

HP 15

RDF Based Web

HTML

Resources

RDF/XML Descriptions

RDF Schemas

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International, inter-discipline, W3C community consensus

"Semantic" interface among resource description communities (very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<xml>
  <namespace href="http://w3.org/rdf-schema" as="RDF"/>
  <namespace href="http://metadata.net/DC" as="DC"/>
  <RDF:Abbreviated>
    <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
                   DC:Title="I've Never Metadata I've Never Liked"
                   DC:Creator="Mary Crystal"
                   DC:Subject="Metadata, Dublin Core, Stuff"/>
  </RDF:Abbreviated>
</xml>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content.

NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.

Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.

HP 20

NewsML ...

It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.

Accurate, objective set of description tools which help qualify the information and make the search more precise.

NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata

Version 1.0, April 2001

Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)

Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"

Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML

Designed to be used in a simple, straightforward way over the Internet

Compatible with NewsML

Integrates easily with ICE (for syndication)

Vocabulary: Basic DC; Extensions; "Controlled Vocabularies", e.g., "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
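Because a PRISM description is ordinary RDF/XML, its Dublin Core fields can be read with a generic XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the example; the `dublin_core` helper is hypothetical, and the punctuation of the URLs in the embedded document is an assumption.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened copy of the PRISM example (XML declaration omitted).
DOCUMENT = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core(xml_text):
    """Map each described resource to its Dublin Core element values."""
    root = ET.fromstring(xml_text)
    records = {}
    for description in root.findall(f"{{{RDF}}}Description"):
        about = description.get(f"{{{RDF}}}about")
        fields = {}
        for child in description:
            # ElementTree spells qualified names as "{namespace}local".
            if child.tag.startswith(f"{{{DC}}}"):
                fields[child.tag[len(DC) + 2:]] = (child.text or "").strip()
        records[about] = fields
    return records


records = dublin_core(DOCUMENT)
print(records["http://wanderlust.com/2000/08/Corfu.jpg"]["title"])
# Walking on the Beach in Corfu
```

Matching on the namespace URI rather than on the `dc:` prefix keeps the reader working when a publisher chooses a different prefix for the same vocabulary.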

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

VoiceXML Metadata

Voice-specific metadata

Supports syntactic interoperability

Text data to voice data

VoiceXML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial - banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number find-me services

Intranet - inventory, HR services, corporate portals

Unification - My Whatever, personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: an efficient and extensible content syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define:

roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol

Syndicator: a content aggregator and distributor

Subscriber: a content consumer

Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement

Collection: the current content of a subscription

ICE Package: a delivery of commands to update a collection, such as the addition of content items

ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook, version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific

metadata)
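
The split between ICE (content-independent) metadata and domain-specific metadata can be sketched in a few lines of Python. This is only an illustration: the payload is modeled on the slide's sample, the DOCTYPE is left out so nothing external is referenced, and the function name `split_metadata` is an assumption, not part of ICE.

```python
import xml.etree.ElementTree as ET

# Sample payload modeled on the slide; all values are illustrative.
PAYLOAD = """<?xml version="1.0"?>
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">(encoded image)</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """Return one (ice_metadata, domain_metadata) pair per ice-item.

    Hypothetical helper: attributes on <ice-item> are treated as
    content-independent (ICE vocabulary); attributes on its child
    elements are treated as domain-specific vocabulary.
    """
    root = ET.fromstring(payload_xml)
    pairs = []
    for item in root.iter("ice-item"):
        ice_metadata = dict(item.attrib)
        domain_metadata = {child.tag: dict(child.attrib) for child in item}
        pairs.append((ice_metadata, domain_metadata))
    return pairs


for ice_md, domain_md in split_metadata(PAYLOAD):
    print("ICE metadata:   ", ice_md)
    print("Domain metadata:", domain_md)
```

A subscriber could use the first dictionary to manage delivery and the second to catalog the content, which is the division of labor the slide describes.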

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM: eXtended Content Management

Segments: Content Development/Management, Application Content Management, Content Delivery

Components shown in the diagram: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management, Metadata Management, Recombination, Personalization, Edge Network Delivery, Streaming Media Delivery, Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…

Kansas State
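
One common way to reconcile two models like these is a field-name crosswalk. The sketch below assumes a one-to-one correspondence between the UDK and FGDC tags shown on the slide; the mapping table and the function name `udk_to_fgdc` are illustrative, not part of either standard.

```python
# Hypothetical crosswalk: UDK tag name -> FGDC tag name, built only from
# the pairs visible on the slide.
UDK_TO_FGDC = {
    "Search terms": "Theme keywords",
    "Topic": "Title",
    "Address Id": "Online linkage",
    "Measuring Techniques": "Direct Spatial Reference Method",
    "Co-ordinate System": "Horizontal Coordinate System Definition",
}


def udk_to_fgdc(record):
    """Rename UDK tags to their FGDC equivalents; unmapped tags pass through."""
    return {UDK_TO_FGDC.get(tag, tag): value for tag, value in record.items()}


udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

print(udk_to_fgdc(udk_record))
```

A renaming table only handles differing tag names. Differences in meaning or granularity between fields would need a richer mapping, which is where ontologies come in later in the workshop.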

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata:

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services; Taalee Content Applications

Where is the content?

Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate/Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction

• Types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories

Types: Text, Graphics, Audio, Video, Multimedia

Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic

Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr… The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
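
A layout rule of this kind maps naturally onto a regular expression. The following is a minimal sketch, assuming English day and month names and treating the commas as optional; the names `DATE_RULE` and `extract_date` are illustrative.

```python
import re
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

# Date => day month int ',' int   (commas made optional for robustness)
DATE_RULE = re.compile(
    r"(?P<day>" + "|".join(DAY_NAMES) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTH_NAMES) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+(?P<year>\d{4})"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if absent."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month_number = MONTH_NAMES.index(match.group("month")) + 1
    return date(int(match.group("year")), month_number, int(match.group("dom")))


report = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_date(report))
```

Purely syntactic rules like this work well for fixed report layouts but say nothing about what the date refers to, which motivates the semantic techniques on the following slides.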

Traditional Text Categorization

[Diagram: a customer article feed (article 4715) is classified and placed in a taxonomy using statistical/AI techniques trained on a customer training set, then passed on for routing/distribution.]

Standard metadata for article 4715
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation: Automated Content Enrichment (ACE)

[Diagram: article 4715 from the article feed is classified and placed in a taxonomy using knowledge-base and statistical/AI techniques, trained on a Taalee training set and a customer training set (Taalee Enterprise Customization Suite). The result feeds the metadata catalog and content manager for precise syndication/filtering and routing/distribution, and can be mapped to another taxonomy.]

Article 4715 metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Classification of article 4715 (taxonomy nodes shown): FTE, ENT, Company Analysis, Conference Calls, Earnings, Stock Analysis, NYSE, Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text From Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference: Statistical (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM, …); Logic Based (Description Logic); Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis

Learning/AI and Collaborative Filtering

Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes

Capture the semantics involved via domain characteristics

Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Diagram of three interrelated ontologies:

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; related nodes LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several of these are linked by "causes" relationships]

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords

attribute based queries

content-based queries

HP 64

Oingo.com

Oingo Ontology – ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction

Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content

Semantic Metadata describing entities (e.g., Company, Industry, etc.), and

Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from typical cataloging of football assets (Virage search on "football touchdown"):

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee metadata on football assets (Rich Media Reference Page):

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Raven's lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
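
The difference between the two kinds of search can be sketched with a toy in-memory catalog. The records below reuse the slide's examples; the data layout and the function names `keyword_search` and `attribute_search` are assumptions for illustration, not Taalee's or Virage's implementation.

```python
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: free text only
    },
    {
        "title": "Baltimore 31, Pit 24",
        "description": ("Quandry Ismail and Tony Banks hook up for their "
                        "third long touchdown"),
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, *words):
    """Return assets whose title or description contains every word."""
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if all(word.lower() in text for word in words):
            hits.append(asset)
    return hits


def attribute_search(assets, **criteria):
    """Return assets whose structured metadata matches every criterion."""
    hits = []
    for asset in assets:
        metadata = asset["metadata"]
        matched = True
        for field, wanted in criteria.items():
            value = metadata.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                matched = False
                break
        if matched:
            hits.append(asset)
    return hits


print([a["title"] for a in keyword_search(ASSETS, "touchdown")])
print([a["title"] for a in keyword_search(ASSETS, "Steelers")])
print([a["title"] for a in attribute_search(ASSETS, Teams="Steelers", Event="Touchdown")])
```

The keyword query for "Steelers" finds nothing because the word never appears in the text, while the attribute query finds the game highlight through its Teams field.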

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes

Uniform Metadata for Content from Multiple Sources

Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory: Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory: Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content

Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…

2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…

3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1. Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos

Buy Russell Crowe Videos

Buy Christopher Plummer Videos

Buy Diane Venora Videos

Buy Philip Baker Hall Videos

Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine™

Meta-Data Tagged Content

Content "Programs"

Immediate Interests, Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information – including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize, Catalog, Integrate

Collection, Processing

Network Content

Affiliate Feeds

Public Sources

Rich Data Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREAT USER EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Enhanced XML Description

"Cisco Systems"

Node

Taalee Semantic Engine

"Cisco Systems"

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-rich Value-added Node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for

Extractor Agents

Content which does not contain the words the user asked for, but is about what he asked for

Value-added Metadata

Content the user did not think to ask for, but which he needs to know

Semantic Associations

+ +

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
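
The idea of adding metadata that is implied but not stated can be sketched with a small table of facts. Everything below is a toy illustration under stated assumptions: the fact list, the relationship names, and the `enrich` function are hypothetical and are not Taalee's knowledge base or API.

```python
# Hypothetical knowledge base of (subject, relationship, object) facts.
FACTS = [
    ("Roger Clemens", "plays_for", "New York Yankees"),
    ("Roger Clemens", "plays_sport", "Baseball"),
    ("Jamal Anderson", "plays_for", "Atlanta Falcons"),
    ("Atlanta Falcons", "member_of", "NFL"),
]

# Which metadata field each relationship populates (an assumed convention).
RELATION_TO_FIELD = {
    "plays_for": "Teams",
    "plays_sport": "Sport",
    "member_of": "League",
}


def enrich(metadata):
    """Return a copy of metadata with fields implied by known facts added.

    Follows relationships transitively, so a player implies a team and
    the team in turn implies a league.
    """
    enriched = {field: list(values) for field, values in metadata.items()}
    frontier = [value for values in metadata.values() for value in values]
    seen = set(frontier)
    while frontier:
        entity = frontier.pop()
        for subject, relation, obj in FACTS:
            if subject != entity:
                continue
            field_values = enriched.setdefault(RELATION_TO_FIELD[relation], [])
            if obj not in field_values:
                field_values.append(obj)
            if obj not in seen:
                seen.add(obj)
                frontier.append(obj)
    return enriched


extracted = {"Players": ["Jamal Anderson"]}  # what the text itself mentions
print(enrich(extracted))
```

With the enriched record indexed, a search on the team or league can return a story that only ever names the player.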

HP 96

Guided Demo for Value-Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 – Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 – Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content – Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset

Posted Date: Date of asset posting – Extracted automatically

Player Names: Names of players mentioned explicitly in the asset – Extracted automatically

Team Name: Name of team for which player plays – Not mentioned explicitly in asset – Value-added using Taalee's semantic relationships

League Name: Name of league to which the player's team belongs – Not mentioned explicitly in asset – Value-added by Taalee's processing based on semantic associations

Sport: Name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly, fully described in the many ways the users chose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
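
A search that returns associated content alongside direct hits can be sketched as follows. The relationship table, the placeholder news items, and the function names are all hypothetical; the only fact taken from the slides is that Commerce One competes with Ariba.

```python
# Hypothetical symmetric relationship: each competing pair is stored once.
COMPETES_WITH = {("Commerce One", "Ariba")}

# Placeholder news items; headlines are invented for illustration.
NEWS = [
    {"headline": "Commerce One story (placeholder)", "company": "Commerce One"},
    {"headline": "Ariba story (placeholder)", "company": "Ariba"},
    {"headline": "Unrelated story (placeholder)", "company": "Some Other Co"},
]


def competitors(company):
    """Return every company linked to `company` by COMPETES_WITH."""
    found = set()
    for left, right in COMPETES_WITH:
        if left == company:
            found.add(right)
        elif right == company:
            found.add(left)
    return found


def search_with_associations(company):
    """Return what was asked for plus news on semantically related companies."""
    rivals = competitors(company)
    return {
        "asked_for": [n for n in NEWS if n["company"] == company],
        "semantically_related": [n for n in NEWS if n["company"] in rivals],
    }


result = search_with_associations("Commerce One")
for group, items in result.items():
    print(group, [item["headline"] for item in items])
```

The second group is what a keyword index cannot supply: the Ariba story never mentions Commerce One, and it is surfaced only because of the stored relationship.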

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1, Research, Internal Source 2, External feeds/Web (e.g., Reuters)

Components: Extractor Agents 1-3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML, or other format

Stories shown: Story on Cisco, Story on Lucent

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults KnowledgeBase for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story – passed on to Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition

Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary - RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission

OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
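The rule above can be evaluated directly over event records. The sketch below is one possible reading in Python, not the InfoQuilt implementation. It assumes that the date difference is measured in days and the distance in miles (as a later slide states), that "some time after" means the earthquake may not precede the test, and that distance is great-circle distance computed with the haversine formula. The function names and the dictionary layout are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius used by the haversine formula


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the rule: the quake follows the test within max_days and max_miles."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    near = distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles
    # Assumption: a quake that happened before the test cannot have been caused by it.
    return 0 <= days_after < max_days and near
```

A candidate pair is then any `(test, quake)` combination for which `could_have_caused` returns `True`; the thresholds are parameters so that a stricter notion of "nearby" can be tried.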

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
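The grouping query above amounts to binning earthquakes by magnitude and averaging the yearly counts. A minimal Python sketch follows, assuming earthquakes arrive as `(year, magnitude)` pairs; the function names are hypothetical, and assigning a magnitude of exactly 9.0 to the ">9" group is a choice made for this sketch only.

```python
from collections import Counter

# Magnitude ranges from the slide: 5.8-6, 6-7, 7-8, 8-9 and >9.
BINS = [(5.8, 6.0, "5.8-6"), (6.0, 7.0, "6-7"), (7.0, 8.0, "7-8"), (8.0, 9.0, "8-9")]


def magnitude_bin(magnitude):
    """Return the label of the range a magnitude falls in, or None below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return ">9" if magnitude >= 9.0 else None


def average_per_year(quakes, first_year, last_year):
    """Average yearly count per magnitude range over [first_year, last_year]."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    years = last_year - first_year + 1
    return {label: counts[label] / years for label in counts}
```

Running `average_per_year` once for the years before the first test and once for the years after it gives the two sets of averages that the slide compares.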

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax

ICE: www.icestandard.org

Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05

XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03

DAML: www.daml.org

NEWSML: newsshowcase.reuters.com

PRISM: www.prismstandard.org/techdev/prismspec1.asp

XCM: www.vignette.com

OIL: www.ontoknowledge.org/oil

SEMANTICWEB: www.semanticweb.org

VOICEXML: www.voicexml.org

MPEG7: www.darmstadt.gmd.de/mobile/MPEG7

Taalee: www.taalee.com

Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 3

What is Metadata

Data about data

Statements, contexts

Recursive - data about "data about data"

Applications

Content management

Cataloguing

Information retrieval, search...

"A Web content repository without metadata is like a library without an index" - Jack Jia, IWOV

HP 4

Information Interoperability: key metadata objective and benefit

System

Syntax

Structure

Semantics, Protocols, Metadata, Domain Modeling/Ontologies

HP 5

Semantics

Meaning Understanding

Facts Context Reasoning

Related to exchange usage application

HP 6

A metadata classification

Data (Heterogeneous Types/Media)

Content Independent Metadata (creation-date, location, type-of-sensor)

Content Dependent Metadata (size, max colors, rows, columns)

Direct Content Based Metadata (inverted lists, document vectors, WAIS, Glimpse, LSI)

Domain Independent (structural) Metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure)

Domain Specific Metadata: area, population (Census); land-cover, relief (GIS); metadata concept descriptions from ontologies

Ontologies, Classifications, Domain Models

User

Move in this direction to tackle information overload
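The first three classes in this list can be illustrated on an ordinary text file. The sketch below is a Python illustration under stated assumptions: the file's modification time stands in for the slide's "creation-date", line count stands in for "rows", and the word-to-line-number inverted list is a deliberately simple form of direct content-based metadata. The function name `describe` and the result keys are hypothetical.

```python
import os
import time
from collections import defaultdict


def describe(path):
    """Group a few metadata items for a text file by the slide's classes."""
    stat = os.stat(path)
    with open(path, encoding="utf-8") as handle:
        lines = handle.read().splitlines()

    # Direct content-based metadata: inverted list mapping each word to line numbers.
    inverted = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        for word in set(line.lower().split()):
            inverted[word].append(number)

    return {
        # Does not depend on what the file says.
        "content_independent": {"location": os.path.abspath(path),
                                "modified": time.ctime(stat.st_mtime)},
        # Derived from the content, but says nothing about its meaning.
        "content_dependent": {"size_bytes": stat.st_size, "rows": len(lines)},
        "direct_content_based": dict(inverted),
    }
```

Domain-specific metadata, the last step in the slide's direction of travel, cannot be computed this way: it needs an ontology or domain model to say what the words refer to.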

HP 7

Types of Metadata for digital media

Media type-specific metadata, e.g., texture of images, font size...

Media processing-specific metadata, e.g., search, retrieval, personalized filtering

Content-specific metadata, e.g., rocket-related video and documents

HP 8

Metadata for Digital Data

Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al.] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
Topic Change Indices [Chen et al.] | Audio | Direct Content Based
Document Vectors [Deerwester et al.] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al.] | Text | Domain Independent
Contexts [Sciore et al., Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al.] | Structured | Domain Specific
User's Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific

HP 9

Types of Specs and Standards (or MetaModels)

Domain Independent: (MCF), RDF, MOF, Dublin Core

Media Specific: MPEG4, MPEG7, VoiceXML

Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)

Application Specific: ICE (Syndication)

Exchange/Sharing: XCM, XMI

Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)

HP 10

What RDF can do for metadata

Designed to impose structural constraints on syntax to support consistent encoding, exchange, and processing of metadata

Domain-independent metadata standard

HP 11

RDF (Resource Description Framework)

Resource, Property, Value

• RDF data consists of nodes and attached attribute/value pairs

• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata

• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances

HP 12

RDF Example 1

URI:TALK --dc:title--> Mysteries of Metadata

URI:TALK --dc:creator--> URI:AMIT

<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
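To make the resource/property/value model concrete, the description of the talk can be flattened into triples. The Python sketch below uses only the standard library and handles just the simple pattern shown here (an `rdf:Description` with literal or `rdf:resource` properties); it is not a general RDF parser. The namespace URIs in `RDF_XML` are the published RDF and Dublin Core 1.1 namespaces rather than the ones printed on the slide, and the `triples` function is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Return (resource, property, value) tuples from simple rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    found = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # A property value is either another resource or literal text.
            value = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            found.append((subject, prop.tag, value))
    return found
```

`triples(RDF_XML)` yields one tuple per arc of the graph, with each property named by its namespace-qualified tag.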

HP 13

RDF Example 2

[Figure: RDF graph. Node and arc labels: URI:TALK, dc:title "Mysteries of Metadata", dc:creator, URI:AMIT, BIB:Name "Amit Sheth", BIB:Email amit@taalee.com, BIB:Aff, URI:LIB]

HP 14

RDFS (RDF Schema)

Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce...)

Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties

HP 15

RDF Based Web

HTML

Resources

RDF/XML Descriptions

RDF Schemas

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International, inter-discipline, W3C community consensus

"Semantic" interface among resource description communities (very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<xml>
<?namespace href="http://w3.org/rdf-schema" as="RDF"?>
<?namespace href="http://metadata.net/DC" as="DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
      DC:Title="I've Never Metadata I've Never Liked"
      DC:Creator="Mary Crystal"
      DC:Subject="Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content

NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers mostly in the print or wire-service industries

Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE

HP 20

NewsML...

It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device

Accurate, objective set of description tools which help qualify the information and make the search more precise

NewsML allows a range of metadata to be attached to a multi-media story, including a detailed computer-readable description of what an item is about

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata

Version 1.0, April 2001

Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)

Idea: "a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts"

Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML

Designed to be used in a simple, straightforward way over the Internet

Compatible with NewsML

Integrates easily with ICE (for syndication)

Vocabulary: Basic DC; Extensions; "Controlled Vocabularies", e.g., "North American Industry Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

VoiceXML Metadata

Voice-specific metadata

Supports syntactic interoperability

Text data to voice data

VoiceXML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - News, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number find-me services

Intranet - Inventory, HR services, corporate portals

Unification - My Whatever, personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define

roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol

Syndicator: A content aggregator and distributor

Subscriber: A content consumer

Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement

Collection: The current content of a subscription

ICE Package: A delivery of commands to update a collection, such as the addition of content items

ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook, version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)
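The split between ICE's own delivery vocabulary and the domain vocabulary it carries can be shown by walking a payload like the one above. This Python sketch uses a shortened copy of the example (DOCTYPE and image data omitted) and the standard-library XML parser; the function name `split_metadata` and the result layout are hypothetical and are not part of the ICE specification.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the payload shown above; the image data is left out.
PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" name="Cartoon" filename="demo.gif">
        <comic-strip title="Looney City" pubdate="20010515">...</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """Separate ICE delivery attributes from the domain vocabulary inside each item."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        # Child elements of ice-item belong to the domain vocabulary, not to ICE.
        domain = [{"element": child.tag, **child.attrib} for child in item]
        items.append({"delivery": dict(item.attrib), "domain": domain})
    return items
```

The `delivery` part is what a syndication server needs regardless of subject matter; the `domain` part only makes sense to an application that knows the comic-strip vocabulary.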

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

Content Development/Management

Content Delivery

Application Content Management

Content Authoring

Digital Asset Management

Software Configuration Management

Document Process Management

Metadata Management

Recombination

Personalization

Edge Network Delivery

Streaming Media Delivery

Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model

Theme keywords: digital line graph, hydrography, transportation

Title: Dakota Aquifer

Online linkage: http://gisdasc.kgs.ukans.edu/dasc/

Direct Spatial Reference Method: Vector

Horizontal Coordinate System Definition: Universal Transverse Mercator

...

UDK Metadata Model

Search terms: digital line graph, hydrography, transportation

Topic: Dakota Aquifer

Adress Id: http://gisdasc.kgs.ukans.edu/dasc/

Measuring Techniques: Vector

Co-ordinate System: Universal Transverse Mercator

...

Kansas State
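One way to reconcile the two models is to map each model's tag names onto a shared vocabulary. The Python sketch below does this with plain dictionaries. The tag names on the left are taken from the slide as written; the shared field names on the right and the `normalize` function are hypothetical and serve only this illustration.

```python
# Tag names come from the slide; the shared field names on the right are
# hypothetical and only serve as a common vocabulary for this sketch.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Adress Id": "url",
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename model-specific tags to the shared names, keeping unknown tags as-is."""
    return {mapping.get(tag, tag): value for tag, value in record.items()}
```

After normalization, an FGDC record and a UDK record describing the same dataset compare equal field by field, which is the syntactic half of the interoperability problem; agreeing on what the values mean still requires a domain model.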

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services, Taalee Content Applications

Where is the content?

Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate, Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories

Types: Text, Graphics, Audio, Video, Multimedia

Forms: Unstructured text, Semi-structured text, Structured text (+Media), Static or Dynamic

Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange

Feed Handlers

Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
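The layout rule for dates translates naturally into a regular expression. The sketch below is a Python illustration of the syntactic approach, not the extractor used in the workshop; the pattern accepts the comma as optional because extracted text often loses it, and the names `DATE_PATTERN` and `extract_date` are hypothetical.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"

# Date => day month int ',' int, with the comma treated as optional.
DATE_PATTERN = re.compile(
    rf"\b(?P<day>{DAYS})\s+(?P<month>{MONTHS})\s+(?P<date>\d{{1,2}})"
    rf"(?:\s*,\s*|\s+)(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date found as a dict, or None when no date matches."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    return {"day": fields["day"], "month": fields["month"],
            "date": int(fields["date"]), "year": int(fields["year"])}
```

Applied to the report header, `extract_date` picks out the weekday, month, day of month, and year as separate metadata fields without any understanding of what the report is about.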

Traditional Text Categorization

[Figure labels: Customer Article Feed, article 4715; Customer Training Set; Statistical/AI Techniques; Classify/Place in a taxonomy; Routing/Distribution]

Standard Metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Classification of Article 4715

Taalee's Categorization & Automatic Metadata Creation

[Figure labels: Article Feed, article 4715; Taalee Training Set; Customer Training Set; Knowledge-base & Statistical/AI Techniques; Classify/Place in a taxonomy; Metadata Catalog; Content Manager; Routing/Distribution; Precise syndication/filtering; Map to another taxonomy; Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite]

Article 4715 Metadata

Standard metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Semantic metadata

Company Name: France Telecom, Equant

Ticker Symbol: FTE, ENT

Exchange: NYSE

Topic: Company News

Classification of Article 4715

FTE, ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs
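The step from standard to semantic metadata can be sketched as a lookup against a knowledge base. The Python example below is a deliberately naive illustration: it finds company names by substring match and attaches the ticker and exchange values shown for article 4715. The `KNOWLEDGE_BASE` structure and the `enrich` function are hypothetical and are not Taalee's patented extraction method.

```python
# Hypothetical knowledge base: company name -> facts used to enrich an article.
# The ticker and exchange values are the ones shown for article 4715.
KNOWLEDGE_BASE = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}


def enrich(article_text, standard_metadata):
    """Add semantic metadata for every known company mentioned in the text."""
    lowered = article_text.lower()
    companies = [name for name in KNOWLEDGE_BASE if name.lower() in lowered]
    semantic = {
        "company_names": companies,
        "ticker_symbols": [KNOWLEDGE_BASE[name]["ticker"] for name in companies],
        "exchanges": sorted({KNOWLEDGE_BASE[name]["exchange"] for name in companies}),
    }
    return {"standard": dict(standard_metadata), "semantic": semantic}
```

The standard metadata (feed source, posted date) passes through unchanged, while the semantic part grows with whatever the knowledge base knows about the entities found in the text.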

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference: (Statistical) (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM...); Logic Based (Description Logic); Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods, Cluster Analysis

Learning/AI and Collab. Filtering

Reference data, Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain

Word or Phrase

Ontologies, Domain Models

KnowledgeBase; by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP) Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes

Capture the semantics involved via domain characteristics

Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: interrelated ontologies. Node labels: LANDUSE, COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL, LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE, WASTE DISPOSAL, RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE, shredding, magnetic separation, screening, washing, NATURAL DISASTER, EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI. Four edges are labeled "causes".]

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords

attribute-based queries

content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based (?); the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction

Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing

• Context of the Content

• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create

Context of the Content

Semantic Metadata describing entities (i.e., Company, Industry, etc.) and

Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage: Search on "football touchdown"

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional

Teams: Ravens, Steelers

Score: Bal 31, Pit 24

Players: Quandry Ismail, Tony Banks

Event: Touchdown

Produced by: NFL.com

Posted date: 2/02/2000

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context- and domain-specific attributes. Uniform metadata for content from multiple sources. Can be sorted by any field.

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content: personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries amp Hot Topics

PERSONALIZATION

3 Julia Roberts Collection

Movie Trailer Notting Hill

Trailer - Runaway Bride

Patrick

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sun…

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1 My Stock Portfolio

Microsoft suffers serious hack attack

Cisco Systems Inc

Analyst Safa Rashtchy on Yahoo

PeopleSoft Inc

AT&T Corp

2 My Football Fantasy Team

Gators Spurrier ready for big game

Techs Vick looks to become complete QB

Bucs excited about Hamilton

Jasper Sanks rumbles into the end zone…

Edwards explains reasons for leaving BYU

more…

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Coles security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II

more…

HOT Topics

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests, Preferences

Time-Shifted Content Aggregator

Web Sites and Pages

Content Databases

Semantic Engine™

Personalized Content

Structured Hi-Quality Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

The user has already completed Web-based registration and personalization at Voquette's enterprise customer site

The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic, personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints; e.g., an Analyst Call listing focuses on the source and the analyst's name or company. The icons denote additional metadata, such as a "Strong Buy" by the H&Q analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS, DISH, Wink, AOL-TV)

Semantic Engine™

Metadata-Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection

Processing

Network Content

Affiliate Feeds

Public Sources Rich Data

Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEG Decoder

Node = AVO Object

Create Scene Description Tree

GREAT USER EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Enhanced XML Description

"Cisco Systems"

Node

Taalee Semantic Engine

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-rich, Value-added Node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for

Extractor Agents

Content which does not contain the words the user asked for, but is about what he asked for

Value-added Metadata

Content the user did not think to ask for, but which he needs to know

Semantic Associations

+ +

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content:

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata that describes the content more completely.
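The enrichment step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Taalee's implementation: the `KNOWLEDGE_BASE` dictionary, its field names, and the `enrich` and `matches` functions are all hypothetical stand-ins for a knowledgebase of semantic associations and an attribute search.

```python
# Hypothetical knowledgebase: entity -> facts known through semantic associations.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
}


def enrich(extracted):
    """Return a copy of the extracted metadata with value-added fields.

    `extracted` holds only what appears in the content (here, player names).
    Team and sport come from the knowledgebase, not from the text itself.
    """
    metadata = {key: list(values) for key, values in extracted.items()}
    for player in extracted.get("players", []):
        facts = KNOWLEDGE_BASE.get(player)
        if facts is None:
            continue  # unknown entity: nothing to add
        for field, source_key in (("teams", "team"), ("sports", "sport")):
            value = facts[source_key]
            if value not in metadata.setdefault(field, []):
                metadata[field].append(value)
    return metadata


def matches(metadata, query):
    """Attribute search: True if the query equals any metadata value."""
    return any(query in values for values in metadata.values())


story = {"players": ["Roger Clemens"]}  # the story text never says "Yankees"
enriched = enrich(story)
print(matches(story, "New York Yankees"))     # False
print(matches(enriched, "New York Yankees"))  # True
```

The original metadata is copied rather than modified, so extracted and value-added fields can still be told apart by comparing the two dictionaries.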

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story, and locate the story with the heading "Week 3 top 10 Anderson TD run"

• Verify that neither Team = Atlanta Falcons nor League = NFL was present in the source content

• Taalee attached this value-added metadata to the asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield

• Click on the first result (titled "I want out") and view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story, and locate the story with the heading "I want out"

• Verify that neither Team = Los Angeles Dodgers nor League = National League was present in the source content

• Taalee attached this value-added metadata to the asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on the first result for Jamal Anderson

View the metadata. Note that the Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of the Team name or League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on the first result for Gary Sheffield

View the metadata. Note that the Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of the Team name or League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of the content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of the team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships
League Name: name of the league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations
Sport: name of the sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
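A small Python sketch may help show how a "competes with" association turns a search for one company into links to news on its competitors. Everything here is assumed for illustration: the `ASSOCIATIONS` table, the `STORIES` catalog, the story titles, and the function names are hypothetical, and the real directory's data model is not described in these slides.

```python
# Hypothetical association store: (subject, relation) -> related entities.
ASSOCIATIONS = {
    ("Commerce One", "competes_with"): ["Ariba"],
    ("Cisco", "competes_with"): ["Lucent"],
}

# Hypothetical catalog of stories, each already tagged with a company.
STORIES = [
    {"title": "Story on Commerce One", "company": "Commerce One"},
    {"title": "Story on Ariba", "company": "Ariba"},
    {"title": "Story on Cisco", "company": "Cisco"},
    {"title": "Story on Lucent", "company": "Lucent"},
]


def related_stories(company, relation="competes_with"):
    """Return stories about entities linked to `company` by `relation`."""
    related = set(ASSOCIATIONS.get((company, relation), []))
    return [story for story in STORIES if story["company"] in related]


def intelligent_content(company):
    """Combine what the user asked for with semantically related stories."""
    asked_for = [story for story in STORIES if story["company"] == company]
    return {"asked_for": asked_for, "related": related_stories(company)}


result = intelligent_content("Commerce One")
print([story["title"] for story in result["asked_for"]])  # ['Story on Commerce One']
print([story["title"] for story in result["related"]])    # ['Story on Ariba']
```

A keyword index alone could not produce the second list, because the Ariba story need not mention Commerce One at all; the link comes from the association table.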

HP 104

Internal Source 1: Research

Internal Source 2

External feeds/Web (e.g., Reuters)

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults KnowledgeBase for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to the Cisco story - passed on to Dashboard

Story on Lucent

Story on Cisco

XCM-compliant metadata, XML, or other format

Semantic Application

ASP/Enterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata-centric Content Management Architecture

Semantic Engine

World Model

Taalee Metabase

Third-party Content Mgmt and Syndication

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked For + What You Need to Know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in the same or a related INDUSTRY

SEC, EPA

Regulations: impacting INDUSTRY or filed by COMPANY

Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic search, personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., the InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition. Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-Layered Architecture of the Semantic Web

Logical Layer: formal semantics and reasoning support - OIL, DAML-O

Schema Layer: definition of vocabulary - RDF Schema

Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards the Semantic Web

OIL mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
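The rule above can be tried out as a small Python predicate. This is a sketch under several assumptions that the slide does not state: `dateDifference` is read as the number of days from the test to the earthquake (the earthquake must come after the test), `distance` is taken to be great-circle distance in miles computed with the haversine formula, and the dictionary keys and sample events are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; the unit (miles) is an assumption


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance (haversine formula) between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 so rounding error near antipodal points cannot break asin.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule: quake within max_days after the test and within max_miles."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days_after < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Hypothetical sample events, for illustration only.
nuclear_test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
nearby_quake = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
earlier_quake = {"eventDate": date(1969, 12, 25), "latitude": 36.0, "longitude": -117.0}

print(may_have_caused(nuclear_test, nearby_quake))   # True
print(may_have_caused(nuclear_test, earlier_quake))  # False
```

Keeping the thresholds as parameters makes it easy to see how sensitive the set of matching pairs is to the 30-day and 10000-mile limits.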

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and > 9 on the Richter scale, find the average number of earthquakes per year, starting from 1900

The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

HP 6

A metadata classification

Data (heterogeneous types/media)

Content Independent Metadata (creation-date, location, type-of-sensor)

Content Dependent Metadata (size, max colors, rows, columns)

Direct Content Based Metadata (inverted lists, document vectors, WAIS, Glimpse, LSI)

Domain Independent (structural) Metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure)

Domain Specific Metadata: area, population (Census); land-cover, relief (GIS); metadata concept descriptions from ontologies

Ontologies, Classifications, Domain Models

User

Move in this direction to tackle information overload

HP 7

Types of Metadata for Digital Media

Media type-specific metadata, e.g., texture of images, font size, …

Media processing-specific metadata, e.g., search, retrieval, personalized filtering

Content-specific metadata, e.g., rocket-related video and documents

HP 8

Metadata for Digital Data

Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al.] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
Topic Change Indices [Chen et al.] | Audio | Direct Content Based
Document Vectors [Deerwester et al.] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al.] | Text | Domain Independent
Contexts [Sciore et al., Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al.] | Structured | Domain Specific
User's Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific

HP 9

Types of Specs and Standards (or MetaModels)

Domain Independent: (MCF), RDF, MOF, Dublin Core

Media Specific: MPEG4, MPEG7, VoiceXML

Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)

Application Specific: ICE (Syndication)

Exchange/Sharing: XCM, XMI

Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)

HP 10

What RDF can do for metadata

Designed to impose structural constraints on syntax to support consistent encoding, exchange, and processing of metadata

Domain-independent metadata standard

HP 11

RDF (Resource Description Framework)

Resource, Property, Value

• RDF data consists of nodes and attached attribute/value pairs

• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata

• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances
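The node and attribute/value model above can be mimicked with plain Python tuples. This is only an illustrative sketch, not an RDF library: the `Triple` type, the helper functions, and the `bib:name` property are assumptions, with values borrowed from the talk's own example (`URI:TALK` and `URI:AMIT` are placeholder identifiers).

```python
from collections import namedtuple

# One statement: a resource (node), a named property, and a value.
Triple = namedtuple("Triple", "resource property value")

graph = [
    Triple("URI:TALK", "dc:title", "Mysteries of Metadata"),  # atomic value
    Triple("URI:TALK", "dc:creator", "URI:AMIT"),             # value is another resource
    Triple("URI:AMIT", "bib:name", "Amit Sheth"),             # hypothetical property name
]


def properties_of(graph, resource):
    """Collect the attribute/value pairs attached to one node."""
    return {t.property: t.value for t in graph if t.resource == resource}


def follow(graph, resource, *path):
    """Follow a chain of properties, treating each value as the next node."""
    node = resource
    for prop in path:
        node = properties_of(graph, node)[prop]
    return node


print(properties_of(graph, "URI:TALK")["dc:title"])         # Mysteries of Metadata
print(follow(graph, "URI:TALK", "dc:creator", "bib:name"))  # Amit Sheth
```

Because a value can itself be a resource, statements chain together into a graph, which is what lets a query move from the talk to its creator and on to the creator's name.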

HP 12

RDF Example 1

URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT

<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>

HP 13

RDF Example 2

URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT
URI:AMIT --BIB:Name--> "Amit Sheth"
URI:AMIT --BIB:Email--> amit@taalee.com
URI:AMIT --BIB:Aff--> URI:LIB

HP 14

RDFS (RDF Schema)

Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, …)

Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties

HP 15

RDF Based Web

HTML

Resources

RDF/XML Descriptions

RDF Schemas

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International, inter-discipline, W3C community consensus

"Semantic" interface among resource description communities (a very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<xml>
<namespace href="http://w3.org/rdf-schema" as="RDF"/>
<namespace href="http://metadata.net/DC" as="DC"/>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
    DC:Title="I've Never Metadata I've Never Liked"
    DC:Creator="Mary Crystal"
    DC:Subject="Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

• NewsML is a packaging and metadata format for news content
• NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries
• Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE

HP 20

NewsML…

• It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device
• It offers an accurate, objective set of description tools which help qualify the information and make search more precise
• NewsML allows a range of metadata to be attached to a multimedia story, including a detailed, computer-readable description of what an item is about

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
    Basic DC
    Extensions
    "Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
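Because PRISM reuses Dublin Core inside RDF, a consumer can read such a record with a generic XML parser. The sketch below, written under the assumption that the record looks like a trimmed copy of the example above, collects the `dc:` elements of each described resource into a dictionary. The helper name `dc_fields` is hypothetical and not part of PRISM.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Trimmed copy of the slide's example (the unused prism namespace is omitted).
PRISM_RECORD = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dc_fields(rdf_xml):
    """Map each described resource URL to its Dublin Core fields."""
    root = ET.fromstring(rdf_xml)
    dc_prefix = f"{{{NS['dc']}}}"
    resource_attr = f"{{{NS['rdf']}}}resource"
    records = {}
    for description in root.findall("rdf:Description", NS):
        about = description.get(f"{{{NS['rdf']}}}about")
        fields = {}
        for child in description:
            if not child.tag.startswith(dc_prefix):
                continue  # skip non-DC vocabulary
            name = child.tag[len(dc_prefix):]
            # A value is either a resource reference or literal text.
            fields[name] = child.get(resource_attr) or (child.text or "").strip()
        records[about] = fields
    return records


records = dc_fields(PRISM_RECORD)
print(records)
```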

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval: news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number "find-me" services

Intranet: inventory, HR services, corporate portals

Unification: "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (e.g., yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible content syndication protocol for the Internet using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define:

    roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

    format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE:
    establishes and manages the syndication
    delivers data
    logs events
    => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)
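The split between content-independent ICE metadata and domain-specific metadata can be made concrete by walking a payload and separating the two. This is only a sketch: the payload string is a shortened copy of the example above (the DOCTYPE is dropped and the encoded image is replaced by a placeholder), and `split_metadata` is a hypothetical helper, not part of any ICE toolkit.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide's payload; "PdXIWZQ8" stands in for the image data.
PAYLOAD = """
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">PdXIWZQ8</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
"""


def split_metadata(payload_xml):
    """Return (ice_metadata, domain_metadata) pairs, one per ice-item."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        # Attributes of ice-item belong to the ICE vocabulary (delivery envelope).
        ice_metadata = dict(item.attrib)
        # Child elements come from the industry-specific vocabulary.
        domain_metadata = [{"element": child.tag, **child.attrib} for child in item]
        items.append((ice_metadata, domain_metadata))
    return items


items = split_metadata(PAYLOAD)
for ice_metadata, domain_metadata in items:
    print("ICE:", ice_metadata)
    print("Domain:", domain_metadata)
```

A subscriber that only understands ICE can still log and route the item from the first dictionary, while an application that knows the comic-strip vocabulary can use the second.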

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management (diagram)

Segments: Content Development/Management; Application Content Management; Content Delivery

Components shown: Content Authoring; Digital Asset Management; Software Configuration Management; Document Process Management; Metadata Management; Recombination; Personalization; Edge Network Delivery; Streaming Media Delivery; Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (Kansas State)

FGDC Metadata Model
    Theme keywords: digital line graph, hydrography, transportation
    Title: Dakota Aquifer
    Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
    Direct Spatial Reference Method: Vector
    Horizontal Coordinate System Definition: Universal Transverse Mercator
    …

UDK Metadata Model
    Search terms: digital line graph, hydrography, transportation
    Topic: Dakota Aquifer
    Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
    Measuring Techniques: Vector
    Co-ordinate System: Universal Transverse Mercator
    …
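One simple way to reconcile the two models is a per-model mapping from tag names to a shared vocabulary. The sketch below illustrates this syntactic mapping only; the common field names (`keywords`, `title`, and so on) and the `normalize` helper are assumptions made for the example, and the tag names are taken from the slide.

```python
# Hypothetical mapping from each model's tag names to shared field names.
FIELD_MAP = {
    "FGDC": {
        "Theme keywords": "keywords",
        "Title": "title",
        "Online linkage": "url",
        "Direct Spatial Reference Method": "spatial_method",
        "Horizontal Coordinate System Definition": "coordinate_system",
    },
    "UDK": {
        "Search terms": "keywords",
        "Topic": "title",
        "Adress Id": "url",
        "Measuring Techniques": "spatial_method",
        "Co-ordinate System": "coordinate_system",
    },
}


def normalize(model, record):
    """Rename a record's tags to the shared vocabulary; unknown tags are dropped."""
    mapping = FIELD_MAP[model]
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": "digital line graph, hydrography, transportation",
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": "digital line graph, hydrography, transportation",
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

same = normalize("FGDC", fgdc_record) == normalize("UDK", udk_record)
print(same)
```

A table like this resolves naming differences only. Differences in what a field means or how its values are coded need a domain model or ontology, which is the subject of later slides.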

HP 39

Different views of Metadata

Domain-Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Application Specific: ICE

Media Specific: MPEG-7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services; Taalee Content Applications

Where is the content? Whose is it?

Produce/Aggregate; Catalog/Index

What other content is it related to?

Integrate; Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction

• Types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web sites, content feeds, and private repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, semi-structured text, structured text (+ media); static or dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Content sources shown: Digital Maps; Nexis, UPI, AP; Documents; Digital Audios; Data Stores; Digital Videos; Digital Images

HP 46

Extracting a Text Document: Syntactic Approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
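A layout rule such as the Date rule above translates naturally into a regular expression. The following is a minimal sketch of that syntactic approach, not the extractor used in the workshop; the names `DATE_LAYOUT` and `extract_dates` are hypothetical, and the commas are treated as optional so the pattern also matches text whose punctuation has been lost.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Date => day month int ',' int
DATE_LAYOUT = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<day_of_month>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_dates(text):
    """Return one dict per date that matches the layout rule."""
    return [match.groupdict() for match in DATE_LAYOUT.finditer(text)]


report_header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
dates = extract_dates(report_header)
print(dates)
```

The pattern knows nothing about what a date means; it only recognizes the surface layout, which is why this style of extraction is called syntactic.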

Traditional Text Categorization (diagram)

Elements shown: Customer Article Feed (Article 4715); Statistical/AI Techniques; Customer Training Set; Classify: place in a taxonomy; Routing/Distribution

Standard Metadata
    Feed Source: iSyndicate
    Posted Date: 11/20/2000

Classification of Article 4715

Taalee's Categorization & Automatic Metadata Creation (diagram)

Elements shown: Article Feed (Article 4715); Knowledge-base & Statistical/AI Techniques; Taalee Training Set; Customer Training Set; Classify: place in a taxonomy; Map to another taxonomy; Metadata Catalog; Content Manager; Precise syndication/filtering; Routing/Distribution; Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite

Article 4715 Metadata
    Standard metadata
        Feed Source: iSyndicate
        Posted Date: 11/20/2000
    Semantic metadata
        Company Name: France Telecom, Equant
        Ticker Symbol: FTE, ENT
        Exchange: NYSE
        Topic: Company News

Classification of Article 4715
    FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
    ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
    NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities
    Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships
    Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
    (Statistical) Information Retrieval
    Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
    Logic Based (Description Logic)
    Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis, Learning/AI, and Collaborative Filtering

Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain; word or phrase

Ontologies/Domain Models

KnowledgeBase: by entities and relationships

(The alternatives are ordered toward deeper understanding.)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize the meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
    Attributes
    Domain Rules
    Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies (diagram)

Node labels, in the order they appear:

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE, shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI

Edge label: "causes" (appears four times)

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

Traditional queries based on keywords
Attribute-based queries
Content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the content

• Semantic metadata describing entities (e.g., Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the content
Semantic metadata describing entities (e.g., Company, Industry, etc.), and
Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from typical cataloging of football assets

Virage search on "football touchdown":

    Jimmy Smith Interview Part Seven
    Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline...

    Brian Griese Interview Part Four
    Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline...

Taalee metadata on football assets (Rich Media Reference Page)

    Baltimore 31, Pit 24
    http://www.nfl.com
    Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

    League: Professional
    Teams: Ravens, Steelers
    Score: Bal 31, Pit 24
    Players: Quandry Ismail, Tony Banks
    Event: Touchdown
    Produced by: NFL.com
    Posted date: 2/02/2000
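The difference between the two kinds of search can be sketched in a few lines. In this illustration the asset texts and the Taalee-style attributes come from the slide; the empty metadata on the two interview assets, and the helper names `keyword_search` and `attribute_search`, are assumptions made for the example.

```python
ASSETS = [
    {
        "title": "Jimmy Smith Interview Part Seven",
        "description": "Jimmy Smith explains his philosophy on showboating",
        "metadata": {},  # assumption: no structured attributes catalogued
    },
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # assumption: no structured attributes catalogued
    },
    {
        "title": "Baltimore 31, Pit 24",
        "description": (
            "Quandry Ismail and Tony Banks hook up for their third long touchdown, "
            "this time on a 76-yarder, to extend the Ravens' lead to 31-24"
        ),
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
            "Produced by": "NFL.com",
        },
    },
]


def keyword_search(assets, *words):
    """Return titles whose title or description contains every word."""
    hits = []
    for asset in assets:
        text = f"{asset['title']} {asset['description']}".lower()
        if all(word.lower() in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, criteria):
    """Return titles whose metadata satisfies every field/value criterion."""
    hits = []
    for asset in assets:
        satisfied = True
        for field, wanted in criteria.items():
            value = asset["metadata"].get(field, [])
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                satisfied = False
                break
        if satisfied:
            hits.append(asset["title"])
    return hits


print(keyword_search(ASSETS, "touchdown"))
print(keyword_search(ASSETS, "Steelers"))
print(attribute_search(ASSETS, {"Teams": "Steelers", "Event": "Touchdown"}))
```

The keyword search for "Steelers" finds nothing because the word never appears in the text, while the attribute search finds the asset through its Teams field and also excludes the interview that merely mentions a touchdown.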

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and domain-specific attributes
Uniform metadata for content from multiple sources
Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content
Personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
    Microsoft suffers serious hack attack
    Cisco Systems Inc
    Analyst Safa Rashtchy on Yahoo
    PeopleSoft Inc
    AT&T Corp
    more…

2. My Football Fantasy Team
    Gators' Spurrier ready for big game
    Tech's Vick looks to become complete QB
    Bucs excited about Hamilton
    Jasper Sanks rumbles into the end zone…
    Edwards explains reasons for leaving BYU
    more…

3. Julia Roberts Collection
    Movie Trailer: Notting Hill
    Trailer - Runaway Bride
    Patrick
    Movie Trailer: Stepmom
    Conspiracy Theory
    more…

4. Pink Floyd Collection
    Set the Controls for the Heart of the Sun…
    Wish You Were Here
    Round And Around
    Keep Talking
    The Post War Dream
    more…

HOT Topics

1. Election 2000
    Video: Explaining the electoral map
    Race for White House hots up
    Seniors Give Gore Florida Edge
    more…

2. Middle East Peace Conflict
    More die as Israel steps up security
    Israel braces for suicide bombs
    Pentagon probes Cole's security
    more…

3. Napster Controversy
    The Brain Behind Napster
    Napster Lawsuit
    Creative Nomad II
    more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Diagram elements: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests and Preferences; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Content; Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless home page with the categories MyStocks, News, Sports, Music, MyMedia]

The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: categories MyStocks, News, Sports, Music, MyMedia; My Stocks list showing CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: My Stocks list (CSCO, NT, IBM, Market); CSCO menu showing Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: CSCO Analysis listing: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints; e.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.

HP 87

iTV: Taalee's Extreme Personalization

Diagram elements: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Metadata-Tagged Content; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Diagram elements:

Sources: Network Content; Affiliate Feeds; Public Sources
Collection and Processing: Categorize, Catalog, Integrate
Rich Data Metabase
Production Support (Sony): Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Video: Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

Clips (duration - date - source):
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure (diagram)

Delivery elements: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE

Enrichment elements: Node ("Cisco Systems"); Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node

Example description:
    Produced by: Fox Sports
    Creation Date: 12/05/2000
    League: NFL
    Teams: Seattle Seahawks, Atlanta Falcons
    Players: John Kitna
    Coaches: Mike Holmgren, Dan Reeves
    Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
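The Roger Clemens example can be sketched as a lookup against a small knowledge base. This is a toy illustration of the idea only, not Taalee's patented extraction technology: the `KNOWLEDGE_BASE` structure, the `enrich` and `search` helpers, and the sample story text are all assumptions made for the example.

```python
# Hypothetical knowledge base: entity name -> facts about the entity.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
}


def enrich(text, knowledge_base=KNOWLEDGE_BASE):
    """Extract known entities from text and add related (value-added) metadata."""
    metadata = {"people": [], "teams": [], "sports": []}
    for name, facts in knowledge_base.items():
        if name in text:
            metadata["people"].append(name)  # extracted: present in the text
            if facts["team"] not in metadata["teams"]:
                metadata["teams"].append(facts["team"])  # value-added
            if facts["sport"] not in metadata["sports"]:
                metadata["sports"].append(facts["sport"])  # value-added
    return metadata


def search(stories, term):
    """Find stories whose text or enriched metadata mentions the term."""
    term = term.lower()
    hits = []
    for story in stories:
        metadata = enrich(story)
        searchable = [story] + metadata["people"] + metadata["teams"] + metadata["sports"]
        if any(term in value.lower() for value in searchable):
            hits.append(story)
    return hits


story = "Roger Clemens strikes out ten in a complete-game win"
print(enrich(story))
print(search([story], "Yankees"))
```

The story never mentions the Yankees, yet a search for "Yankees" returns it because the team name was added from the knowledge base. The quality of such enrichment depends entirely on how current and complete the knowledge base is.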

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top 10: Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

    Produced by: NFL.com
    Posted Date: 9/20/2000
    League: NFL
    Teams: Atlanta Falcons
    Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10: Anderson TD run"

• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

    Produced by: ESPN
    Posted Date: 3/03/2001
    League: National League
    Teams: Los Angeles Dodgers
    Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on the first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of the content provider that produced the asset

Posted Date: date of asset posting. Extracted automatically

Player Names: names of players mentioned explicitly in the asset. Extracted automatically

Sport: name of sport

Team Name: name of the team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee's semantic relationships

League Name: name of the league to which the player's team belongs. Not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
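The value-added step described on this slide can be sketched as a lookup over a small knowledge base. This is only an illustration and not Taalee's implementation: the tables `PLAYER_TEAM` and `TEAM_LEAGUE` and the function `enrich` are hypothetical names, and the only facts included are those from the two guided demos.

```python
# Hypothetical knowledge base holding only the facts from the two demos.
PLAYER_TEAM = {
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
TEAM_LEAGUE = {
    "Atlanta Falcons": "NFL",
    "Los Angeles Dodgers": "National League",
}


def enrich(extracted):
    """Return a copy of the extracted metadata with teams and leagues added.

    `extracted` holds what was found in the asset itself (for example the
    player names); teams and leagues are inferred from the knowledge base.
    """
    enriched = dict(extracted)
    teams = []
    for player in extracted.get("Players", []):
        team = PLAYER_TEAM.get(player)
        if team and team not in teams:
            teams.append(team)
    leagues = []
    for team in teams:
        league = TEAM_LEAGUE.get(team)
        if league and league not in leagues:
            leagues.append(league)
    enriched["Teams"] = teams
    enriched["League"] = leagues
    return enriched


asset = {"Produced by": "ESPN", "Players": ["Gary Sheffield"]}
print(enrich(asset))
```

A search for a team name can then match the asset even though the team never appears in the source content, because the match runs against the enriched record.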

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to understand and associate all content it catalogs

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
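One way to picture a semantic association is as a typed relationship that a search can follow. The sketch below is an assumption-laden illustration, not the Semantic Engine itself: the triple layout, the relationship names, the `STORIES` catalog, and the functions `related` and `search` are all hypothetical, and only the Commerce One facts named on the slides are used.

```python
# Typed facts as (subject, relationship, object) triples. Only facts named
# on the slides are included; the representation is an assumption.
FACTS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog: story title -> companies the story is tagged with.
STORIES = {
    "Story on Commerce One": ["Commerce One"],
    "Story on Ariba": ["Ariba"],
}


def related(entity, relationship):
    """Entities linked to `entity` by `relationship`, in either direction."""
    found = []
    for subj, rel, obj in FACTS:
        if rel != relationship:
            continue
        if subj == entity:
            found.append(obj)
        elif obj == entity:
            found.append(subj)
    return found


def search(company):
    """Return stories about the company plus stories about its competitors."""
    competitors = related(company, "competes_with")
    asked_for = [t for t, tags in STORIES.items() if company in tags]
    associated = [t for t, tags in STORIES.items()
                  if any(c in tags for c in competitors)]
    return {"asked_for": asked_for, "related": associated}


print(search("Commerce One"))
```

A keyword index alone could only fill the `asked_for` list; the `related` list depends on the stored relationship between the two companies.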

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters)

Components: Extractor Agents 1, 2, and 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard

Result: Story on Cisco, Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF+XML

Agent readable Tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
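The rule above can be sketched in Python. This is a minimal illustration under stated assumptions: the slide does not say how `distance` is computed, so a great-circle (haversine) distance in miles is assumed here, the quake is assumed to follow the test by 0 to 29 days, and the two sample events are invented for illustration.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius (assumed constant)


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp against rounding error before taking the arcsine.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule: the quake follows the test within both limits."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if days < 0 or days >= max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Hypothetical events, invented for illustration only.
test = {"eventDate": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
quake = {"eventDate": date(2000, 1, 11), "latitude": 0.0, "longitude": 90.0}
print(could_have_caused(test, quake))
```

Applying `could_have_caused` to every (test, earthquake) pair yields candidate pairs only; the rule expresses a possible relationship, not evidence of causation.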

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
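The grouping query on this slide can be sketched as follows. The function names and the sample data are hypothetical, and the treatment of range boundaries (lower bound inclusive, upper bound exclusive, so exactly 9.0 falls in the top group) is an assumption the slide does not settle.

```python
from collections import Counter

# Magnitude ranges from the slide as (label, low, high); low is inclusive
# and high is exclusive, which is an assumption of this sketch.
RANGES = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def magnitude_range(magnitude):
    """Return the label of the range containing `magnitude`, or None."""
    for label, low, high in RANGES:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range.

    `quakes` is an iterable of (year, magnitude) pairs.
    """
    totals = Counter()
    for year, magnitude in quakes:
        label = magnitude_range(magnitude)
        if label is not None and first_year <= year <= last_year:
            totals[label] += 1
    n_years = last_year - first_year + 1
    return {label: totals[label] / n_years for label, _, _ in RANGES}


# Invented sample data, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1), (1899, 8.0)]
print(average_per_year(sample, 1900, 1901))
```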

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles of the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 5

Semantics

Meaning Understanding

Facts Context Reasoning

Related to exchange usage application

HP 6

A metadata classification

Data (Heterogeneous Types/Media)

Content Independent Metadata (creation-date, location, type-of-sensor)

Content Dependent Metadata (size, max colors, rows, columns)

Direct Content Based Metadata (inverted lists, document vectors, WAIS, Glimpse, LSI)

Domain Independent (structural) Metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure)

Domain Specific Metadata: area, population (Census); land-cover, relief (GIS); metadata concept descriptions from ontologies

Ontologies, Classifications, Domain Models

User

Move in this direction to tackle information overload

HP 7

Types of Metadata for digital media

Media type-specific metadata, e.g. texture of images, font size, …

Media processing-specific metadata, e.g. search, retrieval, personalized filtering

Content-specific metadata, e.g. rocket-related video and documents

HP 8

Metadata for Digital Data

Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al.] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
Topic Change Indices [Chen et al.] | Audio | Direct Content Based
Document Vectors [Deerwester et al.] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al.] | Text | Domain Independent
Contexts [Sciore et al., Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al.] | Structured | Domain Specific
User's Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific

HP 9

Types of Specs and Standards (or MetaModels)

Domain Independent: (MCF), RDF, MOF, Dublin Core

Media Specific: MPEG4, MPEG7, VoiceXML

Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)

Application Specific: ICE (Syndication)

Exchange/Sharing: XCM, XMI

Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)

HP 10

What RDF can do for metadata

Designed to impose structural constraints on syntax to support consistent encoding, exchange, and processing of metadata

Domain Independent Metadata standard

HP 11

RDF (Resource Description Framework)

Resource, Property, Value

• RDF data consists of nodes and attached attribute/value pairs

• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata

• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances

HP 12

RDF Example 1

URI:TALK -- dc:title --> Mysteries of Metadata
URI:TALK -- dc:creator --> URI:AMIT

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
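As a rough sketch, the same description can be generated with Python's standard `xml.etree.ElementTree` module. The helper name `describe` is hypothetical, and the namespace URIs below are the ones in common use today (the RDF syntax namespace and Dublin Core 1.1), which differ from the older URIs shown on the slide.

```python
import xml.etree.ElementTree as ET

# Namespace URIs in common use today; the slide shows older ones.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe(resource, title, creator):
    """Build an rdf:RDF element describing one resource (hypothetical helper)."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF_NS}}}Description",
                         {f"{{{RDF_NS}}}about": resource})
    # dc:title carries an atomic (string) value.
    ET.SubElement(desc, f"{{{DC_NS}}}title").text = title
    # dc:creator points at another resource rather than a string.
    ET.SubElement(desc, f"{{{DC_NS}}}creator",
                  {f"{{{RDF_NS}}}resource": creator})
    return root


xml_text = ET.tostring(
    describe("URI:TALK", "Mysteries of Metadata", "URI:AMIT"),
    encoding="unicode",
)
print(xml_text)
```

The title is stored as a literal, while the creator is a reference to a second node, which is what lets further statements (name, e-mail, affiliation) be attached to that node.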

HP 13

RDF Example 2

URI:TALK -- dc:title --> Mysteries of Metadata
URI:TALK -- dc:creator --> URI:AMIT
URI:AMIT -- BIB:Name --> Amit Sheth
URI:AMIT -- BIB:Email --> amit@taalee.com
URI:AMIT -- BIB:Aff --> URI:LIB

HP 14

RDFS (RDF Schema)

Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, …)

Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties

HP 15

RDF Based Web

HTML

Resources

RDFXMLDescriptions

RDFSchemas

Sourcehttpwwww3crlacuk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International inter-discipline W3C community consensus

ldquoSemanticrdquo interface among resource description communities (very limited form of semantics)

Sourcewwwdesireorg

HP 17

Dublin Core RDF

<xml>
<?namespace href="http://w3.org/rdf-schema" as="RDF"?>
<?namespace href="http://metadata.net/DC" as="DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
      DC:Title="I've Never Metadata I've Never Liked"
      DC:Creator="Mary Crystal"
      DC:Subject="Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content

NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries

Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE

HP 20

NewsML…

It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device

Accurate, objective set of description tools which help qualify the information and make the search more precise

NewsML allows a range of metadata to be attached to a multi-media story, including a detailed computer-readable description of what an item is about

HP 21

Example of the end-to-end flow -NewsML

The content providersupplies NewsML packaged media content to the operator The content is categorized as current events finance sport etc and updated hourly

The operator receives NewsML data from the content provider The content server automatically pushes updated news articles to all news service subscribers

Consumers sign up for the news service directly on the device When using the news service the user browses through the categories and reads the news articles The news articles are presented in a continuous flow (one after the other) without end-user interaction

Sourcehttpwwwmediabrickscom

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata

Version 1.0, April 2001

Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)

Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"

Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML

Designed to be used in a simple, straightforward way over the Internet

Compatible with NewsML

Integrates easily with ICE (for syndication)

Vocabulary:

Basic: DC

Extensions: "Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v 1, http://www.prismstandard.org/techdev/prismspec1.asp)
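A consumer of such a record might pull out the Dublin Core fields as sketched below, using Python's standard `xml.etree.ElementTree`. The function name `dublin_core` is hypothetical, and the embedded record is a trimmed, well-formed rendering of the example above (the punctuation in the namespace URIs is assumed).

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Trimmed rendering of the PRISM example; URI punctuation is assumed.
RECORD = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core(xml_text):
    """Collect the dc:* fields of the first rdf:Description into a dict."""
    root = ET.fromstring(xml_text)
    desc = root.find(f"{{{RDF}}}Description")
    fields = {"about": desc.get(f"{{{RDF}}}about")}
    for child in desc:
        if child.tag.startswith(f"{{{DC}}}"):
            name = child.tag.split("}", 1)[1]
            # Use the literal text, or the referenced resource if there is none.
            fields[name] = child.text or child.get(f"{{{RDF}}}resource")
    return fields


print(dublin_core(RECORD))
```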

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source httpwwwvoicexmlorg

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - News, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number find-me services

Intranet - Inventory, HR services, corporate portals

Unification - My Whatever, personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

set of description scheme and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define:

roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

format and method of content exchange (e.g. sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source httpwwwicestandardorg

HP 34

ICE Explained

ICE: Information and Content Exchange protocol

Syndicator: A content aggregator and distributor

Subscriber: A content consumer

Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement

Collection: The current content of a subscription

ICE Package: A delivery of commands to update a collection, such as the addition of content items

ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)

HP 36

XCM (eXtended Content Management)

a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are

Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM

eXtended Content Management

Content DevelopmentManagement

Content DeliveryApplication ContentManagement

Content AuthoringDigital Asset Management

Software ConfigurationManagement

Document ProcessManagement

Metadata ManagementRecombinationPersonalization

Edge Network Delivery

Streaming Media DeliveryCaching

Source httpwwwvignettecom

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…

Kansas State
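One simple way to reconcile the two models is a tag crosswalk, sketched below. The mapping pairs follow the side-by-side listing on this slide; the names `FGDC_TO_UDK` and `fgdc_to_udk` are hypothetical, and a real crosswalk would also need to handle differences in value formats, which this sketch ignores.

```python
# Tag pairs as they appear side by side on the slide (UDK spelling kept).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to their UDK counterparts; unknown tags pass through."""
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Direct Spatial Reference Method": "Vector",
}
print(fgdc_to_udk(fgdc_record))
```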

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

FrameworksInfrastructures (XCM)

MetadataApplication Specific

ICE

Media Specific

MPEG7 VoiceXML

Domain Specific

NewsML FGDCUDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services Taalee Content Applications

Where is the content

Whose is it

ProduceAggregate

CatalogIndex

What other content is it related to

Integrate Syndicate

What is the right content for this

user

Personalize

What is the best way to

monetize this interaction

Interactive Marketing

BroadcastWirelineWirelessInteractive TV

Taalee Semantic MetaBase

HP 41

Taaleersquos Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction; Types of metadata created


HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories

Types: Text, Graphics, Audio, Video, Multimedia

Forms: Unstructured text, Semi-structured text, Structured text (+Media), Static or Dynamic

Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

GlobalEnterpriseWeb Repositories

METADATAMETADATA

EXTRACTORSEXTRACTORS

Digital Maps

NexisUPIAP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
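The layout rule for dates can be approximated with a regular expression, as in the sketch below. This is an illustration of the syntactic approach rather than the extractor used in the talk: the pattern, the name `extract_date`, and the decision to make the comma optional are all assumptions.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# day month int ',' int, with the commas treated as optional (an assumption).
DATE_RULE = re.compile(
    rf"\b({DAYS}),?\s+({MONTHS})\s+(\d{{1,2}})\s*,?\s*(\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return {
        "day": match.group(1),
        "month": match.group(2),
        "date": int(match.group(3)),
        "year": int(match.group(4)),
    }


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
print(extract_date(header))
```

A purely syntactic rule like this finds the date by its shape alone; it knows nothing about what the report is about, which is the limitation the semantic approaches later in the deck address.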

Traditional TextCategorization

StatisticalAI Techniques

Classify Place ina taxonomy

feed

Customer Training

Set

RoutingDistribution

Customer Article Feed

4715

Standard Metadata

Feed Source iSyndicate

Posted Date 11202000

Classification of Article 4715

Knowledge-base amp StatisticalAI Techniques

ClassifyPlace ina taxonomy

MetadataCatalog

Content Manager

Precise syndicationfiltering

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11202000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taaleersquos Categorization amp Automatic Metadata Creation

Taalee Training

Set

Customer Training

Set ee ENTCompany AnalysisConference Calls

EarningsStock Analysis

Classification of Article 4715

Article Feed4715 RoutingDistribution

Map to another taxonomy

HP 49

Automatic Categorization amp Metadata Tagging (unstructured texttranscript of AV)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN IUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN

Video Segmentwith Associated Text

Segment Description

SemanticMetadata

AutoCategorization

HP 50

Automatic Categorization amp Metadata Tagging (Web page)

Video withEditorialized Text on the Web

AutoCategorization

AutoCategorization

Semantic MetadataSemantic Metadata

HP 51

Automatic Categorization amp Metadata Tagging (Feed)

Text from Bloomberg

AutoCategorization

AutoCategorization

Semantic MetadataSemantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event), Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference: (Statistical) (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM, …); Logic Based (Description Logic); Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods, Cluster Analysis

Learning/AI and Collab. Filtering

Reference data, Concept-terms, Dictionary/Thesaurus; by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase; by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP) ClassificationTaxonomy amp Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes

Capture the semantics involved via domain characteristics

Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: three interrelated ontologies.
Land use: LAND USE, COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL, LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE.
Waste disposal: WASTE DISPOSAL, RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE, with the processes shredding, magnetic separation, screening, and washing.
Natural disaster: NATURAL DISASTER, EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, with "causes" relationships between some of the disaster types.]

HP 61

Large Vocabularies, Taxonomies, Ontologies

• WordNet
• The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-Enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

• traditional queries based on keywords
• attribute-based queries
• content-based queries

HP 64

Oingo.com

• Oingo Ontology (ODP based?): the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek: the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense: the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua: the language of meaning used to state intent; the basis for intelligent interaction
• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for
• Closely related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort.

Intelligent content can be more effectively managed, packaged, and distributed.

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry), and
• Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" content can only be "found" if the user enters a keyword that exists within it.

Intelligent Content: adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry), and
• Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search
• Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS, UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Virage search on "football touchdown":

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
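
The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Asset` class, the two search functions, and the lowercase field names are hypothetical and are not Taalee's or Virage's actual implementation. The sample values come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A hypothetical media asset: free text plus optional structured metadata."""
    title: str
    description: str
    metadata: dict = field(default_factory=dict)


def keyword_search(assets, query):
    """Return assets whose title or description contains every query word."""
    words = query.lower().split()
    return [a for a in assets
            if all(w in (a.title + " " + a.description).lower() for w in words)]


def attribute_search(assets, **criteria):
    """Return assets whose metadata matches every attribute=value criterion."""
    def matches(asset):
        for attr, wanted in criteria.items():
            values = asset.metadata.get(attr, [])
            if isinstance(values, str):
                values = [values]
            if wanted.lower() not in [v.lower() for v in values]:
                return False
        return True
    return [a for a in assets if matches(a)]


ASSETS = [
    Asset("Brian Griese Interview, Part Four",
          "Brian Griese talks about the first touchdown he ever threw"),
    Asset("Baltimore 31, Pit 24",
          "Ismail and Banks hook up for their third long touchdown",
          {"league": "Professional", "teams": ["Ravens", "Steelers"],
           "players": ["Quandry Ismail", "Tony Banks"], "event": "Touchdown"}),
]

# A keyword match returns the interview as well as the actual play.
print([a.title for a in keyword_search(ASSETS, "touchdown")])
# An attribute query narrows the result to touchdown events involving the Ravens.
print([a.title for a in attribute_search(ASSETS, event="Touchdown", teams="Ravens")])
```

The keyword query cannot tell an interview that mentions a touchdown from a touchdown play; the attribute query can, because the event and teams are explicit fields.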

HP 72

Taalee's Semantic Search

• Highly customizable, precise, and freshest A/V search
• Context and domain-specific attributes
• Uniform metadata for content from multiple sources; can be sorted by any field
• Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content:
• Personalization
• Targeting / interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc.
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc.
• AT&T Corp.

2. My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone…
• Edwards explains reasons for leaving BYU

3. Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory

4. Pink Floyd Collection
• Set the Controls for the Heart of the Sun…
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream

HOT Topics

1. Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge

2. Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security

3. Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests and Preferences; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine™; Metadata-Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing (Categorize, Catalog, Integrate); Rich Data Metabase; Production Support; Sony; Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --
• Gore Demands That Recount Restart
• Gore Says Fla. Can't Name Electors
• Bush Meets Colin Powell at Ranch
• Market Tumbles on Earnings Warning
• Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Scene Description Tree; Node = AVO Object; Taalee Semantic Engine; Enhanced XML Description; "Cisco Systems" node; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

Example node metadata:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

• License metadata decoder and semantic applications to device makers
• Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

HP 93

Intelligent Metadata Creation

• Content which does contain the words the user asked for (Extractor Agents)
+
• Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)
+
• Content the user did not think to ask for, but which he needs to know (Semantic Associations)

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
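
The enrichment step described here can be sketched as a lookup against a small knowledge base. This is a minimal illustration, assuming a dictionary-based knowledge base and hypothetical field names (`players`, `teams`, `sports`); it is not Taalee's patented extraction technology, only the general idea of adding metadata that the text itself never mentions.

```python
# Hypothetical knowledge base: entity name -> known facts about that entity.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball",
                      "team": "New York Yankees"},
}


def enrich(metadata, kb=KNOWLEDGE_BASE):
    """Return a copy of the metadata with team and sport added for known players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = kb.get(player)
        if not facts:
            continue  # unknown entity: nothing to add
        for fact_name, target_field in (("team", "teams"), ("sport", "sports")):
            value = facts.get(fact_name)
            if value and value not in enriched.setdefault(target_field, []):
                enriched[target_field].append(value)
    return enriched


# Metadata extracted directly from a story that never mentions the Yankees.
story = {"players": ["Roger Clemens"]}
print(enrich(story))
```

With the team stored as metadata, a search for "New York Yankees" can return the story even though the words do not occur in it.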

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

• Search for 'Jamal Anderson' in 'Football'
• Click on the first result for Jamal Anderson
• View metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

• Search for 'Gary Sheffield' in 'Baseball'
• Click on the first result for Gary Sheffield
• View metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

• Producer Name: name of the content provider that produced the asset
• Posted Date: date of asset posting. Extracted automatically
• Player Names: names of players mentioned explicitly in the asset. Extracted automatically
• Team Name: name of the team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee's semantic relationships
• League Name: name of the league to which the player's team belongs. Not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations
• Sport: name of the sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
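
A semantic association of this kind can be sketched as a set of subject-relation-object triples that an application consults when choosing related stories. The relation names, the `related` and `associated_stories` helpers, and the sample headlines below are hypothetical; treating `competes_with` as symmetric is also an assumption made for this illustration.

```python
# Facts from the slide, stored as (subject, relation, object) triples.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]


def related(entity, relation, triples=TRIPLES, symmetric=True):
    """Return the entities linked to `entity` by `relation`."""
    found = [o for s, r, o in triples if s == entity and r == relation]
    if symmetric:
        # Also follow the relation in the reverse direction.
        found += [s for s, r, o in triples
                  if o == entity and r == relation and s not in found]
    return found


def associated_stories(entity, stories, relation="competes_with"):
    """Pick stories that mention an entity associated with `entity`."""
    targets = set(related(entity, relation))
    return [title for title, entities in stories if targets & set(entities)]


# Hypothetical catalog: (headline, entities recognized in the story).
STORIES = [
    ("Ariba posts quarterly results", ["Ariba"]),
    ("Commerce One signs new partner", ["Commerce One"]),
]

print(related("Commerce One", "competes_with"))
print(associated_stories("Commerce One", STORIES))
```

A user who asked only about Commerce One is also offered the Ariba story, which a keyword index alone would not surface.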

HP 103

Example (test on http://directory.mediaanywhere.com)

• Search for company 'Commerce One'
• Links to news on companies that compete against Commerce One
• Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1-3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML, or other format; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to the Cisco story; passed on to Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked For + What You Need to Know

• COMPANY
• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in the same or a related INDUSTRY
• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
• Technology / Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-Layered Architecture of the Semantic Web

• Logical Layer: formal semantics and reasoning support (OIL, DAML-O)
• Schema Layer: definition of vocabulary (RDF Schema)
• Data Layer: simple data model and syntax for metadata (RDF)

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards the Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
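
The rule above can be written as an ordinary predicate. The sketch below is an assumption-laden illustration, not the InfoQuilt implementation: it treats `dateDifference` as days elapsed from the test to the earthquake (requiring the earthquake to come after the test, as the prose says), measures `distance` in miles as a great-circle (haversine) distance, and uses hypothetical `Event` records.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


@dataclass
class Event:
    """A nuclear test or an earthquake, reduced to the attributes the rule uses."""
    event_date: date
    latitude: float
    longitude: float


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake.event_date - test.event_date).days
    close = distance_miles(test.latitude, test.longitude,
                           quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and close


# Hypothetical records, for illustration only.
test = Event(date(1970, 1, 1), 37.0, -116.0)
quake = Event(date(1970, 1, 15), 36.0, -117.0)
print(may_have_caused(test, quake))
```

Note that the predicate only flags candidate pairs; it says nothing about actual causation, which is the point of the question the slides raise.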

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
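
The grouping query can be sketched as a simple binning step. The half-open bin boundaries, the `average_per_year` helper, and the sample records are assumptions of this illustration; the slides do not say how boundary magnitudes were assigned or how the average was computed.

```python
from collections import Counter

# Magnitude ranges from the slide, treated here as half-open intervals [low, high).
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the range containing the magnitude, or None."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly count per magnitude range over [first_year, last_year]."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for _, _, label in BINS}


# Hypothetical (year, magnitude) records, for illustration only.
QUAKES = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1901, 4.0)]
print(average_per_year(QUAKES, 1900, 1901))
```

Running the same function over the years before and after a chosen date gives the per-range comparison that the slide summarizes.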

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10,000 miles of the test site.

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NewsML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• Semantic Web: www.semanticweb.org
• VoiceXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998.

HP 6

A metadata classification

Data (Heterogeneous Types/Media)

• Content Independent Metadata (creation-date, location, type-of-sensor)
• Content Dependent Metadata (size, max colors, rows, columns)
• Direct Content Based Metadata (inverted lists, document vectors, WAIS, Glimpse, LSI)
• Domain Independent (structural) Metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure)
• Domain Specific Metadata: area, population (Census); land-cover, relief (GIS); metadata concept descriptions from ontologies
• Ontologies, Classifications, Domain Models

User

Move in this direction to tackle information overload

HP 7

Types of Metadata for digital media

• Media type-specific metadata, e.g., texture of images, font size, …

• Media processing-specific metadata, e.g., search, retrieval, personalized filtering

• Content-specific metadata, e.g., rocket-related video and documents

HP 8

Metadata for Digital Data

Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al.] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
Topic Change Indices [Chen et al.] | Audio | Direct Content Based
Document Vectors [Deerwester et al.] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al.] | Text | Domain Independent
Contexts [Sciore et al., Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al.] | Structured | Domain Specific
User's Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific

HP 9

Types of Specs and Standards (or MetaModels)

Domain Independent: (MCF), RDF, MOF, Dublin Core

Media Specific: MPEG4, MPEG7, VoiceXML

Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)

Application Specific: ICE (Syndication)

Exchange/Sharing: XCM, XMI

Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)

HP 10

What RDF can do for metadata

Designed to impose structural constraints on syntax to support consistent encoding, exchange, and processing of metadata

Domain-independent metadata standard

HP 11

RDF (Resource Description Framework)

[Figure: Resource, Property, Value]

• RDF data consists of nodes and attached attribute/value pairs

• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata

• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances
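
The node/attribute/value model above can be sketched as a list of (resource, property, value) triples. This is only an illustration in plain Python, not the API of any RDF toolkit; the function name `describe` and the `bib:name` property spelling are assumptions made for the example.

```python
from collections import defaultdict

# Each statement is a (resource, property, value) triple.
# A value is either an atomic string or the identifier of another resource.
TRIPLES = [
    ("URI:TALK", "dc:title", "Mysteries of Metadata"),
    ("URI:TALK", "dc:creator", "URI:AMIT"),   # value is another resource
    ("URI:AMIT", "bib:name", "Amit Sheth"),   # hypothetical property name
]


def describe(triples, resource):
    """Collect the property/value pairs attached to one node."""
    properties = defaultdict(list)
    for subject, prop, value in triples:
        if subject == resource:
            properties[prop].append(value)
    return dict(properties)


talk = describe(TRIPLES, "URI:TALK")
# Because a value can itself be a resource, descriptions can be chained.
creator = describe(TRIPLES, talk["dc:creator"][0])
```

Following `dc:creator` from the talk to the person is the kind of traversal that lets metadata describe other metadata.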

HP 12

RDF Example 1

[Figure: URI:TALK has dc:title "Mysteries of Metadata" and dc:creator URI:AMIT]

<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
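
As a rough sketch of how such a description could be read back into statements, the following uses only Python's standard `xml.etree.ElementTree`. The namespace URIs in the embedded document are assumptions (the RDF one is the namespace of the 1999 W3C Recommendation), and `rdf_triples` is a hypothetical helper that handles only this simple, flat form of RDF/XML.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"   # assumed namespace
DC = "http://purl.org/dc/elements/1.0/"               # assumed namespace

DOC = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def rdf_triples(xml_text):
    """Return (subject, property, value) triples from flat rdf:Description blocks."""
    root = ET.fromstring(xml_text)
    triples = []
    for description in root.findall(f"{{{RDF}}}Description"):
        subject = description.get(f"{{{RDF}}}about")
        for prop in description:
            # A property either points at another resource or holds literal text.
            value = prop.get(f"{{{RDF}}}resource")
            if value is None:
                value = (prop.text or "").strip()
            triples.append((subject, prop.tag, value))
    return triples


triples = rdf_triples(DOC)
```

ElementTree reports property names in `{namespace}local` form, which makes it visible that `dc:title` is only shorthand for a full URI.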

HP 13

RDF Example 2

[Figure: RDF graph extending Example 1. Nodes: URI:TALK, URI:AMIT, URI:LIB. Values: "Mysteries of Metadata", "Amit Sheth", amit@taalee.com. Properties: dc:title, dc:creator, BIB:Name, BIB:Email, BIB:Aff]

HP 14

RDFS (RDF Schema)

Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, …)

Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties

HP 15

RDF Based Web

[Figure: HTML resources, RDF/XML descriptions, RDF schemas]

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International, inter-discipline, W3C community consensus

"Semantic" interface among resource description communities (a very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<xml>
<namespace href="http://w3.org/rdf-schema" as="RDF"/>
<namespace href="http://metadata.net/DC" as="DC"/>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
    DC:Title="I've Never Metadata I've Never Liked"
    DC:Creator="Mary Crystal"
    DC:Subject="Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content

NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries

Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE

HP 20

NewsML…

It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device

Accurate, objective set of description tools which help qualify the information and make the search more precise

NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata

Version 1.0, April 2001

Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)

Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"

Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML

Designed to be used in a simple, straightforward way over the Internet

Compatible with NewsML

Integrates easily with ICE (for syndication)

Vocabulary: Basic DC; Extensions; "Controlled Vocabularies", e.g., "North American Industry Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval: news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing; one-number "find-me" services

Intranet: inventory, HR services, corporate portals

Unification: "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (e.g., yellow pages)

Broadcast media selection (radio channel, TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible content syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define:

roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol

Syndicator: A content aggregator and distributor

Subscriber: A content consumer

Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement

Collection: The current content of a subscription

ICE Package: A delivery of commands to update a collection, such as the addition of content items

ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)
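
To make the split between ICE's content-independent envelope and the domain-specific content concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The embedded payload is a simplified stand-in modeled on the example above (no DTD, placeholder image data), and `split_metadata` is a hypothetical helper, not part of any ICE implementation.

```python
import xml.etree.ElementTree as ET

# Simplified payload modeled on the example; attribute spellings are assumed.
PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     pubdate="20010515">(encoded image data)</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """Separate ICE envelope attributes from domain-specific content elements."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)  # content-independent (ICE vocabulary)
        # Child elements come from the industry vocabulary, not from ICE.
        content = [(child.tag, dict(child.attrib)) for child in item]
        items.append({"envelope": envelope, "content": content})
    return items


items = split_metadata(PAYLOAD)
```

A subscriber could route on the envelope alone and hand the `comic-strip` element to whatever application understands that vocabulary.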

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

[Figure: XCM segments (Content Development/Management, Application Content Management, Content Delivery) and their components: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management, Metadata Management, Recombination, Personalization, Edge Network Delivery, Streaming Media Delivery, Caching]

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model

Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model

Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…

Kansas State
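
One simple way to bridge two metadata models that use different tag names for the same data is an explicit tag mapping. The sketch below covers only the five tag pairs shown on this slide; the function name `fgdc_to_udk` and the extra `Originator` field in the usage are assumptions for illustration, and a real crosswalk between FGDC and UDK would be far larger.

```python
# Tag pairs taken from the slide; anything else is left unmapped.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to UDK tags; return (translated, unmapped) dicts."""
    translated, unmapped = {}, {}
    for tag, value in record.items():
        if tag in FGDC_TO_UDK:
            translated[FGDC_TO_UDK[tag]] = value
        else:
            unmapped[tag] = value  # keep what we cannot translate
    return translated, unmapped


record = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Originator": "unknown",  # hypothetical field with no mapping here
}
translated, unmapped = fgdc_to_udk(record)
```

Returning the unmapped tags separately keeps the translation honest: renaming tags resolves naming differences only, not deeper differences in meaning.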

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services; Taalee Content Applications

Where is the content? Whose is it?

Produce/Aggregate; Catalog/Index

What other content is it related to?

Integrate; Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories

Types: Text, Graphics, Audio, Video, Multimedia

Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic

Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

[Figure: METADATA EXTRACTORS drawing on Global/Enterprise/Web Repositories: Digital Maps; Nexis, UPI, AP; Documents; Digital Audios; Data Stores; Digital Videos; Digital Images]

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fore is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35 contained, while protection of the historic cabit continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
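
The layout rule `Date => day month int ',' int` can be approximated with a regular expression. The sketch below is one possible reading of that rule using Python's standard `re` module; the group names and the choice to make the commas optional are assumptions made so the rule tolerates small variations in the report header.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# day month int ',' int  (commas treated as optional)
DATE_RULE = re.compile(
    rf"(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<date>\d{{1,2}}),?\s+(?P<year>\d{{4}})"
)


def extract_date(text):
    """Return the first date found as a dict of named parts, or None."""
    match = DATE_RULE.search(text)
    return match.groupdict() if match else None


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
found = extract_date(header)
```

A syntactic extractor like this knows the shape of a date but nothing about what the report means, which is why the later slides move toward knowledge-based extraction.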

Traditional Text Categorization

[Figure labels: Customer Article Feed (article 4715); Customer Training Set; Statistical/AI Techniques; Classify (place in a taxonomy); Routing/Distribution; Classification of Article 4715]

Standard Metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

[Figure labels: Article Feed (article 4715); Taalee Training Set; Customer Training Set; Knowledge-base & Statistical/AI Techniques; Classify (place in a taxonomy); Map to another taxonomy; Metadata Catalog; Content Manager; Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite; Routing/Distribution; Precise syndication/filtering; Classification of Article 4715]

Article 4715 Metadata

Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000

Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News

FTE and ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

[Figure labels: Video Segment with Associated Text; Segment Description; Semantic Metadata; Auto Categorization]

HP 50

Automatic Categorization & Metadata Tagging (Web page)

[Figure labels: Video with Editorialized Text on the Web; Auto Categorization; Semantic Metadata]

HP 51

Automatic Categorization & Metadata Tagging (Feed)

[Figure labels: Text from Bloomberg; Auto Categorization; Semantic Metadata]

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference

(Statistical) (Information Retrieval)

Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)

Logic Based (Description Logic)

Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis; Learning/AI and Collaborative Filtering

Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain (word or phrase)

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

(moving down the list gives deeper understanding)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes

Capture the semantics involved via domain characteristics

Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example Interrelated ontologies

[Figure: interrelated ontologies. Concepts shown: LANDUSE (COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL); LAND (SITE) (CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE); WASTE DISPOSAL (RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing); NATURAL DISASTER (EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI). Four "causes" relationships are drawn between concepts.]

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords

attribute-based queries

content-based queries

HP 64

Oingo.com

Oingo Ontology (ODP based?) - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction

Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create

Context of the Content

Semantic Metadata describing entities (e.g., Company, Industry) and

Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage search on: football touchdown

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
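
The contrast on this slide can be sketched in a few lines: keyword search looks only at the text that happens to accompany an asset, while attribute search queries the semantic metadata fields. The asset records below are abridged from the slide, and the function names `keyword_search` and `attribute_search` are hypothetical, not Taalee's or Virage's API.

```python
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured attributes
    },
    {
        "title": "Baltimore 31, Pit 24",
        "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, *keywords):
    """Match only when every keyword literally occurs in title or description."""
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if all(keyword.lower() in text for keyword in keywords):
            hits.append(asset["title"])
    return hits


def _has_value(stored, wanted):
    """True if a metadata field (single value or list) contains the wanted value."""
    values = stored if isinstance(stored, list) else [stored]
    return wanted in values


def attribute_search(assets, **criteria):
    """Match on structured metadata fields, e.g. Teams='Steelers'."""
    return [
        asset["title"]
        for asset in assets
        if all(_has_value(asset["metadata"].get(attr), wanted)
               for attr, wanted in criteria.items())
    ]


by_keyword = keyword_search(ASSETS, "touchdown")
missed = keyword_search(ASSETS, "Steelers")
by_attribute = attribute_search(ASSETS, Teams="Steelers", Event="Touchdown")
```

The keyword query for "touchdown" returns both assets, including an interview that merely mentions one, while a keyword query for "Steelers" finds nothing because the word never appears in the text. The attribute query finds the game highlight through its metadata.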

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context- and domain-specific attributes; uniform metadata for content from multiple sources; can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION: Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…

2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…

3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory; more…

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure labels: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia, $]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia, $; My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia, $; My Stocks list: CSCO, NT, IBM, Market; CSCO options: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia, $; My Stocks list: CSCO, NT, IBM, Market; CSCO options: Analyst Call, Conf Call, Earnings]

CSCO Analysis: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

[Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine™; Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests, Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

[Figure labels: Production Support; Sony; Collection; Processing; Categorize; Catalog; Integrate; Network Content; Affiliate Feeds; Public Sources; Rich Data Metabase]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast

Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description

Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Figure labels: Video; MPEG-2/4/7; MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; Enhanced XML Description; "Cisco Systems" Node; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Intelligent Metadata Creation

• Extractor Agents: content which does contain the words the user asked for
• + Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
• + Semantic Associations: content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found.
• The burden is on the user to think of and ask for the "right" keyword.

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
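The idea of adding missing metadata from known associations can be sketched in a few lines of Python. This is only an illustration, not Taalee's implementation: the `KNOWLEDGE_BASE` layout, the relation names, and the `enrich` and `matches` helpers are all hypothetical, and the only facts used are the ones stated on the slide.

```python
# Hypothetical knowledge base: (subject, relation) -> object.
# The facts come from the slide; the structure is an assumption.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
}

# Which metadata field each relation fills in (illustrative names).
RELATION_TO_FIELD = {"plays_sport": "Sport", "plays_for": "Team"}


def enrich(metadata):
    """Return a copy of the metadata plus fields implied by known associations."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        for (subject, relation), value in KNOWLEDGE_BASE.items():
            if subject != player:
                continue
            field = RELATION_TO_FIELD[relation]
            if value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, term):
    """True if the search term equals any metadata value (case-insensitive)."""
    return any(term.lower() == value.lower()
               for values in metadata.values() for value in values)


extracted = {"Players": ["Roger Clemens"]}  # what the story text itself yields
enriched = enrich(extracted)
print(matches(extracted, "New York Yankees"))  # False: keyword not in content
print(matches(enriched, "New York Yankees"))   # True: found via added metadata
```

The search itself stays a plain metadata match; only the metadata changes, which is why the enriched asset becomes findable under a term it never mentions.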

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

1. Search for 'Jamal Anderson' in 'Football'.
2. Click on the first result for Jamal Anderson.
3. View the metadata. Note that Team name and League name are also included in the metadata.
4. View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 - Snapshots ("Gary Sheffield")

1. Search for 'Gary Sheffield' in 'Baseball'.
2. Click on the first result for Gary Sheffield.
3. View the metadata. Note that Team name and League name are also included in the metadata.
4. View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content - Value-Added Metadata

Metadata fields of a rich media sports asset:

• Producer Name: name of the content provider that produced the asset
• Posted Date: date of asset posting. Extracted automatically
• Player Names: names of players mentioned explicitly in the asset. Extracted automatically
• Team Name: name of the team for which the player plays. Not mentioned explicitly in the asset. Value-added using Taalee's semantic relationships
• League Name: name of the league to which the player's team belongs. Not mentioned explicitly in the asset. Value-added by Taalee's processing based on semantic associations
• Sport: name of the sport

Legend: "X -> Y" means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
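A minimal sketch of how typed associations could drive "associated information" in a search result is shown below. It is an assumption-laden illustration rather than Taalee's Semantic Content Model: the tuple layout, the `build_index` and `search` helpers, the placeholder catalog entries, and the choice to treat COMPETES_WITH as symmetric are all hypothetical.

```python
from collections import defaultdict

# Typed associations taken from the slide; the tuple layout is an assumption.
ASSOCIATIONS = [
    ("Commerce One", "IS_A", "COMPANY"),
    ("Commerce One", "PARTICIPATES_IN",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
]

# Hypothetical catalog: company -> cataloged stories (placeholders only).
CATALOG = {
    "Commerce One": ["<Commerce One story>"],
    "Ariba": ["<Ariba story>"],
}


def build_index(associations):
    """Index objects by (subject, relation) for direct lookup."""
    index = defaultdict(list)
    for subject, relation, obj in associations:
        index[(subject, relation)].append(obj)
        if relation == "COMPETES_WITH":
            # Assumption: competition is treated as a symmetric relation.
            index[(obj, relation)].append(subject)
    return index


def search(company, index, catalog):
    """Return what was asked for plus stories on known competitors."""
    rivals = index.get((company, "COMPETES_WITH"), [])
    return {
        "asked_for": catalog.get(company, []),
        "need_to_know": [story for rival in rivals
                         for story in catalog.get(rival, [])],
    }


index = build_index(ASSOCIATIONS)
result = search("Commerce One", index, CATALOG)
print(result["asked_for"])     # ['<Commerce One story>']
print(result["need_to_know"])  # ['<Ariba story>']
```

A keyword index alone could only produce the first list; the second list exists because the relationship is stored as data that the search can traverse.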

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'. The result page shows:

• Links to news on companies that compete against Commerce One
• Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically.

HP 104

Metadata-centric Content Management Architecture

Figure labels (diagram not reproduced): Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1-3; Semantic Engine; World Model; Taalee Metabase; Knowledge Base; Semantic Application (ASP/Enterprise hosted); third-party content management and syndication; XCM-compliant metadata, XML, or other format.

Flow shown in the figure (Story on Cisco, Story on Lucent):

1. Cisco story from PW Source 1 is passed on to add semantic associations.
2. The Knowledge Base is consulted for Cisco's competition.
3. The result is returned: Lucent is a competitor of Cisco.
4. A Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard.

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What You Need to Know

Associations around a COMPANY:

• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in the same or a related INDUSTRY
• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
• Technology/Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

• Automatic collation of semantically related digital media information from multiple sources
• Research inferred automatically
• Semantically related news not specifically asked for
• Semantic search, personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., the InfoQuilt project at the LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of the Semantic Web

• Logical Layer: formal semantics and reasoning support - OIL, DAML-O
• Schema Layer: definition of vocabulary - RDF Schema
• Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

• Earthquake sources (USGS, NEIC)
• Nuclear test sources (Oklahoma Observatory, etc.)

Nuclear tests may cause earthquakes. Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake
  <= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
     AND distance(NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude) < 10000
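The rule above can be read as an executable predicate. The sketch below is one possible rendering in Python, not the InfoQuilt implementation: the `Event` class, the haversine formula for `distance`, the unit (miles), and the extra lower bound of zero days (taken from the prose "some time after") are assumptions, and the sample events are synthetic.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if earlier)."""
    return (quake.event_date - test.event_date).days


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """The slide's rule, plus the assumption that the quake is not earlier."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude,
                     quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles


# Synthetic events for illustration only.
test = Event(date(2000, 1, 1), 37.0, -116.0)
nearby_soon = Event(date(2000, 1, 15), 36.0, -117.0)
nearby_late = Event(date(2000, 3, 1), 36.0, -117.0)
print(could_have_caused(test, nearby_soon))  # True
print(could_have_caused(test, nearby_late))  # False
```

Keeping the thresholds as parameters makes it easy to tighten the "nearby region" condition, since 10000 miles covers most of the globe.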

Knowledge Discovery - Example

When was the first recorded nuclear test conducted? 1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in the number of earthquakes since 1945.

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year, starting from 1900, find the average number of earthquakes.

The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
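The grouping query can be sketched as a small aggregation. This is an illustrative reading of the query, not the system's own: the half-open treatment of range boundaries, the `average_per_year` helper, and the input format of (year, magnitude) pairs are assumptions, and the sample data is synthetic.

```python
from collections import Counter

# Magnitude ranges from the slide as (label, low, high), treated as
# half-open intervals [low, high); the last range is open-ended.
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def bin_label(magnitude):
    """Return the label of the range containing the magnitude, or None."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly count per magnitude range over [first_year, last_year]."""
    n_years = last_year - first_year + 1
    totals = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if label is not None and first_year <= year <= last_year:
            totals[label] += 1
    return {label: totals[label] / n_years for label, _, _ in BINS}


# Synthetic (year, magnitude) pairs for illustration only.
quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.2),
          (1901, 7.5), (1899, 8.0), (1901, 5.0)]
averages = average_per_year(quakes, 1900, 1901)
print(averages)
```

Comparing these averages for the years before and after the first recorded test is what supports, or fails to support, the hypothesis on the slide.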

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NewsML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• Semantic Web: www.semanticweb.org
• VoiceXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998.


HP 7

Types of Metadata for digital media

• Media type-specific metadata, e.g., texture of images, font size, ...
• Media processing-specific metadata, e.g., search, retrieval, personalized filtering
• Content-specific metadata, e.g., rocket-related video and documents

HP 8

Metadata for Digital Data

Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al.] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
Topic Change Indices [Chen et al.] | Audio | Direct Content Based
Document Vectors [Deerwester et al.] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al.] | Text | Domain Independent
Contexts [Sciore et al., Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al.] | Structured | Domain Specific
User's Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific

HP 9

Types of Specs and Standards (or MetaModels)

• Domain Independent: (MCF), RDF, MOF, Dublin Core
• Media Specific: MPEG4, MPEG7, VoiceXML
• Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)
• Application Specific: ICE (Syndication)
• Exchange/Sharing: XCM, XMI
• Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)

HP 10

What RDF can do for metadata

• Designed to impose structural constraints on syntax to support consistent encoding, exchange, and processing of metadata
• Domain-independent metadata standard

HP 11

RDF (Resource Description Framework)

Resource - Property - Value

• RDF data consists of nodes and attached attribute/value pairs
• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata
• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances

HP 12

RDF Example 1

URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT

<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
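The resource/property/value model of Example 1 can also be produced programmatically. The sketch below uses Python's standard `xml.etree.ElementTree` to serialize two statements about the talk. It is an illustration under assumptions: the tuple format and the `to_rdf_xml` helper are hypothetical, and the namespace URIs are the standard RDF syntax namespace and Dublin Core 1.1, which differ from the URIs printed on the slide.

```python
import xml.etree.ElementTree as ET

# Assumption: standard namespace URIs, not the ones shown on the slide.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# (resource, Dublin Core property, value, value_is_resource)
TRIPLES = [
    ("URI:TALK", "title", "Mysteries of Metadata", False),
    ("URI:TALK", "creator", "URI:AMIT", True),
]


def to_rdf_xml(triples):
    """Serialize the statements as RDF/XML, one Description per resource."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    descriptions = {}
    for subject, prop, value, is_resource in triples:
        desc = descriptions.get(subject)
        if desc is None:
            desc = ET.SubElement(root, f"{{{RDF_NS}}}Description",
                                 {f"{{{RDF_NS}}}about": subject})
            descriptions[subject] = desc
        element = ET.SubElement(desc, f"{{{DC_NS}}}{prop}")
        if is_resource:
            # The value is another node in the graph, not a literal.
            element.set(f"{{{RDF_NS}}}resource", value)
        else:
            element.text = value
    return ET.tostring(root, encoding="unicode")


rdf_xml = to_rdf_xml(TRIPLES)
print(rdf_xml)
```

The distinction between a literal value (the title) and a resource value (the creator) is what lets further statements, such as a name or email address, be attached to URI:AMIT later.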

HP 13

RDF Example 2

URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT
URI:AMIT --BIB:Name--> "Amit Sheth"
URI:AMIT --BIB:Email--> amit@taalee.com
URI:AMIT --BIB:Aff--> URI:LIB

HP 14

RDFS (RDF Schema)

• Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, ...)
• Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties

HP 15

RDF Based Web

Figure layers (diagram not reproduced): HTML Resources; RDF/XML Descriptions; RDF Schemas

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

• Simple element set designed for resource description
• International, inter-discipline, W3C community consensus
• "Semantic" interface among resource description communities (a very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<xml>
  <namespace href="http://w3.org/rdf-schema" as="RDF"/>
  <namespace href="http://metadata.net/DC" as="DC"/>
  <RDF:Abbreviated>
    <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
      DC:Title="I've Never Metadata I've Never Liked"
      DC:Creator="Mary Crystal"
      DC:Subject="Metadata, Dublin Core, Stuff"/>
  </RDF:Abbreviated>
</xml>

HP 18

MOF (Meta Object Facility) and XMI

• MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL
• XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

• NewsML is a packaging and metadata format for news content
• NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries
• Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE

HP 20

NewsML...

• It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device
• An accurate, objective set of description tools which help qualify the information and make the search more precise
• NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about

HP 21

Example of the end-to-end flow - NewsML

• The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.
• The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
• Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.

Source: http://www.mediabricks.com

HP 22

PRISM

• Publishing Requirements for Industry Standard Metadata
• Version 1.0, April 2001
• Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
• Idea: "a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts"
• Web site: http://www.prismstandard.org

HP 23

PRISM Design

• Built on existing standards like Dublin Core (DC), RDF, XML
• Designed to be used in a simple, straightforward way over the Internet
• Compatible with NewsML
• Integrates easily with ICE (for syndication)
• Vocabulary:
  - Basic: DC
  - Extensions: "Controlled Vocabularies", e.g., "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
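Because PRISM builds on RDF/XML and Dublin Core, a consumer can read such a record with an ordinary XML parser. The following is a rough sketch using Python's standard library, not a PRISM toolkit: the `dublin_core_fields` helper is hypothetical, and the embedded record is a shortened copy of the example with its punctuation restored on the assumption that the original followed standard RDF/XML syntax.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Shortened copy of the PRISM example (punctuation restored by assumption).
RECORD = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core_fields(xml_text):
    """Map each described resource to its Dublin Core field values."""
    root = ET.fromstring(xml_text)
    resource_attr = f"{{{NS['rdf']}}}resource"
    about_attr = f"{{{NS['rdf']}}}about"
    records = {}
    for desc in root.findall("rdf:Description", NS):
        fields = {}
        for child in desc:
            namespace, _, local_name = child.tag[1:].partition("}")
            if namespace != NS["dc"]:
                continue  # ignore elements outside Dublin Core
            text = (child.text or "").strip()
            # Literal values are element text; resource values are attributes.
            fields[local_name] = text if text else child.get(resource_attr)
        records[desc.get(about_attr)] = fields
    return records


records = dublin_core_fields(RECORD)
for resource, fields in records.items():
    print(resource, fields)
```

Keying the result by the `rdf:about` URI keeps the metadata attached to the asset it describes, which is the point of using RDF as the carrier.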

HP 25

VoiceXML

• A language for specifying voice dialogs
• Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
• Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
• High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source httpwwwvoicexmlorg

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

• Information retrieval: news, sports, traffic, stock quotes
• e-Transactions (e-commerce, e-tailing, etc.)
• Financial: banking, stock trading
• Catalog browsing (generally as an adjunct to paper)
• Telephone services
• Personal voice dialing, one-number "find-me" services
• Intranet: inventory, HR services, corporate portals
• Unification: "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

• A set of description schemes and descriptors to describe the content of multimedia data
• Provides a language to specify description schemes
• A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (e.g., yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

• Main goal: efficient and extensible content syndication protocol for the Internet, using XML syntax
• Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
• Status: latest spec version 1.1, May 2000; submitted to W3C for review
• Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
• Web site: http://www.icestandard.org

HP 32

What is the ICE Protocol?

• Syndication protocol for communication between Syndicators and Subscribers
• Metadata to define:
  - roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
  - format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

• ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
• Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

• ICE: Information and Content Exchange protocol
• Syndicator: a content aggregator and distributor
• Subscriber: a content consumer
• Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
• Collection: the current content of a subscription
• ICE Package: a delivery of commands to update a collection, such as the addition of content items
• ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0, http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

The comic-strip element is the content (domain-specific metadata); the ice-* elements around it carry the content-independent ICE metadata.
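The split between the ICE envelope and the domain content can be made concrete with a short parser. This is a sketch only, not an ICE implementation: the `split_payload` helper is hypothetical, and the embedded payload is a trimmed copy of the example (DOCTYPE and image data omitted) with punctuation restored by assumption.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example payload; attribute punctuation is assumed.
PAYLOAD = """
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515"/>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
"""


def split_payload(xml_text):
    """Return (envelope, domain) pairs for every ice-item in the payload.

    envelope: the ICE attributes of the item (content-independent metadata)
    domain:   (tag, attributes) of each child element (domain-specific)
    """
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)
        domain = [(child.tag, dict(child.attrib)) for child in item]
        items.append((envelope, domain))
    return items


items = split_payload(PAYLOAD)
for envelope, domain in items:
    print("ICE:", envelope["item-id"], envelope["filename"])
    for tag, attributes in domain:
        print("Domain:", tag, attributes["title"])
```

A subscriber can process the envelope without knowing anything about comic strips, and hand the domain part to whatever application understands that vocabulary.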

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

• Content Development: developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
• Application Content Management (Vignette): deploying content dynamically to a Web site and managing that content throughout its online lifecycle
• Content Delivery: delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

Segments: Content Development/Management; Application Content Management; Content Delivery

Figure labels (diagram not reproduced):

• Content Authoring
• Digital Asset Management
• Software Configuration Management
• Document Process Management
• Metadata Management
• Recombination
• Personalization
• Edge Network Delivery
• Streaming Media Delivery
• Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (example data: Kansas State)

FGDC Metadata Model tag | UDK Metadata Model tag | Value
Theme keywords | Search terms | digital line graph, hydrography, transportation
Title | Topic | Dakota Aquifer
Online linkage | Address Id | http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method | Measuring Techniques | Vector
Horizontal Coordinate System Definition | Co-ordinate System | Universal Transverse Mercator
... | ... | ...
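One common way to reconcile such models is a tag crosswalk. The sketch below shows the idea with the five tag pairs from the slide. It is a minimal illustration under assumptions: the `FGDC_TO_UDK` table and `translate` helper are hypothetical names, the "Address Id" spelling and the restored URL are editorial guesses, and a real crosswalk would also have to handle tags with no counterpart and differing value formats.

```python
# Tag pairs as listed on the slide (FGDC tag -> UDK tag).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, crosswalk):
    """Rename the tags of a record; tags without a mapping are reported."""
    translated, unmapped = {}, []
    for tag, value in record.items():
        if tag in crosswalk:
            translated[crosswalk[tag]] = value
        else:
            unmapped.append(tag)
    return translated, unmapped


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Online linkage": "http://gisdasc.kgs.ukans.edu/dasc/",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

udk_record, unmapped = translate(fgdc_record, FGDC_TO_UDK)
print(udk_record["Topic"])  # Dakota Aquifer
round_trip, _ = translate(udk_record, UDK_TO_FGDC)
print(round_trip == fgdc_record)  # True
```

A tag-level crosswalk only addresses naming differences; mapping both models to shared domain concepts, as discussed later under ontologies, is the more general approach.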

HP 39

Different views of Metadata

• Domain Independent: specifications (RDF)
• Frameworks/Infrastructures (XCM)
• Application Specific: ICE
• Media Specific: MPEG7, VoiceXML
• Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services and Taalee Content Applications, built on the Taalee Semantic MetaBase:

• Where is the content? Whose is it? -> Produce/Aggregate, Catalog/Index
• What other content is it related to? -> Integrate, Syndicate
• What is the right content for this user? -> Personalize
• What is the best way to monetize this interaction? -> Interactive Marketing

Channels: Broadcast, Wireline, Wireless, Interactive TV

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

• Sources: Web sites, content feeds, and private repositories
• Types: text, graphics, audio, video, multimedia
• Forms: unstructured text, semi-structured text, structured text (+ media); static or dynamic
• Ingest: feed (push), Web (pull), repository/database (usually pull)

HP 44

Content Handling/Ingest

• Infrastructure/Exchange
• Feed handlers
• Crawlers
• Screen scrapers
• Bots
• Software agents: centralized, distributed, mobile/migratory

HP 45

Information Extraction for Metadata Creation

Figure labels (diagram not reproduced): METADATA EXTRACTORS operating over global/enterprise Web repositories. Source types shown: digital maps, Nexis/UPI/AP, documents, digital audios, data stores, digital videos, digital images.

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
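The LAYOUT rule can be approximated with a regular expression. The sketch below is one possible reading of "day month int ',' int", not the extractor used in the workshop: the `DATE_RULE` pattern and `extract_date` helper are hypothetical, and the commas are made optional on the assumption that some copies of the report have lost their punctuation.

```python
import re
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November",
               "December"]

# Date => day month int ',' int  (commas optional by assumption)
DATE_RULE = re.compile(
    r"\b(?:" + "|".join(DAY_NAMES) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTH_NAMES) + r")\s+"
    r"(?P<day>\d{1,2}),?\s+"
    r"(?P<year>\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month = MONTH_NAMES.index(match.group("month")) + 1
    return date(int(match.group("year")), month, int(match.group("day")))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_date(header))  # 1997-08-01
```

A purely syntactic rule like this finds the report date reliably because the layout is fixed, but it says nothing about what the date means, which is the limitation the later slides on semantics address.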

Traditional TextCategorization

StatisticalAI Techniques

Classify Place ina taxonomy

feed

Customer Training

Set

RoutingDistribution

Customer Article Feed

4715

Standard Metadata

Feed Source iSyndicate

Posted Date 11202000

Classification of Article 4715

Knowledge-base amp StatisticalAI Techniques

ClassifyPlace ina taxonomy

MetadataCatalog

Content Manager

Precise syndicationfiltering

fd

Article 4715 MetadataFeed Source iSyndicatePosted Date 11202000 Company Name France Telecom

EquantTicker Symbol FTE ENTExchange NYSETopic Company News

Standard metadata

Semantic metadata

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee's Categorization & Automatic Metadata Creation

Taalee Training Set

Customer Training Set

ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis

Classification of Article 4715

Article Feed 4715

Routing/Distribution

Map to another taxonomy

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
  • (Statistical) Information Retrieval
  • Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
  • Logic Based (Description Logic)
  • Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods, Cluster Analysis

Learning/AI and Collab. Filtering

Reference data, Concept-terms, Dictionary, Thesaurus: by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

  • Standardize meaning, description, representation of involved attributes
  • Capture the semantics involved via domain characteristics
  • Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
  • Attributes
  • Domain Rules
  • Functional Dependencies

HP 59

An Ontology

Example Interrelated ontologies

[Figure: interrelated ontologies. Recoverable labels:]

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE (shredding, magnetic separation, screening, washing)

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked to one another by "causes" relationships

HP 61

Large Vocabularies, Taxonomies, Ontologies

  • WordNet
  • The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

  • traditional queries based on keywords
  • attribute based queries
  • content-based queries

HP 64

Oingo.com

  • Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
  • Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
  • Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
  • Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
  • Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

  • Precisely what the user asked for
  • Closely-related, high-value information beyond what was requested
  • Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort.

Intelligent content can be more effectively managed, packaged and distributed.

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

  • Context of the Content
  • Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
  • Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" Content can only be "found" if the user enters a keyword that exists within it.

Intelligent Content = Content + Metadata + Relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

  • Context of the Content
  • Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
  • Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

  • Metadata can improve search significantly, but metadata enables much more than search
  • Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage Search on football touchdown

Jimmy Smith Interview Part Seven
Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview Part Four
Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
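The contrast between keyword search and attribute search can be sketched in a few lines. This is an illustrative toy, not the Virage or Taalee implementation; the `Asset` class, the function names and the lower-case metadata keys are all assumptions, and the records reuse the three assets shown on this slide.

```python
from dataclasses import dataclass, field

@dataclass
class Asset:
    title: str
    description: str
    metadata: dict = field(default_factory=dict)  # attribute name -> value or list

def keyword_search(assets, *words):
    """Return assets whose title or description contains every word."""
    hits = []
    for asset in assets:
        text = f"{asset.title} {asset.description}".lower()
        if all(word.lower() in text for word in words):
            hits.append(asset)
    return hits

def attribute_search(assets, **criteria):
    """Return assets whose metadata matches every attribute=value pair."""
    def matches(asset):
        for key, wanted in criteria.items():
            value = asset.metadata.get(key)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [asset for asset in assets if matches(asset)]

ASSETS = [
    Asset("Jimmy Smith Interview Part Seven",
          "Jimmy Smith explains his philosophy on showboating"),
    Asset("Brian Griese Interview Part Four",
          "Brian Griese talks about the first touchdown he ever threw"),
    Asset("Baltimore 31, Pit 24",
          "Quandry Ismail and Tony Banks hook up for their third long touchdown",
          {"league": "Professional", "teams": ["Ravens", "Steelers"],
           "players": ["Quandry Ismail", "Tony Banks"], "event": "Touchdown"}),
]

print([a.title for a in keyword_search(ASSETS, "touchdown")])
print([a.title for a in attribute_search(ASSETS, event="Touchdown", teams="Steelers")])
```

The keyword query returns any asset that merely mentions a touchdown, while the attribute query returns only the asset catalogued as a touchdown event involving the Steelers, a team whose name never appears in the asset's text.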

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

  • Context and Domain Specific Attributes
  • Uniform Metadata for Content from Multiple Sources
  • Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries amp Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…

2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…

3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

  • Buy Al Pacino Videos
  • Buy Russell Crowe Videos
  • Buy Christopher Plummer Videos
  • Buy Diane Venora Videos
  • Buy Philip Baker Hall Videos
  • Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure: Web extreme personalization. Recoverable labels:]

Inputs: Realtime Feeds; Web sites and Pages; Content Databases; Interests, Preferences

Time-Shifted Content Aggregator

Semantic Engine™

Structured Hi-Quality Semantic Metabase

Output: Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks, News, Sports, Music, MyMedia, $

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks, News, Sports, Music, MyMedia, $

My Stocks: CSCO, NT, IBM, Market

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks, News, Sports, Music, MyMedia, $

My Stocks: CSCO, NT, IBM, Market

CSCO: Analyst Call, Conf Call, Earnings

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks, News, Sports, Music, MyMedia, $

My Stocks: CSCO, NT, IBM, Market

CSCO: Analyst Call, Conf Call, Earnings

CSCO Analysis: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.

HP 87

iTV: Taalee's Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine™

Meta-Data Tagged Content

Content "Programs"

Immediate Interests, Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable, with interactivity features using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support (Sony)

Processing: Categorize, Catalog, Integrate

Collection: Network Content, Affiliate Feeds, Public Sources

Rich Data Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

  • Gore Demands That Recount Restart (9:40 PM)
  • Gore Says Fla. Can't Name Electors (4:50 PM)
  • Bush Meets Colin Powell at Ranch (1:22 PM)
  • Market Tumbles on Earnings Warning (9:27 AM)
  • Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

  • Value-add for production, broadcast & syndication
  • Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
  • Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

  • Gore Demands That Recount Restart
  • Gore Says Fla. Can't Name Electors
  • Bush Meets Colin Powell at Ranch
  • Market Tumbles on Earnings Warning
  • Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Figure: recoverable labels]

Enhanced Digital Cable: Video, MPEG Encoder (MPEG-2/4/7), MPEG Decoder

Create Scene Description Tree; Retrieve Scene Description Track

Node = AVO Object

Enhanced XML Description: "Cisco Systems" Node

Taalee Semantic Engine

Object Content Information (OCI): Metadata-rich Value-added Node

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

GREAT USER EXPERIENCE

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content

  • Extractor Agents: content which does contain the words the user asked for
  • + Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
  • + Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

  • If a keyword is not in the content, it cannot be found
  • The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (cf. COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
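The enrichment step described above can be sketched as a lookup against a small knowledge base. This is a toy under stated assumptions, not Taalee's Semantic Engine: the `KNOWLEDGE_BASE` layout, the relation names and the `enrich` function are hypothetical, and the facts are only those that appear in the slides.

```python
# Hypothetical knowledge base: (subject, relation) -> object.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Jamal Anderson", "plays_for"): "Atlanta Falcons",
    ("Atlanta Falcons", "member_of"): "NFL",
}

def enrich(metadata, kb=KNOWLEDGE_BASE):
    """Return a copy of metadata with team, sport and league added where known.

    metadata maps attribute names to lists of values; only "players" is read.
    """
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        team = kb.get((player, "plays_for"))
        sport = kb.get((player, "plays_sport"))
        league = kb.get((team, "member_of")) if team else None
        for key, value in (("teams", team), ("sports", sport), ("leagues", league)):
            # Add only facts the knowledge base actually holds, without duplicates.
            if value and value not in enriched.setdefault(key, []):
                enriched[key].append(value)
    return enriched

extracted = {"players": ["Roger Clemens"]}   # what a syntactic extractor finds
print(enrich(extracted))
```

After enrichment, a search for the team name can match the story even though the story text never mentions the team.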

HP 96

Guided Demo for Value Added Metadata - Example One

  • Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
  • Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
  • Here is what you see:
    Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
  • Now click on the button to play the asset (button marked "REAL")
  • View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
  • Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
  • Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
  • Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

  • Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
  • Click on the first result (titled "I want out") & view the metadata on the following RMR page
  • Here is what you see:
    Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
  • Now click on the button to play the asset (button marked "REAL")
  • View the source HTML page that has the original story and locate this story with the heading "I want out"
  • Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
  • Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
  • Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

  • Search for 'Jamal Anderson' in 'Football'
  • Click on first result for Jamal Anderson
  • View metadata. Note that Team name and League name are also included in the metadata
  • View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

  • Search for 'Gary Sheffield' in 'Baseball'
  • Click on first result for Gary Sheffield
  • View metadata. Note that Team name and League name are also included in the metadata
  • View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

  • Producer Name: name of content provider that produced the asset
  • Posted Date: date of asset posting - extracted automatically
  • Player Names: names of players mentioned explicitly in the asset - extracted automatically
  • Team Name: name of team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships
  • League Name: name of league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations
  • Sport: name of sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

  • Traditional search engines rely solely on (syntactic) keywords to find content
  • They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
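A minimal sketch of how typed relationships let a system return associated content follows. It is not Taalee's Semantic Content Model; the `FACTS` triples restate the slide's example, while the relation names, the story records and both functions are assumptions made for illustration.

```python
# Typed facts as (subject, relation, object) triples, taken from the example above.
FACTS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]

SYMMETRIC = {"competes_with"}  # assumption: competition holds in both directions

def related(entity, relation, facts=FACTS):
    """Return entities linked to `entity` by `relation`."""
    found = []
    for subject, rel, obj in facts:
        if rel != relation:
            continue
        if subject == entity:
            found.append(obj)
        elif obj == entity and rel in SYMMETRIC:
            found.append(subject)
    return found

def associated_stories(entity, stories, facts=FACTS):
    """Split stories into those asked for and those about competitors."""
    competitors = related(entity, "competes_with", facts)
    asked = [s for s in stories if entity in s["companies"]]
    extra = [s for s in stories
             if s not in asked and any(c in s["companies"] for c in competitors)]
    return asked, extra

# Hypothetical story records, for illustration only.
STORIES = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on an unrelated company", "companies": ["Other Co"]},
]

asked, extra = associated_stories("Commerce One", STORIES)
print([s["title"] for s in asked], [s["title"] for s in extra])
```

A keyword index alone would return only the first list; the second list exists because the competitor relationship is stored as data rather than left implicit in the text.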

HP 103

Example (test on http://directory.mediaanywhere.com)

  • Search for company 'Commerce One'
  • Links to news on companies that compete against Commerce One
  • Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
  • Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Figure: recoverable labels]

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Extractor Agent 1, Extractor Agent 2, Extractor Agent 3

Semantic Engine, World Model, Taalee Metabase

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard

Story on Lucent; Story on Cisco

XCM-compliant metadata, XML or other format

Semantic Application (ASP/Enterprise hosted)

Third-party Content Mgmt and Syndication

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY
  • Related Stock News
  • Industry News
  • Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
  • Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
  • Technology, Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

  • Structural modeling is obviously not enough
  • We need a "logic layer" on top of RDF
  • Some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data.

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
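To see what a rule layer over RDF-style data buys, here is a minimal forward-chaining sketch. It uses plain Python tuples rather than an RDF library, implements only two subclass rules rather than a description logic, and the class and instance names (`organism`, `giraffe`) are made up for illustration.

```python
def infer(triples):
    """Forward-chain two rules until no new triple appears.

    Rule 1: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    Rule 2: (x type A),       (A subClassOf B) => (x type B)
    """
    known = set(triples)
    while True:
        new = set()
        for subject, predicate, obj in known:
            if predicate not in ("subClassOf", "type"):
                continue
            # Follow every subClassOf edge that starts at this triple's object.
            for s2, p2, o2 in known:
                if p2 == "subClassOf" and s2 == obj:
                    new.add((subject, predicate, o2))
        if new <= known:          # fixed point: nothing new was derived
            return known
        known |= new

DATA = {
    ("herbivore", "subClassOf", "animal"),
    ("animal", "subClassOf", "organism"),
    ("giraffe", "type", "herbivore"),
}

for triple in sorted(infer(DATA) - DATA):
    print(triple)   # only the derived triples
```

The stored data never states that the instance is an organism; that statement is derived, which is the kind of inference the slide argues structural modeling alone cannot provide.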

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)

  • A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
  • Based on RDF+XML
  • Agent-readable Tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>

DAML and OIL ndash Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
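The rule above can be sketched as a predicate over two event records. This is not the InfoQuilt implementation. The assumptions: dates differ in days, distance is a great-circle (haversine) distance in miles, and the earthquake must not precede the test (the prose says "after", although the rule as written only bounds the difference). The `Event` class and function names are hypothetical, and the coordinates in the usage lines are illustrative values, not data from the slides.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees

def date_difference(earlier, later):
    """Whole days from `earlier` to `later` (negative if out of order)."""
    return (later - earlier).days

def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, with the thresholds from the rule."""
    days = date_difference(test.event_date, quake.event_date)
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles

test = Event(date(1998, 5, 11), 27.0, 71.7)    # illustrative values
quake = Event(date(1998, 5, 30), 37.1, 70.1)   # illustrative values
print(may_have_caused(test, quake))
```

Keeping the thresholds as parameters makes it easy to tighten "nearby region" when exploring the data, which is how such a relationship would typically be refined.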

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
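The grouping query above can be sketched as follows. The band boundaries come from the slide, but treating each band as half-open (lower bound included, upper bound excluded) is an assumption, as are the function names and the sample records, which are made up purely to exercise the code.

```python
from collections import Counter

# Magnitude bands from the slide; half-open intervals [low, high) are an assumption.
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]

def band_of(magnitude):
    """Return the band containing `magnitude`, or None if it is below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None

def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per band over [first_year, last_year].

    quakes is an iterable of (year, magnitude) pairs.
    """
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if band is not None and first_year <= year <= last_year:
            counts[band] += 1
    years = last_year - first_year + 1
    return {band: counts[band] / years for band in BANDS}

# Made-up sample records (year, magnitude).
SAMPLE = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.4), (1901, 5.0), (1899, 6.0)]
print(average_per_year(SAMPLE, 1900, 1901))
```

Running the same function over the years before and after a chosen cut-off year gives the per-band comparison that the slide's conclusion relies on.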

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake.

Demo

ResourcesReferences

  • RDF: www.w3.org/TR/REC-rdf-syntax
  • ICE: www.icestandard.org
  • Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
  • XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
  • DAML: www.daml.org
  • NEWSML: newsshowcase.reuters.com
  • PRISM: www.prismstandard.org/techdev/prismspec1.asp
  • XCM: www.vignette.com
  • OIL: www.ontoknowledge.org/oil
  • SEMANTICWEB: www.semanticweb.org
  • VOICEXML: www.voicexml.org
  • MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
  • Taalee: www.taalee.com
  • Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 8

Metadata for Digital Data

Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al.] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
Topic Change Indices [Chen et al.] | Audio | Direct Content Based
Document Vectors [Deerwester et al.] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al.] | Text | Domain Independent
Contexts [Sciore et al., Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al.] | Structured | Domain Specific
User's Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific

HP 9

Types of Specs and Standards (or MetaModels)

Domain Independent: (MCF), RDF, MOF, Dublin Core

Media Specific: MPEG4, MPEG7, VoiceXML

Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)

Application Specific: ICE (Syndication)

Exchange/Sharing: XCM, XMI

Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)

HP 10

What RDF can do for metadata

Designed to impose structural constraints on syntax to support consistent encoding, exchange, and processing of metadata

Domain-independent metadata standard

HP 11

RDF (Resource Description Framework)

Property, Value, Resource

• RDF data consists of nodes and attached attribute/value pairs

• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata

• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances

HP 12

RDF Example 1

[Figure: RDF graph. URI:TALK --dc:title--> "Mysteries of Metadata"; URI:TALK --dc:creator--> URI:AMIT]

<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
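The two statements in this example can be read back out as (resource, property, value) triples. The following is a minimal Python sketch, assuming the standard-library ElementTree parser and the current W3C RDF and Dublin Core 1.1 namespace URIs rather than the older ones shown on the slide. The function name extract_triples is illustrative and is not part of any RDF toolkit, and the sketch only handles flat rdf:Description blocks.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs (current W3C RDF and Dublin Core 1.1).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

RDF_XML = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(xml_text):
    """Return (subject, property, value) tuples for flat rdf:Description blocks."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # A property value is either another resource or an atomic literal.
            resource = prop.get(f"{{{RDF_NS}}}resource")
            value = resource if resource is not None else (prop.text or "").strip()
            triples.append((subject, prop.tag, value))
    return triples


if __name__ == "__main__":
    for triple in extract_triples(RDF_XML):
        print(triple)
```

Each property comes back with its namespace expanded in ElementTree's {uri}name form, which keeps dc:title distinct from a title property in any other vocabulary.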

HP 13

RDF Example 2

[Figure: RDF graph. URI:TALK has dc:title "Mysteries of Metadata" and dc:creator URI:AMIT. URI:AMIT has BIB:Name "Amit Sheth", BIB:Email amit@taalee.com, and BIB:Aff URI:LIB.]

HP 14

RDFS (RDF Schema)

Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, …)

Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties

HP 15

RDF Based Web

[Figure: layers of an RDF-based Web: HTML resources, RDF/XML descriptions, RDF schemas]

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International, inter-discipline, W3C community consensus

"Semantic" interface among resource description communities (very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<xml>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
    DC:Title = "I've Never Metadata I've Never Liked"
    DC:Creator = "Mary Crystal"
    DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content.

NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.

Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.

HP 20

NewsML…

It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.

Accurate, objective set of description tools which help qualify the information and make the search more precise.

NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata

Version 1.0, April 2001

Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)

Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"

Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML

Designed to be used in a simple, straightforward way over the Internet

Compatible with NewsML

Integrates easily with ICE (for syndication)

Vocabulary: Basic DC; Extensions; "Controlled Vocabularies", e.g., "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice-specific metadata

Supports syntactic interoperability

Text data to voice data

VoiceXML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing; one-number "find-me" services

Intranet - inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are:

Digital libraries (image catalog, musical dictionary)

Multimedia directory services (e.g., yellow pages)

Broadcast media selection (radio channel, TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible content syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define:

• roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

• format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol.

Syndicator: A content aggregator and distributor.

Subscriber: A content consumer.

Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement.

Collection: The current content of a subscription.

ICE Package: A delivery of commands to update a collection, such as the addition of content items.

ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information.

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC …(ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

[Callout on the comic-strip element: Content (domain-specific metadata)]
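The split between content-independent (ICE) metadata and domain-specific metadata in this payload can be made concrete with a short Python sketch. It assumes a simplified payload modeled on the slide (no DOCTYPE, placeholder image data) that has not been validated against the ICE DTD; the function name split_metadata is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified payload modeled on the slide; not validated against the ICE DTD.
PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">(encoded image data)</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """Separate each item's ICE attributes from its domain-specific metadata."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        protocol = dict(item.attrib)  # content-independent (ICE) metadata
        # Child elements come from the industry vocabulary, not from ICE.
        domain = {child.tag: dict(child.attrib) for child in item}
        items.append({"protocol": protocol, "domain": domain})
    return items


if __name__ == "__main__":
    for entry in split_metadata(PAYLOAD):
        print("ICE:", entry["protocol"])
        print("Domain:", entry["domain"])
```

A subscriber could route on the protocol attributes alone and hand the domain part to an application that understands the comic-strip vocabulary.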

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

[Figure: the three XCM segments and the areas shown under them]

Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management

Application Content Management: Metadata Management, Recombination, Personalization

Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…

Kansas State
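One common way to reconcile two such models is a crosswalk that maps each model's tag names onto shared field names. The Python sketch below is a minimal illustration under stated assumptions: the shared field names (keywords, title, url, and so on) and the function normalize are hypothetical and are not part of FGDC or UDK, and the UDK tag is written here as "Address Id".

```python
# Hypothetical crosswalk from each model's tag names to shared field names.
CROSSWALK = {
    "FGDC": {
        "Theme keywords": "keywords",
        "Title": "title",
        "Online linkage": "url",
        "Direct Spatial Reference Method": "spatial_method",
        "Horizontal Coordinate System Definition": "coordinate_system",
    },
    "UDK": {
        "Search terms": "keywords",
        "Topic": "title",
        "Address Id": "url",
        "Measuring Techniques": "spatial_method",
        "Co-ordinate System": "coordinate_system",
    },
}


def normalize(model, record):
    """Rename a record's tags to the shared field names; unmapped tags are dropped."""
    mapping = CROSSWALK[model]
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": "digital line graph, hydrography, transportation",
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": "digital line graph, hydrography, transportation",
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

if __name__ == "__main__":
    # Both records describe the same data set once the tag names are aligned.
    print(normalize("FGDC", fgdc_record) == normalize("UDK", udk_record))
```

A crosswalk only resolves naming differences; differences in meaning or granularity between fields still need a domain model or ontology.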

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services / Taalee Content Applications

Where is the content? Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate, Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc.
Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

FormsTypesIngest of Content

Sources: Web sites, content feeds, and private repositories

Types: Text, Graphics, Audio, Video, Multimedia

Forms: Unstructured text, semi-structured text, structured text (+ media); static or dynamic

Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content HandlingIngest

Infrastructure/Exchange

Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

[Figure: METADATA EXTRACTORS operate over Global/Enterprise Web Repositories containing Digital Maps, Nexis/UPI/AP, Documents, Digital Audios, Data Stores, Digital Videos, and Digital Images]

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
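A layout rule of this kind translates naturally into a regular expression. The Python sketch below is one possible reading of the rule, assuming "day" means a weekday name and the two integers are the day of the month and the year; commas are treated as optional so that the pattern also matches text whose punctuation was lost. The names DATE_RULE and extract_date are illustrative.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Date => day month int ',' int   (commas made optional in this sketch)
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return {
        "day": match.group("day"),
        "month": match.group("month"),
        "day_of_month": int(match.group("dom")),
        "year": int(match.group("year")),
    }


if __name__ == "__main__":
    header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
    print(extract_date(header))
```

A purely syntactic rule like this finds the date but says nothing about what the date refers to; that is where the semantic approaches on the following slides come in.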

Traditional Text Categorization

[Figure: a customer article feed (Article 4715) is classified (placed in a taxonomy) using statistical/AI techniques and a customer training set, then passed to routing/distribution]

Standard Metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Classification of Article 4715

Knowledge-base & Statistical/AI Techniques

[Figure: classify (place in a taxonomy) -> Metadata Catalog -> Content Manager -> precise syndication/filtering]

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite
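The step from standard metadata to semantic metadata can be illustrated with a toy knowledge-base lookup. This Python sketch is not Taalee's patented extraction method; it only assumes a small hand-built knowledge base (using the company, ticker, and exchange values shown for Article 4715) and a hypothetical article sentence, and matches company names by plain substring search.

```python
# Hypothetical knowledge base: company name -> known facts about the entity.
KNOWLEDGE_BASE = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}


def enrich(standard_metadata, text, knowledge_base=KNOWLEDGE_BASE):
    """Add semantic metadata for every known company mentioned in the text."""
    companies = [name for name in knowledge_base if name in text]
    enriched = dict(standard_metadata)  # leave the input unchanged
    enriched["company_name"] = companies
    enriched["ticker_symbol"] = [knowledge_base[n]["ticker"] for n in companies]
    enriched["exchange"] = sorted({knowledge_base[n]["exchange"] for n in companies})
    return enriched


if __name__ == "__main__":
    standard = {"feed_source": "iSyndicate", "posted_date": "11/20/2000"}
    # Hypothetical article text, for illustration only.
    article = "France Telecom said it will combine its data unit with Equant."
    print(enrich(standard, article))
```

The ticker symbols never appear in the article text; they come from the knowledge base, which is what distinguishes enrichment from keyword tagging.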

Taalee's Categorization & Automatic Metadata Creation

[Figure: the article feed (Article 4715) is classified using both a Taalee training set and a customer training set, passed to routing/distribution, and can be mapped to another taxonomy]

Classification of Article 4715

ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN

[Figure labels: Video Segment with Associated Text; Segment Description; Semantic Metadata; Auto Categorization]

HP 50

Automatic Categorization & Metadata Tagging (Web page)

[Figure labels: Video with Editorialized Text on the Web; Auto Categorization; Semantic Metadata]

HP 51

Automatic Categorization & Metadata Tagging (Feed)

[Figure labels: Text from Bloomberg; Auto Categorization; Semantic Metadata]

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference

(Statistical) (Information Retrieval)

Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)

Logic Based (Description Logic)

Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods/Cluster Analysis

Learning/AI and Collab. Filtering

Reference data/Concept-terms/Dictionary/Thesaurus, by topic/industry/subject/domain (word or phrase)

Ontologies/Domain Models

KnowledgeBase, by entities and relationships

(in order of deeper understanding)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes

Capture the semantics involved via domain characteristics

Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: diagram of interrelated ontologies; the concept labels are listed below]

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING

LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, with "causes" relationships among them
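An ontology differs from a flat taxonomy in that it carries named relationships such as "causes" alongside the category hierarchy. The Python sketch below models the NATURAL DISASTER fragment as two small dictionaries. The concept names come from the figure, but the specific "causes" edges are assumptions made for illustration, since the figure's arrows are not recoverable from the text; the function names are hypothetical.

```python
from collections import deque

# Category hierarchy (is-a): every listed event is a NATURAL DISASTER.
IS_A = {name: "NATURAL DISASTER" for name in (
    "EARTHQUAKE", "LANDSLIDE", "VOLCANO", "STORM",
    "FLOOD", "FIRE", "AVALANCHE", "TSUNAMI")}

# Assumed "causes" edges, for illustration only.
CAUSES = {
    "EARTHQUAKE": ["TSUNAMI", "LANDSLIDE"],
    "TSUNAMI": ["FLOOD"],
    "STORM": ["FLOOD"],
    "VOLCANO": ["FIRE"],
}


def is_a(concept, category):
    """Follow is-a links upward and report whether category is reached."""
    while concept in IS_A:
        concept = IS_A[concept]
        if concept == category:
            return True
    return False


def downstream_effects(concept):
    """Breadth-first walk over 'causes' edges, returning effects in visit order."""
    seen, order, queue = {concept}, [], deque([concept])
    while queue:
        current = queue.popleft()
        for effect in CAUSES.get(current, []):
            if effect not in seen:
                seen.add(effect)
                order.append(effect)
                queue.append(effect)
    return order


if __name__ == "__main__":
    print(downstream_effects("EARTHQUAKE"))
```

With relationships like these, a query about earthquakes could also surface content tagged with tsunamis or floods, which a category match alone would miss.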

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

Traditional queries based on keywords

Attribute-based queries

Content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction

Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content

Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from Typical Cataloging of Football Assets

Virage search on "football touchdown":

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee Metadata on Football Assets

Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
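The contrast between the two kinds of search can be sketched in a few lines of Python. The records below paraphrase the assets on this slide; the field names and the functions keyword_search and attribute_search are illustrative assumptions, not the interface of Virage or of Taalee's products.

```python
# Two assets with only title/description, and one with semantic attributes.
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating"},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw"},
    {"title": "Baltimore 31, Pit 24",
     "description": "Ismail and Banks hook up for their third long touchdown",
     "league": "Professional", "teams": ["Ravens", "Steelers"],
     "players": ["Quandry Ismail", "Tony Banks"], "event": "Touchdown"},
]


def keyword_search(assets, keyword):
    """Match assets whose title or description contains the keyword."""
    needle = keyword.lower()
    return [a for a in assets
            if needle in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets, **criteria):
    """Match assets whose metadata satisfies every field=value criterion."""
    def matches(asset):
        for field, wanted in criteria.items():
            value = asset.get(field)
            if isinstance(value, list):
                if wanted not in value:
                    return False
            elif value != wanted:
                return False
        return True
    return [a for a in assets if matches(a)]


if __name__ == "__main__":
    print([a["title"] for a in keyword_search(ASSETS, "touchdown")])
    print([a["title"] for a in attribute_search(ASSETS, event="Touchdown", teams="Ravens")])
```

The keyword query returns an interview that merely mentions a touchdown, while the attribute query returns only the asset whose event is a touchdown involving the Ravens.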

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and domain-specific attributes

Uniform metadata for content from multiple sources

Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory: Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory: Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content

Personalization and targeting

Interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION: Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp

2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU

3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure: real-time feeds, Web sites and pages, and content databases feed a time-shifted content aggregator; the Semantic Engine™ builds a structured, high-quality Semantic Metabase; the user's interests and preferences select personalized content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO content types: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO content types: Analyst Call, Conf Call, Earnings; CSCO Analysis listing: 11/08 ON24 Payne, 11/07 ON24 H&Q, 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.

HP 87

iTV: Taalee's Extreme Personalization

[Figure: a Content Provider (DBS, DISH, Wink, AOL-TV) supplies content ("programs"); the Semantic Engine™ produces metadata-tagged content held in a structured, high-quality Semantic Metabase; the user's immediate interests and preferences drive personalized content capsules, redirects, and programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support (Sony)

Collection and Processing: Categorize, Catalog, Integrate

Sources: Network Content, Affiliate Feeds, Public Sources, Rich Data

Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Video: Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) - 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE]

Taalee Semantic Engine: a node (e.g. "Cisco Systems") receives an enhanced XML description as Object Content Information (OCI), giving a metadata-rich, value-added node.

Example OCI:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: Jon Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through video server vendors, video app servers and broadcasters

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, the content cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
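The enrichment step described on this slide can be sketched in a few lines of Python. This is a minimal illustration only, not Taalee's implementation: the `KNOWLEDGE_BASE` dictionary, the field names and the `enrich` and `matches` helpers are all hypothetical, and a real system would use a far richer domain model.

```python
# Hypothetical mini knowledge base: player -> facts known about that player.
# The entries mirror examples from the slides; the structure is an assumption.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Team": "New York Yankees", "Sport": "Baseball"},
    "Jamal Anderson": {"Team": "Atlanta Falcons", "League": "NFL"},
}


def enrich(metadata):
    """Return a copy of the metadata with value-added fields for known players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        for field, value in KNOWLEDGE_BASE.get(player, {}).items():
            bucket = enriched.setdefault(field, [])
            if value not in bucket:
                bucket.append(value)
    return enriched


def matches(metadata, query):
    """Case-insensitive keyword match over every metadata value."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values() for value in values)


# Metadata extracted directly from a story that never mentions the team.
extracted = {"Players": ["Roger Clemens"], "Producer": ["ESPN"]}
enriched = enrich(extracted)

print(matches(extracted, "Yankees"))  # keyword-only metadata misses the story
print(matches(enriched, "Yankees"))   # value-added metadata finds it
```

The lookup is keyed on the extracted player name, so the quality of the added metadata depends entirely on the extraction step and on how current the knowledge base is.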

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of content provider that produced the asset

Posted Date: date of asset posting - extracted automatically

Player Names: names of players mentioned explicitly in the asset - extracted automatically

Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships

League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations

Sport: name of sport

Legend: an arrow from X to Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact with it.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the term "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
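As a rough sketch of how typed relationships let a search return associated content, the Python below stores the slide's Commerce One facts as (subject, relation, object) triples and follows the COMPETES WITH relation at query time. The triple layout, the relation names, the story titles and both helper functions are illustrative assumptions, not the Semantic Content Model itself.

```python
# Hypothetical typed relationships, taken from the slide's example.
ASSOCIATIONS = [
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog: story title -> company the story is about.
STORIES = {
    "Story on Commerce One": "Commerce One",
    "Story on Ariba": "Ariba",
    "Story on an unrelated company": "Some Other Co",
}


def related_entities(entity, relation):
    """Entities linked to `entity` by `relation`; competition is symmetric."""
    found = set()
    for subject, rel, obj in ASSOCIATIONS:
        if rel != relation:
            continue
        if subject == entity:
            found.add(obj)
        elif obj == entity and relation == "competes_with":
            found.add(subject)
    return sorted(found)


def search_with_associations(company):
    """Return what was asked for plus news on the company's competitors."""
    competitors = related_entities(company, "competes_with")
    return {
        "asked_for": [t for t, c in STORIES.items() if c == company],
        "competitor_news": [t for t, c in STORIES.items() if c in competitors],
    }


print(search_with_associations("Commerce One"))
```

A keyword index alone would return only the first list; the second list exists because the relationship is stored explicitly rather than inferred from co-occurring words.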

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources:
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g. Reuters)

Components: Extractor Agents 1, 2 and 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); third-party content management and syndication; XCM-compliant metadata in XML or other format

Flow:
1. Cisco story from PW Source 1 is passed on to add semantic associations
2. The Knowledge Base is consulted for Cisco's competition
3. Result returned: Lucent is a competitor of Cisco
4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard

Output: Story on Cisco; Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY is associated with:

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

Technology/Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear tests may cause earthquakes. Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
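The rule above can be read as a predicate over one nuclear test record and one earthquake record. The sketch below is one possible Python reading, under stated assumptions: dates are compared in days, distance is great-circle distance in miles computed with the haversine formula, the earthquake must not precede the test, and records are plain dictionaries with the attribute names used in the rule. The sample records are hypothetical, and none of this is the InfoQuilt implementation.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test by < max_days and is < max_miles away."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days_after < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


def find_pairs(tests, quakes, **limits):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes
            if could_have_caused(t, q, **limits)]


# Hypothetical records, for illustration only.
test = {"eventDate": date(1998, 5, 11), "latitude": 27.1, "longitude": 71.8}
quake_soon = {"eventDate": date(1998, 5, 30), "latitude": 37.1, "longitude": 70.1}
quake_late = {"eventDate": date(1998, 7, 1), "latitude": 37.1, "longitude": 70.1}

print(len(find_pairs([test], [quake_soon, quake_late])))
```

Both thresholds are parameters because the slide's values (30 and 10000) are loose. A 10000-mile radius covers most of the globe, so a tighter limit would make the rule more selective.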

Knowledge Discovery - Example

When was the first recorded nuclear test conducted? 1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Result: increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Result: number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
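The grouping query on this slide amounts to binning earthquakes by magnitude range and counting per year. A minimal Python sketch follows, assuming each earthquake is a (year, magnitude) pair and treating each range as closed on the left and open on the right. The bin boundaries come from the slide; the helper names and sample data are hypothetical.

```python
from collections import Counter

# Magnitude ranges from the slide: 5.8-6, 6-7, 7-8, 8-9 and above 9.
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def bin_label(magnitude):
    """Label of the range containing the magnitude, or None if below 5.8."""
    for low, high in BINS:
        if low <= magnitude < high:
            return f">={low}" if high == float("inf") else f"{low}-{high}"
    return None


def yearly_counts(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year onward."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive span."""
    total = sum(counts[(year, label)]
                for year in range(first_year, last_year + 1))
    return total / (last_year - first_year + 1)


# Hypothetical sample data, for illustration only.
quakes = [(1899, 7.5), (1950, 5.9), (1950, 6.5), (1950, 6.9),
          (1951, 6.1), (1951, 5.0), (1952, 9.2)]
counts = yearly_counts(quakes)
print(average_per_year(counts, "6.0-7.0", 1950, 1952))
```

Comparing such averages before and after a given year is what the slide's conclusion rests on. The sketch only counts; it does not test whether a difference is statistically meaningful.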

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lies within a radius of 10000 miles of the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 9

Types of Specs and Standards (or MetaModels)

Domain Independent: (MCF) RDF, MOF, Dublin Core

Media Specific: MPEG4, MPEG7, VoiceXML

Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)

Application Specific: ICE (Syndication)

Exchange/Sharing: XCM, XMI

Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)

HP 10

What RDF can do for metadata

Designed to impose structural constraints on syntax to support consistent encoding, exchange and processing of metadata

Domain-independent metadata standard

HP 11

RDF (Resource Description Framework)

Resource - Property - Value

• RDF data consists of nodes and attached attribute/value pairs

• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata

• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances

HP 12

RDF Example 1

URI:TALK -- dc:creator --> URI:AMIT
URI:TALK -- dc:title --> "Mysteries of Metadata"

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
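To make the resource/property/value model concrete, the following Python sketch reads an RDF/XML description like the one above into (resource, property, value) triples using only the standard library. It handles just the two patterns in this example (literal values and `rdf:resource` references) and is not a general RDF parser. The namespace URIs in the code are the standard RDF and Dublin Core 1.1 ones, chosen here as an assumption, and the `triples` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

RDF_XML = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Return (resource, property, value) triples from simple RDF/XML."""
    root = ET.fromstring(xml_text)
    result = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        resource = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree tags look like "{namespace}localname".
            _, _, local_name = prop.tag[1:].partition("}")
            target = prop.get(f"{{{RDF_NS}}}resource")
            value = target if target is not None else (prop.text or "").strip()
            result.append((resource, local_name, value))
    return result


for triple in triples(RDF_XML):
    print(triple)
```

The value of `dc:creator` is itself a resource (URI:AMIT), so further properties can be attached to it, which is the step RDF Example 2 takes.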

HP 13

RDF Example 2

URI:TALK -- dc:title --> "Mysteries of Metadata"
URI:TALK -- dc:creator --> URI:AMIT
URI:AMIT -- BIB:Name --> "Amit Sheth"
URI:AMIT -- BIB:Email --> "amit@taalee.com"
URI:AMIT -- BIB:Aff --> URI:LIB

HP 14

RDFS (RDF Schema)

Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce...)

Vocabulary (in RDFS) = the meaning, characteristics and relationships of a set of properties

HP 15

RDF Based Web

HTML

Resources

RDFXMLDescriptions

RDFSchemas

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International, inter-discipline, W3C community consensus

"Semantic" interface among resource description communities (very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<xml>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
    DC:Title = "I've Never Metadata I've Never Liked"
    DC:Creator = "Mary Crystal"
    DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content

NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries

Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE

HP 20

NewsML...

It can be used by news providers to combine their pictures, video, text, graphics and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television and any other device

It provides an accurate, objective set of description tools which help qualify the information and make the search more precise

NewsML allows a range of metadata to be attached to a multi-media story, including a detailed computer-readable description of what an item is about

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc. and updated hourly.

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata
Version: 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
  Basic: DC
  Extensions: "Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source httpwwwvoicexmlorg

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number "find-me" services

Intranet - inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source httpwwwvoicexmlorg

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000, submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define:

• roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

• format and method of content exchange (e.g. sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source httpwwwicestandardorg

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321"
                name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

The comic-strip element is the content (domain-specific metadata)

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management

Application Content Management: Metadata Management, Recombination, Personalization

Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (example data: Kansas State)

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
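One simple way to bridge two metadata models that name the same fields differently is an explicit tag-to-tag mapping. The Python sketch below covers the five fields shown on this slide, with the UDK tag spelled "Address Id". The mapping table and the `translate` helper are illustrative assumptions, not part of either standard, and real FGDC/UDK interoperability also has to deal with fields that do not correspond one to one.

```python
# Tag correspondences taken from the slide's side-by-side example.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, mapping):
    """Rename tags according to the mapping; unmapped tags are kept as-is."""
    return {mapping.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

udk_record = translate(fgdc_record, FGDC_TO_UDK)
print(udk_record["Topic"])
```

A pairwise table like this grows quadratically with the number of models. Mapping each model to a shared domain model or ontology instead keeps the number of mappings linear, which is one motivation for the semantic approach discussed in the workshop.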

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata:

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services / Taalee Content Applications

Produce/Aggregate, Catalog/Index: Where is the content? Whose is it?

Integrate, Syndicate: What other content is it related to?

Personalize: What is the right content for this user?

Interactive Marketing: What is the best way to monetize this interaction?

Channels: Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taaleersquos Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web sites, content feeds and private repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, semi-structured text, structured text (+ media); static or dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange
Feed Handlers/Crawlers/Screen Scrapers/Bots/Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise Web Repositories

METADATA EXTRACTORS

Sources: Digital Maps; Nexis, UPI, AP; Documents; Digital Audios; Data Stores; Digital Videos; Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int

Traditional Text Categorization

[Diagram labels: Customer Article Feed (Article 4715); Customer Training Set; Statistical/AI Techniques; Classify / Place in a taxonomy; Routing/Distribution]

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

[Diagram labels: Article Feed (Article 4715); Taalee Training Set; Customer Training Set; Knowledge-base & Statistical/AI Techniques; Classify / Place in a taxonomy; Map to another taxonomy; Metadata Catalog; Content Manager; Precise syndication/filtering; Routing/Distribution; Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite]

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Classification of Article 4715
FTE, ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT. THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

[Figure: a video segment with associated text, annotated with its segment description, semantic metadata, and auto-categorization.]

HP 50

Automatic Categorization & Metadata Tagging (Web page)

[Figure: video with editorialized text on the Web, annotated with its auto-categorization and semantic metadata.]

HP 51

Automatic Categorization & Metadata Tagging (Feed)

[Figure: text from Bloomberg, annotated with its auto-categorization and semantic metadata.]

HP 52

Taalee Extraction and Knowledgebase Enhancement

[Figure labels: Web Page, Extraction Agent, Enhanced Metadata Asset.]

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: dictionary, thesaurus, reference data, vocabulary

B. Facts with Relationships: taxonomy (categories), ontology, domain modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical learning/AI (Bayesian, neural networks, HMM, ...)
Logic based (description logic)
Natural language/grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: cluster analysis
Learning/AI and collaborative filtering
Reference data: concept-terms, dictionary, thesaurus (by topic/industry/subject/domain); word or phrase
Ontologies/domain models
Knowledge base: by entities and relationships

[Figure: an arrow labeled "deeper understanding" runs across these alternatives.]

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
Attributes
Domain rules
Functional dependencies

HP 59

An Ontology

Example: interrelated ontologies

[Figure: four interrelated concept hierarchies.]

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; four "causes" relationships link disaster types

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

Traditional queries based on keywords
Attribute-based queries
Content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" content can only be "found" if the user enters a keyword that exists within it

[Figure: an equation of the form "Intelligent Content = ... + ..." with pictured operands.]

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage: Keyword, Attribute, and Content-Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from Typical Cataloging of Football Assets

Virage search on "football touchdown":
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee Metadata on Football Assets

Rich Media Reference Page: Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
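The contrast between keyword search and attribute search can be sketched in a few lines. This is a minimal illustration using the three assets named on this slide; the dictionary layout and the functions keyword_search and attribute_search are hypothetical and are not the Virage or Taalee interfaces.

```python
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31, Pit 24",
     "description": "Ismail and Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional", "Teams": ["Ravens", "Steelers"],
                  "Players": ["Quandry Ismail", "Tony Banks"], "Event": "Touchdown"}},
]


def keyword_search(assets, query):
    """Return titles whose title or description contains any query word."""
    words = query.lower().split()
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if any(word in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Return titles whose metadata satisfies every field=value criterion."""
    hits = []
    for asset in assets:
        def satisfied(field, wanted):
            value = asset["metadata"].get(field)
            return wanted in value if isinstance(value, list) else value == wanted
        if all(satisfied(field, wanted) for field, wanted in criteria.items()):
            hits.append(asset["title"])
    return hits


print(keyword_search(ASSETS, "touchdown"))
print(attribute_search(ASSETS, Event="Touchdown", Teams="Ravens"))
```

The keyword query also returns an interview that merely mentions a touchdown, while the attribute query returns only the asset whose Event field is a touchdown.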

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and domain-specific attributes
Uniform metadata for content from multiple sources
Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory: Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory: Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content: personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

[Screenshot: a personalized page with a PERSONALIZATION panel of personalized queries and a HOT Topics panel; each list ends with a "more..." link.]

Personalized Queries
1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp
2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU
3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream

HOT Topics
1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure labels: real-time feeds, Web sites and pages, and content databases; Semantic Engine(TM); structured, high-quality semantic metabase; time-shifted content aggregator; interests and preferences; personalized content.]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screenshot: wireless home page with menu items MyStocks, News, Sports, Music, MyMedia, $.]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screenshot: menu items MyStocks, News, Sports, Music, MyMedia, $; the My Stocks list shows CSCO, NT, IBM, Market.]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screenshot: menu items MyStocks, News, Sports, Music, MyMedia, $; My Stocks list CSCO, NT, IBM, Market; the CSCO entry expands to Analyst Call, Conf Call, Earnings.]

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screenshot: menu items MyStocks, News, Sports, Music, MyMedia, $; My Stocks list CSCO, NT, IBM, Market; the CSCO entry expands to Analyst Call, Conf Call, Earnings.]

CSCO Analysis
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst

HP 87

iTV: Taalee's Extreme Personalization

[Figure labels: content provider (DBS, DISH, Wink, AOL-TV); content "programs"; Semantic Engine(TM); metadata-tagged content; structured, high-quality semantic metabase; immediate interests and preferences; personalized content capsules; redirects and programming.]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with interactivity features, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Figure labels: Collection (network content, affiliate feeds, public sources); Processing (categorize, catalog, integrate) into a rich data Metabase; Production Support (filter, search, consolidate, personalize, archive, licensing, syndication); Sony.]

HP 90

[Screenshot: a breaking-news page showing a story, a headline list, and the story's metadata.]

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Video: Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

[Screenshot: each headline is followed by a list of video clips, shown as (duration) - date - network.]

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Figure: video passes through an MPEG-2/4/7 encoder that creates a scene description tree (node = AVO object). The Taalee Semantic Engine attaches an enhanced XML description, for example for the "Cisco Systems" node, producing a metadata-rich, value-added node. Over enhanced digital cable, the MPEG decoder retrieves the scene description track, leading to a GREAT USER EXPERIENCE.]

Example Object Content Information (OCI):
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through video server vendors, video app servers, and broadcasters

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation (Extractor Agents): content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

[Figure: a rich media sports asset surrounded by its metadata fields. Legend: an arrow from X to Y means Taalee uses X to add Y as value-added metadata to the asset.]

Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting; extracted automatically
Player Names: names of players mentioned explicitly in the asset; extracted automatically
Team Name: name of team for which player plays; not mentioned explicitly in asset; value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs; not mentioned explicitly in asset; value-added by Taalee's processing based on semantic associations
Sport: name of sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly, fully described in the many ways the users chose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
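The Commerce One example can be sketched as a set of typed relationships plus a lookup that follows them. This is a minimal sketch, assuming a hypothetical KnowledgeBase class and made-up story headlines; only the three facts about Commerce One come from the slide.

```python
from collections import defaultdict


class KnowledgeBase:
    """Stores (subject, relation, object) facts and answers simple lookups."""

    def __init__(self):
        self._facts = defaultdict(set)

    def add(self, subject, relation, obj):
        self._facts[(subject, relation)].add(obj)

    def related(self, subject, relation):
        return sorted(self._facts.get((subject, relation), set()))


kb = KnowledgeBase()
kb.add("Commerce One", "is_a", "COMPANY")
kb.add("Commerce One", "participates_in", "Corporate, Professional & Financial Software")
kb.add("Commerce One", "competes_with", "Ariba")


def associated_stories(kb, company, stories):
    """Split stories into what was asked for and what is semantically related."""
    competitors = set(kb.related(company, "competes_with"))
    asked_for = [s["headline"] for s in stories if company in s["companies"]]
    related = [s["headline"] for s in stories
               if competitors & set(s["companies"]) and company not in s["companies"]]
    return asked_for, related


# Hypothetical stories, for illustration only.
STORIES = [
    {"headline": "Commerce One story", "companies": ["Commerce One"]},
    {"headline": "Ariba story", "companies": ["Ariba"]},
    {"headline": "Unrelated story", "companies": ["Some Other Co"]},
]

print(associated_stories(kb, "Commerce One", STORIES))
```

A keyword index would return only the first list; the second list exists only because the COMPETES WITH relationship is stored explicitly.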

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Figure: Extractor Agents 1-3 read Internal Source 1 (Research), Internal Source 2, and external feeds/Web (e.g., Reuters). They feed the Semantic Engine, which works with the WorldModel and the Taalee Metabase. Output is XCM-compliant metadata, XML, or another format, delivered to a semantic application (ASP/enterprise hosted) and to third-party content management and syndication.]

1. Cisco story from PW Source 1 is passed on to add semantic associations
2. Consults the knowledge base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked For + What You Need to Know

[Figure: a COMPANY at the center, linked to the following kinds of related content.]

Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
Technology/Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic search, personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition. Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of the Semantic Web

Logical Layer: formal semantics and reasoning support - OIL, DAML-O

Schema Layer: definition of vocabulary - RDF Schema

Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake sources (USGS, NEIC)

Nuclear test sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
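The rule above can be written as an executable predicate. This is a sketch under stated assumptions, not the InfoQuilt implementation: dateDifference is taken to be in days and distance in miles (the units used elsewhere in this example), distance is computed with the haversine great-circle formula, and the earthquake is required to occur on or after the test date. The Event class and the sample coordinates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float
    longitude: float


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake <= dateDifference < 30 AND distance < 10000."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles


def candidate_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if may_have_caused(t, q)]


# Made-up events, for illustration only.
test = Event(date(1990, 1, 1), 10.0, 20.0)
quake_soon = Event(date(1990, 1, 15), 12.0, 25.0)
quake_late = Event(date(1990, 3, 1), 12.0, 25.0)
print(len(candidate_pairs([test], [quake_soon, quake_late])))
```

The thresholds are parameters so that the same predicate can be rerun with a tighter time window or radius.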

Knowledge Discovery - Example

When was the first recorded nuclear test conducted? 1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
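The grouping query above can be sketched as a count per year and magnitude range, followed by an average over a span of years. The bin boundaries are treated as half-open intervals (an assumption, since the slide does not say which range a magnitude of exactly 6 or 7 falls into), and the sample records are made up for illustration.

```python
from collections import Counter

# Magnitude ranges from the slide, as half-open intervals [low, high).
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def bin_label(magnitude):
    """Return the label of the range containing the magnitude, or None if below 5.8."""
    for low, high in BINS:
        if low <= magnitude < high:
            return "9+" if high == float("inf") else f"{low}-{high}"
    return None


def yearly_counts(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range), starting from start_year."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive span of years."""
    total = sum(counts[(year, label)] for year in range(first_year, last_year + 1))
    return total / (last_year - first_year + 1)


# Hypothetical (year, magnitude) records, not real seismic data.
QUAKES = [(1901, 5.9), (1901, 6.4), (1902, 6.1), (1902, 7.2), (1950, 6.5), (1899, 8.0)]
counts = yearly_counts(QUAKES)
print(average_per_year(counts, "6.0-7.0", 1901, 1902))
```

Comparing these averages before and after a chosen year for each range is the kind of check the slide's conclusion rests on.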

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

HP 10

what RDF can do for metadata

Designed to impose structural constraints on syntax to support consistent encoding, exchange, and processing of metadata

Domain Independent Metadata standard

HP 11

RDF (Resource Description Framework)

[Figure: Resource, Property, Value]

• RDF data consists of nodes and attached attribute/value pairs

• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata

• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances
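The node and attribute/value model described in these bullets can be sketched in a few lines of Python. This is only an illustration of the data model, not an RDF library: the class names `Resource` and `Statement`, the helper `describe`, the property name `bib:name`, and the placeholder URIs are all assumptions chosen for the example.

```python
from typing import NamedTuple, Union


class Resource(NamedTuple):
    """A node: anything that can be given a URI."""
    uri: str


# A property value is either an atomic literal or another resource.
Value = Union[str, int, float, Resource]


class Statement(NamedTuple):
    """One named property (attribute) attached to a node, with its value."""
    subject: Resource
    prop: str
    value: Value


def describe(statements, node):
    """Collect the attribute/value pairs attached to a single node."""
    result = {}
    for s in statements:
        if s.subject == node:
            result.setdefault(s.prop, []).append(s.value)
    return result


# Hypothetical placeholder URIs, used only for illustration.
talk = Resource("URI:TALK")
amit = Resource("URI:AMIT")
statements = [
    Statement(talk, "dc:title", "Mysteries of Metadata"),  # atomic value
    Statement(talk, "dc:creator", amit),                   # value is a resource
    Statement(amit, "bib:name", "Amit Sheth"),
]
```

Calling `describe(statements, talk)` returns the title literal and the creator resource for the talk node. Because a value may itself be a `Resource`, descriptions chain into a graph.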

HP 12

RDF Example 1

[Figure: RDF graph. URI:TALK --dc:title--> "Mysteries of Metadata"; URI:TALK --dc:creator--> URI:AMIT]

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>

HP 13

RDF Example 2

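[Figure: RDF graph.
URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT
URI:AMIT --BIB:Name--> "Amit Sheth"
URI:AMIT --BIB:Email--> "amit@taalee.com"
URI:AMIT --BIB:Aff--> URI:LIB]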

HP 14

RDFS (RDF Schema)

Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, …)

Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties

HP 15

RDF Based Web

[Figure: layers of an RDF-based Web: HTML resources, RDF/XML descriptions, RDF Schemas]

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International, inter-discipline, W3C community consensus

"Semantic" interface among resource description communities (very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<xml>
<?namespace href="http://w3.org/rdf-schema" as="RDF"?>
<?namespace href="http://metadata.net/DC" as="DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
                 DC:Title="I've Never Metadata I've Never Liked"
                 DC:Creator="Mary Crystal"
                 DC:Subject="Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content.

NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.

Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.

HP 20

NewsML…

It can be used by news providers to combine their pictures, video, text, graphics and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television and any other device.

It offers an accurate, objective set of description tools which help qualify the information and make the search more precise.

NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc. and updated hourly.

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
  Basic: DC
  Extensions: "Controlled Vocabularies", e.g. "North American Industry Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
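To make the structure of such a PRISM/RDF record concrete, the sketch below reads the Dublin Core elements out of an abridged copy of the example using Python's standard `xml.etree.ElementTree`. The function name `dublin_core_fields` and the shortened sample are assumptions for illustration; a production system would more likely use a full RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Abridged copy of the PRISM example (illustrative only).
PRISM_SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core_fields(rdf_xml):
    """Map each described resource URI to its Dublin Core element values."""
    root = ET.fromstring(rdf_xml.encode("utf-8"))
    records = {}
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        about = desc.get(f"{{{RDF_NS}}}about")
        fields = {}
        for child in desc:
            prefix = f"{{{DC_NS}}}"
            if child.tag.startswith(prefix):
                # Strip the namespace part, keeping e.g. "title".
                fields[child.tag[len(prefix):]] = (child.text or "").strip()
        records[about] = fields
    return records


records = dublin_core_fields(PRISM_SAMPLE)
```

Here `records` maps the image URL to a dictionary with `title`, `creator` and `format` keys. Elements from other namespaces (such as PRISM's own) are simply skipped in this sketch.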

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval: news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number find-me services

Intranet: inventory, HR services, corporate portals

Unification: My Whatever, personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 11 May 2000, submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define:

roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; httpwwwinternetweekcomebizapps01ebiz050701-3htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

[Callout on the comic-strip element: Content (domain-specific metadata)]
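The split between content-independent ICE metadata and domain-specific content can be illustrated with a short Python sketch. It parses a simplified payload (DOCTYPE and image data omitted) with the standard library and separates the attributes of each `ice-item` from the domain element it wraps. The function name `split_metadata` and the simplified sample are assumptions; this is not an ICE implementation.

```python
import xml.etree.ElementTree as ET

# Simplified payload modeled on the example; DOCTYPE and image data omitted.
PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" name="Cartoon" filename="demo.gif">
        <comic-strip title="Looney City" author="Amito Pateru" pubdate="20010515"/>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """Separate protocol-level item attributes from the wrapped domain content."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        protocol = dict(item.attrib)  # content-independent (ICE) metadata
        domain = [{"element": child.tag, **child.attrib} for child in item]
        items.append({"protocol": protocol, "domain": domain})
    return items


items = split_metadata(PAYLOAD)
```

In the result, `items[0]["protocol"]` holds the ICE bookkeeping (item id, name, filename), while `items[0]["domain"]` holds the comic-strip vocabulary that only the publishing application understands.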

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development: developing static content and managing the process of its subsequent approval, versioning, storage and retrieval

Application Content Management (Vignette): deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery: delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM

[Figure: eXtended Content Management and its segments: Content Development/Management, Application Content Management, and Content Delivery. Component labels: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management, Metadata Management, Recombination, Personalization, Edge Network Delivery, Streaming Media Delivery, Caching]

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (example data: Kansas State)

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
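One simple way to bridge two models that name the same data differently is a tag crosswalk. The Python sketch below builds such a mapping from the five tag pairs shown on the slide and renames the tags of a record. The function name `translate` and the handling of unmapped tags are assumptions for illustration; real FGDC/UDK integration involves far more than renaming.

```python
# Tag pairs taken from the slide: FGDC tag -> UDK tag.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, mapping):
    """Rename known tags; return (translated, unmapped) so nothing is lost silently."""
    translated, unmapped = {}, {}
    for tag, value in record.items():
        if tag in mapping:
            translated[mapping[tag]] = value
        else:
            unmapped[tag] = value
    return translated, unmapped


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
}
udk_record, leftover = translate(fgdc_record, FGDC_TO_UDK)
```

Translating `fgdc_record` yields a UDK-style record with `Topic` and `Measuring Techniques` tags. Returning the unmapped tags separately makes gaps in the crosswalk visible instead of dropping data.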

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata:
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC, UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

[Figure: content life-cycle supported by Taalee Infrastructure Services and Taalee Content Applications, built on the Taalee Semantic MetaBase]

Where is the content? Whose is it? (Produce, Aggregate, Catalog, Index)

What other content is it related to? (Integrate, Syndicate)

What is the right content for this user? (Personalize)

What is the best way to monetize this interaction? (Interactive Marketing)

Channels: Broadcast, Wireline, Wireless, Interactive TV

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction; types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

[Figure: METADATA EXTRACTORS draw on Global/Enterprise/Web Repositories: Digital Maps, Nexis/UPI/AP, Documents, Digital Audios, Data Stores, Digital Videos, Digital Images]

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr… The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
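The layout rule for the date line can be approximated with a regular expression. The sketch below is one possible reading of that rule (day name, month name, integer, comma, integer), written in Python. The names `DATE_RULE` and `extract_date` are assumptions, and the commas are treated as optional so that the pattern also matches text whose punctuation was lost.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# day month int ',' int, with commas made optional.
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None."""
    m = DATE_RULE.search(text)
    if m is None:
        return None
    return {
        "day": m["day"],
        "month": m["month"],
        "day_of_month": int(m["dom"]),
        "year": int(m["year"]),
    }


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
report_date = extract_date(header)
```

Applied to the report header, this yields the day, month, day of month and year as separate metadata fields. A purely syntactic rule like this finds where the date is, but knows nothing about what the report is about.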

Traditional Text Categorization

[Figure: Customer Article Feed (Article 4715) -> Statistical/AI Techniques (trained on a Customer Training Set) -> Classify / Place in a taxonomy -> Routing/Distribution]

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715

Taalee's Categorization & Automatic Metadata Creation

[Figure: Article Feed (Article 4715) -> Knowledge-base & Statistical/AI Techniques (Taalee Training Set, Customer Training Set, Taalee Enterprise Customization Suite) -> Classify / Place in a taxonomy; Map to another taxonomy -> Metadata Catalog, Content Manager -> Routing/Distribution, Precise syndication/filtering]

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Automated Content Enrichment (ACE)
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
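The difference between standard and semantic metadata can be sketched with a toy knowledge base. In the Python below, standard metadata is passed through unchanged, while semantic metadata is derived by recognizing known company names in the text and looking up their ticker and exchange. The knowledge-base contents mirror the Article 4715 example; the function name `enrich`, the sample sentence, and the plain substring matching are assumptions for illustration and not Taalee's actual technique.

```python
# Toy knowledge base; entries mirror the Article 4715 example.
KNOWLEDGE_BASE = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}


def enrich(article_text, standard_metadata):
    """Add semantic metadata for every known company mentioned in the text."""
    companies = [name for name in KNOWLEDGE_BASE if name in article_text]
    semantic = {
        "company_name": companies,
        "ticker_symbol": [KNOWLEDGE_BASE[c]["ticker"] for c in companies],
        "exchange": sorted({KNOWLEDGE_BASE[c]["exchange"] for c in companies}),
    }
    return {"standard": dict(standard_metadata), "semantic": semantic}


# Hypothetical article sentence, for illustration only.
article = "France Telecom said it will acquire Equant."
record = enrich(article, {"feed_source": "iSyndicate", "posted_date": "11/20/2000"})
```

The resulting `record` carries the ticker symbols and exchange even though neither appears in the article text, which is what makes precise routing and filtering possible.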

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

[Figure labels: Video Segment with Associated Text; Segment Description; Semantic Metadata; Auto Categorization]

HP 50

Automatic Categorization & Metadata Tagging (Web page)

[Figure labels: Video with Editorialized Text on the Web; Auto Categorization; Semantic Metadata]

HP 51

Automatic Categorization & Metadata Tagging (Feed)

[Figure labels: Text from Bloomberg; Auto Categorization; Semantic Metadata]

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event), Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference:
(Statistical) Information Retrieval
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

[Figure: alternatives arranged by increasing depth of understanding]

Statistical methods: Cluster Analysis, Learning/AI and Collab. Filtering

Reference data / Concept-terms / Dictionary / Thesaurus: by topic/industry/subject/domain; Word or Phrase

Ontologies / Domain Models / KnowledgeBase: by Entities and Relationships (deeper understanding)

HP 56

Open Directory Project (ODP) Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
Attributes
Domain Rules
Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: three interrelated ontologies.
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; related terms LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several of these terms are linked by "causes" relationships]
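An ontology of this kind combines category (is-a) links with named relationships such as "causes". The Python sketch below shows one minimal way to store both and to follow a relationship transitively. The class `Ontology` and its methods are hypothetical, and the specific "causes" edges are assumed for illustration, because the arrows of the original figure are not recoverable.

```python
from collections import defaultdict


class Ontology:
    """Terms linked by is-a edges plus named relationships such as 'causes'."""

    def __init__(self):
        self.parents = defaultdict(set)    # term -> broader terms
        self.relations = defaultdict(set)  # (relation, source) -> targets

    def add_is_a(self, term, parent):
        self.parents[term].add(parent)

    def add_relation(self, source, relation, target):
        self.relations[(relation, source)].add(target)

    def is_a(self, term, ancestor):
        """True if ancestor is the term itself or reachable via is-a edges."""
        stack, seen = [term], set()
        while stack:
            current = stack.pop()
            if current == ancestor:
                return True
            if current in seen:
                continue
            seen.add(current)
            stack.extend(self.parents.get(current, ()))
        return False

    def reachable(self, source, relation):
        """All terms reachable from source by following one relation repeatedly."""
        found, stack = set(), [source]
        while stack:
            current = stack.pop()
            for target in self.relations.get((relation, current), ()):
                if target not in found:
                    found.add(target)
                    stack.append(target)
        return found


disasters = Ontology()
for term in ("EARTHQUAKE", "TSUNAMI", "FLOOD", "LANDSLIDE"):
    disasters.add_is_a(term, "NATURAL DISASTER")
# Assumed edges, for illustration only.
disasters.add_relation("EARTHQUAKE", "causes", "TSUNAMI")
disasters.add_relation("TSUNAMI", "causes", "FLOOD")
disasters.add_relation("EARTHQUAKE", "causes", "LANDSLIDE")
```

With these assumed edges, `disasters.reachable("EARTHQUAKE", "causes")` also returns FLOOD, an indirect consequence that no single statement records. Relationships of this kind are what distinguish an ontology from a plain taxonomy.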

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords
attribute based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology (ODP based?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek: the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense: the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua: the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata amp Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Metadata from Typical Cataloging of Football Assets

Virage search on "football touchdown":

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…

Taalee Metadata on Football Assets

Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
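The contrast between the two kinds of search can be sketched in Python. Keyword search matches words in the descriptive text, while attribute search matches typed metadata fields. The asset records below are abridged from the slide, and the function names `keyword_search` and `attribute_search` are assumptions for illustration.

```python
ASSETS = [
    {
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured attributes
    },
    {
        "text": "Qadry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "league": "Professional",
            "teams": ["Ravens", "Steelers"],
            "players": ["Qadry Ismail", "Tony Banks"],
            "event": "Touchdown",
        },
    },
]


def keyword_search(assets, *words):
    """Return assets whose text contains every keyword (case-insensitive)."""
    return [a for a in assets
            if all(w.lower() in a["text"].lower() for w in words)]


def attribute_search(assets, **criteria):
    """Return assets whose metadata matches every field=value criterion."""
    def matches(asset):
        for field, wanted in criteria.items():
            value = asset["metadata"].get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a for a in assets if matches(a)]


by_keyword = keyword_search(ASSETS, "touchdown")
by_attribute = attribute_search(ASSETS, event="Touchdown", teams="Ravens")
```

The keyword query returns both assets, including an interview that merely mentions a touchdown, while the attribute query returns only the actual Ravens touchdown play. Attribute search can also be sorted or filtered by any field, which keyword matching cannot offer.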

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more…

2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…

3 Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more…

4 Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more…

HOT Topics

1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…

2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more…

3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…

HP 80

Metadata Targeting

SemanticInteractive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure: Web Extreme Personalization. Labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia, $]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia, $; My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia, $; My Stocks list: CSCO, NT, IBM, Market; CSCO submenu: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia, $; My Stocks list: CSCO, NT, IBM, Market; CSCO submenu: Analyst Call, Conf Call, Earnings]

CSCO Analysis
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
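Choosing "just the right metadata" for a small screen can be pictured as a priority-ordered selection under a character budget. The Python sketch below is a guess at the general idea, not Taalee's method: the `FIELD_PRIORITY` table, the function `filter_for_screen`, and the character limit are all hypothetical.

```python
# Hypothetical priority order of metadata fields per content type.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst", "rating", "duration"],
}


def filter_for_screen(content_type, metadata, max_chars):
    """Join the highest-priority fields that fit within max_chars."""
    shown, used = [], 0
    for field in FIELD_PRIORITY.get(content_type, []):
        if field not in metadata:
            continue
        text = str(metadata[field])
        cost = len(text) + (1 if shown else 0)  # one space between fields
        if used + cost > max_chars:
            break
        shown.append(text)
        used += cost
    return " ".join(shown)


call = {"date": "11/07", "source": "ON24", "analyst": "H&Q", "rating": "Strong Buy"}
line = filter_for_screen("Analyst Call", call, max_chars=14)
```

With a 14-character budget the listing line keeps date, source and analyst and drops the rating, which the slide instead conveys through an icon. A larger display would simply raise the budget and show more fields from the same metadata.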

HP 87

iTV: Taalee's Extreme Personalization

[Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine™; Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests, Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Figure labels: Collection from Network Content, Affiliate Feeds and Public Sources; Processing: Categorize, Catalog, Integrate; Rich Data Metabase; Production Support (Sony); Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Video: Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

Diagram labels: Video; MPEG-2/4/7 MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Node "Cisco Systems"; Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node; Retrieve Scene Description Track; MPEG Decoder; Enhanced Digital Cable; GREAT USER EXPERIENCE

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Example node metadata:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content

Extractor Agents: Content which does contain the words the user asked for

+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: Content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
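
The idea on this slide can be sketched in a few lines of Python. This is only an illustration, not Taalee's implementation: the `KNOWLEDGE_BASE` structure, the field names, and the `enrich` and `matches` helpers are all hypothetical; only the Roger Clemens facts come from the slide.

```python
# Hypothetical knowledge base: entity -> typed associations.
# Only the Roger Clemens facts are taken from the slide.
KNOWLEDGE_BASE = {
    "Roger Clemens": {
        "type": "PERSON",
        "sport": "Baseball",
        "team": "New York Yankees",
    },
}


def enrich(metadata):
    """Return a copy of the metadata with team and sport added for each known player."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for fact_name, target_field in (("team", "teams"), ("sport", "sports")):
            value = facts.get(fact_name)
            if value and value not in enriched.setdefault(target_field, []):
                enriched[target_field].append(value)
    return enriched


def matches(metadata, query):
    """Case-insensitive substring match of the query against every metadata value."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values()
               for value in values)


# Metadata extracted from a story that never mentions the Yankees.
story = {"players": ["Roger Clemens"]}
enriched_story = enrich(story)

print(matches(story, "Yankees"))           # keyword-only metadata
print(matches(enriched_story, "Yankees"))  # value-added metadata
```

With keyword-only metadata the "Yankees" query misses the story; after enrichment the team name is part of the asset's metadata, so the same query finds it.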

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset

Posted Date: Date of asset posting - Extracted automatically

Player Names: Names of players mentioned explicitly in the asset - Extracted automatically

Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships

League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations

Sport: Name of sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
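
A minimal sketch of how typed relationships let a search return associated content as well as direct hits. The triple list, the relation names (`is_a`, `participates_in`, `competes_with`), and the placeholder news items are assumptions made for illustration; only the Commerce One facts come from the slide, and this is not a description of Taalee's engine.

```python
# Hypothetical (subject, relation, object) triples; Commerce One facts are from the slide.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Invented placeholder news items, each tagged with company metadata.
NEWS = [
    {"title": "Placeholder story A", "companies": ["Commerce One"]},
    {"title": "Placeholder story B", "companies": ["Ariba"]},
    {"title": "Placeholder story C", "companies": ["Some Other Co"]},
]


def competitors(company):
    """Companies linked to `company` by competes_with, in either direction."""
    found = set()
    for subject, relation, obj in TRIPLES:
        if relation != "competes_with":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return sorted(found)


def search(company):
    """Return stories about the company plus stories about its competitors."""
    rivals = competitors(company)
    return {
        "asked_for": [n["title"] for n in NEWS if company in n["companies"]],
        "related_via_competition": [
            n["title"] for n in NEWS
            if any(rival in n["companies"] for rival in rivals)
        ],
    }


result = search("Commerce One")
print(result)
```

A keyword index alone would only fill `asked_for`; the `competes_with` relationship is what brings in the Ariba story.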

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Components: Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Story on Cisco; Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News: COMPANIES in Same or Related INDUSTRY

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough
• We need a "logic layer" on top of RDF
• Some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
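
The rule above could be evaluated along these lines. This is a sketch under stated assumptions: the slide does not say how `dateDifference` and `distance` are computed, so days and great-circle (haversine) miles are assumed here, following the "30 days" and "10000 miles" wording used later in the deck; the record layout and the sample coordinates are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an assumption for this sketch


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule; the quake must also not precede the test (assumed)."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Hypothetical records, not real event data.
nuclear_test = {"eventDate": date(1990, 1, 1), "latitude": 10.0, "longitude": 20.0}
near_quake = {"eventDate": date(1990, 1, 11), "latitude": 12.0, "longitude": 21.0}
late_quake = {"eventDate": date(1990, 3, 1), "latitude": 12.0, "longitude": 21.0}

print(may_have_caused(nuclear_test, near_quake))
print(may_have_caused(nuclear_test, late_quake))
```

The first pair satisfies both conditions; the second fails the date condition because the earthquake falls outside the 30-day window.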

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
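
The per-year, per-range counts behind these two queries could be computed roughly as follows. The `(year, magnitude)` record format, the half-open treatment of range boundaries, and the sample data are assumptions for illustration only; no real earthquake data is used.

```python
from collections import Counter

# Magnitude ranges from the slide; boundaries are treated as half-open [low, high).
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def bin_label(magnitude):
    """Return the label of the range containing the magnitude, or None if below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range), from start_year onward."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive year span."""
    total = sum(counts[(year, label)] for year in range(first_year, last_year + 1))
    return total / (last_year - first_year + 1)


# Synthetic sample records: (year, magnitude).
sample = [(1899, 7.5), (1901, 5.9), (1901, 6.5), (1902, 6.1), (1902, 5.0)]
counts = counts_per_year(sample)
print(average_per_year(counts, "6-7", 1901, 1902))
```

Comparing these averages for the years before and after testing began, range by range, is the comparison the slide summarizes.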

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 11

RDF (Resource Description Framework)

Resource - Property - Value

• RDF data consists of nodes and attached attribute/value pairs

• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata

• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances

HP 12

RDF Example 1

URI:TALK --dc:creator--> URI:AMIT

URI:TALK --dc:title--> "Mysteries of Metadata"

<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
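
To make the node/property/value reading concrete, here is a small sketch that parses the example with Python's standard `xml.etree.ElementTree` and lists its statements as triples. The namespace URIs are the ones shown on the slide with punctuation restored, which is an assumption; the `triples` helper is hypothetical and handles only this flat `rdf:Description` form, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as shown on the slide (punctuation restored; an assumption).
RDF_NS = "http://www.w3.org/TR/REC-rdf-syntax"
DC_NS = "http://purl.org/dc/elements/1.0/"

RDF_XML = f"""<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Return (resource, property, value) triples from flat rdf:Description elements."""
    root = ET.fromstring(xml_text)
    result = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree tags look like "{namespace}localname".
            name = prop.tag.split("}", 1)[1]
            # The value is either another resource or an atomic text value.
            value = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            result.append((subject, name, value))
    return result


for triple in triples(RDF_XML):
    print(triple)
```

Each printed tuple corresponds to one arc in the graph: the talk has a title whose value is a text string, and a creator whose value is another resource.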

HP 13

RDF Example 2

URI:TALK --dc:title--> "Mysteries of Metadata"

URI:TALK --dc:creator--> URI:AMIT

URI:AMIT --BIB:Name--> "Amit Sheth"

URI:AMIT --BIB:Email--> amit@taalee.com

URI:AMIT --BIB:Aff--> URI:LIB

HP 14

RDFS (RDF Schema)

Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce…)

Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties

HP 15

RDF Based Web

HTML Resources

RDF/XML Descriptions

RDF Schemas

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International inter-discipline W3C community consensus

"Semantic" interface among resource description communities (very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<xml>
  <?namespace href="http://w3.org/rdf-schema" as="RDF"?>
  <?namespace href="http://metadata.net/DC" as="DC"?>
  <RDF:Abbreviated>
    <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
                   DC:Title="I've Never Metadata I've Never Liked"
                   DC:Creator="Mary Crystal"
                   DC:Subject="Metadata, Dublin Core, Stuff"/>
  </RDF:Abbreviated>
</xml>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content.

NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.

Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.

HP 20

NewsML…

It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.

It provides an accurate, objective set of description tools which help qualify the information and make the search more precise.

NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata

Version 1.0, April 2001

Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)

Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"

Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML

Designed to be used in a simple, straightforward way over the Internet

Compatible with NewsML

Integrates easily with ICE (for syndication)

Vocabulary:
• Basic: DC
• Extensions: "Controlled Vocabularies", e.g., "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

VoiceXML Metadata

Voice-specific metadata

Supports syntactic interoperability

Text data to voice data

VoiceXML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - News, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number "find-me" services

Intranet - Inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000, submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol?

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define:
• roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
• format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE:
• establishes and manages the syndication
• delivers data
• logs events
=> content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol

Syndicator: A content aggregator and distributor

Subscriber: A content consumer

Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement

Collection: The current content of a subscription

ICE Package: A delivery of commands to update a collection, such as the addition of content items

ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0, http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
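
The relationship between a subscription, its collection, and the packages that update it can be modeled with a toy sketch like the one below. It is not the ICE wire format (ICE exchanges XML payloads); the class names, fields, and the rule that a package is applied at most once are assumptions chosen only to mirror the definitions above.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """A delivery of commands that update a collection (toy model, not ICE XML)."""
    package_id: str
    add: dict = field(default_factory=dict)     # item_id -> content to add or replace
    remove: list = field(default_factory=list)  # item_ids to delete


@dataclass
class Subscription:
    """An agreement between a subscriber and a syndicator, plus its current collection."""
    subscriber: str
    syndicator: str
    collection: dict = field(default_factory=dict)  # current content of the subscription
    applied: list = field(default_factory=list)     # ids of packages already applied

    def apply(self, package):
        """Apply a package once; return False if it was already applied."""
        if package.package_id in self.applied:
            return False
        for item_id in package.remove:
            self.collection.pop(item_id, None)
        self.collection.update(package.add)
        self.applied.append(package.package_id)
        return True


subscription = Subscription(subscriber="reader", syndicator="publisher")
first = Package("pkg-1", add={"4321": "Cartoon"})
second = Package("pkg-2", remove=["4321"])

print(subscription.apply(first), subscription.collection)
print(subscription.apply(first))   # a repeated delivery is ignored
print(subscription.apply(second), subscription.collection)
```

The collection is never sent whole after the first delivery; each package carries only the changes, which is the point of defining packages as update commands.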

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "httpicedtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata): the comic-strip element

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

• Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

• Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

• Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

Content Development/Management: Content Authoring; Digital Asset Management; Software Configuration Management; Document Process Management

Application Content Management: Metadata Management; Recombination; Personalization

Content Delivery: Edge Network Delivery; Streaming Media Delivery; Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…

Kansas State
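
Because the two models differ only in tag names for these fields, translating a record between them reduces to a tag mapping. A minimal sketch, assuming the five correspondences shown on the slide are one-to-one (the function names are hypothetical, and real FGDC/UDK records have many more elements):

```python
# Tag correspondences taken from the slide (FGDC tag -> UDK tag).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def fgdc_to_udk(record):
    """Rename FGDC tags to UDK tags; tags without a known mapping are kept as-is."""
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


def udk_to_fgdc(record):
    """Rename UDK tags to FGDC tags; tags without a known mapping are kept as-is."""
    return {UDK_TO_FGDC.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = fgdc_to_udk(fgdc_record)
print(udk_record)
```

A query written against either vocabulary can then be answered from records stored in the other, which is the interoperability problem the slide illustrates.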

HP 39

Different views of Metadata

Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services | Taalee Content Applications

Where is the content? Whose is it? - Produce/Aggregate, Catalog/Index

What other content is it related to? - Integrate, Syndicate

What is the right content for this user? - Personalize

What is the best way to monetize this interaction? - Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories

Types: Text, Graphics, Audio, Video, Multimedia

Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic

Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange

Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Sources: Digital Maps; Nexis, UPI, AP; Documents; Digital Audios; Data Stores; Digital Videos; Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
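
The layout rule `Date => day month int ',' int` maps naturally onto a regular expression. The sketch below is one possible reading of it: `day` is taken to be a weekday name and `month` a month name, and commas are made optional so the pattern also matches text whose punctuation was lost. The `extract_date` helper and its return format are hypothetical.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# day month int ',' int  ->  weekday, month name, day of month, comma, year
DATE_PATTERN = re.compile(
    r"\b(" + "|".join(DAYS) + r"),?\s+(" + "|".join(MONTHS)
    + r")\s+(\d{1,2}),?\s+(\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule as metadata, or None."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    weekday, month, day, year = match.groups()
    return {
        "weekday": weekday,
        "date": date(int(year), MONTHS.index(month) + 1, int(day)),
    }


report = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_date(report))
```

This is purely syntactic extraction: the pattern finds the date by its shape, without any understanding of what the report is about.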

Traditional Text Categorization

Customer Article Feed (Article 4715) -> Statistical/AI Techniques (Customer Training Set) -> Classify: Place in a taxonomy -> Classification of Article 4715 -> Routing/Distribution

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

Article Feed (Article 4715) -> Knowledge-base & Statistical/AI Techniques (Taalee Training Set, Customer Training Set) -> Classify/Place in a taxonomy -> Classification of Article 4715 -> Map to another taxonomy -> Routing/Distribution

Other components: Metadata Catalog; Content Manager; Precise syndication/filtering; Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text From Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) Information Retrieval
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods, Cluster Analysis

Learning/AI and Collab. Filtering

Reference data, Concept-terms, Dictionary/Thesaurus (by topic, industry, subject, domain)

Word or Phrase

Ontologies/Domain Models

KnowledgeBase (by Entities and Relationships)

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: interrelated ontologies (node labels from the diagram)

LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE
Processes: shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
Relationship shown between disaster types: causes

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords
attribute-based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology (ODP based?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user “think” about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

“Normal” Content can only be “found” if the user enters a keyword that exists within it

Intelligent Content = +

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more “intelligent” through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Virage search on “football touchdown”

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc.
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc.
AT&T Corp.
more...

2. My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...

3. Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more...

4. Pink Floyd Collection
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more...

HOT Topics

1. Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more...

2. Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more...

3. Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests, Preferences

Time-Shifted Content Aggregator

Web sites and Pages

Content Databases

Semantic Engine™

Personalized Content

Structured Hi-Quality Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as “Strong Buy” by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine™

Meta-Data Tagged Content

Content “Programs”

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support (Sony)

Categorize

Catalog

Integrate

Collection

Processing

Network Content

Affiliate Feeds

Public Sources

Rich Data Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

Enhanced Digital Cable

Video

MPEG-2/4/7

MPEG Encoder

MPEG Decoder

Create Scene Description Tree

Retrieve Scene Description Track

Scene Description Tree

Node = AVO Object

Enhanced XML Description

Taalee Semantic Engine

“Cisco Systems”

Object Content Information (OCI)

Metadata-rich Value-added Node

GREAT USER EXPERIENCE

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for

Extractor Agents

+

Content which does not contain the words the user asked for, but is about what he asked for

Value-added Metadata

+

Content the user did not think to ask for, but which he needs to know

Semantic Associations

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword

For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
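The Roger Clemens example can be sketched in a few lines of Python. This is only an illustration of the idea, not Taalee's actual method: the `KNOWLEDGE_BASE` dictionary, the field names, and the `enrich` and `matches` helpers are hypothetical and stand in for a much larger knowledge base and extraction pipeline.

```python
# Hypothetical miniature knowledge base: entity -> facts implied by that entity.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Team": "New York Yankees"},
    "Jamal Anderson": {"League": "NFL", "Team": "Atlanta Falcons"},
}


def enrich(metadata):
    """Return a copy of the metadata with fields implied by its players added."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        for field, value in KNOWLEDGE_BASE.get(player, {}).items():
            values = enriched.setdefault(field, [])
            if value not in values:
                values.append(value)
    return enriched


def matches(metadata, term):
    """Attribute search: true if the term equals any metadata value."""
    term = term.lower()
    return any(term == value.lower()
               for values in metadata.values()
               for value in values)


# Metadata extracted directly from a story that never mentions the team.
story = {"Players": ["Roger Clemens"]}
enriched_story = enrich(story)
```

With only the extracted metadata, a search for "New York Yankees" misses the story; after enrichment the same search finds it, because the team was added from the knowledge base rather than from the text.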

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com
  Posted Date: 9/20/2000
  League: NFL
  Teams: Atlanta Falcons
  Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN
  Posted Date: 3/03/2001
  League: National League
  Teams: Los Angeles Dodgers
  Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots (“Jamal Anderson”)

Search for ‘Jamal Anderson’ in ‘Football’

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots (“Gary Sheffield”)

Search for ‘Gary Sheffield’ in ‘Baseball’

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - extracted automatically
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which the player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: Name of sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways the users choose to interact

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
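A minimal sketch of how such associations could drive "related content" links, using the Commerce One facts stated above. The triple layout, the relationship names, the story titles, and the choice to treat competition as symmetric are assumptions made for illustration; they do not describe Taalee's Semantic Content Model.

```python
# Hypothetical facts as (subject, relationship, object) triples.
FACTS = [
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog: each story is tagged with the companies it is about.
STORIES = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on an unrelated company", "companies": ["Example Corp"]},
]


def competitors(company):
    """Companies linked to `company` by competes_with, in either direction."""
    found = set()
    for subject, relationship, obj in FACTS:
        if relationship != "competes_with":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return found


def related_stories(company):
    """Titles of stories about the competitors of `company`."""
    rivals = competitors(company)
    return [story["title"] for story in STORIES
            if rivals & set(story["companies"])]
```

A keyword index would return only stories containing the words "Commerce One"; here `related_stories("Commerce One")` also surfaces the Ariba story because the relationship is stored as data.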

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company ‘Commerce One’

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)

Extractor Agent 1, Extractor Agent 2, Extractor Agent 3

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story - passed on to Dashboard

Story on Cisco

Story on Lucent

XCM-compliant metadata, XML or other format

Semantic Engine

World Model

Taalee Metabase

Semantic Application (ASP/Enterprise hosted)

Third-party Content Mgmt and Syndication

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY: Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling obviously not enough
we need a “logic layer” on top of RDF
some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = “Frame System for WWW”

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

“A Web in which machine reasoning will be ubiquitous and devastatingly powerful” [Berners-Lee]

“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture” [Berners-Lee]

“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities” [Berners-Lee]

A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary - RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
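The rule above can be read as a predicate over pairs of records. The Python sketch below is one possible reading, not the InfoQuilt implementation: it assumes the date difference is measured in days and must be non-negative (the earthquake follows the test), that the distance is a great-circle distance in miles computed with the haversine formula, and that records are plain dictionaries with hypothetical `date`, `lat`, and `lon` keys. The sample records are invented for illustration.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake["date"] - test["date"]).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test["lat"], test["lon"],
                          quake["lat"], quake["lon"]) < max_miles


# Invented sample records, for illustration only.
nuclear_test = {"date": date(2000, 1, 1), "lat": 0.0, "lon": 0.0}
quake_soon_nearby = {"date": date(2000, 1, 10), "lat": 0.0, "lon": 1.0}
quake_before_test = {"date": date(1999, 12, 25), "lat": 0.0, "lon": 1.0}
quake_too_late = {"date": date(2000, 3, 1), "lat": 0.0, "lon": 1.0}
quake_antipodal = {"date": date(2000, 1, 10), "lat": 0.0, "lon": 180.0}
```

Note that half the Earth's circumference is roughly 12,400 miles, so a 10000-mile threshold excludes only earthquakes on nearly the opposite side of the globe.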

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
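The grouping query can be sketched as a small aggregation. This is an illustrative reading only: the half-open treatment of the range boundaries (for example, magnitude 6.0 falls in 6-7, and 9.0 falls in the top group), the `(year, magnitude)` record layout, and the sample data are all assumptions, not details from the workshop.

```python
from collections import Counter

# Magnitude groups from the query: (label, inclusive lower, exclusive upper).
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def bin_label(magnitude):
    """Label of the group containing the magnitude, or None if below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, start_year, end_year):
    """Average yearly earthquake count per magnitude group over a year range."""
    years = end_year - start_year + 1
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if label is not None and start_year <= year <= end_year:
            counts[label] += 1
    return {label: counts[label] / years for label, _, _ in BINS}


# Invented (year, magnitude) records, for illustration only.
sample_quakes = [(1950, 6.5), (1950, 6.1), (1951, 7.2), (1899, 8.0), (1951, 5.0)]
averages = average_per_year(sample_quakes, 1950, 1951)
```

Running the same function over the years before and after a chosen date gives the per-group comparison that the slide's conclusion relies on.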

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted, and the test site lay within a radius of 10000 miles of the earthquake's epicenter

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

HP 12

RDF Example 1

URI:TALK
  dc:title: Mysteries of Metadata
  dc:creator: URI:AMIT

<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
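For readers who want to produce a description like this programmatically, the sketch below builds the same two statements with Python's standard `xml.etree.ElementTree` module. It is an illustration under stated assumptions: it uses the RDF and Dublin Core 1.1 namespace URIs that later became standard rather than the ones shown on the slide, `URI:TALK` and `URI:AMIT` are the slide's placeholder identifiers, and the `describe_talk` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URIs standardized after this workshop.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to serialize with the conventional rdf: and dc: prefixes.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe_talk(talk_uri, title, creator_uri):
    """Serialize 'talk has title' and 'talk has creator' as RDF/XML text."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": talk_uri})
    # A literal-valued property: the title string is the element text.
    ET.SubElement(description, f"{{{DC_NS}}}title").text = title
    # A resource-valued property: the creator is referenced by URI.
    ET.SubElement(description, f"{{{DC_NS}}}creator",
                  {f"{{{RDF_NS}}}resource": creator_uri})
    return ET.tostring(root, encoding="unicode")


xml_text = describe_talk("URI:TALK", "Mysteries of Metadata", "URI:AMIT")
```

The title is a literal while the creator is a reference to another resource, which is what lets Example 2 attach further properties (name, email, affiliation) to the creator.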

HP 13

RDF Example 2

URI:TALK
  dc:title: Mysteries of Metadata
  dc:creator: URI:AMIT

URI:AMIT
  BIB:Name: Amit Sheth
  BIB:Email: amit@taalee.com
  BIB:Aff: URI:LIB

HP 14

RDFS (RDF Schema)

Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, ...)

Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties

HP 15

RDF Based Web

HTML

Resources

RDF/XML Descriptions

RDF Schemas

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International, inter-discipline, W3C community consensus

"Semantic" interface among resource description communities (very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<xml>
  <namespace href="http://w3.org/rdf-schema" as="RDF"/>
  <namespace href="http://metadata.net/DC" as="DC"/>
  <RDF:Abbreviated>
    <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
                   DC:Title="I've Never Metadata I've Never Liked"
                   DC:Creator="Mary Crystal"
                   DC:Subject="Metadata, Dublin Core, Stuff"/>
  </RDF:Abbreviated>
</xml>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content.

NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.

Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.

HP 20

NewsML…

It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.

It offers an accurate, objective set of description tools which help qualify the information and make the search more precise.

NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.

HP 21

Example of the end-to-end flow -NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
  Basic: DC
  Extensions: "Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
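To illustrate how a consumer might read a PRISM description such as the one above, here is a minimal sketch using Python's standard xml.etree.ElementTree. It is only an illustration under stated assumptions: the record is a shortened copy of the example, and the function name dc_fields is hypothetical rather than part of any PRISM tooling.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Shortened copy of the PRISM example record (XML declaration omitted).
PRISM_RECORD = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dc_fields(xml_text):
    """Return {resource URI: {Dublin Core element name: value}}."""
    root = ET.fromstring(xml_text)
    records = {}
    for desc in root.findall(f"{{{RDF}}}Description"):
        fields = {}
        for child in desc:
            # ElementTree tags look like "{namespace}local-name".
            namespace, _, local = child.tag[1:].partition("}")
            if namespace != DC:
                continue
            # A property is either a resource reference or a literal.
            value = child.get(f"{{{RDF}}}resource") or (child.text or "").strip()
            fields[local] = value
        records[desc.get(f"{{{RDF}}}about")] = fields
    return records


if __name__ == "__main__":
    for uri, fields in dc_fields(PRISM_RECORD).items():
        print(uri, fields)
```

Because PRISM builds on Dublin Core and RDF, generic RDF/XML handling like this is enough to recover the basic descriptive fields; PRISM-specific extensions would need their own namespace handling.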

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - News, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, One-number "find-me" services

Intranet - Inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are:

Digital libraries (image catalog, musical dictionary)

Multimedia directory services (e.g. yellow pages)

Broadcast media selection (radio channel, TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define:

  roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

  format and method of content exchange (e.g. sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE:
  establishes and manages the syndication
  delivers data
  logs events
  => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01"
             ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321"
                name="Cartoon"
                filename="demo.gif"
                content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)
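The split between content-independent ICE metadata and domain-specific metadata can be made concrete with a short sketch. The code below is an illustration only: it parses a simplified copy of the payload above (DOCTYPE removed, image data replaced by a placeholder) with Python's standard library, and the function name split_metadata is hypothetical, not part of any ICE toolkit.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the example payload; the encoded image is a placeholder.
PAYLOAD = """
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01"
             ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups"
                     pubdate="20010515">ENCODED-IMAGE-PLACEHOLDER</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
"""


def split_metadata(payload_xml):
    """Separate ICE envelope metadata from the domain vocabulary it carries.

    Returns (payload attributes, [(item attributes, [domain element dicts])]).
    """
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)  # content-independent ICE metadata
        # Children of ice-item belong to the industry-specific vocabulary.
        domain = [{"element": child.tag, **child.attrib} for child in item]
        items.append((envelope, domain))
    return dict(root.attrib), items


if __name__ == "__main__":
    payload_attrs, items = split_metadata(PAYLOAD)
    print("payload:", payload_attrs)
    for envelope, domain in items:
        print("ICE item:", envelope)
        print("domain metadata:", domain)
```

The ICE layer never needs to understand comic-strip; it only delivers the item, while a subscriber that knows the comic vocabulary interprets title, author, and pubdate.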

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM

eXtended Content Management

Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management

Application Content Management: Metadata Management, Recombination, Personalization

Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
  Theme keywords: digital line graph, hydrography, transportation
  Title: Dakota Aquifer
  Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
  Direct Spatial Reference Method: Vector
  Horizontal Coordinate System Definition: Universal Transverse Mercator
  …

UDK Metadata Model
  Search terms: digital line graph, hydrography, transportation
  Topic: Dakota Aquifer
  Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
  Measuring Techniques: Vector
  Co-ordinate System: Universal Transverse Mercator
  …

Kansas State
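The FGDC/UDK slide shows the same values stored under different tag names. One common way to bridge such models is an explicit tag mapping; the sketch below illustrates the idea with a plain dictionary. The tag names come from the slide, while the names FGDC_TO_UDK and translate are hypothetical and the mapping is assumed to be one-to-one, which real schema integration often is not.

```python
# Tag correspondences read off the slide (FGDC tag -> UDK tag).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}

FGDC_RECORD = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}


def translate(record, tag_map):
    """Rename tags using tag_map; tags without a mapping are reported separately."""
    translated, unmapped = {}, {}
    for tag, value in record.items():
        if tag in tag_map:
            translated[tag_map[tag]] = value
        else:
            unmapped[tag] = value
    return translated, unmapped


if __name__ == "__main__":
    udk_record, leftover = translate(FGDC_RECORD, FGDC_TO_UDK)
    print(udk_record)
    print("unmapped:", leftover)
```

Returning the unmapped tags instead of silently dropping them makes gaps between the two models visible, which is where ontologies or domain models are typically brought in.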

HP 39

Different views of Metadata

Domain Independent: Specifications (RDF), Frameworks/Infrastructures (XCM)

Metadata:
  Application Specific: ICE
  Media Specific: MPEG7, VoiceXML
  Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services / Taalee Content Applications

Where is the content? Whose is it?
  Produce/Aggregate, Catalog/Index

What other content is it related to?
  Integrate, Syndicate

What is the right content for this user?
  Personalize

What is the best way to monetize this interaction?
  Interactive Marketing

Channels: Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: Types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange
Feed Handlers
Crawlers/Screen Scrapers
Bots/Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise Web Repositories

METADATA EXTRACTORS

Sources (figure): Digital Maps, Nexis/UPI/AP, Documents, Digital Audios, Data Stores, Digital Videos, Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
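The layout rule above (Date => day month int ',' int) is the kind of syntactic pattern that maps directly onto a regular expression. The following is a minimal sketch of that idea, not the extractor used in the workshop: the names DATE_RULE and extract_date are hypothetical, and the commas are treated as optional because punctuation may be missing in the source text.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = (
    "January|February|March|April|May|June|July|August|"
    "September|October|November|December"
)

# Date => day month int ',' int   (commas optional in this sketch)
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<day_of_month>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return {
        "day": match["day"],
        "month": match["month"],
        "day_of_month": int(match["day_of_month"]),
        "year": int(match["year"]),
    }


if __name__ == "__main__":
    header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
    print(extract_date(header))
```

A purely syntactic rule like this finds the date wherever the layout matches, but it knows nothing about what the date refers to; that gap is what the semantic approaches later in the workshop address.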

Traditional Text Categorization

(Figure) Customer Article Feed (Article 4715) -> Statistical/AI Techniques (trained on Customer Training Set) -> Classify / Place in a taxonomy -> Routing/Distribution

Standard Metadata
  Feed Source: iSyndicate
  Posted Date: 11/20/2000

Classification of Article 4715

Taalee's Categorization & Automatic Metadata Creation

(Figure) Article Feed (Article 4715) -> Knowledge-base & Statistical/AI Techniques (Taalee Training Set, Customer Training Set) -> Classify / Place in a taxonomy -> Map to another taxonomy -> Routing/Distribution

Components: Automated Content Enrichment (ACE), Taalee Enterprise Customization Suite, Metadata Catalog, Content Manager, Precise syndication/filtering

Article 4715 Metadata
  Standard metadata
    Feed Source: iSyndicate
    Posted Date: 11/20/2000
  Semantic metadata
    Company Name: France Telecom, Equant
    Ticker Symbol: FTE, ENT
    Exchange: NYSE
    Topic: Company News

Classification of Article 4715
  FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
  ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
  NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities
   Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships
   Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
   (Statistical) (Information Retrieval)
   Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
   Logic Based (Description Logic)
   Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis, Learning/AI, and Collaborative Filtering
  (by word or phrase)

Reference data: Concept-terms, Dictionary, Thesaurus
  (by topic/industry/subject/domain)

Ontologies/Domain Models, KnowledgeBase
  (by entities and relationships)

Moving down the list gives deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
  Attributes
  Domain Rules
  Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies (figure; concept labels only)

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE
  Processes: shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
  Relationship among these concepts: causes (four edges in the figure)

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.


Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

Traditional queries based on keywords
Attribute-based queries
Content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic Metadata

Virage: search on "football touchdown"

  Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
  Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

  Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets (Rich Media Reference Page)

  Baltimore 31, Pit 24
  http://www.nfl.com
  Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

  League: Professional
  Teams: Ravens, Steelers
  Score: Bal 31, Pit 24
  Players: Quandry Ismail, Tony Banks
  Event: Touchdown
  Produced by: NFL.com
  Posted date: 2/02/2000
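The difference between the two kinds of search on this slide can be sketched in a few lines. The example below is illustrative only: the asset record is transcribed from the slide, and the functions keyword_search and attribute_search are hypothetical names, not the Virage or Taalee interfaces.

```python
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "text": (
            "Quandry Ismail and Tony Banks hook up for their third long "
            "touchdown, this time on a 76-yarder, to extend the Ravens' "
            "lead to 31-24 in the third quarter"
        ),
        "metadata": {
            "League": ["Professional"],
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": ["Touchdown"],
            "Produced by": ["NFL.com"],
        },
    },
]


def keyword_search(assets, keyword):
    """Syntactic search: the keyword must literally occur in title or text."""
    needle = keyword.lower()
    return [a for a in assets if needle in f"{a['title']} {a['text']}".lower()]


def attribute_search(assets, **criteria):
    """Attribute search: every (field, value) pair must match the metadata."""
    return [
        a
        for a in assets
        if all(value in a["metadata"].get(field, []) for field, value in criteria.items())
    ]


if __name__ == "__main__":
    print(len(keyword_search(ASSETS, "Steelers")))          # word never appears in the text
    print(len(attribute_search(ASSETS, Teams="Steelers")))  # found via the Teams attribute
```

The asset never mentions the word "Steelers", so only the attribute query over the Teams field retrieves it; the metadata also lets results be combined or sorted by field, e.g. Teams plus Event.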

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION: Personalized Queries

1. My Stock Portfolio
   Microsoft suffers serious hack attack
   Cisco Systems Inc.
   Analyst Safa Rashtchy on Yahoo
   PeopleSoft Inc.
   AT&T Corp.

2. My Football Fantasy Team
   Gators' Spurrier ready for big game
   Tech's Vick looks to become complete QB
   Bucs excited about Hamilton
   Jasper Sanks rumbles into the end zone…
   Edwards explains reasons for leaving BYU

3. Julia Roberts Collection
   Movie Trailer: Notting Hill
   Trailer - Runaway Bride
   Patrick
   Movie Trailer: Stepmom
   Conspiracy Theory

4. Pink Floyd Collection
   Set the Controls for the Heart of the Sun…
   Wish You Were Here
   Round And Around
   Keep Talking
   The Post War Dream

HOT Topics

1. Election 2000
   Video: Explaining the electoral map
   Race for White House hots up
   Seniors Give Gore Florida Edge

2. Middle East Peace Conflict
   More die as Israel steps up security
   Israel braces for suicide bombs
   Pentagon probes Cole's security

3. Napster Controversy
   The Brain Behind Napster
   Napster Lawsuit
   Creative Nomad II

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Content inputs: Realtime Feeds, Time-Shifted Content Aggregator, Web sites and Pages, Content Databases

User inputs: Interests, Preferences

Semantic Engine(TM) with Structured Hi-Quality Semantic Metabase

Output: Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic, personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS, DISH, Wink, AOL-TV)

Semantic Engine(TM)

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Uses: Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support (Sony)

Processing: Categorize, Catalog, Integrate

Collection: Network Content, Affiliate Feeds, Public Sources

Rich Data Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

Related clips (duration - date - source):
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

(Figure) Video -> MPEG Encoder (MPEG-2/4/7) -> Create Scene Description Tree -> Enhanced Digital Cable -> MPEG Decoder -> Retrieve Scene Description Track -> GREAT USER EXPERIENCE

Scene Description Tree: Node = AVO Object

Taalee Semantic Engine: takes a node ("Cisco Systems") and adds an Enhanced XML Description ("Cisco Systems"), producing a Metadata-rich Value-added Node

Object Content Information (OCI)
  Produced by: Fox Sports
  Creation Date: 12/05/2000
  League: NFL
  Teams: Seattle Seahawks, Atlanta Falcons
  Players: John Kitna
  Coaches: Mike Holmgren, Dan Reeves
  Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

HP 93

Intelligent Metadata Creation

Extractor Agents: Content which does contain the words the user asked for

+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: Content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
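The Roger Clemens example can be sketched as a lookup against a small knowledge base of entities and their associations. The code below is a toy illustration of the idea only, not Taalee's Semantic Engine: the knowledge-base layout, the field names, and the functions enrich and find are all assumptions made for this example, and entity recognition is reduced to exact string matching.

```python
# Hypothetical knowledge base: entity -> associated facts (field -> value).
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Teams": "New York Yankees"},
    "Jamal Anderson": {"Teams": "Atlanta Falcons", "League": "NFL"},
}


def _add(metadata, field, value):
    """Append value to a metadata field, avoiding duplicates."""
    values = metadata.setdefault(field, [])
    if value not in values:
        values.append(value)


def enrich(text, knowledge_base=KNOWLEDGE_BASE):
    """Create metadata for text, including facts that the text never states."""
    metadata = {}
    for entity, associations in knowledge_base.items():
        if entity not in text:  # naive entity spotting by exact match
            continue
        _add(metadata, "Players", entity)
        for field, value in associations.items():
            _add(metadata, field, value)  # value-added metadata
    return metadata


def find(stories, field, value):
    """Return stories whose enriched metadata has value in field."""
    return [s for s in stories if value in enrich(s).get(field, [])]


if __name__ == "__main__":
    story = "Roger Clemens struck out ten batters last night."
    print(enrich(story))
    print(find([story], "Teams", "New York Yankees"))
```

The story text never contains "New York Yankees", yet a search on that team finds it because the association was added as metadata at cataloging time rather than demanded of the user at query time.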

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

  Produced by: NFL.com
  Posted Date: 9/20/2000
  League: NFL
  Teams: Atlanta Falcons
  Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that neither Team=Los Angeles Dodgers nor League=National League was present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Posted Date: date of asset posting - extracted automatically

Producer Name: name of content provider that produced the asset

Player Names: names of players mentioned explicitly in the asset - extracted automatically

Sport: name of sport

Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships

League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations

Legend: an arrow from X to Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly, fully described in the many ways the users choose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
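A typed association can be modelled as a labelled edge between entities. The sketch below is a minimal illustration, not Taalee's Semantic Content Model: the `AssociationStore` class, the relation names, and the `related_news` helper are assumptions introduced here, and the only facts stored are the Commerce One examples from the slide.

```python
from collections import defaultdict


class AssociationStore:
    """Tiny in-memory store of (subject, relation) -> set of objects."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj, symmetric=False):
        self._edges[(subject, relation)].add(obj)
        if symmetric:  # e.g. competition holds in both directions
            self._edges[(obj, relation)].add(subject)

    def related(self, subject, relation):
        """Return the objects linked to subject by relation, sorted by name."""
        return sorted(self._edges.get((subject, relation), set()))


store = AssociationStore()
store.add("Commerce One", "PARTICIPATES_IN", "Corporate, Professional & Financial Software")
store.add("Commerce One", "COMPETES_WITH", "Ariba", symmetric=True)


def related_news(company, news_by_company, associations):
    """Collect stories about the competitors of a company (what the user did not ask for)."""
    return {
        rival: news_by_company.get(rival, [])
        for rival in associations.related(company, "COMPETES_WITH")
    }


print(store.related("Commerce One", "COMPETES_WITH"))
```

A keyword index would stop at stories containing "Commerce One"; following the `COMPETES_WITH` edge is what lets a query for one company surface stories about another.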

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Internal Source 1Research

Internal Source 2

External feedsWeb(eg Reuters)

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults KnowledgeBase for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata XML or other format

SemanticApplication

ASPEnterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata centricContent Management Architecture

SemanticEngine

World Model

TaaleeMetabase

Third-partyContent Mgmt

AndSyndication

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic SearchPersonalization etc

A vision for future

Semantic Web Complex Relationships and Knowledge Discovery

Eg InfoQuilt project at LSDIS Lab Univ of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web - the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources(USGS NEIC)

Nuclear Test Sources(Oklahoma Observatory etc)

Nuclear Test May Cause Earthquakes

Is it really true

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
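The rule above can be read as a predicate over pairs of events. The following Python sketch is one possible reading, not the InfoQuilt implementation: it assumes the date difference is measured in days with the earthquake on or after the test, that distance is great-circle distance in miles (the haversine formula is swapped in here because the slide does not define `distance`), and the `Event` type and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; assumption: the 10000 threshold is in miles


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(a: Event, b: Event) -> float:
    """Great-circle (haversine) distance between two events."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a.latitude, a.longitude, b.latitude, b.longitude))
    h = math.sin((lat2 - lat1) / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, h)))


def could_have_caused(test: Event, quake: Event, max_days: int = 30, max_miles: float = 10000) -> bool:
    """True if the quake follows the test within max_days and lies within max_miles."""
    days_after = (quake.event_date - test.event_date).days
    return 0 <= days_after < max_days and distance_miles(test, quake) < max_miles


def candidate_pairs(tests, quakes, **limits):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if could_have_caused(t, q, **limits)]
```

The thresholds are keyword arguments so that the same predicate can be re-run with a tighter time window or radius when exploring the hypothesis.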

Knowledge Discovery - Example

When was the first recorded nuclear test conducted

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and > 9 on the Richter scale per year starting from 1900, find average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
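The grouped query can be sketched as a small aggregation. This is an illustrative sketch only: the input format (a list of `(year, magnitude)` pairs), the half-open treatment of range boundaries, and the function names are assumptions, and no real earthquake data is included.

```python
from collections import Counter

# Magnitude ranges from the slide, treated as half-open intervals [low, high).
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_range(magnitude):
    """Return the (low, high) range containing magnitude, or None if below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range), ignoring earlier years."""
    counts = Counter()
    for year, magnitude in quakes:
        bucket = magnitude_range(magnitude)
        if year >= start_year and bucket is not None:
            counts[(year, bucket)] += 1
    return counts


def average_per_year(counts, bucket, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive span of years."""
    years = range(first_year, last_year + 1)
    return sum(counts[(year, bucket)] for year in years) / len(years)
```

Comparing `average_per_year` for a range before and after a chosen year is the kind of check the slide uses to argue that only the lower-magnitude ranges changed.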

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site was within a radius of 10000 miles from the epicenter of the earthquake

Demo

ResourcesReferences

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management Using Metadata to Integrate and Apply Digital Media Amit Sheth and Wolfgang Klas Eds McGraw Hill ISBN 0-07-057735-8 1998


HP 13

RDF Example 2

[Figure: RDF graph. Resources: URI:TALK, URI:AMIT, URI:LIB. Properties and values: dc:title "Mysteries of Metadata", dc:creator, BIB:Name "Amit Sheth", BIB:Email amit@taalee.com, BIB:Aff]

HP 14

RDFS (RDF Schema)

Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, …)

Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties

HP 15

RDF Based Web

HTML

Resources

RDF/XML Descriptions

RDF Schemas

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International, inter-discipline, W3C community consensus

"Semantic" interface among resource description communities (very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<xml>
<namespace href="http://w3.org/rdf-schema" as="RDF"/>
<namespace href="http://metadata.net/DC" as="DC"/>
<RDF:Abbreviated>
<RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
  DC:Title="I've Never Metadata I've Never Liked"
  DC:Creator="Mary Crystal"
  DC:Subject="Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE

HP 20

NewsML…

It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device
It provides an accurate, objective set of description tools which help qualify the information and make the search more precise
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc. and updated hourly

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:

Basic DC
Extensions
"Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
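A record like this can be read with any namespace-aware XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the `dublin_core_fields` helper and the shortened `RECORD` string are illustrative assumptions, with the punctuation of the URIs restored on the assumption that the slide transcript dropped it.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Shortened copy of the slide's example (assumed punctuation).
RECORD = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core_fields(xml_text):
    """Return the described resource and its Dublin Core elements as a dict."""
    root = ET.fromstring(xml_text)
    description = root.find("rdf:Description", NS)
    fields = {"about": description.get(RDF_ABOUT)}
    for child in description:
        uri, _, local_name = child.tag[1:].partition("}")  # tag looks like "{uri}name"
        if uri == NS["dc"]:
            # Literal values live in the text; references live in rdf:resource.
            fields[local_name] = child.text or child.get(RDF_RESOURCE)
    return fields


print(dublin_core_fields(RECORD))
```

Because PRISM reuses Dublin Core element names, a consumer that only understands `dc:` elements can still extract the title, creator, and format without knowing any PRISM-specific vocabulary.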

HP 25

VoiceXML

A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source httpwwwvoicexmlorg

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number "find-me" services

Intranet - inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define

roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

format and method of content exchange (e.g. sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook, version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC …(ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content inside ice-item (here, comic-strip): domain-specific metadata
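The split between the ICE vocabulary and the domain vocabulary can be made concrete by walking a payload and separating the two. This is a minimal sketch using Python's standard `xml.etree.ElementTree`, not an ICE implementation: the `split_items` helper is hypothetical, and the `PAYLOAD` string is a trimmed copy of the slide's example with punctuation restored by assumption and the image data omitted.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the slide's payload (assumed punctuation, image body omitted).
PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" pubdate="20010515"/>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_items(payload_xml):
    """For each ice-item, return (protocol attributes, list of domain elements)."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        protocol = dict(item.attrib)  # content-independent, defined by ICE
        domain = [(child.tag, dict(child.attrib)) for child in item]  # defined by the industry vocabulary
        items.append((protocol, domain))
    return items


for protocol, domain in split_items(PAYLOAD):
    print(protocol["item-id"], protocol["filename"], domain)
```

A syndication server only needs the first half of each tuple to route and log the delivery; the second half is opaque to it and meaningful only to an application that knows the domain vocabulary.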

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

Content DevelopmentManagement

Content DeliveryApplication ContentManagement

Content AuthoringDigital Asset Management

Software ConfigurationManagement

Document ProcessManagement

Metadata ManagementRecombinationPersonalization

Edge Network Delivery

Streaming Media DeliveryCaching

Source httpwwwvignettecom

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…

Kansas State
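When two models name the same data differently, the simplest form of interoperability is a tag-to-tag mapping. The sketch below illustrates this with the five tag pairs shown on the slide; the `translate` helper is hypothetical, and real FGDC/UDK mappings would be larger and not always one-to-one.

```python
# Tag correspondences taken from the slide (UDK tag spelled as it appears there).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, mapping):
    """Rename the tags of a metadata record; unmapped tags are kept unchanged."""
    return {mapping.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = translate(fgdc_record, FGDC_TO_UDK)
print(udk_record)
```

A flat rename like this only resolves naming differences; differences in meaning or granularity between the two models are the part that needs the ontology-based approaches discussed later in the deck.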

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

FrameworksInfrastructures (XCM)

MetadataApplication Specific

ICE

Media Specific

MPEG7 VoiceXML

Domain Specific

NewsML FGDCUDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services Taalee Content Applications

Where is the content

Whose is it

ProduceAggregate

CatalogIndex

What other content is it related to

Integrate Syndicate

What is the right content for this

user

Personalize

What is the best way to

monetize this interaction

Interactive Marketing

BroadcastWirelineWirelessInteractive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

FormsTypesIngest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content HandlingIngest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

GlobalEnterpriseWeb Repositories

METADATA EXTRACTORS

Digital Maps

NexisUPIAP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
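A layout rule of this kind maps naturally onto a regular expression. The following Python sketch is one way to express it, under stated assumptions: the pattern, the `extract_date` helper, and the choice to make the commas optional (the transcript has lost its punctuation) are illustrative and are not the extractor shown in the original demo.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int   (commas treated as optional)
DATE_RULE = re.compile(
    r"(?P<day>" + "|".join(DAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+(?P<year>\d{4})"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match.group("month")) + 1
    return date(int(match.group("year")), month_number, int(match.group("dom")))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1 1997 - 0530 MDT"
print(extract_date(header))
```

Rules like this are purely syntactic: they find a date wherever the layout matches, but say nothing about what the date refers to, which is the limitation the later semantic slides address.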

Traditional TextCategorization

StatisticalAI Techniques

Classify Place ina taxonomy

feed

Customer Training

Set

RoutingDistribution

Customer Article Feed

4715

Standard Metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Classification of Article 4715

Knowledge-base amp StatisticalAI Techniques

ClassifyPlace ina taxonomy

MetadataCatalog

Content Manager

Precise syndicationfiltering

fd

Article 4715 Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Standard metadata

Semantic metadata

FTECompany AnalysisConference Calls

EarningsStock Analysis

NYSEMember Companies

Market NewsIPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee's Categorization & Automatic Metadata Creation

Taalee Training

Set

Customer Training

Set ee ENTCompany AnalysisConference Calls

EarningsStock Analysis

Classification of Article 4715

Article Feed4715 RoutingDistribution

Map to another taxonomy

HP 49

Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN IUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN

Video Segment with Associated Text

Segment Description

SemanticMetadata

AutoCategorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text From Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference: (Statistical) Information Retrieval; Statistical Learning/AI (Bayesian, Neural Networks, HMM, …); Logic Based (Description Logic); Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methodsCluster Analysis

LearningAI and Collab Filtering

Reference dataConcept-termsDictionaryThesaurusBy topicindustrysubjectdomain

Word or Phrase

OntologiesDomain Models

KnowledgeBaseBy Entities and Relationships

deeperunderstanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example Interrelated ontologies

LANDUSE

COMMERCIAL

INDUSTRIALRURAL

RESIDENTIAL

AGRICULTURAL

MILITARYRECREATIONAL

LAND(SITE)

CULTIVATEDAREA

GREENLANDAREA LAND

BANK

ZONING

LANDFILLSITE

WASTEDISPOSAL

RECYCLING

HAZARDOUS

LANDFILLRESOURCE REC

SOLID SEWAGE

shredding

magneticseparation

screening

washing

NATURALDISASTER

EARTHQUAKE

causes

LANDSLIDE

VOLCANO

STORMFLOOD

FIRE

AVALANCHE

TSUNAMI

causes

causes

causes

HP 61

Large Vocabularies TaxonomiesOntologies

WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords
attribute-based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based (?) the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user “think” about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

“Normal” Content can only be “found” if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more “intelligent” through automatic analysis of every individual content item to create:

• Context of the Content

• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google’s Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage Search on football touchdown

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens’ lead to 31-24 in the third quarter

League: Professional

Teams: Ravens, Steelers

Score: Bal 31, Pit 24

Players: Quandry Ismail, Tony Banks

Event: Touchdown

Produced by: NFL.com

Posted date: 2/02/2000

HP 72

Taaleersquos Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes

Uniform Metadata for Content from Multiple Sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content

Personalization and Targeting

Interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries amp Hot Topics

PERSONALIZATION

3 Julia Roberts Collection

Movie Trailer Notting Hill

Trailer - Runaway Bride

Patrick

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sun…

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1 My Stock Portfolio

Microsoft suffers serious hack attack

Cisco Systems Inc

Analyst Safa Rashtchy on Yahoo

PeopleSoft Inc

AT&T Corp

2 My Football Fantasy Team

Gators’ Spurrier ready for big game

Tech’s Vick looks to become complete QB

Bucs excited about Hamilton

Jasper Sanks rumbles into the end zone…

Edwards explains reasons for leaving BYU more…

more…

more…

more…

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Coles security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II more…

HOT Topics

more…

more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos

Buy Russell Crowe Videos

Buy Christopher Plummer Videos

Buy Diane Venora Videos

Buy Philip Baker Hall Videos

Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site

User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

Clicking on MyStocks brings down user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user’s portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

11/08 ON24 Payne

11/07 ON24 H&Q

11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as “Strong Buy” by H&Q Analyst

HP 87

iTV: Taalee’s Extreme Personalization

[Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Semantic Engine™; Meta-Data Tagged Content; Content “Programs”; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming; Structured Hi-Quality Semantic Metabase]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in

This screen is customizable with interactivity feature using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information – including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Figure labels: Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication; Production Support; Sony; Categorize; Catalog; Integrate; Collection; Processing; Network Content; Affiliate Feeds; Public Sources; Rich Data; Metabase]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria’s commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

Gore Demands That Recount Restart (9:40 PM)

Gore Says Fla. Can’t Name Electors (4:50 PM)

Bush Meets Colin Powell at Ranch (1:22 PM)

Market Tumbles on Earnings Warning (9:27 AM)

Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast

Produced by: Euronews

Posted Date: 11/30/2000

Event: Election 2000

Location: Tallahassee, Florida, USA

People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee’s semantic metadata enables powerful access to content used by Enterprise’s customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) - 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore’s appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

Description

Produced by: CNN

Posted Date: 12/07/2000

Reporter: David Lewis

Event: Election 2000

Location: Tallahassee, Florida, USA

People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadata’s role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML Description

“Cisco Systems” Node

Taalee Semantic Engine

Produced by: Fox Sports

Creation Date: 12/05/2000

League: NFL

Teams: Seattle Seahawks, Atlanta Falcons

Players: John Kitna

Coaches: Mike Holmgren, Dan Reeves

Location: Atlanta

Object Content Information (OCI)

Metadata-rich Value-added Node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for

Extractor Agents

Content which does not contain the words the user asked for but is about what he asked for

Value-added Metadata

Content the user did not think to ask for but which he needs to know

Semantic Associations

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the “right” keyword

For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
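The enrichment step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `KNOWLEDGE_BASE` dictionary, the relation names, and the `enrich` and `matches` helpers are hypothetical and do not represent Taalee's actual knowledge base or engine. The sketch shows how a story that only mentions a player becomes findable by team once associated metadata is attached.

```python
# Hypothetical knowledge base of semantic associations (illustrative only).
# Each entity maps to (relationship, target entity) pairs.
KNOWLEDGE_BASE = {
    "Roger Clemens": [
        ("plays_sport", "Baseball"),
        ("plays_for", "New York Yankees"),
    ],
}

# Hypothetical mapping from relationship names to metadata field names.
FIELD_FOR_RELATION = {"plays_sport": "Sport", "plays_for": "Team"}


def enrich(metadata):
    """Return a copy of the metadata with fields implied by the knowledge base."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        for relation, target in KNOWLEDGE_BASE.get(player, []):
            field = FIELD_FOR_RELATION.get(relation)
            if field and target not in enriched.setdefault(field, []):
                enriched[field].append(target)
    return enriched


def matches(metadata, query):
    """Attribute search: true if the query equals any metadata value."""
    return any(query in values for values in metadata.values())


# Metadata extracted from the story text itself: only the player is mentioned.
story = {"Players": ["Roger Clemens"]}
enriched_story = enrich(story)

print(matches(story, "New York Yankees"))           # not findable by team
print(matches(enriched_story, "New York Yankees"))  # findable after enrichment
```

Keeping the associations in a separate table, rather than in each asset, means that a change in the knowledge base (for example, a player changing teams) can be applied to new assets without re-reading their content.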

HP 96

Guided Demo for Value Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked “REAL”)

• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled “I want out”) & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked “REAL”)

• View the source HTML page that has the original story and locate this story with the heading “I want out”

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 – Snapshots (“Jamal Anderson”)

Search for ‘Jamal Anderson’ in ‘Football’

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search

HP 99

Example 2 – Snapshots (“Gary Sheffield”)

Search for ‘Gary Sheffield’ in ‘Baseball’

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content – Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset

Posted Date: Date of asset posting – Extracted automatically

Player Names: Name of players mentioned explicitly in the asset – Extracted automatically

Team Name: Name of team for which player plays – Not mentioned explicitly in asset – Value-added using Taalee’s semantic relationships

League Name: Name of league to which the player’s team belongs – Not mentioned explicitly in asset – Value-added by Taalee’s processing based on semantic associations

Sport: Name of sport

Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly, fully described in the many ways the users choose to interact

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company ‘Commerce One’

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
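The lookup behind this example can be sketched as a small table of associations plus a filter over catalogued stories. Everything named below is an assumption made for illustration: the triple layout, the `competitors` and `related_news` helpers, and the placeholder story titles are hypothetical and are not taken from Taalee's Semantic Engine. The sketch also assumes that "competes with" can be read in both directions.

```python
from collections import namedtuple

# A catalogued story, reduced to the one metadata field this sketch needs.
Story = namedtuple("Story", ["title", "company"])

# Hypothetical association table of (subject, relationship, object) triples,
# using only the relationships named on the slide.
ASSOCIATIONS = [
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def competitors(company):
    """Companies linked to `company` by competes_with, in either direction."""
    found = set()
    for subject, relation, obj in ASSOCIATIONS:
        if relation != "competes_with":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return sorted(found)


def related_news(company, stories):
    """Stories about the company's competitors, which the user did not ask for."""
    rivals = competitors(company)
    return [story for story in stories if story.company in rivals]


# Placeholder stories, made up for this example.
STORIES = [
    Story("Placeholder story about Commerce One", "Commerce One"),
    Story("Placeholder story about Ariba", "Ariba"),
    Story("Placeholder story about another company", "Example Retailer"),
]

for story in related_news("Commerce One", STORIES):
    print(story.title)
```

A keyword index alone could not produce the Ariba story for a "Commerce One" query, since the story need not mention Commerce One at all; the association table supplies the missing link.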

HP 104

Metadata-centric Content Management Architecture

[Figure labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata, XML or other format; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults KnowledgeBase for Cisco’s competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story – passed on to Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology, Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

Structural modeling obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = “Frame System for WWW”

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]

“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]

“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable Tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-Layered Architecture of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support – OIL, DAML-O

Schema Layer

Definition of Vocabulary: RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL – as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
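The rule above can be written as an ordinary predicate. The following Python sketch is an illustration, not the InfoQuilt implementation: the dictionary field names mirror the rule, but the haversine formula for `distance`, the use of miles and days as units, and the added lower bound of zero days (so that the earthquake comes after the test) are assumptions, and the sample events are made up.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; assumes distances in miles


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def date_difference(earlier, later):
    """Whole days from `earlier` to `later` (negative if `later` comes first)."""
    return (later - earlier).days


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule, additionally requiring the quake to follow the test."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


# Made-up events, for illustration only.
nuclear_test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_soon = {"eventDate": date(1970, 1, 15), "latitude": 36.0, "longitude": -117.0}
quake_late = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}

print(may_have_caused(nuclear_test, quake_soon))
print(may_have_caused(nuclear_test, quake_late))
```

Because the thresholds are parameters, the same predicate can be re-run with a tighter radius or a shorter window when exploring how sensitive the hypothesis is to the definition of "nearby" and "some time after".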

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year starting from 1900, find average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
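The grouping query on this slide amounts to bucketing events by year and magnitude range, then averaging the yearly counts. The Python sketch below shows one way to do that; the half-open range boundaries, the `(year, magnitude)` tuple layout, and the sample data are assumptions made for illustration, not results from the demo.

```python
from collections import Counter

# Magnitude ranges from the slide, assumed half-open: low <= m < high.
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_range(magnitude):
    """Return the (low, high) range containing the magnitude, or None if below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None


def count_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range), ignoring earlier years."""
    counts = Counter()
    for year, magnitude in quakes:
        bucket = magnitude_range(magnitude)
        if year >= start_year and bucket is not None:
            counts[(year, bucket)] += 1
    return counts


def average_per_year(counts, bucket, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive year span."""
    years = range(first_year, last_year + 1)
    return sum(counts[(year, bucket)] for year in years) / len(years)


# Made-up (year, magnitude) records, for illustration only.
SAMPLE = [(1899, 7.1), (1950, 5.9), (1950, 6.5), (1950, 6.9), (1951, 6.1), (1951, 5.0)]

counts = count_per_year(SAMPLE)
print(average_per_year(counts, (6.0, 7.0), 1950, 1951))
```

Comparing `average_per_year` for the same range before and after a chosen year is the kind of check the slide's conclusion relies on.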

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles of the test site

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax

ICE: www.icestandard.org

Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05

XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03

DAML: www.daml.org

NEWSML: newsshowcase.reuters.com

PRISM: www.prismstandard.org/techdev/prismspec1.asp

XCM: www.vignette.com

OIL: www.ontoknowledge.org/oil

SEMANTICWEB: www.semanticweb.org

VOICEXML: www.voicexml.org

MPEG7: www.darmstadt.gmd.de/mobile/MPEG7

Taalee: www.taalee.com

Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

HP 14

RDFS (RDF Schema)

Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce…)

Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties

HP 15

RDF Based Web

HTML

Resources

RDF/XML Descriptions

RDF Schemas

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International, inter-discipline, W3C community consensus

“Semantic” interface among resource description communities (very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<?xml?>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
    DC:Title = "I've Never Metadata I've Never Liked"
    DC:Creator = "Mary Crystal"
    DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</?xml?>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content

NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries

Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE

HP 20

NewsML…

It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device

Accurate, objective set of description tools which help qualify the information and make the search more precise

NewsML allows a range of metadata to be attached to a multi-media story, including a detailed computer-readable description of what an item is about

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata

Version 1.0, April 2001

Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)

Idea: “a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts”

Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML

Designed to be used in a simple, straightforward way over the Internet

Compatible with NewsML

Integrates easily with ICE (for syndication)

Vocabulary: Basic DC; Extensions; “Controlled Vocabularies”, e.g., “North American Industrial Classification System” (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
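To make the PRISM example concrete, here is a minimal sketch of how a consumer might read the Dublin Core fields out of such a record. It uses only Python's standard `xml.etree.ElementTree`; the function name `dublin_core_fields` and the shortened sample record are illustrative assumptions, not part of the PRISM specification.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as used in the slide's example record.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Shortened copy of the slide's record (assumption: trimmed for brevity).
PRISM_RECORD = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core_fields(rdf_xml):
    """Return {resource URI: {dc field name: text}} for each rdf:Description."""
    root = ET.fromstring(rdf_xml)
    dc_prefix = "{%s}" % NS["dc"]
    records = {}
    for description in root.findall("rdf:Description", NS):
        about = description.get("{%s}about" % NS["rdf"])
        fields = {}
        for child in description:
            # Keep only elements in the Dublin Core namespace.
            if child.tag.startswith(dc_prefix):
                fields[child.tag[len(dc_prefix):]] = (child.text or "").strip()
        records[about] = fields
    return records


if __name__ == "__main__":
    for uri, fields in dublin_core_fields(PRISM_RECORD).items():
        print(uri, fields)
```

Because the record is plain RDF/XML built on Dublin Core, a generic XML parser is enough to read it; no PRISM-specific tooling is assumed here.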

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number "find-me" services

Intranet - inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main goal: an efficient and extensible content syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define:

• roles and responsibilities of the involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

• format and method of content exchange (e.g. sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: a content aggregator and distributor
Subscriber: a content consumer
Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: the current content of a subscription
ICE Package: a delivery of commands to update a collection, such as the addition of content items
ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

The comic-strip element is the content (domain-specific metadata)
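The split between content-independent ICE metadata and domain-specific metadata can be sketched in a few lines. The example below is an illustration only: it parses a trimmed copy of the payload above with Python's standard library and separates the attributes on each `ice-item` (protocol level) from whatever domain element it carries. The function name `split_metadata` and the trimmed payload are assumptions, not part of the ICE specification.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's payload (assumption: DOCTYPE and image data omitted).
ICE_PAYLOAD = """
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321"
                name="Cartoon" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" pubdate="20010515"/>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
"""


def split_metadata(payload_xml):
    """For each ice-item, separate protocol attributes from domain content."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        # Attributes on ice-item belong to the syndication protocol.
        protocol = dict(item.attrib)
        # Child elements come from the industry-specific vocabulary.
        domain = [dict(child.attrib, element=child.tag) for child in item]
        items.append({"protocol": protocol, "domain": domain})
    return items


if __name__ == "__main__":
    for entry in split_metadata(ICE_PAYLOAD):
        print("ICE (content-independent):", entry["protocol"])
        print("Domain-specific:", entry["domain"])
```

A subscriber built this way can handle delivery and logging without understanding the comic-strip vocabulary, which is the point of keeping the two layers apart.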

HP 36

XCM (eXtended Content Management)

a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are

Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM

eXtended Content Management

[Figure: the XCM segments and their components, as labeled in the diagram]

Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management

Application Content Management: Metadata Management, Recombination, Personalization

Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...

Kansas State
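One simple way to bridge two such models is a tag-name crosswalk. The sketch below is a minimal illustration, assuming the five tag pairs shown on the slide correspond one-to-one; the function name `fgdc_to_udk` and the extra `Originator` field in the usage example are hypothetical, added only to show what happens to unmapped tags.

```python
# Crosswalk built from the tag pairs shown on the slide (assumed one-to-one).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to UDK tags; return (translated, unmapped) dicts."""
    translated = {}
    unmapped = {}
    for tag, value in record.items():
        if tag in FGDC_TO_UDK:
            translated[FGDC_TO_UDK[tag]] = value
        else:
            # Tags without a known counterpart are reported, not dropped.
            unmapped[tag] = value
    return translated, unmapped


if __name__ == "__main__":
    fgdc_record = {
        "Title": "Dakota Aquifer",
        "Direct Spatial Reference Method": "Vector",
        "Originator": "unknown",  # hypothetical tag with no mapping
    }
    print(fgdc_to_udk(fgdc_record))
```

A renaming table handles only the syntactic mismatch. Where two models differ in what a field means, a crosswalk like this is not enough, which is where ontologies come in.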

HP 39

Different views of Metadata

Domain-independent specifications (RDF)

Frameworks/Infrastructures (XCM)

Application-specific metadata: ICE

Media-specific metadata: MPEG7, VoiceXML

Domain-specific metadata: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services / Taalee Content Applications

Where is the content? Whose is it? -> Produce/Aggregate, Catalog/Index

What other content is it related to? -> Integrate, Syndicate

What is the right content for this user? -> Personalize

What is the best way to monetize this interaction? -> Interactive Marketing

Channels: Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web sites, content feeds and private repositories
Types: text, graphics, audio, video, multimedia
Forms: unstructured text, semi-structured text, structured text (+ media); static or dynamic
Ingest: feed (push), Web (pull), repository/database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange
Feed handlers, crawlers, screen scrapers, bots, software agents

Centralized, distributed, mobile/migratory

HP 45

Information Extraction for Metadata Creation

[Figure: METADATA EXTRACTORS operate over global/enterprise/Web repositories: digital maps, Nexis/UPI/AP, documents, digital audio, data stores, digital videos and digital images]

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
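The layout rule above (a day name, a month name, an integer, a comma, an integer) maps naturally onto a regular expression. The following is a minimal sketch of such a syntactic extractor, not the extractor used in the workshop. The names `DATE_RULE` and `extract_date` are illustrative, and the commas are treated as optional on the assumption that some sources omit them.

```python
import re

_DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
_MONTHS = ("January|February|March|April|May|June|July|"
           "August|September|October|November|December")

# Layout rule: Date => day month int ',' int (commas optional by assumption).
DATE_RULE = re.compile(
    rf"\b(?P<day>{_DAYS}),?\s+(?P<month>{_MONTHS})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return {
        "day": match.group("day"),
        "month": match.group("month"),
        "day_of_month": int(match.group("dom")),
        "year": int(match.group("year")),
    }


if __name__ == "__main__":
    report = ("INCIDENT MANAGEMENT SITUATION REPORT "
              "Friday, August 1, 1997 - 0530 MDT")
    print(extract_date(report))
```

A rule like this captures where a date sits in the layout, but it has no notion of what the report is about; that limitation is what the later slides on semantic metadata address.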

Traditional Text Categorization

[Figure: an article feed is classified (placed in a taxonomy) using statistical/AI techniques and a customer training set, then passed to routing/distribution]

Classification of Article 4715

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

[Figure: Automated Content Enrichment (ACE) and the Taalee Enterprise Customization Suite. An article feed is classified (placed in a taxonomy) using knowledge-base and statistical/AI techniques, with a Taalee training set and a customer training set. Results go to the metadata catalog, the content manager, routing/distribution and precise syndication/filtering, and can be mapped to another taxonomy]

Classification of Article 4715

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Taxonomy nodes shown for the article
FTE, ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: dictionary, thesaurus, reference data, vocabulary

B. Facts with relationships: taxonomy (categories), ontology, domain modeling (e.g. Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
Statistical (information retrieval)
Statistical learning/AI (Bayesian, neural networks, HMM, ...)
Logic based (description logic)
Natural language/grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: cluster analysis, learning/AI and collaborative filtering

Reference data/Concept-terms: dictionary, thesaurus; by topic/industry/subject/domain; word or phrase

Ontologies/Domain models: knowledge base; by entities and relationships

(Arrow in the diagram: deeper understanding)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description and representation of the involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (ontological commitment)

HP 58

Ontology

Description includes: attributes, domain rules, functional dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: four interrelated ontologies. Node and edge labels recovered from the diagram:]

LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several of these nodes are connected by "causes" edges

HP 61

Large Vocabularies/Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

Traditional queries based on keywords
Attribute-based queries
Content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing

• Context of the Content

• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create

Context of the Content
Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDISUGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage search on "football touchdown"

Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline...

Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline...

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
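The difference between the two kinds of search can be sketched with a toy catalog. The example below is illustrative only: the asset texts come from the slide, but the catalog structure, the metadata assigned to the interview asset, and the function names `keyword_search` and `attribute_search` are assumptions, not a description of either Virage's or Taalee's product.

```python
# Toy catalog: free text plus structured (semantic) metadata per asset.
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "text": "Tony Banks hooks up for a third long touchdown, a 76-yarder",
        "metadata": {"League": "Professional", "Teams": ["Ravens", "Steelers"],
                     "Players": ["Tony Banks"], "Event": "Touchdown"},
    },
    {
        "title": "Brian Griese Interview, Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        # Hypothetical metadata: the slide shows none for this asset.
        "metadata": {"Players": ["Brian Griese"], "Event": "Interview"},
    },
]


def keyword_search(assets, keyword):
    """Return titles of assets whose text contains the keyword."""
    needle = keyword.lower()
    return [a["title"] for a in assets if needle in a["text"].lower()]


def _matches(value, wanted):
    """True if a metadata value (scalar or list) equals or contains wanted."""
    if value is None:
        return False
    if isinstance(value, list):
        return wanted in value
    return value == wanted


def attribute_search(assets, **criteria):
    """Return titles of assets whose metadata satisfies every criterion."""
    return [
        a["title"] for a in assets
        if all(_matches(a["metadata"].get(field), wanted)
               for field, wanted in criteria.items())
    ]


if __name__ == "__main__":
    print(keyword_search(ASSETS, "touchdown"))         # both assets mention it
    print(attribute_search(ASSETS, Event="Touchdown"))  # only the actual play
```

The keyword query returns the interview because the word appears in its description. The attribute query returns only the asset whose event actually is a touchdown, which is the contrast the slide draws.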

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context- and domain-specific attributes
Uniform metadata for content from multiple sources
Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content: personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more...

2. My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...

3. Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more...

4. Pink Floyd Collection
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more...

HOT Topics

1. Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more...

2. Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more...

3. Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...

HP 80

Metadata Targeting

SemanticInteractive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints, e.g. Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection, Processing

Network Content

Affiliate Feeds

Public Sources

Rich Data

Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC

(2:53) - 12/06/00 - CBS

(5:16) - 12/06/00 - ABC

(2:46) - 12/06/00 - FOX

(1:33) - 12/06/00 - NBC

(5:33) - 12/06/00

(3:57) - 12/06/00 - CBS

(4:27) - 12/06/00 - ABC

(3:44) - 12/06/00 - FOX

(7:24) - 12/06/00 - CBS

(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC

(2:33) - 12/06/00 - CBS

(3:12) - 12/06/00 - NNS

(0:32) - 12/06/00 - CBS

(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-rich, value-added node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for (Extractor Agents)

+ Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)

+ Content the user did not think to ask for, but which he needs to know (Semantic Associations)

Metadata for Intelligent Content: Usage

HP 94

Intelligent Contentvia

Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g. COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
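The Roger Clemens example can be sketched as a lookup against a small knowledge base. The code below is a minimal illustration of the idea, not Taalee's patented extraction technology: the knowledge-base layout, the field names and the functions `add_value_added_metadata` and `matches` are all assumptions made for this sketch.

```python
# Hypothetical knowledge base: facts about entities, keyed by entity name.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Teams": "New York Yankees", "Sport": "Baseball"},
    "Jamal Anderson": {"Teams": "Atlanta Falcons", "League": "NFL",
                       "Sport": "Football"},
}


def add_value_added_metadata(extracted):
    """Return a copy of the extracted metadata enriched with known facts.

    `extracted` maps field names to lists of values found in the asset.
    """
    enriched = {field: list(values) for field, values in extracted.items()}
    for player in extracted.get("Players", []):
        for field, value in KNOWLEDGE_BASE.get(player, {}).items():
            values = enriched.setdefault(field, [])
            if value not in values:
                values.append(value)
    return enriched


def matches(metadata, term):
    """True if the search term occurs in any metadata value."""
    needle = term.lower()
    return any(needle in value.lower()
               for values in metadata.values() for value in values)


if __name__ == "__main__":
    story = {"Players": ["Roger Clemens"]}   # what the extractor found
    print(matches(story, "Yankees"))          # not findable yet
    enriched = add_value_added_metadata(story)
    print(enriched)
    print(matches(enriched, "Yankees"))       # findable after enrichment
```

The story becomes findable under "Yankees" even though the word never occurs in it, because the team is added from the knowledge base rather than from the text.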

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10: Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10: Anderson TD run"

• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

[Figure: metadata fields attached to a Rich Media Sports Asset]

Producer Name: name of the content provider that produced the asset

Posted Date: date of asset posting - extracted automatically

Player Names: names of players mentioned explicitly in the asset - extracted automatically

Sport: name of sport

Team Name: name of the team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships

League Name: name of the league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing, based on semantic associations

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content.

• They do not understand the meaning, context, or relationships of keywords.

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
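A minimal sketch of the difference: once entities and relationships are stored explicitly, "associated information" becomes a lookup rather than a keyword match. The triple list, relation names, and helper functions below are assumptions chosen to mirror the Commerce One example, not Taalee's Semantic Content Model.

```python
from collections import defaultdict

# Hypothetical facts in (subject, relation, object) form, taken from the
# example on the slide. Relation names are assumptions.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def build_index(triples):
    """Index triples as entity -> relation -> set of related entities."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
        if relation == "competes_with":
            # Competition is treated as symmetric in this sketch.
            index[obj][relation].add(subject)
    return index


def associated(index, entity, relation):
    """Return entities related to `entity` by `relation`, sorted by name."""
    return sorted(index.get(entity, {}).get(relation, set()))
```

With this index, a query about Commerce One can also surface content about Ariba, even when the word "Ariba" never appears in the query.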

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'.

Links to news on companies that compete against Commerce One.

Links to news on companies Commerce One competes against. (To view news on Ariba, click on the link for Ariba.)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically.

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Components: Extractor Agents 1, 2, 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 is passed on to add semantic associations.

2. Consults the KnowledgeBase for Cisco's competition.

3. Returns result: Lucent is a competitor of Cisco.

4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard (Story on Lucent, Story on Cisco).

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY
  Related Stock News
  Industry News
  Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
  Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
  Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search/Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data.

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition. Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner.
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary - RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
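The rule on this slide can be sketched as a predicate over two records. The thresholds (30 and 10000) come from the slide; everything else is an assumption made for illustration: the record layout, the interpretation of the date difference as days with the earthquake occurring on or after the test, the use of miles, and the haversine great-circle formula as the `distance` function. This is not the InfoQuilt implementation.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule to one (nuclear test, earthquake) pair.

    Both arguments are dicts with hypothetical keys:
    "event_date" (datetime.date), "latitude", "longitude".
    """
    days = (quake["event_date"] - test["event_date"]).days
    if not 0 <= days < max_days:
        return False  # earthquake must follow the test within the window
    dist = distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"])
    return dist < max_miles
```

Running the predicate over every pair drawn from the two sources yields the candidate "causes" relationships that the following slides examine.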

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes.

The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lay within a radius of 10,000 miles of the earthquake's epicenter.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media. Amit Sheth and Wolfgang Klas, Eds. McGraw Hill, ISBN 0-07-057735-8, 1998.


HP 15

RDF Based Web

HTML

Resources

RDF/XML Descriptions

RDF Schemas

Source: http://www.w3c.rl.ac.uk

HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International, inter-discipline, W3C community consensus

"Semantic" interface among resource description communities (very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<?xml?>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
    DC:Title = "I've Never Metadata I've Never Liked"
    DC:Creator = "Mary Crystal"
    DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</?xml?>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL.

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space.

HP 19

NewsML

NewsML is a packaging and metadata format for news content.

NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.

Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.

HP 20

NewsML...

It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.

It provides an accurate, objective set of description tools which help qualify the information and make the search more precise.

NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
  Basic: DC
  Extensions: "Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
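To show how an application might consume such a record, the sketch below reads the Dublin Core fields out of an RDF/XML description using only Python's standard library. The embedded XML is a restored copy of the PRISM example with its punctuation filled back in, which is itself an assumption about the original slide. The function name and the returned dictionary layout are hypothetical. A real system would more likely use a dedicated RDF parser.

```python
import xml.etree.ElementTree as ET

# Restored copy of the slide's example (punctuation is an assumption).
PRISM_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def dublin_core_fields(xml_bytes):
    """Map each described resource URL to its Dublin Core fields."""
    root = ET.fromstring(xml_bytes)
    records = {}
    for desc in root.findall("rdf:Description", NS):
        about = desc.get("{%s}about" % NS["rdf"])
        fields = {}
        for child in desc:
            if not child.tag.startswith("{%s}" % NS["dc"]):
                continue  # skip non-DC properties
            name = child.tag.split("}", 1)[1]
            text = (child.text or "").strip()
            # Literal values are element text; resource values are
            # carried in the rdf:resource attribute.
            fields[name] = text or child.get("{%s}resource" % NS["rdf"])
        records[about] = fields
    return records
```

Because the element names come from Dublin Core rather than from a site-specific schema, the same function would work on descriptions produced by any publisher using this vocabulary.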

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - News, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number "find-me" services

Intranet - Inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (e.g., yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000, submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define:

  roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

  format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata): the comic-strip element and its attributes
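The split between content-independent ICE metadata and domain-specific metadata can be illustrated by walking a payload and separating the attributes of each `ice-item` envelope from the domain element it carries. This is a sketch under stated assumptions: the payload below is a simplified, punctuation-restored copy of the slide's example (the DOCTYPE is omitted and the image data is replaced by a placeholder), and `split_metadata` is a hypothetical helper, not part of any ICE toolkit.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the slide's payload; attribute punctuation restored
# by assumption, image data replaced by a placeholder.
PAYLOAD = b"""<ice-payload payload-id="ipl-80a56cfe"
    timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups"
                     pubdate="20010515">(ASCII-encoded image)</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_bytes):
    """Separate ICE envelope attributes from domain-specific content."""
    root = ET.fromstring(payload_bytes)
    items = []
    for item in root.iter("ice-item"):
        # Child elements of ice-item belong to the domain vocabulary.
        domain = [{"element": child.tag, **child.attrib} for child in item]
        items.append({"ice": dict(item.attrib), "domain": domain})
    return items
```

The `ice` part is the same for any kind of syndicated content, while the `domain` part changes with the industry vocabulary, which is the "ICE vocabulary + domain vocabulary" idea from the earlier slide.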

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

Content Development/Management: Content Authoring; Digital Asset Management; Software Configuration Management; Document Process Management

Application Content Management: Metadata Management; Recombination; Personalization

Content Delivery: Edge Network Delivery; Streaming Media Delivery; Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
  Theme keywords: digital line graph, hydrography, transportation
  Title: Dakota Aquifer
  Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
  Direct Spatial Reference Method: Vector
  Horizontal Coordinate System Definition: Universal Transverse Mercator
  ...

UDK Metadata Model
  Search terms: digital line graph, hydrography, transportation
  Topic: Dakota Aquifer
  Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
  Measuring Techniques: Vector
  Co-ordinate System: Universal Transverse Mercator
  ...

Kansas State
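One simple way to reconcile two models that use different tag names for the same data is to map both onto a shared vocabulary. The sketch below does this for the tags shown on the slide. The common field names and the `normalize` helper are assumptions for illustration and are not defined by either FGDC or UDK.

```python
# Tag names are those shown on the slide; the common field names on the
# right-hand side are hypothetical.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Adress Id": "url",
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename a record's tags to the common vocabulary.

    Tags without a mapping are dropped in this sketch.
    """
    return {mapping[tag]: value
            for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
```

After normalization the two records compare equal, so a query written against the common vocabulary can be answered from either source. A pure renaming like this only covers the syntactic mismatch; differences in meaning or granularity would need richer mappings.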

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services; Taalee Content Applications

Where is the content? Whose is it? (Produce/Aggregate, Catalog/Index)

What other content is it related to? (Integrate, Syndicate)

What is the right content for this user? (Personalize)

What is the best way to monetize this interaction? (Interactive Marketing)

Channels: Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taaleersquos Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange
Feed Handlers
Crawlers/Screen Scrapers/Bots
Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
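The layout rule `Date => day month int ',' int` can be approximated with a regular expression. This is a minimal sketch of the syntactic approach only. The pattern, the English day and month name lists, and the `extract_date` helper are assumptions for this example and do not reproduce any particular extractor. Commas are treated as optional so the pattern also tolerates text whose punctuation has been lost.

```python
import re
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November",
               "December"]

# day month int ',' int, for example "Friday, August 1, 1997"
DATE_RE = re.compile(
    r"\b(?P<day>%s),?\s+(?P<month>%s)\s+(?P<dom>\d{1,2})\s*,?\s*(?P<year>\d{4})\b"
    % ("|".join(DAY_NAMES), "|".join(MONTH_NAMES))
)


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    month = MONTH_NAMES.index(match.group("month")) + 1
    return date(int(match.group("year")), month, int(match.group("dom")))
```

A purely syntactic rule like this finds the report date without knowing anything about fires or districts, which is both its strength (it is cheap and domain-independent) and its limit (it cannot tell a report date from any other date in the text).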

Traditional Text Categorization

Customer Article Feed (Article 4715) -> Statistical/AI Techniques (with Customer Training Set) -> Classify/Place in a taxonomy -> Routing/Distribution

Classification of Article 4715

Standard Metadata
  Feed Source: iSyndicate
  Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

Article Feed (Article 4715) -> Knowledge-base & Statistical/AI Techniques (with Taalee Training Set and Customer Training Set) -> Classify/Place in a taxonomy -> Metadata Catalog -> Content Manager -> Routing/Distribution, precise syndication/filtering, map to another taxonomy

Classification of Article 4715

Article 4715 Metadata

Standard metadata
  Feed Source: iSyndicate
  Posted Date: 11/20/2000

Semantic metadata
  Company Name: France Telecom, Equant
  Ticker Symbol: FTE, ENT
  Exchange: NYSE
  Topic: Company News

Automated Content Enrichment (ACE)
  FTE, ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
  NYSE: Member Companies, Market News, IPOs

Taalee Enterprise Customization Suite

HP 49

Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships
Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods/Cluster Analysis

Learning/AI and Collab. Filtering

Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain; Word or Phrase

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

(deeper understanding)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
Attributes
Domain Rules
Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE (shredding, magnetic separation, screening, washing)

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI (several linked by "causes" relationships)

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords
attribute-based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = Content + Semantic Metadata and Relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search.
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage Search on "football touchdown"

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
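The contrast between the two kinds of search can be sketched in a few lines. Keyword search matches words in whatever text happens to accompany an asset. Attribute search matches named metadata fields. The asset records below are transcribed from this slide, while the record layout, field names, and both search functions are assumptions made for illustration, not the Virage or Taalee implementations.

```python
# Asset records transcribed from the slide; the dict layout is hypothetical.
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31, Pit 24",
     "description": ("Quandry Ismail and Tony Banks hook up for their third "
                     "long touchdown, this time on a 76-yarder, to extend "
                     "the Ravens' lead to 31-24 in the third quarter."),
     "metadata": {"league": "Professional",
                  "teams": ["Ravens", "Steelers"],
                  "players": ["Quandry Ismail", "Tony Banks"],
                  "event": "Touchdown",
                  "produced_by": "NFL.com"}},
]


def keyword_search(assets, query):
    """Return titles whose title or description contains every query word."""
    words = query.lower().split()
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if all(word in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Return titles whose metadata matches every field=value criterion."""
    hits = []
    for asset in assets:
        meta = asset["metadata"]
        matched = True
        for field, wanted in criteria.items():
            value = meta.get(field)
            # List-valued fields match if they contain the wanted value.
            matched = wanted in value if isinstance(value, list) else value == wanted
            if not matched:
                break
        if matched:
            hits.append(asset["title"])
    return hits
```

A keyword search for "Steelers" finds nothing, because the word never appears in the asset's text, whereas `attribute_search(ASSETS, teams="Steelers")` finds the game clip through its metadata.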

HP 72

Taaleersquos Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations.

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries amp Hot Topics

PERSONALIZATION

3 Julia Roberts Collection

Movie Trailer Notting Hill

Trailer - Runaway Bride

Patrick

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sunhellip

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1 My Stock Portfolio

Microsoft suffers serious hack attack

Cisco Systems Inc

Analyst Safa Rashtchy on Yahoo

PeopleSoft Inc

ATampT Corp

2 My Football Fantasy Team

Gators Spurrier ready for big game

Techs Vick looks to become complete QB

Bucs excited about Hamilton

Jasper Sanks rumbles into the end zone...

Edwards explains reasons for leaving BYU

more...

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Coles security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II

more...

HOT Topics

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-Shifted Content Aggregator

Web sites and Pages

Content Databases

Personalized Content

Semantic Engine(TM)

Personalized Content

Content

Structured Hi-Quality Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS, DISH, Wink, AOL-TV)

Semantic Engine(TM)

Metadata-Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection

Processing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) - 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREAT USER EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Enhanced XML

Description

"Cisco Systems"

Node

Taalee Semantic Engine

"Cisco Systems"

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-rich, Value-added Node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for

Extractor Agents

Content which does not contain the words the user asked for, but is about what he asked for

Value-added Metadata

Content the user did not think to ask for, but which he needs to know

Semantic Associations

+ +

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
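The Roger Clemens example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Taalee's implementation: the `ASSOCIATIONS` table, the field names, the story title, and the `enrich` and `search` helpers are all hypothetical.

```python
# Hypothetical knowledge base of semantic associations (illustrative only).
ASSOCIATIONS = {
    "Roger Clemens": {"sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"sport": "Football", "team": "Atlanta Falcons", "league": "NFL"},
}


def enrich(metadata):
    """Return a copy of the metadata with fields implied by the named players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        for field, value in ASSOCIATIONS.get(player, {}).items():
            if value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def search(assets, field, value):
    """Return titles of assets whose metadata lists `value` under `field`."""
    return [a["title"] for a in assets if value in a["metadata"].get(field, [])]


# Placeholder story: only the player name was extracted from the text.
story = {"title": "Clemens strikes out ten", "metadata": {"players": ["Roger Clemens"]}}
before = search([story], "team", "New York Yankees")  # extracted metadata only
story["metadata"] = enrich(story["metadata"])
after = search([story], "team", "New York Yankees")   # with value-added metadata
print(before, after)
```

With only the extracted metadata the team search misses the story; after enrichment the same search finds it, which is the behavior the slide describes.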

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield
• Click on the first result (titled "I want out") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of the content provider that produced the asset
Posted Date: date of asset posting; extracted automatically
Player Names: names of players mentioned explicitly in the asset; extracted automatically
Team Name: name of the team for which the player plays; not mentioned explicitly in the asset; value-added using Taalee's semantic relationships
League Name: name of the league to which the player's team belongs; not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations
Sport: name of the sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways the users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
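A rough sketch of how a "competes with" association could drive the related-news links in this example. The triple store, relation names, and placeholder headlines below are assumptions made for illustration; the slides do not describe how Taalee stores its associations.

```python
# Hypothetical world-model facts as (subject, relation, object) triples.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Relations assumed to hold in both directions.
SYMMETRIC = {"competes_with"}


def related(entity, relation):
    """Return entities linked to `entity` by `relation`, sorted by name."""
    found = {o for s, r, o in TRIPLES if s == entity and r == relation}
    if relation in SYMMETRIC:
        found |= {s for s, r, o in TRIPLES if o == entity and r == relation}
    return sorted(found)


def related_news(entity, news_by_company):
    """Collect news about the competitors of `entity`."""
    return {c: news_by_company.get(c, []) for c in related(entity, "competes_with")}


# Placeholder headlines, not real news items.
news = {
    "Ariba": ["Ariba headline (placeholder)"],
    "Commerce One": ["Commerce One headline (placeholder)"],
}
print(related_news("Commerce One", news))
```

A search for Commerce One can then surface Ariba stories even though the user never typed "Ariba".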

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Components: Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application; Third-party Content Mgmt and Syndication; ASP/Enterprise hosted

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard

Story on Lucent

Story on Cisco

XCM-compliant metadata, XML or other format

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary - RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
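The rule above can be read as a predicate over pairs of events. The sketch below is one possible reading, with several assumptions: the `Event` class and function names are hypothetical, the date difference is counted in days and must be non-negative (the earthquake comes after the test), and the distance threshold is taken to be great-circle miles computed with the haversine formula. It is not the InfoQuilt implementation.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; assumed unit for the rule


@dataclass
class Event:
    event_date: date
    latitude: float
    longitude: float


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule: quake within max_days after the test and within max_miles."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles


# Made-up events for illustration only.
test_event = Event(date(2000, 1, 1), 0.0, 0.0)
near_quake = Event(date(2000, 1, 15), 1.0, 1.0)
print(may_have_caused(test_event, near_quake))
```

Note that the rule only flags candidate pairs; it says nothing about actual causation, which is the question the following slides probe.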

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
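The grouping query above could be sketched as follows. The bin boundaries are treated as half-open intervals (so a magnitude of exactly 9 falls in the top group), the input is a list of (year, magnitude) pairs, and the sample records are made up; all of these are assumptions for illustration.

```python
from collections import Counter

# Magnitude ranges from the slide, as half-open intervals [low, high).
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def bin_label(magnitude):
    """Return the range label for a magnitude, or None if it is below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, start_year=1900, end_year=2000):
    """Average number of earthquakes per year in each magnitude range."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if label is not None and start_year <= year <= end_year:
            counts[label] += 1
    n_years = end_year - start_year + 1
    return {label: counts[label] / n_years for _, _, label in BINS}


# Made-up (year, magnitude) records for illustration only.
sample = [(1900, 5.9), (1901, 6.5), (1901, 6.7), (1899, 7.5), (1950, 5.0)]
averages = average_per_year(sample, start_year=1900, end_year=1901)
print(averages)
```

Comparing these averages for periods before and after testing began is the kind of check the slide uses to question the hypothesis.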

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 16

Dublin Core Metadata Initiative

Simple element set designed for resource description

International, inter-discipline, W3C community consensus

"Semantic" interface among resource description communities (very limited form of semantics)

Source: www.desire.org

HP 17

Dublin Core RDF

<xml>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
    DC:Title = "I've Never Metadata I've Never Liked"
    DC:Creator = "Mary Crystal"
    DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content.
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.

HP 20

NewsML...

It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.
It provides an accurate, objective set of description tools which help qualify the information and make the search more precise.
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
  Basic: DC
  Extensions: "Controlled Vocabularies", e.g. "North American Industry Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
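To show how an application might consume such a record, here is a small sketch that reads the Dublin Core fields with Python's standard `xml.etree.ElementTree`. The embedded record is the slide's example with its stripped punctuation restored on a best-effort basis (an assumption), the XML declaration is omitted for simplicity, and `dublin_core_fields` is a hypothetical helper, not part of any PRISM toolkit.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# The slide's record, with punctuation restored as a best guess.
PRISM_RECORD = """
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core_fields(rdf_xml):
    """Map each described resource URL to its Dublin Core fields."""
    root = ET.fromstring(rdf_xml)
    dc_prefix = "{%s}" % NS["dc"]
    records = {}
    for desc in root.findall("rdf:Description", NS):
        about = desc.get("{%s}about" % NS["rdf"])
        fields = {}
        for child in desc:
            if child.tag.startswith(dc_prefix):
                name = child.tag[len(dc_prefix):]
                # Use element text, or the rdf:resource attribute for empty elements.
                value = (child.text or "").strip() or child.get("{%s}resource" % NS["rdf"])
                fields.setdefault(name, []).append(value)
        records[about] = fields
    return records


record = dublin_core_fields(PRISM_RECORD)["http://wanderlust.com/2000/08/Corfu.jpg"]
print(record["title"], record["format"])
```

Because PRISM reuses Dublin Core element names, the same reader would work on any RDF description that uses the `dc` namespace.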

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number find-me services

Intranet - inventory, HR services, corporate portals

Unification - My Whatever: personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible content syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define:

• roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

• format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: a content aggregator and distributor
Subscriber: a content consumer
Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: the current content of a subscription
ICE Package: a delivery of commands to update a collection, such as the addition of content items
ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM: eXtended Content Management

[Diagram: the XCM segments and the functions grouped under them]
Segments: Content Development, Application Content Management, Content Delivery
Functions shown: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management, Metadata Management, Recombination, Personalization, Edge Network Delivery, Streaming Media Delivery, Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models, with different tag names for the same data, in the same GIS domain (example data: Kansas State)

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
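One way to reconcile the two models is a tag crosswalk. The sketch below is a minimal illustration, assuming only the five tag pairs visible on the slide; the function name `fgdc_to_udk` is hypothetical, and a real crosswalk would also have to handle tags with no counterpart and differences in value formats.

```python
# Tag pairs taken from the slide; both models describe the same record.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",  # spelled as it appears on the slide
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to their UDK equivalents.

    Tags without a known mapping are kept unchanged, so no data is dropped.
    """
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
print(fgdc_to_udk(fgdc_record))
```

A fixed renaming table works here only because the two models happen to carry the same values; agreeing on shared meaning for the tags is the harder problem.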

HP 39

Different views of Metadata

Domain Independent: Specifications (RDF), Frameworks/Infrastructures (XCM)

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services and Taalee Content Applications

Where is the content? Whose is it? -> Produce/Aggregate, Catalog/Index

What other content is it related to? -> Integrate, Syndicate

What is the right content for this user? -> Personalize

What is the best way to monetize this interaction? -> Interactive Marketing

Delivery channels: Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

[Diagram: METADATA EXTRACTORS operating over Global/Enterprise/Web Repositories: Digital Maps, Nexis/UPI/AP, Documents, Digital Audios, Data Stores, Digital Videos, Digital Images]

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
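The layout rule `Date => day month int ',' int` can be approximated with a regular expression. The sketch below is one possible reading of the rule, not the extractor used in the workshop; the group names and the function `extract_date` are assumptions, and the commas are made optional so that the pattern also matches text whose punctuation was lost.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# day month int ',' int  ->  e.g. "Friday, August 1, 1997"
DATE_RULE = re.compile(
    rf"(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<day_of_month>\d{{1,2}}),?\s+(?P<year>\d{{4}})"
)


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None."""
    match = DATE_RULE.search(text)
    return match.groupdict() if match else None


report = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_date(report))
```

A purely syntactic rule like this finds the date without knowing anything about fires or Alaska, which is both its strength and its limit.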

Traditional Text Categorization

[Diagram: a customer article feed (Article 4715) is classified with statistical/AI techniques, using a customer training set, and placed in a taxonomy for routing/distribution]

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Knowledge-base & Statistical/AI Techniques

[Diagram: the article is classified and placed in a taxonomy, then passed to the Metadata Catalog and Content Manager for precise syndication/filtering]

Article 4715 Metadata
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News

Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee's Categorization & Automatic Metadata Creation

[Diagram: Article Feed 4715 is classified using a Taalee training set and a customer training set; the classification of Article 4715 (ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis) drives routing/distribution and can be mapped to another taxonomy]

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis, Learning/AI, and Collab Filtering

Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

(arrow label: deeper understanding)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
Attributes
Domain Rules
Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Diagram: three interrelated ontologies]
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; related nodes LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; four "causes" links connect pairs of disaster types

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

Traditional queries based on keywords
Attribute-based queries
Content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps users "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic Metadata

Virage: search on "football touchdown"

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
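The contrast between the two kinds of search can be sketched as follows. This is an illustrative toy, not Virage's or Taalee's implementation: the asset records are abbreviated from the slide, and the function names `keyword_search` and `attribute_search` are assumptions.

```python
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "text": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "text": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "text": "Third long touchdown, a 76-yarder, extends the lead to 31-24",
     "metadata": {"League": "Professional", "Teams": ["Ravens", "Steelers"],
                  "Players": ["Tony Banks"], "Event": "Touchdown",
                  "Produced by": "NFL.com"}},
]


def keyword_search(assets, query):
    """Syntactic search: every query word must occur in the title or text."""
    words = query.lower().split()
    hits = []
    for asset in assets:
        haystack = (asset["title"] + " " + asset["text"]).lower()
        if all(word in haystack for word in words):
            hits.append(asset["title"])
    return hits


def field_matches(metadata, field, wanted):
    """True if the field equals the wanted value or, for lists, contains it."""
    value = metadata.get(field)
    if isinstance(value, list):
        return wanted in value
    return value == wanted


def attribute_search(assets, **criteria):
    """Attribute search: every named metadata field must match."""
    return [asset["title"] for asset in assets
            if all(field_matches(asset["metadata"], field, wanted)
                   for field, wanted in criteria.items())]


print(keyword_search(ASSETS, "Steelers"))                      # no asset text mentions the team
print(attribute_search(ASSETS, Teams="Steelers", Event="Touchdown"))
```

The keyword query for "Steelers" comes back empty because the word never appears in the text, while the attribute query finds the asset through its Teams field.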

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

PERSONALIZATION: Personalized Queries & Hot Topics

Personalized Queries

1 My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…

2 My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…

3 Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…

4 Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1 Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…

2 Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…

3 Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram components: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests and Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless menu with MyStocks, News, Sports, Music, MyMedia]

The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu (MyStocks, News, Sports, Music, MyMedia) and My Stocks list (CSCO, NT, IBM, Market)]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu (MyStocks, News, Sports, Music, MyMedia), My Stocks list (CSCO, NT, IBM, Market), and CSCO content types (Analyst Call, Conf Call, Earnings)]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: same menus, with the CSCO Analysis listing: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.

HP 87

iTV: Taalee's Extreme Personalization

[Diagram components: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine™; Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests and Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Diagram: Network Content, Affiliate Feeds, and Public Sources pass through Collection and Processing (Categorize, Catalog, Integrate) into a Rich Data Metabase, which supports Production Support (Sony) and the functions Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Video: Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

[Screen: list of related video clips, each shown with its duration, the date 12/06/00, and the network (ABC, CBS, FOX, NBC, NNS)]

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram components: Video; MPEG-2/4/7 Encoder; Create Scene Description Tree; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Node = AVO Object; Taalee Semantic Engine; Enhanced XML Description attached to a node ("Cisco Systems"); Metadata-rich, Value-added Node; GREAT USER EXPERIENCE]

Object Content Information (OCI) example
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: Jon Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

HP 93

Metadata for Intelligent Content: Intelligent Metadata Creation and Usage

Content which does contain the words the user asked for (Extractor Agents)

+ Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)

+ Content the user did not think to ask for, but which he needs to know (Semantic Associations)

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
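The idea of adding metadata that is absent from the text can be sketched with a tiny knowledge base. This is a minimal illustration, not Taalee's patented extraction: the dictionary layout and the names `KNOWLEDGE_BASE`, `enrich`, and `found_by` are assumptions, entity recognition is reduced to exact name matching, and the sample story sentence is made up.

```python
# Facts about known entities; the Clemens facts come from the slide.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Team": "New York Yankees"},
}


def enrich(story_text):
    """Extract known player names and add related facts as value-added metadata."""
    players = [name for name in KNOWLEDGE_BASE if name in story_text]
    metadata = {"Players": players}          # extracted from the text itself
    for name in players:
        for field, value in KNOWLEDGE_BASE[name].items():
            values = metadata.setdefault(field, [])
            if value not in values:          # value-added: not in the text
                values.append(value)
    return metadata


def found_by(metadata, query):
    """True if the query matches any metadata value (case-insensitive)."""
    query = query.lower()
    return any(query in value.lower()
               for values in metadata.values() for value in values)


story = "Roger Clemens strikes out ten in a complete-game win"  # hypothetical text
metadata = enrich(story)
print(metadata)
print("Yankees" in story, found_by(metadata, "Yankees"))
```

The story never mentions the Yankees, yet a search over the enriched metadata finds it, which is the behavior the slide describes.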

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story, and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story, and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset: metadata fields

Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations
Sport: name of sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly, fully described in the many ways the users chose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters), read by Extractor Agents 1, 2, and 3

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults KnowledgeBase for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Output: Story on Cisco, Story on Lucent; XCM-compliant metadata, XML or other format

Components: Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication
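The four numbered steps of this flow can be sketched as a lookup against a small relationship store. This is a hedged illustration only: the set `COMPETES_WITH`, the functions `competitors` and `related_stories`, and the story records are all hypothetical, and the sketch treats "competes with" as symmetric, which the slides do not state.

```python
# Competitor pairs named in the slides.
COMPETES_WITH = {("Cisco", "Lucent"), ("Commerce One", "Ariba")}


def competitors(company):
    """Step 2-3: consult the knowledge base and return the company's rivals."""
    rivals = set()
    for a, b in COMPETES_WITH:
        if a == company:
            rivals.add(b)
        elif b == company:
            rivals.add(a)
    return rivals


def related_stories(story, external_feed):
    """Step 4: pick feed stories about competitors of the story's company."""
    rivals = competitors(story["company"])
    return [item for item in external_feed if item["company"] in rivals]


# Step 1: a story arrives from an internal source (placeholder records).
cisco_story = {"company": "Cisco", "headline": "Story on Cisco"}
feed = [
    {"company": "Lucent", "headline": "Story on Lucent"},
    {"company": "IBM", "headline": "Story on IBM"},
]
print(related_stories(cisco_story, feed))
```

Only the Lucent story is selected, because the selection depends on the stored relationship rather than on any keyword shared between the two stories.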

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What You Need to Know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic search, personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough; we need a "logic layer" on top of RDF. Some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
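The rule above can be sketched in Python. This is only an illustration, not the InfoQuilt implementation: the `Event` class, the haversine formula for `distance`, the assumption that the 10000 threshold is in miles, and the extra check that the earthquake does not precede the test are all assumptions made here.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # assumed unit: miles


@dataclass
class Event:
    """Hypothetical record for a nuclear test or an earthquake."""
    event_date: date
    latitude: float
    longitude: float


def distance(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two points."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(min(1.0, a)))


def date_difference(test_date, quake_date):
    """Days from the test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule: the quake follows the test closely in time and space."""
    days = date_difference(test.event_date, quake.event_date)
    close_in_time = 0 <= days < max_days  # "some time after" the test
    close_in_space = distance(test.latitude, test.longitude,
                              quake.latitude, quake.longitude) < max_miles
    return close_in_time and close_in_space
```

Running `may_have_caused` over every (test, earthquake) combination from the two sources yields the candidate pairs that the later query slide asks for.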

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and > 9 on the Richter scale, find the average number of earthquakes per year, starting from 1900

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
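The grouping query can be illustrated with a small Python sketch. The input format (a list of `(year, magnitude)` pairs), the function names, and the half-open treatment of the range boundaries are assumptions for illustration; the slides do not say how boundary values are assigned.

```python
from collections import Counter

# Magnitude ranges from the slide; each range is treated as low <= m < high (an assumption).
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def magnitude_bin(magnitude):
    """Return the label of the range containing the magnitude, or None if below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly count of earthquakes in each magnitude range over the period."""
    years = last_year - first_year + 1
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    return {label: counts[label] / years for label, _, _ in BINS}
```

Comparing the averages for a period before the first tests with a period after them is one way to check the observation on the slide.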

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lies within a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw-Hill, 1998, ISBN 0-07-057735-8


HP 17

Dublin Core RDF

<xml>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
    DC:Title = "I've Never Metadata I've Never Liked"
    DC:Creator = "Mary Crystal"
    DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>
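As a rough illustration of how such a record could be generated programmatically, the following Python sketch builds a Dublin Core description with the standard library. It uses the RDF and Dublin Core namespace URIs that appear in the PRISM example of this workshop rather than the early draft syntax shown on this slide, and the function name `dublin_core_rdf` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_rdf(resource_url, fields):
    """Serialize Dublin Core fields about one resource as RDF/XML text."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_url}
    )
    for name, value in fields.items():
        # Each Dublin Core element (title, creator, subject, ...) becomes a child element.
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = dublin_core_rdf(
    "http://www.mysite.com/mydoc.html",
    {
        "title": "I've Never Metadata I've Never Liked",
        "creator": "Mary Crystal",
        "subject": "Metadata, Dublin Core, Stuff",
    },
)
```

The same three properties from the slide (Title, Creator, Subject) are attached to the same resource URL; only the serialization differs.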

HP 18

MOF (Meta Object Facility) and XMI

MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL

XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

NewsML is a packaging and metadata format for news content

NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries

Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE

HP 20

NewsML…

It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device

An accurate, objective set of description tools which help qualify the information and make the search more precise

NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata

Version 1.0, April 2001

Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)

Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"

Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML

Designed to be used in a simple, straightforward way over the Internet

Compatible with NewsML

Integrates easily with ICE (for syndication)

Vocabulary

Basic DC

Extensions

"Controlled Vocabularies", e.g., "North American Industry Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
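To show how an application might consume such a record, here is a minimal Python sketch that reads the Dublin Core fields out of a PRISM/RDF description using the standard library. The embedded string is a cleaned-up copy of the slide's record (XML declaration omitted), and the function name `dublin_core_fields` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

PRISM_RECORD = """<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core_fields(xml_text):
    """Map each Dublin Core element name to its text or rdf:resource value."""
    root = ET.fromstring(xml_text)
    fields = {}
    for description in root.findall(f"{{{RDF}}}Description"):
        for child in description:
            if child.tag.startswith(f"{{{DC}}}"):
                name = child.tag[len(DC) + 2:]  # strip the "{namespace}" prefix
                value = child.get(f"{{{RDF}}}resource") or (child.text or "").strip()
                fields[name] = value
    return fields


fields = dublin_core_fields(PRISM_RECORD)
```

Because PRISM reuses Dublin Core element names, a generic reader like this works without any PRISM-specific code; PRISM extensions would simply appear under the `prism` namespace.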

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source httpwwwvoicexmlorg

HP 27

Voice XML Metadata

Voice Specific metadata

Supports Syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - News, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number "find-me" services

Intranet - Inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define

roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol

Syndicator: A content aggregator and distributor

Subscriber: A content consumer

Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement

Collection: The current content of a subscription

ICE Package: A delivery of commands to update a collection, such as the addition of content items

ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook, version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)
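The split between ICE's content-independent metadata and the domain-specific content it carries can be sketched in Python. The payload string below is a simplified copy of the slide's example (DOCTYPE dropped and the encoded image replaced by a placeholder), and `split_ice_items` is a hypothetical helper, not part of any ICE toolkit.

```python
import xml.etree.ElementTree as ET

ICE_PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">ENCODED-IMAGE-PLACEHOLDER</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_ice_items(xml_text):
    """Return (envelope, content) pairs for every ice-item in the payload.

    envelope: the ice-item attributes (content-independent ICE metadata).
    content:  one dict per child element, holding its tag and attributes
              (domain-specific metadata defined by the industry vocabulary).
    """
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)
        content = [{"element": child.tag, **child.attrib} for child in item]
        items.append((envelope, content))
    return items


items = split_ice_items(ICE_PAYLOAD)
```

A subscriber can route on the envelope alone, while only applications that understand the comic-strip vocabulary need to look at the content part.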

HP 36

XCM (eXtended Content Management)

a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are

Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM

eXtended Content Management

Content DevelopmentManagement

Content DeliveryApplication ContentManagement

Content AuthoringDigital Asset Management

Software ConfigurationManagement

Document ProcessManagement

Metadata ManagementRecombinationPersonalization

Edge Network Delivery

Streaming Media DeliveryCaching

Source httpwwwvignettecom

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model

Theme keywords: digital line graph, hydrography, transportation

Title: Dakota Aquifer

Online linkage: http://gisdasc.kgs.ukans.edu/dasc/

Direct Spatial Reference Method: Vector

Horizontal Coordinate System Definition: Universal Transverse Mercator

…

UDK Metadata Model

Search terms: digital line graph, hydrography, transportation

Topic: Dakota Aquifer

Address Id: http://gisdasc.kgs.ukans.edu/dasc/

Measuring Techniques: Vector

Co-ordinate System: Universal Transverse Mercator

…

Kansas State
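One simple way to bridge the two models is a tag-to-tag mapping table. The Python sketch below uses only the five correspondences visible on the slide; the dictionary and function names are hypothetical, and a real integration would need the full FGDC and UDK element sets and handling for elements with no counterpart.

```python
# Correspondences read off the slide: UDK tag -> FGDC tag.
UDK_TO_FGDC = {
    "Search terms": "Theme keywords",
    "Topic": "Title",
    "Address Id": "Online linkage",
    "Measuring Techniques": "Direct Spatial Reference Method",
    "Co-ordinate System": "Horizontal Coordinate System Definition",
}


def udk_to_fgdc(record):
    """Rename UDK tags to their FGDC equivalents; unknown tags are kept unchanged."""
    return {UDK_TO_FGDC.get(tag, tag): value for tag, value in record.items()}


udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
fgdc_record = udk_to_fgdc(udk_record)
```

A renaming table handles differences in tag names only; differences in meaning or granularity are what the later ontology slides address.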

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services Taalee Content Applications

Where is the content

Whose is it

ProduceAggregate

CatalogIndex

What other content is it related to

Integrate Syndicate

What is the right content for this

user

Personalize

What is the best way to

monetize this interaction

Interactive Marketing

BroadcastWirelineWirelessInteractive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories

Types: Text, Graphics, Audio, Video, Multimedia

Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic

Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure, Exchange, Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

GlobalEnterpriseWeb Repositories

METADATAMETADATA

EXTRACTORSEXTRACTORS

Digital Maps

NexisUPIAP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
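A layout rule of this kind translates naturally into a regular expression. The Python sketch below is one possible reading of the rule `Date => day month int ',' int` (day name, month name, day of month, comma, year); the names are hypothetical, and the commas are made optional because extracted text often loses punctuation.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# day month int ',' int  ->  e.g. "Friday, August 1, 1997"
DATE_RULE = re.compile(
    r"\b(?P<day>" + "|".join(DAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+"
    r"(?P<year>\d{4})\b"
)


def extract_report_date(text):
    """Return the first date matching the layout rule, or None if none is found."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match.group("month")) + 1
    return date(int(match.group("year")), month_number, int(match.group("dom")))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
report_date = extract_report_date(header)
```

Such rules are purely syntactic: they find the date because of its position and shape in the report layout, not because the extractor understands what the report is about.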

Traditional TextCategorization

StatisticalAI Techniques

Classify Place ina taxonomy

feed

Customer Training

Set

RoutingDistribution

Customer Article Feed

4715

Standard Metadata

Feed Source iSyndicate

Posted Date 11202000

Classification of Article 4715

Knowledge-base amp StatisticalAI Techniques

ClassifyPlace ina taxonomy

MetadataCatalog

Content Manager

Precise syndicationfiltering

Article 4715 Metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Company Name: France Telecom, Equant

Ticker Symbol: FTE, ENT

Exchange: NYSE

Topic: Company News

Standard metadata

Semantic metadata

FTECompany AnalysisConference Calls

EarningsStock Analysis

NYSEMember Companies

Market NewsIPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee's Categorization & Automatic Metadata Creation

Taalee Training

Set

Customer Training

Set ee ENTCompany AnalysisConference Calls

EarningsStock Analysis

Classification of Article 4715

Article Feed4715 RoutingDistribution

Map to another taxonomy

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segmentwith Associated Text

Segment Description

SemanticMetadata

AutoCategorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video withEditorialized Text on the Web

AutoCategorization

AutoCategorization

Semantic MetadataSemantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

AutoCategorization

AutoCategorization

Semantic MetadataSemantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference

(Statistical) (Information Retrieval)

Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)

Logic Based (Description Logic)

Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methodsCluster Analysis

LearningAI and Collab Filtering

Reference dataConcept-termsDictionaryThesaurusBy topicindustrysubjectdomain

Word or Phrase

OntologiesDomain Models

KnowledgeBaseBy Entities and Relationships

deeperunderstanding

HP 56

Open Directory Project (ODP) Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes

Capture the semantics involved via domain characteristics

Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example Interrelated ontologies

LANDUSE

COMMERCIAL

INDUSTRIALRURAL

RESIDENTIAL

AGRICULTURAL

MILITARYRECREATIONAL

LAND(SITE)

CULTIVATEDAREA

GREENLANDAREA LAND

BANK

ZONING

LANDFILLSITE

WASTEDISPOSAL

RECYCLING

HAZARDOUS

LANDFILLRESOURCE REC

SOLID SEWAGE

shredding

magneticseparation

screening

washing

NATURALDISASTER

EARTHQUAKE

causes

LANDSLIDE

VOLCANO

STORMFLOOD

FIRE

AVALANCHE

TSUNAMI

causes

causes

causes

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords

attribute-based queries

content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based; the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction

Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = Content + Metadata

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create

Context of the Content

Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDISUGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage Search on football touchdown

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional

Teams: Ravens, Steelers

Score: Bal 31, Pit 24

Players: Quandry Ismail, Tony Banks

Event: Touchdown

Produced by: NFL.com

Posted date: 2/02/2000

HP 72

Taalee's Semantic Search

Highly customizable precise and freshest AV search

Context and Domain Specific Attributes Uniform Metadata for Content from Multiple Sources Can be sorted by any field

Delightful relevant informationexceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content

Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

3 Julia Roberts Collection

Movie Trailer Notting Hill

Trailer - Runaway Bride

Patrick

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sun…

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1 My Stock Portfolio

Microsoft suffers serious hack attack

Cisco Systems Inc

Analyst Safa Rashtchy on Yahoo

PeopleSoft Inc

AT&T Corp

2 My Football Fantasy Team

Gators Spurrier ready for big game

Techs Vick looks to become complete QB

Bucs excited about Hamilton

Jasper Sanks rumbles into the end zone…

Edwards explains reasons for leaving BYU

more…

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Coles security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II

more…

HOT Topics

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos

Buy Russell Crowe Videos

Buy Christopher Plummer Videos

Buy Diane Venora Videos

Buy Philip Baker Hall Videos

Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic, personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne

11/07 ON24 H&Q

11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection Processing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90

t

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) -- Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREAT USER EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Enhanced XML

Description

"Cisco Systems"

Node

Taalee Semantic Engine

"Cisco Systems"

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-rich, Value-added Node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for (Extractor Agents)

+

Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)

+

Content the user did not think to ask for, but which he needs to know (Semantic Associations)

Metadata for Intelligent Content Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
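The idea of adding missing metadata from known associations can be illustrated with a minimal sketch. This is not Taalee's implementation: the `KNOWLEDGE_BASE` dictionary, the field names, and the `enrich` and `matches` helpers are all hypothetical, and the only facts used are the ones stated on the slide (Roger Clemens, Baseball, New York Yankees).

```python
from typing import Dict, List

# Hypothetical knowledge base: entity -> facts known about it.
# Only the associations stated on the slide are encoded here.
KNOWLEDGE_BASE: Dict[str, Dict[str, str]] = {
    "Roger Clemens": {
        "type": "PERSON",
        "Sport": "Baseball",
        "Teams": "New York Yankees",
    },
}


def enrich(metadata: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return a copy of the metadata with value-added fields filled in."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for field in ("Sport", "Teams"):
            value = facts.get(field)
            # Add the associated value only if it is known and not yet present.
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata: Dict[str, List[str]], query: str) -> bool:
    """Case-insensitive substring search over all metadata values."""
    q = query.lower()
    return any(q in value.lower()
               for values in metadata.values()
               for value in values)


story = {"Players": ["Roger Clemens"]}  # extracted from the text itself
tagged = enrich(story)                  # value-added metadata attached
```

With this sketch, a search for "Yankees" misses `story` but finds `tagged`, which is the behavior the slide describes.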

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

• Producer Name: name of the content provider that produced the asset
• Posted Date: date of asset posting; extracted automatically
• Player Names: names of players mentioned explicitly in the asset; extracted automatically
• Team Name: name of the team for which the player plays; not mentioned explicitly in the asset; value-added using Taalee's semantic relationships
• League Name: name of the league to which the player's team belongs; not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations
• Sport: name of the sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
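A small sketch can make the difference between keyword lookup and association-aware lookup concrete. Everything below is illustrative and assumed: the triple store, the relation names, the story titles in `CATALOG`, and the function names are hypothetical, and only the Commerce One associations come from the slide.

```python
from typing import Dict, List, Tuple

# Hypothetical knowledge base of (subject, relation, object) triples,
# encoding only the associations named on the slide.
TRIPLES: List[Tuple[str, str, str]] = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog: story title -> companies tagged in its metadata.
CATALOG: Dict[str, List[str]] = {
    "Commerce One signs new marketplace deal": ["Commerce One"],
    "Ariba reports quarterly results": ["Ariba"],
    "Weather outlook for the weekend": [],
}


def objects(subject: str, relation: str) -> List[str]:
    """All objects linked to the subject through the given relation."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]


def search(company: str) -> List[str]:
    """Plain lookup: stories whose metadata names the company."""
    return [title for title, companies in CATALOG.items()
            if company in companies]


def search_with_associations(company: str) -> Dict[str, object]:
    """What the user asked for, plus news on the company's competitors."""
    return {
        "asked_for": search(company),
        "competitor_news": {rival: search(rival)
                            for rival in objects(company, "competes_with")},
    }


result = search_with_associations("Commerce One")
```

Note that `competes_with` is stored in one direction only in this sketch; a real knowledge base would need to decide whether such relations are symmetric.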

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Internal Source 1: Research

Internal Source 2

External feeds / Web (e.g., Reuters)

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata, XML or other format

Semantic Application

ASP / Enterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata-centric Content Management Architecture

Semantic Engine

World Model

Taalee Metabase

Third-party Content Mgmt and Syndication

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What you asked for + What you need to know

COMPANY
• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
• Technology, Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic SearchPersonalization etc

A vision for future

Semantic Web Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition of the Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
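The rule above can be read as an executable predicate. The following is a sketch under stated assumptions, not the InfoQuilt implementation: the `Event` type and function names are hypothetical, distance is computed with the haversine great-circle formula, the threshold is taken to be in miles (as a later slide states), and the "occurred after" condition from the prose is enforced by requiring a non-negative day difference.

```python
import math
from datetime import date
from typing import NamedTuple

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


class Event(NamedTuple):
    """Hypothetical record shared by nuclear tests and earthquakes."""
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test: Event, quake: Event) -> int:
    """Days from the test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test: Event, quake: Event,
                    max_days: int = 30, max_miles: float = 10000) -> bool:
    """NuclearTest Causes Earthquake, as stated in the rule on the slide."""
    days = date_difference(test, quake)
    near = distance(test.latitude, test.longitude,
                    quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


# Hypothetical events, for illustration only.
test = Event(date(1998, 5, 11), 27.0, 72.0)
quake_soon = Event(date(1998, 5, 21), 30.0, 70.0)
quake_late = Event(date(1998, 7, 1), 30.0, 70.0)
quake_before = Event(date(1998, 5, 1), 30.0, 70.0)
```

Since half the Earth's circumference is roughly 12,400 miles, a 10,000-mile radius excludes very little; the rule is a candidate generator rather than evidence of causation.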

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
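The grouping query above could be sketched as follows. The sample data is invented for illustration, the function names are hypothetical, and the handling of bin boundaries (each range includes its lower bound, and the last range starts at exactly 9) is an assumption the slide does not specify.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Magnitude ranges from the slide as (low, high, label); low is inclusive
# and high is exclusive, which is an assumption about the boundaries.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude: float) -> Optional[str]:
    """Label of the range containing the magnitude, or None if below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes: Iterable[Tuple[int, float]],
                    start_year: int = 1900) -> Counter:
    """Count earthquakes per (year, magnitude range) from start_year on."""
    counts: Counter = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts: Counter, label: str,
                     first_year: int, last_year: int) -> float:
    """Average yearly count for one magnitude range over a span of years."""
    years = range(first_year, last_year + 1)
    return sum(counts[(year, label)] for year in years) / len(years)


# Invented (year, magnitude) pairs, for illustration only.
sample = [(1899, 7.5), (1950, 5.9), (1950, 6.5),
          (1950, 6.9), (1951, 7.2), (1951, 5.0)]
yearly = counts_per_year(sample)
```

Comparing `average_per_year` for a range before and after a chosen year is one way to reproduce the kind of comparison the slide draws.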

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw Hill, ISBN 0-07-057735-8, 1998


HP 18

MOF (Meta Object Facility) and XMI

• MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL

• XML Metadata Interchange (XMI) is an extension of the MOF into the XML space

HP 19

NewsML

• NewsML is a packaging and metadata format for news content
• NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries
• Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE

HP 20

NewsML...

• It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device
• Accurate, objective set of description tools which help qualify the information and make the search more precise
• NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.

Source: http://www.mediabricks.com

HP 22

PRISM

• Publishing Requirements for Industry Standard Metadata
• Version 1.0, April 2001
• Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
• Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
• Web site: http://www.prismstandard.org

HP 23

PRISM Design

• Built on existing standards like Dublin Core (DC), RDF, XML
• Designed to be used in a simple, straightforward way over the Internet
• Compatible with NewsML
• Integrates easily with ICE (for syndication)
• Vocabulary:
  - Basic DC
  - Extensions
  - "Controlled Vocabularies", e.g., "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
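To show how an application might consume such a description, here is a minimal sketch that reads the Dublin Core elements out of the RDF/XML with Python's standard `xml.etree.ElementTree`. The `dublin_core` function is a hypothetical helper, not part of PRISM or any PRISM toolkit, and the embedded document is a reconstruction of the slide's example with the XML declaration omitted for simplicity. A real system would more likely use a full RDF parser.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Reconstruction of the slide's example (XML declaration omitted).
PRISM_RDF = """<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core(rdf_xml: str) -> Dict[str, Dict[str, List[str]]]:
    """Map each described resource to its Dublin Core fields."""
    about_attr = "{%s}about" % NS["rdf"]
    resource_attr = "{%s}resource" % NS["rdf"]
    dc_prefix = "{%s}" % NS["dc"]
    records: Dict[str, Dict[str, List[str]]] = {}
    root = ET.fromstring(rdf_xml)
    for description in root.findall("rdf:Description", NS):
        fields: Dict[str, List[str]] = {}
        for child in description:
            if not child.tag.startswith(dc_prefix):
                continue  # skip non-Dublin-Core elements
            name = child.tag[len(dc_prefix):]
            # A value is either element text or an rdf:resource reference.
            value = (child.text or "").strip() or child.get(resource_attr, "")
            fields.setdefault(name, []).append(value)
        records[description.get(about_attr, "")] = fields
    return records


record = dublin_core(PRISM_RDF)["http://wanderlust.com/2000/08/Corfu.jpg"]
```

Each field maps to a list because Dublin Core elements are repeatable (for example, several contributors).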

HP 25

VoiceXML

• A language for specifying voice dialogs
• Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
• Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
• High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source httpwwwvoicexmlorg

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

• Information retrieval: news, sports, traffic, stock quotes
• e-Transactions (e-commerce, e-tailing, etc.)
  - Financial: banking, stock trading
  - Catalog browsing (generally as an adjunct to paper)
• Telephone services
  - Personal voice dialing, one-number "find-me" services
• Intranet: inventory, HR services, corporate portals
• Unification: "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

• A set of description schemes and descriptors to describe the content of multimedia data

• Provides a language to specify description schemes

• A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible content syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

• Syndication protocol for communication between Syndicators and Subscribers

• Metadata to define:
  - roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
  - format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

• ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

• Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

• ICE: Information and Content Exchange protocol
• Syndicator: a content aggregator and distributor
• Subscriber: a content consumer
• Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
• Collection: the current content of a subscription
• ICE Package: a delivery of commands to update a collection, such as the addition of content items
• ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

• Content Development: developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

• Application Content Management (Vignette): deploying content dynamically to a Web site and managing that content throughout its online lifecycle

• Content Delivery: delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

Content DevelopmentManagement

Content DeliveryApplication ContentManagement

Content AuthoringDigital Asset Management

Software ConfigurationManagement

Document ProcessManagement

Metadata ManagementRecombinationPersonalization

Edge Network Delivery

Streaming Media DeliveryCaching

Source httpwwwvignettecom

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...

Kansas State
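One simple way to reconcile two such models is to map each model's tag names onto a shared vocabulary. The sketch below is illustrative only: the common field names (`keywords`, `title`, and so on) and the `normalize` helper are assumptions, and the tag pairings follow the side-by-side layout of the slide rather than any official FGDC-to-UDK crosswalk.

```python
from typing import Dict

# Tag name in each model -> hypothetical shared field name.
FGDC_TO_COMMON: Dict[str, str] = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}

UDK_TO_COMMON: Dict[str, str] = {
    "Search terms": "keywords",
    "Topic": "title",
    "Address Id": "url",
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record: Dict[str, str], mapping: Dict[str, str]) -> Dict[str, str]:
    """Rename a record's tags to the shared vocabulary; drop unmapped tags."""
    return {mapping[tag]: value
            for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": "digital line graph, hydrography, transportation",
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

udk_record = {
    "Search terms": "digital line graph, hydrography, transportation",
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

fgdc_common = normalize(fgdc_record, FGDC_TO_COMMON)
udk_common = normalize(udk_record, UDK_TO_COMMON)
```

A renaming table like this only addresses differing tag names; differences in value formats or in meaning would need the ontology-based approaches discussed later in the workshop.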

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

FrameworksInfrastructures (XCM)

MetadataApplication Specific

ICE

Media Specific

MPEG7 VoiceXML

Domain Specific

NewsML FGDCUDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services Taalee Content Applications

Where is the content

Whose is it

ProduceAggregate

CatalogIndex

What other content is it related to

Integrate Syndicate

What is the right content for this

user

Personalize

What is the best way to

monetize this interaction

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

• Sources: Web sites, content feeds, and private repositories
• Types: text, graphics, audio, video, multimedia
• Forms: unstructured text, semi-structured text, structured text (+ media); static or dynamic
• Ingest: feed (push), Web (pull), repository/database (usually pull)

HP 44

Content Handling/Ingest

• Infrastructure/Exchange
• Feed Handlers
• Crawlers
• Screen Scrapers
• Bots
• Software Agents: Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic Approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35 contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
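A layout rule of this kind translates naturally into a regular expression. The sketch below is one possible reading of the rule, not the extractor used in the workshop: the pattern, the `extract_date` function, and the decision to treat commas as optional (the transcript of the report has lost its punctuation) are all assumptions.

```python
import re
from datetime import date
from typing import Optional

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

DAY_ALT = "|".join(DAYS)
MONTH_ALT = "|".join(MONTHS)

# Date => day month int ',' int   (commas treated as optional here)
DATE_RE = re.compile(
    rf"\b(?P<day>{DAY_ALT}),?\s+(?P<month>{MONTH_ALT})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text: str) -> Optional[date]:
    """Return the first date matching the layout rule, or None."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match["month"]) + 1
    return date(int(match["year"]), month_number, int(match["dom"]))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1 1997 - 0530 MDT"
report_date = extract_date(header)
```

This is purely syntactic: the pattern finds text shaped like a date, but it does not check that the weekday agrees with the calendar date or say what the date refers to.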

Traditional Text Categorization

Statistical/AI Techniques

Classify: place in a taxonomy

Feed

Customer Training Set

Routing/Distribution

Customer Article Feed: Article 4715

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Classification of Article 4715

Knowledge-base & Statistical/AI Techniques

Classify: place in a taxonomy

Metadata Catalog

Content Manager

Precise syndication/filtering

Article 4715 Metadata

Standard metadata
• Feed Source: iSyndicate
• Posted Date: 11/20/2000

Semantic metadata
• Company Name: France Telecom, Equant
• Ticker Symbol: FTE, ENT
• Exchange: NYSE
• Topic: Company News

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee's Categorization & Automatic Metadata Creation

Taalee Training Set

Customer Training Set

ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis

Classification of Article 4715

Article Feed: Article 4715

Routing/Distribution

Map to another taxonomy

HP 49

Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts: Concepts/Terms/Entities; Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories); Ontology/Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM…)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods, Cluster Analysis

Learning/AI and Collab. Filtering

Reference data/Concept-terms: Dictionary/Thesaurus, by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example Interrelated ontologies

[Figure: interrelated ontologies. Labels shown:
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked by "causes" relationships]

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords
attribute-based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering; link and other analysis (e.g. Google's Link Flux analysis); classification as context; ontologies; metadata; knowledgebases; …

HP 70

Metadata Usage: Keyword, Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Virage: Search on "football touchdown"

Jimmy Smith Interview Part Seven
Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline

Brian Griese Interview Part Four
Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
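The contrast between keyword search and attribute search on this slide can be illustrated with a small sketch. The asset records below are transcribed loosely from the slide; the function names `keyword_search` and `attribute_search` and the dictionary layout are assumptions made for the example, not Virage's or Taalee's actual interfaces.

```python
assets = [
    {
        "title": "Baltimore 31 Pit 24",
        "text": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
            "Produced by": "NFL.com",
        },
    },
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no domain-specific attributes
    },
]


def keyword_search(items, keyword):
    """Syntactic match: the keyword must literally occur in title or text."""
    kw = keyword.lower()
    return [a["title"] for a in items if kw in (a["title"] + " " + a["text"]).lower()]


def attribute_search(items, criteria):
    """Match on metadata attributes; list-valued attributes match any member."""
    def matches(meta):
        for attr, wanted in criteria.items():
            value = meta.get(attr)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True

    return [a["title"] for a in items if matches(a["metadata"])]


print(keyword_search(assets, "touchdown"))
print(attribute_search(assets, {"Event": "Touchdown", "Teams": "Ravens"}))
```

The keyword query returns both assets, including an interview that merely mentions a touchdown, while the attribute query returns only the asset whose event actually is a touchdown by the named team.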

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries amp Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…

2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…

3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory; more…

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure labels: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests, Preferences; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS, DISH, Wink, AOL-TV)

Semantic Engine(TM)

Metadata-Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support (Sony)

Categorize

Catalog

Integrate

Collection, Processing

Network Content

Affiliate Feeds

Public Sources, Rich Data

Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Figure labels: Video; MPEG Encoder; MPEG-2/4/7; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

Node: "Cisco Systems"

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

HP 93

Intelligent Metadata Creation

Extractor Agents: Content which does contain the words the user asked for

+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: Content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
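One way to picture how semantic associations add missing metadata is a small knowledge base of facts that is walked from the entities found in the content. The sketch below is an assumption-laden illustration: the triples, relationship names, field names and the `enrich` function are all hypothetical and do not describe Taalee's patented extraction technology.

```python
# Hypothetical knowledge base of (subject, relationship, object) facts.
KNOWLEDGE_BASE = [
    ("Roger Clemens", "plays_sport", "Baseball"),
    ("Roger Clemens", "plays_for", "New York Yankees"),
    ("Jamal Anderson", "plays_for", "Atlanta Falcons"),
    ("Atlanta Falcons", "member_of", "NFL"),
]

# Which metadata field each relationship fills in (assumed mapping).
FIELD_FOR_RELATION = {"plays_sport": "Sport", "plays_for": "Team", "member_of": "League"}


def enrich(metadata):
    """Return a copy of metadata with value-added fields inferred from the KB."""
    enriched = {k: list(v) for k, v in metadata.items()}
    frontier = list(metadata.get("Players", []))
    seen = set(frontier)
    while frontier:
        entity = frontier.pop()
        for subj, rel, obj in KNOWLEDGE_BASE:
            if subj != entity:
                continue
            values = enriched.setdefault(FIELD_FOR_RELATION[rel], [])
            if obj not in values:
                values.append(obj)
            if obj not in seen:  # follow chains such as player -> team -> league
                seen.add(obj)
                frontier.append(obj)
    return enriched


def matches(metadata, term):
    """True if the term appears as a value of any metadata field."""
    return any(term in values for values in metadata.values())


story = {"Players": ["Roger Clemens"]}  # extracted explicitly from the content
print(matches(story, "New York Yankees"))          # keyword is not in the content
print(matches(enrich(story), "New York Yankees"))  # found via value-added metadata
```

The story itself never mentions the Yankees, so the first lookup fails; after enrichment the team is part of the asset's metadata and the same lookup succeeds.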

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Posted Date: date of asset posting - extracted automatically
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: name of sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)

Extractor Agent 1, Extractor Agent 2, Extractor Agent 3

Semantic Engine, World Model, Taalee Metabase

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Story on Cisco; Story on Lucent

XCM-compliant metadata, XML or other format

Semantic Application (ASP/Enterprise hosted)

Third-party Content Mgmt and Syndication
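The four-step Cisco/Lucent flow on this slide can be sketched as a lookup followed by a filter. The `COMPETITORS` table and the `related_stories` function are hypothetical stand-ins for the knowledge base and the Semantic Engine; only the company names come from the slides.

```python
# Hypothetical knowledge-base extract: company -> known competitors.
COMPETITORS = {"Cisco": ["Lucent"], "Commerce One": ["Ariba"]}


def related_stories(story, external_feed):
    """Pick external stories about competitors of the company in `story`."""
    rivals = set(COMPETITORS.get(story["company"], []))  # steps 2 and 3
    return [s for s in external_feed if s["company"] in rivals]  # step 4


internal_story = {"company": "Cisco", "headline": "Story on Cisco"}  # step 1
feed = [
    {"company": "Lucent", "headline": "Story on Lucent"},
    {"company": "IBM", "headline": "Story on IBM"},
]

dashboard = [internal_story] + related_stories(internal_story, feed)
print([s["headline"] for s in dashboard])
```

The Lucent story reaches the dashboard even though nobody asked for Lucent; it is selected purely because the knowledge base records a competes-with relationship.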

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology/Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough
we need a "logic layer" on top of RDF
some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web - the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source httpwwwzdnetcompcweekstoriesjumps04270243294600html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary - RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude ) < 10000
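The rule above can be read as an executable predicate. The following is a minimal sketch under stated assumptions: dates are compared in days, the earthquake must not precede the test, distance is great-circle distance in miles computed with the haversine formula, and the record layout and function names are invented for the example. None of this is taken from the InfoQuilt implementation.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the 'NuclearTest Causes Earthquake' rule to one pair of events."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days_after < max_days:  # the quake must follow the test
        return False
    miles = distance_miles(
        test["latitude"], test["longitude"], quake["latitude"], quake["longitude"]
    )
    return miles < max_miles


# Hypothetical events, for illustration only.
test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
near_quake = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
late_quake = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}

print(may_have_caused(test, near_quake))
print(may_have_caused(test, late_quake))
```

Since no two points on Earth are more than roughly 12,400 miles apart along a great circle, a 10000-mile threshold excludes very little; a smaller value would be needed to express "nearby region" strictly.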

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
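The grouped query above amounts to binning earthquakes by magnitude and averaging the counts over the years in the range. A minimal sketch follows; the record layout, the sample data and the handling of bin boundaries (lower bound inclusive, so a magnitude of exactly 9 falls in the top group) are assumptions made for illustration.

```python
from collections import Counter

# (label, lower bound inclusive, upper bound exclusive), following the slide's ranges.
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def bin_label(magnitude):
    """Return the magnitude group for an earthquake, or None if below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, start_year, end_year):
    """Average number of earthquakes per year in each magnitude group."""
    counts = Counter()
    for q in quakes:
        label = bin_label(q["magnitude"])
        if label is not None and start_year <= q["year"] <= end_year:
            counts[label] += 1
    n_years = end_year - start_year + 1
    return {label: counts[label] / n_years for label, _, _ in BINS}


# Hypothetical records, for illustration only.
sample = [
    {"year": 1900, "magnitude": 5.9},
    {"year": 1901, "magnitude": 6.5},
    {"year": 1901, "magnitude": 6.1},
    {"year": 1899, "magnitude": 7.2},  # before the start year, so ignored
]
print(average_per_year(sample, 1900, 1901))
```

Running the same function over the periods before and after a chosen year would support the kind of comparison the slide draws between magnitude groups.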

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 19

NewsML

NewsML is a packaging and metadata format for news content
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE

HP 20

NewsML…

• It can be used by news providers to combine their pictures, video, text, graphics and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television and any other device.
• It offers an accurate, objective set of description tools that help qualify the information and make search more precise.
• NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata
• Version 1.0, April 2001
• Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
• Idea: "a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts"
• Web site: http://www.prismstandard.org

HP 23

PRISM Design

• Built on existing standards like Dublin Core (DC), RDF, XML
• Designed to be used in a simple, straightforward way over the Internet
• Compatible with NewsML
• Integrates easily with ICE (for syndication)
• Vocabulary:
  • Basic: DC
  • Extensions: "Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
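To illustrate how an application might consume a PRISM description like the one on this slide, here is a minimal sketch in Python that pulls the Dublin Core fields out of an RDF/XML record using the standard library. The function name `dublin_core_fields` and the shortened sample record are illustrative assumptions, and the punctuation in the sample URLs is restored from the slide text; this is not part of the PRISM specification.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF and Dublin Core elements.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Shortened version of the slide's record (assumed punctuation restored).
PRISM_RECORD = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core_fields(xml_text):
    """Return {resource URI: {DC field name: text}} for each rdf:Description."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    records = {}
    for desc in root.findall(f"{{{RDF}}}Description"):
        about = desc.get(f"{{{RDF}}}about")
        fields = {}
        for child in desc:
            # ElementTree tags look like "{namespace}localname".
            if child.tag.startswith(f"{{{DC}}}"):
                name = child.tag[len(DC) + 2:]
                fields[name] = (child.text or "").strip()
        records[about] = fields
    return records


if __name__ == "__main__":
    for resource, fields in dublin_core_fields(PRISM_RECORD).items():
        print(resource, fields)
```

Because PRISM builds on Dublin Core and RDF, a generic RDF/XML reader of this kind can already recover the basic descriptive fields without any PRISM-specific code.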

HP 25

VoiceXML

• A language for specifying voice dialogs
• Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
• Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
• High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML – Possible Services

• Information retrieval – news, sports, traffic, stock quotes
• e-Transactions (e-commerce, e-tailing, etc.)
  • Financial: banking, stock trading
  • Catalog browsing (generally as an adjunct to paper)
• Telephone services
  • Personal voice dialing, one-number "find-me" services
• Intranet – inventory, HR services, corporate portals
• Unification – "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

• A set of description schemes and descriptors to describe the content of multimedia data
• A language to specify description schemes
• A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are:

• Digital libraries (image catalog, musical dictionary)
• Multimedia directory services (e.g. yellow pages)
• Broadcast media selection (radio channel, TV channel)

HP 31

Information and Content Exchange (ICE)

• Main goal: efficient and extensible content syndication protocol for the Internet, using XML syntax
• Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
• Status: latest spec version 1.1, May 2000; submitted to W3C for review
• Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
• Web site: http://www.icestandard.org

HP 32

What is the ICE Protocol

• Syndication protocol for communication between Syndicators and Subscribers
• Metadata to define:
  • roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
  • format and method of content exchange (e.g. sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

• ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
• Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

• ICE: Information and Content Exchange protocol.
• Syndicator: A content aggregator and distributor.
• Subscriber: A content consumer.
• Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement.
• Collection: The current content of a subscription.
• ICE Package: A delivery of commands to update a collection, such as the addition of content items.
• ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information.

Sources: InternetWeek; ICE Cookbook version 1.0, http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

The comic-strip element and its attributes are the content (domain-specific metadata); the surrounding ice-* elements carry the content-independent protocol metadata.
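The split between content-independent ICE metadata and domain-specific content metadata can be sketched in a few lines of Python. The sample payload is a trimmed version of the slide's example with punctuation restored by assumption (the DOCTYPE is omitted), and `split_metadata` is a hypothetical helper name, not part of any ICE toolkit.

```python
import xml.etree.ElementTree as ET

# Trimmed, assumed reconstruction of the slide's payload (DOCTYPE omitted).
PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">PdXIWZQ8</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """Separate protocol metadata (ice-* elements) from domain metadata."""
    root = ET.fromstring(payload_xml)
    protocol, domain = {}, []
    for elem in root.iter():
        if elem.tag.startswith("ice-"):
            # Content-independent: attributes of the ICE envelope elements.
            protocol[elem.tag] = dict(elem.attrib)
        else:
            # Domain-specific: attributes of the carried content element.
            domain.append({"element": elem.tag, **elem.attrib})
    return protocol, domain


if __name__ == "__main__":
    protocol, domain = split_metadata(PAYLOAD)
    print("protocol:", protocol)
    print("domain:  ", domain)
```

A subscriber could route on the protocol part alone, while an application that understands the comic-strip vocabulary would read the domain part.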

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

• Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage and retrieval
• Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
• Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM

eXtended Content Management (diagram)

• Content Development / Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management
• Application Content Management: Metadata Management, Recombination, Personalization
• Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
• Theme keywords: digital line graph, hydrography, transportation
• Title: Dakota Aquifer
• Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
• Direct Spatial Reference Method: Vector
• Horizontal Coordinate System Definition: Universal Transverse Mercator
• …

UDK Metadata Model
• Search terms: digital line graph, hydrography, transportation
• Topic: Dakota Aquifer
• Address Id: http://gisdasc.kgs.ukans.edu/dasc/
• Measuring Techniques: Vector
• Co-ordinate System: Universal Transverse Mercator
• …

(Example data: Kansas State)
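The correspondence between the two models on this slide can be written down as a simple tag mapping. The following Python sketch is only an illustration under that assumption: the dictionary covers just the five tag pairs shown on the slide, and `fgdc_to_udk` is a hypothetical helper, not part of either standard.

```python
# Tag pairs taken from the slide: FGDC tag name -> UDK tag name.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to their UDK equivalents; unknown tags pass through."""
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


if __name__ == "__main__":
    fgdc_record = {
        "Title": "Dakota Aquifer",
        "Direct Spatial Reference Method": "Vector",
        "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
    }
    print(fgdc_to_udk(fgdc_record))
```

A flat rename like this handles only syntactic differences in tag names; where the two models differ in meaning or granularity, a shared domain model or ontology is needed, which is the motivation for the semantic approaches discussed later in the workshop.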

HP 39

Different views of Metadata

• Domain Independent: Specifications (RDF), Frameworks/Infrastructures (XCM)
• Application Specific: ICE
• Media Specific: MPEG7, VoiceXML
• Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Layers: Taalee Infrastructure Services; Taalee Content Applications

Life-cycle stages: Produce/Aggregate; Catalog/Index; Integrate/Syndicate; Personalize; Interactive Marketing

Questions that metadata answers along the way:
• Where is the content? Whose is it?
• What other content is it related to?
• What is the right content for this user?
• What is the best way to monetize this interaction?

Delivery channels: Broadcast, Wireline, Wireless, Interactive TV

Foundation: Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization
• Metadata Creation/Extraction; types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

• Sources: Web sites, content feeds, and private repositories
• Types: Text, Graphics, Audio, Video, Multimedia
• Forms: Unstructured text, semi-structured text, structured text (+ media); static or dynamic
• Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

• Infrastructure/Exchange
• Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
• Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

[Figure: METADATA EXTRACTORS operating over Global/Enterprise/Web Repositories. Sources shown: Digital Maps, Nexis/UPI/AP, Documents, Digital Audios, Data Stores, Digital Videos, Digital Images]

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
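The layout rule above (a day name, a month name, an integer, a comma, an integer) can be approximated with a regular expression. This Python sketch is one possible reading of the rule, not the extractor used in the workshop; the function name `extract_report_date` is an assumption, and the commas are made optional so the pattern also matches text whose punctuation has been lost.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int  (commas optional to tolerate stripped text)
DATE_RULE = re.compile(
    rf"\b({DAYS}),?\s+({MONTHS})\s+(\d{{1,2}}),?\s+(\d{{4}})\b"
)


def extract_report_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    weekday, month, day, year = match.groups()
    return {"weekday": weekday, "month": month,
            "day": int(day), "year": int(year)}


if __name__ == "__main__":
    header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
    print(extract_report_date(header))
```

A purely syntactic rule like this finds the date string but knows nothing about what the date refers to; that limitation is what the semantic techniques on the following slides address.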

Traditional Text Categorization

• Inputs: Customer Article Feed (Article 4715), Customer Training Set
• Technique: Statistical/AI Techniques
• Step: Classify / Place in a taxonomy
• Outputs: Classification of Article 4715; Routing/Distribution
• Standard Metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

• Inputs: Article Feed (Article 4715), Taalee Training Set, Customer Training Set
• Technique: Knowledge-base & Statistical/AI Techniques
• Steps: Classify / Place in a taxonomy; Map to another taxonomy
• Outputs: Classification of Article 4715; Metadata Catalog; Content Manager; Routing/Distribution; Precise syndication/filtering
• Components: Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite

Article 4715 Metadata

• Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
• Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News
• Related categories for FTE and ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
• Related categories for NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN

[Figure labels: Video Segment with Associated Text; Segment Description; Semantic Metadata; Auto Categorization]

HP 50

Automatic Categorization & Metadata Tagging (Web page)

[Figure labels: Video with Editorialized Text on the Web; Auto Categorization; Semantic Metadata]

HP 51

Automatic Categorization & Metadata Tagging (Feed)

[Figure labels: Text from Bloomberg; Auto Categorization; Semantic Metadata]

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities
• Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships
• Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)
• Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
• Statistical (Information Retrieval)
• Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
• Logic Based (Description Logic)
• Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

In order of increasingly deep understanding:

• Statistical methods: Cluster Analysis; Learning/AI and Collaborative Filtering
• Reference data: Concept-terms, Dictionary/Thesaurus, by topic/industry/subject/domain (word or phrase)
• Ontologies/Domain Models
• Knowledge Base: by entities and relationships

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

• Standardize meaning, description, representation of involved attributes
• Capture the semantics involved via domain characteristics
• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
• Attributes
• Domain Rules
• Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: four linked ontologies]
• LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
• LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
• WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
• NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several of these are linked by "causes" relationships

HP 61

Large Vocabularies, Taxonomies, Ontologies

• WordNet
• The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

• traditional queries based on keywords
• attribute-based queries
• content-based queries

HP 64

Oingo.com

• Oingo Ontology – (ODP based?) the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning, used to state intent; the basis for intelligent interaction
• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for
• Closely-related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort.

Intelligent content can be more effectively managed, packaged and distributed.

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" content can only be "found" if the user enters a keyword that exists within it.

Intelligent Content = content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search
• Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Virage search on "football touchdown" (metadata from typical cataloging of football assets):

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee metadata on football assets (Rich Media Reference Page):

Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

• League: Professional
• Teams: Ravens, Steelers
• Score: Bal 31, Pit 24
• Players: Qadry Ismail, Tony Banks
• Event: Touchdown
• Produced by: NFL.com
• Posted date: 2022000
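The difference between the two kinds of search can be made concrete with a small Python sketch. The two assets below are abbreviated from the slide; the data layout and the function names `keyword_search` and `attribute_search` are illustrative assumptions, not Taalee's or Virage's actual interfaces.

```python
# Two assets abbreviated from the slide; the second carries semantic metadata.
ASSETS = [
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "description": "Third long touchdown, this time on a 76-yarder",
     "metadata": {"League": "Professional",
                  "Teams": ["Ravens", "Steelers"],
                  "Players": ["Tony Banks"],
                  "Event": "Touchdown"}},
]


def keyword_search(assets, words):
    """Match only assets whose title or description contains every word."""
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if all(word.lower() in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Match assets whose metadata satisfies every attribute=value pair."""
    hits = []
    for asset in assets:
        meta = asset["metadata"]
        ok = True
        for key, wanted in criteria.items():
            stored = meta.get(key)
            if isinstance(stored, list):
                ok = ok and wanted in stored
            else:
                ok = ok and stored == wanted
        if ok:
            hits.append(asset["title"])
    return hits


if __name__ == "__main__":
    print(keyword_search(ASSETS, ["football", "touchdown"]))
    print(attribute_search(ASSETS, Teams="Ravens", Event="Touchdown"))
```

In this toy data neither asset contains the literal word "football", so the keyword query returns nothing, while the attribute query on Teams and Event finds the game highlight directly.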

HP 72

Taalee's Semantic Search

• Highly customizable, precise and freshest A/V search
• Context and domain-specific attributes
• Uniform metadata for content from multiple sources; can be sorted by any field
• Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

• Semantic applications for highly relevant and fresh content
• Personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc
• AT&T Corp
• more…

2. My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone…
• Edwards explains reasons for leaving BYU
• more…

3. Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory
• more…

4. Pink Floyd Collection
• Set the Controls for the Heart of the Sun…
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream
• more…

HOT Topics

1. Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge
• more…

2. Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security
• more…

3. Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II
• more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources.

HP 82

Web Extreme Personalization

[Figure: Web extreme personalization]
• Content sources: Realtime Feeds, Web Sites and Pages, Content Databases
• Time-Shifted Content Aggregator
• Semantic Engine(TM) with Structured, Hi-Quality Semantic Metabase
• User input: Interests, Preferences
• Output: Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks | News | Sports | Music | MyMedia | $

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks | News | Sports | Music | MyMedia | $

My Stocks: CSCO, NT, IBM, Market

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic, personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks | News | Sports | Music | MyMedia | $

My Stocks: CSCO, NT, IBM, Market

CSCO: Analyst Call, Conf Call, Earnings

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks | News | Sports | Music | MyMedia | $

My Stocks: CSCO, NT, IBM, Market

CSCO: Analyst Call, Conf Call, Earnings

CSCO Analysis:
• 11/08 ON24 Payne
• 11/07 ON24 H&Q
• 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
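One way to picture the semantic filtering described on this slide is as a per-content-type choice of which metadata fields to display on a small screen. The Python sketch below is a guess at the idea, not Taalee's implementation: the field names, the `DISPLAY_FIELDS` table, the `render_listing` function and the 20-character width are all assumptions, and the three rows are taken from the slide.

```python
# Rows from the slide; "rating" stands in for the extra metadata behind the icon.
ANALYST_CALLS = [
    {"date": "11/06", "source": "CBS", "analyst": "Langlesis", "rating": None},
    {"date": "11/08", "source": "ON24", "analyst": "Payne", "rating": None},
    {"date": "11/07", "source": "ON24", "analyst": "H&Q", "rating": "Strong Buy"},
]

# Hypothetical table: which fields matter for each content type.
DISPLAY_FIELDS = {
    "Analyst Call": ("date", "source", "analyst"),
}


def render_listing(items, content_type, max_width=20):
    """Return one short line per item, newest first, using only the
    fields selected for this content type (dates are same-year MM/DD)."""
    fields = DISPLAY_FIELDS[content_type]
    newest_first = sorted(items, key=lambda item: item["date"], reverse=True)
    return [" ".join(item[f] for f in fields)[:max_width]
            for item in newest_first]


if __name__ == "__main__":
    for line in render_listing(ANALYST_CALLS, "Analyst Call"):
        print(line)
```

The remaining metadata (here the hypothetical `rating` field) is not dropped from the record; it is simply left out of the compact listing and can be surfaced through an icon or a detail view.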

HP 87

iTV: Taalee's Extreme Personalization

[Figure: iTV extreme personalization]
• Content Provider (DBS, DISH, Wink, AOL-TV)
• Content "Programs" and Metadata-Tagged Content
• Semantic Engine(TM) with Structured, Hi-Quality Semantic Metabase
• User input: Immediate Interests, Preferences
• Output: Personalized Content Capsules; Redirects and Programming

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

• This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
• This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.
• Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
• The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Figure: metadata in enterprise applications]
• Sources: Network Content, Affiliate Feeds, Public Sources
• Collection and Processing: Categorize, Catalog, Integrate
• Rich Data Metabase
• Uses: Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
• Production Support (Sony)

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Video: Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

• Gore Demands That Recount Restart
• Gore Says Fla. Can't Name Electors
• Bush Meets Colin Powell at Ranch
• Market Tumbles on Earnings Warning
• Barak Outlines His Peace Plan

Related clips (duration - date - network):
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Figure: iTV delivery pipeline]
• Pipeline elements: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE
• Taalee Semantic Engine: "Cisco Systems" node; Enhanced XML Description; Metadata-rich, Value-added Node
• Object Content Information (OCI): Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
• Business model: License metadata decoder and semantic applications to device makers; channel sales through Video Server Vendors, Video App Servers and Broadcasters

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content: usage

• Extractor Agents: content which does contain the words the user asked for
• + Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
• + Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
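The Roger Clemens example can be sketched as a lookup against a small knowledge base of entities and relationships. The Python code below is a toy illustration under stated assumptions: the `KNOWLEDGE_BASE` contents, the `enrich` function, the sample story text and the "Produced by" value are hypothetical, and real entity recognition is far more involved than the substring match used here.

```python
# Toy knowledge base: entity -> facts linked by semantic associations.
# Entries are illustrative; only the associations named in the slides are used.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball",
                      "team": "New York Yankees"},
    "Jamal Anderson": {"type": "PERSON", "sport": "Football",
                       "team": "Atlanta Falcons"},
}


def enrich(story_text, metadata):
    """Return a copy of metadata with value-added fields for known entities."""
    enriched = {key: list(value) if isinstance(value, list) else value
                for key, value in metadata.items()}
    for entity, facts in KNOWLEDGE_BASE.items():
        if entity in story_text:  # naive entity spotting (assumption)
            players = enriched.setdefault("Players", [])
            if entity not in players:
                players.append(entity)
            teams = enriched.setdefault("Teams", [])
            if facts["team"] not in teams:
                teams.append(facts["team"])
            enriched.setdefault("Sport", facts["sport"])
    return enriched


if __name__ == "__main__":
    story = "Roger Clemens struck out ten batters last night."
    extracted = {"Produced by": "ESPN"}  # hypothetical explicit metadata
    print(enrich(story, extracted))
```

After enrichment, a search on the Teams field for "New York Yankees" reaches the story even though the team name never appears in its text, which is the behaviour the slide describes.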

HP 96

Guided Demo for Value-Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top 10: Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10: Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots (“Jamal Anderson”)

Search for ‘Jamal Anderson’ in ‘Football’

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots (“Gary Sheffield”)

Search for ‘Gary Sheffield’ in ‘Baseball’

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - extracted automatically
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee’s semantic relationships
League Name: Name of league to which the player’s team belongs - not mentioned explicitly in asset - value-added by Taalee’s processing based on semantic associations
Sport: Name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact with it

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the phrase “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
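The difference between keyword lookup and association-aware lookup can be illustrated with a small sketch. It assumes a knowledge base stored as (subject, relation, object) triples; the relation names, the `NEWS` items and both functions are hypothetical and are not the Semantic Engine's actual data model.

```python
# Hypothetical triples encoding the associations named in the example above.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical cataloged assets, each tagged with the companies it mentions.
NEWS = [
    {"title": "Story A", "companies": ["Commerce One"]},
    {"title": "Story B", "companies": ["Ariba"]},
    {"title": "Story C", "companies": ["Some Other Company"]},
]


def related(entity, relation):
    """Objects linked to `entity` through `relation`."""
    return [o for s, r, o in TRIPLES if s == entity and r == relation]


def semantic_search(company):
    """Return (what was asked for, associated news on competitors)."""
    asked_for = [n["title"] for n in NEWS if company in n["companies"]]
    competitors = related(company, "competes_with")
    associated = [n["title"] for n in NEWS
                  if any(c in n["companies"] for c in competitors)]
    return asked_for, associated


asked_for, associated = semantic_search("Commerce One")
```

A pure keyword match would stop at `asked_for`; the `competes_with` triple is what brings in the story about Ariba.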

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company ‘Commerce One’

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)

Components: Extractor Agents 1-3; Semantic Engine; WorldModel; Taalee Metabase; Knowledge Base; Semantic Application (ASP/Enterprise hosted); third-party Content Management and Syndication; XCM-compliant metadata in XML or other format

Flow:
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco’s competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as “semantically related” to the Cisco story and passed on to the Dashboard

Output: Story on Cisco; Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What you asked for + What you need to know

COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
Technology/Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic Search, Personalization etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = “Frame System for WWW”

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

“A Web in which machine reasoning will be ubiquitous and devastatingly powerful” [Berners-Lee]

“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture” [Berners-Lee]

“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities” [Berners-Lee]

A personal definition: Semantic Web - the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude ) < 10000
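The rule above can be evaluated directly over pairs of events. The sketch below is an illustration only: the `Event` class and function names are hypothetical, the distance is assumed to be a great-circle distance in miles (computed with the haversine formula), and the requirement that the earthquake follows the test is taken from the prose rather than from the rule text.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float    # degrees
    longitude: float   # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake came within max_days after the test and within max_miles."""
    days = (quake.event_date - test.event_date).days
    near = distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


# Hypothetical events, for illustration only.
test_event = Event(date(1998, 5, 1), 30.0, 70.0)
quake_soon = Event(date(1998, 5, 11), 31.0, 71.0)
quake_late = Event(date(1998, 7, 1), 31.0, 71.0)
```

The two thresholds are the constants from the slide; making them parameters lets the same predicate be reused with a tighter notion of "nearby".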

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test was within a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 20

NewsML...

It can be used by news providers to combine their pictures, video, text, graphics and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television and any other device

Accurate, objective set of description tools which help qualify the information and make the search more precise

NewsML allows a range of metadata to be attached to a multi-media story, including a detailed computer-readable description of what an item is about

HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport etc. and updated hourly

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta et al.)
Idea: “a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts”
Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
  Basic: DC
  Extensions: “Controlled Vocabularies”, e.g. “North American Industrial Classification System” (NAICS)

HP 24

PRISM Example

    <?xml version="1.0" encoding="UTF-8"?>
    <rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
             xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
        <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
        <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
        <dc:title>Walking on the Beach in Corfu</dc:title>
        <dc:creator>John Peterson</dc:creator>
        <dc:contributor>Sally Smith, lighting</dc:contributor>
        <dc:format>image/jpeg</dc:format>
      </rdf:Description>
    </rdf:RDF>

(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
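Because a PRISM description is ordinary RDF/XML built from Dublin Core elements, a consumer can read it with a generic XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the embedded record is a well-formed rendering of the example above (with the XML declaration omitted), and the helper name `dublin_core_fields` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

RECORD = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core_fields(xml_text):
    """Collect the Dublin Core properties of the first rdf:Description."""
    root = ET.fromstring(xml_text)
    description = root.find(f"{{{RDF_NS}}}Description")
    fields = {"about": description.get(f"{{{RDF_NS}}}about")}
    for child in description:
        namespace, _, local_name = child.tag[1:].partition("}")
        if namespace == DC_NS:
            # A property is either literal text or a reference to a resource.
            fields[local_name] = child.text or child.get(f"{{{RDF_NS}}}resource")
    return fields


fields = dublin_core_fields(RECORD)
```

A full RDF toolkit would handle more of the syntax; for flat descriptions like this one, namespace-aware element lookup is usually enough.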

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source httpwwwvoicexmlorg

HP 27

VoiceXML Metadata

Voice-specific metadata

Supports syntactic interoperability

Text data to voice data

VoiceXML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing etc.)

Financial - banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number “find-me” services

Intranet - inventory, HR services, corporate portals

Unification - “My Whatever” personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible content syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define:
  roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
  format and method of content exchange (e.g. sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

    <?xml version="1.0"?>
    <!DOCTYPE ice-payload SYSTEM "httpicedtd">
    <ice-payload payload-id="ipl-80a56cfe"
                 timestamp="05-15-2001T11:00:01" ice.version="1.0">
      <ice-response response-id="irp-20010515181600">
        <ice-item-group group-id="grp-8610">
          <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                    filename="demo.gif" content-type="application/xml">
            <comic-strip title="Looney City" author="Amito Pateru"
                         copyright="Taalee Makeups" pubdate="20010515">
              PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ...(ASCII-encoded image)
            </comic-strip>
          </ice-item>
        </ice-item-group>
      </ice-response>
    </ice-payload>

The comic-strip element is the content (domain-specific metadata); the surrounding ice-* elements are the ICE envelope

HP 36

XCM (eXtended Content Management)

a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are

Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM - eXtended Content Management

Content Development/Management: Content Authoring; Digital Asset Management; Software Configuration Management; Document Process Management

Application Content Management: Metadata Management; Recombination; Personalization

Content Delivery: Edge Network Delivery; Streaming Media Delivery; Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (example: Kansas State)

FGDC Metadata Model
  Theme keywords: digital line graph, hydrography, transportation
  Title: Dakota Aquifer
  Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
  Direct Spatial Reference Method: Vector
  Horizontal Coordinate System Definition: Universal Transverse Mercator
  ...

UDK Metadata Model
  Search terms: digital line graph, hydrography, transportation
  Topic: Dakota Aquifer
  Address Id: http://gisdasc.kgs.ukans.edu/dasc/
  Measuring Techniques: Vector
  Co-ordinate System: Universal Transverse Mercator
  ...
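When two models use different tag names for the same data, one simple way to interoperate is an explicit tag-to-tag mapping. The sketch below pairs the FGDC and UDK tags in the order they are listed above; that pairing, the dictionary `FGDC_TO_UDK` and the function `fgdc_to_udk` are illustrative assumptions rather than an official crosswalk.

```python
# Assumed correspondence between FGDC and UDK tag names (illustrative only).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename the FGDC tags of `record` to UDK tags; unknown tags pass through."""
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = fgdc_to_udk(fgdc_record)
```

A renaming table only addresses tag-name differences; differences in what a field means would need a shared domain model.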

HP 39

Different views of Metadata

Domain Independent: Specifications (RDF); Frameworks/Infrastructures (XCM)

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services / Taalee Content Applications

Produce/Aggregate, Catalog/Index: Where is the content? Whose is it?

Integrate, Syndicate: What other content is it related to?

Personalize: What is the right content for this user?

Interactive Marketing: What is the best way to monetize this interaction?

Channels: Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taaleersquos Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web sites, content feeds and private repositories
Types: Text, graphics, audio, video, multimedia
Forms: Unstructured text, semi-structured text, structured text (+ media); static or dynamic
Ingest: Feed (push), Web (pull), repository/database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed handlers, crawlers, screen scrapers, bots, software agents

Centralized, distributed, mobile/migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories → METADATA EXTRACTORS

Repository content: Digital Maps; Nexis/UPI/AP; Documents; Digital Audios; Data Stores; Digital Videos; Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ‘,’ int
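A layout rule of this kind (day name, month name, integer, comma, integer) maps naturally onto a regular expression. The sketch below is one possible reading of the rule, not the extractor used in the workshop; the pattern `DATE_RULE` and the function `extract_date` are hypothetical, and commas are treated as optional because the extracted text has lost its punctuation.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

# Date => day month int ',' int   (commas optional in this sketch)
DATE_RULE = re.compile(
    r"\b(?P<day>[A-Z][a-z]+day),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+"
    r"(?P<year>\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return date(int(match["year"]), MONTHS[match["month"]], int(match["dom"]))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
report_date = extract_date(header)
```

Purely syntactic rules like this are cheap and precise for well-formatted fields, but they recover only what the layout makes explicit.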

Traditional Text Categorization

Customer article feed (Article 4715) → Statistical/AI techniques (customer training set) → Classify: place in a taxonomy → Routing/Distribution

Standard metadata for Article 4715:
  Feed Source: iSyndicate
  Posted Date: 11/20/2000

Taalee’s Categorization & Automatic Metadata Creation

Article feed (Article 4715) → Knowledge-base & Statistical/AI techniques (Taalee training set, customer training set) → Classify: place in a taxonomy; map to another taxonomy → Metadata Catalog → Content Manager; Routing/Distribution; precise syndication/filtering

Article 4715 metadata:
  Standard metadata
    Feed Source: iSyndicate
    Posted Date: 11/20/2000
  Semantic metadata
    Company Name: France Telecom, Equant
    Ticker Symbol: FTE, ENT
    Exchange: NYSE
    Topic: Company News

Automated Content Enrichment (ACE):
  FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
  ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
  NYSE: Member Companies, Market News, IPOs

Taalee Enterprise Customization Suite

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video segment with associated text → Auto Categorization → Segment description; Semantic Metadata

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with editorialized text on the Web → Auto Categorization → Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg → Auto Categorization → Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
  (Statistical) Information Retrieval
  Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
  Logic Based (Description Logic)
  Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

In order of deeper understanding:

Statistical methods: Cluster Analysis; Learning/AI and Collab. Filtering

Reference data: Concept-terms, Dictionary/Thesaurus; by topic/industry/subject/domain; word or phrase

Ontologies/Domain Models; KnowledgeBase: by Entities and Relationships

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes; Domain Rules; Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies (figure)

LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked by “causes” relationships

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords
attribute-based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user “think” about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry etc.) and
• Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

“Normal” Content can only be “found” if the user enters a keyword that exists within it

Intelligent Content = Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more “intelligent” through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (e.g. Company, Industry etc.) and
Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata amp Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google’s Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDISUGA

HP 71

Keyword Search vs Attribute Search with Semantic Metadata

Virage search on “football touchdown” (metadata from typical cataloging of football assets):

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline...

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline...

Taalee metadata on football assets (Rich Media Reference Page):

Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens’ lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
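The contrast between the two kinds of search can be made concrete with a small sketch. The three assets reuse the titles and attribute values shown above; the data layout and the functions `keyword_search` and `attribute_search` are hypothetical and do not describe either Virage's or Taalee's actual search.

```python
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "text": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "text": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31, Pit 24",
     "text": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional",
                  "Teams": ["Ravens", "Steelers"],
                  "Players": ["Quandry Ismail", "Tony Banks"],
                  "Event": "Touchdown"}},
]


def keyword_search(assets, *keywords):
    """Titles of assets whose title or text contains every keyword."""
    hits = []
    for asset in assets:
        haystack = (asset["title"] + " " + asset["text"]).lower()
        if all(keyword.lower() in haystack for keyword in keywords):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Titles of assets whose metadata matches every attribute=value pair."""
    hits = []
    for asset in assets:
        metadata = asset["metadata"]
        matched = True
        for attribute, wanted in criteria.items():
            value = metadata.get(attribute, [])
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                matched = False
                break
        if matched:
            hits.append(asset["title"])
    return hits


by_keyword = keyword_search(ASSETS, "touchdown")
by_attribute = attribute_search(ASSETS, Event="Touchdown", Teams="Ravens")
```

The keyword query returns anything that mentions the word, including an interview; the attribute query returns only the asset cataloged as a touchdown event involving the named team.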

HP 72

Taaleersquos Semantic Search

Highly customizable, precise and freshest A/V search

Context and domain-specific attributes. Uniform metadata for content from multiple sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more...

2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...

3 Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more...

4 Pink Floyd Collection
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more...

HOT Topics

1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more...

2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more...

3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Diagram labels: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests, Preferences; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks, News, Sports, Music, MyMedia, $

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks, News, Sports, Music, MyMedia, $

My Stocks: CSCO, NT, IBM, Market

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks, News, Sports, Music, MyMedia, $

My Stocks: CSCO, NT, IBM, Market

CSCO: Analyst Call, Conf Call, Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks, News, Sports, Music, MyMedia, $

My Stocks: CSCO, NT, IBM, Market

CSCO: Analyst Call, Conf Call, Earnings

CSCO Analysis
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine(TM); Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Diagram labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing (Categorize, Catalog, Integrate); Rich Data Metabase; Production Support (Sony); Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

Diagram labels: Video; MPEG Encoder; MPEG-2/4/7; Enhanced Digital Cable; MPEG Decoder; Create Scene Description Tree; Retrieve Scene Description Track; Scene Description Tree; Node = AVO Object; Enhanced XML Description; Taalee Semantic Engine; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE

Node: "Cisco Systems"

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

HP 93

Intelligent Metadata Creation and Usage

Metadata for Intelligent Content

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
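The idea of adding metadata that is not present in the content can be sketched as a lookup against a small knowledge base. This is a minimal illustration under stated assumptions, not Taalee's Semantic Engine: the `KNOWLEDGE_BASE` dictionary, the relation names (`plays_for`, `plays_sport`, `member_of`) and the function name are hypothetical; the facts about Roger Clemens and Jamal Anderson are the ones given in this workshop.

```python
# Hypothetical knowledge base: (subject, relation) -> object.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Jamal Anderson", "plays_for"): "Atlanta Falcons",
    ("Atlanta Falcons", "member_of"): "NFL",
}


def _add(metadata, key, value):
    """Append value under key unless it is already present."""
    values = metadata.setdefault(key, [])
    if value not in values:
        values.append(value)


def add_value_added_metadata(metadata, kb=KNOWLEDGE_BASE):
    """Return a copy of metadata enriched with facts implied by the players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        team = kb.get((player, "plays_for"))
        if team:
            _add(enriched, "teams", team)
            league = kb.get((team, "member_of"))
            if league:
                _add(enriched, "leagues", league)
        sport = kb.get((player, "plays_sport"))
        if sport:
            _add(enriched, "sports", sport)
    return enriched


# Metadata extracted from a story that mentions only the player.
story = {"players": ["Roger Clemens"]}
enriched_story = add_value_added_metadata(story)
```

After enrichment, a search on the `teams` attribute for "New York Yankees" can find the story even though the phrase never occurs in the text.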

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of content provider that produced the asset

Posted Date: date of asset posting - extracted automatically

Player Names: names of players mentioned explicitly in the asset - extracted automatically

Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships

League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations

Sport: name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways the users choose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
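Semantic associations of this kind can be represented as subject-relation-object triples and used to pull in related content. The sketch below is an illustration only: the triple list, the relation names, the function names and the sample headlines are hypothetical; only the Commerce One and Ariba facts come from the slide.

```python
# Semantic associations as (subject, relation, object) triples.
ASSOCIATIONS = [
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def related_entities(entity, relation, symmetric=False):
    """Entities linked to `entity` by `relation`; optionally in both directions."""
    found = [o for s, r, o in ASSOCIATIONS if s == entity and r == relation]
    if symmetric:
        found += [s for s, r, o in ASSOCIATIONS if o == entity and r == relation]
    return found


def intelligent_results(entity, news_index):
    """What the user asked for, plus news on the entity's competitors."""
    competitors = related_entities(entity, "competes_with", symmetric=True)
    return {
        "asked_for": news_index.get(entity, []),
        "competitors": {c: news_index.get(c, []) for c in competitors},
    }


# Made-up headlines, used only to exercise the functions.
NEWS_INDEX = {
    "Commerce One": ["Sample headline about Commerce One"],
    "Ariba": ["Sample headline about Ariba"],
}

results = intelligent_results("Commerce One", NEWS_INDEX)
```

Treating `competes_with` as symmetric is a design choice of this sketch: a search for Ariba then also surfaces Commerce One without storing the triple twice.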

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Components: Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults KnowledgeBase for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Story on Cisco

Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition. Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30 AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
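The rule above can be written as an executable predicate. The following is a sketch under stated assumptions, not the InfoQuilt implementation: it assumes the date difference is measured in days and the distance in miles, uses the haversine great-circle formula for `distance`, and additionally requires the earthquake not to precede the test (taken from the prose statement of the rule). The `Event` class and all sample coordinates are made up for illustration.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles, computed with the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and lies within max_miles."""
    days = date_difference(test.event_date, quake.event_date)
    miles = distance(test.latitude, test.longitude,
                     quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles


# Made-up events, used only to exercise the rule.
test_event = Event(date(1998, 5, 11), 27.0, 71.7)
quake_after = Event(date(1998, 5, 30), 37.1, 70.1)
quake_before = Event(date(1998, 5, 1), 37.1, 70.1)
quake_late = Event(date(1998, 7, 1), 37.1, 70.1)
```

Finding all candidate pairs is then a filter over the cross product of the two sources, for example `[(t, q) for t in tests for q in quakes if could_have_caused(t, q)]`.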

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
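The grouping query above amounts to binning earthquakes by magnitude range and year, then averaging. A minimal sketch follows; the half-open bin boundaries (so that a magnitude of exactly 9 falls in the last bin), the function names and the sample records are assumptions made for illustration and are not real catalog data.

```python
from collections import Counter

# Magnitude ranges from the slide, treated as half-open intervals [low, high).
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_bin(magnitude):
    """Return the (low, high) range containing magnitude, or None if below 5.8."""
    for low, high in BINS:
        if low <= magnitude < high:
            return (low, high)
    return None


def quakes_per_year_by_bin(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year onward."""
    counts = Counter()
    for year, magnitude in quakes:
        bin_ = magnitude_bin(magnitude)
        if year >= start_year and bin_ is not None:
            counts[(year, bin_)] += 1
    return counts


def average_per_year(counts, bin_, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive year span."""
    total = sum(counts[(year, bin_)] for year in range(first_year, last_year + 1))
    return total / (last_year - first_year + 1)


# Made-up (year, magnitude) records, used only to exercise the functions.
sample = [(1950, 6.5), (1950, 6.1), (1951, 7.2), (1899, 8.0), (1950, 5.0)]
sample_counts = quakes_per_year_by_bin(sample)
```

Comparing `average_per_year` for a range before and after a chosen year is one way to check a claim such as "almost constant for magnitude > 7".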

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

ResourcesReferences

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 21

Example of the end-to-end flow - NewsML

The content provider supplies NewsML packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly

The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers

Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction

Source: http://www.mediabricks.com

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
Basic DC
Extensions: "Controlled Vocabularies", e.g., "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
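A record like the one above can be read with Python's standard `xml.etree.ElementTree` module. The sketch below is illustrative rather than a PRISM toolkit: the function name `read_descriptions` is hypothetical, and the punctuation of the URIs in the embedded sample is reconstructed and should be treated as an assumption.

```python
import xml.etree.ElementTree as ET

RDF_NS = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Sample record modeled on the PRISM example; URI punctuation is reconstructed.
RECORD = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def read_descriptions(xml_text):
    """Return one dict per rdf:Description, holding its Dublin Core fields."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    records = []
    for description in root.findall(RDF_NS + "Description"):
        record = {"about": description.get(RDF_NS + "about")}
        for child in description:
            if not child.tag.startswith(DC_NS):
                continue  # skip elements outside the Dublin Core namespace
            name = child.tag[len(DC_NS):]
            # A value is either element text or an rdf:resource reference.
            value = child.text or child.get(RDF_NS + "resource")
            record.setdefault(name, []).append(value)
        records.append(record)
    return records


records = read_descriptions(RECORD)
```

Each field is kept as a list because Dublin Core elements may repeat, for example several `dc:contributor` entries on one resource.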

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number "find-me" services

Intranet - inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define:

• roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

• format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE:
establishes and manages the syndication
delivers data
logs events
=> content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information

Sources: InternetWeek; ICE Cookbook, version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM: eXtended Content Management

Content Development: Content Authoring; Digital Asset Management; Software Configuration Management; Document Process Management

Application Content Management: Metadata Management; Recombination; Personalization

Content Delivery: Edge Network Delivery; Streaming Media Delivery; Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…

Kansas State
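
The FGDC/UDK comparison above suggests a simple field-name crosswalk. The following is a minimal sketch, assuming the field pairings implied by the slide; the `FGDC_TO_UDK` table and the `fgdc_to_udk` helper are illustrative names, not part of either standard.

```python
# Hypothetical crosswalk: FGDC field name -> UDK field name, as paired on the slide.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC keys to their UDK equivalents; unmapped keys are kept as-is."""
    return {FGDC_TO_UDK.get(key, key): value for key, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = fgdc_to_udk(fgdc_record)
print(udk_record["Topic"])
```

A renaming table like this only addresses differing tag names; differences in value encodings or semantics would need more than a dictionary lookup.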

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services Taalee Content Applications

Where is the content? Whose is it? - Produce/Aggregate, Catalog/Index

What other content is it related to? - Integrate, Syndicate

What is the right content for this user? - Personalize

What is the best way to monetize this interaction? - Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange
Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
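
The layout rule for Date can be approximated with a regular expression. This is a minimal sketch under stated assumptions: the names `DATE_RE` and `extract_date` are illustrative, and the commas are treated as optional because punctuation may be missing from extracted text.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int
# Commas are optional here (an assumption) to tolerate stripped punctuation.
DATE_RE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date found as a dict of named parts, or None."""
    match = DATE_RE.search(text)
    return match.groupdict() if match else None


report = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_date(report))
```

A purely syntactic rule like this finds the date but says nothing about what the date refers to, which is the limitation the later slides on semantic metadata address.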

Traditional Text Categorization

Inputs: Customer Article Feed (Article 4715), Customer Training Set
Method: Statistical/AI Techniques
Result: Classify/Place in a taxonomy; Classification of Article 4715; Routing/Distribution

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite

Inputs: Article Feed (Article 4715), Taalee Training Set, Customer Training Set
Method: Knowledge-base & Statistical/AI Techniques
Result: Classify/Place in a taxonomy; Classification of Article 4715; Metadata Catalog; Content Manager; Routing/Distribution; Precise syndication/filtering; Map to another taxonomy

Article 4715 Metadata
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News

Taxonomy nodes
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships
Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis

Learning/AI and Collab. Filtering

Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE (shredding, magnetic separation, screening, washing)

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI (several linked by "causes" relationships)

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords
attribute-based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology (ODP based?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering; link and other analysis (e.g., Google's Link Flux analysis); classification as context; ontologies; metadata; knowledgebases; …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage: Search on "football touchdown"

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
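
The contrast between keyword search and attribute search can be sketched in a few lines. This is an illustration only: the records are modeled on the slide, the interview's metadata fields are assumptions, and `keyword_search` and `attribute_search` are hypothetical helpers, not Virage or Taalee APIs.

```python
# Two records modeled on the slide; metadata not shown on the slide
# (for example, the interview's fields) is assumed for illustration.
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "description": "Ismail and Tony Banks hook up for their third long "
                       "touchdown, this time on a 76-yarder",
        "metadata": {"League": "Professional", "Teams": ["Ravens", "Steelers"],
                     "Players": ["Ismail", "Tony Banks"], "Event": "Touchdown"},
    },
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {"Players": ["Brian Griese"], "Event": "Interview"},
    },
]


def keyword_search(assets, word):
    """Match a word anywhere in the free-text description."""
    word = word.lower()
    return [a["title"] for a in assets if word in a["description"].lower()]


def attribute_search(assets, **criteria):
    """Match only assets whose metadata fields contain every requested value."""
    hits = []
    for asset in assets:
        meta = asset["metadata"]
        matched = True
        for field, wanted in criteria.items():
            value = meta.get(field, [])
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                matched = False
                break
        if matched:
            hits.append(asset["title"])
    return hits


print(keyword_search(ASSETS, "touchdown"))
print(attribute_search(ASSETS, Event="Touchdown", Teams="Steelers"))
```

The keyword query returns the interview as well as the game highlight, while the attribute query returns only the asset whose Event is a touchdown, even though "Steelers" never appears in its description.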

HP 72

Taaleersquos Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more…

2. My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…

3. Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more…

4. Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more…

HOT Topics

1. Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…

2. Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more…

3. Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…

HP 80

Metadata Targeting

SemanticInteractive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web: Extreme Personalization

Realtime Feeds

Interests, Preferences

Time-Shifted Content Aggregator

Web sites and Pages

Content Databases

Semantic Engine™

Personalized Content

Structured Hi-Quality Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic Engine™

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection, Processing

Network Content

Affiliate Feeds

Public Sources

Rich Data Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

Video

MPEG Encoder (MPEG-2/4/7)

Create Scene Description Tree

Node = AVO Object

Enhanced Digital Cable

MPEG Decoder

Retrieve Scene Description Track

Scene Description Tree

GREAT USER EXPERIENCE

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Enhanced XML Description

"Cisco Systems" Node

Taalee Semantic Engine

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-rich Value-added Node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for

Extractor Agents

Content which does not contain the words the user asked for, but is about what he asked for

Value-added Metadata

Content the user did not think to ask for, but which he needs to know

Semantic Associations

+ +

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
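
The Roger Clemens example can be sketched as a lookup over stored associations. This is a minimal illustration, not Taalee's implementation: the `ASSOCIATIONS` table, the relation names, the helper functions, and the story text are all made up for the example.

```python
# Hypothetical knowledge-base fragment: entity -> list of (relation, value).
ASSOCIATIONS = {
    "Roger Clemens": [("plays_sport", "Baseball"),
                      ("plays_for_team", "New York Yankees")],
}


def add_value_added_metadata(asset):
    """Return a copy of the asset with metadata implied by its entities."""
    enriched = dict(asset)
    terms = set(asset["entities"])
    for entity in asset["entities"]:
        for _relation, value in ASSOCIATIONS.get(entity, []):
            terms.add(value)
    enriched["metadata"] = terms
    return enriched


def keyword_hit(asset, query):
    """Syntactic search: the query must occur in the story text."""
    return query.lower() in asset["text"].lower()


def metadata_hit(asset, query):
    """Metadata search: the query may match extracted or value-added terms."""
    return any(query.lower() in term.lower() for term in asset["metadata"])


story = add_value_added_metadata({
    "text": "Roger Clemens struck out ten batters.",  # made-up story text
    "entities": ["Roger Clemens"],
})
print(keyword_hit(story, "Yankees"), metadata_hit(story, "Yankees"))
```

The keyword lookup misses the story because "Yankees" is absent from the text, while the metadata lookup finds it through the team association.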

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - Extracted automatically
Player Names: Names of players mentioned explicitly in the asset - Extracted automatically
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Sport: Name of sport

Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways the users choose to interact

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Internal Source 1 (Research)

Internal Source 2

External feeds/Web (e.g., Reuters)

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults KnowledgeBase for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
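
The four numbered steps in this architecture can be sketched as a small function. This is only an illustration under stated assumptions: the `COMPETITORS` table, the story records, and `semantically_related` are hypothetical stand-ins for the KnowledgeBase, the extractor output, and the Semantic Engine.

```python
# Hypothetical knowledge base of "competes with" relationships.
COMPETITORS = {"Cisco": {"Lucent"}}

# Hypothetical stories, each already tagged with a company by an extractor agent.
INTERNAL_STORY = {"headline": "Story on Cisco", "company": "Cisco"}
EXTERNAL_FEED = [
    {"headline": "Story on Lucent", "company": "Lucent"},
    {"headline": "Story on PeopleSoft", "company": "PeopleSoft"},
]


def semantically_related(story, feed, competitors=COMPETITORS):
    """Return feed stories about competitors of the story's company."""
    rivals = competitors.get(story["company"], set())   # steps 2 and 3
    return [item for item in feed if item["company"] in rivals]  # step 4


# Step 1 hands over the internal story; the dashboard shows it with related news.
dashboard = [INTERNAL_STORY] + semantically_related(INTERNAL_STORY, EXTERNAL_FEED)
print([item["headline"] for item in dashboard])
```

The Lucent story reaches the dashboard without anyone having searched for Lucent, which is the point of using the relationship rather than the keyword.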

Story on Lucent

Story on Cisco

XCM-compliant metadata, XML or other format

Semantic Application

ASP/Enterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata-centric Content Management Architecture

Semantic Engine

World Model

Taalee Metabase

Third-party Content Mgmt and Syndication

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling obviously not enough
we need a "logic layer" on top of RDF
some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent-readable Tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary - RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquakelt= dateDifference( NuclearTesteventDate

EarthquakeeventDate ) lt 30AND distance( NuclearTestlatitude

NuclearTestlongitudeEarthquakelatitudeEarthquakelongitude ) lt 10000
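The rule above can be sketched as an ordinary predicate. This is only an illustration under stated assumptions: the slide does not say how `distance` is computed, so a great-circle (haversine) distance in miles is assumed here; the requirement that the earthquake comes after the test is taken from the prose, not the formula; and the sample records are invented.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumption


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake followed the test within max_days and max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    near = distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles
    return 0 <= days < max_days and near


# Invented sample records using the attribute names from the rule.
test = {"eventDate": date(1998, 5, 11), "latitude": 27.1, "longitude": 71.8}
quake = {"eventDate": date(1998, 5, 30), "latitude": 37.1, "longitude": 70.1}
print(may_have_caused(test, quake))
```

Note that a 10000-mile threshold covers most of the globe, so in practice the date condition does nearly all of the filtering.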

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900

The number of earthquakes with magnitude > 7 is almost constant, so nuclear tests probably only cause earthquakes with magnitude < 7
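The grouping query above can be sketched in a few lines. The magnitude ranges come from the slide; everything else is an assumption for illustration: each range is treated as half-open `[low, high)` (so a magnitude of exactly 9.0 lands in the last bin), and the sample `(year, magnitude)` pairs are invented.

```python
from collections import Counter

# Magnitude ranges from the slide, treated as half-open [low, high).
BINS = [(5.8, 6.0, "5.8-6"), (6.0, 7.0, "6-7"), (7.0, 8.0, "7-8"),
        (8.0, 9.0, "8-9"), (9.0, float("inf"), ">9")]


def magnitude_bin(magnitude):
    """Return the label of the range a magnitude falls in, or None."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None  # below 5.8: outside the ranges on the slide


def average_per_year(quakes, first_year, last_year):
    """Average yearly count of earthquakes in each magnitude range."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for _, _, label in BINS}


# Invented sample data: (year, magnitude)
quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1901, 4.0)]
averages = average_per_year(quakes, 1900, 1901)
print(averages)
```

Comparing these averages for the years before and after 1945 is the step the slide uses to argue that only the lower magnitude ranges changed.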

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

HP 22

PRISM

Publishing Requirements for Industry Standard Metadata

Version 1.0, April 2001

Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)

Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"

Web site: http://www.prismstandard.org

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML

Designed to be used in a simple, straightforward way over the Internet

Compatible with NewsML

Integrates easily with ICE (for syndication)

Vocabulary: Basic DC; Extensions; "Controlled Vocabularies", e.g., "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
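Because a PRISM record like the one above is plain RDF/XML with Dublin Core elements, it can be read with any namespace-aware XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `dublin_core_fields` is hypothetical, and the embedded record is an abridged copy of the example, not a complete PRISM description.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Abridged copy of the slide's example record.
PRISM_RECORD = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core_fields(rdf_xml):
    """Map each described resource to its Dublin Core element values."""
    root = ET.fromstring(rdf_xml)
    records = {}
    for desc in root.findall(f"{{{RDF}}}Description"):
        about = desc.get(f"{{{RDF}}}about")
        fields = {}
        for child in desc:
            # ElementTree spells qualified names as "{namespace}local".
            if child.tag.startswith(f"{{{DC}}}"):
                fields[child.tag[len(DC) + 2:]] = (child.text or "").strip()
        records[about] = fields
    return records


records = dublin_core_fields(PRISM_RECORD)
print(records)
```

Keying on namespace URIs and not on the `dc:` prefix matters here, since a publisher is free to bind the same vocabulary to a different prefix.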

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice-specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - News, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing; one-number "find-me" services

Intranet - Inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are:

Digital libraries (image catalog, musical dictionary)

Multimedia directory services (e.g., yellow pages)

Broadcast media selection (radio channel, TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: an efficient and extensible Content Syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define:

roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

An industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol.
Syndicator: A content aggregator and distributor.
Subscriber: A content consumer.
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement.
Collection: The current content of a subscription.
ICE Package: A delivery of commands to update a collection, such as the addition of content items.
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information.

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Callout on the comic-strip element: Content (domain-specific metadata)
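The glossary terms above (subscription, collection, package) fit together roughly as in the following sketch. It is a toy model written for this transcript, not the ICE wire protocol: the class names, the `("add", ...)` and `("remove", ...)` command tuples, and the strict sequence check are all assumptions chosen to illustrate "a package is a delivery of commands that update a collection" and "sequenced packages".

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """A delivery of commands that update a collection (illustrative)."""
    sequence: int
    commands: list  # ("add", item_id, content) or ("remove", item_id)


@dataclass
class Subscription:
    """Agreement between a subscriber and a syndicator."""
    subscriber: str
    syndicator: str
    collection: dict = field(default_factory=dict)  # current content
    last_sequence: int = 0

    def apply(self, package):
        # Sequenced packages: refuse anything that arrives out of order.
        if package.sequence != self.last_sequence + 1:
            raise ValueError("package out of sequence")
        for command in package.commands:
            if command[0] == "add":
                _, item_id, content = command
                self.collection[item_id] = content
            elif command[0] == "remove":
                self.collection.pop(command[1], None)
        self.last_sequence = package.sequence


sub = Subscription(subscriber="reader-site", syndicator="comic-publisher")
sub.apply(Package(1, [("add", "4321", "Cartoon: Looney City")]))
print(sub.collection)
```

The content carried by an item (here just a string) is where the domain-specific vocabulary lives; everything else in the sketch is content-independent.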

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

[Figure: the three XCM segments and their components]

Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management

Application Content Management: Metadata Management, Recombination, Personalization

Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...

Kansas State
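The FGDC and UDK records above carry the same values under different tag names, so the simplest form of interoperability between them is a tag-to-tag mapping. The sketch below is a minimal illustration: the correspondences are read off the slide, while the `translate` helper and the dictionary representation of a record are assumptions. Real model mappings are rarely this clean (tags can split, merge, or differ in units).

```python
# Tag-name correspondences read off the two models on the slide.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, mapping):
    """Rename the tags of a metadata record; unmapped tags are kept as-is."""
    return {mapping.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = translate(fgdc_record, FGDC_TO_UDK)
print(udk_record)
```

A one-to-one dictionary like this only addresses structural differences in naming; agreeing on what "Topic" or "Measuring Techniques" mean is the semantic part of the problem.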

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services; Taalee Content Applications

Where is the content? Whose is it?: Produce/Aggregate, Catalog/Index

What other content is it related to?: Integrate, Syndicate

What is the right content for this user?: Personalize

What is the best way to monetize this interaction?: Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps, Nexis/UPI/AP, Documents, Digital Audios, Data Stores, Digital Videos, Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
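The layout rule `Date => day month int ',' int` is a purely syntactic pattern, so it can be sketched as a regular expression. This is a minimal illustration, not the extractor shown in the workshop: the names `DATE_RULE` and `extract_date` are hypothetical, `day` is assumed to mean a weekday name, and the commas are treated as optional so that text with stripped punctuation still matches.

```python
import re

DAY = r"(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day"
MONTH = (r"(?:January|February|March|April|May|June|July|August|"
         r"September|October|November|December)")

# Date => day month int ',' int   (commas optional in this sketch)
DATE_RULE = re.compile(rf"({DAY}),?\s+({MONTH})\s+(\d{{1,2}}),?\s+(\d{{4}})")


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    day, month, day_of_month, year = match.groups()
    return {"day": day, "month": month,
            "day_of_month": int(day_of_month), "year": int(year)}


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_date(header))
```

A rule like this finds where the date is but knows nothing about what the report is about, which is the limitation the later slides on semantic metadata address.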

Traditional Text Categorization

Statistical/AI Techniques

Classify: place in a taxonomy

Customer Training Set

Customer Article Feed: Article 4715

Routing/Distribution

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Classification of Article 4715

Taalee's Categorization & Automatic Metadata Creation

Knowledge-base & Statistical/AI Techniques

Classify: place in a taxonomy

Metadata Catalog

Content Manager

Precise syndication/filtering

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Classification of Article 4715
FTE, ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee Training Set; Customer Training Set

Article Feed 4715: Routing/Distribution

Map to another taxonomy

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships
Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis, Learning/AI, and Collab. Filtering

Reference data/Concept-terms (Dictionary/Thesaurus): by topic/industry/subject/domain; Word or Phrase

Ontologies/Domain Models and KnowledgeBase: by Entities and Relationships

(in order of deeper understanding)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: three interrelated ontologies]

LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked by "causes" relationships

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-Enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords
attribute-based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic metadata

Virage: Search on "football touchdown"

Jimmy Smith Interview Part Seven
Jimmy Smith explains his philosophy on showboating
URL: http://cbs.sportsline

Brian Griese Interview Part Four
Brian Griese talks about the first touchdown he ever threw
URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page: http://www.nfl.com

Baltimore 31, Pit 24

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
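The contrast above between keyword search and attribute search can be made concrete with a small sketch. The two assets and their fields are taken from the slide in abridged form; the functions `keyword_search` and `attribute_search` and the record layout are assumptions for illustration, not how Virage or Taalee implemented search.

```python
assets = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Baltimore 31, Pit 24",
     "description": ("Quandry Ismail and Tony Banks hook up for their "
                     "third long touchdown"),
     "metadata": {"League": "Professional",
                  "Teams": ["Ravens", "Steelers"],
                  "Players": ["Quandry Ismail", "Tony Banks"],
                  "Event": "Touchdown"}},
]


def keyword_search(assets, *words):
    """Match only if every word literally appears in the asset's text."""
    hits = []
    for asset in assets:
        text = f'{asset["title"]} {asset["description"]}'.lower()
        if all(word.lower() in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Match on structured metadata fields instead of on the text."""
    def matches(value, wanted):
        return wanted in value if isinstance(value, list) else value == wanted

    return [asset["title"] for asset in assets
            if all(key in asset["metadata"]
                   and matches(asset["metadata"][key], wanted)
                   for key, wanted in criteria.items())]


print(keyword_search(assets, "Steelers"))
print(attribute_search(assets, Teams="Steelers", Event="Touchdown"))
```

The keyword query for a team name finds nothing because the word never occurs in the abridged description, while the attribute query succeeds because the team is recorded as metadata.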

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes

Uniform Metadata for Content from Multiple Sources; can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content

Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more...

2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...

3 Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more...

4 Pink Floyd Collection
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more...

HOT Topics

1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more...

2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more...

3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...

HP 80

Metadata Targeting

SemanticInteractive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure: content flow for Web personalization]

Inputs: Realtime Feeds, Web sites and Pages, Content Databases

Time-Shifted Content Aggregator

Semantic Engine(TM)

Structured Hi-Quality Semantic Metabase

Interests, Preferences

Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine(TM)

Meta-Data Tagged Content

Content "Programs"

Immediate Interests, Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize, Catalog, Integrate

Collection, Processing

Network Content, Affiliate Feeds, Public Sources

Rich Data Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Figure: labels recovered from the diagram]

Video; MPEG Encoder (MPEG-2/4/7); Enhanced Digital Cable; MPEG Decoder

Create Scene Description Tree; Retrieve Scene Description Track; Scene Description Tree; Node = AVO Object

Taalee Semantic Engine; Enhanced XML Description; Node "Cisco Systems"; Metadata-rich, Value-added Node

Object Content Information (OCI)
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

GREAT USER EXPERIENCE

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content

Extractor Agents: Content which does contain the words the user asked for

+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: Content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
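The enrichment step described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the `KNOWLEDGE_BASE` layout, the `plays_for` relation name, and the helper functions are hypothetical and are not Taalee's implementation; only the entities (Roger Clemens, Jamal Anderson and their teams) come from the slides.

```python
# Hypothetical knowledge base: the entities appear in the slides,
# the dictionary layout and relation names are assumptions.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "plays_for": "New York Yankees"},
    "Jamal Anderson": {"type": "PERSON", "plays_for": "Atlanta Falcons"},
    "New York Yankees": {"type": "TEAM", "sport": "Baseball"},
    "Atlanta Falcons": {"type": "TEAM", "sport": "Football", "league": "NFL"},
}


def add_value(metadata, field, value):
    """Append value to a metadata field unless it is already present."""
    values = metadata.setdefault(field, [])
    if value not in values:
        values.append(value)


def enrich(metadata):
    """Return a copy of the metadata with team, league and sport added."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        team = KNOWLEDGE_BASE.get(player, {}).get("plays_for")
        if team is None:
            continue  # unknown player: nothing can be added
        add_value(enriched, "Teams", team)
        team_facts = KNOWLEDGE_BASE.get(team, {})
        for field, key in (("League", "league"), ("Sport", "sport")):
            if key in team_facts:
                add_value(enriched, field, team_facts[key])
    return enriched


def matches(metadata, query):
    """Case-insensitive keyword match over all metadata values."""
    query = query.lower()
    return any(query in value.lower()
               for values in metadata.values() for value in values)


story = {"Players": ["Roger Clemens"]}
print(matches(story, "Yankees"))          # False: keyword absent from extracted metadata
print(matches(enrich(story), "Yankees"))  # True: team added from the knowledge base
```

The point of the sketch is that the search itself stays a plain keyword match; the difference comes entirely from the metadata added before indexing.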

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10: Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10: Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - Extracted automatically
Player Names: Names of players mentioned explicitly in the asset - Extracted automatically
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Sport: Name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact with it.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
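A minimal sketch of how a typed relationship such as COMPETES WITH could be used to return related stories the user did not ask for. The triple store, the `collate` function, and the story records are hypothetical illustrations; only the company pairs (Commerce One and Ariba, Cisco and Lucent) are taken from the slides.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples; the pairs come from the slides.
ASSOCIATIONS = [
    ("Commerce One", "COMPETES_WITH", "Ariba"),
    ("Cisco", "COMPETES_WITH", "Lucent"),
]


def build_index(triples, symmetric=("COMPETES_WITH",)):
    """Index triples by (subject, relation); symmetric relations go both ways."""
    index = defaultdict(set)
    for subject, relation, obj in triples:
        index[(subject, relation)].add(obj)
        if relation in symmetric:
            index[(obj, relation)].add(subject)
    return index


def collate(company, stories, index):
    """Split story titles into 'asked for' and 'semantically related'."""
    competitors = index.get((company, "COMPETES_WITH"), set())
    asked_for = [s["title"] for s in stories if company in s["companies"]]
    related = [s["title"] for s in stories
               if competitors.intersection(s["companies"])
               and company not in s["companies"]]
    return asked_for, related


stories = [
    {"title": "Story on Cisco", "companies": ["Cisco"]},
    {"title": "Story on Lucent", "companies": ["Lucent"]},
]
print(collate("Cisco", stories, build_index(ASSOCIATIONS)))
# (['Story on Cisco'], ['Story on Lucent'])
```

Treating COMPETES_WITH as symmetric is a design assumption of this sketch; a real knowledge base might store direction explicitly.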

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Components: Extractor Agents 1-3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Result: Story on Cisco, Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY
Related Stock News
Industry News: COMPANIES in Same or Related INDUSTRY
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude ) < 10000
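The rule above can be expressed as an ordinary predicate. The sketch below is an illustration, not the InfoQuilt implementation: the `Event` class and function names are hypothetical, the units (days and miles) are assumed from the wording of the later query slide, the haversine great-circle formula is a substituted choice for `distance`, and the requirement that the earthquake not precede the test is taken from the prose description of the relationship.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    """Hypothetical record shared by nuclear tests and earthquakes."""
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days_after = (quake.event_date - test.event_date).days
    close_in_time = 0 <= days_after < max_days
    close_in_space = distance_miles(test.latitude, test.longitude,
                                    quake.latitude, quake.longitude) < max_miles
    return close_in_time and close_in_space


# Made-up events, for illustration only.
test = Event(date(2000, 1, 1), 10.0, 20.0)
quake = Event(date(2000, 1, 15), 12.0, 25.0)
print(may_have_caused(test, quake))  # True
```

Note that no two points on Earth are more than about 12,400 miles apart, so with a 10,000-mile threshold the spatial condition excludes only near-antipodal pairs.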

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year starting from 1900, find average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web: Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 23

PRISM Design

Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
  Basic DC
  Extensions: "Controlled Vocabularies", e.g., "North American Industrial Classification System" (NAICS)

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
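To show how an application might consume such a record, here is a minimal sketch that reads the Dublin Core elements with Python's standard `xml.etree.ElementTree`. The function name `dublin_core_records` is hypothetical, and the embedded record is the slide's example with its stripped punctuation restored, which is an assumption about the original markup.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The slide's example; namespace URIs and punctuation are restored (assumed).
PRISM_RECORD = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core_records(xml_text):
    """Map each described resource URI to its Dublin Core fields."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    records = {}
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        fields = {}
        for element in description:
            if not element.tag.startswith(f"{{{DC_NS}}}"):
                continue  # skip non-Dublin-Core properties
            name = element.tag.split("}", 1)[1]
            # A property is either literal text or a reference to a resource.
            value = (element.text or "").strip() or element.get(f"{{{RDF_NS}}}resource")
            fields[name] = value
        records[description.get(f"{{{RDF_NS}}}about")] = fields
    return records


for resource, fields in dublin_core_records(PRISM_RECORD).items():
    print(resource, fields["title"], fields["format"])
```

This treats the RDF/XML as plain XML, which is enough for a flat record like this one; a full RDF parser would be needed for nested or abbreviated RDF syntax.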

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source httpwwwvoicexmlorg

HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML ndash Possible Services

Information retrieval - News, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number "find-me" services

Intranet - Inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define:

roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol.
Syndicator: A content aggregator and distributor.
Subscriber: A content consumer.
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement.
Collection: The current content of a subscription.
ICE Package: A delivery of commands to update a collection, such as the addition of content items.
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information.

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Callout on the comic-strip element: Content (domain-specific metadata)
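The split between the ICE envelope (content-independent metadata) and the domain vocabulary it carries can be made concrete with a short sketch. Everything here is illustrative: the `split_payload` function is hypothetical, the payload is a simplified version of the slide's example with punctuation restored by assumption, and the encoded image body is omitted.

```python
import xml.etree.ElementTree as ET

# Simplified from the slide's example; attribute punctuation is restored (assumed)
# and the encoded image body is left out.
ICE_PAYLOAD = """<?xml version="1.0"?>
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515"/>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
"""


def split_payload(xml_text):
    """Separate each item's ICE envelope attributes from its domain content."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    items = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)  # content-independent ICE metadata
        content = [(child.tag, dict(child.attrib)) for child in item]  # domain-specific
        items.append({"envelope": envelope, "content": content})
    return items


for entry in split_payload(ICE_PAYLOAD):
    print(entry["envelope"]["name"], "->", [tag for tag, _ in entry["content"]])
```

A subscriber can process the envelope without knowing anything about `comic-strip`; only the application that understands the domain vocabulary needs to look inside the content.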

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management

Application Content Management: Metadata Management, Recombination, Personalization

Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (Kansas State)

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
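One simple way to bridge two metadata models that use different tag names for the same data is an explicit tag mapping. The sketch below is illustrative only: the mapping table is derived from the field pairs on this slide, the `to_fgdc` helper is hypothetical, and the URL's punctuation is restored by assumption.

```python
# Tag correspondences read off the slide (UDK name -> FGDC name).
UDK_TO_FGDC = {
    "Search terms": "Theme keywords",
    "Topic": "Title",
    "Address Id": "Online linkage",
    "Measuring Techniques": "Direct Spatial Reference Method",
    "Co-ordinate System": "Horizontal Coordinate System Definition",
}


def to_fgdc(udk_record):
    """Rename UDK tags to their FGDC equivalents; unmapped tags pass through."""
    return {UDK_TO_FGDC.get(tag, tag): value for tag, value in udk_record.items()}


udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Address Id": "http://gisdasc.kgs.ukans.edu/dasc",  # punctuation restored (assumed)
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
print(to_fgdc(udk_record)["Title"])  # Dakota Aquifer
```

A renaming table handles only this syntactic mismatch; differences in value vocabularies or structure between the models would need richer, semantic mappings.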

HP 39

Different views of Metadata

Metadata:

Domain Independent: Specifications (RDF); Frameworks/Infrastructures (XCM)

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services / Taalee Content Applications

Where is the content? Whose is it? - Produce, Aggregate; Catalog/Index

What other content is it related to? - Integrate, Syndicate

What is the right content for this user? - Personalize

What is the best way to monetize this interaction? - Interactive Marketing

Channels: Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taaleersquos Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: Types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange
Feed Handlers/Crawlers/Screen Scrapers/Bots/Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

[Diagram: METADATA EXTRACTORS draw on Global/Enterprise Web Repositories: Digital Maps, Nexis/UPI/AP, Documents, Digital Audios, Data Stores, Digital Videos, Digital Images]

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
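A layout rule of this kind maps naturally onto a regular expression. The following is a minimal sketch, assuming that "day" means a weekday name and that the comma is optional (the transcript of the report has lost its punctuation); the names `DATE_RULE` and `extract_date` are hypothetical.

```python
import re

WEEKDAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int  (commas treated as optional)
DATE_RULE = re.compile(
    rf"\b(?P<weekday>{WEEKDAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<day>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return {
        "weekday": match["weekday"],
        "month": match["month"],
        "day": int(match["day"]),
        "year": int(match["year"]),
    }


report = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1 1997 - 0530 MDT"
print(extract_date(report))
# {'weekday': 'Friday', 'month': 'August', 'day': 1, 'year': 1997}
```

Such syntactic rules extract a value reliably when the layout is fixed, but they say nothing about what the date refers to, which is the gap the semantic approaches later in the workshop address.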

Traditional Text Categorization

[Diagram: Customer Article Feed (Article 4715) -> Statistical/AI Techniques (Customer Training Set) -> Classify: Place in a taxonomy -> Routing/Distribution]

Classification of Article 4715 - Standard Metadata:
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

[Diagram: Article Feed (Article 4715) -> Knowledge-base & Statistical/AI Techniques (Taalee Training Set, Customer Training Set) -> Classify: Place in a taxonomy -> Metadata Catalog, Content Manager -> Routing/Distribution, Precise syndication/filtering, Map to another taxonomy. Other labels: Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite]

Article 4715 Metadata

Standard metadata:
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata:
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Classification of Article 4715:
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships
Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis; Learning/AI and Collab. Filtering (Word or Phrase)

Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

(deeper understanding)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
Attributes
Domain Rules
Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Diagram. Labels, in extraction order:]

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; linked by "causes" relationships

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords
attribute based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = Content + Context + Semantic Metadata + Relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute, and Content-Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic Metadata

Virage search on "football touchdown" (metadata from typical cataloging of football assets):

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee Metadata on Football Assets (Rich Media Reference Page):

Baltimore 31, Pit 24 (http://www.nfl.com)

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
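
The contrast between the two result sets can be sketched in a few lines of Python. This is only an illustration, assuming a hypothetical in-memory `Asset` record and two hypothetical helper functions (`keyword_search`, `attribute_search`); it is not Taalee's or Virage's implementation. The sample values come from the slide, with the third description shortened.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical catalog record: free text plus optional structured metadata."""
    title: str
    description: str
    metadata: dict = field(default_factory=dict)


ASSETS = [
    Asset("Jimmy Smith Interview Part Seven",
          "Jimmy Smith explains his philosophy on showboating"),
    Asset("Brian Griese Interview Part Four",
          "Brian Griese talks about the first touchdown he ever threw"),
    Asset("Baltimore 31, Pit 24",
          "Quandry Ismail and Tony Banks hook up for their third long touchdown",
          {"League": "Professional", "Teams": ["Ravens", "Steelers"],
           "Players": ["Quandry Ismail", "Tony Banks"], "Event": "Touchdown"}),
]


def keyword_search(assets, *words):
    """Syntactic search: every word must literally occur in the title or description."""
    hits = []
    for asset in assets:
        text = f"{asset.title} {asset.description}".lower()
        if all(word.lower() in text for word in words):
            hits.append(asset)
    return hits


def attribute_search(assets, **criteria):
    """Attribute search: every attribute=value pair must match the structured metadata."""
    def matches(asset, key, wanted):
        value = asset.metadata.get(key)
        if isinstance(value, list):
            return wanted in value
        return value == wanted

    return [a for a in assets
            if all(matches(a, key, wanted) for key, wanted in criteria.items())]
```

Here `keyword_search(ASSETS, "Ravens")` finds nothing because the word never occurs in the text, while `attribute_search(ASSETS, Teams="Ravens", Event="Touchdown")` returns the Baltimore asset.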

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context- and Domain-Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting / interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc
• AT&T Corp
• more…

2 My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone…
• Edwards explains reasons for leaving BYU
• more…

3 Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory
• more…

4 Pink Floyd Collection
• Set the Controls for the Heart of the Sun…
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream
• more…

HOT Topics

1 Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge
• more…

2 Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security
• more…

3 Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II
• more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests, Preferences; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints; e.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst
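
The idea of showing "just the right metadata" for a constrained screen can be sketched as a per-device field selection over the same records. This is a minimal sketch, not the Voquette or Taalee implementation: the `DISPLAY_FIELDS` profiles, the `listing` helper, and the field names are hypothetical, and dates are kept as (month, day) pairs because the slide gives no year.

```python
# Three analyst-call records from the slide; field names are illustrative assumptions.
ANALYST_CALLS = [
    {"date": (11, 6), "source": "CBS", "analyst": "Langlesis", "ticker": "CSCO"},
    {"date": (11, 8), "source": "ON24", "analyst": "Payne", "ticker": "CSCO"},
    {"date": (11, 7), "source": "ON24", "analyst": "H&Q", "ticker": "CSCO",
     "rating": "Strong Buy"},
]

# Hypothetical display profiles: a small wireless screen gets fewer fields.
DISPLAY_FIELDS = {
    "wireless": ("date", "source", "analyst"),
    "desktop": ("date", "source", "analyst", "ticker", "rating"),
}


def listing(calls, device):
    """Return the calls newest first, keeping only the fields the device can show."""
    fields = DISPLAY_FIELDS[device]
    newest_first = sorted(calls, key=lambda call: call["date"], reverse=True)
    return [{name: call[name] for name in fields if name in call}
            for call in newest_first]
```

The same metabase record therefore serves both screens; only the projection changes.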

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine(TM); Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with interactivity features, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Diagram labels: Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication; Production Support; Sony; Categorize; Catalog; Integrate; Collection; Processing; Network Content; Affiliate Feeds; Public Sources; Rich Data; Metabase]

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

• Gore Demands That Recount Restart
• Gore Says Fla. Can't Name Electors
• Bush Meets Colin Powell at Ranch
• Market Tumbles on Earnings Warning
• Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREAT USER EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-rich Value-added Node

HP 93

Intelligent Metadata Creation

• Extractor Agents: content which does contain the words the user asked for
• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
• Semantic Associations: content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
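
One way to picture this enrichment step is a lookup against a small knowledge base that fills in attributes the story never states. The sketch below is an assumption-laden illustration, not Taalee's patented extraction: `KNOWLEDGE_BASE`, `TEAM_LEAGUE`, and `enrich` are hypothetical names, metadata values are assumed to be lists, and the facts are limited to those stated in this workshop (Roger Clemens and the Yankees; Jamal Anderson, the Atlanta Falcons, and the NFL).

```python
# Hypothetical knowledge base of entity facts (values taken from the slides).
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"type": "PERSON", "sport": "Football", "team": "Atlanta Falcons"},
}
TEAM_LEAGUE = {"Atlanta Falcons": "NFL"}


def _add(record, key, value):
    """Append value to a list-valued field, creating the field if needed."""
    values = record.setdefault(key, [])
    if value not in values:
        values.append(value)


def enrich(metadata):
    """Return a copy of the extracted metadata with value-added fields filled in."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player)
        if facts is None:
            continue  # unknown entity: nothing can be inferred
        _add(enriched, "Teams", facts["team"])
        _add(enriched, "Sport", facts["sport"])
        league = TEAM_LEAGUE.get(facts["team"])
        if league is not None:
            _add(enriched, "League", league)
    return enriched
```

With this, a story whose extracted metadata is only `{"Players": ["Roger Clemens"]}` becomes findable under `Teams = New York Yankees`, even though the team name never appears in the text.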

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see: Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see: Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

• Search for 'Jamal Anderson' in 'Football'
• Click on first result for Jamal Anderson
• View metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

• Search for 'Gary Sheffield' in 'Baseball'
• Click on first result for Gary Sheffield
• View metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Metadata attached to a Rich Media Sports Asset:

• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting; extracted automatically
• Player Names: names of players mentioned explicitly in the asset; extracted automatically
• Team Name: name of team for which player plays; not mentioned explicitly in asset; value-added using Taalee's semantic relationships
• League Name: name of league to which the player's team belongs; not mentioned explicitly in asset; value-added by Taalee's processing based on semantic associations
• Sport: name of sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
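
A competitor lookup of this kind can be sketched with a tiny relationship store. The code below is illustrative only: `COMPETES_WITH`, `competitors`, and `search_with_associations` are hypothetical names, the relationship is assumed to be symmetric, and the story headlines are made-up placeholders. The two company pairs are the ones named in this workshop.

```python
# COMPETES_WITH pairs named in the slides; treated as symmetric (an assumption).
COMPETES_WITH = {("Commerce One", "Ariba"), ("Cisco", "Lucent")}


def competitors(company):
    """Return the companies linked to `company` by a COMPETES_WITH association."""
    found = set()
    for left, right in COMPETES_WITH:
        if left == company:
            found.add(right)
        elif right == company:
            found.add(left)
    return sorted(found)


def search_with_associations(company, stories):
    """Split results into what was asked for and semantically related extras."""
    rivals = competitors(company)
    return {
        "asked_for": [s for s in stories if s["company"] == company],
        "need_to_know": [s for s in stories if s["company"] in rivals],
    }


# Hypothetical catalog entries, each tagged with a COMPANY entity.
STORIES = [
    {"headline": "Story on Commerce One", "company": "Commerce One"},
    {"headline": "Story on Ariba", "company": "Ariba"},
    {"headline": "Story on Cisco", "company": "Cisco"},
]
```

A search for `Commerce One` then returns its own story under `asked_for` and the Ariba story under `need_to_know`, without the user ever typing "Ariba".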

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata, XML or other format; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What You Need to Know

Associations around a COMPANY:

• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
• Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling obviously not enough; we need a "logic layer" on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inference from this data
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

• Logical Layer: formal semantics and reasoning support - OIL, DAML-O
• Schema Layer: definition of vocabulary - RDF Schema
• Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

• Earthquake Sources (USGS, NEIC)
• Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes. Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
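
The rule above can be written out as an executable predicate. This is a sketch under stated assumptions, not InfoQuilt code: the slide does not define `dateDifference` or `distance`, so here the date difference is taken as whole days from test to earthquake (and required to be non-negative, following the "some time after" wording), and distance is taken as great-circle distance in miles via the haversine formula. The `Event` type and the function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


@dataclass(frozen=True)
class Event:
    """Hypothetical record shared by nuclear tests and earthquakes."""
    event_date: date
    latitude: float
    longitude: float


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test by fewer than max_days and lies within max_miles."""
    days = (quake.event_date - test.event_date).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
```

Applying `may_have_caused` to every (test, earthquake) pair from the two sources yields the candidate pairs that the later "find pairs" query asks for.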

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NEWSML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• SEMANTICWEB: www.semanticweb.org
• VOICEXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 24

PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>

(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
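
A consumer of such a record could pull out the Dublin Core fields with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree`; the `dublin_core_fields` helper is a hypothetical name, and the embedded record is a trimmed copy of the example in which the namespace punctuation is restored on the assumption that the standard RDF and Dublin Core namespace URIs were intended.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Trimmed copy of the PRISM example (XML declaration omitted for brevity).
PRISM_RECORD = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core_fields(rdf_xml):
    """Map each described resource URL to a dict of its Dublin Core fields."""
    rdf_prefix = "{" + NS["rdf"] + "}"
    dc_prefix = "{" + NS["dc"] + "}"
    records = {}
    for description in ET.fromstring(rdf_xml).findall("rdf:Description", NS):
        fields = {}
        for child in description:
            if not child.tag.startswith(dc_prefix):
                continue  # skip elements from other vocabularies
            name = child.tag[len(dc_prefix):]
            # Literal values live in the element text, references in rdf:resource.
            fields[name] = child.text or child.get(rdf_prefix + "resource")
        records[description.get(rdf_prefix + "about")] = fields
    return records
```

Calling `dublin_core_fields(PRISM_RECORD)` gives one entry keyed by the image URL, with `title`, `creator`, `format`, and `identifier` fields ready for indexing or attribute search.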

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

High-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice Specific metadata

Supports Syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML ndash Possible Services

Information retrieval - News, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number "find-me" services

Intranet - Inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define:

• roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
• format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

• ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
• An industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

• ICE: Information and Content Exchange protocol
• Syndicator: a content aggregator and distributor
• Subscriber: a content consumer
• Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
• Collection: the current content of a subscription
• ICE Package: a delivery of commands to update a collection, such as the addition of content items
• ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

The comic-strip element is the content (domain-specific metadata)
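
The split between ICE's content-independent envelope and the domain-specific content can be made concrete by walking such a payload. The sketch below is illustrative: `split_metadata` is a hypothetical helper, the embedded payload is a simplified copy of the example (DOCTYPE omitted, image data replaced by a placeholder), and attribute punctuation such as `ice.version` is a restoration that is assumed rather than confirmed by the slide.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the payload above; punctuation restored by assumption.
ICE_PAYLOAD = """
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">(encoded image)</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
"""


def split_metadata(payload_xml):
    """Separate ICE envelope attributes from the domain vocabulary they carry."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        items.append({
            # Content-independent: how the item is identified and delivered.
            "ice": dict(item.attrib),
            # Domain-specific: whatever vocabulary the syndicator put inside.
            "domain": [{"element": child.tag, **child.attrib} for child in item],
        })
    return {"payload": dict(root.attrib), "items": items}
```

The `ice` part would look the same for any industry, while the `domain` part (here a comic-strip vocabulary) changes with the application.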

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

• Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
• Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
• Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM: eXtended Content Management

Segments: Content Development/Management; Application Content Management; Content Delivery

Components shown: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management, Metadata Management, Recombination, Personalization, Edge Network Delivery, Streaming Media Delivery, Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...

Kansas State
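
One simple way to reconcile two models like these is to map each model's tag names onto a shared vocabulary. The sketch below assumes such a mapping exists; the common field names (keywords, title, url, and so on) and the normalize helper are hypothetical and are not defined by FGDC or UDK.

```python
# Hypothetical mappings from each model's tag names to a shared vocabulary.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Address Id": "url",
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename the tags of one record to the shared vocabulary."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

# Once normalized, the two descriptions of the same data set compare equal.
same = normalize(fgdc_record, FGDC_TO_COMMON) == normalize(udk_record, UDK_TO_COMMON)
print(same)
```

A name-to-name table only handles the easy case shown on the slide; differences in value formats or granularity would need more than renaming.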

HP 39

Different views of Metadata

Metadata

Domain Independent: Specifications (RDF); Frameworks/Infrastructures (XCM)

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services; Taalee Content Applications

Where is the content? Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate, Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction; types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web sites, content feeds, and private repositories
Types: Text, graphics, audio, video, multimedia
Forms: Unstructured text, semi-structured text, structured text (+ media); static or dynamic
Ingest: Feed (push), Web (pull), repository/database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange
Feed handlers, crawlers, screen scrapers, bots, software agents

Centralized, distributed, mobile/migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Sources: Digital Maps, Nexis/UPI/AP, Documents, Digital Audios, Data Stores, Digital Videos, Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
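
The layout rule above (Date => day month int ',' int) maps directly onto a regular expression. The following is a minimal sketch of the syntactic approach, assuming English day and month names; the function name extract_date and the output field names are illustrative only. The commas are treated as optional because extracted text often loses them.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int   (commas made optional for robustness)
DATE_RULE = re.compile(
    rf"\b({DAYS}),?\s+({MONTHS})\s+(\d{{1,2}}),?\s+(\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    day, month, day_of_month, year = match.groups()
    return {
        "day": day,
        "month": month,
        "day_of_month": int(day_of_month),
        "year": int(year),
    }


report = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
print(extract_date(report))
```

A rule like this recognizes the shape of a date but knows nothing about what the report is about, which is the limitation the later slides on semantic metadata address.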

Traditional Text Categorization

[Figure: a customer article feed (Article 4715) is classified with statistical/AI techniques, trained on a customer training set, and placed in a taxonomy for routing/distribution.]

Standard metadata for Article 4715
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

[Figure: Automated Content Enrichment (ACE) and the Taalee Enterprise Customization Suite. The article feed is classified with knowledge-base and statistical/AI techniques, trained on Taalee and customer training sets, and placed in a taxonomy. Results feed the metadata catalog and content manager for routing/distribution and precise syndication/filtering, and can be mapped to another taxonomy.]

Article 4715 Metadata
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News

Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
Statistical (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis

Learning/AI and Collab. Filtering

Reference data: concept-terms, dictionary, thesaurus; by topic/industry/subject/domain; word or phrase

Ontologies/Domain Models

KnowledgeBase: by entities and relationships

(toward deeper understanding)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: three interrelated ontologies; only the node labels survive.]

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; related nodes: LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several of these nodes are linked by "causes" relationships

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords
attribute-based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create

Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from Typical Cataloging of Football Assets (Virage search on "football touchdown")

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline...

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline...

Taalee Metadata on Football Assets (Rich Media Reference Page)

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
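
The difference between the two kinds of search can be illustrated with a small sketch. The asset records below paraphrase the slide's examples; the functions keyword_search and attribute_search, and the choice to require all keywords to be present, are assumptions made for illustration and do not describe how Virage or Taalee actually work.

```python
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "text": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "text": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "text": "Ismail and Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional",
                  "Teams": ["Ravens", "Steelers"],
                  "Players": ["Quandry Ismail", "Tony Banks"],
                  "Event": "Touchdown"}},
]


def keyword_search(assets, *words):
    """Titles of assets whose title or text contains every keyword."""
    hits = []
    for asset in assets:
        haystack = (asset["title"] + " " + asset["text"]).lower()
        if all(word.lower() in haystack for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Titles of assets whose metadata satisfies every field=value criterion."""
    hits = []
    for asset in assets:
        metadata = asset["metadata"]

        def matches(field, wanted):
            value = metadata.get(field)
            if isinstance(value, list):
                return wanted in value
            return value == wanted

        if all(matches(field, wanted) for field, wanted in criteria.items()):
            hits.append(asset["title"])
    return hits


print(keyword_search(ASSETS, "Steelers"))
print(attribute_search(ASSETS, Teams="Steelers", Event="Touchdown"))
```

The keyword query for "Steelers" finds nothing because the word never appears in the text, while the attribute query succeeds because the team is recorded as metadata.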

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and domain-specific attributes; uniform metadata for content from multiple sources; can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you. Please enter such semantic keywords below.

Change Context

PERSONALIZATION: Personalized Queries & Hot Topics

Personalized Queries
1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more...
2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU; more...
3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more...
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more...

HOT Topics
1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more...
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more...
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Inputs: Realtime Feeds; Web sites and Pages; Content Databases

Time-Shifted Content Aggregator

Semantic Engine(TM)

Structured Hi-Quality Semantic Metabase

Interests, Preferences

Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine(TM)

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize, Catalog, Integrate

Collection, Processing

Sources: Network Content, Affiliate Feeds, Public Sources

Rich Data Metabase

HP 90


-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

Related clips (duration - date - source): (1:33) - 12/06/00 - ABC; (2:53) - 12/06/00 - CBS; (5:16) - 12/06/00 - ABC; (2:46) - 12/06/00 - FOX; (1:33) - 12/06/00 - NBC; (5:33) - 12/06/00; (3:57) - 12/06/00 - CBS; (4:27) - 12/06/00 - ABC; (3:44) - 12/06/00 - FOX; (7:24) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Related clips (duration - date - source): (1:33) - 12/06/00 - ABC; (2:33) - 12/06/00 - CBS; (3:12) - 12/06/00 - NNS; (0:32) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Figure: enhanced digital cable video pipeline. Labels: MPEG-2/4/7; MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; Enhanced XML Description ("Cisco Systems"); Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE.]

Example node metadata
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for (Extractor Agents)

+ Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)

+ Content the user did not think to ask for, but which he needs to know (Semantic Associations)

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
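
The enrichment step described above can be sketched as a lookup against a small knowledge base. This is a toy illustration under stated assumptions: the KNOWLEDGE_BASE contents, the field names, and the enrich and is_about helpers are hypothetical and do not represent Taalee's actual Semantic Engine or WorldModel.

```python
# Hypothetical knowledge base: facts the source story need not mention.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Teams": "New York Yankees"},
    "Jamal Anderson": {"Sport": "Football", "Teams": "Atlanta Falcons",
                       "League": "NFL"},
}


def enrich(metadata):
    """Copy the extracted metadata and add facts implied by its players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        for field, value in KNOWLEDGE_BASE.get(player, {}).items():
            values = enriched.setdefault(field, [])
            if value not in values:
                values.append(value)
    return enriched


def is_about(metadata, field, value):
    """True if the metadata record lists the value under the field."""
    return value in metadata.get(field, [])


extracted = {"Players": ["Roger Clemens"]}   # what the story itself mentions
enriched = enrich(extracted)

print(is_about(extracted, "Teams", "New York Yankees"))
print(is_about(enriched, "Teams", "New York Yankees"))
```

Before enrichment a search for the team misses the story; after enrichment it finds it, even though the team name never appears in the content.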

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - extracted automatically
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which the player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: Name of sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly, fully described in the many ways the users choose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
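
A semantic association of this kind can be modeled as a set of subject-relation-object triples that an application consults when a user searches for an entity. The sketch below is a hypothetical illustration: the relation names, the story records, and the functions competitors and related_news are invented here and are not Taalee's Semantic Content Model.

```python
# Hypothetical triples encoding the associations named on the slide.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog entries; headlines are placeholders, not real stories.
STORIES = [
    {"headline": "Placeholder story about Commerce One",
     "companies": ["Commerce One"]},
    {"headline": "Placeholder story about Ariba",
     "companies": ["Ariba"]},
]


def competitors(company):
    """Companies linked to the given one by competes_with, in either direction."""
    found = []
    for subject, relation, obj in TRIPLES:
        if relation != "competes_with":
            continue
        if subject == company and obj not in found:
            found.append(obj)
        elif obj == company and subject not in found:
            found.append(subject)
    return found


def related_news(company):
    """Headlines about the company's competitors, which the user never asked for."""
    rivals = competitors(company)
    return [story["headline"] for story in STORIES
            if any(name in rivals for name in story["companies"])]


print(competitors("Commerce One"))
print(related_news("Commerce One"))
```

Treating competes_with as symmetric is a design choice made for this sketch; a real domain model would state which relations are symmetric or transitive.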

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Components: Extractor Agents 1, 2, and 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); third-party content management and syndication

Exchange format: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Output: Story on Cisco; Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic search, personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough
We need a "logic layer" on top of RDF
Some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web - the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
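
As a rough illustration of what the OIL fragment above expresses, the following sketch parses a reconstructed copy of it and reads off "herbivore = animal and NOT carnivore". The namespace URIs (in particular the `oil` one, which is a placeholder) and the helper `is_herbivore` are assumptions made for this example, not part of OIL itself.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions; the "oil" URI is a made-up placeholder.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "oil": "http://example.org/oil#",
}

DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:oil="http://example.org/oil#">
  <rdfs:Class rdf:ID="herbivore">
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>"""

RESOURCE = "{%s}resource" % NS["rdf"]  # fully qualified name of rdf:resource

cls = ET.fromstring(DOC).find("rdfs:Class", NS)

# Plain superclasses are subClassOf elements that carry an rdf:resource attribute.
superclasses = [
    e.get(RESOURCE) for e in cls.findall("rdfs:subClassOf", NS) if e.get(RESOURCE)
]
# Negated classes sit inside subClassOf / oil:NOT / oil:hasOperand.
excluded = [
    e.get(RESOURCE)
    for e in cls.findall("rdfs:subClassOf/oil:NOT/oil:hasOperand", NS)
]


def is_herbivore(asserted_classes: set[str]) -> bool:
    """Treat the definition as: in every superclass and in no excluded class."""
    return all(c in asserted_classes for c in superclasses) and not any(
        c in asserted_classes for c in excluded
    )
```

A description-logic reasoner does far more than this membership check (for example, classifying whole class hierarchies); the sketch only shows that the negation is machine-readable.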

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
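
The rule above can be sketched as an ordinary predicate. This is only an illustration under stated assumptions: the `Event` record, the haversine great-circle formula, the use of miles for the 10000 threshold, and the extra requirement that the earthquake does not precede the test are choices made here, not details given on the slide.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumed unit


@dataclass(frozen=True)
class Event:
    """Hypothetical record shared by nuclear tests and earthquakes."""
    event_date: date
    latitude: float
    longitude: float


def date_difference(test: Event, quake: Event) -> int:
    """Days from the test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles, computed with the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    a = (sin((phi2 - phi1) / 2) ** 2
         + cos(phi1) * cos(phi2) * sin(radians(lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def may_have_caused(test: Event, quake: Event,
                    max_days: int = 30, max_miles: float = 10000.0) -> bool:
    """NuclearTest Causes Earthquake, with the assumed 'quake not before test' guard."""
    days = date_difference(test, quake)
    near = distance(test.latitude, test.longitude,
                    quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


def candidate_pairs(tests: list[Event], quakes: list[Event]) -> list[tuple[Event, Event]]:
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if may_have_caused(t, q)]
```

Note that no two points on Earth are more than roughly 12,400 miles apart, so a 10000-mile radius excludes only near-antipodal pairs; the thresholds are kept as parameters so a tighter notion of "nearby" can be tried.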

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
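
The per-band query above amounts to a grouped count divided by the number of years. A minimal sketch follows; the input shape (`(year, magnitude)` pairs), the half-open band boundaries, and the default year range are assumptions for illustration, and no real earthquake data is included.

```python
from collections import Counter

# Magnitude bands from the slide, treated here as half-open intervals [low, high).
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude: float):
    """Return the band containing the magnitude, or None if it is below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, start_year: int = 1900, end_year: int = 2000) -> dict:
    """Average yearly number of earthquakes in each band over [start_year, end_year].

    `quakes` is an iterable of (year, magnitude) pairs.
    """
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if band is not None and start_year <= year <= end_year:
            counts[band] += 1
    n_years = end_year - start_year + 1
    return {band: counts[band] / n_years for band in BANDS}
```

Running the same function on the years before and after a chosen cut-off year gives the kind of comparison the slide draws between magnitude bands.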

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lies within a radius of 10000 miles of the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax

ICE: www.icestandard.org

Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05

XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03

DAML: www.daml.org

NewsML: newsshowcase.reuters.com

PRISM: www.prismstandard.org/techdev/prismspec1.asp

XCM: www.vignette.com

OIL: www.ontoknowledge.org/oil

Semantic Web: www.semanticweb.org

VoiceXML: www.voicexml.org

MPEG7: www.darmstadt.gmd.de/mobile/MPEG7

Taalee: www.taalee.com

Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • What RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries amp Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 25

VoiceXML

A language for specifying voice dialogs

Voice dialogs use audio prompts and text-to-speech (TTS) for output, and touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

The goal is to bring the advantages of web-based development and content delivery to interactive voice response applications

A high-level, voice-specific language simplifies application development

Source: http://www.voicexml.org

HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice-specific metadata

Supports syntactic interoperability

Text data to voice data

VoiceXML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number "find-me" services

Intranet - inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are:

Digital libraries (image catalog, musical dictionary)

Multimedia directory services (e.g. yellow pages)

Broadcast media selection (radio channel, TV channel)

HP 31

Information and Content Exchange (ICE)

Main goal: an efficient and extensible content syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec is version 1.1 (May 2000), submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define

roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

format and method of content exchange (e.g. sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol

Syndicator: a content aggregator and distributor

Subscriber: a content consumer

Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement

Collection: the current content of a subscription

ICE Package: a delivery of commands to update a collection, such as the addition of content items

ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; httpwwwinternetweekcomebizapps01ebiz050701-3htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)
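
To illustrate the split between ICE's content-independent envelope and the domain-specific content it carries, here is a small sketch that parses a payload shaped like the one above. The payload string is a reconstruction (the DOCTYPE is omitted and the image data is elided), and `split_metadata` is a hypothetical helper, not part of any ICE toolkit.

```python
import xml.etree.ElementTree as ET

# Reconstructed sample payload; attribute values follow the slide, layout is assumed.
PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">(image data elided)</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml: str):
    """Return one (envelope, content) pair per ice-item.

    envelope: the ICE attributes of the item (content-independent metadata).
    content:  (tag, attributes) of each child element (domain-specific metadata).
    """
    root = ET.fromstring(payload_xml)
    pairs = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)
        content = [(child.tag, dict(child.attrib)) for child in item]
        pairs.append((envelope, content))
    return pairs


items = split_metadata(PAYLOAD)
```

A subscriber can route and log the item using only `envelope`, while an application that understands the comic-strip vocabulary reads `content`.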

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - developing static content and managing the process of its subsequent approval, versioning, storage and retrieval

Application Content Management (Vignette) - deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM

eXtended Content Management

Content Development/Management

Content Delivery, Application Content Management

Content Authoring, Digital Asset Management

Software Configuration Management

Document Process Management

Metadata Management, Recombination, Personalization

Edge Network Delivery

Streaming Media Delivery, Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model

Theme keywords: digital line graph, hydrography, transportation

Title: Dakota Aquifer

Online linkage: http://gisdasc.kgs.ukans.edu/dasc/

Direct Spatial Reference Method: Vector

Horizontal Coordinate System Definition: Universal Transverse Mercator

...

UDK Metadata Model

Search terms: digital line graph, hydrography, transportation

Topic: Dakota Aquifer

Adress Id: http://gisdasc.kgs.ukans.edu/dasc/

Measuring Techniques: Vector

Co-ordinate System: Universal Transverse Mercator

...

Kansas State
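
The FGDC/UDK comparison above is essentially a tag-name mapping. A minimal sketch of such a crosswalk follows; the field pairs are read off the slide (including its "Adress Id" spelling), while the dictionary-based record format and the `translate` helper are assumptions for illustration.

```python
# Field-name crosswalk taken from the two models shown on the slide.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",  # spelled as on the slide
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record: dict, mapping: dict) -> dict:
    """Rename fields using the mapping; fields without a counterpart are kept as-is."""
    return {mapping.get(field, field): value for field, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = translate(fgdc_record, FGDC_TO_UDK)
```

A one-to-one rename only addresses differing tag names; differences in value vocabularies or units would need a richer, ontology-level mapping.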

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata

Application Specific

ICE

Media Specific

MPEG7, VoiceXML

Domain Specific

NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services, Taalee Content Applications

Where is the content?

Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate, Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

FormsTypesIngest of Content

Sources: Web sites, content feeds, and private repositories

Types: text, graphics, audio, video, multimedia

Forms: unstructured text, semi-structured text, structured text (+ media); static or dynamic

Ingest: feed (push), Web (pull), repository/database (usually pull)

HP 44

Content HandlingIngest

Infrastructure/Exchange: feed handlers, crawlers, screen scrapers, bots, software agents

Centralized, distributed, mobile/migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic Approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
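
The layout rule for the report date can be sketched as a regular expression. The pattern below is one possible reading of "day month int ',' int"; making the commas optional and the function name `extract_date` are assumptions for this example, and the weekday is matched but not validated against the calendar.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# day month int ',' int  ->  e.g. "Friday, August 1, 1997" (commas optional here)
DATE_RE = re.compile(
    r"\b(?P<day>%s),?\s+(?P<month>%s)\s+(?P<dom>\d{1,2}),?\s+(?P<year>\d{4})\b"
    % ("|".join(DAYS), "|".join(MONTHS))
)


def extract_date(text: str):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match["month"]) + 1
    return date(int(match["year"]), month_number, int(match["dom"]))
```

This is purely syntactic extraction: it finds the date because of where and how it is written, without any understanding of what the report is about.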

Traditional Text Categorization

[Diagram labels: Customer Article Feed (Article 4715); Customer Training Set; Statistical/AI Techniques; Classify (place in a taxonomy); Routing/Distribution]

Standard Metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Classification of Article 4715

[Diagram labels: Knowledge-base & Statistical/AI Techniques; Classify (place in a taxonomy); Metadata Catalog; Content Manager; Precise syndication/filtering]

Article 4715 Metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Company Name: France Telecom, Equant

Ticker Symbol: FTE, ENT

Exchange: NYSE

Topic: Company News

Standard metadata

Semantic metadata

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee's Categorization & Automatic Metadata Creation

[Diagram labels: Article Feed (Article 4715); Taalee Training Set; Customer Training Set; Classification of Article 4715 (FTE, ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis); Map to another taxonomy; Routing/Distribution]

HP 49

Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference

(Statistical) (Information Retrieval)

Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)

Logic Based (Description Logic)

Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods, Cluster Analysis

Learning/AI and Collab. Filtering

Reference data, concept-terms, dictionary/thesaurus, by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase, by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes

Capture the semantics involved via domain characteristics

Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Diagram of interrelated ontologies; node labels as extracted.

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked by several "causes" relationships]

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords

attribute-based queries

content-based queries

HP 64

Oingo.com

Oingo Ontology (ODP based?) - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

Oingo Lingua - the language of meaning, used to state intent; the basis for intelligent interaction

Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing

• Context of the Content

• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create

Context of the Content

Semantic Metadata describing entities (i.e. Company, Industry, etc.) and

Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering; link and other analysis (e.g. Google's Link Flux analysis); classification as context; ontologies; metadata; knowledgebases; ...

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic Metadata

Virage search on "football touchdown"

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional

Teams: Ravens, Steelers

Score: Bal 31, Pit 24

Players: Quandry Ismail, Tony Banks

Event: Touchdown

Produced by: NFL.com

Posted date: 2/02/2000
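
The contrast between keyword search and attribute search can be sketched over a tiny in-memory catalog. The three assets reuse titles and fields from the slide; the record layout, the empty metadata on the interview clips, and both search functions are illustrative assumptions rather than the behavior of Virage or Taalee.

```python
ASSETS = [
    {
        "title": "Jimmy Smith Interview Part Seven",
        "description": "Jimmy Smith explains his philosophy on showboating",
        "metadata": {},  # typical cataloging: no structured football attributes
    },
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},
    },
    {
        "title": "Baltimore 31 Pit 24",
        "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, keyword):
    """Titles of assets whose title or description contains the keyword."""
    needle = keyword.lower()
    return [a["title"] for a in assets
            if needle in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets, **criteria):
    """Titles of assets whose metadata matches every field=value criterion."""
    def matches(metadata):
        for field, wanted in criteria.items():
            value = metadata.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a["title"] for a in assets if matches(a["metadata"])]
```

Here `keyword_search(ASSETS, "touchdown")` also returns the interview clip that merely mentions a touchdown, while `attribute_search(ASSETS, Event="Touchdown", Teams="Ravens")` returns only the clip that actually shows one.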

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context and domain-specific attributes. Uniform metadata for content from multiple sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content: personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more...

2 My Football Fantasy Team: Gators Spurrier ready for big game; Techs Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU; more...

3 Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more...

4 Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more...

HOT Topics

1 Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more...

2 Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Coles security; more...

3 Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos

Buy Russell Crowe Videos

Buy Christopher Plummer Videos

Buy Diane Venora Videos

Buy Philip Baker Hall Videos

Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne

11/07 ON24 H&Q

11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine(TM)

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with interactivity features, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection, Processing

Network Content

Affiliate Feeds

Public Sources

Rich Data Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon

At least 60 people died in this needless fire senior local official Karimu Alabi said

Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses

At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (9:40 PM)

Gore Says Fla. Can't Name Electors (4:50 PM)

Bush Meets Colin Powell at Ranch (1:22 PM)

Market Tumbles on Earnings Warning (9:27 AM)

Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast

Produced by: Euronews

Posted Date: 11/30/2000

Event: Election 2000

Location: Tallahassee, Florida, USA

People: Al Gore, George W Bush

Video

bull Value-add for production broadcast amp syndication

bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers

bull Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC

(2:53) - 12/06/00 - CBS

(5:16) - 12/06/00 - ABC

(2:46) - 12/06/00 - FOX

(1:33) - 12/06/00 - NBC

(5:33) - 12/06/00

(3:57) - 12/06/00 - CBS

(4:27) - 12/06/00 - ABC

(3:44) - 12/06/00 - FOX

(7:24) - 12/06/00 - CBS

(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(1:33) - 12/06/00 - ABC

(2:33) - 12/06/00 - CBS

(3:12) - 12/06/00 - NNS

(0:32) - 12/06/00 - CBS

(1:33) - 12/06/00 - CBS

Description

Produced by: CNN

Posted Date: 12/07/2000

Reporter: David Lewis

Event: Election 2000

Location: Tallahassee, Florida, USA

People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG-2/4/7; MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Enhanced XML Description; Node "Cisco Systems"; Taalee Semantic Engine; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Produced by: Fox Sports

Creation Date: 12/05/2000

League: NFL

Teams: Seattle Seahawks, Atlanta Falcons

Players: John Kitna

Coaches: Mike Holmgren, Dan Reeves

Location: Atlanta

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for

Extractor Agents

Content which does not contain the words the user asked for, but is about what he asked for

Value-added Metadata

Content the user did not think to ask for, but which he needs to know

Semantic Associations

+ +

Metadata for Intelligent Content Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
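The idea of value-added metadata can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the dictionary-based knowledge base, the relationship names (`plays_for`, `plays_sport`, `member_of`) and the `enrich` function are hypothetical and are not Taalee's patented extraction technology. The sketch shows how metadata that names only a player could be extended with the team, league, and sport that the source text never mentions.

```python
# Hypothetical mini knowledge base: entity -> {relationship: related entity}.
# The player/team facts mirror the slide examples; the structure is an assumption.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"plays_for": "New York Yankees", "plays_sport": "Baseball"},
    "Jamal Anderson": {"plays_for": "Atlanta Falcons", "plays_sport": "Football"},
    "New York Yankees": {"member_of": "American League"},
    "Atlanta Falcons": {"member_of": "NFL"},
}


def _add(metadata, field, value):
    """Append value to a metadata field without creating duplicates."""
    values = metadata.setdefault(field, [])
    if value not in values:
        values.append(value)


def enrich(metadata):
    """Return a copy of the metadata with team, league, and sport added."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        team = facts.get("plays_for")
        if team:
            _add(enriched, "Teams", team)
            league = KNOWLEDGE_BASE.get(team, {}).get("member_of")
            if league:
                _add(enriched, "League", league)
        sport = facts.get("plays_sport")
        if sport:
            _add(enriched, "Sport", sport)
    return enriched


# Metadata extracted from a story that mentions only the player by name.
extracted = {"Players": ["Roger Clemens"], "Produced by": ["ESPN"]}
enriched = enrich(extracted)
```

With the enriched record, a search on Teams = "New York Yankees" would match the story even though those words never occur in it.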

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - Extracted automatically
Player Names: Name of players mentioned explicitly in the asset - Extracted automatically
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Sport: Name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly, fully described in the many ways the users chose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Internal Source 1 (Research)

Internal Source 2

External feeds/Web (e.g., Reuters)

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Story on Lucent

Story on Cisco

XCM-compliant metadata, XML or other format

Semantic Application

ASP/Enterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata-centric Content Management Architecture

Semantic Engine

World Model

Taalee Metabase

Third-party Content Mgmt and Syndication
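The four-step flow on this slide (a Cisco story arrives, the knowledge base is consulted for Cisco's competition, Lucent is returned, and a Lucent story is picked as "semantically related") can be sketched in Python. This is a minimal sketch under assumptions: the triple list, the `competes_with` relation name, and the story records are hypothetical stand-ins for the Semantic Engine and its knowledge base, whose real interfaces the slides do not describe.

```python
# Hypothetical knowledge base of typed associations: (subject, relation, object).
ASSOCIATIONS = [
    ("Cisco", "competes_with", "Lucent"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical story records that already carry company metadata.
STORIES = [
    {"title": "Story on Cisco", "source": "internal", "companies": ["Cisco"]},
    {"title": "Story on Lucent", "source": "external feed", "companies": ["Lucent"]},
    {"title": "Story on Ariba", "source": "external feed", "companies": ["Ariba"]},
]


def competitors(company):
    """Companies linked to `company` by competes_with, in either direction."""
    found = set()
    for subject, relation, obj in ASSOCIATIONS:
        if relation != "competes_with":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return found


def related_stories(story, candidates):
    """Stories about competitors of the companies named in `story`."""
    wanted = set()
    for company in story["companies"]:
        wanted |= competitors(company)
    return [
        candidate
        for candidate in candidates
        if candidate is not story and wanted & set(candidate["companies"])
    ]


cisco_story = STORIES[0]
related_titles = [s["title"] for s in related_stories(cisco_story, STORIES)]
```

Treating the association as symmetric is a design choice of the sketch; a knowledge base could equally store directed relationships.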

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web - the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable Tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
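The rule above can be read as an executable predicate. The following Python sketch is one possible reading, not the InfoQuilt implementation: it assumes the date difference is measured in days and the distance in miles (as the later query slide suggests), it requires the earthquake to follow the test, and it uses the haversine great-circle formula, which is a choice made here rather than something stated on the slide. The sample event coordinates and dates are hypothetical.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points using the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    miles = distance_miles(
        test["latitude"], test["longitude"],
        quake["latitude"], quake["longitude"],
    )
    return miles < max_miles


# Hypothetical events, for illustration only.
nuclear_test = {"eventDate": date(1992, 9, 23), "latitude": 37.0, "longitude": -116.0}
later_quake = {"eventDate": date(1992, 10, 10), "latitude": 34.2, "longitude": -116.4}
earlier_quake = {"eventDate": date(1992, 9, 1), "latitude": 34.2, "longitude": -116.4}
```

Because 10000 miles covers most of the globe, the time window does nearly all of the filtering in this rule; a tighter radius would make the "nearby region" condition meaningful.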

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 26

Voice Based Internet Applications

Source: http://www.voicexml.org

HP 27

Voice XML Metadata

Voice Specific metadata

Supports Syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - News, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, One-number find-me services

Intranet - Inventory, HR services, corporate portals

Unification - My Whatever, personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define:

• roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
• format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE:
• establishes and manages the syndication
• delivers data
• logs events
=> content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://.../ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

Content Development Management

Content Delivery

Application Content Management

Content Authoring

Digital Asset Management

Software Configuration Management

Document Process Management

Metadata Management

Recombination

Personalization

Edge Network Delivery

Streaming Media Delivery

Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...

Kansas State
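The mismatch between the two models is purely in tag names, so a simple crosswalk table is enough to move a record from one to the other. The Python sketch below illustrates this under assumptions: the tag pairs are taken from the slide, while the `translate` helper and the idea of storing a record as a flat dictionary are hypothetical simplifications (real FGDC and UDK records are hierarchical and have many more elements).

```python
# Crosswalk from FGDC tag names to the UDK tag names shown on the slide.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",  # spelling as it appears in the slide
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}

# Reverse crosswalk, derived rather than maintained by hand.
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, crosswalk):
    """Rename the tags of a flat metadata record; unknown tags pass through."""
    return {crosswalk.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

udk_record = translate(fgdc_record, FGDC_TO_UDK)
round_trip = translate(udk_record, UDK_TO_FGDC)
```

A one-to-one renaming like this only achieves syntactic interoperability; it says nothing about whether the two models mean the same thing by each element.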

HP 39

Different views of Metadata

Domain Independent: Specifications (RDF), Frameworks/Infrastructures (XCM)

Metadata

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services; Taalee Content Applications

Where is the content?

Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate, Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: Types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange
Feed Handlers
Crawlers
Screen Scrapers
Bots
Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fore is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35 contained, while protection of the historic cabit continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
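The layout rule for the date field (day name, month name, integer, comma, integer) maps naturally onto a regular expression. The Python sketch below is a minimal illustration of that syntactic approach, not the extractor used in the workshop: the pattern, the group names, and the `extract_date` helper are assumptions, and the commas are made optional so that the pattern also matches headers whose punctuation has been lost.

```python
import re
from datetime import datetime

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# day month int ',' int  (commas optional to tolerate stripped punctuation)
DATE_PATTERN = re.compile(
    rf"(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if absent."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    stamp = f"{match['month']} {match['dom']} {match['year']}"
    return datetime.strptime(stamp, "%B %d %Y").date()


report_header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
report_date = extract_date(report_header)
```

A purely syntactic rule like this finds the date but knows nothing about what the date refers to, which is the limitation the later slides on semantic metadata address.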

Traditional Text Categorization

Customer Training Set

Statistical/AI Techniques

Classify: Place in a taxonomy

Feed

Customer Article Feed: Article 4715

Routing/Distribution

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Classification of Article 4715

Taalee's Categorization & Automatic Metadata Creation

Taalee Training Set

Customer Training Set

Knowledge-base & Statistical/AI Techniques

Classify: Place in a taxonomy

Article Feed: Article 4715

Routing/Distribution

Metadata Catalog

Content Manager

Precise syndication/filtering

Map to another taxonomy

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Classification of Article 4715

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

HP 49

Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text From Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods/Cluster Analysis

Learning/AI and Collab. Filtering

Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
Attributes
Domain Rules
Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE

shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI

causes (four edges among the disaster types)

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

• traditional queries based on keywords
• attribute-based queries
• content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = +

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic metadata

Virage Search on "football touchdown"

Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
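The contrast between keyword search and attribute search can be made concrete with a short Python sketch. It is illustrative only: the asset records reuse text from this slide, but the record layout and the `keyword_search` and `attribute_search` helpers are hypothetical and do not represent the Virage or Taalee search interfaces.

```python
# Hypothetical asset records: free text plus (possibly empty) semantic metadata.
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "text": "Ismail and Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
    {
        "title": "Brian Griese Interview, Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},
    },
]


def keyword_search(assets, keyword):
    """Assets whose title or text contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [a for a in assets if needle in (a["title"] + " " + a["text"]).lower()]


def _matches(actual, wanted):
    """True if a metadata value equals, or (for lists) contains, the wanted value."""
    if actual is None:
        return False
    if isinstance(actual, list):
        return wanted in actual
    return actual == wanted


def attribute_search(assets, **criteria):
    """Assets whose metadata satisfies every field=value criterion."""
    return [
        a for a in assets
        if all(_matches(a["metadata"].get(f), v) for f, v in criteria.items())
    ]


keyword_hits = keyword_search(ASSETS, "touchdown")
attribute_hits = attribute_search(ASSETS, Event="Touchdown", Teams="Ravens")
```

The keyword query returns both assets, including an interview that merely mentions a touchdown, while the attribute query returns only the asset that actually is a touchdown by the named team.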

HP 72

Taaleersquos Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting, interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

3 Julia Roberts Collection

Movie Trailer Notting Hill

Trailer - Runaway Bride

Patrick

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sun…

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1 My Stock Portfolio

Microsoft suffers serious hack attack

Cisco Systems Inc

Analyst Safa Rashtchy on Yahoo

PeopleSoft Inc

AT&T Corp

2 My Football Fantasy Team

Gators Spurrier ready for big game

Techs Vick looks to become complete QB

Bucs excited about Hamilton

Jasper Sanks rumbles into the end zone…

Edwards explains reasons for leaving BYU

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Coles security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II

HOT Topics

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

1108 ON24 Payne
1107 ON24 H&Q
1106 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine™

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection

Processing

Network Content

Affiliate Feeds

Public Sources Rich Data

Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon

At least 60 people died in this needless fire senior local official Karimu Alabi said

Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses

At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (940 PM)
Gore Says Fla Cant Name Electors (450 PM)
Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)
Barak Outlines His Peace Plan (630 AM)

-- Breaking News for 11302000 --

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11302000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) - 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

Description
Produced by: CNN
Posted Date: 12072000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREAT USER EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Enhanced XML Description

"Cisco Systems"

Node

Taalee Semantic Engine

"Cisco Systems"

Produced by: Fox Sports
Creation Date: 12052000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-rich Value-added Node

HP 93

Intelligent Metadata Creation

Extractor Agents: Content which does contain the words the user asked for

+

Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for

+

Semantic Associations: Content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
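
The idea of value-added metadata can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Taalee's implementation: the `PLAYS_FOR` and `TEAM_INFO` tables stand in for a knowledge base, and the function names and the sample story are hypothetical.

```python
# Hypothetical mini knowledge base; names and structure are illustrative only.
PLAYS_FOR = {
    "Roger Clemens": "New York Yankees",
    "Jamal Anderson": "Atlanta Falcons",
}
TEAM_INFO = {
    "New York Yankees": {"Sport": "Baseball", "League": "American League"},
    "Atlanta Falcons": {"Sport": "Football", "League": "NFL"},
}


def add_value_added_metadata(extracted):
    """Return a copy of the extracted metadata with Teams, Sport and League
    inferred from the Players field via the knowledge base."""
    enriched = {key: list(values) for key, values in extracted.items()}
    for player in extracted.get("Players", []):
        team = PLAYS_FOR.get(player)
        if team is None:
            continue  # unknown entity: add nothing rather than guess
        for field, value in [("Teams", team), *TEAM_INFO[team].items()]:
            values = enriched.setdefault(field, [])
            if value not in values:
                values.append(value)
    return enriched


def matches(metadata, query):
    """Case-insensitive substring search over all metadata values."""
    q = query.lower()
    return any(q in v.lower() for values in metadata.values() for v in values)


# Metadata extracted syntactically from a story that never mentions the team.
story = {"Players": ["Roger Clemens"], "Produced by": ["ESPN"]}
enriched = add_value_added_metadata(story)
print(matches(story, "Yankees"), matches(enriched, "Yankees"))
```

The search over the original metadata misses the story, while the search over the enriched metadata finds it, which is the behaviour the slide describes.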

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to httpwwwmediaanywherecomFootballhtml & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFLcom
Posted Date: 9202000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to httpwwwmediaanywherecomBaseballhtml & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3032001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting. Extracted automatically
Player Names: Names of players mentioned explicitly in the asset. Extracted automatically
Team Name: Name of team for which the player plays. Not mentioned explicitly in asset; value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs. Not mentioned explicitly in asset; value-added by Taalee's processing based on semantic associations
Sport: Name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact with it

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Internal Source 1 (Research)

Internal Source 2

External feeds/Web (e.g., Reuters)

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Semantic Engine

World Model

Taalee Metabase

Semantic Application (ASP/Enterprise hosted)

Third-party Content Mgmt and Syndication

XCM-compliant metadata, XML or other format

1 Cisco story from PW Source 1 passed on to add semantic associations

2 Consults KnowledgeBase for Cisco's competition

3 Returns result: Lucent is a competitor of Cisco

4 Lucent story from external feeds picked for publishing as "semantically related" to Cisco story, passed on to Dashboard

Story on Cisco

Story on Lucent
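
The four-step flow in this architecture (story arrives, knowledge base is consulted for competitors, a related story is picked from the feeds, both go to the dashboard) can be sketched as follows. This is a hedged illustration only: the `Story` class, the `COMPETITORS` table and the function name are assumptions made for the example, not part of the Taalee Semantic Engine.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    title: str
    source: str
    companies: list = field(default_factory=list)  # as tagged by extractor agents


# Hypothetical knowledge base of COMPETES WITH associations.
COMPETITORS = {"Cisco": {"Lucent"}, "Lucent": {"Cisco"}}


def related_stories(story, feed_stories):
    """Look up competitors of the companies in `story` (steps 2 and 3),
    then pick feed stories that mention any of them (step 4)."""
    rivals = set()
    for company in story.companies:
        rivals |= COMPETITORS.get(company, set())
    rivals -= set(story.companies)
    return [s for s in feed_stories if rivals & set(s.companies)]


internal = Story("Story on Cisco", "Internal Source 1", ["Cisco"])
feeds = [
    Story("Story on Lucent", "External feed", ["Lucent"]),
    Story("Story on IBM", "External feed", ["IBM"]),
]
dashboard = [internal] + related_stories(internal, feeds)
print([s.title for s in dashboard])
```

Keeping the associations in a separate table means new relationship types (industry, regulation, products) could be added without touching the selection logic.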

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ of Georgia

HP 108

Beyond RDF - one proposal (cf Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: wwwontoknowledgeorgoil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic manner

Based on RDF+XML

Agent-readable tags

wwwdamlorg

DAML Example

Source httpwwwzdnetcompcweekstoriesjumps04270243294600html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
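
The rule above can be written as an executable predicate. The sketch below is one possible reading, with several assumptions that are not in the slide: dates are compared in days, distances are great-circle (haversine) miles, the earthquake must not precede the test, and the two sample records are invented purely for illustration.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8


def distance(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, with the slide's thresholds as defaults."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


# Invented sample records, for illustration only.
test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_after = {"eventDate": date(1970, 1, 15), "latitude": 36.0, "longitude": -117.0}
quake_before = {"eventDate": date(1969, 12, 15), "latitude": 36.0, "longitude": -117.0}
print(could_have_caused(test, quake_after), could_have_caused(test, quake_before))
```

Both thresholds are parameters, so a stricter notion of "nearby region" only requires passing a smaller `max_miles`.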

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
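
The grouping query on this slide amounts to binning earthquakes by magnitude range and averaging the counts over the years of interest. A minimal sketch follows, assuming each earthquake is a `(year, magnitude)` pair; the sample data and the choice to count a boundary value in the upper range are assumptions of the example, not something the slide specifies.

```python
from bisect import bisect_right
from collections import Counter

BIN_EDGES = [5.8, 6.0, 7.0, 8.0, 9.0]
BIN_LABELS = ["5.8-6", "6-7", "7-8", "8-9", ">9"]


def magnitude_bin(magnitude):
    """Return the range label for a magnitude, or None if it is below 5.8.
    A value exactly on a boundary is counted in the upper range."""
    index = bisect_right(BIN_EDGES, magnitude) - 1
    return BIN_LABELS[index] if index >= 0 else None


def average_per_year(quakes, first_year, last_year):
    """Average number of earthquakes per year in each magnitude range."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    years = last_year - first_year + 1
    return {label: counts[label] / years for label in BIN_LABELS}


# Invented sample data, for illustration only.
quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 5.0), (1901, 7.4)]
print(average_per_year(quakes, 1900, 1901))
```

Running the same function over the years before and after a chosen date would give the two averages the slide compares.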

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: wwww3orgTRREC-rdf-syntax
ICE: wwwicestandardorg
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: httpcgiomgorgcgi-bindocad99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: httpcgiomgorgcgi-bindocad9910-02 httpcgiomgorgcgi-bindocad99-10-03
DAML: wwwdamlorg
NEWSML: newsshowcasereuterscom
PRISM: wwwprismstandardorgtechdevprismspec1asp
XCM: wwwvignettecom
OIL: wwwontoknowledgeorgoil
SEMANTICWEB: wwwsemanticweborg
VOICEXML: wwwvoicexmlorg
MPEG7: wwwdarmstadtgmddemobileMPEG7
Taalee: wwwtaaleecom
Oingo: wwwoingocom

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 27

Voice XML Metadata

Voice Specific metadata

Supports syntactic interoperability

Text data to voice data

Voice XML = XML + Voice Metadata

HP 28

VoiceXML - Possible Services

Information retrieval - News, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number find-me services

Intranet - Inventory, HR services, corporate portals

Unification - My Whatever, personal portals, personal agents, unified messaging

Source httpwwwvoicexmlorg

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible content syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: httpwwwicestandardorg

HP 32

What is the ICE Protocol

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define:

roles and responsibilities of involved parties: Subscriber vs Syndicator, Requestor vs Responder, Sender vs Receiver

format and method of content exchange (e.g., sequenced packages, pull vs push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source httpwwwicestandardorg

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; httpwwwinternetweekcomebizapps01ebiz050701-3htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "httpicedtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)
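
To make the split between content-independent (ICE) metadata and domain-specific metadata concrete, the sketch below parses a payload shaped like the example above with Python's standard `xml.etree.ElementTree`. The attribute punctuation in the sample string (for instance `ice.version` and `demo.gif`) is reconstructed and should be treated as an assumption; the function name is hypothetical, and this is not an ICE client.

```python
import xml.etree.ElementTree as ET

# Sample payload modelled on the slide; punctuation in attribute values is assumed.
PAYLOAD = """<?xml version="1.0"?>
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">PdXIWZQ8</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """Return (envelope, items): envelope holds the ice-payload attributes, and each
    item pairs its ICE attributes with the domain-specific elements it carries."""
    root = ET.fromstring(payload_xml)
    envelope = dict(root.attrib)
    items = []
    for item in root.iter("ice-item"):
        content = [{"tag": child.tag, **child.attrib} for child in item]
        items.append({"ice": dict(item.attrib), "content": content})
    return envelope, items


envelope, items = split_metadata(PAYLOAD)
print(envelope["payload-id"], items[0]["ice"]["name"], items[0]["content"][0]["title"])
```

Everything under `ice` would look the same for any syndicated item, while `content` changes with the industry vocabulary in use.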

HP 36

XCM (eXtended Content Management)

a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are

Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM

eXtended Content Management

Content Development Management

Content Delivery

Application Content Management

Content Authoring

Digital Asset Management

Software Configuration Management

Document Process Management

Metadata Management

Recombination

Personalization

Edge Network Delivery

Streaming Media Delivery

Caching

Source httpwwwvignettecom

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: httpgisdasckgsukansedudasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: httpgisdasckgsukansedudasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…

Kansas State
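
One common way to reconcile two models that name the same data differently is to map each model's tags onto a shared vocabulary. The sketch below does this for the FGDC and UDK tags listed above; the shared field names (`keywords`, `title`, and so on) and the function name are invented for the example and do not come from either standard.

```python
# Tag names are taken from the slide; the shared field names are hypothetical.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Adress Id": "url",
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, tag_map):
    """Rename the tags of one record to the shared vocabulary, dropping unmapped tags."""
    return {tag_map[tag]: value for tag, value in record.items() if tag in tag_map}


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
print(normalize(fgdc_record, FGDC_TO_COMMON) == normalize(udk_record, UDK_TO_COMMON))
```

Once both records are expressed in the shared vocabulary they compare equal, so a single query can run over content described under either model.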

HP 39

Different views of Metadata

Domain Independent: Specifications (RDF), Frameworks/Infrastructures (XCM)

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services, Taalee Content Applications

Where is the content?

Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate, Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.

HP 43

FormsTypesIngest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content HandlingIngest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
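The layout rule above (a date is a day name, a month name, an integer, a comma, and an integer) could be expressed as a regular expression. The following Python sketch is only an illustration of the syntactic approach, not the extractor used in the workshop; the names `DATE_RULE` and `extract_date` are hypothetical, and the commas are treated as optional so that text with stripped punctuation still matches.

```python
import re
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Date => day month int ',' int  (commas are optional in this sketch)
DATE_RULE = re.compile(
    r"\b(?P<day>" + "|".join(DAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+(?P<year>\d{4})\b"
)

def extract_date(text):
    """Return the first date that matches the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match.group("month")) + 1
    return date(int(match.group("year")), month_number, int(match.group("dom")))

report = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_date(report))
```

A purely syntactic rule like this finds the date by its shape alone; it knows nothing about what the report is about, which is the limitation the later slides on semantic metadata address.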

Traditional Text Categorization

Statistical/AI Techniques

Classify: Place in a taxonomy

feed

Customer Training Set

Routing/Distribution

Customer Article Feed 4715

Standard Metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Classification of Article 4715

Knowledge-base & Statistical/AI Techniques

Classify: Place in a taxonomy

Metadata Catalog

Content Manager

Precise syndication/filtering

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee's Categorization & Automatic Metadata Creation

Taalee Training Set

Customer Training Set

ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis

Classification of Article 4715

Article Feed 4715: Routing/Distribution

Map to another taxonomy

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text From Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
Statistical (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis

Learning/AI and Collab. Filtering

Reference data/Concept-terms: Dictionary/Thesaurus, by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: four interrelated ontologies]

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several of these are linked by "causes" relationships

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords
attribute based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage: Search on football touchdown

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000

HP 72

Taaleersquos Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more...

2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...

3 Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more...

4 Pink Floyd Collection
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more...

HOT Topics

1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more...

2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more...

3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...

HP 80

Metadata Targeting

SemanticInteractive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata such as "Strong Buy" by the H&Q analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS, DISH, Wink, AOL-TV)

Semantic Engine (TM)

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by a personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection, Processing

Network Content

Affiliate Feeds

Public Sources Rich Data

Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

Clips (duration - date - source):
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEG Decoder

Node = AVO Object

Create Scene Description Tree

GREAT USER EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Enhanced XML Description

"Cisco Systems" Node

Taalee Semantic Engine

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: Jon Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-rich Value-added Node

HP 93

Intelligent Metadata Creation

Extractor Agents: Content which does contain the words the user asked for

+

Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for

+

Semantic Associations: Content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
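The Roger Clemens example can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions, not Taalee's implementation: the knowledge base is a small hand-written dictionary, and the relation names (`plays_for`, `plays_sport`, `member_of`) and the functions `enrich` and `matches` are hypothetical.

```python
# Hypothetical, hand-written knowledge base: (subject, relation) -> object.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("New York Yankees", "member_of"): "American League",
}

def enrich(metadata):
    """Return a copy of the metadata with team, sport, and league added for each player."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        team = KNOWLEDGE_BASE.get((player, "plays_for"))
        sport = KNOWLEDGE_BASE.get((player, "plays_sport"))
        league = KNOWLEDGE_BASE.get((team, "member_of")) if team else None
        for field, value in (("Teams", team), ("Sport", sport), ("League", league)):
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched

def matches(metadata, query):
    """Attribute search: True if any metadata value contains the query text."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values() for value in values)

extracted = {"Players": ["Roger Clemens"]}  # what the story text itself yields
enriched = enrich(extracted)

print(matches(extracted, "Yankees"))  # the story is missed without value-added metadata
print(matches(enriched, "Yankees"))   # the story is found once the team has been added
```

The point of the sketch is that the added fields come from the knowledge base, not from the asset, so a search on the team name reaches a story that never mentions it.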

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Posted Date: Date of asset posting - extracted automatically

Producer Name: Name of content provider that produced the asset

Player Names: Names of players mentioned explicitly in the asset - extracted automatically

Team Name: Name of team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships

League Name: Name of league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations

Sport: Name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways the users choose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
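A minimal Python sketch of the Commerce One example follows. It assumes a toy knowledge base of typed entities and named relationships; the class `KnowledgeBase`, the relation names, the catalog entries ("Story A" and so on) and the function `search` are all hypothetical and stand in for whatever the Semantic Content Model actually uses.

```python
from collections import defaultdict

class KnowledgeBase:
    """Toy store of typed entities and named relationships between them."""

    def __init__(self):
        self.types = {}
        self.edges = defaultdict(set)  # (subject, relation) -> set of objects

    def add_entity(self, name, entity_type):
        self.types[name] = entity_type

    def relate(self, subject, relation, obj, symmetric=False):
        self.edges[(subject, relation)].add(obj)
        if symmetric:
            self.edges[(obj, relation)].add(subject)

    def related(self, subject, relation):
        return sorted(self.edges.get((subject, relation), set()))

kb = KnowledgeBase()
kb.add_entity("Commerce One", "COMPANY")
kb.add_entity("Ariba", "COMPANY")
kb.add_entity("Corporate, Professional & Financial Software", "INDUSTRY")
kb.relate("Commerce One", "participates_in", "Corporate, Professional & Financial Software")
kb.relate("Commerce One", "competes_with", "Ariba", symmetric=True)

# Hypothetical catalog: each asset is tagged with the companies it is about.
CATALOG = [
    {"title": "Story A", "companies": ["Commerce One"]},
    {"title": "Story B", "companies": ["Ariba"]},
    {"title": "Story C", "companies": ["Cisco"]},
]

def search(company):
    """Return what was asked for plus news on competitors found via the knowledge base."""
    direct = [a["title"] for a in CATALOG if company in a["companies"]]
    competitors = kb.related(company, "competes_with")
    associated = [a["title"] for a in CATALOG
                  if any(c in a["companies"] for c in competitors)]
    return {"asked_for": direct, "competitor_news": associated}

print(search("Commerce One"))
```

A keyword index alone would return only "Story A"; the competitor story is reachable only because the relationship is stored explicitly.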

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)

Extractor Agent 1, Extractor Agent 2, Extractor Agent 3

Semantic Engine, World Model, Taalee Metabase

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to the Cisco story; passed on to Dashboard

Story on Lucent, Story on Cisco

XCM-compliant metadata, XML or other format

Semantic Application (ASP/Enterprise hosted)

Third-party Content Mgmt and Syndication

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake
<= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
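The rule above could be evaluated along the following lines. This Python sketch makes several assumptions that are not stated on the slide: `dateDifference` is taken to be in days and `distance` in miles (the units used in the query on a later slide), distance is computed with the haversine great-circle formula, and the earthquake is required to come on or after the test, as the prose version of the rule says. The names `Event`, `date_difference`, `distance` and `may_have_caused` are illustrative only.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees

def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days

def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))

def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """The 'NuclearTest Causes Earthquake' rule: close in time (after the test) and in space."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles

# Made-up events, for illustration only.
test_event = Event(date(1970, 1, 1), 37.0, -116.0)
nearby_quake = Event(date(1970, 1, 10), 36.0, -117.0)
late_quake = Event(date(1970, 3, 1), 36.0, -117.0)

print(may_have_caused(test_event, nearby_quake))
print(may_have_caused(test_event, late_quake))
```

Such a rule only selects candidate pairs; whether a test actually caused an earthquake is the question the following slides explore with aggregate queries.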

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale, per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year, starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
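The grouping query above can be sketched as a small aggregation. The Python below is an illustration only: the sample records are invented (they are not USGS or NEIC data), the function names are hypothetical, and treating each range as half-open (low inclusive, high exclusive) is an assumption, since the slide does not say how boundary magnitudes are assigned.

```python
from collections import Counter

# (label, low, high): ranges are treated as [low, high); this boundary rule is an assumption.
BINS = [("5.8-6", 5.8, 6.0), ("6-7", 6.0, 7.0), ("7-8", 7.0, 8.0),
        ("8-9", 8.0, 9.0), (">9", 9.0, float("inf"))]

def bin_label(magnitude):
    """Return the label of the magnitude range, or None if below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None

def average_per_year(quakes, first_year, last_year):
    """quakes: iterable of (year, magnitude). Returns {range label: average count per year}."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for label, _, _ in BINS}

# Invented sample records, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1), (1901, 5.0)]
print(average_per_year(sample, 1900, 1901))
```

Running the same function over two periods, for example 1900 to 1944 and 1945 onward, gives the per-range averages that the slide compares.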

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 28

VoiceXML - Possible Services

Information retrieval - news, sports, traffic, stock quotes

e-Transactions (e-commerce, e-tailing, etc.)

Financial: banking, stock trading

Catalog browsing (generally as an adjunct to paper)

Telephone services

Personal voice dialing, one-number "find-me" services

Intranet - inventory, HR services, corporate portals

Unification - "My Whatever" personal portals, personal agents, unified messaging

Source: http://www.voicexml.org

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are:

Digital libraries (image catalog, musical dictionary)

Multimedia directory services (e.g., yellow pages)

Broadcast media selection (radio channel, TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: an efficient and extensible content syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define:

roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook, version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "httpicedtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)
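One way a subscriber might separate the content-independent ICE attributes from the domain-specific metadata in a payload like the one above. This is a sketch using Python's standard `xml.etree.ElementTree`; the trimmed payload string and the `split_metadata` helper are illustrative assumptions, not part of the ICE specification.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the payload on the slide (DOCTYPE and image data omitted).
PAYLOAD = """<?xml version="1.0"?>
<ice-payload payload-id="ipl-80a56cfe">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon">
        <comic-strip title="Looney City" author="Amito Pateru" pubdate="20010515">(encoded image)</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """Return, per ice-item, its ICE attributes and the attributes of its domain elements."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        items.append({
            "ice": dict(item.attrib),                                   # content-independent
            "domain": {child.tag: dict(child.attrib) for child in item},  # domain-specific
        })
    return items


items = split_metadata(PAYLOAD)
```

Here the `ice-item` attributes describe the delivery, while the `comic-strip` element carries the vocabulary of the content's own domain.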

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM

[Figure: eXtended Content Management. Segments: Content Development/Management, Application Content Management, Content Delivery. Components: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management, Metadata Management, Recombination, Personalization, Edge Network Delivery, Streaming Media Delivery, Caching]

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (example: Kansas State)

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
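The tag-name mismatch between the two models can be bridged with a simple crosswalk. The sketch below uses only the five tag pairs shown on the slide; the dictionary, the function name, and the "Originator" tag in the test are illustrative assumptions, and a real FGDC-to-UDK mapping would be far larger and not always one-to-one.

```python
# Tag pairs taken from the slide; spellings follow the slide (e.g. "Adress Id").
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to their UDK equivalents.

    Returns (mapped, unmapped): tags without a known equivalent are kept
    separately rather than silently dropped.
    """
    mapped, unmapped = {}, {}
    for tag, value in record.items():
        if tag in FGDC_TO_UDK:
            mapped[FGDC_TO_UDK[tag]] = value
        else:
            unmapped[tag] = value
    return mapped, unmapped


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record, leftovers = fgdc_to_udk(fgdc_record)
```

A shared ontology of the GIS domain, as discussed later in the workshop, replaces pairwise tables like this with one mapping per model.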

HP 39

Different views of Metadata

Domain Independent Specifications: RDF

Frameworks/Infrastructures: XCM

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC, UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services, Taalee Content Applications, Taalee Semantic MetaBase

Where is the content? Whose is it? (Produce/Aggregate, Catalog/Index)

What other content is it related to? (Integrate, Syndicate)

What is the right content for this user? (Personalize)

What is the best way to monetize this interaction? (Interactive Marketing)

Channels: Broadcast, Wireline, Wireless, Interactive TV

HP 41

Taaleersquos Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction; types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web sites, content feeds, and private repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, semi-structured text, structured text (+ media); static or dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange
Feed Handlers
Crawlers
Screen Scrapers
Bots
Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

GlobalEnterpriseWeb Repositories

METADATA EXTRACTORS

Digital Maps

NexisUPIAP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
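The layout rule for dates can be approximated with a regular expression. This is a minimal sketch of the syntactic approach only; the pattern, the names `DATE_RE` and `extract_date`, and the choice to tolerate a missing comma are assumptions rather than the extractor used in the workshop.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int  (commas are made optional to tolerate noisy input)
DATE_RE = re.compile(
    r"\b(" + "|".join(DAYS) + r"),?\s+(" + "|".join(MONTHS) + r")\s+(\d{1,2}),?\s+(\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    _day_name, month, day, year = match.groups()
    return date(int(year), MONTHS.index(month) + 1, int(day))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
report_date = extract_date(header)
```

A purely syntactic rule like this knows the shape of a date but nothing about what the report is about, which is the gap the semantic techniques in the following slides address.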

Traditional Text Categorization

[Figure: a customer article feed (Article 4715) and a customer training set go through statistical/AI techniques, which classify the article and place it in a taxonomy for routing/distribution]

Standard metadata for Article 4715
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

[Figure: the article feed (Article 4715), a Taalee training set, and a customer training set go through knowledge-base & statistical/AI techniques, which classify the article, place it in a taxonomy, and map it to another taxonomy. Outputs: Metadata Catalog, Content Manager, precise syndication/filtering, routing/distribution. Other labels: Automated Content Enrichment (ACE), Taalee Enterprise Customization Suite]

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

[Figure: alternatives ordered toward deeper understanding]

Statistical methods: Cluster Analysis, Learning/AI, and Collaborative Filtering (by word or phrase)

Reference data: Concept-terms, Dictionary/Thesaurus (by topic/industry/subject/domain)

Ontologies/Domain Models, KnowledgeBase (by entities and relationships)

HP 56

Open Directory Project (ODP) Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
Attributes
Domain Rules
Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: four interrelated ontologies.
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL.
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE.
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing.
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, with "causes" relationships among them.]

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords
attribute-based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" content can only be "found" if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS, UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from Typical Cataloging of Football Assets

Virage search on "football touchdown":

Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…

Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…

Taalee Metadata on Football Assets (Rich Media Reference Page)

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
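The difference between the two kinds of search can be sketched over a tiny in-memory catalog. The record layout and the two functions below are hypothetical and only illustrate the idea; they are not Virage's or Taalee's interfaces, and the asset text is abridged from the slide.

```python
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "text": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "text": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "text": "Tony Banks throws a 76-yard touchdown to extend the lead",
     "metadata": {"League": "Professional", "Teams": ["Ravens", "Steelers"],
                  "Players": ["Tony Banks"], "Event": "Touchdown"}},
]


def keyword_search(assets, query):
    """Titles of assets whose title or text contains every query word."""
    words = query.lower().split()
    return [a["title"] for a in assets
            if all(w in (a["title"] + " " + a["text"]).lower() for w in words)]


def attribute_search(assets, **criteria):
    """Titles of assets whose metadata matches every field=value criterion."""
    def has(asset, field, wanted):
        value = asset["metadata"].get(field, [])
        values = value if isinstance(value, list) else [value]
        return wanted in values
    return [a["title"] for a in assets
            if all(has(a, field, wanted) for field, wanted in criteria.items())]


by_keyword = keyword_search(ASSETS, "touchdown")
by_attribute = attribute_search(ASSETS, Event="Touchdown", Teams="Steelers")
```

The keyword query returns every asset that mentions the word, while the attribute query pins down the one game, even though the word "Steelers" never appears in that asset's text.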

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and domain-specific attributes. Uniform metadata for content from multiple sources. Can be sorted by any field.

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content
Personalization and targeting
Interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…

2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…

3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic, personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter Search ConsolidatePersonalize ArchiveLicensing Syndication

Production SupportProduction SupportSony

Categorize

Catalog

Integrate

CollectionCollection ProcessingProcessing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC

(2:53) - 12/06/00 - CBS

(5:16) - 12/06/00 - ABC

(2:46) - 12/06/00 - FOX

(1:33) - 12/06/00 - NBC

(5:33) - 12/06/00

(3:57) - 12/06/00 - CBS

(4:27) - 12/06/00 - ABC

(3:44) - 12/06/00 - FOX

(7:24) - 12/06/00 - CBS

(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC

(2:33) - 12/06/00 - CBS

(3:12) - 12/06/00 - NNS

(0:32) - 12/06/00 - CBS

(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREAT USER EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+

Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+

Semantic Associations: content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
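The Roger Clemens case can be sketched with a toy knowledge base. Everything here is a hypothetical illustration of the idea, not Taalee's Semantic Engine: the single knowledge-base entry encodes only the facts stated on the slide, and the `enrich` and `matches` helpers are assumed names.

```python
# Toy knowledge base with the facts stated on the slide.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
}

# Metadata field to fill in, and the knowledge-base key that supplies its value.
VALUE_ADDED_FIELDS = (("Teams", "team"), ("Sport", "sport"))


def enrich(metadata):
    """Return a copy of the metadata with values implied by the players mentioned."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for field, kb_key in VALUE_ADDED_FIELDS:
            value = facts.get(kb_key)
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """True if the query occurs in any metadata value (case-insensitive)."""
    q = query.lower()
    return any(q in value.lower() for values in metadata.values() for value in values)


story = {"Players": ["Roger Clemens"]}  # extracted metadata; no team is mentioned
found_before = matches(story, "Yankees")
found_after = matches(enrich(story), "Yankees")
```

Before enrichment a search for "Yankees" misses the story; after the knowledge base adds the team, the same search finds it.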

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Posted Date: Date of asset posting. Extracted automatically.

Producer Name: Name of content provider that produced the asset.

Sport: Name of sport.

Player Names: Names of players mentioned explicitly in the asset. Extracted automatically.

Team Name: Name of team for which the player plays. Not mentioned explicitly in the asset. Value-added using Taalee's semantic relationships.

League Name: Name of league to which the player's team belongs. Not mentioned explicitly in the asset. Value-added by Taalee's processing based on semantic associations.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

Legend: "X → Y" means Taalee uses X to add Y as value-added metadata to the asset.
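The value-added step on this slide (player name to team name to league name) can be illustrated with a small lookup chain. This is only a sketch under stated assumptions: the two dictionaries are hypothetical stand-ins for a knowledge base, the function name is invented for illustration, and nothing here reflects how Taalee's patented extraction actually works.

```python
# Hypothetical knowledge base: these dictionaries stand in for the semantic
# relationships (player -> team, team -> league); they are assumptions.
PLAYER_TEAM = {
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
TEAM_LEAGUE = {
    "Atlanta Falcons": "NFL",
    "Los Angeles Dodgers": "National League",
}


def add_value_added_metadata(extracted: dict) -> dict:
    """Return a copy of the extracted metadata plus inferred Teams and League."""
    enriched = dict(extracted)
    teams = []
    for player in extracted.get("Players", []):
        team = PLAYER_TEAM.get(player)
        if team and team not in teams:
            teams.append(team)
    leagues = []
    for team in teams:
        league = TEAM_LEAGUE.get(team)
        if league and league not in leagues:
            leagues.append(league)
    # Only add fields that could actually be inferred.
    if teams:
        enriched["Teams"] = teams
    if leagues:
        enriched["League"] = leagues
    return enriched


# Metadata extracted explicitly from the asset (values from the demo slide).
asset = {
    "Produced by": "NFL.com",
    "Posted Date": "9/20/2000",
    "Players": ["Jamal Anderson"],
}
enriched_asset = add_value_added_metadata(asset)
print(enriched_asset)
```

A search on "Atlanta Falcons" can then match the enriched record even though the source page never mentions the team.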

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to understand and associate all content it catalogs
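The difference between matching a keyword and following a relationship can be sketched with a tiny store of (subject, relationship, object) triples. The triples, relationship names, and helper functions below are assumptions made for illustration; the slides do not describe the actual representation used by the Semantic Engine.

```python
from collections import defaultdict

# Hypothetical triples built from the Commerce One example on the slide.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]


def build_index(triples):
    """Index triples as entity -> relationship -> set of related entities."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relationship, obj in triples:
        index[subject][relationship].add(obj)
        # Assumption: competition is symmetric, so store it in both directions.
        if relationship == "competes_with":
            index[obj][relationship].add(subject)
    return index


def related_entities(index, entity, relationship):
    """Return the entities linked to `entity` by `relationship`, sorted."""
    if entity not in index:
        return []
    return sorted(index[entity][relationship])


INDEX = build_index(TRIPLES)
print(related_entities(INDEX, "Commerce One", "competes_with"))
```

With such an index, a query for Commerce One can also surface news tagged with Ariba, which a keyword match alone would not find.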

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters)

Components: Extractor Agents 1-3, Semantic Engine, WorldModel, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML, or other format

1. Cisco story from PW Source 1 is passed on to add semantic associations

2. Consults the Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard

Result: Story on Cisco, Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What you asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

Technology/Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic Search, Personalization, etc.

A vision for future

Semantic Web Complex Relationships and Knowledge Discovery

Eg InfoQuilt project at LSDIS Lab Univ of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source wwwontoknowledgeorgoil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

wwwdamlorg

DAML Example

Source httpwwwzdnetcompcweekstoriesjumps04270243294600html

Three layered Architecture Of Semantic Web

Logical Layer: Formal semantics and reasoning support - OIL, DAML-O

Schema Layer: Definition of vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear tests may cause earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30 AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
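The "Causes" rule above can be written as an ordinary predicate over two event records. The sketch below assumes that dateDifference is measured in days, that distance is a great-circle distance in miles (computed here with the haversine formula), and that the earthquake must not precede the test; the Event class and function names are hypothetical and are not taken from the InfoQuilt project.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(a: Event, b: Event) -> float:
    """Great-circle distance between two events (haversine formula)."""
    phi1, phi2 = math.radians(a.latitude), math.radians(b.latitude)
    dphi = phi2 - phi1
    dlambda = math.radians(b.longitude - a.longitude)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(h)))


def may_have_caused(test: Event, quake: Event, max_days: int = 30, max_miles: float = 10000) -> bool:
    """Apply the slide's rule: quake within max_days after the test and within max_miles."""
    days = (quake.event_date - test.event_date).days
    # Assumption: "some time after" means the quake cannot come before the test.
    return 0 <= days < max_days and distance_miles(test, quake) < max_miles


# Illustrative records only; dates and coordinates are made up.
nuclear_test = Event(date(1998, 5, 11), 27.1, 71.8)
quake_soon_after = Event(date(1998, 5, 30), 37.1, 70.1)
quake_before = Event(date(1998, 5, 1), 37.1, 70.1)
print(may_have_caused(nuclear_test, quake_soon_after))
```

The thresholds are parameters so that the 30 and 10000 used on the slide can be tightened when exploring the data.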

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
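The grouping query on this slide amounts to binning earthquakes by magnitude range and averaging the counts over the years covered. A minimal sketch follows; the sample records are invented for illustration, and the treatment of boundary values (a magnitude exactly on a boundary falls into the higher range, so 9.0 is counted with >9) is an assumption.

```python
from collections import Counter

# Magnitude ranges from the slide as (low, high, label); intervals are
# half-open [low, high), which is an assumption about boundary handling.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the range containing `magnitude`, or None."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range.

    `quakes` is an iterable of (year, magnitude) pairs.
    """
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for _, _, label in BINS}


# Made-up sample data, not real seismic records.
sample_quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.4)]
averages = average_per_year(sample_quakes, 1900, 1901)
print(averages)
```

Running the same function over two periods, for example before and after 1945, gives the per-range comparison the slide describes.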

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF wwww3orgTRREC-rdf-syntax

ICE wwwicestandardorg

Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999 httpcgiomgorgcgi-bindocad99-09-05

XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999 httpcgiomgorgcgi-bindocad9910-02 httpcgiomgorgcgi-bindocad99-10-03

DAML wwwdamlorg

NEWSML newsshowcasereuterscom

PRISM wwwprismstandardorgtechdevprismspec1asp

XCM wwwvignettecom

OIL wwwontoknowledgeorgoil

SEMANTICWEB wwwsemanticweborg

VOICEXML wwwvoicexmlorg

MPEG7 wwwdarmstadtgmddemobileMPEG7

Taalee wwwtaaleecom

Oingo wwwoingocom

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of MetadataWorkshop at Content World 2001 Burlingame CA May 15 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperabilitykey metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards(or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsMLhellip
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML ndash Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taaleersquos Intelligent Content Process
  • Metadata Creation and Semanticization
  • FormsTypesIngest of Content
  • Content HandlingIngest
  • Extracting a Text DocumentSyntactic approach
  • Automatic Categorization amp Metadata Tagging (unstructured texttranscript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) ClassificationTaxonomy amp Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies TaxonomiesOntologies
  • Metadata enabledApplications
  • Use of Categories for Search
  • More than metadata
  • Metadata amp Search
  • Taaleersquos Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries amp Hot Topics
  • SemanticInteractive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV Taaleersquos Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associationssupported by Taalee Semantic Engine
  • Semantic Web Application ExampleFinancial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF ndash one proposal (cf Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL ndash as RDF Extension
  • DAML and OIL ndash Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Examplehellip
  • Knowledge Discovery - Examplehellip
  • ResourcesReferences

HP 29

MPEG7

A set of description schemes and descriptors to describe the content of multimedia data

Provides a language to specify description schemes

A scheme for coding the description

HP 30

Application Examples for MPEG7

A few application examples are

Digital libraries (image catalog musical dictionary)

Multimedia directory services (eg yellow pages)

Broadcast media selection (radio channel TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible content syndication protocol for the Internet using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1 (May 2000); submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web Site httpwwwicestandardorg

HP 32

What is the ICE Protocol

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define:

• roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

• format and method of content exchange (e.g. sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source httpwwwicestandardorg

HP 34

ICE Explained

ICE: Information and Content Exchange protocol

Syndicator: A content aggregator and distributor

Subscriber: A content consumer

Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement

Collection: The current content of a subscription

ICE Package: A delivery of commands to update a collection, such as the addition of content items

ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek, ICE Cookbook version 1.0, httpwwwinternetweekcomebizapps01ebiz050701-3htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "httpicedtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321"
                name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)
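To make the split between content-independent ICE metadata and domain-specific metadata concrete, the following sketch parses a simplified payload with Python's standard xml.etree.ElementTree module. The payload string is a trimmed, assumed version of the example above (DOCTYPE omitted, image data elided), and split_metadata is a hypothetical helper, not part of any ICE toolkit.

```python
import xml.etree.ElementTree as ET

# Simplified payload modeled on the slide's example; attribute values are
# taken from the slide, the overall shape is an assumption.
PAYLOAD = """<?xml version="1.0"?>
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" name="Cartoon" content-type="application/xml">
        <comic-strip title="Looney City" pubdate="20010515">...</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """Separate ICE envelope attributes from domain-specific child elements."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)  # content-independent ICE metadata
        # Children of ice-item belong to the domain vocabulary.
        domain = [{"element": child.tag, **child.attrib} for child in item]
        items.append({"ice": envelope, "domain": domain})
    return items


items = split_metadata(PAYLOAD)
print(items[0]["ice"])
print(items[0]["domain"])
```

The ICE attributes (item-id, name, content-type) stay the same for any kind of content, while the comic-strip attributes would change with the subscriber's domain vocabulary.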

HP 36

XCM (eXtended Content Management)

a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are

Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM

eXtended Content Management

Content DevelopmentManagement

Content DeliveryApplication ContentManagement

Content AuthoringDigital Asset Management

Software ConfigurationManagement

Document ProcessManagement

Metadata ManagementRecombinationPersonalization

Edge Network Delivery

Streaming Media DeliveryCaching

Source httpwwwvignettecom

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model

Theme keywords: digital line graph, hydrography, transportation

Title: Dakota Aquifer

Online linkage: httpgisdasckgsukansedudasc

Direct Spatial Reference Method: Vector

Horizontal Coordinate System Definition: Universal Transverse Mercator

...

UDK Metadata Model

Search terms: digital line graph, hydrography, transportation

Topic: Dakota Aquifer

Adress Id: httpgisdasckgsukansedudasc

Measuring Techniques: Vector

Co-ordinate System: Universal Transverse Mercator

...

Kansas State
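Because the two models use different tag names for the same values, a simple crosswalk table is enough to translate a record from one to the other. The sketch below pairs up the tags in the order they appear on the slide; that pairing, the dictionary name, and the helper function are assumptions for illustration rather than an official FGDC/UDK mapping.

```python
# Hypothetical crosswalk from UDK tag names to FGDC tag names, paired by the
# order in which the slide lists them (including the slide's "Adress" spelling).
UDK_TO_FGDC = {
    "Search terms": "Theme keywords",
    "Topic": "Title",
    "Adress Id": "Online linkage",
    "Measuring Techniques": "Direct Spatial Reference Method",
    "Co-ordinate System": "Horizontal Coordinate System Definition",
}


def to_fgdc(udk_record: dict) -> dict:
    """Rename UDK tags to their FGDC equivalents; unknown tags pass through."""
    return {UDK_TO_FGDC.get(tag, tag): value for tag, value in udk_record.items()}


udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
fgdc_record = to_fgdc(udk_record)
print(fgdc_record)
```

A pure renaming only works when the values themselves agree, as they do here; models that also differ in value formats or granularity would need per-field conversion functions.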

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

FrameworksInfrastructures (XCM)

MetadataApplication Specific

ICE

Media Specific

MPEG7 VoiceXML

Domain Specific

NewsML FGDCUDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services Taalee Content Applications

Where is the content

Whose is it

ProduceAggregate

CatalogIndex

What other content is it related to

Integrate Syndicate

What is the right content for this

user

Personalize

What is the best way to

monetize this interaction

Interactive Marketing

BroadcastWirelineWirelessInteractive TV

Taalee Semantic MetaBase

HP 41

Taaleersquos Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories

Types: Text, Graphics, Audio, Video, Multimedia

Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic

Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

GlobalEnterpriseWeb Repositories

METADATA EXTRACTORS

Digital Maps

NexisUPIAP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT

Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
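The layout rule for the report date (a day name, a month name, an integer, a comma, and an integer) maps naturally onto a regular expression. The following is a minimal sketch of such a syntactic extractor; the pattern, the tolerance for a missing comma, and the function name are assumptions, not the extractor used in the workshop.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int   (the comma is optional here because
# extracted text often loses punctuation; that tolerance is an assumption).
DATE_PATTERN = re.compile(
    r"\b(?:%s)\s+(%s)\s+(\d{1,2})(?:\s*,\s*|\s+)(\d{4})\b"
    % ("|".join(DAYS), "|".join(MONTHS))
)


def extract_report_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    month_name, day, year = match.groups()
    return date(int(year), MONTHS.index(month_name) + 1, int(day))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
report_date = extract_report_date(header)
print(report_date)
```

A purely syntactic rule like this finds the date reliably but says nothing about what the date means (posting date, incident date, and so on), which is where the semantic approaches discussed later come in.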

Traditional TextCategorization

StatisticalAI Techniques

Classify Place ina taxonomy

feed

Customer Training

Set

RoutingDistribution

Customer Article Feed

4715

Standard Metadata

Feed Source iSyndicate

Posted Date 11202000

Classification of Article 4715

Knowledge-base amp StatisticalAI Techniques

ClassifyPlace ina taxonomy

MetadataCatalog

Content Manager

Precise syndicationfiltering

fd

Article 4715 Metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Company Name: France Telecom, Equant

Ticker Symbol: FTE, ENT

Exchange: NYSE

Topic: Company News

Standard metadata

Semantic metadata

FTECompany AnalysisConference Calls

EarningsStock Analysis

NYSEMember Companies

Market NewsIPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taaleersquos Categorization amp Automatic Metadata Creation

Taalee Training

Set

Customer Training

Set ee ENTCompany AnalysisConference Calls

EarningsStock Analysis

Classification of Article 4715

Article Feed4715 RoutingDistribution

Map to another taxonomy

HP 49

Automatic Categorization amp Metadata Tagging (unstructured texttranscript of AV)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segmentwith Associated Text

Segment Description

SemanticMetadata

AutoCategorization

HP 50

Automatic Categorization amp Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization amp Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference: (Statistical) Information Retrieval; Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...); Logic Based (Description Logic); Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methodsCluster Analysis

LearningAI and Collab Filtering

Reference dataConcept-termsDictionaryThesaurusBy topicindustrysubjectdomain

Word or Phrase

OntologiesDomain Models

KnowledgeBaseBy Entities and Relationships

deeperunderstanding

HP 56

Open Directory Project (ODP) ClassificationTaxonomy amp Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes

Capture the semantics involved via domain characteristics

Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example Interrelated ontologies

LANDUSE

COMERCIAL

INDUSTRIALRURAL

RESIDENTIAL

AGRICULTURAL

MILITARYRECREATIONAL

LAND(SITE)

CULTIVATEDAREA

GREENLANDAREA LAND

BANK

ZONING

LANDFILLSITE

WASTEDISPOSAL

RECYCLING

HAZARDOUS

LANDFILLRESOURCE REC

SOLID SEWAGE

shredding

magneticseparation

screening

washing

NATURALDISASTER

EARTHQUAKE

causes

LANDSLIDE

VOLCANO

STORMFLOOD

FIRE

AVALANCHE

TSUNAMI

causes

causes

causes

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata enabledApplications

HP 63

Metadata Usage: Impact on Search & Query processing

Traditional queries based on keywords

Attribute-based queries

Content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction

Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related high-value information beyond what

was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content

Semantic Metadata describing entities (i.e. Company, Industry, etc.) and

Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata amp Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDISUGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage Search on football touchdown

Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline

Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline

Metadata from Typical Cataloging of Football

Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31 Pit 24

httpwwwnflcom

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional

Teams: Ravens, Steelers

Score: Bal 31, Pit 24

Players: Quandry Ismail, Tony Banks

Event: Touchdown

Produced by: NFL.com

Posted date: 2022000
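The contrast on this slide, keyword search returning interviews that merely mention a touchdown versus attribute search returning the touchdown itself, can be sketched as two small functions over the same catalog. The asset records below reuse titles and text from the slide, but the metadata attached to the interview and both function names are assumptions for illustration.

```python
# Tiny illustrative catalog; the "metadata" for the interview is assumed.
ASSETS = [
    {
        "title": "Baltimore 31 Pit 24",
        "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {"Players": ["Brian Griese"], "Event": "Interview"},
    },
]


def keyword_search(assets, *words):
    """Return titles of assets whose title or description contains every word."""
    wanted = [word.lower() for word in words]
    return [
        asset["title"]
        for asset in assets
        if all(word in (asset["title"] + " " + asset["description"]).lower() for word in wanted)
    ]


def attribute_search(assets, **criteria):
    """Return titles of assets whose metadata matches every field=value pair."""
    def matches(metadata):
        for field, wanted in criteria.items():
            value = metadata.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True

    return [asset["title"] for asset in assets if matches(asset["metadata"])]


print(keyword_search(ASSETS, "touchdown"))
print(attribute_search(ASSETS, Event="Touchdown"))
```

The keyword query matches both assets, while the attribute query on Event returns only the actual touchdown clip and can be combined with other fields such as Teams or Players.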

HP 72

Taaleersquos Semantic Search

Highly customizable, precise, and freshest A/V search

Context and domain-specific attributes. Uniform metadata for content from multiple sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries amp Hot Topics

PERSONALIZATION

3 Julia Roberts Collection

Movie Trailer Notting Hill

Trailer - Runaway Bride

Patrick

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sunhellip

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1 My Stock Portfolio

Microsoft suffers serious hack attack

Cisco Systems Inc

Analyst Safa Rashtchy on Yahoo

PeopleSoft Inc

ATampT Corp

2 My Football Fantasy Team

Gators Spurrier ready for big game

Techs Vick looks to become complete QB

Bucs excited about Hamilton

Jasper Sanks rumbles into the end zonehellip

Edwards explains reasons for leaving BYU morehellip

morehellip

morehellip

morehellip

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Coles security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II morehellip

HOT Topics

morehellip

morehellip

HP 80

Metadata Targeting

SemanticInteractive Targeting

Buy Al Pacino Videos

Buy Russell Crowe Videos

Buy Christopher Plummer Videos

Buy Diane Venora Videos

Buy Philip Baker Hall Videos

Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne

11/07 ON24 H&Q

11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst

HP 87

iTV Taaleersquos Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic Engine (TM)

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection

Processing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEG Decoder

Node = AVO Object

Create Scene Description Tree

GREAT USER EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Enhanced XML

Description

"Cisco Systems"

Node

Taalee Semantic Engine

"Cisco Systems"

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-rich Value-added Node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for

Extractor Agents

Content which does not contain the words the user asked for, but is about what he asked for

Value-added Metadata

Content the user did not think to ask for, but which he needs to know

Semantic Associations

+ +

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via

Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, the content cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
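The idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Taalee's patented extraction technology: the `KNOWLEDGE_BASE` dictionary, the relation names (`plays_for`, `member_of`), the `Asset` class, and the sample story are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical knowledge base of semantic associations: (subject, relation) -> object.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("New York Yankees", "member_of"): "American League",
}


@dataclass
class Asset:
    title: str
    text: str
    metadata: dict = field(default_factory=dict)


def extract_players(text):
    """Syntactic step: find known player names that literally occur in the text."""
    known = {subj for (subj, rel) in KNOWLEDGE_BASE if rel == "plays_for"}
    return sorted(name for name in known if name in text)


def enrich(asset):
    """Semantic step: follow associations to add metadata absent from the text."""
    players = extract_players(asset.text)
    teams = sorted({KNOWLEDGE_BASE[(p, "plays_for")] for p in players})
    leagues = sorted(
        {KNOWLEDGE_BASE[(t, "member_of")] for t in teams if (t, "member_of") in KNOWLEDGE_BASE}
    )
    asset.metadata.update({"Players": players, "Teams": teams, "League": leagues})
    return asset


def search(assets, field_name, value):
    """Attribute search over the (enriched) metadata, not over the raw text."""
    return [a.title for a in assets if value in a.metadata.get(field_name, [])]


story = enrich(Asset("Clemens strikes out ten", "Roger Clemens struck out ten batters."))
print(search([story], "Teams", "New York Yankees"))
```

The story text never mentions the Yankees, yet a search on the team finds it, because the team name was added as value-added metadata from the knowledge base.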

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset

Posted Date: Date of asset posting - Extracted automatically

Player Names: Names of players mentioned explicitly in the asset - Extracted automatically

Team Name: Name of team for which the player plays - Not mentioned explicitly in the asset - Value-added using Taalee's semantic relationships

League Name: Name of league to which the player's team belongs - Not mentioned explicitly in the asset - Value-added by Taalee's processing based on semantic associations

Sport: Name of sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways the users choose to interact.

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via

Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the term "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Internal Source 1

Research

Internal Source 2

External feeds/Web (e.g., Reuters)

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Story on Lucent

Story on Cisco

XCM-compliant metadata, XML or other format

Semantic Application

ASP/Enterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata-centric Content Management Architecture

Semantic Engine

World Model

Taalee Metabase

Third-party Content Mgmt and Syndication
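The four-step flow on this slide (story arrives, knowledge base is consulted for competitors, related stories are picked from other feeds) might look roughly like the sketch below. The `COMPETES_WITH` table, the story dictionaries, and the function names are hypothetical stand-ins; the slide does not describe how the Semantic Engine or World Model are actually implemented.

```python
# Hypothetical "competes with" associations; a real knowledge base would be far larger.
COMPETES_WITH = {
    "Cisco": {"Lucent"},
    "Lucent": {"Cisco"},
    "Commerce One": {"Ariba"},
    "Ariba": {"Commerce One"},
}


def competitors_of(companies):
    """Steps 2-3: consult the knowledge base for the competition of the given companies."""
    rivals = set()
    for company in companies:
        rivals |= COMPETES_WITH.get(company, set())
    return rivals - set(companies)


def semantically_related(story, feed):
    """Step 4: pick feed stories that mention a competitor of the story's companies."""
    rivals = competitors_of(story["companies"])
    return [item for item in feed if rivals & set(item["companies"])]


# Step 1: a story from an internal source, already tagged with company metadata.
internal_story = {"title": "Story on Cisco", "companies": ["Cisco"]}
external_feed = [
    {"title": "Story on Lucent", "companies": ["Lucent"]},
    {"title": "Story on IBM", "companies": ["IBM"]},
]

related = semantically_related(internal_story, external_feed)
print([item["title"] for item in related])
```

Note that the lookup works on company metadata attached to each story, so it depends on the extraction step having tagged the stories first.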

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner. Based on RDF + XML. Agent-readable tags.

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
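The rule above can be written as an executable predicate. The sketch below is one possible reading, with assumptions made explicit: `dateDifference` is taken to be in days and must be non-negative (the earthquake follows the test), `distance` is taken to be great-circle distance in miles (computed with the haversine formula), and the sample events are invented for illustration, not taken from the USGS or observatory sources.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, under the assumptions stated in the lead-in."""
    days = (quake["eventDate"] - test["eventDate"]).days
    near = distance_miles(
        test["latitude"], test["longitude"], quake["latitude"], quake["longitude"]
    ) < max_miles
    return 0 <= days < max_days and near


# Hypothetical events, for illustration only.
nuclear_test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_soon = {"eventDate": date(1970, 1, 11), "latitude": 36.5, "longitude": -117.0}
quake_late = {"eventDate": date(1970, 3, 15), "latitude": 36.5, "longitude": -117.0}

print(may_have_caused(nuclear_test, quake_soon))  # within 30 days and nearby
print(may_have_caused(nuclear_test, quake_late))  # too long after the test
```

Pairs of tests and earthquakes satisfying the rule can then be found by applying `may_have_caused` to every combination from the two sources.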

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900.

The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
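The grouping query on this slide can be sketched as follows. The ranges on the slide share their boundary values, so this sketch assumes half-open ranges (a magnitude of exactly 6.0 falls in 6-7) and treats ">9" as 9.0 and above; the sample earthquake records are invented.

```python
from collections import defaultdict

# Half-open magnitude ranges [low, high); the last one stands for the slide's ">9".
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def range_of(magnitude):
    """Return the range containing the magnitude, or None if it is below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly count of earthquakes per magnitude range over [first_year, last_year]."""
    totals = defaultdict(int)
    for year, magnitude in quakes:
        bucket = range_of(magnitude)
        if bucket is not None and first_year <= year <= last_year:
            totals[bucket] += 1
    years = last_year - first_year + 1
    return {bucket: totals[bucket] / years for bucket in RANGES}


# Hypothetical (year, magnitude) records.
quakes = [(1900, 5.9), (1900, 6.4), (1901, 6.1), (1901, 7.2), (1901, 4.0)]
averages = average_per_year(quakes, 1900, 1901)
print(averages[(6.0, 7.0)])
```

Running the same function over the years before and after a chosen date would give the comparison the slide draws on.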

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test was within a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata?
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol?
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)?
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 30

Application Examples for MPEG7

A few application examples are:

Digital libraries (image catalog, musical dictionary)

Multimedia directory services (e.g., yellow pages)

Broadcast media selection (radio channel, TV channel)

HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible content syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol?

Syndication protocol for communication between Syndicators and Subscribers

Metadata to define:

• roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

• format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)
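To make the split between content-independent (ICE) metadata and domain-specific metadata concrete, the sketch below parses a trimmed copy of the payload above with Python's standard `xml.etree.ElementTree`. The function name `split_metadata` and the returned dictionary layout are illustrative choices, not part of the ICE specification, and the XML declaration, DOCTYPE, and encoded image are omitted from the sample for brevity.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's payload (declaration, DOCTYPE, and image data omitted).
PAYLOAD = """\
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">(encoded image)</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
"""


def split_metadata(payload_xml):
    """Separate ICE protocol attributes from the domain vocabulary carried inside items."""
    root = ET.fromstring(payload_xml)
    protocol = {"payload": dict(root.attrib), "items": []}
    domain = []
    for item in root.iter("ice-item"):
        # Attributes of ice-item belong to the ICE vocabulary (content-independent).
        protocol["items"].append(dict(item.attrib))
        # Child elements of ice-item come from the industry-specific vocabulary.
        for child in item:
            domain.append({"element": child.tag, **child.attrib})
    return protocol, domain


protocol, domain = split_metadata(PAYLOAD)
print(protocol["payload"]["ice.version"])
print(domain[0]["title"])
```

Here the `comic-strip` element and its attributes are the domain-specific metadata, while everything prefixed `ice-` is handled by the syndication layer regardless of what the content is.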

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

Content Development/Management

Content Delivery

Application Content Management

Content Authoring

Digital Asset Management

Software Configuration Management

Document Process Management

Metadata Management

Recombination

Personalization

Edge Network Delivery

Streaming Media Delivery

Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...

Kansas State
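One simple way to reconcile the two models is a tag-name crosswalk. The sketch below pairs only the five tags shown on this slide; the `FGDC_TO_UDK` table and the `fgdc_to_udk` function are illustrative, and real FGDC and UDK records contain many more elements that do not map one-to-one.

```python
# Crosswalk limited to the tag pairs shown on the slide.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to their UDK equivalents; unmapped tags are reported separately."""
    mapped = {FGDC_TO_UDK[tag]: value for tag, value in record.items() if tag in FGDC_TO_UDK}
    unmapped = sorted(tag for tag in record if tag not in FGDC_TO_UDK)
    return mapped, unmapped


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

udk_record, leftovers = fgdc_to_udk(fgdc_record)
print(udk_record["Topic"])
```

A flat rename like this handles only differences in tag names; differences in value vocabularies or structure would need a richer mapping, such as a shared ontology.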

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services

Taalee Content Applications

Where is the content? Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate, Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure
Exchange/Feed Handlers
Crawlers
Screen Scrapers
Bots
Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
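The layout rule `Date => day month int ',' int` translates naturally into a regular expression. The sketch below is one assumed reading of the rule (weekday name, month name, day of month, optional comma, four-digit year); the names `DATE_RULE` and `extract_date` are illustrative.

```python
import re
from datetime import date

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Date => day month int ',' int   (the comma is treated as optional here)
DATE_RULE = re.compile(
    r"\b(?P<day>" + DAYS + r")\s+(?P<month>" + "|".join(MONTHS) + r")"
    r"\s+(?P<dom>\d{1,2})\s*,?\s+(?P<year>\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match["month"]) + 1
    return date(int(match["year"]), month_number, int(match["dom"]))


report = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
print(extract_date(report))
```

A purely syntactic rule like this finds the date reliably in reports that follow the layout, but it knows nothing about what the date refers to, which is the limitation the semantic approaches address.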

Traditional Text Categorization

Statistical/AI Techniques

Classify: Place in a taxonomy

feed

Customer Training Set

Routing/Distribution

Customer Article Feed

4715

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Classification of Article 4715

Knowledge-base & Statistical/AI Techniques

Classify: Place in a taxonomy

Metadata Catalog

Content Manager

Precise syndication/filtering

Article 4715 Metadata

Standard metadata:
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata:
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee's Categorization & Automatic Metadata Creation

Taalee Training Set

Customer Training Set

ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis

Classification of Article 4715

Article Feed 4715

Routing/Distribution

Map to another taxonomy

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference:
Statistical (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods/Cluster Analysis

Learning/AI and Collab. Filtering

Reference data/Concept-terms, Dictionary/Thesaurus: by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

Knowledge Base: by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example Interrelated ontologies

LANDUSE
COMMERCIAL
INDUSTRIAL
RURAL
RESIDENTIAL
AGRICULTURAL
MILITARY
RECREATIONAL

LAND (SITE)
CULTIVATED AREA
GREENLAND AREA
LAND BANK
ZONING
LANDFILL SITE

WASTE DISPOSAL
RECYCLING
HAZARDOUS
LANDFILL
RESOURCE REC
SOLID
SEWAGE
shredding
magnetic separation
screening
washing

NATURAL DISASTER
EARTHQUAKE
LANDSLIDE
VOLCANO
STORM
FLOOD
FIRE
AVALANCHE
TSUNAMI
causes (edge label, appears four times in the diagram)

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.


Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords
attribute-based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology – ODP based() the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing

• Context of the Content

• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute, and Content-Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Virage: search on "football touchdown"

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31 Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
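The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Asset` class, the two search functions, and the abbreviated sample records are hypothetical and are not Taalee's or Virage's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A catalogued media asset: free text plus structured metadata (hypothetical model)."""
    title: str
    text: str
    metadata: dict = field(default_factory=dict)


def keyword_search(assets, words):
    """Return assets whose title or text contains every query word."""
    words = [w.lower() for w in words]
    return [a for a in assets
            if all(w in (a.title + " " + a.text).lower() for w in words)]


def attribute_search(assets, **criteria):
    """Return assets whose metadata matches every attribute=value pair."""
    def matches(asset):
        for key, wanted in criteria.items():
            value = asset.metadata.get(key)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a for a in assets if matches(a)]


# Sample records abbreviated from the slide; the interviews carry no semantic metadata.
assets = [
    Asset("Jimmy Smith Interview Part Seven",
          "Jimmy Smith explains his philosophy on showboating"),
    Asset("Brian Griese Interview Part Four",
          "Brian Griese talks about the first touchdown he ever threw"),
    Asset("Baltimore 31 Pit 24",
          "Tony Banks throws his third long touchdown",
          {"League": "Professional", "Teams": ["Ravens", "Steelers"],
           "Players": ["Tony Banks"], "Event": "Touchdown"}),
]

keyword_hits = keyword_search(assets, ["touchdown"])
attribute_hits = attribute_search(assets, Event="Touchdown", Teams="Ravens")
```

The keyword query also returns the interview that merely mentions a touchdown, while the attribute query returns only the asset whose metadata records a touchdown event involving the Ravens.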

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…

2 My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…

3 Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…

4 Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1 Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…

2 Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…

3 Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter Search ConsolidatePersonalize ArchiveLicensing Syndication

Production SupportProduction SupportSony

Categorize

Catalog

Integrate

CollectionCollection ProcessingProcessing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon

At least 60 people died in this needless fire senior local official Karimu Alabi said

Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses

At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)

Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)

-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) - 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for

Extractor Agents

Content which does not contain the words the user asked for but is about what he asked for

Value-added Metadata

Content the user did not think to ask for but which he needs to know

Semantic Associations

+ +

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via

Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
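A minimal Python sketch of this idea follows. Everything in it is assumed for illustration: the `KNOWLEDGE_BASE` dictionary, the naive name-matching extractor, and the sample story text are hypothetical stand-ins, not Taalee's patented extraction technology.

```python
# Hypothetical knowledge base: entity name -> attributes implied by semantic associations.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Teams": "New York Yankees"},
    "Jamal Anderson": {"League": "NFL", "Teams": "Atlanta Falcons"},
}


def extract_players(text, known_entities):
    """Naive extractor: report every known entity name that appears verbatim in the text."""
    return [name for name in known_entities if name in text]


def enrich(text, knowledge_base):
    """Build metadata from what the text says, then add what the knowledge base implies."""
    metadata = {"Players": extract_players(text, knowledge_base)}
    for player in metadata["Players"]:
        for attribute, value in knowledge_base[player].items():
            values = metadata.setdefault(attribute, [])
            if value not in values:
                values.append(value)  # value-added: not present in the source text
    return metadata


# Invented story text that never mentions the team by name.
story = "Roger Clemens strikes out ten in a complete-game win"
meta = enrich(story, KNOWLEDGE_BASE)
```

A search on `Teams = "New York Yankees"` can now match this story even though the word "Yankees" never occurs in it, which is the point the slide makes.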

HP 96

Guided Demo for Value-Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 – Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 – Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content – Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of the content provider that produced the asset
Posted Date: date of asset posting – extracted automatically
Player Names: names of players mentioned explicitly in the asset – extracted automatically
Sport: name of sport
Team Name: name of the team for which the player plays – not mentioned explicitly in the asset – value-added using Taalee's semantic relationships
League Name: name of the league to which the player's team belongs – not mentioned explicitly in the asset – value-added by Taalee's processing based on semantic associations

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways the users choose to interact

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via

Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
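One way to picture such associations is as subject-relation-object triples that an application can traverse. The Python sketch below assumes a tiny hand-written triple list and placeholder story titles; the relation names and the `stories_with_associations` helper are hypothetical and say nothing about how the Taalee Semantic Engine is actually built.

```python
# Hypothetical triples taken from the Commerce One example on the slide.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]

# Relations that hold in both directions (an assumption of this sketch).
SYMMETRIC = {"competes_with"}

# Placeholder catalog: entity -> story titles (invented for illustration).
STORIES = {
    "Commerce One": ["Commerce One story"],
    "Ariba": ["Ariba story"],
}


def related(entity, relation, triples):
    """Return entities linked to `entity` by `relation`, honoring symmetric relations."""
    found = []
    for subject, rel, obj in triples:
        if rel != relation:
            continue
        if subject == entity:
            found.append(obj)
        elif obj == entity and relation in SYMMETRIC:
            found.append(subject)
    return found


def stories_with_associations(entity):
    """Return what was asked for plus stories about the entity's competitors."""
    result = {"asked_for": STORIES.get(entity, []), "competition": []}
    for competitor in related(entity, "competes_with", TRIPLES):
        result["competition"].extend(STORIES.get(competitor, []))
    return result


dashboard = stories_with_associations("Commerce One")
```

A keyword index alone could return only the `asked_for` list; the `competition` list exists because the relationship between the two companies is stored explicitly.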

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Internal Source 1Research

Internal Source 2

External feedsWeb(eg Reuters)

1 Cisco story from PW Source 1 passed on to add semantic associations

2 Consults KnowledgeBase for Cisco's competition

3 Returns result: Lucent is a competitor of Cisco

4 Lucent story from external feeds picked for publishing as "semantically related" to the Cisco story – passed on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata XML or other format

SemanticApplication

ASPEnterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata centricContent Management Architecture

SemanticEngine

World Model

TaaleeMetabase

Third-partyContent Mgmt

AndSyndication

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

SEC, EPA

Regulations: Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic SearchPersonalization etc

A vision for future

Semantic Web Complex Relationships and Knowledge Discovery

Eg InfoQuilt project at LSDIS Lab Univ of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition. Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable Tags

www.daml.org

DAML Example

Source httpwwwzdnetcompcweekstoriesjumps04270243294600html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support – OIL, DAML-O

Schema Layer

Definition of Vocabulary: RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL – as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
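To show what a program can read out of such a class definition, here is a small Python sketch using the standard-library XML parser. It makes several assumptions: the fragment is wrapped in an `rdf:RDF` root, the `rdf:type` line is omitted for brevity, the `rdfs` namespace is the current W3C one, and the `oil` namespace URI is a placeholder rather than the real OIL schema address.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OIL = "http://example.org/oil#"  # placeholder namespace, an assumption of this sketch

DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:oil="{OIL}">
  <rdfs:Class rdf:ID="herbivore">
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>"""


def describe_classes(xml_text):
    """Collect, per class, its named superclasses and the classes it is declared NOT to be."""
    root = ET.fromstring(xml_text)
    classes = {}
    for cls in root.findall(f"{{{RDFS}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        info = {"subclass_of": [], "subclass_of_not": []}
        for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
            target = sub.get(f"{{{RDF}}}resource")
            if target is not None:
                info["subclass_of"].append(target.lstrip("#"))
            for operand in sub.findall(f"{{{OIL}}}NOT/{{{OIL}}}hasOperand"):
                info["subclass_of_not"].append(
                    operand.get(f"{{{RDF}}}resource").lstrip("#"))
        classes[name] = info
    return classes


summary = describe_classes(DOC)
```

The result records that a herbivore is an animal and is not a carnivore, which is the kind of statement plain RDF Schema cannot express without the OIL extension.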

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources(USGS NEIC)

Nuclear Test Sources(Oklahoma Observatory etc)

Nuclear Test May Cause Earthquakes

Is it really true

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
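The rule above can be written as an executable predicate. The Python sketch below is an illustration only: it assumes `dateDifference` is measured in days, `distance` is a great-circle distance in miles, and that the earthquake must not precede the test (the prose requires this, the rule itself does not state it). The sample events are invented, and none of this is the InfoQuilt implementation.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


def distance(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule, plus the assumption that the quake follows the test."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


# Invented sample events for illustration.
test_event = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_after = {"eventDate": date(1970, 1, 10), "latitude": 36.5, "longitude": -115.5}
quake_before = {"eventDate": date(1969, 12, 20), "latitude": 36.5, "longitude": -115.5}
```

Here `may_have_caused(test_event, quake_after)` holds, while the earthquake dated before the test is rejected.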

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lies within a radius of 10000 miles of the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: httpcgiomgorgcgi-bindocad99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: httpcgiomgorgcgi-bindocad9910-02, httpcgiomgorgcgi-bindocad99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 31

Information and Content Exchange (ICE)

Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax

Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.

Status: latest spec version 1.1, May 2000; submitted to W3C for review

Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …

Web Site: http://www.icestandard.org

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define

roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver

format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; httpwwwinternetweekcomebizapps01ebiz050701-3htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "httpicedtd">
<ice-payload payload-id="ipl-80a56cfe"
    timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
          subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC …(ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific

metadata)
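The relationship between a subscription, its collection, and the packages that update it can be sketched in Python. The class names, the `"add"`/`"remove"` operation labels, and the sample values are assumptions made for this illustration; they model the definitions on the slide and are not the element names or processing rules of the ICE specification.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """A delivery of commands that update a collection (hypothetical model)."""
    package_id: str
    commands: list  # (operation, item_id, content) tuples


@dataclass
class Subscription:
    """An agreement between a subscriber and a syndicator, holding the current collection."""
    subscriber: str
    syndicator: str
    collection: dict = field(default_factory=dict)
    applied: list = field(default_factory=list)

    def apply(self, package):
        """Apply each command in order, then record the package as processed."""
        for operation, item_id, content in package.commands:
            if operation == "add":
                self.collection[item_id] = content
            elif operation == "remove":
                self.collection.pop(item_id, None)
            else:
                raise ValueError(f"unknown operation: {operation}")
        self.applied.append(package.package_id)


subscription = Subscription(subscriber="subscriber-site", syndicator="syndicator-site")
subscription.apply(Package("pkg-1", [("add", "4321", "Cartoon")]))
after_add = dict(subscription.collection)
subscription.apply(Package("pkg-2", [("remove", "4321", None)]))
```

The protocol-level bookkeeping (which packages were applied, in what order) is independent of what the items contain, which mirrors the split between content-independent and domain-specific metadata.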

HP 36

XCM (eXtended Content Management)

a framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM (eXtended Content Management)

Figure labels:
Content Development Management
Content Delivery
Application Content Management
Content Authoring
Digital Asset Management
Software Configuration Management
Document Process Management
Metadata Management
Recombination
Personalization
Edge Network Delivery
Streaming Media Delivery
Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models use different tag names for the same data in the same GIS domain (example: Kansas State)

FGDC Metadata Model | UDK Metadata Model | Value
Theme keywords | Search terms | digital line graph, hydrography, transportation
Title | Topic | Dakota Aquifer
Online linkage | Adress Id | http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method | Measuring Techniques | Vector
Horizontal Coordinate System Definition | Co-ordinate System | Universal Transverse Mercator
… | … | …
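The tag-name mismatch between the two models can be bridged with a simple crosswalk. The following is a minimal sketch, assuming flat records keyed by tag name; the `fgdc_to_udk` function is a hypothetical helper, and only the five tag pairs shown on the slide are mapped.

```python
# Tag-name crosswalk taken from the slide's FGDC/UDK comparison.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to their UDK equivalents; unmapped tags are kept unchanged."""
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Direct Spatial Reference Method": "Vector",
}
udk_record = fgdc_to_udk(fgdc_record)
```

A renaming table like this handles only syntactic differences; where two models divide a concept differently, a shared ontology is needed to relate the tags.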

HP 39

Different views of Metadata

Domain Independent: Specifications (RDF), Frameworks/Infrastructures (XCM)

Application Specific: ICE

Media Specific: MPEG-7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services; Taalee Content Applications

Where is the content? Whose is it?
Produce, Aggregate
Catalog, Index

What other content is it related to?
Integrate, Syndicate

What is the right content for this user?
Personalize

What is the best way to monetize this interaction?
Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created


HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+ Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Sources: Digital Maps; Nexis, UPI, AP; Documents; Digital Audios; Data Stores; Digital Videos; Digital Images

HP 46

Extracting a Text Document: Syntactic Approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
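A layout rule of this kind maps naturally onto a regular expression. The sketch below is one possible reading of the rule (day name, month name, integer, comma, integer); the pattern, the group names, and the `extract_date` helper are assumptions made for illustration, and the commas are treated as optional because they are often lost in transcription.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"

# Date => day month int ',' int   (commas optional in this sketch)
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the date fields of the first match, or None if the layout rule does not apply."""
    match = DATE_RULE.search(text)
    return match.groupdict() if match else None


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1 1997 - 0530 MDT"
date_fields = extract_date(header)
```

Such rules are purely syntactic: they find a date in the expected position but know nothing about what the report is about, which is the limitation the later slides address with semantic metadata.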

Traditional Text Categorization

Figure: a customer article feed (article 4715) is classified (placed in a taxonomy) by statistical/AI techniques trained on a customer training set, then passed on for routing/distribution.

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Classification of Article 4715

Taalee's Categorization & Automatic Metadata Creation

Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite

Figure: the article feed (article 4715) is classified (placed in a taxonomy) using knowledge-base & statistical/AI techniques, trained on a Taalee training set and a customer training set. The classification can be mapped to another taxonomy. Results go to the metadata catalog and the content manager for routing/distribution and precise syndication/filtering.

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
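The difference between standard and semantic metadata can be illustrated with a small knowledge-base lookup. This is a minimal sketch, not Taalee's actual method: the `KNOWLEDGE_BASE` dict, the `enrich` function, and the one-line article text are hypothetical, and exact substring matching stands in for real entity extraction.

```python
# Hypothetical mini knowledge base: company name -> known facts (values from the slide).
KNOWLEDGE_BASE = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}


def enrich(article_text, standard_metadata):
    """Return the standard metadata plus semantic metadata for companies found in the text."""
    companies = [name for name in KNOWLEDGE_BASE if name in article_text]
    semantic = {
        "Company Name": companies,
        "Ticker Symbol": [KNOWLEDGE_BASE[c]["ticker"] for c in companies],
        "Exchange": sorted({KNOWLEDGE_BASE[c]["exchange"] for c in companies}),
    }
    return {**standard_metadata, **semantic}


# Placeholder sentence invented for this sketch; it is not the text of article 4715.
article = "France Telecom and Equant were both mentioned in this article."
standard = {"Feed Source": "iSyndicate", "Posted Date": "11/20/2000"}
metadata = enrich(article, standard)
```

The ticker symbols and the exchange never appear in the article text; they come from the knowledge base, which is what makes the resulting metadata usable for precise filtering and syndication.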

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT. THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Figure labels: Video Segment with Associated Text; Segment Description; Semantic Metadata; Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Figure labels: Video with Editorialized Text on the Web; Auto Categorization; Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Figure labels: Text from Bloomberg; Auto Categorization; Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Word or Phrase: Statistical methods, Cluster Analysis; Learning/AI and Collaborative Filtering

Reference data, Concept-terms: Dictionary, Thesaurus; by topic/industry/subject/domain

Ontologies, Domain Models

KnowledgeBase: by Entities and Relationships

(Moving down this list gives deeper understanding.)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

Figure labels, grouped by ontology:
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; "causes" links connect some of the disaster types
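An ontology of this kind can be held as a set of typed links between concepts. The sketch below is a minimal illustration in Python; the `Ontology` class is hypothetical, and the specific "causes" links are assumptions chosen for the example, because the arrows of the original figure did not survive in the transcript.

```python
from collections import defaultdict


class Ontology:
    """Concepts connected by named relationships (hypothetical minimal structure)."""

    def __init__(self):
        self.edges = defaultdict(set)  # (subject, relation) -> set of objects

    def add(self, subject, relation, obj):
        self.edges[(subject, relation)].add(obj)

    def related(self, subject, relation):
        """All concepts reachable from subject by following relation transitively."""
        seen, stack = set(), [subject]
        while stack:
            node = stack.pop()
            for nxt in self.edges.get((node, relation), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


onto = Ontology()
for kind in ("EARTHQUAKE", "LANDSLIDE", "VOLCANO", "FLOOD", "TSUNAMI"):
    onto.add(kind, "is_a", "NATURAL DISASTER")

# Assumed "causes" links, for illustration only.
onto.add("EARTHQUAKE", "causes", "LANDSLIDE")
onto.add("EARTHQUAKE", "causes", "TSUNAMI")
onto.add("TSUNAMI", "causes", "FLOOD")

consequences = onto.related("EARTHQUAKE", "causes")
```

Keeping the relationship name as part of the key lets the same structure hold both the taxonomy (is_a) and the domain relationships (causes), which is what distinguishes an ontology from a plain category tree.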

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords
attribute-based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based (?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" Content can only be "found" if the user enters a keyword that exists within it.

Figure: "Normal" Content + related metadata and relationships = Intelligent Content

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry) and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search.
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute, and Content-Based Access

The VisualHarness system at LSDIS, UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from Typical Cataloging of Football Assets
Virage search on "football touchdown":
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline

Taalee Metadata on Football Assets
Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
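The contrast between the two search styles can be made concrete with a few lines of Python. This is a minimal sketch using the assets named on the slide; the record layout and the `keyword_search` and `attribute_search` functions are hypothetical, and the interviews are given empty metadata to stand for typical cataloging.

```python
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "description": "Ismail and Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional", "Teams": ["Ravens", "Steelers"],
                  "Players": ["Quandry Ismail", "Tony Banks"], "Event": "Touchdown"}},
]


def keyword_search(assets, word):
    """Match the word anywhere in the title or description text."""
    word = word.lower()
    return [a["title"] for a in assets
            if word in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets, criteria):
    """Match only assets whose metadata fields contain every requested value."""
    def matches(meta):
        for field, wanted in criteria.items():
            value = meta.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a["title"] for a in assets if matches(a["metadata"])]


by_keyword = keyword_search(ASSETS, "touchdown")
by_attribute = attribute_search(ASSETS, {"Event": "Touchdown", "Teams": "Ravens"})
```

The keyword query also returns an interview that merely mentions a touchdown, while the attribute query returns only the asset that actually shows one.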

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain-Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more…

2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…

3 Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more…

4 Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more…

HOT Topics

1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…

2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more…

3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web: Extreme Personalization

Figure labels:
Realtime Feeds
Web Sites and Pages
Content Databases
Time-Shifted Content Aggregator
Interests, Preferences
Semantic Engine™
Structured Hi-Quality Semantic Metabase
Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

CSCO Analysis
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
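Semantic filtering for a small screen can be thought of as choosing the most important metadata fields for a given content type. The following is a minimal sketch under stated assumptions: the `FIELD_PRIORITY` table and the `filter_for_screen` function are hypothetical, and the priority order is invented for illustration rather than taken from Taalee's product.

```python
# Hypothetical priority of metadata fields per content type (most important first).
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst", "rating", "company"],
    "Earnings": ["date", "company", "source"],
}


def filter_for_screen(asset, max_fields):
    """Keep only the highest-priority metadata fields that fit the display."""
    fields = FIELD_PRIORITY.get(asset["type"], sorted(asset["metadata"]))
    chosen = [f for f in fields if f in asset["metadata"]][:max_fields]
    return {f: asset["metadata"][f] for f in chosen}


call = {
    "type": "Analyst Call",
    "metadata": {"company": "CSCO", "date": "11/07", "source": "ON24",
                 "analyst": "H&Q", "rating": "Strong Buy"},
}
wireless_view = filter_for_screen(call, max_fields=3)
full_view = filter_for_screen(call, max_fields=10)
```

The same asset record serves both displays; only the number of fields shown changes with the device.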

HP 87

iTV: Taalee's Extreme Personalization

Figure labels:
Content Provider (DBS, DISH, Wink, AOL-TV)
Content "Programs"
Semantic Engine™
Metadata-Tagged Content
Structured Hi-Quality Semantic Metabase
Immediate Interests, Preferences
Personalized Content Capsules
Redirects and Programming

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Figure labels:
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize, Catalog, Integrate
Collection, Processing
Network Content, Affiliate Feeds, Public Sources
Rich Data Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

Clip listings (duration - date - source):
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

Figure labels:
Video
MPEG Encoder (MPEG-2/4/7)
Create Scene Description Tree
Scene Description Tree (Node = AVO Object)
Enhanced Digital Cable
MPEG Decoder
Retrieve Scene Description Track
Taalee Semantic Engine
Enhanced XML Description
Node: "Cisco Systems"
Object Content Information (OCI)
Metadata-rich, Value-added Node
GREAT USER EXPERIENCE

Node metadata:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content: Usage

Extractor Agents: Content which does contain the words the user asked for

+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: Content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
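The Roger Clemens example can be sketched as a knowledge-base lookup that adds fields the story itself never mentions. This is a minimal illustration, not Taalee's implementation: the `PLAYS_FOR` and `TEAM_SPORT` tables, the `add_value` and `matches` functions, and the story title are all hypothetical.

```python
# Hypothetical knowledge-base fragment built from the slide's example.
PLAYS_FOR = {"Roger Clemens": "New York Yankees"}
TEAM_SPORT = {"New York Yankees": "Baseball"}


def add_value(metadata):
    """Add team and sport metadata inferred from the people named in the asset."""
    enriched = dict(metadata)
    teams = sorted({PLAYS_FOR[p] for p in metadata.get("People", []) if p in PLAYS_FOR})
    sports = sorted({TEAM_SPORT[t] for t in teams if t in TEAM_SPORT})
    if teams:
        enriched["Teams"] = teams
    if sports:
        enriched["Sport"] = sports
    return enriched


def matches(metadata, query):
    """Case-insensitive match of the query against any metadata value."""
    query = query.lower()
    return any(query in str(value).lower() for value in metadata.values())


# Placeholder title invented for this sketch.
story = {"Title": "Clemens pitches complete game", "People": ["Roger Clemens"]}
enriched_story = add_value(story)
```

Before enrichment, a search for "Yankees" misses the story; after enrichment it matches, because the team name was added from the knowledge base rather than from the text.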

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Metadata fields of a Rich Media Sports Asset:
Posted Date: Date of asset posting. Extracted automatically.
Producer Name: Name of content provider that produced the asset.
Player Names: Names of players mentioned explicitly in the asset. Extracted automatically.
Team Name: Name of team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee's semantic relationships.
League Name: Name of league to which the player's team belongs. Not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations.
Sport: Name of sport.

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
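Semantic associations of this kind are often stored as subject-relation-object triples. The sketch below is a hedged illustration in Python: the `TRIPLES` list encodes only the facts stated on the slide, while the `competitors` and `related_stories` functions and the placeholder story records are hypothetical.

```python
# Facts from the slide, written as (subject, relation, object) triples.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def competitors(company):
    """Companies linked by competes_with; the relation is treated as symmetric here."""
    found = set()
    for subject, relation, obj in TRIPLES:
        if relation == "competes_with":
            if subject == company:
                found.add(obj)
            elif obj == company:
                found.add(subject)
    return found


def related_stories(company, stories):
    """Stories about the company's competitors, which the user did not ask for explicitly."""
    rivals = competitors(company)
    return [s["headline"] for s in stories if rivals & set(s["companies"])]


# Placeholder story records, not real news items.
STORIES = [
    {"headline": "Story A", "companies": ["Commerce One"]},
    {"headline": "Story B", "companies": ["Ariba"]},
    {"headline": "Story C", "companies": ["Cisco"]},
]
related = related_stories("Commerce One", STORIES)
```

A keyword index would return only Story A for "Commerce One"; following the competes_with link is what brings in Story B as semantically related content.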

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)
Extractor Agent 1, Extractor Agent 2, Extractor Agent 3
Semantic Engine, World Model, Taalee Metabase
Semantic Application (ASP/Enterprise hosted)
Third-party Content Management and Syndication
XCM-compliant metadata, XML, or other format

1 Cisco story from PW Source 1 passed on to add semantic associations
2 Consults KnowledgeBase for Cisco's competition
3 Returns result: Lucent is a competitor of Cisco
4 Lucent story from external feeds picked for publishing as "semantically related" to the Cisco story; passed on to Dashboard

Story on Cisco; Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology, Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough
we need a "logic layer" on top of RDF
some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable Tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of the Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
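Read as a definition, the OIL fragment says that a herbivore is an animal that is not a carnivore. The Python sketch below illustrates that reading with plain sets. It is a deliberate simplification: it assumes a closed world, where a class that is not asserted is taken to be false, whereas description logics reason under an open-world assumption. The `is_herbivore` function and the sample individuals are hypothetical.

```python
def is_herbivore(asserted_classes):
    """Closed-world reading of: herbivore = animal AND NOT carnivore."""
    return "animal" in asserted_classes and "carnivore" not in asserted_classes


# Hypothetical individuals with the classes asserted for each.
individuals = {
    "cow": {"animal"},
    "lion": {"animal", "carnivore"},
    "fern": {"plant"},
}
herbivores = sorted(name for name, classes in individuals.items() if is_herbivore(classes))
```

A description-logic reasoner would instead derive class membership from the axioms themselves, and could also detect inconsistencies such as an individual asserted to be both a herbivore and a carnivore.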

DAML and OIL - Evolving towards the Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

    NuclearTest Causes Earthquake <=
        dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
        AND distance( NuclearTest.latitude, NuclearTest.longitude,
                      Earthquake.latitude, Earthquake.longitude ) < 10000
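The rule above can be sketched in Python as a plain predicate. This is a minimal illustration, not the InfoQuilt implementation: the `Event` class, the haversine distance, the choice of miles as the unit, and the requirement that the earthquake not precede the test are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # assumption: the 10000 threshold is in miles


@dataclass
class Event:
    """Hypothetical record shared by nuclear tests and earthquakes."""
    event_date: date
    latitude: float
    longitude: float


def date_difference(test: Event, quake: Event) -> int:
    """Days from the test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles, using the haversine formula."""
    p1, p2 = radians(lat1), radians(lat2)
    a = sin((p2 - p1) / 2) ** 2 + cos(p1) * cos(p2) * sin(radians(lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def may_cause(test: Event, quake: Event, max_days: int = 30, max_miles: float = 10000) -> bool:
    """NuclearTest Causes Earthquake: close in time (quake not earlier) and in space."""
    days = date_difference(test, quake)
    near = distance(test.latitude, test.longitude, quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


def candidate_pairs(tests, quakes):
    """All (test, earthquake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if may_cause(t, q)]
```

Keeping the thresholds as parameters makes it easy to try a tighter notion of "nearby" than the 10000 used on the slide.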

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
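A rough sketch of this grouping query, assuming earthquakes arrive as hypothetical `(year, magnitude)` pairs. The bin boundaries follow the slide; treating each range as including its lower bound (so a magnitude of exactly 9.0 lands in the last bin) is an assumption.

```python
from collections import Counter

# (lower bound inclusive, upper bound exclusive, label) - boundary handling is assumed
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the range containing the magnitude, or None if below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year=1900, last_year=2000):
    """Average number of earthquakes per year for each magnitude range."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for _, _, label in BINS}
```

Running the same function over the years before and after 1945 would give the per-range comparison the slides describe.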

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted, and the test was conducted within a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 32

What is the ICE Protocol

Syndication Protocol for communication between Syndicators and Subscribers

Metadata to define
• roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
• format and method of content exchange (e.g., sequenced packages, pull vs. push model)

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE
• establishes and manages the syndication
• delivers data
• logs events
=> content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook, version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

    <?xml version="1.0"?>
    <!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
    <ice-payload payload-id="ipl-80a56cfe"
                 timestamp="05-15-2001T11:00:01" ice.version="1.0">
      <ice-response response-id="irp-20010515181600">
        <ice-item-group group-id="grp-8610">
          <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                    filename="demo.gif" content-type="application/xml">
            <comic-strip title="Looney City" author="Amito Pateru"
                         copyright="Taalee Makeups" pubdate="20010515">
              PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
            </comic-strip>
          </ice-item>
        </ice-item-group>
      </ice-response>
    </ice-payload>

The comic-strip element is the content (domain-specific metadata)
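To make the split between content-independent and domain-specific metadata concrete, here is a small sketch that parses a payload like the one above with Python's standard `xml.etree.ElementTree`. The sample is a trimmed copy of the slide's payload (DOCTYPE omitted, image data shortened), and `split_metadata` is a hypothetical helper, not part of any ICE toolkit.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's payload; DOCTYPE left out and image data shortened.
SAMPLE = """<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">PdXIWZQ8</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """For each ice-item, return (ICE envelope attributes, domain-specific elements)."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)  # content-independent: ids, filename, content type
        domain = [
            {"element": child.tag, **child.attrib}
            for child in item
            if not child.tag.startswith("ice-")  # anything outside the ICE vocabulary
        ]
        items.append((envelope, domain))
    return items
```

The envelope attributes are what a syndication system needs to route and log the item; the `comic-strip` attributes only make sense to an application that knows the comics vocabulary.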

HP 36

XCM (eXtended Content Management)

a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are

Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

[Diagram] Segments: Content Development/Management; Application Content Management; Content Delivery

[Diagram] Components shown: Content Authoring; Digital Asset Management; Software Configuration Management; Document Process Management; Metadata Management; Recombination; Personalization; Edge Network Delivery; Streaming Media Delivery; Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (Kansas State)

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
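One common way to reconcile such models is to map each model's tag names onto a shared vocabulary. The sketch below assumes a hypothetical common vocabulary (`keywords`, `title`, and so on) invented for this example; only the FGDC and UDK tag names come from the slide.

```python
# Tag names are from the slide; the common field names are illustrative assumptions.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}

UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Address Id": "url",
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename the tags of one metadata record into the common vocabulary."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": "digital line graph, hydrography, transportation",
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
}
udk_record = {
    "Search terms": "digital line graph, hydrography, transportation",
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
}
```

After normalization the two records compare equal, so a single query over `title` or `keywords` reaches data described in either model.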

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata
• Application Specific: ICE
• Media Specific: MPEG7, VoiceXML
• Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services; Taalee Content Applications

Where is the content? Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate, Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+ Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis/UPI/AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr... The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
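The layout rule above (day name, month name, integer, comma, integer) maps naturally onto a regular expression. The following is a sketch of that syntactic approach, assuming English day and month names; the commas are optional so that punctuation-stripped text still matches. `extract_date` is a hypothetical helper name.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int   (commas made optional for robustness)
DATE_RE = re.compile(
    r"\b(?P<day>" + "|".join(DAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+"
    r"(?P<year>\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match.group("month")) + 1
    return date(int(match.group("year")), month_number, int(match.group("dom")))
```

A purely syntactic extractor like this finds the date but knows nothing about what the report is about, which is the limitation the following slides address.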

Traditional Text Categorization

Statistical/AI Techniques

Classify: place in a taxonomy

Customer Training Set

Routing/Distribution

Customer Article Feed: Article 4715

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Classification of Article 4715

Knowledge-base & Statistical/AI Techniques

Classify: place in a taxonomy

Metadata Catalog

Content Manager

Precise syndication/filtering

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Taxonomy placement
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee's Categorization & Automatic Metadata Creation

Taalee Training Set

Customer Training Set

ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis

Classification of Article 4715

Article Feed: Article 4715

Routing/Distribution

Map to another taxonomy

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text From Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
• (Statistical) (Information Retrieval)
• Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
• Logic Based (Description Logic)
• Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis

Learning/AI and Collaborative Filtering

Reference data: Concept-terms, Dictionary, Thesaurus, by topic/industry/subject/domain (Word or Phrase)

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

(toward deeper understanding)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

• Standardize meaning, description, representation of involved attributes
• Capture the semantics involved via domain characteristics
• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes
• Attributes
• Domain Rules
• Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: four interrelated ontologies; node labels as recovered from the diagram]

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI (several nodes linked by "causes" relationships)

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

• traditional queries based on keywords
• attribute-based queries
• content-based queries

HP 64

Oingo.com

• Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search
• Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDISUGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage Search on football touchdown

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline...

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline...

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
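The contrast on this slide can be sketched as two small search functions over the same assets: one matches words in the free text, the other matches typed metadata fields. The asset records below are modeled on the slide's examples; the `Event` value given to the interview asset is an assumption made for illustration, and neither function represents Virage's or Taalee's actual search.

```python
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "description": "Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {"League": "Professional", "Teams": ["Ravens", "Steelers"],
                     "Players": ["Qadry Ismail", "Tony Banks"], "Event": "Touchdown"},
    },
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        # Assumed metadata for illustration; the slide shows none for this asset.
        "metadata": {"Players": ["Brian Griese"], "Event": "Interview"},
    },
]


def keyword_search(assets, *words):
    """Titles of assets whose title or description contains every word."""
    hits = []
    for asset in assets:
        text = f"{asset['title']} {asset['description']}".lower()
        if all(word.lower() in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Titles of assets whose metadata fields contain every requested value."""
    hits = []
    for asset in assets:
        matched = True
        for field, wanted in criteria.items():
            value = asset["metadata"].get(field, [])
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                matched = False
        if matched:
            hits.append(asset["title"])
    return hits
```

A keyword search for "touchdown" returns the interview as well as the play, because the word appears in both descriptions; an attribute search with `Event="Touchdown"` returns only the play.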

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

• Context and Domain Specific Attributes
• Uniform Metadata for Content from Multiple Sources
• Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content
• Personalization and
• Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc.
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc.
• AT&T Corp.
• more...

2. My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone...
• Edwards explains reasons for leaving BYU
• more...

3. Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory
• more...

4. Pink Floyd Collection
• Set the Controls for the Heart of the Sun...
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream
• more...

HOT Topics

1. Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge
• more...

2. Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security
• more...

3. Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II
• more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels]

Inputs: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator

User: Interests, Preferences

Semantic Engine(TM)

Structured Hi-Quality Semantic Metabase

Output: Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS, DISH, Wink, AOL-TV)

Semantic Engine(TM)

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection / Processing

Network Content

Affiliate Feeds

Public Sources Rich Data

Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels] Video; MPEG Encoder; MPEG-2/4/7; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; "Cisco Systems" Node; Enhanced XML Description; Metadata-rich Value-added Node; GREAT USER EXPERIENCE

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Object Content Information (OCI)
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for (Extractor Agents)

+ Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)

+ Content the user did not think to ask for, but which he needs to know (Semantic Associations)

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
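The Roger Clemens example can be sketched as a lookup against a tiny knowledge base of entities and relationships. Everything here is illustrative: the dictionary layout, the `plays_for` and `aliases` keys, and the function names are assumptions for the example, not Taalee's Semantic Engine.

```python
# Toy knowledge base: entities with a type and their semantic associations.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "plays_for": "New York Yankees"},
    "New York Yankees": {"type": "TEAM", "sport": "Baseball", "aliases": ["Yankees"]},
}


def enrich(text):
    """Add team and sport metadata for every known player mentioned in the text."""
    metadata = {"Players": [], "Teams": [], "Sports": []}
    for name, facts in KNOWLEDGE_BASE.items():
        if facts["type"] == "PERSON" and name in text:
            team = facts["plays_for"]
            metadata["Players"].append(name)
            metadata["Teams"].append(team)  # value-added: not present in the text
            metadata["Sports"].append(KNOWLEDGE_BASE[team]["sport"])
    return metadata


def findable(metadata, query):
    """True if the query matches any metadata value or a known alias of it."""
    terms = set()
    for values in metadata.values():
        for value in values:
            terms.add(value.lower())
            aliases = KNOWLEDGE_BASE.get(value, {}).get("aliases", [])
            terms.update(alias.lower() for alias in aliases)
    return query.lower() in terms
```

With this in place, a story that mentions only "Roger Clemens" becomes findable under "New York Yankees" or "Yankees", which keyword indexing alone cannot do.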

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield.

• Click on the first result (titled "I want out") and view the metadata on the following RMR page.

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League

Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL").

• View the source HTML page that has the original story and locate this story with the heading "I want out".

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content.

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team.

• Note that other search engines and directories will not be able to do this.

HP 98

Example 1 – Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'.

Click on the first result for Jamal Anderson.

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 – Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'.

Click on the first result for Gary Sheffield.

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content – Value-Added Metadata

Rich Media Sports Asset

• Posted Date: date of asset posting. Extracted automatically.

• Producer Name: name of content provider that produced the asset.

• Player Names: names of players mentioned explicitly in the asset. Extracted automatically.

• Sport: name of sport.

• Team Name: name of team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee's semantic relationships.

• League Name: name of league to which the player's team belongs. Not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways the users choose to interact.

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content.

• They do not understand the meaning, context or relationships of keywords.

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'.

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically.

HP 104

Metadata centric Content Management Architecture

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)

Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story, passed on to Dashboard

Dashboard: Story on Cisco, Story on Lucent
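The four numbered steps of this slide (story arrives, knowledge base is consulted for competitors, a competitor is returned, a related story is picked from the external feed) can be sketched as follows. This is a minimal sketch under assumptions: the `COMPETES_WITH` table and the `related_stories` function are hypothetical stand-ins for the knowledge base and Semantic Engine, not their actual interfaces.

```python
# Hypothetical knowledge base fragment: company -> set of competitors.
COMPETES_WITH = {"Cisco": {"Lucent"}}


def related_stories(story, external_feed, competitors=COMPETES_WITH):
    """Pick feed stories about competitors of the companies in `story`."""
    rivals = set()
    for company in story["companies"]:
        rivals |= competitors.get(company, set())  # steps 2 and 3: consult and return
    # Step 4: keep feed stories that mention at least one rival.
    return [item for item in external_feed if rivals & set(item["companies"])]


cisco_story = {"title": "Story on Cisco", "companies": ["Cisco"]}  # step 1
feed = [
    {"title": "Story on Lucent", "companies": ["Lucent"]},
    {"title": "Story on IBM", "companies": ["IBM"]},
]
dashboard = [cisco_story] + related_stories(cisco_story, feed)
```

Here the dashboard ends up holding the Cisco story plus the Lucent story, while the IBM story is left out because no competitor relationship links it to Cisco.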

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

• Related Stock News

• Industry News

• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web, Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data.

• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

• Based on RDF + XML

• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O

Schema Layer: Definition of Vocabulary – RDF Schema

Data Layer: Simple data model and syntax for metadata – RDF

OIL – as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery – Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
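The rule above can be written out as a small Python predicate. This is a sketch under assumptions, not the InfoQuilt implementation: the `Event` class and function names are hypothetical, the distance is taken as a great-circle (haversine) distance in miles, and the date difference is required to be non-negative because the slide says the earthquake occurs after the test.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake: quake follows the test closely in time and space."""
    days_after = (quake.event_date - test.event_date).days
    if not 0 <= days_after < max_days:
        return False
    return distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
```

The thresholds default to the values on the slide (30 days, 10000) and are parameters so that a tighter notion of "nearby region" can be tried.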

Knowledge Discovery – Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery – Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.

Knowledge Discovery – Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax

ICE: www.icestandard.org

Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05

XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03

DAML: www.daml.org

NewsML: newsshowcase.reuters.com

PRISM: www.prismstandard.org/techdev/prismspec1.asp

XCM: www.vignette.com

OIL: www.ontoknowledge.org/oil

Semantic Web: www.semanticweb.org

VoiceXML: www.voicexml.org

MPEG7: www.darmstadt.gmd.de/mobile/MPEG7

Taalee: www.taalee.com

Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998.

HP 33

ICE Applications

ICE vocabulary + domain vocabulary = complete application

ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata

Industry-specific vocabulary defines the content => domain-specific metadata

Source: http://www.icestandard.org

HP 34

ICE Explained

• ICE: Information and Content Exchange protocol

• Syndicator: a content aggregator and distributor

• Subscriber: a content consumer

• Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement

• Collection: the current content of a subscription

• ICE Package: a delivery of commands to update a collection, such as the addition of content items

• ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information

Sources: InternetWeek; ICE Cookbook, version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
    timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
          subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata): the comic-strip element and its attributes
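To make the split between content-independent (ICE) metadata and domain-specific metadata concrete, the payload can be walked with Python's standard `xml.etree.ElementTree` module. This is only a sketch: the `split_metadata` helper is hypothetical, the payload below is a shortened copy of the slide's example with the DOCTYPE line and image data omitted, and a real ICE implementation would follow the ICE specification rather than this code.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide's payload (DOCTYPE and image data omitted).
PAYLOAD = """<?xml version="1.0"?>
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">...</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """Return (payload attributes, [(ICE item attributes, domain elements), ...])."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)  # content-independent ICE metadata
        # Child elements come from the domain vocabulary (here: comic-strip).
        domain = [(child.tag, dict(child.attrib)) for child in item]
        items.append((envelope, domain))
    return dict(root.attrib), items


payload_attrs, items = split_metadata(PAYLOAD)
```

The ICE attributes describe delivery (item id, file name, content type), while the nested `comic-strip` element carries the description of the content itself.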

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

• Content Development: developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

• Application Content Management (Vignette): deploying content dynamically to a Web site and managing that content throughout its online lifecycle

• Content Delivery: delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM: eXtended Content Management

• Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management

• Application Content Management: Metadata Management, Recombination, Personalization

• Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model

Theme keywords: digital line graph, hydrography, transportation

Title: Dakota Aquifer

Online linkage: http://gisdasc.kgs.ukans.edu/dasc

Direct Spatial Reference Method: Vector

Horizontal Coordinate System Definition: Universal Transverse Mercator

…

UDK Metadata Model

Search terms: digital line graph, hydrography, transportation

Topic: Dakota Aquifer

Address Id: http://gisdasc.kgs.ukans.edu/dasc

Measuring Techniques: Vector

Co-ordinate System: Universal Transverse Mercator

…

Kansas State
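One simple way to reconcile the two models is a tag-name mapping, sketched below in Python. The pairing of tags follows the side-by-side values on this slide; the `UDK_TO_FGDC` table and the `udk_to_fgdc` function are hypothetical illustrations, not part of either standard.

```python
# Tag-name correspondence read off the slide (UDK tag -> FGDC tag).
UDK_TO_FGDC = {
    "Search terms": "Theme keywords",
    "Topic": "Title",
    "Address Id": "Online linkage",
    "Measuring Techniques": "Direct Spatial Reference Method",
    "Co-ordinate System": "Horizontal Coordinate System Definition",
}


def udk_to_fgdc(record):
    """Rename UDK tags to their FGDC equivalents; unknown tags pass through unchanged."""
    return {UDK_TO_FGDC.get(tag, tag): value for tag, value in record.items()}


udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
fgdc_record = udk_to_fgdc(udk_record)
```

A flat renaming like this only handles the syntactic difference in tag names; differences in meaning or granularity between the models would need the ontology-based approaches discussed later in the workshop.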

HP 39

Different views of Metadata

• Domain Independent Specifications (RDF)

• Frameworks/Infrastructures (XCM)

• Application Specific: ICE

• Media Specific: MPEG7, VoiceXML

• Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services; Taalee Content Applications

Where is the content?

Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate/Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

• Sources: Web sites, content feeds, and private repositories

• Types: text, graphics, audio, video, multimedia

• Forms: unstructured text, semi-structured text, structured text (+ media); static or dynamic

• Ingest: feed (push), Web (pull), repository/database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: feed handlers, crawlers, screen scrapers, bots, software agents

Centralized, distributed, mobile/migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
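The layout rule `Date => day month int ',' int` can be approximated with a regular expression. The following Python sketch is an assumption about how such a syntactic extractor might look, not the extractor used in the original system; the names `DATE_RULE` and `extract_date` are hypothetical, and commas are treated as optional because they are often lost in transcripts.

```python
import re
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
DAY = "(?:Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)"
MONTH = "(?P<month>" + "|".join(MONTHS) + ")"

# day month int ',' int, e.g. "Friday, August 1, 1997"
DATE_RULE = re.compile(
    r"\b" + DAY + r",?\s+" + MONTH
    + r"\s+(?P<dom>\d{1,2})(?:\s*,\s*|\s+)(?P<year>\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match.group("month")) + 1
    return date(int(match.group("year")), month_number, int(match.group("dom")))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
report_date = extract_date(header)
```

A purely syntactic rule like this finds the date reliably when the layout is fixed, but it says nothing about what the date means (report date versus event date), which is where the semantic techniques of the following slides come in.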

Traditional TextCategorization

StatisticalAI Techniques

Classify Place ina taxonomy

feed

Customer Training

Set

RoutingDistribution

Customer Article Feed

4715

Standard Metadata

Feed Source iSyndicate

Posted Date 11202000

Classification of Article 4715

Knowledge-base amp StatisticalAI Techniques

ClassifyPlace ina taxonomy

MetadataCatalog

Content Manager

Precise syndicationfiltering

fd

Article 4715 Metadata

Standard metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Semantic metadata

Company Name: France Telecom, Equant

Ticker Symbol: FTE, ENT

Exchange: NYSE

Topic: Company News

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee's Categorization & Automatic Metadata Creation

Taalee Training

Set

Customer Training

Set ee ENTCompany AnalysisConference Calls

EarningsStock Analysis

Classification of Article 4715

Article Feed4715 RoutingDistribution

Map to another taxonomy

HP 49

Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segmentwith Associated Text

Segment Description

SemanticMetadata

AutoCategorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

AutoCategorization

AutoCategorization

Semantic MetadataSemantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

AutoCategorization

AutoCategorization

Semantic MetadataSemantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference

• (Statistical) (Information Retrieval)

• Statistical Learning/AI (Bayesian, Neural Networks, HMM…)

• Logic Based (Description Logic)

• Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

• Statistical methods: Cluster Analysis

• Learning/AI and Collab. Filtering

• Reference data/Concept-terms: Dictionary/Thesaurus, by topic/industry/subject/domain (word or phrase)

• Ontologies/Domain Models

• Knowledge Base: by Entities and Relationships

(Moving down the list: deeper understanding)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

• Standardize meaning, description, representation of involved attributes

• Capture the semantics involved via domain characteristics

• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: four interrelated ontologies.

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked by "causes" relationships]

HP 61

Large Vocabularies, Taxonomies, Ontologies

• WordNet

• The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

• traditional queries based on keywords

• attribute-based queries

• content-based queries

HP 64

Oingo.com

• Oingo Ontology – ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology

• Oingo Seek: the database of millions of concepts and relationships that powers Oingo's semantic technology

• Oingo Sense: the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

• Oingo Lingua: the language of meaning, used to state intent. The basis for intelligent interaction

• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for

• Closely-related, high-value information beyond what was requested

• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort.

Intelligent content can be more effectively managed, packaged and distributed.

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" Content can only be "found" if the user enters a keyword that exists within it.

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search.

• Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from typical cataloging of football assets (Virage search on "football touchdown"):

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee metadata on football assets (Rich Media Reference Page):

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional

Teams: Ravens, Steelers

Score: Bal 31, Pit 24

Players: Quandry Ismail, Tony Banks

Event: Touchdown

Produced by: NFL.com

Posted date: 2/02/2000
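The contrast between the two kinds of search can be sketched in Python using the assets named on this slide. The data layout and the `keyword_search` and `attribute_search` functions are hypothetical illustrations, not the Virage or Taalee search interfaces.

```python
# Assets from the slide; the interview clips carry no structured metadata.
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"League": ["Professional"], "Teams": ["Ravens", "Steelers"],
                  "Players": ["Quandry Ismail", "Tony Banks"], "Event": ["Touchdown"]}},
]


def keyword_search(assets, query):
    """Titles of assets whose title or description contains every query word."""
    words = query.lower().split()
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if all(word in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Titles of assets whose metadata contains every requested field value."""
    hits = []
    for asset in assets:
        metadata = asset["metadata"]
        if all(value in metadata.get(field, []) for field, value in criteria.items()):
            hits.append(asset["title"])
    return hits
```

A keyword search for "touchdown" also returns the interview in which a touchdown is merely talked about, whereas `attribute_search(ASSETS, Event="Touchdown")` returns only the clip that is tagged as an actual touchdown.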

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field.

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…

2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…

3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos; Buy Russell Crowe Videos; Buy Christopher Plummer Videos; Buy Diane Venora Videos; Buy Philip Baker Hall Videos; Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic, personalized menu.

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic Engine(TM)

Metadata-Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with interactivity features, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support
Sony

Categorize

Catalog

Integrate

Collection
Processing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) - 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREAT USER EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Enhanced XML Description

"Cisco Systems"

Node

Taalee Semantic Engine

"Cisco Systems"

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-rich Value-added Node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for

Extractor Agents

Content which does not contain the words the user asked for, but is about what he asked for

Value-added Metadata

Content the user did not think to ask for, but which he needs to know

Semantic Associations

+ +

Metadata for Intelligent Content Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, the content cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
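The enrichment step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `KNOWLEDGE_BASE` dictionary, the relation names `plays_for` and `plays_sport`, and the functions `enrich` and `matches` are hypothetical and are not Taalee's actual data model or API.

```python
# Hypothetical mini knowledge base: (entity, relation) -> related entity.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "is_a"): "PERSON",
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("New York Yankees", "is_a"): "TEAM",
}


def enrich(metadata):
    """Return a copy of metadata with team and sport inferred from players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        for relation, field in (("plays_for", "teams"), ("plays_sport", "sports")):
            value = KNOWLEDGE_BASE.get((player, relation))
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """True if the query occurs in any metadata value (case-insensitive)."""
    q = query.lower()
    return any(q in v.lower() for values in metadata.values() for v in values)


# Metadata extracted directly from a story that never mentions the team.
story = {"players": ["Roger Clemens"]}
enriched_story = enrich(story)
```

With the extracted metadata alone, a search for "Yankees" misses the story; after enrichment the inferred team name makes it findable.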

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate the story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons and League=NFL were not present in the source content

• Taalee attached this value-added metadata to the asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate the story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers and League=National League were not present in the source content

• Taalee attached this value-added metadata to the asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of the content provider that produced the asset

Posted Date: date of asset posting. Extracted automatically

Player Names: names of players mentioned explicitly in the asset. Extracted automatically

Team Name: name of the team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee's semantic relationships

League Name: name of the league to which the player's team belongs. Not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations

Sport: name of sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Internal Source 1 (Research)

Internal Source 2

External feeds/Web (e.g. Reuters)

1. Cisco story from PW Source 1 is passed on to add semantic associations

2. Consults the KnowledgeBase for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard

Story on Lucent

Story on Cisco
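The four-step flow above (story arrives, knowledge base is consulted for competitors, a related story is picked from another feed) can be sketched as follows. The `COMPETITORS` table, the story dictionaries, and `related_stories` are hypothetical stand-ins chosen for illustration; the slide does not show how the Taalee knowledge base is actually queried.

```python
# Hypothetical competitor relation, standing in for the knowledge base lookup.
COMPETITORS = {
    "Cisco": {"Lucent"},
    "Commerce One": {"Ariba"},
}


def related_stories(story, feed):
    """Pick feed stories about competitors of the companies in `story`."""
    rivals = set()
    for company in story["companies"]:
        rivals |= COMPETITORS.get(company, set())
    return [s for s in feed if rivals & set(s["companies"])]


# Step 1: a Cisco story arrives from an internal source.
cisco_story = {"title": "Story on Cisco", "companies": ["Cisco"]}

# Stories available from external feeds (made-up examples).
external_feed = [
    {"title": "Story on Lucent", "companies": ["Lucent"]},
    {"title": "Story on IBM", "companies": ["IBM"]},
]

# Steps 2-4: consult the competitor relation and pick related stories.
picked = related_stories(cisco_story, external_feed)
```

Only the Lucent story is selected, because Lucent is the only company in the feed that the table lists as a Cisco competitor.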

XCM-compliant metadata, XML or other format

Semantic Application

ASP/Enterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata-centric Content Management Architecture

Semantic Engine

World Model

Taalee Metabase

Third-party Content Mgmt and Syndication

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

SEC, EPA

Regulations: Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., the InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition of the Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary - RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
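The rule above can be written as an executable predicate. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes dates are `datetime.date` values, the 30 is in days, the 10000 is in miles, `distance` is a great-circle (haversine) distance, and the earthquake must not precede the test. The record layout and all function names are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the 'NuclearTest Causes Earthquake' rule to one pair of events."""
    days = (quake["date"] - test["date"]).days
    if not 0 <= days < max_days:
        return False
    miles = distance_miles(test["lat"], test["lon"], quake["lat"], quake["lon"])
    return miles < max_miles


# Made-up events for illustration only.
nuclear_test = {"date": date(1970, 1, 1), "lat": 37.0, "lon": -116.0}
quake_near = {"date": date(1970, 1, 10), "lat": 36.0, "lon": -115.0}
quake_before = {"date": date(1969, 12, 25), "lat": 36.0, "lon": -115.0}
quake_late = {"date": date(1970, 3, 1), "lat": 36.0, "lon": -115.0}
quake_antipodal = {"date": date(1970, 1, 10), "lat": -37.0, "lon": 64.0}
```

Note that 10000 miles covers most of the globe (the farthest possible great-circle distance is roughly 12,400 miles), so with these thresholds the date condition does most of the filtering.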

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
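The grouping query above amounts to bucketing earthquakes by magnitude band and averaging the counts over a span of years. A minimal sketch, assuming each earthquake is a `(year, magnitude)` pair and that bands are half-open intervals (so a magnitude of exactly 6.0 falls in 6-7); the slide does not say how boundaries are treated, and the sample data is made up.

```python
from collections import Counter

# Magnitude bands from the slide, treated here as half-open [low, high).
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude):
    """Return the band containing the magnitude, or None if below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes):
    """Count earthquakes per (year, band)."""
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if band is not None:
            counts[(year, band)] += 1
    return counts


def average_per_year(quakes, first_year, last_year):
    """Average yearly count per band over first_year..last_year inclusive."""
    years = last_year - first_year + 1
    totals = Counter()
    for (year, band), n in counts_per_year(quakes).items():
        if first_year <= year <= last_year:
            totals[band] += n
    return {band: totals[band] / years for band in BANDS}


# Made-up sample: (year, magnitude).
sample = [(1950, 5.9), (1950, 6.5), (1950, 6.9), (1951, 7.2), (1951, 5.0)]
averages = average_per_year(sample, 1950, 1951)
```

Comparing these averages before and after a cutoff year is one way to reproduce the kind of observation the slide reports.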

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • What RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 34

ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: a content aggregator and distributor
Subscriber: a content consumer
Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: the current content of a subscription
ICE Package: a delivery of commands to update a collection, such as the addition of content items
ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information

Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm

    <?xml version="1.0"?>
    <!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
    <ice-payload payload-id="ipl-80a56cfe"
                 timestamp="05-15-2001T11:00:01" ice.version="1.0">
      <ice-response response-id="irp-20010515181600">
        <ice-item-group group-id="grp-8610">
          <ice-item item-id="4321"
                    subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
            <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
              PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
            </comic-strip>
          </ice-item>
        </ice-item-group>
      </ice-response>
    </ice-payload>

Content (domain-specific metadata)
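A subscriber receiving such a payload has to separate the ICE envelope (payload, response, item group, item) from the domain-specific content each item carries. The sketch below does this with Python's standard `xml.etree.ElementTree`. The payload string is a simplified copy of the slide's example with the DOCTYPE and the encoded image omitted, and `list_items` is a hypothetical helper, not part of any ICE toolkit.

```python
import xml.etree.ElementTree as ET

# Simplified payload modeled on the slide (DOCTYPE and image data omitted).
PAYLOAD = """
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" pubdate="20010515"/>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
"""


def list_items(payload_xml):
    """Return one dict per ice-item: envelope fields plus the content element."""
    root = ET.fromstring(payload_xml.strip())
    items = []
    for item in root.iter("ice-item"):
        content = next(iter(item), None)  # first child = domain-specific content
        items.append({
            "item_id": item.get("item-id"),
            "name": item.get("name"),
            "content_type": item.get("content-type"),
            "content_tag": content.tag if content is not None else None,
            "content_attrs": dict(content.attrib) if content is not None else {},
        })
    return items


items = list_items(PAYLOAD)
```

The `content_attrs` of the `comic-strip` element are the domain-specific metadata; everything above it belongs to the ICE protocol layer.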

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

Content Development/Management

Content Delivery

Application Content Management

Content Authoring

Digital Asset Management

Software Configuration Management

Document Process Management

Metadata Management

Recombination

Personalization

Edge Network Delivery

Streaming Media Delivery

Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...

Kansas State
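Because the two models describe the same data under different tag names, a simple crosswalk table is enough to translate a record from one to the other. The mapping below is read directly off the slide's side-by-side listing; the `fgdc_to_udk` function and the record layout are hypothetical, and a real crosswalk would cover many more elements.

```python
# Tag-name crosswalk taken from the slide's side-by-side comparison.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to their UDK equivalents; unknown tags pass through."""
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = fgdc_to_udk(fgdc_record)
```

Renaming tags only addresses the syntactic difference; it assumes the paired elements really mean the same thing, which is the semantic question the workshop goes on to discuss.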

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Metadata

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services

Taalee Content Applications

Where is the content?

Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate, Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange
Feed Handlers
Crawlers
Screen Scrapers
Bots
Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fore is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabit continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
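A layout rule such as `Date => day month int ',' int` maps naturally onto a regular expression. The following sketch is an assumption about how such a syntactic extractor could be written, not the extractor used in the workshop; it treats the comma as optional because punctuation is often lost in transcripts, and `extract_date` is a hypothetical name.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# day month int ',' int, with the commas made optional.
DATE_LAYOUT = re.compile(
    rf"\b({DAYS}),?\s+({MONTHS})\s+(\d{{1,2}})\s*,?\s*(\d{{4}})\b"
)


def extract_date(text):
    """Return (weekday, month, day, year) for the first date found, else None."""
    match = DATE_LAYOUT.search(text)
    if match is None:
        return None
    weekday, month, day, year = match.groups()
    return weekday, month, int(day), int(year)


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
report_date = extract_date(header)
```

The pattern only recognizes the surface form; it knows nothing about what the date refers to, which is the limitation of the syntactic approach that the following slides contrast with knowledge-based extraction.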

Traditional Text Categorization

Statistical/AI Techniques

Classify/Place in a taxonomy

feed

Customer Training Set

Routing/Distribution

Customer Article Feed

4715

Standard Metadata

Feed Source: iSyndicate
Posted Date: 11/20/2000

Classification of Article 4715

Knowledge-base & Statistical/AI Techniques

Classify/Place in a taxonomy

Metadata Catalog

Content Manager

Precise syndication/filtering

feed

Article 4715 Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Standard metadata

Semantic metadata

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee's Categorization & Automatic Metadata Creation

Taalee Training Set

Customer Training Set

FTE, ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis

Classification of Article 4715

Article Feed 4715

Routing/Distribution

Map to another taxonomy

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships
Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis

Learning/AI and Collab. Filtering

Reference data: Concept-terms, Dictionary, Thesaurus
By topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase
By Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP) Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
Attributes
Domain Rules
Functional Dependencies

HP 59

An Ontology

Example Interrelated ontologies

LANDUSE

COMMERCIAL

INDUSTRIAL

RURAL

RESIDENTIAL

AGRICULTURAL

MILITARY

RECREATIONAL

LAND (SITE)

CULTIVATED AREA

GREENLAND AREA

LAND BANK

ZONING

LANDFILL SITE

WASTE DISPOSAL

RECYCLING

HAZARDOUS

LANDFILL

RESOURCE REC

SOLID

SEWAGE

shredding

magnetic separation

screening

washing

NATURAL DISASTER

EARTHQUAKE

causes

LANDSLIDE

VOLCANO

STORM

FLOOD

FIRE

AVALANCHE

TSUNAMI

causes

causes

causes

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords
attribute based queries
content-based queries
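The first two query types differ in what they look at: the words in the content versus the values in its metadata. The sketch below contrasts them on a made-up asset list; the asset texts, metadata fields, and function names are hypothetical. Content-based queries (for example, similarity on image or audio features) are not covered here.

```python
# Made-up assets: free text plus attribute metadata.
ASSETS = [
    {"text": "Clemens strikes out ten in the Bronx",
     "metadata": {"player": "Roger Clemens", "team": "New York Yankees"}},
    {"text": "Yankees clinch the division",
     "metadata": {"team": "New York Yankees"}},
    {"text": "Anderson TD run",
     "metadata": {"player": "Jamal Anderson", "team": "Atlanta Falcons"}},
]


def keyword_query(assets, word):
    """Traditional query: match a keyword against the text only."""
    w = word.lower()
    return [a for a in assets if w in a["text"].lower()]


def attribute_query(assets, **criteria):
    """Attribute-based query: match exact values of metadata fields."""
    return [a for a in assets
            if all(a["metadata"].get(k) == v for k, v in criteria.items())]


by_keyword = keyword_query(ASSETS, "Yankees")
by_attribute = attribute_query(ASSETS, team="New York Yankees")
```

The keyword query finds only the story that literally contains "Yankees", while the attribute query also returns the Clemens story, because the team is recorded in its metadata.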

HP 64

Oingo.com

Oingo Ontology (ODP based?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for
• Closely related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search
• Alternatives for improving search: clustering; link and other analysis (e.g. Google's Link Flux analysis); classification as context; ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from Typical Cataloging of Football Assets (Virage search on "football touchdown"):

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee Metadata on Football Assets (Rich Media Reference Page):

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
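The contrast on this slide can be sketched in a few lines of Python. This is only an illustration, not Taalee's or Virage's implementation: the `Asset` class and both search functions are hypothetical names, and the sample records are taken from the slide text above.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    title: str
    description: str
    metadata: dict = field(default_factory=dict)  # attribute -> value or list of values


ASSETS = [
    Asset("Jimmy Smith Interview Part Seven",
          "Jimmy Smith explains his philosophy on showboating"),
    Asset("Brian Griese Interview Part Four",
          "Brian Griese talks about the first touchdown he ever threw"),
    Asset("Baltimore 31, Pit 24",
          "Quandry Ismail and Tony Banks hook up for their third long touchdown",
          {"League": "Professional", "Teams": ["Ravens", "Steelers"],
           "Players": ["Quandry Ismail", "Tony Banks"], "Event": "Touchdown"}),
]


def keyword_search(assets, *words):
    """Return assets whose title or description contains every word."""
    hits = []
    for asset in assets:
        text = f"{asset.title} {asset.description}".lower()
        if all(word.lower() in text for word in words):
            hits.append(asset)
    return hits


def attribute_search(assets, **criteria):
    """Return assets whose metadata matches every attribute=value pair."""
    def matches(asset):
        for key, wanted in criteria.items():
            value = asset.metadata.get(key)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [asset for asset in assets if matches(asset)]
```

A keyword query such as `keyword_search(ASSETS, "touchdown")` also returns the Brian Griese interview because the word happens to occur in it, while `attribute_search(ASSETS, Event="Touchdown", Teams="Ravens")` returns only the game clip.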

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…

2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…

3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure labels: Realtime Feeds; Web Sites and Pages; Content Databases; Semantic Engine™; Structured, Hi-Quality Semantic Metabase; Interests, Preferences; Time-Shifted Content Aggregator; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO content types: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO content types: Analyst Call, Conf Call, Earnings]

CSCO Analysis: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

[Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine™; Meta-Data Tagged Content; Structured, Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in

This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Figure labels: Collection (Network Content, Affiliate Feeds, Public Sources); Processing (Categorize, Catalog, Integrate); Rich Data Metabase; Production Support: Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication; Sony]

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

Customize Page: Settings | Content | Layout | Color

• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast

Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

• Gore Demands That Recount Restart
• Gore Says Fla. Can't Name Electors
• Bush Meets Colin Powell at Ranch
• Market Tumbles on Earnings Warning
• Barak Outlines His Peace Plan

Clips (duration - date - source):

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description

Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Figure labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Scene Description Tree; Node = AVO Object; Taalee Semantic Engine; Enhanced XML Description; "Cisco Systems" Node; Object Content Information (OCI); Metadata-rich, Value-added Node; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g. COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
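The idea of adding metadata that is absent from the content itself can be sketched as a lookup against a small knowledge base. Everything below is an assumption made for illustration: the `KNOWLEDGE_BASE` dictionary, the relation names, and the `enrich` function are hypothetical and do not describe Taalee's patented extraction technology.

```python
# Hypothetical knowledge base: (entity, relation) -> related entity.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Jamal Anderson", "plays_for"): "Atlanta Falcons",
    ("Atlanta Falcons", "member_of"): "NFL",
}

# Each rule reads: for every value in source_field, follow relation
# and store the result in target_field.
RULES = [
    ("Players", "plays_for", "Teams"),
    ("Teams", "member_of", "League"),
    ("Players", "plays_sport", "Sport"),
]


def enrich(metadata):
    """Return a copy of the extracted metadata with value-added fields."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for source_field, relation, target_field in RULES:
        for entity in list(enriched.get(source_field, [])):
            value = KNOWLEDGE_BASE.get((entity, relation))
            if value and value not in enriched.setdefault(target_field, []):
                enriched[target_field].append(value)
    return enriched
```

With these assumed facts, `enrich({"Players": ["Jamal Anderson"]})` adds the team and league, so a search on "Atlanta Falcons" would reach a story that never names the team. The rules run in order, which is why the team rule precedes the league rule.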

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

[Figure: metadata fields of a Rich Media Sports Asset]

Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting; extracted automatically
Player Names: names of players mentioned explicitly in the asset; extracted automatically
Team Name: name of team for which player plays; not mentioned explicitly in asset; value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs; not mentioned explicitly in asset; value-added by Taalee's processing based on semantic associations
Sport: name of sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways the users choose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
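A COMPETES WITH association like the one in this example can be sketched as a small relation plus a filter over cataloged stories. This is a minimal sketch under stated assumptions: the `COMPETES_WITH` table, the story records and both function names are hypothetical; only the company pairs (Commerce One and Ariba, Cisco and Lucent) come from the slides.

```python
# Hypothetical association table; treated as symmetric by competitors().
COMPETES_WITH = {
    "Commerce One": {"Ariba"},
    "Cisco": {"Lucent"},
}

# Hypothetical cataloged stories, each tagged with company metadata.
STORIES = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on Lucent", "companies": ["Lucent"]},
]


def competitors(company):
    """Return every company linked to `company` by COMPETES_WITH, in either direction."""
    found = set(COMPETES_WITH.get(company, set()))
    for other, rivals in COMPETES_WITH.items():
        if company in rivals:
            found.add(other)
    return found


def related_stories(company, stories):
    """Return stories about the company's competitors, which the user did not ask for."""
    rivals = competitors(company)
    return [story for story in stories if rivals & set(story["companies"])]
```

Here `related_stories("Commerce One", STORIES)` returns the Ariba story even though the query never mentioned Ariba, which is the behavior the slide describes.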

HP 104

Metadata-centric Content Management Architecture

[Figure labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters); Extractor Agents 1-3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata, XML or other format; Semantic Application; ASP/Enterprise hosted; Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY
• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary - RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude ) < 10000
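The rule above can be written as an executable predicate. This is a sketch, not the InfoQuilt implementation: the `Event` class and function names are hypothetical, the date difference is assumed to be in days with the earthquake on or after the test, and the distance is assumed to be great-circle miles computed with the haversine formula.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, via the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule: quake within max_days after the test and within max_miles."""
    days = (quake.event_date - test.event_date).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
```

Note that a 10000-mile threshold covers most of the globe, since no two points on Earth are more than roughly 12,450 miles apart; the default simply mirrors the number on the slide.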

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
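The per-year, per-magnitude-range query can be sketched as a simple aggregation. The function names are hypothetical, the earthquake records are assumed to be (year, magnitude) pairs, and the ranges are treated as half-open intervals so that each magnitude falls into exactly one group.

```python
from collections import Counter

# Magnitude ranges from the slide, treated as [low, high).
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_bin(magnitude):
    """Return the range containing the magnitude, or None if below 5.8."""
    for low, high in BINS:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from (year, magnitude) pairs."""
    counts = Counter()
    for year, magnitude in quakes:
        group = magnitude_bin(magnitude)
        if year >= start_year and group is not None:
            counts[(year, group)] += 1
    return counts


def average_per_year(counts, first_year, last_year):
    """Average yearly count for each magnitude range over an inclusive span of years."""
    years = last_year - first_year + 1
    totals = Counter()
    for (year, group), n in counts.items():
        if first_year <= year <= last_year:
            totals[group] += n
    return {group: totals[group] / years for group in BINS}
```

Comparing `average_per_year` for the years before and after a chosen date, range by range, is one way to check the slide's observation that only the lower-magnitude groups grow.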

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC …(ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>

Content (domain-specific metadata)
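A subscriber could pull the packaging attributes and the domain-specific metadata out of such a payload with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree`; the `item_metadata` function is a hypothetical helper, the DOCTYPE line is omitted, and the image data is replaced by a placeholder string.

```python
import xml.etree.ElementTree as ET

PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe"
    timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">(image data)</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def item_metadata(payload):
    """Return one dict per ice-item: its ICE attributes plus the attributes of its content elements."""
    root = ET.fromstring(payload)
    items = []
    for item in root.iter("ice-item"):
        record = dict(item.attrib)  # packaging metadata (item-id, name, ...)
        # Domain-specific metadata carried by the content elements themselves.
        record["content"] = {child.tag: dict(child.attrib) for child in item}
        items.append(record)
    return items
```

The split between the `ice-item` attributes and the `comic-strip` attributes mirrors the slide's distinction between the exchange protocol and the domain-specific content metadata it carries.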

HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

• Content Development: developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
• Application Content Management (Vignette): deploying content dynamically to a Web site and managing that content throughout its online lifecycle
• Content Delivery: delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html

HP 37

XCM

eXtended Content Management

[Figure: XCM segments and capabilities. Segments: Content Development/Management; Application Content Management; Content Delivery. Capabilities shown: Content Authoring; Digital Asset Management; Software Configuration Management; Document Process Management; Metadata Management; Recombination; Personalization; Edge Network Delivery; Streaming Media Delivery; Caching]

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (example: Kansas State)

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
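One common way to bridge two metadata models that name the same data differently is a tag crosswalk. The sketch below is illustrative only: the tag pairs are the five shown on the slide, while the `translate` and `invert` helpers are hypothetical names, not part of FGDC or UDK.

```python
# Tag pairs taken from the slide: FGDC tag -> UDK tag.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def invert(mapping):
    """Build the reverse crosswalk (assumes the mapping is one-to-one)."""
    return {target: source for source, target in mapping.items()}


def translate(record, mapping):
    """Rename the tags of a record; tags without a mapping are kept unchanged."""
    return {mapping.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
}
udk_record = translate(fgdc_record, FGDC_TO_UDK)
```

A flat rename like this only works when the two tags really carry the same meaning; where the models differ in semantics, a shared ontology of the kind discussed earlier is needed to decide which tags correspond.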

HP 39

Different views of Metadata

Metadata:

• Domain Independent Specifications (RDF)
• Frameworks/Infrastructures (XCM)
• Application Specific: ICE
• Media Specific: MPEG7, VoiceXML
• Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services; Taalee Content Applications

• Where is the content? Whose is it? Produce/Aggregate, Catalog/Index
• What other content is it related to? Integrate, Syndicate
• What is the right content for this user? Personalize
• What is the best way to monetize this interaction? Interactive Marketing

Channels: Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

• Sources: Web Sites, Content Feeds, and Private Repositories
• Types: Text, Graphics, Audio, Video, Multimedia
• Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
• Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

[Figure: METADATA EXTRACTORS operating over Global/Enterprise/Web Repositories]

Sources shown: Digital Maps, Nexis/UPI/AP, Documents, Digital Audios, Data Stores, Digital Videos, Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
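A syntactic layout rule of this kind can be approximated with a regular expression. The sketch below is only an illustration: the pattern, the function name `extract_date`, and the output field names are assumptions, not part of the workshop material, and commas are treated as optional because the transcript has lost its punctuation.

```python
import re

# Hypothetical rendering of the layout rule: Date => day month int ',' int
DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return {
        "day": match.group("day"),
        "month": match.group("month"),
        "day_of_month": int(match.group("dom")),
        "year": int(match.group("year")),
    }


report = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1 1997 - 0530 MDT"
print(extract_date(report))
```

A rule like this finds the date reliably in reports that follow the expected layout, but it knows nothing about what the report means, which is the limitation the later slides on semantic metadata address.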

Traditional Text Categorization

[Figure: a Customer Article Feed (article 4715) is classified by Statistical/AI Techniques trained on a Customer Training Set]

Steps shown: Classify/Place in a taxonomy, Classification of Article 4715, Routing/Distribution

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

Automated Content Enrichment (ACE), Taalee Enterprise Customization Suite

[Figure: Article Feed (article 4715) processed with Knowledge-base & Statistical/AI Techniques, using a Taalee Training Set and a Customer Training Set]

Steps shown: Classify/Place in a taxonomy, Map to another taxonomy, Metadata Catalog, Content Manager, Routing/Distribution, Precise syndication/filtering

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
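The difference between standard and semantic metadata for an article can be sketched as follows. This is a minimal illustration under stated assumptions: the knowledge base contents, the function `create_metadata`, the field names, and the article text are all hypothetical placeholders, and simple substring matching stands in for whatever extraction technique the actual product used.

```python
# Hypothetical knowledge base: company name -> facts used for semantic metadata.
KNOWLEDGE_BASE = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}


def create_metadata(text, feed_source, posted_date):
    """Build standard metadata from the feed and semantic metadata from the text."""
    # Standard metadata comes with the feed itself.
    standard = {"feed_source": feed_source, "posted_date": posted_date}

    # Semantic metadata: recognize known entities, then look up related facts.
    lowered = text.lower()
    companies = [name for name in KNOWLEDGE_BASE if name.lower() in lowered]
    semantic = {
        "company_name": companies,
        "ticker_symbol": [KNOWLEDGE_BASE[c]["ticker"] for c in companies],
        "exchange": sorted({KNOWLEDGE_BASE[c]["exchange"] for c in companies}),
    }
    return {"standard": standard, "semantic": semantic}


# Placeholder article text, not taken from the original feed.
article_text = "France Telecom and Equant discuss a combination."
metadata = create_metadata(article_text, "iSyndicate", "11/20/2000")
print(metadata)
```

The ticker symbols and exchange never appear in the article text; they are attached because the knowledge base relates them to the recognized company names.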

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
• Statistical (Information Retrieval)
• Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
• Logic Based (Description Logic)
• Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis, Learning/AI and Collaborative Filtering

Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

• Standardize meaning, description, and representation of involved attributes
• Capture the semantics involved via domain characteristics
• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
• Attributes
• Domain Rules
• Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: three interrelated ontologies]

LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several of these are linked to one another by "causes" relationships

HP 61

Large Vocabularies, Taxonomies/Ontologies

• WordNet
• The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

• traditional queries based on keywords
• attribute based queries
• content-based queries

HP 64

Oingo.com

• Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for
• Closely related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort.

Intelligent content can be more effectively managed, packaged, and distributed.

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" Content can only be "found" if the user enters a keyword that exists within it.

Intelligent Content = Content + Metadata and Relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search
• Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage: search on "football touchdown" (metadata from typical cataloging of football assets)
• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…

Taalee Metadata on Football Assets (Rich Media Reference Page)

Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
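The contrast between keyword search and attribute search can be sketched over a tiny catalog. Everything here is an illustrative assumption: the record layout, the field names, and the two functions are hypothetical, and the records only loosely mirror the football assets on the slide.

```python
# Hypothetical catalog records; field names follow the slide's attribute labels.
ASSETS = [
    {
        "title": "Baltimore 31 Pit 24",
        "description": "Third long touchdown extends the lead in the third quarter",
        "teams": ["Ravens", "Steelers"],
        "players": ["Tony Banks"],
        "event": "Touchdown",
    },
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "teams": [],
        "players": ["Brian Griese"],
        "event": "Interview",
    },
]


def keyword_search(assets, keyword):
    """Match assets whose free text contains the keyword anywhere."""
    needle = keyword.lower()
    return [a for a in assets
            if needle in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets, **criteria):
    """Match assets whose structured metadata satisfies every criterion."""
    def matches(asset):
        for field, wanted in criteria.items():
            value = asset.get(field)
            if isinstance(value, list):
                if wanted not in value:
                    return False
            elif value != wanted:
                return False
        return True
    return [a for a in assets if matches(a)]


print([a["title"] for a in keyword_search(ASSETS, "touchdown")])
print([a["title"] for a in attribute_search(ASSETS, event="Touchdown", teams="Ravens")])
```

The keyword query returns the interview as well as the game clip because both mention the word, while the attribute query returns only the asset whose event really is a touchdown.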

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

• Context and Domain Specific Attributes
• Uniform Metadata for Content from Multiple Sources
• Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory: Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory: Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc
• AT&T Corp
more…

2. My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone…
• Edwards explains reasons for leaving BYU
more…

3. Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory
more…

4. Pink Floyd Collection
• Set the Controls for the Heart of the Sun…
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream
more…

HOT Topics

1. Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge
more…

2. Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security
more…

3. Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II
more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure: personalization pipeline]

Inputs: Realtime Feeds, Time-Shifted Content Aggregator, Web sites and Pages, Content Databases
User profile: Interests, Preferences
Processing: Semantic Engine™ with Structured Hi-Quality Semantic Metabase
Output: Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia]

The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO options: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO options: Analyst Call, Conf Call, Earnings]

CSCO Analysis
• 11/08 ON24 Payne
• 11/07 ON24 H&Q
• 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.

HP 87

iTV: Taalee's Extreme Personalization

[Figure: iTV personalization pipeline]

Content Provider (DBS, DISH, Wink, AOL-TV)
Content "Programs"
Meta-Data Tagged Content
Semantic Engine™ with Structured Hi-Quality Semantic Metabase
Immediate Interests, Preferences
Personalized Content Capsules
Redirects and Programming

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by a personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support (Sony)

Collection Processing: Categorize, Catalog, Integrate

Sources: Network Content, Affiliate Feeds, Public Sources

Rich Data Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
• Gore Demands That Recount Restart
• Gore Says Fla. Can't Name Electors
• Bush Meets Colin Powell at Ranch
• Market Tumbles on Earnings Warning
• Barak Outlines His Peace Plan

Clips (duration - date - source):
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Figure: iTV pipeline]

Components shown: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; Enhanced XML Description; Metadata-rich Value-added Node; GREAT USER EXPERIENCE

Example node: "Cisco Systems"

Business model: license metadata decoder and semantic applications to device makers; channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Object Content Information (OCI)
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: Jon Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content

• Extractor Agents: content which does contain the words the user asked for
• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
• Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
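The idea of adding metadata that is not present in the content can be sketched with a small set of facts. This is a minimal illustration, not Taalee's implementation: the fact triples, the relationship names, the `FIELD_FOR` mapping, and the `enrich` function are all hypothetical.

```python
# Hypothetical knowledge base of (subject, relationship, object) facts.
FACTS = [
    ("Roger Clemens", "plays_sport", "Baseball"),
    ("Roger Clemens", "plays_for", "New York Yankees"),
    ("Jamal Anderson", "plays_for", "Atlanta Falcons"),
    ("Atlanta Falcons", "member_of", "NFL"),
]

# Which metadata field each relationship contributes to (an assumption).
FIELD_FOR = {"plays_sport": "sport", "plays_for": "teams", "member_of": "league"}


def enrich(metadata):
    """Return a copy of the metadata with values implied by the facts added."""
    enriched = {field: list(values) for field, values in metadata.items()}
    frontier = list(enriched.get("players", []))
    seen = set(frontier)
    while frontier:
        entity = frontier.pop()
        for subject, relationship, obj in FACTS:
            if subject != entity:
                continue
            values = enriched.setdefault(FIELD_FOR[relationship], [])
            if obj not in values:
                values.append(obj)
            if obj not in seen:  # follow chains such as player -> team -> league
                seen.add(obj)
                frontier.append(obj)
    return enriched


story = {"players": ["Roger Clemens"]}  # extracted explicitly from the story
print(enrich(story))
```

With the enriched record, a search on the `teams` field for "New York Yankees" finds the story even though the team name never appears in its text.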

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

1. Search for 'Jamal Anderson' in 'Football'
2. Click on the first result for Jamal Anderson
3. View metadata. Note that Team name and League name are also included in the metadata
4. View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

1. Search for 'Gary Sheffield' in 'Baseball'
2. Click on the first result for Gary Sheffield
3. View metadata. Note that Team name and League name are also included in the metadata
4. View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting; extracted automatically
• Player Names: names of players mentioned explicitly in the asset; extracted automatically
• Team Name: name of team for which the player plays; not mentioned explicitly in the asset; value-added using Taalee's semantic relationships
• League Name: name of league to which the player's team belongs; not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations
• Sport: name of sport

Legend: "X → Y" means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways the users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
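How a semantic association such as COMPETES WITH can widen a search result is sketched below. The relationship table, the placeholder headlines, and the function `search_with_associations` are hypothetical names chosen for illustration, not part of any actual product API.

```python
# Hypothetical relationships in the spirit of the slide's Commerce One example.
RELATIONSHIPS = {
    ("Commerce One", "participates_in"): ["Corporate, Professional & Financial Software"],
    ("Commerce One", "competes_with"): ["Ariba"],
}

# Placeholder news items; the headlines are invented for the sketch.
NEWS = [
    {"headline": "Story on Commerce One", "companies": ["Commerce One"]},
    {"headline": "Story on Ariba", "companies": ["Ariba"]},
    {"headline": "Story on an unrelated company", "companies": ["Some Other Co"]},
]


def search_with_associations(company):
    """Return what was asked for plus news on the company's competitors."""
    asked_for = [n["headline"] for n in NEWS if company in n["companies"]]
    competitors = RELATIONSHIPS.get((company, "competes_with"), [])
    competitor_news = [
        n["headline"] for n in NEWS
        if any(c in n["companies"] for c in competitors)
    ]
    return {"asked_for": asked_for, "competitor_news": competitor_news}


print(search_with_associations("Commerce One"))
```

A keyword index alone would stop at the first list; the second list exists only because the relationship between the two companies is stored as knowledge rather than inferred from matching words.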

HP 103

Example (test on http://directory.mediaanywhere.com)

• Search for company 'Commerce One'
• Links to news on companies that compete against Commerce One
• Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Figure: architecture and flow]

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters), handled by Extractor Agents 1, 2, and 3
Components: Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata, XML, or other format

Flow:
1. Cisco story from PW Source 1 is passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to Dashboard (Story on Cisco, Story on Lucent)

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY
• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
• Technology/Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

• Automatic collation of semantically related digital media information from multiple sources
• Research Inferred Automatically
• Semantically Related News Not Specifically Asked For
• Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inference from this data
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: formal semantics and reasoning support - OIL, DAML-O

Schema Layer: definition of vocabulary - RDF Schema

Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes: is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
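The rule above can be read as a predicate over pairs of events. The sketch below is one possible reading, under stated assumptions: the date difference is taken in days and the distance in miles (units that appear in the pair-finding query later in this example), distance is computed with the haversine great-circle formula (the slides do not say how `distance` is defined), and the class and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float
    longitude: float


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake.event_date - test.event_date).days
    near = distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


def find_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if may_have_caused(t, q)]


# Invented sample events, for illustration only.
test = Event(date(1990, 1, 1), 37.0, -116.0)
quake_after = Event(date(1990, 1, 15), 36.0, -117.0)
quake_before = Event(date(1989, 12, 25), 36.0, -117.0)
print(len(find_pairs([test], [quake_after, quake_before])))
```

Since no two points on Earth are more than roughly 12,400 miles apart along a great circle, a 10,000-mile threshold excludes only near-antipodal pairs, so in practice the time window does most of the filtering.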

Knowledge Discovery - Example

When was the first recorded nuclear test conducted? 1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945.

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
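The grouping query can be sketched as a count per year and magnitude band, followed by an average per band over a range of years. The band boundaries are treated as half-open intervals and the sample records are invented; both are assumptions of this sketch, as are the function names.

```python
from collections import Counter

# Magnitude bands from the query; half-open [low, high) is an assumption.
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude):
    """Return the band containing the magnitude, or None if below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, band) from a list of (year, magnitude)."""
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if year >= start_year and band is not None:
            counts[(year, band)] += 1
    return counts


def average_per_band(quakes, start_year, end_year):
    """Average yearly count for each band over start_year..end_year inclusive."""
    counts = counts_per_year(quakes, start_year)
    years = end_year - start_year + 1
    return {
        band: sum(counts[(y, band)] for y in range(start_year, end_year + 1)) / years
        for band in BANDS
    }


# Invented sample records: (year, magnitude).
sample = [(1950, 6.1), (1950, 6.5), (1951, 6.2), (1951, 7.4), (1899, 8.0)]
print(average_per_band(sample, 1950, 1951))
```

Comparing these averages for the years before and after a chosen date is the kind of evidence the slide draws on when it observes that the count for magnitudes above 7 stays almost constant.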

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10,000 miles from the epicenter of the earthquake.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998.


HP 36

XCM (eXtended Content Management)

A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:

Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval

Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle

Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability

Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html

HP 37

XCM

eXtended Content Management

Content Development/Management

Content Delivery, Application Content Management

Content Authoring, Digital Asset Management

Software Configuration Management

Document Process Management

Metadata Management, Recombination, Personalization

Edge Network Delivery

Streaming Media Delivery, Caching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…

Kansas State
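One way to reconcile the two models is to map each model's tag names onto a shared vocabulary. The sketch below is illustrative only: the tag names on the left come from the slide, while the common field names (`keywords`, `title`, and so on) and the `normalize` helper are assumptions made for the example.

```python
# Tag names (left) are from the slide; common names (right) are hypothetical.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}

UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Adress Id": "url",
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename a record's tags to the common vocabulary, dropping unmapped tags."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}
```

Once both records are normalized, a single query such as `title == "Dakota Aquifer"` works against either source.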

HP 39

Different views of Metadata

Domain Independent Specifications (RDF)

Frameworks/Infrastructures (XCM)

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services, Taalee Content Applications

Where is the content? Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate, Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: Types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media), Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
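The layout rule (a date is a day name, a month name, an integer, a comma, and an integer) can be approximated with a regular expression. This is a sketch of the syntactic approach in general, not the extractor shown in the talk; the function name and the returned field names are assumptions, and the comma is made optional because punctuation is often lost in extracted text.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int   (comma optional in this sketch)
DATE_RE = re.compile(rf"\b({DAYS})\s+({MONTHS})\s+(\d{{1,2}}),?\s+(\d{{4}})\b")


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    day, month, day_of_month, year = match.groups()
    return {"day": day, "month": month,
            "day_of_month": int(day_of_month), "year": int(year)}
```

Applied to the report header, this picks out Friday, August 1, 1997 as structured metadata without any understanding of what the report is about, which is what makes the approach syntactic.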

[Figure: traditional text categorization compared with Taalee's Categorization & Automatic Metadata Creation. Labels recovered from the diagram follow.]

Traditional Text Categorization
Customer Article Feed (Article 4715) and Customer Training Set
Statistical/AI Techniques
Classify: Place in a taxonomy
Routing/Distribution

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715

Taalee's Categorization & Automatic Metadata Creation
Article Feed (Article 4715), Taalee Training Set and Customer Training Set
Knowledge-base & Statistical/AI Techniques
Classify: Place in a taxonomy; Map to another taxonomy
Metadata Catalog, Content Manager, Precise syndication/filtering, Routing/Distribution

Article 4715 Metadata
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News

Classification of Article 4715
FTE, ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies
Market News: IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite
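To show why semantic metadata helps with "precise syndication/filtering", the sketch below routes the Article 4715 record to subscribers by matching metadata fields rather than keywords. The field names, the subscription format, and the `matches` and `route` helpers are assumptions for illustration; they are not Taalee's API.

```python
# Metadata values are from the slide; the field names are hypothetical.
ARTICLE_4715 = {
    "feed_source": "iSyndicate",
    "posted_date": "11/20/2000",
    "company_name": ["France Telecom", "Equant"],
    "ticker_symbol": ["FTE", "ENT"],
    "exchange": "NYSE",
    "topic": "Company News",
}


def matches(metadata, subscription):
    """True if every field in the subscription is satisfied by the metadata."""
    for field, wanted in subscription.items():
        have = metadata.get(field)
        values = have if isinstance(have, list) else [have]
        if wanted not in values:
            return False
    return True


def route(metadata, subscriptions):
    """Names of the subscriptions that should receive this article."""
    return [name for name, sub in subscriptions.items() if matches(metadata, sub)]
```

With only the standard metadata (feed source and posted date), a subscription on ticker symbol or exchange could not be evaluated at all.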

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM…)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis

Learning/AI and Collab Filtering

Reference data/Concept-terms: Dictionary/Thesaurus, by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: By Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: interrelated ontologies. Node labels recovered from the diagram follow; the grouping is approximate.]

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, with "causes" relationships among them

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords
attribute based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based (?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create

Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage Search on football touchdown

Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline

Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
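The difference between keyword search and attribute search over semantic metadata can be sketched as follows. The asset records reuse values from the slide, but the record layout and both search functions are assumptions made for illustration, not the Virage or Taalee interfaces.

```python
# Two assets: one with no semantic metadata, one with the slide's attributes.
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional", "Teams": ["Ravens", "Steelers"],
                  "Players": ["Quandry Ismail", "Tony Banks"], "Event": "Touchdown"}},
]


def keyword_search(assets, keyword):
    """Titles of assets whose title or description contains the keyword."""
    needle = keyword.lower()
    return [a["title"] for a in assets
            if needle in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets, **criteria):
    """Titles of assets whose metadata satisfies every field=value criterion."""
    hits = []
    for asset in assets:
        meta = asset["metadata"]
        ok = True
        for field, wanted in criteria.items():
            have = meta.get(field, [])
            values = have if isinstance(have, list) else [have]
            if wanted not in values:
                ok = False
                break
        if ok:
            hits.append(asset["title"])
    return hits
```

Here a keyword search for "Steelers" finds nothing because the word never appears in the text, while `attribute_search(ASSETS, Teams="Steelers", Event="Touchdown")` finds the clip through its metadata.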

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION: Personalized Queries

1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more…

2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…

3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
more…

4 Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more…

HOT Topics

1 Election 2000
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…

2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more…

3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-Shifted Content Aggregator

Web sites and Pages

Content Databases

Personalized Content

Semantic Engine™

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine™

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in

This screen is customizable with interactivity feature using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection, Processing

Network Content

Affiliate Feeds

Public Sources

Rich Data

Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREAT USER EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Enhanced XML Description

"Cisco Systems" Node

Taalee Semantic Engine

Object Content Information (OCI)
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Metadata-rich Value-added Node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for (Extractor Agents)

+ Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)

+ Content the user did not think to ask for, but which he needs to know (Semantic Associations)

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via

Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
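The enrichment step described here can be illustrated with a small knowledge-base lookup. This is a sketch of the general idea only; the dictionary layout, the field names, and the `enrich` function are assumptions, not Taalee's patented extraction technology.

```python
# Hypothetical knowledge base; the facts about Roger Clemens are from the slide.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
}


def enrich(metadata, kb=KNOWLEDGE_BASE):
    """Return a copy of the metadata with team and sport added for known players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = kb.get(player)
        if not facts:
            continue  # unknown entity: nothing to add
        for field, key in (("Teams", "team"), ("Sport", "sport")):
            values = enriched.setdefault(field, [])
            if facts[key] not in values:
                values.append(facts[key])
    return enriched


def find(assets, field, value):
    """Indexes of assets whose (enriched) metadata has the value in the field."""
    return [i for i, meta in enumerate(assets) if value in meta.get(field, [])]
```

After enrichment, a search on `Teams` for "New York Yankees" returns the Roger Clemens story even though the team name never appeared in the original content.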

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - Extracted automatically
Player Names: Name of players mentioned explicitly in the asset - Extracted automatically
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Sport: Name of sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly, fully described in the many ways the users chose to interact

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via

Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
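A semantic association can be pictured as a typed relationship between entities. The sketch below stores the Commerce One facts from the slide as (subject, relationship, object) triples and uses them to pull in news about competitors. The relationship names, the story records, and both functions are assumptions for illustration, not the Taalee Semantic Engine.

```python
# Facts from the slide's example; the relationship names are hypothetical.
FACTS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def related(entity, relationship, facts=FACTS, symmetric=False):
    """Entities linked to `entity` by `relationship` (both directions if symmetric)."""
    found = [o for s, r, o in facts if s == entity and r == relationship]
    if symmetric:
        found += [s for s, r, o in facts if o == entity and r == relationship]
    return found


def related_news(entity, stories, facts=FACTS):
    """Headlines of stories that mention a competitor of the entity."""
    competitors = set(related(entity, "competes_with", facts, symmetric=True))
    return [s["headline"] for s in stories if competitors & set(s["companies"])]
```

A keyword index has no equivalent of `competes_with`, which is why it cannot offer the Ariba story to someone who searched for Commerce One.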

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1, Research; Internal Source 2; External feeds/Web (e.g. Reuters)

Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults KnowledgeBase for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story – passed on to Dashboard

Story on Lucent

Story on Cisco

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

SEC, EPA

Regulations: Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O

Schema Layer: Definition of Vocabulary – RDF Schema

Data Layer: Simple data model and syntax for metadata – RDF

OIL – as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude ) < 10000
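The rule above can be written out as an executable predicate. The sketch below is not the InfoQuilt implementation. It assumes that `dateDifference` counts days from the test to the earthquake, that `distance` is a great-circle distance in miles (computed here with the haversine formula), and that events are plain dictionaries using the attribute names from the rule.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula), in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False  # the quake came before the test, or too long after it
    return distance_miles(
        test["latitude"], test["longitude"],
        quake["latitude"], quake["longitude"],
    ) < max_miles
```

Treating a negative date difference as "no relationship" follows the slide's wording that the earthquake must occur some time after the test. The thresholds are parameters so that other notions of "nearby" can be tried.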

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
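The grouping query above amounts to binning earthquakes by magnitude range and averaging the yearly counts. A minimal sketch follows, assuming earthquakes arrive as `(year, magnitude)` pairs. The function names and the boundary handling (lower bound inclusive, upper bound exclusive) are choices made for this illustration, not part of the slides.

```python
from collections import Counter

# Magnitude ranges from the slide: (label, inclusive lower bound, exclusive upper bound).
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def magnitude_bin(magnitude):
    """Return the label of the range containing magnitude, or None if below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, start_year, end_year):
    """Average number of earthquakes per year in each range over [start_year, end_year]."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and start_year <= year <= end_year:
            counts[label] += 1
    n_years = end_year - start_year + 1
    return {label: counts[label] / n_years for label, _, _ in BINS}
```

Running `average_per_year` once for the years before the first test and once for the years after gives the two sets of averages whose comparison supports the observation about magnitudes above and below 7.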

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 37

XCM

eXtended Content Management

Content DevelopmentManagement

Content DeliveryApplication ContentManagement

Content AuthoringDigital Asset Management

Software ConfigurationManagement

Document ProcessManagement

Metadata ManagementRecombinationPersonalization

Edge Network Delivery

Streaming Media DeliveryCaching

Source: http://www.vignette.com

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...

Kansas State
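Because the two models describe the same data under different tag names, a record can be carried from one model to the other with a tag-name mapping. The sketch below uses only the five tag pairs visible on the slide. The `translate` helper is hypothetical, and real FGDC/UDK interchange would also have to reconcile value formats and nested structure.

```python
# Tag-name correspondences read off the slide (FGDC name -> UDK name).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}

# The reverse mapping is derived rather than maintained by hand.
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, mapping):
    """Rename the tags of a metadata record; tags without a mapping are kept as-is."""
    return {mapping.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = translate(fgdc_record, FGDC_TO_UDK)
```

Keeping unmapped tags unchanged is a deliberate choice for this sketch: it makes gaps in the mapping visible in the output instead of silently dropping data.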

HP 39

Different views of Metadata

Metadata

Domain Independent: Specifications (RDF), Frameworks/Infrastructures (XCM)

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services Taalee Content Applications

Where is the content? Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate, Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
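A layout rule of this kind maps naturally onto a regular expression. The sketch below is one possible reading of the rule, not the extractor used in the workshop: it treats `day` as a weekday name and `month` as a month name, and makes the commas optional so that it also matches text whose punctuation was lost.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int   (commas optional in this sketch)
DATE_RULE = re.compile(
    r"\b(?P<day>" + "|".join(DAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+"
    r"(?P<year>\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match.group("month")) + 1
    return date(int(match.group("year")), month_number, int(match.group("dom")))
```

Applied to the report header, `extract_date("Friday, August 1, 1997 - 0530 MDT")` yields the report date as a `datetime.date`, which can then be stored as a metadata attribute. A purely syntactic rule like this recognizes the layout but knows nothing about what the date refers to.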

Traditional Text Categorization

Statistical/AI Techniques

Classify/Place in a taxonomy

feed

Customer Training

Set

RoutingDistribution

Customer Article Feed

4715

Standard Metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Classification of Article 4715

Knowledge-base & Statistical/AI Techniques

Classify/Place in a taxonomy

Metadata Catalog

Content Manager

Precise syndication/filtering

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee's Categorization & Automatic Metadata Creation

Taalee Training Set

Customer Training

Set ee ENTCompany AnalysisConference Calls

EarningsStock Analysis

Classification of Article 4715

Article Feed4715 RoutingDistribution

Map to another taxonomy

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference: (Statistical) (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...); Logic Based (Description Logic); Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods, Cluster Analysis

Learning/AI and Collab. Filtering

Reference data/Concept-terms: Dictionary/Thesaurus, by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes

Capture the semantics involved via domain characteristics

Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example Interrelated ontologies

LANDUSE

COMMERCIAL

INDUSTRIALRURAL

RESIDENTIAL

AGRICULTURAL

MILITARYRECREATIONAL

LAND(SITE)

CULTIVATEDAREA

GREENLANDAREA LAND

BANK

ZONING

LANDFILLSITE

WASTEDISPOSAL

RECYCLING

HAZARDOUS

LANDFILLRESOURCE REC

SOLID SEWAGE

shredding

magneticseparation

screening

washing

NATURALDISASTER

EARTHQUAKE

causes

LANDSLIDE

VOLCANO

STORMFLOOD

FIRE

AVALANCHE

TSUNAMI

causes

causes

causes

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords
attribute based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology – ODP based(?) – the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction

Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic metadata

Virage Search on football touchdown

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
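The contrast between the two kinds of search can be illustrated with a few lines of code. In this sketch the asset records, the `text`/`metadata` layout, and both search functions are made up for illustration; only the field names and values are taken from the slide.

```python
# Two illustrative assets. Field names follow the slide; the record layout is an assumption.
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured attributes
    },
    {
        "title": "Baltimore 31 Pit 24",
        "text": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "League": ["Professional"],
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": ["Touchdown"],
        },
    },
]


def keyword_search(assets, *words):
    """Return titles of assets whose title or text contains every word."""
    hits = []
    for asset in assets:
        haystack = (asset["title"] + " " + asset["text"]).lower()
        if all(word.lower() in haystack for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Return titles of assets whose metadata matches every field=value pair."""
    hits = []
    for asset in assets:
        metadata = asset["metadata"]
        if all(value in metadata.get(field, []) for field, value in criteria.items()):
            hits.append(asset["title"])
    return hits
```

A keyword search for "touchdown" returns both assets, including the interview, while `attribute_search(ASSETS, Event="Touchdown", Teams="Ravens")` returns only the game highlight. A keyword search for "Ravens" finds nothing, because the team name appears only in the metadata.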

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more...

2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...

3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
more...

4 Pink Floyd Collection
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more...

HOT Topics

1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more...

2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more...

3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in

This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information – including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter Search ConsolidatePersonalize ArchiveLicensing Syndication

Production SupportProduction SupportSony

Categorize

Catalog

Integrate

CollectionCollection ProcessingProcessing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for

Extractor Agents

Content which does not contain the words the user asked for, but is about what he asked for

Value-added Metadata

Content the user did not think to ask for, but which he needs to know

Semantic Associations

+ +

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found.
• The burden is on the user to think of and ask for the "right" keyword.

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g. COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
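The idea of adding missing metadata from semantic associations can be sketched as follows. This is only a minimal illustration, not Taalee's patented method: the `KNOWLEDGE_BASE` dictionary, the relation names `plays_for`, `plays_sport` and `member_of`, and the `enrich` function are all hypothetical names introduced here, and the facts in the dictionary are limited to those mentioned on the slides.

```python
# Hypothetical miniature knowledge base: (subject, relation) -> object.
# Relation names are assumptions made for this sketch.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Jamal Anderson", "plays_for"): "Atlanta Falcons",
    ("Atlanta Falcons", "member_of"): "NFL",
}


def _add(metadata, field, value):
    """Append value to a metadata field unless it is already present."""
    values = metadata.setdefault(field, [])
    if value not in values:
        values.append(value)


def enrich(metadata):
    """Return a copy of the metadata with Teams, League and Sport inferred from Players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        sport = KNOWLEDGE_BASE.get((player, "plays_sport"))
        if sport is not None:
            _add(enriched, "Sport", sport)
        team = KNOWLEDGE_BASE.get((player, "plays_for"))
        if team is None:
            continue
        _add(enriched, "Teams", team)
        league = KNOWLEDGE_BASE.get((team, "member_of"))
        if league is not None:
            _add(enriched, "League", league)
    return enriched


story = {"Players": ["Roger Clemens"]}
print(enrich(story))
```

A story that mentions only the player now carries the team name as metadata, so a search on "New York Yankees" can match it even though the words never occur in the text.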

HP 96

Guided Demo for Value Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson.
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page.
• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL").
• View the source HTML page that has the original story and locate the story with the heading "Week 3 top 10 Anderson TD run".
• Verify that neither Team = Atlanta Falcons nor League = NFL was present in the source content.
• Taalee attached this value-added metadata to the asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons.
• Note that other search engines and directories will not be able to do this.

HP 97

Guided Demo for Value Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield.
• Click on the first result (titled "I want out") and view the metadata on the following RMR page.
• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL").
• View the source HTML page that has the original story and locate the story with the heading "I want out".
• Verify that neither Team = Los Angeles Dodgers nor League = National League was present in the source content.
• Taalee attached this value-added metadata to the asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers.
• Note that other search engines and directories will not be able to do this.

HP 98

Example 1 – Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'.

Click on the first result for Jamal Anderson.

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 – Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'.

Click on the first result for Gary Sheffield.

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content – Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of the content provider that produced the asset.
Posted Date: date of asset posting. Extracted automatically.
Player Names: names of players mentioned explicitly in the asset. Extracted automatically.
Team Name: name of the team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee's semantic relationships.
League Name: name of the league to which the player's team belongs. Not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations.
Sport: name of the sport.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context or relationships of keywords.

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
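A small sketch may help show how a COMPETES WITH association lets a system return related content the user did not ask for. Everything here is an illustrative assumption rather than Taalee's Semantic Content Model: the `KnowledgeBase` class, the relation names and the story titles are invented for the example; only the Commerce One / Ariba relationship comes from the slide.

```python
from collections import defaultdict


class KnowledgeBase:
    """Stores (subject, relation, object) facts; contents are illustrative only."""

    def __init__(self):
        self._facts = defaultdict(set)

    def add(self, subject, relation, obj, symmetric=False):
        self._facts[(subject, relation)].add(obj)
        if symmetric:
            self._facts[(obj, relation)].add(subject)

    def related(self, subject, relation):
        return sorted(self._facts.get((subject, relation), ()))


def associated_stories(kb, stories, company):
    """Return (titles about the company, titles about its competitors)."""
    competitors = set(kb.related(company, "competes_with"))
    asked_for = [s["title"] for s in stories if company in s["companies"]]
    need_to_know = [
        s["title"]
        for s in stories
        if competitors & set(s["companies"]) and company not in s["companies"]
    ]
    return asked_for, need_to_know


kb = KnowledgeBase()
kb.add("Commerce One", "participates_in", "Corporate, Professional & Financial Software")
kb.add("Commerce One", "competes_with", "Ariba", symmetric=True)

# Made-up story records standing in for cataloged assets.
stories = [
    {"title": "Commerce One signs new partner", "companies": ["Commerce One"]},
    {"title": "Ariba reports quarterly results", "companies": ["Ariba"]},
    {"title": "Unrelated retail story", "companies": ["Some Retailer"]},
]

asked_for, need_to_know = associated_stories(kb, stories, "Commerce One")
print(asked_for)
print(need_to_know)
```

A keyword index alone would return only the first list; the second list exists only because the relationship between the two companies is stored explicitly.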

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'.

Links to news on companies Commerce One competes against. (To view news on Ariba, click on the link for Ariba.)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically.

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters)

Components: Extractor Agents 1, 2 and 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 is passed on to add semantic associations.
2. The Knowledge Base is consulted for Cisco's competition.
3. The result is returned: Lucent is a competitor of Cisco.
4. A Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard.

Story on Lucent

Story on Cisco

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY
• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner.
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer: formal semantics and reasoning support – OIL, DAML-O

Schema Layer: definition of vocabulary – RDF Schema

Data Layer: simple data model and syntax for metadata – RDF

OIL – as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
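The rule above can be written out as an executable check. This is a sketch under stated assumptions, not the InfoQuilt implementation: the `Event` record, the function names, the use of the haversine great-circle formula for `distance`, and the reading of the thresholds as days and miles are all choices made for illustration, and the sample events are invented.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake followed the test within max_days and within max_miles."""
    days_after = (quake.event_date - test.event_date).days
    if not 0 <= days_after < max_days:
        return False
    return distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles


# Invented sample events, for illustration only.
nuclear_test = Event(date(1998, 5, 11), 27.0, 72.0)
quake_after = Event(date(1998, 5, 30), 37.0, 70.0)
quake_before = Event(date(1998, 5, 1), 37.0, 70.0)

print(could_have_caused(nuclear_test, quake_after))
print(could_have_caused(nuclear_test, quake_before))
```

Note that the rule only selects candidate pairs; as the slides go on to discuss, establishing an actual causal relationship requires further analysis of the matched data.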

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NewsML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• Semantic Web: www.semanticweb.org
• VoiceXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998.

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML – Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF – one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL – as RDF Extension
  • DAML and OIL – Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 38

Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…

Kansas State
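One simple way to reconcile such tag-name differences is to map each model's tags onto a shared vocabulary. The sketch below is only an illustration: the common field names (`keywords`, `title`, `url`, `spatial_method`, `coordinate_system`) and the `normalize` function are assumptions made here, not part of FGDC, UDK or any standard crosswalk.

```python
# Hypothetical crosswalks from each model's tag names to a shared vocabulary.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}

UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Address Id": "url",
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, crosswalk):
    """Rename the tags of a record using the crosswalk; unmapped tags are dropped."""
    return {crosswalk[tag]: value for tag, value in record.items() if tag in crosswalk}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

print(normalize(fgdc_record, FGDC_TO_COMMON) == normalize(udk_record, UDK_TO_COMMON))
```

A flat crosswalk like this handles renamed tags only; differences in structure or meaning between models need the richer domain modeling discussed later in the workshop.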

HP 39

Different views of Metadata

Domain Independent Specifications: RDF

Frameworks/Infrastructures: XCM

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services, Taalee Content Applications

Where is the content? Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate/Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps, Nexis/UPI/AP, Documents, Digital Audios, Data Stores, Digital Videos, Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
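A layout rule of this kind maps naturally onto a regular expression. The sketch below is one possible reading of the rule (weekday name, month name, day of month, comma, four-digit year); the names `DATE_RULE` and `extract_date` are introduced here for illustration, and the comma is treated as optional as an added assumption so that text which lost its punctuation still matches.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int
DATE_RULE = re.compile(
    r"\b(?:%s)\s+(?P<month>%s)\s+(?P<day>\d{1,2})(?:,\s*|\s+)(?P<year>\d{4})\b"
    % ("|".join(DAYS), "|".join(MONTHS))
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month = MONTHS.index(match.group("month")) + 1
    return date(int(match.group("year")), month, int(match.group("day")))


report = "INCIDENT MANAGEMENT SITUATION REPORT\nFriday August 1, 1997 - 0530 MDT"
print(extract_date(report))
```

The extracted value can then be stored as a typed metadata field (for example a report date) rather than as free text. Purely syntactic rules like this are brittle when the layout changes, which is one motivation for the knowledge-based techniques on the following slides.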

Traditional Text Categorization

Input: Customer Article Feed (Article 4715)
Technique: Statistical/AI Techniques, using a Customer Training Set
Result: Classify/Place in a taxonomy (Classification of Article 4715), used for Routing/Distribution

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

Input: Article Feed (Article 4715)
Technique: Knowledge-base & Statistical/AI Techniques, using the Taalee Training Set and a Customer Training Set (Taalee Enterprise Customization Suite)
Result: Classify/Place in a taxonomy (Classification of Article 4715), Map to another taxonomy, Metadata Catalog, Content Manager, Routing/Distribution, Precise syndication/filtering

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Automated Content Enrichment (ACE)
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
• Statistical (Information Retrieval)
• Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
• Logic Based (Description Logic)
• Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods/Cluster Analysis

Learning/AI and Collab. Filtering

Reference data/Concept-terms: Dictionary/Thesaurus, by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP) Classification/Taxonomy & Directory

HP 57

Ontology

• Standardize meaning, description, representation of involved attributes
• Capture the semantics involved via domain characteristics
• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE
Processes: shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
Relationship between disaster types: causes

HP 61

Large Vocabularies, Taxonomies/Ontologies

• WordNet
• The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

• traditional queries based on keywords
• attribute based queries
• content-based queries

HP 64

Oingo.com

• Oingo Ontology – ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction

Assets catalogued are Web sites or Web pages.

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for
• Closely-related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort.

Intelligent content can be more effectively managed, packaged and distributed.

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" Content can only be "found" if the user enters a keyword that exists within it.

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search.
• Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic metadata

Metadata from Typical Cataloging of Football Assets

Virage Search on football touchdown:

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
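The contrast between the two kinds of search can be made concrete with a short sketch. The record layout and the functions `keyword_search` and `attribute_search` are hypothetical, written only for this illustration; the two sample assets are taken from the slide above.

```python
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured attributes
    },
    {
        "title": "Baltimore 31 Pit 24",
        "text": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, query):
    """Titles of assets whose title or text contains every word of the query."""
    words = query.lower().split()
    return [
        a["title"] for a in assets
        if all(w in (a["title"] + " " + a["text"]).lower() for w in words)
    ]


def attribute_search(assets, criteria):
    """Titles of assets whose metadata matches every field/value pair in criteria."""
    def matches(asset):
        for field, wanted in criteria.items():
            value = asset["metadata"].get(field, [])
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a["title"] for a in assets if matches(a)]


print(keyword_search(ASSETS, "Steelers"))
print(attribute_search(ASSETS, {"Teams": "Steelers", "Event": "Touchdown"}))
```

The keyword search misses the Ravens game because the word "Steelers" never appears in the title or description, while the attribute search finds it through the Teams field. It also excludes the interview, which merely mentions a touchdown.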

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field.

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

• Semantic Applications for highly relevant and fresh content
• Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc.
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc.
• AT&T Corp.
more…

2. My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone…
• Edwards explains reasons for leaving BYU
more…

3. Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory
more…

4. Pink Floyd Collection
• Set the Controls for the Heart of the Sun…
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream
more…

HOT Topics

1. Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge
more…

2. Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security
more…

3. Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II
more…

HP 80

Metadata Targeting

SemanticInteractive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Inputs: Realtime Feeds, Web sites and Pages, Content Databases, Time-Shifted Content Aggregator

Interests, Preferences

Semantic Engine™

Structured Hi-Quality Semantic Metabase

Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.
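Selecting "just the right metadata" for a small screen can be pictured as choosing fields in priority order until the space runs out. The sketch below rests on assumptions throughout: the `FIELD_PRIORITY` table, the field names and the `format_listing` function are invented for illustration and do not describe how the Semantic Engine actually performs filtering.

```python
# Hypothetical per-content-type field priorities (most important first).
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst", "rating"],
    "Earnings": ["date", "source", "period"],
}

DEFAULT_PRIORITY = ["date", "source"]


def format_listing(item, content_type, max_chars):
    """Join the highest-priority fields that fit within max_chars characters."""
    parts = []
    for field in FIELD_PRIORITY.get(content_type, DEFAULT_PRIORITY):
        value = item.get(field)
        if not value:
            continue
        candidate = " ".join(parts + [value])
        if len(candidate) > max_chars:
            break  # screen constraint reached; drop lower-priority fields
        parts.append(value)
    return " ".join(parts)


call = {"date": "11/08", "source": "ON24", "analyst": "Payne", "rating": "Strong Buy"}
print(format_listing(call, "Analyst Call", 16))
print(format_listing(call, "Analyst Call", 10))
```

On a wider display the same record could show the rating as well, which on the wireless screen is instead conveyed by an icon.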

HP 87

iTV: Taalee's Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine™

Meta-Data Tagged Content

Content "Programs"

Immediate Interests, Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support (Sony)

Categorize, Catalog, Integrate

Collection, Processing

Network Content, Affiliate Feeds, Public Sources

Rich Data Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

  • Value-add for production, broadcast & syndication

  • Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

  • Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

Related video clips (duration - date - network):
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

Diagram labels:
Video; MPEG-2/4/7; MPEG Encoder; Enhanced Digital Cable; MPEG Decoder
Create Scene Description Tree; Retrieve Scene Description Track; Scene Description Tree; Node = AVO Object
Taalee Semantic Engine; Enhanced XML Description; "Cisco Systems" Node; Object Content Information (OCI); Metadata-rich Value-added Node
GREAT USER EXPERIENCE

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Metadata for Intelligent Content Usage

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

  • If a keyword is not in the content, it cannot be found.

  • The burden is on the user to think of and ask for the "right" keyword.

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
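
The enrichment step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Taalee's patented implementation: the `KNOWLEDGE_BASE` dictionary, the relation names, the field mapping, and the function `add_value_added_metadata` are all hypothetical, and the American League fact is added from general knowledge rather than from the slides.

```python
# Hypothetical knowledge base: entity -> list of (relationship, related entity).
# The Roger Clemens facts mirror the slide; the structure itself is an assumption.
KNOWLEDGE_BASE = {
    "Roger Clemens": [("plays_sport", "Baseball"), ("plays_for", "New York Yankees")],
    "New York Yankees": [("member_of", "American League")],
}

# Which metadata field each relationship fills (illustrative names).
FIELD_FOR_RELATION = {
    "plays_sport": "Sport",
    "plays_for": "Teams",
    "member_of": "League",
}


def add_value_added_metadata(metadata, entity_field="Players", max_hops=2):
    """Return a copy of `metadata` with fields inferred from the knowledge base."""
    enriched = {field: list(values) for field, values in metadata.items()}
    frontier = list(metadata.get(entity_field, []))
    for _ in range(max_hops):
        next_frontier = []
        for entity in frontier:
            for relation, target in KNOWLEDGE_BASE.get(entity, []):
                values = enriched.setdefault(FIELD_FOR_RELATION[relation], [])
                if target not in values:
                    values.append(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return enriched


# Metadata extracted from a story that mentions only the player.
story = {"Players": ["Roger Clemens"]}
enriched = add_value_added_metadata(story)
print(enriched)
```

With the enriched record, a search on Teams = "New York Yankees" would match the story even though the team name never appears in its text.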

HP 96

Guided Demo for Value Added Metadata - Example One

  • Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

  • Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

  • Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

  • Now click on the button to play the asset (button marked "REAL")

  • View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

  • Verify that neither Team = Atlanta Falcons nor League = NFL was present in the source content

  • Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

  • Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

  • Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

  • Click on the first result (titled "I want out") & view the metadata on the following RMR page

  • Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

  • Now click on the button to play the asset (button marked "REAL")

  • View the source HTML page that has the original story and locate this story with the heading "I want out"

  • Verify that neither Team = Los Angeles Dodgers nor League = National League was present in the source content

  • Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

  • Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on the first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on the first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of the content provider that produced the asset.

Posted Date: date of asset posting. Extracted automatically.

Player Names: names of players mentioned explicitly in the asset. Extracted automatically.

Team Name: name of the team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee's semantic relationships.

League Name: name of the league to which the player's team belongs. Not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations.

Sport: name of the sport.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

Legend: "X → Y" means Taalee uses X to add Y as value-added metadata to the asset.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

  • Traditional search engines rely solely on (syntactic) keywords to find content.

  • They do not understand the meaning, context, or relationships of keywords.

For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Components: Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Knowledge Base; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML, or other format

1. Cisco story from PW Source 1 is passed on to add semantic associations.

2. The Semantic Engine consults the Knowledge Base for Cisco's competition.

3. The Knowledge Base returns the result: Lucent is a competitor of Cisco.

4. A Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard.

Story on Cisco

Story on Lucent
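
The four numbered steps in this architecture (a Cisco story arrives, the knowledge base is consulted for competitors, Lucent is returned, and a Lucent story is selected as semantically related) might be sketched as follows. The data structures and the function names `companies_in` and `semantically_related` are hypothetical; only the Cisco/Lucent fact comes from the slide.

```python
# Hypothetical knowledge base of "competes with" facts.
# The Cisco/Lucent fact is the one shown on the slide.
COMPETITORS = {"Cisco": {"Lucent"}}


def companies_in(story):
    """Companies named in a story's (already extracted) metadata."""
    return set(story["metadata"].get("Company", []))


def semantically_related(story, candidates):
    """Return candidate stories about competitors of companies in `story`."""
    rivals = set()
    for company in companies_in(story):
        rivals |= COMPETITORS.get(company, set())
    return [c for c in candidates if companies_in(c) & rivals]


cisco_story = {"title": "Story on Cisco", "metadata": {"Company": ["Cisco"]}}
external_feed = [
    {"title": "Story on Lucent", "metadata": {"Company": ["Lucent"]}},
    {"title": "Story on IBM", "metadata": {"Company": ["IBM"]}},
]

related_titles = [s["title"] for s in semantically_related(cisco_story, external_feed)]
print(related_titles)
```

The lookup works on extracted metadata rather than on story text, which is why the Lucent story can be selected without mentioning Cisco at all.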

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What You Need to Know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research: inferred automatically

Semantically related news: not specifically asked for

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough:
we need a "logic layer" on top of RDF;
some type of description logic is a possibility.

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data.

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: formal semantics and reasoning support (OIL, DAML-O)

Schema Layer: definition of vocabulary (RDF Schema)

Data Layer: simple data model and syntax for metadata (RDF)

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
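
As a rough sketch of how the "Causes" relationship above could be evaluated, the Python below implements `dateDifference` as a day count and `distance` as a great-circle (haversine) distance. Several points are assumptions rather than something the slides state: the use of the haversine formula, the unit of miles for the 10000 threshold (a later slide mentions miles), the requirement that the earthquake not precede the test, and the record layout. The sample events are invented for illustration.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; unit choice is an assumption


def date_difference_days(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Evaluate the slide's rule for one (nuclear test, earthquake) pair."""
    days = date_difference_days(test["eventDate"], quake["eventDate"])
    miles = distance_miles(test["latitude"], test["longitude"],
                           quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


# Invented sample events, for illustration only.
test_event = {"eventDate": date(1970, 3, 1), "latitude": 37.0, "longitude": -116.0}
quake_near = {"eventDate": date(1970, 3, 11), "latitude": 36.0, "longitude": -117.5}
quake_late = {"eventDate": date(1970, 6, 1), "latitude": 36.0, "longitude": -117.5}

print(could_have_caused(test_event, quake_near))  # within 30 days and nearby
print(could_have_caused(test_event, quake_late))  # more than 30 days later
```

The rule only identifies candidate pairs; as the slides go on to show, further queries are needed before drawing any conclusion about causation.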

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
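
The per-range query above can be illustrated with a small grouping routine. This is a sketch only: the half-open bin boundaries (for instance, a magnitude of exactly 9.0 falls into the ">9" bin), the `(year, magnitude)` record layout, and the sample data are all assumptions made for illustration and are not taken from the InfoQuilt demo.

```python
from collections import Counter

# Magnitude ranges from the slide, as half-open intervals [low, high).
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Label of the range containing `magnitude`, or None if below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range over the given years."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for _, _, label in BINS}


# Invented (year, magnitude) records, for illustration only.
sample_quakes = [(1950, 5.9), (1950, 6.5), (1951, 6.1), (1951, 7.2), (1951, 5.0)]
averages = average_per_year(sample_quakes, 1950, 1951)
print(averages)
```

Running the same routine over two periods (for example, before and after 1945) would give the per-range comparison the slide describes.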

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998.

HP 39

Different views of Metadata

Metadata

Domain Independent Specifications: RDF

Frameworks/Infrastructures: XCM

Application Specific: ICE

Media Specific: MPEG7, VoiceXML

Domain Specific: NewsML, FGDC/UDK

HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services; Taalee Content Applications

Produce/Aggregate: Where is the content? Whose is it?

Catalog/Index

Integrate, Syndicate: What other content is it related to?

Personalize: What is the right content for this user?

Interactive Marketing: What is the best way to monetize this interaction?

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

  • Automatic Content Classification/Categorization

  • Metadata Creation/Extraction; types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

FormsTypesIngest of Content

Sources: Web sites, content feeds, and private repositories
Types: Text, graphics, audio, video, multimedia
Forms: Unstructured text, semi-structured text, structured text (+ media); static or dynamic
Ingest: Feed (push), Web (pull), repository/database (usually pull)

HP 44

Content HandlingIngest

Infrastructure
Exchange
Feed Handlers
Crawlers
Screen Scrapers
Bots
Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA

EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
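
The layout rule `Date => day month int ',' int` is a syntactic pattern, and one plausible way to apply it is with a regular expression. The sketch below is an assumption about how such a rule might be implemented, not the extractor shown in the workshop; the pattern, the optional commas, and the function name `extract_date` are illustrative.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# day month int ',' int, with commas treated as optional because
# punctuation is often lost or inconsistent in extracted text.
DATE_PATTERN = re.compile(
    rf"(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+(?P<date>\d{{1,2}}),?\s+(?P<year>\d{{4}})"
)


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    return {
        "day": match["day"],
        "month": match["month"],
        "date": int(match["date"]),
        "year": int(match["year"]),
    }


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
report_date = extract_date(header)
print(report_date)
```

A purely syntactic rule like this finds the date but says nothing about what the date means (report date versus incident date), which is the gap the semantic approaches on the following slides address.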

Traditional Text Categorization

Customer Article Feed (Article 4715)
Statistical/AI Techniques, using a Customer Training Set
Classify: place in a taxonomy
Routing/Distribution

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Classification of Article 4715

Taalee's Categorization & Automatic Metadata Creation

Article Feed (Article 4715)
Knowledge-base & Statistical/AI Techniques, using a Taalee Training Set and a Customer Training Set
Classify: place in a taxonomy
Map to another taxonomy
Metadata Catalog
Content Manager
Routing/Distribution
Precise syndication/filtering

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Classification of Article 4715
FTE, ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships
Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis

Learning/AI and Collab. Filtering

Reference data: Concept-terms, Dictionary/Thesaurus, by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

Knowledge Base: by Entities and Relationships

(deeper understanding)

HP 56

Open Directory Project (ODP) Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
Attributes
Domain Rules
Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE

Processes: shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI

Relationship label used between concepts in the diagram: "causes" (four links)
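
A small ontology of this kind can be held as two relation tables: an "is a" hierarchy and a set of "causes" links. The Python sketch below is illustrative only. The concept names come from the slide, but the parent/child groupings are a reading of the flattened diagram, and the specific "causes" pairs are assumptions, since the endpoints of those links are not stated in the text.

```python
# "is a" hierarchy: child concept -> parent concept (grouping is an assumption).
IS_A = {
    "EARTHQUAKE": "NATURAL DISASTER",
    "LANDSLIDE": "NATURAL DISASTER",
    "VOLCANO": "NATURAL DISASTER",
    "STORM": "NATURAL DISASTER",
    "FLOOD": "NATURAL DISASTER",
    "FIRE": "NATURAL DISASTER",
    "AVALANCHE": "NATURAL DISASTER",
    "TSUNAMI": "NATURAL DISASTER",
    "RECYCLING": "WASTE DISPOSAL",
    "LANDFILL": "WASTE DISPOSAL",
}

# "causes" links: these specific pairs are hypothetical examples.
CAUSES = {
    "EARTHQUAKE": ["LANDSLIDE", "TSUNAMI"],
    "VOLCANO": ["FIRE"],
    "STORM": ["FLOOD"],
}


def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it in the hierarchy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


def concepts_under(ancestor):
    """All concepts that are strictly below `ancestor`, sorted by name."""
    return sorted(c for c in IS_A if is_a(c, ancestor))


def possible_effects(concept):
    """Concepts directly linked from `concept` by a 'causes' relationship."""
    return sorted(CAUSES.get(concept, []))


print(concepts_under("WASTE DISPOSAL"))
print(possible_effects("EARTHQUAKE"))
```

Keeping the relationship types in separate tables lets an application answer both classification questions (what kind of thing is this?) and association questions (what is this related to?).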

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

Traditional queries based on keywords
Attribute-based queries
Content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based (?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

  • Context of the Content

  • Semantic Metadata describing entities (e.g., Company, Industry, etc.), and

  • Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = +

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search.
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage: Keyword, Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage search on "football touchdown":

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and domain-specific attributes. Uniform metadata for content from multiple sources. Can be sorted by any field.

Delightful, relevant information; exceptional targeting opportunity
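
The contrast between keyword search and attribute search over uniform metadata can be sketched as below. The two records reuse metadata values shown elsewhere in these slides (with dates rewritten in ISO form so they sort correctly, and the first posted date being a reading of the slide's "2022000"); the function names and the record layout are hypothetical and are not Taalee's API.

```python
# Sample asset records built from metadata shown on the slides.
ASSETS = [
    {
        "Title": "Baltimore 31, Pit 24",
        "League": "Professional",
        "Teams": ["Ravens", "Steelers"],
        "Players": ["Quandry Ismail", "Tony Banks"],
        "Event": "Touchdown",
        "Produced by": "NFL.com",
        "Posted date": "2000-02-02",
    },
    {
        "Title": "Week 3 Top10 Anderson TD Run",
        "League": "NFL",
        "Teams": ["Atlanta Falcons"],
        "Players": ["Jamal Anderson"],
        "Produced by": "NFL.com",
        "Posted date": "2000-09-20",
    },
]


def keyword_search(assets, keyword):
    """Syntactic search: match only if the keyword occurs in the title text."""
    return [a for a in assets if keyword.lower() in a["Title"].lower()]


def attribute_search(assets, criteria, sort_by=None):
    """Match assets whose metadata satisfies every field/value pair in `criteria`."""
    def matches(asset):
        for field, wanted in criteria.items():
            value = asset.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True

    results = [a for a in assets if matches(a)]
    if sort_by is not None:
        results.sort(key=lambda a: a.get(sort_by, ""))
    return results


print(keyword_search(ASSETS, "Falcons"))  # the title never mentions the team
print([a["Title"] for a in attribute_search(ASSETS, {"Teams": "Atlanta Falcons"})])
print([a["Title"] for a in attribute_search(ASSETS, {"Produced by": "NFL.com"},
                                            sort_by="Posted date")])
```

Because every asset carries the same fields regardless of its source, the same query and sort logic applies to content from multiple providers.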

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries amp Hot Topics

PERSONALIZATION

3 Julia Roberts Collection

Movie Trailer Notting Hill

Trailer - Runaway Bride

Patrick

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sunhellip

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1 My Stock Portfolio

Microsoft suffers serious hack attack

Cisco Systems Inc

Analyst Safa Rashtchy on Yahoo

PeopleSoft Inc

ATampT Corp

2 My Football Fantasy Team

Gators Spurrier ready for big game

Techs Vick looks to become complete QB

Bucs excited about Hamilton

Jasper Sanks rumbles into the end zonehellip

Edwards explains reasons for leaving BYU morehellip

morehellip

morehellip

morehellip

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Cole's security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II

HOT Topics

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos

Buy Russell Crowe Videos

Buy Christopher Plummer Videos

Buy Diane Venora Videos

Buy Philip Baker Hall Videos

Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests, Preferences

Time-Shifted Content Aggregator

Web sites and Pages

Content Databases

Semantic Engine™

Structured Hi-Quality Semantic Metabase

Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; "My Stocks" list showing CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; "My Stocks" list showing CSCO, NT, IBM, Market; CSCO content types: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; "My Stocks" list showing CSCO, NT, IBM, Market; CSCO content types: Analyst Call, Conf Call, Earnings]

CSCO Analysis

11/08 ON24 Payne

11/07 ON24 H&Q

11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
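The semantic filtering described on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the field names, the priority table, and the `render_line` helper are hypothetical and are not taken from Taalee's product. The idea shown is that each content type has an ordered list of metadata fields, and only as many fields as fit the screen are displayed.

```python
# Hypothetical priority of metadata fields per content type (an assumption,
# not Taalee's actual configuration). Earlier fields are shown first.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst", "rating"],
}
DEFAULT_PRIORITY = ["date", "title"]


def render_line(item, max_chars):
    """Build one display line, adding fields in priority order while they fit."""
    parts = []
    for field in FIELD_PRIORITY.get(item["type"], DEFAULT_PRIORITY):
        value = item.get(field)
        if not value:
            continue  # this asset has no value for the field
        candidate = " ".join(parts + [value])
        if len(candidate) > max_chars:
            break  # screen constraint reached; drop lower-priority fields
        parts.append(value)
    return " ".join(parts)


# Sample record modeled on the listing shown on the slide.
CALL = {"type": "Analyst Call", "date": "11/07", "source": "ON24",
        "analyst": "H&Q", "rating": "Strong Buy"}

narrow = render_line(CALL, 16)   # small wireless screen
wide = render_line(CALL, 40)     # larger display
```

On a narrow screen the rating is dropped from the text line (the slide shows it as an icon instead); on a wider display the same record yields the full line.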

HP 87

iTV: Taalee's Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine™

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection, Processing

Network Content

Affiliate Feeds

Public Sources Rich Data

Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

Gore Demands That Recount Restart (9:40 PM)

Gore Says Fla. Can't Name Electors (4:50 PM)

Bush Meets Colin Powell at Ranch (1:22 PM)

Market Tumbles on Earnings Warning (9:27 AM)

Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --

Sixty Die In Nigeria Blast

Produced by: Euronews

Posted Date: 11/30/2000

Event: Election 2000

Location: Tallahassee, Florida, USA

People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC

(2:53) - 12/06/00 - CBS

(5:16) - 12/06/00 - ABC

(2:46) - 12/06/00 - FOX

(1:33) - 12/06/00 - NBC

(5:33) - 12/06/00

(3:57) - 12/06/00 - CBS

(4:27) - 12/06/00 - ABC

(3:44) - 12/06/00 - FOX

(7:24) - 12/06/00 - CBS

(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) -- Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC

(2:33) - 12/06/00 - CBS

(3:12) - 12/06/00 - NNS

(0:32) - 12/06/00 - CBS

(1:33) - 12/06/00 - CBS

Description

Produced by: CNN

Posted Date: 12/07/2000

Reporter: David Lewis

Event: Election 2000

Location: Tallahassee, Florida, USA

People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEG Decoder

Node = AVO Object

Create Scene Description Tree

GREAT USER EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

Scene Description Tree

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML Description

"Cisco Systems"

Node

Taalee Semantic Engine

"Cisco Systems"

Produced by: Fox Sports

Creation Date: 12/05/2000

League: NFL

Teams: Seattle Seahawks, Atlanta Falcons

Players: John Kitna

Coaches: Mike Holmgren, Dan Reeves

Location: Atlanta

Object Content Information (OCI)

Metadata-rich Value-added Node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for

Extractor Agents

Content which does not contain the words the user asked for, but is about what he asked for

Value-added Metadata

Content the user did not think to ask for, but which he needs to know

Semantic Associations

+ +

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
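A minimal sketch of the idea on this slide, assuming a tiny hand-written knowledge base: the dictionary layout, the field names, the `enrich` and `search` helpers, and the sample story title are all hypothetical and only illustrate how semantic associations can add metadata that the content itself does not contain.

```python
# Hypothetical knowledge base of semantic associations (illustrative only;
# this is not Taalee's WorldModel). PERSON -> plays SPORT for TEAM.
PLAYERS = {
    "Roger Clemens": {"sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"sport": "Football", "team": "Atlanta Falcons"},
}
# TEAM -> belongs to LEAGUE.
TEAM_LEAGUE = {"New York Yankees": "American League", "Atlanta Falcons": "NFL"}


def enrich(metadata):
    """Return a copy of the metadata with teams, sports and leagues inferred from players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = PLAYERS.get(player)
        if facts is None:
            continue  # unknown entity: nothing can be inferred
        inferred = (
            ("teams", facts["team"]),
            ("sports", facts["sport"]),
            ("leagues", TEAM_LEAGUE.get(facts["team"])),
        )
        for field, value in inferred:
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def search(assets, field, value):
    """Attribute search: titles of assets whose metadata field contains the value."""
    return [a["title"] for a in assets if value in a["metadata"].get(field, [])]


# A made-up story that mentions the player but never names his team.
story = {"title": "Clemens strikes out ten", "metadata": {"players": ["Roger Clemens"]}}
before = search([story], "teams", "New York Yankees")
story["metadata"] = enrich(story["metadata"])
after = search([story], "teams", "New York Yankees")
```

Before enrichment the team search finds nothing; afterwards the same story is returned, which is the effect the slide describes.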

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of content provider that produced the asset

Posted Date: date of asset posting - extracted automatically

Player Names: names of players mentioned explicitly in the asset - extracted automatically

Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships

League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations

Sport: name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
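The difference between keyword matching and semantic associations can be sketched as below. This is a hedged illustration, not Taalee's Semantic Content Model: the triple layout, the relation names, the `related` and `semantic_results` helpers, and the sample headlines are assumptions. Only the Commerce One facts come from the slide.

```python
# Semantic associations as (subject, relation, object) triples.
# The three facts are the ones stated on the slide; the encoding is an assumption.
TRIPLES = {
    ("Commerce One", "IS_A", "COMPANY"),
    ("Commerce One", "PARTICIPATES_IN", "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
}

# Made-up headlines, each tagged with the companies it is about.
HEADLINES = [
    {"title": "Commerce One signs marketplace deal", "companies": ["Commerce One"]},
    {"title": "Ariba reports quarterly results", "companies": ["Ariba"]},
    {"title": "Cisco announces new router", "companies": ["Cisco"]},
]


def related(entity, relation, triples=TRIPLES):
    """Entities linked to `entity` by `relation`, read in either direction."""
    forward = {o for s, r, o in triples if s == entity and r == relation}
    backward = {s for s, r, o in triples if o == entity and r == relation}
    return sorted(forward | backward)


def semantic_results(company, headlines=HEADLINES, triples=TRIPLES):
    """Split results into what was asked for and what is semantically associated."""
    asked_for = [h["title"] for h in headlines if company in h["companies"]]
    competitors = related(company, "COMPETES_WITH", triples)
    associated = [
        h["title"]
        for h in headlines
        if company not in h["companies"]
        and any(c in h["companies"] for c in competitors)
    ]
    return asked_for, associated


asked_for, associated = semantic_results("Commerce One")
```

A keyword engine would return only the first list; the association lookup also surfaces the Ariba story, which the user did not ask for explicitly.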

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters), handled by Extractor Agents 1, 2, and 3

Components: Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication

Metadata format: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults KnowledgeBase for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Output: Story on Cisco, Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

Technology, Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF+XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary - RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
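The rule above can be sketched in executable form as follows. This is an illustration under assumptions, not the InfoQuilt implementation: the `Event` class, the haversine formula for `distance`, the use of miles, and the sample events are all hypothetical. The lower bound of zero days is also an assumption, added to reflect the wording "some time after"; the rule as written gives only the upper bound of 30.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; unit choice is an assumption


@dataclass(frozen=True)
class Event:
    name: str
    event_date: date
    latitude: float
    longitude: float


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(min(1.0, a)))


def may_cause(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake <= dateDifference < 30 AND distance < 10000."""
    days = date_difference(test, quake)
    near = distance(test.latitude, test.longitude,
                    quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


def candidate_pairs(tests, quakes):
    """All (test, earthquake) name pairs that satisfy the rule."""
    return [(t.name, q.name) for t in tests for q in quakes if may_cause(t, q)]


# Made-up events for illustration only.
TEST = Event("test-1", date(1970, 1, 1), 37.1, -116.0)
QUAKES = [
    Event("quake-after", date(1970, 1, 10), 36.0, -120.0),
    Event("quake-before", date(1969, 12, 20), 36.0, -120.0),
    Event("quake-late", date(1970, 3, 1), 36.0, -120.0),
]
pairs = candidate_pairs([TEST], QUAKES)
```

Only the earthquake that follows the test within 30 days is paired with it. Such pairs are candidates for further analysis, not evidence of causation.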

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
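The grouping query on this slide could be computed along the following lines. This is a sketch with assumed details: the record layout (year, magnitude), the half-open treatment of range boundaries, the `>=9` label for the top bin, and the sample records are not from the source.

```python
from collections import Counter

# Magnitude ranges from the slide, treated as half-open intervals [low, high).
# Boundary handling is an assumption, since the slide's ranges share endpoints.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">=9"),
]


def magnitude_bin(magnitude):
    """Label of the range containing the magnitude, or None if below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range), starting from start_year."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive span of years."""
    total = sum(n for (year, lbl), n in counts.items()
                if lbl == label and first_year <= year <= last_year)
    return total / (last_year - first_year + 1)


# Made-up (year, magnitude) records for illustration only.
SAMPLE = [(1899, 7.5), (1950, 5.9), (1950, 6.5), (1950, 6.9), (1951, 7.2), (1951, 5.0)]
counts = counts_per_year(SAMPLE)
avg_6_7 = average_per_year(counts, "6-7", 1950, 1951)
```

Comparing such averages before and after a chosen year, per magnitude range, is the kind of analysis the slide summarizes.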

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax

ICE: www.icestandard.org

Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05

XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03

DAML: www.daml.org

NEWSML: newsshowcase.reuters.com

PRISM: www.prismstandard.org/techdev/prismspec1.asp

XCM: www.vignette.com

OIL: www.ontoknowledge.org/oil

SEMANTICWEB: www.semanticweb.org

VOICEXML: www.voicexml.org

MPEG7: www.darmstadt.gmd.de/mobile/MPEG7

Taalee: www.taalee.com

Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998.


HP 40

Creating and Serving Metadata to Power the Life-cycle of Content

Taalee Infrastructure Services Taalee Content Applications

Where is the content?

Whose is it?

Produce/Aggregate

Catalog/Index

What other content is it related to?

Integrate, Syndicate

What is the right content for this user?

Personalize

What is the best way to monetize this interaction?

Interactive Marketing

Broadcast, Wireline, Wireless, Interactive TV

Taalee Semantic MetaBase

HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories

Types: Text, Graphics, Audio, Video, Multimedia

Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic

Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange

Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT

Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
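A layout rule such as `Date => day month int ',' int` is commonly implemented as a pattern over the text. The sketch below uses a regular expression. It is an assumption about how such a rule could be realized, not the extractor the slide shows, and the names `DATE_RULE` and `extract_date` are hypothetical. Commas are treated as optional so the pattern also matches text whose punctuation has been lost.

```python
import re
from datetime import date

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")
MONTH_NAMES = ("January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December")

# Date => day month int ',' int   (commas optional in this sketch)
DATE_RULE = re.compile(
    r"(?P<day>" + "|".join(DAY_NAMES) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTH_NAMES) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+(?P<year>\d{4})"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month = MONTH_NAMES.index(match["month"]) + 1
    return date(int(match["year"]), month, int(match["dom"]))


report_date = extract_date(
    "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
)
```

The extracted value can then be stored as a "Date" metadata attribute for the document. A purely syntactic rule like this recognizes the layout but says nothing about what the report is about.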

Traditional Text Categorization

Statistical/AI Techniques

Classify: place in a taxonomy

feed

Customer Training Set

Routing/Distribution

Customer Article Feed: Article 4715

Standard Metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Classification of Article 4715

Knowledge-base & Statistical/AI Techniques

Classify: place in a taxonomy

Metadata Catalog

Content Manager

Precise syndication/filtering

Article 4715 Metadata

Standard metadata

Feed Source: iSyndicate

Posted Date: 11/20/2000

Semantic metadata

Company Name: France Telecom, Equant

Ticker Symbol: FTE, ENT

Exchange: NYSE

Topic: Company News

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee's Categorization & Automatic Metadata Creation

Taalee Training Set

Customer Training Set

ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis

Classification of Article 4715

Article Feed: Article 4715

Routing/Distribution

Map to another taxonomy

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference: (Statistical) (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...); Logic Based (Description Logic); Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis

Learning/AI and Collab. Filtering

Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP) Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes

Capture the semantics involved via domain characteristics

Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI (linked by "causes" relationships)

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords

attribute based queries

content-based queries
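The first two query styles listed above can be contrasted with a small sketch. Assumptions are labeled: the asset records reuse titles and metadata values that appear elsewhere in this deck, but the record layout and the `keyword_query` and `attribute_query` helpers are hypothetical. Content-based queries (for example, similarity over audio or video features) are not sketched here.

```python
# Two assets modeled on the football examples in this deck.
# The record layout is an assumption made for illustration.
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},
    },
    {
        "title": "Bal 31 Pit 24",
        "text": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "League": ["Professional"],
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": ["Touchdown"],
        },
    },
]


def keyword_query(assets, *words):
    """Traditional query: every keyword must occur in the asset's text."""
    return [a["title"] for a in assets
            if all(w.lower() in a["text"].lower() for w in words)]


def attribute_query(assets, **criteria):
    """Attribute query: every attribute=value pair must be present in the metadata."""
    return [a["title"] for a in assets
            if all(value in a["metadata"].get(attr, [])
                   for attr, value in criteria.items())]


by_keyword = keyword_query(ASSETS, "touchdown")
by_attribute = attribute_query(ASSETS, Event="Touchdown", Teams="Ravens")
```

The keyword query returns both assets, including an interview that merely mentions a touchdown. The attribute query returns only the asset whose metadata records an actual touchdown event involving the Ravens.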

HP 64

Oingo.com

Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction

Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" Content can only be "found" if the user enters a keyword that exists within it.

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search.

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute and Content Based Access

The VisualHarness system at LSDIS, UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from Typical Cataloging of Football Assets (Virage search on "football touchdown"):

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline

Taalee Metadata on Football Assets (Rich Media Reference Page):

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
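The contrast between the two result sets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Asset` class, the two search functions, and the lower-case field names are hypothetical and are not Taalee's or Virage's actual interfaces; the sample values are taken from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A catalogued media asset (hypothetical structure for illustration)."""
    title: str
    description: str
    metadata: dict = field(default_factory=dict)


def keyword_search(assets, *words):
    """Return assets whose title or description contains every word."""
    wanted = [w.lower() for w in words]
    return [a for a in assets
            if all(w in (a.title + " " + a.description).lower() for w in wanted)]


def attribute_search(assets, **criteria):
    """Return assets whose metadata matches every attribute=value pair."""
    def matches(asset):
        for attr, wanted in criteria.items():
            value = asset.metadata.get(attr)
            values = value if isinstance(value, (list, tuple)) else [value]
            if wanted not in values:
                return False
        return True
    return [a for a in assets if matches(a)]


assets = [
    Asset("Jimmy Smith Interview Part Seven",
          "Jimmy Smith explains his philosophy on showboating"),
    Asset("Brian Griese Interview Part Four",
          "Brian Griese talks about the first touchdown he ever threw"),
    Asset("Baltimore 31, Pit 24",
          "Quandry Ismail and Tony Banks hook up for their third long touchdown",
          {"league": "Professional",
           "teams": ["Ravens", "Steelers"],
           "players": ["Quandry Ismail", "Tony Banks"],
           "event": "Touchdown"}),
]

# Attribute search finds the game clip by team and event type.
game_clips = attribute_search(assets, teams="Ravens", event="Touchdown")
```

A keyword query only sees the words in the text, so it cannot tell an interview that mentions a touchdown from a clip of one; the attribute query relies on the structured fields instead.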

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field.

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content

Personalization and Targeting / interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

3 Julia Roberts Collection

Movie Trailer Notting Hill

Trailer - Runaway Bride

Patrick

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sun…

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1 My Stock Portfolio

Microsoft suffers serious hack attack

Cisco Systems Inc

Analyst Safa Rashtchy on Yahoo

PeopleSoft Inc

AT&T Corp

2 My Football Fantasy Team

Gators Spurrier ready for big game

Techs Vick looks to become complete QB

Bucs excited about Hamilton

Jasper Sanks rumbles into the end zone…

Edwards explains reasons for leaving BYU

more…

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Coles security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II

more…

HOT Topics


HP 80

Metadata Targeting

SemanticInteractive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS, DISH, Wink, AOL-TV)

Semantic Engine(TM)

Metadata-Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection

Processing

Network Content

Affiliate Feeds

Public Sources Rich Data

Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --

Customize Page Settings | Content | Layout | Color

• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast

Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) - 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

Description

Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML Description

"Cisco Systems" Node

Taalee Semantic Engine

"Cisco Systems"

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+

Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+

Semantic Associations: content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
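A minimal sketch of this kind of enrichment, assuming a tiny hand-written knowledge base. The dictionary layout, the field names (`players`, `teams`, `sport`) and the function name are hypothetical; the slides do not describe how Taalee's knowledge base is actually stored or queried.

```python
# Hypothetical knowledge base: entity name -> facts implied by that entity.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"sport": "Baseball", "teams": "New York Yankees"},
}


def add_value_added_metadata(metadata, knowledge_base):
    """Return a copy of `metadata` extended with facts implied by its players.

    `metadata` maps a field name to a list of values extracted from the asset.
    Facts already present are not duplicated, and the input is not modified.
    """
    enriched = {name: list(values) for name, values in metadata.items()}
    for player in metadata.get("players", []):
        for field_name, value in knowledge_base.get(player, {}).items():
            values = enriched.setdefault(field_name, [])
            if value not in values:
                values.append(value)
    return enriched


# Metadata extracted from a story that never mentions the Yankees.
extracted = {"players": ["Roger Clemens"]}
enriched = add_value_added_metadata(extracted, KNOWLEDGE_BASE)
```

After enrichment, a search on the `teams` field for "New York Yankees" can find the story even though the team name never appears in its text.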

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield

• Click on the first result (titled "I want out") and view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting - extracted automatically
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
• League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
• Sport: name of sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly, fully described in the many ways the users choose to interact.

Legend: an arrow from X to Y means Taalee uses X to add Y as value-added metadata to the asset.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
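One way to picture how a "competes with" association surfaces related news is the sketch below. It is an assumption-laden illustration: the triple format, the `competes_with` relation name, the treatment of that relation as symmetric, and the story records are all hypothetical and not drawn from Taalee's system.

```python
from collections import defaultdict


def build_index(triples, symmetric=("competes_with",)):
    """Index (subject, relation) -> set of objects; symmetric relations go both ways."""
    index = defaultdict(set)
    for subject, relation, obj in triples:
        index[(subject, relation)].add(obj)
        if relation in symmetric:
            index[(obj, relation)].add(subject)
    return index


def related_stories(company, stories, index, relation="competes_with"):
    """Return stories about companies associated with `company` via `relation`."""
    related = index.get((company, relation), set())
    return [story for story in stories if story["company"] in related]


# Hypothetical association and story records.
triples = [("Commerce One", "competes_with", "Ariba")]
stories = [
    {"title": "Story on Commerce One", "company": "Commerce One"},
    {"title": "Story on Ariba", "company": "Ariba"},
]

index = build_index(triples)
competitor_news = related_stories("Commerce One", stories, index)
```

The user asked only about Commerce One; the Ariba story is returned because of the stored association, not because of any shared keyword.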

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)

Extractor Agent 1, Extractor Agent 2, Extractor Agent 3

Taalee Semantic Engine, World Model, Taalee Metabase

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults KnowledgeBase for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Story on Lucent

Story on Cisco

XCM-compliant metadata, XML or other format

Semantic Application

ASP/Enterprise hosted

Third-party Content Mgmt and Syndication

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough:

• we need a "logic layer" on top of RDF
• some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data.

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

• Logical Layer: formal semantics and reasoning support - OIL, DAML-O
• Schema Layer: definition of vocabulary - RDF Schema
• Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
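The rule above can be read as a predicate over a (nuclear test, earthquake) pair. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes the date difference is measured in days and must be non-negative (earthquake after the test), that the distance is a great-circle distance in miles, and it uses a hypothetical `Event` record.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


@dataclass
class Event:
    """A nuclear test or an earthquake (hypothetical record layout)."""
    event_date: date
    latitude: float
    longitude: float


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake happened within max_days after the test and nearby."""
    days = (quake.event_date - test.event_date).days
    near = distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


# Hypothetical events, for illustration only.
test = Event(date(1998, 5, 11), 27.1, 71.8)
quake = Event(date(1998, 5, 21), 30.0, 70.0)
candidate = could_have_caused(test, quake)
```

Note that a 10,000-mile threshold covers most of the globe (the largest possible great-circle distance is roughly 12,400 miles), so in practice the time window does most of the filtering.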

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and > 9 on the Richter scale per year, starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
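The per-range query can be sketched as a simple aggregation. The input format (a list of `(year, magnitude)` pairs), the function names, and the treatment of range boundaries (lower bound inclusive, upper bound exclusive, with the last range starting at 9) are assumptions made for this illustration.

```python
from collections import Counter

# Magnitude ranges from the slide; boundary handling is an assumption.
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_bin(magnitude):
    """Return the (low, high) range containing `magnitude`, or None if below 5.8."""
    for low, high in BINS:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, start_year, end_year):
    """Average number of earthquakes per year in each magnitude range.

    `quakes` is an iterable of (year, magnitude) pairs; years outside
    [start_year, end_year] and magnitudes below the first range are ignored.
    """
    counts = Counter()
    for year, magnitude in quakes:
        bin_ = magnitude_bin(magnitude)
        if bin_ is not None and start_year <= year <= end_year:
            counts[bin_] += 1
    n_years = end_year - start_year + 1
    return {bin_: counts[bin_] / n_years for bin_ in BINS}


# Made-up sample data, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1), (1901, 4.0)]
averages = average_per_year(sample, 1900, 1901)
```

Comparing these averages for the years before and after testing began is the kind of check the slide describes.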

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10,000 miles from the epicenter of the earthquake.

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NewsML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• Semantic Web: www.semanticweb.org
• VoiceXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998.


HP 41

Taalee's Intelligent Content Process

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction; types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

• Sources: Web sites, content feeds, and private repositories
• Types: text, graphics, audio, video, multimedia
• Forms: unstructured text, semi-structured text, structured text (+ media); static or dynamic
• Ingest: feed (push), Web (pull), repository/database (usually pull)

HP 44

Content Handling/Ingest

• Infrastructure/Exchange
• Feed handlers
• Crawlers
• Screen scrapers
• Bots/software agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise Web Repositories

METADATA

EXTRACTORS

Digital Maps

NexisUPIAP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
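A layout rule like the Date pattern above maps naturally onto a regular expression. The following is a minimal sketch of that syntactic approach, assuming English weekday and month names and treating the commas as optional (they are missing in some copies of the report); the function name `extract_date` is hypothetical.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int   (commas treated as optional in this sketch)
DATE_RE = re.compile(
    r"(?P<weekday>" + "|".join(DAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<day>\d{1,2}),?\s+(?P<year>\d{4})"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if absent."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match.group("month")) + 1
    return date(int(match.group("year")), month_number, int(match.group("day")))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
report_date = extract_date(header)
```

The pattern recognizes the date purely from its surface form; it knows nothing about what the report is about, which is the limitation the later semantic slides address.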

Traditional Text Categorization

• Statistical/AI techniques, trained on a customer training set
• Classify: place in a taxonomy
• Customer article feed (Article 4715) passed on to routing/distribution

Standard metadata for Article 4715:

Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation: Automated Content Enrichment (ACE)

• Knowledge-base & statistical/AI techniques, trained on a Taalee training set and a customer training set (Taalee Enterprise Customization Suite)
• Classify: place in a taxonomy; map to another taxonomy
• Metadata catalog and content manager
• Precise syndication/filtering and routing/distribution

Article 4715 Metadata

Standard metadata:

Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata:

Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Classification of Article 4715:

• FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
• ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
• NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities

• Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships

• Taxonomy (Categories), Ontology
• Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
• Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods/Cluster Analysis

Learning/AI and Collaborative Filtering

Reference data/Concept-terms/Dictionary/Thesaurus, by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

Knowledge Base, by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize the meaning, description, and representation of the involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: three interrelated ontologies. LANDUSE (COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL) with related land concepts (LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE). WASTE DISPOSAL (RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE) with processes shredding, magnetic separation, screening, washing. NATURAL DISASTER (EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI) with "causes" relationships drawn between disaster types.]

HP 61

Large Vocabularies/Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords
attribute-based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user “think” about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

“Normal” Content can only be “found” if the user enters a keyword that exists within it

Intelligent Content = “Normal” Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more “intelligent” through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google’s Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute, and Content-Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Virage Search on football touchdown

Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens’ lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000

HP 72

Taalee’s Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more…

2. My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…

3. Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more…

4. Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more…

HOT Topics

1. Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…

2. Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more…

3. Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web: Extreme Personalization

[Figure labels: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests/Preferences; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Content; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site

User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user’s portfolio. Typically, a search can be done by typing one word or selecting from a dynamic, personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as “Strong Buy” by the H&Q analyst

HP 87

iTV: Taalee’s Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine™

Meta-Data Tagged Content

Content “Programs”

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection

Processing

Network Content

Affiliate Feeds

Public Sources

Rich Data Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee’s semantic metadata enables powerful access to content used by the Enterprise’s customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata’s role in emerging iTV infrastructure

[Figure labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Taalee Semantic Engine; Enhanced XML Description; Metadata-rich Value-added Node; Object Content Information (OCI); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Node: “Cisco Systems”

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: Jon Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the “right” keyword

For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”

Understanding of the content is needed to create new metadata

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
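
The Roger Clemens example can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions, not Taalee's implementation: the `KNOWLEDGE_BASE` dictionary, the relationship names (`plays_sport`, `plays_for`), the field names, and the `story-1` asset ID are all hypothetical.

```python
# Hypothetical knowledge base: (entity, relationship) -> related value.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "is_a"): "PERSON",
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
}

# Which relationship fills which metadata field (illustrative names).
FIELD_FOR_RELATIONSHIP = {"plays_sport": "Sport", "plays_for": "Team"}


def enrich(metadata):
    """Return a copy of the metadata with fields inferred from the knowledge base."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("Players", []):
        for relationship, field in FIELD_FOR_RELATIONSHIP.items():
            value = KNOWLEDGE_BASE.get((player, relationship))
            if value is not None and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def search(catalog, field, term):
    """Attribute search: match a term against one metadata field."""
    return [asset_id for asset_id, metadata in catalog.items()
            if term in metadata.get(field, [])]


# Metadata extracted from the story itself: only the player is mentioned.
extracted = {"Players": ["Roger Clemens"]}
catalog = {"story-1": enrich(extracted)}
print(search(catalog, "Team", "New York Yankees"))
```

Without the `enrich` step, the same search returns nothing, because the team name never appears in the extracted metadata.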

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked “REAL”)

• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled “I want out”) & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked “REAL”)

• View the source HTML page that has the original story and locate this story with the heading “I want out”

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots (“Jamal Anderson”)

Search for ‘Jamal Anderson’ in ‘Football’

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots (“Gary Sheffield”)

Search for ‘Gary Sheffield’ in ‘Baseball’

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting; extracted automatically
Player Names: names of players mentioned explicitly in the asset; extracted automatically
Team Name: name of team for which player plays; not mentioned explicitly in asset; value-added using Taalee’s semantic relationships
League Name: name of league to which the player’s team belongs; not mentioned explicitly in asset; value-added by Taalee’s processing based on semantic associations
Sport: name of sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways the users choose to interact

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
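
A minimal Python sketch of how typed relationships can turn a lookup for one company into associated content about its competitors. Everything here is an assumption made for illustration: the triple layout, the relationship names, the choice to treat competition as symmetric, and the placeholder asset IDs. It is not a description of the Semantic Content Model itself.

```python
# Hypothetical typed relationships in the spirit of the slide's example.
RELATIONSHIPS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Assumption: competition holds in both directions.
SYMMETRIC = {"competes_with"}


def build_index(triples):
    """Index triples by (subject, relationship) for direct lookup."""
    index = {}
    for subject, relation, obj in triples:
        index.setdefault((subject, relation), set()).add(obj)
        if relation in SYMMETRIC:
            index.setdefault((obj, relation), set()).add(subject)
    return index


def associated_content(entity, index, assets_by_entity):
    """Return what was asked for plus assets about the entity's competitors."""
    asked_for = assets_by_entity.get(entity, [])
    related = {
        competitor: assets_by_entity.get(competitor, [])
        for competitor in sorted(index.get((entity, "competes_with"), set()))
    }
    return asked_for, related


index = build_index(RELATIONSHIPS)
assets = {"Commerce One": ["asset-101"], "Ariba": ["asset-202"]}  # placeholder IDs
print(associated_content("Commerce One", index, assets))
```

A keyword index alone would return only `asset-101`; the relationship lookup is what surfaces the Ariba asset.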

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company ‘Commerce One’

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Components: Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML, or other format

1. Cisco story from PW Source 1 is passed on to add semantic associations
2. Consults the Knowledge Base for Cisco’s competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds is picked for publishing as “semantically related” to the Cisco story and passed on to Dashboard

Story on Lucent

Story on Cisco

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough
we need a “logic layer” on top of RDF
some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = “Frame System for WWW”

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

“A Web in which machine reasoning will be ubiquitous and devastatingly powerful” [Berners-Lee]

“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture” [Berners-Lee]

“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities” [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable Tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary - RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
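
The rule above can be read as an executable predicate. The Python sketch below is one possible reading, not the InfoQuilt implementation: the `Event` class, the haversine formula for `distance`, the Earth radius constant, the use of miles, and the extra check that the earthquake does not precede the test (taken from the prose "some time after") are all assumptions added for illustration.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; assumed unit is miles


@dataclass
class Event:
    """Hypothetical record for a nuclear test or an earthquake."""
    event_date: date
    latitude: float
    longitude: float


def date_difference(earlier, later):
    """Number of days from `earlier` to `later` (negative if reversed)."""
    return (later - earlier).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Evaluate the 'NuclearTest Causes Earthquake' rule for one pair of events."""
    days = date_difference(test.event_date, quake.event_date)
    miles = distance(test.latitude, test.longitude,
                     quake.latitude, quake.longitude)
    # 0 <= days encodes "occurred some time after the test" (assumption).
    return 0 <= days < max_days and miles < max_miles
```

Applying `may_have_caused` to every (test, earthquake) pair yields the candidate pairs that the later knowledge discovery queries examine.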

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
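
The grouped query can be sketched as a small aggregation in Python. This is an illustration only: the `(year, magnitude)` record layout, the half-open treatment of range boundaries (so that exactly 9.0 falls in the ">9" group), and the function names are assumptions, and no real earthquake data is included.

```python
from collections import Counter

# Magnitude ranges from the slide as half-open intervals [low, high).
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the range containing the magnitude, or None."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range.

    `quakes` is an iterable of (year, magnitude) pairs; years outside
    [first_year, last_year] and magnitudes below 5.8 are ignored.
    """
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for _, _, label in BINS}
```

Running `average_per_year` separately over the years before and after the first recorded test is one way to compare the two periods, which is the comparison the slide's conclusion rests on.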

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee’s Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies/Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee’s Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web: Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee’s Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 42

Metadata Creation and Semanticization

• Automatic Content Classification/Categorization

• Metadata Creation/Extraction: types of metadata created

Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc.

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange
Feed Handlers
Crawlers
Screen Scrapers
Bots/Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text DocumentExtracting a Text DocumentSyntactic approachSyntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
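The layout rule above reads as a small grammar: a weekday name, a month name, an integer, a comma, and another integer. A minimal sketch of how such a syntactic rule could be applied, assuming it maps onto a regular expression; the pattern, the optional commas, and the function name `extract_dates` are illustrative assumptions, not part of the original system:

```python
import re

WEEKDAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int
# Assumption: commas are treated as optional, because punctuation is
# often lost when text is extracted from its original layout.
DATE_RULE = re.compile(
    rf"\b(?P<day>{WEEKDAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<date>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_dates(text):
    """Return one dict per date matched by the syntactic rule."""
    return [match.groupdict() for match in DATE_RULE.finditer(text)]


if __name__ == "__main__":
    header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
    print(extract_dates(header))
```

A purely syntactic rule like this finds the date but says nothing about what the date means (report date, fire start date, and so on), which is the gap the semantic approaches later in the workshop address.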

Traditional Text Categorization

[Figure: a customer article feed (article 4715) is classified, that is, placed in a taxonomy, using statistical/AI techniques and a customer training set, then passed to routing/distribution. Only standard metadata is available: Feed Source: iSyndicate; Posted Date: 11/20/2000.]

Taalee's Categorization & Automatic Metadata Creation

[Figure: Automated Content Enrichment (ACE) in the Taalee Enterprise Customization Suite. Article 4715 from the article feed is classified (placed in a taxonomy) using a knowledge base plus statistical/AI techniques, with a Taalee training set and a customer training set. Results go to the metadata catalog and content manager for precise syndication/filtering and routing/distribution, and can be mapped to another taxonomy.

Article 4715 metadata:
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News

Categories shown for FTE and ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis. Categories shown for NYSE: Member Companies, Market News, IPOs.]

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: dictionary, thesaurus, reference data, vocabulary

B. Facts with Relationships: taxonomy (categories), ontology, domain modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
Statistical (information retrieval)
Statistical learning/AI (Bayesian, neural networks, HMM, ...)
Logic based (description logic)
Natural language/grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods/cluster analysis

Learning/AI and collaborative filtering

Reference data/concept-terms/dictionary/thesaurus, by topic/industry/subject/domain (word or phrase)

Ontologies/domain models

Knowledge base, by entities and relationships

(Moving down this list gives deeper understanding.)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize the meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
Attributes
Domain rules
Functional dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: three interrelated ontologies.
LAND USE: commercial, industrial, rural, residential, agricultural, military, recreational; related nodes include land (site), cultivated area, greenland area, land bank, zoning, and landfill site.
WASTE DISPOSAL: recycling, hazardous, landfill, resource recovery, solid, sewage; processes shown include shredding, magnetic separation, screening, and washing.
NATURAL DISASTER: earthquake, landslide, volcano, storm, flood, fire, avalanche, tsunami, linked by several "causes" relationships.]

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

Traditional queries based on keywords
Attribute based queries
Content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" Content can only be "found" if the user enters a keyword that exists within it.

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search.
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic metadata

Virage Search on "football touchdown"

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Ravens' lead to 31-24 in the third quarter.

Taalee Metadata on Football Assets (Rich Media Reference Page)

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
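The contrast on this slide can be sketched in a few lines of Python. This is a hedged illustration only: the catalog layout and the function names `keyword_search` and `attribute_search` are assumptions made for the example, and the records reuse the titles and field values shown above rather than any real Virage or Taalee API.

```python
# Hypothetical catalog: the first two assets carry only a title and a
# description; the third also carries semantic metadata fields.
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating"},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw"},
    {"title": "Baltimore 31, Pit 24",
     "description": "Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional",
                  "Teams": ["Ravens", "Steelers"],
                  "Players": ["Qadry Ismail", "Tony Banks"],
                  "Event": "Touchdown"}},
]


def keyword_search(assets, *keywords):
    """Return titles whose title or description contains any keyword."""
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if any(keyword.lower() in text for keyword in keywords):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Return titles whose metadata matches every field=value criterion."""
    def matches(metadata, field, wanted):
        value = metadata.get(field)
        if isinstance(value, list):
            return wanted in value
        return value == wanted

    hits = []
    for asset in assets:
        metadata = asset.get("metadata", {})
        if all(matches(metadata, f, w) for f, w in criteria.items()):
            hits.append(asset["title"])
    return hits


if __name__ == "__main__":
    print(keyword_search(ASSETS, "touchdown"))
    print(attribute_search(ASSETS, Event="Touchdown", Teams="Ravens"))
```

The keyword query returns anything that mentions the word, including an interview that merely talks about a touchdown, while the attribute query returns only assets whose Event field is a touchdown involving the named team.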

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and domain specific attributes. Uniform metadata for content from multiple sources. Can be sorted by any field.

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content: personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION: Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more...

2. My Football Fantasy Team: Gators: Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU; more...

3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory; more...

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more...

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more...

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more...

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure: Web extreme personalization. Components shown: realtime feeds, Web sites and pages, and content databases as inputs; the Semantic Engine(TM) with its structured, high-quality Semantic Metabase; a time-shifted content aggregator; user interests and preferences; and personalized content as the output.]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

CSCO Analysis

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Figure: metadata in an enterprise content pipeline (Sony shown as an example).
Collection: network content, affiliate feeds, public sources.
Processing: categorize, catalog, integrate, producing a rich data Metabase.
Production Support: filter, search, consolidate, personalize, archive, licensing, syndication.]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Figure: enhanced digital cable pipeline. An MPEG encoder (MPEG-2/4/7) produces video with a scene description tree in which each node is an AVO object. The Taalee Semantic Engine attaches an enhanced XML description to a node (for example "Cisco Systems"), producing a metadata-rich, value-added node. An MPEG decoder retrieves the scene description track and creates the scene description tree, leading to a great user experience.]

Object Content Information (OCI):
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: Jon Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through video server vendors, video app servers, and broadcasters

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content (Usage):

Extractor Agents: content which does contain the words the user asked for
+
Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+
Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
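The idea of adding metadata that is not present in the content can be sketched as a lookup against a small knowledge base. This is a minimal illustration under stated assumptions, not Taalee's implementation: the `KNOWLEDGE_BASE` layout and the `enrich` and `search` helpers are hypothetical, and the facts are limited to those stated in this workshop (Roger Clemens, Jamal Anderson, Gary Sheffield).

```python
# Hypothetical knowledge base: player -> fields implied by semantic
# associations (PERSON plays SPORT, plays for TEAM, TEAM is in LEAGUE).
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Teams": "New York Yankees"},
    "Jamal Anderson": {"Teams": "Atlanta Falcons", "League": "NFL"},
    "Gary Sheffield": {"Teams": "Los Angeles Dodgers",
                       "League": "National League"},
}


def enrich(metadata):
    """Return a copy of the metadata with value-added fields filled in."""
    enriched = dict(metadata)
    for player in metadata.get("Players", []):
        for field, value in KNOWLEDGE_BASE.get(player, {}).items():
            # Never overwrite metadata extracted from the asset itself.
            enriched.setdefault(field, value)
    return enriched


def search(assets, field, value):
    """Return titles of assets whose enriched metadata has field == value."""
    return [asset["Title"] for asset in assets
            if enrich(asset).get(field) == value]


if __name__ == "__main__":
    assets = [{"Title": "Week 3 Top 10: Anderson TD Run",
               "Players": ["Jamal Anderson"]}]
    # The team is never mentioned in the asset, yet the search finds it.
    print(search(assets, "Teams", "Atlanta Falcons"))
```

Using `setdefault` reflects one design choice: metadata extracted explicitly from the asset takes precedence, and the knowledge base only fills the gaps.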

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

[Figure: metadata fields attached to a rich media sports asset.]

Posted Date: date of asset posting. Extracted automatically.
Producer Name: name of content provider that produced the asset.
Player Names: names of players mentioned explicitly in the asset. Extracted automatically.
Team Name: name of team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee's semantic relationships.
League Name: name of league to which the player's team belongs. Not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations.
Sport: name of sport.

Legend: an arrow from X to Y means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways the users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
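One way to picture semantic associations is as a set of (subject, relationship, object) triples that an application can follow from the entity the user asked about to entities the user did not mention. The sketch below is an illustration under assumptions: the triple layout, the relationship names, and the helpers `related` and `related_stories` are hypothetical, and the facts come only from this workshop (Commerce One and Ariba; Cisco and Lucent).

```python
# Hypothetical association store as (subject, relationship, object) triples.
ASSOCIATIONS = [
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Cisco", "competes_with", "Lucent"),
]

# Assumption: competition is treated as a symmetric relationship.
SYMMETRIC = {"competes_with"}


def related(entity, relationship):
    """Return the set of entities linked to `entity` by `relationship`."""
    found = set()
    for subject, rel, obj in ASSOCIATIONS:
        if rel != relationship:
            continue
        if subject == entity:
            found.add(obj)
        elif obj == entity and rel in SYMMETRIC:
            found.add(subject)
    return found


def related_stories(entity, stories):
    """Return titles of stories about competitors of `entity`.

    `stories` is a list of (title, company) pairs.
    """
    competitors = related(entity, "competes_with")
    return [title for title, company in stories if company in competitors]


if __name__ == "__main__":
    feed = [("Story on Lucent", "Lucent"), ("Story on Cisco", "Cisco")]
    print(related_stories("Cisco", feed))
```

A keyword index would return only the Cisco story for a Cisco query; following the association also surfaces the competitor's story.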

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata centric Content Management Architecture

[Figure: Extractor Agents 1, 2, and 3 read from Internal Source 1 (Research), Internal Source 2, and external feeds/Web (e.g., Reuters). They feed the Semantic Engine, World Model, and Taalee Metabase, which supply XCM-compliant metadata (XML or other format) to a semantic application (ASP/Enterprise hosted) and to third-party content management and syndication.]

1. Cisco story from PW Source 1 is passed on to add semantic associations
2. Consults the knowledge base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story, and passed on to the Dashboard

Outputs shown: Story on Cisco, Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news, not specifically asked for

Semantic search, personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough
We need a "logic layer" on top of RDF
Some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition:
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
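The rule above can be evaluated directly once `dateDifference` and `distance` are given concrete definitions. The sketch below is one possible reading, with assumptions stated plainly: dates are compared in days, distance is great-circle distance in miles computed with the haversine formula, and the earthquake must not precede the test (the slide's prose says "some time after"). The function names and the record layout are illustrative and are not taken from the InfoQuilt system.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against floating-point values slightly above 1.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def date_difference(test_date, quake_date):
    """Number of days from the nuclear test to the earthquake."""
    return (quake_date - test_date).days


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Evaluate the 'NuclearTest Causes Earthquake' rule for one pair."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    # Assumption: the earthquake must occur on or after the test date.
    return 0 <= days < max_days and miles < max_miles


if __name__ == "__main__":
    # Made-up events, for illustration only.
    test = {"eventDate": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
    quake = {"eventDate": date(2000, 1, 10), "latitude": 0.0, "longitude": 1.0}
    print(could_have_caused(test, quake))
```

Applying `could_have_caused` to every (test, earthquake) pair drawn from the two sets of sources yields the candidate pairs that the later query on this topic asks for.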

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
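The grouped query above amounts to bucketing earthquakes by magnitude range and averaging the counts over the years considered. A minimal sketch follows; the boundary handling (each range includes its lower bound, and exactly 9.0 falls in 8-9) and the function names are assumptions, and the sample records in the usage example are made up, not real seismic data.

```python
from collections import Counter

LABELS = ["5.8-6", "6-7", "7-8", "8-9", ">9"]


def bucket(magnitude):
    """Map a magnitude to its range label, or None if below 5.8."""
    if magnitude < 5.8:
        return None
    if magnitude < 6:
        return "5.8-6"
    if magnitude < 7:
        return "6-7"
    if magnitude < 8:
        return "7-8"
    if magnitude <= 9:
        return "8-9"
    return ">9"


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range.

    `quakes` is an iterable of (year, magnitude) pairs; years outside
    [first_year, last_year] are ignored.
    """
    counts = Counter()
    for year, magnitude in quakes:
        label = bucket(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for label in LABELS}


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1)]
    print(average_per_year(sample, 1900, 1901))
```

Running the same function over the periods before and after testing began would give the per-range comparison the slide draws its conclusion from.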

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG-7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media. Amit Sheth and Wolfgang Klas, Eds. McGraw Hill, ISBN 0-07-057735-8, 1998.

  • The Mysteries of MetadataWorkshop at Content World 2001 Burlingame CA May 15 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperabilitykey metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards(or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 43

Forms/Types/Ingest of Content

Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)

HP 44

Content Handling/Ingest

Infrastructure/Exchange
Feed Handlers
Crawlers/Screen Scrapers/Bots
Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

[Diagram labels: METADATA EXTRACTORS; Global/Enterprise/Web Repositories; Digital Maps; Nexis/UPI/AP; Documents; Digital Audios; Data Stores; Digital Videos; Digital Images]

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
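
The layout rule above can be read as a pattern for the report's date line. The following is a minimal sketch in Python, assuming the rule means a weekday name, a month name, a day number, a comma, and a four-digit year. The regular expression and the function name `extract_report_date` are illustrative assumptions, not part of the extractor described in the workshop.

```python
import re
from datetime import date

MONTHS = {name: i for i, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

# Date => day month int ',' int
# Read here as: weekday name, month name, day number, optional comma, year.
DATE_RULE = re.compile(
    r"(?P<weekday>Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<day>\d{1,2}),?\s+"
    r"(?P<year>\d{4})"
)

def extract_report_date(text):
    """Return the first date matching the layout rule, or None if absent."""
    m = DATE_RULE.search(text)
    if m is None:
        return None
    return date(int(m["year"]), MONTHS[m["month"]], int(m["day"]))

header = "INCIDENT MANAGEMENT SITUATION REPORT\nFriday, August 1, 1997 - 0530 MDT"
print(extract_report_date(header))  # 1997-08-01
```

A purely syntactic rule like this finds the date wherever the layout holds, but it knows nothing about what the report is about, which is the gap the later slides address with semantic metadata.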

Traditional Text Categorization

[Diagram labels: Customer Article Feed (Article 4715); Statistical/AI Techniques; Customer Training Set; Classify/Place in a taxonomy; Classification of Article 4715; Routing/Distribution]

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

[Diagram labels: Article Feed (Article 4715); Knowledge-base & Statistical/AI Techniques; Taalee Training Set; Customer Training Set; Taalee Enterprise Customization Suite; Classify/Place in a taxonomy; Classification of Article 4715; Metadata Catalog; Content Manager; Routing/Distribution; Precise syndication/filtering; Map to another taxonomy]

Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Automated Content Enrichment (ACE)
FTE, ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

AutoCategorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

AutoCategorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text From Bloomberg

AutoCategorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods/Cluster Analysis

Learning/AI and Collab. Filtering

Reference data/Concept-terms/Dictionary/Thesaurus: By topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: By Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Diagram: three interrelated ontologies; node labels listed below]

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; related nodes LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; four links among these nodes are labeled "causes"
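
The "causes" links in the natural-disaster ontology can be treated as edges in a directed graph. The sketch below is illustrative only: the transcript does not preserve which pairs of disasters the four "causes" links connect, so every edge in `CAUSES` is a hypothetical placeholder, and the function name `possible_effects` is likewise an assumption.

```python
from collections import deque

# Hypothetical "causes" edges. The slide shows four such links, but the
# transcript does not say which nodes they connect.
CAUSES = {
    "EARTHQUAKE": ["LANDSLIDE", "TSUNAMI"],
    "VOLCANO": ["FIRE"],
    "STORM": ["FLOOD"],
}

def possible_effects(event, causes=CAUSES):
    """Return every disaster reachable from `event` by following 'causes' links."""
    seen, queue = set(), deque([event])
    while queue:
        current = queue.popleft()
        for effect in causes.get(current, []):
            if effect not in seen:
                seen.add(effect)
                queue.append(effect)
    return sorted(seen)

print(possible_effects("EARTHQUAKE"))  # ['LANDSLIDE', 'TSUNAMI']
```

Relationships of this kind are what distinguish an ontology from a plain taxonomy: a query about one concept can be expanded to the concepts it is related to.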

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords
attribute based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content: adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic metadata

Virage Search on football touchdown

Jimmy Smith Interview Part Seven
Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview Part Four
Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
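
The contrast between the two kinds of search can be sketched in a few lines of Python. The field names follow the metadata labels on the slide; the record layout and the functions `keyword_search` and `attribute_search` are assumptions made for illustration, not the interface of Virage or Taalee.

```python
# Records with only free text, as in typical cataloging.
keyword_catalog = [
    {"title": "Jimmy Smith Interview Part Seven",
     "text": "Jimmy Smith explains his philosophy on showboating"},
    {"title": "Brian Griese Interview Part Four",
     "text": "Brian Griese talks about the first touchdown he ever threw"},
]

# A record with attribute-level semantic metadata (labels from the slide).
semantic_catalog = [
    {"title": "Baltimore 31, Pit 24",
     "league": "Professional",
     "teams": ["Ravens", "Steelers"],
     "players": ["Qadry Ismail", "Tony Banks"],
     "event": "Touchdown",
     "produced_by": "NFL.com"},
]

def keyword_search(records, word):
    """Match a word anywhere in the free text of a record."""
    word = word.lower()
    return [r["title"] for r in records
            if word in (r["title"] + " " + r["text"]).lower()]

def attribute_search(records, **criteria):
    """Match records whose named attributes equal or contain the given values."""
    def matches(record):
        for field, wanted in criteria.items():
            value = record.get(field)
            if isinstance(value, list):
                if wanted not in value:
                    return False
            elif value != wanted:
                return False
        return True
    return [r["title"] for r in records if matches(r)]

print(keyword_search(keyword_catalog, "touchdown"))
# ['Brian Griese Interview Part Four']
print(attribute_search(semantic_catalog, event="Touchdown", teams="Ravens"))
# ['Baltimore 31, Pit 24']
```

The keyword hit is an interview that merely mentions a touchdown, while the attribute query returns an asset whose event actually is a touchdown involving the requested team.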

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more…

2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…

3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
more…

4 Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more…

HOT Topics

1 Election 2000
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…

2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more…

3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…

HP 80

Metadata Targeting

SemanticInteractive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless home page menu with MyStocks, News, Sports, Music, MyMedia]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: My Stocks menu listing CSCO, NT, IBM, Market]

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: My Stocks > CSCO menu listing Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: CSCO Analysis listing]
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Meta-Data Tagged Content; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in

This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc. - all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing; Categorize; Catalog; Integrate; Rich Data Metabase; Production Support; Sony; Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Video: Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG Encoder; MPEG-2/4/7; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced XML Description; Taalee Semantic Engine; "Cisco Systems" Node; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Intelligent Metadata Creation

Extractor Agents: Content which does contain the words the user asked for
+
Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+
Semantic Associations: Content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
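
The enrichment step described above can be sketched as a lookup against a small knowledge base. This is a minimal illustration, not Taalee's implementation: the facts about Roger Clemens come from the slide, while the dictionary layout and the names `KNOWLEDGE_BASE`, `add_value_added_metadata`, and `search` are assumptions.

```python
# A toy knowledge base of semantic associations (layout is an assumption).
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball",
                      "team": "New York Yankees"},
}

def add_value_added_metadata(metadata, knowledge_base=KNOWLEDGE_BASE):
    """Return a copy of `metadata` enriched with facts implied by its players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = knowledge_base.get(player, {})
        for field, kb_key in (("teams", "team"), ("sports", "sport")):
            value = facts.get(kb_key)
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched

def search(catalog, field, value):
    """Return the records whose `field` list contains `value`."""
    return [m for m in catalog if value in m.get(field, [])]

story = {"players": ["Roger Clemens"]}   # metadata extracted from the text alone
enriched = add_value_added_metadata(story)

print(enriched["teams"])                                     # ['New York Yankees']
print(len(search([story], "teams", "New York Yankees")))     # 0
print(len(search([enriched], "teams", "New York Yankees")))  # 1
```

The story becomes findable under the team name even though the team is never mentioned in the content, which is the point of value-added metadata.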

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

[Diagram: a Rich Media Sports Asset and its metadata fields]

Posted Date: Date of asset posting - Extracted automatically
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset - Extracted automatically
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Sport: Name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly, fully described in the many ways the users chose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
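
The associations named on this slide can be written down as subject-relation-object triples and followed at query time. The sketch below is illustrative only: the three facts about Commerce One come from the slide, while the triple layout, the relation names, the `NEWS` entries, and both functions are hypothetical.

```python
# Associations from the slide as (subject, relation, object) triples.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog entries, used only to exercise the lookup.
NEWS = [
    {"headline": "Story on Commerce One", "company": "Commerce One"},
    {"headline": "Story on Ariba", "company": "Ariba"},
    {"headline": "Story on an unrelated company", "company": "Other Co"},
]

def related(subject, relation, triples=TRIPLES):
    """Return all objects linked to `subject` by `relation`."""
    return [o for s, r, o in triples if s == subject and r == relation]

def search_with_associations(company):
    """Return what was asked for plus news on the company's competitors."""
    asked_for = [n["headline"] for n in NEWS if n["company"] == company]
    competitors = related(company, "competes_with")
    competitor_news = [n["headline"] for n in NEWS
                       if n["company"] in competitors]
    return {"asked_for": asked_for, "competitor_news": competitor_news}

print(search_with_associations("Commerce One"))
# {'asked_for': ['Story on Commerce One'], 'competitor_news': ['Story on Ariba']}
```

A keyword index alone would return only the first list; the second list exists because the COMPETES WITH relationship is stored explicitly.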

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata, XML, or other format; Third-party Content Mgmt and Syndication; Semantic Application (ASP/Enterprise hosted); Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

[Diagram: a COMPANY and the content associated with it]
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology, Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search/Personalization etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
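The "Causes" rule above can be sketched in Python as a predicate over pairs of events. This is a minimal illustration, not InfoQuilt's actual implementation: the `Event` record, the function names, the use of the haversine formula for `distance`, the assumption that the threshold is in miles, and the reading of `dateDifference` as "days after the test, never before it" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumed unit


@dataclass(frozen=True)
class Event:
    """A dated, geolocated event (hypothetical record layout)."""
    event_date: date
    latitude: float
    longitude: float


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    p1, p2 = radians(lat1), radians(lat2)
    dlat, dlon = p2 - p1, radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(p1) * cos(p2) * sin(dlon / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake falls within max_days after the test and within max_miles of it."""
    days_after = (quake.event_date - test.event_date).days
    if not 0 <= days_after < max_days:
        return False
    return distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles


def candidate_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if could_have_caused(t, q)]
```

The two thresholds are parameters so that the same predicate can be re-run with a tighter time window or region when exploring the hypothesis.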

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
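The grouping query above could be sketched as follows. The input layout (an iterable of `(year, magnitude)` tuples), the function names, and the treatment of range boundaries (each range includes its lower bound and excludes its upper bound, so exactly 9.0 falls in ">9") are assumptions for illustration only.

```python
from collections import Counter

# (label, inclusive lower bound, exclusive upper bound); boundaries are assumed.
RANGES = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def range_label(magnitude):
    """Return the label of the magnitude range, or None if below 5.8."""
    for label, low, high in RANGES:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, start_year, end_year):
    """Average yearly number of earthquakes per magnitude range.

    quakes: iterable of (year, magnitude) tuples.
    """
    counts = Counter()
    for year, magnitude in quakes:
        if start_year <= year <= end_year:
            label = range_label(magnitude)
            if label is not None:
                counts[label] += 1
    n_years = end_year - start_year + 1
    return {label: counts[label] / n_years for label, _, _ in RANGES}
```

Calling `average_per_year` once for the years before the first test and once for the years after it gives the two sets of averages that the slide compares.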

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lies within a radius of 10,000 miles of the epicenter of the earthquake.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998.

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • What RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 44

Content HandlingIngest

Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents

Centralized, Distributed, Mobile/Migratory

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Digital Maps

Nexis, UPI, AP

Documents

Digital Audios

Data Stores

Digital Videos

Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
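The layout rule above (a day name, a month name, an integer, a comma, an integer) maps naturally onto a regular expression. The sketch below is one possible syntactic extractor, not the workshop's actual tool: the pattern, the optional commas, and the `extract_date` helper are assumptions made for illustration.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int   (commas optional, since feeds often drop them)
DATE_RULE = re.compile(
    r"\b(?P<day>" + "|".join(DAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+"
    r"(?P<year>\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match["month"]) + 1
    return date(int(match["year"]), month_number, int(match["dom"]))
```

Applied to the report header, `extract_date("Friday, August 1, 1997 - 0530 MDT")` yields a `date` value that can be stored as a structured "Date" metadata field.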

Traditional Text Categorization

[Figure: a Customer Article Feed (Article 4715) passes through Statistical/AI Techniques, trained on a Customer Training Set, which Classify/Place the article in a taxonomy; the Classification of Article 4715 drives Routing/Distribution.]

Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

[Figure: the Article Feed (Article 4715) passes through Knowledge-base & Statistical/AI Techniques, trained on a Taalee Training Set and a Customer Training Set, which Classify/Place the article in a taxonomy. Other labels in the figure: Classification of Article 4715, Map to another taxonomy, Routing/Distribution, Metadata Catalog, Content Manager, Precise syndication/filtering, Automated Content Enrichment (ACE), Taalee Enterprise Customization Suite.]

Article 4715 Metadata

Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Taxonomy placements
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods, Cluster Analysis

Learning/AI and Collaborative Filtering

Reference data/Concept-terms: Dictionary/Thesaurus, by topic/industry/subject/domain

Word or Phrase

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

(toward deeper understanding)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
Attributes
Domain Rules
Functional Dependencies

HP 59

An Ontology

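Example: Interrelated ontologies

[Figure: four interrelated concept hierarchies. The edges between individual concepts are not recoverable from the extracted text.]

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several of these are linked by "causes" relationships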

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

Traditional queries based on keywords
Attribute-based queries
Content-based queries

HP 64

Oingo.com

Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" Content can only be "found" if the user enters a keyword that exists within it.

[Figure: Intelligent Content shown as a sum of components; the operands are not recoverable from the extracted text.]

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (e.g. Company, Industry, etc.), and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search.
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage Search on football touchdown

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000

HP 72

Taaleersquos Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more...

2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU; more...

3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory; more...

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more...

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more...

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more...

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more...

HP 80

Metadata Targeting

SemanticInteractive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure: inputs and outputs of the Semantic Engine(TM)]

Realtime Feeds

Interests, Preferences

Time-Shifted Content Aggregator

Web sites and Pages

Content Databases

Semantic Engine(TM)

Personalized Content

Structured Hi-Quality Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.

HP 87

iTV: Taalee's Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine(TM)

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection

Processing

Network Content

Affiliate Feeds

Public Sources

Rich Data Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Figure: Video passes through an MPEG Encoder (MPEG-2/4/7), which creates a Scene Description Tree in which each Node is an AVO Object. The Taalee Semantic Engine attaches an Enhanced XML Description to a node (e.g. "Cisco Systems"), producing a metadata-rich, value-added node. Over Enhanced Digital Cable, the MPEG Decoder retrieves the Scene Description Track, leading to a GREAT USER EXPERIENCE.]

Object Content Information (OCI)
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: Jon Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for (Extractor Agents)

+

Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)

+

Content the user did not think to ask for, but which he needs to know (Semantic Associations)

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g. COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
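As a rough illustration of the idea, value-added metadata can be modeled as a lookup from extracted entities into a knowledge base whose facts are then copied onto the asset. The sketch below is not Taalee's patented extraction technology: the `KNOWLEDGE_BASE` dictionary, the field names, and the `enrich` and `matches` helpers are hypothetical, and metadata fields that receive knowledge-base facts are assumed to hold lists of strings.

```python
from copy import deepcopy

# Hypothetical knowledge base: entity -> facts known about it.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Teams": "New York Yankees"},
    "Jamal Anderson": {"Sport": "Football", "Teams": "Atlanta Falcons", "League": "NFL"},
}


def enrich(metadata):
    """Return a copy of the metadata with knowledge-base facts added.

    Facts are looked up for every name in the "Players" field; the added
    values are also recorded under "ValueAdded" so they can be audited.
    """
    enriched = deepcopy(metadata)
    added = {}
    for player in metadata.get("Players", []):
        for field, value in KNOWLEDGE_BASE.get(player, {}).items():
            values = enriched.setdefault(field, [])
            if value not in values:
                values.append(value)
                added.setdefault(field, []).append(value)
    enriched["ValueAdded"] = added
    return enriched


def matches(metadata, query):
    """Case-insensitive substring search over all metadata values."""
    needle = query.lower()
    for field, values in metadata.items():
        if field == "ValueAdded":
            continue  # bookkeeping only, not searchable content
        items = values if isinstance(values, list) else [values]
        if any(needle in str(item).lower() for item in items):
            return True
    return False
```

With this sketch, an asset tagged only with `"Players": ["Roger Clemens"]` does not match a search for "Yankees", while its enriched copy does, which is the behavior the slide describes.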

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: name of sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways the users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
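One simple way to picture semantic associations is as subject-relation-object triples that an application can traverse to find related content. The sketch below only illustrates that idea and is not the Taalee Semantic Engine: the `TRIPLES` list, the relation names, the treatment of `competes_with` as symmetric, and the story record layout are all assumptions.

```python
# Hypothetical facts, written as (subject, relation, object) triples.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]

# Relations assumed to hold in both directions.
SYMMETRIC = {"competes_with"}


def related(entity, relation):
    """Entities linked to `entity` by `relation`, in first-seen order."""
    found = []
    for subject, rel, obj in TRIPLES:
        if rel != relation:
            continue
        if subject == entity and obj not in found:
            found.append(obj)
        elif rel in SYMMETRIC and obj == entity and subject not in found:
            found.append(subject)
    return found


def semantically_related_stories(entity, stories):
    """Stories about competitors of `entity` (news the user did not ask for)."""
    competitors = related(entity, "competes_with")
    return [story for story in stories if story["company"] in competitors]
```

A search for Commerce One can then return its own stories plus `semantically_related_stories("Commerce One", stories)`, which surfaces news about Ariba without the user naming it.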

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Figure labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1-3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); XCM-compliant metadata, XML, or other format; Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 is passed on to add semantic associations

2. Consults the Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story - passed on to Dashboard
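The four numbered steps in this architecture can be sketched in a few lines of Python. This is only an illustrative sketch, not Taalee's implementation: the `Story` class, the `COMPETITOR_PAIRS` table, and the function names are hypothetical, and the IBM story is made-up filler. Only the Cisco/Lucent and Commerce One/Ariba relationships come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    headline: str
    company: str   # entity the story is about (assumed already extracted)
    source: str


# Hypothetical stand-in for the knowledge base: unordered competitor pairs.
COMPETITOR_PAIRS = {
    frozenset({"Cisco", "Lucent"}),
    frozenset({"Commerce One", "Ariba"}),
}


def competitors(company):
    """Step 2-3: consult the knowledge base for a company's competition."""
    return {
        other
        for pair in COMPETITOR_PAIRS
        if company in pair
        for other in pair
        if other != company
    }


def related_stories(story, feed):
    """Step 4: pick feed stories about competitors of the story's company."""
    rivals = competitors(story.company)
    return [candidate for candidate in feed if candidate.company in rivals]


# Step 1: a Cisco story arrives from an internal source.
cisco_story = Story("Story on Cisco", "Cisco", "Internal Source 1")
external_feed = [
    Story("Story on Lucent", "Lucent", "External feed"),
    Story("Story on IBM", "IBM", "External feed"),  # made-up, unrelated story
]

dashboard = [cisco_story] + related_stories(cisco_story, external_feed)
print([s.headline for s in dashboard])
```

The point of the sketch is that the Lucent story is selected through a relationship stored in the knowledge base, not because it shares any keyword with the Cisco story.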

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

• Related Stock News

• Industry News

• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

• Technology/Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough

• We need a "logic layer" on top of RDF

• Some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web - the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL ndash Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30 AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
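The rule above can be read as a predicate over a pair of events. The following is a minimal sketch, not the InfoQuilt implementation. It assumes dates are compared in days, that distance is a great-circle (haversine) distance in miles, and that the earthquake must not precede the test (taken from the prose "some time after"). The `Event` class and function names are hypothetical, and the sample coordinates are made up.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula, spherical Earth)."""
    earth_radius_miles = 3958.8
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_miles * asin(min(1.0, sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake: close in time (after the test) and in space."""
    days = (quake.event_date - test.event_date).days
    close_in_time = 0 <= days < max_days  # assumption: quake must follow the test
    close_in_space = distance_miles(
        test.latitude, test.longitude, quake.latitude, quake.longitude
    ) < max_miles
    return close_in_time and close_in_space


# Made-up sample events for illustration only.
nuclear_test = Event(date(1998, 5, 11), 27.1, 71.8)
quake_soon = Event(date(1998, 5, 30), 37.1, 70.1)
quake_late = Event(date(1998, 7, 1), 37.1, 70.1)

print(may_have_caused(nuclear_test, quake_soon))  # within 30 days and nearby
print(may_have_caused(nuclear_test, quake_late))  # more than 30 days later
```

Note that 10000 miles covers most of the globe, so with the thresholds as stated the time window does nearly all of the filtering.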

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale, per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year, starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
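The per-year, per-magnitude-range query described on this slide can be sketched as a simple grouping. This is an illustrative sketch only: the band boundaries are treated as half-open intervals (so a magnitude of exactly 9.0 falls in the top band, a small departure from ">9"), the function names are hypothetical, and the sample records are made up.

```python
from collections import Counter

# Magnitude bands from the slide, as half-open [low, high) intervals (assumption).
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude):
    """Return the band containing the magnitude, or None if below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def yearly_counts(quakes, start_year=1900):
    """Count earthquakes per (year, band) from (year, magnitude) records."""
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if year >= start_year and band is not None:
            counts[(year, band)] += 1
    return counts


def average_per_year(counts, band, first_year, last_year):
    """Average yearly count for one band over an inclusive range of years."""
    years = range(first_year, last_year + 1)
    return sum(counts[(year, band)] for year in years) / len(years)


# Made-up (year, magnitude) records for illustration only.
sample = [(1940, 6.1), (1940, 5.9), (1950, 6.5), (1950, 6.2),
          (1950, 7.3), (1899, 8.0), (1950, 5.0)]
counts = yearly_counts(sample)
print(counts[(1950, (6.0, 7.0))])
print(average_per_year(counts, (6.0, 7.0), 1940, 1950))
```

Comparing `average_per_year` for the years before and after 1945, band by band, is the comparison the slide draws its conclusion from.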

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG-7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 45

Information Extraction for Metadata Creation

Global/Enterprise/Web Repositories

METADATA EXTRACTORS

Sources: Digital Maps; Nexis, UPI, AP; Documents; Digital Audios; Data Stores; Digital Videos; Digital Images

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr… The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
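A layout rule of the form "day month int ',' int" maps naturally onto a regular expression. The sketch below is one possible rendering in Python, assuming English day and month names and a four-digit year. The pattern, the function name, and the output field names are illustrative choices, not part of the original system.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Date => day month int ',' int
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAYS})\s+(?P<month>{MONTHS})\s+"
    rf"(?P<day_of_month>\d{{1,2}})\s*,\s*(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule as metadata, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return {
        "day": match.group("day"),
        "month": match.group("month"),
        "day_of_month": int(match.group("day_of_month")),
        "year": int(match.group("year")),
    }


report_header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
print(extract_date(report_header))
```

A purely syntactic rule like this finds the date wherever the layout holds, but it carries no knowledge of what the date refers to, which is the limitation the later slides on semantic metadata address.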

Traditional Text Categorization

[Figure labels: Customer Article Feed (article 4715); Customer Training Set; Statistical/AI Techniques; Classify/Place in a taxonomy; Classification of Article 4715; Routing/Distribution]

Standard metadata for article 4715:
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation

[Figure labels: Article Feed (article 4715); Taalee Training Set; Customer Training Set; Knowledge-base & Statistical/AI Techniques; Classify/Place in a taxonomy; Map to another taxonomy; Classification of Article 4715; Metadata Catalog; Content Manager; Precise syndication/filtering; Routing/Distribution; Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite]

Article 4715 metadata:
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News

Classification of article 4715:
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships
Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods/Cluster Analysis

Learning/AI and Collab. Filtering

Reference data/Concept-terms/Dictionary/Thesaurus, by topic/industry/subject/domain (Word or Phrase)

Ontologies/Domain Models

KnowledgeBase, by Entities and Relationships

(deeper understanding)

HP 56

Open Directory Project (ODP) Classification/Taxonomy & Directory

HP 57

Ontology

• Standardize meaning, description, and representation of involved attributes
• Capture the semantics involved via domain characteristics
• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
• Attributes
• Domain Rules
• Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure labels, three interrelated ontologies:
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, connected by "causes" relationships]

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

• traditional queries based on keywords
• attribute-based queries
• content-based queries

HP 64

Oingo.com

• Oingo Ontology - (ODP based?) the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage search on football touchdown:

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
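The contrast on this slide between keyword search and attribute search over semantic metadata can be illustrated with a small sketch. The asset records below are abbreviated from the slide. The dictionary layout and the two search functions are hypothetical and only meant to show why the two approaches return different results.

```python
# Two assets from the slide: an interview that merely mentions a touchdown,
# and an actual touchdown play with attribute metadata attached.
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},
    },
    {
        "title": "Baltimore 31, Pit 24",
        "description": "Ismail and Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, *words):
    """Syntactic search: every word must occur in the title or description."""
    def text(asset):
        return (asset["title"] + " " + asset["description"]).lower()
    return [a for a in assets if all(w.lower() in text(a) for w in words)]


def attribute_search(assets, **criteria):
    """Attribute search: every field=value pair must match the metadata."""
    def has(asset, field, wanted):
        value = asset["metadata"].get(field, [])
        values = value if isinstance(value, list) else [value]
        return wanted in values
    return [a for a in assets if all(has(a, f, w) for f, w in criteria.items())]


print([a["title"] for a in keyword_search(ASSETS, "touchdown")])
print([a["title"] for a in attribute_search(ASSETS, Event="Touchdown")])
print([a["title"] for a in attribute_search(ASSETS, Teams="Ravens")])
```

The keyword query returns the interview as well as the play, while the attribute queries return only the asset whose metadata says it is a touchdown, or that it involves the Ravens.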

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content

Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION: Personalized Queries

1. My Stock Portfolio
  • Microsoft suffers serious hack attack
  • Cisco Systems Inc
  • Analyst Safa Rashtchy on Yahoo
  • PeopleSoft Inc
  • AT&T Corp
  • more…

2. My Football Fantasy Team
  • Gators Spurrier ready for big game
  • Techs Vick looks to become complete QB
  • Bucs excited about Hamilton
  • Jasper Sanks rumbles into the end zone…
  • Edwards explains reasons for leaving BYU
  • more…

3. Julia Roberts Collection
  • Movie Trailer Notting Hill
  • Trailer - Runaway Bride
  • Patrick
  • Movie Trailer Stepmom
  • Conspiracy Theory
  • more…

4. Pink Floyd Collection
  • Set the Controls for the Heart of the Sun…
  • Wish You Were Here
  • Round And Around
  • Keep Talking
  • The Post War Dream
  • more…

HOT Topics

1. Election 2000
  • Video Explaining the electoral map
  • Race for White House hots up
  • Seniors Give Gore Florida Edge
  • more…

2. Middle East Peace Conflict
  • More die as Israel steps up security
  • Israel braces for suicide bombs
  • Pentagon probes Coles security
  • more…

3. Napster Controversy
  • The Brain Behind Napster
  • Napster Lawsuit
  • Creative Nomad II
  • more…

HP 80

Metadata Targeting

SemanticInteractive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter Search ConsolidatePersonalize ArchiveLicensing Syndication

Production SupportProduction SupportSony

Categorize

Catalog

Integrate

CollectionCollection ProcessingProcessing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90

t

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadata's role in emerging iTV infrastructure

MPEG-2/4/7

MPEG Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Content which does contain the words the user asked for (Extractor Agents)

+

Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)

+

Content the user did not think to ask for, but which he needs to know (Semantic Associations)

Metadata for Intelligent Content

Usage

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
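The Roger Clemens example can be sketched as a lookup that adds metadata implied by an extracted entity. This is a minimal illustration under stated assumptions, not Taalee's actual method: the `KNOWLEDGE_BASE` dictionary, the field names, and both functions are hypothetical. Only the facts about Roger Clemens come from the slide.

```python
# Hypothetical knowledge base: facts associated with known entities.
KNOWLEDGE_BASE = {
    "Roger Clemens": {
        "type": "PERSON",
        "sport": "Baseball",
        "team": "New York Yankees",
    },
}


def enrich(metadata, kb=KNOWLEDGE_BASE):
    """Return a copy of the metadata with team and sport added for known players."""
    enriched = {field: set(values) for field, values in metadata.items()}
    for player in metadata.get("players", ()):
        facts = kb.get(player)
        if facts is None:
            continue  # unknown entity: nothing to add
        enriched.setdefault("teams", set()).add(facts["team"])
        enriched.setdefault("sports", set()).add(facts["sport"])
    return enriched


def matches(metadata, query):
    """True if the query string occurs in any metadata value."""
    needle = query.lower()
    return any(
        needle in value.lower()
        for values in metadata.values()
        for value in values
    )


# Metadata extracted from a story that names only the player.
extracted = {"players": {"Roger Clemens"}}

print(matches(extracted, "Yankees"))          # story is not found
print(matches(enrich(extracted), "Yankees"))  # found via value-added metadata
```

The story never mentions the team, yet after enrichment a search for "Yankees" or "New York Yankees" reaches it through the added team field.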

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

Producer Name: name of the content provider that produced the asset.

Posted Date: date of asset posting. Extracted automatically.

Sport: name of the sport.

Player Names: names of players mentioned explicitly in the asset. Extracted automatically.

Team Name: name of the team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee’s semantic relationships.

League Name: name of the league to which the player’s team belongs. Not mentioned explicitly in the asset; value-added by Taalee’s processing based on semantic associations.

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset.

The asset is richly and fully described in the many ways users choose to interact with it.
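The legend on this slide (a field X is used to add a field Y) can be illustrated with a small sketch. The dictionaries stand in for a knowledge base; their layout, the `Asset` class, and the `enrich` function are assumptions made for illustration and are not Taalee's actual implementation. Only the player, team, and league facts come from the demo slides.

```python
from dataclasses import dataclass, field

# Hypothetical knowledge base; the facts are the ones shown in the guided demo.
PLAYER_TEAM = {
    "Gary Sheffield": "Los Angeles Dodgers",
    "Jamal Anderson": "Atlanta Falcons",
}
TEAM_LEAGUE = {
    "Los Angeles Dodgers": "National League",
    "Atlanta Falcons": "NFL",
}


@dataclass
class Asset:
    extracted: dict                                   # metadata read directly from the asset
    value_added: dict = field(default_factory=dict)   # field -> (values, derived_from)


def enrich(asset: Asset) -> Asset:
    """Add Teams and League fields derived from the extracted player names."""
    players = asset.extracted.get("Players", [])
    teams = sorted({PLAYER_TEAM[p] for p in players if p in PLAYER_TEAM})
    if teams:
        asset.value_added["Teams"] = (teams, "Players")       # Players -> Teams
        leagues = sorted({TEAM_LEAGUE[t] for t in teams if t in TEAM_LEAGUE})
        if leagues:
            asset.value_added["League"] = (leagues, "Teams")  # Teams -> League
    return asset


asset = enrich(Asset(extracted={"Produced by": "ESPN", "Players": ["Gary Sheffield"]}))
print(asset.value_added)
```

Recording which field each value was derived from keeps extracted and value-added metadata distinguishable, which is the distinction the slide draws.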

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content.

• They do not understand the meaning, context, or relationships of keywords.

For example, a search engine may see that the phrase “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
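One way to picture typed entities and named relationships is a small store of (subject, relation, object) triples. This is a minimal sketch; the relation names and the functions `build_index` and `associations` are hypothetical, and only the Commerce One facts are taken from the slide.

```python
from collections import defaultdict

# Triples (subject, relation, object); relation names are illustrative.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def build_index(triples):
    """Index triples as subject -> relation -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
    return index


def associations(index, entity):
    """Return everything known about an entity, keyed by relation name."""
    return {rel: sorted(objs) for rel, objs in index.get(entity, {}).items()}


index = build_index(TRIPLES)
print(associations(index, "Commerce One"))
```

A keyword index would only know that the string "Commerce One" occurs; the triple index can also answer which companies it competes with.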

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company ‘Commerce One’.

The results include links to news on companies that Commerce One competes against. (To view news on Ariba, click on the link for Ariba.)

Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically.

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)

Components: Extractor Agents 1, 2, and 3; Semantic Engine; World Model; Taalee Metabase; Knowledge Base; Semantic Application (ASP/Enterprise hosted); third-party Content Mgmt and Syndication

Metadata format: XCM-compliant metadata, XML, or other format

Flow:

1. The Cisco story from PW Source 1 is passed on to add semantic associations.

2. The Knowledge Base is consulted for Cisco’s competition.

3. The result is returned: Lucent is a competitor of Cisco.

4. The Lucent story from the external feeds is picked for publishing as “semantically related” to the Cisco story and passed on to the Dashboard.

Output: Story on Cisco, Story on Lucent
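The four-step flow in this figure can be sketched as a lookup followed by a filter. The data structures and the function `related_stories` are hypothetical; only the fact that Lucent is a competitor of Cisco comes from the slide, and the second feed item is made up to show that unrelated stories are left out.

```python
# Hypothetical knowledge base: company -> competitors.
COMPETITORS = {"Cisco": ["Lucent"]}

# Hypothetical external feed; the second item is an invented placeholder.
EXTERNAL_FEED = [
    {"title": "Story on Lucent", "company": "Lucent"},
    {"title": "Story on another company", "company": "Example Corp"},
]


def related_stories(story, feed, competitors):
    """Return feed stories about competitors of the company in `story`."""
    rivals = set(competitors.get(story["company"], []))
    return [item for item in feed if item["company"] in rivals]


incoming = {"title": "Story on Cisco", "company": "Cisco"}
dashboard = [incoming] + related_stories(incoming, EXTERNAL_FEED, COMPETITORS)
print([s["title"] for s in dashboard])
```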

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What you asked for + What you need to know

COMPANY is associated with:

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility.

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.

RDF + DL = “Frame System for WWW”

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

“A Web in which machine reasoning will be ubiquitous and devastatingly powerful” [Berners-Lee]

“A place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixture” [Berners-Lee]

“A semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activities” [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner. Based on RDF + XML. Agent-readable tags.

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
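The rule above can be read as a predicate over a (nuclear test, earthquake) pair. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes the date difference is in days, that the distance is a great-circle distance in miles (computed with the haversine formula), and that the earthquake must not precede the test. The record layout and the function names are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:   # assumption: the quake may not precede the test
        return False
    d = distance_miles(test["latitude"], test["longitude"],
                       quake["latitude"], quake["longitude"])
    return d < max_miles


# Made-up records, used only to exercise the predicate.
test_event = {"eventDate": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
quake_event = {"eventDate": date(2000, 1, 10), "latitude": 0.0, "longitude": 1.0}
print(may_have_caused(test_event, quake_event))
```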

Knowledge Discovery - Example

When was the first recorded nuclear test conducted? 1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
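The two queries on these slides (counts per year, then averages per magnitude range) can be sketched as a grouping step. The sample records are made up, the half-open treatment of range boundaries is an assumption, and the function names are hypothetical.

```python
from collections import Counter

# Magnitude ranges from the slide; boundaries are treated as [low, high).
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    ("9+", 9.0, float("inf")),
]


def bin_label(magnitude):
    """Return the label of the range containing magnitude, or None if below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year onward."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive year span."""
    years = last_year - first_year + 1
    total = sum(n for (year, lbl), n in counts.items()
                if lbl == label and first_year <= year <= last_year)
    return total / years


# Made-up (year, magnitude) records for illustration only.
SAMPLE = [(1950, 6.1), (1950, 6.5), (1951, 7.2), (1899, 8.0), (1951, 5.0)]
counts = counts_per_year(SAMPLE)
print(counts[(1950, "6-7")], average_per_year(counts, "6-7", 1950, 1951))
```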

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax

ICE: www.icestandard.org

Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05

XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03

DAML: www.daml.org

NewsML: newsshowcase.reuters.com

PRISM: www.prismstandard.org/techdev/prismspec1.asp

XCM: www.vignette.com

OIL: www.ontoknowledge.org/oil

Semantic Web: www.semanticweb.org

VoiceXML: www.voicexml.org

MPEG7: www.darmstadt.gmd.de/mobile/MPEG7

Taalee: www.taalee.com

Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998.

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee’s Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee’s Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee’s Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 46

Extracting a Text Document: Syntactic approach

INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT

NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.

LAYOUT

Date => day month int ',' int
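A layout rule like the one above can be approximated with a regular expression. This is a sketch of the syntactic approach in general, not the extractor the slide describes; the group names and the function `extract_date` are hypothetical, and commas are made optional so the pattern also matches text whose punctuation was lost.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Date => day month int ',' int
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None."""
    match = DATE_RULE.search(text)
    return match.groupdict() if match else None


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_date(header))
```

A purely syntactic rule like this finds the date but says nothing about what the report is about, which is the limitation the following slides address.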

Traditional Text Categorization

Customer Article Feed (Article 4715) -> Statistical/AI Techniques (Customer Training Set) -> Classify/Place in a taxonomy -> Routing/Distribution

Standard Metadata for Article 4715:

Feed Source: iSyndicate

Posted Date: 11/20/2000

Taalee’s Categorization & Automatic Metadata Creation

Article Feed (Article 4715) -> Knowledge-base & Statistical/AI Techniques (Taalee Training Set, Customer Training Set) -> Classify/Place in a taxonomy -> Routing/Distribution

Additional components: Metadata Catalog; Content Manager; Precise syndication/filtering; Map to another taxonomy; Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite

Article 4715 Metadata

Standard metadata:

Feed Source: iSyndicate

Posted Date: 11/20/2000

Semantic metadata:

Company Name: France Telecom, Equant

Ticker Symbol: FTE, ENT

Exchange: NYSE

Topic: Company News

Classification of Article 4715:

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis

ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies

Market News: IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN

Video Segment with Associated Text

Segment Description

Semantic Metadata

AutoCategorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

AutoCategorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

AutoCategorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference: (Statistical) (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM…); Logic Based (Description Logic); Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis, Learning/AI, and Collab Filtering

Reference data/Concept-terms: Dictionary/Thesaurus, by topic/industry/subject/domain (Word or Phrase)

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

(deeper understanding toward the end of the list)

HP 56

Open Directory Project (ODP) ClassificationTaxonomy amp Directory

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes

Capture the semantics involved via domain characteristics

Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: four interrelated ontologies. LAND USE: commercial, industrial, rural, residential, agricultural, military, recreational. LAND (SITE): cultivated area, greenland area, land bank, zoning, landfill site. WASTE DISPOSAL: recycling, hazardous, landfill, resource rec, solid, sewage; processes: shredding, magnetic separation, screening, washing. NATURAL DISASTER: earthquake, landslide, volcano, storm, flood, fire, avalanche, tsunami, with “causes” links between disaster types.]

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM’s controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords

attribute-based queries

content-based queries

HP 64

Oingo.com

Oingo Ontology (ODP based) - the database of millions of concepts and relationships that powers Oingo’s semantic technology

Oingo Seek - the database of millions of concepts and relationships that powers Oingo’s semantic technology

Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction

Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user “think” about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

“Normal” Content can only be “found” if the user enters a keyword that exists within it

Intelligent Content = +

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more “intelligent” through automatic analysis of every individual content item to create:

Context of the Content

Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search.

Alternatives for improving search: clustering, link and other analysis (e.g., Google’s Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDISUGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from Typical Cataloging of Football Assets

Virage search on “football touchdown”:

Jimmy Smith Interview, Part Seven. Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…

Brian Griese Interview, Part Four. Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Ravens’ lead to 31-24 in the third quarter.

League: Professional

Teams: Ravens, Steelers

Score: Bal 31, Pit 24

Players: Qadry Ismail, Tony Banks

Event: Touchdown

Produced by: NFL.com

Posted date: 2/02/2000
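The difference between the two kinds of search can be sketched as follows. The asset records are abbreviated from the slide, the record layout is an assumption, and `keyword_search` and `attribute_search` are hypothetical helpers rather than either vendor's API.

```python
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "text": "hook up for their third long touchdown, this time on a 76-yarder",
        "metadata": {"League": "Professional", "Teams": ["Ravens", "Steelers"],
                     "Event": "Touchdown"},
    },
    {
        "title": "Brian Griese Interview, Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured attributes
    },
]


def keyword_search(assets, *words):
    """Titles of assets whose title or text contains every word."""
    wanted = [w.lower() for w in words]
    return [a["title"] for a in assets
            if all(w in (a["title"] + " " + a["text"]).lower() for w in wanted)]


def attribute_search(assets, **criteria):
    """Titles of assets whose metadata matches every field=value criterion."""
    def matches(meta):
        for field_name, value in criteria.items():
            stored = meta.get(field_name)
            stored_values = stored if isinstance(stored, list) else [stored]
            if value not in stored_values:
                return False
        return True
    return [a["title"] for a in assets if matches(a["metadata"])]


print(keyword_search(ASSETS, "touchdown"))
print(attribute_search(ASSETS, Event="Touchdown", Teams="Ravens"))
```

The keyword query returns both clips, including an interview that merely mentions a touchdown; the attribute query returns only the clip catalogued as a touchdown event involving the Ravens.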

HP 72

Taalee’s Semantic Search

Highly customizable, precise, and freshest AV search

Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you.

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp

2. My Football Fantasy Team: Gators’ Spurrier ready for big game; Tech’s Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU

3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole’s security

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos

Buy Russell Crowe Videos

Buy Christopher Plummer Videos

Buy Diane Venora Videos

Buy Philip Baker Hall Videos

Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests, Preferences; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Content; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless home page with menu items MyStocks, News, Sports, Music, MyMedia, $]

User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site.

User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu (MyStocks, News, Sports, Music, MyMedia, $) with My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user’s portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: My Stocks list (CSCO, NT, IBM, Market) with CSCO submenu: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: CSCO Analysis listing: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as “Strong Buy” by H&Q Analyst.

HP 87

iTV: Taalee’s Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine™

Meta-Data Tagged Content

Content “Programs”

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Collection Processing: Categorize, Catalog, Integrate

Network Content

Affiliate Feeds

Public Sources

Rich Data Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria’s commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)

Gore Says Fla. Can’t Name Electors (4:50 PM)

Bush Meets Colin Powell at Ranch (1:22 PM)

Market Tumbles on Earnings Warning (9:27 AM)

Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast

Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee’s semantic metadata enables powerful access to content used by Enterprise’s customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can’t Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

Video clips (duration - date - source): (1:33) - 12/06/00 - ABC; (2:53) - 12/06/00 - CBS; (5:16) - 12/06/00 - ABC; (2:46) - 12/06/00 - FOX; (1:33) - 12/06/00 - NBC; (5:33) - 12/06/00; (3:57) - 12/06/00 - CBS; (4:27) - 12/06/00 - ABC; (3:44) - 12/06/00 - FOX; (7:24) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS; (1:33) - 12/06/00 - ABC; (2:33) - 12/06/00 - CBS; (3:12) - 12/06/00 - NNS; (0:32) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore’s appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Description

Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata’s role in emerging iTV infrastructure

[Figure labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Scene Description Tree; Taalee Semantic Engine; Enhanced XML Description; “Cisco Systems” Node; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Example metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: Jon Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content

Extractor Agents: content which does contain the words the user asked for

+

Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+

Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found.

• The burden is on the user to think of and ask for the “right” keyword.

For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”.

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
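The Roger Clemens example can be made concrete with a toy inverted index. This is a sketch under stated assumptions: the story text is invented, `PLAYS_FOR` stands in for a knowledge base, and `build_index` is a hypothetical helper, not a description of Taalee's indexing.

```python
from collections import defaultdict

# Hypothetical semantic association; the Clemens/Yankees fact is from the slide.
PLAYS_FOR = {"Roger Clemens": "New York Yankees"}


def build_index(stories, enrich=False):
    """Map each lowercase word to the ids of stories that can be found by it."""
    index = defaultdict(set)
    for story_id, text in stories.items():
        searchable = text
        if enrich:
            for person, team in PLAYS_FOR.items():
                if person in text:
                    searchable += " " + team  # value-added metadata
        for word in searchable.lower().split():
            index[word].add(story_id)
    return index


# Invented story text that names the player but not the team.
stories = {"s1": "Roger Clemens strikes out ten"}
plain = build_index(stories)
enriched = build_index(stories, enrich=True)
print(plain.get("yankees", set()))     # keyword index alone: no match
print(enriched.get("yankees", set()))  # with value-added metadata: story s1
```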

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson.

• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page.

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked “REAL”).

• View the source HTML page that has the original story and locate the story with the heading “Week 3 top 10 Anderson TD run”.

• Verify that neither Team = Atlanta Falcons nor League = NFL was present in the source content.

• Taalee attached this value-added metadata to the asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons.

• Note that other search engines and directories will not be able to do this.

HP 97

Guided Demo for Value Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled “I want out”) & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked “REAL”)

• View the source HTML page that has the original story and locate this story with the heading “I want out”

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 – Snapshots (“Jamal Anderson”)

Search for ‘Jamal Anderson’ in ‘Football’

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search

HP 99

Example 2 – Snapshots (“Gary Sheffield”)

Search for ‘Gary Sheffield’ in ‘Baseball’

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content – Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting – extracted automatically
Player Names: names of players mentioned explicitly in the asset – extracted automatically
Team Name: name of team for which player plays – not mentioned explicitly in asset – value-added using Taalee’s semantic relationships
League Name: name of league to which the player’s team belongs – not mentioned explicitly in asset – value-added by Taalee’s processing based on semantic associations
Sport: name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact with it

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the phrase “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company ‘Commerce One’

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Inputs: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)

Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco’s competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story – passed on to Dashboard

Story on Cisco

Story on Lucent
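The four numbered steps in the architecture slide (story arrives, knowledge base is consulted for competitors, a competitor is returned, a related story is picked from the external feed) might look roughly like the following sketch. The `Story` class, the `COMPETITORS` table and the function name are hypothetical stand-ins, not the Taalee Semantic Engine's interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    headline: str
    company: str
    source: str


# Hypothetical knowledge-base fragment; only the two competitor pairs named on
# the slides are included.
COMPETITORS = {"Cisco": {"Lucent"}, "Commerce One": {"Ariba"}}


def semantically_related(story, external_feed):
    """Return feed stories about companies that compete with story.company."""
    rivals = COMPETITORS.get(story.company, set())
    return [item for item in external_feed if item.company in rivals]


cisco_story = Story("Story on Cisco", "Cisco", "Internal Source 1")
feed = [
    Story("Story on Lucent", "Lucent", "External feed"),
    Story("Story on Ariba", "Ariba", "External feed"),
]

# The dashboard shows the requested story plus what is semantically related.
dashboard_items = [cisco_story] + semantically_related(cisco_story, feed)
```

Here the Lucent story is selected for the dashboard because the knowledge base says Lucent competes with Cisco, not because the user asked for Lucent.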

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data

RDF + DL = “Frame System for WWW”

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]

“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]

“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O

Schema Layer: Definition of Vocabulary – RDF Schema

Data Layer: Simple data model and syntax for metadata – RDF

OIL – as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
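The rule above can be written as an executable predicate. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes the date difference is measured in days, the distance in miles along a great circle (haversine formula), and that events are plain dictionaries with the attribute names used in the rule. The sample events are made up for illustration.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if days < 0 or days >= max_days:
        return False  # quake came before the test, or too long after it
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Hypothetical sample events, not taken from the USGS or test-site data.
test_event = {"eventDate": date(1970, 3, 1), "latitude": 37.1, "longitude": -116.0}
quake_after = {"eventDate": date(1970, 3, 11), "latitude": 36.0, "longitude": -117.5}
quake_before = {"eventDate": date(1970, 2, 20), "latitude": 36.0, "longitude": -117.5}
quake_late = {"eventDate": date(1970, 4, 10), "latitude": 36.0, "longitude": -117.5}
```

Pairing every test with every earthquake and keeping the pairs for which `may_have_caused` is true gives the candidate set that the later slides analyze.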

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
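The two aggregate queries on these slides (counts per year, and averages per magnitude range) can be sketched as follows. The bin boundaries follow the slide; treating each bin as a half-open interval, the `(year, magnitude)` tuple layout, and the sample values are assumptions made for this illustration.

```python
from collections import Counter

# Half-open bins [low, high); how boundary values are assigned is an assumption.
BINS = [(5.8, 6.0, "5.8-6"), (6.0, 7.0, "6-7"), (7.0, 8.0, "7-8"),
        (8.0, 9.0, "8-9"), (9.0, float("inf"), ">9")]


def magnitude_bin(magnitude):
    """Return the label of the bin containing `magnitude`, or None if below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude bin) from `start_year` onward."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one magnitude bin over an inclusive year range."""
    years = last_year - first_year + 1
    total = sum(n for (year, bin_label), n in counts.items()
                if bin_label == label and first_year <= year <= last_year)
    return total / years


# Hypothetical (year, magnitude) records, for illustration only.
quakes = [(1944, 6.1), (1946, 6.3), (1946, 6.8), (1946, 7.4), (1899, 8.0), (1946, 5.0)]
counts = counts_per_year(quakes)
```

Comparing `average_per_year` for a bin before and after a chosen year is the kind of check behind the observation that counts above magnitude 7 stay roughly constant.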

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML – Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee’s Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee’s Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee’s Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF – one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL – as RDF Extension
  • DAML and OIL – Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • ResourcesReferences

Traditional TextCategorization

StatisticalAI Techniques

Classify Place ina taxonomy

feed

Customer Training

Set

RoutingDistribution

Customer Article Feed

4715

Standard Metadata

Feed Source iSyndicate

Posted Date 11202000

Classification of Article 4715

Knowledge-base amp StatisticalAI Techniques

ClassifyPlace ina taxonomy

MetadataCatalog

Content Manager

Precise syndicationfiltering

fd

Article 4715 Metadata

Standard metadata

Feed Source: iSyndicate
Posted Date: 11/20/2000

Semantic metadata

Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis

NYSE: Member Companies, Market News, IPOs

Automated Content Enrichment (ACE)

Taalee Enterprise Customization Suite

Taalee’s Categorization & Automatic Metadata Creation

Taalee Training

Set

Customer Training

Set

ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis

Classification of Article 4715

Article Feed4715 RoutingDistribution

Map to another taxonomy

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text From Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference: (Statistical) (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM, …); Logic Based (Description Logic); Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods: Cluster Analysis, Learning/AI and Collab. Filtering (by word or phrase)

Reference data: Concept-terms, Dictionary, Thesaurus (by topic/industry/subject/domain)

Ontologies/Domain Models, Knowledge Base (by entities and relationships)

Toward deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes

Capture the semantics involved via domain characteristics

Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: concept labels recovered from the diagram]

LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL

LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked by “causes” relationships

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords
attribute based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology – ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction

Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user “think” about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (e.g. Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

“Normal” Content can only be “found” if the user enters a keyword that exists within it

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more “intelligent” through automatic analysis of every individual content item to create:

Context of the Content

Semantic Metadata describing entities (e.g. Company, Industry, etc.) and

Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g. Google’s Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Metadata from Typical Cataloging of Football Assets

Virage Search on football touchdown

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Raven’s lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
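The contrast between keyword search and attribute search on this slide can be made concrete with a small sketch. The two assets reuse the titles and metadata shown above; the in-memory list, the field names as dictionary keys, and both search functions are hypothetical and only illustrate the idea.

```python
ASSETS = [
    {
        "title": "Jimmy Smith Interview Part Seven",
        "text": "Jimmy Smith explains his philosophy on showboating",
        "metadata": {},  # typical cataloging: no structured attributes
    },
    {
        "title": "Baltimore 31 Pit 24",
        "text": ("Quandry Ismail and Tony Banks hook up for their third long "
                 "touchdown, this time on a 76-yarder"),
        "metadata": {
            "League": ["Professional"],
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": ["Touchdown"],
        },
    },
]


def keyword_search(assets, query):
    """Return assets whose title or text contains every query word."""
    words = query.lower().split()
    return [asset for asset in assets
            if all(word in (asset["title"] + " " + asset["text"]).lower()
                   for word in words)]


def attribute_search(assets, **criteria):
    """Return assets whose metadata holds the given value for every field."""
    def matches(asset):
        return all(value in asset["metadata"].get(field, [])
                   for field, value in criteria.items())
    return [asset for asset in assets if matches(asset)]
```

A keyword query for "Steelers touchdown" finds nothing because the story text only says "Pit", whereas `attribute_search(ASSETS, Teams="Steelers", Event="Touchdown")` returns the Baltimore asset.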

HP 72

Taalee’s Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content

Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries amp Hot Topics

PERSONALIZATION

3 Julia Roberts Collection

Movie Trailer Notting Hill

Trailer - Runaway Bride

Patrick

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sunhellip

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1. My Stock Portfolio

Microsoft suffers serious hack attack
Cisco Systems Inc.
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc.
AT&T Corp.
more…

2. My Football Fantasy Team

Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Coles security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II morehellip

HOT Topics

morehellip

morehellip

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site

User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user’s portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as “Strong Buy” by H&Q Analyst
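One way to picture "semantic filtering" for a small screen is a per-content-type field priority list that is cut off at the number of fields the device can show. The sketch below is purely illustrative: the priority table, the field names and the function are assumptions, and the sample record simply restates the H&Q entry from the listing.

```python
# Hypothetical priority order of metadata fields for each content type.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst", "rating"],
}


def fields_for_screen(record, content_type, max_fields):
    """Pick the highest-priority fields present in `record`, up to max_fields."""
    ordered = FIELD_PRIORITY.get(content_type, sorted(record))
    chosen = [field for field in ordered if field in record][:max_fields]
    return {field: record[field] for field in chosen}


analyst_call = {
    "date": "11/07",
    "source": "ON24",
    "analyst": "H&Q",
    "rating": "Strong Buy",
}

# A three-column wireless listing keeps date, source and analyst;
# the rating is left to an icon or a detail view.
listing_row = fields_for_screen(analyst_call, "Analyst Call", 3)
```

The same record could be rendered with all four fields on a larger display by raising `max_fields`.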

HP 87

iTV: Taalee’s Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine(TM)

Meta-Data Tagged Content

Content “Programs”

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable, with interactivity features using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize, Catalog, Integrate

Collection, Processing

Network Content, Affiliate Feeds, Public Sources

Rich Data Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast

Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee’s semantic metadata enables powerful access to content used by Enterprise’s customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) - 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

Description

Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata’s role in emerging iTV infrastructure

[Figure: labels recovered from the diagram] Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Enhanced XML Description; Taalee Semantic Engine; “Cisco Systems”; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation and Usage

Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, the content cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
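The Roger Clemens example can be sketched in a few lines of Python. This is only an illustration, not Taalee's implementation: the `KNOWLEDGE_BASE` dictionary, the relationship names, and the `enrich` and `matches` helpers are all hypothetical.

```python
# Hypothetical knowledge base: entity -> (relationship, target type, target value).
KNOWLEDGE_BASE = {
    "Roger Clemens": [
        ("plays", "SPORT", "Baseball"),
        ("plays_for", "TEAM", "New York Yankees"),
    ],
}


def enrich(metadata):
    """Return a copy of the metadata with values implied by the knowledge base added."""
    enriched = {key: set(values) for key, values in metadata.items()}
    for person in metadata.get("PERSON", set()):
        for _relationship, target_type, target in KNOWLEDGE_BASE.get(person, []):
            enriched.setdefault(target_type, set()).add(target)
    return enriched


def matches(metadata, query):
    """True if the query occurs in any metadata value (case-insensitive)."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values() for value in values)


# Metadata extracted from a story that mentions only the player.
story = {"PERSON": {"Roger Clemens"}}
enriched_story = enrich(story)
```

With the extracted metadata alone, `matches(story, "Yankees")` is false; after enrichment the same query succeeds, which is the effect the slide describes.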

HP 96

Guided Demo for Value-Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR (Rich Media Reference) page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR (Rich Media Reference) page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 – Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 – Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content – Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of the content provider that produced the asset
Posted Date: date of asset posting – extracted automatically
Sport: name of sport
Player Names: names of players mentioned explicitly in the asset – extracted automatically
Team Name: name of the team for which the player plays – not mentioned explicitly in the asset – value-added using Taalee's semantic relationships
League Name: name of the league to which the player's team belongs – not mentioned explicitly in the asset – value-added by Taalee's processing based on semantic associations

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
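One way to picture such associations is as typed entities connected by named relationships. The sketch below is a minimal illustration under that assumption; the `AssociationStore` class and the relationship names `participates_in` and `competes_with` are hypothetical and are not Taalee's Semantic Content Model.

```python
from collections import defaultdict


class AssociationStore:
    """Tiny in-memory store of typed entities and named relationships."""

    def __init__(self):
        self.types = {}                         # entity name -> entity type
        self._by_subject = defaultdict(list)    # subject -> [(predicate, object)]

    def add_entity(self, name, entity_type):
        self.types[name] = entity_type

    def relate(self, subject, predicate, obj):
        self._by_subject[subject].append((predicate, obj))

    def related(self, subject, predicate):
        """Objects linked to the subject by the given relationship."""
        return [o for p, o in self._by_subject.get(subject, []) if p == predicate]


store = AssociationStore()
store.add_entity("Commerce One", "COMPANY")
store.add_entity("Ariba", "COMPANY")
store.add_entity("Corporate, Professional & Financial Software", "INDUSTRY")
store.relate("Commerce One", "participates_in",
             "Corporate, Professional & Financial Software")
store.relate("Commerce One", "competes_with", "Ariba")
```

A keyword index only records that the string "Commerce One" occurs; with the store, `store.related("Commerce One", "competes_with")` yields the competitor whose news can then be offered alongside the original result.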

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Components: Extractor Agents 1, 2 and 3; Semantic Engine; World Model; Knowledge Base; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Metadata format: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story – passed on to Dashboard

Result: Story on Cisco; Story on Lucent
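The four numbered steps can be sketched as a single lookup-and-filter function. The `COMPETITORS` table, the story dictionaries and the `related_stories` function are assumptions made for illustration; the actual Semantic Engine and Knowledge Base interfaces are not described in these slides.

```python
# Hypothetical knowledge-base content, taken from the slide's example.
COMPETITORS = {"Cisco": ["Lucent"]}


def related_stories(story, external_feed, competitors=COMPETITORS):
    """Pick feed stories about competitors of the companies in `story`."""
    # Steps 2 and 3: consult the knowledge base for each company's competition.
    rivals = {rival
              for company in story["companies"]
              for rival in competitors.get(company, [])}
    # Step 4: keep the external stories that mention at least one competitor.
    return [item for item in external_feed if rivals & set(item["companies"])]


# Step 1: a story from an internal source arrives with extracted metadata.
cisco_story = {"title": "Story on Cisco", "companies": ["Cisco"]}
feed = [
    {"title": "Story on Lucent", "companies": ["Lucent"]},
    {"title": "Story on weather", "companies": []},
]
dashboard_items = [cisco_story] + related_stories(cisco_story, feed)
```

The dashboard then shows the Cisco story together with the Lucent story, although the user never asked for Lucent.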

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY is associated with:
Related Stock News
Industry News: COMPANIES in Same or Related INDUSTRY
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough:
• we need a "logic layer" on top of RDF
• some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
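To make "rules which allow us to draw inference" concrete, here is a minimal forward-chaining sketch over subject/predicate/object triples. It applies only two RDF Schema style rules (subclass transitivity and type inheritance) and is not a description logic reasoner; the `infer` function and the sample facts ("bessie", "organism") are hypothetical.

```python
def infer(triples):
    """Apply two rules until no new facts appear:
       (A subClassOf B), (B subClassOf C) => (A subClassOf C)
       (x type A), (A subClassOf B)       => (x type B)
    """
    facts = set(triples)
    while True:
        derived = set()
        for s, p, o in facts:
            if p != "subClassOf":
                continue
            for s2, p2, o2 in facts:
                if p2 == "subClassOf" and s2 == o:
                    derived.add((s, "subClassOf", o2))
                elif p2 == "type" and o2 == s:
                    derived.add((s2, "type", o))
        if derived <= facts:          # fixed point reached
            return facts
        facts |= derived


stated = {
    ("herbivore", "subClassOf", "animal"),
    ("animal", "subClassOf", "organism"),
    ("bessie", "type", "herbivore"),
}
closure = infer(stated)
```

None of the stated triples says that "bessie" is an organism, yet the closure contains that fact; this is the kind of conclusion a logic layer on top of plain RDF data is meant to provide.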

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O

Schema Layer: Definition of Vocabulary – RDF Schema

Data Layer: Simple data model and syntax for metadata – RDF

OIL – as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
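The rule above can be written as an executable predicate. The sketch below makes several assumptions that the slide does not state: events are plain dictionaries, `dateDifference` is measured in days and must be non-negative (the earthquake follows the test), and `distance` is the great-circle distance in miles computed with the haversine formula. It is an illustration, not the InfoQuilt implementation.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, with the thresholds from the slide."""
    days = (quake["eventDate"] - test["eventDate"]).days
    near = distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles
    return 0 <= days < max_days and near


def find_pairs(tests, quakes):
    """All (test, earthquake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if could_have_caused(t, q)]


# Made-up sample events, for illustration only.
sample_test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
sample_quake = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
```

With these sample events `could_have_caused(sample_test, sample_quake)` holds, while an earthquake dated before the test, or 40 days after it, does not qualify.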

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
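The grouped query can be sketched as follows. The bucket edges come from the slide; everything else is an assumption: earthquakes are `(year, magnitude)` tuples, each range includes its lower edge and excludes its upper edge (so a magnitude of exactly 9 lands in the last bucket), and the average is the total count divided by the number of years in the window.

```python
from bisect import bisect_right
from collections import Counter

BUCKET_EDGES = [5.8, 6.0, 7.0, 8.0, 9.0]             # lower edge of each range
BUCKET_LABELS = ["5.8-6", "6-7", "7-8", "8-9", ">9"]


def bucket(magnitude):
    """Label of the magnitude range, or None below 5.8."""
    index = bisect_right(BUCKET_EDGES, magnitude) - 1
    return BUCKET_LABELS[index] if index >= 0 else None


def average_per_year(quakes, start_year, end_year):
    """Average number of earthquakes per year in each magnitude range."""
    totals = Counter()
    for year, magnitude in quakes:
        label = bucket(magnitude)
        if label is not None and start_year <= year <= end_year:
            totals[label] += 1
    n_years = end_year - start_year + 1
    return {label: totals[label] / n_years for label in BUCKET_LABELS}


# Made-up sample data, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.5), (1899, 8.0), (1901, 5.0)]
averages = average_per_year(sample, 1900, 1901)
```

Running the same function over two windows (for example before and after 1945) gives the comparison behind the observation above.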

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML – Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web: Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF – one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of Semantic Web
  • OIL – as RDF Extension
  • DAML and OIL – Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

Taalee's Categorization & Automatic Metadata Creation

[Diagram; recoverable labels:]
Article Feed; Classify/Place in a taxonomy; Knowledge-base & Statistical/AI Techniques; Taalee Training Set; Customer Training Set; Metadata Catalog; Content Manager; Routing/Distribution; Precise syndication/filtering; Map to another taxonomy; Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite

Article 4715 Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News

Standard metadata; Semantic metadata

Classification of Article 4715
FTE, ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization
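A very reduced picture of tagging such a transcript is a dictionary lookup for entities plus a term-overlap test for the category. The sketch below is only that: the `ENTITY_DICTIONARY`, the `CATEGORY_TERMS` table, the two-term threshold and the `tag_transcript` function are hypothetical, and the knowledge-base and statistical techniques the slides refer to are not reproduced here.

```python
# Hypothetical reference data: phrase -> entity type.
ENTITY_DICTIONARY = {
    "HILLARY CLINTON": "PERSON",
    "JOHN ASHCROFT": "PERSON",
    "MEL CARNAHAN": "PERSON",
    "MISSOURI": "LOCATION",
    "NEW YORK": "LOCATION",
    "WASHINGTON STATE": "LOCATION",
}

# Hypothetical category vocabulary.
CATEGORY_TERMS = {
    "Politics": {"SENATE", "SENATOR", "REPUBLICAN", "REPUBLICANS", "DEMOCRATS"},
}


def tag_transcript(transcript, min_terms=2):
    """Return categories and typed entities found in a transcript."""
    text = transcript.upper()
    entities = {}
    for phrase, entity_type in ENTITY_DICTIONARY.items():
        if phrase in text:
            entities.setdefault(entity_type, []).append(phrase.title())
    for names in entities.values():
        names.sort()
    words = set(text.replace(",", " ").replace(".", " ").split())
    categories = sorted(category
                        for category, terms in CATEGORY_TERMS.items()
                        if len(words & terms) >= min_terms)
    return {"categories": categories, "entities": entities}


snippet = ("IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT "
           "CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN")
tags = tag_transcript(snippet)
```

For the snippet, the result lists the category "Politics", the people John Ashcroft and Mel Carnahan, and the location Missouri.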

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM…)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods/Cluster Analysis

Learning/AI and Collab. Filtering

Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain; Word or Phrase

Ontologies/Domain Models

Knowledge Base: by Entities and Relationships

deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Diagram; recoverable labels, grouped by ontology:]
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
Four "causes" relationships link concepts in the diagram.

HP 61

Large Vocabularies, Taxonomies/Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

traditional queries based on keywords
attribute-based queries
content-based queries

HP 64

Oingo.com

Oingo Ontology – (ODP based?) the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content: adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDISUGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Virage: search on "football touchdown"

Metadata from Typical Cataloging of Football Assets

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee Metadata on Football Assets (Rich Media Reference Page)

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
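The difference between the two kinds of search can be sketched with the assets shown on this slide. The record layout and the `keyword_search` and `attribute_search` functions are assumptions for illustration; they do not describe how Virage or Taalee implement search.

```python
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31, Pit 24",
     "description": "Qadry Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional",
                  "Teams": ["Ravens", "Steelers"],
                  "Players": ["Qadry Ismail", "Tony Banks"],
                  "Event": "Touchdown"}},
]


def keyword_search(assets, *keywords):
    """Assets whose title or description contains any of the keywords."""
    def text(asset):
        return (asset["title"] + " " + asset["description"]).lower()
    return [a for a in assets if any(k.lower() in text(a) for k in keywords)]


def attribute_search(assets, **criteria):
    """Assets whose structured metadata satisfies every field=value criterion."""
    def holds(asset, field, wanted):
        value = asset["metadata"].get(field)
        return wanted in value if isinstance(value, list) else value == wanted
    return [a for a in assets
            if all(holds(a, field, wanted) for field, wanted in criteria.items())]


by_keyword = keyword_search(ASSETS, "football", "touchdown")
by_attribute = attribute_search(ASSETS, Event="Touchdown", Teams="Ravens")
```

The keyword query also returns the interview that merely mentions a touchdown, while the attribute query returns only the asset that is catalogued as a touchdown event involving the Ravens.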

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
   Microsoft suffers serious hack attack
   Cisco Systems Inc
   Analyst Safa Rashtchy on Yahoo
   PeopleSoft Inc
   AT&T Corp
   more…

2. My Football Fantasy Team
   Gators' Spurrier ready for big game
   Tech's Vick looks to become complete QB
   Bucs excited about Hamilton
   Jasper Sanks rumbles into the end zone…
   Edwards explains reasons for leaving BYU
   more…

3. Julia Roberts Collection
   Movie Trailer: Notting Hill
   Trailer - Runaway Bride
   Patrick
   Movie Trailer: Stepmom
   Conspiracy Theory
   more…

4. Pink Floyd Collection
   Set the Controls for the Heart of the Sun…
   Wish You Were Here
   Round And Around
   Keep Talking
   The Post War Dream
   more…

HOT Topics

1. Election 2000
   Video: Explaining the electoral map
   Race for White House hots up
   Seniors Give Gore Florida Edge
   more…

2. Middle East Peace Conflict
   More die as Israel steps up security
   Israel braces for suicide bombs
   Pentagon probes Cole's security
   more…

3. Napster Controversy
   The Brain Behind Napster
   Napster Lawsuit
   Creative Nomad II
   more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web: Extreme Personalization

[Diagram; recoverable labels:]
Realtime Feeds; Web sites and Pages; Content Databases
Time-Shifted Content Aggregator
Semantic Engine(TM)
Structured Hi-Quality Semantic Metabase
Interests, Preferences
Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

[Diagram; recoverable labels:]
Content Provider (DBS, DISH, Wink, AOL-TV)
Content "Programs"
Meta-Data Tagged Content
Semantic Engine(TM)
Structured Hi-Quality Semantic Metabase
Immediate Interests; Preferences
Personalized Content Capsules
Redirects and Programming

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with interactivity features, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information – including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Diagram; recoverable labels:]
Sources: Network Content; Affiliate Feeds; Public Sources
Collection Processing: Categorize, Catalog, Integrate
Rich Data Metabase
Production Support (Sony)
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Enhanced Digital Cable; Video; MPEG-2/4/7; MPEG Encoder; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Object Content Information (OCI); Enhanced XML Description; Taalee Semantic Engine; "Cisco Systems" Node; Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+

Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+

Semantic Associations: content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
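The enrichment idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Taalee's implementation: the `KNOWLEDGE_BASE` dictionary, the relation names `plays_for` and `plays_sport`, and the functions `enrich` and `matches` are hypothetical names introduced for this example.

```python
# Hypothetical mini knowledge base: entity -> {relation: related entity}.
# The facts mirror the Roger Clemens example; the data structure is an assumption.
KNOWLEDGE_BASE = {
    "Roger Clemens": {
        "type": "PERSON",
        "plays_sport": "Baseball",
        "plays_for": "New York Yankees",
    },
}


def enrich(extracted):
    """Return a copy of the extracted metadata plus value-added fields."""
    enriched = {key: list(values) for key, values in extracted.items()}
    for player in extracted.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for relation, field in (("plays_for", "Teams"), ("plays_sport", "Sport")):
            if relation in facts:
                values = enriched.setdefault(field, [])
                if facts[relation] not in values:
                    values.append(facts[relation])
    return enriched


def matches(metadata, query):
    """True if the query occurs (case-insensitively) in any metadata value."""
    q = query.lower()
    return any(q in value.lower() for values in metadata.values() for value in values)


# Metadata extracted from the story text itself: the team is never mentioned.
story = {"Players": ["Roger Clemens"]}
enriched_story = enrich(story)
```

With this sketch, a search for "Yankees" misses the original `story` metadata but finds `enriched_story`, which is the behavior the slide describes.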

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: name of sport

Legend: "X -> Y" means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly, fully described in the many ways the users chose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
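A minimal Python sketch of how such associations could widen a search result follows. It is not the Semantic Content Model itself: the `ENTITIES` table, the `competes_with` and `participates_in` keys, the placeholder story titles, and `search_with_associations` are all hypothetical and serve only to illustrate the Commerce One / Ariba example.

```python
# Hypothetical entity records; relation names follow the slide's wording.
ENTITIES = {
    "Commerce One": {
        "type": "COMPANY",
        "participates_in": "Corporate, Professional & Financial Software",
        "competes_with": ["Ariba"],
    },
    "Ariba": {
        "type": "COMPANY",
        "participates_in": "Corporate, Professional & Financial Software",
        "competes_with": ["Commerce One"],
    },
}

# Hypothetical catalog: each story is tagged with the companies it mentions.
STORIES = [
    {"title": "Story A", "companies": ["Commerce One"]},
    {"title": "Story B", "companies": ["Ariba"]},
    {"title": "Story C", "companies": ["Some Other Co"]},
]


def search_with_associations(company):
    """Return (direct hits, hits on the company's competitors)."""
    competitors = ENTITIES.get(company, {}).get("competes_with", [])
    direct = [s["title"] for s in STORIES if company in s["companies"]]
    related = [
        s["title"]
        for s in STORIES
        if any(c in s["companies"] for c in competitors)
    ]
    return direct, related


direct, related = search_with_associations("Commerce One")
```

A keyword index alone would return only `direct`; the `related` list is what the stored COMPETES WITH association adds.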

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1, Research, Internal Source 2, External feeds/Web (e.g., Reuters)

Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Output: Story on Cisco, Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

SEC, EPA

Regulations: Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search/Personalization etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling obviously not enough:
• we need a "logic layer" on top of RDF
• some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
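To make the role of the logical layer concrete, the herbivore definition above can be mimicked in plain Python. This is a rough sketch under simplifying assumptions, not an OIL or RDF reasoner: `SUBCLASS_OF`, `NOT_CLASSES`, `superclasses`, `inferred_types` and `is_consistent` are hypothetical names, and only the two statements from the example (herbivore is a subclass of animal, and of NOT carnivore) are encoded.

```python
# Simplified encoding of the example class definition (assumed structure).
SUBCLASS_OF = {"herbivore": {"animal"}}      # rdfs:subClassOf
NOT_CLASSES = {"herbivore": {"carnivore"}}   # operands of oil:NOT


def superclasses(cls):
    """All classes reachable from cls by following subClassOf links."""
    seen, stack = set(), [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(asserted):
    """Asserted classes plus everything implied by the subclass hierarchy."""
    result = set(asserted)
    for cls in asserted:
        result |= superclasses(cls)
    return result


def is_consistent(asserted):
    """False if an individual belongs to a class that one of its classes excludes."""
    types = inferred_types(asserted)
    return not any(
        excluded in types
        for cls in types
        for excluded in NOT_CLASSES.get(cls, ())
    )
```

The schema layer alone supports the subclass inference in `inferred_types`; the negation check in `is_consistent` is the kind of reasoning that needs the extra logical layer.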

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
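The rule above can be written out as a small Python predicate. This is a sketch under stated assumptions rather than the InfoQuilt implementation: the haversine formula is swapped in for the unspecified `distance` function, the unit is assumed to be miles, the requirement that the earthquake not precede the test is read from the prose, and the dictionary-based records and the name `may_have_caused` are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles is an assumed unit


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 so rounding error cannot push asin out of its domain.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the rule: quake within max_days after the test and within max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if days < 0 or days >= max_days:
        return False
    return distance_miles(
        test["latitude"], test["longitude"],
        quake["latitude"], quake["longitude"],
    ) < max_miles


# Hypothetical records, for illustration only.
nuclear_test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
earthquake = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
candidate = may_have_caused(nuclear_test, earthquake)
```

Both thresholds are keyword parameters so that the 30-day and 10000 limits from the slide can be varied when exploring the data.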

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
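The counting step behind these queries can be sketched as follows. The sample records are invented purely for illustration, and the names `BINS`, `bin_label` and `counts_per_year_and_bin` are hypothetical; how the slide treats values that fall exactly on a range boundary is not stated, so the sketch assumes lower bounds are inclusive and upper bounds exclusive.

```python
from collections import Counter

# Magnitude ranges from the slide; None marks the open-ended top range.
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, None)]


def bin_label(magnitude):
    """Return the label of the range containing magnitude, or None if below 5.8."""
    for low, high in BINS:
        if magnitude >= low and (high is None or magnitude < high):
            return f"{low}+" if high is None else f"{low}-{high}"
    return None


def counts_per_year_and_bin(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year onward."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


# Invented (year, magnitude) records, not real seismic data.
SAMPLE = [(1899, 7.1), (1950, 5.9), (1950, 6.5), (1950, 6.9), (1951, 9.1), (1951, 5.0)]
counts = counts_per_year_and_bin(SAMPLE)
```

Averaging each range's yearly counts over a period before and after 1945 would then give the comparison the slide draws.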

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries amp Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 49

Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)

ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN IUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN

Video Segment with Associated Text

Segment Description

Semantic Metadata

Auto Categorization

HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
• (Statistical) (Information Retrieval)
• Statistical Learning/AI (Bayesian, Neural Networks, HMM...)
• Logic Based (Description Logic)
• Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

Statistical methods, Cluster Analysis

Learning/AI and Collab. Filtering

Reference data, Concept-terms, Dictionary/Thesaurus: by topic/industry/subject/domain (Word or Phrase)

Ontologies/Domain Models

KnowledgeBase: by Entities and Relationships

(deeper understanding)

HP 56

Open Directory Project (ODP) ClassificationTaxonomy amp Directory

HP 57

Ontology

• Standardize meaning, description, representation of involved attributes
• Capture the semantics involved via domain characteristics
• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
• Attributes
• Domain Rules
• Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: diagram of interrelated ontologies. Node labels: LANDUSE, COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE; WASTE DISPOSAL, RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE, shredding, magnetic separation, screening, washing; NATURAL DISASTER, EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI. Four links are labeled "causes".]

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.


Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

• traditional queries based on keywords
• attribute based queries
• content-based queries

HP 64

Oingo.com

• Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for
• Closely-related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage: Keyword, Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic metadata

Virage: search on football touchdown

Metadata from Typical Cataloging of Football Assets:

Jimmy Smith Interview, Part Seven
Jimmy Smith explains his philosophy on showboating
URL: http://cbs.sportsline

Brian Griese Interview, Part Four
Brian Griese talks about the first touchdown he ever threw
URL: http://cbs.sportsline

Taalee Metadata on Football Assets (Rich Media Reference Page):

Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
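The contrast on this slide can be illustrated with a short Python sketch. It is an assumption-laden toy, not Virage's or Taalee's search: the `ASSETS` records are abbreviated, partly invented stand-ins for the two catalog entries, and `keyword_search` and `attribute_search` are hypothetical helper names.

```python
# Abbreviated sample assets (hypothetical): one with free text only,
# one that also carries structured metadata fields.
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},
    },
    {
        "title": "Baltimore 31, Pit 24",
        "text": "Ismail and Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, *words):
    """Return titles whose title or text contains every word (case-insensitive)."""
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["text"]).lower()
        if all(word.lower() in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Return titles whose metadata satisfies every field=value criterion."""
    hits = []
    for asset in assets:
        matched = True
        for field, wanted in criteria.items():
            value = asset["metadata"].get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                matched = False
                break
        if matched:
            hits.append(asset["title"])
    return hits
```

In this toy data, `keyword_search(ASSETS, "touchdown")` returns both assets, including the interview, while `attribute_search(ASSETS, Event="Touchdown", Teams="Ravens")` returns only the game highlight.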

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes

Uniform Metadata for Content from Multiple Sources; can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content

Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries amp Hot Topics

PERSONALIZATION

3 Julia Roberts Collection

Movie Trailer Notting Hill

Trailer - Runaway Bride

Patrick

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sun...

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1 My Stock Portfolio

Microsoft suffers serious hack attack

Cisco Systems Inc

Analyst Safa Rashtchy on Yahoo

PeopleSoft Inc

AT&T Corp

2 My Football Fantasy Team

Gators Spurrier ready for big game

Techs Vick looks to become complete QB

Bucs excited about Hamilton

Jasper Sanks rumbles into the end zone...

Edwards explains reasons for leaving BYU

more...

more...

more...

more...

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Coles security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II

more...

HOT Topics

more...

more...

HP 80

Metadata Targeting

SemanticInteractive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider

(DBS, DISH, Wink, AOL-TV)

Semantic Engine(TM)

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in

This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection

Processing

Network Content

Affiliate Feeds

Public Sources

Rich Data

Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

Diagram labels:
• MPEG-2/4/7
• MPEG Encoder
• Create Scene Description Tree
• Scene Description Tree
• Node = AVO Object
• Node: "Cisco Systems"
• Taalee Semantic Engine
• Enhanced XML Description: "Cisco Systems"
• Object Content Information (OCI)
• Metadata-rich Value-added Node
• Enhanced Digital Cable
• MPEG Decoder
• Retrieve Scene Description Track
• Video
• GREAT USER EXPERIENCE

• License metadata decoder and semantic applications to device makers
• Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Example metadata:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content

• Extractor Agents: content which does contain the words the user asked for
+
• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+
• Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found.
• The burden is on the user to think of and ask for the "right" keyword.

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
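The enrichment idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `KNOWLEDGE_BASE` dictionary, the field names, and the functions `extract_entities`, `enrich`, and `matches` are all hypothetical and are not Taalee's actual (patented) extraction technology.

```python
# Hypothetical knowledge base: entity name -> semantic associations.
# The facts about Roger Clemens come from the slide; the structure is assumed.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"sport": "Football", "team": "Atlanta Falcons"},
}


def extract_entities(text):
    """Syntactic step: return known entities that appear verbatim in the text."""
    return [name for name in KNOWLEDGE_BASE if name in text]


def enrich(text):
    """Build metadata: extracted players plus value-added team and sport."""
    metadata = {"players": [], "teams": [], "sports": []}
    for name in extract_entities(text):
        facts = KNOWLEDGE_BASE[name]
        metadata["players"].append(name)
        if facts["team"] not in metadata["teams"]:
            metadata["teams"].append(facts["team"])
        if facts["sport"] not in metadata["sports"]:
            metadata["sports"].append(facts["sport"])
    return metadata


def matches(query, text, metadata):
    """A story matches if the query occurs in the text or in any metadata value."""
    q = query.lower()
    if q in text.lower():
        return True
    return any(q in value.lower() for values in metadata.values() for value in values)


story = "Roger Clemens struck out ten batters last night."
meta = enrich(story)
```

With the value-added metadata attached, `matches("Yankees", story, meta)` succeeds even though the word never occurs in the story text, while the same query against empty metadata fails.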

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson.
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR (Rich Media Reference) page.
• Here is what you see:
  Produced by: NFL.com
  Posted Date: 9/20/2000
  League: NFL
  Teams: Atlanta Falcons
  Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL").
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run".
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content.
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team.
• Note that other search engines and directories will not be able to do this.

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield.
• Click on the first result (titled "I want out") & view the metadata on the following RMR (Rich Media Reference) page.
• Here is what you see:
  Produced by: ESPN
  Posted Date: 3/03/2001
  League: National League
  Teams: Los Angeles Dodgers
  Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL").
• View the source HTML page that has the original story and locate this story with the heading "I want out".
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content.
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team.
• Note that other search engines and directories will not be able to do this.

HP 98

Example 1 - Snapshots ("Jamal Anderson")

• Search for 'Jamal Anderson' in 'Football'.
• Click on first result for Jamal Anderson.
• View metadata. Note that Team name and League name are also included in the metadata.
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 - Snapshots ("Gary Sheffield")

• Search for 'Gary Sheffield' in 'Baseball'.
• Click on first result for Gary Sheffield.
• View metadata. Note that Team name and League name are also included in the metadata.
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

• Posted Date: date of asset posting. Extracted automatically.
• Producer Name: name of content provider that produced the asset.
• Player Names: names of players mentioned explicitly in the asset. Extracted automatically.
• Team Name: name of team for which player plays. Not mentioned explicitly in asset; value-added using Taalee's semantic relationships.
• League Name: name of league to which the player's team belongs. Not mentioned explicitly in asset; value-added by Taalee's processing based on semantic associations.
• Sport: name of sport.

Legend: "X → Y" means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly, fully described in the many ways the users chose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context or relationships of keywords.

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

• Search for company 'Commerce One'.
• Links to news on companies that compete against Commerce One.
• Links to news on companies Commerce One competes against. (To view news on Ariba, click on the link for Ariba.)
• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically.

HP 104

Metadata-centric Content Management Architecture

Diagram components:
• Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)
• Extractor Agent 1, Extractor Agent 2, Extractor Agent 3
• Semantic Engine, World Model, Taalee Metabase
• Semantic Application (ASP/Enterprise hosted)
• XCM-compliant metadata, XML or other format
• Third-party Content Mgmt and Syndication
• Story on Cisco, Story on Lucent

Flow:
1. Cisco story from PW Source 1 passed on to add semantic associations.
2. Consults KnowledgeBase for Cisco's competition.
3. Returns result: Lucent is a competitor of Cisco.
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard.
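The four-step flow on this slide (story arrives, knowledge base is consulted for competitors, a related story is picked from the feeds, both go to the dashboard) can be sketched as follows. The `COMPETES_WITH` table, the story dictionaries, and `related_stories` are hypothetical stand-ins for the knowledge base and metabase; only the two competitor facts come from the slides.

```python
# Hypothetical knowledge base of "competes with" associations.
# Cisco/Lucent and Commerce One/Ariba are the pairs named in the slides.
COMPETES_WITH = {
    "Cisco": ["Lucent"],
    "Commerce One": ["Ariba"],
}


def related_stories(story, feed):
    """Return feed stories about competitors of the story's company."""
    competitors = COMPETES_WITH.get(story["company"], [])
    return [item for item in feed if item["company"] in competitors]


# Step 1: a Cisco story arrives from an internal source.
cisco_story = {"company": "Cisco", "title": "Story on Cisco"}

# External feed content that has already been tagged with a company.
external_feed = [
    {"company": "Lucent", "title": "Story on Lucent"},
    {"company": "IBM", "title": "Story on IBM"},
]

# Steps 2-4: consult the knowledge base, pick related stories, publish both.
dashboard = [cisco_story] + related_stories(cisco_story, external_feed)
```

Keeping the associations in a separate table, rather than in the stories themselves, is what lets a story that never mentions Lucent pull in Lucent news.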

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY
• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

• Automatic collation of semantically related digital media information from multiple sources
• Research inferred automatically
• Semantically related news not specifically asked for
• Semantic search, personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data.
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
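To make the "data as triples plus rules" point concrete, here is a minimal sketch of forward-chaining inference over subject/predicate/object triples in plain Python. It is not RDF, OIL, or a description logic reasoner; the predicate names `subClassOf` and `type`, the two rules, and the sample individuals (`giraffe`, `geoffrey`) are assumptions chosen for illustration.

```python
def infer(triples):
    """Apply two simple rules repeatedly until no new triples appear."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (a, p, b) in facts:
            for (c, q, d) in facts:
                if b != c:
                    continue
                # Rule 1: subClassOf is transitive.
                if p == "subClassOf" and q == "subClassOf":
                    new.add((a, "subClassOf", d))
                # Rule 2: an instance of a class is an instance of its superclasses.
                if p == "type" and q == "subClassOf":
                    new.add((a, "type", d))
        if not new <= facts:
            facts |= new
            changed = True
    return facts


# Sample data; the class and individual names are hypothetical.
data = {
    ("herbivore", "subClassOf", "animal"),
    ("giraffe", "subClassOf", "herbivore"),
    ("geoffrey", "type", "giraffe"),
}
closure = infer(data)
```

After inference, `closure` contains `("geoffrey", "type", "animal")`, a statement that was never asserted in the data but follows from the rules.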

HP 109

Semantic Web - next step in Web evolution

• "A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
• "A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
• "A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

• Logical Layer: formal semantics and reasoning support - OIL, DAML-O
• Schema Layer: definition of vocabulary - RDF Schema
• Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
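The rule above can be written out as an executable predicate. The sketch below is an assumption-laden illustration, not the InfoQuilt implementation: it takes the thresholds to be days and miles, uses the haversine great-circle formula for `distance`, requires the earthquake to fall on or after the test date, and uses made-up coordinates and dates.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    a = min(1.0, a)  # guard against rounding just above 1.0
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake["event_date"] - test["event_date"]).days
    if not (0 <= days < max_days):
        return False
    d = distance_miles(test["latitude"], test["longitude"],
                       quake["latitude"], quake["longitude"])
    return d < max_miles


# Hypothetical records; field names mirror the rule's attributes.
nuclear_test = {"event_date": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
near_quake = {"event_date": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
early_quake = {"event_date": date(1969, 12, 25), "latitude": 36.0, "longitude": -117.0}
late_quake = {"event_date": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}
far_quake = {"event_date": date(1970, 1, 10), "latitude": -30.0, "longitude": 60.0}
```

Note that the predicate only flags candidate pairs. Whether such pairs indicate causation is exactly the question the following slides examine.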

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
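The grouping query on this slide can be sketched as a small aggregation. The bin boundaries are taken from the slide, but treating each range as half-open (lower bound included, upper bound excluded) is an assumption, as are the function names and the sample records.

```python
from collections import Counter

# (low, high, label); ranges from the slide, half-open [low, high) by assumption.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the range containing the magnitude, or None."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average number of earthquakes per year in each magnitude range."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for _, _, label in BINS}


# Hypothetical (year, magnitude) records, not real catalog data.
quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.4), (1901, 5.0)]
averages = average_per_year(quakes, 1900, 1901)
```

Running the same aggregation over two periods (for example before and after 1945) and comparing the per-range averages is one way to reproduce the comparison the slide describes.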

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lies within a radius of 10000 miles of the epicenter of the earthquake.

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NEWSML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• SEMANTICWEB: www.semanticweb.org
• VOICEXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw Hill, ISBN 0-07-057735-8, 1998.


HP 50

Automatic Categorization & Metadata Tagging (Web page)

Video with Editorialized Text on the Web

Auto Categorization

Semantic Metadata

HP 51

Automatic Categorization & Metadata Tagging (Feed)

Text from Bloomberg

Auto Categorization

Semantic Metadata

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities
• Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships
• Taxonomy (Categories), Ontology
• Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
• Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
• (Statistical) (Information Retrieval)
• Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
• Logic Based (Description Logic)
• Natural Language: Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

(in order of deeper understanding)
• Word or Phrase: statistical methods, cluster analysis; learning/AI and collaborative filtering
• Reference data: concept-terms, dictionary, thesaurus; by topic/industry/subject/domain
• Ontologies, Domain Models, KnowledgeBase: by entities and relationships

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

• Standardize meaning, description, representation of involved attributes
• Capture the semantics involved via domain characteristics
• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
• Attributes
• Domain Rules
• Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

Diagram labels:
• LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
• LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
• WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
• NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked by "causes" relationships

HP 61

Large Vocabularies, Taxonomies, Ontologies

• WordNet
• The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

• traditional queries based on keywords
• attribute based queries
• content-based queries

HP 64

Oingo.com

• Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for
• Closely-related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort.

Intelligent content can be more effectively managed, packaged and distributed.

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" Content can only be "found" if the user enters a keyword that exists within it.

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search.
• Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Metadata from Typical Cataloging of Football Assets

Virage search on "football touchdown":
• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee Metadata on Football Assets

Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
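The contrast on this slide can be sketched by running both kinds of query over the same small catalog. This is a hedged illustration only: the record layout and the functions `keyword_search` and `attribute_search` are assumptions, and the sample values are copied from the slide rather than taken from any real Virage or Taalee interface.

```python
# Hypothetical catalog; titles, text and metadata values come from the slide.
assets = [
    {"title": "Jimmy Smith Interview Part Seven",
     "text": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "text": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "text": "Tony Banks throws a third long touchdown, a 76-yarder",
     "metadata": {"League": "Professional", "Teams": ["Ravens", "Steelers"],
                  "Players": ["Tony Banks"], "Event": "Touchdown"}},
]


def keyword_search(assets, query):
    """Titles of assets whose title or text contains every query word."""
    words = query.lower().split()
    return [a["title"] for a in assets
            if all(w in (a["title"] + " " + a["text"]).lower() for w in words)]


def attribute_search(assets, **criteria):
    """Titles of assets whose metadata matches every field=value criterion."""
    hits = []
    for a in assets:
        ok = True
        for field, wanted in criteria.items():
            value = a["metadata"].get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                ok = False
                break
        if ok:
            hits.append(a["title"])
    return hits


by_keyword = keyword_search(assets, "touchdown")
by_attribute = attribute_search(assets, Event="Touchdown", Teams="Ravens")
```

The keyword query returns an interview that merely mentions a touchdown alongside the actual play, while the attribute query returns only the asset whose Event and Teams fields match.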

HP 72

Taalee's Semantic Search

• Highly customizable, precise and freshest A/V search
• Context and Domain Specific Attributes
• Uniform Metadata for Content from Multiple Sources; can be sorted by any field
• Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

• Semantic Applications for highly relevant and fresh content
• Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc
• AT&T Corp

2 My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone...
• Edwards explains reasons for leaving BYU

3 Julia Roberts Collection
• Movie Trailer Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer Stepmom
• Conspiracy Theory

4 Pink Floyd Collection
• Set the Controls for the Heart of the Sun...
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream

HOT Topics

1 Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge

2 Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security

3 Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II

HP 80

Metadata Targeting

SemanticInteractive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Diagram labels:
• Realtime Feeds
• Web sites and Pages
• Content Databases
• Time-Shifted Content Aggregator
• Interests, Preferences
• Semantic Engine(TM)
• Structured Hi-Quality Semantic Metabase
• Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.

HP 87

iTV: Taalee's Extreme Personalization

Diagram labels:
• Content Provider (DBS, DISH, Wink, AOL-TV)
• Content "Programs"
• Meta-Data Tagged Content
• Semantic Engine(TM)
• Structured Hi-Quality Semantic Metabase
• Immediate Interests, Preferences
• Personalized Content Capsules
• Redirects and Programming

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

• This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in.
• This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
• Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.
• Conference Call itself can have embedded metadata to support personalization and interactivity.
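A minimal sketch of the personalization step described above: segments carry metadata, and the application filters on it. The segment fields (`ticker`, `type`, `is_new`) and both function names are hypothetical; the tickers CSCO, NT and IBM are the ones shown in the wireless example slides.

```python
def personalize(segments, portfolio):
    """Keep only segments whose ticker metadata is in the user's portfolio."""
    return [s for s in segments if s["ticker"] in portfolio]


def new_conference_calls(segments, portfolio):
    """Portfolio tickers that have a new conference-call video."""
    return sorted({s["ticker"] for s in personalize(segments, portfolio)
                   if s["type"] == "Conf Call" and s["is_new"]})


# Hypothetical tagged segments.
segments = [
    {"ticker": "CSCO", "type": "Conf Call", "is_new": True},
    {"ticker": "CSCO", "type": "Earnings", "is_new": False},
    {"ticker": "IBM", "type": "Analyst Call", "is_new": True},
    {"ticker": "MSFT", "type": "Conf Call", "is_new": True},
]
portfolio = {"CSCO", "NT", "IBM"}

visible = personalize(segments, portfolio)
alerts = new_conference_calls(segments, portfolio)
```

Here the MSFT segment is dropped because it is outside the portfolio, and the only alert raised is for the new CSCO conference call.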

HP 89

Metadata in Enterprise Apps

Diagram labels:
• Collection: Network Content, Affiliate Feeds, Public Sources
• Processing: Categorize, Catalog, Integrate
• Rich Data Metabase
• Production Support (Sony): Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)

Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)

-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush

Video

bull Value-add for production broadcast amp syndication

bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers

bull Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node ("Cisco Systems"); GREAT USER EXPERIENCE]

• License metadata decoder and semantic applications to device makers
• Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Sample node metadata:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation

• Extractor Agents: content which does contain the words the user asked for
+
• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+
• Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (compare: COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
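The idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Taalee's implementation: the `KNOWLEDGE_BASE` dictionary, the relation names (`plays_for`, `plays_sport`, `member_of`) and the `enrich` and `matches` helpers are all hypothetical. Only the facts about Roger Clemens and Jamal Anderson come from the slides.

```python
# Hypothetical knowledge base of semantic associations: (entity, relation) -> entity.
# The facts mirror the slides; the structure and relation names are assumptions.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Jamal Anderson", "plays_for"): "Atlanta Falcons",
    ("Atlanta Falcons", "member_of"): "NFL",
}


def enrich(metadata):
    """Return a copy of the metadata with Sport, Teams and League added where known."""
    enriched = {key: set(values) for key, values in metadata.items()}
    for player in metadata.get("Players", ()):
        sport = KNOWLEDGE_BASE.get((player, "plays_sport"))
        if sport:
            enriched.setdefault("Sport", set()).add(sport)
        team = KNOWLEDGE_BASE.get((player, "plays_for"))
        if team:
            enriched.setdefault("Teams", set()).add(team)
            league = KNOWLEDGE_BASE.get((team, "member_of"))
            if league:
                enriched.setdefault("League", set()).add(league)
    return enriched


def matches(metadata, query):
    """Case-insensitive substring match of the query against any metadata value."""
    needle = query.lower()
    return any(
        needle in value.lower()
        for values in metadata.values()
        for value in values
    )


story = {"Players": {"Roger Clemens"}}      # metadata extracted from the text itself
print(matches(story, "Yankees"))            # False: the keyword is not in the content
print(matches(enrich(story), "Yankees"))    # True: the team was added as value-added metadata
```

The original metadata is copied rather than modified, so extracted and value-added fields can still be told apart if a caller keeps both versions.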

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

1. Search for 'Jamal Anderson' in 'Football'
2. Click on first result for Jamal Anderson
3. View metadata. Note that Team name and League name are also included in the metadata
4. View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

1. Search for 'Gary Sheffield' in 'Baseball'
2. Click on first result for Gary Sheffield
3. View metadata. Note that Team name and League name are also included in the metadata
4. View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

[Diagram: metadata fields surrounding a Rich Media Sports Asset]

• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting - extracted automatically
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Sport: name of sport
• Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
• League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact with it.

Legend: an arrow from X to Y means Taalee uses X to add Y as value-added metadata to the asset.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

• Search for company 'Commerce One'
• Links to news on companies that compete against Commerce One
• Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application; XCM-compliant metadata, XML or other format; ASP/Enterprise hosted; Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
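The four numbered steps above can be sketched as a small Python example. This is a sketch under assumptions, not the Semantic Engine itself: the `COMPETITORS` table, the story records and the `related_stories` function are hypothetical stand-ins for the knowledge base, the metabase and the semantic application.

```python
# Hypothetical knowledge base fragment: company -> set of competitors.
# The two pairs come from the slides; the data structure is an assumption.
COMPETITORS = {
    "Cisco": {"Lucent"},
    "Commerce One": {"Ariba"},
}

# Hypothetical metabase records produced by the extractor agents.
STORIES = [
    {"title": "Story on Cisco", "source": "Internal Source 1", "companies": {"Cisco"}},
    {"title": "Story on Lucent", "source": "External feed", "companies": {"Lucent"}},
    {"title": "Story on IBM", "source": "External feed", "companies": {"IBM"}},
]


def related_stories(story, candidates):
    """Return candidate stories about competitors of the companies in `story`."""
    # Step 2: consult the knowledge base for each company's competition.
    rivals = set()
    for company in story["companies"]:
        rivals |= COMPETITORS.get(company, set())   # step 3: e.g. {"Lucent"}
    # Step 4: pick stories that mention any rival as "semantically related".
    return [
        other for other in candidates
        if other is not story and other["companies"] & rivals
    ]


cisco_story = STORIES[0]                            # step 1: the incoming Cisco story
for item in related_stories(cisco_story, STORIES):
    print(item["title"], "from", item["source"])    # Story on Lucent from External feed
```

The lookup is driven entirely by metadata (the `companies` field), so the related story is found even though its text never needs to mention Cisco.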

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

[Diagram: associations around COMPANY]

• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

• Automatic collation of semantically related digital media information from multiple sources
• Research inferred automatically
• Semantically related news not specifically asked for
• Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

• Logical Layer: formal semantics and reasoning support - OIL, DAML-O
• Schema Layer: definition of vocabulary - RDF Schema
• Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes: is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

    NuclearTest Causes Earthquake <=
        dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
        AND distance( NuclearTest.latitude, NuclearTest.longitude,
                      Earthquake.latitude, Earthquake.longitude ) < 10000
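The rule above can be written out as a runnable Python sketch. Several details are assumptions rather than part of the slide: the `Event` record, the use of the haversine great-circle formula for `distance`, miles as the distance unit (taken from the later "10000 miles" wording), and the extra condition that the earthquake must not precede the test, which follows the prose ("some time after") rather than the rule text.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; assumed unit is miles


@dataclass(frozen=True)
class Event:
    """Hypothetical record shared by nuclear tests and earthquakes."""
    event_date: date
    latitude: float
    longitude: float


def date_difference(earlier: date, later: date) -> int:
    """Number of days from `earlier` to `later` (negative if `later` comes first)."""
    return (later - earlier).days


def distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def may_have_caused(test: Event, quake: Event,
                    max_days: int = 30, max_miles: float = 10000.0) -> bool:
    """Apply the slide's rule: quake within max_days after the test and within max_miles."""
    days = date_difference(test.event_date, quake.event_date)
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles


# Illustrative values only; these are not real events.
test = Event(date(2000, 1, 1), 0.0, 0.0)
quake = Event(date(2000, 1, 15), 1.0, 1.0)
print(may_have_caused(test, quake))   # True: 14 days later and close by
print(may_have_caused(quake, test))   # False: the "cause" would follow the effect
```

Finding all candidate pairs is then a filter over the cross product of the two sources, for example `[(t, q) for t in tests for q in quakes if may_have_caused(t, q)]`.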

Knowledge Discovery - Example

When was the first recorded nuclear test conducted? 1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
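The two queries above amount to grouping earthquakes by year and magnitude range and then averaging. A minimal Python sketch follows; the `(year, magnitude)` tuple format, the function names and the half-open handling of bin boundaries (a magnitude of exactly 6.0 falls in the 6-7 range, and 9.0 in the top range) are assumptions, since the slide does not specify them.

```python
from collections import Counter

# Magnitude ranges from the slide as half-open intervals [low, high).
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_bin(magnitude):
    """Return the (low, high) range containing the magnitude, or None if below 5.8."""
    for low, high in BINS:
        if low <= magnitude < high:
            return (low, high)
    return None


def yearly_counts(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from (year, magnitude) pairs."""
    counts = Counter()
    for year, magnitude in quakes:
        bin_ = magnitude_bin(magnitude)
        if year >= start_year and bin_ is not None:
            counts[(year, bin_)] += 1
    return counts


def average_per_year(counts, bin_, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive span of years."""
    total = sum(
        n for (year, b), n in counts.items()
        if b == bin_ and first_year <= year <= last_year
    )
    return total / (last_year - first_year + 1)


# Illustrative data only; not taken from USGS or NEIC.
sample = [(1901, 5.9), (1901, 6.5), (1902, 6.2), (1899, 7.5), (1950, 5.0)]
counts = yearly_counts(sample)
print(average_per_year(counts, (6.0, 7.0), 1901, 1902))   # 1.0
```

Comparing `average_per_year` for the same magnitude range before and after a chosen year is one way to reproduce the kind of observation the slide reports.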

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NEWSML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• SEMANTICWEB: www.semanticweb.org
• VOICEXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998.


HP 51

Automatic Categorization & Metadata Tagging (Feed)

[Screenshot labels: Text from Bloomberg; Auto Categorization; Semantic Metadata]

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)

Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
• Statistical (Information Retrieval)
• Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
• Logic Based (Description Logic)
• Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

• Statistical methods/Cluster Analysis
• Learning/AI and Collab. Filtering
• Reference data/Concept-terms/Dictionary/Thesaurus, by topic/industry/subject/domain (Word or Phrase)
• Ontologies/Domain Models
• Knowledge Base, by Entities and Relationships

[Arrow label: deeper understanding]

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

• Standardize meaning, description, representation of involved attributes
• Capture the semantics involved via domain characteristics
• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Diagram: three interrelated ontologies]

• LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; LAND (SITE); CULTIVATED AREA; GREENLAND AREA; LAND BANK; ZONING; LANDFILL SITE
• WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC., SOLID, SEWAGE; shredding, magnetic separation, screening, washing
• NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, connected by "causes" relationships

HP 61

Large Vocabularies, Taxonomies, Ontologies

• WordNet
• The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

• Traditional queries based on keywords
• Attribute-based queries
• Content-based queries

HP 64

Oingo.com

• Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for
• Closely related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort.

Intelligent content can be more effectively managed, packaged and distributed.

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" Content can only be "found" if the user enters a keyword that exists within it.

[Diagram: "Intelligent Content = ... + ..."; operand labels not recoverable]

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata amp Search

• Metadata can improve search significantly, but metadata enables much more than search
• Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage: Keyword, Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from Typical Cataloging of Football Assets (Virage: search on "football touchdown"):

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline...
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline...

Taalee Metadata on Football Assets (Rich Media Reference Page):

Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
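The contrast on this slide can be sketched in Python as two search functions over the same assets. This is an illustration under assumptions: the `ASSETS` records, the field names and both functions are hypothetical and simplified from the slide's examples, not the Virage or Taalee interfaces.

```python
# Hypothetical asset records, simplified from the slide's two examples.
ASSETS = [
    {
        "title": "Jimmy Smith Interview Part Seven",
        "description": "Jimmy Smith explains his philosophy on showboating",
        "metadata": {},   # typical cataloging: no structured attributes
    },
    {
        "title": "Baltimore 31, Pit 24",
        "description": "Ismail and Banks hook up for their third long touchdown",
        "metadata": {
            "League": ["Professional"],
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Tony Banks"],
            "Event": ["Touchdown"],
            "Produced by": ["NFL.com"],
        },
    },
]


def keyword_search(assets, keyword):
    """Syntactic search: the keyword must literally occur in the title or description."""
    needle = keyword.lower()
    return [
        asset for asset in assets
        if needle in (asset["title"] + " " + asset["description"]).lower()
    ]


def attribute_search(assets, criteria):
    """Attribute search: every (field, value) pair must be present in the metadata."""
    def satisfies(asset):
        return all(
            value in asset["metadata"].get(field, [])
            for field, value in criteria.items()
        )
    return [asset for asset in assets if satisfies(asset)]


print(keyword_search(ASSETS, "Steelers"))   # []: the word never appears in the text
hits = attribute_search(ASSETS, {"Teams": "Steelers", "Event": "Touchdown"})
print([asset["title"] for asset in hits])   # ['Baltimore 31, Pit 24']
```

Because every asset shares the same attribute names, results from multiple sources could also be sorted or filtered on any field, which is the point made on the following slide.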

HP 72

Taaleersquos Semantic Search

• Highly customizable, precise and freshest A/V search
• Context and Domain Specific Attributes
• Uniform Metadata for Content from Multiple Sources; can be sorted by any field
• Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

[Screenshot labels: Taalee Directory, "Georgia Bulldogs" (System recognizes ENTITY & CATEGORY); Taalee Directory, "Careless whisper"]

HP 76

Semantic Relationships

HP 77

Metadata Application Example

• Semantic Applications for highly relevant and fresh content
• Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION: Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc.; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc.; AT&T Corp.; more...
2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU; more...
3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory; more...
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more...

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more...
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more...
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

[Targeted links: Buy Al Pacino Videos; Buy Russell Crowe Videos; Buy Christopher Plummer Videos; Buy Diane Venora Videos; Buy Philip Baker Hall Videos; Buy The Insider Video]

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Time-Shifted Content Aggregator; Web sites and Pages; Content Databases; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen menu: MyStocks, News, Sports, Music, MyMedia]

• User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
• User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen menu: MyStocks, News, Sports, Music, MyMedia. My Stocks list: CSCO, NT, IBM, Market]

• Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
• Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen menu: MyStocks, News, Sports, Music, MyMedia. My Stocks list: CSCO, NT, IBM, Market. CSCO list: Analyst Call, Conf Call, Earnings]

• Different types of recent audio content about Cisco are available
• The user clicks to see a listing of Analyst Calls on Cisco (next slide)
• Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen menu: MyStocks, News, Sports, Music, MyMedia. My Stocks list: CSCO, NT, IBM, Market. CSCO list: Analyst Call, Conf Call, Earnings]

CSCO Analysis listing: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis

• Clicking on the link for Cisco Analyst Calls displays a listing sorted by date
• Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company
• The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Meta-Data Tagged Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Immediate Interests, Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

• This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
• This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
• Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
• The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing; Categorize; Catalog; Integrate; Rich Data Metabase; Production Support; Sony; Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadatarsquos role in emerging iTV infrastructure

MPEG-247MPEG

Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks

Atlanta Falcons Players John KitnaCoaches Mike Holmgren

Dan ReevesLocation Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content = Extractor Agents + Value-added Metadata + Semantic Associations

• Extractor Agents: content which does contain the words the user asked for
• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
• Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g. COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
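The idea above can be sketched in a few lines of Python. This is only an illustration, not Taalee's patented extraction technology: the dictionaries `PLAYER_TEAM`, `TEAM_SPORT` and `TEAM_LEAGUE` and the function `add_value_added_metadata` are hypothetical names, and the tiny knowledge base holds only facts quoted on these slides.

```python
from typing import Dict, List

# Hypothetical mini knowledge base; the facts come from the slides.
PLAYER_TEAM: Dict[str, str] = {
    "Roger Clemens": "New York Yankees",
    "Jamal Anderson": "Atlanta Falcons",
}
TEAM_SPORT: Dict[str, str] = {"New York Yankees": "Baseball"}
TEAM_LEAGUE: Dict[str, str] = {"Atlanta Falcons": "NFL"}


def add_value_added_metadata(extracted: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return a copy of the extracted metadata, enriched via known associations."""
    enriched = {field: list(values) for field, values in extracted.items()}
    for player in extracted.get("Players", []):
        team = PLAYER_TEAM.get(player)
        if team is None:
            continue  # unknown player: add nothing rather than guess
        # Follow PERSON -> TEAM, then TEAM -> SPORT and TEAM -> LEAGUE.
        for field, value in (
            ("Teams", team),
            ("Sport", TEAM_SPORT.get(team)),
            ("League", TEAM_LEAGUE.get(team)),
        ):
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


# The story mentions only the player; the team is added as metadata.
story = {"Players": ["Roger Clemens"]}
enriched_story = add_value_added_metadata(story)
```

With the enriched record, a search on Teams = "New York Yankees" would match a story whose text never names the team.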

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

• Search for 'Jamal Anderson' in 'Football'
• Click on first result for Jamal Anderson
• View metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

• Search for 'Gary Sheffield' in 'Baseball'
• Click on first result for Gary Sheffield
• View metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset:

• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting - extracted automatically
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
• League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
• Sport: name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
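A minimal sketch of how such associations could drive retrieval, assuming a small in-memory knowledge base. The names `KNOWLEDGE_BASE`, `STORIES`, `related_companies` and `semantically_related_stories` are hypothetical, and treating "competes with" as symmetric is an assumption of this sketch; the two competitor facts are the ones quoted on these slides.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical knowledge base: (entity, relationship) -> related entities.
KNOWLEDGE_BASE: Dict[Tuple[str, str], Set[str]] = {
    ("Commerce One", "competes_with"): {"Ariba"},
    ("Cisco", "competes_with"): {"Lucent"},
}

# Hypothetical catalog: each story carries the companies it is about.
STORIES: List[dict] = [
    {"title": "Story on Cisco", "companies": ["Cisco"]},
    {"title": "Story on Lucent", "companies": ["Lucent"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
]


def related_companies(company: str, relationship: str = "competes_with") -> Set[str]:
    """Look up related companies, reading the relationship in both directions."""
    related = set(KNOWLEDGE_BASE.get((company, relationship), set()))
    for (subject, rel), objects in KNOWLEDGE_BASE.items():
        if rel == relationship and company in objects:
            related.add(subject)
    return related


def semantically_related_stories(company: str) -> List[str]:
    """Titles of stories about competitors of the company the user asked for."""
    competitors = related_companies(company)
    return [s["title"] for s in STORIES if competitors.intersection(s["companies"])]
```

A keyword index alone would return only the Cisco story for a Cisco query; the association lookup is what surfaces the Lucent story.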

HP 103

Example (test on http://directory.mediaanywhere.com)

• Search for company 'Commerce One'
• Links to news on companies that compete against Commerce One
• Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)

Components: Extractor Agents 1, 2 and 3; Semantic Engine; World Model; Knowledge Base; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Stories shown: Story on Cisco; Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

Associations around a COMPANY:

• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
• Technology/Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

• Automatic collation of semantically related digital media information from multiple sources
• Research inferred automatically
• Semantically related news, not specifically asked for
• Semantic search, personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

• Logical Layer: formal semantics and reasoning support - OIL, DAML-O
• Schema Layer: definition of vocabulary - RDF Schema
• Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

• Earthquake Sources (USGS, NEIC)
• Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
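The rule above can be read as a predicate over pairs of events. The following Python sketch is one possible reading, not the InfoQuilt implementation: it assumes events are plain dictionaries with `eventDate`, `latitude` and `longitude` keys, takes the thresholds to be days and miles (as a later slide in this example states), uses the haversine great-circle formula for `distance`, and requires the earthquake to come on or after the test.

```python
import math
from datetime import date
from typing import Dict, List, Tuple

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


def distance_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def could_have_caused(test: Dict, quake: Dict, max_days: int = 30, max_miles: float = 10000) -> bool:
    """True if the quake follows the test within max_days and lies within max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


def candidate_pairs(tests: List[Dict], quakes: List[Dict]) -> List[Tuple[Dict, Dict]]:
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if could_have_caused(t, q)]


# Made-up sample events, for illustration only.
sample_test = {"eventDate": date(1990, 1, 1), "latitude": 37.1, "longitude": -116.0}
quake_after = {"eventDate": date(1990, 1, 15), "latitude": 36.0, "longitude": -117.5}
quake_before = {"eventDate": date(1989, 12, 20), "latitude": 36.0, "longitude": -117.5}
```

The rule only selects candidate pairs; it expresses a possible relationship, not evidence of causation.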

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
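The per-year, per-range counts behind such a query could be computed along these lines. This is a hedged sketch: the `(year, magnitude)` tuple layout, the function names, and the choice of half-open ranges (with the last range including 9.0 itself) are assumptions, and no real earthquake data is included.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Magnitude ranges from the slide; the upper bound is exclusive, None means open-ended.
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, None)]


def range_label(magnitude: float) -> Optional[str]:
    """Label of the range a magnitude falls in, or None if it is below 5.8."""
    for low, high in RANGES:
        if magnitude >= low and (high is None or magnitude < high):
            return f"{low}-{high}" if high is not None else f">={low}"
    return None


def quakes_per_year_by_range(quakes: Iterable[Tuple[int, float]], start_year: int = 1900) -> Counter:
    """Count earthquakes per (year, magnitude range) from start_year onward."""
    counts: Counter = Counter()
    for year, magnitude in quakes:
        label = range_label(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts
```

Comparing these counts before and after a given year is then a simple aggregation over the returned `Counter`.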

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NewsML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• Semantic Web: www.semanticweb.org
• VoiceXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 52

Taalee Extraction and Knowledgebase Enhancement

Extraction Agent

Web Page Enhanced Metadata Asset

HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event), Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference:

• Statistical (Information Retrieval)
• Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
• Logic Based (Description Logic)
• Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

• Statistical methods: Cluster Analysis
• Learning/AI and Collab. Filtering
• Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain; Word or Phrase
• Ontologies/Domain Models
• KnowledgeBase: by Entities and Relationships

(Arrow label on slide: deeper understanding)

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

• Standardize meaning, description, representation of involved attributes
• Capture the semantics involved via domain characteristics
• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

Diagram labels, grouped by ontology:

• LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
• LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
• WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
• NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; four "causes" relationships are drawn between disaster types

HP 61

Large Vocabularies, Taxonomies/Ontologies

• WordNet
• The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

• traditional queries based on keywords
• attribute-based queries
• content-based queries

HP 64

Oingo.com

• Oingo Ontology - ODP-based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for
• Closely-related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search
• Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic Metadata

Virage search on "football touchdown" (metadata from typical cataloging of football assets):

• Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
• Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline

Taalee Metadata on Football Assets (Rich Media Reference Page):

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
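The contrast on this slide can be illustrated with a small Python sketch. The `ASSETS` list, `keyword_search` and `attribute_search` are hypothetical names, and the `Event: Interview` value for the second asset is an assumption made for the example; the remaining values are taken from the slide.

```python
from typing import Dict, List

# Hypothetical catalog with two of the assets shown on the slide.
ASSETS: List[Dict] = [
    {
        "title": "Baltimore 31, Pit 24",
        "description": "Ismail and Banks hook up for their third long touchdown",
        "metadata": {"Teams": ["Ravens", "Steelers"], "Players": ["Tony Banks"],
                     "Event": ["Touchdown"]},
    },
    {
        "title": "Brian Griese Interview, Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        # "Interview" is an assumed value, not shown on the slide.
        "metadata": {"Players": ["Brian Griese"], "Event": ["Interview"]},
    },
]


def keyword_search(term: str) -> List[str]:
    """Syntactic match: the term must occur in the title or description text."""
    needle = term.lower()
    return [a["title"] for a in ASSETS
            if needle in (a["title"] + " " + a["description"]).lower()]


def attribute_search(**criteria: str) -> List[str]:
    """Match on structured metadata fields, e.g. attribute_search(Event="Touchdown")."""
    return [a["title"] for a in ASSETS
            if all(value in a["metadata"].get(field, []) for field, value in criteria.items())]
```

Here `keyword_search("touchdown")` returns both assets, including the interview that merely mentions a touchdown, while `attribute_search(Event="Touchdown")` returns only the actual touchdown clip.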

HP 72

Taaleersquos Semantic Search

• Highly customizable, precise and freshest A/V search
• Context and domain-specific attributes
• Uniform metadata for content from multiple sources; can be sorted by any field
• Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

• Semantic Applications for highly relevant and fresh content
• Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc
• AT&T Corp
• more…

2. My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone…
• Edwards explains reasons for leaving BYU
• more…

3. Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory
• more…

4. Pink Floyd Collection
• Set the Controls for the Heart of the Sun…
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream
• more…

HOT Topics

1. Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge
• more…

2. Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security
• more…

3. Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II
• more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

CSCO Analysis

• 11/08 ON24 Payne
• 11/07 ON24 H&Q
• 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine(TM); Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Diagram labels: Collection (Network Content; Affiliate Feeds; Public Sources); Processing (Categorize; Catalog; Integrate); Rich Data Metabase; Production Support (Sony): Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast

Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC

(2:53) - 12/06/00 - CBS

(5:16) - 12/06/00 - ABC

(2:46) - 12/06/00 - FOX

(1:33) - 12/06/00 - NBC

(5:33) - 12/06/00

(3:57) - 12/06/00 - CBS

(4:27) - 12/06/00 - ABC

(3:44) - 12/06/00 - FOX

(7:24) - 12/06/00 - CBS

(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC

(2:33) - 12/06/00 - CBS

(3:12) - 12/06/00 - NNS

(0:32) - 12/06/00 - CBS

(1:33) - 12/06/00 - CBS

Description: Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE

Taalee Semantic Engine labels: Node ("Cisco Systems"); Enhanced XML Description ("Cisco Systems"); Object Content Information (OCI); Metadata-rich, Value-added Node

Example description: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: Jon Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

• License metadata decoder and semantic applications to device makers
• Channel sales through Video Server Vendors, Video App Servers and Broadcasters

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content = Extractor Agents + Value-added Metadata + Semantic Associations

• Extractor Agents: content which does contain the words the user asked for
• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
• Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
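The enrichment idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Taalee's patented method: the `KNOWLEDGE_BASE` dictionary, the relationship names (`plays_for`, `member_of`, `plays_sport`), and the functions `enrich` and `matches` are all hypothetical names introduced here.

```python
# Hypothetical, hand-built knowledge base: entity -> facts about that entity.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "plays_sport": "Baseball",
                      "plays_for": "New York Yankees"},
    "Jamal Anderson": {"type": "PERSON", "plays_sport": "Football",
                       "plays_for": "Atlanta Falcons"},
    "Atlanta Falcons": {"type": "TEAM", "member_of": "NFL"},
}


def _add(metadata, field, value):
    """Append value to a metadata field unless it is already present."""
    values = metadata.setdefault(field, [])
    if value not in values:
        values.append(value)


def enrich(metadata, kb):
    """Return a copy of metadata with Teams, League and Sport inferred from Players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = kb.get(player, {})
        team = facts.get("plays_for")
        if team:
            _add(enriched, "Teams", team)
            league = kb.get(team, {}).get("member_of")
            if league:
                _add(enriched, "League", league)
        sport = facts.get("plays_sport")
        if sport:
            _add(enriched, "Sport", sport)
    return enriched


def matches(metadata, query):
    """Case-insensitive substring search over all metadata values."""
    q = query.lower()
    return any(q in value.lower()
               for values in metadata.values() for value in values)
```

With a story whose extracted metadata only lists `Players: Jamal Anderson`, `matches(story, "Atlanta Falcons")` is false before enrichment and true after `enrich(story, KNOWLEDGE_BASE)`, which is the behaviour the slide describes.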

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

• Search for 'Jamal Anderson' in 'Football'
• Click on first result for Jamal Anderson
• View metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

• Search for 'Gary Sheffield' in 'Baseball'
• Click on first result for Gary Sheffield
• View metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting - extracted automatically
• Sport: name of sport
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
• League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways the users choose to interact.

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets, to Understand and Associate all content it catalogs.
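One way to picture such associations is as labelled edges between entities, which an application can follow to pull in related content. The Python below is a minimal sketch under that assumption; `AssociationGraph`, `related_stories`, the relation name `competes_with`, and the story records are hypothetical and do not describe Taalee's Semantic Engine.

```python
from collections import defaultdict


class AssociationGraph:
    """Stores (subject, relation) -> set of related entities."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj, symmetric=False):
        """Record that subject is related to obj; optionally in both directions."""
        self._edges[(subject, relation)].add(obj)
        if symmetric:
            self._edges[(obj, relation)].add(subject)

    def related(self, subject, relation):
        """Return the entities linked to subject by relation, sorted by name."""
        return sorted(self._edges.get((subject, relation), ()))


def related_stories(entity, graph, stories):
    """Titles of stories that mention a competitor of the given entity."""
    competitors = set(graph.related(entity, "competes_with"))
    return [s["title"] for s in stories if competitors & set(s["companies"])]
```

After `graph.add("Commerce One", "competes_with", "Ariba", symmetric=True)`, a query about Commerce One can also surface stories tagged with Ariba, even though the user never typed "Ariba".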

HP 103

Example (test on http://directory.mediaanywhere.com)

• Search for company 'Commerce One'
• Links to news on companies that compete against Commerce One
• Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram. Components: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1, 2 and 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML or other format; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY
• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

• Research Inferred Automatically
• Semantically Related News Not Specifically Asked For
• Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition. Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

• Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
• Schema Layer: Definition of Vocabulary - RDF Schema
• Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes: is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude ) < 10000
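The rule above can be read as a predicate over a (nuclear test, earthquake) pair. The Python sketch below is one possible reading, not the InfoQuilt implementation. It assumes that `dateDifference` is measured in days from the test to the earthquake, that `distance` is a great-circle distance in miles (computed here with the haversine formula), and that events are plain dictionaries; the function names `distance_miles`, `may_have_caused` and `candidate_pairs` are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test by fewer than max_days and lies within max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


def candidate_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if may_have_caused(t, q)]
```

The thresholds 30 and 10000 are taken from the rule as written; requiring the earthquake to come after the test follows the prose ("some time after") rather than the formula, which does not state a sign convention.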

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?
1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
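The grouping query above can be sketched as a small aggregation. This is an illustrative Python version only: the input format (a list of `(year, magnitude)` tuples), the half-open handling of range boundaries, and the names `BINS`, `bin_label` and `average_per_year` are assumptions, since the slide does not say how boundary values such as exactly 6.0 or 9.0 are assigned.

```python
from collections import Counter

# (label, inclusive lower bound, exclusive upper bound); boundary handling is an assumption.
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def bin_label(magnitude):
    """Return the label of the range containing magnitude, or None if below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range over an inclusive year span."""
    counts = Counter()
    for year, magnitude in quakes:
        if first_year <= year <= last_year:
            label = bin_label(magnitude)
            if label is not None:
                counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for label, _, _ in BINS}
```

Calling `average_per_year` once for the years before 1945 and once for the years after would give the two sets of averages that the slide's conclusion compares.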

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 53

Basis for Semantics

A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary

B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event), Knowledge Base

HP 54

Basis for Semantics

C. Reasoning/Inference
• (Statistical) (Information Retrieval)
• Statistical Learning/AI (Bayesian, Neural Networks, HMM...)
• Logic Based (Description Logic)
• Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

• Statistical methods/Cluster Analysis
• Learning/AI and Collab. Filtering
• Reference data/Concept-terms/Dictionary/Thesaurus, by topic/industry/subject/domain (Word or Phrase)
• Ontologies/Domain Models
• Knowledge Base, by Entities and Relationships

[Diagram axis label: deeper understanding]

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

• Standardize meaning, description, representation of involved attributes
• Capture the semantics involved via domain characteristics
• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:
• Attributes
• Domain Rules
• Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Diagram of interrelated ontologies. Node labels, in the order they appear:]
• LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
• LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
• WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE (shredding, magnetic separation, screening, washing)
• NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI (several nodes linked by "causes" relationships)

HP 61

Large Vocabularies, Taxonomies, Ontologies

• WordNet
• The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

• Traditional queries based on keywords
• Attribute-based queries
• Content-based queries

HP 64

Oingo.com

• Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for
• Closely-related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort.

Intelligent content can be more effectively managed, packaged and distributed.

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" Content can only be "found" if the user enters a keyword that exists within it.

Intelligent Content = [graphic] + [graphic]

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search
• Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage: Keyword, Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Metadata from Typical Cataloging of Football Assets (Virage search on "football touchdown"):
• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee Metadata on Football Assets (Rich Media Reference Page):

Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
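The difference between the two kinds of search can be shown with a toy catalog. The Python below is a sketch under assumptions: the record layout (`description` plus a `metadata` dictionary) and the functions `keyword_search` and `attribute_search` are hypothetical, and the single asset reuses the Ravens/Steelers example from this slide.

```python
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "description": ("Quandry Ismail and Tony Banks hook up for their third long "
                        "touchdown, this time on a 76-yarder, to extend the Ravens' "
                        "lead to 31-24 in the third quarter."),
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
            "Produced by": "NFL.com",
        },
    },
]


def keyword_search(assets, *words):
    """Assets whose free-text description contains every given word."""
    return [a for a in assets
            if all(w.lower() in a["description"].lower() for w in words)]


def attribute_search(assets, **criteria):
    """Assets whose structured metadata matches every field=value criterion."""
    def field_matches(asset, field, wanted):
        value = asset["metadata"].get(field)
        if isinstance(value, list):
            return wanted in value
        return value == wanted

    return [a for a in assets
            if all(field_matches(a, f, w) for f, w in criteria.items())]
```

Here `keyword_search(ASSETS, "Steelers")` returns nothing because the description never names the opposing team, while `attribute_search(ASSETS, Teams="Steelers", Event="Touchdown")` finds the asset through its structured fields.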

HP 72

Taalee's Semantic Search

• Highly customizable, precise and freshest A/V search
• Context and Domain Specific Attributes
• Uniform Metadata for Content from Multiple Sources; can be sorted by any field
• Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory: Georgia Bulldogs
System recognizes ENTITY & CATEGORY

Taalee Directory: Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc
• AT&T Corp
• more...

2. My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone...
• Edwards explains reasons for leaving BYU
• more...

3. Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory
• more...

4. Pink Floyd Collection
• Set the Controls for the Heart of the Sun...
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream
• more...

HOT Topics

1. Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge
• more...

2. Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security
• more...

3. Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II
• more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram. Labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests, Preferences; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless device screen. Menu: MyStocks, News, Sports, Music, MyMedia]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless device screen. Menu: MyStocks, News, Sports, Music, MyMedia. My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless device screen. Menu: MyStocks, News, Sports, Music, MyMedia. My Stocks list: CSCO, NT, IBM, Market. CSCO submenu: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless device screen. Menu: MyStocks, News, Sports, Music, MyMedia. My Stocks list: CSCO, NT, IBM, Market. CSCO submenu: Analyst Call, Conf Call, Earnings]

CSCO Analysis
• 11/08 ON24 Payne
• 11/07 ON24 H&Q
• 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.

HP 87

iTV: Taalee's Extreme Personalization

[Diagram. Labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Meta-Data Tagged Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

• This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
• This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO
• Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
• The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Diagram. Labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing (Categorize, Catalog, Integrate); Rich Data Metabase; Production Support (Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication); Sony]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Video: Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
• Gore Demands That Recount Restart
• Gore Says Fla. Can't Name Electors
• Bush Meets Colin Powell at Ranch
• Market Tumbles on Earnings Warning
• Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadatarsquos role in emerging iTV infrastructure

MPEG-247MPEG

Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks

Atlanta Falcons Players John KitnaCoaches Mike Holmgren

Dan ReevesLocation Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Content which doescontain the wordsthe user asked for

Extractor Agents

Content which does not contain the words

the user asked for but is about what he asked

for

Value-added Metadata

Content the user did not think to ask for but

which he needs to know

Semantic Associations

+ +

Metadata for Intelligent ContentMetadata for Intelligent Content

Usage

HP 94

Intelligent Contentvia

Value-Added Metadata

HP 95

Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable

users to access content

bull If a keyword is not in the content it cannot be found

bull The burden is on the user to think of and ask for the ldquorightrdquo keyword

For example If a story is about ldquoRoger Clemensrdquo but does not contain the

words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user

searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called

Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)

to add missing metadata to describe content more completely

HP 96

Guided Demo for Value Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 – Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 – Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content – Value-Added Metadata

Rich Media Sports Asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships. The asset is richly and fully described in the many ways the users choose to interact.

• Posted Date: date of asset posting – extracted automatically

• Producer Name: name of content provider that produced the asset

• Player Names: names of players mentioned explicitly in the asset – extracted automatically

• Team Name: name of team for which player plays – not mentioned explicitly in asset – value-added using Taalee's semantic relationships

• League Name: name of league to which the player's team belongs – not mentioned explicitly in asset – value-added by Taalee's processing based on semantic associations

• Sport: name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
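A minimal sketch of how such associations might be stored and queried, assuming a simple list of subject-relation-object triples. The relation names (`is_a`, `participates_in`, `competes_with`), the functions, and the choice to treat competition as symmetric are illustrative assumptions, not a description of Taalee's Semantic Content Model; the facts are the ones given in the Commerce One example.

```python
from collections import defaultdict

# Facts from the Commerce One example, written as (subject, relation, object).
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Assumption: competition holds in both directions.
SYMMETRIC = {"competes_with"}


def build_index(triples):
    """Index triples as entity -> relation -> set of related entities."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
        if relation in SYMMETRIC:
            index[obj][relation].add(subject)
    return index


def related(index, entity, relation):
    """Entities linked to `entity` by `relation`, in sorted order."""
    return sorted(index.get(entity, {}).get(relation, set()))


index = build_index(TRIPLES)
print(related(index, "Commerce One", "competes_with"))
print(related(index, "Ariba", "competes_with"))
```

With an index like this, a search for one company can also return links to news about its competitors, which a keyword index alone cannot supply.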

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters)

Components: Extractor Agents 1, 2 and 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults KnowledgeBase for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story – passed on to Dashboard

Result: Story on Cisco, Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

Associations around a COMPANY:

• Related Stock News

• Industry News

• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF+XML

Agent readable Tags

www.daml.org

DAML Example

[Figure: DAML example. Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html]

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O

Schema Layer: Definition of Vocabulary – RDF Schema

Data Layer: Simple data model and syntax for metadata – RDF

OIL – as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
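The rule can be written as a small predicate. The sketch below is an interpretation, not the InfoQuilt implementation: it assumes `dateDifference` counts days from the test to the earthquake (so the earthquake must not precede the test), that `distance` is a great-circle distance in miles, and that events are plain dictionaries. The sample coordinates and dates are made up.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test by fewer than max_days and lies within max_miles."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days_after < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Made-up events, for illustration only.
test_event = {"eventDate": date(1998, 5, 11), "latitude": 27.1, "longitude": 71.8}
quake_soon = {"eventDate": date(1998, 5, 30), "latitude": 37.0, "longitude": 70.0}
quake_late = {"eventDate": date(1998, 7, 1), "latitude": 37.0, "longitude": 70.0}

print(could_have_caused(test_event, quake_soon), could_have_caused(test_event, quake_late))
```

Both thresholds are parameters, so the same function can be used to explore how the number of matching pairs changes as the time window or radius is tightened.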

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale per year starting from 1900, find average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
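The grouping query above could be computed along these lines. This is a sketch under assumptions: earthquakes are given as (year, magnitude) pairs, the ranges are treated as half-open intervals (so a magnitude of exactly 9 falls in the last group), and the sample data is invented purely to show the calculation.

```python
from collections import Counter

# Magnitude ranges from the slide, as half-open intervals [low, high).
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_range(magnitude):
    """Return the range containing the magnitude, or None if it is below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """Average number of earthquakes per year in each magnitude range."""
    counts = Counter()
    for year, magnitude in quakes:
        bucket = magnitude_range(magnitude)
        if bucket is not None and first_year <= year <= last_year:
            counts[bucket] += 1
    years = last_year - first_year + 1
    return {bucket: counts[bucket] / years for bucket in RANGES}


# Invented sample data: (year, magnitude).
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.9), (1901, 7.2), (1901, 4.0)]
averages = average_per_year(sample, 1900, 1901)
print(averages)
```

Running the function separately for the years before and after 1945 would give the per-range comparison that the slide's conclusion relies on.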

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


HP 54

Basis for Semantics

• Reasoning/Inference
• (Statistical) (Information Retrieval)
• Statistical Learning/AI (Bayesian, Neural Networks, HMM…)
• Logic Based (Description Logic)
• Natural Language/Grammar (part of speech)

HP 55

Alternatives for Metadata Extraction

In order of deeper understanding:

• Statistical methods: Cluster Analysis

• Learning/AI and Collab. Filtering

• Reference data: Concept-terms, Dictionary, Thesaurus (by topic/industry/subject/domain; word or phrase)

• Ontologies/Domain Models

• KnowledgeBase: by Entities and Relationships

HP 56

Open Directory Project (ODP) ClassificationTaxonomy amp Directory

HP 57

Ontology

• Standardize meaning, description, representation of involved attributes
• Capture the semantics involved via domain characteristics
• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: interrelated ontologies for LAND USE, LAND (SITE), WASTE DISPOSAL and NATURAL DISASTER. Node labels: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE; RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE, shredding, magnetic separation, screening, washing; EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI. Several links between nodes are labeled "causes".]

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query processing

• traditional queries based on keywords
• attribute based queries
• content-based queries

HP 64

Oingo.com

• Oingo Ontology – ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for

• Closely-related, high-value information beyond what was requested

• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = Content + Metadata + Relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata amp Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDISUGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Metadata from Typical Cataloging of Football Assets (Virage search on "football touchdown"):

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee Metadata on Football Assets (Rich Media Reference Page):

Baltimore 31 Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
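The contrast on this slide can be sketched as two small search functions over the same records. The record layout and the functions `keyword_search` and `attribute_search` are hypothetical; the sample values are the ones shown on the slide. The first asset has only a title and description, as in typical cataloging, while the second also carries structured metadata.

```python
ASSETS = [
    {   # Typical cataloging: free text only.
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},
    },
    {   # Semantic cataloging: free text plus structured attributes.
        "title": "Baltimore 31 Pit 24",
        "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, *words):
    """Titles of assets whose title or description contains every word."""
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if all(word.lower() in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Titles of assets whose metadata matches every field=value criterion."""
    def matches(metadata, field, wanted):
        value = metadata.get(field)
        if isinstance(value, list):
            return wanted in value
        return value == wanted

    return [asset["title"] for asset in assets
            if all(matches(asset["metadata"], field, wanted)
                   for field, wanted in criteria.items())]


print(keyword_search(ASSETS, "touchdown"))
print(attribute_search(ASSETS, Event="Touchdown", Teams="Ravens"))
```

The keyword query returns both assets, including an interview that merely mentions a touchdown, while the attribute query returns only the asset whose Event field is actually a touchdown involving the Ravens.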

HP 72

Taaleersquos Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

• Semantic Applications for highly relevant and fresh content
• Personalization and
• Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…

2 My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…

3 Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…

4 Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1 Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…

2 Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…

3 Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds, Web sites and Pages, and Content Databases feed a Time-Shifted Content Aggregator; the Semantic Engine(TM) and a Structured Hi-Quality Semantic Metabase combine Content with Interests and Preferences to deliver Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless home page with the categories MyStocks, News, Sports, Music, MyMedia]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: categories MyStocks, News, Sports, Music, MyMedia; My Stocks list showing CSCO, NT, IBM, Market]

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically search can be done by typing one word or selecting from a dynamic, personalized menu

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: categories MyStocks, News, Sports, Music, MyMedia; My Stocks list showing CSCO, NT, IBM, Market; CSCO menu showing Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: categories MyStocks, News, Sports, Music, MyMedia; My Stocks list showing CSCO, NT, IBM, Market; CSCO menu showing Analyst Call, Conf Call, Earnings]

CSCO Analysis

• 11/08 ON24 Payne
• 11/07 ON24 H&Q
• 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV Taaleersquos Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information – including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter Search ConsolidatePersonalize ArchiveLicensing Syndication

Production SupportProduction SupportSony

Categorize

Catalog

Integrate

CollectionCollection ProcessingProcessing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --

• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast

Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

• Gore Demands That Recount Restart
• Gore Says Fla. Can't Name Electors
• Bush Meets Colin Powell at Ranch
• Market Tumbles on Earnings Warning
• Barak Outlines His Peace Plan

Clips (duration - date - source):

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Description

Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Enhanced Digital Cable; Video; MPEG Encoder (MPEG-2/4/7); MPEG Decoder; Create Scene Description Tree; Scene Description Tree; Retrieve Scene Description Track; Node = AVO Object; Taalee Semantic Engine; Enhanced XML Description; Node "Cisco Systems"; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

HP 93

Intelligent Metadata Creation

Content which doescontain the wordsthe user asked for

Extractor Agents

Content which does not contain the words

the user asked for but is about what he asked

for

Value-added Metadata

Content the user did not think to ask for but

which he needs to know

Semantic Associations

+ +

Metadata for Intelligent ContentMetadata for Intelligent Content

Usage

HP 94

Intelligent Contentvia

Value-Added Metadata

HP 95

Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable

users to access content

bull If a keyword is not in the content it cannot be found

bull The burden is on the user to think of and ask for the ldquorightrdquo keyword

For example If a story is about ldquoRoger Clemensrdquo but does not contain the

words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user

searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called

Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)

to add missing metadata to describe content more completely

HP 96

Guided Demo for Value Added Metadata ndashExample one

bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson

bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata

on the following RMR page

bull Here is what you see

Produced by NFLcom Posted Date 9202000 League NFL

Teams Atlanta Falcons Players Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield
• Click on the first result (titled "I want out") and view the metadata on the following RMR page
• Here is what you see:
    Produced by: ESPN; Posted Date: 3/03/2001; League: National League
    Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate the story with the heading "I want out"
• Verify that neither Team = Los Angeles Dodgers nor League = National League was present in the source content
• Taalee attached this value-added metadata to the asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

• Search for 'Jamal Anderson' in 'Football'
• Click on the first result for Jamal Anderson
• View the metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name or League name. They were Taalee's value additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

• Search for 'Gary Sheffield' in 'Baseball'
• Click on the first result for Gary Sheffield
• View the metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name or League name. They were Taalee's value additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Metadata attached to a rich media sports asset:

• Producer Name: name of the content provider that produced the asset
• Posted Date: date of asset posting - extracted automatically
• Sport: name of sport
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Team Name: name of the team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships
• League Name: name of the league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations

Legend: "X -> Y" means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

• Search for company 'Commerce One'
• Links to news on companies that compete against Commerce One
• Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture (diagram)

Content sources: Internal Source 1 (Research), Internal Source 2, external feeds/Web (e.g., Reuters)

Components: Extractor Agents 1-3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), third-party Content Mgmt and Syndication. Output: XCM-compliant metadata, XML, or other format

Flow shown in the diagram:

1. Cisco story from PW Source 1 is passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard (Story on Cisco, Story on Lucent)
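The four-step flow in this diagram (take a story, consult the knowledge base for competitors, pick semantically related stories from the feeds) might be sketched as follows. This is an illustration under stated assumptions, not the Semantic Engine itself: `Story`, `COMPETES_WITH`, `competitors`, and `related_stories` are hypothetical names, and the competitor pairs are the two that appear in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    headline: str
    company: str  # company the story is about (assumed already extracted)
    source: str


# Hypothetical knowledge base of symmetric "competes with" associations.
COMPETES_WITH = {("Cisco", "Lucent"), ("Commerce One", "Ariba")}


def competitors(company):
    """Steps 2-3: look up the companies that compete with `company`."""
    found = set()
    for a, b in COMPETES_WITH:
        if a == company:
            found.add(b)
        elif b == company:
            found.add(a)
    return found


def related_stories(story, feed):
    """Step 4: pick feed stories about competitors of the story's company."""
    rivals = competitors(story.company)
    return [candidate for candidate in feed if candidate.company in rivals]


cisco_story = Story("Story on Cisco", "Cisco", "Internal Source 1")
external_feed = [
    Story("Story on Lucent", "Lucent", "External feed"),
    Story("Story on IBM", "IBM", "External feed"),
]
dashboard = [cisco_story] + related_stories(cisco_story, external_feed)
print([s.headline for s in dashboard])
```

The user asked only for the Cisco story. The Lucent story is added because the knowledge base links the two companies, not because of any shared keyword.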

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What you asked for + What you need to know

Associations around a COMPANY:

• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
• Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
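To make "rules which allow us to draw inference" concrete, here is a small Python sketch that applies a single RDFS-style rule (if x has type C and C is a subclass of D, then x has type D) to a set of triples until nothing new is derived. It is far weaker than a description logic reasoner and is only meant to show the idea. The triples, the plain-string predicate names, and `infer_types` are illustrative assumptions.

```python
def infer_types(triples):
    """Close a set of (subject, predicate, object) triples under one rule:
    (x, "type", C) and (C, "subClassOf", D)  =>  (x, "type", D)."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in facts if p == "subClassOf"]
        for s, p, o in list(facts):
            if p != "type":
                continue
            for child, parent in subclass_pairs:
                derived = (s, "type", parent)
                if child == o and derived not in facts:
                    facts.add(derived)
                    changed = True
    return facts


# Illustrative data: the class names echo the OIL herbivore example,
# the individual "rabbit" and the class "living_thing" are made up.
data = {
    ("rabbit", "type", "herbivore"),
    ("herbivore", "subClassOf", "animal"),
    ("animal", "subClassOf", "living_thing"),
}
closed = infer_types(data)
print(sorted(t for t in closed if t not in data))
```

The two derived triples were never stated in the data. They follow from the rule, which is the kind of added value a logic layer on top of plain RDF statements is meant to provide.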

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition. Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

• Logical Layer: formal semantics and reasoning support - OIL, DAML-O
• Schema Layer: definition of vocabulary - RDF Schema
• Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

    NuclearTest Causes Earthquake
      <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
         AND distance( NuclearTest.latitude, NuclearTest.longitude,
                       Earthquake.latitude, Earthquake.longitude ) < 10000
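The rule above can be written as a runnable predicate. The sketch below makes several assumptions that the slide does not state: dates are calendar dates and the difference is measured in days, distance is great-circle (haversine) distance in miles, and the earthquake must not precede the test. The names `Event`, `distance_miles`, and `could_have_caused` are hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake: quake follows the test within max_days
    and lies within max_miles of it (thresholds taken from the slide)."""
    days = (quake.event_date - test.event_date).days
    if not 0 <= days < max_days:  # assumption: the quake must not precede the test
        return False
    return distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles


test = Event(date(1970, 1, 1), 37.0, -116.0)
quake_soon_after = Event(date(1970, 1, 10), 36.0, -117.0)
quake_before = Event(date(1969, 12, 25), 36.0, -117.0)
print(could_have_caused(test, quake_soon_after), could_have_caused(test, quake_before))
```

The greatest possible great-circle distance on Earth is roughly 12,400 miles, so a 10000-mile threshold excludes only pairs that are close to opposite sides of the globe.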

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
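The grouping query on this slide amounts to binning earthquakes by magnitude range and averaging the yearly counts. A minimal Python sketch follows. The sample records are invented for illustration and are not real seismic data. Treating each range as half-open (lower bound included, upper bound excluded) and the last range as 9 and above are assumptions, as are the names `BANDS`, `band_of`, and `average_per_year`.

```python
from collections import Counter

# Magnitude ranges from the slide, as (lower bound, upper bound, label).
BANDS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def band_of(magnitude):
    """Return the label of the range containing `magnitude`, or None if below 5.8."""
    for low, high, label in BANDS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average number of earthquakes per year in each range, over the
    inclusive span first_year..last_year. `quakes` holds (year, magnitude)."""
    counts = Counter()
    for year, magnitude in quakes:
        label = band_of(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    years = last_year - first_year + 1
    return {label: counts[label] / years for _, _, label in BANDS}


# Made-up records: (year, magnitude).
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1), (1899, 8.0), (1901, 5.0)]
averages = average_per_year(sample, 1900, 1901)
print(averages)
```

Running the same aggregation separately for the years before and after 1945 would give the comparison the slide draws between magnitude ranges.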

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test lies within a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NewsML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• Semantic Web: www.semanticweb.org
• VoiceXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com
• Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw Hill, ISBN 0-07-057735-8, 1998


HP 55

Alternatives for Metadata Extraction

• Statistical methods, cluster analysis
• Learning/AI and collab. filtering
• Reference data (concept-terms, dictionary, thesaurus), by topic/industry/subject/domain: word or phrase
• Ontologies/domain models
• KnowledgeBase: by entities and relationships

Arrow in diagram: deeper understanding

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

• Standardize meaning, description, and representation of involved attributes
• Capture the semantics involved via domain characteristics
• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies (diagram). Concepts shown:

• LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
• LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
• WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
• NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, with four "causes" relationships drawn between disaster types

HP 61

Large Vocabularies, Taxonomies, Ontologies

• WordNet
• The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

• Traditional queries based on keywords
• Attribute-based queries
• Content-based queries

HP 64

Oingo.com

• Oingo Ontology (ODP based?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for
• Closely related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content: adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search
• Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from typical cataloging of football assets (Virage search on "football touchdown"):

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…

Taalee metadata on football assets (Rich Media Reference Page):

Baltimore 31, Pit 24 (http://www.nfl.com)

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

• League: Professional
• Teams: Ravens, Steelers
• Score: Bal 31, Pit 24
• Players: Quandry Ismail, Tony Banks
• Event: Touchdown
• Produced by: NFL.com
• Posted date: 2/02/2000
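The contrast on this slide can be sketched in a few lines of Python. The two assets reuse text from the slide, but the catalog layout, the `Event: Interview` value for the second asset, and the function names `keyword_search` and `attribute_search` are illustrative assumptions rather than the Virage or Taalee implementations.

```python
# Illustrative catalog. Field names follow the slide's metadata labels.
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "text": ("Quandry Ismail and Tony Banks hook up for their third long "
                 "touchdown, this time on a 76-yarder, to extend the Ravens' "
                 "lead to 31-24 in the third quarter"),
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
            "Produced by": "NFL.com",
        },
    },
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        # Assumed metadata: the slide shows no structured fields for this asset.
        "metadata": {"Players": ["Brian Griese"], "Event": "Interview"},
    },
]


def keyword_search(assets, keyword):
    """Syntactic search: match the keyword anywhere in title or text."""
    needle = keyword.lower()
    return [a for a in assets if needle in (a["title"] + " " + a["text"]).lower()]


def attribute_search(assets, criteria):
    """Attribute search: every (field, value) pair must match the metadata."""
    def matches(metadata, field, wanted):
        value = metadata.get(field)
        if isinstance(value, list):
            return wanted in value
        return value == wanted

    return [a for a in assets
            if all(matches(a["metadata"], f, w) for f, w in criteria.items())]


print([a["title"] for a in keyword_search(ASSETS, "touchdown")])
print([a["title"] for a in attribute_search(ASSETS, {"Event": "Touchdown"})])
print([a["title"] for a in attribute_search(ASSETS, {"Teams": "Steelers"})])
```

The keyword query returns the interview as well as the play, because both mention the word. The attribute query on Event returns only the actual touchdown, and the query on Teams finds the asset even though "Steelers" never occurs in its text.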

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and domain specific attributes. Uniform metadata for content from multiple sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content: personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…
2. My Football Fantasy Team: Gators Spurrier ready for big game; Techs Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…
3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1. Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Coles security; more…
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

HP 82

Web Extreme Personalization (diagram)

• Content inputs: Realtime Feeds, Time-Shifted Content Aggregator, Web sites and Pages, Content Databases
• User inputs: Interests, Preferences
• Processing: Semantic Engine™ with a Structured Hi-Quality Semantic Metabase
• Output: Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine™

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize, Catalog, Integrate

Collection, Processing

Network Content, Affiliate Feeds, Public Sources, Rich Data

Metabase

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --

Customize Page: Settings | Content | Layout | Color

• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast

Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

• Gore Demands That Recount Restart
• Gore Says Fla. Can't Name Electors
• Bush Meets Colin Powell at Ranch
• Market Tumbles on Earnings Warning
• Barak Outlines His Peace Plan

Clips (duration - date - source):

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description

Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure (diagram)

• Labels shown: Video; MPEG Encoder; MPEG-2/4/7; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced XML Description; "Cisco Systems" Node; Taalee Semantic Engine; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE
• Business model: license metadata decoder and semantic applications to device makers; channel sales through Video Server Vendors, Video App Servers, and Broadcasters
• Example OCI metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation

• Extractor Agents: content which does contain the words the user asked for
• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
• Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata that describes the content more completely

HP 96

Guided Demo for Value Added Metadata ndashExample one

bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson

bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata

on the following RMR page

bull Here is what you see

Produced by NFLcom Posted Date 9202000 League NFL

Teams Atlanta Falcons Players Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)

Click on first result for Gary Sheffield

View metadata Note that Team name and League name are also included

in the metadata

Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content – Value-Added Metadata

Metadata fields for a Rich Media Sports Asset:

• Producer Name: name of content provider that produced the asset

• Posted Date: date of asset posting – extracted automatically

• Player Names: names of players mentioned explicitly in the asset – extracted automatically

• Sport: name of sport

• Team Name: name of team for which player plays – not mentioned explicitly in asset – value-added using Taalee’s semantic relationships

• League Name: name of league to which the player’s team belongs – not mentioned explicitly in asset – value-added by Taalee’s processing based on semantic associations

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact with it

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
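The association lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not Taalee's implementation: the triple store, the relation names (`is_a`, `participates_in`, `competes_with`), the story catalog, and the `search` helper are all assumptions introduced here; only the Commerce One and Ariba facts come from the slide.

```python
from typing import Dict, List, Set, Tuple

# Facts from the slide, stored as (subject, relation, object) triples.
# The relation names are illustrative assumptions.
TRIPLES: List[Tuple[str, str, str]] = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Relations that hold in both directions (assumed here for competition).
SYMMETRIC: Set[str] = {"competes_with"}


def related(entity: str, relation: str) -> Set[str]:
    """Return every entity linked to `entity` through `relation`."""
    found: Set[str] = set()
    for subject, predicate, obj in TRIPLES:
        if predicate != relation:
            continue
        if subject == entity:
            found.add(obj)
        elif predicate in SYMMETRIC and obj == entity:
            found.add(subject)
    return found


# Hypothetical catalog: each story carries the companies found in its metadata.
STORIES: List[Dict[str, object]] = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on an unrelated company", "companies": ["Example Corp"]},
]


def search(company: str) -> Dict[str, List[str]]:
    """Return what was asked for plus news on the company's competitors."""
    competitors = related(company, "competes_with")
    asked = [s["title"] for s in STORIES if company in s["companies"]]
    associated = [s["title"] for s in STORIES if competitors & set(s["companies"])]
    return {"asked_for": asked, "competitor_news": associated}


result = search("Commerce One")
```

A keyword index alone would return only the first list; the second list exists because the `competes_with` relationship is stored explicitly and consulted at query time.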

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company ‘Commerce One’

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Components: Extractor Agents 1, 2, 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Metadata format: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults KnowledgeBase for Cisco’s competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story – passed on to Dashboard

Output: Story on Cisco; Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked For + What You Need to Know

Associations for a COMPANY:

• Related Stock News

• Industry News

• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

• Technology, Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

• RDF + DL = “Frame System for WWW”

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]

“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]

“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

• Based on RDF + XML

• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of the Semantic Web

Logical Layer: formal semantics and reasoning support – OIL, DAML-O

Schema Layer: definition of vocabulary – RDF Schema

Data Layer: simple data model and syntax for metadata – RDF

OIL – as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
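The rule above can be written as an executable predicate. The sketch below is an assumption-laden illustration rather than the InfoQuilt implementation: it takes `dateDifference` to be in days, takes `distance` to be great-circle distance in miles (computed with the haversine formula), and adds the "earthquake after the test" condition from the prose. The `Event` type and all function names are introduced here for illustration.

```python
import math
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Tuple

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Event:
    """A nuclear test or an earthquake with a date and a location in degrees."""
    event_date: date
    latitude: float
    longitude: float


def distance_miles(a: Event, b: Event) -> float:
    """Great-circle distance between two events (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(
        math.radians, (a.latitude, a.longitude, b.latitude, b.longitude)
    )
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def may_have_caused(test: Event, quake: Event,
                    max_days: int = 30, max_miles: float = 10000.0) -> bool:
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake.event_date - test.event_date).days
    return 0 <= days < max_days and distance_miles(test, quake) < max_miles


def candidate_pairs(tests: Iterable[Event],
                    quakes: Iterable[Event]) -> List[Tuple[Event, Event]]:
    """All (test, quake) pairs that satisfy the rule."""
    quake_list = list(quakes)
    return [(t, q) for t in tests for q in quake_list if may_have_caused(t, q)]
```

The thresholds are parameters so that the 30-day and 10000-unit limits from the slide can be tightened without touching the predicate.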

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
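The grouping query above can be sketched as follows. The input format (a list of `(year, magnitude)` pairs), the half-open bin boundaries, and the function names are assumptions made for this illustration; in particular, a magnitude of exactly 9.0 is placed in the top bin, which the slide does not specify.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Magnitude ranges from the slide, treated as half-open intervals [low, high).
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), "9+"),
]


def magnitude_bin(magnitude: float) -> Optional[str]:
    """Return the label of the range containing the magnitude, if any."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes: Iterable[Tuple[int, float]],
                    start_year: int = 1900) -> Counter:
    """Count earthquakes per (year, magnitude range) from start_year on."""
    counts: Counter = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts: Counter, label: str,
                     first_year: int, last_year: int) -> float:
    """Average yearly count for one magnitude range over an inclusive span."""
    total = sum(n for (year, lab), n in counts.items()
                if lab == label and first_year <= year <= last_year)
    return total / (last_year - first_year + 1)
```

Comparing `average_per_year` for the spans before and after 1945, range by range, is the comparison the slide draws its conclusion from.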

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax

ICE: www.icestandard.org

Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05

XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03

DAML: www.daml.org

NewsML: newsshowcase.reuters.com

PRISM: www.prismstandard.org/techdev/prismspec1.asp

XCM: www.vignette.com

OIL: www.ontoknowledge.org/oil

Semantic Web: www.semanticweb.org

VoiceXML: www.voicexml.org

MPEG7: www.darmstadt.gmd.de/mobile/MPEG7

Taalee: www.taalee.com

Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML – Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee’s Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee’s Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee’s Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF – one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of the Semantic Web
  • OIL – as RDF Extension
  • DAML and OIL – Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 56

Open Directory Project (ODP): Classification/Taxonomy & Directory

HP 57

Ontology

• Standardize meaning, description, and representation of involved attributes

• Capture the semantics involved via domain characteristics

• Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:

• Attributes

• Domain Rules

• Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: concept diagram. Labels recovered from the slide: LANDUSE (COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL); LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE; WASTE DISPOSAL (RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing); NATURAL DISASTER (EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI); four links labeled “causes”]

HP 61

Large Vocabularies, Taxonomies/Ontologies

• WordNet

• The Medical Subject Headings (MeSH): NLM’s controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

• traditional queries based on keywords

• attribute-based queries

• content-based queries

HP 64

Oingo.com

• Oingo Ontology – ODP based(?): the database of millions of concepts and relationships that powers Oingo’s semantic technology

• Oingo Seek - the database of millions of concepts and relationships that powers Oingo’s semantic technology

• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

• Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction

• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for

• Closely-related, high-value information beyond what was requested

• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user “think” about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

“Normal” Content can only be “found” if the user enters a keyword that exists within it

Intelligent Content = +

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more “intelligent” through automatic analysis of every individual content item to create:

• Context of the Content

• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search

• Alternatives for improving search: clustering, link and other analysis (e.g., Google’s Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from Typical Cataloging of Football Assets (Virage: search on “football touchdown”):

• Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline...

• Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline...

Taalee Metadata on Football Assets (Rich Media Reference Page):

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens’ lead to 31-24 in the third quarter

League: Professional

Teams: Ravens, Steelers

Score: Bal 31, Pit 24

Players: Quandry Ismail, Tony Banks

Event: Touchdown

Produced by: NFL.com

Posted date: 2/02/2000
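The difference between the two kinds of search can be shown with a small sketch. The two asset records reuse text from the slide; the record layout and the `keyword_search` and `attribute_search` helpers are assumptions made for this illustration and do not describe either vendor's system.

```python
from typing import Dict, List

# Two assets from the slide: one with free text only, one with structured metadata.
ASSETS: List[Dict[str, object]] = [
    {
        "title": "Jimmy Smith Interview Part Seven",
        "description": "Jimmy Smith explains his philosophy on showboating",
        "metadata": {},
    },
    {
        "title": "Baltimore 31, Pit 24",
        "description": ("Quandry Ismail and Tony Banks hook up for their third "
                        "long touchdown, this time on a 76-yarder, to extend "
                        "the Ravens' lead to 31-24 in the third quarter"),
        "metadata": {
            "League": ["Professional"],
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": ["Touchdown"],
            "Produced by": ["NFL.com"],
        },
    },
]


def keyword_search(assets: List[Dict[str, object]], *words: str) -> List[str]:
    """Titles of assets whose title or description contains every word."""
    hits = []
    for asset in assets:
        text = f"{asset['title']} {asset['description']}".lower()
        if all(word.lower() in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets: List[Dict[str, object]],
                     criteria: Dict[str, str]) -> List[str]:
    """Titles of assets whose metadata has the given value in each named field."""
    hits = []
    for asset in assets:
        metadata = asset["metadata"]
        if all(value in metadata.get(field, []) for field, value in criteria.items()):
            hits.append(asset["title"])
    return hits
```

Here `keyword_search(ASSETS, "Steelers")` finds nothing because the word never appears in the text, while `attribute_search(ASSETS, {"Teams": "Steelers", "Event": "Touchdown"})` finds the clip through its metadata fields.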

HP 72

Taalee’s Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…

2. My Football Fantasy Team: Gators’ Spurrier ready for big game; Tech’s Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…

3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole’s security; more…

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos

• Buy Russell Crowe Videos

• Buy Christopher Plummer Videos

• Buy Diane Venora Videos

• Buy Philip Baker Hall Videos

• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources
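One way to picture how structured metadata drives such targeting is the sketch below. It assumes the asset being viewed carries a title and a cast list (the names are those in the offers above); the record layout and the `targeted_offers` function are hypothetical and only illustrate the idea.

```python
from typing import Dict, List

# Assumed structured metadata for the asset the user is viewing.
ASSET: Dict[str, object] = {
    "title": "The Insider",
    "cast": [
        "Al Pacino",
        "Russell Crowe",
        "Christopher Plummer",
        "Diane Venora",
        "Philip Baker Hall",
    ],
}


def targeted_offers(asset: Dict[str, object]) -> List[str]:
    """Build one offer per cast member, then one for the title itself."""
    offers = [f"Buy {person} Videos" for person in asset.get("cast", [])]
    offers.append(f"Buy {asset['title']} Video")
    return offers


offers = targeted_offers(ASSET)
```

Because the offers are derived from metadata fields rather than from page text, the same function works for any asset whose cast and title have been extracted or added.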

HP 82

Web Extreme Personalization

[Figure labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Content; Semantic Metabase; Interests, Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen: MyStocks, News, Sports, Music, MyMedia]

User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site

User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen: MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user’s portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen: MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO content types: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen: MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO content types: Analyst Call, Conf Call, Earnings; CSCO Analysis listing: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as “Strong Buy” by H&Q Analyst
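The semantic filtering described above, choosing which metadata to show under a screen constraint, might look roughly like this. The field priority table, the field names, and the character-budget rule are assumptions made for illustration; the listing values come from the slide.

```python
from typing import Dict, List

# Assumed priority of metadata fields per content type (most important first).
FIELD_PRIORITY: Dict[str, List[str]] = {
    "Analyst Call": ["date", "source", "analyst", "rating"],
}

# Listing from the slide; the "rating" value is the example the slide mentions.
ANALYST_CALLS: List[Dict[str, str]] = [
    {"date": "11/06", "source": "CBS", "analyst": "Langlesis"},
    {"date": "11/08", "source": "ON24", "analyst": "Payne"},
    {"date": "11/07", "source": "ON24", "analyst": "H&Q", "rating": "Strong Buy"},
]


def render_line(item: Dict[str, str], content_type: str, max_chars: int) -> str:
    """Add fields in priority order until the next one would exceed max_chars."""
    parts: List[str] = []
    for field in FIELD_PRIORITY.get(content_type, []):
        value = item.get(field)
        if not value:
            continue
        if len(" ".join(parts + [value])) > max_chars:
            break
        parts.append(value)
    return " ".join(parts)


def render_listing(items: List[Dict[str, str]], content_type: str,
                   max_chars: int) -> List[str]:
    """Render items newest first; MM/DD strings sort correctly within one year."""
    ordered = sorted(items, key=lambda item: item["date"], reverse=True)
    return [render_line(item, content_type, max_chars) for item in ordered]


small_screen = render_listing(ANALYST_CALLS, "Analyst Call", max_chars=16)
```

On a wider display the same call with a larger `max_chars` would also show the rating, which is the kind of additional metadata the slide says the icons denote.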

HP 87

iTV: Taalee’s Extreme Personalization

[Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content “Programs”; Meta-Data Tagged Content; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information – including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Figure labels: Sources: Network Content, Affiliate Feeds, Public Sources; Collection; Processing: Categorize, Catalog, Integrate; Rich Data Metabase; Production Support: Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication; Sony]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria’s commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)

Gore Says Fla. Can’t Name Electors (4:50 PM)

Bush Meets Colin Powell at Ranch (1:22 PM)

Market Tumbles on Earnings Warning (9:27 AM)

Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast

Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee’s semantic metadata enables powerful access to content used by Enterprise’s customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can’t Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

Clips: (1:33) - 12/06/00 - ABC; (2:53) - 12/06/00 - CBS; (5:16) - 12/06/00 - ABC; (2:46) - 12/06/00 - FOX; (1:33) - 12/06/00 - NBC; (5:33) - 12/06/00; (3:57) - 12/06/00 - CBS; (4:27) - 12/06/00 - ABC; (3:44) - 12/06/00 - FOX; (7:24) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore’s appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Clips: (1:33) - 12/06/00 - ABC; (2:33) - 12/06/00 - CBS; (3:12) - 12/06/00 - NNS; (0:32) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS

Description: Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata’s role in emerging iTV infrastructure

[Figure labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Node “Cisco Systems”; Metadata-rich, Value-added Node; GREAT USER EXPERIENCE]

Example node metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content: Usage

• Extractor Agents: content which does contain the words the user asked for

+

• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+

• Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the “right” keyword

For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
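A minimal sketch of this enrichment step follows. It is not Taalee's patented extraction technology: the dictionary-based knowledge base, the relation names, and the `enrich` and `matches` helpers are assumptions for illustration. The Roger Clemens facts come from the slide; the league entry is added here to show a second hop.

```python
from typing import Dict, List

Metadata = Dict[str, List[str]]

# Hypothetical knowledge base: entity -> {relationship: related entity}.
KNOWLEDGE_BASE: Dict[str, Dict[str, str]] = {
    "Roger Clemens": {"plays_sport": "Baseball", "plays_for": "New York Yankees"},
    "New York Yankees": {"member_of_league": "American League"},
}


def _add(record: Metadata, field: str, value: str) -> None:
    """Append value to a metadata field unless it is already present."""
    values = record.setdefault(field, [])
    if value not in values:
        values.append(value)


def enrich(metadata: Metadata) -> Metadata:
    """Return a copy with sport, team, and league added for known players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        if "plays_sport" in facts:
            _add(enriched, "Sport", facts["plays_sport"])
        team = facts.get("plays_for")
        if team is not None:
            _add(enriched, "Teams", team)
            league = KNOWLEDGE_BASE.get(team, {}).get("member_of_league")
            if league is not None:
                _add(enriched, "League", league)
    return enriched


def matches(metadata: Metadata, query: str) -> bool:
    """True if the query occurs in any metadata value (case-insensitive)."""
    wanted = query.lower()
    return any(wanted in value.lower()
               for values in metadata.values() for value in values)


extracted: Metadata = {"Players": ["Roger Clemens"]}  # found in the story itself
enriched = enrich(extracted)
```

Before enrichment a search for "Yankees" misses the story; afterwards it matches the added Teams value, which is the effect the slide describes.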

HP 96

Guided Demo for Value Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled “Week 3 Top10: Anderson TD Run”) and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked “REAL”)

• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10: Anderson TD run”

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)

Click on first result for Gary Sheffield

View metadata Note that Team name and League name are also included

in the metadata

Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content ndash Value-Added Metadata

Posted Date

Posted Date

Date of asset posting ndashExtracted automatically

League Name

Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations

Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships

Team NameTeam Name

Producer Name

Producer Name

Rich MediaSports AssetRich Media

Sports Asset

Name of content provider that produced the asset

Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added

by Taalee using its semantic relationships

The asset is richly fully described in the many ways the users chose to interact

Player NamesPlayer Names

SportSportName of

sport

LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset

Name of players mentioned explicitly in the asset ndash Extracted automatically

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

bull Traditional search engines rely solely on (syntactic) keywords to find content

bull They do not understand the meaning context or relationships of keywords

For example a search engine may see that the word ldquoCommerce Onerdquo occurs

but it does not know that Commerce One is a COMPANY which Participates in

the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba

As a result search engines cannot go beyond returning a list (or directory view)

of what the user has asked for Their ability to provide associated information is

extremely limited static and difficult to scale Taaleersquos Semantic Content Model

goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Components: Extractor Agents 1, 2, and 3; Semantic Engine; World Model; Knowledge Base; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); third-party content management and syndication

Output: XCM-compliant metadata, XML, or other format

1. Cisco story from PW Source 1 is passed on to add semantic associations

2. Consults Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard

Results: Story on Cisco; Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked For + What You Need to Know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

Technology/Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic search, personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough; we need a "logic layer" on top of RDF. Some type of description logic is a possibility.

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
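To make "rules which allow us to draw inference" concrete, here is a minimal sketch that applies two RDFS-style rules (subclass transitivity and type propagation) to a handful of triples until nothing new can be derived. It is plain forward chaining, not a description logic reasoner, and the sample individuals and class names (`giraffe`, `living_thing`) are assumptions made up for the example.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def infer(triples):
    """Forward-chain two rules until no new triples appear.

    Rule 1: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    Rule 2: (x type B),       (B subClassOf C) => (x type C)
    """
    facts = set(triples)
    while True:
        derived = set()
        for subject, predicate, obj in facts:
            for subject2, predicate2, obj2 in facts:
                # Only chain through a subClassOf statement about `obj`.
                if obj != subject2 or predicate2 != SUBCLASS:
                    continue
                if predicate in (TYPE, SUBCLASS):
                    derived.add((subject, predicate, obj2))
        if derived <= facts:
            return facts
        facts |= derived


# Hypothetical data: only stated facts, no derived ones.
stated = [
    ("giraffe", TYPE, "herbivore"),
    ("herbivore", SUBCLASS, "animal"),
    ("animal", SUBCLASS, "living_thing"),
]

inferred = infer(stated) - set(stated)
for triple in sorted(inferred):
    print(triple)
```

The three stated triples yield three more: the giraffe is also an animal and a living thing, and herbivore is a subclass of living thing. None of these were written down by anyone, which is the point of adding a logic layer.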

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer: formal semantics and reasoning support - OIL, DAML-O

Schema Layer: definition of vocabulary - RDF Schema

Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake sources (USGS, NEIC)

Nuclear test sources (Oklahoma Observatory, etc.)

Nuclear tests may cause earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
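The rule above can be read as an executable predicate. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes the thresholds are days and miles, uses the haversine formula for `distance`, adds the "earthquake after the test" condition from the prose as `0 <= days`, and invents the record layout and sample events purely for illustration.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles is an assumed unit


def distance(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def date_difference(test_date, quake_date):
    """Days from the test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, per the slide's two conditions."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


# Hypothetical events, made up for the example.
test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_after = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
quake_before = {"eventDate": date(1969, 12, 25), "latitude": 36.0, "longitude": -117.0}
quake_late = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}

print(could_have_caused(test, quake_after))
print(could_have_caused(test, quake_before))
print(could_have_caused(test, quake_late))
```

Only the first earthquake satisfies both conditions. Keeping the thresholds as parameters makes it easy to tighten "nearby region" or the time window when exploring the data.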

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale, per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year, starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
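The grouping query above amounts to bucketing earthquakes by magnitude range and averaging the yearly counts. The following is a small sketch of that computation under stated assumptions: the input is a list of hypothetical `(year, magnitude)` pairs, range boundaries are treated as half-open (so a magnitude of exactly 9.0 lands in the top range), and the function names are made up for the example.

```python
from collections import Counter

# Magnitude ranges from the slide; the half-open boundary handling is an assumption.
BANDS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def band_of(magnitude):
    """Return the label of the range containing `magnitude`, or None if below 5.8."""
    for low, high, label in BANDS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average number of earthquakes per year in each magnitude range."""
    counts = Counter()
    for year, magnitude in quakes:
        label = band_of(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    years = last_year - first_year + 1
    return {label: counts[label] / years for _, _, label in BANDS}


# Hypothetical observations, not real seismic data.
quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1901, 4.0)]
averages = average_per_year(quakes, 1900, 1901)
print(averages)
```

Running the same function over the years before and after a chosen date gives the two sets of averages that the slide compares.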

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax

ICE: www.icestandard.org

Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05

XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03

DAML: www.daml.org

NewsML: newsshowcase.reuters.com

PRISM: www.prismstandard.org/techdev/prismspec1.asp

XCM: www.vignette.com

OIL: www.ontoknowledge.org/oil

Semantic Web: www.semanticweb.org

VoiceXML: www.voicexml.org

MPEG-7: www.darmstadt.gmd.de/mobile/MPEG7

Taalee: www.taalee.com

Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 57

Ontology

Standardize meaning, description, and representation of involved attributes

Capture the semantics involved via domain characteristics

Allow knowledge sharing and reuse (Ontological Commitment)

HP 58

Ontology

Description includes:

Attributes

Domain Rules

Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Diagram of three interrelated ontologies]

LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL. Related nodes: LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE. Processes: shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked to one another by "causes" relationships

HP 61

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

Traditional queries based on keywords

Attribute-based queries

Content-based queries

HP 64

Oingo.com

Oingo Ontology (ODP-based?) - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology

Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction

Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the content

• Semantic metadata describing entities (e.g., Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" content can only be "found" if the user enters a keyword that exists within it

Intelligent Content = [graphic] + [graphic]

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the content

Semantic metadata describing entities (e.g., Company, Industry, etc.), and

Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

HP 70

Metadata Usage: Keyword, Attribute, and Content-Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Virage: search on "football touchdown"

Metadata from typical cataloging of football assets:

Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline...

Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline...

Taalee metadata on football assets (Rich Media Reference Page):

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional

Teams: Ravens, Steelers

Score: Bal 31, Pit 24

Players: Qadry Ismail, Tony Banks

Event: Touchdown

Produced by: NFL.com

Posted date: 2/02/2000
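The contrast on this slide can be sketched as two small search functions over the same catalog. This is an illustration only: the record layout, the field names (`teams`, `event`, and so on), and both functions are assumptions; the field values are taken from the slide.

```python
# Hypothetical catalog records; values come from the slide, the layout is assumed.
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "description": "Qadry Ismail and Tony Banks hook up for their third long touchdown",
        "league": "Professional",
        "teams": ["Ravens", "Steelers"],
        "players": ["Qadry Ismail", "Tony Banks"],
        "event": "Touchdown",
        "produced_by": "NFL.com",
    },
    {
        # A typically cataloged asset: free text only, no structured attributes.
        "title": "Brian Griese Interview, Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
    },
]


def keyword_search(assets, keyword):
    """Match the keyword against free text (title and description) only."""
    keyword = keyword.lower()
    return [a["title"] for a in assets
            if keyword in (a["title"] + " " + a.get("description", "")).lower()]


def _matches(value, wanted):
    """A list-valued attribute matches if it contains the wanted value."""
    if isinstance(value, list):
        return wanted in value
    return value == wanted


def attribute_search(assets, **criteria):
    """Match every requested attribute against the structured metadata."""
    return [a["title"] for a in assets
            if all(_matches(a.get(field), wanted) for field, wanted in criteria.items())]


print(keyword_search(ASSETS, "touchdown"))
print(keyword_search(ASSETS, "Steelers"))
print(attribute_search(ASSETS, teams="Steelers", event="Touchdown"))
```

The keyword "touchdown" returns both assets, including the interview, which is not a touchdown play. The keyword "Steelers" returns nothing, because the text only says "Pit". The attribute query returns exactly the game highlight.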

HP 72

Taaleersquos Semantic Search

Highly customizable, precise, and freshest A/V search

Context- and domain-specific attributes

Uniform metadata for content from multiple sources

Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content

Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION: Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc.; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc.; AT&T Corp.; more…

2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…

3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory; more…

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos

Buy Russell Crowe Videos

Buy Christopher Plummer Videos

Buy Diane Venora Videos

Buy Philip Baker Hall Videos

Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram components: Realtime Feeds; Web sites and pages; Content databases; Time-Shifted Content Aggregator; Semantic Engine(TM); Structured, Hi-Quality Semantic Metabase; Interests and Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's personal portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO submenu: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO submenu: Analyst Call, Conf Call, Earnings; CSCO Analysis listing: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.

HP 87

iTV: Taalee's Extreme Personalization

[Diagram components: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Metadata-Tagged Content; Semantic Engine(TM); Structured, Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable, with an interactivity feature using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Diagram components]

Sources: Network Content; Affiliate Feeds; Public Sources

Collection Processing: Categorize; Catalog; Integrate

Rich Data Metabase

Production Support (Sony): Filter; Search; Consolidate; Personalize; Archive; Licensing; Syndication

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)

Gore Says Fla. Can't Name Electors (4:50 PM)

Bush Meets Colin Powell at Ranch (1:22 PM)

Market Tumbles on Earnings Warning (9:27 AM)

Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast

Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

Clip listings (duration - date - source): (1:33) - 12/06/00 - ABC; (2:53) - 12/06/00 - CBS; (5:16) - 12/06/00 - ABC; (2:46) - 12/06/00 - FOX; (1:33) - 12/06/00 - NBC; (5:33) - 12/06/00; (3:57) - 12/06/00 - CBS; (4:27) - 12/06/00 - ABC; (3:44) - 12/06/00 - FOX; (7:24) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS; (1:33) - 12/06/00 - ABC; (2:33) - 12/06/00 - CBS; (3:12) - 12/06/00 - NNS; (0:32) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Description

Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram components: Video; MPEG-2/4/7 MPEG Encoder; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree (Node = AVO Object); Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); metadata-rich, value-added node "Cisco Systems"; GREAT USER EXPERIENCE]

Example node metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content

Usage

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
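The enrichment step described above can be sketched as a lookup against a small knowledge base. This is a minimal illustration, not Taalee's patented extraction technology: the dictionary layout, the relationship names (`plays_for`, `member_of`), and the `enrich` and `is_found_by` helpers are assumptions. The player, team, and league facts are the ones used on these slides.

```python
# Hypothetical knowledge base; facts are from the slides, the structure is assumed.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "plays_sport": "Baseball", "plays_for": "New York Yankees"},
    "Jamal Anderson": {"type": "PERSON", "plays_sport": "Football", "plays_for": "Atlanta Falcons"},
    "Gary Sheffield": {"type": "PERSON", "plays_sport": "Baseball", "plays_for": "Los Angeles Dodgers"},
    "New York Yankees": {"type": "TEAM"},
    "Atlanta Falcons": {"type": "TEAM", "member_of": "NFL"},
    "Los Angeles Dodgers": {"type": "TEAM", "member_of": "National League"},
}


def enrich(metadata, kb):
    """Return a copy of the extracted metadata with team and league values added.

    `metadata` maps field names to lists of values extracted from the asset.
    """
    enriched = {field: list(values) for field, values in metadata.items()}
    teams = enriched.setdefault("teams", [])
    leagues = enriched.setdefault("leagues", [])
    for player in metadata.get("players", []):
        team = kb.get(player, {}).get("plays_for")
        if team and team not in teams:
            teams.append(team)
    for team in list(teams):
        league = kb.get(team, {}).get("member_of")
        if league and league not in leagues:
            leagues.append(league)
    return enriched


def is_found_by(metadata, term):
    """True if any metadata value equals the search term."""
    return any(term in values for values in metadata.values())


extracted = {"players": ["Jamal Anderson"], "produced_by": ["NFL.com"]}
enriched = enrich(extracted, KNOWLEDGE_BASE)
print(enriched)
print(is_found_by(extracted, "Atlanta Falcons"))
print(is_found_by(enriched, "Atlanta Falcons"))
```

Before enrichment, a search for "Atlanta Falcons" misses the asset because only the player was mentioned. After enrichment, the same search finds it, and the original extracted record is left unchanged.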

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see: Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see: Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content ndash Value-Added Metadata

Posted Date

Posted Date

Date of asset posting ndashExtracted automatically

League Name

Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations

Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships

Team NameTeam Name

Producer Name

Producer Name

Rich MediaSports AssetRich Media

Sports Asset

Name of content provider that produced the asset

Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added

by Taalee using its semantic relationships

The asset is richly fully described in the many ways the users chose to interact

Player NamesPlayer Names

SportSportName of

sport

LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset

Name of players mentioned explicitly in the asset ndash Extracted automatically

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

bull Traditional search engines rely solely on (syntactic) keywords to find content

bull They do not understand the meaning context or relationships of keywords

For example a search engine may see that the word ldquoCommerce Onerdquo occurs

but it does not know that Commerce One is a COMPANY which Participates in

the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba

As a result search engines cannot go beyond returning a list (or directory view)

of what the user has asked for Their ability to provide associated information is

extremely limited static and difficult to scale Taaleersquos Semantic Content Model

goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company lsquoCommerce Onersquo

Links to news on companies that compete against

Commerce One

Links to news on companies Commerce One competes

against(To view news on Ariba click

on the link for Ariba)

Crucial news on Commerce Onersquos

competitors (Ariba) can be accessed easily and

automatically

HP 104

Internal Source 1Research

Internal Source 2

External feedsWeb(eg Reuters)

1

2

3

4

Cisco story from PW Source 1passed on to addsemanticassociations

ConsultsKnowledgeBasefor Ciscorsquoscompetition

Returns resultLucent is a competitor of Cisco

Lucent story from external

feeds picked for publishing as ldquosemantically

relatedrdquo to Ciscostory ndash passed

on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata XML or other format

SemanticApplication

ASPEnterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata centricContent Management Architecture

SemanticEngine

World Model

TaaleeMetabase

Third-partyContent Mgmt

AndSyndication

HP 105

Semantic Associationssupported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANYCOMPANYRelated Stock News

Related Stock News

IndustryNews

IndustryNews

CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or

Related INDUSTRY

SECEPAEPA

RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY

Technology Products

Technology ProductsImportant to INDUSTRY or COMPANY

HP 106

Semantic Web Application ExampleFinancial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic SearchPersonalization etc

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough

• We need a "logic layer" on top of RDF

• Some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
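To make the "logic layer" idea concrete, here is a minimal sketch. It is not a description logic reasoner: it only applies two hand-written rules to a set of RDF-like triples until nothing new can be derived. The plain-string predicates `subClassOf` and `type`, the helper name `infer`, and the `cow`/`daisy` facts are illustrative assumptions; only herbivore/animal comes from the OIL example in this deck.

```python
def infer(triples):
    """Apply two rules until a fixed point is reached.

    Rule 1: (a subClassOf b) and (b subClassOf c)  =>  (a subClassOf c)
    Rule 2: (x type b)       and (b subClassOf c)  =>  (x type c)
    """
    facts = set(triples)
    while True:
        derived = set()
        for subject, predicate, obj in facts:
            if predicate not in ("subClassOf", "type"):
                continue
            for subject2, predicate2, obj2 in facts:
                # Follow one subClassOf edge upward from the current object.
                if predicate2 == "subClassOf" and subject2 == obj:
                    derived.add((subject, predicate, obj2))
        if derived <= facts:  # nothing new: fixed point reached
            return facts
        facts |= derived


# Hypothetical data exposed as triples.
DATA = [
    ("herbivore", "subClassOf", "animal"),
    ("cow", "subClassOf", "herbivore"),   # assumed fact, for illustration
    ("daisy", "type", "cow"),             # assumed fact, for illustration
]

if __name__ == "__main__":
    for fact in sorted(infer(DATA) - set(DATA)):
        print(fact)
```

The structural data alone never says that daisy is an animal; that statement only appears once rules are layered on top, which is the point the slide makes about RDF plus logic.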

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition. Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

• Based on RDF + XML

• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
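The rule above can be sketched in Python under a few stated assumptions. The slide does not say how `dateDifference` and `distance` are computed, so this sketch assumes whole days and a great-circle (haversine) distance in miles. The `Event` record, the function names, and the requirement that the earthquake not precede the test are illustrative choices, not the InfoQuilt implementation.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles follow the demo query


@dataclass
class Event:
    """Hypothetical record holding the three attributes the rule refers to."""
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test: Event, quake: Event) -> int:
    """Days from the test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles, using the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def may_have_caused(test: Event, quake: Event,
                    max_days: int = 30, max_miles: float = 10000) -> bool:
    """NuclearTest Causes Earthquake, with the thresholds from the slide."""
    days = date_difference(test, quake)
    if days < 0 or days >= max_days:  # assumption: the quake must not precede the test
        return False
    return distance(test.latitude, test.longitude,
                    quake.latitude, quake.longitude) < max_miles


if __name__ == "__main__":
    test = Event(date(1970, 1, 1), 37.0, -116.0)    # made-up sample values
    quake = Event(date(1970, 1, 10), 36.0, -117.0)  # made-up sample values
    print(may_have_caused(test, quake))
```

Note that half the Earth's circumference is roughly 12,400 miles, so a 10000-mile threshold excludes only near-antipodal pairs; the date condition does most of the filtering.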

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
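The grouping query can be illustrated with a short sketch. It assumes earthquake records arrive as `(year, magnitude)` pairs and treats each magnitude range as half-open, `[low, high)`; both choices, and the function names, are assumptions for illustration only.

```python
from collections import Counter

# Magnitude ranges from the slide: 5.8-6, 6-7, 7-8, 8-9, and >9.
# Treating each range as half-open [low, high) is an assumption.
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def bucket(magnitude):
    """Return the (low, high) range containing the magnitude, or None if below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range over an inclusive span of years."""
    totals = Counter()
    for year, magnitude in quakes:
        group = bucket(magnitude)
        if group is not None and first_year <= year <= last_year:
            totals[group] += 1
    n_years = last_year - first_year + 1
    return {group: totals[group] / n_years for group in RANGES}


if __name__ == "__main__":
    sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1), (1901, 5.0)]  # made-up data
    for group, avg in average_per_year(sample, 1900, 1901).items():
        print(group, avg)
```

Running the same function over the years before and after a chosen cut-off year gives the per-range comparison the slide describes.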

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lay within a radius of 10000 miles of the earthquake's epicenter

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax

• ICE: www.icestandard.org

• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05

• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03

• DAML: www.daml.org

• NEWSML: newsshowcase.reuters.com

• PRISM: www.prismstandard.org/techdev/prismspec1.asp

• XCM: www.vignette.com

• OIL: www.ontoknowledge.org/oil

• SEMANTICWEB: www.semanticweb.org

• VOICEXML: www.voicexml.org

• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7

• Taalee: www.taalee.com

• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • What RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 58

Ontology

Description includes: Attributes, Domain Rules, Functional Dependencies

HP 59

An Ontology

Example: Interrelated ontologies

[Figure: interrelated ontologies. Node labels, as far as they can be recovered:

LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL, LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE

WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE, shredding, magnetic separation, screening, washing

NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, with four "causes" relationships between nodes]

HP 61

Large Vocabularies, Taxonomies, Ontologies

• WordNet

• The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Confidential HP

Metadata-enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

• Traditional queries based on keywords

• Attribute-based queries

• Content-based queries

HP 64

Oingo.com

• Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology

• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology

• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context

• Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction

• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for

• Closely related, high-value information beyond what was requested

• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create

• Context of the Content

• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search

• Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage: Keyword, Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Virage Search on football touchdown (metadata from typical cataloging of football assets)

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee Metadata on Football Assets (Rich Media Reference Page)

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional

Teams: Ravens, Steelers

Score: Bal 31, Pit 24

Players: Qadry Ismail, Tony Banks

Event: Touchdown

Produced by: NFL.com

Posted date: 2/02/2000
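The contrast between the two result sets can be sketched as two small search functions. The field names follow the slide (League, Teams, Event, Produced by); the in-memory `ASSETS` list, the shortened description text, and both function names are assumptions made for illustration, not Taalee's or Virage's interfaces.

```python
ASSETS = [
    {   # typical cataloging: title and free-text description only
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
    },
    {   # semantic metadata: domain-specific attributes next to the description
        "title": "Baltimore 31 Pit 24",
        "description": "Tony Banks throws a third long touchdown, a 76-yarder",
        "League": "Professional",
        "Teams": ["Ravens", "Steelers"],
        "Event": "Touchdown",
        "Produced by": "NFL.com",
    },
]


def keyword_search(assets, keyword):
    """Syntactic search: match the keyword anywhere in the title or description."""
    needle = keyword.lower()
    return [a for a in assets
            if needle in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets, criteria):
    """Attribute search: every requested field must hold the requested value."""
    def matches(asset):
        for field, wanted in criteria.items():
            value = asset.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a for a in assets if matches(a)]


if __name__ == "__main__":
    print([a["title"] for a in keyword_search(ASSETS, "touchdown")])
    print([a["title"] for a in keyword_search(ASSETS, "Steelers")])
    print([a["title"] for a in attribute_search(ASSETS, {"Teams": "Steelers", "Event": "Touchdown"})])
```

In this sketch the keyword "touchdown" returns both assets, including the interview, while the keyword "Steelers" returns nothing because the word never occurs in the text. The attribute query finds the game clip through its Teams and Event fields.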

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

• Semantic Applications for highly relevant and fresh content

• Personalization and Targeting / interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more...

2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU; more...

3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more...

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more...

HOT Topics

1. Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more...

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more...

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos

• Buy Russell Crowe Videos

• Buy Christopher Plummer Videos

• Buy Diane Venora Videos

• Buy Philip Baker Hall Videos

• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Figure labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests; Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: MyStocks, News, Sports, Music, MyMedia, $]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: MyStocks, News, Sports, Music, MyMedia, $; My Stocks: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: MyStocks, News, Sports, Music, MyMedia, $; My Stocks: CSCO, NT, IBM, Market; CSCO: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: MyStocks, News, Sports, Music, MyMedia, $; My Stocks: CSCO, NT, IBM, Market; CSCO: Analyst Call, Conf Call, Earnings; CSCO Analysis: 11/08 ON24 Payne, 11/07 ON24 H&Q, 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

HP 87

iTV: Taalee's Extreme Personalization

[Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine™; Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Figure labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing (Categorize, Catalog, Integrate); Rich Data Metabase; Production Support (Sony); Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

Customize Page: Settings | Content | Layout | Color

Gore Demands That Recount Restart (9:40 PM)

Gore Says Fla. Can't Name Electors (4:50 PM)

Bush Meets Colin Powell at Ranch (1:22 PM)

Market Tumbles on Earnings Warning (9:27 AM)

Barak Outlines His Peace Plan (6:30 AM)

Video: Sixty Die In Nigeria Blast

Produced by: Euronews. Posted Date: 11/30/2000. Event: Election 2000. Location: Tallahassee, Florida, USA. People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

Clips: (1:33) - 12/06/00 - ABC; (2:53) - 12/06/00 - CBS; (5:16) - 12/06/00 - ABC; (2:46) - 12/06/00 - FOX; (1:33) - 12/06/00 - NBC; (5:33) - 12/06/00; (3:57) - 12/06/00 - CBS; (4:27) - 12/06/00 - ABC; (3:44) - 12/06/00 - FOX; (7:24) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Related clips: (1:33) - 12/06/00 - ABC; (2:33) - 12/06/00 - CBS; (3:12) - 12/06/00 - NNS; (0:32) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS

Description. Produced by: CNN. Posted Date: 12/07/2000. Reporter: David Lewis. Event: Election 2000. Location: Tallahassee, Florida, USA. People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Figure labels: Video; MPEG Encoder (MPEG-2/4/7); MPEG Decoder; Enhanced Digital Cable; Create Scene Description Tree; Retrieve Scene Description Track; Scene Description Tree; Node = AVO Object; Object Content Information (OCI); Enhanced XML Description; Taalee Semantic Engine; Node "Cisco Systems"; Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

Example node metadata. Produced by: Fox Sports. Creation Date: 12/05/2000. League: NFL. Teams: Seattle Seahawks, Atlanta Falcons. Players: John Kitna. Coaches: Mike Holmgren, Dan Reeves. Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content = Extractor Agents + Value-added Metadata + Semantic Associations

• Extractor Agents: content which does contain the words the user asked for

• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

• Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
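A minimal sketch of the value-adding step, assuming a hand-built lookup table in place of Taalee's knowledge base. The `KNOWLEDGE_BASE` dictionary, the `enrich` function, and the choice of lists as field values are hypothetical; the player, team, sport and league facts are the ones named in this deck.

```python
# Hypothetical stand-in for the knowledge base: what is known about each PERSON.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Teams": "New York Yankees", "Sport": "Baseball"},
    "Jamal Anderson": {"Teams": "Atlanta Falcons", "League": "NFL"},
    "Gary Sheffield": {"Teams": "Los Angeles Dodgers", "League": "National League"},
}


def enrich(metadata):
    """Return a copy of the extracted metadata with associated values added.

    `metadata` maps field names to lists of values extracted from the asset.
    Values implied by the players (team, league, sport) are appended even though
    the asset itself never mentions them.
    """
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        for field, value in KNOWLEDGE_BASE.get(player, {}).items():
            values = enriched.setdefault(field, [])
            if value not in values:
                values.append(value)
    return enriched


if __name__ == "__main__":
    extracted = {"Players": ["Jamal Anderson"], "Produced by": ["NFL.com"]}
    print(enrich(extracted))
```

Once the team and league fields are attached, a search on Teams = Atlanta Falcons can return a story whose text names only the player.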

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com. Posted Date: 9/20/2000. League: NFL. Teams: Atlanta Falcons. Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN. Posted Date: 3/03/2001. League: National League. Teams: Los Angeles Dodgers. Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

[Figure: metadata attached to a Rich Media Sports Asset]

• Producer Name: name of content provider that produced the asset

• Posted Date: date of asset posting - extracted automatically

• Player Names: names of players mentioned explicitly in the asset - extracted automatically

• Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships

• League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations

• Sport: name of sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly, fully described in the many ways the users choose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
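The Commerce One example can be sketched with a tiny triple store. The relation names `is_a`, `participates_in` and `competes_with` mirror the slide's wording; the `STORIES` catalog, its headlines, and the functions `related` and `search` are hypothetical and only illustrate the idea of returning associated content alongside what was asked for.

```python
# (subject, relation, object) facts; the first three restate the slide's example.
KNOWLEDGE_BASE = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]

# Hypothetical catalog entries: each story is tagged with the company it is about.
STORIES = [
    {"headline": "Story on Commerce One", "company": "Commerce One"},
    {"headline": "Story on Ariba", "company": "Ariba"},
    {"headline": "Story on an unrelated company", "company": "Other Corp"},
]


def related(entity, relation):
    """Objects linked to `entity` by `relation` in the knowledge base."""
    return [obj for subj, rel, obj in KNOWLEDGE_BASE
            if subj == entity and rel == relation]


def search(company):
    """Return (stories the user asked for, stories on the company's competitors)."""
    asked_for = [s["headline"] for s in STORIES if s["company"] == company]
    competitors = related(company, "competes_with")
    associated = [s["headline"] for s in STORIES if s["company"] in competitors]
    return asked_for, associated


if __name__ == "__main__":
    asked_for, associated = search("Commerce One")
    print("Asked for:", asked_for)
    print("Semantically related:", associated)
```

A keyword index would return only the first list. The second list exists because the `competes_with` fact links Commerce One to Ariba, even though the Ariba story never mentions Commerce One.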

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company lsquoCommerce Onersquo

Links to news on companies that compete against

Commerce One

Links to news on companies Commerce One competes

against(To view news on Ariba click

on the link for Ariba)

Crucial news on Commerce Onersquos

competitors (Ariba) can be accessed easily and

automatically

HP 104

Internal Source 1Research

Internal Source 2

External feedsWeb(eg Reuters)

1

2

3

4

Cisco story from PW Source 1passed on to addsemanticassociations

ConsultsKnowledgeBasefor Ciscorsquoscompetition

Returns resultLucent is a competitor of Cisco

Lucent story from external

feeds picked for publishing as ldquosemantically

relatedrdquo to Ciscostory ndash passed

on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata XML or other format

SemanticApplication

ASPEnterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata centricContent Management Architecture

SemanticEngine

World Model

TaaleeMetabase

Third-partyContent Mgmt

AndSyndication

HP 105

Semantic Associationssupported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANYCOMPANYRelated Stock News

Related Stock News

IndustryNews

IndustryNews

CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or

Related INDUSTRY

SECEPAEPA

RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY

Technology Products

Technology ProductsImportant to INDUSTRY or COMPANY

HP 106

Semantic Web Application ExampleFinancial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic SearchPersonalization etc

A vision for future

Semantic Web Complex Relationships and Knowledge Discovery

Eg InfoQuilt project at LSDIS Lab Univ of Georgia


An Ontology

Example: Interrelated ontologies

[Figure: three interrelated ontologies.
LAND USE: commercial, industrial, rural, residential, agricultural, military, recreational; land (site), cultivated area, greenland area, land bank, zoning, landfill site.
WASTE DISPOSAL: recycling, hazardous, landfill, resource rec., solid, sewage; processes: shredding, magnetic separation, screening, washing.
NATURAL DISASTER: earthquake, landslide, volcano, storm, flood, fire, avalanche, tsunami, connected by "causes" relationships.]

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Metadata-enabled Applications

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords
attribute-based queries
content-based queries

Oingo.com

Oingo Ontology (ODP based): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek: the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense: the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua: the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages

Use of Categories for Search

After 3 or 4 clicks

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely related, high-value information beyond what was requested

Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the content
• Semantic metadata describing entities (e.g., Company, Industry) and
• Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" content can only be "found" if the user enters a keyword that exists within it

[Figure: Intelligent Content = (graphic) + (graphic)]

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the content
Semantic metadata describing entities (e.g., Company, Industry) and
Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering; link and other analysis (e.g., Google's Link Flux analysis); classification as context; ontologies; metadata; knowledgebases; …

Metadata Usage: Keyword, Attribute, and Content-Based Access

The VisualHarness system at LSDIS/UGA

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from Typical Cataloging of Football Assets

Virage search on "football touchdown":

Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
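The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Asset` class, the function names, and the lowercase field names (`teams`, `event`, ...) are hypothetical and are not Taalee's or Virage's actual data model; the sample records reuse the values shown on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A catalogued media asset (hypothetical structure)."""
    title: str
    text: str
    metadata: dict = field(default_factory=dict)


def keyword_search(assets, keyword):
    """Syntactic search: match only if the keyword occurs in title or text."""
    kw = keyword.lower()
    return [a for a in assets if kw in a.title.lower() or kw in a.text.lower()]


def attribute_search(assets, **criteria):
    """Attribute search: every requested field must hold the wanted value."""
    def matches(asset):
        for attr, wanted in criteria.items():
            value = asset.metadata.get(attr)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a for a in assets if matches(a)]


assets = [
    Asset("Baltimore 31, Pit 24",
          "Ismail and Banks hook up for their third long touchdown",
          {"league": "Professional", "teams": ["Ravens", "Steelers"],
           "players": ["Quandry Ismail", "Tony Banks"], "event": "Touchdown"}),
    Asset("Brian Griese Interview, Part Four",
          "Brian Griese talks about the first touchdown he ever threw",
          {}),
]

by_keyword = keyword_search(assets, "touchdown")
by_attribute = attribute_search(assets, teams="Steelers", event="Touchdown")
```

Here the keyword query returns both assets, because both mention "touchdown", while the attribute query returns only the game highlight, since only that record carries structured team and event fields.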

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context- and domain-specific attributes; uniform metadata for content from multiple sources; can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

Semantic Relationships

Metadata Application Example

Semantic applications for highly relevant and fresh content: personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION: Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc.; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc.; AT&T Corp.; more…
2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…
3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory; more…
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

Web Extreme Personalization

[Figure labels: Realtime Feeds; Web sites and Pages; Content Databases; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Interests, Preferences; Time-Shifted Content Aggregator; Personalized Content]

Application of Semantic Metadata and Automatic Content Enrichment

[Screenshots: wireless device menu with the categories MyStocks, News, Sports, Music, MyMedia, $]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

Clicking on MyStocks brings down the user's Personal Portfolio list (My Stocks: CSCO, NT, IBM, Market). The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

Different types of recent audio content about Cisco are available (CSCO: Analyst Call, Conf Call, Earnings).

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

CSCO Analysis: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.

iTV: Taalee's Extreme Personalization

[Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine(TM); Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests, Preferences; Personalized Content Capsules; Redirects and Programming]

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

Metadata in Enterprise Apps

[Figure labels: Network Content; Affiliate Feeds; Public Sources; Collection, Processing (Categorize, Catalog, Integrate); Rich Data Metabase; Production Support (Sony); Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]

[Screenshot: news page with story text, breaking-news headlines, and video metadata]

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Video: Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

[Screenshot: breaking-news page with headlines, clip listings, story text, and video metadata]

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

Clip listings under each headline give duration, date, and source, e.g. (1:33) - 12/06/00 - ABC; (2:53) - 12/06/00 - CBS; (5:16) - 12/06/00 - ABC; (2:46) - 12/06/00 - FOX; (1:33) - 12/06/00 - NBC; …

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

Metadata's role in emerging iTV infrastructure

[Figure labels: Video; MPEG-2/4/7 MPEG Encoder; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree (Node = AVO Object); Taalee Semantic Engine; "Cisco Systems" node; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

Example node metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Metadata for Intelligent Content

[Figure: Intelligent Metadata Creation and Usage]

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

Intelligent Content via Value-Added Metadata

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
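A minimal Python sketch of the value-adding step described above follows. It assumes a tiny hand-written knowledge base; `KNOWLEDGE_BASE`, `enrich`, and the relation names `plays_sport` and `plays_for` are hypothetical names chosen for illustration and do not describe Taalee's actual engine.

```python
# Hypothetical knowledge base: (entity, relation) -> related entity.
# The facts are the ones stated on the slide.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
}


def enrich(metadata):
    """Return a copy of the metadata with team and sport added for known players."""
    enriched = dict(metadata)
    teams = list(metadata.get("teams", []))
    for player in metadata.get("players", []):
        sport = KNOWLEDGE_BASE.get((player, "plays_sport"))
        team = KNOWLEDGE_BASE.get((player, "plays_for"))
        if sport is not None:
            enriched.setdefault("sport", sport)
        if team is not None and team not in teams:
            teams.append(team)
    if teams:
        enriched["teams"] = teams
    return enriched


# Metadata extracted from a story that never mentions the team name.
story = {"title": "Clemens story", "players": ["Roger Clemens"]}
enriched = enrich(story)
```

After enrichment, a search for the team matches the story even though the team name never appeared in the source text; the extracted record itself is left unchanged.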

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

Intelligent Content - Value-Added Metadata

[Figure: metadata fields of a Rich Media Sports Asset]

Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: name of sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways the users choose to interact

Intelligent Content via Semantic Associations

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
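One way to picture semantic associations is as subject-relation-object triples that a search can follow. The Python sketch below is an assumption-laden illustration: the triple layout, the relation names, and the helper functions are hypothetical and are not Taalee's Semantic Content Model; the three facts come from the Commerce One example above, and the story records are made up.

```python
from collections import defaultdict

# Hypothetical triples built from the slide's example.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def build_index(triples):
    """Index triples by (subject, relation) for direct lookup."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[(subject, relation)].append(obj)
    return index


def related_stories(entity, stories, index, relation="competes_with"):
    """Return stories about entities linked to `entity` through `relation`."""
    related_entities = index.get((entity, relation), [])
    return [s for s in stories
            if any(e in s["entities"] for e in related_entities)]


# Made-up story records, each tagged with the entities it mentions.
stories = [
    {"headline": "Commerce One story", "entities": ["Commerce One"]},
    {"headline": "Ariba story", "entities": ["Ariba"]},
    {"headline": "Unrelated story", "entities": ["Some Other Co"]},
]

INDEX = build_index(TRIPLES)
competitor_news = related_stories("Commerce One", stories, INDEX)
```

A query about Commerce One can then surface the Ariba story, which a keyword index would never return because the story does not contain the query term.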

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

Metadata-centric Content Management Architecture

[Figure labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1, 2, 3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata, XML or other format; Third-party Content Mgmt and Syndication; Semantic Application (ASP/Enterprise hosted); Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What You Need to Know

[Figure: associations around a COMPANY]

Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
Technology, Products: important to INDUSTRY or COMPANY

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic search, personalization, etc.

A vision for future

Semantic Web, Complex Relationships, and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of the Semantic Web

Logical Layer: formal semantics and reasoning support - OIL, DAML-O

Schema Layer: definition of vocabulary - RDF Schema

Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards the Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear tests may cause earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
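The rule above can be written as an executable predicate. The Python sketch below is not InfoQuilt code; it makes several assumptions of its own: dates are calendar dates and the difference is counted in days, the distance is a great-circle (haversine) distance in miles, and the earthquake must not precede the test (the prose says "after", although the rule as written only bounds the difference). The `Event` class and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    """A nuclear test or an earthquake (hypothetical record layout)."""
    event_date: date
    latitude: float
    longitude: float


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake falls inside the rule's time and distance window."""
    days = (quake.event_date - test.event_date).days
    near = distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


# Made-up events, for illustration only.
test_event = Event(date(1970, 1, 1), 37.0, -116.0)
quake_soon = Event(date(1970, 1, 15), 36.0, -117.0)
quake_late = Event(date(1970, 3, 1), 36.0, -117.0)
quake_before = Event(date(1969, 12, 20), 36.0, -117.0)
```

The thresholds are left as parameters because the slide states them only as example values; tightening `max_miles` changes how many test/earthquake pairs the relationship admits.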

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale, per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
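The grouping query above amounts to bucketing earthquakes by magnitude band and averaging the counts over the years covered. The Python sketch below shows one way to do that; the band boundaries follow the slide, but treating each band as half-open (lower bound included, upper bound excluded), the function names, and the sample data are all assumptions made for illustration. The sample values are invented and are not real seismic records.

```python
from collections import Counter

# Magnitude bands from the slide; the last band is open-ended.
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude):
    """Return the band containing the magnitude, or None if below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per band over [first_year, last_year]."""
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if band is not None and first_year <= year <= last_year:
            counts[band] += 1
    n_years = last_year - first_year + 1
    return {band: counts[band] / n_years for band in BANDS}


# Invented (year, magnitude) pairs, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.7), (1901, 7.2), (1901, 5.0)]
averages = average_per_year(sample, 1900, 1901)
```

Running the same function over two periods, for example before and after a chosen year, gives the per-band comparison that the slide's conclusion rests on.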

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • What RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

Example: Interrelated ontologies

[Diagram of interrelated ontologies. Terms shown:]
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, with "causes" relationships drawn between disaster types

Large Vocabularies, Taxonomies, Ontologies

WordNet

The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms

Metadata-enabled Applications

Metadata Usage: Impact on Search & Query Processing

traditional queries based on keywords
attribute based queries
content-based queries

Oingo.com

Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages


Use of Categories for Search

After 3 or 4 clicks

Metadata is the basis of making Content Intelligent

Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort

Intelligent content can be more effectively managed, packaged, and distributed

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" Content can only be "found" if the user enters a keyword that exists within it

Intelligent Content

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS/UGA

Keyword Search vs. Attribute Search with Semantic Metadata

Virage: Search on "football touchdown"

Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
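The difference between keyword search and attribute search over a record like the one above can be sketched in a few lines. This is a toy illustration, not Taalee's or Virage's implementation: the asset structure and function names are hypothetical, and the description text is a shortened paraphrase that, like the original caption, never mentions the opposing team by name.

```python
# One catalogued asset: free text plus structured metadata fields (hypothetical layout).
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "description": "Third long touchdown of the game, a 76-yarder, extends the lead to 31-24.",
        "metadata": {
            "League": ["Professional"],
            "Teams": ["Ravens", "Steelers"],
            "Event": ["Touchdown"],
        },
    },
]


def keyword_search(assets, keyword):
    """Return assets whose title or description literally contains the keyword."""
    needle = keyword.lower()
    return [a for a in assets
            if needle in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets, field, value):
    """Return assets whose metadata field holds the given value (case-insensitive)."""
    wanted = value.lower()
    return [a for a in assets
            if wanted in (v.lower() for v in a["metadata"].get(field, []))]
```

Here `keyword_search(ASSETS, "Steelers")` finds nothing because the word never appears in the text, while `attribute_search(ASSETS, "Teams", "Steelers")` returns the asset.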

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

Semantic Relationships

Metadata Application Example

Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…
2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…
3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory; more…
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Interests, Preferences; Time-Shifted Content Aggregator; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Personalized Content]

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu of MyStocks, News, Sports, Music, MyMedia]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu of MyStocks, News, Sports, Music, MyMedia; "My Stocks" list showing CSCO, NT, IBM, Market]

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu of MyStocks, News, Sports, Music, MyMedia; "My Stocks" list showing CSCO, NT, IBM, Market; under CSCO: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu of MyStocks, News, Sports, Music, MyMedia; "My Stocks" list showing CSCO, NT, IBM, Market; under CSCO: Analyst Call, Conf Call, Earnings]

CSCO Analysis
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine(TM); Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in

This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

Metadata in Enterprise Apps

[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing (Categorize, Catalog, Integrate); Rich Data Metabase; Production Support; Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication; Sony]

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG-2/4/7; MPEG Encoder; MPEG Decoder; Create Scene Description Tree; Retrieve Scene Description Track; Scene Description Tree; Node = AVO Object; Node "Cisco Systems"; Object Content Information (OCI); Enhanced XML Description; Taalee Semantic Engine; Metadata-rich Value-added Node; Enhanced Digital Cable; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

Intelligent Metadata Creation

Metadata for Intelligent Content

Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know

Usage

Intelligent Content via Value-Added Metadata

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
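The enrichment step described above can be sketched with a tiny knowledge base. This is a hedged illustration only: the dictionary layout, the substring-based entity spotting, and all function names are assumptions made for the example, not Taalee's patented extraction technology.

```python
# A toy knowledge base: what is known about each recognized person (hypothetical layout).
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Team": "New York Yankees"},
}


def extract_players(text):
    """Naive entity spotting: names from the knowledge base that occur in the text."""
    return [name for name in KNOWLEDGE_BASE if name in text]


def enrich(text):
    """Build metadata from the text, then add associated facts the text never states."""
    metadata = {"Players": [], "Teams": [], "Sports": []}
    for player in extract_players(text):
        facts = KNOWLEDGE_BASE[player]
        metadata["Players"].append(player)
        for field, key in (("Teams", "Team"), ("Sports", "Sport")):
            if facts[key] not in metadata[field]:
                metadata[field].append(facts[key])
    return metadata


def matches(metadata, query):
    """True if the query occurs in any metadata value (case-insensitive)."""
    q = query.lower()
    return any(q in value.lower()
               for values in metadata.values() for value in values)
```

For a story that mentions only "Roger Clemens", `enrich` adds "New York Yankees" as a team, so `matches(metadata, "Yankees")` succeeds even though the word is absent from the text.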

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

Intelligent Content - Value-Added Metadata

[Diagram: metadata fields attached to a Rich Media Sports Asset]

Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - extracted automatically
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: Name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways the users chose to interact

Intelligent Content via Semantic Associations

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
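The kind of association lookup described above can be sketched as a small set of typed relationships that is consulted at query time. This is an illustrative sketch, not Taalee's Semantic Content Model: the relation names, the assumption that "competes with" is symmetric, and the story headlines in the usage note are all invented for the example.

```python
# Typed associations as (subject, relation, object) triples, taken from the example above.
ASSOCIATIONS = [
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Assumption: competition holds in both directions.
SYMMETRIC_RELATIONS = {"competes_with"}


def related(entity, relation):
    """Entities linked to `entity` by `relation`, following symmetric relations both ways."""
    found = set()
    for subject, rel, obj in ASSOCIATIONS:
        if rel != relation:
            continue
        if subject == entity:
            found.add(obj)
        elif obj == entity and rel in SYMMETRIC_RELATIONS:
            found.add(subject)
    return found


def stories_for(company, stories):
    """Split stories into those asked for and those about the company's competitors."""
    competitors = related(company, "competes_with")
    asked_for = [title for title, about in stories if about == company]
    also_relevant = [title for title, about in stories if about in competitors]
    return asked_for, also_relevant
```

Given placeholder stories such as `[("Story A", "Commerce One"), ("Story B", "Ariba")]`, `stories_for("Commerce One", ...)` returns "Story A" as what was asked for and "Story B" as semantically related news.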

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

Metadata centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML or other format; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

[Diagram: associations around a COMPANY]
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology, Products: Important to INDUSTRY or COMPANY

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web, Complex Relationships, and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling obviously not enough
we need a "logic layer" on top of RDF
some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
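To show what a program can read out of a class definition like the one above, here is a minimal sketch using Python's standard XML parser. Several things are assumptions: the `#` fragment markers on the resource references, the wrapping `rdf:RDF` element, and especially the `oil` namespace URI, which is a placeholder and not the real OIL namespace. The function name is hypothetical, and a real application would use an RDF toolkit rather than raw XML parsing.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OIL = "http://example.org/oil#"  # placeholder namespace, assumed for this sketch

DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:oil="{OIL}">
  <rdfs:Class rdf:ID="herbivore">
    <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>"""


def describe_classes(xml_text):
    """Map each class ID to its named superclasses and the classes it is declared NOT to be."""
    root = ET.fromstring(xml_text)
    result = {}
    for cls in root.findall(f"{{{RDFS}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        supers, excluded = [], []
        for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
            target = sub.get(f"{{{RDF}}}resource")
            if target is not None:
                # Plain reference to a named superclass.
                supers.append(target.lstrip("#"))
            # Anonymous superclass expressed as the negation of another class.
            for operand in sub.findall(f"{{{OIL}}}NOT/{{{OIL}}}hasOperand"):
                excluded.append(operand.get(f"{{{RDF}}}resource").lstrip("#"))
        result[name] = {"subclass_of": supers, "not": excluded}
    return result
```

Read this way, the definition says that a herbivore is an animal and is not a carnivore, which is the kind of statement plain RDF Schema cannot express without the OIL extension.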

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics



HP 61

Large Vocabularies, Taxonomies, Ontologies

• WordNet
• The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.

Confidential HP

Metadata-Enabled Applications

HP 63

Metadata Usage: Impact on Search & Query Processing

• Traditional queries based on keywords
• Attribute-based queries
• Content-based queries

HP 64

Oingo.com

• Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning, used to state intent; the basis for intelligent interaction
• Assets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for
• Closely related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort.

Intelligent content can be more effectively managed, packaged, and distributed.

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the content
• Semantic metadata describing entities (e.g., Company, Industry), and
• Relationships (semantic associations) among all entities

This is based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" content can only be "found" if the user enters a keyword that exists within it.

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the content
• Semantic metadata describing entities (e.g., Company, Industry), and
• Relationships (semantic associations) among all entities

This is based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search
• Alternatives for improving search: clustering; link and other analysis (e.g., Google's Link Flux analysis); classification as context; ontologies; metadata; knowledgebases; …

HP 70

Metadata Usage: Keyword, Attribute, and Content-Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from typical cataloging of football assets (Virage search on "football touchdown"):

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee metadata on football assets (Rich Media Reference Page):

Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
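The contrast between keyword search and attribute search can be sketched in a few lines of Python. This is only an illustration of the idea on this slide, not Taalee's or Virage's implementation: the `Asset` class, the function names, and the metadata keys are all assumptions made for the example, and the second asset's description is paraphrased.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A catalogued media asset (hypothetical structure for illustration)."""
    title: str
    description: str
    metadata: dict = field(default_factory=dict)  # attribute -> list of values


def keyword_search(assets, words):
    """Return assets whose title or description contains every keyword."""
    wanted = [w.lower() for w in words]
    hits = []
    for asset in assets:
        text = f"{asset.title} {asset.description}".lower()
        if all(w in text for w in wanted):
            hits.append(asset)
    return hits


def attribute_search(assets, **criteria):
    """Return assets whose metadata matches every attribute=value pair."""
    hits = []
    for asset in assets:
        if all(value in asset.metadata.get(key, [])
               for key, value in criteria.items()):
            hits.append(asset)
    return hits


assets = [
    Asset("Brian Griese Interview Part Four",
          "Brian Griese talks about the first touchdown he ever threw"),
    Asset("Baltimore 31, Pit 24",
          "Third long touchdown of the game, a 76-yarder",
          {"league": ["Professional"],
           "teams": ["Ravens", "Steelers"],
           "players": ["Tony Banks"],
           "event": ["Touchdown"]}),
]

# Keyword search returns anything mentioning the word, relevant or not.
by_keyword = keyword_search(assets, ["touchdown"])

# Attribute search uses the structured fields, so it can ask for
# "touchdown events involving the Ravens" directly.
by_attribute = attribute_search(assets, teams="Ravens", event="Touchdown")
```

The keyword query matches both assets because both mention "touchdown", while the attribute query returns only the asset whose structured metadata says the event was a touchdown in a Ravens game.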

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

• Context- and domain-specific attributes
• Uniform metadata for content from multiple sources
• Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content: personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc.
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc.
• AT&T Corp.

2. My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone…
• Edwards explains reasons for leaving BYU

3. Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory

4. Pink Floyd Collection
• Set the Controls for the Heart of the Sun…
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream

HOT Topics

1. Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge

2. Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security

3. Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests, Preferences; Semantic Engine (TM); Structured Hi-Quality Semantic Metabase; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless menu - MyStocks, News, Sports, Music, MyMedia, $]

The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

The user's "Wireless Home Page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless menu - MyStocks, News, Sports, Music, MyMedia, $; My Stocks list - CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's personal portfolio list. The user wants to see news items about Cisco (see next slide).

The search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless menu - MyStocks, News, Sports, Music, MyMedia, $; My Stocks list - CSCO, NT, IBM, Market; CSCO submenu - Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless menu - MyStocks, News, Sports, Music, MyMedia, $; My Stocks list - CSCO, NT, IBM, Market; CSCO submenu - Analyst Call, Conf Call, Earnings]

CSCO Analysis

• 11/08 ON24 Payne
• 11/07 ON24 H&Q
• 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine (TM); Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

• This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
• This screen is customizable with interactivity features, using metadata such as whether there is a new conference call video on CSCO.
• Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
• The conference call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Diagram labels: Production Support (Sony); Network Content; Affiliate Feeds; Public Sources; Collection; Processing (Categorize, Catalog, Integrate); Rich Data Metabase; Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]

HP 90

[Screenshot: breaking-news page with video player, headline list, story text, and metadata]

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Video: Sixty Die In Nigeria Blast
Produced by: Euronews. Posted Date: 11/30/2000. Event: Election 2000. Location: Tallahassee, Florida, USA. People: Al Gore, George W. Bush

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market

HP 91

[Screenshot: breaking-news page with headline list, clip list, story text, and metadata]

-- Breaking News --

• Gore Demands That Recount Restart
• Gore Says Fla. Can't Name Electors
• Bush Meets Colin Powell at Ranch
• Market Tumbles on Earnings Warning
• Barak Outlines His Peace Plan

Clips (duration - date - source):
(1:33) - 12/06/00 - ABC; (2:53) - 12/06/00 - CBS; (5:16) - 12/06/00 - ABC; (2:46) - 12/06/00 - FOX; (1:33) - 12/06/00 - NBC; (5:33) - 12/06/00; (3:57) - 12/06/00 - CBS; (4:27) - 12/06/00 - ABC; (3:44) - 12/06/00 - FOX; (7:24) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS; (1:33) - 12/06/00 - ABC; (2:33) - 12/06/00 - CBS; (3:12) - 12/06/00 - NNS; (0:32) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Description
Produced by: CNN. Posted Date: 12/07/2000. Reporter: David Lewis. Event: Election 2000. Location: Tallahassee, Florida, USA. People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Taalee Semantic Engine; Enhanced XML Description ("Cisco Systems"); Metadata-rich Value-added Node; Object Content Information (OCI); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE]

Example node metadata: Produced by: Fox Sports. Creation Date: 12/05/2000. League: NFL. Teams: Seattle Seahawks, Atlanta Falcons. Players: John Kitna. Coaches: Mike Holmgren, Dan Reeves. Location: Atlanta

• License metadata decoder and semantic applications to device makers
• Channel sales through video server vendors, video app servers, and broadcasters

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content: Usage

• Extractor Agents: content which does contain the words the user asked for
• + Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
• + Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such semantic associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
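A minimal sketch of how value-added metadata could be derived from a knowledge base follows. It is not Taalee's patented extraction method; the `KNOWLEDGE_BASE` dictionary, its relation names (`plays_for`, `plays_sport`), and the `enrich` function are hypothetical names chosen only to illustrate the idea that facts about a recognized entity can fill in fields the source text never mentions.

```python
# Hypothetical knowledge base: entity name -> known facts about it.
KNOWLEDGE_BASE = {
    "Roger Clemens": {
        "type": "PERSON",
        "plays_sport": "Baseball",
        "plays_for": "New York Yankees",
    },
}

# Which knowledge-base relation fills which metadata field (an assumption).
FIELD_FROM_RELATION = (("teams", "plays_for"), ("sport", "plays_sport"))


def enrich(extracted):
    """Return a copy of the extracted metadata plus value-added fields."""
    enriched = {key: list(values) for key, values in extracted.items()}
    for player in extracted.get("players", []):
        facts = KNOWLEDGE_BASE.get(player)
        if facts is None:
            continue  # unknown entity: add nothing rather than guess
        for field_name, relation in FIELD_FROM_RELATION:
            value = facts[relation]
            if value not in enriched.setdefault(field_name, []):
                enriched[field_name].append(value)
    return enriched


# Metadata extracted syntactically from a story that never names the team.
story = {"players": ["Roger Clemens"]}
result = enrich(story)
```

After enrichment the story carries `teams` and `sport` fields, so a search for "New York Yankees" can match it even though those words do not occur in the text.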

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top 10: Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com. Posted Date: 9/20/2000. League: NFL. Teams: Atlanta Falcons. Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10: Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN. Posted Date: 3/03/2001. League: National League. Teams: Los Angeles Dodgers. Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

• Search for 'Jamal Anderson' in 'Football'
• Click on the first result for Jamal Anderson
• View the metadata. Note that the team name and league name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of the team name or league name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

• Search for 'Gary Sheffield' in 'Baseball'
• Click on the first result for Gary Sheffield
• View the metadata. Note that the team name and league name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of the team name or league name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

[Diagram: a Rich Media Sports Asset surrounded by its metadata fields]

• Posted Date: date of asset posting - extracted automatically
• Producer Name: name of the content provider that produced the asset
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Team Name: name of the team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships
• League Name: name of the league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations
• Sport: name of the sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

Legend: "X → Y" means Taalee uses X to add Y as value-added metadata to the asset.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to understand and associate all content it catalogs.
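One way to picture semantic associations is as typed relationships between named entities that an application can follow at query time. The sketch below is an assumption-laden illustration, not the Taalee Semantic Engine: `AssociationStore`, `intelligent_results`, the relation names, and the placeholder story titles are all hypothetical.

```python
from collections import defaultdict


class AssociationStore:
    """Typed relationships between named entities (a toy knowledge base)."""

    def __init__(self):
        self._objects = defaultdict(set)  # (subject, relation) -> objects

    def add(self, subject, relation, obj, symmetric=False):
        self._objects[(subject, relation)].add(obj)
        if symmetric:  # e.g. "competes with" holds in both directions
            self._objects[(obj, relation)].add(subject)

    def related(self, subject, relation):
        return sorted(self._objects.get((subject, relation), ()))


def intelligent_results(company, store, stories_by_company):
    """Return what the user asked for, plus stories on the company's competitors."""
    asked_for = list(stories_by_company.get(company, []))
    need_to_know = {
        rival: list(stories_by_company.get(rival, []))
        for rival in store.related(company, "competes_with")
    }
    return asked_for, need_to_know


store = AssociationStore()
store.add("Commerce One", "participates_in",
          "Corporate, Professional & Financial Software")
store.add("Commerce One", "competes_with", "Ariba", symmetric=True)

# Placeholder story identifiers, not real headlines.
stories = {"Commerce One": ["story-1"], "Ariba": ["story-2", "story-3"]}
asked_for, need_to_know = intelligent_results("Commerce One", store, stories)
```

A keyword index alone would return only `asked_for`; following the `competes_with` association is what surfaces the Ariba stories the user did not explicitly request.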

HP 103

Example (test on http://directory.mediaanywhere.com)

• Search for company 'Commerce One'
• Links to news on companies that compete against Commerce One
• Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1, 2, 3; Semantic Engine; World Model; Knowledge Base; Taalee Metabase; XCM-compliant metadata, XML, or other format; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 is passed on to add semantic associations
2. Consults the Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What you asked for + What you need to know

[Diagram: associations around a COMPANY]

• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
• Technology/Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

• Automatic collation of semantically related digital media information from multiple sources
• Research inferred automatically
• Semantically related news not specifically asked for
• Semantic search, personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., the InfoQuilt project at the LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of the Semantic Web

• Logical Layer: formal semantics and reasoning support - OIL, DAML-O
• Schema Layer: definition of vocabulary - RDF Schema
• Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards the Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

• Earthquake sources (USGS, NEIC)
• Nuclear test sources (Oklahoma Observatory, etc.)

Nuclear tests may cause earthquakes. Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted, and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude ) < 10000
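The rule above can be sketched as an executable predicate. This is an illustration under stated assumptions, not the InfoQuilt implementation: the `Event` class and function names are hypothetical, the thresholds are read as 30 days and 10,000 miles (as a later slide in this example states), `distance` is taken to be the great-circle (haversine) distance, and the check that the earthquake does not precede the test comes from the prose ("some time after") rather than from the rule as written.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Event:
    """A dated, located event (hypothetical structure for illustration)."""
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, as a candidate relationship to examine."""
    days = date_difference(test, quake)
    if not 0 <= days < max_days:  # assumption: quake must not precede the test
        return False
    miles = distance(test.latitude, test.longitude,
                     quake.latitude, quake.longitude)
    return miles < max_miles


# Made-up events, used only to exercise the rule.
test_event = Event(date(2000, 1, 1), 0.0, 0.0)
nearby_quake = Event(date(2000, 1, 15), 10.0, 10.0)
late_quake = Event(date(2000, 3, 1), 10.0, 10.0)
antipodal_quake = Event(date(2000, 1, 15), 0.0, 180.0)
```

A pair that satisfies the predicate is only a candidate for the "causes" relationship; the slides that follow examine whether the data actually support it.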

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
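The grouping query on this slide can be approximated with a small aggregation. The sketch below is an assumption, not the query the demo ran: the function names are hypothetical, each band is treated as half-open (lower bound included, upper bound excluded) because the slide does not say how boundary magnitudes are assigned, and the sample records are made up.

```python
from collections import Counter

# Magnitude bands from the slide; half-open intervals are an assumption.
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude):
    """Return the band containing the magnitude, or None if below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per band over [first_year, last_year].

    quakes is an iterable of (year, magnitude) pairs.
    """
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if band is not None and first_year <= year <= last_year:
            counts[band] += 1
    years = last_year - first_year + 1
    return {band: counts[band] / years for band in BANDS}


# Made-up records, used only to exercise the aggregation.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1),
          (1899, 8.0), (1901, 4.0)]
averages = average_per_year(sample, 1900, 1901)
```

Running the same aggregation separately for the years before and after testing began would give the per-band comparison that the slide's conclusion rests on.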

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted, and in a radius of 10,000 miles from the epicenter of the earthquake.

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NewsML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• Semantic Web: www.semanticweb.org
• VoiceXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management Using Metadata to Integrate and Apply Digital Media Amit Sheth and Wolfgang Klas Eds McGraw Hill ISBN 0-07-057735-8 1998

  • The Mysteries of MetadataWorkshop at Content World 2001 Burlingame CA May 15 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperabilitykey metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards(or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsMLhellip
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML ndash Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taaleersquos Intelligent Content Process
  • Metadata Creation and Semanticization
  • FormsTypesIngest of Content
  • Content HandlingIngest
  • Extracting a Text DocumentSyntactic approach
  • Automatic Categorization amp Metadata Tagging (unstructured texttranscript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) ClassificationTaxonomy amp Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies TaxonomiesOntologies
  • Metadata enabledApplications
  • Use of Categories for Search
  • More than metadata
  • Metadata amp Search
  • Taaleersquos Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries amp Hot Topics
  • SemanticInteractive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV Taaleersquos Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associationssupported by Taalee Semantic Engine
  • Semantic Web Application ExampleFinancial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF ndash one proposal (cf Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL ndash as RDF Extension
  • DAML and OIL ndash Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Examplehellip
  • Knowledge Discovery - Examplehellip
  • ResourcesReferences

Confidential HP

Metadata enabledApplications

HP 63

Metadata Usage Metadata Usage Impact on Search amp Query processing Impact on Search amp Query processing

traditional queries based on keywordsattribute based queriescontent-based queries

HP 64

Oingocom

Oingo Ontology ndash ODP based() the database of millions of concepts and relationships that powers Oingossemantic technologyOingo Seek - the database of millions of concepts and relationships that powers Oingos semantic technologyOingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and contextOingo Lingua - the language of meaning used to state intent The basis for intelligent interactionAssets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related high-value information beyond what

was requested

Ability to explore any dimension around the immediate

point of interest Intelligent content helps the user

ldquothinkrdquo about and fulfill their information needs with less effort

Intelligent content can bemore effectively managed packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the content
• Semantic metadata describing entities (e.g., Company, Industry), and
• Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

"Normal" content can only be "found" if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the content
• Semantic metadata describing entities (e.g., Company, Industry), and
• Relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering; link and other analysis (e.g., Google's Link Flux analysis); classification as context; ontologies; metadata; knowledgebases; ...

HP 70

Metadata Usage: Keyword, Attribute, and Content-Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs Attribute Search with Semantic Metadata

Virage search on "football touchdown":

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and domain-specific attributes; uniform metadata for content from multiple sources; can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content: personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc
• AT&T Corp
• more...

2 My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone...
• Edwards explains reasons for leaving BYU
• more...

3 Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory
• more...

4 Pink Floyd Collection
• Set the Controls for the Heart of the Sun...
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream
• more...

HOT Topics

1 Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge
• more...

2 Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security
• more...

3 Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II
• more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst
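The semantic filtering described above could be sketched roughly as follows. This is only an illustration, not Taalee's or Voquette's implementation: the `FIELD_PRIORITY` table, the field names, and the `filter_metadata` function are all hypothetical, and the idea is simply to keep the highest-priority metadata fields that fit a device's display budget.

```python
# Hypothetical priority order of metadata fields per content type.
# These names are assumptions for illustration, not from the slides.
FIELD_PRIORITY = {
    "analyst_call": ["date", "source", "analyst", "rating"],
    "earnings": ["date", "source", "quarter"],
}


def filter_metadata(item, content_type, max_fields):
    """Return at most max_fields metadata fields, most important first."""
    # Unknown content types fall back to alphabetical field order.
    wanted = FIELD_PRIORITY.get(content_type, sorted(item))
    present = [field for field in wanted if field in item]
    return {field: item[field] for field in present[:max_fields]}


# A full metadata record; a small wireless screen shows only three fields.
call = {
    "date": "11/07",
    "source": "ON24",
    "analyst": "H&Q",
    "rating": "Strong Buy",
    "duration": "2:33",
}
small_screen = filter_metadata(call, "analyst_call", 3)
```

A larger display could call the same function with a bigger `max_fields`, so the device constraint stays a parameter and is not baked into the catalog.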

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine™; Metadata-Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Diagram labels: Collection (Network Content; Affiliate Feeds; Public Sources) -> Processing (Categorize; Catalog; Integrate) -> Rich Data Metabase -> Production Support (Filter; Search; Consolidate; Personalize; Archive; Licensing; Syndication); Sony]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG-2/4/7; MPEG Encoder; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree (Node = AVO Object); Scene Description Tree; Taalee Semantic Engine; Enhanced XML Description; "Cisco Systems" node; Object Content Information (OCI); Metadata-rich, Value-added Node; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Example node metadata:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: Jon Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

= Metadata for Intelligent Content

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (here PERSON plays for TEAM; in the business domain, COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
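The Roger Clemens example can be made concrete with a small sketch. It assumes a toy knowledge base of (entity, relation) pairs; the `KNOWLEDGE_BASE` contents, the relation names, and the `enrich` and `matches` functions are hypothetical illustrations and not Taalee's patented extraction technology.

```python
# Toy knowledge base: (subject, relation) -> object. Illustrative only.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Jamal Anderson", "plays_for"): "Atlanta Falcons",
    ("Atlanta Falcons", "member_of"): "NFL",
}


def add_value(enriched, field, value):
    """Append value to a metadata field unless it is missing or already there."""
    if value is not None and value not in enriched.setdefault(field, []):
        enriched[field].append(value)


def enrich(metadata):
    """Return a copy of the metadata with team, sport, and league inferred from players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        team = KNOWLEDGE_BASE.get((player, "plays_for"))
        add_value(enriched, "teams", team)
        add_value(enriched, "sports", KNOWLEDGE_BASE.get((player, "plays_sport")))
        if team is not None:
            add_value(enriched, "leagues", KNOWLEDGE_BASE.get((team, "member_of")))
    return enriched


def matches(metadata, query):
    """Case-insensitive substring match over all metadata values."""
    return any(query.lower() in value.lower()
               for values in metadata.values() for value in values)


# Extracted metadata mentions only the player; the team is value-added.
story = {"players": ["Roger Clemens"]}
enriched_story = enrich(story)
```

With this sketch, a search for "Yankees" misses `story` but finds `enriched_story`, which mirrors the keyword-only versus value-added behavior the slide contrasts.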

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting - extracted automatically
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Team Name: name of team for which the player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
• League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
• Sport: name of sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
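One way to picture the Commerce One example is as a lookup over subject-relation-object associations. The sketch below is a minimal illustration under stated assumptions: the `ASSOCIATIONS` and `STORIES` data, the relation names, and the choice to treat `competes_with` as symmetric are all hypothetical, not a description of Taalee's Semantic Content Model.

```python
# Hypothetical semantic associations as (subject, relation, object) triples.
ASSOCIATIONS = [
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Ariba", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog: each story is tagged with the companies it mentions.
STORIES = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on Cisco", "companies": ["Cisco"]},
]


def competitors(company):
    """Companies linked by competes_with, treated as symmetric (an assumption)."""
    found = set()
    for subject, relation, obj in ASSOCIATIONS:
        if relation != "competes_with":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return found


def related_stories(company):
    """Titles of stories about the company's competitors, in catalog order."""
    rivals = competitors(company)
    return [story["title"] for story in STORIES
            if rivals & set(story["companies"])]


related = related_stories("Commerce One")
```

A keyword index alone would return only the Commerce One story for this query; the association lookup is what surfaces the Ariba story the user did not ask for.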

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; KnowledgeBase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML, or other format; Story on Cisco; Story on Lucent]

Example flow:

1. Cisco story from PW Source 1 is passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked For + What You Need to Know

Associations around a COMPANY:

• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
• Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic search, personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough
• We need a "logic layer" on top of RDF
• Some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal semantics and reasoning support - OIL, DAML-O

Schema Layer

Definition of vocabulary - RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
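The rule above can be tried out as an ordinary predicate. The following is a sketch under several assumptions that are not in the slides: `distance` is taken to be great-circle distance in miles computed with the haversine formula, `dateDifference` is taken to be whole days from test to earthquake, and a lower bound of zero days is added to reflect the "some time after" wording. The dictionary keys and sample events are hypothetical, and this is not the InfoQuilt implementation.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def date_difference(test_date, quake_date):
    """Whole days from the nuclear test to the earthquake (negative if earlier)."""
    return (quake_date - test_date).days


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Evaluate the 'NuclearTest Causes Earthquake' rule for one pair of events."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


# Hypothetical events, for illustration only.
test_event = {"eventDate": date(1990, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_after = {"eventDate": date(1990, 1, 15), "latitude": 36.0, "longitude": -117.0}
quake_before = {"eventDate": date(1989, 12, 20), "latitude": 36.0, "longitude": -117.0}
```

Here `could_have_caused(test_event, quake_after)` holds, while `quake_before` is rejected because it precedes the test. Passing the rule only marks a pair as a candidate and says nothing about actual causation.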

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and > 9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
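The per-band query above amounts to a grouped count followed by an average. A minimal sketch follows, assuming each earthquake is a hypothetical `(year, magnitude)` pair and that the bands are half-open intervals, so a magnitude of exactly 9.0 falls into the last band; the slide does not specify boundary handling, and the sample data is invented for illustration.

```python
from collections import Counter

# Magnitude bands from the slide, modelled as half-open [low, high) intervals.
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude):
    """Return the band containing the magnitude, or None if below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, band) from start_year onward."""
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if year >= start_year and band is not None:
            counts[(year, band)] += 1
    return counts


def average_per_year(counts, band, first_year, last_year):
    """Average yearly count for one band over an inclusive range of years."""
    total = sum(counts[(year, band)] for year in range(first_year, last_year + 1))
    return total / (last_year - first_year + 1)


# Hypothetical (year, magnitude) records, for illustration only.
sample = [(1940, 6.1), (1940, 6.5), (1941, 7.2), (1950, 6.3), (1899, 8.0), (1950, 5.0)]
sample_counts = counts_per_year(sample)
```

Comparing `average_per_year` for a band before and after a chosen year is the kind of check the slide's conclusion about magnitudes above and below 7 would rest on.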

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site was within a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NewsML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• Semantic Web: www.semanticweb.org
• VoiceXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw-Hill, ISBN 0-07-057735-8, 1998


HP 63

Metadata Usage Metadata Usage Impact on Search amp Query processing Impact on Search amp Query processing

traditional queries based on keywordsattribute based queriescontent-based queries

HP 64

Oingocom

Oingo Ontology ndash ODP based() the database of millions of concepts and relationships that powers Oingossemantic technologyOingo Seek - the database of millions of concepts and relationships that powers Oingos semantic technologyOingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and contextOingo Lingua - the language of meaning used to state intent The basis for intelligent interactionAssets catalogued are Web sites or Web pages

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

Precisely what the user asked for

Closely-related high-value information beyond what

was requested

Ability to explore any dimension around the immediate

point of interest Intelligent content helps the user

ldquothinkrdquo about and fulfill their information needs with less effort

Intelligent content can bemore effectively managed packaged and distributed

HP 67

Metadata and Intelligent Content

Taalee makes content more ldquointelligentrdquo through automatic analysis of every

individual asset to generate a catalog containing

bull Context of the Content

bull Semantic Metadata describing entities (ie Company Industry etc) and

bull Relationships (semantic associations) among all entities

Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase

ldquoNormalrdquo Content can only be ldquofoundrdquo if the user enters a keyword that exists within it

Intelligent Content=+

Adding related metadata and relationshipsdramatically increases the ability to

automatically access needed content via multiple dimensions

HP 68

More than metadata

Taalee makes content more ldquointelligentrdquo through automatic analysis of every individual content item to create

Context of the ContentSemantic Metadata describing entities (ie Company Industryetc) andRelationships (semantic associations) among all entities

Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase

HP 69

Metadata amp Search

Metadata can improve search significantly but metadata enables much more than searchAlternatives for improving search clustering link and other analysis (eg Googlersquos Link Flux analysis) classification as context ontologies metadata knowledgebases hellip

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDISUGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage Search on football touchdown

Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline

Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline

Metadata from Typical Cataloging of Football

Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31 Pit 24

httpwwwnflcom

Quandry Ismail and Tony Banks hook up for their third long touchdown this time on a 76-yarder to extend the Ravenrsquos lead to 31-24 in the third quarter

ProfessionalRavens SteelersBal 31 Pit 24Quandry Ismail Tony BanksTouchdownNFLcom2022000

LeagueTeamsScore

PlayersEvent

Produced byPosted date

HP 72

Taaleersquos Semantic Search

Highly customizable precise and freshest AV search

Context and Domain Specific Attributes Uniform Metadata for Content from Multiple Sources Can be sorted by any field

Delightful relevant informationexceptional targeting opportunity

HP 73

Cre

atin

g a

Web

of

rela

ted

info

rmat

ion

Wha

t can

a c

onte

xt d

o

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc
• AT&T Corp
• more...

2 My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone...
• Edwards explains reasons for leaving BYU
• more...

3 Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory
• more...

4 Pink Floyd Collection
• Set the Controls for the Heart of the Sun...
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream
• more...

HOT Topics

1 Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge
• more...

2 Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security
• more...

3 Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II
• more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram. Labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen. Menu: MyStocks, News, Sports, Music, MyMedia, $]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen. Menu: MyStocks, News, Sports, Music, MyMedia, $. My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen. Menu: MyStocks, News, Sports, Music, MyMedia, $. My Stocks list: CSCO, NT, IBM, Market. CSCO options: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen. Menu: MyStocks, News, Sports, Music, MyMedia, $. My Stocks list: CSCO, NT, IBM, Market. CSCO options: Analyst Call, Conf Call, Earnings]

CSCO Analysis
• 11/08 ON24 Payne
• 11/07 ON24 H&Q
• 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.

HP 87

iTV: Taalee's Extreme Personalization

[Diagram. Labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine(TM); Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in.

This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.

Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Diagram. Labels: Network Content; Affiliate Feeds; Public Sources; Rich Data; Collection; Processing (Categorize, Catalog, Integrate); Metabase; Production Support (Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication); Sony]

HP 90

[Screenshot of a news page with story text, headline list and metadata box]

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

Clips (duration - date - source):
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram. Labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Object Content Information (OCI); Enhanced XML Description; Taalee Semantic Engine; Metadata-rich Value-added Node; "Cisco Systems" Node; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE]

Example node metadata:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation and Usage:

• Extractor Agents: content which does contain the words the user asked for
+
• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+
• Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
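A minimal sketch of the idea, assuming a small hand-built knowledge base: the dictionary layout, the attribute names and the `enrich` function are hypothetical and are not Taalee's actual data model. Metadata extracted from the asset names only the player; the knowledge base supplies the team and league that the story never mentions.

```python
# Hypothetical knowledge base: entity -> facts implied by its semantic associations.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"sport": "Baseball", "teams": "New York Yankees"},
    "Jamal Anderson": {"teams": "Atlanta Falcons", "league": "NFL"},
    "Gary Sheffield": {"teams": "Los Angeles Dodgers", "league": "National League"},
}


def enrich(metadata):
    """Return a copy of the metadata with value-added attributes appended.

    `metadata` maps attribute names to lists of values; the input is not modified.
    """
    enriched = {attr: list(values) for attr, values in metadata.items()}
    for player in metadata.get("players", []):
        for attr, value in KNOWLEDGE_BASE.get(player, {}).items():
            values = enriched.setdefault(attr, [])
            if value not in values:  # avoid duplicating explicit metadata
                values.append(value)
    return enriched


extracted = {"players": ["Jamal Anderson"], "produced_by": ["NFL.com"]}
value_added = enrich(extracted)
```

With the added `teams` and `league` values, a search for "Atlanta Falcons" can return a story that names only Jamal Anderson.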

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting. Extracted automatically
• Player Names: names of players mentioned explicitly in the asset. Extracted automatically
• Team Name: name of team for which player plays. Not mentioned explicitly in asset. Value-added using Taalee's semantic relationships
• League Name: name of league to which the player's team belongs. Not mentioned explicitly in asset. Value-added by Taalee's processing based on semantic associations
• Sport: name of sport

Legend: an arrow from X to Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly, fully described in the many ways the users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram. Labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML or other format; Story on Cisco; Story on Lucent]

Numbered flow on the diagram:

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
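The four-step flow on this slide (story arrives, knowledge base is consulted for competitors, a competitor is returned, a related story is picked from the feeds) could be sketched as below. The `COMPETES_WITH` table, the story dictionaries and both functions are assumptions for illustration only; the deck does not show how the Semantic Engine stores or queries its knowledge base.

```python
# Hypothetical knowledge base of COMPETES WITH relationships (company -> rivals).
COMPETES_WITH = {
    "Cisco": {"Lucent"},
    "Commerce One": {"Ariba"},
}


def competitors(company):
    """Return rivals of a company, treating the relationship as symmetric."""
    direct = set(COMPETES_WITH.get(company, set()))
    reverse = {c for c, rivals in COMPETES_WITH.items() if company in rivals}
    return direct | reverse


def related_stories(story, feed):
    """Pick stories from the feed that mention a competitor of the story's companies."""
    rivals = set()
    for company in story["companies"]:
        rivals |= competitors(company)
    rivals -= set(story["companies"])
    return [s for s in feed if rivals & set(s["companies"])]


cisco_story = {"title": "Story on Cisco", "companies": ["Cisco"]}
external_feed = [
    {"title": "Story on Lucent", "companies": ["Lucent"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
]
picked = [s["title"] for s in related_stories(cisco_story, external_feed)]
```

The user asked only about Cisco; the Lucent story is selected because of the relationship in the knowledge base, not because of any shared keyword.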

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

[Diagram: COMPANY at the center, associated with:]

• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition. Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner.

• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

• Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

• Schema Layer: Definition of Vocabulary - RDF Schema

• Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
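To make the value of a schema or logic layer concrete, here is a small sketch of the simplest inference such a layer permits: propagating rdf:type along rdfs:subClassOf. It covers only the subclass part of the herbivore definition above; the NOT carnivore restriction needs description-logic reasoning, which this sketch does not attempt. The tuple representation and the `giraffe` and `geoffrey` resources are invented for the example.

```python
def infer_types(triples):
    """Return the input triples plus rdf:type triples implied by rdfs:subClassOf."""
    parents = {}
    for s, p, o in triples:
        if p == "rdfs:subClassOf":
            parents.setdefault(s, set()).add(o)

    def ancestors(cls):
        # Walk the subclass hierarchy upward, collecting every superclass once.
        found, stack = set(), [cls]
        while stack:
            for parent in parents.get(stack.pop(), ()):
                if parent not in found:
                    found.add(parent)
                    stack.append(parent)
        return found

    inferred = set(triples)
    for s, p, o in triples:
        if p == "rdf:type":
            inferred.update((s, "rdf:type", a) for a in ancestors(o))
    return inferred


TRIPLES = [
    ("herbivore", "rdfs:subClassOf", "animal"),
    ("giraffe", "rdfs:subClassOf", "herbivore"),   # hypothetical class
    ("geoffrey", "rdf:type", "giraffe"),           # hypothetical instance
]
closure = infer_types(TRIPLES)
```

A query for all animals over `closure` now finds `geoffrey`, although no source statement says so directly.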

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
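One way to evaluate the rule above is sketched below. Several points are assumptions rather than facts from the slides: `dateDifference` is taken to be earthquake date minus test date in days and must be non-negative (the prose says "after"), `distance` is computed as great-circle distance in miles with the haversine formula, and the records are plain dictionaries with invented sample values. The thresholds 30 and 10000 are the ones on the slide.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 so rounding error near antipodal points cannot break asin.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the 'NuclearTest Causes Earthquake' rule to one pair of records."""
    days = (quake["event_date"] - test["event_date"]).days
    if not 0 <= days < max_days:
        return False
    separation = distance_miles(test["latitude"], test["longitude"],
                                quake["latitude"], quake["longitude"])
    return separation < max_miles


# Invented sample records, for illustration only.
TEST = {"event_date": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
NEAR_SOON = {"event_date": date(1970, 1, 15), "latitude": 36.0, "longitude": -115.0}
NEAR_LATE = {"event_date": date(1970, 3, 1), "latitude": 36.0, "longitude": -115.0}
BEFORE = {"event_date": date(1969, 12, 25), "latitude": 36.0, "longitude": -115.0}
ANTIPODE = {"event_date": date(1970, 1, 15), "latitude": -37.0, "longitude": 64.0}
```

Since no two points on Earth are more than about 12,400 miles apart, a 10000-mile threshold excludes only pairs on nearly opposite sides of the globe.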

Knowledge Discovery - Example

When was the first recorded nuclear test conducted? 1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale per year, starting from 1900, find average number of earthquakes.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
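The grouped query on this slide amounts to binning earthquakes by magnitude and averaging the counts over the years covered. A possible sketch follows; the record format (year, magnitude), the function names and the sample data are assumptions, and the slide does not say which bin a boundary value such as 6.0 or 9.0 belongs to, so the half-open ranges below are one arbitrary choice.

```python
from collections import defaultdict

# (low, high, label): low is inclusive, high is exclusive (an assumed convention).
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the range containing the magnitude, or None if below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range over [first_year, last_year]."""
    n_years = last_year - first_year + 1
    totals = defaultdict(int)
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            totals[label] += 1
    return {label: totals[label] / n_years for _, _, label in BINS}


# Invented sample data: (year, magnitude).
SAMPLE = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1901, 4.0)]
averages = average_per_year(SAMPLE, 1900, 1901)
```

Running the same aggregation separately for the years before and after 1945 would give the comparison the slide draws.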

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NEWSML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• SEMANTICWEB: www.semanticweb.org
• VOICEXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

Metadata amp Search

Metadata can improve search significantly but metadata enables much more than searchAlternatives for improving search clustering link and other analysis (eg Googlersquos Link Flux analysis) classification as context ontologies metadata knowledgebases hellip

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDISUGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage Search on football touchdown

Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline

Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline

Metadata from Typical Cataloging of Football

Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31 Pit 24

httpwwwnflcom

Quandry Ismail and Tony Banks hook up for their third long touchdown this time on a 76-yarder to extend the Ravenrsquos lead to 31-24 in the third quarter

ProfessionalRavens SteelersBal 31 Pit 24Quandry Ismail Tony BanksTouchdownNFLcom2022000

LeagueTeamsScore

PlayersEvent

Produced byPosted date

HP 72

Taaleersquos Semantic Search

Highly customizable precise and freshest AV search

Context and Domain Specific Attributes Uniform Metadata for Content from Multiple Sources Can be sorted by any field

Delightful relevant informationexceptional targeting opportunity

HP 73

Cre

atin

g a

Web

of

rela

ted

info

rmat

ion

Wha

t can

a c

onte

xt d

o

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries amp Hot Topics

PERSONALIZATION

3 Julia Roberts Collection

Movie Trailer Notting Hill

Trailer - Runaway Bride

Patrick

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sunhellip

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1 My Stock Portfolio

Microsoft suffers serious hack attack

Cisco Systems Inc

Analyst Safa Rashtchy on Yahoo

PeopleSoft Inc

ATampT Corp

2 My Football Fantasy Team

Gators Spurrier ready for big game

Techs Vick looks to become complete QB

Bucs excited about Hamilton

Jasper Sanks rumbles into the end zonehellip

Edwards explains reasons for leaving BYU morehellip

morehellip

morehellip

morehellip

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Coles security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II morehellip

HOT Topics

morehellip

morehellip

HP 80

Metadata Targeting

SemanticInteractive Targeting

Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site

Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine™; Meta-Data Tagged Content; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming; Structured Hi-Quality Semantic Metabase]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Rich Data; Collection; Processing (Categorize, Catalog, Integrate); Metabase; Production Support; Sony; Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]

HP 90

[Screenshot: news page with story text, headline list, and asset metadata]

Story text:

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --

Customize Page: Settings | Content | Layout | Color

• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Video: Sixty Die In Nigeria Blast

Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

• Gore Demands That Recount Restart
• Gore Says Fla. Can't Name Electors
• Bush Meets Colin Powell at Ranch
• Market Tumbles on Earnings Warning
• Barak Outlines His Peace Plan

[Clip listing: duration, date, source]

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Story text:

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Description

Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG Encoder; MPEG-2/4/7; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced XML Description; Node "Cisco Systems"; Taalee Semantic Engine; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers.

Channel sales through Video Server Vendors, Video App Servers and Broadcasters.

Example node metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content

Content which does contain the words the user asked for (Extractor Agents)

+ Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)

+ Content the user did not think to ask for, but which he needs to know (Semantic Associations)

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found.
• The burden is on the user to think of and ask for the "right" keyword.

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
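The idea of adding missing metadata from known associations can be sketched in a few lines of Python. This is only an illustration, not Taalee's actual implementation: the `KNOWLEDGE_BASE` dictionary, the field names (`players`, `teams`, `leagues`), the function names, and the sample story title are all assumptions made for the example.

```python
# Hypothetical, hand-built knowledge base: entity -> known facts about it.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"type": "PERSON", "sport": "Football", "team": "Atlanta Falcons"},
    "New York Yankees": {"type": "TEAM", "league": "American League"},
    "Atlanta Falcons": {"type": "TEAM", "league": "NFL"},
}


def enrich(metadata):
    """Return a copy of `metadata` with team and league values added.

    Only the players already present in the metadata are used as a starting
    point; everything added comes from KNOWLEDGE_BASE associations.
    """
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        team = KNOWLEDGE_BASE.get(player, {}).get("team")
        if team is None:
            continue  # nothing known about this player
        if team not in enriched.setdefault("teams", []):
            enriched["teams"].append(team)
        league = KNOWLEDGE_BASE.get(team, {}).get("league")
        if league and league not in enriched.setdefault("leagues", []):
            enriched["leagues"].append(league)
    return enriched


def search(assets, field, value):
    """Attribute search: titles of assets whose metadata field contains value."""
    return [a["title"] for a in assets if value in a["metadata"].get(field, [])]


# A made-up story whose extracted metadata mentions only the player.
story = {"title": "Clemens strikes out ten", "metadata": {"players": ["Roger Clemens"]}}
enriched_story = {"title": story["title"], "metadata": enrich(story["metadata"])}

print(search([story], "teams", "New York Yankees"))           # no match before enrichment
print(search([enriched_story], "teams", "New York Yankees"))  # match after enrichment
```

The design point is that the team and league never appear in the source content; they are attached from the knowledge base, so a search on the team can reach a story that only names the player.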

HP 96

Guided Demo for Value Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 – Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'.

Click on the first result for Jamal Anderson.

View the metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 – Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'.

Click on the first result for Gary Sheffield.

View the metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content – Value-Added Metadata

[Diagram: a Rich Media Sports Asset surrounded by its metadata fields]

• Producer Name: name of the content provider that produced the asset.
• Posted Date: date of asset posting. Extracted automatically.
• Player Names: names of players mentioned explicitly in the asset. Extracted automatically.
• Team Name: name of the team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee's semantic relationships.
• League Name: name of the league to which the player's team belongs. Not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations.
• Sport: name of the sport.

Legend: "X → Y" means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context, or relationships of keywords.

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'.

Links to news on companies that compete against Commerce One.

Links to news on companies Commerce One competes against. (To view news on Ariba, click on the link for Ariba.)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically.

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1, 2, 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML or other format; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 is passed on to add semantic associations.
2. The KnowledgeBase is consulted for Cisco's competition.
3. Result returned: Lucent is a competitor of Cisco.
4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard.
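The four-step flow (story arrives, competition is looked up, a competitor is returned, a related story is picked) can be sketched as follows. This is a minimal illustration under stated assumptions: the `COMPETES_WITH` table stands in for the knowledge base, the story dictionaries and function names are hypothetical, and treating the competes-with relation as symmetric is a choice made for this sketch rather than something the slides state.

```python
# Hypothetical stand-in for the knowledge base: company -> known competitors.
COMPETES_WITH = {
    "Cisco": {"Lucent"},
    "Commerce One": {"Ariba"},
}


def competitors(company):
    """Look up competitors, treating COMPETES_WITH as symmetric (an assumption)."""
    found = set(COMPETES_WITH.get(company, set()))
    for other, rivals in COMPETES_WITH.items():
        if company in rivals:
            found.add(other)
    return found


def related_stories(story, feed):
    """Pick feed stories about competitors of the companies in `story`."""
    rivals = set()
    for company in story["companies"]:
        rivals |= competitors(company)
    return [s for s in feed if s is not story and rivals & set(s["companies"])]


# Made-up stories carrying already-extracted company metadata.
cisco_story = {"title": "Story on Cisco", "companies": ["Cisco"]}
lucent_story = {"title": "Story on Lucent", "companies": ["Lucent"]}
ibm_story = {"title": "Story on IBM", "companies": ["IBM"]}

picked = related_stories(cisco_story, [lucent_story, ibm_story])
print([s["title"] for s in picked])
```

Note that the selection works purely on metadata: neither story has to mention the other company in its text.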

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked For + What You Need to Know

[Diagram: COMPANY at the center, linked to the following]

• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
• Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

[Screen callouts: Research Inferred Automatically; Semantically Related News Not Specifically Asked For; Semantic Search, Personalization, etc.]

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O

Schema Layer: Definition of Vocabulary – RDF Schema

Data Layer: Simple data model and syntax for metadata – RDF

OIL – as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
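The rule above can be written as an executable predicate. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes `dateDifference` is measured in days and must be non-negative (the earthquake comes after the test), that `distance` is great-circle distance in miles computed with the haversine formula, and that events are plain dictionaries with the attribute names used in the rule. The sample events are made up.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0 for antipodal points.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days_after < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Made-up events for illustration only.
nuclear_test = {"eventDate": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
quake_near = {"eventDate": date(2000, 1, 15), "latitude": 10.0, "longitude": 10.0}
quake_before = {"eventDate": date(1999, 12, 20), "latitude": 10.0, "longitude": 10.0}
quake_late = {"eventDate": date(2000, 3, 1), "latitude": 10.0, "longitude": 10.0}
quake_far = {"eventDate": date(2000, 1, 15), "latitude": 0.0, "longitude": 180.0}

for quake in (quake_near, quake_before, quake_late, quake_far):
    print(may_have_caused(nuclear_test, quake))
```

Both thresholds are parameters, so the same predicate can be rerun with a tighter time window or radius when exploring the hypothesis.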

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
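A grouping query of this kind can be sketched in Python as below. The band boundaries come from the slide; everything else is an assumption for illustration: earthquakes are represented as `(year, magnitude)` pairs, each band includes its lower bound and excludes its upper bound, a magnitude of exactly 9.0 is placed in the top band, and the sample data is made up rather than taken from USGS or NEIC.

```python
from collections import Counter

# (lower bound inclusive, upper bound exclusive, label) for each band.
BANDS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
]


def band(magnitude):
    """Return the band label for a magnitude, or None if below 5.8."""
    if magnitude >= 9.0:
        return ">9"  # assumption: exactly 9.0 goes into the top band
    for low, high, label in BANDS:
        if low <= magnitude < high:
            return label
    return None


def count_by_year_and_band(quakes, start_year=1900):
    """Count earthquakes per (year, band) from start_year onward."""
    counts = Counter()
    for year, magnitude in quakes:
        label = band(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one band over an inclusive year range."""
    years = last_year - first_year + 1
    total = sum(n for (year, b), n in counts.items()
                if b == label and first_year <= year <= last_year)
    return total / years


# Made-up sample data: (year, magnitude).
quakes = [(1950, 5.9), (1950, 6.5), (1950, 6.9), (1951, 7.2),
          (1899, 8.0), (1951, 5.0), (1951, 9.1)]
counts = count_by_year_and_band(quakes)
print(counts[(1950, "6-7")], average_per_year(counts, "6-7", 1950, 1951))
```

Comparing `average_per_year` for a band before and after a chosen year is one way to check the "almost constant" observation against real data.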

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NEWSML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• SEMANTICWEB: www.semanticweb.org
• VOICEXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com
• Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML – Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF – one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of Semantic Web
  • OIL – as RDF Extension
  • DAML and OIL – Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 65

Use of Categories for Search

After 3 or 4 clicks

HP 66

Metadata is the basis of making Content Intelligent

• Precisely what the user asked for
• Closely related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest

Intelligent content helps the user "think" about and fulfill their information needs with less effort.

Intelligent content can be more effectively managed, packaged, and distributed.

HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" Content can only be "found" if the user enters a keyword that exists within it.

Intelligent Content = [graphic] + [graphic]

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

• Metadata can improve search significantly, but metadata enables much more than search.
• Alternatives for improving search: clustering; link and other analysis (e.g., Google's Link Flux analysis); classification as context; ontologies; metadata; knowledgebases; …

HP 70

Metadata Usage: Keyword, Attribute, and Content-Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from Typical Cataloging of Football Assets (Virage search on "football touchdown"):

• Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline [truncated]
• Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline [truncated]

Taalee Metadata on Football Assets (Rich Media Reference Page):

Baltimore 31, Pit 24

http://www.nfl.com

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Raven's lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
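The contrast between the two kinds of search can be made concrete with a small sketch. It is an illustration only: the two assets reuse the titles, description, and metadata values shown on this slide, while the dictionary layout and the functions `keyword_search` and `attribute_search` are hypothetical and do not represent Virage's or Taalee's interfaces.

```python
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "description": ("Quandry Ismail and Tony Banks hook up for their third "
                        "long touchdown, this time on a 76-yarder, to extend the "
                        "Raven's lead to 31-24 in the third quarter"),
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
    {
        "title": "Brian Griese Interview, Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured attributes
    },
]


def keyword_search(assets, words):
    """Titles of assets whose title or description contains every word."""
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if all(word.lower() in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Titles of assets whose metadata matches every field=value criterion."""
    def matches(asset):
        for field, wanted in criteria.items():
            value = asset["metadata"].get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [asset["title"] for asset in assets if matches(asset)]


print(keyword_search(ASSETS, ["touchdown"]))    # also returns the interview
print(keyword_search(ASSETS, ["steelers"]))     # the word never appears in the text
print(attribute_search(ASSETS, Event="Touchdown", Teams="Steelers"))
```

The keyword query is both too broad (an interview that merely mentions a touchdown) and too narrow (the team name is absent from the text), whereas the attribute query targets the structured fields directly.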

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

• Context and Domain Specific Attributes
• Uniform Metadata for Content from Multiple Sources
• Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

[Screenshot: Taalee Directory, query "Georgia Bulldogs"]

System recognizes ENTITY & CATEGORY

[Screenshot: Taalee Directory, query "Careless whisper"]

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

[Screenshot: personalization page; each group ends with a "more…" link]

PERSONALIZATION: Personalized Queries

1. My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc
• AT&T Corp

2. My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone…
• Edwards explains reasons for leaving BYU

3. Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory

4. Pink Floyd Collection
• Set the Controls for the Heart of the Sun…
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream

HOT Topics

1. Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge

2. Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security

3. Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Time-Shifted Content Aggregator; Web sites and Pages; Content Databases; Interests; Preferences; Semantic Engine™; Personalized Content; Structured Hi-Quality Semantic Metabase]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site

Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst

HP 87

iTV Taaleersquos Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in

This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO

Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata

Conference Call itself can have embedded metadata to support personalization andinteractivity

HP 89

Metadata in Enterprise Apps

Filter Search ConsolidatePersonalize ArchiveLicensing Syndication

Production SupportProduction SupportSony

Categorize

Catalog

Integrate

CollectionCollection ProcessingProcessing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90

t

A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon

At least 60 people died in this needless fire senior local official Karimu Alabi said

Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses

At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)

Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)

-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush

Video

bull Value-add for production broadcast amp syndication

bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers

bull Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description: Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Node "Cisco Systems"; Enhanced XML Description; Taalee Semantic Engine; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Example node metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g. COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
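The Roger Clemens example can be sketched in a few lines of Python. This is only an illustration of the idea, not Taalee's implementation: the `KNOWLEDGE_BASE` dictionary, the relation names (`plays_for`, `plays_sport`), the field names and the sample story text are all assumptions made for the sketch.

```python
# Hypothetical knowledge base: (entity, relation) -> related entity.
# The facts come from the slide; the relation names are invented for this sketch.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Roger Clemens", "plays_sport"): "Baseball",
}

# Which metadata field each relation fills in (assumed field names).
RELATION_TO_FIELD = {"plays_for": "Teams", "plays_sport": "Sport"}


def enrich(metadata: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return a copy of the extracted metadata plus value-added fields."""
    enriched = {name: list(values) for name, values in metadata.items()}
    for player in metadata.get("Players", []):
        for relation, target_field in RELATION_TO_FIELD.items():
            value = KNOWLEDGE_BASE.get((player, relation))
            if value and value not in enriched.setdefault(target_field, []):
                enriched[target_field].append(value)
    return enriched


def keyword_match(query: str, text: str) -> bool:
    """Syntactic search: the query must literally occur in the content."""
    return query.lower() in text.lower()


def metadata_match(query: str, metadata: dict[str, list[str]]) -> bool:
    """Attribute search: the query may occur in any metadata value."""
    return any(query.lower() in value.lower()
               for values in metadata.values() for value in values)


# Illustrative story text and the metadata an extractor might find in it.
story = "Roger Clemens struck out ten batters last night."
extracted = {"Players": ["Roger Clemens"]}
enriched = enrich(extracted)

print(keyword_match("Yankees", story))      # keyword search misses the story
print(metadata_match("Yankees", enriched))  # enriched metadata finds it
```

The design point is that the team name never has to appear in the asset: it is looked up from the association and stored alongside the extracted metadata, so an ordinary attribute search can find it.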

HP 96

Guided Demo for Value Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 – Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 – Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content – Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting – extracted automatically
Player Names: names of players mentioned explicitly in the asset – extracted automatically
Team Name: name of team for which player plays – not mentioned explicitly in asset – value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs – not mentioned explicitly in asset – value-added by Taalee's processing based on semantic associations
Sport: name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
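A minimal sketch of how such associations could drive "what you asked for + what you need to know" is shown below. The three triples restate the Commerce One facts from the slide; the relation names, the `CATALOG` entries (placeholder titles, not real stories) and the `search` function are assumptions made for illustration only.

```python
# Associations as (subject, relation, object) triples, restating the slide.
ASSOCIATIONS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Placeholder catalog: each asset lists the companies its metadata mentions.
CATALOG = [
    {"title": "Story A (about Commerce One)", "companies": ["Commerce One"]},
    {"title": "Story B (about Ariba)", "companies": ["Ariba"]},
    {"title": "Story C (about Cisco Systems)", "companies": ["Cisco Systems"]},
]


def competitors(entity: str) -> set[str]:
    """Follow competes_with in both directions (assumed to be symmetric)."""
    found = set()
    for subject, relation, obj in ASSOCIATIONS:
        if relation != "competes_with":
            continue
        if subject == entity:
            found.add(obj)
        elif obj == entity:
            found.add(subject)
    return found


def search(entity: str, catalog: list[dict]) -> dict[str, list[str]]:
    """Return the assets asked for plus assets about competing companies."""
    rivals = competitors(entity)
    asked = [a["title"] for a in catalog if entity in a["companies"]]
    related = [a["title"] for a in catalog
               if a["title"] not in asked and rivals & set(a["companies"])]
    return {"asked_for": asked, "semantically_related": related}


result = search("Commerce One", CATALOG)
print(result["asked_for"])             # the Commerce One story
print(result["semantically_related"])  # the Ariba story, via competes_with
```

A keyword index alone would return only the first list; the second list exists because the relationship between the two companies is stored as data.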

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata centric Content Management Architecture

Sources: Research; Internal Source 1; Internal Source 2; External feeds/Web (e.g. Reuters)

Components: Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Knowledge Base; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story, and passed on to Dashboard

Output: Story on Cisco; Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News: COMPANIES in Same or Related INDUSTRY

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS

SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY

Technology, Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O

Schema Layer: Definition of Vocabulary – RDF Schema

Data Layer: Simple data model and syntax for metadata – RDF

OIL – as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>
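Because OIL is layered on RDF, the herbivore definition above can be read with any XML or RDF tooling. The sketch below uses Python's standard `xml.etree.ElementTree` to pull out the named superclass and the negated class. The `rdf` and `rdfs` namespace URIs are the standard W3C ones; the `oil` namespace URI is a placeholder assumed for this sketch, the `rdf:type` line is omitted for brevity, and `describe` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "oil": "http://example.org/oil#",  # placeholder, not the real OIL namespace
}

# The herbivore class wrapped in an rdf:RDF root so the prefixes are declared.
DOC = f"""<rdf:RDF xmlns:rdf="{NS['rdf']}" xmlns:rdfs="{NS['rdfs']}" xmlns:oil="{NS['oil']}">
  <rdfs:Class rdf:ID="herbivore">
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>"""

# ElementTree stores namespaced attributes in {uri}name form.
RDF_ID = f"{{{NS['rdf']}}}ID"
RDF_RESOURCE = f"{{{NS['rdf']}}}resource"


def describe(class_element: ET.Element) -> dict:
    """Summarize a class: its named superclasses and its negated classes."""
    named, negated = [], []
    for sub in class_element.findall("rdfs:subClassOf", NS):
        reference = sub.get(RDF_RESOURCE)
        if reference is not None:
            named.append(reference.lstrip("#"))
        for operand in sub.findall("oil:NOT/oil:hasOperand", NS):
            negated.append(operand.get(RDF_RESOURCE).lstrip("#"))
    return {"id": class_element.get(RDF_ID),
            "subclass_of": named, "not": negated}


root = ET.fromstring(DOC)
summary = describe(root.find("rdfs:Class", NS))
print(summary)
```

The summary reads as "herbivore is a subclass of animal and of NOT carnivore". The sketch only extracts that structure; drawing inferences from it is the job of a description-logic reasoner.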

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
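The rule above translates fairly directly into code. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes the date difference is measured in days, requires the earthquake to come after the test, takes the distance threshold to be in miles, and computes great-circle distance with the haversine formula. The `Event` class and the sample coordinates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Event:
    """A nuclear test or an earthquake: when and where it happened."""
    event_date: date
    latitude: float
    longitude: float


def date_difference(test: Event, quake: Event) -> int:
    """Days from the test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = radians(lat2 - lat1)
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def may_have_caused(test: Event, quake: Event,
                    max_days: int = 30, max_miles: float = 10000.0) -> bool:
    """The 'NuclearTest Causes Earthquake' rule with the slide's thresholds."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude,
                     quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles


# Hypothetical events, for illustration only.
test = Event(date(2000, 1, 1), 37.0, -116.0)
quake_soon_after = Event(date(2000, 1, 10), 36.0, -117.0)
quake_before = Event(date(1999, 12, 25), 36.0, -117.0)
quake_too_late = Event(date(2000, 3, 1), 36.0, -117.0)

print(may_have_caused(test, quake_soon_after))
print(may_have_caused(test, quake_before))
print(may_have_caused(test, quake_too_late))
```

Keeping the thresholds as parameters makes it easy to tighten the rule; the relationship is a candidate for discovery, not evidence of causation.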

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
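The grouping query can be sketched as follows. The magnitude ranges are the ones on the slide; everything else is an assumption for illustration: boundary values are assigned to the higher range, the input is a list of `(year, magnitude)` pairs, and the sample records are made up rather than taken from USGS or NEIC data.

```python
from bisect import bisect_right
from collections import Counter

# Lower edges of the slide's magnitude ranges and their labels.
EDGES = [5.8, 6.0, 7.0, 8.0, 9.0]
LABELS = ["5.8-6", "6-7", "7-8", "8-9", ">9"]


def bucket(magnitude: float):
    """Return the range label for a magnitude, or None if it is below 5.8."""
    index = bisect_right(EDGES, magnitude) - 1
    return LABELS[index] if index >= 0 else None


def average_per_year(quakes, first_year: int, last_year: int) -> dict:
    """Average number of earthquakes per year in each magnitude range."""
    years = last_year - first_year + 1
    counts = Counter()
    for year, magnitude in quakes:
        label = bucket(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    return {label: counts[label] / years for label in LABELS}


# Made-up records: (year, magnitude).
quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1901, 5.0)]
averages = average_per_year(quakes, 1900, 1901)
print(averages)
```

Running the same function over the years before and after a chosen date, such as the first test, gives the per-range comparison the slide describes.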

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lies within a radius of 10000 miles of the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998


Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)

Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)

-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush

Video

bull Value-add for production broadcast amp syndication

bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers

bull Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadatarsquos role in emerging iTV infrastructure

MPEG-247MPEG

Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks

Atlanta Falcons Players John KitnaCoaches Mike Holmgren

Dan ReevesLocation Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Content which doescontain the wordsthe user asked for

Extractor Agents

Content which does not contain the words

the user asked for but is about what he asked

for

Value-added Metadata

Content the user did not think to ask for but

which he needs to know

Semantic Associations

+ +

Metadata for Intelligent ContentMetadata for Intelligent Content

Usage

HP 94

Intelligent Contentvia

Value-Added Metadata

HP 95

Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable

users to access content

bull If a keyword is not in the content it cannot be found

bull The burden is on the user to think of and ask for the ldquorightrdquo keyword

For example If a story is about ldquoRoger Clemensrdquo but does not contain the

words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user

searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called

Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)

to add missing metadata to describe content more completely

HP 96

Guided Demo for Value Added Metadata ndashExample one

bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson

bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata

on the following RMR page

bull Here is what you see

Produced by NFLcom Posted Date 9202000 League NFL

Teams Atlanta Falcons Players Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield.

• Click on the first result (titled "I want out") and view the metadata on the following RMR page.

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL").

• View the source HTML page that has the original story and locate the story with the heading "I want out".

• Verify that neither Team = Los Angeles Dodgers nor League = National League was present in the source content.

• Taalee attached this value-added metadata to the asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers.

• Note that other search engines and directories will not be able to do this.

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'.

Click on the first result for Jamal Anderson.

View the metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name or League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'.

Click on the first result for Gary Sheffield.

View the metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name or League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content - Value-Added Metadata

[Diagram: a Rich Media Sports Asset surrounded by its metadata fields]

Producer Name: name of the content provider that produced the asset.

Posted Date: date of asset posting. Extracted automatically.

Player Names: names of players mentioned explicitly in the asset. Extracted automatically.

Team Name: name of the team for which the player plays. Not mentioned explicitly in the asset. Value-added using Taalee's semantic relationships.

League Name: name of the league to which the player's team belongs. Not mentioned explicitly in the asset. Value-added by Taalee's processing based on semantic associations.

Sport: name of the sport.

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways the users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content.

• They do not understand the meaning, context or relationships of keywords.

For example, a search engine may see that the term "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'.

Links to news on companies that compete against Commerce One.

Links to news on companies Commerce One competes against. (To view news on Ariba, click on the link for Ariba.)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically.

HP 104

Metadata-centric Content Management Architecture

[Diagram labels] Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters). Extractor Agent 1, Extractor Agent 2, Extractor Agent 3. Semantic Engine, World Model, KnowledgeBase, Taalee Metabase. Semantic Application (ASP/Enterprise hosted). Third-party Content Mgmt and Syndication. Output: XCM-compliant metadata, XML or other format.

1. Cisco story from PW Source 1 passed on to add semantic associations.

2. Consults KnowledgeBase for Cisco's competition.

3. Returns result: Lucent is a competitor of Cisco.

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard.

Result: Story on Cisco, Story on Lucent.
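The four-step Cisco/Lucent flow can be sketched as a knowledge-base lookup followed by a filter over tagged stories. This is a minimal sketch under assumptions: the `KNOWLEDGE_BASE` layout, the `competes_with` relation name and the story records are hypothetical stand-ins for what the extractor agents and the Semantic Engine would produce.

```python
# Hypothetical knowledge base: (subject, relation) -> set of related entities.
KNOWLEDGE_BASE = {
    ("Cisco", "competes_with"): {"Lucent"},
    ("Commerce One", "competes_with"): {"Ariba"},
}

# Stories as they might look after extractor agents have tagged them (assumption).
STORIES = [
    {"source": "internal", "title": "Story on Cisco", "companies": {"Cisco"}},
    {"source": "external feed", "title": "Story on Lucent", "companies": {"Lucent"}},
    {"source": "external feed", "title": "Story on Ariba", "companies": {"Ariba"}},
]


def competitors(company):
    """Steps 2 and 3: consult the knowledge base for a company's competition."""
    return KNOWLEDGE_BASE.get((company, "competes_with"), set())


def related_stories(story, pool):
    """Step 4: pick stories about competitors of the companies in `story`."""
    rivals = set()
    for company in story["companies"]:
        rivals |= competitors(company)
    return [s for s in pool if s is not story and s["companies"] & rivals]


dashboard = related_stories(STORIES[0], STORIES)
print([s["title"] for s in dashboard])
```

The relation is stored in one direction only here; whether "competes with" should be treated as symmetric is a modeling decision the slides do not settle.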

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked For + What You Need to Know

[Diagram: a COMPANY at the center, linked to the following]

• Related Stock News

• Industry News

• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

• Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.

• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition. Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

• Based on RDF + XML

• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
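Read as a definition, the OIL snippet says that a herbivore is an animal that is not a carnivore. The sketch below mimics that reading with plain Python sets. It is a closed-world simplification made up for illustration (a description-logic reasoner works under an open-world assumption), and the individuals are invented.

```python
# Hypothetical individuals; only the class names come from the OIL snippet.
ANIMAL = {"cow", "lion", "rabbit"}
CARNIVORE = {"lion"}


def is_herbivore(individual):
    """Closed-world reading of: herbivore = animal AND NOT carnivore."""
    return individual in ANIMAL and individual not in CARNIVORE


HERBIVORE = {x for x in ANIMAL if is_herbivore(x)}
print(sorted(HERBIVORE))
```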

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude ) < 10000
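The rule above can be written as an ordinary predicate. The sketch below is an assumption-laden illustration, not the InfoQuilt implementation: it takes the date difference in days and the distance in miles, requires the earthquake to occur on or after the test date (the rule text only gives the upper bound of 30), and uses the haversine great-circle formula for `distance`. The sample events are invented.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    # Guard against rounding pushing sqrt(a) slightly above 1 for antipodal points.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, with the thresholds from the rule as defaults."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:  # assumption: the earthquake must not precede the test
        return False
    d = distance_miles(test["latitude"], test["longitude"],
                       quake["latitude"], quake["longitude"])
    return d < max_miles


# Invented sample events, for illustration only.
test_event = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_near = {"eventDate": date(1970, 1, 15), "latitude": 36.0, "longitude": -117.0}
quake_early = {"eventDate": date(1969, 12, 25), "latitude": 36.0, "longitude": -117.0}
quake_far = {"eventDate": date(1970, 1, 15), "latitude": -37.0, "longitude": 64.0}

print(could_have_caused(test_event, quake_near))
print(could_have_caused(test_event, quake_early))
print(could_have_caused(test_event, quake_far))
```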

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and > 9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
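The grouping query can be sketched as a small aggregation. This is an illustrative sketch only: the `(year, magnitude)` record layout, the half-open treatment of band boundaries, and the sample data are assumptions, not taken from the earthquake sources named earlier.

```python
from collections import Counter

# Magnitude bands from the slide; treating them as half-open [low, high) is an assumption.
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude):
    """Return the band a magnitude falls into, or None if it is below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly count of earthquakes per band over [first_year, last_year]."""
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if band is not None and first_year <= year <= last_year:
            counts[band] += 1
    n_years = last_year - first_year + 1
    return {band: counts[band] / n_years for band in BANDS}


# Invented (year, magnitude) records, for illustration only.
quakes = [(1900, 5.9), (1901, 6.5), (1901, 6.7), (1899, 7.2), (1901, 5.0)]
averages = average_per_year(quakes, 1900, 1901)
print(averages[(6.0, 7.0)])
```

Running the same aggregation over the years before and after a chosen cut-off would give the per-band comparison the slide describes.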

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lay within a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media. Amit Sheth and Wolfgang Klas, Eds. McGraw Hill, ISBN 0-07-057735-8, 1998.


HP 67

Metadata and Intelligent Content

Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:

• Context of the Content

• Semantic Metadata describing entities (e.g. Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

"Normal" Content can only be "found" if the user enters a keyword that exists within it.

Intelligent Content = "Normal" Content + related metadata and relationships

Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

• Context of the Content

• Semantic Metadata describing entities (e.g. Company, Industry, etc.), and

• Relationships (semantic associations) among all entities

Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search.

Alternatives for improving search: clustering; link and other analysis (e.g. Google's Link Flux analysis); classification as context; ontologies; metadata; knowledgebases; ...

HP 70

Metadata Usage: Keyword, Attribute and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Virage search on "football touchdown":

• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline

• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Metadata from Typical Cataloging of Football Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31, Pit 24

http://www.nfl.com

Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context and domain specific attributes. Uniform metadata for content from multiple sources. Can be sorted by any field.

Delightful, relevant information: exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization, and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc
• AT&T Corp
more...

2. My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone...
• Edwards explains reasons for leaving BYU
more...

3. Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory
more...

4. Pink Floyd Collection
• Set the Controls for the Heart of the Sun...
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream
more...

HOT Topics

1. Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge
more...

2. Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security
more...

3. Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II
more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels] Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests, Preferences; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia]

The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; "My Stocks" list with CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; "My Stocks" list with CSCO, NT, IBM, Market; "CSCO" list with Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; "My Stocks" list with CSCO, NT, IBM, Market; "CSCO" list with Analyst Call, Conf Call, Earnings]

CSCO Analysis
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
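The idea of semantic filtering for a small screen can be sketched as picking, per content type, the few metadata fields that matter most. This is a hypothetical sketch: the `FIELD_PRIORITY` table, the field names and the `duration` values are assumptions; only the three date/source/analyst rows come from the listing above.

```python
# Hypothetical per-type field priorities. The slide only says that an Analyst Call
# listing focuses on the source and the analyst name or company.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst", "rating"],
}
DEFAULT_PRIORITY = ["date", "source"]


def filter_for_screen(item, max_fields):
    """Keep at most `max_fields` metadata fields, in priority order for the item type."""
    priority = FIELD_PRIORITY.get(item["type"], DEFAULT_PRIORITY)
    chosen = [name for name in priority if name in item][:max_fields]
    return {name: item[name] for name in chosen}


def listing(items, max_fields=3):
    """Listing sorted by date, newest first, reduced to what fits the screen."""
    ordered = sorted(items, key=lambda it: it["date"], reverse=True)
    return [filter_for_screen(it, max_fields) for it in ordered]


# Rows from the slide; the `duration` values are invented extra metadata.
calls = [
    {"type": "Analyst Call", "date": "11/06", "source": "CBS", "analyst": "Langlesis", "duration": "2:10"},
    {"type": "Analyst Call", "date": "11/08", "source": "ON24", "analyst": "Payne", "duration": "1:45"},
    {"type": "Analyst Call", "date": "11/07", "source": "ON24", "analyst": "H&Q", "rating": "Strong Buy"},
]

for row in listing(calls):
    print(row)
```

Sorting the `MM/DD` strings works here only because all dates share one format and year; a real system would carry full dates.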

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels] Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Meta-Data Tagged Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Diagram labels] Collection: Network Content, Affiliate Feeds, Public Sources. Processing: Categorize, Catalog, Integrate. Rich Data Metabase. Production Support (Sony): Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication.

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels] Video; MPEG Encoder (MPEG-2/4/7); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Node "Cisco Systems"; Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE

Example metadata
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content

Extractor Agents: content which does contain the words the user asked for

+

Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+

Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found.

• The burden is on the user to think of and ask for the "right" keyword.

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (e.g. COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.

HP 96

Guided Demo for Value Added Metadata ndashExample one

bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson

bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata

on the following RMR page

bull Here is what you see

Produced by NFLcom Posted Date 9202000 League NFL

Teams Atlanta Falcons Players Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)

Click on first result for Gary Sheffield

View metadata Note that Team name and League name are also included

in the metadata

Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content ndash Value-Added Metadata

Posted Date

Posted Date

Date of asset posting ndashExtracted automatically

League Name

Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations

Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships

Team NameTeam Name

Producer Name

Producer Name

Rich MediaSports AssetRich Media

Sports Asset

Name of content provider that produced the asset

Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added

by Taalee using its semantic relationships

The asset is richly fully described in the many ways the users chose to interact

Player NamesPlayer Names

SportSportName of

sport

LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset

Name of players mentioned explicitly in the asset ndash Extracted automatically

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

bull Traditional search engines rely solely on (syntactic) keywords to find content

bull They do not understand the meaning context or relationships of keywords

For example a search engine may see that the word ldquoCommerce Onerdquo occurs

but it does not know that Commerce One is a COMPANY which Participates in

the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba

As a result search engines cannot go beyond returning a list (or directory view)

of what the user has asked for Their ability to provide associated information is

extremely limited static and difficult to scale Taaleersquos Semantic Content Model

goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company lsquoCommerce Onersquo

Links to news on companies that compete against

Commerce One

Links to news on companies Commerce One competes

against(To view news on Ariba click

on the link for Ariba)

Crucial news on Commerce Onersquos

competitors (Ariba) can be accessed easily and

automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1, 2, 3; Semantic Engine; World Model; KnowledgeBase; Taalee Metabase; XCM-compliant metadata, XML or other format; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 is passed on to add semantic associations

2. Consults the KnowledgeBase for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What You Need to Know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition - Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of the Semantic Web

Logical Layer: formal semantics and reasoning support - OIL, DAML-O

Schema Layer: definition of vocabulary - RDF Schema

Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear tests may cause earthquakes. Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
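The rule above can be written as an executable predicate. The sketch below is an illustration under stated assumptions, not the InfoQuilt implementation: the `Event` class and function names are hypothetical, the thresholds are read as days and miles (as a later slide states them), `distance` is taken to be great-circle distance computed with the haversine formula, and the check that the earthquake does not precede the test is added from the prose ("some time after") rather than from the rule itself.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumed unit


@dataclass(frozen=True)
class Event:
    """A dated event at a point on the globe (hypothetical structure)."""
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test: Event, quake: Event) -> int:
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test: Event, quake: Event,
                    max_days: int = 30, max_miles: float = 10000) -> bool:
    """NuclearTest Causes Earthquake, per the rule on the slide."""
    days = date_difference(test, quake)
    if not 0 <= days < max_days:   # the lower bound is an added assumption
        return False
    return distance(test.latitude, test.longitude,
                    quake.latitude, quake.longitude) < max_miles


# Made-up sample events, for illustration only.
sample_test = Event(date(2000, 1, 1), 0.0, 0.0)
sample_quake = Event(date(2000, 1, 15), 0.0, 10.0)
print(may_have_caused(sample_test, sample_quake))
```

Applying `may_have_caused` to every (test, earthquake) pair drawn from the two kinds of sources gives the candidate pairs that the later knowledge-discovery queries examine.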

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900

The number of earthquakes with magnitude > 7 is almost constant, so nuclear tests probably only cause earthquakes with magnitude < 7
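The grouping query can be sketched as a small aggregation. This is a hedged illustration only: the input format (pairs of year and magnitude), the function names, and the half-open treatment of range boundaries (for example, 6.0 falls in 6-7, and 9.0 falls in the top range) are assumptions. The sample data is made up.

```python
from collections import Counter

# Magnitude ranges from the slide; the upper bound is exclusive (an assumption).
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_range(magnitude):
    """Return the (low, high) range containing the magnitude, or None if below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """Average number of earthquakes per year in each range.

    quakes is an iterable of (year, magnitude) pairs.
    """
    counts = Counter()
    for year, magnitude in quakes:
        bucket = magnitude_range(magnitude)
        if bucket is not None and first_year <= year <= last_year:
            counts[bucket] += 1
    n_years = last_year - first_year + 1
    return {bucket: counts[bucket] / n_years for bucket in RANGES}


# Made-up sample records, for illustration only.
sample = [(1900, 5.9), (1901, 6.5), (1901, 6.9), (1901, 5.0), (1902, 9.1)]
print(average_per_year(sample, 1900, 1901))
```

Running the same aggregation over the years before and after the first recorded test, range by range, is one way to make the comparison the slide describes.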

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of the Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 68

More than metadata

Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:

Context of the content

Semantic metadata describing entities (i.e., Company, Industry, etc.) and relationships (semantic associations) among all entities

Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Virage search on football touchdown (metadata from typical cataloging of football assets):

Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline

Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline

Taalee Metadata on Football Assets

Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
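The difference between the two kinds of search can be sketched as follows. The asset records reuse values shown on the slide, but the record layout, field names, and both functions are hypothetical and are not the Virage or Taalee interfaces.

```python
# Illustrative asset records; the dictionary layout and keys are assumptions.
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "text": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "league": "Professional",
            "teams": ["Ravens", "Steelers"],
            "players": ["Quandry Ismail", "Tony Banks"],
            "event": "Touchdown",
            "produced_by": "NFL.com",
        },
    },
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured attributes
    },
]


def keyword_search(assets, keyword):
    """Syntactic search: match the keyword anywhere in the title or text."""
    needle = keyword.lower()
    return [a["title"] for a in assets
            if needle in (a["title"] + " " + a["text"]).lower()]


def _field_matches(metadata, field, wanted):
    """True if the metadata field equals the wanted value or contains it in a list."""
    value = metadata.get(field)
    values = value if isinstance(value, list) else [value]
    return wanted in values


def attribute_search(assets, **criteria):
    """Attribute search: every named metadata field must match."""
    return [a["title"] for a in assets
            if all(_field_matches(a["metadata"], f, w) for f, w in criteria.items())]


print(keyword_search(ASSETS, "touchdown"))
print(attribute_search(ASSETS, event="Touchdown", teams="Steelers"))
```

The keyword query returns the interview as well as the game clip, because both mention the word. The attribute query returns only the asset whose event is a touchdown in a game involving the Steelers, even though the word "Steelers" does not appear in its text.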

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and domain-specific attributes. Uniform metadata for content from multiple sources. Can be sorted by any field.

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory: Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory: Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content

Personalization and targeting / interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more...

2 My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU; more...

3 Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more...

4 Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more...

HOT Topics

1 Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more...

2 Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more...

3 Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen menu: MyStocks, News, Sports, Music, MyMedia, $]

The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen menu: MyStocks, News, Sports, Music, MyMedia, $. My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

The search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen menu: MyStocks, News, Sports, Music, MyMedia, $. My Stocks list: CSCO, NT, IBM, Market. CSCO content types: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen menu: MyStocks, News, Sports, Music, MyMedia, $. My Stocks list: CSCO, NT, IBM, Market. CSCO content types: Analyst Call, Conf Call, Earnings. CSCO Analysis listing: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
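One way to picture semantic filtering under a screen constraint is a per-content-type field priority list that is truncated to fit the display. The sketch below is an assumption-laden illustration, not the product's algorithm: the `FIELD_PRIORITY` table, the field names, the character budget, and the reading of the listing entry as date, source, and analyst are all hypothetical.

```python
# Hypothetical priority of metadata fields for each content type.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst", "rating"],
    "Earnings": ["date", "source", "title"],
}
DEFAULT_FIELDS = ["date", "title"]


def filter_for_screen(item, max_chars):
    """Pick metadata fields in priority order until the line would exceed max_chars."""
    fields = FIELD_PRIORITY.get(item["type"], DEFAULT_FIELDS)
    parts = []
    for field in fields:
        value = item.get(field)
        if value is None:
            continue
        candidate = " ".join(parts + [value])
        if len(candidate) > max_chars:
            break  # lower-priority fields are dropped to fit the screen
        parts.append(value)
    return " ".join(parts)


# Illustrative record; the field assignment is an assumption.
call = {"type": "Analyst Call", "date": "11/08", "source": "ON24",
        "analyst": "Payne", "rating": "Strong Buy",
        "title": "Cisco analyst call"}

print(filter_for_screen(call, 16))   # small wireless display
print(filter_for_screen(call, 40))   # wider display
```

The same metadata record yields a short line on a small display and a fuller line where more room is available, which is the effect the slide attributes to choosing "just the right metadata".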

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine(TM); Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing (Categorize, Catalog, Integrate); Rich Data Metabase; Production Support (Sony); Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]

HP 90


A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon

At least 60 people died in this needless fire senior local official Karimu Alabi said

Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses

At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast

Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description: Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG-2/4/7; MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Node = AVO Object; Object Content Information (OCI); Enhanced XML Description; Taalee Semantic Engine; Node "Cisco Systems"; Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content

Extractor Agents: content which does contain the words the user asked for

+

Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+

Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
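A minimal sketch of this enrichment step follows. It is an illustration only, not Taalee's extractor: the knowledge-base layout, the function names, and the sample story text are hypothetical, and entity recognition is reduced to exact name matching.

```python
# Hypothetical knowledge base: what is known about each recognized person.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball",
                      "team": "New York Yankees"},
}


def enrich(text, kb):
    """Extract players named in the text, then add team and sport from the knowledge base."""
    metadata = {"players": [], "teams": [], "sports": []}
    for name, facts in kb.items():
        if facts["type"] == "PERSON" and name in text:
            metadata["players"].append(name)              # extracted from the content
            for field, key in (("teams", "team"), ("sports", "sport")):
                if facts[key] not in metadata[field]:
                    metadata[field].append(facts[key])    # value-added, not in the content
    return metadata


def matches(query, text, metadata):
    """True if the query occurs in the text or in any metadata value."""
    q = query.lower()
    if q in text.lower():
        return True
    return any(q in value.lower()
               for values in metadata.values() for value in values)


# Made-up story text that never mentions the team.
story = "Roger Clemens struck out ten batters on Tuesday night."

print(matches("Yankees", story, {}))                             # keyword index only
print(matches("Yankees", story, enrich(story, KNOWLEDGE_BASE)))  # with value-added metadata
```

The story is found for the query "Yankees" only after enrichment, because the team name comes from the knowledge base rather than from the story itself.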

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Posted Date: date of asset posting - extracted automatically

Producer Name: name of content provider that produced the asset

Player Names: names of players mentioned explicitly in the asset - extracted automatically

Sport: name of sport

Team Name: name of team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships

League Name: name of league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described, in the many ways users choose to interact.

Legend: "X -> Y" means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

bull Traditional search engines rely solely on (syntactic) keywords to find content

bull They do not understand the meaning context or relationships of keywords

For example a search engine may see that the word ldquoCommerce Onerdquo occurs

but it does not know that Commerce One is a COMPANY which Participates in

the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba

As a result search engines cannot go beyond returning a list (or directory view)

of what the user has asked for Their ability to provide associated information is

extremely limited static and difficult to scale Taaleersquos Semantic Content Model

goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company lsquoCommerce Onersquo

Links to news on companies that compete against

Commerce One

Links to news on companies Commerce One competes

against(To view news on Ariba click

on the link for Ariba)

Crucial news on Commerce Onersquos

competitors (Ariba) can be accessed easily and

automatically

HP 104

Internal Source 1Research

Internal Source 2

External feedsWeb(eg Reuters)

1

2

3

4

Cisco story from PW Source 1passed on to addsemanticassociations

ConsultsKnowledgeBasefor Ciscorsquoscompetition

Returns resultLucent is a competitor of Cisco

Lucent story from external

feeds picked for publishing as ldquosemantically

relatedrdquo to Ciscostory ndash passed

on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata XML or other format

SemanticApplication

ASPEnterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata centricContent Management Architecture

SemanticEngine

World Model

TaaleeMetabase

Third-partyContent Mgmt

AndSyndication

HP 105

Semantic Associationssupported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANYCOMPANYRelated Stock News

Related Stock News

IndustryNews

IndustryNews

CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or

Related INDUSTRY

SECEPAEPA

RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY

Technology Products

Technology ProductsImportant to INDUSTRY or COMPANY

HP 106

Semantic Web Application ExampleFinancial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic SearchPersonalization etc

A vision for future

Semantic Web Complex Relationships and Knowledge Discovery

Eg InfoQuilt project at LSDIS Lab Univ of Georgia

HP 108

Beyond RDF ndash one proposal (cf Ora Lassila)

Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data

RDF + DL = ldquoFrame System for WWWrdquo

Source wwwontoknowledgeorgoil

HP 109

Semantic Web - next step in Web evolution

ldquoA Web in which machine reasoning will be ubiquitous and devastatingly powerfulrdquo [Berners-Lee]

ldquoA place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixturerdquo [Berners-Lee]

ldquoA semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activitiesrdquo [Berners-Lee]

A personal definitionSemantic Web The concept that Web-accessible

content can be organized semantically rather than though syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

a proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic mannerBased on RDF+XMLAgent readable Tags

wwwdamlorg

DAML Example

Sou

rce

http

w

ww

zdn

etc

omp

cwee

kst

orie

sju

mps

04

270

2432

946

00h

tml

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support ndash OIL DAML-O

Schema Layer

Definition of Vocabulary RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL ndash as RDF Extension

ltrdfsClass rdfID=rdquoherbivorerdquogtltrdftype

rdfresource=rdquohttpwwwontoknowledgeorgDefinedClassrdquogtltrdfssubClassOf rdfresource=rdquoanimalrdquogtltrdfssubClassOfgt

ltoilNOTgtltoilhasOperand rdfresource=rdquocarnivorerdquogt

ltoilNOTgtltrdfssubClassOfgt

ltrdfsClassgt

DAML and OIL ndash Evolving towards Semantic Web

OIL MissionOIL is a Web-based representation and inference layer for ontologies which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery Knowledge Discovery --ExampleExample

Earthquake Sources(USGS NEIC)

Nuclear Test Sources(Oklahoma Observatory etc)

Nuclear Test May Cause Earthquakes

Is it really true

Complex RelationshipsComplex Relationships

A nuclear test could have caused an earthquakeif the earthquake occurred some time after thenuclear test was conducted and in a nearby region

NuclearTest Causes Earthquakelt= dateDifference( NuclearTesteventDate

EarthquakeeventDate ) lt 30AND distance( NuclearTestlatitude

NuclearTestlongitudeEarthquakelatitudeEarthquakelongitude ) lt 10000

Knowledge Discovery Knowledge Discovery --ExampleExample

When was the first recorded nuclear test conducted

1950Find the total number of earthquakes with a magnitude58 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery Knowledge Discovery --ExamplehellipExamplehellip

For each group of earthquakes with magnitudes in the ranges58-6 6-7 7-8 8-9 and gt9 on the Richter scale per yearstarting from 1900 find average number of earthquakes

Number of earthquakes with magnitude gt 7 almost constant So nuclear tests probably only cause earthquakes with magnitude lt 7

KnowledgeKnowledge DiscoveryDiscovery --ExampleExamplehelliphellip

Find pairs of nuclear tests and earthquakes such that the earthequakeoccurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 69

Metadata & Search

Metadata can improve search significantly, but metadata enables much more than search.

Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...

HP 70

Metadata Usage: Keyword, Attribute, and Content Based Access

The VisualHarness system at LSDIS/UGA

HP 71

Keyword Search vs. Attribute Search with Semantic Metadata

Metadata from Typical Cataloging of Football Assets

Virage search on "football touchdown":

• Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
• Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline

Taalee Metadata on Football Assets

Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)

Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.

League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
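The contrast between the two kinds of search can be sketched in a few lines. The record layout and function names are hypothetical; the two sample assets reuse only the titles, descriptions, and metadata values shown on the slide, and the first asset is given no structured metadata to stand in for a typical catalog entry.

```python
ASSETS = [
    {
        "title": "Brian Griese Interview, Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured attributes
    },
    {
        "title": "Baltimore 31, Pit 24",
        "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "league": ("Professional",),
            "teams": ("Ravens", "Steelers"),
            "players": ("Quandry Ismail", "Tony Banks"),
            "event": ("Touchdown",),
        },
    },
]


def keyword_search(assets, *words):
    """Syntactic match: every word must occur in the title or description."""
    def text(asset):
        return (asset["title"] + " " + asset["description"]).lower()
    return [a for a in assets if all(w.lower() in text(a) for w in words)]


def attribute_search(assets, **criteria):
    """Attribute match: every requested field must contain the wanted value."""
    return [
        a for a in assets
        if all(wanted in a["metadata"].get(field, ()) for field, wanted in criteria.items())
    ]


print([a["title"] for a in keyword_search(ASSETS, "touchdown")])
print([a["title"] for a in attribute_search(ASSETS, event="Touchdown", teams="Ravens")])
```

The keyword query returns the interview as well as the game clip, because both merely mention the word; the attribute query returns only the asset whose metadata says a touchdown event occurred in a Ravens game.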

HP 72

Taalee's Semantic Search

Highly customizable, precise, and freshest A/V search

Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content: personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
• Microsoft suffers serious hack attack
• Cisco Systems Inc
• Analyst Safa Rashtchy on Yahoo
• PeopleSoft Inc
• AT&T Corp

2. My Football Fantasy Team
• Gators' Spurrier ready for big game
• Tech's Vick looks to become complete QB
• Bucs excited about Hamilton
• Jasper Sanks rumbles into the end zone...
• Edwards explains reasons for leaving BYU

3. Julia Roberts Collection
• Movie Trailer: Notting Hill
• Trailer - Runaway Bride
• Patrick
• Movie Trailer: Stepmom
• Conspiracy Theory

4. Pink Floyd Collection
• Set the Controls for the Heart of the Sun...
• Wish You Were Here
• Round And Around
• Keep Talking
• The Post War Dream

HOT Topics

1. Election 2000
• Video: Explaining the electoral map
• Race for White House hots up
• Seniors Give Gore Florida Edge

2. Middle East Peace Conflict
• More die as Israel steps up security
• Israel braces for suicide bombs
• Pentagon probes Cole's security

3. Napster Controversy
• The Brain Behind Napster
• Napster Lawsuit
• Creative Nomad II

HP 80

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Interests/Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO content types: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO content types: Analyst Call, Conf Call, Earnings]

CSCO Analysis

• 11/08 ON24 Payne
• 11/07 ON24 H&Q
• 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst.
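One way to picture semantic filtering for a small screen is a per-content-type choice of which metadata fields to display. The sketch below is an assumption about how such a filter could look, not Taalee's or Voquette's implementation; the field names, the `DISPLAY_FIELDS` mapping, and the character limit are hypothetical, and the three entries are the ones listed on the slide.

```python
# Which metadata fields to show for each content type (hypothetical mapping).
DISPLAY_FIELDS = {"Analyst Call": ("date", "source", "analyst")}

CALLS = [
    {"type": "Analyst Call", "date": "11/06", "source": "CBS", "analyst": "Langlesis"},
    {"type": "Analyst Call", "date": "11/08", "source": "ON24", "analyst": "Payne"},
    {"type": "Analyst Call", "date": "11/07", "source": "ON24", "analyst": "H&Q",
     "rating": "Strong Buy"},  # extra metadata, shown as an icon rather than text
]


def listing(assets, content_type, max_chars=24):
    """Newest-first one-line entries, limited to the fields chosen for the type."""
    fields = DISPLAY_FIELDS[content_type]
    chosen = [a for a in assets if a["type"] == content_type]
    # Dates are month/day strings within one year, so string order is date order.
    chosen.sort(key=lambda a: a["date"], reverse=True)
    return [" ".join(a[f] for f in fields)[:max_chars] for a in chosen]


for line in listing(CALLS, "Analyst Call"):
    print(line)
```

Keeping the field selection in a table makes it easy to give a different content type, or a larger screen, a different set of fields without touching the listing code.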

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Meta-Data Tagged Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in.

This screen is customizable with interactivity feature using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc., all of which are relevant metadata.

Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Diagram labels: Network Content, Affiliate Feeds, Public Sources (Rich Data); Collection and Processing: Categorize, Catalog, Integrate; Metabase; Production Support (Sony): Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Video: Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; "Cisco Systems" Node; Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

Example node metadata:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation

• Extractor Agents: content which does contain the words the user asked for
• + Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
• + Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
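The enrichment step described above can be sketched as a lookup against a small knowledge base. Everything structural here is assumed for illustration (the `PLAYER_FACTS` table, the list-valued metadata, the naive pluralised field names); only the facts themselves come from the slides, and this is not Taalee's patented extraction method.

```python
# Hypothetical knowledge base holding facts quoted in the workshop slides.
PLAYER_FACTS = {
    "Roger Clemens": {"sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"team": "Atlanta Falcons", "league": "NFL"},
}


def add_value_added_metadata(metadata):
    """Return a copy of the metadata with facts implied by its players added."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        for field, value in PLAYER_FACTS.get(player, {}).items():
            # Field names are pluralised naively ("team" -> "teams") in this sketch.
            values = enriched.setdefault(field + "s", [])
            if value not in values:
                values.append(value)
    return enriched


def search(assets, field, value):
    """Find assets whose metadata field contains the value."""
    return [a for a in assets if value in a.get(field, [])]


story = {"players": ["Roger Clemens"]}  # extracted from the text itself
enriched = add_value_added_metadata(story)
print(search([story], "teams", "New York Yankees"))     # not found before enrichment
print(search([enriched], "teams", "New York Yankees"))  # found after enrichment
```

The original record is left untouched, so extracted metadata and value-added metadata can still be told apart if the caller keeps both.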

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10: Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10: Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting - extracted automatically
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
• League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
• Sport: name of sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly, fully described in the many ways the users chose to interact.

Legend: "X -> Y" means Taalee uses X to add Y as value-added metadata to the asset.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
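A toy version of such associations can be written as subject-relation-object triples. The relation names, the decision to treat competition as symmetric, and the story identifiers below are assumptions made for this sketch; it does not describe how the Semantic Engine or WorldModel are actually built.

```python
# Associations taken from the example above, stored as simple triples.
ASSOCIATIONS = [
    ("Commerce One", "IS_A", "COMPANY"),
    ("Commerce One", "PARTICIPATES_IN", "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
]

# Assumption: competition is read in both directions.
SYMMETRIC = {"COMPETES_WITH"}


def related(entity, relation):
    """Entities linked to `entity` by `relation`."""
    found = [o for s, r, o in ASSOCIATIONS if s == entity and r == relation]
    if relation in SYMMETRIC:
        found += [s for s, r, o in ASSOCIATIONS if o == entity and r == relation]
    return found


def what_you_need_to_know(entity, stories_by_entity):
    """Stories the user asked for, followed by stories on the entity's competitors."""
    asked_for = list(stories_by_entity.get(entity, []))
    associated = [
        story
        for rival in related(entity, "COMPETES_WITH")
        for story in stories_by_entity.get(rival, [])
    ]
    return asked_for + associated


# Placeholder story identifiers, not real headlines.
stories = {"Commerce One": ["story-c1"], "Ariba": ["story-a1", "story-a2"]}
print(what_you_need_to_know("Commerce One", stories))
```

A keyword index alone would return only `story-c1`; the association is what brings the Ariba stories into the result.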

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Components: Extractor Agents 1-3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML, or other format

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Output: Story on Cisco; Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY
• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
• Technology, Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
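To make "logic/rules which allow us to draw inference" concrete, here is a minimal sketch of one rule applied over plain triples. It is not a description logic reasoner and uses no RDF library; the triple layout and the individual "Leo" and class "lion" are made up for illustration, while "carnivore" and "animal" echo the class names used in the OIL example.

```python
def infer_types(triples):
    """Apply (x type C) and (C subClassOf D) => (x type D) until nothing new appears."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        new = {
            (x, "type", d)
            for (x, p, c) in facts if p == "type"
            for (c2, q, d) in facts if q == "subClassOf" and c2 == c
        }
        if not new <= facts:
            facts |= new
            changed = True
    return facts


data = {
    ("Leo", "type", "lion"),            # hypothetical individual and class
    ("lion", "subClassOf", "carnivore"),
    ("carnivore", "subClassOf", "animal"),
}
inferred = infer_types(data) - data
print(sorted(inferred))
```

The data layer only states that Leo is a lion; the rule layer is what lets a query for animals find Leo.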

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

• Logical Layer: formal semantics and reasoning support - OIL, DAML-O
• Schema Layer: definition of vocabulary - RDF Schema
• Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

HP 70

Metadata Usage Keyword Attribute and Content Based Access

The VisualHarness system at LSDISUGA

HP 71

Keyword Search vs Attribute Search with Semantic metadata

Virage Search on football touchdown

Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline

Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline

Metadata from Typical Cataloging of Football

Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31 Pit 24

httpwwwnflcom

Quandry Ismail and Tony Banks hook up for their third long touchdown this time on a 76-yarder to extend the Ravenrsquos lead to 31-24 in the third quarter

ProfessionalRavens SteelersBal 31 Pit 24Quandry Ismail Tony BanksTouchdownNFLcom2022000

LeagueTeamsScore

PlayersEvent

Produced byPosted date

HP 72

Taaleersquos Semantic Search

Highly customizable precise and freshest AV search

Context and Domain Specific Attributes Uniform Metadata for Content from Multiple Sources Can be sorted by any field

Delightful relevant informationexceptional targeting opportunity

HP 73

Cre

atin

g a

Web

of

rela

ted

info

rmat

ion

Wha

t can

a c

onte

xt d

o

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp

2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU

3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory

4 Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream

HOT Topics

1 Election 2000
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge

2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security

3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources
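
As a rough sketch of how structured metadata could drive the offers listed on this slide, the Python below derives the "Buy ..." links from a metadata record for the asset being viewed. The record layout (`title`, `cast`) and the function name `build_offers` are assumptions made for illustration; the slide does not describe the actual mechanism.

```python
def build_offers(asset_metadata):
    """Derive targeted offer labels from the structured metadata of an asset."""
    # One offer per person named in the (hypothetical) "cast" attribute.
    offers = [f"Buy {person} Videos" for person in asset_metadata.get("cast", [])]
    # One more offer for the title itself, if the record has one.
    title = asset_metadata.get("title")
    if title:
        offers.append(f"Buy {title} Video")
    return offers


# Hypothetical metadata record for the asset the user is viewing.
INSIDER = {
    "title": "The Insider",
    "cast": [
        "Al Pacino",
        "Russell Crowe",
        "Christopher Plummer",
        "Diane Venora",
        "Philip Baker Hall",
    ],
}

if __name__ == "__main__":
    for offer in build_offers(INSIDER):
        print(offer)
```

Because the offers are computed from attributes rather than from page text, the same function would work for any asset whose metadata carries a title and cast, whichever source the metadata came from.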

HP 82

Web Extreme Personalization

Diagram labels: Realtime Feeds; Web Sites and Pages; Content Databases; Interests and Preferences; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Time-Shifted Content Aggregator; Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks, News, Sports, Music, MyMedia

The user has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site.

The user’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks, News, Sports, Music, MyMedia

My Stocks list: CSCO, NT, IBM, Market

Clicking on MyStocks brings down the user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

The search box at the bottom is a semantic search that understands the financial domain and has knowledge of the user’s portfolio. Typically, a search can be done by typing one word or by selecting from a dynamic, personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks, News, Sports, Music, MyMedia

My Stocks list: CSCO, NT, IBM, Market

CSCO submenu: Analyst Call, Conf Call, Earnings

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks, News, Sports, Music, MyMedia

My Stocks list: CSCO, NT, IBM, Market

CSCO submenu: Analyst Call, Conf Call, Earnings

CSCO Analysis

11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call entry focuses on the source and the analyst name or company. The icons denote additional metadata, such as “Strong Buy” by the H&Q analyst.
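
One way to picture "just the right metadata to meet screen constraints" is a per-content-type field priority that is cut off at the display width. The sketch below is an assumption-laden illustration, not Taalee's algorithm: the `FIELD_PRIORITY` table, the record keys, and the character budget are all hypothetical, and dates are compared as `MM/DD` strings, which only works within a single year.

```python
# Which metadata fields matter most for each content type (hypothetical table).
FIELD_PRIORITY = {
    "analyst_call": ["date", "source", "analyst"],
}

CALLS = [
    {"type": "analyst_call", "date": "11/06", "source": "CBS", "analyst": "Langlesis"},
    {"type": "analyst_call", "date": "11/08", "source": "ON24", "analyst": "Payne"},
    {"type": "analyst_call", "date": "11/07", "source": "ON24", "analyst": "H&Q",
     "rating": "Strong Buy"},  # extra metadata, shown as an icon rather than text
]


def render_line(item, max_chars):
    """Add fields in priority order until the next one would overflow the line."""
    parts = []
    for name in FIELD_PRIORITY.get(item["type"], ["date", "source"]):
        value = item.get(name)
        if not value:
            continue
        if len(" ".join(parts + [value])) > max_chars:
            break
        parts.append(value)
    return " ".join(parts)


def listing(items, max_chars):
    """Newest first, one line per item, each trimmed to the screen width."""
    ordered = sorted(items, key=lambda i: i["date"], reverse=True)
    return [render_line(i, max_chars) for i in ordered]


if __name__ == "__main__":
    print(listing(CALLS, max_chars=20))  # wide enough for date, source, analyst
    print(listing(CALLS, max_chars=10))  # narrow screen: date and source only
```

The same records produce a shorter line on a narrower device, which is the effect the slide attributes to semantic filtering.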

HP 87

iTV: Taalee’s Extreme Personalization

Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content “Programs”; Semantic Engine™; Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Diagram labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing; Categorize; Catalog; Integrate; Rich Data Metabase; Production Support; Sony; Filter; Search; Consolidate; Personalize; Archive; Licensing; Syndication

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria’s commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can’t Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee’s semantic metadata enables powerful access to content used by the Enterprise’s customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can’t Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore’s appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata’s role in emerging iTV infrastructure

Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Node “Cisco Systems”; Enhanced XML Description; Taalee Semantic Engine; Metadata-rich Value-added Node; GREAT USER EXPERIENCE

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Object Content Information (OCI)
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation

Usage

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the “right” keyword

For example, if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”.

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
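
The enrichment idea can be sketched as a lookup against a small table of known relationships. Everything named here is hypothetical: `WORLD_MODEL` is a toy stand-in for a knowledge base (it is not Taalee's WorldModel), and the relation and field names (`plays_for`, `teams`, `sports`, `leagues`) are assumptions chosen for the example.

```python
# Toy knowledge base: entity -> facts about it (hypothetical structure).
WORLD_MODEL = {
    "Roger Clemens": {"type": "PERSON", "plays_for": "New York Yankees"},
    "Jamal Anderson": {"type": "PERSON", "plays_for": "Atlanta Falcons"},
    "New York Yankees": {"type": "TEAM", "sport": "Baseball"},
    "Atlanta Falcons": {"type": "TEAM", "sport": "Football", "league": "NFL"},
}


def _add(record, key, value):
    """Append value to record[key] unless it is already present."""
    values = record.setdefault(key, [])
    if value not in values:
        values.append(value)


def enrich(extracted):
    """Return a copy of the extracted metadata plus value-added fields."""
    enriched = {key: list(values) for key, values in extracted.items()}
    for player in extracted.get("players", []):
        team = WORLD_MODEL.get(player, {}).get("plays_for")
        if not team:
            continue  # unknown player: nothing can be added
        _add(enriched, "teams", team)
        team_facts = WORLD_MODEL.get(team, {})
        if "sport" in team_facts:
            _add(enriched, "sports", team_facts["sport"])
        if "league" in team_facts:
            _add(enriched, "leagues", team_facts["league"])
    return enriched


def is_match(metadata, field_name, value):
    """Attribute match used by a search over enriched records."""
    return value in metadata.get(field_name, [])


if __name__ == "__main__":
    story = {"players": ["Roger Clemens"]}  # the text never mentions the Yankees
    print(is_match(story, "teams", "New York Yankees"))          # not findable as extracted
    print(is_match(enrich(story), "teams", "New York Yankees"))  # findable after enrichment
```

The story becomes retrievable by team only after enrichment, because the team name was never in the extracted text.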

HP 96

Guided Demo for Value-Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson

• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked “REAL”)

• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield

• Click on the first result (titled “I want out”) and view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked “REAL”)

• View the source HTML page that has the original story and locate this story with the heading “I want out”

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 – Snapshots (“Jamal Anderson”)

Search for ‘Jamal Anderson’ in ‘Football’

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search

HP 99

Example 2 – Snapshots (“Gary Sheffield”)

Search for ‘Gary Sheffield’ in ‘Baseball’

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content – Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of content provider that produced the asset

Posted Date: date of asset posting – extracted automatically

Player Names: names of players mentioned explicitly in the asset – extracted automatically

Team Name: name of team for which the player plays – not mentioned explicitly in asset – value-added using Taalee’s semantic relationships

League Name: name of league to which the player’s team belongs – not mentioned explicitly in asset – value-added by Taalee’s processing based on semantic associations

Sport: name of sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways the users choose to interact.

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the term “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company ‘Commerce One’

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
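
A minimal sketch of how a competes-with association could surface competitor news alongside what the user asked for. The relation table reuses the Commerce One and Ariba facts from the previous slide; the triple layout, the function names, and the placeholder headlines are assumptions for illustration only.

```python
# (subject, predicate, object) facts; the relations come from the slide text.
RELATIONS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]
SYMMETRIC = {"competes_with"}  # assumed to hold in both directions

# Placeholder news items; the headlines are invented for the example.
NEWS = [
    {"headline": "Placeholder story A", "companies": ["Commerce One"]},
    {"headline": "Placeholder story B", "companies": ["Ariba"]},
    {"headline": "Placeholder story C", "companies": ["Cisco Systems"]},
]


def related(entity, predicate):
    """Entities linked to `entity` by `predicate`, honoring symmetric relations."""
    found = []
    for subject, pred, obj in RELATIONS:
        if pred != predicate:
            continue
        if subject == entity and obj not in found:
            found.append(obj)
        elif pred in SYMMETRIC and obj == entity and subject not in found:
            found.append(subject)
    return found


def search_with_associations(company):
    """What the user asked for, plus news on the company's competitors."""
    competitors = related(company, "competes_with")
    return {
        "asked_for": [n["headline"] for n in NEWS if company in n["companies"]],
        "competitor_news": [
            n["headline"] for n in NEWS
            if any(c in n["companies"] for c in competitors)
        ],
    }


if __name__ == "__main__":
    print(search_with_associations("Commerce One"))
```

A plain keyword index would return only the first list; the second list exists because the relationship itself is stored as data.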

HP 104

Metadata-centric Content Management Architecture

Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML, or other format; Story on Cisco; Story on Lucent

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco’s competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as “semantically related” to the Cisco story; passed on to Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What You Need to Know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic search, personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., the InfoQuilt project at the LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = “Frame System for WWW”

Source: www.ontoknowledge.org/oil
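
To make the "logic layer" idea concrete, the sketch below applies two simple RDFS-style rules to a set of triples until nothing new can be derived. It is far weaker than a description logic and is not the mechanism proposed on the slide; it only shows how rules turn stated facts into inferred ones. The `ex:` names and the `infer` function are invented for the example.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer(triples):
    """Forward-chain two rules over (subject, predicate, object) triples.

    Rule 1: subClassOf is transitive.
    Rule 2: an instance of a class is also an instance of its superclasses.
    """
    known = set(triples)
    changed = True
    while changed:
        changed = False
        derived = set()
        for s, p, o in known:
            if p not in (SUBCLASS, TYPE):
                continue
            for s2, p2, o2 in known:
                # Look for "o subClassOf o2" and lift the statement about s.
                if p2 == SUBCLASS and s2 == o:
                    derived.add((s, p, o2))
        if not derived <= known:
            known |= derived
            changed = True
    return known


if __name__ == "__main__":
    facts = {
        ("ex:cow", SUBCLASS, "ex:herbivore"),
        ("ex:herbivore", SUBCLASS, "ex:animal"),
        ("ex:daisy", TYPE, "ex:cow"),
    }
    for triple in sorted(infer(facts) - facts):
        print(triple)  # only the inferred triples
```

Nothing in the input says that `ex:daisy` is an animal; that statement appears only because the rules were applied to the data.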

HP 109

Semantic Web - next step in Web evolution

“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]

“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]

“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of the Semantic Web

Logical Layer: formal semantics and reasoning support – OIL, DAML-O

Schema Layer: definition of vocabulary – RDF Schema

Data Layer: simple data model and syntax for metadata – RDF

OIL – as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
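
The rule above can be read as an executable predicate. The sketch below is one possible reading, with assumptions stated plainly: the date difference is taken in days and the distance in miles (the units a later slide uses), the earthquake must not precede the test, distance is great-circle distance computed with the haversine formula, and the `Event` class and sample events are invented. It is not the InfoQuilt implementation.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Event:
    name: str
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days_after = (quake.event_date - test.event_date).days
    if not 0 <= days_after < max_days:
        return False
    return distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles


def candidate_pairs(tests, quakes):
    """All (test, quake) name pairs that satisfy the rule."""
    return [(t.name, q.name) for t in tests for q in quakes if may_have_caused(t, q)]


if __name__ == "__main__":
    # Invented sample events, for illustration only.
    tests = [Event("test-1", date(2000, 1, 1), 37.0, -116.0)]
    quakes = [
        Event("quake-near-soon", date(2000, 1, 10), 36.0, -117.0),
        Event("quake-near-late", date(2000, 3, 1), 36.0, -117.0),
        Event("quake-before", date(1999, 12, 25), 36.0, -117.0),
    ]
    print(candidate_pairs(tests, quakes))
```

With a 10000-mile threshold almost any two points on Earth qualify, since the largest possible great-circle distance is roughly 12,437 miles, so in this reading the time window does most of the filtering.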

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900

Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
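
The per-year, per-magnitude-range counts described in these queries amount to a simple binning step. The sketch below shows one way to do it; the bin boundaries follow the ranges named on the slide, but how a value exactly on a boundary is assigned (for example 6.0 or 9.0) is an assumption, as are the function names and the sample records.

```python
from collections import Counter

# (low, high, label): a magnitude m falls in a bin when low <= m < high.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Label of the range containing the magnitude, or None if below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year onward."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive span."""
    total = sum(
        n for (year, lab), n in counts.items()
        if lab == label and first_year <= year <= last_year
    )
    return total / (last_year - first_year + 1)


if __name__ == "__main__":
    # Invented (year, magnitude) records, for illustration only.
    sample = [(1901, 5.9), (1901, 6.5), (1902, 6.1), (1950, 7.2), (1899, 8.0), (1950, 4.0)]
    counts = counts_per_year(sample)
    print(counts)
    print(average_per_year(counts, "6-7", 1901, 1902))
```

Comparing `average_per_year` for a range before and after a chosen year is the kind of check behind the slide's observation about magnitudes above 7.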

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline

Metadata from Typical Cataloging of Football

Assets

Taalee Metadata on Football Assets

Rich Media Reference Page

Baltimore 31 Pit 24

httpwwwnflcom

Quandry Ismail and Tony Banks hook up for their third long touchdown this time on a 76-yarder to extend the Ravenrsquos lead to 31-24 in the third quarter

ProfessionalRavens SteelersBal 31 Pit 24Quandry Ismail Tony BanksTouchdownNFLcom2022000

LeagueTeamsScore

PlayersEvent

Produced byPosted date

HP 72

Taaleersquos Semantic Search

Highly customizable precise and freshest AV search

Context and Domain Specific Attributes Uniform Metadata for Content from Multiple Sources Can be sorted by any field

Delightful relevant informationexceptional targeting opportunity

HP 73

Cre

atin

g a

Web

of

rela

ted

info

rmat

ion

Wha

t can

a c

onte

xt d

o

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY amp CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries amp Hot Topics

PERSONALIZATION

3 Julia Roberts Collection

Movie Trailer Notting Hill

Trailer - Runaway Bride

Patrick

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sunhellip

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1 My Stock Portfolio

Microsoft suffers serious hack attack

Cisco Systems Inc

Analyst Safa Rashtchy on Yahoo

PeopleSoft Inc

ATampT Corp

2 My Football Fantasy Team

Gators Spurrier ready for big game

Techs Vick looks to become complete QB

Bucs excited about Hamilton

Jasper Sanks rumbles into the end zonehellip

Edwards explains reasons for leaving BYU morehellip

morehellip

morehellip

morehellip

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Coles security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II morehellip

HOT Topics

morehellip

morehellip

HP 80

Metadata Targeting

SemanticInteractive Targeting

Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site

Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst

HP 87

iTV Taaleersquos Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in

This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO

Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata

Conference Call itself can have embedded metadata to support personalization andinteractivity

HP 89

Metadata in Enterprise Apps

Filter Search ConsolidatePersonalize ArchiveLicensing Syndication

Production SupportProduction SupportSony

Categorize

Catalog

Integrate

CollectionCollection ProcessingProcessing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90

t

A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon

At least 60 people died in this needless fire senior local official Karimu Alabi said

Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses

At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Video: Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels]

Encoding side: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree; Node = AVO Object

Delivery side: Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE

Taalee Semantic Engine: Node "Cisco Systems"; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node

Example node metadata:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
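The idea of value-added metadata can be illustrated with a minimal Python sketch. It is not Taalee's implementation: the two lookup tables, the function names and the metadata layout are all assumptions made for illustration, populated with the player, team and league values used in the guided demos.

```python
# Hypothetical semantic associations (a stand-in for a real knowledge base).
PLAYS_FOR = {
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
LEAGUE_OF = {
    "Atlanta Falcons": "NFL",
    "Los Angeles Dodgers": "National League",
}


def enrich(metadata):
    """Return a copy of the metadata with Teams and League inferred from Players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        team = PLAYS_FOR.get(player)
        if team is None:
            continue  # no association known; keep the asset as extracted
        teams = enriched.setdefault("Teams", [])
        if team not in teams:
            teams.append(team)
        league = LEAGUE_OF.get(team)
        if league is not None:
            leagues = enriched.setdefault("League", [])
            if league not in leagues:
                leagues.append(league)
    return enriched


def keyword_match(metadata, query):
    """Purely syntactic search: does any metadata value contain the query?"""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values() for value in values)


# Metadata extracted directly from the asset: no team or league is mentioned.
extracted = {"Producer": ["NFL.com"], "Players": ["Jamal Anderson"]}
enriched = enrich(extracted)
```

With these assumed tables, a keyword search for "Atlanta Falcons" misses the extracted metadata but finds the enriched copy, which is the behavior the slide describes.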

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that neither Team=Atlanta Falcons nor League=NFL was present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that neither Team=Los Angeles Dodgers nor League=National League was present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of content provider that produced the asset

Posted Date: date of asset posting - extracted automatically

Player Names: names of players mentioned explicitly in the asset - extracted automatically

Team Name: name of team for which the player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships

League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations

Sport: name of sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram labels]

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)

Extractor Agent 1; Extractor Agent 2; Extractor Agent 3

Semantic Engine; World Model; Taalee Metabase

Outputs: XCM-compliant metadata, XML or other format; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Flow:

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Story on Cisco; Story on Lucent
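The four-step flow on this slide (take a story, consult the knowledge base for competitors, pick stories about those competitors from the feeds) can be sketched in a few lines of Python. This is only an illustration under assumptions: the `COMPETES_WITH` table, the `Story` record and `related_stories` are hypothetical names, and the two competitor pairs are the ones named in the slides.

```python
from dataclasses import dataclass

# Hypothetical knowledge-base fragment: COMPANY competes with COMPANY.
COMPETES_WITH = {
    "Cisco": {"Lucent"},
    "Commerce One": {"Ariba"},
}


@dataclass(frozen=True)
class Story:
    title: str
    source: str
    companies: frozenset  # companies the story's metadata says it is about


def related_stories(story, feed):
    """Pick stories from the feed that are about competitors of the story's companies."""
    competitors = set()
    for company in story.companies:
        competitors |= COMPETES_WITH.get(company, set())
    competitors -= story.companies  # a company is not its own competitor
    return [other for other in feed
            if other is not story and other.companies & competitors]


cisco_story = Story("Story on Cisco", "internal source", frozenset({"Cisco"}))
feed = [
    cisco_story,
    Story("Story on Lucent", "external feed", frozenset({"Lucent"})),
    Story("Story on IBM", "external feed", frozenset({"IBM"})),
]
related = related_stories(cisco_story, feed)
```

Here `related` contains only the Lucent story, even though it never mentions Cisco; the link comes from the association, not from shared keywords.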

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

[Diagram labels, centered on COMPANY]

Related Stock News

Industry News

Competition: COMPANIES in Same or Related INDUSTRY; COMPANIES in INDUSTRY with Competing PRODUCTS

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web, Complex Relationships and Knowledge Discovery

E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition

Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945
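The per-year count asked for on this slide is a simple filter-and-group query. The sketch below is an assumption-laden illustration, not the InfoQuilt query: the `(year, magnitude)` record layout and the function name are hypothetical, and the sample records are made up purely to exercise the code.

```python
from collections import Counter


def quakes_per_year(quakes, min_magnitude=5.8, start_year=1900):
    """Count earthquakes per year at or above min_magnitude, from start_year on.

    `quakes` is an iterable of (year, magnitude) pairs (an assumed layout).
    """
    counts = Counter(
        year
        for year, magnitude in quakes
        if year >= start_year and magnitude >= min_magnitude
    )
    return dict(sorted(counts.items()))


# Made-up records for illustration only; not real seismic data.
sample = [(1899, 6.1), (1900, 5.8), (1900, 5.2), (1946, 6.4), (1946, 7.1)]
per_year = quakes_per_year(sample)
```

For this sample the 1899 record is dropped by the start year and the 5.2 record by the magnitude threshold, leaving one event in 1900 and two in 1946.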

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
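The "NuclearTest Causes Earthquake" rule and the pair-finding query can be sketched as a predicate over two event records. Everything here is an assumption for illustration: the `Event` record, the haversine great-circle formula used for `distance`, the reading of "within 30 days after" as 0 to 29 days, and the sample events, which are invented and not real data.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


@dataclass(frozen=True)
class Event:
    name: str
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(a, b):
    """Great-circle distance between two events (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(
        math.radians, (a.latitude, a.longitude, b.latitude, b.longitude))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Rule from the slide: quake follows the test closely in time and space."""
    days = (quake.event_date - test.event_date).days
    return 0 <= days < max_days and distance_miles(test, quake) < max_miles


# Invented events for illustration only.
tests = [Event("test-A", date(1970, 1, 1), 37.0, -116.0)]
quakes = [
    Event("quake-1", date(1970, 1, 15), 36.0, -117.0),   # 14 days later, nearby
    Event("quake-2", date(1969, 12, 20), 36.0, -117.0),  # before the test
    Event("quake-3", date(1970, 3, 1), 36.0, -117.0),    # 59 days later
]
candidate_pairs = [(t.name, q.name)
                   for t in tests for q in quakes if may_have_caused(t, q)]
```

Such a rule only yields candidate pairs; as the slides note with "Is it really true", a temporal and spatial coincidence is a hypothesis to examine, not evidence of causation.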

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) ClassificationTaxonomy amp Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • ResourcesReferences

HP 72

Taalee's Semantic Search

Highly customizable, precise and freshest A/V search

Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field

Delightful, relevant information; exceptional targeting opportunity

HP 73

Creating a Web of related information

What can a context do

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more...

2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU; more...

3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more...

4. Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more...

HOT Topics

1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more...

2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more...

3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels]

Inputs: Realtime Feeds; Time-Shifted Content Aggregator; Web sites and Pages; Content Databases

User profile: Interests, Preferences

Semantic Engine(TM) with Structured Hi-Quality Semantic Metabase

Output: Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen menu] MyStocks | News | Sports | Music | MyMedia | $

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen menu] MyStocks | News | Sports | Music | MyMedia | $

[My Stocks list] CSCO | NT | IBM | Market

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen menu] MyStocks | News | Sports | Music | MyMedia | $

[My Stocks list] CSCO | NT | IBM | Market

[CSCO menu] Analyst Call | Conf Call | Earnings

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen menu] MyStocks | News | Sports | Music | MyMedia | $

[My Stocks list] CSCO | NT | IBM | Market

[CSCO menu] Analyst Call | Conf Call | Earnings

[CSCO Analysis listing] 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by the H&Q Analyst.

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels]

Content Provider (DBS, DISH, Wink, AOL-TV): Content "Programs"

Semantic Engine(TM) with Structured Hi-Quality Semantic Metabase: Meta-Data Tagged Content

User: Immediate Interests, Preferences

Output: Personalized Content Capsules; Redirects and Programming

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter Search ConsolidatePersonalize ArchiveLicensing Syndication

Production SupportProduction SupportSony

Categorize

Catalog

Integrate

CollectionCollection ProcessingProcessing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90

t

A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon

At least 60 people died in this needless fire senior local official Karimu Alabi said

Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses

At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)

Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)

-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush

Video

bull Value-add for production broadcast amp syndication

bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers

bull Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadatarsquos role in emerging iTV infrastructure

MPEG-247MPEG

Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks

Atlanta Falcons Players John KitnaCoaches Mike Holmgren

Dan ReevesLocation Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Content which doescontain the wordsthe user asked for

Extractor Agents

Content which does not contain the words

the user asked for but is about what he asked

for

Value-added Metadata

Content the user did not think to ask for but

which he needs to know

Semantic Associations

+ +

Metadata for Intelligent ContentMetadata for Intelligent Content

Usage

HP 94

Intelligent Contentvia

Value-Added Metadata

HP 95

Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable

users to access content

bull If a keyword is not in the content it cannot be found

bull The burden is on the user to think of and ask for the ldquorightrdquo keyword

For example If a story is about ldquoRoger Clemensrdquo but does not contain the

words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user

searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called

Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)

to add missing metadata to describe content more completely

HP 96

Guided Demo for Value Added Metadata ndashExample one

bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson

bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata

on the following RMR page

bull Here is what you see

Produced by NFLcom Posted Date 9202000 League NFL

Teams Atlanta Falcons Players Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on the first result for Gary Sheffield

View the metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of the content provider that produced the asset

Posted Date: date of asset posting; extracted automatically

Sport: name of sport

Player Names: names of players mentioned explicitly in the asset; extracted automatically

Team Name: name of the team for which the player plays; not mentioned explicitly in the asset; value-added using Taalee's semantic relationships

League Name: name of the league to which the player's team belongs; not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
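The Commerce One example can be sketched in a few lines of Python. This is only an illustration of the idea of typed entities and named relationships, not Taalee's implementation: the `KnowledgeBase` class, the `competes_with` relation name and the sample story titles are all made up for the sketch.

```python
from collections import defaultdict


class KnowledgeBase:
    """Toy store of typed entities and named relationships (hypothetical)."""

    def __init__(self):
        self.entity_types = {}              # entity name -> type, e.g. "COMPANY"
        self.relations = defaultdict(set)   # (subject, relation) -> set of objects

    def add_entity(self, name, entity_type):
        self.entity_types[name] = entity_type

    def relate(self, subject, relation, obj, symmetric=False):
        self.relations[(subject, relation)].add(obj)
        if symmetric:
            self.relations[(obj, relation)].add(subject)

    def related(self, subject, relation):
        return set(self.relations.get((subject, relation), set()))


def associated_stories(kb, stories, company):
    """Return (stories asked for, stories about the company's competitors)."""
    competitors = kb.related(company, "competes_with")
    asked_for = [s["title"] for s in stories if company in s["companies"]]
    related = [
        s["title"]
        for s in stories
        if company not in s["companies"] and competitors & set(s["companies"])
    ]
    return asked_for, related


kb = KnowledgeBase()
kb.add_entity("Commerce One", "COMPANY")
kb.add_entity("Ariba", "COMPANY")
kb.relate("Commerce One", "competes_with", "Ariba", symmetric=True)

# Invented sample stories, tagged with the companies they mention.
stories = [
    {"title": "Commerce One signs marketplace deal", "companies": ["Commerce One"]},
    {"title": "Ariba reports quarterly results", "companies": ["Ariba"]},
    {"title": "Weather update", "companies": []},
]

asked_for, related = associated_stories(kb, stories, "Commerce One")
```

A keyword index alone would return only the first list; the second list exists because the relationship between the two companies is stored as data.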

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)

Components: Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata (XML or other format)

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Output: Story on Cisco; Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
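To make "rules which allow us to draw inferences" concrete, here is a minimal sketch that forward-chains two RDFS-style rules over plain (subject, predicate, object) tuples. It is an illustration only: real RDF uses URIs and a proper reasoner, and the class names and the `infer` function below are invented for the example.

```python
def infer(triples):
    """Apply two rules until nothing new is derived:
       1. (A subClassOf B), (B subClassOf C) => (A subClassOf C)
       2. (x type A),       (A subClassOf B) => (x type B)
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        derived = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                if o != s2 or p2 != "subClassOf":
                    continue
                if p == "subClassOf":
                    derived.add((s, "subClassOf", o2))
                elif p == "type":
                    derived.add((s, "type", o2))
        if not derived <= facts:
            facts |= derived
            changed = True
    return facts


# Hypothetical data: only the first three statements are asserted.
asserted = {
    ("Cisco", "type", "NetworkingCompany"),
    ("NetworkingCompany", "subClassOf", "TechnologyCompany"),
    ("TechnologyCompany", "subClassOf", "Company"),
}
closure = infer(asserted)
```

After inference the fact set also states that Cisco is a Company, although no source said so directly. This is the kind of conclusion a logic layer on top of RDF is meant to provide.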

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition. Semantic Web: the concept that Web-accessible content can be organized semantically rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: formal semantics and reasoning support - OIL, DAML-O

Schema Layer: definition of vocabulary - RDF Schema

Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
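As a rough illustration of how a program might read the herbivore definition, the sketch below parses it with Python's standard `xml.etree.ElementTree`. Several things are assumptions: the snippet is wrapped in an `rdf:RDF` root so that it is well-formed, the punctuation in the resource URLs is restored as best it can be read from the slide, and the `oil` namespace URI is a placeholder rather than the official one.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "oil": "http://example.org/oil#",  # placeholder namespace, not the real OIL URI
}

DOC = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:oil="http://example.org/oil#">
  <rdfs:Class rdf:ID="herbivore">
    <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>
"""


def class_axioms(xml_text):
    """Collect, per class, its named superclasses and its negated classes."""
    rdf_id = "{%s}ID" % NS["rdf"]
    rdf_resource = "{%s}resource" % NS["rdf"]
    axioms = {}
    root = ET.fromstring(xml_text)
    for cls in root.findall("rdfs:Class", NS):
        supers, excluded = [], []
        for sub in cls.findall("rdfs:subClassOf", NS):
            target = sub.get(rdf_resource)
            if target is not None:            # plain named superclass
                supers.append(target.lstrip("#"))
            for operand in sub.findall("oil:NOT/oil:hasOperand", NS):
                excluded.append(operand.get(rdf_resource).lstrip("#"))
        axioms[cls.get(rdf_id)] = {"subclass_of": supers, "not": excluded}
    return axioms


axioms = class_axioms(DOC)
```

The result reads as "a herbivore is an animal that is not a carnivore": the RDF Schema part carries the subclass link, and the OIL extension carries the negation.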

DAML and OIL - Evolving towards Semantic Web

OIL Mission

OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
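The rule above can be written as an ordinary predicate. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes the date difference is in days and the distance in miles (as the later query slide suggests), uses the haversine great-circle formula for `distance`, and requires the earthquake to come on or after the test, which the prose states but the rule leaves implicit. The `Event` type and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(a, b):
    """Great-circle distance between two events (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(
        radians, (a.latitude, a.longitude, b.latitude, b.longitude)
    )
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake.event_date - test.event_date).days
    return 0 <= days < max_days and distance_miles(test, quake) < max_miles


def candidate_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if may_have_caused(t, q)]


# Illustrative coordinates and dates only.
test = Event(date(1998, 5, 11), 27.1, 71.8)
quake_after = Event(date(1998, 5, 30), 37.1, 70.1)
quake_before = Event(date(1998, 5, 1), 37.1, 70.1)
quake_late = Event(date(1998, 7, 1), 37.1, 70.1)
pairs = candidate_pairs([test], [quake_after, quake_before, quake_late])
```

Keeping the thresholds as parameters makes it easy to tighten the rule, for example to a smaller radius, and see how the set of candidate pairs changes.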

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and > 9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
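The grouping query can be sketched as follows. The input format (a list of `(year, magnitude)` pairs), the half-open bin boundaries and the sample values are assumptions made for the illustration; the slide does not say how boundary magnitudes are assigned.

```python
from collections import Counter

# (low, high, label): a magnitude m falls in a bin when low <= m < high.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the bin containing the magnitude, or None."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly number of earthquakes in each magnitude bin."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    years = last_year - first_year + 1
    return {label: counts[label] / years for _, _, label in BINS}


# Made-up sample data covering two years.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1), (1901, 4.0)]
averages = average_per_year(sample, 1900, 1901)
```

Running the same function over the periods before and after 1945 would give the per-bin comparison that the slide's conclusion rests on.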

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lay within a radius of 10000 miles of the earthquake's epicenter

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax

ICE: www.icestandard.org

Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05

XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03

DAML: www.daml.org

NEWSML: newsshowcase.reuters.com

PRISM: www.prismstandard.org/techdev/prismspec1.asp

XCM: www.vignette.com

OIL: www.ontoknowledge.org/oil

SEMANTICWEB: www.semanticweb.org

VOICEXML: www.voicexml.org

MPEG7: www.darmstadt.gmd.de/mobile/MPEG7

Taalee: www.taalee.com

Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 73

Creating a Web of related information

What can a context do?

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content

Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp

2 My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU

3 Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory

4 Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream

HOT Topics

1 Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge

2 Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security

3 Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos

Buy Russell Crowe Videos

Buy Christopher Plummer Videos

Buy Diane Venora Videos

Buy Philip Baker Hall Videos

Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Interests, Preferences; Time-Shifted Content Aggregator; Web sites and Pages; Content Databases; Semantic Engine(TM); Personalized Content; Structured Hi-Quality Semantic Metabase]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

11/08 ON24 Payne

11/07 ON24 H&Q

11/06 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Semantic Engine(TM); Meta-Data Tagged Content; Content "Programs"; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming; Structured Hi-Quality Semantic Metabase]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

[Diagram labels: Filter; Search; Consolidate; Personalize; Archive; Licensing; Syndication; Production Support; Sony; Categorize; Catalog; Integrate; Collection; Processing; Network Content; Affiliate Feeds; Public Sources; Rich Data; Metabase]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon

"At least 60 people died in this needless fire," senior local official Karimu Alabi said

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses

At about the same time a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

Customize Page: Settings | Content | Layout | Color

Gore Demands That Recount Restart (9:40 PM)

Gore Says Fla. Can't Name Electors (4:50 PM)

Bush Meets Colin Powell at Ranch (1:22 PM)

Market Tumbles on Earnings Warning (9:27 AM)

Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast

Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC

(2:53) - 12/06/00 - CBS

(5:16) - 12/06/00 - ABC

(2:46) - 12/06/00 - FOX

(1:33) - 12/06/00 - NBC

(5:33) - 12/06/00

(3:57) - 12/06/00 - CBS

(4:27) - 12/06/00 - ABC

(3:44) - 12/06/00 - FOX

(7:24) - 12/06/00 - CBS

(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC

(2:33) - 12/06/00 - CBS

(3:12) - 12/06/00 - NNS

(0:32) - 12/06/00 - CBS

(1:33) - 12/06/00 - CBS

Description

Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; "Cisco Systems" Node; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
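The Roger Clemens example can be sketched in Python. This is a toy illustration of the principle, not Taalee's patented extraction technology: the `WORLD_MODEL` dictionary, the field names and the story title are all invented for the sketch, and a real system would hold far richer relationships.

```python
# Hypothetical slice of a world model: what is known about a player.
WORLD_MODEL = {
    "Roger Clemens": {
        "Sport": "Baseball",
        "Team": "New York Yankees",
        "League": "American League",
    },
}


def enrich(asset, world_model):
    """Return a copy of the asset's metadata with value-added fields
    derived from the players named in it."""
    enriched = {field: list(values) for field, values in asset.items()}
    for player in asset.get("Players", []):
        for field, value in world_model.get(player, {}).items():
            values = enriched.setdefault(field, [])
            if value not in values:
                values.append(value)
    return enriched


def search(assets, term):
    """Titles of assets whose metadata mentions the term (case-insensitive)."""
    term = term.lower()
    return [
        asset["Title"][0]
        for asset in assets
        if any(term in value.lower() for values in asset.values() for value in values)
    ]


# Metadata extracted from the story itself: the team is never mentioned.
story = {"Title": ["Clemens strikes out ten"], "Players": ["Roger Clemens"]}

hits_before = search([story], "Yankees")
hits_after = search([enrich(story, WORLD_MODEL)], "Yankees")
```

Before enrichment the search for "Yankees" finds nothing; afterwards it finds the story, because the team was added from the relationship between player and team rather than from the text.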

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top 10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL

Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate the story with the heading "Week 3 Top 10 Anderson TD Run"

• Verify that neither Team = Atlanta Falcons nor League = NFL was present in the source content

• Taalee attached this value-added metadata to the asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)

Click on first result for Gary Sheffield

View metadata Note that Team name and League name are also included

in the metadata

Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content ndash Value-Added Metadata

Posted Date

Posted Date

Date of asset posting ndashExtracted automatically

League Name

Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations

Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships

Team NameTeam Name

Producer Name

Producer Name

Rich MediaSports AssetRich Media

Sports Asset

Name of content provider that produced the asset

Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added

by Taalee using its semantic relationships

The asset is richly fully described in the many ways the users chose to interact

Player NamesPlayer Names

SportSportName of

sport

LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset

Name of players mentioned explicitly in the asset ndash Extracted automatically

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

bull Traditional search engines rely solely on (syntactic) keywords to find content

bull They do not understand the meaning context or relationships of keywords

For example a search engine may see that the word ldquoCommerce Onerdquo occurs

but it does not know that Commerce One is a COMPANY which Participates in

the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba

As a result search engines cannot go beyond returning a list (or directory view)

of what the user has asked for Their ability to provide associated information is

extremely limited static and difficult to scale Taaleersquos Semantic Content Model

goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company lsquoCommerce Onersquo

Links to news on companies that compete against

Commerce One

Links to news on companies Commerce One competes

against(To view news on Ariba click

on the link for Ariba)

Crucial news on Commerce Onersquos

competitors (Ariba) can be accessed easily and

automatically

HP 104

Internal Source 1Research

Internal Source 2

External feedsWeb(eg Reuters)

1

2

3

4

Cisco story from PW Source 1passed on to addsemanticassociations

ConsultsKnowledgeBasefor Ciscorsquoscompetition

Returns resultLucent is a competitor of Cisco

Lucent story from external

feeds picked for publishing as ldquosemantically

relatedrdquo to Ciscostory ndash passed

on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata XML or other format

SemanticApplication

ASPEnterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata centricContent Management Architecture

SemanticEngine

World Model

TaaleeMetabase

Third-partyContent Mgmt

AndSyndication

HP 105

Semantic Associationssupported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANYCOMPANYRelated Stock News

Related Stock News

IndustryNews

IndustryNews

CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or

Related INDUSTRY

SECEPAEPA

RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY

Technology Products

Technology ProductsImportant to INDUSTRY or COMPANY

HP 106

Semantic Web Application ExampleFinancial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic SearchPersonalization etc

A vision for future

Semantic Web Complex Relationships and Knowledge Discovery

Eg InfoQuilt project at LSDIS Lab Univ of Georgia

HP 108

Beyond RDF ndash one proposal (cf Ora Lassila)

Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data

RDF + DL = ldquoFrame System for WWWrdquo

Source wwwontoknowledgeorgoil

HP 109

Semantic Web - next step in Web evolution

ldquoA Web in which machine reasoning will be ubiquitous and devastatingly powerfulrdquo [Berners-Lee]

ldquoA place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixturerdquo [Berners-Lee]

ldquoA semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activitiesrdquo [Berners-Lee]

A personal definitionSemantic Web The concept that Web-accessible

content can be organized semantically rather than though syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

a proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic mannerBased on RDF+XMLAgent readable Tags

wwwdamlorg

DAML Example

Sou

rce

http

w

ww

zdn

etc

omp

cwee

kst

orie

sju

mps

04

270

2432

946

00h

tml

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support ndash OIL DAML-O

Schema Layer

Definition of Vocabulary RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL ndash as RDF Extension

ltrdfsClass rdfID=rdquoherbivorerdquogtltrdftype

rdfresource=rdquohttpwwwontoknowledgeorgDefinedClassrdquogtltrdfssubClassOf rdfresource=rdquoanimalrdquogtltrdfssubClassOfgt

ltoilNOTgtltoilhasOperand rdfresource=rdquocarnivorerdquogt

ltoilNOTgtltrdfssubClassOfgt

ltrdfsClassgt

DAML and OIL ndash Evolving towards Semantic Web

OIL MissionOIL is a Web-based representation and inference layer for ontologies which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery Knowledge Discovery --ExampleExample

Earthquake Sources(USGS NEIC)

Nuclear Test Sources(Oklahoma Observatory etc)

Nuclear Test May Cause Earthquakes

Is it really true

Complex RelationshipsComplex Relationships

A nuclear test could have caused an earthquakeif the earthquake occurred some time after thenuclear test was conducted and in a nearby region

NuclearTest Causes Earthquakelt= dateDifference( NuclearTesteventDate

EarthquakeeventDate ) lt 30AND distance( NuclearTestlatitude

NuclearTestlongitudeEarthquakelatitudeEarthquakelongitude ) lt 10000

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and > 9 on the Richter scale, find the average number of earthquakes per year, starting from 1900.

The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
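
The grouping query can be sketched as below. This is an illustrative sketch only: the `(year, magnitude)` input format, the half-open bucket boundaries (so a magnitude of exactly 9 lands in the last bucket), and the function names are assumptions, and the sample records are invented.

```python
from collections import Counter

# (low, high, label); each bucket covers low <= magnitude < high.
BUCKETS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def bucket_for(magnitude):
    """Return the label of the magnitude range, or None if below 5.8."""
    for low, high, label in BUCKETS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, start_year, end_year):
    """Average yearly earthquake count per magnitude range.

    quakes is an iterable of (year, magnitude) pairs.
    """
    counts = Counter()
    for year, magnitude in quakes:
        label = bucket_for(magnitude)
        if label is not None and start_year <= year <= end_year:
            counts[label] += 1
    n_years = end_year - start_year + 1
    return {label: counts[label] / n_years for _, _, label in BUCKETS}


# Invented sample records, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1899, 8.0), (1901, 5.0)]
print(average_per_year(sample, 1900, 1901))
```

Running the same function over two periods (for example before and after a chosen year) gives the per-range comparison that the slide uses to argue about which magnitudes changed.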

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • What RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-Layered Architecture of the Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

Taalee Directory

Georgia Bulldogs

System recognizes ENTITY & CATEGORY

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic applications for highly relevant and fresh content: personalization and targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION - Personalized Queries

1 My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp

2 My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU

3 Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory

4 Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream

HOT Topics

1 Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge

2 Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security

3 Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Diagram labels: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: MyStocks | News | Sports | Music | MyMedia]

The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: MyStocks | News | Sports | Music | MyMedia; My Stocks: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

The search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic, personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: MyStocks | News | Sports | Music | MyMedia; My Stocks: CSCO, NT, IBM, Market; CSCO: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: MyStocks | News | Sports | Music | MyMedia; My Stocks: CSCO, NT, IBM, Market; CSCO: Analyst Call, Conf Call, Earnings; CSCO Analysis: 11/08 ON24 Payne, 11/07 ON24 H&Q, 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.

HP 87

iTV: Taalee's Extreme Personalization

Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine™; Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Diagram labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing (Categorize, Catalog, Integrate); Rich Data Metabase; Production Support; Sony; Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Video: Sixty Die In Nigeria Blast

Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

Clip listing (duration - date - source):
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Description

Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Scene Description Tree; Node = AVO Object; Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node ("Cisco Systems"); GREAT USER EXPERIENCE

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

HP 93

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

Metadata for Intelligent Content Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
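
The idea of value-added metadata can be sketched as follows. This is a hypothetical illustration, not Taalee's patented extraction technology: the `WORLD_MODEL` dictionary, the metadata field names, and the functions `enrich` and `matches` are assumptions introduced for this example.

```python
# Hypothetical, hand-written knowledge base standing in for a world model.
WORLD_MODEL = {
    "Roger Clemens": {"type": "PERSON", "plays_for": "New York Yankees"},
    "New York Yankees": {"type": "TEAM", "sport": "Baseball"},
}


def _add(metadata, field, value):
    """Append value to a metadata field without creating duplicates."""
    values = metadata.setdefault(field, [])
    if value not in values:
        values.append(value)


def enrich(metadata):
    """Return a copy of the metadata with team and sport added for known players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = WORLD_MODEL.get(player)
        if facts is None:
            continue
        team = facts["plays_for"]
        _add(enriched, "teams", team)
        _add(enriched, "sports", WORLD_MODEL[team]["sport"])
    return enriched


def matches(metadata, query):
    """Keyword match over all metadata values (case-insensitive substring)."""
    q = query.lower()
    return any(q in value.lower() for values in metadata.values() for value in values)


story = {"players": ["Roger Clemens"]}       # extracted from the story text
print(matches(story, "Yankees"))             # keyword search on raw metadata
print(matches(enrich(story), "Yankees"))     # search after value-added metadata
```

The search logic itself stays a plain keyword match; only the metadata it runs over changes, which is why the enriched story becomes findable under a team name it never mentions.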

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

• Producer Name: name of content provider that produced the asset

• Posted Date: date of asset posting - extracted automatically

• Player Names: names of players mentioned explicitly in the asset - extracted automatically

• Team Name: name of team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships

• League Name: name of league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations

• Sport: name of sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
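
A small sketch of how semantic associations could drive such a result page is given below. It is an assumption-laden illustration rather than a description of the Taalee Semantic Engine: the triple list, the placeholder news items, and the functions `competitors` and `search` are invented for this example.

```python
# Hypothetical associations, written as (subject, relationship, object).
ASSOCIATIONS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Placeholder catalog entries; titles are not real headlines.
NEWS = [
    {"title": "Commerce One story (placeholder)", "entities": ["Commerce One"]},
    {"title": "Ariba story (placeholder)", "entities": ["Ariba"]},
    {"title": "Unrelated story (placeholder)", "entities": ["Cisco"]},
]


def competitors(company):
    """Companies linked to the given one by competes_with, in either direction."""
    found = set()
    for subject, relationship, obj in ASSOCIATIONS:
        if relationship == "competes_with":
            if subject == company:
                found.add(obj)
            elif obj == company:
                found.add(subject)
    return found


def search(company):
    """Return what was asked for plus semantically related competitor news."""
    rivals = competitors(company)
    asked = [n["title"] for n in NEWS if company in n["entities"]]
    related = [n["title"] for n in NEWS if rivals & set(n["entities"])]
    return {"asked_for": asked, "related": related}


print(search("Commerce One"))
```

Treating `competes_with` as symmetric is a design choice of this sketch; it lets a search for Ariba surface Commerce One news without storing the relationship twice.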

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)

Components: Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML or other format

1 Cisco story from PW Source 1 passed on to add semantic associations

2 Consults KnowledgeBase for Cisco's competition

3 Returns result: Lucent is a competitor of Cisco

4 Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Story on Lucent

Story on Cisco

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

• Related Stock News

• Industry News

• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

• Technology, Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic search, personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition

Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

Taalee Directory

Careless whisper

HP 76

Semantic Relationships

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries amp Hot Topics

PERSONALIZATION

3 Julia Roberts Collection

Movie Trailer Notting Hill

Trailer - Runaway Bride

Patrick

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sunhellip

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1 My Stock Portfolio

Microsoft suffers serious hack attack

Cisco Systems Inc

Analyst Safa Rashtchy on Yahoo

PeopleSoft Inc

ATampT Corp

2 My Football Fantasy Team

Gators Spurrier ready for big game

Techs Vick looks to become complete QB

Bucs excited about Hamilton

Jasper Sanks rumbles into the end zonehellip

Edwards explains reasons for leaving BYU morehellip

morehellip

morehellip

morehellip

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Coles security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II morehellip

HOT Topics

morehellip

morehellip

HP 80

Metadata Targeting

SemanticInteractive Targeting

Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site

Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen: menu (MyStocks, News, Sports, Music, MyMedia) with the My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user’s portfolio. Typically, a search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen: My Stocks list (CSCO, NT, IBM, Market); under CSCO: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless screen: CSCO Analysis, listing Analyst Calls: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as “Strong Buy” by the H&Q analyst.

HP 87

iTV: Taalee’s Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content “Programs”; Semantic Engine™; Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information – including transcript, participation, etc. – all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Diagram labels: Filter; Search; Consolidate; Personalize; Archive; Licensing; Syndication; Production Support; Sony; Categorize; Catalog; Integrate; Collection; Processing; Network Content; Affiliate Feeds; Public Sources; Rich Data; Metabase]

HP 90

[Screenshot: news page with a sample story, breaking-news headlines, and the story's metadata]

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria’s commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can’t Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee’s semantic metadata enables powerful access to content used by the Enterprise’s customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can’t Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

Clip listings (duration - date - network):
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore’s appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata’s role in emerging iTV infrastructure

[Diagram labels: Video; MPEG-2/4/7; MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; Enhanced XML Description; Node “Cisco Systems”; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Sample Object Content Information:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta

HP 93

Metadata for Intelligent Content

[Diagram labels: Intelligent Metadata Creation; Usage]

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the “right” keyword

For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”.

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
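The enrichment step described on this slide can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not Taalee's actual implementation: the `KNOWLEDGE_BASE` dictionary, the metadata field names, and the `enrich` and `matches` helpers are all hypothetical names introduced here. It shows how a story tagged only with a player name becomes findable by team name once the association PERSON plays for TEAM is applied.

```python
# Hypothetical mini knowledge base: entity name -> facts about that entity.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
}

# Which knowledge-base relation fills which (assumed) metadata field.
RELATION_TO_FIELD = (("team", "teams"), ("sport", "sports"), ("league", "leagues"))


def enrich(metadata):
    """Return a copy of the metadata with values implied by the listed players added."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for relation, field in RELATION_TO_FIELD:
            value = facts.get(relation)
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """Keyword match against metadata values (case-insensitive substring)."""
    needle = query.lower()
    return any(needle in value.lower() for values in metadata.values() for value in values)


# Metadata extracted syntactically from the story: only the player is mentioned.
story = {"players": ["Roger Clemens"]}
enriched_story = enrich(story)

print(matches(story, "Yankees"))           # keyword search on the original metadata
print(matches(enriched_story, "Yankees"))  # same search after value-added metadata
```

The search itself stays a plain keyword match; only the metadata changes, which is the point the slide makes about value-added metadata.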

HP 96

Guided Demo for Value Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked “REAL”)

• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled “I want out”) & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked “REAL”)

• View the source HTML page that has the original story and locate this story with the heading “I want out”

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 – Snapshots (“Jamal Anderson”)

1. Search for ‘Jamal Anderson’ in ‘Football’.

2. Click on the first result for Jamal Anderson.

3. View the metadata. Note that Team name and League name are also included in the metadata.

4. View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search.

HP 99

Example 2 – Snapshots (“Gary Sheffield”)

1. Search for ‘Gary Sheffield’ in ‘Baseball’.

2. Click on the first result for Gary Sheffield.

3. View the metadata. Note that Team name and League name are also included in the metadata.

4. View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content – Value-Added Metadata

Metadata fields for a Rich Media Sports Asset:

Producer Name: name of the content provider that produced the asset

Posted Date: date of asset posting – extracted automatically

Player Names: names of players mentioned explicitly in the asset – extracted automatically

Team Name: name of the team for which the player plays – not mentioned explicitly in the asset – value-added using Taalee’s semantic relationships

League Name: name of the league to which the player’s team belongs – not mentioned explicitly in the asset – value-added by Taalee’s processing based on semantic associations

Sport: name of the sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described, in the many ways users choose to interact with it.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the term “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company ‘Commerce One’

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1, 2, 3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata (XML or other format); Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication]

1. Cisco story from PW Source 1 is passed on to add semantic associations

2. Consults the Knowledge Base for Cisco’s competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds is picked for publishing as “semantically related” to the Cisco story, and passed on to the Dashboard

[Diagram outputs: Story on Cisco; Story on Lucent]
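The four-step flow on this slide (story arrives, knowledge base is consulted for competitors, related stories are picked from other feeds) can be sketched as follows. This is a minimal illustration, not the Semantic Engine's real interface: the `Story` class, the `COMPETES_WITH` table, and the `competitors` and `related_stories` functions are hypothetical names, and the competitor facts are only the two pairs mentioned in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    headline: str
    company: str
    source: str


# Hypothetical knowledge base of "COMPANY competes with COMPANY" facts from the slides.
COMPETES_WITH = {
    "Cisco": {"Lucent"},
    "Commerce One": {"Ariba"},
}


def competitors(company):
    """Competitors of a company, treating the relationship as symmetric (an assumption)."""
    found = set(COMPETES_WITH.get(company, set()))
    for other, rivals in COMPETES_WITH.items():
        if company in rivals:
            found.add(other)
    return found


def related_stories(story, feed):
    """Pick stories from the feed that are about competitors of the story's company."""
    rivals = competitors(story.company)
    return [candidate for candidate in feed if candidate.company in rivals]


cisco_story = Story("Story on Cisco", "Cisco", "Internal Source 1")
external_feed = [
    Story("Story on Lucent", "Lucent", "External feed"),
    Story("Story on IBM", "IBM", "External feed"),
]

for related in related_stories(cisco_story, external_feed):
    print(related.headline, "is semantically related to", cisco_story.headline)
```

Keeping the association lookup separate from the feed filter mirrors the diagram, where the knowledge base is consulted before any external story is selected.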

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What you asked for + What you need to know

[Diagram: associations centered on COMPANY]

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in the same or related INDUSTRY

Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

Technology/Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough:
• we need a “logic layer” on top of RDF
• some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.

RDF + DL = “Frame System for WWW”

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

“A Web in which machine reasoning will be ubiquitous and devastatingly powerful” [Berners-Lee]

“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture” [Berners-Lee]

“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities” [Berners-Lee]

A personal definition

Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of the Semantic Web

Logical Layer: formal semantics and reasoning support – OIL, DAML-O

Schema Layer: definition of vocabulary – RDF Schema

Data Layer: simple data model and syntax for metadata – RDF

OIL – as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>

DAML and OIL – Evolving towards Semantic Web

OIL Mission

OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

    NuclearTest Causes Earthquake <=
        dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
        AND distance( NuclearTest.latitude, NuclearTest.longitude,
                      Earthquake.latitude, Earthquake.longitude ) < 10000
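The rule above can be evaluated directly once its two functions are given a concrete meaning. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes `dateDifference` is measured in days and `distance` in miles, uses the haversine great-circle formula for distance, and additionally requires the earthquake to occur on or after the test date, as the prose on the slide states. The function names, dictionary keys, and sample coordinates are all hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumed unit


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the 'NuclearTest Causes Earthquake' rule to one test/earthquake pair."""
    days_after = (quake["event_date"] - test["event_date"]).days
    close_in_time = 0 <= days_after < max_days
    close_in_space = distance_miles(
        test["latitude"], test["longitude"], quake["latitude"], quake["longitude"]
    ) < max_miles
    return close_in_time and close_in_space


# Made-up sample records, for illustration only.
nuclear_test = {"event_date": date(1990, 1, 1), "latitude": 37.0, "longitude": -116.0}
earthquake = {"event_date": date(1990, 1, 15), "latitude": 36.0, "longitude": -117.0}

print(may_have_caused(nuclear_test, earthquake))
```

The thresholds are parameters so that the 30 and 10000 from the slide can be tightened when exploring the data.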

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
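The per-year, per-range grouping behind these queries can be sketched with a simple counter. This is an illustrative sketch only: the input format (pairs of year and magnitude), the `bucket_for` and `counts_per_year` names, and the choice to place a value on a boundary into the higher range are assumptions, since the slides do not specify them.

```python
from collections import Counter

# Magnitude ranges from the slide; lower bound inclusive, upper bound exclusive (assumed).
BUCKETS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def bucket_for(magnitude):
    """Return the label of the magnitude range, or None if below 5.8."""
    for low, high, label in BUCKETS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year onward."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bucket_for(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [(1950, 6.5), (1950, 6.9), (1950, 7.2), (1899, 8.0), (1960, 5.0)]
for (year, label), count in sorted(counts_per_year(sample).items()):
    print(year, label, count)
```

Averaging over years, as the slide's query asks, would then be a matter of dividing each range's total by the number of years covered.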

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test took place within a radius of 10000 miles of the epicenter of the earthquake.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998.

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • What RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML – Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee’s Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee’s Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee’s Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for the future
  • Beyond RDF – one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of the Semantic Web
  • OIL – as RDF Extension
  • DAML and OIL – Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

Movie Trailer Stepmom

Conspiracy Theory

4 Pink Floyd Collection

Personalized Queries

Set the Controls for the Heart of the Sunhellip

Wish You Were Here

Round And Around

Keep Talking

The Post War Dream

1 My Stock Portfolio

Microsoft suffers serious hack attack

Cisco Systems Inc

Analyst Safa Rashtchy on Yahoo

PeopleSoft Inc

ATampT Corp

2 My Football Fantasy Team

Gators Spurrier ready for big game

Techs Vick looks to become complete QB

Bucs excited about Hamilton

Jasper Sanks rumbles into the end zonehellip

Edwards explains reasons for leaving BYU morehellip

morehellip

morehellip

morehellip

1 Election 2000

2 Middle East Peace Conflict

3 Napster Controversy

Video Explaining the electoral map

Race for White House hots up

Seniors Give Gore Florida Edge

More die as Israel steps up security

Israel braces for suicide bombs

Pentagon probes Coles security

The Brain Behind Napster

Napster Lawsuit

Creative Nomad II morehellip

HOT Topics

morehellip

morehellip

HP 80

Metadata Targeting

SemanticInteractive Targeting

Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site

Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst

HP 87

iTV Taaleersquos Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in

This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO

Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata

Conference Call itself can have embedded metadata to support personalization andinteractivity

HP 89

Metadata in Enterprise Apps

Filter Search ConsolidatePersonalize ArchiveLicensing Syndication

Production SupportProduction SupportSony

Categorize

Catalog

Integrate

CollectionCollection ProcessingProcessing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90

t

A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon

At least 60 people died in this needless fire senior local official Karimu Alabi said

Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses

At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)

Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)

-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush

Video

bull Value-add for production broadcast amp syndication

bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers

bull Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadatarsquos role in emerging iTV infrastructure

MPEG-247MPEG

Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks

Atlanta Falcons Players John KitnaCoaches Mike Holmgren

Dan ReevesLocation Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Extractor Agents: content which does contain the words the user asked for

+

Value-added Metadata: content which does not contain the words the user asked for but is about what he asked for

+

Semantic Associations: content the user did not think to ask for but which he needs to know

Metadata for Intelligent Content Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found.

• The burden is on the user to think of and ask for the "right" keyword.

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and so describe content more completely.
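The idea of adding metadata that the content itself never states can be illustrated with a small sketch. It is only an illustration under stated assumptions: the `ASSOCIATIONS` table, the relation names and the metadata field names are hypothetical and do not describe Taalee's actual WorldModel or engine.

```python
# Hypothetical mini knowledge base; entity and relation names are
# illustrative and are not Taalee's actual WorldModel.
ASSOCIATIONS = {
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("New York Yankees", "plays_sport"): "Baseball",
    ("New York Yankees", "member_of_league"): "American League",
}


def _add(metadata, field, value):
    """Append value to a metadata field, avoiding duplicates."""
    values = metadata.setdefault(field, [])
    if value not in values:
        values.append(value)


def enrich(metadata):
    """Return a copy of metadata with team, sport and league added when known."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("Players", []):
        team = ASSOCIATIONS.get((player, "plays_for"))
        if team is None:
            continue
        _add(enriched, "Teams", team)
        for relation, field in (("plays_sport", "Sport"),
                                ("member_of_league", "League")):
            value = ASSOCIATIONS.get((team, relation))
            if value is not None:
                _add(enriched, field, value)
    return enriched


def matches(metadata, query):
    """Case-insensitive keyword match over metadata values only."""
    q = query.lower()
    return any(q in v.lower() for values in metadata.values() for v in values)


# Metadata extracted from a story that names only the player.
extracted = {"Players": ["Roger Clemens"]}
enriched = enrich(extracted)
```

With only the extracted metadata, a search for "Yankees" misses the story; after enrichment the same keyword search finds it, because the team name was attached from the association table rather than from the text.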

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson.

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page.

• Here is what you see:

Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL").

• View the source HTML page that has the original story and locate the story with the heading "Week 3 top 10 Anderson TD run".

• Verify that neither Team = Atlanta Falcons nor League = NFL was present in the source content.

• Taalee attached this value-added metadata to the asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons.

• Note that other search engines and directories will not be able to do this.

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield.

• Click on the first result (titled "I want out") and view the metadata on the following RMR page.

• Here is what you see:

Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL").

• View the source HTML page that has the original story and locate the story with the heading "I want out".

• Verify that neither Team = Los Angeles Dodgers nor League = National League was present in the source content.

• Taalee attached this value-added metadata to the asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers.

• Note that other search engines and directories will not be able to do this.

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'.

Click on the first result for Jamal Anderson.

View the metadata. Note that the team name and league name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of the team name or league name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'.

Click on the first result for Gary Sheffield.

View the metadata. Note that the team name and league name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of the team name or league name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of the content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Sport: name of sport
Team Name: name of the team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships
League Name: name of the league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways the users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content.

• They do not understand the meaning, context or relationships of keywords.

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
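A minimal sketch of how typed associations let a search return more than the keyword match. The triples mirror the Commerce One example above, but the relation names, the `STORIES` catalog and its headlines are made-up placeholders, not Taalee's Semantic Content Model or real news items.

```python
# Hypothetical typed facts; names mirror the slide but are not Taalee's schema.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Placeholder catalog: headline -> companies tagged in the story's metadata.
STORIES = {
    "Commerce One posts quarterly results": ["Commerce One"],
    "Ariba signs new marketplace customer": ["Ariba"],
    "Weather outlook for the weekend": [],
}


def competitors(company):
    """Treat competes_with as symmetric: look in both directions."""
    forward = [o for s, p, o in TRIPLES
               if s == company and p == "competes_with"]
    backward = [s for s, p, o in TRIPLES
                if o == company and p == "competes_with"]
    return sorted(set(forward + backward))


def stories_about(company):
    return [title for title, tags in STORIES.items() if company in tags]


def search(company):
    """Return what was asked for plus news on the company's competitors."""
    return {
        "asked_for": stories_about(company),
        "competitor_news": {rival: stories_about(rival)
                            for rival in competitors(company)},
    }


result = search("Commerce One")
```

A plain keyword index would return only the `asked_for` list; the `competitor_news` entry exists because the association COMPETES WITH is stored as data that the search can follow.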

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'.

Links to news on companies that compete against Commerce One.

Links to news on companies Commerce One competes against. (To view news on Ariba, click on the link for Ariba.)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically.

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Knowledge Base; Taalee Metabase; XCM-compliant metadata, XML or other format; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 is passed on to add semantic associations.

2. Consults the Knowledge Base for Cisco's competition.

3. Returns result: Lucent is a competitor of Cisco.

4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard.

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.

• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

• Based on RDF + XML

• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: formal semantics and reasoning support - OIL, DAML-O

Schema Layer: definition of vocabulary - RDF Schema

Data Layer: simple data model and syntax for metadata - RDF
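The division of labor between the three layers can be sketched in a few lines. This is a toy model only: the triples, class names and the `subClassOf`-style reasoning are illustrative assumptions, and the code does not use RDF, RDF Schema, OIL or DAML syntax or tooling.

```python
# Data layer: plain subject-predicate-object statements (RDF-style triples).
# All resource and class names here are made up for illustration.
DATA = {
    ("CiscoStory", "type", "NewsStory"),
    ("CiscoStory", "mentions", "Cisco"),
    ("Cisco", "type", "Company"),
}

# Schema layer: vocabulary definitions (RDF Schema-style subclass links).
SCHEMA = {
    ("NewsStory", "subClassOf", "Document"),
    ("Company", "subClassOf", "Organization"),
    ("Organization", "subClassOf", "Agent"),
}


def superclasses(cls):
    """Logic layer: follow subClassOf links transitively."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in SCHEMA:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def types_of(resource):
    """Asserted types plus every type implied by the schema."""
    asserted = {o for s, p, o in DATA if s == resource and p == "type"}
    inferred = set()
    for cls in asserted:
        inferred |= superclasses(cls)
    return asserted | inferred
```

The data layer only states that Cisco is a Company; the conclusion that Cisco is also an Organization and an Agent comes from combining the schema with a reasoning step, which is the role the logical layer plays.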

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
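The rule above can be written as an executable predicate. The following is a sketch under stated assumptions: the slide does not define `dateDifference` or `distance`, so the date difference is taken here as whole days from test to earthquake (and required to be non-negative, following "some time after"), and distance is taken as great-circle distance in miles computed with the haversine formula. The sample records are made up and are not InfoQuilt data.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Rule: quake follows the test by fewer than max_days and is near enough."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if days < 0 or days >= max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


def candidate_pairs(tests, quakes):
    """All (test id, quake id) pairs that satisfy the rule."""
    return [(t["id"], q["id"])
            for t in tests for q in quakes if could_have_caused(t, q)]


# Made-up sample records for illustration only.
tests = [{"id": "T1", "eventDate": date(1970, 1, 1),
          "latitude": 37.0, "longitude": -116.0}]
quakes = [
    {"id": "Q1", "eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0},
    {"id": "Q2", "eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0},
    {"id": "Q3", "eventDate": date(1969, 12, 25), "latitude": 36.0, "longitude": -117.0},
    {"id": "Q4", "eventDate": date(1970, 1, 5), "latitude": -37.0, "longitude": 64.0},
]
pairs = candidate_pairs(tests, quakes)
```

Only Q1 qualifies: Q2 is too late, Q3 precedes the test, and Q4 sits at the antipode of the test site, more than 10000 miles away. Note that a 10000-mile threshold covers most of the globe, so in practice the time window does most of the filtering.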

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945.

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
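The per-range query can be sketched as a small aggregation. The bin edges follow the ranges named on the slide; treating each range as half-open (lower bound included, upper bound excluded) and the input format of `(year, magnitude)` tuples are assumptions, and the sample records are made up.

```python
from collections import Counter

# Magnitude ranges from the slide; half-open intervals are an assumption.
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def bin_label(magnitude):
    """Return the label of the range containing magnitude, or None."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def yearly_average(quakes, start_year, end_year):
    """Average number of earthquakes per year for each magnitude range."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if label is not None and start_year <= year <= end_year:
            counts[label] += 1
    n_years = end_year - start_year + 1
    return {label: counts[label] / n_years for label, _, _ in BINS}


# Made-up (year, magnitude) records for illustration only.
sample = [(1950, 5.9), (1950, 6.5), (1951, 6.2), (1951, 7.4),
          (1899, 8.0), (1950, 5.0)]
averages = yearly_average(sample, 1950, 1951)
```

Running the same aggregation over two periods (for example, before and after 1945) and comparing the results per range is one way to check the observation that only the lower-magnitude ranges change.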

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG-7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998.

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 77

Metadata Application Example

Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing

Please contact Taalee for live demonstrations

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more...

2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...

3 Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more...

4 Pink Floyd Collection
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more...

HOT Topics

1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more...

2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more...

3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...

HP 80

Metadata Targeting

Semantic/Interactive Targeting

Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources
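As a rough illustration of how structured metadata can drive such targeting, the sketch below derives offers from the fields of one asset record. The record layout and field names (`Title`, `Type`, `Cast`) are hypothetical; the slide does not say how the offers were actually generated.

```python
# Hypothetical structured metadata for one asset; field names are illustrative.
asset = {
    "Title": "The Insider",
    "Type": "Movie",
    "Cast": ["Al Pacino", "Russell Crowe", "Christopher Plummer",
             "Diane Venora", "Philip Baker Hall"],
}


def targeted_offers(metadata):
    """Build one offer per cast member, plus one for the movie itself."""
    offers = [f"Buy {person} Videos" for person in metadata.get("Cast", [])]
    if metadata.get("Type") == "Movie" and "Title" in metadata:
        offers.append(f"Buy {metadata['Title']} Video")
    return offers


offers = targeted_offers(asset)
```

Because the offers are built from typed fields rather than from keywords found in free text, an asset that lacks a `Cast` field simply produces fewer offers instead of mismatched ones.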

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Content; Interests, Preferences; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen menu: MyStocks, News, Sports, Music, MyMedia, $]

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen menu: MyStocks, News, Sports, Music, MyMedia, $]

[My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

The search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen menu: MyStocks, News, Sports, Music, MyMedia, $]

[My Stocks list: CSCO, NT, IBM, Market]

[CSCO content types: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen menu: MyStocks, News, Sports, Music, MyMedia, $]

[My Stocks list: CSCO, NT, IBM, Market]

[CSCO content types: Analyst Call, Conf Call, Earnings]

CSCO Analysis
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
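One way to picture semantic filtering for a small screen is a per-content-type priority list of metadata fields, filled in until the line no longer fits. This is a sketch under assumptions: the `FIELD_PRIORITY` table, the record fields and the 16-character limit are hypothetical, and the slide does not describe the actual filtering logic.

```python
# Hypothetical priority lists: which metadata fields matter most per content type.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst"],
    "Earnings": ["date", "source", "period"],
}


def compact_line(item, max_chars):
    """Add fields in priority order while the line still fits the screen width."""
    parts = []
    for field in FIELD_PRIORITY.get(item["type"], ["date"]):
        value = item.get(field)
        if value is None:
            continue
        candidate = " ".join(parts + [value])
        if len(candidate) > max_chars:
            break
        parts.append(value)
    return " ".join(parts)


# Records mirror the listing on the slide; the "rating" field is illustrative.
calls = [
    {"type": "Analyst Call", "date": "11/06", "source": "CBS", "analyst": "Langlesis"},
    {"type": "Analyst Call", "date": "11/08", "source": "ON24", "analyst": "Payne"},
    {"type": "Analyst Call", "date": "11/07", "source": "ON24", "analyst": "H&Q",
     "rating": "Strong Buy"},
]

# Newest first; comparing "MM/DD" strings only works within a single year.
ordered = sorted(calls, key=lambda c: c["date"], reverse=True)
listing = [compact_line(c, max_chars=16) for c in ordered]
```

Fields outside the priority list, such as the rating, never appear in the text line; on the slide that kind of extra metadata is conveyed by an icon instead.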

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Meta-Data Tagged Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Immediate Interests, Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Collection Processing (Categorize, Catalog, Integrate); Rich Data Metabase; Production Support (Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication); Sony]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadatarsquos role in emerging iTV infrastructure

MPEG-247MPEG

Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks

Atlanta Falcons Players John KitnaCoaches Mike Holmgren

Dan ReevesLocation Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Content which doescontain the wordsthe user asked for

Extractor Agents

Content which does not contain the words

the user asked for but is about what he asked

for

Value-added Metadata

Content the user did not think to ask for but

which he needs to know

Semantic Associations

+ +

Metadata for Intelligent ContentMetadata for Intelligent Content

Usage

HP 94

Intelligent Contentvia

Value-Added Metadata

HP 95

Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable

users to access content

bull If a keyword is not in the content it cannot be found

bull The burden is on the user to think of and ask for the ldquorightrdquo keyword

For example If a story is about ldquoRoger Clemensrdquo but does not contain the

words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user

searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called

Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)

to add missing metadata to describe content more completely

HP 96

Guided Demo for Value Added Metadata ndashExample one

bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson

bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata

on the following RMR page

bull Here is what you see

Produced by NFLcom Posted Date 9202000 League NFL

Teams Atlanta Falcons Players Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)

Click on first result for Gary Sheffield

View metadata Note that Team name and League name are also included

in the metadata

Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content ndash Value-Added Metadata

Posted Date

Posted Date

Date of asset posting ndashExtracted automatically

League Name

Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations

Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships

Team NameTeam Name

Producer Name

Producer Name

Rich MediaSports AssetRich Media

Sports Asset

Name of content provider that produced the asset

Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added

by Taalee using its semantic relationships

The asset is richly fully described in the many ways the users chose to interact

Player NamesPlayer Names

SportSportName of

sport

LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset

Name of players mentioned explicitly in the asset ndash Extracted automatically

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

bull Traditional search engines rely solely on (syntactic) keywords to find content

bull They do not understand the meaning context or relationships of keywords

For example a search engine may see that the word ldquoCommerce Onerdquo occurs

but it does not know that Commerce One is a COMPANY which Participates in

the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba

As a result search engines cannot go beyond returning a list (or directory view)

of what the user has asked for Their ability to provide associated information is

extremely limited static and difficult to scale Taaleersquos Semantic Content Model

goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company lsquoCommerce Onersquo

Links to news on companies that compete against

Commerce One

Links to news on companies Commerce One competes

against(To view news on Ariba click

on the link for Ariba)

Crucial news on Commerce Onersquos

competitors (Ariba) can be accessed easily and

automatically

HP 104

Internal Source 1Research

Internal Source 2

External feedsWeb(eg Reuters)

1

2

3

4

Cisco story from PW Source 1passed on to addsemanticassociations

ConsultsKnowledgeBasefor Ciscorsquoscompetition

Returns resultLucent is a competitor of Cisco

Lucent story from external

feeds picked for publishing as ldquosemantically

relatedrdquo to Ciscostory ndash passed

on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata XML or other format

SemanticApplication

ASPEnterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata centricContent Management Architecture

SemanticEngine

World Model

TaaleeMetabase

Third-partyContent Mgmt

AndSyndication

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked For + What You Need to Know

[Diagram: a COMPANY linked to the following kinds of related content]

• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in the same or a related INDUSTRY
• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
• Technology Products: important to INDUSTRY or COMPANY

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

• Research inferred automatically
• Semantically related news not specifically asked for
• Semantic search, personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g. the InfoQuilt project at the LSDIS Lab, Univ. of Georgia

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic (DL) is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

[Figure: DAML example. Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html]

Three-layered Architecture of the Semantic Web

Logical Layer: formal semantics and reasoning support - OIL, DAML-O

Schema Layer: definition of vocabulary - RDF Schema

Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>
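To make the herbivore definition concrete, the following Python sketch shows the kind of inference a logic layer adds on top of plain class declarations. It is not an OIL reasoner: the dictionaries and the functions `superclasses`, `inferred_types`, and `is_consistent` are hypothetical, and treating carnivore as a subclass of animal is an assumption that the slide does not state.

```python
# subClassOf facts; "carnivore" -> "animal" is an assumption for illustration.
SUBCLASS_OF = {
    "herbivore": {"animal"},
    "carnivore": {"animal"},
}

# "herbivore subClassOf NOT carnivore" recorded as an exclusion.
EXCLUDES = {
    "herbivore": {"carnivore"},
}


def superclasses(cls):
    """All classes reachable from cls through subClassOf links."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(asserted):
    """Asserted classes plus everything implied by subClassOf."""
    types = set(asserted)
    for cls in asserted:
        types |= superclasses(cls)
    return types


def is_consistent(asserted):
    """False when the inferred types contain a class that another one excludes."""
    types = inferred_types(asserted)
    return not any(
        excluded in types for cls in types for excluded in EXCLUDES.get(cls, ())
    )
```

With these facts, an individual declared only as a herbivore is also inferred to be an animal, while declaring something as both herbivore and carnivore is flagged as inconsistent.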

DAML and OIL - Evolving towards the Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake sources (USGS, NEIC)

Nuclear test sources (Oklahoma Observatory, etc.)

Nuclear tests may cause earthquakes. Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

    NuclearTest Causes Earthquake <=
        dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
        AND distance( NuclearTest.latitude, NuclearTest.longitude,
                      Earthquake.latitude, Earthquake.longitude ) < 10000
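The rule above can be tried out with a short Python sketch. Several things here are assumptions rather than part of the slide: the haversine formula stands in for the unspecified distance() function, the thresholds are read as days and miles, the earthquake is required to come on or after the test date, and the sample events are synthetic placeholders, not real records.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the rule: the quake follows the test closely in time and space."""
    days_after = (quake["date"] - test["date"]).days
    if not 0 <= days_after < max_days:
        return False
    return (
        distance_miles(test["lat"], test["lon"], quake["lat"], quake["lon"])
        < max_miles
    )


# Synthetic placeholder events, chosen only to exercise the rule.
test_event = {"date": date(2000, 1, 1), "lat": 0.0, "lon": 0.0}
near_later = {"date": date(2000, 1, 11), "lat": 1.0, "lon": 1.0}
far_later = {"date": date(2000, 1, 11), "lat": 0.0, "lon": 180.0}
near_earlier = {"date": date(1999, 12, 27), "lat": 1.0, "lon": 1.0}
near_too_late = {"date": date(2000, 2, 15), "lat": 1.0, "lon": 1.0}
```

Only `near_later` satisfies the rule for `test_event`; the other three fail on distance, on ordering in time, or on the 30-day window.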

Knowledge Discovery - Example

Query: When was the first recorded nuclear test conducted?

Query: Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Observation: increase in the number of earthquakes since 1945.

Knowledge Discovery - Example...

Query: For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and > 9 on the Richter scale, find the average number of earthquakes per year, starting from 1900.

Observation: the number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.

Knowledge Discovery - Example...

Query: Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NewsML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• Semantic Web: www.semanticweb.org
• VoiceXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

Personalized Directory

Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you

Please enter such semantic keywords below

Change Context

Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1. My Stock Portfolio
   • Microsoft suffers serious hack attack
   • Cisco Systems Inc
   • Analyst Safa Rashtchy on Yahoo
   • PeopleSoft Inc
   • AT&T Corp
   more...

2. My Football Fantasy Team
   • Gators' Spurrier ready for big game
   • Tech's Vick looks to become complete QB
   • Bucs excited about Hamilton
   • Jasper Sanks rumbles into the end zone...
   • Edwards explains reasons for leaving BYU
   more...

3. Julia Roberts Collection
   • Movie Trailer: Notting Hill
   • Trailer - Runaway Bride
   • Patrick
   • Movie Trailer: Stepmom
   • Conspiracy Theory
   more...

4. Pink Floyd Collection
   • Set the Controls for the Heart of the Sun...
   • Wish You Were Here
   • Round And Around
   • Keep Talking
   • The Post War Dream
   more...

HOT Topics

1. Election 2000
   • Video: Explaining the electoral map
   • Race for White House hots up
   • Seniors Give Gore Florida Edge
   more...

2. Middle East Peace Conflict
   • More die as Israel steps up security
   • Israel braces for suicide bombs
   • Pentagon probes Cole's security
   more...

3. Napster Controversy
   • The Brain Behind Napster
   • Napster Lawsuit
   • Creative Nomad II
   more...

Metadata Targeting

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of structured metadata and integration from multiple sources

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Time-Shifted Content Aggregator; Web Sites and Pages; Content Databases; Interests and Preferences; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Personalized Content]

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless device screen. Menu: MyStocks, News, Sports, Music, MyMedia]

The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

The user's "Wireless Home Page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless device screen. My Stocks: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

The search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic, personalized menu.

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless device screen. CSCO: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

Application of Semantic Metadata and Automatic Content Enrichment

[Wireless device screen. CSCO Analysis: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
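One way to picture semantic filtering under a screen constraint is the Python sketch below. It is an assumption-laden illustration, not the product's logic: the `FIELD_PRIORITY` table, the `screen_label` function, and the idea of a character budget are all hypothetical, and the sample item reuses the 11/07 ON24 H&Q entry from the slide.

```python
# Hypothetical priority lists: the metadata fields that matter most
# for each content type, most important first.
FIELD_PRIORITY = {
    "analyst_call": ["date", "source", "analyst"],
    "earnings": ["date", "company"],
}
DEFAULT_FIELDS = ["date", "title"]


def screen_label(item, content_type, max_chars):
    """Join the highest-priority field values that fit within max_chars."""
    label = ""
    for field in FIELD_PRIORITY.get(content_type, DEFAULT_FIELDS):
        value = item.get(field)
        if not value:
            continue
        candidate = f"{label} {value}".strip()
        if len(candidate) > max_chars:
            break  # the next field would overflow the screen budget
        label = candidate
    return label


call = {"date": "11/07", "source": "ON24", "analyst": "H&Q", "rating": "Strong Buy"}
wide = screen_label(call, "analyst_call", 20)
narrow = screen_label(call, "analyst_call", 10)
```

With a 20-character budget the label carries date, source, and analyst; with 10 characters it falls back to date and source. The rating is never part of the text label, which matches the slide's use of an icon for that field.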

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Metadata-Tagged Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

Metadata for Automatic Content Enrichment

Interactive Television

• This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
• This screen is customizable, with an interactivity feature using metadata such as whether there is a new Conference Call video on CSCO.
• Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
• The Conference Call itself can have embedded metadata to support personalization and interactivity.

Metadata in Enterprise Apps

[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing (Categorize, Catalog, Integrate); Rich Data; Metabase; Production Support; Sony; Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]

[Screenshot: breaking-news page with video, story text, and metadata]

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market

[Screenshot: breaking-news page with related clips, story text, and metadata]

-- Breaking News --

• Gore Demands That Recount Restart
• Gore Says Fla. Can't Name Electors
• Bush Meets Colin Powell at Ranch
• Market Tumbles on Earnings Warning
• Barak Outlines His Peace Plan

Clips (duration - date - source):

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG-2/4/7; MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced XML Description; Taalee Semantic Engine; "Cisco Systems" Node; Metadata-rich Value-added Node; Object Content Information (OCI); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE]

Example node metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

• License metadata decoder and semantic applications to device makers
• Channel sales through video server vendors, video app servers, and broadcasters

Intelligent Metadata Creation

Metadata for Intelligent Content

• Extractor Agents: content which does contain the words the user asked for
• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
• Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

Intelligent Content via Value-Added Metadata

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found.
• The burden is on the user to think of and ask for the "right" keyword.

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (of the kind COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
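The enrichment step described above can be sketched as a lookup against a small knowledge base. This is a minimal illustration under stated assumptions: `KNOWLEDGE_BASE`, `enrich`, and `matches` are hypothetical names, the relation names are invented for the example, and the facts are limited to the players, teams, and leagues named on these slides.

```python
# Hypothetical knowledge base: (subject, relation) -> object.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Jamal Anderson", "plays_for"): "Atlanta Falcons",
    ("Gary Sheffield", "plays_for"): "Los Angeles Dodgers",
    ("Atlanta Falcons", "member_of"): "NFL",
    ("Los Angeles Dodgers", "member_of"): "National League",
}


def enrich(metadata):
    """Return a copy of metadata with team and league inferred from players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        team = KNOWLEDGE_BASE.get((player, "plays_for"))
        if team is None:
            continue
        if team not in enriched.setdefault("teams", []):
            enriched["teams"].append(team)
        league = KNOWLEDGE_BASE.get((team, "member_of"))
        if league and league not in enriched.setdefault("leagues", []):
            enriched["leagues"].append(league)
    return enriched


def matches(metadata, query):
    """Case-insensitive keyword match against every metadata value."""
    needle = query.lower()
    return any(
        needle in value.lower()
        for values in metadata.values()
        for value in values
    )


extracted = {"players": ["Roger Clemens"]}  # what a syntactic extractor finds
enriched = enrich(extracted)
```

A search for "Yankees" misses `extracted` but finds `enriched`, because the team name was added from the knowledge base rather than from the story text.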

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson.
• Click on the first result (titled "Week 3 Top 10: Anderson TD Run") and view the metadata on the following RMR page.
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL").
• View the source HTML page that has the original story and locate this story with the heading "Week 3 Top 10: Anderson TD Run".
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content.
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team.
• Note that other search engines and directories will not be able to do this.

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield.
• Click on the first result (titled "I want out") and view the metadata on the following RMR page.
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL").
• View the source HTML page that has the original story and locate this story with the heading "I want out".
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content.
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team.
• Note that other search engines and directories will not be able to do this.

Example 1 - Snapshots ("Jamal Anderson")

1. Search for 'Jamal Anderson' in 'Football'.
2. Click on the first result for Jamal Anderson.
3. View the metadata. Note that Team name and League name are also included in the metadata.
4. View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

Example 2 - Snapshots ("Gary Sheffield")

1. Search for 'Gary Sheffield' in 'Baseball'.
2. Click on the first result for Gary Sheffield.
3. View the metadata. Note that Team name and League name are also included in the metadata.
4. View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

Intelligent Content - Value-Added Metadata

[Diagram: a Rich Media Sports Asset and its metadata fields]

• Producer Name: name of the content provider that produced the asset
• Posted Date: date of asset posting; extracted automatically
• Player Names: names of players mentioned explicitly in the asset; extracted automatically
• Team Name: name of the team for which the player plays; not mentioned explicitly in the asset; value-added using Taalee's semantic relationships
• League Name: name of the league to which the player's team belongs; not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations
• Sport: name of the sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

Intelligent Content via Semantic Associations

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content.



Personalized Queries & Hot Topics

PERSONALIZATION

Personalized Queries

1 My Stock Portfolio
  • Microsoft suffers serious hack attack
  • Cisco Systems Inc.
  • Analyst Safa Rashtchy on Yahoo
  • PeopleSoft Inc.
  • AT&T Corp.
  • more…

2 My Football Fantasy Team
  • Gators' Spurrier ready for big game
  • Tech's Vick looks to become complete QB
  • Bucs excited about Hamilton
  • Jasper Sanks rumbles into the end zone…
  • Edwards explains reasons for leaving BYU
  • more…

3 Julia Roberts Collection
  • Movie Trailer: Notting Hill
  • Trailer - Runaway Bride
  • Patrick
  • Movie Trailer: Stepmom
  • Conspiracy Theory
  • more…

4 Pink Floyd Collection
  • Set the Controls for the Heart of the Sun…
  • Wish You Were Here
  • Round And Around
  • Keep Talking
  • The Post War Dream
  • more…

HOT Topics

1 Election 2000
  • Video: Explaining the electoral map
  • Race for White House hots up
  • Seniors Give Gore Florida Edge
  • more…

2 Middle East Peace Conflict
  • More die as Israel steps up security
  • Israel braces for suicide bombs
  • Pentagon probes Cole's security
  • more…

3 Napster Controversy
  • The Brain Behind Napster
  • Napster Lawsuit
  • Creative Nomad II
  • more…

HP 80

Metadata Targeting

Semantic/Interactive Targeting

  • Buy Al Pacino Videos
  • Buy Russell Crowe Videos
  • Buy Christopher Plummer Videos
  • Buy Diane Venora Videos
  • Buy Philip Baker Hall Videos
  • Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests/Preferences; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless menu with MyStocks, News, Sports, Music and MyMedia]

The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless menu with MyStocks, News, Sports, Music and MyMedia; My Stocks list: CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless menu with MyStocks, News, Sports, Music and MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO content types: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless menu with MyStocks, News, Sports, Music and MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO content types: Analyst Call, Conf Call, Earnings]

CSCO Analysis
  • 11/08 ON24 Payne
  • 11/07 ON24 H&Q
  • 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
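The semantic filtering described on this slide can be pictured as choosing, per content type, which metadata fields to display within a screen budget. The following is only a minimal sketch under stated assumptions: the `DISPLAY_FIELDS` table, the `summarize` function, the character budget and the record layout are all hypothetical and are not taken from Taalee's product.

```python
# Hypothetical display priorities per content type (an assumption for
# illustration). "Analyst Call" follows the slide: source and analyst first.
DISPLAY_FIELDS = {
    "Analyst Call": ["date", "source", "analyst"],
    "Earnings": ["date", "company", "quarter"],
}


def summarize(item, max_chars):
    """Join the highest-priority fields of `item` that fit in max_chars."""
    parts = []
    for field in DISPLAY_FIELDS.get(item["type"], ["date", "title"]):
        value = item.get(field)
        if not value:
            continue
        candidate = " ".join(parts + [value])
        if len(candidate) > max_chars:
            break  # the next field would overflow the screen budget
        parts.append(value)
    return " ".join(parts)


# Illustrative record; the values echo the listing on the slide.
call = {
    "type": "Analyst Call",
    "date": "11/07",
    "source": "ON24",
    "analyst": "H&Q",
    "rating": "Strong Buy",  # extra metadata, shown as an icon on the slide
}

wide_screen = summarize(call, 14)
narrow_screen = summarize(call, 10)
```

On a wider screen all three fields fit; on a narrower one the lowest-priority field is dropped, while the full record (including the rating) remains available to the application.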

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine™; Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

  • This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
  • This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.
  • Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
  • The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing; Categorize; Catalog; Integrate; Rich Data Metabase; Production Support; Sony; Filter; Search; Consolidate; Personalize; Archive; Licensing; Syndication]

HP 90

[Screenshot: enterprise news page]

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --
  • Gore Demands That Recount Restart (9:40 PM)
  • Gore Says Fla. Can't Name Electors (4:50 PM)
  • Bush Meets Colin Powell at Ranch (1:22 PM)
  • Market Tumbles on Earnings Warning (9:27 AM)
  • Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Video: Sixty Die In Nigeria Blast
  Produced by: Euronews
  Posted Date: 11/30/2000
  Event: Election 2000
  Location: Tallahassee, Florida, USA
  People: Al Gore, George W. Bush

  • Value-add for production, broadcast & syndication
  • Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
  • Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --
  • Gore Demands That Recount Restart
  • Gore Says Fla. Can't Name Electors
  • Bush Meets Colin Powell at Ranch
  • Market Tumbles on Earnings Warning
  • Barak Outlines His Peace Plan

Clips (duration - date - source):
  • (1:33) - 12/06/00 - ABC
  • (2:53) - 12/06/00 - CBS
  • (5:16) - 12/06/00 - ABC
  • (2:46) - 12/06/00 - FOX
  • (1:33) - 12/06/00 - NBC
  • (5:33) - 12/06/00
  • (3:57) - 12/06/00 - CBS
  • (4:27) - 12/06/00 - ABC
  • (3:44) - 12/06/00 - FOX
  • (7:24) - 12/06/00 - CBS
  • (1:33) - 12/06/00 - CBS
  • (1:33) - 12/06/00 - ABC
  • (2:33) - 12/06/00 - CBS
  • (3:12) - 12/06/00 - NNS
  • (0:32) - 12/06/00 - CBS
  • (1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

Description
  Produced by: CNN
  Posted Date: 12/07/2000
  Reporter: David Lewis
  Event: Election 2000
  Location: Tallahassee, Florida, USA
  People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG-2/4/7; MPEG Encoder; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Node "Cisco Systems"; Enhanced XML Description; Taalee Semantic Engine; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

Example node metadata:
  Produced by: Fox Sports
  Creation Date: 12/05/2000
  League: NFL
  Teams: Seattle Seahawks, Atlanta Falcons
  Players: John Kitna
  Coaches: Mike Holmgren, Dan Reeves
  Location: Atlanta

  • License metadata decoder and semantic applications to device makers
  • Channel sales through Video Server Vendors, Video App Servers and Broadcasters

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation and Usage:

  • Extractor Agents: content which does contain the words the user asked for
  + Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
  + Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

  • If a keyword is not in the content, it cannot be found.
  • The burden is on the user to think of and ask for the "right" keyword.

For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
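The idea on this slide can be illustrated with a toy example. The following is a sketch under assumptions, not Taalee's patented method: the `WORLD_MODEL` dictionary, the function names and the sample story text are all hypothetical, and the "extraction" step is a naive substring match standing in for a real extractor agent.

```python
# Hypothetical miniature "world model": entity -> known facts.
# The Roger Clemens facts come from the slide; the layout is an assumption.
WORLD_MODEL = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball",
                      "team": "New York Yankees"},
    "Jamal Anderson": {"type": "PERSON", "sport": "Football",
                       "team": "Atlanta Falcons"},
    "Atlanta Falcons": {"type": "TEAM", "league": "NFL"},
}


def extract_players(text):
    """Syntactic step: known PERSON names that literally occur in the text."""
    return [name for name, facts in WORLD_MODEL.items()
            if facts["type"] == "PERSON" and name in text]


def enrich(text):
    """Semantic step: add team, league and sport implied by the players."""
    metadata = {"players": extract_players(text),
                "teams": [], "leagues": [], "sports": []}
    for player in metadata["players"]:
        facts = WORLD_MODEL[player]
        team = facts.get("team")
        if team and team not in metadata["teams"]:
            metadata["teams"].append(team)
            league = WORLD_MODEL.get(team, {}).get("league")
            if league and league not in metadata["leagues"]:
                metadata["leagues"].append(league)
        sport = facts.get("sport")
        if sport and sport not in metadata["sports"]:
            metadata["sports"].append(sport)
    return metadata


def matches(query, text, metadata):
    """A story matches if the query occurs in its text or its metadata."""
    q = query.lower()
    if q in text.lower():
        return True
    return any(q in value.lower()
               for values in metadata.values() for value in values)


# Invented story text that never mentions the team.
story = "Roger Clemens struck out ten batters on Tuesday night."
meta = enrich(story)
found_by_keywords_only = matches("Yankees", story, {})
found_with_value_added = matches("Yankees", story, meta)
```

With keyword indexing alone the "Yankees" query misses the story; once the team name has been added as value-added metadata, the same query finds it.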

HP 96

Guided Demo for Value Added Metadata - Example One

  • Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
  • Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
  • Here is what you see:
      Produced by: NFL.com    Posted Date: 9/20/2000    League: NFL
      Teams: Atlanta Falcons    Players: Jamal Anderson
  • Now click on the button to play the asset (button marked "REAL")
  • View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
  • Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
  • Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
  • Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

  • Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
  • Click on the first result (titled "I want out") & view the metadata on the following RMR page
  • Here is what you see:
      Produced by: ESPN    Posted Date: 3/03/2001    League: National League
      Teams: Los Angeles Dodgers    Players: Gary Sheffield
  • Now click on the button to play the asset (button marked "REAL")
  • View the source HTML page that has the original story and locate this story with the heading "I want out"
  • Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
  • Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
  • Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

  • Search for 'Jamal Anderson' in 'Football'
  • Click on the first result for Jamal Anderson
  • View the metadata. Note that Team name and League name are also included in the metadata
  • View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

  • Search for 'Gary Sheffield' in 'Baseball'
  • Click on the first result for Gary Sheffield
  • View the metadata. Note that Team name and League name are also included in the metadata
  • View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Metadata for a Rich Media Sports Asset:

  • Producer Name: name of the content provider that produced the asset
  • Posted Date: date of asset posting - extracted automatically
  • Player Names: names of players mentioned explicitly in the asset - extracted automatically
  • Team Name: name of the team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships
  • League Name: name of the league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations
  • Sport: name of the sport

Legend: "X → Y" means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

  • Traditional search engines rely solely on (syntactic) keywords to find content
  • They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.

HP 103

Example (test on http://directory.mediaanywhere.com)

  • Search for company 'Commerce One'
  • Links to news on companies that compete against Commerce One
  • Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
  • Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Knowledge Base; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML or other format; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to the Cisco story - passed on to Dashboard
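The four-step flow on this slide (a Cisco story arrives, the knowledge base is consulted for Cisco's competition, Lucent is returned, and a Lucent story from the external feeds is published as semantically related) can be sketched as follows. This is an illustrative assumption, not the Semantic Engine's actual interface: the `COMPETITORS` table, the function names and the story records are hypothetical, and the IBM story is invented to show a non-match.

```python
# Hypothetical knowledge base of COMPETES-WITH associations. Only the
# Cisco/Lucent pair comes from the slide.
COMPETITORS = {"Cisco": ["Lucent"]}


def related_stories(story, feed):
    """Steps 2-4: look up competitors, then pick feed stories about them."""
    rivals = COMPETITORS.get(story["company"], [])
    return [s for s in feed if s["company"] in rivals]


def build_dashboard(story, feed):
    """Combine what was asked for with what is semantically related."""
    return {
        "asked_for": story,
        "semantically_related": related_stories(story, feed),
    }


# Step 1: a story from an internal source enters the pipeline.
internal_story = {"company": "Cisco", "title": "Story on Cisco",
                  "source": "Internal Source 1"}
external_feed = [
    {"company": "Lucent", "title": "Story on Lucent",
     "source": "External feed"},
    {"company": "IBM", "title": "Story on IBM",
     "source": "External feed"},  # invented non-competitor
]

dashboard = build_dashboard(internal_story, external_feed)
related_titles = [s["title"] for s in dashboard["semantically_related"]]
```

Keeping the associations in a separate knowledge base, rather than in the stories themselves, is what lets the same lookup serve every source that feeds the metabase.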

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

[Diagram: associations around a COMPANY]
  • Related Stock News
  • Industry News
  • Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
  • Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
  • Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

  • Automatic collation of semantically related digital media information from multiple sources
  • Research inferred automatically
  • Semantically related news not specifically asked for
  • Semantic search, personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

  • Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
  • Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
  • RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

  • "A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
  • "A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
  • "A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)

  • A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
  • Based on RDF + XML
  • Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

  • Logical Layer: formal semantics and reasoning support - OIL, DAML-O
  • Schema Layer: definition of vocabulary - RDF Schema
  • Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

  • Earthquake Sources (USGS, NEIC)
  • Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

    NuclearTest Causes Earthquake <=
        dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
        AND distance(NuclearTest.latitude, NuclearTest.longitude,
                     Earthquake.latitude, Earthquake.longitude) < 10000
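The rule above can be evaluated directly over event records. The sketch below makes several assumptions that the slide does not state: `dateDifference` is measured in days and must be non-negative (the earthquake comes after the test), `distance` is a great-circle distance in miles computed with the haversine formula, and the `Event` class, function names and sample events are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3959.0  # mean radius; an approximation


@dataclass
class Event:
    event_date: date
    latitude: float
    longitude: float


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule: quake within max_days after the test
    and within max_miles of it."""
    days = (quake.event_date - test.event_date).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles


def candidate_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if could_have_caused(t, q)]


# Invented sample events, for illustration only.
nuclear_test = Event(date(1970, 1, 1), 37.0, -116.0)
near_quake = Event(date(1970, 1, 15), 36.0, -117.0)
late_quake = Event(date(1970, 3, 1), 36.0, -117.0)
early_quake = Event(date(1969, 12, 20), 36.0, -117.0)

pairs = candidate_pairs([nuclear_test], [near_quake, late_quake, early_quake])
```

Satisfying the rule only marks a pair as a candidate; as the slide's wording ("could have caused") suggests, it does not establish causation.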

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
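The grouping query on this slide can be expressed as a small aggregation. This is a sketch under assumptions: earthquakes are given as hypothetical `(year, magnitude)` tuples, the ranges are treated as half-open intervals (so a boundary magnitude falls in the higher range, and 9.0 itself lands in ">9"), and the sample data is invented rather than taken from the USGS or NEIC sources.

```python
from collections import Counter

# Magnitude ranges from the slide as (low, high, label), low inclusive
# and high exclusive; that boundary handling is an assumption.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the range containing `magnitude`, or None."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year on."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(quakes, first_year, last_year):
    """Average yearly count per range; years without quakes count as zero."""
    counts = counts_per_year(quakes, first_year)
    n_years = last_year - first_year + 1
    totals = Counter()
    for (year, label), n in counts.items():
        if year <= last_year:
            totals[label] += n
    return {label: totals[label] / n_years for _, _, label in BINS}


# Invented sample data: (year, magnitude).
sample = [(1950, 5.9), (1950, 6.5), (1951, 6.1), (1951, 7.2),
          (1899, 8.0), (1951, 5.0)]
averages = average_per_year(sample, 1950, 1951)
```

Comparing these averages for the years before and after testing began, range by range, is the kind of comparison the slide's conclusion rests on.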

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

  • RDF: www.w3.org/TR/REC-rdf-syntax
  • ICE: www.icestandard.org
  • Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
  • XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
  • DAML: www.daml.org
  • NewsML: newsshowcase.reuters.com
  • PRISM: www.prismstandard.org/techdev/prismspec1.asp
  • XCM: www.vignette.com
  • OIL: www.ontoknowledge.org/oil
  • Semantic Web: www.semanticweb.org
  • VoiceXML: www.voicexml.org
  • MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
  • Taalee: www.taalee.com
  • Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998.

  • The Mysteries of MetadataWorkshop at Content World 2001 Burlingame CA May 15 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperabilitykey metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards(or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsMLhellip
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML ndash Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taaleersquos Intelligent Content Process
  • Metadata Creation and Semanticization
  • FormsTypesIngest of Content
  • Content HandlingIngest
  • Extracting a Text DocumentSyntactic approach
  • Automatic Categorization amp Metadata Tagging (unstructured texttranscript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) ClassificationTaxonomy amp Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies TaxonomiesOntologies
  • Metadata enabledApplications
  • Use of Categories for Search
  • More than metadata
  • Metadata amp Search
  • Taaleersquos Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries amp Hot Topics
  • SemanticInteractive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV Taaleersquos Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associationssupported by Taalee Semantic Engine
  • Semantic Web Application ExampleFinancial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF ndash one proposal (cf Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL ndash as RDF Extension
  • DAML and OIL ndash Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Examplehellip
  • Knowledge Discovery - Examplehellip
  • ResourcesReferences

HP 80

Metadata Targeting

SemanticInteractive Targeting

Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

Realtime Feeds

Interests Preferences

Time-ShiftedContent Aggregator

Web sites and Pages

ContentDatabases Personalized

Content

Semantic EngineTM

Personalized Content

Content

Structured Hi-Quality

Semantic Metabase

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site

Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst

HP 87

iTV Taaleersquos Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in

This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO

Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata

Conference Call itself can have embedded metadata to support personalization andinteractivity

HP 89

Metadata in Enterprise Apps

Filter Search ConsolidatePersonalize ArchiveLicensing Syndication

Production SupportProduction SupportSony

Categorize

Catalog

Integrate

CollectionCollection ProcessingProcessing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90

t

A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon

At least 60 people died in this needless fire senior local official Karimu Alabi said

Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses

At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)

Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)

-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush

Video

bull Value-add for production broadcast amp syndication

bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers

bull Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN | Posted Date: 12/07/2000 | Reporter: David Lewis | Event: Election 2000 | Location: Tallahassee, Florida, USA | People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: MPEG-2/4/7; MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced XML Description ("Cisco Systems"); Taalee Semantic Engine; Metadata-rich Value-added Node; Object Content Information (OCI); MPEG Decoder; Retrieve Scene Description Track; Enhanced Digital Cable; Video; GREAT USER EXPERIENCE]

Example OCI: Produced by: Fox Sports | Creation Date: 12/05/2000 | League: NFL | Teams: Seattle Seahawks, Atlanta Falcons | Players: John Kitna | Coaches: Mike Holmgren, Dan Reeves | Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation (feeding Usage) combines three kinds of content:

• Extractor Agents: content which does contain the words the user asked for
• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
• Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
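The Roger Clemens example can be sketched in a few lines of Python. This is only an illustration of the idea, not Taalee's implementation: the triple store, the relation names (plays_for, plays_sport), the Asset class and the story title are all assumptions made up for the sketch.

```python
from dataclasses import dataclass, field

# Hypothetical knowledge base: (subject, relation, object) triples.
# The relation names are invented for this sketch.
ASSOCIATIONS = [
    ("Roger Clemens", "plays_for", "New York Yankees"),
    ("New York Yankees", "plays_sport", "Baseball"),
]


@dataclass
class Asset:
    title: str
    metadata: dict = field(default_factory=dict)  # field name -> list of values


def related(subject, relation):
    """Return every object linked to `subject` by `relation`."""
    return [o for s, r, o in ASSOCIATIONS if s == subject and r == relation]


def add_value(metadata, key, value):
    """Append `value` under `key` unless it is already there."""
    values = metadata.setdefault(key, [])
    if value not in values:
        values.append(value)


def enrich(asset):
    """Return a copy of `asset` with the team and sport implied by its players."""
    metadata = {key: list(values) for key, values in asset.metadata.items()}
    for player in asset.metadata.get("Players", []):
        for team in related(player, "plays_for"):
            add_value(metadata, "Teams", team)
            for sport in related(team, "plays_sport"):
                add_value(metadata, "Sport", sport)
    return Asset(asset.title, metadata)


def search(assets, key, value):
    """Keyword-style lookup over metadata fields."""
    return [a.title for a in assets if value in a.metadata.get(key, [])]


# A made-up story whose extracted metadata mentions only the player.
story = Asset("Clemens fans ten", {"Players": ["Roger Clemens"]})
print(search([story], "Teams", "New York Yankees"))          # []
print(search([enrich(story)], "Teams", "New York Yankees"))  # ['Clemens fans ten']
```

The search function itself stays purely syntactic; the story becomes findable only because the enrichment step added the team name that the source text never mentioned.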

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com | Posted Date: 9/20/2000 | League: NFL | Teams: Atlanta Falcons | Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN | Posted Date: 3/03/2001 | League: National League | Teams: Los Angeles Dodgers | Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

• Search for 'Jamal Anderson' in 'Football'
• Click on first result for Jamal Anderson
• View metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

• Search for 'Gary Sheffield' in 'Baseball'
• Click on first result for Gary Sheffield
• View metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting - extracted automatically
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
• League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
• Sport: name of sport

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways the users choose to interact.

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
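To make the difference between a keyword and a typed, related entity concrete, here is a minimal sketch of a knowledge base holding the Commerce One facts quoted above. The KnowledgeBase class, its method names and the relation names (participates_in, competes_with) are assumptions for illustration; the slides do not describe how Taalee's Semantic Content Model is actually stored.

```python
from collections import defaultdict


class KnowledgeBase:
    """Tiny typed graph: entities have a type, relations are named edges."""

    def __init__(self):
        self.types = {}
        self.edges = defaultdict(set)  # (subject, relation) -> set of objects

    def add_entity(self, name, entity_type):
        self.types[name] = entity_type

    def relate(self, subject, relation, obj, symmetric=False):
        self.edges[(subject, relation)].add(obj)
        if symmetric:
            self.edges[(obj, relation)].add(subject)

    def objects(self, subject, relation):
        """Everything linked to `subject` by `relation`, sorted for stable output."""
        return sorted(self.edges.get((subject, relation), set()))

    def describe(self, name):
        """Collect the type and all outgoing relations of one entity."""
        facts = {"type": self.types.get(name)}
        for (subject, relation), objs in self.edges.items():
            if subject == name:
                facts[relation] = sorted(objs)
        return facts


INDUSTRY = "Corporate, Professional & Financial Software"

kb = KnowledgeBase()
kb.add_entity("Commerce One", "COMPANY")
kb.add_entity("Ariba", "COMPANY")
kb.add_entity(INDUSTRY, "INDUSTRY")
kb.relate("Commerce One", "participates_in", INDUSTRY)
kb.relate("Commerce One", "competes_with", "Ariba", symmetric=True)

print(kb.describe("Commerce One"))
print(kb.objects("Ariba", "competes_with"))  # ['Commerce One']
```

A plain keyword index could answer only "which documents contain this string"; with typed edges, a query about one company can also surface its industry and its competitors.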

HP 103

Example (test on http://directory.mediaanywhere.com)

• Search for company 'Commerce One'
• The result page includes links to news on companies that compete against Commerce One (to view news on Ariba, click on the link for Ariba)
• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; ASP/Enterprise hosted; Semantic Application; Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML or other format; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
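The four numbered steps in this architecture (Cisco story in, knowledge base consulted, Lucent returned as a competitor, Lucent story picked as semantically related) can be sketched as a small pipeline. Everything here is a stand-in: the COMPETITORS table plays the role of the knowledge base, extract_companies is a naive substitute for an extractor agent, and the headlines are invented sample data.

```python
# Hypothetical knowledge-base content: company -> known competitors.
COMPETITORS = {"Cisco": ["Lucent"]}


def extract_companies(text, known_companies):
    """Stand-in for an extractor agent: naive company-name spotting."""
    return [name for name in known_companies if name in text]


def related_stories(story, feed, known_companies):
    """Return feed items about competitors of the companies in `story`."""
    related = []
    for company in extract_companies(story, known_companies):   # step 1
        for rival in COMPETITORS.get(company, []):              # steps 2 and 3
            for item in feed:                                   # step 4
                mentioned = extract_companies(item, known_companies)
                if rival in mentioned and item not in related:
                    related.append(item)
    return related


known = ["Cisco", "Lucent", "IBM"]
internal_story = "Cisco raises guidance for the quarter"  # invented headline
external_feed = [                                          # invented headlines
    "Lucent announces new optical switch",
    "IBM opens research lab",
]
print(related_stories(internal_story, external_feed, known))
# ['Lucent announces new optical switch']
```

The IBM item is dropped because no association links it to the incoming story, even though it arrives on the same feed.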

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

Associations centered on a COMPANY:

• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

• Automatic Collation of semantically related digital media information from Multiple Sources
• Research Inferred Automatically
• Semantically Related News Not Specifically Asked For
• Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inference from this data
• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition

Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

• Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
• Schema Layer: Definition of Vocabulary - RDF Schema
• Data Layer: Simple data model and syntax for metadata - RDF
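As a rough illustration of how the three layers build on each other, the sketch below keeps statements as plain triples (data layer), follows rdfs:subClassOf links (schema layer), and derives new type statements from them (a very reduced stand-in for the logical layer). It uses no RDF library; the class and instance names are made up, and real OIL or DAML reasoning is far richer than this one rule.

```python
# Data layer: plain (subject, predicate, object) statements, as in RDF.
triples = {
    ("herbivore", "rdfs:subClassOf", "animal"),
    ("carnivore", "rdfs:subClassOf", "animal"),
    ("cow", "rdfs:subClassOf", "herbivore"),
    ("daisy", "rdf:type", "cow"),
}


def superclasses(cls, statements):
    """Schema layer: follow rdfs:subClassOf links transitively."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in statements:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def infer_types(statements):
    """Reduced logical layer: an instance of a class is also an
    instance of every superclass of that class."""
    inferred = set(statements)
    for s, p, o in statements:
        if p == "rdf:type":
            for parent in superclasses(o, statements):
                inferred.add((s, "rdf:type", parent))
    return inferred


closure = infer_types(triples)
print(sorted(superclasses("cow", triples)))      # ['animal', 'herbivore']
print(("daisy", "rdf:type", "animal") in closure)  # True
```

The statement that daisy is an animal appears nowhere in the data; it exists only because the schema and the inference rule were applied on top of it.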

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude ) < 10000
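The "Causes" rule can be written out as an executable predicate. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes dateDifference is measured in days and must be non-negative (the prose says the earthquake comes after the test), that distance is a great-circle distance in miles (a later slide states the radius in miles), and it uses the haversine formula with a mean Earth radius. The Event class and the sample events are invented.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


@dataclass(frozen=True)
class Event:
    name: str
    event_date: date
    latitude: float
    longitude: float


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """The rule: quake follows the test within max_days and max_miles."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude,
                     quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles


def candidate_pairs(tests, quakes, **limits):
    """All (test, quake) name pairs that satisfy the rule."""
    return [(t.name, q.name) for t in tests for q in quakes
            if may_have_caused(t, q, **limits)]


# Synthetic events, for illustration only.
tests = [Event("test-A", date(2000, 1, 1), 0.0, 0.0)]
quakes = [
    Event("quake-1", date(2000, 1, 15), 10.0, 10.0),   # 14 days after
    Event("quake-2", date(1999, 12, 20), 10.0, 10.0),  # before the test
    Event("quake-3", date(2000, 3, 1), 10.0, 10.0),    # 60 days after
]
print(candidate_pairs(tests, quakes))  # [('test-A', 'quake-1')]
```

Note that a 10000-mile radius covers most of the globe (the largest possible great-circle distance is roughly 12,400 miles), so with the slide's thresholds the date condition does almost all of the filtering; the limits are left as parameters for that reason.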

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
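The per-year, per-magnitude-band counts behind these queries can be sketched as follows. The band boundaries come from the slide, but how boundary values are assigned (lower bound inclusive, upper bound exclusive, 9 and above in the open band) is an assumption, as are the function names and the sample records, which are synthetic and not USGS data.

```python
from collections import Counter

# Magnitude bands from the slide; upper bound exclusive, None means open-ended.
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, None)]


def band_label(magnitude):
    """Name of the band a magnitude falls into, or None below 5.8."""
    for low, high in BANDS:
        if magnitude >= low and (high is None or magnitude < high):
            return f"{low}-{high}" if high is not None else f">={low}"
    return None


def counts_per_year(quakes, start_year=1900):
    """quakes: iterable of (year, magnitude). Returns a Counter keyed by (year, band)."""
    counts = Counter()
    for year, magnitude in quakes:
        label = band_label(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one band over an inclusive range of years."""
    years = range(first_year, last_year + 1)
    return sum(counts[(year, label)] for year in years) / len(years)


# Synthetic records, for illustration only.
records = [(1950, 5.9), (1950, 6.5), (1951, 6.2), (1951, 7.4),
           (1899, 8.0), (1951, 4.0)]
counts = counts_per_year(records)
print(counts[(1950, "6.0-7.0")])                         # 1
print(average_per_year(counts, "6.0-7.0", 1950, 1951))   # 1.0
```

Comparing such averages before and after a cut-off year, band by band, is the kind of evidence the slide appeals to when it observes that the count above magnitude 7 stays almost constant.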

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NEWSML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• SEMANTICWEB: www.semanticweb.org
• VOICEXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

Semantic/Interactive Targeting

• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources

HP 82

Web Extreme Personalization

[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: wireless home page with the categories MyStocks, News, Sports, Music, MyMedia]

The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: MyStocks selected; the My Stocks list shows CSCO, NT, IBM, Market]

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: My Stocks > CSCO, with the content types Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen: CSCO Analysis listing: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call entry focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
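One simple way to picture semantic filtering under a screen constraint is a per-content-type priority list of metadata fields, cut off when the line no longer fits. This is a guess at the idea, not the product's logic: the FIELD_PRIORITY table, the field names and the character budget are all assumptions, and the sample entry is assembled from the listing and caption on this slide.

```python
# Hypothetical per-content-type field priorities (most important first).
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst", "rating"],
    "Earnings": ["date", "company", "period"],
}


def filter_for_screen(item, content_type, max_chars):
    """Keep the highest-priority fields whose rendered text fits in max_chars."""
    shown = []
    used = 0
    for name in FIELD_PRIORITY.get(content_type, []):
        value = item.get(name)
        if value is None:
            continue
        cost = len(value) + (1 if shown else 0)  # one space between fields
        if used + cost > max_chars:
            break
        shown.append(value)
        used += cost
    return " ".join(shown)


call = {"date": "11/07", "source": "ON24", "analyst": "H&Q", "rating": "Strong Buy"}
print(filter_for_screen(call, "Analyst Call", max_chars=16))  # 11/07 ON24 H&Q
print(filter_for_screen(call, "Analyst Call", max_chars=40))  # 11/07 ON24 H&Q Strong Buy
```

On the narrow screen the rating is dropped from the text line, which matches the slide's description of it being carried by an icon instead.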

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Metadata-Tagged Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by a personalization application to show only the stocks that the user is interested in.

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.
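The two personalization behaviours described for this screen, showing only the stocks the user follows and flagging a new Conference Call video, both reduce to filtering on metadata. The sketch below assumes a simple dictionary layout for the metadata and invents the asset ids; it is meant only to show how little logic is needed once the metadata exists.

```python
def personalize_ticker(entries, followed_symbols):
    """Keep only the ticker entries whose symbol the user follows."""
    return [e for e in entries if e["symbol"] in followed_symbols]


def new_conference_calls(assets, followed_symbols, seen_ids):
    """Ids of unseen Conference Call assets for followed symbols."""
    return [a["id"] for a in assets
            if a["type"] == "Conference Call"
            and a["symbol"] in followed_symbols
            and a["id"] not in seen_ids]


# Hypothetical user profile and metadata records.
profile = {"CSCO", "IBM"}
ticker = [{"symbol": s} for s in ("CSCO", "NT", "IBM", "XYZ")]
assets = [
    {"id": "a1", "symbol": "CSCO", "type": "Conference Call"},
    {"id": "a2", "symbol": "NT", "type": "Conference Call"},
    {"id": "a3", "symbol": "CSCO", "type": "Analyst Call"},
]

print([e["symbol"] for e in personalize_ticker(ticker, profile)])  # ['CSCO', 'IBM']
print(new_conference_calls(assets, profile, seen_ids=set()))       # ['a1']
```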

HP 89

Metadata in Enterprise Apps

[Diagram labels: Collection (Network Content, Affiliate Feeds, Public Sources); Processing (Categorize, Catalog, Integrate); Rich Data Metabase; Production Support (Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication); Sony]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews | Posted Date: 11/30/2000 | Event: Election 2000 | Location: Tallahassee, Florida, USA | People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the enterprise's customers
• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadatarsquos role in emerging iTV infrastructure

MPEG-247MPEG

Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks

Atlanta Falcons Players John KitnaCoaches Mike Holmgren

Dan ReevesLocation Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Content which doescontain the wordsthe user asked for

Extractor Agents

Content which does not contain the words

the user asked for but is about what he asked

for

Value-added Metadata

Content the user did not think to ask for but

which he needs to know

Semantic Associations

+ +

Metadata for Intelligent ContentMetadata for Intelligent Content

Usage

HP 94

Intelligent Contentvia

Value-Added Metadata

HP 95

Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable

users to access content

bull If a keyword is not in the content it cannot be found

bull The burden is on the user to think of and ask for the ldquorightrdquo keyword

For example If a story is about ldquoRoger Clemensrdquo but does not contain the

words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user

searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called

Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)

to add missing metadata to describe content more completely

HP 96

Guided Demo for Value Added Metadata ndashExample one

bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson

bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata

on the following RMR page

bull Here is what you see

Produced by NFLcom Posted Date 9202000 League NFL

Teams Atlanta Falcons Players Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)

Click on first result for Gary Sheffield

View metadata Note that Team name and League name are also included

in the metadata

Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content ndash Value-Added Metadata

Posted Date

Posted Date

Date of asset posting ndashExtracted automatically

League Name

Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations

Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships

Team NameTeam Name

Producer Name

Producer Name

Rich MediaSports AssetRich Media

Sports Asset

Name of content provider that produced the asset

Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added

by Taalee using its semantic relationships

The asset is richly fully described in the many ways the users chose to interact

Player NamesPlayer Names

SportSportName of

sport

LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset

Name of players mentioned explicitly in the asset ndash Extracted automatically

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

bull Traditional search engines rely solely on (syntactic) keywords to find content

bull They do not understand the meaning context or relationships of keywords

For example a search engine may see that the word ldquoCommerce Onerdquo occurs

but it does not know that Commerce One is a COMPANY which Participates in

the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba

As a result search engines cannot go beyond returning a list (or directory view)

of what the user has asked for Their ability to provide associated information is

extremely limited static and difficult to scale Taaleersquos Semantic Content Model

goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company lsquoCommerce Onersquo

Links to news on companies that compete against

Commerce One

Links to news on companies Commerce One competes

against(To view news on Ariba click

on the link for Ariba)

Crucial news on Commerce Onersquos

competitors (Ariba) can be accessed easily and

automatically

HP 104

Internal Source 1Research

Internal Source 2

External feedsWeb(eg Reuters)

1

2

3

4

Cisco story from PW Source 1passed on to addsemanticassociations

ConsultsKnowledgeBasefor Ciscorsquoscompetition

Returns resultLucent is a competitor of Cisco

Lucent story from external

feeds picked for publishing as ldquosemantically

relatedrdquo to Ciscostory ndash passed

on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata XML or other format

SemanticApplication

ASPEnterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata centricContent Management Architecture

SemanticEngine

World Model

TaaleeMetabase

Third-partyContent Mgmt

AndSyndication

HP 105

Semantic Associationssupported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANYCOMPANYRelated Stock News

Related Stock News

IndustryNews

IndustryNews

CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or

Related INDUSTRY

SECEPAEPA

RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY

Technology Products

Technology ProductsImportant to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for the future

Semantic Web, Complex Relationships, and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough

• We need a "logic layer" on top of RDF

• Some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
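To make the idea of drawing inferences from RDF-style data concrete, here is a minimal sketch. It assumes facts are held as plain (subject, predicate, object) tuples and applies two hand-written rules by forward chaining. The class and instance names are invented for illustration, and this is a toy rule engine, not a description logic reasoner.

```python
# Facts as (subject, predicate, object) triples; names are illustrative only
triples = {
    ("herbivore", "subClassOf", "animal"),
    ("cow", "subClassOf", "herbivore"),
    ("daisy", "type", "cow"),
}

def infer(triples):
    """Apply two rules repeatedly until no new triple can be derived."""
    facts = set(triples)
    while True:
        new = set()
        for (a, p1, b) in facts:
            for (c, p2, d) in facts:
                if b != c:
                    continue
                # Rule 1: subClassOf is transitive
                if p1 == "subClassOf" and p2 == "subClassOf":
                    new.add((a, "subClassOf", d))
                # Rule 2: an instance of a class is an instance of its superclasses
                if p1 == "type" and p2 == "subClassOf":
                    new.add((a, "type", d))
        if new <= facts:          # nothing new was derived: fixed point reached
            return facts
        facts |= new

closure = infer(triples)
print(("daisy", "type", "animal") in closure)  # True, although never stated
```

The derived triple is not present in the data; it exists only because a rule layer sits on top of the structural model, which is the argument the slide makes.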

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

• Based on RDF + XML

• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-Layered Architecture of the Semantic Web

Logical Layer: formal semantics and reasoning support - OIL, DAML-O

Schema Layer: definition of vocabulary - RDF Schema

Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>

DAML and OIL - Evolving towards the Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

    NuclearTest Causes Earthquake <=
        dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
        AND distance( NuclearTest.latitude, NuclearTest.longitude,
                      Earthquake.latitude, Earthquake.longitude ) < 10000
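The rule above can be tried out with a short Python sketch. Several things here are assumptions rather than part of the slides: distances are computed with the haversine great-circle formula and measured in miles, the date difference is counted in days and must be non-negative (the earthquake comes after the test), and the sample events are invented purely to exercise the code.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))

def date_difference(earlier, later):
    """Number of days from `earlier` to `later` (negative if out of order)."""
    return (later - earlier).days

def causes(test, quake, max_days=30, max_miles=10000):
    """The 'NuclearTest Causes Earthquake' rule, with thresholds as parameters."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    if not 0 <= days < max_days:
        return False
    return distance(test["latitude"], test["longitude"],
                    quake["latitude"], quake["longitude"]) < max_miles

def find_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if causes(t, q)]

# Invented sample events, for illustration only
tests = [{"eventDate": date(1970, 3, 1), "latitude": 40.0, "longitude": 80.0}]
quakes = [
    {"eventDate": date(1970, 3, 20), "latitude": 35.0, "longitude": 75.0},   # 19 days later, nearby
    {"eventDate": date(1970, 6, 1), "latitude": 35.0, "longitude": 75.0},    # too late
    {"eventDate": date(1970, 2, 1), "latitude": 35.0, "longitude": 75.0},    # before the test
]
print(len(find_pairs(tests, quakes)))  # 1
```

Keeping the thresholds as parameters makes it easy to rerun the query with a tighter time window or radius.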

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale, per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
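The grouping query above could be expressed roughly as follows. This is a sketch under stated assumptions: earthquakes are given as hypothetical (year, magnitude) pairs, the magnitude ranges are treated as half-open intervals so that each earthquake falls into exactly one group, and the sample data is invented.

```python
from collections import Counter

# Magnitude ranges from the query, as half-open intervals [low, high)
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]

def band_of(magnitude):
    """Return the range a magnitude falls into, or None if below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None

def average_per_year(quakes, first_year, last_year):
    """Average number of earthquakes per year in each magnitude range."""
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if band is not None and first_year <= year <= last_year:
            counts[band] += 1
    n_years = last_year - first_year + 1
    return {band: counts[band] / n_years for band in BANDS}

# Invented sample data: (year, magnitude)
quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.4), (1901, 5.0)]
averages = average_per_year(quakes, 1900, 1901)
print(averages[(6.0, 7.0)])  # 1.0
```

Running the same function over the years before and after a chosen date would give the kind of comparison the slide draws its conclusion from.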

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10,000 miles of the epicenter of the earthquake

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NewsML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• Semantic Web: www.semanticweb.org
• VoiceXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 82

Web Extreme Personalization

Diagram components: Realtime Feeds | Web sites and Pages | Content Databases | Time-Shifted Content Aggregator | Interests, Preferences | Semantic Engine(TM) | Structured Hi-Quality Semantic Metabase | Personalized Content

HP 83

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks | News | Sports | Music | MyMedia

User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.

User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks | News | Sports | Music | MyMedia

My Stocks: CSCO | NT | IBM | Market

Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic personalized menu.

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks | News | Sports | Music | MyMedia

My Stocks: CSCO | NT | IBM | Market

CSCO: Analyst Call | Conf Call | Earnings

Different types of recent audio content about Cisco are available.

The user clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

Screen menu: MyStocks | News | Sports | Music | MyMedia

My Stocks: CSCO | NT | IBM | Market

CSCO: Analyst Call | Conf Call | Earnings

CSCO Analysis: 11/08 ON24 Payne | 11/07 ON24 H&Q | 11/06 CBS Langlesis

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
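One way to picture this kind of semantic filtering is a per-content-type priority list of metadata fields, cut off at whatever the screen can show. The sketch below is hypothetical: the `FIELD_PRIORITY` table, the field names, and the `max_fields` limit are assumptions made for illustration and do not describe the actual product.

```python
# Hypothetical priority order of metadata fields for each content type
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst", "rating"],
    "Earnings": ["date", "company", "quarter", "source"],
}

def filter_metadata(item, content_type, max_fields):
    """Keep only the highest-priority fields that fit the display constraint."""
    # Unknown content types fall back to alphabetical field order
    fields = FIELD_PRIORITY.get(content_type, sorted(item))
    chosen = [f for f in fields if f in item][:max_fields]
    return {f: item[f] for f in chosen}

# Metadata for one listing entry, using values shown on the slide
call = {
    "date": "11/07",
    "source": "ON24",
    "analyst": "H&Q",
    "rating": "Strong Buy",
    "ticker": "CSCO",
}

# A small wireless screen shows three fields; the rating can be shown as an icon
print(filter_metadata(call, "Analyst Call", 3))
# {'date': '11/07', 'source': 'ON24', 'analyst': 'H&Q'}
```

The same record can be rendered with more fields on a larger screen simply by raising `max_fields`.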

HP 87

iTV: Taalee's Extreme Personalization

Diagram components: Content Provider (DBS, DISH, Wink, AOL-TV) | Content "Programs" | Semantic Engine(TM) | Metadata-Tagged Content | Structured Hi-Quality Semantic Metabase | Immediate Interests | Preferences | Personalized Content Capsules | Redirects and Programming

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with interactivity features, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

Diagram components: Collection (Network Content | Affiliate Feeds | Public Sources) | Processing (Categorize | Catalog | Integrate) | Rich Data Metabase | Production Support (Sony) | Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews | Posted Date: 11/30/2000 | Event: Election 2000 | Location: Tallahassee, Florida, USA | People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

Clip listing (duration - date - source):
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN | Posted Date: 12/07/2000 | Reporter: David Lewis | Event: Election 2000 | Location: Tallahassee, Florida, USA | People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

Diagram components: Video | MPEG Encoder (MPEG-2/4/7) | Create Scene Description Tree | Scene Description Tree (Node = AVO Object) | Enhanced Digital Cable | MPEG Decoder | Retrieve Scene Description Track | Enhanced XML Description | Taalee Semantic Engine | Node "Cisco Systems" | Object Content Information (OCI) | Metadata-rich Value-added Node | GREAT USER EXPERIENCE

Example node metadata: Produced by: Fox Sports | Creation Date: 12/05/2000 | League: NFL | Teams: Seattle Seahawks, Atlanta Falcons | Players: John Kitna | Coaches: Mike Holmgren, Dan Reeves | Location: Atlanta

• License metadata decoder and semantic applications to device makers

• Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation

• Extractor Agents: content which does contain the words the user asked for

+

• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+

• Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
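A toy version of this enrichment step might look like the following. The `WORLD_MODEL` dictionary, its relation names, and both functions are hypothetical stand-ins, populated only with facts stated in these slides; the real knowledge base and extraction technology are not described here in enough detail to reproduce.

```python
# Hypothetical world model: entity -> known relationships
WORLD_MODEL = {
    "Roger Clemens": {"type": "PERSON", "plays_sport": "Baseball",
                      "plays_for": "New York Yankees"},
    "Jamal Anderson": {"type": "PERSON", "plays_for": "Atlanta Falcons"},
    "New York Yankees": {"type": "TEAM"},
    "Atlanta Falcons": {"type": "TEAM", "league": "NFL"},
}

def _add(metadata, key, value):
    """Append a value to a metadata field without creating duplicates."""
    values = metadata.setdefault(key, [])
    if value not in values:
        values.append(value)

def enrich(metadata):
    """Add team and league metadata implied by the players found in the asset."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        team = WORLD_MODEL.get(player, {}).get("plays_for")
        if team:
            _add(enriched, "teams", team)
            league = WORLD_MODEL.get(team, {}).get("league")
            if league:
                _add(enriched, "leagues", league)
    return enriched

def matches(metadata, query):
    """Keyword search over all metadata values."""
    q = query.lower()
    return any(q in value.lower() for values in metadata.values() for value in values)

story = {"players": ["Roger Clemens"]}          # extracted from the asset itself
print(matches(story, "Yankees"))                # False: keyword is not in the content
print(matches(enrich(story), "Yankees"))        # True: found via value-added metadata
```

The original metadata is left untouched; the value-added fields are attached to a copy, so extracted and inferred metadata can still be told apart if needed.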

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10: Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com | Posted Date: 9/20/2000 | League: NFL | Teams: Atlanta Falcons | Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10: Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN | Posted Date: 3/03/2001 | League: National League | Teams: Los Angeles Dodgers | Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

• Producer Name: name of content provider that produced the asset

• Posted Date: date of asset posting - extracted automatically

• Player Names: names of players mentioned explicitly in the asset - extracted automatically

• Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships

• League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations

• Sport: name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly, fully described in the many ways the users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.


HP 83

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site

Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screenshot: CSCO submenu (Analyst Call, Conf Call, Earnings) with the "CSCO Analysis" listing: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Semantic Engine(TM); Meta-Data Tagged Content; Content "Programs"; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming; Structured Hi-Quality Semantic Metabase]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable with interactivity features, using metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

[Diagram labels: Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication; Production Support; Sony; Categorize; Catalog; Integrate; Collection; Processing; Network Content; Affiliate Feeds; Public Sources; Rich Data; Metabase]

HP 90

[Screenshot: breaking-news page with a video story, a headline list, and the story's metadata]

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were [...]

-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

[Clip listing (duration - date - source): (1:33) - 12/06/00 - ABC; (2:53) - 12/06/00 - CBS; (5:16) - 12/06/00 - ABC; (2:46) - 12/06/00 - FOX; (1:33) - 12/06/00 - NBC; (5:33) - 12/06/00; (3:57) - 12/06/00 - CBS; (4:27) - 12/06/00 - ABC; (3:44) - 12/06/00 - FOX; (7:24) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS]

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the [...]

[Clip listing (duration - date - source): (1:33) - 12/06/00 - ABC; (2:33) - 12/06/00 - CBS; (3:12) - 12/06/00 - NNS; (0:32) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS]

Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Enhanced Digital Cable; Video; MPEG-2/4/7; MPEG Encoder; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced XML Description; Taalee Semantic Engine; "Cisco Systems" Node; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

Example node metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content Usage

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
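The enrichment idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `KNOWLEDGE_BASE`, the functions `enrich` and `matches`, and the record layout are hypothetical names invented here, not Taalee's actual implementation.

```python
# Hypothetical knowledge base: player -> known facts (illustrative data only).
KNOWLEDGE_BASE = {
    "Roger Clemens": {"sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"sport": "Football", "team": "Atlanta Falcons", "league": "NFL"},
}

def enrich(metadata):
    """Return a copy of the metadata with teams, leagues, and sports implied by its players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for relation, field in (("team", "teams"), ("league", "leagues"), ("sport", "sports")):
            value = facts.get(relation)
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched

def matches(metadata, term):
    """Keyword match over every metadata value (case-insensitive substring)."""
    return any(term.lower() in v.lower() for values in metadata.values() for v in values)

# A story whose extracted metadata mentions only the player.
story = {"players": ["Roger Clemens"]}
print(matches(story, "Yankees"))          # keyword-only metadata: no match
print(matches(enrich(story), "Yankees"))  # value-added metadata: match
```

The search function itself stays a plain keyword match; only the metadata it runs over changes, which is the point the slide makes.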

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story, and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that neither Team = Atlanta Falcons nor League = NFL was present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value-Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story, and locate this story with the heading "I want out"

• Verify that neither Team = Los Angeles Dodgers nor League = National League was present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'.

Click on the first result for Jamal Anderson.

View the metadata. Note that the Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of the Team name or League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'.

Click on the first result for Gary Sheffield.

View the metadata. Note that the Team name and League name are also included in the metadata.

View the original source HTML page. Verify that the source page contains no mention of the Team name or League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content - Value-Added Metadata

[Diagram: a Rich Media Sports Asset surrounded by its metadata fields]

• Producer Name: name of the content provider that produced the asset

• Posted Date: date of asset posting; extracted automatically

• Player Names: names of players mentioned explicitly in the asset; extracted automatically

• Team Name: name of the team for which the player plays; not mentioned explicitly in the asset; value-added using Taalee's semantic relationships

• League Name: name of the league to which the player's team belongs; not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations

• Sport: name of the sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
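One way to picture such associations is as typed relationships stored as (subject, relation, object) triples. The sketch below assumes that representation; the names `TRIPLES`, `build_index`, `associated`, and `related_news`, and the two sample news items, are hypothetical and only illustrate the Commerce One example from this slide.

```python
from collections import defaultdict

# Hypothetical triples (subject, relation, object) based on the slide's example.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]

def build_index(triples):
    """Index triples as entity -> relation -> set of related objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
    return index

def associated(index, entity, relation):
    """Objects linked to the entity by the relation; empty list when unknown."""
    if entity not in index:
        return []
    return sorted(index[entity].get(relation, ()))

def related_news(index, news, company):
    """Titles of stories about companies that the given company competes with."""
    rivals = set(associated(index, company, "competes_with"))
    return [item["title"] for item in news if rivals & set(item["companies"])]

# Made-up news items, for illustration only.
NEWS = [
    {"title": "Commerce One story", "companies": ["Commerce One"]},
    {"title": "Ariba story", "companies": ["Ariba"]},
]

index = build_index(TRIPLES)
print(associated(index, "Commerce One", "competes_with"))
print(related_news(index, NEWS, "Commerce One"))
```

The relation is treated as directional here; a real knowledge base would have to decide whether "competes with" should also be followed in the reverse direction.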

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'.

• Links to news on companies that compete against Commerce One

• Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1-3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata, XML or other format; Semantic Application; ASP/Enterprise hosted; Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 is passed on to add semantic associations

2. Consults the Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story, and passed on to the Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

[Diagram: COMPANY at the center, linked to the following]

• Related Stock News

• Industry News

• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

• Automatic collation of semantically related digital media information from multiple sources

• Research inferred automatically

• Semantically related news not specifically asked for

• Semantic search, personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data

• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

• Based on RDF + XML

• Agent-readable tags

www.daml.org

DAML Example

[Figure: DAML markup example. Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html]

Three layered Architecture Of Semantic Web

• Logical Layer: formal semantics and reasoning support - OIL, DAML-O

• Schema Layer: definition of vocabulary - RDF Schema

• Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

[Diagram: Earthquake Sources (USGS, NEIC) and Nuclear Test Sources (Oklahoma Observatory, etc.)]

Nuclear tests may cause earthquakes. Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
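The rule above can be tried out with a short Python sketch. Several things here are assumptions rather than part of the slide: the units (days and miles, taken from a later slide in this deck), the haversine formula as the `distance` function, the extra check that the earthquake does not precede the test, and all function names and sample events.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant

def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))

def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the rule; rejecting quakes that precede the test is an added assumption."""
    days = (quake["event_date"] - test["event_date"]).days
    if days < 0 or days >= max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles

def candidate_pairs(tests, quakes):
    """All (test, quake) name pairs that satisfy the rule."""
    return [(t["name"], q["name"]) for t in tests for q in quakes if may_have_caused(t, q)]

# Made-up events, for illustration only.
tests = [{"name": "test-A", "event_date": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}]
quakes = [
    {"name": "quake-near", "event_date": date(2000, 1, 10), "latitude": 0.0, "longitude": 1.0},
    {"name": "quake-late", "event_date": date(2000, 3, 1), "latitude": 0.0, "longitude": 1.0},
    {"name": "quake-early", "event_date": date(1999, 12, 25), "latitude": 0.0, "longitude": 1.0},
    {"name": "quake-far", "event_date": date(2000, 1, 5), "latitude": 0.0, "longitude": 180.0},
]
print(candidate_pairs(tests, quakes))
```

Note that a 10000-mile radius covers most of the globe (the largest possible great-circle distance is roughly 12,400 miles), so with these thresholds the date condition does most of the filtering.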

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in the number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
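The grouping query on this slide might look like the following Python sketch. The half-open band boundaries, the (year, magnitude) record layout, the function names, and the sample records are all assumptions made for illustration; no real earthquake data is used.

```python
from collections import Counter

# Magnitude bands from the slide, treated as half-open intervals [low, high) (an assumption).
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]

def band_of(magnitude):
    """Return the band containing the magnitude, or None if it is below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None

def yearly_average_by_band(quakes, start_year, end_year):
    """Average number of earthquakes per year in each band over [start_year, end_year]."""
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if band is not None and start_year <= year <= end_year:
            counts[band] += 1
    years = end_year - start_year + 1
    return {band: counts[band] / years for band in BANDS}

# Made-up (year, magnitude) records, for illustration only.
sample = [(1900, 5.9), (1901, 6.5), (1901, 6.7), (1950, 7.2), (1899, 8.0), (1960, 9.5), (1902, 5.0)]
averages = yearly_average_by_band(sample, 1900, 1999)
for (low, high), avg in averages.items():
    print(f"{low}-{high}: {avg:.2f} per year")
```

Comparing these averages for the years before and after a chosen cutoff would give the kind of evidence the slide's conclusion relies on.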

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media. Amit Sheth and Wolfgang Klas, Eds. McGraw-Hill, ISBN 0-07-057735-8, 1998.

HP 84

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)

Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu

My Stocks

CSCO

NT

IBM

Market

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

MyStocks

News

Sports

Music

MyMedia

$

My Stocks

CSCO

NT

IBM

Market

CSCO

Analyst Call

Conf Call

Earnings

1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis

CSCO Analysis

Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst

HP 87

iTV Taaleersquos Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in

This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO

Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata

Conference Call itself can have embedded metadata to support personalization andinteractivity

HP 89

Metadata in Enterprise Apps

Filter Search ConsolidatePersonalize ArchiveLicensing Syndication

Production SupportProduction SupportSony

Categorize

Catalog

Integrate

CollectionCollection ProcessingProcessing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90

t

A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon

At least 60 people died in this needless fire senior local official Karimu Alabi said

Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses

At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)

Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)

-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush

Video

bull Value-add for production broadcast amp syndication

bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers

bull Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadatarsquos role in emerging iTV infrastructure

MPEG-247MPEG

Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks

Atlanta Falcons Players John KitnaCoaches Mike Holmgren

Dan ReevesLocation Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Content which doescontain the wordsthe user asked for

Extractor Agents

Content which does not contain the words

the user asked for but is about what he asked

for

Value-added Metadata

Content the user did not think to ask for but

which he needs to know

Semantic Associations

+ +

Metadata for Intelligent ContentMetadata for Intelligent Content

Usage

HP 94

Intelligent Contentvia

Value-Added Metadata

HP 95

Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable

users to access content

bull If a keyword is not in the content it cannot be found

bull The burden is on the user to think of and ask for the ldquorightrdquo keyword

For example If a story is about ldquoRoger Clemensrdquo but does not contain the

words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user

searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called

Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)

to add missing metadata to describe content more completely

HP 96

Guided Demo for Value Added Metadata ndashExample one

bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson

bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata

on the following RMR page

bull Here is what you see

Produced by NFLcom Posted Date 9202000 League NFL

Teams Atlanta Falcons Players Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on the first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: name of the content provider that produced the asset

Posted Date: date of asset posting - extracted automatically

Sport: name of sport

Player Names: names of players mentioned explicitly in the asset - extracted automatically

Team Name: name of the team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships

League Name: name of the league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly and fully described in the many ways users choose to interact with it
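The Player → Team → League chain described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two dictionaries are a hypothetical stand-in for the knowledge base (the slides do not describe how Taalee stores these relationships), and the function name is made up. The facts themselves are the ones from the two guided demos.

```python
# Hypothetical stand-in for a knowledge base of semantic relationships.
# The facts mirror the two guided demos; the dictionary layout is assumed.
PLAYS_FOR = {
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
TEAM_LEAGUE = {
    "Atlanta Falcons": "NFL",
    "Los Angeles Dodgers": "National League",
}


def unique(items):
    """Remove duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))


def add_value_added_metadata(extracted):
    """Return a copy of the extracted metadata with Teams and League added.

    Players -> Teams -> League: each step is one relationship lookup.
    """
    enriched = dict(extracted)
    players = extracted.get("Players", [])
    teams = unique(PLAYS_FOR[p] for p in players if p in PLAYS_FOR)
    leagues = unique(TEAM_LEAGUE[t] for t in teams if t in TEAM_LEAGUE)
    if teams:
        enriched["Teams"] = teams
    if leagues:
        enriched["League"] = leagues
    return enriched


# Metadata extracted directly from the asset (no team or league mentioned).
extracted = {"Produced by": "ESPN", "Players": ["Gary Sheffield"]}
enriched = add_value_added_metadata(extracted)
print(enriched)
```

A search index built over `enriched` rather than `extracted` would match a query for the team even though the team name never appears in the asset.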

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the term "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
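As a rough sketch of the difference, the Commerce One facts above can be held as (subject, relation, object) triples and used to return associated stories alongside the direct matches. The relation names follow the slide; the triple layout, the function names and the story list are illustrative assumptions, not Taalee's actual content model.

```python
# Facts as (subject, relation, object) triples. The relation names follow
# the slide; the triple layout and the story list are illustrative assumptions.
FACTS = [
    ("Commerce One", "IS_A", "COMPANY"),
    ("Commerce One", "PARTICIPATES_IN", "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
]


def objects_of(subject, relation):
    """All objects linked to `subject` by `relation`."""
    return [o for s, r, o in FACTS if s == subject and r == relation]


def semantic_search(company, stories):
    """Split matching stories into (asked_for, associated).

    asked_for:  stories tagged with the company itself.
    associated: stories tagged with one of its competitors.
    """
    competitors = set(objects_of(company, "COMPETES_WITH"))
    asked_for = [title for title, about in stories if company in about]
    associated = [
        title for title, about in stories
        if company not in about and competitors & set(about)
    ]
    return asked_for, associated


# Made-up stories, each tagged with the companies it is about.
STORIES = [
    ("Story A", ["Commerce One"]),
    ("Story B", ["Ariba"]),
    ("Story C", ["IBM"]),
]
asked_for, associated = semantic_search("Commerce One", STORIES)
print(asked_for, associated)
```

A keyword index alone would return only the first list; the second list exists only because the COMPETES WITH relationship is stored explicitly.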

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Content sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters)

Components: Extractor Agents 1-3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Result: Story on Cisco, Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
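To make "rules which allow us to draw inferences" concrete, here is a minimal sketch of one such rule applied to RDF-style triples: if x has type C and C is a subclass of D, then x also has type D. It is a plain forward-chaining loop, far simpler than a description logic reasoner, and the instance and class names (`cow`, `living_thing`) are made up for the illustration.

```python
def infer_types(triples):
    """Apply (x rdf:type C) and (C rdfs:subClassOf D) => (x rdf:type D)
    repeatedly until no new triple is produced."""
    facts = set(triples)
    while True:
        subclass = {(s, o) for s, p, o in facts if p == "rdfs:subClassOf"}
        new = {
            (x, "rdf:type", parent)
            for x, p, cls in facts if p == "rdf:type"
            for child, parent in subclass if child == cls
        } - facts
        if not new:
            return facts
        facts |= new


# Hypothetical data: one instance and a two-level class hierarchy.
triples = {
    ("cow", "rdf:type", "herbivore"),
    ("herbivore", "rdfs:subClassOf", "animal"),
    ("animal", "rdfs:subClassOf", "living_thing"),
}
inferred = infer_types(triples)
print(sorted(inferred - triples))
```

The two triples printed at the end were never stated in the data; they follow from the rule, which is the kind of benefit the slide attributes to a logic layer.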

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web - the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: formal semantics and reasoning support - OIL, DAML-O

Schema Layer: definition of vocabulary - RDF Schema

Data Layer: simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
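The rule above can be read as an executable predicate. The sketch below is one possible interpretation, not the InfoQuilt implementation: it assumes the date difference is in days, the distance is a great-circle (haversine) distance in miles, and the earthquake must not precede the test (the prose says "some time after", which the rule itself leaves implicit). The event records are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, with the thresholds from the slide."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    # Assumption: the quake must follow the test, so negative differences are excluded.
    return 0 <= days < max_days and miles < max_miles


# Hypothetical events, for illustration only.
test_event = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_soon = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -115.0}
quake_late = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -115.0}
print(may_have_caused(test_event, quake_soon), may_have_caused(test_event, quake_late))
```

Applying `may_have_caused` to every (test, earthquake) pair from the two sources yields candidate pairs for further analysis; it does not by itself establish causation.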

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
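The per-range averaging query can be sketched as follows. The magnitude ranges are the ones on the slide; treating them as half-open intervals (so that 6.0 falls in 6-7 and 9.0 in the top range), the `(year, magnitude)` record format and the sample records are assumptions made for the illustration.

```python
from collections import Counter

# Magnitude ranges from the slide, treated as half-open intervals [low, high).
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">=9", 9.0, float("inf")),
]


def bin_label(magnitude):
    """Return the label of the range containing the magnitude, or None if below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range over [first_year, last_year]."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for label, _, _ in BINS}


# Made-up (year, magnitude) records, for illustration only.
quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1899, 8.0), (1901, 5.0)]
averages = average_per_year(quakes, 1900, 1901)
print(averages)
```

Running the same function over the years before and after 1945 and comparing the two results per range is one way to check the "almost constant" observation above.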

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax

ICE: www.icestandard.org

Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05

XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03

DAML: www.daml.org

NewsML: newsshowcase.reuters.com

PRISM: www.prismstandard.org/techdev/prismspec1.asp

XCM: www.vignette.com

OIL: www.ontoknowledge.org/oil

Semantic Web: www.semanticweb.org

VoiceXML: www.voicexml.org

MPEG7: www.darmstadt.gmd.de/mobile/MPEG7

Taalee: www.taalee.com

Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw-Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 85

Application of Semantic Metadata and Automatic Content Enrichment

[Screen mock-up. Menu: MyStocks, News, Sports, Music, MyMedia. My Stocks: CSCO, NT, IBM, Market. CSCO: Analyst Call, Conf Call, Earnings]

Different types of recent audio content about Cisco are available

The user clicks to see a listing of Analyst Calls on Cisco (next slide)

Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

[Screen mock-up. Menu: MyStocks, News, Sports, Music, MyMedia. My Stocks: CSCO, NT, IBM, Market. CSCO: Analyst Call, Conf Call, Earnings. CSCO Analysis listing: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst

HP 87

iTV: Taalee's Extreme Personalization

Content Provider (DBS, DISH, Wink, AOL-TV)

Semantic Engine(TM)

Meta-Data Tagged Content

Content "Programs"

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-Quality Semantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in

This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata

The Conference Call itself can have embedded metadata to support personalization and interactivity

HP 89

Metadata in Enterprise Apps

Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication

Production Support

Sony

Categorize

Catalog

Integrate

Collection

Processing

Network Content

Affiliate Feeds

Public Sources Rich Data

Metabase

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)

Gore Says Fla. Can't Name Electors (4:50 PM)

Bush Meets Colin Powell at Ranch (1:22 PM)

Market Tumbles on Earnings Warning (9:27 AM)

Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast

Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

[Screenshot: lists of related video clips, each shown with duration, date (12/06/00) and source (ABC, CBS, FOX, NBC, NNS)]

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Description

Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

Video

MPEG Encoder (MPEG-2/4/7)

Create Scene Description Tree

Scene Description Tree

Node = AVO Object

Taalee Semantic Engine

"Cisco Systems" Node

Enhanced XML Description

Object Content Information (OCI): Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

Metadata-rich Value-added Node

Enhanced Digital Cable

MPEG Decoder

Retrieve Scene Description Track

GREAT USER EXPERIENCE

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely

HP 96

Guided Demo for Value-Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL

Teams: Atlanta Falcons; Players: Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)

Click on first result for Gary Sheffield

View metadata Note that Team name and League name are also included

in the metadata

Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content ndash Value-Added Metadata

Posted Date

Posted Date

Date of asset posting ndashExtracted automatically

League Name

Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations

Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships

Team NameTeam Name

Producer Name

Producer Name

Rich MediaSports AssetRich Media

Sports Asset

Name of content provider that produced the asset

Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added

by Taalee using its semantic relationships

The asset is richly fully described in the many ways the users chose to interact

Player NamesPlayer Names

SportSportName of

sport

LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset

Name of players mentioned explicitly in the asset ndash Extracted automatically

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

bull Traditional search engines rely solely on (syntactic) keywords to find content

bull They do not understand the meaning context or relationships of keywords

For example a search engine may see that the word ldquoCommerce Onerdquo occurs

but it does not know that Commerce One is a COMPANY which Participates in

the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba

As a result search engines cannot go beyond returning a list (or directory view)

of what the user has asked for Their ability to provide associated information is

extremely limited static and difficult to scale Taaleersquos Semantic Content Model

goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company lsquoCommerce Onersquo

Links to news on companies that compete against

Commerce One

Links to news on companies Commerce One competes

against(To view news on Ariba click

on the link for Ariba)

Crucial news on Commerce Onersquos

competitors (Ariba) can be accessed easily and

automatically

HP 104

Internal Source 1Research

Internal Source 2

External feedsWeb(eg Reuters)

1

2

3

4

Cisco story from PW Source 1passed on to addsemanticassociations

ConsultsKnowledgeBasefor Ciscorsquoscompetition

Returns resultLucent is a competitor of Cisco

Lucent story from external

feeds picked for publishing as ldquosemantically

relatedrdquo to Ciscostory ndash passed

on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata XML or other format

SemanticApplication

ASPEnterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata centricContent Management Architecture

SemanticEngine

World Model

TaaleeMetabase

Third-partyContent Mgmt

AndSyndication

HP 105

Semantic Associationssupported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANYCOMPANYRelated Stock News

Related Stock News

IndustryNews

IndustryNews

CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or

Related INDUSTRY

SECEPAEPA

RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY

Technology Products

Technology ProductsImportant to INDUSTRY or COMPANY

HP 106

Semantic Web Application ExampleFinancial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic SearchPersonalization etc

A vision for future

Semantic Web Complex Relationships and Knowledge Discovery

Eg InfoQuilt project at LSDIS Lab Univ of Georgia

HP 108

Beyond RDF ndash one proposal (cf Ora Lassila)

Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data

RDF + DL = ldquoFrame System for WWWrdquo

Source wwwontoknowledgeorgoil

HP 109

Semantic Web - next step in Web evolution

ldquoA Web in which machine reasoning will be ubiquitous and devastatingly powerfulrdquo [Berners-Lee]

ldquoA place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixturerdquo [Berners-Lee]

ldquoA semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activitiesrdquo [Berners-Lee]

A personal definitionSemantic Web The concept that Web-accessible

content can be organized semantically rather than though syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

a proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic mannerBased on RDF+XMLAgent readable Tags

wwwdamlorg

DAML Example

Sou

rce

http

w

ww

zdn

etc

omp

cwee

kst

orie

sju

mps

04

270

2432

946

00h

tml

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support ndash OIL DAML-O

Schema Layer

Definition of Vocabulary RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL ndash as RDF Extension

ltrdfsClass rdfID=rdquoherbivorerdquogtltrdftype

rdfresource=rdquohttpwwwontoknowledgeorgDefinedClassrdquogtltrdfssubClassOf rdfresource=rdquoanimalrdquogtltrdfssubClassOfgt

ltoilNOTgtltoilhasOperand rdfresource=rdquocarnivorerdquogt

ltoilNOTgtltrdfssubClassOfgt

ltrdfsClassgt

DAML and OIL ndash Evolving towards Semantic Web

OIL MissionOIL is a Web-based representation and inference layer for ontologies which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery Knowledge Discovery --ExampleExample

Earthquake Sources(USGS NEIC)

Nuclear Test Sources(Oklahoma Observatory etc)

Nuclear Test May Cause Earthquakes

Is it really true

Complex RelationshipsComplex Relationships

A nuclear test could have caused an earthquakeif the earthquake occurred some time after thenuclear test was conducted and in a nearby region

NuclearTest Causes Earthquakelt= dateDifference( NuclearTesteventDate

EarthquakeeventDate ) lt 30AND distance( NuclearTestlatitude

NuclearTestlongitudeEarthquakelatitudeEarthquakelongitude ) lt 10000

Knowledge Discovery Knowledge Discovery --ExampleExample

When was the first recorded nuclear test conducted

1950Find the total number of earthquakes with a magnitude58 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery Knowledge Discovery --ExamplehellipExamplehellip

For each group of earthquakes with magnitudes in the ranges58-6 6-7 7-8 8-9 and gt9 on the Richter scale per yearstarting from 1900 find average number of earthquakes

Number of earthquakes with magnitude gt 7 almost constant So nuclear tests probably only cause earthquakes with magnitude lt 7

KnowledgeKnowledge DiscoveryDiscovery --ExampleExamplehelliphellip

Find pairs of nuclear tests and earthquakes such that the earthequakeoccurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

ResourcesReferences

RDFwwww3orgTRREC-rdf-syntaxICE wwwicestandardorgMeta Object Facility (MOF) Specification Version 13 September 27 1999 httpcgiomgorgcgi-bindocad99-09-05XML Metadata Interchange (XMI) Specification Version 11 October 25 1999 httpcgiomgorgcgi-bindocad9910-02httpcgiomgorgcgi-bindocad99-10-03DAML wwwdamlorgNEWSML newsshowcasereuterscomPRISM wwwprismstandardorgtechdevprismspec1aspXCM wwwvignettecomOIL wwwontoknowledgeorgoilSEMANTICWEB wwwsemanticweborgVOICEXML wwwvoicexmlorgMPEG7 wwwdarmstadtgmddemobileMPEG7Taalee wwwtaaleecomOingo wwwoingocom

Multimedia Data Management Using Metadata to Integrate and Apply Digital Media Amit Sheth and Wolfgang Klas Eds McGraw Hill ISBN 0-07-057735-8 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata?
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • What RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • VoiceXML Metadata
  • VoiceXML - Possible Services
  • MPEG-7
  • Application Examples for MPEG-7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol?
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)?
  • Three-layered Architecture of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 86

Application of Semantic Metadata and Automatic Content Enrichment

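[Screenshot: personalized MyMedia page with MyStocks, News, Sports and Music sections. My Stocks lists CSCO, NT, IBM and Market; the CSCO entry offers Analyst Call, Conf Call and Earnings links. The CSCO Analysis listing shows: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis.]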

Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call entry focuses on the source and the analyst's name or company. The icons denote additional metadata, such as a "Strong Buy" rating by the H&Q analyst.

HP 87

iTV: Taalee's Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Semantic Engine(TM); Meta-Data Tagged Content; Content "Programs"; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming; Structured Hi-Quality Semantic Metabase]

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

This screen is customizable, with an interactivity feature that uses metadata such as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

The Conference Call itself can have embedded metadata to support personalization and interactivity.
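The personalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Segment` type, its `tickers` and `kind` fields, and the sample segments are hypothetical and are not taken from Taalee's actual metadata schema.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A broadcast segment with hypothetical embedded metadata."""
    title: str
    tickers: set = field(default_factory=set)  # stock symbols the segment is about
    kind: str = "news"                         # e.g. "news" or "conference_call"


def personalize(segments, watched_tickers):
    """Keep only segments whose metadata mentions a ticker the viewer follows."""
    return [s for s in segments if s.tickers & watched_tickers]


def new_conference_calls(segments, ticker):
    """Segments that could trigger an on-screen 'new Conference Call' prompt."""
    return [s for s in segments
            if s.kind == "conference_call" and ticker in s.tickers]


# Made-up sample data for illustration only.
segments = [
    Segment("Cisco quarterly conference call", {"CSCO"}, "conference_call"),
    Segment("IBM earnings report", {"IBM"}),
    Segment("Oil prices rise", set()),
]
```

Calling `personalize(segments, {"CSCO", "NT"})` keeps only the Cisco segment, because it is the only one whose metadata overlaps the viewer's stock list.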

HP 89

Metadata in Enterprise Apps

[Diagram labels: Collection; Processing; Network Content; Affiliate Feeds; Public Sources; Rich Data; Categorize; Catalog; Integrate; Metabase; Production Support; Filter; Search; Consolidate; Personalize; Archive; Licensing; Syndication; Sony]

HP 90

[Screenshot: newsroom application showing a Euronews video story, "Sixty Die In Nigeria Blast" (a leaking gasoline pipeline burst into flames near Lagos, killing more than 60 people), next to a "Breaking News for 11/30/2000" list: Gore Demands That Recount Restart (9:40 PM); Gore Says Fla. Can't Name Electors (4:50 PM); Bush Meets Colin Powell at Ranch (1:22 PM); Market Tumbles on Earnings Warning (9:27 AM); Barak Outlines His Peace Plan (6:30 AM). A metadata panel shows the fields Produced by, Posted Date, Event, Location and People.]

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances news-room productivity and time-to-market

HP 91

[Screenshot: "Breaking News" page listing the headlines Gore Demands That Recount Restart; Gore Says Fla. Can't Name Electors; Bush Meets Colin Powell at Ranch; Market Tumbles on Earnings Warning; Barak Outlines His Peace Plan. Each headline has video clips identified by duration, date (12/06/00) and network (ABC, CBS, NBC, FOX, NNS). The selected CNN story from Tallahassee, on the Seminole County and Martin County absentee-ballot lawsuits, is shown with this description: Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore.]

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Enhanced Digital Cable; Video; MPEG-2/4/7; MPEG Encoder; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Node "Cisco Systems"; Enhanced XML Description; Taalee Semantic Engine; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Sample node metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content Usage

Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, the content cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
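A minimal Python sketch of the idea above, assuming a tiny hand-written knowledge base: metadata extracted from an asset is extended with facts implied by what is known about the entities it mentions. The `KNOWLEDGE_BASE` dictionary, the field names and both functions are hypothetical illustrations, not Taalee's Semantic Engine or WorldModel.

```python
# Hypothetical mini knowledge base: entity -> list of (relation, value) facts.
KNOWLEDGE_BASE = {
    "Roger Clemens": [("sport", "Baseball"), ("team", "New York Yankees")],
    "Jamal Anderson": [("team", "Atlanta Falcons"), ("league", "NFL")],
}


def add_value_added_metadata(metadata):
    """Return a copy of `metadata` extended with facts implied by the knowledge base.

    Values already extracted from the asset are kept; implied values are appended.
    """
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        for relation, value in KNOWLEDGE_BASE.get(player, []):
            values = enriched.setdefault(relation, [])
            if value not in values:
                values.append(value)
    return enriched


def matches(metadata, query):
    """True if any metadata value contains the query (case-insensitive)."""
    q = query.lower()
    return any(q in v.lower() for values in metadata.values() for v in values)


# Metadata as extracted from a story that never mentions the team.
story = {"players": ["Roger Clemens"]}
enriched_story = add_value_added_metadata(story)
```

With this setup, `matches(story, "Yankees")` is false while `matches(enriched_story, "Yankees")` is true, which is the difference between keyword indexing and value-added metadata that the slide describes.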

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate the story with the heading "Week 3 top 10 Anderson TD run"

• Verify that neither Team = Atlanta Falcons nor League = NFL was present in the source content

• Taalee attached this value-added metadata to the asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate the story with the heading "I want out"

• Verify that neither Team = Los Angeles Dodgers nor League = National League was present in the source content

• Taalee attached this value-added metadata to the asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

• Search for 'Jamal Anderson' in 'Football'

• Click on the first result for Jamal Anderson

• View the metadata. Note that Team name and League name are also included in the metadata

• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

• Search for 'Gary Sheffield' in 'Baseball'

• Click on the first result for Gary Sheffield

• View the metadata. Note that Team name and League name are also included in the metadata

• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

• Producer Name: name of the content provider that produced the asset

• Posted Date: date of asset posting. Extracted automatically

• Player Names: names of players mentioned explicitly in the asset. Extracted automatically

• Team Name: name of the team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee's semantic relationships

• League Name: name of the league to which the player's team belongs. Not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations

• Sport: name of the sport

Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact with it.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all the content it catalogs.
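The contrast above can be illustrated with a small Python sketch that stores typed associations as triples and uses them to return related news alongside the direct hits. The relation names, the `ASSOCIATIONS` and `NEWS` sample data and both functions are assumptions made for this example; they do not describe Taalee's Semantic Content Model.

```python
# Hypothetical typed associations: (subject, relation, object).
ASSOCIATIONS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Made-up sample headlines tagged with company metadata.
NEWS = [
    {"headline": "Commerce One signs new exchange partner", "company": "Commerce One"},
    {"headline": "Ariba reports quarterly results", "company": "Ariba"},
]


def related(entity, relation):
    """Objects linked to `entity` by `relation`.

    competes_with is treated as symmetric (an assumption of this sketch).
    """
    found = {o for s, r, o in ASSOCIATIONS if s == entity and r == relation}
    if relation == "competes_with":
        found |= {s for s, r, o in ASSOCIATIONS if o == entity and r == relation}
    return found


def search(company):
    """Return (what was asked for, what the user may also need to know)."""
    asked = [n["headline"] for n in NEWS if n["company"] == company]
    rivals = related(company, "competes_with")
    extra = [n["headline"] for n in NEWS if n["company"] in rivals]
    return asked, extra
```

A plain keyword index would only produce the first list; the second list exists because the competes_with association is recorded explicitly.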

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML or other format; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 is passed on to add semantic associations

2. Consults the Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked For + What You Need to Know

[Diagram: COMPANY at the center, linked to the following]

• Related Stock News

• Industry News

• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY

• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

• Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., the InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF, and some type of description logic is a possibility.

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data.

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner.

• Based on RDF + XML

• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

• Logical Layer: formal semantics and reasoning support (OIL, DAML-O)

• Schema Layer: definition of vocabulary (RDF Schema)

• Data Layer: simple data model and syntax for metadata (RDF)

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
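The rule above can be sketched as an executable predicate. This is a minimal Python illustration, not the InfoQuilt implementation: it assumes the date difference is measured in days, the distance in miles along a great circle (haversine formula), and that the earthquake must not precede the test, which the prose states but the rule leaves implicit. The dictionary keys mirror the attribute names in the rule; the function names are otherwise hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumed unit


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Evaluate the 'NuclearTest Causes Earthquake' rule for one pair of events."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    # 0 <= days is the assumed reading of "some time after the test".
    return 0 <= days < max_days and miles < max_miles
```

Applying `may_have_caused` to every (test, earthquake) pair drawn from the two sources yields the candidate causal pairs that the rule defines; the thresholds are parameters so that tighter notions of "nearby" can be tried.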

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900.

The number of earthquakes with magnitude > 7 is almost constant, so nuclear tests probably only cause earthquakes with magnitude < 7.
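The grouping queries above amount to counting earthquakes per year and per magnitude range. The following Python sketch shows one way to do that; the input format (a list of `(year, magnitude)` pairs), the half-open treatment of range boundaries and the function names are assumptions made for this illustration.

```python
from collections import Counter

# Magnitude ranges from the slide: 5.8-6, 6-7, 7-8, 8-9 and >9.
# Each range is treated as half-open [low, high), an assumption of this sketch.
BINS = [(5.8, 6.0, "5.8-6"), (6.0, 7.0, "6-7"), (7.0, 8.0, "7-8"), (8.0, 9.0, "8-9")]


def magnitude_bin(magnitude):
    """Label of the range containing `magnitude`, or None if it is below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return ">9" if magnitude >= 9.0 else None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, range) from `start_year` onward."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one range over an inclusive span of years."""
    years = range(first_year, last_year + 1)
    return sum(counts[(y, label)] for y in years) / len(years)
```

Comparing `average_per_year` for a range before and after 1945 is the kind of check behind the slide's observation about which magnitude ranges changed.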

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test took place within a radius of 10,000 miles of the epicenter of the earthquake.

Demo

HP 87

iTV Taaleersquos Extreme Personalization

Content Provider

(DBS DISH Wink AOL-TV)

Semantic EngineTM

Meta-DataTagged Content

ContentldquoProgramsrdquo

Immediate Interests

Preferences

Personalized Content Capsules

Redirects and Programming

Structured Hi-QualitySemantic Metabase

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in

This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO

Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata

Conference Call itself can have embedded metadata to support personalization andinteractivity

HP 89

Metadata in Enterprise Apps

Filter Search ConsolidatePersonalize ArchiveLicensing Syndication

Production SupportProduction SupportSony

Categorize

Catalog

Integrate

CollectionCollection ProcessingProcessing

NetworkContent

AffiliateFeeds

Public Sources Rich Data

Metabase

HP 90

t

A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon

At least 60 people died in this needless fire senior local official Karimu Alabi said

Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses

At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)

Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)

-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush

Video

bull Value-add for production broadcast amp syndication

bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers

bull Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadatarsquos role in emerging iTV infrastructure

MPEG-247MPEG

Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks

Atlanta Falcons Players John KitnaCoaches Mike Holmgren

Dan ReevesLocation Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Content which doescontain the wordsthe user asked for

Extractor Agents

Content which does not contain the words

the user asked for but is about what he asked

for

Value-added Metadata

Content the user did not think to ask for but

which he needs to know

Semantic Associations

+ +

Metadata for Intelligent ContentMetadata for Intelligent Content

Usage

HP 94

Intelligent Contentvia

Value-Added Metadata

HP 95

Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable

users to access content

bull If a keyword is not in the content it cannot be found

bull The burden is on the user to think of and ask for the ldquorightrdquo keyword

For example If a story is about ldquoRoger Clemensrdquo but does not contain the

words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user

searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called

Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)

to add missing metadata to describe content more completely

HP 96

Guided Demo for Value Added Metadata ndashExample one

bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson

bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata

on the following RMR page

bull Here is what you see

Produced by NFLcom Posted Date 9202000 League NFL

Teams Atlanta Falcons Players Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 – Snapshots ("Gary Sheffield")

• Search for 'Gary Sheffield' in 'Baseball'.

• Click on the first result for Gary Sheffield.

• View the metadata. Note that Team name and League name are also included in the metadata.

• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.

HP 100

Intelligent Content – Value-Added Metadata

Metadata for a rich media sports asset:

• Producer Name: name of the content provider that produced the asset.

• Posted Date: date the asset was posted. Extracted automatically.

• Sport: name of the sport.

• Player Names: names of players mentioned explicitly in the asset. Extracted automatically.

• Team Name: name of the team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee's semantic relationships.

• League Name: name of the league to which the player's team belongs. Not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations.

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content.

• They do not understand the meaning, context, or relationships of keywords.

For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
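The idea of returning associated content can be sketched in a few lines. This is only an illustration, not Taalee's Semantic Content Model: the triple layout, the relation names, the `NEWS` catalog, and the functions `competitors` and `search` are all assumptions made for the example. Only the Commerce One and Ariba facts come from the slide.

```python
# Relationship triples (subject, relation, object); facts taken from the slide's example.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog of news assets, each tagged with the company it is about.
NEWS = [
    {"title": "Story on Commerce One", "company": "Commerce One"},
    {"title": "Story on Ariba", "company": "Ariba"},
    {"title": "Story on Cisco", "company": "Cisco"},
]


def competitors(company):
    """Companies linked to `company` by competes_with, in either direction."""
    found = set()
    for subject, relation, obj in TRIPLES:
        if relation != "competes_with":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return found


def search(company):
    """Return (asked_for, related): direct hits plus news on competitors."""
    rivals = competitors(company)
    asked_for = [n["title"] for n in NEWS if n["company"] == company]
    related = [n["title"] for n in NEWS if n["company"] in rivals]
    return asked_for, related
```

A keyword index would return only the first list; the second list exists because the relationship is stored as data rather than inferred from matching words.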

HP 103

Example (test on http://directory.mediaanywhere.com)

• Search for company 'Commerce One'.

• The result page includes links to news on companies that compete against Commerce One. (To view news on Ariba, click on the link for Ariba.)

• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically.

HP 104

Metadata-centric Content Management Architecture

Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)

Components: Extractor Agents 1, 2, 3; Semantic Engine; World Model; Knowledge Base; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML, or other format

Flow:

1. Cisco story from PW Source 1 is passed on to add semantic associations.

2. Consults the Knowledge Base for Cisco's competition.

3. Returns the result: Lucent is a competitor of Cisco.

4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard.

Result: Story on Cisco; Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What you asked for + What you need to know

Associations around a COMPANY:

• Related Stock News

• Industry News

• Competition: COMPANIES in the INDUSTRY with competing PRODUCTS; COMPANIES in the same or a related INDUSTRY

• Regulations (SEC, EPA): impacting the INDUSTRY or filed by the COMPANY

• Technology Products: important to the INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

• Automatic collation of semantically related digital media information from multiple sources

• Research inferred automatically

• Semantically related news not specifically asked for

• Semantic search, personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., the InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.

• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
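What "rules which allow us to draw inferences" adds on top of plain triples can be shown with a toy forward-chaining loop. This is a sketch under stated assumptions, not a description logic reasoner and not an RDF library: triples are plain Python tuples, the two rules are the only ones applied, and the names `giraffe` and `g1` are invented for illustration.

```python
def infer(triples):
    """Apply two simple rules until no new triple appears (forward chaining)."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                if o != s2 or p2 != "subClassOf":
                    continue
                if p == "subClassOf":
                    # Rule 1: subClassOf is transitive.
                    new.add((s, "subClassOf", o2))
                elif p == "type":
                    # Rule 2: an instance of a class is an instance of its superclasses.
                    new.add((s, "type", o2))
        if new <= facts:
            return facts
        facts |= new


data = {
    ("herbivore", "subClassOf", "animal"),
    ("giraffe", "subClassOf", "herbivore"),  # illustrative class
    ("g1", "type", "giraffe"),               # illustrative instance
}
closure = infer(data)
```

The structural data says only that `g1` is a giraffe; the fact that `g1` is an animal appears in `closure` solely because of the rules.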

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition. Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner.

• Based on RDF + XML

• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of the Semantic Web

• Logical Layer: formal semantics and reasoning support (OIL, DAML-O)

• Schema Layer: definition of vocabulary (RDF Schema)

• Data Layer: simple data model and syntax for metadata (RDF)

OIL – as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>

DAML and OIL – Evolving towards the Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery – Example

• Earthquake sources (USGS, NEIC)

• Nuclear test sources (Oklahoma Observatory, etc.)

Nuclear tests may cause earthquakes. Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

    NuclearTest Causes Earthquake <=
        dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
        AND distance(NuclearTest.latitude, NuclearTest.longitude,
                     Earthquake.latitude, Earthquake.longitude) < 10000
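The rule above can be written as an executable predicate. The following is a minimal sketch, not the InfoQuilt implementation: the `Event` class, the haversine distance, the Earth radius constant, and the reading of the thresholds as 30 days and 10000 miles (the unit is taken from the later query slide) are assumptions. The requirement that the earthquake not precede the test follows the prose statement of the rule.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an assumed constant


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(a, b):
    """Great-circle distance between two events by the haversine formula."""
    lat1, lon1, lat2, lon2 = map(
        radians, (a.latitude, a.longitude, b.latitude, b.longitude)
    )
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and lies within max_miles."""
    days = (quake.event_date - test.event_date).days
    return 0 <= days < max_days and distance_miles(test, quake) < max_miles
```

Both thresholds are parameters, so the same predicate can be rerun with a tighter region or time window when exploring whether the relationship holds.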

Knowledge Discovery – Example

When was the first recorded nuclear test conducted? 1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Result: increase in the number of earthquakes since 1945.

Knowledge Discovery – Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.

Result: the number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
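The grouping query can be sketched as a small aggregation. This is an illustration only: the input layout of `(year, magnitude)` pairs, the function names, and the choice of half-open bands (so that a magnitude of exactly 6.0 falls in the 6-7 band and 9.0 in the top band) are assumptions, and no real earthquake data is included.

```python
from collections import Counter

# Magnitude bands from the slide, treated here as half-open intervals [low, high).
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude):
    """Return the band containing the magnitude, or None if it is below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """quakes: iterable of (year, magnitude). Returns {band: mean count per year}."""
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if band is not None and first_year <= year <= last_year:
            counts[band] += 1
    years = last_year - first_year + 1
    return {band: counts[band] / years for band in BANDS}
```

Running it once for the years before testing began and once for the years after would give the two sets of averages that the slide's conclusion compares.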

Knowledge Discovery – Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NewsML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• Semantic Web: www.semanticweb.org
• VoiceXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998.

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • What RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML – Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP): Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for the future
  • Beyond RDF – one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three-layered Architecture of the Semantic Web
  • OIL – as RDF Extension
  • DAML and OIL – Evolving towards the Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 88

Metadata for Automatic Content Enrichment

Interactive Television

• This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.

• This screen is customizable with an interactivity feature, using metadata such as whether there is a new conference call video on CSCO.

• Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.

• The conference call itself can have embedded metadata to support personalization and interactivity.

HP 89

Metadata in Enterprise Apps

• Sources: Network Content; Affiliate Feeds; Public Sources; Rich Data

• Collection and Processing: Categorize; Catalog; Integrate

• Metabase

• Production Support (Sony): Filter; Search; Consolidate; Personalize; Archive; Licensing; Syndication

HP 90


A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

• Gore Demands That Recount Restart (9:40 PM)
• Gore Says Fla. Can't Name Electors (4:50 PM)
• Bush Meets Colin Powell at Ranch (1:22 PM)
• Market Tumbles on Earnings Warning (9:27 AM)
• Barak Outlines His Peace Plan (6:30 AM)

Customize Page: Settings | Content | Layout | Color

Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --

• Gore Demands That Recount Restart
• Gore Says Fla. Can't Name Electors
• Bush Meets Colin Powell at Ranch
• Market Tumbles on Earnings Warning
• Barak Outlines His Peace Plan

Video clips (duration, date, source):

(1:33) – 12/06/00 – ABC
(2:53) – 12/06/00 – CBS
(5:16) – 12/06/00 – ABC
(2:46) – 12/06/00 – FOX
(1:33) – 12/06/00 – NBC
(5:33) – 12/06/00
(3:57) – 12/06/00 – CBS
(4:27) – 12/06/00 – ABC
(3:44) – 12/06/00 – FOX
(7:24) – 12/06/00 – CBS
(1:33) – 12/06/00 – CBS
(1:33) – 12/06/00 – ABC
(2:33) – 12/06/00 – CBS
(3:12) – 12/06/00 – NNS
(0:32) – 12/06/00 – CBS
(1:33) – 12/06/00 – CBS

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

Diagram components: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE

• A node ("Cisco Systems") is passed to the Taalee Semantic Engine, which returns an Enhanced XML Description, giving a metadata-rich, value-added node.

• Object Content Information (OCI) example: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

• License metadata decoder and semantic applications to device makers.

• Channel sales through Video Server Vendors, Video App Servers, and Broadcasters.

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content: Usage

• Extractor Agents: content which does contain the words the user asked for

+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

+ Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found.

• The burden is on the user to think of and ask for the "right" keyword.

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
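How a stored association turns into missing metadata can be sketched as a lookup over a small knowledge base. This is an illustration under stated assumptions rather than Taalee's patented extraction process: the dictionaries `PLAYER_TEAM` and `TEAM_LEAGUE`, the field names, and the functions `enrich` and `matches` are hypothetical. The player, team, and league values are the ones named in the workshop's examples.

```python
# Hypothetical mini knowledge base; a real one would hold far more entities.
PLAYER_TEAM = {
    "Roger Clemens": "New York Yankees",
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
TEAM_LEAGUE = {
    "Atlanta Falcons": "NFL",
    "Los Angeles Dodgers": "National League",
}


def enrich(metadata):
    """Return a copy of metadata with team and league values implied by its players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    teams = enriched.setdefault("Teams", [])
    leagues = enriched.setdefault("League", [])
    for player in enriched.get("Players", []):
        team = PLAYER_TEAM.get(player)
        if team is None:
            continue  # unknown player: nothing can be added
        if team not in teams:
            teams.append(team)
        league = TEAM_LEAGUE.get(team)
        if league and league not in leagues:
            leagues.append(league)
    return enriched


def matches(metadata, term):
    """Exact-value search over every metadata field."""
    return any(term in values for values in metadata.values())


# Metadata extracted from the asset itself: only the player is mentioned.
story = {"Players": ["Jamal Anderson"], "Producer": ["NFL.com"]}
enriched_story = enrich(story)
```

A search for "Atlanta Falcons" misses `story` but finds `enriched_story`, which is the behavior the keyword-only approach cannot provide.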

HP 96

Guided Demo for Value-Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson.

• Click on the first result (titled "Week 3 Top 10 Anderson TD Run") and view the metadata on the following RMR page.

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)

Click on first result for Gary Sheffield

View metadata Note that Team name and League name are also included

in the metadata

Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content ndash Value-Added Metadata

Posted Date

Posted Date

Date of asset posting ndashExtracted automatically

League Name

Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations

Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships

Team NameTeam Name

Producer Name

Producer Name

Rich MediaSports AssetRich Media

Sports Asset

Name of content provider that produced the asset

Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added

by Taalee using its semantic relationships

The asset is richly fully described in the many ways the users chose to interact

Player NamesPlayer Names

SportSportName of

sport

LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset

Name of players mentioned explicitly in the asset ndash Extracted automatically

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

bull Traditional search engines rely solely on (syntactic) keywords to find content

bull They do not understand the meaning context or relationships of keywords

For example a search engine may see that the word ldquoCommerce Onerdquo occurs

but it does not know that Commerce One is a COMPANY which Participates in

the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba

As a result search engines cannot go beyond returning a list (or directory view)

of what the user has asked for Their ability to provide associated information is

extremely limited static and difficult to scale Taaleersquos Semantic Content Model

goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company lsquoCommerce Onersquo

Links to news on companies that compete against

Commerce One

Links to news on companies Commerce One competes

against(To view news on Ariba click

on the link for Ariba)

Crucial news on Commerce Onersquos

competitors (Ariba) can be accessed easily and

automatically

HP 104

Internal Source 1Research

Internal Source 2

External feedsWeb(eg Reuters)

1

2

3

4

Cisco story from PW Source 1passed on to addsemanticassociations

ConsultsKnowledgeBasefor Ciscorsquoscompetition

Returns resultLucent is a competitor of Cisco

Lucent story from external

feeds picked for publishing as ldquosemantically

relatedrdquo to Ciscostory ndash passed

on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata XML or other format

SemanticApplication

ASPEnterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata centricContent Management Architecture

SemanticEngine

World Model

TaaleeMetabase

Third-partyContent Mgmt

AndSyndication

HP 105

Semantic Associationssupported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANYCOMPANYRelated Stock News

Related Stock News

IndustryNews

IndustryNews

CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or

Related INDUSTRY

SECEPAEPA

RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY

Technology Products

Technology ProductsImportant to INDUSTRY or COMPANY

HP 106

Semantic Web Application ExampleFinancial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic SearchPersonalization etc

A vision for future

Semantic Web Complex Relationships and Knowledge Discovery

Eg InfoQuilt project at LSDIS Lab Univ of Georgia

HP 108

Beyond RDF ndash one proposal (cf Ora Lassila)

Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data

RDF + DL = ldquoFrame System for WWWrdquo

Source wwwontoknowledgeorgoil

HP 109

Semantic Web - next step in Web evolution

ldquoA Web in which machine reasoning will be ubiquitous and devastatingly powerfulrdquo [Berners-Lee]

ldquoA place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixturerdquo [Berners-Lee]

ldquoA semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activitiesrdquo [Berners-Lee]

A personal definitionSemantic Web The concept that Web-accessible

content can be organized semantically rather than though syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

a proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic mannerBased on RDF+XMLAgent readable Tags

wwwdamlorg

DAML Example

Sou

rce

http

w

ww

zdn

etc

omp

cwee

kst

orie

sju

mps

04

270

2432

946

00h

tml

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support ndash OIL DAML-O

Schema Layer

Definition of Vocabulary RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL ndash as RDF Extension

ltrdfsClass rdfID=rdquoherbivorerdquogtltrdftype

rdfresource=rdquohttpwwwontoknowledgeorgDefinedClassrdquogtltrdfssubClassOf rdfresource=rdquoanimalrdquogtltrdfssubClassOfgt

ltoilNOTgtltoilhasOperand rdfresource=rdquocarnivorerdquogt

ltoilNOTgtltrdfssubClassOfgt

ltrdfsClassgt

DAML and OIL ndash Evolving towards Semantic Web

OIL MissionOIL is a Web-based representation and inference layer for ontologies which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery Knowledge Discovery --ExampleExample

Earthquake Sources(USGS NEIC)

Nuclear Test Sources(Oklahoma Observatory etc)

Nuclear Test May Cause Earthquakes

Is it really true

Complex RelationshipsComplex Relationships

A nuclear test could have caused an earthquakeif the earthquake occurred some time after thenuclear test was conducted and in a nearby region

NuclearTest Causes Earthquakelt= dateDifference( NuclearTesteventDate

EarthquakeeventDate ) lt 30AND distance( NuclearTestlatitude

NuclearTestlongitudeEarthquakelatitudeEarthquakelongitude ) lt 10000

Knowledge Discovery Knowledge Discovery --ExampleExample

When was the first recorded nuclear test conducted

1950Find the total number of earthquakes with a magnitude58 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery Knowledge Discovery --ExamplehellipExamplehellip

For each group of earthquakes with magnitudes in the ranges58-6 6-7 7-8 8-9 and gt9 on the Richter scale per yearstarting from 1900 find average number of earthquakes

Number of earthquakes with magnitude gt 7 almost constant So nuclear tests probably only cause earthquakes with magnitude lt 7

KnowledgeKnowledge DiscoveryDiscovery --ExampleExamplehelliphellip

Find pairs of nuclear tests and earthquakes such that the earthequakeoccurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

ResourcesReferences

RDFwwww3orgTRREC-rdf-syntaxICE wwwicestandardorgMeta Object Facility (MOF) Specification Version 13 September 27 1999 httpcgiomgorgcgi-bindocad99-09-05XML Metadata Interchange (XMI) Specification Version 11 October 25 1999 httpcgiomgorgcgi-bindocad9910-02httpcgiomgorgcgi-bindocad99-10-03DAML wwwdamlorgNEWSML newsshowcasereuterscomPRISM wwwprismstandardorgtechdevprismspec1aspXCM wwwvignettecomOIL wwwontoknowledgeorgoilSEMANTICWEB wwwsemanticweborgVOICEXML wwwvoicexmlorgMPEG7 wwwdarmstadtgmddemobileMPEG7Taalee wwwtaaleecomOingo wwwoingocom

Multimedia Data Management Using Metadata to Integrate and Apply Digital Media Amit Sheth and Wolfgang Klas Eds McGraw Hill ISBN 0-07-057735-8 1998

  • The Mysteries of MetadataWorkshop at Content World 2001 Burlingame CA May 15 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperabilitykey metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards(or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsMLhellip
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 89

Metadata in Enterprise Apps

[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Rich Data; Collection; Processing; Categorize; Catalog; Integrate; Metabase; Production Support (Sony); Filter; Search; Consolidate; Personalize; Archive; Licensing; Syndication]

HP 90

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were

-- Breaking News for 11/30/2000 --

Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)

Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria Blast

Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush

Video

• Value-add for production, broadcast & syndication

• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers

• Greatly enhances newsroom productivity and time-to-market

HP 91

-- Breaking News --

Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description

Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram labels: Video; MPEG-2/4/7; MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Node = AVO Object; Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node; Node "Cisco Systems"; GREAT USER EXPERIENCE]

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers and Broadcasters

Example node metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation and Usage:

• Extractor Agents: content which does contain the words the user asked for

• + Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

• + Semantic Associations: content the user did not think to ask for, but which he needs to know

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata that describes content more completely.
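The enrichment idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `KNOWLEDGE_BASE` dictionary, the field names, and the story title are hypothetical and do not reflect Taalee's actual WorldModel or patented extraction technology.

```python
# Hypothetical mini knowledge base (entity -> known facts). The structure is an
# assumption for illustration; the facts themselves come from the slides.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"sport": "Football", "team": "Atlanta Falcons", "league": "NFL"},
}

# Which knowledge-base fact fills which metadata field.
FIELD_FOR_FACT = (("Teams", "team"), ("League", "league"), ("Sport", "sport"))


def add_value_added_metadata(metadata):
    """Return a copy of the metadata with facts implied by the listed players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for field, fact_key in FIELD_FOR_FACT:
            value = facts.get(fact_key)
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """Plain keyword match over all metadata values (the 'syntactic' baseline)."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values() for value in values)


story = {"Title": ["Clemens strikes out ten"], "Players": ["Roger Clemens"]}
enriched_story = add_value_added_metadata(story)
```

With the extracted metadata alone, a search for "Yankees" misses the story; after enrichment the same keyword search finds it, because the team name was added from the knowledge base rather than from the text.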

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that neither Team = Atlanta Falcons nor League = NFL was present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that neither Team = Los Angeles Dodgers nor League = National League was present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

• Search for 'Jamal Anderson' in 'Football'

• Click on the first result for Jamal Anderson

• View the metadata. Note that Team name and League name are also included in the metadata

• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

• Search for 'Gary Sheffield' in 'Baseball'

• Click on the first result for Gary Sheffield

• View the metadata. Note that Team name and League name are also included in the metadata

• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset:

• Producer Name: name of content provider that produced the asset

• Posted Date: date of asset posting - extracted automatically

• Player Names: names of players mentioned explicitly in the asset - extracted automatically

• Team Name: name of team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships

• League Name: name of league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations

• Sport: name of sport

Legend: "X -> Y" means Taalee uses X to add Y as value-added metadata to the asset.

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly and fully described in the many ways users choose to interact.

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the term "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
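A minimal Python sketch of how a COMPETES WITH association could widen a search result follows. The `SemanticModel` class, the relation name `COMPETES_WITH`, and the sample story titles are hypothetical illustrations, not the Taalee Semantic Engine's interface.

```python
from collections import defaultdict


class SemanticModel:
    """Tiny store of typed relationships between named entities (illustrative)."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj, symmetric=False):
        self._edges[(subject, relation)].add(obj)
        if symmetric:
            self._edges[(obj, relation)].add(subject)

    def related(self, subject, relation):
        return sorted(self._edges.get((subject, relation), set()))


def search_with_associations(model, stories, company):
    """Return what the user asked for plus news on the company's competitors."""
    asked_for = [s["title"] for s in stories if company in s["companies"]]
    competitor_news = {
        rival: [s["title"] for s in stories if rival in s["companies"]]
        for rival in model.related(company, "COMPETES_WITH")
    }
    return {"asked_for": asked_for, "competitor_news": competitor_news}


model = SemanticModel()
model.add("Commerce One", "COMPETES_WITH", "Ariba", symmetric=True)

# Hypothetical cataloged stories, each tagged with the companies it mentions.
stories = [
    {"title": "Commerce One signs marketplace deal", "companies": ["Commerce One"]},
    {"title": "Ariba reports quarterly results", "companies": ["Ariba"]},
]
result = search_with_associations(model, stories, "Commerce One")
```

A keyword search for "Commerce One" would return only the first story; following the association also surfaces the Ariba story, which the user did not ask for but may need.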

HP 103

Example (test on http://directory.mediaanywhere.com)

• Search for company 'Commerce One'

• Links to news on companies that compete against Commerce One

• Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata, XML or other format; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked For + What You Need to Know

COMPANY:

• Related Stock News

• Industry News

• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

• Automatic collation of semantically related digital media information from multiple sources

• Research inferred automatically

• Semantically related news not specifically asked for

• Semantic search, personalization, etc.

A vision for future

Semantic Web, Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

• RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
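To make the "rules over RDF-style data" point concrete, here is a small Python sketch that forward-chains two rules over subject-predicate-object triples. It is deliberately not a description logic reasoner and does not use any RDF library; the predicate names `subClassOf` and `type`, and the individual `bessie`, are assumptions chosen for illustration.

```python
def infer(triples):
    """Forward-chain two simple rules until no new facts appear.

    Rule 1: (A subClassOf B) and (B subClassOf C)  =>  (A subClassOf C)
    Rule 2: (x type A)       and (A subClassOf B)  =>  (x type B)
    """
    facts = set(triples)
    while True:
        derived = set()
        for subject, predicate, obj in facts:
            for subject2, predicate2, obj2 in facts:
                if obj != subject2 or predicate2 != "subClassOf":
                    continue
                if predicate in ("subClassOf", "type"):
                    derived.add((subject, predicate, obj2))
        if derived <= facts:      # nothing new was derived: fixed point reached
            return facts
        facts |= derived


base_facts = {
    ("herbivore", "subClassOf", "animal"),
    ("animal", "subClassOf", "organism"),
    ("bessie", "type", "herbivore"),   # hypothetical individual
}
all_facts = infer(base_facts)
```

The base data never states that `bessie` is an animal; the rules derive it. A description logic layer such as OIL would add richer constructs (for example negation and defined classes) on top of this kind of structural data.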

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

• Based on RDF + XML

• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

• Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

• Schema Layer: Definition of Vocabulary - RDF Schema

• Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

• Earthquake Sources (USGS, NEIC)

• Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

    NuclearTest Causes Earthquake
      <= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
         AND distance(NuclearTest.latitude, NuclearTest.longitude,
                      Earthquake.latitude, Earthquake.longitude) < 10000
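The rule above can be read as an executable predicate. The Python sketch below is one possible reading, under these assumptions: `dateDifference` is in days and the earthquake must not precede the test, `distance` is great-circle distance in miles (computed here with the haversine formula), and the `Event` type, function names, and sample coordinates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(a, b):
    """Great-circle distance between two events (haversine formula)."""
    lat1, lat2 = radians(a.latitude), radians(b.latitude)
    dlat = lat2 - lat1
    dlon = radians(b.longitude - a.longitude)
    h = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(min(1.0, h)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days_after = (quake.event_date - test.event_date).days
    return 0 <= days_after < max_days and distance_miles(test, quake) < max_miles


# Made-up events for illustration only.
test_event = Event(date(2000, 1, 1), 0.0, 0.0)
near_quake = Event(date(2000, 1, 15), 0.0, 10.0)
early_quake = Event(date(1999, 12, 31), 0.0, 10.0)
far_quake = Event(date(2000, 1, 15), 0.0, 180.0)
```

Note that a 10,000-mile threshold covers most of the globe (the farthest two points on Earth are roughly 12,400 miles apart), so in this reading the time window does most of the filtering.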

Knowledge Discovery - Example

• When was the first recorded nuclear test conducted? 1950

• Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

• For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and > 9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
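The grouping query on this slide can be sketched in Python as follows. The band boundaries come from the slide; treating each band as half-open (lower bound included, upper bound excluded), the function names, and the sample records are assumptions for illustration, not data from USGS or NEIC.

```python
from collections import Counter

# Magnitude bands from the slide, as [low, high) intervals (an assumption).
MAGNITUDE_BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_for(magnitude):
    """Return the band containing the magnitude, or None if it is below 5.8."""
    for low, high in MAGNITUDE_BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(earthquakes, start_year=1900):
    """Count earthquakes per (year, magnitude band) from (year, magnitude) pairs."""
    counts = Counter()
    for year, magnitude in earthquakes:
        band = band_for(magnitude)
        if year >= start_year and band is not None:
            counts[(year, band)] += 1
    return counts


def average_per_year(counts, band, first_year, last_year):
    """Average yearly count for one band over an inclusive range of years."""
    years = range(first_year, last_year + 1)
    return sum(counts[(year, band)] for year in years) / len(years)


# Made-up (year, magnitude) records for illustration only.
sample = [(1950, 6.5), (1950, 7.2), (1951, 6.1), (1899, 8.0), (1951, 5.0)]
counts = counts_per_year(sample)
```

Comparing `average_per_year` for a band before and after a chosen year is one way to check a claim such as "the number of earthquakes with magnitude > 7 is almost constant".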

Knowledge Discovery - Example...

• Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted, and the test lies within a radius of 10,000 miles of the epicenter of the earthquake

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NewsML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• Semantic Web: www.semanticweb.org
• VoiceXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of MetadataWorkshop at Content World 2001 Burlingame CA May 15 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperabilitykey metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards(or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsMLhellip
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML ndash Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taaleersquos Intelligent Content Process
  • Metadata Creation and Semanticization
  • FormsTypesIngest of Content
  • Content HandlingIngest
  • Extracting a Text DocumentSyntactic approach
  • Automatic Categorization amp Metadata Tagging (unstructured texttranscript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) ClassificationTaxonomy amp Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies TaxonomiesOntologies
  • Metadata enabledApplications
  • Use of Categories for Search
  • More than metadata
  • Metadata amp Search
  • Taaleersquos Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries amp Hot Topics
  • SemanticInteractive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV Taaleersquos Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associationssupported by Taalee Semantic Engine
  • Semantic Web Application ExampleFinancial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF ndash one proposal (cf Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL ndash as RDF Extension
  • DAML and OIL ndash Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Examplehellip
  • Knowledge Discovery - Examplehellip
  • ResourcesReferences

HP 90

t

A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon

At least 60 people died in this needless fire senior local official Karimu Alabi said

Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses

At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were

Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)

Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)

-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color

Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush

Video

bull Value-add for production broadcast amp syndication

bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers

bull Greatly enhances news-room productivity and time-to-market

HP 91

-- Breaking News --Gore Demands That Recount Restart

Gore Says Fla Cant Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

(133) ndash 120600 - ABC

(253) - 120600 - CBS

(516) - 120600 - ABC

(246) - 120600 - FOX

(133) - 120600 - NBC

(533) - 120600

(357) - 120600 - CBS

(427) - 120600 - ABC

(344) - 120600 - FOX

(724) - 120600 - CBS

(133) - 120600 - CBS

TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters

The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the

(133) - 120600 - ABC

(233) - 120600 - CBS

(312) - 120600 - NNS

(032) - 120600 - CBS

(133) - 120600 - CBS

DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore

HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadatarsquos role in emerging iTV infrastructure

MPEG-247MPEG

Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks

Atlanta Falcons Players John KitnaCoaches Mike Holmgren

Dan ReevesLocation Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Content which doescontain the wordsthe user asked for

Extractor Agents

Content which does not contain the words

the user asked for but is about what he asked

for

Value-added Metadata

Content the user did not think to ask for but

which he needs to know

Semantic Associations

+ +

Metadata for Intelligent ContentMetadata for Intelligent Content

Usage

HP 94

Intelligent Contentvia

Value-Added Metadata

HP 95

Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable

users to access content

bull If a keyword is not in the content it cannot be found

bull The burden is on the user to think of and ask for the ldquorightrdquo keyword

For example If a story is about ldquoRoger Clemensrdquo but does not contain the

words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user

searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called

Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)

to add missing metadata to describe content more completely

HP 96

Guided Demo for Value Added Metadata ndashExample one

bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson

bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata

on the following RMR page

bull Here is what you see

Produced by NFLcom Posted Date 9202000 League NFL

Teams Atlanta Falcons Players Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)

Click on first result for Gary Sheffield

View metadata Note that Team name and League name are also included

in the metadata

Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content ndash Value-Added Metadata

Posted Date

Posted Date

Date of asset posting ndashExtracted automatically

League Name

Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations

Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships

Team NameTeam Name

Producer Name

Producer Name

Rich MediaSports AssetRich Media

Sports Asset

Name of content provider that produced the asset

Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added

by Taalee using its semantic relationships

The asset is richly fully described in the many ways the users chose to interact

Player NamesPlayer Names

SportSportName of

sport

LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset

Name of players mentioned explicitly in the asset ndash Extracted automatically

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

bull Traditional search engines rely solely on (syntactic) keywords to find content

bull They do not understand the meaning context or relationships of keywords

For example a search engine may see that the word ldquoCommerce Onerdquo occurs

but it does not know that Commerce One is a COMPANY which Participates in

the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba

As a result search engines cannot go beyond returning a list (or directory view)

of what the user has asked for Their ability to provide associated information is

extremely limited static and difficult to scale Taaleersquos Semantic Content Model

goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company lsquoCommerce Onersquo

Links to news on companies that compete against

Commerce One

Links to news on companies Commerce One competes

against(To view news on Ariba click

on the link for Ariba)

Crucial news on Commerce Onersquos

competitors (Ariba) can be accessed easily and

automatically

HP 104

Internal Source 1Research

Internal Source 2

External feedsWeb(eg Reuters)

1

2

3

4

Cisco story from PW Source 1passed on to addsemanticassociations

ConsultsKnowledgeBasefor Ciscorsquoscompetition

Returns resultLucent is a competitor of Cisco

Lucent story from external

feeds picked for publishing as ldquosemantically

relatedrdquo to Ciscostory ndash passed

on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata XML or other format

SemanticApplication

ASPEnterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata centricContent Management Architecture

SemanticEngine

World Model

TaaleeMetabase

Third-partyContent Mgmt

AndSyndication

HP 105

Semantic Associationssupported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANYCOMPANYRelated Stock News

Related Stock News

IndustryNews

IndustryNews

CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or

Related INDUSTRY

SECEPAEPA

RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY

Technology Products

Technology ProductsImportant to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling obviously not enough; we need a "logic layer" on top of RDF; some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

[Figure: DAML example. Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html]

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>
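
Read as a defined class, the declaration says that a herbivore is an animal that is not a carnivore. A minimal Python sketch of that reading follows; the individuals and the set-based encoding are assumptions made for illustration and are not part of OIL or of any reasoner.

```python
# Hypothetical individuals, for illustration only.
animals = {"cow", "lion", "rabbit"}
carnivores = {"lion"}


def is_herbivore(individual):
    """herbivore = animal AND NOT carnivore, following the OIL class definition."""
    return individual in animals and individual not in carnivores


herbivores = {x for x in animals if is_herbivore(x)}
print(sorted(herbivores))
```

A description-logic reasoner would derive the same membership from the class definition itself, without the explicit function written here.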

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

    NuclearTest Causes Earthquake
      <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
         AND distance( NuclearTest.latitude, NuclearTest.longitude,
                       Earthquake.latitude, Earthquake.longitude ) < 10000
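
The rule can be sketched as a Python predicate. This is a sketch under stated assumptions rather than the InfoQuilt implementation: the `Event` class and function names are hypothetical, the units are taken to be days and miles, the haversine formula is one possible choice for `distance`, and the check that the earthquake does not precede the test is added from the prose statement of the rule. The sample events are made-up values.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and lies within max_miles."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles


# Made-up events for illustration.
test = Event(date(2000, 1, 1), 0.0, 0.0)
quake_after = Event(date(2000, 1, 15), 10.0, 10.0)
quake_before = Event(date(1999, 12, 20), 10.0, 10.0)
print(may_have_caused(test, quake_after), may_have_caused(test, quake_before))
```

Finding all candidate pairs is then a filter over the cross product of the two sources, for example `[(t, q) for t in tests for q in quakes if may_have_caused(t, q)]`.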

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
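
The grouping query can be sketched as follows. The function names, the half-open treatment of range boundaries, and the input format (pairs of year and magnitude) are assumptions of this sketch, and the sample records are made up; real data would come from the earthquake sources named earlier.

```python
from collections import Counter

# Magnitude ranges from the slide. Treating each range as half-open
# [low, high) is an assumption of this sketch.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_range(magnitude):
    """Return the label of the range containing magnitude, or None if below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """quakes: iterable of (year, magnitude). Returns {label: average count per year}."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_range(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for _, _, label in BINS}


# Made-up records for illustration.
sample = [(1900, 5.9), (1901, 6.5), (1901, 6.7), (1950, 7.2), (1960, 5.0)]
print(average_per_year(sample, 1900, 1901))
```

Running the same function over the years before and after a chosen cutoff gives the per-range comparison that the slide's conclusion depends on.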

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • What RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 91

-- Breaking News --

Gore Demands That Recount Restart

Gore Says Fla. Can't Name Electors

Bush Meets Colin Powell at Ranch

Market Tumbles on Earnings Warning

Barak Outlines His Peace Plan

[Screenshot: clip listing (duration - date - source)]

(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the

(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS

Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore

HP 92

Metadata's role in emerging iTV infrastructure

[Diagram] Delivery path labels: Video; MPEG-2/4/7 MPEG Encoder; Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE

[Diagram] Enrichment labels: Node "Cisco Systems"; Taalee Semantic Engine; Enhanced XML Description "Cisco Systems"; Object Content Information (OCI); Metadata-rich Value-added Node

Example metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta

License metadata decoder and semantic applications to device makers

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters

HP 93

Metadata for Intelligent Content

Intelligent Metadata Creation

Extractor Agents: Content which does contain the words the user asked for

+ Value-added Metadata: Content which does not contain the words the user asked for but is about what he asked for

+ Semantic Associations: Content the user did not think to ask for but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
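
The enrichment step described here can be sketched as a lookup that adds fields the source text never mentions. This is a minimal illustration, not Taalee's implementation: the `PLAYS_FOR` and `TEAM_SPORT` tables, the field names, and the functions `enrich` and `matches` are hypothetical.

```python
# Hypothetical fragment of a knowledge base; the relation names are illustrative.
PLAYS_FOR = {"Roger Clemens": "New York Yankees"}
TEAM_SPORT = {"New York Yankees": "Baseball"}


def enrich(metadata):
    """Return a copy of the metadata with team and sport added for known players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("Players", []):
        team = PLAYS_FOR.get(player)
        if team is None:
            continue
        if team not in enriched.setdefault("Teams", []):
            enriched["Teams"].append(team)
        sport = TEAM_SPORT.get(team)
        if sport and sport not in enriched.setdefault("Sport", []):
            enriched["Sport"].append(sport)
    return enriched


def matches(metadata, query):
    """Case-insensitive keyword match over the metadata values."""
    return any(
        query.lower() in value.lower()
        for values in metadata.values()
        for value in values
    )


story = {"Players": ["Roger Clemens"]}  # extracted from the text itself
print(matches(story, "Yankees"))          # keyword search on extracted metadata
print(matches(enrich(story), "Yankees"))  # same search after enrichment
```

The search function is unchanged between the two calls; only the metadata differs, which is the point of adding value at cataloging time rather than at query time.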

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset

Posted Date: Date of asset posting - Extracted automatically

Player Names: Names of players mentioned explicitly in the asset - Extracted automatically

Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships

League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations

Sport: Name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly, fully described in the many ways the users chose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context, or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale.

Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.


HP 92

Retrieve Scene Description Track

Enhanced Digital Cable

Video

MPEGDecoder

Node = AVO Object

Create Scene Description Tree

GREATUSER

EXPERIENCE

Metadatarsquos role in emerging iTV infrastructure

MPEG-247MPEG

Encoder

SceneDescriptionTree

License metadata decoder and semantic applications to

device makers

Channel salesthrough Video Server Vendors

Video App Servers and Broadcasters

Enhanced XML

Description

ldquoCisco Systemsrdquo

Node

TaaleeSemanticEngine

ldquoCisco Systemsrdquo

Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks

Atlanta Falcons Players John KitnaCoaches Mike Holmgren

Dan ReevesLocation Atlanta

Object Content Information (OCI)

Metadata-richValue-added Node

HP 93

Intelligent Metadata Creation

Content which doescontain the wordsthe user asked for

Extractor Agents

Content which does not contain the words

the user asked for but is about what he asked

for

Value-added Metadata

Content the user did not think to ask for but

which he needs to know

Semantic Associations

+ +

Metadata for Intelligent ContentMetadata for Intelligent Content

Usage

HP 94

Intelligent Contentvia

Value-Added Metadata

HP 95

Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable

users to access content

bull If a keyword is not in the content it cannot be found

bull The burden is on the user to think of and ask for the ldquorightrdquo keyword

For example If a story is about ldquoRoger Clemensrdquo but does not contain the

words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user

searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called

Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)

to add missing metadata to describe content more completely

HP 96

Guided Demo for Value Added Metadata ndashExample one

bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson

bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata

on the following RMR page

bull Here is what you see

Produced by NFLcom Posted Date 9202000 League NFL

Teams Atlanta Falcons Players Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)

Click on first result for Gary Sheffield

View metadata Note that Team name and League name are also included

in the metadata

Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content ndash Value-Added Metadata

Posted Date

Posted Date

Date of asset posting ndashExtracted automatically

League Name

Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations

Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships

Team NameTeam Name

Producer Name

Producer Name

Rich MediaSports AssetRich Media

Sports Asset

Name of content provider that produced the asset

Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added

by Taalee using its semantic relationships

The asset is richly fully described in the many ways the users chose to interact

Player NamesPlayer Names

SportSportName of

sport

LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset

Name of players mentioned explicitly in the asset ndash Extracted automatically

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content.

• They do not understand the meaning, context, or relationships of keywords.

For example, a search engine may see that the name “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
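One way to picture what a keyword index lacks is to store the Commerce One facts as typed subject, relation, object triples. The sketch below is only an illustration: the relation names (`is_a`, `participates_in`, `competes_with`) and the helper functions are assumptions, not Taalee's Semantic Content Model.

```python
from collections import defaultdict

# Facts taken from the slide; the triple layout and relation names are
# assumptions for illustration.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]


def build_index(triples):
    """Index triples as subject -> relation -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
    return index


def associated(index, entity, relation):
    """Return the sorted objects linked to `entity` through `relation`."""
    return sorted(index.get(entity, {}).get(relation, set()))


INDEX = build_index(TRIPLES)
print(associated(INDEX, "Commerce One", "competes_with"))
```

A keyword engine holds only the string "Commerce One"; the typed relations are what allow a follow-up question such as "who competes with this company?" to be answered at all.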

HP 103

Example (test on http://directory.mediaanywhere.com)

• Search for company ‘Commerce One’.

• The results include links to news on companies that Commerce One competes against. (To view news on Ariba, click on the link for Ariba.)

• Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically.

HP 104

Metadata-centric Content Management Architecture

(Diagram components: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1, 2, and 3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata, XML or other format; Semantic Application, ASP/Enterprise hosted; third-party content management and syndication.)

1. A Cisco story from PW Source 1 is passed on to add semantic associations.

2. The Knowledge Base is consulted for Cisco’s competition.

3. It returns the result: Lucent is a competitor of Cisco.

4. A Lucent story from the external feeds is picked for publishing as “semantically related” to the Cisco story and passed on to the Dashboard, which shows the story on Cisco and the story on Lucent.
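The Cisco and Lucent flow on this slide can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `COMPETITORS` table stands in for the Knowledge Base, the feed items and the Ariba entry are made up for the example, and the function names are hypothetical.

```python
# Stand-in for the Knowledge Base. Only the Cisco/Lucent competitor fact
# comes from the slide.
COMPETITORS = {"Cisco": ["Lucent"]}

# Hypothetical items arriving from external feeds.
EXTERNAL_FEED = [
    {"title": "Story on Lucent", "company": "Lucent"},
    {"title": "Story on Ariba", "company": "Ariba"},
]


def related_stories(story, feed, competitors):
    """Pick feed items about competitors of the company in `story`."""
    rivals = competitors.get(story["company"], [])
    return [item for item in feed if item["company"] in rivals]


def build_dashboard(story, feed, competitors):
    """Combine what was asked for with what is semantically related."""
    related = related_stories(story, feed, competitors)
    return {
        "asked_for": story["title"],
        "semantically_related": [item["title"] for item in related],
    }


cisco_story = {"title": "Story on Cisco", "company": "Cisco"}
dashboard = build_dashboard(cisco_story, EXTERNAL_FEED, COMPETITORS)
print(dashboard)
```

The Ariba item is ignored because the lookup is driven by the competitor relation, not by any words shared between the stories.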

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked For + What You Need to Know

(Diagram: COMPANY at the center, linked to the following.)

• Related Stock News

• Industry News

• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in the same or a related INDUSTRY

• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY

• Technology Products: important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility.

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.

• RDF + DL = “Frame System for WWW”

Source: www.ontoknowledge.org/oil
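To make the idea of rules over RDF data concrete, here is a small forward-chaining sketch over plain triples. It is far weaker than a description logic reasoner and is not taken from the proposal: the two rules (subclass transitivity and type propagation) are standard RDFS-style rules, while the class `giraffe` and the individual `g1` are hypothetical.

```python
def infer(triples):
    """Apply two rules until no new facts appear.

    Rule 1: (a subClassOf b) and (b subClassOf c) gives (a subClassOf c).
    Rule 2: (x type a) and (a subClassOf b) gives (x type b).
    """
    facts = set(triples)
    while True:
        new = set()
        for (s1, p1, o1) in facts:
            for (s2, p2, o2) in facts:
                if o1 != s2 or p2 != "subClassOf":
                    continue
                if p1 == "subClassOf":
                    new.add((s1, "subClassOf", o2))
                elif p1 == "type":
                    new.add((s1, "type", o2))
        if new <= facts:
            return facts
        facts |= new


# Asserted data; `giraffe` and `g1` are made-up names for the example.
ASSERTED = [
    ("herbivore", "subClassOf", "animal"),
    ("giraffe", "subClassOf", "herbivore"),
    ("g1", "type", "giraffe"),
]
INFERRED = infer(ASSERTED)
print(("g1", "type", "animal") in INFERRED)
```

The fact that `g1` is an animal is stated nowhere in the data; it follows only once rules are layered on top of the structural model.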

HP 109

Semantic Web - next step in Web evolution

“A Web in which machine reasoning will be ubiquitous and devastatingly powerful” [Berners-Lee]

“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture” [Berners-Lee]

“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities” [Berners-Lee]

A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic manner

• Based on RDF + XML

• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O

Schema Layer: Definition of Vocabulary – RDF Schema

Data Layer: Simple data model and syntax for metadata – RDF

OIL – as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
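Read informally, the fragment says that every herbivore is an animal and that no herbivore is a carnivore. The Python sketch below checks those two axioms against a set of asserted classes. It is a toy consistency check, not an OIL or description logic reasoner, and the table and function names are assumptions.

```python
# Axioms transcribed from the OIL fragment: herbivore is a subclass of
# animal, and herbivore is a subclass of NOT carnivore.
SUPERCLASSES = {"herbivore": {"animal"}}
EXCLUDED = {"herbivore": {"carnivore"}}


def implied_classes(asserted):
    """Close a set of class names under the subclass axioms."""
    result = set(asserted)
    frontier = list(asserted)
    while frontier:
        cls = frontier.pop()
        for parent in SUPERCLASSES.get(cls, ()):
            if parent not in result:
                result.add(parent)
                frontier.append(parent)
    return result


def is_consistent(asserted):
    """False if some implied class is excluded by another implied class."""
    classes = implied_classes(asserted)
    return not any(EXCLUDED.get(cls, set()) & classes for cls in classes)


print(implied_classes({"herbivore"}))
print(is_consistent({"herbivore", "carnivore"}))
```

The negation is what plain RDF Schema cannot express: subclass links alone would accept an individual that is both herbivore and carnivore.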

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear tests may cause earthquakes.

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
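The rule can be turned into an executable predicate. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes days for the date difference and miles for the distance, uses the haversine great-circle formula for `distance`, requires the earthquake not to precede the test, and runs on made-up events.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Event:
    name: str
    event_date: date
    latitude: float
    longitude: float


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """The slide's rule: quake follows the test closely in time and space."""
    days = (quake.event_date - test.event_date).days
    near = distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


def candidate_pairs(tests, quakes):
    """All (test, quake) name pairs that satisfy the rule."""
    return [(t.name, q.name) for t in tests for q in quakes
            if could_have_caused(t, q)]


# Hypothetical events, for illustration only.
tests = [Event("test-1", date(1990, 1, 1), 0.0, 0.0)]
quakes = [
    Event("quake-a", date(1990, 1, 20), 10.0, 10.0),   # 19 days later
    Event("quake-b", date(1990, 3, 15), 10.0, 10.0),   # too late
    Event("quake-c", date(1989, 12, 25), 10.0, 10.0),  # before the test
]
print(candidate_pairs(tests, quakes))
```

A matching pair is only a candidate for further analysis; the rule expresses possibility ("could have caused"), not proof of causation.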

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and > 9 on the Richter scale, find the average number of earthquakes per year, starting from 1900.

The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
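The grouping query can be sketched as follows. The records are synthetic and exist only to exercise the code; they say nothing about real seismicity. Treating each range as half-open (lower bound included, upper bound excluded) is an assumption, as are the function names.

```python
from collections import Counter

# Magnitude ranges from the slide, treated as half-open intervals.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the range containing `magnitude`, or None."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly count per magnitude range over the given years."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for _, _, label in BINS}


# Synthetic (year, magnitude) records, for illustration only.
SAMPLE = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.4), (1901, 5.0)]
AVERAGES = average_per_year(SAMPLE, 1900, 1901)
print(AVERAGES)
```

Running the same computation over the years before and after 1945 would give the per-range comparison that the slide's conclusion relies on.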

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test took place within a radius of 10000 miles from the epicenter of the earthquake.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Amit Sheth and Wolfgang Klas, Eds., Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, McGraw Hill, 1998, ISBN 0-07-057735-8.

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML – Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee’s Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata-enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee’s Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee’s Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF – one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL – as RDF Extension
  • DAML and OIL – Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 93

Intelligent Metadata Creation

Metadata for Intelligent Content

• Extractor Agents: content which does contain the words the user asked for

• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for

• Semantic Associations: content the user did not think to ask for, but which he needs to know

Usage

HP 94

Intelligent Content via Value-Added Metadata

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found.

• The burden is on the user to think of and ask for the “right” keyword.

For example, if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”.

Understanding of the content is needed to create new metadata.

Taalee understands that Roger Clemens is a PERSON who PLAYS a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
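The difference between keyword matching and matching on value-added metadata can be shown with a short Python sketch. The story text, the `PLAYS_FOR` table, and the function names are hypothetical; only the Roger Clemens and New York Yankees association comes from the slide.

```python
# Hypothetical story and knowledge base, for illustration only.
STORY = {
    "id": "story-1",
    "text": "Roger Clemens struck out ten batters last night.",
}
PLAYS_FOR = {"Roger Clemens": "New York Yankees"}


def keyword_match(story, query):
    """Purely syntactic search: the query must occur in the text."""
    return query.lower() in story["text"].lower()


def value_added_terms(story, plays_for):
    """Teams of the players mentioned in the story text."""
    text = story["text"].lower()
    return [team for player, team in plays_for.items()
            if player.lower() in text]


def metadata_match(story, query, plays_for):
    """Search the text first, then the value-added metadata."""
    if keyword_match(story, query):
        return True
    needle = query.lower()
    return any(needle in term.lower()
               for term in value_added_terms(story, plays_for))


print(keyword_match(STORY, "Yankees"))
print(metadata_match(STORY, "Yankees", PLAYS_FOR))
```

The story never mentions the team, so the keyword search misses it, while the search over value-added metadata finds it through the player's association with the team.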

HP 96

Guided Demo for Value-Added Metadata – Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson.

• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page.

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL

Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked “REAL”).

• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”.

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content.

• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team.

• Note that other search engines and directories will not be able to do this.

HP 97

Guided Demo for Value-Added Metadata – Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield.

• Click on the first result (titled “I want out”) & view the metadata on the following RMR page.

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League

Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked “REAL”).

• View the source HTML page that has the original story and locate this story with the heading “I want out”.

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content.

• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team.

• Note that other search engines and directories will not be able to do this.

HP 98

Example 1 – Snapshots (“Jamal Anderson”)

• Search for ‘Jamal Anderson’ in ‘Football’.

• Click on the first result for Jamal Anderson.

• View the metadata. Note that Team name and League name are also included in the metadata.

• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search.


HP 94

Intelligent Contentvia

Value-Added Metadata

HP 95

Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable

users to access content

bull If a keyword is not in the content it cannot be found

bull The burden is on the user to think of and ask for the ldquorightrdquo keyword

For example If a story is about ldquoRoger Clemensrdquo but does not contain the

words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user

searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo

Understanding of the content is needed to create new metadata

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called

Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)

to add missing metadata to describe content more completely

HP 96

Guided Demo for Value Added Metadata ndashExample one

bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson

bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata

on the following RMR page

bull Here is what you see

Produced by NFLcom Posted Date 9202000 League NFL

Teams Atlanta Falcons Players Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)

Click on first result for Gary Sheffield

View metadata Note that Team name and League name are also included

in the metadata

Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content ndash Value-Added Metadata

Posted Date

Posted Date

Date of asset posting ndashExtracted automatically

League Name

Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations

Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships

Team NameTeam Name

Producer Name

Producer Name

Rich MediaSports AssetRich Media

Sports Asset

Name of content provider that produced the asset

Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added

by Taalee using its semantic relationships

The asset is richly fully described in the many ways the users chose to interact

Player NamesPlayer Names

SportSportName of

sport

LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset

Name of players mentioned explicitly in the asset ndash Extracted automatically

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

bull Traditional search engines rely solely on (syntactic) keywords to find content

bull They do not understand the meaning context or relationships of keywords

For example a search engine may see that the word ldquoCommerce Onerdquo occurs

but it does not know that Commerce One is a COMPANY which Participates in

the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba

As a result search engines cannot go beyond returning a list (or directory view)

of what the user has asked for Their ability to provide associated information is

extremely limited static and difficult to scale Taaleersquos Semantic Content Model

goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company lsquoCommerce Onersquo

Links to news on companies that compete against

Commerce One

Links to news on companies Commerce One competes

against(To view news on Ariba click

on the link for Ariba)

Crucial news on Commerce Onersquos

competitors (Ariba) can be accessed easily and

automatically

HP 104

Internal Source 1Research

Internal Source 2

External feedsWeb(eg Reuters)

1

2

3

4

Cisco story from PW Source 1passed on to addsemanticassociations

ConsultsKnowledgeBasefor Ciscorsquoscompetition

Returns resultLucent is a competitor of Cisco

Lucent story from external

feeds picked for publishing as ldquosemantically

relatedrdquo to Ciscostory ndash passed

on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata XML or other format

SemanticApplication

ASPEnterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata centricContent Management Architecture

SemanticEngine

World Model

TaaleeMetabase

Third-partyContent Mgmt

AndSyndication

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

[Diagram: associations around a COMPANY]

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web, Complex Relationships, and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling obviously not enough
• we need a "logic layer" on top of RDF
• some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
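To make "rules which allow us to draw inference" concrete, here is a minimal sketch of forward chaining over subject-predicate-object triples. It applies two RDFS-style rules only and is not a description logic reasoner; the function name `infer` and the sample triples are assumptions made for illustration.

```python
def infer(triples):
    """Forward-chain two RDFS-style rules until no new triples appear."""
    facts = set(triples)
    while True:
        new = set()
        for (a, p1, b) in facts:
            for (c, p2, d) in facts:
                if b != c:
                    continue
                # Rule 1: subClassOf is transitive.
                if p1 == "subClassOf" and p2 == "subClassOf":
                    new.add((a, "subClassOf", d))
                # Rule 2: an instance of a class is an instance of its superclasses.
                if p1 == "type" and p2 == "subClassOf":
                    new.add((a, "type", d))
        if new <= facts:
            return facts
        facts |= new


# Invented sample data in the spirit of the herbivore/animal classes.
stated = {
    ("herbivore", "subClassOf", "animal"),
    ("cow", "subClassOf", "herbivore"),
    ("daisy", "type", "cow"),
}

inferred = infer(stated)
for triple in sorted(inferred - stated):
    print(triple)
```

The triples printed are those that were never stated but follow from the rules, which is the benefit a logic layer adds on top of plain structural data.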

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF+XML
• Agent readable Tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer

Definition of Vocabulary - RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
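A small sketch of how an application might read such a class definition with Python's standard XML parser. The RDF and RDFS namespace URIs are the W3C ones; the `oil` namespace URI below is a placeholder assumption, and the `rdf:type` line is left out for brevity. A real system would use an RDF parser rather than raw XML handling.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OIL = "http://example.org/oil#"  # placeholder namespace, an assumption

DOC = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:oil="{OIL}">
  <rdfs:Class rdf:ID="herbivore">
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>
"""

root = ET.fromstring(DOC)
cls = root.find(f"{{{RDFS}}}Class")

# Name of the class being defined.
class_id = cls.get(f"{{{RDF}}}ID")

# Named superclasses: subClassOf elements that point directly at a resource.
superclasses = [
    sc.get(f"{{{RDF}}}resource")
    for sc in cls.findall(f"{{{RDFS}}}subClassOf")
    if sc.get(f"{{{RDF}}}resource") is not None
]

# Classes excluded through the OIL NOT expression.
excluded = [
    op.get(f"{{{RDF}}}resource")
    for op in cls.findall(f".//{{{OIL}}}NOT/{{{OIL}}}hasOperand")
]

print(class_id, superclasses, excluded)
```

Read this way, the definition says that a herbivore is an animal and is not a carnivore, which plain RDF Schema cannot express without the OIL extension.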

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
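The rule above can be sketched as an executable predicate. Several choices here are assumptions, not taken from the slides: the date difference is counted in days and must be non-negative (the earthquake comes after the test), the distance is a great-circle (haversine) distance in miles, and the sample records are invented. This is not the InfoQuilt implementation.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above 1.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if earlier)."""
    return (quake_date - test_date).days


def causes(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test closely enough in time and space."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


def find_pairs(tests, quakes):
    """All (test id, quake id) pairs that satisfy the rule."""
    return [(t["id"], q["id"]) for t in tests for q in quakes if causes(t, q)]


# Invented sample records for illustration only.
tests = [{"id": "T1", "eventDate": date(1970, 1, 1),
          "latitude": 37.0, "longitude": -116.0}]
quakes = [
    {"id": "Q1", "eventDate": date(1970, 1, 11),   # 10 days after, nearby
     "latitude": 36.0, "longitude": -117.0},
    {"id": "Q2", "eventDate": date(1969, 12, 25),  # before the test
     "latitude": 36.0, "longitude": -117.0},
    {"id": "Q3", "eventDate": date(1970, 3, 1),    # 59 days after
     "latitude": 36.0, "longitude": -117.0},
]

print(find_pairs(tests, quakes))
```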

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
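The per-year, per-magnitude-range query can be sketched as a simple grouping step. The half-open bin boundaries, the function names, and the sample records are assumptions made for illustration; they are not the workshop's data or the demo's implementation.

```python
from collections import Counter

# Magnitude ranges from the slide, treated as half-open intervals [low, high).
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_bin(magnitude):
    """Return the bin containing `magnitude`, or None if it is below 5.8."""
    for low, high in BINS:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude bin) from `start_year` on."""
    counts = Counter()
    for year, magnitude in quakes:
        bin_ = magnitude_bin(magnitude)
        if year >= start_year and bin_ is not None:
            counts[(year, bin_)] += 1
    return counts


def average_per_year(counts, first_year, last_year):
    """Average yearly count for each bin over an inclusive range of years."""
    years = last_year - first_year + 1
    totals = Counter()
    for (year, bin_), n in counts.items():
        if first_year <= year <= last_year:
            totals[bin_] += n
    return {bin_: totals[bin_] / years for bin_ in BINS}


# Invented (year, magnitude) records for illustration only.
quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.2),
          (1901, 7.4), (1899, 8.0), (1901, 5.0)]

counts = counts_per_year(quakes)
averages = average_per_year(counts, 1900, 1901)
for bin_, avg in averages.items():
    print(bin_, avg)
```

Comparing such averages for the years before and after nuclear testing began is the kind of check the slide uses to argue that counts above magnitude 7 stayed roughly constant.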

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References

HP 95

Value-added Metadata

Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.

• If a keyword is not in the content, it cannot be found

• The burden is on the user to think of and ask for the "right" keyword

For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".

Understanding of the content is needed to create new metadata.

Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
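The Roger Clemens example can be sketched as a metadata enrichment step. This is a minimal illustration under stated assumptions: the knowledge base is a plain dictionary, metadata is a dictionary of lists, and the names `enrich` and `matches` are hypothetical rather than Taalee's API. Only the facts about Roger Clemens come from the slide.

```python
# Hypothetical knowledge base; only the Roger Clemens facts come from the slide.
KNOWLEDGE_BASE = {
    "Roger Clemens": {
        "type": "PERSON",
        "sport": "Baseball",
        "team": "New York Yankees",
    },
}


def enrich(metadata, knowledge_base=KNOWLEDGE_BASE):
    """Return a copy of `metadata` with team and sport inferred from players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = knowledge_base.get(player)
        if facts is None:
            continue  # nothing known about this player
        for field, key in (("teams", "team"), ("sports", "sport")):
            values = enriched.setdefault(field, [])
            if facts[key] not in values:
                values.append(facts[key])
    return enriched


def matches(metadata, query):
    """True if `query` is a case-insensitive substring of any metadata value."""
    q = query.lower()
    return any(q in value.lower()
               for values in metadata.values()
               for value in values)


# What purely syntactic extraction finds in a story that never names the team.
extracted = {"players": ["Roger Clemens"]}
enriched = enrich(extracted)

print(matches(extracted, "Yankees"))
print(matches(enriched, "Yankees"))
print(enriched)
```

Keeping the extracted and enriched dictionaries separate also makes it easy to tell which values were present in the asset and which were value-added.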

HP 96

Guided Demo for Value Added Metadata - Example One

• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson

• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page

• Here is what you see:

Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"

• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team

• Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team

• Note that other search engines and directories will not be able to do this

HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

[Diagram: a Rich Media Sports Asset and its metadata fields]

Producer Name: Name of content provider that produced the asset

Posted Date: Date of asset posting - Extracted automatically

Player Names: Name of players mentioned explicitly in the asset - Extracted automatically

Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships

League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations

Sport: Name of sport

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset

Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.

The asset is richly, fully described in the many ways the users chose to interact

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content


  • The Mysteries of MetadataWorkshop at Content World 2001 Burlingame CA May 15 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperabilitykey metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards(or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsMLhellip
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML ndash Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taaleersquos Intelligent Content Process
  • Metadata Creation and Semanticization
  • FormsTypesIngest of Content
  • Content HandlingIngest
  • Extracting a Text DocumentSyntactic approach
  • Automatic Categorization amp Metadata Tagging (unstructured texttranscript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) ClassificationTaxonomy amp Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies TaxonomiesOntologies
  • Metadata enabledApplications
  • Use of Categories for Search
  • More than metadata
  • Metadata amp Search
  • Taaleersquos Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries amp Hot Topics
  • SemanticInteractive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV Taaleersquos Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associationssupported by Taalee Semantic Engine
  • Semantic Web Application ExampleFinancial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF ndash one proposal (cf Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL ndash as RDF Extension
  • DAML and OIL ndash Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Examplehellip
  • Knowledge Discovery - Examplehellip
  • ResourcesReferences

HP 96

Guided Demo for Value Added Metadata ndashExample one

bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson

bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata

on the following RMR page

bull Here is what you see

Produced by NFLcom Posted Date 9202000 League NFL

Teams Atlanta Falcons Players Jamal Anderson

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoWeek 3 top 10 Anderson TD runrdquo

bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of

Atlanta Falcons team

bull Note that other search engines and directories will not be able to do this

HP 97

Guided Demo for Value Added Metadata ndashExample Two

bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield

bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page

bull Here is what you see

Produced by ESPN Posted Date 3032001 League National League

Teams Los Angeles Dodgers Players Gary Sheffield

bull Now click on the button to play the asset (button marked ldquoREALrdquo)

bull View the source HTML page that has the original story and locate this story with the

heading ldquoI want outrdquo

bull Verify that Team=Los Angeles Dodgers or League=National League was not present in

the source content

bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user

searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of

Los Angeles Dodgers team

N t th t th h i d di t i ill t b bl t d thi

HP 98

Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)

Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo

Click on first result for Jamal Anderson

View metadata Note that Team name and League name are also included

in the metadata

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 99

Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)

Click on first result for Gary Sheffield

View metadata Note that Team name and League name are also included

in the metadata

Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content ndash Value-Added Metadata

Posted Date

Posted Date

Date of asset posting ndashExtracted automatically

League Name

Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations

Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships

Team NameTeam Name

Producer Name

Producer Name

Rich MediaSports AssetRich Media

Sports Asset

Name of content provider that produced the asset

Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added

by Taalee using its semantic relationships

The asset is richly fully described in the many ways the users chose to interact

Player NamesPlayer Names

SportSportName of

sport

LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset

Name of players mentioned explicitly in the asset ndash Extracted automatically

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

bull Traditional search engines rely solely on (syntactic) keywords to find content

bull They do not understand the meaning context or relationships of keywords

For example a search engine may see that the word ldquoCommerce Onerdquo occurs

but it does not know that Commerce One is a COMPANY which Participates in

the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba

As a result search engines cannot go beyond returning a list (or directory view)

of what the user has asked for Their ability to provide associated information is

extremely limited static and difficult to scale Taaleersquos Semantic Content Model

goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs

HP 103

Example (test on httpdirectorymediaanywherecom)

Search for company lsquoCommerce Onersquo

Links to news on companies that compete against

Commerce One

Links to news on companies Commerce One competes

against(To view news on Ariba click

on the link for Ariba)

Crucial news on Commerce Onersquos

competitors (Ariba) can be accessed easily and

automatically

HP 104

Internal Source 1Research

Internal Source 2

External feedsWeb(eg Reuters)

1

2

3

4

Cisco story from PW Source 1passed on to addsemanticassociations

ConsultsKnowledgeBasefor Ciscorsquoscompetition

Returns resultLucent is a competitor of Cisco

Lucent story from external

feeds picked for publishing as ldquosemantically

relatedrdquo to Ciscostory ndash passed

on to Dashboard

Story onLucent

Story onCisco

XCM-compliant metadata XML or other format

SemanticApplication

ASPEnterprise hosted

Extractor Agent 1

Extractor Agent 2

Extractor Agent 3

Metadata centricContent Management Architecture

SemanticEngine

World Model

TaaleeMetabase

Third-partyContent Mgmt

AndSyndication

HP 105

Semantic Associationssupported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANYCOMPANYRelated Stock News

Related Stock News

IndustryNews

IndustryNews

CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or

Related INDUSTRY

SECEPAEPA

RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY

Technology Products

Technology ProductsImportant to INDUSTRY or COMPANY

HP 106

Semantic Web Application ExampleFinancial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic SearchPersonalization etc

A vision for future

Semantic Web Complex Relationships and Knowledge Discovery

Eg InfoQuilt project at LSDIS Lab Univ of Georgia

HP 108

Beyond RDF ndash one proposal (cf Ora Lassila)

Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data

RDF + DL = ldquoFrame System for WWWrdquo

Source wwwontoknowledgeorgoil



HP 97

Guided Demo for Value Added Metadata - Example Two

• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield

• Click on the first result (titled "I want out") & view the metadata on the following RMR page

• Here is what you see:

Produced by: ESPN; Posted Date: 3032001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield

• Now click on the button to play the asset (button marked "REAL")

• View the source HTML page that has the original story and locate this story with the heading "I want out"

• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content

• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team


HP 98

Example 1 - Snapshots ("Jamal Anderson")

Search for 'Jamal Anderson' in 'Football'

Click on first result for Jamal Anderson

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 99

Example 2 - Snapshots ("Gary Sheffield")

Search for 'Gary Sheffield' in 'Baseball'

Click on first result for Gary Sheffield

View metadata. Note that Team name and League name are also included in the metadata

View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content - Value-Added Metadata

Rich Media Sports Asset

Producer Name: Name of content provider that produced the asset

Posted Date: Date of asset posting - Extracted automatically

Sport: Name of sport

Player Names: Names of players mentioned explicitly in the asset - Extracted automatically

Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships

League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations

Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships

The asset is richly, fully described in the many ways the users choose to interact

Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
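The value-addition described on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not Taalee's Semantic Engine: the two lookup tables `PLAYS_FOR` and `BELONGS_TO_LEAGUE`, the `enrich` function, and the metadata field names are all hypothetical, and the single entry in each table is taken from the Gary Sheffield demo.

```python
# Hypothetical knowledge base of semantic relationships. The entries come from
# the demo slides (Gary Sheffield / Los Angeles Dodgers / National League).
PLAYS_FOR = {"Gary Sheffield": "Los Angeles Dodgers"}
BELONGS_TO_LEAGUE = {"Los Angeles Dodgers": "National League"}


def enrich(metadata):
    """Return a copy of the asset metadata with team and league names added.

    Players are assumed to have been extracted from the asset already; teams
    and leagues are looked up, since the asset does not mention them.
    """
    enriched = dict(metadata)
    teams = []
    leagues = []
    for player in metadata.get("players", []):
        team = PLAYS_FOR.get(player)
        if team is None:
            continue  # unknown player: nothing to add
        if team not in teams:
            teams.append(team)
        league = BELONGS_TO_LEAGUE.get(team)
        if league is not None and league not in leagues:
            leagues.append(league)
    enriched["teams"] = teams
    enriched["leagues"] = leagues
    return enriched


# Metadata extracted directly from the asset (explicit mentions only).
extracted = {"producer": "ESPN", "players": ["Gary Sheffield"]}
print(enrich(extracted))
```

A search index built over the enriched record would then match a query for the team or the league even though neither string occurs in the source story.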

HP 101

Intelligent Content via Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content

• They do not understand the meaning, context or relationships of keywords

For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
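As a rough illustration of the difference between matching a keyword and knowing its relationships, the facts named above can be held as subject-predicate-object triples and used to expand a query. The triple list, the predicate names (`is_a`, `participates_in`, `competes_with`) and both helper functions are assumptions made for this sketch; the slides do not describe how Taalee's Semantic Content Model is actually represented.

```python
# Hypothetical facts, restricted to the relationships named on the slide.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def objects(subject, predicate, triples=TRIPLES):
    """All objects o such that (subject, predicate, o) is a known fact."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def related_queries(term, triples=TRIPLES):
    """Expand a search term into the term itself plus its known competitors."""
    return [term] + objects(term, "competes_with", triples)


print(related_queries("Commerce One"))
```

A keyword engine would search only for the first element of that list; the associated entities are what allow news on competitors to be offered without the user asking for it.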

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Diagram] Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters)

Extractor Agents 1, 2 and 3

Components: Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication

XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard

Story on Cisco

Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

Related Stock News

Industry News

Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search/Personalization etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling obviously not enough

we need a "logic layer" on top of RDF

some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
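To make the idea of a logic layer over RDF-style data concrete, here is a minimal forward-chaining sketch in plain Python. It is not an RDF library and not a description-logic reasoner: triples are ordinary tuples, the predicate names `subClassOf` and `type` are shorthand, and the two rules are an assumed, much-reduced subset of what a real logic layer would provide.

```python
def infer(triples):
    """Forward-chain two RDFS-style rules until no new triples appear.

    Rule 1: (A subClassOf B) and (B subClassOf C)  =>  (A subClassOf C)
    Rule 2: (x type A)       and (A subClassOf B)  =>  (x type B)
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        sub = [(s, o) for s, p, o in facts if p == "subClassOf"]
        new = set()
        for a, b in sub:
            # Rule 1: chain subclass links.
            for b2, c in sub:
                if b == b2:
                    new.add((a, "subClassOf", c))
            # Rule 2: instances of a class are instances of its superclass.
            for x, p, cls in facts:
                if p == "type" and cls == a:
                    new.add((x, "type", b))
        if not new <= facts:
            facts |= new
            changed = True
    return facts
```

Given only stated facts such as "herbivore is a subclass of animal", a loop like this derives the statements that were never written down, which is the benefit the slide attributes to combining RDF with rules.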

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition

Semantic Web: The concept that Web-accessible content can be organized semantically rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF+XML

Agent readable Tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>

DAML and OIL - Evolving towards Semantic Web

OIL Mission

OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake
  <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
     AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
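The rule above can be read as an executable predicate. The following Python sketch is one possible rendering, not the InfoQuilt implementation: the `Event` record, the haversine great-circle formula used for `distance`, and the Earth-radius constant are assumptions, and the units (days and miles) are taken from the query slide later in the deck. The extra check that the earthquake does not precede the test follows the prose statement of the rule rather than its formula.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption for this sketch


@dataclass(frozen=True)
class Event:
    """Hypothetical record for a nuclear test or an earthquake."""
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test: Event, quake: Event) -> int:
    """Days from the test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles using the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def may_have_caused(test: Event, quake: Event) -> bool:
    """Candidate 'causes' link: quake less than 30 days after the test, under 10000 miles away."""
    days = date_difference(test, quake)
    return 0 <= days < 30 and distance(
        test.latitude, test.longitude, quake.latitude, quake.longitude
    ) < 10000
```

Applying `may_have_caused` to every (test, earthquake) pair drawn from the two kinds of sources yields the candidate pairs; the relationship remains a hypothesis to be examined, as the slides go on to do.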

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
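The grouping behind this query can be sketched as follows. The input is assumed to be a list of hypothetical `(year, magnitude)` pairs; the range boundaries come from the slide, while treating each range as half-open (lower bound included, upper bound excluded) is an assumption, as are all function names.

```python
from collections import Counter, defaultdict

# Magnitude ranges from the slide: 5.8-6, 6-7, 7-8, 8-9 and >9.
# Treating each range as half-open [low, high) is an assumption.
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_bin(magnitude):
    """Return the (low, high) range containing the magnitude, or None if below 5.8."""
    for low, high in BINS:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per year and magnitude range for (year, magnitude) pairs."""
    counts = defaultdict(Counter)
    for year, magnitude in quakes:
        bin_ = magnitude_bin(magnitude)
        if year >= start_year and bin_ is not None:
            counts[year][bin_] += 1
    return counts


def average_per_year(counts, first_year, last_year):
    """Average yearly count per magnitude range over an inclusive span of years."""
    n_years = last_year - first_year + 1
    totals = Counter()
    for year in range(first_year, last_year + 1):
        totals.update(counts.get(year, Counter()))
    return {bin_: totals[bin_] / n_years for bin_ in BINS}
```

Calling `average_per_year` once for the years before nuclear testing began and once for the years after gives, per magnitude range, the two averages whose comparison the slide summarizes.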

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test was conducted within a radius of 10000 miles of the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsML…
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies, Ontologies
  • Metadata enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example…
  • Knowledge Discovery - Example…
  • Resources/References




HP 99

Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)

Click on first result for Gary Sheffield

View metadata Note that Team name and League name are also included

in the metadata

Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo

View the original source HTML page Verify that

the source page contains no mention of Team nameand League name They

were Taaleersquos value-additions to the metadata to facilitate easier search

HP 100

Intelligent Content ndash Value-Added Metadata

Posted Date

Posted Date

Date of asset posting ndashExtracted automatically

League Name

Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations

Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships

Team NameTeam Name

Producer Name

Producer Name

Rich MediaSports AssetRich Media

Sports Asset

Name of content provider that produced the asset

Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added

by Taalee using its semantic relationships

The asset is richly fully described in the many ways the users chose to interact

Player NamesPlayer Names

SportSportName of

sport

LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset

Name of players mentioned explicitly in the asset ndash Extracted automatically

HP 101

Intelligent Contentvia

Semantic Associations

HP 102

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content.

• They do not understand the meaning, context, or relationships of keywords.

For example, a search engine may see that the word “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
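The contrast between keyword matching and semantic association can be made concrete with a minimal Python sketch. Everything here is an assumption made for illustration: the fact triples, the sample documents, and the function names are hypothetical and are not Taalee's Semantic Content Model. The sketch stores typed relationships as (subject, relation, object) facts and uses the COMPETES WITH relation to return content the user did not literally ask for.

```python
# Hypothetical (subject, relation, object) facts about known entities.
FACTS = {
    ("Commerce One", "IS_A", "COMPANY"),
    ("Ariba", "IS_A", "COMPANY"),
    ("Commerce One", "PARTICIPATES_IN",
     "Corporate Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
}

# Made-up documents standing in for cataloged content.
DOCS = {
    "doc1": "Commerce One announces new marketplace software",
    "doc2": "Ariba reports quarterly results",
    "doc3": "Baseball season opens",
}


def keyword_search(phrase: str) -> list:
    """Purely syntactic search: documents whose text contains the phrase."""
    return sorted(d for d, text in DOCS.items()
                  if phrase.lower() in text.lower())


def competitors(company: str) -> list:
    """Follow COMPETES_WITH in both directions (treated as symmetric)."""
    found = set()
    for subject, relation, obj in FACTS:
        if relation != "COMPETES_WITH":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return sorted(found)


def semantic_search(company: str) -> dict:
    """What the user asked for, plus content on related entities."""
    return {
        "asked_for": keyword_search(company),
        "related": {c: keyword_search(c) for c in competitors(company)},
    }


result = semantic_search("Commerce One")
print(result)
```

A plain keyword search for Commerce One finds only the first document; the association through the competitor relation is what brings in the Ariba story.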

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company ‘Commerce One’

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

Content sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters)

Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication

Output: XCM-compliant metadata, XML or other format

1. Cisco story from PW Source 1 is passed on to add semantic associations

2. Consults the Knowledge Base for Cisco’s competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds is picked for publishing as “semantically related” to the Cisco story and passed on to the Dashboard

Stories delivered: Story on Cisco, Story on Lucent

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

• Related Stock News

• Industry News

• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for the future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF – one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data

• RDF + DL = “Frame System for WWW”

Source: www.ontoknowledge.org/oil
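To illustrate what drawing inferences from RDF-style data can mean, here is a minimal Python sketch. It assumes that statements are held as plain (subject, predicate, object) tuples and applies two RDFS-style rules (subclass transitivity and type propagation) until nothing new is derived. The class names and the `infer` function are hypothetical; a real system would use an RDF toolkit and a description-logic reasoner rather than this loop.

```python
def infer(triples):
    """Forward-chain two RDFS-style rules to a fixed point.

    Rule 1: (a subClassOf b), (b subClassOf c) => (a subClassOf c)
    Rule 2: (x type b),       (b subClassOf c) => (x type c)
    """
    facts = set(triples)
    while True:
        derived = set()
        for (a, p, b) in facts:
            for (c, q, d) in facts:
                if b != c or q != "subClassOf":
                    continue
                if p in ("subClassOf", "type"):
                    derived.add((a, p, d))
        if derived <= facts:  # nothing new: fixed point reached
            return facts
        facts |= derived


# Hypothetical statements; the class names are made up for illustration.
DATA = [
    ("Lucent", "type", "NetworkingCompany"),
    ("NetworkingCompany", "subClassOf", "TechnologyCompany"),
    ("TechnologyCompany", "subClassOf", "Company"),
]

closure = infer(DATA)
print(("Lucent", "type", "Company") in closure)
```

The statement that Lucent is a Company is never asserted in the data; it follows only from the rules, which is the kind of benefit a logic layer on top of RDF is meant to provide.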

HP 109

Semantic Web - next step in Web evolution

“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]

“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]

“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]

A personal definition. Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)?

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

• Based on RDF + XML

• Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three-layered Architecture of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O

Schema Layer: Definition of Vocabulary – RDF Schema

Data Layer: Simple data model and syntax for metadata – RDF

OIL – as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
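The class definition above says, roughly, that a herbivore is an animal that is not a carnivore. A small Python sketch can show how such a class expression might be evaluated over known individuals. The individuals, the tiny expression types, and the closed-world reading of NOT (anything not listed as a carnivore counts as a non-carnivore) are assumptions made for illustration; description-logic reasoners for OIL work under open-world semantics and are considerably more involved.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Named:
    """A named class such as 'animal'."""
    name: str


@dataclass(frozen=True)
class Not:
    """Complement of a class expression (cf. oil:NOT)."""
    operand: "Expr"


@dataclass(frozen=True)
class And:
    """Intersection, as implied by several subClassOf statements."""
    operands: Tuple["Expr", ...]


Expr = Union[Named, Not, And]


def extension(expr, members, universe):
    """Set of individuals in the universe that satisfy the expression."""
    if isinstance(expr, Named):
        return set(members.get(expr.name, ())) & universe
    if isinstance(expr, Not):
        return universe - extension(expr.operand, members, universe)
    if isinstance(expr, And):
        result = set(universe)
        for operand in expr.operands:
            result &= extension(operand, members, universe)
        return result
    raise TypeError(f"unsupported expression: {expr!r}")


# Hypothetical individuals and asserted class memberships.
UNIVERSE = {"cow", "lion", "rabbit", "oak"}
MEMBERS = {"animal": {"cow", "lion", "rabbit"}, "carnivore": {"lion"}}

# herbivore = animal AND NOT carnivore
HERBIVORE = And((Named("animal"), Not(Named("carnivore"))))

herbivores = extension(HERBIVORE, MEMBERS, UNIVERSE)
print(sorted(herbivores))
```

Because herbivore is a defined class, membership follows from the definition alone: no individual has to be tagged as a herbivore by hand.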

DAML and OIL – Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.

Knowledge Discovery - Example

Earthquake Sources (USGS NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Tests May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.

NuclearTest Causes Earthquake
  <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
     AND distance( NuclearTest.latitude, NuclearTest.longitude,
                   Earthquake.latitude, Earthquake.longitude ) < 10000
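The rule above can be written as an executable predicate. The following Python sketch is one possible reading, with several assumptions stated up front: dates are `datetime.date` values and dateDifference is measured in days; distance is the great-circle distance in miles computed with the haversine formula; and the check that the earthquake comes after the test is added from the prose, since the rule as written gives only an upper bound. The record layout, function names, and sample events are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error near antipodal points.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if earlier)."""
    return (quake_date - test_date).days


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, under the assumptions stated above."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


def find_pairs(tests, quakes):
    """All (test, earthquake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if could_have_caused(t, q)]


# Made-up events, for illustration only.
NUCLEAR_TEST = {"eventDate": date(1970, 1, 1),
                "latitude": 37.0, "longitude": -116.0}
QUAKE = {"eventDate": date(1970, 1, 10),
         "latitude": 36.0, "longitude": -117.0}

print(could_have_caused(NUCLEAR_TEST, QUAKE))
```

Keeping the thresholds as parameters makes it easy to tighten the notion of a nearby region, since a 10000-mile radius covers most of the globe.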

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
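The grouping query on this slide can be sketched as follows. This is a minimal Python illustration, assuming earthquake records are available as (year, magnitude) pairs; the bin boundaries come from the slide, while the function names and the sample records are hypothetical and synthetic, so the numbers printed say nothing about real seismic data.

```python
from collections import Counter, defaultdict

# Magnitude ranges from the slide: (label, lower bound, upper bound).
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def bin_label(magnitude):
    """Label of the range containing the magnitude, or None if below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def yearly_counts(quakes, start_year=1900):
    """Map each range label to a Counter of year -> number of earthquakes."""
    counts = defaultdict(Counter)
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if year >= start_year and label is not None:
            counts[label][year] += 1
    return counts


def average_per_year(counts, first_year, last_year):
    """Average number of earthquakes per year for each range in a period."""
    n_years = last_year - first_year + 1
    return {
        label: sum(n for year, n in per_year.items()
                   if first_year <= year <= last_year) / n_years
        for label, per_year in counts.items()
    }


# Synthetic (year, magnitude) records, for illustration only.
QUAKES = [(1930, 6.1), (1930, 7.2), (1931, 5.9),
          (1960, 6.3), (1960, 6.5), (1961, 7.1), (1961, 5.8)]

counts = yearly_counts(QUAKES)
before = average_per_year(counts, 1930, 1931)
after = average_per_year(counts, 1960, 1961)
print(before, after)
```

Comparing the averages for two periods, range by range, is the comparison behind the observation that only some magnitude ranges change over time.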

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

• RDF: www.w3.org/TR/REC-rdf-syntax

• ICE: www.icestandard.org

• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05

• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03

• DAML: www.daml.org

• NEWSML: newsshowcase.reuters.com

• PRISM: www.prismstandard.org/techdev/prismspec1.asp

• XCM: www.vignette.com

• OIL: www.ontoknowledge.org/oil

• SEMANTICWEB: www.semanticweb.org

• VOICEXML: www.voicexml.org

• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7

• Taalee: www.taalee.com

• Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998

HP 102

Semantic Associations

bull Traditional search engines rely solely on (syntactic) keywords to find content

bull They do not understand the meaning context or relationships of keywords

For example a search engine may see that the word ldquoCommerce Onerdquo occurs

but it does not know that Commerce One is a COMPANY which Participates in

the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba

As a result search engines cannot go beyond returning a list (or directory view)

of what the user has asked for Their ability to provide associated information is

extremely limited static and difficult to scale Taaleersquos Semantic Content Model

goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs

HP 103

Example (test on http://directory.mediaanywhere.com)

Search for company 'Commerce One'

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically

HP 104

Metadata-centric Content Management Architecture

[Figure: architecture diagram. Content sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters). Components: Extractor Agents 1, 2, 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication. Metadata format: XCM-compliant metadata, XML or other format. Items shown: Story on Cisco, Story on Lucent.]

1. Cisco story from PW Source 1 passed on to add semantic associations

2. Consults Knowledge Base for Cisco's competition

3. Returns result: Lucent is a competitor of Cisco

4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story, passed on to Dashboard
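The four-step Cisco/Lucent flow can be sketched as follows. This is a minimal illustration under stated assumptions, not the Semantic Engine: the `Story` class, the `KNOWLEDGE_BASE` dictionary, and the function names are hypothetical, and the extractor agents are assumed to have already attached company names to each story.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    headline: str
    companies: list  # company names assumed to come from an extractor agent
    source: str
    related_to: list = field(default_factory=list)


# Hypothetical knowledge base: company -> list of competitors.
KNOWLEDGE_BASE = {"Cisco": ["Lucent"]}


def competitors_of(company):
    """Steps 2-3: consult the knowledge base and return the competitors."""
    return KNOWLEDGE_BASE.get(company, [])


def enrich(story, external_feed):
    """Return the story plus external stories about its competitors."""
    picked = []
    for company in story.companies:              # step 1: story comes in
        for rival in competitors_of(company):    # steps 2-3: look up rivals
            for candidate in external_feed:      # step 4: pick related stories
                if rival in candidate.companies and candidate not in picked:
                    candidate.related_to.append(company)
                    picked.append(candidate)
    return [story] + picked
```

With a Cisco story from an internal source and an external feed that contains a Lucent story, `enrich` returns both, and the Lucent story carries `related_to == ["Cisco"]`, which is the information a dashboard would need in order to label it as semantically related.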

HP 105

Semantic Associations supported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANY

• Related Stock News

• Industry News

• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY

• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY

• Technology Products: Important to INDUSTRY or COMPANY

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic Search, Personalization, etc.

A vision for future

Semantic Web, Complex Relationships, and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility

• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
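What "rules which allow us to draw inference" from RDF-style data might look like can be sketched with plain (subject, predicate, object) tuples. This is an assumption-laden toy, not a description logic reasoner and not the proposal cited on the slide: the two rules, the predicate names `subClassOf` and `type`, and the sample facts are chosen only for illustration.

```python
def infer(triples):
    """Forward-chain two rules until no new triples appear.

    Rule 1: (A subClassOf B) and (B subClassOf C) => (A subClassOf C)
    Rule 2: (x type A)       and (A subClassOf B) => (x type B)
    """
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p not in ("subClassOf", "type"):
                continue
            for s2, p2, o2 in facts:
                # Follow one subClassOf edge upward from the current object.
                if p2 == "subClassOf" and s2 == o:
                    new.add((s, p, o2))
        if new <= facts:  # fixpoint reached: nothing new was derived
            return facts
        facts |= new


# Hypothetical sample data.
TRIPLES = [
    ("herbivore", "subClassOf", "animal"),
    ("animal", "subClassOf", "organism"),
    ("daisy", "type", "herbivore"),
]
```

Running `infer(TRIPLES)` adds facts that were never stated explicitly, such as `("daisy", "type", "organism")`. A structural data model alone stores only the three original triples; the logic layer is what derives the rest.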

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]

A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

• Based on RDF + XML

• Agent-readable Tags

www.daml.org

DAML Example

[Figure: DAML example markup; the content is not recoverable from this transcript.]

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O

Schema Layer: Definition of Vocabulary - RDF Schema

Data Layer: Simple data model and syntax for metadata - RDF

OIL - as RDF Extension

    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>
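To make the meaning of this fragment concrete (a herbivore is an animal and is not a carnivore), the sketch below reads a similar fragment with Python's standard `xml.etree.ElementTree`. Several things here are assumptions: the wrapping `rdf:RDF` element, the `oil` namespace URI (a placeholder, since the slide does not give one), and the output keys `subclass_of` and `excludes`.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "oil": "http://example.org/oil#",  # placeholder URI, not the real OIL namespace
}

DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:oil="http://example.org/oil#">
  <rdfs:Class rdf:ID="herbivore">
    <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>"""

# Namespaced attributes are stored by ElementTree as "{uri}localname".
RDF = "{" + NS["rdf"] + "}"


def class_constraints(xml_text):
    """Map each class ID to its named superclasses and negated classes."""
    root = ET.fromstring(xml_text)
    result = {}
    for cls in root.findall("rdfs:Class", NS):
        name = cls.get(RDF + "ID")
        parents, excluded = [], []
        for sub in cls.findall("rdfs:subClassOf", NS):
            target = sub.get(RDF + "resource")
            if target is not None:
                # Plain superclass reference such as "#animal".
                parents.append(target.lstrip("#"))
            for operand in sub.findall("oil:NOT/oil:hasOperand", NS):
                # Superclass expressed as the complement of another class.
                excluded.append(operand.get(RDF + "resource").lstrip("#"))
        result[name] = {"subclass_of": parents, "excludes": excluded}
    return result
```

`class_constraints(DOC)` yields one entry for `herbivore`, with `animal` as a superclass and `carnivore` as an excluded class. The first part is expressible in RDF Schema alone; the negation is the kind of statement the OIL extension adds.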

DAML and OIL - Evolving towards Semantic Web

OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

    NuclearTest Causes Earthquake <=
        dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
        AND distance( NuclearTest.latitude, NuclearTest.longitude,
                      Earthquake.latitude, Earthquake.longitude ) < 10000
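The rule above can be written as an executable predicate. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes the thresholds are in days and miles, uses the haversine great-circle formula for `distance`, and requires the earthquake not to precede the test, which the prose states but the rule leaves implicit. The record layout (dictionaries with `eventDate`, `latitude`, `longitude`) is also an assumption.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point rounding before asin.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(
        test["latitude"], test["longitude"],
        quake["latitude"], quake["longitude"],
    ) < max_miles


def candidate_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if may_have_caused(t, q)]
```

Both thresholds are keyword parameters, so tightening the distance (for example `max_miles=100`) to test a stricter notion of "nearby region" does not require changing the rule itself.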

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
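The two aggregate queries in this example (earthquakes per year at magnitude 5.8 or higher, and the average per year within each magnitude range) can be sketched as below. The input layout, a list of `(year, magnitude)` pairs, and the treatment of range boundaries (lower bound inclusive, exactly 9.0 counted in the top range) are assumptions; the slides do not specify them.

```python
from collections import Counter

# (label, inclusive lower bound, exclusive upper bound)
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def bin_label(magnitude):
    """Return the range label for a magnitude, or None if below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes, start_year=1900, min_magnitude=5.8):
    """Number of earthquakes per year at or above min_magnitude."""
    counts = Counter()
    for year, magnitude in quakes:
        if year >= start_year and magnitude >= min_magnitude:
            counts[year] += 1
    return dict(counts)


def average_per_year_by_range(quakes, start_year, end_year):
    """Average yearly count of earthquakes in each magnitude range."""
    n_years = end_year - start_year + 1
    totals = Counter()
    for year, magnitude in quakes:
        if start_year <= year <= end_year:
            label = bin_label(magnitude)
            if label is not None:
                totals[label] += 1
    return {label: totals[label] / n_years for label, _, _ in BINS}
```

Comparing the output of `average_per_year_by_range` for a period before the first tests with a period after them is the kind of comparison the slide's conclusion rests on; the sketch makes no claim about what real catalog data would show.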

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998






HP 105

Semantic Associationssupported by Taalee Semantic Engine

Intelligent Content = What You Asked for + What you need to know

COMPANYCOMPANYRelated Stock News

Related Stock News

IndustryNews

IndustryNews

CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or

Related INDUSTRY

SECEPAEPA

RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY

Technology Products

Technology ProductsImportant to INDUSTRY or COMPANY

HP 106

Semantic Web Application ExampleFinancial Advisor Research Dashboard

Automatic Collation of semantically related digital media information from Multiple Sources

Research Inferred Automatically

Semantically Related News Not Specifically Asked For

Semantic SearchPersonalization etc

A vision for future

Semantic Web Complex Relationships and Knowledge Discovery

Eg InfoQuilt project at LSDIS Lab Univ of Georgia

HP 108

Beyond RDF ndash one proposal (cf Ora Lassila)

Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility

Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data

RDF + DL = ldquoFrame System for WWWrdquo

Source wwwontoknowledgeorgoil

HP 109

Semantic Web - next step in Web evolution

ldquoA Web in which machine reasoning will be ubiquitous and devastatingly powerfulrdquo [Berners-Lee]

ldquoA place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixturerdquo [Berners-Lee]

ldquoA semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activitiesrdquo [Berners-Lee]

A personal definitionSemantic Web The concept that Web-accessible

content can be organized semantically rather than though syntactic and structural methods

What is DAML (DARPA Agent Markup Language)

a proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic mannerBased on RDF+XMLAgent readable Tags

wwwdamlorg

DAML Example

Sou

rce

http

w

ww

zdn

etc

omp

cwee

kst

orie

sju

mps

04

270

2432

946

00h

tml

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support ndash OIL DAML-O

Schema Layer

Definition of Vocabulary RDF Schema

Data Layer

Simple data model and syntax for metadata - RDF

OIL ndash as RDF Extension

ltrdfsClass rdfID=rdquoherbivorerdquogtltrdftype

rdfresource=rdquohttpwwwontoknowledgeorgDefinedClassrdquogtltrdfssubClassOf rdfresource=rdquoanimalrdquogtltrdfssubClassOfgt

ltoilNOTgtltoilhasOperand rdfresource=rdquocarnivorerdquogt

ltoilNOTgtltrdfssubClassOfgt

ltrdfsClassgt

DAML and OIL ndash Evolving towards Semantic Web

OIL MissionOIL is a Web-based representation and inference layer for ontologies which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery Knowledge Discovery --ExampleExample

Earthquake Sources(USGS NEIC)

Nuclear Test Sources(Oklahoma Observatory etc)

Nuclear Test May Cause Earthquakes

Is it really true

Complex RelationshipsComplex Relationships

A nuclear test could have caused an earthquakeif the earthquake occurred some time after thenuclear test was conducted and in a nearby region

NuclearTest Causes Earthquakelt= dateDifference( NuclearTesteventDate

EarthquakeeventDate ) lt 30AND distance( NuclearTestlatitude

NuclearTestlongitudeEarthquakelatitudeEarthquakelongitude ) lt 10000

Knowledge Discovery Knowledge Discovery --ExampleExample

When was the first recorded nuclear test conducted

1950Find the total number of earthquakes with a magnitude58 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945

Knowledge Discovery Knowledge Discovery --ExamplehellipExamplehellip

For each group of earthquakes with magnitudes in the ranges58-6 6-7 7-8 8-9 and gt9 on the Richter scale per yearstarting from 1900 find average number of earthquakes

Number of earthquakes with magnitude gt 7 almost constant So nuclear tests probably only cause earthquakes with magnitude lt 7

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site was within a radius of 10000 miles of the earthquake's epicenter.

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998.

  • The Mysteries of Metadata: Workshop at Content World 2001, Burlingame, CA, May 15, 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperability: key metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards (or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Framework)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Meta Object Facility) and XMI
  • NewsML
  • NewsML...
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML - Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taalee's Intelligent Content Process
  • Metadata Creation and Semanticization
  • Forms/Types/Ingest of Content
  • Content Handling/Ingest
  • Extracting a Text Document: Syntactic approach
  • Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) Classification/Taxonomy & Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies, Taxonomies/Ontologies
  • Metadata enabled Applications
  • Use of Categories for Search
  • More than metadata
  • Metadata & Search
  • Taalee's Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries & Hot Topics
  • Semantic/Interactive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV: Taalee's Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associations supported by Taalee Semantic Engine
  • Semantic Web Application Example: Financial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF - one proposal (cf. Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL - as RDF Extension
  • DAML and OIL - Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Example...
  • Knowledge Discovery - Example...
  • Resources/References

HP 106

Semantic Web Application Example: Financial Advisor Research Dashboard

Automatic collation of semantically related digital media information from multiple sources

Research inferred automatically

Semantically related news not specifically asked for

Semantic Search, Personalization, etc.

A vision for future

Semantic Web: Complex Relationships and Knowledge Discovery

E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia

HP 108

Beyond RDF - one proposal (cf. Ora Lassila)

Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.

Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.

RDF + DL = "Frame System for WWW"

Source: www.ontoknowledge.org/oil
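To make the idea of drawing inferences from RDF-style data concrete, here is a small sketch that forward-chains two RDFS-style rules over (subject, predicate, object) triples. It is far weaker than a description logic and is not OIL or DAML; the predicate names "subClassOf" and "type" are simplified stand-ins, and the sample resources are hypothetical.

```python
def infer(triples):
    """Forward-chain two simple rules until no new triples appear.

    Rule 1: subClassOf is transitive.
    Rule 2: an instance of a class is also an instance of its superclasses.
    """
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p != "subClassOf":
                continue
            for s2, p2, o2 in facts:
                if p2 == "subClassOf" and s2 == o:
                    new.add((s, "subClassOf", o2))   # rule 1
                if p2 == "type" and o2 == s:
                    new.add((s2, "type", o))         # rule 2
        if new <= facts:
            return facts
        facts |= new


# Hypothetical sample data.
data = [
    ("herbivore", "subClassOf", "animal"),
    ("animal", "subClassOf", "livingThing"),
    ("bambi", "type", "herbivore"),
]
closure = infer(data)
```

After the call, `closure` contains the original triples plus the derived ones, for example that the hypothetical resource `bambi` is also of type `animal` and `livingThing`.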

HP 109

Semantic Web - next step in Web evolution

"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]

"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]

"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]

A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.

What is DAML (DARPA Agent Markup Language)

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

What is DAML (DARPA Agent Markup Language)?

A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner

Based on RDF + XML

Agent-readable tags

www.daml.org

DAML Example

Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html

Three layered Architecture Of Semantic Web

Logical Layer

Formal Semantics and Reasoning Support – OIL, DAML-O

Schema Layer

Definition of Vocabulary – RDF Schema

Data Layer

Simple data model and syntax for metadata – RDF
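
As a rough illustration of how the three layers stack, the sketch below stores plain triples (data layer), adds subclass statements (schema layer), and derives new type facts from them (a very small slice of what the logical layer is meant to offer). The resource names such as `ex:Leo` and the helper `infer_types` are hypothetical; they come neither from the workshop nor from any RDF library.

```python
# Data layer: metadata as (subject, predicate, object) triples.
# All names below are made up for illustration.
data = {
    ("ex:Leo", "rdf:type", "ex:Lion"),
}

# Schema layer: vocabulary definitions, here only rdfs:subClassOf.
schema = {
    ("ex:Lion", "rdfs:subClassOf", "ex:Carnivore"),
    ("ex:Carnivore", "rdfs:subClassOf", "ex:Animal"),
}


def infer_types(data, schema):
    """Tiny slice of a logical layer: propagate rdf:type along rdfs:subClassOf."""
    parents = {}
    for sub, pred, sup in schema:
        if pred == "rdfs:subClassOf":
            parents.setdefault(sub, set()).add(sup)

    inferred = set(data)
    changed = True
    while changed:  # repeat until no new triple is produced (fixpoint)
        changed = False
        for subj, pred, cls in list(inferred):
            if pred != "rdf:type":
                continue
            for sup in parents.get(cls, ()):
                triple = (subj, "rdf:type", sup)
                if triple not in inferred:
                    inferred.add(triple)
                    changed = True
    return inferred


facts = infer_types(data, schema)
```

The data alone only says that `ex:Leo` is a lion; the facts that it is also a carnivore and an animal appear only once the schema statements are applied.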

OIL – as RDF Extension

<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
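
Read as a class definition, the fragment above says that a herbivore is an animal that is not a carnivore. A minimal sketch of that reading with Python sets follows; the individuals (`cow`, `lion`, `rabbit`, `oak`) and the function `is_herbivore` are made up for illustration and are not part of OIL or its tooling.

```python
# Hypothetical class extensions (sets of individuals).
animals = {"cow", "lion", "rabbit"}
carnivores = {"lion"}


def is_herbivore(individual, animals, carnivores):
    """herbivore = animal AND NOT carnivore, as in the class definition."""
    return individual in animals and individual not in carnivores


# Classify every known animal against the definition.
herbivores = {x for x in animals if is_herbivore(x, animals, carnivores)}
```

Because the class is a defined class, membership follows from the definition itself: any animal that is not a carnivore is classified as a herbivore without being tagged as one.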

DAML and OIL – Evolving towards Semantic Web

OIL Mission

OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics

Knowledge Discovery - Example

Earthquake Sources (USGS, NEIC)

Nuclear Test Sources (Oklahoma Observatory, etc.)

Nuclear Test May Cause Earthquakes

Is it really true?

Complex Relationships

A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region

NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
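
The rule above can be sketched as a predicate over two records. This is only an illustration under stated assumptions: the thresholds are read as days and miles, the date difference is taken as earthquake date minus test date and must not be negative, and `distance` is computed as a great-circle (haversine) distance. The function names and the sample records are hypothetical, not the workshop's implementation or data.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumed unit


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Made-up sample records, not real catalog entries.
test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_after = {"eventDate": date(1970, 1, 11), "latitude": 36.0, "longitude": -117.0}
quake_before = {"eventDate": date(1969, 12, 25), "latitude": 36.0, "longitude": -117.0}
quake_late = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}

# Candidate (test, quake) pairs that satisfy the rule.
pairs = [(t, q) for t in [test]
         for q in [quake_after, quake_before, quake_late]
         if may_have_caused(t, q)]
```

Only `quake_after` is paired with the test here: one of the other two earthquakes precedes the test and the other falls outside the 30-day window.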

Knowledge Discovery - Example

When was the first recorded nuclear test conducted?

1950

Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900

Increase in number of earthquakes since 1945
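
The per-year count asked for above could be computed along these lines. The sketch assumes each earthquake record is reduced to a `(year, magnitude)` pair; the sample values are invented for illustration and are not taken from the USGS or NEIC catalogs.

```python
from collections import Counter

# Hypothetical sample records: (year, magnitude). Not real catalog data.
quakes = [(1899, 6.1), (1900, 5.8), (1900, 5.2), (1901, 7.4), (1901, 6.0)]


def strong_quakes_per_year(records, min_magnitude=5.8, start_year=1900):
    """Count earthquakes at or above min_magnitude for each year >= start_year."""
    return Counter(
        year
        for year, magnitude in records
        if year >= start_year and magnitude >= min_magnitude
    )


counts = strong_quakes_per_year(quakes)
```

Comparing the yearly counts before and after a given year is then a matter of splitting `counts` on the year key.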

Knowledge Discovery - Example…

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
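
A minimal sketch of the grouping step follows. It assumes half-open magnitude bins (a value on a boundary goes to the higher bin, so exactly 9.0 lands in the last one) and averages over every year in the chosen window, including years with no earthquakes. The bin handling, function names, and sample records are assumptions made for illustration.

```python
from collections import Counter

# (low, high, label): low is inclusive, high is exclusive (assumed convention).
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the bin containing magnitude, or None if below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(records, start_year, end_year):
    """Average number of earthquakes per year in each magnitude bin."""
    n_years = end_year - start_year + 1
    totals = Counter()
    for year, magnitude in records:
        label = magnitude_bin(magnitude)
        if label is not None and start_year <= year <= end_year:
            totals[label] += 1
    return {label: totals[label] / n_years for _, _, label in BINS}


# Hypothetical sample records: (year, magnitude). Not real catalog data.
records = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1), (1902, 4.0)]
averages = average_per_year(records, 1900, 1901)
```

Running the same function over two windows (for example, before and after the first tests) gives the per-bin averages that the comparison above relies on.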

Knowledge Discovery - Example…

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lies within a radius of 10000 miles of the epicenter of the earthquake

Demo


  • The Mysteries of MetadataWorkshop at Content World 2001 Burlingame CA May 15 2001
  • Workshop Agenda
  • What is Metadata
  • Information Interoperabilitykey metadata objective and benefit
  • Semantics
  • Types of Metadata for digital media
  • Metadata for Digital Data
  • Types of Specs and Standards(or MetaModels)
  • what RDF can do for metadata
  • RDF (Resource Description Format)
  • RDF Example 1
  • RDF Example 2
  • RDFS (RDF Schema)
  • RDF Based Web
  • Dublin Core Metadata Initiative
  • Dublin Core RDF
  • MOF (Metadata Object Facility) and XMI
  • NewsML
  • NewsMLhellip
  • Example of the end-to-end flow - NewsML
  • PRISM
  • PRISM Design
  • PRISM Example
  • Voice Based Internet Applications
  • Voice XML Metadata
  • VoiceXML ndash Possible Services
  • MPEG7
  • Application Examples for MPEG7
  • Information and Content Exchange (ICE)
  • What is the ICE Protocol
  • ICE Applications
  • ICE Explained
  • XCM (eXtended Content Management)
  • XCM
  • Different views of Metadata
  • Creating and Serving Metadata to Power the Life-cycle of Content
  • Taaleersquos Intelligent Content Process
  • Metadata Creation and Semanticization
  • FormsTypesIngest of Content
  • Content HandlingIngest
  • Extracting a Text DocumentSyntactic approach
  • Automatic Categorization amp Metadata Tagging (unstructured texttranscript of AV)
  • Basis for Semantics
  • Basis for Semantics
  • Alternatives for Metadata Extraction
  • Open Directory Project (ODP) ClassificationTaxonomy amp Directory
  • Ontology
  • Ontology
  • An Ontology
  • Large Vocabularies TaxonomiesOntologies
  • Metadata enabledApplications
  • Use of Categories for Search
  • More than metadata
  • Metadata amp Search
  • Taaleersquos Semantic Search
  • Metadata Application Example
  • Personalized Directory
  • Personalized Queries amp Hot Topics
  • SemanticInteractive Targeting
  • Web Extreme Personalization
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • Application of Semantic Metadata and Automatic Content Enrichment
  • iTV Taaleersquos Extreme Personalization
  • Metadata for Automatic Content Enrichment
  • Metadata in Enterprise Apps
  • Semantic Associationssupported by Taalee Semantic Engine
  • Semantic Web Application ExampleFinancial Advisor Research Dashboard
  • A vision for future
  • Beyond RDF ndash one proposal (cf Ora Lassila)
  • Semantic Web - next step in Web evolution
  • What is DAML (DARPA Agent Markup Language)
  • Three layered Architecture Of Semantic Web
  • OIL ndash as RDF Extension
  • DAML and OIL ndash Evolving towards Semantic Web
  • Knowledge Discovery - Example
  • Complex Relationships
  • Knowledge Discovery - Example
  • Knowledge Discovery - Examplehellip
  • Knowledge Discovery - Examplehellip
  • ResourcesReferences

Knowledge Discovery Knowledge Discovery --ExampleExample

Earthquake Sources(USGS NEIC)

Nuclear Test Sources(Oklahoma Observatory etc)

Nuclear Test May Cause Earthquakes

Is it really true

Complex RelationshipsComplex Relationships

A nuclear test could have caused an earthquakeif the earthquake occurred some time after thenuclear test was conducted and in a nearby region

NuclearTest Causes Earthquakelt= dateDifference( NuclearTesteventDate

EarthquakeeventDate ) lt 30AND distance( NuclearTestlatitude

NuclearTestlongitudeEarthquakelatitudeEarthquakelongitude ) lt 10000

Knowledge Discovery Knowledge Discovery --ExampleExample

When was the first recorded nuclear test conducted

1950Find the total number of earthquakes with a magnitude58 or higher on the Richter scale per year starting from 1900

Increase in number of earthquakes since 1945
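
The per-year query above amounts to a filter followed by a group-and-count. A minimal sketch follows, assuming earthquake records are available as `(year, magnitude)` pairs. That record shape, the function name, and the sample values are illustrative assumptions, not part of the workshop demo.

```python
from collections import Counter


def quakes_per_year(records, min_magnitude=5.8, start_year=1900):
    """Count earthquakes per year, keeping only those at or above
    min_magnitude and in or after start_year.

    records: iterable of (year, magnitude) pairs (assumed shape).
    """
    counts = Counter(
        year
        for year, magnitude in records
        if year >= start_year and magnitude >= min_magnitude
    )
    return dict(sorted(counts.items()))


# Made-up sample records, for illustration only.
sample = [(1899, 7.0), (1900, 5.8), (1900, 5.7), (1901, 6.2), (1901, 8.1)]
print(quakes_per_year(sample))
```

Comparing the resulting counts before and after a given year is then a matter of slicing the returned dictionary by key.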

Knowledge Discovery - Example...

For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and > 9 on the Richter scale, per year starting from 1900, find the average number of earthquakes

Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
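
The grouped query can be sketched as binning by magnitude range and dividing each bin's total by the number of years considered. This is only an illustration under stated assumptions: records are `(year, magnitude)` pairs, the ranges are treated as half-open intervals (so a magnitude of exactly 9.0 lands in the last bin), and years with no qualifying earthquakes count as zero in the average.

```python
from math import inf

# Magnitude ranges from the slide, read as half-open intervals [low, high).
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, inf)]


def average_per_year_by_bin(records, start_year, end_year):
    """Average yearly earthquake count for each magnitude range.

    records: iterable of (year, magnitude) pairs (assumed shape).
    The average is taken over every year from start_year to end_year
    inclusive, including years that have no earthquakes in a range.
    """
    n_years = end_year - start_year + 1
    totals = {b: 0 for b in BINS}
    for year, magnitude in records:
        if not start_year <= year <= end_year:
            continue
        for low, high in BINS:
            if low <= magnitude < high:
                totals[(low, high)] += 1
                break
    return {b: totals[b] / n_years for b in BINS}


# Made-up sample records, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.1),
          (1901, 9.2), (1899, 7.5), (1901, 5.0)]
for (low, high), avg in average_per_year_by_bin(sample, 1900, 1901).items():
    print(f"{low}-{high}: {avg}")
```

Running the same function over two periods, for example before and after a chosen year, gives the per-range comparison the slide draws its conclusion from.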

Knowledge Discovery - Example...

Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake

Demo

Resources/References

RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com

Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw Hill, ISBN 0-07-057735-8, 1998
