All Presentation Material Copyright Eurostep Group AB ®

The Semantic Web Made Simple

David Price
December 2004

david.price@eurostep.com

Agenda

• The Current Web
  – and its technologies
  – How’s it work now?

• The Semantic Web
  – is adding semantics
  – How’s it going to work in the future?

The Current Web

• Web core concepts
  – People read Web pages
  – Web page authors can control basic layout
  – Web pages need to link to each other
  – Web pages need to link to online media
    • that people read, view, listen to or interpret
  – People use tools that search/recall Web content (Yahoo, Google, Lycos, their own bookmarks)

What’s on a Web Page?

[Screenshot of a Web page with callouts:]
• Text that’s actually graphics
• Categories of articles
• A photograph
• Online shopping link
• Article title and link
• Article abstract
• Date and time
• Location, temperature and unit

What we saw

• Things on the NY Times site
  – Text that’s actually a graphic
  – Categories of articles
  – Online shopping link
  – A photograph
  – Article title and link
  – Article abstract
  – Date and time
  – Location, temperature and unit
• How did we know that?
  – Because we are humans who can read English and who can interpret what we see

What did the editors do?

• Determined the layout of the pages as a whole
  – Should it look like a real paper? Should there be advertising?
• Wrote the text
• Decided on navigation
  – Article categories called “International”, “National”, “Sports”, etc.
    • Each item in the category list links to a separate page for that category, with a list of articles
  – Users will have to scroll down the page to see the headline articles
  – Article titles will link directly to a separate page for each article

How did they do that?

• They used HTML and graphic images
• Hypertext Markup Language (HTML) allows editors to
  – control presentation and layout
    • Paragraph, Bold, Table/Column/Row
  – add links to other pages
    • Hyperlink Reference
  – show graphics
    • Images of many types are natively supported by browsers
  – link to other media that have software to present them
    • music, video, PDF, documents, presentations

A peek under the covers

How does that work?

• HTML is a standard language
  – The World Wide Web Consortium standardized it
• Companies have written software that reads HTML and presents it to you
  – These are Web browsers
• The presentation capabilities of HTML, the related media and browsers are pretty powerful

How does HTML really work?

• What do the browsers understand?
  – <P>This is a paragraph.</P>
    • Present the text “This is a paragraph.” as a new paragraph
  – <A HREF="newsitems.html">News</A>
    • Present a hyperlink with the text “News” and, if it’s selected, present a new page from the file “newsitems.html”
  – <TR><TD>dog</TD><TD>cat</TD></TR>
    • In the current row of the table, present the text “dog” in column 1 and the text “cat” in column 2
  – <IMG SRC="p1.jpg" />
    • Present an image from whatever is in the file named “p1.jpg”
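
To make the browser's point of view concrete, here is a minimal Python sketch (not part of the original slides) that feeds markup like the examples above to the standard library's `html.parser`. The `TagLister` class and the `list_events` helper are hypothetical names introduced for illustration. The sketch only shows that a parser sees tag names, attributes and text; nothing in that stream says "this is a news article".

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Records what a parser can see: tag names, attributes and raw text."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lower case
        self.events.append(("start", tag, dict(attrs)))

    def handle_data(self, data):
        # Skip whitespace-only text between tags
        if data.strip():
            self.events.append(("text", data.strip()))


def list_events(html):
    """Parse an HTML fragment and return the recorded events in order."""
    parser = TagLister()
    parser.feed(html)
    parser.close()
    return parser.events


events = list_events(
    '<P>Read the <A HREF="newsitems.html">News</A></P><IMG SRC="p1.jpg" />'
)
for event in events:
    print(event)
```

The output is purely structural: a paragraph, a link target, an image file name. Any meaning still has to come from a human reader.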

So, What’s the problem?

• Only a human being can read a Web page and extract any meaning from it
  – The Web browser does understand paragraph, image, link
  – The Web browser does not know it’s linking to a “News Article” or that the image is a “picture of photographs”
• It’s the meaning that’s really important
• Wouldn’t it be powerful if computers could get some of the meaning out of Web pages?

Why is it a powerful idea?

• Using our NY Times/newspaper site example…
  – Suppose you were an Environmental Group
  – Suppose you want to monitor news stories about the environment or pollution
  – You could write a program that searches the Web media outlets
  – That program could trigger a notification about articles on environmental issues
  – Or, it could contact members of your group in specific locations when it finds legislation related to pollution in particular US states
  – This would save your members a lot of time searching for themselves, wouldn’t it?
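
The monitoring program described above could look roughly like the following Python sketch. It assumes that each article already comes with machine-readable subjects and, where relevant, a US state, which is exactly what today's Web pages do not provide. The `Article` class, the `notifications` function, the routing rule and all example URLs and member names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Article:
    """An article as a program would see it once subjects are machine readable."""
    url: str
    subjects: frozenset
    us_state: Optional[str] = None


WATCHED_SUBJECTS = {"environment", "pollution"}


def notifications(articles, members_by_state):
    """Yield (member, url) pairs for articles on watched subjects."""
    for article in articles:
        if not (article.subjects & WATCHED_SUBJECTS):
            continue  # not about the environment or pollution
        # Assumed rule: state legislation goes only to members in that state
        if "legislation" in article.subjects and article.us_state:
            recipients = members_by_state.get(article.us_state, [])
        else:
            recipients = [m for ms in members_by_state.values() for m in ms]
        for member in recipients:
            yield member, article.url


articles = [
    Article("http://example.org/a1", frozenset({"pollution", "legislation"}), "Ohio"),
    Article("http://example.org/a2", frozenset({"sports"})),
    Article("http://example.org/a3", frozenset({"environment"})),
]
members_by_state = {"Ohio": ["ann"], "Texas": ["bob"]}
result = list(notifications(articles, members_by_state))
```

The filtering logic is trivial. The hard part is getting `subjects` and `us_state` out of a page without a human reading it.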

The Semantic Web

• Figuring out how to get meaning out of things on the Web using software is what “The Semantic Web” is all about
  – “using software” means “without humans doing the interpretation”
• How would one do that?
  – Clearly, HTML is not sufficient, so more powerful languages are required
  – Clearly, we cannot replace everything already on the Web, so ways to add meaning are required
  – Need to combine better languages/communication, computer science and the study of what things mean

Semantics

• People have been studying what things exist and what they mean for centuries
  – This is called Philosophy
• People have been studying how people communicate for decades
  – This is called Linguistics
• People have been studying how computers can “learn” for a few decades
  – This is called Artificial Intelligence

The Semantic Web

• Vision of Web “inventor” Tim Berners-Lee and others
  – Wrote an article in Scientific American in 2001
• Goals
  – Go beyond processing by human beings
  – Make Web content computer processable
• How?
  – Add semantics using ontologies
  – Use inference/reasoning over ontologies

Ontologies

• Ontology
  – A big word from philosophy, linguistics, and computer science
  – A formal, machine readable specification of a domain of interest
    • Names things and adds knowledge about and constraints on the things
    • Allows relationships between terms within and between different ontologies
• Semantic Web researchers and W3C have been working on this for several years now

OWL History

• US researchers produced DAML-ONT in 2000
  – DARPA Agent Markup Language – Ontology Language
• European researchers produced OIL about the same time
  – Ontology Inference Layer
• The two were merged to produce DAML+OIL, which was submitted as a Note to W3C, and the W3C WebOnt group was formed in 2001
• W3C WebOnt Group produced OWL in 2003
  – OWL is now a W3C Recommendation
• This is not really that important for our purposes… just remember that OWL didn’t appear overnight

What is OWL?

• The World Wide Web Consortium (W3C) created the HTML and XML standards
• OWL is a next-generation W3C Web standard
  – its purpose is to add “semantics” to the Web
    • Therefore, it can be distributed and is Web-enabled and does not assume a single source for everything
  – In concept, it is very much like other data modelling languages (it calls models or schemas “ontologies”)
    • class, subclass, property, property type, instance/individual
  – supports set theory and logic-based statements about the classes and individuals
  – it has more than one syntax, XML being one
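
As a rough illustration of the class, subclass and individual ideas listed above, the following Python sketch treats classes as sets of individuals and infers membership through subclass links. It does not use OWL syntax or any OWL library. The class names, the individuals and the helper functions are all hypothetical.

```python
# Hypothetical mini-ontology: each class names at most one direct superclass
subclass_of = {
    "NewsArticle": "Article",
    "Editorial": "Article",
    "Article": "Document",
}

# Hypothetical individuals with the single class asserted for each
asserted_type = {
    "story42": "NewsArticle",
    "opinion7": "Editorial",
    "memo1": "Document",
}


def superclasses(cls):
    """Return cls plus every class reachable by following subclass links."""
    chain = [cls]
    while chain[-1] in subclass_of:
        chain.append(subclass_of[chain[-1]])
    return chain


def individuals_of(cls):
    """Set of individuals that belong to cls, directly or by inference."""
    return {ind for ind, t in asserted_type.items() if cls in superclasses(t)}


articles = individuals_of("Article")
documents = individuals_of("Document")

# Set-theory style statements about the classes
every_article_is_a_document = articles <= documents
news_and_editorial_overlap = individuals_of("NewsArticle") & individuals_of("Editorial")
```

Nobody stated that `story42` is a `Document`. That fact follows from the subclass links, which is a very small instance of the logic-based reasoning mentioned above.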

RDF underlies OWL

• RDF is another W3C standard, the Resource Description Framework
  – RDF is simple in concept but sufficient for many basic Semantic Web tasks (e.g. who created this presentation?)
  – It allows you to assign a property with a value to a Web page (or any Web resource)

Resource: http://www.eurostep.com/TheSemanticWeb.ppt
Property: Creator
Value: David Price
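
The resource/property/value statement above maps naturally onto a three-field record. The Python sketch below is one way to hold such statements and answer the question “who created this presentation?”. The `Statement` type and the `values_of` helper are hypothetical names, and the bare string "Creator" stands in for what would be a full property URI in real RDF.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One RDF-style statement: a resource, a property and a value."""
    resource: str
    prop: str
    value: str


statements = [
    Statement("http://www.eurostep.com/TheSemanticWeb.ppt", "Creator", "David Price"),
]


def values_of(resource, prop):
    """Return every value recorded for the given resource and property."""
    return [s.value for s in statements if s.resource == resource and s.prop == prop]


creators = values_of("http://www.eurostep.com/TheSemanticWeb.ppt", "Creator")
print(creators)
```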

RDF underlies OWL

• RDF is another W3C standard, the Resource Description Framework
  – RDF is simple in concept but sufficient for many basic Semantic Web tasks (e.g. who created this presentation?)
  – RDF is often represented by nodes and arcs

http://www.eurostep.com/TheSemanticWeb.ppt --Creator--> “David Price”
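
A node-and-arc statement like this one can also be written down as a line of text. The Python sketch below prints it in the style of the W3C N-Triples syntax, one statement per line. The slide only says “Creator”; using the Dublin Core `creator` property URI for it is an assumption made here, and `to_ntriple` is a hypothetical helper that handles only plain string values.

```python
# Assumption: the Dublin Core "creator" property stands in for the slide's "Creator"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"


def to_ntriple(subject_uri, predicate_uri, literal):
    """Format one statement with a plain string value as an N-Triples line."""
    escaped = (
        literal.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'<{subject_uri}> <{predicate_uri}> "{escaped}" .'


line = to_ntriple(
    "http://www.eurostep.com/TheSemanticWeb.ppt", DC_CREATOR, "David Price"
)
print(line)
```

The subject node and the arc label are both URIs in angle brackets, and the value node is a quoted literal.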

Back to the NY Times

What we saw… again

• Things on the NY Times site
  – Text that’s actually a graphic
  – Categories of articles
  – Online shopping link
  – A photograph
  – Article title and link
  – Article abstract
  – Date and time
  – Location, temperature and unit
• How did we know that?
  – Because we are humans who can read English and who can interpret what we see

A peek under the semantic covers

[Figure callouts:]
• Newspaper ontology
• Article Authors
• Date
• Article title
• Article Subjects

Without using an editor …

Now these are semantics a software application can understand… Articles and Authors

On Annotating the Web

• You might ask: But what about the current Web content? We’re not going to rewrite it all, are we?
• And we’d answer: Of course not, but you can “annotate” it to add semantics.
• What this means is:
  – Descriptive ontologies like the one for Newspapers are being developed
  – Descriptions are then linked to already existing Web pages, including any multi-media content (e.g. video)
  – The Semantic Web community calls this “annotating a Web resource”
  – You’ll also hear people use the term “metadata”
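
The key point of annotation is that the description is stored beside the existing resource rather than inside it. The Python sketch below keeps annotations in a separate table keyed by URL, so the page itself is never touched. The `annotate` and `describe` functions, the property names and the example URLs are hypothetical.

```python
# Annotation store kept apart from the pages: URL -> {property: [values]}
annotations = {}


def annotate(resource_url, prop, value):
    """Attach one more property value to an existing Web resource."""
    annotations.setdefault(resource_url, {}).setdefault(prop, []).append(value)


def describe(resource_url):
    """Return everything recorded about a resource (empty if never annotated)."""
    return annotations.get(resource_url, {})


url = "http://news.example.org/2004/clean-air.html"
annotate(url, "type", "Article")
annotate(url, "subject", "pollution")
annotate(url, "subject", "legislation")

description = describe(url)
unknown = describe("http://news.example.org/other.html")
```

The same approach works for a video or an image, since only its URL is needed to hang a description on it.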

So, How does OWL Work?

• An Ontology
  – is a formal description of a field of interest
  – defines Classes – the kinds of things of interest
    • Article, Person, etc.
  – defines Properties – the relationships and characteristics related to Classes
    • Article is WrittenBy Person, Person has Name
• Then, based on the Ontology, people create content
  – An author writes articles using software that understands the Newspaper Ontology
  – The Publisher gathers all the articles, classifieds, etc. and links them into the online version of the NY Times
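
The Classes and Properties above can be sketched as a tiny data structure, with a check of the kind authoring software might run on new content. This is Python rather than OWL, and the `check` function and the identifiers `article1` and `author1` are hypothetical. Treating each property's subject and value classes as hard constraints is a simplification: an OWL reasoner would typically use such declarations to infer types instead of rejecting statements.

```python
CLASSES = {"Article", "Person"}

# property name -> (class of the subject, class of the value or "Literal")
PROPERTIES = {
    "writtenBy": ("Article", "Person"),
    "name": ("Person", "Literal"),
}

# Every property must refer to a declared class on its subject side
assert all(subject_cls in CLASSES for subject_cls, _ in PROPERTIES.values())


def check(statement, types):
    """Return True if (subject, property, value) fits the ontology."""
    subject, prop, value = statement
    if prop not in PROPERTIES:
        return False
    subject_cls, value_cls = PROPERTIES[prop]
    if types.get(subject) != subject_cls:
        return False
    if value_cls == "Literal":
        # A literal is plain text, not the identifier of a known individual
        return isinstance(value, str) and value not in types
    return types.get(value) == value_cls


types = {"article1": "Article", "author1": "Person"}
ok = check(("article1", "writtenBy", "author1"), types)
reversed_roles = check(("author1", "writtenBy", "article1"), types)
named = check(("author1", "name", "David Price"), types)
```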

But how does that help?

• If everyone, or at least a reasonably large community, agreed on an ontology for Newspapers
  – then sharing articles between sites is possible
  – presentation can be layered on top of the semantic content of the articles
  – Web robots, only smarter than Google, can find and relate content about specific subjects, by specific authors, etc.
• The key is getting agreement on the ontologies
  – This is ongoing in various standards bodies, consortia, etc. but remains a major issue for the Semantic Web
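
A short Python sketch shows why the shared ontology matters. If two sites describe their articles with the same property names, their statements can simply be pooled and queried together. The two sites, their URLs, the authors and the `find` helper are hypothetical.

```python
# Statements published by two hypothetical newspaper sites, same vocabulary
site_a = {
    ("http://a.example.org/1", "subject", "pollution"),
    ("http://a.example.org/1", "writtenBy", "Jane Doe"),
}
site_b = {
    ("http://b.example.org/x", "subject", "pollution"),
    ("http://b.example.org/x", "writtenBy", "John Roe"),
    ("http://b.example.org/y", "subject", "sports"),
}


def find(triples, prop, value):
    """Return the sorted resources that have the given property value."""
    return sorted(s for s, p, v in triples if p == prop and v == value)


# Sharing between sites is just a set union when the vocabulary matches
merged = site_a | site_b
about_pollution = find(merged, "subject", "pollution")
by_jane = find(merged, "writtenBy", "Jane Doe")
```

If one site had used "topic" where the other used "subject", the union would still succeed but the query would silently miss half the articles. That is the agreement problem in miniature.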

In Conclusion

• The Semantic Web goal is to make semantic content of Web pages available for software applications
• Work has been ongoing for several years
  – Building on decades of research
• The OWL language is a key development
  – As are the languages upon which it is based, such as RDF Schema
• But that’s for another day…