Harpers.org: a Semantic Web(ish) site for Harper’s Magazine

Paul Ford, Associate Web Editor, Harpers.org [email protected]


Page 1: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Harpers.org: a Semantic Web(ish) site for Harper’s Magazine

Paul Ford, Associate Web Editor, Harpers.org [email protected]

Page 2: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Harper’s is…

- A magazine of literature, politics, culture, and the arts published continuously from 1850

- A small non-profit

Page 3: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Available content

- The Weekly Review, an emailed summary of world events, from 2000

- The Harper’s Index, a statistical portrait of the world, from 1998

- Public domain, scanned-in archives from 1850-1982

- Readings

- Occasional features

Page 4: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

And that’s it.

- Maybe full text of issues will be offered someday, but not soon. So…

- How do we get more value out of limited content?

Page 5: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Solution

- Hack up what we have into bits by content type, then…

- Reassemble it according to link targets…

- Which are arranged in a taxonomy…

- Creating a very small “Semantic Web” for Harpers.org

Page 6: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

A quick demo…

- >>>

Page 7: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

How it works

- Simple set of ontological relationships (partOf, supervisorOf)

- Taxonomy of content

- & narrative content

  - that is split into smaller pieces

  - & links into the taxonomy
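One way to picture the partOf relationship is as a chain that a link target can be followed up. The sketch below is only an illustration: the topic names are hypothetical, and the idea that an item linked to a narrow topic also surfaces on its broader topics is an assumption, not something the slides state.

```python
# Hypothetical partOf facts; these names are illustrative and do not come
# from the actual Harpers.org taxonomy.
PART_OF = {
    "#Baghdad": "#Iraq",
    "#Iraq": "#MiddleEast",
    "#MiddleEast": "#World",
}


def ancestors(topic, part_of=PART_OF):
    """Follow partOf links upward; return enclosing topics, nearest first."""
    chain = []
    seen = {topic}
    while topic in part_of:
        topic = part_of[topic]
        if topic in seen:  # guard against an accidental cycle
            break
        seen.add(topic)
        chain.append(topic)
    return chain


def pages_for_link(target):
    """Assumed behavior: a link target contributes to its own page and to
    the page of every topic it is part of."""
    return [target] + ancestors(target)


print(pages_for_link("#Baghdad"))
```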

Page 8: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Markup

- Text: “Country Y announced that it had cut off relations with country Z. On Wednesday, something happened to persons X and Y.”

Page 9: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Markup

<event>Country Y announced that it had cut off relations with country Z.</event>

<event>On Wednesday, something happened to persons X and Y.</event>

Page 10: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Markup

<event on="2004-03-12" id="24848">

Country Y announced that it had cut off relations with country Z.

</event>

Page 11: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Markup

<event on="2004-03-12" id="24848">

<link to="#CountryY">Country Y</link> announced that it had cut off relations with <link to="#CountryZ">country Z</link>.

</event>
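To make the markup concrete, here is a minimal sketch of reading one such event back out. The site itself used XSLT; Python's standard library stands in here, and the function name and the shape of the returned dictionary are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The slide's example, written with straight quotes so it parses as XML.
EVENT_XML = (
    '<event on="2004-03-12" id="24848">'
    '<link to="#CountryY">Country Y</link> announced that it had cut off '
    'relations with <link to="#CountryZ">country Z</link>.'
    '</event>'
)


def describe_event(xml_text):
    """Return the event's id, date, plain text, and link targets."""
    event = ET.fromstring(xml_text)
    return {
        "id": event.get("id"),
        "on": event.get("on"),
        # itertext() yields text in document order with the tags dropped.
        "text": "".join(event.itertext()),
        "targets": [link.get("to") for link in event.iter("link")],
    }


info = describe_event(EVENT_XML)
print(info["text"])
print(info["targets"])
```

The date and id attributes identify the event; the link targets are what tie it into the taxonomy.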

Page 12: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Conditionals

- Some text required conditional markup

- Text: “Country Y announced that it had cut off relations with country Z, and on Wednesday, something happened to persons X and Y.”

Page 13: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Conditionals: ugly, but simple

<event>Country Y announced that it had cut off relations with country Z<cond is="id">, and</cond><cond not="id">.</cond></event>

<event> <cond is="id">on</cond><cond not="id">On</cond> Wednesday, something happened to persons X and Y.</event>

Page 14: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Conditionals: ugly, but simple

- Narrative version

  - Country Y announced that it had cut off relations with country Z, and on Wednesday, something happened to persons X and Y.

- Timeline-friendly version

  - Country Y announced that it had cut off relations with country Z.

  - On Wednesday, something happened to persons X and Y.
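The two renderings can be sketched as a single pass over the markup that keeps one branch of each conditional. This is an illustration under assumptions: the sample is a cleaned-up copy of the slide's markup (straight quotes, tidied whitespace, a hypothetical wrapper element), and reading the attribute value as a context name, with "id" meaning the narrative context, is a guess at the intended semantics.

```python
import xml.etree.ElementTree as ET

# Cleaned-up copy of the slide's conditional markup; <review> is a
# hypothetical wrapper so both events live in one document.
SOURCE = (
    '<review>'
    '<event>Country Y announced that it had cut off relations with country Z'
    '<cond is="id">, and</cond><cond not="id">.</cond></event>'
    '<event> <cond is="id">on</cond><cond not="id">On</cond> Wednesday, '
    'something happened to persons X and Y.</event>'
    '</review>'
)


def render(elem, context):
    """Flatten elem to text, keeping only <cond> branches matching context."""
    parts = [elem.text or ""]
    for child in elem:
        keep = True
        if child.tag == "cond":
            if "is" in child.attrib:
                keep = child.get("is") == context
            else:
                keep = child.get("not") != context
        if keep:
            parts.append(render(child, context))
        # The tail belongs to the parent's text flow, so it is always kept.
        parts.append(child.tail or "")
    return "".join(parts)


root = ET.fromstring(SOURCE)
# Narrative: events run together as one sentence.
narrative = "".join(render(e, "id") for e in root.iter("event"))
# Timeline: each event stands alone as its own entry.
timeline = [render(e, None).strip() for e in root.iter("event")]
print(narrative)
print(timeline)
```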

Page 15: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

All of it gets slurped up

- And turned into a set of triples

- Then processed in-memory

- With HTML pages spit out as a result
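The pipeline above can be sketched in a few lines: read the events, flatten them into triples, group them by link target, and emit one HTML fragment per target. Everything specific here is an assumption for illustration: the predicate names ("on", "text", "mentions"), the wrapper element, the second event and its id, and the HTML shape. The real site did this with XSLT 2.0.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from html import escape

# Hypothetical source: the first event is the slide's example, the second
# is invented so that one topic has more than one event.
SOURCE = """
<review>
  <event on="2004-03-12" id="24848"><link to="#CountryY">Country Y</link> announced that it had cut off relations with <link to="#CountryZ">country Z</link>.</event>
  <event on="2004-03-10" id="24849">Something happened in <link to="#CountryZ">country Z</link>.</event>
</review>
""".strip()


def to_triples(root):
    """Yield (subject, predicate, object) tuples for every event."""
    for event in root.iter("event"):
        subject = "#event-" + event.get("id")
        yield (subject, "on", event.get("on"))
        yield (subject, "text", "".join(event.itertext()))
        for link in event.iter("link"):
            yield (subject, "mentions", link.get("to"))


def build_pages(triples):
    """Group events by the topic they mention; return topic -> HTML list."""
    text, date, mentions = {}, {}, defaultdict(list)
    for s, p, o in triples:
        if p == "text":
            text[s] = o
        elif p == "on":
            date[s] = o
        elif p == "mentions":
            mentions[o].append(s)
    pages = {}
    for topic, subjects in mentions.items():
        items = "".join(
            "<li>{}: {}</li>".format(escape(date[s]), escape(text[s]))
            for s in sorted(subjects, key=date.get)  # oldest first
        )
        pages[topic] = "<ul>" + items + "</ul>"
    return pages


triples = list(to_triples(ET.fromstring(SOURCE)))
pages = build_pages(triples)
print(pages["#CountryZ"])
```

Adding one more event to the source changes every page whose topic it links to, which is the "remix" effect the talk describes.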

Page 16: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Hard, then easy

- Hard to get started (lots of events, facts, and links)

- Easy to keep going, if you don’t mind the markup and use a good text editor

Page 17: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Tools used

- emacs, vi, BBEdit

- XSLT 2.0 (Saxon)

- CVS

Page 18: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Why not RDF?

- Not right for redundant content and conditionals

- Easy enough to transform arbitrary structured XML into RDF with XSLT, as needed

- (Or into RSS1.0, RSS2.0, Atom, etc.)
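As a rough picture of that transformation, the sketch below turns one event into N-Triples lines. The talk does this with XSLT; Python is swapped in here only to keep the example short. The base URI is hypothetical, and mapping the date, text, and link targets to Dublin Core's date, description, and subject properties is an assumption, not the site's actual vocabulary.

```python
import xml.etree.ElementTree as ET

BASE = "http://example.org/"             # hypothetical base URI
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set

EVENT_XML = (
    '<event on="2004-03-12" id="24848">'
    '<link to="#CountryY">Country Y</link> announced that it had cut off '
    'relations with <link to="#CountryZ">country Z</link>.'
    '</event>'
)


def literal(value):
    """Quote a string as an N-Triples literal, escaping the basics."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return '"' + escaped + '"'


def event_to_ntriples(xml_text):
    """Return one N-Triples line per fact found in the event."""
    event = ET.fromstring(xml_text)
    subject = "<{}event/{}>".format(BASE, event.get("id"))
    body = "".join(event.itertext())
    lines = [
        "{} <{}date> {} .".format(subject, DC, literal(event.get("on"))),
        "{} <{}description> {} .".format(subject, DC, literal(body)),
    ]
    for link in event.iter("link"):
        target = "<{}{}>".format(BASE, link.get("to").lstrip("#"))
        lines.append("{} <{}subject> {} .".format(subject, DC, target))
    return lines


ntriples = event_to_ntriples(EVENT_XML)
print("\n".join(ntriples))
```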


Page 19: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

For free…

- From 300 individual pages…

- To 1100 pages of “remixed” content – all unique and relevant

- And Google-friendly

Page 20: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

And also for free…

- Semantically relevant in-site advertising, if we want it

- Topic-sorted, reusable content

- Permanent, readable URIs

Page 21: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Do people get it?

- Some do, and others just navigate the site as usual

- Harper’s was fine with the learning curve

- “Odd but useful” – Gawker

Page 22: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Results

- Uptick in traffic and subscription revenues

- Low cost of maintenance

- Ever-increasing database of facts and events – adding one Weekly Review adds value to 50 different pages

- Happy client

Page 23: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Why the SemWeb(ish) framework?

- Leaves plenty of room to grow

  - Web-only content

  - Full text of issues

  - Subscriber services

  - Etc.

- Take advantage of new SemWeb tools

  - Incorporate RDF sources into the taxonomy

  - Anticipate Semantic Web browsers

Page 24: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Next?

Page 25: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Make it pretty

- Redesign

- Hide some of the navigation

- Turn links on and off

Page 26: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Make it scale

- Currently maxes out at about 20-30 megs of content, due to limits of in-memory DOM representation (10-12x XML document size)

- Use a publicly available storage layer (Kowari, Jena, etc)

- Go triple-crazy

Page 27: Harpers: a Semantic Web(ish) site  for Harper’s Magazine

Make it easy to query and navigate

- “Show me everything related to George Bush and Iraq.”

or

- “Show me everything related to politicians and the Middle East.”

- New navigation

- ?
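The difference between the two example queries is that the second names categories rather than specific topics, so answering it means walking the taxonomy. A minimal sketch, assuming hypothetical partOf facts and a hypothetical table of which topics each event links to:

```python
# Hypothetical taxonomy and event index; none of these ids come from the site.
PART_OF = {
    "#GeorgeBush": "#Politicians",
    "#Iraq": "#MiddleEast",
}
MENTIONS = {
    "#event-1": {"#GeorgeBush", "#Iraq"},
    "#event-2": {"#Iraq"},
    "#event-3": {"#GeorgeBush"},
}


def falls_under(topic, category):
    """True if topic is the category itself or is partOf it, transitively."""
    seen = set()
    while topic is not None and topic not in seen:
        if topic == category:
            return True
        seen.add(topic)  # guard against an accidental cycle
        topic = PART_OF.get(topic)
    return False


def related_to(*categories):
    """Events that link to at least one topic under every given category."""
    return sorted(
        event
        for event, topics in MENTIONS.items()
        if all(any(falls_under(t, c) for t in topics) for c in categories)
    )


print(related_to("#GeorgeBush", "#Iraq"))
print(related_to("#Politicians", "#MiddleEast"))
print(related_to("#MiddleEast"))
```

Both phrasings reach the same event here because the specific topics sit under the broader categories; the category form would also pick up any other politician or Middle Eastern country added to the taxonomy later.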
