CAIDA’s XML-based Web Publishing Environment

Web Technology Forum at San Diego Supercomputer Center, Tuesday, 25 July 2000

Margaret Murray
Cooperative Association for Internet Data Analysis (CAIDA)

Presentation Outline

Motivation for Using XML
XML Strategy: Technologies, Tools, Timing
Website Features
Planned Enhancements

Motivation: Legacy Website

Strengths
• Large readership
• Developers control their own web content
• Flexible organization

Weaknesses
• Difficulty with navigation
• No recognizable CAIDA ‘look’
• Consistency and Quality Control Issues
• Repetitive, bloated web coding

Site Redesign Goals

Clean up web tree; implement consistently named directory structure
Let developers control content but not necessarily its placement
Enforce version control
Reduce site maintenance

CAIDA History with Apache

Experience:
• Apache web server in use since ’97 (on Solaris 2.5)
• Apache Jserv Java servlet engine
• CAIDA Java servlets available on website:
  • Otter
  • GTrace

New Apache Tools:
• Cocoon Web Publishing Framework
  • 100% Java
  • Open source
  • W3C standards based (XML, DOM, XSL)
  • Positive reviews
  • http://xml.apache.org/cocoon/

Strategy: The Cocoon Approach

Separate content, processing, and style
Content = XML files
• Parsed by Cocoon’s Xerces XML parser
Processing = Cocoon processors
• Built-in processors (e.g., XSP, XSLT, SQL)
Style = XSL
• Implemented by Xalan stylesheet processor
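
As a rough illustration of the separation described on this slide, the sketch below keeps content in an XML string, parses it in one step, and applies presentation in another. It is not Cocoon (which is Java and uses Xerces and Xalan); the Python function `apply_style` merely stands in for an XSL stylesheet, and the element names `page`, `title`, and `para` are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Content: plain XML with no presentation markup (element names are hypothetical).
CONTENT = """<page>
  <title>Tools</title>
  <para>Otter and GTrace are available as servlets.</para>
</page>"""


def parse_content(xml_text):
    """Processing step: parse the XML into a dict holding the title and paragraphs."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title", default=""),
        "paras": [p.text or "" for p in root.findall("para")],
    }


def apply_style(page):
    """Style step: stands in for an XSL stylesheet by emitting HTML."""
    title = escape(page["title"])
    body = "".join("<p>%s</p>" % escape(text) for text in page["paras"])
    return (
        "<html><head><title>%s</title></head>"
        "<body><h1>%s</h1>%s</body></html>" % (title, title, body)
    )


html_page = apply_style(parse_content(CONTENT))
print(html_page)
```

Because the content never mentions HTML, changing the site look in this sketch would mean editing only `apply_style`, which is the property the slide attributes to keeping style in XSL.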

Key Transition Milestones

Organize new directory structure
Define a ‘concordance’
Write a redirect cgi script
Make it easy to translate existing HTML into XML
Develop a CAIDA-specific DTD
Incorporate a new navigation paradigm
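
One way the ‘concordance’ and redirect script could fit together is sketched below: a lookup table maps each legacy path to its new location, and the script answers with a CGI redirect. The slides do not show the actual script, so the table entries, the function name `redirect_response`, and the choice of the sitemap as a fallback are all assumptions made for illustration.

```python
# Hypothetical concordance: legacy path -> path in the reorganized tree.
CONCORDANCE = {
    "/Tools/otter.html": "/tools/visualization/otter/",
    "/Presentations/index.html": "/outreach/presentations/",
}


def redirect_response(old_path, concordance=CONCORDANCE, fallback="/home/sitemap/"):
    """Return CGI-style response headers that redirect old_path to its new home.

    Paths missing from the concordance are sent to the fallback page
    (an assumption; the real script's behavior is not described).
    """
    target = concordance.get(old_path, fallback)
    return "Status: 301 Moved Permanently\r\nLocation: %s\r\n\r\n" % target


print(redirect_response("/Tools/otter.html"))
```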

Transition Schedule

Organize directory structure and design navigation paradigm: 1 mo
Install and test Cocoon components: 1 mo
Develop CAIDA DTD, site-specific XSL, and content template: 1 mo
Convert existing HTML to XML: 2 mos
Train web page developers (cvs, tidy, xml): as required

Problems with Initial Deployment

Expected
• Change is hard
• Non-programmers had big ramp-up to learn cvs
• Some missing pages
• Auto-redirect script not entirely complete

Unexpected
• Discovered bad DNS mapping for main URL
• Load-related errors
• cvs server memory limitations restricted scope of checkouts, updates

Solution:

XML Development site = mirror.caida.org
• Dynamic XML processing
• http://mirror.caida.org/~username/
• Developer checks in tested page
• Webmaster updates main web tree

HTML Public site = www.caida.org
• wget generates an HTML file for each XML page
• All non-XML files are shared via symbolic links
• No more load-related errors
• Fast!
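
The split between the dynamic mirror and the static public site can be sketched as a planning step: walk the source tree, list each XML page to fetch from the mirror and save as HTML, and list every other file to share by symbolic link. The sketch below only builds the two lists and performs no network access. The function name `plan_publish`, the sample tree, and the `.xml` to `.html` output naming are assumptions; the slides say only that wget produces one HTML file per XML page.

```python
import os
import tempfile

MIRROR_BASE = "http://mirror.caida.org"  # dynamic XML site named on the slide


def plan_publish(src_root, dest_root):
    """Walk src_root and return (fetch, link) action lists without doing any I/O on dest."""
    fetch, link = [], []
    for dirpath, _dirs, files in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        rel_dir = "" if rel_dir == "." else rel_dir
        for name in sorted(files):
            rel = os.path.join(rel_dir, name).replace(os.sep, "/")
            if name.endswith(".xml"):
                # XML page: fetch the rendered page from the mirror, save as static HTML.
                url = "%s/%s" % (MIRROR_BASE, rel)
                fetch.append((url, os.path.join(dest_root, rel[:-4] + ".html")))
            else:
                # Non-XML file (image, data): share it through a symbolic link.
                link.append((os.path.join(dirpath, name), os.path.join(dest_root, rel)))
    return fetch, link


# Build a small hypothetical source tree and plan its publication.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    os.makedirs(os.path.join(src, "tools"))
    for rel_path in ("index.xml", "tools/index.xml", "tools/logo.gif"):
        open(os.path.join(src, rel_path), "w").close()
    fetch_actions, link_actions = plan_publish(src, os.path.join(tmp, "www"))
    fetch_urls = sorted(url for url, _out in fetch_actions)
    linked_names = [os.path.basename(source) for source, _dest in link_actions]

print(fetch_urls)
print(linked_names)
```

Serving the fetched HTML statically means the public server never runs the XML pipeline per request, which fits the slide's "No more load-related errors".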

Cocoon Maintenance

Common Errors
• Syntax; missing end tag; unmatched quotes
• Cached copy displayed

Weird Errors
• “Method longer than 65535 bytes”
• Broken image maps   --> ‘?’

Daily cron job (scripts by Dan Andersen)
• Makefile driven xml-wget (GNU)
• Rotate-dirs:
  • www --> www.old, www.pre --> www
• Symlinks for all non-XML files

Emergency script
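
The Rotate-dirs step (www --> www.old, www.pre --> www) might be sketched as below. This is not Dan Andersen's script, which the slides do not show; the function name `rotate_dirs` and the decision to delete an existing www.old first are assumptions.

```python
import os
import shutil
import tempfile


def rotate_dirs(base):
    """Promote www.pre to www, keeping the previous www as www.old."""
    www = os.path.join(base, "www")
    old = os.path.join(base, "www.old")
    pre = os.path.join(base, "www.pre")
    if not os.path.isdir(pre):
        raise FileNotFoundError("no staged tree at %s" % pre)
    if os.path.isdir(old):
        shutil.rmtree(old)      # discard the copy left by the previous rotation
    if os.path.isdir(www):
        os.rename(www, old)     # www --> www.old
    os.rename(pre, www)         # www.pre --> www


# Demonstrate on a throwaway directory with marker files.
with tempfile.TemporaryDirectory() as base:
    for name, marker in (("www", "v1.txt"), ("www.pre", "v2.txt")):
        os.makedirs(os.path.join(base, name))
        open(os.path.join(base, name, marker), "w").close()
    rotate_dirs(base)
    after = {d: os.listdir(os.path.join(base, d)) for d in sorted(os.listdir(base))}

print(after)
```

Keeping www.old around gives a tree to fall back on if the freshly generated pages turn out to be broken, though the slides do not say whether the emergency script works this way.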

Directory Structure

Emphasize easy-to-type, easy-to-say URLs
All lower case
Meaningful, recognizable directory names
Use default index.xml; drop filename from URL
Match to navigation paradigm
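
Two of these conventions (all lower case, and dropping the default index.xml from the URL) can be expressed as a small path-normalizing helper. The function name `public_url` and the sample paths are hypothetical.

```python
def public_url(path, default_name="index.xml"):
    """Lower-case a site path and drop the default filename, leaving a directory URL."""
    path = path.lower()
    if path.endswith("/" + default_name):
        path = path[: -len(default_name)]
    return path


print(public_url("/Tools/Measurement/index.xml"))
```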

Site DTD and Tools

Some CAIDA tags
• <navbar />
• <footinfo (lastupdated, maintainedby?, URL) >

Page Layout Design (by Oliver Jakubiec)
• 3-level navbar
• Site-map
• Color scheme

Tools
• tidy
  • html config file
  • xml config file
• checkbot
• find
• sed scripts
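
The footinfo content model above reads as: lastupdated, then an optional maintainedby, then URL. The sketch below checks a fragment against that ordering. It is a hand-written check, not DTD validation (Python's ElementTree does not validate), and the element values in the usage lines are made up.

```python
import xml.etree.ElementTree as ET


def footinfo_is_valid(xml_text):
    """Check child order against the content model (lastupdated, maintainedby?, URL)."""
    root = ET.fromstring(xml_text)
    if root.tag != "footinfo":
        return False
    tags = [child.tag for child in root]
    return tags in (
        ["lastupdated", "URL"],
        ["lastupdated", "maintainedby", "URL"],
    )


# Hypothetical fragments; only the element names come from the slide.
full = ("<footinfo><lastupdated>2000-07-25</lastupdated>"
        "<maintainedby>webmaster</maintainedby>"
        "<URL>http://www.caida.org/</URL></footinfo>")
minimal = ("<footinfo><lastupdated>2000-07-25</lastupdated>"
           "<URL>http://www.caida.org/</URL></footinfo>")
missing_url = "<footinfo><lastupdated>2000-07-25</lastupdated></footinfo>"

print(footinfo_is_valid(full), footinfo_is_valid(minimal), footinfo_is_valid(missing_url))
```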

Cocoon Install

Solaris 7
jdk/jre 1.2.2
jserv 1.1 API
Apache httpd 1.3.6
• jserv, modperl, modssl modules
Webalizer 2.00
cvs 1.9
Cocoon 1.7.5-dev
Xerces-J 1.0.4 XML Parser
Xalan-J 1.1
httpd.conf, jserv.conf
cocoon.properties

Documentation and Training

“How to Update CAIDA Web Pages”
Two target audiences (programmers, non-programmers)
One-page Overview
One-on-one training sessions
Problems
• arcane cvs syntax
• Testing procedures

Website Features (navbar)

In .xml file: <navbar />

location.txt contains <location section="HOME" subsection="overview" />

In root.xsl:
• Selects menus (3 layers)
• Sets up clickable image maps
• Sets color scheme
• Adds file ‘locater’ text to top of page
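
To illustrate how the `<location>` element could drive the ‘locater’ text, the sketch below reads its two attributes and joins them. On the real site this happens in root.xsl; the Python function `locator_text` and the " > " separator are assumptions, since the slides do not show the rendered text.

```python
import xml.etree.ElementTree as ET


def locator_text(location_xml, separator=" > "):
    """Build locator text from a <location section=... subsection=... /> element."""
    loc = ET.fromstring(location_xml)
    parts = (loc.get("section", ""), loc.get("subsection", ""))
    return separator.join(part for part in parts if part)


print(locator_text('<location section="HOME" subsection="overview" />'))
```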

Website Features (auto-gen URL)

In .xml file:
<url> http://
<xsp:expr>request.getServerName() </xsp:expr>
<xsp:expr>request.getRequestURI() </xsp:expr>
</url>

XSP gets server name (www.caida.org or mirror.caida.org)
XSP gets page name
XSL adds formatting; places at bottom of page
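
The effect of the two XSP expressions is that the same source page reports a different URL depending on which host serves it. A minimal Python analogue, with a hypothetical function name `page_url` standing in for the servlet request calls:

```python
def page_url(server_name, request_uri):
    """Mirror the XSP snippet: 'http://' + server name + request URI."""
    return "http://" + server_name + request_uri


# The same page served from the development mirror and from the public site.
print(page_url("mirror.caida.org", "/home/sitemap/"))
print(page_url("www.caida.org", "/home/sitemap/"))
```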

Website Features (tocs, sitemap)

Table of Contents
• List style
• Pulldown style
• Table style

http://www.caida.org/home/sitemap/
• All pages at top three levels have access to sitemap
• Want to automatically generate
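
Automatic sitemap generation is listed only as a wish, so the following is one possible approach rather than anything CAIDA built: derive an indented outline from the directory URLs, limited to a chosen depth. The function name `build_sitemap`, the sample paths, and the default depth of three are assumptions.

```python
def build_sitemap(paths, max_depth=3):
    """Return indented sitemap lines for directory URLs no deeper than max_depth."""
    lines = []
    for path in sorted(set(paths)):
        parts = [part for part in path.split("/") if part]
        if 0 < len(parts) <= max_depth:
            # Indent two spaces per level below the top.
            lines.append("  " * (len(parts) - 1) + parts[-1])
    return lines


# Hypothetical directory URLs following the lower-case naming convention.
SAMPLE_PATHS = [
    "/home/",
    "/home/sitemap/",
    "/tools/",
    "/tools/measurement/",
    "/tools/measurement/skitter/",
    "/tools/measurement/skitter/docs/",
]

sitemap_lines = build_sitemap(SAMPLE_PATHS)
print("\n".join(sitemap_lines))
```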

Planned Enhancements

Searchable databases
• Tools taxonomy
• IEC courses
• Internet Atlas
Context-driven contact email
cvs “crutches” or substitute

Summary

Apache XML technology holds great promise

Apache Cocoon provides an extensible website infrastructure

CAIDA is just beginning to take advantage of Cocoon’s capabilities

For more information on XML-Apache and Cocoon:

www.apache.org
• Apache Server
• XML-Apache
xml.apache.org/cocoon/
http://mailman.real-time.com/pipermail/cocoon-users/
http://mailman.real-time.com/pipermail/cocoon-devel/

Margaret Murray, CAIDA Technical Mgr.
