Post on 21-Dec-2015


Revising Riverbot

Outline and Specifications

Christian Skalka

The Riverbot System

• Riverbot is an information retrieval service for whitewater (ww) boaters

• Various webpages (USGS, USACE) report real-time gage data for thousands of rivers

• Only a small subset of online gages is of interest to ww boaters

Popularity of Riverbot

• Serving the online Mid-Atlantic ww boating community for over four years

• Approximately 30-40 hits per day

• Currently over 600 subscribers (stable after initial exponential growth)

Riverbot Components

1. Database (db) of gages for ww rivers, and mailing list accounts

2. Web robot for retrieving current gage data

3. Web interface for checking gage data, mailing list signup

4. Daily mailing list report

Database

• Implemented as formatted UNIX files

• Database search and editing via file I/O; locks preserve consistency

• Relatively small size of DB (~650 users, data for ~60 gages) ensures efficiency
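As a rough illustration of the flat-file approach above, the sketch below reads and updates a small gage file under advisory locks. It is written in Python for illustration only, not in the site's own implementation language, and the record layout (`gage_id|river|level`), the function names, and the sample values in any usage are assumptions rather than Riverbot's actual format.

```python
import fcntl

# Assumed record layout (hypothetical): one gage per line, "gage_id|river|level_ft".

def read_gages(path):
    """Return {gage_id: (river, level)} read under a shared lock."""
    gages = {}
    with open(path, "r") as f:
        fcntl.flock(f, fcntl.LOCK_SH)  # several readers may hold this at once
        try:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                gage_id, river, level = line.split("|")
                gages[gage_id] = (river, float(level))
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return gages

def update_level(path, gage_id, new_level):
    """Rewrite one gage's level in place under an exclusive lock."""
    with open(path, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks other readers and writers
        try:
            lines = []
            for line in f:
                fields = line.strip().split("|")
                if len(fields) == 3 and fields[0] == gage_id:
                    fields[2] = f"{new_level:.2f}"
                lines.append("|".join(fields))
            f.seek(0)
            f.write("\n".join(lines) + "\n")
            f.truncate()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Rewriting the whole file on every update is only reasonable because the file is small, which matches the size argument made on this slide.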

Web Robot

• Written in Perl, using the LWP library for www access

• Retrieves web pages listed in db

• Uses Perl pattern matching to parse webpages, extract data for db update

• Activated as a cron job, several times daily
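The pattern-matching step can be sketched as follows. This is a Python illustration rather than the robot's Perl, and the page fragment and the regular expression are invented for the example; real USGS and USACE markup will differ, and fetching the page over the network is left out.

```python
import re

# Hypothetical page fragment; real agency pages use different markup.
SAMPLE_PAGE = """
<tr><td>Gage height, feet</td><td>3.42</td></tr>
<tr><td>Discharge, cfs</td><td>1250</td></tr>
"""

# Capture the number in the table cell that follows the "Gage height" label.
GAGE_HEIGHT = re.compile(
    r"Gage height, feet</td>\s*<td>\s*([0-9]+(?:\.[0-9]+)?)"
)

def extract_gage_height(html):
    """Return the gage height as a float, or None if the pattern is absent."""
    match = GAGE_HEIGHT.search(html)
    return float(match.group(1)) if match else None
```

Returning `None` when the pattern does not match lets the caller skip the db update instead of storing a bogus reading when a page layout changes.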

Web Robot

• Robot is essentially a series of HTTP GET requests, followed by parsing of the results

• Respects the robot exclusion standard, which mediates robot traffic on websites:

- http://www.robotstxt.org/wc/norobots.html

- Allows any site to specify where robots can/cannot roam

- USGS, USACE allow robots
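A robot can check the exclusion rules before each request. The sketch below uses Python's standard `urllib.robotparser` on an in-memory rules file; the rules, the user-agent string, and the `example.org` URLs are made up for illustration and say nothing about what USGS or USACE actually publish.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, used instead of a network fetch.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def allowed(robots_txt, user_agent, url):
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

In a live robot the rules would be fetched from the site's own `/robots.txt` before any other request is made to that host.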

Mailing List Report

• Riverbot gage report mailed to subscribers daily at 8:00AM

• Written in Perl, scheduled as a cron job

• Automatically composes report based on particular subscriber’s choices
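Composing a per-subscriber report amounts to selecting that subscriber's chosen gages from the current data. A minimal Python sketch follows; the dictionary shapes, field names, and report wording are assumptions made for the example, and actually sending the mail is omitted.

```python
def compose_report(subscriber, levels):
    """Build the report text for one subscriber.

    subscriber: {"email": str, "gages": [gage_id, ...]}   (assumed shape)
    levels:     {gage_id: (river_name, level_in_feet)}    (assumed shape)
    """
    lines = [f"Riverbot gage report for {subscriber['email']}", ""]
    for gage_id in subscriber["gages"]:
        if gage_id in levels:
            river, level = levels[gage_id]
            lines.append(f"{river}: {level:.2f} ft")
        else:
            # The robot may have failed to retrieve this gage today.
            lines.append(f"{gage_id}: no current reading")
    return "\n".join(lines)
```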

Web Interface

• Riverbot website (http://www.riverbot.com) provides a simple interface:

- View current gage levels

- Sign up for mailing list

• Written in HTML, uses Perl CGI to interact with db

Riverbot Homepage

Gage Selection Page

Mailing List Signup Page

Current Site Specification

Time for the next step

• Extend coverage to entire US, revise web interface

• Allow users more fine-grained, secure control over accounts

• Implement database in SQL to accommodate expanded coverage, user base

• Enhance administration toolkit

Time for the next step

• New site currently being implemented by David Van Horn as independent study

• Work-in-progress

• Available by mid-summer(?)

New Site Specification

New user account features

• Users are distinct entities, and may have several distinct accounts

• Web interface allows user and account creation, editing, and deletion

• User passwords authenticate changes to user and account profiles
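One way to picture the user/account split in an SQL database is sketched below, using Python's built-in `sqlite3`. The table names, columns, and the choice of salted PBKDF2 password hashing are all assumptions made for the example; the slides do not specify a schema or a hashing scheme.

```python
import hashlib
import hmac
import os
import sqlite3

# Hypothetical schema: one user row, many account rows per user.
SCHEMA = """
CREATE TABLE users (
    id      INTEGER PRIMARY KEY,
    email   TEXT UNIQUE NOT NULL,
    salt    BLOB NOT NULL,
    pw_hash BLOB NOT NULL
);
CREATE TABLE accounts (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    label   TEXT NOT NULL
);
"""

def _hash(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

def create_user(db, email, password):
    salt = os.urandom(16)
    cur = db.execute(
        "INSERT INTO users (email, salt, pw_hash) VALUES (?, ?, ?)",
        (email, salt, _hash(password, salt)),
    )
    return cur.lastrowid

def authenticate(db, email, password):
    """Return True only if the user exists and the password matches."""
    row = db.execute(
        "SELECT salt, pw_hash FROM users WHERE email = ?", (email,)
    ).fetchone()
    return row is not None and hmac.compare_digest(row[1], _hash(password, row[0]))

def add_account(db, email, password, label):
    """Add an account for a user; the password authenticates the change."""
    if not authenticate(db, email, password):
        raise PermissionError("bad credentials")
    user_id = db.execute(
        "SELECT id FROM users WHERE email = ?", (email,)
    ).fetchone()[0]
    db.execute(
        "INSERT INTO accounts (user_id, label) VALUES (?, ?)", (user_id, label)
    )
```

Keeping accounts in their own table is what lets one user hold several of them, each of which could carry its own gage selection.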

Accounts Management Interface

Administration Interface

• Common tasks: deleting accounts, adding and editing gage information

• Administration currently hacked as a collection of shell scripts and file editing

• New implementation and good programming practice call for better tools

• Web interface simple and effective

Site Administration Interface

Implementation Details

• New backend written in PLT Scheme:

- Modern dialect of Lisp, developed at Brown/NEU/Rice

- Functional, safe language

• Site running on PLT Scheme Webserver:

- Modern webserver under development at NEU

Why PLT Scheme: Safety

• PLT Scheme is a safe language:

- Predictable behavior

- Buffer overflow attacks prevented

• Web sites written in PLT Scheme more secure than:

- Websites written in e.g. C

- Websites running on e.g. Apache

Why PLT Scheme: State

• Subsequent webpages naturally viewed as I/O during phases of single program

• Fact: HTTP is stateless; the protocol itself maintains no state between requests

• With CGI scripting:

- State is a hack; maintained in URLs, dynamically generated form actions

- Various phases of computation must be defined as separate scripts
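To make the CGI-style workaround concrete, the sketch below threads state through a URL's query string, so that the script handling the next phase can recover it. It is a Python illustration; the script path `/cgi-bin/signup.pl`, the `step` parameter, and the function names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def next_step_url(base, step, state):
    """Encode the state gathered so far into the link for the next phase."""
    params = dict(state, step=step)
    return f"{base}?{urlencode(params)}"

def recover_state(url):
    """What the next script must do first: rebuild the state from the URL."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}
```

Every phase has to repeat this encode and decode step by hand, and every value comes back as a string, which is part of why the slide calls the approach a hack.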

Why PLT Scheme: State

• PLT Scheme webservers use continuations:

- Phases of computation represented in single program

- During particular phase, continuation represents phases still to be computed

- Webserver maintains db of continuations, accessed by generated url in form action

• A principled approach to modelling state within the confines of http requests
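The idea of a server-side table of suspended computations can be approximated in Python with generators. This is only an analogy and not the PLT Scheme mechanism: a generator can be resumed once, whereas a true continuation could be re-entered, for example after the user presses the back button. The class, method names, and the `signup` program are invented for the example.

```python
import secrets

class ContinuationStore:
    """Keeps suspended programs, keyed by a generated token (as in a form action URL)."""

    def __init__(self):
        self._pending = {}

    def start(self, program):
        gen = program()
        page = next(gen)  # run until the first page is produced
        return self._suspend(gen, page)

    def resume(self, token, form_data):
        gen = self._pending.pop(token)  # KeyError for unknown or used tokens
        try:
            page = gen.send(form_data)  # continue where the program left off
        except StopIteration as done:
            return None, done.value  # program finished: no further token
        return self._suspend(gen, page)

    def _suspend(self, gen, page):
        token = secrets.token_hex(8)
        self._pending[token] = gen
        return token, page

def signup():
    """All phases of the interaction written as one straight-line program."""
    email = yield "Enter your email"
    gage = yield f"Pick a gage for {email}"
    return f"Subscribed {email} to {gage}"
```

Each `yield` plays the role of sending a page and waiting for the next request, so local variables such as `email` survive between requests without being encoded into URLs.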

Another Issue

• Problem: incorrectly entered and dead email addresses mean bouncebacks

• Most significant administrative task is deleting bounceback accounts

• New user interface may help, but what happens when site goes national?

Automated Administration

• Solution: use mail preprocessing to filter bouncebacks from incoming messages

• Route bouncebacks to logs

• Automatically delete accounts that bounce repeatedly

• Many available mail filters, e.g. procmail
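The bounce-handling policy above could look roughly like the Python sketch below, which classifies raw messages and counts failures per address. The detection heuristic (a `mailer-daemon` sender or a `multipart/report` content type), the reliance on an `X-Failed-Recipients` header (which only some mail servers add), and the threshold of three are all assumptions; a real deployment would more likely do the classification in a mail filter such as procmail.

```python
from collections import Counter
from email import message_from_string

def is_bounce(raw_message):
    """Heuristic (assumed): bounces come from a mailer daemon or are delivery reports."""
    msg = message_from_string(raw_message)
    sender = msg.get("From", "").lower()
    return "mailer-daemon" in sender or msg.get_content_type() == "multipart/report"

def accounts_to_delete(raw_messages, threshold=3):
    """Return addresses that bounced at least `threshold` times."""
    counts = Counter()
    for raw in raw_messages:
        if not is_bounce(raw):
            continue
        # Assumes the failed address is exposed in this header.
        failed = message_from_string(raw).get("X-Failed-Recipients")
        if failed:
            counts[failed.strip().lower()] += 1
    return {address for address, n in counts.items() if n >= threshold}
```

Requiring several bounces before deletion avoids dropping a subscriber over a single transient delivery failure.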

Miscellaneous Upgrades

• New site graphics, riverbot logo

• Bumperstickers

• Showcase whitewater photography?

Conclusion

• Riverbot is a popular and useful website serving the whitewater community

• Time to go national (international?)

• Significant implementation, integration of advanced languages and software systems
