Thoughts about Computer Science Research in Information-rich Applications Areas, William Y. Arms
1
Thoughts about
Computer Science Research in Information-rich Applications Areas
William Y. Arms
Cornell University
March 14, 2000
2
Changes in Computer Science
Over 25 years, computer science has broadened
From: a narrow range of academic topics
To include:
• systems
• human-computer interaction
• economic, legal, and social aspects
3
Computer Science Today
• Past achievements in computer science are a powerful force in national prosperity.
• Universities have excellent students who have tremendous opportunities.
• An extensive body of theoretical and practical knowledge has accumulated.
• Exciting research can be found in every direction.
5
Computing and Information Science (Cornell)
Interdisciplinary partnerships:
• Computational biology, genomics, protein folding, etc.
• Computational science
• Computer graphics, architecture, design, film-making
• Digital libraries, information management
• Computational finance, economics
Computer science can contribute to each of these fields.
Each field can stimulate new research in computer science.
6
The University as a Test Bed
University tradition of innovation in computing:
• Time sharing (MIT, Dartmouth)
• Networks and distributed computing (Carnegie Mellon, MIT)
• Online information (Illinois, etc.)
• Wireless and nomadic computing (???)
Advantages:
• Tight feedback loop between researcher and user
• Innovation valued for its own sake
• Access to resources (equipment, people, money)
8
Example: Digital Libraries
In 1990, there were many experiments in building digital libraries:
• CORE (Bellcore, Cornell, OCLC) Lesk, et al.
• Gopher (Minnesota) Gopher team
• Mercury (Carnegie Mellon) Arms, et al.
• WAIS (Thinking Machines) Kahle, et al.
• World Wide Web (CERN) Berners-Lee, et al.
• Z39.50 (Major libraries) Lynch, et al.
The leaders of all projects were either computer scientists or had spent most of their working life in state-of-the-art computing.
9
Foundations of the Web
Technology        Ancestors
Internet          ARPAnet/NSFnet, X.25, ISO
URL               Domain Name System
HTML              SGML, TeX, PostScript
HTTP              TCP / FTP / Gopher, Z39.50, SQL
MIME              Email, ODA
Security          None, SNA, Kerberos
Business model    None, pay-by-use, subscription
10
Example: Web Search Engines
Lycos (Mauldin, Carnegie Mellon)
Technical basis:
• Research in text-skimming (Ph.D. thesis)
• Pursuit free text retrieval engine (TREC)
• Robot exclusion research (private interest)
Organizational basis:
• Center for Machine Translation
• Grant flexibility (DARPA)
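The robot exclusion item can be illustrated with a short sketch. It shows how a crawler might check the robots.txt convention before fetching a page, using Python's standard `urllib.robotparser`. It is not a reconstruction of Lycos: the robots.txt content, the `examplebot` agent name, and the example.org URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an illustrative site (an assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, agent: str, url: str) -> bool:
    """Return True if the given user agent may fetch the URL."""
    return parser.can_fetch(agent, url)


parser = build_parser(ROBOTS_TXT)
for url in ("http://example.org/index.html",
            "http://example.org/private/data.html"):
    print(url, allowed(parser, "examplebot", url))
```

A well-behaved crawler would run this check for every URL it discovers and skip the ones that are disallowed.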
11
Example: Web Search Engines
Google (Page and Brin, Stanford)
Technical basis:
• Research in ranking hyperlinks (Ph.D. thesis)
Organizational basis:
• Grant flexibility (NSF Digital Libraries Initiative)
• Equipment grant (Hewlett Packard)
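The slide says only "ranking hyperlinks". As a rough illustration of the idea, here is a simplified PageRank-style power iteration. The damping factor of 0.85, the iteration count, the handling of pages without outgoing links, and the four-page graph are assumptions made for this sketch, not details from the talk.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no links spreads its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by every other page.
GRAPH = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(GRAPH)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

A page ranks highly when many pages, or highly ranked pages, link to it. In this graph "c" comes out on top and "d", which nothing links to, comes last.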
12
The Internet Graph
Theoretical research in graph theory
• Six degrees of separation
• Pareto distributions
Algorithms
• Hubs and authorities (Kleinberg, Cornell)
Empirical data
• Commercial (Yahoo!, Google, Alexa, AltaVista, Lycos)
• Not-for-profit (Internet Archive)
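The hubs-and-authorities idea can be sketched as a small iterative computation. A good hub links to good authorities, and a good authority is linked to by good hubs. The version below is a minimal illustration on a hypothetical graph. The page names, the iteration count, and the Euclidean normalization are assumptions for the example.

```python
import math


def hits(links, iterations=50):
    """Minimal hubs-and-authorities iteration.

    links maps each page to the list of pages it links to.
    Returns (hub_scores, authority_scores).
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        # Hub score: sum of authority scores of the pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize both vectors so the scores stay bounded.
        a_norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth


# Hypothetical graph: three portal pages linking to two content sites.
GRAPH = {
    "portal1": ["site_a", "site_b"],
    "portal2": ["site_a", "site_b"],
    "portal3": ["site_a"],
}
hub, auth = hits(GRAPH)
print("best authority:", max(auth, key=auth.get))
print("hub scores:", {p: round(v, 3) for p, v in hub.items()})
```

Here "site_a" is the strongest authority because all three portals point to it. The portals that link to both sites score higher as hubs than the one that links to only one.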
13
The Limits of the Web
• The web has grown upon existing computer science knowledge.
• The strengths of that knowledge have enabled enormous growth.
• The limits of that knowledge have constrained the growth.
Al Demers
14
The Web: Limits to Growth -- Databases
Transaction processing databases: e.g., Amazon.com
The biggest online systems ever built, with many computers around the world.
Desirable features:
• No interruptions
• No transactions ever lost
• Secure from all intruders
In practice some transactions are lost; data is sometimes inconsistent. This is acceptable for selling books, but what about banking?
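To make the banking concern concrete, the sketch below shows an all-or-nothing transfer on a single in-memory SQLite database. It illustrates the desired property only. It does not address the distributed, worldwide systems the slide describes, where the same guarantee is much harder to achieve. The table layout, account names, and amounts are hypothetical.

```python
import sqlite3


def make_bank() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money atomically: both updates happen, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit is undone too.
        return False


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = make_bank()
print(transfer(conn, "alice", "bob", 30), balances(conn))
print(transfer(conn, "alice", "bob", 500), balances(conn))
```

The second transfer fails after the credit to "bob" has been applied, and the rollback restores the earlier balances, so no money is created or lost.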
15
The Web: Limits to Growth -- Security
Why is security on the Internet so difficult?
1. Public key encryption was invented in the mid-1970s, yet widespread deployment remains elusive.
2. System security is riddled with loopholes
• operating system security developed when operating systems were simple monitors
• now operating systems are very complex and hence vulnerable
• language-based security seeks simpler interfaces to which security can be attached
Fred Schneider
16
The Web: Limits to Growth -- Security
The Internet is based on stateless protocols:
• routing
• HTTP
Stateless protocols have allowed flexible growth, but inhibit certain controls, for example against:
• junk email
• denial-of-service attacks
Can we quantify the trade-off?
18
Priorities: Andrew File System
Carnegie Mellon: Campus file system (1985), Coda research
Industry: IBM (1989), Microsoft (2000)
19
Two Fears
Two fears for digital libraries:
• Librarians will ignore the expertise of computer science.
• Computer scientists will ignore the insights of librarians.
Two fears for X:
• Specialists in X will ignore the expertise of computer science.
• Computer scientists will ignore the insights of specialists in X.
20
Thoughts for the NSF
• Applications and computer science need to be side by side.
• Big projects appear to be more productive than small ones.
• Interdisciplinary collaboration cannot be forced.