Evolving Digital Libraries to Support
Geographically Distributed Scientific Research
Rick Luce
Research Library Director
Library Without Walls Project Leader
Los Alamos National Laboratory
Symposium on Knowledge Environments for Science
NSF, October 22, 2002
Standards & Interoperable Frameworks
Enabling Technologies & Infrastructure
Content:
• Access
• Retrieval
Financial Models
• Funding
• Content licensing
User Behavior
• User needs
• Collaboration
• Scholarly communication changes
• Adoption curves
Some Puzzle Pieces for Digital Libraries
Delivery of Content & Services
• Libraries replicating one another
• Requires integrated framework
• Lack of interoperability
• Tough work
Trend
• Publisher pricing flip for e-content
• Old model of libraries facing decline or aggregation
DL Models: Delivery
Content Capture
Ingest repositories
• Easy entry in network environment
• Digitization of old stuff
• E-collections distributed but archiving is unknown
• Largely publisher controlled today
New players emerging
Low barrier entry
Trend
DL Models: Capture
eprint systems
Eprint Systems:
• xxx or arXiv e-print archive - (Physics: 1991 - Ginsparg, LANL)
• RePEc - (Economics - Surrey U - Krichel)
• NCSTRL - (Computer Science - Cornell U - Lagoze)
• NDLTD - (Theses - Virginia Tech - Fox)
• CogPrints - (Cognitive Sciences - Southampton - Harnad)
Harvesters:
• ARC & ARCHON - Computer Science Dep’t, ODU
• SCIRUS - Elsevier
• even at the individual level … Kepler - ODU
Capture Systems
Content Capture
NSF-NSDL
DLESE
Share usage logs between nodes
Share citations & digital archives
New collaboration opportunities
Normalization
Authentication – Shibboleth
DRM
Delivery of Content & Services
OAI protocols
Standards
Digital Library Hybrid
Stanford Univ; Pacific Northwest Nat’l Lab; Edwards AFB; Univ Nevada; Idaho Nat’l Eng. & Enviro Lab
4 New Mexico Universities; Sandia National Labs; Air Force Research Lab; Nat’l Renewable Energy Lab; Santa Fe Institute
Albany Research Cntr.; Brooks AFB; Brookhaven Nat’l Lab; Eglin AFB; Enviro Measurem’t Lab; DOE HQ Energy library; Fed. Technology Center; Griffiss AFB; Oak Ridge Nat’l Lab; Savannah River Co.; Tyndall AFB; Hanscom AFB; Wright Patterson AFB; Montana State Univ
29 Institutional Customers in the U.S.
[Figure: two pie charts comparing Open Access with Copyright restrictions, labeled 3% / 97% and 6% / 94%; chart captions: ~8M full text articles, ~60M metadata records]
Large fraction of scholarly content has significant access restrictions & cost barriers
Challenges
FALLOUT: WITH PUBSCIENCE GONE, SIIA SEEKS OTHER CLOSURES -- With PubSCIENCE now history, the trade association that lobbied for its dismantling is reportedly set to focus its energies on other freely accessible government information resources. According to FEDERAL COMPUTER WEEK, Software and Information Industry Association (SIIA) public policy director David Le Duc said the group was "looking into a couple of other databases and agencies," in particular one "law-related" and one that "has to do with agriculture." After more than a year of intense lobbying by the SIIA, a major trade association for the software and digital content industry, the federal government discontinued PubSCIENCE in early November … They argue, that it is unfair for taxpayer dollars to fund databases that compete with commercial products.
Library Journal Academic News Wire: November 19, 2002
Repository Models
• Distributed – MIT: individual faculty upload and manage their own scholarly output.
• Semi-distributed – UC eScholarship: management responsibility is assigned to organizational units (research units, departments), which then assist faculty with uploading their papers.
• Semi-centralized – Caltech: repository sites are set up for any university unit, but the library uploads the papers on the faculty's behalf. Its digital collections range from computer science technical reports to theses and dissertations.
Institutional Repositories: Roy Tennant, 9/15/02
So far: harvesting of descriptive metadata ... but coming, harvesting of:
• references
• usage logs
• certification metadata
• metadata rights
• citation mapping
• co-citation
• visualization
• personalization
OAI’s Role
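The harvesting of descriptive metadata mentioned on this slide can be sketched roughly as follows. This is a minimal illustration, not a description of any system named in the talk: the base URL `http://example.org/oai`, the record identifier, and the sample response are made up, and the helper names `list_records_url` and `parse_records` are hypothetical. Only the `ListRecords` verb, the `oai_dc` metadata prefix, and the XML namespaces come from the OAI-PMH 2.0 specification. The sample response is parsed from a string, so nothing is fetched over the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a repository (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract identifier, title and creators from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
            "creators": [c.text for c in rec.findall(".//dc:creator", NS)],
        })
    return records


# Made-up response body, shaped like an oai_dc ListRecords reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:0001</identifier>
        <datestamp>2002-10-22</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A made-up sample title</dc:title>
          <dc:creator>Doe, J.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    for record in parse_records(SAMPLE_RESPONSE):
        print(record)
```

A real harvester would also follow `resumptionToken` elements to page through large result sets; that step is omitted here to keep the sketch short.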
OpenURL
Information resources allow open linking by including a hook along with each metadata description... which presents itself as an actionable OpenURL
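As a rough illustration of such a hook, the sketch below turns a metadata description into a link aimed at a link resolver. The resolver address `http://resolver.example.org/menu`, the citation values, and the function name `make_openurl` are all made up for this example; the key names (`genre`, `issn`, `volume`, `issue`, `spage`, `date`, `aulast`) follow the early key/value style of OpenURL and should be checked against the OpenURL specification before real use.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def make_openurl(resolver_base, metadata):
    """Attach a metadata description to a resolver address as a query string.

    Empty values are skipped so the link only carries known fields.
    """
    query = urlencode({key: value for key, value in metadata.items() if value})
    return f"{resolver_base}?{query}"


# Hypothetical citation; none of these values refer to a real article.
citation = {
    "genre": "article",
    "issn": "1234-5678",
    "volume": "12",
    "issue": "3",
    "spage": "45",
    "date": "2002",
    "aulast": "Doe",
}

if __name__ == "__main__":
    link = make_openurl("http://resolver.example.org/menu", citation)
    print(link)
    # The resolver side can recover the description from the link itself.
    print(parse_qs(urlsplit(link).query))
```

The point of the design is that the information resource only describes the item; the user's own institutional resolver decides where the link should lead.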
1. Knowledge contexts categorized
– Keywords & keyword semantic proximity
– Citations and citation proximity
– Semantic proximity
– Traversal proximity
2. Recommendation(s) calculated
3. Traversal proximity analyzed
4. Adaptation in system
Users + Profiles = learning community
Adaptation of Structure and Semantics – Using Collective Behavior of Users
LANL Active Recommendation System
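To make "keyword semantic proximity" and "recommendation(s) calculated" more concrete, here is a small sketch under stated assumptions. It is not the LANL system's algorithm: it assumes that proximity between two keywords can be approximated by how often they index the same documents (documents containing both, divided by documents containing either), and it ignores citation and traversal proximity as well as adaptation. The document set and the function names `keyword_proximity` and `recommend` are hypothetical.

```python
from itertools import combinations


def keyword_proximity(doc_keywords):
    """Co-occurrence proximity for every keyword pair that shares a document.

    doc_keywords maps a document id to its set of keywords. The result maps
    an alphabetically ordered keyword pair to a score in (0, 1].
    """
    docs_by_keyword = {}
    for doc_id, keywords in doc_keywords.items():
        for keyword in keywords:
            docs_by_keyword.setdefault(keyword, set()).add(doc_id)

    proximity = {}
    for a, b in combinations(sorted(docs_by_keyword), 2):
        both = len(docs_by_keyword[a] & docs_by_keyword[b])
        either = len(docs_by_keyword[a] | docs_by_keyword[b])
        if both:
            proximity[(a, b)] = both / either
    return proximity


def recommend(keyword, proximity, top_n=3):
    """Return the keywords closest to `keyword`, strongest first."""
    scored = []
    for (a, b), score in proximity.items():
        if keyword == a:
            scored.append((score, b))
        elif keyword == b:
            scored.append((score, a))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [other for _, other in scored[:top_n]]


# Made-up document collection used only for illustration.
DOCS = {
    "d1": {"digital libraries", "metadata", "harvesting"},
    "d2": {"digital libraries", "metadata"},
    "d3": {"metadata", "harvesting"},
    "d4": {"digital libraries", "visualization"},
}

if __name__ == "__main__":
    prox = keyword_proximity(DOCS)
    print(recommend("metadata", prox))
    print(recommend("digital libraries", prox))
```

In a system that adapts to collective user behavior, scores like these would presumably be updated from usage logs over time; that feedback loop is left out of this sketch.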
Finding the Balance Point
Community specific tools | Encourage/support transdisciplinary research
Small teams | Deployable across Lab or multiple institutions
New technologies, new tools | Legacy data & systems
Knowledge is represented by articles, books, etc.
Knowledge characterized by relationships among objects, documents & resources
Hub/spoke model for DL’s: balance resources and focused efforts
Known path, existing infrastructure (people, buildings) institutional pride
• is nonalgorithmic (path cannot be fully specified in advance)
• tends to be complex (total path not visible from one vantage point)
• often yields multiple solutions (each with costs/benefits rather than unique solutions)
• involves nuanced judgment and interpretation
• involves the application of multiple criteria (which sometimes conflict with one another)
• often involves uncertainty (not everything bearing on the task is known)
• involves self-regulation of the thinking process (someone else does not ‘call the plays’ at every step)
• involves imposing meaning, finding structure in apparent disorder
• is effortful (considerable mental work involved in the kinds of elaborations and judgments required)
*Resnick (’87)
Higher Order Thinking* …
Visualization
• Scientific visualization – use of interactive visual representation of scientific data, typically physically based to amplify cognition
• Information visualization – use of interactive visual representations of abstract, nonphysically based data to amplify cognition
Successes
Culture of measurement – long term focus on user driven requirements and corresponding satisfaction levels
Open Archives Initiative – small, quick, right players
Eprint arXiv – communities of common interest, timeliness, passionate people, didn’t take a lot of $$
OpenURL – small, quick, right players, passionate people (standards efforts too long)
MyLibrary – personalized, ad hoc collaboration
? Recommendation systems with shared knowledge models – uses available logs, complex, privacy concerns
Challenges
IP, copyright limitations
Post 9/11 pressure to close government access
Integrating formal and informal systems – need new mechanisms for peer review and rewards
Archiving – not glamorous but a research problem
Problem space is larger than NSF domain
– Requires cross organizational collaboration (DOE, NIH, etc.) and international connections