Post on 14-Jan-2016

Greenstone Digital Library Software (GSDL)

Open Source Software to Build Digital Libraries

What is open-source software?

• “The basic idea behind open source is very simple: When programmers can read, redistribute, and modify the source code for a piece of software, the software evolves. People improve it, people adapt it, people fix bugs. And this can happen at a speed that, if one is used to the slow pace of conventional software development, seems astonishing.”

- from www.opensource.org

• Anyone can redistribute the software.

• The source code must always be available.

What is a Library?

A trinity:

• Users
• Books
• Staff

What is a Digital Library?

• A digital library is an organized collection of information:
  – A focused collection of digital objects
  – Methods for finding, access and retrieval
  – Methods for selection, organization, and maintenance of the collection
  – Methods for preservation

GSDL - Introduction

Greenstone is a suite of software for building and distributing digital library collections. It provides a new way of organizing information and publishing it on the Internet or on CD-ROM. Greenstone is produced by the New Zealand Digital Library Project at the University of Waikato, and developed and distributed in cooperation with UNESCO and the Human Info NGO. It is open-source, multilingual software, issued under the terms of the GNU General Public License.

Features

• Builds and distributes digital library collections
• Full-text document search and display
• Multi-platform support
• Web-based user interface
• Highly customizable
• Document collections can be exported to CD-ROMs
• Can be used for archiving

Features of Greenstone Software

• Access through Web browser
• Windows or Unix
• Searching
• Browsing
• Easy to maintain
• Various metadata
• Plug-ins for new document types
• Multiple languages
• Text, pictures, audio, video
• Open Source Software
• Hierarchical phrase and key-phrase indexes
• Multi-gigabyte
• Compression
• Password Protection
• User logs
• Administrative functions
• Updates dynamically without bringing system down
• Publish to CD-ROM
• Uniform presentation across different computers

Overview of Greenstone

• Collections

A typical digital library built with Greenstone will contain many collections, individually organized—though they bear a strong family resemblance. Easily maintained, collections can be augmented and rebuilt automatically.

• Document Formats

Source documents come in a variety of formats and are converted by “plugins” into a standard XML form for indexing. Plugins distributed with Greenstone process plain text, HTML, Word, and PDF documents, as well as Usenet and e-mail messages.
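The plugin idea can be illustrated with a minimal Python sketch: a registry maps file extensions to converter functions, and each converter's output is wrapped in one common XML form. Everything here is an assumption made for illustration (the function names, the `PLUGINS` registry, and the `<Document>`/`<Content>` layout); it is not Greenstone's actual plugin API or internal XML format.

```python
"""Hypothetical sketch of extension-based "plugin" dispatch.

None of these names come from Greenstone; they only illustrate the
idea of converting several source formats into one XML form.
"""
import html
import re
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable, Dict


def plain_text_plugin(raw: str) -> str:
    """Plain text needs no conversion."""
    return raw


def html_plugin(raw: str) -> str:
    """Crudely strip tags and decode entities (illustration only)."""
    without_tags = re.sub(r"<[^>]+>", " ", raw)
    return " ".join(html.unescape(without_tags).split())


# Assumed registry: file extension -> converter function.
PLUGINS: Dict[str, Callable[[str], str]] = {
    ".txt": plain_text_plugin,
    ".html": html_plugin,
    ".htm": html_plugin,
}


def to_standard_xml(filename: str, raw: str) -> str:
    """Pick a plugin by extension and wrap its output in a made-up XML form."""
    suffix = Path(filename).suffix.lower()
    plugin = PLUGINS.get(suffix)
    if plugin is None:
        raise ValueError(f"no plugin registered for {suffix!r}")
    doc = ET.Element("Document", {"source": filename})
    ET.SubElement(doc, "Content").text = plugin(raw)
    return ET.tostring(doc, encoding="unicode")


if __name__ == "__main__":
    print(to_standard_xml("page.html", "<h1>Hello</h1><p>Fish &amp; chips</p>"))
```

Supporting a new document type in this sketch means adding one entry to the registry; the indexing side only ever sees the common XML form.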

• Multimedia documents

Collections can contain text, pictures, audio and video. Non-textual material is either linked into the textual documents or accompanied by textual descriptions (such as figure captions) to allow full-text searching and browsing.

Using Greenstone Collections

The figure shows a screenshot of the “Demo” collection supplied with the Greenstone software. Almost all icons are clickable. Several icons appear at the top of almost every page.

[Figure: screenshot of the “Demo” collection]

What we wanted

• “Collections” of digital material
• Individualized, depending on metadata etc.
• Up to several GB of text …
• … + associated images, movies, whatever
• Fully searchable
• Served on WWW, or published on removable media
• Run anywhere, on any computer
• Fully internationalized
• Non-exclusive: documents and metadata in any format
• Non-prescriptive: standard and non-standard metadata

UNESCO: Distributing Greenstone DL software

• GNU licensed
• Fully documented … in English/French/Spanish/Russian
• Language interfaces … Arabic Chinese Czech … Thai Turkish
• Unix/Windows/Mac OS-X
• Trivial to install
• GUI interface for gathering, enriching, building …
• Serve collections on Web or write them to CD-ROM
• Document formats: HTML, Word, PDF, PS, plain text, e-mail
• Metadata formats: XML, DC, OAI, MARC, …

“Give a man a fish, feed him for a day.
Teach a man to fish, feed him for life.”

Sustainable development

Distribution

Greenstone software

• Download from http://greenstone.org
• Languages for interface: 38
• Languages for full software + manuals: 4
• Countries represented on email lists: 60
• UNESCO training courses in: Bangalore, Almaty, Dakar, Suva, …

Greenstone facts

• Open source: GNU GPL
• Distributed via SourceForge since: Nov 2000
• Average downloads: 5000/month since then
• Humanitarian CD-ROMs produced: 30-35
• Distribution for each one: 5000/year

International

UN Agencies

• UNESCO, Paris (“Information for All” programme)
• FAO, Rome (Info Management Resource Kit)
• UNU, Japan (CD-ROM collections of UNU material)

Technical centers

• University of Waikato, New Zealand
• Indian Institute of Science, Bangalore
• University College, London
• University of Cape Town, South Africa
• University of Lethbridge, Canada

Sample collections at greenstone.org: International

• Argentina Human Rights Commission (Argentina)
• Tasmania State Library (Australia)
• Peking University Digital Library (China)
• Gresham College, London (England)
• University of Applied Sciences, Stuttgart (Germany)
• Association of Indian Labour Historians (India)
• Indian Institute of Management, Kozhikode (India)
• Indian Institute of Science, Bangalore (India)
• Vimercate Public Library, Milan, Italy (Italy)
• Netherlands Institute for Scientific Information Services (Netherlands)
• Philippine Government Information Network (Philippines)
• Mari El Republic, Russia (Russia)
• Slavonski Brod Public Library, Slovenia (Slovenia)
• Vietnam National University (Vietnam)
• Welsh Books Council (Wales)

Sample collections at greenstone.org: U.S.

• Auburn University, Alabama
• Detroit Public Library
• Hawaiian Electronic Library
• ibiblio project, University of North Carolina
• Illinois Wesleyan University
• Lehigh University, Pennsylvania
• New York Botanical Garden
• University of California at Riverside
• University of Chicago Library
• University of Illinois
• Texas A&M University
• Washington Research Library Consortium

Standards

Documents

Plugins for:
• PDF
• PostScript
• Word, RTF
• HTML
• Plain text
• LaTeX
• Images (any format: GIF, JPEG, TIFF …)
• MP3
• Ogg Vorbis
• UnknownPlug (e.g. for audio, MPEG, MIDI)
• ZIP
• Excel
• PPT
• Email
• Source code

Metadata

• Can use any metadata set, Dublin Core supplied
• Plugins for: XML, Refer, MARC, OAI, CDS/ISIS, METS (subset), ProCite, DSpace, BibTeX
• METS can be used as Greenstone’s internal representation

Serving

• Web
• Can publish Greenstone collections on CD-ROM
• Can publish Greenstone collections on OAI
• Export collections to METS
• Export collections to DSpace (ready for DSpace’s batch import program)
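As a rough illustration of the Dublin Core support mentioned under Metadata, the sketch below builds a small Dublin Core record in XML with Python's standard library. The `<record>` wrapper, the `dublin_core_record` helper, and the sample values are assumptions made for this example; only the Dublin Core element namespace is standard, and this is not Greenstone's own metadata file layout.

```python
"""Sketch: describe one document with Dublin Core elements in XML.

The <record> wrapper and dict-based input are illustrative
assumptions, not a format defined by Greenstone.
"""
import xml.etree.ElementTree as ET
from typing import Dict, List

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: Dict[str, List[str]]) -> ET.Element:
    """Build a <record> whose children are dc:* elements.

    Dublin Core elements are optional and repeatable, so each field
    name maps to a list of values.
    """
    record = ET.Element("record")
    for name, values in fields.items():
        for value in values:
            ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record


if __name__ == "__main__":
    rec = dublin_core_record({
        "title": ["Hypothetical farming handbook"],
        "creator": ["A. Author", "B. Author"],
        "language": ["en"],
    })
    print(ET.tostring(rec, encoding="unicode"))
```

Because every element is optional and repeatable, a record can carry as little or as much description as a collection needs.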

The power of open source: Greenstone uses …

• Ghostscript: interpreter for Adobe PostScript documents (PostScript plugin)
• Kea: keyphrase extraction program (to generate metadata)
• pdftohtml: converter for PDF documents (PDF plugin)
• rtftohtml: converter for RTF documents (RTF plugin)
• TextCat: detects languages and document encodings
• wvWare: converter for Word documents (Word plugin)
• Xlhtml: converter for Excel/PowerPoint documents (plugins)
• XML::Parser: parses XML documents, used to read and write Greenstone’s internal XML document format

and …

• MG: creates compressed full-text indexes and performs searches
• GDBM: database used for metadata etc.
• wget: downloading pages from the Web when creating collections
• YAZ: client and server implementation of Z39.50
• Stemmer: English language stemmer
• GCC: C/C++ compiler
• CVS: version control system
• Perl: used for plugins etc.
• Apache: Web server used by many Greenstone installations

Example

Humanity Development Library for sustainable development and basic human needs

• 160,000 pages, 30,000 images, 800 books, 430 magazines
• 340 kg, US$20,000
• CD-ROM: US$1
• Win3.1x upward
• Stand-alone and intranet server
• Web browser user interface
• Global Help Project, Antwerp (+ UN agencies)

• Peking University Library: Chinese documents (pictures of text) + Chinese interface
• Classic Chinese literature: Chinese (Chinese & English interfaces)
• UNESCO, Paris: French
• PAHO, WHO: Spanish
• Mari El Republic (http://gov.mari.ru/gsdl): Russian

The Greenstone Librarian Interface (GLI)

• Building collections
• Interactive Java program
• Runs on anything
• Build a collection on the computer you are on
• … plus new applet version
• Includes metadata editor
• Caveat: cannot deal with such huge collections as Greenstone can (particularly of metadata)

Create a new collection

Gather: Gather the files together

Enrich: Add the Metadata

Design: Add plugins and configure them

Design: Search Indexes, etc

Create: Building the collection

Preview: admire the result

Create: It’s built – preview it?

Format: For Features Display, etc.

Export the collection to CD-ROM?

Previewing the collection

Full-text search

Search Results

Full Text Display

Form-based search

Browsing titles

Browsing by Keywords
