LIS618 lecture 0 Introduction to the course Thomas Krichel 2011-04-21

LIS618 lecture 0

Introduction to the course

Thomas Krichel

2011-04-21

structure

• me
• the way I see it
• you
• the way you see it

me

• I am Thomas Krichel.
• My homepage is http://openlib.org/home/krichel. You can also use http://wotan.liu.edu/home/krichel; it has almost the same contents at all times.

my courses page

• My courses are at http://wotan.liu.edu/home/krichel/courses.

• These contain material for all current and previous editions of all courses that I ran at the Palmer School.

• I am an open access supporter.

me and LIS618

• In 2003, the course was called “database searching”.

• Since 2004, it has been called “online information retrieval techniques”.

• Let me try to clarify both terms.

term “database”

• A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality (for example, the availability of rooms in hotels), in a way that supports processes requiring this information (for example, finding a hotel with vacancies).
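The hotel example can be made concrete with a small sketch. It assumes a hypothetical two-table schema; the table names, column names, and sample rows are invented for illustration and do not come from the lecture. It uses Python's standard sqlite3 module to model room availability and to answer the question of which hotels have vacancies.

```python
import sqlite3


def build_hotel_db():
    """Create an in-memory database with hypothetical hotels and rooms."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE hotel (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE room (
            hotel_id INTEGER NOT NULL REFERENCES hotel(id),
            number   TEXT NOT NULL,
            occupied INTEGER NOT NULL  -- 1 if taken, 0 if free
        );
    """)
    # Sample data (invented): the first hotel is full, the second has one free room.
    conn.executemany("INSERT INTO hotel VALUES (?, ?)",
                     [(1, "Harbour Inn"), (2, "Station Hotel")])
    conn.executemany("INSERT INTO room VALUES (?, ?, ?)",
                     [(1, "101", 1), (1, "102", 1),
                      (2, "201", 1), (2, "202", 0)])
    conn.commit()
    return conn


def hotels_with_vacancies(conn):
    """Return the names of hotels that have at least one free room."""
    rows = conn.execute("""
        SELECT DISTINCT hotel.name
        FROM hotel JOIN room ON room.hotel_id = hotel.id
        WHERE room.occupied = 0
        ORDER BY hotel.name
    """).fetchall()
    return [name for (name,) in rows]


conn = build_hotel_db()
print(hotels_with_vacancies(conn))  # prints ['Station Hotel']
```

The data model (hotels and rooms) mirrors the relevant aspect of reality, and the query supports the process that needs it (finding a hotel with vacancies).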

this is not our database

• The previous definition is not what librarians mean by “database” when they use the term “database searching”.

• What they mean by “database” is any type of resource, usually accessed remotely, that the library has purchased.

• Searching Google is not “database searching”.

searching

• When we use the term “searching” in “database searching”, we mean the following process:
– a user has an information need
– the user formulates a query
– the user is presented with a set of results

• Librarians love searching. Users love finding.
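The three-step process above can be sketched in a few lines. This is a toy illustration only: the `search` function, the sample records, and the matching rule (every query term must appear as a whole word) are assumptions made for the example, not a description of how any real database or search engine works.

```python
def search(records, query):
    """Return the records that contain every query term (case-insensitive AND)."""
    terms = query.lower().split()
    return [r for r in records
            if all(t in r.lower().split() for t in terms)]


# Hypothetical collection of record titles.
records = [
    "Hotel vacancies in Toulouse",
    "Economics working papers",
    "Working papers on library economics",
]

# 1. The user has an information need: working papers in economics.
# 2. The user formulates a query.
query = "economics papers"
# 3. The user is presented with a set of results.
results = search(records, query)
for r in results:
    print(r)
```

Whether the result set actually satisfies the information need is a separate question, which is the point of the remark about searching versus finding.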

why study database searching?

• There are historical reasons.
• There are pedagogical reasons.
• There are reasons of transparency.

historical reasons

• When libraries first licensed remote content, it was very expensive and difficult to use.
– The telecommunications charges were high.
– The cost of system access was high. There often was a charge by the minute.
– The systems were difficult to use. They were not suitable for an untrained user.
• Database searching by a librarian is a way to save costs.

historical reasons today

• The historical reasons don’t seem to apply.
• There are still reasons why you have intermediated searching.
• One important reason is to save the searcher (a high-salary individual) time by having the search conducted by someone with a lower salary.
• A lot of these jobs are outsourced.

pedagogical reasons

• As librarians, we need to teach people how to use online information resources.

• Unless they can do this themselves.
• Many (most) think they can.
• The pedagogical reasons seem to disappear over time.
• There are, however, serious problems of transparency.

transparency

• In the old days, library databases were proprietary.

• The engines came with detailed documentation on how to search their contents. The release of this information did not damage the business.

• In the days of search engines (the new “database”), the search algorithms are secret.

secrecy in search

• The search engines give some indications of how they do their work.
• But overall the algorithms are secret.
• The monopoly of Google poses a serious threat to the information culture.
• The solution would be to build and operate open-source engines. I have done some pioneering but small-scale work in this area.

information retrieval

• Deals with how to build systems that allow users, even untrained ones, to obtain complicated information.

• This is big business. Google, arguably the most successful business of the early 21st century, owes it all to information retrieval.

• In particular, to web information retrieval.

online information retrieval techniques

• This is different from database searching because we are talking about techniques.

• Successful database searching requires techniques at the level of query formulation.

• But even more, it requires an overall knowledge of the database, its contents, and its structure.

• This is more the subject of a sources and services type course.

http://openlib.org/home/krichel

Please shut down the computers when you are done.

Thank you for your attention!

NOT COVERED BECAUSE OF ILLNESS

Proposed Organization

• Normal lecture
• Quiz at the beginning of every lecture
– Factually oriented, around 15 minutes
– Remove worst performance
– Average to form 50%
• Search exercise 50%
• I may make some adjustments to the syllabus this week.
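The proposed grading arithmetic can be sketched as follows. The function name, the 0 to 100 scale, and the sample scores are assumptions made for illustration; the slide only states that the worst quiz is removed, that the remaining quizzes are averaged to form 50%, and that the search exercise forms the other 50%.

```python
def course_grade(quiz_scores, exercise_score):
    """Combine quizzes and search exercise, each worth 50% (hypothetical 0-100 scale)."""
    if len(quiz_scores) < 2:
        raise ValueError("need at least two quizzes to drop the worst one")
    kept = sorted(quiz_scores)[1:]          # remove the single worst performance
    quiz_average = sum(kept) / len(kept)    # average of the remaining quizzes
    return 0.5 * quiz_average + 0.5 * exercise_score


# Quizzes 60, 80, 90, 70: the 60 is dropped, the rest average 80.
print(course_grade([60, 80, 90, 70], 85))  # prints 82.5
```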

Search exercise

• Find a victim of an information need
• Best to take someone you know in a professional capacity
• Conduct an interview about an information need experienced by the victim, write down expectations
• Search in a formal database and on the web
• Discuss results with the victim
• Write an essay, no longer than 5 pages.

about the course

• This course is new wine in an old bottle
• Officially a merger of
– lis566 information resources on the Internet
  • mailing lists
  • usenet news
  • web searching
– lis618 database searching
  • access and use of commercial databases

mix of theory and practice

• I am not a database search practitioner.
• Each database is different; practical skills are not easily transferable.
• Thus my emphasis in the course is more on theory.
• In the past, I did theory first, then practice.
• These days I mix: some theory and some practice in every session.

What online retrieval systems?

• Dialog has been the traditional database covered.
– They were the market leaders in online databases in the past.
– Nowadays the field is much more open.
– They remain a very good teaching tool for command-based database searching.
• Nexis: a news database I have covered every year.
• Google: a well-known search engine that I started to cover two years ago.

other stuff

• Other online IR systems that I have covered in the past
– OCLC FirstSearch
– Factiva (briefly)
– WestLaw (external speaker)
• New developments
– Peer-to-peer networks
– an introduction to reference linking using OpenURL
• Old developments with library potential
– relational databases

About me

• Born 1965, in Völklingen (Germany)
• Studied economics and social sciences at the Universities of Toulouse, Paris, Exeter and Leicester.
• PhD in theoretical macroeconomics
• Lecturer in Economics at the University of Surrey from 1993 to 2001
• Since 2001 assistant professor at the Palmer School

Why?

• During my research assistantship period (1990 to 1993), I was constantly frustrated with difficult access to scientific literature.

• At the same time, I discovered easy access to freely downloadable software over the Internet.

• I decided to work towards downloadable scientific documents. This led to my library career (eventually).

Steps taken I

• 1993: founded the NetEc project at http://netec.mcc.ac.uk, later also available at http://netec.ier.hit-u.ac.jp and http://netec.wustl.edu.

• These are networking projects targeted at the economics community. The bulk is
– Information about working papers
– Downloadable working papers
– Journal articles were added later

Steps taken II

• Set up RePEc, a digital library for economics research. It catalogs
– Research documents
– Collections of research documents
– Researchers themselves
– Organizations that are important to the research process
• Decentralized collection, model for the Open Archives Initiative

Steps taken III

• Co-founder of the Open Archives Initiative
• Work on the Academic Metadata Format
• Co-founded rclis (Research in Computing, Library and Information Science), a RePEc clone
• Currently working on the Konz project. It uses a database of titles of papers published in journals and tries to find them on the Internet.

my interest in databases

• An important emphasis of the course is still on commercial databases.

• From my point of view, I have two interests in database searching
– As a provider, I must understand how people search in order to provide some data that they can use and will use.
– As an economist, I have a strong interest in information as a commodity. The database market is an important marketplace.

online information retrieval

• This subject can be thought of as a subset of information retrieval (IR). Most IR is online or digital.

• IR concentrates on textual data.
• We can think of online IR as falling under two categories
– database IR
– web IR

database / web IR

• Database IR looks at systems that have
– a controlled set of records
– low heterogeneity
– use that requires authentication
– advanced search features

• Web IR has opposite characteristics

traditional social model

• User goes to a library
• Describes problem to the librarian
• Librarian does the search
– without the user present
– with the user present
• Hands over the result to the user
• User fetches full-text or asks a librarian to fetch the full text.

economic rationale for traditional model

• In olden days the cost of telecommunication was high.

• Database use costs
– cost of communication
– cost of access time to the database

• The traditional model puts an upper limit on the costs.
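A rough sketch of this cost argument follows. It assumes that both charges accrue per minute of connection and that the librarian-run search caps the session length; the function names, the rates, and the 20-minute cap are hypothetical values chosen for illustration, not figures from the lecture.

```python
def session_cost(minutes, telecom_rate, access_rate):
    """Cost of one online session when both charges accrue per minute."""
    return minutes * (telecom_rate + access_rate)


def intermediated_cost(minutes_wanted, minute_cap, telecom_rate, access_rate):
    """Cost when a librarian runs the search and stops at a fixed time budget."""
    billed_minutes = min(minutes_wanted, minute_cap)
    return session_cost(billed_minutes, telecom_rate, access_rate)


# Hypothetical rates: 0.50 per minute for telecommunication, 2.00 for database access.
print(session_cost(30, 0.5, 2.0))            # prints 75.0 (uncapped 30-minute session)
print(intermediated_cost(30, 20, 0.5, 2.0))  # prints 50.0 (capped at 20 minutes)
```

Under these assumptions, the cap times the combined rate is the upper limit on what any one search can cost.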

disintermediation

• With the cost of access time gone, the traditional model is under threat.

• There is disintermediation, where the librarian loses her role of doing the search.

• But that may not be good news for information retrieval results
– user knows subject matter best
– librarian knows searching best

Web searching

• IR has received a lot of impetus through the web, which poses unprecedented search challenges.

• With more and more data appearing on the web, database searching (DS) may be a subject in decline
– It is primarily concerned with non-web databases
– There are more and more web-based methods of searching

Public access vs quality

• Now the public at large is able to do online searching.
• At the same time, the need for quality answers has grown.
• Quality-filtered services will become more important.
• In the current databases, there is a lot that would already be available for free, mixed with quality-controlled stuff.
• Publishers have direct offerings, and intermediated vending is in decline.

http://openlib.org/home/krichel

Thank you for your attention!