LIS618 lecture 4 Thomas Krichel 2002-09-06



Page 1: LIS618 lecture 4 Thomas Krichel 2002-09-06. Structure of talk The blue sheet Working with Dialog Nexis.com

LIS618 lecture 4

Thomas Krichel 2002-09-06

Page 2:

Structure of talk

• The blue sheet
• Working with Dialog
• Nexis.com

Page 3:

the rank command

• counts the occurrences of unique terms within a specified field and lists them in ranked order, with the most highly posted term appearing first
• example
  – rank au
    to find who has written the most papers
• does not work on word-indexed fields
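As a rough illustration of what rank does, the following Python sketch counts the values of a field over a result set and lists them most-posted first. The records and the `au` key are invented for the example, and this is not Dialog's own implementation.

```python
from collections import Counter

def rank(records, field):
    """Count each distinct value of `field` and list them most-posted first."""
    counts = Counter()
    for record in records:
        # A field such as AU may hold several values per record.
        counts.update(record.get(field, []))
    # Sort by descending count, then alphabetically so ties come out stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical result set, invented for illustration.
records = [
    {"au": ["Smith, J.", "Lee, K."]},
    {"au": ["Smith, J."]},
    {"au": ["Lee, K.", "Smith, J."]},
    {"au": ["Garcia, M."]},
]

for author, postings in rank(records, "au"):
    print(f"{postings:3d}  {author}")
```

Each author name is counted as one whole phrase. A word-indexed field would have to be split into single words first, which is why rank is of no use there.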

Page 4:

View command

Page 5:

Sort command

Page 6:

type command

type set/format/range [from base]

• set is a result set
• format is a format
• range can be
  – start-end
    • start is the record number to start at
    • end is the record number to end at
  – all
• base is a database file number or "each"
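To make the template concrete, here is a small Python sketch that assembles a command string from its parts. The function name `type_command` and its arguments are hypothetical helpers; only the template type set/format/range [from base] comes from the slide.

```python
def type_command(result_set, fmt, rng, base=None):
    """Assemble a command following the template type set/format/range [from base]."""
    if isinstance(rng, tuple):
        start, end = rng
        rng_text = f"{start}-{end}"   # a start-end range of record numbers
    else:
        rng_text = str(rng)           # for example "all"
    command = f"type {result_set}/{fmt}/{rng_text}"
    if base is not None:
        # base is a database file number or the word "each"
        command += f" from {base}"
    return command

print(type_command("s1", 8, "all"))
print(type_command("s2", 3, (1, 5), base="each"))
```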

Page 7:

formats are defined

• 2 – full record except abstract
• 3 or medium – citation
• 5 or long – full except full text
• 6 or free – title and Dialog number
• 8 or short – title plus indexing terms
  – useful to find other indexing terms
• 9 or full – everything
• KWIC or K – keywords in context
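The list above is really a lookup table in which several formats have two names. A minimal Python sketch of that table follows; the names `FORMATS`, `ALIASES` and `describe_format` are made up for the example.

```python
# Format numbers and what they display, as listed on the slide.
FORMATS = {
    "2": "full record except abstract",
    "3": "citation",
    "5": "full except full text",
    "6": "title and Dialog number",
    "8": "title plus indexing terms",
    "9": "everything",
    "kwic": "keywords in context",
}

# Word forms that stand for the same format.
ALIASES = {
    "medium": "3",
    "long": "5",
    "free": "6",
    "short": "8",
    "full": "9",
    "k": "kwic",
}

def describe_format(name):
    """Return the description of a format given by number or by name."""
    key = str(name).lower()
    key = ALIASES.get(key, key)
    return FORMATS[key]

print(describe_format(8))
print(describe_format("short"))
```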

Page 8:

Using controlled vocabularies

• not all databases have one
• different databases use heterogeneous schemes
• a strategy to use them is
  – s word
  – s s1/ti,de
  – t s1/8/all
  – s term/cc

Page 9:

special name finders

• Dialog company name, file 416
• Dialog journal name, file 414
• Dialog product name, file 413

Page 10:

searching dialog index

• use broad term on file 415
• expand
  – db= database type
  – di= index category
  – dt= document types indexed
  – gc= geographic coverage

Page 11:

Finally: verdict on Dialog

• speed abominable
• interface clunky
• JavaScript stinks
• integration of sources leaves much to be desired
• does not reach the pass mark
• will be dead in five years' time

Page 12:

Nexis

Page 13:

primarily a news service

• adds an important temporal component to all its contents
• restricts contents as compared to Dialog
• six interfaces to be covered
  – search
  – personal news
  – subject directory
  – real time news
  – power search
  – search forms
• potentially bad competition from Google

Page 14:

limiting by dates

• does not use ISO format :--(
• any of the following are legal
  – 07/24/2000
  – 7/24/00
  – July 24, 2000
  – Jul 24, 2000
  – July 2000
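For comparison with the ISO format, here is a hedged Python sketch that reads each of the date styles listed above and prints the ISO equivalent. The helper `to_iso` and the `PATTERNS` table are assumptions made for the example; they say nothing about how Nexis itself parses dates.

```python
from datetime import datetime

# (input pattern, output pattern) pairs, tried in order.
# Month names assume an English (C) locale.
PATTERNS = [
    ("%m/%d/%Y", "%Y-%m-%d"),    # 07/24/2000
    ("%m/%d/%y", "%Y-%m-%d"),    # 7/24/00
    ("%B %d, %Y", "%Y-%m-%d"),   # July 24, 2000
    ("%b %d, %Y", "%Y-%m-%d"),   # Jul 24, 2000
    ("%B %Y", "%Y-%m"),          # July 2000 (no day given, so none is printed)
]

def to_iso(text):
    """Convert one of the date styles above to ISO notation."""
    for pattern_in, pattern_out in PATTERNS:
        try:
            parsed = datetime.strptime(text.strip(), pattern_in)
        except ValueError:
            continue  # this pattern does not fit, try the next one
        return parsed.strftime(pattern_out)
    raise ValueError(f"unrecognised date: {text!r}")

for example in ["07/24/2000", "7/24/00", "July 24, 2000", "Jul 24, 2000", "July 2000"]:
    print(example, "->", to_iso(example))
```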

Page 15:

Nexis search

• no phrase terms
• implicit Boolean "or" between terms
• in fact searches
  – keywords extracted
  – HLEAD for news
  – TITLE for legal documents
  – WEB-SEARCH-TEXT for web pages

Page 16:

with the web comes all sorts

• TITLE: FarmSex FarmLove bestiality beasitialty animal sex animalsex dog fuckers horsecocks zebras chimps monkeys male fuck dogsex
• URL: http://www.farmsex.com
• DOMAIN-TYPE: Commercial
• WEB-LEAD-TEXT: WARNING! Farm Sex The following pages contain sexually oriented adult material intended for individuals 21 years of age or older.
• WEB-PAGE-STATUS: OK (200)
• LAST-UPDATE: September 7, 2002
• FIRST-REFERENCE: Jul 22 1999 12:00AM
• LATEST-REFERENCE: June 01, 2002
• SUBJECT: PORNOGRAPHY & OBSCENITY (78%);

Page 17:

TERMS segment

• a list of indexing terms has been compiled by Nexis staff
• a computer scans each document for these terms or variants of them, and replaces each variant with the term identified
• this collection of terms makes up the index terms
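The scanning step can be pictured with a short Python sketch: a table maps each index term to the variants that stand for it, and a document receives every term for which some variant is found. The table `INDEX_TERMS` and the function `assign_terms` are invented for illustration; the actual Nexis term list and matching rules are not described in the lecture.

```python
import re

# Hypothetical controlled list: each index term with the variants that map to it.
INDEX_TERMS = {
    "MERGERS & ACQUISITIONS": ["merger", "mergers", "acquisition", "takeover"],
    "INTEREST RATES": ["interest rate", "interest rates"],
}

def assign_terms(text):
    """Return the index terms for which some variant occurs in `text`."""
    found = []
    for term, variants in INDEX_TERMS.items():
        for variant in variants:
            # Whole-word, case-insensitive match of the variant.
            if re.search(rf"\b{re.escape(variant)}\b", text, re.IGNORECASE):
                found.append(term)
                break  # one matching variant is enough for this term
    return found

print(assign_terms("The takeover was delayed after interest rates rose."))
```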

Page 18:

relevance ranking

• where terms appear within the document
• how many occurrences of the terms appear in the document
• how often those search terms appear throughout the document
• apparently not how much they occur; example: search for "the"
• it seems that they guard the algorithm as a secret

Page 19:

Subject directory

• you can follow the subject tree, but
• there seems to be only a tiny number of documents
• categories are not particularly deep or developed
• there is a "more like this" feature, of limited use, Thomas finds

Page 20:

Power search

• source selection and editing are possible
• use of connectors is possible here
  – OR
  – AND
  – AND NOT
  – PRE/n, n is a number, ordered proximity
  – W/n, n is a number, unordered proximity
  – W/S, words in the same sentence
  – W/P, words in the same paragraph

• no use of double quotes for phrases
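The difference between the ordered and unordered proximity connectors can be sketched in Python. The functions `within` and `pre` are hypothetical stand-ins for W/n and PRE/n, and counting distance as the difference in word positions is an assumption, not a documented Nexis rule.

```python
import re

def tokens(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def positions(words, term):
    """Return every position at which `term` occurs."""
    return [i for i, word in enumerate(words) if word == term]

def within(text, first, second, n):
    """W/n: the two terms occur within n words of each other, in either order."""
    words = tokens(text)
    return any(i != j and abs(i - j) <= n
               for i in positions(words, first)
               for j in positions(words, second))

def pre(text, first, second, n):
    """PRE/n: `first` comes before `second`, at most n words apart."""
    words = tokens(text)
    return any(0 < j - i <= n
               for i in positions(words, first)
               for j in positions(words, second))

text = "The bank raised the interest rate"
print(within(text, "rate", "bank", 4))  # order does not matter
print(pre(text, "rate", "bank", 4))     # order matters
```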

Page 21:

Power search expressions

• parentheses group terms together
• * for one or no letter
• ! for any number of letters
• ATLEAST n, where n is a minimum number of occurrences
• PLURAL (term) only the plural of term
• SINGULAR (term) only the singular of term
• ALLCAPS (term) only capitals of term
• NOCAPS (term) no capitals of term
• CAPS (term) capitalized term only
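One way to see what the two wildcards do is to translate them into a regular expression, as in the Python sketch below. It follows the slide's wording (* for one or no letter, ! for any number of letters); the helper names are made up, and the other expressions such as ATLEAST or PLURAL are not covered.

```python
import re

def wildcard_to_regex(term):
    """Translate a term with * and ! wildcards into a compiled regex."""
    parts = []
    for char in term:
        if char == "*":
            parts.append("[a-z]?")   # one or no letter
        elif char == "!":
            parts.append("[a-z]*")   # any number of letters
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.IGNORECASE)

def matches(term, word):
    """True if the whole `word` fits the wildcard `term`."""
    return wildcard_to_regex(term).fullmatch(word) is not None

for word in ["woman", "women", "wooden"]:
    print("wom*n", word, matches("wom*n", word))
for word in ["library", "libraries", "librarian", "liberal"]:
    print("librar!", word, matches("librar!", word))
```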

Page 22:

power search for news

• uses power search expressions, plus
• hlead (expression)
• company (expression) for a company
• byline (expression) for the author
• show (expression) for a television show transcript

expression is a Boolean expression

Page 23:

power search for legal data

• uses power search expressions, plus
• name (expression) for the name of a party
• cite (expression) for a citation expression for case law
• title (expression) for the title of a law article

expression is a Boolean expression

Page 24:

other searches

• our evidence suggests that this is a delicate topic
• news alert
  – use this to get personal news
  – do a search, then click on update to get to a screen where you can enter
    • periodicity
    • document type

Page 25:

a different query language

• terms are implicitly ANDed
• explicit AND and OR allowed
• phrases have to be put in quotes
• * stands for any number of characters, not just one as in power search
• parentheses can be used
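The first and third points can be illustrated with a small Python sketch in which every term must be present and a quoted phrase is treated as one unit. The function `matches_all` is a made-up helper; explicit AND and OR, the * wildcard and parentheses are deliberately left out.

```python
import re
import shlex

def contains_phrase(words, phrase_words):
    """True if `phrase_words` occurs as a consecutive run inside `words`."""
    n = len(phrase_words)
    return any(words[i:i + n] == phrase_words
               for i in range(len(words) - n + 1))

def matches_all(query, text):
    """Implicit AND: every term or quoted phrase of `query` must occur in `text`."""
    words = re.findall(r"\w+", text.lower())
    # shlex.split keeps a double-quoted phrase together as a single item.
    items = shlex.split(query.lower())
    return all(contains_phrase(words, re.findall(r"\w+", item))
               for item in items)

text = "Interest rates rose as the central bank met"
print(matches_all('"interest rates" bank', text))
print(matches_all('"rates interest" bank', text))
print(matches_all("bank inflation", text))
```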