Data Discovery The reference interview

Upload: deirdre-miller

Post on 16-Jan-2016


Page 1: Data Discovery The reference interview. Always begin by clarifying the distinction between statistics and data with your patron. Never assume that the

Data Discovery

The reference interview


The reference interview

• Always begin by clarifying the distinction between statistics and data with your patron. Never assume that the patron clearly knows this distinction.

• Ask a question that will help you understand what they might be seeking using our frameworks from yesterday.

• Asking them if they want statistics or data isn’t a good starting question, though.


Frameworks

Statistical Information

• Statistics

– In print

– Online: E-publications, E-tables, Databases

• Data

– Aggregate

– Microdata

Table Dimensions:

• Geography

• Time

• Subject content


The reference interview

• What does the patron intend or need to do with the numbers? What is their objective?

– Does the patron need them for a report or for data analysis?

• What geographic area is needed?

– Smallest geographic area to be described

• What time period is needed?

• What subject matter (variables) expressed in numbers is needed?


The reference interview

If you determine the patron does need data:

• Population (unit of observation) to be described

• Do they need aggregate data, microdata, spatial data?

• What software does the patron intend to use?

• How would the patron like the data delivered?


Level of service

• How much you do depends on the level of service you are offering.

– Finding a resource

– Retrieving a resource from an online service

– Tailoring a product for the patron

– Creating a product for a patron (e.g., postal code conversion linkage)


Does the person want one number? Are they pursuing a fact or figure? Want to know “how many?”

• YES: Statistics in print or ready-ref. electronic source?

– YES: Go to print or ready-ref. electronic source.

– NO: Are the data accessible in computer-readable form? If YES: go to computer-readable source, then extract relevant data from the computer-readable source and compile statistics using appropriate software.
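
The decision flow above can be read as a small routine. The following is a minimal Python sketch, not part of the original slides: the function name, parameter names, and return strings are illustrative assumptions, and the branches the flowchart leaves open return None rather than a guessed answer.

```python
from typing import Optional


def suggest_source(wants_single_figure: bool,
                   in_print_or_ready_ref: bool,
                   computer_readable: bool) -> Optional[str]:
    """Follow the flowchart's branches; return None where it gives no guidance."""
    if not wants_single_figure:
        # The flowchart only covers requests for a fact or figure.
        return None
    if in_print_or_ready_ref:
        return "print or ready-ref. electronic source"
    if computer_readable:
        # Here the statistic has to be compiled from the data.
        return "computer-readable source: extract data and compile statistics"
    # The flowchart does not say what to do when the data are not computer-readable.
    return None


print(suggest_source(True, False, True))
```

Keeping the undefined branches explicit makes it clear where the interview has to continue beyond what the flowchart covers.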


To Use Data You Need 3 Things

• Datafile (the raw numbers)

• “Codebook” (where the numbers are and what they mean)

• Statistical Software (for reading the datafile and analyzing the data)
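
To make the three pieces concrete, here is a minimal Python sketch in which a few lines of code stand in for the statistical software. The datafile, the column positions, and the code labels are all hypothetical and chosen only for illustration; they are not taken from the Field Poll files shown later.

```python
# Hypothetical datafile: columns 1-4 hold a case ID,
# column 5 holds sex, column 6 holds an answer code.
DATAFILE = """\
000112
000228
000321
"""

# Hypothetical codebook: where each variable sits (1-based, inclusive columns)
# and what its codes mean.
CODEBOOK = {
    "case_id": {"columns": (1, 4), "labels": None},
    "sex": {"columns": (5, 5), "labels": {"1": "MALE", "2": "FEMALE"}},
    "rating": {"columns": (6, 6),
               "labels": {"1": "VERY GOOD", "2": "GOOD", "8": "NO OPINION"}},
}


def read_cases(datafile, codebook):
    """Play the role of the software: slice each record using the codebook."""
    cases = []
    for line in datafile.splitlines():
        case = {}
        for name, spec in codebook.items():
            start, end = spec["columns"]
            raw = line[start - 1:end]
            labels = spec["labels"]
            # Translate a code to its label when the codebook defines one.
            case[name] = labels.get(raw, raw) if labels else raw
        cases.append(case)
    return cases


cases = read_cases(DATAFILE, CODEBOOK)
for case in cases:
    print(case)
```

Without the codebook, the datafile is only a string of digits; without the datafile, the codebook describes nothing; and without software, neither can be turned into statistics.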


Field California Poll (newsletter), September 24, 1996, as reproduced on microfiche in the collection American Public Opinion Data.

The Statistics


3001101 1999503 1
3001102122322288181818 112999999999999 999911111199999911111999993311182818
3001103182818 89214888211111111111111199999999999999 122883 2299821948
30011046601893249242331 111 212190100 9000311
300110500000000010000000000000000000000
3001106 1.1951 1.1345 1.1474 1.1585
3001107 1.1559 1.0007 1.0461 1.1416
3001201 2329503 2
3001202238543388881288 112999999999999 999999999911881199999111113231282882
3001203222882 18828822229999999999999911231221221212 322814 8103011942
30012043209492892242314 221 282071000 9470711
300120510010000000000000000000000000000
3001206 1.0056 0.8949 0.9050 0.8557
3001207 1.0988 0.9358 0.8786 0.8586
3001301 5349503 1
3001302358332888111888 117999999999999 999988881199999933333999992221181822
3001303181822 18848223112121112111241499999999999999 212884 3399811948
30013046405399393111511 211 212121000 9550311
300130510000000000000000000000000000000
3001306 1.1951 0.8094 0.6256 0.8518
3001307 1.1559 0.5942 0.4393 0.8840
3001401 1029503 2
3001402342342218111111 111128888888122 100199999922888299999822882212121828
3001403118821 11122223119999999999999912112182221122 212213 2202538148
30014044805399119381311 211 131491000 9540311
300140500000000010000000000000000000010
3001406 0.7594 0.6758 0.7376 0.7498
3001407 0.7829 0.6668 0.7040 0.7600

The Data


VARIABLE 15   RATE PERFORMANCE-BARBARA BOXER   DECK 2/17

Q7. WHAT KIND OF JOB DO YOU THINK BARBARA BOXER IS DOING AS U.S. SENATOR - A VERY GOOD, GOOD, FAIR, POOR OR VERY POOR JOB?

N OF CASES   VALUE   VALUE LABEL
        33       1   VERY GOOD
       130       2   GOOD
       134       3   FAIR
        63       4   POOR
        43       5   VERY POOR
       107       8   NO OPINION
       513       9   NOT APPLICABLE (NOT FORM B)
      ____
      1023           TOTAL

From the codebook for the data: The Field (California) Poll #96-04, THE FIELD INSTITUTE. INTERVIEWING PERIODS: AUGUST 29 - SEPTEMBER 7, 1996. NUMBER OF CASES: 1023.

The Codebook
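
The counts in the codebook entry are already enough to compile a simple statistic. The sketch below is an illustration in Python rather than anything from the slides; treating "NOT APPLICABLE (NOT FORM B)" as missing is an assumption made here, since those respondents were not asked the question.

```python
# N of cases per value label, copied from the codebook entry for Variable 15.
COUNTS = {
    "VERY GOOD": 33,
    "GOOD": 130,
    "FAIR": 134,
    "POOR": 63,
    "VERY POOR": 43,
    "NO OPINION": 107,
    "NOT APPLICABLE (NOT FORM B)": 513,
}

total = sum(COUNTS.values())  # 1023, matching NUMBER OF CASES in the codebook

# Assumption: respondents coded "not applicable" are excluded from the base.
asked = {label: n for label, n in COUNTS.items()
         if not label.startswith("NOT APPLICABLE")}
n_asked = sum(asked.values())

# Percentage of those asked the question who gave each answer.
percentages = {label: round(100 * n / n_asked, 1) for label, n in asked.items()}

for label, pct in percentages.items():
    print(f"{label:<12}{pct:>6}%")
```

The choice of base matters: dividing by all 1023 cases instead of only those asked would roughly halve every percentage, which is exactly the kind of detail a codebook lets you check.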


Statistical Software

• Designed to read large files of raw numeric data

• Not a spreadsheet!

– Can handle many more variables and cases.

– Can do more elaborate and accurate statistics.

– Designed to handle data (cases, observations, variables, weights), not unstructured “cells.”
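
Weights are one reason the case structure matters. As a minimal Python sketch with hypothetical cases and hypothetical weights (none of these values come from the slides), the same answers give different shares depending on whether each case counts once or counts by its weight:

```python
from collections import defaultdict

# Hypothetical cases: (answer, weight). The weights are illustrative only.
CASES = [
    ("GOOD", 1.5),
    ("GOOD", 1.5),
    ("FAIR", 0.5),
    ("POOR", 0.5),
]


def shares(cases, weighted=True):
    """Return each answer's share of the total, weighted or unweighted."""
    totals = defaultdict(float)
    for answer, weight in cases:
        totals[answer] += weight if weighted else 1.0
    grand_total = sum(totals.values())
    return {answer: value / grand_total for answer, value in totals.items()}


print("unweighted:", shares(CASES, weighted=False))
print("weighted:  ", shares(CASES, weighted=True))
```

In a spreadsheet the weight is just another cell; statistical software treats it as a property of the case and applies it throughout the analysis.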


• GAUSS

• JMP

• MiniTab

• S-Plus

• SAS

• SPSS

• Stata

• Systat


SPSS

[Diagram labels: Codebook; Describe data layout; Write commands to analyze data; (data), illustrated with the first records of the datafile shown above.]


RESPONDENTS SEX * recoded question 7 Crosstabulation

                                    recoded question 7
RESPONDENTS SEX                     Very Good/Good   Fair     Poor/Very Poor   no opinion   Total
MALE     Count                      64               70       65               50           249
         % within RESPONDENTS SEX   25.7%            28.1%    26.1%            20.1%        100.0%
         % of Total                 12.6%            13.8%    12.8%            9.8%         48.9%
FEMALE   Count                      96               64       42               58           260
         % within RESPONDENTS SEX   36.9%            24.6%    16.2%            22.3%        100.0%
         % of Total                 18.9%            12.6%    8.3%             11.4%        51.1%
Total    Count                      160              134      107              108          509
         % within RESPONDENTS SEX   31.4%            26.3%    21.0%            21.2%        100.0%
         % of Total                 31.4%            26.3%    21.0%            21.2%        100.0%
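
The percentages in this crosstabulation can be recomputed from the cell counts alone. The following Python sketch does so; the variable and function names are illustrative assumptions, while the counts are the ones shown in the table.

```python
COLUMNS = ["Very Good/Good", "Fair", "Poor/Very Poor", "no opinion"]

# Cell counts from the crosstabulation, one row per value of RESPONDENTS SEX.
COUNTS = {
    "MALE": [64, 70, 65, 50],
    "FEMALE": [96, 64, 42, 58],
}

grand_total = sum(sum(row) for row in COUNTS.values())
column_totals = [sum(col) for col in zip(*COUNTS.values())]


def row_percentages(row):
    """'% within RESPONDENTS SEX': each cell as a share of its row total."""
    row_total = sum(row)
    return [round(100 * n / row_total, 1) for n in row]


def total_percentages(row):
    """'% of Total': each cell as a share of all cases in the table."""
    return [round(100 * n / grand_total, 1) for n in row]


for sex, row in COUNTS.items():
    print(sex, "% within sex:", row_percentages(row))
    print(sex, "% of total: ", total_percentages(row))
print("Total", column_totals, grand_total)
```

Row percentages answer "how do men and women each rate the senator?", while percentages of total describe each cell's share of the whole sample; a patron asking for "the percentage" usually means the former.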


RESPONDENTS SEX * RATE PERFORMANCE-BARBARA BOXER Crosstabulation

                                    RATE PERFORMANCE-BARBARA BOXER
RESPONDENTS SEX                     VERY GOOD   GOOD     FAIR     POOR     VERY POOR   Total
MALE     % within RESPONDENTS SEX   7.0%        25.0%    35.0%    18.0%    15.0%       100.0%
         % of Total                 3.5%        12.4%    17.4%    9.0%     7.5%        49.8%
FEMALE   % within RESPONDENTS SEX   8.9%        38.6%    31.7%    12.9%    7.9%        100.0%
         % of Total                 4.5%        19.4%    15.9%    6.5%     4.0%        50.2%
Total    % within RESPONDENTS SEX   8.0%        31.8%    33.3%    15.4%    11.4%       100.0%
         % of Total                 8.0%        31.8%    33.3%    15.4%    11.4%       100.0%


Reference strategies

• Gov publications approach

– What agency would produce such a statistic?

    • Do its mandate or goals include the scope of content?

    • Who are the members of the agency, if the agency is a membership organization?

– What jurisdiction is responsible for this content?

– Is this likely an official or non-official statistic?

– What publication titles are related to this content?

– What is the availability of statistics from the agency?

• Data librarian approach

– What data source would be used to produce such a statistic?

– Who would collect such data?

– What unit of observation would be needed to produce such a statistic?

– What would the structure of the table look like given time, geography and attributes of the unit of observation?

– Would the source be in the realm of official or non-official statistics?

– Use the literature trail and its indexes (non-official vs. official publications)


The data reference interview process

• The information-seeking context is as important in statistics and data reference interviews as it is in other reference interviews.

• How is the data reference interview similar to general reference interviews?

• How is the data reference interview different?


Research on the data reference interview process

• A colleague is developing a model from which comparisons can be made between the general and data reference interviews.

• One aspect of the model, namely the discovery and clarification of concepts and language, is being investigated using items from a specialist discussion list and a blog.

http://blogs.library.ualberta.ca/digrs/
