Chapter 1 Jan. 8, 2008
Chapter 1
Where Do Data Come From?
Thought Question 1
From a recent study, researchers concluded that high levels of alcohol consumption resulted in lower graduation rates at colleges. How do you think this study was carried out in order to get these results? Do you think the conclusion is correct? Is there a more reasonable conclusion?
Thought Question 2
It is commonly believed that, for similar jobs, men earn more money on average than women, and yet there are cases where some women make more money than some men. Therefore, to determine whether men really do earn more, you would need to sample many people of each sex. Suppose we also want to know whether, on average, men stay at their current jobs for a longer time than women. How could you go about trying to determine this? Would it be sufficient to collect data for one member of each sex? Two members of each sex? What information about men’s and women’s measurements would help you decide how many people to measure?
What is STATISTICS?
Using “data” to draw a conclusion about something unknown.
Decision making in the presence of uncertainty.
Statistics: Meaning?
Method of analysis: a collection of methods for planning experiments or observational studies, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data.
Statistics: Meaning?
Our Book:
Statistics is the science (or ‘art’) of data.
Common Language
Population: The complete collection of all subjects or objects (scores, people, measurements, and so on) that are being studied.
The collection is complete in the sense that it includes all subjects to be studied.
Census: The collection of data from every element in a population.
Sample: A subset of elements drawn from a population, from which we collect data.
The sample should be representative of the entire population.
[Figure: a population made up of individuals]
Sampling Frame
Individuals that could possibly be selected for the sample (not necessarily the same as the population)
[Figure: Census. List of individuals numbered 1 to 17.]
[Figure: Sampling frame and sample. List of individuals numbered 1 to 17.]
Example
Suppose we are interested in the average age of all Malaspina students.
The relevant population is all Malaspina students (including students at all campuses).
Sampling Frame: List of Malaspina students at the Nanaimo campus.
Example Cont.
A sample can be the students in this Math 161 class, or 50 randomly selected Malaspina students at the Nanaimo campus.
If we use the ages of all Malaspina students, then we have a census.
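Drawing 50 students at random from a sampling frame can be sketched in a few lines of Python. This is only an illustration: the frame here is a made-up list of ID numbers (its size, 2000, is an assumption), and the function name `draw_simple_random_sample` is hypothetical.

```python
import random

# Hypothetical sampling frame: ID numbers standing in for the students on
# the Nanaimo campus list. The frame size (2000) is an assumption.
sampling_frame = list(range(1, 2001))


def draw_simple_random_sample(frame, n, seed=None):
    """Return n distinct individuals chosen at random from the frame."""
    rng = random.Random(seed)      # a seed makes the draw repeatable
    return rng.sample(frame, n)    # sampling without replacement


sample = draw_simple_random_sample(sampling_frame, 50, seed=161)
print(len(sample), "students selected, for example:", sorted(sample)[:5])
```

Selecting every individual in the frame instead of 50 would amount to a census of the frame, which is still not a census of the population if the frame leaves out the other campuses.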
What is Statistics?
_____________________________________
______________________________________
______________________________________
Descriptive & Inferential Statistics
Statistical Methods
– Descriptive Statistics
– Inferential Statistics
Descriptive Statistics
Consists of the collection, organization, summarization, and presentation of data.
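As a small illustration of summarizing data, the sketch below computes a few common descriptive measures with Python's standard `statistics` module. The ages are made-up placeholder values, not data from any study.

```python
import statistics

# Made-up ages for ten students; placeholder values for illustration only.
ages = [18, 19, 19, 20, 21, 21, 21, 23, 27, 34]

# Organize the summary measures in one place.
summary = {
    "n": len(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "stdev": statistics.stdev(ages),   # sample standard deviation
    "min": min(ages),
    "max": max(ages),
}

# Present the summary.
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```

Everything here describes only the ten values at hand; nothing is claimed about any larger group.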
Inferential Statistics
Consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions.
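One simple form of generalizing from a sample to a population is an interval estimate for the population mean. The sketch below uses made-up ages and the normal-approximation multiplier 1.96 for a rough 95% interval; for a sample this small a t multiplier would be more appropriate, so treat the numbers as illustrative only. The function name `approx_95_ci_for_mean` is hypothetical.

```python
import math
import statistics


def approx_95_ci_for_mean(sample):
    """Rough 95% confidence interval for a population mean.

    Uses the normal approximation (multiplier 1.96), which is an assumption
    that works best for large random samples.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    margin = 1.96 * standard_error
    return mean - margin, mean + margin


# Made-up ages for a sample of ten students (placeholder values).
sample_ages = [18, 19, 19, 20, 21, 21, 21, 23, 27, 34]
low, high = approx_95_ci_for_mean(sample_ages)
print(f"Estimated mean age: between {low:.1f} and {high:.1f}")
```

The interval is a statement about the population the sample was randomly drawn from, which is what separates it from a purely descriptive summary.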
What Is “Data”? (better: What are “data”?)
Pieces of information.
Numbers.
The above are data only if the information has a meaning attached.
The Nature of Data
Data
Data are observations that have been collected. The observations may be numerical (examples: age, height, GPA) or non-numerical (examples: gender, eye colour, province of residence).
The Nature of Data
Two Types of Data
Quantitative or Numeric Data
Numbers representing counts or measurements.
Qualitative or Categorical Data
Data that can be separated into different categories that are distinguished by some non-numeric characteristic.
Examples
Quantitative (numerical) data
the ________________ of college graduates
the ________________ between home and school
Qualitative (or categorical) data
the ___________ of college graduates (F, M)
the ___________ of a product (best, good, bad)
Two Types of Quantitative Data
Discrete: Data values that can be counted such as 0, 1, 2, 3, . . .
Example: Number of students in a Stat. class.
Continuous: Data that can assume an infinite number of values between any two specific values. Usually results from measurements.
Examples
Classify as discrete or continuous:
1. The number of eggs that hens lay; for example, 3 eggs a day. ________
2. The height of college students. ___________________
Data
– Categorical
– Numerical
  – Discrete
  – Continuous
Levels of Measurement
1. Nominal: characterized by data that consist of names, labels, or categories only. The data cannot be arranged in an ordering scheme (such as low to high).
Example: Survey responses may be yes, no, or undecided. Eye colour, gender, etc.
2. Ordinal: involves data that may be arranged in some order, but differences between data values either cannot be determined or are meaningless.
Example: Course grades: A, B, C, D, or F.
Dress size: small, medium, large, XL.
3. Interval: like the ordinal level, with the additional property that the difference between any two data values is meaningful. However, there is no natural zero starting point (where none of the quantity is present).
Example: Years 1000, 2000, 1776, and 1492.
Temperature in °C: 0 °C does not mean no temperature.
4. Ratio: the interval level modified to include the natural zero starting point (where zero indicates that none of the quantity is present). For values at this level, differences and ratios are meaningful.
Example: Prices of college textbooks.
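The difference between the interval and ratio levels can be checked with a little arithmetic. The sketch below is an illustration under stated assumptions: the temperatures and textbook prices are made-up values, and the Kelvin scale (which does have a natural zero) is brought in only for comparison; it is not part of the slides.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin (a scale with a true zero)."""
    return celsius + 273.15


# Made-up temperatures in degrees Celsius (interval level).
t1_c, t2_c = 10.0, 20.0

# Differences are meaningful: the gap is the same on both scales.
diff_c = t2_c - t1_c
diff_k = celsius_to_kelvin(t2_c) - celsius_to_kelvin(t1_c)

# Ratios are not meaningful in Celsius: 20 / 10 = 2, yet 20 °C is not
# "twice as hot" as 10 °C, as the ratio on the Kelvin scale shows.
ratio_c = t2_c / t1_c
ratio_k = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

# Made-up textbook prices (ratio level): zero means no cost at all,
# so "twice as expensive" is a meaningful statement.
price_a, price_b = 50.0, 100.0
price_ratio = price_b / price_a

print(f"Celsius difference {diff_c:.2f}, Kelvin difference {diff_k:.2f}")
print(f"Celsius ratio {ratio_c:.3f}, Kelvin ratio {ratio_k:.3f}")
print(f"Price ratio {price_ratio:.1f}")
```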
Levels of Measurement
_________________ - categories only
_________________ - categories with some order
_________________ - differences but no natural starting point
_________________ - differences and a natural starting point
Summary
Data
– Categorical: Nominal, Ordinal
– Numerical: Interval, Ratio
How Data are Obtained
Observational Study
– Observes individuals and measures variables of interest but does not attempt to influence the responses.
– Describes some group or situation.
– Sample surveys are a type of observational study.
Experiment
– Deliberately imposes some treatment on individuals in order to observe their responses.
– Studies whether the treatment causes change in the response.
Experiments vs. observational studies for comparing the effects of treatments:
Experiment
– Experimenter determines which units receive which treatments (ideally using some form of random allocation).
Observational study
– Compares units that happen to have received each of the treatments.
– Often useful for identifying possible causes of effects, but cannot reliably establish causation.
Only properly designed and executed experiments can reliably demonstrate causation.
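Random allocation of units to treatments can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the unit labels, treatment names, and the function `randomly_allocate` are all hypothetical.

```python
import random


def randomly_allocate(units, treatments, seed=None):
    """Shuffle the units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)      # a seed makes the allocation repeatable
    shuffled = list(units)
    rng.shuffle(shuffled)
    allocation = {treatment: [] for treatment in treatments}
    for position, unit in enumerate(shuffled):
        treatment = treatments[position % len(treatments)]
        allocation[treatment].append(unit)
    return allocation


# Hypothetical experimental units and treatments.
units = [f"unit_{i}" for i in range(1, 13)]
treatments = ["treatment A", "treatment B", "control"]

allocation = randomly_allocate(units, treatments, seed=1)
for treatment, members in allocation.items():
    print(treatment, "->", members)
```

Because chance alone decides who receives which treatment, the groups should be similar on average before treatment, which is what lets an experiment support a causal conclusion.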
Data Sources
– Primary: Experiment, Survey, Observation
– Secondary
Case Study
The Effect of Hypnosis on the Immune System
reported in Science News, Sept. 4, 1993, p. 153
Case Study
The Effect of Hypnosis on the Immune System
Objective: To determine if hypnosis strengthens the disease-fighting capacity of immune cells.
Case Study
65 college students
– 33 easily hypnotized
– 32 not easily hypnotized
White blood cell counts measured.
All students viewed a brief video about the immune system.
Case Study
Students randomly assigned to one of three conditions:
– subjects hypnotized, given mental exercise
– subjects relaxed in sensory deprivation tank
– control group (no treatment)
Case Study
White blood cell counts re-measured after one week.
The two white blood cell counts are compared for each group.
Results
– hypnotized group showed larger jump in white blood cells
– “easily hypnotized” group showed largest immune enhancement
Case Study
The Effect of Hypnosis on the Immune System
What is the population?
What is the sample?
Case Study
The Effect of Hypnosis on the Immune System
What data were collected?
– Easy or difficult to achieve hypnotic trance
– Group assignment
– Pre-study white blood cell count
– Post-study white blood cell count
Case Study
The Effect of Hypnosis on the Immune System
Is this an experiment or an observational study?
Case Study
The Effect of Hypnosis on the Immune System
Do hypnosis and mental exercise affect the immune system?
Case Study
Weight Gain Spells Heart Risk for Women
“Weight, weight change, and coronary heart disease in women.” W.C. Willett, et al., vol. 273(6), Journal of the American Medical Association, Feb. 8, 1995.
(Reported in Science News, Feb. 18, 1995, p. 108)
Case Study
Weight Gain Spells Heart Risk for Women
Objective: To recommend a range of body mass index (a function of weight and height) in terms of coronary heart disease (CHD) risk in women.
Case Study
Study started in 1976 with 115,818 women aged 30 to 55 years and without a history of previous CHD.
Each woman’s weight (body mass) was determined.
Each woman was asked her weight at age 18.
Case Study
The cohort of women was followed for 14 years.
The number of CHD cases (fatal and nonfatal) was counted (1292 cases).
Results were adjusted for other variables.
Case Study
Results: compare those who gained less than 11 pounds from age 18 to current age with the others.
– 11 to 17 lbs: 25% more likely to develop heart disease
– 17 to 24 lbs: 64% more likely
– 24 to 44 lbs: 92% more likely
– more than 44 lbs: 165% more likely
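A statement such as "165% more likely" is easy to misread, so it can help to convert each figure to a relative risk (how many times as likely). The sketch below does only that arithmetic on the percentages reported above; the function name `to_relative_risk` is hypothetical, and the reference group is the women who gained less than 11 pounds.

```python
# Percent increases in CHD risk as reported on the slide, relative to
# women who gained less than 11 pounds since age 18.
percent_more_likely = {
    "11 to 17 lbs": 25,
    "17 to 24 lbs": 64,
    "24 to 44 lbs": 92,
    "more than 44 lbs": 165,
}


def to_relative_risk(percent_increase):
    """'X% more likely' corresponds to (1 + X/100) times as likely."""
    return 1 + percent_increase / 100


relative_risk = {
    group: to_relative_risk(percent)
    for group, percent in percent_more_likely.items()
}

for group, risk in relative_risk.items():
    print(f"{group}: {risk:.2f} times as likely as the reference group")
```

So "165% more likely" means 2.65 times as likely, not 1.65 times.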
Case Study
Weight Gain Spells Heart Risk for Women
What is the population?
What is the sample?
Case Study
Weight Gain Spells Heart Risk for Women
What data were collected?
– Age in 1976
– Weight in 1976
– Weight at age 18
– Incidence of coronary heart disease
– Other: smoking, family history, menopausal status, post-menopausal hormone use
Case Study
Weight Gain Spells Heart Risk for Women
Is this an experiment or an observational study?
Case Study
Weight Gain Spells Heart Risk for Women
Does weight gain in women increase their risk for CHD?
Key Concepts
Knowing about statistical methods will have practical consequences in your everyday lives.
Experiment versus Observational Study.
Common Terms:
– Individuals, Population, Sampling Frame, Sample, Sample Survey, Census, Variable.
Conclusion
Defined statistics.
Distinguished between descriptive & inferential statistics.
Summarized the sources of data.
Described the types of data & scales.
Common Terms: Population, Sampling frame, Census, Sample, Individuals, Variables, etc.