metropolitan state university of denver...

66
Introduction Collecting Data: Sampling Introduction to Statistics Nels Grevstad Metropolitan State University of Denver [email protected] August 19, 2019 Nels Grevstad

Upload: others

Post on 25-Sep-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Introduction to Statistics

Nels Grevstad

Metropolitan State University of Denver

[email protected]

August 19, 2019

Nels Grevstad

Page 2: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Topics

1 Introduction

2 Collecting Data: Sampling

Nels Grevstad

Page 3: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Objectives

Objectives:

Distinguish between quantitative and qualitative data.Distinguish between discrete and continuous (quantitative)variables. Distinguish between nominal and ordinal(qualitative) variables.

Identify common types of selection bias in sampling.

State how non-response can lead to biased survey results.

Nels Grevstad

Page 4: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Introduction (1.1, 1.2)

Statistics is the science of collecting, organizing,analyzing, and drawing conclusions from data.

Data are observations (measurements) of a variable.

A variable is a characteristic that varies from oneindividual to the next (e.g. height, age, income etc.).

Nels Grevstad

Page 5: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Introduction (1.1, 1.2)

Statistics is the science of collecting, organizing,analyzing, and drawing conclusions from data.

Data are observations (measurements) of a variable.

A variable is a characteristic that varies from oneindividual to the next (e.g. height, age, income etc.).

Nels Grevstad

Page 6: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Introduction (1.1, 1.2)

Statistics is the science of collecting, organizing,analyzing, and drawing conclusions from data.

Data are observations (measurements) of a variable.

A variable is a characteristic that varies from oneindividual to the next (e.g. height, age, income etc.).

Nels Grevstad

Page 7: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Nominal Ordinal DiscreteContinuous

Qualitative Quantitative

Data

Nels Grevstad

Page 8: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Variables can be either

1. Quantitative (also called numerical), i.e. possiblevalues of the variable are numerical values, or

2. Qualitative (also called categorical), i.e. possiblevalues of the variable are categories.

Nels Grevstad

Page 9: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Variables can be either

1. Quantitative (also called numerical), i.e. possiblevalues of the variable are numerical values, or

2. Qualitative (also called categorical), i.e. possiblevalues of the variable are categories.

Nels Grevstad

Page 10: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Variables can be either

1. Quantitative (also called numerical), i.e. possiblevalues of the variable are numerical values, or

2. Qualitative (also called categorical), i.e. possiblevalues of the variable are categories.

Nels Grevstad

Page 11: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Quantitative variables can be either

1. Discrete, i.e. possible values of the variable areisolated numbers with gaps between them, e.g. theinteger values 0, 1, 2, 3, ..., or

2. Continuous, i.e. the set of possible values of thevariable is an entire interval, or continuum, e.g. allvalues greater than 0, such as 47.46231.

Nels Grevstad

Page 12: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Quantitative variables can be either

1. Discrete, i.e. possible values of the variable areisolated numbers with gaps between them, e.g. theinteger values 0, 1, 2, 3, ..., or

2. Continuous, i.e. the set of possible values of thevariable is an entire interval, or continuum, e.g. allvalues greater than 0, such as 47.46231.

Nels Grevstad

Page 13: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Quantitative variables can be either

1. Discrete, i.e. possible values of the variable areisolated numbers with gaps between them, e.g. theinteger values 0, 1, 2, 3, ..., or

2. Continuous, i.e. the set of possible values of thevariable is an entire interval, or continuum, e.g. allvalues greater than 0, such as 47.46231.

Nels Grevstad

Page 14: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Qualitative variables can be either

1. Ordinal, i.e. possible values of the variable have aspecific ordering to them (e.g. small, medium, andlarge), or

2. Nominal, i.e. the possible values of the variable haveno specific ordering (e.g. the colors red, green, andblue, etc.).

Nels Grevstad

Page 15: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Qualitative variables can be either

1. Ordinal, i.e. possible values of the variable have aspecific ordering to them (e.g. small, medium, andlarge), or

2. Nominal, i.e. the possible values of the variable haveno specific ordering (e.g. the colors red, green, andblue, etc.).

Nels Grevstad

Page 16: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Qualitative variables can be either

1. Ordinal, i.e. possible values of the variable have aspecific ordering to them (e.g. small, medium, andlarge), or

2. Nominal, i.e. the possible values of the variable haveno specific ordering (e.g. the colors red, green, andblue, etc.).

Nels Grevstad

Page 17: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

ExerciseDetermine whether each of these variables is quantitative orqualitative. If it’s quantitative, determine whether it’s discreteor continuous. If it’s qualitative, determine whether it’s ordinalor nominal.

a) The makes (e.g. Ford, Toyota, etc.) of cars in a certaindowntown parking garage.

b) The amount of time it takes your bus to travel between twostops from day to day.

c) The numbers of flowers on plants after three months growth.

d) The heights of plants after three months growth.

Nels Grevstad

Page 18: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

ExerciseDetermine whether each of these variables is quantitative orqualitative. If it’s quantitative, determine whether it’s discreteor continuous. If it’s qualitative, determine whether it’s ordinalor nominal.

a) The makes (e.g. Ford, Toyota, etc.) of cars in a certaindowntown parking garage.

b) The amount of time it takes your bus to travel between twostops from day to day.

c) The numbers of flowers on plants after three months growth.

d) The heights of plants after three months growth.

Nels Grevstad

Page 19: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

ExerciseDetermine whether each of these variables is quantitative orqualitative. If it’s quantitative, determine whether it’s discreteor continuous. If it’s qualitative, determine whether it’s ordinalor nominal.

a) The makes (e.g. Ford, Toyota, etc.) of cars in a certaindowntown parking garage.

b) The amount of time it takes your bus to travel between twostops from day to day.

c) The numbers of flowers on plants after three months growth.

d) The heights of plants after three months growth.

Nels Grevstad

Page 20: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

ExerciseDetermine whether each of these variables is quantitative orqualitative. If it’s quantitative, determine whether it’s discreteor continuous. If it’s qualitative, determine whether it’s ordinalor nominal.

a) The makes (e.g. Ford, Toyota, etc.) of cars in a certaindowntown parking garage.

b) The amount of time it takes your bus to travel between twostops from day to day.

c) The numbers of flowers on plants after three months growth.

d) The heights of plants after three months growth.

Nels Grevstad

Page 21: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

e) The weights of babies born at a certain hospital.

f) The number of phone calls received by a store’s customerservice department from day to day.

g) The condition (good, fair, serious, critical) of patientsadmitted to a hospital.

h) The number of college credits students in this class haveacquired.

i) The colors (blue, orange, red, etc.) of the shirts worn bystudents in this class.

Nels Grevstad

Page 22: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

e) The weights of babies born at a certain hospital.

f) The number of phone calls received by a store’s customerservice department from day to day.

g) The condition (good, fair, serious, critical) of patientsadmitted to a hospital.

h) The number of college credits students in this class haveacquired.

i) The colors (blue, orange, red, etc.) of the shirts worn bystudents in this class.

Nels Grevstad

Page 23: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

e) The weights of babies born at a certain hospital.

f) The number of phone calls received by a store’s customerservice department from day to day.

g) The condition (good, fair, serious, critical) of patientsadmitted to a hospital.

h) The number of college credits students in this class haveacquired.

i) The colors (blue, orange, red, etc.) of the shirts worn bystudents in this class.

Nels Grevstad

Page 24: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

e) The weights of babies born at a certain hospital.

f) The number of phone calls received by a store’s customerservice department from day to day.

g) The condition (good, fair, serious, critical) of patientsadmitted to a hospital.

h) The number of college credits students in this class haveacquired.

i) The colors (blue, orange, red, etc.) of the shirts worn bystudents in this class.

Nels Grevstad

Page 25: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

e) The weights of babies born at a certain hospital.

f) The number of phone calls received by a store’s customerservice department from day to day.

g) The condition (good, fair, serious, critical) of patientsadmitted to a hospital.

h) The number of college credits students in this class haveacquired.

i) The colors (blue, orange, red, etc.) of the shirts worn bystudents in this class.

Nels Grevstad

Page 26: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

A population is an entire group of individuals about whichwe seek information.

A sample is a subset of the population that’s selected insome prescribed manner (usually randomly).

Two uses of statistics are

1. To describe (or summarize) a set of data(descriptive statistics).

2. To draw conclusions about a population based on datain a sample (inferential statistics).

Nels Grevstad

Page 27: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

A population is an entire group of individuals about whichwe seek information.

A sample is a subset of the population that’s selected insome prescribed manner (usually randomly).

Two uses of statistics are

1. To describe (or summarize) a set of data(descriptive statistics).

2. To draw conclusions about a population based on datain a sample (inferential statistics).

Nels Grevstad

Page 28: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

A population is an entire group of individuals about whichwe seek information.

A sample is a subset of the population that’s selected insome prescribed manner (usually randomly).

Two uses of statistics are

1. To describe (or summarize) a set of data(descriptive statistics).

2. To draw conclusions about a population based on datain a sample (inferential statistics).

Nels Grevstad

Page 29: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

A population is an entire group of individuals about whichwe seek information.

A sample is a subset of the population that’s selected insome prescribed manner (usually randomly).

Two uses of statistics are

1. To describe (or summarize) a set of data(descriptive statistics).

2. To draw conclusions about a population based on datain a sample (inferential statistics).

Nels Grevstad

Page 30: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

A population is an entire group of individuals about whichwe seek information.

A sample is a subset of the population that’s selected insome prescribed manner (usually randomly).

Two uses of statistics are

1. To describe (or summarize) a set of data(descriptive statistics).

2. To draw conclusions about a population based on datain a sample (inferential statistics).

Nels Grevstad

Page 31: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Descriptive statistics can be used with any set of data,even if they aren’t a random sample.

Inferential statistics requires that the data be from either:

1. A random sample (so that they’ll be representative ofthe population), or

2. A randomized experiment (so that we can drawconclusions about the effect of a treatment).

Nels Grevstad

Page 32: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Descriptive statistics can be used with any set of data,even if they aren’t a random sample.

Inferential statistics requires that the data be from either:

1. A random sample (so that they’ll be representative ofthe population), or

2. A randomized experiment (so that we can drawconclusions about the effect of a treatment).

Nels Grevstad

Page 33: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Descriptive statistics can be used with any set of data,even if they aren’t a random sample.

Inferential statistics requires that the data be from either:

1. A random sample (so that they’ll be representative ofthe population), or

2. A randomized experiment (so that we can drawconclusions about the effect of a treatment).

Nels Grevstad

Page 34: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Descriptive statistics can be used with any set of data,even if they aren’t a random sample.

Inferential statistics requires that the data be from either:

1. A random sample (so that they’ll be representative ofthe population), or

2. A randomized experiment (so that we can drawconclusions about the effect of a treatment).

Nels Grevstad

Page 35: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

This course has three main broad sections:

1. Collecting data, summarizing data, and graphing data.

2. Probability (which is used to quantify uncertainty).

3. Inferential statistics (which uses probability to expressuncertainty in the conclusions that we draw about apopulation).

Nels Grevstad

Page 36: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

This course has three main broad sections:

1. Collecting data, summarizing data, and graphing data.

2. Probability (which is used to quantify uncertainty).

3. Inferential statistics (which uses probability to expressuncertainty in the conclusions that we draw about apopulation).

Nels Grevstad

Page 37: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

This course has three main broad sections:

1. Collecting data, summarizing data, and graphing data.

2. Probability (which is used to quantify uncertainty).

3. Inferential statistics (which uses probability to expressuncertainty in the conclusions that we draw about apopulation).

Nels Grevstad

Page 38: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

This course has three main broad sections:

1. Collecting data, summarizing data, and graphing data.

2. Probability (which is used to quantify uncertainty).

3. Inferential statistics (which uses probability to expressuncertainty in the conclusions that we draw about apopulation).

Nels Grevstad

Page 39: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Nels Grevstad

Page 40: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Collecting Data: Sampling (1.2, 1.3)

Sampling versus Census Taking

In a census, data are obtained from every individual in thepopulation (e.g. the U.S Census).

Taking a census is time consuming and expensive. Insteadwe usually take a sample from the population, and thenmake generalizations (i.e. statistical inferences) about thepopulation based on our sample.

Nels Grevstad

Page 41: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Collecting Data: Sampling (1.2, 1.3)

Sampling versus Census Taking

In a census, data are obtained from every individual in thepopulation (e.g. the U.S Census).

Taking a census is time consuming and expensive. Insteadwe usually take a sample from the population, and thenmake generalizations (i.e. statistical inferences) about thepopulation based on our sample.

Nels Grevstad

Page 42: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Collecting Data: Sampling (1.2, 1.3)

Sampling versus Census Taking

In a census, data are obtained from every individual in thepopulation (e.g. the U.S Census).

Taking a census is time consuming and expensive. Insteadwe usually take a sample from the population, and thenmake generalizations (i.e. statistical inferences) about thepopulation based on our sample.

Nels Grevstad

Page 43: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Biased and Unbiased Sampling Schemes

In order to generalize from sample data to the population,we’d like our sample to be representative of the population.

Nels Grevstad

Page 44: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Biased and Unbiased Sampling Schemes

In order to generalize from sample data to the population,we’d like our sample to be representative of the population.

Nels Grevstad

Page 45: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Some Biased Sampling Schemes

A data collection method that systematically produces datathat are not representative of the population is called abiased data collection method.

Nels Grevstad

Page 46: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Some Biased Sampling Schemes

A data collection method that systematically produces datathat are not representative of the population is called abiased data collection method.

Nels Grevstad

Page 47: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Some examples of bias in sampling include:

Selection bias - Certain groups are systematicallyunderrepresented in the sample (and othersoverrepresented). Examples include:

Convenience samples - The sample is selected in a waythat’s convenient, but may favor certain groups of thepopulation (and exclude others). An example is standing ona street corner surveying passersby.

Voluntary response samples (also calledself-selected samples) - Sample selection is by ”openinvitation”. Individuals decide for themselves whether or notto be included, and those who do may differ systematicallyfrom those who don’t. Examples include online surveysconducted from websites, where individuals who have astrong interest in the survey’s subject matter are more likelyto respond than those who don’t.

Nels Grevstad

Page 48: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Some examples of bias in sampling include:

Selection bias - Certain groups are systematicallyunderrepresented in the sample (and othersoverrepresented). Examples include:

Convenience samples - The sample is selected in a waythat’s convenient, but may favor certain groups of thepopulation (and exclude others). An example is standing ona street corner surveying passersby.

Voluntary response samples (also calledself-selected samples) - Sample selection is by ”openinvitation”. Individuals decide for themselves whether or notto be included, and those who do may differ systematicallyfrom those who don’t. Examples include online surveysconducted from websites, where individuals who have astrong interest in the survey’s subject matter are more likelyto respond than those who don’t.

Nels Grevstad

Page 49: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Some examples of bias in sampling include:

Selection bias - Certain groups are systematicallyunderrepresented in the sample (and othersoverrepresented). Examples include:

Convenience samples - The sample is selected in a waythat’s convenient, but may favor certain groups of thepopulation (and exclude others). An example is standing ona street corner surveying passersby.

Voluntary response samples (also calledself-selected samples) - Sample selection is by ”openinvitation”. Individuals decide for themselves whether or notto be included, and those who do may differ systematicallyfrom those who don’t. Examples include online surveysconducted from websites, where individuals who have astrong interest in the survey’s subject matter are more likelyto respond than those who don’t.

Nels Grevstad

Page 50: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Some examples of bias in sampling include:

Selection bias - Certain groups are systematicallyunderrepresented in the sample (and othersoverrepresented). Examples include:

Convenience samples - The sample is selected in a waythat’s convenient, but may favor certain groups of thepopulation (and exclude others). An example is standing ona street corner surveying passersby.

Voluntary response samples (also calledself-selected samples) - Sample selection is by ”openinvitation”. Individuals decide for themselves whether or notto be included, and those who do may differ systematicallyfrom those who don’t. Examples include online surveysconducted from websites, where individuals who have astrong interest in the survey’s subject matter are more likelyto respond than those who don’t.

Nels Grevstad

Page 51: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Examples of bias in sampling (cont’d):

Nonresponse bias - One or more survey questions areskipped by some of the individuals surveyed (itemnon-response) or some surveys aren’t returned at all(unit non-response). The individuals who respond maydiffer systematically from those who don’t respond.Examples include:

Mailed surveys for which the return postage isn’t prepaid.Low income people won’t respond.

Phone surveys that take too long. People who are very busydon’t participate or they hang up before the survey iscompleted.

Surveys that have questions about illegal activities (e.g. druguse) or other sensitive topics (e.g. sex) that some peopledon’t answer.

Nels Grevstad

Page 52: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Examples of bias in sampling (cont’d):

Nonresponse bias - One or more survey questions areskipped by some of the individuals surveyed (itemnon-response) or some surveys aren’t returned at all(unit non-response). The individuals who respond maydiffer systematically from those who don’t respond.Examples include:

Mailed surveys for which the return postage isn’t prepaid.Low income people won’t respond.

Phone surveys that take too long. People who are very busydon’t participate or they hang up before the survey iscompleted.

Surveys that have questions about illegal activities (e.g. druguse) or other sensitive topics (e.g. sex) that some peopledon’t answer.

Nels Grevstad

Page 53: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Examples of bias in sampling (cont’d):

Nonresponse bias - One or more survey questions areskipped by some of the individuals surveyed (itemnon-response) or some surveys aren’t returned at all(unit non-response). The individuals who respond maydiffer systematically from those who don’t respond.Examples include:

Mailed surveys for which the return postage isn’t prepaid.Low income people won’t respond.

Phone surveys that take too long. People who are very busydon’t participate or they hang up before the survey iscompleted.

Surveys that have questions about illegal activities (e.g. druguse) or other sensitive topics (e.g. sex) that some peopledon’t answer.

Nels Grevstad

Page 54: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Examples of bias in sampling (cont’d):

Nonresponse bias - One or more survey questions areskipped by some of the individuals surveyed (itemnon-response) or some surveys aren’t returned at all(unit non-response). The individuals who respond maydiffer systematically from those who don’t respond.Examples include:

Mailed surveys for which the return postage isn’t prepaid.Low income people won’t respond.

Phone surveys that take too long. People who are very busydon’t participate or they hang up before the survey iscompleted.

Surveys that have questions about illegal activities (e.g. druguse) or other sensitive topics (e.g. sex) that some peopledon’t answer.

Nels Grevstad

Page 55: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Examples of bias in sampling (cont’d):

Nonresponse bias - One or more survey questions areskipped by some of the individuals surveyed (itemnon-response) or some surveys aren’t returned at all(unit non-response). The individuals who respond maydiffer systematically from those who don’t respond.Examples include:

Mailed surveys for which the return postage isn’t prepaid.Low income people won’t respond.

Phone surveys that take too long. People who are very busydon’t participate or they hang up before the survey iscompleted.

Surveys that have questions about illegal activities (e.g. druguse) or other sensitive topics (e.g. sex) that some peopledon’t answer.

Nels Grevstad

Page 56: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Avoiding Selection Bias Via Simple Random Sampling

We can avoid selection bias by taking asimple random sample (or SRS) of size n, i.e. a samplechosen in such a way that every possible size-n subset ofthe population has the same chance of being selected.

To take a SRS, either:

1. Place the names of individuals from the population in ahat, and draw n of them without looking, or

2. Create a list (called a sampling frame) of all theindividuals in the population, and then use a computerrandom sample generator to select n individuals fromthe list.

Nels Grevstad

Page 57: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Avoiding Selection Bias Via Simple Random Sampling

We can avoid selection bias by taking asimple random sample (or SRS) of size n, i.e. a samplechosen in such a way that every possible size-n subset ofthe population has the same chance of being selected.

To take a SRS, either:

1. Place the names of individuals from the population in ahat, and draw n of them without looking, or

2. Create a list (called a sampling frame) of all theindividuals in the population, and then use a computerrandom sample generator to select n individuals fromthe list.

Nels Grevstad

Page 58: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Avoiding Selection Bias Via Simple Random Sampling

We can avoid selection bias by taking asimple random sample (or SRS) of size n, i.e. a samplechosen in such a way that every possible size-n subset ofthe population has the same chance of being selected.

To take a SRS, either:

1. Place the names of individuals from the population in ahat, and draw n of them without looking, or

2. Create a list (called a sampling frame) of all theindividuals in the population, and then use a computerrandom sample generator to select n individuals fromthe list.

Nels Grevstad

Page 59: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Avoiding Selection Bias Via Simple Random Sampling

We can avoid selection bias by taking asimple random sample (or SRS) of size n, i.e. a samplechosen in such a way that every possible size-n subset ofthe population has the same chance of being selected.

To take a SRS, either:

1. Place the names of individuals from the population in ahat, and draw n of them without looking, or

2. Create a list (called a sampling frame) of all theindividuals in the population, and then use a computerrandom sample generator to select n individuals fromthe list.

Nels Grevstad

Page 60: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Avoiding Selection Bias Via Simple Random Sampling

We can avoid selection bias by taking asimple random sample (or SRS) of size n, i.e. a samplechosen in such a way that every possible size-n subset ofthe population has the same chance of being selected.

To take a SRS, either:

1. Place the names of individuals from the population in ahat, and draw n of them without looking, or

2. Create a list (called a sampling frame) of all theindividuals in the population, and then use a computerrandom sample generator to select n individuals fromthe list.

Nels Grevstad

Page 61: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Most of the statistical inference procedures we’ll look atwere developed for use with SRSs.

The procedures that are used to compare two groups canalso be used with randomized experiments.

Nels Grevstad

Page 62: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Most of the statistical inference procedures we’ll look atwere developed for use with SRSs.

The procedures that are used to compare two groups canalso be used with randomized experiments.

Nels Grevstad

Page 63: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Other Sampling Schemes

Sometimes, though, other sampling schemes are used.These include:

Systematic random sampling - A sampling frame (i.e. listof all individuals from the population) is created, and thenevery kth individual from the list is selected, starting from arandomly selected individual among the first k on the list.Using a larger value of k will result in a smaller sample size.

Stratified random sampling - The population is first splitinto subpopulations called strata (e.g. based on age group,gender, or income bracket), and then a separate SRS isselected from each stratum. Care should be taken not tooversample from any particular stratum.

Nels Grevstad

Page 64: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Other Sampling Schemes

Sometimes, though, other sampling schemes are used.These include:

Systematic random sampling - A sampling frame (i.e. listof all individuals from the population) is created, and thenevery kth individual from the list is selected, starting from arandomly selected individual among the first k on the list.Using a larger value of k will result in a smaller sample size.

Stratified random sampling - The population is first splitinto subpopulations called strata (e.g. based on age group,gender, or income bracket), and then a separate SRS isselected from each stratum. Care should be taken not tooversample from any particular stratum.

Nels Grevstad

Page 65: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Other Sampling Schemes

Sometimes, though, other sampling schemes are used.These include:

Systematic random sampling - A sampling frame (i.e. listof all individuals from the population) is created, and thenevery kth individual from the list is selected, starting from arandomly selected individual among the first k on the list.Using a larger value of k will result in a smaller sample size.

Stratified random sampling - The population is first splitinto subpopulations called strata (e.g. based on age group,gender, or income bracket), and then a separate SRS isselected from each stratum. Care should be taken not tooversample from any particular stratum.

Nels Grevstad

Page 66: Metropolitan State University of Denver ngrevsta@msudenversites.msudenver.edu/ngrevsta/wp-content/uploads/sites/416/2019/08… · Introduction (1.1, 1.2) Statistics is the science

IntroductionCollecting Data: Sampling

Other Sampling Schemes

Sometimes, though, other sampling schemes are used.These include:

Systematic random sampling - A sampling frame (i.e. listof all individuals from the population) is created, and thenevery kth individual from the list is selected, starting from arandomly selected individual among the first k on the list.Using a larger value of k will result in a smaller sample size.

Stratified random sampling - The population is first splitinto subpopulations called strata (e.g. based on age group,gender, or income bracket), and then a separate SRS isselected from each stratum. Care should be taken not tooversample from any particular stratum.

Nels Grevstad