CHAPTER 7: SURVEY SAMPLING & INFERENCE
CLASS SURVEYS...
STATISTIC VS. PARAMETER…
Know the symbols and the meanings
STATISTICAL INFERENCE…
Drawing conclusions about a population on the basis of observing only a small subset of that population
Always involves some uncertainty
Does a given sample represent a particular population accurately?
Is the sample systematically ‘off’? By a lot? By a little?
WHAT COULD POSSIBLY GO WRONG? BIAS…
Recall, bias is…
Being systematically ‘off’: a scale that reads 5 pounds heavy, a clock that runs 10 minutes fast, etc.
Textbook explains two different types of bias: measurement bias & sampling bias
You don’t need to know whether a particular bias is measurement or sampling bias; you just need to know the concept of bias
Let’s discuss…
BAD SAMPLING TECHNIQUES: VOLUNTARY RESPONSE SAMPLING…
• People who choose themselves by responding to a general appeal
• Biased because people with strong opinions, especially negative opinions, are most likely to respond
• Often very misleading
• With a partner, come up with one real-life example of voluntary response sampling with which you are familiar; 30 seconds & share out
BAD SAMPLING TECHNIQUES: VOLUNTARY RESPONSE SAMPLING…
TV, radio, and online sites pose a question & listeners/viewers call or text in a response
DWTS, America’s Next Top Model, etc.
BAD SAMPLING TECHNIQUES: CONVENIENCE SAMPLING….
Convenience Sampling: Choosing individuals who are easiest to reach.
With a partner, think of a real-life example of convenience sampling. 30 seconds, then share out.
BAD SAMPLING TECHNIQUES: CONVENIENCE SAMPLING…
Example: Mall interviews
• Not representative; mall-goers have $$
• Interviewers tend to choose ‘safe’ people to interview
• May not reflect views of all consumers
• Worthless data
BIAS... SITUATIONAL EXAMPLES…
• Conducting a survey asking if people believe in God at various Christian churches
• Taking a poll at a variety of liquor stores that asks if those customers drink alcohol
• Asking a random sample of gun and ammunition store customers whether they support the right to bear arms
BIAS...
Convenience sampling & voluntary response sampling are both biased sampling methods
How do we minimize/eliminate bias?
Let impersonal chance/randomness do the choosing; more on this...
RANDOM SAMPLING...
(Remember: in voluntary response sampling, people chose to respond; in convenience sampling, the interviewer made the choice; in both situations, personal choice creates bias)
Simple Random Sample (SRS) – type of probability or random sample
SRS – chance selects the sample
Using chance to select the sample is the essential principle of statistical sampling
RANDOM SAMPLING...
Several different types of random sampling; all involve chance selecting the sample
Choosing samples by chance gives all individuals an equal chance to be chosen
We will focus on Simple Random Sampling (SRS)
An SRS of size n ensures that every set of n individuals has an equal chance of being the sample actually selected
SIMPLE RANDOM SAMPLE (SRS)
Easiest ways to use chance/SRS:
• Names in a hat
• Random digits generator in calculator or Minitab
• Random digits table
SIMPLE RANDOM SAMPLE
Random digits table: a long string of the digits 0, 1, ..., 9 in which:
• Each entry in the table is equally likely to be any of the digits 0 – 9
• Entries are independent. Knowing the digits in one part of the table gives no information about another part of the table
• Table is arranged in rows & columns; read either way (but usually rows); the groups & rows have no meaning; they just make the table easier to read
RANDOM DIGITS TABLE...
0 – 9 equally likely
00 – 99 equally likely
000 – 999 equally likely
RANDOM DIGITS TABLE
Joan’s small accounting firm serves 30 business clients. Joan wants to interview a sample of 5 clients to find ways to improve client satisfaction. To avoid bias, she chooses an SRS of size 5.
RANDOM DIGITS TABLE...
Enter table at a random row
Notice her clients are numbered (labeled) with 2-digit numbers (if this isn’t already done, you must label your list), so we are going to read the table in 2-digit groups
Ignore every 2-digit group that is not one of our labels: 00 and anything beyond 30 (our data is numbered from 01 to 30)
Ignore duplicates
Continue until we have 5 distinct 2-digit numbers chosen & identify who those clients are
CHOOSING AN SRS: 4 STEPS
1. Label... Assign a numerical label to every individual
2. Random Digits Table (or Minitab or names in hat)... Select labels at random
3. Stopping Rule ... Indicate when you should stop sampling
4. Identify Sample ... Use labels to identify subjects/individuals selected to be in the sample
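The four steps can also be carried out with software instead of a table. Below is a minimal sketch using Python's standard `random` module in place of Minitab; the function name `choose_srs`, the client names, and the seed are all assumptions made for illustration.

```python
import random


def choose_srs(individuals, sample_size, seed=None):
    """Choose a simple random sample following the four steps."""
    rng = random.Random(seed)
    # Step 1 (Label): assign the labels 1, 2, ..., N to the individuals.
    labels = {i + 1: name for i, name in enumerate(individuals)}
    # Steps 2 and 3 (Select at random, Stopping rule): rng.sample draws
    # without replacement and stops after sample_size distinct labels.
    picked = rng.sample(sorted(labels), sample_size)
    # Step 4 (Identify sample): translate labels back to individuals.
    return [labels[k] for k in picked]


# Hypothetical client list standing in for Joan's 30 clients.
clients = [f"Client {i:02d}" for i in range(1, 31)]
sample = choose_srs(clients, 5, seed=7)
print(sample)
```

Passing a seed makes the draw repeatable; leaving it out gives a different sample on each run, which is why two people running the same software usually get different samples.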
REMINDERS... CAUTIONS... IF USING RDT
Be certain all labels have the same # of digits if using RDT (ensures individuals have the same chance to be chosen)
Use shortest possible label, i.e., 1 digit for populations up to 10 members (can use labels from 0 to 9), 2 digits for populations from 11 – 100 members (can use labels from 00 to 99), etc. --this is just a good standard of practice...
SRS OF OUR CLASS... FOR CANDY
Label students; what labels should we use?
Label candy; what labels should we use?
SRS of 3 students using Random Digits Table; enter table on line ___
SRS of 3 students using my Minitab (will my Minitab be different from your Minitab?)
Should we allow duplicates? When should we and when should we not?
CAUTIONS ABOUT SAMPLE SURVEYS...
Most samples suffer from some degree of undercoverage (another type of bias)
What is bias again?? ....
... Bias is systematically favoring a particular outcome
Undercoverage occurs when one or more groups are left out of the process of choosing the sample, in some way or entirely
UNDERCOVERAGE...
Talk in your groups and come up with an example of undercoverage (some groups in population are left out of the process of choosing the sample)
Examples...
Household surveys will miss students in dorms, prison inmates, the homeless
Telephone surveys will miss those without phones; how about those with unlisted phone numbers?
CAUTIONS ABOUT SAMPLE SURVEYS...
Another source of bias in many/most sample surveys is non-response, when a selected individual cannot be contacted or refuses to cooperate
Big problem; happens very often, even with aggressive follow-up
Almost impossible to eliminate non-response; we can just try to minimize as much as possible
Note: Most media polls won’t/don’t tell us the rate of non-response
CAUTION ABOUT SAMPLE SURVEYS...
Response bias...occurs when respondents are untruthful, especially if asked about illegal or unpopular beliefs or behaviors
Examples: salaries, amount/frequency of alcohol consumed, jail time, use of illegal drugs, weight, age, whether they vote, etc.
VIDEO CLIP OF JIMMY KIMMEL LIVE
Who won the “First Lady Debate?”
http://www.youtube.com/watch?v=EohGmG-QUhA
http://perezhilton.com/tv/JIMMY_KIMMEL_LIVE_Who_Won_The_Presidential_Debate_BEFORE_It_Even_Happened/?id=79b7cb451ec00#.Vb-lN_NViko
http://www.mrctv.org/videos/kimmel-public-weighs-first-lady-debate
WORDING IN SAMPLE SURVEYS... MORE POSSIBLE BIAS
Should we ban disposable diapers?
A survey paid for by makers of disposable diapers found that 84% of the sample opposed banning disposable diapers. Here’s the actual question:
It is estimated that disposable diapers account for less than 2% of the trash in today’s landfills. In contrast, beverage containers, third-class mail and yard wastes are estimated to account for about 21% of the trash in landfills. Given this, in your opinion, would it be fair to ban disposable diapers?
Remember our survey ...
CAUTION ABOUT SAMPLE SURVEYS...
Even if we use SRS/probability sampling, and we are very careful in reducing bias as much as possible, the statistics we get from a given sample will likely be different from the statistics we get from another sample.
Statistics vary from sample to sample
We can improve our results (our sample statistic can get closer to what our population parameter actually is) by increasing our random sample size
Remember samples vary; parameters are fixed
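A small simulation can make "statistics vary; parameters are fixed" concrete. This is a sketch under stated assumptions: the true proportion `TRUE_P = 0.30` is hypothetical, and each draw is treated as an independent success with that probability (i.e., a very large population).

```python
import random
import statistics


def sample_proportions(p, n, repeats, seed=0):
    """Simulate `repeats` random samples of size n and return
    the proportion of successes in each sample."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(repeats)]


TRUE_P = 0.30  # hypothetical fixed parameter, chosen only for illustration

spreads = {}
for n in (25, 100, 400):
    props = sample_proportions(TRUE_P, n, repeats=2000)
    spreads[n] = statistics.stdev(props)
    print(f"n={n:>3}  mean of p-hats={statistics.mean(props):.3f}  "
          f"SD of p-hats={spreads[n]:.3f}")
```

The parameter never moves, but the sample proportions scatter around it, and the scatter shrinks as the sample size grows.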
CAUTION ABOUT SAMPLE SURVEYS…
But remember: no matter how large our sample is, a large sample size does not ‘fix’ underlying issues like bad wording, undercoverage, convenience sampling, etc.
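To see why size does not rescue a biased sample, here is a hedged simulation comparing a large voluntary response sample with a small SRS. The population, the 40% "yes" rate, and the response rates are all invented for illustration.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 people, 40% answer "yes" (1), 60% "no" (0).
population = [1] * 40_000 + [0] * 60_000
true_p = sum(population) / len(population)

# Voluntary response (assumed rates): "no" people are five times as
# likely to respond as "yes" people.
response_rate = {1: 0.02, 0: 0.10}
voluntary = [x for x in population if rng.random() < response_rate[x]]

# SRS: chance alone picks 200 people.
srs = rng.sample(population, 200)

p_hat_voluntary = sum(voluntary) / len(voluntary)
p_hat_srs = sum(srs) / len(srs)

print(f"true p                         = {true_p:.3f}")
print(f"voluntary response (n = {len(voluntary)}) = {p_hat_voluntary:.3f}")
print(f"SRS (n = {len(srs)})                  = {p_hat_srs:.3f}")
```

Under these assumptions the voluntary response sample is many times larger than the SRS, yet it lands far from the true proportion, while the small SRS lands close to it.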
QUESTIONS TO ASK YOURSELF BEFORE YOU BELIEVE A POLL/SURVEY...
Who carried out the survey?
How was the sample selected?
How large was the sample?
What was the response rate?
How were the subjects contacted?
When was the survey conducted?
What was the exact question asked?
MEASURING THE QUALITY OF A SURVEY… BIAS & VARIABILITY
True value of population parameter (p or μ) is like a bull’s eye on a target
Sample statistic (p̂, x̄, etc.) is like an arrow fired at the target; sometimes it hits the bull’s eye and sometimes it misses
Keep in mind… we (very) often don’t know how close to the bull’s eye we are…
BIAS & VARIABILITY (AIM & PRECISION)
When we take many samples from a population (sampling distribution), bias & variability can look like the following:
BIAS & VARIABILITY (AIM & PRECISION)...
We want: low bias & low variability (good aim & good precision)
Properly chosen statistics computed from random samples of sufficient size will have low bias & low variability (good aim & precision)
Hits the bull’s eye on the target
Can’t eliminate bias & variability (bad aim & precision); can just do all that we can to reduce bias & variability
SAMPLING DISTRIBUTIONS...
The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.
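For a tiny population, "all possible samples of the same size" can be listed outright. The sketch below uses a hypothetical population of 5 exam scores (not the class data) and every possible sample of size 2.

```python
from itertools import combinations
from statistics import fmean

# Hypothetical population of exam scores, invented for illustration.
scores = [60, 70, 75, 80, 90]
n = 2

# Every possible sample of size n, and the sample mean of each one.
sample_means = [fmean(s) for s in combinations(scores, n)]

print("number of possible samples:", len(sample_means))
print("sample means:", sorted(sample_means))
print("population mean:", fmean(scores))
print("mean of all sample means:", fmean(sample_means))
```

The list of sample means is the sampling distribution of the sample mean for this population and sample size. Its center matches the population mean, which is what it means for the sample mean to be an unbiased estimator.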
GROUP/PARTNER ACTIVITY...
The population we will consider is the scores of 10 students on an exam as follows:
The parameter of interest is the mean score in this population, which is 69.4.
The sample is a SRS drawn from the population.
Let’s use RDT (enter at a random line) to draw an SRS of size n = 4 from this population. Calculate the mean of the sample scores. This statistic is an estimate of the population parameter.
Repeat this process 4 times. Write your 4 x̄’s on the board
Input all x̄’s written on the board into Minitab & create a histogram. You are constructing the sampling distribution of x̄.
What is the approximate value of the center of your histogram? What is the shape of this histogram?
SIMULATED SAMPLING DISTRIBUTION …
PRECISION OF AN ESTIMATOR…
The precision (variability; how much it varies) of an estimator does not depend on the size of the population, as long as the population is much larger than the sample; it depends only on the sample size. An estimator based on a sample size of 10 is just as precise in a population of 1000 people as in a population of a million.
Let’s explore this idea more…
DESCRIBING SAMPLING DISTRIBUTIONS...
Consider 1000 SRSs of size n = 100 for the proportion of U.S. adults who watched Survivor Guatemala in 2005
Discuss in your groups some observations you have about this distribution; share out.
SOCS: symmetric, unimodal, no outliers, center at about 0.37, spread from about 0.21 to about 0.53, ≈ Normal
These are values of p̂ (not p); some are low; some are high; most are about 0.37
Remember, this is data from sampling distribution; center of sampling distribution (a statistic, not a parameter) = 0.37
In reality, we rarely know population parameters (center p, or spread σ); why? Discuss.
n = 100 vs. n = 1000 (1000 SRSs for both sampling distributions)
What do you notice?
BIG IDEA FOR LARGER N...
Given that sampling randomization is used properly, the larger the SRS size (n), the smaller the spread (the more tightly clustered; the more precise) of the sampling distribution.
Center doesn’t change significantly
Shape doesn’t change significantly
Spread (range, standard deviation) does change significantly
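The shrinking spread can be checked against the standard formula for the standard deviation of a sample proportion, sqrt(p(1 − p)/n). The formula is a standard result rather than something stated on the slides, and taking p = 0.37 (the center of the simulated distribution) is an assumption made only for this sketch.

```python
import math


def sd_of_sample_proportion(p, n):
    """Standard deviation of p-hat for random samples of size n
    (assumes the population is much larger than the sample)."""
    return math.sqrt(p * (1 - p) / n)


for n in (100, 1000):
    print(n, round(sd_of_sample_proportion(0.37, n), 4))
# 100 0.0483
# 1000 0.0153
```

Multiplying the sample size by 10 divides the spread by about 3.2 (the square root of 10), while the center stays at 0.37.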
SPREAD OF SAMPLING DISTRIBUTION VS. SIZE OF POPULATION
Red, white, & blue marbles (1/3 each) in a 64 ounce cup; well mixed
VERSUS
Red, white, & blue marbles (1/3 each) in a large cargo shipping container; well mixed
Variability (spread, standard deviation) of % of red marbles depends only on size of my scoop (SRS size; n); not the population size
Teaspoon size SRS out of cargo container is not going to vary less than teaspoon size SRS out of 64 ounce cup
RED, WHITE, & BLUE MARBLES (1/3 EACH) IN A 128 OUNCE CUP; WELL MIXED VERSUS RED, WHITE, & BLUE MARBLES (1/3 EACH) IN A LARGE CARGO SHIPPING CONTAINER; WELL MIXED
SRS of scoop size 4 ounces out of either (cup or cargo container) will vary equally
SRS of scoop size 6 ounces will have less variability than SRS of 4 ounces
SRS of scoop size 8 ounces out of either (cargo container or cup) will vary equally
Variability/precision does not depend on the size of the population but rather on the size of the SRS
ANOTHER EXAMPLE...
An SRS of 2,500 from the U.S. population of 300 million is going to be just as precise (same amount of variability) as an SRS of the same size, 2,500, from the San Francisco population of 750,000
Both just as precise (given that population is well-mixed); equally trustworthy
Not about population size; it’s about sample size (n)
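The claim can be checked with the standard formula for an SRS drawn without replacement, which multiplies sqrt(p(1 − p)/n) by the finite population correction sqrt((N − n)/(N − 1)). The formula is a standard result not shown on the slides, and p = 0.5 is a hypothetical value chosen only for this sketch.

```python
import math


def sd_srs_proportion(p, n, N):
    """SD of p-hat for an SRS of size n drawn without replacement
    from a population of size N (finite population correction)."""
    return math.sqrt(p * (1 - p) / n) * math.sqrt((N - n) / (N - 1))


p = 0.5   # hypothetical population proportion
n = 2500  # sample size from the example

sd_us = sd_srs_proportion(p, n, 300_000_000)  # U.S. population
sd_sf = sd_srs_proportion(p, n, 750_000)      # San Francisco population

print(f"U.S.:          {sd_us:.5f}")
print(f"San Francisco: {sd_sf:.5f}")
```

Both values come out at about 0.010: the population size barely matters once the population is many times larger than the sample. Strictly speaking the smaller population gives a very slightly smaller SD, but the difference is negligible.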
BOWL SIZE IS POPULATION…
SO WHY NOT JUST HAVE REALLY LARGE SAMPLE SIZES ALL THE TIME?
Increased sample size improves precision/ reduces variability
Surveys based on larger sample sizes have smaller standard error (SE) and therefore better precision (less variability)
Trade-offs…
Cost increases, time-consuming, etc.
UNBIASED ESTIMATORS…
IF conditions are met:
Sample randomly selected (with or without replacement)
If sampling without replacement, population must be at least 10 times the sample size (rule of thumb)
REMEMBER THE NORMAL DISTRIBUTION?
Life was good and easy with the Normal distribution
Could easily calculate probabilities
Good working model
If we could use the Normal distribution with sampling distributions for proportions, life would be great
Guess what? We can. Meet the Central Limit Theorem
CENTRAL LIMIT THEOREM (CLT)
Has many versions (one for proportions, one for means, etc.)
Let’s discuss proportions for now
To use CLT with proportions, three conditions must be met
CONDITIONS FOR USING CLT (NORMAL DISTRIBUTION) WITH PROPORTIONS…
Random and independent (samples collected randomly from population and observations independent)
Large sample; np ≥ 10 and n(1 – p) ≥ 10; expected numbers of successes and failures are each at least 10
Big population (population at least 10 times sample size)
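The three conditions can be written as a small checklist. This is a sketch: the function name and the example numbers are made up, and the first condition cannot be verified by arithmetic, so it is passed in as a yes/no judgment about how the data were collected.

```python
def clt_conditions_for_proportion(n, p, population_size, random_sample=True):
    """Return which of the three conditions hold for a sample proportion."""
    return {
        # Judgment call about the data collection, supplied by the user.
        "random and independent": random_sample,
        # Expected successes and expected failures are both at least 10.
        "large sample": n * p >= 10 and n * (1 - p) >= 10,
        # Population is at least 10 times the sample size.
        "big population": population_size >= 10 * n,
    }


# Hypothetical example: n = 100, p = 0.37, population of 200 million.
print(clt_conditions_for_proportion(100, 0.37, 200_000_000))

# Hypothetical example that fails: only 20 * 0.10 = 2 expected successes,
# and the population (150) is less than 10 times the sample size.
print(clt_conditions_for_proportion(20, 0.10, 150))
```

The Normal model is appropriate only when all three entries are True.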
CONFIDENCE INTERVALS...WHAT PROPORTION OF ALL COC STUDENTS HAVE AT LEAST ONE TATTOO?
What proportion of us have at least one tattoo? So sample statistic, our p̂ =
If we were to ask another group of COC students, we would get another (likely different) p̂
445 Math 075 students were asked this last Spring; 133/445 = 0.299 = 29.9% had at least one tattoo
Remember, larger n, generally less variation; but still centered at same value (unbiased estimator)
We want to be able to say with a high level of certainty what proportion of all COC students have at least one tattoo. But we don’t know the true, unknown population parameter, p.
SO WHAT CAN WE SAY ABOUT OUR COC STUDENT POPULATION AND THEIR TATTOOS?
We don’t know p (population parameter)
We do know p̂ (sample statistic)
Our estimator is unbiased (what does that mean?)
SD (SE) = ?
Sample size is large; ‘randomly’ selected; big population
So... our distribution is ≈ Normal, centered around p̂; 68% within 1 SD; 95% within 2 SDs; 99.7% within 3 SDs
SO WHAT CAN WE SAY ABOUT OUR COC STUDENT POPULATION?
Our distribution is ≈ Normal, centered around p̂; 68% within 1 SD; 95% within 2 SDs; 99.7% within 3 SDs
So, we are highly confident (95%) that the unknown population parameter, the proportion of all COC students that have at least one tattoo, is between ___ and ___.
This is a confidence interval
SO WHAT CAN WE SAY ABOUT OUR COC STUDENT POPULATION?
Our distribution is ≈ Normal, centered around p̂; 68% within 1 SD; 95% within 2 SDs; 99.7% within 3 SDs
So, we are highly confident (99.7%) that the unknown population parameter, the proportion of all COC students that have at least one tattoo, is between ___ and ___.
This is a confidence interval
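As an illustration of the arithmetic, the sketch below builds the 68%, 95%, and 99.7% intervals from the earlier Math 075 figures (133 of 445 students). Two assumptions are made here: the standard error uses p̂ in place of the unknown p (the usual approximation), and the 445 students are treated as if they were a random sample of all COC students, which may not hold.

```python
import math

successes, n = 133, 445          # Math 075 figures quoted earlier
p_hat = successes / n            # sample proportion
se = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error

intervals = {}
for level, k in (("68%", 1), ("95%", 2), ("99.7%", 3)):
    # Interval = estimate plus or minus k standard errors.
    intervals[level] = (p_hat - k * se, p_hat + k * se)
    low, high = intervals[level]
    print(f"{level:>5}: {low:.3f} to {high:.3f}")

print(f"p-hat = {p_hat:.3f}, SE = {se:.4f}")
```

With these figures the 95% interval runs from roughly 0.26 to 0.34, and the 99.7% interval from roughly 0.23 to 0.36. Intervals for the class's own sample would use the class's p̂ and n instead.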
STATISTICAL INFERENCE...
Statistical inference provides methods for drawing conclusions about a population based on sample data
Methods used for statistical inference assume that the data was produced by a properly randomized design
We will discuss confidence intervals, which are based on sampling distributions of statistics, now; then we will discuss hypothesis testing (another form of inference)
INFERENCE: CONFIDENCE INTERVALS... Estimator ± margin of error (MOE)
Margin of error tells us the amount by which our estimate is likely to be ‘off’
Margin of error helps account for sampling variability (NOT any of the biases we discussed: voluntary response, non-response, etc.)
![Page 60: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/60.jpg)
CONFIDENCE LEVELS: HOW CONFIDENT ARE YOU... that the average temperature in Santa
Clarita in degrees Fahrenheit is between -50 and 150?
that the average temperature in Santa Clarita in degrees Fahrenheit is between 70 and 70.001?
![Page 61: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/61.jpg)
CONFIDENCE LEVELS: HOW CONFIDENT ARE YOU... that the average temperature in Santa
Clarita in degrees Fahrenheit is between -50 and 150?
that the average temperature in Santa Clarita in degrees Fahrenheit is between 70 and 70.001?
In general, a larger interval gives a higher confidence level; a smaller interval gives a lower confidence level
![Page 62: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/62.jpg)
TYPICAL CONFIDENCE LEVELS... 99% confidence level
95% confidence level
90% confidence level
Typically we want both: a reasonably high confidence level AND a reasonably small interval; but there are trade-offs; more on this in a little bit
![Page 63: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/63.jpg)
WE ARE 95% CONFIDENT IN OUR METHOD... GIVES CORRECT RESULTS 95% OF THE TIME...
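One way to see what "correct 95% of the time" means is to simulate many samples from a population where p is known and count how often the interval captures it. This is only a sketch: the true p of 0.30, the sample size, the number of repetitions and the function name are all assumptions chosen for illustration.

```python
import random
from math import sqrt

def coverage(p, n, reps, z=1.96, seed=0):
    """Fraction of simulated intervals p_hat ± z*SE that contain the true p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Draw one random sample of size n and count the successes
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        moe = z * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - moe <= p <= p_hat + moe:
            hits += 1
    return hits / reps

rate = coverage(p=0.30, n=400, reps=2000)
print(f"{rate:.1%} of the intervals captured p")
```

Any single interval either contains p or it does not; the 95% describes the long-run success rate of the method.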
![Page 64: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/64.jpg)
CONFIDENCE INTERVALS... Will we ever know for sure if we
captured the true unknown population parameter p? No. Actual p is unknown.
Memorize: “I am ___% confident that the true, unknown population proportion of (context) is between ____ and ____.”
![Page 65: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/65.jpg)
PRACTICE: TRUE OR FALSE? There is a 95% probability (chance) that
the interval from 0.80 to 0.92 contains p.
![Page 66: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/66.jpg)
PRACTICE: TRUE OR FALSE? There is a 95% probability (chance) that
the interval from 0.80 to 0.92 contains p.
False; The probability is either 0 or 1 (but we don’t know which)
![Page 67: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/67.jpg)
PRACTICE: TRUE OR FALSE? There is a 95% chance that the interval
(0.17, 0.24) contains p̂
![Page 68: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/68.jpg)
PRACTICE: TRUE OR FALSE? There is a 95% chance that the interval
(0.17, 0.24) contains p̂
False. The general form of a CI is p̂ ± MOE. So, p̂ will always be in the center of the CI.
![Page 69: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/69.jpg)
PRACTICE: TRUE OR FALSE? There’s a 95% probability that p = .112
![Page 70: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/70.jpg)
PRACTICE: TRUE OR FALSE? There’s a 95% probability that p = .112
False. Never use this wording: don’t say ‘probability,’ and the statement doesn’t mention the CI either
![Page 71: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/71.jpg)
PRACTICE: TRUE OR FALSE? We are 95% confident that the true,
unknown population parameter, p, of freshmen who have genius-level IQ scores from this particular university, is between 0.010 and 0.018
![Page 72: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/72.jpg)
PRACTICE: TRUE OR FALSE? We are 95% confident that the true,
unknown population parameter, p, of freshmen who have genius-level IQ scores from this particular university, is between 0.010 and 0.018
True; perfect wording for a CI interpretation. It doesn’t use the word “probability,” it gives the context, and it states the confidence level.
![Page 73: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/73.jpg)
TO REVIEW...WHAT AFFECTS THE LENGTH OF CI’S? The lower the confidence level (say 10% confident), the shorter the CI
The higher the confidence level (say 99% confident), the longer the CI
What else affects the length of the CI?
Larger the n, shorter the CI (small MOE)
Smaller the n, longer the CI (large MOE)
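Both effects can be checked numerically. The sketch below uses the normal-approximation margin of error with p̂ = 0.5; the function name and the particular confidence levels and sample sizes are illustrative choices, not values from the slides.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence):
    """Normal-approximation margin of error for a sample proportion."""
    # z* leaves (1 - confidence) / 2 in each tail of the standard Normal
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z_star * sqrt(p_hat * (1 - p_hat) / n)

for confidence in (0.90, 0.95, 0.99):
    for n in (100, 400, 1600):
        moe = margin_of_error(0.5, n, confidence)
        print(f"CL {confidence:.0%}, n = {n:4d}: MOE = {moe:.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.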
![Page 74: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/74.jpg)
EXAMPLE... HIGH CL & SMALL MOE So, if you want (need) a high confidence level
AND a small(er) interval (margin of error), it is possible if you are willing to increase n
Can be expensive, time-consuming
Sometimes not realistic (why?)
In reality, you may need to compromise on the confidence level (lower confidence level)
![Page 75: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/75.jpg)
PRACTICE... Alcohol abuse is considered by some as the
number one problem on a campus. How common is it? A 2001 SRS of 10,904 U.S. college students collected information on drinking behavior and alcohol-related problems. The researchers defined “frequent binge drinking” as having 5 or more drinks in a row 3 or more times in the past 2 weeks. According to this definition, 2486 students were classified as frequent binge drinkers.
Based on these data, what can we say about the proportion of all college students who have engaged in frequent binge drinking?
![Page 76: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/76.jpg)
N = 10,904; X = 2,486; CONFIDENCE LEVEL: 99% Check conditions to create a confidence interval... Randomization, Normality, Independence
Randomization: SRS stated in problem
Normality (via CLT): np ≥ 10; n(1 – p) ≥ 10
Independence: population must be at least 10 times sample size.
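The three checks can be sketched in code. Two assumptions are made here: since p is unknown, the sample proportion stands in for it in the Normality check, and the population size is an assumed round figure that does not come from the problem. The function and constant names are hypothetical.

```python
def check_conditions(n, successes, population_size, is_random_sample):
    """Check the three conditions for a one-proportion confidence interval."""
    p_hat = successes / n  # stands in for the unknown p
    return {
        "randomization": is_random_sample,
        "normality": n * p_hat >= 10 and n * (1 - p_hat) >= 10,
        "independence": population_size >= 10 * n,
    }

# Assumed round figure for the number of U.S. college students (not from the problem).
ASSUMED_POPULATION = 15_000_000

conditions = check_conditions(10_904, 2_486, ASSUMED_POPULATION, is_random_sample=True)
print(conditions)
```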
![Page 77: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/77.jpg)
N = 10,904; X = 2,486; CONFIDENCE LEVEL: 99% Perform Minitab calculations 1 sample, proportion Options, 99 CL Data, summarized data, events, trials “+”, data labels
Always conclude with an interpretation, in context: I am 99% confident that the true, unknown
population parameter, p, the proportion of all college students who have engaged in frequent binge drinking, is between 21.8% and 23.8%.
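For anyone without Minitab, here is a sketch of the calculation using the normal-approximation formula p̂ ± z*·SE. Statistical software may default to a different (exact) method, so its endpoints can differ slightly in the last digit. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    moe = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - moe, p_hat + moe

low, high = proportion_ci(2_486, 10_904, 0.99)
print(f"99% CI: {low:.1%} to {high:.1%}")
```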
![Page 78: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/78.jpg)
CHOOSING A SAMPLE SIZE... Often researchers choose the MOE & CL
they want ahead of time/before survey
So they need to have a particular n to achieve the MOE and the CL they want.
$\text{MOE} = z^* \sqrt{\dfrac{p(1-p)}{n}}$
![Page 79: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/79.jpg)
CHOOSING A SAMPLE SIZE... Often researchers choose the MOE
A common CL is 95%, so z* ≈ 2
Can solve for n & get formula in textbook
$m = 2 \sqrt{\dfrac{0.5(1-0.5)}{n}}$
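A sketch of the algebra for solving the margin-of-error relationship for n, where m is the desired margin of error. The general version with z* and p is an added intermediate step, not copied from the slide or the textbook.

$$
m = z^* \sqrt{\frac{p(1-p)}{n}}
\;\Longrightarrow\;
m^2 = (z^*)^2 \, \frac{p(1-p)}{n}
\;\Longrightarrow\;
n = \left(\frac{z^*}{m}\right)^2 p(1-p)
$$

With z* ≈ 2 and p = 0.5 this reduces to

$$
n = \left(\frac{2}{m}\right)^2 (0.5)(0.5) = \frac{1}{m^2}
$$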
![Page 80: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/80.jpg)
PRACTICE... CHOOSING A SAMPLE SIZE... A company has received complaints about its
customer service. They intend to hire a consultant to carry out a survey of customers. Before contacting the consultant, the company president wants some idea of the sample size that she will be required to pay for.
One critical question is the degree of satisfaction with the company's customer service. The president wants to estimate the proportion p of customers who are satisfied. She decides that she wants the estimate to be within 3% (0.03) at a 95% confidence level.
![Page 81: Know the symbols and the meanings Drawing conclusions about a population on the basis of observing only a small subset of that population Always](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e215503460f94b0e2cd/html5/thumbnails/81.jpg)
CHOOSING A SAMPLE SIZE... No idea of the true proportion p of
satisfied customers; so use p = 0.5. The sample size required is given by
$\text{MOE} = z^* \sqrt{\dfrac{p(1-p)}{n}}$
$0.03 = 2 \sqrt{\dfrac{0.5(1-0.5)}{n}}$
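A minimal sketch of finishing the calculation: solve for n and round up, since a fractional customer cannot be surveyed. The function name is hypothetical, and the second call uses the more precise z* = 1.96 as an added comparison that is not part of the slide.

```python
from math import ceil

def required_sample_size(moe, z_star=2.0, p=0.5):
    """Smallest n for which z_star * sqrt(p(1-p)/n) is at most moe."""
    return ceil((z_star / moe) ** 2 * p * (1 - p))

print(required_sample_size(0.03))               # using z* ≈ 2
print(required_sample_size(0.03, z_star=1.96))  # using z* = 1.96
```

Using p = 0.5 is the cautious choice: p(1 − p) is largest there, so the resulting n is large enough whatever the true proportion turns out to be.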