in 2007, deaths of a large number of pet dogs and cats were ultimately traced to contamination of...
TRANSCRIPT
In 2007, deaths of a large number of pet dogs and cats were ultimately traced to contamination of some brands of pet food. The manufacturer NOW claims that the food is safe, but before it can be released, an experiment to test whether the food is now safe for dogs and cats to eat must be conducted.
A group of 32 dog owners have volunteered their pets for this experimental study. Of the 32 dogs, 16 are poodles and 16 are German shepherds. The dogs will eat the assigned food for a period of 6 weeks.
We believe that because of differences in body size, the two different breeds may be affected differently by potential contaminants in the dog food.
Explain how you would carry out a completely randomized experiment to see if the new food is safe for dogs to eat.
Group of 32 dogs
Group 1: 16 dogs
Treatment 1: Dogs eat new food for 6 weeks
Group 2: 16 dogs
Treatment 2: Dogs eat “safe” food for 6 weeks
Compare health of dogs, to be evaluated by veterinarian
Remember: a completely randomized experiment means NO BLOCKING ALLOWED!!!
We will number the dogs from 01 to 32, then use a random number generator (or table) to select 16 dogs (ignoring repeated numbers) for treatment group 1 (new food from the company). The rest of the dogs will be placed in treatment group 2 (“safe” food).
AP Grading Criteria: If two knowledgeable statistics users read your description, will they use the same method to assign experimental units to treatments?
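The assignment procedure described above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `completely_randomized` and the seed value are made up for the example, and `random.sample` stands in for the random number generator or table.

```python
import random

def completely_randomized(n_units=32, group_size=16, seed=None):
    """Split units numbered 1..n_units into two treatment groups at random."""
    rng = random.Random(seed)                 # seed is optional; used here only for repeatability
    labels = list(range(1, n_units + 1))      # dogs numbered 01 to 32
    # sample() draws without replacement, which matches "ignoring repeated numbers"
    group1 = sorted(rng.sample(labels, group_size))
    # every dog not selected goes to the other treatment
    group2 = sorted(set(labels) - set(group1))
    return group1, group2

new_food, safe_food = completely_randomized(seed=2007)
print("Treatment 1 (new food): ", new_food)
print("Treatment 2 (safe food):", safe_food)
```

Note that breed plays no role in the selection, which is exactly what "no blocking" means.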
Of the 32 dogs, 16 are poodles and 16 are German shepherds (we believe different breeds may react differently to contaminants in the food). Explain the changes you would make to your previous design by incorporating blocking.
32 dogs
Block A: 16 poodles
Group 1: 8 dogs, Treatment 1: Dogs eat new food for 6 weeks
Group 2: 8 dogs, Treatment 2: Dogs eat “safe” food for 6 weeks
Compare health of dogs
Block B: 16 German shepherds
Group 3: 8 dogs, Treatment 1: Dogs eat new food for 6 weeks
Group 4: 8 dogs, Treatment 2: Dogs eat “safe” food for 6 weeks
Compare health of dogs
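A blocked version of the random assignment might look like the sketch below. The function name `randomized_block` and the dog labels (P01 to P16, G01 to G16) are invented for illustration. The key point is that the randomization is done separately inside each breed.

```python
import random

def randomized_block(blocks, seed=None):
    """Randomly split each block in half: one half per treatment."""
    rng = random.Random(seed)
    design = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)                 # randomize WITHIN the block only
        half = len(shuffled) // 2
        design[block_name] = {
            "new food": sorted(shuffled[:half]),
            "safe food": sorted(shuffled[half:]),
        }
    return design

# Hypothetical labels for the 16 poodles and 16 German shepherds
blocks = {
    "poodles": [f"P{i:02d}" for i in range(1, 17)],
    "German shepherds": [f"G{i:02d}" for i in range(1, 17)],
}
design = randomized_block(blocks, seed=2007)
for breed, groups in design.items():
    for treatment, dogs in groups.items():
        print(breed, "/", treatment, ":", dogs)
```

Results are then compared within each breed, so a breed effect cannot be mistaken for a food effect.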
Group of 40 volunteers
Group 1: 20 patients
Treatment 1: Patient takes the new pill
Group 2: 20 patients
Control: Patient gets placebo
Compare numbers of headaches…
NEVER call your subjects a “random sample” unless you KNOW for a FACT that they really were a random sample of the population.
With experiments, you are almost always dealing with VOLUNTEERS (think about it!)
Matched pairs: a special type of block design
• pair up experimental units according to similar characteristics
• randomly assign one to one treatment & the other automatically gets the 2nd treatment
• Or have each unit do both treatments in random order (such as before/after, or a taste test with Coke/Pepsi)
• the assignment of treatments is dependent
Pair experimental units according to specific characteristics.
Next, randomly assign one unit from a pair to Treatment A. The other unit gets Treatment B.
This is one way to do a matched pairs design; another way is to have each individual unit do both treatments (as in a taste test).
In each pair, assign one unit the number “1” and the other the number “2”.
• In each block (pair), we will flip a fair coin such that if the side of the coin facing up is…
• “heads”, #1 will get treatment A (and #2 will get treatment B)
• “tails”, #2 will get treatment A (and #1 will get treatment B)
Just make sure you flip a coin for EACH pair!
Do not write:
• If we flip “heads”, then ALL of the #1’s get treatment A (and ALL of the #2’s get treatment B)…
• You must give each #1 (and #2) a fair chance of going either way.
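The coin-flip rule for matched pairs can be sketched as follows. The pair labels and the function name `assign_matched_pairs` are hypothetical, and `random.choice` plays the role of the fair coin. The coin is flipped inside the loop, once per pair, never once for the whole study.

```python
import random

def assign_matched_pairs(pairs, seed=None):
    """Flip a fair coin for EACH pair to decide who gets treatment A."""
    rng = random.Random(seed)
    assignments = []
    for unit1, unit2 in pairs:
        coin = rng.choice(["heads", "tails"])    # a new flip for every pair
        if coin == "heads":
            # heads: #1 gets treatment A, #2 gets treatment B
            assignments.append({"coin": coin, "A": unit1, "B": unit2})
        else:
            # tails: #2 gets treatment A, #1 gets treatment B
            assignments.append({"coin": coin, "A": unit2, "B": unit1})
    return assignments

# Hypothetical pairs: (unit numbered "1", unit numbered "2")
pairs = [(f"pair{k}-#1", f"pair{k}-#2") for k in range(1, 5)]
assignments = assign_matched_pairs(pairs, seed=2007)
for row in assignments:
    print(row)
```

Because each pair gets its own flip, every #1 has an equal chance of ending up in either treatment, independently of the other pairs.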
(shampoo worksheet)
Slide 3- 12
Chapter 3
Displaying and Describing Categorical Data
types of cars
color of hair
gender
grade level?
Slide 3- 13
The Three Rules of Data Analysis
The three rules of data analysis won’t be difficult to remember:
1. Make a picture: things may be revealed that are not obvious in the raw data. These will be things to think about.
2. Make a picture: important features of and patterns in the data will show up. You may also see things that you did not expect.
3. Make a picture: the best way to tell others about your data is with a well-chosen picture.
Slide 3- 14
Slide 3- 15
Slide 3- 16
Launched: 31st May 1911
Builders: Harland and Wolff, Belfast
Port of Registry: Liverpool
Passengers Lost: 818 (62%)
Crew Lost: 684 (77%)
Total Lost: 1,502 (68%)
Slide 3- 17
Frequency Tables: Making Piles
Records counts and category names.
Slide 3- 18
Relative Frequency Tables
Percentages (proportions) instead of counts.
Slide 3- 19
Both describe the distribution of a categorical variable.
Distribution: name of categories and how frequently each occurs
Frequency distribution / Relative frequency distribution
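As a small illustration of the two kinds of table, the sketch below builds a frequency table and a relative frequency table for ticket class, using the class totals that appear in the Titanic table later in these slides. The variable names are only for the example.

```python
# Counts per category (class totals from the Titanic table in these slides)
counts = {"First": 325, "Second": 285, "Third": 706, "Crew": 885}

total = sum(counts.values())

# Relative frequency = count / total, one proportion per category
relative = {category: count / total for category, count in counts.items()}

print(f"{'Class':<8}{'Count':>6}{'Percent':>9}")
for category, count in counts.items():
    print(f"{category:<8}{count:>6}{relative[category]:>9.1%}")
print(f"{'Total':<8}{total:>6}{sum(relative.values()):>9.1%}")
```

The two tables carry the same information about the distribution; the relative version just rescales the counts so they add to 100%.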
Slide 3- 20
What do you see?
When we look at each ship, we see the area taken up by the ship, instead of the length of the ship.
This is a violation of the “Area Principle”.
Slide 3- 21
Bar Charts
A bar chart displays the distribution of a categorical variable, showing the counts for each category next to each other for easy comparison.
A bar chart stays true to the area principle.
For bar charts (with categorical data), be sure to leave spaces between the bars!!!
Slide 3- 22
Bar Charts
A relative frequency bar chart displays the relative proportion of counts for each category.
Slide 3- 23
Pie Charts
When you are interested in parts of the whole, a pie chart might be your display of choice.
Slide 3- 24
What Can Go Wrong?
While some people might like the pie chart on the left better, it makes it harder to compare fractions of the whole, which is what a well-done pie chart lets you do.
Slide 3- 25
What Can Go Wrong?
This plot of the percentage of high-school students who engage in specified dangerous behaviors has a problem. Can you see it?
If you are making a pie chart with percentages (or proportions), make sure the percentages add up to 100%!!!
Back to the Titanic…
Slide 3- 27
A contingency table allows us to look at two categorical variables together.
                        Class
Survival    First   Second   Third   Crew   Total
Alive         203      118     178    212     711
Dead          122      167     528    673    1490
Total         325      285     706    885    2201
(The Total row and Total column are the marginal distributions.)
Slide 3- 28
What percent of the people on the Titanic died? 1490/2201 = 67.7%
What percent of the people were surviving crew? 212/2201 = 9.6%
*What percent of the survivors were First class? 203/711 = 28.6%
*What percent of First class survived? 203/325 = 62.5%
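The four percentages differ only in which total goes in the denominator. The sketch below reproduces them from the cell counts in the Titanic table; the dictionary layout and variable names are choices made for this example.

```python
# Cell counts from the Titanic contingency table (Survival by Class)
table = {
    "Alive": {"First": 203, "Second": 118, "Third": 178, "Crew": 212},
    "Dead":  {"First": 122, "Second": 167, "Third": 528, "Crew": 673},
}

# Marginal totals are computed from the cells rather than typed in
row_totals = {status: sum(row.values()) for status, row in table.items()}
col_totals = {cls: sum(table[status][cls] for status in table)
              for cls in table["Alive"]}
grand_total = sum(row_totals.values())

# Out of everyone on board (denominator: grand total)
pct_died = 100 * row_totals["Dead"] / grand_total
pct_surviving_crew = 100 * table["Alive"]["Crew"] / grand_total

# Out of the survivors only (denominator: Alive row total)
pct_survivors_first = 100 * table["Alive"]["First"] / row_totals["Alive"]

# Out of First class only (denominator: First column total)
pct_first_survived = 100 * table["Alive"]["First"] / col_totals["First"]

print(f"Died:                      {pct_died:.1f}%")
print(f"Surviving crew:            {pct_surviving_crew:.1f}%")
print(f"Survivors in First class:  {pct_survivors_first:.1f}%")
print(f"First class who survived:  {pct_first_survived:.1f}%")
```

The last two questions use the same numerator (203) but different denominators, which is why the starred questions need careful reading.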
Slide 3- 29
A conditional distribution shows the distribution of one variable for just the individuals who satisfy some condition on another variable.
Slide 3- 30
Conditional Distributions
The conditional distributions tell us that there is a difference in class for those who survived and those who perished.
Pie charts of the two distributions:
Slide 3- 31
We see that the distribution of Class for the survivors is different from that of the non-survivors…
so class and survival are associated (they are dependent).
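To make the comparison concrete, the sketch below computes the conditional distribution of Class separately for survivors and non-survivors, using the counts from the Titanic table. The helper name `conditional_distribution` is made up for the example.

```python
def conditional_distribution(row):
    """Turn one row of counts into proportions that sum to 1."""
    total = sum(row.values())
    return {category: count / total for category, count in row.items()}

# Counts of Class, conditioned on survival status
alive = {"First": 203, "Second": 118, "Third": 178, "Crew": 212}
dead = {"First": 122, "Second": 167, "Third": 528, "Crew": 673}

class_given_alive = conditional_distribution(alive)
class_given_dead = conditional_distribution(dead)

print(f"{'Class':<8}{'Survivors':>11}{'Non-survivors':>15}")
for cls in alive:
    print(f"{cls:<8}{class_given_alive[cls]:>11.1%}{class_given_dead[cls]:>15.1%}")
```

If the two columns of percentages were essentially the same, there would be no sign of an association; here they clearly differ.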
Slide 3- 32
independent = no association
dependent = association
The variables would be considered independent if the distribution of one variable were the same for all categories of the other variable.
Slide 3- 33
Segmented Bar Charts
A segmented bar chart displays the same information as a pie chart, but in the form of bars instead of circles.
[Segmented bar chart; vertical axis: Proportion]
Slide 3- 34
Here’s a look at gender versus level of education in the fictitious town of Podunk (home of Podunk University!)
                          Level of Education
Gender    Not High School Graduate   High School Graduate*   College Graduate   Total
Male      318 (29.3%)                603 (55.5%)             165 (15.2%)        1086 (100%)
Female    212 (29.3%)                402 (55.5%)             110 (15.2%)        724 (100%)
Total     530 (29.3%)                1005 (55.5%)            275 (15.2%)        1810 (100%)
*and not a college graduate
The distributions for each gender are the same, so gender is independent of level of education (no association).
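The independence check for Podunk can be written out as a short sketch: compute the distribution of education level within each gender and see whether the rows agree. The function names and the tolerance of 0.1 percentage points are assumptions made for this illustration. With real data the rows would rarely match exactly, so "the same" is always a judgment call.

```python
def row_percentages(row):
    """Percent of the row total that falls in each category."""
    total = sum(row.values())
    return {category: 100 * count / total for category, count in row.items()}

def looks_independent(table, tolerance=0.1):
    """True if every row has (nearly) the same percentage distribution.

    tolerance is in percentage points and is an arbitrary choice for this sketch.
    """
    distributions = [row_percentages(row) for row in table.values()]
    first = distributions[0]
    return all(
        abs(dist[category] - first[category]) <= tolerance
        for dist in distributions[1:]
        for category in first
    )

# Cell counts from the Podunk table (gender by level of education)
podunk = {
    "Male":   {"Not HS grad": 318, "HS grad": 603, "College grad": 165},
    "Female": {"Not HS grad": 212, "HS grad": 402, "College grad": 110},
}

for gender, row in podunk.items():
    shown = ", ".join(f"{k}: {v:.1f}%" for k, v in row_percentages(row).items())
    print(f"{gender:<7} {shown}")
print("Looks independent:", looks_independent(podunk))
```

Running the same check on a table like the Titanic one, where the conditional distributions differ, would return False.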
Slide 3- 35
Podunk: Level of Education by Gender
Gender    Not High School Graduate   High School Graduate*   College Graduate   Total
Male      318 (29.3%)                603 (55.5%)             165 (15.2%)        1086 (100%)
Female    212 (29.3%)                402 (55.5%)             110 (15.2%)        724 (100%)
Total     530 (29.3%)                1005 (55.5%)            275 (15.2%)        1810 (100%)
[Pie charts: Male and Female, each divided into Not High School Graduate, High School Graduate (but not college grad), and College Graduate]
Gender is independent of level of education (no association).
Slide 3- 36
In which region do the greatest number of people wear seatbelts?
Slide 3- 37
The Midwest has the smallest proportion of car drivers wearing seat belts (about 62%), whereas the South and West have the largest proportions (about 78-80%).
Overall, the bar chart shows that all four regions of the country have more than 60% of car drivers wearing seat belts.
Note: we are using the word “proportion” (or “percentage”)…
…NOT the word “number”
Slide 3- 38
What Can Go Wrong?
Be sure to use enough individuals!
Do not make a report like “We found that 66.67% of the rats improved their performance with training. The other rat died.”
Slide 3- 39
What Can Go Wrong? (cont.)
Don’t use unfair or silly averages.
We need data for next time! (average hair length)