Chapter 1Picturing Distributions with Graphs
BPS - 5th Ed.Chapter 1 1
Statistics
BPS - 5th Ed.Chapter 1 2
Statistics is a science that involves the extraction of information from numerical data obtained during an experiment or from a sample. It involves the design of the experiment or sampling procedure, the collection and analysis of the data, and making inferences (statements) about the population based upon information in a sample.
Individuals and Variables
Individuals the objects described by a set of data may be people, animals, or things
Variable any characteristic of an individual can take different values for different individuals
BPS - 5th Ed.Chapter 1 3
Variables
Categorical Places an individual into one of several groups or categories
Quantitative (Numerical) Takes numerical values for which arithmetic operations such as
adding and averaging make sense
BPS - 5th Ed.Chapter 1 4
Distribution
Tells what values a variable takes and how often it takes these values
Can be a table, graph, or function
BPS - 5th Ed.Chapter 1 5
Displaying Distributions
Categorical variables Pie charts Bar graphs
Quantitative variables Histograms Stemplots (stem-and-leaf plots)
BPS - 5th Ed.Chapter 1 6
Class Make-up on First Day
BPS - 5th Ed.Chapter 1 7
Year Count Percent
Freshman 18 41.9%
Sophomore 10 23.3%
Junior 6 14.0%
Senior 9 20.9%
Total 43 100.1%
Data Table
Class Make-up on First Day
BPS - 5th Ed.Chapter 1 8
Pie Chart
Class Make-up on First Day
BPS - 5th Ed.Chapter 1 9
Bar Graph
Example: U.S. Solid Waste (2000)
Material Weight (million tons) Percent of total
Food scraps 25.9 11.2 %
Glass 12.8 5.5 %
Metals 18.0 7.8 %
Paper, paperboard 86.7 37.4 %
Plastics 24.7 10.7 %
Rubber, leather, textiles 15.8 6.8 %
Wood 12.7 5.5 %
Yard trimmings 27.7 11.9 %
Other 7.5 3.2 %
Total 231.9 100.0 %
BPS - 5th Ed. Chapter 1 10
Data Table
Example: U.S. Solid Waste (2000)
BPS - 5th Ed.Chapter 1 11
Pie Chart
Example: U.S. Solid Waste (2000)
BPS - 5th Ed.Chapter 1 12
Bar Graph
Examining the Distribution of Quantitative Data
Overall pattern of graph Deviations from overall pattern Shape of the data Center of the data Spread of the data (Variation) Outliers
BPS - 5th Ed.Chapter 1 13
Shape of the Data
Symmetric bell shaped other symmetric shapes
Asymmetric right skewed left skewed
Unimodal, bimodal
BPS - 5th Ed.Chapter 1 14
SymmetricBell-Shaped
BPS - 5th Ed. Chapter 1 15
SymmetricMound-Shaped
BPS - 5th Ed. Chapter 1 16
SymmetricUniform
BPS - 5th Ed. Chapter 1 17
AsymmetricSkewed to the Left
BPS - 5th Ed. Chapter 1 18
AsymmetricSkewed to the Right
BPS - 5th Ed. Chapter 1 19
Outliers
Extreme values that fall outside the overall pattern May occur naturally May occur due to error in recording May occur due to error in measuring Observational unit may be fundamentally different
BPS - 5th Ed.Chapter 1 20
Histograms
For quantitative variables that take many values
Divide the possible values into class intervals (we will only consider equal widths)
Count how many observations fall in each interval (may change to percents)
Draw picture representing distribution
BPS - 5th Ed.Chapter 1 21
Histograms: Class Intervals How many intervals?
One rule is to calculate the square root of the sample size, and round up.
Size of intervals? Divide range of data (maxmin) by number of
intervals desired, and round to convenient number
Pick intervals so each observation can only fall in exactly one interval (no overlap)
BPS - 5th Ed.Chapter 1 22
Case Study
BPS - 5th Ed.Chapter 1 23
Weight Data
Introductory Statistics classSpring, 1997
Virginia Commonwealth University
Weight Data
BPS - 5th Ed.Chapter 1 24
Weight Data: Frequency Table
BPS - 5th Ed. Chapter 1 25
sqrt(53) = 7.2, or 8 intervals; range (260100=160) / 8 = 20 = class width
Weight Data: Histogram
BPS - 5th Ed. Chapter 1 26
100 120 140 160 180 200 220 240 260 280Weight
* Left endpoint is included in the group, right endpoint is not.
Nu
mb
er
of s
tude
nts
Stemplots (Stem-and-Leaf Plots)
For quantitative variablesSeparate each observation into a stem
(first part of the number) and a leaf (the remaining part of the number)
Write the stems in a vertical column; draw a vertical line to the right of the stems
Write each leaf in the row to the right of its stem; order leaves if desired
BPS - 5th Ed.Chapter 1 27
Weight Data
BPS - 5th Ed.Chapter 1 28
12
Weight Data:Stemplot(Stem & Leaf Plot)
BPS - 5th Ed. Chapter 1 29
1011121314151617181920212223242526
Key
20|3 means203 pounds
Stems = 10’sLeaves = 1’s
192
2
1522
5
135
Weight Data:Stemplot(Stem & Leaf Plot)
BPS - 5th Ed. Chapter 1 30
10 016611 00912 003457813 0035914 0815 0025716 55517 00025518 00005556719 24520 321 02522 023242526 0
Key
20|3 means203 pounds
Stems = 10’sLeaves = 1’s
Extended Stem-and-Leaf Plots
If there are very few stems (when the data cover only a
very small range of values), then we may want to create
more stems by splitting the original stems.
BPS - 5th Ed.Chapter 1 31
Extended Stem-and-Leaf PlotsExample: if all of the data values were between 150 and 179, then we may choose to use the following stems:
BPS - 5th Ed.Chapter 1 32
151516161717
Leaves 0-4 would go on each upper stem (first “15”), and leaves 5-9 would go on each lower stem (second “15”).