Data Analysis and Probability
Insights into Data
Mathematics in Context is a comprehensive curriculum for the middle grades. It was developed in 1991 through 1997 in collaboration with the Wisconsin Center for Education Research, School of Education, University of Wisconsin-Madison and the Freudenthal Institute at the University of Utrecht, The Netherlands, with the support of the National Science Foundation Grant No. 9054928.
The revision of the curriculum was carried out in 2003 through 2005, with the support of the National Science Foundation Grant No. ESI 0137414.
National Science Foundation
Opinions expressed are those of the authors and not necessarily those of the Foundation.
© 2010 Encyclopædia Britannica, Inc. Britannica, Encyclopædia Britannica, the thistle logo, Mathematics in Context, and the Mathematics in Context logo are registered trademarks of Encyclopædia Britannica, Inc.
All rights reserved.
No part of this work may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage or retrieval system, without permission in writing from the publisher.
International Standard Book Number 978-1-59339-960-3
Printed in the United States of America
1 2 3 4 5 C 13 12 11 10 09
Wijers, M., de Lange, J., Bakker, A., Shafer, M. C., & Burrill, G. (2010). Insights into data. In Wisconsin Center for Education Research & Freudenthal Institute (Eds.), Mathematics in context. Chicago: Encyclopædia Britannica, Inc.
The Mathematics in Context Development Team
Development 1991–1997
The initial version of Insights into Data was developed by Monica Wijers and Jan de Lange. It was adapted for use in American schools by Mary C. Shafer and Gail Burrill.
Wisconsin Center for Education Research Staff
Thomas A. Romberg, Director
Joan Daniels Pedro, Assistant to the Director
Gail Burrill, Coordinator
Margaret R. Meyer, Coordinator
Freudenthal Institute Staff
Jan de Lange, Director
Els Feijs, Coordinator
Martin van Reeuwijk, Coordinator
Project Staff
Jonathan Brendefur; Sherian Foster; Mieke Abels; Jansie Niehaus
Laura Brinker; James A. Middleton; Nina Boswinkel; Nanda Querelle
James Browne; Jasmina Milinkovic; Frans van Galen; Anton Roodhardt
Jack Burrill; Margaret A. Pligge; Koeno Gravemeijer; Leen Streefland
Rose Byrd; Mary C. Shafer; Marja van den Heuvel-Panhuizen
Peter Christiansen; Julia A. Shew; Jan Auke de Jong; Adri Treffers
Barbara Clarke; Aaron N. Simon; Vincent Jonker; Monica Wijers
Doug Clarke; Marvin Smith; Ronald Keijzer; Astrid de Wild
Beth R. Cole; Stephanie Z. Smith; Martin Kindt
Fae Dremock; Mary S. Spence
Mary Ann Fix
Revision 2003–2005
The revised version of Insights into Data was developed by Arthur Bakker and Monica Wijers. It was adapted for use in American schools by Gail Burrill.
Wisconsin Center for Education Research Staff
Thomas A. Romberg, Director
David C. Webb, Coordinator
Gail Burrill, Editorial Coordinator
Margaret A. Pligge, Editorial Coordinator
Freudenthal Institute Staff
Jan de Lange, Director
Truus Dekker, Coordinator
Mieke Abels, Content Coordinator
Monica Wijers, Content Coordinator
Project Staff
Sarah Ailts; Margaret R. Meyer; Arthur Bakker; Nathalie Kuijpers
Beth R. Cole; Anne Park; Peter Boon; Huub Nilwik
Erin Hazlett; Bryna Rappaport; Els Feijs; Sonia Palha
Teri Hedges; Kathleen A. Steele; Dédé de Haan; Nanda Querelle
Karen Hoiberg; Ana C. Stephens; Martin Kindt; Martin van Reeuwijk
Carrie Johnson; Candace Ulmer
Jean Krusi; Jill Vettrus
Elaine McGrath
Cover photo credits: (left, middle) © Getty Images; (right) © Comstock Images
Illustrations: 2 (top left and right) Christine McCabe/© Encyclopædia Britannica, Inc.; 11, 18 Holly Cooper-Olds; 28 Christine McCabe/© Encyclopædia Britannica, Inc.; 46, 47 Holly Cooper-Olds; 59 © Encyclopædia Britannica, Inc.
Photographs: 1 (top) © Getty Images; (bottom) Lynn Betts, USDA Natural Resources Conservation Service; 3 © Corbis; 6 (left to right) Ron Dahlquist; Mark E. Gibson/Corbis; 12 © Corbis; 16 Victoria Smith/HRW Photo; 17 (top) Sam Dudgeon/HRW Photo; (bottom) © Bettmann/Corbis; 29 Dennis MacDonald/Alamy; 32 Victoria Smith/HRW Photo; 35 © PhotoDisc/Getty Images; 38 Victoria Smith/HRW Photo; 39 © PhotoDisc/Getty Images; 44 © Corbis; 46 Amos Morgan/PhotoDisc/Getty Images; 53 (left, middle) PhotoDisc/Getty Images; (right) Siede Preis/PhotoDisc/Getty Images; 54 PhotoDisc/Getty Images; 55 Siede Preis/PhotoDisc/Getty Images; 56 George K. Peck; 58 Stephanie Friedman/HRW
Contents
Letter to the Student
Section A Patterns in Data
  Bean Sprout Experiment
  Living in Cities
  Summary
  Check Your Work
Section B Selecting Samples
  Collecting “Fair” Data
  Biased Samples
  Random Numbers
  Summary
  Check Your Work
Section C Interpreting Graphs
  Different Impressions
  Summary
  Check Your Work
Section D Using Data
  Exploring Growth
  Presenting the Bean Sprout Data
  Summary
  Check Your Work
Section E Correlating Data
  Growing Babies
  Summary
  Check Your Work
Section F Lines That Summarize Data
  Egg Hunt
  Gone Fishing
  Summary
  Check Your Work
Additional Practice
Answers to Check Your Work
[Figure: histogram titled “Plant Height at 7 Days.” Vertical axis: Frequency (Number of Plants).]
Dear Student,
Welcome to Insights into Data. Do you look at the graphs in newspapers to see if they make sense? Numbers and graphs are used to describe situations all around you: sports, grades, sales, marketing, taxes, and even car ratings.
In this unit, you will learn how to use numbers and graphs to help you make decisions and draw conclusions. You will also study surveys and how they are conducted. You will grow mung beans in soda, salt water, and tap water to see which is the best solution for growing sprouts. You will even learn to use lines to help you investigate the relationship between two things, such as the length and width of birds’ eggs. (Do you think birds’ eggs are mostly round?)
Look for graphs and numerical information in newspapers and magazines to develop your own insights into data.
Sincerely,
The Mathematics in Context Development Team
Section A: Patterns in Data
Bean Sprout Experiment
Farmers are concerned about parasites damaging their crops. Chemical companies develop pesticides that kill the parasites, but they have to be careful that the chemicals do not harm the crops. In this section, you will collect and examine data, and you will study how graphs can help you reach conclusions about a data set. In the experiment that follows, you will examine the effect of different liquids on the growth of bean sprouts germinated from mung beans.
In the experiment below, you will investigate the answers to the following questions:
• How fast do bean sprouts grow?
• What happens to the growth of bean sprouts if the beans are placed in different solutions?
You will use the results of this experiment later in the unit.
Before conducting an experiment, researchers hypothesize about, or predict, what will happen in the experiment.
1. Read the description of the activity on page 2 and then make a prediction about the outcome.
mung beans
Your group will need the following items:
• one petri dish
• paper towels
• ten mung beans that have been soaked overnight in tap water
• a metric ruler for measuring in millimeters
You will also need one of the following solutions:
• tap water
• 1 teaspoon of salt per pint of water
• 1 fluid ounce of cola per pint of water
• 1 fluid ounce of lemon-lime soda per pint of water
Directions:
i. Cut three paper towels to fit the bottom of the petri dish.
ii. Put two layers of paper towels in the petri dish and arrange the mung beans on top. Figure out a way to identify the ten individual beans so that you will be able to collect data for each bean as it grows.
iii. Soak the paper towels and beans by adding several teaspoons of your solution.
iv. Place another paper towel over the beans and dampen it with your solution. Place the cover on the petri dish. Label the petri dish with the name of your group and the type of solution used.
v. At approximately the same time each day, measure the length of each sprout in millimeters. Use the table on Student Activity Sheet 1 to record the lengths of the sprouts and your observations about the growth of the beans. Keep track of the progress of the bean sprouts for seven days. During this time, add more solution as needed to keep the paper towels wet.
[Table for recording results: Bean Number | Length of Bean Sprout (in mm) on Day 1, Day 2, Day 3, Day 4, Day 5, Day 6, Day 7 | Observations and/or Problems. One blank row for each bean, numbered 1 through 10.]
Living in Cities
The United States Census Bureau regularly investigates what percentage of people live in cities and what percentage live in rural areas. The information also indicates the movement of people from cities to rural areas and vice versa.
2. Why do you think it is important to know if people are moving from cities to rural areas?
After the information is collected, the Census Bureau reports the percentage of urban population by state and the per capita income for each state.
3. Explain the meaning of “per capita income for each state.”
The scatter plot here and on Student Activity Sheet 2 shows the information collected by the United States Census Bureau about per capita income in each state and the District of Columbia and the percentage of people who live in urban areas in that state. The mean nationwide per capita income per state was about $30,000. Each state is identified as a data point in the plot labeled by its postal code. For example, MS represents Mississippi, and DC represents the District of Columbia.
[Figure: scatter plot titled “Urban Population and Per Capita Income by State in 2000.” Horizontal axis: Percent Urban, from 40 to 100. Vertical axis: Per Capita Income (in dollars), from 22,000 to 44,000. Each point is labeled with a postal code.]
Source: Per capita income in 2000: Statistical Abstract of the United States, 2003, Table 671. Percent urban: US Census Bureau
Use Student Activity Sheet 2 to answer problems 4–15.
4. What does this graph tell you? Write two general statements based on the graph.
5. Look at the data point for Utah (UT).
a. What percentage of the people in Utah lived in urban areas in 2000?
b. What other information is shown by this data point?
6. Find the data point for California (CA). Explain what that point represents.
Scott studied the scatter plot of the Census Bureau data on per capita income and percentage living in urban areas and said, “The higher the percentage of people who live in cities, the higher the per capita income is for the state.”
7. In general, do you agree with Scott’s statement?
Margaret studied the same scatter plot and made the following comment. “I don’t believe that $30,000 is the mean. There are only 20 states above the mean.”
8. Explain why so few states are above the mean. (Note: $30,000 is the correct mean.)
Dan, Eliza, and Yolanda also studied the scatter plot of the Census Bureau data. Dan noticed, “Minnesota (MN) has a higher per capita income than Georgia (GA).”
9. a. Consider Dan’s statement. What else can you tell about Minnesota as compared to Georgia?
Eliza: “Hawaii (HI) and Nevada (NV) must be the same kind of states.”
b. Comment on Eliza’s statement. Do you agree?
Yolanda: “Alaska (AK) and Oklahoma (OK) are very different.”
c. Do you agree with Yolanda? Explain your answer.
10. Locate your home state on the scatter plot. Write a statement about the per capita income in your state and its relative position on the graph.
Draw a horizontal line through a per capita income of $35,000. A group, or cluster, of states is above this line.
11. a. What states are in this cluster?
b. Write two sentences describing these states in terms of their per capita income and the percentage of people in the state living in urban areas.
c. What can you say about the states below the line?
Sometimes you can find more information and make new statements by looking more closely at a graph.
States in which fewer than half of the people live in urban areas have a per capita income that ranges from a little over $22,000 to about $29,500.
12. a. What is the range of the per capita income for states in which 85% to 90% of the population live in urban areas?
b. What is the range of the per capita income for states in which over 90% of the population live in urban areas?
The United States Census Bureau categorizes states according to their geographical location. The map on this page shows these categories.
[Figure: map of the United States divided into the four Census Bureau regions, with each state labeled by its postal code.
WEST: AK, AZ, CA, CO, HI, ID, MT, NM, NV, OR, UT, WA, WY
MIDWEST: IA, IL, IN, KS, MI, MN, MO, ND, NE, OH, SD, WI
SOUTH: AL, AR, DC, DE, FL, GA, KY, LA, MD, MS, NC, OK, SC, TN, TX, VA, WV
NORTHEAST: CT, MA, ME, NH, NJ, NY, PA, RI, VT]
13. On the scatter plot on Student Activity Sheet 2, circle the dot for each state in the Midwest in blue and circle the dot for each state in the South in red.
14. a. Washington, D.C. (DC) and Maryland (MD) might be called outliers in comparison to the other southern states. Explain what it means to be an outlier.
b. What other state might also be an outlier?
15. Explain the position of Illinois (IL) on the scatter plot. Compare its position to those of other states in the Midwest.
Summary
Data can be represented in a graph, such as a scatter plot like the one shown here. The data point A represents a car that weighs 1,975 pounds and gets 51 miles per gallon.
[Figure: scatter plot titled “Vehicle Fuel Economy.” Horizontal axis: Weight (in lbs), from 1900 to 2300. Vertical axis: Fuel (in miles per gallon), from 0 to 60. One point is labeled A.]
Some conclusions you draw from a graph may be very obvious. For example, a scatter plot can show if there are clusters of data or outliers.
Other conclusions may require more complex explanations, such as a description of a typical data point.
Often, careful examination of a graph can raise new questions. More data gathering and research may be necessary to answer these new questions.
Check Your Work
1. Describe some features you might look for in a scatter plot. Why might these be important?
2. a. Study the scatter plot for Vehicle Fuel Economy shown in the Summary. What does this graph tell you? Write two general statements.
b. Vehicle B weighs 2,100 pounds. Locate the data point for B in the scatter plot. What can you tell about the fuel consumption of car B?
c. Is there an outlier in the scatter plot? Explain your answer.
The table below shows the percentage of eighth-grade students who scored at or above the basic level in math and science on the 2005 National Assessment of Educational Progress in the southern states of the United States.
State | Percentage at or above Basic Level in Mathematics | Percentage at or above Basic Level in Science
Alabama | 66 | 48
Arkansas | 64 | 56
Delaware | 72 | 63
Florida | 65 | 51
Georgia | 62 | 53
Kentucky | 64 | 63
Louisiana | 59 | 47
Maryland | 66 | 54
Mississippi | 52 | 40
North Carolina | 72 | 53
Oklahoma | 63 | 57
South Carolina | 71 | 54
Tennessee | 61 | 55
Texas | 72 | 53
Virginia | 75 | 66
West Virginia | 60 | 57
Source: http://nces.ed.gov/nationsreportcard/states
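The table above can also be explored with a short script before or after plotting it by hand. The sketch below is only an illustration and is not part of the unit, which uses Student Activity Sheet 3. It assumes Python as the tool, the variable names are made up for this example, and the numbers are copied from the table.

```python
# (mathematics %, science %) at or above the basic level, copied from the table.
scores = {
    "Alabama": (66, 48),
    "Arkansas": (64, 56),
    "Delaware": (72, 63),
    "Florida": (65, 51),
    "Georgia": (62, 53),
    "Kentucky": (64, 63),
    "Louisiana": (59, 47),
    "Maryland": (66, 54),
    "Mississippi": (52, 40),
    "North Carolina": (72, 53),
    "Oklahoma": (63, 57),
    "South Carolina": (71, 54),
    "Tennessee": (61, 55),
    "Texas": (72, 53),
    "Virginia": (75, 66),
    "West Virginia": (60, 57),
}

# States where more than 60% reached the basic level in BOTH subjects.
both_above_60 = sorted(
    state
    for state, (math_pct, science_pct) in scores.items()
    if math_pct > 60 and science_pct > 60
)

# States whose mathematics percentage is higher than their science percentage.
math_higher = [
    state
    for state, (math_pct, science_pct) in scores.items()
    if math_pct > science_pct
]

print("Above 60% in both subjects:", both_above_60)
print(len(math_higher), "of", len(scores), "states scored higher in mathematics")
```

On a scatter plot with mathematics on one axis and science on the other, the same comparison shows up as points lying on one side of the line where the two percentages are equal.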
3. a. Use Student Activity Sheet 3 to make a scatter plot of the percentage of students at or above the basic level in math and science. Identify each data point by labeling it with the state it represents. Write a general statement about the pattern(s) in the data you can observe from the graph.
b. Which state(s) do not seem to fit the pattern?
c. How can you tell from the graph whether, overall, the states seemed to do better in math or in science? Explain your reasoning.
d. Circle the group of states whose percentage of students scoring at or above the basic level in both math and science was more than 60%. Identify these states. How might these states differ from the others?
e. Which state(s) had the most students scoring at or above the basic level in both math and science? Justify your answer.
Using the scatter plot of Urban Population and Per Capita Income by State in 2000, select two other states that are in the same region as your state. Write two or more statements that compare your state’s data point to that of your neighbors. If they are different, tell why.
Section B: Selecting Samples
Collecting “Fair” Data
Data can be obtained from organizations such as the United States Census Bureau, and the results can then be graphed. However, it is not always easy to get accurate data, as you may have seen in the unit Dealing with Data.
Questions such as the following are important in statistics:
• How do you get reliable data?
• What is the best way to visually present the data?
• How do you draw accurate conclusions based on the data?
Looking carefully at graphical representations of data is important. Even graphs based on complete data, such as the scatter plot on page 4, must be studied carefully before reliable conclusions can be made.
A summer band camp has middle school students from all 50 states and Washington, D.C. Twenty students from Delaware are at the camp.
The Census Bureau data indicate that about 80% of the population in Delaware lives in urban areas, as shown in the scatter plot on page 4.
1. Do you think it is likely that of the 20 middle school students from Delaware at the band camp, 16 live in urban areas? Explain your thinking.
Sue states, “Only eight out of the 20 Delaware students in the band camp live in urban areas.”
2. Does this number surprise you? What are some possible reasons for the rather low number?
The question to investigate is, “How likely is it that in a randomly selected sample of 20 middle school students from Delaware, only eight of them live in urban areas?” Choosing a random sample is important because it helps reduce bias in the sampling process. A sample is biased when it favors certain outcomes or some parts of the population over others. Care must be taken so that any member of the population has an equally likely chance of being chosen in the sample. In statistics, random means that each element of a set has an equal probability of occurring.
3. a. Reflect What is meant by a “randomly selected sample” of students from Delaware?
b. How could someone randomly select 20 middle school students from Delaware?
Suppose you had a random sample of students from Delaware. How many of them do you think would be likely to come from urban areas? To investigate this question, you can create a model, or a simulation, of the situation in Delaware.
Student Activity Sheet 4 is divided into ten rectangles.
[Figure: ten rectangles, eight labeled “Urban” and two labeled “Rural.”]
These rectangles represent the percentage of people from urban and rural areas in Delaware, where two out of every ten people are from a rural area.
• Cut out the rectangles. Fold them once and put them in a paper bag or box. Shake the container well.
• Take out a rectangle. Record in a table what is written on the rectangle and put the rectangle back in the bag or box. Shake the container to thoroughly mix the rectangles. Repeat this 20 times.
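The drawing procedure in the two steps above can also be imitated on a computer. The sketch below is an addition for illustration, not part of the activity. It assumes Python, a made-up class size of 30, and an arbitrary seed so that the run can be repeated; the names SLIPS, draw_sample, and class_results are hypothetical.

```python
import random
from collections import Counter

# Ten slips, eight "Urban" and two "Rural", as on Student Activity Sheet 4.
SLIPS = ["Urban"] * 8 + ["Rural"] * 2


def draw_sample(rng, draws=20):
    """Draw one slip at a time, putting it back after each draw.

    Returns how many of the draws showed "Urban".
    """
    return sum(rng.choice(SLIPS) == "Urban" for _ in range(draws))


rng = random.Random(2024)  # arbitrary fixed seed, chosen only for repeatability
class_results = [draw_sample(rng) for _ in range(30)]  # 30 students is an assumption

# Tally how many students got each number of "Urban" slips.
tally = Counter(class_results)
for urban_count in sorted(tally):
    print(f"{urban_count:2d} {'/' * tally[urban_count]}")
```

Putting each slip back before the next draw matters: it keeps the chance of “Urban” at eight out of ten on every draw, just as shaking the container does in the paper version.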
4. Explain how this activity has simulated taking a random sample of 20 students from Delaware.
5. a. Make a table to tally the results for the entire class.
[Table headings: Name | Number of “Urban” in Sample | Number of “Rural” in Sample]
b. Reflect How do your results compare with your classmates’ results? Explain any similarities or differences.
c. How many of your classmates have exactly eight students from Delaware who live in an urban area in their sample? Does this result surprise you? Why or why not?
d. How many of your classmates have exactly 16 students from Delaware who live in an urban area in their sample? Does this result surprise you? Why or why not?
Organizing the data may help you see any patterns in the class results. You can use a table like the one below to show the possible number of rectangles in the sample of 20 that had “urban” on them and a tally of the students who had each number in their sample.
[Table headings: Number of “Urban” in Sample of 20 | Number of Students in Class Who Had This Number. The rows shown run from 7 through 17, with further rows above and below. One row in the printed example carries three tally marks: ///]
6. a. What do the three tally marks in the table mean?
b. Use the data collected in problem 5 to make a table like the one above.
c. Based on the results in the table what is your answer to the question, “How likely is it that in a randomly selected sample of 20 middle school students from Delaware, only eight of them live in an urban area?”
It is often easier to get a clear picture of the data if you have a graph.
7. a. On Student Activity Sheet 5, use the data from the table you made in problem 6b to make a histogram.
b. Based on your histogram, write two sentences about the number of Delaware students in a sample of 20 students who live in cities.
c. Based on your data, how likely do you think it is to have 14 to 18 Delaware students in a random sample of 20 students who live in urban areas?
d. What do the results of your simulation tell you about the number of Delaware students at the band camp who are likely to live in urban areas?
8. a. Repeat the simulation you did in the activity on page 13. Collect the class data. Add the new results to the table you made in problem 6b.
b. Make a histogram using the new table. How does this histogram compare with the first one?
c. What kind of results do you think you would get if you continued to repeat the experiment?
[Figure: four sample graphs. Average Global Temperature (in °F) by Year; Vehicle Fuel Economy, Miles per Gallon against Weight (in lbs), with points A and B labeled; Bluegill Growth, Length after 1 Year (in mm) against Initial Length (in mm); Plant Height at 7 Days, Frequency (Number of Plants).]
On pages 13–15 you completed a simulation. By taking a number of random samples of 20 from a population where 80% live in urban areas, you created a sampling distribution of those that live in urban areas. You probably found that having around 14 to 18 out of 20 who live in urban areas was a likely result. Your simulation probably indicated that 8 out of the 20 did not happen very often in any of the samples and so was a very unlikely result.
If only eight out of the 20 students from Delaware at the band camp live in urban areas, you can conclude that this sample of 20 students did not seem to be typical of the population of Delaware with respect to the living environment. Having only eight of the 20 students from urban areas could have occurred by chance, but it does not seem very likely.
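The likelihoods described above were estimated by simulation, but they can also be calculated. The sketch below goes beyond what the unit itself does: it assumes that each of the 20 students is independently urban with probability 0.8, which is the situation the paper-slip simulation imitates, and it uses the standard binomial formula. The function name prob_urban is made up for this example.

```python
from math import comb


def prob_urban(k, n=20, p=0.8):
    """Chance of exactly k urban students among n, each urban with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_exactly_8 = prob_urban(8)
p_14_to_18 = sum(prob_urban(k) for k in range(14, 19))  # k = 14, 15, 16, 17, 18

print(f"Chance of exactly 8 urban: {p_exactly_8:.6f}")
print(f"Chance of 14 to 18 urban:  {p_14_to_18:.3f}")
```

Under these assumptions the calculation agrees with the conclusion in the text: a count from 14 to 18 is the usual outcome, and a count of exactly 8 is possible but very rare.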
Biased Samples
Many samples are biased because they favor certain outcomes or they favor some parts of the population over others. In such instances, there is a systematic error in the way the sample represents the population. Consider the following situation:
Some TV stations poll the public. Viewers are urged to call specific numbers to voice their opinions. Dialing one number registers a “yes” vote; dialing another number registers a “no” vote.
9. Mention at least two problems with this type of sampling.
Bias can result when underlying factors about a situation are not considered during the selection of a sample.
10. Why might the information collected about students at a band camp be biased?
11. Read each of the following survey situations carefully. Explain how each poll could involve bias.
a. The chief of police in a major U.S. city wants to determine how the public feels about the department. He prepares a questionnaire and sends police officers out to interview people in randomly selected sections of the city.
b. A magazine for health foods and organic healing wants to establish that large doses of vitamins improve health. The editor asks readers who have regularly taken vitamins in large doses to write to the magazine and describe their experiences. Of the 2,754 readers who reply, 93% report some benefit from taking large doses of vitamins.
c. A researcher wants to find out how many Americans intend to vacation in the United States in one year. To avoid bias, she selects 27 travel agencies in large cities and interviews every seventh visitor. The results of her research are published and titled “Record Number of Americans to Foreign Destinations.”
d. Reflect In 1936, the largest poll about the presidential election between Franklin Roosevelt and Alf Landon was taken by a magazine called Literary Digest. The publisher sent out 10 million questionnaires to people listed in telephone books. They also used other sources, such as car registrations and subscriber lists. The magazine received 2.4 million replies. As a result of the poll, Literary Digest predicted that Landon would win by a margin of 57% to 43%. However, Roosevelt won the election. Another research group used a much smaller sample of 50,000 people and predicted correctly that Roosevelt would win the election. Give some reasons why the smaller sample gave a better prediction.
Random Numbers
You simulated taking a random sample of students from Delaware by pulling rectangular pieces of paper out of a box. Simulations are often done using a set of random numbers. The set you will use consists of numbers from 0 through 9 in random order. You can read random numbers from a table or generate them on a computer or calculator.
Suppose you looked at a set of 50 random numbers ranging from 0 through 9.
12. How many numbers in the set would you expect to be a 0 or a 9? Why?
The following is a set of 50 random numbers.
1 2 6 7 2 4 0 1 7 0
2 7 9 3 7 9 0 4 7 2
1 4 6 2 2 5 6 1 6 4
0 5 7 6 4 6 4 7 3 5
2 7 9 0 4 1 2 0 2 7
13. a. How many numbers in this set are a 0 or a 9? Compare this to your answer to problem 12.
b. Would you expect to get exactly this many numbers being a 0 or a 9 every time you look at a set of 50 random numbers?
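The comparison in problems 12 and 13 between an expected count and an observed count can be checked with a few lines of code. The sketch below is an illustration only, assuming Python; the digits are copied from the set of 50 random numbers, and the variable names are made up.

```python
# The 50 random digits from the set above, one string per row.
rows = [
    "1267240170",
    "2793790472",
    "1462256164",
    "0576464735",
    "2790412027",
]
digits = [int(ch) for row in rows for ch in row]

# Each of the ten digits is equally likely, and two of them (0 and 9) count,
# so about 2/10 of the digits are expected to be a 0 or a 9.
expected = len(digits) * 2 / 10

# Count how many digits in this particular set are a 0 or a 9.
observed = sum(d in (0, 9) for d in digits)

print("expected:", expected)
print("observed:", observed)
```

The two numbers do not have to match. The expected count describes what happens on average over many sets, while any single set of 50 digits can land a little above or below it.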
Remember the band camp? You can also use random numbers to simulate the chance that you would see only eight students from Delaware who lived in an urban area.
14. Select a set of 20 random numbers from the table by arbitrarily choosing one of the rows or columns and counting out 20 numbers. How many of the 20 numbers that you selected are a 0 or a 9?
The 20 numbers could represent the 20 students from Delaware at the band camp. The 0s and 9s represent those who lived in rural areas, and the other eight numbers (1 to 8) represent those living in urban areas.
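The rule just described, where 0 and 9 stand for rural students and 1 through 8 stand for urban students, can be written as a short sketch. It is an illustration only, assuming Python, and it uses one possible selection of 20 numbers: the first two rows of the set of 50 random numbers, read left to right. The function name classify is made up.

```python
# First two rows of the random number set, read left to right (20 digits).
sample_digits = [1, 2, 6, 7, 2, 4, 0, 1, 7, 0,
                 2, 7, 9, 3, 7, 9, 0, 4, 7, 2]


def classify(digit):
    """0 and 9 stand for a rural student; 1 through 8 stand for an urban student."""
    return "rural" if digit in (0, 9) else "urban"


labels = [classify(d) for d in sample_digits]
urban = labels.count("urban")
rural = labels.count("rural")

print(f"{urban} urban and {rural} rural out of {len(labels)} students")
```

Two of the ten digits mean “rural,” which matches the two rural rectangles out of ten in the paper simulation, so both methods model a population that is 80% urban.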
15. a. Using your set of 20 random numbers from problem 14, how many students did you have who lived in urban areas? (That is, how many of the random numbers in your set were 1 to 8?)
b. Collect the class results for their sets of 20 random numbers and make a histogram of the number of Delaware students from all of the sets who were from urban areas. Using the histogram, how likely do you think it would be to have 14 to 18 Delaware students in a random sample live in cities?
16. How do your results from the simulation with random numbers compare to the simulation you did with the rectangles in the box?
Math History
The United States Census
In the United States, the first prototype of a population pyramid was published in the Statistical Atlas of the United States Based on the Ninth Census (1870).
Statistical data about the population in the United States is collected by the Census Bureau. Fact-finding is one of America’s oldest activities. In the early 1600s, a census was taken in Virginia, and people were counted in nearly all of the British colonies that later became the United States.
Following independence, there was an almost immediate need for a census of the entire nation. The first census was taken in 1790, under the direction of Secretary of State Thomas Jefferson. That census, taken by U.S. Marshals on horseback, counted 3.9 million inhabitants.
Nowadays, graphs have fewer mistakes because most of them are made using computer software; however, they all look somewhat similar. Of course unique graphs still exist. You can find them in newspapers!
[Figure: population pyramids for Wyoming, Mississippi, Vermont, and Washington, with the following caption.]
The total number of living inhabitants in each case, as reported in the census, is reduced to thousandths, and the number of thousandths of each sex in each decade of life represented by the distance measured on the horizontal lines, severally, from the perpendicular base line.
Summary
A population is a group of people or set of objects about which you want to gather information.
You can collect data by questioning a sample of people from a specific population or by examining a sample of objects from a set that has like characteristics.
When taking a sample, it is important to do so randomly so that every member of the population has an equal chance of being selected.
You can also collect data by designing and running an experiment or by carrying out a simulation.
When collecting data from a sample, you should avoid bias. Some possible causes of bias are:
• incorrectly choosing the sample;
• neglecting to account for the people who do not respond; and
• letting interviewers select the people they want to interview.
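One way to give every member of a population an equal chance of being selected is to number the members and let a computer pick. The sketch below is a hypothetical illustration, assuming Python: the roster size of 240 is invented, the seed is arbitrary, and a real roster would hold names rather than numbers.

```python
import random

# Hypothetical roster: students are simply numbered 1 through 240.
# The size 240 is made up for this example.
roster = list(range(1, 241))

rng = random.Random(7)           # arbitrary fixed seed so the example is repeatable
chosen = rng.sample(roster, 20)  # 20 different students, each equally likely to be picked

print(sorted(chosen))
```

Because the computer does the choosing, no interviewer decides who is included, which removes one of the causes of bias listed above. It does not help with people who are chosen but do not respond.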
Check Your Work
A researcher is interested in preferences of middle school students. Your school is willing to participate in a survey for sixth- and seventh-grade students, but not all students can participate. Susan suggests giving the survey to all of the students in one class.
1. a. Will this be a fair sample? Explain your thinking.
b. How would you select a random sample of sixth- and seventh-grade students from your school?
2. a. If you look at a set of 50 random numbers ranging from 0 to 9, how many would you expect to be even?
b. Use the random number set from problem 12 and find how many of these numbers are even. (Note: Count 0 as an even number.)
c. Would you expect to get this many even numbers out of every set of 50 random numbers?
3. Select one example from this section that you think illustrates each of the following causes of bias.
• incorrectly choosing the sample
• neglecting to account for the people who do not respond
• letting interviewers select the people they want to interview
4. Reflect Some people think that the larger the sample you take, the less chance you have of bias. Do you agree? Explain your answer.
Samui chose a random sample of the eighth graders at his school and found that their favorite sport was basketball. In his report he stated, “Eighth graders prefer basketball to any other sport.” Comment on his conclusion.
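Problem 2 can also be explored by simulation. The sketch below, which is not part of the original activity, draws sets of 50 random digits from 0 to 9 and counts the even ones (0 counts as even). The seeds and the choice of ten sets are arbitrary.

```python
import random


def count_even_digits(n=50, seed=None):
    """Draw n random digits from 0 to 9 and count how many are even."""
    rng = random.Random(seed)
    digits = [rng.randint(0, 9) for _ in range(n)]
    return sum(1 for d in digits if d % 2 == 0)


# Five of the ten digits (0, 2, 4, 6, 8) are even, so about half are expected.
expected_even = 50 * 5 / 10

# Ten different sets of 50 digits; the counts vary from set to set.
counts = [count_even_digits(50, seed=s) for s in range(10)]
print(expected_even, counts)
```

The counts usually land near the expected value without matching it every time, which is the point of problem 2c.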
Section C: Interpreting Graphs
Different Impressions
Graphs are useful for representing information in a clear and concise way. The graph of rage and fear is a good example of a graph that conveys information using only two words.
1. a. Reflect Describe what is represented in this graph.
b. Why is this a “good” example?
Many people do not trust the information provided in statistical charts. Sometimes the data come from a poorly selected sample, or the data are presented improperly. Some graphs have “mistakes” or “misrepresent” data. When using graphs, it is important to think about how they are constructed and to make sure the graphs do not give the wrong impressions.
[Figure: graph labeled only with the words RAGE and FEAR.]
Source: Data from “Catastrophe Theory” by E.C. Zeeman. Copyright © 1976 by Scientific American, Inc. All rights reserved.
The graphs on this page give two different impressions of the relationship between the number of hours per week that a student works at a job and his or her grade point average.
2. a. Why do the graphs provide different impressions?
b. Which graph do you think a high school principal would use when talking to parents about their child’s decreasing grade point average? What argument do you think the principal would make?
[Figure: two graphs of Grade Point Average versus Hours Worked Per Week (0 to 28 hours). In the first graph the vertical axis runs from 2.6 to 3.1; in the second it runs from 0 to 3.2.]
The table contains information about the world population for specific years.
3. Draw a graph to represent the data from the table.
4. a. Based on these data, what would you expect the world population to be in the current year?
b. Use an almanac or search the Internet to find the world population in the current year. How does the actual population for the current year compare to your answer for part a?
c. What do you think the world population was in 1985? How did you make your estimate?
d. Which estimate of the world population do you think will be more accurate: your estimate for 1985 or an estimate for 2025? Explain your reasoning.
Year    World Population (in billions)
1630 0.5
1820 1.0
1890 1.5
1930 2.0
1950 2.5
1960 3.0
1968 3.5
1975 4.0
1981 4.5
1987 5.0
1994 5.5
2000 6.0
Source: http://www.ibiblio.org/lunarbin/worldpop/
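The table can be used to estimate the population for a year that falls between two listed years. The following sketch assumes the population changed at a steady rate between neighboring table entries (linear interpolation); that assumption is an illustration, not a method prescribed by this unit.

```python
# (year, world population in billions), copied from the table
world_population = [
    (1630, 0.5), (1820, 1.0), (1890, 1.5), (1930, 2.0),
    (1950, 2.5), (1960, 3.0), (1968, 3.5), (1975, 4.0),
    (1981, 4.5), (1987, 5.0), (1994, 5.5), (2000, 6.0),
]


def interpolate(year, table):
    """Estimate the population for a year lying between two table entries."""
    for (y0, p0), (y1, p1) in zip(table, table[1:]):
        if y0 <= year <= y1:
            # Assume steady growth between the two neighboring entries.
            return p0 + (p1 - p0) * (year - y0) / (y1 - y0)
    raise ValueError("year is outside the table; interpolation does not apply")


estimate_1985 = interpolate(1985, world_population)
print(round(estimate_1985, 2))
```

Because the function only works between listed years, it refuses to estimate a year such as 2025, which lies beyond the table. An estimate for such a year has to extend the pattern rather than fill a gap.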
Use Student Activity Sheet 6, which shows examples of graphs made by two students to answer problem 3.
5. a. Do these graphs accurately represent the data? Explain.
b. Compare these graphs to the graph you drew in problem 3.
You may hear that the climate is changing since the average global temperature is rising. The graph below shows the average global temperature from 1900 to 2002.
[Figure: Increase in Global Temperatures 1900–2002. Vertical axis: Average Global Temperature (in °F), marked from 56 to 58. Horizontal axis: Year, grouped as 1900–1909, 1910–1929, 1930–1939, 1940–1969, 1970–1979, 1980–1989, 1990–1999, 2000–2001, and 2002.]
Source: National Oceanic and Atmospheric Administration
6. a. Does the graph give you a good representation of the change in temperature? Explain your thinking.
b. How do the scales of the graph affect your impression?
c. Comment on the statement: The data show that the earth is getting warmer than ever before.
Although the two graphs below represent the same data, they give different impressions.
[Figure: two graphs titled Percentage of Delayed Flights, each showing Percent Delayed by Year from 98 to 04. In the first graph the vertical axis runs from 0 to 25; in the second it runs from 15 to 25.]
7. a. Which graph suggests that fewer flights are delayed?
b. How was this impression achieved?
c. What groups of people might choose to use each graph? Why?
d. Do you think these data were gathered by sampling, or do you think they represent all flights? Give reasons to support your answer.
8. The graph represents the percentage of airline seats filled during the second quarter of 2003 through the first quarter of 2004.
a. It appears that the first quarter of 2004 had double the percentage of passengers as in the second quarter of 2003. Is this an accurate description? Explain your answer.
b. Write a statement that accurately describes the changes from the second quarter of 2003 to the first quarter of 2004.
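The effect described in problem 8a can be checked with arithmetic. In the sketch below, the axis start of 69.5 comes from the graph's scale, but the two load factors are hypothetical values chosen for illustration; the actual bar values are not given in the text.

```python
def apparent_ratio(first, second, axis_start):
    """Ratio of the drawn bar lengths when the axis begins at axis_start."""
    return (second - axis_start) / (first - axis_start)


axis_start = 69.5   # where the Percent scale of the graph begins
q2_2003 = 71.0      # hypothetical load factor (percent)
q1_2004 = 72.5      # hypothetical load factor (percent)

drawn = apparent_ratio(q2_2003, q1_2004, axis_start)   # how the bars compare by eye
actual = q1_2004 / q2_2003                             # how the percentages compare

print(drawn, round(actual, 3))
```

With these made-up values, the second bar is drawn twice as long as the first even though the percentage itself grows by only about 2 percent.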
Populations of Five Large U.S. Cities
9. a. What message does the graph above tell you?
b. Do you think the pictures express the populations in an accurate way? Why or why not?
[Figure: Passenger Load Factor for the second quarter of 2003, third quarter of 2003, fourth quarter of 2003, and first quarter of 2004. The Percent axis runs from 69.5 to 73.0.]
[Figure: Populations of Five Large U.S. Cities, shown as pictures: Oklahoma City, OK, 500,000; Charlotte, NC, 540,000; Memphis, TN, 650,000; Indianapolis, IN, 790,000; San José, CA, 900,000.]
10. a. How was the $209 computed?
b. What data are used to make this graph?
c. The last package has $838 written above it. The amount $838 is 4 × $209.40 rounded to the nearest whole number. Explain how this fits the data.
d. Is the size of the last package four times the size of the first? Carefully explain your answer.
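The figures on the packages can be checked with a short calculation. The sketch assumes that the annual cost is the number of packages per month times 12 months times the $3.49 average price; that formula is an assumption, although it reproduces the amounts shown. The last lines assume, hypothetically, that a picture is drawn four times as tall and four times as wide.

```python
PRICE_PER_BOX = 3.49  # dollars, the average cost stated with the graph


def annual_cost(packages_per_month):
    """Assumed formula: packages per month x 12 months x price per box."""
    return packages_per_month * 12 * PRICE_PER_BOX


costs = {n: annual_cost(n) for n in (5, 10, 15, 20)}
for n, cost in costs.items():
    print(n, "packages per month:", round(cost))

# If a picture is enlarged by the same factor in height and width,
# the area it covers grows by the square of that factor.
linear_scale = 4
area_factor = linear_scale ** 2
print("area factor:", area_factor)
```

A picture scaled this way covers far more than four times the paper, which is one reason pictures can exaggerate a difference.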
[Figure: Eating Cereal Is a Good Investment. Pictures of cereal packages labeled with the annual cost: 5 packages per month, $209; 10 packages per month, $419; 15 packages per month, $628; 20 packages per month, $838. Note on the graph: Average cost of a 16 oz box of cereal is $3.49.]
Design and carry out a statistical survey. Pay attention to the following aspects.
• What are you going to investigate?
• How do you write a good questionnaire?
• What is a good sample for your survey?
• How will you graphically represent your data?
• What conclusions can you make from your data?
• Do any new questions arise from your data?
• What further investigations might be necessary?
• Is there possible bias in the way you intend to carry out your survey? If so, can you eliminate it?
In order to make accurate conclusions from data, the data must be reliable and presented appropriately.
Data can be presented in many different ways.
• picture graphs
• line graphs
• bar graphs
• histograms
• scatter plots
When you see a graph, you should look carefully to make sure that the graph is a fair one that accurately tells the story of the data. Data may be misrepresented if one or more of the following occurs:
• the graph’s axes are scaled improperly;
• origins on the graph are excluded;
• three-dimensional pictures are used inappropriately;
• numbers that should not be compared are compared;
• pictures that do not fit the numbers are used.
1. Why is it important for a graph to be an accurate representation of data?
Richard is a member of a neighborhood football club. His father and his brother John were members as well.
They recorded the number of club members for 10 different years.
Year    Members    Year    Members
1983    45         2003    70
1984    41         2005    67
1985    53         2006    80
1995    60         2007    75
2002    68         2008    70
Richard graphs these data in the following way.
2. a. Does Richard’s graph represent the data accurately? Explain.
b. Draw another graph that you think accurately represents the data.
3. Cut out a graph from a newspaper or magazine. Include the caption or article that accompanies the graph. Attach the graph with the article or caption to a sheet of paper. Write a paragraph that explains how the graph presents the information in the caption or article. Do you think the graph is a good representation of the data or not? Explain your reasoning.
Make a list of all of the ways you think a graph can be misleading. Make another list of things you should watch for when you are looking at graphs in the media.
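One thing to check in a graph of the club data is whether the recorded years are evenly spaced. A short sketch using the ten years and member counts from the table:

```python
# Year -> number of club members, copied from the table
members = {
    1983: 45, 1984: 41, 1985: 53, 1995: 60, 2002: 68,
    2003: 70, 2005: 67, 2006: 80, 2007: 75, 2008: 70,
}

years = sorted(members)

# Number of years between each recorded year and the next one
gaps = [later - earlier for earlier, later in zip(years, years[1:])]
print(gaps)
```

If a graph places these ten years at equal intervals along the horizontal axis, a gap of 10 years looks the same as a gap of 1 year, and the pace of change is distorted.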
[Figure: Richard's graph. Vertical axis: Number of Club Members, marked from 10 to 90. Horizontal axis: Year, labeled 1983, 1984, 1985, 1995, 2002, 2003, 2005, 2006, 2007, and 2008.]
Graphs of data can help you get a better picture of how the data are distributed. Graphs can help you interpret the data and make statements about an experiment.
The graph below represents the height of plants after a seven-day period.
1. a. Study the graph. What does the bar at 7 millimeters (mm) represent?
b. Write three statements about how tall the plants in the study grew.
Section D: Using Data
Exploring Growth
[Pictures: a plant on Day 1, Day 4, Day 5, Day 6, and Day 7.]
[Figure: Plant Height at 7 Days. Histogram with Height (in mm), 1 to 21, on the horizontal axis and Frequency (Number of Plants), 0 to 25, on the vertical axis.]
Jo Mei says, “The mean height of the plants at the end of the experiment is about 10 mm.”
Jorge says, “The mean height of the plants at the end of the experiment is about 15 mm.”
2. a. Who do you think is right, Jo Mei or Jorge?
b. Which number is most easily found in the graph: the mean, the median, or the mode of the height values?
c. Reflect Explain how you might use the information in the graph to find the median height of the plants.
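The following sketch shows how the mean, median, and mode can be found from a frequency table like the one behind a histogram. The frequencies are hypothetical; they are not the values from the graph of plant heights.

```python
from statistics import mean, median, mode

# Hypothetical frequency table: height in mm -> number of plants
frequency = {5: 2, 7: 4, 9: 7, 10: 9, 12: 6, 14: 3, 17: 1}

# Expand the table into one height per plant, in increasing order.
heights = [h for h, count in sorted(frequency.items()) for _ in range(count)]

# The mode is the height with the tallest bar.
tallest_bar = max(frequency, key=frequency.get)

print("mode:", mode(heights), "tallest bar:", tallest_bar)
print("median:", median(heights))   # the middle of the ordered heights
print("mean:", mean(heights))       # total height divided by number of plants
```

The mode can be read straight from the tallest bar, the median requires counting plants from one end until the middle is reached, and the mean requires a calculation.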
Akir, Kari, Viviana, and Marja were studying the growth of plants. After seven days, they each graphed their data. Then they wrote statements about the growth of their plants. Unfortunately, some of the work was misplaced, and the rest was mixed together. On this page you see what is left of their work.
3. For each student, match the statement with a graph. If a student’s graph is missing, make an appropriate graph.
[Figures: histograms titled Plant Height at 12 Days and Plant Height at 10 Days. Horizontal axis: Height (in millimeters), 0 to 180. Vertical axis: Frequency (number of plants), 0 to 20.]
The histograms shown below and on the next page represent the frequency of heights of a group of plants at 10 days, 12 days, and 14 days.
4. a. Write down what each histogram tells you about the plant heights.
b. Describe the growth pattern of these plants.
[Figure: histogram titled Plant Height at 14 Days. Horizontal axis: Height (in millimeters), 0 to 180. Vertical axis: Frequency (number of plants), 0 to 20.]
Josh decided to make a line graph of the mean plant height for all of the plants for certain days.
5. a. How did he find the height for day 10? What does this point mean?
b. When did the plants seem to grow the most? How can you see this on the graph?
c. Comment on the advantages and disadvantages of using Josh’s graph to describe the growth of the plants.
[Figure: Mean Height of Plants. Line graph with Day, 0 to 20, on the horizontal axis and Height (in mm), 0 to 100, on the vertical axis.]
Box plots can also be used to compare the heights of the plants.
Remember that a box plot is a graph in which the data are grouped into four groups of roughly equal size. To draw a box plot, you need five numbers from your data:
• the lowest number, or minimum
• the middle of the lower half, or first quartile (Q1)
• the median
• the middle of the upper half, or third quartile (Q3)
• the highest number, or maximum
6. a. Explain what these box plots tell you about the growth of the plants from the 12th to the 14th day.
b. What can you tell about the growth of the plants from the box plots that you cannot tell from the line graph? From the histograms?
[Figure: Box Plots of Plant Height for Day 12 and Day 14. Horizontal axis: Height (in mm), 0 to 180.]
Presenting the Bean Sprout Data
In the bean sprout growth experiment you conducted on page 2, you collected data and kept a record of the lengths of the bean sprouts and your observations.
Prepare a report for the class on the results of your experiment. Your report should include the following:
• the data you collected;
• a graph of the final length of each sprout;
• a written description of what the graph shows about the final sprout lengths;
• a graph of the growth of the sprouts over time;
• a written description of what the graph shows about the growth of the sprouts over time;
• some statements about the data in which you use the mean, median, or mode (select the one you feel is best for your data);
• a statement about how the growth of the beans varied;
• your conclusions about how the bean sprouts grew in the solution you used;
• a list of the things that affected the results of your experiment; and
• a list of the things you would change if you were to repeat the experiment.
Your teacher will give you some self-stick notes. Write the final length of each bean sprout on a separate self-stick note. Put your notes on the number line your teacher has drawn on the board to make a histogram. Use the finished histogram on the board to answer the following questions.
7. a. What is the average length of the bean sprouts?
b. How did the bean sprouts in your solution compare to the bean sprouts in other solutions? What conclusions can you draw about the effects of your solution on the way the bean sprouts grew?
The histogram gives a general picture of the growth of the bean sprouts in each solution.
Suppose you want to compare the effects of the different solutions more directly. One way to compare solutions is with box plots.
8. a. Make a box plot for the final lengths of the bean sprouts in your solution. As a class, decide on a common scale for the box plots. Why is a common scale necessary?
b. Below your box plot, write the type of solution your group used. Put your box plot on the board together with those from other groups. How do the box plots help you compare the different solutions?
c. What is the difference between the information you can see on a histogram and the information you can see in a box plot?
Using Data
A data set can be graphed in different ways.
A histogram is a general picture of the data. It allows you to see how the data are distributed.
A box plot is a graph in which the data are grouped into four groups of approximately equal size. A box plot gives you a summary of the five values:
• the lowest number, or minimum
• the middle of the lower half, or first quartile (Q1)
• the median
• the middle of the upper half, or third quartile (Q3)
• the highest number, or maximum
[Figure: histogram titled Plant Height at 12 Days. Horizontal axis: Height (in millimeters), 0 to 180. Vertical axis: Frequency (number of plants), 0 to 20.]
These five values divide the data into four groups, each of which contains about 25% of the data. A box plot shows an overall picture of the data but does not allow you to see details. Box plots are particularly useful for comparing several data sets.
A line graph (graph over time) shows change over a period of time, for example, the change in the length of the bean sprouts from day to day. The graph allows you to look for trends in the data.
A description of the data and what you can see in the graph can make interpreting the graph easier. Statements about the center of the data using mean or median and about the spread using range or quartiles can also give insights into the data.
1. Can you estimate the mean, median, or mode from each of these types of graphs? Explain.
• a histogram
• a box plot
2. When is it helpful to use box plots?
[Figure: Box Plots of Plant Height for Day 12 and Day 14. Horizontal axis: Height (in mm), 0 to 180.]
The table contains the starting salaries of teachers in secondary schools in different countries in 2001. The salaries are given in dollars and adjusted for different money values among countries. The countries have been coded by grouping the continents in which they are located: I—North America, Europe, Australia, and New Zealand; II—South America, Africa, and Asia.
3. a. What salary differences between the two groups of continents do you expect to find?
b. Make box plots to represent the starting salaries of teachers in the two continent groups.
c. How would you describe the differences in starting salaries in the two continent groups?
Country    Starting Salary (in dollars)    Continent Group
Argentina 11,000 II
Australia 28,000 I
Austria 25,000 I
Belgium (Fl.) 31,000 I
Belgium (Fr.) 30,000 I
Brazil 17,000 II
Chile 12,000 II
Czech Republic 12,000 I
Denmark 30,000 I
Egypt 2,000 II
England 23,000 I
Finland 23,000 I
France 24,000 I
Germany 43,000 I
Greece 20,000 I
Hungary 8,000 I
Iceland 23,000 I
Indonesia 1,000 II
Ireland 24,000 I
Source: Organization for Economic Cooperation and Development
Italy 25,000 I
Korea (South) 25,000 II
Malaysia 14,000 II
Netherlands 29,000 I
New Zealand 18,000 I
Norway 29,000 I
Paraguay 14,000 II
Peru 6,000 II
Philippines 11,000 II
Portugal 20,000 I
Scotland 22,000 I
Slovakia 5,000 I
Spain 31,000 I
Sweden 23,000 I
Switzerland 49,000 I
Thailand 6,000 II
Tunisia 21,000 II
United States 29,000 I
Uruguay 6,000 II
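The five numbers needed for the box plots in problem 3b can be computed as sketched below. Salaries are entered in thousands of dollars, in table order. The quartiles are taken as the medians of the lower and upper halves, leaving out the middle value when a group has an odd number of countries; other quartile conventions exist and can give slightly different values.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the 'middle of each half' rule."""
    data = sorted(values)
    half = len(data) // 2          # size of each half; the middle value is left out if n is odd
    lower, upper = data[:half], data[-half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


# Starting salaries in thousands of dollars, copied from the table
group_1 = [28, 25, 31, 30, 12, 30, 23, 23, 24, 43, 20, 8, 23,
           24, 25, 29, 18, 29, 20, 22, 5, 31, 23, 49, 29]
group_2 = [11, 17, 12, 2, 1, 25, 14, 14, 6, 11, 6, 21, 6]

summary_1 = five_number_summary(group_1)
summary_2 = five_number_summary(group_2)
print("Group I: ", summary_1)
print("Group II:", summary_2)
```

Drawing both box plots on one common scale makes the difference between the two continent groups easy to see.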
Write a paragraph that helps another class to prepare to do the bean sprout experiment. Make sure you include any changes that you would make that might make it easier to collect data that give you the information you want.
In 1962, researchers studied 100 newborn babies to see whether there was a relationship between the length of the body and the circumference of the head. The table shows the results of this study.
Most tallies appear along the diagonal that extends from the lower left to the upper right corner of the graph.
Section E: Correlating Data
Growing Babies
Source: Sunburst Communications
[Table: tally chart of the 100 babies, with Body Length (in cm), 47 to 56, across the bottom and Circumference of the Head (in cm), 32 to 39, up the side. Tally marks not reproduced.]
1. In relation to the diagonal, where do you find the babies with large heads relative to their lengths?
2. As the body lengths increase, what happens to the head sizes of the babies?
The scatter plot displays a picture of the information in the table.
3. a. Explain why the axes do not start at (0, 0).
b. What information seems to get lost in the scatter plot when compared with the table?
4. Are there babies for whom body lengths are almost equal to head sizes?
[Figure: scatter plot with Body Length (in cm), 46 to 58, on the horizontal axis and Circumference of the Head (in cm), 30 to 40, on the vertical axis.]
To show body lengths, head circumferences, and the number of babies studied, a three-dimensional graph can be drawn.
5. a. What information does the purple bar in the graph represent? Check this information against the data in the table.
b. What are the advantages and disadvantages of a three-dimensional graph?
Juan and Brenda have a discussion about the data.
[Figure: three-dimensional bar graph with Body Length (in cm), 48 to 56, and Head Circumference (in cm), 32 to 38, on the base axes.]
Source: Sunburst Communications
I think you can say that there is a relationship between a baby’s body length and the circumference of its head.
I’m not sure about that. The tallies seem to be just spread over the chart.
6. Do you agree with Brenda or Juan? Explain.
7. a. What do you think the head circumference is for a baby with a body length of 55 centimeters (cm)? How did you determine your answer?
b. A one-year-old child has a height of 92 cm. What can you tell about the circumference of the child’s head?
You can say that the longer the baby, the larger the circumference of its head.
Yes, I agree that you can find those exceptions. I can even find a baby that is 53 cm long and has a head circumference of 33 cm, but this seems to be an outlier!
If you look at the table, most babies fit in a general pattern.
Okay, but the relationship you described is not a strong one, and I think when the babies start growing, this relationship will get even weaker or disappear.
That’s not true for all babies. Look at the table. I see babies with body lengths of 47 centimeters (cm) and of 52 cm, and both have a head circumference of 34 cm.
Scatter plots are often made to investigate the relationship between two variables. The points on a scatter plot may look like a “cloud.” When the cloud is long and slender, there is a strong correlation between the two variables. When the correlation is strong, knowing something about one of the variables helps you know how the other variable will behave. If the cloud of points is very scattered or in a circle, no underlying relationship exists between the variables. Knowing something about one of the variables cannot tell you about the other. In this case, there is no correlation between the two variables.
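Statisticians often express the strength and direction of a correlation as a single number between −1 and 1, called the correlation coefficient. This unit does not use it, so the sketch below is offered only as a supplement; the data points are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation coefficient: near 1 or -1 for a slender cloud, near 0 for a scattered one."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5, 6]
slender_ys = [1.1, 2.0, 2.9, 4.2, 5.0, 5.9]    # hypothetical: points close to a rising line
scattered_ys = [3, 5, 1, 1, 5, 3]              # hypothetical: no trend at all
falling_ys = list(reversed(slender_ys))        # the same cloud, but sloping downward

r_slender = pearson_r(xs, slender_ys)
r_scattered = pearson_r(xs, scattered_ys)
r_falling = pearson_r(xs, falling_ys)
print(round(r_slender, 3), round(r_scattered, 3), round(r_falling, 3))
```

A value close to 1 or −1 corresponds to a long, slender cloud, and a value close to 0 corresponds to a cloud with no visible trend.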
8. Describe the correlation in the scatter plots. Indicate whether there seems to be no correlation, a weak correlation, or a strong correlation.
9. For which of the above scatter plots can you best predict the value of y when x = 4? Explain your reasoning.
[Figure: four scatter plots labeled a, b, c, and d, each with x and y axes running from 0 to 6.]
In the diagrams for problem 8, you can see that a correlation can be weak or strong. Correlations can also be positive or negative.
10. a. What do you think is meant by the phrase “negative correlation”?
b. Which of the scatter plots on the previous page show a negative correlation?
11. For each of the following cases, decide whether there is no correlation, a strong correlation, or a weak correlation between the two variables mentioned. If there is a correlation, is it positive or negative?
• a person’s height and pulse rate
• the number of hours of sports training per week and pulse rate
• the height of a dinosaur and the length of its tail
• results on a math test and a science test
• temperature outside in the summer and kilowatt hours of electricity used
• number of children per household and number of television sets per household
• number of hours students study and their grade point averages
In a cause-effect relationship between two variables, a change in one variable directly causes a change in the other.
12. a. If you expect a strong correlation between the two variables, can you be sure that a cause-effect relationship exists? Why or why not?
b. Do you think there is a cause-effect relationship in any of the situations in problem 11? Explain your reasoning.
Collect some data about one of the cases in problem 11 and use the data to check your answer about the correlation. Be sure to think carefully about how you will select your sample.
Correlating Data
A scatter plot shows information about two variables that are paired in some way and can help you see whether a relationship exists between the variables. By looking for a trend in the scatter plot, you can see that a relationship exists.
If the points on a scatter plot are close to forming a straight line, there is a strong correlation. If the points are scattered all over and do not follow a pattern, there is a weak correlation.
Correlations can also be positive or negative. A negative correlation exists if the values of one variable increase while the values of the other decrease.
A scatter plot with a strong correlation does not imply that a cause-effect relationship between the variables exists. Other kinds of analysis must be done to determine whether a change in one variable causes a change in the other.
1. Explain how a scatter plot can help you understand correlation.
[Figure: scatter plot with x and y axes running from 0 to 6.]
[Figure: three scatter plots. (a) Vehicle Fuel Economy: Miles per Gallon, 0 to 60, versus Weight (in lbs), 1900 to 2300. (b) Birds of Maine: Maximum No. of Eggs per Hatch, 0 to 14, versus Size of Bird (in cm), 0 to 100. (c) Shoe Size and Head Circumference: Circumference of Head (in cm), 50 to 62, versus Shoe Size (in cm), 34 to 48.]
2. a. Describe the correlation in each of the three scatter plots.
b. In which of these plots might there be a cause-effect relationship?
3. Find an example of two variables that have a strong correlation but do not have a cause-effect relationship.
4. Match each of the descriptions of correlation below to a corresponding scatter plot.
i. linear, negative, and moderate
ii. linear, positive, and strong
iii. no apparent correlation
iv. linear, positive, and weak
For small children, foot size can be correlated with the ability to read; the larger the foot, the better they can read. But both reading and foot size are a function of age. Find another example of two data sets with strong correlation but where the relationship is really due to another common factor.
[Figure: four scatter plots labeled a, b, c, and d.]
Section F: Lines That Summarize Data
Egg Hunt
Marisol noticed that a bird had recently laid eggs in a nest outside her window. The eggs appeared small, and she wondered whether they had the same shape as chicken eggs.
1. a. Do you think all bird eggs have the same shape?
b. How can you check your answer?
Throughout this section, you will use scatter plots to study the relationship between the lengths and widths of various bird eggs.
On Student Activity Sheets 7, 8, and 9 you will find tables with data on 96 birds of Britain. Since there are many birds, they are classified into families. Among the data, you will find the average lengths and widths of the eggs of each bird.
2. a. Draw an eider’s egg using the dimensions found in the table.
b. Draw a lesser whitethroat’s egg.
c. How are the eggs different? What might account for the differences?
[Pictures: quail egg and chicken egg.]
You have discovered that eggs are different sizes. But what about their shapes? Is there a relationship between the lengths and widths of bird eggs? To answer these questions, you will begin by looking at the eggs of one family, the warbler.
3. a. Find all the warblers in the table on Student Activity Sheets 7, 8, and 9, and make a scatter plot of the lengths and widths of the warbler eggs (put length on the horizontal axis and width on the vertical).
b. Describe the relationship between the lengths and widths of the warbler eggs. Is there any correlation?
c. What do you expect the width of a warbler egg will be if its length is 15 mm? How did you find your answer?
The points in the scatter plot lie almost on a straight line. You might have used this information to find an answer to problem 3c. It is possible to summarize the pattern, or relationship, by drawing a line that seems to best fit these points. This line will probably not go through all the points, but the points should lie close to the line.
4. a. On Student Activity Sheet 10, draw a straight line that seems to “fit” these points. Use your line to predict the width of a warbler egg that is 15 mm long.
b. Does your line give you the same answer that you predicted in problem 3c?
5. a. Use your line to determine what happens to the width of an egg if the length increases by 2 mm. Explain how you did this. Include a drawing with your answer.
b. Describe what will happen to the width of an egg if the length increases by 1 mm.
In problem 4, you looked at a scatter plot and drew a line that summarized the relationship between the variables. Your line may have been very different from the lines drawn by your classmates.
6. How can you decide whether a line drawn to summarize the data in a scatter plot is a good line? What criteria would be helpful in making your decision?
7. a. Use your criteria to decide whether or not you drew a good line.
b. How does your line compare with the line drawn by a classmate?
There are many ways to find a good line. Different criteria are used for different purposes. One criterion is how well your line predicts a value for something you already know.
8. a. Look at the graph of your line from problem 4. What does it predict for the width of a lesser whitethroat warbler egg that is 16.5 mm long?
b. Now look at the data on Student Activity Sheets 7–9. What is the actual value for the width of the lesser whitethroat warbler egg? How well did your line predict this value?
c. Overall, does the line you drew seem to predict values well when you compare it with the actual data? Explain.
[Pictures: ostrich egg, emu egg, chicken egg, and quail egg.]
In problem 4, you drew a line to predict the width of the warbler egg. One way to get a line is to draw the line so that it goes through the point that is the mean of the lengths and the mean of the widths for the eggs. The slope of the line should reflect the direction of the trend in the points.
Student Activity Sheet 11 shows a line for the warbler eggs that contains the point (mean length, mean width).
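One way to build such a line is sketched below. The egg measurements are hypothetical (they are not taken from the Student Activity Sheets), and the slope of 0.7 is an assumed value that would normally be judged from the trend in the points.

```python
# Hypothetical (length, width) pairs in mm; not the data from the activity sheets
eggs = [(15.0, 11.5), (16.0, 12.5), (17.0, 13.0), (18.0, 13.5), (19.0, 14.5)]

mean_length = sum(length for length, _ in eggs) / len(eggs)
mean_width = sum(width for _, width in eggs) / len(eggs)

slope = 0.7  # assumed: chosen by eye to follow the direction of the points

# Pick the intercept so that the line passes through (mean length, mean width).
intercept = mean_width - slope * mean_length


def predict_width(length):
    """Width predicted by the line through the mean point."""
    return slope * length + intercept


print(mean_length, mean_width, round(intercept, 2))
print(round(predict_width(16.5), 2))
```

Fixing the line at the mean point leaves only the slope to choose, so two people who agree on the slope will draw the same line.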
9. a. How is this line different from the line you drew in problem 4?
b. Use this new line to find the width of the lesser whitethroat warbler egg that is 16.5 mm long. How close is the prediction to the width given in the table?
c. Overall, how well does the line seem to predict the widths of the warbler eggs?
One equation for the relationship between the length and width of the warbler eggs that goes through the mean is:
width = 0.7 × length + 1.1
10. a. Show that this line contains the point (mean length, meanwidth).
b. Use this equation to find the width of the lesser whitethroatwarbler egg that is 16.5 mm long.
c. How close is the width you found to the width given in thetable?
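As a rough check on problem 10b, the equation can be evaluated directly. This is only a sketch: it assumes the equation reads width = 0.7 × length + 1.1 (the operators are reconstructed from context), and the function name `predicted_width` is ours, not part of the unit.

```python
def predicted_width(length_mm, slope=0.7, intercept=1.1):
    """Width (mm) predicted by the line width = slope * length + intercept."""
    return slope * length_mm + intercept

# Lesser whitethroat warbler egg from problem 10b: 16.5 mm long.
print(round(predicted_width(16.5), 2))  # 12.65
```

The prediction can then be compared with the measured width on the Student Activity Sheets, which are not reproduced here.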
11. a. What is the slope of the line for the warbler eggs in question 10?
b. How can you find this number from the graph?
c. What does the slope tell you about the relationship between the length and the width of warbler eggs?
12. a. What does the other number in the equation represent on the graph (the number that is not the slope)?
b. Does this number make sense in terms of egg length and width? Why or why not?
c. What length does the equation predict for eggs that are 1 mm wide?
d. Do you think the equation is useful for very small numbers? Why or why not?
The scatter plot shows the data for 96 birds of Britain.
13. a. Where in the scatter plot are the data for the warbler eggs?
b. What happens to the relationship between length and width as the bird eggs get larger?
[Figure: scatter plot "Birds of Britain"; horizontal axis: Length (in mm), 0 to 100; vertical axis: Width (in mm), 0 to 60]
An equation for the line for the scatter plot of the 96 birds of Britain on page 57 is:
width = 0.7 × length + 1.6
14. a. Compare the equation above to the equation for the warblers: width = 0.7 × length + 1.1. What can you tell about the two lines from the equations?
b. Sketch this line on Student Activity Sheet 11. How do the lines compare?
c. If you make predictions based on the line for all eggs, what will happen to your predictions as the egg lengths increase?
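One way to explore problem 14 is to tabulate both lines side by side. The sketch below assumes the two equations read width = 0.7 × length + 1.1 (warblers) and width = 0.7 × length + 1.6 (all 96 birds); the function names and the sample lengths are ours.

```python
def warbler_width(length_mm):
    """Assumed reading of the warbler line."""
    return 0.7 * length_mm + 1.1

def all_birds_width(length_mm):
    """Assumed reading of the line for the 96 birds of Britain."""
    return 0.7 * length_mm + 1.6

# Print length, both predicted widths, and the gap between them.
for length_mm in (15, 20, 40, 80):
    gap = all_birds_width(length_mm) - warbler_width(length_mm)
    print(length_mm,
          round(warbler_width(length_mm), 1),
          round(all_birds_width(length_mm), 1),
          round(gap, 1))
```

Under these assumptions the gap column stays at 0.5 mm for every length, which is what equal slopes imply.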
15. What can you tell about the egg shapes of birds from different families?
For this activity, you will need chicken eggs and either a tape measure or a ruler.
Measure some chicken eggs at home. Record the lengths and widths. In class, collect everyone’s data. Find the typical length and width for a chicken egg.
How does the typical chicken egg fit in the graph of the birds of Britain?
Gone Fishing
Graphs of growth often result in curves because living things do not grow at a constant rate throughout their lifetimes.
Researchers in fisheries are interested in studying the growth rates of fish. One study on bluegills compared the lengths of fish at the beginning of the year with their lengths at the end of the year.
The table contains the results of this research.

Bluegill Growth
Initial Length x (in mm)    Length after 1 Year y (in mm)
 48                          69
 52                          71
 51                          69
 53                          75
 68                         101
 71                         107
 69                         100
 75                         104
101                         138
107                         138
100                         130
104                         140
138                         160
132                         157
130                         156
140                         161
160                         173
157                         168
156                         172
161                         178
173                         176
168                         174
172                         173
178                         178

16. a. How much did the fish that was initially 161 mm grow in one year?
b. Graph the data from the table.
c. What pattern do you see in the scatter plot from problem 16b?
17. a. Draw a straight line to represent the data in the scatter plot.
b. Estimate how long a 140-mm fish will be after one year.
c. Is a straight line a good model to represent the data in the graph? Explain your reasoning.
18. a. Sketch a model on the graph that better represents the data.
b. Using this model for the growth, describe what is happening to the growth of the fish as the length changes.
c. If a fish is now 110 mm long, predict how long it will be in one year.
19. Describe the difference between using a straight line and using the curved line you created in problem 18 to predict growth.
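The claim that growth is not constant can be checked against the bluegill table by computing each fish's growth (length after one year minus initial length). The sketch below copies the 24 data pairs from the table; the helper names `growth` and `mean` and the choice to compare the four smallest with the four largest fish are ours.

```python
# (initial length, length after 1 year) in mm, copied from the Bluegill Growth table
bluegill = [
    (48, 69), (52, 71), (51, 69), (53, 75),
    (68, 101), (71, 107), (69, 100), (75, 104),
    (101, 138), (107, 138), (100, 130), (104, 140),
    (138, 160), (132, 157), (130, 156), (140, 161),
    (160, 173), (157, 168), (156, 172), (161, 178),
    (173, 176), (168, 174), (172, 173), (178, 178),
]

def growth(pair):
    """Growth in mm over one year for one (initial, after) pair."""
    initial, after = pair
    return after - initial

def mean(values):
    values = list(values)
    return sum(values) / len(values)

by_size = sorted(bluegill)                 # order fish by initial length
smallest, largest = by_size[:4], by_size[-4:]
print(mean(growth(p) for p in smallest))   # 20.0
print(mean(growth(p) for p in largest))    # 2.5
```

The four smallest fish grew about 20 mm on average, while the four largest grew about 2.5 mm, so a single straight line with a constant growth rate fits the data less well at the upper end.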
Summary
If the relationship between two variables appears to be linear, a line can be found to describe the relationship.
Straight lines can be used to predict unknown values and to check existing values over the range of the data in the problem. If the trend seems to be linear, predicting an unknown value between data points will probably give an answer close to the truth.
The slope of the line can be expressed in terms of the data. This can lead to statements such as the following:
When the ___ increases by ___, the ___ ___ by ___.
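A statement of this form fixes the slope: it is the change in the second quantity divided by the increase in the first. The following is a minimal sketch with illustrative numbers (the first example is chosen to match the slope of 0.7 used for the warbler eggs); the function name is ours.

```python
def slope_from_statement(increase_in_x, change_in_y):
    """Slope implied by 'when x increases by increase_in_x, y changes by change_in_y'."""
    return change_in_y / increase_in_x

# "When the length increases by 10 mm, the width increases by 7 mm."
print(slope_from_statement(10, 7))   # 0.7
# A decrease is a negative change, so the slope is negative.
print(slope_from_statement(4, -2))   # -0.5
```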
[Figure: scatter plot; horizontal axis: Length (in mm), 15 to 23; vertical axis: Width (in mm), 10 to 17]
[Figure: scatter plot "Bluegill Growth"; horizontal axis: Initial Length (in mm), 0 to 200; vertical axis: Length after 1 Year (in mm), 60 to 180]
You can draw a line that seems to capture the trend in the data. You can check how well your line seems to summarize the relationship between the variables by checking how much the predictions made using the line would vary from the actual values in the data. A curve may sometimes be used to describe a relationship that cannot be described by a straight line. This is usually the case in the context of growth, since growth rates are not constant.
Check Your Work
1. Explain how you can find the slope of a line from a statement like the one in bold in the Summary.
The table shows the number of calories and grams of carbohydrates for standard size servings of different kinds of fruit.
2. a. Make a graph of the data with the number of calories on the horizontal axis and the grams of carbohydrates on the vertical axis. Write a sentence about the relationship you can observe between calories and carbohydrates in fruit.
b. Draw a line that seems to represent the relationship you observed in the data. Write the equation for this line.
c. Describe what the slope and y-intercept each mean in terms of the data.
Fruit                               Calories    Carbohydrates (grams)
Apple, raw, 2¾ in. diameter            80          21
Apricot, 3 raw                         60           3
Banana, raw                           105          27
Cherries, 10                           50          10
Grapefruit, ½ raw, white               40          10
Grapes, 10 seedless                    35          10
Cantaloupe, 5 in. diameter             95          22
Orange, 2⅝ in. diameter                60          15
Peach, raw, 2½ in. diameter            35          10
Strawberries, whole, 1 cup             45          10
Tomatoes, 1 whole                      25           5
Watermelon, 4 × 8 in. wedge           155          35
Source: Home and Garden Bulletin, No. 72, U.S. Department of Agriculture
d. Will your graph work to predict the grams of carbohydrates in a cup of raisins if you know that they have 435 calories? Why or why not?
e. Use both the graph of your line and your equation to predict the number of carbohydrates in a banana. Do the two predictions agree? How far off was your prediction?
3. Find the relationship between calories and grams of carbohydrates for another food group. How does it compare to the relationship in fruit? Share your findings with a classmate.
You have looked at several different ways to describe data in this unit. Choose one of them and describe a situation in which you would use a graphical representation to display the data.
Additional Practice
Section A: Patterns in Data
Mr. Flores surveyed his high school sophomore class to see if there was a relationship between the number of hours the students study per week and their grade point averages (GPA). He created the scatter plot shown below to display the data.
1. What is the GPA of the student who studies six hours per week?
2. Do you agree with each of the statements below? Explain your answer.
a. The student who studied the least has the lowest GPA.
b. If you study at least nine hours a week you will have a GPA of 3.5.
3. Are there any outliers? If so, describe their locations.
[Figure: scatter plot "Comparison of GPA to Hours of Study"; horizontal axis: Hours of Study, 2 to 16; vertical axis: GPA, 0 to 4]
The table below shows the percentage of students who scored at or above the basic levels in math and science on the 2005 National Assessment of Educational Progress in states from the Northeast and Northwest. (Not every state administers the test.)
4. a. Do states with a high average percentage of students at or above the basic level in math also have a high percentage of students at or above the basic level in science?
b. What is the difference between the two regions, Northwest (NW) and Northeast (NE), in terms of their test scores? Can you think of any reasons for this difference?
c. In which subject are the scores better? Explain how you can see this in the graph.
d. The data might almost be grouped into three clusters, one at the bottom left, one at the top right, and the cluster of points in the middle. What can you say about the states in the cluster at the bottom left and those at the top right?
State    Percentage at or above      Percentage at or above       Region
         Basic Level in Math         Basic Level in Science
IL       68                          58                           NW
IN       74                          62                           NW
MI       68                          66                           NW
MN       79                          71                           NW
MO       68                          66                           NW
ND       81                          77                           NW
OH       74                          67                           NW
WI       76                          70                           NW
ME       74                          72                           NE
VT       78                          76                           NE
MA       80                          72                           NE
NH       77                          76                           NE
RI       63                          58                           NE
CT       70                          63                           NE
Source: National Assessment of Educational Progress, National Center for Education Statistics, U.S. Department of Education
Section B: Selecting Samples
A taste test was conducted between two leading soft drinks. Testing booths were set up at two different shopping centers in the town where the soda is made. A total of 425 people stopped by the booths to take part in the taste test.
The results of the study were used in an advertising campaign:
“Three out of five people prefer the taste of Bingo Pop over other leading brands.”
1. Is this statement reliable or fair? Explain why or why not.
Two weeks later, the Bingo Pop company decided to do another taste test to see if its advertisement campaign was effective. The booths were set up in the same locations. Again, 425 people stopped by the booths to take part in the taste test. This time, the results showed that four out of five people preferred the taste of Bingo Pop. The company put out another ad:
“More and more people enjoy the taste of Bingo Pop every day.”
2. Do you think this is an accurate statement? Why or why not?
3. How could you ensure that the two surveys above were conducted accurately?
Section C: Interpreting Graphs
The graph below indicates that the price of erasers increased from 1970 to 2005.
[Figure: graph of eraser prices by year: 1970, 25¢; 1980, 30¢; 1990, 35¢; 2000, 45¢; 2005, 50¢]
1. Do you think the graph represents the data accurately? Explain your answer.
2. Draw a picture that accurately represents the difference between the prices of the erasers from 1970 to 2005.
Section D: Using Data
Recall that in Section A of Additional Practice, Mr. Flores conducted a survey about the number of hours students study and their GPAs. Mr. Harrison and Ms. Simmons conduct the same survey in their classes. The results from the two classes are shown in the tables.
1. a. Explain the advantages and/or disadvantages of using the following types of graphs to display the data:
• histogram • box plot • scatter plot
b. Reflect If you wanted to compare the two graphs of data, which graph would you use? Explain.
2. Make a scatter plot of the combined data from both classes.
Mr. Harrison’s Class Results
Hours Spent Studying    GPA
 0                      0
 1                      1
 1                      0
 2                      1
 3                      1
 4                      1.5
 5                      1
 5                      2
 6                      2
 7                      2.5
 9                      3
10                      4
11                      3.5
12                      3
12                      4
13                      3.5
14                      3.5
15                      3.5
16                      4
Ms. Simmons’s Class Results
Hours Spent Studying    GPA
 1                      0.5
 2                      0.5
 4                      1
 4                      1.5
 5                      1.5
 6                      2.5
 7                      2
 8                      2.5
 8                      3.5
10                      2.5
10                      3
11                      3
11                      4
12                      3.5
13                      3
13                      4
14                      4
15                      4
16                      3.5
When you look carefully at your scatter plot, you can distinguish a cluster of data in the upper right-hand corner.
3. Describe this cluster. What does this pattern say about the studying habits of the students in the two classes?
Section E: Correlating Data
1. Study the scatter plots shown below. Indicate whether or not there is a correlation for each plot. If there is a correlation, indicate whether it is weak or strong.
2. What data could be represented by each of the scatter plots below?
3. Describe some data that show a strong correlation but do not have a cause-effect relationship.
[Figure: three scatter plots labeled a, b, and c; each has an x-axis and a y-axis scaled from 0 to 6]
Section F: Lines That Summarize Data
[Figure: scatter plot with a summary line; horizontal axis: Hours of Study, 2 to 16; vertical axis: GPA, 0 to 4]
The relationship between the number of hours spent studying per week and GPA in Mr. Flores’s class can be described by a line as shown above.
1. a. Estimate the slope of the line.
b. How would you describe the slope in terms of the relationship between number of hours spent studying and GPA?
2. What GPA would you expect of a student who studied 9 hours?
3. a. What criteria do you need to decide if the line above is drawn accurately?
b. Is the line on the above graph an accurate line? Explain your answer.
Answers to Check Your Work
Section A: Patterns in Data
1. You may want to look for clusters (groups) in the data. There may be something special about the data points in the clusters. For example, there may be a common feature for these data, such as a cluster of states that are in the same geographic region. You may want to look for patterns. Does the scatter plot show a trend of some kind? Is there more than one pattern? If the data are clustered, does each cluster have its own pattern? You can also look for outliers. What characteristics make a point an outlier?
2. a. You can write different general statements. Three examples:
• The heavier the vehicle is, the fewer miles per gallon it can drive.
• The change in the number of miles per gallon is not constant for a given change in weight.
• If the weight is between 2,000 and 2,100 lb, the fuel economy is about the same, around 30 miles per gallon.
b. The fuel consumption of vehicle B is almost 30 miles per gallon, which is in the middle.
c. You can write different answers. There seem to be no apparent outliers if you see all the points as lying on a curve. You can also argue that the two points on each of the “ends” of the curve are outliers because they are a little out of the general pattern.
[Figure: scatter plot "Vehicle Fuel Economy" with points A and B marked; horizontal axis: Weight (in lbs), 1900 to 2300; vertical axis: Miles per Gallon, 0 to 60]
3. a. Your graph might look like the one below.
You might suggest that as the percentage at or above the basic level in math increases, so does the percentage at or above the basic level in science. However, North Carolina and Texas have high math scores and average science scores. These states therefore fall outside that pattern.
b. Texas and North Carolina have the highest percentage of students at or above the basic level in math but are in the middle of the states with respect to science.
c. More states performed better in math (9) than in science (4). There was one tie (OK). One way to tell this from the graph is by thinking about the points that would have the same percentage for both math and science. The line that would go through points such as (50, 50) and (60, 60) is the line M = S. The states that fall below this line have a higher percentage of students at or above the basic level in math than they do in science.
[Figure: scatter plot "Percentage of Students At or Above Basic"; horizontal axis: Math, 40 to 70; vertical axis: Science, 40 to 70; points labeled MS, LA, SC, AL, GA, AR, TN, WV, KY, OK, MD, NC, TX, VA]
d. The states in the top cluster with more than 57% of the students at or above the basic level in math are Kentucky, Maryland, North Carolina, Oklahoma, Texas, Virginia, and West Virginia. These states are farther north; the other states are all located in the far south.
e. You might say that Virginia did the best in both math and science because they had nearly the highest percentage of students (65% compared to 67%) who were at or above the basic level in math and were the highest (at 63%) in science. Texas and North Carolina had the highest percentage in math but were much lower in science (53% and 56%).
Section B: Selecting Samples
1. a. One disadvantage of taking one class as the sample is that you have all students from either sixth or seventh grade. The sampling procedure would be biased because it would leave out an important part of the population. You might also argue that students in one class influence one another with respect to their preferences, and so the results of the sample will not be reliable.
b. Different procedures are possible, for example:
• Make one list of all sixth- and seventh-grade students ordered according to their last name; then take every fifth student in the sample.
• Randomly select students from each class, for example, by putting all the names in a box and taking out as many as you need for your sample.
2. a. You would expect half of the 50 numbers to be even, so about 25.
b. 28 out of the 50 are even.
c. No. The number will vary but is most likely around 25; in about 90% of the cases, the number will be between 20 and 30.
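The "about 90%" figure in answer 2c of Section B can be checked with a short calculation. The sketch assumes each of the 50 numbers is independently even with probability 1/2 (a binomial model, which the unit itself does not name); the function name is ours.

```python
from math import comb

def prob_even_count_between(low, high, n=50):
    """P(low <= number of evens <= high) when each of n numbers is even with probability 1/2."""
    favorable = sum(comb(n, k) for k in range(low, high + 1))
    return favorable / 2 ** n

p = prob_even_count_between(20, 30)
print(round(p, 2))
```

Under this assumption the result comes out close to the quoted 90%, and the probability of getting exactly 25 evens is much smaller, which is why the count varies from sample to sample.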
3. As an example for incorrectly choosing the sample, you may have chosen the example from problem 11c about the travel agencies or from 11d about the political poll.
For “neglecting to account for the people who did not respond,” you may have chosen the example from problem 11b on the health food magazine survey.
For letting interviewers select the people they want to interview, you may have chosen the example from problem 11a about the police officers interviewing people.
Other examples can be found as well. Discuss your answers with your classmates.
4. It is not true that the larger the sample, the less chance of bias. If the sample is not taken properly (for instance, because people without telephones cannot be chosen in the sample), a larger sample does not change this. The same bias will still occur regardless of the sample size.
Section C: Interpreting Graphs
1. Different answers are possible. It is important for a graph to be an accurate representation of data to reveal relationships between the two variables so that proper conclusions can be drawn.
2. a. No, because the horizontal axis should be ordered like a number line. Richard spaced the years evenly on the horizontal axis, but there are more years between 1980 and 1990 than between 1978 and 1979, and you cannot see this now. So between 1980 and 1990, there seems to be a rather steep increase, steeper than between 1997 and 1998. But actually, the first increase should be spread out over ten years, although you do not know what happened exactly in the years in between.
b. You might make a line graph in which the years are placed on the axis with the right scale.
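Why the spacing on the time axis matters can be seen by computing the change per year between neighboring data points. Richard's data are not reproduced here, so this sketch borrows the eraser prices from Additional Practice, Section C; the variable and function names are ours.

```python
# (year, price in cents) from the eraser graph in Additional Practice, Section C
eraser_prices = [(1970, 25), (1980, 30), (1990, 35), (2000, 45), (2005, 50)]

def change_per_year(points):
    """Average yearly price change between each pair of neighboring points."""
    rates = []
    for (year0, price0), (year1, price1) in zip(points, points[1:]):
        rates.append((price1 - price0) / (year1 - year0))
    return rates

print(change_per_year(eraser_prices))  # [0.5, 0.5, 1.0, 1.0]
```

The last interval covers 5 years instead of 10. If the years were spaced evenly on the axis, the 5¢ rise from 2000 to 2005 would look half as steep as the 10¢ rise from 1990 to 2000, even though the yearly rate is the same.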
3. Your answer will be different from your classmates’ answers. Have one of your classmates comment on your work and the other way around. Discuss both your articles or captions and paragraphs.
Section D: Using Plant Growth Data
1. It is difficult to estimate the mean from a histogram; it is kind of the “balance” point of the distribution. The median can be found in a histogram by finding the halfway point of the number of data. You can find this by counting in the bars. If, for example, there are 15 values, the eighth one represents the median. You can also count from both ends at the same time until you meet in the middle; this is the median.
In a box plot, the median is drawn in the box as one of the summary points. There is no way to find the mean and the mode from just the box plot.
2. Different answers are possible. Box plots are very useful if you want to compare groups of data or if you want a summary of key points in the data.
3. a. Your answers may vary, but you may notice that the set of continents in Set I seems to consist primarily of countries that have a high level of industry, education, and generally good economic conditions, compared to the continents in Set II that have countries where the standard of living is low and industrial growth is not yet in place. It might be reasonable to conclude that teachers in Set I would earn more than teachers in Set II.
b. Your box plots might look like the following:
[Figure: "Box Plots of Salary (in dollars)" for continent Sets I and II; horizontal axis: Salary (in dollars), 0 to 55,000]
c. Both the median and range of the yearly salaries in the group that includes North America, Europe, and Australia/New Zealand are much larger than in the other groups of continents ($24,000 and $44,000 as opposed to $11,000 and $24,000). Almost all of those in Set II are below the median salary for Set I. The United States data point is around the third quartile (Q3) of the first group, and this is higher than any of the states in the second group.
Section E: Correlating Data
1. Answers will vary, but you can say that if the points in a scatter plot form a tight cloud that is almost a straight line, there is a strong correlation. If the points are scattered more widely, the correlation is weaker; if points are in a cloud that is nearly a circle, no correlation exists.
2. a.
• The data points form a tight cloud that is almost a straight line, so the correlation is strong. As the weight of the vehicle increases, the miles per gallon decrease; so the correlation is negative.
• The data points are in a circular pattern, so there seems to be no correlation or relationship between the size of a bird and the number of eggs hatched. Some large birds hatch lots of eggs and some hatch few eggs; the same thing is true for small birds.
• The points are in a rather wide cloud, so there seems to be a weak correlation. A person with a larger shoe size seems to have a larger head circumference, so the correlation is positive. On a closer look, it seems as if two clusters of data exist. Within these clusters, the data are in a circular pattern.
b. It seems that the heavier the vehicle, the more fuel it would need to operate, so this example might be a case of cause and effect. The graph of the birds of Britain does not even show a relationship, so it would be unwise to search for a cause-and-effect relationship. The shoe size and head circumference plot do show a weak relationship, but having big feet does not cause a large head; both are functions of how big the person is to start.
3. Your answers will all differ. Compare the example you found to the examples of some of your classmates. If a strong correlation exists without a cause-effect relationship, often there is another common feature that helps explain the correlation. For example, the correlation between the number of schools in a city and the number of shopping centers may be very strong, but both are functions of the population of the city. One does not cause the other. Another example might be the relation between scoring points and making fouls in a basketball game. The correlation might be strong, but this does not mean that making fouls will increase the number of points. Both are functions of how much playing time a player had.
4. scatter plot a with statement ii
scatter plot b with statement iii
scatter plot c with statement i
scatter plot d with statement iv
Section F: Lines That Summarize Data
1. You can find the slope by dividing the vertical increase (or decrease) by the corresponding horizontal increase.
2. a. Your graph may look like the following.
[Figure: scatter plot "Carbohydrates vs Food Energy in Fruit"; horizontal axis: Calories, 20 to 200; vertical axis: Carbohydrates (in g), 0 to 40]
You might say that as the calories in fruit increase, so does the number of grams of carbohydrates. Another answer might be that most of the fruit has less than 30 g of carbohydrates, but that would not be a statement about the relationship with the number of calories.
b. One line might be Carbohydrates = 1 + 0.23 × Calories. Note that this line goes through the point representing the mean number of calories and the mean number of carbohydrates, approximately (65, 16).
c. For this example, the slope would be 0.23, or 23/100, which means that for an increase of 100 calories, the number of grams of carbohydrates increases by 23. The y-intercept is 1 and indicates that if a type of fruit had 0 calories, the line would still predict about 1 g of carbohydrates.
d. The graph will probably not work because it was made using a range of calories from 25 to 155, and 435 is far beyond that range. Even if you made the graph over on another scale, it is so far away from the given range that you cannot be sure that the trend you see in the data still holds.
e. Your answer will vary depending on the equation you have. If you used the equation above to find the number of grams of carbohydrates in a banana, you will get about 25 grams. The graph gives you about the same number of grams. In this case, the line would predict 2 fewer grams than the actual number.
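The numbers in answers 2d and 2e can be reproduced with a few lines. This sketch assumes the line from answer 2b reads Carbohydrates = 1 + 0.23 × Calories; the banana values and the calorie range come from the fruit table, and the names are ours.

```python
def predicted_carbs(calories):
    """Grams of carbohydrates predicted by Carbohydrates = 1 + 0.23 * Calories."""
    return 1 + 0.23 * calories

banana_calories, banana_carbs = 105, 27      # from the fruit table
prediction = predicted_carbs(banana_calories)
print(round(prediction, 2))                  # 25.15
print(round(banana_carbs - prediction, 2))   # 1.85

# Range of calories the line was built from (answer 2d).
lowest, highest = 25, 155
raisin_calories = 435
print(lowest <= raisin_calories <= highest)  # False
```

The prediction of about 25 g falls roughly 2 g short of the banana's 27 g, and the raisins lie well outside the range of the data, so the line should not be trusted there.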
3. Your answers will depend on the kind of food you choose. You might look at grain products, fish, meat, dairy products, or vegetables. Have a classmate check your graph and your conclusions.