
Data Analysis and Probability

Insights into Data


Mathematics in Context is a comprehensive curriculum for the middle grades. It was developed in 1991 through 1997 in collaboration with the Wisconsin Center for Education Research, School of Education, University of Wisconsin-Madison and the Freudenthal Institute at the University of Utrecht, The Netherlands, with the support of the National Science Foundation Grant No. 9054928.

The revision of the curriculum was carried out in 2003 through 2005, with the support of the National Science Foundation Grant No. ESI 0137414.

National Science Foundation
Opinions expressed are those of the authors and not necessarily those of the Foundation.

© 2010 Encyclopædia Britannica, Inc. Britannica, Encyclopædia Britannica, the thistle logo, Mathematics in Context, and the Mathematics in Context logo are registered trademarks of Encyclopædia Britannica, Inc.

All rights reserved.

No part of this work may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage or retrieval system, without permission in writing from the publisher.

International Standard Book Number 978-1-59339-960-3

Printed in the United States of America

1 2 3 4 5 C 13 12 11 10 09

Wijers, M., de Lange, J., Bakker, A., Shafer, M. C., & Burrill, G. (2010). Insights into data. In Wisconsin Center for Education Research & Freudenthal Institute (Eds.), Mathematics in context. Chicago: Encyclopædia Britannica, Inc.


The Mathematics in Context Development Team

Development 1991–1997

The initial version of Insights into Data was developed by Monica Wijers and Jan de Lange. It was adapted for use in American schools by Mary C. Shafer and Gail Burrill.

Wisconsin Center for Education Research Staff

Thomas A. Romberg, Director
Joan Daniels Pedro, Assistant to the Director
Gail Burrill, Coordinator
Margaret R. Meyer, Coordinator

Freudenthal Institute Staff

Jan de Lange, Director
Els Feijs, Coordinator
Martin van Reeuwijk, Coordinator

Project Staff

Wisconsin Center for Education Research: Jonathan Brendefur, Laura Brinker, James Browne, Jack Burrill, Rose Byrd, Peter Christiansen, Barbara Clarke, Doug Clarke, Beth R. Cole, Fae Dremock, Mary Ann Fix, Sherian Foster, James A. Middleton, Jasmina Milinkovic, Margaret A. Pligge, Mary C. Shafer, Julia A. Shew, Aaron N. Simon, Marvin Smith, Stephanie Z. Smith, Mary S. Spence

Freudenthal Institute: Mieke Abels, Nina Boswinkel, Frans van Galen, Koeno Gravemeijer, Marja van den Heuvel-Panhuizen, Jan Auke de Jong, Vincent Jonker, Ronald Keijzer, Martin Kindt, Jansie Niehaus, Nanda Querelle, Anton Roodhardt, Leen Streefland, Adri Treffers, Monica Wijers, Astrid de Wild

Revision 2003–2005

The revised version of Insights into Data was developed by Arthur Bakker and Monica Wijers. It was adapted for use in American schools by Gail Burrill.

Wisconsin Center for Education Research Staff

Thomas A. Romberg, Director
David C. Webb, Coordinator
Gail Burrill, Editorial Coordinator
Margaret A. Pligge, Editorial Coordinator

Freudenthal Institute Staff

Jan de Lange, Director
Truus Dekker, Coordinator
Mieke Abels, Content Coordinator
Monica Wijers, Content Coordinator

Project Staff

Wisconsin Center for Education Research: Sarah Ailts, Beth R. Cole, Erin Hazlett, Teri Hedges, Karen Hoiberg, Carrie Johnson, Jean Krusi, Elaine McGrath, Margaret R. Meyer, Anne Park, Bryna Rappaport, Kathleen A. Steele, Ana C. Stephens, Candace Ulmer, Jill Vettrus

Freudenthal Institute: Arthur Bakker, Peter Boon, Els Feijs, Dédé de Haan, Martin Kindt, Nathalie Kuijpers, Huub Nilwik, Sonia Palha, Nanda Querelle, Martin van Reeuwijk


Cover photo credits: (left, middle) © Getty Images; (right) © Comstock Images

Illustrations
2 (top left and right) Christine McCabe/© Encyclopædia Britannica, Inc.; 11, 18 Holly Cooper-Olds; 28 Christine McCabe/© Encyclopædia Britannica, Inc.; 46, 47 Holly Cooper-Olds; 59 © Encyclopædia Britannica, Inc.

Photographs
1 (top) © Getty Images; (bottom) Lynn Betts, USDA Natural Resources Conservation Service; 3 © Corbis; 6 (left to right) Ron Dahlquist; Mark E. Gibson/Corbis; 12 © Corbis; 16 Victoria Smith/HRW Photo; 17 (top) Sam Dudgeon/HRW Photo; (bottom) © Bettmann/Corbis; 29 Dennis MacDonald/Alamy; 32 Victoria Smith/HRW Photo; 35 © PhotoDisc/Getty Images; 38 Victoria Smith/HRW Photo; 39 © PhotoDisc/Getty Images; 44 © Corbis; 46 Amos Morgan/PhotoDisc/Getty Images; 53 (left, middle) PhotoDisc/Getty Images; (right) Siede Preis/PhotoDisc/Getty Images; 54 PhotoDisc/Getty Images; 55 Siede Preis/PhotoDisc/Getty Images; 56 George K. Peck; 58 Stephanie Friedman/HRW


Contents

Letter to the Student  vi

Section A  Patterns in Data
    Bean Sprout Experiment  1
    Living in Cities  3
    Summary  8
    Check Your Work  8

Section B  Selecting Samples
    Collecting “Fair” Data  11
    Biased Samples  16
    Random Numbers  18
    Summary  20
    Check Your Work  20

Section C  Interpreting Graphs
    Different Impressions  22
    Summary  30
    Check Your Work  30

Section D  Using Data
    Exploring Growth  32
    Presenting the Bean Sprout Data  38
    Summary  40
    Check Your Work  41

Section E  Correlating Data
    Growing Babies  44
    Summary  50
    Check Your Work  50

Section F  Lines That Summarize Data
    Egg Hunt  53
    Gone Fishing  59
    Summary  60
    Check Your Work  61

Additional Practice  64

Answers to Check Your Work  70

[Figure: histogram titled “Plant Height at 7 Days”; vertical axis: Frequency (Number of Plants)]


Dear Student,

Welcome to Insights into Data. Do you look at the graphs in newspapers to see if they make sense? Numbers and graphs are used to describe situations all around you: sports, grades, sales, marketing, taxes, and even car ratings.

In this unit, you will learn how to use numbers and graphs to help you make decisions and draw conclusions. You will also study surveys and how they are conducted. You will grow mung beans in soda, salt water, and tap water to see which is the best solution for growing sprouts. You will even learn to use lines to help you investigate the relationship between two things, such as the length and width of birds’ eggs. (Do you think birds’ eggs are mostly round?)

Look for graphs and numerical information in newspapers and magazines to develop your own insights into data.

Sincerely,

The Mathematics in Context Development Team


Section A: Patterns in Data

Bean Sprout Experiment

Farmers are concerned about parasites damaging their crops. Chemical companies develop pesticides that kill the parasites, but they have to be careful that the chemicals do not harm the crops. In this section, you will collect and examine data, and you will study how graphs can help you reach conclusions about a data set. In the experiment that follows, you will examine the effect of different liquids on the growth of bean sprouts germinated from mung beans.


In the experiment below, you will investigate the answers to the following questions:

• How fast do bean sprouts grow?

• What happens to the growth of bean sprouts if the beans are placed in different solutions?

You will use the results of this experiment later in the unit.

Before conducting an experiment, researchers hypothesize about, or predict, what will happen in the experiment.

1. Read the description of the activity on page 2 and then make a prediction about the outcome.

[Photo: mung beans]

Your group will need the following items:

• one petri dish

• paper towels

• ten mung beans that have been soaked overnight in tap water

• a metric ruler for measuring in millimeters

You will also need one of the following solutions:

• tap water

• 1 teaspoon of salt per pint of water

• 1 fluid ounce of cola per pint of water

• 1 fluid ounce of lemon-lime soda per pint of water

Directions:

i. Cut three paper towels to fit the bottom of the petri dish.

ii. Put two layers of paper towels in the petri dish and arrange the mung beans on top. Figure out a way to identify the ten individual beans so that you will be able to collect data for each bean as it grows.

iii. Soak the paper towels and beans by adding several teaspoons of your solution.

iv. Place another paper towel over the beans and dampen it with your solution. Place the cover on the petri dish. Label the petri dish with the name of your group and the type of solution used.

v. At approximately the same time each day, measure the length of each sprout in millimeters. Use the table on Student Activity Sheet 1 to record the lengths of the sprouts and your observations about the growth of the beans. Keep track of the progress of the bean sprouts for seven days. During this time, add more solution as needed to keep the paper towels wet.


[Table on Student Activity Sheet 1: one row for each bean number, 1 through 10; columns for Length of Bean Sprout (in mm) on Day 1 through Day 7 and for Observations and/or Problems]

Living in Cities

The United States Census Bureau regularly investigates what percentage of people live in cities and what percentage live in rural areas. The information also indicates the movement of people from cities to rural areas and vice versa.

2. Why do you think it is important to know if people are moving from cities to rural areas?


After the information is collected, the Census Bureau reports the percentage of urban population by state and the per capita income for each state.

3. Explain the meaning of “per capita income for each state.”

The scatter plot here and on Student Activity Sheet 2 shows the information collected by the United States Census Bureau about per capita income in each state and the District of Columbia and the percentage of people who live in urban areas in that state. The mean nationwide per capita income per state was about $30,000. Each state is identified as a data point in the plot labeled by its postal code. For example, MS represents Mississippi, and DC represents the District of Columbia.

[Figure: scatter plot titled “Urban Population and Per Capita Income by State in 2000”; horizontal axis: Percent Urban, from 40 to 100; vertical axis: Per Capita Income (in dollars), from 22,000 to 44,000; each data point is labeled with a state postal code]

Source: Per capita income in 2000: Statistical Abstract of the United States, 2003, Table 671. Percent urban: US Census Bureau



Use Student Activity Sheet 2 to answer problems 4–15.

4. What does this graph tell you? Write two general statements based on the graph.

5. Look at the data point for Utah (UT).

a. What percentage of the people in Utah lived in urban areas in 2000?

b. What other information is shown by this data point?

6. Find the data point for California (CA). Explain what that point represents.

Scott studied the scatter plot of the Census Bureau data on per capita income and percentage living in urban areas and said, “The higher the percentage of people who live in cities, the higher the per capita income is for the state.”

7. In general, do you agree with Scott’s statement?

Margaret studied the same scatter plot and made the following comment. “I don’t believe that $30,000 is the mean. There are only 20 states above the mean.”

8. Explain why so few states are above the mean. (Note: $30,000 is the correct mean.)

Dan, Eliza, and Yolanda also studied the scatter plot of the Census Bureau data. Dan noticed, “Minnesota (MN) has a higher per capita income than Georgia (GA).”

9. a. Consider Dan’s statement. What else can you tell about Minnesota as compared to Georgia?


Eliza: “Hawaii (HI) and Nevada (NV) must be the same kind of states.”

b. Comment on Eliza’s statement. Do you agree?

Yolanda: “Alaska (AK) and Oklahoma (OK) are very different.”

c. Do you agree with Yolanda? Explain your answer.

10. Locate your home state on the scatter plot. Write a statement about the per capita income in your state and its relative position on the graph.

Draw a horizontal line through a per capita income of $35,000. A group, or cluster, of states is above this line.

11. a. What states are in this cluster?

b. Write two sentences describing these states in terms of their per capita income and the percentage of people in the state living in urban areas.

c. What can you say about the states below the line?

Sometimes you can find more information and make new statements by looking more closely at a graph.

States in which fewer than half of the people live in urban areas have a per capita income that ranges from a little over $22,000 to about $29,500.

12. a. What is the range of the per capita income for states in which 85% to 90% of the population live in urban areas?

b. What is the range of the per capita income for states in which over 90% of the population live in urban areas?

The United States Census Bureau categorizes states according to their geographical location. The map on this page shows these categories.

[Map: the United States divided into four Census Bureau regions (West, Midwest, South, Northeast), with each state labeled by its postal code]

13. On the scatter plot on Student Activity Sheet 2, circle the dot for each state in the Midwest in blue and circle the dot for each state in the South in red.

14. a. Washington, D.C. (DC) and Maryland (MD) might be called outliers in comparison to the other southern states. Explain what it means to be an outlier.

b. What other state might also be an outlier?

15. Explain the position of Illinois (IL) on the scatter plot. Compare its position to those of other states in the Midwest.


Summary

Data can be represented in a graph, such as a scatter plot like the one shown here. The data point A represents a car that weighs 1,975 pounds and gets 51 miles per gallon.

[Figure: scatter plot titled “Vehicle Fuel Economy”; horizontal axis: Weight (in lbs), from 1900 to 2300; vertical axis: Fuel (in miles per gallon), from 0 to 60; data point A is labeled]

Some conclusions you draw from a graph may be very obvious. For example, a scatter plot can show if there are clusters of data or outliers.

Other conclusions may require more complex explanations, such as a description of a typical data point.

Often, careful examination of a graph can raise new questions. More data gathering and research may be necessary to answer these new questions.

Check Your Work

1. Describe some features you might look for in a scatter plot. Why might these be important?



2. a. Study the scatter plot for Vehicle Fuel Economy shown in the Summary. What does this graph tell you? Write two general statements.

b. Vehicle B weighs 2,100 pounds. Locate the data point for B in the scatter plot. What can you tell about the fuel consumption of car B?

c. Is there an outlier in the scatter plot? Explain your answer.

The table below shows the percentage of eighth-grade students who scored at or above the basic level in math and science on the 2005 National Assessment of Educational Progress in the southern states of the United States.

State             Percentage at or above         Percentage at or above
                  Basic Level in Mathematics     Basic Level in Science

Alabama           66                             48
Arkansas          64                             56
Delaware          72                             63
Florida           65                             51
Georgia           62                             53
Kentucky          64                             63
Louisiana         59                             47
Maryland          66                             54
Mississippi       52                             40
North Carolina    72                             53
Oklahoma          63                             57
South Carolina    71                             54
Tennessee         61                             55
Texas             72                             53
Virginia          75                             66
West Virginia     60                             57

Source: http://nces.ed.gov/nationsreportcard/states




3. a. Use Student Activity Sheet 3 to make a scatter plot of the percentage of students at or above the basic level in math and science. Identify each data point by labeling it with the state it represents. Write a general statement about the pattern(s) in the data you can observe from the graph.

b. Which state(s) do not seem to fit the pattern?

c. How can you tell from the graph whether, overall, the states seemed to do better in math or in science? Explain your reasoning.

d. Circle the group of states whose percentage of students scoring at or above the basic level in both math and science was more than 60%. Identify these states. How might these states differ from the others?

e. Which state(s) had the most students scoring at or above the basic level in both math and science? Justify your answer.

Using the scatter plot of Urban Population and Per Capita Income by State in 2000, select two other states that are in the same region as your state. Write two or more statements that compare your state’s data point to that of your neighbors. If they are different, tell why.


Section B: Selecting Samples

Collecting “Fair” Data

Data can be obtained from organizations such as the United States Census Bureau, and the results can then be graphed. However, it is not always easy to get accurate data, as you may have seen in the unit Dealing with Data.

Questions such as the following are important in statistics:

• How do you get reliable data?

• What is the best way to visually present the data?

• How do you draw accurate conclusions based on the data?

Looking carefully at graphical representations of data is important. Even graphs based on complete data, such as the scatter plot on page 4, must be studied carefully before reliable conclusions can be made.

A summer band camp has middle school students from all 50 states and Washington, D.C. Twenty students from Delaware are at the camp.

The Census Bureau data indicate that about 80% of the population in Delaware lives in urban areas, as shown in the scatter plot on page 4.

1. Do you think it is likely that of the 20 middle school students from Delaware at the band camp, 16 live in urban areas? Explain your thinking.

Sue states, “Only eight out of the 20 Delaware students in the band camp live in urban areas.”

2. Does this number surprise you? What are some possible reasons for the rather low number?



The question to investigate is, “How likely is it that in a randomly selected sample of 20 middle school students from Delaware, only eight of them live in urban areas?” Choosing a random sample is important because it helps reduce bias in the sampling process. A sample is biased when it favors certain outcomes or some parts of the population over others. Care must be taken so that any member of the population has an equally likely chance of being chosen in the sample. In statistics, random means that each element of a set has an equal probability of occurring.
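The idea of an equally likely chance can be sketched with a short computer program. The Python example below is only an illustration: the roster of 1,000 ID numbers is hypothetical (it is not real Delaware data), and `choose_random_sample` is a name made up for this sketch. It relies on Python's standard `random` module, whose `sample` method picks distinct members so that every member has the same chance of being chosen.

```python
import random

# Hypothetical roster: ID numbers standing in for middle school students.
# The size (1,000) is an assumption made only for this illustration.
roster = list(range(1, 1001))

def choose_random_sample(population, size, seed=None):
    """Return `size` distinct members; each member is equally likely to be picked."""
    rng = random.Random(seed)
    return rng.sample(population, size)

sample = choose_random_sample(roster, 20, seed=1)
print(sorted(sample))
```

Because no student is favored over another, a sample chosen this way avoids the kind of bias described above.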

3. a. Reflect What is meant by a “randomly selected sample” of students from Delaware?

b. How could someone randomly select 20 middle school students from Delaware?

Suppose you had a random sample of students from Delaware. How many of them do you think would be likely to come from urban areas? To investigate this question, you can create a model, or a simulation, of the situation in Delaware.


Student Activity Sheet 4 is divided into ten rectangles.


Urban Urban

Urban Urban

Urban Urban

Urban Rural

Urban Rural

These rectangles represent the percentage of people from urban and rural areas in Delaware, where two out of every ten people are from a rural area.

• Cut out the rectangles. Fold them once and put them in a paper bag or box. Shake the container well.

• Take out a rectangle. Record in a table what is written on the rectangle and put the rectangle back in the bag or box. Shake the container to thoroughly mix the rectangles. Repeat this 20 times.
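The same activity can be imitated on a computer. The sketch below is one possible way to do it in Python, not part of the activity itself; the names `SLIPS`, `draw_sample`, and `count_urban` are invented for this illustration. Each call to `rng.choice` plays the role of taking out one rectangle, recording it, and putting it back.

```python
import random

# Ten slips, as on Student Activity Sheet 4: eight "Urban" and two "Rural".
SLIPS = ["Urban"] * 8 + ["Rural"] * 2

def draw_sample(draws=20, rng=None):
    """Draw one slip, record it, put it back, and repeat `draws` times."""
    rng = rng or random.Random()
    return [rng.choice(SLIPS) for _ in range(draws)]

def count_urban(sample):
    """Count how many recorded slips say "Urban"."""
    return sample.count("Urban")

one_sample = draw_sample(rng=random.Random(4))
print(one_sample)
print("Urban:", count_urban(one_sample), "Rural:", 20 - count_urban(one_sample))
```

Running `draw_sample` again without a fixed seed gives a different sample each time, just as each student's bag of rectangles does.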

4. Explain how this activity has simulated taking a random sample of 20 students from Delaware.

5. a. Make a table to tally the results for the entire class.

b. Reflect How do your results compare with your classmates’ results? Explain any similarities or differences.

c. How many of your classmates have exactly eight students from Delaware who live in an urban area in their sample? Does this result surprise you? Why or why not?

d. How many of your classmates have exactly 16 students from Delaware who live in an urban area in their sample? Does this result surprise you? Why or why not?

[Table for class results, with columns: Name; Number of “Urban” in Sample; Number of “Rural” in Sample]

Organizing the data may help you see any patterns in the class results. You can use a table like the one below to show the possible number of rectangles in the sample of 20 that had “urban” on them and a tally of the students who had each number in their sample.

Number of “Urban” in      Number of Students in Class
Sample of 20              Who Had This Number

...
7
8
9
10
11
12
13
14
15
16
17
...

(In the printed table, three tally marks, ///, appear in one of the rows.)

6. a. What do the three tally marks in the table mean?

b. Use the data collected in problem 5 to make a table like the one above.

c. Based on the results in the table what is your answer to the question, “How likely is it that in a randomly selected sample of 20 middle school students from Delaware, only eight of them live in an urban area?”

It is often easier to get a clear picture of the data if you have a graph.

7. a. On Student Activity Sheet 5, use the data from the table you made in problem 6b to make a histogram.

b. Based on your histogram, write two sentences about the number of Delaware students in a sample of 20 students who live in cities.

c. Based on your data, how likely do you think it is to have 14 to 18 Delaware students in a random sample of 20 students who live in urban areas?

d. What do the results of your simulation tell you about the number of Delaware students at the band camp who are likely to live in urban areas?

8. a. Repeat the simulation you did in the activity on page 13. Collect the class data. Add the new results to the table you made in problem 6b.

b. Make a histogram using the new table. How does this histogram compare with the first one?

c. What kind of results do you think you would get if you continued to repeat the experiment?


[Figure: sample graphs, including average global temperature (in °F) by year, miles per gallon versus weight (in lbs), “Bluegill Growth” (length after 1 year versus initial length, in mm), and a histogram of frequency (number of plants)]

On pages 13–15 you completed a simulation. By taking a number of random samples of 20 from a population where 80% live in urban areas, you created a sampling distribution of those that live in urban areas. You probably found that having around 14 to 18 out of 20 who live in urban areas was a likely result. Your simulation probably indicated that 8 out of the 20 did not happen very often in any of the samples and so was a very unlikely result.

If only eight out of the 20 students from Delaware at the band camp live in urban areas, you can conclude that this sample of 20 students did not seem to be typical of the population of Delaware with respect to the living environment. Having only eight of the 20 students from urban areas could have occurred by chance, but it does not seem very likely.
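The conclusion from the simulation can also be checked with a calculation. The Python sketch below goes beyond what the unit asks you to do, and it rests on an assumption: each of the 20 students is treated as independently having an 80% chance of living in an urban area, which is the same model as drawing rectangles and putting them back. The function names `prob_exactly` and `prob_between` are made up for this illustration.

```python
from math import comb

def prob_exactly(k, n=20, p=0.8):
    """Chance of exactly k urban students among n, each urban with chance p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_between(low, high, n=20, p=0.8):
    """Chance that the number of urban students is from low through high."""
    return sum(prob_exactly(k, n, p) for k in range(low, high + 1))

p_eight = prob_exactly(8)
p_typical = prob_between(14, 18)
print(f"Chance of exactly 8 urban: {p_eight:.5f}")
print(f"Chance of 14 to 18 urban:  {p_typical:.3f}")
```

Under this assumption, exactly 8 urban students turns up in fewer than 1 in 1,000 samples, while 14 to 18 urban students covers more than four-fifths of samples, which agrees with what the class histograms suggest.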



Biased Samples

Many samples are biased because they favor certain outcomes or they favor some parts of the population over others. In such instances, there is a systematic error in the way the sample represents the population. Consider the following situation:

Some TV stations poll the public. Viewers are urged to call specific numbers to voice their opinions. Dialing one number registers a “yes” vote; dialing another number registers a “no” vote.

9. Mention at least two problems with this type of sampling.

Bias can result when underlying factors about a situation are not considered during the selection of a sample.

10. Why might the information collected about students at a band camp be biased?

11. Read each of the following survey situations carefully. Explain how each poll could involve bias.

a. The chief of police in a major U.S. city wants to determine how the public feels about the department. He prepares a questionnaire and sends police officers out to interview people in randomly selected sections of the city.

b. A magazine for health foods and organic healing wants to establish that large doses of vitamins improve health. The editor asks readers who have regularly taken vitamins in large doses to write to the magazine and describe their experiences. Of the 2,754 readers who reply, 93% report some benefit from taking large doses of vitamins.

c. A researcher wants to find out how many Americans intend to vacation in the United States in one year. To avoid bias, she selects 27 travel agencies in large cities and interviews every seventh visitor. The results of her research are published and titled “Record Number of Americans to Foreign Destinations.”

d. Reflect In 1936, the largest poll about the presidential election between Franklin Roosevelt and Alf Landon was taken by a magazine called Literary Digest. The publisher sent out 10 million questionnaires to people listed in telephone books. They also used other sources, such as car registrations and subscriber lists. The magazine received 2.4 million replies. As a result of the poll, Literary Digest predicted that Landon would win by a margin of 57% to 43%. However, Roosevelt won the election. Another research group used a much smaller sample of 50,000 people and predicted correctly that Roosevelt would win the election. Give some reasons why the smaller sample gave a better prediction.

Random Numbers

You simulated taking a random sample of students from Delaware by pulling rectangular pieces of paper out of a box. Simulations are often done using a set of random numbers. The set you will use consists of numbers from 0 through 9 in random order. You can read random numbers from a table or generate them on a computer or calculator.

Suppose you looked at a set of 50 random numbers ranging from 0 through 9.

12. How many numbers in the set would you expect to be a 0 or a 9? Why?

The following is a set of 50 random numbers.

1 2 6 7 2 4 0 1 7 0

2 7 9 3 7 9 0 4 7 2

1 4 6 2 2 5 6 1 6 4

0 5 7 6 4 6 4 7 3 5

2 7 9 0 4 1 2 0 2 7

13. a. How many numbers in this set are a 0 or a 9? Compare this to your answer to problem 12.

b. Would you expect to get exactly this many numbers being a 0 or a 9 every time you look at a set of 50 random numbers?

Remember the band camp? You can also use random numbers to simulate the chance that you would see only eight students from Delaware who lived in an urban area.

14. Select a set of 20 random numbers from the table by arbitrarily choosing one of the rows or columns and counting out 20 numbers. How many of the 20 numbers that you selected are a 0 or a 9?

The 20 numbers could represent the 20 students from Delaware at the band camp. The 0s and 9s represent those who lived in rural areas, and the other eight numbers (1 to 8) represent those living in urban areas.
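The rule for turning random numbers into students can be written as a short Python sketch. This is only an illustration; the function name `classify_digits` is invented here. As one possible selection, the example reads the first two rows of the random number table above from left to right.

```python
def classify_digits(digits):
    """Count urban (digits 1 to 8) and rural (digit 0 or 9) students."""
    rural = sum(1 for d in digits if d in (0, 9))
    urban = len(digits) - rural
    return urban, rural

# First two rows of the random number table, read left to right.
first_two_rows = [1, 2, 6, 7, 2, 4, 0, 1, 7, 0,
                  2, 7, 9, 3, 7, 9, 0, 4, 7, 2]

urban, rural = classify_digits(first_two_rows)
print(urban, "urban,", rural, "rural")  # prints: 15 urban, 5 rural
```

A different choice of rows or columns will generally give a different count, which is why the class results are collected and compared.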

15. a. Using your set of 20 random numbers from problem 14, how many students did you have who lived in urban areas? (That is, how many of the random numbers in your set were 1 to 8?)

b. Collect the class results for their sets of 20 random numbers and make a histogram of the number of Delaware students from all of the sets who were from urban areas. Using the histogram, how likely do you think it would be to have 14 to 18 Delaware students in a random sample live in cities?

16. How do your results from the simulation with random numbers compare to the simulation you did with the rectangles in the box?

Math History

The United States Census

In the United States, the first prototype of a population pyramid was published in the Statistical Atlas of the United States Based on the Ninth Census (1870).

Statistical data about the population in the United States is collected by the Census Bureau. Fact-finding is one of America’s oldest activities. In the early 1600s, a census was taken in Virginia, and people were counted in nearly all of the British colonies that later became the United States.

Following independence, there was an almost immediate need for a census of the entire nation. The first census was taken in 1790, under the direction of Secretary of State Thomas Jefferson. That census, taken by U.S. Marshals on horseback, counted 3.9 million inhabitants.

Nowadays, graphs have fewer mistakes because most of them are made using computer software; however, they all look somewhat similar. Of course unique graphs still exist. You can find them in newspapers!

[Figure: population pyramids labeled Wyoming, Mississippi, Vermont, and Washington]

The total number of living inhabitants in each case, as reported in the census, is reduced to thousandths, and the number of thousandths of each sex in each decade of life represented by the distance measured on the horizontal lines, severally, from the perpendicular base line.


Summary

A population is a group of people or set of objects about which you want to gather information.

You can collect data by questioning a sample of people from a specific population or by examining a sample of objects from a set that has like characteristics.

When taking a sample, it is important to do so randomly so that every member of the population has an equal chance of being selected.

You can also collect data by designing and running an experiment or by carrying out a simulation.

When collecting data from a sample, you should avoid bias. Some possible causes of bias are:

• incorrectly choosing the sample;

• neglecting to account for the people who do not respond; and

• letting interviewers select the people they want to interview.

Check Your Work

A researcher is interested in preferences of middle school students. Your school is willing to participate in a survey for sixth- and seventh-grade students, but not all students can participate. Susan suggests giving the survey to all of the students in one class.

1. a. Will this be a fair sample? Explain your thinking.

b. How would you select a random sample of sixth- and seventh-grade students from your school?


2. a. If you look at a set of 50 random numbers ranging from 0 to 9, how many would you expect to be even?

b. Use the random number set from problem 12 and find how many of these numbers are even. (Note: Count 0 as an even number.)

c. Would you expect to get this many even numbers out of every set of 50 random numbers?
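If a computer is available, a simulation can help you explore problem 2. The sketch below is one possible setup, not the random number set from the unit: it draws ten sets of 50 random digits and counts the even digits in each set. The seed value is an arbitrary choice.

```python
import random

def count_even_digits(rng, how_many=50):
    """Draw `how_many` random digits from 0 to 9 and count the even ones."""
    digits = [rng.randint(0, 9) for _ in range(how_many)]
    # 0 counts as even, so the even digits are 0, 2, 4, 6, and 8.
    return sum(1 for d in digits if d % 2 == 0)

rng = random.Random(7)  # arbitrary seed so the run can be repeated
counts = [count_even_digits(rng) for _ in range(10)]

print(counts)                      # even digits found in each set of 50
print(sum(counts) / len(counts))   # average over the ten sets
```

Comparing the ten counts with each other shows how much the result can change from one random set to the next.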

3. Select one example from this section that you think illustrates each of the following causes of bias.

• incorrectly choosing the sample

• neglecting to account for the people who do not respond

• letting interviewers select the people they want to interview

4. Reflect Some people think that the larger the sample you take, the less chance you have of bias. Do you agree? Explain your answer.

Samui chose a random sample of the eighth graders at his school and found that their favorite sport was basketball. In his report he stated, “Eighth graders prefer basketball to any other sport.” Comment on his conclusion.



Section C: Interpreting Graphs

Different Impressions

Graphs are useful for representing information in a clear and concise way. The graph of rage and fear is a good example of a graph that conveys information using only two words.

1. a. Reflect Describe what is represented in this graph.

b. Why is this a “good” example?

Many people do not trust the information provided in statistical charts. Sometimes the data come from a poorly selected sample, or the data are presented improperly. Some graphs have “mistakes” or “misrepresent” data. When using graphs, it is important to think about how they are constructed and to make sure the graphs do not give the wrong impressions.

[Figure: graph labeled only with the words RAGE and FEAR.]

Source: Data from “Catastrophe Theory” by E.C. Zeeman. Copyright © 1976 by Scientific American, Inc. All rights reserved.


The graphs on this page give two different impressions of the relationship between the number of hours per week that a student works at a job and his or her grade point average.

2. a. Why do the graphs provide different impressions?

b. Which graph do you think a high school principal would use when talking to parents about their child’s decreasing grade point average? What argument do you think the principal would make?



[Figures: two graphs of Grade Point Average against Hours Worked Per Week (0 to 28). In the first graph the vertical axis runs from 2.6 to 3.1; in the second it runs from 0 to 3.2.]

The table contains information about the world population for specific years.

3. Draw a graph to represent the data from the table.

4. a. Based on these data, what would you expect the world population to be in the current year?

b. Use an almanac or search the Internet to find the world population in the current year. How does the actual population for the current year compare to your answer for part a?

c. What do you think the world population was in 1985? How did you make your estimate?

d. Which estimate of the world population do you think will be more accurate: your estimate for 1985 or an estimate for 2025? Explain your reasoning.



Year    World Population (in billions)
1630    0.5
1820    1.0
1890    1.5
1930    2.0
1950    2.5
1960    3.0
1968    3.5
1975    4.0
1981    4.5
1987    5.0
1994    5.5
2000    6.0

Source: http://www.ibiblio.org/lunarbin/worldpop/
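One way to study the table is to look at how many years each increase of 0.5 billion took, and to estimate a year that is not listed from the two rows around it. The sketch below shows one possible approach; it assumes steady growth between neighboring rows, which is a simplification, and the function names are made up for this example.

```python
# (year, world population in billions), copied from the table
population_table = [
    (1630, 0.5), (1820, 1.0), (1890, 1.5), (1930, 2.0),
    (1950, 2.5), (1960, 3.0), (1968, 3.5), (1975, 4.0),
    (1981, 4.5), (1987, 5.0), (1994, 5.5), (2000, 6.0),
]

def years_per_half_billion(table):
    """Years needed for each successive increase of 0.5 billion."""
    return [later[0] - earlier[0] for earlier, later in zip(table, table[1:])]

def interpolate(table, year):
    """Estimate the population in `year` from the two nearest rows,
    assuming steady growth between them (an assumption, not a fact)."""
    for (y0, p0), (y1, p1) in zip(table, table[1:]):
        if y0 <= year <= y1:
            return p0 + (p1 - p0) * (year - y0) / (y1 - y0)
    raise ValueError("year is outside the range of the table")

print(years_per_half_billion(population_table))
print(round(interpolate(population_table, 1985), 2))
```

The function refuses years outside the table on purpose: estimating beyond the last row means guessing how the pattern continues, which is a different kind of estimate.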




Use Student Activity Sheet 6, which shows examples of graphs made by two students to answer problem 3.

5. a. Do these graphs accurately represent the data? Explain.

b. Compare these graphs to the graph you drew in problem 3.

You may hear that the climate is changing since the average global temperature is rising. The graph below shows the average global temperature from 1900 to 2002.

[Figure: Increase in Global Temperatures 1900–2002. Average Global Temperature (in °F), with the vertical axis marked from 56 to 58, shown for the periods 1900–1909, 1910–1929, 1930–1939, 1940–1969, 1970–1979, 1980–1989, 1990–1999, 2000–2001, and 2002.]

Source: National Oceanic and Atmospheric Administration

6. a. Does the graph give you a good representation of the change in temperature? Explain your thinking.

b. How do the scales of the graph affect your impression?

c. Comment on the statement: The data show that the earth is getting warmer than ever before.


Although the two graphs below represent the same data, they give different impressions.



[Figures: two graphs titled Percentage of Delayed Flights, showing Percent Delayed by Year (98 to 04). In the first graph the vertical axis runs from 0 to 25; in the second it runs from 15 to 25.]

7. a. Which graph suggests that fewer flights are delayed?

b. How was this impression achieved?

c. What groups of people might choose to use each graph? Why?

d. Do you think these data were gathered by sampling, or do you think they represent all flights? Give reasons to support your answer.


8. The graph represents the percentage of airline seats filled during the second quarter of 2003 through the first quarter of 2004.

a. It appears that the first quarter of 2004 had double the percentage of passengers as in the second quarter of 2003. Is this an accurate description? Explain your answer.

b. Write a statement that accurately describes the changes from the second quarter of 2003 to the first quarter of 2004.

Populations of Five Large U.S. Cities

9. a. What message does the graph above tell you?

b. Do you think the pictures express the populations in an accurate way? Why or why not?

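[Figure for problem 8: Passenger Load Factor. Horizontal bars for the Second Quarter of 2003, Third Quarter of 2003, Fourth Quarter of 2003, and First Quarter of 2004, on a Percent axis marked from 69.5 to 73.0.]

[Figure for problem 9: Populations of Five Large U.S. Cities, shown as pictures: Oklahoma City, OK, 500,000; Charlotte, NC, 540,000; Memphis, TN, 650,000; Indianapolis, IN, 790,000; San José, CA, 900,000.]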




10. a. How was the $209 computed?

b. What data are used to make this graph?

c. The last package has $838 written above it. The amount $838 is 4 × $209.40 rounded to the nearest whole number. Explain how this fits the data.

d. Is the size of the last package four times the size of the first? Carefully explain your answer.

[Figure: Eating Cereal Is a Good Investment. Annual cost shown as four cereal packages of increasing size: 5 packages per month, $209; 10 packages per month, $419; 15 packages per month, $628; 20 packages per month, $838. Average cost of a 16 oz box of cereal is $3.49.]
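The dollar amounts in the cereal graph can be checked with a short calculation. The sketch below assumes that the annual cost is the number of packages per month times 12 months times the $3.49 price; the function name is made up for this example.

```python
PRICE_PER_BOX = 3.49    # average cost of a 16 oz box, from the graph
MONTHS_PER_YEAR = 12

def annual_cost(packages_per_month):
    """Annual cereal cost in dollars for a given number of boxes per month."""
    return packages_per_month * MONTHS_PER_YEAR * PRICE_PER_BOX

for packages in (5, 10, 15, 20):
    cost = annual_cost(packages)
    # Show the exact cost and the cost rounded to the nearest dollar.
    print(packages, f"{cost:.2f}", round(cost))
```

The calculation only checks the numbers. Whether the pictures are drawn fairly is a separate question: a picture drawn 4 times as tall and 4 times as wide covers 16 times the area.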


Design and carry out a statistical survey. Pay attention to the following aspects.

• What are you going to investigate?

• How do you write a good questionnaire?

• What is a good sample for your survey?

• How will you graphically represent your data?

• What conclusions can you make from your data?

• Do any new questions arise from your data?

• What further investigations might be necessary?

• Is there possible bias in the way you intend to carry out your survey? If so, can you eliminate it?





In order to make accurate conclusions from data, the data must be reliable and presented appropriately.

Data can be presented in many different ways.

• picture graphs

• line graphs

• bar graphs

• histograms

• scatter plots

When you see a graph, you should look carefully to make sure that the graph is a fair one that accurately tells the story of the data. Data may be misrepresented if one or more of the following occurs:

• the graph’s axes are scaled improperly;

• origins on the graph are excluded;

• three-dimensional pictures are used inappropriately;

• numbers that should not be compared are compared;

• pictures that do not fit the numbers are used.
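The effect of leaving out the origin can be put into numbers. The sketch below compares how long two bars look when the axis starts at 69.5 with how they compare when the axis starts at zero. The two percentages are hypothetical and are not read from any graph in this unit.

```python
def drawn_length(value, axis_start):
    """Length of a bar that starts at `axis_start` instead of at zero."""
    return value - axis_start

def apparent_ratio(a, b, axis_start):
    """How many times longer bar b looks than bar a on the drawn graph."""
    return drawn_length(b, axis_start) / drawn_length(a, axis_start)

# Hypothetical percentages chosen only to illustrate the idea.
first, second = 70.5, 71.5

print(apparent_ratio(first, second, axis_start=69.5))  # axis cut off at 69.5
print(apparent_ratio(first, second, axis_start=0))     # axis starts at zero
```

With the cut-off axis the second bar looks twice as long as the first, even though the two values differ by only one percentage point.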

1. Why is it important for a graph to be an accurate representation of data?

Richard is a member of a neighborhood football club. His father and his brother John were members as well.

They recorded the number of club members for 10 different years.

Year    Number of Club Members
1983    45
1984    41
1985    53
1995    60
2002    68
2003    70
2005    67
2006    80
2007    75
2008    70



Richard graphs these data in the following way.

2. a. Does Richard’s graph represent the data accurately? Explain.

b. Draw another graph that you think accurately represents the data.

3. Cut out a graph from a newspaper or magazine. Include the caption or article that accompanies the graph. Attach the graph with the article or caption to a sheet of paper. Write a paragraph that explains how the graph presents the information in the caption or article. Do you think the graph is a good representation of the data or not? Explain your reasoning.

Make a list of all of the ways you think a graph can be misleading. Make another list of things you should watch for when you are looking at graphs in the media.

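[Figure: Richard’s graph of Number of Club Members (vertical axis marked from 10 to 90) by Year, with the years 1983, 1984, 1985, 1995, 2002, 2003, 2005, 2006, 2007, and 2008 marked along the horizontal axis.]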



Section D: Using Data

Exploring Growth

Graphs of data can help you get a better picture of how the data are distributed. Graphs can help you interpret the data and make statements about an experiment.

The graph below represents the height of plants after a seven-day period.

1. a. Study the graph. What does the bar at 7 millimeters (mm) represent?

b. Write three statements about how tall the plants in the study grew.

[Figure: pictures labeled Day 1, Day 4, Day 5, Day 6, and Day 7.]

[Figure: histogram of plant heights after seven days. Horizontal axis: Height (in mm), 1 to 21; vertical axis: Frequency (Number of Plants), 0 to 25.]

Jo Mei says, “The mean height of the plants at the end of the experiment is about 10 mm.”

Jorge says, “The mean height of the plants at the end of the experiment is about 15 mm.”

2. a. Who do you think is right, Jo Mei or Jorge?

b. Which number is most easily found in the graph: the mean, the median, or the mode of the height values?

c. Reflect Explain how you might use the information in the graph to find the median height of the plants.

Akir, Kari, Viviana, and Marja were studying the growth of plants. After seven days, they each graphed their data. Then they wrote statements about the growth of their plants. Unfortunately, some of the work was misplaced, and the rest was mixed together. On this page you see what is left of their work.

3. For each student, match the statement with a graph. If a student’s graph is missing, make an appropriate graph.


The histograms shown below and on the next page represent the frequency of heights of a group of plants at 10 days, 12 days, and 14 days.

[Figures: two histograms of plant heights. Horizontal axis: Height (in millimeters), 0 to 180 in steps of 10; vertical axis: Frequency (number of plants), 0 to 20.]

4. a. Write down what each histogram tells you about the plant heights.

b. Describe the growth pattern of these plants.

[Figure: histogram of plant heights. Horizontal axis: Height (in millimeters), 0 to 180 in steps of 10; vertical axis: Frequency (number of plants), 0 to 20.]

Josh decided to make a line graph of the mean plant height for all of the plants for certain days.

5. a. How did he find the height for day 10? What does this point mean?

b. When did the plants seem to grow the most? How can you see this on the graph?

c. Comment on the advantages and disadvantages of using Josh’s graph to describe the growth of the plants.

[Figure: Mean Height of Plants. Line graph of Height (in mm), 0 to 100, against Day, 0 to 20.]

Box plots can also be used to compare the heights of the plants.

Remember that a box plot is a graph in which the data are grouped into four groups of roughly equal size. To draw a box plot, you need five numbers from your data:

• the lowest number, or minimum

• the middle of the lower half, or first quartile (Q1)

• the median

• the middle of the upper half, or third quartile (Q3)

• the highest number, or maximum
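The five numbers in the list above can also be found with a short program. The sketch below follows the rule in the list, taking the quartiles as the middles of the lower and upper halves. It assumes that the median is left out of both halves when the number of values is odd, which is one common convention; other conventions give slightly different quartiles. The plant heights are made up.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Q1 is the middle of the lower half and Q3 the middle of the upper half.
    When the number of values is odd, the median itself is left out of
    both halves (one common convention; others exist)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[-half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical plant heights in millimeters.
heights = [42, 55, 61, 48, 70, 66, 59, 73, 50, 64, 58]
print(five_number_summary(heights))
```

The box of the box plot runs from Q1 to Q3 with a line at the median, and the whiskers reach out to the minimum and the maximum.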

6. a. Explain what these box plots tell you about the growth of the plants from the 12th to the 14th day.

b. What can you tell about the growth of the plants from the box plots that you cannot tell from the line graph? From the histograms?


[Figure: Box Plots of Plant Height. Box plots for Day 12 and Day 14 on a Height (in mm) axis from 0 to 180.]


Presenting the Bean Sprout Data

In the bean sprout growth experiment you conducted on page 2, you collected data and kept a record of the lengths of the bean sprouts and your observations.

Prepare a report for the class on the results of your experiment. Your report should include the following:

• the data you collected;

• a graph of the final length of each sprout;

• a written description of what the graph shows about the final sprout lengths;

• a graph of the growth of the sprouts over time;

• a written description of what the graph shows about the growth of the sprouts over time;

• some statements about the data in which you use the mean, median, or mode (select the one you feel is best for your data);

• a statement about how the growth of the beans varied;

• your conclusions about how the bean sprouts grew in the solution you used;

• a list of the things that affected the results of your experiment; and

• a list of the things you would change if you were to repeat the experiment.


Your teacher will give you some self-stick notes. Write the final length of each bean sprout on a separate self-stick note. Put your notes on the number line your teacher has drawn on the board to make a histogram. Use the finished histogram on the board to answer the following questions.

7. a. What is the average length of the bean sprouts?

b. How did the bean sprouts in your solution compare to the bean sprouts in other solutions? What conclusions can you draw about the effects of your solution on the way the bean sprouts grew?

The histogram gives a general picture of the growth of the bean sprouts in each solution.

Suppose you want to compare the effects of the different solutions more directly. One way to compare solutions is with box plots.

8. a. Make a box plot for the final lengths of the bean sprouts in your solution. As a class, decide on a common scale for the box plots. Why is a common scale necessary?

b. Below your box plot, write the type of solution your group used. Put your box plot on the board together with those from other groups. How do the box plots help you compare the different solutions?

c. What is the difference between the information you can see on a histogram and the information you can see in a box plot?



Using Data

A data set can be graphed in different ways.

A histogram is a general picture of the data. It allows you to see how the data are distributed.

A box plot is a graph in which the data are grouped into four groups of approximately equal size. A box plot gives you a summary of the five values:

• the lowest number, or minimum

• the middle of the lower half, or first quartile (Q1)

• the median

• the middle of the upper half, or third quartile (Q3)

• the highest number, or maximum

[Figure: histogram of plant heights. Horizontal axis 0 to 180 in steps of 10; vertical axis: Frequency (number of plants), 0 to 20.]


These five values divide the data into four groups, each of which contains about 25% of the data. A box plot shows an overall picture of the data but does not allow you to see details. Box plots are particularly useful for comparing several data sets.

A line graph (graph over time) shows change over a period of time, for example, the change in the length of the bean sprouts from day to day. The graph allows you to look for trends in the data.

A description of the data and what you can see in the graph can make interpreting the graph easier. Statements about the center of the data using mean or median and about the spread using range or quartiles can also give insights into the data.
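Measures of center and spread can be worked out from the bars of a histogram. The sketch below uses a hypothetical set of bars (height in mm and number of plants) and Python's standard `statistics` module; `multimode` needs Python 3.8 or later.

```python
from statistics import mean, median, multimode

# Hypothetical histogram: height in mm -> number of plants with that height.
frequencies = {8: 2, 9: 5, 10: 9, 11: 6, 12: 3}

# Expand the frequency table back into one value per plant.
heights = [h for h, count in frequencies.items() for _ in range(count)]

print(mean(heights))                 # balance point of the histogram
print(median(heights))               # middle plant when heights are in order
print(multimode(heights))            # height(s) with the tallest bar
print(max(heights) - min(heights))   # range, a simple measure of spread
```

The mode can be read straight from the tallest bar, while the mean and the median need the counts from every bar.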

1. Can you estimate the mean, median, or mode from each of these types of graphs? Explain.

• a histogram

• a box plot

2. When is it helpful to use box plots?

[Figure: Box Plots of Plant Height. Box plots for Day 12 and Day 14 on a Height (in mm) axis from 0 to 180.]



The table contains the starting salaries of teachers in secondary schools in different countries in 2001. The salaries are given in dollars and adjusted for different money values among countries. The countries have been coded by grouping the continents in which they are located. Group I is North America, Europe, Australia, and New Zealand; Group II is South America, Africa, and Asia.

3. a. What salary differences between the two groups of continents do you expect to find?

b. Make box plots to represent the starting salaries of teachers in the two continent groups.

c. How would you describe the differences in starting salaries in the two continent groups?

Country           Starting Salary (in dollars)    Continent Group
Argentina         11,000                          II
Australia         28,000                          I
Austria           25,000                          I
Belgium (Fl.)     31,000                          I
Belgium (Fr.)     30,000                          I
Brazil            17,000                          II
Chile             12,000                          II
Czech Republic    12,000                          I
Denmark           30,000                          I
Egypt             2,000                           II
England           23,000                          I
Finland           23,000                          I
France            24,000                          I
Germany           43,000                          I
Greece            20,000                          I
Hungary           8,000                           I
Iceland           23,000                          I
Indonesia         1,000                           II
Ireland           24,000                          I
Italy             25,000                          I
Korea (South)     25,000                          II
Malaysia          14,000                          II
Netherlands       29,000                          I
New Zealand       18,000                          I
Norway            29,000                          I
Paraguay          14,000                          II
Peru              6,000                           II
Philippines       11,000                          II
Portugal          20,000                          I
Scotland          22,000                          I
Slovakia          5,000                           I
Spain             31,000                          I
Sweden            23,000                          I
Switzerland       49,000                          I
Thailand          6,000                           II
Tunisia           21,000                          II
United States     29,000                          I
Uruguay           6,000                           II

Source: Organization for Economic Cooperation and Development

Write a paragraph that helps another class to prepare to do the bean sprout experiment. Make sure you include any changes that you would make that might make it easier to collect data that give you the information you want.



Section E: Correlating Data

Growing Babies

In 1962, researchers studied 100 newborn babies to see whether there was a relationship between the length of the body and the circumference of the head. The table shows the results of this study.

[Table: tally chart of Circumference of the Head (in cm), 32 to 39, against Body Length (in cm), 47 to 56, with one tally mark for each baby.]

Source: Sunburst Communications

Most tallies appear along the diagonal that extends from the lower left to the upper right corner of the graph.

1. In relation to the diagonal, where do you find the babies with large heads relative to their lengths?

2. As the body lengths increase, what happens to the head sizes of the babies?

The scatter plot displays a picture of the information in the table.

3. a. Explain why the axes do not start at (0, 0).

b. What information seems to get lost in the scatter plot when compared with the table?

4. Are there babies for whom body lengths are almost equal to head sizes?

[Figure: scatter plot of Circumference of the Head (in cm), 30 to 40, against Body Length (in cm), 46 to 58.]


To show body lengths, head circumferences, and the number of babies studied, a three-dimensional graph can be drawn.

5. a. What information does the purple bar in the graph represent? Check this information against the data in the table.

b. What are the advantages and disadvantages of a three-dimensional graph?

Juan and Brenda have a discussion about the data.


[Figure: three-dimensional bar graph of the number of babies by Body Length (in cm), 48 to 56, and Head Circumference (in cm), 32 to 38.]

Source: Sunburst Communications

“I think you can say that there is a relationship between a baby’s body length and the circumference of its head.”

“I’m not sure about that. The tallies seem to be just spread over the chart.”

“If you look at the table, most babies fit in a general pattern.”

“You can say that the longer the baby, the larger the circumference of its head.”

“That’s not true for all babies. Look at the table. I see babies with body lengths of 47 centimeters (cm) and of 52 cm, and both have a head circumference of 34 cm.”

“Yes, I agree that you can find those exceptions. I can even find a baby that is 53 cm long and has a head circumference of 33 cm, but this seems to be an outlier!”

“Okay, but the relationship you described is not a strong one, and I think when the babies start growing, this relationship will get even weaker or disappear.”

6. Do you agree with Brenda or Juan? Explain.

7. a. What do you think the head circumference is for a baby with a body length of 55 centimeters (cm)? How did you determine your answer?

b. A one-year-old child has a height of 92 cm. What can you tell about the circumference of the child’s head?

Scatter plots are often made to investigate the relationship between two variables. The points on a scatter plot may look like a “cloud.” When the cloud is long and slender, there is a strong correlation between the two variables. When the correlation is strong, knowing something about one of the variables helps you know how the other variable will behave. If the cloud of points is very scattered or in a circle, no underlying relationship exists between the variables. Knowing something about one of the variables cannot tell you about the other. In this case, there is no correlation between the two variables.
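Statisticians often put a number on how slender the cloud is. One such number, which this unit does not use, is the Pearson correlation coefficient. The sketch below computes it for two hypothetical data sets, one close to a line and one scattered; all data values are invented for illustration.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient of paired data: a number from -1 to 1.

    Values near 1 or -1 go with a long, slender cloud of points;
    values near 0 go with a round or widely scattered cloud."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data sets, invented for illustration.
xs = [1, 2, 3, 4, 5, 6]
slender = [1.1, 2.0, 2.9, 4.2, 5.0, 5.9]   # points close to a line
scattered = [3, 5, 1, 4, 2, 4]             # no clear pattern

print(round(correlation(xs, slender), 2))
print(round(correlation(xs, scattered), 2))
```

A coefficient close to 1 matches a slender cloud that rises from left to right, and a coefficient close to 0 matches a cloud with no clear direction.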

8. Describe the correlation in the scatter plots. Indicate whether there seems to be no correlation, a weak correlation, or a strong correlation.

9. For which of the above scatter plots can you best predict the value of y when x = 4? Explain your reasoning.


[Figure: four scatter plots labeled a, b, c, and d, each with x and y axes from 0 to 6.]

In the diagrams for problem 8, you can see that a correlation can be weak or strong. Correlations can also be positive or negative.

10. a. What do you think is meant by the phrase “negative correlation”?

b. Which of the scatter plots on the previous page show a negative correlation?

11. For each of the following cases, decide whether there is no correlation, a strong correlation, or a weak correlation between the two variables mentioned. If there is a correlation, is it positive or negative?

• a person’s height and pulse rate

• the number of hours of sports training per week and pulse rate

• the height of a dinosaur and the length of its tail

• results on a math test and a science test

• temperature outside in the summer and kilowatt hours of electricity used

• number of children per household and number of television sets per household

• number of hours students study and their grade point averages

In a cause-effect relationship between two variables, a change in one variable directly causes a change in the other.

12. a. If you expect a strong correlation between the two variables, can you be sure that a cause-effect relationship exists? Why or why not?

b. Do you think there is a cause-effect relationship in any of the situations in problem 11? Explain your reasoning.

Collect some data about one of the cases in problem 11 and use the data to check your answer about the correlation. Be sure to think carefully about how you will select your sample.




Correlating Data

A scatter plot shows information about two variables that are paired in some way and can help you see whether a relationship exists between the variables. By looking for a trend in the scatter plot, you can see that a relationship exists.

If the points on a scatter plot are close to forming a straight line, there is a strong correlation. If the points are scattered all over and do not follow a pattern, there is a weak correlation.

Correlations can also be positive or negative. A negative correlation exists if the values of one variable increase while the values of the other decrease.

A scatter plot with a strong correlation does not imply that a cause-effect relationship between the variables exists. Other kinds of analysis must be done to determine whether a change in one variable causes a change in the other.


1. Explain how a scatter plot can help you understand correlation.

[Figure: scatter plot with x and y axes from 0 to 6.]


[Figure: three scatter plots.
a. Vehicle Fuel Economy: Miles per Gallon (0 to 60) against Weight (in lbs), 1900 to 2300.
b. Birds of Maine: Maximum No. of Eggs per Hatch (0 to 14) against Size of Bird (in cm), 0 to 100.
c. Shoe Size and Head Circumference: Circumference of Head (in cm), 50 to 62, against Shoe Size (in cm), 34 to 48.]

2. a. Describe the correlation in each of the three scatter plots.

b. In which of these plots might there be a cause-effect relationship?

3. Find an example of two variables that have a strong correlation but do not have a cause-effect relationship.


4. Match each of the descriptions of correlation below to a corresponding scatter plot.

i. linear, negative, and moderate

ii. linear, positive, and strong

iii. no apparent correlation

iv. linear, positive, and weak

For small children, foot size can be correlated with the ability to read; the larger the foot, the better they can read. But both reading and foot size are a function of age. Find another example of two data sets with strong correlation but where the relationship is really due to another common factor.

[Figure: four scatter plots labeled a, b, c, and d.]

Section F: Lines That Summarize Data

Egg Hunt

Marisol noticed that a bird had recently laid eggs in a nest outside her window. The eggs appeared small, and she wondered whether they had the same shape as chicken eggs.

1. a. Do you think all bird eggs have the same shape?

b. How can you check your answer?

Throughout this section, you will use scatter plots to study the relationship between the lengths and widths of various bird eggs.

On Student Activity Sheets 7, 8, and 9 you will find tables with data on 96 birds of Britain. Since there are many birds, they are classified into families. Among the data, you will find the average lengths and widths of the eggs of each bird.

2. a. Draw an eider’s egg using the dimensions found in the table.

b. Draw a lesser whitethroat’s egg.

c. How are the eggs different? What might account for the differences?

[Figure: a quail egg and a chicken egg.]

You have discovered that eggs are different sizes. But what about their shapes? Is there a relationship between the lengths and widths of bird eggs? To answer these questions, you will begin by looking at the eggs of one family, the warbler.

3. a. Find all the warblers in the table on Student Activity Sheets 7, 8, and 9, and make a scatter plot of the lengths and widths of the warbler eggs (put length on the horizontal axis and width on the vertical).

b. Describe the relationship between the lengths and widths of the warbler eggs. Is there any correlation?

c. What do you expect the width of a warbler egg will be if its length is 15 mm? How did you find your answer?

The points in the scatter plot lie almost on a straight line. You might have used this information to find an answer to problem 3c. It is possible to summarize the pattern, or relationship, by drawing a line that seems to best fit these points. This line will probably not go through all the points, but the points should lie close to the line.

4. a. On Student Activity Sheet 10, draw a straight line that seems to “fit” these points. Use your line to predict the width of a warbler egg that is 15 mm long.

b. Does your line give you the same answer that you predicted in problem 3c?

5. a. Use your line to determine what happens to the width of an egg if the length increases by 2 mm. Explain how you did this. Include a drawing with your answer.

b. Describe what will happen to the width of an egg if the length increases by 1 mm.


In problem 4, you looked at a scatter plot and drew a line that summarized the relationship between the variables. Your line may have been very different from the lines drawn by your classmates.

6. How can you decide whether a line drawn to summarize the data in a scatter plot is a good line? What criteria would be helpful in making your decision?

7. a. Use your criteria to decide whether or not you drew a good line.

b. How does your line compare with the line drawn by a classmate?

There are many ways to find a good line. Different criteria are used for different purposes. One criterion is how well your line predicts a value for something you already know.

8. a. Look at the graph of your line from problem 4. What does it predict for the width of a lesser whitethroat warbler egg that is 16.5 mm long?

b. Now look at the data on Student Activity Sheets 7–9. What is the actual value for the width of the lesser whitethroat warbler egg? How well did your line predict this value?

c. Overall, does the line you drew seem to predict values well when you compare it with the actual data? Explain.

[Figure: an ostrich egg, an emu egg, a chicken egg, and a quail egg.]

In problem 4, you drew a line to predict the width of the warbler egg. One way to get a line is to draw the line so that it goes through the point that is the mean of the lengths and the mean of the widths for the eggs. The slope of the line should reflect the direction of the trend in the points.

Student Activity Sheet 11 shows a line for the warbler eggs that contains the point (mean length, mean width).
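The idea of a line through the mean point can be sketched in a few lines of code. The egg measurements below are hypothetical, not the data from the Student Activity Sheets, and the slope of 0.7 is chosen by eye to follow their trend. Once the slope is chosen, the mean point fixes the rest of the line.

```python
def mean(values):
    return sum(values) / len(values)

def intercept_through_mean_point(lengths, widths, slope):
    """Intercept of the line with the given slope that passes through
    the point (mean length, mean width)."""
    return mean(widths) - slope * mean(lengths)

# Hypothetical egg measurements in millimeters.
lengths = [15.0, 16.0, 17.0, 18.0, 19.0]
widths = [11.8, 12.4, 13.2, 13.8, 14.8]

# Slope chosen by eye: the width rises about 0.7 mm for each extra
# millimeter of length in this made-up data set.
slope = 0.7
intercept = intercept_through_mean_point(lengths, widths, slope)

def predict_width(length):
    return slope * length + intercept

print(round(intercept, 2))
# The prediction at the mean length equals the mean width.
print(round(predict_width(mean(lengths)), 2), round(mean(widths), 2))
```

A different choice of slope gives a different line, but every such line still passes through the mean point.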

9. a. How is this line different from the line you drew in problem 4?

b. Use this new line to find the width of the lesser whitethroat warbler egg that is 16.5 mm long. How close is the prediction to the width given in the table?

c. Overall, how well does the line seem to predict the widths of the warbler eggs?

One equation for the relationship between the length and width of the warbler eggs that goes through the mean is:

width = 0.7 × length + 1.1
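An equation like this one can be used as a rule for making predictions. The sketch below reads the equation as width = 0.7 × length + 1.1 (the multiplication and addition signs are an assumed reading of the printed symbols) and evaluates it for an egg that is 16.5 mm long. The function name is made up for this example.

```python
def warbler_width(length_mm):
    """Predicted egg width in mm, reading the equation as
    width = 0.7 * length + 1.1 (an assumed reading of the symbols)."""
    return 0.7 * length_mm + 1.1

# Predicted width for an egg that is 16.5 mm long.
print(round(warbler_width(16.5), 2))

# Each extra millimeter of length adds the slope, 0.7 mm, to the width.
print(round(warbler_width(17.5) - warbler_width(16.5), 2))
```

The predicted width can then be compared with the measured width in the table to judge how well the line fits.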

10. a. Show that this line contains the point (mean length, mean width).

b. Use this equation to find the width of the lesser whitethroat warbler egg that is 16.5 mm long.

c. How close is the width you found to the width given in the table?

11. a. What is the slope of the line for the warbler eggs in question 10?

b. How can you find this number from the graph?

c. What does the slope tell you about the relationship between the length and the width of warbler eggs?




12. a. What does the other number in the equation represent on the graph (the number that is not the slope)?

b. Does this number make sense in terms of egg length and width? Why or why not?

c. What length does the equation predict for eggs that are 1 mm wide?

d. Do you think the equation is useful for very small numbers? Why or why not?

The scatter plot shows the data for 96 birds of Britain.

13. a. Where in the scatter plot are the data for the warbler eggs?

b. What happens to the relationship between length and width as the bird eggs get larger?

[Scatter plot: Birds of Britain. Horizontal axis: Length (in mm), 10 to 100; vertical axis: Width (in mm), 0 to 60.]




An equation for the line for the scatter plot of the 96 birds of Britain is:

width = 0.7 × length + 1.6

14. a. Compare the equation above to the equation for the warblers: width = 0.7 × length + 1.1. What can you tell about the two lines from the equations?

b. Sketch this line on Student Activity Sheet 11. How do the lines compare?

c. If you make predictions based on the line for all eggs, what will happen to your predictions as the egg lengths increase?

15. What can you tell about the egg shapes of birds from different families?

For this activity, you will need chicken eggs and either a tape measure or a ruler.

Measure some chicken eggs at home. Record the lengths and widths. In class, collect everyone’s data. Find the typical length and width for a chicken egg.

How does the typical chicken egg fit in the graph of the birds of Britain?

Gone Fishing

Graphs of growth often result in curves because living things do not grow at a constant rate throughout their lifetimes.

Researchers in fisheries are interested in studying the growth rates of fish. One study on bluegills compared the lengths of fish at the beginning of the year with their lengths at the end of the year.

The table contains the results of this research.

Bluegill Growth

Initial Length x (in mm)    Length after 1 Year y (in mm)
 48                          69
 52                          71
 51                          69
 53                          75
 68                         101
 71                         107
 69                         100
 75                         104
101                         138
107                         138
100                         130
104                         140
138                         160
132                         157
130                         156
140                         161
160                         173
157                         168
156                         172
161                         178
173                         176
168                         174
172                         173
178                         178

16. a. How much did the fish that was initially 161 mm grow in one year?

b. Graph the data from the table.

c. What pattern do you see in the scatter plot from problem 16b?

17. a. Draw a straight line to represent the data in the scatter plot.

b. Estimate how long a 140-mm fish will be after one year.

c. Is a straight line a good model to represent the data in the graph? Explain your reasoning.

18. a. Sketch a model on the graph that better represents the data.

b. Using this model for the growth, describe what is happening to the growth of the fish as the length changes.

c. If a fish is now 110 mm long, predict how long it will be in one year.

19. Describe the difference between using a straight line and using the curved line you created in problem 18 to predict growth.
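The statement that living things do not grow at a constant rate can be checked against the Bluegill Growth table. The sketch below is only one possible check: the cutoffs of 100 mm and 150 mm for "small" and "large" fish are assumptions chosen for illustration, and the helper name `growth` is hypothetical.

```python
# (initial length, length after 1 year) in mm, copied from the Bluegill Growth table
bluegills = [
    (48, 69), (52, 71), (51, 69), (53, 75),
    (68, 101), (71, 107), (69, 100), (75, 104),
    (101, 138), (107, 138), (100, 130), (104, 140),
    (138, 160), (132, 157), (130, 156), (140, 161),
    (160, 173), (157, 168), (156, 172), (161, 178),
    (173, 176), (168, 174), (172, 173), (178, 178),
]


def growth(fish):
    """Growth in mm over one year for one (initial, after) pair."""
    initial, after = fish
    return after - initial


# Assumed cutoffs: "small" means shorter than 100 mm, "large" means longer than 150 mm.
small = [growth(f) for f in bluegills if f[0] < 100]
large = [growth(f) for f in bluegills if f[0] > 150]

print("mean growth of small fish:", sum(small) / len(small), "mm")
print("mean growth of large fish:", sum(large) / len(large), "mm")
```

With these cutoffs, the small fish grow about three times as much in a year as the large fish, which is why the points bend into a curve instead of following one straight line.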



Lines That Summarize Data

If the relationship between two variables appears to be linear, a line can be found to describe the relationship.

Straight lines can be used to predict unknown values and to check existing values over the range of the data in the problem. If the trend seems to be linear, predicting an unknown value between data points will probably give an answer close to the truth.

The slope of the line can be expressed in terms of the data. This can lead to statements such as the following:

When the ____ increases by ____, the ____ ____ by ____.
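A statement of this kind can be produced from any two points on a line. The sketch below is an illustration only: the function names are hypothetical, and the two points are assumed to have been read off a line with slope 0.7, like the egg lines in this section.

```python
def slope_from_points(p, q):
    """Slope = vertical change divided by the corresponding horizontal change."""
    (x1, y1), (x2, y2) = p, q
    return (y2 - y1) / (x2 - x1)


def slope_statement(x_name, y_name, run, slope, unit):
    """Fill in the blanks of the Summary statement for a given horizontal step."""
    change = slope * run
    verb = "increases" if change >= 0 else "decreases"
    return (f"When the {x_name} increases by {run} {unit}, "
            f"the {y_name} {verb} by {abs(change):g} {unit}.")


# Two assumed (length, width) points in mm on a line with slope 0.7.
slope = slope_from_points((10, 8.1), (20, 15.1))
print(slope_statement("length", "width", 10, slope, "mm"))
# prints: When the length increases by 10 mm, the width increases by 7 mm.
```

Reading the statement backward gives the slope again: a change of 7 mm in width for every 10 mm in length is a slope of 7 ÷ 10 = 0.7.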

[Scatter plot with a line. Horizontal axis: Length (in mm); vertical axis: Width (in mm).]

[Scatter plot: Bluegill Growth. Horizontal axis: Initial Length (in mm), 0 to 200; vertical axis: Length after 1 Year (in mm), 60 to 180.]

You can draw a line that seems to capture the trend in the data. You can check how well your line seems to summarize the relationship between the variables by checking how much the predictions made using the line would vary from the actual values in the data. A curve may sometimes be used to describe a relationship that cannot be described by a straight line. This is usually the case in the context of growth, since growth rates are not constant.
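Checking a line against the data means comparing each actual value with the value the line predicts. The sketch below does this for three fish from the Bluegill Growth table. The straight line used here (slope 0.85, intercept 35) is an assumed hand-drawn guess, not a line given in this unit, and the function name `prediction_errors` is hypothetical.

```python
def prediction_errors(points, slope, intercept):
    """Actual value minus predicted value for each (x, y) data point."""
    return [y - (slope * x + intercept) for x, y in points]


# Three (initial length, length after 1 year) pairs from the Bluegill Growth table:
# a small fish, a medium fish, and a large fish.
sample = [(48, 69), (104, 140), (178, 178)]

# Assumed hand-drawn line: length after = 0.85 x initial length + 35
errors = prediction_errors(sample, slope=0.85, intercept=35)

for (x, y), error in zip(sample, errors):
    print(f"initial {x} mm: actual {y} mm, actual minus predicted {error:+.1f} mm")
```

For this assumed line, the prediction is too high for the small fish, too low for the medium fish, and too high again for the large fish. Errors that change sign in this way suggest that a curve would describe the data better than a straight line.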

1. Explain how you can find the slope of a line from a statement like the one bolded in the Summary.


The table shows the number of calories and grams of carbohydrates for standard size servings of different kinds of fruit.

2. a. Make a graph of the data with the number of calories on the horizontal axis and the grams of carbohydrates on the vertical axis. Write a sentence about the relationship you can observe between calories and carbohydrates in fruit.

b. Draw a line that seems to represent the relationship you observed in the data. Write the equation for this line.

c. Describe what the slope and y-intercept each mean in terms of the data.

Fruit                               Calories    Carbohydrates (grams)
Apple, raw 2 3/4 in. diameter        80          21
Apricot, 3 raw                       60           3
Banana, raw                         105          27
Cherries, 10                         50          10
Grapefruit, 1/2 raw, white           40          10
Grapes, 10 seedless                  35          10
Cantaloupe, 5 in. diameter           95          22
Orange, 2 5/8 in. diameter           60          15
Peach, raw, 2 1/2 in. diameter       35          10
Strawberries, whole, 1 cup           45          10
Tomatoes, 1 whole                    25           5
Watermelon, 4 × 8 in. wedge         155          35

Source: Home and Garden Bulletin, No. 72, U.S. Department of Agriculture


d. Will your graph work to predict the grams of carbohydrates in a cup of raisins if you know that they have 435 calories? Why or why not?

e. Use both the graph of your line and your equation to predict the number of grams of carbohydrates in a banana. Do the two predictions agree? How far off was your prediction?

3. Find the relationship between calories and grams of carbohydrates for another food group. How does it compare to the relationship in fruit? Share your findings with a classmate.

You have looked at several different ways to describe data in this unit. Choose one of them and describe a situation in which you would use a graphical representation to display the data.



Additional Practice

Section A: Patterns in Data

Mr. Flores surveyed his high school sophomore class to see if there was a relationship between the number of hours the students study per week and their grade point averages (GPA). He created the scatter plot shown below to display the data.

1. What is the GPA of the student who studies six hours per week?

2. Do you agree with each of the statements below? Explain your answer.

a. The student who studied the least has the lowest GPA.

b. If you study at least nine hours a week you will have a GPA of 3.5.

3. Are there any outliers? If so, describe their locations.

[Scatter plot: Comparison of GPA to Hours of Study. Horizontal axis: Hours of Study (2 to 16); vertical axis: GPA (0 to 4).]


The table below shows the percentage of students who scored at or above the basic levels in math and science on the 2005 National Assessment of Educational Progress in states from the Northeast and Northwest. (Not every state administers the test.)

4. a. Do states with a high average percentage of students at or above the basic level in math also have a high percentage of students at or above the basic level in science?

b. What is the difference between the two regions, Northwest (NW) and Northeast (NE), in terms of their test scores? Can you think of any reasons for this difference?

c. In which subject are the scores better? Explain how you can see this in the graph.

d. The data might almost be grouped into three clusters, one at the bottom left, one at the top right, and the cluster of points in the middle. What can you say about the states in the cluster at the bottom left and those at the top right?

Percentage of Students at or above Basic Level

State    Math    Science    Region
IL       68      58         NW
IN       74      62         NW
MI       68      66         NW
MN       79      71         NW
MO       68      66         NW
ND       81      77         NW
OH       74      67         NW
WI       76      70         NW
ME       74      72         NE
VT       78      76         NE
MA       80      72         NE
NH       77      76         NE
RI       63      58         NE
CT       70      63         NE

Source: National Assessment of Educational Progress, National Center for Education Statistics, U.S. Department of Education


Section B: Selecting Samples

A taste test was conducted between two leading soft drinks. Testing booths were set up at two different shopping centers in the town where the soda is made. A total of 425 people stopped by the booths to take part in the taste test.

The results of the study were used in an advertising campaign:

“Three out of five people prefer the taste of Bingo Pop over other leading brands.”

1. Is this statement reliable or fair? Explain why or why not.

Two weeks later, the Bingo Pop company decided to do another taste test to see if its advertisement campaign was effective. The booths were set up in the same locations. Again, 425 people stopped by the booths to take part in the taste test. This time, the results showed that four out of five people preferred the taste of Bingo Pop. The company put out another ad:

“More and more people enjoy the taste of Bingo Pop every day.”

2. Do you think this is an accurate statement? Why or why not?

3. How could you ensure that the two surveys above were conducted accurately?

Section C: Interpreting Graphs

The graph below indicates that the price of erasers increased from 1970 to 2005.

[Graph: price of an eraser by year. 1970: 25¢; 1980: 30¢; 1990: 35¢; 2000: 45¢; 2005: 50¢.]

1. Do you think the graph represents the data accurately? Explain your answer.

2. Draw a picture that accurately represents the difference between the prices of the erasers from 1970 to 2005.


Section D: Using Data

Recall that in Section A of Additional Practice, Mr. Flores conducted a survey about the number of hours students study and their GPAs. Mr. Harrison and Ms. Simmons conduct the same survey in their classes. The results from the two classes are shown in the tables.

1. a. Explain the advantages and/or disadvantages of using the following types of graphs to display the data:

• histogram • box plot • scatter plot

b. Reflect: If you wanted to compare the two graphs of data, which graph would you use? Explain.

2. Make a scatter plot of the combined data from both classes.

Mr. Harrison’s Class Results

Hours Spent Studying    GPA
 0                      0
 1                      1
 1                      0
 2                      1
 3                      1
 4                      1.5
 5                      1
 5                      2
 6                      2
 7                      2.5
 9                      3
10                      4
11                      3.5
12                      3
12                      4
13                      3.5
14                      3.5
15                      3.5
16                      4

Ms. Simmons’s Class Results

Hours Spent Studying    GPA
 1                      0.5
 2                      0.5
 4                      1
 4                      1.5
 5                      1.5
 6                      2.5
 7                      2
 8                      2.5
 8                      3.5
10                      2.5
10                      3
11                      3
11                      4
12                      3.5
13                      3
13                      4
14                      4
15                      4
16                      3.5


When you look carefully at your scatter plot, you can distinguish a cluster of data in the upper right-hand corner.

3. Describe this cluster. What does this pattern say about the studying habits of the students in the two classes?

Section E: Correlating Data

1. Study the scatter plots shown below. Indicate whether or not there is a correlation for each plot. If there is a correlation, indicate whether it is weak or strong.

[Three scatter plots labeled a, b, and c. In each plot, the x-axis and the y-axis run from 0 to 6.]

2. What data could be represented by each of the above scatter plots?

3. Describe some data that show a strong correlation but do not have a cause-effect relationship.

Section F: Lines That Summarize Data

[Scatter plot with a line. Horizontal axis: Hours of Study (2 to 16); vertical axis: GPA (0 to 4).]

The relationship between the number of hours spent studying per week and GPA in Mr. Flores’s class can be described by a line as shown above.

1. a. Estimate the slope of the line.

b. How would you describe the slope in terms of the relationship between number of hours spent studying and GPA?

2. What GPA would you expect of a student who studied 9 hours?

3. a. What criteria do you need to decide if the line above is drawn accurately?

b. Is the line on the above graph an accurate line? Explain your answer.


Section A: Patterns in Data

1. You may want to look for clusters (groups) in the data. There may be something special about the data points in the clusters. For example, there may be a common feature for these data, such as a cluster of states that are in the same geographic region. You may want to look for patterns. Does the scatter plot show a trend of some kind? Is there more than one pattern? If the data are clustered, does each cluster have its own pattern? You can also look for outliers. What characteristics make a point an outlier?

2. a. You can write different general statements. Three examples:

• The heavier the vehicle is, the fewer miles per gallon it can drive.

• The change in the number of miles per gallon is not constant for a given change in weight.

• If the weight is between 2,000 and 2,100 lb, the fuel economy is about the same, around 30 miles per gallon.

b. The fuel consumption of vehicle B is almost 30 miles per gallon, which is in the middle.

c. You can write different answers. There seem to be no apparent outliers if you see all the points as lying on a curve. You can also argue that the two points on each of the “ends” of the curve are outliers because they are a little out of the general pattern.

[Scatter plot. Horizontal axis: Weight (in lbs), 1900 to 2300; vertical axis: Miles per Gallon, 0 to 60. Two points are labeled A and B.]


Answers to Check Your Work

3. a. Your graph might look like the one below.

You might suggest that as the percentage at or above the basic level in math increases, so does the percentage at or above the basic level in science. However, North Carolina and Texas have high math scores and average science scores. These states therefore fall outside that pattern.

b. Texas and North Carolina have the highest percentage of students at or above the basic level in math but are in the middle of the states with respect to science.

c. More states performed better in math (9) than in science (4). There was one tie (OK). One way to tell this from the graph is by thinking about the points that would have the same percentage for both math and science. The line that would go through points such as (50, 50) and (60, 60) is the line M = S. The states that fall below this line have a higher percentage of students at or above the basic level in math than they do in science.

[Scatter plot: Percentage of Students At or Above Basic. Horizontal axis: Math (40 to 70); vertical axis: Science (40 to 70). The points are labeled MS, LA, SC, AL, GA, AR, TN, WV, KY, OK, MD, NC, TX, and VA.]



d. The states in the top cluster with more than 57% of the students at or above the basic level in math are Kentucky, Maryland, North Carolina, Oklahoma, Texas, Virginia, and West Virginia. These states are farther north; the other states are all located in the far south.

e. You might say that Virginia did the best in both math and science because they had nearly the highest percentage of students (65% compared to 67%) who were at or above the basic level in math and were the highest (at 63%) in science. Texas and North Carolina had the highest percentage in math but were much lower in science (53% and 56%).

Section B: Selecting Samples

1. a. One disadvantage of taking one class as the sample is that you have all students from either sixth or seventh grade. The sampling procedure would be biased because it would leave out an important part of the population. You might also argue that students in one class influence one another with respect to their preferences, and so the results of the sample will not be reliable.

b. Different procedures are possible, for example:

• Make one list of all sixth- and seventh-grade students ordered according to their last name; then take every fifth student in the sample.

• Randomly select students from each class, for example, by putting all the names in a box and taking out as many as you need for your sample.

2. a. You would expect half of the 50 numbers to be even, so about 25.

b. 28 out of the 50 are even.

c. No. The number will vary but is most likely around 25; in about 90% of the cases, the number will be between 20 and 30.



3. As an example of incorrectly choosing the sample, you may have chosen the example from problem 11c about the travel agencies or from 11d about the political poll.

For “neglecting to account for the people who did not respond,” you may have chosen the example from problem 11b on the health food magazine survey.

For letting interviewers select the people they want to interview, you may have chosen the example from problem 11a about the police officers interviewing people.

Other examples can be found as well. Discuss your answers with your classmates.

4. It is not true that the larger the sample, the less chance of bias. If the sample is not taken properly (for instance, because people without telephones cannot be chosen in the sample), a larger sample does not change this. The same bias will still occur regardless of the sample size.
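The claim in answer 2c, that the number of even numbers falls between 20 and 30 in about 90% of the cases, can be checked with a short calculation. The sketch below assumes that each of the 50 numbers is independently even or odd with equal chance; that assumption, and the function name `prob_between`, are not from the problem itself.

```python
from math import comb


def prob_between(n, low, high, p=0.5):
    """Chance that the count of 'successes' in n independent tries
    lies from low to high (inclusive), when each try succeeds with chance p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(low, high + 1))


# Chance that 20 to 30 of 50 random numbers are even.
print(round(prob_between(50, 20, 30), 2))
```

Under this assumption the chance comes out a little below 0.9, which agrees with "about 90%."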

Section C: Interpreting Graphs

1. Different answers are possible. It is important for a graph to be an accurate representation of data to reveal relationships between the two variables so that proper conclusions can be drawn.

2. a. No, because the horizontal axis should be ordered like a number line. Richard spaced the years evenly on the horizontal axis, but there are more years between 1980 and 1990 than between 1978 and 1979, and you cannot see this now. So between 1980 and 1990, there seems to be a rather steep increase, steeper than between 1997 and 1998. But actually, the first increase should be spread out over ten years, although you do not know what happened exactly in the years in between.

b. You might make a line graph in which the years are placed on the axis with the right scale.



3. Your answer will be different from your classmates’ answers. Have one of your classmates comment on your work and the other way around. Discuss both your articles or captions and paragraphs.

Section D: Using Plant Growth Data

1. It is difficult to estimate the mean from a histogram; it is roughly the “balance” point of the distribution. The median can be found in a histogram by finding the halfway point of the number of data. You can find this by counting in the bars. If, for example, there are 15 values, the eighth one represents the median. You can also count from both ends at the same time until you meet in the middle; this is the median.

In a box plot, the median is drawn in the box as one of the summary points. There is no way to find the mean and the mode from just the box plot.

2. Different answers are possible. Box plots are very useful if you want to compare groups of data or if you want a summary of key points in the data.

3. a. Your answers may vary, but you may notice that the set of continents in Set I seems to consist primarily of countries that have a high level of industry, education, and generally good economic conditions, compared to the continents in Set II that have countries where the standard of living is low and industrial growth is not yet in place. It might be reasonable to conclude that teachers in Set I would earn more than teachers in Set II.



b. Your box plots might look like the following:

[Box Plots of Salary (in dollars): one box plot for continent Set I and one for Set II. Horizontal axis: Salary (in dollars), 0 to 55,000.]

c. Both the median and range of the yearly salaries in the group that includes North America, Europe, and Australia/New Zealand are much larger than in the other groups of continents ($24,000 and $44,000 as opposed to $11,000 and $24,000). Almost all of those in Set II are below the median salary for Set I. The United States data point is around the third quartile (Q3) of the first group, and this is higher than any of the states in the second group.

Section E: Correlating Data

1. Answers will vary, but you can say that if the points in a scatter plot form a tight cloud that is almost a straight line, there is a strong correlation. If the points are scattered more widely, the correlation is weaker; if points are in a cloud that is nearly a circle, no correlation exists.



2. a.

• The data points form a tight cloud that is almost a straight line, so the correlation is strong. As the weight of the vehicle increases, the miles per gallon decrease; so the correlation is negative.

• The data points are in a circular pattern, so there seems to be no correlation or relationship between the size of a bird and the number of eggs hatched. Some large birds hatch lots of eggs and some hatch few eggs; the same thing is true for small birds.

• The points are in a rather wide cloud, so there seems to be a weak correlation. A person with a larger shoe size seems to have a larger head circumference, so the correlation is positive. On a closer look, it seems as if two clusters of data exist. Within these clusters, the data are in a circular pattern.

b. It seems that the heavier the vehicle, the more fuel it would need to operate, so this example might be a case of cause and effect. The graph of the birds of Britain does not even show a relationship, so it would be unwise to search for a cause-and-effect relationship. The shoe size and head circumference plot does show a weak relationship, but having big feet does not cause a large head; both are functions of how big the person is to start.

3. Your answers will all differ. Compare the example you found to the examples of some of your classmates. If a strong correlation exists without a cause-effect relationship, often there is another common feature that helps explain the correlation. For example, the correlation between the number of schools in a city and the number of shopping centers may be very strong, but both are functions of the population of the city. One does not cause the other. Another example might be the relation between scoring points and making fouls in a basketball game. The correlation might be strong, but this does not mean that making fouls will increase the number of points. Both are functions of how much playing time a player had.

4. scatter plot a with statement ii
scatter plot b with statement iii
scatter plot c with statement i
scatter plot d with statement iv



Section F: Lines That Summarize Data

1. You can find the slope by dividing the vertical increase (or decrease) by the corresponding horizontal increase.

2. a. Your graph may look like the following.

[Scatter plot: Carbohydrates vs Food Energy in Fruit. Horizontal axis: Calories (20 to 200); vertical axis: Carbohydrates (in g), 0 to 40.]

You might say that as the calories in fruit increase, so does the number of grams of carbohydrates. Another answer might be that most of the fruit has less than 30 g of carbohydrates, but that would not be a statement about the relationship with the number of calories.



b. One line might be Carbohydrates = 1 + 0.23 × Calories. Note that this line goes through the point representing the mean number of calories and the mean number of carbohydrates, approximately (65, 16).

c. For this example, the slope would be 0.23, or 23/100, which means that for an increase of 100 calories, the number of grams of carbohydrates increases by 23. The y-intercept is 1 and indicates that if a type of fruit had 0 calories, the line would still predict about 1 g of carbohydrates.

d. The graph will probably not work because it was made using a range of calories from 25 to 155, and 435 is far beyond that range. Even if you made the graph over on another scale, it is so far away from the given range that you cannot be sure that the trend you see in the data still holds.

e. Your answer will vary depending on the equation you have. If you used the equation for the number of carbohydrates in a banana, you will get about 25 grams. The graph gives you about the same number of grams. In this case, the line would predict 2 fewer grams than the actual number.
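Answers 2d and 2e can be illustrated with a short sketch. It takes the line from answer 2b as Carbohydrates = 1 + 0.23 × Calories and uses the calorie range of the fruit table (25 to 155). The function names `predict_carbs` and `in_data_range` are hypothetical, and your own line may give slightly different numbers.

```python
# Line from answer 2b: Carbohydrates = 1 + 0.23 x Calories
SLOPE = 0.23
INTERCEPT = 1

# Smallest and largest calorie values in the fruit table (tomato and watermelon).
CAL_MIN, CAL_MAX = 25, 155


def predict_carbs(calories):
    """Grams of carbohydrates predicted by the line for a given calorie count."""
    return INTERCEPT + SLOPE * calories


def in_data_range(calories):
    """True if the calorie value lies inside the range the line was based on."""
    return CAL_MIN <= calories <= CAL_MAX


banana_calories, banana_actual = 105, 27  # values from the fruit table
banana_predicted = predict_carbs(banana_calories)

print("banana predicted (g):", round(banana_predicted, 2))
print("predicted minus actual (g):", round(banana_predicted - banana_actual, 2))
print("raisins (435 calories) inside the data range?", in_data_range(435))
```

The banana prediction is about 25 g, roughly 2 g below the value in the table, while the raisins fall outside the range of the data, so the line should not be trusted for them.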

3. Your answers will depend on the kind of food you choose. You might look at grain products, fish, meat, dairy products, or vegetables. Have a classmate check your graph and your conclusions.
