Unit 1.1 Investigating Data 1

Upload: damon-whicker

Post on 14-Dec-2015


1

Unit 1.1

Investigating Data

2

Frequency and Histograms

CCSS: S.ID.1 Represent data with plots on the real number line (dot plots, histograms, and box plots). Also N.Q.1

3

Types of Data Graphs

• Dot Plots
• Frequency Tables
• Histograms
• Box-and-whisker plots
• 2-way tables

4

Dot Plots

One dot represents one occurrence of the item. Sometimes an X is used instead of a dot. These plots are sometimes called line plots.

5

Creating a Dot Plot

• Find the least and greatest value in a data set.
• Use these values to draw a number line.
• For each piece of data, draw a dot above the matching value on the number line.
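The steps above can be sketched in a few lines of Python. This is only an illustration: the function name `dot_plot` and the quiz scores are made up for the example, and it assumes the data are whole numbers.

```python
from collections import Counter

def dot_plot(data):
    """Return a text dot plot for whole-number data, one 'o' per occurrence."""
    counts = Counter(data)
    low, high = min(data), max(data)        # step 1: least and greatest values
    values = list(range(low, high + 1))     # step 2: the number line
    width = max(len(str(v)) for v in values) + 1
    rows = []
    # step 3: stack one dot above a value for each time it occurs
    for level in range(max(counts.values()), 0, -1):
        row = "".join(("o" if counts[v] >= level else " ").rjust(width)
                      for v in values)
        rows.append(row.rstrip())
    rows.append("".join(str(v).rjust(width) for v in values))
    return "\n".join(rows)

# Hypothetical quiz scores, not taken from the slides
quiz_scores = [7, 8, 8, 9, 9, 9, 10, 6, 8]
print(dot_plot(quiz_scores))
```

The tallest stacks of dots show the most frequent values, which is the same information a frequency count would give.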

6

Frequency Tables

Frequency tables show the number of times something occurs in a given interval. A frequency table does not show the individual data values, only the count in each group.

7

Histograms

Histograms display continuous numerical data grouped into ranges (intervals). Because the intervals are adjacent, the bars touch each other.

8

Histograms

• Bar graph used to display the frequency of data divided into equal intervals

• Bars must be equal width and should touch, but not overlap

9

Steps to Make a Histogram

• Make a frequency table
• Use scale and intervals from table
• Draw a bar for the number in each interval
• Title the graph and label axes
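As a rough sketch of these steps, the Python below builds a frequency table with equal intervals and then draws one bar per interval as text. The helper names and the test scores are hypothetical, and the bars are drawn sideways only because that is easier to print.

```python
def frequency_table(data, start, width, bins):
    """Count the values falling in `bins` equal-width intervals [lo, lo + width)."""
    table = []
    for i in range(bins):
        lo = start + i * width
        hi = lo + width
        count = sum(1 for x in data if lo <= x < hi)
        table.append((lo, hi, count))
    return table

def text_histogram(table, title):
    """Draw one bar of '#' per interval; labels assume whole-number data."""
    lines = [title]
    for lo, hi, count in table:
        lines.append(f"{lo:>3}-{hi - 1:<3} | {'#' * count} ({count})")
    return "\n".join(lines)

# Hypothetical test scores, not taken from the slides
scores = [52, 67, 71, 74, 78, 81, 83, 85, 88, 90, 94, 97]
table = frequency_table(scores, start=50, width=10, bins=5)
print(text_histogram(table, "Test scores (frequency by interval)"))
```

Every interval has the same width, so the bar lengths can be compared directly.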

10

Shape of Histograms

11

Measures of Central Tendency and Dispersion

CCSS: S.ID.2 Use statistics appropriate to the data distribution to compare center (median, mean) and spread (interquartile range, standard deviation) of two or more different data sets.

Also S.ID.3, N.Q.2

12

Measures of Central Tendency

Central tendency – where the “center” of the data is.

Mean (x̄) – numerical average of the data
Mode – most frequent number in the data
Median – middle number of the data if put into numerical order from lowest to highest
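A quick way to check these three measures is Python's built-in `statistics` module. The data set here is invented for illustration.

```python
import statistics

# Hypothetical data set, not taken from the slides
data = [3, 5, 5, 6, 8, 9, 13]

mean = statistics.mean(data)      # sum 49 divided by 7 values
median = statistics.median(data)  # middle value of the sorted list
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)
```

Here the mean (7) is larger than the median (6) because the single large value 13 pulls the average up, while the median only depends on the middle position.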

13

Measures of Dispersion

Dispersion - How spread out the data is.

Range – difference between the maximum value and minimum value of the data

Standard deviation – measure of how values in a data set vary (deviate) from the mean.
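For instance, the range of the data set used in this unit's standard deviation example can be found as below. The variable names are only illustrative.

```python
data = [12.6, 15.1, 11.2, 17.9, 18.2]

# Range = maximum value minus minimum value
data_range = max(data) - min(data)

# Rounded to one decimal place to hide floating-point noise
print(round(data_range, 1))  # 7.0
```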

14

Standard Deviation

• Symbol: σ
• Calculation:
1. Find the mean of the data.
2. Find the difference of each item from the mean.
3. Square the differences.
4. Find the average of the squared differences.
5. Take the square root.

15

Example of Calculating Std. Dev.

Data: 12.6, 15.1, 11.2, 17.9, 18.2

X       X-bar   X – (X-bar)   (X – X-bar)²
12.6    15      -2.4          5.76
15.1    15      0.1           0.01
11.2    15      -3.8          14.44
17.9    15      2.9           8.41
18.2    15      3.2           10.24

Average of the squared differences: 38.86 ÷ 5 = 7.772

Square root of the average (σ): √7.772 ≈ 2.79

Note: x-bar stands for the mean of data
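The five calculation steps can be checked with a short Python sketch using the example data. The function name `population_std_dev` is made up for this illustration. It divides by the number of values, as the steps in this unit do; the sample standard deviation used in some courses divides by one less than that.

```python
import math
import statistics

def population_std_dev(data):
    mean = sum(data) / len(data)              # 1. mean of the data
    deviations = [x - mean for x in data]     # 2. difference from the mean
    squared = [d ** 2 for d in deviations]    # 3. square the differences
    variance = sum(squared) / len(squared)    # 4. average of the squares
    return math.sqrt(variance)                # 5. square root

data = [12.6, 15.1, 11.2, 17.9, 18.2]
sigma = population_std_dev(data)
print(round(sigma, 2))  # 2.79

# The standard library's population standard deviation agrees
print(round(statistics.pstdev(data), 2))  # 2.79
```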

16

Interpreting the Standard Deviation

• As the data becomes more widely distributed, the standard deviation increases.

• A small standard deviation means that the data are clustered tightly around the mean.

17

Box-and-Whisker Plots

CCSS: S.ID.2 Use statistics appropriate to the data distribution to compare center (median, mean) and spread (interquartile range, standard deviation) of two or more different data sets.

Also N.Q.1, S.ID.1

18

Box-and-Whisker Plot

Graph that summarizes a set of data by displaying it along a number line. It consists of a box and two whiskers.

19

Box-and-whisker Plot

• Made up of 5 numbers (sometimes called the 5-number summary):
Min – minimum value (end of left whisker)
Q1 – median of lower half of data (left side of box)
Median (Q2) – middle value of the data (middle line)
Q3 – median of upper half of data (right side of box)
Max – maximum value (end of right whisker)

20

Quartiles

• Quartiles – values that divide a data set into 4 equal parts.

• The interquartile range (IQR) is Q3 – Q1. It measures the spread of the middle half of the data, which is the part contained in the box.

• From Min to Q1 – 25% of data
• From Q1 to median – 25% of data
• From Median to Q3 – 25% of data
• From Q3 to Max – 25% of data
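A minimal sketch of the 5-number summary and IQR, using the "median of each half" definition of Q1 and Q3 given in this unit. The data are hypothetical. When the number of values is odd, this version leaves the median out of both halves; that is one common classroom convention, and calculators or software may use a slightly different rule.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max); assumes at least two values."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, skip the middle value so it is in neither half
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    q1 = statistics.median(lower)
    q3 = statistics.median(upper)
    return ordered[0], q1, statistics.median(ordered), q3, ordered[-1]

# Hypothetical data set, not taken from the slides
data = [2, 4, 4, 5, 7, 8, 9, 10, 12]
minimum, q1, median, q3, maximum = five_number_summary(data)
iqr = q3 - q1
print(minimum, q1, median, q3, maximum, iqr)
```

For this data the lower half is 2, 4, 4, 5 and the upper half is 8, 9, 10, 12, so Q1 = 4, Q3 = 9.5, and the IQR is 5.5.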

21

Interpreting Box Plots

• Shows middle of data, range (spread) of data, extreme values. Does not show individual data or mean (average).

• Outlier – a data value that is much higher or much lower than other values in the data set.

• Percentile rank – percent of data values that are ≤ that value.
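The percentile rank definition above translates directly into a small calculation. The function name and the scores below are assumptions made for this example.

```python
def percentile_rank(data, value):
    """Percent of data values that are less than or equal to `value`."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

# Hypothetical scores, not taken from the slides
scores = [55, 62, 70, 70, 78, 84, 91, 95]
print(percentile_rank(scores, 78))  # 62.5
```

Five of the eight scores are at or below 78, so a score of 78 has a percentile rank of 62.5.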

22

Two-way Tables

CCSS: S.ID.5 Summarize categorical data for two categories in two-way frequency tables. Interpret relative frequencies in the context of the data (including joint, marginal, and conditional relative frequencies).

23

2-way Tables

Way of organizing data to show data that pertain to two different categories.

Can find the conditional probability of events occurring.
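As an illustration of reading probabilities from a two-way table, the sketch below uses made-up counts for 100 students (the tables from the original slides are not reproduced in this transcript). The helper names `marginal`, `joint`, and `conditional` are also hypothetical.

```python
# Hypothetical counts: (sport status, foreign language status) -> students
counts = {
    ("sport", "language"): 30,
    ("sport", "no language"): 20,
    ("no sport", "language"): 25,
    ("no sport", "no language"): 25,
}
total = sum(counts.values())

def marginal(counts, label):
    """Total count for one category (a row or column total)."""
    return sum(n for key, n in counts.items() if label in key)

def joint(counts, a, b):
    """Count of students in both categories (a single cell)."""
    return sum(n for key, n in counts.items() if a in key and b in key)

def conditional(counts, event, given):
    """P(event | given) = cell count / row or column total of `given`."""
    return joint(counts, event, given) / marginal(counts, given)

# P(takes a language, given plays a sport) = 30 / 50
print(conditional(counts, "language", "sport"))
# P(no sport, given no language) = 25 / 45
print(conditional(counts, "no sport", "no language"))
# P(no language) = 45 / 100, a marginal probability
print(marginal(counts, "no language") / total)
```

The key point is the denominator: a conditional probability divides by the total of the "given" row or column, while a marginal probability divides by the grand total.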

24

2-way Tables (cont’d)

1. What is the probability that if a student plays a sport, he also takes a foreign language?

2. What is the probability that if a student doesn’t take a foreign language, she doesn’t play a sport?

3. What is the probability that a student doesn’t take a foreign language?

25

More 2-way tables

1. What is the probability that a student has an MP3 player?

2. What is the probability that if a student doesn’t have an MP3 player, he has a cell phone?

3. What is the probability that if a student doesn’t have a cell phone, he has an MP3 player?