More Chapter 3! (or Chapter 4)

Upload: ralph-hudson

Post on 21-Jan-2016



Brave New Data

We are no longer limited to charts which only work for categorical data.

We have three more charts at our disposal.

Even though I do not think the book stresses this enough, frequency tables and relative frequency tables are still useful for quantitative data.

Bar charts and pie charts, however, are not.


Just Kidding On The Bar Charts

We do not use bar charts (or column charts) for quantitative data.

We use histograms.

These are, as you can hopefully see, charts with bars.

Doesn’t that make them bar charts?


It Totally Should

Distinguishing between histograms and bar charts borders on obnoxiousness.

However, it is important to note that bar charts have gaps between the bars and histograms do not, and so we call them something different.

The primary purpose of the distinction is to help reinforce that categorical data and quantitative data are different.


What Makes Histograms Special

Bar charts can show data in whatever order they like, but histograms need to go in order.

Since they are used with quantitative data, the order is built in.

If there is an interval with no data, you are still expected to have it in your graph as an empty space.

A gap on a histogram means there was a gap in the data.


Categorical vs. Quantitative

Sometimes the data can be made into either.

For example, scores and letter grades can both be found for the quiz.

[Figure: two charts. "First Quiz Score, 4th Hour, 2013": horizontal axis 11 to 15, vertical axis 0 to 12. "First Quiz Grade, 4th Hour, 2013": horizontal axis A, B, C; vertical axis 0 to 14.]


Histograms FTW!

In later chapters, histograms will be the preferred plot for quantitative data in general.

Dotplots are a fun way to amuse yourself…if you are into making charts and graphs.

Stem-and-leaf plots are useful, but can take forever and also require intense attention to detail. They are very convenient for displaying two distributions side by side.

While not mentioned much in this chapter, there are also line plots, which are like histograms, except instead of bars, there is a line connecting the frequencies.


Stem-and-leaf Diagrams

Also known as stemplots.

The stem contains the beginning of each data point (such as the tens place or hundreds place).

Each data point is recorded as a leaf: the digit or digits left over after the stem.

Each leaf needs to be the same number of characters.

If you have double-digit leaves, it is wise to leave a space after each one.


Stem-and-leaf Diagrams

The stem can be broken down into partial categories, such as high and low.

The stem can be surrounded on both sides with leaves, representing two distributions side by side.

The leaves need to all take up the same space.

On a computer, the Courier font ensures that all text takes up the same space left to right.


Stem-and-leaf Diagrams

The leaves should be in numerical order within each stem. This means sorting the data first.

No stem should be left out, which means that if you do not have any leaves for a given stem, you still need it, but it just gets left empty.

Probably best done by hand, and for relatively small data sets (like 30 subjects or fewer).
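If you would rather let a computer do the sorting, here is a minimal sketch of the rules above in Python. It assumes non-negative whole numbers with the ones digit as the leaf; the function name `stemplot` and the data are made up for illustration.

```python
from collections import defaultdict

def stemplot(values):
    """Return the lines of a stem-and-leaf diagram (stem = tens and up, leaf = ones digit)."""
    leaves = defaultdict(list)
    for v in sorted(values):              # sort first so the leaves come out in order
        leaves[v // 10].append(v % 10)
    lo, hi = min(leaves), max(leaves)
    # Walk every stem from lowest to highest so that empty stems still appear.
    return [
        f"{stem:>2} | {''.join(str(d) for d in leaves[stem])}".rstrip()
        for stem in range(lo, hi + 1)
    ]

scores = [12, 15, 11, 34, 30, 58]         # made-up data
for line in stemplot(scores):
    print(line)
#  1 | 125
#  2 |
#  3 | 04
#  4 |
#  5 | 8
```

Stems 2 and 4 have no leaves, but they still get a row, which is what keeps the shape of the distribution honest.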


Timeplots

These are the only line plots mentioned in your book.

They are the most common kind of line plot.

You basically plot the dots and connect the dots.

They are not only really straightforward, but they are also intuitive, so regular people can look at them and see trends.


Back to Histograms

Histograms should be evenly scaled. This means even class widths. The book would say even bin sizes. This also can be expressed as even intervals.

Histograms should include every interval between the start and stop of the data. Even if they are empty. Especially if they are empty.
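As a rough sketch of what even class widths look like in practice, the Python below counts values into equal-width intervals and keeps the empty ones. The function name `histogram_counts`, its parameters, and the scores are all made up for illustration.

```python
def histogram_counts(values, start, width, n_bins):
    """Count values in n_bins equal-width classes beginning at start.

    Class i covers start + i*width up to (but not including) start + (i+1)*width.
    Values outside that range are ignored.
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

scores = [61, 64, 68, 72, 75, 77, 79, 95, 98]   # made-up data
counts = histogram_counts(scores, start=60, width=10, n_bins=4)

# Print a sideways text histogram; the empty 80s class still gets its row.
for i, c in enumerate(counts):
    low = 60 + 10 * i
    print(f"{low}-{low + 9}: {'#' * c}")
```

Here `counts` comes out as `[3, 4, 0, 2]`. The zero in the third class is the gap in the data, and dropping that class would hide it.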


Describing Distributions

There are three key areas to describe when discussing a distribution.

Shape: This usually means counting the high points, checking for symmetry, and looking for extreme values. These high points are called modes.

Center: This will usually focus on an appropriate measure of central tendency. The Mean and the Median are common.

Spread: This will usually mean giving an indication of how spread out the data is. The Range and the Standard Deviation are common.
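For a preview of the center and spread measures named above, here is a small sketch using Python's standard `statistics` module. The data is made up, and the sample standard deviation is assumed to be the version wanted.

```python
import statistics

data = [2, 3, 3, 4, 4, 4, 5, 5, 6]           # made-up, roughly symmetric data

# Center: two common measures of central tendency.
center_mean = statistics.mean(data)           # 36 / 9 = 4
center_median = statistics.median(data)       # middle value of the sorted data = 4

# Spread: two common measures of how spread out the data is.
spread_range = max(data) - min(data)          # 6 - 2 = 4
spread_sd = statistics.stdev(data)            # sample standard deviation, about 1.22

print(center_mean, center_median, spread_range, round(spread_sd, 2))
```

Shape is left out on purpose, since it is judged by looking at the graph rather than by a single number.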


Shape

Within the shape category, there are three things we will tend to focus on.

The first is known as modality. That is basically just how many bumps the graph has. One exception is a uniform distribution, which is roughly flat and has no bumps.

The second is symmetry and skew. A graph is considered skewed if it stretches out farther on one side of the mode than the other.

The third is outliers. Next chapter we will learn how to identify them numerically, but for now, we will only focus on outliers that are obvious.


Center

Next chapter we will learn all about calculating centers.

For now, we will just rely mostly on intuition.

If the graph is skewed in a direction, the mean will be pulled in that direction, and might not match up perfectly with the mode.
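To see the pull of the skew with numbers, here is a quick sketch in Python with made-up data whose tail stretches to the right. The variable names are only for illustration.

```python
import statistics

right_skewed = [1, 2, 2, 2, 3, 3, 4, 9, 19]   # made-up data with a long right tail

mode = statistics.mode(right_skewed)          # most common value: 2
median = statistics.median(right_skewed)      # middle value: 3
mean = statistics.mean(right_skewed)          # 45 / 9 = 5

# The mean lands farthest toward the tail, the mode stays at the bump.
print(mode, median, mean)                     # 2 3 5
```

The two large values drag the mean well past the mode, while the median moves less.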


Spread

We will learn how to calculate spread.

We will learn how to calculate spread by hand even.

Once we have finished that, you will understand spread really well.

For now, it just relates to how wide the graph is.


Comparing Distributions

A good rule of thumb is that if you have two distributions, graph them separately, but on the same scale as one another.

If you fix the scales, a more accurate picture of how they match up forms.

Comparing two distributions on a stem-and-leaf diagram is a bit different.

Comparing two sets of categorical variables on the same bar chart can be handy as long as they have the same categories.

It is common for bar charts to be used in place of histograms to compare two or more sets of quantitative data. It is worth your while to see this kind of graph.
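One way to read "the same scale" is that both histograms share one set of intervals. The Python below is a sketch of that idea under that assumption; the helper names `shared_bins` and `bin_counts` and the two classes of scores are made up.

```python
def shared_bins(a, b, width):
    """Return (start, n_bins) for equal-width bins that cover both data sets."""
    lo = min(min(a), min(b))
    hi = max(max(a), max(b))
    start = (lo // width) * width             # round down to a multiple of width
    n_bins = int((hi - start) // width) + 1
    return start, n_bins

def bin_counts(values, start, width, n_bins):
    """Count values per bin; every value is assumed to fall inside the bins."""
    counts = [0] * n_bins
    for v in values:
        counts[int((v - start) // width)] += 1
    return counts

class_a = [62, 65, 71, 74, 78]                # made-up data
class_b = [81, 85, 88, 93]                    # made-up data

start, n_bins = shared_bins(class_a, class_b, width=10)
counts_a = bin_counts(class_a, start, 10, n_bins)
counts_b = bin_counts(class_b, start, 10, n_bins)
print(counts_a)                               # [2, 3, 0, 0]
print(counts_b)                               # [0, 0, 3, 1]
```

Because both lists use the bins 60s through 90s, it is obvious that the second class sits higher. Graphing each class on its own automatic scale could make the two pictures look alike.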


Assignments

Chapter 3: 5, 9, 19, 25, 33, 37 (Due Thursday)

Chapter 4: 4, 8, 11, 17, 18, 30, 33 (Due Monday)

Read Chapter 4 (as in the rest of it).

Begin studying for Chapter 4 Quiz on Friday.


Quiz Bulletpoints

Be familiar with the differences between a histogram and a bar chart.

Be familiar with the advantages and disadvantages of a stem-and-leaf diagram.

Be familiar with the what-not-to-do list in chapter 4.

Know when and how to transform data.


Bee Tee Dubs…

My birthday is this Thursday, as in two days from now, and ISU’s visits to the school are on Wednesday, so today is probably a good day to start any party planning you guys were considering.

Remember, a party is not mandatory.

If it is preferred we can do an additional lecture that day instead.

If you want a party, please bring in the food and beverages, as a class, to justify one.