Organizing Data
Looking for Patterns and Departures from Them
Exploring Data
Introduction
Displays
Descriptions
What is Statistics?
A science and not MATH
An examination with clear explanations and not just crunching numbers
A WHODUNIT: who, what, and why
Delving into the 411 of groups of individuals according to variables
Definitions
Individuals are the objects described by a set of data.
A variable is any characteristic of an individual.
Categorical: non-numeric groups/classes (name, sex, city).
Quantitative: numerical values (age, scores, mileage).
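A minimal Python sketch of the two definitions, using made-up student records (the names, cities, and numbers are illustrative assumptions, not class data). It labels a variable quantitative when every recorded value is a number and categorical otherwise. This is only a rough rule: number-like codes such as zip codes are still categorical.

```python
# Hypothetical records: each dict is one individual, each key is a variable.
students = [
    {"name": "Ana", "city": "Austin", "age": 16, "score": 88.5},
    {"name": "Ben", "city": "Dallas", "age": 17, "score": 92.0},
]

def variable_type(values):
    """Label a variable quantitative if every value is a number, else categorical."""
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if numeric else "categorical"

# Collect each variable's values across all individuals and label it.
types = {var: variable_type([s[var] for s in students]) for var in students[0]}
print(types)
```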
Whodunit
Who? The individuals and how many there are.
What? The number of variables, their names, and the units associated with them.
Why? The reason the data were gathered and their intended purpose. Are the data meant to support or refute a claim?
Distribution
The distribution of a variable tells us what values the variable takes and how often.
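As a small illustration, the distribution of a variable can be tabulated by counting how often each value occurs. The sketch below uses Python's `collections.Counter` on made-up cylinder counts for ten cars (the values are an assumption for illustration only).

```python
from collections import Counter

# Made-up values of one variable (number of cylinders) for ten cars.
cylinders = [4, 4, 6, 4, 8, 6, 4, 6, 4, 8]

# The distribution: each value the variable takes and how often it occurs.
distribution = Counter(cylinders)
for value in sorted(distribution):
    print(value, distribution[value])
```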
Exercise 1.1: Fuel-Efficient Cars
A) Individuals: make and model of 1998 motor vehicles
B) Variables:
Vehicle type: categorical
Transmission type: categorical
Number of cylinders: quantitative
City mileage (mpg): quantitative
Highway mileage (mpg): quantitative
Assignment
Page 7: Exercises 1.2 – 1.4
1.1 Displaying Distributions
Displays and graphs help put written information into a more visual form.
All good displays have a title, labeled axes, and equal intervals where appropriate.
Categorical Variables
Categorical variables are best displayed with bar graphs or pie charts.
Bar graphs: quick comparisons.
Pie charts (pie graphs): show parts of the whole (percentages used).
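A rough sketch of what sits behind both displays: the counts drive the bar heights and the percentages drive the pie slices. The survey answers are made up for illustration, and the text "bars" stand in for a real graph.

```python
from collections import Counter

# Made-up survey answers: favorite soft drink type for 20 students.
answers = ["cola"] * 9 + ["lemon-lime"] * 5 + ["root beer"] * 4 + ["other"] * 2

counts = Counter(answers)          # bar heights for a bar graph
total = sum(counts.values())
# Percent of the whole in each category: the slice sizes for a pie chart.
percents = {category: 100 * n / total for category, n in counts.items()}

# Text bar graph, largest category first.
for category, n in counts.most_common():
    print(f"{category:<12}{'#' * n} {n} ({percents[category]:.0f}%)")
```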
Quantitative Variables
Quantitative variables are best displayed by dotplots and stemplots (double stemplots).
Keep these features in mind:
Shape: mound-shaped, skewed left/right
Center: median
Spread: smallest and largest values
Outliers: unusual features
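The center and spread named above can be read straight from the data. A minimal sketch with made-up quiz scores (an assumption for illustration), using the standard-library `statistics.median`:

```python
import statistics

# Made-up quiz scores for 11 students (illustration only).
scores = [12, 15, 15, 16, 17, 18, 18, 19, 20, 21, 35]

center = statistics.median(scores)            # middle value of the sorted data
smallest, largest = min(scores), max(scores)  # spread: smallest and largest values

print("center (median):", center)
print("spread:", smallest, "to", largest)
```

In this made-up set the value 35 sits well apart from the rest, so it is the kind of unusual feature worth a second look.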
Dotplots and Stemplots
Read the construction of these.
Look quickly at our choices of soft drinks from Table 1.1.
Construct a dotplot and stemplot for the caffeine content.
Complete exercises 1.5 – 1.8.
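One way to sketch a stemplot in Python, assuming whole-number data with the tens digits as stems and the ones digit as leaves. The caffeine values below are made up for illustration and are not the Table 1.1 data; `stemplot` is a hypothetical helper name.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows for whole numbers: stem = tens digits, leaf = ones digit."""
    leaves = defaultdict(list)
    for v in sorted(values):  # sorting first keeps the leaves in order
        leaves[v // 10].append(v % 10)
    rows = []
    # Include stems with no leaves so gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row_leaves = "".join(str(d) for d in leaves.get(stem, []))
        rows.append(f"{stem:>2} | {row_leaves}")
    return rows

# Made-up caffeine contents (mg per serving), not the values from Table 1.1.
caffeine = [47, 35, 38, 54, 41, 46, 23, 55, 41, 37, 71]
rows = stemplot(caffeine)
print("\n".join(rows))
```

The empty row for stem 6 is kept on purpose: it shows the gap between the main cluster and the largest value.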
Histograms
Use a histogram when a quantitative variable takes many values and grouping those values gives a clearer picture of the distribution.
Steps: Arrange the data into classes of equal width, count the observations in each class (the count is the height of the bar), then label and scale the axes.
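The histogram steps can be sketched as follows. The ages are made up (they are not the presidential ages table), and `class_counts`, the starting point of 40, and the class width of 5 are assumptions chosen for illustration.

```python
def class_counts(values, start, width, n_classes):
    """Count values in equal-width classes [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        i = int((v - start) // width)  # index of the class that contains v
        if 0 <= i < n_classes:         # values outside every class are not counted
            counts[i] += 1
    return counts

# Made-up ages for illustration only.
ages = [42, 46, 47, 51, 51, 52, 54, 55, 55, 56, 57, 57, 61, 64, 69]
counts = class_counts(ages, start=40, width=5, n_classes=6)

# Sideways text histogram: each star is one observation (the bar height).
for i, n in enumerate(counts):
    low = 40 + 5 * i
    print(f"{low}-{low + 4}: {'*' * n}")
```

The `low`-to-`low + 4` labels assume whole-number ages; with decimal data the classes would be labeled by their boundaries instead.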
Activity (tech): Presidential Ages at Inauguration.
Activity: Getting to Know You
Due: Friday, August 27, 2010
Select a display for your data: bar graph, circle graph, line graph, histogram, stemplot, dotplot.
Be as creative as you can, whether you draw by hand, use Excel, or use any other type of display technology.
Activity: Getting to Know You
Write a report on the data collected following the guidelines.
Quantitative Data: Discuss the shape of your distribution, center, spread, and any unusual features (outliers). What inferences can you make about your class?
Categorical Data: Write about any observations you can draw about your class.
Frequencies and Percentiles
Sometimes it is interesting to describe the relative position of an individual within a distribution.
The pth percentile is the value such that p percent of the observations fall at or below it.
An ogive, or relative cumulative frequency graph, allows us to see the distribution as a whole.
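To make the percentile idea concrete, the sketch below finds the percent of observations that fall at or below a given value. The test scores are made up, and `percent_at_or_below` is a hypothetical helper name.

```python
def percent_at_or_below(values, x):
    """Percent of the observations that fall at or below x."""
    return 100 * sum(v <= x for v in values) / len(values)

# Made-up test scores for ten students (illustration only).
scores = [55, 62, 68, 70, 74, 75, 79, 81, 88, 93]

# Seven of the ten scores are at or below 79, so 79 sits at the 70th percentile.
rank = percent_at_or_below(scores, 79)
print(rank)
```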
Presidents
Look at the middle of page 29.
Frequency: count
Relative frequency: percentage of those falling within a certain group (class)
An ogive allows us to look at individuals compared to the whole (percentiles relative to all involved).
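A short sketch of the numbers behind an ogive: starting from class counts (frequencies), compute the relative frequency of each class and the running relative cumulative frequency. The counts are made up for illustration and are not the table on page 29.

```python
from itertools import accumulate

# Made-up frequencies for six equal-width classes (not the textbook table).
counts = [1, 2, 4, 5, 2, 1]
total = sum(counts)

# Relative frequency: percent of all observations that fall in each class.
relative = [100 * c / total for c in counts]

# Relative cumulative frequency: percent at or below the top of each class.
# These are the heights plotted on an ogive.
cumulative = [100 * c / total for c in accumulate(counts)]

for rel, cum in zip(relative, cumulative):
    print(f"{rel:5.1f}%  {cum:5.1f}%")
```

The last cumulative value is always 100%, since every observation falls at or below the top of the final class.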
Assignment
Exercises 1.12
Time Plots
A time plot graphs each value of a variable against the time it was measured.
Time is always marked on the horizontal axis and the variable of interest on the vertical axis.
Connecting the points helps us see trends.
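A rough text-only sketch of a time plot, with time on the horizontal axis and the variable on the vertical axis in equal steps. The weekly values are made up, `text_time_plot` is a hypothetical helper, and the sketch assumes whole-number values. It does not draw the connecting lines that a real graph would.

```python
def text_time_plot(times, values):
    """Return the rows of a text time plot for whole-number values.

    Time runs left to right on the horizontal axis; the variable runs
    bottom to top on the vertical axis in equal steps of 1.
    """
    pairs = sorted(zip(times, values))  # put the observations in time order
    ordered = [v for _, v in pairs]
    rows = []
    for level in range(max(ordered), min(ordered) - 1, -1):
        marks = "".join(" * " if v == level else "   " for v in ordered)
        rows.append(f"{level:>3} |{marks}".rstrip())
    rows.append("    +" + "---" * len(pairs))                  # horizontal axis
    rows.append("     " + "".join(f"{t:^3}" for t, _ in pairs))  # time labels
    return rows

# Made-up data: week number and hours of homework logged that week.
weeks = [1, 2, 3, 4, 5, 6]
hours = [3, 4, 4, 6, 5, 7]
rows = text_time_plot(weeks, hours)
print("\n".join(rows))
```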
Assignment
Exercises: 1.21, 1.23, 1.25 – 1.29