Post on 13-Dec-2015

Measurement Variables

Describing Distributions

© 2014 Project Lead The Way, Inc.

Computer Science and Software Engineering

• A nearly perfect analogy:
    continuous : discrete
    analog : digital
    float : int

• Measurements of continuous variables are made discrete by "binning" them.

• How old are you? Time is continuous, but you answer in discrete, binned values.
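
The binning idea above can be sketched in a few lines of Python. This is only an illustration: the helper name `bin_value` and the example age are assumptions, not part of the original slides.

```python
import math

def bin_value(value, width=1.0):
    """Return the lower edge of the bin (of the given width) containing value."""
    return math.floor(value / width) * width

exact_age_years = 16.73                         # hypothetical continuous measurement
age_in_years = int(bin_value(exact_age_years))  # binned to whole years
age_in_5yr_bin = bin_value(exact_age_years, 5)  # binned to 5-year groups

print(age_in_years, age_in_5yr_bin)
```

Choosing a wider bin throws away more detail, which is the float-to-int trade-off from the analogy.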

Continuous vs. Discrete

• Categorical (e.g., zip codes): categories with no meaningful order

• Ordinal (e.g., rank in a race): ordered, but increasing by 1 has no consistent meaning

• Interval (e.g., grade level): ordered, with consistent steps up, but no meaning for "doubling" or "tripling"

• Ratio (e.g., height): ordered, with "2 times" being "double"

Levels of a Measurement Variable

Sample vs. Population

• Population = infinite pool of measurements, or all measurements possible

• Sample = subset of population

• Population parameters: μ = population mean, σ = population standard deviation

• These are inferred from data

Sample vs. Population

• Sample statistics: x̄ = sample mean, s = sample standard deviation

• These describe data
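
As a minimal sketch of how sample statistics describe data, the sample mean and sample standard deviation can be computed with Python's standard `statistics` module. The height values are made up for illustration.

```python
import statistics

heights_cm = [160, 165, 170, 175, 180]  # made-up sample, not from the slides

sample_mean = statistics.mean(heights_cm)
# stdev() divides by n - 1, the usual choice for a sample
sample_sd = statistics.stdev(heights_cm)
# pstdev() divides by n and treats the data as the whole population
population_style_sd = statistics.pstdev(heights_cm)

print(sample_mean, sample_sd, population_style_sd)
```

The sample statistics are then used as estimates of the population parameters, which cannot be measured directly.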

Sample vs. Population

• Infer population distribution from sample histogram

• Sample histogram matches parent distribution better with a large sample visualized with small intervals
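
The claim about sample size can be explored with a small simulation. The sketch below assumes a standard normal parent distribution, a bin width of 0.5, and a fixed random seed; all of these are illustrative choices, and `histogram_error` is a hypothetical helper name.

```python
import random
from statistics import NormalDist

def histogram_error(n, bin_width=0.5, lo=-4.0, hi=4.0, seed=0):
    """Largest gap between a bin's observed share and the parent's share."""
    parent = NormalDist(mu=0, sigma=1)
    rng = random.Random(seed)
    n_bins = int(round((hi - lo) / bin_width))
    counts = [0] * n_bins
    for _ in range(n):
        x = rng.gauss(0, 1)
        if lo <= x < hi:                      # ignore the rare value outside the plot
            counts[int((x - lo) // bin_width)] += 1
    worst = 0.0
    for i, count in enumerate(counts):
        left = lo + i * bin_width
        expected = parent.cdf(left + bin_width) - parent.cdf(left)
        worst = max(worst, abs(count / n - expected))
    return worst

small_error = histogram_error(100)        # small sample
large_error = histogram_error(100_000)    # large sample

print(small_error, large_error)
```

The larger sample should track the parent distribution much more closely, which is what makes small intervals usable.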

• Half of the area under the distribution is to the left of the median

Median

Mean, Median, Mode
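
A short sketch comparing the three measures of center, using Python's standard `statistics` module on made-up commute times:

```python
import statistics

commute_minutes = [5, 10, 10, 15, 20, 25, 90]  # made-up data with one long commute

mean_value = statistics.mean(commute_minutes)      # arithmetic average
median_value = statistics.median(commute_minutes)  # middle value when sorted
mode_value = statistics.mode(commute_minutes)      # most frequent value

below = [m for m in commute_minutes if m < median_value]
above = [m for m in commute_minutes if m > median_value]

print(mean_value, median_value, mode_value, len(below), len(above))
```

Half of the remaining values fall on each side of the median, while the single long commute pulls the mean well above it.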

• y-axis shows values of the data

• Splits data into quartiles

Box Plot

[Figure: box plot; y-axis: height]

Each box contains 25% of the data

The IQR (Interquartile Range) Contains 50% of the Data

Whiskers extend to max and min… usually

Box Plot

Whiskers and Outliers Show max/min

The Range Contains 100% of the Data
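
The quantities a box plot draws can be computed directly. This sketch uses made-up heights and assumes the common 1.5 × IQR rule for deciding which points are outliers; the slides do not say which rule their plots use, and plotting tools may compute quartiles slightly differently.

```python
import statistics

heights_cm = [150, 155, 160, 162, 165, 168, 170, 172, 175, 180, 210]  # made-up data

# Three cut points that split the data into quartiles
q1, median, q3 = statistics.quantiles(heights_cm, n=4)
iqr = q3 - q1  # interquartile range: the middle 50% of the data

# Assumed outlier rule: anything beyond 1.5 * IQR from the box
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [h for h in heights_cm if h < low_fence or h > high_fence]
inside = [h for h in heights_cm if low_fence <= h <= high_fence]

# Whiskers stop at the most extreme values that are not outliers
whisker_low, whisker_high = min(inside), max(inside)
data_range = max(heights_cm) - min(heights_cm)

print(q1, median, q3, iqr, outliers, whisker_low, whisker_high, data_range)
```

With this rule the whiskers reach the max and min only when there are no outliers, which is one reading of the "usually" above.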

• A family of distributions with very similar shape

• One normal distribution for each μ and σ

Normal Distributions

• μ ("mu") = population mean

• σ ("sigma") = population standard deviation

• One normal distribution for any pair μ, σ

• Example: μ = 6 and σ = 2.2

A Normal Distribution
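
The example distribution with μ = 6 and σ = 2.2 can be built with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). The variable names are illustrative.

```python
import math
from statistics import NormalDist

example = NormalDist(mu=6, sigma=2.2)

peak_height = example.pdf(6)         # the curve is highest at the mean
area_left_of_mean = example.cdf(6)   # half of the area lies left of the mean
left_side = example.pdf(6 - 2.2)     # one sigma below the mean
right_side = example.pdf(6 + 2.2)    # one sigma above the mean

print(peak_height, area_left_of_mean, left_side, right_side)
```

Changing `mu` slides the curve left or right, and changing `sigma` makes it wider or narrower, but the bell shape stays the same.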

• μ = 0 and σ = 1

The Standard Normal Distribution

The Empirical Rule: 68% - 95% - 99.7%

68% area: values within μ ± σ

95% area: values within μ ± 2σ

99.7% area: values within μ ± 3σ
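
The areas in the empirical rule can be checked numerically. This sketch uses the standard normal distribution from Python's `statistics` module; `area_within` is a hypothetical helper name.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the curve between mu - k*sigma and mu + k*sigma."""
    return standard.cdf(k) - standard.cdf(-k)

areas = {k: area_within(k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(f"within {k} sigma: {area:.4f}")
```

The same areas hold for any normal distribution, because every one is a shifted and stretched copy of the standard normal.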

Shape, Center, Spread

• These distributions are both positively skewed because they are right-tailed
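
Skew can also be measured with a number. The sketch below uses one common definition, the third central moment divided by the 1.5 power of the second central moment; the data and the helper name `skewness` are illustrative assumptions.

```python
import statistics

def skewness(data):
    """Moment-based skewness: positive for a right tail, negative for a left tail."""
    n = len(data)
    mean = statistics.fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5

right_tailed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12]  # made-up data with a long right tail
left_tailed = [-x for x in right_tailed]        # mirror image: long left tail

right_skew = skewness(right_tailed)
left_skew = skewness(left_tailed)
mean_value = statistics.fmean(right_tailed)
median_value = statistics.median(right_tailed)

print(right_skew, left_skew, mean_value, median_value)
```

In this example the long right tail pulls the mean above the median, a pattern that is typical of right-tailed data though not guaranteed.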
