Lecture 1. Making Sense of Data: Data Variation
David R. Merrell
90-786 Intermediate Empirical Methods for Public Policy and Management
Introductions
  Instructor: David R. Merrell
  TAs: Max Hernandez-Toso and Hao Xu
Course Content: USEFUL STATISTICS
Statistics is the use of data to reduce uncertainty about potential observations
Course Information
  Web site: http://Duncan.heinz.cmu.edu/GeorgeWeb/Heinz 90-786 Front Page.htm
  Data files: r:/academic/90786
Making Sense of Data
  Motivation in management and policy
  What is data?
  What’s the use of data?
  Data variation
Motivation for Statistical Input
Managerial Decision Making
  Changes in societal or organizational conditions
  Differences between observations and expectations
Policy Making
  Impact of changing the system
What is Data?
Unit of analysis
Number of variables
  one, two, more than two
Level of measurement / kind of data
  Nominal, Ordinal, Interval
Unit of analysis
Focus of attention: a case that can be separately and uniquely identified
  person (student, woman, tenant, ...)
  place (city, street intersection, river, ...)
  object (car, power plant, ...)
  organization (school, corporation, ...)
  incident (birth, election)
  time period (day, season, year, ...)
Variables
Characteristics, attributes, and occurrences observed about each unit of analysis
Require a specific step-by-step procedure to obtain values for the variable
Examples
Driver's license application study
  Unit of analysis: people who apply for a driver's license
  Outcome variable: license issued or not
  Other variables: applicant's age, sex, and race
Snowfall in Pittsburgh
  Unit of analysis: snowstorms
  Outcome variable: depth of the snowfall from each storm
  Other variables: date of snowstorm, temperature
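As a rough illustration of the driver's license example, each unit of analysis can be thought of as one record and each variable as one field on that record. The sketch below is only an assumption about how such data might be laid out: the class name, the `applicant_id` field, and all values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LicenseApplication:
    """One unit of analysis: a person who applies for a driver's license."""
    applicant_id: int     # hypothetical identifier that makes each case unique
    license_issued: bool  # outcome variable
    age: int              # other variables
    sex: str
    race: str


# Made-up records, for illustration only.
applications = [
    LicenseApplication(1, True, 17, "F", "A"),
    LicenseApplication(2, False, 16, "M", "B"),
    LicenseApplication(3, True, 34, "F", "B"),
]

# Share of units of analysis whose outcome variable is "issued".
issued_share = sum(a.license_issued for a in applications) / len(applications)
print(f"{len(applications)} units of analysis, {issued_share:.0%} issued")
```

The snowfall example would follow the same pattern, with one record per snowstorm and fields for depth, date, and temperature.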
Nominal data
Classifies outcomes by categories
Categories must be mutually exclusive and exhaustive
Examples:
  Marital status, region of the country, religion, occupation, school district, place of birth, blood type
Ordinal data
Classifies outcomes by ranked categories
Examples:
  Officers in the U.S. Army can be classified as:
    1 = general
    2 = colonel
    3 = lieutenant colonel
    4 = major
    5 = captain
    6 = first lieutenant
    7 = second lieutenant
  Education (highest diploma or degree attained)
Interval data
Classifies outcomes on a continuous scale
Examples:
  Scholastic Aptitude Test (SAT) score
  Consumer Price Index (CPI)
  Time of day
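The level of measurement determines which summaries make sense. The sketch below is an illustration rather than part of the lecture: it counts nominal categories, compares and takes the median of ordinal codes (using the Army codes listed above), and averages interval values. All sample values are made up.

```python
from collections import Counter
from statistics import mean, median

# Nominal: unordered categories. Counting is meaningful; ranking is not.
blood_types = ["O", "A", "O", "B", "AB", "O"]  # made-up sample
most_common_type, type_count = Counter(blood_types).most_common(1)[0]

# Ordinal: ranked categories. Order comparisons and medians are meaningful.
ARMY_RANK = {
    "general": 1, "colonel": 2, "lieutenant colonel": 3, "major": 4,
    "captain": 5, "first lieutenant": 6, "second lieutenant": 7,
}
officers = ["captain", "major", "second lieutenant"]  # made-up sample
most_senior = min(officers, key=ARMY_RANK.get)  # lower code = higher rank
median_code = median(ARMY_RANK[o] for o in officers)

# Interval: numeric scale. Differences and averages are meaningful.
sat_scores = [1100, 1250, 1310]  # made-up sample
mean_sat = mean(sat_scores)

print(most_common_type, type_count)  # nominal summary: the modal category
print(most_senior, median_code)      # ordinal summaries
print(mean_sat)                      # interval summary
```

Averaging the Army codes would run without error, but the result would be hard to interpret, because the distance between adjacent ranks is not defined.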
Description
Summary of observations
  In February 1997, the M1A money supply in Taiwan rose 6.46% over February 1996
  Housing starts in June 1996 rose to a seasonally adjusted rate of 1,480,000 units from a revised 1,461,000 in May
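Descriptive statements like these usually rest on a percent-change calculation. As a small worked example, the housing-starts figures quoted above imply a month-over-month change of roughly 1.3%. The function name `percent_change` is an illustrative choice, not something from the lecture.

```python
def percent_change(old: float, new: float) -> float:
    """Change from old to new, expressed as a percentage of old."""
    return (new - old) / old * 100.0


may_starts = 1_461_000   # revised May rate, from the slide
june_starts = 1_480_000  # June rate, from the slide

change = percent_change(may_starts, june_starts)
print(f"Housing starts rose {change:.2f}% from May to June")
```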
Evaluation
Comparison of observed state of affairs against expectations
Expectations are based on: ethical norms, managerial plans and budgets
Estimation
Uses observations to assess an attribute of a population or to predict future values.
  A new charter school in Boston raised test scores an average of 7 percentile points. How would other charter schools do? How will this charter school do in the future?
Data Variation: Data Compression and Display
Boxplots
Five number summary
  minimum
  lower quartile point
  median
  upper quartile point
  maximum
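A minimal sketch of computing the five number summary with Python's standard library follows. The sample values are made up and are not the 263 players' averages. Quartile conventions differ between textbooks and software; this sketch assumes the "inclusive" method of `statistics.quantiles`, which may not match the convention used in the course.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return minimum, lower quartile, median, upper quartile, maximum."""
    data = sorted(values)
    # n=4 yields the three cut points that split the data into quarters.
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return {
        "minimum": data[0],
        "lower quartile": q1,
        "median": med,
        "upper quartile": q3,
        "maximum": data[-1],
    }


# Made-up batting averages, for illustration only.
sample = [0.263, 0.196, 0.284, 0.250, 0.353, 0.231, 0.302, 0.245, 0.271]
summary = five_number_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A boxplot draws exactly these five numbers: the box spans the lower to upper quartile, a line marks the median, and the whiskers reach toward the minimum and maximum.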
[Figure: Batting Average of 263 major league baseball players; axis "Aver Career", scale 0 to 0.4]
Compressed Data Values
  Median 0.263
  Minimum 0.196
  Maximum 0.353
  Range 0.155
  Mode 0.250
  Mean 0.263
  Standard Deviation 0.023
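Each of these compressed values can be computed with Python's `statistics` module. The sketch below uses a small made-up sample, not the players' data, and assumes the sample (n - 1) form of the standard deviation, since the slide does not say which form was used.

```python
from statistics import mean, median, mode, stdev

# Made-up batting averages, for illustration only.
sample = [0.196, 0.250, 0.250, 0.263, 0.353]

stats = {
    "median": median(sample),
    "minimum": min(sample),
    "maximum": max(sample),
    "range": max(sample) - min(sample),  # maximum minus minimum
    "mode": mode(sample),                # most frequent value
    "mean": mean(sample),
    "standard deviation": stdev(sample), # assumed: sample (n - 1) form
}

for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```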
[Figure: Batting Average of 263 major league baseball players; axis "Aver Career", scale 0 to 0.4; annotated with Median 0.263, Maximum 0.352, Minimum 0.196]