economics 105: statistics

32
Economics 105: Statistics http://www.davidson.edu/academic/economics/foley/105/ index.html Powerpoint slides meant to help you listen in class print out BEFORE you do the day’s reading! things will seem fast if you do the reading AFTER lecture I expect you to do the reading prior to class. Everyone does the first few weeks, hard part is continuing to do so (before class). If you don’t, things will seem to go fast. Stats is one of those classes where you can teach yourself quite a bit via the

Upload: hanh

Post on 23-Feb-2016

57 views

Category:

Documents


0 download

DESCRIPTION

Economics 105: Statistics. http://www.davidson.edu/academic/economics/foley/105/index.html Powerpoint slides meant to help you listen in class print out BEFORE you do the day ’ s reading! things will seem fast if you do the reading AFTER lecture - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Economics 105: Statistics

Economics 105: Statistics• http://www.davidson.edu/academic/economics/foley/105/index.html

•Powerpoint slides – meant to help you listen in class– print out BEFORE you do the day’s reading! – things will seem fast if you do the reading AFTER lecture – I expect you to do the reading prior to class. Everyone does the first few weeks, hard part is continuing to do so (before class). If you don’t, things will seem to go fast. – Stats is one of those classes where you can teach yourself quite a bit via the reading. I expect you to do so. – yes, these are high expectations.– Not everyone likes Powerpoint ...

Page 2: Economics 105: Statistics

Economics 105: Statistics• Today

– What is Statistics? – Presenting data

• For next time: Read Chapters 1 – 3.5

Page 3: Economics 105: Statistics

Organizing and Presenting Data Graphically

• Data in raw form are usually not easy to use for decision making

–Some type of organization is needed•Table•Graph

•Techniques reviewed in Chapter 2:Bar charts and pie charts

Pareto diagramOrdered array

Stem-and-leaf displayFrequency distributions, histograms and polygons

Cumulative distributions and ogivesContingency tables

Scatter diagrams

Page 4: Economics 105: Statistics

Raw Form of DataExample: A manufacturer of insulation

randomly selects 20 winter days and records the daily high temperature

24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Page 5: Economics 105: Statistics

Tabulating Numerical Data: Frequency Distributions

•What is a Frequency Distribution?– A frequency distribution is a list or a table …– containing class groupings (ranges within

which the data fall) ...– and the corresponding frequencies with which

data fall within each grouping or category

Page 6: Economics 105: Statistics

Why Use a Frequency Distribution?

•It is a way to summarize numerical data

•It condenses the raw data into a more useful form

•It allows for a quick visual interpretation of the data

Page 7: Economics 105: Statistics

Class Intervals and Class Boundaries

• Each class grouping has the same width• Determine the width of each interval by

• Usually at least 5 but no more than 15 groupings

• Class boundaries never overlap• Round up the interval width to get desirable

endpoints

Page 8: Economics 105: Statistics

Frequency Distribution ExampleExample: A manufacturer of insulation

randomly selects 20 winter days and records the daily high temperature

24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Page 9: Economics 105: Statistics

• Sort raw data in ascending order:12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

• Find range: 58 - 12 = 46• Select number of classes: 5 (usually between 5 and 15)

• Compute class interval (width): 10 (46/5 then round up)

• Determine class boundaries (limits): – 10, 20, 30, 40, 50, 60

• Compute class midpoints: 15, 25, 35, 45, 55

• Count observations & assign to classes

Frequency Distribution Example(continued)

Page 10: Economics 105: Statistics

Frequency Distribution Example

Class Frequency

10 but less than 20 3 .15 1520 but less than 30 6 .30 3030 but less than 40 5 .25 25 40 but less than 50 4 .20 2050 but less than 60 2 .10 10 Total 20 1.00 100

RelativeFrequency Percentage

Data in ordered array:12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

(continued)

Page 11: Economics 105: Statistics

Tabulating Numerical Data: Cumulative Frequency

Class

10 but less than 20 3 15 3 1520 but less than 30 6 30 9 4530 but less than 40 5 25 14 7040 but less than 50 4 20 18 9050 but less than 60 2 10 20 100 Total 20 100

Percentage Cumulative Percentage

Data in ordered array:12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Frequency Cumulative Frequency

Page 12: Economics 105: Statistics

Graphing Numerical Data: The Histogram

• A graph of the data in a frequency distribution is called a histogram

• The class boundaries (or class midpoints) are shown on the horizontal axis

• the vertical axis is either frequency, relative frequency, or percentage

• Bars of the appropriate heights are used to represent the number of observations within each class

Page 13: Economics 105: Statistics

Class Midpoints

Histogram

(No gaps between

bars)

Class

10 but less than 20 15 320 but less than 30 25 630 but less than 40 35 540 but less than 50 45 450 but less than 60 55 2

FrequencyClass

Midpoint

Page 14: Economics 105: Statistics

Graphing Numerical Data: The Frequency Polygon

Class Midpoints

Class

10 but less than 20 15 320 but less than 30 25 630 but less than 40 35 540 but less than 50 45 450 but less than 60 55 2

FrequencyClass

Midpoint

(In a percentage polygon the vertical axis would be defined to show the percentage of observations per class)

Page 15: Economics 105: Statistics

Graphing Cumulative Frequencies: The Ogive (Cumulative % Polygon)

Class Boundaries (Not Midpoints)

Class

Less than 10 0 010 but less than 20 10 1520 but less than 30 20 4530 but less than 40 30 7040 but less than 50 40 9050 but less than 60 50 100

Cumulative Percentage

Lower class

boundary

Page 16: Economics 105: Statistics

Summary Measures

Arithmetic Mean

Median

Mode

Describing Data Numerically

Variance

Standard Deviation

Coefficient of Variation

Range

Interquartile Range

Geometric Mean

Skewness

Central Tendency Variation ShapeQuartiles

Page 17: Economics 105: Statistics

Measures of Central Tendency

Central Tendency

Arithmetic Mean Median Mode Geometric Mean

Overview

Midpoint of ranked values

Most frequently observed value

Page 18: Economics 105: Statistics

Arithmetic Mean• The arithmetic mean (mean) is the most

common measure of central tendency

– For a sample of size n:

Sample size Observed values

Page 19: Economics 105: Statistics

Arithmetic Mean• Mean = sum of values divided by the number of values• Affected by extreme values (outliers)

(continued)

0 1 2 3 4 5 6 7 8 9 10

Mean = 3

0 1 2 3 4 5 6 7 8 9 10

Mean = 4

Page 20: Economics 105: Statistics

Median• In an ordered array, the median is the “middle”

number (50% above, 50% below)

• Not affected by extreme values

0 1 2 3 4 5 6 7 8 9 10

Median = 3

0 1 2 3 4 5 6 7 8 9 10

Median = 3

Page 21: Economics 105: Statistics

Finding the Median• The location of the median:

– If the number of values is odd, the median is the middle number

– If the number of values is even, the median is the average of the two middle numbers

• Note that is not the value of the median,

only the position of the median in the ranked data

Page 22: Economics 105: Statistics

Mode• A measure of central tendency• Value that occurs most often• Not affected by extreme values• Used for either numerical or categorical

(nominal) data• There may may be no mode• There may be several modes

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

Mode = 9

0 1 2 3 4 5 6

No Mode

Page 23: Economics 105: Statistics

• Five houses on a hill by the beach

Review Example

House Prices:

$2,000,000 500,000 300,000 100,000 100,000

Page 24: Economics 105: Statistics

Review Example:Summary Statistics

• Mean: ($3,000,000/5) = $600,000

• Median: middle value of ranked data = $300,000

• Mode: most frequent value = $100,000

House Prices:

$2,000,000 500,000 300,000 100,000 100,000Sum $3,000,000

Page 25: Economics 105: Statistics

• Mean is generally used, unless extreme values (outliers) exist

• Then median is often used, since the median is not sensitive to extreme values.– Example: Median home prices may

be reported for a region – less sensitive to outliers

Which measure of location is the “best”?

Page 26: Economics 105: Statistics

Geometric Mean• Geometric mean

– Used to measure the rate of change of a variable over time

• Geometric mean rate of return– Measures the status of an investment over time

– Where Ri is the rate of return in time period i

Page 27: Economics 105: Statistics

ExampleAn investment of $100,000 declined to $50,000

at the end of year one and rebounded to $100,000 at end of year two:

50% decrease 100% increase

The overall two-year return is zero, since it started and ended at the same level.

Page 28: Economics 105: Statistics

Example

Use the 1-year returns to compute the arithmetic mean and the geometric mean:

Arithmetic mean rate of return:

Geometric mean rate of return:

Misleading result

More accurate result

(continued)

Page 29: Economics 105: Statistics

Quartiles• Quartiles split the ranked data into 4 segments

with an equal number of values per segment

25% 25% 25% 25%

• The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger

• Q2 is the same as the median (50% are smaller, 50% are larger)

• Only 25% of the observations are greater than the third quartile

Q1 Q2 Q3

Page 30: Economics 105: Statistics

Quartile FormulasFind a quartile by determining the value in the appropriate position in the ranked data, where

First quartile position: Q1 = (n+1)/4

Second quartile position: Q2 = (n+1)/2 (the median position)

Third quartile position: Q3 = 3(n+1)/4

where n is the number of observed values

Page 31: Economics 105: Statistics

(n = 9)

Q1 is in the (9+1)/4 = 2.5 position of the ranked

data, so use the value half way between the 2nd and 3rd values, so Q1 = 12.5

Quartiles

Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22

• Example: Find the first quartile

Q1 and Q3 are measures of noncentral location Q2 = median, a measure of central tendency

Page 32: Economics 105: Statistics

(n = 9)Q1 is in the (9+1)/4 = 2.5 position of the ranked data,

so Q1 = 12.5

Q2 is in the (9+1)/2 = 5th position of the ranked data,so Q2 = median = 16

Q3 is in the 3(9+1)/4 = 7.5 position of the ranked data,so Q3 = 19.5

Quartiles

Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22

• Example: (continued)