EC114 Introduction to Quantitative Economics
1. Introduction to Statistics

Department of Economics
University of Essex

11/13 October 2011


Outline

Reference: R. L. Thomas, Using Statistics in Economics, McGraw-Hill, 2005, Prerequisites 1 and 2.

1 Statistics in Economics

2 Descriptive Statistics

Statistics in Economics

Why Study Quantitative Economics?

In Economics, we make arguments using quantitative evidence.

• A historian might defend an argument using historical quotations.
• Economists make arguments using quantities, e.g. “Unemployment rose last year by 1 million because GDP fell by 0.5%”.

We use statistics to interpret quantitative evidence.

The good news is that knowledge of statistics pays very well!


In Economics we use Statistics (and Econometrics) to analyse and interpret economic data with a view to:

1 modelling economic relationships;
  - What determines wages? Age, experience, occupation, education?

2 testing economic theories;
  - Are share price movements unpredictable?

3 identifying trends;
  - Are global air temperatures rising?

4 forecasting/prediction;
  - We predict GDP next year given different government tax policies.

5 making better decisions.
  - Which assets should we buy?


• The data we observe can be for different types of variable:
  1 aggregate: relating to the whole economy or specific sectors/regions, e.g. consumers’ expenditure, producer price inflation.
  2 individual: relating to individual firms or households, e.g. household expenditure, firms’ investment expenditure.

• The data can be observed in different ways:
  1 time series, i.e. for a given variable over time, e.g. consumers’ expenditure from 1955–2009;
  2 cross section, i.e. for a given variable at a particular point in time, e.g. car firms’ investment in 2005;
  3 panel data, i.e. for a variable on individual units over time, e.g. households’ expenditure in the U.K. from 1990–2009.

Descriptive Statistics

• What can we learn from data? We learn little from looking at a large set of numbers, so we attempt to summarise the data using descriptive statistics.
• Table P.1 in Thomas reports the yearly clothing expenditure of 373 families and uses this data set to illustrate the use of descriptive statistics; try to follow what he does.
• We shall use a smaller data set of 10 observations picked from Table P.1, these being

  2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651

• These are observations on a variable we shall denote X.
• We shall use the index i to denote a generic observation $X_i$, where i takes on the values 1, 2, 3, . . . , 9, 10.
• Hence $X_1 = 2806$, $X_2 = 1743$, . . . , $X_9 = 1572$, $X_{10} = 2651$.


Measures of Central Tendency

• What is a typical value for X in the data set?
• The most common answer is to compute the (arithmetic) mean, or average, of the values for X:

$$\bar{X} = \frac{X_1 + X_2 + \ldots + X_9 + X_{10}}{10} = \frac{24{,}566}{10} = 2{,}456.6;$$

the typical clothing expenditure is £2,456.60.
• In general, if we have n observations, we would write

$$\bar{X} = \frac{X_1 + X_2 + \ldots + X_{n-1} + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n}.$$
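The mean calculation can be reproduced in a few lines of Python. This is only an illustrative sketch: the list name `X` and the helper `mean` are our own choices, not part of the lecture or of Thomas's text.

```python
# Ten clothing-expenditure observations quoted on the slides (in pounds).
X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]


def mean(values):
    """Arithmetic mean: the sum of the observations divided by their number."""
    return sum(values) / len(values)


n = len(X)       # number of observations
total = sum(X)   # X_1 + X_2 + ... + X_n
x_bar = mean(X)  # sample mean

print(f"n = {n}, sum = {total}, mean = {x_bar:.1f}")
```

The helper assumes a non-empty list; an empty list would raise a `ZeroDivisionError`.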


The summation notation is very useful and the following properties are worth learning:

1 If α is a constant, i.e. it does not vary and does not depend on i, then

$$\sum_{i=1}^{n} \alpha X_i = \alpha X_1 + \ldots + \alpha X_n = \alpha \sum_{i=1}^{n} X_i,$$

$$\sum_{i=1}^{n} \alpha = \alpha + \ldots + \alpha \ (n \text{ times}) = n\alpha.$$

2 If X and Y are two variables, with n observations on each, then

$$\sum_{i=1}^{n} (X_i + Y_i) = (X_1 + Y_1) + \ldots + (X_n + Y_n) = \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i.$$


The above properties can also be combined: if α and β are two constants, then

$$\sum_{i=1}^{n} (\alpha X_i + \beta Y_i) = \sum_{i=1}^{n} \alpha X_i + \sum_{i=1}^{n} \beta Y_i = \alpha \sum_{i=1}^{n} X_i + \beta \sum_{i=1}^{n} Y_i.$$
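The summation rules can be checked numerically. In the sketch below, `X` is the clothing-expenditure sample from the slides, while the second variable `Y` and the constants `alpha` and `beta` are hypothetical values invented purely for illustration. Integers are used so that both sides of each identity agree exactly.

```python
# Clothing-expenditure sample from the slides.
X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]
# Hypothetical second variable and constants (assumptions, not from the lecture).
Y = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
alpha, beta = 3, 2
n = len(X)

# Property 1: a constant factor can be taken outside the sum ...
lhs_scale = sum(alpha * x for x in X)
rhs_scale = alpha * sum(X)
# ... and summing the constant itself n times gives n * alpha.
const_sum = sum(alpha for _ in range(n))

# Property 2: the sum of (X_i + Y_i) splits into two separate sums.
lhs_add = sum(x + y for x, y in zip(X, Y))
rhs_add = sum(X) + sum(Y)

# Combined rule: sum of (alpha * X_i + beta * Y_i).
lhs_lin = sum(alpha * x + beta * y for x, y in zip(X, Y))
rhs_lin = alpha * sum(X) + beta * sum(Y)

print(lhs_scale == rhs_scale, const_sum == n * alpha,
      lhs_add == rhs_add, lhs_lin == rhs_lin)
```

A numerical check on one data set does not prove the identities, but it is a quick way to confirm that the notation has been read correctly.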

Sometimes the summation is written $\sum_{i=1}^{n}$ or $\sum_i$ or simply $\sum$ when the range of summation is obvious. For example, the sample mean may be written

$$\bar{X} = \frac{\sum_i X_i}{n} = \frac{\sum X_i}{n}.$$


• Another measure of the typical value is the median, M.
• It is obtained by arranging the data in ascending order and choosing the value in the middle.
• If n is odd, then, with the observations indexed in ascending order,

$$M = X_{(n+1)/2}$$

e.g. if n = 125 then (n + 1)/2 = (125 + 1)/2 = 63 so that M is the 63rd observation:

$$M = X_{63}.$$

• Note that there are 62 observations below M and 62 observations above M.


• If n is even, M is the average of the two middle numbers:

$$M = \frac{X_{n/2} + X_{n/2+1}}{2}$$

e.g. if n = 126 then n/2 = 63 and n/2 + 1 = 64 so that

$$M = \frac{X_{63} + X_{64}}{2}.$$

• Putting our sample of 10 observations in ascending order:

  1572  1666  1743  2111  2401  2651  2806  2848  3201  3567

• Here, n = 10 and so n/2 = 5 and n/2 + 1 = 6; hence, with $X_5$ and $X_6$ now denoting the 5th and 6th values in the ordered sample,

$$M = \frac{X_5 + X_6}{2} = \frac{2401 + 2651}{2} = \frac{5052}{2} = 2526.$$
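The odd and even cases can be captured in one small function. The sketch below is our own illustration (the name `median` and the zero-based indexing are implementation choices, not notation from the slides). Position k in the ordered sample corresponds to list index k − 1.

```python
# Clothing-expenditure sample from the slides.
X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]


def median(values):
    """Median of a non-empty list, following the odd/even rule on the slides."""
    ordered = sorted(values)  # arrange the data in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)-th ordered value, i.e. zero-based index n // 2
        return ordered[mid]
    # n even: average of the (n / 2)-th and (n / 2 + 1)-th ordered values
    return (ordered[mid - 1] + ordered[mid]) / 2


M = median(X)
print(sorted(X))
print("median =", M)
```

Python's standard library offers `statistics.median`, which applies the same rule; the hand-written version is shown only to make the two cases explicit.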


• A third measure of central tendency is the mode, which is the most frequent observation.
• In our sample of 10 clothing expenditures the mode has little meaning because all the observations are different!
• But when values are repeated in a data set the mode depicts the most common value.

• The mean is used most widely but can be distorted by extreme values, in which case the median is more meaningful.
• Suppose the largest observation in our data set was not 3567 but 13567, which is an extreme value compared to the other nine observations.
• In this case the mean becomes 34,566/10 = 3,456.6 (larger than all but the largest, extreme, observation) but the median remains unchanged at 2526.
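The effect of the extreme value is easy to reproduce with Python's standard `statistics` module. The sketch below is illustrative only: `X_outlier` is the modified sample described above, and `repeated` is a hypothetical list invented to show a case where the mode is informative.

```python
from collections import Counter
import statistics

# Clothing-expenditure sample from the slides.
X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]
# Same sample with the largest observation replaced by the extreme value 13567.
X_outlier = [13567 if x == 3567 else x for x in X]

# Every value occurs exactly once, so the mode carries no information here.
highest_count = max(Counter(X).values())

mean_before, mean_after = statistics.mean(X), statistics.mean(X_outlier)
median_before, median_after = statistics.median(X), statistics.median(X_outlier)

# Hypothetical data with repeated values, where the mode is meaningful.
repeated = [1, 2, 2, 3]
mode_repeated = statistics.mode(repeated)

print("highest frequency in X:", highest_count)
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
print("mode of", repeated, "is", mode_repeated)
```

The mean moves by 1,000 because the one changed observation adds 10,000 to the sum, while the median depends only on the two middle values and so does not move.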


Measures of Variation

• By how much do the observations vary around their central value?
• The variance is the mean squared deviation around the mean.
• Let $x_i$ denote the deviation of observation i from the mean, $\bar{X}$, i.e. $x_i = X_i - \bar{X}$.
• The squared deviation is $x_i^2 = (X_i - \bar{X})^2$, and the mean of these (the variance) is

$$v^2 = \frac{\sum_i x_i^2}{n} = \frac{\sum_i (X_i - \bar{X})^2}{n}.$$

• It is often easier to calculate

$$v^2 = \frac{\sum_i X_i^2}{n} - \bar{X}^2.$$
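The slides state the second formula without proof. One way to see that it agrees with the definition is to expand the square and apply the summation properties from earlier, treating $\bar{X}$ as a constant that does not depend on i. This is a sketch of the standard argument, not a derivation taken from the lecture:

```latex
\begin{align*}
v^2 &= \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2
     = \frac{1}{n}\sum_{i=1}^{n} \left( X_i^2 - 2\bar{X}X_i + \bar{X}^2 \right) \\
    &= \frac{\sum_{i=1}^{n} X_i^2}{n}
       - 2\bar{X}\,\frac{\sum_{i=1}^{n} X_i}{n}
       + \frac{n\bar{X}^2}{n} \\
    &= \frac{\sum_{i=1}^{n} X_i^2}{n} - 2\bar{X}^2 + \bar{X}^2
     = \frac{\sum_{i=1}^{n} X_i^2}{n} - \bar{X}^2 .
\end{align*}
```

The second line uses $\sum_i \alpha X_i = \alpha \sum_i X_i$ with $\alpha = 2\bar{X}$ and $\sum_i \alpha = n\alpha$ with $\alpha = \bar{X}^2$; the third uses $\sum_i X_i / n = \bar{X}$.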


Returning to our 10 observations on clothing expenditure:

   i      X_i    X_i − X̄    (X_i − X̄)²        X_i²
   1     2806      349.4     122,080.36    7,873,636
   2     1743     −713.6     509,224.96    3,038,049
 ...      ...        ...            ...          ...
   9     1572     −884.6     782,517.16    2,471,184
  10     2651      194.4      37,791.36    7,027,801
Sums   24,566        0.0   4,139,506.40   64,488,342

Hence

$$v^2 = \frac{4{,}139{,}506.40}{10} = 413{,}950.64$$

or

$$v^2 = \frac{64{,}488{,}342}{10} - (2{,}456.6)^2 = 6{,}448{,}834.2 - 6{,}034{,}883.56 = 413{,}950.64.$$
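Both routes to the variance can be checked with a short script. This is a minimal sketch using our own variable names. It also fills in the table columns for all ten observations, including the rows the slide abbreviates, by computing them directly from the data.

```python
# Clothing-expenditure sample from the slides.
X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]
n = len(X)
x_bar = sum(X) / n

deviations = [x - x_bar for x in X]           # X_i - X-bar
squared_devs = [d ** 2 for d in deviations]   # (X_i - X-bar)^2
squares = [x ** 2 for x in X]                 # X_i^2

# Route 1: mean squared deviation around the mean (the definition).
v2_definition = sum(squared_devs) / n
# Route 2: mean of the squares minus the square of the mean.
v2_shortcut = sum(squares) / n - x_bar ** 2

# Print one row per observation, mirroring the table on the slide.
for i, (x, d, d2, x2) in enumerate(zip(X, deviations, squared_devs, squares), start=1):
    print(f"{i:>2} {x:>6} {d:>8.1f} {d2:>14,.2f} {x2:>12,}")

print(f"v^2 (definition) = {v2_definition:,.2f}")
print(f"v^2 (shortcut)   = {v2_shortcut:,.2f}")
```

Because the deviations are floating-point numbers, the two results may differ in the last few binary digits, so they are best compared after rounding rather than with `==`.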


• What does a variance of 413,950.64 actually mean?
• We can make relative statements, e.g. it is larger than a variance of 10 and smaller than a variance of one million, but can we say any more about the variation about the mean?
• It is common to consider the standard deviation, the positive square root of $v^2$:

$$v = \sqrt{\frac{\sum_i x_i^2}{n}}.$$

• For our data set we find that $v = \sqrt{413{,}950.64} = 643.39$, which we interpret as being approximately the average deviation of observations from their mean.
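The standard deviation can be cross-checked against Python's `statistics` module. One point needs care: the slides divide by n, which corresponds to `statistics.pvariance` and `statistics.pstdev`, whereas `statistics.variance` and `statistics.stdev` divide by n − 1. The sketch below uses our own variable names and shows the n − 1 version only for contrast.

```python
import math
import statistics

# Clothing-expenditure sample from the slides.
X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]

v2 = statistics.pvariance(X)      # divides by n, as on the slides
v = math.sqrt(v2)                 # positive square root of the variance
v_library = statistics.pstdev(X)  # same quantity computed directly

# For contrast only: the n - 1 divisor gives a slightly larger value.
s = statistics.stdev(X)

print(f"v^2 = {v2:,.2f}")
print(f"v   = {v:.2f}")
print(f"n - 1 version = {s:.2f}")
```

Unlike the variance, which is measured in squared pounds, the standard deviation is in pounds, the same units as the data, which is what makes it easier to interpret.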


• An alternative measure of variation is to take the mean absolute deviation rather than the mean squared deviation from the mean:

$$\text{mdev} = \frac{\sum_i |X_i - \bar{X}|}{n}.$$

• From our previous table we find:

   i      X_i    X_i − X̄    |X_i − X̄|
   1     2806      349.4        349.4
   2     1743     −713.6        713.6
 ...      ...        ...          ...
   9     1572     −884.6        884.6
  10     2651      194.4        194.4
Sums   24,566        0.0      5,580.0


• Hence we find that mdev = 5580/10 = 558.
• Although mdev may seem the natural measure of the average variation from the mean, it is used less widely than the standard deviation, mainly because:
  1 mathematically it is easier to analyse squared values than absolute values, and therefore...
  2 the theory about variance and standard deviation is more developed.
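The mean absolute deviation takes only a couple of lines to compute. The sketch below (variable names are our own) also places it next to the standard deviation for the same sample.

```python
import math

# Clothing-expenditure sample from the slides.
X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]
n = len(X)
x_bar = sum(X) / n

abs_devs = [abs(x - x_bar) for x in X]  # |X_i - X-bar|
mdev = sum(abs_devs) / n                # mean absolute deviation

# Standard deviation (divisor n) for comparison.
v = math.sqrt(sum((x - x_bar) ** 2 for x in X) / n)

print(f"sum of absolute deviations = {sum(abs_devs):,.1f}")
print(f"mdev = {mdev:.1f}, standard deviation = {v:.2f}")
```

Here mdev is smaller than the standard deviation. That holds for any data set (with equality only in special cases), because squaring gives extra weight to the larger deviations.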


Summary

• Measures of central tendency (mean, mode, median)
• Measures of variation (variance, standard deviation, mean deviation)

Next week:

• Frequency distributions
• Mutually exclusive and independent events
• Conditional probabilities
