EC114 Introduction to Quantitative Economics
1. Introduction to Statistics
Department of Economics
University of Essex
11/13 October 2011
Outline
Reference: R. L. Thomas, Using Statistics in Economics, McGraw-Hill, 2005, Prerequisites 1 and 2.
1 Statistics in Economics
2 Descriptive Statistics
Statistics in Economics
Why Study Quantitative Economics?
In Economics, we make an argument using quantitative evidence.
• A historian might defend an argument using historical quotations.
• Economists make arguments using quantities, e.g. “Unemployment rose last year by 1 million because GDP fell by 0.5%”.
• We use statistics to interpret quantitative evidence.
• The good news is that knowledge of statistics pays very well!
In Economics we use Statistics (and Econometrics) to analyse and interpret economic data with a view to:
1 modelling economic relationships;
  - What determines wages? Age, experience, occupation, education?
2 testing economic theories;
  - Are share price movements unpredictable?
3 identifying trends;
  - Are global air temperatures rising?
4 forecasting/prediction;
  - We predict GDP next year given different government tax policies.
5 making better decisions.
  - Which assets should we buy?
• The data we observe can be for different types of variable:
  1 aggregate: relating to the whole economy or specific sectors/regions, e.g. consumers’ expenditure, producer price inflation.
  2 individual: relating to individual firms or households, e.g. household expenditure, firms’ investment expenditure.
• The data can be observed in different ways:
  1 time series, i.e. for a given variable over time, e.g. consumers’ expenditure from 1955–2009;
  2 cross section, i.e. for a given variable at a particular point in time, e.g. car firms’ investment in 2005;
  3 panel data, i.e. for a variable on individual units over time, e.g. households’ expenditure in the U.K. from 1990–2009.
Descriptive Statistics
• What can we learn from data? We learn little from looking at a large set of numbers, so we attempt to summarise the data using descriptive statistics.
• Table P.1 in Thomas reports the yearly clothing expenditure of 373 families and uses this data set to illustrate the use of descriptive statistics – try to follow what he does.
• We shall use a smaller data set of 10 observations picked from Table P.1, these being
  2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651
• These are observations on a variable we shall denote X.
• We shall use the index i to denote a generic observation \(X_i\), where i takes on the values 1, 2, 3, . . . , 9, 10.
• Hence \(X_1 = 2806\), \(X_2 = 1743\), . . . , \(X_9 = 1572\), \(X_{10} = 2651\).
Measures of Central Tendency
• What is a typical value for X in the data set?
• The most common answer is to compute the (arithmetic) mean, or average, of the values for X:

\[ \bar{X} = \frac{X_1 + X_2 + \ldots + X_9 + X_{10}}{10} = \frac{24{,}566}{10} = 2{,}456.6; \]

the typical clothing expenditure is £2,456.60.
• In general, if we have n observations, we would write

\[ \bar{X} = \frac{X_1 + X_2 + \ldots + X_{n-1} + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n}. \]
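The mean calculation can be reproduced in a few lines of Python. This is only a minimal sketch: the list name `X` and the helper `mean` are illustrative choices and are not part of the lecture material.

```python
# The ten clothing-expenditure observations quoted in the slides (in pounds).
X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]


def mean(values):
    """Arithmetic mean: the sum of the observations divided by their number."""
    return sum(values) / len(values)


x_bar = mean(X)
print(sum(X))  # total of the ten observations
print(x_bar)   # sample mean
```

The two printed numbers correspond to the numerator 24,566 and the mean 2,456.6 in the formula above.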
The summation notation is very useful and the following properties are worth learning:

1 If α is a constant, i.e. it does not vary and does not depend on i, then

\[ \sum_{i=1}^{n} \alpha X_i = \alpha X_1 + \ldots + \alpha X_n = \alpha \sum_{i=1}^{n} X_i, \]

\[ \sum_{i=1}^{n} \alpha = \alpha + \ldots + \alpha \ (n \text{ times}) = n\alpha. \]

2 If X and Y are two variables, with n observations on each, then

\[ \sum_{i=1}^{n} (X_i + Y_i) = (X_1 + Y_1) + \ldots + (X_n + Y_n) = \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i. \]
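The above properties can also be combined: if α and β are two constants, then

\[ \sum_{i=1}^{n} (\alpha X_i + \beta Y_i) = \sum_{i=1}^{n} \alpha X_i + \sum_{i=1}^{n} \beta Y_i = \alpha \sum_{i=1}^{n} X_i + \beta \sum_{i=1}^{n} Y_i. \]

Sometimes the summation is written \(\sum_{i=1}^{n}\) or \(\sum_i\) or simply \(\sum\) when the range of summation is obvious.
For example, the sample mean may be written

\[ \bar{X} = \frac{\sum_i X_i}{n} = \frac{\sum X_i}{n}. \]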
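The summation properties can be checked numerically. The sketch below is illustrative only: the second variable `Y` and the constants `alpha` and `beta` are made-up values that do not come from the lecture or from Thomas's data, and a numerical check on one data set is not a proof.

```python
import math

# X is the clothing-expenditure sample from the slides.
X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]
# Y is a hypothetical second variable, invented purely to exercise the identity.
Y = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
alpha, beta = 2.0, -3.0  # arbitrary constants chosen for illustration
n = len(X)

# Combined property: sum(alpha*X_i + beta*Y_i) = alpha*sum(X_i) + beta*sum(Y_i)
lhs = sum(alpha * x + beta * y for x, y in zip(X, Y))
rhs = alpha * sum(X) + beta * sum(Y)
print(math.isclose(lhs, rhs))

# Summing a constant n times gives n * alpha.
constant_sum = sum(alpha for _ in range(n))
print(constant_sum == n * alpha)
```

`math.isclose` is used instead of `==` for the first check because floating-point sums can differ by rounding error in general.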
• Another measure of the typical value is the median, M.
• It is obtained by arranging the data in ascending order and choosing the value in the middle.
• If n is odd, then

\[ M = X_{(n+1)/2} \]

e.g. if n = 125 then (n + 1)/2 = (125 + 1)/2 = 63 so that M is the 63rd observation in the ordered data:

\[ M = X_{63}. \]

• Note that there are 62 observations below M and 62 observations above M.
• If n is even, M is the average of the two middle numbers:

\[ M = \frac{X_{n/2} + X_{n/2+1}}{2} \]

e.g. if n = 126 then n/2 = 63 and n/2 + 1 = 64 so that

\[ M = \frac{X_{63} + X_{64}}{2}. \]

• Putting our sample of 10 observations in ascending order:
  1572 1666 1743 2111 2401 2651 2806 2848 3201 3567
• Here, n = 10 and so n/2 = 5 and n/2 + 1 = 6; hence

\[ M = \frac{X_5 + X_6}{2} = \frac{2401 + 2651}{2} = \frac{5052}{2} = 2526. \]
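The odd and even cases of the median rule can be written as one short function. This is a sketch with illustrative names (`median`, `ordered`, `mid`); note that Python lists are indexed from 0, so the slide's 5th and 6th ordered observations are `ordered[4]` and `ordered[5]`.

```python
X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]


def median(values):
    """Median: the middle value of the data once sorted in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)-th ordered value, i.e. 0-based index n // 2.
        return ordered[mid]
    # n even: average of the (n / 2)-th and (n / 2 + 1)-th ordered values.
    return (ordered[mid - 1] + ordered[mid]) / 2


M = median(X)
print(M)
```

Sorting a copy with `sorted` leaves the original list `X` in its observed order.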
• A third measure of central tendency is the mode, which is the most frequent observation.
• In our sample of 10 clothing expenditures the mode has little meaning because all the observations are different!
• But when values are repeated in a data set the mode depicts the most common value.

• The mean is used most widely but can be distorted by extreme values, in which case the median is more meaningful.
• Suppose the largest observation in our data set was not 3567 but 13567, which is an extreme value compared to the other nine observations.
• In this case the mean becomes 34,566/10 = 3,456.6 (larger than all but the largest, extreme, observation) but the median remains unchanged at 2526.
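The effect of the extreme value, and the uninformative mode, can be checked with Python's standard `statistics` module. This is a sketch: the name `X_extreme` is an illustrative choice, and `multimode` assumes Python 3.8 or later.

```python
from statistics import mean, median, multimode

X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]
# Replace the largest observation, 3567, with the extreme value 13567.
X_extreme = [13567 if x == 3567 else x for x in X]

print(mean(X), median(X))                  # original sample
print(mean(X_extreme), median(X_extreme))  # mean shifts, median does not

# multimode returns every value that ties for most frequent. All ten
# observations are distinct, so all ten are returned and the mode says little.
modes = multimode(X)
print(len(modes))
```

Only the mean responds to the change in the largest observation, because the median depends on the two middle values of the ordered data and those are unaffected.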
Measures of Variation
• By how much do the observations vary around their central value?
• The variance is the mean squared deviation around the mean.
• Let \(x_i\) denote the deviation of observation i from the mean, \(\bar{X}\), i.e. \(x_i = X_i - \bar{X}\).
• The squared deviation is \(x_i^2 = (X_i - \bar{X})^2\), and the mean of these (the variance) is

\[ v^2 = \frac{\sum_i x_i^2}{n} = \frac{\sum_i (X_i - \bar{X})^2}{n}. \]

• It is often easier to calculate

\[ v^2 = \frac{\sum_i X_i^2}{n} - \bar{X}^2. \]
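The slides state the second formula without derivation. One way to see that the two expressions agree is to expand the square and apply the summation properties together with the definition of the mean, \(\sum_i X_i = n\bar{X}\):

```latex
\begin{align*}
v^2 &= \frac{1}{n}\sum_i (X_i - \bar{X})^2
     = \frac{1}{n}\sum_i \left( X_i^2 - 2\bar{X}X_i + \bar{X}^2 \right) \\
    &= \frac{\sum_i X_i^2}{n} - 2\bar{X}\,\frac{\sum_i X_i}{n} + \frac{n\bar{X}^2}{n}
     = \frac{\sum_i X_i^2}{n} - 2\bar{X}^2 + \bar{X}^2 \\
    &= \frac{\sum_i X_i^2}{n} - \bar{X}^2 .
\end{align*}
```

Here \(\bar{X}\) is treated as a constant because it does not depend on i.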
Returning to our 10 observations on clothing expenditure:

   i      X_i    X_i − X̄    (X_i − X̄)^2        X_i^2
   1     2806      349.4     122,080.36    7,873,636
   2     1743     −713.6     509,224.96    3,038,049
 ...      ...        ...            ...          ...
   9     1572     −884.6     782,517.16    2,471,184
  10     2651      194.4      37,791.36    7,027,801
Sums   24,566        0.0   4,139,506.40   64,488,342

Hence

\[ v^2 = \frac{4{,}139{,}506.40}{10} = 413{,}950.64 \]

or

\[ v^2 = \frac{64{,}488{,}342}{10} - (2{,}456.6)^2 = 6{,}448{,}834.2 - 6{,}034{,}883.56 = 413{,}950.64. \]
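Both routes to the variance can be reproduced from the ten observations. This is a minimal sketch with illustrative variable names; because it uses floating-point arithmetic, the two results agree with 413,950.64 only up to small rounding error.

```python
X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]
n = len(X)
x_bar = sum(X) / n

# Column sums from the table.
sum_dev = sum(x - x_bar for x in X)            # deviations sum to (about) zero
sum_sq_dev = sum((x - x_bar) ** 2 for x in X)  # sum of squared deviations
sum_sq = sum(x ** 2 for x in X)                # sum of squared observations

# Definition: mean squared deviation around the mean (divisor n).
v2_definition = sum_sq_dev / n
# Shortcut: mean of the squares minus the square of the mean.
v2_shortcut = sum_sq / n - x_bar ** 2

print(round(v2_definition, 2), round(v2_shortcut, 2))
```

The shortcut avoids computing each deviation, which is why it is easier by hand, although it subtracts two large numbers and so can lose precision on a computer when the mean is large relative to the spread.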
• What does a variance of 413,950.64 actually mean?
• We can make relative statements, e.g. it is larger than a variance of 10 and smaller than a variance of one million, but can we say any more about the variation about the mean?
• It is common to consider the standard deviation, the positive square root of \(v^2\):

\[ v = \sqrt{\frac{\sum_i x_i^2}{n}}. \]

• For our data set we find that \(v = \sqrt{413{,}950.64} = 643.39\), which we interpret as being approximately the average deviation of observations from their mean.
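The standard deviation can be computed directly or with the standard library. As a sketch: `statistics.pstdev` divides by n, matching the definition used in these slides, whereas `statistics.stdev` divides by n − 1 and would give a different number.

```python
import math
from statistics import pstdev

X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]
n = len(X)
x_bar = sum(X) / n

# Positive square root of the mean squared deviation (divisor n).
v_manual = math.sqrt(sum((x - x_bar) ** 2 for x in X) / n)
# Library equivalent: the "population" standard deviation, also with divisor n.
v_library = pstdev(X)

print(round(v_manual, 2), round(v_library, 2))
```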
• An alternative measure of variation is to take the mean absolute deviation rather than the mean squared deviation from the mean:

\[ \text{mdev} = \frac{\sum_i |X_i - \bar{X}|}{n}. \]

• From our previous table we find:

   i      X_i    X_i − X̄    |X_i − X̄|
   1     2806      349.4        349.4
   2     1743     −713.6        713.6
 ...      ...        ...          ...
   9     1572     −884.6        884.6
  10     2651      194.4        194.4
Sums   24,566        0.0      5,580.0
• Hence we find that mdev = 5580/10 = 558.
• Although mdev may seem the natural measure of the average variation from the mean, it is used less widely than the standard deviation, mainly because:
  1 mathematically it is easier to analyse squared values than absolute values, and therefore...
  2 the theory about variance and standard deviation is more developed.
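The mean absolute deviation for the ten observations can be checked in the same way as the variance. This is a sketch with illustrative variable names; the result is subject to small floating-point rounding.

```python
X = [2806, 1743, 3201, 2401, 3567, 1666, 2111, 2848, 1572, 2651]
n = len(X)
x_bar = sum(X) / n

# Sum of absolute deviations from the mean, then divide by n.
sum_abs_dev = sum(abs(x - x_bar) for x in X)
mdev = sum_abs_dev / n

print(round(sum_abs_dev, 1), round(mdev, 1))
```

For this sample mdev (558) is smaller than the standard deviation (643.39), since squaring gives more weight to the larger deviations.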
Summary
• Measures of central tendency (mean, mode, median)
• Measures of variation (variance, standard deviation, mean absolute deviation)

Next week:

• Frequency distributions
• Mutually exclusive and independent events
• Conditional probabilities