ECON 377/477
Topic 4.1
Stochastic Frontier Analysis: Part 1
Outline
• Introduction
• The stochastic production frontier
• Estimating the parameters
• Predicting technical efficiency
• Hypothesis testing
• Conclusion
Introduction
• Assume cross-sectional data on I firms
• A simple method to estimate a production frontier using such data is to envelop the data points using an arbitrarily chosen function
• Consider a Cobb-Douglas production frontier:
$$\ln q_i = \mathbf{x}_i'\boldsymbol{\beta} - u_i, \qquad i = 1, \dots, I$$
where qi is the output of the i-th firm; xi is a K×1 vector containing the logarithms of inputs; β is a vector of unknown parameters; and ui is a non-negative random variable associated with technical inefficiency
Introduction
• This production frontier is deterministic insofar as qi is bounded from above by the non-stochastic (deterministic) quantity exp(xiβ)
• A problem with frontiers of this type (and with the DEA frontier studied in Topic 3) is that no account is taken of measurement errors and other sources of statistical noise
• All deviations from the frontier are assumed to be the result of technical inefficiency
Introduction
• A solution to the problem is to introduce another random variable representing statistical noise
• The resulting frontier is known as a stochastic production frontier, the estimation of which is the focus of this topic
• We begin by describing the basic stochastic production frontier model, where (the logarithm of) output is specified as a function of:
o a non-negative random error, which represents technical inefficiency
o a symmetric random error accounting for noise
The stochastic production frontier
• The stochastic frontier production function model is of the form:
$$\ln q_i = \mathbf{x}_i'\boldsymbol{\beta} + v_i - u_i$$
where vi is a symmetric random error to account for statistical noise
• The model is called a stochastic frontier production function because the output values are bounded from above by the stochastic (random) variable exp(xiβ + vi)
The stochastic production frontier
• The random error vi can be positive or negative
• Therefore, the stochastic frontier outputs vary about the deterministic part of the model, exp(xiβ)
• These features of the stochastic frontier model can be illustrated graphically
• It is convenient to restrict attention to firms that produce the output qi using only one input, xi
The stochastic production frontier
• In this case, a Cobb-Douglas stochastic frontier model takes the form:
$$\ln q_i = \beta_0 + \beta_1 \ln x_i + v_i - u_i$$
• Alternatively,
$$q_i = \underbrace{\exp(\beta_0 + \beta_1 \ln x_i)}_{\text{deterministic component}} \times \underbrace{\exp(v_i)}_{\text{noise}} \times \underbrace{\exp(-u_i)}_{\text{inefficiency}}$$
The stochastic production frontier
• Such a frontier is depicted on the next slide where we plot the inputs and outputs of two firms, A and B
• The deterministic component of the frontier model has been drawn to reflect the existence of diminishing returns to scale
• Values of the input are measured along the horizontal axis and outputs are measured on the vertical axis
• Firm A uses the input level xA to produce the output qA, while Firm B uses the input level xB to produce the output qB
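The stochastic production frontier
[Figure: the stochastic production frontier. Horizontal axis: input xi, with xA and xB marked. Vertical axis: output qi, with qA and qB marked. The curve is the deterministic frontier qi = exp(β0 + β1 ln xi). For each of Firms A and B the figure marks a noise effect and an inefficiency effect. The frontier outputs qA* ≡ exp(β0 + β1 ln xA + vA) and qB* ≡ exp(β0 + β1 ln xB + vB) are the outputs that would be observed if there were no inefficiency effects (uA = uB = 0).]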
The stochastic production frontier
• The frontier output for Firm A lies above the deterministic part of the production frontier only because the noise effect is positive (vA > 0)
• The frontier output for Firm B lies below the deterministic part of the frontier because the noise effect is negative (i.e., vB < 0)
• The observed output of Firm A lies below the deterministic part of the frontier because the sum of the noise and inefficiency effects is negative (vA – uA < 0)
The stochastic production frontier
• The (unobserved) frontier outputs tend to be evenly distributed above and below the deterministic part of the frontier
• But observed outputs tend to lie below the deterministic part of the frontier
• Indeed, they can only lie above the deterministic part of the frontier when the noise effect is positive and larger than the inefficiency effect
• Much of stochastic frontier analysis is directed towards the prediction of the inefficiency effects
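The pattern described on this slide can be checked with a small simulation. The sketch below is illustrative only: the parameter values, the uniform input distribution and the function name `simulate_shares` are assumptions made for the example, and ui is drawn from a half-normal distribution, which is only one of the possible choices.

```python
import math
import random


def simulate_shares(n=10_000, beta0=1.0, beta1=0.5,
                    sigma_v=0.2, sigma_u=0.3, seed=0):
    """Simulate a one-input Cobb-Douglas stochastic frontier.

    Returns the share of frontier outputs and the share of observed
    outputs that lie above the deterministic part of the frontier.
    All parameter values are hypothetical.
    """
    rng = random.Random(seed)
    frontier_above = 0
    observed_above = 0
    for _ in range(n):
        x = rng.uniform(1.0, 10.0)            # input level
        v = rng.gauss(0.0, sigma_v)           # symmetric noise
        u = abs(rng.gauss(0.0, sigma_u))      # non-negative inefficiency
        deterministic = math.exp(beta0 + beta1 * math.log(x))
        q_star = deterministic * math.exp(v)  # stochastic frontier output
        q = q_star * math.exp(-u)             # observed output
        frontier_above += q_star > deterministic
        observed_above += q > deterministic
    return frontier_above / n, observed_above / n


share_frontier, share_observed = simulate_shares()
print(f"frontier outputs above deterministic part: {share_frontier:.3f}")
print(f"observed outputs above deterministic part: {share_observed:.3f}")
```

With these assumed values, roughly half of the frontier outputs lie above the deterministic part, while only a minority of observed outputs do, namely those for which vi exceeds ui.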
The stochastic production frontier
• The most common output-oriented measure of technical efficiency is the ratio of observed output to the corresponding stochastic frontier output:
$$TE_i = \frac{q_i}{\exp(\mathbf{x}_i'\boldsymbol{\beta} + v_i)} = \frac{\exp(\mathbf{x}_i'\boldsymbol{\beta} + v_i - u_i)}{\exp(\mathbf{x}_i'\boldsymbol{\beta} + v_i)} = \exp(-u_i)$$
• This measure of technical efficiency (TE) takes a value between zero and one
The stochastic production frontier
• TE measures the output of the i-th firm relative to the output that could be produced by a fully efficient firm using the same input vector
• The first step in predicting the technical efficiency, TEi, is to estimate the parameters of the stochastic production frontier model
• Because TEi is a random variable, and not a parameter, we use the term ‘predict’ instead of ‘estimate’
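As a quick numerical illustration of the TE ratio, the sketch below uses made-up values for the deterministic part, the noise term and the inefficiency term. In practice vi and ui are not observed, which is why TEi has to be predicted; the function name `technical_efficiency` is hypothetical.

```python
import math


def technical_efficiency(q, xb, v):
    """Ratio of observed output to the stochastic frontier output exp(xb + v)."""
    return q / math.exp(xb + v)


# Hypothetical firm: deterministic part 2.0, noise 0.1, inefficiency 0.25
xb, v, u = 2.0, 0.1, 0.25
q = math.exp(xb + v - u)             # observed output implied by the model
te = technical_efficiency(q, xb, v)  # equals exp(-u)
print(round(te, 4))
```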
Estimating the parameters
• It is common to assume that each vi is distributed independently of each ui, and that both errors are uncorrelated with the explanatory variables in xi
• In addition, we assume:
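$$\begin{aligned}
E(v_i) &= 0 && \text{(zero mean)} \\
E(v_i^2) &= \sigma_v^2 && \text{(homoskedastic)} \\
E(v_i v_j) &= 0 \text{ for all } i \neq j && \text{(uncorrelated)} \\
E(u_i^2) &= \text{constant} && \text{(homoskedastic)} \\
E(u_i u_j) &= 0 \text{ for all } i \neq j && \text{(uncorrelated)}
\end{aligned}$$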
Estimating the parameters
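• Under these assumptions, OLS gives consistent estimators of the slope coefficients, but the intercept estimator is biased downwards because E(vi – ui) = –E(ui) < 0
• Consequently, we cannot use the OLS estimates to compute measures of technical efficiency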
• One solution to this problem is to correct for the bias in the intercept term using an estimator known as the corrected ordinary least squares (COLS) estimator
• A better solution is to make some distributional assumptions concerning the two error terms and estimate the model using the method of maximum likelihood (ML)
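To make the idea of correcting the intercept concrete, the sketch below implements one common variant of COLS for a single input: fit OLS, then shift the intercept up by the largest residual so that no observation lies above the fitted frontier. This is an assumption about which variant is meant (it treats all deviations as inefficiency and may differ from the version in CROB), and the data and function names are made up.

```python
import math


def ols_simple(x, y):
    """OLS intercept and slope for a single regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cols(ln_x, ln_q):
    """Shift the OLS intercept by the largest residual (assumed COLS variant)."""
    b0, b1 = ols_simple(ln_x, ln_q)
    resid = [y - (b0 + b1 * x) for x, y in zip(ln_x, ln_q)]
    shift = max(resid)
    # Implied efficiency scores: exp(-(shift - residual)), at most one
    te = [math.exp(e - shift) for e in resid]
    return b0 + shift, b1, te


ln_x = [0.0, 1.0, 2.0, 3.0]   # made-up log inputs
ln_q = [1.0, 1.4, 2.1, 2.4]   # made-up log outputs
intercept, slope, te = cols(ln_x, ln_q)
print(intercept, slope, [round(t, 3) for t in te])
```

The slope is unchanged from OLS; only the intercept moves, and the firm with the largest residual receives an efficiency score of one.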
Estimating the parameters: the half-normal distribution
• We assume the vi are independently and identically distributed normal random variables with zero means and variances $\sigma_v^2$
• We also assume the ui are independently and identically distributed half-normal random variables with scale parameter $\sigma_u^2$
• That is, the pdf of each ui is a truncated version of the pdf of a normal random variable having zero mean and variance $\sigma_u^2$
Estimating the parameters: the half-normal distribution
• We parameterise the log-likelihood function for this so-called half-normal model in terms of:
$$\sigma^2 = \sigma_v^2 + \sigma_u^2$$
$$\lambda^2 = \sigma_u^2 / \sigma_v^2 \ge 0$$
• If λ = 0, there are no technical inefficiency effects and all deviations from the frontier are due to noise
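When reading estimation output it can help to map the two parameters back to the underlying variances. The expressions below follow by simple algebra from the definitions of σ² and λ² on this slide; the numerical values in the last line are made up for illustration.

```latex
\sigma_u^2 = \frac{\lambda^2}{1+\lambda^2}\,\sigma^2 ,
\qquad
\sigma_v^2 = \frac{1}{1+\lambda^2}\,\sigma^2 .
% Example (hypothetical values): \sigma^2 = 0.25,\ \lambda = 1.5
% \Rightarrow \sigma_u^2 \approx 0.173,\ \sigma_v^2 \approx 0.077,
% which sum to 0.25 as required.
```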
Estimating the parameters: the log-likelihood function
• The log-likelihood function is:
$$\ln L(\mathbf{y} \mid \boldsymbol{\beta}, \sigma, \lambda) = -\frac{I}{2}\ln\left(\frac{\pi\sigma^2}{2}\right) + \sum_{i=1}^{I}\ln\Phi\left(-\frac{\varepsilon_i\lambda}{\sigma}\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{I}\varepsilon_i^2$$
where y is a vector of log-outputs; εi = vi - ui is a composite error term; and Φ(x) is the cdf of the standard normal random variable evaluated at x
• The likelihood function is maximised using an iterative optimisation procedure
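The half-normal log-likelihood can be written as a short function. The sketch below is a minimal version that takes the composite errors εi = ln qi – xiβ as given; the residual values and function names are made up, and no optimiser is included. As a consistency check, it also evaluates the density of the composed error, (2/σ) φ(ε/σ) Φ(–ελ/σ), whose logs should sum to the same value.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def half_normal_loglik(residuals, sigma, lam):
    """Log-likelihood of the normal/half-normal stochastic frontier model.

    residuals: composite errors eps_i = ln q_i - x_i'beta
    sigma, lam: positive square roots of sigma^2 and lambda^2
    """
    n = len(residuals)
    ll = -0.5 * n * math.log(math.pi * sigma ** 2 / 2.0)
    ll += sum(math.log(STD_NORMAL.cdf(-e * lam / sigma)) for e in residuals)
    ll -= sum(e ** 2 for e in residuals) / (2.0 * sigma ** 2)
    return ll


def composed_error_logpdf(e, sigma, lam):
    """Log density of eps = v - u for a single observation."""
    return (math.log(2.0 / sigma)
            + math.log(STD_NORMAL.pdf(e / sigma))
            + math.log(STD_NORMAL.cdf(-e * lam / sigma)))


eps = [-0.30, 0.05, -0.12, -0.45, 0.10]  # made-up residuals
sigma, lam = 0.4, 1.5                    # made-up parameter values
ll = half_normal_loglik(eps, sigma, lam)
print(round(ll, 4))
```

In an actual estimation, a numerical optimiser would search over β, σ and λ for the values that maximise this function.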
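Estimating the parameters: ML
• The ML estimation of the half-normal stochastic frontier model is illustrated in CROB, page 248, by presenting annotated SHAZAM output from the estimation of a translog production frontier, where:
$$\mathbf{x}_i = \begin{pmatrix} 1 \\ t \\ \ln x_{1i} \\ \ln x_{2i} \\ \ln x_{3i} \\ 0.5(\ln x_{1i})^2 \\ \ln x_{1i}\ln x_{2i} \\ \ln x_{1i}\ln x_{3i} \\ 0.5(\ln x_{2i})^2 \\ \ln x_{2i}\ln x_{3i} \\ 0.5(\ln x_{3i})^2 \end{pmatrix}, \qquad \boldsymbol{\beta} = \begin{pmatrix} \beta_0 \\ \theta \\ \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_{11} \\ \beta_{12} \\ \beta_{13} \\ \beta_{22} \\ \beta_{23} \\ \beta_{33} \end{pmatrix}$$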
Estimating the parameters: ML
• It is easier to use purpose-built software packages such as FRONTIER and LIMDEP
• The FRONTIER instruction and data files used for estimating the half-normal model are presented in CROB, Tables 9.2 and 9.3
• The instruction file should be self-explanatory (see the comments on the right-hand side of the file on the next slide)
• The frontier output file is presented in CROB, Table 9.4
Estimating the parameters: FRONTIER instruction file
1              1=ERROR COMPONENTS MODEL, 2=TE EFFECTS MODEL
chap9.txt      DATA FILE NAME
chap9_2.out    OUTPUT FILE NAME
1              1=PRODUCTION FUNCTION, 2=COST FUNCTION
y              LOGGED DEPENDENT VARIABLE (Y/N)
344            NUMBER OF CROSS-SECTIONS
1              NUMBER OF TIME PERIODS
344            NUMBER OF OBSERVATIONS IN TOTAL
10             NUMBER OF REGRESSOR VARIABLES (Xs)
n              MU (Y/N) [OR DELTA0 (Y/N) IF USING TE EFFECTS MODEL]
n              ETA (Y/N) [OR NUMBER OF TE EFFECTS REGRESSORS (Zs)]
n              STARTING VALUES (Y/N)
Estimating the parameters: FRONTIER data file
1.000000 1.000000 0.1850809 1.000000 0.1538426 0.3961101 0.9214233E-01 0.1183377E-01 0.6093860E-01 0.1417541E-01 0.7845160E-01 0.3649850E-01 0.4245104E-02
2.000000 1.000000 0.4590094 1.000000 0.5725529 0.5296415 0.4723926 0.1639084 0.3032478 0.2704698 0.1402600 0.2501987 0.1115774
3.000000 1.000000 0.4226059 1.000000 0.4613273 0.4505042 0.2864401 0.1064114 0.2078299 0.1321426 0.1014770 0.1290424 0.4102396E-01
4.000000 1.000000 -0.3031307 1.000000 -0.4259759 -0.4657866 -0.7656522 0.9072774E-01 0.1984139 0.3261494 0.1084786 0.3566305 0.2931116
Each row begins with the firm number, the year and the log output, followed by the 10 regressors. For example, the third row refers to Firm 3 in Year 1, and 0.4226059 is the log output of firm 3 in year 1.
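The 10 regressors in each data row appear to follow the ordering of the translog terms. The sketch below rebuilds them for the first row from its time trend and three log inputs, and compares the result with the values in the file. It assumes that the fourth column is the time trend t and that columns five to seven are the logged inputs; the function name `translog_regressors` is hypothetical.

```python
def translog_regressors(t, ln_x1, ln_x2, ln_x3):
    """Time trend, first-order terms and second-order translog terms."""
    return [
        t, ln_x1, ln_x2, ln_x3,
        0.5 * ln_x1 ** 2, ln_x1 * ln_x2, ln_x1 * ln_x3,
        0.5 * ln_x2 ** 2, ln_x2 * ln_x3,
        0.5 * ln_x3 ** 2,
    ]


# First row of the data file: firm, year, log output, then 10 regressors
row1 = [1.0, 1.0, 0.1850809, 1.0, 0.1538426, 0.3961101, 0.9214233e-01,
        0.1183377e-01, 0.6093860e-01, 0.1417541e-01, 0.7845160e-01,
        0.3649850e-01, 0.4245104e-02]

rebuilt = translog_regressors(*row1[3:7])
max_gap = max(abs(a - b) for a, b in zip(rebuilt, row1[3:]))
print(max_gap)  # small, up to rounding in the printed data
```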
Estimating the parameters: alternative distributional specifications
• Alternative specifications of ui to the half-normal distribution include:
o Truncated normal
o Exponential with mean λ
o Gamma with mean λ and degrees of freedom m
• Theoretical considerations and computational complexity may influence the choice
• Nevertheless, estimated elasticities and technological change effects are fairly robust to this change in the distributional assumption
Estimating the parameters: alternative distributional specifications
• Different distributional assumptions may give rise to different predictions of technical efficiency
• But when we rank firms on the basis of predicted technical efficiencies, the rankings are often quite robust to distributional choice
• In such cases, the principle of parsimony favours the simpler half-normal and exponential models
Predicting technical efficiency: firms
• The technical efficiency of the i-th firm is defined by TEi = exp(–ui)
• This result provides a basis for the prediction of both firm and industry technical efficiency
• Firm technical efficiency refers to the individual TE scores of firms within an industry
• Industry efficiency can be viewed as the average of the TEs of all the firms in the industry
Predicting technical efficiency: firms
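• We can summarise information about ui in the form of the truncated normal pdf as:
$$p(u_i \mid q_i) = \frac{1}{\sqrt{2\pi\sigma_*^2}} \exp\left\{-\frac{(u_i - u_i^*)^2}{2\sigma_*^2}\right\} \Big/ \Phi\left(\frac{u_i^*}{\sigma_*}\right)$$
where:
$$u_i^* = -(\ln q_i - \mathbf{x}_i'\boldsymbol{\beta})\,\sigma_u^2 / \sigma^2$$
and:
$$\sigma_*^2 = \sigma_v^2\sigma_u^2 / \sigma^2 .$$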
Predicting technical efficiency: firms
• This conditional pdf gives information about likely and unlikely values of ui after firm i has been selected in our sample and after we have observed its output, qi
• In most situations, we are interested in the efficiency of the i-th firm, TEi = exp(–ui)
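• We use p(ui | qi) to derive the predictor that minimises the mean square prediction error:
$$\widehat{TE}_i = E\{\exp(-u_i) \mid q_i\} = \left[\Phi\left(\frac{u_i^*}{\sigma_*} - \sigma_*\right) \Big/ \Phi\left(\frac{u_i^*}{\sigma_*}\right)\right] \exp\left\{\frac{\sigma_*^2}{2} - u_i^*\right\} .$$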
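The firm-level predictor is easy to compute once the parameters have been estimated. The sketch below is a minimal version that takes a residual εi = ln qi – xiβ together with values of σu and σv; the residuals, parameter values and the function name `predict_te` are made up for illustration.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf


def predict_te(eps, sigma_u, sigma_v):
    """Predict TE_i = E[exp(-u_i) | q_i] in the half-normal model.

    eps is the composite residual ln q_i - x_i'beta.
    """
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    u_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = math.sqrt(sigma_u ** 2 * sigma_v ** 2 / sigma2)
    ratio = PHI(u_star / sigma_star - sigma_star) / PHI(u_star / sigma_star)
    return ratio * math.exp(sigma_star ** 2 / 2.0 - u_star)


# Hypothetical residuals: a firm well below the frontier and one above it
te_low = predict_te(-0.40, sigma_u=0.3, sigma_v=0.2)
te_high = predict_te(0.10, sigma_u=0.3, sigma_v=0.2)
print(round(te_low, 3), round(te_high, 3))
```

A more negative residual gives a larger ui* and therefore a lower predicted efficiency, while a positive residual still yields a prediction below one because some inefficiency remains likely.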
Predicting technical efficiency: industry
• A natural predictor of industry efficiency is the average of the predicted efficiencies of the firms in the sample:
$$\overline{TE} = \frac{1}{I}\sum_{i=1}^{I}\widehat{TE}_i$$
• Industry efficiency can also be viewed as the expected value of the efficiency of the i-th firm before any firms have been selected in the sample
Predicting technical efficiency: industry
• Before we have collected the sample, our knowledge of ui can be summarised in the form of the half-normal pdf:
$$p(u_i) = \frac{2}{\sqrt{2\pi\sigma_u^2}} \exp\left\{-\frac{u_i^2}{2\sigma_u^2}\right\} .$$
• We can use this unconditional pdf to derive results similar to the above firm-specific results
• An optimal estimator of industry efficiency is:
$$\widehat{TE} = E\{\exp(-u_i)\} = 2\Phi(-\sigma_u)\exp\left\{\frac{\sigma_u^2}{2}\right\} .$$
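Both views of industry efficiency can be computed in a few lines. The sketch below evaluates the expected value of exp(–ui) under the half-normal assumption and, separately, the sample average of firm-level predictions; the value of σu, the three predictions and the function names are made up for illustration.

```python
import math
from statistics import NormalDist


def industry_efficiency(sigma_u):
    """E[exp(-u_i)] when u_i is half-normal with scale parameter sigma_u^2."""
    return 2.0 * NormalDist().cdf(-sigma_u) * math.exp(sigma_u ** 2 / 2.0)


def average_efficiency(te_predictions):
    """Sample average of the firm-level efficiency predictions."""
    return sum(te_predictions) / len(te_predictions)


te_industry = industry_efficiency(0.3)               # hypothetical sigma_u
te_average = average_efficiency([0.91, 0.78, 0.85])  # hypothetical predictions
print(round(te_industry, 3), round(te_average, 3))
```

When σu is zero the first function returns one, which corresponds to the case of no technical inefficiency.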
Hypothesis testing
• The t- and F-tests are no longer justified in small samples because the composed error in the stochastic frontier model is not normally distributed
• In addition to testing hypotheses concerning β, stochastic frontier researchers are often interested in testing for the absence of inefficiency effects
• Stochastic frontier researchers normally use the Wald and likelihood ratio (LR) tests
Hypothesis testing
• The one-sided nature of the alternative hypothesis implies these tests are difficult to interpret
• Moreover, they do not have the asymptotic chi-square distributions
• We will use the LR test statistic in this part of the unit
• This statistic is asymptotically distributed as a mixture of chi-square distributions
Hypothesis testing
• In the case of the truncated-normal model, the null hypothesis should be rejected at the 5 per cent level of significance if the LR test statistic exceeds 5.138
• This value is taken from Table 1 in Kodde and Palm (1986) and is smaller than the 5 per cent critical value, $\chi^2_{0.95}(2) = 5.99$
• Table 9.9 in CROB presents FRONTIER output from the estimation of a truncated-normal model
Hypothesis testing
• From the results reported in this table, we compute:
$$LR = -2[-88.8451 + 71.6403] = 34.41$$
• This value, which is also reported in Table 9.9, exceeds 5.138 so we reject the null hypothesis
• We can also use estimates from the truncated-normal model to test the null hypothesis that the simpler half-normal model is adequate
• The relevant null and alternative hypotheses are H0: μ = 0 and H1: μ ≠ 0
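The LR test for the absence of inefficiency effects in the truncated-normal model can be reproduced from the two log-likelihood values quoted on this slide. In the sketch below, the function names are hypothetical, and the tail probability assumes that the mixture places equal weights on chi-square distributions with one and two degrees of freedom. That weighting is an assumption of this example rather than something stated in the slides, although it does reproduce the 5.138 critical value.

```python
import math
from statistics import NormalDist


def lr_statistic(loglik_restricted, loglik_unrestricted):
    """Likelihood ratio statistic: -2[ln L(H0) - ln L(H1)]."""
    return -2.0 * (loglik_restricted - loglik_unrestricted)


def mixed_chi2_tail(lr):
    """Tail probability under 0.5*chi2(1) + 0.5*chi2(2) (assumed weights)."""
    tail_df1 = 2.0 * (1.0 - NormalDist().cdf(math.sqrt(lr)))  # P(chi2(1) > lr)
    tail_df2 = math.exp(-lr / 2.0)                            # P(chi2(2) > lr)
    return 0.5 * tail_df1 + 0.5 * tail_df2


lr = lr_statistic(-88.8451, -71.6403)
reject = lr > 5.138                          # Kodde and Palm (1986) value
size_at_critical = mixed_chi2_tail(5.138)    # close to 0.05 under the assumption
print(round(lr, 2), reject, round(size_at_critical, 3))
```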
Conclusion
• Unfortunately, the simple production frontier model does not permit the prediction of the technical efficiencies of firms that produce multiple outputs
• Moreover, the ML method does not allow us to assess the reliability of our inferences in small samples
• These are two of the issues to be addressed in Part 2 of Topic 4, together with how the parameters of multiple-output technologies can be estimated using distance and cost functions