
ECON 377/477

Topic 4.1

Stochastic Frontier Analysis: Part 1

Outline

• Introduction
• The stochastic production frontier
• Estimating the parameters
• Predicting technical efficiency
• Hypothesis testing
• Conclusion


Introduction

• Assume cross-sectional data on I firms

• A simple method to estimate a production frontier using such data is to envelop the data points using an arbitrarily chosen function

• Consider a Cobb-Douglas production frontier:

ln qi = xiβ – ui i = 1, …, I

where qi is the output of the i-th firm, xi is a K×1 vector containing the logarithms of inputs; β is a vector of unknown parameters; and ui is a non-negative random variable associated with technical inefficiency


Introduction

• This production frontier is deterministic insofar as qi is bounded from above by the non-stochastic (deterministic) quantity exp(xiβ)

• A problem with frontiers of this type (and with the DEA frontier studied in Topic 3) is that no account is taken of measurement errors and other sources of statistical noise

• All deviations from the frontier are assumed to be the result of technical inefficiency


Introduction

• A solution to the problem is to introduce another random variable representing statistical noise

• The resulting frontier is known as a stochastic production frontier, the estimation of which is the focus of this topic

• We begin by describing the basic stochastic production frontier model, where (the logarithm of) output is specified as a function of:
o a non-negative random error, which represents technical inefficiency
o a symmetric random error accounting for noise


The stochastic production frontier

• The stochastic frontier production function model is of the form:

ln qi = xiβ + vi – ui

where vi is a symmetric random error to account for statistical noise

• The model is called a stochastic frontier production function because the output values are bounded from above by the stochastic (random) variable exp(xiβ + vi)


The stochastic production frontier

• The random error vi can be positive or negative

• Therefore, the stochastic frontier outputs vary about the deterministic part of the model, exp(xiβ)

• These features of the stochastic frontier model can be illustrated graphically

• It is convenient to restrict attention to firms that produce the output qi using only one input, xi



• In this case, a Cobb-Douglas stochastic frontier model takes the form:

ln qi = β0 + β1 ln xi + vi – ui

• Alternatively,

qi = exp(β0 + β1 ln xi) × exp(vi) × exp(– ui)

where exp(β0 + β1 ln xi) is the deterministic component, exp(vi) is the noise and exp(– ui) is the inefficiency
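The multiplicative decomposition can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name and all parameter values (β0 = 0.5, β1 = 0.6, x = 10, v = 0.1, u = 0.3) are hypothetical and chosen only for illustration.

```python
import math

def frontier_output(beta0, beta1, x, v, u):
    """Return (deterministic, frontier, observed) output for one firm."""
    if u < 0:
        raise ValueError("the inefficiency effect u must be non-negative")
    deterministic = math.exp(beta0 + beta1 * math.log(x))
    frontier = deterministic * math.exp(v)   # deterministic component times noise
    observed = frontier * math.exp(-u)       # frontier output times inefficiency
    return deterministic, frontier, observed

# Hypothetical values, for illustration only
det, front, obs = frontier_output(beta0=0.5, beta1=0.6, x=10.0, v=0.1, u=0.3)

# The log of observed output should match the additive form
# ln q = beta0 + beta1 ln x + v - u
log_form = 0.5 + 0.6 * math.log(10.0) + 0.1 - 0.3
```

With a positive noise draw, the frontier output lies above the deterministic component, while the observed output is pulled back down by the inefficiency term.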


• Such a frontier is depicted on the next slide where we plot the inputs and outputs of two firms, A and B

• The deterministic component of the frontier model has been drawn to reflect the existence of diminishing returns to scale

• Values of the input are measured along the horizontal axis and outputs are measured on the vertical axis

• Firm A uses the input level xA to produce the output qA, while Firm B uses the input level xB to produce the output qB


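[Figure: stochastic production frontier for Firms A and B. Horizontal axis: input xi, with xA and xB marked. Vertical axis: output qi, with the observed outputs qA and qB and the frontier outputs qA* and qB* marked. The curve is the deterministic frontier qi = exp(β0 + β1 ln xi). The frontier outputs are qA* ≡ exp(β0 + β1 ln xA + vA) and qB* ≡ exp(β0 + β1 ln xB + vB), that is, the outputs with no inefficiency effects (uA = uB = 0). The noise effect and the inefficiency effect are marked for each firm.]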


• The frontier output for Firm A lies above the deterministic part of the production frontier only because the noise effect is positive (vA > 0)

• The frontier output for Firm B lies below the deterministic part of the frontier because the noise effect is negative (i.e., vB < 0)

• The observed output of Firm A lies below the deterministic part of the frontier because the sum of the noise and inefficiency effects is negative (vA – uA < 0)



• The (unobserved) frontier outputs tend to be evenly distributed above and below the deterministic part of the frontier

• But observed outputs tend to lie below the deterministic part of the frontier

• Indeed, they can only lie above the deterministic part of the frontier when the noise effect is positive and larger than the inefficiency effect

• Much of stochastic frontier analysis is directed towards the prediction of the inefficiency effects
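A small simulation illustrates why frontier outputs scatter evenly around the deterministic frontier while observed outputs mostly fall below it. This is a sketch under stated assumptions: vi is drawn as normal noise and ui as the absolute value of a normal draw (a half-normal), and the standard deviations 0.2 and 0.3 are hypothetical.

```python
import random

def share_above_frontier(n_firms, sigma_v, sigma_u, seed=0):
    """Share of frontier outputs and of observed outputs lying above
    the deterministic part of the frontier."""
    rng = random.Random(seed)
    frontier_above = 0
    observed_above = 0
    for _ in range(n_firms):
        v = rng.gauss(0.0, sigma_v)        # symmetric noise effect
        u = abs(rng.gauss(0.0, sigma_u))   # non-negative inefficiency effect (assumed half-normal)
        if v > 0:                          # frontier output above deterministic part
            frontier_above += 1
        if v - u > 0:                      # observed output above deterministic part
            observed_above += 1
    return frontier_above / n_firms, observed_above / n_firms

# Hypothetical standard deviations, for illustration only
frontier_share, observed_share = share_above_frontier(20000, sigma_v=0.2, sigma_u=0.3)
```

The first share should be close to one half; the second is smaller because an observed output exceeds the deterministic frontier only when the noise effect is positive and larger than the inefficiency effect.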



• The most common output-oriented measure of technical efficiency is the ratio of observed output to the corresponding stochastic frontier output:

$$TE_i = \frac{q_i}{\exp(\mathbf{x}_i\boldsymbol{\beta} + v_i)} = \frac{\exp(\mathbf{x}_i\boldsymbol{\beta} + v_i - u_i)}{\exp(\mathbf{x}_i\boldsymbol{\beta} + v_i)} = \exp(-u_i)$$

• This measure of technical efficiency (TE) takes a value between zero and one


• TE measures the output of the i-th firm relative to the output that could be produced by a fully efficient firm using the same input vector

• The first step in predicting the technical efficiency, TEi, is to estimate the parameters of the stochastic production frontier model

• Because TEi is a random variable, and not a parameter, we use the term ‘predict’ instead of ‘estimate’


Estimating the parameters

• It is common to assume that each vi is distributed independently of each ui, and that both errors are uncorrelated with the explanatory variables in xi

• In addition, we assume:

E(vi) = 0 (zero mean)

E(vi²) = σv² (homoskedastic)

E(vivj) = 0 for all i ≠ j (uncorrelated)

E(ui²) = constant (homoskedastic)

E(uiuj) = 0 for all i ≠ j (uncorrelated)

Estimating the parameters

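• OLS gives consistent estimates of the slope coefficients, but the intercept estimate is biased downwards because E(vi – ui) = –E(ui) < 0, so we cannot use the OLS estimates to compute measures of technical efficiency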

• One solution to this problem is to correct for the bias in the intercept term using an estimator known as the corrected ordinary least squares (COLS) estimator

• A better solution is to make some distributional assumptions concerning the two error terms and estimate the model using the method of maximum likelihood (ML)
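The idea behind COLS can be sketched for a single-input model. The variant below shifts the OLS intercept up by the largest residual so that no observation lies above the fitted frontier; treating this as the relevant variant is an assumption, and the data are hypothetical. Note that this sketch attributes every deviation to inefficiency, which is exactly the limitation that motivates the ML approach.

```python
import math

def ols_simple(x, y):
    """OLS intercept and slope for a single regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def cols(x, y):
    """Corrected OLS (assumed variant): raise the intercept by the largest
    OLS residual and measure efficiency as exp(residual - largest residual)."""
    b0, b1 = ols_simple(x, y)
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    shift = max(residuals)
    te = [math.exp(e - shift) for e in residuals]
    return b0 + shift, b1, te

# Hypothetical log-input and log-output data, for illustration only
ln_x = [0.0, 0.5, 1.0, 1.5, 2.0]
ln_q = [0.9, 1.1, 1.6, 1.7, 2.1]
b0_cols, b1_cols, te_scores = cols(ln_x, ln_q)
```

The slope is left unchanged; only the intercept moves, and the firm with the largest residual is assigned an efficiency score of one.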


Estimating the parameters: the half-normal distribution

• We assume the vi are independently and identically distributed normal random variables with zero means and variances σv²

• We also assume the ui are independently and identically distributed half-normal random variables with scale parameter σu²

• That is, the pdf of each ui is a truncated version of a normal random variable having zero mean and variance σu²
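A half-normal draw can be generated as the absolute value of a zero-mean normal draw. The sketch below uses a hypothetical value σu = 0.3 and compares the sample mean with the standard result E(ui) = σu√(2/π); the function name is illustrative.

```python
import math
import random

def draw_half_normal(sigma_u, n, seed=0):
    """Draw n half-normal values as |N(0, sigma_u^2)|."""
    rng = random.Random(seed)
    return [abs(rng.gauss(0.0, sigma_u)) for _ in range(n)]

sigma_u = 0.3   # hypothetical scale, for illustration only
draws = draw_half_normal(sigma_u, 50000)
sample_mean = sum(draws) / len(draws)
theoretical_mean = sigma_u * math.sqrt(2.0 / math.pi)   # mean of a half-normal
```

Because the mean of ui is strictly positive whenever σu > 0, the composite error vi – ui has a negative mean.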


Estimating the parameters: the half-normal distribution

• We parameterise the log-likelihood function for this so-called half-normal model in terms of:

σ² = σv² + σu²

λ² = σu²/σv² ≥ 0

• If λ = 0, there are no technical inefficiency effects and all deviations from the frontier are due to noise
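The mapping between (σv, σu) and (σ², λ) is one-to-one, which a short round trip makes concrete. This is a sketch with hypothetical values; the function names are illustrative and not taken from any package.

```python
import math

def to_sigma_lambda(sigma_v, sigma_u):
    """Map (sigma_v, sigma_u) to (sigma squared, lambda)."""
    sigma_sq = sigma_v ** 2 + sigma_u ** 2
    lam = sigma_u / sigma_v            # so that lambda^2 = sigma_u^2 / sigma_v^2
    return sigma_sq, lam

def from_sigma_lambda(sigma_sq, lam):
    """Recover (sigma_v, sigma_u) from (sigma squared, lambda)."""
    sigma_v_sq = sigma_sq / (1.0 + lam ** 2)
    sigma_u_sq = sigma_sq - sigma_v_sq
    return math.sqrt(sigma_v_sq), math.sqrt(sigma_u_sq)

# Hypothetical standard deviations, for illustration only
sigma_sq, lam = to_sigma_lambda(sigma_v=0.2, sigma_u=0.3)
sigma_v_back, sigma_u_back = from_sigma_lambda(sigma_sq, lam)
```

Setting λ = 0 in the second function returns σu = 0, the case with no technical inefficiency effects.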


Estimating the parameters: the log-likelihood function

• The log-likelihood function is:

$$\ln L(\mathbf{y} \mid \boldsymbol{\beta}, \sigma, \lambda) = -\frac{I}{2}\ln\left(\frac{\pi\sigma^2}{2}\right) + \sum_{i=1}^{I}\ln\Phi\left(-\frac{\varepsilon_i\lambda}{\sigma}\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{I}\varepsilon_i^2$$

where y is a vector of log-outputs; εi = vi – ui is a composite error term; and Φ(x) is the cdf of the standard normal random variable evaluated at x

• The likelihood function is maximised using an iterative optimisation procedure
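The half-normal log-likelihood is straightforward to evaluate once the composite residuals εi = ln qi – xiβ are available. The function below is a sketch, not the routine used by SHAZAM, FRONTIER or LIMDEP: it takes the residuals as given and only evaluates the function, leaving the iterative maximisation to an optimiser of the reader's choice.

```python
import math
from statistics import NormalDist

def half_normal_loglik(residuals, sigma_sq, lam):
    """Log-likelihood of the half-normal stochastic frontier model.

    residuals : composite errors eps_i = ln q_i - x_i beta
    sigma_sq  : sigma^2 = sigma_v^2 + sigma_u^2
    lam       : lambda = sigma_u / sigma_v
    """
    n = len(residuals)
    sigma = math.sqrt(sigma_sq)
    std_normal = NormalDist()            # standard normal, provides the cdf
    loglik = -0.5 * n * math.log(math.pi * sigma_sq / 2.0)
    for eps in residuals:
        # math.log fails if the cdf underflows to zero for extreme arguments
        loglik += math.log(std_normal.cdf(-eps * lam / sigma))
        loglik -= eps ** 2 / (2.0 * sigma_sq)
    return loglik
```

A useful sanity check is the case λ = 0, where every Φ term equals one half and the expression collapses to the log-likelihood of a normal error with variance σ².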

Estimating the parameters: ML

• The ML estimation of the half-normal stochastic frontier model is illustrated in CROB, page 248, by presenting annotated SHAZAM output from the estimation of a translog production frontier:

$$\mathbf{x}_i = \left(1,\ t,\ \ln x_{1i},\ \ln x_{2i},\ \ln x_{3i},\ 0.5(\ln x_{1i})^2,\ \ln x_{1i}\ln x_{2i},\ \ln x_{1i}\ln x_{3i},\ 0.5(\ln x_{2i})^2,\ \ln x_{2i}\ln x_{3i},\ 0.5(\ln x_{3i})^2\right)'$$

$$\boldsymbol{\beta} = \left(\beta_0,\ \theta,\ \beta_1,\ \beta_2,\ \beta_3,\ \beta_{11},\ \beta_{12},\ \beta_{13},\ \beta_{22},\ \beta_{23},\ \beta_{33}\right)'$$

Estimating the parameters: ML

• It is easier to use purpose-built software packages such as FRONTIER and LIMDEP

• The FRONTIER instruction and data files used for estimating the half-normal model are presented in CROB, Tables 9.2 and 9.3

• The instruction file should be self-explanatory (see the comments on the right-hand side of the file on the next slide)

• The frontier output file is presented in CROB, Table 9.4


Estimating the parameters: FRONTIER instruction file

1              1=ERROR COMPONENTS MODEL, 2=TE EFFECTS MODEL
chap9.txt      DATA FILE NAME
chap9_2.out    OUTPUT FILE NAME
1              1=PRODUCTION FUNCTION, 2=COST FUNCTION
y              LOGGED DEPENDENT VARIABLE (Y/N)
344            NUMBER OF CROSS-SECTIONS
1              NUMBER OF TIME PERIODS
344            NUMBER OF OBSERVATIONS IN TOTAL
10             NUMBER OF REGRESSOR VARIABLES (Xs)
n              MU (Y/N) [OR DELTA0 (Y/N) IF USING TE EFFECTS MODEL]
n              ETA (Y/N) [OR NUMBER OF TE EFFECTS REGRESSORS (Zs)]
n              STARTING VALUES (Y/N)

Estimating the parameters: FRONTIER data file

1.000000 1.000000 0.1850809 1.000000 0.1538426 0.3961101 0.9214233E-01 0.1183377E-01 0.6093860E-01 0.1417541E-01 0.7845160E-01 0.3649850E-01 0.4245104E-02
2.000000 1.000000 0.4590094 1.000000 0.5725529 0.5296415 0.4723926 0.1639084 0.3032478 0.2704698 0.1402600 0.2501987 0.1115774
3.000000 1.000000 0.4226059 1.000000 0.4613273 0.4505042 0.2864401 0.1064114 0.2078299 0.1321426 0.1014770 0.1290424 0.4102396E-01
4.000000 1.000000 -0.3031307 1.000000 -0.4259759 -0.4657866 -0.7656522 0.9072774E-01 0.1984139 0.3261494 0.1084786 0.3566305 0.2931116

In the third row, the first column (3.000000) identifies Firm 3, the second column (1.000000) identifies Year 1, and the third column (0.4226059) is the log output of firm 3 in year 1
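The last six columns of each data row appear to be the second-order translog terms built from the three log inputs. The sketch below assumes that columns five to seven of the first row hold ln x1, ln x2 and ln x3 (an interpretation of the file layout, not something the slides state) and rebuilds the remaining columns from them; the function name is illustrative.

```python
def translog_terms(ln_x):
    """Return the log inputs followed by the second-order translog terms,
    ordered as 0.5(ln x1)^2, ln x1 ln x2, ln x1 ln x3, 0.5(ln x2)^2, ..."""
    terms = list(ln_x)
    k = len(ln_x)
    for n in range(k):
        for m in range(n, k):
            if n == m:
                terms.append(0.5 * ln_x[n] ** 2)      # own second-order term
            else:
                terms.append(ln_x[n] * ln_x[m])       # cross-product term
    return terms

# Columns five to seven of the first data row (assumed to be the log inputs)
first_row_logs = [0.1538426, 0.3961101, 0.9214233e-01]
first_row_terms = translog_terms(first_row_logs)
```

With three inputs this yields nine terms; together with the fourth column this gives the 10 regressor variables declared in the instruction file.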

Estimating the parameters: alternative distributional specifications

• Alternative specifications of ui to the half-normal distribution include:
o Truncated normal
o Exponential with mean λ
o Gamma with mean λ and degrees of freedom m

• Theoretical considerations and computational complexity may influence the choice

• Nevertheless, estimated elasticities and technological change effects are fairly robust to this change in the distributional assumption


Estimating the parameters: alternative distributional specifications

• Different distributional assumptions may give rise to different predictions of technical efficiency

• But when we rank firms on the basis of predicted technical efficiencies, the rankings are often quite robust to distributional choice

• In such cases, the principle of parsimony favours the simpler half-normal and exponential models


Predicting technical efficiency: firms

• The technical efficiency of the i-th firm is defined by TEi = exp(–ui)

• This result provides a basis for the prediction of both firm and industry technical efficiency

• Firm technical efficiency refers to the individual TE scores of firms within an industry

• Industry efficiency can be viewed as the average of the TEs of all the firms in the industry



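• We can summarise information about ui in the form of the truncated normal pdf as:

$$p(u_i \mid q_i) = \frac{1}{\sqrt{2\pi\sigma_*^2}}\exp\left\{-\frac{(u_i - u_i^*)^2}{2\sigma_*^2}\right\} \Big/ \Phi\left(\frac{u_i^*}{\sigma_*}\right)$$

where:

$$u_i^* = -(\ln q_i - \mathbf{x}_i\boldsymbol{\beta})\,\sigma_u^2/\sigma^2$$

and:

$$\sigma_*^2 = \sigma_v^2\sigma_u^2/\sigma^2.$$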


• This conditional pdf gives information about likely and unlikely values of ui after firm i has been selected in our sample and after we have observed its output, qi

• In most situations, we are interested in the efficiency of the i-th firm, TEi = exp(–ui)

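• We use p(ui | qi) to derive the predictor that minimises the mean square prediction error:

$$\widehat{TE}_i = E[\exp(-u_i) \mid q_i] = \left[\Phi\left(\frac{u_i^*}{\sigma_*} - \sigma_*\right) \Big/ \Phi\left(\frac{u_i^*}{\sigma_*}\right)\right]\exp\left\{\frac{\sigma_*^2}{2} - u_i^*\right\}.$$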
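The firm-level predictor E[exp(–ui) | qi] can be evaluated directly from a residual and the two standard deviations. The function below is a sketch under the half-normal assumptions of this topic; its name and arguments are illustrative, and in practice the residual and variance parameters would be replaced by their ML estimates.

```python
import math
from statistics import NormalDist

def predict_te(eps, sigma_u, sigma_v):
    """Predict technical efficiency E[exp(-u_i) | q_i] for one firm.

    eps : composite residual ln q_i - x_i beta
    """
    sigma_sq = sigma_u ** 2 + sigma_v ** 2
    u_star = -eps * sigma_u ** 2 / sigma_sq                       # u_i*
    sigma_star = math.sqrt(sigma_v ** 2 * sigma_u ** 2 / sigma_sq)  # sigma_*
    cdf = NormalDist().cdf                                        # standard normal cdf
    ratio = cdf(u_star / sigma_star - sigma_star) / cdf(u_star / sigma_star)
    return ratio * math.exp(sigma_star ** 2 / 2.0 - u_star)
```

For example, `predict_te(-0.2, 0.3, 0.2)` gives the predicted efficiency of a firm whose log output lies 0.2 below the estimated deterministic frontier, using hypothetical values of σu and σv. Firms with more negative residuals receive lower predicted efficiencies.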

Predicting technical efficiency: industry

• A natural predictor of industry efficiency is the average of the predicted efficiencies of the firms in the sample:

$$\overline{TE} = \frac{1}{I}\sum_{i=1}^{I}\widehat{TE}_i$$

• Industry efficiency can also be viewed as the expected value of the efficiency of the i-th firm before any firms have been selected in the sample

Predicting technical efficiency: industry

• Before we have collected the sample, our knowledge of ui can be summarised in the form of the half-normal pdf:

$$p(u_i) = \frac{2}{\sqrt{2\pi\sigma_u^2}}\exp\left\{-\frac{u_i^2}{2\sigma_u^2}\right\}.$$

• We can use this unconditional pdf to derive results similar to the above firm-specific results

• An optimal estimator of industry efficiency is:

$$\widehat{TE} = E[\exp(-u_i)] = 2\,\Phi(-\sigma_u)\exp\left\{\frac{\sigma_u^2}{2}\right\}.$$
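The closed-form industry efficiency can be compared with a direct simulation of E[exp(–ui)] under the half-normal assumption. This is a sketch with a hypothetical value σu = 0.3; the function names are illustrative.

```python
import math
import random
from statistics import NormalDist

def industry_efficiency(sigma_u):
    """Closed form for E[exp(-u)] when u is half-normal with scale sigma_u."""
    return 2.0 * NormalDist().cdf(-sigma_u) * math.exp(sigma_u ** 2 / 2.0)

def simulated_industry_efficiency(sigma_u, n, seed=0):
    """Monte Carlo average of exp(-u) over n half-normal draws."""
    rng = random.Random(seed)
    return sum(math.exp(-abs(rng.gauss(0.0, sigma_u))) for _ in range(n)) / n

sigma_u = 0.3   # hypothetical scale, for illustration only
closed_form = industry_efficiency(sigma_u)
simulated = simulated_industry_efficiency(sigma_u, 50000)
```

The two values should agree up to simulation error, and the closed form equals one when σu = 0, the case with no inefficiency.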

Hypothesis testing

• The t- and F-tests are no longer justified in small samples because the composed error in the stochastic frontier model is not normally distributed

• In addition to testing hypotheses concerning β, stochastic frontier researchers are often interested in testing for the absence of inefficiency effects

• Stochastic frontier researchers normally use the Wald and likelihood ratio (LR) tests


Hypothesis testing

• The one-sided nature of the alternative hypothesis implies these tests are difficult to interpret

• Moreover, they do not have the usual asymptotic chi-square distributions

• We will use the LR test statistic in this part of the unit

• This statistic is asymptotically distributed as a mixture of chi-square distributions


Hypothesis testing

• In the case of the truncated-normal model, the null hypothesis should be rejected at the 5 per cent level of significance if the LR test statistic exceeds 5.138

• This value is taken from Table 1 in Kodde and Palm (1986) and is smaller than the usual 5 per cent critical value, $\chi^2_{0.95}(2) = 5.99$

• Table 9.9 in CROB presents FRONTIER output from the estimation of a truncated-normal model

Hypothesis testing

• From the results reported in this table, we compute:

LR = –2[–88.8451 + 71.6403] = 34.41

• This value, which is also reported in Table 9.9, exceeds 5.138 so we reject the null hypothesis

• We can also use estimates from the truncated-normal model to test the null hypothesis that the simpler half-normal model is adequate

• The relevant null and alternative hypotheses are H0: μ = 0 and H1: μ ≠ 0
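The LR calculation can be reproduced in a few lines. The sketch below uses the two log-likelihood values and the 5.138 critical value quoted in this topic; the function and variable names are illustrative.

```python
def likelihood_ratio(loglik_restricted, loglik_unrestricted):
    """LR statistic: -2 [ln L(restricted) - ln L(unrestricted)]."""
    return -2.0 * (loglik_restricted - loglik_unrestricted)

# Log-likelihood values quoted from the FRONTIER output in CROB, Table 9.9
lr = likelihood_ratio(-88.8451, -71.6403)

critical_value = 5.138          # 5 per cent value from Kodde and Palm (1986), as quoted above
reject_null = lr > critical_value
```

Here the statistic is far above the critical value, so the null hypothesis of no inefficiency effects is rejected.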


Conclusion

• Unfortunately, the simple production frontier model does not permit the prediction of the technical efficiencies of firms that produce multiple outputs

• Moreover, the ML method does not allow us to assess the reliability of our inferences in small samples

• These are two of the issues to be addressed in Part 2 of Topic 4, together with how the parameters of multiple-output technologies can be estimated using distance and cost functions

