center for innovation, research and competence in the learning economy

CIRCLE, Lund University, Sweden CENTER FOR INNOVATION, RESEARCH AND COMPETENCE IN THE LEARNING ECONOMY Longitudinal Data Analysis – methods and applications in Innovation Studies Martin Andersson CIRCLE, Lund university


Post on 23-Feb-2016


Page 1: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

CIRCLE, Lund University, Sweden

CENTER FOR INNOVATION, RESEARCH AND COMPETENCE IN THE LEARNING ECONOMY

Longitudinal Data Analysis – methods and applications in Innovation Studies

Martin Andersson

CIRCLE, Lund university

Page 2: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

OUTLINE

• Part I: WHY? - identification problems in Innovation Studies and social sciences more broadly

• Part II: WHAT? - introducing panel data analysis

• Part III: HOW? - lab session on panel data

Page 3: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Part I

Identification problems in Innovation Studies and social sciences more broadly

Page 4: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• Main goal in regression analysis is often to learn about causal relationships from micro-data capturing non-experimental economic behavior.

– Studies ask "treatment effect" questions of the form: what is the effect of X on Y?

• What is the effect of R&D investment on a firm’s productivity?

• Does the local milieu of a firm affect its innovativeness?

• What is the effect of general purpose technologies on growth?

• What is the effect of a local university on new firm formation?

• Does entrepreneurship influence regional economic growth?

Page 5: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• Does the estimated β parameter reflect a causal effect of X on Y?

$$Y_i = \alpha + \beta X_i + \varepsilon_i$$

• How to “isolate” the effect of X on Y?

Page 6: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• Identification is closely linked to consistency:

– $\beta^{true}$ is the “true” parameter

– $\hat{\beta}$ is our estimate

– Ideally, we choose a model (right technique, right variables and right assumptions) such that $\hat{\beta}$ is consistent, i.e. it converges to $\beta^{true}$ when N is large

– This is essentially what identification is all about

Page 7: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• Manski (2003): the selection problem

“The researcher wants to compare the outcomes that people would experience if they were to receive alternative treatments. However, treatments are mutually exclusive. At most, the researcher can observe the outcome that each person experiences under the treatment that this person actually receives. The researcher cannot observe the outcomes that people would have experienced under other treatments. These other outcomes are counterfactual. Hence, data on treatments and outcomes cannot by themselves reveal treatment effects.”

• IDEAL: “treated” individuals selected randomly

• When is random treatment selection appropriate?

– in the analysis of data from classical randomized experiments. This is the main reason why randomized experiments are valued so highly.

The assumption of random treatment selection is usually suspect in non-experimental settings, where observed treatments may be self-selected or otherwise chosen purposefully.

Page 8: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• Example 1.1 in Jurada (2007):

– Suppose that you are interested in the effect of military service on subsequent earnings. You can look at the mean difference in the outcome between veterans and non-vets.

• … but this number contains not only the causal effect of the service, but also the composition of other causal variables in each group, both observed and unobserved.

• Are there variables that affect both participation in the program and the outcome? Are the vets earning more because of the military service or are the high-earners more likely to enroll in the army?

Page 9: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• One solution is to control for factors that may drive selection.

– Typical procedure in most empirical papers: we are interested in x, but to isolate its effect we control for z.

– When is controlling for observable factors enough to identify a causal effect? => When is “selection on observables” plausible?

• When is it plausible that, conditional on Z, assignment to treatment is “ideal”, i.e. as good as random?

– Example: applicants to a college are screened based on Z but, conditional on passing the Z test, they are accepted based on a random draw.

IMPORTANT TO THINK ABOUT THE DATA GENERATING PROCESS (DGP)
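
The college example can be illustrated with a small simulation. This is only a sketch under assumed numbers: the treatment effect of 2.0, the selection probabilities and the direct effect of Z on the outcome are all hypothetical and chosen for illustration. Treatment is random conditional on Z, so comparing treated and untreated units within each Z group should recover the assumed effect, while the raw comparison should not.

```python
import random
from statistics import mean

random.seed(0)

TRUE_EFFECT = 2.0  # assumed causal effect of treatment (hypothetical value)

rows = []
for _ in range(20000):
    z = random.random() < 0.5              # observed screening variable Z
    # Treatment is more likely when Z is true, but random *given* Z.
    p_treat = 0.8 if z else 0.2
    treated = random.random() < p_treat
    # Z also raises the outcome directly, so it confounds the raw comparison.
    y = 1.0 + TRUE_EFFECT * treated + 3.0 * z + random.gauss(0, 1)
    rows.append((z, treated, y))


def diff_in_means(sample):
    """Mean outcome of treated units minus mean outcome of untreated units."""
    y1 = [y for _, t, y in sample if t]
    y0 = [y for _, t, y in sample if not t]
    return mean(y1) - mean(y0)


# Raw comparison: mixes the treatment effect with the effect of Z.
naive = diff_in_means(rows)

# Condition on Z: compare within each Z group, then weight by group size.
groups = {val: [r for r in rows if r[0] == val] for val in (False, True)}
adjusted = sum(len(g) * diff_in_means(g) for g in groups.values()) / len(rows)

print(f"naive difference in means: {naive:.2f}")
print(f"difference conditional on Z: {adjusted:.2f}")
```

In this setup the naive difference is inflated because treated units are disproportionately high-Z units. The conditional comparison works only because Z is observed and assignment is random given Z, which is exactly the assumption that has to be defended from knowledge of the data generating process.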

Page 10: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• An issue of “selection” vs. “learning”

• Applies to several different topics in Innovation Studies

– The roles of selection and learning are typically of great conceptual interest and policy relevance

– Usually we think of “learning” as reflecting a causal effect

– Three examples from the literature:

• Persistence of Innovation

• Exporting and productivity

• Urban Wage Premium

Page 11: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Example 1: Persistence of Innovation

• Bettina Peters: Persistence of Innovation: stylised facts and panel data evidence, Journal of Technology Transfer, 2009

• German manufacturing and services firms 1994-2002:

– Is innovation persistent? => Yes!

– What drives this?

Page 12: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Example 1: Persistence of Innovation

• 1: “True” state dependence.

– a causal behavioral effect: the decision to innovate in one period in itself enhances the probability to innovate in the subsequent period.

• (i) success breeds success (Mansfield 1968)

• (ii) innovations involve dynamic increasing returns (Nelson and Winter 1982 and Malerba and Orsenigo 1993)

• (iii) sunk costs in R&D investments (Sutton 1991)

Page 13: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Example 1: Persistence of Innovation

• 2: Selection on time-invariant characteristics

– Innovating firms may have characteristics which make them particularly ”innovation-prone”

– If these characteristics themselves show persistence over time, they will induce persistence in innovation behavior.

– If these are not appropriately controlled for, past innovation may appear to affect future innovation merely because it picks up the effect of the persistent characteristics.

– In contrast to true state dependence this phenomenon is therefore called spurious state dependence

Page 14: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Example 2: Exporting and Productivity

• Stylized fact that exporters are more productive

– In Sweden, persistent exporters are about 20% more productive than non-exporters (Andersson et al 2008)

– Why?

– Learning-by-exporting:

• Causal effect from exporting on productivity

– Knowledge accumulation through interaction with foreign customers may stimulate innovation and productivity

– Export markets are more competitive and stimulate reduction of X-inefficiencies and adoption of ‘best-practice’ routines

– Self-selection:

• Exports are associated with entry costs, implying productivity thresholds that only more productive firms can overcome (Bernard and Jensen 2004, Greenaway and Kneller 2007, Wagner 2007)

Simply analyzing the relationship between exports and productivity without further controls and/or study of time sequences (ex post vs. ex ante) tells us nothing about the relevance of the different explanations.

Page 15: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Example 3: Urban Productivity Premium

• Wages (and productivity) generally higher in larger regions

[Figure 1. The relationship between mean wages (log) and accessibility to total economic activity (log) across Swedish municipalities in 2008. Axes: log accessibility to wages; log mean wage.]

Page 16: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Example 3: Urban Productivity Premium

– Selection

• the “best” and the “brightest” move to the cities

– Learning

• Causal effect “from the environment” on productivity

– operating in a dense agglomeration stimulates a worker’s productivity

Page 17: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Example 3: Urban Productivity Premium

Alfred Marshall on selection in 1890:

”In almost all countries there is constant migration towards the towns. The large towns and especially London absorb the very best blood from all the rest of England: the most enterprising, the most highly gifted, those with the highest physique and the strongest characters go there to find scope for their abilities”

Page 18: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Example 3: Urban Productivity Premium

• Learning relates to ‘pure’ agglomeration effects and is conceptually rooted in the literature on agglomeration economies and localized human capital spillovers (Rauch 1993, Glaeser 2008).

• Agglomerations as “innovation environments” (Glaeser 1999)

Page 19: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Example 3: Urban Productivity Premium

• A large literature focuses on untangling the relative roles of selection and learning in explaining the urban wage premium (UWP).

– Risk of overestimating learning (the causal effect of agglomeration on productivity) if selection is not appropriately controlled for

Page 20: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• Thinking in terms of “selection” and “learning” is important for identification

• but,

• … selection and learning effects are, at least conceptually, seldom mutually exclusive

• and,

• their relative roles often bear on theory as well as policy

Page 21: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• Example UWP:

• Theory and conceptualizations:

– if selection is the dominant source of the city productivity premium, then theory should focus on why cities attract more productive workers rather than why cities are more productive (Glaeser and Maré 2001)

• Policy:

– learning effects provide support for policies stimulating the growth of large city agglomerations.

Page 22: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• Example persistence of innovation:

• Theory and conceptualizations:

– Endogenous growth models:

• Romer (1990) assumes that innovation behaviour is persistent at the firm level to a very large extent.

• Aghion and Howitt (1992) suggest that the process of creative destruction leads to a perpetual renewal of innovators.

– Empirical knowledge about the dynamics in firms’ innovation behaviour is a tool to assess different endogenous growth models

Page 23: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• Policy

• If innovation is state dependent, innovation–stimulating policy measures such as government support programmes are supposed to have a more profound effect, because they not only affect current innovation activities but are also likely to induce a permanent change in favour of innovation.

• If, on the other hand, individual heterogeneity induces persistent behaviour, support programmes are unlikely to have long–lasting effects and policy should concentrate more on measures which have the potential to improve innovation–relevant firm–specific factors.

Page 24: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• SUMMARY:

• For identification of the effect of X on Y, accounting for selection is imperative.

• Selection vs. learning is a key issue in many lines of inquiry in Innovation Studies, as well as in the social sciences more broadly

• But how to account for selection?

Page 25: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• Selection on observables:

– Estimate effect of X on Y, while controlling for observable characteristics of the ”observational units”, such as individuals or firms.

• Firms: productivity, employment size, location, capital stock, human capital, industry affiliation, ownership structure

• Individuals: age, gender, education, place of residence, tenure, etc.

• NOTE: one reason for the growing popularity of using micro-level datasets on individuals and firms is the potential for accounting for selection on observables

Page 26: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• Problem!

– We do not observe all relevant attributes of firms and individuals that may be of importance in explaining the phenomena we are interested in.

• Firms: managerial abilities, organizational routines, attitudes towards risk, technological opportunities, etc

• Individuals: IQ, skills, creativity, risk attitudes and all other sorts of innate abilities

– What to do?

Page 27: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification

• One option is to account for selection on “unobservables”

• We can do this with panel data.

• Many researchers maintain that the main advantage of panel data is that one can get rid of unobserved heterogeneity, since unobserved heterogeneity is considered as ‘the‘ problem of non-experimental research.

Page 28: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Part II

Introducing panel data analysis

Page 29: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

What is panel data?

• Panel data are a form of longitudinal data, involving regularly repeated observations on the same individuals

• Individuals may be people, households, firms, areas, etc

• Repeated observations over time

• repeated cross-sectional time-series

Page 30: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Typical data structure

Id   Year   y     x     z
1    t1     y11   x11   z11
1    t2     y12   x12   z12
2    t1     y21   x21   z21
2    t2     y22   x22   z22
.    .      .     .     .
.    .      .     .     .
.    .      .     .     .
N    t1     yN1   xN1   zN1
N    t2     yN2   xN2   zN2

Page 31: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Terminology in panel data applications

• A balanced panel has the same number of time observations (T) for each of the n individuals

• An unbalanced panel has different numbers of time observations (Ti) on each individual

• A compact panel covers only consecutive time periods for each individual – there are no “gaps”

• Attrition is the process of drop-out of individuals from the panel, leading to an unbalanced (and possibly non-compact) panel

• A short panel has a large number of individuals but few time observations on each

• A long panel has a long run of time observations on each individual, permitting separate time-series analysis for each
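
The definitions above can be checked mechanically on a dataset in long format (one row per individual and period). The following is a minimal sketch on a made-up toy panel; the ids, years and the helper names `is_balanced` and `is_compact` are hypothetical and only mirror the terminology of this slide.

```python
from collections import defaultdict

# Hypothetical long-format panel: one (id, year) pair per observation.
observations = [
    ("A", 2001), ("A", 2002), ("A", 2003),
    ("B", 2001), ("B", 2003),              # gap in 2002
    ("C", 2001), ("C", 2002),              # drops out after 2002 (attrition)
]

# Collect the observed periods of each individual.
years_by_id = defaultdict(list)
for unit, year in observations:
    years_by_id[unit].append(year)


def is_balanced(panel):
    """True if every individual has the same number of time observations."""
    return len({len(years) for years in panel.values()}) == 1


def is_compact(panel):
    """True if each individual's periods are consecutive (no gaps)."""
    return all(
        sorted(years) == list(range(min(years), max(years) + 1))
        for years in panel.values()
    )


print("T_i per individual:", {k: len(v) for k, v in years_by_id.items()})
print("balanced:", is_balanced(years_by_id))
print("compact: ", is_compact(years_by_id))
```

Here the toy panel is unbalanced (the individuals have 3, 2 and 2 observations) and non-compact (individual B has a gap), which is the typical situation once attrition sets in.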

Page 32: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Benefits of panel data

• They are more informative (more variability, less collinearity, more degrees of freedom), so estimates are more efficient.

• They allow us to study individual dynamics

• Some phenomena are inherently longitudinal (e.g. poverty persistence; unstable employment)

• The ability to make causal inference is enhanced by temporal ordering

• They allow us to control for individual unobserved heterogeneity

Page 33: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

A note on causal inference and panel data

• This example is based on Brüderl (2005)

• Let i denote an individual, t time, T treatment and C non-treatment. Y is the outcome we are interested in.

• Optimal identification is: $Y^{T}_{i,t_0} - Y^{C}_{i,t_0}$ => impossible! (clones not available)

• Cross-sectional data: $Y^{T}_{i,t_0} - Y^{C}_{j,t_0}$ => compare treated with untreated

• This only provides the “true causal effect” if the assumption of unit homogeneity (no unobserved heterogeneity) holds. Requires ‘perfect’ controls

• With panel data: $Y^{T}_{i,t_1} - Y^{C}_{i,t_0}$ => ‘within estimation’

• We observe the same individual before and after treatment. Unit homogeneity here is needed only in an intrapersonal sense!

Page 34: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

The issue with not accounting for unobserved ability

• We want to analyze the effect of human capital x on firm i’s innovation output, y.

• We have panel data and set up the following model:

$$y_{it} = \alpha + \beta x_{it} + \varepsilon_{it}$$

– Despite panel data, all issues of selection are at work here as well.

– It may be the managerial ability of the firms that matters. The estimated effect of x on y may be biased because high-ability managers tend to recruit more human capital (omitted variable bias).

Page 35: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

The issue with not accounting for unobserved ability

• Unobserved heterogeneity, such as managerial ability, ends up in the error term $\varepsilon_{it}$.

• But if high-ability managers hire more human capital, then $x_{it}$ and $\varepsilon_{it}$ are correlated.

– This violates the assumption of exogeneity

• Endogeneity (the X-variable correlates with the error term) results in biased regression estimates.

– Endogeneity can be a consequence of unobserved heterogeneity.
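
The direction and size of this bias can be written out for the simple one-regressor model. This is a sketch under simplifying assumptions that are not on the slides: the error term is split as $\varepsilon_{it} = \gamma a_i + u_{it}$, where $a_i$ is the unobserved managerial ability, $\gamma$ its effect on y, and $u_{it}$ is noise unrelated to $x_{it}$. The symbols $a_i$, $\gamma$ and $u_{it}$ are introduced here for illustration only.

```latex
\hat{\beta}_{OLS} \;\xrightarrow{\;p\;}\;
\beta + \frac{\operatorname{Cov}(x_{it}, \varepsilon_{it})}{\operatorname{Var}(x_{it})}
= \beta + \gamma \, \frac{\operatorname{Cov}(x_{it}, a_i)}{\operatorname{Var}(x_{it})}
```

If ability raises innovation output ($\gamma > 0$) and high-ability managers hire more human capital ($\operatorname{Cov}(x_{it}, a_i) > 0$), pooled OLS overstates $\beta$, however large N is.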

Page 36: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

How panel data can take care of unobserved heterogeneity

• Panel data do not in themselves remedy the problem of unobserved heterogeneity

– but one can apply techniques to panel data that do.

• The within transformation does the trick.

• Panel data contain two sources of variation:

– Variance across individuals (between variance)

– Variance within individuals over time (within variance)

Page 37: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

How panel data can take care of unobserved heterogeneity

• Suppose the managerial ability of each firm i is time-invariant.

– Denote this by $\alpha_i$ (firm-specific fixed effects)

– A model including (unobserved) managerial ability would read:

$$y_{it} = \alpha_i + \beta x_{it} + \varepsilon_{it} \quad (1)$$

– Take the average value over time for each i:

$$\bar{y}_i = \alpha_i + \beta \bar{x}_i + \bar{\varepsilon}_i \quad (2)$$

We have “taken away” the time dimension and have a cross-section with average values over the time periods (between variation).

Page 38: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

How panel data can take care of unobserved heterogeneity

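• Now subtract the second equation from the first:

$$y_{it} - \bar{y}_i = \beta (x_{it} - \bar{x}_i) + \varepsilon_{it} - \bar{\varepsilon}_i$$

• $\alpha_i$ is gone!

• Why? => it is constant over time, so its mean value over the periods for each i is the same:

$$\alpha_i - \bar{\alpha}_i = \alpha_i - \alpha_i = 0$$

• Time-constant unobserved heterogeneity is no longer a problem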

Page 39: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

How panel data can take care of unobserved heterogeneity

• Within transformation means that the data is "time-demeaned".

• Only the within variation is left, because we subtract the between variation.

• The within transformation made possible by panel data allows researchers to account for time-invariant unobserved heterogeneity

– Better identification
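
The effect of time-demeaning can be seen in a small simulation. This is a sketch, not the estimation used in any of the cited papers: the true effect of 0.5, the panel dimensions and the way unobserved ability feeds into x are all assumed values chosen for illustration. Pooled OLS ignores the firm effect, whereas the within estimator regresses demeaned y on demeaned x.

```python
import random
from statistics import mean

random.seed(1)

BETA = 0.5        # assumed true effect of x on y (hypothetical value)
N, T = 2000, 5    # number of firms and periods (illustrative sizes)

panel = []        # rows of (firm id, x_it, y_it), ordered by firm
for i in range(N):
    ability = random.gauss(0, 1)              # time-invariant, unobserved
    for _ in range(T):
        x = ability + random.gauss(0, 1)      # high ability -> more x
        y = BETA * x + ability + random.gauss(0, 1)
        panel.append((i, x, y))


def ols_slope(xs, ys):
    """Slope from a simple regression of ys on xs (with intercept)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


# Pooled OLS: the firm effect sits in the error term and correlates with x.
pooled = ols_slope([x for _, x, _ in panel], [y for _, _, y in panel])

# Within transformation: subtract each firm's own time mean from x and y.
x_dm, y_dm = [], []
for i in range(N):
    rows = panel[i * T:(i + 1) * T]
    x_bar = mean(x for _, x, _ in rows)
    y_bar = mean(y for _, _, y in rows)
    x_dm += [x - x_bar for _, x, _ in rows]
    y_dm += [y - y_bar for _, _, y in rows]

within = ols_slope(x_dm, y_dm)

print(f"pooled OLS estimate: {pooled:.3f}")
print(f"within (FE) estimate: {within:.3f}")
```

Under these assumptions the pooled estimate is pulled well above the assumed effect, while the within estimate stays close to it. The fix only works because the simulated ability is constant over time; time-varying unobserved factors would survive the demeaning.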

Page 40: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Is unobserved heterogeneity empirically important?

• Yes!

– in many research papers the magnitude of the estimated effects of x on y depends to a large extent on whether one accounts for unobserved heterogeneity or not

• Example:

– Andersson, Klaesson and Larsson (2012), “Selection and Learning of Workers in Cities”

Page 41: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Is unobserved heterogeneity empirically important?

• Main question:

– How important are selection and learning, respectively, in explaining the urban wage premium?

– Panel data of private sector workers 2001-2010

Page 42: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification of selection

• Workers are heterogeneous:

– education, age, gender: observed

– innate ‘abilities’: unobserved

• Suppose wages are 10% higher in cities

– how much of this is due to workers in cities being better educated and older?

– how much of the wage difference remains after controlling for education and age?

• if selection is important, we should observe that the wage premium drops as we account for worker heterogeneity.

Page 43: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification of selection

We run different models and test how sensitive the city wage premium is to observed and unobserved worker heterogeneity

Page 44: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Identification of learning

• 1: indirectly quantified while accounting for selection

– remaining wage gap after controlling for spatial sorting of workers

• 2: identification from workers that move from urban to rural regions.

– faster human capital accumulation in cities => the advantages of having worked in a large, dense city should remain after moving away.

– we estimate the wage premium for workers that move away from dense agglomerations and test whether their wage drops or remains upon moving.

Page 45: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Wages, education levels and skills in the Swedish economic geography

Table 3. Key figures divided by fraction of non-routine work tasks.

Job type                          Mean wage (EUR)   Graduate share   Mean experience   Metropolitan share
All types of professions          29 698            15%              22                27%
High fraction non-routine tasks   36 683            28%              23                36%
Low fraction non-routine tasks    23 088            3%               21                19%

Note: Graduate share is the fraction of workers with a university education of at least three years. Metropolitan share is the fraction of workers that work in the three biggest labor market regions: Stockholm, Göteborg and Malmö. Wages converted to EUR using the 2008 exchange rate between SEK and EUR of 9.68. High (low) fraction non-routine jobs are those with fraction non-routine tasks above (below) the mean fraction across all occupations (see Table 2).

Page 46: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Wages, education levels and skills in the Swedish economic geography

Table 4. Mean wages (2008) and unadjusted wage gap between metropolitan and non-metropolitan workers.

Job type                          Metropolitan wage (EUR)   Non-metropolitan wage (EUR)   Wage differential
All types of professions          34 417                    27 926                        23%
High fraction non-routine tasks   41 024                    34 245                        20%
Low fraction non-routine tasks    22 634                    23 195                        -2%

Note: The metropolitan areas are defined as the three biggest labor market regions: Stockholm, Göteborg and Malmö. Wages converted to EUR using the 2008 exchange rate between SEK and EUR of 9.68. High (low) fraction non-routine jobs are those with fraction non-routine tasks above (below) the mean fraction across all occupations (see Table 2).

How much of the wage premium in larger cities is due to selection?
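
The unadjusted wage differentials in Table 4 appear to be the metropolitan wage expressed relative to the non-metropolitan wage. That reading is an assumption, since the table does not state the formula, but it reproduces the reported percentages after rounding. A short check with the table's own numbers (the function name `wage_gap_percent` is made up for this sketch):

```python
# Mean wages in EUR from Table 4: (metropolitan, non-metropolitan).
wages = {
    "All types of professions": (34417, 27926),
    "High fraction non-routine tasks": (41024, 34245),
    "Low fraction non-routine tasks": (22634, 23195),
}


def wage_gap_percent(metro, non_metro):
    """Metropolitan premium as a percentage of the non-metropolitan wage."""
    return 100 * (metro - non_metro) / non_metro


gaps = {job: wage_gap_percent(*pair) for job, pair in wages.items()}
for job, gap in gaps.items():
    print(f"{job}: {gap:.1f}%")
```

These are raw gaps: nothing has been held constant yet, so they mix any learning effect with differences in who works in the metropolitan regions.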

Page 47: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Empirical model

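$$\ln w_{irt} = \beta_1 \ln De^{M}_{rt} + \beta_2 \ln De^{R}_{rt} + \beta_3 \ln De^{E}_{rt} + \mathbf{Z}\boldsymbol{\gamma} + \sum_{t=1}^{T} D_t + \sum_{R=1}^{81} D_R + \sum_{R=1}^{81} \sum_{t=1}^{T} D_R D_t + \varepsilon_{irt} \quad (3)$$

De = market potential measures (Harris 1954); M, R and E: municipal, regional and extra-regional, as in Table 6

Dummy terms (in red on the slide): time effects, regional effects, region-specific “shocks”

$\mathbf{Z}\boldsymbol{\gamma}$ (in green on the slide): worker characteristics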

Page 48: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Table 6. The relationship between spatial economic density and wages, all private sector workers.

                              Raw OLS        Mincerian OLS  Full OLS       Raw with       Full with
                                                                           worker FE      worker FE
Municipal density (log)       0.0326***      0.0218***      0.0205***      0.00773***     0.00538***
                              (0.00322)      (0.00224)      (0.00123)      (0.000242)     (0.000242)
Regional density (log)        0.0335***      0.0218***      0.0195***      0.00790***     0.00522***
                              (0.00777)      (0.00425)      (0.00641)      (0.000518)     (0.000514)
Extra-regional density (log)  -0.0323*       -0.0221        -0.0248***     -0.0127***     -0.00797***
                              (0.0185)       (0.0139)       (0.00788)      (0.000679)     (0.000674)
Years of schooling                           0.0930***      0.0823***                     0.117***
                                             (0.00468)      (0.00300)                     (0.0190)
Experience                                   0.0503***      0.0408***                     0.0587***
                                             (0.00326)      (0.00279)                     (0.0190)
Experience^2                                 -0.000781***   -0.000635***                  -0.000745***
                                             (5.42e-05)     (4.64e-05)                    (2.84e-06)
Immigrant (dummy)                            -0.136***      -0.108***
                                             (0.00737)      (0.00409)
Male (dummy)                                 0.351***       0.330***
                                             (0.00947)      (0.00425)
Tenure                                                      0.0176***                     -0.0109***
                                                            (0.000458)                    (0.000127)
Number of prior employees                                   -0.0120***                    -0.0189***
                                                            (0.00286)                     (0.000176)
New occupation (dummy)                                      -0.0930***                    -0.0268***
                                                            (0.00222)                     (0.000366)
Employer size (log)                                         0.0257***                     0.0183***
                                                            (0.00225)                     (0.000175)
Year dummies                  Yes            Yes            Yes            Yes            Yes
Region dummies                Yes            Yes            Yes            Yes            Yes
Region*Year effects           Yes            Yes            Yes            Yes            Yes
Education type dummies        No             Yes            Yes            No             Yes
Industry dummies              No             No             Yes            No             Yes
Observations                  12,367,700     12,367,700     12,367,700     12,367,700     12,367,700
Individuals                   2,681,164      2,681,164      2,681,164      2,681,164      2,681,164
R-squared                     0.031          0.248          0.288          0.059          0.078

Note: The table reports estimates of wage-density elasticities for private sector workers in Sweden 2002-2008. Raw refers to the wage equation in equation (3) without any further controls. The Mincerian model adds years of schooling, experience and its squared value as well as dummies for immigrants, males and education specialization. The full specification further adds variables reflecting labor market status and employer characteristics of each worker. OLS refers to the pooled OLS estimator and FE to a panel estimator with worker fixed effects. All variables are defined in Table 1. The full FE model excludes immigrant and sex dummies as these reflect time-invariant worker characteristics. All models include year and region dummies as well as region-year dummies, where the latter account for any region-specific time-varying shocks shared by all workers in the same local labor market region. The dependent variable is the natural logarithm of wage earnings. Robust standard errors are presented in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Page 49: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Is unobserved heterogeneity empirically important?

• The wage-density elasticity drops from 3.3% to 0.8% when accounting for worker fixed effects (within transformation)!!

• Selection on observables relatively unimportant.
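
The two bullets above can be made concrete with the municipal density coefficients from Table 6. The sketch below only does arithmetic on the reported estimates; the helper name `share_removed` is made up, and the comparison ignores the sampling uncertainty of the coefficients.

```python
# Municipal wage-density elasticities reported in Table 6.
raw_ols = 0.0326    # no worker controls
full_ols = 0.0205   # adds observed worker and employer characteristics
raw_fe = 0.00773    # adds worker fixed effects (within transformation)


def share_removed(before, after):
    """Fraction of the original coefficient that disappears."""
    return 1 - after / before


drop_observables = share_removed(raw_ols, full_ols)
drop_fixed_effects = share_removed(raw_ols, raw_fe)

print(f"removed by observed controls: {drop_observables:.0%}")
print(f"removed by worker fixed effects: {drop_fixed_effects:.0%}")
```

Observed characteristics account for a bit more than a third of the raw elasticity, whereas worker fixed effects remove roughly three quarters of it, which is the sense in which selection on unobservables dominates here.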

Page 50: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Table 7. The relationship between spatial economic density and wages for workers with occupations associated with high fractions of non-routine job tasks.

                              Raw OLS        Mincerian OLS  Full OLS       Raw with       Full with
                                                                           worker FE      worker FE
Municipal density (log)       0.0317***      0.0253***      0.0250***      0.00810***     0.00655***
                              (0.00206)      (0.00260)      (0.00141)      (0.000345)     (0.000346)
Regional density (log)        0.0364***      0.0263***      0.0240***      0.00868***     0.00618***
                              (0.00734)      (0.00523)      (0.00829)      (0.000772)     (0.000769)
Extra-regional density (log)  -0.0271        -0.0231        -0.0253**      -0.0124***     -0.00834***
                              (0.0232)       (0.0183)       (0.0114)       (0.00105)      (0.00105)
Years of schooling                           0.0797***      0.0766***                     0.133*
                                             (0.00312)      (0.00289)                     (0.0691)
Experience                                   0.0556***      0.0513***                     0.0926
                                             (0.00204)      (0.00185)                     (0.0691)
Experience^2                                 -0.000862***   -0.000794***                  -0.000737***
                                             (3.95e-05)     (3.54e-05)                    (4.05e-06)
Immigrant (dummy)                            -0.0419***     -0.0371***
                                             (0.00285)      (0.00260)
Male (dummy)                                 0.353***       0.351***
                                             (0.00319)      (0.00306)
Tenure                                                      0.0105***                     -0.00644***
                                                            (0.000400)                    (0.000172)
Number of prior employees                                   0.000754                      -0.00741***
                                                            (0.000741)                    (0.000237)
New occupation (dummy)                                      -0.0703***                    -0.0151***
                                                            (0.00209)                     (0.000502)
Employer size (log)                                         0.0259***                     0.0149***
                                                            (0.00314)                     (0.000236)
Year dummies                  Yes            Yes            Yes            Yes            Yes
Region dummies                Yes            Yes            Yes            Yes            Yes
Region*Year effects           Yes            Yes            Yes            Yes            Yes
Education type dummies        No             Yes            Yes            No             Yes
Industry dummies              No             No             Yes            No             Yes
Observations                  5,986,454      5,986,454      5,986,454      5,986,454      5,986,454
Individuals                   1,388,166      1,388,166      1,388,166      1,388,166      1,388,166
R-squared                     0.038          0.258          0.280          0.061          0.074

Note: The table reports estimates of wage-density elasticities for private sector workers in Sweden 2002-2008 with occupations associated with high fractions of non-routine job tasks (see Table 2). Raw refers to the wage equation in equation (3) without any further controls. The Mincerian model adds years of schooling, experience and its squared value as well as dummies for immigrants, males and education specialization. The full specification further adds variables reflecting labor market status and employer characteristics of each worker. OLS refers to the pooled OLS estimator and FE to a panel estimator with worker fixed effects. All variables are defined in Table 1. The full FE model excludes immigrant and sex dummies as these reflect time-invariant worker characteristics. All models include year and region dummies as well as region-year dummies, where the latter account for any region-specific time-varying shocks shared by all workers in the same local labor market region. The dependent variable is the natural logarithm of wage earnings. Robust standard errors are presented in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Non-routine workers

Page 51: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Table 9. Wage premium for workers moving away from a metropolitan region to the rest of the country, by fraction of non-routine job tasks.

Workers                           Dummy for moving away from metropolitan region   Model
All private sector workers        0.00286*** (0.0014)                              Full with worker fixed effects
High fraction non-routine tasks   0.01085*** (0.0020)                              Full with worker fixed effects
Low fraction non-routine tasks    -0.00055 (0.0021)                                Full with worker fixed effects

Note: The table reports the coefficient estimate of a dummy variable reflecting a move from any of Sweden’s three metropolitan labor market regions (Stockholm, Göteborg and Malmö) to anywhere else in Sweden. The underlying model is a panel estimator with worker fixed effects including the full set of additional control variables reported in the ‘Full with worker FE’ specification in Tables 6-7. Complete estimation results are available from the authors upon request. The dependent variable is the natural logarithm of wage earnings. Robust standard errors are presented in parentheses. *** p<0.01, ** p<0.05, * p<0.1

RESULTS FOR ”MOVERS” (from urban to rural)

Page 52: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

CONCLUSIONS

• Who you are and the kind of job you have are more important than where you live in explaining spatial wage disparities.

– The main reason why workers in denser regions earn more is simply that they are different from the workers in more rural regions.

• Learning effects (or pure agglomeration economies) are not zero but are quantitatively of smaller importance than spatial sorting.

Page 53: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Some issues

• Identification of parameters:

– In the FE-model, parameters are identified from changes within individuals over time.

– Effects of time-invariant variables cannot be estimated (gender, race, education etc.)

– Effects of dummy variables are only identified from those individuals that change status over time (e.g. from 0 to 1, or 1 to 0)

– There must be some variation in the variable of interest. Otherwise, we cannot estimate its effect. This is potentially a problem if only a few observations show a change in a variable.
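
Why only status changers contribute can be seen by applying the within transformation to a dummy variable directly. The sketch below uses a made-up four-firm panel with a hypothetical cluster dummy; after subtracting each firm's own time mean, firms whose dummy never changes are left with zeros in every period and carry no information about the dummy's effect.

```python
from statistics import mean

# Hypothetical firm panel: firm id -> cluster dummy in each of three years.
cluster = {
    "firm_1": [1, 1, 1],   # always inside a cluster
    "firm_2": [0, 0, 0],   # never inside a cluster
    "firm_3": [0, 1, 1],   # moves into a cluster
    "firm_4": [1, 1, 0],   # moves out of a cluster
}


def demean(values):
    """Within transformation: subtract the unit's own time mean."""
    m = mean(values)
    return [v - m for v in values]


demeaned = {firm: demean(vals) for firm, vals in cluster.items()}

# Only firms whose dummy changes keep any variation after demeaning.
switchers = [f for f, vals in demeaned.items() if any(v != 0 for v in vals)]

for firm, vals in demeaned.items():
    print(firm, [round(v, 2) for v in vals])
print("firms that identify the dummy's effect:", switchers)
```

With real data it is worth counting the switchers before estimating an FE-model: if only a handful of units change status, the coefficient rests on very few observations.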

Page 54: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

An example

• Suppose you estimate the effect of location in a high-tech cluster on firm innovation.

– We have a dummy which is 1 for firms in a cluster, and 0 otherwise

– Model 1 is a cross-section:

$$y_i = \alpha + \beta_1 x_i + \beta_2 Cluster_i + \varepsilon_i$$

• Here $\beta_2$ is identified from differences between firms ‘inside’ and ‘outside’ a high-tech cluster

– Model 2 is an FE-model (within transformation):

$$y_{it} = \alpha_i + \beta_1 x_{it} + \beta_2 Cluster_{it} + \varepsilon_{it}$$

• Here $\beta_2$ is identified from firms that move into (or out of) a high-tech cluster. It captures the instantaneous effect on y of moving into a cluster.

Page 55: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Some issues continued

• Fixed and random effects:

– An alternative to FE is random effects (RE).

– RE exploits both within and between variation. But it relies on the restrictive assumption that the unobserved individual effects are uncorrelated with the explanatory variable (x). The FE-model does not.

– The RE-estimator, however, provides estimates for time-constant covariates. Many researchers want to report effects of sex, race, etc. Therefore, they choose the RE-estimator over the FE-estimator.

Page 56: CENTER FOR INNOVATION, RESEARCH AND  COMPETENCE  IN THE LEARNING ECONOMY

Panel data estimation in practice

• Most statistical packages, such as STATA or LIMDEP, have built-in menu-based procedures for panel data analysis

• Next class: panel data analysis in practice using STATA

– declaring data to be panel

– describing datasets

– descriptive statistics

– estimation of FE, RE and BE models

– interpretation of coefficients