Evaluating Russian Economic Growth without the Revolution of 1917∗

Ivan Korolev†

July 5, 2017

Abstract

This paper uses modern econometric techniques, such as the lasso and the synthetic control method, to construct the counterfactual GDP per capita series for Russia for 1917–1940. The goal of this paper is twofold: first, to predict how the Russian economy might have developed without the Revolution; second, to evaluate and compare various econometric methods for computing the counterfactual GDP per capita series. The counterfactuals based on the preferred method, the synthetic control, suggest that without the Revolution Russia might have grown at about 1.6% a year in 1917–1940.

∗I am grateful to Ran Abramitzky and Frank Wolak for continuous support and encouragement. I also thank Simeon Djankov, Sergei Guriev, Caroline Hoxby, Andrei Markevich, and Hans-Joachim Voth for valuable comments and discussions. I gratefully acknowledge the financial support from the Stanford Graduate Fellowship Fund as a Koret Fellow and from the Stanford Institute for Economic Policy Research as a B.F. Haley and E.S. Shaw Fellow. All remaining errors are mine.

†Department of Economics, Stanford University, 579 Serra Mall, Stanford, CA, 94305. E-mail:[email protected]. Website: http://web.stanford.edu/~ikorolev/


1 Introduction

A major question in Russian economic history and a subject of ongoing public debate

in today’s Russia¹ is how economic growth in the Soviet Union in the 1920s–1930s would have

compared to counterfactual growth of Russia if the Russian Revolution had not happened

in 1917. Some people believe that Soviet industrialization allowed the Soviet Union to grow

faster than ever before and to achieve what otherwise would have been impossible, while

others argue that the fast economic growth of the 1930s was just a return to the pre-revolution trend

and that Tsarist Russia would have achieved the same level of economic development.

There are several studies that try to understand the consequences of the various aspects

of the Revolution on economic development of Russia. In particular, Hunter and Szyrmer

(2014) evaluate Stalin’s economic policies in 1928–1940 using a multi-sector and multi-period

linear model; Allen (2003) looks at Russian industrialization that took place at the beginning of the 1930s and tries to understand what would have happened without centralized industrialization; Cheremukhin et al. (2017) look at the effects of Stalin’s industrialization on economic growth in Russia using a two-sector macroeconomic model. These studies mostly rely on theory-based simulations, an approach that has its advantages and disadvantages. On the one hand, this

approach allows the authors to analyze particular policy changes and consider a wide range

of scenarios; on the other hand, it might be difficult to model all factors that are important

for economic growth (e.g. productivity, technology, international trade, and so on) together.

This paper complements the existing literature by studying the consequences of the Rus-

sian Revolution of 1917 using a data-driven approach. The goal of this paper is twofold.

First, I construct the counterfactual series of Russian GDP per capita from 1917 on, and

hence I am able to predict how Russia might have developed without the Revolution of 1917.

Second, I evaluate various econometric methods that could be used to construct the counterfactual series, and then I provide insights about their performance that can be useful in a variety of other settings, especially when the number of potential control variables is large relative to the sample size.

¹For a brief summary of Russians’ opinion on Stalin see, e.g., https://www.forbes.com/2010/03/16/russia-joseph-stalin-victory-day-opinions-contributors-cathy-young.html.

My results can be viewed as a credible scenario of what might have happened without

the Revolution, but not as an estimate of the causal effect of the Revolution. Essentially, I

predict Russian GDP per capita after 1917 based on the pre-1917 data for Russia and both

pre-1917 and post-1917 data for other countries. Even though I will routinely use the notion

of “counterfactual series” for alternative scenarios and forecasts of Russian GDP per capita

throughout the paper, it should be clear these counterfactuals do not answer any particular

causal questions.

Since the Revolution took place in Russia because of certain political, economic, and

social issues, I cannot measure the causal effect of the Revolution by comparing Russia to

other countries. Moreover, the Revolution consisted of several events and included various

policy changes, and my methods do not allow me to analyze particular events separately.

Instead, my counterfactuals can be viewed as an attempt to find a stable predictive relation-

ship between Russia and other countries before 1917 and, assuming it would have remained

unchanged after 1917, construct predictions according to this relationship.

For instance, one might think that because of international trade, technology spillovers,

or other causes the growth rates in different countries are interconnected. Then I can recover

the relationship between the Russian economy and economies of other countries using econo-

metric methods, without modeling this relationship explicitly, and then I can make forecasts

based on this predictive relationship. In order to do so, I need to assume that the predictive

relationship was stable over time and would not have changed after 1917 if the Revolution

had not happened.

I try to validate this assumption by conducting placebo tests, in which I construct the

counterfactual GDP per capita series for the countries that did not experience a revolution

and show that these counterfactual series are reasonably close to the actual series.

I use two methods to estimate the predictive relationship between Russia and other countries. The first method is based on running a usual OLS regression of Russian GDP per

capita on GDP per capita of other countries using the pre-1917 data and then constructing

the forecast of Russian GDP per capita after 1917 as fitted values. However, the problem

with this approach is that I do not have enough years before the Revolution to run an unre-

stricted regression: I have 32 countries in the control group and only 32 observations before

the Revolution. Hence, I use the lasso to select the “best” regressors and then run an OLS

regression with only the selected regressors to construct the counterfactual series as the fitted

values from this OLS regression. From now on, I will call this method simply “the lasso”.

The second method is the recently developed synthetic control method. This method has been used in Abadie and Gardeazabal (2003), Abadie et al. (2010), and Pinotti (2012). For a more theoretical discussion of the synthetic control method and its relation to

other approaches, see Abadie et al. (2015) and Doudchenko and Imbens (2016). The idea

of this approach is to use the pre-treatment data to find a so-called synthetic control, a

counterfactual observation that had been most similar to the treated observation before the

treatment took place, and then to look at how this counterfactual observation would have

behaved after the treatment. Here I am interested in counterfactual development of Tsarist

Russia without the Revolution, so the treatment is essentially the Russian Revolution that

took place in 1917.

I compare these two methods using placebo tests that involve either other countries, where revolutions did not happen, so that I know what the true growth path without revolutions looks like, or Russia before 1917. I discuss these placebo tests in detail in Subsection 4.1.

The remainder of the paper is organized as follows. Section 2 gives the historical background

of the Russian Revolution. Section 3 describes various approaches I use in this paper. Sec-

tion 4 describes the way to evaluate and compare various methods and then presents the

main findings. Section 5 concludes.

2 Background

In this section I give a brief overview of the historical context of the Russian Revolution of

1917. In the second half of the 19th century and beginning of the 20th century Russia was an

agrarian country, poorer than other European countries and approximately as rich as Japan.

Russia was facing economic and political challenges that, in particular, led to the Revolution

of 1905, which, however, did not result in dramatic political changes. To be precise, the

Revolution of 1905 led to a constitutional reform, but Russia remained a monarchy and

Tsar Nicholas II stayed in power. The government started to implement large-scale agrarian

reforms in 1906, but these efforts arguably were cut short by World War I in 1914. Despite

political and economic challenges, the economy was steadily growing before World War I.

In 1914 Russia entered World War I, which led to more serious economic, social, and

political problems, and in 1917 the Revolution took place. In fact, the Russian Revolution

of 1917 consisted of two revolutions. First, in March 1917, Tsar Nicholas II was forced to

abdicate and the provisional government was formed. Then, in November 1917 there was

another revolution, and as a result, the Bolsheviks came to power, replacing the provisional

government. The Civil War (loosely speaking, the war between the Bolsheviks’ Red Army

and the anti-Bolshevik White Army) started as a result of the Revolution. There is no

universal agreement among historians on the exact dates of the Civil War: historians believe

it started in November 1917 or in the summer of 1918 and ended between 1920 and 1922

(see Bradley (1975), Mawdsley (2007), Bullock (2008)). The Bolsheviks won the war, and

the Soviet Union was created in December 1922.

During the 1920s, the economic and political systems in the Soviet Union became more and more centralized and controlled by the government, and by the beginning of the 1930s the Soviet Union had become a planned economy. Starting in 1928, the government attempted

a centralized industrialization in order to quickly turn an agrarian country into an industrial-

ized one. The cost of industrialization was high: according to Wheatcroft and Davies (2004),

between 5.5 and 6.5 million people died during the famine of 1932–1933; then, according

to Pipes (2001), about 680,000 people were executed during political repressions known as

the Great Purge in 1936–1938. The benefits of the industrialization are still unclear: some

historians (e.g. Allen (2003)) claim that it led to faster economic growth in the country, while

others (e.g. Cheremukhin et al. (2017)) argue that the Russian economy would have grown

at comparable rates even without the forced industrialization. Figure 1 plots the evolution

of Russian GDP per capita over time in order to illustrate the main patterns of economic

growth in Russia and the Soviet Union in 1885–1940. As a benchmark, it also plots the

evolution of American GDP per capita. I would like to stress that the GDP per capita data

for Russia and the Soviet Union was constructed by Markevich and Harrison (2011) retro-

spectively. Hence, I am not concerned about possible attempts from the Soviet government

to manipulate the data.

Figure 1: Russian and American GDP per Capita, 1885–1940

The natural logarithm of actual Russian GDP per capita is plotted as a solid line. The natural logarithm

of American GDP per capita is plotted as a dashed line. The vertical line marks 1917, the year when the

Revolution took place.

In this paper I try to estimate the counterfactual GDP per capita in Russia had the

Russian Revolution not taken place in 1917 and compare it with the actual GDP per capita.

It is worth noting that I do not discuss whether it was possible for the Russian government

to avoid the Revolution. It might be the case that in order to avoid the Revolution, it would

have been necessary to implement political or economic reforms that would have changed the

Russian growth path significantly.

3 Data and Empirical Strategy

The main source of data for my analysis is the updated version of the Maddison project,

i.e. Bolt and van Zanden (2013), which has data on Russian GDP per capita from 1885 on

as well as information about other countries. Table 1 lists all countries in the dataset. The

construction of country characteristics and data sources are discussed in the table footnotes.

The series of Russian GDP for 1913–1928, as reported in Bolt and van Zanden (2013),

was constructed by Markevich and Harrison (2011). They built upon numerous previous

studies, most notably Gregory (2004), to fill in the last gap in the Russian GDP series. An

important concern about this data is that the territory and population of the country changed

dramatically after the Revolution. However, Markevich and Harrison (2011) explicitly took

these changes into account in order to construct a consistent time series of GDP per capita

for the Russian Empire and then the Soviet Union.

As is usually done in time series econometrics, I mainly use the natural logarithm of GDP

per capita in my analysis, because economists often believe that growth rates are stationary,

and hence the first differences of the logarithm of GDP per capita are stationary as well. As

for the forecast horizon, I focus on 1917–1940, because World War II started in 1939 and

Germany invaded the Soviet Union in 1941, so that any forecasts or comparisons that go

beyond that point do not make much sense.
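The link between log levels and growth rates used above can be made concrete with a short sketch. The series below is hypothetical (it grows at exactly 1.6% a year and is not the Maddison data), and the function names are illustrative only.

```python
import math

def growth_rates(log_series):
    """First differences of a log series, i.e. approximate annual growth rates."""
    return [b - a for a, b in zip(log_series, log_series[1:])]

def average_annual_growth(log_start, log_end, years):
    """Exact compound annual growth rate implied by two log levels `years` apart."""
    return math.exp((log_end - log_start) / years) - 1.0

# Hypothetical log GDP per capita growing at exactly 1.6% a year for 24 years.
log_series = [7.0 + t * math.log(1.016) for t in range(25)]

approx = growth_rates(log_series)[0]  # log difference, slightly below 0.016
exact = average_annual_growth(log_series[0], log_series[-1], 24)
```

The log difference is only an approximation to the growth rate, but for rates of a few percent a year the gap is small, which is why differences of logs are commonly read as growth rates.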

To summarize, I have data for 1885–1940, with 1917 being the year when the treatment took place, and my goal is to construct a counterfactual GDP per capita series for Russia from 1917 on. As I mentioned above, I can use several methods to do so.

Table 1: Countries

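| Country | Europe | Independent | Population, 1913 (thousands) | WW1 Losses (thousands) | WW1 Losses (% of population) |
|---|---|---|---|---|---|
| Russia | 1 | 1 | 156,192 | 1,700.0 | 1.088 |
| Austria | 1 | 1 | 6,767 | 120.0# | 1.773 |
| Belgium | 1 | 1 | 7,666 | 13.7 | 0.179 |
| Denmark | 1 | 1 | 2,983 | 0.0 | 0 |
| Finland | 1 | 0 | 3,027 | 26.5# | 0.876 |
| France | 1 | 1 | 41,463 | 1,357.8 | 3.275 |
| Germany | 1 | 1 | 65,058 | 1,773.7 | 2.726 |
| Italy | 1 | 1 | 37,248 | 650.0 | 1.745 |
| Netherlands | 1 | 1 | 6,164 | 0.0 | 0 |
| Norway | 1 | 1 | 2,447 | 0.0 | 0 |
| Sweden | 1 | 1 | 5,621 | 0.0 | 0 |
| Switzerland | 1 | 1 | 3,864 | 0.0 | 0 |
| UK | 1 | 1 | 45,649 | 702.4* | 1.539 |
| Greece | 1 | 1 | 5,425 | 5.0 | 0.092 |
| Portugal | 1 | 1 | 5,972 | 7.2 | 0.121 |
| Spain | 1 | 1 | 20,263 | 0.0 | 0 |
| Australia | 0 | 1 | 4,821 | 59.9* | 1.242 |
| New Zealand | 0 | 1 | 1,122 | 16.7* | 1.489 |
| Canada | 0 | 1 | 7,852 | 62.8* | 0.800 |
| USA | 0 | 1 | 97,606 | 116.5 | 0.119 |
| Argentina | 0 | 1 | 7,653 | 0.0 | 0 |
| Brazil | 0 | 1 | 23,660 | 0.0 | 0 |
| Chile | 0 | 1 | 3,431 | 0.0 | 0 |
| Colombia | 0 | 1 | 5,195 | 0.0 | 0 |
| Peru | 0 | 1 | 4,295 | 0.0 | 0 |
| Uruguay | 0 | 1 | 1,177 | 0.0 | 0 |
| Venezuela | 0 | 1 | 2,874 | 0.0 | 0 |
| India | 0 | 0 | 303,700 | 54* | 0.017 |
| Indonesia | 0 | 0 | 51,637 | 0.0 | 0 |
| Japan | 0 | 1 | 51,672 | 0.3 | 0.001 |
| Sri Lanka | 0 | 0 | 4,811 | 0.0 | 0 |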

“Europe” equals 1 if a country is located in Europe and 0 otherwise. A majority of the Russian population lived in Europe, so I treat Russia as a European country. “Independent” equals 1 if a country was independent throughout the entire period and 0 otherwise. I treat Australia, Canada, and New Zealand as independent because they had so-called “responsible governments” and arguably enjoyed a significant amount of independence from the UK. Sweden and Norway formed a union under the Swedish monarch until 1905, so formally Norway was not an independent country until then. However, because these two countries still had separate laws, legislatures, armed forces, etc., I treat them as independent as well. Data on 1913 population is from Maddison (http://www.ggdc.net/maddison/oriindex.htm). “WW1 Losses” takes into account military losses only and comes from Encyclopedia Britannica (http://www.britannica.com/EBchecked/topic/648646/World-War-I/53172/Killed-wounded-and-missing). For the countries marked by “*” (that were parts of the British Empire), the data is from Ellis and Cox (2001), except India, for which the data is from Urlanis (1971). Austria and Finland, marked by “#”, did not exist in modern borders in 1914. For Austria, I use data from Erlikman (2004); for Finland, from International Labor Office (1923–25).

As a baseline, I use an ARIMA(1,1,0) model, i.e. estimate an AR(1) model in first differences:

$$\Delta \log(\mathit{GDPpc})_t = \alpha + \rho \, \Delta \log(\mathit{GDPpc})_{t-1} + \varepsilon_t,$$

where t indexes time and ρ is the autoregressive coefficient, and then construct the predicted values from 1917 on. This approach does not allow me to account for possible structural breaks or for global macroeconomic factors that might have affected the Russian economy, because essentially it consists of just continuing the trend that existed before the Revolution. However, if other methods cannot beat this method in placebo tests, it might indicate that other methods do not perform well.
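A minimal sketch of this baseline, assuming the AR(1) in first differences is fitted by OLS and then iterated forward. The input series is made up for illustration, and the function names are hypothetical rather than taken from the paper.

```python
from statistics import mean

def fit_ar1_on_differences(log_gdp):
    """OLS fit of d_t = alpha + rho * d_{t-1} + e_t, where d_t is the first difference of log_gdp."""
    d = [b - a for a, b in zip(log_gdp, log_gdp[1:])]
    x, y = d[:-1], d[1:]
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    rho = sxy / sxx
    alpha = y_bar - rho * x_bar
    return alpha, rho

def forecast_levels(log_gdp, alpha, rho, horizon):
    """Iterate the fitted AR(1) forward and cumulate the growth rates into log levels."""
    last_level = log_gdp[-1]
    last_diff = log_gdp[-1] - log_gdp[-2]
    path = []
    for _ in range(horizon):
        last_diff = alpha + rho * last_diff
        last_level += last_diff
        path.append(last_level)
    return path

# Made-up pre-treatment log GDP per capita (illustrative only, not the Maddison data).
pre_period = [6.80, 6.83, 6.81, 6.86, 6.90, 6.89, 6.94, 6.97, 6.96, 7.01]
alpha, rho = fit_ar1_on_differences(pre_period)
counterfactual = forecast_levels(pre_period, alpha, rho, horizon=24)  # e.g. 1917-1940
```

Because the forecast uses only the country's own history, it ignores everything that happened in the rest of the world after the cutoff year, which is the limitation the text points out.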

Because I have data on other countries, I can use it to predict Russian GDP as well. The

first option is to run a cross-sectional regression of Russian GDP per capita on GDP per

capita of other countries before 1917 and then construct the predicted GDP per capita after

1917 simply as fitted values from that regression using actual data for other countries:

$$\log(\mathit{GDPpc})_{\mathrm{Russia},t} = \alpha + \sum_{j=1}^{k} \beta_j \log(\mathit{GDPpc})_{j,t} + \varepsilon_t,$$

where j indexes countries in the control group.

It is also possible to combine these two methods and to estimate an ARMA model like

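$$\log(\mathit{GDPpc})_{\mathrm{Russia},t} = \alpha + \sum_{j=1}^{p} \rho_j \log(\mathit{GDPpc})_{\mathrm{Russia},t-j} + \sum_{j=1}^{k} \beta_j \log(\mathit{GDPpc})_{j,t} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t$$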

The main challenge with this approach is that the number of the countries in the sample

is almost as large as the number of the pre-revolution observations, so estimating such a model

does not make much sense: it is possible to fit the observed GDP almost perfectly with so

many regressors. But such overfitting may lead to extremely inaccurate forecasts. However,

I can overcome this challenge by constraining the regression coefficients.

The first option to solve the overfitting problem is to use a lasso regression, first developed in Tibshirani (1996). A lasso estimator is defined as follows:

$$\min_{\beta} \sum_t \Big( \log(\mathit{GDPpc})_{\mathrm{Russia},t} - \alpha - \sum_{j=1}^{k} \beta_j \log(\mathit{GDPpc})_{j,t} \Big)^2 \quad \text{s.t.} \quad \sum_j |\beta_j| \le c$$

for some constant c ≥ 0. It is equivalent to solving

$$\min_{\beta} \sum_t \Big( \log(\mathit{GDPpc})_{\mathrm{Russia},t} - \alpha - \sum_{j=1}^{k} \beta_j \log(\mathit{GDPpc})_{j,t} \Big)^2 + \lambda \sum_j |\beta_j|,$$

where λ is a penalty parameter. I choose the penalty parameter using leave-one-out cross-validation for each country separately. A detailed discussion of this procedure is presented in Appendix A.1.

The idea behind the lasso is to penalize for having too many regressors by zeroing some

of the coefficients: because the augmented objective function is not smooth, the problem

typically has a corner solution in which some of the parameters are set to zero. Hence, using

the lasso helps to solve the problem of having an insufficient number of observations in the data

by zeroing out the coefficients on the variables with low explanatory power.

In this paper I apply the lasso as follows: first, I normalize the control variables so that

they all have zero mean and unit variance, then choose the penalty parameter as discussed in

Appendix A.1, then run the lasso with this selected penalty parameter using the observations

for 1885–1916. After that, I keep only the selected countries and run OLS with these countries

and a constant (again using 1885–1916) in order to avoid biasing the coefficients towards zero

(hence, I use the lasso for the selection but not for the estimation). Finally, I construct the

fitted values for 1917–1940 using the resulting estimates.
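The select-then-refit procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the data are made up, the penalty `lam` is fixed by hand instead of being chosen by leave-one-out cross-validation, the lasso is solved by plain coordinate descent, and the refit uses the standardized controls (which gives the same fitted values as the raw controls because a constant is included).

```python
from statistics import mean, pstdev

def standardize(column):
    """Rescale a series to zero mean and unit variance."""
    m, s = mean(column), pstdev(column)
    return [(v - m) / s for v in column]

def soft_threshold(z, gamma):
    """Shrink z towards zero by gamma; values within [-gamma, gamma] become exactly 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso(columns, y, lam, n_iter=200):
    """Coordinate descent for sum_t (y_t - a - sum_j b_j x_jt)^2 + lam * sum_j |b_j|.

    `columns` must already be standardized, so the intercept is simply mean(y).
    """
    a = mean(y)
    beta = [0.0] * len(columns)
    resid = [yt - a for yt in y]
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Inner product of column j with the residual that excludes its own contribution.
            z = sum(x * (r + beta[j] * x) for x, r in zip(col, resid))
            new_b = soft_threshold(z, lam / 2.0) / sum(x * x for x in col)
            resid = [r - (new_b - beta[j]) * x for x, r in zip(col, resid)]
            beta[j] = new_b
    return a, beta

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            M[r] = [mr - f * mi for mr, mi in zip(M[r], M[i])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def ols_fitted(columns, y):
    """Fitted values from OLS of y on a constant and the given columns (normal equations)."""
    Z = [[1.0] * len(y)] + columns
    A = [[sum(u * v for u, v in zip(zi, zj)) for zj in Z] for zi in Z]
    b = [sum(u * v for u, v in zip(zi, y)) for zi in Z]
    coef = solve(A, b)
    return [sum(c * z[t] for c, z in zip(coef, Z)) for t in range(len(y))]

# Made-up log GDP per capita for a treated country and three controls, 8 pre-treatment years.
treated = [6.80, 6.84, 6.87, 6.93, 6.96, 7.01, 7.03, 7.08]
controls = [
    [7.50, 7.55, 7.57, 7.64, 7.66, 7.72, 7.73, 7.79],
    [6.20, 6.18, 6.25, 6.21, 6.27, 6.22, 6.28, 6.24],
    [8.10, 8.05, 8.12, 8.02, 8.09, 8.00, 8.07, 7.99],
]
X = [standardize(c) for c in controls]
lam = 0.1  # assumption: fixed by hand; the paper chooses it by leave-one-out cross-validation
intercept, beta = lasso(X, treated, lam)
selected = [j for j, b in enumerate(beta) if b != 0.0]
# Post-lasso step: unpenalized OLS on the selected controls only.
post_lasso_fit = ols_fitted([X[j] for j in selected], treated)
```

Refitting by OLS removes the shrinkage that the penalty imposes on the retained coefficients, so the lasso acts purely as a variable-selection device here. Forecasting beyond the estimation window would additionally require standardizing the post-treatment controls with the pre-treatment means and standard deviations.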

In addition to running the lasso with only the current levels of GDP per capita in other

countries as controls, I run it with other sets of controls as well. Table 2 lists all specifications that I use.

Table 2: Control Variables for Lasso

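| Specification | Name | Control Variables |
|---|---|---|
| Lasso–1 | “Baseline” | $\big(\log(\mathit{GDPpc})_{j,t}\big)_{j=2}^{31}$ |
| Lasso–2 | “Lags & Russia” | $\big(\log(\mathit{GDPpc})_{j,t}, \log(\mathit{GDPpc})_{j,t-1}, \log(\mathit{GDPpc})_{j,t-2}\big)_{j=2}^{31}$, $\log(\mathit{GDPpc})_{\mathrm{Russia},t-1}$, $\log(\mathit{GDPpc})_{\mathrm{Russia},t-2}$ |
| Lasso–3 | “Lags” | $\big(\log(\mathit{GDPpc})_{j,t}, \log(\mathit{GDPpc})_{j,t-1}, \log(\mathit{GDPpc})_{j,t-2}\big)_{j=2}^{31}$ |
| Lasso–4 | “Lags & Leads & Russia” | $\big(\log(\mathit{GDPpc})_{j,t}, \log(\mathit{GDPpc})_{j,t-1}, \log(\mathit{GDPpc})_{j,t-2}, \log(\mathit{GDPpc})_{j,t+1}, \log(\mathit{GDPpc})_{j,t+2}\big)_{j=2}^{31}$, $\log(\mathit{GDPpc})_{\mathrm{Russia},t-1}$, $\log(\mathit{GDPpc})_{\mathrm{Russia},t-2}$ |
| Lasso–5 | “Lags & Leads” | $\big(\log(\mathit{GDPpc})_{j,t}, \log(\mathit{GDPpc})_{j,t-1}, \log(\mathit{GDPpc})_{j,t-2}, \log(\mathit{GDPpc})_{j,t+1}, \log(\mathit{GDPpc})_{j,t+2}\big)_{j=2}^{31}$ |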

The second option to solve the overfitting problem is to use the synthetic control method.

It is somewhat similar to the lasso, but uses different restrictions on the coefficients. It finds

a combination of control units (i.e., other countries in the sample) with non-negative weights

that sum up to one in a way that minimizes the weighted sum of squared differences between

the treated and synthetic units during the pre-treatment period:

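$$\min_{\beta} \sum_t w_t \Big( \log(\mathit{GDPpc})_{\mathrm{Russia},t} - \sum_{j=1}^{k} \beta_j \log(\mathit{GDPpc})_{j,t} \Big)^2 \quad \text{s.t.} \quad \sum_j \beta_j = 1, \quad \beta_j \ge 0 \;\; \forall j$$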

The synthetic control method is similar to the lasso in the sense that it also imposes

constraints on the ordinary or weighted least squares coefficients, but the constraints are

more demanding. Since the synthetic control method requires all coefficients to be non-

negative and sum up to one, the resulting synthetic control unit can be interpreted as a

weighted average of the members of the control group.
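The constrained least squares problem above can be sketched with projected gradient descent onto the simplex. This is only one possible solver and not necessarily the one used in the paper. The sketch assumes equal time weights (w_t = 1), uses made-up data, and all function names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_j >= 0 for all j, sum_j w_j = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        candidate = (cumulative - 1.0) / i
        if ui - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]

def synthetic_control_weights(controls, treated, n_iter=5000):
    """Minimize sum_t (treated_t - sum_j w_j * controls[j][t])^2 over the simplex.

    `controls` is a list of k series, each as long as `treated`. Time weights are set to 1.
    """
    k, T = len(controls), len(treated)
    # Conservative step size: 2 * (sum of squares) bounds the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * sum(x * x for col in controls for x in col))
    w = [1.0 / k] * k  # start from equal weights
    for _ in range(n_iter):
        resid = [treated[t] - sum(w[j] * controls[j][t] for j in range(k)) for t in range(T)]
        grad = [-2.0 * sum(controls[j][t] * resid[t] for t in range(T)) for j in range(k)]
        w = project_to_simplex([w[j] - step * grad[j] for j in range(k)])
    return w

# Made-up pre-treatment log GDP per capita for three control countries.
controls = [
    [7.50, 7.55, 7.57, 7.64, 7.66, 7.72],
    [6.20, 6.24, 6.25, 6.31, 6.37, 6.42],
    [8.10, 8.05, 8.12, 8.02, 8.09, 8.00],
]
# The treated series is built as a mix of the first two controls, so a good fit exists.
treated = [0.6 * a + 0.4 * b for a, b in zip(controls[0], controls[1])]

weights = synthetic_control_weights(controls, treated)
synthetic = [sum(w * col[t] for w, col in zip(weights, controls)) for t in range(len(treated))]
```

Because the weights are non-negative and sum to one, the synthetic series never leaves the range spanned by the control countries in any given year, which is what makes the result readable as a weighted average of real countries.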

The synthetic control method has its advantages and disadvantages. The main advantage is that it is very intuitive and it clearly shows what the counterfactual would have looked like. The cost of that, however, is that the method is completely atheoretical: it is not based on any economic model, so it cannot take into account possible policy changes which could have changed the predictive relationship. For example, before World War I began, the Russian government started some important reforms, but it did not have time to implement them

fully, and it is not clear what the effect of those reforms would have been if they had been

completed. As for now, I abstract from the potential effect of the reforms, because this

question constitutes a separate research topic.

Given that the data for the end of the 19th and the beginning of the 20th century is

quite limited, I focus only on GDP per capita as the main explanatory variable, even though

I would like to use some variables that describe the sectoral composition of the economy as

well.

There is still a big question: how to choose the pre-treatment period over which to minimize the difference between the treated and synthetic control units. The data is available starting from 1885, and the Russian Revolution took place in 1917, so I have only about 30 years for matching. Ideally, I would want to do matching without using some observations just before 1917 in order to be able to use them as a placebo, i.e. to check that the synthetic control approximates the behavior of the treated unit right before treatment well enough. However, there are two issues with this approach: first, the relatively small number of observations before treatment; second, the fact that Russia took part in World War I, which caused its GDP per capita to fall substantially in 1914–1916. The problem is that if I use pre-treatment years as a placebo, I cannot account for this decline in GDP per capita when constructing the synthetic control unit.

In order to solve these problems, I separate the tasks of selecting the best method and

constructing the counterfactual. In order to evaluate different methods and select the best

one, I start with placebo tests for the cases when I know what the true growth path without a revolution looks like.

First, I use other countries in my sample for placebo tests: I use different methods to

construct the counterfactual series for these countries starting from 1917, and then I compare

the counterfactual and the actual series. Since other countries did not experience a revolution,

I evaluate the methods based on the difference between the actual and counterfactual series

after 1917.

Second, I use Russia before 1917 as a placebo test. I divide the pre-revolutionary period

into two parts, the matching part and the testing part, using the Russian data to see how

well the methods I use predict Russian GDP per capita out-of-sample, but only before the

Revolution, when no treatment took place. Finally, based on the performance in placebo

tests, I choose the “best” methods and use them in the main analysis, now using the entire

pre-1917 period for matching.

I also consider using different subsamples of countries as controls. One may want to

match the counterfactual and actual units along other dimensions in addition to matching on

GDP per capita. I have data on whether a country was independent at that time, whether

it participated in World War I, and whether it is located in Europe.

Restricting the sample only to European countries seems problematic. First, even though

Russia is usually considered a European country, it does not entirely lie in Europe. Second,

since Russia was probably the poorest European country at that time, it might be hard to

find an appropriate synthetic unit. Asian or South American countries are likely to be a

better match.

As for World War I, it would be desirable to match on it, but there are several problems with it as well. First, it is hard to come up with a reasonable and objective definition of being a participant. Second, the countries that were most affected by World War I were European

countries, but, as discussed above, these countries did not match Russia well in terms of

economic development. Hence, using matching on World War I may be problematic, and I

discuss the challenges and possible solutions in the next section.

As for being independent, arguably, it is an important factor for the economic development of a country, so restricting the sample to independent countries can improve the quality

of counterfactual series. At the same time it is relatively easy to determine when each country

became independent. There are two ambiguities here. First, it is somewhat difficult to define

exactly when Canada, Australia, and New Zealand became independent from Great Britain;

however, these three countries had so-called responsible governments by the end of the 19th

century and arguably enjoyed a significant amount of independence. Hence, I keep them in

the sample for the purposes of this exercise. Second, Norway and Sweden formed a union

under the Swedish monarch until 1905, so formally Norway was not an independent country

until then. However, since these two countries still had separate laws, legislatures, armed

forces, etc., I treat them as independent as well. The countries that I drop from the full

sample are Finland, India, Indonesia, and Sri Lanka.

Finally, I should note that one might want to match Russia with other countries based

on population to control for the overall size of the economy instead of just using GDP per

capita. The difficulty with using population in my analysis is that Russia was one of the most

populous countries in the world at that time and the second most populous in my sample, which

makes it almost impossible to find a good match in terms of total population. Consequently,

I focus only on GDP per capita in my analysis, assuming that all economies were “scalable”

(or, in economic terms, that there were constant returns to scale), so that Russia can be

represented as a small country “replicated” many times, and that the population density

does not matter for economic growth.

4 Results

This section presents the findings of my paper. It consists of several subsections: Subsection 4.1 discusses which method to use, Subsection 4.2 presents the main results of my paper, then Subsection 4.3 discusses some robustness checks and presents additional findings, and finally Subsection 4.4 evaluates the performance of the preferred methods using selected countries from the control group as placebos.

4.1 Choice of Preferred Specification

Before I present the main findings of the paper, I discuss how to pick the preferred method

from the methods considered above using placebo tests. I use two types of placebo tests.

The first one evaluates performance of various methods using the post-1917 period for the

other countries in my sample, while the second one uses the pre-1917 period for Russia by

breaking it into the matching and testing parts.

4.1.1 Other Countries as Placebo Tests

In this subsection, I drop Russia from the sample, construct counterfactual series for all

other countries using all methods under consideration, and then pick the method with the

“best” performance. I use the quadratic loss function and the absolute deviation loss function

in my analysis, so I concentrate on the sum of squared residuals and the sum of absolute

deviations of residuals as the measures of performance. The main underlying assumption of

this exercise is that other countries were similar to Russia, so that if a method performs well

for other countries it also performs well for Russia.
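The two performance measures, and their aggregation across placebo countries, can be sketched as follows. The exact construction is in Appendix A.2.1, so the version below is an assumption about its general shape, with made-up numbers and hypothetical country labels.

```python
from statistics import mean, median

def sse(actual, counterfactual):
    """Sum of squared differences between the actual and counterfactual series."""
    return sum((a - c) ** 2 for a, c in zip(actual, counterfactual))

def sad(actual, counterfactual):
    """Sum of absolute differences between the actual and counterfactual series."""
    return sum(abs(a - c) for a, c in zip(actual, counterfactual))

def summarize(loss_by_country):
    """Mean and median of a per-country loss, as reported for each method."""
    values = list(loss_by_country.values())
    return mean(values), median(values)

# Made-up post-1917 log GDP per capita paths for three placebo countries: (actual, counterfactual).
placebo_paths = {
    "A": ([7.0, 7.1, 7.2], [7.0, 7.0, 7.3]),
    "B": ([8.0, 8.1, 8.0], [8.1, 8.1, 8.2]),
    "C": ([6.5, 6.4, 6.6], [6.5, 6.6, 6.9]),
}
sse_by_country = {name: sse(act, cf) for name, (act, cf) in placebo_paths.items()}
mean_sse, median_sse = summarize(sse_by_country)
```

Reporting the median alongside the mean guards against a single badly predicted country dominating the comparison between methods.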

I have seven methods to choose from: the time series (ARIMA(1,1,0)), five different kinds

of the lasso (described in Table 2), and the synthetic control. Moreover, for every method, I

can use various pools of countries as controls: all countries in the dataset, only independent

countries, or only World War I participants.

In this subsection I do not match based on participation in World War I for several

reasons. First, it is hard to come up with a clear definition of being a participant. Second,

even if I could come up with such definition, not only the sample of potential controls, but

also the sample of countries for which I could run placebo tests would be limited to the war

participants. But such countries were mostly rich European countries which were significantly

different from Russia and do not necessarily form a good comparison group for Russia.

First, Table 3 reports the mean and median SSE across countries for each of the two samples of countries. The detailed discussion of the construction of these measures is presented in Appendix A.2.1. As we can see from the table, when all countries are used as controls (Panel A), “Lasso–2” yields the lowest mean SSE across countries, while “Lasso–4” yields the lowest median. The synthetic control method yields both the second-lowest mean and the second-lowest median in this case, and it has the best performance in terms of both mean and median when I restrict the controls to independent countries (Panel B). Hence, the results in Table 3 suggest that the synthetic control may be the best method to use.

I should note that the results in Table 3 are based on the sample of countries including

Germany and Italy, which experienced dramatic social, political, and economic changes in

the 1920s–1930s. Thus, using Germany and Italy for placebo tests can be problematic. However,

the ranking of the methods is not affected by dropping them from placebo tests.

Table 3: Comparison of Methods

Panel A: All Countries

| SSE | Time Series | Lasso–1 | Lasso–2 | Lasso–3 | Lasso–4 | Lasso–5 | Synthetic |
|---|---|---|---|---|---|---|---|
| Mean | 1.873 | 2.989 | **1.036** | 1.412 | 1.806 | 1.718 | 1.258 |
| Median | 0.461 | 0.580 | 0.573 | 0.457 | **0.411** | 0.812 | 0.449 |

Panel B: Independent Countries

| SSE | Time Series | Lasso–1 | Lasso–2 | Lasso–3 | Lasso–4 | Lasso–5 | Synthetic |
|---|---|---|---|---|---|---|---|
| Mean | 2.108 | 3.093 | 1.512 | 1.593 | 1.763 | 1.802 | **0.809** |
| Median | 0.508 | 0.593 | 1.031 | 0.654 | 0.415 | 0.870 | **0.362** |

In Panel A I use all 30 countries as controls, compute the SSE for each of these countries, and report the mean and median across these countries. In Panel B I use only 26 independent countries as controls, compute the SSE for each of these countries, and report the mean and median across these countries. I mark the lowest mean and median in each panel in bold. I discuss how to compute the SSE in more detail in Appendix A.2.1. The penalty parameters for the lasso are chosen via cross-validation, as described in Appendix A.1.

Note, however, that because the samples of countries for which I compute the SSE differ

across two panels of the table, I cannot directly compare different panels of the table: it is

possible that it is easier to predict GDP per capita for some countries and harder for others.


Hence, if I want to compare methods across panels, I need to make sure that I compute the

SSE for the same countries even if I use different samples of countries as controls.

To solve this problem and compare the methods across control samples, I next evaluate each

method with different samples of countries as controls while keeping fixed the sample of

countries for which I compute the SSE. In Table 4

I use all countries or only independent countries as controls, while computing the SSE only

for independent countries. As we can see from the table, now the synthetic control method

is the best for both samples of controls, and restricting the sample of control countries to

independent countries only improves both the mean and median SSE.

Table 4: Comparison of Methods

Panel A: All Countries

| SSE | Time Series | Lasso–1 | Lasso–2 | Lasso–3 | Lasso–4 | Lasso–5 | Synthetic |
|---|---|---|---|---|---|---|---|
| Mean | 2.108 | 3.389 | 1.170 | 1.598 | 2.048 | 1.845 | **1.125** |
| Median | 0.508 | 0.702 | 0.625 | 0.519 | 0.629 | 0.938 | **0.406** |

Panel B: Independent Countries

| SSE | Time Series | Lasso–1 | Lasso–2 | Lasso–3 | Lasso–4 | Lasso–5 | Synthetic |
|---|---|---|---|---|---|---|---|
| Mean | 2.108 | 3.093 | 1.512 | 1.593 | 1.763 | 1.802 | ***0.809*** |
| Median | 0.508 | 0.593 | 1.031 | 0.654 | 0.415 | 0.870 | ***0.362*** |

In Panel A I use all 30 countries as controls, compute the SSE only for 26 independent countries, and report the mean and median across these countries. In Panel B I use only 26 independent countries as controls, compute the SSE for each of these countries, and report the mean and median across these countries. I mark the lowest mean and median in each panel in bold, and the lowest mean and median across all samples in bold italic. I discuss how to compute the SSE in more detail in Appendix A.2.1. The penalty parameters for the lasso are chosen via cross-validation, as described in Appendix A.1.

To summarize this section, the synthetic control appears to be the preferred method

overall, but in some cases “Lasso–2” and “Lasso–4” also perform well. To shed

more light on the performance of the various methods, I next do the following: eliminate

all “strictly dominated” methods (i.e. the methods which are beaten by at least one other

method in all panels of Tables 3 and 4), which leaves me with “Lasso–2”, “Lasso–4”, and the

synthetic control method, and evaluate them using the pre-1917 data for Russia.
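
The elimination step can be reproduced mechanically from the reported numbers. The following is a minimal sketch under one reading of "strictly dominated": a method is dropped if some other single method has both a lower mean and a lower median SSE in every panel of Tables 3 and 4. The function name `dominates` and the data layout are illustrative choices, not part of the paper.

```python
# Mean and median SSE from Tables 3 and 4, in the column order below.
METHODS = ["Time Series", "Lasso-1", "Lasso-2", "Lasso-3",
           "Lasso-4", "Lasso-5", "Synthetic"]

ROWS = [
    [1.873, 2.989, 1.036, 1.412, 1.806, 1.718, 1.258],  # Table 3A, mean
    [0.461, 0.580, 0.573, 0.457, 0.411, 0.812, 0.449],  # Table 3A, median
    [2.108, 3.093, 1.512, 1.593, 1.763, 1.802, 0.809],  # Table 3B, mean
    [0.508, 0.593, 1.031, 0.654, 0.415, 0.870, 0.362],  # Table 3B, median
    [2.108, 3.389, 1.170, 1.598, 2.048, 1.845, 1.125],  # Table 4A, mean
    [0.508, 0.702, 0.625, 0.519, 0.629, 0.938, 0.406],  # Table 4A, median
    [2.108, 3.093, 1.512, 1.593, 1.763, 1.802, 0.809],  # Table 4B, mean
    [0.508, 0.593, 1.031, 0.654, 0.415, 0.870, 0.362],  # Table 4B, median
]


def dominates(a, b):
    """True if method index a has a strictly lower SSE than b in every row."""
    return all(row[a] < row[b] for row in ROWS)


# Keep a method only if no other method beats it everywhere.
survivors = [
    METHODS[j]
    for j in range(len(METHODS))
    if not any(dominates(i, j) for i in range(len(METHODS)) if i != j)
]
print(survivors)
```

Under this reading, the surviving set coincides with the three methods named in the text.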


4.1.2 Pre-1917 Russia as Placebo Test

In this subsection I divide the pre-revolutionary period into two parts and use it for

placebo tests, as described in detail in Appendix A.2.2. I restrict my attention to the three

methods that performed well in the previous group of placebo tests, “Lasso–2”, “Lasso–4”,

and the synthetic control method, but I enrich the set of specifications by using matching on

World War I.

I have three methods and four choices of control groups. As before, the first pool of

controls includes all countries; the second one is restricted to independent countries. The

third and fourth use only World War I participants with different definitions of participation.

The first one includes countries that lost at least 0.05% of their population or 10,000 people

in World War I; the second one includes only countries that lost at least 0.05% of their

population. The only difference is India, which satisfies the former criterion but not the latter;

it might nevertheless play an important role in matching because it was a poor country that might

enter the synthetic control unit with a relatively high weight.
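
The two participation definitions can be written as a simple rule. This is a minimal sketch: the function name `ww1_participant` is hypothetical, and the first example country uses made-up round numbers chosen only to show a case that passes the first definition but not the second. The second example uses the Russian figures reported in Table 7.

```python
def ww1_participant(deaths, population, definition):
    """Classify a country under the two definitions used in the text.

    definition 1: deaths >= 0.05% of population OR deaths >= 10,000
    definition 2: deaths >= 0.05% of population
    """
    share_ok = deaths / population >= 0.0005
    if definition == 1:
        return share_ok or deaths >= 10_000
    if definition == 2:
        return share_ok
    raise ValueError("definition must be 1 or 2")


# Hypothetical large country: many deaths in absolute terms, small share.
print(ww1_participant(60_000, 300_000_000, 1))  # passes definition 1
print(ww1_participant(60_000, 300_000_000, 2))  # fails definition 2

# Russia, using the population and deaths reported in Table 7.
print(ww1_participant(1_700_000, 156_192_000, 2))
```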

Table 5 presents the results of these placebo tests. As we can see from the table, if we

compare methods for the same choice of control groups, the synthetic control method yields

the lowest SSE for almost every choice of the control sample. As for the comparison across

control groups but within a given matching period, the synthetic control yields the best

results in two cases out of three, but it is not clear whether restricting the control group

helps. As for the various lasso-based methods, it is difficult to rank them: they perform

very differently depending on the matching period and the controls choice. What is clear,

however, especially with “Lasso–2”, is that restricting the sample typically helps. It is quite

intuitive: the lasso is very flexible, and having too many potential controls may lead to

overfitting. Hence, even though the lasso tries to select appropriate controls in a data-driven

way, restricting the pool of controls beforehand may be helpful.

To conclude, I find that while the synthetic control method typically compares favorably

to the other methods under consideration, the placebo tests are inconclusive regarding the


preferable choice of the control group for this method. Hence, I will use the synthetic control

method with different control groups as the baseline, and I will present the results of the

lasso when I do robustness checks.

Table 5: Comparison of Methods

Panel A: Matching Period 1885–1904

| Controls | Lasso–2 | Lasso–4 | Synthetic |
|---|---|---|---|
| All | 0.700 | 0.553 | **0.195** |
| Independent | 1.376 | 0.359 | **0.264** |
| WW1–1 | 1.109 | 1.338 | ***0.028*** |
| WW1–2 | 0.289 | 1.338 | **0.053** |

Panel B: Matching Period 1891–1910

| Controls | Lasso–2 | Lasso–4 | Synthetic |
|---|---|---|---|
| All | 0.274 | 0.075 | ***0.047*** |
| Independent | 3.211 | 0.075 | **0.064** |
| WW1–1 | **0.075*** | 0.162 | 0.091 |
| WW1–2 | **0.062** | 0.162 | 0.195 |

Panel C: Matching Period 1897–1916

| Controls | Lasso–2 | Lasso–4 | Synthetic |
|---|---|---|---|
| All | 1.074 | 0.844 | **0.559** |
| Independent | 1.356 | 0.844 | **0.559** |
| WW1–1 | 0.364* | 0.333* | ***0.260*** |
| WW1–2 | **0.353** | 0.406* | 0.374 |

I discuss how to compute the SSE in detail in Appendix A.2.2. I mark the lowest SSE in each row in bold, and the lowest SSE in each panel in bold italic. The penalty parameters for the lasso are chosen via cross-validation when the entire 1885–1917 period is used for matching as described in Appendix A.1, and I use these parameters for all matching sub-periods. The only exceptions, marked by *, are when the cross-validated penalty leads to no controls being selected. “All” refers to the specification that uses all countries in the pool of controls. “Independent” refers to the specification that uses only independent countries in the pool of controls. “WW1–1” refers to the specification that uses countries that lost at least 0.05% of their population or 10,000 people in World War I in the pool of controls. “WW1–2” refers to the specification that uses countries that lost at least 0.05% of their population in World War I in the pool of controls.


4.2 Main Findings

In this section I present the results of the synthetic control method with different choices

of control groups, since the synthetic control method performs well in the placebo tests but

there is no clear ranking of choices of the control group.

I consider four variations of the synthetic control method. I have already discussed two

specifications in detail: the one that uses all countries as potential controls and the one that

uses only the independent countries as potential controls. The third specification uses all

countries but Finland as potential controls. The reason is that Finland, which is given a

high weight in the specification with all countries, was a part of the Russian Empire before the

Revolution, so it likely was affected by the Revolution as well. Hence, to rule out possible

spillovers, I run this specification as a robustness check. The fourth specification uses only

World War I participants as controls, and it uses the first definition of participation: military

deaths equal to at least 0.05% of the country’s population or at least 10,000 people.

Before discussing the counterfactual GDP per capita series, I first look at the properties

of the synthetic control units. Table 6 presents the composition of the synthetic control unit

in various specifications, while Table 7 compares the actual Russian country characteristics

with the characteristics of different synthetic control units.

Several things regarding the composition of the synthetic control units and their charac-

teristics are worth noting.


Table 6: Weights for Synthetic Control Units

| Country | All Countries | No Finland | Independent | WW1 |
|---|---|---|---|---|
| Finland | 0.453 | – | – | 0.630 |
| Sweden | 0 | 0.236 | 0 | – |
| Portugal | 0 | 0 | 0.320 | 0.027 |
| Argentina | 0 | 0.062 | 0.048 | – |
| Peru | 0.084 | 0.080 | 0.163 | – |
| Venezuela | 0 | 0 | 0.027 | – |
| India | 0.217 | 0.299 | – | 0.343 |
| Japan | 0 | 0 | 0.443 | – |
| Sri Lanka | 0.246 | 0.323 | – | – |

The table presents the composition of the synthetic control units in various specifications. The first one uses all countries as potential controls, the second one uses all countries but Finland as potential controls, the third one uses only the independent countries as potential controls, and the fourth one uses only World War I participants as potential controls. Entries with “0” mean that the country was in the pool of potential controls but was assigned zero weight; entries with “–” mean that the country was dropped from the pool of potential controls.

Table 7: Balancing of Actual and Synthetic Control Units

| Unit | Europe | Independent | Population, 1913 (thousands) | WW1 Deaths (thousands) | WW1 Deaths (% of population) |
|---|---|---|---|---|---|
| Actual | 1 | 1 | 156,192 | 1,700.0 | 1.09 |
| All | 0.453 | 0.084 | 68,818 | 17.4 | 0.40 |
| No Finland | 0.236 | 0.378 | 94,505 | 7.5 | 0.00 |
| Independent | 0.320 | 1 | 25,947 | 2.4 | 0.04 |
| WW1–1 | 0.657 | 0.027 | 106,237 | 25.5 | 0.56 |

The table compares the characteristics of Russia with the characteristics of the synthetic control units in various specifications. The first one uses all countries as potential controls, the second one uses all countries but Finland as potential controls, the third one uses only the independent countries as potential controls, and the fourth one uses only World War I participants as potential controls. All characteristics of the synthetic control units are computed simply as weighted averages of individual characteristics of the countries that compose the synthetic control unit.
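
As an illustration of the weighted-average construction, the sketch below recomputes the "Europe" and "Independent" columns of Table 7 for two specifications from the weights in Table 6. The 0/1 indicator coding of each country is an assumption made here for illustration (it is consistent with the reported values), and the function name `weighted_average` is hypothetical.

```python
# Nonzero weights from Table 6 for two of the specifications.
WEIGHTS = {
    "All": {"Finland": 0.453, "Peru": 0.084, "India": 0.217, "Sri Lanka": 0.246},
    "No Finland": {"Sweden": 0.236, "Argentina": 0.062, "Peru": 0.080,
                   "India": 0.299, "Sri Lanka": 0.323},
}

# Assumed indicator coding (1 = yes, 0 = no) for the pre-1917 period.
EUROPE = {"Finland": 1, "Sweden": 1, "Argentina": 0, "Peru": 0,
          "India": 0, "Sri Lanka": 0}
INDEPENDENT = {"Finland": 0, "Sweden": 1, "Argentina": 1, "Peru": 1,
               "India": 0, "Sri Lanka": 0}


def weighted_average(weights, characteristic):
    """Characteristic of the synthetic unit as a weighted average."""
    return sum(w * characteristic[country] for country, w in weights.items())


for spec, w in WEIGHTS.items():
    europe = round(weighted_average(w, EUROPE), 3)
    independent = round(weighted_average(w, INDEPENDENT), 3)
    print(spec, europe, independent)
```

With this coding, the "All" specification gives 0.453 and 0.084, and the "No Finland" specification gives 0.236 and 0.378, matching the corresponding rows of Table 7.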


First, in most specifications a relatively low weight is given to European countries. Second,

the first two specifications assign relatively low weight to independent countries. Third, all

specifications yield the synthetic control units with significantly lower World War I deaths

as a percentage of population than Russia actually experienced. This is consistent with

Russia being poorer than other European countries at that time, so that most European

countries do not serve as good controls for Russia. In fact, the only European country that

is assigned high weight in these specifications is Finland (with more than 45% weight in the

first specification), which was a part of the Russian Empire before the revolution and also was

relatively poor compared to other European countries. At the same time, it was European

countries that actively participated in World War I, and European countries constituted a

large share of independent countries as well.

In other words, there is a trade-off between making the synthetic control unit similar

to Russia in terms of economic development, i.e. GDP per capita, and making it similar to

Russia across the characteristics such as being European, being independent, or participating

in World War I. Russia was quite unique in that it was an independent and European country

and it played a significant role in World War I, but at the same time it was far poorer than

other European countries.

Now I move on to the main results of my paper: the performance of the counterfactual

unit in terms of GDP per capita. Figure 2 presents the results from four specifications of the

synthetic control method.

Even though there are differences between these specifications, the overall pattern of the

results is similar. They all match the pre-1917 behavior well, with the quality of matching

being higher when I use less restrictive pools of potential controls. As for the post-1917

behavior, the second scenario is “pessimistic”, the third one is “optimistic”, and the first

and fourth ones are “moderate”, but in general the behavior of the synthetic control unit is

similar: all synthetic control units steadily grow in 1920s, then experience a crisis associated

with the Great Depression, and then start recovering in the second half of 1930s. Depending


on the specification, actual Russian GDP per capita catches up with the counterfactual one

somewhere between 1933 and 1937.

Figure 2: Actual and Synthetic Russian GDP per Capita

The natural logarithm of the actual GDP per capita is plotted as a dashed line, the synthetic series is plotted

as a solid line. The vertical line marks 1917, the year when the Revolution took place. The upper-left graph

shows the results of the synthetic control method when all countries in the sample are included as potential

controls. The upper-right graph shows the results of the synthetic control method when all countries except

Finland are included as potential controls. The lower-left graph shows the results of the synthetic control

method when only the independent countries are included as potential controls. The lower-right graph shows

the results of the synthetic control method when only countries that lost at least 0.05% of their population

or 10,000 people in World War I are included as potential controls.

If we believe that these counterfactuals are plausible, without the Revolution Russia


would have grown at a rate comparable with, if not higher than, the developed countries.

For instance, in 1917–1929 (i.e. before the Great Depression) the GDP per capita annual

growth rates for the “optimistic” and “moderate” scenarios are 2.6–3%, which is higher than

the American (2.3%) or Japanese (1.6%) growth rates over the same period. Even for the

“pessimistic” scenario the annual growth rate is 1.5%, which is pretty close to the Japanese

one.

In 1930s, during the Great Depression, there is an economic crisis in all scenarios, with the

average annual growth rates being virtually 0 for the “pessimistic” scenario, about 0.45–0.8%

for the “moderate” one, and almost 1.5% for the “optimistic” one. As a comparison, the American

average growth rate over the same period was 0.15%, and the Japanese one was an impressive 3.2%,

since Japan was virtually unaffected by the Great Depression.

The average annual growth rate over the entire period 1917–1940 based on the synthetic

control method is about 0.8% for the “pessimistic” scenario, 1.6–2% for the “moderate” one,

and about 2% for the “optimistic” one. Again, as a comparison, the American average annual

growth rate over the same period was 1.3%, and the Japanese was 2.4%.

Table 8 concisely summarizes the results of the comparison between the synthetic control

unit, USA, and Japan. It takes the lower of two “moderate” scenarios as a baseline.

Table 8: Growth Rates in Comparison

| Period | Counterfactual Russia | USA | Japan |
|---|---|---|---|
| 1917–1929 | ≈2.6% | 2.31% | 1.65% |
| 1930–1940 | ≈0.5% | 0.15% | 3.23% |
| 1917–1940 | ≈1.6% | 1.27% | 2.40% |

The table compares growth rates of counterfactual Russia with the growth rates of Japan and USA. The results for Russia are roughly based on the synthetic control method when all countries are included as controls. It leads to higher growth rates than the specification when Finland is omitted from the pool of controls, but lower growth rates than when I restrict the sample to independent countries or to World War I participants. The growth rates are computed as geometric means over the corresponding periods.
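
Because the rates are geometric means, the full-period rate follows from compounding the two sub-period rates. The sketch below checks this for the rows of Table 8. It assumes 12 annual growth steps in the first window (1917 to 1929) and 11 in the second (1929 to 1940); this split is an assumption made here, not something stated in the table, and the function name `combined_growth` is hypothetical.

```python
def combined_growth(rate_a, years_a, rate_b, years_b):
    """Geometric-mean annual growth over two consecutive windows."""
    total = (1 + rate_a) ** years_a * (1 + rate_b) ** years_b
    return total ** (1 / (years_a + years_b)) - 1


# Sub-period rates taken from Table 8; the 12/11 split is assumed.
usa = combined_growth(0.0231, 12, 0.0015, 11)
japan = combined_growth(0.0165, 12, 0.0323, 11)
russia = combined_growth(0.026, 12, 0.005, 11)

print(f"USA {usa:.2%}, Japan {japan:.2%}, counterfactual Russia {russia:.1%}")
```

Under this assumed split, the compounded rates agree with the 1917–1940 row of the table after rounding.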

It is probably true that the Soviet industrialization of 1930s promoted faster growth and

helped to achieve higher levels of GDP per capita than would have been possible without

the Revolution. On top of that, the Soviet Union was virtually unaffected by the Great


Depression: even though the actual growth rates slowed a bit, the recovery was very fast, and

growth rates in 1930s were quite high. However, the industrialization

was also associated with the Soviet Famine of 1932–1933 and the Great Purge, during which

several million people died. This decline in population would by itself

have increased GDP per capita even if GDP had stayed constant.

Overall, the Revolution probably allowed Russia to reach higher levels of economic de-

velopment by the end of 1930s, but it also was associated with a huge decrease in the GDP

per capita in 1920s, while if the Revolution had not happened, Russia would probably have

grown more consistently.

4.3 Additional Results

In this section, I present some robustness checks. I start by looking at the sensitivity of

the synthetic control method to the choice of the matching period. I use three alternative

choices: 1885–1904, 1891–1910, and 1897–1916. That is, I match Russia with the control

unit only using a particular subperiod of 1885–1917, and then construct the synthetic series

for the entire period 1885–1940.

Table 9 presents the weights that are obtained when I use different subperiods for matching

and compares them to the weights that I get when I use 1885–1916 for matching. Table 10

describes the characteristics of the various synthetic control units.


Table 9: Weights for Synthetic Control Units

| Country | 1885–1917 | 1885–1904 | 1891–1910 | 1897–1916 |
|---|---|---|---|---|
| Finland | 0.453 | 0 | 0.459 | 0 |
| Netherlands | 0 | 0 | 0 | 0.068 |
| Portugal | 0 | 0 | 0.107 | 0.375 |
| Australia | 0 | 0 | 0 | 0.062 |
| Argentina | 0 | 0.234 | 0.065 | 0 |
| Peru | 0.084 | 0.436 | 0 | 0 |
| Uruguay | 0 | 0 | 0 | 0.025 |
| Venezuela | 0 | 0 | 0 | 0.47 |
| India | 0.217 | 0 | 0.368 | 0 |
| Japan | 0 | 0.143 | 0 | 0 |
| Sri Lanka | 0.246 | 0.187 | 0 | 0 |

The table presents the composition of the synthetic control units in various robustness check specifications. The first one uses the entire 1885–1916 period for matching. The second one uses only 1885–1904 for matching; the third one uses only 1891–1910 for matching; and the fourth one uses only 1897–1916 for matching. Entries with “0” mean that the country was in the pool of potential controls but was assigned zero weight.

Table 10: Balancing of Actual and Synthetic Control Units

| Unit | Europe | Independent | Population, 1913 (thousands) | WW1 Deaths (thousands) | WW1 Deaths (% of population) |
|---|---|---|---|---|---|
| Actual | 1 | 1 | 156,192 | 1,700.0 | 1.09 |
| 1885–1917 | 0.453 | 0.084 | 68,818 | 17.4 | 0.40 |
| 1885–1904 | 0 | 0.813 | 11,952 | 0.0 | 0.00 |
| 1891–1910 | 0.566 | 0.172 | 114,287 | 22.1 | 0.42 |
| 1897–1916 | 0.443 | 1 | 4,338 | 6.4 | 0.12 |

The table compares the characteristics of Russia with the characteristics of the synthetic control units in various robustness check specifications. The first one uses the entire 1885–1916 period for matching. The second one uses only 1885–1904 for matching; the third one uses only 1891–1910 for matching; and the fourth one uses only 1897–1916 for matching. All characteristics of the synthetic control units are computed simply as weighted averages of individual characteristics of the countries that compose the synthetic control unit.


Figure 3 plots the actual and counterfactual series. Each plot contains two counterfactu-

als: the one obtained when the entire 1885–1916 period is used for matching, and the one

obtained when only a particular sub-period is used for matching.

Figure 3: Actual and Synthetic Russian GDP per Capita, Robustness Checks

The natural logarithm of the actual GDP per capita is plotted as a dashed line, the synthetic series that

uses the entire 1885–1916 period for matching is plotted as a solid line, and the series that use different

sub-periods of the 1885–1916 period for matching are plotted as a dash-dotted line. The vertical line marks

1917, the year when the Revolution took place. The dash-dotted line in the upper-left graph uses 1885–1904

for matching; the dash-dotted line in the upper-right graph uses 1891–1910 for matching; and the dash-dotted

line in the lower graph uses 1897–1916 for matching.

The results are clearly sensitive to the choice of the matching


period, both in terms of the composition of the synthetic control unit and in terms of the

counterfactual GDP per capita series. Only when I use the middle part of the pre-1917

period, i.e. 1891–1910, for matching do I get results very similar to the original ones. But

when I use only the beginning of the pre-1917 period, i.e. 1885–1904, or only the end, i.e.

1897–1916, I get strikingly different results.

Now I move on to the lasso. Figure 4 presents the results from four specifications of

“Lasso–2” that differ in the choice of the control group. The first two specifications appear to

overfit in-sample, resulting in an almost perfect fit in 1885–1916, but then predict a very strange

pattern of results for 1917–1940: first, very fast economic growth in 1920s, then a very sharp

decline in GDP per capita in 1930s. This pattern is hard to explain, since no country

in the control group experienced such a decline in GDP per capita; the only possible

explanation is that some countries that grew quickly in 1930s got negative weights. The third

and fourth specifications look much more plausible, and in general the pattern of the results

is similar to the synthetic control method described above.
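
The role of negative weights can be seen in a toy example. The sketch below uses two made-up log GDP per capita series, both rising, and is purely illustrative: with non-negative weights that sum to one (as in the synthetic control method) the combined series must rise too, whereas an unconstrained regression-style combination with a negative weight on the fast grower can fall. All numbers and the function name `combine` are hypothetical.

```python
def combine(weights, series):
    """Weighted combination of several series, period by period."""
    n_periods = len(series[0])
    return [sum(w * s[t] for w, s in zip(weights, series))
            for t in range(n_periods)]


# Hypothetical log GDP per capita paths: neither control ever declines.
slow = [1.00, 1.01, 1.02]
fast = [1.00, 1.10, 1.20]

convex = combine([0.5, 0.5], [slow, fast])          # synthetic-control style
unconstrained = combine([1.5, -0.5], [slow, fast])  # negative weight on `fast`

print(convex)         # rises in every period
print(unconstrained)  # falls in every period
```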


Figure 4: Actual and “Lasso–2” Russian GDP per Capita

The natural logarithm of the actual GDP per capita is plotted as a dashed line, the lasso series is plotted

as a solid line. The vertical line marks 1917, the year when the Revolution took place. The upper-left

graph shows the results of “Lasso–2” method when all countries in the sample are included as potential

controls. The upper-right graph shows the results of “Lasso–2” method when only the independent countries

are included as potential controls. The lower-left graph shows the results of “Lasso–2” method when only

the countries that lost at least 0.05% of their population or 10,000 people in World War I are included as

potential controls. The lower-right graph shows the results of “Lasso–2” method when only the countries

that lost at least 0.05% of their population in World War I are included as potential controls. The penalty

parameters for the lasso are chosen via cross-validation, as described in Appendix A.1.

To summarize the results from different methods, the synthetic control yields plausible

counterfactuals for Russia, while the lasso predictions do not seem realistic. The synthetic


control method also performs best in the placebo tests. This is because the lasso tends to

overfit in-sample, which, in turn, leads to worse out-of-sample properties. Since the synthetic

control method imposes more restrictions, it performs better in out-of-sample predictions, at

least according to the results of my placebo tests.

4.4 Placebo Tests: Illustrations

This subsection presents the results of the placebo tests for selected other countries to

illustrate whether the methods I use predict GDP per capita well for the countries that

actually did not experience the revolution.

I consider four countries: Finland, which was a part of the Russian Empire before the

Revolution and arguably was the European country most similar to Russia; Japan, which

was a relatively poor country that experienced fast growth in the first half of the 20th century;

the USA; and the UK.

As I have said above, it is not clear if Finland is a good placebo test or not: it might have

been affected by the Revolution as well, but at the same time it did not switch to communism

or to a command economy, so its usefulness for placebo tests may depend on a particular

counterfactual scenario one has in mind. I do not distinguish between different scenarios

explicitly, but I consider Finland since it was an interesting case anyway: if we are interested

in a scenario in which there is a civil war but the Bolsheviks lose, then Finland might be a

good case to look at.

Figure 5 plots the actual and synthetic control unit GDP per capita for Finland. Since

Finland was not independent, I do not present the case when I use only independent countries

as controls. As we can see from the figure, the synthetic control method does a decent job

predicting post-1917 GDP per capita when all countries are used as potential controls, though

it somewhat underpredicts the actual GDP in 1930s. Once the pool of controls is restricted

to World War I participants, the performance of the synthetic control unit deteriorates: now

it underpredicts the actual GDP per capita much more significantly.


Figure 5: Actual and Synthetic Finnish GDP per Capita

The natural logarithm of the actual GDP per capita is plotted as a dashed line, the synthetic series is plotted as

a solid line. The vertical line marks 1917. The left graph shows the results of the synthetic control method

when all countries in the sample are included as potential controls. The right graph shows the results of the

synthetic control method when only countries that lost at least 0.05% of their population or 10,000 people

in World War I are included as potential controls.

Figure 6 plots the actual and synthetic control unit GDP per capita for Japan. Since

Japan does not satisfy my criteria for participating in World War I (even though formally

it participated in the war), I do not present the case when only World War I participants

are included as controls. When all countries are used as potential controls, the synthetic

control unit consistently underpredicts the actual series. When the pool of

controls is restricted to independent countries, the synthetic control unit still underpredicts

the actual GDP per capita in the beginning of 1920s, but it catches up with the actual series

by the beginning of 1930s, and overall does a good job predicting the pattern of economic

development in Japan.


Figure 6: Actual and Synthetic Japanese GDP per Capita


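The natural logarithm of the actual GDP per capita is plotted as a dashed line, the synthetic series is plotted as

a solid line. The vertical line marks 1917. The left graph shows the results of the synthetic control method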

when all countries in the sample are included as potential controls. The right graph shows the results of the

synthetic control method when only the independent countries are included as potential controls.

Figure 7 plots the actual and synthetic control unit GDP per capita for the USA. The

first two specifications, the one that uses all countries as potential controls and the one that

uses independent countries as potential controls, are very similar. They both underpredict

the actual GDP per capita in 1920s, but catch up with it after the Great Depression, in

1930s. The third specification, which restricts the pool of controls to World War I participants,

underpredicts the actual series throughout the entire post-1917 period.


Figure 7: Actual and Synthetic American GDP per Capita


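The natural logarithm of the actual GDP per capita is plotted as a dashed line, the synthetic series is plotted as

a solid line. The vertical line marks 1917. The upper-left graph shows the results of the synthetic control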

method when all countries in the sample are included as potential controls. The upper-right graph shows

the results of the synthetic control method when only the independent countries are included as potential

controls. The lower graph shows the results of the synthetic control method when only countries that lost at

least 0.05% of their population or 10,000 people in World War I are included as potential controls.

Figure 8 plots the actual and synthetic control unit GDP per capita for the UK. In

contrast to the previous three cases, all three specifications fail to match the actual GDP per capita

well even before 1917. After 1917, the first two specifications, the one that uses all countries

as potential controls and the one that uses independent countries as potential controls, are


very similar and both overpredict the actual GDP per capita significantly: the actual series

catches up with the synthetic control ones only at the end of 1930s. The third specification,

which restricts the pool of controls to World War I participants, predicts the general pattern

of economic growth much better, but it still oscillates around the actual series a lot, so the

discrepancy between the synthetic control unit and the actual unit in a given year is still

quite large.


Figure 8: Actual and Synthetic British GDP per Capita


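The natural logarithm of the actual GDP per capita is plotted as a dashed line, the synthetic series is plotted as

a solid line. The vertical line marks 1917. The upper-left graph shows the results of the synthetic control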

method when all countries in the sample are included as potential controls. The upper-right graph shows

the results of the synthetic control method when only the independent countries are included as potential

controls. The lower graph shows the results of the synthetic control method when only countries that lost at

least 0.05% of their population or 10,000 people in World War I are included as potential controls.

5 Conclusion

This paper analyzes the economic consequences of the Russian Revolution of 1917 using

modern econometric methods, such as the lasso and the synthetic control method. The


primary goal of this paper is to construct the counterfactual series of Russian GDP per

capita. Another goal of this paper is to consider various econometric methods that can be

useful in other settings when the number of controls is large relative to the sample size. I

use placebo tests to evaluate these methods, and the results of these tests suggest that the

synthetic control method is the most preferred one: by imposing more restrictions, it avoids

the overfitting problem and makes better out-of-sample predictions, while the lasso suffers

from the overfitting problem and makes unrealistic out-of-sample forecasts.

Different choices of potential controls yield different scenarios, but “on average” the annual

growth rates for the synthetic control unit are about 2.5% in 1917–1929, about 0.5% in

1930–1940, and about 1.6% over the entire period of 1917–1940. To put these results into

perspective, the synthetic control unit grows faster in 1917–1940 than the USA, but slower than

Japan. The industrialization of 1930s probably allowed the Soviet Union to grow somewhat

faster than otherwise would have been possible, but at the same time the industrialization

was associated with high costs.

This paper complements the existing literature, which has mostly relied on theory-based

simulations, by predicting Russian economic growth after 1917 in a data-driven way.

Even though I do not analyze the causes of the Revolution and the methods that I use do

not allow me to consider various policy changes separately, my results present new evidence

on how Russia might have developed without the Revolution of 1917.


A Appendix

A.1 Using Cross-Validation for Penalty Parameter Selection

I have T = 32 observations in the pre-treatment period (1885–1916) for every country. I

do the following to choose the penalty parameter λ:

1. Choose a grid of values of $\lambda$: 0.001, 0.0025, 0.005, 0.0075, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25.

2. For every value of $\lambda$, drop one observation at a time and run a lasso regression of GDP per capita in the country of interest on GDP per capita in other countries using the remaining $T - 1$ observations. That is, drop observation $t = 1$ and run the lasso for $t = 2, \dots, T$; then drop $t = 2$ and run the lasso for $t = 1, 3, \dots, T$, and so on.

3. Select the variables with nonzero coefficients in the lasso and run a usual OLS regression with these variables as controls, still using the same $T - 1$ observations. Call the resulting estimates $\hat{\beta}_{\lambda,-t}$. Construct fitted values for all $T$ observations using the resulting parameter estimates: $\hat{y}_{j,\lambda,-t} = x_j' \hat{\beta}_{\lambda,-t}$, $j = 1, \dots, T$.

4. For the observation $t$ that was dropped, compute a prediction error: $e_{\lambda,t} = y_t - \hat{y}_{t,\lambda,-t}$.

5. Calculate the sum of squared errors: $SSE_{\lambda} = \sum_{t=1885}^{1916} e_{\lambda,t}^2$.

6. Choose the value of λ from (1) that minimizes SSEλ.
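The six steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: the lasso is solved by a small coordinate-descent routine whose objective scaling, (1/(2n)) times the residual sum of squares plus λ times the L1 norm, is an assumption, as is the unpenalized intercept. The names `lasso_cd`, `post_lasso_loo_sse`, and `choose_lambda` are hypothetical, and the data are simulated rather than the actual GDP per capita series.

```python
import numpy as np

# Grid from step 1 of the procedure.
LAMBDA_GRID = [0.001, 0.0025, 0.005, 0.0075, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25]


def soft_threshold(z, g):
    """Soft-thresholding operator used in the lasso coordinate update."""
    return np.sign(z) * max(abs(z) - g, 0.0)


def lasso_cd(X, y, lam, n_sweeps=500, tol=1e-10):
    """Lasso slopes by coordinate descent (assumed objective scaling:
    (1/(2n)) * ||y - a - X b||^2 + lam * ||b||_1, intercept a unpenalized)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)             # centering absorbs the intercept
    r = y - y.mean()                    # residual when all slopes are zero
    col_ss = (Xc ** 2).sum(axis=0) / n
    b = np.zeros(p)
    for _ in range(n_sweeps):
        max_change = 0.0
        for k in range(p):
            if col_ss[k] == 0.0:
                continue                # constant column: slope stays at zero
            rho = Xc[:, k] @ r / n + col_ss[k] * b[k]
            new = soft_threshold(rho, lam) / col_ss[k]
            r -= Xc[:, k] * (new - b[k])  # keep the residual in sync
            max_change = max(max_change, abs(new - b[k]))
            b[k] = new
        if max_change < tol:
            break
    return b


def post_lasso_loo_sse(X, y, lam):
    """Steps 2-5: leave-one-out SSE of the post-lasso OLS predictions."""
    T = len(y)
    sse = 0.0
    for t in range(T):
        keep = np.arange(T) != t                      # drop observation t
        selected = np.flatnonzero(lasso_cd(X[keep], y[keep], lam))
        Z = np.column_stack([np.ones(T), X[:, selected]])
        beta, *_ = np.linalg.lstsq(Z[keep], y[keep], rcond=None)
        sse += (y[t] - Z[t] @ beta) ** 2              # error on dropped obs.
    return sse


def choose_lambda(X, y, grid=LAMBDA_GRID):
    """Step 6: return the grid value with the smallest SSE, plus all SSEs."""
    sse_by_lam = {lam: post_lasso_loo_sse(X, y, lam) for lam in grid}
    return min(sse_by_lam, key=sse_by_lam.get), sse_by_lam


# Simulated stand-in data: T = 32 "years" and 8 "control countries".
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 8))
y = 1.0 + 0.6 * X[:, 0] - 0.4 * X[:, 1] + 0.05 * rng.normal(size=32)
best_lam, sse_by_lam = choose_lambda(X, y)
```

Because the second stage is an unpenalized OLS on the selected variables, λ here only governs which controls enter the regression, which is why the cross-validation error is computed from the OLS predictions rather than from the lasso fit itself.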

A.2 Construction of Prediction Accuracy Measures

A.2.1 Using Other Countries as Placebo Tests

For every method m and every country c in the control group (that can vary depending on which sample of countries I use as controls), construct a counterfactual GDP per capita series $\log(\mathrm{GDPpc})_{m,c,t}$, $t = 1917, \ldots, 1940$, using the remaining countries from the control group as controls. Compute the prediction errors $e_{m,c,t} = \log(\mathrm{GDPpc})_{m,c,t} - \log(\mathrm{GDPpc})_{c,t}$. Compute the sum of squared errors: $\mathrm{SSE}_{m,c} = \sum_{t=1917}^{1940} e_{m,c,t}^2$.

Then use the following two accuracy measures:

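1. Average sum of squared errors: take the average over all countries in the placebo group, i.e.

$$\mathrm{SSE}_m = \frac{1}{N} \sum_{c=1}^{N} \mathrm{SSE}_{m,c},$$

where N is the number of countries in the placebo group.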

2. Median sum of squared errors.
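The placebo calculation above can be sketched as follows. This is an illustration only: `ols_fit_predict` is a hypothetical stand-in for a method m (a plain OLS on the other control countries), the function names are assumptions, and the panel is simulated rather than taken from the actual data.

```python
import numpy as np


def ols_fit_predict(y_pre, X_pre, X_post):
    """Hypothetical stand-in for method m: OLS of the placebo country on the
    remaining control countries, fitted on pre-1917 data only."""
    Z_pre = np.column_stack([np.ones(len(y_pre)), X_pre])
    beta, *_ = np.linalg.lstsq(Z_pre, y_pre, rcond=None)
    Z_post = np.column_stack([np.ones(len(X_post)), X_post])
    return Z_post @ beta


def placebo_sse(log_gdp, n_pre, fit_predict):
    """SSE_{m,c} for every control country c.

    log_gdp: array of shape (years, countries); the first n_pre rows are the
    pre-treatment years and the remaining rows are 1917-1940.
    """
    n_countries = log_gdp.shape[1]
    sse = np.empty(n_countries)
    for c in range(n_countries):
        others = np.delete(log_gdp, c, axis=1)   # remaining control countries
        counterfactual = fit_predict(log_gdp[:n_pre, c],
                                     others[:n_pre], others[n_pre:])
        errors = counterfactual - log_gdp[n_pre:, c]
        sse[c] = np.sum(errors ** 2)
    return sse


# Simulated panel: 32 pre-treatment years, 24 post-treatment years, 6 countries.
rng = np.random.default_rng(0)
n_pre, n_post, n_countries = 32, 24, 6
trend = 0.015 * np.arange(n_pre + n_post)[:, None]
panel = 7.0 + trend + 0.02 * rng.normal(size=(n_pre + n_post, n_countries))

sse = placebo_sse(panel, n_pre, ols_fit_predict)
average_sse = sse.mean()       # measure 1: average over placebo countries
median_sse = np.median(sse)    # measure 2: median over placebo countries
```

The median is less sensitive than the average to a single placebo country whose actual path is predicted very poorly, which is one reason to report both.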

A.2.2 Using Pre-1917 Russia as Placebo Test

For every method m, divide the pre-1917 period into two parts: the matching and the

testing part. Use three such divisions: 1885–1904 for matching and 1905–1916 for testing,

1891–1910 for matching and 1885–1890 and 1911–1916 for testing, 1897–1916 for matching

and 1885–1896 for testing.

For every method m and every division, use the 20-year-long matching period to estimate the model and construct the counterfactual series for the entire period 1885–1916, $\log(\mathrm{GDPpc})_{m,t}$, $t = 1885, \ldots, 1916$. If the method involves lags or both lags and leads, I truncate the number of observations so that only information from within the matching period is used. Moreover, if I use lags, I cannot construct the counterfactual GDP for 1885 and 1886, and if I use both lags and leads, I cannot construct it for 1885, 1886, 1915, and 1916.

Next, construct prediction errors $e_{m,t} = \log(\mathrm{GDPpc})_{m,t} - \log(\mathrm{GDPpc})_{t}$. Then construct the SSE as follows:

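1. When the matching period is 1885–1904, compute $\mathrm{SSE}_m = \sum_{t=1905}^{1914} e_{m,t}^2$.

2. When the matching period is 1891–1910, compute $\mathrm{SSE}_m = \sum_{t=1887}^{1890} e_{m,t}^2 + \sum_{t=1911}^{1914} e_{m,t}^2$.

3. When the matching period is 1897–1916, compute $\mathrm{SSE}_m = \sum_{t=1887}^{1896} e_{m,t}^2$.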


For comparability across methods, I exclude the first two and the last two years (1885, 1886, 1915, and 1916) even when I can construct counterfactual GDP per capita for these years.
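The three matching/testing divisions and the excluded years can be sketched as follows. This is an illustration under stated assumptions: `ols_fit_predict` is a hypothetical stand-in for a method m without lags or leads, the function names are assumptions, and the data are simulated.

```python
import numpy as np

YEARS = np.arange(1885, 1917)                      # 1885, ..., 1916
MATCHING_PERIODS = [(1885, 1904), (1891, 1910), (1897, 1916)]
EXCLUDED_YEARS = (1885, 1886, 1915, 1916)          # dropped for every method


def evaluation_mask(match_start, match_end):
    """Testing years: outside the matching period and not excluded."""
    in_matching = (YEARS >= match_start) & (YEARS <= match_end)
    excluded = np.isin(YEARS, EXCLUDED_YEARS)
    return ~in_matching & ~excluded


def ols_fit_predict(y_match, X_match, X_all):
    """Hypothetical stand-in for method m: OLS fitted on the matching period,
    then used to predict every year 1885-1916."""
    Z_match = np.column_stack([np.ones(len(y_match)), X_match])
    beta, *_ = np.linalg.lstsq(Z_match, y_match, rcond=None)
    return np.column_stack([np.ones(len(X_all)), X_all]) @ beta


def pre_period_sse(y, X, match_start, match_end, fit_predict=ols_fit_predict):
    """SSE_m over the testing years of one division."""
    in_matching = (YEARS >= match_start) & (YEARS <= match_end)
    counterfactual = fit_predict(y[in_matching], X[in_matching], X)
    errors = counterfactual - y
    return float(np.sum(errors[evaluation_mask(match_start, match_end)] ** 2))


# Simulated stand-in data: one outcome series and three control series.
rng = np.random.default_rng(0)
X = rng.normal(size=(len(YEARS), 3))
y = 0.5 + X @ np.array([0.5, 0.3, 0.2]) + 0.05 * rng.normal(size=len(YEARS))

sse_by_division = {
    period: pre_period_sse(y, X, *period) for period in MATCHING_PERIODS
}
```

Each division leaves ten testing years (1905–1914, 1887–1890 together with 1911–1914, and 1887–1896), so the three SSE values are computed over the same number of observations.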


References

Abadie, A., A. Diamond, and J. Hainmueller (2010): “Synthetic Control Methods

for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Pro-

gram,” Journal of the American Statistical Association, 105, 493–505.

——— (2015): “Comparative politics and the synthetic control method,” American Journal

of Political Science, 59, 495–510.

Abadie, A. and J. Gardeazabal (2003): “The Economic Costs of Conflict: A Case Study

of the Basque Country,” The American Economic Review, 93, pp. 113–132.

Allen, R. C. (2003): Farm to Factory: A Reinterpretation of the Soviet Industrial Revo-

lution., Princeton Economic History of the Western World series. Princeton and Oxford:

Princeton University Press.

Bolt, J. and J. L. van Zanden (2013): “The First Update of the Maddison Project;

Re-Estimating Growth Before 1820.” Working Paper 4, Maddison Project.

Bradley, J. F. N. (1975): Civil War in Russia, 1917-1920, BT Batsford Limited.

Bullock, D. (2008): The Russian Civil War, 1918-22, vol. 69, Osprey Publishing.

Cheremukhin, A., M. Golosov, S. Guriev, and A. Tsyvinski (2017): “The Industri-

alization and Economic Development of Russia through the Lens of a Neoclassical Growth

Model,” The Review of Economic Studies, 84, 613.

Doudchenko, N. and G. W. Imbens (2016): “Balancing, Regression, Difference-In-

Differences and Synthetic Control Methods: A Synthesis,” Working Paper 22791, National

Bureau of Economic Research.

Ellis, J. and M. Cox (2001): The World War I databook: the essential facts and figures

for all the combatants, Aurum PressLtd.

40

Erlikman, V. (2004): Poteri narodonaseleniia v XX veke: spravochnik, vol. 1, Moscow.

Gregory, P. R. (2004): Russian National Income, 1885-1913, Cambridge University Press.

Hunter, H. and J. M. Szyrmer (2014): Faulty foundations: Soviet economic policies,

1928-1940, Princeton University Press.

Markevich, A. and M. Harrison (2011): “Great War, Civil War, and Recovery: Russia’s

National Income, 1913 to 1928,” The Journal of Economic History, 71, 672–703.

Mawdsley, E. (2007): The Russian Civil War, Pegasus Books.

International Labor Office (1923 – 25): Enquete sur la production: Rapport general,

vol. 1, Paris Berger-Levrault.

Pinotti, P. (2012): “The Economic Costs of Organized Crime: Evidence from Southern

Italy.” Temi di Discussione (Working Paper) 868, Bank of Italy.

Pipes, R. (2001): Communism: A History, Modern Library.

Tibshirani, R. (1996): “Regression Shrinkage and Selection via the Lasso,” Journal of the

Royal Statistical Society. Series B (Methodological), 58, pp. 267–288.

Urlanis, B. (1971): Wars and population, M.: Progress.

Wheatcroft, S. G. and R. Davies (2004): The Years of Hunger: Soviet Agriculture,

1931-1933, Palgrave Macmillan.
