Evaluating Russian Economic Growth without the Revolution of 1917∗
Ivan Korolev†
July 5, 2017
Abstract
This paper uses modern econometric techniques, such as the lasso and the synthetic control method, to construct the counterfactual GDP per capita series for Russia for 1917–1940. The goal of this paper is twofold: first, to predict how the Russian economy might have developed without the Revolution; second, to evaluate and compare various econometric methods for computing the counterfactual GDP per capita series. The counterfactuals based on the preferred method, the synthetic control, suggest that without the Revolution Russia might have grown at about 1.6% a year in 1917–1940.
∗I am grateful to Ran Abramitzky and Frank Wolak for continuous support and encouragement. I also thank Simeon Djankov, Sergei Guriev, Caroline Hoxby, Andrei Markevich, and Hans-Joachim Voth for valuable comments and discussions. I gratefully acknowledge the financial support from the Stanford Graduate Fellowship Fund as a Koret Fellow and from the Stanford Institute for Economic Policy Research as a B.F. Haley and E.S. Shaw Fellow. All remaining errors are mine.
†Department of Economics, Stanford University, 579 Serra Mall, Stanford, CA, 94305. E-mail:[email protected]. Website: http://web.stanford.edu/~ikorolev/
1 Introduction
A major question in Russian economic history and a subject of ongoing public debate
in today’s Russia¹ is how economic growth in the Soviet Union in the 1920s–1930s would have
compared to the counterfactual growth of Russia if the Russian Revolution had not happened
in 1917. Some people believe that Soviet industrialization allowed the Soviet Union to grow
faster than ever before and to achieve what otherwise would have been impossible, while
others argue that the fast economic growth of the 1930s was just a return to the pre-revolution
trend and that Tsarist Russia would have achieved the same level of economic development.
There are several studies that try to understand the consequences of various aspects
of the Revolution for the economic development of Russia. In particular, Hunter and Szyrmer
(2014) evaluate Stalin’s economic policies in 1928–1940 using a multi-sector and multi-period
linear model; Allen (2003) looks at the Russian industrialization that took place at the beginning
of the 1930s and tries to understand what would have happened without centralized
industrialization; Cheremukhin et al. (2017) look at the effects of Stalin’s industrialization on
economic growth in Russia using a two-sector macroeconomic model. These studies mostly rely
on theory-based simulations, an approach that has its advantages and disadvantages. On the one
hand, it allows the authors to analyze particular policy changes and consider a wide range
of scenarios; on the other hand, it might be difficult to model all factors that are important
for economic growth (e.g. productivity, technology, international trade, and so on) together.
This paper complements the existing literature by studying the consequences of the Rus-
sian Revolution of 1917 using a data-driven approach. The goal of this paper is twofold.
First, I construct the counterfactual series of Russian GDP per capita from 1917 on, and
hence I am able to predict how Russia might have developed without the Revolution of 1917.
Second, I evaluate various econometric methods that could be used to construct the
counterfactual series, and then I provide insights about their performance that can be useful in
a variety of other settings, especially when the number of potential control variables is large
relative to the sample size.
¹For a brief summary of Russians’ opinion on Stalin see, e.g., https://www.forbes.com/2010/03/16/russia-joseph-stalin-victory-day-opinions-contributors-cathy-young.html.
My results can be viewed as a credible scenario of what might have happened without
the Revolution, but not as an estimate of the causal effect of the Revolution. Essentially, I
predict Russian GDP per capita after 1917 based on the pre-1917 data for Russia and both
pre-1917 and post-1917 data for other countries. Even though I will routinely use the notion
of “counterfactual series” for alternative scenarios and forecasts of Russian GDP per capita
throughout the paper, it should be clear these counterfactuals do not answer any particular
causal questions.
Since the Revolution took place in Russia because of certain political, economic, and
social issues, I cannot measure the causal effect of the Revolution by comparing Russia to
other countries. Moreover, the Revolution consisted of several events and included various
policy changes, and my methods do not allow me to analyze particular events separately.
Instead, my counterfactuals can be viewed as an attempt to find a stable predictive relation-
ship between Russia and other countries before 1917 and, assuming it would have remained
unchanged after 1917, construct predictions according to this relationship.
For instance, one might think that because of international trade, technology spillovers,
or other causes the growth rates in different countries are interconnected. Then I can recover
the relationship between the Russian economy and economies of other countries using econo-
metric methods, without modeling this relationship explicitly, and then I can make forecasts
based on this predictive relationship. In order to do so, I need to assume that the predictive
relationship was stable over time and would not have changed after 1917 if the Revolution
had not happened.
I try to validate this assumption by conducting placebo tests, in which I construct the
counterfactual GDP per capita series for the countries that did not experience a revolution
and show that these counterfactual series are reasonably close to the actual series.
I use two methods to estimate the predictive relationship between Russia and other
countries. The first method is based on running a usual OLS regression of Russian GDP per
capita on GDP per capita of other countries using the pre-1917 data and then constructing
the forecast of Russian GDP per capita after 1917 as fitted values. However, the problem
with this approach is that I do not have enough years before the Revolution to run an
unrestricted regression: I have 30 countries in the control group (see Table 1) and only 32
annual observations (1885–1916) before
the Revolution. Hence, I use the lasso to select the “best” regressors and then run an OLS
regression with only the selected regressors to construct the counterfactual series as the fitted
values from this OLS regression. From now on, I will call this method simply “the lasso”.
The second method is the recently developed synthetic control method. This method has
been used in Abadie and Gardeazabal (2003), Abadie et al. (2010), and Pinotti (2012).
For a more theoretical discussion of the synthetic control method and its relation to
other approaches, see Abadie et al. (2015) and Doudchenko and Imbens (2016). The idea
of this approach is to use the pre-treatment data to find a so-called synthetic control, a
counterfactual observation that had been most similar to the treated observation before the
treatment took place, and then to look at how this counterfactual observation would have
behaved after the treatment. Here I am interested in counterfactual development of Tsarist
Russia without the Revolution, so the treatment is essentially the Russian Revolution that
took place in 1917.
I compare these two methods using placebo tests that involve either other countries, where
revolutions did not happen, or Russia before 1917, so that in both cases I know what the true
growth path without a revolution looks like. I discuss these placebo tests in detail in
Subsection 4.1.
The remainder of the paper is organized as follows. Section 2 gives the historical background
of the Russian Revolution. Section 3 describes the data and the various approaches I use in this
paper. Section 4 describes the way to evaluate and compare various methods and then presents
the main findings. Section 5 concludes.
2 Background
In this section I give a brief overview of the historical context of the Russian Revolution of
1917. In the second half of the 19th century and beginning of the 20th century Russia was an
agrarian country, poorer than other European countries and approximately as rich as Japan.
Russia was facing economic and political challenges that, in particular, led to the Revolution
of 1905, which, however, did not result in dramatic political changes. To be precise, the
Revolution of 1905 led to a constitutional reform, but Russia remained a monarchy and
Tsar Nicholas II stayed in power. The government started to implement large-scale agrarian
reforms in 1906, but these efforts arguably were cut short by World War I in 1914. Despite
political and economic challenges, the economy was steadily growing before World War I.
In 1914 Russia entered World War I, which led to more serious economic, social, and
political problems, and in 1917 the Revolution took place. In fact, the Russian Revolution
of 1917 consisted of two revolutions. First, in March 1917, Tsar Nicholas II was forced to
abdicate and the provisional government was formed. Then, in November 1917 there was
another revolution, and as a result, the Bolsheviks came to power, replacing the provisional
government. The Civil War (loosely speaking, the war between the Bolsheviks’ Red Army
and the anti-Bolshevik White Army) started as a result of the Revolution. There is no
universal agreement among historians on the exact dates of the Civil War: historians believe
it started in November 1917 or in the summer of 1918 and ended between 1920 and 1922
(see Bradley (1975), Mawdsley (2007), Bullock (2008)). The Bolsheviks won the war, and
the Soviet Union was created in December 1922.
During the 1920s, the economic and political systems in the Soviet Union became more and
more centralized and controlled by the government, and by the beginning of the 1930s the Soviet
Union had become a planned economy. Starting in 1928, the government attempted
a centralized industrialization in order to quickly turn an agrarian country into an
industrialized one. The cost of industrialization was high: according to Wheatcroft and Davies (2004),
between 5.5 and 6.5 million people died during the famine of 1932–1933; then, according
to Pipes (2001), about 680,000 people were executed during political repressions known as
the Great Purge in 1936–1938. The benefits of the industrialization are still unclear: some
historians (e.g. Allen (2003)) claim that it led to faster economic growth in the country, while
others (e.g. Cheremukhin et al. (2017)) argue that the Russian economy would have grown
at comparable rates even without the forced industrialization. Figure 1 plots the evolution
of Russian GDP per capita over time in order to illustrate the main patterns of economic
growth in Russia and the Soviet Union in 1885–1940. As a benchmark, it also plots the
evolution of American GDP per capita. I would like to stress that the GDP per capita data
for Russia and the Soviet Union were constructed retrospectively by Markevich and Harrison
(2011). Hence, I am not concerned about possible attempts by the Soviet government
to manipulate the data.
Figure 1: Russian and American GDP per Capita, 1885–1940
The natural logarithm of actual Russian GDP per capita is plotted as a solid line. The natural logarithm
of American GDP per capita is plotted as a dashed line. The vertical line marks 1917, the year when the
Revolution took place.
In this paper I try to estimate the counterfactual GDP per capita in Russia had the
Russian Revolution not taken place in 1917 and compare it with the actual GDP per capita.
It is worth noting that I do not discuss whether it was possible for the Russian government
to avoid the Revolution. It might be the case that in order to avoid the Revolution, it would
have been necessary to implement political or economic reforms that would have changed the
Russian growth path significantly.
3 Data and Empirical Strategy
The main source of data for my analysis is the updated version of the Maddison project,
i.e. Bolt and van Zanden (2013), which has data on Russian GDP per capita from 1885 on
as well as information about other countries. Table 1 lists all countries in the dataset. The
construction of country characteristics and data sources are discussed in the table footnotes.
The series of Russian GDP for 1913–1928, as reported in Bolt and van Zanden (2013),
was constructed by Markevich and Harrison (2011). They built upon numerous previous
studies, most notably Gregory (2004), to fill in the last gap in the Russian GDP series. An
important concern about this data is that the territory and population of the country changed
dramatically after the Revolution. However, Markevich and Harrison (2011) explicitly took
these changes into account in order to construct a consistent time series of GDP per capita
for the Russian Empire and then the Soviet Union.
As is usual in time series econometrics, I mainly use the natural logarithm of GDP
per capita in my analysis, because economists often believe that growth rates are stationary,
and hence the first differences of the logarithm of GDP per capita are stationary as well. As
for the forecast horizon, I focus on 1917–1940, because World War II started in 1939 and
Germany invaded the Soviet Union in 1941, so that any forecasts or comparisons that go
beyond that point do not make much sense.
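To make the link between log differences and growth rates explicit, the following worked step may help. It is only an illustration: it takes the 1.6% figure quoted in the abstract and assumes, for the arithmetic alone, a constant annual growth rate g and 23 annual steps between 1917 and 1940.

```latex
\Delta \log(GDPpc)_t
  = \log\frac{GDPpc_t}{GDPpc_{t-1}}
  = \log(1 + g_t) \approx g_t
  \quad \text{for small } g_t,
\qquad
\frac{GDPpc_{1940}}{GDPpc_{1917}}
  = (1 + g)^{23}
  = \exp\!\big(23 \log 1.016\big) \approx 1.44
  \quad \text{for } g = 0.016 .
```

Under these assumptions, GDP per capita would end the period roughly 44% above its 1917 level.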
To summarize, I have data for 1885–1940, with 1917 being the year when the treatment
took place, and my goal is to construct a counterfactual GDP per capita series for Russia
from 1917 on. As I mentioned above, I can use several methods to do so.
Table 1: Countries
Country       Europe  Independent  Population, 1913  WW1 Losses   WW1 Losses
                                   (thousands)       (thousands)  (% of population)
Russia        1       1            156,192           1,700.0      1.088
Austria       1       1            6,767             120.0#       1.773
Belgium       1       1            7,666             13.7         0.179
Denmark       1       1            2,983             0.0          0
Finland       1       0            3,027             26.5#        0.876
France        1       1            41,463            1,357.8      3.275
Germany       1       1            65,058            1,773.7      2.726
Italy         1       1            37,248            650.0        1.745
Netherlands   1       1            6,164             0.0          0
Norway        1       1            2,447             0.0          0
Sweden        1       1            5,621             0.0          0
Switzerland   1       1            3,864             0.0          0
UK            1       1            45,649            702.4*       1.539
Greece        1       1            5,425             5.0          0.092
Portugal      1       1            5,972             7.2          0.121
Spain         1       1            20,263            0.0          0
Australia     0       1            4,821             59.9*        1.242
New Zealand   0       1            1,122             16.7*        1.489
Canada        0       1            7,852             62.8*        0.800
USA           0       1            97,606            116.5        0.119
Argentina     0       1            7,653             0.0          0
Brazil        0       1            23,660            0.0          0
Chile         0       1            3,431             0.0          0
Colombia      0       1            5,195             0.0          0
Peru          0       1            4,295             0.0          0
Uruguay       0       1            1,177             0.0          0
Venezuela     0       1            2,874             0.0          0
India         0       0            303,700           54*          0.017
Indonesia     0       0            51,637            0.0          0
Japan         0       1            51,672            0.3          0.001
Sri Lanka     0       0            4,811             0.0          0
“Europe” equals 1 if a country is located in Europe and 0 otherwise. A majority of the Russian population lived in Europe, so I treat Russia as a European country. “Independent” equals 1 if a country was independent throughout the entire period and 0 otherwise. I treat Australia, Canada, and New Zealand as independent because they had so-called “responsible governments” and arguably enjoyed a significant amount of independence from the UK. Sweden and Norway formed a union under the Swedish monarch until 1905, so formally Norway was not an independent country until then. However, because these two countries still had separate laws, legislatures, armed forces, etc., I treat them as independent as well. Data on 1913 population is from Maddison (http://www.ggdc.net/maddison/oriindex.htm). “WW1 Losses” takes into account military losses only and comes from Encyclopedia Britannica (http://www.britannica.com/EBchecked/topic/648646/World-War-I/53172/Killed-wounded-and-missing). For the countries marked by “*” (that were parts of the British Empire), the data is from Ellis and Cox (2001), except India, for which the data is from Urlanis (1971). Austria and Finland, marked by “#”, did not exist in modern borders in 1914. For Austria, I use data from Erlikman (2004); for Finland, from International Labor Office (1923–25).
As a baseline, I use an ARIMA(1,1,0) model, i.e. estimate an AR(1) model in first
differences:
$$\Delta \log(GDPpc)_t = \alpha + \rho\, \Delta \log(GDPpc)_{t-1} + \varepsilon_t,$$
where t indexes time and ρ is the autoregressive coefficient, and then construct the predicted
values from 1917 on. This approach would not allow me to account for possible structural breaks
or for global macroeconomic factors that might have affected the Russian economy, because
essentially it consists of just continuing the trend that existed before the Revolution. However,
if other methods cannot beat this method in placebo tests, it might indicate that other methods
do not perform well.
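The baseline can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `fit_ar1_on_differences` and `forecast_log_levels` are hypothetical, the coefficients are estimated by plain OLS rather than by whatever routine the paper used, and the input is a synthetic series, not the Maddison data.

```python
def fit_ar1_on_differences(log_gdp):
    """OLS estimate of (alpha, rho) in d_t = alpha + rho * d_{t-1} + e_t,
    where d_t is the first difference of the log series."""
    d = [b - a for a, b in zip(log_gdp, log_gdp[1:])]
    x, y = d[:-1], d[1:]                      # lagged and current differences
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    rho = sxy / sxx
    alpha = mean_y - rho * mean_x
    return alpha, rho


def forecast_log_levels(log_gdp, alpha, rho, horizon):
    """Iterate the fitted equation forward and cumulate differences into log levels."""
    level = log_gdp[-1]
    diff = log_gdp[-1] - log_gdp[-2]
    path = []
    for _ in range(horizon):
        diff = alpha + rho * diff
        level += diff
        path.append(level)
    return path


# Synthetic pre-1917 series (32 points, standing in for 1885-1916), generated
# without noise from alpha = 0.01 and rho = 0.5 so the fit is easy to check.
series, step = [7.0], 0.05
for _ in range(31):
    series.append(series[-1] + step)
    step = 0.01 + 0.5 * step

alpha_hat, rho_hat = fit_ar1_on_differences(series)
forecast = forecast_log_levels(series, alpha_hat, rho_hat, horizon=24)  # 1917-1940
```

Because the forecast only extrapolates the pre-1917 dynamics, its growth rate settles at α/(1 − ρ) when |ρ| < 1, which is the sense in which this baseline just continues the earlier trend.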
Because I have data on other countries, I can use them to predict Russian GDP as well. The
first option is to run a cross-sectional regression of Russian GDP per capita on GDP per
capita of other countries before 1917 and then construct the predicted GDP per capita after
1917 simply as fitted values from that regression using actual data for other countries:
$$\log(GDPpc)_{Russia,t} = \alpha + \sum_{j=1}^{k} \beta_j \log(GDPpc)_{j,t} + \varepsilon_t,$$
where j indexes countries in the control group.
It is also possible to combine these two methods and to estimate an ARMA model such as
$$\log(GDPpc)_{Russia,t} = \alpha + \sum_{j=1}^{p} \rho_j \log(GDPpc)_{Russia,t-j} + \sum_{j=1}^{k} \beta_j \log(GDPpc)_{j,t} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t.$$
The main challenge with this approach is that the number of countries in the sample
is almost as large as the number of pre-revolution observations, so estimating such a model
does not make much sense: it is possible to fit the observed GDP almost perfectly with so
many regressors, but such overfitting may lead to extremely inaccurate forecasts. However,
I can overcome this challenge by constraining the regression coefficients.
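The first option to solve the overfitting problem is to use a lasso regression, first developed
in Tibshirani (1996). A lasso estimator is defined as follows:
$$\min_{\alpha,\beta} \sum_t \Big(\log(GDPpc)_{Russia,t} - \alpha - \sum_{j=1}^{k} \beta_j \log(GDPpc)_{j,t}\Big)^2 \quad \text{s.t.} \quad \sum_j |\beta_j| \le c,$$
where c ≥ 0 bounds the sum of the absolute values of the coefficients. For a suitable
correspondence between c and λ, it is equivalent to solving
$$\min_{\alpha,\beta} \sum_t \Big(\log(GDPpc)_{Russia,t} - \alpha - \sum_{j=1}^{k} \beta_j \log(GDPpc)_{j,t}\Big)^2 + \lambda \sum_j |\beta_j|,$$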
where λ is a penalty parameter. I choose the penalty parameter using leave-one-out cross-
validation for each country separately. A detailed discussion of this procedure is presented
in Appendix A.1.
The idea behind the lasso is to penalize for having too many regressors by zeroing some
of the coefficients: because the augmented objective function is not smooth, the problem
typically has a corner solution in which some of the parameters are set to zero. Hence, using
the lasso helps to solve the problem of having an insufficient number of observations in the data
by zeroing out the coefficients on the variables with low explanatory power.
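A stylized special case may help to show where the corner solution comes from. The setup below is an illustration, not the paper's specification: it assumes a single regressor x_t, no intercept, the normalization \sum_t x_t^2 = 1, and writes z = \sum_t x_t y_t for the unpenalized least squares coefficient.

```latex
\hat\beta
  = \arg\min_{\beta} \; \sum_t (y_t - \beta x_t)^2 + \lambda |\beta|
  = \operatorname{sign}(z)\,\max\!\Big(|z| - \frac{\lambda}{2},\; 0\Big),
\qquad
z = \sum_t x_t y_t .
```

In this case the coefficient is exactly zero whenever |z| ≤ λ/2, so a regressor with a weak relationship to the outcome is dropped rather than merely shrunk.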
In this paper I apply the lasso as follows: first, I normalize the control variables so that
they all have zero mean and unit variance, then choose the penalty parameter as discussed in
Appendix A.1, then run the lasso with this selected penalty parameter using the observations
for 1885–1916. After that, I keep only the selected countries and run OLS with these countries
and a constant (again using 1885–1916) in order to avoid biasing the coefficients towards zero
(hence, I use the lasso for the selection but not for the estimation). Finally, I construct the
fitted values for 1917–1940 using the resulting estimates.
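The steps just described (normalize, select with the lasso, re-estimate by OLS, forecast) can be sketched as follows. This is a minimal sketch under several assumptions: the penalty `lam` is taken as given instead of being chosen by the cross-validation of Appendix A.1, coordinate descent is my choice of solver and not necessarily the paper's, all function names are hypothetical, and the data are synthetic.

```python
import math


def standardize(columns):
    """Rescale each regressor (a list over time) to zero mean and unit variance."""
    scaled = []
    for col in columns:
        m = sum(col) / len(col)
        sd = math.sqrt(sum((v - m) ** 2 for v in col) / len(col))
        scaled.append([(v - m) / sd for v in col])
    return scaled


def lasso_select(y, columns, lam, sweeps=200):
    """Coordinate descent on sum_t (y_t - a - sum_j b_j x_jt)^2 + lam * sum_j |b_j|
    with standardized regressors; returns indices of non-zero coefficients."""
    x = standardize(columns)
    mean_y = sum(y) / len(y)
    resid = [v - mean_y for v in y]            # residual when every b_j is zero
    beta = [0.0] * len(x)
    for _ in range(sweeps):
        for j, col in enumerate(x):
            s = sum(c * c for c in col)
            # inner product of x_j with the residual that leaves x_j out
            z = sum(c * (r + beta[j] * c) for c, r in zip(col, resid))
            new = math.copysign(max(abs(z) - lam / 2.0, 0.0) / s, z)
            resid = [r - (new - beta[j]) * c for c, r in zip(col, resid)]
            beta[j] = new
    return [j for j, b in enumerate(beta) if b != 0.0]


def ols(y, columns):
    """OLS with a constant, solved by Gauss-Jordan elimination on the normal equations."""
    design = [[1.0] * len(y)] + [list(c) for c in columns]
    k = len(design)
    a = [[sum(p * q for p, q in zip(design[i], design[j])) for j in range(k)]
         + [sum(p * v for p, v in zip(design[i], y))] for i in range(k)]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]   # [constant, slopes...]


def post_lasso_forecast(y_pre, cols_pre, cols_post, lam):
    """Select controls with the lasso, refit by OLS, and return post-period fitted values."""
    selected = lasso_select(y_pre, cols_pre, lam)
    coef = ols(y_pre, [cols_pre[j] for j in selected])
    horizon = len(cols_post[0])
    fitted = [coef[0] + sum(b * cols_post[j][t] for b, j in zip(coef[1:], selected))
              for t in range(horizon)]
    return selected, fitted


# Synthetic illustration (not the paper's data): 32 "pre" years and 24 "post" years.
years = range(56)
x0 = [7.0 + 0.02 * t + 0.05 * math.sin(t) for t in years]
x1 = [6.5 + 0.01 * t + 0.05 * math.cos(2 * t) for t in years]
x2 = [8.0 - 0.005 * t + 0.03 * math.sin(3 * t) for t in years]
target = [0.5 + 0.9 * v for v in x0]           # depends on the first control only
pre, post = slice(0, 32), slice(32, 56)
selected, forecast = post_lasso_forecast(
    target[pre], [c[pre] for c in (x0, x1, x2)], [c[post] for c in (x0, x1, x2)], lam=1.0)
```

In this constructed example the lasso keeps only the first control, and the OLS refit on the unscaled data removes the shrinkage that the penalty would otherwise impose on its coefficient.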
In addition to running the lasso with only the current levels of GDP per capita in other
countries as controls, I run it with other sets of controls as well. Table 2 lists all specifications
that I use.
Table 2: Control Variables for Lasso
Specification  Name                       Control Variables
Lasso–1        “Baseline”                 $\left(\log(GDPpc)_{j,t}\right)_{j=2}^{31}$
Lasso–2        “Lags & Russia”            $\left(\log(GDPpc)_{j,t}, \log(GDPpc)_{j,t-1}, \log(GDPpc)_{j,t-2}\right)_{j=2}^{31}$,
                                          $\log(GDPpc)_{Russia,t-1}$, $\log(GDPpc)_{Russia,t-2}$
Lasso–3        “Lags”                     $\left(\log(GDPpc)_{j,t}, \log(GDPpc)_{j,t-1}, \log(GDPpc)_{j,t-2}\right)_{j=2}^{31}$
Lasso–4        “Lags & Leads & Russia”    $\left(\log(GDPpc)_{j,t}, \log(GDPpc)_{j,t-1}, \log(GDPpc)_{j,t-2}, \log(GDPpc)_{j,t+1}, \log(GDPpc)_{j,t+2}\right)_{j=2}^{31}$,
                                          $\log(GDPpc)_{Russia,t-1}$, $\log(GDPpc)_{Russia,t-2}$
Lasso–5        “Lags & Leads”             $\left(\log(GDPpc)_{j,t}, \log(GDPpc)_{j,t-1}, \log(GDPpc)_{j,t-2}, \log(GDPpc)_{j,t+1}, \log(GDPpc)_{j,t+2}\right)_{j=2}^{31}$
The second option to solve the overfitting problem is to use the synthetic control method.
It is somewhat similar to the lasso, but uses different restrictions on the coefficients. It finds
a combination of control units (i.e., other countries in the sample) with non-negative weights
that sum up to one in a way that minimizes the weighted sum of squared differences between
the treated and synthetic units during the pre-treatment period:
$$\min_{\beta} \sum_t w_t \Big(\log(GDPpc)_{Russia,t} - \sum_{j=1}^{k} \beta_j \log(GDPpc)_{j,t}\Big)^2 \quad \text{s.t.} \quad \sum_j \beta_j = 1, \quad \beta_j \ge 0 \;\; \forall j,$$
where w_t is the weight placed on pre-treatment period t.
The synthetic control method is similar to the lasso in the sense that it also imposes
constraints on the ordinary or weighted least squares coefficients, but the constraints are
more demanding. Since the synthetic control method requires all coefficients to be non-
negative and sum up to one, the resulting synthetic control unit can be interpreted as a
weighted average of the members of the control group.
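The constrained problem above can be solved numerically in several ways. The sketch below is one simple possibility and not the algorithm used in the paper or in any synthetic control package: it repeatedly shifts weight between pairs of control units with an exact line search, which keeps the weights non-negative and summing to one throughout. The function name and the data are hypothetical.

```python
import math


def synthetic_control_weights(y, columns, w=None, sweeps=200):
    """Minimise sum_t w_t (y_t - sum_j b_j x_jt)^2 s.t. b_j >= 0 and sum_j b_j = 1
    by shifting weight between pairs of control units (exact line search)."""
    n, k = len(y), len(columns)
    w = w or [1.0] * n                          # equal period weights by default
    beta = [1.0 / k] * k                        # start from equal country weights
    fit = [sum(beta[j] * columns[j][t] for j in range(k)) for t in range(n)]
    for _ in range(sweeps):
        for i in range(k):
            for j in range(i + 1, k):
                d = [columns[i][t] - columns[j][t] for t in range(n)]
                denom = sum(wt * dt * dt for wt, dt in zip(w, d))
                if denom == 0.0:
                    continue
                num = sum(wt * (yt - ft) * dt for wt, yt, ft, dt in zip(w, y, fit, d))
                # move `step` units of weight from unit j to unit i, keeping both >= 0
                step = min(max(num / denom, -beta[i]), beta[j])
                beta[i] += step
                beta[j] -= step
                fit = [ft + step * dt for ft, dt in zip(fit, d)]
    return beta


# Synthetic illustration: the treated unit is built as 0.6 * x0 + 0.4 * x1,
# so the weights should come out close to (0.6, 0.4, 0.0).
years = range(32)
x0 = [7.0 + 0.02 * t for t in years]
x1 = [6.8 + 0.01 * t + 0.05 * math.sin(t) for t in years]
x2 = [6.9 + 0.016 * t + 0.3 * math.sin(0.5 * t) for t in years]
treated = [0.6 * a + 0.4 * b for a, b in zip(x0, x1)]
weights = synthetic_control_weights(treated, [x0, x1, x2])
```

The post-treatment counterfactual is then the same weighted average applied to the control units' post-treatment values.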
The synthetic control method has its advantages and disadvantages. The main advantage
is that it is very intuitive and it clearly shows what the counterfactual would have looked
like. The cost of that, however, is that the method is completely atheoretical: it is not based
on any economic model, so it cannot take into account possible policy changes which could
have changed the predictive relationship. For example, before World War I began, the Russian
government started some important reforms, but it did not have time to implement them
fully, and it is not clear what the effect of those reforms would have been if they had been
completed. For now, I abstract from the potential effect of the reforms, because this
question constitutes a separate research topic.
Given that the data for the end of the 19th and the beginning of the 20th century is
quite limited, I focus only on GDP per capita as the main explanatory variable, even though
I would like to use some variables that describe the sectoral composition of the economy as
well.
There is still a big question of how to choose the pre-treatment period over which to minimize
the difference between the treated and synthetic control units. The data is available starting
from 1885, and the Russian Revolution took place in 1917, so I have only about 30 years for
matching. Ideally, I would want to do matching without using some observations just before
1917 in order to be able to use them as a placebo, that is, to check that the synthetic control
approximates the behavior of the treated unit right before treatment well enough. However,
there are two issues with this approach: first, a relatively small number of observations before
treatment; second, the fact that Russia took part in World War I, which caused its GDP per
capita to fall substantially in 1914–1916. The problem is that if I use pre-treatment years
as a placebo, I cannot account for this decline in GDP per capita when constructing the
synthetic control unit.
In order to solve these problems, I separate the tasks of selecting the best method and
constructing the counterfactual. In order to evaluate different methods and select the best
one, I start with placebo tests for the cases when I know what the true growth path without
a revolution looks like.
First, I use other countries in my sample for placebo tests: I use different methods to
construct the counterfactual series for these countries starting from 1917, and then I compare
the counterfactual and the actual series. Since other countries did not experience a revolution,
I evaluate the methods based on the difference between the actual and counterfactual series
after 1917.
Second, I use Russia before 1917 as a placebo test. I divide the pre-revolutionary period
into two parts, the matching part and the testing part, using the Russian data to see how
well the methods I use predict Russian GDP per capita out-of-sample, but only before the
Revolution, when no treatment took place. Finally, based on the performance in placebo
tests, I choose the “best” methods and use them in the main analysis, now using the entire
pre-1917 period for matching.
I also consider using different subsamples of countries as controls. One may want to
match the counterfactual and actual units along other dimensions in addition to matching on
GDP per capita. I have data on whether a country was independent at that time, whether
it participated in World War I, and whether it is located in Europe.
Restricting the sample only to European countries seems problematic. First, even though
Russia is usually considered a European country, it does not entirely lie in Europe. Second,
since Russia was probably the poorest European country at that time, it might be hard to
find an appropriate synthetic unit. Asian or South American countries are likely to be a
better match.
As for World War I, it would be desirable to match on it, but there are several problems with
it as well. First, it is hard to come up with a reasonable and objective definition of being a
participant. Second, the countries that were most affected by World War I were European
countries, but, as discussed above, these countries did not match Russia well in terms of
economic development. Hence, matching on World War I may be problematic, and I
discuss the challenges and possible solutions in the next section.
As for being independent, it is arguably an important factor for the economic development of
a country, so restricting the sample to independent countries can improve the quality
of the counterfactual series. At the same time, it is relatively easy to determine when each
country became independent. There are two ambiguities here. First, it is somewhat difficult to define
exactly when Canada, Australia, and New Zealand became independent from Great Britain;
however, these three countries had so-called responsible governments by the end of the 19th
century and arguably enjoyed a significant amount of independence. Hence, I keep them in
the sample for the purposes of this exercise. Second, Norway and Sweden formed a union
under the Swedish monarch until 1905, so formally Norway was not an independent country
until then. However, since these two countries still had separate laws, legislatures, armed
forces, etc., I treat them as independent as well. The countries that I drop from the full
sample are Finland, India, Indonesia, and Sri Lanka.
Finally, I should note that one might want to match Russia with other countries based
on population to control for the overall size of the economy instead of just using GDP per
capita. The difficulty with using population in my analysis is that Russia was one of the most
populous countries in the world at that time and the second most populous in my sample, which
makes it almost impossible to find a good match in terms of total population. Consequently,
I focus only on GDP per capita in my analysis, assuming that all economies were “scalable”
(or, in economic terms, that there were constant returns to scale), so that Russia can be
represented as a small country “replicated” many times, and that the population density
does not matter for economic growth.
4 Results
This section presents the findings of my paper. It consists of several subsections: Sub-
section 4.1 discusses which method to use, Subsection 4.2 presents the main results of my
paper, then Subsection 4.3 discusses some robustness checks and presents additional findings,
and finally Subsection 4.4 evaluates the performance of the preferred methods using selected
countries from the control group as placebos.
4.1 Choice of Preferred Specification
Before I present the main findings of the paper, I discuss how to pick the preferred method
from the methods considered above using placebo tests. I use two types of placebo tests.
The first one evaluates performance of various methods using the post-1917 period for the
other countries in my sample, while the second one uses the pre-1917 period for Russia by
breaking it into the matching and testing parts.
4.1.1 Other Countries as Placebo Tests
In this subsection, I drop Russia from the sample, construct counterfactual series for all
other countries using all methods under consideration, and then pick the method with the
“best” performance. I use the quadratic loss function and the absolute deviation loss function
in my analysis, so I concentrate on the sum of squared residuals and the sum of absolute
deviations of residuals as the measures of performance. The main underlying assumption of
this exercise is that other countries were similar to Russia, so that if a method performs well
for other countries it also performs well for Russia.
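The two performance measures and their aggregation across placebo countries can be sketched as below. The exact construction is in Appendix A.2.1, which is not reproduced here, so this is only an assumed reading of it: the helper names are hypothetical and the three "countries" and their numbers are made up for illustration.

```python
from statistics import mean, median


def sse(actual, counterfactual):
    """Sum of squared differences between actual and counterfactual log GDP per capita."""
    return sum((a - c) ** 2 for a, c in zip(actual, counterfactual))


def sad(actual, counterfactual):
    """Sum of absolute differences (the absolute deviation loss)."""
    return sum(abs(a - c) for a, c in zip(actual, counterfactual))


def summarize(loss_by_country):
    """Mean and median of a per-country loss for one method."""
    values = list(loss_by_country.values())
    return mean(values), median(values)


# Hypothetical post-1917 paths for three placebo countries (made-up numbers).
actual = {"A": [7.0, 7.1, 7.2], "B": [6.5, 6.4, 6.6], "C": [8.0, 7.0, 8.2]}
predicted = {"A": [7.0, 7.0, 7.1], "B": [6.5, 6.5, 6.5], "C": [8.0, 8.1, 8.2]}
sse_by_country = {c: sse(actual[c], predicted[c]) for c in actual}
mean_sse, median_sse = summarize(sse_by_country)
```

In this toy example a single badly predicted country pulls the mean far above the median, which is one reason to look at both summaries when ranking methods.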
I have seven methods to choose from: the time series (ARIMA(1,1,0)), five different kinds
of the lasso (described in Table 2), and the synthetic control. Moreover, for every method, I
can use various pools of countries as controls: all countries in the dataset, only independent
countries, or only World War I participants.
In this subsection I do not match based on participation in World War I for several
reasons. First, it is hard to come up with a clear definition of being a participant. Second,
even if I could come up with such a definition, not only the sample of potential controls, but
also the sample of countries for which I could run placebo tests would be limited to the war
participants. But such countries were mostly rich European countries which were significantly
different from Russia and do not necessarily form a good comparison group for Russia.
First, Table 3 reports the mean and median SSE across countries for each of the two
samples of countries. The detailed discussion of the construction of these measures is
presented in Appendix A.2.1. As we can see from the table, when all countries are used as
controls (Panel A), “Lasso–2” yields the lowest mean SSE across countries, while “Lasso–4”
yields the lowest median. The synthetic control method yields both the second-lowest mean and
the second-lowest median in this case, and it also has the best performance in terms of both
mean and median when I restrict the controls to independent countries (Panel B). Hence, the
results in Table 3 suggest that the synthetic control is the best method to use.
I should note that the results in Table 3 are based on the sample of countries including
Germany and Italy, which experienced dramatic social, political, and economic changes in
the 1920s–1930s. Thus, using Germany and Italy for placebo tests can be problematic. However,
the ranking of the methods is not affected by dropping them from placebo tests.
Table 3: Comparison of Methods

Panel A: All Countries
                                      Method
SSE      Time Series  Lasso–1  Lasso–2  Lasso–3  Lasso–4  Lasso–5  Synthetic
Mean     1.873        2.989    1.036    1.412    1.806    1.718    1.258
Median   0.461        0.580    0.573    0.457    0.411    0.812    0.449

Panel B: Independent Countries
                                      Method
SSE      Time Series  Lasso–1  Lasso–2  Lasso–3  Lasso–4  Lasso–5  Synthetic
Mean     2.108        3.093    1.512    1.593    1.763    1.802    0.809
Median   0.508        0.593    1.031    0.654    0.415    0.870    0.362

In Panel A I use all 30 countries as controls, compute the SSE for each of these countries, and report the mean and median across these countries. In Panel B I use only 26 independent countries as controls, compute the SSE for each of these countries, and report the mean and median across these countries. I mark the lowest mean and median in each panel in bold. I discuss how to compute the SSE in more detail in Appendix A.2.1. The penalty parameters for the lasso are chosen via cross-validation, as described in Appendix A.1.
Note, however, that because the samples of countries for which I compute the SSE differ
across two panels of the table, I cannot directly compare different panels of the table: it is
possible that it is easier to predict GDP per capita for some countries and harder for others.
Hence, if I want to compare methods across panels, I need to make sure that I compute the
SSE for the same countries even if I use different samples of countries as controls.
In order to solve this problem and compare the methods across different samples of controls,
I next vary the sample of countries used as controls while keeping the sample of countries
for which I compute the SSE fixed. In Table 4
I use all countries or only independent countries as controls, while computing the SSE only
for independent countries. As the table shows, the synthetic control method
is now the best for both samples of controls, and restricting the pool of controls to
independent countries improves both the mean and the median SSE.
Table 4: Comparison of Methods

Panel A: All Countries
                                     Method
SSE      Time Series   Lasso–1   Lasso–2   Lasso–3   Lasso–4   Lasso–5   Synthetic
Mean     2.108         3.389     1.170     1.598     2.048     1.845     1.125
Median   0.508         0.702     0.625     0.519     0.629     0.938     0.406

Panel B: Independent Countries
                                     Method
SSE      Time Series   Lasso–1   Lasso–2   Lasso–3   Lasso–4   Lasso–5   Synthetic
Mean     2.108         3.093     1.512     1.593     1.763     1.802     0.809
Median   0.508         0.593     1.031     0.654     0.415     0.870     0.362

In Panel A I use all 30 countries as controls, compute the SSE only for 26 independent countries, and report
the mean and median across these countries. In Panel B I use only 26 independent countries as controls,
compute the SSE for each of these countries, and report the mean and median across these countries. I mark
the lowest mean and median in each panel in bold, and the lowest mean and median across all samples in
bold italic. I discuss how to compute the SSE in more detail in Appendix A.2.1. The penalty parameters
for the lasso are chosen via cross-validation, as described in Appendix A.1.
To summarize this section, I find the synthetic control to be probably the preferred
method overall, although in some cases “Lasso–2” and “Lasso–4” also perform well. To shed
more light on the performance of the various methods, I next eliminate
all “strictly dominated” methods (i.e. the methods that are beaten by at least one other
method in all panels of Tables 3 and 4), which leaves me with “Lasso–2”, “Lasso–4”, and the
synthetic control method, and evaluate the remaining methods using the pre-1917 data for Russia.
4.1.2 Pre-1917 Russia as Placebo Test
In this subsection I divide the pre-revolutionary period into two parts and use it for
placebo tests, as described in detail in Appendix A.2.2. I restrict my attention to the three
methods that performed well in the previous group of placebo tests, “Lasso–2”, “Lasso–4”,
and the synthetic control method, but I enrich the set of specifications by using matching on
World War I.
I have three methods and four choices of control groups. As before, the first pool of
controls includes all countries; the second one is restricted to independent countries. The
third and fourth use only World War I participants, with different definitions of participation.
The third pool (“WW1–1”) includes countries that lost at least 0.05% of their population or
10,000 people in World War I; the fourth (“WW1–2”) includes only countries that lost at least
0.05% of their population. The only difference between the two is India, which satisfies the
former criterion but not the latter, but it might play an important role in matching because
it was a poor country that might enter the synthetic control unit with a relatively high weight.
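The two participation definitions can be written as simple predicates. The sketch below is only an illustration: the function names and the casualty and population figures are hypothetical placeholders, not values taken from the data used in this paper.

```python
def ww1_definition_1(deaths, population):
    """WW1-1: lost at least 0.05% of population OR at least 10,000 people."""
    return deaths / population >= 0.0005 or deaths >= 10_000


def ww1_definition_2(deaths, population):
    """WW1-2: lost at least 0.05% of population."""
    return deaths / population >= 0.0005


# Hypothetical country with a large population: many deaths in absolute
# terms, but a small share of the population (made-up numbers).
deaths, population = 60_000, 300_000_000
print(ww1_definition_1(deaths, population))  # True
print(ww1_definition_2(deaths, population))  # False
```

A country of this kind enters the first World War I pool but not the second, which is the situation described for India above.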
Table 5 presents the results of these placebo tests. As we can see from the table, if we
compare methods for the same choice of control groups, the synthetic control method yields
the lowest SSE for almost every choice of the control sample. As for the comparison across
control groups but within a given matching period, the synthetic control yields the best
results in two cases out of three, but it is not clear whether restricting the control group
helps. As for the various lasso-based methods, it is difficult to rank them: they perform
very differently depending on the matching period and the choice of controls. What is clear,
however, especially with “Lasso–2”, is that restricting the sample typically helps. This is quite
intuitive: the lasso is very flexible, and having too many potential controls may lead to
overfitting. Hence, even though the lasso tries to select appropriate controls in a data-driven
way, restricting the pool of controls beforehand may be helpful.
To conclude, I find that while the synthetic control method typically compares favorably
to the other methods under consideration, the placebo tests are inconclusive regarding the
preferable choice of the control group for this method. Hence, I will use the synthetic control
method with different control groups as the baseline, and I will present the results of the
lasso when I do robustness checks.
Table 5: Comparison of Methods

Panel A: Matching Period 1885–1904
                        Method
Controls      Lasso–2   Lasso–4   Synthetic
All           0.700     0.553     0.195
Independent   1.376     0.359     0.264
WW1–1         1.109     1.338     0.028
WW1–2         0.289     1.338     0.053

Panel B: Matching Period 1891–1910
                        Method
Controls      Lasso–2   Lasso–4   Synthetic
All           0.274     0.075     0.047
Independent   3.211     0.075     0.064
WW1–1         0.075*    0.162     0.091
WW1–2         0.062     0.162     0.195

Panel C: Matching Period 1897–1916
                        Method
Controls      Lasso–2   Lasso–4   Synthetic
All           1.074     0.844     0.559
Independent   1.356     0.844     0.559
WW1–1         0.364*    0.333*    0.260
WW1–2         0.353     0.406*    0.374

I discuss how to compute the SSE in detail in Appendix A.2.2. I mark the lowest SSE in each row in
bold, and the lowest SSE in each panel in bold italic. The penalty parameters for the lasso are chosen via
cross-validation when the entire 1885–1917 period is used for matching as described in Appendix A.1, and I
use these parameters for all matching sub-periods. The only exceptions, marked by *, are when the
cross-validated penalty leads to no controls being selected. “All” refers to the specification that uses all countries
in the pool of controls. “Independent” refers to the specification that uses only independent countries in the
pool of controls. “WW1–1” refers to the specification that uses countries that lost at least 0.05% of their
population or 10,000 people in World War I in the pool of controls. “WW1–2” refers to the specification that
uses countries that lost at least 0.05% of their population in World War I in the pool of controls.
4.2 Main Findings
In this section I present the results of the synthetic control method with different choices
of control groups, since the synthetic control method performs well in the placebo tests but
there is no clear ranking of choices of the control group.
I consider four variations of the synthetic control method. I have already discussed two
specifications in detail: the one that uses all countries as potential controls and the one that
uses only the independent countries as potential controls. The third specification uses all
countries but Finland as potential controls. The reason is that Finland, which is given a
high weight in the specification with all countries, was part of the Russian Empire before the
Revolution, so it was likely affected by the Revolution as well. Hence, to rule out possible
spillovers, I run this specification as a robustness check. The fourth specification uses only
World War I participants as controls, and it uses the first definition of participation: military
deaths equal to at least 0.05% of the country’s population or at least 10,000 people.
Before discussing the counterfactual GDP per capita series, I first look at the properties
of the synthetic control units. Table 6 presents the composition of the synthetic control unit
in various specifications, while Table 7 compares the actual Russian country characteristics
with the characteristics of different synthetic control units.
Several things regarding the composition of the synthetic control units and their charac-
teristics are worth noting.
Table 6: Weights for Synthetic Control Units

                              Specification
Country     All Countries   No Finland   Independent   WW1
Finland     0.453           –            –             0.630
Sweden      0               0.236        0             –
Portugal    0               0            0.320         0.027
Argentina   0               0.062        0.048         –
Peru        0.084           0.080        0.163         –
Venezuela   0               0            0.027         –
India       0.217           0.299        –             0.343
Japan       0               0            0.443         –
Sri Lanka   0.246           0.323        –             –

The table presents the composition of the synthetic control units in various specifications. The first one uses
all countries as potential controls, the second one uses all countries but Finland as potential controls, the
third one uses only the independent countries as potential controls, and the fourth one uses only World War I
participants as potential controls. Entries with “0” mean that the country
was in the pool of potential controls but was assigned zero weight; entries with “–” mean that the country
was dropped from the pool of potential controls.
Table 7: Balancing of Actual and Synthetic Control Units

                                        Characteristic
Unit          Europe   Independent   Population, 1913   WW1 Deaths    WW1 Deaths
                                     (thousands)        (thousands)   (% of population)
Actual        1        1             156,192            1,700.0       1.09
All           0.453    0.084         68,818             17.4          0.40
No Finland    0.236    0.378         94,505             7.5           0.00
Independent   0.320    1             25,947             2.4           0.04
WW1–1         0.657    0.027         106,237            25.5          0.56

The table compares the characteristics of Russia with the characteristics of the synthetic control units in
various specifications. The first one uses all countries as potential controls, the second one uses all countries
but Finland as potential controls, the third one uses only the independent countries as potential controls, and
the fourth one uses only World War I participants as potential controls.
All characteristics of the synthetic control units are computed simply as weighted averages of individual
characteristics of the countries that compose the synthetic control unit.
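As a small illustration of the weighted-average calculation described in the note to Table 7, the sketch below recomputes the “Europe” and “Independent” entries for the “No Finland” specification from the weights in Table 6. The 0/1 coding of each country as European or independent is an assumption made for this example, and the function name is hypothetical.

```python
# Weights from the "No Finland" column of Table 6 (zero-weight countries omitted).
weights = {"Sweden": 0.236, "Argentina": 0.062, "Peru": 0.080,
           "India": 0.299, "Sri Lanka": 0.323}

# Indicator coding below is an assumption made for this illustration.
europe = {"Sweden": 1, "Argentina": 0, "Peru": 0, "India": 0, "Sri Lanka": 0}
independent = {"Sweden": 1, "Argentina": 1, "Peru": 1, "India": 0, "Sri Lanka": 0}


def weighted_average(weights, characteristic):
    """Characteristic of the synthetic unit: sum of weight times country value."""
    return sum(w * characteristic[country] for country, w in weights.items())


print(round(weighted_average(weights, europe), 3))       # 0.236
print(round(weighted_average(weights, independent), 3))  # 0.378
```

Under this coding the two numbers coincide with the “No Finland” row of Table 7.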
First, in most specifications a relatively low weight is given to European countries. Second,
the first two specifications assign relatively low weight to independent countries. Third, all
specifications yield synthetic control units with significantly lower World War I deaths
as a percentage of population than Russia actually experienced. These patterns are consistent with
Russia being poorer than other European countries at that time, so that most European
countries do not serve as good controls for Russia. In fact, the only European country that
is assigned high weight in these specifications is Finland (with more than 45% weight in the
first specification), which was part of the Russian Empire before the Revolution and was also
relatively poor compared to other European countries. At the same time, it was European
countries that actively participated in World War I, and European countries constituted a
large share of independent countries as well.
In other words, there is a trade-off between making the synthetic control unit similar
to Russia in terms of economic development, i.e. GDP per capita, and making it similar to
Russia across the characteristics such as being European, being independent, or participating
in World War I. Russia was quite unique in that it was an independent and European country
and it played a significant role in World War I, but at the same time it was far poorer than
other European countries.
Now I move on to the main results of my paper: the performance of the counterfactual
unit in terms of GDP per capita. Figure 2 presents the results from four specifications of the
synthetic control method.
Even though there are differences between these specifications, the overall pattern of the
results is similar. They all match the pre-1917 behavior well, with the quality of matching
being higher when I use less restrictive pools of potential controls. As for the post-1917
behavior, the second scenario is “pessimistic”, the third one is “optimistic”, and the first
and fourth ones are “moderate”, but in general the behavior of the synthetic control units is
similar: all of them grow steadily in the 1920s, then experience a crisis associated
with the Great Depression, and then start recovering in the second half of the 1930s. Depending
on the specification, actual Russian GDP per capita catches up with the counterfactual one
somewhere between 1933 and 1937.
Figure 2: Actual and Synthetic Russian GDP per Capita
The natural logarithm of the actual GDP per capita is plotted as a dashed line, the synthetic series is plotted
as a solid line. The vertical line marks 1917, the year when the Revolution took place. The upper-left graph
shows the results of the synthetic control method when all countries in the sample are included as potential
controls. The upper-right graph shows the results of the synthetic control method when all countries except
Finland are included as potential controls. The lower-left graph shows the results of the synthetic control
method when only the independent countries are included as potential controls. The lower-right graph shows
the results of the synthetic control method when only countries that lost at least 0.05% of their population
or 10,000 people in World War I are included as potential controls.
If we believe that these counterfactuals are plausible, without the Revolution Russia
would have grown at a rate comparable to, if not higher than, that of the developed countries.
For instance, in 1917–1929 (i.e. before the Great Depression) the annual GDP per capita
growth rates for the “optimistic” and “moderate” scenarios are 2.6–3%, which is higher than
the American (2.3%) or Japanese (1.6%) growth rates over the same period. Even for the
“pessimistic” scenario the annual growth rate is 1.5%, which is close to the Japanese
one.
In the 1930s, during the Great Depression, there is an economic crisis in all scenarios, with the
average annual growth rates being virtually 0 for the “pessimistic” scenario, about 0.45–0.8%
for the “moderate” ones, and almost 1.5% for the “optimistic” one. As a comparison, the American
average growth rate over the same period was 0.15%, and the Japanese one was an impressive 3.2%,
since Japan was virtually unaffected by the Great Depression.
The average annual growth rate over the entire period 1917–1940 based on the synthetic
control method is about 0.8% for the “pessimistic” scenario, 1.6–2% for the “moderate” ones,
and about 2% for the “optimistic” one. Again, as a comparison, the American average annual
growth rate over the same period was 1.3%, and the Japanese one was 2.4%.
Table 8 concisely summarizes the results of the comparison between the synthetic control
unit, USA, and Japan. It takes the lower of two “moderate” scenarios as a baseline.
Table 8: Growth Rates in Comparison

Period      Counterfactual Russia   USA     Japan
1917–1929   ≈2.6%                   2.31%   1.65%
1930–1940   ≈0.5%                   0.15%   3.23%
1917–1940   ≈1.6%                   1.27%   2.40%

The table compares growth rates of counterfactual Russia with the growth rates of Japan and USA. The
results for Russia are roughly based on the synthetic control method when all countries are included as
controls. It leads to higher growth rates than the specification when Finland is omitted from the pool of
controls, but lower growth rates than when I restrict the sample to independent countries or to World War
I participants. The growth rates are computed as geometric means over the corresponding periods.
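A geometric-mean growth rate of the kind reported in Table 8 can be obtained from the endpoint values of log GDP per capita. The sketch below is a hedged illustration: the function name, the log levels, and the choice of 23 annual steps between 1917 and 1940 are assumptions made for the example, not the exact calculation behind the table.

```python
import math


def average_annual_growth(log_gdp_start, log_gdp_end, years):
    """Geometric-mean annual growth rate implied by two log levels."""
    return math.exp((log_gdp_end - log_gdp_start) / years) - 1.0


# Hypothetical log GDP per capita levels, 23 annual steps apart,
# constructed so that the implied growth rate is 1.6% a year.
start = 7.00
end = start + 23 * math.log(1.016)
g = average_annual_growth(start, end, 23)
print(f"{g:.3%}")  # 1.600%
```

Working with log levels makes the geometric mean a simple average of log differences, which is convenient because the series in the figures are plotted in logs.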
It is probably true that the Soviet industrialization of the 1930s promoted faster growth and
helped to achieve higher levels of GDP per capita than would have been possible without
the Revolution. On top of that, the Soviet Union was virtually unaffected by the Great
Depression: even though the actual growth rates slowed a bit, the recovery was very fast, and
growth rates in the 1930s were quite high. It is worth noting, however, that the industrialization
was also associated with the Soviet Famine of 1932–1933 and the Great Purge, during which
several million people died. Moreover, this decline in population would in itself
have increased GDP per capita even if GDP had stayed constant.
Overall, the Revolution probably allowed Russia to reach higher levels of economic
development by the end of the 1930s, but it was also associated with a huge decrease in GDP
per capita in the 1920s, whereas if the Revolution had not happened, Russia would probably have
grown more consistently.
4.3 Additional Results
In this section, I present some robustness checks. I start by looking at the sensitivity of
the synthetic control method to the choice of the matching period. I use three alternative
choices: 1885–1904, 1891–1910, and 1897–1916. That is, I match Russia with the control
unit only using a particular subperiod of 1885–1917, and then construct the synthetic series
for the entire period 1885–1940.
Table 9 presents the weights that are obtained when I use different subperiods for matching
and compares them to the weights that I get when I use 1885–1916 for matching. Table 10
describes the characteristics of the various synthetic control units.
Table 9: Weights for Synthetic Control Units

                            Specification
Country       1885–1917   1885–1904   1891–1910   1897–1916
Finland       0.453       0           0.459       0
Netherlands   0           0           0           0.068
Portugal      0           0           0.107       0.375
Australia     0           0           0           0.062
Argentina     0           0.234       0.065       0
Peru          0.084       0.436       0           0
Uruguay       0           0           0           0.025
Venezuela     0           0           0           0.47
India         0.217       0           0.368       0
Japan         0           0.143       0           0
Sri Lanka     0.246       0.187       0           0

The table presents the composition of the synthetic control units in various robustness checks specifications.
The first one uses the entire 1885–1916 period for matching. The second one uses only 1885–1904 for matching;
the third one uses only 1891–1910 for matching; and the fourth one uses only 1897–1916 for matching. Entries
with “0” mean that the country was in the pool of potential controls but was assigned zero weight.
Table 10: Balancing of Actual and Synthetic Control Units

                                      Characteristic
Unit        Europe   Independent   Population, 1913   WW1 Deaths    WW1 Deaths
                                   (thousands)        (thousands)   (% of population)
Actual      1        1             156,192            1,700.0       1.09
1885–1917   0.453    0.084         68,818             17.4          0.40
1885–1904   0        0.813         11,952             0.0           0.00
1891–1910   0.566    0.172         114,287            22.1          0.42
1897–1916   0.443    1             4,338              6.4           0.12

The table compares the characteristics of Russia with the characteristics of the synthetic control units in
various robustness checks specifications. The first one uses the entire 1885–1916 period for matching. The
second one uses only 1885–1904 for matching; the third one uses only 1891–1910 for matching; and the fourth
one uses only 1897–1916 for matching. All characteristics of the synthetic control units are computed simply
as weighted averages of individual characteristics of the countries that compose the synthetic control unit.
Figure 3 plots the actual and counterfactual series. Each plot contains two counterfactu-
als: the one obtained when the entire 1885–1916 period is used for matching, and the one
obtained when only a particular sub-period is used for matching.
Figure 3: Actual and Synthetic Russian GDP per Capita, Robustness Checks
The natural logarithm of the actual GDP per capita is plotted as a dashed line, the synthetic series that
uses the entire 1885–1916 period for matching is plotted as a solid line, and the series that use different
sub-periods of the 1885–1916 period for matching are plotted as a dash-dotted line. The vertical line marks
1917, the year when the Revolution took place. The dash-dotted line in the upper-left graph uses 1885–1904
for matching; the dash-dotted line in the upper-right graph uses 1891–1910 for matching; and the dash-dotted
line in the lower graph uses 1897–1916 for matching.
The results are clearly quite sensitive to the choice of the matching
period, both in terms of the composition of the synthetic control unit and in terms of the
counterfactual GDP per capita series. Only when I use the middle part of the pre-1917
period, i.e. 1891–1910, for matching do I get results very similar to the original ones. When
I use only the beginning of the pre-1917 period, i.e. 1885–1904, or only the end, i.e.
1897–1916, I get strikingly different results.
Now I move on to the lasso. Figure 4 presents the results from four specifications of
“Lasso–2” that differ in the choice of the control group. It looks like the first two specifications
overfit in-sample, resulting in an almost perfect fit in 1885–1916, but then predict a very strange
pattern of results for 1917–1940: first, very fast economic growth in the 1920s, then a very sharp
decline in GDP per capita in the 1930s. This pattern is hard to explain, since no country
in the control group experienced such a decline in GDP per capita, and the only possible
explanation is that some countries that grew quickly in the 1930s got negative weights. The third
and fourth specifications look much more plausible, and in general the pattern of the results
is similar to that of the synthetic control method described above.
Figure 4: Actual and “Lasso–2” Russian GDP per Capita
The natural logarithm of the actual GDP per capita is plotted as a dashed line, the lasso series is plotted
as a solid line. The vertical line marks 1917, the year when the Revolution took place. The upper-left
graph shows the results of “Lasso–2” method when all countries in the sample are included as potential
controls. The upper-right graph shows the results of “Lasso–2” method when only the independent countries
are included as potential controls. The lower-left graph shows the results of “Lasso–2” method when only
the countries that lost at least 0.05% of their population or 10,000 people in World War I are included as
potential controls. The lower-right graph shows the results of “Lasso–2” method when only the countries
that lost at least 0.05% of their population in World War I are included as potential controls. The penalty
parameters for the lasso are chosen via cross-validation, as described in Appendix A.1.
To summarize the results from different methods, the synthetic control yields plausible
counterfactuals for Russia, while the lasso predictions do not seem realistic. The synthetic
control method also performs best in the placebo tests. This is because the lasso tends to
overfit in-sample, which, in turn, leads to worse out-of-sample properties. Since the synthetic
control method imposes more restrictions, it performs better in out-of-sample predictions, at
least according to the results of my placebo tests.
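To make the phrase “more restrictions” concrete, the two estimators can be contrasted in stylized form. The notation here is an assumption introduced for illustration: $y_t$ is log GDP per capita of the unit of interest, $x_{jt}$ is that of control country $j$, and $\mathcal{T}_0$ is the matching period. The display abstracts from the lags, leads, and covariates that distinguish the specifications used in this paper, and the synthetic control constraints follow the standard formulation in Abadie, Diamond, and Hainmueller (2010).

```latex
\begin{align*}
\text{Lasso:}\quad
  &\min_{\alpha,\,\beta}\;
   \sum_{t \in \mathcal{T}_0} \Bigl( y_t - \alpha - \sum_{j=1}^{J} \beta_j x_{jt} \Bigr)^{2}
   + \lambda \sum_{j=1}^{J} \lvert \beta_j \rvert ,
   \qquad \beta_j \in \mathbb{R}, \\
\text{Synthetic control:}\quad
  &\min_{w}\;
   \sum_{t \in \mathcal{T}_0} \Bigl( y_t - \sum_{j=1}^{J} w_j x_{jt} \Bigr)^{2}
   \quad \text{s.t.} \quad w_j \ge 0, \quad \sum_{j=1}^{J} w_j = 1 .
\end{align*}
```

In this stylized form the lasso coefficients may be negative and need not sum to one, whereas the synthetic control unit is a convex combination of the controls, which rules out extrapolation outside the range of the control countries.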
4.4 Placebo Tests: Illustrations
This subsection presents the results of the placebo tests for selected other countries to
illustrate whether the methods I use predict GDP per capita well for the countries that
actually did not experience the revolution.
I consider four countries: Finland, which was part of the Russian Empire before the
Revolution and arguably was the European country most similar to Russia; Japan, which
was a relatively poor country that experienced fast growth in the first half of the 20th century;
the USA; and the UK.
As I have said above, it is not clear if Finland is a good placebo test or not: it might have
been affected by the Revolution as well, but at the same time it did not switch to communism
or to a command economy, so its usefulness for placebo tests may depend on a particular
counterfactual scenario one has in mind. I do not distinguish between different scenarios
explicitly, but I consider Finland since it is an interesting case anyway: if we are interested
in a scenario in which there is a civil war but the Bolsheviks lose, then Finland might be a
good case to look at.
Figure 5 plots the actual and synthetic control unit GDP per capita for Finland. Since
Finland was not independent, I do not present the case when I use only independent countries
as controls. As we can see from the figure, the synthetic control method does a decent job
predicting post-1917 GDP per capita when all countries are used as potential controls, though
it somewhat underpredicts the actual GDP in the 1930s. Once the pool of controls is restricted
to World War I participants, the performance of the synthetic control unit deteriorates: now
it underpredicts the actual GDP per capita much more significantly.
Figure 5: Actual and Synthetic Finnish GDP per Capita
The natural logarithm of the actual GDP per capita is plotted as a dashed line, the synthetic series is plotted as
a solid line. The vertical line marks 1917. The left graph shows the results of the synthetic control method
when all countries in the sample are included as potential controls. The right graph shows the results of the
synthetic control method when only countries that lost at least 0.05% of their population or 10,000 people
in World War I are included as potential controls.
Figure 6 plots the actual and synthetic control unit GDP per capita for Japan. Since
Japan does not satisfy my criteria for participating in World War I (even though formally
it participated in the war), I do not present the case when only World War I participants
are included as controls. When all countries are used as potential controls, the synthetic
control unit consistently underperforms compared to the actual series. When the pool of
controls is restricted to independent countries, the synthetic control unit still underpredicts
the actual GDP per capita in the early 1920s, but it catches up with the actual series
by the beginning of the 1930s, and overall does a good job predicting the pattern of economic
development in Japan.
Figure 6: Actual and Synthetic Japanese GDP per Capita
The natural logarithm of the actual GDP per capita is plotted as a dashed line, the synthetic series is plotted as
a solid line. The vertical line marks 1917. The left graph shows the results of the synthetic control method
when all countries in the sample are included as potential controls. The right graph shows the results of the
synthetic control method when only the independent countries are included as potential controls.
Figure 7 plots the actual and synthetic control unit GDP per capita for the USA. The
first two specifications, the one that uses all countries as potential controls and the one that
uses independent countries as potential controls, are very similar. They both underpredict
the actual GDP per capita in the 1920s, but catch up with it after the Great Depression, in
the 1930s. The third specification, which restricts the pool of controls to World War I participants,
underperforms compared to the actual series throughout the entire post-1917 period.
Figure 7: Actual and Synthetic American GDP per Capita
The natural logarithm of the actual GDP per capita is plotted as a dashed line, the synthetic series is plotted as
a solid line. The vertical line marks 1917. The upper-left graph shows the results of the synthetic control
method when all countries in the sample are included as potential controls. The upper-right graph shows
the results of the synthetic control method when only the independent countries are included as potential
controls. The lower graph shows the results of the synthetic control method when only countries that lost at
least 0.05% of their population or 10,000 people in World War I are included as potential controls.
Figure 8 plots the actual and synthetic control unit GDP per capita for the UK. As
opposed to the previous three cases, all three specifications fail to match the actual GDP per capita
well even before 1917. After 1917, the first two specifications, the one that uses all countries
as potential controls and the one that uses independent countries as potential controls, are
very similar and both overpredict the actual GDP per capita significantly: the actual series
catches up with the synthetic control ones only at the end of the 1930s. The third specification,
which restricts the pool of controls to World War I participants, predicts the general pattern
of economic growth much better, but it still oscillates around the actual series a lot, so the
discrepancy between the synthetic control unit and the actual unit in a given year is still
quite large.
Figure 8: Actual and Synthetic British GDP per Capita
The natural logarithm of the actual GDP per capita is plotted as a dashed line, the synthetic series is plotted as
a solid line. The vertical line marks 1917. The upper-left graph shows the results of the synthetic control
method when all countries in the sample are included as potential controls. The upper-right graph shows
the results of the synthetic control method when only the independent countries are included as potential
controls. The lower graph shows the results of the synthetic control method when only countries that lost at
least 0.05% of their population or 10,000 people in World War I are included as potential controls.
5 Conclusion
This paper analyzes the economic consequences of the Russian Revolution of 1917 using
modern econometric methods, such as the lasso and the synthetic control method. The
primary goal of this paper is to construct the counterfactual series of Russian GDP per
capita. Another goal of this paper is to consider various econometric methods that can be
useful in other settings when the number of controls is large relative to the sample size. I
use placebo tests to evaluate these methods, and the results of these tests suggest that the
synthetic control method is the preferred one: by imposing more restrictions, it avoids
overfitting and makes better out-of-sample predictions, while the lasso
overfits and makes unrealistic out-of-sample forecasts.
Different choices of potential controls yield different scenarios, but “on average” the annual
growth rates for the synthetic control unit are about 2.5% in 1917–1929, about 0.5% in
1930–1940, and about 1.6% over the entire period of 1917–1940. To put these results into
perspective, the synthetic control unit grows faster in 1917–1940 than the USA, but slower than
Japan. The industrialization of the 1930s probably allowed the Soviet Union to grow somewhat
faster than otherwise would have been possible, but at the same time the industrialization
was associated with high costs.
This paper complements the existing literature, which mostly relied on theory-based
simulations, by predicting Russian economic growth after 1917 in a data-driven way.
Even though I do not analyze the causes of the Revolution and the methods that I use do
not allow me to consider various policy changes separately, my results present new evidence
on how Russia might have developed without the Revolution of 1917.
A Appendix
A.1 Using Cross-Validation for Penalty Parameter Selection
I have T = 32 observations in the pre-treatment period (1885–1916) for every country. I
do the following to choose the penalty parameter λ:
1. Choose a grid of values of λ: 0.001, 0.0025, 0.005, 0.0075, 0.01, 0.025, 0.05, 0.075, 0.1,
0.25.
2. For every value of λ, one-by-one drop one observation and run a lasso regression of
GDP per capita in the country of interest on GDP per capita in other countries using
the remaining T − 1 observations. I.e. drop observation t = 1 and run the lasso for
t = 2, ..., T ; then drop t = 2 and run the lasso for t = 1, 3, ..., T , and so on.
3. Select the variables with nonzero coefficients in the lasso and run a usual OLS
regression with these variables as controls, still using the same T − 1 observations. Call
the resulting estimates $\hat{\beta}_{\lambda,-t}$. Construct fitted values for all T observations using the
resulting parameter estimates: $\hat{y}_{j,\lambda,-t} = x_j'\hat{\beta}_{\lambda,-t}$, $j = 1, \dots, T$.
4. For the observation t that was dropped, compute a prediction error: $e_{\lambda,t} = y_t - \hat{y}_{t,\lambda,-t}$.
5. Calculate the sum of squared errors: $SSE_{\lambda} = \sum_{t=1885}^{1916} e_{\lambda,t}^2$.
6. Choose the value of λ from (1) that minimizes SSEλ.
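The steps above can be sketched in code. The following is a minimal pure-Python illustration under stated assumptions, not the implementation used for the results in this paper: the function names, the coordinate-descent lasso with objective scaled by 1/(2n), the toy data, and the shortened grid are all choices made for this example.

```python
import math


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate-descent lasso with an unpenalized intercept (assumed scaling:
    minimize (1/(2n)) * sum of squared residuals + lam * sum |b_j|)."""
    n, p = len(X), len(X[0])
    xbar = [sum(row[j] for row in X) / n for j in range(p)]
    ybar = sum(y) / n
    Xc = [[row[j] - xbar[j] for j in range(p)] for row in X]
    resid = [v - ybar for v in y]  # residuals at b = 0
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            col = [Xc[i][j] for i in range(n)]
            ss = sum(c * c for c in col) / n
            if ss == 0.0:
                continue
            rho = sum(c * (r + c * b[j]) for c, r in zip(col, resid)) / n
            new = max(abs(rho) - lam, 0.0) * (1 if rho > 0 else -1) / ss
            resid = [r - c * (new - b[j]) for c, r in zip(col, resid)]
            b[j] = new
    return b


def ols(X, y):
    """OLS with intercept via normal equations and Gauss-Jordan elimination.
    Assumes the selected regressors are not collinear."""
    Z = [[1.0] + list(row) for row in X]
    k = len(Z[0])
    A = [[sum(z[a] * z[c] for z in Z) for c in range(k)]
         + [sum(z[a] * v for z, v in zip(Z, y))] for a in range(k)]
    for i in range(k):
        piv = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        for r in range(k):
            if r != i:
                f = A[r][i] / A[i][i]
                A[r] = [u - f * v for u, v in zip(A[r], A[i])]
    return [A[i][k] / A[i][i] for i in range(k)]


def loo_cv_sse(X, y, lam):
    """Steps 2-5: leave one observation out, lasso selection, post-lasso OLS,
    squared prediction error for the dropped observation, summed over t."""
    sse = 0.0
    for t in range(len(y)):
        X_tr, y_tr = X[:t] + X[t + 1:], y[:t] + y[t + 1:]
        selected = [j for j, bj in enumerate(lasso_cd(X_tr, y_tr, lam)) if bj != 0.0]
        coef = ols([[row[j] for j in selected] for row in X_tr], y_tr)
        pred = coef[0] + sum(c * X[t][j] for c, j in zip(coef[1:], selected))
        sse += (y[t] - pred) ** 2
    return sse


# Toy data (hypothetical): 12 "years", 3 candidate control series.
X = [[math.sin(t), math.cos(2 * t), 0.1 * (t % 3)] for t in range(12)]
y = [1.0 + 0.8 * row[0] + 0.01 * (-1) ** t for t, row in enumerate(X)]
grid = [0.001, 0.01, 0.1]  # shortened version of the grid in step 1
best = min(grid, key=lambda lam: loo_cv_sse(X, y, lam))  # step 6
print(best)
```

The post-lasso OLS step means that the penalty only affects which controls are selected, not how strongly their coefficients are shrunk in the final prediction.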
A.2 Construction of Prediction Accuracy Measures
A.2.1 Using Other Countries as Placebo Tests
For every method m and every country c in the control group (which can vary depending on
which sample of countries I use as controls), construct a counterfactual GDP per capita series
$\widehat{\log(GDPpc)}_{m,c,t}$, $t = 1917, \dots, 1940$, using the remaining countries from the control group as
controls. Compute the prediction errors $e_{m,c,t} = \widehat{\log(GDPpc)}_{m,c,t} - \log(GDPpc)_{c,t}$. Compute
the sum of squared errors: $SSE_{m,c} = \sum_{t=1917}^{1940} e_{m,c,t}^2$.
Then use the following four accuracy measures:
1. Average sum of squared errors: take the average over all countries in the placebo group,
i.e.
$$SSE_m = \frac{1}{N} \sum_{c=1}^{N} SSE_{m,c}$$
2. Median sum of squared errors.
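As a rough illustration of the two measures, the sketch below computes the per-country SSE and then its mean and median across placebo countries. The function names and the dictionary layout (country name mapped to a list of log GDP per capita values for 1917–1940) are assumptions made for this example, not part of the paper.

```python
from statistics import mean, median


def sse(counterfactual, actual):
    """Sum of squared prediction errors for one country and one method."""
    return sum((c - a) ** 2 for c, a in zip(counterfactual, actual))


def placebo_accuracy(counterfactual_by_country, actual_by_country):
    """Mean and median SSE across the placebo countries for one method.

    Both arguments map a country name to its series over the same years
    (a hypothetical data layout chosen for this sketch).
    """
    sse_by_country = {
        country: sse(counterfactual_by_country[country], actual)
        for country, actual in actual_by_country.items()
    }
    values = list(sse_by_country.values())
    return {"mean_sse": mean(values), "median_sse": median(values)}
```

The median is less sensitive than the mean to a single placebo country that a method fits very badly, which is one reason to report both.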
A.2.2 Using Pre-1917 Russia as Placebo Test
For every method m, divide the pre-1917 period into two parts: the matching part and the
testing part. Use three such divisions: 1885–1904 for matching and 1905–1916 for testing;
1891–1910 for matching and 1885–1890 and 1911–1916 for testing; and 1897–1916 for matching
and 1885–1896 for testing.
For every method m and every division, use the 20-year-long matching period to
estimate the model and construct the counterfactual series for the entire period 1885–1916,
$\log(GDPpc)_{m,t}$, $t = 1885, ..., 1916$. If the method involves lags or both lags and leads, then
I make sure that I use only information from within the matching period by truncating the
number of observations accordingly. Moreover, if I use lags, I cannot construct the
counterfactual GDP for 1885 and 1886, and if I use both lags and leads, I cannot construct it for
1885, 1886, 1915, and 1916.
Next, construct prediction errors $e_{m,t} = \log(GDPpc)_{m,t} - \log(GDPpc)_t$. Then construct
the SSE as follows:
1. When the matching period is 1885–1904, compute $SSE_m = \sum_{t=1905}^{1914} e_{m,t}^2$.
2. When the matching period is 1891–1910, compute $SSE_m = \sum_{t=1887}^{1890} e_{m,t}^2 + \sum_{t=1911}^{1914} e_{m,t}^2$.
3. When the matching period is 1897–1916, compute $SSE_m = \sum_{t=1887}^{1896} e_{m,t}^2$.
For compatibility between methods, I exclude the first two and the last two years (1885,
1886, 1915, and 1916) from the SSE even when I can construct counterfactual GDP per capita
for these years.
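The three divisions and the resulting summation windows can be written out explicitly. The sketch below is illustrative only: the names `SPLITS`, `EXCLUDED`, `scoring_years`, and `split_sse` are hypothetical, and the errors are assumed to be stored in a dictionary keyed by year.

```python
# Years dropped from every SSE so that methods with lags and leads stay comparable.
EXCLUDED = {1885, 1886, 1915, 1916}

# Matching period (first year, last year) mapped to its testing years.
SPLITS = {
    (1885, 1904): list(range(1905, 1917)),
    (1891, 1910): list(range(1885, 1891)) + list(range(1911, 1917)),
    (1897, 1916): list(range(1885, 1897)),
}


def scoring_years(matching_period):
    """Testing years that actually enter the SSE for a given matching period."""
    return [year for year in SPLITS[matching_period] if year not in EXCLUDED]


def split_sse(errors_by_year, matching_period):
    """Sum of squared prediction errors over the scoring years.

    errors_by_year maps a year to the prediction error for one method
    (a hypothetical layout chosen for this sketch).
    """
    return sum(errors_by_year[year] ** 2 for year in scoring_years(matching_period))
```

For the 1891–1910 matching period, `scoring_years((1891, 1910))` yields 1887–1890 and 1911–1914, which are the two sums in item 2 of the list.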
References
Abadie, A., A. Diamond, and J. Hainmueller (2010): “Synthetic Control Methods
for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Pro-
gram,” Journal of the American Statistical Association, 105, 493–505.
——— (2015): “Comparative politics and the synthetic control method,” American Journal
of Political Science, 59, 495–510.
Abadie, A. and J. Gardeazabal (2003): “The Economic Costs of Conflict: A Case Study
of the Basque Country,” The American Economic Review, 93, pp. 113–132.
Allen, R. C. (2003): Farm to Factory: A Reinterpretation of the Soviet Industrial
Revolution, Princeton Economic History of the Western World series, Princeton and Oxford:
Princeton University Press.
Bolt, J. and J. L. van Zanden (2013): “The First Update of the Maddison Project;
Re-Estimating Growth Before 1820.” Working Paper 4, Maddison Project.
Bradley, J. F. N. (1975): Civil War in Russia, 1917-1920, BT Batsford Limited.
Bullock, D. (2008): The Russian Civil War, 1918-22, vol. 69, Osprey Publishing.
Cheremukhin, A., M. Golosov, S. Guriev, and A. Tsyvinski (2017): “The Industri-
alization and Economic Development of Russia through the Lens of a Neoclassical Growth
Model,” The Review of Economic Studies, 84, 613.
Doudchenko, N. and G. W. Imbens (2016): “Balancing, Regression, Difference-In-
Differences and Synthetic Control Methods: A Synthesis,” Working Paper 22791, National
Bureau of Economic Research.
Ellis, J. and M. Cox (2001): The World War I databook: the essential facts and figures
for all the combatants, Aurum Press Ltd.
Erlikman, V. (2004): Poteri narodonaseleniia v XX veke: spravochnik, vol. 1, Moscow.
Gregory, P. R. (2004): Russian National Income, 1885-1913, Cambridge University Press.
Hunter, H. and J. M. Szyrmer (2014): Faulty foundations: Soviet economic policies,
1928-1940, Princeton University Press.
Markevich, A. and M. Harrison (2011): “Great War, Civil War, and Recovery: Russia’s
National Income, 1913 to 1928,” The Journal of Economic History, 71, 672–703.
Mawdsley, E. (2007): The Russian Civil War, Pegasus Books.
International Labor Office (1923 – 25): Enquete sur la production: Rapport general,
vol. 1, Paris Berger-Levrault.
Pinotti, P. (2012): “The Economic Costs of Organized Crime: Evidence from Southern
Italy.” Temi di Discussione (Working Paper) 868, Bank of Italy.
Pipes, R. (2001): Communism: A History, Modern Library.
Tibshirani, R. (1996): “Regression Shrinkage and Selection via the Lasso,” Journal of the
Royal Statistical Society. Series B (Methodological), 58, pp. 267–288.
Urlanis, B. (1971): Wars and population, M.: Progress.
Wheatcroft, S. G. and R. Davies (2004): The Years of Hunger: Soviet Agriculture,
1931-1933, Palgrave Macmillan.