The Gambler’s Fallacy Dominates the Hot Hand
in Lottery Play∗†
Brian Dillon∗‡ Travis J. Lybbert‡
April 29, 2020
Working paper, comments welcome.
Abstract
We use a year of individually identifiable, administrative data from a mobile lottery in Haiti to examine how players react to the winning histories of numbers. The average player avoids numbers that recently won. This gambler’s fallacy effect gradually fades over several weeks. A small share of players exhibit the opposite tendency and always bet a number that just won (the hot hand fallacy). We re-analyze the data from a recent paper that finds evidence of streak switching (betting with the gambler’s fallacy after short streaks and the hot hand fallacy after long streaks) in administrative lottery data from Denmark. We show that this is an artifact of an overly restrictive parametric model. Relaxing simple assumptions in the Denmark analysis confirms that there is no general pattern of streak switching, and that the gambler’s fallacy prevails in both settings.
Keywords: gambler’s fallacy; hot-hand fallacy; lottery; law of small numbers; Haiti.
∗†We are grateful to Hilary Wething and Ben Glasner for excellent research assistance, and to Dan Benjamin for comments on an earlier draft. Any errors are our responsibility.
∗‡Cornell University. Email: [email protected].
‡University of California, Davis. Email: [email protected].
1 Introduction
Without formal training, many people struggle to form correct statistical intuitions. This
paper focuses on two common types of mistaken intuition: the gambler’s fallacy (GF) and
the hot hand fallacy (HHF). The GF is the belief that a number drawn from an independent,
identically distributed (i.i.d.) process is less likely to be drawn immediately after it wins.
This belief in too little serial correlation is rooted in the mistaken sense that small samples
should look like large samples (Kahneman et al., 1982; Rabin, 2002). The HHF is the belief
in too much serial correlation, namely that a number is more likely to be drawn if it has
won in the recent past.
A large literature has documented the prevalence of these biases and examined their
consequences for choice and belief formation (see Benjamin (2019) for a review). Many
of those studies are in the realms of finance, law, or sports, where neither the agents nor
the econometricians know the true time series properties of the data generating process
(DGP) (Camerer, 1989; Xu and Harvey, 2014; Chen, Moskowitz and Shue, 2016). Laboratory
experiments remove uncertainty about the DGP, but suffer from the usual concerns about
the artificiality of the environment. Even when the DGP is known, the analysis of streaks
in a finite sample is subject to subtle but important challenges for inference (Miller and
Sanjurjo, 2018).
Lottery games in which players select their own numbers provide an ideal setting
for studying the GF and HHF, as they represent natural experiments with real stakes and
known DGPs. Foundational work by Clotfelter and Cook (1993) tests for the presence of
the GF and HHF in aggregate lottery data. They find that betting on a number falls after
it wins, and that this GF effect persists for many future rounds. With aggregate data they
cannot distinguish changes in the composition of the player pool from within-player changes
in beliefs, and cannot determine whether the average GF effect reflects a mixture of both GF
and HHF responses. Suetens, Galbo-Jørgensen and Tyran (2016) (hereafter SGT) overcome
these challenges by using individually identifiable administrative data from an online lottery
in Denmark to test for GF and HHF betting. SGT find evidence of the GF. They also claim
to find support for “streak switching,” a more complex bias that accommodates both the GF
and the HHF (Rabin and Vayanos, 2010). A streak switching player avoids numbers after
short streaks, but comes to believe after a sufficiently long streak that the DGP is biased in
favor of the streaking number, and then begins to place hot hand bets.
In this paper we use a year’s worth of individually identifiable, administrative data
from a twice-daily lottery in Haiti to address two research questions. First, for how long and
in what direction do past wins influence the amount of betting on a number? Second, do
lottery players react differently to short win streaks and long win streaks? Consistent with
the theoretical predictions of Rabin and Vayanos (2010) for an i.i.d. process, we find that
the GF dominates. The average player systematically avoids numbers that recently won.
The deterrent effect of a win decays gradually as the win fades into the past, but persists
for as long as a month. We find no evidence of streak switching. We then re-analyze the
lottery data used by SGT and find that the appearance of streak switching in Denmark is an
illusion created by questionable empirical choices. Alternative specifications, including one
that relaxes a linearity assumption in SGT but otherwise preserves their approach, indicate
that the GF dominates in Denmark as it does in Haiti.
Because our data set includes individual identifiers, we can detect betting patterns
that are hidden in aggregate data. While the average player bets in accordance with the
GF, we find in Haiti that 8% of players select a number that won in the previous round at
least 50% of the time that they bet, consistent with the HHF.1 This suggests the presence
of (at least) two player types in lottery play, and likely in other similar settings.
The broad similarity between findings from Denmark and Haiti indicates that the in-
tuition underlying the GF is deeply rooted in human cognition. The results from Haiti may
also have practical value. Economists have recently begun to take seriously the prominent
role of gambling in the financial lives of the poor in developing countries (Bernstein, 2015;
Herskowitz, 2016). This growing literature is opening promising new policy and behavioral
design possibilities (Brune, 2015; Cole, Iverson and Tufano, 2014; Gertler et al., 2018), including
lottery-linked savings in Haiti (Dizon and Lybbert, 2019). A deeper understanding
of the gambling proclivities of the poor is an important first step toward designing behavioral
finance interventions that accommodate these longstanding habits while also building
pathways toward greater financial stability.
1We explain later how the structure of the Danish lottery makes it impossible to interpret the results of a similar analysis.
2 Theoretical Framework
The GF and the HHF may appear to be mutually exclusive, because they suggest opposite
reactions to recent events. However, a longstanding tradition in psychology and more recently
economics allows for the possibility that these fallacies may be related (Edwards, 1961;
Camerer, 1989; Rabin, 2002; Rabin and Vayanos, 2010). Rabin and Vayanos (2010) develop
a model that formalizes the possible connection between the two fallacies. The agent in the
model is dogmatically inclined to believe in the GF, possibly because she thinks that small
samples should look like large samples. She observes a sequence of draws from a DGP and
uses Bayesian inference to form beliefs about parameters. When the agent is uncertain about
the DGP or believes that draws are serially dependent, her reaction to a win streak may be
different for short and long streaks. Such streak switching occurs because when a number
exhibits a short winning streak, she believes it is even less likely to appear again in the near
future, due to her GF beliefs. If the streak continues, she over-interprets this as a signal
that the DGP favors the number in question, and updates her beliefs accordingly. She now
expects the number to keep winning. Beyond some threshold streak length, the agent flips
from the GF to the HHF.
Uncertainty about the DGP is central to the streak switching prediction. When the
DGP is fixed and draws are i.i.d.—which is an apt description of a lottery—the agent always
expects reversals (i.e., displays GF reasoning), leaving no room for the HHF to emerge. Rabin
and Vayanos (2010) make this explicit in their Prediction 1 (p. 751): “When individuals
observe i.i.d. signals and are told this information, they expect reversals after streaks of any
length. The effect is stronger for long streaks.” As long as lottery players believe that the
lottery is an i.i.d. process, our analysis is a test of Prediction 1, and hence a test of whether
the average lotto player exhibits dogmatic attachment to the GF.
Of course, some people may not believe that the lottery is i.i.d. Many players choose
numbers based on sentiment, premonition, or recent events, suggesting a belief that lottery
odds are mutable or governed by supernatural forces. For many Haitians, superstitions about
the lottery are part of a broader religious worldview in which divine intervention and fate
have direct bearing on daily life (Bhatia, 2010). Active engagement in number choice not
only makes lotteries more entertaining, but also may lead players to believe that the odds
change in response to their participation. Prior work shows that some lottery players exhibit
an illusion of control—a mistaken belief that a number is more likely to win if they actively
choose it rather than have it randomly assigned (Langer, 1975).
To accommodate possible heterogeneity in players’ beliefs about the lottery DGP,
our analysis allows for different responses to short and long winning streaks. Prediction 1
of Rabin and Vayanos suggests that the GF will dominate and the average player will avoid
recent winners after streaks of any length. Evidence of streak switching would indicate either
that the prediction is wrong, or that the average player does not believe the lottery is i.i.d.
3 Setting and Data
3.1 The Mobile Phone Lottery in Haiti
The lottery is part of the rhythm of daily life in Haiti. Millions of Haitians play frequently,
and some of the working poor routinely wager a large share of their daily income on lottery
games (Bernstein, 2015). Players often select numbers based on superstitions and dreams.
Concordances known as the Tchala, which are available at every lottery stall and online,2
translate elements in one’s dreams into numbers. To ensure transparency and trust, the wide
array of lottery games in Haiti are all based on numbers drawn in the New York Lottery.
While most of these games are administered by physical lotto stalls called borlettes, digital
lotteries played on mobile phones have gained popularity in recent years, particularly among
younger Haitians in urban and peri-urban areas.
2For an online version of the Tchala see http://lisa.ht/tchala/ (Accessed 21 November 2018).
We study a mobile phone lottery game called Boloto. Like all Haitian lottery games,
Boloto is played twice each day, corresponding to the midday and evening numbers drawn in
New York. To participate, a player places a bet consisting of three two-digit number pairs
(00-99) in a specified order. The cost of each bet is 25 Haitian gourdes (HTG), or about
0.60 USD in 2012. There is no limit to the number of bets a single player can make in each
round. To bet more money on a set of numbers, a player simply places additional bets.
The payout for Boloto is a function of which number matches the draw. Winning in
the first, second, or third position pays out 250 HTG (10x), 100 HTG (4x) or 50 HTG (2x),
respectively. If all three numbers are drawn, but not in order, the player wins 100,000 HTG
(4,000x). If all three numbers are drawn in order, the player wins the jackpot, which pays
out 2,000,000 HTG (80,000x). Payouts are independent across players—anyone playing a
winning number wins the full payout associated with the bet.3
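The multiples in parentheses follow from dividing each payout by the 25 HTG stake. A quick check of that arithmetic is sketched below; the dictionary keys are illustrative labels chosen for this example, not terms used by the lottery operator.

```python
# Stake and payouts (in HTG) as stated in the text above.
STAKE_HTG = 25
PAYOUTS_HTG = {
    "first position": 250,
    "second position": 100,
    "third position": 50,
    "all three, any order": 100_000,
    "all three, in order (jackpot)": 2_000_000,
}

# Each payout expressed as a multiple of the stake.
multiples = {outcome: payout // STAKE_HTG for outcome, payout in PAYOUTS_HTG.items()}

for outcome, multiple in multiples.items():
    print(f"{outcome}: {multiple:,}x")
```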
3.2 Data and Descriptive Statistics
This section describes the administrative lottery data from Haiti we use for our main analysis.
In Section 6 we briefly describe SGT’s dataset from Denmark, which we re-analyze for
comparison to our Haitian results.
The private firm licensed to conduct the digital lottery in Haiti provided us with
access to data for the universe of bets placed in Boloto from February 1, 2012 to January
31, 2013.4 For each bet we observe a player ID, the date of the game, an indicator for the
midday or evening round, the ordered set of three two-digit numbers that constitute the bet,
the time and date that the bet was placed, and the winning numbers. Player IDs are unique
numerical codes linked to mobile phone accounts. Across the 730 rounds (2 per day, for a
year), a total of 4,505,519 bets were placed, with over 13.5 million separate number choices.
The Boloto data includes bets from 112,808 different players. The average player
makes 39.9 bets over the year, in 12.7 different rounds, on 9.1 different days. About 1 in 200
players (0.5%) makes a bet in all 12 months; the average player makes at least one bet in
two separate months.
3The only exception is for multiple jackpots, in which case the winners split the payout. In practice this almost never happens, and in the year’s worth of data that we study there are only a few jackpots and no shared jackpots. We assume throughout that the possibility of splitting a jackpot does not shape individual number choices.
4There are 366 days in that range, because 2012 was a leap year, but only 365 days with betting (there is no data for Christmas Eve).
Our analysis examines the relationship between a number’s winning history and the
probability that it is bet. We first represent each number selection as 100 separate choices:
1 decision to play a number, and 99 decisions not to play each of the others. This allows us to take
advantage of both player and number fixed effects. Let $d_{ijnrp}$ be a dummy variable equal
to 1 if player $i$ in bet $j$ plays number $n$ in round $r$ in position $p$, and 0 otherwise. The
position $p$ refers to the first, second, or third number in the bet. The numbers $n$ lie in the
set $\{0, 1, 2, \ldots, 99\}$. The round, $r$, includes both the date and the time (midday or evening)
of the game. The bet indicator, $j$, captures the possibility that a player places multiple bets
per round. We use $J_{ir}$ to denote the number of bets placed by player $i$ in round $r$.
In our analysis the dependent variable is $\text{Played}_{inr} = \sum_{j=1}^{J_{ir}} \sum_{p=1}^{3} d_{ijnrp}$, which
is a count of the number of times in round r that i played n in any position and any
bet.5 This is approximately proportional to the amount wagered on the number. At the
player-number-round level, the full dataset contains 143 million observations. To make the
analysis tractable, we estimate regressions and some descriptive statistics using a random
10% subsample of players, fixed across specifications. This analysis sample consists of 11,348
players who place 140,904 bets, for a total of 14,090,400 observations after reshaping. In the
analysis sample, the mean value of Playedinr is 0.095.
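The reshaping step can be sketched as follows. This is a minimal illustration rather than the authors' code: the tuple layout of a bet record and the function name `played_counts` are assumptions made for the example.

```python
from collections import Counter

NUMBERS = range(100)  # Boloto numbers 00-99

def played_counts(bets):
    """Return {(player, round): [count for each number 0..99]}.

    `bets` is an iterable of (player_id, round_id, (n1, n2, n3)) records,
    a hypothetical layout chosen for this example. Each count is the number
    of times the player chose the number in the round, across all of the
    player's bets and all three positions.
    """
    counts = {}
    for player, rnd, numbers in bets:
        counts.setdefault((player, rnd), Counter()).update(numbers)
    # Expand to 100 entries per player-round so unplayed numbers appear as 0.
    return {key: [c[n] for n in NUMBERS] for key, c in counts.items()}

# One player places two bets in the same round.
bets = [
    ("p1", 1, (10, 33, 11)),
    ("p1", 1, (10, 0, 7)),
]
played = played_counts(bets)[("p1", 1)]
print(played[10], played[33], played[5], sum(played))
```

Each player-round expands to 100 rows, one per number, which is why the observation count is 100 times the number of player-round records.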
Panel A of Figure 1 shows the number of bets placed, by round. The spike on October
17 coincides with Dessalines Day, a national holiday that commemorates the assassination of
Haiti’s founder. A weekly cycle of activity is clearly visible, as is a seasonal pattern, with
increased activity during the months July–October.6 Panel B of Figure 1 shows the histogram
of numbers played. The most popular choice, 10, represents 4.7% of all plays. The four next
most popular numbers are 0, 11, 33, and 13, all of which are played at more than twice the
random rate. Doubles—22, 44, 55, 77, etc.—are also popular.
5Our findings are broadly similar if we define the dependent variable as $\text{PlayedDummy}_{inr} = \max_{1 \le j \le J_{ir}} \max_{1 \le p \le 3} d_{ijnrp}$, which measures the extensive margin choice to play a number at the player-number-round level. See Appendix.
6Greater play in summer could be due to return visits by Haitians living abroad, which peak in July–August. These visitors may gamble or give cash gifts that prompt more gambling by others. In September–October, increased gambling activity is likely driven by the main harvest. Even in urban areas, economic activity related to the harvest drives seasonal fluctuations in incomes.
There is no easy way to descriptively characterize GF betting in Boloto, because
betting with the GF implies not doing something, opting instead for one of a large set of
alternatives. It is easier to describe HHF betting. Figure 1, Panel C, shows a player-level
histogram of the share of played rounds in which the player chooses a number that was a
winner in the previous round. Nearly two thirds (63.8%) of players never make a hot hand
bet. In contrast, roughly 4% of players make a hot hand bet every time they play, and 8%
make a hot hand bet at least half the times they play.
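The player-level statistic behind this histogram can be sketched as below. The dictionary layouts and the function name `hot_hand_share` are hypothetical, and the toy draws are invented solely to exercise the function.

```python
def hot_hand_share(player_rounds, winners_by_round):
    """Share of a player's rounds that include a number drawn in the previous round.

    `player_rounds` maps round index -> set of numbers the player bet that round.
    `winners_by_round` maps round index -> set of numbers drawn that round.
    Both layouts are hypothetical. Rounds with no recorded previous draw are skipped.
    """
    eligible = hot = 0
    for rnd, chosen in player_rounds.items():
        previous = winners_by_round.get(rnd - 1)
        if previous is None:
            continue
        eligible += 1
        if chosen & previous:  # at least one chosen number just won
            hot += 1
    return hot / eligible if eligible else None

# Toy data: the player bets a previous-round winner in rounds 2 and 4, not in round 3.
winners = {1: {5, 50, 55}, 2: {12, 40, 77}, 3: {0, 10, 31}}
player = {2: {5, 18, 62}, 3: {1, 2, 3}, 4: {10, 11, 33}}
share = hot_hand_share(player, winners)
print(share)
```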
While the rate of hot hand play is roughly constant across the year, we do observe
occasional spikes in such betting after certain events. In one round nearly 40% of players
make a hot hand bet; the winning numbers from the previous round were 5-50-55. The next
five rounds with the highest shares of hot hand betting occur after a 0 or a 10 was a winner.
4 Empirical Approach
To provide a baseline characterization of how the amount bet on a number is related to
its recent success, we estimate OLS regressions of the following form using the analysis
subsample of the Boloto data:
$$\text{Played}_{inr} = \sum_{l=1}^{R} \beta_l \, \text{Winner}_{n,r-l} + \eta \, \text{Controls}_{inr} + \varepsilon_{inr} \qquad (1)$$
where $\text{Played}_{inr}$ is as defined in the previous section; $\text{Winner}_{n,r-l}$ is a binary variable
indicating whether $n$ was one of the drawn numbers in round $r-l$; $\text{Controls}_{inr}$ includes fixed
effects for players, numbers, and rounds; and $\varepsilon_{inr}$ is a statistical error term. With a
sufficiently large choice of $R$, a plot of the $\beta_l$ coefficients will non-parametrically trace out the
time path of effects of past wins on current betting. Evidence of $\beta_l < 0$ ($\beta_l > 0$) is consistent
with a GF effect (HHF effect) that persists for $l$ rounds.
Specification (1) does not account for the probability that a number drawn in round
$r-l$ will be drawn again prior to round $r$, which is increasing in $l$. The effect of winning,
especially winning in the distant past, may be underestimated if a number wins multiple times.
To account for this, we also estimate OLS regressions based on the following specification:
$$\text{Played}_{inr} = \sum_{l=1}^{R} \beta_l \, \text{MostRecentWin}_{n,r-l} + \eta \, \text{Controls}_{inr} + \varepsilon_{inr} \qquad (2)$$
which is identical to (1), except the key independent variable $\text{MostRecentWin}_{n,r-l}$ takes a
value of 1 only if $r-l$ is the most recent round in which $n$ was drawn, and 0 otherwise. Once
again, a finding of $\beta_l < 0$ ($\beta_l > 0$) is consistent with the GF (HHF).
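The difference between the regressors in specifications (1) and (2) is easiest to see on a small example. The sketch below builds both sets of lag indicators for one number and one round; the list-of-sets layout of the draw history and the function name `lag_indicators` are assumptions for illustration.

```python
def lag_indicators(draw_history, number, r, max_lag):
    """Build the regressors of specifications (1) and (2) for one number and round.

    `draw_history[t]` is the set of numbers drawn in round t (hypothetical
    layout, 0-indexed rounds). Returns two lists of length `max_lag`; entry
    l-1 refers to lag l. Lags that reach before the first round are coded 0.
    """
    winner = [
        int(r - l >= 0 and number in draw_history[r - l])
        for l in range(1, max_lag + 1)
    ]
    most_recent = [0] * max_lag
    if 1 in winner:
        most_recent[winner.index(1)] = 1  # only the smallest lag with a win
    return winner, most_recent

# Toy history: number 7 is drawn in rounds 0, 1, and 3.
history = [{7, 21, 90}, {3, 7, 45}, {11, 60, 61}, {7, 8, 9}, {14, 15, 16}]
winner, most_recent = lag_indicators(history, number=7, r=5, max_lag=5)
print(winner)       # lags 2, 4, and 5 are wins
print(most_recent)  # only lag 2 is flagged
```

In specification (1) every past win switches on its own lag dummy, whereas in specification (2) at most one dummy is switched on per number and round.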
Estimation of specifications (1) and (2) provides the average effect of lagged wins on
current betting. To test predictions about players’ reactions to streaks, we need to allow for
more complex interactions between past events. Lengthy winning streaks are rare in Boloto;
the probability that a specific number is drawn in a round is only 0.0297. Following SGT,
we define a streak as the co-occurrence of a win in the previous round with a history of winning
in other recent rounds. Formally, let $\text{Hotness}_{nr}$ be the number of times that $n$ was a winner
during rounds $r-2$ to $r-S$, for some integer $S \geq 2$; and let $\text{Hot}_{nrc}$ be a dummy variable
equal to 1 if $\text{Hotness}_{nr} = c$, and 0 otherwise. For each $r$, the winning streak of number $n$ is
given by $\text{Winner}_{n,r-1} \times \{\text{Winner}_{n,r-1} + \text{Hotness}_{nr}\}$ (e.g., the streak has length 3 if $n$ was
drawn in the previous round and was drawn twice in lagged rounds $2, \ldots, S$). To semi-parametrically
estimate the average response to streaks of different length, we estimate OLS regressions of
the following form:
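$$\text{Played}_{inr} = \beta \, \text{Winner}_{n,r-1} + \sum_{c=1}^{C} \left\{ \delta_c \, \text{Hot}_{nrc} + \gamma_c \left( \text{Winner}_{n,r-1} \times \text{Hot}_{nrc} \right) \right\} + \eta \, \text{Controls}_{inr} + \varepsilon_{inr} \qquad (3)$$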
where all variables are as defined above, $C$ is sufficiently large to include all observed streaks,
and $\varepsilon_{inr}$ is a statistical error term. We report results for $S \in \{6, 14, 60\}$, equivalent to defining
streaks over the previous 3 days, 7 days, and 30 days.
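The streak variables can be sketched in a few lines. The draw-history layout and the function name `hotness_and_streak` are hypothetical. The first calculation shows that the per-round win probability of 0.0297 is what one obtains if the three winning numbers are treated as independent uniform draws from 00-99, which is an assumption of this sketch.

```python
def hotness_and_streak(draw_history, number, r, S):
    """Hotness and streak length of `number` going into round r.

    `draw_history[t]` is the set of numbers drawn in round t (hypothetical
    layout). Hotness counts wins in rounds r-2, ..., r-S. The streak length is
    Winner_{n,r-1} * (Winner_{n,r-1} + Hotness_{nr}), so it is 0 whenever the
    number did not win in round r-1.
    """
    winner_prev = int(r - 1 >= 0 and number in draw_history[r - 1])
    hotness = sum(
        1 for t in range(r - S, r - 1) if t >= 0 and number in draw_history[t]
    )
    return hotness, winner_prev * (winner_prev + hotness)

# Probability that a given number is among the three draws of a round,
# assuming three independent uniform draws from 00-99.
p_win = 1 - (99 / 100) ** 3
print(round(p_win, 4))

# Toy history: 7 wins in rounds 0, 1, and 4; 3 wins in rounds 1 and 3.
history = [{7, 21, 90}, {3, 7, 45}, {11, 60, 61}, {1, 2, 3}, {7, 8, 9}]
print(hotness_and_streak(history, number=7, r=5, S=6))  # won in r-1: streak of 3
print(hotness_and_streak(history, number=3, r=5, S=6))  # no win in r-1: streak of 0
```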
In equation (3), the player, number, and round fixed effects account for average
differences between players, average popularity of numbers, and temporal patterns in betting.
The total effect on current betting of a streak of length 1 is given by β. The total effect of a
streak of length d > 1 is νc = β + δc + γc, where c = d− 1. If players are not influenced by
either the GF or the HHF, we expect β = δc = γc = 0 for all c. The predictions of Rabin and
Vayanos (2010) are equivalent to (i) β < 0, (ii) νc < 0 for all c, and (iii) νd < νc for any d > c
(because the model predicts that longer streaks induce a larger GF effect). Alternatively, if
players bet with the GF after short streaks and the HHF after long streaks—which is not
the prediction of the Rabin and Vayanos model when the process is known to be i.i.d.—then
9
we expect β < 0 and νc > 0 for all c of sufficient length.
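The mapping from coefficients to the competing predictions can be sketched as below. The function names and the labels returned by `classify` are illustrative, and the coefficient values are made up purely to exercise the functions; they are not estimates from the paper.

```python
def streak_effects(beta, delta, gamma):
    """Total effects by streak length d: beta for d = 1, beta + delta_c + gamma_c for d = c + 1.

    `delta` and `gamma` are dicts keyed by c = 1, ..., C.
    """
    effects = {1: beta}
    for c in sorted(delta):
        effects[c + 1] = beta + delta[c] + gamma[c]
    return effects

def classify(effects):
    """Label the pattern of total effects, ordered by streak length."""
    values = [effects[d] for d in sorted(effects)]
    if all(v < 0 for v in values):
        strengthening = all(b < a for a, b in zip(values, values[1:]))
        return "GF, strengthening with streak length" if strengthening else "GF at all lengths"
    if values[0] < 0 and values[-1] > 0:
        return "streak switching"
    return "other"

# Made-up coefficients, used only to exercise the functions.
effects = streak_effects(
    beta=-0.010,
    delta={1: -0.004, 2: -0.006},
    gamma={1: -0.006, 2: -0.014},
)
print(classify(effects))
```

A point-estimate classification of this kind ignores sampling error, so it complements rather than replaces the significance tests reported with the estimates.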
Estimates based on equation (3) provide average effects for streaks of a given length.
To allow for complete flexibility in the estimated response to any combination of past wins,
we also estimate a fully non-parametric model for the previous six rounds (S = 6). In
this model, the dependent variable is Playedinr, and the independent variables are dummy
variables for all observed combinations of wins during the previous 6 rounds. As always, we
include player, number, and round fixed effects.
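One way to encode the regressors of the fully non-parametric model is to record, for each number and round, the set of lags at which the number won. In the sketch below, the layout of the draw history and the name `win_pattern` are hypothetical, and treating the empty pattern as the reference category is an assumption of the example.

```python
def win_pattern(draw_history, number, r, window=6):
    """Tuple of lags (1 = previous round) at which `number` won within `window` rounds.

    `draw_history[t]` is the set of numbers drawn in round t (hypothetical
    layout). Each distinct non-empty tuple would get its own dummy variable;
    the empty tuple (no recent win) serves as the reference category here.
    """
    return tuple(
        l for l in range(1, window + 1)
        if r - l >= 0 and number in draw_history[r - l]
    )

# Toy history: number 7 is drawn in rounds 0, 1, and 4.
history = [{7, 21, 90}, {3, 7, 45}, {11, 60, 61}, {1, 2, 3}, {7, 8, 9}, {4, 5, 6}]
print(win_pattern(history, number=7, r=6))   # wins at lags 2, 5, and 6
print(win_pattern(history, number=99, r=6))  # no recent win
```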
For all models we report standard errors clustered at the player level. Because the
average player participates in just 12.7 out of 730 rounds, we do not impose balance on the
panel. Hence, our analysis takes as given the extensive margin decision to participate in the
lottery in any particular round.
5 Results
For our baseline estimates of equations (1) and (2) we use a set of 84 dummy variables
covering every round in the previous 6 weeks (R = 84). Figure 2 plots the coefficients on
the dummy variables representing a win each lagged period, with 95% confidence intervals.
Panel A reports estimates from specification (1), based on all recent wins; Panel B reports
estimates from specification (2), based on only the most recent win for a number.
The average effect of a number being drawn in lagged rounds 2-84 indicates a sur-
prisingly persistent GF response. Winning never leads to an increase in betting, on average.
Players avoid numbers that have won recently, but the effect is attenuated as the win fades
into the past. In both panels, the deterrent effect of a recent win is statistically significant
for almost 60 rounds (30 days). A win in lagged round 2, 3, or 4 decreases the number of
times a number is selected by 0.032–0.035, a reduction of over a third from the mean of
0.095. The effects are even larger in magnitude when we restrict attention to only the most
recent win (Panel B).
The exception to the pattern of diminishing effect size is from a win in the immediately
preceding round. The point estimate for a win in the preceding round is approximately −0.01
in Panel A, less than a third of the magnitude of the effect of a win in lagged rounds 2–4.
This attenuation is likely driven by the small share of hot hand players that bet a number
immediately after it wins (Figure 1, Panel C).
Table 1 shows estimates of equation (3). Columns 1, 2, and 3 report the findings for
streaks defined over the previous 3 days, 7 days, and 30 days, respectively. Panel A reports
coefficient estimates, and Panel B reports the estimated marginal effects (νc). Across all
streak lengths and specifications, there are no positive marginal effects of a winning streak
on the probability that a number is bet. All 13 of the estimated effects in Panel B are
negative, and 10 are statistically different from zero.7 In column 1, the negative effect of a
win streak on the probability that a number is selected is increasing in streak length. Betting
on a number falls by 0.011 percentage points (11.6% of the mean), 0.021 percentage points
(22.1%), and 0.031 percentage points (32.6%) after streaks of length 1, 2, and 3, respectively.
The positive relationship between streak length and the magnitude of the deterrent effect
is consistent with Rabin and Vayanos (2010) when the DGP is i.i.d. (and participants are
aware of that fact). When we define streaks over periods of 7 or 30 days, the GF again
dominates (columns 2 and 3), but the marginal effects do not increase monotonically in
streak length (see Section 7).
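The percentages in parentheses for column 1 are the absolute reductions divided by the sample mean of 0.095 reported in Section 3.2. A short check of that arithmetic, with variable names chosen for this example:

```python
MEAN_PLAYED = 0.095  # sample mean of the dependent variable (Section 3.2)

# Absolute reductions after streaks of length 1, 2, and 3 (Table 1, column 1).
reductions = {1: 0.011, 2: 0.021, 3: 0.031}

# Each reduction expressed as a percentage of the mean.
relative = {length: 100 * drop / MEAN_PLAYED for length, drop in reductions.items()}

for length, pct in relative.items():
    print(f"streak length {length}: {pct:.1f}% of the mean")
```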
Columns 1-3 of Table 2 show estimates from the non-parametric model, in which the
independent variables are dummy variables for all observed combinations of wins during the
previous 3 days (6 rounds). Column 3 reports the number of occurrences of each pattern
over the year of data. All combinations of three wins are observed 3 or fewer times (however,
thousands of people play the lottery after each occurrence). There is no pattern of recent wins
that increases betting on a number, on average. Out of 36 coefficients, 31 are statistically
different from zero, and all of those are negative.
7The coefficients that are not statistically different from zero are for the least commonly observed streak lengths. There are only 63 instances of a streak of length 4 in column 2; 45 instances of a streak of length 5 in column 3; and 11 instances of a streak of length 6 in column 3.
6 Re-analysis of the Danish Lottery Data in SGT
SGT use administrative data from a lottery in Denmark to study the GF and HHF. Their
dataset, like ours, includes individual player identifiers. SGT frame their analysis as a test
of the streak switching prediction from Rabin and Vayanos (2010), namely, that players bet
with the GF after short streaks and the HHF after long streaks. They claim to find evidence
of this pattern in the behavior of Danish lottery players. That finding is inconsistent with
our findings from Haiti, where the GF predominates after streaks of any length. After
re-analyzing the Danish data, we believe this apparent inconsistency is an illusion.
We have two main concerns about the approach in SGT. The first is conceptual.
Uncertainty about the underlying DGP is a necessary condition for streak switching in the
model of Rabin and Vayanos (2010). Yet, there is no uncertainty about the DGP in the
Danish lottery, just as there is no uncertainty about the DGP in the Boloto lottery analyzed
here.8 The SGT analysis is premised on testing a prediction of Rabin and Vayanos (2010)
that does not apply to the setting.
Second, SGT make a number of puzzling specification choices. They omit player and
round fixed effects, include a lagged dependent variable, impose linearity on the relationship
between streak length and betting outcomes, and impute zeroes to strictly balance the panel
(in some specifications). One can quibble with these choices to varying degrees. At a
minimum, it is difficult to understand why player fixed effects would be excluded from the
analysis, when a key innovation relative to prior work is the ability to identify players.
To shed light on the apparent discrepancy in findings from Haiti and Denmark, we
re-analyze the data from SGT. The Danish lottery data covers all plays in an online, weekly
lottery game called System Lotto, over a period of 28 weeks in 2005. Seven winning numbers
are drawn each week, without replacement, from the positive integers 1, . . . , 36. The data
set includes at least one choice by 25,807 players. To participate, players select between
8 and 31 numbers, and the online system randomly selects 7 of those numbers to be the
player’s bet. To increase the wager on a number, players can purchase more tickets and/or
choose fewer numbers per bet. SGT calculate the dependent variable “Money bet”, at the
player-round-number level, as the implied value of the bet on each number.9
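A rough sketch of how such a variable could be computed is given below. It follows one reading of the short description of Money Bet (the player's total wager in the round, times each number's share of all numbers selected); the exact construction is documented in SGT, so the function name, the input layout, and the 24 DKK wager are all assumptions for illustration.

```python
from collections import Counter

def money_bet(total_wagered, selections):
    """Allocate a player's total wager in a round across numbers.

    `selections` lists every number the player selected in the round, with
    repeats if a number appears on several tickets (hypothetical layout).
    Each number receives the total times its share of all selections.
    """
    counts = Counter(selections)
    total_selections = sum(counts.values())
    return {n: total_wagered * k / total_selections for n, k in counts.items()}

# Illustrative player: wagers 24 DKK and selects eight numbers once each.
allocation = money_bet(24.0, [1, 4, 9, 12, 20, 25, 30, 36])
print(allocation[9])
```

Under this reading, selecting fewer numbers for the same outlay raises the implied amount on each of them, in line with the remark that players can increase the wager on a number by choosing fewer numbers per bet.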
8The exception, as we noted in Section 2, could be that players do not believe that the lottery is i.i.d. (a possibility not mentioned in SGT).
9Money Bet on number i is the total value of the player’s bets that round, times i’s share of all numbers selected. See SGT for more details about the lottery and data.
Using the Danish data, we estimate a model that is almost identical to one of SGT’s
main specifications. They regress Money Bet on $\text{Winner}_{n,r-1}$, $\text{Hotness}_{nr}$, the interaction
term $\text{Winner}_{n,r-1} \times \text{Hotness}_{nr}$, number fixed effects, a lagged dependent variable, and a
dummy variable for weeks with larger jackpots. We estimate the same specification, but
replace $\text{Hotness}_{nr}$ with the levels and interactions of the variables $\text{Hot}_{nrc}$, to allow for
non-linearity in streak length. The time period for defining streaks is the previous six rounds (as
in SGT). We also estimate equations (1), (2), and (3), and the fully non-parametric model
using all observed combinations of wins over the previous six rounds. All regressions are at
the player-number-round level, with standard errors clustered by player, and Money Bet as
the dependent variable (Money Bet is roughly comparable to the dependent variable in the
Haiti analysis).
Baseline estimates of equations (1) and (2) for Denmark are shown in panels C and
D of Figure 2. In both panels there is evidence of the GF after recent wins. The average
effect of a win in the previous 1-2 rounds is to reduce the amount bet by 0.013-0.022 Danish
krone (DKK). These are much smaller effects than we found in Haiti, representing less than
one percent of the mean bet of 2.76 DKK. The deterrent effect of a win disappears after two
rounds, suggesting that players do not react systematically to wins that occurred three or
more weeks previously.
Estimates of the slightly modified SGT specification are reported in column 4 of Table
1. Three of the five estimated marginal effects are negative and statistically significant; the
other two are not statistically significant (Panel B). When these same marginal effects are
estimated in SGT, the assumption of linearity in Hotness creates the spurious appearance
of a trend that leads to hot hand betting after longer streaks. After relaxing the linear-
ity assumption, we find no statistically significant evidence of hot hand betting after long
streaks. When we go a step further and estimate our preferred specification using the Danish
data (specification (3)), all of the estimated marginal effects are negative (Table 1, column
5). Only the effects of streaks of length 1 and 2 are statistically different from zero. The
magnitude of the deterrent effect does not increase in streak length (see Section 7).
Columns 4-6 of Table 2 show the non-parametric estimates for Denmark. There are
six statistically significant effects for combinations that include a win in the previous round
(lag 1). All are negative, except for the weakly positive effect of winning in periods {1, 3, 5,
6}. This appears to be spurious, as the point estimates for wins in lags {1, 2, 3, 5}, {1, 2, 4,
5}, and {1, 2, 3, 4, 6} are all negative (if imprecise), and we see in Table 1 (column 5) that
the average effects of streaks of any length are never positive and statistically significant.
7 Discussion
Our analysis of lottery data from Haiti and Denmark shows that the average player falls prey
to the GF by avoiding numbers that recently won. This is consistent with the predictions of
Rabin and Vayanos (2010) when the DGP is known to be i.i.d. and players are dogmatically
inclined to believe that small samples should look like large samples. We also find evidence
of ideological attachment to the HHF by a small share of players. In Haiti, 8% of players
bet a recent winner at least half the times they play, and 4% always bet a recent winner.
In the Haiti analysis, when we define streaks as collections of wins over the previous
3 days (6 lottery rounds), we find support for another prediction of Rabin and Vayanos,
namely, that betting on a winner decreases with streak length (Table 1, column 1, Panel B).
We do not find the same pattern for streaks defined over 7 or 30 days. This could be due
to systematic error in streak definition. If past wins are salient for less than 7 or 30 days
for some players, our estimates of the average deterrent effect of wins over those periods will
be attenuated, because some players are reacting to streaks that they perceive to be shorter
than those defined by us. Panels A and B of Figure 2 suggest that in Haiti there is a slight
GF effect for wins as long ago as 60 rounds (30 days), but the magnitudes are small for
distant wins, indicating a gradual weakening of the overall GF response.
In our re-analysis of the Danish lottery data, we also find that the magnitude of the
GF effect does not increase in streak length. Again, this could be due to the use of an
overly long recall period to define streaks. We define streaks over the previous six rounds
(following SGT), yet a simple regression of money bet on lagged wins indicates that the GF
effect in Denmark disappears after two rounds (Figure 2, panels C and D).
There are other challenges to inference rooted in the structure of the Danish System
Lotto game. System Lotto players choose between 8 and 31 numbers, out of a possible 36.
Seven numbers are winners each week, so that on average, 26.16 out of the 36 numbers will
be winners over six consecutive rounds. With limited number choices and a high share of
numbers winning, it is computationally difficult to react consistently to all wins over six
rounds.10 This structure also makes it difficult to identify hot hand tendencies. Panel D of
Figure 1 shows a player-level histogram of the share of played rounds in which the player
chooses a number that was a winner in the previous round. Most Danish players make a hot
hand bet at least 80% of the time. Yet, we have seen that winning slightly reduces betting on
a number for the next two rounds. The small choice set and high winning probability at the
number level make it difficult to distinguish genuine hot hand tendencies from a preference
for picking many numbers, some of which will mechanically be recent winners. Of course,
we cannot rule out other explanations for this high rate of hot hand play in Denmark. But
what it highlights is that, although the overall takeaway from the two settings is similar, the
design of the Haitian Boloto game allows for cleaner testing of GF and HHF beliefs than the
Danish System Lotto game.
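The figure of 26.16 numbers can be reproduced with a short calculation. The sketch below assumes that each round draws 7 distinct numbers uniformly from 36 and that rounds are independent, so a given number misses a single draw with probability 29/36. The function name and parameters are illustrative.

```python
from fractions import Fraction


def expected_distinct_winners(total=36, drawn=7, rounds=6):
    """Expected count of numbers that win at least once over `rounds`
    independent draws of `drawn` distinct numbers out of `total`."""
    # Probability that a given number is not drawn in any of the rounds.
    p_never = Fraction(total - drawn, total) ** rounds
    return float(total * (1 - p_never))


print(round(expected_distinct_winners(), 2))  # 26.16
```

Under the same assumptions, only about 36 - 26.16 = 9.84 numbers are expected to have no win in the previous six rounds. Players must choose at least 8 numbers, so there is little room to avoid every recent winner.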
Our analysis confirms the prominence of the GF in how people understand repeated
draws from a random process. Underneath this average GF response is a minority of players
that doggedly adheres to the opposing HHF. This discovery of two distinct player types in
high-frequency administrative data may provide new foundations for models in which players
of finite types make (potentially biased) choices that have important influence on aggregate
outcomes. Although it may be too early to use these specific behavioral insights to inform
the design of financial services and products (e.g., Gertler et al., 2018; Cole, Iverson and
Tufano, 2014; Dizon and Lybbert, 2019), they contribute to a more complete picture of
gambling tendencies, which may ultimately enable more productive financial management
and greater financial inclusion among the poor.
10 We see some evidence of this in Panels C and D of Figure 2, where Danish players significantly increase their bets on numbers that won 6-7 rounds ago. This is likely mechanical. If players are avoiding recent winners, they can pick a high share of all numbers, and there are 7/36 winners each week, then players will be forced to pick some not-so-recent winners.
References
Benjamin, Daniel J. 2019. “Errors in probabilistic reasoning and judgment biases.” In Handbook of Behavioral Economics: Applications and Foundations 1. Vol. 2, 69–186. Elsevier.
Bernstein, Rachel L. 2015. “In Pursuit of the Transformational Sum: Lottery and Savings in Haiti.” University of California, Davis.
Bhatia, Pooja. 2010. “Dream Ticket.” The National, Friday, April 2: 3–5.
Brune, Lasse. 2015. “The Effect of Lottery-Incentives on Labor Supply: A Firm Experiment in Malawi.” Economic Growth Center, Yale University.
Camerer, Colin F. 1989. “Does the Basketball Market Believe in the ‘Hot Hand,’?” The American Economic Review, 79(5): 1257–1261.
Chen, Daniel L, Tobias J Moskowitz, and Kelly Shue. 2016. “Decision making under the gambler’s fallacy: Evidence from asylum judges, loan officers, and baseball umpires.” The Quarterly Journal of Economics, 131(3): 1181–1242.
Clotfelter, Charles T, and Philip J Cook. 1993. “Notes: The ‘gambler’s fallacy’ in lottery play.” Management Science, 39(12): 1521–1525.
Cole, Shawn Allen, Benjamin Charles Iverson, and Peter Tufano. 2014. “Can gambling increase savings? Empirical evidence on prize-linked savings accounts.” Empirical Evidence on Prize-Linked Savings Accounts (August 8, 2014). Saïd Business School WP, 10.
Dizon, Felipe, and Travis J Lybbert. 2019. “Leveraging the lottery for financial inclusion: Lotto-linked savings accounts in Haiti.” Economic Development and Cultural Change.
Edwards, Ward. 1961. “Probability learning in 1000 trials.” Journal of Experimental Psychology, 62(4): 385.
Gertler, Paul, Sean Higgins, Aisling Scott, and Enrique Seira. 2018. “The Long-Term Effects of Temporary Incentives to Save: Evidence from a Prize-Linked Savings Field Experiment.”
Herskowitz, Sylvan. 2016. “Gambling, Saving, and Lumpy Expenditures: Sports Betting in Uganda.”
Kahneman, Daniel, Paul Slovic, and Amos Tversky. 1982. Judgment under uncertainty: Heuristics and biases. Cambridge University Press.
Langer, Ellen J. 1975. “The illusion of control.” Journal of Personality and Social Psychology, 32(2): 311.
Miller, Joshua B, and Adam Sanjurjo. 2018. “Surprised by the hot hand fallacy? A truth in the law of small numbers.” Econometrica, forthcoming.
Rabin, Matthew. 2002. “Inference by believers in the law of small numbers.” The Quarterly Journal of Economics, 117(3): 775–816.
Rabin, Matthew, and Dimitri Vayanos. 2010. “The gambler’s and hot-hand fallacies: Theory and applications.” The Review of Economic Studies, 77(2): 730–778.
Suetens, Sigrid, Claus B Galbo-Jørgensen, and Jean-Robert Tyran. 2016. “Predicting lotto numbers: a natural experiment on the gambler’s fallacy and the hot-hand fallacy.” Journal of the European Economic Association, 14(3): 584–607.
Xu, Juemin, and Nigel Harvey. 2014. “Carry on winning: The gamblers’ fallacy creates hot hand effects in online gambling.” Cognition, 131(2): 173–180.
8 Figures
[Figure 1, image not reproduced. Panel A: Number of bets by round, Haiti (x-axis: Date; y-axis: Number of bets). Panel B: Histogram of numbers played, Haiti (x-axis: Number played; y-axis: Density; annotated numbers: 0: 2.7%, 10: 4.7%, 11: 2.5%, 13: 2.1%, 33: 2.4%). Panel C: Hot-hand betting, player level, Haiti. Panel D: Hot-hand betting, player level, Denmark. Panels C and D share axes (x-axis: Share of played rounds with at least one hot play; y-axis: Density).]
Figure 1: Distribution of lottery play and hot hand play
Notes: Authors’ calculations from administrative lottery data. Summer is defined as July-October. Panels
A, B, and D based on full datasets. Panel C based on 10% subsample from Haiti.
[Figure 2, image not reproduced. Panel A: Haiti: number selected. Panel B: Haiti: number most recently selected. Panel C: Denmark: number selected. Panel D: Denmark: number most recently selected. All panels: x-axis: Lag (0 to 80 for Haiti, 0 to 10 for Denmark); y-axis: Estimated coefficient w/ 95% C.I.]
Figure 2: Coefficients on binary variables for wins during previous rounds
9 Tables
Table 1: The Effects of Winning Streaks on Betting
Dependent variable:
  Haiti: number of bets placed by player i on number n in round r (Played_inr)
  Denmark: amount of money bet by player i on number n in round r

                     HAITI                                  DENMARK
                     Lag used to define streaks             Specification
                     3 days       7 days       30 days      Modified     Ours
                                                            original
                     (1)          (2)          (3)          (4)          (5)

Panel A: Estimated Coefficients
Winner               -0.011***    -0.013***    -0.015***    -0.061***    -0.033***
                     (0.004)      (0.004)      (0.005)      (0.010)      (0.010)
Hot 1                -0.032***    -0.026***    -0.022***    -0.007*      -0.009
                     (0.002)      (0.002)      (0.002)      (0.004)      (0.006)
Hot 2                -0.040***    -0.035***    -0.034***    0.003        -0.005
                     (0.003)      (0.002)      (0.002)      (0.005)      (0.007)
Hot 3                -0.034***    -0.040***    -0.040***    -0.015**     -0.006
                     (0.012)      (0.003)      (0.003)      (0.006)      (0.010)
Hot 4                             -0.037***    -0.034***    0.010        -0.022
                                  (0.007)      (0.003)      (0.019)      (0.019)
Hot 5                                          -0.031***
                                               (0.005)
Winner × Hot 1       0.022***     0.010***     0.007*       0.051***     0.020**
                     (0.003)      (0.003)      (0.004)      (0.009)      (0.008)
Winner × Hot 2       0.020*       0.027***     0.016***     0.053***     0.020**
                     (0.010)      (0.006)      (0.003)      (0.010)      (0.009)
Winner × Hot 3                    0.039***     0.029***     0.019        0.032**
                                  (0.015)      (0.006)      (0.019)      (0.016)
Winner × Hot 4                                 0.038**      0.106**      0.006
                                               (0.016)      (0.045)      (0.038)
Winner × Hot 5                                 0.027*
                                               (0.015)
Observations         1.39e+07     1.39e+07     1.39e+07     1.01e+07     1.04e+07
R2                   0.044        0.044        0.044        0.485        0.347
Mean of dep. var.    0.095        0.095        0.095        2.756        2.762

Panel B: Marginal Effects
Streak length 1      -0.011***    -0.013***    -0.015***    -0.061***    -0.033***
                     (0.004)      (0.004)      (0.005)      (0.010)      (0.010)
Streak length 2      -0.021***    -0.029***    -0.030***    -0.018**     -0.021**
                     (0.004)      (0.003)      (0.004)      (0.008)      (0.010)
Streak length 3      -0.031***    -0.021**     -0.033***    -0.005       -0.017
                     (0.010)      (0.009)      (0.005)      (0.009)      (0.012)
Streak length 4                   -0.013       -0.026***    -0.058***    -0.007
                                  (0.015)      (0.006)      (0.015)      (0.017)
Streak length 5                                -0.010       0.055        -0.049
                                               (0.019)      (0.038)      (0.039)
Streak length 6                                -0.018
                                               (0.016)

Notes: Authors’ calculations from administrative lottery data. Standard errors in parentheses. Regressions in columns 1, 2, 3, and 5 include player, number, and round fixed effects, with standard errors clustered at the player level. The regression in column 4 is identical to the
Table 2: Effects of Win Streaks: Non-Parametric Approach
Dependent variable:
  Haiti: number of bets placed by player i on number n in round r (Played_inr)
  Denmark: amount of money bet by player i on number n in round r

                          HAITI                            DENMARK
                          Coeff.      Std.     Number of   Coeff.      Std.     Number of
                                      error    occurr.                 error    occurr.
                          (1)         (2)      (3)         (4)         (5)      (6)
Win in lag 1              -0.014***   0.004    1867        -0.032***   0.010    61
Win in lag 2              -0.040***   0.004    1870        -0.030***   0.009    65
Win in lag 3              -0.037***   0.003    1870        -0.009      0.007    60
Win in lag 4              -0.035***   0.003    1867        -0.003      0.006    63
Win in lag 5              -0.029***   0.003    1864        -0.002      0.006    65
Win in lag 6              -0.026***   0.002    1852        0.002       0.006    60
Win in lag 1, 2           -0.014**    0.006    52          -0.012      0.013    16
Win in lag 1, 3           -0.026***   0.005    56          -0.030**    0.012    21
Win in lag 1, 4           -0.033***   0.008    54          -0.033**    0.013    17
Win in lag 1, 5           -0.020***   0.006    58          -0.024*     0.014    15
Win in lag 1, 6           -0.028***   0.006    58          -0.008      0.013    21
Win in lag 2, 3           -0.038***   0.004    51          -0.014      0.012    15
Win in lag 2, 4           -0.042***   0.005    55          -0.004      0.013    18
Win in lag 2, 5           -0.039***   0.006    52          -0.026**    0.012    19
Win in lag 2, 6           -0.033***   0.005    59          -0.007      0.012    16
Win in lag 3, 4           -0.042***   0.004    50          -0.013      0.012    15
Win in lag 3, 5           -0.037***   0.007    55          0.012       0.011    20
Win in lag 3, 6           -0.041***   0.005    54          -0.010      0.010    23
Win in lag 4, 5           -0.031***   0.005    49          0.001       0.011    14
Win in lag 4, 6           -0.040***   0.006    58          0.022**     0.010    17
Win in lag 5, 6           -0.028***   0.005    49          -0.002      0.011    15
Win in lag 1, 2, 3        0.001       0.044    1           -0.018      0.020    4
Win in lag 1, 2, 4        0.030       0.034    2           -0.019      0.017    6
Win in lag 1, 2, 5        -0.051**    0.023    2           -0.013      0.020    3
Win in lag 1, 2, 6                                         -0.016      0.034    1
Win in lag 1, 3, 4        -0.032**    0.014    3           -0.021      0.023    3
Win in lag 1, 3, 5                                         -0.021      0.019    4
Win in lag 1, 3, 6        -0.029      0.046    1           0.013       0.033    2
Win in lag 1, 4, 5        -0.043**    0.018    3           -0.003      0.019    6
Win in lag 1, 4, 6                                         -0.022      0.018    5
Win in lag 1, 5, 6        -0.110***   0.022    1           -0.049**    0.023    3
Win in lag 2, 3, 4        -0.054***   0.010    1           -0.041*     0.022    3
Win in lag 2, 3, 5        -0.019      0.041    2           0.004       0.023    3
Win in lag 2, 3, 6        -0.089***   0.015    2           -0.015      0.020    3
Win in lag 2, 4, 5        -0.057***   0.011    3           -0.028      0.026    2
Win in lag 2, 4, 6                                         0.011       0.020    5
Win in lag 2, 5, 6        -0.069***   0.011    3           -0.002      0.018    7
Win in lag 3, 4, 5        -0.032**    0.013    1           0.012       0.018    5
Win in lag 3, 4, 6        -0.027      0.041    2           -0.043**    0.019    4
Win in lag 3, 5, 6        -0.073***   0.010    3           0.053       0.034    1
Win in lag 4, 5, 6        -0.055***   0.013    1           0.000       0.018    5
Win in lag 1, 2, 3, 5                                      -0.028      0.023    3
Win in lag 1, 2, 4, 5                                      -0.017      0.034    1
Win in lag 1, 2, 4, 6                                      0.003       0.030    1
Win in lag 1, 3, 5, 6                                      0.065*      0.035    1
Win in lag 1, 4, 5, 6                                      -0.013      0.038    1
Win in lag 2, 3, 4, 5                                      0.015       0.044    1
Win in lag 2, 3, 4, 6                                      0.003       0.025    2
Win in lag 2, 3, 5, 6                                      -0.097***   0.035    1
Win in lag 3, 4, 5, 6                                      -0.015      0.037    1
Win in lag 1, 2, 3, 4, 6                                   -0.049      0.040    1
Observations              14024200                         10379916
R2                        .044                             .347

Notes: Authors’ calculations from administrative lottery data. All regressions include player, number, and round fixed effects. No coefficient is listed for those combinations of lagged wins that are not present in the Haiti data.