
The Gambler’s Fallacy Dominates the Hot Hand in Lottery Play∗†

Brian Dillon∗‡ Travis J. Lybbert‡

April 29, 2020

Working paper, comments welcome.

Abstract

We use a year of individually identifiable, administrative data from a mobile lottery in Haiti to examine how players react to the winning histories of numbers. The average player avoids numbers that recently won. This gambler’s fallacy effect gradually fades over several weeks. A small share of players exhibit the opposite tendency and always bet a number that just won (the hot hand fallacy). We re-analyze the data from a recent paper that finds evidence of streak switching (betting with the gambler’s fallacy after short streaks and the hot hand fallacy after long streaks) in administrative lottery data from Denmark. We show that this is an artifact of an overly restrictive parametric model. Relaxing simple assumptions in the Denmark analysis confirms that there is no general pattern of streak switching, and that the gambler’s fallacy prevails in both settings.

Keywords: gambler’s fallacy; hot-hand fallacy; lottery; law of small numbers; Haiti.

∗† We are grateful to Hilary Wething and Ben Glasner for excellent research assistance, and to Dan Benjamin for comments on an earlier draft. Any errors are our responsibility.

∗‡ Cornell University. Email: [email protected].

‡ University of California, Davis. Email: [email protected].


1 Introduction

Without formal training, many people struggle to form correct statistical intuitions. This

paper focuses on two common types of mistaken intuition: the gambler’s fallacy (GF) and

the hot hand fallacy (HHF). The GF is the belief that a number drawn from an independent,

identically distributed (i.i.d.) process is less likely to be drawn immediately after it wins.

This belief in too little serial correlation is rooted in the mistaken sense that small samples

should look like large samples (Kahneman et al., 1982; Rabin, 2002). The HHF is the belief

in too much serial correlation, namely that a number is more likely to be drawn if it has

won in the recent past.

A large literature has documented the prevalence of these biases and examined their

consequences for choice and belief formation (see Benjamin (2019) for a review). Many

of those studies are in the realms of finance, law, or sports, where neither the agents nor

the econometricians know the true time series properties of the data generating process

(DGP) (Camerer, 1989; Xu and Harvey, 2014; Chen, Moskowitz and Shue, 2016). Laboratory

experiments remove uncertainty about the DGP, but suffer from the usual concerns about

the artificiality of the environment. Even when the DGP is known, the analysis of streaks

in a finite sample is subject to subtle but important challenges for inference (Miller and

Sanjurjo, 2018).

Lottery games in which players select their own numbers provide an ideal setting

for studying the GF and HHF, as they represent natural experiments with real stakes and

known DGPs. Foundational work by Clotfelter and Cook (1993) tests for the presence of

the GF and HHF in aggregate lottery data. They find that betting on a number falls after

it wins, and that this GF effect persists for many future rounds. With aggregate data they

cannot distinguish changes in the composition of the player pool from within-player changes

in beliefs, and cannot determine whether the average GF effect reflects a mixture of both GF

and HHF responses. Suetens, Galbo-Jørgensen and Tyran (2016) (hereafter SGT) overcome

these challenges by using individually identifiable administrative data from an online lottery

in Denmark to test for GF and HHF betting. SGT find evidence of the GF. They also claim

to find support for “streak switching,” a more complex bias that accommodates both the GF


and the HHF (Rabin and Vayanos, 2010). A streak switching player avoids numbers after

short streaks, but comes to believe after a sufficiently long streak that the DGP is biased in

favor of the streaking number, and then begins to place hot hand bets.

In this paper we use a year’s worth of individually identifiable, administrative data

from a twice-daily lottery in Haiti to address two research questions. First, for how long and

in what direction do past wins influence the amount of betting on a number? Second, do

lottery players react differently to short win streaks and long win streaks? Consistent with

the theoretical predictions of Rabin and Vayanos (2010) for an i.i.d. process, we find that

the GF dominates. The average player systematically avoids numbers that recently won.

The deterrent effect of a win decays gradually as the win fades into the past, but persists

for as long as a month. We find no evidence of streak switching. We then re-analyze the

lottery data used by SGT and find that the appearance of streak switching in Denmark is an

illusion created by questionable empirical choices. Alternative specifications, including one

that relaxes a linearity assumption in SGT but otherwise preserves their approach, indicate

that the GF dominates in Denmark as it does in Haiti.

Because our data set includes individual identifiers, we can detect betting patterns

that are hidden in aggregate data. While the average player bets in accordance with the

GF, we find in Haiti that 8% of players select a number that won in the previous round at

least 50% of the time that they bet, consistent with the HHF.1 This suggests the presence

of (at least) two player types in lottery play, and likely in other similar settings.

The broad similarity between findings from Denmark and Haiti indicates that the intuition underlying the GF is deeply rooted in human cognition. The results from Haiti may

also have practical value. Economists have recently begun to take seriously the prominent

role of gambling in the financial lives of the poor in developing countries (Bernstein, 2015;

Herskowitz, 2016). This growing literature is opening promising new policy and behavioral

design possibilities (Brune, 2015; Cole, Iverson and Tufano, 2014; Gertler et al., 2018), including lottery-linked savings in Haiti (Dizon and Lybbert, 2019). A deeper understanding

of the gambling proclivities of the poor is an important first step toward designing behavioral finance interventions that accommodate these longstanding habits while also building pathways toward greater financial stability.

1 We explain later how the structure of the Danish lottery makes it impossible to interpret the results of a similar analysis.

2 Theoretical Framework

The GF and the HHF may appear to be mutually exclusive, because they suggest opposite

reactions to recent events. However, a longstanding tradition in psychology and more recently

economics allows for the possibility that these fallacies may be related (Edwards, 1961;

Camerer, 1989; Rabin, 2002; Rabin and Vayanos, 2010). Rabin and Vayanos (2010) develop

a model that formalizes the possible connection between the two fallacies. The agent in the

model is dogmatically inclined to believe in the GF, possibly because she thinks that small

samples should look like large samples. She observes a sequence of draws from a DGP and

uses Bayesian inference to form beliefs about parameters. When the agent is uncertain about

the DGP or believes that draws are serially dependent, her reaction to a win streak may be

different for short and long streaks. Such streak switching occurs because when a number

exhibits a short winning streak, she believes it is even less likely to appear again in the near

future, due to her GF beliefs. If the streak continues, she over-interprets this as a signal

that the DGP favors the number in question, and updates her beliefs accordingly. She now

expects the number to keep winning. Beyond some threshold streak length, the agent flips

from the GF to the HHF.

Uncertainty about the DGP is central to the streak switching prediction. When the

DGP is fixed and draws are i.i.d.—which is an apt description of a lottery—the agent always

expects reversals (i.e., displays GF reasoning), leaving no room for the HHF to emerge. Rabin

and Vayanos (2010) make this explicit in their Prediction 1 (p. 751): “When individuals

observe i.i.d. signals and are told this information, they expect reversals after streaks of any

length. The effect is stronger for long streaks.” As long as lottery players believe that the

lottery is an i.i.d. process, our analysis is a test of Prediction 1, and hence a test of whether

the average lotto player exhibits dogmatic attachment to the GF.

Of course, some people may not believe that the lottery is i.i.d. Many players choose

numbers based on sentiment, premonition, or recent events, suggesting a belief that lottery


odds are mutable or governed by supernatural forces. For many Haitians, superstitions about

the lottery are part of a broader religious worldview in which divine intervention and fate

have direct bearing on daily life (Bhatia, 2010). Active engagement in number choice not

only makes lotteries more entertaining, but also may lead players to believe that the odds

change in response to their participation. Prior work shows that some lottery players exhibit

an illusion of control—a mistaken belief that a number is more likely to win if they actively

choose it rather than have it randomly assigned (Langer, 1975).

To accommodate possible heterogeneity in players’ beliefs about the lottery DGP,

our analysis allows for different responses to short and long winning streaks. Prediction 1

of Rabin and Vayanos suggests that the GF will dominate and the average player will avoid

recent winners after streaks of any length. Evidence of streak switching would indicate either

that the prediction is wrong, or that the average player does not believe the lottery is i.i.d.

3 Setting and Data

3.1 The Mobile Phone Lottery in Haiti

The lottery is part of the rhythm of daily life in Haiti. Millions of Haitians play frequently,

and some of the working poor routinely wager a large share of their daily income on lottery

games (Bernstein, 2015). Players often select numbers based on superstitions and dreams.

Concordances known as the Tchala, which are available at every lottery stall and online,2

translate elements in one’s dreams into numbers. To ensure transparency and trust, the wide

array of lottery games in Haiti are all based on numbers drawn in the New York Lottery.

While most of these games are administered by physical lotto stalls called borlettes, digital

lotteries played on mobile phones have gained popularity in recent years, particularly among

younger Haitians in urban and peri-urban areas.

We study a mobile phone lottery game called Boloto. Like all Haitian lottery games,

Boloto is played twice each day, corresponding to the midday and evening numbers drawn in

New York. To participate, a player places a bet consisting of three two-digit number pairs

2 For an online version of the Tchala see http://lisa.ht/tchala/ (Accessed 21 November 2018).


(00-99) in a specified order. The cost of each bet is 25 Haitian gourdes (HTG), or about

0.60 USD in 2012. There is no limit to the number of bets a single player can make in each

round. To bet more money on a set of numbers, a player simply places additional bets.

The payout for Boloto is a function of which number matches the draw. Winning in

the first, second, or third position pays out 250 HTG (10x), 100 HTG (4x) or 50 HTG (2x),

respectively. If all three numbers are drawn, but not in order, the player wins 100,000 HTG

(4,000x). If all three numbers are drawn in order, the player wins the jackpot, which pays

out 2,000,000 HTG (80,000x). Payouts are independent across players—anyone playing a

winning number wins the full payout associated with the bet.3
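The payout multiples quoted above follow directly from the 25 HTG ticket price. A minimal check in Python is sketched below; the dictionary keys are illustrative labels chosen for this sketch, not names used by the lottery operator.

```python
# Boloto ticket price and payouts in Haitian gourdes (HTG), as stated in the text.
TICKET_PRICE_HTG = 25

# Keys are illustrative labels chosen for this sketch.
PAYOUTS_HTG = {
    "first_position": 250,
    "second_position": 100,
    "third_position": 50,
    "all_three_any_order": 100_000,
    "all_three_in_order": 2_000_000,
}


def payout_multiples(payouts, price):
    """Return each payout expressed as a multiple of the ticket price."""
    return {outcome: amount / price for outcome, amount in payouts.items()}


MULTIPLES = payout_multiples(PAYOUTS_HTG, TICKET_PRICE_HTG)

if __name__ == "__main__":
    for outcome, multiple in MULTIPLES.items():
        print(f"{outcome}: {multiple:,.0f}x")
```

The computed multiples match the 10x, 4x, 2x, 4,000x, and 80,000x figures given in parentheses.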

3.2 Data and Descriptive Statistics

This section describes the administrative lottery data from Haiti we use for our main analysis.

In Section 6 we briefly describe SGT’s dataset from Denmark, which we re-analyze for

comparison to our Haitian results.

The private firm licensed to conduct the digital lottery in Haiti provided us with

access to data for the universe of bets placed in Boloto from February 1, 2012 to January

31, 2013.4 For each bet we observe a player ID, the date of the game, an indicator for the

midday or evening round, the ordered set of three two-digit numbers that constitute the bet,

the time and date that the bet was placed, and the winning numbers. Player IDs are unique

numerical codes linked to mobile phone accounts. Across the 730 rounds (2 per day, for a

year), a total of 4,505,519 bets were placed, with over 13.5 million separate number choices.

The Boloto data includes bets from 112,808 different players. The average player

makes 39.9 bets over the year, in 12.7 different rounds, on 9.1 different days. About 1 in 200

players (0.5%) makes a bet in all 12 months; the average player makes at least one bet in

two separate months.
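The headline counts in this subsection are mutually consistent, which can be verified with a few lines of arithmetic. The sketch below uses only figures reported in the text; the variable names are ours.

```python
# Figures reported in the text for the Boloto data (Feb 1, 2012 to Jan 31, 2013).
DAYS_WITH_BETTING = 365      # 366 calendar days minus one day with no data
ROUNDS_PER_DAY = 2           # midday and evening draws
TOTAL_BETS = 4_505_519
NUMBERS_PER_BET = 3          # each bet is an ordered triple of two-digit numbers
TOTAL_PLAYERS = 112_808

total_rounds = DAYS_WITH_BETTING * ROUNDS_PER_DAY
number_choices = TOTAL_BETS * NUMBERS_PER_BET
bets_per_player = TOTAL_BETS / TOTAL_PLAYERS

if __name__ == "__main__":
    print(total_rounds, number_choices, round(bets_per_player, 1))
```

This reproduces the 730 rounds, the "over 13.5 million" number choices, and the average of 39.9 bets per player.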

Our analysis examines the relationship between a number’s winning history and the

3 The only exception is for multiple jackpots, in which case the winners split the payout. In practice this almost never happens, and in the year’s worth of data that we study there are only a few jackpots and no shared jackpots. We assume throughout that the possibility of splitting a jackpot does not shape individual number choices.

4 There are 366 days in that range, because 2012 was a leap year, but only 365 days with betting (there is no data for Christmas Eve).


probability that it is bet. We first represent each number selection as 100 separate choices:

1 decision to play a number, and 99 decisions not to play all others. This allows us to take

advantage of both player and number fixed effects. Let $d_{ijnrp}$ be a dummy variable equal to 1 if player $i$ in bet $j$ plays number $n$ in round $r$ in position $p$, and 0 otherwise. The position $p$ refers to the first, second, or third number in the bet. The numbers $n$ lie in the set $\{0, 1, 2, \dots, 99\}$. The round, $r$, includes both the date and the time (midday or evening) of the game. The bet indicator, $j$, captures the possibility that a player places multiple bets per round. We use $J_{ir}$ to denote the number of bets placed by player $i$ in round $r$.

In our analysis the dependent variable is $\text{Played}_{inr} = \sum_{j=1}^{J_{ir}} \sum_{p=1}^{3} d_{ijnrp}$, which

is a count of the number of times in round r that i played n in any position and any

bet.5 This is approximately proportional to the amount wagered on the number. At the

player-number-round level, the full dataset contains 143 million observations. To make the

analysis tractable, we estimate regressions and some descriptive statistics using a random

10% subsample of players, fixed across specifications. This analysis sample consists of 11,348

players who place 140,904 bets, for a total of 14,090,400 observations after reshaping. In the

analysis sample, the mean value of Playedinr is 0.095.
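The reshaping step described above can be sketched in a few lines of Python. The input layout (a list of player, round, and number-triple tuples) and the function names are assumptions made for illustration; they are not the format of the administrative data.

```python
from collections import Counter


def played_counts(bets):
    """Count how often each (player, round, number) appears across bets and positions.

    `bets` is an iterable of (player_id, round_id, (n1, n2, n3)) tuples, a
    hypothetical layout chosen for this sketch.
    """
    counts = Counter()
    for player_id, round_id, numbers in bets:
        for number in numbers:          # positions p = 1, 2, 3
            counts[(player_id, round_id, number)] += 1
    return counts


def reshape_player_round(counts, player_id, round_id, n_numbers=100):
    """Expand one player-round into 100 observations, one per number 0..99."""
    return [counts.get((player_id, round_id, n), 0) for n in range(n_numbers)]


# Toy data: player "A" places two bets in round 1; number 10 appears twice.
TOY_BETS = [
    ("A", 1, (10, 33, 7)),
    ("A", 1, (10, 22, 45)),
]
COUNTS = played_counts(TOY_BETS)
ROW = reshape_player_round(COUNTS, "A", 1)
PLAYED_DUMMY = [min(value, 1) for value in ROW]   # extensive-margin version
```

In this toy player-round the count variable sums to 6 (two bets of three numbers), while the dummy version sums to 5 because number 10 is played twice.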

Panel A of Figure 1 shows the number of bets placed, by round. The spike on October

17 coincides with Dessalines Day, a national holiday that commemorates the assassination of

Haiti’s founder. A weekly cycle of activity is clearly visible, as is a seasonal pattern, with

increased activity during the months July–October.6 Panel B of Figure 1 shows the histogram

of numbers played. The most popular choice, 10, represents 4.7% of all plays. The four next

most popular numbers are 0, 11, 33, and 13, all of which are played at more than twice the

random rate. Doubles—22, 44, 55, 77, etc.—are also popular.

There is no easy way to descriptively characterize GF betting in Boloto, because

betting with the GF implies not doing something, opting instead for one of a large set of

alternatives. It is easier to describe HHF betting. Figure 1, Panel C, shows a player-level

5 Our findings are broadly similar if we define the dependent variable as $\text{PlayedDummy}_{inr} = \max_{j \in \{1, \dots, J_{ir}\}} \max_{p \in \{1, 2, 3\}} d_{ijnrp}$, which measures the extensive margin choice to play a number at the player-number-round level. See Appendix.

6 Greater play in summer could be due to return visits by Haitians living abroad, which peak in July–August. These visitors may gamble or give cash gifts that prompt more gambling by others. In September–October, increased gambling activity is likely driven by the main harvest. Even in urban areas, economic activity related to the harvest drives seasonal fluctuations in incomes.


histogram of the share of played rounds in which the player chooses a number that was a

winner in the previous round. Nearly two thirds (63.8%) of players never make a hot hand

bet. In contrast, roughly 4% of players make a hot hand bet every time they play, and 8%

make a hot hand bet at least half the times they play.
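The player-level statistic behind Panel C can be computed as sketched below. The data layouts, the function name, and the assumption that rounds are indexed by consecutive integers are ours, introduced only for illustration.

```python
from collections import defaultdict


def hot_hand_shares(plays, winners):
    """Share of each player's played rounds that include a previous-round winner.

    plays:   dict mapping (player_id, round_id) -> set of numbers the player bet
    winners: dict mapping round_id -> set of numbers drawn in that round
    Both layouts are hypothetical; rounds are assumed to be consecutive integers.
    """
    hot_rounds = defaultdict(int)
    played_rounds = defaultdict(int)
    for (player_id, round_id), numbers in plays.items():
        previous = winners.get(round_id - 1)
        if previous is None:            # no prior draw observed, skip the round
            continue
        played_rounds[player_id] += 1
        if numbers & previous:          # at least one number won last round
            hot_rounds[player_id] += 1
    return {p: hot_rounds[p] / played_rounds[p] for p in played_rounds}


WINNERS = {1: {5, 50, 55}, 2: {10, 11, 72}, 3: {0, 41, 99}}
PLAYS = {
    ("A", 2): {5, 12, 30},     # 5 won in round 1: a hot hand bet
    ("A", 3): {8, 19, 64},     # no previous-round winner
    ("B", 2): {1, 2, 3},
    ("B", 3): {4, 6, 9},
}
SHARES = hot_hand_shares(PLAYS, WINNERS)
```

Here player A has a share of 0.5 and player B a share of 0, so A would fall in the "at least half" group and B in the "never" group.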

While the rate of hot hand play is roughly constant across the year, we do observe

occasional spikes in such betting after certain events. In one round nearly 40% of players

make a hot hand bet; the winning numbers from the previous round were 5-50-55. The next

five rounds with the highest shares of hot hand betting occur after a 0 or a 10 was a winner.

4 Empirical Approach

To provide a baseline characterization of how the amount bet on a number is related to

its recent success, we estimate OLS regressions of the following form using the analysis

subsample of the Boloto data:

$$\text{Played}_{inr} = \sum_{l=1}^{R} \beta_l \, \text{Winner}_{n,r-l} + \eta \, \text{Controls}_{inr} + \varepsilon_{inr} \qquad (1)$$

where $\text{Played}_{inr}$ is as defined in the previous section; $\text{Winner}_{n,r-l}$ is a binary variable indicating whether $n$ was one of the drawn numbers in round $r-l$; $\text{Controls}_{inr}$ includes fixed effects for players, numbers, and rounds; and $\varepsilon_{inr}$ is a statistical error term. With a sufficiently large choice of $R$, a plot of the $\beta_l$ coefficients will non-parametrically trace out the time path of effects of past wins on current betting. Evidence of $\beta_l < 0$ ($\beta_l > 0$) is consistent with a GF effect (HHF effect) that persists for $l$ rounds.

Specification (1) does not account for the probability that a number drawn in round $r-l$ will be drawn again prior to round $r$, which is increasing in $l$. The effect of winning, especially winning in the distant past, may be underestimated if a number wins multiple times. To account for this, we also estimate OLS regressions based on the following specification:

$$\text{Played}_{inr} = \sum_{l=1}^{R} \beta_l \, \text{MostRecentWin}_{n,r-l} + \eta \, \text{Controls}_{inr} + \varepsilon_{inr} \qquad (2)$$

which is identical to (1), except the key independent variable $\text{MostRecentWin}_{n,r-l}$ takes a value of 1 only if $r-l$ is the most recent round in which $n$ was drawn, and 0 otherwise. Once again, a finding of $\beta_l < 0$ ($\beta_l > 0$) is consistent with the GF (HHF).
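The difference between the regressors in specifications (1) and (2) is easiest to see on a small example. The sketch below builds both sets of lag dummies for one number from a hypothetical draw history; the function names and data layout are assumptions, and the "most recent" search is limited to the window of `max_lag` rounds.

```python
def winner_lags(draws, number, r, max_lag):
    """Winner_{n,r-l} for l = 1..max_lag: 1 if `number` was drawn in round r-l."""
    return [1 if number in draws.get(r - l, ()) else 0 for l in range(1, max_lag + 1)]


def most_recent_win_lags(draws, number, r, max_lag):
    """MostRecentWin_{n,r-l}: 1 only at the smallest lag l at which `number` won."""
    lags = winner_lags(draws, number, r, max_lag)
    out = [0] * max_lag
    if 1 in lags:
        out[lags.index(1)] = 1
    return out


# Hypothetical draw history: round -> set of drawn numbers.
DRAWS = {1: {7, 20, 31}, 2: {4, 7, 88}, 3: {15, 60, 61}, 4: {7, 9, 12}}

WINNER = winner_lags(DRAWS, number=7, r=5, max_lag=4)
MOST_RECENT = most_recent_win_lags(DRAWS, number=7, r=5, max_lag=4)
```

Number 7 won at lags 1, 3, and 4, so specification (1) switches on three dummies for it, whereas specification (2) switches on only the lag-1 dummy.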

Estimation of specifications (1) and (2) provides the average effect of lagged wins on

current betting. To test predictions about players’ reactions to streaks, we need to allow for

more complex interactions between past events. Lengthy winning streaks are rare in Boloto;

the probability that a specific number is selected in a round is only 0.0297. Following SGT,

we define a streak as the co-occurrence of a win in the previous round with a history of winning in other recent rounds. Formally, let $\text{Hotness}_{nr}$ be the number of times that $n$ was a winner during rounds $r-2$ to $r-S$, for some integer $S \ge 2$; and let $\text{Hot}_{nrc}$ be a dummy variable equal to 1 if $\text{Hotness}_{nr} = c$, and 0 otherwise. For each $r$, the winning streak of number $n$ is given by $\text{Winner}_{n,r-1} \times \{\text{Winner}_{n,r-1} + \text{Hotness}_{nr}\}$ (e.g., the streak has length 3 if $n$ was drawn in the previous round and was drawn twice in lagged rounds $2, \dots, S$). To semi-parametrically estimate the average response to streaks of different length, we estimate OLS regressions of the following form:

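$$\text{Played}_{inr} = \beta \, \text{Winner}_{n,r-1} + \sum_{c=1}^{C} \left\{ \delta_c \, \text{Hot}_{nrc} + \gamma_c \left( \text{Winner}_{n,r-1} \times \text{Hot}_{nrc} \right) \right\} + \eta \, \text{Controls}_{inr} + \varepsilon_{inr} \qquad (3)$$

where all variables are as defined above, $C$ is sufficiently large to include all observed streaks, and $\varepsilon_{inr}$ is a statistical error term. We report results for $S \in \{6, 14, 60\}$, equivalent to defining streaks over the previous 3 days, 7 days, and 30 days.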
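The streak variables can be constructed mechanically from the draw history. The following sketch implements the Hotness count and the streak-length formula, and also reproduces the 0.0297 per-round win probability under the assumption (ours, though it matches the reported figure) that each round consists of three independent uniform draws from 00 to 99. Function names and the data layout are hypothetical.

```python
def hotness(draws, number, r, S):
    """Hotness_{nr}: number of wins by `number` in rounds r-2 through r-S."""
    return sum(1 for l in range(2, S + 1) if number in draws.get(r - l, ()))


def streak_length(draws, number, r, S):
    """Winner_{n,r-1} * (Winner_{n,r-1} + Hotness_{nr})."""
    winner_prev = 1 if number in draws.get(r - 1, ()) else 0
    return winner_prev * (winner_prev + hotness(draws, number, r, S))


# Assumption for this sketch: three independent uniform draws from 00-99 per round.
P_WIN_IN_ROUND = 1 - (99 / 100) ** 3

# Hypothetical draw history: round -> set of drawn numbers.
DRAWS = {1: {7, 20, 31}, 2: {4, 7, 88}, 3: {15, 60, 61}, 4: {7, 9, 12}}
STREAK_7 = streak_length(DRAWS, number=7, r=5, S=4)    # won at lags 1, 3, 4
STREAK_4 = streak_length(DRAWS, number=4, r=5, S=4)    # won at lag 3 only
```

Number 7 has a streak of length 3, while number 4 has a streak of length 0 because it did not win in the previous round, even though its Hotness is 1.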

In equation (3), the player, number, and round fixed effects account for average

differences between players, average popularity of numbers, and temporal patterns in betting.

The total effect on current betting of a streak of length 1 is given by $\beta$. The total effect of a streak of length $d > 1$ is $\nu_c = \beta + \delta_c + \gamma_c$, where $c = d - 1$. If players are not influenced by either the GF or the HHF, we expect $\beta = \delta_c = \gamma_c = 0$ for all $c$. The predictions of Rabin and Vayanos (2010) are equivalent to (i) $\beta < 0$, (ii) $\nu_c < 0$ for all $c$, and (iii) $\nu_d < \nu_c$ for any $d > c$ (because the model predicts that longer streaks induce a larger GF effect). Alternatively, if players bet with the GF after short streaks and the HHF after long streaks (which is not the prediction of the Rabin and Vayanos model when the process is known to be i.i.d.), then we expect $\beta < 0$ and $\nu_c > 0$ for all sufficiently large $c$.
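To make the mapping from coefficients to hypotheses concrete, the sketch below computes the total effects from a set of coefficients and labels the resulting pattern. The coefficient values are hypothetical and chosen only to illustrate the arithmetic, and the streak-switching check is simplified to looking at the longest streak only.

```python
def marginal_effects(beta, delta, gamma):
    """Return [beta, nu_1, ..., nu_C], the total effects of streaks of length 1..C+1."""
    return [beta] + [beta + d + g for d, g in zip(delta, gamma)]


def classify(effects):
    """Label a list of total effects, ordered by streak length."""
    all_negative = all(e < 0 for e in effects)
    strictly_decreasing = all(b < a for a, b in zip(effects, effects[1:]))
    if all_negative and strictly_decreasing:
        return "GF, strengthening with streak length"
    if all_negative:
        return "GF at every streak length"
    # Simplification: only the longest streak is checked for a positive effect.
    if effects[0] < 0 and effects[-1] > 0:
        return "streak switching"
    return "other"


# Hypothetical coefficients, chosen only to illustrate the arithmetic.
EFFECTS = marginal_effects(beta=-0.010, delta=[-0.004, -0.006], gamma=[-0.006, -0.014])
LABEL = classify(EFFECTS)
```

With these illustrative values the total effects are roughly -0.010, -0.020, and -0.030, which satisfies conditions (i) through (iii); a sequence such as -0.01, 0.005, 0.02 would instead be labeled streak switching.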

Estimates based on equation (3) provide average effects for streaks of a given length.

To allow for complete flexibility in the estimated response to any combination of past wins,

we also estimate a fully non-parametric model for the previous six rounds (S = 6). In

this model, the dependent variable is Playedinr, and the independent variables are dummy

variables for all observed combinations of wins during the previous 6 rounds. As always, we

include player, number, and round fixed effects.
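One way to build the regressors for the fully non-parametric model is to encode each number's recent history as the set of lags at which it won, and to give every distinct observed history its own dummy. The sketch below is an illustration under our own naming and data layout, not the authors' code.

```python
def win_pattern(draws, number, r, S=6):
    """Return the tuple of lags l in 1..S at which `number` was drawn in round r-l."""
    return tuple(l for l in range(1, S + 1) if number in draws.get(r - l, ()))


def pattern_dummies(patterns):
    """Map each distinct non-empty observed pattern to a column index (its own dummy)."""
    observed = sorted({p for p in patterns if p})
    return {pattern: column for column, pattern in enumerate(observed)}


# Hypothetical draw history: round -> set of drawn numbers.
DRAWS = {1: {7, 20, 31}, 2: {4, 7, 88}, 3: {15, 60, 61}, 4: {7, 9, 12}}
PATTERNS = [win_pattern(DRAWS, n, r=5, S=4) for n in (7, 4, 15, 99)]
COLUMNS = pattern_dummies(PATTERNS)
```

The empty pattern (no wins in the window) is the omitted category, so only histories with at least one win receive a dummy.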

For all models we report standard errors clustered at the player level. Because the

average player participates in just 12.7 out of 730 rounds, we do not impose balance on the

panel. Hence, our analysis takes as given the extensive margin decision to participate in the

lottery in any particular round.

5 Results

For our baseline estimates of equations (1) and (2) we use a set of 84 dummy variables

covering every round in the previous 6 weeks (R = 84). Figure 2 plots the coefficients on

the dummy variables representing a win each lagged period, with 95% confidence intervals.

Panel A reports estimates from specification (1), based on all recent wins; Panel B reports

estimates from specification (2), based on only the most recent win for a number.

The average effect of a number being drawn in lagged rounds 2–84 indicates a surprisingly persistent GF response. Winning never leads to an increase in betting, on average.

Players avoid numbers that have won recently, but the effect is attenuated as the win fades

into the past. In both panels, the deterrent effect of a recent win is statistically significant

for almost 60 rounds (30 days). A win in lagged round 2, 3, or 4 decreases the number of

times a number is selected by 0.032–0.035, a reduction of over a third from the mean of

0.095. The effects are even larger in magnitude when we restrict attention to only the most

recent win (Panel B).

The exception to the pattern of diminishing effect size is from a win in the immediately

preceding round. The point estimate for a win in the preceding round is approximately −0.01

in Panel A, less than a third of the magnitude of the effect of a win in lagged rounds 2–4.


This attenuation is likely driven by the small share of hot hand players that bet a number

immediately after it wins (Figure 1, Panel C).

Table 1 shows estimates of equation (3). Columns 1, 2, and 3 report the findings for

streaks defined over the previous 3 days, 7 days, and 30 days, respectively. Panel A reports

coefficient estimates, and Panel B reports the estimated marginal effects (νc). Across all

streak lengths and specifications, there are no positive marginal effects of a winning streak

on the probability that a number is bet. All 13 of the estimated effects in Panel B are

negative, and 10 are statistically different from zero.7 In column 1, the negative effect of a

win streak on the probability that a number is selected is increasing in streak length. Betting

on a number falls by 0.011 (11.6% of the mean of 0.095), 0.021 (22.1%), and 0.031 (32.6%) after streaks of length 1, 2, and 3, respectively.

The positive relationship between streak length and the magnitude of the deterrent effect

is consistent with Rabin and Vayanos (2010) when the DGP is i.i.d. (and participants are

aware of that fact). When we define streaks over periods of 7 or 30 days, the GF again

dominates (columns 2 and 3), but the marginal effects do not increase monotonically in

streak length (see Section 7).
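The percentages quoted for column 1 are the estimated effects divided by the sample mean of the dependent variable (0.095). A short check, using only numbers reported in the text:

```python
MEAN_PLAYED = 0.095                                      # sample mean of the dependent variable
EFFECTS_BY_STREAK = {1: -0.011, 2: -0.021, 3: -0.031}    # Table 1, column 1, as quoted

# Magnitude of each effect as a percentage of the mean, rounded to one decimal.
PERCENT_OF_MEAN = {
    length: round(100 * abs(effect) / MEAN_PLAYED, 1)
    for length, effect in EFFECTS_BY_STREAK.items()
}

if __name__ == "__main__":
    for length, pct in PERCENT_OF_MEAN.items():
        print(f"streak of length {length}: {pct}% of the mean")
```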

Columns 1-3 of Table 2 show estimates from the non-parametric model, in which the

independent variables are dummy variables for all observed combinations of wins during the

previous 3 days (6 rounds). Column 3 reports the number of occurrences of each pattern

over the year of data. All combinations of three wins are observed 3 or fewer times (however,

thousands of people play the lottery after each occurrence). There is no pattern of recent wins

that increases betting on a number, on average. Out of 36 coefficients, 31 are statistically

different from zero, and all of those are negative.

6 Re-analysis of the Danish Lottery Data in SGT

SGT use administrative data from a lottery in Denmark to study the GF and HHF. Their

dataset, like ours, includes individual player identifiers. SGT frame their analysis as a test

7 The coefficients that are not statistically different from zero are for the least commonly observed streak lengths. There are only 63 instances of a streak of length 4 in column 2; 45 instances of a streak of length 5 in column 3; and 11 instances of a streak of length 6 in column 3.


of the streak switching prediction from Rabin and Vayanos (2010), namely, that players bet

with the GF after short streaks and the HHF after long streaks. They claim to find evidence

of this pattern in the behavior of Danish lottery players. That finding is inconsistent with

our findings from Haiti, where the GF predominates after streaks of any length. After

re-analyzing the Danish data, we believe this apparent inconsistency is an illusion.

We have two main concerns about the approach in SGT. The first is conceptual.

Uncertainty about the underlying DGP is a necessary condition for streak switching in the

model of Rabin and Vayanos (2010). Yet, there is no uncertainty about the DGP in the

Danish lottery, just as there is no uncertainty about the DGP in the Boloto lottery analyzed

here.8 The SGT analysis is premised on testing a prediction of Rabin and Vayanos (2010)

that does not apply to the setting.

Second, SGT make a number of puzzling specification choices. They omit player and

round fixed effects, include a lagged dependent variable, impose linearity on the relationship

between streak length and betting outcomes, and impute zeroes to strictly balance the panel

(in some specifications). One can quibble with these choices to varying degrees. At a

minimum, it is difficult to understand why player fixed effects would be excluded from the

analysis, when a key innovation relative to prior work is the ability to identify players.

To shed light on the apparent discrepancy in findings from Haiti and Denmark, we

re-analyze the data from SGT. The Danish lottery data covers all plays in an online, weekly

lottery game called System Lotto, over a period of 28 weeks in 2005. Seven winning numbers

are drawn each week, without replacement, from the positive integers 1, . . . , 36. The data

set includes at least one choice by 25,807 players. To participate, players select between

8 and 31 numbers, and the online system randomly selects 7 of those numbers to be the

player’s bet. To increase the wager on a number, players can purchase more tickets and/or

choose fewer numbers per bet. SGT calculate the dependent variable “Money bet”, at the

player-round-number level, as the implied value of the bet on each number.9
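The draw rules just described imply that most numbers win at least once over any short window. Assuming weekly draws are independent of one another (our assumption for this sketch), the expected number of distinct winners over a window of rounds follows from linearity of expectation:

```python
NUMBERS = 36            # System Lotto numbers 1..36
DRAWN_PER_ROUND = 7     # drawn each week without replacement


def p_win_at_least_once(rounds):
    """Probability a given number is drawn at least once in `rounds` independent draws."""
    p_miss_one_round = (NUMBERS - DRAWN_PER_ROUND) / NUMBERS
    return 1 - p_miss_one_round ** rounds


def expected_distinct_winners(rounds):
    """Expected count of numbers that win at least once (linearity of expectation)."""
    return NUMBERS * p_win_at_least_once(rounds)


EXPECTED_SIX_ROUNDS = expected_distinct_winners(6)

if __name__ == "__main__":
    print(round(EXPECTED_SIX_ROUNDS, 2))
```

Over six rounds this gives about 26.16 of the 36 numbers, so a six-round streak window leaves few numbers without a recent win.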

Using the Danish data, we estimate a model that is almost identical to one of SGT’s

8 The exception, as we noted in Section 2, could be that players do not believe that the lottery is i.i.d. (a possibility not mentioned in SGT).

9 Money Bet on number i is the total value of the player’s bets that round, times i’s share of all numbers selected. See SGT for more details about the lottery and data.


main specifications. They regress Money Bet on $\text{Winner}_{n,r-1}$, $\text{Hotness}_{nr}$, the interaction term $\text{Winner}_{n,r-1} \times \text{Hotness}_{nr}$, number fixed effects, a lagged dependent variable, and a dummy variable for weeks with larger jackpots. We estimate the same specification, but replace $\text{Hotness}_{nr}$ with the levels and interactions of the variables $\text{Hot}_{nrc}$, to allow for non-linearity in streak length. The time period for defining streaks is the previous six rounds (as

in SGT). We also estimate equations (1), (2), and (3), and the fully non-parametric model

using all observed combinations of wins over the previous six rounds. All regressions are at

the player-number-round level, with standard errors clustered by player, and Money Bet as

the dependent variable (Money Bet is roughly comparable to the dependent variable in the

Haiti analysis).

Baseline estimates of equations (1) and (2) for Denmark are shown in panels C and

D of Figure 2. In both panels there is evidence of the GF after recent wins. The average

effect of a win in the previous 1–2 rounds is to reduce the amount bet by 0.013–0.022 Danish kroner (DKK). These are much smaller effects than we found in Haiti, representing less than

one percent of the mean bet of 2.76 DKK. The deterrent effect of a win disappears after two

rounds, suggesting that players do not react systematically to wins that occurred three or

more weeks previously.

Estimates of the slightly modified SGT specification are reported in column 4 of Table

1. Three of the five estimated marginal effects are negative and statistically significant; the

other two are not statistically significant (Panel B). When these same marginal effects are

estimated in SGT, the assumption of linearity in Hotness creates the spurious appearance

of a trend that leads to hot hand betting after longer streaks. After relaxing the linearity assumption, we find no statistically significant evidence of hot hand betting after long

streaks. When we go a step further and estimate our preferred specification using the Danish

data (specification (3)), all of the estimated marginal effects are negative (Table 1, column

5). Only the effects of streaks of length 1 and 2 are statistically different from zero. The

magnitude of the deterrent effect does not increase in streak length (see Section 7).

Columns 4-6 of Table 2 show the non-parametric estimates for Denmark. There are

six statistically significant effects for combinations that include a win in the previous round

(lag 1). All are negative, except for the weakly positive effect of winning in periods {1, 3, 5, 6}. This appears to be spurious, as the point estimates for wins in lags {1, 2, 3, 5}, {1, 2, 4,

5}, and {1, 2, 3, 4, 6} are all negative (if imprecise), and we see in Table 1 (column 5) that

the average effects of streaks of any length are never positive and statistically significant.

7 Discussion

Our analysis of lottery data from Haiti and Denmark shows that the average player falls prey

to the GF by avoiding numbers that recently won. This is consistent with the predictions of

Rabin and Vayanos (2010) when the DGP is known to be i.i.d. and players are dogmatically

inclined to believe that small samples should look like large samples. We also find evidence

of ideological attachment to the HHF by a small share of players. In Haiti, 8% of players

bet a recent winner at least half the times they play, and 4% always bet a recent winner.

In the Haiti analysis, when we define streaks as collections of wins over the previous

3 days (6 lottery rounds), we find support for another prediction of Rabin and Vayanos,

namely, that betting on a winner decreases with streak length (Table 1, column 1, Panel B).

We do not find the same pattern for streaks defined over 7 or 30 days. This could be due

to systematic error in streak definition. If past wins are salient for less than 7 or 30 days

for some players, our estimates of the average deterrent effect of wins over those periods will

be attenuated, because some players are reacting to streaks that they perceive to be shorter

than those defined by us. Panels A and B of Figure 2 suggest that in Haiti there is a slight

GF effect for wins as long ago as 60 rounds (30 days), but the magnitudes are small for

distant wins, indicating a gradual weakening of the overall GF response.

In our re-analysis of the Danish lottery data, we also find that the magnitude of the

GF effect does not increase in streak length. Again, this could be due to the use of an

overly long recall period to define streaks. We define streaks over the previous six rounds

(following SGT), yet simply regressing money bet on lagged wins indicates that the GF

effect in Denmark disappears after two rounds (Figure 2, panels C and D).

There are other challenges to inference rooted in the structure of the Danish System

Lotto game. System Lotto players choose between 8 and 31 numbers, out of a possible 36.

Seven numbers are winners each week, so that on average, 26.16 out of the 36 numbers will

be winners over six consecutive rounds. With limited number choices and a high share of

numbers winning, it is computationally difficult to react consistently to all wins over six

rounds.10 This structure also makes it difficult to identify hot hand tendencies. Panel D of

Figure 1 shows a player-level histogram of the share of played rounds in which the player

chooses a number that was a winner in the previous round. Most Danish players make a hot

hand bet at least 80% of the time. Yet, we have seen that winning slightly reduces betting on

a number for the next two rounds. The small choice set and high winning probability at the

number level make it difficult to distinguish genuine hot hand tendencies from a preference

for picking many numbers, some of which will mechanically be recent winners. Of course,

we cannot rule out other explanations for this high rate of hot hand play in Denmark. But

what it highlights is that although the overall takeaway from the two settings is similar, the

design of the Haitian Boloto game allows for cleaner testing of GF and HHF beliefs than the

Danish System Lotto game.
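
The figure of 26.16 numbers and the mechanical nature of hot hand play in System Lotto can both be checked with a short calculation. The sketch below assumes that draws are independent across rounds and that each of the 36 numbers is equally likely to win. The second function adds a purely illustrative benchmark that is not part of the analysis above: a hypothetical player who picks numbers uniformly at random.

```python
from math import comb

TOTAL_NUMBERS = 36      # numbers available in System Lotto
WINNERS_PER_ROUND = 7   # winning numbers drawn each week
ROUNDS = 6              # window used to define streaks


def expected_distinct_winners(total, winners, rounds):
    """Expected count of numbers that win at least once in `rounds` draws."""
    p_never_wins = ((total - winners) / total) ** rounds
    return total * (1 - p_never_wins)


def prob_at_least_one_recent_winner(total, winners, picks):
    """Chance that a uniformly random set of `picks` numbers contains at
    least one of last round's winners (hypergeometric benchmark)."""
    return 1 - comb(total - winners, picks) / comb(total, picks)


print(round(expected_distinct_winners(TOTAL_NUMBERS, WINNERS_PER_ROUND, ROUNDS), 2))
# 26.16

# Smallest and largest allowed ticket sizes (8 and 31 numbers).
for picks in (8, 31):
    p = prob_at_least_one_recent_winner(TOTAL_NUMBERS, WINNERS_PER_ROUND, picks)
    print(picks, round(p, 3))
```

Even with the minimum of 8 numbers, a random picker would include a previous-round winner in roughly 86% of rounds under these assumptions, which illustrates why a high rate of hot hand play in Denmark need not reflect hot hand beliefs.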

Our analysis confirms the prominence of the GF in how people understand repeated

draws from a random process. Underneath this average GF response is a minority of players

that doggedly adheres to the opposing HHF. This discovery of two distinct player types in

high-frequency administrative data may provide new foundations for models in which players

of finite types make (potentially biased) choices that have important influence on aggregate

outcomes. Although it may be too early to use these specific behavioral insights to inform

the design of financial services and products (e.g., Gertler et al., 2018; Cole, Iverson and

Tufano, 2014; Dizon and Lybbert, 2019), they contribute to a more complete picture of

gambling tendencies, which may ultimately enable more productive financial management

and greater financial inclusion among the poor.

10 We see some evidence of this in Panels C and D of Figure 2, where Danish players significantly increase their bets on numbers that won 6-7 rounds ago. This is likely mechanical. If players are avoiding recent winners, they can pick a high share of all numbers, and there are 7/36 winners each week, then players will be forced to pick some not-so-recent winners.

References

Benjamin, Daniel J. 2019. “Errors in probabilistic reasoning and judgment biases.” In Handbook of Behavioral Economics: Applications and Foundations 1. Vol. 2, 69–186. Elsevier.

Bernstein, Rachel L. 2015. “In Pursuit of the Transformational Sum: Lottery and Savings in Haiti.” University of California, Davis.

Bhatia, Pooja. 2010. “Dream Ticket.” The National, Friday, April 2: 3–5.

Brune, Lasse. 2015. “The Effect of Lottery-Incentives on Labor Supply: A Firm Experiment in Malawi.” Economic Growth Center, Yale University.

Camerer, Colin F. 1989. “Does the Basketball Market Believe in the ‘Hot Hand,’?” The American Economic Review, 79(5): 1257–1261.

Chen, Daniel L, Tobias J Moskowitz, and Kelly Shue. 2016. “Decision making under the gambler’s fallacy: Evidence from asylum judges, loan officers, and baseball umpires.” The Quarterly Journal of Economics, 131(3): 1181–1242.

Clotfelter, Charles T, and Philip J Cook. 1993. “Notes: The “gambler’s fallacy” in lottery play.” Management Science, 39(12): 1521–1525.

Cole, Shawn Allen, Benjamin Charles Iverson, and Peter Tufano. 2014. “Can gambling increase savings? empirical evidence on prize-linked savings accounts.” Empirical Evidence on Prize-Linked Savings Accounts (August 8, 2014). Saïd Business School WP, 10.

Dizon, Felipe, and Travis J Lybbert. 2019. “Leveraging the lottery for financial inclusion: Lotto-linked savings accounts in Haiti.” Economic Development and Cultural Change.

Edwards, Ward. 1961. “Probability learning in 1000 trials.” Journal of Experimental Psychology, 62(4): 385.

Gertler, Paul, Sean Higgins, Aisling Scott, and Enrique Seira. 2018. “The Long-Term Effects of Temporary Incentives to Save: Evidence from a Prize-Linked Savings Field Experiment.”

Herskowitz, Sylvan. 2016. “Gambling, Saving, and Lumpy Expenditures: Sports Betting in Uganda.”

Kahneman, Daniel, Paul Slovic, and Amos Tversky. 1982. Judgment under uncertainty: Heuristics and biases. Cambridge University Press.

Langer, Ellen J. 1975. “The illusion of control.” Journal of Personality and Social Psychology, 32(2): 311.

Miller, Joshua B, and Adam Sanjurjo. 2018. “Surprised by the hot hand fallacy? A truth in the law of small numbers.” Econometrica, forthcoming.

Rabin, Matthew. 2002. “Inference by believers in the law of small numbers.” The Quarterly Journal of Economics, 117(3): 775–816.

Rabin, Matthew, and Dimitri Vayanos. 2010. “The gambler’s and hot-hand fallacies: Theory and applications.” The Review of Economic Studies, 77(2): 730–778.

Suetens, Sigrid, Claus B Galbo-Jørgensen, and Jean-Robert Tyran. 2016. “Predicting lotto numbers: a natural experiment on the gambler’s fallacy and the hot-hand fallacy.” Journal of the European Economic Association, 14(3): 584–607.

Xu, Juemin, and Nigel Harvey. 2014. “Carry on winning: The gamblers’ fallacy creates hot hand effects in online gambling.” Cognition, 131(2): 173–180.

8 Figures

Figure 1 panels (plots not reproduced):

A. Number of bets by round, Haiti. Horizontal axis: Date (01feb2012 to 20jan2013); vertical axis: Number of bets.

B. Histogram of numbers played, Haiti. Horizontal axis: Number played (0 to 100); vertical axis: Density. Labeled bars: 0: 2.7%, 10: 4.7%, 11: 2.5%, 13: 2.1%, 33: 2.4%.

C. Hot-hand betting, player level, Haiti. Horizontal axis: Share of played rounds with at least one hot play (0 to 1); vertical axis: Density.

D. Hot-hand betting, player level, Denmark. Horizontal axis: Share of played rounds with at least one hot play (0 to 1); vertical axis: Density.

Figure 1: Distribution of lottery play and hot hand play

Notes: Authors’ calculations from administrative lottery data. Summer is defined as July-October. Panels A, B, and D based on full datasets. Panel C based on 10% subsample from Haiti.

Figure 2 panels (plots not reproduced). Each panel plots the estimated coefficient with 95% C.I. (vertical axis) against Lag (horizontal axis; 0 to 80 for Haiti, 0 to 10 for Denmark).

A. Haiti: number selected

B. Haiti: number most recently selected

C. Denmark: number selected

D. Denmark: number most recently selected

Figure 2: Coefficients on binary variables for wins during previous rounds

9 Tables

Table 1: The Effects of Winning Streaks on Betting

Dependent variable:
  Haiti: number of bets placed by player i on number n in round r (Played_inr)
  Denmark: amount of money bet by player i on number n in round r

                        HAITI                               DENMARK
                        Lag used to define streaks          Specification
                        3 days      7 days      30 days     Modified    Ours
                                                            original
                        (1)         (2)         (3)         (4)         (5)

Panel A: Estimated Coefficients
Winner                  -0.011***   -0.013***   -0.015***   -0.061***   -0.033***
                        (0.004)     (0.004)     (0.005)     (0.010)     (0.010)
Hot 1                   -0.032***   -0.026***   -0.022***   -0.007*     -0.009
                        (0.002)     (0.002)     (0.002)     (0.004)     (0.006)
Hot 2                   -0.040***   -0.035***   -0.034***   0.003       -0.005
                        (0.003)     (0.002)     (0.002)     (0.005)     (0.007)
Hot 3                   -0.034***   -0.040***   -0.040***   -0.015**    -0.006
                        (0.012)     (0.003)     (0.003)     (0.006)     (0.010)
Hot 4                               -0.037***   -0.034***   0.010       -0.022
                                    (0.007)     (0.003)     (0.019)     (0.019)
Hot 5                                           -0.031***
                                                (0.005)
Winner × Hot 1          0.022***    0.010***    0.007*      0.051***    0.020**
                        (0.003)     (0.003)     (0.004)     (0.009)     (0.008)
Winner × Hot 2          0.020*      0.027***    0.016***    0.053***    0.020**
                        (0.010)     (0.006)     (0.003)     (0.010)     (0.009)
Winner × Hot 3                      0.039***    0.029***    0.019       0.032**
                                    (0.015)     (0.006)     (0.019)     (0.016)
Winner × Hot 4                                  0.038**     0.106**     0.006
                                                (0.016)     (0.045)     (0.038)
Winner × Hot 5                                  0.027*
                                                (0.015)
Observations            1.39e+07    1.39e+07    1.39e+07    1.01e+07    1.04e+07
R2                      0.044       0.044       0.044       0.485       0.347
Mean of dep. variable   0.095       0.095       0.095       2.756       2.762

Panel B: Marginal Effects
Streak length 1         -0.011***   -0.013***   -0.015***   -0.061***   -0.033***
                        (0.004)     (0.004)     (0.005)     (0.010)     (0.010)
Streak length 2         -0.021***   -0.029***   -0.030***   -0.018**    -0.021**
                        (0.004)     (0.003)     (0.004)     (0.008)     (0.010)
Streak length 3         -0.031***   -0.021**    -0.033***   -0.005      -0.017
                        (0.010)     (0.009)     (0.005)     (0.009)     (0.012)
Streak length 4                     -0.013      -0.026***   -0.058***   -0.007
                                    (0.015)     (0.006)     (0.015)     (0.017)
Streak length 5                                 -0.010      0.055       -0.049
                                                (0.019)     (0.038)     (0.039)
Streak length 6                                 -0.018
                                                (0.016)

Notes: Authors’ calculations from administrative lottery data. Standard errors in parentheses. Regressions in columns 1, 2, 3, and 5 include player, number, and round fixed effects, with standard errors clustered at the player level. The regression in column 4 is identical to the
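
The Panel B marginal effects appear to follow from the Panel A coefficients by simple addition. The sketch below assumes that the marginal effect of a streak of length k + 1 equals Winner + Hot k + Winner × Hot k. This additive form is inferred from the reported numbers rather than quoted from the specification, and it reproduces Panel B only up to rounding of the published coefficients. Variable names are illustrative.

```python
# Panel A point estimates for column 5 (Denmark, "Ours"), copied from Table 1.
winner = -0.033
hot = {1: -0.009, 2: -0.005, 3: -0.006, 4: -0.022}
winner_x_hot = {1: 0.020, 2: 0.020, 3: 0.032, 4: 0.006}

# Panel B values reported for the same column, used for comparison.
reported = {1: -0.033, 2: -0.021, 3: -0.017, 4: -0.007, 5: -0.049}


def marginal_effect(streak_length):
    """Assumed additive form: Winner + Hot k + Winner x Hot k, with
    k = streak_length - 1. A streak of length 1 is a single win."""
    if streak_length == 1:
        return winner
    k = streak_length - 1
    return winner + hot[k] + winner_x_hot[k]


for length in sorted(reported):
    computed = marginal_effect(length)
    print(length, round(computed, 3), reported[length])
```

The computed values differ from the reported ones by at most 0.001, which is consistent with rounding in the published coefficients. Standard errors cannot be recovered this way, because they depend on the covariances between the coefficient estimates.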

Table 2: Effects of Win Streaks: Non-Parametric Approach

Dependent variable:
  Haiti: number of bets placed by player i on number n in round r (Played_inr)
  Denmark: amount of money bet by player i on number n in round r

                          HAITI                                 DENMARK
                          Coefficient  Std. error  Occurrences  Coefficient  Std. error  Occurrences
                          (1)          (2)         (3)          (4)          (5)         (6)
Win in lag 1              -0.014***    0.004       1867         -0.032***    0.010       61
Win in lag 2              -0.040***    0.004       1870         -0.030***    0.009       65
Win in lag 3              -0.037***    0.003       1870         -0.009       0.007       60
Win in lag 4              -0.035***    0.003       1867         -0.003       0.006       63
Win in lag 5              -0.029***    0.003       1864         -0.002       0.006       65
Win in lag 6              -0.026***    0.002       1852         0.002        0.006       60
Win in lag 1, 2           -0.014**     0.006       52           -0.012       0.013       16
Win in lag 1, 3           -0.026***    0.005       56           -0.030**     0.012       21
Win in lag 1, 4           -0.033***    0.008       54           -0.033**     0.013       17
Win in lag 1, 5           -0.020***    0.006       58           -0.024*      0.014       15
Win in lag 1, 6           -0.028***    0.006       58           -0.008       0.013       21
Win in lag 2, 3           -0.038***    0.004       51           -0.014       0.012       15
Win in lag 2, 4           -0.042***    0.005       55           -0.004       0.013       18
Win in lag 2, 5           -0.039***    0.006       52           -0.026**     0.012       19
Win in lag 2, 6           -0.033***    0.005       59           -0.007       0.012       16
Win in lag 3, 4           -0.042***    0.004       50           -0.013       0.012       15
Win in lag 3, 5           -0.037***    0.007       55           0.012        0.011       20
Win in lag 3, 6           -0.041***    0.005       54           -0.010       0.010       23
Win in lag 4, 5           -0.031***    0.005       49           0.001        0.011       14
Win in lag 4, 6           -0.040***    0.006       58           0.022**      0.010       17
Win in lag 5, 6           -0.028***    0.005       49           -0.002       0.011       15
Win in lag 1, 2, 3        0.001        0.044       1            -0.018       0.020       4
Win in lag 1, 2, 4        0.030        0.034       2            -0.019       0.017       6
Win in lag 1, 2, 5        -0.051**     0.023       2            -0.013       0.020       3
Win in lag 1, 2, 6                                              -0.016       0.034       1
Win in lag 1, 3, 4        -0.032**     0.014       3            -0.021       0.023       3
Win in lag 1, 3, 5                                              -0.021       0.019       4
Win in lag 1, 3, 6        -0.029       0.046       1            0.013        0.033       2
Win in lag 1, 4, 5        -0.043**     0.018       3            -0.003       0.019       6
Win in lag 1, 4, 6                                              -0.022       0.018       5
Win in lag 1, 5, 6        -0.110***    0.022       1            -0.049**     0.023       3
Win in lag 2, 3, 4        -0.054***    0.010       1            -0.041*      0.022       3
Win in lag 2, 3, 5        -0.019       0.041       2            0.004        0.023       3
Win in lag 2, 3, 6        -0.089***    0.015       2            -0.015       0.020       3
Win in lag 2, 4, 5        -0.057***    0.011       3            -0.028       0.026       2
Win in lag 2, 4, 6                                              0.011        0.020       5
Win in lag 2, 5, 6        -0.069***    0.011       3            -0.002       0.018       7
Win in lag 3, 4, 5        -0.032**     0.013       1            0.012        0.018       5
Win in lag 3, 4, 6        -0.027       0.041       2            -0.043**     0.019       4
Win in lag 3, 5, 6        -0.073***    0.010       3            0.053        0.034       1
Win in lag 4, 5, 6        -0.055***    0.013       1            0.000        0.018       5
Win in lag 1, 2, 3, 5                                           -0.028       0.023       3
Win in lag 1, 2, 4, 5                                           -0.017       0.034       1
Win in lag 1, 2, 4, 6                                           0.003        0.030       1
Win in lag 1, 3, 5, 6                                           0.065*       0.035       1
Win in lag 1, 4, 5, 6                                           -0.013       0.038       1
Win in lag 2, 3, 4, 5                                           0.015        0.044       1
Win in lag 2, 3, 4, 6                                           0.003        0.025       2
Win in lag 2, 3, 5, 6                                           -0.097***    0.035       1
Win in lag 3, 4, 5, 6                                           -0.015       0.037       1
Win in lag 1, 2, 3, 4, 6                                        -0.049       0.040       1
Observations              14024200                              10379916
R2                        .044                                  .347

Notes: Authors’ calculations from administrative lottery data. All regressions include player, number, and round fixed effects. No coefficient is listed for those combinations of lagged wins that are not present in the Haiti data.
