
Predictive Analytics and the Targeting of Audits*

Nigar Hashimzade
Durham University and Tax Administration Research Centre

Gareth D. Myles†
University of Exeter and Tax Administration Research Centre

Matthew D. Rablen
Brunel University London and Tax Administration Research Centre

October 7, 2015

Abstract

The literature on audit strategies has focused on random audits or on audits conditioned only on income declaration. In contrast, tax authorities employ the tools of predictive analytics to identify taxpayers for audit, with a range of variables used for conditioning. The paper explores the compliance and revenue consequences of the use of predictive analytics in an agent-based model that draws upon a behavioral approach to tax compliance. The taxpayers in the model form subjective beliefs about the probability of audit from social interaction, and are guided by a social custom that is developed from meeting other taxpayers. The belief and social custom feed into the occupational choice between employment and two forms of self-employment. It is shown that the use of predictive analytics yields a significant increase in revenue over a random audit strategy.

Keywords: tax compliance; social network; agent-based model.
JEL: H26; D85.

*We thank two guest editors; an anonymous referee; the participants of the Taxation, Social Norms and Compliance Conference (Nuremberg) and the Shadow Economy, Tax Evasion and Governance Conference (Münster); and seminar audiences at Brunel and Tulane for their helpful comments. We are grateful to the ESRC for financial support under grant ES/K005944/1.

†Department of Economics, University of Exeter, Exeter EX4 4PU, UK. e-mail: [email protected]


1 Introduction

The standard analysis of tax compliance in Allingham and Sandmo (1972) and Yitzhaki (1974), and much of the literature that has followed, is based on the assumption that taxpayers abide by the axioms of expected utility theory and that audits are random. An exception is the literature on optimal auditing, including Reinganum and Wilde (1985, 1986) and Chander and Wilde (1998), which characterizes the equilibrium audit strategy as a function of reported income. In practice, the overwhelming majority of audits performed by tax authorities are "risk-based" (in which taxpayers are targeted for audit), with only a small fraction of audits performed on a random basis for statistical purposes. Unlike the presumption of the optimal auditing literature, however, the targeting of risk-based audits is not based solely on the income report. Rather, tax authorities rely on the experience of case officers reviewing returns and, increasingly, on predictive analytics, which applies statistical tools to the data on a set of taxpayers' characteristics, often in the form of qualitative variables (see Cleary, 2011; Hsu et al., 2015). The expected utility model has also been subject to significant criticism, and many alternative models with behavioral foundations have been proposed.

The paper explores the compliance and revenue consequences of the use of predictive analytics in an agent-based model that draws upon a behavioral approach to tax compliance. We use agent-based modelling because it allows us to explore a richer model than is possible in a theoretical analysis, although it means that we rely on simulation to generate our results. The model is constructed on the foundation of a social network that governs the interaction between taxpayers and the transmission of information between them. The information consists of attitudes towards compliance (in the form of a social custom) and beliefs about audits (a subjective probability of being audited). Taxpayers must make an occupational choice between employment and two forms of self-employment. Employment provides a safe income, but because of the third-party reporting of income there is no possibility of non-compliance. The two self-employment occupations are risky, but non-compliance is possible. Risk-averse taxpayers allocate into the occupations so as to maximize expected utility given the expected income and riskiness of each occupation. Given the different levels of risk in the occupations, taxpayers divide among occupations on the basis of risk aversion. This results in self-selection of those who will exploit opportunities for non-compliance into occupations where such opportunities arise.

The predictive analytics investigated in the model are based on Tobit and logit regression models using data from tax returns and from the outcomes of past audits. The Tobit model targets audits on the basis of predicted evasion level and the logit model on the basis of predicted likelihood of non-compliance. The predictive analytics are implemented by simulating the model with random audits for an initial period to acquire audit data, and then using this data to target audits where non-compliance is predicted. It is shown that predictive analytics secure a significant increase in revenue over a random audit strategy.

To give the results validity it is necessary to build the agent-based model on a sound underlying theory of the compliance decision. Our modelling starts from the assumption that taxpayers do not know the audit strategy of the tax authority but must form a belief about the probability of being audited. This is consistent with the idea of behavioral economics that individuals generally do not evaluate risky prospects using the objective probabilities of events but form subjective probabilities (or transform objective probabilities using a weighting function). The subjective probabilities (or, in our terminology, beliefs) can differ significantly, and persistently, from the objective probability (Kahneman and Tversky, 1979). There is also empirical (Spicer and Lundstedt, 1976) and experimental (Baldry, 1986) evidence that the individual compliance decision takes into account social factors such as the perceived extent of evasion in the population. We choose to summarize the range of social factors as the attitude of the taxpayer toward compliance. This is essentially identical to the concept of tax morale that is prominent in the empirical literature (e.g., Torgler, 2002).

A key feature of our modelling is to make explicit the processes through which the attitude towards compliance and the belief about auditing are formed. Attitudes and beliefs are endogenous and result from the interaction of a taxpayer with other taxpayers and with the tax authority. The importance of interaction makes it necessary to specify the social environment in which the interaction takes place. We do this by employing a social network with a given set of links between taxpayers to govern the flow of information. After each round of audits some of the taxpayers who are linked will meet and exchange information. The likelihood of information transmission is greater between taxpayers in the same occupation.

The paper is structured as follows. Section 2 describes the separate concepts that are built into the model. Section 3 provides analytical details on how these concepts are implemented. Sections 4 and 5 describe the simulation results under a random audit rule and when the audit rule is informed by predictive analytics. Section 6 concludes.

2 Conceptual Approach

This section describes the elements that constitute the agent-based model. The purpose of the discussion is to relate these elements to the extensive literature on the individual tax compliance decision. The seminal analyses of the compliance decision by Allingham and Sandmo (1972) and Yitzhaki (1974) were built upon the application of expected utility theory. A standard criticism of this model is that it over-predicts the extent of evasion when evaluated using the objective probability of audit, which has motivated the application of ideas from behavioral economics.1 The behavioral models of the compliance decision are surveyed in Hashimzade et al. (2013).

The key element of our agent-based model is that taxpayers make an occupational choice decision prior to the compliance decision. The compliance decision is based on the attitude toward compliance, as summarized in a social custom, and the belief about audits, captured in a subjective probability of being audited. The information used to form attitudes and beliefs is transmitted through meetings between taxpayers governed by a social network. Each of these components is now described in greater detail.

1 It should be noted that Slemrod (2007) gives good reasons why this claim should be treated with caution.

2.1 Occupational choice

Occupational choice determines the possibility for engaging in tax evasion. Income from employment is often subject to a withholding tax and/or third-party reporting to the tax authority. For example, the UK Pay-As-You-Earn system involves income tax being deducted by employers and remitted directly to the tax authority. This prevents evasion by employees (unless there is collusion with the employer) and so non-compliance is only possible for taxpayers who are self-employed. Occupations also differ in their traditions concerning payment in cash. Those in which cash payment is common provide the greater opportunity for evasion. Occupational choice has not had a prominent role in the literature on tax evasion despite its clear importance. Exceptions are Pestieau and Possen (1981), who model occupational choice, and Cowell (1981), Isachsen and Strøm (1980), and Trandel and Snow (1999), who analyze the choice between working in the regular and the informal economy.

Occupational choice is also important for the connection it has with risk aversion. Individuals allocate to occupations on the basis of their ability at that occupation and their attitude to risk. Those who are least risk averse will choose to enter the riskiest occupations. Kanbur (1979) and Black and de Meza (1997) assume employment is safe but self-employment is risky, and address the social efficiency of aggregate risk-taking. Self-employment attracts the least risk-averse taxpayers, who will evade the most when the opportunity arises. Hence, occupational choice has the effect of self-selecting taxpayers who will evade into a situation in which they can evade. This observation should form part of any explanation of why non-compliance can be so significant within specific occupational groups.

Our model includes a choice between employment and two forms of self-employment. Employment is a safe activity that delivers a certain income. Self-employment is risky, so that each taxpayer only knows the probability distribution of income when making an occupational choice. One of the self-employment occupations is riskier than the other, in a sense we make precise below.2

2 Individuals differ in their level of skill in the occupations, and skills are one of the determinants of income. This makes it necessary to state the formal details before "riskier" can be explained in full.

2.2 Social customs

The experiments of Baldry (1986) provide compelling evidence that the evasion decision is not just a simple gamble. This can be rationalized by introducing an additional cost into the evasion decision. These costs can be financial (Bayer, 2006; Chetty, 2009; Lee, 2001) or psychic (Gordon, 1989). Psychic costs can arise from fear of detection or concern about the shame of being exposed. The magnitude of the psychic cost can reflect an individual's attitude towards compliance. Attitudes are an important feature of psychological theories of tax compliance (Kirchler et al., 2008; Weigel et al., 1987). The psychic cost can also be interpreted as the loss of the payoff from following a social norm for honest tax payment. Adopting this interpretation makes it natural to assume that the size of the loss in payoff is generated by explicit social interaction, and that the size is larger when fewer taxpayers evade (Fortin et al., 2007; Kim, 2003; Myles and Naylor, 1996; Traxler, 2010).

The additional costs have an important role in explaining some features of the tax evasion decision. We model attitudes by including a social custom of honest tax payment in the model, so that there is a utility gain (relative to the state with non-compliance) when tax is paid in full. The importance attached to the social custom by each taxpayer is determined by their interaction with other taxpayers within the social network.

2.3 Subjective beliefs

We have already observed that if choice is based on the objective probability of being audited then the standard model over-predicts the amount of evasion. This has led to the application of choice models based on non-expected utility theories. Non-expected utility models can predict the correct level of evasion for reasonable parameter values. This is because they permit the subjective probability of audit (the weighting on the payoff when audited) to be greater than the objective probability. They also open the possibility of designing compliance policy to manipulate the subjective nature of the decision (Elffers and Hessing, 1997).

It is standard to distinguish between a choice with risk (the agent knows the probability distribution of future events) and a choice with uncertainty (the agent does not know the probabilities). A first step away from expected utility theory is to consider a choice with risk but to assume the probabilities are distorted into "decision weights" that enter the expected payoff. Rank-dependent expected utility (Quiggin, 1981, 1982; Quiggin and Wakker, 1994) uses a particular weighting scheme to transform the objective probability of events into subjective probabilities and has been applied to the evasion decision by Arcand and Graziosi (2005), Bernasconi (1998) and Eide (2001). Prospect theory (Kahneman and Tversky, 1979; Tversky and Kahneman, 1992) also uses a weighting scheme, but payoffs are determined by gains and losses relative to a reference point. It has been applied to compliance by al-Nowaihi and Dhami (2007), Bernasconi and Zanardi (2004), Rablen (2010), and Yaniv (1999). Uncertainty has been modelled by assuming the agent forms a probability distribution over the possible probability distributions of outcomes ("second-order uncertainty"). This gives rise to the concept of ambiguity, surveyed in Camerer and Weber (1992), which has been applied to tax compliance by Snow and Warren (2005).

We incorporate these ideas into the model by assuming that each taxpayer forms a subjective belief about the audit probability and by explicitly modelling the process for forming beliefs. This allows the model to provide an explanation of how subjective probabilities can endogenously emerge and remain systematically different from the objective probabilities.


2.4 Social network

The illegality of tax evasion and the incentive the tax authority has to conceal its audit strategy imply that taxpayers cannot be fully informed. A natural assumption is that information will not be revealed publicly, but will be transmitted between taxpayers in a position of mutual trust. The social network we adopt is a formalization of this assumption.

The importance of social contacts is supported by empirical evidence on the positive connection between the number of tax evaders a taxpayer knows and the extent of evasion of that taxpayer (De Juan et al., 1994; Geeroms and Wilmots, 1985; Spicer and Lundstedt, 1976; Wallschutzky, 1984; Webley et al., 1988). This evidence demonstrates that the compliance decision is not made in isolation but that each taxpayer makes reference to the observed behavior of the society in which they operate.

We capture this social interaction by applying network theory (Goyal, 2009; Jackson, 2004). Networks have previously been used in the analysis of evasion by Korobow et al. (2007) and Franklin (2009). They have also been applied to crime more generally (Glaeser et al., 1996).

The social network in our model plays two roles. First, it transmits the social custom from one person to another: if two non-evaders meet, the importance of the social custom of honest payment is increased for both, but if a non-evader meets an evader then it is reduced for the non-evader and increased for the evader. Second, the network transmits information about audit policy. Since the audit strategy is not public information, taxpayers have to infer it from their own experience and from the receipt of information about the experiences of others. Our simulations are an application of agent-based modelling (Bloomquist, 2004; Tesfatsion, 2006) with agent interaction controlled by the social network.

3 Network Model

In this section we model the formation of attitudes and beliefs as the outcome of social interaction, and opportunities as the outcome of occupational choice. This is achieved by applying the theory of network formation to track the links between taxpayers and the transmission of attitudes and beliefs, and combining this with agent-based modelling which employs a behavioral approach to describe individual choices.

There are $N$ individuals, indexed by $j = 1, \dots, N$, whose lives extend throughout the simulation. Although the lives are long, each individual makes a succession of single-period decisions and so is "myopic". We discuss possible relaxation of this assumption in section 6. Individuals interact repeatedly in discrete time, $t = 0, \dots, T$, where each period $t$ is understood to be the tax return period. Each individual, $j$, at time $t$ is described by a vector of characteristics

$$\left\{ w_j,\ \rho_j,\ s^1_j,\ s^2_j,\ z_j,\ p^1_{j,t},\ p^2_{j,t},\ \chi_{j,t} \right\}. \tag{1}$$
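As a reading aid, the vector in (1) maps naturally onto a small record type. The sketch below is illustrative only: the class name, the field names, the uniform distributions and their ranges are all assumptions made here, since the text states only that the characteristics are drawn from independent distributions. The fields are grouped in the order of (1), with the five time-invariant entries first and the three time-indexed entries last.

```python
import random
from dataclasses import dataclass


@dataclass
class Taxpayer:
    # Entries of (1) without a time index: drawn once, then held fixed.
    wage: float           # wage in employment (occupation 0)
    risk_aversion: float  # coefficient of relative risk aversion
    skill_se1: float      # skill in self-employment occupation 1
    skill_se2: float      # skill in self-employment occupation 2
    custom_payoff: float  # payoff from following the social custom
    # Entries of (1) with a time index: updated period by period.
    belief_se1: float     # subjective audit probability, occupation 1
    belief_se2: float     # subjective audit probability, occupation 2
    custom_weight: float  # weight attached to the social-custom payoff


def draw_taxpayer(rng: random.Random) -> Taxpayer:
    """Draw one taxpayer; every distribution here is a placeholder assumption."""
    return Taxpayer(
        wage=rng.uniform(0.5, 1.5),
        risk_aversion=rng.uniform(0.1, 5.0),
        skill_se1=rng.uniform(0.5, 1.5),
        skill_se2=rng.uniform(0.5, 1.5),
        custom_payoff=rng.uniform(0.0, 1.0),
        belief_se1=rng.uniform(0.0, 1.0),
        belief_se2=rng.uniform(0.0, 1.0),
        custom_weight=rng.uniform(0.0, 1.0),
    )


rng = random.Random(0)  # fixed seed so that a run can be repeated
population = [draw_taxpayer(rng) for _ in range(1000)]
```

A mutable record is used because the last three fields change during a simulation, whereas the first five would simply never be reassigned.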

At the start of a simulation the values for all characteristics are randomly assigned to each taxpayer by making draws from independent distributions. Once drawn, the first five characteristics in (1) remain constant throughout the simulation and so represent the exogenous parameters describing the agents. These characteristics are $w_j$, the wage in employment (occupation 0); $\rho_j$, the coefficient of relative risk aversion; $s^{\alpha}_j$, the skill in self-employed occupation $\alpha$, $\alpha = 1, 2$; and $z_j$, the payoff from following the social custom. To avoid the potential for misleading results that might follow from an unusual draw of these characteristics, our results are calculated as the average of multiple independent simulations. In particular, we simulate five times to obtain the baseline results in section 4 and ten times for the analysis in section 5.

The remaining three characteristics in (1) are endogenous and so are updated period-by-period through interaction with the tax authority and with other taxpayers in the social network. These characteristics are additionally indexed by time, $t$. They are: $p^{\alpha}_{j,t} \in [0, 1]$, the perceived (subjective) probability of audit in occupation $\alpha$, $\alpha = 1, 2$, and $\chi_{j,t} \in [0, 1]$, the weight attached to the payoff from following the social custom for honesty.

We now describe how these characteristics enter into the choice problem of a taxpayer and how the subjective probability and weight given to social custom are updated.

In each period, $t$, every individual chooses their preferred occupation and, once income is realized, the optimal level of evasion. Individual $j$ has a choice between employment or entering one of two self-employment occupations. These are occupations 1 and 2, which we denote by SE1 and SE2.3 If employment is chosen the wage, $w_j$, is obtained with certainty, whereas in self-employment earnings are random. The outcome of self-employment for individual $j$ in occupation $\alpha$ at time $t$ is given by $s^{\alpha}_j y^{\alpha}_{j,t}$, where $s^{\alpha}_j$ is drawn once at the beginning of the simulation, but $y^{\alpha}_{j,t}$ is drawn randomly at every time $t$ from the probability distribution function $F^{\alpha}(\cdot)$.4 The choice of occupation is taken on the basis of $F^{\alpha}(\cdot)$ but the choice of evasion is made after the realization of $y^{\alpha}_{j,t}$. It is assumed that $E(y^1) < E(y^2)$ and $\mathrm{Var}(y^1) < \mathrm{Var}(y^2)$, so if $s^1_j = s^2_j$, SE2 is riskier than SE1 but offers a higher expected income. Both self-employment occupations are riskier than employment, in the sense that for each agent the wage in employment is certain, i.e., $\mathrm{Var}(w_j) = 0$.

It is not possible to evade tax in employment because income is subject to third-party reporting or to a withholding tax. Evasion only becomes possible when self-employment is chosen. In occupation $\alpha$ taxpayer $j$ has belief at time $t$ that the probability of being audited and evasion being detected5 is $p^{\alpha}_{j,t}$. The belief about the audit probability is updated through the experience of the taxpayer with audits and through the exchange of information when meeting other taxpayers. The attitude of taxpayer $j$ toward evasion is summarized in $\chi_{j,t}$, the weight given to the social custom. This attitude is also updated through meetings with other taxpayers. We describe the processes for updating attitudes and beliefs in detail after discussing the choice of occupation for given attitudes and beliefs.

3 It may seem unrealistic to have an occupational choice in every period, but in the simulations only a very small proportion of taxpayers actually change occupation in any period.

4 For tractability, we abstract in this setting from the possibility of bankruptcy for agents in the self-employment occupations. Were this feature allowed for (see, for example, Bloomquist, 2011), business survival would become a function of the outcome of the tax evasion gamble, which generates a dynamic effect on compliance in addition to those we study in our model. We are grateful to an anonymous referee for this point.

5 Here we assume that detection is full. Alternatively, one can assume partial detection, which gives rise to a number of interesting issues which are beyond the scope of this paper.

The choice of occupation and the choice to evade tax involve risk. Taxpayer $j$ has a (constant) degree of relative risk aversion measured by the risk aversion parameter, $\rho_j$. The taxpayer chooses occupation and evasion level at time $t$ to maximize subjective expected utility given beliefs $\{p^{\alpha}_{j,t}\}$. For analytical tractability, we assume throughout a CRRA form for utility:

$$U_j(Y) = \frac{Y^{1-\rho_j} - 1}{1 - \rho_j}. \tag{2}$$
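For readers who wish to check magnitudes by hand, (2) can be evaluated in a few lines. This is a minimal sketch: the function name is chosen here, and the logarithmic branch at a risk aversion coefficient of exactly one is an added assumption (it is the standard limit of the CRRA form, a case the text does not discuss).

```python
import math


def crra_utility(income: float, rho: float) -> float:
    """Evaluate the CRRA form (Y**(1 - rho) - 1) / (1 - rho) for income Y > 0."""
    if income <= 0.0:
        raise ValueError("CRRA utility is only defined here for positive income")
    if math.isclose(rho, 1.0):
        # Assumed handling of rho = 1: the limit of the CRRA form is log utility.
        return math.log(income)
    return (income ** (1.0 - rho) - 1.0) / (1.0 - rho)


# Utility is normalized so that an income of one yields zero for any rho.
u_at_one = crra_utility(1.0, 2.0)
# With rho = 0.5 and income 4: (4**0.5 - 1) / 0.5 = 2.
u_example = crra_utility(4.0, 0.5)
```

The subtraction of one in the numerator only shifts the level of utility, so it does not affect choices, but it is what makes the logarithmic limit well defined.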

The attitude toward evasion determines the utility value of following the social custom of honest tax payment. The payoff from the social custom is given by $z_j$ and the individual weight, or the importance, assigned to this payoff by the taxpayer is determined by $\chi_{j,t}$. Hence, compliance with tax payment at time $t$ generates an additional utility from following the social custom of $\chi_{j,t} z_j$.

In employment there is no opportunity for evasion, and so the taxpayer obtains a payoff given by

$$V^0 = \frac{\left[(1-\tau) w_j\right]^{1-\rho_j} - 1}{1 - \rho_j} + \chi_{j,t} z_j,$$

where $\tau$ is the constant marginal tax rate. The possibility of tax evasion makes the choice of self-employment a compound lottery: the income is random, as is the outcome of choosing to evade.

Define the expected payoff from the optimal choice of evasion in self-employment occupation $\alpha$ for a given realization $y^{\alpha}_{j,t}$ as

$$V^{\alpha}_e\left(y^{\alpha}_{j,t}\right) = \max_{E \in [0,\, s^{\alpha}_j y^{\alpha}_{j,t}]} \left\{ p^{\alpha}_{j,t}\, U_j\!\left([1-\tau]\, s^{\alpha}_j y^{\alpha}_{j,t} - f \tau E\right) + \left(1 - p^{\alpha}_{j,t}\right) U_j\!\left([1-\tau]\, s^{\alpha}_j y^{\alpha}_{j,t} + \tau E\right) + \chi_{j,t} z_j 1_{[E=0]} \right\},$$

where $f > 1$ is the fine levied on unpaid tax if evasion is detected. The term $1_{[A]}$ is an indicator function that takes the value of one if $A$ is true and zero otherwise: the payoff from the social custom is obtained only if tax is paid in full. The level of evasion, $E^{\alpha}_{j,t} = E\left(y^{\alpha}_{j,t}\right)$, will be a function of the realized income $y^{\alpha}_{j,t}$ in occupation $\alpha$. The expected payoff from the compound lottery describing occupation $\alpha$ is then

$$V^{\alpha} = \int V^{\alpha}_e(y)\, dF^{\alpha}(y).$$
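The three payoffs can be approximated numerically, which may help in seeing how the pieces fit together. The sketch below is not the procedure used for the paper's simulations: the grid search over evasion, the Monte Carlo average standing in for the integral, the lognormal income distributions and every parameter value are assumptions made purely for illustration, as are all function names.

```python
import random


def crra(income, rho):
    """CRRA utility as in (2); assumes income > 0 and rho != 1."""
    return (income ** (1.0 - rho) - 1.0) / (1.0 - rho)


def payoff_employment(wage, rho, chi, z, tax):
    """Utility of the certain after-tax wage plus the social-custom payoff."""
    return crra((1.0 - tax) * wage, rho) + chi * z


def payoff_given_income(income, p, rho, chi, z, tax, fine, grid=100):
    """Grid search over evasion in [0, income]; returns (payoff, evasion)."""
    best_value, best_evasion = None, 0.0
    for k in range(grid + 1):
        evasion = income * k / grid
        if_caught = (1.0 - tax) * income - fine * tax * evasion
        if if_caught <= 0.0:
            break  # utility undefined here, and more evasion only lowers it
        if_free = (1.0 - tax) * income + tax * evasion
        value = p * crra(if_caught, rho) + (1.0 - p) * crra(if_free, rho)
        if k == 0:
            value += chi * z  # custom payoff is earned only when evasion is zero
        if best_value is None or value > best_value:
            best_value, best_evasion = value, evasion
    return best_value, best_evasion


def payoff_self_employment(skill, draws, p, rho, chi, z, tax, fine):
    """Monte Carlo average over income draws, approximating the integral."""
    values = [payoff_given_income(skill * y, p, rho, chi, z, tax, fine)[0]
              for y in draws]
    return sum(values) / len(values)


rng = random.Random(1)
# Assumed lognormal draws; SE2 has the higher mean and the higher variance.
draws_se1 = [rng.lognormvariate(0.0, 0.3) for _ in range(500)]
draws_se2 = [rng.lognormvariate(0.2, 0.6) for _ in range(500)]

rho, chi, z, tax, fine = 2.0, 0.5, 0.1, 0.25, 1.5
payoffs = {
    0: payoff_employment(1.0, rho, chi, z, tax),
    1: payoff_self_employment(1.0, draws_se1, 0.1, rho, chi, z, tax, fine),
    2: payoff_self_employment(1.0, draws_se2, 0.1, rho, chi, z, tax, fine),
}
occupation = max(payoffs, key=payoffs.get)  # 0 = employment, 1 = SE1, 2 = SE2
```

A grid search is used rather than a first-order condition because the indicator term makes the objective discontinuous at zero evasion, so full compliance has to be compared directly with the best interior choice.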

The choice of occupation is made by comparing the (expected) utility levels from employment and from self-employment. Hence, the chosen occupation is given by selecting the maximum of $\{V^0, V^1, V^2\}$.

After self-employment occupation $\alpha$ is chosen at time $t$ an outcome $\tilde{y}^{\alpha}_{j,t}$ is realized according to the probability distribution function $F^{\alpha}(\cdot)$. Given the outcome, the optimal evasion decision is implemented, as described above. Denote the level of evasion that is realized by $\tilde{E}^{\alpha}_{j,t} = E\left(\tilde{y}^{\alpha}_{j,t}\right)$. Tax returns are submitted, and a proportion of those in self-employment occupations are then audited, according to a rule chosen by the tax authority. If evasion is discovered, unpaid tax is reclaimed and the fines on unpaid tax are paid.

The social network is modelled as a set of bidirectional links described by an $N \times N$ symmetric matrix of zeros and ones. For example, in the network described by matrix $A$,

$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix},$$

the first row, representing the links of individual 1, has a single 1 in column 2, which means that 1 is linked to 2. There is a corresponding 1 in the first column of the second row, representing the link of individual 2 with 1. That is, the element in row $i$ and column $j$ of matrix $A$ is defined as

$$A_{ij} = \begin{cases} 1 & \text{if } i \text{ and } j \text{ are linked in the network,} \\ 0 & \text{otherwise.} \end{cases}$$
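The matrix representation is easy to experiment with. The sketch below first encodes the four-person example and reads off each individual's contacts, and then draws a symmetric zero-one matrix in which every pair is linked independently with a common probability. The function names and the values of the network size and link probability are assumptions for illustration; note also that Python indexes from zero, so individual 1 in the text is row 0 here.

```python
import random

# The four-person example from the text (row 0 is individual 1).
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]


def neighbours(adj, i):
    """Indices of everyone linked to individual i."""
    return [j for j, linked in enumerate(adj[i]) if linked == 1]


def is_symmetric(adj):
    """True if every link is bidirectional."""
    n = len(adj)
    return all(adj[i][j] == adj[j][i] for i in range(n) for j in range(n))


def random_network(n, link_prob, rng):
    """Link each pair i < j independently with probability link_prob."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < link_prob:
                adj[i][j] = adj[j][i] = 1  # set both entries to keep symmetry
    return adj


rng = random.Random(2)
G = random_network(n=50, link_prob=0.1, rng=rng)  # illustrative size and density
```

Drawing only the pairs with `i < j` and mirroring each draw guarantees symmetry and leaves the diagonal at zero, so nobody is linked to themselves.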

We use a random (Erdős-Rényi) network, where agents $i$ and $j$ are linked with a common exogenous probability, $\Pr[A_{ij} = 1]$; the matrix is created at the outset and does not change.6 The network determines who may meet whom to exchange information. In each period a random selection of meetings occurs, described by a matrix $C^t$ of zeros and ones randomly drawn in every period. Individuals $i$ and $j$ meet during period $t$ if $A_{ij} C^t_{ij} = 1$; at a meeting they may or may not exchange information about their subjective probability of audit in each self-employed occupation and about whether each of them was compliant in that period. The probability of information exchange depends on the occupational groups to which $i$ and $j$ belong; the probability is highest when they are in the same occupation.

6 Here the network is fixed, but the probabilities of information exchange between the linked individuals change if they switch occupations, as we shall describe. Another possibility would be to have the network itself revised as a consequence of chosen actions, i.e., agents in different occupations belonging to different social networks. The model may also be analysed under, for instance, the small-world or power law networks as alternatives to the random network (see, e.g., Andrei et al., 2014; Axtell et al., 2006; Souma et al., 2003).

Recall that individuals in all three occupations hold beliefs about the probability of being audited in each of the two self-employment occupations; they also know that evasion is not possible in employment. We assume there are two ways in which beliefs are updated. Consider taxpayer $j$ who has worked in occupation $\alpha \in \{0, 1, 2\}$ in period $t$. If $j$ is self-employed, after submission of the tax return this taxpayer may or may not be audited. On the basis of the outcome, $j$'s belief about the audit probability, $p^{\alpha}_{j,t}$, in that occupation is then adjusted. The beliefs about the audit probability in the other self-employment occupation, $p^{\beta}_{j,t}$, $\beta \neq \alpha$, remain unchanged at this stage. Following this, the taxpayer may meet with a contact in the network. Let the meeting be with a taxpayer $i$ who is engaged in occupation $\beta \in \{0, 1, 2\}$. At the meeting information is exchanged with probability $q_{\alpha\beta}$. This information is then used to update the belief about the audit probabilities, $p^{\gamma}_{j(i),t}$, $\gamma \in \{1, 2\}$, in both self-employment occupations. If taxpayer $j$ is in paid employment then his or her belief about audit probabilities in self-employment, $p^{\gamma}_{j,t}$, $\gamma \in \{1, 2\}$, can only be updated at the meeting.

The choice of occupation in period $t + 1$ is made on the basis of the beliefs $\{p^1_{j,t}, p^2_{j,t}\}$ updated after the audits and the information exchange. Two different processes for the updating of subjective beliefs following an audit have been proposed in the literature. As studies have reliably demonstrated important deviations from Bayesian inference (see, for example, Grether, 1980), we allow for non-Bayesian updating. The first process, which is qualitatively similar to a Bayesian process, is to assume that individuals feel marked as targets if they are audited, so that one audit is believed likely to be followed by another. We term this the target effect. In contrast, those not audited in one period believe they are less likely to be audited in the next period. Formally, if taxpayer $j$ in occupation $\alpha \in \{1, 2\}$ is audited in period $t$, his or her belief about being audited in the same occupation in the next period is raised to probability $P$, otherwise it decays. The updating rule for the subjective probability is therefore

$$\tilde{p}^{\alpha}_{j,t+1} = \begin{cases} P \in [0, 1] & \text{if audited at } t; \\ \delta\, p^{\alpha}_{j,t}, \quad \delta \in [0, 1] & \text{otherwise.} \end{cases} \tag{3}$$

We refer to the case of $P = 1$ as the maximal target effect. The alternative is the bomb-crater effect that is documented experimentally by Guala and Mittone (2005), Kastlunger et al. (2009), Maciejovsky et al. (2007), and Mittone (2006). In this process a taxpayer who has been audited in one period believes that they are less likely to be audited in the next, but the belief gradually rises over time. The process is therefore described by

$$\tilde{p}^{\alpha}_{j,t+1} = \begin{cases} P \in [0, 1] & \text{if audited at } t; \\ p^{\alpha}_{j,t} + \delta \left(1 - p^{\alpha}_{j,t+1}\right), \quad \delta \in [0, 1] & \text{otherwise,} \end{cases} \tag{4}$$

with $P = 0$ being the maximal bomb-crater effect.

Recent empirical analysis using administrative data for the entire population of UK taxpayers who submit tax returns has produced results that are in agreement with the target effect. Advani et al. (2015) report that compliance increases post-audit and remains high for several periods before beginning to decrease again. This matches the target effect but is the opposite of what the bomb-crater effect predicts. Since the empirical results are based on analysis of actual behavior of the entire population of self-reporting taxpayers they are very convincing, and so we adopt the target effect in the simulations that follow.7
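To make the difference between the two updating rules concrete, the following minimal sketch traces a single belief through one audit followed by four unaudited periods. It reads the decay and recovery terms as acting on the taxpayer's current belief. The function names, the values of $d$ and $\delta$, and the starting belief are illustrative assumptions, not the calibration given in the Appendix.

```python
def target_update(p, audited, P=1.0, d=0.9):
    """Target effect, as in (3): jump to P after an audit, otherwise decay by factor d."""
    return P if audited else d * p


def bomb_crater_update(p, audited, P=0.0, delta=0.1):
    """Bomb-crater effect, as in (4): drop to P after an audit, otherwise drift up towards one."""
    return P if audited else p + delta * (1.0 - p)


# One audit in the first period, then four periods without an audit.
history = [True, False, False, False, False]

p_target = p_crater = 0.2  # hypothetical starting belief
for audited in history:
    p_target = target_update(p_target, audited)
    p_crater = bomb_crater_update(p_crater, audited)
    print(f"audited={audited!s:5}  target={p_target:.4f}  bomb-crater={p_crater:.4f}")
```

With the maximal target effect the belief jumps to one and then decays geometrically, whereas with the maximal bomb-crater effect it collapses to zero and then recovers towards one.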

After the audit process is completed a taxpayer may meet with a contact. The information that may (or may not) be exchanged at a meeting includes the subjective probabilities and whether or not the agents were audited. If taxpayer $j$ meets taxpayer $i$ and if information exchange occurs at the meeting, the subjective probability is updated according to the rule

$$p^{\alpha}_{j,t+1} = \mu\, \tilde{p}^{\alpha}_{j,t} + (1 - \mu)\, \tilde{p}^{\alpha}_{i,t}, \qquad \alpha \in \{1, 2\}.$$

7 We considered the bomb-crater effect in earlier versions of the paper (see Hashimzade et al., 2014). The main results on the effects of predictive analytics are qualitatively similar to those we present here under the target effect.


The importance assigned to the social custom is also determined by interaction in the social network. The weight, $\chi_{j,t}$, is updated in period $t$ if information exchange occurs between $j$ and some other taxpayer in that period. Assume individual $j$ meets individual $i$ in occupation $\beta$ at time $t$ and information exchange takes place. The updating process is described by a function, $\chi_{j,t+1} = g\left(\chi_{j,t}, \mathbf{1}_{[\tilde{E}^{\beta}_{i,t} = 0]}\right)$, such that (i) $\chi_{j,t+1} \geq \chi_{j,t}$ if information is exchanged with a compliant taxpayer, i.e., if $\mathbf{1}_{[\tilde{E}^{\beta}_{i,t} = 0]} = 1$; and (ii) $\chi_{j,t+1} < \chi_{j,t}$ if information is exchanged with an evader, i.e., if $\mathbf{1}_{[\tilde{E}^{\beta}_{i,t} = 0]} = 0$. In the simulations we assume a partial adjustment process, where $j$'s individual weight adjusts by fraction $\varphi \in (0, 1)$ to its lower bound (zero) if $i$ was an evader and to its upper bound (one) if $i$ was compliant:

$$\chi_{j,t+1} = (1 - \varphi)\, \chi_{j,t} + \varphi\, \mathbf{1}_{[\tilde{E}^{\beta}_{i,t} = 0]} = \begin{cases} (1 - \varphi)\, \chi_{j,t} & \text{if } \tilde{E}^{\beta}_{i,t} > 0; \\ \chi_{j,t} + \varphi \left(1 - \chi_{j,t}\right) & \text{if } \tilde{E}^{\beta}_{i,t} = 0. \end{cases}$$
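The two updates that follow a meeting with information exchange, the averaging of beliefs and the partial adjustment of the weight on the social custom, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the dictionary representation of beliefs, and the values used for the averaging weight and the adjustment fraction are ours and are not taken from the Appendix.

```python
def exchange_beliefs(p_j, p_i, mu=0.5):
    """Convex combination of j's and i's post-audit beliefs, one per self-employment occupation."""
    return {occ: mu * p_j[occ] + (1.0 - mu) * p_i[occ] for occ in (1, 2)}


def update_custom_weight(chi_j, contact_evaded, phi=0.25):
    """Partial adjustment: towards zero after meeting an evader, towards one otherwise."""
    indicator = 0.0 if contact_evaded else 1.0  # indicator that the contact did not evade
    return (1.0 - phi) * chi_j + phi * indicator


# Hypothetical post-audit beliefs of taxpayer j and contact i.
beliefs_j = {1: 0.2, 2: 0.4}
beliefs_i = {1: 0.1, 2: 0.0}
print(exchange_beliefs(beliefs_j, beliefs_i))

# Hypothetical weight of 0.6 before the meeting.
print(update_custom_weight(0.6, contact_evaded=True))
print(update_custom_weight(0.6, contact_evaded=False))
```

The single expression in `update_custom_weight` covers both branches of the piecewise rule, since $(1 - \varphi)\chi + \varphi = \chi + \varphi(1 - \chi)$.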

We assume that earnings in occupation $\alpha$, $\alpha \in \{0, 1, 2\}$, are drawn from the lognormal distribution, $\log N\left(\mu_{\alpha}, \sigma^{2}_{\alpha}\right)$, and that skills in self-employment are given by $\frac{1}{1 - \eta \varsigma}$, where $\varsigma$ is drawn from the standard uniform distribution and $\eta \in (0, 1)$ is a constant parameter. Each individual knows their own wage in employment, $y^{0}_{j,t} = w_{j}$, their own skill, $s^{1,2}_{j}$, in the self-employment occupations $\alpha = 1, 2$, and the distribution of earnings, $F_{\alpha}(\cdot)$, in the self-employed occupations. At time $t = 0$ each individual is randomly assigned a vector of subjective beliefs, $\left(p^{1}_{j,0}, p^{2}_{j,0}\right)$, and the level of importance of social custom, $\chi_{j,0}$, independently from the standard uniform distribution. The objective probability of a random audit for all self-employed is 0.05; the employed are not audited.8
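As a check on the skill specification, the moments of a skill of the form one over (one minus a constant times a standard uniform draw) have closed forms. Writing the constant as `eta` (our label), the mean is $-\ln(1 - \eta)/\eta$ and the second moment is $1/(1 - \eta)$. The sketch below evaluates these for `eta = 0.5`. That value is purely our assumption (the calibrated value is given in the Appendix), chosen because it yields moments close to the population mean of 1.387 and standard deviation of 0.28 reported for skills in Table 2.

```python
import math
import random


def skill(eta, u):
    """Skill implied by a standard uniform draw u and the constant parameter eta."""
    return 1.0 / (1.0 - eta * u)


def skill_moments(eta):
    """Closed-form mean and standard deviation of the skill distribution."""
    mean = -math.log(1.0 - eta) / eta
    second_moment = 1.0 / (1.0 - eta)
    return mean, math.sqrt(second_moment - mean ** 2)


eta = 0.5  # assumed value, not taken from the paper's Appendix
mean, sd = skill_moments(eta)

# Monte Carlo cross-check of the closed-form mean.
rng = random.Random(0)
draws = [skill(eta, rng.random()) for _ in range(200_000)]
mc_mean = sum(draws) / len(draws)

print(f"closed-form mean = {mean:.4f}, sd = {sd:.4f}, simulated mean = {mc_mean:.4f}")
```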

A key feature of the simulation model is the choice of occupation by taxpayers. We calibrate the model so that the allocation of taxpayers across occupations with random audits matches the allocation in the UK. The UK Living Costs and Food Survey (LCFS) is used to obtain data on the occupations of all responding households in 2012. Since this is a subsample of the UK population we use the annual weights provided by the LCFS to correct for sampling bias. The households reporting as self-employed are then separated into two occupations using the Alm and Erard (2013) classification of occupations into Non-Risky (our SE1) and Risky (our SE2), the latter being those where there is known to be a higher presence of informal suppliers. This process provides baseline figures of 86 per cent in employment, eight per cent in SE1, and six per cent in SE2.

4 Baseline Simulations

We first conduct simulations of the network model described above under the assumption of random audits in order to obtain a baseline outcome. This allows an investigation of the

8 According to the IRS, around 0.96 per cent of individual tax returns filed in calendar year 2012 were examined (IRS, 2014). The audit rate by the size of (adjusted gross) income varied from as low as 0.58 per cent for incomes between $75,000 and $100,000, to as high as 24.16 per cent for incomes above $10,000,000.


nature of the equilibrium, a comparison of the model outcome with evidence, and the consequences of the alternative assumptions on information transmission between occupations.

The results we present assume the target effect for audits as specified in equation (3), as it is supported by recent empirical work (Advani et al., 2015). The results for the bomb-crater model differ only in the pattern of compliance after audit, as described below; the outcomes for the revenues are not qualitatively different from those under the target effect. A complete set of results for the bomb-crater model is available from the authors upon request. The values of the exogenous parameters and the distribution functions for the random variables are given in the Appendix.9

Two types of model are simulated that differ in the probability of information exchange between different groups. The first model (denoted Foc) uses focused information transmission. That is, at a meeting information is exchanged with positive probability only between linked agents in the same occupation. The second model (denoted Diff) uses diffused information transmission. In this case there is a positive probability that a meeting between linked taxpayers in different occupations results in information exchange, and the probability of information exchange at meetings between members of the same occupation is reduced compared to that under focused information transmission. We perform five independent simulations of each model over a 50-period horizon. The results we report for each transmission mechanism are averages for the final 40 periods over the five simulations performed (we drop the first 10 periods of each simulation to avoid startup effects). The results under diffused information exchange are shown in Figure 1, and Tables 1 and 2 report summary statistics under both focused and diffused information transmission.

The central message from Figure 1 is that sub-groups of the population (the occupational groupings) can endogenously form different attitudes to compliance. As expected, the operation of self-selection sorts those who are most willing to accept risk into the riskiest occupation (SE2). Self-employment gives them the opportunity to evade, and they make use of this opportunity to become the least compliant group. The updating process for beliefs and the transmission of information around the social network result in the subjective probability of audit being above the true probability for the self-employed. The two self-employed groups hold similar beliefs, but these are distinctly different from those of the employed. The non-zero belief for the employed reflects their learning about audits from meeting with the self-employed. The operation of the social custom results in the employed placing a high weight on following the custom for honesty. Taxpayers in the two self-employment occupations place much less weight on the social custom, but this effect is not significantly different between the two occupations under diffused information exchange. In contrast, with focused information exchange a significant difference in beliefs and weights placed on the social custom for honesty can emerge between the two self-employment occupations.

9 The value of the social custom $z$ is measured in units of utility. Therefore, although $z$ appears constrained to take very small values, these values are commensurate with the values taken by the utility function in (2). Thus, with the given parameterisation, a true report increases the utility of an "average" individual by about 10 per cent.


Figure 1: Results for random auditing with diffused information transmission.

Table 1 reports the sample means and the sample standard deviations for risk aversion, beliefs and compliance for each occupational group. The effect of self-selection into occupations is seen clearly in the mean level of risk aversion in each occupation. Risk aversion is significantly lower in SE2 than in SE1, and both are lower than in employment. The table also confirms that the subjective belief is above the true value of 0.05 of the probability of audit for those in self-employment, and that under focused information exchange it can differ between self-employment occupations. Compliance of those in employment is equal to one by construction, for this is measured as the proportion of agents who report truthfully. For both forms of information exchange compliance is lower for taxpayers in SE2.

Table 2 gives the population means and standard deviations (according to the assumed distribution) and the sample means and sample standard deviations for each occupational group for the wage in employment and skills in self-employment, as well as the sample means and sample standard deviations for the true earnings and reported earnings. The sample, again, consisted of the final 40 periods of five independent simulations. In interpreting the table it is important to note that the entries provide information on all three occupational


                          Employment             SE1                    SE2
                          Foc        Diff        Foc        Diff        Foc        Diff
Risk aversion ($\rho_j$)  2.8540     2.8304      1.4070     1.4205      0.9762     0.9904
                          (0.0140)   (0.0207)    (0.0629)   (0.0544)    (0.0549)   (0.0176)
Belief ($p_{j,t}$)        0.0002     0.0007      0.1731     0.2103      0.2057     0.2075
                          (0.0001)   (0.004)     (0.0159)   (0.0191)    (0.0136)   (0.0098)
Compliance                1.0000     1.0000      0.7441     0.7199      0.6527     0.6305
                          (0.0000)   (0.0000)    (0.0267)   (0.0202)    (0.0363)   (0.0258)

Table 1: Descriptive statistics by occupation under focused information exchange (Foc) and diffused information exchange (Diff). The figures reported are the sample means across the last 40 periods of five independent simulations. The figures in parentheses are the sample standard deviations.

possibilities a taxpayer could have chosen, not only the one they actually chose. For example, the table tells us that if, counterfactually, those taxpayers who actually chose SE2 had instead chosen SE1 their average skill in SE1 would have been 1.277. It can be seen that, once the individuals self-select into a particular occupation, the average productivity in each occupation (i.e., the wage for those in employment and skill for those in self-employment) is above the corresponding population mean, shown in the first column.

Hamilton (2000) estimates that, on average, the self-employed report income that is 35 per cent lower than in equivalent employment. In the simulation the reported incomes of the self-employed are between 25 per cent and 35 per cent lower than the employed, and so fit reasonably well with this observation. The evidence also suggests that reported income of the self-employed must be inflated by between around 29 per cent (Feldman and Slemrod, 2007), and 35 per cent (Pissarides and Weber, 1989) to account for under-reporting. It can be inflated to take account of the personal consumption of business goods by a further 34 per cent (Bradbury, 1997). The data from the simulation are, again, approximately consistent with these observations.
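The comparison with Hamilton (2000) can be reproduced directly from the sample means in Table 2. The short sketch below computes, for each self-employment occupation and transmission regime, the shortfall of declared self-employment earnings relative to declared employment earnings, together with the share of true earnings left undeclared. The second measure and all function names are ours, added for illustration; only the input figures come from Table 2.

```python
# Sample means copied from Table 2 (Foc = focused, Diff = diffused transmission).
declared = {
    "Employment": {"Foc": 13.3347, "Diff": 13.2723},
    "SE1": {"Foc": 9.5351, "Diff": 10.1166},
    "SE2": {"Foc": 8.8906, "Diff": 9.0089},
}
true_earnings = {
    "SE1": {"Foc": 14.0124, "Diff": 13.9719},
    "SE2": {"Foc": 14.1783, "Diff": 14.4470},
}


def reporting_gap(occupation, regime):
    """Shortfall of declared self-employment earnings relative to declared employment earnings."""
    return 1.0 - declared[occupation][regime] / declared["Employment"][regime]


def undeclared_share(occupation, regime):
    """Share of true earnings in the occupation that is not declared."""
    return 1.0 - declared[occupation][regime] / true_earnings[occupation][regime]


cells = [(occ, reg) for occ in ("SE1", "SE2") for reg in ("Foc", "Diff")]
gaps = {cell: reporting_gap(*cell) for cell in cells}
shares = {cell: undeclared_share(*cell) for cell in cells}

for occ, reg in cells:
    print(f"{occ} {reg}: gap = {gaps[(occ, reg)]:.3f}, undeclared = {shares[(occ, reg)]:.3f}")
```

On these figures the reporting gap lies roughly between a quarter and a third in every cell.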

5 Random Audits and Predictive Analytics

The role of predictive analytics is to identify the best audit targets, in terms of the expected level or the expected likelihood of non-compliance. Predictive analytics are used by tax authorities in many countries, including the IRS and HMRC. The IRS, for instance, uses information from its random audit programme to design discriminant functions (DIF) that are used to assign a score (called the DIF score) to each tax return for the likelihood that it contains some irregularities or evasion. Various methods are used for identifying targets for risk-based audits. We wish to explore the effects of predictive analytics on compliance behavior and to examine the extent to which they can improve on other audit strategies.

The method we employ to conduct the analysis is to embed predictive analytics within


Variable                  Population   Employment             SE1                    SE2
                                       Foc        Diff        Foc        Diff        Foc        Diff
Wage ($w_j$)              13.045       13.3347    13.2723     11.3906    11.4092     11.6563    11.7109
                          (2.00)       (0.0048)   (0.0054)    (0.0869)   (0.0777)    (0.0937)   (0.0959)
Skill in SE1 ($s^1_j$)    1.387        1.3518     1.3523      1.7533     1.7441      1.2966     1.277
                          (0.28)       (0.0006)   (0.0007)    (0.0163)   (0.0168)    (0.0055)   (0.004)
Skill in SE2 ($s^2_j$)    1.387        1.3631     1.3677      1.3222     1.3385      1.7282     1.655
                          (0.28)       (0.0004)   (0.0005)    (0.0032)   (0.0038)    (0.0238)   (0.004)
Declared earnings                      13.3347    13.2723     9.5351     10.1166     8.8906     9.0089
                                       (0.0048)   (0.0054)    (0.2553)   (0.2837)    (0.3898)   (0.2962)
True earnings                          13.3347    13.2723     14.0124    13.9719     14.1783    14.4470
                                       (0.0048)   (0.0054)    (0.2505)   (0.2874)    (0.4072)   (0.3579)

Table 2: Wages and skills in population and across occupations under focused information exchange (Foc) and diffused information exchange (Diff). The figures reported are the sample means across the last 40 periods of five independent simulations. The figures in parentheses are the sample standard errors.

the agent-based model. We then compare the outcome with predictive analytics based on tax return data to the outcome with random audits. The two forms of predictive analytics we investigate involve econometric analysis for predicting the level of non-compliance (level-targeting) and the probability of non-compliance (rate-targeting) by each taxpayer on the basis of the information provided on the tax return and by past audits.

This process is implemented in the simulation by using random audits (with each self-employed individual facing a 0.05 probability of being audited) for the first 50 periods to eliminate the effect of the initial conditions and to accumulate audit data. Data from the first ten periods are dropped from the statistics we record to remove starting point effects (but these ten periods are reported in the figures). The outcomes from the final five random audits are collected and at the end of period 50 are used to estimate a regression equation with the dependent variable being the amount of under-reported income in the first exercise and a binary variable taking the value of one if an individual under-reported their income and zero if they reported truthfully in the second exercise.10 The explanatory variables are the observed characteristics and the audit history of an agent; in our model these are termed occupation, declaration, and previous audit.11 The estimated equation is used to predict non-compliance given information collected in period 51. From this point onward, the estimated regression equation in period $t$ is used to predict non-compliance using tax return and audit history data in period $t + 1$; the audit outcomes in period $t + 1$ are added to the data set,

10 Alternatively, instead of under-reported income one can use unpaid tax as the variable of interest. In our model these approaches are equivalent because of the flat tax schedule.

11 In the regression "previous audit" is a binary variable, one if the individual was audited in the previous period and zero otherwise. This specification can be easily extended to include a longer audit history and/or previous audit outcomes.


and the regression analysis is repeated to predict non-compliance in $t + 2$, and so on.

The simulations reported here compare a targeted regime in which audits are based entirely on predictive analytics with the outcome of random auditing. In each period that predictive analytics are used agents are sorted according to their predicted level of evasion (largest evaders in the first exercise) or according to their predicted probability of evasion (most likely evaders in the second exercise). In the targeted regime the top five per cent of taxpayers by predicted non-compliance are audited (this means that the number of targeted audits is equal to the average number of random audits, so that the audit costs are, on average, equal between these two strategies). In Hashimzade et al. (2014) we also report results on mixed regimes in which a combination of predictive analytics and random audits are used. The mixed regimes are used to explore the possibility that the agents learn about the audit patterns by exchanging information in their networks: if the audits concentrate on one occupation, individuals in the other occupation may evade more, having learned that they are likely to get away with it.
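The selection step of the targeted regime, ranking taxpayers by predicted non-compliance and auditing the top five per cent, can be sketched as follows. This is an illustration only: the function name, the dictionary of predictions, and the rounding rule for the number of audits are our assumptions, and the predicted values are placeholders rather than output from the regressions.

```python
def select_for_audit(predicted, audit_share=0.05):
    """Return the ids of the top `audit_share` of taxpayers by predicted non-compliance."""
    n_audits = round(audit_share * len(predicted))  # assumed rounding rule
    ranked = sorted(predicted, key=lambda j: predicted[j], reverse=True)
    return ranked[:n_audits]


# Hypothetical predictions for 40 self-employed taxpayers (id -> predicted evasion).
predicted_evasion = {j: 0.1 * j for j in range(40)}
audited = select_for_audit(predicted_evasion)
print(audited)
```

The same function applies to both exercises: `predicted` holds predicted levels of evasion under level-targeting and predicted probabilities of evasion under rate-targeting.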

5.1 Targeting largest evaders

Since expected non-compliance is bounded below by zero (we do not allow for mistaken over-declaration) a Tobit (censored) regression is employed to predict the expected level of evasion. We report the outcomes for the model with diffused information transmission; the results for the focused transmission are qualitatively the same. The parameter values are the same as in the baseline simulations (section 4) and are listed in the Appendix.

Table 3 shows the marginal effects of explanatory variables upon the predicted level of under-reported income, calculated as the marginal effect for the average data point (avg. data) and as the average over individuals (indiv. avg.). The estimated coefficients change throughout each simulation as new data are added. To capture the long-run outcome we report the estimates based on the data accumulated up to the final period. To facilitate the reporting of standard errors, here we report the estimates from the last independent simulation performed, rather than an average across all the simulations performed.

One can see that individuals with higher declaration, those audited in the previous period, and those in SE1 are predicted to evade less tax (although only the declaration appears statistically significant). The last of these implies that, ceteris paribus, the targeted audits will tend to focus on individuals in SE2. The lower evasion level of the previously audited taxpayers reflects the assumption of the target effect, as opposed to the bomb-crater effect (under the bomb-crater model the previous audit has a positive effect on evasion).

The simulation results when the largest evaders are targeted are shown in Figure 2. The effect upon revenue of the introduction of predictive analytics is the sharp increase observed at period 50 in the first panel. This higher level of revenue is sustained for the remaining periods of the simulation. With random audits the subjective beliefs of the two self-employment occupations are the same. Once targeted audits are imposed the belief in SE2 is sustained just above the level under random auditing but falls in SE1. Compliance increases significantly in both self-employment occupations immediately after the imposition


Variable             Coeff.    St. E.   z-stat.   ME (avg. data)   ME (indiv. avg.)
Const.               15.84     0.21     74.77     13.5338          13.9044
Declared Income      -4.89     0.67     -7.26     -4.1814          -4.2959
Previous audit       -35.73    144      -0.02     -30.5271         -31.3631
Self-employment 1    -1.28     1.49     -0.86     -1.0949          -1.1249

Table 3: Estimated coefficients and marginal effects in the evasion level equation (Tobit model). The reported coefficients and standard errors are the maximum likelihood estimates from the data accumulated up to the final period of the last independent simulation (?? observations).

of predictive analytics, but then subsequently begins to fall in SE1 as taxpayers in this occupation learn that they are not the target of audits. This process continues until the point is reached at which the compliance of the two groups is approximately equal. This is close to the optimal outcome: those in SE2 earn more on average than do those in SE1 so it is optimal that their compliance level is somewhat higher also.

The final panel of the figure shows the empirical cumulative distribution function (cdf) for revenue under random audits and under targeted audits. It can be seen that the cdf for targeted audits first-order stochastically dominates the cdf for random audits. This implies that expected revenue under targeting is higher. Moreover, for any objective function of the tax authority that is an increasing function of revenue the expected value of the objective function will be higher under targeting.
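First-order stochastic dominance between two revenue samples can be checked by comparing their empirical cdfs at every observed value. The sketch below is a minimal illustration of that check; the revenue figures are invented placeholders, not simulation output, and the function names are ours.

```python
def ecdf(sample, x):
    """Empirical cdf of `sample` evaluated at x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def first_order_dominates(a, b):
    """True if the cdf of `a` lies weakly below that of `b` everywhere and strictly below somewhere."""
    grid = sorted(set(a) | set(b))
    diffs = [ecdf(b, x) - ecdf(a, x) for x in grid]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


def mean(sample):
    return sum(sample) / len(sample)


# Hypothetical per-period revenues under the two regimes.
random_revenue = [10.0, 11.0, 12.0, 13.0]
targeted_revenue = [11.5, 12.5, 13.5, 14.5]

print(first_order_dominates(targeted_revenue, random_revenue))
print(mean(targeted_revenue) > mean(random_revenue))
```

Dominance implies the ranking of the means, and more generally of the expectation of any increasing function of revenue, but the converse does not hold.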


Figure 2: Results when the tax authority uses predictive analytics to target the largest evaders from period 50 onwards.

These results show clearly that the use of predictive analytics increases compliance and results in higher revenue. The increase in compliance raises the chance of a meeting with a compliant taxpayer and thus leads to a steady increase in the importance of the social custom of honest reporting when predictive analytics are in operation. Compliance is not uniformly increased across occupational groups when predictive analytics are introduced because audits become concentrated on the least compliant occupation, SE2. With targeted audits the proportion targeted towards taxpayers in SE2 is very high, as one can see in Figure 3; as a result, compliance among taxpayers in SE1 drops. Despite this, the results still demonstrate that a policy of targeted audits outperforms random auditing. Our findings do, though, point to the need to support predictive analytics with a random audit programme to monitor the behavior of taxpayers outside the targeted group.12
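The two marginal-effect columns of Table 3 can be related to the raw coefficients. In a Tobit model censored at zero, the marginal effect of a regressor on the expected observed outcome is the coefficient scaled by $\Phi(x'\beta/\sigma)$, so the ratio of marginal effect to coefficient should be common to all regressors. The sketch below checks this using only the figures printed in Table 3. Reading the table through this standard formula is our assumption about how the marginal effects were computed, and the variable names are ours.

```python
from statistics import NormalDist

# (coefficient, ME at average data point, ME averaged over individuals) from Table 3.
table3 = {
    "Const.": (15.84, 13.5338, 13.9044),
    "Declared income": (-4.89, -4.1814, -4.2959),
    "Previous audit": (-35.73, -30.5271, -31.3631),
    "Self-employment 1": (-1.28, -1.0949, -1.1249),
}

# Ratio of marginal effect to coefficient for each regressor.
ratio_avg_data = {k: me / b for k, (b, me, _) in table3.items()}
ratio_indiv = {k: me / b for k, (b, _, me) in table3.items()}

for name in table3:
    print(f"{name:18s} avg. data: {ratio_avg_data[name]:.4f}   indiv. avg.: {ratio_indiv[name]:.4f}")

# Standardised index implied by the common scale factor at the average data point.
scale = sum(ratio_avg_data.values()) / len(ratio_avg_data)
implied_index = NormalDist().inv_cdf(scale)
print(f"implied standardised index at the average data point: {implied_index:.3f}")
```

The ratios agree across regressors up to the rounding of the printed coefficients, at about 0.855 for the average data point and about 0.878 for the average over individuals.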

12 Hashimzade et al. (2014) demonstrate that targeted audits also outperform mixed audit strategies for both groups in terms of the tax yield and average compliance but not necessarily in terms of compliance. The


Figure 3: Proportion of audited taxpayers who belong to SE2 in each period. The tax authority implements predictive analytics to target the largest evaders beyond period 50.

5.2 Targeting most likely evaders

In the previous sub-section the predictive analytics targeted audits at those individuals with the highest predicted levels of non-compliance, or undeclared income. An alternative strategy is to target those predicted most likely to evade, or those with the highest evasion score. The evasion score is similar to the credit score used by credit-rating agencies, and can be calculated by estimating a logit or probit regression. For the explanatory variables we use the same observed characteristics of the agents and their audit histories as in the previous exercise.

In the data the evasion score is assigned the value of one if the individual evaded tax and zero if they declared truthfully. The predicted values of the evasion score from the regression have the interpretation of the predicted probabilities that an individual will under-report their income. In every period the individuals are ranked according to their predicted evasion score, and those with the highest predicted score are audited. Again, the number of audits is equal to the average of the number of random audits, in order to equalize the audit costs across strategies. The targeted audits are then compared with random audits. We present the results obtained from the logit regression; the results of the probit regression are very similar and are available upon request. Here, again, we used the model with diffused information transmission for the same parameter values as in previous simulations (sections 4 and 5.1), listed in the Appendix; the results for the focused transmission are qualitatively the same.

The simulation results when the most likely evaders are targeted are shown in Figure 4. By comparing Figure 4 with Figure 2 it may be seen that the evolution of revenue and beliefs when the most likely evaders are targeted is very similar to that when the largest evaders are targeted. Compliance behavior, however, appears to become somewhat more sensitive to the introduction of predictive analytics, with a bigger fall in compliance from

"mixed" regimes of targeted and random audits are akin to the random enquiry programmes run alongside targeted audits by the US IRS and the UK HMRC.


taxpayers in SE1 and a bigger rise in compliance from taxpayers in SE2. The empirical cdfs for revenue when targeting the largest evaders and when targeting the most likely evaders are very close, which suggests that these strategies can be equally successful in recovering unpaid tax.

Figure 4: Results when the tax authority uses predictive analytics to target the most likely evaders from period 50 onwards.

Table 4 shows the marginal effects of explanatory variables on the predicted probability of evasion. As in section 5.1, the estimated coefficients change as new data are added. We report the estimates based on the data accumulated up to the final period in the last independent simulation; this captures the long-run outcome and allows meaningful standard errors to be reported. The table shows that individuals with higher declared income, those audited in the previous period, and those in SE1 are less likely to evade. These results are similar to the predictions of the model for level-targeting; neither the effect of the previous audit nor the effect of occupation is statistically significant at conventional levels.


Variable           Coeff.    St. E.   z-stat.   ME (avg. data)   ME (indiv. avg.)
Const.             8.82      1.89     4.66      -                -
Declared Income    -1.50     0.32     -4.69     -0.001           -0.001
Previous audit     -14.10    171      -0.08     -0.998           -0.938
SE1                -3.28     2.35     -1.39     -0.014           -0.005

Table 4: Estimated marginal effects from the evasion score equation (logit model). The reported coefficients and standard errors are the maximum likelihood estimates from the data accumulated up to the final period of the last independent simulation (?? observations).
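To illustrate how the evasion score is obtained from the logit equation, the sketch below plugs the point estimates from Table 4 into the logistic function. The taxpayer profiles are hypothetical, the function and dictionary names are ours, and we assume that declared income enters in the same units as the declared earnings reported in Table 2. The sketch is not the estimation code used in the simulations.

```python
import math

# Point estimates copied from Table 4.
COEF = {"const": 8.82, "declared_income": -1.50, "previous_audit": -14.10, "se1": -3.28}


def evasion_score(declared_income, previous_audit, se1):
    """Predicted probability of evasion from the logit index (previous_audit and se1 are 0/1 dummies)."""
    index = (
        COEF["const"]
        + COEF["declared_income"] * declared_income
        + COEF["previous_audit"] * previous_audit
        + COEF["se1"] * se1
    )
    return 1.0 / (1.0 + math.exp(-index))


# Hypothetical taxpayers: (declared income, audited last period, works in SE1).
profiles = {
    "SE2, low declaration": (5.0, 0, 0),
    "SE2, high declaration": (9.0, 0, 0),
    "SE1, low declaration": (5.0, 0, 1),
    "SE2, audited last period": (5.0, 1, 0),
}
scores = {name: evasion_score(*x) for name, x in profiles.items()}

# Rank from most to least likely evader, as the targeted regime would.
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name:26s} {scores[name]:.4f}")
```

Because all three slope coefficients are negative, the score falls with declared income, with a previous audit, and with membership of SE1, so the ranking places low-declaring SE2 taxpayers who were not recently audited at the top.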

6 Conclusions

The optimal design of audit strategy is important for tax authorities, whose aim is to design policy instruments to reduce the tax gap (the difference between anticipated and actual tax revenue). In this paper we analyze two alternative strategies that use the concept of predictive analytics: targeting the largest evaders and targeting the most likely evaders. We do this in a rich network model where taxpayers are heterogeneous in risk, beliefs, and attitude towards compliance, and where agents self-select into different occupational groups. In this model attitudes and beliefs endogenously emerge that differ across sub-groups of the population, compliance behavior is different across occupational groups, and these effects are reinforced by the development of group-specific attitudes and beliefs. Given this behavior, the tax authority may wish to condition its audit strategy not only on reported income, but also on occupation.

What does our model suggest for the optimal strategy of a tax authority? On the one hand, given the objective of maximizing revenues, targeting the level, or the value, of evasion appears to be more important. On the other hand, the "strike rate", or the proportion of audits that reveal evasion, is also important, if the tax authority wants to reduce the burden on the compliant population (James, 2011). Our results imply that the two strategies have the same quantitative effect on the revenues; furthermore, the rates of compliance in the population and in each occupation are not noticeably different between the two strategies.

The robustness of this conclusion to the selection of model assumptions is, however, an issue that could be addressed in future research. We have assumed that the agents live indefinitely throughout the model. It would be possible to incorporate finite lives within the simulation with some agents leaving the economy each period and new agents joining. What would be harder to incorporate is forward-looking agents because solving the fixed-point problem that would then arise would be exceptionally demanding on computational resources.13 We have adopted a very simple process of random network formation. Other network formation processes would be interesting, particularly networks with "celebrities"

13 We run the simulation on a 64-bit computer with a multi-core Intel processor. The computing time for ten independent simulations of the model is approximately 48 hours. Incorporating forward-looking agents would require a considerable reduction in population size in order to make the analysis computationally feasible.


who influence many other agents. Another audit strategy of interest is the "light-touch" audit, where either a random or a targeted audit can reveal only a fraction of concealed income but at a lower cost to the tax authority. Light-touch audits allow a wider coverage of the population, thereby raising subjective beliefs and improving compliance; however, partial detection increases the expected payoff from evasion and encourages non-compliance (Rablen, 2014). This trade-off, along with the cost considerations, can lead to the selection of an optimal mix of audits.

References

Advani, A., Elming, W., Shaw, J., 2015. How long lasting are the effects of audits? TARC Discussion Paper 011-15.

Allingham, M., Sandmo, A., 1972. Income tax evasion: A theoretical analysis. J. Pub. Econ. 1, 323-338.

Alm, J., Erard, B., 2013. Using public information to estimate informal supplier income, unpublished, Tulane University.

al-Nowaihi, A., Dhami, S., 2007. Why do people pay taxes: Expected utility theory versus prospect theory. J. Econ. Behav. Org. 64, 171-192.

Andrei, A.L., Comer, K., Koehler, M., 2014. An agent-based model of network effects on tax compliance and evasion. J. Econ. Psychol. 40, 119-133.

Arcand, J.-L., Graziosi, G.R., 2005. Tax compliance and rank dependent expected utility. Geneva Risk Insur. Rev. 30, 57-69.

Ashby, J.S., Webley, P., Haslam, A.S., 2009. The role of occupational taxpaying cultures in taxpaying behaviour and attitudes. J. Econ. Psychol. 30, 216-227.

Axtell, R.L., Epstein, J.M., Young, H.P., 2006. The emergence of classes in a multi-agent bargaining model. In: Epstein, J.M. (Ed.), Generative Social Science: Studies in Agent-Based Computational Modeling. Princeton University Press, Princeton, NJ.

Baldry, J.C., 1986. Tax evasion is not a gamble. Econ. Lett. 22, 333-335.

Bernasconi, M., 1998. Tax evasion and orders of risk aversion. J. Pub. Econ. 67, 123-134.

Bernasconi, M., Zanardi, A., 2004. Tax evasion, tax rates and reference dependence. FinanzArchiv 60, 422-445.

Black, J., de Meza, D., 1997. Everyone may benefit from subsidising entry to risky occupations. J. Pub. Econ. 66, 409-424.

Bloomquist, K.M., 2004. Multi-agent based simulations of the deterrent effects of taxpayer audits. In: Kalambokidis, L. (Ed.), Proceedings of the 97th Annual Conference on Taxation. National Tax Association, Washington, DC.

Bloomquist, K.M., 2011. Tax compliance as an evolutionary coordination game: An agent-based approach. Public Finan. Rev. 39, 25-49.


Bloomquist, K.M., 2013. Incorporating indirect effects in audit case selection: An agent-based approach. IRS Res. Bull. 103–116.

Bradbury, B., 1997. The living standards of the low income self-employed. Australian Econ. Rev. 30, 374–389.

Camerer, C., Weber, M., 1992. Recent development in modelling preferences: Uncertainty and ambiguity. J. Risk Uncertain. 5, 325–370.

Chander, P., Wilde, L.L., 1998. A general characterization of optimal income tax enforcement. Rev. Econ. Stud. 65, 165–183.

Chetty, R., 2009. Is the taxable income elasticity sufficient to calculate deadweight loss? The implications of evasion and avoidance. Amer. Econ. J.: Econ. Pol. 1, 31–52.

Christiansen, V., 1980. Two comments on tax evasion. J. Pub. Econ. 13, 389–393.

Cleary, D., 2011. Predictive analytics in the public sector: Using data mining to assist better target selection for audit. Electron. J. e-Govern. 9, 132–140.

Clotfelter, C.T., 1983. Tax evasion and tax rates: An analysis of individual returns. Rev. Econ. Stat. 65, 363–373.

Cowell, F.A., 1981. Taxation and labour supply with risky activities. Economica 48, 365–379.

Cowell, F.A., 1985. Tax evasion with labour income. J. Pub. Econ. 26, 19–34.

Cowell, F.A., Gordon, J.P.F., 1988. Unwillingness to pay. J. Pub. Econ. 36, 305–321.

Crane, S.E., Nourzad, F., 1986. Inflation and tax evasion: An empirical analysis. Rev. Econ. Stat. 68, 217–223.

Davis, J.S., Hecht, G., Perkins, J.D., 2003. Social behaviours, enforcement, and tax compliance dynamics. Account. Rev. 78, 39–69.

De Juan, A., Lasheras, M.A., Mayo, R., 1994. Voluntary tax compliant behavior of Spanish income tax payers. Public Finan. 49, 90–115.

Eide, E., 2001. Rank Dependent Expected Utility Models of Tax Evasion. Working Paper ICER 27/2001, International Centre for Economic Research.

Eisenhauer, J.G., 2006. Conscience as a deterrent to free riding. Int. J. Social Econ. 33, 534–546.

Eisenhauer, J.G., 2008. Ethical preferences, risk aversion, and taxpayer behavior. J. Socio. Econ. 37, 45–63.

Elffers, H., Hessing, D.J., 1997. Influencing the prospects of tax evasion. J. Econ. Psychol. 18, 289–304.

Feldman, N.E., Slemrod, J., 2007. Estimating tax noncompliance with evidence from unaudited tax returns. Econ. J. 117, 327–352.

Fortin, B., Lacroix, G., Villeval, M.-C., 2007. Tax evasion and social interactions. J. Pub. Econ. 91, 2089–2112.


Franklin, J., 2009. Networks and taxpayer non-compliance. H.M. Revenue and Customs, London.

Geeroms, H., Wilmots, H., 1985. An empirical model of tax evasion and tax avoidance. Public Finan. 40, 190–209.

Glaeser, E., Sacerdote, B., Scheinkman, J., 1996. Crime and social interactions. Q. J. Econ. 111, 507–548.

Gordon, J.P.F., 1989. Individual morality and reputation costs as deterrents to tax evasion. Eur. Econ. Rev. 33, 797–805.

Goyal, S., 2009. Connections: An Introduction to the Economics of Networks. Princeton University Press, Princeton, NJ.

Grether, D., 1980. Bayes rules as a descriptive model: The representativeness heuristic. Q. J. Econ. 95, 537–557.

Guala, F., Mittone, L., 2005. Experiments in economics: External validity and the robustness of phenomena. J. Econ. Method. 12, 495–515.

Hamilton, B.H., 2000. Does entrepreneurship pay? An empirical analysis of the returns to self employment. J. Pol. Econ. 108, 604–631.

Hashimzade, N., Myles, G.D., Rablen, M.D., 2014. Predictive analytics and the targeting of audits. TARC Discussion Paper 008-14.

Hashimzade, N., Myles, G.D., Tran-Nam, B., 2013. Applications of behavioural economics to tax evasion. J. Econ. Surv. 27, 941–977.

Hsu, K.-W., Pathak, N., Srivastava, J., Tschida, G., Bjorklund, E., 2015. Data mining based tax audit selection: A case study of a pilot project at the Minnesota Department of Revenue. Ann. Inform. Syst. 17, 221–245.

IRS, 2014. Internal Revenue Service Data Book, 2013. Publication 55B. IRS, Washington, DC.

Isachsen, A.J., Strøm, S., 1980. The hidden economy: The labor market and tax evasion. Scand. J. Econ. 82, 304–311.

Jackson, M.O., 2004. A survey of models of network formation: Stability and efficiency. In: Demange, G., Wooders, M. (Eds.), Group Formation in Economics: Networks, Clubs and Coalitions. Cambridge University Press, Cambridge.

James, R., 2011. Using predictive analytics to help target error and fraud in tax credits. International Conference on Taxation Analysis and Research, London, December 2011.

Kahneman, D., Tversky, A., 1979. Prospect theory: An analysis of decision under risk. Econometrica 47, 263–293.

Kanbur, S.M., 1981. Risk taking and taxation: An alternative perspective. J. Pub. Econ. 15, 163–184.


Kastlunger, B., Kirchler, E., Mittone, L., Pitters, J., 2009. Sequences of audits, tax compliance, and taxpaying strategies. J. Econ. Psychol. 30, 405–418.

Kim, Y., 2003. Income distribution and equilibrium multiplicity in a stigma-based model of tax evasion. J. Pub. Econ. 87, 1591–1616.

Kirchler, E., Hoelzl, E., Wahl, I., 2008. Enforced versus voluntary tax compliance: The “slippery slope” framework. J. Econ. Psychol. 29, 210–225.

Kolm, S.-C., 1973. A note on optimum tax evasion. J. Pub. Econ. 2, 265–270.

Korobow, A., Johnson, C., Axtell, R., 2007. An agent-based model of tax compliance with social networks. Natl. Tax J. 60, 589–610.

Lee, K., 2001. Tax evasion and self-insurance. J. Pub. Econ. 81, 73–81.

Maciejovsky, B., Kirchler, E., Schwarzenberger, H., 2007. Misperception of chance and loss repair: On the dynamics of tax compliance. J. Econ. Psychol. 28, 678–691.

Mittone, L., 2006. Dynamic behaviour in tax evasion: An experimental approach. J. Socio. Econ. 35, 813–835.

Myles, G.D., 1995. Public Economics. Cambridge University Press, Cambridge.

Myles, G.D., Naylor, R.A., 1996. A model of tax evasion with group conformity and social customs. Eur. J. Polit. Econ. 12, 49–66.

Pestieau, P., Possen, U.M., 1991. Tax evasion and occupational choice. J. Pub. Econ. 45, 107–125.

Pissarides, C.A., Weber, G., 1989. An expenditure-based estimate of Britain's black economy. J. Pub. Econ. 39, 17–32.

Plumley, A.H., Steuerle, C.E., 2004. Ultimate objectives for the IRS: Balancing revenue and service. In: Aaron, H.J., Slemrod, J. (Eds.), The Crisis in Tax Administration. Brookings Institution Press, Washington, DC.

Pyle, D.J., 1991. The economics of taxpayer compliance. J. Econ. Surv. 5, 163–198.

Quiggin, J., 1981. Risk perception and the analysis of risk attitudes. Australian J. Agric. Econ. 25, 160–169.

Quiggin, J., 1982. A theory of anticipated utility. J. Econ. Behav. Organ. 3, 323–343.

Quiggin, J., Wakker, P., 1994. The axiomatic basis of anticipated utility: A clarification. J. Econ. Theory 64, 486–499.

Rablen, M.D., 2010. Tax evasion and exchange equity: A reference-dependent approach. Public Finan. Rev. 38, 282–305.

Rablen, M.D., 2014. Audit probability versus effectiveness: The Beckerian approach revisited. J. Public Econ. Theor. 16, 322–342.

Reinganum, J., Wilde, L., 1985. Income tax compliance in a principal-agent framework. J. Pub. Econ. 26, 1–18.


Reinganum, J., Wilde, L., 1986. Equilibrium verification and reporting policies in a model of tax compliance. Int. Econ. Rev. 27, 739–760.

Sandmo, A., 2005. The theory of tax evasion: A retrospective view. Natl. Tax J. 58, 643–663.

Schneider, F., Enste, D.H., 2000. Shadow economies: Size, causes, and consequences. J. Econ. Lit. 38, 77–114.

Slemrod, J., 2007. Cheating ourselves: The economics of tax evasion. J. Econ. Perspect. 21, 25–48.

Snow, A., Warren, R.S. Jr., 2005. Ambiguity about audit probability, tax compliance, and taxpayer welfare. Econ. Inq. 43, 865–871.

Souma, W., Fujiwara, Y., Aoyama, H., 2003. Complex networks and economics. Physica A 324, 396–401.

Spicer, M.W., Lundstedt, S.B., 1976. Understanding tax evasion. Public Finan. 31, 295–305.

Tesfatsion, L., 2006. Agent-based computational economics: A constructive approach to economic theory. In: Tesfatsion, L., Judd, K.L. (Eds.), Handbook of Computational Economics. North-Holland, Amsterdam.

Thaler, R.H., 1994. The winner's curse: Paradoxes and anomalies of economic life. Princeton University Press, Princeton, NJ.

Torgler, B., 2002. Speaking to theorists and searching for facts: Tax morale and tax compliance in experiments. J. Econ. Surv. 16, 657–683.

Trandel, G., Snow, A., 1999. Progressive income taxation and the underground economy. Econ. Lett. 62, 217–222.

Traxler, C., 2010. Social norms and conditional cooperative taxpayers. Eur. J. Polit. Econ. 26, 89–103.

Tversky, A., Kahneman, D., 1992. Advances in prospect theory: Cumulative representation of uncertainty. J. Risk Uncertain. 5, 297–323.

Wallschutzky, I.G., 1984. Possible causes of tax evasion. J. Econ. Psychol. 5, 371–384.

Webley, P., Robben, H., Morris, I., 1988. Social comparison, attitudes and tax evasion in a shop simulation. Social Behav. 3, 219–228.

Weigel, R.H., Hessing, D.J., Elffers, H., 1987. Tax evasion research: A critical appraisal and theoretical model. J. Econ. Psychol. 8, 215–235.

Yaniv, G., 1999. Tax compliance and advance tax payments: A prospect theory analysis. Natl. Tax J. 52, 753–764.

Yitzhaki, S., 1974. A note on income tax evasion: A theoretical analysis. J. Pub. Econ. 3, 201–202.


Parameter values

Probability of being linked in the network: $v = 0.5$.
Tax rate: $0.25$.
Fine rate: $f = 1.5$.
Weight in information exchange: $0.75$.
Number of agents: $N = 6000$.
Time horizon: $T = 50$.

Probability distributions ($N$ = normal; $U$ = continuous uniform)

Wage in employment: $\log w \sim N(\mu_0, \sigma_0^2)$; $E[w] = 13.0425$; $\mathrm{Var}(w) = 4$.
Skill in SE1: $s_j^1 = \frac{1}{1 - 0.5\tilde{x}}$, $\tilde{x} \sim U[0, 1]$; $E[s_j^1] = 1.387$; $\mathrm{Var}(s_j^1) = 0.078$.
Income in SE1: $s_j^1 y_1$; $\log y_1 \sim N(\mu_1, \sigma_1^2)$; $E[y_1] = 8.01$; $\mathrm{Var}(y_1) = 12.26$.
Skill in SE2: $s_j^2 = \frac{1}{1 - 0.5\tilde{x}}$, $\tilde{x} \sim U[0, 1]$; $E[s_j^2] = 1.387$; $\mathrm{Var}(s_j^2) = 0.078$.
Income in SE2: $s_j^2 y_2$; $\log y_2 \sim N(\mu_2, \sigma_2^2)$; $E[y_2] = 8.3$; $\mathrm{Var}(y_2) = 22.8$.
Risk aversion: drawn from $U[0.1, 5.1]$.
Initial belief on audit probability: $p_0 \sim U[0, 1]$.
Initial weight on the payoff to following the social custom: drawn from $U[0, 1]$.
Value of social custom: $z \sim 3 \times 10^{-5} \times U[0, 1]$.
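The listed moments can be checked, and agent characteristics drawn, with a short sketch. It reads the skill distribution as s = 1/(1 - 0.5x) with x uniform on [0, 1], and it recovers the log-normal parameters from the reported means and variances using the standard log-normal moment formulas. The function names and dictionary keys are hypothetical, and drawing every characteristic once per agent is an assumption made for illustration rather than a description of the simulation code used in the paper.

```python
import math
import random


def lognormal_params(mean, var):
    """Return (mu, sigma) of log X for a log-normal X with the given mean and variance."""
    sigma2 = math.log(1.0 + var / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


def skill_moments():
    """Closed-form mean and variance of s = 1 / (1 - 0.5 x) with x ~ U[0, 1]."""
    mean = 2.0 * math.log(2.0)   # integral of 1 / (1 - 0.5 x) over [0, 1]
    second_moment = 2.0          # integral of 1 / (1 - 0.5 x)^2 over [0, 1]
    return mean, second_moment - mean ** 2


def draw_agent(rng):
    """Draw one agent's characteristics (keys are illustrative names)."""
    mu_w, sd_w = lognormal_params(13.0425, 4.0)
    return {
        "wage": rng.lognormvariate(mu_w, sd_w),
        "skill_se1": 1.0 / (1.0 - 0.5 * rng.random()),
        "skill_se2": 1.0 / (1.0 - 0.5 * rng.random()),
        "risk_aversion": rng.uniform(0.1, 5.1),
        "initial_belief": rng.random(),
        "custom_weight": rng.random(),
        "custom_value": 3e-5 * rng.random(),
    }


if __name__ == "__main__":
    print(skill_moments())              # compare with the listed 1.387 and 0.078
    print(draw_agent(random.Random(0)))
```

The closed-form skill mean is 2 ln 2 and the second moment is 2, which agree with the reported 1.387 and 0.078 to the precision given.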
