creating a decentralised payment network: a study of bitcoin

Upload: coinometrics

Post on 12-Oct-2015


DESCRIPTION

Bitcoin provides the first case study of a decentralised payment network. With no central authority, participants have to agree upon a set of rules in order to process transactions. Delays in the transmission of information between participants create network partitions, where some participants have different information sets. This paper presents an empirical study of this phenomenon and a model of incentives facing network participants as a result.

TRANSCRIPT

  • Creating a decentralised payment network: A study of Bitcoin1

    Jonathan Levin2

    Department of Economics,

    University of Oxford,

    Oxford, OX1 3UQ, UK.

    May 14, 2014

    Abstract

    Bitcoin provides the first case study of a decentralised payment network. With

    no central authority, participants have to agree upon a set of rules in order to process

    transactions. Delays in the transmission of information between participants create

    network partitions, where some participants have different information sets. This

    paper presents an empirical study of this phenomenon and a model of incentives

    facing network participants as a result.

    1This thesis is submitted in partial fulfilment of the requirements for the degree of Master of Philosophy in Economics. Word Count: 19 095

    2I thank my supervisors Alexandre de Corniere and Mark Armstrong, for many helpful discussions. I am grateful to the Coinometrics team, in particular, Nadi Sarrar and the Organ of Corti, who helped collect data and provided a great sounding board throughout this assignment. I also thank many members of the Bitcoin community for responding to elementary questions. Any errors are my own.

  • 1 Introduction

    Bitcoin provides a case study of a new type of payment system. All existing payment

    systems have centralised validators that are trusted to process transactions and guard

    against fraud and double spending. In order to obtain the ultimate goal of having

    a decentralised payment system with no single point of failure, Bitcoin has to solve

    these security threats in more adverse conditions than in traditional payment systems.

    There are no mechanisms to maintain trust between validators on the Bitcoin network

    and as a decentralised system, it suffers symptomatic delays in sending and receiving

    information. With no centralised authority on the network, the Bitcoin protocol sets the

    rules of engagement but has no way of enforcing them. Validators need to be provided

    with economic incentives so that it is in their own best interest to obey the rules of the

    protocol. This thesis combines empirical observations of activity on the Bitcoin network

    and the rules of the protocol to derive the optimal behaviours of validators (miners)

    on the Bitcoin network.

    The financial crisis has been the focus of both academic and media attention in the past

    five years. Many models of macroeconomics are being reworked to include the role of

    banks as financial intermediaries. Banks in these frameworks are simply parties that

    are able to exploit synergies between the provision of liquid deposits and equally illiquid

    loans (Diamond & Dybvig 1983, Kashyap et al. 2002). However, banks have always

    served as providers of payment services. In this role they transfer liquid claims quickly

    and cheaply, with minimal legal uncertainty (Kahn & Roberds 2009). Recent trends in

    financial innovation and deregulation have led to some of the core advantages of banks

    as financial intermediaries being eroded as new entrants and markets have replaced the

    need for their services. Banks, with privileged access to wholesale payment systems, still

    command great power in payment systems (Lacker 2006). However, this dominance is

    also being challenged by the entry of new software-as-a-service platforms such as PayPal


  • and Transferwise and peer to peer markets. Bitcoin presents a new model of payment

    system, where peer to peer transactions are handled on top of a decentralised architecture

    with no central authority. The analysis of these new ways of transferring value sits within

    the mechanism design approach of payment economics.

    In retail payment systems, principal policy debates have raged over issues of competitive

    efficiency. Payment systems are a relatively expensive component of the financial infras-

    tructure, estimated to generate revenues of 3 per cent of GDP in the US (Humphrey

    et al. 2000). The industry is highly concentrated with only a few major players in the

    developed markets. In 2003, an antitrust case brought by Wal-Mart against Visa and

    MasterCard was settled for $3 billion. In Australia, card fees have been the focus of

    regulatory efforts (Lowe & Macfarlane 2005). The electronic retail payment system has also

    come under scrutiny due to large amounts of data theft and privacy leaks resulting in

    outright fraud. In 2013, Target, a major retailer in the US, lost up to 110 million cus-

    tomer records with their associated credit card details, email addresses, names and other

    personal information (Harris & Perlroth 2014). Data leaks lead to further and more long

    term implications with phishing ploys and other extortive methods. As other industries

    have undergone technological transformation since the invention of the internet, there are

    many initiatives to radically transform legacy payments systems. The creator of Bitcoin,

    Satoshi Nakamoto (2008), had a decentralised vision of how the transfer of value can

    be achieved. In avoiding central authorities, Bitcoin, by design, would serve to alleviate

    the issues of market concentration and loss of consumer data as these responsibilities

    are placed squarely with participants. In its technical design Bitcoin achieves what it

    sets out to do; however, the challenge remains to incentivise a decentralised payment system

    that does not tend towards the oligopolistic structure seen in the existing payments space. This

    paper looks at both the empirical reality of the Bitcoin network and creates a framework

    to understand the long run incentive structures that Bitcoin has created for itself.


  • In this paper, I develop a model to study strategic incentives of bitcoin miners (the

    validators), with respect to how many transactions to process. Processing more trans-

    actions increases the miner revenues conditional on winning (see next section), but

    reduces the probability of winning. The main contributions of this thesis are (i) to de-

    rive an explicit formula for the probability of winning as a function of the number of

    transactions; (ii) to estimate this probability using data which I collected; and (iii) to use

    this estimate in order to inform market design and security questions. In particular, I

    show that transaction fees should be increased by a factor of four in order to ensure

    that the market does not converge to a bad equilibrium in which no transactions are

    processed by profit-maximizing miners.

    2 Introduction to Bitcoin

    2.1 Creating a secure global state verification process

    The Bitcoin protocol is designed to create a safe way for a group of anonymous agents to

    update a public ledger of all past transactions. The same way that a bank uses a ledger

    to keep a record of flows of money, the Bitcoin network has a public ledger visible to

    all. In traditional privacy models, trusted third parties control the maintenance of such

    a ledger. In Bitcoin, making the ledger transparent allows voluntary members of the

    public to be the auditors of the ledger. The technological breakthrough of Bitcoin was

    to design a protocol that ensures that it is very costly to manipulate the public ledger

    and as a result, participants on the network can come to consensus on the current state

    of the ledger and thus the current owners of all the bitcoins in circulation.3

    Due to the public nature of the ledger, bitcoins are associated with unique identifiers,

    3Bitcoin with a capital B can refer to the currency as a unit of account, the P2P network or the protocol, whereas bitcoins are the currency units.


  • known as Bitcoin addresses, rather than the owner's name or account number.4

    Ownership of a Bitcoin address is proven by possession of a unique private key for a particular

    address. When transacting, users have to digitally sign a transaction, using their

    private key, in order to reassign their coins to another user. Coins are never created or

    destroyed in transactions; the rights to spend them, in the eyes of the protocol, are simply

    reassigned.

    Bitcoin transactions are formatted according to double entry accounting. There are

    inputs to a transaction which must be previously unspent bitcoins and outputs, the

    quantities and corresponding addresses to where they are now sent. A valid transaction

    is defined by total inputs equaling total outputs.5 Every participant on the network

    can therefore verify transactions. Validation of transactions is done by every node on

    the network. The users do not place trust in any given validator but in the collective

    validation of many different participants, where none can unilaterally alter the rules of

    the game or the record of transactions. Since no validators are trusted, Bitcoin needs a

    mechanism to obtain a unanimous consensus of the state of the global ledger. As with

    other consensus mechanisms, Bitcoin achieves this through a voting mechanism.
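
    The validity rule described above can be illustrated with a minimal sketch in Python. The names Output, Transaction, fee and is_valid are hypothetical and do not come from any Bitcoin implementation; signature checks are omitted, amounts are arbitrary integer units, and, following footnote 5, inputs are allowed to exceed outputs, with the difference left for the validator.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical reference format: (transaction id, output index).
OutRef = Tuple[str, int]


@dataclass(frozen=True)
class Output:
    address: str  # receiving address (illustrative label only)
    amount: int   # amount in integer currency units


@dataclass(frozen=True)
class Transaction:
    inputs: Tuple[OutRef, ...]   # previously unspent outputs being spent
    outputs: Tuple[Output, ...]  # new owners and quantities


def fee(tx: Transaction, unspent: Dict[OutRef, Output]) -> int:
    """Inputs minus outputs: the amount left for the validator (footnote 5)."""
    total_in = sum(unspent[ref].amount for ref in tx.inputs)
    total_out = sum(out.amount for out in tx.outputs)
    return total_in - total_out


def is_valid(tx: Transaction, unspent: Dict[OutRef, Output]) -> bool:
    # Every input must be used only once within the transaction...
    if len(set(tx.inputs)) != len(tx.inputs):
        return False
    # ...and must refer to an output that has not been spent yet.
    if any(ref not in unspent for ref in tx.inputs):
        return False
    # Outputs must be non-negative and may not exceed the inputs.
    if any(out.amount < 0 for out in tx.outputs):
        return False
    return fee(tx, unspent) >= 0
```

    Because the check needs only the transaction and the set of unspent outputs, any node holding the ledger can run it independently, which is the sense in which validation is done by every node.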

    Voting on the bitcoin network is the ability to send an update of the global state of the

    ledger to other participants. An update of the global state of the ledger would include a

    list of transactions that have occurred transferring bitcoins between different addresses.

    Bitcoin in this way is similar to the stone money of the island of Yap (Gillilland 1975).

    On the island, ownership was primarily determined by consensus on the current state

    of ownership rather than physically moving objects of value. However, unlike the island

    of Yap, where there was trust between the different parties and an ability to settle

    disputes, Bitcoin operates in a low trust environment and needs to implement a rule for

    4Without cost, users can create new Bitcoin addresses to store their coins offering cheap possibilities of maintaining privacy.

    5In Bitcoin agents can spend fewer outputs than inputs, where this money is left for a validator to pick up.


  • how consensus is achieved.

    The network faces two main challenges when achieving unanimous consensus. First,

    with no centralised arbitrator, Bitcoin needs a method of creating consensus among

    pseudo-anonymous participants. Collecting individual votes and applying a majority

    voting rule is not feasible as there are no barriers to participation and an attacker may

    claim multiple fraudulent identities. In the computer science literature, this is commonly

    known as a Sybil attack (Douceur 2002). Second, information cannot be guaranteed to

    arrive at the same time and some messages may fail to reach their destination. In a

    decentralised network where every node is not connected to every other node, there are

    inevitably delays in how information spreads around the network. Each node only has

    local knowledge of its neighbours and no view of the global state of the network. The

    second challenge is a very well studied problem in the computer science literature known

    as the Byzantine Generals problem (Lamport et al. 1982).

    In order to minimise the risk of a Sybil attack, Bitcoin introduces a costly voting

    mechanism to deter malicious attackers. This costly voting mechanism is a requirement to

    solve a computational puzzle and thereby supply a proof of work. In Bitcoin, the criterion

    for a solution to the computational puzzle is a condition on the form of the seemingly random

    output of a one-way hash function. A one-way hash function cannot feasibly be inverted, and

    provides a method of mapping known inputs to seemingly random outputs. Since the hash function

    is a one-way function, there is no known more efficient way of solving the problem than brute

    force check and guess.6 Every attempt at the puzzle has an independent probability of

    being the valid solution to the puzzle. Once found, the solution is publicly verifiable by

    6A hash function takes an arbitrary block of data and returns a fixed-size bit string which is a specified number of characters long. Any change in the input data provided to the hash function will change the resulting bit string. The hash of random data is essentially a random number. If the highest possible value that it can take on is Y, then the probability that a random hash has a value less than X is just X/Y.

    Bitcoin borrowed its computational puzzle and proof of work from an earlier invention called Hashcash (Back 2002). The proof of work involves scanning for a value that, when hashed with an algorithm such as SHA-256 (used in Bitcoin), yields a hash that begins with a number of zero bits. The average work required is exponential in the number of zero bits required and can be verified by executing a single hash.


  • a simple computation. Similar approaches have been used in other initially anonymous

    environments (Aspnes et al. 2005).
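
    The check and guess procedure can be sketched in a few lines of Python. This is a simplified illustration rather than the Bitcoin implementation: it hashes arbitrary bytes once with SHA-256, the function names and the 8-byte nonce encoding are assumptions made for the example, and the difficulty is kept very low so that the search finishes quickly.

```python
import hashlib


def hash_value(data: bytes, nonce: int) -> int:
    """Interpret SHA-256(data || nonce) as a 256-bit integer."""
    digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def meets_target(data: bytes, nonce: int, zero_bits: int) -> bool:
    """Verification costs a single hash: the value must start with zero_bits zero bits."""
    return hash_value(data, nonce) < (1 << (256 - zero_bits))


def find_nonce(data: bytes, zero_bits: int) -> int:
    """Brute-force search: try nonces in turn until one meets the target."""
    nonce = 0
    while not meets_target(data, nonce, zero_bits):
        nonce += 1
    return nonce


if __name__ == "__main__":
    nonce = find_nonce(b"example block", 12)
    print(nonce, meets_target(b"example block", nonce, 12))
```

    Each attempt succeeds with probability 2^-zero_bits, which is the X/Y ratio of footnote 6 with X = 2^(256 - zero_bits) and Y = 2^256, so the expected number of attempts doubles with every additional zero bit while verification remains one hash.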

    The costly voting system also is designed to lengthen the time between states of the global

    ledger. The computational difficulty of the problem is adjusted so that the state of the

    ledger is only updated on average every 10 minutes. The state of the ledger is updated

    by transactions being added to the history of all previous transactions. Taken as a full

    serialised history, the network achieves consensus on the current state of ownership of

    all the bitcoins in circulation. Since the state of the network is only updated on average

    every 10 minutes, individual transactions are put into batches which are processed by

    validators. These batches are known as blocks. Validators, usually referred to as miners,

    receive transactions, verify the digital signatures and attempt to alter the state of the

    network by adding these transactions to the existing history. Miners alter the state of

    the global ledger by publishing a block to the network which requires a proof of work.

    The publication of a new block is often referred to as finding a block due to the check

    and guess methodology of the proof of work.
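
    The adjustment of the puzzle difficulty can be sketched as a simple proportional rule. The sketch below follows the description in footnote 7 (a reassessment every two weeks aiming at one block every ten minutes); it is an assumption-laden simplification and not the exact rule used by the Bitcoin software, and the function name retarget is hypothetical.

```python
TARGET_SPACING = 600          # desired seconds between blocks (ten minutes)
WINDOW = 14 * 24 * 60 * 60    # reassessment window in seconds (two weeks)


def retarget(difficulty: float, blocks_found: int, window: int = WINDOW) -> float:
    """Scale difficulty by the ratio of blocks actually found to blocks expected."""
    expected_blocks = window / TARGET_SPACING   # 2016 for a two-week window
    return difficulty * blocks_found / expected_blocks


if __name__ == "__main__":
    # Twice as many blocks as expected: the puzzle becomes twice as hard.
    print(retarget(1000.0, 4032))
```

    Under this rule, an increase in computing power raises the block rate only until the next reassessment, after which the average interval returns towards ten minutes.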

    New blocks are added to all previous blocks forming a linear chain of blocks (figure 1).

    Each history is linked to the previous to create a fully serialised log of transactions. If it

    were possible to trust an authority on the network to timestamp transactions when they

    occurred, there would be no need to have this system of sequential histories. Imagine

    a scenario in which Alice is making a transaction with Bob for a coffee: could we trust

    Alice to report the truthful time of that transaction? Alice could simply lie and say that

    she spent the coins used to pay Bob yesterday and Bob would be out of pocket. The

    logic follows that if we did not rely on Alice but rather on some timestamp authority, it

    would fall prey to manipulation. Indeed removing the reliance on a central timestamp

    authority could be seen as one of Bitcoin's attributes. Having removed this single point of

    failure, Bitcoin does not place trust in a timestamp facility; instead, it creates a serialised list of


  • transactions through an order of blocks. The creation of each new block requires using

    information from a specific previous block, usually referred to as a parent in the chain,

    in order to achieve this goal.

    [Figure 1: The Blockchain. The main chain runs from the genesis block A0 through A1, A2 and A3; B2 is an orphaned block.]

    The linear chain of histories is defined by the current length of the chain, referred to as

    the blockchain height. When a block is added, the blockchain height increases by one.

    Nodes always consider the longest chain to be the correct one and will work on extending

    it. Figure 1 shows what happens if there are two conflicting blocks published with the

    same parent. Block A1, with parent A0, has two children A2 and B2. This event is

    known as a collision as only one block can eventually be included on the main chain.

    As we will see, collisions can only occur when two miners find blocks in quick succession.

    Some nodes in the network will hear about block A2 first and some will hear about block

    B2. This creates a partition in the network as some miners will be looking for block

    A3 using A2 as a parent and some will be using B2 as the parent to find B3. During

    this time there is no consensus on which chain is the correct chain. The correct chain

    is decided when one of the two branches publishes a block. Once a new block has been

    found by the network (A3 in figure 1), the nodes that were working on finding B3 will

    now switch to mining on top of A3 since it is now the longest chain. The rate at which

    new blocks are found is determined by the amount of computing power attempting the

    cryptographic puzzle. So if there is more computing power on branch A then it is more

    likely that A3 will be found before B3. The longest chain is often referred to as the main


  • chain.
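
    The fork in figure 1 can be reproduced with a small Python sketch. It is illustrative only: the Block class, add and best_tip are hypothetical names, proof of work is omitted, and ties between branches of equal height are broken in favour of the block a node heard about first, which is how the partition described above arises.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Block:
    name: str
    parent: Optional[str]  # hash of the parent block, None for the genesis block

    @property
    def hash(self) -> str:
        # The parent's hash is part of the hashed data, which links the histories.
        return hashlib.sha256(f"{self.parent}:{self.name}".encode()).hexdigest()


def add(blocks: Dict[str, Block], block: Block) -> None:
    """Record a block in the order in which this node hears about it."""
    blocks[block.hash] = block


def height(block: Block, blocks: Dict[str, Block]) -> int:
    """Number of links back to the genesis block."""
    h = 0
    while block.parent is not None:
        block = blocks[block.parent]
        h += 1
    return h


def best_tip(blocks: Dict[str, Block]) -> Block:
    """Longest-chain rule; max() keeps the first-heard block when heights tie."""
    return max(blocks.values(), key=lambda b: height(b, blocks))


if __name__ == "__main__":
    a0 = Block("A0", None)
    a1 = Block("A1", a0.hash)
    a2, b2 = Block("A2", a1.hash), Block("B2", a1.hash)
    a3 = Block("A3", a2.hash)
    node = {}
    for blk in (a0, a1, b2, a2):      # this node hears B2 before A2
        add(node, blk)
    print(best_tip(node).name)        # B2: the node mines on the B branch
    add(node, a3)
    print(best_tip(node).name)        # A3: the node switches to the longer chain
```

    A node that hears B2 first works on B3 until A3 arrives, at which point the A branch is strictly longer and B2 is left as an orphaned block.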

    The rule of the longest chain being considered as the correct history serves as the pro-

    vision of security on the network. Once a block has been added to the longest chain,

    it cannot be removed or altered without re-doing the proof of work. To see why this

    is the case, consider a miner that publishes block A2 and then tries to rewrite history.

    In order to do so, they would have to find another valid solution to the computational

    problem and then supply it to other miners. If the miner managed to find a Block B2

    after they published A2 they would need to find B3 faster than the rest of the network

    finds A3. As long as at least 50% of the miners are working on the A branch it will

    grow at a faster rate than the B branch and the network will treat the A branch as the

    main chain. Through this method, Bitcoin is based on a novel solution to the Byzantine

    generals problem, requiring that an adversary needs to capture more than 50% of the

    computing power to disrupt the protocol (for a formal proof see Miller & LaViola Jr

    (2014); for numerical simulations see Rosenfeld (2012)). An attack where an adversary

    captures more than half the network is known as a 51% attack.
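
    The role of the 50% threshold can be illustrated with the gambler's ruin result given in Nakamoto (2008): if an attacker controls a share q of the computing power and the rest of the network a share p = 1 - q, the probability that the attacker ever catches up from z blocks behind is (q/p)^z when q < p and 1 otherwise. The Python sketch below only evaluates this expression; the function name is hypothetical, and it is not the fuller double-spend analysis of Rosenfeld (2012).

```python
def catch_up_probability(q: float, z: int) -> float:
    """Probability that an attacker with hash share q ever erases a deficit of z blocks."""
    if not 0.0 <= q <= 1.0 or z < 0:
        raise ValueError("q must lie in [0, 1] and z must be non-negative")
    p = 1.0 - q
    if q >= p:
        return 1.0           # with half or more of the power the attacker eventually catches up
    return (q / p) ** z      # otherwise the chance decays geometrically in z


if __name__ == "__main__":
    for share in (0.1, 0.3, 0.45):
        print(share, [round(catch_up_probability(share, z), 4) for z in (1, 3, 6)])
```

    The geometric decay in z is why each additional block built on top of a transaction makes rewriting it less likely, provided the attacker stays below half of the computing power.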

    The more computing power being contributed to the network, the more expensive it

    would be to pull off a 51% attack. In this dimension, the security of the ledger is

    determined by the aggregate level of computing power on the network. Providing computing

    power to the network is costly, and therefore economic incentives are provided in the

    form of a lottery of freshly minted bitcoins. This has ensured that the amount of com-

    puting power has grown at a substantial rate making it expensive to acquire a majority

    share of computing power on the network.


  • [Figure 2: Distribution of bitcoins over time. Vertical axis: number of bitcoins, 0 to 25,000,000; horizontal axis: year, 2009 to 2039.]

    2.2 Evolution of incentives

    Although the process of validation is completely automated, the cost of solving the

    cryptographic puzzle is substantial.7 Miners are incentivised by the issuance of new

    bitcoins and transaction fees. The first Bitcoin block, known as the genesis block, was

    mined on 3rd January 2009. Each block for the first 210,000 blocks generated 50 new

    bitcoins in a transfer to the miner that found the block. The generation of new bitcoins

    halves every 210,000 blocks which is approximately every four years. After 34 reductions

    in the reward, the number of bitcoins converges to 21 million and no new bitcoins will

    be created (figure 2).
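
    The convergence to 21 million follows from the geometric series 210,000 x 50 x (1 + 1/2 + 1/4 + ...) = 21,000,000. The Python sketch below reproduces the schedule using integer arithmetic. The subdivision of one bitcoin into 10^8 smallest units, with the reward rounded down at each halving, is an assumption not stated in the text above; the function names are hypothetical.

```python
BLOCKS_PER_ERA = 210_000
UNITS_PER_BITCOIN = 10**8                    # assumption: smallest unit is 1e-8 bitcoin
INITIAL_REWARD = 50 * UNITS_PER_BITCOIN      # 50 bitcoins per block in the first era


def block_reward(block_height: int) -> int:
    """Reward in smallest units: halved (rounding down) every 210,000 blocks."""
    return INITIAL_REWARD >> (block_height // BLOCKS_PER_ERA)


def total_supply() -> int:
    """Sum the rewards of every era until the reward rounds down to zero."""
    total, era = 0, 0
    while (INITIAL_REWARD >> era) > 0:
        total += BLOCKS_PER_ERA * (INITIAL_REWARD >> era)
        era += 1
    return total


if __name__ == "__main__":
    print(total_supply() / UNITS_PER_BITCOIN)   # slightly below 21 million
```

    With rounding down, the total falls marginally short of 21 million, which acts as an upper bound on the number of bitcoins.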

    The bootstrapping phase of Bitcoin is defined by the importance of the block reward and

    the distribution of the currency between participants. Initially, the Bitcoin network could

    not generate enough proof of work to secure the network with transaction fees as there

    7The cost of solving the cryptographic puzzle changes over time as more computing power is being contributed to the network. The difficulty of the problem dynamically adjusts so that a block is found by the entire network every 10 minutes. Every two weeks the network assesses how many blocks the network has produced in the last two weeks and sets the difficulty level to ensure that approximately one block will be solved every ten minutes. If the cost of solving the cryptographic puzzle were trivial then it would not serve its purpose of making it expensive to take over the network.


  • were very few transactions.8 The protocol also used proof of work in order to distribute

    the coins with no centralised authority, as it creates a lottery for newly created coins in

    the form of block rewards. The users therefore do not directly contribute to the costs

    of running the network but pay for it through inflation. As the block rewards decrease,

    the users are required to pay to compensate the miners for their investment, electricity

    and maintenance costs. In this way Bitcoin will have to move from a model where

    security is paid for through the issuance of more bitcoins, independent of the number of

    transactions, to a system where transactions have to cover the cost of security.9

    The bootstrapping phase was successful in meeting this goal with thousands of partic-

    ipants worldwide and creating the largest distributed computing network in existence.

    The pace of hardware innovation has also been rapid with Application-Specific Integrated

    Circuits (ASICs) coming to the market in 2013 (Taylor 2013). These machines are only

    capable of doing the necessary calculations to find a valid solution to the cryptographic

    puzzle set by the Bitcoin network and are optimised to do so. As the ecosystem expands

    and more people's money is at stake, it is important that the incentives keeping miners

    honest and the security of the network intact are well understood.

    There has been some literature covering the incentives of protecting Bitcoin against

    adversaries. In particular, Meni Rosenfeld (2012) studied the probabilities that an ad-

    versary would manage to overtake the main chain with less than 51% of the computing

    power. If an adversary were able to overtake the main chain, this would give them an

    ability to double spend coins at a merchant as they can effectively re-write their own

    spending history. Kroll et al. (2013) look at the incentives of a 51% attack on the net-

    work from an outside perspective, where an adversary may be looking to profit outside

    8At the beginning of the network in 2009, most of the blocks had zero transactions in them but miners were still rewarded 50 bitcoins for every block they found. In total over the network history there have been 84469 blocks with no transactions (Blockr 2014).

    9For a model looking at the viability of an equilibrium where transaction fees cover the cost of running the network see (Houy 2014b).


  • of the Bitcoin network by destroying it rather than by double spending inside it.

    The analysis that follows is predicated on the empirical data of information propagation

    in the Bitcoin network. As outlined above, one of the distinguishing features of the

    Bitcoin network is its decentralised design. Geographic dispersion and network latency

    are symptomatic of such a network. This combined with the requirement for each node

    to validate every transaction forms the economic incentives which are the focus of this

    study.

    2.3 Information Propagation

    Bitcoin is a peer to peer (P2P) protocol where volunteers collectively implement a set

    of procedures and rules to achieve unanimous consensus on the state of ownership of all

    existing bitcoins. The protocol is implemented in a network where every full node is

    homogeneous.10 Each full node independently verifies the validity of all the transactions,

    as a result of maintaining the assumption of minimal trust between nodes. Nodes spread

    information on transactions and blocks in order to update the network consensus. There

    are two main types of messages that are relayed in the network: transaction and block

    messages. In order to avoid sending transaction and block messages to nodes that have

    already received them from other nodes, they are not forwarded directly. A smaller inv

    message is broadcast when a transaction or block has been validated. This message is

    an invitation to other nodes announcing the presence of new information. It contains

    unique identifiers of transactions and blocks rather than all the information contained

    in the block or transactions. If the node receiving an inv message does not have the

    transaction or block in question, then it replies with a getdata message to the sender

    of the inv message. The sender then sends the information pertaining to the block or

    10There are now different types of lightweight nodes on the Bitcoin network with Simple Payment Verification (SPV) clients but these are ignored in the analysis, since these follow heuristics and are not validators of transactions by default.


  • transaction to the peer that requested that information. The reason this occurs is to

    reduce the data load on the network.11 The method used to spread information on the

    Bitcoin network resembles a push model of rumour spreading.12
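
    The inv and getdata exchange can be sketched as a toy simulation in Python. The Node class and its methods are hypothetical and only mimic the message flow described above: messages are delivered instantly through direct method calls, validation is omitted, and no real network code or wire format is involved.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.store = {}   # item id -> full payload this node already holds
        self.log = []     # (message type, counterpart, item id), for illustration

    def connect(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def publish(self, item_id, payload):
        """Originate a transaction or block and announce it to all peers."""
        self.store[item_id] = payload
        self.announce(item_id)

    def announce(self, item_id):
        # Only the small identifier is pushed, not the payload itself.
        for peer in self.peers:
            peer.on_inv(self, item_id)

    def on_inv(self, sender, item_id):
        self.log.append(("inv", sender.name, item_id))
        if item_id not in self.store:          # request only unknown items
            sender.on_getdata(self, item_id)

    def on_getdata(self, requester, item_id):
        self.log.append(("getdata", requester.name, item_id))
        requester.on_data(self, item_id, self.store[item_id])

    def on_data(self, sender, item_id, payload):
        self.log.append(("data", sender.name, item_id))
        self.store[item_id] = payload          # validation would happen here
        self.announce(item_id)                 # relay the announcement onwards


if __name__ == "__main__":
    a, b, c = Node("a"), Node("b"), Node("c")
    a.connect(b); b.connect(c); c.connect(a)
    a.publish("tx1", b"payload")
    transfers = sum(1 for n in (a, b, c) for m in n.log if m[0] == "data")
    print(transfers)   # each of the two other nodes receives the payload once
```

    Although every node hears several inv announcements for the same item, the full payload crosses the network only once per receiving node, which is the data saving the protocol is designed to achieve.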

    A node broadcasts transactions by sending inv messages to the nodes it is directly

    connected to. Nodes that receive information about the transactions ensure that they

    obey the protocol and then relay them to their peers. When information is transmitted

    between nodes there is a propagation delay. This is comprised mainly of the time that it

    takes to verify that a transaction is valid, but also the time to establish what data needs

    to be transferred and the delivery of the data. Given a fixed number of participants and

    a random network structure, eventually the whole network will become aware of that

    transaction occurring.13
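
    How quickly an announcement spreads can be illustrated with a toy simulation in Python. The sketch makes several assumptions that are not in the text: every hop takes one round (identical propagation delays), each informed node pushes to all of its neighbours, and a ring link is added between consecutive nodes purely to guarantee that the random graph is connected. The function names are hypothetical.

```python
import random


def build_network(n, extra_links, seed):
    """Random undirected graph: one ring link per node plus extra random links."""
    rng = random.Random(seed)
    peers = {i: set() for i in range(n)}
    for i in range(n):
        ring_neighbour = (i + 1) % n           # guarantees connectivity (sketch assumption)
        peers[i].add(ring_neighbour)
        peers[ring_neighbour].add(i)
        for j in rng.sample(range(n), extra_links):
            if j != i:
                peers[i].add(j)
                peers[j].add(i)
    return peers


def rounds_to_inform_all(peers, origin=0):
    """Count rounds until every node has heard the announcement."""
    informed = {origin}
    frontier = {origin}
    rounds = 0
    while len(informed) < len(peers):
        newly_informed = set()
        for node in frontier:
            for peer in peers[node]:
                if peer not in informed:
                    newly_informed.add(peer)
        informed |= newly_informed
        frontier = newly_informed
        rounds += 1
    return rounds


if __name__ == "__main__":
    network = build_network(n=200, extra_links=8, seed=1)
    print(rounds_to_inform_all(network))
```

    On a pure ring of ten nodes the announcement needs five rounds to reach the farthest node; adding random links shortens the paths, which is the intuition behind fast propagation on a random graph.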

    When a new node joins the network, it queries some DNS servers that keep track of the

    active nodes on the network.14 The DNS servers feed back a random list of nodes that the

    entering node can link to and receive an update on the state of the network. The result is

    that the Bitcoin network is a random graph (Decker & Wattenhofer 2013). There is also

    no formal way of leaving the network. The addresses of the nodes that were connected to

    the network are stored by their neighbours until 90 minutes of inactivity has passed and

    subsequently other nodes purge them from the address set. Although there are different

    ways of connecting to the network, from the empirical analysis, custom implementations

    are rare and most nodes run the software supplied by the core development team. The

    random graph forms the basis for our assumptions of homogeneity in the model.

    11More detail on the way information is transmitted can be found in Decker & Wattenhofer (2013).

    12There is also an ability for nodes to send information requests without receiving inv messages first but this is uncommon.

    13There are proven results to show that random rumour spreading will reach the whole network within a given amount of time with a probability arbitrarily close to 1. See Nekovee et al. (2007).

    14New nodes join and leave the network constantly as people connect and disconnect to the network.


  • 2.4 Discarded information

    Network latency causes information to be discarded. Bitcoin nodes do not have a full

    view of the current state of information in the network. This is captured by a single state

    variable, i.e. the height of the blockchain, h, which takes on integer values starting at

    number one, usually referred to as the genesis block. When a block is added to the chain,

    the blockchain height increases by one. When miner i solves a block, A1, the information

    about the latest block in the chain is different for miner i than for any other miner, creating

    a partition. Partitions on the network are defined as sets of nodes that possess different

    beliefs about the current state of the network. The miners that find out about A1 will join

    miner i's partition of the network. All the other nodes that have not heard about A1 are

    now described as being in the uninformed partition, u. Miner i's partition attempts to find

    a block using A1 as the parent, whereas the partition that thinks that the blockchain

    height is still h is trying to create a block B1. If the miners in i's partition hear about

    the creation of a conflicting block by the uninformed partition, they will automatically reject

    it as they already have information about block A1 and they will not relay block B1 to

    their neighbours.

    Conflicting blocks cannot coexist in the chain of blocks as only one block at each height

    is allowed to join the main chain. Since there is no trusted central player, the choice

    between A1 and B1 is decided on which partition of the network finds the next block.

    If A2 is the next block to be found on the network then information about its existence

    starts to spread. When miners in the uninformed partition find out about A2 they

    immediately stop working on a conflicting problem since the blockchain height has now

    increased. This decreases the probability that a block that builds on B1 will ever be

    found since fewer miners are working on that problem. The race is therefore decided

    probabilistically. Due to the proof of work required to publish blocks to the network

    the probability of a perpetual race quickly falls to 0. This is always the case so long


  • as information propagation for a block is shorter than the expected time of 10 minutes

    between blocks.15

    Work done on blocks that are not included in the main chain is not rewarded since

    any transactions in that block do not exist in the accepted serialised history of all

    transactions. When a valid block loses the race to another valid block, the block that is

    discarded is called an orphan block.
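
    A back-of-the-envelope calculation shows why propagation delay matters for orphan blocks. The Python sketch below assumes, as a simplification that is not the model developed in this thesis, that blocks are found according to a Poisson process with a mean interval of 600 seconds, and that a fixed share of the computing power remains uninformed for the whole delay. The function name and parameters are hypothetical.

```python
import math


def conflict_probability(delay_seconds: float,
                         mean_interval: float = 600.0,
                         uninformed_share: float = 1.0) -> float:
    """Chance that uninformed miners find a conflicting block during the delay."""
    if delay_seconds < 0 or not 0.0 <= uninformed_share <= 1.0:
        raise ValueError("delay must be non-negative and share must lie in [0, 1]")
    rate = uninformed_share / mean_interval      # conflicting blocks per second
    return 1.0 - math.exp(-rate * delay_seconds)


if __name__ == "__main__":
    for delay in (1, 5, 10, 30):
        print(delay, round(conflict_probability(delay), 4))
```

    Under these assumptions the probability of a conflict rises almost linearly with the delay when the delay is short relative to ten minutes, so anything that slows the propagation of a block raises the chance that it ends up orphaned.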

    3 Empirical section

    3.1 The Method

    Following a methodology similar to that employed by Decker & Wattenhofer (2013), I obtained

    more recent information propagation data which informed the formal modelling. Bitcoin,

    as a decentralised network, has no central repository of data. In order to gather

    information about propagation times, one has to run a client that connects to the network.

    When connecting to the network, this client should connect to as high a proportion

    of the network as possible in order to capture information that extends beyond a local

    geography. The default setting for the software client is to connect to 8 nodes. A custom

    implementation of the client must be run in order to connect to a large portion of the

    network. Nadi Sarrar, a post-doctoral fellow at TU Berlin, built a custom version of the

    Satoshi Client to achieve this goal. During the observation period, we were able to

    connect to over 90% of the nodes that were active on the network.16 There were on average

    7000 nodes connected to the custom client and a total of 9,160 unique peers.

    15Miller & LaViola Jr (2014) prove that, in the face of an adversary, Bitcoin comes to unanimous consensus as long as at least 50% of the computing power is on the honest chain, which provides a proof under much more stringent conditions. In the adversarial case, the partition of the miners is fixed. The adversary will never switch onto mining on the main chain, since its objective is to overtake its length. In the case of a fork due to a collision of A1 and B1, miners switch to taking the chain containing A2 as the valid chain in the event that a new block, A2, is published. This is due to the rule that the longest chain is the correct chain. Hence the probability of a fork that is behind the main chain overtaking the main chain continuously falls. The partition with the greater share of computing power finds more blocks, extending its lead, which results in more miners from the uninformed partition joining the main chain until unanimous consensus is achieved.

    Once connected to the network, the client collects invitations from its neighbours to hear

    about new blocks or transactions. Our implementation takes a fly-on-the-wall approach.

    The custom client does not respond with the usual request to then get the data and

    does not relay any invitations to other nodes as it does not actually process blocks or

    transactions. The client is implemented as a passive observer to minimise the impact

    that the measurement node has on the performance of the network. This also has the

    added benefit of not storing any history of transactions and increases the performance

    of the client. Another consideration when implementing a custom client is complying

    with the network rules, since violations can result in being blacklisted by other nodes. Examples

    include: sending an invalid transaction to a peer, sending transactions with duplicate

    inputs, sending a transaction with null output, sending a filterload request of more

    than 36000 bytes or 50 hash functions, or requesting to download more than 5000 blocks.

    Most of these behaviours are avoided when not relaying data.

    The client keeps a record of all the inv messages from its neighbours. It timestamps

    all of these with a local timestamp to ensure consistency.17 The client keeps the hash

    of the transaction or block as its unique identifier. This allows us to plot the spread

    of information of a particular block or transaction (figure 11). The client also captures

    the height of the blockchain and the size of each block to detect any collisions on the

    network and to compare propagation times across block sizes.
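
    The record keeping described above can be sketched as follows. This is a minimal illustration rather than the custom client itself: the class name InvLog, its methods and the sample observations are hypothetical, and a real client would feed in inv messages from live peers instead of a hard-coded list.

```python
from collections import defaultdict


class InvLog:
    """Passive log of inv announcements, keyed by block or transaction hash."""

    def __init__(self):
        # item hash -> {peer id: local time of that peer's first announcement}
        self.first_seen = defaultdict(dict)

    def record(self, item_hash, peer, local_time):
        """Store the first announcement of item_hash by peer, stamped locally."""
        self.first_seen[item_hash].setdefault(peer, local_time)

    def time_to_reach(self, item_hash, fraction, n_peers):
        """Seconds from the first announcement until `fraction` of n_peers announced."""
        times = sorted(self.first_seen[item_hash].values())
        needed = max(1, round(fraction * n_peers))
        if len(times) < needed:
            return None  # the item never reached that share of peers
        return times[needed - 1] - times[0]


# Hypothetical observations: four peers announce the same block hash.
log = InvLog()
for peer, t in [("a", 100.0), ("b", 101.5), ("c", 104.0), ("d", 130.0)]:
    log.record("blockhash", peer, t)
log.record("blockhash", "b", 150.0)  # a repeat announcement is ignored

median_delay = log.time_to_reach("blockhash", 0.5, n_peers=4)
```

    Keying on the hash and keeping only the first local timestamp per peer is what makes it possible to trace the spread of a single block or transaction across the connected nodes.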

    16This used a combination of getaddr messages, asking neighbours for the addresses of the nodes they are connected to, and simply accepting incoming connection requests.

    17There are timestamps embedded in the blocks on the Bitcoin network but these cannot be trusted as they can be very inaccurate. Some blocks published have dates set in the future. This could be the result of strategic behaviour as miners would ideally not like to change the timestamps in their blocks prior to publishing and there is a set criterion that blocks that are more than 2 hours in the past will be rejected by the network.

    The amount of data collected per block or transaction is very high since for every given

    transaction the client has approximately 7000 observations, one invitation for each of

    the connected nodes. The data requirements and storage limited the scope for a large

    observation period. The modified client was tested on 19th April 2014 and the full observation

    period chosen was 9 hours on 5th May 2014.

    The data from the modified client was annotated with two other datasets. Using MaxMind's

    GeoLite City database for IP geolocation, each node and the inv messages received

    from that node were encoded with the country of origin. Poese et al. (2011)

    found that country level data in the dataset is accurate. Using Organ of Corti's (2014)

    database, The Block-Spotter's Guide, blocks were identified by the mining pool which

    found them. Mining pools are groups of bitcoin miners who share revenues between them

    to smooth out their returns. Many of the pools sign the first transaction in a block that

    they find to credibly announce that they have found a block. Miners can then calculate

    their expected revenues and audit the pool operators.

    3.2 The Data

    During the observation period 49 blocks were propagated on the Bitcoin network. There

    was a wide range of the number of transactions included in each block. The smallest

    size block that was propagated contained 20 transactions, the largest, 871 transactions.

    The average time to reach 50% of the network was 5.4 seconds and it took on average

    24 seconds to reach 95% of nodes. There were significant outliers and anomalies in

    the raw data. Some of the invitations received from nodes were for blocks from

    the distant past; these were removed from the data. As a result of nodes joining

    and leaving the network over the observation period, propagation times to the last 1%

    of nodes displayed highly non-linear trends and were therefore removed from the data.

    Furthermore the data displayed a bi-modal structure with many inv requests coming

    in the tail of the distribution. Nodes which had more than one inv message over 200

    seconds since the first propagation were removed from the dataset. This removed the

    bimodal structure and left a smoother distribution. The propagation of the

    49 blocks is charted in figure 11. This figure shows that the number of transactions included

    varied between pools, but all pools display a similar trend: blocks with larger numbers

    of transactions propagate more slowly through the network.

    Table 1: Summary Statistics of Block Propagation

              No. of Txs   Mean Size (KB)   Mean Latency   Sd      25%     50%     75%     90%     95%
    Average   368          229.66           11.61          8.61    4.15    5.44    8.20    14.74   24.33
    Min       20           7.48             2.18           0.97    1.20    1.49    1.92    2.87    4.12
    Max       871          499.24           25.25          30.70   14.39   16.94   23.79   51.60   122.01

    Note: Mean Latency and the other summary statistics are measured in seconds

    Using a linear model we obtain similar results to Decker & Wattenhofer (2013) who

    showed that there is a strong correlation between block size and propagation time. For

    any given percentile there is a stable linear relationship between propagation times and

    block size, as evidenced in table 2.

    Table 2: Linear models of seconds to reach percentiles of the network and block size (kb)

            P10        P25        P50        P75        P95
    Size    0.011***   0.013***   0.017***   0.029***   0.141***
            (0.001)    (0.001)    (0.001)    (0.002)    (0.014)
    Cons    0.944**    0.034***   1.567***   1.663***   -4.557
            (0.372)    (0.05)     (0.04)     (0.447)    (3.754)
    N       49         49         49         49         49
    R2      0.55       0.64       0.74       0.86       0.67

    Standard errors in parentheses, ***p < 0.01, **p < 0.05, *p < 0.1
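
    To make the fitted lines concrete, the sketch below evaluates some of the Table 2 models at a given block size. The dictionary and function names are illustrative only; the coefficients are copied from the table, and only a subset of the columns is included.

```python
# (intercept, slope) pairs copied from Table 2; slope is seconds per kilobyte.
TABLE_2 = {
    "P10": (0.944, 0.011),
    "P50": (1.567, 0.017),
    "P75": (1.663, 0.029),
    "P95": (-4.557, 0.141),
}


def seconds_to_reach(percentile, size_kb):
    """Predicted seconds for a block of size_kb to reach the given percentile."""
    intercept, slope = TABLE_2[percentile]
    return intercept + slope * size_kb


# The average block in Table 1 is 229.66 KB.
predicted_median = seconds_to_reach("P50", 229.66)
predicted_p95 = seconds_to_reach("P95", 229.66)
```

    For the average block the P50 line gives roughly 5.5 seconds, close to the 5.44 second average reported in Table 1.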


  • These stable linear relationships can be used to derive a distribution for the spread

    of information on the Bitcoin network. Using distribution fitting packages in R, a

    log-normal distribution provides the best fit among the classes of distributions considered.

    The other classes considered were the Pareto, Gamma, Weibull and Exponential distributions.

    The parameterisation of the model is therefore specific to the observed period. If the network

    topology changes or another distribution is found to provide a better fit, this could be

    used in its place. Figure 3 shows the regression lines estimated in the linear model vs the

    parameterised lognormal distribution.
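
    The fitting itself was done with R packages. As a rough illustration of what a maximum likelihood log-normal fit involves, the following Python sketch uses the closed form estimators (the mean and the population standard deviation of the logged delays). The function name and the sample delays are hypothetical and are not the data collected for this study.

```python
import math
from statistics import fmean, pstdev


def fit_lognormal(delays):
    """Maximum likelihood estimates (mu, sigma) for log-normal delays."""
    logs = [math.log(d) for d in delays]
    # pstdev divides by n, which is the maximum likelihood estimator.
    return fmean(logs), pstdev(logs)


# Hypothetical propagation delays in seconds, not the thesis data.
delays = [1.2, 2.5, 3.1, 4.8, 9.7, 21.0]
mu_hat, sigma_hat = fit_lognormal(delays)
fitted_median = math.exp(mu_hat)  # the median of a log-normal is exp(mu)
```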

    Figure 3: Log Normal fit against linear regression and empirical data (panel title: "Fitted log normal model, linear scale"; x-axis: number of transactions in block; y-axis: latency in seconds; series: empirical data and lognormal model for 25%, 50% and 75% of the network informed)


  • 3.3 Baseline Case

    Bitcoin, still in its bootstrapping phase, has an artificially imposed limit on the size

    of each block that is published on the Bitcoin network. This effectively also limits the

    number of transactions that can be processed per second. To run a full Bitcoin node and

    verify transactions, each node has to keep a history of all previous transactions to verify

    that any new transaction has inputs that have not already been spent. The limit of a

    one megabyte block was imposed to prevent the history of all bitcoin transactions from

    becoming too big, too quickly. There has been much discussion on whether to lift the

    artificial limit to allow for more transactions (Bitcointalk 2014).

    The current limit of 1 MB is equivalent to allowing 7 transactions a second at an average

    size. In the context of world payment systems this is very small. Visa and Mastercard do

    approximately 10,000 transactions per second. If Bitcoin is going to fulfil its role as the

    payment rails for many small transactions then it has to be tested in larger environments.

    The baseline case in the model will allow miners to include up to 10,000 transactions

    in each block, or approximately 16 transactions per second. While this is still far from

    other mainstream networks, this would represent a tenfold increase in the number

    of Bitcoin transactions currently occurring on the network (Coinometrics 2014).
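
    The throughput figures above can be checked with a few lines of arithmetic. The sketch assumes the 10 minute expected block interval and reads 1 MB as 1,000,000 bytes; the average transaction size it backs out is only what the 7 transactions per second figure implies, not a measured value.

```python
BLOCK_INTERVAL_SECONDS = 600   # 10 minute expected time between blocks
MAX_BLOCK_BYTES = 1_000_000    # the 1 MB limit, assumed to mean 10^6 bytes


def transactions_per_second(transactions_per_block):
    """Average throughput if every block carries this many transactions."""
    return transactions_per_block / BLOCK_INTERVAL_SECONDS


baseline_tps = transactions_per_second(10_000)

# Average transaction size implied by "1 MB is roughly 7 transactions a second".
implied_tx_bytes = MAX_BLOCK_BYTES / (7 * BLOCK_INTERVAL_SECONDS)
```

    The baseline works out to between 16 and 17 transactions per second, and the 1 MB figure implies an average transaction of a little under 240 bytes.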

    4 A model of miners' incentives arising from network

    latency

    Miners including more transactions in their blocks raise revenues but also lower the

    probability of winning. This section will study the conditions under which there is

    a bad Nash equilibrium where no miner includes any transactions in their blocks. For

    the bad equilibrium to occur miner i should have no incentive to include transactions


  • given that all other miners include 0 transactions. The incentive to include transactions

    is determined by the tradeoff between the probability of winning and the transaction fee

    revenue. We will call this equilibrium strategy nonce only mining.18

    The transaction fee revenue is fixed and therefore the main challenge is to compute the

    probability of winning under the conditions where all other miners include no transac-

    tions. In order for Bitcoin to function as an incentive compatible payment system, we

    require that miners process at least some transactions. Informed by the empirical sec-

    tion, the model shows the conditions under which the bad equilibrium of miners having

    no incentive to include transactions is avoided.

    4.1 Literature Review

    Latency introduces a new domain of strategic interaction amongst miners. Including

    more transactions in a block means increasing its size and therefore the time that it

    takes to propagate.19 Motivated by the effect of information propagation, Nicolas Houy

    (2014a) and Gavin Andresen (2013) estimate the marginal cost of including a transaction

    into a block. Andresen's estimate is a back of the envelope calculation. He uses the

    average time for a block to reach 50% of the network from Decker & Wattenhofer (2013)

    times the probability that the network finds a block in a given second, multiplied by

    the forgone block reward. Houy (2014a) uses a game theoretic approach similar to that of

    Lee & Wilde (1980) on innovation races. Under current conditions, Houy finds an

    equilibrium in which all miners include no transactions in their blocks. Both studies,

    while motivated by information propagation, do not provide convincing accounts of how

    it actually manifests on the Bitcoin network.

    18Nonce only refers to just searching for a valid nonce, which is the solution to the cryptographic puzzle, rather than including more transactions.

    19Bitcoin allows for different types of transactions which have different sizes but in this model we will assume that all transactions have the same size.


  • In Houy's model all miners are mining on competing blocks and all start at time 0.

    Miners choose the number of transactions included in their block. The first miner to

    announce their block is not necessarily the winner, as it can propagate slowly due to the

    number of transactions contained within it and can thus be beaten by a smaller block

    to the majority of the network. Houy makes an assumption that runs in contradiction

    to the rules of the Bitcoin protocol. Bitcoin stipulates that as a miner you should mine

    on the longest chain. Houy assumes that all miners continue to compete with miner i

    even though they might have heard about i's block. In Andresen's back of the envelope

    calculation the same assumption is employed (Andresen 2013).

    A simple example will be illustrative. If miner i finds a block, A1 and tells miners 1,2

    and 3 at time t, they immediately stop mining a competing block. The uninformed

    miners, 4,5 and 6 will continue mining a competing block, B1 while miners 1,2 and 3

    move on to finding A2 which uses A1 as a parent. When miners 4 and 5 hear about

    the block at time t + dt, they then stop mining at that point, so only miner 6 remains

    mining a competing block. Hence the share of competing mining power is changing

    over time. This means that Houy's approach and Andresen's back of the envelope

    calculation both ignore the dynamics of information propagation that lie at the core

    of the problem. The analysis that follows will demonstrate that this in fact drives the

    equilibrium conditions.
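
    The example can be written out as a short sketch. The equal hashing shares and the arrival times (5 seconds for miners 4 and 5, 12 seconds for miner 6) are hypothetical numbers chosen only to mirror the story above.

```python
from fractions import Fraction


def competing_share(informed_at, hash_share, t):
    """Share of hashing power still mining a competing block at time t."""
    return sum(share for miner, share in hash_share.items()
               if informed_at[miner] > t)


# Hypothetical numbers: six equally sized miners; miners 1-3 hear about A1
# at time 0, miners 4 and 5 at time 5, and miner 6 at time 12 (seconds).
hash_share = {m: Fraction(1, 6) for m in range(1, 7)}
informed_at = {1: 0, 2: 0, 3: 0, 4: 5, 5: 5, 6: 12}

shares = [competing_share(informed_at, hash_share, t) for t in (0, 5, 12)]
```

    The competing share falls from one half to one sixth and then to zero, which is the time variation that a fixed set of competitors cannot capture.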

    4.1.1 Some preliminaries

    Blocks are found by the network according to a Poisson process. Every guess at the

    cryptographic puzzle could produce a valid solution as required by the protocol. Each

    guess has equal probability of being a valid solution and there is no learning taking place

    in finding the solution. Hence the guesses are i.i.d., and a sequence of i.i.d. guesses made

    at a constant rate is well approximated by a Poisson process. Assume there is consensus on the

    network on the current height of the blockchain, h. The random variable $Y = X_{h+1} - X_h$,

    the time difference in seconds between a block being found and its predecessor being

    found, is i.i.d., which yields the following probability:

    $$P[Y < t + 1 \mid Y \geq t] = \frac{1}{600} \qquad (1)$$

    We will denote the rate at which the network finds a block as $\lambda$. Given that the whole

    network agrees on the current height of the blockchain, the expected time between blocks

    is given by $E[Y] = \frac{1}{\lambda}$.

    In the analysis that follows we need to be able to split the Poisson process as there will

    be partitions in the network. Suppose N miners are contributing to the network and each has a

    share of the computing power $\gamma_j$, where $\sum_{j \in N} \gamma_j = 1$. We can think of miners finding

    blocks as splitting the network Poisson process into subprocesses and randomly labelling

    the block found to an individual miner. Since this random selection is independent of

    the process, each miner will find a block at a rate $\lambda_i = \gamma_i \lambda$. Using again the fact that every

    guess at the cryptographic puzzle is i.i.d., each individual miner's process is independent

    and we can sum over N independent Poisson processes to give $\sum_{j \in N} \lambda_j = \lambda$.
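
    A small numerical check of the memoryless property and of the splitting argument is sketched below. It assumes a network rate of one block per 600 seconds; the three miners and their shares are hypothetical.

```python
import math

NETWORK_RATE = 1 / 600  # blocks per second, i.e. one block per 10 minutes


def prob_next_second(rate, t):
    """P[Y < t + 1 | Y >= t] for exponential inter-arrival times."""
    survive_t = math.exp(-rate * t)
    survive_t1 = math.exp(-rate * (t + 1))
    return (survive_t - survive_t1) / survive_t


def miner_rates(shares, network_rate=NETWORK_RATE):
    """Split the network process: a miner with share s finds blocks at rate s * rate."""
    return {miner: s * network_rate for miner, s in shares.items()}


shares = {"a": 0.5, "b": 0.3, "c": 0.2}  # hypothetical shares summing to 1
rates = miner_rates(shares)
```

    The conditional probability does not depend on how long the network has already been searching, and it is close to 1/600 because the rate is small; the individual rates add back up to the network rate.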

    Let G = (V,E) be the Bitcoin network graph with delay diameter D(xi). This means

    that a block that contains xi transactions takes D seconds to fully propagate around

    the network. Decker & Wattenhofer (2013) found that the size of a block, measured

    in kilobytes, has a strong correlation with the amount of time that it takes for miners

    to hear about the block. Following Decker & Wattenhofer (2013) and using their data,

    Sompolinsky & Zohar (2013) find that the size of a block has

    a linear effect on the median time that it takes to hear about a block. My estimates

    using data gathered in May 2014 also confirmed these results (see section 3). For the

    analysis that follows we are interested in the effect of including additional transactions

    on propagation times. We use the notation $D(x_i) = \alpha + \beta x_i$ to represent the number of

    seconds it takes a block to reach full consensus as a function of the number of transactions

    $x_i$, where the parameters $\alpha$ and $\beta$ are estimated using a linear regression on blocks reaching

    the 95th percentile. The results from the regression (table 2) found the constant to not

    be significantly different from 0 and $\beta = 0.141$.20

    During the time that a block is propagating around the network, miners switch from

    mining competing blocks to mining on top of the block being propagated. Thus a miner

    can only mine a competing block to one being propagated if they are unaware of its

    existence. The model of information propagation will define the number of miners that

    are aware of miner i's block at time t given the number of transactions included in

    the block, $x_i$. There are n nodes in the network; let $t_{j,1}$ be the time in seconds at which

    node j learns about the existence of miner i's block, A1. Let $I_{j,1}(t)$ be the indicator

    function of whether node j knows about block A1 at time t:

    $$I_{j,1}(t) = \begin{cases} 0 & t_{j,1} > t \\ 1 & t_{j,1} \leq t \end{cases}$$

    $$I(t) = \sum_{j \in N} I_{j,1}(t)$$

    The ratio of informed nodes is given by:

    $$\phi(x_i, t) = E[I(t)] \, n^{-1} \qquad (2)$$

    The ratio of informed nodes will be a function of $x_i$ since this affects the rate at which

    nodes become informed about the block in circulation. For a given $x_i$, $\phi(x_i, t)$ is a

    deterministic function of time. Based on fitting actual data to different distributions, we

    make a distributional assumption that $\phi(x_i, t) \sim \ln\mathcal{N}(\mu, \sigma)$. From the empirical section

    we use the following formulas for $\mu$, and $\sigma$ is estimated using maximum likelihood.

    $$\mu = \ln[\text{Linear model for 50th percentile}] \qquad (3)$$

    $$\mu = \ln[1.57 + 0.017 x_i] \qquad (4)$$

    $$\sigma = 0.944 \qquad (5)$$

    20The 95th percentile was chosen due to the non-linear effects found after the 95th percentile arising from nodes connecting and disconnecting from the network. It is unlikely that these nodes are miners since they have an incentive to minimise downtime in order to maximise their chances of finding blocks.
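
    Equations (4) and (5) pin down the informed share as a log-normal CDF in time. A minimal sketch of that function follows; the names mu and informed_share are illustrative, and x is taken as it appears in equation (4).

```python
import math

SIGMA = 0.944  # scale parameter from equation (5)


def mu(x):
    """Location parameter from equation (4): log of the fitted median delay."""
    return math.log(1.57 + 0.017 * x)


def informed_share(x, t, sigma=SIGMA):
    """Log-normal CDF: share of nodes informed t seconds after publication."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu(x)) / (sigma * math.sqrt(2))
    return 0.5 * (1 + math.erf(z))
```

    By construction half of the network is informed at the fitted median delay, so informed_share(0, 1.57) is 0.5, and for any fixed time a larger x gives a smaller informed share.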

    The relationship is consistent due to the process underlying the increase in delay. The

    propagation delay can be decomposed into two effects: a verification process and

    a transmission delay (Decker & Wattenhofer 2013). When a node receives information

    about a block, it needs to validate the block. This includes checking the proof of work

    is valid and verifying all the digital signatures of each transaction. The most expensive

    element of the operation is checking the transaction input digital signature (Andresen

    2013). However this is still a trivial computation and the time taken is almost the same

    regardless of the underlying hardware. On average it takes 2ms to check a single digital

    signature (Andresen 2013). The transmission delay on top of this is the same between

    nodes. This is dependent on network topology and introduces heterogeneity in delay

    times. However there is still a strong linear effect found in the data as the verification

    process dominates the transmission delays (Decker & Wattenhofer 2013).

    4.1.2 The miners

    Miners are assumed to compete in monopolistic competition.21 No individual miner's

    actions can have an influence on the outcomes of the other miners. Miners only care

    that the block they find is included in the main chain. As the previous section outlined,

    if a miner finds a valid block but it has been beaten to the main chain by another block

    then it is orphaned and the miner loses out on the potential revenue gained. From miner

    i's viewpoint this can happen in one of two ways. Miner i finds a block, A1, first and

    another miner, unaware of A1, finds a block, B1, which is included on the main chain as

    B2 is found before A2. Alternatively, a miner could have found B1 first and miner i,

    unaware of B1, finds A1 but B2 is again found before A2 and so B1 is included on the

    main chain. Thus, for A1 to be included in the main chain, it must either propagate

    around the network fully with no collisions with other blocks or win any race between

    competing blocks. The probability that miner i's block is added to the main chain is

    given by $P(\vec{x})$. It is a function of the number of transactions that all miners include in their blocks, denoted by the vector $\vec{x} = (x_i, x_{-i})$, where $x_{-i}$ collects the choices of the other miners. As we will see, $P(\vec{x})$ is a decreasing function of $x_i$.

    21This assumption could be interpreted as analysing the case that Satoshi Nakamoto (2008) laid out in the original white paper as one CPU one vote.

    Miners constantly compete to find blocks on the Bitcoin network. However, this model is

    looking at the expected benefit from finding one block. The repeated interaction between

    miners is not significant as there is no change in the probability of winning caused by

    what happened in the past. The key strategic choice in the model is the number of

    transactions to include in a block. The profit function for miner i can therefore be

    written as:

    $$\pi_i = P_i(\vec{x})(R + c x_i) \qquad (6)$$

    where R is the block reward and c is the transaction fee. The block reward is an

    amount of newly minted bitcoins given to the miner for contributing a

    block to the network. The block reward schedule is predetermined and decreases over

    time according to figure 2. Importantly, the block reward is independent of the number of

    transactions and hence its magnitude determines the opportunity cost of having a block

    orphaned. The transaction fee is currently fixed on the Bitcoin network at a minimum

    of 0.0001 BTC (approximately USD 0.05) but there are plans to let the fees float in a market between users and miners. For our model we will take this fee as given and fixed. The miner will

    only include transactions if the expected benefit from doing so is positive.

    $$\frac{\partial \pi_i}{\partial x_i} = \frac{\partial P_i(\vec{x})}{\partial x_i}(R + c x_i) + c P_i(\vec{x}) \qquad (7)$$

    Which results in the following optimal condition.

    $$x_i^* = \max\left\{ \frac{P_i(\vec{x})}{|P_i'(\vec{x})|} - \frac{R}{c},\ 0 \right\} \qquad (8)$$

    Hence the nonce only equilibrium occurs where

    $$\frac{P_i(x_i, 0, 0, \dots, 0)}{|P_i'(x_i, 0, 0, \dots, 0)|} - \frac{R}{c} \leq 0, \quad \forall x_i \geq 0 \qquad (9)$$
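
    The step from the marginal benefit in (7) to the optimal condition in (8) can be spelled out as follows. The derivation sets (7) to zero and uses the fact that the probability of winning is decreasing in the miner's own transactions, so that its derivative is negative; the notation $P_i'$ for that derivative follows (8).

```latex
\begin{align*}
\frac{\partial P_i(\vec{x})}{\partial x_i}(R + c x_i) + c P_i(\vec{x}) &= 0 \\
R + c x_i &= -\frac{c P_i(\vec{x})}{\partial P_i(\vec{x}) / \partial x_i} \\
x_i &= \frac{P_i(\vec{x})}{|P_i'(\vec{x})|} - \frac{R}{c}
\end{align*}
```

    Since the number of transactions cannot be negative, the interior solution is truncated at zero, which gives the max operator in (8) and the inequality in (9).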

    4.1.3 The probability function

    The probability function represents the probability that, given a miner publishes a block,

    it is included on the main chain. When a miner publishes a block, the network is divided

    into two partitions: nodes that are informed about miner i's block and nodes that are

    uninformed. We will use i to denote any miner informed of miner i's block A1 and u

    to denote any miner who is in the uninformed partition. The numbers in parentheses

    indicate how many blocks have been found by the informed partition compared to any

    blocks found in the uninformed partition. Since the outcome is probabilistic, all possible

    cases are considered:

    Case 1: No other block is found prior to full propagation of A1. (1:0)

    Case 2: A miner in i's partition finds A2 prior to full propagation of A1 (2:0)

    Case 3: A miner in u's partition finds B1 after A1 has been propagated (1:1)

        Case 3.1: A miner in i's partition finds A2 and hence A1 is added to the main chain (2:1)

        Case 3.2: A miner in u's partition finds B2 and hence B1 is added to the main chain (1:2)

    Case 4: A miner $j \in N$ finds B1 prior to A1 being propagated (1:1)

        Case 4.1: A miner in i's partition finds A2 and hence A1 is added to the main chain (2:1)

        Case 4.2: A miner in u's partition finds B2 and hence B1 is added to the main chain (1:2)

    Cases 3 and 4 are not outcomes of the game since this leaves the network undecided

    if A1 is included in the main chain. Another block needs to be published in order to

    arrive at unanimous consensus. The probability that miner i's block, A1, is included

    in the main chain is determined by the probability that the game ends in states 1,2,3.1

    or 4.1. Under the assumption of monopolistic competition, the payoffs in each of the

    winning cases are the same. Even though in some cases more than one block is found

    by the informed partition, the probability that a small miner finds two blocks in a row

    is negligible. Hence we can rewrite miner i's payoff function as

    $$\pi_i = (P_1 + P_2 + P_{3.1} + P_{4.1})(R + c x_i) \qquad (10)$$

    Initially, we will assume that B1 contains 0 transactions for all $u \in N$ to simplify the analysis. This presents a worst case scenario as there exist strategic complementarities

    in the number of transactions that miners include in their blocks. The proof can be

    found in section A. Intuitively, the inclusion of more transactions by other miners

    improves the position of miner i in the network since their block A1 is competing with

    slower competition. The ability of miner i to remedy the increased probability of a

    block already circulating through including fewer transactions is limited. This is driven

    by the small probability that such a case could occur and the small marginal benefit


  • from including fewer transactions. Given that other miners increase the size of their

    blocks, miner i has an incentive to increase the size of their blocks.

    Assuming B1 contains 0 transactions implies that we can leave Case 4 out of our analysis

    since in this case miner i never finds a block to compete with B1 since it propagates

    instantaneously and hence any block found by i will use B1 as a parent. This assumption

    also pins down the probabilities in the move from Case 3 to Cases 3.1 and 3.2.

    Figure 4: Timings of events leading to the three different cases (one timeline from 0 to D(xi) for each of Case 1, Case 2 and Case 3; A2 precedes B2 on the Case 2 timeline and B2 precedes A2 on the Case 3 timeline)

    Figure 4 shows the timings of the events that lead to the different cases considered in the model.

    In Case 1, no miner finds a block until A1 has fully propagated at time D(xi). In Case

    2, the informed partition find A2 before the uninformed partition find B2 and before A1

    has fully propagated. In Case 3, the uninformed partition find B2 before the informed

    partition find A2 before A1 has fully propagated. Case 3.1 follows from Case 3 by

    weighting the probability of ending in Case 3 by the share of the network controlled by

    the informed nodes at time. Case 3.2 is the same but using the uninformed share of the

    network.22

    22The assumption of instant propagation allows us to specify these probabilities. If the competing block B1 did not propagate instantly we would need to specify how the block propagation of both A1 and B1 change in each other's presence. There is not yet enough data to fully understand how much block size impacts the propagation delay of the two blocks involved in an orphan race.


  • 4.1.4 Case 1

    Case 1 occurs when prior to the full propagation of A1, no block is found by the network.

    We will denote the number of transactions included in A1 as xi. Let tn denote the time

    of arrival of the first block found by any miner on the network. Since the whole network

    is working on solving a block and it finds blocks according to a Poisson process with rate

    $\lambda$, we can write down the probability that no miner has found a block at time t as:

    $$P(t_n > t) = e^{-\lambda t} \qquad (11)$$

    Thus the probability that we end in Case 1 is simply:

    $$P(\text{Case 1} \mid x_i) = e^{-\lambda D(x_i)} \qquad (12)$$

    Proposition 4.1 P(Case 1 | xi) is decreasing in xi at a decreasing rate.

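    Recall $D(x_i) = \alpha + \beta x_i$.

    $$\frac{\partial}{\partial x_i}\left[e^{-\lambda(\alpha + \beta x_i)}\right] = -\lambda\beta e^{-\lambda(\alpha + \beta x_i)} < 0; \quad \lambda > 0,\ \beta > 0$$

    $$\frac{\partial}{\partial x_i}\left[-\lambda\beta e^{-\lambda(\alpha + \beta x_i)}\right] = \lambda^2\beta^2 e^{-\lambda(\alpha + \beta x_i)} > 0$$

    Finding a block is a memoryless process as each guess at the cryptographic puzzle is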

    independent. As the time that the network spends on finding a block goes to infinity

    the probability of a block being found converges to 1. Hence the probability that the

    network does not find a block converges to 0. As the number of transactions, xi, included

    in A1 increases, there is an increase in the time it takes for A1 to reach full consensus, and

    hence the probability that no blocks are found in the time to full propagation falls.
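
    The Case 1 probability and the signs of its derivatives can be checked numerically. The sketch below assumes a network rate of one block per 600 seconds, sets the intercept of D to zero because the regression constant is not significant, and follows the text in applying the slope of 0.141 per transaction; the function names are illustrative.

```python
import math

LAMBDA = 1 / 600   # network block rate per second
ALPHA = 0.0        # intercept of D(x); the regression constant is not significant
BETA = 0.141       # slope of D(x) from the P95 column of Table 2


def delay(x):
    """D(x): seconds for a block with x transactions to propagate fully."""
    return ALPHA + BETA * x


def p_case1(x):
    """Equation (12): probability that no block is found before D(x)."""
    return math.exp(-LAMBDA * delay(x))


def dp_case1(x):
    """First derivative of equation (12) with respect to x (negative)."""
    return -LAMBDA * BETA * p_case1(x)


def d2p_case1(x):
    """Second derivative of equation (12) with respect to x (positive)."""
    return (LAMBDA * BETA) ** 2 * p_case1(x)
```

    With these values a block of 1000 transactions survives full propagation without a competing block about 79% of the time, compared with certainty for an empty block.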


  • 4.1.5 Case 2

    Case 2 occurs when, before the full propagation of A1, the informed nodes find a block,

    A2, and the uninformed nodes do not find a block. The probability of Case 2 occurring

    is determined by the ratio of uninformed to informed miners over time.

    Recall that $\phi(x_i, t)$ is the proportion of nodes that are informed about miner i's block

    for a given $x_i$ at time t. Following the empirical section, it is modelled as the CDF of

    the log normal distribution with estimated parameters $\mu$ and $\sigma$. At time zero, the time

    of propagation, no other nodes are aware of A1. As miners verify A1 and relay it to their

    neighbours, more miners join i's partition. As time passes and more nodes are informed,

    the likelihood of a node being connected to a random node in the network that does not

    know about the block decreases. Figure 5 shows the change of $\phi(x_i, t)$ depending on $x_i$.

    The effect of the change in the location parameter is more pronounced than the change

    in the scale parameter as evidenced by the horizontal shifts in the curves.

    The probability of ending in Case 2 is comprised of two concurrent processes. The

    informed miners are working on finding A2 and the uninformed miners are working on

    finding B1. Note that B1 competes to be included on the main chain with A1. Let ti

    be the random variable that gives the time the first block, A2, is found by the informed

    partition. Let tu be the random variable that gives the time that the uninformed par-

    tition find their first block, B1. In order to proceed, we need to make a homogeneity

    assumption. We require the hashing power is evenly distributed across the network in

    order to evaluate the impact of a change in propagation time. This assumption is less

    restrictive than it first appears. Many of the nodes on the network are not actually

    miners but are voluntary nodes that are simply relaying transactions and so the number

    of nodes actually represented by miners is small. The network is randomly formed, and

    thus the probability of connecting to a random node with a large share of hashing power

    is equal to any other node. Furthermore, the incentives concerning high connectivity on

    31

  • 1 10 100 1000Latency HsecondsL

    0.2

    0.4

    0.6

    0.8

    1.0

    HtL

    0

    100

    500

    1000

    2000

    Figure 5: (xi, t) for different values of xi

    the network have not been well studied and therefore it is unlikely that it is has become a

    domain of strategic interaction between miners. Although the split between the processes

    is changing over time, the network will still find as many blocks in a given period of time

    in expectation. The evolution of the split according to (xi, t) is a deterministic function

    of time. Hence, the arrival process of blocks follows non-homogenous poisson processes

    with rate parameters i(t) and u(t) respectively. Non homogenous poisson processes

    are characterised by their mean m(t) = t0 (t) dt. The distribution of the arrival of the

    first event in a non-homogenous poisson process is given by: f(t) = (t)em(t) (Gallager

    2011).
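    The first-arrival density can be checked numerically. The sketch below integrates the
    density with the trapezoid rule and compares the result with one minus the probability of
    no arrival, exp(-m(T)). The rate function `partition_rate` is a hypothetical example (a
    partition whose share of an assumed network rate of one block per 600 seconds grows
    linearly), not the rate estimated in the thesis.

```python
import math

def first_arrival_probability(rate, horizon, steps=10000):
    """Integrate f(t) = rate(t) * exp(-m(t)) over [0, horizon] with the trapezoid rule.

    Returns the integral and m(horizon), where m is the running integral of rate.
    """
    dt = horizon / steps
    m = 0.0                      # cumulative mean m(t)
    total = 0.0                  # integral of the density so far
    prev_rate = rate(0.0)
    prev_f = prev_rate           # f(0) = rate(0) * exp(0)
    for k in range(1, steps + 1):
        r = rate(k * dt)
        m += 0.5 * (prev_rate + r) * dt
        f = r * math.exp(-m)
        total += 0.5 * (prev_f + f) * dt
        prev_rate, prev_f = r, f
    return total, m

def partition_rate(t):
    # Hypothetical: the partition's share rises linearly to 1 over 300 seconds.
    return (1.0 / 600.0) * min(t / 300.0, 1.0)

prob, m = first_arrival_probability(partition_rate, 600.0)
print(prob, 1.0 - math.exp(-m))  # the two numbers should agree closely
```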

    In this case, $m(t)$ has the intuitive interpretation of network equivalent time. Take
    the informed partition as an example. The rate parameter, $\Phi(x_i, t)$, is the share of
    the network that is informed at time $t$ times the network rate. Its integral is
    $\int_0^t \Phi(x_i, s)\, ds$. Since $\Phi(x_i, s)$ represents the informed share of the
    network at time $s$, the integral is the amount of network equivalent time that the
    partition has spent on finding a new block. The areas under $\Phi(x_i, s)$ and
    $1 - \Phi(x_i, s)$ up to time $t$ must add up to $t$. Figure 6 shows how initially the
    uninformed nodes have a larger share of the network and are hence more likely to find a
    block. Since we are interested in who finds the first block after A1 was published,
    greater weight is put on the beginning of the distribution rather than the end, since by
    the end the network is more likely to have already found a block.

    Figure 6: $\Phi(x_i, t)$ and $1 - \Phi(x_i, t)$ for different values of $x_i$ (horizontal
    axis: seconds, 0 to 400; dashed line at $\Phi(t) = 0.5$)
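    The statement that the two areas add up to the elapsed time can be illustrated directly.
    This is a minimal sketch that assumes the log normal form of the informed share; the
    parameters and the helper name `network_equivalent_time` are hypothetical.

```python
import math

def informed_share(t, mu, sigma):
    """Log normal CDF used as the informed share of the network at time t."""
    if t <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(t) - mu) / (sigma * math.sqrt(2.0))))

def network_equivalent_time(t_end, mu, sigma, steps=10000):
    """Trapezoid-rule areas under the informed and uninformed shares up to t_end."""
    dt = t_end / steps
    m_i = m_u = 0.0
    prev = informed_share(0.0, mu, sigma)
    for k in range(1, steps + 1):
        cur = informed_share(k * dt, mu, sigma)
        m_i += 0.5 * (prev + cur) * dt                  # informed partition
        m_u += 0.5 * ((1.0 - prev) + (1.0 - cur)) * dt  # uninformed partition
        prev = cur
    return m_i, m_u

# Hypothetical parameters: half the network is informed after 10 seconds.
m_i, m_u = network_equivalent_time(10.0, math.log(10.0), 1.0)
print(m_i, m_u, m_i + m_u)
```

    Over this early window the uninformed partition accumulates more network equivalent time
    than the informed one, which is why it is initially more likely to find a block.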

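    Let $t_i$ be the random variable that indicates the time at which the informed miners
    find their first block and $t_u$ the random variable that indicates the time at which the
    uninformed miners find a block. To simplify notation, let
    $m_i(x_i, t) = \int_0^t \Phi(x_i, s)\, ds$ and
    $m_u(x_i, t) = \int_0^t \bigl(1 - \Phi(x_i, s)\bigr)\, ds$.

    $$
    \begin{aligned}
    P(\text{Case 2} \mid x_i) &= P(t_i < t_u) \\
    &= P(t_u \ge t)\, P(t_i \le t) \\
    &= P(t_u > t)\, P(t_i \le t) + P(t_u = t)\, P(t_i \le t)
    \end{aligned}
    $$

    Evaluating at $D(x_i)$, the time at which A1 has been fully propagated, gives:

    $$
    \begin{aligned}
    &\left(1 - \int_0^{D(x_i)} \bigl(1 - \Phi(x_i, t)\bigr) e^{-m_u(x_i, t)}\, dt\right)
      \int_0^{D(x_i)} \Phi(x_i, t)\, e^{-m_i(x_i, t)}\, dt \\
    &\quad + \int_0^{D(x_i)} \bigl(1 - \Phi(x_i, t)\bigr) e^{-m_u(x_i, t)}
      \left\{ \int_0^t \Phi(x_i, s)\, e^{-m_i(x_i, s)}\, ds \right\} dt
    \end{aligned}
    \tag{13}
    $$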

    4.1.6 Cases 3, 3.1 and 3.2

    Case 3 is the outcome where a miner j in the uninformed partition finds B1 before
    a miner in the informed partition finds A2 and before A1 fully propagates. Case 3 is
    defined by two competing blocks of the same height, each believed to be the tip of the
    longest chain by one partition of the network. Only one can be added to the main chain.
    In Bitcoin, this is known as an orphan race: one of the blocks being propagated will not
    end up on the main chain. Arriving in Case 3.1 means that the informed partition finds
    A2 before the uninformed partition finds B2. This is determined by the ratio of informed
    to uninformed miners. Under our assumption that B1 propagates instantaneously, the
    number of miners in the uninformed partition at the time B1 is found is the number of
    miners who will work on finding B2 (see footnote 23).

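    Recall that $t_i$ is the random variable that indicates the time at which the informed
    miners find their first block and $t_u$ the random variable that indicates the time at
    which the uninformed miners find a block. Also note that
    $m_i(x_i, t) = \int_0^t \Phi(x_i, s)\, ds$ and
    $m_u(x_i, t) = \int_0^t \bigl(1 - \Phi(x_i, s)\bigr)\, ds$.

    $$
    \begin{aligned}
    P(\text{Case 3} \mid x_i) &= P(t_u < t_i) \\
    &= P(t_i \ge t)\, P(t_u \le t) \\
    &= P(t_i > t)\, P(t_u \le t) + P(t_i = t)\, P(t_u \le t)
    \end{aligned}
    $$

    Evaluating at $D(x_i)$, the time at which A1 has been fully propagated, gives:

    $$
    \begin{aligned}
    &\left(1 - \int_0^{D(x_i)} \Phi(x_i, t)\, e^{-m_i(x_i, t)}\, dt\right)
      \int_0^{D(x_i)} \bigl(1 - \Phi(x_i, t)\bigr) e^{-m_u(x_i, t)}\, dt \\
    &\quad + \int_0^{D(x_i)} \Phi(x_i, t)\, e^{-m_i(x_i, t)}
      \left\{ \int_0^t \bigl(1 - \Phi(x_i, s)\bigr) e^{-m_u(x_i, s)}\, ds \right\} dt
    \end{aligned}
    \tag{14}
    $$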

    23 The number of miners working on B2 will be strictly lower than this number due to the
    delay in other miners hearing about B1.

    Result 4.1 Under monopolistic competition, the probabilities of ending in Cases 2 and 3
    are increasing in $x_i$.

    Under monopolistic competition, the probability of ending in Case 2 is increasing in
    $x_i$. The effect of increasing the time until full propagation dominates the slower rate
    at which nodes are informed about the block. Larger blocks lengthen the window of
    interest in the game and hence increase the probability that a block will be found by
    the network. The increase in the probability that the network finds a block is split
    between Cases 2 and 3. There are no closed form solutions and therefore the data from the
    empirical section is used to derive the probabilities in figure 7.

    Under the assumption of monopolistic competition, the probability of ending in Case 3
    is increasing in $x_i$. This is due to two complementary effects: the amount of time
    until full propagation has increased, and there is a greater share of uninformed miners
    at every point in time. Again no closed form solution exists, but the result can be seen
    in figure 7. When miner i decides to include more transactions, this increases the amount
    of time uninformed miners spend on a competing problem. Extending the time until full
    propagation also increases the probability of arriving at Case 3, as there is a greater
    amount of time in which the uninformed miners can find B1.
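    Because no closed form exists, probabilities of this kind have to be computed
    numerically. The sketch below is not an evaluation of the thesis's expressions (13) and
    (14). It uses the standard competing-risks formulation instead: the two partitions' rates
    sum to the network rate, so the time to the first block found anywhere is exponential,
    and each partition's chance of being first at time t is proportional to its share at t.
    The network rate of one block per 600 seconds follows the ten minute block interval
    mentioned in the text; `mu`, `sigma`, the horizon and all function names are hypothetical.

```python
import math

def informed_share(t, mu, sigma):
    """Log normal CDF used as the informed share of the network at time t."""
    if t <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(t) - mu) / (sigma * math.sqrt(2.0))))

def race_probabilities(mu, sigma, horizon, network_rate=1.0 / 600.0, steps=20000):
    """Return (P informed first, P uninformed first, P no block) within `horizon` seconds."""
    dt = horizon / steps
    p_informed = p_uninformed = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt                       # midpoint rule
        share = informed_share(t, mu, sigma)
        survive = math.exp(-network_rate * t)    # no block found anywhere before t
        p_informed += network_rate * share * survive * dt
        p_uninformed += network_rate * (1.0 - share) * survive * dt
    return p_informed, p_uninformed, math.exp(-network_rate * horizon)

# Hypothetical comparison: a block that propagates quickly versus one that propagates slowly.
fast = race_probabilities(math.log(10.0), 1.0, horizon=200.0)
slow = race_probabilities(math.log(30.0), 1.0, horizon=200.0)
print(fast)
print(slow)
```

    With the horizon held fixed, slower propagation moves probability from the informed to
    the uninformed partition, in line with the second effect described above.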

    Figure 7: Probabilities of ending in Cases 1, 2 and 3 (horizontal axis: $x_i$, 0 to
    10000; vertical axis: probability)

    Given that the network arrives at Case 3, there are still two possible outcomes. Either
    the informed miners find A2 or the uninformed miners find B2. Under the assumption that
    B1 propagates instantaneously, the network is now divided into two stationary partitions.
    Only the arrival of a block from either partition will cause the network to change. The
    network at that instant can now be modelled as being governed by a homogeneous Poisson
    process with a stationary split between two partitions. The probability of the arrival of
    a new block is the same in every time interval of a given length, and arrivals in
    disjoint intervals are independent. Given that the arrival of B1 occurred at time $t$,
    the rate parameters of the informed and the uninformed partitions are $\Phi(x_i, t)$ and
    $1 - \Phi(x_i, t)$ respectively. A homogeneous Poisson process split into two independent
    processes has a convenient result for the arrival of the next event (Gallager 2011).

    $$
    P[t_i < t_u] = \frac{\Phi(x_i, t)}{\Phi(x_i, t) + \bigl(1 - \Phi(x_i, t)\bigr)}
                 = \Phi(x_i, t)
    \tag{15}
    $$
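    The splitting result can be illustrated with a small simulation. This is a minimal
    sketch: two independent exponential waiting times with rates equal to the informed and
    uninformed shares race against each other, and the informed partition should win with
    probability equal to its share. The share value, trial count and seed are arbitrary
    illustrative choices.

```python
import random

def informed_wins_fraction(share, trials=100000, seed=1):
    """Fraction of simulated races in which the informed partition finds a block first.

    `share` must lie strictly between 0 and 1.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        t_i = rng.expovariate(share)        # informed partition, rate = share
        t_u = rng.expovariate(1.0 - share)  # uninformed partition, rate = 1 - share
        if t_i < t_u:
            wins += 1
    return wins / trials

print(informed_wins_fraction(0.3))  # should be close to 0.3
```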

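    The probability of Case 3.1 is then the probability that we arrive at Case 3 at time
    $t$ weighted by the share that the informed miners have at time $t$, $\Phi(x_i, t)$.
    Rewriting $P[\text{Case 3}]$ with the weighted probabilities gives:

    $$
    \begin{aligned}
    &\left(1 - \int_0^{D(x_i)} \Phi(x_i, t)\, e^{-m_i(x_i, t)}\, dt\right)
      \int_0^{D(x_i)} \Phi(x_i, t) \bigl(1 - \Phi(x_i, t)\bigr) e^{-m_u(x_i, t)}\, dt \\
    &\quad + \int_0^{D(x_i)} \Phi(x_i, t)\, e^{-m_i(x_i, t)}
      \left\{ \int_0^t \Phi(x_i, s) \bigl(1 - \Phi(x_i, s)\bigr) e^{-m_u(x_i, s)}\, ds \right\} dt
    \end{aligned}
    \tag{16}
    $$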

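    Similarly, the probability that Case 3.2 occurs is

    $$
    \begin{aligned}
    &\left(1 - \int_0^{D(x_i)} \Phi(x_i, t)\, e^{-m_i(x_i, t)}\, dt\right)
      \int_0^{D(x_i)} \bigl(1 - \Phi(x_i, t)\bigr)^2 e^{-m_u(x_i, t)}\, dt \\
    &\quad + \int_0^{D(x_i)} \Phi(x_i, t)\, e^{-m_i(x_i, t)}
      \left\{ \int_0^t \bigl(1 - \Phi(x_i, s)\bigr)^2 e^{-m_u(x_i, s)}\, ds \right\} dt
    \end{aligned}
    \tag{17}
    $$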
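    As a consistency check, the Case 3.1 and Case 3.2 expressions should add up to the Case 3
    expression. Writing $\Phi$ for the informed share, and assuming the weight inside each
    integral is evaluated at the same time as the rest of its integrand, the weighted
    integrands combine as follows:

```latex
\begin{aligned}
\Phi(x_i, t)\bigl(1 - \Phi(x_i, t)\bigr) + \bigl(1 - \Phi(x_i, t)\bigr)^2
  &= \bigl(1 - \Phi(x_i, t)\bigr)\bigl[\Phi(x_i, t) + 1 - \Phi(x_i, t)\bigr] \\
  &= 1 - \Phi(x_i, t)
\end{aligned}
```

    The right-hand side is the integrand of the Case 3 expression, so
    P(Case 3.1) + P(Case 3.2) = P(Case 3) term by term.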

    Result 4.2 As $x_i$ increases, the probability of Case 3.2 increases faster than the
    probability of Case 3.1.

    Result 4.1 showed that an increase in $x_i$ increases the probability that B1 is found
    before the full propagation of A1. Within Case 3, the probability of ending in Case 3.2
    increases at a faster rate than that of Case 3.1: at every point in time there is now a
    greater share of uninformed nodes, so the probability that a competing block B1 will
    eventually make it onto the main chain increases. This can be seen in figure 7.

    At 2300 transactions, there is an equal probability of ending in Cases 3.1 and 3.2. This
    result stands in contrast to the conventional wisdom about orphan races on the Bitcoin
    network. Smaller blocks are able to propagate much faster, but our analysis shows that
    their size is not decisive in an orphan race. Figure 8 shows that the inclusion of a
    block in the main chain is decided primarily by the time at which it was found and
    secondarily by its size. For blocks with fewer than 2300 transactions, the larger block
    still wins the race with greater probability than the smaller block. Beyond 2300
    transactions, the block containing zero transactions has a higher probability of being
    included on the main chain.

    Figure 8: Probabilities of ending in Cases 3.1 and 3.2 (horizontal axis: $x_i$, 0 to
    10000; vertical axis: probability)

    This result shows the importance of understanding the combination of the spread of
    information on the network and the rules of the protocol. The uninformed partition can
    find blocks until the last miner is informed about miner i's block, but once the block
    has propagated to over 50% of the network the likelihood of a competing block winning the
    orphan race is by definition less than half. The point at which the probability of Case
    3.2 occurring is higher than that of Case 3.1 therefore shows the point at which the
    uninformed miners are more likely to find a block before half the network has been
    informed of miner i's block.

    The intuition behind this result can be seen by referring to figure 6, noting the
    difference between the shift in the tail of $1 - \Phi(x_i, t)$ compared to where it
    crosses the 0.5 dashed line.

    The increase in the amount of time that uninformed miners spend on creating a competing
    block is spread throughout the log normal distribution and hence the probability of a
    collision does not determine the orphan rate. Rather it is the width of the tail that
    could lead to a large orphan rate. The condition for Case 3.2 to be more likely than Case
    3.1 is equivalent to saying that the majority of the blocks found by the uninformed
    partition occur before A1 has propagated to 50% of the network, under the assumption that
    B1 propagates instantly.

    Bitcoin orphan races are decided probabilistically based on the shares of the network
    working on competing blocks. The results derived are driven by the respective shares of
    the informed and the uninformed partitions. Empirical data informs reasonable estimates
    for how this split evolves over time. While the model is parameterised for the specific
    observation period, it could be adapted in the future for any changes in network
    topology or average size of blocks.

    5 Nonce only equilibrium

    Since there are no closed form solutions to the model, we proceed with numerical and
    graphical analysis to look at the outcomes of the game. We are primarily interested in
    how the expected benefit of miner i changes as a function of $x_i$, the number of
    transactions included in their block. At the end of the section we will look at the
    incentives that larger miners face and consider the incentives towards centralisation.
    Initially we proceed under the assumption of monopolistic competition.

    The current block reward in Bitcoin is 25 BTC for each block that is found. Houy
    (2014a) found that the block reward would have to fall to 2.03 BTC or transaction
    fees to rise to 0.00123 BTC in order for the largest mining pool, GHash.io, to include
    a positive number of transactions. In Houy's specification, GHash.io has 26% of the
    network hashing rate. Under monopolistic competition, accounting for the dynamics of
    information propagation, the equilibrium in which all miners include 0 transactions is
    broken when the fee rises to 0.0004 BTC.

    Figure 9 shows how the change in transaction fee affects the optimal number of
    transactions included by a miner. All lines begin at the same intercept as they result in
    equal payoffs. As miner i includes more transactions, the probability that their block is
    orphaned increases, but they also receive transaction fee revenue. The bottom line, set
    at the current transaction fee level, falls sharply as miner i includes more
    transactions. The increase in revenue is not enough to offset the increase in the
    probability that the block is orphaned. However, if the transaction fee was set at 0.0004
    BTC, miners would have an incentive to include as many transactions as they could. The
    shaded area in figure 9 shows the difference in miner revenue between the nonce only
    equilibrium strategy and the inclusion of transactions.

    Figure 9: Expected Benefit with R = 25 and varying values of c (curves for c = 0.0001,
    0.0002, 0.0003 and 0.0004; horizontal axis: $x_i$, 0 to 10000; vertical axis: EB)

    The power of transaction fee revenue increases in tandem with the halving of the block
    reward, R. In approximately August 2016, the block reward will halve to 12.5 BTC
    per block mined. With the halving of the block reward, the transaction fee required to
    break the nonce only equilibrium also halves, to 0.0002 BTC, just double current levels.
    The optimal number of transactions included remains unchanged, following from the
    equilibrium condition. Miners would still wish to include as many transactions as
    possible.
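    The halving arithmetic can be reproduced in a few lines. The sketch assumes that the
    equilibrium condition takes the form x* = max{P(x*)/|P'(x*)| - R/c, 0}, so that at the
    margin where the interior solution just becomes positive the break-even fee is
    c = R |P'(0)| / P(0), which is linear in R. The ratio |P'(0)| / P(0) is backed out from
    the figures quoted in the text (R = 25 BTC, break-even fee 0.0004 BTC); it is an
    inference made for illustration, not an estimate reported in the thesis, and the names
    below are hypothetical.

```python
def breakeven_fee(block_reward, hazard_ratio):
    """Smallest fee c at which including transactions starts to pay: c = R * |P'(0)| / P(0)."""
    return block_reward * hazard_ratio

# Implied |P'(0)| / P(0), inferred from R = 25 BTC and a break-even fee of 0.0004 BTC.
implied_ratio = 0.0004 / 25.0

for reward in (25.0, 12.5):
    print(reward, breakeven_fee(reward, implied_ratio))
```

    Under this reading, halving R to 12.5 BTC halves the break-even fee to 0.0002 BTC, the
    value quoted above.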

    While the block reward halving alleviates some of the incentive problems due to
    information propagation, the equilibrium condition in the model could put an upper bound
    on the price of Bitcoin if it is going to be used effectively as a payment system. Miners
    trade off the additional revenue from transaction fees against the increased probability
    that the block they find will be orphaned. Recall the equilibrium condition

    $$
    x_i^* = \max\left\{ \frac{P_i(x^*)}{\lvert P_i'(x^*) \rvert} - \frac{R}{c},\; 0 \right\}
    \tag{18}
    $$

    Assume an interior solution. If the price of Bitcoin in USD goes up, this increases
    both R and c in equal proportions in USD terms, since both the reward and the transaction
    fee are denominated in Bitcoin. Note that in order for us to remain at the equilibrium
    number of transactions processed, the cost of an individual transaction in Bitcoin terms
    has to remain constant and hence increase in USD terms. An increase in transaction fees
    in USD terms may result in fewer transactions being processed on the network. Earlier
    this year the minimum transaction fee set by the core development team was lowered
    to account for the price of Bitcoin rising (Hearn 2014). In the future, a further
    reduction in the minimum transaction fee might result in fewer transactions being
    processed.

    The current transaction fee is around fourfold lower than the value needed to escape
    the bad equilibrium where no miners have an incentive to include transactions in their
    blocks. However, even with fees at their current levels, miners include transactions in
    their blocks. Over the observation period, the average number of transactions in a block
    was 368. The model provides a method of estimating how much deviating from a nonce
    only strategy costs the miner in expectation. Figure 10 shows the expected benefit
    for finding a block under a nonce only strategy, which is independent of the number
    of transactions, and the payoff for including $x_i$ transactions in a block. The shaded
    area in figure 10 shows the revenue forgone by deviating from a nonce only strategy.
    Including 500 transactions in a block causes miners to lose in expectation 0.14 BTC for
    every block they find, approximately 100 USD in current value terms or 0.5% of operating
    revenue. Recall that blocks are found approximately every ten minutes and this effect
    is compounded over the number of blocks that a miner finds.

    Figure 10: Loss in revenue from deviating from a nonce only strategy (curves: Nonce Only,
    Deviation; horizontal axis: $x_i$, 0 to 10000; vertical axis: EB)

    Miners could be competing strategically to raise the benefits of block rewards and
    transaction fees indirectly. Miners may trade off the expected gains from following a
    nonce only mining strategy against these other strategies. The model could be extended
    to include a consideration of the participation of miners in pools. Mining pools offer a
    revenue smoothing service for a fee. They are visible actors on the network since they
    have to prove to their members when they find a block. Including more transactions
    may attract more miners to the pool who wish to support the Bitcoin network. In
    internet forums, miners are often praised for including a large number of transactions.
    Furthermore, some pools are criticised for their bad behaviour, causing miners to switch
    pools. With costless switching, mining pools may be very sensitive to bad press.

    6 Discussion

    The previous sections outlined the difficulties arising from information propagation that
    Bitcoin must overcome to become an incentive compatible decentralised payment network. In
    this section, we discuss the implications of partitions on the network more broadly. In
    light of the adverse incentives that face miners and the costs to the Bitcoin network
    more generally, we then consider a proposal to remove some of the costs associated with
    network latency. Although there are technically feasible amendments to the Bitcoin
    protocol that would speed up block propagation, network latency actually provides a
    layer of protection against malicious strategies between miners.

    The probability function outlined in section 4.1.3 not only describes the incentives
    facing the individual miner i but also defines the amount of time that the network spends
    away from global consensus. Recall that a partition in the network is defined by some
    miners holding a different set of beliefs on the current state of the ledger. The Bitcoin
    protocol has been shown to come to unanimous consensus in a finite time horizon as long
    as over half the computing power remains honest (Miller & LaViola Jr 2014; honest is
    defined in footnote 24). However, little attention has been devoted to understanding the
    dynamics that unfold while the network is not at unanimous consensus. This may be in part
    due to the small number of transactions facilitated on the Bitcoin network at present and
    also to a lack of modelling of how these partitions occur.

    In the model presented, Cases 2 through 4 represent the probabilities that the network
    moves into a state with multiple partitions. Result 4.1 showed how the probability of
    the network ending in one of these cases is increasing in the number of transactions
    being processed on the network. Hence increasing the transaction volume on the Bitcoin
    network will cause more and longer partitions to occur. Recall that the probabilities
    of moving into a Case were driven by the network equivalent time spent by miners on a
    competing problem. We found this to be strictly increasing in the number of
    transactions processed. Hence miners spend longer in partitions as a result of more
    transactions being processed.

    Partitions on the Bitcoin network open attack vectors. One example is the Finney
    attack. The attack was suggested by Hal Finney, a cryptographer who was one of the
    first people to run the Bitcoin software in 2009 and was the recipient of the first
    Bitcoin transaction. The attack proceeds as follows. A miner works on a block that
    contains a transaction between two addresses that the miner owns and does not announce
    that transaction to the network. As soon as the miner finds the solution to the
    cryptographic puzzle that enables them to publish the block, they run to the merchant to
    pay with the already spent outputs (see footnote 25). The window for the attack is the
    propagation time of the block. If blocks were propagated almost instantly then Finney
    attacks would be impossible to pull off. However, if the merchant is in a partition of
    the network that does not hear about the block for a considerable period of time, this
    increases the effectiveness of the attack. Combatting such an attack may require waiting
    for more blocks to be found in order to be sure that the transaction is valid and will
    not be double spent. In that scenario, Bitcoin ceases to be compatible with making
    everyday purchases in stores.

    24 Honest is defined here as not colluding with an adversary.

    Other implications of partitions on the Bitcoin network are yet to be explored.
    Geographic partitions in the network might result in increased orphan rates. In our
    model, we assume that the entire Bitcoin network is structured as a random graph.
    However, it is possible that there is some preferential attachment by geographic region.
    The empirical data shows that the country with the second largest number of nodes was
    China (figure 13), yet the number of inv messages received from Chinese nodes was only
    the eighth largest by country location (figure 12). Our measurement node was located in
    the US and hence this might indicate a geographic partition in the spread of information
    on the network. Furthermore, one of the major Chinese mining pools, Discus Fish, had
    anomalous information propagation trends for their blocks (figure 11). Their smaller
    blocks propagated slower than their bigger blocks. Although this could have been due to
    the small sample size, it could also be because the transactions in those blocks had not
    been seen by the network ahead of the block being published. When miners hear about
    transactions, they automatically check their signatures to expedite validation when
    blocks containing them are propagated. If information about transactions is not spreading
    fully around the network, this will create longer propagation times and therefore more
    time wasted by miners in general.

    25 Recently, a business, Bitundo, has started offering Finney attacks as a service. They
    allow customers to send double spends to their mining pool and notify them when the pool
    finds a block, so that the customer can double spend their coins at a merchant.

    Addressing the costs associated with network partitions arising from slow information
    propagation will be required to ensure that Bitcoin can serve as a scalable decentralised
    payment system. Mark Friedenbach (2014) has suggested a new protocol for the way
    that nodes receive and relay blocks to the network. Currently nodes fully validate
    blocks before forwarding them to peers. The information transmitted is also
    the entire contents of blocks. The primacy of blocks is determined by which block
    message is received first, which is subject to the transmission and validation delays.
    The new proposal suggests broadcasting the unique block headers separately from the
    block message, which will allow miners to discern which block came first. Broadcasting
    block headers gives miners the option of basing primacy on the time the block header is
    received, which is almost instantaneous. Although it is not 100% safe to start mining
    on top of a block header, a miner could, after the regular delay and validation, treat
    the first block that they heard of as the longest chain and put it as the parent of their
    block.
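    The difference between the two primacy rules can be sketched as follows. This is an
    illustrative toy, not the proposal's actual implementation: the `Announcement` record,
    the `pick_parent` helper and all timings are hypothetical, and validation itself is not
    modelled beyond the time at which a full block becomes available.

```python
from dataclasses import dataclass

@dataclass
class Announcement:
    block_id: str
    header_time: float  # seconds at which the block header was received
    block_time: float   # seconds at which the full block was received and validated

def pick_parent(announcements, headers_first):
    """Return the block treated as first seen among competing blocks of equal height."""
    if headers_first:
        first = min(announcements, key=lambda a: a.header_time)
    else:
        first = min(announcements, key=lambda a: a.block_time)
    return first.block_id

# Hypothetical timings: one block is found first but is large and slow to arrive in full;
# the other is found later but is empty and arrives quickly.
large_early = Announcement("A1", header_time=0.5, block_time=12.0)
empty_late = Announcement("B1", header_time=3.5, block_time=4.0)

print(pick_parent([large_early, empty_late], headers_first=False))
print(pick_parent([large_early, empty_late], headers_first=True))
```

    Under the current rule the later, smaller block is treated as first because its full
    message arrives sooner; basing primacy on header arrival restores the order in which the
    blocks were found.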

    To see how this works, take an example: two blocks with a common parent A0 are found

    in quick succession; A1 and B1. A1 is found first but is a large block. B1 is found a

    few seconds later and is an empty block. Case 3.2 showed the likelihood that B1, the

    later but smaller block, could propagate faster, orphaning the larger A1. Under headers-

    first, the cost of ass