The Evolutionary Equilibrium of Block Withholding Attack

Bitcoin is the best-known and most widely used cryptocurrency, and its popularity has grown sharply in recent years. However, the Bitcoin system is exposed to a variety of attacks, including the block withholding (BWH) attack. A miner who launches the BWH attack withholds every block it newly discovers in the attacked pool, which damages the honest miners' right to a fair reward. In this paper, we consider a setting in which two miners in a mining pool may each either mine honestly or perform the BWH attack. Different strategy profiles yield different payoffs, which in turn influence the miners' choice of strategies. We therefore establish an evolutionary game model and, by formulating the replicator dynamic equations, study the behavioral tendency of the miners and the evolutionary stable strategies under different conditions. Numerical simulations verify the theoretical results on evolutionary stable strategies and show how the model parameters affect the miners' strategic choices. Based on the simulation results, we make recommendations for the pool manager and the miners on mitigating the BWH attack and promoting cooperation between miners in a mining pool.


Introduction
Bitcoin is a cryptocurrency originally proposed by Nakamoto [1]. Unlike traditional currencies, Bitcoin is decentralized and runs without administrators, which has contributed to its great success. One of the key technologies on which Bitcoin relies is the blockchain, a public distributed ledger in which all network nodes can participate in verifying transactions. This structure helps preserve the integrity, continuity, and consistency of the data, and gives the blockchain several desirable features, such as decentralization, programmability, and security.
As one of the most successful applications of blockchain technology, the Bitcoin system uses the Proof-of-Work (PoW) consensus protocol to maintain the consistency and security of data [2]. To reach agreement among all nodes, PoW requires the participants to spend computational power on a SHA-256 puzzle that is hard to solve but easy to verify [3]. The participant who first solves the puzzle wins the right to broadcast its verified block to the blockchain network and obtains the corresponding reward. These participants are called miners, and the process of solving the puzzle to obtain the reward is called mining.
To obtain the reward from the Bitcoin system, all miners compete to be the first to solve the puzzle and generate the block. The system automatically adjusts the difficulty of block generation so that the average interval between blocks stays at about 10 minutes. Because the difficulty has risen and a solo miner's computational power is small, a solo miner rarely generates a block. Although its expected revenue is positive, it has to wait a very long time to create a block and earn an actual reward. Joining a mining pool is therefore a good choice for a solo miner. A mining pool generally consists of a pool manager and a group of miners. The main task of the manager is to outsource the work to the miners. Once a miner submits a full proof of work (FPoW) to the manager, the manager sends this FPoW to the Bitcoin system. When the manager receives the full revenue of the block from the system, it allocates this revenue fairly to the miners according to their computational power. The mining pool also accepts partial proofs of work (PPoW) and estimates each miner's computational contribution from the rate at which it submits PPoWs. This estimate is the basis on which the manager distributes revenue to miners who submit only PPoWs.
Because mining pools are open, the Bitcoin system faces several kinds of attacks, such as selfish mining attacks [4], FAW attacks [5], block withholding (BWH) attacks [6], and DDoS attacks [7]. In this paper, we focus on the BWH attack. A miner who launches the BWH attack sends only PPoWs to the manager and discards every FPoW. On the one hand, since the attacker discards every FPoW, it contributes nothing to the pool's revenue. On the other hand, the attacker still shares the revenue obtained by the other miners, because the PPoWs it submits are counted as contribution. The BWH attack therefore seriously damages the honest miners' right to a fair reward.
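The difference between an honest miner and a BWH attacker can be illustrated with a toy sketch. The two targets, the header bytes, and all function names below are illustrative assumptions rather than real Bitcoin parameters; real pools use far smaller targets and a structured block header.

```python
import hashlib

# Toy targets (assumptions, far easier than real Bitcoin difficulty):
# a hash below SHARE_TARGET is a partial proof of work (PPoW),
# a hash below the smaller BLOCK_TARGET is a full proof of work (FPoW).
SHARE_TARGET = 1 << 248   # about 1 hash in 2**8 qualifies
BLOCK_TARGET = 1 << 240   # about 1 hash in 2**16 qualifies


def pow_value(header: bytes, nonce: int) -> int:
    """Double SHA-256 of header plus nonce, read as a big-endian integer."""
    data = header + nonce.to_bytes(8, "little")
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return int.from_bytes(digest, "big")


def classify(header: bytes, nonce: int) -> str:
    """Label a nonce as 'FPoW', 'PPoW', or 'none'."""
    value = pow_value(header, nonce)
    if value < BLOCK_TARGET:
        return "FPoW"
    if value < SHARE_TARGET:
        return "PPoW"
    return "none"


def submissions(header: bytes, nonces: range, withhold: bool) -> list:
    """Proofs a miner sends to the manager.

    An honest miner submits every PPoW and FPoW it finds; a BWH attacker
    (withhold=True) submits the PPoWs but silently drops every FPoW.
    """
    sent = []
    for nonce in nonces:
        kind = classify(header, nonce)
        if kind == "none":
            continue
        if kind == "FPoW" and withhold:
            continue
        sent.append((nonce, kind))
    return sent
```

From the manager's side the two miners look alike as long as only PPoWs are counted: calling `submissions(header, range(n), withhold=True)` returns exactly the PPoW entries of the honest list, with every FPoW missing.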
In this work, we study the BWH attack in a mining pool by constructing an evolutionary game model. To keep the model simple, we suppose that there are two miners in a pool, each of whom has two strategies: to cooperate, i.e., mine honestly, or to launch the BWH attack. In each round of the evolutionary game, both miners observe each other's strategy and move from their current lower-income strategy toward the higher-income one. We are mainly interested in the evolutionary stable strategies (ESS) of the game and in how different factors promote cooperation between the miners.

Related Work
When a miner launches the BWH attack in a mining pool, it submits only PPoWs and discards FPoWs, which convinces the manager that the attacker is indeed trying to mine for the pool. Because the manager mistakes the attacker for an honest miner, it also allocates revenue to the attacker along with the other pool miners. The BWH attack was first proposed in [8]. In 2014, a mining pool called Eligius suffered a BWH attack and lost 300 BTC [9]. Since then, this attack has received considerable research attention.
Eyal [6] first used a game between two mining pools to analyze the Nash equilibrium of the BWH attack. In [6], the two pools each decide whether or not to attack, a situation that resembles the famous Prisoner's Dilemma and is therefore called the Miner's Dilemma. To keep pools from being trapped in the miner's dilemma and to optimize the mining model, [3] proposed a subclass of Zero-Determinant (ZD) strategies by which a miner can control another miner's payoff and increase the social revenue. [10] modeled a computational power splitting game and showed that the attacker can profit in the long run, though not necessarily in the short run, which implies that the existing pool reward sharing protocols in Bitcoin are insecure when miners launch the BWH attack.
As the study of BWH attacks has deepened, more researchers have turned to mitigation strategies. [2] argued that most countermeasures against the BWH attack change the mining algorithm, which lowers their practical adaptability, and suggested three necessary conditions for a BWH countermeasure: no loss, compatibility, and fairness. [11] introduced the incentive compatibility of the reward allocation mechanisms of a mining pool, which can encourage the miners to submit blocks immediately and guarantee the pool's revenue. [12] proposed a "special reward" that grants an additional incentive to the miner who submits a valid block to the pool; a BWH attacker never receives it. Under this scheme, the attacker's revenue falls below its expectation, which helps a mining pool repel BWH attackers. [13] presented two schemes to counter the BWH attack, one based on cryptographic commitment schemes and the other an alternative implementation using hash functions; both make it impossible for miners to distinguish between a full proof of work and a partial proof of work. [14] constructed a generalized model to analyze the equilibrium of the BWH attack and found that increasing the asymmetry of information through information-concealing mechanisms can reduce the negative influence of the BWH attack on the pool.
In decision-making research, evolutionary game theory is a common tool for building models and analyzing the choice of strategies [15]. Classical game theory assumes that all players are perfectly rational, whereas evolutionary game theory assumes only bounded rationality [16]. That is, each player's equilibrium strategy results from continuous learning and adjustment rather than from a one-time choice. The basic solution concept of evolutionary game theory is the evolutionary stable strategy (ESS) [17], which describes the stable state of the evolutionary process. Recently, several works have studied the blockchain using evolutionary game theory. [18] applied an evolutionary game to describe the dynamic mining-pool selection process in a PoW-based blockchain network and provided a theoretical analysis of evolutionary stability in the two-pool case. [19] modeled mining as a two-stage game to characterize a pool's decisions on whether to open and whether to launch the BWH attack in a PoW-based blockchain network. They applied evolutionary game theory and analyzed the evolutionary stability of the strategy selection; this method overcomes a shortcoming of the Nash equilibrium (NE), which describes only the local optimization of the pools' strategy selection. In [20], the authors investigated the evolutionary mining game with the miner's dilemma under the BWH attack and used evolutionary stability to study how the populations of the participating pools change over time. They also analyzed the mining pool dynamics affected by malicious infiltrators and the feasibility of autonomous migration among individual miners.
Few works study the BWH attack with an evolutionary game from the perspective of mitigating the attack. [21] constructed a symmetric evolutionary game model to analyze the expected benefits of the strategy selection of two miners in a mining pool, where the two miners have the same computational power. The authors explored how the pool administrator could mitigate the BWH attack under different supervision and punishment mechanisms. In this paper, we construct an asymmetric evolutionary game model, more general than the symmetric one, in which the two miners in a mining pool have different amounts of computational power. Our objectives are to explore the influence of the main factors on the miners' strategy selection and to make suggestions for mitigating the BWH attack and promoting cooperation between miners.

Paper Organization
In this paper, we analyze the BWH attack in a pool by establishing an evolutionary game model. Motivated by [14], we construct the payoff functions under the different strategy profiles in Section 2. In the same section, we establish the replicator dynamic equations and derive the evolutionary stable strategies under different conditions. In Section 3, we analyze through a series of simulations how different factors influence the miners' strategic choices. The last section offers several suggestions and concludes the paper.

The Evolutionary Game Model for the BWH Attack
In this section, we first establish an evolutionary game model to study the BWH attack in a Bitcoin mining pool, and then analyze the evolutionary stable strategies by using the replicator dynamic equations.

Basic Evolutionary Game Model
Generally, there are many miners in a Bitcoin mining pool. Since this work concentrates on only two strategies, honest mining (C) and the BWH attack (A), we simplify the discussion as in [3,22] by assuming two participants, miner 1 and miner 2, each of whom has these two strategies. The resulting evolutionary game has four strategy profiles: (C, C), (C, A), (A, C), and (A, A). Different strategy profiles bring different payoffs to each miner. During the evolutionary game, the miners keep learning: each adjusts its low-income strategy and imitates the strategy choice of the miner with the higher income, until the strategy profile of the two miners reaches a stable state. To establish the model formally, we introduce the following assumptions and parameters, which are similar to those in [14].
1) The two miners own different amounts of computational power. Let miner 1 and miner 2 have $a_1$ and $a_2$ units of computational power, respectively. W.l.o.g., we assume that $a_1 \ge a_2$ and write $a_2 = \lambda a_1$ with $0 < \lambda \le 1$.
2) The reward the mining pool obtains per unit of computational power per unit time is denoted by $R$.
3) When a miner mines honestly and sends the full proof of work (FPoW) to the manager immediately, its mining cost per unit time is $C_1$ ($C_1 > 0$). If the miner employs the BWH attack and submits only the partial proof of work (PPoW), its mining cost per unit time is $C_2$ ($0 \le C_2 < C_1$).
4) If both miners mine honestly, the probability of finding a valid block increases, which raises the expected profit. We therefore assume that mutual cooperation multiplies the reward by a factor $\gamma$ ($\gamma > 1$).
5) To encourage the miners to cooperate, the pool manager sets aside part of the reward $R$ as an additional reward for the miner who submits the FPoW. We assume the additional reward per unit of computational power per unit time to be $\delta R$, where $\delta \in (0, 1)$.
If the two miners both cooperate, the revenue per unit of computational power per unit time of the entire mining pool increases to $\gamma R$. The payoff of each miner is the difference between its reward, which is proportional to its computational power, and the cost of mining honestly. Therefore, under the strategy profile (C, C), miner 1 has payoff $a_1 \gamma R - C_1$ and miner 2 has payoff $a_2 \gamma R - C_1$. If miner 1 mines honestly and miner 2 employs the BWH attack, that is, under the strategy profile (C, A), the computational power that is actually useful for finding a valid block is $a_1$, and thus the total revenue per unit time of the mining pool is $a_1 R$. In this case, the pool manager first pays the additional reward $a_1 \delta R$ to miner 1 for its honest behavior and then allocates the rest of the reward, $a_1 (1 - \delta) R$, to miner 1 and miner 2 in proportion to their computational powers. So the payoff of miner 1 is $a_1 \delta R + a_1^2 (1 - \delta) R - C_1$. Miner 2 obtains part of the reward as a free rider even though it incurs only the smaller cost; its payoff is $a_1 a_2 (1 - \delta) R - C_2$. For the strategy profile (A, C), the symmetric case occurs with the roles exchanged, and the payoffs of miner 1 and miner 2 are $a_1 a_2 (1 - \delta) R - C_2$ and $a_2 \delta R + a_2^2 (1 - \delta) R - C_1$, respectively. In the last case, (A, A), both miners attack, no valid block can be found, and no reward is obtained. It follows that the payoff of each miner is $-C_2$.
Based on the above analysis and the aforementioned assumptions, the corresponding payoff matrix is shown in Table 1.
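As a quick check of the four strategy profiles, the payoff matrix can be written as a small function. This is a sketch under stated assumptions: the function name and the numerical values in the usage note are hypothetical, $a_2 = \lambda a_1$ is taken from the replicator analysis below, and the (A, C) entries are filled in by symmetry with the (C, A) case.

```python
from typing import Dict, Tuple


def payoff_matrix(a1: float, lam: float, R: float, C1: float, C2: float,
                  gamma: float, delta: float) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Payoffs (miner 1, miner 2) for each strategy profile.

    'C' means honest mining and 'A' means the BWH attack.
    a2 = lam * a1 is the computational power of miner 2.
    """
    a2 = lam * a1
    shared = (1 - delta) * R  # per-unit reward left after the additional reward
    return {
        ("C", "C"): (a1 * gamma * R - C1, a2 * gamma * R - C1),
        ("C", "A"): (a1 * delta * R + a1 ** 2 * shared - C1, a1 * a2 * shared - C2),
        # Mirror image of (C, A): miner 2 is now the honest one (assumed by symmetry).
        ("A", "C"): (a1 * a2 * shared - C2, a2 * delta * R + a2 ** 2 * shared - C1),
        ("A", "A"): (-C2, -C2),
    }
```

For example, with the hypothetical values `a1=1, lam=0.5, R=10, C1=4, C2=1, gamma=2, delta=0.2`, the profile (C, A) gives miner 1 a payoff of 6 and the free-riding miner 2 a payoff of 3, while (A, A) leaves both with -1.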

The Solutions of the Evolutionary Game
In the evolutionary game model established above, the two miners have two strategies and different payoffs. Let $x$ and $y$, with $0 \le x \le 1$ and $0 \le y \le 1$, be the probabilities that miner 1 and miner 2, respectively, play the cooperation strategy. The probabilities that miner 1 and miner 2 employ the BWH attack are then $1 - x$ and $1 - y$, respectively.
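Denote by $U_{11}$ and $U_{12}$ the expected payoffs of miner 1 when it cooperates and when it adopts the BWH attack, respectively. According to the payoff matrix in Table 1, we obtain

$$U_{11} = y\,(a_1 \gamma R - C_1) + (1 - y)\,\bigl(a_1 \delta R + a_1^2 (1 - \delta) R - C_1\bigr),$$
$$U_{12} = y\,\bigl(a_1 a_2 (1 - \delta) R - C_2\bigr) + (1 - y)\,(-C_2).$$

Hence, the average expected payoff of miner 1 is

$$\bar{U}_1 = x\,U_{11} + (1 - x)\,U_{12}.$$

Similarly, let $U_{21}$ and $U_{22}$ be the expected payoffs of miner 2 when it mines honestly and when it employs the BWH attack, respectively. From the payoff matrix in Table 1,

$$U_{21} = x\,(a_2 \gamma R - C_1) + (1 - x)\,\bigl(a_2 \delta R + a_2^2 (1 - \delta) R - C_1\bigr),$$
$$U_{22} = x\,\bigl(a_1 a_2 (1 - \delta) R - C_2\bigr) + (1 - x)\,(-C_2).$$

So, the average expected payoff of miner 2 is

$$\bar{U}_2 = y\,U_{21} + (1 - y)\,U_{22}.$$

In an evolutionary game model, each participant keeps learning and shifts from its lower-payoff strategy to the higher-payoff one. This learning determines the growth rate of a strategy in the replicator system, which reflects the direction of evolution. By [23], the growth rate of a strategy selected by a participant equals the difference between the payoff of this strategy and the participant's average expected payoff. Because $a_2 = \lambda a_1$, the replicator dynamic equations of miner 1 and miner 2 are

$$F(x, y) = \frac{dx}{dt} = x\,(U_{11} - \bar{U}_1) = x(1 - x)\bigl[(1 - y)\,\alpha_1 + y\,\beta_1\bigr], \tag{1}$$
$$G(x, y) = \frac{dy}{dt} = y\,(U_{21} - \bar{U}_2) = y(1 - y)\bigl[(1 - x)\,\alpha_2 + x\,\beta_2\bigr], \tag{2}$$

where, for brevity,

$$\alpha_1 = a_1 \delta R + (1 - \delta) a_1^2 R - C_1 + C_2, \qquad \beta_1 = a_1 \gamma R - (1 - \delta) \lambda a_1^2 R - C_1 + C_2,$$
$$\alpha_2 = \lambda a_1 \delta R + (1 - \delta) \lambda^2 a_1^2 R - C_1 + C_2, \qquad \beta_2 = \lambda a_1 \gamma R - (1 - \delta) \lambda a_1^2 R - C_1 + C_2.$$

Here $\alpha_i$ is the payoff miner $i$ gains by cooperating rather than attacking when the other miner attacks, and $\beta_i$ is the corresponding gain when the other miner cooperates. Equations (1) and (2) together form the replicator dynamic system

$$\begin{cases} \dfrac{dx}{dt} = F(x, y), \\[2mm] \dfrac{dy}{dt} = G(x, y). \end{cases} \tag{3}$$

By the stability theorem of differential equations, all the solutions satisfying $F(x, y) = \frac{dx}{dt} = 0$ and $G(x, y) = \frac{dy}{dt} = 0$ are equilibrium points of the replicator dynamic system. It is not hard to see that this system has five equilibrium points: $(0, 0)$, $(1, 0)$, $(0, 1)$, $(1, 1)$ and $(x^*, y^*)$, where

$$x^* = \frac{\alpha_2}{\alpha_2 - \beta_2}, \qquad y^* = \frac{\alpha_1}{\alpha_1 - \beta_1}.$$

From the results in [23], the stability of the equilibrium points can be determined from the Jacobian matrix. The Jacobian matrix of the replicator dynamic system (3) is

$$J = \begin{pmatrix} \dfrac{\partial F}{\partial x} & \dfrac{\partial F}{\partial y} \\[2mm] \dfrac{\partial G}{\partial x} & \dfrac{\partial G}{\partial y} \end{pmatrix} = \begin{pmatrix} (1 - 2x)\bigl[(1 - y)\alpha_1 + y\beta_1\bigr] & x(1 - x)(\beta_1 - \alpha_1) \\[2mm] y(1 - y)(\beta_2 - \alpha_2) & (1 - 2y)\bigl[(1 - x)\alpha_2 + x\beta_2\bigr] \end{pmatrix}.$$

The determinant (Det) and the trace (Tr) of $J$ are

$$\mathrm{Det}J = \frac{\partial F}{\partial x}\frac{\partial G}{\partial y} - \frac{\partial F}{\partial y}\frac{\partial G}{\partial x}, \qquad \mathrm{Tr}J = \frac{\partial F}{\partial x} + \frac{\partial G}{\partial y}.$$

As different equilibrium points give different determinants and traces of the Jacobian matrix, we list all of them in Table 2.

Table 2 The determinant and trace of the system at equilibrium points

$$\begin{array}{lll}
\text{Equilibrium point} & \mathrm{Det}J & \mathrm{Tr}J \\
\hline
(0, 0) & \alpha_1 \alpha_2 & \alpha_1 + \alpha_2 \\
(1, 0) & -\alpha_1 \beta_2 & \beta_2 - \alpha_1 \\
(0, 1) & -\alpha_2 \beta_1 & \beta_1 - \alpha_2 \\
(1, 1) & \beta_1 \beta_2 & -(\beta_1 + \beta_2) \\
(x^*, y^*) & -\dfrac{\alpha_1 \alpha_2 \beta_1 \beta_2}{(\alpha_1 - \beta_1)(\alpha_2 - \beta_2)} & 0
\end{array}$$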

Equilibrium Analysis for the Evolutionary Game
Based on the results in [23], a point $(x, y)$ is an evolutionary stable strategy if the determinant satisfies $\mathrm{Det}J(x, y) > 0$ and the trace satisfies $\mathrm{Tr}J(x, y) < 0$. Thus, according to Table 2, the following five situations need to be discussed.
Situation 2 When $a_1 \delta R + (1 - \delta) a_1^2 R - C_1 + C_2 < 0$ and $\lambda a_1 \gamma R - (1 - \delta) \lambda a_1^2 R - C_1 + C_2 > 0$, we have $\mathrm{Det}J(0, 0) > 0$, $\mathrm{Tr}J(0, 0) < 0$ and $\mathrm{Det}J(1, 1) > 0$, $\mathrm{Tr}J(1, 1) < 0$. This means that $(0, 0)$ and $(1, 1)$ are both evolutionary stable strategies of the system. The first condition says that when miner 2 attacks, launching the BWH attack is the better choice for miner 1, since it yields a higher payoff than cooperating. The second condition says that if miner 1 cooperates, then miner 2 prefers to cooperate as well. Thus rational miners choose to mine honestly if cooperation gives them higher payoffs, and both choose to attack if attacking gives them higher payoffs than cooperating. Under this situation, the overall evolutionary outcome of the system is therefore uncertain and depends on the initial state.
Situation 5 When $\lambda a_1 \delta R + (1 - \delta) \lambda^2 a_1^2 R - C_1 + C_2 > 0$ and $a_1 \gamma R - (1 - \delta) \lambda a_1^2 R - C_1 + C_2 < 0$, the points $(0, 1)$ and $(1, 0)$ are the stable points of the system. The first condition means that when miner 1 attacks, miner 2 gains more from honest mining; the second means that when miner 2 mines honestly, miner 1 gains a higher return by attacking. That is, when one side cooperates, the other side chooses to attack, and thus the two sides adopt different strategies. Under this situation, the dynamic system eventually evolves to a state in which one side adopts the cooperation strategy and the other adopts the attack strategy.
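The stability test can be checked numerically at the four corner points. The sketch below is an illustration under stated assumptions: the function names and both parameter sets are hypothetical (they are not the values used in the paper), and the payoff gaps are read off from the payoffs of Section 2.1 with $a_2 = \lambda a_1$.

```python
def payoff_gaps(a1, lam, R, C1, C2, gamma, delta):
    """Gain from cooperating instead of attacking, for each miner and opponent move."""
    a2 = lam * a1
    d1_att = a1 * delta * R + (1 - delta) * a1 ** 2 * R - C1 + C2    # miner 1, miner 2 attacks
    d1_coop = a1 * gamma * R - (1 - delta) * a1 * a2 * R - C1 + C2   # miner 1, miner 2 cooperates
    d2_att = a2 * delta * R + (1 - delta) * a2 ** 2 * R - C1 + C2    # miner 2, miner 1 attacks
    d2_coop = a2 * gamma * R - (1 - delta) * a1 * a2 * R - C1 + C2   # miner 2, miner 1 cooperates
    return d1_att, d1_coop, d2_att, d2_coop


def jacobian(x, y, gaps):
    """Entries (Fx, Fy, Gx, Gy) of the Jacobian of the replicator system at (x, y)."""
    d1a, d1c, d2a, d2c = gaps
    fx = (1 - 2 * x) * ((1 - y) * d1a + y * d1c)
    fy = x * (1 - x) * (d1c - d1a)
    gx = y * (1 - y) * (d2c - d2a)
    gy = (1 - 2 * y) * ((1 - x) * d2a + x * d2c)
    return fx, fy, gx, gy


def stable_corners(gaps):
    """Corner points with Det J > 0 and Tr J < 0, i.e. the corner ESS."""
    stable = []
    for x, y in [(0, 0), (1, 0), (0, 1), (1, 1)]:
        fx, fy, gx, gy = jacobian(x, y, gaps)
        det = fx * gy - fy * gx
        trace = fx + gy
        if det > 0 and trace < 0:
            stable.append((x, y))
    return stable


# Hypothetical parameters that satisfy the two conditions of Situation 2.
situation_2 = payoff_gaps(a1=0.6, lam=0.5, R=10.0, C1=7.0, C2=1.0, gamma=3.0, delta=0.2)
# Hypothetical parameters that satisfy the two conditions of Situation 5.
situation_5 = payoff_gaps(a1=2.0, lam=0.8, R=1.0, C1=2.0, C2=0.3, gamma=1.5, delta=0.5)
```

With these values, `stable_corners(situation_2)` returns the two corners $(0, 0)$ and $(1, 1)$, and `stable_corners(situation_5)` returns $(1, 0)$ and $(0, 1)$, in agreement with the two situations described above.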

Simulations for Different Situations
To help readers intuitively understand the conclusions of the evolutionary game model with the BWH attack, we first simulate in Matlab the dynamic evolutionary process of the two miners' strategy selection under the different situations described in Subsection 2.3.
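A simulation of this kind can be sketched in a few lines. The version below uses Python with a plain forward-Euler step instead of Matlab; the step size, the horizon, and the parameter values are assumptions chosen so that the two conditions of Situation 2 hold, and they are not the settings used for the figures in the paper.

```python
def payoff_gaps(a1, lam, R, C1, C2, gamma, delta):
    """Gain from cooperating instead of attacking, for each miner and opponent move."""
    a2 = lam * a1
    d1_att = a1 * delta * R + (1 - delta) * a1 ** 2 * R - C1 + C2
    d1_coop = a1 * gamma * R - (1 - delta) * a1 * a2 * R - C1 + C2
    d2_att = a2 * delta * R + (1 - delta) * a2 ** 2 * R - C1 + C2
    d2_coop = a2 * gamma * R - (1 - delta) * a1 * a2 * R - C1 + C2
    return d1_att, d1_coop, d2_att, d2_coop


def simulate(x0, y0, params, dt=0.01, steps=5000):
    """Integrate the replicator dynamics with forward Euler and return the final (x, y)."""
    d1a, d1c, d2a, d2c = payoff_gaps(**params)
    x, y = x0, y0
    for _ in range(steps):
        dx = x * (1 - x) * ((1 - y) * d1a + y * d1c)   # growth of miner 1's cooperation
        dy = y * (1 - y) * ((1 - x) * d2a + x * d2c)   # growth of miner 2's cooperation
        x, y = x + dt * dx, y + dt * dy
    return x, y


# Hypothetical Situation 2 parameters: both (0, 0) and (1, 1) are stable.
PARAMS = dict(a1=0.6, lam=0.5, R=10.0, C1=7.0, C2=1.0, gamma=3.0, delta=0.2)

low_start = simulate(0.2, 0.2, PARAMS)    # starts with little cooperation
high_start = simulate(0.8, 0.8, PARAMS)   # starts with mostly cooperation
```

Under these assumed parameters, the run started at $(0.2, 0.2)$ ends near $(0, 0)$ and the run started at $(0.8, 0.8)$ ends near $(1, 1)$, which is the dependence on the initial state expected when two stable points coexist.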

Simulations for the Influence of Parameters
In the mining process, the main objective is to foster cooperation between the two miners; that is, the evolutionary stable strategy $(1, 1)$ is the ideal stable state. In this subsection, we discuss how the parameters influence both sides' strategic choices by analyzing the reward multiplier $\gamma$, the percentage $\delta$ of additional rewards, and the ratio $\lambda$ of the two miners' computational powers.
1) The influence of $\delta$ on the evolutionary behavior of miners. To observe the impact of the additional rewards on the evolutionary behavior, we fix the other parameters and let $\delta$ take the values 0.1, 0.3, 0.5, and 0.7. The initial probabilities $x$ and $y$ take the values 0.2, 0.5, and 0.8. Figure 2(a) and Figure 2(b) show that the additional reward mechanism encourages miners to mine honestly: as the percentage $\delta$ of additional rewards increases, miner 1 and miner 2 evolve toward the cooperation strategy faster. However, once the additional rewards reach a certain amount, the marginal effect diminishes. A suitably chosen additional reward mechanism can therefore motivate miners to cooperate.
2) The influence of $\lambda$ on the evolutionary behavior of miners. To analyze the impact of the ratio $\lambda$ of the computational powers on the system evolution, we fix the other parameters and let $\lambda$ take the values 1/6, 3/6, 4/6, and 5/6. The initial probabilities $x$ and $y$ take the values 0.2, 0.5, and 0.8. Figure 3(a) shows that when the initial probability is low, the difference in computational power between the two miners has a great impact on the convergence speed of miner 1's strategy. As the initial probability increases, this difference has less influence on the convergence speed of miner 1's strategy. In contrast, as shown in Figure 3(b), changes in $\lambda$ always have a significant impact on miner 2's strategy choice. When $\lambda = 1/6$, namely when the computational power of miner 2 is very small, miner 2 adopts the BWH attack. As $\lambda$ increases, miner 2 turns from the BWH attack to honest mining. The higher a miner's computational power, the faster its strategy converges to cooperation. Because miner 1 has the higher computational power, it evolves to cooperation faster than miner 2. As a result, the higher the computational power of a miner, the greater the probability that it adopts the cooperation strategy.
3) The influence of $\gamma$ on the evolutionary behavior of miners. To observe the impact of the reward multiplier $\gamma$ on the evolutionary behavior, we fix the other parameters and let $\gamma$ take the values 1.5, 2, 2.5, and 3. The initial probabilities $x$ and $y$ take the values 0.2, 0.5, and 0.8. As the reward multiplier under cooperation increases, the miners' strategies converge to cooperation more quickly. More precisely, Figure 4 shows that the reward multiplier has a significant effect on miner 2. When $\gamma = 1.5$, it takes a long time for miner 2 to evolve into a cooperator. As $\gamma$ increases, miner 2 turns to the cooperation strategy regardless of its initial strategy. Comparing Figure 4(a) with Figure 4(b), the increased revenue from cooperation is an incentive for both parties, but the effect on the miner with lower computational power, here miner 2, is more pronounced.
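A parameter sweep of this kind can be reproduced in outline as follows. The sketch measures how long the replicator dynamics need until both cooperation probabilities exceed 0.99; the baseline values, the threshold, and the helper names are hypothetical, since the fixed parameters behind the figures are not stated here, so the numbers it produces are illustrative only.

```python
def payoff_gaps(a1, lam, R, C1, C2, gamma, delta):
    """Gain from cooperating instead of attacking, for each miner and opponent move."""
    a2 = lam * a1
    d1_att = a1 * delta * R + (1 - delta) * a1 ** 2 * R - C1 + C2
    d1_coop = a1 * gamma * R - (1 - delta) * a1 * a2 * R - C1 + C2
    d2_att = a2 * delta * R + (1 - delta) * a2 ** 2 * R - C1 + C2
    d2_coop = a2 * gamma * R - (1 - delta) * a1 * a2 * R - C1 + C2
    return d1_att, d1_coop, d2_att, d2_coop


def time_to_cooperation(params, x0=0.5, y0=0.5, dt=0.01, max_steps=100000, level=0.99):
    """Time until both x and y exceed `level`, or None if that never happens."""
    d1a, d1c, d2a, d2c = payoff_gaps(**params)
    x, y = x0, y0
    for step in range(max_steps):
        if x > level and y > level:
            return step * dt
        dx = x * (1 - x) * ((1 - y) * d1a + y * d1c)
        dy = y * (1 - y) * ((1 - x) * d2a + x * d2c)
        x, y = x + dt * dx, y + dt * dy
    return None


def sweep(name, values, base):
    """Convergence time for each value of one parameter, all others held fixed."""
    return {v: time_to_cooperation({**base, name: v}) for v in values}


# Hypothetical baseline in which cooperation is the better reply for both miners.
BASE = dict(a1=0.6, lam=0.5, R=10.0, C1=2.0, C2=1.0, gamma=1.5, delta=0.3)

gamma_times = sweep("gamma", [1.5, 2.0, 2.5, 3.0], BASE)
```

Under this assumed baseline, the convergence time shrinks as $\gamma$ grows, in line with the trend described for Figure 4. The same `sweep` helper can be pointed at `"delta"` or `"lam"` to explore the other two parameters.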

Conclusions
In this paper, we study the evolution of miners' decisions on adopting the BWH attack by leveraging evolutionary game theory. We focus on the evolutionary process of the game and, through numerical simulation, on the influence of different parameters on the ideal stable state.
In a mining pool, the manager wants to mitigate the BWH attack and to help the miners develop a stable cooperative relationship, so as to improve the revenue of the pool, reduce the waste of computational power, and increase the efficiency of mining. Each miner chooses whether to cooperate or attack by comparing its computational power with the other's. From the simulations in Section 3, we can make some recommendations for the manager and the miners. First, a miner with more computational power is better off cooperating. Second, if the reward of the mining pool is enlarged more under the strategy profile (C, C), then cooperation is the better choice for each miner. Third, a proper additional reward mechanism helps the pool manager encourage miners to mine honestly. This paper has limitations: the model covers only the case of two miners, and we will study models with more miners in future work.