Prisoner’s Dilemma and Its Influence on Blockchain Consensus Models

Game Theory is the study of strategic decision-making. It is central to how blockchain systems are designed so that anonymous nodes all over the world can work together for the common good of a network. Essentially, you want to design the system so that everyone acting in their own self-interest ends up helping the network.

The Prisoner’s Dilemma is a classic example from Game Theory. Tokyo and Rio are both caught at the crime scene and brought in for questioning. Under the law in this imaginary world, there are four possible outcomes: (1) Tokyo and Rio both confess, (2, 3) one of them confesses and the other does not, and (4) neither confesses. Each person’s punishment depends on who does what. For example, if Rio confesses and makes a deal with the prosecutor while Tokyo stays silent, Rio walks free and Tokyo takes the blame. If they both refuse to talk, they both go to jail for not cooperating. If they both confess, they both get a lenient sentence for wrapping the whole thing up quickly. You can see all of the outcomes below in the pay-off table.

The numbers in the table denote the pay-off, which here is the number of years each person is sentenced to jail. The years are written as negative numbers because the goal in a pay-off table is always to maximize the pay-off: the shorter the sentence, the higher the pay-off for that individual.

The worst outcome is highlighted in red, the one-sided outcomes in yellow, and the best outcome in green. In a perfect world for Rio, he confesses and hopes that Tokyo doesn’t. If that happens, he gets to walk free. But what if Tokyo also confesses? Then they both serve a sentence, although a lenient one. In this imaginary legal scenario, the laws are designed to make confessing to your crime the ideal choice. It saves everyone time and you get the most lenient sentence. Sure, you could roll the dice and stay silent, but the consequences could be dire. So the system encourages a specific outcome, one that is good for the people operating on the side of the law.
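The reasoning above can be checked with a small sketch. The article's original table is not reproduced here, so the jail terms below are hypothetical numbers chosen only to match the outcomes described in the text, and the names `PAYOFFS`, `best_response_for_rio` and `dominant_strategy_for_rio` are illustrative.

```python
# Hypothetical payoffs in years of jail (negative, so higher is better).
# These numbers are assumptions, not the values from the original table.
CONFESS, SILENT = "confess", "silent"

# PAYOFFS[(rio_choice, tokyo_choice)] = (rio_payoff, tokyo_payoff)
PAYOFFS = {
    (CONFESS, CONFESS): (-1, -1),   # both confess: lenient sentence
    (CONFESS, SILENT): (0, -10),    # Rio walks free, Tokyo takes the blame
    (SILENT, CONFESS): (-10, 0),    # Tokyo walks free, Rio takes the blame
    (SILENT, SILENT): (-5, -5),     # both jailed for not cooperating
}


def best_response_for_rio(tokyo_choice):
    """Return the choice that maximizes Rio's payoff given Tokyo's choice."""
    return max((CONFESS, SILENT), key=lambda rio: PAYOFFS[(rio, tokyo_choice)][0])


def dominant_strategy_for_rio():
    """Return Rio's choice if it is best against every Tokyo choice, else None."""
    responses = {best_response_for_rio(tokyo) for tokyo in (CONFESS, SILENT)}
    return responses.pop() if len(responses) == 1 else None


if __name__ == "__main__":
    for tokyo in (CONFESS, SILENT):
        print(f"If Tokyo chooses {tokyo}, Rio should {best_response_for_rio(tokyo)}")
    print("Dominant strategy for Rio:", dominant_strategy_for_rio())
```

With these assumed numbers, confessing is Rio's best response whatever Tokyo does, which is the sense in which the rules push both players toward the same choice.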

Let’s translate this to a blockchain. How does the blockchain network set up a system so that the best outcome for the individual is also the best outcome for the network? The most direct parallels we can see are in mining and block rewards.

In the Bitcoin network, miners receive a bitcoin reward for successfully compiling a block of transactions. All of the miners in the network are simultaneously trying to do the same thing. A key aspect here is that rewards are only recognized on the longest chain, meaning that if you somehow cheat on your block and the blocks that follow do not build on it, your reward is gone. This consensus mechanism is known as Proof-of-Work (PoW): the network checks each miner’s work, valid work is rewarded, and invalid work earns nothing.

All that being said, let’s start with the worst outcome. If everyone is cheating constantly, throwing together any old invalid block, then they’re all wasting their time. There will be no valid chain and no valid rewards. Everyone loses. On the exact opposite end of the spectrum, if everyone works honestly, then everyone has a fair shot at the reward. Nobody wastes their effort and, over time, you’re bound to see a fair share of the rewards. The edge case is a single individual trying to cheat. If the majority of the network is honest, then by cheating you simply improve everyone else’s chance at the reward while removing your own. Your reward drops to zero and the rest of the network’s chances improve. The Bitcoin network has created a system where any amount of cheating becomes a big waste of time. The only chance you have of getting a reward for your actions is to work in the best interest of the network.
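The three cases above (everyone cheats, everyone is honest, one miner cheats) can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not how Bitcoin actually computes rewards: the function name `mining_payoffs` is hypothetical, every miner is assumed to have equal hash power, the cost of mining is ignored, and the reward is expressed as an expected share of one block reward.

```python
def mining_payoffs(strategies, block_reward=1.0):
    """Expected reward per miner for one block, in a toy model.

    strategies: list of "honest" or "cheat", one entry per miner.
    Assumptions (illustrative only): equal hash power for every miner,
    an invalid block never earns a reward, and the honest miners share
    the expected reward equally.
    """
    honest_count = strategies.count("honest")
    if honest_count == 0:
        # Everyone cheats: no valid chain, so nobody is rewarded.
        return [0.0] * len(strategies)
    share = block_reward / honest_count
    return [share if s == "honest" else 0.0 for s in strategies]


if __name__ == "__main__":
    print(mining_payoffs(["honest"] * 4))               # everyone honest
    print(mining_payoffs(["honest"] * 3 + ["cheat"]))   # one cheater
    print(mining_payoffs(["cheat"] * 4))                # everyone cheats
```

In this model a lone cheater's expected reward falls to zero while each honest miner's share rises from one quarter to one third, matching the argument in the text.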

The payoff table for this model is given below.

Ethereum takes it a step further by implementing a Proof-of-Stake (PoS) consensus mechanism. In this model, participants have to stake some amount of their own ether (ETH) to become a validator. The PoS mechanism not only rewards good behavior but also punishes bad behavior in the network. If the network detects dishonest behavior by a validator, such as trying to get invalid or conflicting blocks into the chain, the PoS mechanism punishes that validator by deducting a set amount from their staked ETH, a penalty known as slashing. Hence the validator has to pay for their mistakes or dishonest behavior, literally! So this consensus mechanism pushes everyone on the network to be on their best behavior to achieve the best outcome for themselves and for the network as a whole.
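The difference between the two incentive schemes can be sketched as a pair of payoff functions. The names `pow_payoff` and `pos_payoff` and all default values are hypothetical; they are not Ethereum's or Bitcoin's actual parameters, and the cost of running a node is ignored.

```python
def pow_payoff(honest, reward=1.0):
    """Toy Proof-of-Work payoff: cheating simply forfeits the reward."""
    return reward if honest else 0.0


def pos_payoff(honest, reward=1.0, stake=10.0, penalty=5.0):
    """Toy Proof-of-Stake payoff: cheating also costs part of the stake.

    The penalty is capped at the stake, since a validator cannot lose
    more than it has locked up. All values are illustrative assumptions.
    """
    if honest:
        return reward
    return -min(penalty, stake)


if __name__ == "__main__":
    for honest in (True, False):
        label = "honest" if honest else "cheat"
        print(f"{label}: PoW {pow_payoff(honest)}, PoS {pos_payoff(honest)}")
```

In this sketch a cheater under PoW ends up with nothing, while a cheater under PoS ends up with a negative payoff, which is the "rewards good behavior and punishes bad behavior" distinction drawn above.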

The payoff table for this model is given below.

Again, none of these concepts is particularly new; blockchains like Bitcoin and Ethereum have simply employed them in new ways. Game Theory has successfully incentivized a network of thousands of nodes to work together despite not knowing each other at all!

– By Samarth C Swamy, Third Year Department of Mechanical Engineering
