Science & Morality

The Prisoner’s Dilemma and The Evolution of Morality

Brian King seeks the possible evolution of morality through computer simulations.

The Prisoner’s Dilemma is a game that you win by getting the lowest number of years in jail. Imagine you are one of two people who have robbed a diamond merchant. You and your accomplice are both apprehended by the police, and held in separate cells for questioning. The investigating officer offers both of you the same deal. If you confess and thereby incriminate the other, and the other remains silent, then for helping with enquiries your sentence will be reduced to one year, and your silent accomplice will get four years. If you both remain silent you will both get two years. Yet if you both confess and incriminate each other, you will each receive three years. You must either betray your accomplice or remain silent.

In this game a person who remains silent is known as the co-operator since they co-operate with the other prisoner. The one who confesses, and in doing so incriminates the other, is known as the defector since they betray the other prisoner to try to get a lighter sentence. Try playing this game with someone, keep the score and see if you can come up with a strategy for success – meaning the smallest sentence you can get.

Relevance of the Game

In the game it seems that whatever your partner (opponent) does, it pays you to defect. If he or she co-operates (is silent) then your own confession will earn you just one year in prison whilst he or she gets four years; if, on the other hand, he or she defects and snitches on you, then it’s just as well that you defected too, since you will then get three years as opposed to the four you would have received had you remained silent. If you remain silent, you could either get four years if your partner confesses, or two years if they also remain silent. And the same applies to your accomplice. So whilst each individually would be better off to defect, you would together be better off to co-operate – remain silent – receiving two years each, a total of only four years rather than the six you would serve between you if you both defected. We can see that co-operation means forfeiting the best outcome (one year) for yourself so that both of you can get second best (two years). Your co-operation means sacrificing some of your benefits for the sake of others, and you also risk being duped by an aggressive defector.
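The reasoning above can be checked mechanically. The following is a minimal sketch in Python, using only the sentences given in the story; the names `SENTENCE`, `best_reply` and `joint_sentence` are invented here for illustration.

```python
# Sentences in years, indexed by (my_move, partner_move); lower is better.
# "C" = co-operate (stay silent), "D" = defect (confess).
SENTENCE = {
    ("C", "C"): 2,
    ("C", "D"): 4,
    ("D", "C"): 1,
    ("D", "D"): 3,
}


def best_reply(partner_move):
    """Return the move that minimises my own sentence against a fixed partner move."""
    return min("CD", key=lambda my_move: SENTENCE[(my_move, partner_move)])


def joint_sentence(move_a, move_b):
    """Total years served by both prisoners together."""
    return SENTENCE[(move_a, move_b)] + SENTENCE[(move_b, move_a)]


for partner in "CD":
    print(f"partner plays {partner}: my best reply is {best_reply(partner)}")
print("joint years if both co-operate:", joint_sentence("C", "C"))
print("joint years if both defect:", joint_sentence("D", "D"))
```

Defection comes out as the best reply to either move, yet the pair serve fewer years in total when both stay silent, which is the dilemma in miniature.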

The scoring is arbitrary except for the following: the temptation of defecting while your opponent co-operates should be the highest score (lowest prison sentence) and the second best option needs to be where both co-operate (second lowest prison sentence), the third best option is where both defect and the worst option needs to be where you co-operate when your opponent defects (often called the sucker’s payoff).
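That ordering is all that matters, and it can be written as a one-line test. This is a sketch only: the function name is hypothetical, and the labels `reward` (both co-operate) and `punishment` (both defect) are conventional shorthand rather than terms used in this article. Because the scores here are prison sentences, lower numbers are better.

```python
def is_prisoners_dilemma(temptation, reward, punishment, sucker):
    """Return True if four sentences (in years, lower is better) form a Prisoner's Dilemma.

    temptation: I defect while my opponent co-operates
    reward:     we both co-operate
    punishment: we both defect
    sucker:     I co-operate while my opponent defects
    """
    return temptation < reward < punishment < sucker


print(is_prisoners_dilemma(1, 2, 3, 4))   # the sentences used in this article
print(is_prisoners_dilemma(0, 1, 5, 10))  # a different scale with the same ordering
print(is_prisoners_dilemma(1, 3, 2, 4))   # mutual defection beats mutual silence
```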

Contemplating your options

Game theory is the study of strategic decision making through the mathematical analysis of such games. These days game theory is being applied in disciplines as diverse as economics, political science, military strategy, business studies, and psychology. The Prisoner’s Dilemma concerns keeping your punishment as low as possible but other games can be set up where the aim is to make as much gain as possible either at the expense of someone else, or by co-operating – as in the Hunter’s Share (see sidebar). Such games can be devised to stand for any situation where you might want to take advantage of someone else’s trust in you to gain some advantage; they concern the choice of whether to exploit your neighbours to your own advantage or to co-operate with them. A real-world example of the Prisoner’s Dilemma is an arms race between superpowers. Countries are jointly better off when they co-operate and avoid an arms race. Yet the dominant strategy, especially if there is no trust, is for each country to arm itself heavily.

The Prisoner’s Dilemma also works as a simplified model for competitive interactions between animals, including humans, in the Darwinian struggle for survival of the fittest. If you benefit by either co-operating or defecting in competitive situations where there is a real struggle to survive, then doing so means there is an improvement in your chances of survival and reproduction. So in these basic terms, your chances of survival and reproduction, and so continuation of your genes, can be attributed to your propensity to either co-operate and pass on co-operative genes, or to be ruthlessly selfish (that is, defect) so that the next generation, with similar nasty genes, will also have a higher tendency to be nasty defectors. If defection is always the best strategy in situations of comparative advantage, then all animals, including humans, should have evolved into non-co-operative nasty beings.

But it is clear that in fact many species, including humans, do co-operate – so the question is: How could co-operation have evolved in the elemental struggle for survival? How could it ever be rationally the best action to cooperate? Either there are various solutions to the Prisoner’s Dilemma or similar models that might explain the success of co-operation in the evolution of life, or there is another explanation for the origins of various altruistic types of behaviour (or both!).

Solutions to the Game of Life

The Prisoner’s Dilemma and similar games have been computerised so that they can be played enormous numbers of times, emulating long-term evolution, and strategies identified that seem advantageous in the long run. With survival and reproduction chances being linked directly to scores in these games, these strategies can be models for evolutionary trends too. In this way mathematicians have worked out that co-operation, and so perhaps even morality, can evolve naturally and rationally from selfish behaviour. There is no need for God or another all-powerful authority to impose co-operation on us or other animals.

In the 1970s, political scientist Robert Axelrod set a competition to discover which strategy would beat others in repeated Prisoner’s Dilemma games (see his The Evolution of Cooperation, 1984). The key strategies he and others discovered are as follows:

1) Always Defect

This seems to be the most rational strategy for a one-off Prisoner’s Dilemma game when you don’t know whether you can trust your opponent. Another example of a simple game is what Douglas Hofstadter (in Metamagical Themas, 1985) calls the Wolf’s Dilemma. Here people sit in separate rooms and are told to wait twenty minutes to gain a reward of $1,000 each, or alternatively, press a button before twenty minutes is up, which would give them $900 and stop everyone else getting anything. Although the best outcome would result from everyone waiting to collect the larger amount, you may wonder if someone might press the button prematurely – in which case you had better press yours first. You will quickly realise that others might be fearful of the same thing, and so will also be considering whether to press their button first. So in fact the button is pressed very quickly, and one person gets a reduced reward, and everyone else gets nothing. Rational thinking here promotes defection and has prevented the best outcome from occurring.
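The payoffs of the Wolf’s Dilemma as described here can be set out in a few lines of Python. This is only a sketch: the function name and its parameters are invented for illustration, and the rule that a tie goes to the first player in the list is an arbitrary assumption, since the description does not say what happens when two people press at the same moment.

```python
def wolfs_dilemma_payoffs(press_times, wait_minutes=20, full_reward=1000, early_reward=900):
    """Return each player's payout in dollars.

    press_times holds one entry per player: the minute at which that player
    presses the button, or None if the player waits out the full time.
    """
    early = [(time, player) for player, time in enumerate(press_times)
             if time is not None and time < wait_minutes]
    if not early:
        # Nobody pressed: everyone collects the full reward.
        return [full_reward] * len(press_times)
    # Assumption: the earliest press wins; ties go to the lowest player index.
    _, first = min(early)
    return [early_reward if player == first else 0
            for player in range(len(press_times))]


print(wolfs_dilemma_payoffs([None, None, None]))
print(wolfs_dilemma_payoffs([None, 5.0, 12.0]))
```

With these numbers, three patient players share $3,000 between them, whereas a single early press leaves $900 in total.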

‘Always defect’ at first seems to be the most successful strategy in the Prisoner’s Dilemma too; but in repeated Prisoner’s Dilemma games it sows the seeds of its own downfall. Those who defect soon become known as untrustworthy, so they are met with defection by other players and everyone ends up with lower scores than necessary – or in evolutionary terms, lower chances of survival and reproduction. Even worse for defectors is the fact that others might refuse to ‘play’ with them – the equivalent of social ostracism in the real world. Defection does seem to be an advantageous strategy when interacting with strangers, if you know you will not cross paths again. On the other hand, most social interaction is within settled communities of animals or people, and in these communities both defectors and co-operators are soon well-known.

2) Tit For Tat (Direct Reciprocity); or ‘You scratch my back and I’ll scratch yours’

“You lookin’ at me?” Bats: actually quite nice if you meet them in social situations
Bat © MathKnight 2014

In Axelrod’s trials, the winning program used a strategy in which a player co-operates on the first move and thereafter mimics the opponent’s previous decision – co-operating when the opponent co-operated, and betraying when the opponent betrayed. This strategy is known as ‘Direct Reciprocity’ or ‘Tit for Tat’.
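A repeated game of this kind is easy to sketch. The Python below is not Axelrod’s tournament code; it is a minimal illustration that reuses the prison sentences from the opening story as scores (so lower totals are better), and every name in it is invented here. Tit for Tat is assumed to open by co-operating.

```python
# Sentences in years, indexed by (my_move, opponent_move); lower is better.
SENTENCE = {("C", "C"): 2, ("C", "D"): 4, ("D", "C"): 1, ("D", "D"): 3}


def tit_for_tat(my_history, their_history):
    """Co-operate first, then copy whatever the opponent did last round."""
    return their_history[-1] if their_history else "C"


def always_defect(my_history, their_history):
    """Defect in every round regardless of what has happened."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play a repeated game and return the total years served by each player."""
    history_a, history_b = [], []
    years_a = years_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        years_a += SENTENCE[(move_a, move_b)]
        years_b += SENTENCE[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return years_a, years_b


print("Tit for Tat vs Tit for Tat:    ", play(tit_for_tat, tit_for_tat, 10))
print("Tit for Tat vs Always Defect:  ", play(tit_for_tat, always_defect, 10))
print("Always Defect vs Always Defect:", play(always_defect, always_defect, 10))
```

Over ten rounds in this sketch, Tit for Tat does slightly worse than a defector it meets head-to-head (31 years against 28), but two Tit for Tat players serve only 20 years each while two defectors serve 30 each. Tit for Tat does well overall by doing well with its own kind, not by beating any single opponent.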

A good example of this strategy in nature is the vampire bat. These bats require blood every three days or they will die. If a bat is unable to hunt one night, other bats may regurgitate some of their night meal so that the hungry bat can eat. Researchers have found that the bats are more likely to share their meals with bats that have previously fed them. The bats spend a long time grooming each other and it seems they can recognise all members of their colony. This example of direct reciprocity depends on the creatures not only recognising each other as separate individuals but also remembering acts of kindness, or acts of selfishness. So discrimination and memory, and also a settled, not-too-large community, are essential for such reciprocity to be evolutionarily advantageous.

Another example of direct reciprocity is that of ‘cleaning stations’ on coral reefs, where large fish go to be cleansed of parasites by smaller fish that they would normally eat. The cleaners even enter the mouths of the larger fish and clean their teeth. The smaller fish get a free meal, and the larger fish get cleaned; everyone’s a winner. If in this situation the larger fish ate the smaller fish, it would become a defector, and would not subsequently be cleaned, but this does not seem to happen.

In both these cases, one good turn deserves another and co-operation reigns. But paradoxically, in a world where everyone trusts everyone, the temptation to defect is greater partly because the widespread trust makes it easier to defect. If all cars were left unlocked, the opportunity and so temptation to steal one would be much greater. But in the tit for tat world if you do cheat – take advantage of others’ trust to gain a momentary success – there’s always the prospect of vengeful retaliation: if you defect then I will defect.

Tit for tat is successful because it is simple: if your opponent defects, he knows you will retaliate, and also that you will repay in kind if he co-operates. There is no build-up of grudges over a long term. However, if your opponent continues to defect then you will too, and so a ‘feud’ may develop. In fact, if both players adopt a tit for tat approach and through some mishap one thinks the other is defecting even when they are not, then the procedure will still end up as a continuous exchange of retaliation.
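The feud set off by a single misunderstanding can be traced step by step. The sketch below is an illustration under stated assumptions, not a reproduction of any published simulation: both players use tit for tat, both begin by co-operating, and in one chosen round player B wrongly believes that A defected. The function and variable names are hypothetical.

```python
def echo_after_misreading(rounds, misread_round):
    """Return the moves of two tit for tat players as strings such as "CD" (A's move, then B's).

    In round misread_round (counting from 0), B misreads A's move as a defection.
    """
    # What each player believes the other did in the previous round.
    seen_by_a = seen_by_b = "C"
    moves = []
    for round_number in range(rounds):
        move_a = seen_by_a  # tit for tat: copy what the other appeared to do
        move_b = seen_by_b
        moves.append(move_a + move_b)
        seen_by_a = move_b
        seen_by_b = "D" if round_number == misread_round else move_a
    return moves


print(echo_after_misreading(rounds=6, misread_round=1))
```

After the misreading in this sketch the two players never return to mutual co-operation; each round one of them punishes the other for the previous round’s punishment.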

3) ‘Generous Tit For Tat’ and ‘Win Stay, Lose Shift’

In his 2011 book SuperCooperators, Martin Nowak of Harvard University reports trying to make Axelrod’s computer simulations more realistic by introducing occasional mistakes – to emulate when in the real world someone misconstrues what someone else is doing, for example. He found that a generous form of tit for tat, in which retaliation only occurs randomly and occasional defections are forgiven, is a better long-term strategy. This approach prevented the process sliding into continual retaliation. However, it evolved into an even more generous strategy, which thrived among co-operative players but also became an open invitation for the nasty defectors to come back in and have a field day. There seemed to be no strategy that remained dominant for long in these computer simulations. But then Nowak developed a strategy which was, in essence: if I am doing well then I’ll repeat my move, and if I am doing badly I’ll change my move – or as he called it, the ‘Win Stay, Lose Shift’ strategy. This strategy generally did well, but not against those who continually defect. What happened in his computer simulations was that tit for tat cleared out the nasty defectors, and then Win Stay, Lose Shift came in and won.
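Both strategies can be written as short rules. The Python below is a sketch rather than Nowak’s own code. It assumes that ‘doing well’ means the opponent co-operated in the last round (a sentence of one or two years in the opening story), that both strategies open by co-operating, and that the forgiveness probability of 0.3 is an arbitrary illustrative value. All names are invented here.

```python
import random


def generous_tit_for_tat(their_last, forgiveness=0.3, rng=random):
    """Co-operate by default; after a defection, retaliate only some of the time."""
    if their_last == "D" and rng.random() >= forgiveness:
        return "D"
    return "C"


def win_stay_lose_shift(my_last, their_last):
    """Repeat my last move after a good round, switch after a bad one."""
    if my_last is None:
        return "C"  # assumption: open by co-operating
    if their_last == "C":
        return my_last  # a 'win': keep doing the same thing
    return "D" if my_last == "C" else "C"  # a 'loss': change move


def wsls_pair_after_slip(rounds):
    """Two Win Stay, Lose Shift players, starting just after A defects by mistake."""
    move_a, move_b = "D", "C"
    moves = [move_a + move_b]
    for _ in range(rounds - 1):
        move_a, move_b = (win_stay_lose_shift(move_a, move_b),
                          win_stay_lose_shift(move_b, move_a))
        moves.append(move_a + move_b)
    return moves


def wsls_against_defector(rounds):
    """Moves made by Win Stay, Lose Shift against a player who always defects."""
    mine, moves = None, []
    for _ in range(rounds):
        mine = win_stay_lose_shift(mine, "D")
        moves.append(mine)
    return moves


print(wsls_pair_after_slip(4))
print(wsls_against_defector(4))
```

In this sketch a pair of Win Stay, Lose Shift players is back to mutual co-operation two rounds after a slip, whereas against a permanent defector the strategy co-operates every other round and is exploited each time, which fits the weakness described above.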

An example of this in the real world is provided by sticklebacks, which can leave their shoal and go off in pairs to scout to see if local predators such as pike are hungry. When they inspect the pike the two sticklebacks approach in short spurts, and then dash back if the pike moves. They play a version of the Prisoner’s Dilemma, as each fish either co-operates by volunteering to move forward, or defects by hanging back, exposing the other fish. In an experiment with carefully positioned mirrors, each stickleback had its own reflection as a companion. It was discovered that by tilting the mirror so that the companion (reflection) could be made to move forward or retreat, the real fish would switch back and forth between co-operation and defection, just as in a ‘Win Stay, Lose Shift’ strategy. The fact that the fish does not understand what it is doing is beside the point: such reciprocity requires no conscious decisions (as the computer simulations suggest). Rather the strategy is worked out through natural selection, which gradually programs the fish to respond appropriately to environmental stimuli.

Other strategies have been developed, such as ‘Firm but Fair’, which is both better and nicer than ‘Win Stay, Lose Shift’. The key point is that computer simulations have shown that evolution does not necessarily produce nasty selfish beings, and have also shown how characteristics such as generosity or forgiveness could be evolutionary products.

4) Reputation (Indirect Reciprocity); or ‘I scratch your back and someone else will scratch mine’

One problem with the above scenarios is that real-world interaction does not often come in pairs. We live in much larger groups, and although the above strategies work well in small groups, larger groups depend on indirect reciprocity, whose importance can be seen by considering reputation and its transmission via small talk (or gossip). In fact, reputation is crucial, as we all depend on others that we hardly know (or don’t know at all) for many aspects of our lives.

We usually want others to see us as good people, and we want to develop this good reputation so that others trust us. And there is a clear link between the development of co-operation and the evolution of empathy, which is a way of understanding the motivations and intentions of others. Primates, such as humans and (other) apes, have developed mirror neurons, which help us to appreciate what others are feeling or thinking: they enable us to place ourselves in others’ shoes. In fact, the evolution of co-operation in humans seems to have gone hand in hand with the evolution of the brain; the most conspicuous brain development since the appearance of hominids has been the development of the prefrontal cortex, just behind your forehead. The function of the prefrontal cortex is almost entirely connected with our social behaviour and our concerns about how others might judge us.

Once again Nowak pioneered new approaches in this area in computer simulation terms. In a new program he made two changes. One was to bring in numerous players so that everyone plays against different partners (so they cannot retaliate against a previous defector). The other was that each player develops an ‘image’ or ‘reputation’ gauged by how often they defect or co-operate. To make this scenario even more realistic, Nowak made subsets of people privy to these images, to mirror the real-life fact that gossip tends to be within small groups of people. Unsurprisingly he found that a simulated player’s responses depended not only on what another player had done to him in the past, but also on what he had done to others. Moreover, good reputations spread through the (virtual) society and increase the chances of co-operation; and bad players with poor reputations receive less help and others refuse to play with them. In the words of one of Tom Lehrer’s songs, “Be careful not to do/Your good deeds when there’s no one watching you”. Also, if you defect against someone with a bad reputation, it does not harm yours, but if you defect against someone with a good reputation, it does adversely affect yours.
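The reputation rules in this paragraph can be put into a small sketch. It is an illustration, not Nowak’s program: the scoring scale (everyone starts at zero, a score of zero or above counts as good), the three named players and the fixed order of encounters are all assumptions made here, and for simplicity every player can see every reputation.

```python
def updated_reputation(donor_rep, recipient_rep, donor_helped):
    """Return the donor's new reputation after one encounter."""
    if donor_helped:
        return donor_rep + 1
    if recipient_rep >= 0:
        return donor_rep - 1  # refusing someone in good standing costs reputation
    return donor_rep  # refusing a known defector is not held against the donor


def discriminator(recipient_rep):
    """Help only those whose reputation is good."""
    return recipient_rep >= 0


def defector(recipient_rep):
    """Never help anyone."""
    return False


def run(encounters, strategies):
    """Play a list of (donor, recipient) encounters; return reputations and help received."""
    reputation = {name: 0 for name in strategies}
    help_received = {name: 0 for name in strategies}
    for donor, recipient in encounters:
        helped = strategies[donor](reputation[recipient])
        if helped:
            help_received[recipient] += 1
        reputation[donor] = updated_reputation(
            reputation[donor], reputation[recipient], helped)
    return reputation, help_received


strategies = {"ann": discriminator, "bob": discriminator, "cat": defector}
encounters = [("cat", "ann"), ("ann", "bob"), ("bob", "cat"),
              ("ann", "cat"), ("cat", "bob"), ("bob", "ann")]
reputation, help_received = run(encounters, strategies)
print(reputation)
print(help_received)
```

In this small run the defector’s reputation falls and she receives no help, while the two discriminating players help each other and lose nothing by refusing her.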

5) Altruism

One of the big problems for evolutionary theory is how to explain the evolution of altruism – that is, sacrificial co-operation when you have no or little possibility of recouping your losses. In the competitive struggle for survival you would expect altruism to be suicidal. In recent work on group selection it has been found, again using computer modelling, that although being overgenerous does decrease your chances of survival and reproduction within a group, there is a different dynamic between groups. When groups are in conflict (at war) with other groups, those groups with a larger proportion of altruistically-oriented members do best. The more individually selfish groups fail in intergroup rivalry, and so selfish organisms die out particularly in times of intergroup conflict. This could explain how self-sacrificial altruism developed. Paradoxically, it might be that the highest human virtue evolved because of war.

Conclusion

Some say that if co-operative virtues have evolved through self-interest, this suggests that morality is simple egotism. However, this is to misunderstand what’s going on. The genesis of morality may be due to long-term genetic self-interest, but the point is that the genetic predisposition to act in certain (successful) ways works in human beings through moral sentiments, such as a sense of fair play (or conversely, of injustice), a sense of responsibility, a sense of guilt or compassion. So the moral instincts and the resulting behaviour have become part of our biological nature; and it’s not selfish to act out of a genuine sense of compassion or fair play. To argue that because the origins of morality are through self-interest therefore morality is just self-interest is an example of the genetic fallacy – the idea that the evolutionary origins of something determine its present nature and value.

People can, of course, pretend to be generous and co-operative in order to fool others as to their intentions and trustworthiness and then take advantage of them. This is clearly egotistical behaviour, but it is not what the evolutionary game theory story above is explaining – which is the possible genesis of moral behaviour, not the genesis of cheating. However, arguably, cheating is parasitic on morality, since you can only cheat if there is already some kind of trust operating.

Brian King is a retired Philosophy and History teacher. He has published an e-book, Arguing About Philosophy, and now runs adult Philosophy and History groups via the University of the Third Age.

The Hunter’s Share

This game is a little more life-like than the Prisoner’s Dilemma, as it involves more than two protagonists. The aim is to gain the most points, or ‘food’, after ten rounds or so. Imagine you are one of three hunters. In each round of the game only one succeeds in catching prey, and this is determined randomly by rolling dice, so that for each round it’s not predictable who catches it. The number on the dice also determines the size of the catch – so that a ‘two’ is a poor catch, and a ‘three’ is only half of a ‘six’, and so on. The catch is then shared out. You the hunter can keep all of it for yourself, giving no points to the others; or share it out any way you like, either equally or unequally between the competitors.

Play the game with some friends and see who wins. What strategy has proved most successful? Why?
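If you have no friends to hand, the game can be roughed out on a computer. The sketch below is one possible reading of the rules, not an official version: it assumes the successful hunter is picked at random with equal chances, that a separate six-sided die sets the size of the catch, and that each hunter follows a fixed sharing rule. The two rules shown, and all the names, are invented for illustration.

```python
import random


def keep_everything(catch, hunter, n_hunters):
    """The successful hunter keeps the whole catch."""
    shares = [0.0] * n_hunters
    shares[hunter] = float(catch)
    return shares


def share_equally(catch, hunter, n_hunters):
    """The successful hunter splits the catch evenly among all hunters."""
    return [catch / n_hunters] * n_hunters


def play_hunters_share(strategies, rounds=10, seed=None):
    """Return each hunter's total food after the given number of rounds.

    strategies holds one sharing rule per hunter; the rule of whoever makes
    the catch decides how that round's catch is divided.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(strategies)
    for _ in range(rounds):
        hunter = rng.randrange(len(strategies))  # assumption: equal chance for each hunter
        catch = rng.randint(1, 6)                # size of the catch from one die roll
        shares = strategies[hunter](catch, hunter, len(strategies))
        for index, share in enumerate(shares):
            totals[index] += share
    return totals


print(play_hunters_share([keep_everything, share_equally, share_equally], rounds=10, seed=1))
```

The fixed rules here take no account of how the other hunters behaved in earlier rounds. A sharing rule that rewards past generosity and punishes past meanness would be the natural next thing to try, and is where the strategies discussed in the main article come in.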