Sometimes a game has no Nash equilibrium in pure strategies, so methods such as iterated removal of dominated strategies will not find one. In a finite game, however, there is always an equilibrium in which one or more of the players mix their strategies.

To see what I mean, consider the following game where the row player can choose between the strategies **U** and **D**, while the column player can choose between **L** and **R**:

Neither U nor D dominates the other, and the same holds for L and R. In fact, for any cell in the matrix there is one player who will want to move away from it. (Say the strategy combination U-R is picked first. Now the column player is satisfied, but the row player thinks they could do better by changing to D. Then the column player will want to change to L, so the row player changes back to U, and …)
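The cycle described above can be checked mechanically. Here is a minimal sketch that tests every cell for a profitable unilateral deviation. The payoff table is my reconstruction from the expected-utility equations further down (the matrix figure itself is missing), and the names `PAYOFFS` and `pure_nash_equilibria` are made up for illustration.

```python
# Payoffs as (row player, column player). Assumption: these values are
# reconstructed from the expected-utility equations later in the text.
PAYOFFS = {
    ("U", "L"): (2, -3),
    ("U", "R"): (1, 2),
    ("D", "L"): (1, 1),
    ("D", "R"): (4, -1),
}
ROWS = ("U", "D")
COLS = ("L", "R")


def pure_nash_equilibria(payoffs, rows, cols):
    """Return all cells where neither player gains by deviating alone."""
    equilibria = []
    for r in rows:
        for c in cols:
            row_u, col_u = payoffs[(r, c)]
            # Row player compares against the other rows in the same column.
            row_ok = all(payoffs[(r2, c)][0] <= row_u for r2 in rows)
            # Column player compares against the other columns in the same row.
            col_ok = all(payoffs[(r, c2)][1] <= col_u for c2 in cols)
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria


print(pure_nash_equilibria(PAYOFFS, ROWS, COLS))  # prints []
```

The empty list says the same thing as the walk-through: in every cell at least one player wants to move.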

But what if strategies could be mixed, in the sense that players pick one at random according to some distribution? Then, if the same game is played over and over, the players would want to do well on average, and tune the probability with which they choose between strategies accordingly.

Could the *expected* utility be such that there's a Nash equilibrium (that is, a state where no player would want to switch to another of their strategies)?

The strategies U, D and L, R are called *pure* strategies, while a random combination of a player's pure strategies is called *mixed*. A mixed strategy is essentially a probability distribution over that player's pure strategies.

Let's assume that the row player chooses U with probability \(p\), then they must pick the other, D, with probability \(1-p\).

In the same way, assume that the column player picks L with probability \(q\) and R with probability \(1-q\).

Like this:

Our task then is to set up some kind of equation system and solve for \(q\) and \(p\).

The key observation is that *for a player to be indifferent to the available strategies, the expected utility of those strategies must be equal*. This means that the row player should determine \(p\) such that their opponent (the column player) expects the same reward from both their available strategies (L and R).

If this isn't the case, then there is always a 'better' choice for the opponent. So, *to determine the probability distribution for one player we look for the point at which the opposing player is indifferent*.

In this spirit, let's start by determining the probabilities \(q\) and \(1-q\) for the column player strategies. The above means that for the row player the following must hold \[E\left[U\right] = E\left[D\right].\]

So, what is \(E\left[U\right]\)? Well, with probability \(q\) the column player chooses L, in which case the utility for the row player will be \(2\), and with probability \(1-q\) the column player chooses R, in which case the utility for the row player will be \(1\).

So, if a game is played over and over the row player can *expect* the following reward on average from strategy U.

\[ E\left[U\right] = 2\times q + 1\times\left(1 - q\right) = q + 1. \]

Using the same reasoning we can see that the row player's expected utility for strategy D is:

\[ E\left[D\right] = 1\times q + 4\times\left(1 - q\right) = 4 - 3q. \]

Now, because the row player must be indifferent we know that \[ q + 1 = 4 - 3q, \] and solving it gives \(q = \frac{3}{4}\).

And here is our answer then. The probability for strategy L is \(q = \frac{3}{4}\), while the probability for strategy R is \(1-q = 1 - \frac{3}{4} = \frac{1}{4}\).
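The same calculation can be done with exact fractions. This is only a sketch: `indifference_probability` is a hypothetical helper that solves the general indifference equation \(a q + b(1-q) = c q + d(1-q)\), where \(a, b\) are the row player's payoffs for U and \(c, d\) those for D.

```python
from fractions import Fraction


def indifference_probability(a, b, c, d):
    """Solve a*q + b*(1-q) == c*q + d*(1-q) for q (hypothetical helper)."""
    denom = a - b - c + d
    if denom == 0:
        raise ValueError("no unique indifference point")
    return Fraction(d - b, denom)


# Row player's payoffs: U gives 2 (vs L) or 1 (vs R), D gives 1 (vs L) or 4 (vs R).
q = indifference_probability(2, 1, 1, 4)
expected_U = 2 * q + 1 * (1 - q)
expected_D = 1 * q + 4 * (1 - q)
print(q, expected_U, expected_D)  # prints 3/4 7/4 7/4
```

At \(q = \frac{3}{4}\) both of the row player's strategies are worth \(\frac{7}{4}\) on average, which is exactly the indifference we asked for.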

For completeness we then do the same for the row player; calculating \(p\), and \(1-p\). Now we need to see what the expected reward is for the column player.

\[ E\left[L\right] = -3\times p + 1\times\left(1 - p\right) = 1 - 4p, \] \[ E\left[R\right] = 2\times p + (-1)\times\left(1 - p\right) = 3p -1. \]

Then requiring the column player to be indifferent \[E\left[L\right] = E\left[R\right],\] \[1 - 4p = 3p -1,\] from which we have \(p = \frac{2}{7}\).

So, the row player picks strategy U with probability \(p=\frac{2}{7}\), and strategy D with probability \(1-p = \frac{5}{7}\).

That's it. Again, note that the probability for a strategy is computed from the observation that the opposing player must be indifferent.
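As a sanity check, one can plug both probabilities back in and confirm that each player is indifferent between their pure strategies. The sketch below assumes the payoff table reconstructed from the equations above. The helper names `row_expected` and `col_expected` are mine.

```python
from fractions import Fraction

# Payoffs as (row player, column player). Assumption: reconstructed from
# the coefficients in the expected-utility equations above.
PAYOFFS = {
    ("U", "L"): (2, -3),
    ("U", "R"): (1, 2),
    ("D", "L"): (1, 1),
    ("D", "R"): (4, -1),
}

p = Fraction(2, 7)  # probability that the row player picks U
q = Fraction(3, 4)  # probability that the column player picks L


def row_expected(row, q):
    """Row player's expected utility for a pure strategy against the mix (q, 1-q)."""
    return q * PAYOFFS[(row, "L")][0] + (1 - q) * PAYOFFS[(row, "R")][0]


def col_expected(col, p):
    """Column player's expected utility for a pure strategy against the mix (p, 1-p)."""
    return p * PAYOFFS[("U", col)][1] + (1 - p) * PAYOFFS[("D", col)][1]


row_values = {r: row_expected(r, q) for r in ("U", "D")}
col_values = {c: col_expected(c, p) for c in ("L", "R")}

# Indifference: no pure strategy does better than any other against the mix.
print(row_values["U"] == row_values["D"], col_values["L"] == col_values["R"])
```

With these numbers the row player expects \(\frac{7}{4}\) from either strategy and the column player expects \(-\frac{1}{7}\) from either one, so neither can gain by shifting their own probabilities.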

The basis for this document is a game theory exercise session I taught for an AI course. The notes are quite verbose to start with, because I'm trying to explain every step extensively. Moreover some terms are not directly defined even though I try to explain things on a quite basic level. It was assumed that the students had attended an introductory lecture. So the game is on normal form, 'equilibrium' means Nash equilibrium, and so on. Also please note that in the text I use the terms reward, payoff, and utility interchangeably.

Basically, this is an in-depth example, not a formal introduction.

Iterated elimination is about removing strategies which are **dominated** by other ones. A player's strategy is dominated if all associated utility values (rewards) are *strictly less* than those of some other strategy (or a mixing of other strategies, but that can be left out for now).

Games between two players are often written in a so-called game matrix. The first (row) player's strategies are written as rows and the second (column) player's as columns. The utility/payoff outcome of each combination of strategies is then written in the cells of the matrix, so that the utility for the row player is the first value, and the utility for the column player is the second one.

I've marked it with colours in the example below, where the row player strategies and utilities are marked in blue, while the column player uses black.

Here the row player has three choices: A, B, or C, while the column player has three others: X, Y, or Z. In (basic) game theory the players are often assumed to be perfectly rational and self-interested: they want to do as well as possible for themselves, and don't care about others. Based on that we can ask if it is possible to exclude one or more of the strategies for one or both of the players, and in the process simplify the possible outcomes. When one strategy is removed, further simplification may become feasible, and so the whole process can be iterated.

To find *dominated* strategies - those which may be removed - we have to compare the utility values for the strategies for the respective player. Let's do that for the matrix above.

**The Row player** can choose any of the rows, and their utility is the first of each pair of values. First, compare **A** (values \(8,7,2\)) and **B** (values \(7,3,1\)): \(8 > 7\), \(7 > 3\), \(2 > 1\). Well, row B is smaller than A in every column. We say that A *dominates* B: If you think about it, this means that whatever strategy (X, Y, or Z) picked by the Column player, if the Row player has to choose between A and B - *it is always better to choose A*!

So we found one dominated strategy. Let's continue. **A** vs **C** : \(8 > 0\), \(7 >0\), \(2 <3\). So here we see that two components of A are greater than those of C, but one is not. Therefore neither does A dominate C, nor does C dominate A.

Then **B** vs **C**: \(7 > 0, 3 >0, 1<3\) - neither of them dominates the other.

**The Column player** can choose between X,Y, Z. So, now we compare the second values against each other for every column.

**X** vs **Y** : \(5>1, 4 > 3, 1 > 0\); so every value of X is greater than the corresponding value of Y. This means that X *dominates* Y. (Because for the column player it is always better to choose X over Y no matter what the row player does.)

**X** vs **Z** : \(5 < 7, 4 > 2, 1 < 5\) - no dominance for either strategy.

**Y** vs **Z** : \(1 < 7, 3 >2, 0 < 5\) - again no dominance.
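The pairwise comparisons above are easy to automate. Below is a small sketch of a strict-dominance check. The payoff table is rebuilt from the numbers quoted in the comparisons (the matrix figure is not reproduced here), and the function names are made up for illustration.

```python
ROW_STRATEGIES = ("A", "B", "C")
COL_STRATEGIES = ("X", "Y", "Z")

# Payoffs as (row player, column player). Assumption: rebuilt from the
# values quoted in the comparisons in the text.
PAYOFFS = {
    ("A", "X"): (8, 5), ("A", "Y"): (7, 1), ("A", "Z"): (2, 7),
    ("B", "X"): (7, 4), ("B", "Y"): (3, 3), ("B", "Z"): (1, 2),
    ("C", "X"): (0, 1), ("C", "Y"): (0, 0), ("C", "Z"): (3, 5),
}


def row_utility(row, col):
    return PAYOFFS[(row, col)][0]


def col_utility(col, row):
    return PAYOFFS[(row, col)][1]


def strictly_dominates(better, worse, opponents, utility):
    """True if `better` beats `worse` against every opponent strategy."""
    return all(utility(better, o) > utility(worse, o) for o in opponents)


def dominance_pairs(own, opponents, utility):
    """List (dominating, dominated) pairs among a player's own strategies."""
    return [
        (b, w)
        for b in own
        for w in own
        if b != w and strictly_dominates(b, w, opponents, utility)
    ]


print(dominance_pairs(ROW_STRATEGIES, COL_STRATEGIES, row_utility))  # [('A', 'B')]
print(dominance_pairs(COL_STRATEGIES, ROW_STRATEGIES, col_utility))  # [('X', 'Y')]
```

Using `>` rather than `>=` is what makes this a check for *strict* dominance.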

When we have found one or more dominated strategies these can be discarded from the game.

As we have found two of them in this iteration you may ask *'will it matter which one is removed first?'*

The answer is *no*, as long as there is a *strict dominance*. That is, the utility values of a dominated strategy must always be *strictly less* than those of the dominating one. (If all values are only less or equal it is called *weak dominance* and then the order can matter [so be careful!].) Here, we are only concerned with strict dominance.

Anyway, let's just cross out the dominated strategies from the table.

First **Y**:

Then **B**:

Now we are left with a simplified matrix, and can iterate the algorithm comparing the remaining utilities in the same way.

So compare **A** and **C** again: \(8 >0, 2 < 3\). Still no dominance here.

Then compare **X** and **Z**: \(5 < 7, 1 < 5\). Aha, so now Z *dominates* X. It didn't before, but as the row player has discarded strategy B, the need for that comparison is gone. So, column X can be removed:

Next iteration - we are left with, in effect, a 2x1 matrix. There is no choice left for the column player - it will always be best for them to choose strategy Z, but the row player may still pick either A or C. So, let's take this for another round and compare **A** and **C**: \(2 < 3\). So, this time around C *dominates* A, meaning A too can be removed:

We are now left with a single cell, and the knowledge that the optimal strategy for the Row player is **C** and the optimal strategy for the column player is **Z**. The associated utility is 3 for the Row player, and 5 for the Column player.

If you look at that reward you can see that no player is able to do better *on their own*. That is, the Row player would not rather choose A or B (given that the column player has chosen Z) and the column player would not rather choose X or Y (given that the row player has chosen C). (Sure, there are better rewards elsewhere, but that would require both players to agree to change, and once there at least one player can get an even better reward by choosing something else.)

This is iterative elimination which is used to simplify games. Note that this time it took us all the way to a specific combination of strategies which isn't always the case. Often you may be left with multiple strategies per player.
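For reference, the whole procedure fits in a few lines of code. This is a minimal sketch, not a general-purpose solver: it only handles strict dominance by pure strategies, the payoff table is rebuilt from the numbers used in the walk-through, and `iterated_elimination` is a name I made up.

```python
# Payoffs as (row player, column player). Assumption: rebuilt from the
# values quoted in the walk-through above.
PAYOFFS = {
    ("A", "X"): (8, 5), ("A", "Y"): (7, 1), ("A", "Z"): (2, 7),
    ("B", "X"): (7, 4), ("B", "Y"): (3, 3), ("B", "Z"): (1, 2),
    ("C", "X"): (0, 1), ("C", "Y"): (0, 0), ("C", "Z"): (3, 5),
}


def iterated_elimination(payoffs, rows, cols):
    """Repeatedly remove strictly dominated pure strategies for both players."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        # Row player: remove a row if some other row is better in every column.
        for r in list(rows):
            if any(
                all(payoffs[(r2, c)][0] > payoffs[(r, c)][0] for c in cols)
                for r2 in rows if r2 != r
            ):
                rows.remove(r)
                changed = True
        # Column player: remove a column if some other column is better in every row.
        for c in list(cols):
            if any(
                all(payoffs[(r, c2)][1] > payoffs[(r, c)][1] for r in rows)
                for c2 in cols if c2 != c
            ):
                cols.remove(c)
                changed = True
    return rows, cols


print(iterated_elimination(PAYOFFS, "ABC", "XYZ"))  # prints (['C'], ['Z'])
```

The code removes strategies in a slightly different order than the text does, but with strict dominance the order does not change what is left at the end.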


While updating these pages back last summer, I took a brief detour to write an exporter from jupyter notebooks to Emacs orgmode. I haven't written much about it here before, but jupyter notebook is one of my go-to tools for data analysis and prototyping.

At the same time, I use orgmode for note taking, journaling, these pages, and so on. Now and then it becomes necessary to convert from jupyter to org. Jupyter notebooks can be exported to org via the fantastic pandoc program. However, I wanted to be able to treat code blocks in jupyter as org-babel source, and also be able to choose how to import latex, markdown, and other content to the org-file. This lies outside the scope of pandoc, so I decided to write my own nbconvert exporter plugin, called nbcorg.

You can install it from PyPI using `pip install nbcorg`.

After installation, new org targets are available under the jupyter export menu option, as well as in the standard jupyter nbconvert command line program.

It's still in an early version and has been available on github and pypi since July, but I thought I'd finally mention it here as well.

In the darkening Finnish autumn of 2018 I decided to take part in a seminar on Digital Ethics at Aalto University. It proved to be a good decision - as you probably gather from these web pages, human relations with other computational processes is something I am always curious about - the course provided an interesting perspective based on sociology, design, law and ethics. It was also a lot of fun to engage in discussions with the other participants.

One term which kept coming back in the articles I read for the course was "governance". Very broad, clearly, and especially popular in circles discussing super-intelligent AI, but equally applicable to the wider spectrum of digital technology discussed during the seminar. Other terms that kept returning were "accountability", "fairness", "transparency", and so on. Terms related to technology performance, but performance not of a task, but in society.

I found that quite interesting, because as a species we humans are very dependent on our tools and technology. On a large scale, governance is about trying to steer that entwined and inseparable dynamic system of matter and thought.

On the largest scale it is about playing an infinite game.

But, on the scale of that seminar, of digital ethics, it is about those terms - how to build technology for today, and for next year, which is fair, transparent, accountable, and so on. This is equal portions business and engineering. As much judicial matters as choices of design.

So, for my course project I decided to look at governance of such technology intended to supplement or replace human functions (a traditional reason behind new technology). It resulted in a mini-review, where I tried to use proposed approaches to governance to identify concerns of performance, and finally to base a discussion of how performance along those axes may be a basis for how a technology adapts or disrupts society.

I presented the poster at the FCAI-days 2018. It has soon been a year since I presented the work which somehow reminded me to put it up here.

The site has been revamped, including the RSS. I hope I did not break your feeds.

I didn't post anything on these pages for a very long time. Couple of years, I think. I might write an update one of these days.

In any case, around midsummer I found myself with an unplanned few weeks of nothing and decided to spend one of them revamping the web pages. The first result is what you see here. Most of the old stuff is still around, but I now rely on static pages (generated from org-mode sources, because that is how I have been taking notes the last few years).

WordPress, which the site ran on previously, worked brilliantly for most things, and I am sure my own quite limited knowledge and skill (not to mention interest, sadly) in web design can never reach some of the aesthetics of the most basic plugins I could just have dropped in under WP.

Still, I had gone around mulling a revamp for a couple of years already. First, my WP login pages got hammered, day and night, by scripts trying to break in. Of course I employed IP blocking, etc., etc., and had a strong password. Nobody ever got in, but it was annoying to see the logs over and over. Second, PHP has always turned me off as a programming language, so instead of fixing some issues I had with the previous page I just felt apathy and ran with it. (Well, not sure I will be saner with Emacs lisp, but at least I feel that I can script-process my static pages, if I really want to.) Third, given that this is just me putting out words in the void, WP was overkill.

Yeah, perhaps I'll get famous one of these days and really regret not having a comment section. Feel free to point it out when that happens.

.L
