# High score Tetris AI using branching algorithms

This post explores how branching algorithms can improve the performance of agents playing Tetris without gravity.

The techniques used are similar to those in Nintendo Tetris AI Revisited by Meat Fighter. It may be worth reading his post first, as this post builds upon his work.

## PSO configuration

We define an `evaluate` function as a heuristic that gives a score to a Tetris board and block type (e.g. I-PIECE). `evaluate` is calculated as the dot product of a feature vector and a weights vector of the same size, i.e. $\text{score} = W_v\cdot F_v$. PSO was used to find optimal weights for the weights vector of `evaluate`. The feature vector and the PSO setup used are similar to those of Meat Fighter’s post. Note that the branching algorithm is not incorporated in the PSO training phase; `evaluate` is used directly.
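As a concrete illustration, here is a minimal sketch of such an evaluation function, assuming a simple 0/1 grid board and an illustrative feature set (aggregate height, holes, bumpiness, completed lines) with made-up weights — not the exact features or weights used in training:

```python
import numpy as np

# Illustrative weights; the real values come from PSO training.
WEIGHTS = np.array([-0.35, -0.5, -0.18, 0.76])

def extract_features(board):
    """board: 2D 0/1 array, row 0 at the top. Feature set is a guess."""
    rows, cols = board.shape
    filled = board.any(axis=0)                 # columns containing blocks
    top = np.argmax(board, axis=0)             # first filled row per column
    heights = np.where(filled, rows - top, 0)  # column heights
    holes = sum(int(np.sum(board[top[c]:, c] == 0))
                for c in range(cols) if filled[c])
    bumpiness = int(np.sum(np.abs(np.diff(heights))))
    lines = int(np.sum(board.all(axis=1)))     # completed rows
    return np.array([heights.sum(), holes, bumpiness, lines], dtype=float)

def evaluate(board):
    # score = W_v . F_v
    return float(WEIGHTS @ extract_features(board))
```

With negative weights on height, holes, and bumpiness, a board with a buried hole scores lower than a flush one of similar height.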

Notable differences include:

- Using global PSO from pyswarms, probably with different parameters: `{'c1': 2.5, 'c2': 2.5, 'w': 0.92}`
- Number of particles: 96
- Number of iterations: 3600
- Removing `lock_height` from the evaluation function

Our configuration for each particle is as follows:

- each particle plays 150 games using the same piece sets
- the score for a particle is the average of the top 50% of games
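For reference, here is a numpy-only sketch of the global-best PSO update that pyswarms' `GlobalBestPSO` performs under the hood, using the hyperparameters above; the sphere function stands in for the real cost (an inverted average score over the particle's games):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hyperparameters from the post; the cost function is a stand-in.
OPTS = {'c1': 2.5, 'c2': 2.5, 'w': 0.92}
N_PARTICLES, DIMS, ITERS = 96, 8, 100  # post uses 3600 iterations

def cost(x):  # x: (n_particles, dims) -> (n_particles,) costs to minimise
    return np.sum(x ** 2, axis=1)

def pso(cost_fn, n=N_PARTICLES, d=DIMS, iters=ITERS):
    x = rng.uniform(-1, 1, (n, d))        # particle positions (weight vectors)
    v = np.zeros_like(x)                  # particle velocities
    pbest, pcost = x.copy(), cost_fn(x)   # personal bests
    g = pbest[np.argmin(pcost)].copy()    # global best
    for _ in range(iters):
        r1, r2 = rng.random((n, d)), rng.random((n, d))
        v = OPTS['w'] * v + OPTS['c1'] * r1 * (pbest - x) \
                          + OPTS['c2'] * r2 * (g - x)
        x = x + v
        c = cost_fn(x)
        improved = c < pcost
        pbest[improved], pcost[improved] = x[improved], c[improved]
        g = pbest[np.argmin(pcost)].copy()
    return g, pcost.min()

best_w, best_c = pso(cost)
```

In the real setup each cost evaluation is 150 games of Tetris per particle, which is why training dominates the compute budget.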

## Branching algorithm description

The branching algorithm is inspired by alpha-beta pruning algorithms, except with only one agent. The inputs are a board and the next two block types from the piece set (call these $p_1$ and $p_2$). The idea is to search promising branches to find the best placement for $p_1$.

### Bfs algorithm

Define the following methods:

- `findAllMoves(board, blockType)`: finds all moves (a.k.a. `PieceInfo`) given a board and a blockType
- `evaluate(board, PieceInfo)`: uses predefined weights to give a score for the game state

Firstly, if there is a move that results in a tetris with $p_1$, select that move.

If not, observe that we can apply these functions two times each to find an ideal position for $p_1$ using only $p_1$ and $p_2$. This results in a tree of height $2$, where we take the branch whose leaf node has the best evaluation. This is the simplest way of utilising $p_1$ and $p_2$, and describes what is used as `Bfs` in the results.
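The two-ply search can be sketched as follows, with toy stand-ins for `findAllMoves`, `applyPiece` and `evaluate` (the real ones operate on full boards and `PieceInfo` objects):

```python
# Toy stand-ins so the sketch runs; illustrative only.
def find_all_moves(board, piece):
    return [(piece, col) for col in range(3)]      # a "move" = piece at a column

def apply_piece(board, move):
    return board + [move]

def evaluate(board, move):
    return -sum(col for _, col in board + [move])  # toy heuristic

def bfs_best_move(board, p1, p2):
    """Height-2 tree: place p1, then p2, and keep the p1 move whose
    best leaf evaluation is highest (the `Bfs` strategy)."""
    best_move, best_score = None, float('-inf')
    for m1 in find_all_moves(board, p1):
        b1 = apply_piece(board, m1)
        for m2 in find_all_moves(b1, p2):
            s = evaluate(b1, m2)
            if s > best_score:
                best_score, best_move = s, m1
    return best_move
```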

### Bfs+branching algorithm

For more predictive power, we aim to maximise the expected evaluation of a partial search tree. To do this, we sum the product of the evaluation if we had block type $t$, multiplied by the probability $P(t)$ based on the piece RNG algorithm. Adding a depth parameter $d$ to make the results computable, the recursive relation looks like:

$\texttt{evalBoard}(b, d) = \sum_{t \in T}{P(t)\cdot\texttt{evalGivenBlockType}(b, t, d)}\\ \text{where }T\text{ is the set of block types}\\ \texttt{evalGivenBlockType}(b, t, d) = \begin{cases} \max\limits_{p\in M(b, t)}{\texttt{evaluate}(b, p)} & \text{if } d = 0 \\ \max\limits_{p\in M(b, t)}\texttt{evalBoard}(\texttt{applyPiece}(b, p), d-1) & \text{otherwise} \end{cases} \\ \text{where }M = \texttt{findAllMoves}$

From this, we see that the branching factor for `evalGivenBlockType` calls is the number of block types ($7$) multiplied by the number of moves from `findAllMoves` (typically $10-20$). To mitigate this, we define a pruning factor $n_1$ and only continue the search on the top $n_1$ scores, bringing the branching factor down to $7n_1$ (however, `evaluate` still needs to be called on all of these states to establish which should be pruned).
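Here is a runnable sketch of this recursion with pruning, again using toy stand-ins for the board machinery and assuming uniform piece probabilities:

```python
import heapq

PIECES = ['I', 'O', 'T', 'S', 'Z', 'J', 'L']
P = {t: 1 / len(PIECES) for t in PIECES}  # uniform P(t) assumed for the sketch

# Toy stand-ins for the real board/move machinery.
def find_all_moves(board, t):
    return [(t, col) for col in range(4)]

def apply_piece(board, move):
    return board + (move,)

def evaluate(board, move):
    return -sum(c for _, c in board + (move,))

def eval_given_block_type(board, t, d, prune):
    moves = find_all_moves(board, t)
    if d == 0:
        return max(evaluate(board, m) for m in moves)
    # Prune: expand only the n_1 best-scoring moves, but note that
    # evaluate still runs on every move to decide what to prune.
    best = heapq.nlargest(prune, moves, key=lambda m: evaluate(board, m))
    return max(eval_board(apply_piece(board, m), d - 1, prune) for m in best)

def eval_board(board, d, prune):
    # Expectation over the unknown next block type.
    return sum(P[t] * eval_given_block_type(board, t, d, prune) for t in PIECES)
```

A real implementation would replace the uniform $P(t)$ with probabilities derived from the piece RNG algorithm.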

An implementation detail is that `evaluate` adds an offset of `-1e9` when a tetris is made. Slight modifications are made to ensure that this offset is carried through to the leaf node of the tree, to promote aggressive decisions.

With pruning on $p_1$ and $p_2$ included (prune factors of 3 for the first layer and 2 for subsequent layers), only one extra layer could be computed in a reasonable time (132x slower than no branching). Profiling reveals that the ratio of `evaluate` to `findAllMoves` time is about the same between `bfs` and `bfs+branching` ($73:19$ and $78:16$ respectively), while the number of calls to `evaluate` and `findAllMoves` grows by 154x and 146x respectively.

### Bfs+branching visualisation

Below is a graph that visualises the branching factor. The numbers on the edges indicate how many children the parent node has. $s_1$ and $s_2$ are special because their block types are known. Note that, for each rectangle node, **all** the possible moves result in a call to `evaluate`, either for pruning or for finding the best value at the leaf node. This explains the number of calls to `evaluate` despite the selective search (and therefore why it is much slower to compute).

## Results

After training is finished, a different set of 150 games was used to see how the bot performs. This was run with two configurations - with and without branching - using an identical state evaluation function `evaluate`:

name | median | mean | maxout% |
---|---|---|---|
Bfs | 1066910 | 1053015 | 81.33 |
Bfs+branching | 1230460 | 1193441 | 94.67 |

Note that the highest score that can be achieved is 1542000. Raw data: bfs, bfs_branch.csv

We can also compare the probability of a bot achieving a certain score:

From this data we draw the following conclusions:

- 96% of the time, bfs+branching is superior
- On average, bfs+branching achieves an additional 140426 points (a 13% improvement)
- Notably, bfs+branching is 132x less efficient than bfs (913.5s vs 6.9s to compute)

## Possible improvements

There are a few things that can be done to improve the performance of the agent:

- Adjusting the parameters
- Adding or finding more features
- Replacing the linear combination of features with a neural network. A neural network would allow the weight of one feature to depend on another; for example, the agent should play more safely when the pile height is high or the board is too jagged

There are a few improvements that will be possible with better hardware:

- Training with a larger dataset
- Training with the branching algorithm instead of `evaluate`

Ultimately I’m satisfied with the results - a 13% improvement is significant, especially considering it approaches the theoretical maximum score. It would be nice if it were less computationally expensive. There is potential for offloading the evaluation function (which accounts for 75% of the processing) to the GPU, which would be a substantial improvement.
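As a sketch of what that offload could look like: because `evaluate` is a dot product, scoring every candidate state in one batch reduces to a single matrix-vector product, which maps directly onto GPU array libraries (the weights below are placeholders):

```python
import numpy as np

# Placeholder weights; with CuPy, swapping `import numpy as np` for
# `import cupy as np` would run the same code on the GPU, since cupy
# mirrors the numpy API.
WEIGHTS = np.array([-0.35, -0.5, -0.18, 0.76])

def evaluate_batch(feature_matrix):
    """feature_matrix: (n_states, n_features) -> (n_states,) scores."""
    return feature_matrix @ WEIGHTS
```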