Tests chess engines by letting them play each other.
chess-engine-tester is a command-line tool for testing two (or three) chess engines by letting them play each other. Applications like XBoard or Arena can do this too, but chess-engine-tester lets you do it from the command line. It also has the somewhat unusual feature of letting three chess engines take part in the same game.
chess-engine-tester uses the XBoard protocol, so it can only be used to test XBoard chess engines.
Download the latest chess-engine-tester package from the GitHub releases page, and unpack it somewhere on your hard drive. Add this directory to your PATH for easy access.
In addition to the chess-engine-tester package itself, you also need Java 17 or later to run the tool. You can download the Java 17 runtime for free from Adoptium. If you are on Linux or macOS (or Cygwin) you can also use a tool like SDKMAN!.
Assuming that you added the chess-engine-tester directory to your PATH, you should be able to type this to get some help:
$ cet --help
It will print something like this:
Usage: cet [-hV] -1=FILENAME -2=FILENAME [-3=FILENAME] -n=NUMBER [-o=FILENAME]
           -t=TIME CONTROL
Tests chess engines by letting them play each other.
  -1, --engine1=FILENAME    Chess engine 1 config FILENAME.
  -2, --engine2=FILENAME    Chess engine 2 config FILENAME.
  -3, --engine3=FILENAME    Chess engine 3 config FILENAME. The optional third
                              engine will shadow the engine playing black. It
                              will think about the same moves as the black
                              engine, but its counter moves will only be
                              logged, and not played.
  -h, --help                Show this help message and exit.
  -n, --number=NUMBER       Number of games to play. Either 1 or a positive,
                              even number.
  -o, --output=FILENAME     PGN game file FILENAME. If not specified, no file
                              will be written.
  -t, --time=TIME CONTROL   Time control in PGN format. Either moves/seconds or
                              initial+increase (both in seconds).
  -V, --version             Print version information and exit.
To run the tool and start a match you need to specify the number of games to play (-n), the time control (-t), and the configuration files for the two chess engines (-1 and -2). The number of games can be either 1 for a single-game match or any positive even number for a multi-game match. The time control is specified in PGN format, in one of two forms: moves/seconds (the number of moves to make in a period and the number of seconds per period) or initial+increase (the initial number of seconds for the game and the number of seconds added per move). Each configuration file is specified by its filename, including an optional path. The format of a configuration file is described below.
To start a four-game match between two engines, whose configuration files reside in a directory called conf, and with a time control of 40 moves in 5 minutes, you would use this command:
$ cet -n 4 -t 40/300 -1 conf/engine1.json -2 conf/engine2.json
Optionally, you can specify an output file (-o) where finished games will be stored, and the configuration file of a third chess engine (-3). Both the configuration file format and the third engine are described below.
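The rules for -n and -t can be sketched in a few lines of Python. This is only an illustration of the two documented time control forms and the game-count rule, not the parser that chess-engine-tester itself uses. All names in it (ClassicTimeControl, IncrementalTimeControl, parse_time_control, is_valid_number_of_games) are made up for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassicTimeControl:
    """moves/seconds: a number of moves to make in each period."""
    moves: int
    seconds: int


@dataclass(frozen=True)
class IncrementalTimeControl:
    """initial+increase: base time plus a per-move increment."""
    initial_seconds: int
    increase_seconds: int


def parse_time_control(text):
    """Parse one of the two documented forms (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)/(\d+)", text)
    if match:
        return ClassicTimeControl(int(match.group(1)), int(match.group(2)))
    match = re.fullmatch(r"(\d+)\+(\d+)", text)
    if match:
        return IncrementalTimeControl(int(match.group(1)), int(match.group(2)))
    raise ValueError(f"unsupported time control: {text!r}")


def is_valid_number_of_games(number):
    """Either 1 or a positive, even number, as the help text states."""
    return number == 1 or (number > 0 and number % 2 == 0)
```

With these definitions, the time control 40/300 from the command above parses to 40 moves in 300 seconds, and a value such as 3 is rejected as a number of games.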
A chess engine configuration file is a simple JSON file with two entries: the command used to start the chess engine, and the directory to start it in. The command may include the path to the chess engine executable. The directory is optional. If it is not specified, the chess engine will start in the current directory. Below is an example file for GNU Chess to use on Linux/macOS.
{
  "command" : "gnuchess -x",
  "directory" : "/tmp"
}
When running chess-engine-tester on Windows, you may need to use cmd.exe to start the chess engine, for example like this:
{
  "command" : "cmd.exe /c ronja.bat",
  "directory" : "c:/chess/ronja-0.9.0"
}
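To check a configuration file before starting a match, something like the following Python sketch could be used. It relies only on the two entries described above ("command" and the optional "directory"). The function name parse_engine_config and the validation rules are assumptions made for this example, and chess-engine-tester itself may validate the file differently.

```python
import json


def parse_engine_config(text, default_directory="."):
    """Return (command, directory) from the JSON text of a config file.

    Hypothetical helper: it mirrors the documented format, not the
    tool's own loader.
    """
    config = json.loads(text)
    if not isinstance(config, dict):
        raise ValueError("config must be a JSON object")

    command = config.get("command")
    if not isinstance(command, str) or not command.strip():
        raise ValueError("config needs a non-empty 'command' entry")

    # "directory" is optional; fall back to the current directory.
    directory = config.get("directory", default_directory)
    if not isinstance(directory, str):
        raise ValueError("'directory' must be a string")

    return command, directory
```

To check a file on disk, read it first, for example with pathlib.Path("conf/engine1.json").read_text(), and pass the text to the function.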
An interesting feature of chess-engine-tester is that you can let three chess engines participate in a game. When you start a game with three engines, the third engine will shadow the engine playing black. It will be fed the same moves as the black engine, but its counter moves will not actually be played. Instead, they will be logged for later inspection. If the move of the third engine differs from what the black engine played, the alternative move will also be included as a comment in the PGN file. The three engines run in parallel in different processes.
This feature can be used to compare two engines, or different versions of the same engine. All moves are logged with timestamps, so it is easy to compare what moves are generated and when by inspecting the log file.
The third chess engine must support some additional XBoard commands besides those that are required for basic play. The required commands are remove, which retracts a move, and playother, which sets the engine to play the color that is not on move.
Below is an example of what the log file could look like when playing with three chess engines.
2021-11-01 17:21:59.522 FINE [se.dykstrom.cet.services.game.GameServiceImpl logMove] WHITE -> 1. d2d4
2021-11-01 17:21:59.526 FINE [se.dykstrom.cet.services.game.GameServiceImpl logMove] BLACK <- 1. d2d4
2021-11-01 17:21:59.527 FINE [se.dykstrom.cet.services.game.GameServiceImpl logMove] EXTRA <- 1. d2d4
2021-11-01 17:21:59.745 FINE [se.dykstrom.cet.services.game.GameServiceImpl logMove] BLACK -> 1... e7e6
2021-11-01 17:21:59.746 FINE [se.dykstrom.cet.services.game.GameServiceImpl logMove] EXTRA -> 1... g8f6
2021-11-01 17:21:59.747 INFO [se.dykstrom.cet.services.game.GameServiceImpl compare] Black engine returned move e7e6 but extra engine returned move g8f6
2021-11-01 17:21:59.955 FINE [se.dykstrom.cet.services.game.GameServiceImpl logMove] WHITE <- 1... e7e6
Note that the default log level is WARNING. To make chess-engine-tester actually log the moves as in the example, you need to set the log level to FINE in the file logging.properties:
se.dykstrom.cet.level=FINE
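Because every move is logged with a timestamp, a short script can show how long each engine took to answer. The Python sketch below assumes that log lines look exactly like the example above, that "<-" marks a move sent to an engine, and that "->" marks a move returned by it. Both the regular expression and the function name think_times are made up for this illustration and may need adjusting for a differently configured log format.

```python
import re
from datetime import datetime, timedelta

# Matches logMove lines such as:
# 2021-11-01 17:21:59.745 FINE [... logMove] BLACK -> 1... e7e6
LOG_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \S+ "
    r"\[\S+ logMove\] (?P<engine>WHITE|BLACK|EXTRA) (?P<direction><-|->) "
    r"\d+\.+ (?P<move>\S+)$"
)


def think_times(lines):
    """Return (engine, move, milliseconds) for each answered move."""
    received = {}  # engine name -> time it was last sent a move
    result = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # not a logMove line, e.g. the compare message
        when = datetime.strptime(match["time"], "%Y-%m-%d %H:%M:%S.%f")
        engine = match["engine"]
        if match["direction"] == "<-":
            received[engine] = when
        elif engine in received:
            elapsed = when - received.pop(engine)
            millis = elapsed // timedelta(milliseconds=1)
            result.append((engine, match["move"], millis))
    return result
```

A move returned by an engine that has not yet been sent a move, such as the first move by white, is skipped because there is nothing to measure it against.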
The beginning of the move text part of the PGN file could look like below, when the third chess engine reported an alternative move.
1. d4 e6 {1... Nf6}
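The alternative moves can also be pulled out of the PGN move text afterwards. The Python sketch below assumes that the comments always have the form shown above, that is, a move number followed by three dots and the move, inside braces. The pattern and the function name alternative_moves are made up for this example and are not part of chess-engine-tester.

```python
import re

# Matches comments such as {1... Nf6}
ALTERNATIVE = re.compile(r"\{(\d+)\.\.\. ([^}\s]+)\}")


def alternative_moves(movetext):
    """Return (move number, move) for each alternative move comment."""
    return [(int(number), move) for number, move in ALTERNATIVE.findall(movetext)]
```

Comments in any other form are ignored, so the function returns an empty list for a game in which the third engine always agreed with the black engine.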
Debug output (any line prefixed by a #) that is printed by a chess engine is written to the log file with level INFO. To see the debug output, you need to set the log level to INFO or lower (for example FINE) in the file logging.properties. See above for how to configure the log level.
To build your own version of chess-engine-tester you need Java 17 and a recent version of Maven. Note: Maven can also be installed using SDKMAN!.
Clone the git repo:
$ git clone https://github.com/dykstrom/chess-engine-tester.git
Build and run all unit tests:
$ mvn clean test
Build the distribution zip file without running any tests:
$ mvn clean package -DskipTests
To run the integration tests you need to install some dependencies.
- GNU Chess must be installed and added to your PATH.
- Ronja 0.9.0 must be installed in a subdirectory called engines. The directory structure should look like below.
chess-engine-tester/
└─ engines/
   └─ ronja-0.9.0/
With the dependencies installed, run the integration tests like this:
$ mvn clean verify
If you don't want to install any dependencies locally, you can also run the integration tests in Docker. Note that Docker BuildKit must be enabled for this to work.
With Docker installed and BuildKit enabled, run the integration tests like this:
$ docker build .
Integration tests in Docker always use the Maven profile slow-tests.