Thursday, March 8, 2012

Computer Reversi, Part 1.5: The Thor database (Othello tournament files)

As part of my research into Reversi, I looked for an archive of games. The only substantial one I found is called the Thor Database. It's an archive of Othello games by French archivers. They have archived are over 100,000 games. You can download the game databases here.

My initial motivation for finding an archive was to have a easy way to unit test my code to ensure it implemented the Reversi rules correctly. If the game engine could play through 100,000 games and calculate the final score, that's a good indication that the rules have been implemented correctly. An archive can also be used for book openings. I have another idea for the archive too, which I will hopefully reveal soon.

My Reversi source code has a Thor project with a bunch of methods and classes for extracting the data for whatever nefarious needs you may have. I don't think there is any code in C# on the net for this, so maybe one day it'll be helpful for someone. Maybe. (Probably not.)

The file format documentation is challenging to understand because it is written in French. I have translated (thanks to Nuz and google translate) the core elements of the document in the section below. The aspects I found challenging were:
  • Interpreting the meaning of the black player's score (I needed to calculate the score as the final check to ensure my rules were working)
  • Interpreting the way they recorded the individual plays
  • Converting elements of a byte-array to a 16-bit integer. (The data are stored as little-endian. One way: BitConverter.ToInt16(new [] { thorArray[i], thorArray[i + 1] }, 0).)
To calculate the Black Score:
  • Get sum of black pieces and the sum of white pieces;
  • Whoever has the most pieces wins;
  • The winner adds to the sum the number of empty squares;
E.g., Black has 44 pieces, White has 8 pieces; Black's score is 56. Or, Black has 13 pieces, White has 35; Black's score is 13. In the case of a draw, Black's score is 32 (as the empty squares are shared between Black and White, 32 is the only possible score for a draw). I still haven't figured out what it means for Black's score when the game ends before neither player can play a piece. The score seems inconsistent. This only occurs in 406 of 107,473 games. (Not a big deal.)



The data, such as Word and Longint, are stored in Intel format, ie the lowest byte first. There is a game file for every year. Games are stored in any order, but normally grouped by tournament.

Database header fields
All files in the database Wthor have a header of 16 bytes, followed by a number of game records all having the same size. The header consists of the following fields:
  • Century file was created
  • Year file was created
  • Month file was created
  • Date file was created
  • Number of records N1
  • Number of records N2
  • Year of game
  • Parameter P1: size game board
  • P2 parameter: type of games
  • Parameter P3: depth
  • X1 (reserved)
Game fields
  • Tournament Number
  • Number of Black player
  • Number of White player
  • Number of black pieces (true score)
  • Theoretical score
  • Move list 
The plays are stored in chronological order. Row number (1-8) and column (A-H) can be derived from the following operation: column + (10 * row). E.g., a1 = 11, h1 = 18, a8 = 81, h8 = 88.

Tournament file
Each record (26 bytes) is an array of characters terminated by a binary zero. The effective length is 25 characters. There is a 16 byte header for this file. 

Player file

Each record (20 bytes) is an array of characters terminated by a binary zero. The effective length is 19 characters. There is a 16 byte header for this file.

6 comments:

  1. how can see the render of this project?

    ReplyDelete
    Replies
    1. The latest source code is at https://github.com/ledpup/Othello. You need Unity, found at http://unity3d.com/. Download the source code and double-click the reversi.unity file in the Assets folder once you have Unity installed. That'll open the project up in Unity. You can run it in there, or build and run separately.

      Delete
  2. By any chance have you figured out the content of 18 byte header?

    ReplyDelete
    Replies
    1. I meant for tournament file and player file, and not for the game file.

      Delete
    2. I didn't figure it out. Did you, in the end?

      Delete