Initially, I tried using a trie data structure. It seemed like an appropriate structure. However, performance was poor and programming against it wasn't overly intuitive. I tried changing how the data was stored to try to improve performance. Instead of using a human readable notation (i.e., algebraic notation, such as "E6"), I stored each play as a single Unicode character. With 60 possible plays, I used the alphanumeric set, capital letters and symbols. I did this by simply taking a numeric board position (0 to 63), adding 48 to it (to move it into the alphanumeric area of Unicode) and converting it to a char type. This improved things, but performance using the trie was still terrible.
I ditched the trie and wrote LINQ statements against a list of strings. Performance improved, but it was still poor. I switched the search from LINQ to use the BinarySearch method of a list. For a binary search to work, I had to order the data. I was surprised to find that the LINQ OrderBy method didn't sort my data the way I needed. I suspect that it doesn't distinguish case and can't cope with unusual characters. I switched to Array.Sort. Even that didn't work without the StringComparer.Ordinal option. I also needed StringComparer.Ordinal option for the BinarySearch method. All interesting little hurdles.
After all that (plus some more optimisations discovered by using Performance Analyzer in Visual Studio), performance was massively improved but still not perfect. I decided to process the calculations on a separate thread so that it didn't hang the user interface.
I was done. Stats appeared either instantly or within half a second and either way the user interface was not impacted. Yet, I still wasn't done!
Othello is a highly symmetrical game. The first play, a choice of four possible positions, is symmetrically equivalent. No matter where black plays on the first move, one could rotate the board and the board would look the same as any other play. The people who made the Thor database understood this. That's why they standardised the archive to have E6 as the first play for all games. I needed to display the same info on the C4, D3 and F5 tiles that I displayed on E6. Otherwise, the less knowledgeable player may wonder why 100% of games are played at E6.
To be able to display stats on a non-standardised position, I needed to test for symmetry in the game board. I found methods to flip and rotate bitboards. That solved half the problem. I could check if one game state was the same as another. Next, I rotated the stats from a standardised to a non-standardised board. I generated lookup dictionaries to be able to rotate the indices of the board positions. For example, position E6 would become position C4 on a rotated board.
|Display of game stats for black.|
The blue highlighted square displays black played there in 0.027% of games and won 10% of those games.
Later, I also used this data to help the computer player make its opening moves. But that's for another day.