Friday, July 13, 2012

Computer Othello, Part 2: Game archive as trainer

I wanted to create a computer game of Othello that did more than simply defeat the human player. I thought of trying to provide a story to the player to help them learn and improve against others. One idea was to process the Thor archive and create stats to apply to each potential play. In that way, the player can see how many people have played the position in the past and how well they did. However, with 100,000 games in the archive, doing this efficiently was tricky.

Initially, I tried using a trie data structure. It seemed like an appropriate structure. However, performance was poor and programming against it wasn't overly intuitive. I tried changing how the data was stored to try to improve performance. Instead of using a human readable notation (i.e., algebraic notation, such as "E6"), I stored each play as a single Unicode character. With 60 possible plays, I used the alphanumeric set, capital letters and symbols. I did this by simply taking a numeric board position (0 to 63), adding 48 to it (to move it into the alphanumeric area of Unicode) and converting it to a char type. This improved things, but performance using the trie was still terrible.

I ditched the trie and wrote LINQ statements against a list of strings. Performance improved, but it was still poor. I switched the search from LINQ to use the BinarySearch method of a list. For a binary search to work, I had to order the data. I was surprised to find that the LINQ OrderBy method didn't sort my data the way I needed. I suspect that it doesn't distinguish case and can't cope with unusual characters. I switched to Array.Sort. Even that didn't work without the StringComparer.Ordinal option. I also needed StringComparer.Ordinal option for the BinarySearch method. All interesting little hurdles.

After all that (plus some more optimisations discovered by using Performance Analyzer in Visual Studio), performance was massively improved but still not perfect. I decided to process the calculations on a separate thread so that it didn't hang the user interface.

I was done. Stats appeared either instantly or within half a second and either way the user interface was not impacted. Yet, I still wasn't done!

Othello is a highly symmetrical game. The first play, a choice of four possible positions, is symmetrically equivalent. No matter where black plays on the first move, one could rotate the board and the board would look the same as any other play. The people who made the Thor database understood this. That's why they standardised the archive to have E6 as the first play for all games. I needed to display the same info on the C4, D3 and F5 tiles that I displayed on E6. Otherwise, the less knowledgeable player may wonder why 100% of games are played at E6.

To be able to display stats on a non-standardised position, I needed to test for symmetry in the game board. I found methods to flip and rotate bitboards. That solved half the problem. I could check if one game state was the same as another. Next, I rotated the stats from a standardised to a non-standardised board. I generated lookup dictionaries to be able to rotate the indices of the board positions. For example, position E6 would become position C4 on a rotated board.
Display of game stats for black.
The blue highlighted square displays black played there in 0.027% of games and won 10% of those games.
After all that, I had a well working set of stats that I could display to the player to assist them in learning where the best opening moves are. It doesn't really provide any sort of story for the player, but at least it gives an indication of where a good play might be.

Later, I also used this data to help the computer player make its opening moves. But that's for another day.

Wednesday, July 11, 2012

Julia Kristeva

On Anthony's visit to Melbourne, he mentioned something about Julia Kristeva that I simply had to know more about. It was the idea that Kristeva suggests we move away from the scientific binary logic of false and true (0 and 1) to a poetic logic of 0 and 2. I was astounded to hear that. It seemed completely absurd (it's been a while since I read any post-modernism). I got the document and found the quote. Here it is:
The polyvalent logic presupposing an infinite number of values in the interval false-true 0  x  1 is a part of bivalent (0-1), Aristotelian logic. Poetic logic is inscribed on a different surface. It remains indebted to Aristotelian logic not in being a part thereof, but insofar as it contains and transgresses that logic. Since poetic unity is constructed in relation to an other as double, the problem of truth (of the 1) does not concern it. The poetic paragram bypasses the one, and its logical space is 0-2, the 1 existing only virtually. (pg 44, Towards a Semiology of Paragrams)
Admittedly, I didn't read all of the document. What I read, I didn't understand. I'm taking Kristeva out of context because I don't understand the context. Nevertheless, I got questions:
  1. How is polyvalent logic "part of" bivalent logic? The inverse (bivalent part of polyvalent) seems more reasonable to me. Yet, polyvalent and bivalent seem to have quite different logical rules.
  2. Unlike polyvalent logic, Kristiva claims that poetic logic (what's that?!) has a logical space of 0-2. It's therefore still bivalent, but because it doesn't contain 1 (truth) it need not concern itself with truth. Hm... but isn't the "1" merely a symbol to represent "truth"? Couldn't "2" also represent "truth"? Or "0" for that matter? In fact, you could be really crazy and suggest "√-1" to be the symbol for "truth". It wouldn't matter, would it?
I will leave you with another quote from earlier in the document. I find it entirely incomprehensible, though it does appear to contradict the above by saying that paragrammatic numerology (which I think is the same as poetic numerology) is "two" and "all" rather than "0" and "2". I like the "The zero is two which are one" statement, though that is very similar to another phrasing I know (i.e., the doctrine of the trinity - god, son and holy ghost are three that are one.) I'm not sure that the bible is a good source for learning logic, however...
The zero as non-sense does not exist in the paragrammatic network. The zero is two which are one: in other words, the one as indivisible and the zero as nothingness are excluded from the paragram, whose minimal unity is both an (empty) all and an (oppositional) two. We shall examine more closely this paragrammatic numerology, where there is no 'one' or 'zero' but only 'two' and 'all'. Unity is empty, does not count, the one is zero but it signifies: it controls the space of the paragram, it is there to fix the centre, but the paragram does not give it a value, a stable meaning. This 'unity' is not the synthesis of A and B; but it has the value of one because it is all, and at the same time it cannot be distinguished from two, because within this unity come together all the contrasting semes, both opposed to each other and united. At once unity and couple, the oppositional dyad, to apply a spatial expression, is realized in the three dimensions of volume. (pg 37, Towards a Semiology of Paragrams)

Saturday, July 7, 2012

More interview questions

We had a coding competition at work last week. We had an hour or so to code-up a solution to:
I want to simulate a vote by providing a question, valid responses, and an integer representing the voting population size.
When the vote occurs, there should be 1 random vote for each member of the voting population.
 The entries were judged on the following criteria.
  • Testability
  • Maintainability
  • Accuracy of solution
  • Structure of solution files
I've uploaded my contribution to GitHub. I didn't win. I made a few weird assumptions (that I've corrected in the code online) and for some reason went off on a tangent playing around with JSON serialisation. However, the wining solution wasn't particularly different to mine.

The purpose behind the competition was to use this task (or something similar) when interviewing candidates. I guess the people interviewing wanted a base-line to work off, that's why they had the staff do the task.

Anyways, it was quite fun. I've never done anything like that in a workplace before.

Friday, July 6, 2012

Interview questions

I've been asked the same questions during interviews in the past. The two that come to mind are:
  1. What's the difference between a clustered and non-clustered index in SQL Server?
  2. What's the virtual keyword in C# mean?
I finally got around to looking up the answer to the second question. I generally don't override methods, so I've never had much use for it.

The short of it is that virtual is required if you want to override a method of the base class. If you don't supply virtual, the best you can do is use the new keyword to create a completely new method that just happens to have the same name of a method in the base class. To take it a bit further, sealed will prevent any further overriding down the inheritance chain. Therefore;
  1. a base class may have a method with virtual assigned to it;
  2. deriving from 1, you could override the method;
  3. deriving 2, you could sealed override the method;
  4. deriving 3, you can't override any more (but you could new the method);

Is this an important question? Not in-itself. I was offered the jobs without knowing the answer.

I have written an example gist to further clarify.

Saturday, June 2, 2012

GitHub for Windows

I'm tired of using Skydrive as a "source code control" solution. That's no solution, just a file back-up. What about versioning and sharing code?! I've used lots of revision control systems before. Here's the list:


They've all been adequate but frustrating. The Internet is abuzz with talk of Git and GitHub. I tried setting up a GitHub account last month. After the first few steps, I looked at the long list of remaining tasks and decided it was "too much work."

To my rescue comes GitHub for Windows. Not only did it create my repository in seconds, it filtered out all the useless gunk that I don't want going into the code repository, namely binaries, ReSharper and Visual Studio profiler files. It still had some NCrunch files to ignore, but that's a minor issue. Could this be my revision control Holy Grail?

My Othello repository can be found at: https://github.com/ledpup/Othello.

Tuesday, May 22, 2012

DayZ update

It looks like the developer of DayZ is doing all the right things. Latest update notes:

* This is a pretty major update. I don't really know what affect the new sickness and temperature system will have on player behavior especially with antibiotics being so scarce.
* It is the genesis of an idea, so please remember this might cause havoc. You need to be careful.
* To light a fire, you need matches and wood in your inventory. Place more wood in the inventory of a fireplace to keep the fire going.
* A fire that does not have wood in it will go out when you try to light it.
* You can tell you have an infection, because your character will start coughing. The infection causes you to loose blood down to a minimum of 6000. This leaves you with reduced blood until you take antibiotics.



Here you can clearly see the beginnings of a Minecraft like play. If done right - without the missteps that Minecraft took - could be really good.

How to write software


A friend from my previous job sent me the below a couple of days ago. These suggestions fit perfectly with the exact opposite approach to what we were taking on the project I was working on there. It looks like things are changing.

It's all straight-forward stuff. If you work in enterprise software development, you already know how often these ideas are not followed.



Code Smells Within Classes

·         Comments: There's a fine line between comments that illuminate and comments that obscure. Are the comments necessary? Do they explain "why" and not "what"? Can you refactor the code so the comments aren't required? And remember, you're writing comments for people, not machines.
·         Long Method: All other things being equal, a shorter method is easier to read, easier to understand, and easier to troubleshoot. Refactor long methods into smaller methods if you can.
·         Long Parameter List: The more parameters a method has, the more complex it is. Limit the number of parameters you need in a given method, or use an object to combine the parameters.
·         Duplicated Code: Duplicated code is the bane of software development. Stamp out duplication whenever possible. You should always be on the lookout for more subtle cases of near-duplication, too. Don't Repeat Yourself!
·         Conditional Complexity: Watch out for large conditional logic blocks, particularly blocks that tend to grow larger or change significantly over time. Consider alternative object-oriented approaches such as decorator, strategy, or state.
·         Combinitorial Explosion: You have lots of code that does almost the same thing... but with tiny variations in data or behavior. This can be difficult to refactor-- perhaps using generics or an interpreter?
·         Large Class: Large classes, like long methods, are difficult to read, understand, and troubleshoot. Does the class contain too many responsibilities? Can the large class be restructured or broken into smaller classes?
·         Type Embedded in Name: Avoid placing types in method names; it's not only redundant, but it forces you to change the name if the type changes.
·         Uncommunicative Name: Does the name of the method succinctly describe what that method does? Could you read the method's name to another developer and have them explain to you what it does? If not, rename it or rewrite it.
·         Inconsistent Names: Pick a set of standard terminology and stick to it throughout your methods. For example, if you have Open(), you should probably have Close().
·         Dead Code: Ruthlessly delete code that isn't being used. That's why we have source control systems!
·         Speculative Generality: Write code to solve today's problems, and worry about tomorrow's problems when they actually materialize. Everyone loses in the "what if.." school of design. You (Probably) Aren't Gonna Need It.
·         Oddball Solution: There should only be one way of solving the same problem in your code. If you find an oddball solution, it could be a case of poorly duplicated code-- or it could be an argument for the adapter model, if you really need multiple solutions to the same problem.
·         Temporary Field: Watch out for objects that contain a lot of optional or unnecessary fields. If you're passing an object as a parameter to a method, make sure that you're using all of it and not cherry-picking single fields.

Code Smells Between Classes

·         Alternative Classes with Different Interfaces: If two classes are similar on the inside, but different on the outside, perhaps they can be modified to share a common interface.
·         Primitive Obsession: Don't use a gaggle of primitive data type variables as a poor man's substitute for a class. If your data type is sufficiently complex, write a class to represent it.
·         Data Class: Avoid classes that passively store data. Classes should contain data and methods to operate on that data, too.
·         Data Clumps: If you always see the same data hanging around together, maybe it belongs together. Consider rolling the related data up into a larger class.
·         Refused Bequest: If you inherit from a class, but never use any of the inherited functionality, should you really be using inheritance?
·         Inappropriate Intimacy: Watch out for classes that spend too much time together, or classes that interface in inappropriate ways. Classes should know as little as possible about each other.
·         Indecent Exposure: Beware of classes that unnecessarily expose their internals. Aggressively refactor classes to minimize their public surface. You should have a compelling reason for every item you make public. If you don't, hide it.
·         Feature Envy: Methods that make extensive use of another class may belong in another class. Consider moving this method to the class it is so envious of.
·         Lazy Class: Classes should pull their weight. Every additional class increases the complexity of a project. If you have a class that isn't doing enough to pay for itself, can it be collapsed or combined into another class?
·         Message Chains: Watch out for long sequences of method calls or temporary variables to get routine data. Intermediaries are dependencies in disguise.
·         Middle Man: If a class is delegating all its work, why does it exist? Cut out the middleman. Beware classes that are merely wrappers over other classes or existing functionality in the framework.
·         Divergent Change: If, over time, you make changes to a class that touch completely different parts of the class, it may contain too much unrelated functionality. Consider isolating the parts that changed in another class.
·         Shotgun Surgery: If a change in one class requires cascading changes in several related classes, consider refactoring so that the changes are limited to a single class.
·         Parallel Inheritance Hierarchies: Every time you make a subclass of one class, you must also make a subclass of another. Consider folding the hierarchy into a single class.
·         Incomplete Library Class: We need a method that's missing from the library, but we're unwilling or unable to change the library to include the method. The method ends up tacked on to some other class. If you can't modify the library, consider isolating the method.
·         Solution Sprawl: If it takes five classes to do anything useful, you might have solution sprawl. Consider simplifying and consolidating your design.