In this post, I aim to give a brief review on the useful strategies and tricks for the top 3 bots in last year’s Halite 2 competition.
I cherry-pick the contents from all these post-mortems majorly based on how they are likely to be able to be applied to Halite 3. Hence there are some tricks that are important to Halite 2 but are missing here. In this way, I am trying to make this review more useful specifically for Halite 3 preperation. I also bold the sentences that I find it worthy to be highlighted.
We can see that the bots of the top players are quite different in many aspects. For example, FakePsyho suggests to use a stateless AI while ReCurs3 designs a AI with several distinctive states with each one to handle a different producure of the games.
Table of contents:
For a more detailed version, please take a look at the link to ReCurs3’s Post-Mortem
To get a better idea of all the parts involved, here is what happens at a high level on every game turn:
Because the early game is so different and unforgiving, I isolated this part of the game in its own code path. The big two differences over the rest are choosing the initial planet(s) to settle on, and take into account potential rushes from the enemy, who attack straight away with their initial ships hoping for a quick win by catching bots off guard.
The choice of initial planet(s) is done once at game start, and is different between 2p games and 4p. The one for 2p is very old but I could not improve or find a replacement in time that seemed to perform as well. The idea is to control as much space as possible early on by choosing a planet with other planets nearby, to boost early production and be the first player to dominate the center, therefore breaking the symmetry, and typically winning the game.
4p games require another strategy, as the goal is quite different. A good spot is still desirable, but definitely not at the cost of being nearby enemy players. Otherwise, the risk of being rushed is greatly increased, and even with no rush, the risk of becoming the enemy’s target of choice is also higher. Having two enemies nearby early on is almost guaranteed to turn into a bad game.
The early game plan aims to establish the initial colonies as fast as possible while also keeping early invaders at bay. It also tries to break a stalemate in case no one colonizes by engaging combat after a while, and rushes if it owns no planet but the enemy does.
Another 4p only improvement, only enabled once the early game is over, this mask aims to force the bot to focus on one player instead of multiple at once. If a player can be eliminated more quickly, then its planets can be taken over faster and therefore create a bigger snowball effect. When no target player is assigned, the enemy having a ship closest to an allied planet is considered as the target.
This pass is looking to do high level decision making by assigning one of the following roles to every allied ship:
Either looping through targets to find a ship to assign to, or looping through ships to find a target, have their flaws as they will give poor solutions in some scenarios. It is important to minimize the sum of all distances between ships and their targets in order to respond faster overall. I thought of using an evolutionary search for that, but decided to start with a simpler algorithm at first and see if it needs replacement later. It survived in almost intact shape since.
The way the targeting, scoring and execution of a role is done is specific to each, so more details follow.
For a given ship, it just tries to find the closest planet with docking spots available. However, the twist is it will exclude planets that are considered unsafe to colonize. At some point I noticed my bot was wasting a lot of ships by docking them only to get destroyed a few turns later, so I tried to figure out how to prevent that.
The main idea to determine planet safety is making sure all nearby enemy ships can be dealt with in the future, by checking if for each one of them, an allied ship in proximity can reach that point in time.
The score given to that role is the distance to the planet’s surface. A -40 bonus is added if the ship can already dock, to not get distracted when it’s already almost colonized anyway. The assignment of this role fails if the available docking spots are already reserved by other ships, or if the planet becomes unsafe due to assignment of other roles.
For good or bad reasons, I decided a ship cannot be a candidate for both attack and defense at once. I thought it would be easier to balance priorities between colonizing vs fighting and attacking vs defending rather than everything at once, but cannot say whether it was better or not in the end.
For defense, only enemy ships near an allied docked ship are considered. For attack, for a long while all enemy ships were considered. At some point I experimented with being exclusively focused on doing economic damage and found a remarkable improvement in behavior and ranking. Ever since, only enemy docked ships are considered for attack.
An important detail of the strategic pass is it does a small alteration of the game state while computing roles. Given the current state, it looks at ships that will be produced by allied planets in the next 10 turns and adds them to the state using the position closest to map center regardless of ship proximity. Those ships will have a distance penalty assigned corresponding to 8 * numTurns. This means that every distance calculation in which these ships are involved get that penalty added as a distance. The implications of this alteration are considerable. It frees up a number of current ships to perform better roles, leaving some for future ships instead. For instance, it might choose not to bother defending and attack instead, or not send a ship to colonize a planet that will produce a ship to colonize with faster.
No prediction was done for the enemy as no behavior or ranking improvement could be found, sometimes even being detrimental.
Once every ship has been assigned a role, the tactical pass aims to use better moves than the one provided by the strategic pass, to still accomplish those roles but with better positioning.
Proper ship micromanagement makes a huge difference in Halite. Superior ship numbers in combat snowballs hard, avoiding useless fights lets ships focus on more important tasks and evading defenses can let ships harass better, etc. Given the combat mechanics it seemed really difficult to come up with a decision making structure similar to the other parts. Maybe some sort of tree search could help figure out the best movements, but the very large number of possible moves even with pruning, especially given the quantities of ships involved, did not seem feasible in any straightforward fashion.
So I started looking at a simpler version of the problem: if enemy movements are known in advance, how to position ships to take the most advantage of it? This was more or less my first iteration:
This gives a baseline for the current plan given by the strategic pass, which can then be iterated on to find moves giving a better score. I used hill climbing to refine the current plan:
With this iterative search the bot’s behavior was already massively improved. Ships would avoid losing battles by moving out of the way, or commit together to ensure the battle result would get even better. Emergent behavior, such as low health ships attempting to ram into an enemy, would happen as long the net health result is better. Afterwards it was all about adding various improvements to this basic idea.
So far it’s been assumed the enemy ship always move straight towards the closest target. It’s obviously not something that happens very often, so moves can be made that are very weak to other enemy responses. I experimented quite a lot with various probability models and ended up settling with this one. A set of 19 global enemy responses is precomputed, and the evaluation of each resulting simulation is summed up to make a single score. In other words, each tested move results in 19 different simulations and making sure the score is better on average. For enemy ship movement, I also disabled ship-ship collisions if the two ships involved belong to the enemy, in order to spend less time solving it and overestimate the enemy’s ability to group up together to get a more conservative prediction. Ship-planet collisions however are always maintained.
Unlike many other top bots, I did not have any specific code forcing ships to move closer together, as it created weird or undesirable behaviors. However I still used a clustering method to help the hill climbing get out of some traps. One problem I noticed is when a smaller mass of ships is engaged in combat with a bigger one, the hill climbing fails to move them individually out of danger, because the evaluation function only sees the short term impact. It thinks keeping them in the action will minimize the losses, failing to see the overall battle is lost.
So in addition to trying random actions on individual ships, it also tries applying a random action to all ships of a group and examines the result in the same way. It works rather well even if the grouping method is very naive:
For a more detailed version, please take a look at the link to FakePsyho’s Post-Mortem
Basically, it’s a function that given some state/decision/situation will evaluate it to (usually) a single number. Allowing us to order states/decisions/situations from best to worst. Designing solution around evaluation function means that you call your evaluation function with all potential choices and you select one that gives the best result. This means that if you want to give some type of decisions higher priority than to others, you need to bump up their values. If you have some machine learning experience, you can think of evaluation function as a model with high-level custom features and hand-crafted weights.
Evaluation function is generally used as a replacement for decision trees (random forest vs neural network analogy) or analytic solution. Good example of analytic vs evaluation is movement. You can try to calculate optimal move for ship directly (bad idea), or you can call your evaluation function for all possible angle/thrust values and select the best one. For example, the most naive pathfinding would be to evaluate all moves, for those that collide return -infinity and for those that don’t collide return -distance_to_target. And that’s already far superior to default navigation.
There’s a common problem in AI-battle tasks where you have N units to your disposal and each of them can perform some set of actions. There’s also a standard way of dealing with it. If the evaluation of (unit, action) doesn’t depend on results of other units, you can just iterate over all units and select the best action for each one. If it depends, things get slightly more complicated. When your evaluation function is comparable between different units, you can use global greedy assignment (not an official name of algorithm, but it’s quite descriptive).
while (len(units_without_actions) > 0) {
best_ship = null
best_action = null
best_eval = -infinity
for unit in units_without_actions {
for action in possible_actions {
eval = evaluate(unit, action)
if (eval > best_eval) {
best_eval = eval
best_unit = unit
best_action = action
}
}
}
update(best_unit, best_action)
}
For most problems this is going to work really well and will give result close to perfect matching. The only downside is that the algorithm now takes O(ships^2*actions) instead of O(ships*actions). But well, just learn to write fast code ;) Since this loop was expensive for me, I cached best action for each ship and recalculated it only when evaluation value for that action changed. This works, because my evaluation function could not improve after update() equivalent. Which means, if action gave the same value, it’s guaranteed that it’s still the best choice. Small trick but slashed down execution times by 50-100x in worst cases for whole phase.
There are dozens of small tricks within the code. I’ll give a list of some of the more interesting ones that I still remember:
For a more detailed version, please take a look at the link to Shummie’s Post-Mortem
Basically, it boils down to: Do not fight in a fight where you cannot win. Prior to this, most ships just attacked each other haphazardly. But, in reality, fighting for a tie does nothing. Detecting if you were outnumbered, or in a tie, meant that you should try retreating back to a friendly ship to outnumber the opponent. By the end, every top bot was doing this in some form or fashion.
@ewirkerman and I had decided to collaborate and create a NAP in 4 player games. Every game, we would send a signal in 4p games, and if the secret handshake was detected, we would not attack each other’s docked ships. For all intents and purposes, our ships were invisible to each other (or… at least that was the intent). Also, we would make sure that the enemy was completely eliminated before turning on each other. This would, in theory, guarantee that we were placed 1st and 2nd when we were on the same side of the map, and in theory, would still have some benefit if we were across or diagonal from each other.
We kept it hidden until the final two hours of the competition. Surprisingly, no one noticed, and we saw a definite spike in our win rates when we were in the same game. We had done a lot of local testing to make sure we had all the bugs worked out. I even uploaded my alliance dance code for about 2 weeks, but if it detected a positive, i would immediately crash. This would give us an idea of how often our signal would have a false positive. In two weeks of playing, there were 0 false positives.
It wasn’t always that easy actually. Our original signal caused me to crash in about 30% of my 4p games. However, we found out later that @ewirkerman actually had been sending his signal the entire time! Despite that, I saw games where I was crashing where he wasn’t in it, so we decided to make our signal slightly more complicated so that we would reduce the frequency of false positives.
A turn in my bot can be broken up into the following major steps which I’ll go into detail on:
The distractor bot was originally designed to be the “drone harass” bot. It was effective to send ships to docked ships and have it delay weaker bots from using new ships effectively and just use it to occupy 2-4 ships while trying to snipe off docked ships when possible.
As I talked about, the behaviors handle the strategic decisions, and the navigation handles the tactical function. My original navigation function for this role was the Attack_Docked_Ship_Avoid_All nav. That nav function basically always avoids getting into range of enemy undocked ships while trying to get into range of enemy docked ships. Eventually, I replaced this with what I call my dogfight nav function without changing the target selection of the distractor bot. Because of how I implemented the dogfight function, it still functions similarly to the old distractor bot behavior, but now if I have multiple distactors (or other combatants) nearby, it’ll know not to just try to avoid enemy ships if we have numeric advantage. This is how I get the “swarming” behavior where I target docked ships with a subset of ships and across the map instead of a single battlefront.
The 4-player version of this is almost identical, with the exception that we don’t want to target a docked ship that’s too far away from us since we should be focusing on the closest enemy.
Here’s a summary of how we evaluate moves in this function
References include: Post-Mortem Summary Page, ReCurs3, FakePsyho, Shummie