Imagine a mountain covered in fog
We recently asked Moultano if he could write up an explanation of how the improved skill learning formula in Hive 2 will work. It seemed too interesting not to post... so here it is.
--
Imagine a mountain covered by fog. You are lost somewhere on that mountain, and you are trying to get to the top. When you look around at your surroundings, you can only see a few feet around you. You look around, guess which direction goes most steeply uphill, and set off in that direction. While you are walking, the landscape is too confusing to tell whether you are still going uphill or not, so you walk for a fixed number of steps in that direction, and then stop to look around again. Eventually, by repeating this process of going up the steepest slope you can find, you hope to get to the top.
This is essentially what computers do when they use an optimization algorithm called “stochastic gradient descent.” (Strictly speaking, the algorithm walks downhill on an error surface rather than uphill, but the picture is the same one turned upside down.) While it’s one of the simplest optimization algorithms out there, it is what powers the deep learning revolution, which has achieved breakthrough performance on a wide range of artificial intelligence tasks. (To make the analogy more accurate, imagine that the mountain is in a million-dimensional space, and you can only see the ground directly under your feet.☺)
Hive uses the same optimization algorithm to improve its (extremely simple) model. Because of that, we can use research originally applied to deep learning to improve how it learns. The new implementation of Hive uses the method from the paper “Adaptive Subgradient Methods for Online Learning and Stochastic Optimization,” a.k.a. AdaGrad. Despite the intimidating name, it has an extremely simple implementation and intuition.
In our foggy mountain example above, AdaGrad modifies the distance and direction that you travel so that you go farther in directions you haven’t explored before. It does this by keeping a running total of how much each direction has already been adjusted, and shrinking the step in directions where that total is large. In Hive terms, that means new players will get much larger updates to their skill values than players that Hive already knows a lot about. This should dramatically reduce the number of games it takes for Hive to converge, and should help to mitigate smurfing, since a fresh account’s skill value can move quickly toward where it belongs.
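To make the intuition concrete, here is a minimal sketch of a plain gradient step next to an AdaGrad step. It is not Hive's actual code. The two-player setup, the gradient values, the learning rates, and the names `sgd_step`, `AdaGrad`, and `demo` are all assumptions made up for illustration. The only part taken from the AdaGrad method itself is the rule of dividing each coordinate's step by the square root of its accumulated squared gradients.

```python
import math


def sgd_step(params, grads, lr=0.1):
    """Plain gradient step: every coordinate uses the same fixed rate.

    The minus sign means we walk downhill on a loss; a negative gradient
    therefore pushes the value up.
    """
    return [p - lr * g for p, g in zip(params, grads)]


class AdaGrad:
    """Per-coordinate step sizes that shrink as a coordinate gets updated.

    Illustrative sketch only; lr and eps are assumed values.
    """

    def __init__(self, n, lr=1.0, eps=1e-8):
        self.lr = lr
        self.eps = eps                # avoids division by zero
        self.sum_sq = [0.0] * n       # running sum of squared gradients

    def step(self, params, grads):
        new_params = []
        for i, (p, g) in enumerate(zip(params, grads)):
            self.sum_sq[i] += g * g
            scale = self.lr / (math.sqrt(self.sum_sq[i]) + self.eps)
            new_params.append(p - scale * g)
        return new_params


def demo():
    """Compare how far a veteran and a newcomer move on the same gradient."""
    opt = AdaGrad(n=2)
    skills = [0.0, 0.0]  # index 0: veteran, index 1: newcomer (hypothetical)

    # The veteran has 99 earlier games with alternating results.
    # The newcomer did not play, so their gradient is zero.
    for k in range(99):
        g = 1.0 if k % 2 == 0 else -1.0
        skills = opt.step(skills, [g, 0.0])

    before = list(skills)
    # One more game in which both players receive the same gradient.
    skills = opt.step(skills, [-1.0, -1.0])
    veteran_move = skills[0] - before[0]
    newcomer_move = skills[1] - before[1]
    return veteran_move, newcomer_move


if __name__ == "__main__":
    veteran_move, newcomer_move = demo()
    print("veteran moved: ", veteran_move)
    print("newcomer moved:", newcomer_move)
```

In this toy setup the veteran's skill moves by roughly 0.1 and the newcomer's by roughly 1.0 on an identical gradient, because the veteran has accumulated 100 squared gradients and the newcomer only one. A plain `sgd_step` would move both by the same amount.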