Unless you're on some kind of social media and entertainment news diet, you've no doubt heard about the Infinite Mario AI competition, and Robin Baumgarten's A*-based solution. His videos became overnight hits on Reddit and gaming news sites, and his work was even featured in the mainstream press in various countries around the world!
Since Robin is a long-time reader and an active participant in the forums here at AiGameDev.com, I felt obliged to pick his brain on the subject and go into more detail about his prototype! This article is the definitive interview about his Infinite Mario AI, from its background and implementation to the upcoming contest and possible competitors...
NOTE: Some of the questions below were asked by active members of the #gameai IRC channel on irc.freenode.net. It's our official channel for discussing artificial intelligence in games in general. Don't hesitate to join us!
Alex Champandard: Hi Robin! Now that you're a famous Ph.D. research rockstar, I guess your schedule is very busy. Thanks for taking the time to answer these questions. :)
Robin Baumgarten: Hi Alex! Thanks for giving me an opportunity to present my take on this Mario AI! Also, I'd say one Mario AI doesn't make a rockstar, it is just a lucky combination of AI applied to a really famous video game plus some Reddit/YouTube updraft that made it so well known. A lot of the interest comes from people who are unfamiliar with AI techniques and are easily enthused. In this respect, I'm quite happy to be able to show them what AI in games can do these days.
AC: The first time I read the Mario AI contest page, I got the impression that Julian Togelius was particularly keen on promoting techniques based on computational intelligence and machine learning. Based on the way the contest was defined, my instinct told me that someone being pragmatic using classical artificial intelligence (even game AI *gasp*) could probably do better at this!
Then I saw your video and thought, "Aha! He must be someone from industry; it shows with the practical down-to-earth approach." So it quite surprised me to find out it was you — someone I knew doing research in academia :-)
RB: My train of thought was quite similar to yours, actually. However, I think these relatively simple problems are well suited to be tackled by computational intelligence and machine learning. They provide a simple enough interface, but a complicated enough game mechanic to be not immediately obviously solvable. Machine learning techniques are really difficult to handle for complicated (or even real-life) problems, and that's why games can be an ideal step between academic toy example and real-life application.
AC: Do you think this contest leaves enough room for less proven techniques, or techniques that aren't as pragmatic as yours? Have you won this already?
RB: By allowing all kinds of entries for the competition, the organiser Julian Togelius wanted to compare how different techniques can be applied to a problem, and I think he succeeded in showing that, but maybe not in the way he intended. The pragmatic use of a down-to-earth pathfinding algorithm works here because the game is entirely deterministic and the physics engine is not a black box. That is terrain where planners are very strong, and machine learning algorithms (and similar non-optimal and evolutionary techniques) are bound to perform worse. I doubt any entry that does not predict the game state and plan ahead can win this; however, there are now entries popping up that seem to copy my solution exactly. That's probably the price for publishing a video of my solution early :-)
AC: To be a bit more controversial, would you have to redefine the rules to justify the use of computational intelligence?
RB: I definitely think the rules of the competition have to be altered to allow a fairer competition. As soon as the typical elements of success for a planner are taken away — say determinism and known physics — computational intelligence entries are the way to go, I think. The two don't exclude each other though; I could imagine a learning algorithm that learns to predict the future state, which is then fed into a traditional planner.
AC: Let's dig into your implementation a little more.
#gameai: Are you planning against the entire level, or just the rightmost edge of the screen?
RB: The interface for the competition only provides information that is also visible on screen, i.e. enemies and level details beyond the right edge of the screen are unavailable. So the algorithm only plans a second or so ahead at each search iteration.
AC: When do you replan and how do you integrate the new plans with the current plan?
RB: To extend the amount of CPU time I can spend on planning (without getting slower than real-time), I spread the search out over several frames of the game; in the current version, that is 4 frames. In these 160ms, Mario can move a considerable amount, about a fourth of the entire screen. To have more time to react when a new enemy or a gap appears on the right side of the screen, I trigger an immediate replanning event whenever a new obstacle is perceived. I have found that an acceptable plan is found quickly enough to entirely replace the previous plan, so no plan repair is required.
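A minimal sketch of this replan-every-few-frames scheme might look like the following. Everything here is hypothetical: the action strings, the plan queue, and the synchronous `replan()` stand-in are illustrative inventions, and the real version interleaves A* search iterations with execution rather than searching in one go.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PlanScheduler {
    static final int FRAMES_PER_PLAN = 4; // search budget is spread over 4 game frames

    private final Deque<String> currentPlan = new ArrayDeque<>(); // queued keypresses
    private int framesSincePlan = 0;

    // Called once per game frame; a newly perceived obstacle forces an
    // immediate replan instead of waiting out the 4-frame budget.
    String nextAction(boolean newObstacle) {
        framesSincePlan++;
        if (newObstacle || framesSincePlan >= FRAMES_PER_PLAN || currentPlan.isEmpty()) {
            replan();
            framesSincePlan = 0;
        }
        return currentPlan.poll();
    }

    // Stand-in for the incremental A* search: a real implementation would
    // return the best action sequence found so far, not a fixed plan.
    private void replan() {
        currentPlan.clear();
        for (int i = 0; i < FRAMES_PER_PLAN; i++) {
            currentPlan.add("RIGHT+SPEED");
        }
    }
}
```

The key point is that the old plan is discarded wholesale on each replan, which is what makes plan repair unnecessary.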
AC: Did you reuse any code, e.g. for the A* or representation of the problem?
RB: The A* algorithm was written from scratch, as it is a simple enough algorithm not to require too much work. However, for the simulation I completely reused the game engine, which was made available by the competition organisers. I stripped the rendering code and almost immediately had a working simulation. Given that the Mario physics are really simple (only about 200 lines of simple arithmetic), performance was not an issue here. I can imagine that this would fail with larger games such as 3D shooters. One of the biggest hassles during this project was actually the Java serialisation code to store states while I was exploring the search space. I don't think this would have been easier in C++, though; it's just an artifact of trying to beat the existing physics engine into a planning algorithm.
#gameai: How does your heuristic work? Are you taking the benefit of power-ups or anything else into account?
RB: The main part of the heuristic is a simple time constraint: try to get to the right of the screen as fast as possible. Therefore, the path-cost function is the time spent since the start of planning, and the heuristic is "given maximum acceleration, how long does it take to reach the goal". As the distance to the goal isn't known, an arbitrary (large) constant is used. In that sense, I'd say the heuristic is not quite admissible (an admissible heuristic never overestimates the cost to reach the goal), but because the overestimation is the same across the entire search space, this isn't a problem. To avoid running into enemies or falling into gaps, a high penalty is added to the heuristic for such states, so they do not get chosen as the next node by the search algorithm.
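As a rough illustration of this time-based heuristic, the sketch below estimates the number of frames needed to cover the remaining horizontal distance under constant maximum acceleration, and adds a large penalty for states where Mario gets hurt. All constants and the one-dimensional acceleration model are illustrative guesses, not values from the actual Mario engine.

```java
class TimeHeuristic {
    static final double MAX_SPEED = 10.9;        // assumed max horizontal speed (px/frame)
    static final double ACCEL = 0.6;             // assumed horizontal acceleration (px/frame^2)
    static final double DAMAGE_PENALTY = 1e6;    // effectively prunes nodes where Mario is hurt

    // Optimistic number of frames to cover 'distance', starting at 'speed'
    // and accelerating maximally every frame. Since every node targets the
    // same (arbitrary, large) goal distance, any overestimation is uniform
    // across the search space and does not distort the node ordering.
    static double framesToTarget(double distance, double speed) {
        double frames = 0;
        while (distance > 0) {
            speed = Math.min(MAX_SPEED, speed + ACCEL);
            distance -= speed;
            frames++;
        }
        return frames;
    }

    // f(n) = g(n) + h(n): elapsed planning time plus the optimistic
    // remaining time, plus a huge penalty for damaging states.
    static double estimate(double elapsedFrames, double distToGoal,
                           double speed, boolean hurt) {
        return elapsedFrames
                + framesToTarget(distToGoal, speed)
                + (hurt ? DAMAGE_PENALTY : 0.0);
    }
}
```

Note how a node where Mario takes damage is not forbidden outright; it just costs so much that the search will only pick it when every alternative is worse.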
#gameai: Can you envisage applying your solution to generate a behavior that maximizes points? Would there be any challenges there?
RB: This competition only rewards passing a level and ignores all other metrics such as score and enemies killed, so I completely ignore these too (except when I'm forced to notice them to keep Mario from getting hurt).
As a pure pathfinding problem, trying to get as many points as possible will be much more difficult to solve though, as it is not clear what the heuristic and immediate goals should be. However, I could imagine adding a planner on top of the pathfinding which guides Mario to each of the coins, boxes and enemies successively and lets him kill them. This would not be optimal at all, but because there is no time-penalty for taking longer (within the 180 seconds allocated for each level) it should work. It would be interesting to see if it's possible to optimise it such that Mario keeps jumping on Bullet Bills to get more points until the time almost runs out and then rushes towards the goal.
AC: While working on the motion planner (free registration required) in our AI Sandbox, we quickly discovered that A* sucks at search problems in continuous space. Basically you just end up with a tree to search (states are rarely ever reused, so there are no junctions to make it a graph). So A* doesn't shine in these cases... How does this problem space compare?
RB: This is true for this problem as well. While time is nicely discretised by the framerate, space is (virtually) continuous in Mario. This creates a search space which, as you mentioned, is too large to allow a visited-states list, and even if there were one, minute changes in Mario's position would create new states (for example, after walking forward and then backward, friction and momentum mean Mario does not end up at the same spot). I tried to tackle this with a discretised visited-states list, but it didn't prove to be a promising approach. Luckily, with the current level design created by Infinite Mario (the competition software), replanning and backtracking are rarely required. This would be more difficult with a complex level design that involves dead ends.
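For what it's worth, a discretised visited-states check of the kind Robin describes trying might look like the sketch below: the continuous (x, y, vx) state is bucketed into a coarse grid so that near-identical states collide. The cell sizes and the choice of state fields are illustrative guesses (and he found the approach unpromising in practice).

```java
import java.util.HashSet;
import java.util.Set;

class VisitedStates {
    static final double POS_CELL = 2.0;   // pixels per position bucket
    static final double VEL_CELL = 0.5;   // px/frame per velocity bucket

    private final Set<Long> seen = new HashSet<>();

    // Packs the three bucket indices into one long. The 20-bit masks can
    // alias for extreme or negative coordinates; good enough for a sketch.
    static long key(double x, double y, double vx) {
        long xi = (long) Math.floor(x / POS_CELL);
        long yi = (long) Math.floor(y / POS_CELL);
        long vi = (long) Math.floor(vx / VEL_CELL);
        return ((xi & 0xFFFFF) << 40) | ((yi & 0xFFFFF) << 20) | (vi & 0xFFFFF);
    }

    // Returns true the first time a bucket is seen, false afterwards,
    // so the caller can prune re-expansions of near-identical states.
    boolean markVisited(double x, double y, double vx) {
        return seen.add(key(x, y, vx));
    }
}
```

The downside Robin alludes to is exactly this bucketing: two states in the same cell can still have subtly different momentum, so pruning one of them may discard the only path that actually works.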
AC: Do you think A* is a good approach for this problem? Would you consider a different search algorithm that's better suited to continuous domains?
RB: I think we would either have to change the algorithm or transform the domain to reach a good approach. For the latter, there exist lots of techniques that abstract the space into linked local clusters and apply a hierarchical search on them. A continuous search space is often too big to search through exhaustively or optimally, especially if there are also moving obstacles, so that rougher heuristics and more reactive methods (such as anti-gravity movement, for example) are called for here, I think. I haven't really looked into these a lot, though.
AC: Are there situations where stopping and waiting would be a good choice?
RB: In Mario, this occurs only rarely, for example when a path is temporarily blocked by an enemy such as a Piranha Plant or a Bullet Bill. Because all other movement would lead to a very high cost (the heuristic penalizes getting hurt), the solution might be to just wait, even though it does not decrease the distance to the goal. In my implementation, this behavior emerges once in a while, but is not required often.
AC: Your prototype has been hugely popular; how do you interpret that? Is there more interest in AI & games these days?
RB: In part I think this popularity comes from observing something very well known and loved, such as a video game classic, and transforming it into something new. In this case, almost everyone can relate to playing Mario and knows how difficult (or not) it is to play. For many people, it is then very astounding and surprising to see a computer playing through it with no visible effort and at break-neck speed, even though the algorithm behind it is much simpler than the AI programmed into most modern games, which calculate probability models of whether opponents can see the player and so on. I have gotten a lot of comments from students who say they'd love to program similar AIs (or regret not taking the AI course in second year), so I think there's definitely a higher interest in AI for games these days. Of course this also creates a higher expectation of quality AI in future games, so I don't think we can rest on our laurels, but have to keep up with the cutting edge of AI to not disappoint our players. I think there's also one or two blogs about that on the Internet, but I can't think of their names right now...
AC: In a way, the press preceding this event has already made it impactful! However, from a technical perspective, what do you think about the impact of the competition? What do you expect we'll learn from this?
RB: I don't think the competition itself will lead to new techniques for solving (game) AI, however it will definitely be a good opportunity to look beyond one's own nose/favorite algorithm and see what other fields of research can bring to the table. Many conferences have competitions like this, and usually they are barely noticed outside academia (which might be our own fault) so I am not sure if this one will be any different. However, as the academic field gets more mature, we will see more elaborate games used as competitions, and fancier algorithms used as their solutions. If not already, I think these competitions might soon prove to be a treasure trove, also and especially for game developers.
Many thanks to Robin for taking the time to answer these questions. If there's anything else you'd like to ask him, don't hesitate to post in the forums!