
Monte-Carlo Tree Search and the Dagstuhl Seminar

Alex J. Champandard on May 11, 2012

This week, the world's leading researchers in the field of Game AI have gathered at Schloss Dagstuhl in Germany to consider future applications and what needs to be done to bring those ideas to life. Representatives from industry were in the minority, but that's why AiGameDev.com was there to balance things out!

The goals of the seminar are to produce a roadmap for future research, as well as to take part in impromptu AI coding competitions, which we'll no doubt cover on the blog in the future. In the meantime, it's interesting to look at the current state of academic research, and in particular its current trends. MCTS is one of them.

Monte Carlo Tree Search


MCTS is a very recent technique that has captured the attention of academia only within the past 5-6 years. Some AI departments now have over 80% of their researchers working on it, and major projects are currently underway on the topic.

What's most interesting, however, is how incredibly simple the technique is in practice! It involves trees, search, statistics and random numbers. Peter Cowling, Professor at the University of York, elaborates:

“It's such a silly idea when you think of it. We can't solve the problem so let's randomly sample it. It doesn't feel formal enough, but it works!”

It's the proof that the search converges (at a polynomial rate) given enough samples that has captured the interest of the academic community. In fact, this insight was developed independently by three different people in 2003-2004, and by 2006 it had created a field in its own right.

What's the magic? How can such a simple technique get good results? Peter Cowling continues:

“The formula that you use to decide where you explore next applies the traditional exploitation/exploration tradeoff. The UCB policy for instance manages to do so in a wide variety of situations.”

There's a whole class of policies that guide the tree search with the aim of minimizing regret, so you randomly sample the tree in a way that strikes a good balance. Upper Confidence Bound (a.k.a. UCB) is one of the earliest such policies, and it's the one that largely stimulated research in this area.
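
As a rough illustration of what such a policy looks like, here is a small UCB1 selection sketch in Python. The node fields (visits, wins, children) are hypothetical names for the statistics a search node would carry, and the sketch assumes every child has already been visited at least once, since unvisited children are normally tried first.

    import math

    def ucb1_select(parent, c=1.4):
        # Score each child by its average reward (exploitation) plus a bonus
        # that grows with the parent's visit count and shrinks as the child
        # itself gets visited more (exploration).
        def ucb1(child):
            exploit = child.wins / child.visits
            explore = c * math.sqrt(math.log(parent.visits) / child.visits)
            return exploit + explore
        return max(parent.children, key=ucb1)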

How It Works

As a short overview, here's how MCTS works in practice (a rough code sketch follows the list):

  1. Construct a state/action graph for the game.
  2. Pick actions randomly until the game's end.
  3. Assess the final state as positive or negative.
  4. Propagate statistics back up the searched tree.
  5. Jump to 2., using statistics to guide traversal.

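To make the loop above concrete, here is a minimal sketch in Python. The game-state API (copy, legal_actions, apply, is_terminal, result) is hypothetical, standing in for whatever game you plug in, and it takes a single-player view of the result; a real two-player implementation would flip the result at alternate plies.

    import math
    import random

    class Node:
        def __init__(self, state, parent=None, action=None):
            self.parent = parent
            self.action = action                       # move that led here
            self.children = []
            self.untried = list(state.legal_actions()) # moves not yet expanded
            self.visits = 0
            self.wins = 0.0

    def mcts(root_state, iterations=1000, c=1.4):
        root = Node(root_state)
        for _ in range(iterations):
            node, state = root, root_state.copy()

            # Selection: walk down the tree, using the statistics gathered
            # so far (UCB1) to guide traversal (step 5).
            while not node.untried and node.children:
                node = max(node.children, key=lambda ch:
                           ch.wins / ch.visits +
                           c * math.sqrt(math.log(node.visits) / ch.visits))
                state.apply(node.action)

            # Expansion: add one untried action as a new leaf; the
            # state/action graph is built lazily rather than up front (step 1).
            if node.untried:
                action = node.untried.pop()
                state.apply(action)
                child = Node(state, parent=node, action=action)
                node.children.append(child)
                node = child

            # Simulation: pick actions randomly until the game's end (step 2).
            while not state.is_terminal():
                state.apply(random.choice(state.legal_actions()))

            # Backpropagation: assess the final state (step 3) and push the
            # statistics back up the searched tree (step 4).
            result = state.result()                    # e.g. 1.0 win, 0.0 loss
            while node is not None:
                node.visits += 1
                node.wins += result
                node = node.parent

        # After the budget is spent, play the most-visited root action.
        return max(root.children, key=lambda ch: ch.visits).action
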
For more details, see this recently published survey paper and the mcts.ai website.

Applications in Practice

Monte Carlo tree search has seen considerable success when applied in practice. Simon Lucas, Professor at the University of Essex, explains:

“MCTS has had most success in board games, in particular Go and Hex. It's especially good for games where you place a counter and it remains on the board until the end of the game.”

As an AI technique, MCTS aims to minimize the need for complex hand-crafted heuristics, like those necessary to play chess, and it does an increasingly good job of it. However, as Simon Lucas points out, it's still the heuristics that take the technique from average to great.

“You have this tension between the ideal of getting away from manual heuristics, and making it work well on very specific problems. That's what happens in the computer Go field.”
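
One common way to fold such heuristics in is to bias the random playouts instead of leaving them uniformly random. Here is a small sketch reusing the hypothetical state API from above, with heuristic_score standing in for whatever domain knowledge is available:

    import random

    def biased_rollout(state, heuristic_score, epsilon=0.25):
        # Play the game out, mostly following a cheap heuristic but keeping
        # some randomness so the playouts still sample the game broadly.
        while not state.is_terminal():
            actions = state.legal_actions()
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: heuristic_score(state, a))
            state.apply(action)
        return state.result()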

In contrast, the relatively new field of general game playing (GGP) tries to find general AI techniques that apply well to many domains, and MCTS is the strongest performer in this category too, even compared to minimax or learned policies.

Summary

MCTS has already shown a lot of promise in practice, in particular by reaching high-level amateur play in Go. There is also an increasing number of applications in traditional card and board games. Researchers here are keen to apply MCTS to 3D video games as well, though that's still an open area of research!

Discussion (1 Comment)

Frederico Nogueira on February 18th, 2013

How is this technique different from Reinforcement Learning? I had never heard of it, but reading the short overview it seems very similar to RL.
