Open Editorial

The #fsmgate Scandal and What it Means for Your AI Architecture

Alex J. Champandard on March 28, 2012

Every year at GDC, the AI Dinner takes place on the Friday, organized unofficially by Neil Kirby as the fourth roundtable. As a tradition, all the tables in the restaurant are covered with a layer of paper over the cloth, providing a great place to scribble schematics during heated debates and arguments. Anywhere else you'd be dismissed as a crazy mathematician or software engineer for scribbling, but in this setting you'll draw a crowd!

I take it upon myself to uphold the yearly tradition, as writing down notes tends to stimulate better discussions! Last year's topic was a solution to procedural animation; the year before, it was understanding future hardware and its impact on AI code. This time, however, was the most controversial of them all, and there wasn't much competition from the other tables this year either :-)

Photo 1: One of this year's AI Dinner tables, with Graham Pentheny, Alex Champandard (me) and Steve Rabin from left to right. (Photo by Neil Kirby.)

The Premise

What started the debate this year was a discussion of hierarchical task networks (HTN) and to some extent behavior trees (BTs), in particular how they handle interruptions such as responding to a grenade nearby, catching fire, or even heavy hit responses. When you have one large tree, the point of execution ends up jumping from one context to another, as interruptions will typically trigger higher priority branches in the tree itself.

The disadvantage of this approach is that the entire context for the behavior is lost as the interruption happens. In fact, most often the entire behavior 'stack' will be reset as a top-level survival branch is engaged. If you use this approach, you'll find:

  1. It prevents resuming the exact same behavior or plan once the interruption is over. Often, it may be acceptable or desirable to re-evaluate the entire tree after a survival interruption, but in other cases it may cause common problems such as hesitation or oscillation.
  2. This approach also means you can't have multiple behaviors running in parallel. The tree can only pursue one option at a time, so a character can't stagger into cover while on fire, for example.

Obviously, there are benefits to having one large tree (despite these problems) but I think it's often worth questioning assumptions!

Photo 2: Table sketch of a simplified behavior tree, with one main combat sub-tree (right) and a survival tree (left). Interruptions would cause the point of execution to jump from the main tree to a different higher priority leaf. (Minor changes were made for display purposes.)
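To make the problem concrete, here's a minimal behavior-tree sketch in Python. All the names, and the tick/status convention, are illustrative rather than taken from any particular engine: a high-priority survival branch pre-empts the combat sequence, the sequence's execution context is reset, and it restarts from scratch afterwards.

```python
class Action:
    """Leaf that needs `duration` ticks to complete, guarded by a condition."""
    def __init__(self, name, duration, condition, log):
        self.name, self.duration = name, duration
        self.condition, self.log = condition, log
        self.progress = 0

    def tick(self):
        if not self.condition():
            return "FAILURE"
        self.log.append(self.name)
        self.progress += 1
        if self.progress < self.duration:
            return "RUNNING"
        self.reset()
        return "SUCCESS"

    def reset(self):
        self.progress = 0

class Sequence:
    """Runs children in order, remembering which child is active."""
    def __init__(self, *children):
        self.children, self.current = children, 0

    def tick(self):
        while self.current < len(self.children):
            status = self.children[self.current].tick()
            if status != "SUCCESS":
                if status == "FAILURE":
                    self.reset()
                return status
            self.current += 1
        self.reset()
        return "SUCCESS"

    def reset(self):
        self.current = 0
        for child in self.children:
            child.reset()

class PrioritySelector:
    """Re-evaluates children top-down every tick; when a higher-priority
    branch activates, the previously running branch is reset, so its
    execution context is lost."""
    def __init__(self, *children):
        self.children, self.running = children, None

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status == "FAILURE":
                continue
            if self.running not in (None, child):
                self.running.reset()          # the old branch loses its place
            self.running = child if status == "RUNNING" else None
            return status
        self.running = None
        return "FAILURE"

world = {"grenade": False}
log = []
combat = Sequence(Action("move_to_cover", 3, lambda: True, log),
                  Action("shoot", 2, lambda: True, log))
survival = Action("dive_away", 2, lambda: world["grenade"], log)
root = PrioritySelector(survival, combat)

root.tick(); root.tick()          # combat: two ticks of move_to_cover
world["grenade"] = True
root.tick(); root.tick()          # survival branch pre-empts, combat resets
world["grenade"] = False
root.tick()                       # move_to_cover restarts from scratch
```

After the grenade is dealt with, the log shows `move_to_cover` starting over from the beginning: the two ticks of progress towards cover were thrown away with the rest of the combat context.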

Vertical Layering

One common way of solving the problem is creating multiple instances of the tree to deal with interruptions. So instead of jumping around from one combat branch to another survival branch when something unexpected happens, you let the combat branch run in the background and return to it later. Effectively, the top-level node in your traditional behavior tree becomes a stack where all child branches are kept in memory, but only one runs at a time.

“Let the combat branch run in the background and return to it later.”

Brett Laming's MARPO architecture in particular, described in AI Wisdom 4, does this. The idea of using a stack to push execution contexts has proved useful for more traditional FSM implementations over the years, giving those architectures a much-needed boost to keep up with modern alternatives. The advantage of Brett's approach is that it combines the stack with a more capable BT-style implementation.

Taking this approach to the extreme would involve removing the stack and single thread of execution, instead having all the branches run in parallel and arbitrate with each other at runtime to decide which gets priority when conflicts arise. Sound crazy enough to work? That's similar to what Façade does, relying on multiple trees to run behaviors in parallel that intermix at runtime.
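Here's a rough sketch of the stack idea in Python (the task names are hypothetical, and this is my own simplification rather than Brett's actual API): an interruption pushes a new task on top, and the suspended task below keeps its state, resuming exactly where it left off once the interruption completes.

```python
class Task:
    """A task that needs `duration` ticks of exclusive execution."""
    def __init__(self, name, duration):
        self.name, self.remaining = name, duration

    def tick(self):
        self.remaining -= 1
        return self.remaining <= 0    # True once finished

class TaskStack:
    """Top-level stack: interruptions push a new task; the suspended
    task below keeps its state and resumes when the top one finishes."""
    def __init__(self):
        self.tasks, self.log = [], []

    def push(self, task):
        self.tasks.append(task)

    def tick(self):
        if self.tasks:
            top = self.tasks[-1]      # only the top of the stack runs
            self.log.append(top.name)
            if top.tick():
                self.tasks.pop()      # resume whatever was suspended

ai = TaskStack()
ai.push(Task("attack_target", 4))
ai.tick(); ai.tick()                  # attack underway (2 of 4 ticks done)
ai.push(Task("dodge_grenade", 2))     # interruption pushed on top
ai.tick(); ai.tick()                  # survival runs; attack is suspended
ai.tick(); ai.tick()                  # attack resumes where it left off
```

Contrast this with the single-tree case: after the dodge, the attack task finishes its remaining two ticks instead of starting over.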

Horizontal Layering

If you continue thinking about layering the entire AI architecture and running parts in parallel, why not do it horizontally instead? In this case, you'd have the AI split into functionality that starts at the low-level and progresses to the high-level. For instance, this could include animation logic, locomotion and posture logic, behavior logic, combat logic, and strategic logic.

In general, layering your architecture into at least two layers to separate the high-level AI from the low-level animation is standard practice today. But why not also separate navigation details from the strategy, or the combat tactics from the posture or locomotion management? In fact, many games already have such standalone systems running in parallel to manage things like weapon aiming or head look, and the concept could trivially be applied elsewhere.

Photo 3: An architecture that exhibits horizontal layering, with the AI decomposed into its various functional components. Orders are given from top to bottom, and notifications flow back. (Minor changes were made for display purposes.)

Back to Interruptions...

The benefit of horizontal layering is that it not only makes the architecture much more obvious and elegant, it also resolves the interruption problem. If something unexpected does happen, for example catching fire, then the component responsible for it can deal with that in parallel. For instance, catching fire could be dealt with by changing the locomotion cycle but continuing to navigate, or playing a custom animation that halts and resumes the path later.

Each event from the world is dealt with locally by components, and in turn sends more notifications to higher-level components that depend on it. Each one of those other components is also free to deal with the problem as it sees fit, in parallel with the rest of the behavior continuing to run...
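A toy version of such a layered pipeline might look like the following Python sketch (the component names, events, and reactions are all made up for illustration): each layer reacts locally to the events it cares about, every event is forwarded upward as a notification, and layers with nothing to do simply carry on.

```python
class Component:
    """One horizontal layer. Events it has a reaction for are handled
    locally; every event is also forwarded upward as a notification,
    so the layers above are free to react, or to keep running as-is."""
    def __init__(self, name, reactions=None):
        self.name, self.reactions, self.above = name, reactions or {}, None

    def on_event(self, event, log):
        if event in self.reactions:
            log.append(f"{self.name}: {self.reactions[event]}")
        if self.above is not None:
            self.above.on_event(event, log)   # notify the layer above

# Bottom-to-top chain of components.
animation  = Component("animation",  {"caught_fire": "blend in on-fire overlay"})
locomotion = Component("locomotion", {"caught_fire": "keep following the path"})
behavior   = Component("behavior")            # unaffected, keeps running
animation.above, locomotion.above = locomotion, behavior

log = []
animation.on_event("caught_fire", log)
```

Catching fire changes the animation overlay and nudges the locomotion, but the behavior layer above never has to abandon what it was doing.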

Finite State Machines?

By the time you've split your architecture into sensible horizontal layers, even with each component built to deal with all possible external interruptions, the resulting AI logic remains extremely simple. Simple enough, in fact, for a finite state machine to handle everything! Hopefully now you can understand the message that I scribbled onto the table cover.

Photo 4: Assuming architectural improvements reduce the complexity of AI logic, are behavior trees now obsolete? Semi-drunken notes scribbled onto the table...
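To see how little machinery is left per component, here's a tiny table-driven FSM sketch in Python; the states and events are invented for illustration, standing in for a locomotion component like the one above.

```python
class FSM:
    """Table-driven finite state machine: events without a matching
    transition leave the state unchanged."""
    def __init__(self, initial, transitions):
        self.state, self.transitions = initial, transitions

    def handle(self, event):
        self.state = self.transitions.get((self.state, event), self.state)
        return self.state

# Hypothetical locomotion component: four transitions cover everything.
locomotion = FSM("idle", {
    ("idle",           "move_order"):  "following_path",
    ("following_path", "caught_fire"): "panic_run",
    ("panic_run",      "fire_out"):    "following_path",
    ("following_path", "arrived"):     "idle",
})

locomotion.handle("move_order")
locomotion.handle("caught_fire")
locomotion.handle("fire_out")       # back to following the path
```

Because each component only sees the events relevant to its own layer, the transition table stays small enough to read at a glance, which is exactly the point of the architectural split.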

Lessons Learned

This discussion emphasizes the importance of AI architecture above all. With the right architecture it's possible to build entire AI systems around simpler techniques like finite state machines. FSMs can't handle the complexity of character behaviors as they scale in size and number, but for isolated systems with very specific purposes they're an obvious choice.

For instance, in his classic 2005 presentation on Goal-Oriented Action Planning (GOAP) in F.E.A.R., Jeff Orkin built a system with an FSM sitting on top of a planner. While we've learned a lot from the planning layer over the years, now's the time to recognize the importance of the finite-state machine layer! A few years later, Dmitriy Iassenev got around some of the problems with GOAP in S.T.A.L.K.E.R. by layering the planners horizontally, again illustrating the benefit of this approach for reducing complexity.

What Next?

  • Work has already started in the Labs on integrating the ideas from this discussion into designs for behavior trees that work better in practice. If you'd like to find out more about BTs, we're holding an advanced day-long workshop on September 17th. Subscribe here for details shortly.
  • If you want to attend a European alternative to the AI Dinner before the next GDC, then look no further than our very own Game/AI Conference 2012, to be held in Vienna this year. Tickets go on sale very soon!


Special thanks to Troy Humphreys and Mike Lewis for fueling the debate with their generous beer donations! Additional credit goes to Daniel Brewer for stimulating the discussions, sitting opposite me almost every year :-) The photo of our AI Dinner table is courtesy of Neil Kirby.

If you have any thoughts on the topic, please feel free to post them below or in the corresponding Game/AI Forum thread!

Discussion (3 Comments)

jcothran on March 29th, 2012

While you're listing competing subsystems, leave placeholders for morality, decay and Schrödinger's cat as well - so when I'm playing GTA #451 and set someone on fire in public, maybe that registers or is forgotten by the agents depending on whether any server is actually measuring the game.

yngvi on April 16th, 2012

Witch-hunting FSMs

We shouldn't demonize FSMs as a heretical representation of game AI logic. True, they don't scale well when employed to model a whole large complex system; they don't fit that task due to the explosion in the number of states and transitions. But there the fault lies in a bad choice of tool for the task at hand. You wouldn't use a screwdriver to hammer down nails. FSMs are not intrinsically bad, they just work better for certain purposes. Generally they are suited to expressing a well-confined game mechanic or piece of logic, as described in the article.

I advocate embedding FSMs inside BTs. FSMs can become a new type of behavior node, along the same lines as Sequences, Parallels and Selectors, able to mutate their internal logic based on simple conditions and state transitions, and therefore select the proper behavior branch for execution. One can think of such an FSM as a smarter version of a Selector. This type of FSM Selector works particularly well for expressing higher-level goal selection logic, which tends to happen closer to the BT's root.

Claudio Pedica

triplefox on June 14th, 2012

This discussion seems remarkably similar to historical ones surrounding entity systems, in which there was a gradual movement from monolithic data structures to inherited classes and then to components. In the entity discussion, the problem was one of reusing the same behaviors in different contexts, and the resolution (to date) has been to architect mostly in terms of protocols, not data structures. If the components can communicate well with each other, it becomes easier to examine and debug each subsystem. Protocols are enablers of large-scale concurrency, as demonstrated by the internet. In that context it's not too surprising to see solutions of a similar nature arising in AI: instead of walking through a single graph to determine behaviors, we're turning towards more of a dataflow approach, allowing nodes to become internally sophisticated and at the same time remain independent.
