
Memento, Temporal Coherence and Debugging Planners

Alex J. Champandard on December 9, 2007

How does the hit film Memento, directed by Christopher Nolan and released in 2000, relate to artificial intelligence and planning? Surprisingly, understanding the mindset of the main character, played by Guy Pearce, can help you make the most of planners. (It's also an amazing film! See Amazon U.S., U.K.)

In the games industry in particular, planners have a reputation for being hard to debug. Even once you have your core algorithm fully tested, it’s often time-consuming to figure out exactly how new situations cause the planner to generate a particular solution.

Similarly, the protagonist in Memento has lost the ability to form short-term memories since his wife was killed. In effect, he has to rethink everything from scratch every few minutes. Can you see how this quickly becomes a problem?

What is Temporal Coherence?

In this context, temporal coherence is the idea that very few things will change from one moment to another. In a world that respects the laws of physics, this is often the case as objects can’t move that fast. Changes happen incrementally, and life on earth has evolved to be able to deal with this. You don’t have to process the large quantities of information in the world; you just process the changes.

Normally, humans don’t have to rethink everything from scratch every few seconds. Their decisions from one moment to another are also likely to be temporally coherent. This has two benefits:

  • It’s much easier to make decisions and generally reason in a logical fashion.

  • It’s much easier for someone watching to understand the reasoning behind particular choices.

The guy in Memento, however, has no contextual memory to store the decisions he’s already made. So he has to “replan” completely every few minutes based on his high-level goals, which he keeps as Polaroids and tattoos on his body. Not only does this make it much harder for him to behave intelligently, but it also takes much more effort to follow what he’s trying to do. (The film plays with this idea brilliantly.)

Memento Bedroom

Photo 1: Having to rethink everything from one minute to another.

How Does This Affect Planners?

The problem is that planning in games is typically done Memento-style. More often than not, the development process for a planner in a virtual actor is structured as follows:

  1. Implement a STRIPS (Stanford Research Institute Problem Solver) or HTN (Hierarchical Task Network) planner.

  2. On a regular basis (around 5Hz is common), run the planner from scratch with the most recent information from the world.

  3. If the new master plan is different from the old plan, execute it from now on, on the assumption that it must be a better solution (a rough code sketch of this loop follows).
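
To make that loop concrete, here's a minimal Python sketch of the replan-from-scratch approach. The planner, world and actor objects and their methods are invented for illustration; they aren't any particular engine's API.

    import time

    REPLAN_INTERVAL = 0.2  # seconds, roughly the 5Hz figure mentioned above

    def memento_style_loop(planner, world, actor):
        """Replan from scratch on a fixed tick, keeping no context between decisions."""
        current_plan = None
        while actor.is_alive():
            snapshot = world.snapshot()                     # latest world state, as symbols
            new_plan = planner.plan(snapshot, actor.goals)  # full search, every single tick
            if new_plan != current_plan:
                # Any difference is assumed to be an improvement, so switch wholesale;
                # nothing from the previous decision carries over to the next one.
                current_plan = new_plan
                actor.execute(current_plan)
            time.sleep(REPLAN_INTERVAL)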

This approach is typical because writing a planner that can replan to deal with dynamic changes is not trivial, and setting up the whole AI logic to monitor and figure out what changes are relevant takes a lot of work and experience.

However, this approach makes it much harder to debug the actual behaviors, because there's no temporal coherence. You have to step through the whole decision-making process to figure out what changed. No matter how good the tools are, when debugging the plan you have to process all the information in the world, rather than monitoring incremental changes.

Memento Bar

Photo 2: A scene from Memento with Carrie-Anne Moss.

How to Fix This!

What's interesting to note is that more reactive forms of AI, such as finite state machines (FSMs), don't suffer from these problems, because they operate in control-and-monitoring mode by default, always checking for changes via the failures and successes of the low-level AI actions. Effectively, in this case, you get the advantage of temporal coherence from one second to another in the AI.

Planners, in contrast, operate primarily in a separate decision-making mode, which means they're disconnected from the real world by a set of logic symbols. As such, they're not as well equipped to deal with dynamic changes and adjust the previous plan to the current situation.

In practice, if you don't want to revert to plain old FSMs, you need to find hybrid systems for both decision-making and control, in particular ones capable of localized replanning. You have two choices (a rough sketch of the second option follows the list):

  • The more academic solution would be to use a full planner, and extend its dependency checking abilities to be able to modify the current plan in case of changes in the world.

  • Game developers seem to be using behavior trees more and more these days, particularly because they use a form of greedy planning that’s simple and adapts plans locally very effectively.
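
As an illustration of that second option, here's a minimal behavior-tree selector in Python. The node classes, statuses and the door/window example are a simplification invented for this sketch rather than a specific engine's implementation; the point is that only the failed branch gets re-decided, so consecutive decisions stay temporally coherent and are easy to follow in a debugger.

    SUCCESS, FAILURE, RUNNING = "success", "failure", "running"

    class Action:
        """Leaf node wrapping a low-level behavior; run() reports its status."""
        def __init__(self, name, run):
            self.name, self.run = name, run

        def tick(self, blackboard):
            return self.run(blackboard)

    class Selector:
        """Tries children in priority order; when one fails, the next alternative
        is picked locally instead of rebuilding the whole plan from scratch."""
        def __init__(self, *children):
            self.children = children

        def tick(self, blackboard):
            for child in self.children:
                status = child.tick(blackboard)
                if status != FAILURE:
                    return status    # stick with this branch while it runs or succeeds
            return FAILURE           # every alternative failed; let the parent decide

    # Hypothetical usage: if the door turns out to be locked, only this subtree
    # falls back to the window; the rest of the tree is untouched.
    enter_room = Selector(
        Action("use_door",   lambda bb: SUCCESS if bb.get("door_open") else FAILURE),
        Action("use_window", lambda bb: RUNNING),
    )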

If you're interested in this topic, then stay tuned to AiGameDev.com for more information from both the theoretical and practical perspectives. Among other things, over the next few days you'll be hearing more about the behavior tree talk from the GDC in Lyon.

Memento Car

Photo 3: Have I been here before?

Also be sure to check out Memento at Amazon U.S. or U.K.!

Discussion (10 Comments)

alexjc on December 9th, 2007

This film is in one of my Top 5 favorite films ever! :-)

Dil on December 10th, 2007

I can't blame you. Who can? This movie is incredibly complex. You have to watch it at least three times to notice all the small clues it leaves; that's what's fun about it. I'm looking forward to seeing what's next about temporal coherence.

Dave Mark on December 11th, 2007

I believe that much of the strategy for how to deal with this would depend on whether you are planning forward from the current state or backwards from the goal. In either case, it would be nice if you could tie individual preconditions that are NOT created directly by your plan (i.e. ones based on world state) to data structures for the relevant portions of the world. For example, if part of your plan is based on the assumption that Resource A will be in Location A, you would need a flag that tells you if R(a) gets moved.

Again, this brings up the debate between event messaging and polling. Should you add a callback to R(a) that says, "let me know if you move from L(a)", or does your plan generate a list of assumptions that it continually needs to poll and validate? The latter solution is more self-contained, but it makes for potential duplication of effort if many agents are polling the exact same validation.

An alternative structure would be to create a MASTER list of some sort that is shared by all planners. That list would hold all the currently active assumptions of all planners. Each node of the precondition list would itself have a list of planners that require that information (minimum of one planner). If a NEW precondition based on world state is required, it is pushed onto the list. If a precondition already exists, then that planner says "me too" and its pointer is added to that node's list. You could then iterate through the list every so often (separate thread, even?) and check some or all of the assumptions. If an assumption fails (e.g. R(a) has left L(a)), it sends event messages to the list of planners that need to know, telling them that they now need to rethink things. The planners themselves just assume that their ideal world is intact until they hear otherwise. In essence, we are using a polling process, but it is streamlined somewhat by aggregating world information into one collection point.

Disclaimer: It's way late at night and so I'm pulling AI out of my posterior. Honestly, I just made up this whole solution in the past 5 minutes. I hope I didn't spoil your next chapter, Alex!
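
A minimal Python sketch of the shared "master list" of assumptions described above; the class, the holds() check and the notification callback are hypothetical names, not part of any existing engine.

    from collections import defaultdict

    class AssumptionBoard:
        """Shared collection of world-state assumptions; each entry remembers
        which planners subscribed to it (the "me too" list)."""

        def __init__(self):
            self.subscribers = defaultdict(list)      # assumption -> [planner, ...]

        def register(self, assumption, planner):
            self.subscribers[assumption].append(planner)

        def poll(self, world):
            # Could run on its own tick or thread: each assumption is checked once,
            # no matter how many planners rely on it.
            for assumption, planners in list(self.subscribers.items()):
                if not assumption.holds(world):        # e.g. R(a) has left L(a)
                    for planner in planners:
                        planner.notify_invalid(assumption)   # event: time to rethink
                    del self.subscribers[assumption]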

alexjc on December 11th, 2007

Dave, it's not so important how the plan was found, as long as its dependencies are clear, as you said. There are planners that deal with such problems, for instance by building "universal plans" that handle all contingencies like an FSM would. It's still not trivial to implement, so I would suggest a behavior tree for those who aren't too comfortable with planners. Alex

P.S. I had no particular plans to write about this yet, but since it's an up-and-coming topic I'll put some articles in the queue! :-)

Dave Mark on December 11th, 2007

Wow... now that I am awake, it's nice to see that I was at least partially lucid when I made that comment. I seem to remember pondering that issue in my sleep somehow. *sigh*

Sergio on December 11th, 2007

Replanning (or fixing the plan) when one of the plan's assumptions is no longer valid would fix situations like a character trying to go into a room only to find the door is locked. The planner decided to use the door, but that includes a precondition for the door to be open.

But what about the opposite? Imagine the character thinks the door is locked, so he decides to try the window. However, on the way there he realizes the door is actually open. The window is still a valid option, nothing wrong with that, but the door is a better one. We should also replan in this case, which means the AI would need to keep track not only of the preconditions necessary for the plan to succeed, but of all the state in the world, as any change could theoretically lead to a better plan.

In practice, I find it useful to encode memory as a tendency to stick with the current plan. So we favour selecting the same actions we already plan to use, but if something sufficiently interesting shows up, the character can and will change the plan. It's just a form of hysteresis, really.
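
The hysteresis described above might look something like this minimal Python sketch; the discount factor and the cost function are invented for illustration.

    STICKINESS = 0.8   # hypothetical discount applied to the plan already being executed

    def choose_plan(candidates, current_plan, cost):
        """Prefer the current plan unless an alternative is sufficiently cheaper."""
        def biased_cost(plan):
            factor = STICKINESS if plan == current_plan else 1.0
            return cost(plan) * factor
        return min(candidates, key=biased_cost)

With this bias the character keeps heading for the window unless the newly open door is enough of a saving to overcome the stickiness.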

Dave Mark on December 11th, 2007

In your situation above (door/window) you skipped one part. "He realizes..." implies that a check is done. That check on the status of the door (perhaps once he is in visual range) is a function call that can also trigger an event. Somewhere, the agent's belief that the door is locked is stored; after all, that belief was used to create the plan. If the agent checks the door, his belief about the door's status needs to be updated. If the belief about the door changes, then all information in the plan that was based on that status needs to be reanalyzed. So we are back to my point that changing information (whether the change is actual or perceived) would be the catalyst that forces a replan.

Sergio on December 11th, 2007

True. In the case of a discrete change like this one, you could wait for the notification that a change has occurred, and reevaluate the plan then. On the other hand, what if the situation changes in an analog way? Imagine that you need to pick up a key in order to open the door. You could go to where the key is located, go back to the door and open it. Or you could climb through the window. Now, if a guy is running away with the key, at some point the character might decide it's no longer worth going after it, and he would prefer to use the window. How do you use events in this case? You could fire one any time that guy moves. But what if you also need to consider how hard it is to fight that character? And how difficult it is to navigate the terrain to where he is? For simplicity, I would rather reevaluate the plan so the character doesn't miss any opportunities.

Dave Mark on December 12th, 2007

I agree with your premise that non-discrete changes throw a wrench into things. In this case, however... if the key was on the dude to begin with, it would have necessitated a node of "kill dude", or at least something like that. "Kill dude" would have to have a cost rating associated with it like any other edge in the plan graph. A good design would likely rank "kill dude" as more expensive than "use the bloody window", so the point is moot anyway.

jefft on October 2nd, 2009

Very interesting. I think the key is not only monitoring whether anything changes the cost of the plan that is executing, but also whether a much easier path becomes available. For example, there's a mountain pass with guards, so you decide to climb a goat trail through the mountains. Then the guards leave, and suddenly the main pass is wide open. To this end, you could maintain a list of plans that were considered and had lower cost than the current one except for a dynamic obstacle. If that obstacle changes, do a partial or full replan.

Also, the big problem with replanning is that the goal may never be attained if transient conditions alternately block the two best paths, for example cyclically moving platforms/doors/troop deployments. The AI might go take path A. Oh, it's blocked, but now path B is open. Take path B. Oh, it's blocked. Take path A. Ad infinitum. This sort of problem can easily occur if the planner isn't made aware of all possibilities in the world. One approach may be that once a path has been invalidated, the whole path's cost (or just the point where it was invalidated) is ramped up for a period of time so it's unlikely to be chosen again.
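
One way to sketch that cost-ramping idea in Python; the penalty value, decay time and method names are made up for illustration.

    import time

    class PathPenalties:
        """Temporarily inflate the cost of recently invalidated paths so the planner
        doesn't oscillate between two alternately blocked routes."""

        def __init__(self, penalty=10.0, decay_time=30.0):
            self.penalty, self.decay_time = penalty, decay_time
            self.blocked_at = {}                     # path id -> time it was invalidated

        def mark_blocked(self, path_id):
            self.blocked_at[path_id] = time.time()

        def adjusted_cost(self, path_id, base_cost):
            age = time.time() - self.blocked_at.get(path_id, float("-inf"))
            if age < self.decay_time:
                # Fade the penalty back to zero over decay_time seconds.
                return base_cost + self.penalty * (1.0 - age / self.decay_time)
            return base_cost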
