As AI algorithms get more complicated they also get more resource-hungry, especially those relating to decision-making. Thanks to Moore’s Law, as time has marched on there has been a steady stream of relief in the form of more processing power. Also, as graphics processing exercises its pursuit of diminishing returns and edges toward the asymptote of reality, it grudgingly allows other computational areas such as physics and AI to use a greater percentage of that increased processing power. However, the players (and the game reviewers that are their unelected proxies) have expectations of better AI that always seem to outpace whatever gains we get in sheer horsepower alone.
The Single Frame Law
As AI programmers, we are forced (or force ourselves) up against the invisible wall of framerates. Our agents must live their lives in 20 millisecond slices — perceiving, pondering, planning and performing must all be arranged in little, easily digestible bites. What’s more, they share their cramped temporal quarters with dozens, scores, or even hundreds of other cohorts — all clamoring for the leftovers that the art department has discarded… and all working under the same 20ms edict. If you can’t decide what to do in 20ms, it isn’t worth doing. (Alternatively, “You snooze, you lose!”)
We fuss over their progress like parents. We want so much from them — to give them the life that they so deserve… that we could give them! If only we could give them longer than those 20ms to act in, they could be so much more! But why not use more than 20ms? Who says that they should only be allowed a single frame to make up their minds? Why can’t it be different? Why not give them the freedom to think? To dream? To live their (often truncated) lives to the fullest? (Ok, this is getting a bit anthropomorphic here!)
Many AI routines, whether pathfinding, planning, or complex strategic decisions, are getting computationally more expensive. Often, if we are to respect the unspoken Law of the Single Frame, we must choose a less accurate method of calculation… or simply cheat. The premise that these decisions must be made in a single frame may very well be flawed. The argument that I have heard numerous times is based on the reaction time of the agent. “The player wants to see immediate results from the units.” Does he, now? Or does he want to see “realistically immediate” results? And what does that mean?
The Reality of Human Reaction Times
Human reaction times vary depending on what type of reaction we are talking about. The reaction time to a simple sensory stimulus ranges from 150–300 ms (i.e. 7–15 frames at 20ms per frame). Note that this is simply for the reaction to a stimulus. ANY reaction. For example, an enemy wouldn’t react to a muzzle flash from a gun (“That’s a gun shot!”) for about a fifth to a quarter of a second. If you add in a simple deliberation factor (“Is he shooting at me?”), reaction times are significantly longer. Add in a complex decision that may be based on a number of factors (“Should I return fire or go to cover?”) and you may be closing in on a full second of reaction time. Go even further and add the necessity to process the environment – perhaps even visually observing and calculating pros and cons (“Given the environment, where should I go for cover from the shooter?”) – and we are now at the point where there is a definite lag time that should be palpably apparent to the player. And yet, we insist that this decision needs to take place in the single frame between the gun being fired and the onset of the action of diving for cover.
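The frame math above is easy to pin down. Here is a minimal Python sketch, assuming the article’s 20ms (50 fps) frame; the function name and the specific millisecond values are hypothetical illustrations of the ranges discussed, not measured data.

```python
# Convert human-scale reaction times into frame counts,
# assuming one frame = 20 ms (50 fps), as in the article.
FRAME_MS = 20

def reaction_frames(reaction_ms: int) -> int:
    """Whole frames an agent should wait before visibly reacting (at least 1)."""
    return max(1, round(reaction_ms / FRAME_MS))

# Illustrative delays drawn from the ranges discussed above (hypothetical values):
SIMPLE_STIMULUS_MS  = 200   # "That's a gun shot!"
COMPLEX_DECISION_MS = 900   # "Should I return fire or go to cover?"

print(reaction_frames(SIMPLE_STIMULUS_MS))   # 10 frames
print(reaction_frames(COMPLEX_DECISION_MS))  # 45 frames
```

Even the simplest reaction, then, spans ten frames — an order of magnitude more than the single frame we usually grant our agents.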
I believe the gripe about “instant action” is not completely ill-founded. Certainly we want the agents to do something. We have all seen the situation where a stimulus is given, either by the game engine or by the player, and the agent stands there as if blissfully unaware for a moment until he decides, somewhat lackadaisically, that maybe he better get a move on. That sort of reaction is one that rightfully gets our AI labeled stupid, sluggish and unrealistic. But, if the moment that I pull the trigger the target immediately – i.e. the next frame – leaps for the best possible place in the room to take cover from me, it is going to look… well… unrealistic.
Delayed Reaction Gratification
So, what to do? To avoid the “instant reaction” problem, one person offered up a solution to me (much to my amazement). They suggested that, after the decision had been made, the agent play a decision animation for a specified period of time. At the conclusion of the enforced “wait time”, the real action would start. The decision itself, however, was still being made in that first frame. This seemed like a colossal waste of time. Although it never happened, it would have been striking to hear that same AI programmer complain some time later that “we can’t do that complicated decision idea we had because it slowed down the frame rate.”
It seems that in order to process bigger and better decisions, especially large-scale strategic processing, we will need to move beyond what can be accomplished in one frame. There are already ways to do this to some extent. For example, there are numerous resources on spreading pathfinding over multiple frames. GOAP techniques that use pathfinding architectures can be similarly arranged. Strategic decisions such as analyzing enemy force deployment or determining dynamic cover points in which to hide are more difficult to spread over multiple frames. An acceptable approach to these would be multi-threading, where a worker thread queues and processes requests and returns the completed decision to the requester when it is finished.
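The worker-thread idea can be sketched in a few lines. This is a minimal Python illustration of the request/response pattern described above, not an engine implementation; every name here (the queues, the agent id, the decision string) is hypothetical.

```python
# A worker thread queues decision requests, grinds through them off the main
# thread (possibly over many frames), and posts results back for the game loop.
import queue
import threading

request_q = queue.Queue()   # decisions waiting to be computed
result_q = queue.Queue()    # finished decisions, polled by the game loop

def decision_worker():
    """Runs off the main thread; each request may take many frames to finish."""
    while True:
        agent_id, think = request_q.get()
        if think is None:                    # shutdown sentinel
            break
        result_q.put((agent_id, think()))    # expensive deliberation happens here

worker = threading.Thread(target=decision_worker, daemon=True)
worker.start()

# The game loop submits a request and keeps rendering. Some frames later it
# polls result_q (non-blocking in practice) and applies the decision — after
# re-validating it against the current world state.
request_q.put((42, lambda: "dive_behind_barrel"))
agent_id, decision = result_q.get(timeout=1.0)
print(agent_id, decision)   # 42 dive_behind_barrel
```

In a real engine the “think” callable would be a planner or spatial query, and the main loop would poll with `result_q.get_nowait()` each frame rather than blocking.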
This method, as well, has had its detractors. “The world will get out of sync, though! The decision may not be valid anymore by the time it is returned.” My simple answer to this has been, and still is… “So? Isn’t that realistic as well?” Again, remember folks… we are talking about individual frames here. Life at 20ms at a time! The real world doesn’t work in 20ms slices. And people certainly don’t think 20ms at a time!
A Quick Example…
Mapping this objection into the example above, if my agent detects a shot, determines it is at him, decides he can’t fight and needs to hide, and then, after scanning the environment, decides that the nearby barrel is the best bet for cover, he may have taken a full second. Even less if he had already mentally tagged the barrel as being a convenient cover spot. Now what could possibly happen in that single second?
Someone could have killed the shooter. The threat is gone.
A second shooter could have appeared nearby invalidating the barrel as cover.
A grenade could have been thrown between him and the barrel.
The barrel could have been blown up. (Everyone knows barrels are for shooting!)
His allies could have come into the room making fighting a better choice than hiding.
Any number of things could have happened… in that single second! Whereas in computer time it was a full 50 frames, in a human’s eyes, it is only one second. If you take a moment to think about it (was it cheesy to say that?), he wouldn’t have even realized THOSE stimuli existed until after he reacted to the first. The point is, it isn’t terribly tragic if the decisions being made get slightly out of sync with the world. As humans with reaction times far longer than 20ms, OUR decisions get slightly out of sync with the world as well.
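Handling a decision that went stale during those 50 frames amounts to one cheap check at the moment the decision comes back. Here is a hedged sketch of that re-validation step; the `World` class and the checks in it are hypothetical stand-ins for the list of invalidating events above.

```python
# Re-validate a (possibly seconds-old) cover decision before acting on it.
from dataclasses import dataclass, field

@dataclass
class World:
    shooter_alive: bool = True
    compromised: set = field(default_factory=set)  # cover spots no longer safe

    def is_compromised(self, spot):
        return spot in self.compromised

def apply_cover_decision(world, cover_spot):
    """Called when the decision returns from the worker thread."""
    if not world.shooter_alive:
        return ("stand_down",)        # threat removed while we were "thinking"
    if world.is_compromised(cover_spot):
        return ("rethink",)           # barrel blown up, flanked, grenade, etc.
    return ("move_to", cover_spot)    # decision still valid; act on it

w = World()
print(apply_cover_decision(w, "barrel"))   # ('move_to', 'barrel')
w.compromised.add("barrel")                # barrel blown up in the meantime
print(apply_cover_decision(w, "barrel"))   # ('rethink',)
```

The check costs almost nothing, so the expensive deliberation can safely run over many frames and simply be discarded or redone when the world has moved on — which is exactly what a human would do.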
In the meantime, in order to avoid the “dumb agent” look, I do agree with the idea of playing a “deliberation” animation or two. After all, we are all well aware of the need for the player to have feedback on what our agents are “thinking”. So, have the agent do nothing for a few frames (i.e. base reaction time), look startled by the flash (“Was that a shot?”), look afraid or thoughtful for a moment (“Should I fight or flee?”), glance around for a moment (“Where should I go?”), and THEN start diving for cover. And if something changes while he is thinking or even acting, have him react to that as well… once he notices it. (“Man… there’s another dude over there, too? Guess this barrel wasn’t a great idea, was it? Now where should I hide? Whoa… my buds are here! Maybe I don’t have to worry about hiding after all!”)
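That staged reaction is, in effect, a tiny timeline: each stage gets a frame budget, and the agent commits to acting only when the budget runs out. A minimal sketch, assuming 20ms frames; the stage names and budgets are hypothetical and would be tuned (or driven by animation lengths) in practice.

```python
# Staged reaction timeline: each stage gets a frame budget before the agent
# commits to acting. At 20 ms/frame, 10 frames is roughly 200 ms.
REACTION_STAGES = [
    ("startled",   10),  # "Was that a shot?"
    ("deliberate", 15),  # "Should I fight or flee?"
    ("scan",       15),  # "Where should I go?"
    ("dive",        0),  # commit to the action
]

def stage_at(frames_elapsed):
    """Which stage the agent is in, given frames since the stimulus."""
    for name, budget in REACTION_STAGES:
        if frames_elapsed < budget:
            return name
        frames_elapsed -= budget
    return REACTION_STAGES[-1][0]   # past every budget: act

print(stage_at(5))    # startled
print(stage_at(12))   # deliberate
print(stage_at(45))   # dive
```

Conveniently, those 40 frames of visible deliberation are also exactly the window in which a worker thread can be grinding out the real decision, so the animation buys the computation its time for free.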
So Is Our Realism Realistic?
So, the question of the week? Why are we sacrificing ourselves on the Altar of the Single Frame — and thereby limiting ourselves in our quest for Realistic Behavior when, in fact, realistic behavior doesn’t even limit itself that stringently? How about it folks? Pros and cons? Experiences? Horror stories?
You have 20ms to respond. Go!