Next-gen games are proving to be quite a challenge to develop! From hearing Crytek’s Cevat Yerli talk at GDC in Lyon, it was clear that Crysis is based on some of the most advanced technology available in the games industry. Look no further than the animation system! But even that has its problems…
In particular for Crysis, Yerli wanted the characters to feel much more real and connected to the world. To do this, it was necessary to reduce the amount of foot-sliding in the animations, an artifact that makes characters less believable. These issues are so common in games that we almost don’t notice them anymore!
That said, it’s possible to solve this problem with the latest animation technology, but working on the cutting edge is a challenge, particularly when integrating animation and AI. Even with fully working animation prototypes, it’s still a huge challenge to ship this kind of technology in game. As Yerli pointed out in his keynote:
“If there’s one area in Crysis where I think we failed it’s this one.” — Cevat Yerli, talking about problems with the advanced animation system.
In this article, I’ll take a detailed look at the kind of technology that’s used in CryEngine 2 for AI and animation, covering the basic concepts from my experience working with very similar systems at Rockstar. I’ll also go a bit further into juicy gossip and informed speculation to analyze exactly why it’s so difficult to get it working in game generally — not just for Crytek.
Dynamic Systemic AI
The design motto of Crysis, dating back to FarCry, is that every battle should be unique, as Yerli emphasized in his keynote. The idea is to use large environments that invite variation, together with an AI that offers diversity without necessarily relying on randomness, rather like Halo’s AI does. (It uses the player’s behavior as the only source of uncertainty, which makes the actor behaviors less monotonous while keeping them predictable.)
At Crytek, they call this dynamic systemic AI. It’s their version of the typical AI implementation in shooters these days, which lets the behaviors be a little more emergent rather than fully scripted. Here’s how you do this in practice:
Build an overall AI architecture with a sensory system.
Allow the logic to be customized modularly for each AI.
Use scripts to implement purposeful reactions to events.
Let the behavior emerge from the logic interacting with the world.
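The four steps above can be sketched in a few lines. This is a minimal illustration of the pattern, not CryEngine 2 code; all class, event, and state names are hypothetical.

```python
class SensorySystem:
    """Step 1: turn raw world state into discrete events per actor."""
    def sense(self, actor, world):
        events = []
        if world.get("player_visible"):
            events.append("enemy_spotted")
        elif world.get("gunfire_heard"):
            events.append("heard_noise")
        return events

def set_state(state):
    """Helper producing a scripted reaction that changes an actor's state."""
    return lambda actor: actor.update({"state": state})

# Step 3: purposeful, scripted reactions to events. Keeping them as a
# data table lets each AI archetype (step 2) plug in its own variant.
grunt_reactions = {
    "enemy_spotted": set_state("attack"),
    "heard_noise":   set_state("investigate"),
}

def tick(actor, world, sensors, reactions):
    """Step 4: behavior emerges from the logic interacting with the
    world; the player's actions are the only source of variation."""
    for event in sensors.sense(actor, world):
        if event in reactions:
            reactions[event](actor)

# Usage: a grunt on patrol hears gunfire and switches behavior.
grunt = {"state": "patrol"}
tick(grunt, {"gunfire_heard": True}, SensorySystem(), grunt_reactions)
print(grunt["state"])  # investigate
```

The point of the table-driven reactions is that designers can swap or extend them per archetype without touching the core loop, which is roughly what the Lua customization quoted below from the CryEngine 2 specs enables.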
Of course, there’s a lot more to this, but it’s rather similar to other games with modern AI (see these technical reviews as a reference). Also, here’s an interview with Christopher Natsuume, producer of FarCry, which goes through the basics of how the AI works. These ideas were expanded for Crysis of course, but the game is still based on the same idea of customizable Lua scripts, according to the CryEngine 2 specifications:
“Allows complex AI behaviors to be created without requiring new C++ code, including extending state machine behaviors from LUA scripts.”
This kind of AI provides the basis of the emergent gameplay in the trademark “sandbox levels” of Crytek, but it’s also responsible indirectly for controlling the animation.
As Yerli mentioned in his keynote, Crytek was very keen to play with the “theatre of the player’s mind,” since ultimately, it’s the most powerful tool game developers have at their disposal. A large part of achieving this comes down to fooling the player into believing that every actor is part of the world, and it helps if there’s no noticeable foot-sliding (a.k.a. foot-skating).
This animation problem happens in two cases:
Moving the animation around in space procedurally, and
Doing a naïve blend by simply interpolating keyframes.
Technologically speaking, it’s possible to reach zero foot-skating, as long as your animators also clean up the original motion-capture animations to be perfect. Then you can get perfect blends by lining up and synchronizing animations while blending them, which introduces no extra foot-sliding.
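Here’s a toy sketch of what “lining up and synchronizing” means in practice: sample both cyclic clips at the same normalized phase (so foot-plant events coincide) before interpolating. The clip data and function names are made up for illustration; a real system blends full skeletal poses, not single values.

```python
def sample(clip, phase):
    """Sample a cyclic clip at normalized phase in [0, 1), linearly
    interpolating between adjacent keyframes."""
    n = len(clip)
    t = phase * n
    i = int(t) % n
    j = (i + 1) % n
    frac = t - int(t)
    return clip[i] * (1 - frac) + clip[j] * frac

def synced_blend(clip_a, clip_b, phase, weight):
    """Sample both clips at the SAME phase, then interpolate. Because
    the foot-plant events line up, the blend adds no extra sliding;
    a naive blend of unsynchronized frames would drag the feet."""
    a = sample(clip_a, phase)
    b = sample(clip_b, phase)
    return a * (1 - weight) + b * weight

# Hypothetical foot x-positions over one cycle; both plant at phase 0.
walk = [0.0, 0.5, 1.0, 0.5]
run  = [0.0, 1.0, 2.0, 1.0]
print(synced_blend(walk, run, 0.0, 0.5))  # 0.0, i.e. the foot stays planted
```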
The theory behind this kind of animation synthesis is often referred to as parametric motions. This paper on motion graphs in particular is the source of most implementations in the games industry these days:
Motion Graphs
L. Kovar, M. Gleicher, and F. Pighin
Proceedings of ACM SIGGRAPH, 2002.
Download (PDF, 771 Kb)
Also, follow up by reading the paper on Parametric Motion Graphs to get an idea of how to blend animations together correctly. You can read about other improvements in the section on character animation at AiGameDev.com too if you’re interested in other derived approaches.
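To give you the flavor of the motion-graph idea from the Kovar et al. paper: nodes are clips, edges connect clips whose boundary poses are similar enough to blend, and synthesis becomes a graph search. The clip names and connectivity below are a made-up toy, not taken from any real game.

```python
from collections import deque

# Edges connect clips that can transition into each other cleanly.
edges = {
    "idle":       ["walk_start"],
    "walk_start": ["walk_loop"],
    "walk_loop":  ["walk_loop", "walk_stop", "run_loop"],
    "run_loop":   ["run_loop", "walk_loop"],
    "walk_stop":  ["idle"],
}

def plan(start, goal):
    """Breadth-first search for the shortest clip sequence taking the
    character from one motion state to another."""
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None

print(plan("idle", "run_loop"))
# ['idle', 'walk_start', 'walk_loop', 'run_loop']
```

Notice the catch for AI responsiveness: going from idle to running takes three transitions through intermediate clips, which is exactly the latency problem discussed later in this article.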
This is the kind of technology that I worked on at Rockstar for their internal middleware called R.A.G.E., which was the basis of R* Table Tennis and soon GTA 4. Now for the last bit of (public) gossip: one of the animation programmers who worked on this within Rockstar subsequently moved to Crytek… so the rest of this analysis should be relatively accurate!
Meanwhile, Back in the Real World!
Crytek effectively had this animation technology working fine as an isolated prototype, but in the game itself there were many problems. As Yerli mentioned in his talk at GDC Lyon, they just couldn’t get it into the game to the state where it met the requirements of the AI design.
I’ve worked on this animation technology before as well as the AI to control it, and we ran into similar problems (that game prototype was not released :-) Here’s how things went wrong step by step, and presumably how things happened at Crytek:
Animation Team: “We can build this awesome animation technology that has no foot-skating.”
Producer: “Sounds cool; let’s do it!”
Behavior Team: “How’s that going to work for the AI?”
Animation Team: “Don’t worry, we’ll retro-fit the new technology into the current API.”
Behavior Team: “Ok, that’s great. The current system is pretty responsive!”
Did you spot the problem already? If not, here’s how the end of the story plays out.
Producer: “So how’s this awesome technology coming along?”
Animation Team: “Good, we’re about to integrate it into the game for the AI.”
Behavior Team: “Hmm. The AI behaves completely differently now; the timing is off.”
Animation Team: “That’s easy to fix, all we need is more motion capture data and a few fixes in the AI.”
Producer: “No time for that; the game ships in a few months! Do whatever it takes to get the AI working.”
The problem, of course, is that by default your AI will behave much more sluggishly if you constrain it to what your motion capture can do. This is especially a problem if you don’t have many animations to provide better responsiveness. So in practice, your soldiers will rarely even have time to do anything intelligent before they get shot.
It’s certainly possible to make such technology responsive; Assassin’s Creed shows that (although there’s still room for improvement), but typically you don’t have the time and budget to capture all the animations required to get that responsiveness just for the AI characters. It takes hours of mocap: different speeds, motion combinations, starts on either foot, etc.
On top of that, things become trickier when you develop the AI separately from the animation logic… (For player control, it’s much simpler.)
Rethinking the AI / Animation Interface
Traditionally, the AI controls the animation via a virtual controller or some kind of moving carrot. The justification for doing this is that you can interchange players with AI and still use the same animation system. In FPS games where the player has no visible avatar, there’s still a division between animation management and the AI logic.
This type of API is certainly simple, but in the case of advanced animation systems, it just isn’t expressive enough to allow for responsive yet realistic motion. This is a form of behavior aliasing caused by a bottleneck of information in the code.
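To make the bottleneck concrete, here’s roughly what the traditional “moving carrot” interface boils down to. The class and method names are hypothetical; the point is how little information crosses the boundary.

```python
class VirtualController:
    """The entire AI-to-animation channel: a target point and a speed.
    Everything else the animation system might want to know (urgency,
    upcoming turns, whether the actor is about to stop) never crosses
    this boundary: that information loss is the behavior aliasing."""
    def __init__(self):
        self.carrot = (0.0, 0.0)
        self.speed = 0.0

    def set_move_target(self, x, y, speed):
        self.carrot = (x, y)
        self.speed = speed

# AI side: the calls look identical whether the actor is diving away
# from a grenade or strolling to a patrol point.
ctrl = VirtualController()
ctrl.set_move_target(10.0, 2.0, 1.5)
```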
To fix this, you need to extend this interface significantly:
The AI needs a better idea of the animation logic so it can take all possible options into account from a logical perspective and trade off their cost and length.
The animation logic needs to know the intentions of the AI so it can select the best motion clips ahead of time, as well as find the most responsive option.
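A sketch of what that richer, two-way interface could look like: the animation side exposes its concrete options with their costs, and the AI side states its constraints so clips can be chosen ahead of time. The option names, durations, and sliding scores are invented for illustration.

```python
# Animation -> AI: each logical request maps to the real clip options,
# with their time cost and quality, so the AI can trade them off.
def query_options(request):
    options = {
        "turn_180": [
            {"clip": "plant_and_turn", "duration": 0.9, "sliding": 0.0},
            {"clip": "blend_turn",     "duration": 0.3, "sliding": 0.4},
        ],
    }
    return options.get(request, [])

# AI -> animation: the AI's deadline lets the animation system pick the
# most realistic clip that is still responsive enough.
def choose(request, max_duration):
    candidates = [o for o in query_options(request)
                  if o["duration"] <= max_duration]
    # Among responsive-enough clips, prefer the least foot-sliding.
    return min(candidates, key=lambda o: o["sliding"]) if candidates else None

print(choose("turn_180", max_duration=1.0)["clip"])  # plant_and_turn
print(choose("turn_180", max_duration=0.5)["clip"])  # blend_turn
```

Under pressure the AI accepts some sliding in exchange for responsiveness; with time to spare it gets the perfect-looking turn. Neither side has to treat the other as a black box.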
Not only is this a technical problem, but it’s also important to get the workflow right. You can’t just abstract either system as a black box and expect everything to turn out O.K.
The Solution for Responsive & Realistic AI
The fact is that both systems are highly constrained to what the other is doing. Sure you can fix this by extending the interface between the two, but ideally there’s a better solution:
Develop the animation system in the same way as the AI is built, using goal-directed systems to figure out the best way to achieve an objective with animations.
Consider the animation as a lower level of the AI system, and integrate it very closely with the AI rather than abstracting it away.
Prototype the AI bottom-up, based on the real animations available, at least as much as you design it top-down.
Use multi-disciplinary sub-teams in AI/animation that work tightly together to resolve any problems immediately.
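The first point, a goal-directed animation layer, can be sketched as a tiny planner: given the AI’s objective, it searches the available clips for the most natural way to achieve it in time. The clip set, distances, and durations here are all hypothetical.

```python
CLIPS = {
    # clip: (distance covered per cycle in meters, duration in seconds)
    "walk_cycle":   (1.2, 1.0),
    "sprint_start": (2.0, 0.9),
    "run_cycle":    (3.0, 1.0),
}

def achieve(distance, deadline):
    """Greedy sketch of goal-directed selection: try the clips from
    slowest to fastest and pick the first that meets the deadline, so
    the motion stays as natural (unhurried) as possible."""
    for clip, (d, t) in sorted(CLIPS.items(), key=lambda kv: kv[1][0]):
        cycles = distance / d
        if cycles * t <= deadline:
            return clip
    return None  # objective not achievable with the current clip set

print(achieve(distance=6.0, deadline=5.0))  # walk_cycle
print(achieve(distance=6.0, deadline=2.5))  # run_cycle
```

The `None` case is just as important: it is the animation layer telling the AI, ahead of time, that an intention cannot be met, which is exactly the feedback the narrow carrot-style interface can’t express.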
I have no doubt that Crytek will manage to get this right in their next game. Having made the mistake once, it’s easier to adapt and get it right the second time. I look forward to seeing the results; in the meantime, be sure to check out Crysis! (See Amazon U.S. or U.K.)