Gabriel Ware asks “Do you have any experience designing AI for parallel computing? With PS3 hitting the market I believe designing AI engines with parallelization in mind is a very good thing.”
There are a few things you can do to support multiple threads, whether its the XBox360 or Intel’s multi-core architectures. Luckily, it’s not too difficult to achieve, it just requires good discipline for multi-threaded programming (to avoid deadlocks). Here’s how it’s done most often these days in the games industry.
Batch Up & Off-load
The first thing to realize is that the AI decision making itself takes up a very small percentage of the computation power. Most of the work is done in collision tests, animation, or path-finding — which most often take up over 95% of your computation budget per game character.
So once you’ve got those running on multiple-threads, then your AI will become at least an order of magnitude faster. To achieve this, do the following:
Design the AI decision making to be asynchronous, and work with outdated information if necessary.
Build your AI engine to make requests to external systems, and don’t do the calculations immediately.
Batch up those requests together and process them in a dedicated thread of your choice.
When so much work is done by small algorithms, it’s worth spending the time to make the operations work efficiently in parallel. Collision detection, animation blending, or A* searches are prime candidates.
On hardware architectures where only small amounts of memory is available to each thread (e.g. the SPU on the PS3), then you’ll need more work to figure out what data to send. For example, creating a local cache of the collision mesh, or a hierarchical representation of the navigation mesh that fits in memory.
Photo 1: Main threads for graphics and physics, and multiple worker threads.
Work Queue & Thread Pool
Batching is a great strategy for computations that occur very regularly. However, for more heterogeneous computations, it’s generally not worthwhile making a dedicated algorithm that can batch-process requests.
Instead, a good strategy is to have a command pattern. You implement all the work to be threaded as a Command object, and just add it to the work queue. Then, all your worker threads take items from the queue and process them one by one.
This approach is particularly useful as computation time is never wasted; as soon as a thread is going idle waiting for something (e.g. the rendering to finish), it can turn into a worker thread and process items from the work queue.
Beyond Low-Hanging Fruit
These two strategies give you the most benefit for the least work. But moving forward from this isn’t trivial. In fact, it’s particularly hard for game logic as there are many interdependent actors, so they can’t all be run in parallel.
The most promising solution so far involves borrowing ideas from Erlang:
Give each actor its own lightweight process (a.k.a. coroutine) and memory.
Only allow processes to share data by copying it, so there are no conflicts.
On the down side, this requires more memory, so it’s not ideal. But processes can freely migrate from one thread to another, or even changes machines transparently. So in terms of concurrency, it’s a step forward. It’s certainly worth investigating…
What strategies do you use for multi-threading your game AI?





3 Comments ↓
I plan on using Erlang itself to manage the execution flow of the program. I predict that it will work wonders. There’s still a lot of design that needs to be done, but especially for neural networks, it seems like it’ll be possible to have each neuron actually be a whole process.
Implementing other AI algorithms such as A* pathfinding involve just breaking it down into chunks that can done parallel, and use node links for dependencies.
Hi James,
That sounds like a very interesting project. Are you using any particular neural network or the typical multi-layer perceptron?
For large MLP networks, a great strategy for threading is to simply use OpenMP — which would be difficult to beat with Erlang. If you’re doing heterogeneous/sparse networks, or even different models (e.g. spiking NN), then Erlang may well pay off.
The same goes for A*. Having each node as a process will be rather slow, so it would have to be a special type of network/algorithm to get the benefits of concurrency from it…
Anyway, I’d love to hear how your project goes; keep us posted!
Alex
[…] at an online publication called AI Game Dev, there is an elucidating post on how to do multithreading of game AI code (posted in June 2007). Basically, the conclusion is that most of the CPU time in an AI system is […]
Leave a Comment