Gabriel Ware asks “Do you have any experience designing AI for parallel computing? With PS3 hitting the market I believe designing AI engines with parallelization in mind is a very good thing.”
There are a few things you can do to support multiple threads, whether it's the Xbox 360 or Intel's multi-core architectures. Luckily, it's not too difficult to achieve; it just requires good discipline for multi-threaded programming (to avoid deadlocks). Here's how it's done most often these days in the games industry.
Batch Up & Off-load
The first thing to realize is that the AI decision making itself takes up a very small percentage of the computation power. Most of the work is done in collision tests, animation, or path-finding — which most often take up over 95% of your computation budget per game character.
So once you've got those running on multiple threads, your AI will run at least an order of magnitude faster. To achieve this, do the following:
Design the AI decision making to be asynchronous, and work with outdated information if necessary.
Build your AI engine to make requests to external systems, and don’t do the calculations immediately.
Batch up those requests together and process them in a dedicated thread of your choice.
When so much of the work is done by a few small algorithms, it's worth spending the time to make those operations run efficiently in parallel. Collision detection, animation blending, and A* searches are prime candidates.
On hardware architectures where only a small amount of memory is available to each thread (e.g. the SPUs on the PS3), you'll need to do more work to figure out what data to send: for example, building a local cache of the collision mesh, or a hierarchical representation of the navigation mesh that fits in local memory.
Photo 1: Main threads for graphics and physics, and multiple worker threads.
Work Queue & Thread Pool
Batching is a great strategy for computations that occur very regularly. However, for more heterogeneous computations, it’s generally not worthwhile making a dedicated algorithm that can batch-process requests.
Instead, a good strategy is to use the command pattern: implement each piece of work to be threaded as a Command object, and add it to the work queue. All your worker threads then take items from the queue and process them one by one.
This approach is particularly useful because computation time is never wasted; as soon as a thread would otherwise sit idle waiting for something (e.g. for rendering to finish), it can act as a worker thread and process items from the work queue.
Beyond Low-Hanging Fruit
These two strategies give you the most benefit for the least work. But moving forward from this isn’t trivial. In fact, it’s particularly hard for game logic as there are many interdependent actors, so they can’t all be run in parallel.
The most promising solution so far involves borrowing ideas from Erlang:
Give each actor its own lightweight process (a.k.a. coroutine) and memory.
Only allow processes to share data by copying it, so there are no conflicts.
On the down side, this requires more memory, so it's not ideal. But processes can freely migrate from one thread to another, or even change machines transparently. So in terms of concurrency, it's a step forward. It's certainly worth investigating…
What strategies do you use for multi-threading your game AI?