Review

Online Adaptation of Game Opponent AI in Simulation and in Practice

Alex J. Champandard on January 3, 2008

At a certain level, adaptation is a requirement for AI. If a game doesn’t have it, you end up living inside a movie you can’t change! In practice, all the traditional techniques (e.g. finite state machines, scripts) help developers implement adaptive behaviors, but things get tricky when you want to adapt to many dynamic factors like player styles.

This week’s Thursday Theory post looks into the award-winning paper behind dynamic scripting, as proposed by Pieter Spronck from the Computer Science Department at Universiteit Maastricht. It presents some interesting ideas for dealing with the problem of creating behaviors automatically that can adapt to almost anything in the game.

Motivation

The process of creating entertaining AI for non-trivial games is becoming much harder for two reasons:

  1. Complexity — Large scripts are required to support games with many possible options, like role-playing games (RPGs). Scripts are often long, static, poorly factored, and hard to understand and maintain.

  2. Adaptability — With better AI opponents, the entertainment value of the game goes up. So there’s a strong incentive to make the AI more competitive by adapting to the player’s tactics.

Developers typically create adaptable behaviors by scripting the different adaptations by hand, which of course increases the complexity of the solution.

Instead, this paper proposes the use of unsupervised learning to solve the problem and investigates the feasibility of the approach. Dynamic scripting aims to be fast, effective (producing behaviors competitive with hand-crafted AI), robust in dealing with uncertainty, and efficient at learning from few trials.

Baldur's Gate AI

Screenshot 1: An encounter with AI opponents in Baldur’s Gate.

Contributions

There are two things that make this particular research project interesting:

  • An architecture that combines reinforcement learning and scripting together.

  • Its application to opponent AI in state-of-the-art RPGs like BioWare’s Baldur’s Gate.

The architecture itself is very reminiscent of Learning Classifier Systems (LCS) originally invented by John Holland. Here’s how it works:

  1. There’s a rulebase, typically one for each type of opponent. It’s made up of hand-crafted rules that perform actions in the game.

  2. At the start of an encounter, a new script is generated randomly from the rulebase: each rule is selected with a probability proportional to its weight.

  3. Once the encounter is over, the outcome (either positive or negative) is used to update the weights of each rule.

Over time, the rules with the most success are more likely to get picked for new scripts, thanks to the reinforcement they receive from each encounter.
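The three steps above are simple enough to sketch in a few lines of Python. This is a minimal illustration, not the paper's implementation: the rule names, weight bounds, and reward values are assumptions, though the sketch does mimic the paper's trick of spreading the opposite weight change over unused rules so the total stays constant.

```python
import random

# Hypothetical rulebase: each hand-crafted rule (here just a name) has a weight.
rulebase = {
    "cast_fireball": 100,
    "drink_potion":  100,
    "melee_attack":  100,
    "retreat":       100,
}

SCRIPT_SIZE = 2          # rules per generated script
W_MIN, W_MAX = 25, 400   # clamps so no rule ever becomes impossible or dominant

def generate_script(rules, size=SCRIPT_SIZE):
    """Build a script by roulette-wheel selection: each rule is picked
    with probability proportional to its current weight."""
    script = []
    for _ in range(size):
        candidates = [n for n in rules if n not in script]
        weights = [rules[n] for n in candidates]
        script.append(random.choices(candidates, weights)[0])
    return script

def update_weights(rules, script, won, delta=30):
    """After an encounter, reward (or penalize) the rules that took part,
    and spread the opposite change over the unused rules so the total
    weight stays roughly constant."""
    change = delta if won else -delta
    for n in script:
        rules[n] = min(W_MAX, max(W_MIN, rules[n] + change))
    others = [n for n in rules if n not in script]
    if others:
        spread = change * len(script) // len(others)
        for n in others:
            rules[n] = min(W_MAX, max(W_MIN, rules[n] - spread))
```

Calling `update_weights` after every fight is all it takes to bias `generate_script` toward the rules that have been winning.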

Dynamic Scripting Architecture

Figure 2: The architecture of the dynamic scripting system (see paper).

Abstract & References

Here’s the abstract for the paper:

“Unsupervised online learning in commercial computer games allows computer-controlled opponents to adapt to the way the game is being played, thereby providing a mechanism to deal with weaknesses in the game AI and to respond to changes in human player tactics. For online learning to work in practice, it must be fast, effective, robust, and efficient.

This paper proposes a novel technique called “dynamic scripting” that meets these requirements. In dynamic scripting an adaptive rulebase is used for the generation of intelligent opponents on the fly. The performance of dynamic scripting is evaluated in an experiment in which the adaptive players are pitted against a collective of manually designed tactics in a simulated computer roleplaying game and in a module for the state-of-the-art commercial game NEVERWINTER NIGHTS.

The results indicate that dynamic scripting succeeds in endowing computer-controlled opponents with successful adaptive performance. We therefore conclude that dynamic scripting can be successfully applied to the online adaptation of computer game opponent AI.”

You can download the paper from the website (PDF, 614 Kb):

Online Adaptation of Game Opponent AI in Simulation and in Practice
Spronck, P., Sprinkhuizen-Kuyper, I. and Postma, E.
Proceedings of the 4th International Conference on Intelligent Games and Simulation (GAME-ON 2003)

If you’re interested in further academic background behind these ideas, look into Learning Classifier Systems, and particularly Wilson’s XCS (which will no doubt also feature on AiGameDev.com in the future).

Evaluation

Here’s how I think the technology in the paper ranks in practice:

Applicability to games: 8/10
The dynamic scripting technique is designed with very sound principles from a game development perspective. It’s efficient and doesn’t require much memory. Not all games would benefit from having this kind of adaptation, as it takes a certain number of encounters with a specific opponent for the AI to be able to learn something!
Usefulness for character AI: 6/10
The paper shows how dynamic scripting can be applied to combat in an RPG, for things like selecting attacks. Luckily, this is the kind of situation that wouldn’t suffer too much from random behavior, so the learning process is not too obvious. (In fact, randomness can be fun here.) However, other problems in game AI are much more sensitive to randomness, which limits the applicability of this technique.
Simplicity to implement: 10/10
The major benefit of this technology is that it’s extremely simple to implement. Reinforcement learning is conceptually very simple, and when it’s hooked up to a simple rule-based representation it requires very little work to generate new scripts.
Neverwinter Nights 2 AI

Screenshot 3: Combat in Neverwinter Nights 2.

Discussion

This paper, above all, shows the promise of unsupervised learning (which I think is very fertile ground) and demonstrates it in a game. There has been much more research on dynamic scripting since, and I expect many more research projects applying LCS-like technology to games…

For this paper in particular, what surprises me the most is the lack of references to Learning Classifier Systems. For academic work, Pieter’s research is unusually well grounded in game development practice, and that’s great to see! However, there’s a huge amount of knowledge and experience available in the field of LCS, particularly about how quickly such a system learns and how it behaves when solving difficult problems.

As it turns out, the biggest problem with dynamic scripting is the very same one that Learning Classifier Systems suffer from. While they are very good at eventually learning to solve problems, the learning process itself is not very realistic. Random exploration is a requirement, but it can appear uninformed and illogical. This would cause problems in games where coherent behaviors are required (unlike combat).

Dynamic Scripting Statistics

Figure 4: Dynamic scripts get better over time, but do they learn realistically?

Basically, it’s the premise of generating these long “procedural” scripts from rules that causes the problem. The script representation is already hard for designers to work with in the first place, so simply hooking it up to an AI that uses random exploration and reinforcement learning will raise other problems!

Since game developers have been increasingly using planners to reduce the complexity of their AI logic, it’d be interesting to see how these two approaches combine. I raised the question in this article The Secret to Building Game AI that Learns Realistically; it’d be a fascinating research project.
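As a rough, hypothetical sketch of such a combination, a dynamic-scripting-style weighted selector could sit at the branch points of a behavior tree or planner, reinforcing whichever branch ran in the last encounter. All names and constants below are assumptions, not from the paper.

```python
import random

class WeightedBranchSelector:
    """Hypothetical selector node for a behavior tree or planner: it picks
    one of its child branches with probability proportional to a weight,
    then reinforces that weight based on how the encounter went."""

    def __init__(self, children):
        self.children = list(children)       # callables returning True/False
        self.weights = [100.0] * len(children)
        self.last = None                     # index of the branch that ran

    def tick(self):
        # Roulette-wheel pick, as in dynamic scripting's rule selection.
        self.last = random.choices(range(len(self.children)), self.weights)[0]
        return self.children[self.last]()

    def reinforce(self, won, delta=20.0, w_min=10.0, w_max=500.0):
        # Reward or penalize whichever branch ran last, within clamps.
        if self.last is None:
            return
        w = self.weights[self.last] + (delta if won else -delta)
        self.weights[self.last] = min(w_max, max(w_min, w))
```

Preconditions could prune the list of candidate children before the weighted pick, which would keep the exploration from looking completely uninformed.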

What do you think about the technology behind dynamic scripting, and its applicability to games?

Discussion 5 Comments

kingius on January 4th, 2008

I read this with great interest - a simple way to implement a learning AI in a game. Very good stuff. However, I see an additional weakness to the ones mentioned. Having to do the analysis of success (fitness) of a strategy outside of the encounter itself is very limiting. We (as people) perform analysis of the current situation and adapt our strategy to it within the encounter itself, while this technique does not. For example, we may be able to recognise what the opponent is trying to do and quickly adapt our strategy to counter it, turning an otherwise losing position into a winning one. If the fitness criteria could be evaluated within the game, and have an effect on the rule weighting within the encounter itself, and if it was coupled with an analysis phase of what the opponent is doing, I think you would have an extremely powerful system.

FreddieFreeloader on January 4th, 2008

In his [URL=http://www.cs.unimaas.nl/p.spronck/Pubs/ThesisSpronck.pdf]Ph.D.-thesis[/URL], where he first presents dynamic scripting, Pieter Spronck also applies dynamic scripting to the RTS-game WARGUS (an implementation of Warcraft II), though it did, as Robin mentioned, require some modifications from the RPG-based solution. Particularly interesting is how offline learning is used to improve behaviour based on results found by online learning. These two techniques have great promise for complementing each other. It's not difficult to imagine online learning being used to provide a challenge as the game plays (obviously), while collecting data from these games (with thousands of customers playing there should be plenty) to use in offline evolutionary learning. As the process is fully automated, producing new AI patches would be a breeze, though verification requires some thought. To some extent it can be done automatically as well, by pitting them against strategies from the previous patch. It could be useful to have different batches of strategies, some performing well against rushes, some performing well against defensive players etc.

Also, Pieter suggests that machine learning can be used to verify and improve manually designed strategies, either directly, by doping with the designed strategies, or indirectly, by inspecting strategies generated automatically against the manually designed ones. I can't quite remember what his student is looking into, but I think it had something to do with the technique not scaling well to more complex scenarios (or maybe I'm completely off base).

An important point about DS is that it is not about evolving complex behaviour from a clean slate; both Pieter and many others (for instance John Laird) are sceptical about whether that is possible. If not computationally infeasible in general, it is at least infeasible in games. Instead it's about adapting already complex behaviour by allowing the computer to make informed decisions about how to choose between suitably coarse-grained building blocks of behaviour. There are some interesting parallels between the process of doping in evolutionary AI (initializing the population with good solutions to hard instances) and this type of online learning.

I do think there is great potential for DS or DS-like techniques in many types of games, but, as both Alex and Robin mentioned, it is not a one-size-fits-all solution (then again, what is?), and it is not entirely trivial nor obvious how to modify the technique. For any game, what is important is that there is some suitable domain knowledge (for DS the individual scripts are pieces of domain knowledge) that can be used to generate intelligent behaviour. For HTN-planners and behaviour trees I suspect that a DS-like technique could be employed to choose between meaningful decompositions. Preconditions on actions would allow pre-pruning the available decompositions before making a selection (custom selector, Alex!) based on weights calculated in a DS-like manner or an [URL=http://www.cs.unimaas.nl/p.spronck/Pubs/AIIDE07Spronck.pdf]automatic ordering[/URL]. (In fact I'm going to investigate this over the summer in my MSc-thesis.) Note that there are certain restrictions that further help DS produce intelligent behaviour, e.g. never drinking a healing potion at full health. Thus available decompositions are analogous to available scripts in DS.

I don't think that dynamic scripting has ever been shipped in a commercial game (including Halo 3), though Pieter did mention that BioWare were investigating it further for use in Dragon Age. He couldn't give details, since he hadn't been told more than that. However, there is no doubt that the technique is not just interesting for academics, considering how enthusiastically developers responded to Pieter's presentation at the Kick-Off event.

kierand on January 4th, 2008

Here are my notes about Pieter's talk from the AI for games network launch event, which I also attended. As an academic, he really understood the issues that game developers face (he had actually taken the time to speak to some!). He gave some very comical examples of how some game AI is as bad now as it was 20 years ago. He then went on to talk about his research using machine learning, both offline, where it is adapting to the game engine, and online, where it is adapting to the player.

He then described and demoed a dynamic scripting solution where a typical NPC script is written on the fly using rules from a knowledge base. The weights of the rules in this knowledge base are updated during game-play as a reaction to winning or losing fights. This way, the AI can adapt to a player's playing style. He showed that his system can adapt to and beat a new (never before seen) static script within 3-10 fights. He has demoed this technology to BioWare, who may be using it in the future. While creating his system, BioWare spent a year creating some statically scripted AI for the game test-bed, and the dynamic scripting approach generated scripts that would beat that AI in less than 30 minutes.

He has also implemented automatic difficulty scaling into his system, which adapts so that a player will win and lose an equal number of times. While he agrees that this isn't suitable for all games and all users, he thinks this system is good for novices learning to play the game. He wants to create AI that can be dropped into another game and just work. He also believes that there will be computer game AI that will pass the Turing test in 10-20 years.
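The automatic difficulty adjustment kierand mentions can be sketched as a cap on rule weights that tightens when the AI wins and relaxes when it loses, nudging both sides toward an even win rate. This is an editorial illustration only; the class name and constants are assumptions, though Spronck's later work does describe similar weight-clipping mechanisms.

```python
class DifficultyScaler:
    """Sketch of difficulty scaling for a dynamic-scripting rulebase:
    a tighter cap on weights makes rule selection closer to uniform
    (weaker play); a looser cap lets strong rules dominate."""

    def __init__(self, w_max=400, w_floor=100, w_ceiling=2000, step=50):
        self.w_max = w_max          # current cap on any rule weight
        self.w_floor = w_floor      # never tighten the cap below this
        self.w_ceiling = w_ceiling  # never relax the cap above this
        self.step = step

    def record_result(self, ai_won, rules):
        # Tighten the cap after an AI win, relax it after an AI loss.
        if ai_won:
            self.w_max = max(self.w_floor, self.w_max - self.step)
        else:
            self.w_max = min(self.w_ceiling, self.w_max + self.step)
        # Re-clamp existing weights under the new cap.
        for name in rules:
            rules[name] = min(rules[name], self.w_max)
```

Hooked into the weight-update step after each encounter, this pushes the AI's strength toward whatever level the player can match.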



Andrew on January 6th, 2008

I'm sure I read about this before. It was pretty cool, but in the case of Neverwinter Nights, also can be pretty random. Robin, just so you know, the default Neverwinter Nights AI in either 1 or 2 (even with latest patches) is pathetic, and mainly built on not doing anything immediately dumb rather than putting up any intelligence, since it works on the basis of "get best spell, try casting it", which ultimately fails in many cases - the next best option being "get random spell". :( Still, it is good to see it put into action and able to learn at least reliably. For combat it would help - although in an RPG like NWN there are a great many unaccounted variables, in simpler ones - Oblivion or somesuch - it'd be easier to code in this type of learning. It usually needs to be given a base "understanding" though, since otherwise it could learn some very odd behaviours unless the opponent was another AI bot which was programmed to be not dynamic.
