Online Adaptation of Game Opponent AI
in Simulation and in Practice

At a certain level, adaptation is a requirement for AI. If a game doesn’t have it, you end up living inside a movie you can’t change! In practice, all the traditional techniques (e.g. finite state machines, scripts) help developers implement adaptive behaviors, but things get tricky when you want to adapt to many dynamic factors like player styles.

This week’s Thursday Theory post looks into the award-winning paper behind dynamic scripting, as proposed by Pieter Spronck from the Computer Science Department at Universiteit Maastricht. It presents some interesting ideas for automatically creating behaviors that can adapt to almost anything in the game.

Motivation

The process of creating entertaining AI for non-trivial games is becoming much harder for two reasons:

  1. Complexity: Large scripts are required to support games with many possible options, like role-playing games (RPGs). These scripts are often long, static, poorly factored, and hard to understand and maintain.

  2. Adaptability: With better AI opponents, the entertainment value of the game goes up. So there’s a strong incentive to make the AI more competitive by adapting to the player’s tactics.

Developers typically create adaptable behaviors by scripting each adaptation by hand, which of course increases the complexity of the solution.

Instead, this paper proposes the use of unsupervised online learning to solve the problem and investigates the feasibility of this approach. Dynamic scripting aims to be fast at run time, effective (producing behaviors that are competitive with hand-crafted AI), robust in dealing with uncertainty, and efficient at learning from few trials.

Screenshot 1: An encounter with AI opponents in Baldur’s Gate.

Contributions

There are two things that make this particular research project interesting:

  • An architecture that combines reinforcement learning with scripting.

  • Its application to opponent AI in state-of-the-art RPGs: a simulation in the style of BioWare’s Baldur’s Gate, and a module for Neverwinter Nights.

The architecture itself is very reminiscent of Learning Classifier Systems (LCS) originally invented by John Holland. Here’s how it works:

  1. There’s a rulebase, typically one for each opponent, made up of hand-crafted rules that perform actions in the game.

  2. At the start of an encounter, a new script is generated randomly from the rulebase. The chance that a rule is selected depends on its weight value.

  3. Once the encounter is over, the outcome (either positive or negative) is used to update the rule weights, rewarding or penalizing the rules that made up the script.

Over time, the rules with the most success are more likely to get picked for new scripts, thanks to the reinforcement they receive from each encounter.
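As a rough illustration of the three steps above, here is a minimal Python sketch. It is not the paper’s implementation: the `Rule` class, the rule names, the reward and penalty sizes, and the weight bounds are all assumptions picked for readability, and the win/lose outcome stands in for whatever fitness measure a real game would compute.

```python
import random
from dataclasses import dataclass


@dataclass
class Rule:
    name: str              # hypothetical rule identifier
    weight: float = 100.0  # assumed starting weight


def generate_script(rulebase, size, rng=random):
    """Pick `size` distinct rules; each draw is weighted by current rule weight."""
    pool = list(rulebase)
    script = []
    while pool and len(script) < size:
        weights = [rule.weight for rule in pool]
        index = rng.choices(range(len(pool)), weights=weights)[0]
        script.append(pool.pop(index))
    return script


def update_weights(rulebase, script, won,
                   reward=30.0, penalty=20.0, w_min=10.0, w_max=400.0):
    """Reward or penalize the rules in the script, then spread the opposite
    change over the unused rules so the total weight stays roughly constant.
    All numeric defaults are illustrative assumptions."""
    used = {id(rule) for rule in script}
    delta = reward if won else -penalty
    applied = 0.0
    for rule in script:
        new_weight = min(w_max, max(w_min, rule.weight + delta))
        applied += new_weight - rule.weight
        rule.weight = new_weight
    unused = [rule for rule in rulebase if id(rule) not in used]
    if unused:
        share = -applied / len(unused)
        for rule in unused:
            rule.weight = min(w_max, max(w_min, rule.weight + share))


# Usage: a handful of encounters with a coin flip standing in for the fight.
rulebase = [Rule("melee_attack"), Rule("cast_fireball"),
            Rule("drink_potion"), Rule("flee")]
for _ in range(5):
    script = generate_script(rulebase, size=2)
    won = random.random() < 0.5  # placeholder for actually playing the encounter
    update_weights(rulebase, script, won)
```

Keeping a minimum weight above zero means every rule can still be picked occasionally, which is where the random exploration discussed later in this post comes from.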

Figure 2: The architecture of the dynamic scripting system (see paper).

Abstract & References

Here’s the abstract for the paper:

“Unsupervised online learning in commercial computer games allows computer-controlled opponents to adapt to the way the game is being played, thereby providing a mechanism to deal with weaknesses in the game AI and to respond to changes in human player tactics. For online learning to work in practice, it must be fast, effective, robust, and efficient.

This paper proposes a novel technique called “dynamic scripting” that meets these requirements. In dynamic scripting an adaptive rulebase is used for the generation of intelligent opponents on the fly. The performance of dynamic scripting is evaluated in an experiment in which the adaptive players are pitted against a collective of manually designed tactics in a simulated computer roleplaying game and in a module for the state-of-the-art commercial game NEVERWINTER NIGHTS.

The results indicate that dynamic scripting succeeds in endowing computer-controlled opponents with successful adaptive performance. We therefore conclude that dynamic scripting can be successfully applied to the online adaptation of computer game opponent AI.”

You can download the paper from the website (PDF, 614 Kb):

Online Adaptation of Game Opponent AI in Simulation and in Practice
Spronck, P., Sprinkhuizen-Kuyper, I. and Postma, E.
Proceedings of the 4th International Conference on Intelligent Games and Simulation (GAME-ON 2003)

If you’re interested in further academic background behind these ideas, look into Learning Classifier Systems, and particularly Wilson’s XCS (which will no doubt also feature on AiGameDev.com in the future).

Evaluation

Here’s how I think the technology in the paper ranks in practice:

Applicability to games: 8/10
The dynamic scripting technique is designed with very sound principles from a game development perspective. It’s efficient and doesn’t require much memory. Not all games would benefit from having this kind of adaptation, though, as it takes a certain number of encounters with a specific opponent for the AI to be able to learn something!
Usefulness for character AI: 6/10
The paper shows how dynamic scripting can be applied to combat in an RPG, for things like selecting attacks. Luckily, this is the kind of situation that wouldn’t suffer too much from random behavior, so the learning process is not too obvious. (In fact, randomness can be fun here.) However, other problems in game AI are much more sensitive to randomness, which limits the applicability of this technique.
Simplicity to implement: 10/10
The major benefit of this technology is that it’s extremely simple to implement. Reinforcement learning is conceptually very simple, and when it’s hooked up to a simple rule-based representation it requires very little work to generate new scripts.

Screenshot 3: Combat in Neverwinter Nights 2.

Discussion

This paper, above all, shows the promise of unsupervised learning (which I think is very fertile ground) and demonstrates it in a game. There has been much more research on dynamic scripting since, and I expect many more research projects applying LCS-like technology to games…

For this paper in particular, what surprises me the most is the lack of references to Learning Classifier Systems. For academic work, Pieter’s research is unusually well grounded on the game development side, and that’s great to see! However, there’s a huge amount of knowledge and experience available in the field of LCS, particularly on how quickly a system learns and how it behaves when solving difficult problems.

As it turns out, the biggest problem with dynamic scripting is the very same one that Learning Classifier Systems suffer from. While they are very good at learning to solve problems realistically, the learning process itself is not very realistic. Random exploration is a requirement, but it can appear uninformed and illogical. This would cause problems in games where coherent behaviors are required (unlike combat).

Figure 4: Dynamic scripts get better over time, but do they learn realistically?

Basically, it’s the premise of generating these long “procedural” scripts from rules that causes the problem. The script representation is already hard for designers to work with in the first place, so simply hooking it up to an AI that uses random exploration and reinforcement learning will raise other problems!

Since game developers have been increasingly using planners to reduce the complexity of their AI logic, it’d be interesting to see how these two approaches combine. I raised the question in the article “The Secret to Building Game AI that Learns Realistically”; it’d be a fascinating research project.

What do you think about the technology behind dynamic scripting, and its applicability to games?

7 Comments ↓

#1 Robin Baumgarten on 01.03.08 at 5:30 pm

I met Pieter Spronck at the Kick-Off meeting for the Artificial Intelligence and Games Research Network last month, where he gave a keynote on the problems of AI in current games (specifically RPGs) and also gave a short introduction to dynamic scripting and its application to Neverwinter Nights. He himself stated that dynamic scripting is not meant to resolve the problems he mentioned in the first part of his speech (and I think the main reason he added dynamic scripting to the talk at all was that Peter Molyneux canceled his keynote just a couple of hours before he was due to give it. Ah, these gaming celebrities!).

In fact, the dynamic scripting approach gets harder to apply to more complex problems, for example to real-time strategy games where coherent behaviour is required. Afaik, a student of Pieter’s is researching the applicability of DS to RTS, and the DS algorithms required some major modifications.

So I agree with Alex: DS is restricted to a small set of problems where randomness in the learning process can be tolerated. More sophisticated strategies where actions depend on each other cannot be learned easily, or have to be grouped into one big action, which again requires the intervention of the designer.

However, DS was quite successful in Neverwinter Nights: it could easily defeat the built-in AI, and it took Pieter only a very short time (iirc an hour or so) to adapt his algorithm to beat the next AI, which the BioWare team released some time later as a patch for Neverwinter Nights that was meant to improve the AI substantially.

#2 FuriCuri on 01.03.08 at 7:54 pm

Nice. Very interesting, as usual.
All I want to ask: does Halo 3 implement any dynamic scripting? Cause I just finished Halo 3 and all I can say is that I’m very pleased with the opponent AI. It’s just so good, so fun to play with. It’s like each mob has its own AI written personally for him, for only this instance and location.

I ask this cause I have doubts about the usefulness of dynamic AI adaptation for the best playing experience.

PS: btw, it would be great if an article discussion were automatically created in the forums (so each article has its own thread).

#3 The Recursion King on 01.04.08 at 8:39 am

I read this with great interest: a simple way to implement a learning AI in a game. Very good stuff. However, I see an additional weakness to the ones mentioned. Having to do the analysis of success (fitness) of a strategy outside of the encounter itself is very limiting. We (as people) perform analysis of the current situation and adapt our strategy to it within the encounter itself, while this technique does not. For example, we may be able to recognise what the opponent is trying to do and quickly adapt our strategy to counter it, turning an otherwise losing position into a winning one. If the fitness criteria could be evaluated within the game and have an effect on the rule weights within the encounter itself, and if this were coupled with an analysis phase of what the opponent is doing, I think you would have an extremely powerful system.

#4 Mikkel on 01.04.08 at 9:27 am

In his Ph.D. thesis, where he first presents dynamic scripting, Pieter Spronck also applies dynamic scripting to the RTS game WARGUS (an implementation of Warcraft II), though it did, as Robin mentioned, require some modifications from the RPG-based solution. Particularly interesting is how offline learning is used to improve behaviour based on results found by online learning. These two techniques have great promise for complementing each other. It’s not difficult to imagine online learning being used to provide a challenge as the game plays (obviously), while collecting data from these games (with thousands of customers playing there should be plenty) to use in offline evolutionary learning. As the process is fully automated, producing new AI patches would be a breeze, though verification requires some thought. To some extent it can be done automatically as well, by pitting them against strategies from the previous patch. It could be useful to have different batches of strategies, some performing well against rushes, some performing well against defensive players etc. Also, Pieter suggests that machine learning can be used to verify and improve manually designed strategies, either directly, by doping with the designed strategies, or indirectly, by inspecting strategies generated automatically against the manually designed ones. I can’t quite remember what his student is looking into, but I think it had something to do with the technique not scaling well to more complex scenarios (or maybe I’m completely off base).

An important point about DS is that it is not about evolving complex behaviour from a clean slate; both Pieter and many others (for instance John Laird) are sceptical about whether that is possible. If not computationally infeasible in general, it is at least infeasible in games. Instead it’s about adapting already complex behaviour by allowing the computer to make informed decisions about how to choose between suitably coarse-grained building blocks of behaviour. There are some interesting parallels between the process of doping in evolutionary AI (initializing the population with good solutions to hard instances) and this type of online learning.

I do think there is great potential for DS or DS-like techniques in many types of games, but, as both Alex and Robin mentioned, it is not a one-size-fits-all solution (but then again, what is?), and it is neither trivial nor obvious how to modify the technique. For any game, what is important is that there is some suitable domain knowledge (for DS the individual scripts are pieces of domain knowledge) that can be used to generate intelligent behaviour. For HTN planners and behaviour trees I suspect that a DS-like technique could be employed to choose between meaningful decompositions. Preconditions on actions would allow pre-pruning the available decompositions before making a selection (custom selector Alex!) based on weights calculated in a DS-like manner or an automatic ordering. (In fact I’m going to investigate this over the summer in my MSc thesis.) Note that there are certain restrictions that further help DS produce intelligent behaviour, e.g. never drinking a healing potion if at full health. Thus available decompositions are analogous to available scripts in DS.

I don’t think that dynamic scripting has ever been shipped in a commercial game (including Halo 3), though Pieter did mention that BioWare were investigating it further for use in Dragon Age. He couldn’t give details, since he hadn’t been told more than that. However, there is no doubt that the technique is not just interesting for academics, considering how enthusiastically developers responded to Pieter’s presentation at the Kick-Off event.

#5 Kieran on 01.04.08 at 12:52 pm

Here are my notes about Pieter’s talk from the AI for games network launch event, which I also attended.

For an academic, he really understood the issues that game developers face (he had actually taken the time to speak to some!). He gave some very comical examples of how some game AI is as bad now as it was 20 years ago. He then went on to talk about his research using machine learning, both offline, where it is adapting to the game engine, and online, where it is adapting to the player. He then described and demoed a dynamic scripting solution where a typical NPC script is written on the fly using rules from a knowledge base. The weights of the rules in this knowledge base are updated during gameplay as a reaction to winning or losing fights. This way, the AI can adapt to a player’s playing style. He showed that his system can adapt to and beat a new (never before seen) static script within 3-10 fights. He has demoed this technology to BioWare, who may be using it in the future. While he was creating his system, BioWare spent over a year creating some statically scripted AI for the game test-bed, and the dynamic scripting approach generated scripts that would beat that AI in less than 30 minutes. He has also implemented automatic difficulty scaling in his system, which adapts so that a player will win and lose an equal number of times. While he agrees that this isn’t suitable for all games and for all users, he thinks that this system is good for novices learning to play the game. He wants to create AI that can be dropped into another game and have it just work. He also believes that there will be computer game AI that will pass the Turing test in 10-20 years.

#6 Andrew on 01.06.08 at 9:41 am

I'm sure I read about this before. It was pretty cool, but in the case of Neverwinter Nights, it can also be pretty random. Robin, just so you know, the default Neverwinter Nights AI in either 1 or 2 (even with the latest patches) is pathetic, and mainly built on not doing anything immediately dumb rather than showing any intelligence, since it works on the basis of "get best spell, try casting it", which ultimately fails in many cases, the next best option being "get random spell". :(

Still, it is good to see it put into action and able to learn at least reliably. For combat it would help, although in an RPG like NWN there are a great many unaccounted variables; in simpler ones (Oblivion or somesuch) it'd be easier to code it in with this type of learning. It usually needs to be given a base "understanding" though, since otherwise it could learn some very odd behaviours unless the opponent was another AI bot which was programmed to be not dynamic.

#7 John LaBouchardiere on 02.15.08 at 2:32 pm

This is all very interesting. I have been playing games all my life, from NES to PS3 and PC (Stalker: what a disappointment the AI life cycles were, as was the hyped ability of the AI to finish the game), and games are easier for me now than when I was younger. I suppose this is due to me learning the different approaches AI uses and recalling the most relevant one to help me in my current situation.
So for me, until AI has massive amounts of past experience against a lot of different players and play styles, it won’t be challenging to me without using cheating methods like fixing random events in its favour, or having god-like knowledge of my actions without me being able to see the AI’s, like fog of war in RTS.

Using repeated tests of methods and adding or reducing weight according to the outcomes will never be successful, as the whole approach is statistics-based, not relative to the situation and the special skills of the current opponent.

Players are more than capable of exploiting systems of evaluation based on maths and statistics at very short notice, e.g. the game “falling forever”, where the AI boss uses the weapons that appear most effective against you, but can be tricked into thinking a weapon is effective by the player allowing themselves to take damage from an easily avoided attack.
