Machine learning conceptually has many benefits for games, notably for reducing development times and creating AI that can adapt to the player. However, it is difficult to apply in the real world! Transfer learning can help by improving the speed and quality of the learning. The idea is to use knowledge from previous experiences to improve the process of solving a new problem — just like human players learn general RTS strategies and apply them to different maps.
This week’s Thursday Theory article looks at a paper entitled Transfer Learning in Real-Time Strategy Games Using Hybrid CBR/RL from the Georgia Institute of Technology. But first, a quick explanation of those acronyms:
Case-based reasoning (CBR) is a set of techniques for solving new problems from related solutions that were previously successful.
Reinforcement learning (RL) is a set of algorithms for solving problems using positive or negative feedback from the environment.
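To make the CBR half of that pairing concrete, here's a minimal sketch of case retrieval: store past (state, solution) pairs, and reuse the solution of the most similar stored case. The case base, feature vectors and action names below are toy examples of mine, not from the paper.

```python
import math

# A case pairs a problem description (a feature vector) with a solution
# that previously worked, e.g. a tactical action that paid off.
case_base = [
    ((0.9, 0.2), "attack"),   # we were strong, enemy weak -> attack worked
    ((0.3, 0.8), "retreat"),  # we were weak, enemy strong -> retreat worked
]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(state):
    """Reuse the solution of the most similar stored case."""
    return min(case_base, key=lambda case: distance(case[0], state))[1]
```

With this toy case base, `retrieve((0.8, 0.3))` lands on the "strong us, weak enemy" case and returns "attack"; a full CBR system would also adapt the retrieved solution and store the outcome as a new case.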
Theory aside, the paper also has a few interesting design decisions in the way the AI is structured — which should help anyone currently developing an RTS type of game. See the discussion section below.
For an RTS AI to be applicable to many different maps, programmers must give it general skills, and it often helps if level designers annotate the maps. However, applying machine learning (ML) to the problem as a whole is much harder, as the system learns specific details about the map, which can reduce its ability to work on other maps. (This problem is known as over-fitting in ML research, i.e. learning too specifically from examples.)
Beyond just applying machine learning to RTS games, this research aims to:
Show that transfer learning can help the AI adapt to new levels and situations in a much quicker way than having to learn from scratch.
Develop a system that’s capable of achieving transfer learning in a commercial RTS game, in particular by combining modern AI techniques together.
DARPA has identified ten levels of transfer learning that can take place. This paper tackles the first five: memorization, parameterization, extrapolating, restructuring, and extending.
Figure 1: Two simplified RTS maps to demonstrate transfer learning.
Generally speaking, there are three things that make this particular research project interesting:
A multi-layered architecture that helps the AI strategically as well as tactically.
The combination of reinforcement learning with case-based learning.
Design decisions in how to model the problem for the different levels in the architecture.
This is implemented as follows:
There’s a central database with a collection of rules, mapping states to possible actions, each with an associated utility value. These rules are used by the other two components of the system as the basis for the tactical behavior.
A learning component takes feedback from the environment and updates the utility value of each action. This is done using a reinforcement learning policy called TD-learning, which estimates whether there has been any improvement since the last step.
The planner then takes these rules, and computes a plan of action randomly based on the utility of the actions.
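A rough sketch of how those three pieces could fit together: the rule database as a state-to-utilities table, a TD-style update for the learning component, and a planner that picks actions randomly in proportion to their utility. The function names, action set mapping, learning rate and discount factor are my own assumptions, not details from the paper.

```python
import math
import random

# Central rule database: each state maps to a utility for every action.
ACTIONS = ["attack", "explore", "retreat", "conquer"]
rules = {}  # state -> {action: utility}

def utilities(state):
    """Fetch (or lazily create) the utility table for a state."""
    return rules.setdefault(state, {a: 0.0 for a in ACTIONS})

def td_update(state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """TD-style learning step: nudge the action's utility toward the
    reward plus the discounted best utility of the successor state."""
    q = utilities(state)
    target = reward + gamma * max(utilities(next_state).values())
    q[action] += alpha * (target - q[action])

def plan(state, temperature=1.0):
    """Pick an action randomly, weighted by a softmax over utilities,
    mirroring the 'random plan based on utility' idea."""
    q = utilities(state)
    weights = [math.exp(q[a] / temperature) for a in ACTIONS]
    return random.choices(ACTIONS, weights=weights)[0]
```

After repeated positive feedback for one action in a given state, `plan()` starts favoring it while still occasionally exploring the alternatives, which is what lets the learner keep adapting.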
Apart from the planner, these ideas are very similar to John Holland’s learning classifier system (LCS), and particularly Stewart Wilson’s version called XCS. Learning classifier technology is, however, not mentioned in this paper (which seems to be a common omission).
Screenshot 2: The architecture of this hybrid Case-Based Reasoning & Reinforcement Learning system.
Abstract & References
Here’s the abstract for the paper:
“The goal of transfer learning is to use the knowledge acquired in a set of source tasks to improve performance in a related but previously unseen target task. In this paper, we present a multi-layered architecture named CAse-Based Reinforcement Learner (CARL). It uses a novel combination of Case-Based Reasoning (CBR) and Reinforcement Learning (RL) to achieve transfer while playing against the Game AI across a variety of scenarios in MadRTS™, a commercial Real Time Strategy game. Our experiments demonstrate that CARL not only performs well on individual tasks but also exhibits significant performance gains when allowed to transfer knowledge from previous tasks.”
You can download the paper from the website (PDF, 352 Kb):
Transfer Learning in Real-Time Strategy Games Using Hybrid CBR/RL, Sharma M., Holmes M., Santamaria J.C., Irani A., Isbell Jr. C.L., Ram A., International Joint Conference on Artificial Intelligence (IJCAI), 2007.
Here’s how I think the technology in the paper ranks in practice:
- Applicability to games: 5/10
- It would take a fair amount of R&D to apply this kind of technology to a game. Reinforcement learning is very useful, but it does require a certain amount of experience. Certain games (including RTS) would benefit from this kind of unsupervised learning, although there are many alternatives.
- Usefulness for character AI: 4/10
- The technology is flexible enough to apply to character behaviors, but it’s challenging to find ways for the AI to learn realistically when faced with complex problems. In this case, the RTS strategies and tactics chosen are relatively tolerant of random behaviors.
- Simplicity to implement: 3/10
- This architecture is a mash-up of modern AI techniques, as well as a few custom bits of math for the reinforcement learning. Pulling it off will take a fair bit of time and some experience in machine learning.
Screenshot 3: The ability of the system to do transfer learning in restructured problems, and extended problems.
While the technology may not be robust enough to include in a game right now, this kind of research is very promising. The combination of planners with unsupervised learning was discussed previously in this article on online adaptation, so it’s becoming an increasingly popular approach.
On the design side, there are a few interesting things to note about the way the problem was solved:
The top level strategy is hard coded. In this case, it helps give the machine learning some guidance, but the top level of an AI architecture is usually very simple and rarely needs too much work.
Tactical decisions comprise: Attack, Explore, Retreat and Conquer Nearest Territory. This is a very simple action set, but it’s more than enough to entertain the player. There are many more actions at the lower level to fill in the details.
The information used by the AI to make tactical decisions is the following: average health of living troops, percentage of initial strength, the opponent’s percentage of strength, territories owned on the map, and percentage of enemy territories.
Regardless of technology, these little decisions are often what make or break any AI system. This process is typically called feature selection. If you’re building an RTS using a traditional scripting method, think about giving your AI this additional information too!
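As a sketch, those five tactical features could be packed into a single state vector like this; the unit representation and parameter names are invented for illustration, not taken from the paper.

```python
def tactical_features(my_units, initial_count, enemy_initial, enemy_count,
                      my_territories, enemy_territories, total_territories):
    """Build the five-feature tactical state vector described above.
    All parameter names here are illustrative assumptions."""
    alive = [u for u in my_units if u["health"] > 0]
    avg_health = sum(u["health"] for u in alive) / len(alive) if alive else 0.0
    return (
        avg_health,                            # average health of living troops
        len(alive) / initial_count,            # our strength vs. initial strength
        enemy_count / enemy_initial,           # enemy strength vs. their initial
        my_territories / total_territories,    # share of the map we own
        enemy_territories / total_territories, # share the enemy owns
    )
```

Keeping the features as normalized ratios rather than raw counts is part of what makes transfer plausible: the same vector means roughly the same thing on a small map and a large one.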
What do you think about the technology in this paper, and its applicability to real-time strategy games?