Open Interview

Making Designers Obsolete? Evolution in Game Design

Alex J. Champandard on February 6, 2012

As the games industry struggles with designing increasingly complex systems and mechanics, AI is proving to be a great tool for handling this complexity. In particular, a growing number of researchers and companies are looking into using AI as a tool for assisting game design, and consequently reducing the time spent fine-tuning the properties of game elements.

In this interview with Paul Tozour we take a look at his current Kickstarter project City Conquest, a tower defense game that breaks from the usual pattern by including an offensive component as well. What we cover in this interview, however, isn't the game's novel mechanics or features; instead we look at the evolutionary process Paul used to tune the many parameters that keep the game balanced.

The Big Picture

Q: Did you find it necessary to try to express 'player fun' or 'player experience' into your fitness function? Is that even possible to achieve in practice?

For the first 16 years that I worked in the industry, I thought machine learning in games was pointless. I used to love to say "There’s no fitness function for fun." Whenever a conversation turned to machine learning techniques like neural nets or genetic algorithms, I’d just say "There’s no fitness function for fun" and argue that that makes machine learning useless for games, and that would be the end of it.


A Shield Generator.

And if you take it at face value, it's absolutely correct! I mean, imagine that you had a computer program that could actually tell you how much “fun” any given part of a game was. How could a program like that even work? How could it know that a certain part of your game is too repetitive? How could it know that fighting the Swamp Boss is more fun than the Lava Boss, that level 5 isn’t engaging because it has too much traversal, that the Plasma Gun isn’t fun to use while the Lightning Blaster is a thrill?

Writing a program like that would be like solving the Turing Test, only harder. You’d have to be able to fully model the player’s perception of the game and their interaction with it and accurately model all of the thousands of little reward centers buried in the human brain to know what actually makes a human player happy, and why we get such a bizarre thrill from making plumbers jump and clicking on cows.

So of course, that's impossible.

And saying “there’s no fitness function for fun” is actually more than correct – it’s also useful! Particularly in game AI development, people who are new to the industry sometimes come in with wildly unrealistic expectations for what machine learning can do and where it’s appropriate to use it. If they come from an academic background, sometimes they seem to think it’s “wrong” to solve game AI problems by hand, and they look for a magic wand they can wave to create the whole AI at once.

You have to push back against that mindset and shatter those illusions, and remind such developers that sometimes you can use machine learning to make a good AI system better, but it’s a terrible tool for trying to make a good AI in the first place.

And for a long time, I was satisfied with that answer.

But I eventually came around to the realization that while it’s correct, it's also irrelevant!

And I had a huge change in perspective. I came to the realization that machine learning isn’t a tool for game AI – it’s a tool for game design.

You may not be able to write a fitness function to put an exact number on entertainment value, but if you can state your design goals clearly, then you can very often write a fitness function that measures whether some part of your game satisfies them.

And then it's about selecting the right design goals in the first place to create the kind of entertainment experience you’re looking for, and determining which of those design goals are appropriate targets for any kind of AI-assisted tuning.

For example, with City Conquest, I had a specific tactical role in mind for every offensive and defensive building, and a lot of ideas as to what I expected would constitute effective offensive and defensive strategies. But I also had specific design goals for how the buildings should work together:

  • Uniqueness: Each of the nine defensive buildings (towers) and offensive buildings (dropship pads for units) should be distinct from all the others.
  • Minimum bound on utility: No building type should be "underpowered" – every building should be important, and it should have some clear tactical role, and there should always be some scenario in which that building is the most important to winning.
  • Maximum bound on utility: No building type should be "overpowered" – in other words, no single building type, or combination of a limited number of building types, should allow the player to win the game against a non-reactive opponent.
  • Cost equivalence: Every building in the game should have a resource cost equivalent to its actual utility in the game.
  • Combined arms: Combinations of several different building types should be far more effective than building only a small number of building types (both offensively and defensively).
  • Reactivity: The optimal strategy for each player should depend significantly on the other player's strategy.

[To be clear, my system doesn’t actually explicitly measure these in its fitness function right now, and I don’t want to sound like I’m overpromising. But it’s a good example of the kind of design goals that computational intelligence can help you optimize against, if not always directly.]
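Paul is explicit that his fitness function doesn't measure these goals directly, but as a purely illustrative sketch, here is how one of them, cost equivalence, might be encoded as a penalty term. The types, names, and utility metric here are all hypothetical:

```cpp
#include <cmath>
#include <vector>

// Hypothetical per-building statistics gathered from simulated games.
struct BuildingStats
{
    double cost;             // resource cost of the building
    double measured_utility; // e.g. average damage contribution per game
};

// A possible penalty term for the "cost equivalence" goal: the further a
// building's observed utility drifts from its cost, the larger the penalty.
double CostEquivalencePenalty( const std::vector<BuildingStats> & stats )
{
    double penalty = 0.0;
    for ( const BuildingStats & b : stats )
        penalty += std::fabs( b.cost - b.measured_utility );
    return penalty;
}
```

An optimizer could then try to drive a term like this toward zero while the other design goals contribute their own terms.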

Q: Do you feel that machine learning algorithms have been unfairly prejudiced against in the game industry?


The Disruptor tower.

Not necessarily. There’s a bias, but it’s not completely unjustified.

Some forms of ML (neural networks) are genuinely useless; many of them (genetic algorithms) are much too slow to be used at runtime; the results aren’t predictable; and as I mentioned, newcomers to the industry are sometimes very unrealistic about its potential and give it a bad reputation. There’s nothing worse than an AI developer who wants to wave the magic wand of “machine learning” instead of writing code.

But I’m not doing this because I’m any kind of ivory-tower academic or a starry-eyed idealist. I’m an industry veteran; I worked on two Metroid Prime games and was employee #3 at Gas Powered Games. I’m doing it because it works, and because I’m a capitalist and I believe in raising quality and lowering costs.

Q: What led you to use computational intelligence for game balancing?

I got tired of designing AI, and I realized it was time to “AI” design.

Here’s a thought experiment:

How are people going to design games 100 years from now?

Never mind that we have no idea what games themselves will look like that far into the future. A “game” probably won’t look anything like it does today. Maybe they’ll be holodecks or something. But it doesn’t matter.

In the year 2112, how will game designers get their jobs done?

What kind of tools are they going to use?

I don’t know the answer, but there’s no question it’s not going to look like what we have today. Game design is still in its infancy – or maybe its messy and chaotic pre-teen years.

A century from now, what it means to be a “game designer” will have evolved beyond recognition. We won’t have designers giving hand-wavy answers to basic design questions, making sweeping design changes without understanding the side-effects, or spending inordinate amounts of their team’s time (and their company’s resources) exploring the design space in the name of “iteration”.

At a minimum, we’ll have a shared design language across the industry, a shared understanding of what constitutes “good design”, and vastly more powerful design tools.

Imagine that you wanted to build a bridge. You’d need to hire a licensed structural engineer, wouldn’t you? That engineer would have to have earned a civil engineering degree with 3-5 years of study under their belt, and then satisfy a host of requirements to become certified to build bridges.

And that happens because everybody understands that bridges can fail, and earthquakes and wind and traffic accidents can topple them if they're not designed correctly, and things like that can kill people. So you have to spend a lot of time studying to understand how to build a bridge the right way.

Everybody appreciates all of that, so everybody accepts that certifying civil engineers is the right thing to do to make sure they build good bridges.

But there’s nothing like that for games.

We consistently hire people who are enthusiastic rather than people who are genuinely qualified – we prefer charismatic designers who can talk the talk over cautious practitioners who can walk the walk. We take million-dollar projects and put them in the hands of people with no adequate design training, and then we fail to train them further or build a shared design vision between them. We let them endlessly throw ideas against the wall to see what sticks, and then we wonder why we end up with burned-out teams, failed projects, and studio failures out the other end.


The Ground Slammer.

And mostly that happens because we don’t talk about what happens when designers fail. We hide the devastating consequences of design failure. When a bridge collapses somewhere, it’s too big to hide and the whole world knows about it … but in the game industry, a design collapse is painful and embarrassing and we have the luxury of hiding it in our own studios.

And that has to stop. It’s just too expensive and too risky to keep doing design that way. Design has become the bottleneck.

So we need two things to happen.

First, game design needs to grow into a real discipline. I’m not saying we need actual certification like you have in civil engineering (though that might not be a bad idea), but the discipline of game design needs to get out of its infancy (or chaotic pre-teen years) in a hurry.

And secondly, we need vastly better tools.

If you were to go back in time 30 years and ask designers what tools they’d like to see in the future, they would have described some amazing tools for building levels and doing scripting in three entire dimensions. And now those tools exist! We have engines like Unity and Unreal and Gamebryo that are everything a designer in 1981 would have asked for and more.

But these aren’t the only tools we need. Unity and Unreal can’t give you any insights on your game design. They can help you build your game, but they can’t tell you anything at all about the design decisions you make.

There’s so much more that can exist, and that will have to exist someday. It’s just that we don’t see that, because those tools haven’t been built yet – so we don’t realize they should exist, and we don’t realize they’re missing!

We have game “engines,” but we don’t realize that the rest of the car is missing.

Again, I used to think machine learning techniques were useless. But I got to a point where I stopped thinking of machine learning as a game AI tool and started thinking of it as a design tool, and that changed everything.

Take Wall Street as an example. There used to be a time when the New York Stock Exchange was full of traders yelling and making complicated hand gestures to each other to buy and sell shares. It only took a decade for all those pits to be abandoned and replaced with computers. Since then it’s been an ongoing arms race of ever faster servers and ever tighter connections and lower latency for high-frequency trading within microseconds. And it will never go back to the old days.

Or I could compare it to baseball recruiting. Baseball recruiting used to be a very subjective process. It was ten managers sitting in a room arguing about who to recruit to fix a broken baseball team, without really understanding what they should be looking for and no clear idea how to properly evaluate the factors that make a baseball team work. “We can’t hire that guy – he has an ugly girlfriend; that means no self-confidence!”

And in the book and movie Moneyball, you see how Billy Beane (played by Brad Pitt) hired an economics PhD from Yale and put together a statistical modeling approach that ultimately drove the Oakland Athletics to a staggering string of victories and changed baseball recruiting into a quantitative discipline. And it will never go back to the old days.

Game design right now is like the NYSE before computers, or baseball recruiting before Billy Beane.

And it will change, so we might as well embrace it. Moore’s Law is still in effect, and we have a massive amount of computing power at our disposal in cloud computing clusters.

Every day, computing power becomes a little bit cheaper and game designers become a little bit more expensive.

We’ve finally arrived at the point where we can actually run the millions of gameplay simulations we need to run to do GAs effectively and get those insights back into the hands of designers in a few hours or days, and it’s inevitable that that will transform the design process.

Q: So can we get to a point where we replace designers once and for all? :-)


The Lightning Tower.

My hope is that the future will bring much more powerful tools to bear on game design problems. We’ll have all the powerful game engines we have today, but we’ll also have the equivalent of a GPS and navigation system for game designers to guide them toward the decisions that satisfy their design goals.

And that will also raise the bar for what a game designer is expected to do to work effectively. So the best designers will have more powerful tools that help them design better games, faster … and the pretenders will find themselves out of their league and will probably have to find new jobs.

Both of those are good things.

Q: Currently you're using AI as a tool to assist the design process. Do you think it's possible to use genetic algorithms to optimize the game's parameters rather than just finding the dominant strategy?

Definitely. My approach is just one particular way of using GAs for one particular kind of balancing problem. It’s only a starting point; it’s not the only approach and it’s not necessarily the best one.

Phillipa Avery and Julian Togelius wrote a really neat paper (with Alistar and van Leeuwen) on using GAs to tune parameters for a more traditional tower defense game, where they evolved the parameters for their creeps and towers directly.

Alex Jaffe of UW also has a really terrific paper coming out on this topic (along with several co-authors). I’ve gotten a sneak preview but I’m sworn to secrecy.

There’s one company I know of with a broadly similar overall concept – the +7 Balance Engine. But their approach is very different. Their webpage says that they want to ensure that there are no dominant strategies.

I don’t think that’s a valid approach. You have to empower designers, and let them decide whether there should be any dominant strategies. Maybe they do want one dominant strategy! You can’t take someone else’s game and say “Here, let me balance that for you” because you can’t know what they mean by “balance.” You can’t anticipate all of their design goals in advance.

It’s about giving valuable feedback to the designers so they can ensure that their gameplay meets their design goals.

The big-picture goal is to build systems that dramatically expand designers’ visibility over the dynamics of the gameplay they create. We’d like to build systems that shine a floodlight on the fitness landscape and reveal which player strategies emerge from every design decision.

And we want to be able to give them visibility on how any design change they could make will affect those dynamics, so designers can ask questions like, “what happens if I tune this parameter here – how does that change the set of viable player strategies?” Or alternatively, “What would I need to change in my parameters to make strategy X more (or less) desirable?”

Behind The Scenes

Q: How would you rate the importance of the underlying evolutionary algorithm compared to the overall infrastructure/process? How long did it take to implement this part of the code?

It’s been ~10% of my total engineering time. There were 3-4 days of dedicated coding to create Evolver in September – we had to split Evolver into a separate build, disable the renderer, increase the time step in the update loop, optimize the gameplay code, and then make it parallel using Microsoft’s PPL library. That got us to a point where we can run several million entire games overnight.

Since October, it’s only been 30-90 minutes a day of tuning. We have a dedicated Dell desktop PC running Evolver 24/7. Every day or two, we terminate the Evolver and extract the two best players by fitness score (one each from the red and blue player populations). Then we plug them into the game, play them back, watch how they play each other, and decide how to tune the game based on that gameplay session. Then we make the code changes and run it again, and only occasionally dig deeper and make tweaks to Evolver itself.


The Grenade Launcher.

So that’s very handy. It’s like having a virtual playtesting team that works for free. It cost me a few days of development, but when I compare that to the salary I’d have to pay somebody to do that full-time, it’s a no-brainer.

Of course, it’s not a magic bullet. You still have to do manual playtesting, and it’s still a good idea to plug your parameters into Excel every now and then and fit them to a curve with Solver. But those are now small, secondary tasks, not the core of the balancing process itself.

So once I get Evolver’s output, I can plug it back into my game and watch the red and blue players duke it out. If I see a pattern where both players are over-using certain buildings or upgrades or combinations of buildings, then I probably need to “nerf” those features somehow – either I need to make them weaker, or I need to increase the resource cost to buy them in the first place.

Similarly, if there are certain buildings or upgrades that they never buy at all, I need to stop and ask myself why. Are they too expensive? Are they not useful enough? Or is there something else going on? Why is the Evolver not finding it useful enough to survive evolution? So I’ll increase its utility or decrease its cost to compensate.

And finally, I look at the context in which everything is being used and see if it matches my design goals for each unit. Sometimes I find the Evolver adapts the strategies I expected it to; other times, it surprises me, and I have to figure out whether I should tune the game to stop it from happening, or embrace it as an unexpected new strategy.

Some of the AI researchers I've been speaking with have raised the issue that my approach can really only identify one dominant strategy at a time. For example, you might have two strategies, A and B, that are both dominant, and with the way I’m doing it, the computer can only identify one of those dominant strategies.

And that's true, but you have to keep in mind that tuning is an ongoing process – it's not something you do in a single pass! I'm running the Evolver continuously and re-tuning every day or two based on its output. So if it only happens to find that A is the dominant strategy today, that’s fine, because it means I’m going to respond to that, and A will get nerfed so that B has a much greater chance to stick out like a sore thumb the next time I run it.

Q: Can you give us some examples of interesting results Evolver has given you?

This is more of a bug than a balancing issue, but I had a really interesting phenomenon on October 30 where the best red player script was always winning the game by a huge margin. And looking through the evolved script, it didn’t seem to be doing anything really unusual; the only unusual thing I could see was that it was building a Laser tower at a particular spot fairly early in the game.

Now, the Laser tower is a very unusual building in City Conquest. It’s the only defensive tower that doesn’t have a range limit – it can fire all the way across the map. It was designed to do a very small amount of damage to a very large number of units. The trick is to set it up the right way to ensure that it hits as much of the enemy army as possible.

So when I started playing back the scripts in the game, I was floored – the red player’s Laser was positioned in just the right spot where it could target the blue player’s Collectors from all the way across the map! It took 8-9 shots to take them out, but once it did, the blue player’s Collectors were destroyed with no way to rebuild them, so it would be starved of resources and it would lose every time.

Again, it’s more of a bug than a feature – Collectors were supposed to be invulnerable, and Lasers shouldn’t even be able to target them anyway, but invulnerability and targeting tweaks hadn’t been implemented yet and were scheduled for a later date. But it goes to show you how computational intelligence can find a path through the design space that a human designer might never even imagine.

A Laser attacking the blue player’s Collectors. The red player’s Laser is firing from offscreen to the upper-left.

Here’s a different example that shows some really good behaviors that I think closely mirror what a smart City Conquest player would do. These are two screenshots from a recent run that I discussed in Update 4 of my Kickstarter campaign. Again, keep in mind that everything you see in these screenshots (excluding the pyramid-shaped Mining Facilities on the crystals) was evolved entirely by the computer.

Here’s the blue player’s base:

There are five really interesting things going on with this base:

  • Single point of entry in the back of the base: The blue player forces all the enemy units to funnel around to a single attack point at the rear of its Capitol. This is the best possible positioning for the attack point, because it forces enemy units to go all the way around before they can attack, and it gives the blue player many more opportunities to defeat them.
  • Dividers to separate enemy forces: The blue player divides and conquers the red player's forces, splitting them between a northern and a southern route around his city. Since one route is longer than the other, enemy units will arrive at different times, and this makes them much easier to deal with as they will reach the Capitol at different times.
  • A well-placed Disruptor (the floating sphere with yellow stripes) near the top entrance allows him to interrupt the red army's cloaking, shielding, and healing effects at exactly the right time, just as they're beginning to get truly pounded by the blue player's defenses.
  • A fully-upgraded Shield Generator unit toward the upper left (the dropship pad with three aqua-colored lights) allows the blue player to protect nearby units from damage.
  • A set of 3 Lightning Towers cleverly dispersed at the front, side, and rear of the blue player's base, ensuring that these expensive buildings rarely waste time targeting the same units simultaneously.

Here’s a screenshot of the red player’s base. Notice how it employs many of the same intelligent strategies as the blue player, which are exactly the behaviors I intended to be good strategies in City Conquest.

  • Single point of entry: Like the blue player, the red player forces the opposing army to attack at the rear of his Capitol. The design of the base forces blue’s units around to the north, above the Mining Facilities, and then back through the rear entrance on the left side of the image.
  • Dividers to separate enemy forces: Like the blue player, the red player splits the opposing army into two different paths (east and west paths at the front of his base) to upset their arrival timing.
  • A perfectly-placed Disruptor (the floating sphere with yellow stripes above the Capitol) allows the red player to interrupt cloaking, shielding, and healing effects all along the front and side of its base (along the north edge, by the five Mining Facilities you see), as well as when enemy units circle around to the rear and finally close in on the Capitol.
  • A Lightning Tower at just the right spot (black spiral just above and to the right of the Capitol) gives excellent coverage, allowing the red player to zap enemy units at the front, side, and rear of its base. This placement ensures that this relatively expensive tower will see as much utilization as possible.
  • A pair of Ground Slammers, one at the front of the red player's base and another at the rear (shaped like tall pillars wearing metallic thumpers), allow the red player to create earthquakes and give area-effect damage to all ground-based units that pass by.
  • A pair of Grenade Tossers (deep holes in the ground on the left and right sides) is also nicely distributed in a very similar way, with an upgraded Grenade Tosser in the front and another non-upgraded Grenade Tosser in the rear. These allow the red player to hit all ground units in a wide area with a moderate amount of area-effect damage -- enough to weaken them considerably and allow the other towers to take them out.

Q: What type of evolutionary approach and individual representation did you use? Simply put, what did you evolve?

In this particular case, we’re just trying to evolve highly competent players to see what strategies good players use. We essentially evolve scripts of build orders.

City Conquest is a tower defense variant, so it only has three player verbs: “build,” “upgrade,” and “sell” (I’m excluding the verbs for “game effects,” which are more like spells that players can cast). Also, selling buildings is usually a bad idea since you can’t get a full refund, so there’s no reason for Evolver to do it.

So an Evolver player script just boils down to a sequence of build and upgrade commands – build a building of type T at (X,Y), or upgrade the building at position (X,Y). There are no timing values associated with the commands; Evolver will ensure that each player executes a command as soon as it can afford to do so.

Also, since every building in City Conquest has a resource cost in terms of gold or crystals, but not both, Evolver effectively processes the build order as two separate lists in parallel, one for each resource.
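To make the representation concrete, here is a minimal sketch of what such a script might look like in C++. The type and field names are assumptions, not the actual City Conquest source:

```cpp
#include <vector>

// One command in an evolved player script: build a building of a given
// type at (x, y), or upgrade the building already at (x, y).
struct Command
{
    enum class Verb { Build, Upgrade };
    Verb verb;
    int  building_type; // ignored for Upgrade commands
    int  x, y;
};

// A player script is an ordered list of commands with no timing values.
// Since every building costs gold or crystals but never both, the script
// can be kept as two independent queues, one per resource.
struct Script
{
    std::vector<Command> gold_commands;
    std::vector<Command> crystal_commands;
    double fitness = 0.0;
};
```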

We start by generating 500 random build scripts for each of the red and blue players. Then we run four tournaments, each one pitting all 500 red player scripts against all 500 blue player scripts, using a random offset so that we’re pitting each blue player script against a different red player script in every tournament.

It’s probably easiest just to show you the C++ pseudo-code:

static int const skNumTournaments = 4;
static int const skPopulationSize = 500;

// Run 4 tournaments which pit all 500 blue scripts and all 500 red scripts
// against each other
for ( int tournament_iterator = 0; tournament_iterator < skNumTournaments;
      ++tournament_iterator )
{
    int random_offset = rand() % skPopulationSize;

    for ( int i = 0; i < skPopulationSize; ++i )
    {
        // This ensures that ALL members of BOTH populations will play a game
        Script & blue = blue_player_scripts[ i ];
        Script & red =
            red_player_scripts[ ( i + random_offset ) % skPopulationSize ];

        // Now play a full game, and adjust the scores for "blue" and "red"
        // as appropriate
        RunSimulation( blue, red );
    }
}
We do 4 tournaments of 500 games each (pitting each red player script against a random blue player script) to make sure we get good coverage and reasonably accurate fitness values at the end of each tournament. Once the tournament is completed, we sort the scripts for each player by their fitness score, and apply standard genetic operators – random replacement of the lowest-scoring scripts (falling off linearly to a fixed minimum value over the first few hundred tournaments); random mutation of each script; and performing “crossover” to combine two scripts (biased toward crossover with higher-scoring scripts rather than lower-scoring ones).

A “mutation” has a random chance to change a building’s type, change a command’s location, swap the ordering of two commands in the script, or copy a command on top of an existing command.
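As a hedged sketch of those four operators (the constants, the flat command list, and the use of rand() are all assumptions), mutation could look like this:

```cpp
#include <cstdlib>
#include <utility>
#include <vector>

static const int kNumBuildingTypes = 9;  // assumed
static const int kMapWidth  = 32;        // assumed
static const int kMapHeight = 32;        // assumed

struct Command { int building_type; int x, y; };

// Apply one of the four mutations described above to a random command.
void MutateScript( std::vector<Command> & commands )
{
    if ( commands.empty() )
        return;
    size_t i = rand() % commands.size();
    size_t j = rand() % commands.size();
    switch ( rand() % 4 )
    {
    case 0: // change a building's type
        commands[ i ].building_type = rand() % kNumBuildingTypes;
        break;
    case 1: // change a command's location
        commands[ i ].x = rand() % kMapWidth;
        commands[ i ].y = rand() % kMapHeight;
        break;
    case 2: // swap the ordering of two commands
        std::swap( commands[ i ], commands[ j ] );
        break;
    case 3: // copy a command on top of an existing command
        commands[ j ] = commands[ i ];
        break;
    }
}
```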

We also added additional safeguards to ensure that scripts would stay flexible. For example, if Evolver tries to execute a build command and there’s already a building there, in most cases it will sell the existing building to allow it to build the new one and continue executing the script.

Of course, this is a relatively simple application, and it’s made easier by the fact that City Conquest lends itself to this as a TD game. Not every game can get away with a non-reactive build script like this. Most games are more complicated, and you need an entire AI system to get a reasonable simulation of a player.

For our next project, we’re planning to evolve entire behavior trees to allow us to generate more complex reactive behaviors rather than just a fixed build order. We’re also planning to move the Evolver into a cloud computing cluster and separate the populations out using an “evolution islands” approach with metadata.

Q: How do you determine the fitness of an individual? Evolutionary algorithms are often very sensitive to the definition of the fitness function.

Yes, they are. It depends on the particular game and what you’re trying to evolve.

In City Conquest, it’s back-and-forth tower defense gameplay, and the Capitol is the only attackable building. The game ends when one player’s Capitol gets to zero health. So the ratio of the health of the two Capitols is actually a VERY good indicator of who's winning. And when the game ends, the health of the winning player’s Capitol tells you a lot about how close a game it was. In an evenly-matched game, both players will do a lot of damage to each other’s Capitols; in an uneven game, the winner will destroy the opposing player’s Capitol without allowing much damage to his own.

Each script has a fitness score associated with it. After each game, Evolver takes the health of the winning player’s Capitol and adds that to the score of the winning player script and subtracts it from the score of the losing player script. This ensures that the greater the victory, the greater the effect on both scripts’ fitness scores.
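The scoring rule can be sketched in a few lines (the struct here is a stand-in for whatever fitness bookkeeping Evolver actually uses):

```cpp
struct ScriptScore { double fitness = 0.0; };

// After each game, add the winner's remaining Capitol health to the winning
// script's fitness and subtract it from the loser's, so a crushing victory
// moves both scores further than a narrow one.
void ScoreGame( ScriptScore & winner, ScriptScore & loser,
                double winner_capitol_health )
{
    winner.fitness += winner_capitol_health;
    loser.fitness  -= winner_capitol_health;
}
```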

When I started development on Evolver, I discussed my plans with a number of the top researchers in AI-based design and automated game balancing – Phillipa Avery of the University of Nevada, Reno; Gillian Smith and Adam Smith of UCSC; Alex Jaffe of UW; and a few others.

A few of them suggested that I set up “archetype” scripts using standardized “known” player strategies to ensure that I’d evolve against a broad set of possible human strategies. I recorded my own gameplay as I executed half a dozen standard strategies, and then for a few weeks I tried using Evolver to test against all six of those archetype scripts.

But I found that as I rebalanced my game every day based on the feedback from Evolver, it would invalidate all of those “archetype” scripts. I couldn’t commit to re-recording all those scripts every week or two as my balancing changed, so I had to abandon that approach.

Later in the development cycle, I also tried adding additional fitness qualifiers to encourage the evolved scripts to do more of what I felt a human player should do. Some of these didn’t work out, but one that did was adding a fudge factor to the fitness function for building and upgrading Skyscrapers. This is critical to success in City Conquest, since Skyscrapers expand your buildable territory, and upgrading them not only consolidates your hold over your territory but also gives you additional resources over time. But it’s the type of thing that isn’t necessarily immediately rewarded in a fitness function, because an evolved script won’t immediately capitalize on the additional territory and additional resources right when the Skyscraper is first added.

The Skyscraper fitness adjustment helps compensate for the natural delay between building and upgrading a Skyscraper and then evolving everything else you need to actually reap the benefits from it. This is purely a way of telling Evolver, “trust me, you want to be doing this, and we don’t want to wait for evolution to prove it every time.”
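One way such a fitness qualifier could work is as a flat per-action bonus folded into the final score; this is a sketch under that assumption, and the weight and function names are purely illustrative:

```python
SKYSCRAPER_BONUS = 50  # illustrative weight; would be tuned by hand

def adjusted_fitness(base_fitness, skyscrapers_built, skyscrapers_upgraded):
    """Add a fudge-factor reward for building and upgrading Skyscrapers,
    so evolution doesn't have to rediscover their delayed payoff
    (extra territory and resources) in every run."""
    return base_fitness + SKYSCRAPER_BONUS * (skyscrapers_built + skyscrapers_upgraded)
```

The bonus effectively shortens the credit-assignment delay: a script is rewarded at the moment it builds, rather than only once the downstream benefits show up in the game outcome.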

Final Thoughts

Q: What advice would you give to other developers thinking about using machine learning algorithms in their games?

I was the founder of my startup, so I already had the buy-in I needed from the upper management to run the experiment. And I undeniably realized major benefits from it.

But I can’t imagine trying to push something like this at any of the studios I’ve ever worked at.

Machine learning will take time to prove itself. It’s not going to happen overnight. The (often justified) skepticism toward machine learning in the industry makes it a tough sell to launch this kind of initiative internally. It’s probably going to have to enter from the outside, disruptively, rather than from inside studios that would offer enormous cultural resistance.

I’m reminded of when I worked on Thief: Deadly Shadows and Deus Ex: Invisible War at Ion Storm Austin. I had to fight tooth and nail to gain acceptance for the use of navigation meshes for pathfinding. I think almost every single programmer and designer in the studio argued against it at one point or another. Some of them treated me like I was crazy; others treated me like I had to be some kind of pointy-haired academic who ought to be off in a research lab somewhere if he cared so damned much about obscure academic issues like good pathfinding.

And now, they’re very common. I can make a list as long as my arm of successful AAA products that used navigation meshes.

But at the time, it was open warfare, and it was very difficult to make the case for something that seemed to me blindingly obvious.

It will be the same for machine learning and AI-based game design for some time to come.

So my advice is to make sure you have buy-in. Don’t put yourself through the pain if you can’t get full support and commitment at every level. Otherwise, you’re just painting a target on your back.

And if you can’t get it, and you see the possibilities, consider quitting your job. Launch a startup, or join mine!

If you liked this interview, don't forget to take a look at, and back, the City Conquest Kickstarter project!

Discussion (11 Comments)

togelius on February 7th, 2012

Nice article! I'm happy to see evolutionary game design/optimisation making its way into published games, and I'm looking forward to playing the results of this! We recently wrote a survey on evolutionary and similar ("search-based") approaches to aiding designers and creating content in games. The article discusses a number of such projects, which might be of interest to those wanting to use evolution to help design or balance their game (during design time or during runtime). Also, I'm a bit surprised about the statement about neural networks being useless. They are general function approximators, and as such they do their job and can indeed be very useful (as evidenced by their ubiquity in industrial applications in non-game domains) - you just need to know what to use them for! For example, they can be used to model player preferences or player types, models which can then be used as part of your evolutionary tuning.

PaulTozour on February 7th, 2012

Thanks for the post, Julian! I really enjoyed reading your thesis a few weeks ago; I found it tremendously informative! Really looking forward to reading this one too. Yes, I was exaggerating somewhat with regards to NNs being "useless." I suppose that's my own bias showing through :) It's more that they are black-box systems, and this makes them somewhat more difficult for designers to work with than things like decision tree learning systems, which typically produce rules a designer can explicitly understand. Also, in my (admittedly limited) experience, other techniques, like SVMs, often give better results than neural nets for many applications. My Data Mining professor at Penn was fond of saying that neural nets are "the second-best way of doing anything." So my default stance is to oppose neural networks unless someone can prove that they really are the best tool for the job -- and I'd love to be proven wrong, but I haven't seen any specific task in games where they really are.

togelius on February 9th, 2012

Regarding neural nets: I think there are in general lots of good ways to solve any machine learning / modelling problem. Searching for the "best" way of approximating a function is a bit futile, as this depends so much on the characteristics of the function and how much time and effort goes into tuning the algorithm and representation. SVMs are fine and sometimes outperform neural nets (often not), but they are at least as hard to tune as neural nets are, and at least as much of a black box (usually more). But in many cases, you just need something that does the job, as I see it...
Happy to see someone reading my thesis - I thought people just read papers nowadays!

anomalousunderdog on February 12th, 2012

It's like having The Flash work as your QA department. The bad news is his IQ is stunted to half. The good news is, it's not going to matter.

PaulTozour on February 14th, 2012

> It's like having The Flash work as your QA department.

Sort of, yes, although this is more for game balancing and parameter tuning than for QA (although it's definitely helped me find a couple of pretty obscure bugs in the process, too).

jcothran on February 15th, 2012

Thanks for the interesting article! How did you determine your initial population size of 500? Was it related to the length of build commands (the sequence length) allowed per individual? What rates of crossover and mutation did you use?

PaulTozour on March 14th, 2012

Re: population size: I generally find that I get the best results with genetic optimization when the population size is ~20-30% of the number of evolutionary iterations I plan to do. So, it came down to how long I was willing to let it run per step (1-2 days), and it worked out that I could run a population of 500 for 2000 full tournaments in that time frame.
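The rule of thumb above can be sketched as a quick helper; note this encodes one practitioner's observed heuristic, not a general law, and the function name is illustrative:

```python
def population_for_iterations(iterations, fraction=0.25):
    """Suggest a GA population size as a fraction of the planned number
    of evolutionary iterations, per the ~20-30% heuristic described
    above (default 25%)."""
    return int(iterations * fraction)

# For 2000 full tournaments at 25%, this suggests a population of 500,
# matching the figures quoted in the reply.
```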

bvanevery on March 26th, 2012

I like the idea of automating more aspects of game design. I've manually balanced the forces, resources, placement, and scripted appearance of units for a large campaign in a linear turn-based strategy RPG style game, with the goal of producing certain narrative effects as part of the gameplay. It's a lot of work to iterate, and algorithmic tools could definitely help to some degree. I plan to do such in the future.

I do not understand your projections to 2112. Why is the evolution of a game designer going to be towards that of a civil engineer? Why not towards a novelist, where the costs of production are nearly non-existent, and it's really all about crafting an attention-getting imaginative experience in the mind of the audience? You seem to have present-day concerns about expense. Perhaps that will last for a few more decades, but nobody in STTNG pays to realize their visions on a holodeck.

At some point we are going to have pretty powerful tools for game design decision making, and they aren't always going to cost a lot of money. Many current problems of expense in game design can be obviated by simplifying what is desired, rather than throwing fancy AI at it.

The perceived need to balance a bunch of buildings in "combined arms" fashion is an example of what I call "vectorization" in game design. One scalar quantity might have worked just fine as far as human psychological experience is concerned, but game developers turn it into 4, 12, or 20 quantities so that more art assets can be attached to each bit of chrome. More integer counters are added for stats, so that it takes the player longer for the numbers to slowly tick upwards. Thus the commercial sale of the game will last longer, especially if subscription-based, but it's a placebo for "content" in any event. It's a cheap way to produce. Games with vectorized resource management aren't better or more realistic; they just make the N-dimensional accounting more tedious.

You could de-vectorize a lot of games by just having "gold" as the resource and "strength" as the unit capability. It pretty much works for the classic board game Diplomacy. Higher-level human thinking will always be required, not just the mechanical management of spatio-temporal relationships. This is part of how game design will evolve in the future.

PaulTozour on April 30th, 2012

> [bvanevery] "I do not understand your projections to 2112. Why is the evolution of a game designer going to be towards that of a civil engineer?"

That was a thought experiment. The point was to underscore the vast chasm between a well-understood discipline -- as civil engineering is today -- and the undisciplined, fragmented, immature, and quite often random and highly bias-prone decision making that goes on in so much of modern "game design."

PaulTozour on April 30th, 2012

> [bvanevery] "Why not towards a novelist, where the costs of production are nearly non-existent, and it's really all about crafting an attention getting imaginative experience in the mind of the audience?"

I would love to think we can get to a point where the costs of production are non-existent. The reality is, right now, it's not heading in that direction; costs are skyrocketing across the board. The approach I discuss is one way to reduce costs in many design situations. Done properly, it should empower designers and lower costs, and lead us that much closer to what we all hope the future of game design can become. That's in line with what you're saying, isn't it?

PaulTozour on April 30th, 2012

> [bvanevery] "You seem to have present day concerns about expense. Perhaps that will last for a few more decades, but nobody in STTNG pays to realize their visions on a holodeck."

Well, yes; that's nice, but I don't really take Star Trek as a documentary. If we're really ever going to get to a point where we can have a holodeck-like level of generative creativity, we need to make pretty dramatic strides forward in empowering game designers, don't we? It's a necessary step in the development pipeline to get to a holodeck in the first place. The approach I discussed here is one way (of many) to do that.
