A Round-Robin Scheduler for Cooperative Multi-Tasking Using Greenlets in Python

Alex J. Champandard on February 27, 2008

Game logic is highly concurrent; not only does each entity in the world have its own main script, but there are typically many event handlers and latent behaviors that can run at the same time as well. Luckily, there are many software engineering techniques to deal with this, in particular custom task schedulers.

This article has multiple purposes. First, it introduces the idea of a round-robin scheduler as a solution for this kind of multi-tasking, explaining what you need for it to work in your own game using executable pseudo-code. Second, it gives you a little background on this kind of concurrency in the Python programming language, notably the Stackless and greenlet modules, showing how you can implement one using the other.

If you're looking for a more official emulation layer that provides stackless features on top of greenlet, take a look at PyPy's implementation.

Cooperative Scheduling

There are many ways to implement multi-tasking. One of them is to use cooperative scheduling, as used by the infamous Windows 3.1 operating system. The idea is to let each script or task yield control of its own accord when it's ready to do so, only then proceeding to the next task.

In practice, this works out very well in games, for the following reasons:

  1. You control the implementation of each of the tasks to prevent any of them from hogging resources,

  2. You can easily tune the granularity at which the processing happens for each task, and

  3. You have the option of building custom schedulers easily to make sure you maximize hardware usage.

That third point is what this article focuses on, showing you how you can build a simple scheduler to manage these tasks. But first, a little background…

A History of Multi-Tasking in Python

Since Python version 2.2, it's been possible to implement a simple form of task concurrency using generators. Generators are essentially an implementation of the iterator pattern; each iteration the code performs some computation and yields a value. While this is supported out of the box, the downside of the approach is that a generator can only yield from its own body (nested function calls can't yield on its behalf), and you can't jump from one generator to another: you must always return a value to the caller.

from __future__ import generators
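For contrast, here's a minimal sketch of what generator-based multi-tasking looks like (the function and variable names are made up for illustration). Notice that control always returns to the driving loop, which is exactly the restriction described above:

```python
from collections import deque

def think(name, frames):
    """One task as a generator; each yield is one frame of work."""
    for i in range(frames):
        yield (name, i)

# Interleave two tasks round-robin style by driving them by hand.
queue = deque([think('A', 2), think('B', 2)])
trace = []
while queue:
    task = queue.popleft()
    try:
        trace.append(next(task))
        queue.append(task)   # still alive, so requeue at the back
    except StopIteration:
        pass                 # this task finished; drop it

print(trace)   # [('A', 0), ('B', 0), ('A', 1), ('B', 1)]
```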

Stackless Python was the first major project to provide general tasklets that allowed you to jump arbitrarily from one script to another without any restrictions. Under the hood, Stackless is implemented as a fork of the main Python codebase, with non-trivial changes to the interpreter to make sure no data is stored on the C-stack (whence the name Stackless). The downsides of this implementation are that it requires a custom executable, and that it only exposes a high-level scheduler, on top of which you must re-implement low-level coroutines yourself (these are more convenient when you need fine control over the sequencing of your scripts).

# This module is typically built in to the interpreter.
import stackless

The newcomer on the scene is the greenlet module, which rather amazingly does the same thing that Stackless does, but as an optional module separate from the Python interpreter! This presumably uses some non-standard manipulation of the C-stack to save and restore it as necessary — like the protothread library. This explains why the library is called py.magic! The only disadvantage of this solution can be a blessing too: you must build custom schedulers yourself.

# This library is external to the main interpreter!
from py.magic import greenlet

The ideal solution would be to use a scheduler interface primarily, breaking down into coroutines when more control is necessary. This article will show you how to rebuild the stackless scheduler on top of the greenlet module, thereby no longer requiring you to use the custom executables for running Python scripts.

Round Robin Algorithm

One of the most widely used scheduling algorithms is round-robin, which balances execution opportunities as fairly as possible while still being simple to implement. Basically, all the tasks to be scheduled are stored in a linked list. When the current task decides to yield control, the next task in the list is picked.

This is basically how Stackless Python works. Sample usage looks somewhat like this:

# Main entry point for our coroutines.
def think(name):
  """Do some processing and yield control each frame."""
  for i in range(10):
    print name
    stackless.schedule()

# Setup two coroutines bound to the function above.
t1 = stackless.tasklet(think)
t2 = stackless.tasklet(think)

# Call the coroutines, adding them to the queue.
t1('A')
t2('B')

# Keep running the coroutines until there are none left.
stackless.run()

This will output alternating lines with A and B, ten of each. See this wiki page about the Stackless scheduler for more information.
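If you don't have Stackless handy, the queue rotation at the heart of round-robin scheduling can be sketched with a plain deque; the task names below are hypothetical placeholders for real task objects:

```python
from collections import deque

# Hypothetical task names standing in for real task objects.
tasks = deque(['update_ai', 'update_physics', 'update_audio'])

def pick_next():
    """Pop the task at the front and requeue it at the back."""
    task = tasks.popleft()
    tasks.append(task)
    return task

# Five turns cycle fairly through the three tasks.
order = [pick_next() for _ in range(5)]
print(order)
```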

Coroutines Explained

Typically in a multi-tasking environment, the underlying framework needs to support coroutines (a.k.a. micro-threads). Coroutines are like simple programs with a stack and an instruction pointer (routines), but lightweight enough for them to run interleaved on the same hardware thread in a time-sharing fashion.

For instance, the greenlet module needs only the following functions to allow the code to switch arbitrarily between these different coroutines:

# Store the main greenlet.
root = greenlet.getcurrent()

def think1():
    """Simple parameterless helper."""
    print 'A', 1
    t2.switch()
    print 'A', 2

def think2():
    """Function called directly by main."""
    print 'B', 1
    t1.switch()
    print 'B', 2
    t1.switch()

# Setup the low-level coroutines.
t1 = greenlet.greenlet(think1)
t2 = greenlet.greenlet(think2)

# Let the last coroutine call the first and return.
t2.switch()

This will print B1, A1, B2, A2. When greenlets expire, they jump back to their parent, i.e. the greenlet that spawned them.

Channels for Data and Control Flow

One major advantage of Stackless is that it supports channels as a way to communicate between different coroutines. This isn’t only useful for passing data around, but it also helps manage the control flow too. Here’s a quick example:

ch = stackless.channel()

def sender():
  ch.send('Hello!')

def receiver():
  print ch.receive()

stackless.tasklet(sender)()
stackless.tasklet(receiver)()
stackless.run()

That covers extremely basic usage of the API, but here’s how they work under the hood.

  • Channels don’t contain permanent data; they just help the sender and the receiver connect.

  • Tasklets block when sending or receiving if there's no coroutine waiting at the other end of the channel. In that case, the scheduler picks the next tasklet to execute.

  • When sending to or receiving from a channel that already has a tasklet waiting at the other end, that tasklet is unblocked and resumed immediately.

It’s possible to build many different types of control structures on top of this low-level functionality, including a round-robin interpreter…
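To make that rendezvous behavior concrete without greenlets, here is a sketch of the same idea built on plain generators: tasklets yield ('send', ...) or ('recv', ...) requests to a trampoline-style scheduler. Everything here (Channel, Scheduler, the request tuples) is invented for illustration and is not part of the Stackless API:

```python
from collections import deque

class Channel(object):
    """Rendezvous point pairing one sender with one receiver."""
    def __init__(self):
        self.senders = deque()    # blocked (task, data) pairs
        self.receivers = deque()  # tasks blocked waiting for data

class Scheduler(object):
    def __init__(self):
        self.queue = deque()
        self.results = {}  # task -> value to send into it on resume

    def add(self, task):
        self.queue.append(task)

    def run(self):
        while self.queue:
            task = self.queue.popleft()
            try:
                request = task.send(self.results.pop(task, None))
            except StopIteration:
                continue  # this tasklet finished; drop it
            if request[0] == 'send':
                _, ch, data = request
                if ch.receivers:
                    receiver = ch.receivers.popleft()
                    self.results[receiver] = data
                    self.queue.append(receiver)  # hand over the data
                    self.queue.append(task)      # sender keeps running
                else:
                    ch.senders.append((task, data))  # block the sender
            else:  # 'recv'
                _, ch = request
                if ch.senders:
                    sender, data = ch.senders.popleft()
                    self.results[task] = data
                    self.queue.append(task)
                    self.queue.append(sender)   # unblock the sender too
                else:
                    ch.receivers.append(task)   # block the receiver

# Usage: two tasklets rendezvous three times over one channel.
log = []

def producer(ch):
    for i in range(3):
        yield ('send', ch, i)

def consumer(ch):
    for _ in range(3):
        value = yield ('recv', ch)
        log.append(value)

sched = Scheduler()
ch = Channel()
sched.add(producer(ch))
sched.add(consumer(ch))
sched.run()
print(log)  # [0, 1, 2]
```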

A Simple Implementation

To implement the scheduler, only one main data structure is required. This could be a list, vector, or deque, depending on performance in your language of choice.

from collections import deque

class tasklet(object):
  """Simple wrapper tracking a greenlet's status."""
  def __init__(self, g):
    self.greenlet = g
    self.blocked = False
  alive = property(lambda self: not self.greenlet.dead)

# This deque works efficiently for front and back inserts.
_scheduled = deque()
# Setup the main tasklet from the current greenlet.
_current = tasklet(greenlet.getcurrent())

The tasklet is built as a simple wrapper around the greenlet object itself. The scheduler, however, is more interesting.

def switch(next):
  global _current
  _current = next
  next.greenlet.switch()

def schedule():
  """This essentially means yield() or suspend()."""
  global _scheduled
  global _current

  # Don't queue tasks stuck in channels.
  if not _current.blocked:
    _scheduled.append(_current)

  # Keep getting items from the front of the queue.
  while True:
    next = _scheduled.popleft()
    # Discard dead tasklets immediately.
    if not next.alive:
      continue
    # This tasklet is now ready to run.
    switch(next)
    break

Given the example for the channels above, here’s how you could implement them:

class channel(object):
  def __init__(self):
    self.receivers = []
    self.senders = []
    self.data = None
    self.balance = 0

  def send(self, data):
    # When no receiver is waiting, just block and yield control.
    if len(self.receivers) == 0:
        self.senders.append(_current)
        self.balance += 1
        _current.blocked = True
        schedule()

    # Activate the receiver and prepare the data!
    g = self.receivers.pop(0)
    g.blocked = False
    self.data = data
    _scheduled.appendleft(g)

  def receive(self):
    global _current

    # Tasklets are waiting with data; activate them...
    if len(self.senders) > 0:
        next = self.senders.pop(0)
        next.blocked = False
        _scheduled.appendleft(next)

    # Otherwise, if there's no data yet, just pause.
    if self.data is None:
        self.receivers.append(_current)
        self.balance -= 1
        _current.blocked = True
        schedule()

    # Reset data item for next time. Return its value.
    data, self.data = self.data, None
    return data

To wrap things up, here are the last few functions needed for implementing the stackless API:

def getruncount():
    """Number of other tasklets queued up for execution."""
    global _scheduled
    return len(_scheduled)

def run():
  """Continue until this is the only active tasklet."""
  while getruncount() > 0:
    schedule()

I hope this article was both informative and a welcome break from the usual C++ behavior tree code that normally features on Wednesdays. If you have any comments, feel free to post them below!

Discussion (2 comments)

NinjaItachi on February 29th, 2008

I'm impressed Python can now do that! The lack of coroutines, and a good way to make a scheduler, was one of the reasons we had for switching to Lua. Anyway, I think I'd still stick to Lua if you're scripting a C/C++ engine; it's just simpler to integrate. But if you're going all-out Python, this is definitely useful.

aash29 on April 25th, 2008

Typo? [CODE]from collection import deque[/CODE] package that contains deque is called collectionS
