Under the Hood


This page delves into the implementation of BFPX. We will look at continuations, what they are and why we need them. From there we'll have a look at how they may be implemented using the byte code engineering library (BCEL).

This page also examines the scheduling policy of the BFPX runtime. We will look at channels and channel communication and, finally, we will look at the Java compiler compiler (JavaCC) that BFPX uses to parse your source files.

Java Threads and Continuations

We are all familiar with Java threads. They can be used to schedule work for timed execution or to make more efficient use of your processor by working on background processes while you wait for I/O. Other applications is to have multiple processors work on the same problem in parallel, to speed up processing.

Unfortunately, Java threads are expensive in terms of memory footprint, task switching and synchronisation overhead. Such costs are insignificant in a J2EE environment, where programs spend most of their time waiting for an event to occur. In a tight environment, where threads are relatively short lived, such overhead becomes a lot more noticeable.

One of the reasons why Java threads are so expensive is that they have to carry around a call stack. A typical Java thread will be executing a method that is nested ten or more deep in other methods. When the thread is put to sleep, we have to remember all the gritty little details of where we are in the program. On top of that, each stack must be big enough to hold all the local variables that a typical method declares.

All this state that we need to be able to resume a thread is called the continuation state, or continuation for short.

If we look at the brainfuck programming language, we see that it does not have a concept of methods or functions to call. There is no way to declare local variables, since we have no concept of variables either. This means that one of the most memory intensive features of Java threads, the preservation of local state across tasks switched, is never used for brainfuck programs.

Another feature of Java threads is that they are pre-emptive, meaning that they may be interrupted at any given moment in time, regardless of what they are doing. Since Java threads may be interrupted at any point during their work, we must be quite careful how we do that so that we don’t leave the sleeping thread in a useless state.

Conversely, brainfuck programs have no real need for pre-emptive multitasking. BFPX programs are either executing statements of their program, or sleeping in wait of a peer to exchange data with. In fact, all we need to preserve is a bookmark that tells us where in the code we have to jump back to to continue our work.

And finally, the Java virtual machine is capable of distributing Java threads across multiple (more or less) physical processors. Because the BFPX runtime only uses a single Java thread to execute the entire program, BFPX programs cannot make use of multiple physical processors.

Continuations in Code

All we need, in a nutshell, is a function that can return halfway, and leave some sort of marker that tells the process where the function has to be resumed. We do this by placing numbered labels in the code, and returning the number of the next label where to start executing upon re-entering the function.

If I had chosen to generate C++ from my compiler, then I could have implemented these simple continuations using a switch statement. We can simply interleave the other control structures, as shown below. I have drawn a vertical marker next to the code so that you may compare the scope of the if statement with that of the switch.

In short, this code runs from label 0 and runs into the read() function. If there is no peer sleeping on the channel to read from, we should go to sleep, waiting for a peer to arrive. The read() function returns false, to indicate that no peer was waiting. We then return the value 1 to our caller, to indicate that we’d like to sleep and when we are woken up, we’d like to resume at case label 1.

The Java programming language does not permit us to interleave switch statements with other control structures. Thank goodness for that. This means we need another method to implement the continuations. We cannot simply generate Java from the BFPX sources and use the Java compiler to generate executable code.

As you probably know, the Java virtual machine is a lot like an actual, hardware based processor. It does not know the higher level language control structures that we find in the Java programming language. Instead, all complex structures are flattened into goto statements by the Java compiler. The solution therefore is simple. We can flatten the control structures ourselves and map them onto the JVM's all-goto instruction set.

The BFPX compiler generates its own class files using the Apache's byte code engineering library (BCEL). BCEL gives us access to the actual JVM machine instructions and allows us to generate the interleaved control structures relatively safely.

The screenshot below shows what a task switch looks like in the generated client code.

We generate as little code as possible with BCEL. As you can see from the screenshot, the BCEL code and the generated code are hard to understand and difficult to maintain. To supply the generated classes with basic functionality, each generated class inherits from a lightweight process base class. This base class implements functions such as reading and writing channels and initialisation of child processes.


The BFPX scheduler is a simple randomised, co-operative scheduler. All runnable processes are placed in a list and each time a process yields the processor, the scheduler picks a random process from the list and schedules it.

The co-operative multithreading has an interesting side effect, which is demonstrated by the sample code below. Based on what we have seen in the case study, you may expect that the output of these two parallel processes will mix up the two “Hello World!”s. Instead, the output is reliable and correct at first glance: two separate, readable lines. Consider the code shown below.


In that sample, we create two processes, which each go about writing a message to the screen. Each process does so independently. There is no synchronisation between the two processes. Reasonably, we expect that the output be mashed up, but when we run the sample a few times, we see the following.

The reason for this unexpected consistency is that the second process is never scheduled until the first one is complete because the first process does not yield the processor.

This order should not be relied upon in BFPX programs. It is likely to change in new versions of BFPX, as the scheduler and channel communication mechanisms mature.


The channels are surprisingly simple, for such a central concept of BFPX. Their implementation is really only a queue of waiting processes, split into readers and writers. Processes that want to read or write from a channel check these queues, looking for a suitable peer.

photo: Col Adamsonhttp://www.sxc.hu/profile/Col6085
if you find this interesting, terrible or just would like to know more, e-mail memailto:kjkoster@kjkoster.org?subject=
web statistics