Runtime Performance with D and its GC

15.09.15    gamedev    D programming language

Again and again I read opinions on the D programming language forums that D cannot be really fast. Some people say that the slow, conservative GC of the D runtime makes long-running real time applications impossible to write in D. This does not fit my experience at all.

I have written a game completely in D. From the (limited) feedback I received, I can tell that nobody who tried it had any technical problems. No hiccups, no freezes. I did not use a custom runtime, and I have happily used the GC when needed.

I do understand that my use case is far from normal. But it is a real time application written in D. Real time applications tend to be diverse and have strange requirements. There is not really such a thing as a normal real time application. So the claim that D cannot be performant and is therefore rendered absolutely useless when given real time requirements is plain wrong.

One has to keep in mind that allocating memory is generally slow. It is not something that should be done in tight loops anyway. There is a certain amount of bookkeeping, by the OS and by the allocator, which cannot be eliminated by improvements to the GC. The fastest program is the one which does not allocate at all.

Now, given this knowledge, one finds that allocation through the GC is not that slow in comparison. There is some additional bookkeeping compared to malloc. But keep in mind that at least part of this bookkeeping would have to be done anyway in a well-designed program using malloc.

And what about the feared GC collection cycles? Let me tell you, there is no such thing as a dedicated GC thread. A collection cycle can only start when memory is allocated through the GC. And even this possibility can be temporarily switched off using GC.disable().
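
As a minimal sketch of that last point, assuming a latency-critical stretch of code such as a single game frame: the function name criticalSection is made up for illustration, while GC.disable, GC.enable and GC.collect come from core.memory. Note that the runtime documentation reserves the right to collect anyway if memory runs out.

```d
import core.memory : GC;

// Hypothetical stand-in for a latency-critical section (e.g. one game frame).
int[] criticalSection(size_t n)
{
    GC.disable();              // no automatic collections from here on
                               // (the runtime may still collect if out of memory)
    scope (exit) GC.enable();  // re-enable on every exit path, including exceptions

    int[] values;
    foreach (i; 0 .. n)
        values ~= cast(int) i; // still allocates on the GC heap, but starts no collection
    return values;
}

void main()
{
    auto v = criticalSection(1000);
    assert(v.length == 1000);
    GC.collect();              // collect explicitly at a moment that suits the program
}
```

Calls to GC.disable nest, so every call needs a matching GC.enable; the scope (exit) line takes care of that here.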

The only real downside shows up with multiple concurrent threads: when one of them triggers a GC collection, all threads have to be suspended until the collection is finished. This is a real opportunity for a major improvement: a concurrent GC that does not need to suspend other threads. But as it turns out, this is generally a non-trivial problem.

From this we can infer that we just have to be somewhat mindful when using the GC, especially in tight loops. Being mindful about memory allocation is something one should be doing anyway. The attitude that memory is free is a bad habit spread by languages like Java.
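
One way to be mindful in a tight loop is to allocate a buffer once, outside the loop, and reuse it. The following is only a sketch under that assumption; evenSquares and scratch are hypothetical names, not code from my game.

```d
// Writes the squares of all even inputs into the caller-owned buffer `scratch`
// and returns the used part of it. @nogc makes the compiler reject any GC
// allocation inside this function.
int[] evenSquares(const(int)[] input, int[] scratch) @nogc nothrow
{
    assert(scratch.length >= input.length); // the caller provides enough room
    size_t count = 0;
    foreach (x; input)
        if (x % 2 == 0)
            scratch[count++] = x * x;
    return scratch[0 .. count];             // a slice into scratch, nothing new allocated
}

void main()
{
    int[6] input = [1, 2, 3, 4, 5, 6];
    static immutable int[3] expected = [4, 16, 36];

    auto scratch = new int[](input.length); // the single GC allocation, outside the loop

    foreach (frame; 0 .. 1000)
    {
        auto result = evenSquares(input[], scratch);
        assert(result == expected[]);
    }
}
```

The GC is still used here, just once instead of a thousand times.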

To behave reasonably one first has to understand when allocation is happening. These are the three most common ways an allocation on the GC heap can be triggered (appending to or concatenating arrays and using associative arrays allocate as well):

  1. Obviously, by using the new keyword, most often used to instantiate classes.

    auto object = new Object;
    
  2. By using an array literal (when not immediately assigned to a static array variable/parameter)

    int[]  needsGC    = [1, 2, 3, 4]; // on heap
    auto   needsGCToo = [1, 2, 3, 4]; // arrays default to slices
    int[4] onStack    = [1, 2, 3, 4]; // forced onto the stack
    
  3. Closures. These are only ever created when creating a delegate which references the current scope.

    import std.math, std.stdio;

    int scopeVar = 2;
    auto writeIt = (){ writeln(scopeVar); }; // closure over scopeVar
    auto addSome = (int p) => scopeVar + p;  // ditto
    auto multTwo = (int p) => p * 2; // no closure: uses nothing from the scope, so it is inferred as a plain function
    auto func    = &std.math.cos;    // function pointer: no delegate, no closure
    auto deleg   = &(new Object).toString;  // delegate, no closure
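
If you are unsure whether a piece of code allocates, the @nogc function attribute lets the compiler check it for you. A small sketch, with sumSquares and wouldNotCompile as made-up names; the commented-out lines correspond to the three cases above, and I would expect the compiler to reject the function as soon as any of them is uncommented.

```d
// Compiles under @nogc: only stack storage and a non-capturing function literal.
int sumSquares() @nogc
{
    int[4] onStack = [1, 2, 3, 4];   // literal initializes a static array, no heap
    auto square = (int p) => p * p;  // references nothing from the scope, no closure
    int total = 0;
    foreach (v; onStack)
        total += square(v);
    return total;
}

// Uncommenting any of the lines below should turn into a compile error,
// because each one needs the GC heap.
void wouldNotCompile() @nogc
{
    // auto obj = new Object;            // 1. new
    // int[] arr = [1, 2, 3, 4];         // 2. array literal stored in a slice
    // int x = 2; auto dg = () => x;     // 3. closure over x
}

void main()
{
    assert(sumSquares() == 30);
    wouldNotCompile();
}
```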
    

This all sounds as if one has to work around GC, as if the GC is a burden. But it is not. The GC is a solution to some memory allocation tasks and it is available to you when you need it in D. But it is not the solution to all problems.

When programming BBB I have used GC-allocated polymorphic objects to model game and UI objects. I have used closures for UI callbacks where appropriate. But I have also made heavy use of ranges and arrays of structs when needed.

D gives you a wide range of tools to avoid allocation. There has even been a talk by Walter Bright about this. Use them: value-type structs, array slices (seriously, these alone are absolutely great), and ranges (essentially slices generalized through metaprogramming)!
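
To make this a bit more concrete, here is a small sketch that combines all three tools. Enemy and totalHp are hypothetical names chosen for illustration, not code from my game; filter, map and sum are from std.algorithm.iteration.

```d
import std.algorithm.iteration : filter, map, sum;

// Value type: stored inline in arrays, no per-object heap allocation.
struct Enemy
{
    int hp;
    bool alive;
}

// Sums the hit points of all living enemies. The range pipeline is lazy,
// so no temporary array is built; @nogc lets the compiler confirm that.
int totalHp(const(Enemy)[] enemies) @nogc
{
    return enemies
        .filter!(e => e.alive)
        .map!(e => e.hp)
        .sum;
}

void main()
{
    // Four structs in a static array on the stack.
    Enemy[4] storage = [Enemy(10, true), Enemy(20, false),
                        Enemy(30, true), Enemy(40, true)];

    const(Enemy)[] all = storage[];      // slice: a view on the array, no copy
    assert(totalHp(all) == 80);          // 10 + 30 + 40
    assert(totalHp(all[0 .. 2]) == 10);  // sub-slice, still no allocation
}
```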