For our example, templates compile faster than generated code. With a few tricks, they are also faster for incremental builds.
Templates are a polarizing feature of C++. On the one hand, people love the ability to squeeze out extra performance and add syntactic sugar to their projects. On the other, some argue that they are slow to compile, bloat binaries and give incomprehensible error messages.
This article looks at one common use case of templates and determines whether compile times improve when code-generation is used instead.
We’re going to look at a simple vector class with a size determined at compile-time. The usage of this class might look like this:
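A minimal sketch of such usage, assuming the class is called Vector and exposes operator[] and operator+ (all of these names are illustrative, and the stand-in definition is only there so the snippet compiles on its own):

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in definition so the snippet is self-contained; the name Vector
// and its members are assumptions made for illustration.
template <std::size_t N>
struct Vector {
    std::array<float, N> data;

    float& operator[](std::size_t i) {
        assert(i < N);  // bounds check in debug builds
        return data[i];
    }
    const float& operator[](std::size_t i) const {
        assert(i < N);
        return data[i];
    }
};

// Component-wise sum; both operands must have the same compile-time length.
template <std::size_t N>
Vector<N> operator+(const Vector<N>& a, const Vector<N>& b) {
    Vector<N> out{};
    for (std::size_t i = 0; i < N; ++i) out[i] = a[i] + b[i];
    return out;
}

// Hypothetical usage: the length is part of the type.
inline float usage_example() {
    Vector<3> a{{1.0f, 2.0f, 3.0f}};
    Vector<3> b{{4.0f, 5.0f, 6.0f}};
    Vector<3> c = a + b;      // OK: both operands are Vector<3>
    // Vector<4> d = a + b;   // would not compile: the lengths differ
    return c[0] + c[1] + c[2];
}
```

Because the length is a template argument, mixing vectors of different lengths is rejected at compile time rather than at run time.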
The implementation is quite simple: we have a regular C++ struct with one non-type template parameter for the length of the vector:
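One way such a struct might be written is sketched below. The member names (data, size, dot) are assumptions, and std::array is used for storage so that a length of zero stays valid:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Sketch of a fixed-length vector; N is a non-type template parameter.
// Member names are illustrative assumptions, not the original listing.
template <std::size_t N>
struct Vector {
    std::array<float, N> data;  // std::array also permits N == 0

    static constexpr std::size_t size() { return N; }

    float& operator[](std::size_t i) {
        assert(i < N);  // bounds check in debug builds
        return data[i];
    }

    Vector& operator+=(const Vector& other) {
        for (std::size_t i = 0; i < N; ++i) data[i] += other.data[i];
        return *this;
    }

    float dot(const Vector& other) const {
        float sum = 0.0f;
        for (std::size_t i = 0; i < N; ++i) sum += data[i] * other.data[i];
        return sum;
    }
};

static_assert(Vector<3>::size() == 3, "length is known at compile time");
```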
However, we could also implement this class using the preprocessor!
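A sketch of what a preprocessor version could look like, assuming a hypothetical DEFINE_VECTOR macro that pastes the length into the struct name. Each expansion produces a separate, unrelated type:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical macro: DEFINE_VECTOR(3) defines a struct named Vector3.
// The macro name and the VectorN naming scheme are assumptions.
#define DEFINE_VECTOR(N)                                  \
    struct Vector##N {                                    \
        std::array<float, N> data;                        \
                                                          \
        float& operator[](std::size_t i) {                \
            assert(i < N);                                \
            return data[i];                               \
        }                                                 \
                                                          \
        Vector##N& operator+=(const Vector##N& other) {   \
            for (std::size_t i = 0; i < N; ++i) {         \
                data[i] += other.data[i];                 \
            }                                             \
            return *this;                                 \
        }                                                 \
    };

DEFINE_VECTOR(2)  // defines struct Vector2
DEFINE_VECTOR(3)  // defines struct Vector3
```

Unlike the template, every size has to be requested explicitly with its own DEFINE_VECTOR line before it can be used.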
These two implementations give us equivalent functionality. However, for a given size N, the first generates a vector class using templates and the second using macros.
The question is: which compiles faster?
To compare the two approaches, we wrote a small code snippet that uses vector classes from length 0 to 256. We also tested another approach, where we took the macro implementation but ran the preprocessor before the test. This is equivalent to writing every class by hand!
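The test snippet itself is not shown here. As a rough illustration only, a template-based snippet that touches every length from 0 to 256 could be written with std::index_sequence (C++14). The helper names use_vector, use_all and instantiate_all are hypothetical, and this is not necessarily how the measured code was structured:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Minimal stand-in for the vector class (an assumption for this sketch).
template <std::size_t N>
struct Vector {
    std::array<float, N> data;

    float sum() const {
        float total = 0.0f;
        for (float x : data) total += x;
        return total;
    }
};

// Value-initialise a Vector<N> and call a member so that it is instantiated.
template <std::size_t N>
float use_vector() {
    Vector<N> v{};
    assert(v.sum() == 0.0f);  // value-initialised elements are zero
    return v.sum() + static_cast<float>(N);
}

// Expands to use_vector<0>(), use_vector<1>(), ... and adds up the results.
template <std::size_t... Is>
float use_all(std::index_sequence<Is...>) {
    const float results[] = {use_vector<Is>()...};
    float total = 0.0f;
    for (float r : results) total += r;
    return total;
}

inline float instantiate_all() {
    // 257 indices cover the lengths 0 through 256.
    return use_all(std::make_index_sequence<257>{});
}
```

The macro variant has no equivalent of the pack expansion, so it would need one DEFINE_VECTOR line and one use per length.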
We compiled each version 100 times and measured how long it took. Here are the results:
The results show that for this example, compiling templates is faster than the equivalent macro version! On top of that, templates are more maintainable, since the code is not duplicated, and the compiler can give better error messages.
If you consider how templates work, this makes a great deal of sense. The compiler parses a template once, and each instantiation only substitutes concrete values or types for its template parameters. With code generation, the compiler must parse the C++ and build the AST from scratch for every macro expansion.
But What About Incremental Builds?
This test leaves out incremental builds. One claimed advantage of code-generation is that the implementation of each vector class can be put inside its own translation-unit, so it does not have to be recompiled every time it is used.
However, we can achieve the same effect using templates! C++11 introduced the extern template construct, which tells the compiler that a template instantiation is compiled in another translation-unit:
extern template is like a forward declaration for templates.
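The syntax can be sketched with a function template. Here dot and Vector are illustrative names, and the explicit instantiation definition, which would normally live in exactly one .cpp file, is included in the same snippet only so that it links on its own:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

template <std::size_t N>
struct Vector {
    std::array<float, N> data;
};

// A non-inline function template (illustrative).
template <std::size_t N>
float dot(const Vector<N>& a, const Vector<N>& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < N; ++i) sum += a.data[i] * b.data[i];
    return sum;
}

// Explicit instantiation declaration: do not implicitly instantiate
// dot<3> here, some translation unit provides it.
extern template float dot<3>(const Vector<3>&, const Vector<3>&);

// Explicit instantiation definition: normally placed in a single .cpp
// file. It must come after the declaration when both share a file.
template float dot<3>(const Vector<3>&, const Vector<3>&);

// A caller that relies on the instantiation above.
inline float dot_example() {
    Vector<3> v{{1.0f, 2.0f, 3.0f}};
    float d = dot(v, v);
    assert(d == 14.0f);
    return d;
}
```

If the declaration is present but no translation unit supplies the matching definition, the build is expected to fail at link time with an undefined symbol.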
We can use this construct to pull the most common template instantiations into their own translation-unit, dramatically decreasing incremental build times.
And we extern template the common cases in the vector header to prevent consumers from compiling their own version.
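A sketch of how the two files might be laid out, shown as one listing so that it compiles by itself. The file names in the comments, the chosen sizes 2, 3 and 4, and the member dot are assumptions. The member is defined outside the class because an explicit instantiation declaration does not suppress implicit instantiation of inline functions:

```cpp
// ---- vector.h (hypothetical header) ----
#include <array>
#include <cassert>
#include <cstddef>

template <std::size_t N>
struct Vector {
    std::array<float, N> data;
    float dot(const Vector& other) const;  // defined below, not inline
};

template <std::size_t N>
float Vector<N>::dot(const Vector& other) const {
    float sum = 0.0f;
    for (std::size_t i = 0; i < N; ++i) sum += data[i] * other.data[i];
    return sum;
}

// Common sizes: consumers reuse the copies compiled in vector.cpp.
extern template struct Vector<2>;
extern template struct Vector<3>;
extern template struct Vector<4>;

// ---- vector.cpp (hypothetical; would start with #include "vector.h") ----
template struct Vector<2>;
template struct Vector<3>;
template struct Vector<4>;

// ---- consumer code (any file that includes vector.h) ----
inline float consumer_example() {
    Vector<4> v{{1.0f, 2.0f, 3.0f, 4.0f}};
    float d = v.dot(v);  // uses the instantiation from vector.cpp
    assert(d == 30.0f);
    return d;
}
```

Sizes that are not listed, such as Vector<7>, are still instantiated implicitly wherever they are used, so only the common cases benefit.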
We then ran an incremental build (vector.cpp already compiled). The results speak for themselves:
Next time you are trying to improve your incremental build times, consider forward declaring your templates!