C++ takes a long time to compile.
There is more than one reason for that, but one of them is excessive re-parsing of the same .h header files.
I’m using an extreme discipline to keep compilation times in check.
The rule is simple: a .h file cannot #include other .h files.
I’ve been following this rule for several years in SumatraPDF, a medium-sized C++ project of over 100k lines of code. It works.
“It works” is more important than it seems. Many ideas seem great on paper but fail in practice. Name an economically successful communist country.
Don’t get me wrong: the price of minimizing compilation times is eternal vigilance.
Writing C++ while following that rule is annoying.
In code, things depend on other things. If a struct in foo.h depends on a struct in bar.h, a quick fix is to #include "bar.h" in foo.h.
You do it once and it works, done once and for all: in your foo.c you just #include "foo.h" and it brings in bar.h as well.
That convenience comes with a hidden price. Imagine you have foo2.h that also depends on bar.h, so you also #include "bar.h" in foo2.h. Then you #include "foo.h" and #include "foo2.h" in foo.c and bang! You just included and parsed bar.h twice.
In real C++ codebases the same headers are unnecessarily re-included and re-parsed hundreds of times.
It’s a known problem. We try to mitigate it with
#pragma once etc. but in my experience those band-aids don’t solve the problem.
Following Rob Pike’s rule, we must #include "bar.h" and #include "foo.h" in foo.c, in the correct order.
The “correct order” part is what makes it annoying.
Let’s face it: a month after writing foo.h I no longer remember that it depends on bar.h.
So the way it goes is:
- I #include "foo.h" in foo.c
- I get a compilation error: what is this Bar you're referring to?
- I dig around and figure out that Bar is a struct defined in bar.h, so I #include "bar.h" before #include "foo.h"
- I get another compilation error: what is that Bar2 you speak of? This could be an unmet dependency from foo.h or from the newly included bar.h
- I keep adding 10 more #includes to satisfy the cascading dependencies
What used to be a simple #include "foo.h" can end up a lengthy game of chasing cascading dependencies.
So beware: following this extreme rule will be occasionally painful.
I wasn’t following this rule from the beginning. A refactor of SumatraPDF code to follow it was painful.
I find this price worth paying, and not just because of shorter compilation times.
It also forces me to design better (simpler) dependencies.
Entropy is real. Complexity grows but our heads remain small.
In large programs you have hundreds of structs, classes, functions, enums and they form a complex web of dependencies.
It’s way too much to fully understand at once so we get sloppy, we take shortcuts just to get that damn thing to compile.
Over time the sloppiness accumulates and we might end up with an inter-dependent, circular mess. You just want to #include "Button.h" and somehow it ends up bringing in dozens of other headers.
I did that in my own code. Once things get tangled, it’s really hard to untangle them.
The chaos wins.
Don’t let chaos win. Be in control.
I don’t think I’ve ever seen any other C++ codebase that follows this rule.
This makes me either a madman or a genius.
An idea for reducing compilation times that has more mindshare (but also not much adoption in actual codebases) is the pimpl idiom.
I’m not using it because it requires writing more code. That is not a price I’m willing to pay.