Extreme #include discipline for C++ code
Try Documentalist, my app that offers fast, offline access to 190+ programmer API docs.

C++ takes a long time to compile.
There is more than one reason for it, but a big one is excessive re-parsing of the same .h header files.
In SumatraPDF I'm using an extreme #include discipline to keep compilation times in check.
The rule is simple: a .h file cannot #include other .h files.
I didn't come up with this idea, I got it from Rob Pike: http://doc.cat-v.org/bell_labs/pikestyle
I've been following this rule for several years in SumatraPDF, a medium-sized C++ project of over 100k loc. It works.
"It works" is more important than it seems. Many ideas seem great on paper but fail in practice. Name an economically successful communist country.
Don't get me wrong: the price of minimizing compilation times is eternal vigilance.
Writing C++ while following that rule is annoying.
In code, things depend on other things. If a struct in foo.h depends on a struct in bar.h, a quick fix is to #include "bar.h" in foo.h.
You do it once and it works: in your foo.c you just include foo.h and it brings in bar.h.
That convenience comes with a hidden price. Imagine you have foo2.h that also depends on bar.h, so you also #include "bar.h" in foo2.h.
You then #include "foo2.h" in foo.c and bang! You just included and parsed bar.h twice.
In real C++ codebases the same headers are unnecessarily re-included and re-parsed hundreds of times.
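To make the double inclusion concrete, here is a toy model of the foo.c example. It is only a sketch: `IncludeGraph`, `countInclusions`, `nestedLayout` and `flatLayout` are made-up names, and it models a preprocessor with no include guards, which is a simplification of what real compilers do.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: each file maps to the list of files it #includes.
using IncludeGraph = std::map<std::string, std::vector<std::string>>;

// Expand every #include reachable from `file`, the way a preprocessor
// without include guards would, and count how often each header is pulled in.
void countInclusions(const IncludeGraph& graph, const std::string& file,
                     std::map<std::string, int>& counts, int depth = 0) {
    // A circular include would recurse forever in this model.
    assert(depth < 100 && "include cycle in the model");
    auto it = graph.find(file);
    if (it == graph.end()) return;  // this file includes nothing
    for (const std::string& header : it->second) {
        counts[header] += 1;
        countInclusions(graph, header, counts, depth + 1);
    }
}

// The layout from the text: foo.h and foo2.h both include bar.h.
IncludeGraph nestedLayout() {
    return {
        {"foo.c", {"foo.h", "foo2.h"}},
        {"foo.h", {"bar.h"}},
        {"foo2.h", {"bar.h"}},
    };
}

// The same files when headers include nothing and foo.c lists everything.
IncludeGraph flatLayout() {
    return {
        {"foo.c", {"bar.h", "foo.h", "foo2.h"}},
    };
}

// How many times bar.h ends up included while compiling foo.c.
int barInclusions(const IncludeGraph& graph) {
    std::map<std::string, int> counts;
    countInclusions(graph, "foo.c", counts);
    return counts["bar.h"];
}
```

With the nested layout bar.h is counted twice for a single foo.c; with the flat layout it is counted once. Scale the graph up to hundreds of headers and the duplicates multiply.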
It's a known problem, so we mitigate it with #ifndef include guards, #pragma once etc., but in my experience those band-aids don't solve the problem.
Following Rob Pike's rule, foo.c itself must #include "bar.h", "foo.h" and "foo2.h", in the correct order.
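A sketch of what that layout could look like, squeezed into one listing with comments marking where each file would start. The struct members and the `total` function are invented for illustration; only the file names come from the example above.

```cpp
#include <cassert>  // standard header, needed for assert below

// ---- bar.h: includes nothing ----
struct Bar {
    int value;
};

// ---- foo.h: includes nothing, expects Bar to be defined already ----
struct Foo {
    Bar bar;
};

// ---- foo2.h: includes nothing, also expects Bar ----
struct Foo2 {
    Bar bar;
    int extra;
};

// ---- foo.c: lists every header itself, dependencies first ----
// #include "bar.h"
// #include "foo.h"
// #include "foo2.h"
int total(const Foo* foo, const Foo2* foo2) {
    assert(foo != nullptr && foo2 != nullptr);
    return foo->bar.value + foo2->bar.value + foo2->extra;
}
```

Swap the first two #include lines in foo.c and the compiler no longer knows what Bar is by the time it reaches Foo.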
The "correct order" part is what makes it annoying.
Let's face it: a month after writing foo.h I no longer remember that it depends on bar.h.
So the way it goes is:
  • I #include "foo.h" in a brand new .cpp file
  • I get a compilation error: "what is this Bar you're referring to?"
  • I dig around and figure out that Bar is a struct defined in bar.h, so I #include "bar.h" before foo.h
  • I get another compilation error: "what is that Bar2 you speak of?" This could be an unmet dependency of foo.h or of the newly included bar.h
  • I keep adding 10 more #includes to satisfy their cascading dependencies
What used to be a simple #include "foo.h" can end up a lengthy game of #include whack-a-mole.
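The whack-a-mole is really a dependency-ordering problem solved by hand. If the dependencies were written down somewhere, the order could be computed with a depth-first walk. This is a hypothetical sketch, not a tool SumatraPDF uses: `Deps`, `visit` and `includeOrder` are invented names, and the header bar2.h (home of Bar2) is assumed for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical input: header -> headers whose declarations it needs.
using Deps = std::map<std::string, std::vector<std::string>>;

enum class Mark { Visiting, Done };

// Depth-first walk: emit everything a header needs before the header itself.
void visit(const Deps& deps, const std::string& header,
           std::map<std::string, Mark>& marks,
           std::vector<std::string>& order) {
    auto seen = marks.find(header);
    if (seen != marks.end()) {
        // Meeting a header that is still being visited means a cycle.
        assert(seen->second == Mark::Done && "circular header dependency");
        return;
    }
    marks[header] = Mark::Visiting;
    auto it = deps.find(header);
    if (it != deps.end()) {
        for (const std::string& needed : it->second) {
            visit(deps, needed, marks, order);
        }
    }
    marks[header] = Mark::Done;
    order.push_back(header);
}

// Returns the headers a .cpp file must list, in a valid #include order.
std::vector<std::string> includeOrder(const Deps& deps,
                                      const std::string& wanted) {
    std::map<std::string, Mark> marks;
    std::vector<std::string> order;
    visit(deps, wanted, marks, order);
    return order;
}
```

For a graph where foo.h needs bar.h and bar.h needs bar2.h, asking for foo.h gives bar2.h, bar.h, foo.h, which is the order the compiler errors would eventually have forced anyway.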
So beware: following this extreme rule will be occasionally painful.
I wasn't following this rule from the beginning. A refactor of SumatraPDF code to follow it was painful.
I find this price is worth paying and not just because of shorter compilation times.
It also forces me to design better (simpler) dependencies.
Entropy is real; our heads are small.
In large programs you have thousands of structs, classes, functions, enums and they form a complex web of dependencies.
It's way too much to fully understand at once so we get sloppy, we take shortcuts just to get that damn thing to compile.
Over time the sloppiness accumulates and we might end up with an inter-dependent, circular mess. You just want to #include "Button.h" and somehow it ends up bringing in NuclearPowerPlant.h.
I've done this in my own code and once things get tangled, it's really hard to untangle them. The chaos wins. Don't let chaos win. Be control.
I don't think I've ever seen another C++ code base that follows this rule.
This makes me either a madman or a genius.
An idea for reducing compilation times that is better known (but also not widely adopted in actual codebases) is the PIMPL idiom. I'm not using it and I'm not a fan because it's hard out there for a PIMPL.
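For readers who haven't met it, a minimal sketch of the idiom in its common std::unique_ptr form. `Widget` and its members are hypothetical and this is not SumatraPDF code. The comments mark which part would live in the header and which in the .cpp file.

```cpp
#include <cassert>
#include <memory>

// ---- widget.h: what every user of Widget has to parse ----
class Widget {
public:
    Widget();
    ~Widget();              // defined in widget.cpp, where Impl is complete
    void grow(int by);
    int size() const;

private:
    struct Impl;            // only declared here
    std::unique_ptr<Impl> impl;
};

// ---- widget.cpp: the only file that sees the real data members ----
struct Widget::Impl {
    int size = 0;
};

Widget::Widget() : impl(new Impl()) {}
Widget::~Widget() = default;

void Widget::grow(int by) {
    assert(by >= 0);
    impl->size += by;
}

int Widget::size() const {
    return impl->size;
}
```

The header no longer needs the headers for whatever Impl contains, so changing Impl only recompiles widget.cpp. The cost is an extra allocation, a pointer hop on every access and a forwarding function for every method.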
