Description
This isn't new, but the huge template mess throughout the library makes compile times slow, and they are currently even worse in the mutable sorters branch (which doesn't matter for now because that branch is blocked by internal compiler errors in clang++, but still...). However, reducing those compile times without breaking the current API or reducing the power of the wrappers seems tricky. To be honest, I don't even know where to start to reduce the ever-growing number of template instantiations that occur everywhere. I tried a few things, knowing that they wouldn't change much:
- Replace most `std::decay_t` by `remove_cvref_t`
- Invert the `choice` mechanism in `hybrid_adapter` to reduce its recursive instantiations from 127 to <30 most of the time
- Move a few elements out of templates when they don't depend on every template parameter
- Don't use `sorter_facade` to implement sorter adapters when they can already handle all the `operator()` overloads by themselves
It decreased the debug executable size a bit, but that's pretty much the only noticeable achievement.
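For reference, here is a minimal sketch of two of the changes listed above: `remove_cvref_t` versus `std::decay_t`, and hoisting a nested type out of a template that has more parameters than the type needs. Everything in the `sketch` namespace (`remove_cvref_t` as written here, `impl_before`, `impl_after`, `run`) is a hypothetical illustration and not the library's actual code.

```cpp
#include <functional>
#include <type_traits>

namespace sketch
{
    // Hypothetical stand-in: a plain alias template, no array-to-pointer
    // or function-to-pointer logic to evaluate, unlike std::decay_t
    template<typename T>
    using remove_cvref_t = std::remove_cv_t<std::remove_reference_t<T>>;

    // Before: run is nested, so every (Iterator, Compare) pair
    // produces its own distinct run type
    template<typename Iterator, typename Compare>
    struct impl_before
    {
        struct run { Iterator first; Iterator last; };
    };

    // After: run only depends on Iterator, so it is hoisted out and
    // shared by every Compare used with the same Iterator
    template<typename Iterator>
    struct run { Iterator first; Iterator last; };

    template<typename Iterator, typename Compare>
    struct impl_after
    {
        using run_type = run<Iterator>;
    };
}

// Both traits agree on the common case...
static_assert(std::is_same<sketch::remove_cvref_t<const int&>, int>::value, "");
static_assert(std::is_same<std::decay_t<const int&>, int>::value, "");

// ...but differ on arrays, so the replacement is not always a drop-in one
static_assert(std::is_same<std::decay_t<int[3]>, int*>::value, "");
static_assert(std::is_same<sketch::remove_cvref_t<int[3]>, int[3]>::value, "");

// The nested type is duplicated per comparator, the hoisted one is shared
static_assert(not std::is_same<
    sketch::impl_before<int*, std::less<>>::run,
    sketch::impl_before<int*, std::greater<>>::run
>::value, "");
static_assert(std::is_same<
    sketch::impl_after<int*, std::less<>>::run_type,
    sketch::impl_after<int*, std::greater<>>::run_type
>::value, "");
```

Whether either change is measurable depends on how many distinct instantiations it actually removes.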
But in the grand scheme of things it doesn't change much. I am looking for ideas to reduce the compile times a bit without breaking the library API, but I've only got a few tiny ideas:
- Maybe I could move even more things out of some templates. (I'm out of ideas for where that would be relevant)
- Turning trait-like class templates into templated `using` when possible generally ends up in the compiler instantiating fewer templates, which may help reduce both binary size and compile times. (I did that in a few places, now I don't know where it might still be relevant)
- We could introduce more specialized `sorter_facade` derivatives for specific sorter implementations since the ones in the library mostly have the same functions, but it doesn't look easy to make it work generically for user-defined sorters without introducing more tags.
- `std::conjunction` and friends are apparently slow and probably unneeded in most places; I should analyze the library to reduce their uses if possible.
- Some libraries apparently improved compile time by replacing calls to `std::move` and `std::forward` by a bunch of `static_cast`, even though it doesn't look like the cleanest thing to do. (I tried that for `std::forward` and didn't notice a difference)
- On some platforms, SFINAE conditions are 10x faster to compile when they are in the return type (as opposed to the template parameter list).
- Qualifying more calls in specific places may avoid gratuitously triggering ADL, so that the compiler won't needlessly look up a function in the arguments' associated namespaces.
- Replacing calls to `std::distance` by subtractions when functions only handle random-access iterators may decrease the amount of tag dispatch happening (same for other iterator operations). It might also make it obvious when code is designed to work with random-access iterators only.
- Some compilers provide intrinsics like `_EnableIf` that don't trigger template instantiations, which might be worth considering if they improve the state of things. On the other hand we should check whether it makes error messages worse or not, because #28 (Error messages are unreadable) is a thing too.
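To make a few of the ideas above concrete, here is a rough sketch combining an alias template used as a trait, a SFINAE condition in the return type, a subtraction instead of `std::distance`, and `static_cast` instead of `std::forward`. The names `difference_type_t`, `size_of_range` and `call` are made up for the illustration and are not part of the library; I make no claim about how much any of this would actually save.

```cpp
#include <iterator>
#include <type_traits>
#include <vector>

namespace sketch
{
    // Alias template used as a trait: no extra wrapper class template
    // to instantiate on top of std::iterator_traits
    template<typename Iterator>
    using difference_type_t
        = typename std::iterator_traits<Iterator>::difference_type;

    // SFINAE condition in the return type rather than in the template
    // parameter list; the body subtracts instead of calling
    // std::distance, which also documents the random-access requirement
    template<typename Iterator>
    auto size_of_range(Iterator first, Iterator last)
        -> std::enable_if_t<
            std::is_base_of<
                std::random_access_iterator_tag,
                typename std::iterator_traits<Iterator>::iterator_category
            >::value,
            difference_type_t<Iterator>
        >
    {
        return last - first;
    }

    // static_cast<T&&> performs the same reference collapsing as
    // std::forward<T> without instantiating a function template
    template<typename Function, typename... Args>
    auto call(Function&& func, Args&&... args)
        -> decltype(static_cast<Function&&>(func)(static_cast<Args&&>(args)...))
    {
        return static_cast<Function&&>(func)(static_cast<Args&&>(args)...);
    }
}
```

With a `std::vector<int>` of four elements, `sketch::size_of_range(vec.begin(), vec.end())` returns 4, while calling it with `std::list` iterators is rejected at overload resolution.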
Those are only small changes, and I doubt that they will make a difference. If anyone has better ideas to reduce the compile times without breaking the API, I welcome such ideas :p
List of articles about speeding up compilation of templates and related stuff (suggest articles if needed):
- https://odinthenerd.blogspot.fr/2017/03/start-simple-with-conditional-why.html
- https://boost-experimental.github.io/di/cppnow-2016/#/10/45
- https://mpark.github.io/programming/2017/08/08/using-killed-the-FUN-on-GCC-5/
- https://baptiste-wicht.com/posts/2017/09/how-i-made-deep-learning-library-38-faster-to-compile-optimization-and-cpp17-if-constexpr.html