Description
Design for program-defined metafunctions for cppfront
Introduction
This write-up presents a design to extend cppfront to evaluate program-defined metafunctions.
Conception
Support for metafunctions was first added by commit d8c1a50,
"First checkin of partial meta function support, with interface
meta type function".
Its commit message also included the following sentence.
- There is not yet a general Cpp2 interpreter that runs inside the cppfront compiler and lets users write meta functions like interface as external code outside the compiler.
After a lot of thinking, the idea of a "Cpp2 interpreter" seemed at odds with what cppfront is.
Cppfront takes Cpp2 and lowers it to Cpp1, just like
Cfront takes Cpp1 and lowers it to C.
Interpreting Cpp2 could then be taken to mean one of two things:
- Building an interpreter that is a superset of the C++ abstract machine.
  This way, interpreted Cpp2 (i.e., metafunctions) is just as capable as normal Cpp2 code.
- Building an interpreter that is a very constrained subset of Cpp2.
  This would be like constexpr in C++11, and would probably evolve similarly.
Interpretation 1 means changing what cppfront fundamentally is.
Interpretation 2 feels unsatisfactory.
It is very constrained, and you don't have the power of the whole language at your disposal.
I thus realized that there is an alternative to interpreted Cpp2.
That alternative is loading a metafunction compiled in a library during the execution of cppfront.
This model doesn't change what cppfront is.
Additionally, a metafunction is normal Cpp2 code, just like the implementations of built-in metafunctions.
Counterpoints
In this design, a metafunction is "normal Cpp2 code".
In the Circle model of meta-programming, "normal Cpp1 code" can be executed at compile-time.
This has raised concerns, quoted below, that are relevant to the present design.
In our case, rather than compile-time, it's during metafunction evaluation.
However, we do not believe [the Circle] metaprogramming model is the right direction for C++’s future.
We raise the following concerns:
- …
- The ability (and potential need) to call into shared libraries from the compiler raises the kinds of security concerns that led SG7 to discard std::embed (P1040).
- …
-- P2062 The Circle Meta-model
Circle is a fork of C++ that enables arbitrary compile-time execution (e.g. a compile-time std::cout), coupled with reflection to allow powerful meta-programming. SG7 was interested in it and considered copying parts of it. However, concerns were raised about security and usability problems, so the ability to execute arbitrary code at compile-time was rejected.
-- 2020-02 Prague ISO C++ Committee Trip Report — 🎉 C++20 is Done! 🎉 : cpp
Also, the committee already reviewed a paper describing the Circle evaluation model and expressed some concerns with issues related to trust and implementability, but was generally interested in being able to do more at compile-time, within reason. I didn't mention that because that's already the trajectory for constant expression evaluation.
For example, I don't see the point of adding compile-time specific I/O APIs that won't be compatible with any library; the whole idea of Circle is that you just take your existing C++ code and use it at compile time.
The ability to open a file at compile-time and the ability to execute existing code have largely orthogonal concerns. I think we should be able to execute more code at compile without having to explicitly label it constexpr, but I draw the line at allowing the compiler opening arbitrary files on the whim of some 3rd party library on my behalf.
-- Part of a reply from the thread starting at https://www.reddit.com/r/cpp/comments/jf4wsw/comment/g9mxpqc/?utm_source=share&utm_medium=web2x&context=3
Alternatives
Any alternative that requires recompiling cppfront or hard-coding metafunctions isn't viable at scale.
I also considered whether we could use Cpp1's constexpr and consteval.
These don't serve us if we are to use an existing cppfront program.
Consider the counterpoints.
Given Cpp1's if consteval, a constexpr function can't be guaranteed to not use IO.
That said, it could be possible to require a metafunction to be constexpr and to actually evaluate it during constant evaluation to produce the updated type.
The technique to implement that would be similar to the one presented in
Interactive C++ in a Jupyter Notebook Using Modules for Incremental Compilation - Steven R. Brandt.
But that is not this design (and I haven't explored such a design).
Counter-counterpoints
Maybe a metafunction can be required to be @pure (#797 (comment)).
Then, even though a metafunction is still normal Cpp2 code, it isn't as problematic.
That said, @pure still seems too restrictive.
Design
This is based on what I learned from studying the documentation of Boost.DLL.
We need to emit a metafunction as an extern "C" symbol.
Importing a mangled Cpp1 symbol is experimental in Boost.DLL and not as portable (https://www.boost.org/doc/libs/master/doc/html/boost_dll/mangled_import.html).
When loading the symbol of a metafunction, we need to use the same emitted name.
This means that we need a protocol for the symbol name and to "C namespace" it.
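As a rough illustration of such a protocol, the sketch below maps a Cpp2 metafunction name to the extern "C" symbol name that both the emitting and the loading side would agree on. The prefix cpp2_metafunction_, the helper names, and the type_declaration stand-in are assumptions made up for this example; they are not taken from cppfront or from #907.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical prefix that acts as the "C namespace" for metafunction symbols.
const std::string symbol_prefix = "cpp2_metafunction_";

// True if `name` can be used as an unqualified C identifier (so no `::`).
bool is_c_identifier(const std::string& name) {
    if (name.empty() || std::isdigit(static_cast<unsigned char>(name[0]))) {
        return false;
    }
    for (unsigned char c : name) {
        if (!std::isalnum(c) && c != '_') {
            return false;
        }
    }
    return true;
}

// Maps the Cpp2 name of a metafunction, as @-used, to its extern "C" symbol.
// Both the lowering side and the loading side would call this same function.
std::string metafunction_symbol(const std::string& cpp2_name) {
    if (!is_c_identifier(cpp2_name)) {
        throw std::invalid_argument(
            "metafunction name must be an unqualified C identifier: " + cpp2_name);
    }
    return symbol_prefix + cpp2_name;
}

// Stand-in for the reflection type a metafunction receives (assumption).
struct type_declaration;

// What the lowered Cpp1 for a metafunction named `mylib_greeter` could look
// like under this protocol: an unmangled symbol with the agreed-upon name.
extern "C" void cpp2_metafunction_mylib_greeter(type_declaration&) {}
```

Because the name is computed by one function, a change to the protocol (for example, adding a version tag to the prefix) would stay consistent between the emitter and the loader.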
In its simplest form, we just need a function that,
given the Cpp2 name of a metafunction (as @-used),
returns a function object that evaluates the metafunction.
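A minimal sketch of that function could look as follows. To keep it self-contained, the platform-specific symbol lookup is abstracted behind a symbol_resolver callback; a real one would wrap dlsym, GetProcAddress, or Boost.DLL. All names here (load_metafunction, symbol_resolver, the type_declaration stand-in, the cpp2_metafunction_ prefix, and fake_resolve) are hypothetical and only illustrate the shape of the idea, not the implementation in #907.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Stand-in for the reflection type a metafunction operates on (assumption).
struct type_declaration {
    std::string name;
    int applied = 0;
};

// Signature every exported metafunction symbol is assumed to have.
using metafunction_ptr = void (*)(type_declaration&);
using metafunction = std::function<void(type_declaration&)>;

// "Find this symbol in the loaded libraries"; returns nullptr if it is absent.
using symbol_resolver = std::function<void*(const std::string&)>;

// Given the Cpp2 name of a metafunction (as @-used), returns a function object
// that evaluates it, or an empty function object if no such symbol is found.
metafunction load_metafunction(const std::string& cpp2_name,
                               const symbol_resolver& resolve) {
    const std::string symbol = "cpp2_metafunction_" + cpp2_name;  // assumed protocol
    void* address = resolve(symbol);
    if (address == nullptr) {
        return {};
    }
    // Converting an object pointer to a function pointer is only conditionally
    // supported by the standard, but it is what dlsym-style APIs rely on.
    auto fn = reinterpret_cast<metafunction_ptr>(address);
    return [fn](type_declaration& t) { fn(t); };
}

// An example metafunction, exported under the assumed naming protocol.
extern "C" void cpp2_metafunction_mylib_greeter(type_declaration& t) {
    ++t.applied;
}

// In-process resolver used only to exercise the sketch without a real library.
void* fake_resolve(const std::string& symbol) {
    if (symbol == "cpp2_metafunction_mylib_greeter") {
        return reinterpret_cast<void*>(&cpp2_metafunction_mylib_greeter);
    }
    return nullptr;
}
```

With this shape, cppfront would only need to call load_metafunction for each name it doesn't recognize as built-in and report an error when the returned function object is empty.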
There is an implementation of this design at #907.
Details on how this design was applied, as well as other implementation details, can be found there.
Evolution
Name lookup
Up until now, cppfront has been able to rely on the name lookup of lowered Cpp1 code.
But this design introduces an evaluation point that happens outside the C++ abstract machine.
It wants to look up a name that has already been compiled in Cpp1
and use it as named in Cpp2 code before the Cpp2 code has been lowered to Cpp1.
The current design doesn't consider name lookup.
It expects a metafunction name to be @-used unqualified and to follow C "namespacing" conventions.
Dependency scanning
The current design only requires specifying a protocol for lowering and loading a metafunction.
To author and consume a metafunction at scale, we also need dependency scanning, pretty much like Cpp1 modules.
Many of us use a build system to manage the complexity of building Cpp1 code.
We would like to avoid rerunning cppfront on a Cpp2 source that hasn't changed
if none of the libraries that provide the metafunctions it uses have changed either.
Conversely, we want cppfront to rerun if one of those libraries has changed.
We can't know which metafunctions a Cpp2 source uses
without manually duplicating this information in the build system description.
cppfront can't just emit the dependency information after the fact (like Cpp1 compilers do for #included headers)
because the libraries need to have been built before it starts evaluating the metafunction.
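To make the idea of a scanner concrete, here is a deliberately naive sketch that lists the metafunction names @-used in a Cpp2 source without evaluating anything. It is an assumption-laden illustration: scan_metafunction_uses is a made-up name, the set of built-in names is passed in by the caller, and a plain regular expression will also match an @ inside comments or string literals, which a real scanner would have to skip.

```cpp
#include <cassert>
#include <regex>
#include <set>
#include <string>

// Returns the names that are @-used in `source`, minus the built-in ones.
// A build system could run this before cppfront to learn which metafunction
// libraries must be built (or checked for changes) first.
std::set<std::string> scan_metafunction_uses(const std::string& source,
                                             const std::set<std::string>& built_in) {
    // An `@` immediately followed by an unqualified identifier.
    static const std::regex use(R"(@([A-Za-z_][A-Za-z0-9_]*))");

    std::set<std::string> found;
    for (std::sregex_iterator it(source.begin(), source.end(), use), end;
         it != end; ++it) {
        const std::string name = (*it)[1].str();
        if (built_in.count(name) == 0) {
            found.insert(name);
        }
    }
    return found;
}
```

How a scanned name maps to the library that provides it is left open here; that mapping would still need a convention or an entry in the build system description.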
It has been suggested that cppfront could have a command line argument for compiling a metafunction library.
That would obviate the need for a dependency scanner, but this inversion of the build logic has drawbacks.
There was an article that I can't find, I think linked from the LLVM Discourse,
about how some other language's compiler (Go or Scala?) forked itself to build a module's sources in parallel.
That ended up resulting in file system races in very rare cases.
They rewrote their module compilation system to not fork itself and instead rely on their build system.
That fixed the issues, and even (significantly? in some cases?) reduced compile times.
I think the general issue is attempting to do in the compiler what should be done at a higher level,
namely that of the build system.
The CMake support for Cpp1 modules already went in the direction of a dependency scanner
(along with a long trail of papers for proper modules support).
I think it'd be unwise to go in the other direction,
which doesn't even seem to have build system support.