Design for program-defined metafunctions for cppfront #909

Closed as not planned
@JohelEGP

Design for program-defined metafunctions for cppfront

Introduction

This write-up presents a design to extend cppfront to evaluate program-defined metafunctions.

Conception

Support for metafunctions was first added by commit d8c1a50,
"First checkin of partial meta function support, with interface meta type function".
Its commit message also included the following sentence.

  • There is not yet a general Cpp2 interpreter that runs inside the cppfront compiler and lets users write meta functions like interface as external code outside the compiler.

After a lot of thinking, the idea of a "Cpp2 interpreter" seemed at odds with what cppfront is.
Cppfront takes Cpp2 and lowers it to Cpp1, just like
Cfront takes Cpp1 and lowers it to C.
Interpreting Cpp2 could then be taken to mean one of two things:

  1. Building an interpreter that is a superset of the C++ abstract machine.
    This way, interpreted Cpp2 (i.e., metafunctions) is just as capable as normal Cpp2 code.
  2. Building an interpreter that is a very constrained subset of Cpp2.
    This would be like constexpr in C++11, and would probably evolve similarly.

Interpretation 1 means changing what cppfront fundamentally is.
Interpretation 2 feels unsatisfactory.
It is very constrained and lacks the power of the whole language.

I thus realized that there is an alternative to interpreted Cpp2.
That alternative is loading a metafunction compiled in a library during the execution of cppfront.
This model doesn't change what cppfront is.
Additionally, a metafunction is normal Cpp2 code, just like the implementations of built-in metafunctions.

Counterpoints

In this design, a metafunction is "normal Cpp2 code".
In the Circle model of meta-programming, "normal Cpp1 code" can be executed at compile-time.
This has raised concerns, quoted below, that are relevant to the present design.
In our case, rather than compile-time, it's during metafunction evaluation.

However, we do not believe [the Circle] metaprogramming model is the right direction for C++’s future.
We raise the following concerns:

  • The ability (and potential need) to call into shared libraries from the compiler raises the
    kinds of security concerns that led SG7 to discard std::embed (P1040).

-- P2062 The Circle Meta-model

Circle is a fork of C++ that enables arbitrary compile-time execution (e.g. a compile-time std::cout), coupled with reflection to allow powerful meta-programming. SG7 was interested in it and considered copying parts of it. However, concerns were raised about security and usability problems, so the ability to execute arbitrary code at compile-time was rejected.
-- 2020-02 Prague ISO C++ Committee Trip Report — 🎉 C++20 is Done! 🎉 : cpp

Also, the committee already reviewed a paper describing the Circle evaluation model and expressed some concerns with issues related to trust and implementability, but was generally interested in being able to do more at compile-time, within reason. I didn't mention that because that's already the trajectory for constant expression evaluation.

For example, I don't see the point of adding compile-time specific I/O APIs that won't be compatible with any library; the whole idea of Circle is that you just take your existing C++ code and use it at compile time.

The ability to open a file at compile-time and the ability to execute existing code have largely orthogonal concerns. I think we should be able to execute more code at compile without having to explicitly label it constexpr, but I draw the line at allowing the compiler opening arbitrary files on the whim of some 3rd party library on my behalf.
-- Part of a reply from the thread starting at https://www.reddit.com/r/cpp/comments/jf4wsw/comment/g9mxpqc/?utm_source=share&utm_medium=web2x&context=3

Alternatives

Any alternative that requires recompiling cppfront or hard-coding metafunctions isn't viable at scale.

I also considered whether we could use Cpp1's constexpr and consteval.
These don't serve us if we are to use an existing cppfront program.
Consider the counterpoints.
Given Cpp1's if consteval, a constexpr function can't be guaranteed to not use IO.

That said, it could be possible to require a metafunction to be constexpr
and to actually evaluate it during constant evaluation to produce the updated type.
The technique to implement that would be similar to the one presented in
Interactive C++ in a Jupyter Notebook Using Modules for Incremental Compilation - Steven R. Brandt.
But that is not this design (and I haven't explored such a design).

Counter-counterpoints

Maybe a metafunction can be required to be @pure (#797 (comment)).
Then, even though a metafunction is still normal Cpp2 code, it isn't as problematic.
That said, @pure still seems too restrictive.

Design

This is based on what I learned from studying the documentation of Boost.DLL.

We need to emit a metafunction as an extern "C" symbol.
The mangling of a Cpp1 symbol is experimental and not as portable (https://www.boost.org/doc/libs/master/doc/html/boost_dll/mangled_import.html).
When loading the symbol of a metafunction, we need to use the same emitted name.
This means that we need a protocol for the symbol name and to "C namespace" it.
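As a minimal sketch of the emitting side, the lowered Cpp1 for a metafunction could look like the code below. The prefix `cpp2_metafunction_`, the helper `metafunction_symbol`, and the stand-in `type_declaration` are assumptions made for illustration. They are not taken from #907 or from cppfront's actual reflection API.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Stand-in for cppfront's reflection type (hypothetical, for illustration only).
struct type_declaration {
    std::string name;
    bool polymorphic = false;
};

// Hypothetical naming protocol: a fixed prefix acts as the "C namespace",
// followed by the unqualified Cpp2 name as it is @-used.
std::string metafunction_symbol(std::string_view cpp2_name) {
    assert(!cpp2_name.empty() && "a metafunction needs a name");
    return "cpp2_metafunction_" + std::string{cpp2_name};
}

// What lowering might emit for a metafunction @-used as `my_interface`.
// extern "C" keeps the symbol name unmangled, so the loader can find it
// by the same string that metafunction_symbol("my_interface") produces.
extern "C" void cpp2_metafunction_my_interface(type_declaration& t) {
    t.polymorphic = true;
}
```

Because the emitter and the loader both derive the symbol from the same function, the protocol lives in one place.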

In its simplest form, we just need a function that,
given the Cpp2 name of a metafunction (as @-used),
returns a function object that evaluates the metafunction.
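The lookup function described above might be sketched as follows. To keep the example runnable without a real shared library, symbol resolution is passed in as a `symbol_resolver`. In practice that would presumably wrap something like `dlsym`, `GetProcAddress`, or Boost.DLL. All names here (`load_metafunction`, `symbol_resolver`, the `cpp2_metafunction_` prefix, the stand-in `type_declaration`) are hypothetical and not the API of #907.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <string_view>

// Stand-in for cppfront's reflection type (hypothetical).
struct type_declaration {
    std::string name;
    bool polymorphic = false;
};

// The function object handed back to the compiler.
using metafunction = std::function<void(type_declaration&)>;
// The C-linkage entry point expected behind each symbol.
using raw_metafunction = void (*)(type_declaration&);
// Maps a C symbol name to its address, or nullptr if it is not found.
using symbol_resolver = std::function<void*(const std::string&)>;

// Given the Cpp2 name as @-used, build the C symbol name, resolve it,
// and wrap the entry point in a function object.
metafunction load_metafunction(std::string_view cpp2_name,
                               const symbol_resolver& resolve) {
    const std::string symbol = "cpp2_metafunction_" + std::string{cpp2_name};
    void* address = resolve(symbol);
    if (address == nullptr) {
        throw std::runtime_error("metafunction not found: " + symbol);
    }
    // Converting an object pointer to a function pointer is conditionally
    // supported in C++; platforms with dlsym-style loading rely on it.
    return reinterpret_cast<raw_metafunction>(address);
}
```

A caller would then evaluate `load_metafunction("my_interface", resolve)(t)` at the point where `@my_interface` is applied to a type.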

There is an implementation of this design at #907.
Details on how this design was applied, as well as other implementation details, can be found there.

Evolution

Name lookup

Up until now, cppfront has been able to rely on the name lookup of lowered Cpp1 code.
But this design introduces an evaluation point that happens outside the C++ abstract machine.
It wants to look up a name that has already been compiled in Cpp1
and use it as named in Cpp2 code before the Cpp2 code has been lowered to Cpp1.

The current design doesn't consider name lookup.
It expects a metafunction name to be @-used unqualified and to follow C "namespacing" conventions.

Dependency scanning

The current design only requires specifying a protocol for lowering and loading a metafunction.
To author and consume a metafunction at scale, we also need dependency scanning, pretty much like Cpp1 modules.

Many of us use a build system to manage the complexity of building Cpp1 code.
We would like to avoid rerunning cppfront on a Cpp2 source that hasn't changed
when none of the libraries that provide the metafunctions it uses have changed either.
Conversely, we want cppfront to rerun if one of those libraries has changed.

We can't know which metafunction a Cpp2 source uses
without manually duplicating this information in the build system description.
cppfront can't just emit the dependency information after the fact (like Cpp1 compilers on #included headers)
because the libraries need to have been built before it starts evaluating the metafunction.
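To illustrate what a pre-pass dependency scanner would have to extract, here is a deliberately naive sketch that collects the names that are @-used in a Cpp2 source. The function `scan_metafunction_uses` is hypothetical. It works on raw text, so it does not skip comments or string literals, and it does not distinguish built-in metafunctions from program-defined ones. A real scanner would need cppfront's own lexer.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <string_view>

// Collects every identifier that directly follows an '@' in the source text.
// Naive by design: no handling of comments, string literals, or qualified names.
std::set<std::string> scan_metafunction_uses(std::string_view source) {
    std::set<std::string> names;
    for (std::size_t i = 0; i < source.size(); ++i) {
        if (source[i] != '@') {
            continue;
        }
        const std::size_t begin = i + 1;
        std::size_t end = begin;
        while (end < source.size() &&
               (std::isalnum(static_cast<unsigned char>(source[end])) ||
                source[end] == '_')) {
            ++end;
        }
        if (end > begin) {
            names.emplace(source.substr(begin, end - begin));
        }
        i = end - 1;  // resume right after the identifier (or after the lone '@')
    }
    return names;
}
```

A build system could map each returned name to the library that provides it and make sure that library is built before cppfront runs on the source.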

It has been suggested that cppfront could have a command line argument for compiling a metafunction library.
That would obviate the need for a dependency scanner, but this inversion of the build logic has drawbacks.

There was an article that I can't find, I think linked from the LLVM Discourse,
about how some other language's compiler (Go or Scala?) forked itself to build a module's sources in parallel.
That ended up resulting in file system races in very rare cases.
They rewrote their module compilation system to not fork itself and instead rely on their build system.
That fixed the issues, and even (significantly? in some cases?) reduced compile times.

I think the general issue is attempting to do what should be done at a higher level.
The higher level being that of the build system.
The CMake support for Cpp1 modules already went in the direction of a dependency scanner
(along with a long trail of papers for proper modules support).
I think it'd be unwise to go in the other direction,
which doesn't even seem to have build system support.
