Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"sources" in a library() or static_library() should trigger rebuild #10196

Closed
Jehan opened this issue Mar 27, 2022 · 12 comments
Closed

"sources" in a library() or static_library() should trigger rebuild #10196

Jehan opened this issue Mar 27, 2022 · 12 comments

Comments

@Jehan
Copy link
Contributor

Jehan commented Mar 27, 2022

Describe the bug

In GIMP, we have generated code which we implement through a custom_target(). When this custom target rebuild is trigger, we want it to trigger a rebuild of the library() and static_library() using it.

Since the custom_target() outputs a stamps file because we regenerate the code directly into the source dir (the generated code is committed in the source tree, because it allows to check diffs and investigate changes much better so this is absolutely needed), I thought of just adding the custom_tgt object as sources in the library() and static_library().

Then the new generated source is properly rebuilt when I change the base source (this part is fine), yet the new version is not taken into account by the library build (which is actually not even triggered, even though the custom_tgt source was updated!).

To Reproduce

Here is our custom_target() meson file: https://gitlab.gnome.org/GNOME/gimp/-/blob/10afff93196a1c9b1daf0f699d593a42885685eb/pdb/meson.build#L122

This is the places where we need to use it as dependency:

(it is needed in 2 places, because 2 code files are generated from the same source, as it creates a protocol API for inter-process communication between GIMP core and its plug-ins)

As you can see, it's very simple code. We produce some stuff with a custom_target and just want the *library targets to wait and be triggered when the custom one is triggered.

Well it doesn't. It can be verified this way:

  • Build GIMP once successfully.
  • Modify a .pdb file to make it produce invalid code, for instance open pdb/groups/drawable_color.pdb and edit it (for instance remove a ; at the end of some code.
  • Run ninja.

Expected behavior

Since we made the code invalid, the 2 libraries using the generated code should be retriggered and the build should fail. The libs builds aren't even retriggered (a ninja -v shows so), even though a git diff would show that the source files were indeed re-generated (in our case, app/pdb/drawable-color-cmds.c would be regenerated, though depending on the type of change we do, libgimp/gimpdrawablecolor_pdb.c could also be regenerated). Yet the libappinternalprocs is not retriggered and therefore the build passes.

Only a second call to ninja would fail the build as it should have previously.

Note that I understand that it could miss the fact that the source it relies on was changed (since it happened after the build start), but what it should definitely not miss is that a custom_tgt added in "sources" was rebuilt, which therefore should do a chain reaction and rebuild other targets depending on it. Otherwise, what's the point of "sources".

Note that I also tried adding the custom_tgt() as a "sources" to declare_dependency(), then use this dependency object in the *library(). Unfortunately it didn't work any better even though "sources" was defined like this:

sources to add to targets (or generated header files that should be built before sources including them are built)

It clearly says we could use this for building other needed sources. Yet it doesn't work.

Also it's dangerous because my example of producing broken code is simple to spot, but if any change need 2 rebuilds to be taken into account, I could imagine developers not understanding why their code doesn't work and wasting time investigating while it was simply not rebuilt. So what they see in the repo and what was built by meson is simply different.

system parameters

  • Is this a cross build or just a plain native build (for the same computer)?
    Native build
  • what operating system (e.g. MacOS Catalina, Windows 10, CentOS 8.0, Ubuntu 18.04, etc.)
    PureOS buster/sid
  • what Python version are you using e.g. 3.8.0
    Python 3.9.2
  • what meson --version
    0.56.2
  • what ninja --version if it's a Ninja build
    1.10.1
@eli-schwartz
Copy link
Member

Since the custom_target() outputs a stamps file because we regenerate the code directly into the source dir (the generated code is committed in the source tree, because it allows to check diffs and investigate changes much better so this is absolutely needed), I thought of just adding the custom_tgt object as sources in the library() and static_library().

Well, there is your problem. You added a stamp file as source code to the library -- no .c files. How do you usually pass them through GCC/clang into object files and then on into your library? If they aren't sources, they don't get recompiled, and if they don't get recompiled, they don't trigger a relink.

Note that I understand that it could miss the fact that the source it relies on was changed (since it happened after the build start), but what it should definitely not miss is that a custom_tgt added in "sources" was rebuilt, which therefore should do a chain reaction and rebuild other targets depending on it. Otherwise, what's the point of "sources".

To add source code files (.c, .cpp) which are compiled into the library.

@eli-schwartz
Copy link
Member

Quite frankly though, I don't actually understand what you are trying to do, or how you are trying to do it. :)

You reported a bug about Meson not doing what it's documented to do -- Meson does do what it's documented to do, I can prove it because lots of projects provably use it correctly and it works there for me.

You didn't explain what your project is doing, what files that stamp command actually creates, how they (as opposed to your stamp file itself) are used, or why you think that your project qualifies as what Meson says you can do. The mention of stamp files is an immediate red flag -- it means you're going outside of Meson's documented parameters, so I cannot assume anything in order to fill in the gaps in my knowledge.

Please clarify.

@eli-schwartz eli-schwartz added the needs-info waiting information from user. May close if no response for long time. label Mar 27, 2022
@eli-schwartz
Copy link
Member

What Meson is doing here, basically, is as follows:

  • adding sources with an extension of .c or .cpp will get compiled by the relevant compiler, then linked into the final target (library)
  • adding sources with an extension of .h creates a bootstrap dependency, what ninja calls an order-only, or "header", dep. The file must exist, but its timestamp is irrelevant. This is useless, unless the file is generated, in which case it enforces that any build rule to create the order-only dependency file must run first, before any sources are compiled
  • The compiler, when compiling real sources such as .c or .cpp, will emit a depfile containing additional dependencies on each indirect source that that file had -- headers that are included, possibly from /usr/include and possibly from your source tree. This ensures that incremental rebuilds work.

Without order-only dependencies, when you generate a header as part of your build, ninja would not know that the header needs to be generated before compilation starts.

On the other hand, headers cannot be a direct dependency -- then editing one header would cause a full project-wide rebuild.

Without depfile support, incremental rebuilds would not work, because edits to headers would not trigger rebuilds of the files that use them since that information is not encoded on a per-file level in meson.build (and generally for autotools Makefiles, you don't want to track headers that closely, so depfiles are used there too).

So adding a stamp file to a library source is expected to ensure that the custom_target is built at least once, before the library. At that point, the first build is expected to succeed, and it is the job of the compiler to emit depfiles which guarantee incremental rebuilds shall work, depending on what the "real" sources actually needed during the preprocessor and codegen stages.

This works everywhere that I've ever seen.

@Jehan
Copy link
Contributor Author

Jehan commented Mar 27, 2022

You didn't explain what your project is doing, what files that stamp command actually creates

It creates:

how they (as opposed to your stamp file itself) are used

  • The .c and .h files in libgimp/ are used for the libgimp library() (which is used by plug-ins).
  • The .c files in app/pdb/ are used by a temporary static library which will end up into the main GIMP binary.

The stamp file is not used. It's just here as a witness of the build as is customary for stamp files in builds.

why you think that your project qualifies as what Meson says you can do

I'm not really saying anything other than believing it would work reading the docs. If it's not right, I don't mind, then I would appreciate a pointer on how I can do what I need. Basically it once again all comes back to issue #8123 where a generic way to define target dependency would be useful. Here we have 3 parts: the pdb/ (which is the generated files for our inter-process protocol), the app/ and libgimp/.

We need pdb/ to be run before the 2 others and anything triggering pdb rebuild should also trigger the 2 other part's rebuild. It's a very simple dependency need, very basic. Unfortunately this is one of the core part of GIMP which we never managed to have working in meson and blocks us to make meson our main build system, ever since our first meson build 3 years ago. Several very skilled contributors tried to make it work so far and failed to do so. That's why I finally tried too today but so far didn't find the solution either.

As I understand it, the main issue is that we need to generate the source files in the source dir, not the build dir. This is important because we also commit these generated files, which allows us to review any changes properly (if these were not versioned, it would be impossible). Still this should not be a problem for meson. We just need to tell it "this custom target is a dependency of this library, please rebuild if the custom target is rebuilt". That's it. Simple.

@eli-schwartz
Copy link
Member

So if I understand correctly:

  • a library depends on foo.c, foo.h, and stamp.h
  • foo.c, foo.h exist in the source tree
  • stamp.h is created by a custom_target

You want stamp.h to be created first, before foo.o, and adding it as a header-only dep does cause this to happen.

Then stamp.h gets updated by a ninja rule, and in the meantime, ninja thinks that foo.c was edited by a human with a text editor.

Now, libsomething.so does depend on foo.o, and foo.o does depend on foo.c, so this should definitely get built. Is it not getting built at all?

@Jehan
Copy link
Contributor Author

Jehan commented Mar 27, 2022

a library depends on foo.c, foo.h, and stamp.h

We don't need the library to actually build the stamp. This stamp is merely an indicator. Though we could make it build an empty/no-op stamp.h or stamp.c if it'd help (but I tested both cases and it didn't or I didn't do it right 😕).

The other 2 points are correct.

You want stamp.h to be created first, before foo.o, and adding it as a header-only dep does cause this to happen.

Yes we want the stamp to be created before foo.o, because foo.c might actually change in the meantime. But no, I never made it to happen so far. It looks like foo.o is created first, or sometimes it's not even re-created because meson/ninja thinks that no change occurred in foo.c (but then, when the stamp is created, foo.c was actually updated, but too late for meson/ninja to process it into foo.o).

Now, libsomething.so does depend on foo.o, and foo.o does depend on foo.c, so this should definitely get built. Is it not getting built at all?

It is built, but only after a second ninja run. The first ninja run does change foo.c but builds foo.o off the old foo.c (not the new version).

Here is what happens:

  • I edit foo.pdb (which is my source for foo.c and foo.h).
  • I run ninja.
  • ninja creates foo.o based on the old foo.c, it also creates the new foo.c (too late) and the stamp.
  • Then it creates the library (but using foo.o, hence based on outdated code which is not even in the repository anymore at the end of the ninja call).

Now if I call ninja a second time (without touching anything more), the good foo.o off the new foo.c is compiled and the library is rebuilt with actual new code.

@eli-schwartz
Copy link
Member

Now if I call ninja a second time (without touching anything more), the good foo.o off the new foo.c is compiled and the library is rebuilt with actual new code.

Okay so yes, this is definitely an issue about stamps.

ninja runs. First thing it does is check timestamps for absolutely everything.

  • stamp.h is out of date. foo.o is out of date. libsomething.so is out of date.
    • not part of the flow we are examining, but bar.o, baz.o, something.o, coolthing.o, amazingthing.o, and stupendousthing.o are also out of date.
  • ninja rebuilds stamp.h first, it's a header only dep.
  • then it rebuilds foo.o
    • (and also bar.o, baz.o, something.o, coolthing.o, amazingthing.o, stupendousthing.o)
  • then it rebuilds libsomething.so

All is well.

Next ninja run, after editing foo.pdb

  • stamp.h is out of date. foo.o is up to date. libsomething.so is out of date
  • ninja rebuilds stamp.h, and since it is a header-only dep, it doesn't rebuild foo.o (but nor does it rebuild bar.o, baz.o, something.o, coolthing.o, amazingthing.o, or stupendousthing.o, i.e. all the other inputs to libsomething.so)
  • ninja finishes

Next ninja run, after doing nothing

  • stamp.h is up to date. foo.o is out of date,. libsomething.so is also out of date
    • it only knows this now, after it rescans timestamps
  • ninja rebuilds foo.o
  • ninja rebuilds libsomething.so

So, the trick is saying that foo.o depends on stamp.h, in addition to foo.c, in order to trigger a dependency rule after timestamp scanning. But, we do NOT want bar.o, baz.o, something.o, coolthing.o, amazingthing.o, or stupendousthing.o to be rebuilt.

I can see two obvious solutions here:

  • generate two copies of foo.c, one in the source directory, one in the build directory, and tell meson that the latter is the true output of this custom_target. Do away with the stamp. ninja will build from the builddir version of foo.c, and the source version is not ever used. It's just... test data, basically.
  • make foo.c #include "stamp.h" so that the compiler depfile dynamically calculates another rule for stamp.h -> foo.o; stamp.h can of course be an empty file which does nothing (it probably already is) so it does no harm, and in fact does nothing at all.

The former I have seen done by people (harfbuzz) who don't actually care about the source copy, except that they want to dist it -- and particularly, they want to support their reverse dependencies, some of whom apparently are vendoring harfbuzz from git, not from release tarballs, and at the same time have no interest in using Meson because they are called Google and are quite happy with GN and suchlike.

The latter is probably a better option if you want to absolutely guarantee that you're building the same code that you git diff.

@eli-schwartz
Copy link
Member

Note that all this is in accordance with the documentation for library(..., sources: 'stamp.h')

These input files can be sources, objects, libraries, or any other file. Meson will automatically categorize them based on the extension and use them accordingly. For instance, sources (.c, .cpp, .vala, .rs, etc) will be compiled and objects (.o, .obj) and libraries (.so, .dll, etc) will be linked.

With the Ninja backend, Meson will create a build-time order-only dependency on all generated input files, including unknown files. This is needed to bootstrap the generation of the real dependencies in the depfile generated by your compiler to determine when to rebuild sources. Ninja relies on this dependency file for all input files, generated and non-generated. The behavior is similar for other backends.

@eli-schwartz
Copy link
Member

gnomesysadmins pushed a commit to GNOME/gimp that referenced this issue Mar 28, 2022
Fix the dependency by making the stamp an actual (yet empty/no-op)
header file which is included by all generated source file. This way, we
ensure that meson rebuild .o files when the .pdb sources are changed.

This is the second solution proposed by eli-schwartz here:
mesonbuild/meson#10196 (comment)
@Jehan
Copy link
Contributor Author

Jehan commented Mar 28, 2022

Ok! So the first proposed solution, I had thought of it already, but I really didn't like to duplicate the source files into the build dir indeed. It could have been some day cause of headaches and hard-to-diagnose bugs if the copy was not proper and you ended up with discrepancy (so you would get visible source different from actually built source, non-understandable unless you think of diff the files explicitly, possibly after having wasted hours already!).

The second solution is much better though. I had no idea that meson would generate dependency trees from inspecting source #include-s. If these were hand-made source though, it would have been problematic (as someone could forget the #include), but since it's generated source, all I had to do was to update the generation scripts. Now it works very well.

Now as I said at the end of the GIMP report:

I still believe that meson definitely needs more generic dependency systems too. The fact that it can read the includes is very useful and powerful for C code, but it also means that this solution would not have worked if you wanted to build a project in a language not special-cased by meson (hence "genericity" needed, simply supplemented by special-casing for improving well-known cases like C).

Also even for C, rather than having to create a fake source, which is still boderline-workaround, having the ability to just control your build system and tell it to rebuild in a given case "no question asked" (because you know what you are doing) is a definitive base feature of a build system IMO. So it fixes our issue here, but doesn't mean we don't believe meson could be much better with more genericity. Just saying. :-)

As a consequence, I don't know if this report should be closed. Maybe it can be closed as we could consider my remaining requests as part of #8123 about more-generic ability to handle dependency. Not always having to workaround with fake empty headers or using special-cases like the fact this was C code and that meson processes includes (there is not only C in life; myself, I like to code in a variety of languages).

@eli-schwartz
Copy link
Member

I had no idea that meson would generate dependency trees from inspecting source #include-s. If these were hand-made source though, it would have been problematic (as someone could forget the #include), but since it's generated source, all I had to do was to update the generation scripts. Now it works very well.

Meson doesn't inspect includes, per se. Instead, it (together with ninja) has a powerful and fully generic ability to have any command specify that it also creates a depfile. Then ninja reads the depfile as a Makefile fragment (because this unofficial standard started in the Make and C compiler arena) and supports the limited subset of Make that is

output: input \
    input2 \
    input3

(Forgive my incorrect Makefile syntax, I cannot type a tab into my phone.)

Remember that any command can say it has a depfile! Meson doesn't allow you to specify commands for building C/C++ executables and libraries, because that's the point of a built-in executable() / library() function. It knows about compilers and it knows when they support depfiles.

custom_target() does specify its own command, though. And if that custom command supports depfile generation, there's a kwarg to tell that to meson.

Meson has built-in support for some other special cases too. For example, the gnome module supports glib-compile-resoueces and the qt5 module supports the qrc (qt resource compiler) via dedicated functions, these functions detects if your gnome/Qt version is new enough to have the qrc --depfile argument or the gresource --dependency-file argument and use them if so.

It's useful for cython too, the python to C transpiler, and we're currently blocked on reliably rebuilding cython files until cython gets a depfile argument (they added one to cythonize, but it turns out that isn't good enough).

Custom project scripts can use this too, just by building and saving a list of each significant file they needed to open and parse along the way to create whatever output they wanted to create.

Not always having to workaround with fake empty headers or using special-cases like the fact this was C code and that meson processes includes (there is not only C in life; myself, I like to code in a variety of languages).

Well sure, again this is usable from any language as long as your build program knows how to or can be taught to emit "an augmented list of build dependency rules".

And you need to create a stamp file anyway so that ninja knows your command completed successfully, so making that double as a header in the case where the things which depend on it are files which natively understand headers, is a pretty simple solution.

It's harder when you don't control the build program and it doesn't support depfile generation. That's when having an escape hatch and being able to manually specify dependencies is nice.

@eli-schwartz eli-schwartz removed the needs-info waiting information from user. May close if no response for long time. label Mar 28, 2022
@eli-schwartz
Copy link
Member

Closing as mostly resolved, with some things that are already tracked in #8123

gnomesysadmins pushed a commit to GNOME/gimp that referenced this issue Aug 1, 2022
Our meson build system was not properly building the enums.c file,
because they are versionned.

I did a similar trick as what I did for the pdbgen, which is that I used
a wrapper script around the existing perl script, which sets proper
options and generate a stamp file in the end (which is considered by
meson as the actual custom target, not the C file since it is generated
in the source dir).

The most important part is that the stamp file is a generated header
source (not just a random text file) which is **included** by the
generated C file. This is what will force meson to regenerate the C file
if the header is updated, **then** build using this new version, not use
an outdated versionned version (which would make for hard to diagnose
bugs), through the indirection of the intermediate stamp header.

See #4201.
See also: mesonbuild/meson#10196 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants