Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Meson poc #19

Closed
wants to merge 80 commits into from
Closed

Meson poc #19

wants to merge 80 commits into from

Conversation

lithomas1
Copy link
Owner

@lithomas1 lithomas1 commented Jul 14, 2022

  • closes #xxxx (Replace xxxx with the Github issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Build & install by doing (in the pandas directory)

pip install git+https://github.com/mesonbuild/meson.git@master
pip install git+https://github.com/mesonbuild/meson-python.git@main
pip install Cython, wheel, oldest-supported-numpy>=0.10
pip install . -v --no-build-isolation

Builds are automatically multithreaded(thanks to ninja).

Known issues:

  • Sometimes the pxi's don't get generated (for me on macOS). [Maybe patched]
  • Addding the directory where your c files are will result in confusing errors about stuff defined in Python.h not being found. meson implicitly adds this dir.

'arrays',
cython_sources['arrays.pyx'],
include_directories: [inc_np],
dependencies: py_dep,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://mesonbuild.com/Release-notes-for-0-63-0.html#python-extension-modules-now-depend-on-the-python-library-by-default

Based on an SciPy discussion with @rgommers, Meson 0.63.0 now implicitly adds that py_dep for you. This reduces boilerplate in build definitions, as long as you bump your minimum requirement to the latest Meson release. Of course, the SciPy example you based this on, predates this change, so SciPy has not adapted to this yet or raised the minimum build requirements.

As a brand-new Meson port, you'd be ideally placed to take advantage of this and in doing so avoid adding that boilerplate to any version of meson.build.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the heads-up, will include it in my next clean-up commit.

endforeach

# TODO: build a shared library for np_datetime and np_datetime_strings
# and use lto?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shared libraries can be tricky in python packaging, so another option is to just build it into a static_library() once, and link that static library to each extension that needs it. There would still be a copy of the code statically included in each extension that uses it, but you'd avoid compiling it multiple times.

Meson will handle the lto for you if you set b_lto=true in the default_options.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I specify a .c source for example twice, is meson/ninja smart enough to avoid compiling it multiple times?

This was an issue previously for us with setuptools since in a multi-threaded compile, there was a race in writing the object file.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a quick test, there's no difference in generated build.ninja files if I add a duplicate .c file to an extension module list of sources, so Meson seems to de-duplicate the input files. This shouldn't happen in the first place though?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meson explicitly deduplicates the list of source files for a target, so you can add the same source file multiple times. IIRC this is supposed to be a protection for the case where you have if/else logic defining the file list, and there's multiple "causes" that can result in a given file being included; like this, you do not have to check whether you already added it.

That being said, this occurs per-target, and each target has a private object directory, so if you add a duplicate .c file one time each to multiple extensions, it will compile it multiple times (it often has to, because each target i.e. an extension_module() can have different c_args and include directories that can change the results of compilation). But, each time it compiles it, it is smart enough to write to a different output file.

This was an issue previously for us with setuptools since in a multi-threaded compile, there was a race in writing the object file.

It's "impossible" for Meson to have races in writing output files, since each compile rule specifies the name of its output file(s) and Meson will error out during configure time if multiple compile rules have the same output file, and ask you to rename it.

@@ -0,0 +1,179 @@
# TODO: can this be removed, I wish meson copied .pyx source to the build dir automatically

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to get rid of __init__.py file copying after cython/cython#4906 is in a released Cython version.

For the .pxd files, are you copying them because they need to work with generated .pyx files? Or because you want to install them, or for some other reason?

I've been fighting with Cython quite a lot, but I've never found the need to copy .pyx files to the build dir - why does that help?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generated .pxi files are in the build directory since meson forces out-of-tree builds. Unfortunately, that means the Cython will not find them by default, since it only looks in the current directory where the .pyx files are.

I'm trying to find an option for the cython command that tells it to look in other directories, but haven't found any luck there, so I'm hacking around it by copying and compiling the pyx files inside the build directory.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, that makes sense. Generated Cython code is a pain, we're trying to actively improve that in Cython and Meson. SciPy has a lot of generated Cython files; we use a Meson generator to automate dealing with copying files to the build dir: https://github.com/scipy/scipy/blob/8837756d06218936353895e6c30e5ca2b12c2e14/scipy/meson.build#L201-L211

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this is something that structured sources would fix? (mesonbuild/meson#9339)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not that confident to be honest; it's pretty complex and generated file objects are sometimes not treated the same as files specified as strings. Also, the conceptual issue is that it's baking in what I consider a design flaw in Cython; I'm more enthusiastic about fixing that. If with --module-name we get rid of needing to care about __init__.py files, and we can get Meson to understand what .pxd and .pxi files are and generate the appropriate --include-dir argument for them, then Cython is no longer structured.

meson.build Outdated
license: 'BSD-3',
meson_version: '>=0.63',
default_options: [
'buildtype=debugoptimized',
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this normal to do this with meson? With setuptools you build release by default and have to opt in to debug

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meson's default buildtype is "debug" (especially useful for getting started on a wide variety of generic C/C++ projects), but e.g. the PEP 517 builder will then set a release buildtype. So this poc, when built with pip install, will still build release by default. But running meson setup && ninja will use this setting to build with debug.

People running meson setup instead of pip install are probably hacking on pandas anyway, so that might not be a bad distinction.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll note that numpy.distutils does build in debug mode by default (it has hardcoded -g flags), and it indeed seems like a sensible default for development.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense in theory, but in practice to limit the scope of changes I would suggest we still now build release by default and come back to a later change to make a clear delineation between when we want a debug versus release build.

@rgommers
Copy link

rgommers commented Sep 2, 2022

@lithomas1 maybe you want to take over rgommers@9cd411e? With that plus the PR I just opened, this PR builds with zero warnings (at least with GCC on Linux).

@lithomas1 lithomas1 marked this pull request as ready for review September 20, 2022 21:49
@lithomas1 lithomas1 marked this pull request as draft September 20, 2022 21:52
@lithomas1 lithomas1 closed this Nov 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants