-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Meson poc #19
Conversation
pandas/_libs/meson.build
Outdated
'arrays', | ||
cython_sources['arrays.pyx'], | ||
include_directories: [inc_np], | ||
dependencies: py_dep, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on an SciPy discussion with @rgommers, Meson 0.63.0 now implicitly adds that py_dep for you. This reduces boilerplate in build definitions, as long as you bump your minimum requirement to the latest Meson release. Of course, the SciPy example you based this on, predates this change, so SciPy has not adapted to this yet or raised the minimum build requirements.
As a brand-new Meson port, you'd be ideally placed to take advantage of this and in doing so avoid adding that boilerplate to any version of meson.build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the heads-up, will include it in my next clean-up commit.
pandas/_libs/tslibs/meson.build
Outdated
endforeach | ||
|
||
# TODO: build a shared library for np_datetime and np_datetime_strings | ||
# and use lto? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shared libraries can be tricky in python packaging, so another option is to just build it into a static_library()
once, and link that static library to each extension that needs it. There would still be a copy of the code statically included in each extension that uses it, but you'd avoid compiling it multiple times.
Meson will handle the lto for you if you set b_lto=true
in the default_options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I specify a .c source for example twice, is meson/ninja smart enough to avoid compiling it multiple times?
This was an issue previously for us with setuptools since in a multi-threaded compile, there was a race in writing the object file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a quick test, there's no difference in generated build.ninja
files if I add a duplicate .c
file to an extension module list of sources, so Meson seems to de-duplicate the input files. This shouldn't happen in the first place though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meson explicitly deduplicates the list of source files for a target, so you can add the same source file multiple times. IIRC this is supposed to be a protection for the case where you have if/else logic defining the file list, and there's multiple "causes" that can result in a given file being included; like this, you do not have to check whether you already added it.
That being said, this occurs per-target, and each target has a private object directory, so if you add a duplicate .c
file one time each to multiple extensions, it will compile it multiple times (it often has to, because each target i.e. an extension_module()
can have different c_args and include directories that can change the results of compilation). But, each time it compiles it, it is smart enough to write to a different output file.
This was an issue previously for us with setuptools since in a multi-threaded compile, there was a race in writing the object file.
It's "impossible" for Meson to have races in writing output files, since each compile rule specifies the name of its output file(s) and Meson will error out during configure time if multiple compile rules have the same output file, and ask you to rename it.
@@ -0,0 +1,179 @@ | |||
# TODO: can this be removed, I wish meson copied .pyx source to the build dir automatically |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to get rid of __init__.py
file copying after cython/cython#4906 is in a released Cython version.
For the .pxd
files, are you copying them because they need to work with generated .pyx
files? Or because you want to install them, or for some other reason?
I've been fighting with Cython quite a lot, but I've never found the need to copy .pyx
files to the build dir - why does that help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The generated .pxi files are in the build directory since meson forces out-of-tree builds. Unfortunately, that means the Cython will not find them by default, since it only looks in the current directory where the .pyx files are.
I'm trying to find an option for the cython
command that tells it to look in other directories, but haven't found any luck there, so I'm hacking around it by copying and compiling the pyx files inside the build directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, that makes sense. Generated Cython code is a pain, we're trying to actively improve that in Cython and Meson. SciPy has a lot of generated Cython files; we use a Meson generator
to automate dealing with copying files to the build dir: https://github.com/scipy/scipy/blob/8837756d06218936353895e6c30e5ca2b12c2e14/scipy/meson.build#L201-L211
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think this is something that structured sources would fix? (mesonbuild/meson#9339)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not that confident to be honest; it's pretty complex and generated file objects are sometimes not treated the same as files specified as strings. Also, the conceptual issue is that it's baking in what I consider a design flaw in Cython; I'm more enthusiastic about fixing that. If with --module-name
we get rid of needing to care about __init__.py
files, and we can get Meson to understand what .pxd
and .pxi
files are and generate the appropriate --include-dir
argument for them, then Cython is no longer structured.
meson.build
Outdated
license: 'BSD-3', | ||
meson_version: '>=0.63', | ||
default_options: [ | ||
'buildtype=debugoptimized', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this normal to do this with meson? With setuptools you build release by default and have to opt in to debug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meson's default buildtype is "debug" (especially useful for getting started on a wide variety of generic C/C++ projects), but e.g. the PEP 517 builder will then set a release buildtype. So this poc, when built with pip install
, will still build release by default. But running meson setup && ninja
will use this setting to build with debug.
People running meson setup instead of pip install are probably hacking on pandas anyway, so that might not be a bad distinction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll note that numpy.distutils
does build in debug mode by default (it has hardcoded -g
flags), and it indeed seems like a sensible default for development.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense in theory, but in practice to limit the scope of changes I would suggest we still now build release by default and come back to a later change to make a clear delineation between when we want a debug versus release build.
@lithomas1 maybe you want to take over rgommers@9cd411e? With that plus the PR I just opened, this PR builds with zero warnings (at least with GCC on Linux). |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.Build & install by doing (in the pandas directory)
Builds are automatically multithreaded(thanks to ninja).
Known issues: