Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1st class Cython language support #8706

Merged
merged 13 commits into from
Jun 7, 2021

Conversation

dcbaker
Copy link
Member

@dcbaker dcbaker commented Apr 27, 2021

Okay, I stopped thinking about it and started writing code. This is not complete, there's a fair number of things not done (listed below). Currently, this is just enough to build one trivial test case, but I think it touches all the places that need to be changed, even if there is more work to do. I'm not sure how much more time I'll be able to spend on this, but here's a starting point if someone else wants to take it up.

TODO:

  • handle a single static source
  • handle a single generated source
  • handle a mixture of static and generated sources
  • work out the best way to handle python 2 support (if it's needed at all)
  • work out the best way to handle Cython -> C++
  • deal with the fact that filesystem layout matters (also a problem for Rust and Zig)

@lgtm-com
Copy link

lgtm-com bot commented Apr 27, 2021

This pull request introduces 1 alert when merging 33e06a7 into a493c95 - view on LGTM.com

new alerts:

  • 1 for Unused import

@dcbaker
Copy link
Member Author

dcbaker commented Apr 27, 2021

For C++ I'm think using a -Dcython_language=(c|cpp) might be best, then if you want to override it, you can do something like:

cpp_files = static_library(
  'static_library
  ['foo.pyx'],
  override_options : ['cython_language=cpp'],
)

python.extension_module(..., link_with : static_lib)

@lgtm-com
Copy link

lgtm-com bot commented Apr 27, 2021

This pull request introduces 1 alert when merging a5bb3eb into a493c95 - view on LGTM.com

new alerts:

  • 1 for Unused import

@rgommers
Copy link
Contributor

rgommers commented May 2, 2021

I've been testing this PR in https://github.com/rgommers/scipy/tree/meson, which starts porting SciPy submodules to Meson. So far everything works as expected.

@dcbaker
Copy link
Member Author

dcbaker commented May 3, 2021

I think we'll need to solve #8725 to completely make cython a 1st class language, but I'm glad this is step in the right direction.

@@ -1757,6 +1759,30 @@ def detect_cs_compiler(self, for_machine):

self._handle_exceptions(popen_exceptions, compilers)

def detect_cython_compiler(self, for_machine: MachineChoice) -> CythonCompiler:
"""Search for a cython compiler."""
compilers = self.lookup_binary_entry(for_machine, 'cython')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether this should better use the cython or the cythonize frontend?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure it must be cython. Parallel builds, caching, etc. should come from Meson not Cython, and we want Meson to invoke the C/C++ compiler rather than have Cython use setuptools under the hood.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to build on cython not cythonize, we can easily generate a .pyx -> .c rule and a second .c -> .o rule ourselves. Vala already works this way.

@jpakkane
Copy link
Member

How useful would this be in the main tree without the generated code bits? I.e. could that be dropped from this MR's scope and added afterwards?

@dcbaker
Copy link
Member Author

dcbaker commented May 19, 2021

I could fix the generated code bits to at least be on par with Rust pretty easily, ie you could have a purely generated or purely static build, but not a mix. I would be happy to land this at that point, and we can finish the structured input stuff separately. Let me update this today.

@dcbaker dcbaker force-pushed the wip/2021-04/cython-language branch from a5bb3eb to ef5327f Compare May 19, 2021 20:19
@dcbaker
Copy link
Member Author

dcbaker commented May 19, 2021

This is now based on #8735

@dcbaker
Copy link
Member Author

dcbaker commented May 19, 2021

So, the only things that don't work currently that probably should are:

  1. we always output C not C++
  2. we always compile in python3 mode, with no way to get python2 mode.

@lgtm-com
Copy link

lgtm-com bot commented May 19, 2021

This pull request introduces 1 alert when merging ef5327f into 4ca9a16 - view on LGTM.com

new alerts:

  • 1 for Unused import

@dcbaker dcbaker force-pushed the wip/2021-04/cython-language branch 2 times, most recently from 5f1ab76 to e1e1e2a Compare May 19, 2021 21:04
@dcbaker dcbaker marked this pull request as ready for review May 19, 2021 21:04
@dcbaker dcbaker requested a review from jpakkane as a code owner May 19, 2021 21:04
@scoder
Copy link

scoder commented May 19, 2021

So, the only things that don't work currently that probably should are:
1. we always output C not C++

This is something that the directive # distutils: language=c++ would solve, which the cythonize frontend picks up. The cython frontend is intentionally dumb and requires this option on the command line. (Although there is a bit of room for debate on this difference.)

2. we always compile in python3 mode, with no way to get python2 mode.

The can always be changed per-file with the language_level directive. (If I understand correctly that that's what you are referring to here.)

@dcbaker dcbaker force-pushed the wip/2021-04/cython-language branch from 26ad37b to 3f0d023 Compare May 19, 2021 22:59
@dcbaker
Copy link
Member Author

dcbaker commented May 19, 2021

I've added the an option for setting the python target (2 vs 3). I've called it cython_language, but I'm opened to painting the bike some other color.

@dcbaker
Copy link
Member Author

dcbaker commented May 19, 2021

For C vs C++ I have something that works, but I have some annoying corners to sort out.

@dcbaker
Copy link
Member Author

dcbaker commented May 19, 2021

@rgommers, for y'all how important is C++ support? I.e, can we land support for cython without it and get back to that later?

@rgommers
Copy link
Contributor

@rgommers, for y'all how important is C++ support? I.e, can we land support for cython without it and get back to that later?

It's a small percentage of all files, so seems fine to land without it - we can still use custom_target for the .pyx files that target C++.

Here is the list of files containing "distutils: language=c++". Note that most are in .pxd files, which we don't need special handling for (they simply get used when you run cython --cplus on some .pyx file).

$ rg "distutils: language"
scipy/stats/_qmc_cy.pyx
6:# distutils: language = c++

scipy/stats/biasedurn.pyx.templ
1:# distutils: language = c++

scipy/spatial/ckdtree.pyx
7:# distutils: language = c++

scipy/stats/_boost/include/code_gen.py
60:            # distutils: language = c++

scipy/stats/_boost/include/templated_pyufunc.pxd
1:# distutils: language = c++

scipy/special/_generate_pyx.py
1329:        f.write("\n# distutils: language = c++\n")

scipy/optimize/_highs/cython/src/HighsLp.pxd
1:# distutils: language=c++

scipy/optimize/_highs/cython/src/_highs_wrapper.pyx
1:# distutils: language=c++

scipy/optimize/_highs/cython/src/HConst.pxd
1:# distutils: language=c++

scipy/optimize/_highs/cython/src/HighsModelUtils.pxd
1:# distutils: language=c++

scipy/optimize/_highs/cython/src/HighsOptions.pxd
1:# distutils: language=c++

scipy/optimize/_highs/cython/src/Highs.pxd
1:# distutils: language=c++

scipy/optimize/_highs/cython/src/HighsIO.pxd
1:# distutils: language=c++

scipy/optimize/_highs/cython/src/highs_c_api.pxd
1:# distutils: language=c++

scipy/optimize/_highs/cython/src/HighsStatus.pxd
1:# distutils: language=c++

scipy/optimize/_highs/cython/src/_highs_constants.pyx
1:# distutils: language=c++

scipy/optimize/_highs/cython/src/SimplexConst.pxd
1:# distutils: language=c++

scipy/optimize/_highs/cython/src/HighsRuntimeOptions.pxd
1:# distutils: language=c++

scipy/optimize/_highs/cython/src/HighsLpUtils.pxd
1:# distutils: language=c++

scipy/optimize/_highs/cython/src/HighsInfo.pxd
1:# distutils: language=c++

Here is the list of all our Cython files:

$ find . -type f -name "*.pyx"
./cluster/_vq.pyx
./cluster/_optimal_leaf_ordering.pyx
./cluster/_hierarchy.pyx
./ndimage/src/_cytest.pyx
./ndimage/src/_ni_label.pyx
./linalg/cython_lapack.pyx
./linalg/_matfuncs_sqrtm_triu.pyx
./linalg/_decomp_update.pyx
./linalg/_solve_toeplitz.pyx
./linalg/cython_blas.pyx
./optimize/_trlib/_trlib.pyx
./optimize/_group_columns.pyx
./optimize/_highs/cython/src/_highs_constants.pyx
./optimize/_highs/cython/src/_highs_wrapper.pyx
./optimize/_lsq/givens_elimination.pyx
./optimize/cython_optimize/_zeros.pyx
./optimize/_bglu_dense.pyx
./io/matlab/mio_utils.pyx
./io/matlab/streams.pyx
./io/matlab/mio5_utils.pyx
./_lib/_ccallback_c.pyx
./_lib/_test_deprecation_def.pyx
./_lib/_test_deprecation_call.pyx
./_lib/messagestream.pyx
./special/_ufuncs_cxx.pyx
./special/cython_special.pyx
./special/_ellip_harm_2.pyx
./special/_comb.pyx
./special/_test_round.pyx
./special/_ufuncs.pyx
./fftpack/convolve.pyx
./interpolate/interpnd.pyx
./interpolate/_bspl.pyx
./interpolate/_ppoly.pyx
./sparse/csgraph/_shortest_path.pyx
./sparse/csgraph/_traversal.pyx
./sparse/csgraph/_flow.pyx
./sparse/csgraph/_tools.pyx
./sparse/csgraph/_matching.pyx
./sparse/csgraph/_reordering.pyx
./sparse/csgraph/_min_spanning_tree.pyx
./sparse/_csparsetools.pyx
./spatial/ckdtree.pyx
./spatial/_voronoi.pyx
./spatial/_hausdorff.pyx
./spatial/transform/rotation.pyx
./spatial/qhull.pyx
./signal/_max_len_seq_inner.pyx
./signal/_peak_finding_utils.pyx
./signal/_upfirdn_apply.pyx
./signal/_spectral.pyx
./signal/_sosfilt.pyx
./stats/_boost/src/beta_ufunc.pyx
./stats/_boost/src/nbinom_ufunc.pyx
./stats/_boost/src/binom_ufunc.pyx
./stats/_stats.pyx
./stats/_qmc_cy.pyx
./stats/_sobol.pyx
./stats/biasedurn.pyx

@scoder
Copy link

scoder commented May 20, 2021

files containing "distutils: language=c++". Note that most are in .pxd files

The distutils directive isn't looked at in .pxd files, simply because they are not handled by distutils.

@scoder
Copy link

scoder commented May 20, 2021

how important is C++ support? I.e, can we land support for cython without it and get back to that later?

Is there a reason not to use the cythonize frontend? For the simple case of compiling one source file, it's just as good as the cython frontend, but it considers the #distutils directive. If you pass --force, it will also ignore the file timestamps and always overwrite what's there.

@rgommers
Copy link
Contributor

Is there a reason not to use the cythonize frontend? For the simple case of compiling one source file, it's just as good as the cython frontend,

Not sure I agree with that. cythonize invokes setuptools under the hood right? Invoking a second build system from within another build system is not a good idea, you're going to get different compile flags, issues with caching, and all sorts of hard to debug issues down the line.

@scoder
Copy link

scoder commented May 20, 2021

cythonize invokes setuptools under the hood right?

Only if you ask it to build a complete extension module for you. If you just invoke it like cythonize module.pyx, without -i or -b, then it will just generate the .c/.cpp file for it and be done.

@rgommers
Copy link
Contributor

If you just invoke it like cythonize module.pyx, without -i or -b, then it will just generate the .c/.cpp file for it and be done.

Ah, didn't realize that. It seems to be missing flags though, like --output-file (this one is a hard necessity, because Meson doesn't allow in-tree builds) and --include-dir. It's not a superset of cython it seems, just a different thing altogether?

@scoder
Copy link

scoder commented May 20, 2021

It seems to be missing flags though, like --output-file

Oh, yeah, right. That sort-of goes against the idea of a "here's a bunch of files, do the right thing" kind of tool. Single file compilation is the normal case for cython, but more of a special case for cythonize. All of these settings go into generic options.

For the include path, there's -s include_path=…, for the output directory -s output_dir=…, for a single output file -s output_file=….

@jpakkane
Copy link
Member

What's the current plan on getting this merged?

@dcbaker
Copy link
Member Author

dcbaker commented May 31, 2021

I need to get the CI passing, then it's ready from my point of view

@dcbaker dcbaker force-pushed the wip/2021-04/cython-language branch from 92f86e3 to 0bc18f2 Compare June 7, 2021 16:26
@jpakkane
Copy link
Member

jpakkane commented Jun 7, 2021

@rgommers is this looking good for you?

@rgommers
Copy link
Contributor

rgommers commented Jun 7, 2021

@rgommers is this looking good for you?

Yes, I'm pretty happy with this. It works for all the simpler cases I'd expect it to work on. And the more complicated ones, like codegen of multiple files that depend on each other and other files in the tree, is the separate "structured inputs" PR gh-8775.

@jpakkane jpakkane merged commit 40e8a67 into mesonbuild:master Jun 7, 2021
@dcbaker dcbaker deleted the wip/2021-04/cython-language branch June 7, 2021 20:36
@rgommers
Copy link
Contributor

rgommers commented Jun 7, 2021

Awesome to have this merged. Thanks @dcbaker, all!

rgommers added a commit to rgommers/meson that referenced this pull request Jun 12, 2021
This is a follow-up to mesonbuildgh-8706, which contained the initial fix
to ninjabackend.py but somehow lost it. This re-applies the fix
and adds a test for it.

Without the fix, the error is:

  ninja: error: 'ct2.pyx', needed by 'libdir/ct2.cpython-39-x86_64-linux-gnu.so.p/ct2.pyx.c',
  missing and no known rule to make it
dcbaker pushed a commit that referenced this pull request Jun 14, 2021
This is a follow-up to gh-8706, which contained the initial fix
to ninjabackend.py but somehow lost it. This re-applies the fix
and adds a test for it.

Without the fix, the error is:

  ninja: error: 'ct2.pyx', needed by 'libdir/ct2.cpython-39-x86_64-linux-gnu.so.p/ct2.pyx.c',
  missing and no known rule to make it
@eli-schwartz eli-schwartz mentioned this pull request Apr 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants