Skip to content

Support editable installs of Cython pxd files #516

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Nov 29, 2023

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented Oct 1, 2023

Editable installs of packages containing Cython currently do not properly support downstream Cython packages finding the pxd files from the editable install. The issue is that editable installs are managed using a meta path finder, and Cython does not appear to support that for pxd files. However, Cython does support pth files adding include paths directly, so this PR solves the pxd search issue by adding the necessary paths directly to the pth file that currently adds the meta path finder.

Resolves #442

@vyasr
Copy link
Contributor Author

vyasr commented Oct 1, 2023

@LecrisUT @henryiii here's my first pass at working on #442. I'm running into an issue with it that I can't quite understand though. The scaffolding for the MRE is in place, but for some reason I get a build failure of the first package (pkg1) when I turn off build isolation (see https://github.com/scikit-build/scikit-build-core/pull/516/files#diff-d5c95d2fd1275d136b05a4336ce0df3dea361e1003672881d6068d5e05a6385dR56). @henryiii would you have an idea what that's about? I suspect that it's a paths issue because I see output indicating that the Cython compilation succeeded, but then the actual C compilation step fails saying that the one.c file is missing, which to me indicates that something is wrong with the paths and it's being put in the wrong place. When I inspect the temp directories pytest produced for the test I don't find the Cython-generated C file anywhere. However, the isolated build of pkg1 works just fine.

Once we sort that out, uncommenting the installation of pkg2 below that should demonstrate the issue. I've parametrized the test on editability, so in theory we should see one passed test (when editable mode is off) and one failed test (when editable mode is on).

Ignore the above, see my next comment. I resolved the issue I was originally running into.

The predefined pytest fixtures aren't really designed for multiple packages being installed in a single test, so I had to import some stuff manually from conftest to produce a reasonable test. Happy to look into cleaner solutions later, but for now I don't think it's critical for an initial demo of the problem.

To make it easier to see the differences between my PR and @LecrisUT's original that this is based off of (#399) I've opened a corresponding PR from my branch onto his at LecrisUT#1. That PR can be used to view just the relevant diff, while this PR will run CI for us.

@vyasr
Copy link
Contributor Author

vyasr commented Oct 1, 2023

OK weird, I found the issue although I don't quite understand the underlying cause. It seems like there's a bug where it works with ninja but not make? Installing ninja into the environment allows the unisolated build to run. On the surface that sounds like a CMake bug, but given that it has to do with Python editable installs and build isolation maybe it's a situation where scikit-build-core needs to pass some extra information to CMake when the generator is Unix Makefiles.

@vyasr
Copy link
Contributor Author

vyasr commented Oct 1, 2023

@LecrisUT 64b615d shows the error arising with the original layout, while 31c27cc shows that the same error appears when switching to a src/ layout for both packages. Locally I've tested rebasing out your changes from #399 and I see the same error as well.

@LecrisUT
Copy link
Collaborator

LecrisUT commented Oct 1, 2023

Quick note about src-layout, it's only relevant tothe original source files. The installed files have to strip out the src/ prefix. Built files don't care about src format either. It should look like

$ tree
.
├── pkg1
│   ├── CMakeLists.txt
│   ├── src
│   │   └── pkg1
│   │       └── one.pyx
│   ├── python
│   │   └── pkg1
│   │       ├── __init__.py  # This file is empty
│   │       └── one.pxd
│   └── pyproject.toml
└── pkg2
    ├── CMakeLists.txt
    ├── src
    │   └── pkg2
    │       └── two.pyx
    ├── python
    │   └── pkg2
    │       └── __init__.py  # This file is empty
    └── pyproject.toml
#CMakeLists.txt
add_custom_command(
    OUTPUT pkg1/one.c
    DEPENDS pkg1/one.pyx
    VERBATIM
    COMMAND "${CYTHON}" "${CMAKE_CURRENT_SOURCE_DIR}/src/pkg1/one.pyx" --output-file
    "${CMAKE_CURRENT_BINARY_DIR}/pkg1/one.c")

python_add_library(one MODULE "${CMAKE_CURRENT_BINARY_DIR}/pkg1/one.c"
    WITH_SOABI)

install(TARGETS one DESTINATION pkg1/)
#pyproject.toml
[tool.scikit-build]
wheel.packages = ["python/pkg1"]

The key part her is that .pyx is not installed through the pyproject.toml file.

Although now that I am looking at the error more carefully it is odd, it should still be able to pick it up

@henryiii
Copy link
Collaborator

henryiii commented Oct 6, 2023

#399 is in, but it doesn't work if there's an __init__.py or on Python 3.9. So this probably won't be working yet.

@LecrisUT
Copy link
Collaborator

LecrisUT commented Oct 6, 2023

#399 is in, but it doesn't work if there's an __init__.py or on Python 3.9. So this probably won't be working yet.

This one does not use importlib.resources, so just the additional submodule_search_path should in principal make this navigation doable as long as it only navigates within python recognized source files.

I think the issue is that there is some priority to how the the modules are loaded against .pxd + .so since the modules have the same name, or it could be using Path since it tests against both pkg1/one.pxd and pkg1/one/one.pxd. Maybe solving the __init__.py compatibility would also resolve the source of the issue for this one.

@vyasr
Copy link
Contributor Author

vyasr commented Oct 11, 2023

A bit swamped this week but will try to get back into this in the next week or two.

@vyasr vyasr force-pushed the feat/editable_cython branch from 31c27cc to 480d720 Compare October 27, 2023 23:29
@vyasr
Copy link
Contributor Author

vyasr commented Oct 27, 2023

@henryiii OK, when you get a chance could you give me some pointers on how you think this feature should be implemented? You seemed to have some thoughts on #442 about what we should and shouldn't do that I didn't follow.

@vyasr
Copy link
Contributor Author

vyasr commented Oct 28, 2023

Depending on your level of familiarity with Cython, we may need to get some help from a Cython developer here. From a quick inspection, it seems like since Cython is essentially a preprocessing transpiler of the pyx files, I'm not sure whether cimport statements respect meta path modifications done via the mechanisms used by the _$pkg_editable.py file. Naively I might have expected them to do so, but if I add some print statements inside find_spec I'm not seeing them during the Cython compilation step.

Furthermore, pxd files would change the 1-1 mapping from module to source file into a many-to-one mapping. I've done that by manually modifying the installed _pkg1_editable.py file and it works in the specific test case, but it makes the behavior ill-defined if find_spec goes into the if fullname in self.known_source_files block and then has to only return a single spec from a list of files, which doesn't make sense. We need some way to differentiate pxd from py files, and since find_spec doesn't provide that it makes me extra skeptical that this is the right mechanism for making pxd files searchable.

On the issue you also commented

I'm pretty sure you don't want to "install" the header files, you just want to be able to find them in-place. Similarly, I'd expect you'd not want to "install" .pyx files in your CMakeLists, only the .so's need to be "installed". Ahh, you are probably trying to put them in-place, so they get auto-discovered. Not fond of that (mixing compiled and .py files in source expecting a different mixture in the wheel), but it should be possible with a few wheel ignores (**.pyx).

I agree that the pyx files don't need to be added at all. I didn't do anything special to add them, but we could try and clean up whatever in scikit-build-core is doing that. FWIW one hacky solution I tried was to replace the pyx file with the pxd file in the known source files provided to the install function, but that didn't fix Cython compilation which again suggests to me that Cython isn't actually using this finder.

@LecrisUT
Copy link
Collaborator

Furthermore, pxd files would change the 1-1 mapping from module to source file into a many-to-one mapping. I've done that by manually modifying the installed _pkg1_editable.py file and it works in the specific test case

Seems you've identified the issue and it might be a way to approach it. I'm not familiar with cython, so I could use a breakdown of how you've edited, maybe a tree view of the files and the dictionary passed to the editable file that makes it work.

From what I read, we should filter the modules that have .pyx and not parse them in the editable file?

@vyasr
Copy link
Contributor Author

vyasr commented Nov 7, 2023

Sorry, my wording was confusing. When I said "and it works in this specific case", I was referring to the fact that in general it would not be well-defined for the known_source_files argument to the ScikitBuildRedirectingFinder to be a dict[str, list[str]] instead of a dict[str, str] because there needs to be a unique file to associate with the module. However, in hindsight I think this is a non-issue because the reason you end up with multiple files in the current situation is that we include pyx files, and we really shouldn't be including them at all. So in a world where the module loader approach worked for Cython, I think just removing pyx files from the source list while including pxd files when they are available would be sufficient. However, as I said above I'm not yet convinced that the module loader approach is even valid for Cython.

From what I read, we should filter the modules that have .pyx and not parse them in the editable file?

Essentially Cython can be thought of as a compiled superset of Python. In analogy to C, .pyx files are akin to .c source files, and .pxd files are akin to .h header files. Since every source file is compiled into a shared object, a given pyx file module's lookup in the redirecting finder should always be to an .so file in the known_wheel_files list. You should never include the .pyx files in the known_source_files list because .pyx files aren't directly importable; known_source_files should only contain .py files.

Cython supports cimport statements, which are basically import statements for other Cython modules. In order for one Cython module to use another one in certain contexts (basically any context that isn't pure Python) it has to cimport rather than import that module. However, Cython modules are not cimportable by default. To enable that, you have to put declarations into a corresponding pxd file (unlike C, this isn't an optional separation; the Cython compiler will not allow you to directly import from pyx at all, unlike a C compiler where this is possible but bad practice).

To debug, I've created a local environment with scikit-build-core, then done an editable install of pkg1 into it (which works fine). I then do an editable install of pkg2, which fails as expected. The following approaches all work to remedy this issue:

  • Set the PYTHONPATH directly to the pkg1/src directory, i.e. PYTHONPATH=/path/to/pkg1/src pip install -e . --no-build-isolation --config-settings=build_dir=build/{wheel_tag} -vv in the pkg2 directory
  • Add a test.pth file into my site-packages dir containing /path/to/pkg1/src
  • Add a test.pth file into my site-packages dir containing import mypath, and then adding another file mypath.py to site-packages that just does import sys;sys.path.append("/path/to/pkg1/src")

However, if I add a test.pth file into my site-packages dir containing import mypath and then have mypath.py containing:

import importlib.abc
import sys


class Finder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path=None, target=None):
        print(f"find_spec: {fullname=}, {path=}, {target=}", file=sys.stderr)
        return None


sys.meta_path.insert(0, Finder())

None of the prints of "find_spec..." mention pkg1 during Cython compilation. All of the mentions that I do see are of Python builds or Cython internals, suggesting that since Cython is itself implemented in Python we're seeing the bits that imports in order to do the transpiling, but the transpiling itself does not respect modifications of sys.meta_path.

Does that make sense?

@vyasr vyasr force-pushed the feat/editable_cython branch from 751eac1 to 3ce543b Compare November 20, 2023 19:18
@vyasr
Copy link
Contributor Author

vyasr commented Nov 20, 2023

All right, after some discussion at the last dev meeting we came up with an alternative solution that involves just adding the extra paths directly to the pth file. My diagnosis in #516 (comment) is correct, and the new solution leverages the fact that Cython does support pth files for paths even though it does not support meta path finders. I will follow up with Cython to try and get that support added in the future, but this solution seems workable in scikit-build-core for now.

@henryiii I've added a test, but I'm not sure how to get it to pass. How should I get installation of scikit-build-core in the virtual environment to install the current local package instead of pulling from PyPI? We could invoke pip install /path/to/scikit-build-core, but I'm not sure if that is the best option or if this is a problem that's already solved in other tests that I'm not seeing. I've verified that this solution works if I manually replicate the test steps into my own environments.

@vyasr vyasr marked this pull request as ready for review November 20, 2023 21:46
@henryiii henryiii mentioned this pull request Nov 29, 2023
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
@henryiii henryiii force-pushed the feat/editable_cython branch from 903f951 to 3b6f925 Compare November 29, 2023 18:20
@vyasr
Copy link
Contributor Author

vyasr commented Nov 29, 2023

Thanks for the fix! I guess the isolated fixture and the associated pep518 wheelhouse are designed exactly for the use case of pulling the current version of the code, makes sense. Let me know if you need anything else from me on this PR.

@henryiii
Copy link
Collaborator

Yes, they are. :)

@henryiii henryiii merged commit 91d9e60 into scikit-build:main Nov 29, 2023
@vyasr vyasr deleted the feat/editable_cython branch November 29, 2023 22:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support editable installs of packages containing Cython declaration files
3 participants