feat: pyo3 support module prefix + naming #3726
UebelAndre
left a comment
Thanks! Just one question
load("//:defs.bzl", "pyo3_extension")

pyo3_extension(
    name = "module_prefix",
Why add these attributes and not change the target name to foo/bar?
I tried that first, e.g. for string_sum:
pyo3_extension(
    name = "foo/string_sum",
    srcs = ["string_sum.rs"],
    edition = "2021",
)
Building that target yields this error:
Error in fail: Crate name 'foo/string_sum' contains invalid character(s): /
There were several quirks in getting the names to line up (pymodule <-> output binary <-> crate name) and in getting the outputs into the right directory so that Python registers the extension as a nested module rather than a top-level module. Adding these (hopefully simple) knobs seemed like the cleanest approach.
I'm nervous about the amount of indirection this introduces. What is the use case here that directory structure can't be made to what you want?
I'm not sure I follow w.r.t. indirection. This change tackles two related issues on how the native extension gets imported:
(1)
Per the pyo3 docs, the lib name must match the pymodule definition.
[lib]
# The name of the native library. This is the name which will be used in Python to import the
# library (i.e. `import string_sum`). If you change this, you must also change the name of the
# `#[pymodule]` in `src/lib.rs`.
name = "string_sum"
As it stands, rules_rust uses the target name to determine the final extension file path. Imagine you have a Python library "foo" that has regular Python code plus a private extension you want to import as "_foo_internal". You'd have to declare your extension as follows:
pyo3_extension(
    name = "_foo_internal",
    # ...
)
The underscore prefix on the Bazel target implies that it's not a first-class target in the build graph, which is definitely not the case. Relatedly, build generators often want to follow consistent patterns for generated targets. So imagine we generated <name>_pyo3_ext for all pyo3 extensions: our Python import path would then carry the same suffix. The current behavior is tolerable but somewhat quirky and undesirable.
(2)
As for the second issue, to get a rust extension importable/loadable at a full module path you need to place the lib in the nested directory structure. Roughly:
import foo.bar      # --> foo/libbar.so
import foo.bar.baz  # --> foo/bar/libbaz.so
The current rules don't give you control over the output path of the extension, so the best you could do is to manually move the file in a subsequent target. But injecting my own copy on top of these rules would nuke the python providers (and the stub generation) as they just assume a root module.
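The mapping described above, from a dotted import path to a nested output location, can be sketched in a few lines of Python. This is only an illustration, not what the rules do: the helper name `extension_relpath` is hypothetical, and the file name is assumed to be the leaf module name plus a plain `.so` suffix, whereas the real suffix is platform- and interpreter-dependent.

```python
from pathlib import PurePosixPath


def extension_relpath(module: str, suffix: str = ".so") -> str:
    """Return the path, relative to an import root, for a dotted module path.

    Hypothetical helper: every component except the last becomes a
    directory, and the last component becomes the file name.
    """
    parts = module.split(".")
    # Reject empty components ("foo..bar") and anything that is not a
    # valid Python identifier ("foo/bar").
    if not all(part.isidentifier() for part in parts):
        raise ValueError(f"invalid module path: {module!r}")
    *packages, leaf = parts
    return str(PurePosixPath(*packages, leaf + suffix))


print(extension_relpath("foo.bar"))      # foo/bar.so
print(extension_relpath("foo.bar.baz"))  # foo/bar/baz.so
```

A top-level module such as `string_sum` has no package components, so it maps to a file directly under the import root.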
One thing I can think of to make the UX of this change a bit cleaner is to squash the two params into one, e.g. module = "foo.bar" and then the rules/macros can split accordingly.
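A rough sketch of that split, written in Python for illustration (a Starlark macro could likely do the same with `rpartition`). The function name `split_module` and the returned (prefix, name) pair are assumptions made for this example, not the attribute names used in the PR.

```python
from typing import Tuple


def split_module(module: str) -> Tuple[str, str]:
    """Split a fully qualified module path into (prefix, name).

    Hypothetical helper: the prefix is everything before the last dot
    and is empty for a top-level module.
    """
    prefix, _, name = module.rpartition(".")
    if not name.isidentifier():
        raise ValueError(f"invalid module name in {module!r}")
    return prefix, name


print(split_module("foo.bar"))      # ('foo', 'bar')
print(split_module("foo.bar.baz"))  # ('foo.bar', 'baz')
print(split_module("string_sum"))   # ('', 'string_sum')
```

With a single attribute, the leaf name would have to match the `#[pymodule]` name, and the prefix would determine the output directory.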
By indirection I mean a developer's ability to understand where a target came from. Seeing //:well_hello_there as a Bazel target and then seeing the.rebel.alliance in Python code is, I think, going to lead to more confusion than the alternate import path is worth. What is the story for how folks are expected to understand where code is being defined?
Seeing //:well_hello_there as a Bazel target and then seeing the.rebel.alliance in Python code
Sure, but that's a pretty egregious example I think. The naming is purposefully confusing and something folks can already do freely across most Bazel repos. Take rust_binary or any binary rule for that matter:
rust_binary(
    name = "well_hello_there",
    binary_name = "the.rebel.alliance.exe",
)
I don't want to configure my pyo3 extension targets in any purposefully confusing manner; I want to name them so the Bazel side is idiomatic for Bazel and the Python side is idiomatic for our Python usage.
I don't want to configure my pyo3 extension targets in any purposefully confusing manner; I want to name them so the Bazel side is idiomatic for Bazel and the Python side is idiomatic for our Python usage.
I'm not suggesting you're trying to do something malicious or anything haha. I've had a similar experience with pybind11 rules, and it was incredibly frustrating: two different teams adopted two different naming conventions, and debugging issues became harder for no meaningful reason. I opted to have them define a file with the name they wanted and re-export all symbols so there was a clear path to follow.
However, if this feature is strongly desired, I'd prefer a single module_name attribute (analogous to the crate_name attribute) that holds the fully qualified module name (prefix and name combined).
Adds support for specifying the module name + module prefix for pyo3 extensions.