@andyscott

Adds support for specifying the module name + module prefix for pyo3 extensions.

@andyscott force-pushed the ags/pyo3-module-path branch from 7a660ad to 3a98fed, November 13, 2025 01:56
@andyscott changed the title from "pyo3: support module prefix + naming" to "feat: pyo3 support module prefix + naming", Nov 13, 2025
@andyscott force-pushed the ags/pyo3-module-path branch from 35bc452 to e19bd9c, November 15, 2025 05:43
Collaborator

@UebelAndre left a comment

Thanks! Just one question

load("//:defs.bzl", "pyo3_extension")

pyo3_extension(
    name = "module_prefix",

Collaborator

Why add these attributes and not change the target name to foo/bar?

I tried that first, e.g. for string_sum:

pyo3_extension(
    name = "foo/string_sum",
    srcs = ["string_sum.rs"],
    edition = "2021",
)

Building that target yields this error:

Error in fail: Crate name 'foo/string_sum' contains invalid character(s): /

There were several quirks in getting the names to line up (pymodule <-> output binary <-> crate name) and in getting the outputs into the right directory so that Python registers the extension as a nested module rather than a top-level module. Adding these (hopefully simple) knobs seemed like the cleanest approach.

Collaborator

I'm nervous about the amount of indirection this introduces. What is the use case here where the directory structure can't be arranged the way you want?

@ags-openai Nov 18, 2025

I'm not sure I follow w.r.t. indirection. This change tackles two related issues on how the native extension gets imported:

(1)

Per the pyo3 docs, the lib name must match the pymodule definition.

[lib]
# The name of the native library. This is the name which will be used in Python to import the
# library (i.e. `import string_sum`). If you change this, you must also change the name of the
# `#[pymodule]` in `src/lib.rs`.
name = "string_sum"
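
To illustrate why these names have to line up: when CPython imports an extension module it looks up an entry point named after the last component of the import name, and pyo3 generates that entry point from the `#[pymodule]` name. A minimal Python sketch of the rule for ASCII module names (the helper `init_symbol` is made up for illustration and is not part of pyo3 or rules_rust):

```python
def init_symbol(import_name: str) -> str:
    """Return the C entry point CPython looks up for an extension module.

    Only the last dotted component matters. This sketch handles ASCII names
    only; non-ASCII names use a different, punycode-based scheme (PEP 489).
    """
    leaf = import_name.rpartition(".")[2]
    if not leaf.isascii() or not leaf.isidentifier():
        raise ValueError(f"unsupported module name: {import_name!r}")
    return f"PyInit_{leaf}"


print(init_symbol("string_sum"))
print(init_symbol("foo._foo_internal"))
```

If the library file is named for one module but exports the entry point of another, the import fails because the expected symbol is missing.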

As it stands, rules_rust uses the target name to determine the final extension file path. Imagine you have a Python library "foo" that has regular Python code plus a private extension you want to import as "_foo_internal". You'd have to declare your extension as follows:

pyo3_extension(
    name = "_foo_internal",
    # ...
)

The underscore prefix on the Bazel target implies that it's not a first-class target in the build graph, when this is definitely not the case. Relatedly, build generators often want to follow nice patterns for generated targets. So imagine we generated <name>_pyo3_ext for all pyo3 extensions: our Python import path would then carry the same suffix. The current behavior is tolerable but somewhat quirky and undesirable.

(2)

As for the second issue, to get a rust extension importable/loadable at a full module path you need to place the lib in the nested directory structure. Roughly:

import foo.bar      # --> foo/libbar.so
import foo.bar.baz  # --> foo/bar/libbaz.so

The current rules don't give you control over the output path of the extension, so the best you could do is to manually move the file in a subsequent target. But injecting my own copy on top of these rules would nuke the Python providers (and the stub generation), since the rules just assume a root-level module.
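
As a rough illustration of the layout requirement in (2), the sketch below lists the relative file paths under which the running interpreter would accept a native module for a given dotted import name. `importlib.machinery.EXTENSION_SUFFIXES` is standard library; the helper `extension_candidates` is a hypothetical name, and the exact suffixes depend on the interpreter and platform.

```python
from importlib.machinery import EXTENSION_SUFFIXES
from pathlib import PurePosixPath


def extension_candidates(import_name: str) -> list[str]:
    """List relative paths that would satisfy `import <import_name>`.

    Each parent component becomes a directory; the last component becomes
    the file stem, followed by one of the interpreter's extension suffixes.
    """
    parts = import_name.split(".")
    if not all(part.isidentifier() for part in parts):
        raise ValueError(f"invalid module name: {import_name!r}")
    parent = PurePosixPath(*parts[:-1])
    return [str(parent / (parts[-1] + suffix)) for suffix in EXTENSION_SUFFIXES]


for candidate in extension_candidates("foo.bar.baz"):
    print(candidate)
```

The parent directories (`foo/` and `foo/bar/` here) also have to be importable as packages for the nested import to resolve.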

One thing I can think of to make the UX of this change a bit cleaner is to squash the two params into one, e.g. module = "foo.bar" and then the rules/macros can split accordingly.
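
A minimal sketch of that split, written in Python for illustration (the real rules would do this in Starlark). The `module` value and both helper names are assumptions, not attributes or functions that exist in rules_rust today:

```python
def split_module(module: str) -> tuple[str, str]:
    """Split "foo.bar" into (prefix, name) == ("foo", "bar").

    A module without dots has an empty prefix, i.e. a top-level module.
    """
    parts = module.split(".")
    if not all(part.isidentifier() for part in parts):
        raise ValueError(f"invalid module path: {module!r}")
    prefix, _, name = module.rpartition(".")
    return prefix, name


def output_dir(prefix: str) -> str:
    """Map a dotted prefix to the directory the extension would be placed in."""
    return prefix.replace(".", "/")


prefix, name = split_module("foo.bar.baz")
print(prefix, name, output_dir(prefix))
```

Keeping a single dotted value means the Python import path appears verbatim in the BUILD file, which makes it easy to grep for.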

Collaborator

By indirection I mean a developer's ability to understand where a target came from. Seeing //:well_hello_there as a Bazel target and then seeing the.rebel.alliance in Python code is, I think, going to lead to more confusion than the alternate import path is worth. What is the story for how folks are expected to understand where code is being defined?

@ags-openai Nov 18, 2025

> Seeing //:well_hello_there as a Bazel target and then in Python code seeing the.rebel.alliance

Sure, but that's a pretty egregious example I think. The naming is purposefully confusing and something folks can already do freely across most Bazel repos. Take rust_binary or any binary rule for that matter:

rust_binary(
    name = "well_hello_there",
    binary_name = "the.rebel.alliance.exe",
)

I don't want to configure my pyo3 extension targets in any purposefully confusing manner; I want to name them so the Bazel side is idiomatic for Bazel and the Python side is idiomatic for our Python usage.

Collaborator

> I don't want to configure my pyo3 extension targets in any purposefully confusing manner; I want to name them so the Bazel side is idiomatic for Bazel and the Python side is idiomatic for our Python usage.

I'm not suggesting you're trying to do something malicious or anything haha. I had a similar experience with pybind11 rules, and it was incredibly frustrating: two different teams adopted two different naming conventions, and debugging issues became harder for no meaningful reason. I opted to have them define a file with the name they wanted and re-export all symbols so there was a clear path to follow.
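
A rough sketch of such a re-export shim in Python. The helper `reexport` is a hypothetical name, and the standard-library `math` module stands in for the native extension so that the example runs on its own; in a real shim file this usually reduces to a single `from <native module> import *`.

```python
import importlib


def reexport(native_module_name: str, namespace: dict) -> list[str]:
    """Copy the public names of a module into `namespace`.

    Honors `__all__` when the source module defines it; otherwise copies
    every attribute whose name does not start with an underscore.
    """
    native = importlib.import_module(native_module_name)
    names = getattr(native, "__all__", None)
    if names is None:
        names = [n for n in dir(native) if not n.startswith("_")]
    for n in names:
        namespace[n] = getattr(native, n)
    return sorted(names)


# In a shim module this call would be `reexport("<native module>", globals())`.
shim_namespace = {}
exported = reexport("math", shim_namespace)
print(len(exported), shim_namespace["sqrt"](16.0))
```

The shim keeps the Bazel target name and the file name aligned while still giving Python callers the import path they want.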

However, if this feature is strongly desired, then I'd prefer a single module_name attribute (matching the crate_name attribute) that holds the fully qualified module name (prefix and name combined).
