Skip to content

Conversation

@Icxolu
Copy link
Contributor

@Icxolu Icxolu commented Jul 28, 2024

This adds the second lifetime to FromPyObject and removes FromPyObjectBound. I hope I adjusted all the bounds correctly. I believe the equivalent to our current implementation is use HRTB for the input lifetime on any generic bound. But this should only be necessary for containers that create temporary Python references during extraction. For wrapper types it should be possible to just forward the relaxed bound.

For easier trait bounds FromPyObjectOwned is introduced. This is blanket implemented for any FromPyObject type that does not depend on the input lifetime. It is intended to be used as a trait bound, the idea is inspired by serde (Deserialize <=> DeserializeOwned)

I tried to document different cases in the migration guide, but it can probably still be extended. Changelog entry is still missing, will write that tomorrow.

Breaking changes:

  • the second lifetime
  • the extract method without default
  • the addition lifetime bound on extract_bound to make the default work

@mejrs
Copy link
Member

mejrs commented Jul 28, 2024

I propose that we add a FromPyObjectOwned trait similar to serde's DeserializeOwned. It's much easier to explain "just use this trait if the output doesn't borrow from Python" rather than "you need higher ranked trait bounds".

@Icxolu
Copy link
Contributor Author

Icxolu commented Jul 29, 2024

Thanks! I think that's a great idea. It's essentially the same thing, but it's much easier to grasp by giving it a distinct name, plus we can better document it in the API docs. I will adapt this to make use of that.

@Icxolu Icxolu force-pushed the from-pyobject branch 3 times, most recently from c960124 to a1c0bed Compare July 29, 2024 18:16
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pushing on with this! Overall it looks good to me, and I only have a few docs suggestions.

Before we definitely commit to this, I want to take a brief moment just to reflect on the choice of Borrowed over &Bound. There's at least two good technical reasons, which is the lifetime constraint and performance by avoiding pointer-to-pointer. The codspeed branch does show a few slight perf improvements in the ~4% range on tuple extraction and we have further possibilities like PyRef containing Borrowed once we do this.

That said, we deliberately kept the Borrowed smart pointer out of the way as much as possible in the original 0.21 design to try to keep new concepts down. Is it fine to increase its use? I think the upsides justify this, though we might want to increase the quality of documentation and examples on Borrowed (which I'd left quite minimal so far).

#[automatically_derived]
impl #trait_generics #pyo3_path::FromPyObject<#lt_param> for #ident #generics #where_clause {
fn extract_bound(obj: &#pyo3_path::Bound<#lt_param, #pyo3_path::PyAny>) -> #pyo3_path::PyResult<Self> {
impl #trait_generics #pyo3_path::FromPyObject<'_, #lt_param> for #ident #generics #where_clause {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be an interesting question for later if we ever support the 'a lifetime in our #[derive(FromPyObject)]. Definitely not for now 😂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, something to experiment with in the future, but out of scope for now for sure 😅

@Icxolu
Copy link
Contributor Author

Icxolu commented Aug 3, 2024

Thanks for the review! You are right, the choice of Borrowed over &Bound is definitely something to discuss about. I used it here because it was the final form of FromPyObjectBound that we landed on. Looking back, the change to Borrowed happened in #3959, which was related to the gil-refs API, so it might not be strictly necessary anymore (haven't checked). The points you mentioned seem like good justification why it still makes sense to introduce additional complexity here.

I think the upsides justify this, though we might want to increase the quality of documentation and examples on Borrowed (which I'd left quite minimal so far).

I agree, if we put Borrowed in such a core part of the infrastructure it deserves a documentation overhaul. We might also want to take a look at the current (public) API and additionally expose some helpers. Something like to_any comes to mind, which is currently internal only.

Copy link
Member

@mejrs mejrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the recently added FromPyObject documentation is too "implementation detail"-y.

I'd prefer if this documentation is kept short and concise, one paragraph max.

For example:

Depending on the python version and implementation some `FromPyObject`
 implementations may produce Rust types that point into the Python type.
For example `PyString` and `PyBytes` may convert into `str` and `[u8]` references
 that borrow from the original Python object.

Types that do not borrow from the input should use `FromPyObjectOwned` instead.

@Icxolu
Copy link
Contributor Author

Icxolu commented Aug 4, 2024

Thanks for the feedback! I took another pass over it and tried to make it a bit more concise. I do however think that it's worth to talk about the individual lifetimes at play here, since this trait is right in the center of PyO3. I moved the paragraph about collections into the FromPyObjectOwned docs, because it works much better there with the code example, and replaced it with a smaller note for more info. I reworded my examples a bit and moved them under a # Details section, maybe this works better?

Let me know what you think about this.

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for keeping this branch alive, I guess let's move forward with this before it gets too painful to keep updating.

I had yet another reflection on &Bound vs Borrowed and I'm feeling reasonably comfortable that Borrowed is the correct choice.


impl<'py, $($T: FromPyObject<'py>),+> FromPyObject<'py> for ($($T,)+) {
fn extract_bound(obj: &Bound<'py, PyAny>) -> PyResult<Self>
impl<'a, 'py, $($T: FromPyObject<'a, 'py>),+> FromPyObject<'a, 'py> for ($($T,)+) {
Copy link
Member

@davidhewitt davidhewitt Aug 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for the record and the decision to stick with Borrowed, the reason is that if we used &'a Bound here for the input argument, then tuple extraction would be forced to restrict elements to FromPyObjectOwned.

That maybe wouldn't be a terrible thing, as this is currently a special case when compared with every other container, which is forced to use FromPyObjectOwned due to their mutability.

But I think it would be a breaking change, so it's probably not a good change overall. (E.g. users would lose the ability to extract tuples containing &str, &Bound and Borrowed)

I think as a secondary effect, by using Borrowed here we might be able to change extract_argument.rs to make more use of Borrowed, which IIRC is a possible performance win.

@Icxolu
Copy link
Contributor Author

Icxolu commented Aug 31, 2024

I believe this now also finishes the _bound methods deprecations 🎉

@davidhewitt
Copy link
Member

Fantastic, thank you so much for your help on pushing that over the line. When I get some time I'll try to write up a list of what we might have left for 0.23 (hopefully not much).

@davidhewitt davidhewitt added this pull request to the merge queue Aug 31, 2024
@davidhewitt davidhewitt removed this pull request from the merge queue due to a manual request Aug 31, 2024
@davidhewitt
Copy link
Member

Ah, I just started playing with this locally and found one additional finding: having Borrowed<'a, 'py, T> as the input type means that it's impossible to implement FromPyObject<'a, 'py> for &'a Bound<'py, T> - because the input borrowed can't be cast into a &'a Bound<'py, T> for a sufficient lifetime (pointer goes in, pointer-to-pointer goes out).

Is that a problem? I think not, because that's what .downcast() is for (and it's potentially nice to separate them like this), however just worth a brief double-check that others agree.

@davidhewitt
Copy link
Member

Oh, and one more case which comes up in local testing. At the moment from_py_with extractors take &Bound<'py, PyAny>. Should we be considering changing those to Borrowed too, for consistency?

@davidhewitt
Copy link
Member

Should we be considering changing those to Borrowed too, for consistency?

It feels to me like we might want to consider macro tricks such that users can use both &Bound and Borrowed.

@Icxolu
Copy link
Contributor Author

Icxolu commented Sep 1, 2024

having Borrowed<'a, 'py, T> as the input type means that it's impossible to implement FromPyObject<'a, 'py> for &'a Bound<'py, T>

That's true, but as you mention below, I actually think this a good thing and makes a clearer use case between .extract() and .downcast().

Oh, and one more case which comes up in local testing. At the moment from_py_with extractors take &Bound<'py, PyAny>. Should we be considering changing those to Borrowed too, for consistency?

Hmm, good question. I guess that is part of the question about how prominent we want Borrowed to be. Providing some flexibility via macro magic as you suggested sounds like a good compromise. I can explore that as a followup.

@davidhewitt
Copy link
Member

Yes agreed, it's definitely a follow-up to make changes to expand from_py_with.

I wonder, are there any other options we have here? E.g. does it work to have a trait with both extract and extract,_borroeed, maybe with a default implementation of extract,_borrowed? I suspect that might not work because of the same reason we can't currently have &Bound as an output.

I am still feeling a little uneasy about the choice to commit Borrowed more directly into the face of users. I think it is probably the right choice still, as long as we improve the documentation and API on borrowed. Maybe I will quickly try to get the perf wins out of extract_argument.rs later today to add another data point.

@davidhewitt
Copy link
Member

And (sorry to keep dangling ideas on here), inspired by the new IntoPyObject, I have two further perf related thoughts while we are making breaking changes here:

  • In the past we have noted that the PyResult return type forced conversion of errors to PyErr and made .extract() a performance footgun compared to .downcast(). There is still this problem for FromPyObject for Bound<T>. Should we add a type Error in the same way we are adding for IntoPyObject?
  • It is possible to call .extract() on e.g. Bound<PyString> to get a String, but this comes at the cost of a redundant Python type check because FromPyObject always works on PyAny. Could we add a type Source, and add a way to make Bound<T> directly offer extract() where U: FromPyObject<Source = T>?

Both changes would significantly complicate the trait. I wonder, is there a way that we can support a simple trait like FromPyObject and a more advanced FromPyObjectImpl, which has a blanket from FromPyObject, so users can just implement the simple trait if that's good enough?

All food for thought and I think worth discussing while we're committing to breaking changes here.

@Icxolu
Copy link
Contributor Author

Icxolu commented Sep 1, 2024

And (sorry to keep dangling ideas on here), inspired by the new IntoPyObject, I have two further perf related thoughts while we are making breaking changes here:

No problem 😄

  • In the past we have noted that the PyResult return type forced conversion of errors to PyErr and made .extract() a performance footgun compared to .downcast(). There is still this problem for FromPyObject for Bound<T>. Should we add a type Error in the same way we are adding for IntoPyObject?

If we're willing to do more breaking changes here, that sounds like a good idea to me. The only downsides I can see

  • It is possible to call .extract() on e.g. Bound<PyString> to get a String, but this comes at the cost of a redundant Python type check because FromPyObject always works on PyAny. Could we add a type Source, and add a way to make Bound<T> directly offer extract() where U: FromPyObject<Source = T>?

This would be really cool indeed. I guess we would maybe somehow need to take advantage of method precedence between trait and inherent methods, to make the fall through to PyAny work...

@davidhewitt davidhewitt mentioned this pull request Sep 13, 2024
3 tasks
@Icxolu Icxolu added this to the 0.24 milestone Oct 5, 2024
@Icxolu Icxolu force-pushed the from-pyobject branch 2 times, most recently from c865e7f to c366603 Compare August 8, 2025 22:42
@Icxolu Icxolu force-pushed the from-pyobject branch 2 times, most recently from 8332c17 to 42dd0d4 Compare August 31, 2025 15:13
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the work here, I think this is pretty much ready to go! I had a few minor comments, mostly I think documentation related.

I guess there are a few follow-ups we should probably throw into issues so we don't forget to complete for 0.27:

  • Better documentation on borrowed
  • We should explore if we can make some more error types to delay PyErr creation, e.g. on the set extraction it looks like the error is essentially a downcast error or the key extraction error. I guess there's a tradeoff here of API complexity vs performance.
  • I think the functions in extract_argument.rs probably can take Borrowed rather than &Bound but I think that's probably going to involve macro changes which would preferably not be a part of this diff.

pub trait FromPyObject<'py>: Sized {
/// Types that must not borrow from the input can use [`FromPyObjectOwned`] as a restriction. This
/// is most often the case for collection types. See its documentation for more details.
///
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should have a section to this doc "Implementing FromPyObject" which lists the choices available:

  • #[derive(FromPyObject)]
  • manual implementation for types without a lifetime (i.e. impl FromPyObject<'_, '_> for X)
  • manual implementation for types with a lifetime

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I added a section with these. Do you have a suggestion for an example type with lifetimes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great!

For the type with lifetimes, I suggest using either &'a str or Bound<'py, T> and saying "this is roughly how this is implemented in PyO3", I think just choosing one of the two is good enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

@davidhewitt
Copy link
Member

Oh, and I think there's a note in the guide about performance of .cast() vs .extract() which is presumably not true once we've landed this change?

@Icxolu
Copy link
Contributor Author

Icxolu commented Sep 2, 2025

Thanks for the review ❤️ . I've opened some sub-issues in #5390 to track various followups from here.

@Icxolu Icxolu force-pushed the from-pyobject branch 2 times, most recently from 2f6fb19 to 52dba29 Compare September 2, 2025 18:55
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subject to final documentation fixups, let's merge!

Thank you for all the iteration and R&D done here 🚀

pub trait FromPyObject<'py>: Sized {
/// Types that must not borrow from the input can use [`FromPyObjectOwned`] as a restriction. This
/// is most often the case for collection types. See its documentation for more details.
///
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great!

For the type with lifetimes, I suggest using either &'a str or Bound<'py, T> and saying "this is roughly how this is implemented in PyO3", I think just choosing one of the two is good enough.

@Icxolu
Copy link
Contributor Author

Icxolu commented Sep 2, 2025

Great, all green 🎉 I'm going to send this now. We can handle any touchups that are left in followups.

@Icxolu Icxolu added this pull request to the merge queue Sep 2, 2025
Merged via the queue into PyO3:main with commit 5226450 Sep 2, 2025
42 of 44 checks passed
@Icxolu Icxolu deleted the from-pyobject branch September 2, 2025 20:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants