Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move observability-relevant structure fields to the top #105271

Merged

Conversation

P403n1x87
Copy link
Contributor

Some structures have fields that are used by out-of-process tools, like Austin. Having these fields defined after some more complex structures makes it hard to maintain these tools. With this change, we move the declaration of the most useful fields to the top of the structure definition. This reduces the amount of irrelevant information that the mentioned tools have to replicate to retrieve the actually useful data

@P403n1x87
Copy link
Contributor Author

@markshannon @pablogsal I hope you don't mind if I tag you on this PR. I was looking into the changes required to support 3.12 in Austin and thought of proposing this change to make my life (and presumably that of those who maintain similar tools) easier.

@P403n1x87 P403n1x87 force-pushed the refactor/observability-fields-first branch from 3f41f83 to 60d2b16 Compare June 3, 2023 15:43
Some structures have fields that are used by out-of-process tools, like
Austin. Having these fields defined after some more complex structures
makes it hard to maintain these tools. With this change, we move the
declaration of the most useful fields to the top of the structure
definition. This reduces the amount of irrelevant information that the
mentioned tools have to replicate to retrieve the actually useful data
@pablogsal
Copy link
Member

Unfortunately this is not a bugfix so technically we cannot backport this unless @Yhg1s gives us greenlight.

@pablogsal pablogsal changed the title Move observability-relevant structure fields to top Move observability-relevant structure fields to the top Jun 8, 2023
@pablogsal pablogsal merged commit 410c2f1 into python:main Jun 8, 2023
@pablogsal
Copy link
Member

@Yhg1s are you ok backporting this to 3.12?

@Yhg1s
Copy link
Member

Yhg1s commented Jun 8, 2023

Hrrm, not really, partly because I'm really not sold on the usefulness of the change, and partly because I'm aware of software poking at these things so this is an ABI change. I'd rather not backport, but if you want to try and sell me on the usefulness feel free :)

@markshannon
Copy link
Member

Did anyone benchmark this?

The interpreter frame is the hottest piece of memory in the VM, so care needs to be taken rearranging the fields.
The eval_breaker field is also quite hot, so moving it away from the front of the of the interpreter state may well harm performance. This PR is undoing the layout change in #104581.

The order of fields should not be part of the API or ABI (apart from not moving them during a release). They should (IMO) be arranged for performance.
In fact there is no guarantee that the fields will even continue to exist. We have added and removed fields freely in the past, and want the freedom to continue to do so.

Also, what fields are important, and too whom?
What if tool A wants field X and tool B wants field Y, then what?
The order of fields needs to be an internal implementation detail. Things are just too fragile otherwise.

If we want to expose the offset of certain fields to out-of-process tools, then we could put the offsets in a table.

@markshannon
Copy link
Member

Why is there no issue for this?
Which fields "are used by of out process tools"?
This PR doesn't make it clear what is being moved before what and why.

@pablogsal
Copy link
Member

Well, as I can see that this change is highly controversial, let's revert it until we all agree.

@pablogsal
Copy link
Member

#105512 for the revert

@pablogsal
Copy link
Member

This has been reverted

@P403n1x87
Copy link
Contributor Author

@markshannon are there specific benchmark designed to measure the performance the structures involved in this change.

Did anyone benchmark this?

I did not benchmark this, nor I have an idea of why fields are where they currently are, since there are no comments in the source code. Without that information available, I took the liberty to move them at will. I appreciate one might want to invoke the principle of locality here, but without extra information I had to take a guess.

The order of fields should not be part of the API or ABI (apart from not moving them during a release). They should (IMO) be arranged for performance.

This is not a request to make fields part of the A(B/P)I

In fact there is no guarantee that the fields will even continue to exist. We have added and removed fields freely in the past, and want the freedom to continue to do so.

It is quite likely that the most interesting fields will always be there (they might change name, as they have done in the past, but they are essentially still there).

Also, what fields are important, and too whom? What if tool A wants field X and tool B wants field Y, then what? The order of fields needs to be an internal implementation detail. Things are just too fragile otherwise.

It would be hard to take opinions out of this. As the maintainer of Austin, I have moved fields in the order that I makes the more sense to me (but hopefully to others as well). However I don't have a scientific answer to give for the ordering.

If we want to expose the offset of certain fields to out-of-process tools, then we could put the offsets in a table.

That would be great, but as far as I understand this is something that will perhaps come in the future. Here I'm looking for something that can go in 3.12.

@markshannon
Copy link
Member

That would be great, but as far as I understand this is something that will perhaps come in the future. Here I'm looking for something that can go in 3.12.

Adding a new symbol might be more acceptable than moving fields around, regarding ABI compatibility.
But that's @Yhg1s decision.

@markshannon
Copy link
Member

@P403n1x87
Want to make a list of the fields you need to access?
What's the best way to make information about offsets and the like visible to out-of-process tools?
We discussed a similar problem in #100987

@P403n1x87
Copy link
Contributor Author

@P403n1x87 Want to make a list of the fields you need to access?

The list of fields that I'm currently relying on can be inferred from https://github.com/P403n1x87/austin/blob/b48d980f6f3d0808d9221ee611a9c3094e02bb50/src/version.h#L196-L306

What's the best way to make information about offsets and the like visible to out-of-process tools? We discussed a similar problem in #100987

I've shared my thoughts in this comment #100987 (comment). I think a symbol/section like _PyRuntime pointing to an offset structure would be enough.

@markshannon
Copy link
Member

Want to make an issue for adding that info?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants