Skip to content

Make UseRefs more memory efficient #44495

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 15, 2022
Merged

Make UseRefs more memory efficient #44495

merged 1 commit into from
Mar 15, 2022

Conversation

Keno
Copy link
Member

@Keno Keno commented Mar 7, 2022

With #44494, this cuts about 22M allocations (out of 59M) from the
compiler benchmark in #44492. Without #44494, it still reduces
the number of allocations, but not as much. This was originally
optimized in 100666b, but the
behavior of our compiler has changed to allow inling the Tuple{UseRef, Int}
into the outer struct, forcing a reallocation on every iteration.

With #44494, this cuts about 22M allocations (out of 59M) from the
compiler benchmark in #44492. Without #44494, it still reduces
the number of allocations, but not as much. This was originally
optimized in 100666b, but the
behavior of our compiler has changed to allow inling the Tuple{UseRef, Int}
into the outer struct, forcing a reallocation on every iteration.
else
return OOB_TOKEN
end
end
@inline getindex(x::UseRef) = _useref_getindex(x.urs.stmt, x.op)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had hoped our calling convention would have been enough already to make this inlining awkwardness unnecessary. What causes it to be needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want UseRef to be SROA'd, so that UseRefIterator can be SROA'd. Without it UseRefIterator gets allocated.

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does seem slightly odd that that needs to be mutable, since that implies we eventually need to copy the stmt back to the Instruction steam.

@Keno
Copy link
Member Author

Keno commented Mar 7, 2022

It does seem slightly odd that that needs to be mutable, since that implies we eventually need to copy the stmt back to the Instruction steam.

Yes, that's how this API works. At the end you need to put urs[] back into the instruction stream (or whatever else you were modifying).

@Keno
Copy link
Member Author

Keno commented Mar 15, 2022

Merging this now - the test for zero allocation will be added in #44557.

@Keno Keno merged commit e082917 into master Mar 15, 2022
@Keno Keno deleted the kf/userefsallocs branch March 15, 2022 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
latency Latency
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants