-
Notifications
You must be signed in to change notification settings - Fork 6k
[Impeller] Cache render target texture allocations across frames. #44527
[Impeller] Cache render target texture allocations across frames. #44527
Conversation
…still_hacky_cache
impeller/renderer/context.h
Outdated
| /// | ||
| /// This may or may not indicate a new frame, or there may be multiple | ||
| /// presents within a single frame due to platform view presence. | ||
| virtual void DidAcquireSurfaceFrame() const = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the surface, it doesn't seem like the HAL is a good place to cache user-allocated GPU resources, partly due to this frame concept we need to snake through the context in order to drive cache lifetimes. Would there be any downside to just letting applications choose when/how they want to reuse user-managed textures explicitly?
I don't think this new cache will be suitable for most renderers that will be built with Impeller. (Including EntityPass, as we continue to optimize -- especially if we want to use larger render target textures for rendering smaller passes, which would inevitably require viewport and sampling/UV adjustments in EntityPass).
We could add a convenient cache manager utility under Entity (or maybe core) that caches textures based on descriptor w/ a configurable lifetime, for example. This kind of thing would be driven by EntityPass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the context the HAL? and what is the application?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in terms of where does the cache live, we could definitely put it in entity pass. But I'm not sure how we live without the frame concept. I don't think its the case that 1 entity pass = 1 presentation cycle, or is it the case that the current cache is based around frames.
FYI @dnfield , does you AIKS canvas patch cause us to create multiple entity passes per frame (non-platform view)? or are impeller picutres always combined into a single entity pass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we can't store it in entity pass - but we could store it on the content context and access via entity pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the context the HAL?
Yeah, impeller::Context is the central resource of the HAL, since you use it to spawn all GPU resources and behaviors. By HAL I mean pretty much everything under impeller/renderer, or our equivalent of gfx_hal or WebGPU.
and what is the application?
Renderers built atop the HAL. EntityPass, Flutter GPU renderers, and my stupid side projects 😛.
But I'm not sure how we live without the frame concept. I don't think its the case that 1 entity pass = 1 presentation cycle, or is it the case that the current cache is based around frames.
One call to EntityPass::Render() should happen per-platform view per-frame.
does you AIKS canvas patch cause us to create multiple entity passes per frame (non-platform view)?
Oh, do you mean the DrawPicture stuff? DrawPicture merges a cloned EntityPass from a picture into another and won't result in EntityPass::Render() being called more than once per-platform view per-frame.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you probably mean #44248.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM, I'll move the allocation cache to a structure on the content context. I think I'll adjust the render target constructors a bit so that they also accept an allocator - and I can wrap the context allocator in a caching content_context allocator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH the RenderTarget::CreateOffscreen[MSAA] static factories wound up being super Entities-specific in the end and belong in something like ContentContext anyhow. Moving those factories up into Entities and passing in some optional Entities-specific cache through EntityPass CreateRenderTarget seems like a potentially good route.
| VALIDATION_LOG << "Could not acquire current drawable."; | ||
| return nullptr; | ||
| } | ||
| context->DidAcquireSurfaceFrame(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we end up making the Impeller context unique per-surface/platform view?
| if (!context) { | ||
| return false; | ||
| } | ||
| context->DidFinishSurfaceFrame(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to call these in the playground and between golden PNG generation in order to not leak/risk GPU OOM?
impeller/renderer/context.h
Outdated
| /// | ||
| /// This may or may not indicate a new frame, or there may be multiple | ||
| /// presents within a single frame due to platform view presence. | ||
| virtual void DidFinishSurfaceFrame() const = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we call these BeginFrame() and EndFrame(), since surfaces aren't necessarily involved here?
We should also document that GPU resources may leak if they aren't called properly.
impeller/core/allocator.cc
Outdated
| data_to_recycle_.clear(); | ||
| data_to_recycle_.insert(data_to_recycle_.end(), retain.begin(), retain.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| data_to_recycle_.clear(); | |
| data_to_recycle_.insert(data_to_recycle_.end(), retain.begin(), retain.end()); | |
| data_to_recycle_.swap(retain); |
impeller/core/texture_descriptor.h
Outdated
| /// @brief Whether the [other] texture descriptor has the same properties. | ||
| /// | ||
| /// This is used to check compatibility of already allocated textures | ||
| /// in the render target cache.. | ||
| constexpr bool IsCompatibleWith(const TextureDescriptor& other) const { | ||
| return size == other.size && // | ||
| storage_mode == other.storage_mode && // | ||
| format == other.format && // | ||
| usage == other.usage && // | ||
| sample_count == other.sample_count && // | ||
| type == other.type; // | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just constexpr bool operator==(?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
Marking WIP while I move the caching around. |
|
There should only be one EntityPass::Render per frame after my patch as well. But we should probably document this and DCHECK it somehow if possible. |
|
There will be more than one entity pass per frame if there are platform views, that is fine as we're not attempting to recycle the swapchain textures. |
bdero
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just minor nits!
| /// allocated texture data for one frame. | ||
| /// | ||
| /// Any textures unused after a frame are immediately discarded. | ||
| class RenderTargetCache : public RenderTargetAllocator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really like this idea!
impeller/core/texture_descriptor.h
Outdated
| sample_count != other.sample_count || // | ||
| type != other.type || // | ||
| compression_type != other.compression_type || // | ||
| mip_count != other.mip_count; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this just be return !(*this == other)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is a struct ... does that work? I don't know C++ 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it should work; classes are just structs with private members by default. :p
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, duh. I thought you were saying to do a ptr comparison or something like an implict equal operator. I didn't realize you were specifically talking about using the == operation to define the != operation. Makes sense!
Co-authored-by: Brandon DeRosier <x@bdero.me>
…132612) flutter/engine@7409ce4...659cdfc 2023-08-16 jonahwilliams@google.com [Impeller] Cache render target texture allocations across frames. (flutter/engine#44527) If this roll has caused a breakage, revert this CL and stop the roller using the controls here: https://autoroll.skia.org/r/flutter-engine-flutter-autoroll Please CC jsimmons@google.com,rmistry@google.com,zra@google.com on the revert to ensure that a human is aware of the problem. To file a bug in Flutter: https://github.com/flutter/flutter/issues/new/choose To report a problem with the AutoRoller itself, please file a bug: https://bugs.chromium.org/p/skia/issues/entry?template=Autoroller+Bug Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
…ayouts (#45088) This refactor makes recycling render passes easier since it starts to split out setting the layout of textures with creating the render pass description. I didn't fully implement the split since it would technically be slower. Until we get caching of render passes working it would require us to calculate the attachment descriptions twice, once to make the render pass and once to set the texture layout. teases out some of the work from: #44527 in the service of implementing: #44861 issue: flutter/flutter#133182 The final split will have this function for recycled render passes: ```c++ void SetTextureLayouts(const RenderTarget& render_target, const std::shared_ptr<CommandBufferVK>& command_buffer) { for (const auto& [bind_point, color] : render_target.GetColorAttachments()) { SetTextureLayout(color, CreateAttachmentDescription(color, &Attachment::texture), command_buffer, &Attachment::texture); if (color.resolve_texture) { SetTextureLayout( color, CreateAttachmentDescription(color, &Attachment::resolve_texture), command_buffer, &Attachment::resolve_texture); } } if (auto depth = render_target.GetDepthAttachment(); depth.has_value()) { SetTextureLayout( depth.value(), CreateAttachmentDescription(depth.value(), &Attachment::texture), command_buffer, &Attachment::texture); } if (auto stencil = render_target.GetStencilAttachment(); stencil.has_value()) { SetTextureLayout( stencil.value(), CreateAttachmentDescription(stencil.value(), &Attachment::texture), command_buffer, &Attachment::texture); } } ``` ## Pre-launch Checklist - [x] I read the [Contributor Guide] and followed the process outlined there for submitting PRs. - [x] I read the [Tree Hygiene] wiki page, which explains my responsibilities. - [x] I read and followed the [Flutter Style Guide] and the [C++, Objective-C, Java style guides]. - [x] I listed at least one issue that this PR fixes in the description above. - [x] I added new tests to check the change I am making or feature I am adding, or Hixie said the PR is test-exempt. See [testing the engine] for instructions on writing and running engine tests. - [x] I updated/added relevant documentation (doc comments with `///`). - [x] I signed the [CLA]. - [x] All existing and new tests are passing. If you need help, consider asking for advice on the #hackers-new channel on [Discord]. <!-- Links --> [Contributor Guide]: https://github.com/flutter/flutter/wiki/Tree-hygiene#overview [Tree Hygiene]: https://github.com/flutter/flutter/wiki/Tree-hygiene [Flutter Style Guide]: https://github.com/flutter/flutter/wiki/Style-guide-for-Flutter-repo [C++, Objective-C, Java style guides]: https://github.com/flutter/engine/blob/main/CONTRIBUTING.md#style [testing the engine]: https://github.com/flutter/flutter/wiki/Testing-the-engine [CLA]: https://cla.developers.google.com/ [flutter/tests]: https://github.com/flutter/tests [breaking change policy]: https://github.com/flutter/flutter/wiki/Tree-hygiene#handling-breaking-changes [Discord]: https://github.com/flutter/flutter/wiki/Chat
…utter#44527) When allocating a non-host visible texture for a render target, the allocator will hold onto a reference to this texture descriptor. On the immediate next frame, this texture is made available to replace requested allocations for new render target textures. if this texture is not used, then at the end of the next frame it is discarded. This removes the vast majority of allocations in most flutter gallery and wonderous. This does not attempt to use different sized textures for the cache. There are two caveats noted in the PR contents. Fixes flutter/flutter#131515
…ayouts (flutter#45088) This refactor makes recycling render passes easier since it starts to split out setting the layout of textures with creating the render pass description. I didn't fully implement the split since it would technically be slower. Until we get caching of render passes working it would require us to calculate the attachment descriptions twice, once to make the render pass and once to set the texture layout. teases out some of the work from: flutter#44527 in the service of implementing: flutter#44861 issue: flutter/flutter#133182 The final split will have this function for recycled render passes: ```c++ void SetTextureLayouts(const RenderTarget& render_target, const std::shared_ptr<CommandBufferVK>& command_buffer) { for (const auto& [bind_point, color] : render_target.GetColorAttachments()) { SetTextureLayout(color, CreateAttachmentDescription(color, &Attachment::texture), command_buffer, &Attachment::texture); if (color.resolve_texture) { SetTextureLayout( color, CreateAttachmentDescription(color, &Attachment::resolve_texture), command_buffer, &Attachment::resolve_texture); } } if (auto depth = render_target.GetDepthAttachment(); depth.has_value()) { SetTextureLayout( depth.value(), CreateAttachmentDescription(depth.value(), &Attachment::texture), command_buffer, &Attachment::texture); } if (auto stencil = render_target.GetStencilAttachment(); stencil.has_value()) { SetTextureLayout( stencil.value(), CreateAttachmentDescription(stencil.value(), &Attachment::texture), command_buffer, &Attachment::texture); } } ``` ## Pre-launch Checklist - [x] I read the [Contributor Guide] and followed the process outlined there for submitting PRs. - [x] I read the [Tree Hygiene] wiki page, which explains my responsibilities. - [x] I read and followed the [Flutter Style Guide] and the [C++, Objective-C, Java style guides]. - [x] I listed at least one issue that this PR fixes in the description above. - [x] I added new tests to check the change I am making or feature I am adding, or Hixie said the PR is test-exempt. See [testing the engine] for instructions on writing and running engine tests. - [x] I updated/added relevant documentation (doc comments with `///`). - [x] I signed the [CLA]. - [x] All existing and new tests are passing. If you need help, consider asking for advice on the #hackers-new channel on [Discord]. <!-- Links --> [Contributor Guide]: https://github.com/flutter/flutter/wiki/Tree-hygiene#overview [Tree Hygiene]: https://github.com/flutter/flutter/wiki/Tree-hygiene [Flutter Style Guide]: https://github.com/flutter/flutter/wiki/Style-guide-for-Flutter-repo [C++, Objective-C, Java style guides]: https://github.com/flutter/engine/blob/main/CONTRIBUTING.md#style [testing the engine]: https://github.com/flutter/flutter/wiki/Testing-the-engine [CLA]: https://cla.developers.google.com/ [flutter/tests]: https://github.com/flutter/tests [breaking change policy]: https://github.com/flutter/flutter/wiki/Tree-hygiene#handling-breaking-changes [Discord]: https://github.com/flutter/flutter/wiki/Chat
When allocating a non-host visible texture for a render target, the allocator will hold onto a reference to this texture descriptor. On the immediate next frame, this texture is made available to replace requested allocations for new render target textures. if this texture is not used, then at the end of the next frame it is discarded.
This removes the vast majority of allocations in most flutter gallery and wonderous. This does not attempt to use different sized textures for the cache.
There are two caveats noted in the PR contents.
Fixes flutter/flutter#131515