-
Notifications
You must be signed in to change notification settings - Fork 6k
[Impeller] Begin migrating to RenderPass command recording API. #49480
[Impeller] Begin migrating to RenderPass command recording API. #49480
Conversation
…ine into example-of-new-cmd-api
It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact "@test-exemption-reviewer" in the #hackers channel in Chat (don't just cc them here, they won't see it! Use Discord!). If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix? Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing. |
Do you want to land this, or is this just for demo purposes? |
auto result = AddCommand(std::move(pending_)); | ||
pending_ = Command{}; | ||
return result; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are probably opportunities to cut down API traffic by not having draw calls reset all state, but starting with this seems like a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah in fact I think you mentioned this earlier w.r.t. the viewport and scissor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this change is mostly pretty dumb as its keeping the exact same implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mind keeps wandering to this patch tonight, so figured I'd jot down my thonks in more detail. ;) (No action necessary)
I think the winning long game here is to have no implicit draw state management behavior in RenderPass
at all (except for the default state, of course).
It would be great if we end up in a place where we don't implicitly reset any state when recording a draw call, since that would maximally open the door for us to optimize traffic within high context parts of the pass workload (for example, any Contents
that performs multiple draws).
Of course, we'd still probably want utilities like RenderPass::ResetState()
for low context transitions (like going from one Contents
to another).
The refactor in Entities for this would be ezpz. All we'd need to get started is something like:
- Refactor
Contents
to use a stub pattern -- non-virtualContents::Render()
calls virtualContents::OnRender()
-- and haveContents::Render()
callrender_pass->ResetState()
(or maybe have an Entities-specific utility for this that ignores stuff like viewport/scissor because we know we're not using them right now) just before dispatching toContents::OnRender()
. - Then, for each
Contents::OnRender()
impl we have that records multiple draws, don't reset stuff in cases where it's obviously unnecessary. For example, if the draws are using the same pipeline, you probably don't need to reset anything at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with not resetting the state. Really, that is just working against ourselves since we also have the pass binding ccache essentially undoing that work at the render pass level. I suspect we will still need some state tracking for bindings, but perhaps we can get sufficiently clever as to make that irrelevant.
cmd.pipeline = renderer.GetSolidFillPipeline(options); | ||
cmd.BindVertices(std::move(geometry_result.vertex_buffer)); | ||
|
||
#ifdef IMPELLER_DEBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be automated away by performing this ifdef in SetCommandLabel
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we make this accept a string_view, then there won't be any copy if it is passed a const char*, and we can ifdef out before setting the label.
const SampledImageSlot& slot, | ||
const ShaderMetadata& metadata, | ||
std::shared_ptr<const Texture> texture, | ||
std::shared_ptr<const Sampler> sampler) override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API to encode commands on the RenderPass
directly LGTM. (And the generated reflection headers will also change to operate on the RenderPass
instead of Command
, of course.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we ever want multiple command buffers for one render pass?
I'm not a huge fan of multiplexing Command and RenderPass. Seems like we could make a rule that Command either moves or has references to its fields and get the same performance with a more clear abstraction. That may mean bending the rule against rvalue in the style guide though.
Having 2 concepts and dissolving one concept and making an uber concept, ignoring performance, seems like we are losing something.
We're not going to have two concepts, there will only be the render pass. But that is a big change, and so this PR is keeping around command until I remove it. |
I think I can either land the migration in chunks or all at once, whatever you'd prefer to review. |
I meant having a [RenderPass, Command] and going to [RenderPass'], where RenderPass' encompasses all the concepts that [RenderPass, Command] represented. Looking at this, we are just moving the fields from Command to the RenderPass and replacing the scope of the Command object with the requirement of calling I thought you were you were just going to remove the |
I tried to propose removing the bindings from the cmd so that we didn't have hundreds of tiny heap allocations, but you and @chinmaygarde very specifically did not like that API. We can't reduce the cost of the heap allocations by moving the cmd, instead we need to bind to the render pass where we can either avoid the heap allocations or use one bigger heap allocation. Now how specifically we split up the command state TBD. |
Althrough Viewport and Scissor are very specifically cmd buffer state and not really cmd state. We should be able to set the viewport and scissor and not set it every single cmd. |
If you had: fml::Status RenderPass::AddCommand(Command&& cmd) {
SetPipeline(cmd.pipeline);
//..
return {};
} The heap allocations would be equivalent to what you have now but you wouldn't have removed the concept of Command or expanded the RenderPass public api. edit: Also the fact that Command is copying fields. But if Command becomes ephemeral like this, just living on the stack, we can safely not copy. |
impeller/renderer/render_pass.h
Outdated
bool SetVertexBuffer(VertexBuffer buffer); | ||
|
||
/// Record the currently pending command. | ||
bool Dispatch(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use fml::Status instead of using bool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bool all the way up the stack, I could change this to status but then I'd convert from fml::status to bool in each contents or complete the migration.
I like the idea of fml::status in general, but I'm not sure if this is the right usage for it. Right now Draw() essentially error checks for two conditions:
- If you set a scissor (which we never do) is it a valid scissor?
- Is the pipeline object defined before draw was called?
I think the first check should be moved into the setter, but really that is something we should VALIDATION_LOG and not worry about. The second one seems like it should be an FML_DCHECK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine if this gets converted to bool in the next level up. I have an issue out to convert all of these bools over to fml::Status. We need to get these all migrated at some point bool
is ambiguous. So even at just this one layer it's worthwhile. Once everything is using it and StatusOr, magic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done - i'll be able to make the message better once I finish migrating the API since I can just use the validation message.
You're missing the bindings. The std::vectors in the bindings need to go. if you want to see an example for how that works, you can check out https://github.com/flutter/engine/compare/main...jonahwilliams:engine:testbed?expand=1 where I've complely wired this up for Vulkan. |
Right Command cmd;
FS::BindFragInfo(cmd, pass.GetTransientsBuffer().EmplaceUniform(frag_info));
pass.AddCommand(std::move(cmd)); becomes: FS::BindFragInfo(pass, pass.GetTransientsBuffer().EmplaceUniform(frag_info));
pass.Dispatch(); Where struct TextureAndSampler {
SampledImageSlot slot;
TextureResource texture;
std::shared_ptr<const Sampler> sampler;
};
struct BufferAndUniformSlot {
ShaderUniformSlot slot;
BufferResource view;
};
struct Bindings {
std::vector<TextureAndSampler> sampled_images;
std::vector<BufferAndUniformSlot> buffers;
};
struct Command {
Bindings fragment_bindings;
}; You can get the equivalent performance and keep the concept of the struct TextureAndSampler {
TextureAndSampler(const TextureAndSampler&) = delete;
SampledImageSlot slot;
TextureResource texture;
std::shared_ptr<const Sampler> sampler;
};
struct BufferAndUniformSlot {
BufferAndUniformSlot(const BufferAndUniformSlot&) = delete;
ShaderUniformSlot slot;
BufferResource view;
};
struct Bindings {
Bindings(const Bindings&) = delete;
int num_sampled_images;
std::array<TextureAndSampler, kMaxSampledImages> sampled_images;
int num_buffers;
std::array<BufferAndUniformSlot, kMaxBuffers> buffers;
};
struct Command {
Command(const Command&) = delete;
Bindings fragment_bindings;
}; The benefit is that it doesn't blow out the RenderPass's public API and we don't have to manually scope objects with |
Right, that is essentially what this next change does, but it puts the binding state arrays on the render pass. I guess I just disagree that keeping cmd is a better API. |
Can you explain why you think The following usage of the proposed API is valid: FS::BindFragInfo(pass, pass.GetTransientsBuffer().EmplaceUniform(frag_info1));
// pass.Dispatch();
FS::BindFragInfo(pass, pass.GetTransientsBuffer().EmplaceUniform(frag_info2));
pass.Dispatch(); But it is a logical bug. With Also, having 2 concepts seems cleaner since the size of the public API of RenderPass is minimized. You end up with a hierarchy of concepts where RenderPass uses a Command, which is a useful way to think about it. |
Its not that I think Dispatch is better than AddCommand. But dispatch is a natural consequence of splitting up the command state parts. Some specific points:
So we don't necessarily have to go all the way like I've done now. We could split up comand differently or 🤷♂️ .
I would like our APIs to be easy to use, but I'd like to lean all the way on the performance spectrum. Most end users will never write a flutter app, and most flutter app developers will never touch the engine. That said, after discussing this, I'm not really convinced that this API is necessary to maximize the potential performance improvement. So I think I'm going to icebox this for now and come back to it if there is something I can't fix. |
impeller/renderer/render_pass.cc
Outdated
return pending_.BindVertices(std::move(buffer)); | ||
} | ||
|
||
bool RenderPass::Dispatch() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: RenderPass::Draw()
, perhaps? :)
Ahh yea, eliminating the heap allocations doesn't solve the problem that the Command abstraction is wasteful for things like stencil/viewport. They're 2 separate issues. Pulling some (or all) of that stuff out of Command makes sense to me. It doesn't hurt to let design changes marinate if it isn't blocking performance, sgtm. |
I investigated using the stack based allocation strategy you described and unfortunately it was a large regression: Stack or no stack, allocation 1KB per command is too expensive. |
It looks like there are some unaddressed comments, but I think regardless of how we cut it, our lowest level/de-facto command encoding interface aught to be granular state updates like the interface that's being proposed here. The graphics APIs we target all avoid waste by recording granular state changes, and we stand to eliminate waste by doing the same. Renderers are then free to optimize for performing fewer state updates. For how this interface ultimately communicates with the backend, there are a few options, but two are:
If we wanted to preserve the behavior of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
bool BindResource(ShaderStage stage, | ||
const ShaderUniformSlot& slot, | ||
const ShaderMetadata& metadata, | ||
BufferView view) override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these operations are idempotent except for the resource bindings which are additive, so we'll probably want something like a ResetBindings
specifically if we eventually don't force bindings to get reset for every draw. (No action needed, just noting that's why Flutter GPU has this)
They're current idempotent, but that will change once I start mapping things directly to the command buffer. I'm not sure how we do a reset in that case? you can't bind nothing, right? |
The command resource binding methods append a binding to a vector, but that vector isn't accessible in your stateful API, so AFAICT there'd be no way to remove binding slots if you were to get rid of the state resetting behavior of |
…ine into example-of-new-cmd-api
…141432) flutter/engine@ecdaed7...44a0a6e 2024-01-12 skia-flutter-autoroll@skia.org Roll Skia from d65d18ac1e8d to e9de42588e54 (1 revision) (flutter/engine#49738) 2024-01-11 matanlurey@users.noreply.github.com Add `SurfaceTextureSurfaceProducer` (flutter/engine#49653) 2024-01-11 skia-flutter-autoroll@skia.org Roll Skia from b63e5a3c14e1 to d65d18ac1e8d (1 revision) (flutter/engine#49735) 2024-01-11 jonahwilliams@google.com [Impeller] Begin migrating to RenderPass command recording API. (flutter/engine#49480) 2024-01-11 15619084+vashworth@users.noreply.github.com Run tests on macOS 13 only (flutter/engine#49722) 2024-01-11 jonahwilliams@google.com [Impeller] dont accidentally copy shared ptr. (flutter/engine#49731) 2024-01-11 skia-flutter-autoroll@skia.org Roll Skia from 56f28f137507 to b63e5a3c14e1 (1 revision) (flutter/engine#49733) 2024-01-11 jason-simmons@users.noreply.github.com Revert Dart SDK to 3245b92a5930 (flutter/engine#49727) 2024-01-11 skia-flutter-autoroll@skia.org Roll Fuchsia Linux SDK from gJuGKaWsKazIrvQeO... to Klxww53tA4-TG5pA9... (flutter/engine#49730) 2024-01-11 103135467+sealesj@users.noreply.github.com Google GitHub mirrors - reland (flutter/engine#49716) 2024-01-11 skia-flutter-autoroll@skia.org Roll Skia from 440f5f849008 to 56f28f137507 (1 revision) (flutter/engine#49723) 2024-01-11 103135467+sealesj@users.noreply.github.com Osv-scanner integration (flutter/engine#49470) 2024-01-11 jason-simmons@users.noreply.github.com [Impeller] Add DrawLine/DrawOval/ClipOval operations to the canvas recorder (flutter/engine#49697) 2024-01-11 skia-flutter-autoroll@skia.org Roll Dart SDK from 7f3ea1c3bc27 to 6f5bd26b293d (1 revision) (flutter/engine#49720) 2024-01-11 dnfield@google.com Try to get GLES tests running... (flutter/engine#49701) 2024-01-11 skia-flutter-autoroll@skia.org Roll Skia from 2a5710618fc3 to 440f5f849008 (1 revision) (flutter/engine#49715) 2024-01-11 skia-flutter-autoroll@skia.org Roll Skia from c46008e5a839 to 2a5710618fc3 (1 revision) (flutter/engine#49712) 2024-01-11 skia-flutter-autoroll@skia.org Roll Dart SDK from 3245b92a5930 to 7f3ea1c3bc27 (1 revision) (flutter/engine#49710) Also rolling transitive DEPS: fuchsia/sdk/core/linux-amd64 from gJuGKaWsKazI to Klxww53tA4-T If this roll has caused a breakage, revert this CL and stop the roller using the controls here: https://autoroll.skia.org/r/flutter-engine-flutter-autoroll Please CC bdero@google.com,rmistry@google.com,zra@google.com on the revert to ensure that a human is aware of the problem. To file a bug in Flutter: https://github.com/flutter/flutter/issues/new/choose To report a problem with the AutoRoller itself, please file a bug: https://issues.skia.org/issues/new?component=1389291&template=1850622 Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
Part of flutter/flutter#140804 proposes removing the Command object. This demonstrates the new API without changing the implementation, which can be done separately per backend. This also doesn't change every single callsite for reasons...