Skip to content
This repository was archived by the owner on Feb 25, 2025. It is now read-only.

[Impeller] Begin migrating to RenderPass command recording API. #49480

Merged
merged 10 commits into from
Jan 11, 2024

Conversation

jonahwilliams
Copy link
Member

@jonahwilliams jonahwilliams commented Jan 3, 2024

Part of flutter/flutter#140804 proposes removing the Command object. This demonstrates the new API without changing the implementation, which can be done separately per backend. This also doesn't change every single callsite for reasons...

@flutter-dashboard
Copy link

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact "@test-exemption-reviewer" in the #hackers channel in Chat (don't just cc them here, they won't see it! Use Discord!).

If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix?

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.

@bdero
Copy link
Member

bdero commented Jan 4, 2024

Do you want to land this, or is this just for demo purposes?

auto result = AddCommand(std::move(pending_));
pending_ = Command{};
return result;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are probably opportunities to cut down API traffic by not having draw calls reset all state, but starting with this seems like a good idea.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah in fact I think you mentioned this earlier w.r.t. the viewport and scissor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this change is mostly pretty dumb as its keeping the exact same implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mind keeps wandering to this patch tonight, so figured I'd jot down my thonks in more detail. ;) (No action necessary)

I think the winning long game here is to have no implicit draw state management behavior in RenderPass at all (except for the default state, of course).

It would be great if we end up in a place where we don't implicitly reset any state when recording a draw call, since that would maximally open the door for us to optimize traffic within high context parts of the pass workload (for example, any Contents that performs multiple draws).

Of course, we'd still probably want utilities like RenderPass::ResetState() for low context transitions (like going from one Contents to another).
The refactor in Entities for this would be ezpz. All we'd need to get started is something like:

  • Refactor Contents to use a stub pattern -- non-virtual Contents::Render() calls virtual Contents::OnRender() -- and have Contents::Render() call render_pass->ResetState() (or maybe have an Entities-specific utility for this that ignores stuff like viewport/scissor because we know we're not using them right now) just before dispatching to Contents::OnRender().
  • Then, for each Contents::OnRender() impl we have that records multiple draws, don't reset stuff in cases where it's obviously unnecessary. For example, if the draws are using the same pipeline, you probably don't need to reset anything at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with not resetting the state. Really, that is just working against ourselves since we also have the pass binding ccache essentially undoing that work at the render pass level. I suspect we will still need some state tracking for bindings, but perhaps we can get sufficiently clever as to make that irrelevant.

cmd.pipeline = renderer.GetSolidFillPipeline(options);
cmd.BindVertices(std::move(geometry_result.vertex_buffer));

#ifdef IMPELLER_DEBUG
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be automated away by performing this ifdef in SetCommandLabel?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we make this accept a string_view, then there won't be any copy if it is passed a const char*, and we can ifdef out before setting the label.

const SampledImageSlot& slot,
const ShaderMetadata& metadata,
std::shared_ptr<const Texture> texture,
std::shared_ptr<const Sampler> sampler) override;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API to encode commands on the RenderPass directly LGTM. (And the generated reflection headers will also change to operate on the RenderPass instead of Command, of course.)

Copy link
Member

@gaaclarke gaaclarke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we ever want multiple command buffers for one render pass?

I'm not a huge fan of multiplexing Command and RenderPass. Seems like we could make a rule that Command either moves or has references to its fields and get the same performance with a more clear abstraction. That may mean bending the rule against rvalue in the style guide though.

Having 2 concepts and dissolving one concept and making an uber concept, ignoring performance, seems like we are losing something.

@jonahwilliams
Copy link
Member Author

We're not going to have two concepts, there will only be the render pass. But that is a big change, and so this PR is keeping around command until I remove it.

@jonahwilliams
Copy link
Member Author

Do you want to land this, or is this just for demo purposes?

I think I can either land the migration in chunks or all at once, whatever you'd prefer to review.

@gaaclarke
Copy link
Member

We're not going to have two concepts, there will only be the render pass. But that is a big change, and so this PR is keeping around command until I remove it.

I meant having a [RenderPass, Command] and going to [RenderPass'], where RenderPass' encompasses all the concepts that [RenderPass, Command] represented.

Looking at this, we are just moving the fields from Command to the RenderPass and replacing the scope of the Command object with the requirement of calling Dispatch(). Having to manually scope the concept of Command instead of having it scoped over an object seems like a step backwards.

I thought you were you were just going to remove the std::vector<Command> from the RenderPass and interpret the Commands directly so they don't have to be copied, they can just live on the stack for the whole encoding as rvalues.

@jonahwilliams
Copy link
Member Author

I tried to propose removing the bindings from the cmd so that we didn't have hundreds of tiny heap allocations, but you and @chinmaygarde very specifically did not like that API. We can't reduce the cost of the heap allocations by moving the cmd, instead we need to bind to the render pass where we can either avoid the heap allocations or use one bigger heap allocation.

Now how specifically we split up the command state TBD.

@jonahwilliams
Copy link
Member Author

Althrough Viewport and Scissor are very specifically cmd buffer state and not really cmd state. We should be able to set the viewport and scissor and not set it every single cmd.

@gaaclarke
Copy link
Member

gaaclarke commented Jan 4, 2024

If you had:

fml::Status RenderPass::AddCommand(Command&& cmd) {
  SetPipeline(cmd.pipeline);
  //..
  return {};
}

The heap allocations would be equivalent to what you have now but you wouldn't have removed the concept of Command or expanded the RenderPass public api. Command isn't the problem, it's how we are using it.

edit: Also the fact that Command is copying fields. But if Command becomes ephemeral like this, just living on the stack, we can safely not copy.

bool SetVertexBuffer(VertexBuffer buffer);

/// Record the currently pending command.
bool Dispatch();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use fml::Status instead of using bool.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bool all the way up the stack, I could change this to status but then I'd convert from fml::status to bool in each contents or complete the migration.

I like the idea of fml::status in general, but I'm not sure if this is the right usage for it. Right now Draw() essentially error checks for two conditions:

  1. If you set a scissor (which we never do) is it a valid scissor?
  2. Is the pipeline object defined before draw was called?

I think the first check should be moved into the setter, but really that is something we should VALIDATION_LOG and not worry about. The second one seems like it should be an FML_DCHECK.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine if this gets converted to bool in the next level up. I have an issue out to convert all of these bools over to fml::Status. We need to get these all migrated at some point bool is ambiguous. So even at just this one layer it's worthwhile. Once everything is using it and StatusOr, magic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - i'll be able to make the message better once I finish migrating the API since I can just use the validation message.

@jonahwilliams
Copy link
Member Author

The heap allocations would be equivalent to what you have now but you wouldn't have removed the concept of Command or expanded the RenderPass public api. Command isn't the problem, it's how we are using it.

You're missing the bindings. The std::vectors in the bindings need to go. if you want to see an example for how that works, you can check out https://github.com/flutter/engine/compare/main...jonahwilliams:engine:testbed?expand=1 where I've complely wired this up for Vulkan.

@gaaclarke
Copy link
Member

Right

Command cmd;
FS::BindFragInfo(cmd, pass.GetTransientsBuffer().EmplaceUniform(frag_info));
pass.AddCommand(std::move(cmd));

becomes:

FS::BindFragInfo(pass, pass.GetTransientsBuffer().EmplaceUniform(frag_info));
pass.Dispatch();

Where

struct TextureAndSampler {
  SampledImageSlot slot;
  TextureResource texture;
  std::shared_ptr<const Sampler> sampler;
};

struct BufferAndUniformSlot {
  ShaderUniformSlot slot;
  BufferResource view;
};

struct Bindings {
  std::vector<TextureAndSampler> sampled_images;
  std::vector<BufferAndUniformSlot> buffers;
};

struct Command {
  Bindings fragment_bindings;
};

You can get the equivalent performance and keep the concept of the Command by making the Command only ever live on the stack.

struct TextureAndSampler {
  TextureAndSampler(const TextureAndSampler&) = delete;
  SampledImageSlot slot;
  TextureResource texture;
  std::shared_ptr<const Sampler> sampler;
};

struct BufferAndUniformSlot {
  BufferAndUniformSlot(const BufferAndUniformSlot&) = delete;
  ShaderUniformSlot slot;
  BufferResource view;
};

struct Bindings {
  Bindings(const Bindings&) = delete;
  int num_sampled_images;
  std::array<TextureAndSampler, kMaxSampledImages> sampled_images;
  int num_buffers;
  std::array<BufferAndUniformSlot, kMaxBuffers> buffers;
};

struct Command {
  Command(const Command&) = delete;
  Bindings fragment_bindings;
};

The benefit is that it doesn't blow out the RenderPass's public API and we don't have to manually scope objects with Dispatch() calls. There is an object that encompasses that scope.

@jonahwilliams
Copy link
Member Author

Right, that is essentially what this next change does, but it puts the binding state arrays on the render pass. I guess I just disagree that keeping cmd is a better API.

@gaaclarke
Copy link
Member

Can you explain why you think Dispatch() is preferable to Command? Dispatch() is the main source of my complaint.

The following usage of the proposed API is valid:

FS::BindFragInfo(pass, pass.GetTransientsBuffer().EmplaceUniform(frag_info1));
// pass.Dispatch();
FS::BindFragInfo(pass, pass.GetTransientsBuffer().EmplaceUniform(frag_info2));
pass.Dispatch();

But it is a logical bug. With Command you can't get it wrong, it better communicates how the API is supposed to be used.

Also, having 2 concepts seems cleaner since the size of the public API of RenderPass is minimized. You end up with a hierarchy of concepts where RenderPass uses a Command, which is a useful way to think about it.

@jonahwilliams
Copy link
Member Author

Its not that I think Dispatch is better than AddCommand. But dispatch is a natural consequence of splitting up the command state parts. Some specific points:

  1. Some cmd state, like stencil/viewport/scissor is actually render pass state and not per-command state. Today, we need every single command to set its stencil reference - even though only clips/render pass start/end modify it. With per-command buffer state we could remove this responsibility from many indivudal content objects.

  2. Splitting up the cmd state allows the setters to pass through directly to the native command buffer. This doesn't matter so much for most fields, but specifically for labels it means we should be able to lose all the weird struct ifdefing and ifdef in the implementation. We could change the API to only accept std::string_view and ensure no allocations are happening even with labels.

  3. (Maybe) I think it will eventually be possible to map the bindings directly to the cmd buffer, which means we could entirely drop the TextureAndSampler/BufferAndUniformSlot structs and let the backend read the binding information it actually needs. This may not be possible and may not have a substantial performance improvement.

So we don't necessarily have to go all the way like I've done now. We could split up comand differently or 🤷‍♂️ .

But it is a logical bug. With Command you can't get it wrong, it better communicates how the API is supposed to be used.

I would like our APIs to be easy to use, but I'd like to lean all the way on the performance spectrum. Most end users will never write a flutter app, and most flutter app developers will never touch the engine.

That said, after discussing this, I'm not really convinced that this API is necessary to maximize the potential performance improvement. So I think I'm going to icebox this for now and come back to it if there is something I can't fix.

@jonahwilliams jonahwilliams marked this pull request as draft January 4, 2024 23:54
return pending_.BindVertices(std::move(buffer));
}

bool RenderPass::Dispatch() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: RenderPass::Draw(), perhaps? :)

@gaaclarke
Copy link
Member

Ahh yea, eliminating the heap allocations doesn't solve the problem that the Command abstraction is wasteful for things like stencil/viewport. They're 2 separate issues. Pulling some (or all) of that stuff out of Command makes sense to me. It doesn't hurt to let design changes marinate if it isn't blocking performance, sgtm.

@jonahwilliams
Copy link
Member Author

I investigated using the stack based allocation strategy you described and unfortunately it was a large regression:

#49569

Stack or no stack, allocation 1KB per command is too expensive.

@jonahwilliams jonahwilliams marked this pull request as ready for review January 8, 2024 19:38
@bdero
Copy link
Member

bdero commented Jan 8, 2024

It looks like there are some unaddressed comments, but I think regardless of how we cut it, our lowest level/de-facto command encoding interface aught to be granular state updates like the interface that's being proposed here. The graphics APIs we target all avoid waste by recording granular state changes, and we stand to eliminate waste by doing the same. Renderers are then free to optimize for performing fewer state updates.

For how this interface ultimately communicates with the backend, there are a few options, but two are:

  • Least overhead: Eliminate the temp buffer on Metal/Vulkan. Record everything to the actual actual backend buffer directly. It sounds like this is what @jonahwilliams is hoping to achieve, which is great.
  • Keep an intermediary buffer, but reduce memory/time waste. Make our state updates more granular, reducing the temp buffer size and getting rid of unnecessary backend API traffic and/or state delta checks. A solution for this kind of temp buffer is mentioned in [Impeller] Make RenderPass commands more granular. flutter#133179.

If we wanted to preserve the behavior of AddCommand(Command&&) after removing/making the temp buffer more efficient (even if just temporarily to make refactoring easier), we could easily do so by having AddCommand translate into this new lower level interface.

@jonahwilliams jonahwilliams requested a review from bdero January 10, 2024 16:56
Copy link
Member

@bdero bdero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

bool BindResource(ShaderStage stage,
const ShaderUniformSlot& slot,
const ShaderMetadata& metadata,
BufferView view) override;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these operations are idempotent except for the resource bindings which are additive, so we'll probably want something like a ResetBindings specifically if we eventually don't force bindings to get reset for every draw. (No action needed, just noting that's why Flutter GPU has this)

@jonahwilliams
Copy link
Member Author

They're current idempotent, but that will change once I start mapping things directly to the command buffer. I'm not sure how we do a reset in that case? you can't bind nothing, right?

@bdero
Copy link
Member

bdero commented Jan 10, 2024

The command resource binding methods append a binding to a vector, but that vector isn't accessible in your stateful API, so AFAICT there'd be no way to remove binding slots if you were to get rid of the state resetting behavior of Draw() (again, not necessary to solve this now).

@jonahwilliams jonahwilliams changed the title [Impeller] create example of new cmd API. [Impeller] Begin migrating to RenderPass command recording API. Jan 11, 2024
@jonahwilliams jonahwilliams added the autosubmit Merge PR when tree becomes green via auto submit App label Jan 11, 2024
@auto-submit auto-submit bot merged commit cded76e into flutter:main Jan 11, 2024
@jonahwilliams jonahwilliams deleted the example-of-new-cmd-api branch January 11, 2024 23:27
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Jan 11, 2024
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Jan 12, 2024
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Jan 12, 2024
auto-submit bot pushed a commit to flutter/flutter that referenced this pull request Jan 12, 2024
…141432)

flutter/engine@ecdaed7...44a0a6e

2024-01-12 skia-flutter-autoroll@skia.org Roll Skia from d65d18ac1e8d to e9de42588e54 (1 revision) (flutter/engine#49738)
2024-01-11 matanlurey@users.noreply.github.com Add `SurfaceTextureSurfaceProducer` (flutter/engine#49653)
2024-01-11 skia-flutter-autoroll@skia.org Roll Skia from b63e5a3c14e1 to d65d18ac1e8d (1 revision) (flutter/engine#49735)
2024-01-11 jonahwilliams@google.com [Impeller] Begin migrating to RenderPass command recording API. (flutter/engine#49480)
2024-01-11 15619084+vashworth@users.noreply.github.com Run tests on macOS 13 only (flutter/engine#49722)
2024-01-11 jonahwilliams@google.com [Impeller] dont accidentally copy shared ptr. (flutter/engine#49731)
2024-01-11 skia-flutter-autoroll@skia.org Roll Skia from 56f28f137507 to b63e5a3c14e1 (1 revision) (flutter/engine#49733)
2024-01-11 jason-simmons@users.noreply.github.com Revert Dart SDK to 3245b92a5930 (flutter/engine#49727)
2024-01-11 skia-flutter-autoroll@skia.org Roll Fuchsia Linux SDK from gJuGKaWsKazIrvQeO... to Klxww53tA4-TG5pA9... (flutter/engine#49730)
2024-01-11 103135467+sealesj@users.noreply.github.com Google GitHub mirrors - reland (flutter/engine#49716)
2024-01-11 skia-flutter-autoroll@skia.org Roll Skia from 440f5f849008 to 56f28f137507 (1 revision) (flutter/engine#49723)
2024-01-11 103135467+sealesj@users.noreply.github.com Osv-scanner integration (flutter/engine#49470)
2024-01-11 jason-simmons@users.noreply.github.com [Impeller] Add DrawLine/DrawOval/ClipOval operations to the canvas recorder (flutter/engine#49697)
2024-01-11 skia-flutter-autoroll@skia.org Roll Dart SDK from 7f3ea1c3bc27 to 6f5bd26b293d (1 revision) (flutter/engine#49720)
2024-01-11 dnfield@google.com Try to get GLES tests running... (flutter/engine#49701)
2024-01-11 skia-flutter-autoroll@skia.org Roll Skia from 2a5710618fc3 to 440f5f849008 (1 revision) (flutter/engine#49715)
2024-01-11 skia-flutter-autoroll@skia.org Roll Skia from c46008e5a839 to 2a5710618fc3 (1 revision) (flutter/engine#49712)
2024-01-11 skia-flutter-autoroll@skia.org Roll Dart SDK from 3245b92a5930 to 7f3ea1c3bc27 (1 revision) (flutter/engine#49710)

Also rolling transitive DEPS:
  fuchsia/sdk/core/linux-amd64 from gJuGKaWsKazI to Klxww53tA4-T

If this roll has caused a breakage, revert this CL and stop the roller
using the controls here:
https://autoroll.skia.org/r/flutter-engine-flutter-autoroll
Please CC bdero@google.com,rmistry@google.com,zra@google.com on the revert to ensure that a human
is aware of the problem.

To file a bug in Flutter: https://github.com/flutter/flutter/issues/new/choose

To report a problem with the AutoRoller itself, please file a bug:
https://issues.skia.org/issues/new?component=1389291&template=1850622

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
autosubmit Merge PR when tree becomes green via auto submit App e: impeller
Projects
No open projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

3 participants