Skip to content
This repository was archived by the owner on Feb 25, 2025. It is now read-only.

[Impeller] Use a squircle-sdf-based algorithm for fast blurs #55604

Merged
merged 10 commits into from
Oct 4, 2024

Conversation

flar
Copy link
Contributor

@flar flar commented Oct 3, 2024

Use a Squircle Signed Distance Field based algorithm to do a very fast approximation of rrect blurs.

This PR is to provide reference within the team to discuss the future of the algorithm compared to the Gaussian approximation functions that are currently in use. It isn't a complete solution, but can be completed easily with a little more work.

Notably, it doesn't handle elliptical round rects, only circular corners.
Could stand to include an attribution to the source (https://raphlinus.github.io/graphics/2020/04/21/blurred-rounded-rects.html)

@flutter-dashboard
Copy link

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption, contact "@test-exemption-reviewer" in the #hackers channel in Discord (don't just cc them here, they won't see it!).

If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix?

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing. The test exemption team is a small volunteer group, so all reviewers should feel empowered to ask for tests, without delegating that responsibility entirely to the test exemption group.

@flar flar requested a review from gaaclarke October 3, 2024 03:25
@gaaclarke
Copy link
Member

The falloff looks a bit abrupt with the current math. I'm working on getting the side-by-side to verify.

Screenshot 2024-10-03 at 10 23 29 AM

@gaaclarke
Copy link
Member

Here is the golden changes: https://flutter-engine-gold.skia.org/search?issue=55604&crs=github&patchsets=2&corpus=flutter-engine. It looks really close. Here is one of the tests that show the oval not working though (see green shapes):

Screenshot 2024-10-03 at 11 24 23 AM

@gaaclarke
Copy link
Member

Here is an example of some of the artifacts this removes:
Screenshot 2024-10-03 at 11 36 48 AM

@gaaclarke
Copy link
Member

I was chatting with jonah in office. He was saying for ovals we should maybe just fall back to the full gaussian blur and drop the old rrect_blur completely (instead of keeping it around for ovals).

@gaaclarke
Copy link
Member

gaaclarke commented Oct 3, 2024

I've came up with a test that shows what we need to fix to get this landed. There is the known issue of the oval which we knew about needs to fall back to regular blur, but also looks like small sigmas aren't rendered. We can probably just clamp that to 1:

TEST_P(AiksTest, SolidColorCirclesOvalsRRectsMaskBlurTinySigma) {
  auto callback = [&]() -> sk_sp<DisplayList> {
    if (AiksTest::ImGuiBegin("Controls", nullptr,
                             ImGuiWindowFlags_AlwaysAutoResize)) {
      ImGui::End();
    }
    DisplayListBuilder builder;
    builder.Scale(GetContentScale().x, GetContentScale().y);

    std::vector<float> sigmas = {0.0, 0.01, 1.0};
    std::vector<DlColor> colors = {DlColor::kGreen(), DlColor::kYellow(),
                                   DlColor::kRed()};
    for (uint32_t i = 0; i < sigmas.size(); ++i) {
      DlPaint paint;
      paint.setColor(colors[i]);
      paint.setMaskFilter(
          DlBlurMaskFilter::Make(DlBlurStyle::kNormal, sigmas[i]));

      builder.Save();
      builder.Translate(100 + (i * 100), 100);
      SkRRect rrect =
          SkRRect::MakeRectXY(SkRect::MakeXYWH(0, 0, 60.0f, 160.0f), 50, 100);
      builder.DrawRRect(rrect, paint);
      builder.Restore();
    }

    return builder.Build();
  };

  ASSERT_TRUE(OpenPlaygroundHere(callback));
}

expected output

Screenshot 2024-10-03 at 1 25 35 PM

seen output

Screenshot 2024-10-03 at 1 28 49 PM

@gaaclarke
Copy link
Member

@flar I can't remember where you asked but we do have a rrect blur benchmark in flutter/dev/benchmarks/macrobenchmarks, it was added here: flutter/flutter#149261

@gaaclarke
Copy link
Member

There are artifacts in the benchmark too, this is probably the same issue we see above with the small sigma.

sweepartifact.mov

@gaaclarke
Copy link
Member

The render problem seems to happen around sigma=0.57 and I believe it is because of frag_info.scale == 0. That's calculated here:

  frag_info.scale =
      0.5 * computeErf7(frag_info.sInv * 0.5 *
                        (std::max(rSize.x, rSize.y) - 0.5 * radius));

In computeErf7, at the sqrt x ~ 2e19 so x * x = inf.

static Scalar computeErf7(Scalar x) {
  x *= kTwoOverSqrtPi;
  float xx = x * x;
  x = x + (0.24295 + (0.03395 + 0.0104 * xx) * xx) * (x * xx);
  return x / sqrt(1.0 + x * x);
}

@gaaclarke
Copy link
Member

Here's the measurements:

before

after

skia

@gaaclarke
Copy link
Member

Funny thing though is there is a bug in skia like the one I had to find for rendering small sigmas.

@gaaclarke
Copy link
Member

gaaclarke commented Oct 4, 2024

Here is the regular gaussian blur for reference. It looks about the same gpu-wise with the current rrect_blur, but the gaussian blur would sometimes have cpu hiccups.

@gaaclarke
Copy link
Member

Funny thing though is there is a bug in skia like the one I had to find for rendering small sigmas.

Skia issue filed: https://g-issues.skia.org/issues/371571787

Copy link
Member

@gaaclarke gaaclarke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@gaaclarke
Copy link
Member

@jonahwilliams do you want to give this another look? I've contributed to it quite a bit so might be good to have @flar look over my additions or another person. I tried to avoid touching jims code as much as possible. I made the following changes:

  1. Fixed ios compilation
  2. Fixed bug with small sigmas
  3. Added a test that show we don't use this for ovals
  4. Added a test to show the smalls sigmas work now

@gaaclarke
Copy link
Member

I'm not sure what is going on with the golden images. They aren't showing up yet but I think they should have been processed. I looked at previous versions of this PR and they all looked good to me except when the corner radii are not equal, of course.

@jonahwilliams
Copy link
Member

Taking a look

@jonahwilliams
Copy link
Member

They bot won't ding until Mac mac_unopt finished. Otherwise maybe they are identical?

@flutter-dashboard
Copy link

Golden file changes have been found for this pull request. Click here to view and triage (e.g. because this is an intentional change).

If you are still iterating on this change and are not ready to resolve the images on the Flutter Gold dashboard, consider marking this PR as a draft pull request above. You will still be able to view image results on the dashboard, commenting will be silenced, and the check will not try to resolve itself until marked ready for review.

Changes reported for pull request #55604 at sha e511f69

@jonahwilliams
Copy link
Member

A suggestion based on @gaaclarke 's work to vectorize the previous blur:

diff --git a/impeller/entity/shaders/rrect_blur.frag b/impeller/entity/shaders/rrect_blur.frag
index d0bba43561..c5368daf6a 100644
--- a/impeller/entity/shaders/rrect_blur.frag
+++ b/impeller/entity/shaders/rrect_blur.frag
@@ -12,7 +12,7 @@ precision highp float;
 #include <impeller/types.glsl>
 
 uniform FragInfo {
-  f16vec4 color;
+  vec4 color;
   vec2 center;
   vec2 adjust;
   float minEdge;
@@ -26,7 +26,7 @@ frag_info;
 
 in vec2 v_position;
 
-out f16vec4 frag_color;
+out vec4 frag_color;
 
 const float kTwoOverSqrtPi = 2.0 / sqrt(3.1415926);
 
@@ -34,19 +34,17 @@ float maxXY(vec2 v) {
   return max(v.x, v.y);
 }
 
-// use crate::math::compute_erf7;
-float computeErf7(float x) {
-  x *= kTwoOverSqrtPi;
-  float xx = x * x;
-  x = x + (0.24295 + (0.03395 + 0.0104 * xx) * xx) * (x * xx);
+vec2 computeErf7(vec2 x) {
+  x *= vec2(kTwoOverSqrtPi);
+  vec2 xx = x * x;
+  x = x + (vec2(0.24295) + (vec2(0.03395) + vec2(0.0104) * xx) * xx) * (x * xx);
   return x / sqrt(1.0 + x * x);
 }
 
 // The length formula, but with an exponent other than 2
 float powerDistance(vec2 p) {
-  float xp = pow(p.x, frag_info.exponent);
-  float yp = pow(p.y, frag_info.exponent);
-  return pow(xp + yp, frag_info.exponentInv);
+  vec2 pp = pow(p, vec2(frag_info.exponent));
+  return pow(pp.x + pp.y, frag_info.exponentInv);
 }
 
 void main() {
@@ -55,9 +53,10 @@ void main() {
   float dPos = powerDistance(max(adjusted, 0.0));
   float dNeg = min(maxXY(adjusted), 0.0);
   float d = dPos + dNeg - frag_info.r1;
+
+  vec2 erfs = computeErf7(vec2(frag_info.sInv * (frag_info.minEdge + d), frag_info.sInv * d));
   float z =
-      frag_info.scale * (computeErf7(frag_info.sInv * (frag_info.minEdge + d)) -
-                         computeErf7(frag_info.sInv * d));
+      frag_info.scale * (erfs.x - erfs.y);
 
-  frag_color = frag_info.color * float16_t(z);
+  frag_color = frag_info.color * z;
 }

@@ -498,6 +498,11 @@ bool Canvas::AttemptDrawBlurredRRect(const Rect& rect,
return false;
}

// The current rrect blur math doesn't work on ovals.
if (fabsf(corner_radii.width - corner_radii.height) > kEhCloseEnough) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding this case you could also adjust the logic in IsNearlySimpleRRect to force all corner radii to be equal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that would also eliminate fast tessellation of simple rrects. The only deficiency here is that we can no longer fast_blur them so it's a test specific to this method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agh, yes don't do that then! :)

Copy link
Member

@jonahwilliams jonahwilliams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gaaclarke
Copy link
Member

A suggestion based on @gaaclarke 's work to vectorize the previous blur:

Jim and I were looking at this a bit yesterday and I poked around a bit this morning. We should do this in a follow up PR to establish this as the baseline.

@gaaclarke gaaclarke added the autosubmit Merge PR when tree becomes green via auto submit App label Oct 4, 2024
@auto-submit auto-submit bot merged commit bc5f9fc into flutter:main Oct 4, 2024
33 checks passed
@flar
Copy link
Contributor Author

flar commented Oct 4, 2024

A suggestion based on @gaaclarke 's work to vectorize the previous blur:

Jim and I were looking at this a bit yesterday and I poked around a bit this morning. We should do this in a follow up PR to establish this as the baseline.

Another optimization is that we don't need to subtract the center inside the shader, that should be done by changing the initial mapping of the coordinates, then the shader just needs to do abs(position) - adjust.

Actually, if we were willing to emit 4 quads then we could do that abs(position) - adjust on the CPU by setting up the 4 quads with their absolute (adjusted) values.

@jonahwilliams
Copy link
Member

I would do the subtraction in the vertex shader, that would only execute ~4 times or so. But the subtraction is likely not going to add much overhead

@flar
Copy link
Contributor Author

flar commented Oct 4, 2024

I would do the subtraction in the vertex shader, that would only execute ~4 times or so. But the subtraction is likely not going to add much overhead

That's not needed, you just have to change 4 values in the C++ code and all of the coordinates will be correct without any more work in the vertex shader. Right now it has to do math to make everything relative to the rect origin, just swap in rect center instead and it's good to go.

Could the vertex shader convert the 1 quad into 4 to do the quadrant slicing? Actually if it could slice it (1 -> 9) then we could get rid of the abs, the -adjust and the subsequent elimination of the negative values in the fragment shader.

@jonahwilliams
Copy link
Member

You could slice up the geometry however you'd like

@jonahwilliams
Copy link
Member

If you want data to vary per vertex or group of vertexes you'd usually do that by defining a new varying instead of using a uniform.

@gaaclarke
Copy link
Member

So I tried to vectorize this and tried some other tricks. Vectorization didn't pay off. The patch above actually makes things slower, but for a large part of the output space powerDistance is just a slow version of length when the exponent is 2. This happens for rendering circles or any sufficiently blurred rrect. This is 15% faster than what we landed with an average gpu time of 3.82ms for the benchmark.

I tried to branch (if exponent == 2.0)) in the fragment shader but that wasn't any faster. We would have to probably maintain a separate fragment shader for this special case.

For reference, our exponents appear to be in the range [2, 3.5].

--- a/impeller/entity/shaders/rrect_blur.frag
+++ b/impeller/entity/shaders/rrect_blur.frag
@@ -44,9 +44,13 @@ float computeErf7(float x) {
 
 // The length formula, but with an exponent other than 2
 float powerDistance(vec2 p) {
-  float xp = pow(p.x, frag_info.exponent);
-  float yp = pow(p.y, frag_info.exponent);
-  return pow(xp + yp, frag_info.exponentInv);
+    return length(p);
 }

@gaaclarke
Copy link
Member

You could also use specialization constants to avoid having a separate shader but those only exist in vulkan I think.

@jonahwilliams
Copy link
Member

spec constants are supported for all backends. I would not pursue a specialization constant for this.

@gaaclarke
Copy link
Member

spec constants are supported for all backends. I would not pursue a specialization constant for this.

What's your take in general on trying to get that 15% boost on circles and really blurry rounded rects? I know adding shaders has a maintenance and startup time cost.

@jonahwilliams
Copy link
Member

Doesn't seem worthwhile

auto-submit bot pushed a commit to flutter/flutter that referenced this pull request Oct 4, 2024
…156252)

flutter/engine@eece6c3...bc5f9fc

2024-10-04 flar@google.com [Impeller] Use a squircle-sdf-based algorithm for fast blurs (flutter/engine#55604)
2024-10-04 skia-flutter-autoroll@skia.org Roll Skia from 59f512b47cc0 to 4aff9603622d (1 revision) (flutter/engine#55663)

If this roll has caused a breakage, revert this CL and stop the roller
using the controls here:
https://autoroll.skia.org/r/flutter-engine-flutter-autoroll
Please CC matanl@google.com,zra@google.com on the revert to ensure that a human
is aware of the problem.

To file a bug in Flutter: https://github.com/flutter/flutter/issues/new/choose

To report a problem with the AutoRoller itself, please file a bug:
https://issues.skia.org/issues/new?component=1389291&template=1850622

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
@gaaclarke
Copy link
Member

For future reference I measured with the fragment shader just setting a solid color and the GPU time was 1.6ms, so that's the floor. Considering that; the length variant is 24% faster (1-(3.82-1.6)/(4.54-1.6)).

Doesn't seem worthwhile

I'm fine with that estimation, we can always revisit if we need to.

@flar
Copy link
Contributor Author

flar commented Oct 4, 2024

So I tried to vectorize this and tried some other tricks. Vectorization didn't pay off. The patch above actually makes things slower, but for a large part of the output space powerDistance is just a slow version of length when the exponent is 2. This happens for rendering circles or any sufficiently blurred rrect. This is 15% faster than what we landed with an average gpu time of 3.82ms for the benchmark.

That's the definition of a squircle and why this algorithm works. Yes, we can replace those powers by "2" and just use length, but then they aren't plotting a squircle, they are plotting a circle - and a circle is not an approximation of a blurred circle/rrect.

Ah, I see, you are suggesting testing the constant to see if it happens to be "2", how often is that the case?

@gaaclarke
Copy link
Member

Ah, I see, you are suggesting testing the constant to see if it happens to be "2", how often is that the case?

The exponent is 2 when:

  • if corner_radius >= 50
  • if corner_radius = 25 && blur_radius >= 26
  • if corner_radius = 1 && blur_radius >= 30.9

So, about half of the output space.

@gaaclarke
Copy link
Member

gaaclarke commented Oct 4, 2024

@jonahwilliams
Copy link
Member

This is a phenomenal improvement - Great work @flar + @gaaclarke !

nick9822 pushed a commit to nick9822/flutter that referenced this pull request Dec 18, 2024
…/engine#55604)

Use a Squircle Signed Distance Field based algorithm to do a very fast approximation of rrect blurs.

This PR is to provide reference within the team to discuss the future of the algorithm compared to the Gaussian approximation functions that are currently in use. It isn't a complete solution, but can be completed easily with a little more work.

Notably, it doesn't handle elliptical round rects, only circular corners.
Could stand to include an attribution to the source (https://raphlinus.github.io/graphics/2020/04/21/blurred-rounded-rects.html)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
autosubmit Merge PR when tree becomes green via auto submit App e: impeller will affect goldens
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants