Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor display: contents optimizations #1736

Closed
wants to merge 1 commit into from

Conversation

NickGerleman
Copy link
Contributor

Summary:
LayoutableChildren<yoga::Node>::Iterator showed up to a surprising extent on a recent trace. Part of this was during pixel grid rounding, which does full tree traversal (we should fix that...), where the iterator is the first thing to read from the node.

I ran Yoga microbenchmark with Yoga compiled with -O2, where we saw a regression of synthetic performance by ~10%, but it turns out this build also had ASAN and some other heavy bits enabled, so the real impact was quite lower (~6%).

I was able to make some optimizations in the meantime against that, which still show some minor wins, reducing that overhead to ~4% in the properly optimized build (and a bit more before that). This is still measurable on the beefy server, and the code is a bit cleaner, so let's commit these!

This change makes a few different optimizations

  1. Removes redundant copies
  2. Removes redundant index keeping
  3. Mark which branches are likely vs unlikely
  4. Shrink iterator size from 6 pointers to 3 pointers
  5. Avoid usage in pixel grid rounding (so we don't need to have cache read for style)

In "Huge nested layout" example

| Before display: contents support | After display: contents support | After optimizations |
| 9.84ms | 10.39ms | 10.23ms |

Changelog: [Internal]

Differential Revision: D65336148

Copy link

vercel bot commented Nov 1, 2024

Deployment failed with the following error:

You don't have permission to create a Preview Deployment for this project.

View Documentation: https://vercel.com/docs/accounts/team-members-and-roles

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D65336148

NickGerleman added a commit to NickGerleman/react-native that referenced this pull request Nov 1, 2024
Summary:
X-link: facebook/yoga#1736


`LayoutableChildren<yoga::Node>::Iterator` showed up to a surprising extent on a recent trace. Part of this was during pixel grid rounding, which does full tree traversal (we should fix that...), where the iterator is the first thing to read from the node.

I ran Yoga microbenchmark with Yoga compiled with `-O2`, where we saw a regression of synthetic performance by ~10%, but it turns out this build also had ASAN and some other heavy bits enabled, so the real impact was quite lower (~6%).

I was able to make some optimizations in the meantime against that, which still show some minor wins, reducing that overhead to ~4% in the properly optimized build (and a bit more before that). This is still measurable on the beefy server, and the code is a bit cleaner, so let's commit these!

This change makes a few different optimizations
1. Removes redundant copies
2. Removes redundant index keeping
3. Mark which branches are likely vs unlikely
4. Shrink iterator size from 6 pointers to 3 pointers
5. Avoid usage in pixel grid rounding (so we don't need to have cache read for style)

In "Huge nested layout" example

| Before display: contents support | After display: contents support | After optimizations |
| 9.84ms | 10.39ms | 10.23ms |

Changelog: [Internal]

Differential Revision: D65336148
Summary:

X-link: facebook/react-native#47358

`LayoutableChildren<yoga::Node>::Iterator` showed up to a surprising extent on a recent trace. Part of this was during pixel grid rounding, which does full tree traversal (we should fix that...), where the iterator is the first thing to read from the node.

I ran Yoga microbenchmark with Yoga compiled with `-O2`, where we saw a regression of synthetic performance by ~10%, but it turns out this build also had ASAN and some other heavy bits enabled, so the real impact was quite lower (~6%).

I was able to make some optimizations in the meantime against that, which still show some minor wins, reducing that overhead to ~4% in the properly optimized build (and a bit more before that). This is still measurable on the beefy server, and the code is a bit cleaner, so let's commit these!

This change makes a few different optimizations
1. Removes redundant copies
2. Removes redundant index keeping
3. Mark which branches are likely vs unlikely
4. Shrink iterator size from 6 pointers to 3 pointers
5. Avoid usage in pixel grid rounding (so we don't need to have cache read for style)

In "Huge nested layout" example

| Before display: contents support | After display: contents support | After optimizations |
| 9.84ms | 10.39ms | 10.23ms |

Changelog: [Internal]

Differential Revision: D65336148
Copy link

vercel bot commented Nov 1, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
yoga-website ✅ Ready (Inspect) Visit Preview 💬 Add feedback Nov 1, 2024 8:34pm

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D65336148

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 8f69ac7.

facebook-github-bot pushed a commit to facebook/litho that referenced this pull request Nov 5, 2024
Summary:
X-link: facebook/yoga#1736

X-link: facebook/react-native#47358

`LayoutableChildren<yoga::Node>::Iterator` showed up to a surprising extent on a recent trace. Part of this was during pixel grid rounding, which does full tree traversal (we should fix that...), where the iterator is the first thing to read from the node.

I ran Yoga microbenchmark with Yoga compiled with `-O2`, where we saw a regression of synthetic performance by ~10%, but it turns out this build also had ASAN and some other heavy bits enabled, so the real impact was quite lower (~6%).

I was able to make some optimizations in the meantime against that, which still show some minor wins, reducing that overhead to ~4% in the properly optimized build (and a bit more before that). This is still measurable on the beefy server, and the code is a bit cleaner, so let's commit these!

Note that, in real scenarios, measure functions may dominate layout time, so display: contents does not mean end-to-end 4% regression, even after this change.

This change makes a few different optimizations
1. Removes redundant copies
2. Removes redundant index keeping
3. Mark which branches are likely vs unlikely
4. Shrink iterator size from 6 pointers to 3 pointers
5. Avoid usage in pixel grid rounding (so we don't need to have cache read for style)

In "Huge nested layout" example

| Before display: contents support | After display: contents support | After optimizations |
| 9.77ms | 10.39ms | 10.17ms |

Changelog: [Internal]

Reviewed By: rozele

Differential Revision: D65336148

fbshipit-source-id: 01c592771ed7accf2d87dddd5a3a9e0225098b56
facebook-github-bot pushed a commit to facebook/react-native that referenced this pull request Nov 5, 2024
Summary:
X-link: facebook/yoga#1736

Pull Request resolved: #47358

`LayoutableChildren<yoga::Node>::Iterator` showed up to a surprising extent on a recent trace. Part of this was during pixel grid rounding, which does full tree traversal (we should fix that...), where the iterator is the first thing to read from the node.

I ran Yoga microbenchmark with Yoga compiled with `-O2`, where we saw a regression of synthetic performance by ~10%, but it turns out this build also had ASAN and some other heavy bits enabled, so the real impact was quite lower (~6%).

I was able to make some optimizations in the meantime against that, which still show some minor wins, reducing that overhead to ~4% in the properly optimized build (and a bit more before that). This is still measurable on the beefy server, and the code is a bit cleaner, so let's commit these!

Note that, in real scenarios, measure functions may dominate layout time, so display: contents does not mean end-to-end 4% regression, even after this change.

This change makes a few different optimizations
1. Removes redundant copies
2. Removes redundant index keeping
3. Mark which branches are likely vs unlikely
4. Shrink iterator size from 6 pointers to 3 pointers
5. Avoid usage in pixel grid rounding (so we don't need to have cache read for style)

In "Huge nested layout" example

| Before display: contents support | After display: contents support | After optimizations |
| 9.77ms | 10.39ms | 10.17ms |

Changelog: [Internal]

Reviewed By: rozele

Differential Revision: D65336148

fbshipit-source-id: 01c592771ed7accf2d87dddd5a3a9e0225098b56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants