Minor display: contents optimizations #1736

NickGerleman · 2024-11-01T18:17:35Z

Summary:
`LayoutableChildren<yoga::Node>::Iterator` showed up to a surprising extent on a recent trace. Part of this was during pixel grid rounding, which does a full tree traversal (we should fix that...), where the iterator is the first thing to read from the node.

I ran the Yoga microbenchmark with Yoga compiled with `-O2` and saw a synthetic performance regression of ~10%, but it turns out this build also had ASAN and some other heavy bits enabled, so the real impact was quite a bit lower (~6%).

In the meantime I made some optimizations against that, which still show some minor wins, reducing the overhead to ~4% in the properly optimized build (and a bit more before that). This is still measurable on the beefy server, and the code is a bit cleaner, so let's commit these!

This change makes a few different optimizations:

1. Remove redundant copies
2. Remove redundant index keeping
3. Mark which branches are likely vs unlikely
4. Shrink the iterator from 6 pointers to 3 pointers
5. Avoid using the iterator in pixel grid rounding (so we don't need a cache read for style)
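
To illustrate the branch-hint and iterator-size items above, here is a minimal sketch of what such an iterator could look like. It is not the code in this diff: `Node`, `LayoutableIterator`, and `layoutableIds` are hypothetical stand-ins, and the layout (one node pointer, one index, one `std::forward_list` head) is only an assumption about how an iterator of roughly three pointers might be arranged. The `[[likely]]`/`[[unlikely]]` attributes need C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <utility>
#include <vector>

// Hypothetical stand-in for a layout node; not Yoga's real yoga::Node.
struct Node {
  bool contents = false;        // true when the node is display: contents
  std::vector<Node*> children;  // direct children, in order
  int id = 0;                   // only used to make the traversal observable
};

// Visits the children that take part in layout, looking through
// display: contents nodes to their descendants.
class LayoutableIterator {
 public:
  LayoutableIterator(const Node* node, std::size_t index)
      : node_(node), index_(index) {
    if (node_ != nullptr) {
      settle();
    }
  }

  const Node* operator*() const {
    assert(node_ != nullptr);
    return node_->children[index_];
  }

  LayoutableIterator& operator++() {
    ++index_;
    settle();
    return *this;
  }

  bool done() const { return node_ == nullptr; }

 private:
  // Advances until the current position is a layoutable child,
  // or marks the iterator as finished.
  void settle() {
    while (true) {
      if (index_ < node_->children.size()) {
        const Node* child = node_->children[index_];
        if (!child->contents) [[likely]] {
          return;  // common case: an ordinary child
        }
        // Rare case: remember where we were and descend into the wrapper.
        backtrack_.emplace_front(node_, index_);
        node_ = child;
        index_ = 0;
      } else if (!backtrack_.empty()) [[unlikely]] {
        // Finished a display: contents subtree; resume in its parent.
        auto [parent, parentIndex] = backtrack_.front();
        backtrack_.pop_front();
        node_ = parent;
        index_ = parentIndex + 1;
      } else {
        node_ = nullptr;  // end of traversal
        index_ = 0;
        return;
      }
    }
  }

  // Three fields; the forward_list head is typically a single pointer,
  // though its exact size is implementation-defined.
  const Node* node_;
  std::size_t index_;
  std::forward_list<std::pair<const Node*, std::size_t>> backtrack_;
};

// Collects the ids of all layoutable children of root, in visit order.
std::vector<int> layoutableIds(const Node& root) {
  std::vector<int> ids;
  for (LayoutableIterator it(&root, 0); !it.done(); ++it) {
    ids.push_back((*it)->id);
  }
  return ids;
}
```

In this sketch the backtrack list only allocates when a `display: contents` node is actually encountered, so trees without such nodes touch nothing beyond the node pointer and the index.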

Results for the "Huge nested layout" example:

| Before display: contents support | After display: contents support | After optimizations |
| --- | --- | --- |
| 9.84 ms | 10.39 ms | 10.23 ms |
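
For reference, the ~6% and ~4% figures quoted above are consistent with these timings when the first column is taken as the baseline:

```math
\frac{10.39 - 9.84}{9.84} \approx 5.6\%, \qquad \frac{10.23 - 9.84}{9.84} \approx 4.0\%
```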

Changelog: [Internal]

Differential Revision: D65336148

vercel · 2024-11-01T18:17:39Z

Deployment failed with the following error:

You don't have permission to create a Preview Deployment for this project.

View Documentation: https://vercel.com/docs/accounts/team-members-and-roles

facebook-github-bot · 2024-11-01T18:17:49Z

This pull request was exported from Phabricator. Differential Revision: D65336148


vercel · 2024-11-01T20:32:07Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
yoga-website	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Nov 1, 2024 8:34pm

facebook-github-bot · 2024-11-01T20:32:41Z

This pull request was exported from Phabricator. Differential Revision: D65336148

facebook-github-bot · 2024-11-05T19:32:08Z

This pull request has been merged in 8f69ac7.

Summary:
X-link: facebook/yoga#1736
X-link: facebook/react-native#47358

`LayoutableChildren<yoga::Node>::Iterator` showed up to a surprising extent on a recent trace. Part of this was during pixel grid rounding, which does full tree traversal (we should fix that...), where the iterator is the first thing to read from the node.

I ran Yoga microbenchmark with Yoga compiled with `-O2`, where we saw a regression of synthetic performance by ~10%, but it turns out this build also had ASAN and some other heavy bits enabled, so the real impact was quite lower (~6%).

I was able to make some optimizations in the meantime against that, which still show some minor wins, reducing that overhead to ~4% in the properly optimized build (and a bit more before that). This is still measurable on the beefy server, and the code is a bit cleaner, so let's commit these!

Note that, in real scenarios, measure functions may dominate layout time, so display: contents does not mean end-to-end 4% regression, even after this change.

This change makes a few different optimizations

1. Removes redundant copies
2. Removes redundant index keeping
3. Mark which branches are likely vs unlikely
4. Shrink iterator size from 6 pointers to 3 pointers
5. Avoid usage in pixel grid rounding (so we don't need to have cache read for style)

In "Huge nested layout" example

| Before display: contents support | After display: contents support | After optimizations |
| --- | --- | --- |
| 9.77ms | 10.39ms | 10.17ms |

Changelog: [Internal]

Reviewed By: rozele

Differential Revision: D65336148

fbshipit-source-id: 01c592771ed7accf2d87dddd5a3a9e0225098b56


facebook-github-bot added the CLA Signed label Nov 1, 2024

facebook-github-bot added the fb-exported label Nov 1, 2024

NickGerleman force-pushed the export-D65336148 branch from 53a20df to dab6486 Compare November 1, 2024 20:32

vercel bot deployed to Preview November 1, 2024 20:34 View deployment

facebook-github-bot closed this in 8f69ac7 Nov 5, 2024

facebook-github-bot added the Merged label Nov 5, 2024
