Use next sibling in SyntaxNode.GetChildPosition() if available #66876

cston · 2023-02-14T18:20:50Z

When iterating a SyntaxList.SeparatedWithManyChildren collection in reverse, use the next sibling to calculate the position in SyntaxNode.GetChildPosition() rather than relying on the previous sibling which may not be cached.

Fixes #66475

cston · 2023-02-14T18:22:23Z

cc @CyrusNajmabadi for suggesting the approach.

CyrusNajmabadi · 2023-02-14T18:24:36Z

Do you have perf before/after?

CyrusNajmabadi · 2023-02-14T18:26:00Z

src/Compilers/Core/Portable/Syntax/SyntaxList.SeparatedWithManyChildren.cs

+                    && _children[(valueIndex - 2) >> 1].Value is null
+                    && (valueIndex >= Green.SlotCount - 2 || _children[(valueIndex + 2) >> 1].Value is { });
+
+                return GetChildPosition(index, useNextNotPrevious);


i legit do not understand any of this :D

We decide whether to look at previous siblings or following siblings in GetChildPosition() by checking whether the nearest siblings are included in the _children cache. If the nearest previous sibling is not cached, but the nearest following sibling is cached, we'll look at the following siblings; otherwise, we'll use the existing behavior of looking at the previous siblings.

To check for the nearest siblings though, we need to ignore separators, because separators are not represented in the cache.
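The check described above can be modeled in isolation. The following is only a sketch under stated assumptions, not the code in the PR: `ShouldUseNextSibling` and its parameters are hypothetical names, the cache is modeled as a plain array with one entry per value (even slots hold values, odd slots hold separators, and separators have no cache entry), and a `null` entry means "not cached".

```csharp
using System;

// Hypothetical helper (not part of Roslyn): decides whether the position of the
// slot at 'index' should be computed from the following siblings.
// 'cachedValues' models the per-value cache; separators are not represented in it.
static bool ShouldUseNextSibling(object[] cachedValues, int slotCount, int index)
{
    // Map a separator slot (odd) to the value slot just before it.
    int valueIndex = (index & 1) != 0 ? index - 1 : index;

    // The first value has no previous value sibling: keep the existing behavior.
    if (valueIndex <= 1)
        return false;

    // Nearest previous value is cached: looking backwards is already cheap.
    if (cachedValues[(valueIndex - 2) >> 1] is not null)
        return false;

    // Last value (optionally followed by a trailing separator): there is no
    // following value to check, so compute from the end of the list.
    if (valueIndex >= slotCount - 2)
        return true;

    // Otherwise look forward only if the nearest following value is cached.
    return cachedValues[(valueIndex + 2) >> 1] is not null;
}

// Usage: 4 values and 3 separators give 7 slots; only the last value is cached.
var sampleCache = new object[4];
sampleCache[3] = "last value";
Console.WriteLine(ShouldUseNextSibling(sampleCache, slotCount: 7, index: 4)); // True
Console.WriteLine(ShouldUseNextSibling(sampleCache, slotCount: 7, index: 2)); // False
```

With only the last value cached, slot 4 has an uncached previous value and a cached following value, so it looks forward; slot 2 has neither neighbor cached, so it falls back to the existing behavior.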

CyrusNajmabadi · 2023-02-14T18:26:32Z

src/Compilers/Core/Portable/Syntax/SyntaxNode.cs

        {
            int offset = 0;
            var green = this.Green;
-            while (index > 0)
+
+            if (!useNextNotPrevious)


I used !useNextNotPrevious to minimize the differences in the PR, and also because !useNextNotPrevious is the common case.

CyrusNajmabadi · 2023-02-14T18:29:06Z

src/Compilers/Core/Portable/Syntax/SyntaxNode.cs

            }
-
-            return this.Position + offset;
        }


overall, i have no issue with the approach. but i think we def need tests around ensuring that reverse iterating produces nodes/tokens in the correct location. can you add a bunch of tests that show that if we reverse iterate the positions of things are all correct?

Added verifying results from GetChildPosition() and GetChildPositionFromEnd() to tests in IOperation verification.

cston · 2023-02-14T18:31:47Z

The perf of the C# and VB tests was similar.

EnumerateWithManyChildren_Forward: 400ms
EnumerateWithManyChildren_Reverse: before 4.1mins; after 340ms

CyrusNajmabadi · 2023-02-15T22:01:52Z

src/Compilers/Test/Core/Compilation/CompilationExtensions.cs

+                for (int i = n - 1; i >= 0; i--)
+                {
+                    positions2[i] = separatedList.GetChildPositionFromEnd(i);
+                }


and just to check. this actually computes, and is unaffected by anything potentially cached with the first for-loop?

Good catch, thanks.

Actually, neither of the two loops caches any nodes, but both loops get the positions of the nodes using whatever cache is there at the time the verify method is called.

Added comments and also added verification with empty caches for both directions.

AlekseyTs · 2023-02-16T16:05:02Z

src/Compilers/Core/Portable/Syntax/SyntaxList.SeparatedWithManyChildren.cs

+                int valueIndex = (index & 1) != 0 ? index - 1 : index;
+                if (valueIndex > 1
+                    && _children[(valueIndex - 2) >> 1].Value is null
+                    && (valueIndex >= Green.SlotCount - 2 || _children[(valueIndex + 2) >> 1].Value is { }))


valueIndex >= Green.SlotCount - 2

Could you elaborate what is the meaning of this condition? #Closed

The preceding conditions have determined that the previous sibling is not cached. This condition checks if the index represents the last item or the next to last item. If so, we use GetChildPositionFromEnd() to calculate the position from the end of the list rather than the start. If the index is the last item (a separator) or the next to last item (followed by a separator, which is never cached), there are no following siblings to cache so we can skip the || condition.

I think it would be good to capture this in a comment and explain why it doesn't matter whether the next item is cached in this case.

CyrusNajmabadi · 2023-02-16T16:46:48Z

Can you add a test where the list has a trailing separator?

AlekseyTs · 2023-02-16T17:11:29Z

src/Compilers/Core/Portable/Syntax/SyntaxList.SeparatedWithManyChildren.cs

+                    && _children[(valueIndex - 2) >> 1].Value is null
+                    && (valueIndex >= Green.SlotCount - 2 || _children[(valueIndex + 2) >> 1].Value is { }))
+                {
+                    return GetChildPositionFromEnd(index);


GetChildPositionFromEnd

It feels like the correlation between cache checks performed by GetChildPositionFromEnd and cache checks performed here would be more obvious if we were using GetCachedSlot helper rather than working with _children on the low level. Also, could we instead adjust implementation of GetChildPosition to perform the same cache tests for the following child? Then all the logic would be in one place, and, perhaps, we wouldn't need to provide a general GetChildPositionFromEnd helper, since it looks like we really want to handle just 3 specific situations.

It looks like the optimization targets a very limited set of scenarios. For example, when we are far from the end and the next sibling isn't cached, but the one after the next is cached, we won't take advantage of that cached sibling. Would it make sense to generalize the optimization? For example, even if nothing is cached, it still might be faster to calculate from the end simply because the item is closer to the end than to the front. Similarly, we could make the decision based on the proximity of the first cached item going backwards vs. going forward. Perhaps we could maintain a bitmap of cached items to speed up the process of finding the closest cached items. #Closed

It looks like the optimization targets a very limited set of scenarios.

Yes, the optimization specifically targets the scenario where the first iteration through the child list is in reverse, so the previous siblings are not in the cache. That's the one scenario we have currently where there is a perf issue. The optimization here is simply to calculate the position based on the offset from the immediately following sibling when that sibling is in the cache; otherwise, we use the existing approach.

We should limit the change here to fixing this one scenario, to avoid perf regressions in other cases. If additional scenarios arise, we can consider further optimizations.
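To make the two directions concrete, here is a simplified model of computing a child's position from the previous siblings versus from the following siblings. It is a sketch under assumptions rather than the Roslyn implementation: `PositionFromStart`, `PositionFromEnd`, `widths`, and `cachedPositions` are hypothetical names, the children are modeled as an array of full widths, and a non-null entry in `cachedPositions` stands for a cached sibling whose start position is already known.

```csharp
using System;

// Hypothetical model: walk backwards over previous siblings, accumulating widths
// until a cached sibling (or the start of the parent) is reached.
static int PositionFromStart(int parentStart, int[] widths, int?[] cachedPositions, int index)
{
    int offset = 0;
    while (index > 0)
    {
        index--;
        if (cachedPositions[index] is int prevStart)
            return prevStart + widths[index] + offset; // end of cached sibling + gap
        offset += widths[index];
    }
    return parentStart + offset;
}

// Hypothetical model: walk forwards over following siblings, accumulating widths
// (including the child's own width) until a cached sibling or the end is reached.
static int PositionFromEnd(int parentEnd, int[] widths, int?[] cachedPositions, int index)
{
    int offset = widths[index];
    int lastIndex = widths.Length - 1;
    while (index < lastIndex)
    {
        index++;
        if (cachedPositions[index] is int nextStart)
            return nextStart - offset; // start of cached sibling minus widths in between
        offset += widths[index];
    }
    return parentEnd - offset;
}

// Usage: five children starting at position 10, so the parent ends at 24.
int[] sampleWidths = { 3, 1, 4, 1, 5 };
var sampleCached = new int?[sampleWidths.Length];
Console.WriteLine(PositionFromStart(10, sampleWidths, sampleCached, 4)); // 19
Console.WriteLine(PositionFromEnd(24, sampleWidths, sampleCached, 4));   // 19
sampleCached[4] = 19; // pretend the last child is now cached
Console.WriteLine(PositionFromEnd(24, sampleWidths, sampleCached, 2));   // 14
```

In this model, a reverse iteration with an empty cache makes every backward walk run to the start of the list, while a forward walk stops at the sibling that was just visited, which is the scenario the change targets.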

Also, could we instead adjust implementation of GetChildPosition to perform the same cache tests for the following child?

GetChildPosition() doesn't know about separated lists (where the separator is not in the cache) so it would be surprising for that method to skip over some siblings when determining whether using the following sibling is a better choice.

Then all the logic would be in one place, and, perhaps, we wouldn't need to provide a general GetChildPositionFromEnd helper ...

I added GetChildPositionFromEnd() next to GetChildPosition() so the two loops were together. And I think we'd still need the cases that are handled in GetChildPositionFromEnd(), even if the methods were combined. I'll keep the two methods separate so that both implementations are clear.

It feels like the correlation between cache checks performed by GetChildPositionFromEnd and cache checks performed here would be more obvious if we were using GetCachedSlot helper rather than working with _children on the low level.

Updated.

AlekseyTs · 2023-02-16T17:14:56Z

Done with review pass (commit 10), haven't looked at tests yet

AlekseyTs · 2023-02-16T21:12:09Z

src/Compilers/Core/Portable/Syntax/SyntaxNode.cs

+        internal int GetChildPositionFromEnd(int index)
+        {
+            var green = this.Green;
+            int offset = green.GetSlot(index)?.FullWidth ?? 0;


green.GetSlot(index)

It looks like we completely ignore the fact that the item could be cached by now and, therefore, there is no need to iterate. #Closed

This matches the behavior of GetChildPosition() above. It looks like GetChildPosition() for a syntax list is typically called when creating the corresponding red node, which is then cached.

This matches the behavior of GetChildPosition() above.

I won't object to adjusting that one too.

AlekseyTs · 2023-02-17T16:27:38Z

src/Compilers/CSharp/Test/Syntax/Syntax/SyntaxListTests.cs

+
+        [Theory]
+        [CombinatorialData]
+        public void EnumerateWithManyChildren_Forward(bool trailingSeparator)


EnumerateWithManyChildren_Forward

What would be the failure for these tests without the fix? Timeout? If so, consider adding a comment about that. #Resolved

AlekseyTs

LGTM (commit 13)

cston · 2023-02-24T16:05:04Z

@dotnet/roslyn-compiler, please review.

cston · 2023-03-06T18:17:51Z

@dotnet/roslyn-compiler for a second review, thanks.

333fred

Approach generally looks good to me, but I'm a bit concerned that the new tests won't catch performance regressions. Consider making the number of iterations more extreme (so that without this change, we'd definitely time out the CI build) or throw some kind of exception after a period of time in the test.

333fred · 2023-03-08T01:12:06Z

src/Compilers/Core/Portable/Syntax/SyntaxNode.cs

+
+            var green = this.Green;
+            int offset = green.GetSlot(index)?.FullWidth ?? 0;
+            int slotCount = green.SlotCount;


Consider moving the -1 to this expression.

cston · 2023-03-08T15:10:15Z

/azp run

azure-pipelines · 2023-03-08T15:10:35Z

Azure Pipelines successfully started running 2 pipeline(s).

cston · 2023-03-08T16:27:27Z

/azp run

azure-pipelines · 2023-03-08T16:27:49Z

Azure Pipelines successfully started running 2 pipeline(s).

Use next sibling in SyntaxNode.GetChildPosition() if available

6d82cde

dotnet-issue-labeler bot added the Area-Compilers label Feb 14, 2023

CyrusNajmabadi reviewed Feb 14, 2023

cston added 4 commits February 14, 2023 10:32

Update test

86786a2

Add test

4bea78e

Fix build

050f570

Verify child positions

14c639e

cston marked this pull request as ready for review February 15, 2023 21:47

cston requested a review from a team as a code owner February 15, 2023 21:47

CyrusNajmabadi reviewed Feb 15, 2023

cston added 2 commits February 15, 2023 15:54

Add verification without cache

95880f0

Update comments

8f18f02

AlekseyTs reviewed Feb 16, 2023

cston added 3 commits February 16, 2023 08:48

Add comment

9e8e9ce

Add tests with trailing separator

a98cb9a

Remove comment

bff2282

AlekseyTs reviewed Feb 16, 2023

cston added 2 commits February 16, 2023 12:29

Add comment

528a2a8

Use GetCachedSlot()

80f0ab8

AlekseyTs reviewed Feb 16, 2023

cston requested a review from a team February 16, 2023 22:32

build-analysis bot mentioned this pull request Feb 16, 2023

NuGet failing with Response status code does not indicate success: 503 (Service Unavailable) dotnet/arcade#11723

Open

5 tasks

Call GetCachedSlot(index)

724aecd

AlekseyTs reviewed Feb 17, 2023

AlekseyTs approved these changes Feb 17, 2023

cston requested a review from a team February 17, 2023 19:17

333fred approved these changes Mar 8, 2023

Update tests: add comments; increase iterations

c187e3d

cston merged commit 5b924e8 into dotnet:main Mar 8, 2023

cston deleted the 66475 branch March 8, 2023 17:44

ghost added this to the Next milestone Mar 8, 2023

cston mentioned this pull request Mar 16, 2023

SyntaxValueProvider: avoid performance issue with syntax list containing many items dotnet/runtime#83483

Merged

allisonchou modified the milestones: Next, 17.6 P3 Mar 28, 2023
