-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Remove strong reference of XNamespace from XName #116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/libraries/System.Private.Xml.Linq/src/System/Xml/Linq/XName.cs
Outdated
Show resolved
Hide resolved
|
For a change like this, it's good to explain for the benefit of reviewers and future readers what the problem is and how you are solving it. As I understand it, (1) each In the repro, by creating lots of With your change here, the If I understand correctly then, you need to also avoid the strong reference from Incidentally it would be nice to update this code at the same time to use |
Thank you for pointing it out, updated the description
For me this change will be enough, as you pointed making XName or XHashtable to WeakReference would degrade performance and the purpose of caching would be lost IMO. Please let me know what you guys think
Sure, I was wondering should it update it with this PR |
src/libraries/System.Private.Xml.Linq/src/System/Xml/Linq/XNamespace.cs
Outdated
Show resolved
Hide resolved
| if (count == 0) return None; | ||
|
|
||
| // determine whether it's a special namespace | ||
| if (count == xmlPrefixNamespace.Length && string.CompareOrdinal(namespaceName, index, xmlPrefixNamespace, 0, count) == 0) return Xml; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this make any perf difference in putting this check here vs as before inside of the if statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved it because of the loop, but as we have no loop now I think its better to be inside if statement for perf, will move back
krwq
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with couple of nits
|
After garbage collection Namespace do get collected but that introducing another issue of creating XNames with same value after GC is not same instance with the one before GC anymore, as new XName would create new Namespace for collected one. Need to find out another solution, so labeled as XName a = "Name2";
Thread.Sleep(1000);
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
var b = new XElement($"Name2");
Assert.NotSame(a, b.Name); |
|
@buyaa-n regardless of the fix try to add something similar as above as a test case (without sleep) |
Sure i could do that, but i think i saw somewhere that never call GC in tests, don't remember the reason nor where i saw it, but i might be mistaken ... |
|
@buyaa-n haven't heard that before but I'm curious why if that's the case |
…XName> into XHastable<WeakReference<XName>>
OK, added the test With latest update following change made:
|
krwq
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, as we talked offline leave the original repro running over night, if it still runs in the morning then feel free to merge
Is it fine? It shows this change makes it 6-15% slower. That's probably expected and inevitable, but I don't think you can just say it's "fine" -- I think you need to say a little more about the basis for that judgement. |
|
@stephentoub you signed off on a different approach than what we have now. Could you please confirm you're OK with this new approach before @buyaa-n merges? Please see my comment above, about the issue with the original approach. However my suggestion there to use WeakReference in both directions was bad, since it turns out that XLINQ depends on both XName and XNamespace being atomized. That breaks if XName does not have a strong reference to XNamespace, because the latter gets quickly collected if it was created when you do something like In this approach, the only change (other than using |
I meant relatively with the other solutions perf test buyaa-n/corefx#4 (comment), but yes i guess i should have said |
| XNamespace ns; | ||
|
|
||
| // Keep looping until a non-null namespace has been retrieved | ||
| do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this loop no longer necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before it was using non generic WeakReference and even weakReference of a Namespace found from table its target might be null, so it was need to loop again when target is was null. But now we are using WeakReference<XNamespace> which TryGetTarget(out ns) which returns target atomically, i. e. no need to check for null of out variable ns and loop again in case TryGetTarget(out ns) the method returns true. In case of ref not found or ns was not alive we are just creating new instance, so no need to loop for that case too
| private readonly string _namespaceName; | ||
| private readonly int _hashCode; | ||
| private readonly XHashtable<XName> _names; | ||
| private readonly XHashtable<WeakReference<XName>> _names; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for this change (and all of the follow-on changes from it)?
Presumably it's to avoid keeping a live an XName object? But this is still going to keep-alive a WeakReference<XName> object even if it loses its reference to the XName, so is there something deeper it's aiming to do? e.g. is the XName transitively keeping alive a large graph and that's the problem?
I'm also concerned about potential perf impact. Have you measured? The comments on XHashtable suggest this code is very perf sensitive, yet we're introducing additional allocations and indirections here to access the data in the WeakReference<T>.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for this change (and all of the follow-on changes from it)?
Namespce keeps its related XNames into this table which is growing endlessly and causing memory leak when adding elements continuously, simple repro:
while (true)
{
XName _ = $"x{Guid.NewGuid()}";
}At first I have removed strong ref from XName into XNamespace, which fixed above problem by allowing XNamespace GC-ed, but that eliminates Atomicity of XName and XNamespace, no duplicate NameSpace should exist and no duplicate XName within a NS. If we are keeping a XName instance its Namespace should also kept. So I was need to restore the strong reference back.
Presumably it's to avoid keeping a live an XName object?
Yes, making XNames weak within hashtable enables non referenced XNames GC-ed.
But this is still going to keep-alive a WeakReference object even if it loses its reference to the XName, so is there something deeper it's aiming to do? e.g. is the XName transitively keeping alive a large graph and that's the problem?
The XHashtable used here built to be used with WeakRefence, it cleans up the key which value is released due to WeakReference and nulls the corresponding WeakReference. So XHashtable will not grow tremendously unless all XNames within it kept referenced
I'm also concerned about potential perf impact. Have you measured? The comments on XHashtable suggest this code is very perf sensitive, yet we're introducing additional allocations and indirections here to access the data in the WeakReference.
Yes I have measured performance, the results not that stable on my machine, below result could show average of all perf run results:
dotnet run -f netcoreapp2.1 --base "D:\dotnet\results2\before" --diff "D:\dotnet\results2\after3" --threshold 2%
summary:
better: 2, geomean: 1.073
worse: 5, geomean: 1.110
total diff: 7
| Slower | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
|---|---|---|---|---|
| System.Xml.Linq.Perf_XElement.CreateWithElements | 1.15 | 80.00 | 92.07 | several? |
| System.Xml.Linq.Perf_XElement.GetAttribute | 1.12 | 26.86 | 30.14 | several? |
| System.Xml.Linq.Perf_XElement.GetElement | 1.10 | 37.22 | 41.03 | several? |
| System.Xml.Linq.Perf_XName.GetLocalNameFromExpandedName | 1.09 | 28.65 | 31.26 | |
| System.Xml.Linq.Perf_XName.GetLocalName | 1.08 | 25.88 | 28.05 |
| Faster | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
|---|---|---|---|---|
| System.Xml.Linq.Perf_XName.NonEmptyNameSpaceToString | 1.09 | 28.11 | 25.84 | |
| System.Xml.Linq.Perf_XElementList.Enumerator | 1.06 | 239.92 | 226.58 | several? |
I have been testing perf with other possible solutions i had and compared to them this result looks acceptable, I didn't found any other/better solution, if you guys have any idea please let me know
Performance sensitivity of XHashtable is mostly from thread safety which I already existing. Additional overhead to the table will be removing invalid WeakRefernces which is as we know, for good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you experiment with using a ConditionalWeakTable instead of XHashtable?
this result looks acceptable
The perf tests demonstrate a 6-15% regression depending on the use case. It's not clear to me why that's acceptable. What other approaches did you explore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@buyaa-n do you perhaps have data on how much that 6-15% shows up in the E2E scenarios? (i.e. large XML document + some queries). I'm not feeling strongly either way on 15% on any of the specific APIs but more interested in E2E impact (if it's still 15% on E2E then that is rather large perf hit). From above data System.Xml.Linq.Perf_XElement.GetElement seems the most concerning to me as it's rather frequently used API (at least I personally use it quite a bit).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you experiment with using a ConditionalWeakTable instead of XHashtable?
No, will try with that and measure performance
The perf tests demonstrate a 6-15% regression depending on the use case. It's not clear to me why that's acceptable.
As i mentioned before the performance results was not that consistent, i had 3 before, 3 after measurements and checked their combinations. I will post result of those runs in next comment as its long content. Most impacted section should be XName related tests which were not increased more than 10% and some cases even get faster. I might do perf tests again by creating as much stable env as i can.
What other approaches did you explore?
Well most of the other approaches were not really solving the issue, one of them was my initial approach I had here. Here is some link to others buyaa-n/corefx#4, buyaa-n/corefx#5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are those numbers comparing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As perf results was stable i had measured before change and after change 3 times each and compared their combinations. Now I think better to setup Linux env and run the tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you experiment with using a ConditionalWeakTable instead of XHashtable?
@stephentoub I tried but that is not really suitable for this issue.
|
@buyaa-n let's wait for @stephentoub to sign off again on your new approach, before merging. Just so we have 2 pairs of eyes. |
|
Sure, I am awaiting his feedback or sign off |
|
Nit, up to you, but consider "git rebase -i" such that you end up with 1x commit to do the cleanup (eg., using |
It cannot really cleanly divided like that, all mixed. Anyways I might need to create new PR due to recreated fork, will keep that in mind if that happens |
|
I had some suggestion from Adam about how to have more stable env when running perf tests. Had more stable test results but the result was not good as I was hoping:
Now I cannot say it is acceptable. Anyways gonna close this PR as its not accessable anymore. Will create another PR if we want to fix the issue with this performance degression. |
Merge from dotnet/runtime
Fixes https://github.com/dotnet/corefx/issues/36404
XNamespace has a hashtable of XNames keeping all Names bound to it, also XNames has strong reference back to its XNamespace.When creating lots of XNames within same XNamespace its XNames Hastable grows endlessly until fill out memory. Even XNamespaces are WeakReferences it stays in memory forever because they have strong references to its XNames. With this change removing that strong reference from XName