Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace JObjects in PackageMetadataResourceV3 with strong type for better performance, lower memory consumption #3462

Merged

Conversation

erdembayar
Copy link
Contributor

@erdembayar erdembayar commented Jun 18, 2020

Bug

Fixes: NuGet/Home#9719
Regression: No

  • Last working version:
  • How are we preventing it in future:

Fix

Details:
Using VS nuget GUI to consolodate different versions of a package crashes the instance of VS. Occurs in 2017, 2019 and 2019 int preview.
One of main problem was PM UI creates giant Jobject and millions of Nuget related objects and ballooning memory.
I am replacing this strong type instead of JObject.

Testing/Validation

Tests Added: Yes
Reason for not adding tests:
Validation: Private unit test.

@davidfowl
Copy link
Member

Where are the before and after benchmarks?

Copy link
Contributor

@donnie-msft donnie-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Few questions/comments..

.Select(ParseMetadata)
.Select(m => metadataCache.GetObject(m))
.ToArray();
var range = VersionRange.All; // This value preset for Consolidate UI.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this method is specific to Consolidate? I think several UI codepaths come here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plus potentially any customer who has a PackageReference to NuGet.Protocol in their own app.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed wording. Actually it's from "Update UI" too.

@nkolev92
Copy link
Member

@erdembayar Can you provide some benchmarks please.

Ensure you profile with different sizes of registration pages.

@erdembayar
Copy link
Contributor Author

erdembayar commented Jun 19, 2020

Where are the before and after benchmarks?

I didn't do any benchmarking yet. But finally I was able to see Consolidate page of this giant solution, I guess that is very good sign of improvement. There are 62 packages need consolidation.

image

Copy link
Member

@zivkan zivkan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this PR is the only change needed for NuGet/Home#8352, then ok. If that issue is going to need other changes, then please create a new issue specific to the changes in this PR, and link that issue instead.

.Select(ParseMetadata)
.Select(m => metadataCache.GetObject(m))
.ToArray();
var range = VersionRange.All; // This value preset for Consolidate UI.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plus potentially any customer who has a PackageReference to NuGet.Protocol in their own app.

@zivkan
Copy link
Member

zivkan commented Jun 19, 2020

Also, this PR's title suggests that all JObjects thoughout the entire codebase were replaced. Please change the title so it's obvious that it's specific to the protocol registration resources.

@erdembayar erdembayar changed the title Replace JObjects with strong type for better performance, lower memory consumption Replace JObjects in PackageMetadataResourceV3 with strong type for better performance, lower memory consumption Jun 19, 2020
@erdembayar
Copy link
Contributor Author

Also, this PR's title suggests that all JObjects thoughout the entire codebase were replaced. Please change the title so it's obvious that it's specific to the protocol registration resources.

Ok. I changed PR name.

{
catalogEntry.ReportAbuseUrl = _reportAbuseResource?.GetReportAbuseUrl(catalogEntry.PackageId, catalogEntry.Version);
catalogEntry.PackageDetailsUrl = _packageDetailsUriResource?.GetUri(catalogEntry.PackageId, catalogEntry.Version);
metadataCache.GetObject(catalogEntry);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value of GetObject isn't used. Can we stop calling it and get rid of the changes related to reflection?

Copy link
Contributor Author

@erdembayar erdembayar Jun 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I checked it what this thing is doing, it do have benefit. It depopulate same string contents because of string is true immutable type. All same content strings point to same memory address, I did manually checked it, it works for string type, so I decided to improve it. It's hard to explain here, but I can talk to you on Teams.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how this is helping. This is trying to dedup the full object. I don't see the NuGetVersion and Strings being deduped at all.

Copy link
Contributor Author

@erdembayar erdembayar Jun 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This caching method was already here before my change so I don't expect any drastic changes, I only added little improvement on reflection part of cache so it would work little faster otherwise it's same thing. Also it only works string property with both getter and setter. I don't think this one decrease number of NuGetVersions, only decrease overall memory size reserved by string properties of instances passed through this cache.
The moment this one runs we already passed Json part and have end result shouldn't see any changes from previously.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how the strings are getting deduped.
I ran Get on the nuget.packagemanagement package and the cache was empty at the end.

Copy link
Contributor Author

@erdembayar erdembayar Jun 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nkolev92 what do you mean by cache exactly? For example for below case nothing happens, because a and b are already pointing same object reference:
var a = "aaa";
var b = "aaa";

var res = Object.ReferenceEquals(a, b);
Console.WriteLine(res); //Expect true

But below case you would expect deduplication if you use this cache tool, because they're not same object reference.
var a = "aaa";
var b = "aa";
var c = b+ "a";
var res = Object.ReferenceEquals(a, c);
Console.WriteLine(res); //expect False

Comment on lines 106 to 112
Type stringPropertyType = typeof(string);
MethodInfo method = _metadataReferenceCacheType.GetTypeInfo()
.DeclaredMethods.FirstOrDefault(
m =>
m.Name == CachableTypesMap[stringPropertyType] &&
m.GetParameters().Select(p => p.ParameterType).SequenceEqual(new Type[] { stringPropertyType }));
_stringMethodsCache.Add(_metadataReferenceCacheType, method);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look right to me. It's looking for methods defined on the MetadataReferenceCache class, rather than generic type T's class. Is that a mistake?

Copy link
Contributor Author

@erdembayar erdembayar Jun 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't add this logic, it it was there actually before my change, I just moved to top for multiple use.
image

@@ -17,6 +17,8 @@ public class MetadataReferenceCache
private readonly Dictionary<string, string> _stringCache = new Dictionary<string, string>(StringComparer.Ordinal);
private readonly Dictionary<Type, PropertyInfo[]> _propertyCache = new Dictionary<Type, PropertyInfo[]>();
private readonly Dictionary<string, NuGetVersion> _versionCache = new Dictionary<string, NuGetVersion>(StringComparer.Ordinal);
private readonly Dictionary<Type, MethodInfo> _stringMethodsCache = new Dictionary<Type, MethodInfo>();
private readonly Type _metadataReferenceCacheType = typeof(MetadataReferenceCache);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is MetadataReferenceCache. It seems very strange to me why you'd need to use reflection for this class to access itself.

Copy link
Contributor Author

@erdembayar erdembayar Jun 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't add this logic, it it was there actually before my change, I just moved to top for multiple use.
image

@erdembayar
Copy link
Contributor Author

erdembayar commented Jun 25, 2020

@nkolev92 @zivkan @davidfowl
I ran benchmarking with Newtosoft.Json package.
It looks runtime is 60% of previous implementation and memory is almost half of previous implementation. You can find my code used for this from here.
To run it:

  1. Compile in Release mode,
  2. Open console cmd. Close VS and don't do anything while running test, it skews test.
  3. go to NuGet.Client\test\Benchmarking\Nuget.Protocol.Benchmarking\bin\Release\netcoreapp3.1
  4. dotnet Nuget.Protocol.Benchmarking.dll
  5. Wait for result.

image

@erdembayar erdembayar force-pushed the dev-eryondon-ReplaceJObjectsInmemoryForBetterPerformance branch from f7632f4 to 44779ac Compare June 25, 2020 03:17
@zivkan
Copy link
Member

zivkan commented Jun 25, 2020

Any ideas why the GC stats are not showing? I can't believe that 3MB of allocations didn't trigger any GC. What TFM did you benchmark with? Do different TFMs have different results (I know we care primarily about Visual Studio, but I'm curious if .NET Core has different perf characteristics).

edit: And as Nikolche pointed out below, it would be good to benchmark small, medium, large json payload sizes to see if there's much difference.

@nkolev92
Copy link
Member

Consider using "fake" to benchmark. It contains lots of versions, the true extreme case :)

Copy link
Member

@nkolev92 nkolev92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor suggestions.

Please do the benchmark with different sizes of registration pages for completeness.

/// <param name="log"></param>
/// <param name="token"></param>
/// <returns>Package meta data.</returns>
/// <remarks>The inlined entries are potentially going away soon</remarks>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Best way to determine it by following the references that VS shows you.

I'm not surprised that VS only only goes through the "all" codepath.

@nkolev92
Copy link
Member

Updated the body of the issue to close NuGet/Home#9719 instead of NuGet/Home#8352 because that one is really an epic.

@erdembayar
Copy link
Contributor Author

erdembayar commented Jun 26, 2020

Any ideas why the GC stats are not showing? I can't believe that 3MB of allocations didn't trigger any GC. What TFM did you benchmark with? Do different TFMs have different results (I know we care primarily about Visual Studio, but I'm curious if .NET Core has different perf characteristics).

edit: And as Nikolche pointed out below, it would be good to benchmark small, medium, large json payload sizes to see if there's much difference. Almost 2 times improvement across all load sizes.

First one I am targeting 3.1 Dotnet core.
Here S=small ("_" package), M=medium("Newtonsoft"), L=large("Fake") loads.

image

It looks Large = "Fake" has 27 small parts that is making it slow overall, still new implementation is faster than before.

Here is same things for Dot.net 4.8, somehow this one shows result in milliseconds instead of microseconds. Also improvement here is more drastic than .net core for large load, but almost nothing on small loads.
New implementation is almost 2-80 times faster than on large loads, also consume less memory. Not sure why we gain 80 times faster performance during one of benchmark profiling.

image

image

image

@erdembayar erdembayar requested a review from zivkan June 26, 2020 16:40
Copy link
Member

@zivkan zivkan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to one day refactor NuGet.Protocol to make things easier to read, but who knows if we'll ever get around to it.

Perf improvements look fantastic. Great work! 👍

@erdembayar erdembayar requested a review from zivkan June 26, 2020 20:46
@erdembayar erdembayar force-pushed the dev-eryondon-ReplaceJObjectsInmemoryForBetterPerformance branch from ea16b21 to a2e28fd Compare June 26, 2020 21:06
@erdembayar erdembayar merged commit 7532d57 into dev Jun 27, 2020
@erdembayar erdembayar deleted the dev-eryondon-ReplaceJObjectsInmemoryForBetterPerformance branch June 27, 2020 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve memory performance of PackageMetadataResourceV3 by reducing the JObject dependencies
6 participants