
Removed WeakReference<IHosts> already cleaned up by GC #4995

Merged — 2 commits merged into dotnet:master on Apr 4, 2020

Conversation

mstfbl
Contributor

@mstfbl mstfbl commented Apr 1, 2020

Fixes #4914

WeakReference objects are stored in _children. Occasionally the garbage collector collects the targets of these WeakReferences, setting their target values to null, but the WeakReference entries themselves are never removed from the _children list. This PR explicitly removes those WeakReference objects whose targets are null.

List.RemoveAll() has a time complexity of O(n), where n is the number of elements in the list.
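The pattern described above can be sketched in Python's `weakref` module (the actual change is in C#; class and method names here are illustrative, and the lock reflects the multithreading concern raised later in this thread):

```python
import threading
import weakref

class Host:
    """Stand-in for an ML.NET IHost; purely illustrative."""
    pass

class HostRegistry:
    """Keeps weak references to child hosts and prunes dead entries
    on each registration, mirroring the pattern described in this PR."""
    def __init__(self):
        self._lock = threading.Lock()  # guards _children against concurrent Register calls
        self._children = []            # list of weakref.ref objects

    def register(self, host):
        with self._lock:
            # Drop entries whose targets were already collected: an O(n) scan,
            # analogous to List<T>.RemoveAll in the C# change.
            self._children = [r for r in self._children if r() is not None]
            self._children.append(weakref.ref(host))

    def live_count(self):
        with self._lock:
            return sum(1 for r in self._children if r() is not None)
```

In CPython, dropping the last strong reference to a host clears its weak reference immediately, so the next `register` call prunes the dead entry.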

@mstfbl mstfbl requested a review from a team as a code owner April 1, 2020 21:58
@antoniovs1029
Member

In general, I guess it's OK and that it will work, but I am not 100% sure about having to do this every time a new IHost is registered. I wonder if there's any performance impact because of this — probably not, even though Register is called several times throughout the codebase. On the other hand, after quickly looking into the code, I don't know where else we could do this.

Since I am not familiar enough with IHosts, I don't know whether we could add a mechanism somewhere so that, when we're about to stop using an IHost, it can report to its "parent" that it's safe to remove it from the _children list.
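The alternative floated here — children deregistering themselves instead of the parent scanning for dead entries — could look roughly like this (a hypothetical Python sketch, not anything in the ML.NET codebase; all names are invented):

```python
import weakref

class ParentHost:
    """Hypothetical alternative to periodic pruning: children deregister
    themselves from the parent when they are done."""
    def __init__(self):
        self._children = []

    def register(self, child):
        ref = weakref.ref(child)
        self._children.append(ref)
        # Hand the child a callback it can invoke when it is finished.
        child._on_done = lambda: self.deregister(ref)
        return child

    def deregister(self, ref):
        try:
            self._children.remove(ref)
        except ValueError:
            pass  # already removed

class ChildHost:
    def close(self):
        # Explicit "I'm done" signal back to the parent, removing the need
        # to scan for dead WeakReferences later.
        if getattr(self, "_on_done", None):
            self._on_done()
            self._on_done = None
```

The trade-off is that every consumer must remember to call `close()`, whereas the prune-on-register approach in this PR needs no cooperation from callers.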

Wonder if @eerhardt has any thoughts on this, since he has recently encountered this issue.

@antoniovs1029
Member

Thanks for addressing my comment, and @eerhardt's comment on multithreading.

I wonder — did you find any performance difference when running this on the original issue's scenario?

@mstfbl
Contributor Author

mstfbl commented Apr 2, 2020

@antoniovs1029 No worries! I did not see any performance difference when running this on master vs. on my own branch with these changes. I am currently running performance benchmarks comparing the effect of my changes, and will share the results here when I'm done.

@mstfbl mstfbl mentioned this pull request Apr 2, 2020
@mstfbl
Contributor Author

mstfbl commented Apr 3, 2020

I ran the ImageClassificationBench benchmark in Microsoft.ML.Benchmarks, following the directions here, to compare the current master branch with the changes in this PR. I ran this benchmark 4-5 times with no other background processes running, and the figures below were roughly unchanged across runs.

Without changes:

| Method | Mean | Error | StdDev | Extra Metric | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|
| TrainResnetV250 | 51.22 s | 46.27 s | 2.536 s | - | 80000.0000 | 34000.0000 | 12000.0000 | 2.01 GB |

With changes:

| Method | Mean | Error | StdDev | Extra Metric | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|
| TrainResnetV250 | 52.44 s | 78.95 s | 4.327 s | - | 80000.0000 | 34000.0000 | 12000.0000 | 2.01 GB |

Explanation of Legends:

```
// * Legends *
  Mean         : Arithmetic mean of all measurements
  Error        : Half of 99.9% confidence interval
  StdDev       : Standard deviation of all measurements
  Extra Metric : Value of the provided extra metric
  Gen 0        : GC Generation 0 collects per 1000 operations
  Gen 1        : GC Generation 1 collects per 1000 operations
  Gen 2        : GC Generation 2 collects per 1000 operations
  Allocated    : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
  1 s          : 1 Second (1 sec)
```

The results show no significant change in performance, so the changes in this PR do not add a significant performance penalty.
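Since BenchmarkDotNet reports Error as half the width of the 99.9% confidence interval, a crude way to sanity-check a comparison like the one above is to test whether the two intervals overlap (this is only a rough heuristic, not a proper statistical test):

```python
def intervals_overlap(mean_a, err_a, mean_b, err_b):
    """Rough check: do the reported confidence intervals overlap?
    BenchmarkDotNet's Error column is half the 99.9% CI width, so the
    interval for each run is [mean - err, mean + err]. Overlapping
    intervals suggest the observed difference may be noise."""
    lo_a, hi_a = mean_a - err_a, mean_a + err_a
    lo_b, hi_b = mean_b - err_b, mean_b + err_b
    return lo_a <= hi_b and lo_b <= hi_a

# TrainResnetV250 figures from the tables above (seconds):
baseline = (51.22, 46.27)   # without changes
with_fix = (52.44, 78.95)   # with changes
```

Here the intervals overlap heavily, consistent with the conclusion that the difference is within measurement noise.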

@eerhardt
Member

eerhardt commented Apr 3, 2020

I wouldn't use ImageClassificationBench as a measurement here. Most of the time in that benchmark isn't spent in ML.NET code, but instead in TensorFlow.

Instead, I would run the following:

https://github.com/dotnet/machinelearning/blob/master/test/Microsoft.ML.Benchmarks/KMeansAndLogisticRegressionBench.cs
https://github.com/dotnet/machinelearning/blob/master/test/Microsoft.ML.Benchmarks/Text/MultiClassClassification.cs
https://github.com/dotnet/machinelearning/blob/master/test/Microsoft.ML.Benchmarks/PredictionEngineBench.cs

Those should give a pretty good indication of whether anything has regressed.

@mstfbl
Contributor Author

mstfbl commented Apr 3, 2020

@eerhardt I ran the KMeansAndLogisticRegressionBench and PredictionEngineBench benchmarks as well; the results are below. Again, I do not see any significant change in performance.

Without changes:
KMeansAndLogisticRegressionBench

| Method | Mean | Error | StdDev | Extra Metric |
|---|---|---|---|---|
| TrainKMeansAndLR | 43.01 ms | 4.002 ms | 4.609 ms | - |

PredictionEngineBench

| Method | Mean | Error | StdDev | Extra Metric |
|---|---|---|---|---|
| MakeIrisPredictions | 5.614 ms | 0.0925 ms | 0.0865 ms | - |
| MakeSentimentPredictions | 55.626 ms | 2.1345 ms | 2.4580 ms | - |
| MakeBreastCancerPredictions | 2.062 ms | 0.0916 ms | 0.1055 ms | - |

With changes:
KMeansAndLogisticRegressionBench

| Method | Mean | Error | StdDev | Extra Metric |
|---|---|---|---|---|
| TrainKMeansAndLR | 40.00 ms | 4.658 ms | 5.364 ms | - |

PredictionEngineBench

| Method | Mean | Error | StdDev | Extra Metric |
|---|---|---|---|---|
| MakeIrisPredictions | 5.688 ms | 0.1077 ms | 0.1197 ms | - |
| MakeSentimentPredictions | 60.780 ms | 0.3018 ms | 0.2356 ms | - |
| MakeBreastCancerPredictions | 2.218 ms | 0.0038 ms | 0.0033 ms | - |

@eerhardt
Member

eerhardt commented Apr 3, 2020

MakeSentimentPredictions seems to be outside of the error range. Basically it is 10% slower (55ms to 60ms). Same for MakeBreastCancerPredictions. If you run it multiple times, are these numbers consistent?

@mstfbl
Contributor Author

mstfbl commented Apr 3, 2020

@eerhardt I have attached an .xlsx file containing the results of 5 consecutive runs of PredictionEngineBench, both with and without the changes. (I attempted to share the full results directly in this markdown comment, but the tables don't turn out well.)

(https://github.com/dotnet/machinelearning/files/4430244/Benchmarks.Comparison.xlsx)

Here is a summary averaged over 5 runs:

- MakeIrisPredictions increased by 2%
- MakeSentimentPredictions increased by 1%
- MakeBreastCancerPredictions decreased by 2%

I first did 5 runs without the changes, then 5 runs with them. After each run I copied the results into a text editor. No other major applications were running in the background.

After multiple runs I still believe the run times with and without the changes are very similar, with slight increases on the changed branch in runs 2 and 4.

Edit: Replaced copied results with an .xlsx file that does a better job of succinctly displaying results. Thank you @harishsk for the heads up.
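The averaged percent changes quoted above can be computed with a small helper like the one below (the numbers here are illustrative placeholders, not the actual figures from the attached spreadsheet):

```python
def mean_percent_change(baseline_runs, changed_runs):
    """Average per-run percent change, ((changed - baseline) / baseline * 100),
    across paired benchmark runs; positive means the change made it slower."""
    deltas = [(c - b) / b * 100.0 for b, c in zip(baseline_runs, changed_runs)]
    return sum(deltas) / len(deltas)

# Illustrative numbers only -- the real per-run figures are in the .xlsx above.
baseline = [5.60, 5.62, 5.58, 5.61, 5.63]
changed  = [5.70, 5.73, 5.69, 5.72, 5.74]
```

Pairing run i of the baseline with run i of the changed branch, as done here, assumes the machine's load was comparable across the two series of runs.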

@mstfbl mstfbl merged commit 820f1e2 into dotnet:master Apr 4, 2020
@mstfbl mstfbl mentioned this pull request May 26, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Mar 18, 2022
Successfully merging this pull request may close these issues.

WeakReference<IHost> memory leak?
4 participants