Automatically capture heap dumps #3667

Open. 61 commits to merge into base branch version-5.0.0.

Collaborator

@jamescrosswell jamescrosswell commented Oct 9, 2024

Resolves #2580

Basic usage

samples/Sentry.Samples.Console.Basic/Program.cs includes an example of how to configure automatic heap dumps:

// This option tells Sentry to capture a heap dump when the process uses more than 5% of the total memory. The heap
// dump will be sent to Sentry as a file attachment.
options.EnableHeapDumps(5);
// This determines the level of heap dump events that are sent to Sentry
options.HeapDumpEventLevel = SentryLevel.Warning;
// A debouncer can be configured to tell Sentry how frequently to send heap dumps. In this case we've configured it
// to capture a maximum of 3 events per day and to wait at least 1 hour between each event.
options.HeapDumpDebouncer = Debouncer.PerDay(3, TimeSpan.FromHours(1));

That sample uses a built-in trigger that we provide, which fires a heap dump when memory usage exceeds a given threshold (5% in that example).

Alternatively, there is an overload of the EnableHeapDumps method that SDK users can call to provide a custom trigger:

public void EnableHeapDumps(HeapDumpTrigger trigger) => HeapDumpTrigger = trigger;

Prerequisites

dotnet-gcdump

The SDK relies on dotnet-gcdump to capture the heap dumps.

SDK users can bundle this with their application by setting the SentryBundleGCDump build property to true in their csproj file.
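
A minimal sketch of what that might look like, assuming the property is set in an ordinary PropertyGroup of the application's csproj (only the SentryBundleGCDump property name comes from this PR; the placement is an assumption):

```xml
<PropertyGroup>
  <!-- Ask the Sentry build targets to bundle dotnet-gcdump with the application -->
  <SentryBundleGCDump>true</SentryBundleGCDump>
</PropertyGroup>
```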

Alternatively, dotnet-gcdump can be installed globally on the machine or container where the heap dumps will be captured:

dotnet tool install --global dotnet-gcdump

net6.0

dotnet-gcdump requires .NET 6 or later.

Analysing the Dump File

There are a few options:

  1. dotnet-gcdump report
  2. Open the dump file with PerfView or Visual Studio. This provides richer functionality but is only available on Windows.
  3. dotnet-heapview

See Stefan Geiger's blog for a bit of info on the first two options.

@jamescrosswell jamescrosswell linked an issue Oct 10, 2024 that may be closed by this pull request
Directory.Build.props (outdated, resolved)
Member

@bruno-garcia bruno-garcia left a comment

How about we make an alpha from this branch and test on Symbol Collector?

samples/Sentry.Samples.Console.Basic/Program.cs (outdated, resolved)
using var sut = _fixture.GetSut();

// Act
sut.CaptureMemoryDump();
Member

Are there any tests where CaptureMemoryDump throws? I'm more concerned about the caller, actually. Since we don't want to crash the app, do we handle every Task we create?

Collaborator Author

The Task that calls that is ultimately this:

// Since we're not awaiting the task, the continuation will happen elsewhere but that's OK - all we care about
// is that any exceptions get logged as soon as possible.
GarbageCollectionMonitor.Start(CheckMemoryUsage, _cancellationTokenSource.Token)
    .ContinueWith(
        t => _options.LogError(t.Exception!, "Garbage collection monitor failed"),
        TaskContinuationOptions.OnlyOnFaulted // guarantees that the exception is not null
    );

I played around with various things and eventually settled on a fire-and-forget task with a continuation. The alternative of holding a reference to the task wasn't ideal because you only become aware of any errors when you await the task, which here would have to happen when disposing of the MemoryMonitor (application shutdown). I wanted to log any errors as soon as they occur, however.

I've refactored the GarbageCollectionMonitor and added some GarbageCollectionMonitorTests just to make sure cancellation happens as we expect (without bubbling any exceptions up that might disrupt the calling code).

I think ultimately what you're wanting to test is that task continuation, but the fire-and-forget nature of that code makes it difficult to test. I did test it manually in a console app when I was investigating how to detect any non-cancellation errors as soon as they happen.
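
For illustration, here is a rough, self-contained sketch of that pattern using only standard .NET task APIs. It is not the SDK's code: FireAndForgetSketch, MonitorAsync and StartAndLogFaults are hypothetical names, and the loop is a stand-in for whatever GarbageCollectionMonitor.Start actually does. Returning the continuation from the helper is one possible way to make the fault path observable in a test.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class FireAndForgetSketch
{
    // Hypothetical stand-in for the monitor loop: runs until cancelled or until check() throws.
    public static async Task MonitorAsync(Action check, CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            check();
            await Task.Delay(10, token).ConfigureAwait(false);
        }
    }

    // Starts the loop without awaiting it. The continuation only runs if the loop faults,
    // so errors are logged as soon as they occur rather than at shutdown.
    public static Task StartAndLogFaults(Action check, Action<Exception> logError, CancellationToken token)
    {
        return MonitorAsync(check, token).ContinueWith(
            t => logError(t.Exception!.GetBaseException()),
            TaskContinuationOptions.OnlyOnFaulted); // t.Exception is non-null for faulted tasks
    }

    public static async Task Main()
    {
        using var cts = new CancellationTokenSource();
        var continuation = StartAndLogFaults(
            () => throw new InvalidOperationException("boom"),
            ex => Console.WriteLine($"Monitor failed: {ex.Message}"),
            cts.Token);

        // Awaited here only so the demo doesn't exit before the log line is written.
        await continuation;
    }
}
```

Note that if the loop ends through cancellation rather than a fault, the OnlyOnFaulted continuation is itself cancelled, so production code would normally not await the returned task.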

@jamescrosswell
Collaborator Author

How about we make an alpha from this branch and test on Symbol Collector?

We can. Currently the merge target is the version-5.0.0 branch so an alternative would be to make a 5.0.0-alpha1 release from there, once this PR has been approved/merged.

@bruno-garcia
Member

How about we make an alpha from this branch and test on Symbol Collector?

We can. Currently the merge target is the version-5.0.0 branch so an alternative would be to make a 5.0.0-alpha1 release from there, once this PR has been approved/merged.

Let's make the release then, before we approve (let's not merge this accidentally).

@bruno-garcia
Member

@jamescrosswell feel free to add it to https://github.com/getsentry/symbol-collector if it helps test this

Member

@bruno-garcia bruno-garcia left a comment

I'll be honest, I didn't look at this in as much detail as the change might require, but it does seem good to go.

Looking at the preview package though, it does have a LOT of files. I wonder if we have to be concerned about the increase in size (sentry-cli is huge already)?

src/Sentry/Internal/Hub.cs (outdated, resolved)
src/Sentry/Internal/MemoryMonitor.cs (resolved)
@bruno-garcia
Member

Also, @filipnavara might be interested in this feature, so might be willing to review the PR too 🙏

@bruno-garcia
Member

Is CI on Windows really taking over 1 hour and 20 minutes, or is it hanging? https://github.com/getsentry/sentry-dotnet/actions/runs/11659928344/job/32500819174?pr=3667

The other day I killed a build that seemed to be hanging but didn't take note of which agent.

@jamescrosswell
Collaborator Author

Is CI on Windows really taking over 1 hour and 20 minutes, or is it hanging?

Seems to be hanging since merging in the latest updates from the version-5.0.0 branch.

Development

Successfully merging this pull request may close these issues.

Automatic Heap dump collection
4 participants