[exporter/signalfx] Fix memory leak on shutdown #30887

Merged on Mar 1, 2024 (13 commits)

Conversation

crobert-1
Member

Description:

`goleak` was detecting leaking goroutines in tests; this PR attempts to resolve that. I found what appeared to be a couple of races but can't reproduce them locally, so I'll run CI a few times to ensure this works as expected.

Changes in PR:

  1. Add a correlation client `Shutdown` function that blocks on the waitgroup. This is the main fix of this PR and should resolve the leaking goroutines.
  2. Reorganize the shutdown sequence of the APM client correlation test suite so that shutdown is properly synchronized.
  3. Fix a typo.
  4. Add goleak checks to `exporter/signalfx/internal/correlation` and `exporter/signalfx/internal/apm/correlations`.

Link to tracking Issue:
Resolves #30864
#30438

Testing:
All existing and added tests should be passing. Since this has only failed in CI, I'm going to run it a few times before marking it as ready for review.

@crobert-1
Member Author

Added failing connector test as freq of #31005

@crobert-1
Member Author

Tests have passed 6 times in a row, I believe this is ready for review.

@crobert-1 crobert-1 marked this pull request as ready for review February 6, 2024 19:30
@crobert-1 crobert-1 requested a review from a team February 6, 2024 19:30
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Feb 23, 2024
@crobert-1 crobert-1 removed the Stale label Feb 26, 2024
@atoulme atoulme added the ready to merge Code review completed; ready to merge by maintainers label Feb 29, 2024
Member

@dmitryax dmitryax left a comment

LGTM, just one clarification

@mx-psi mx-psi merged commit 494bdb8 into open-telemetry:main Mar 1, 2024
141 of 142 checks passed
@github-actions github-actions bot added this to the next release milestone Mar 1, 2024
dmitryax added a commit to dmitryax/opentelemetry-collector-contrib that referenced this pull request Mar 4, 2024
XinRanZhAWS pushed a commit to XinRanZhAWS/opentelemetry-collector-contrib that referenced this pull request Mar 13, 2024
mx-psi pushed a commit that referenced this pull request Mar 25, 2024
**Description:**
Changes in PR:

1. Add a correlation client `Shutdown` function that blocks on the
waitgroup. This is the main fix of this PR and should resolve the leaking
goroutines.
2. Reorganize the shutdown sequence of the APM client correlation test
suite so that shutdown is properly synchronized.
3. Fix a typo.
4. Only block the request sender until the context is cancelled. The
request processor is shut down when the context is cancelled, so a sender
that kept blocking after cancellation would leave `Shutdown` waiting
forever, since the request would never be processed.
5. Enable goleak in some more packages.

**Note**: This contains exactly the same changes as
#30887,
except that change number 4 is new and should resolve the test issue the
original PR was causing.

**Link to tracking Issue:**
Resolves #30864
#30438

**Testing:**
All existing tests are passing, as are the added goleak checks. I'm
going to run this a number of times to help ensure it is no longer
flaky.
@sparam

sparam commented Mar 25, 2024

@crobert-1 any idea which version introduced this memory leak?

@crobert-1
Member Author

> @crobert-1 any idea which version introduced this memory leak?

Not exactly sure, but it's been present since at least v0.86.0.

Labels
exporter/signalfx ready to merge Code review completed; ready to merge by maintainers
Development

Successfully merging this pull request may close these issues.

[exporter/signalfx] Failing goleak test
6 participants