[exporter/signalfx] Fix memory leak on shutdown #30887
Added failing connector test as freq of #31005

Tests have passed 6 times in a row, I believe this is ready for review.

This PR was marked stale due to lack of activity. It will be closed in 14 days.
LGTM, just one clarification
exporter/signalfxexporter/internal/correlation/correlation_test.go
**Description:**

`goleak` was detecting leaking goroutines in tests; this PR attempts to resolve them. I found what appeared to be a couple of races but can't reproduce them locally, so I'll run CI a few times to ensure this works as expected.

Changes in PR:
1. Add a correlation client `Shutdown` function that blocks on the waitgroup. This is the main fix of this PR and should fix the leaking goroutines.
2. Reorganize the shutdown process of the apm client correlation test suite to properly synchronize shutting down.
3. Fix typo
4. Add goleak checks to `exporter/signalfx/internal/correlation` and `exporter/signalfx/internal/apm/correlations`

**Link to tracking Issue:** Resolves open-telemetry#30864 open-telemetry#30438

**Testing:** All existing and added tests should be passing. Since this has only failed in CI, I'm going to run it a few times before marking as ready for review.
…ry#30887)" (open-telemetry#31541)

This reverts commit 494bdb8. The change introduced a flaky test, making the build unstable. Reverting for now.

Examples:
- https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/8135561006/job/22233883822
- https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/8131615657/job/22221154096
- https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/8118428574/job/22192725196
**Description:**

Changes in PR:
1. Add a correlation client `Shutdown` function that blocks on the waitgroup. This is the main fix of this PR and should fix the leaking goroutines.
2. Reorganize the shutdown process of the apm client correlation test suite to properly synchronize shutting down.
3. Fix typo
4. Only block the request sender until the context is cancelled. The request processor is shut down when the context is cancelled, so blocking unconditionally would leave `Shutdown` waiting forever, since the request would never be processed.
5. Enable goleak in some more packages.

**Note**: This contains the exact same contents as #30887, but change number 4 is new and should resolve the test issue the original PR was causing.

**Link to tracking Issue:** Resolves #30864 #30438

**Testing:** All existing tests are passing, as well as the added goleak checks. I'm going to run this a number of times to help ensure it's not flaky anymore.
@crobert-1 any idea which version introduced this memory leak?

Not exactly sure, but it's been present since at least