-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pkg/stanza] Windows input operator should resubscribe when remote host restarts #35175
[pkg/stanza] Windows input operator should resubscribe when remote host restarts #35175
Conversation
/label os:windows |
/label Run Windows Not sure if this works... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We check for a remote subscription here
opentelemetry-collector-contrib/pkg/stanza/operator/input/windows/input.go
Lines 128 to 131 in 9139ce1
subscription := NewLocalSubscription() | |
if i.isRemote() { | |
subscription = NewRemoteSubscription(i.remote.Server) | |
} |
I think we should also do same while resubscribing. What do you say?
961aa95
to
40184e9
Compare
Apart from my last comment, LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When a remote subscription is created its handle is invalid (0). The subscription needs to be opened before using it to read remote events.
1f718d1
to
844a6a4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
cb26e30
to
8595e6f
Compare
8595e6f
to
4ac1c32
Compare
@atoulme this can be merged |
… successfully (#35520) **Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> If `Input::read` returns 0, the operator will attempt to open the subscription again. This happens after a resubscribe *or* whenever there are no event logs to read, which is often. As a result, this error gets logged constantly: ``` "Failed to re-open subscription for remote server"..."error: subscription handle is already open" ``` To fix this, the handle should only attempt to re-open when the resubscribe happens. This is a followup to #35175 which fixed that resubscribe issue for the user that raised it, but they have also been confused by this error being logged. **Link to tracking Issue:** <Issue number if applicable> **N/A** **Testing:** <Describe what testing was performed and which tests were added.> Manually verified log collection is successful before and after remote host reconnects, and that the noisy error is silenced except during resubscribe, where it is relevant. **Documentation:** <Describe the documentation added.> **N/A**
…st restarts (open-telemetry#35175) **Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> While collecting Windows Event Logs from a remote host, if that host restarts or shuts off, the collector begins logging errors and stops collecting WE logs, even after the remote host is back up. Restarting the collector fixes this. To fix this without restarting the collector, the stanza input operator needs to resubscribe if the remote host restarts. Error log: ``` "Failed to read events from subscription","error":"The handle is invalid." ``` **Link to tracking Issue:** <Issue number if applicable> **Testing:** <Describe what testing was performed and which tests were added.> Manually tested with a build of the collector, and confirmed logs continue being collected after the remote host restarts. **Documentation:** <Describe the documentation added.> N/A, this change is expected behavior
… successfully (open-telemetry#35520) **Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> If `Input::read` returns 0, the operator will attempt to open the subscription again. This happens after a resubscribe *or* whenever there are no event logs to read, which is often. As a result, this error gets logged constantly: ``` "Failed to re-open subscription for remote server"..."error: subscription handle is already open" ``` To fix this, the handle should only attempt to re-open when the resubscribe happens. This is a followup to open-telemetry#35175 which fixed that resubscribe issue for the user that raised it, but they have also been confused by this error being logged. **Link to tracking Issue:** <Issue number if applicable> **N/A** **Testing:** <Describe what testing was performed and which tests were added.> Manually verified log collection is successful before and after remote host reconnects, and that the noisy error is silenced except during resubscribe, where it is relevant. **Documentation:** <Describe the documentation added.> **N/A**
… successfully (open-telemetry#35520) **Description:** <Describe what has changed.> <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> If `Input::read` returns 0, the operator will attempt to open the subscription again. This happens after a resubscribe *or* whenever there are no event logs to read, which is often. As a result, this error gets logged constantly: ``` "Failed to re-open subscription for remote server"..."error: subscription handle is already open" ``` To fix this, the handle should only attempt to re-open when the resubscribe happens. This is a followup to open-telemetry#35175 which fixed that resubscribe issue for the user that raised it, but they have also been confused by this error being logged. **Link to tracking Issue:** <Issue number if applicable> **N/A** **Testing:** <Describe what testing was performed and which tests were added.> Manually verified log collection is successful before and after remote host reconnects, and that the noisy error is silenced except during resubscribe, where it is relevant. **Documentation:** <Describe the documentation added.> **N/A**
Description:
While collecting Windows Event Logs from a remote host, if that host restarts or shuts off, the collector begins logging errors and stops collecting WE logs, even after the remote host is back up. Restarting the collector fixes this.
To fix this without restarting the collector, the stanza input operator needs to resubscribe if the remote host restarts.
Error log:
Link to tracking Issue:
Testing:
Manually tested with a build of the collector, and confirmed logs continue being collected after the remote host restarts.
Documentation:
N/A, this change is expected behavior