New component: Failover Connector #20766
Comments
I'm glad you made this @djaglowski! I was just chatting with @atoulme about adding failover and circuit breaker support for exporters a couple days ago. The connector seems like a great method to add broad failover support. How about tweaking this slightly to support 1..N entries as a yaml flow sequence? It would reduce complexity in the failover connector by removing the need for keys (primary, secondary, etc.) in order to choose the next pipeline to failover to. Example:
|
I'd like to sponsor this. |
@sethallen, I like the idea of allowing a priority list, but I think we should leave room for other parameters as well. I also think we need to allow multiple pipelines per "level".
connectors:
  failover:
    priority:
      - [logs/main]
      - [logs/backup, logs/backup2]
      - [logs/backup/3]
    min_failover_interval: 2m # Possibly would add this in future
@djaglowski How would the multiple pipelines be used? As a fan-out, or as an ordered list (Priority 1, Priority 2-1, Priority 2-2, ... Priority N)?
@cparkins, when there are multiple pipelines at the same priority level, it would fan out data to those pipelines. |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping |
@djaglowski I also had this feature request, I'd be happy to work on it/ support any way I can. |
@akats7, any help moving this forward would be great. I'll be happy to review any PRs. |
@djaglowski sounds good. Could I please be assigned this issue?
@djaglowski / @akats7 / @atoulme - Perhaps this work effort can be merged with what @cparkins has been working on internally for us over the last few months. He added resiliency features (Failover, Circuit Breaker) to the Splunk HEC Exporter for the OTel Collector and submitted them in the PR below: |
@sethallen, I'm supportive of the idea. In my opinion, failover at least should be implemented as a connector because in many cases it may be appropriate to failover to a different type of exporter. If I recall correctly, you and/or @cparkins looked into the idea of implementing other resiliency features into a connector. Do you still see that as a viable path? Either way, I think the failover connector should move forward and we can add additional capabilities based on a proposal. |
^ I was able to begin looking into this recently and will open a first pass PR for this shortly. |
That's exciting, @akats7. We've been maintaining an internal fork of the resiliency features added to the Splunk HEC Exporter and would love to get these features somewhere into the mainline collector. Your PR for a connector will be great to see, and hopefully we can help with it. Cheers!
This is the Part 1 PR for the Failover Connector (split according to the CONTRIBUTING.md doc). Link to tracking issue: #20766. Testing: added factory test. Note: the full-functionality PR exists [here](#27641) and will likely be refactored to serve as the Part 2 PR. cc: @djaglowski @sethallen @MovieStoreGuy
This is the 2nd PR for the failover connector; it implements the core failover functionality. It is currently in place for traces and, once solidified, will be repeated for metrics and logs. Link to tracking issue: #20766. Note: I will add traces tests today, but I am pushing this up now to begin review. cc: @djaglowski @fatsheep9146
This is the 3rd PR for the failover connector. It adds support for metric and log pipelines. Link to tracking issue: #20766. cc: @djaglowski @fatsheep9146
Would love to revive this, definitely interested in this topic. |
Thanks for pinging this @verejoel. An implementation is in place but stability is still marked as |
Hey @djaglowski @verejoel, yep, the MVP functionality is in place. I did have one more change I've been planning to push, so I'll push that along with the update to Alpha. |
@akats7, should we close this issue as completed? |
Hey @djaglowski, yep I think we can close this |
Thanks @akats7! |
The purpose and use-cases of the new component
A connector that routes data based on the current health status of a downstream component, typically an exporter.
I have heard several users ask for the ability to send data to a backup exporter if a primary exporter fails. I believe this could be implemented as a routing connector.
The user would specify at least one pipeline to which data would typically be routed. Additionally, the user must specify one or more backup pipelines, which would be used when an error is encountered.
Initially, I think the trigger for routing to a backup pipeline could be based on backpropagated errors, though this is not yet very robust (see open-telemetry/opentelemetry-collector#7460). At a later time, I imagine this could be based on the health status of an exporter (see open-telemetry/opentelemetry-collector#6344).
Example configuration for the component
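A minimal sketch of how such a connector might be wired into a collector config, assuming the `priority` list shape proposed in the comments on this issue. The field names under `failover`, the pipeline names, and the exporter endpoints are illustrative assumptions, not the merged component's confirmed schema; check the component's README for the actual keys.

```yaml
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  otlp/primary:
    endpoint: primary.example.com:4317   # hypothetical endpoint
  otlp/backup:
    endpoint: backup.example.com:4317    # hypothetical endpoint

connectors:
  failover:
    # Assumed shape from this thread: one list entry per priority level,
    # highest priority first. Multiple pipelines in one entry would be
    # fanned out to together.
    priority:
      - [logs/main]
      - [logs/backup]

service:
  pipelines:
    # The connector acts as the exporter of the ingest pipeline ...
    logs/in:
      receivers: [otlp]
      exporters: [failover]
    # ... and as the receiver of each pipeline it can route to.
    logs/main:
      receivers: [failover]
      exporters: [otlp/primary]
    logs/backup:
      receivers: [failover]
      exporters: [otlp/backup]
```

In this sketch, data normally flows from `logs/in` to `logs/main`; if consuming data on `logs/main` returns an error, the connector would route to `logs/backup` instead.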
Telemetry data types supported
traces->traces
metrics->metrics
logs->logs
Is this a vendor-specific component?
Sponsor (optional)
No response
Additional context
No response