-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[extension/solarwindsapmsettingsextension] Added part one of the concrete implementation of solarwindsapmsettingsextension #30788
Conversation
@@ -13,7 +14,7 @@ import ( | |||
) | |||
|
|||
const ( | |||
DefaultInterval = "1m" | |||
DefaultInterval = time.Duration(10000000000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it 10*time.Second
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks!
Fixed.
Interval string `mapstructure:"interval"` | ||
} | ||
|
||
func (cfg *Config) Validate() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is validation the removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning an error in Config.Validate()
will cause collector not starting up.
While we are testing with the opentelemetry-lambda
collector lambda layer, the collector error from this function can cause the lambda function (user's lambda function) timeout. i.e. A bad config in this extension can break user's lambda function.
While we think a bad config in this extension (e.g. a wrong endpoint causing no such host
gRPC error or a missing / wrong key to get a setting should not be a fatal to the collector process.
So we moved the logic to the extension.
When there is a bad config, the extension will log messages and will be in noop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @jerrytfleung ,
With regards to:
Returning an error in Config.Validate() will cause collector not starting up.
This was a purposeful decision considering that invalid config either means assuming defaults (like a default region, default ingestion endpoint) or letting it run in a broken state (which means it can silently fail)
From my perspective, converting into a noop component on a bad configuration is an anti pattern for a lot of users, a critical piece of infrastructure.
Moreover, without this functionality, the command otelcol validate
will return fine which may also contribute to user confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Added back otelcol validate
function
} else { | ||
extension.logger.Warn("solarwindsapmsettingsextension is in noop. Please check config") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not needed. Config.Validate()
is called by the builder automatically and collector doesn't start with returned error message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but it can break user's application when it is used in collector lambda layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but it can break user's application when it is used in collector lambda layer.
The time couple of lambda to lambda extensions can be painful, I am not sure it justifies removing all protections. I would recommend that you raise that the lambda project as a feature in order to allow the lambda layer to run without error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I will raise that to lambda project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raised a feature request issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MovieStoreGuy We are thinking to add back Config.Validate()
and replace any invalid field of the configuration by default values. As we set default values, Config.validate()
will always return nil
. And the extension will not need to work in noop mode anymore.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added back Config.Validate()
This may answer my two comments. But why? |
replied in the thread. |
@atoulme @MovieStoreGuy I tried my best to address your review comments / concerns. Can you please kindly to elaborate and review this PR again? Thanks a lot. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only a comment from me and the rest appear fine.
|
||
func (cfg *Config) Validate() error { | ||
// Endpoint | ||
matched, _ := regexp.MatchString(`apm.collector.[a-z]{2,3}-[0-9]{2}.[a-z\-]*.solarwinds.com:443`, cfg.Endpoint) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you ever want this to be localhost for testing or other purposes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. We want the endpoint to be one of our APM collector endpoints listed in here
i.e.
apm.collector.na-01.cloud.solarwinds.com:443
apm.collector.na-02.cloud.solarwinds.com:443
apm.collector.eu-01.cloud.solarwinds.com:443
apm.collector.apj-01.cloud.solarwinds.com:443
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only a comment from me and the rest appear fine.
@MovieStoreGuy I fixed the ordering. Can you kindly review again? Thanks!
return nil | ||
} | ||
|
||
func (extension *solarwindsapmSettingsExtension) Shutdown(_ context.Context) error { | ||
extension.logger.Debug("Shutting down solarwinds apm settings extension") | ||
if extension.conn != nil { | ||
return extension.conn.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flip the ordering and that should resolve the shutdown issue :)
@atoulme @MovieStoreGuy I addressed the PR comment and updated the branch. Can you kindly review this PR again? Thank you! |
Hey @atoulme @MovieStoreGuy, may we know your feedback? We still want to proceed with this PR. Thanks! |
Description:
github.com/solarwindscloud/apm-proto
dependency in go.modstring
totime.Duration
Link to tracking Issue:
27668
Testing:
Documentation:
interval
and the default value