-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[exporter/clickhouse] Default async_insert
to true. Added related config option.
#32340
[exporter/clickhouse] Default async_insert
to true. Added related config option.
#32340
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// AsyncInsert if true will enable async inserts. Default is `true`. | ||
// Ignored if async inserts are configured in the `endpoint` or `connection_params`. | ||
// Async inserts may still be overridden server-side. | ||
AsyncInsert *bool `mapstructure:"async_insert"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a pointer instead of just a value? The only reason I could think of would be for it to be empty, as in your tests, but this is true
by default, so it couldn't be empty, as far as I can tell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default only gets applied if it's not in endpoint
/connection_params
AND if cfg.AsyncInsert
is nil
.
If it were a regular bool
, Go would initialize the struct with false
, and so in the config parsing, I wouldn't be able to determine if it was set to false
by the user or by the struct's initialization.
So the code would look like:
// This effectively ignores my config if I set cfg.AsyncInsert to false.
if cfg.AsyncInsert == false {
queryParams.Set("async_insert", "true")
}
So if I wanted it to remain false
in my config, it would never work, I would have to set it in endpoint
or connection_params
.
This is similar to the database
config option, except in that case Go will initialize it to ""
, so the code checks for that instead of nil
.
TL;DR: false is false
, true is true
, nil
is true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I think I understand your concern. However, the way configurations are parsed by the collector works in this case. It first calls createDefaultConfig, then parses whatever configuration the user defined, overwriting any default values with given values (and leaving default values as-is if not defined by user). This is done here, for reference.
So even if you have a bool
with default true
, if you set it to true
in createDefaultConfig
, it will only be overwritten if the user specifies async_insert: false
in their config.
So would it work to use bool
for your use case, or is there more reason than that?
I realize this logic is a bit involved and complicated since this variable is coming from potentially three different places, so would it work to add some kind of getter method that will handle precedence of different sources of async_insert
and then require all references be to the getter method instead of directly parsing config options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @SpencerTorres, I was wondering if you got a chance to test this, or look into it more?
@crobert-1 I appreciate the review! I have applied all of your suggestions:
Let me know if anything else stands out. Thanks! |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
Hey @SpencerTorres, feel free to reopen when you're able to work on this again, I just had a question/suggestion around the boolean pointer. |
Description:
Sets
async_insert
to true by default to enable asynchronous inserts.Because this value is being given a default, I have added a config option under the same name.
Keep in mind that if
async_insert
is provided inendpoint
orconnection_params
it will take precedence and ignore this new config option.This is similar to how the
database
config option behaves.The goal is to provide better insert performance by default, since not all users will know to set it in their DSN URL.
This also opens the discussion to whether or not this is a breaking change. Depending on the deployment's telemetry throughput, this could be an unexpected change that leads to
TOO_MANY_PARTS
errors. I don't expect this to be the case however, but I welcome any discussion about this concern.Link to tracking Issue:
Testing:
Ran integration tests. Also added an abundance of tests to check the behavior of
async_insert
when present inendpoint
,connection_params
, and exporter config.Documentation:
Added new config option to README + code comments.
Unrelated change, also updated README's SQL samples to use
sql
instead ofclickhouse
for the code samples to enable proper syntax highlighting. ClickHouse SQL is compatible with plain SQL.