-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage/pg: introduce pg_snapshot_statement_timeout LD flag #21150
storage/pg: introduce pg_snapshot_statement_timeout LD flag #21150
Conversation
23ac046
to
3cebac9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo question about the default value of the parameter.
src/sql/src/session/vars.rs
Outdated
/// Sets the `statement_timeout` value to use during the snapshotting phase of | ||
/// PG sources. | ||
const PG_SNAPSHOT_STATEMENT_TIMEOUT: ServerVar<Duration> = ServerVar { | ||
name: UncasedStr::new("pg_snapshot_statement_timeout"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preference to spell out more clearly that this is about PostgreSQL sources and not other PG-like things (e.g., pgwire, the SQL interface generally, whatever).
name: UncasedStr::new("pg_snapshot_statement_timeout"), | |
name: UncasedStr::new("postgres_source_snapshot_statement_timeout"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, crap, just saw that we introduced pg_replication_*
variables already. Oh well. Scratch my suggestion. Consistency with what we have is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benesch The pg_replication_*
variables will eventually go away in favor of being applied more generally to all PG client connections. Let me see if I can bundle that work into this and we can make the names less ambiguous overall.
&format!("SET statement_timeout = {}", config.params.pg_snapshot_statement_timeout.as_millis()) | ||
).await?; | ||
|
||
mz_ore::soft_assert!{{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good use of a soft_assert!
!
value: &mz_postgres_util::DEFAULT_SNAPSHOT_STATEMENT_TIMEOUT, | ||
description: "Sets the `statement_timeout` value to use during the snapshotting phase of PG sources (Materialize)", | ||
internal: true, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will you do me a favor and make sure this works properly for durations of 0
? I think it gets mapped to usize::MAX
which might not work? But also you wrote a test and that didn't choke. So IDK.
Slack context about the weirdness of duration configuration parameters: https://materializeinc.slack.com/archives/C02FWJ94HME/p1691674999794559
Thanks for doing this, @sploiselle! Will be glad to not have to debug this issue a third time! |
4b17bb3
to
9a2dc76
Compare
This PR has higher risk. Make sure to carefully review the file hotspots. In addition to having a knowledgeable reviewer, it may be useful to add observability and/or a feature flag. What's This?
Buggy File Hotspots:
|
@MaterializeInc/adapter PTAL + cc @benesch if you're interested Expanded the scope of this PR to include:
Nightly failures are not caused by this PR and have something to do with statement logging. |
Thanks for slogging! Going to tap out of the re-review due to time constraints but love the movement towards unifying these settings as much as possible. |
value: &mz_postgres_util::DEFAULT_REPLICATION_KEEPALIVE_RETRIES, | ||
/// when connecting to PG via `mz_postgres_util`. | ||
const PG_SOURCE_KEEPALIVES_RETRIES: ServerVar<u32> = ServerVar { | ||
name: UncasedStr::new("pg_source_keepalives_retries"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These will break any LD settings we have. Do those need to be audited/duplicated to these new values before releasing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inadvertently, these features were never added to LaunchDarkly so they cannot have been set previously. Will add the new ones to LaunchDarkly, though.
src/sql/src/session/vars.rs
Outdated
name: UncasedStr::new("pg_source_connect_timeout"), | ||
value: &mz_postgres_util::DEFAULT_CONNECT_TIMEOUT, | ||
value: &None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When a user runs SHOW ALL
, what would appear with this change? Will they see the default values that will be set if unconfigured here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh good catch here--made these non-optional again and will show their defaults.
9a2dc76
to
9890d9f
Compare
Previously, the configurable TCP timeouts only applied when establishing a PG replication client. With this change, we apply the default TCP timeouts to all PG connections established through postgres-util, while still making the values configurable.
9890d9f
to
783178f
Compare
Motivation
This PR adds a known-desirable feature. MaterializeInc/database-issues#4657
Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way), then it is tagged with aT-proto
label.