-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cli: the flag --sql-advertise-addr is incompletely designed #52266
Comments
Hi @knz, I've guessed the C-ategory of your issue and suitably labeled it. Please re-label if inaccurate. While you're here, please consider adding an A- label to help keep our repository tidy. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
cc @bdarnell : do you think we should polish this further given the functional utility? Or would you rather drop this logic entirely? Or perhaps we'd be satisfied with an intermediate situation where we hide the flag from command-line cc @Amruta-Ranade, this is what we were discussing last week |
See discussion in cockroachdb#52266. Release note: None
It's very low priority, but yes, I think we should expand on this functionality instead of removing it. There should also be an "advertise http addr" (which would also be useful to As for hiding it from |
I get that this is super low priority, but FWIW I ran into a surprising quirk with I tried setting And you can't cheekily work around it by setting Would y'all be open to a patch that allows |
What you describe looks like a bug (and probably a separate one, deserving a separate issue). Thanks for finding it.
Yes that seems like a good idea. If you plan to make a PR, send it my way. @jtsiros @mwang1026 we will want to revive this, as a general item to make it easier to deploy multi-region. We'll need an improvement in this area if we want to expand our k8s operator capabilities towards multi-region (and possibly ditto in the CC infra). |
I have no immediate plans to, but I may come back to it in a while if I get bored and it doesn't otherwise get fixed. |
87027: streamingccl: reduce server count in multinode tests r=samiskin a=samiskin While these tests would pass under stress locally they would fail CI stress, which may be because we were starting more server processes than ever before with 4 source nodes, 4 source tenant pods, and 4 destination nodes. This PR reduces the node count to 3 (any lower and scatter doesn't correctly distribute ranges) and only starts a single tenant pod for the source cluster. Release justification: test-only change Release note: None 87412: cli,server: fix --sql-advertise-addr when --sql-addr is not specified r=a-robinson,ajwerner a=knz Fixes #87040. Informs #52266. cc `@a-robinson` Release justification: bug fix Release note (bug fix): The flag `--sql-advertise-addr` now properly works even when the SQL and RPC ports are shared (because `--sql-addr` was not specified). Note that this port sharing is a deprecated feature in v22.2. 87440: ui: update txn contention insights to use waiting txns as event r=ericharmeling a=ericharmeling This commit updates the transaction workload insights pages to use the waiting contended transaction as the primary contention event, rather than the blocking transaction. Fixes #87284. https://www.loom.com/share/383fec4297a74ec79d90e46f11def792 Release justification: bug fixes and low-risk updates to new functionality Release note: None 87462: upgrade/upgrades: allow CreatedAtNanos to be set when validating migration r=ajwerner a=ajwerner Schema change upgrade migrations to system tables are made idempotent by checking that the descriptor reaches some expected state. In order to ensure that it is in that expected state, some volatile fields need to be masked. We forgot to mask CreatedAtNanos. We also lost the testing which came with these helper functions we use. The vast majority of this PR is reviving testing from #66889. Fixes #85228. Release justification: Import bug fix for backport Release note (bug fix): Some upgrade migrations perform schema changes on system tables. Those upgrades which added indexes could, in some cases, get caught retrying because they failed to detect that the migration had already occurred due to the existence of a populated field. When that happens, the finalization of the new version can hang indefinitely and require manual intervention. This bug has been fixed. Co-authored-by: Shiranka Miskin <shiranka.miskin@gmail.com> Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net> Co-authored-by: Eric Harmeling <eric.harmeling@cockroachlabs.com> Co-authored-by: Andrew Werner <awerner32@gmail.com>
We have marked this issue as stale because it has been inactive for |
Same here. Have next configuration
have no advertise options and on command
|
Today the flag
--sql-advertise-addr
is offered for symmetry with--advertise-addr
.We're not too happy about the status quo. There is more work needed.
Background
The reason why this flag is at all necessary is because of the following specific combination:
--advertise-addr
for the RPC port; this is used internally e.g. by node-node connections--sql-advertise-addr
for when the SQL listener is split from RPC, which we'd like to make security best practice.Pros of the feature
in the node descriptor (
roachpb.NodeDescriptor
): all advertised addresses are stored there.the RPC adv addr (not SQL) is pulled from node descs for node-node connections. This is necessary for normal cluster operation.
inside SQL, the SQL adv addr is used to populate the vtable
crdb_internal.node_build_info
.This might be used by 3rd party automation to discover the addresses of SQL servers, e.g. to automate setting up load balancers. We're not sure (it's also not documented).
cockroach debug zip
needs to establish SQL connections to every node in the cluster, in addition to RPC connections; so even ifdebug zip
is ran on one of the node's VMs, it may still need to reach out to other DCs using the nodes' public address.Currently
debug zip
discovers the RPC and SQL public address by looking at the node descriptor on the 1st node it connects to. That's necessary for proper operation ofdebug zip
but then will drop after Ability to obtaindebug.zip
directly via http endpoint #51008 is addressedSo in summary while a separate RPC adv address is necessary for proper operation, the utility of a separate SQL adv address is relatively small.
Cons of the feature
cockroach gen haproxy
and therefore our own recommended configuration generator for load balancers don't use this feature properlyWays forward
There are two possible ways forward, with different action items:
remove the concept:
debug.zip
directly via http endpoint #51008 and ensure that all thedebug zip
logic (that becomes server-side) operates using RPCs, without the need for node-to-node SQL connections.cockroach gen haproxy
command that all nodes are reachable using the selected address--sql-advertise-addr
and eventually remove it.(Note: in this approach, it would still be possible to split RPC and SQL listeners, it's just that we wouldn't need to configure or store an advertised SQL address server-side)
OR
double down on the feature:
cockroach gen haproxy
use the advertised SQL addressNote that in this second approach, until we have more comprehensive docs it is confusing to mention the option to users upfront; as it raises more questions than it answers. Therefore it may be a good idea to hide the flag until we have more comprehensive deployment docs.
Jira issue: CRDB-3971
The text was updated successfully, but these errors were encountered: