Document method to ask a running node for its ID #124988

jgallagher · 2024-06-03T15:05:30Z

Decommissioning a CRDB node requires knowing its node ID. Node IDs are printed and logged on startup, but I'd like a more precise way to programmatically determine the ID of a running node.

Initially I thought I could gather it from the output of cockroach node status based on the listening address, but that can be imprecise; e.g., if I do gross things with running processes and timing, I can get output like this:

  id |   address   | sql_address |  build  |         started_at         |         updated_at         | locality | is_available | is_live
-----+-------------+-------------+---------+----------------------------+----------------------------+----------+--------------+----------
   1 | [::1]:33331 | [::1]:33331 | v22.1.9 | 2024-06-03 12:46:44.335389 | 2024-06-03 12:50:06.881011 |          | true         | true
   2 | [::1]:33332 | [::1]:33332 | v22.1.9 | 2024-06-03 12:46:52.257414 | 2024-06-03 12:49:11.929403 |          | false        | false
   7 | [::1]:33332 | [::1]:33332 | v22.1.9 | 2024-06-03 12:50:06.200582 | 2024-06-03 12:50:06.880881 |          | true         | true
   8 | [::1]:33332 | [::1]:33332 | v22.1.9 | 2024-06-03 12:49:14.815347 | 2024-06-03 12:50:04.356989 |          | true         | true

where nodes 2 and 8 used to be running at a particular address; now node 7 is running there, but based on just this output I can't tell whether it's 7 or 8 (since both claim to be available and live).
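To make the ambiguity concrete, here is a minimal Python sketch that groups the rows above by address. The row tuples are copied by hand from the output above, and the helper names (`live_ids_by_address`, `resolve_node_id`) are hypothetical, not part of any CockroachDB tooling.

```python
from collections import defaultdict

# Sample rows copied from the `cockroach node status` output above:
# (id, address, is_available, is_live)
STATUS_ROWS = [
    (1, "[::1]:33331", True, True),
    (2, "[::1]:33332", False, False),
    (7, "[::1]:33332", True, True),
    (8, "[::1]:33332", True, True),
]


def live_ids_by_address(rows):
    """Group the IDs of nodes reported as available and live by address."""
    by_address = defaultdict(list)
    for node_id, address, is_available, is_live in rows:
        if is_available and is_live:
            by_address[address].append(node_id)
    return dict(by_address)


def resolve_node_id(rows, address):
    """Return the single live node ID at `address`, or raise if ambiguous."""
    candidates = live_ids_by_address(rows).get(address, [])
    if len(candidates) != 1:
        raise LookupError(
            f"expected exactly one live node at {address}, found {candidates}"
        )
    return candidates[0]
```

With these rows, `resolve_node_id(STATUS_ROWS, "[::1]:33331")` returns 1, while the same call for `[::1]:33332` raises `LookupError` because both 7 and 8 are reported as live there.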

I asked on Slack, and was told about

select crdb_internal.node_id();

which is exactly what I want. (For my use case I don't particularly care whether this comes from the SQL shell or the HTTP API or a CLI invocation.) However, I can't find this in the crdb-internal docs, and therefore don't know whether it would get the "use in production" stability checkmark.
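For the CLI route, a rough Python sketch of wrapping that query might look like the following. It assumes the `cockroach` binary is on `PATH`, that the node accepts `--insecure` connections (a secured cluster would need certificate flags instead), and that `--format=csv` prints one header row followed by one value row. The function names are hypothetical, and the exact output shape should be checked against the version in use.

```python
import csv
import io
import subprocess

QUERY = "SELECT crdb_internal.node_id();"


def parse_node_id(csv_output: str) -> int:
    """Parse CSV output assumed to be a header row plus one single-value row."""
    rows = [row for row in csv.reader(io.StringIO(csv_output)) if row]
    if len(rows) != 2 or len(rows[1]) != 1:
        raise ValueError(f"unexpected output: {csv_output!r}")
    return int(rows[1][0])


def query_node_id(host: str, insecure: bool = True) -> int:
    """Ask the node listening at `host` for its own ID via `cockroach sql`."""
    cmd = ["cockroach", "sql", f"--host={host}", "--format=csv", "-e", QUERY]
    if insecure:
        # Assumption: an insecure test cluster, as in the example output above.
        cmd.append("--insecure")
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_node_id(result.stdout)
```

Because the query runs on whichever node accepts the connection, pointing `host` at the address in question sidesteps the stale rows in `cockroach node status`.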

Could node_id() be added to those docs? If it's not suitable for production use, is there a different way to gather node IDs?

Jira issue: CRDB-39187


blathers-crl · 2024-06-03T15:05:33Z

Hello, I am Blathers. I am here to help you get the issue triaged.

It looks like you have not filled out the issue in the format of any of our templates. To best assist you, we advise you to use one of these templates.

I was unable to automatically find someone to ping.

If we have not gotten back to your issue within a few business days, you can try the following:

Join our community slack channel and ask on #cockroachdb.
Try to find someone from here if you know they worked closely on the area, and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

yuzefovich · 2024-06-03T20:02:54Z

Thanks for raising this request!

> therefore don't know whether it would get the "use in production" stability checkmark

Currently, anything with the crdb_internal.* prefix is not really suitable for production use. I know that we documented virtual tables in the crdb_internal schema (the docs page you referenced above), but even those documented virtual tables are "less production ready" than other documented features (for example, we've renamed or dropped columns from those tables in patch releases, so SELECT * FROM crdb_internal.<some_vtable> could break even without a major upgrade). AFAIK we don't document any of the crdb_internal.* built-ins at all.

Having said that, the behavior of the crdb_internal.node_id built-in is unlikely to change for some time. I can only imagine that this might happen as part of our efforts to build out multi-tenancy / Unified Architecture for all clusters (currently, it's opt-in).

Your request for improving documentation around decommissioning seems reasonable to me, so cc'ing @taroface @rmloveland for triaging this issue further. Perhaps this needs to be moved to the cockroachdb/docs repo.

jgallagher added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Jun 3, 2024

blathers-crl bot added O-community Originated from the community X-blathers-untriaged blathers was unable to find an owner labels Jun 3, 2024

yuzefovich added docs-todo A-docs and removed X-blathers-untriaged blathers was unable to find an owner labels Jun 3, 2024
