-
Notifications
You must be signed in to change notification settings - Fork 45
Start implementing bootstore: Replicated storage required for rack unlock #1656
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@jgallagher One more thing I should mention with regards to this PR. There's no locking or concurrency support of any kind on the DB connection yet. That's because I haven't figured out what I want to do. I want to get more code written first. In the majority of cases, the DB will be Read-Only, and I'm not sure if we want to just cache state at the app layer and use a single connection with an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM; I appreciate splitting this out into an initial PR before it becomes larger - thanks!
I wanted to add some notes on the pq-sys stuff since we spent some time digging into it. To summarize:
We observed this behavior:
Because build scripts appear to only run once and also only get metadata from regular deps (not dev deps), the conclusion is that if your package ever depends on pq-sys directly or indirectly, even via dev-dependency, then you must apply the pattern by adding a regular dependency on pq-sys. You can't use a dev-dependency for pq-sys, even if you only depend on pq-sys via a dev-dependency. |
Crucible changes are: Make `ImpactedBlocks` an absolute range (#1685) update omicron (#1687) Update Rust crate chrono to v0.4.40 (#1682) Update Rust crate tempfile to v3.19.0 (#1676) Log loops during crutest replay (#1674) Refactor live-repair start and send (#1673) Update Rust crate libc to v0.2.171 (#1684) Update Rust crate clap to v4.5.32 (#1683) Update Rust crate async-trait to 0.1.88 (#1680) Update Rust crate bytes to v1.10.1 (#1681) Update Rust crate anyhow to v1.0.97 (#1679) Update Rust crate http to v1 (#1678) Update Rust crate tokio-util to v0.7.14 (#1675) Update Rust crate thiserror to v2 (#1657) Update Rust crate omicron-zone-package to 0.12.0 (#1647) Update Rust crate opentelemetry to 0.28.0 (#1543) Update Rust crate itertools to 0.14.0 (#1646) Update Rust crate clearscreen to v4 (#1656) nightly test polish (#1665) Update Rust crate strum_macros to 0.27 (#1654) Update Rust crate strum to 0.27 (#1653) Update Rust crate rusqlite to 0.34 (#1652) Better return semantics from should_send Trying to simplify enqueue Remove unnecessary argument Remove unused code from `ImpactedBlocks` (#1668) Log when pantry/volume layers activate things (#1670) Fix unused import (#1669) Use `RangeSet` instead of tracking every complete job (#1663) Update Rust crate dropshot to 0.16.0 (#1645) Simplify pending work tracking (#1662) Remove rusqlite dependency from crucible-common (#1659) Update Rust crate reedline to 0.38.0 (#1651) Update Rust crate proptest to 1.6.0 (#1648) Update Rust crate bytes to v1.10.0 (#1644) Update Rust crate tracing-subscriber to 0.3.19 (#1643) Update Rust crate tracing to v0.1.41 (#1642) Update Rust crate toml to v0.8.20 (#1641) Update Rust crate thiserror to v1.0.69 (#1640) Update Rust crate serde_json to v1.0.139 (#1639) Update Rust crate serde to v1.0.218 (#1638) Update Rust crate reqwest to v0.12.12 (#1637) Update Rust crate libc to v0.2.170 (#1636) Update Rust crate indicatif to 0.17.11 (#1635) Update Rust crate httptest to 0.16.3 (#1634) Update Rust crate clap to v4.5.31 (#1632) Update Rust crate chrono to v0.4.39 (#1631) Update Rust crate byte-unit to 5.1.6 (#1630) Update Rust crate async-trait to 0.1.86 (#1629) Update Rust crate csv to 1.3.1 (#1633) Update Rust crate anyhow to v1.0.96 (#1628) Fix a test flake (#1626) Bitpack per-block slot data in memory (#1625) Use `div_ceil` instead of `(x + 7) / 8` (#1624) Add repair server dynamometer (#1618) Propolis changes are: Paper over minor() differences Update crucible (#886) Test oximeter metrics end to end (#855) server: improve behavior of starting VMs that are waiting for Crucible activation (#873) server: extend API request queue's memory of prior requests (#880) Use production propolis-server flags in PHD CI builds (#878) Added a new field to the Board struct that propolis requires.
Crucible changes are: Make `ImpactedBlocks` an absolute range (#1685) update omicron (#1687) Update Rust crate chrono to v0.4.40 (#1682) Update Rust crate tempfile to v3.19.0 (#1676) Log loops during crutest replay (#1674) Refactor live-repair start and send (#1673) Update Rust crate libc to v0.2.171 (#1684) Update Rust crate clap to v4.5.32 (#1683) Update Rust crate async-trait to 0.1.88 (#1680) Update Rust crate bytes to v1.10.1 (#1681) Update Rust crate anyhow to v1.0.97 (#1679) Update Rust crate http to v1 (#1678) Update Rust crate tokio-util to v0.7.14 (#1675) Update Rust crate thiserror to v2 (#1657) Update Rust crate omicron-zone-package to 0.12.0 (#1647) Update Rust crate opentelemetry to 0.28.0 (#1543) Update Rust crate itertools to 0.14.0 (#1646) Update Rust crate clearscreen to v4 (#1656) nightly test polish (#1665) Update Rust crate strum_macros to 0.27 (#1654) Update Rust crate strum to 0.27 (#1653) Update Rust crate rusqlite to 0.34 (#1652) Better return semantics from should_send Trying to simplify enqueue Remove unnecessary argument Remove unused code from `ImpactedBlocks` (#1668) Log when pantry/volume layers activate things (#1670) Fix unused import (#1669) Use `RangeSet` instead of tracking every complete job (#1663) Update Rust crate dropshot to 0.16.0 (#1645) Simplify pending work tracking (#1662) Remove rusqlite dependency from crucible-common (#1659) Update Rust crate reedline to 0.38.0 (#1651) Update Rust crate proptest to 1.6.0 (#1648) Update Rust crate bytes to v1.10.0 (#1644) Update Rust crate tracing-subscriber to 0.3.19 (#1643) Update Rust crate tracing to v0.1.41 (#1642) Update Rust crate toml to v0.8.20 (#1641) Update Rust crate thiserror to v1.0.69 (#1640) Update Rust crate serde_json to v1.0.139 (#1639) Update Rust crate serde to v1.0.218 (#1638) Update Rust crate reqwest to v0.12.12 (#1637) Update Rust crate libc to v0.2.170 (#1636) Update Rust crate indicatif to 0.17.11 (#1635) Update Rust crate httptest to 0.16.3 (#1634) Update Rust crate clap to v4.5.31 (#1632) Update Rust crate chrono to v0.4.39 (#1631) Update Rust crate byte-unit to 5.1.6 (#1630) Update Rust crate async-trait to 0.1.86 (#1629) Update Rust crate csv to 1.3.1 (#1633) Update Rust crate anyhow to v1.0.96 (#1628) Fix a test flake (#1626) Bitpack per-block slot data in memory (#1625) Use `div_ceil` instead of `(x + 7) / 8` (#1624) Add repair server dynamometer (#1618) Propolis changes are: Paper over minor() differences Update crucible (#886) Test oximeter metrics end to end (#855) server: improve behavior of starting VMs that are waiting for Crucible activation (#873) server: extend API request queue's memory of prior requests (#880) Use production propolis-server flags in PHD CI builds (#878) --------- Co-authored-by: Alan Hanson <alan@oxide.computer>
Currently RSS is in charge of generating the rack secret and the bootstrap agent is in charge of establishing sprockets sessions and transferring shares required for rack unlock. The shares themselves, and the trust quorum membership is stored in a file on each sled. The major problems with this implementation are that it is not resilient to failure and does not allow reconfiguration of the trust quorum. Additionally, the trust quorum code is somewhat scattered across the sled-agent, and a little hard to follow.
This change introduces the
bootstore
, which will become responsible for all trust quorum protocol and storage on the rack. The bootstore will be instantiated by the boot agent, which will be responsible for establishing sprockets sessions that the trust quorum can operate over. This allows for a readable, testable implementation of the protocol and storage without having to worry about extraneous details related to networking or the SP. We may choose to pull that in later to a higher level module in the bootstore. I intend to utilize property based tests with fault injection to cover the protocol.This is an early WIP of the bootstore, and lays out the basic structure of the code, while implementing some small bits. The goal is to get this out early before it gets too large to easily review.
The
Node
is the main entry point to the codebase, although most logic has not been implemented. TheNode
uses theDb
to query and store trust quorum information. The goal is to build up a fairly general two-phase commit mechanism that can also be used for storing networking information required to setup NTP so that CockroachDb can boot and the rest of the control plane software can run. The two-phase commit is standard, but as always has the problem of any node failing halts the protocol. We will manage this by moving to a new epoch and reconfiguration if this occurs. Of course the devil is in the details, but it's not incredibly complicated. This implementation will supercede what is written in section 5 of RFD 238. I will update that document with the new protocol. One thing that was not considered there was transferring the encrypted (wrapped) root secret to new nodes that don't have the original root secret derived from the epoch 0 rack secret. As part of this, we'll distribute it as part of the 2-phase commit. There are a couple mechanisms with various levels of security, that I won't get into here. The important thing to know for this PR is that I created a table and model for storing the secret. Code to use it is unimplemented so far.There is also the notion of a [
Coordinator
] that is not yet implemented. The coordinator interacts with the sled-agent to record important information in nexus and get instructions related to trust quorum from nexus. It only stores state in memory and can crash without consequence. It may end up being driven by a saga.I also copied the
trust_quorum
directory from the sled-agent as thebootstore
will become responsible for the trust quorum. I haven't yet removed this code from the sled-agent, as this work is preliminary and I don't want to hook it in yet.