-
Notifications
You must be signed in to change notification settings - Fork 45
Update thing-flinger for modern Omicron #1180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
009b9ac
to
9ede4c0
Compare
9ede4c0
to
88b4ec6
Compare
`create_virtual_hardware.sh` requires access to config.toml to create zpools. We now copy over the smf/sled-agent directory to deployment servers so that the script can find what zpools to create. Fixes #1612
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was able to test this locally today, successfully after one minor hiccup (left a comment about config-rss.toml
naming). Changes all LGTM!
I made all changes related to @jgallagher's review. I also fixed a bug in |
Crucible changes: Per client, queue-based backpressure (#1186) A builder for the Downstairs Downstairs struct. (#1152) Update Rust to v1.76.0 (#1153) Deactivate the read only parent after a scrub (#1180) Start byte-based backpressure earlier (#1179) Tweak CI scripts to fix warnings (#1178) Make `gw_ds_complete` less public (#1175) Verify extent under repair is valid after copying files (#1159) Remove individual panic setup, use global panic settings (#1174) [smf] Use new zone network config service (#1096) Move a few methods into downstairs (#1160) Remove extra clone in upstairs read (#1163) Make `crucible-downstairs` not depend on upstairs (#1165) Update Rust crate rusqlite to 0.31 (#1171) Update Rust crate reedline to 0.29.0 (#1170) Update Rust crate clap to 4.5 (#1169) Update Rust crate indicatif to 0.17.8 (#1168) Update progenitor to bc0bb4b (#1164) Do not 500 on snapshot delete for deleted region (#1162) Drop jobs from Offline downstairs. (#1157) `Mutex<Work>` → `Work` (#1156) Added a contributing.md (#1158) Remove ExtentFlushClose::source_downstairs (#1154) Remove unnecessary mutexes from Downstairs (#1132) Propolis changes: PHD: improve Windows reliability (#651) Update progenitor and omicron deps Clean up VMM resource on server shutdown Remove Inventory mechanism Update vergen dependency Properly handle pre/post illumos#16183 fixups PHD: add `pfexec` to xtask phd-runner invocation (#647) PHD: add Windows Server 2016 adapter & improve WS2016/2019 reliability (#646) PHD: use `clap` for more `cargo xtask phd` args (#645) PHD: several `cargo xtask phd` CLI fixes (#642) PHD: Use ZFS clones for file-backed disks (#640) PHD: improve ctrl-c handling (#634)
Crucible changes: Per client, queue-based backpressure (#1186) A builder for the Downstairs Downstairs struct. (#1152) Update Rust to v1.76.0 (#1153) Deactivate the read only parent after a scrub (#1180) Start byte-based backpressure earlier (#1179) Tweak CI scripts to fix warnings (#1178) Make `gw_ds_complete` less public (#1175) Verify extent under repair is valid after copying files (#1159) Remove individual panic setup, use global panic settings (#1174) [smf] Use new zone network config service (#1096) Move a few methods into downstairs (#1160) Remove extra clone in upstairs read (#1163) Make `crucible-downstairs` not depend on upstairs (#1165) Update Rust crate rusqlite to 0.31 (#1171) Update Rust crate reedline to 0.29.0 (#1170) Update Rust crate clap to 4.5 (#1169) Update Rust crate indicatif to 0.17.8 (#1168) Update progenitor to bc0bb4b (#1164) Do not 500 on snapshot delete for deleted region (#1162) Drop jobs from Offline downstairs. (#1157) `Mutex<Work>` → `Work` (#1156) Added a contributing.md (#1158) Remove ExtentFlushClose::source_downstairs (#1154) Remove unnecessary mutexes from Downstairs (#1132) Propolis changes: PHD: improve Windows reliability (#651) Update progenitor and omicron deps Clean up VMM resource on server shutdown Remove Inventory mechanism Update vergen dependency Properly handle pre/post illumos#16183 fixups PHD: add `pfexec` to xtask phd-runner invocation (#647) PHD: add Windows Server 2016 adapter & improve WS2016/2019 reliability (#646) PHD: use `clap` for more `cargo xtask phd` args (#645) PHD: several `cargo xtask phd` CLI fixes (#642) PHD: Use ZFS clones for file-backed disks (#640) PHD: improve ctrl-c handling (#634) Co-authored-by: Alan Hanson <alan@oxide.computer>
Omicron has undergone many enhancements in the last few months including, but no limited to, some things that affect Falcon:
Some support for these features in falcon was already included in prior commits. This commit includes support during the
install-prereqs
command via the following functions:It also allows a location on the client to be provided for the
config-rss.toml
so that different configs can be used when deploying to different topologies. This is especially useful with falcon.Besides installing all the necessary pre-requisites to support the new Omicron features, thing-flinger itself has been improved by parallelizing all installs and deploys, and delaying the startup of sled-agent until after the overlay files are in place. Prior restarts triggered #1592 which are now no longer an issue. In order to allow this, two new deploy commands were added to
omicron-package
:DeployCommand::Unpack
andDeployCommand::Activate
which separate the unpacking of package files from the installation of SMF manifests and starting of services.While running thing-flinger deployments, I noticed that trust-quorum was no longer working and secrets were not being shared. With some help from @jgallagher that has since been rectified.
With all these changes, thing-flinger is now capable of deploying into arbitrary falcon topologies, with the first one being exemplified in https://github.com/oxidecomputer/falcon-omicron which provides a two node setup. I have run this locally on my helios box and am working to get it running on buildomat.