Closed
Labels
- Sled Agent: Related to the Per-Sled Configuration and Management
- storage: Related to storage.
Description
See also: RFD 118
Sled Agent currently configures a few pieces of data outside datasets explicitly allocated in pools:
- `/var/oxide` contains a variety of configuration information, including...
  - RSS setup information
  - The "Sled Request" used to launch the sled-agent (including underlay information)
  - A list of "All services which should be launched on this sled"
  - A list of "All services which should be launched on u.2 zpools"
- `/zone` contains the "filesystem for all zones"
- `/opt/oxide` contains all the "latest installed system images", and is used to update control plane software which exists outside the ramdisk.
Q: So, why is this bad?
A: All those paths are currently backed by a ramdisk -- specifically rpool -- on gimlets.
This means that when we reboot, a significant portion of the necessary configuration information to launch the sled will be lost. Furthermore, for the zonepath filesystems, a significant portion of user RAM will be dedicated to zone-based filesystems, which we'd prefer to distribute to disk-backed file storage.
Here's a list of some of the work we need to accomplish to mitigate this in a production environment:
- Sled agent should be responsible for formatting U.2s and M.2s with necessary datasets
- Sled agent should move "sled-agent-request" information into the M.2 partitions, duplicating them on write. #2998
- Sled agent should move all RSS-related information into the M.2 partitions, duplicating them on write. #2970
- Sled agent should move all service-related information into the M.2 partitions, duplicating them on write. #2969
  - This would be made easier with the tracking issue for "Self-assembling Zones" (#1898), as we'd only need to store the zone information, and not the corresponding config for "how to launch the zone".
- Sled agent should allocate zonepaths from U.2s
  - This means sled agent needs to handle the allocation / de-allocation of zonepaths much more explicitly.
  - Related: "[sled agent] Creating zones for Propolis Instances leaks datasets" (#1119), "Crucible datasets remain after disks are deleted." (#1313)
- Sled agent should move all software images from `/opt/oxide` into `/pool/int/<UUID>/install`. #2971
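
To make "duplicating them on write" a bit more concrete, here is a rough sketch of what writing one config file to both M.2s could look like. This is not the sled-agent implementation: `write_to_all` and `write_atomically` are hypothetical helpers, the directory and file names are placeholders, and the "fail on the first error" policy is an assumption (a real version might prefer to tolerate one M.2 being unavailable).

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Writes `contents` as `file_name` into every directory in `dirs`
/// (hypothetically, one config dataset per M.2).
/// Assumption: any single failure aborts the whole operation.
fn write_to_all(
    dirs: &[PathBuf],
    file_name: &str,
    contents: &[u8],
) -> io::Result<Vec<PathBuf>> {
    let mut written = Vec::with_capacity(dirs.len());
    for dir in dirs {
        written.push(write_atomically(dir, file_name, contents)?);
    }
    Ok(written)
}

/// Writes to a temporary file in the same directory, syncs it, then renames
/// it into place, so a reader should never observe a half-written file.
fn write_atomically(
    dir: &Path,
    file_name: &str,
    contents: &[u8],
) -> io::Result<PathBuf> {
    fs::create_dir_all(dir)?;
    let tmp = dir.join(format!("{}.tmp", file_name));
    let dst = dir.join(file_name);

    let mut file = fs::File::create(&tmp)?;
    file.write_all(contents)?;
    file.sync_all()?;

    fs::rename(&tmp, &dst)?;
    Ok(dst)
}
```

A caller would pass the config directory on each M.2 as `dirs`. On boot, the reader side would then need a rule for picking between copies if they disagree, which this sketch does not cover.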