Skip to content

Conversation

@smklein
Copy link
Collaborator

@smklein smklein commented Apr 17, 2023

  • When parsing disks, look for DumpDevice partition. As defined, these currently only exist within M.2 devices:
    static M2_EXPECTED_PARTITIONS: [Partition; M2_EXPECTED_PARTITION_COUNT] = [
    Partition::BootImage,
    Partition::Reserved,
    Partition::Reserved,
    Partition::Reserved,
    Partition::DumpDevice,
    Partition::ZfsPool,
    ];
  • When we identify one of these partitions, ensure the DumpDevice partition is being used as swap space, using the swap command.

Fixes #2858

Comment on lines +305 to +306
let zpool_name =
Self::initialize_zpool(&log, variant, &paths, &partitions)?;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is unchanged from before this PR, just refactored

ensure_partition_layout(&log, &paths, unparsed_disk.variant)?;
let partitions = ensure_partition_layout(&log, &paths, variant)?;

Self::initialize_swap(&log, &paths, &partitions)?;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call is new

Comment on lines +324 to +327
match paths.partition_device_path(
&partitions,
Partition::DumpDevice,
false,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Part of this is probably my own unfamiliarity with the sled-hardware code, but it might be good to explain why this is going in the DumpDevice partition?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've struggled to find a great summary for this in the illumos docs, but I'm really just trying to follow the convention for this which has been set there.

Reading a combination of

I was able to infer:

I added a summary in 4039c7a

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, this was helpful.

Ok(())
}

// Initialize a swap within the ZfsPool partition.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"a swap device"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I don't see where we do that here. Shouldn't there be a call to Swap::set_swap if that were true?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, this comment is totally wrong - fixed in 2e3533b

@jclulow
Copy link
Collaborator

jclulow commented Apr 17, 2023

Before we move ahead with this I'd like us to consider a number of factors around serviceability and reliability and so on; e.g.,

  • these devices are not redundant, what happens when one fails either wholly or partially
    • processes with pages stored in the device will not be able to get them back, which is effectively catastrophic
    • removing a failed device means we're then back to having no swap until we add another
  • slice backed swap has the virtue of being simple, but it has the downside that none of the data is checksummed; if we page back in and the data is corrupt it seems like we wouldn't notice
  • what monitoring will we have in place to detect persistent runaway pageout, and what impact will paging I/O have on the longevity of this part that is not field replaceable
  • are we correctly extracting a potential dump from the device before engaging it as swap?
  • should we instead just fix the swapfs minimum threshold and whatever else is in the way to enable most of our memory to be used without swap?
  • there is no encryption of this swap data; we don't have a generic encrypted block device overlay and the swap mechanism has no first class encryption facility -- this is a security problem because sensitive data will be paged out eventually

If we decide we still need a swap device, we should consider an alternative strategy as well; viz., using a zvol on a U.2 device for swap. This has several potential benefits:

  • these parts are field replaceable if they wear out
  • swap to ZFS is more complex, but it has:
    • checksumming on all blocks, so we'd at least know we'd seen corruption and could kill and restart the process
    • swap zvols can be compressed
    • swap zvols can be encrypted

There are possibly other strategies to consider too. I don't think this is a slam dunk as-is, and I think we should do the research and design work before proceeding.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Sled Agent should be capable of utilizing swap space on the M.2 dump device

4 participants