
@2uasimojo (Member) commented Aug 5, 2025

Previously, any time the installer added a field to metadata.json, we would need to evaluate it and possibly add a bespoke field and code path to make sure it was supplied to the destroyer at deprovision time.

With this change, we're offloading metadata.json verbatim (except in some cases we have to scrub/replace credentials fields -- see HIVE-2804 / #2612) to a new Secret in the ClusterDeployment's namespace, referenced from a new field: ClusterDeployment.Spec.ClusterMetadata.MetadataJSONSecretRef.

For legacy clusters -- those created before this change -- we attempt to retrofit the new Secret based on the legacy fields. This is best effort and may not always work.

This change then adds a new generic destroyer via the (existing) hiveutil deprovision command that consumes this metadata.json to deprovision the cluster.

This new behavior is the default, but we also include an escape hatch to run the platform-specific legacy destroyer by setting the following annotation on the ClusterDeployment:

`hive.openshift.io/legacy-deprovision: "true"`
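The core idea — the destroyer consuming the Secret's metadata.json payload directly instead of hand-plumbed fields — can be sketched roughly like this. This is a hedged, illustrative Go snippet: the struct below is a stand-in for the installer's actual types.ClusterMetadata, not Hive's real code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clusterMetadata mirrors a few illustrative fields of the installer's
// metadata.json; the real destroyer unmarshals into the installer's
// types.ClusterMetadata instead.
type clusterMetadata struct {
	ClusterName string `json:"clusterName"`
	InfraID     string `json:"infraID"`
	AWS         *struct {
		Region string `json:"region"`
	} `json:"aws,omitempty"`
}

// metadataFromSecret decodes the verbatim metadata.json payload that would
// be stored in the Secret referenced by MetadataJSONSecretRef.
func metadataFromSecret(raw []byte) (*clusterMetadata, error) {
	md := &clusterMetadata{}
	if err := json.Unmarshal(raw, md); err != nil {
		return nil, err
	}
	return md, nil
}

func main() {
	raw := []byte(`{"clusterName":"mycluster","infraID":"mycluster-x7k2q","aws":{"region":"us-east-1"}}`)
	md, err := metadataFromSecret(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(md.InfraID, md.AWS.Region)
}
```

The point is that new installer fields ride along in the raw JSON with no per-field plumbing in Hive.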

@openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) on Aug 5, 2025
@openshift-ci-robot commented Aug 5, 2025

@2uasimojo: This pull request references HIVE-2302 which is a valid jira issue.

In response to this:

Well, mostly.

Previously any time installer added a field to metadata.json, we would need to evaluate and possibly add a bespoke field and code path for it to make sure it was supplied to the destroyer at deprovision time.

With this change, we're instead offloading it verbatim to a new Secret in the ClusterDeployment's namespace, referenced from a new field: ClusterDeployment.Spec.ClusterMetadata.MetadataJSONSecretRef.

Instead of building the installer's ClusterMetadata structure for the destroyer with individual fields from the CD's ClusterMetadata, we're unmarshaling it directly from the contents of that Secret.

(Except in some cases we have to scrub/replace credentials fields -- see HIVE-2804 / #2612)

For legacy clusters -- those created before this change -- we attempt to retrofit the new Secret based on the legacy fields. This is best effort and may not always work. If this results in a hanging deprovision due to a missing field, the workaround is to modify the contents of the Secret to add it in; then kill the deprovision pod and the next attempt should pick up the changes. (If the result is a "successful" deprovision with leaked resources, the only workaround is to clean up the infra manually. Sorry.)
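The "modify the Secret to add the missing field" half of that workaround boils down to merging one key into the stored JSON while leaving everything else intact. A hedged sketch — the key name below is just an example, not a field Hive is known to require:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// addMissingField sets a top-level key in a metadata.json payload if it is
// absent, leaving all other content untouched. The key used in main is an
// example only.
func addMissingField(raw []byte, key string, value interface{}) ([]byte, error) {
	doc := map[string]interface{}{}
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, err
	}
	if _, ok := doc[key]; !ok {
		doc[key] = value
	}
	return json.Marshal(doc)
}

func main() {
	raw := []byte(`{"infraID":"legacy-ab123"}`)
	out, err := addMissingField(raw, "clusterName", "legacy")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

After updating the Secret's contents along these lines, deleting the deprovision pod lets the next attempt pick up the change.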

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Aug 5, 2025
openshift-ci bot (Contributor) commented Aug 5, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all


openshift-ci bot (Contributor) commented Aug 5, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 2uasimojo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Aug 5, 2025
@2uasimojo (Member Author)

TODO: Deprovisioner side

@2uasimojo force-pushed the HIVE-2302/metadata.json-passthrough branch from 8707a79 to d09e43e on August 15, 2025 17:59
@openshift-ci-robot commented Aug 15, 2025

@2uasimojo: This pull request references HIVE-2302 which is a valid jira issue.

In response to this:

Well, mostly.

Previously any time installer added a field to metadata.json, we would need to evaluate and possibly add a bespoke field and code path for it to make sure it was supplied to the destroyer at deprovision time.

With this change, we're offloading metadata.json verbatim (except in some cases we have to scrub/replace credentials fields -- see HIVE-2804 / #2612) to a new Secret in the ClusterDeployment's namespace, referenced from a new field: ClusterDeployment.Spec.ClusterMetadata.MetadataJSONSecretRef.

For legacy clusters -- those created before this change -- we attempt to retrofit the new Secret based on the legacy fields. This is best effort and may not always work.

In the future (but not here!) instead of building the installer's ClusterMetadata structure for the destroyer with individual fields from the CD's ClusterMetadata, we'll unmarshal it directly from the contents of this Secret.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.


@2uasimojo force-pushed the HIVE-2302/metadata.json-passthrough branch from d09e43e to 13ad458 on August 19, 2025 21:20
@2uasimojo force-pushed the HIVE-2302/metadata.json-passthrough branch from 13ad458 to ace243e on September 9, 2025 20:36
@2uasimojo changed the title from "HIVE-2302: Pass metadata.json through opaquely" to "HIVE-2302, HIVE-2644: Pass metadata.json through opaquely" on Sep 9, 2025

@openshift-ci-robot commented Sep 9, 2025

@2uasimojo: This pull request references HIVE-2302 which is a valid jira issue.

This pull request references HIVE-2644 which is a valid jira issue.

In response to this:

Previously any time installer added a field to metadata.json, we would need to evaluate and possibly add a bespoke field and code path for it to make sure it was supplied to the destroyer at deprovision time.

With this change, we're offloading metadata.json verbatim (except in some cases we have to scrub/replace credentials fields -- see HIVE-2804 / #2612) to a new Secret in the ClusterDeployment's namespace, referenced from a new field: ClusterDeployment.Spec.ClusterMetadata.MetadataJSONSecretRef.

For legacy clusters -- those created before this change -- we attempt to retrofit the new Secret based on the legacy fields. This is best effort and may not always work.

This change then adds a new generic destroyer via the (existing) hiveutil deprovision command that consumes this metadata.json to deprovision the cluster.

This new behavior is the default, but we also include an escape hatch to run the platform-specific legacy destroyer by setting the following annotation on the ClusterDeployment:

`hive.openshift.io/legacy-deprovision: "true"`

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@2uasimojo force-pushed the HIVE-2302/metadata.json-passthrough branch from ace243e to cf8610a on September 9, 2025 21:50
@2uasimojo (Member Author)

/test e2e e2e-azure e2e-gcp e2e-vsphere e2e-openstack

🤞

@2uasimojo force-pushed the HIVE-2302/metadata.json-passthrough branch from cf8610a to 268d7cc on September 10, 2025 19:01
@2uasimojo (Member Author)

/test e2e-azure e2e-vsphere

@2uasimojo force-pushed the HIVE-2302/metadata.json-passthrough branch from 268d7cc to bce1629 on September 23, 2025 19:38
@2uasimojo (Member Author)

  • This is now rebased on HIVE-2908: Remove OVirt (RHV) #2746
  • VSphere (hopefully) remedied: metadata.json can take one of two different shapes depending on whether the cluster predates zonal support. We should now account for both.
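One way of accounting for two shapes is a struct that accepts either, plus a small normalizer. A hedged sketch with illustrative field names — not the installer's actual vSphere schema:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// vsphereMetadata tolerates two illustrative shapes: a flat pre-zonal
// "vcenter" string, or a post-zonal "vcenters" list. Field names are
// examples only.
type vsphereMetadata struct {
	VCenter  string `json:"vcenter,omitempty"`
	VCenters []struct {
		Server string `json:"server"`
	} `json:"vcenters,omitempty"`
}

// servers normalizes either shape to a list of vCenter servers.
func servers(md vsphereMetadata) []string {
	if len(md.VCenters) > 0 {
		out := make([]string, 0, len(md.VCenters))
		for _, v := range md.VCenters {
			out = append(out, v.Server)
		}
		return out
	}
	if md.VCenter != "" {
		return []string{md.VCenter}
	}
	return nil
}

func main() {
	for _, raw := range []string{
		`{"vcenter":"vc.old.example.com"}`,
		`{"vcenters":[{"server":"vc1.example.com"},{"server":"vc2.example.com"}]}`,
	} {
		var md vsphereMetadata
		if err := json.Unmarshal([]byte(raw), &md); err != nil {
			panic(err)
		}
		fmt.Println(servers(md))
	}
}
```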

/test e2e-vsphere

@2uasimojo force-pushed the HIVE-2302/metadata.json-passthrough branch from bce1629 to cb1b5d5 on September 23, 2025 21:31
@2uasimojo marked this pull request as ready for review on September 23, 2025 21:32
@openshift-ci bot removed the do-not-merge/work-in-progress label on Sep 23, 2025
@2uasimojo (Member Author)

/hold for moar testings

@openshift-ci bot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on Sep 23, 2025
@openshift-ci bot requested review from dlom and jstuever on September 23, 2025 21:32
codecov bot commented Sep 23, 2025

Codecov Report

❌ Patch coverage is 18.38235% with 333 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.06%. Comparing base (0834189) to head (8158d73).

Files with missing lines Patch % Lines
.../clusterdeployment/clusterdeployment_controller.go 25.17% 99 Missing and 5 partials ⚠️
pkg/install/generate.go 18.18% 95 Missing and 4 partials ⚠️
contrib/pkg/deprovision/deprovision.go 0.00% 75 Missing ⚠️
pkg/clusterresource/builder.go 0.00% 18 Missing and 3 partials ⚠️
contrib/pkg/createcluster/create.go 0.00% 11 Missing ⚠️
contrib/pkg/deprovision/azure.go 0.00% 3 Missing ⚠️
contrib/pkg/deprovision/gcp.go 0.00% 3 Missing ⚠️
contrib/pkg/deprovision/ibmcloud.go 0.00% 3 Missing ⚠️
contrib/pkg/deprovision/nutanix.go 0.00% 3 Missing ⚠️
contrib/pkg/deprovision/openstack.go 0.00% 3 Missing ⚠️
... and 3 more
Additional details and impacted files


@@            Coverage Diff             @@
##           master    #2729      +/-   ##
==========================================
- Coverage   50.33%   50.06%   -0.28%     
==========================================
  Files         284      284              
  Lines       33952    34235     +283     
==========================================
+ Hits        17090    17139      +49     
- Misses      15517    15740     +223     
- Partials     1345     1356      +11     
Files with missing lines Coverage Δ
contrib/pkg/deprovision/awstagdeprovision.go 0.00% <ø> (ø)
pkg/constants/constants.go 100.00% <ø> (ø)
...lusterdeprovision/clusterdeprovision_controller.go 53.40% <100.00%> (+0.17%) ⬆️
pkg/controller/utils/logtagger.go 100.00% <100.00%> (ø)
.../v1/clusterdeployment_validating_admission_hook.go 87.07% <100.00%> (+0.04%) ⬆️
...shift/hive/apis/hive/v1/clusterdeployment_types.go 0.00% <ø> (ø)
...hift/hive/apis/hive/v1/clusterdeprovision_types.go 0.00% <ø> (ø)
...controller/clusterclaim/clusterclaim_controller.go 62.28% <33.33%> (-0.22%) ⬇️
contrib/pkg/deprovision/azure.go 0.00% <0.00%> (ø)
contrib/pkg/deprovision/gcp.go 0.00% <0.00%> (ø)
... and 10 more

@2uasimojo (Member Author)

...so I think I'll just rebase to pick up that last one anyway.

...but I might as well wait until #2754 lands.

@2uasimojo force-pushed the HIVE-2302/metadata.json-passthrough branch from cb1b5d5 to f6f2464 on October 1, 2025 19:16
@2uasimojo (Member Author)

/assign @dlom

@2uasimojo (Member Author)

Legacy granny switch validated via #2759

An earlier commit ensures that ClusterDeployments have an associated
Secret containing the metadata.json emitted by the installer.

This change adds a new generic destroyer via the (existing) `hiveutil
deprovision` command that consumes this metadata.json to deprovision the
cluster.

This new behavior is the default, but we also include an escape hatch to
run the platform-specific legacy destroyer by setting the following
annotation on the ClusterDeployment:

`hive.openshift.io/legacy-deprovision: "true"`
@2uasimojo (Member Author)

/retest hive-on-pull-request

@2uasimojo (Member Author)

/retest hive-mce-26-on-pull-request

@openshift openshift deleted a comment from openshift-ci bot Oct 1, 2025
@openshift openshift deleted a comment from openshift-ci bot Oct 1, 2025
@2uasimojo force-pushed the HIVE-2302/metadata.json-passthrough branch from f6f2464 to 8158d73 on October 1, 2025 20:18
@dlom (Contributor) commented Oct 2, 2025

/retest hive-on-pull-request

@openshift openshift deleted a comment from openshift-ci bot Oct 2, 2025
@dlom (Contributor) commented Oct 2, 2025

/retest hive-mce-26-on-pull-request

openshift-ci bot (Contributor) commented Oct 2, 2025

@dlom: The /retest command does not accept any targets.
The following commands are available to trigger required jobs:

/test coverage
/test e2e
/test e2e-azure
/test e2e-gcp
/test e2e-openstack
/test e2e-pool
/test e2e-vsphere
/test images
/test periodic-images
/test security
/test unit
/test verify

Use /test all to run all jobs.

In response to this:

/retest hive-mce-26-on-pull-request

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@2uasimojo (Member Author)

/test e2e-openstack

Pre-actual-test IPI setup flake

@2uasimojo (Member Author)

/test e2e-openstack

Same again

@2uasimojo (Member Author)

/test e2e-openstack

Looks like provision timed out?

openshift-ci bot (Contributor) commented Oct 3, 2025

@2uasimojo: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

}

// TODO: Make a registry or interface for this
var ConfigureCreds func(client.Client)
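The TODO above hints at a registry keyed by platform. A minimal sketch of what that might look like — names and types are illustrative, including the stand-in Client interface, and are not Hive's actual API:

```go
package main

import "fmt"

// Client is a stand-in for client.Client so the sketch is self-contained.
type Client interface{}

// credsConfigurers sketches the registry the TODO alludes to: each platform
// package would register its credential setup under its platform name.
var credsConfigurers = map[string]func(Client){}

func registerCredsConfigurer(platform string, fn func(Client)) {
	credsConfigurers[platform] = fn
}

// configureCreds looks up and runs the configurer for a platform, erroring
// if none was registered.
func configureCreds(platform string, c Client) error {
	fn, ok := credsConfigurers[platform]
	if !ok {
		return fmt.Errorf("no credential configurer registered for %q", platform)
	}
	fn(c)
	return nil
}

func main() {
	registerCredsConfigurer("aws", func(Client) { fmt.Println("aws creds configured") })
	if err := configureCreds("aws", nil); err != nil {
		panic(err)
	}
}
```

Each platform's init (or an explicit setup call) would register its own configurer, replacing the package-level ConfigureCreds variable.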
Review comment (Contributor):

I think this is fine, it's readable and makes sense

Comment on lines +97 to +102
// If env vars are unset, the destroyer will fail organically.
ConfigureCreds = func(c client.Client) {
nutanixutil.ConfigureCreds(c)
metadata.Nutanix.Username = os.Getenv(constants.NutanixUsernameEnvVar)
metadata.Nutanix.Password = os.Getenv(constants.NutanixPasswordEnvVar)
}

Suggested change
// If env vars are unset, the destroyer will fail organically.
ConfigureCreds = func(c client.Client) {
nutanixutil.ConfigureCreds(c)
metadata.Nutanix.Username = os.Getenv(constants.NutanixUsernameEnvVar)
metadata.Nutanix.Password = os.Getenv(constants.NutanixPasswordEnvVar)
}
// If env vars are unset, the destroyer will fail organically.
ConfigureCreds = nutanixutil.ConfigureCreds
metadata.Nutanix.Username = os.Getenv(constants.NutanixUsernameEnvVar)
metadata.Nutanix.Password = os.Getenv(constants.NutanixPasswordEnvVar)

Review comment (Contributor):

I think you can just write it like this. The function is immediately invoked right below

// (They were there originally, but we scrubbed them for security.)
// If env vars are unset, the destroyer will fail organically.
ConfigureCreds = func(c client.Client) {
vsphereutil.ConfigureCreds(c)
Review comment (Contributor):

Same here. Nothing else inside the function references this c variable; I think it can be directly hoisted up into the switch case.

clusterMetadata *hivev1.ClusterMetadata,
metadataJSON []byte,
forceUpdate bool,
logger log.FieldLogger) error {

Suggested change
logger log.FieldLogger) error {
logger log.FieldLogger,
) error {

Cosmetic only suggestion
