enable two node cluster deployment #3671
Conversation
/assign @travisn @malayparida2000 /cherry-pick release-4.21 |
@parth-gr: once the present PR merges, I will cherry-pick it on top of release-4.21.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
The floating mon needs to run alongside the other mon deployments. Also made the mgr count 1, and made the default max replica count 2.

Signed-off-by: parth-gr <partharora1010@gmail.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: parth-gr. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files.
```go
// IsTwoNodeDeployment returns true if cluster has only two nodes.
func IsTwoNodeDeployment(nodeCount int) bool {
```
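The diff above is truncated before the function body; from the doc comment it presumably reduces to a simple node-count check. A self-contained sketch (the body shown here is an assumption, since the diff is cut off):

```go
package main

import "fmt"

// IsTwoNodeDeployment returns true if the cluster has only two nodes.
// NOTE: the body is reconstructed from the doc comment; the actual
// PR diff is truncated above.
func IsTwoNodeDeployment(nodeCount int) bool {
	return nodeCount == 2
}

func main() {
	fmt.Println(IsTwoNodeDeployment(2)) // true
	fmt.Println(IsTwoNodeDeployment(3)) // false
}
```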
I don't think we should rely on this to confirm whether it's a two-node deployment. This can be dangerous, as someone may try to deploy ODF on two nodes even when it's not a TNF cluster.
A better way IMO would be to either use an env var for the two-node deployment, similar to the single-node deployment, or detect a two-node deployment by getting the ClusterVersion CR and checking its topology field. We already get the ClusterVersion CR in the operator code, so there won't be a performance cost either.
"checking its topology field."
What info will it provide?
here is the output
rider:Downloads$ oc get clusterversion -o yaml
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
creationTimestamp: "2026-01-22T16:21:56Z"
generation: 2
name: version
resourceVersion: "1448558"
uid: c1718d72-7559-4d39-9ad2-25f297abb7a8
spec:
channel: stable-4.20
clusterID: 59e5579a-f880-4f1a-ae6b-b97b51ff0063
status:
availableUpdates:
- channels:
- candidate-4.20
- candidate-4.21
- candidate-4.22
- eus-4.20
- fast-4.20
- stable-4.20
image: quay.io/openshift-release-dev/ocp-release@sha256:2d228e6d0b5a5ef2d7eb40bc171ad44f06b990d7adb678914e5d9d047e72568d
url: https://access.redhat.com/errata/RHBA-2026:370
version: 4.20.10
- channels:
- candidate-4.20
- candidate-4.21
- candidate-4.22
- eus-4.20
- fast-4.20
- stable-4.20
image: quay.io/openshift-release-dev/ocp-release@sha256:91606a5f04331ed3293f71034d4f480e38645560534805fe5a821e6b64a3f203
url: https://access.redhat.com/errata/RHBA-2025:23103
version: 4.20.8
- channels:
- candidate-4.20
- candidate-4.21
- candidate-4.22
- eus-4.20
- fast-4.20
- stable-4.20
image: quay.io/openshift-release-dev/ocp-release@sha256:24da924c84a1dfa28525f85525356cf1ac4fbe23faec7c66d1890e0b3bcba7a0
url: https://access.redhat.com/errata/RHSA-2025:19890
version: 4.20.3
- channels:
- candidate-4.20
- candidate-4.21
- candidate-4.22
- eus-4.20
- fast-4.20
- stable-4.20
image: quay.io/openshift-release-dev/ocp-release@sha256:0e232879e27fb821eeb1d0e34f9bd8f85e28533836e59cc7fee96fcc9f3851cd
url: https://access.redhat.com/errata/RHSA-2025:19296
version: 4.20.2
- channels:
- candidate-4.20
- candidate-4.21
- candidate-4.22
- eus-4.20
- fast-4.20
- stable-4.20
image: quay.io/openshift-release-dev/ocp-release@sha256:cbde13fe6ed4db88796be201fbdb2bbb63df5763ae038a9eb20bc793d5740416
url: https://access.redhat.com/errata/RHSA-2025:19003
version: 4.20.1
capabilities:
enabledCapabilities:
- Build
- CSISnapshot
- CloudControllerManager
- CloudCredential
- Console
- DeploymentConfig
- ImageRegistry
- Ingress
- Insights
- MachineAPI
- NodeTuning
- OperatorLifecycleManager
- OperatorLifecycleManagerV1
- Storage
- baremetal
- marketplace
- openshift-samples
knownCapabilities:
- Build
- CSISnapshot
- CloudControllerManager
- CloudCredential
- Console
- DeploymentConfig
- ImageRegistry
- Ingress
- Insights
- MachineAPI
- NodeTuning
- OperatorLifecycleManager
- OperatorLifecycleManagerV1
- Storage
- baremetal
- marketplace
- openshift-samples
conditionalUpdates:
- conditions:
- lastTransitionTime: "2026-01-22T16:22:21Z"
message: Some runc 1.2 releases fail to launch containers in some Pods where
shareProcessNamespace is explicitly set true. https://issues.redhat.com/browse/RUN-3748
reason: RuncShareProcessNamespace
status: "False"
type: Recommended
release:
channels:
- candidate-4.20
- candidate-4.21
- candidate-4.22
- eus-4.20
- fast-4.20
- stable-4.20
image: quay.io/openshift-release-dev/ocp-release@sha256:a29bcbc9f286d68b394ffa0288c5de7e487c90077c06cbaf7a4cadeb0398ce28
url: https://access.redhat.com/errata/RHSA-2025:22257
version: 4.20.6
risks:
- matchingRules:
- type: Always
message: Some runc 1.2 releases fail to launch containers in some Pods where
shareProcessNamespace is explicitly set true.
name: RuncShareProcessNamespace
url: https://issues.redhat.com/browse/RUN-3748
- conditions:
- lastTransitionTime: "2026-01-22T16:22:21Z"
message: Some runc 1.2 releases fail to launch containers in some Pods where
shareProcessNamespace is explicitly set true. https://issues.redhat.com/browse/RUN-3748
reason: RuncShareProcessNamespace
status: "False"
type: Recommended
release:
channels:
- candidate-4.20
- candidate-4.21
- candidate-4.22
- eus-4.20
- fast-4.20
- stable-4.20
image: quay.io/openshift-release-dev/ocp-release@sha256:c1568bf00f149d16b4cbe5cd8aedf3bef110c1460a91f81688aca8e338806a2c
url: https://access.redhat.com/errata/RHBA-2025:21811
version: 4.20.5
risks:
- matchingRules:
- type: Always
message: Some runc 1.2 releases fail to launch containers in some Pods where
shareProcessNamespace is explicitly set true.
name: RuncShareProcessNamespace
url: https://issues.redhat.com/browse/RUN-3748
- conditions:
- lastTransitionTime: "2026-01-22T16:22:21Z"
message: Some runc 1.2 releases fail to launch containers in some Pods where
shareProcessNamespace is explicitly set true. https://issues.redhat.com/browse/RUN-3748
reason: RuncShareProcessNamespace
status: "False"
type: Recommended
release:
channels:
- candidate-4.20
- candidate-4.21
- candidate-4.22
- eus-4.20
- fast-4.20
- stable-4.20
image: quay.io/openshift-release-dev/ocp-release@sha256:5b87a665045cdfe0a1b271024be936a0c46de17b25a112d6a136c5af89d861c4
url: https://access.redhat.com/errata/RHBA-2025:21228
version: 4.20.4
risks:
- matchingRules:
- type: Always
message: Some runc 1.2 releases fail to launch containers in some Pods where
shareProcessNamespace is explicitly set true.
name: RuncShareProcessNamespace
url: https://issues.redhat.com/browse/RUN-3748
conditions:
- lastTransitionTime: "2026-01-22T16:22:22Z"
status: "True"
type: RetrievedUpdates
- lastTransitionTime: "2026-01-22T16:22:22Z"
message: |-
Multiple cluster operators should not be upgraded between minor versions:
* Cluster operator config-operator should not be upgraded between minor versions: FeatureGates_RestrictedFeatureGates_TechPreviewNoUpgrade: FeatureGatesUpgradeable: "TechPreviewNoUpgrade" does not allow updates
* Cluster operator etcd should not be upgraded between minor versions: UnsupportedConfigOverrides_UnsupportedConfigOverridesSet: UnsupportedConfigOverridesUpgradeable: setting: [useExternalEtcdSupport useUnsupportedUnsafeEtcdContainerRemoval]
reason: ClusterOperatorsNotUpgradeable
status: "False"
type: Upgradeable
- lastTransitionTime: "2026-01-22T16:22:22Z"
message: Capabilities match configured spec
reason: AsExpected
status: "False"
type: ImplicitlyEnabledCapabilities
- lastTransitionTime: "2026-01-22T16:22:22Z"
message: Payload loaded version="4.20.0" image="quay.io/openshift-release-dev/ocp-release@sha256:d1dc76522d1e235b97675b28e977cb8c452f47d39c0eb519cde02114925f91d2"
architecture="amd64"
reason: PayloadLoaded
status: "True"
type: ReleaseAccepted
- lastTransitionTime: "2026-01-22T16:48:48Z"
message: Done applying 4.20.0
status: "True"
type: Available
- lastTransitionTime: "2026-01-27T05:46:18Z"
status: "False"
type: Failing
- lastTransitionTime: "2026-01-22T16:48:48Z"
message: Cluster version is 4.20.0
status: "False"
type: Progressing
desired:
channels:
- candidate-4.20
- candidate-4.21
- candidate-4.22
- eus-4.20
- fast-4.20
- stable-4.20
image: quay.io/openshift-release-dev/ocp-release@sha256:d1dc76522d1e235b97675b28e977cb8c452f47d39c0eb519cde02114925f91d2
url: https://access.redhat.com/errata/RHSA-2025:9562
version: 4.20.0
history:
- completionTime: "2026-01-22T16:48:48Z"
image: quay.io/openshift-release-dev/ocp-release@sha256:d1dc76522d1e235b97675b28e977cb8c452f47d39c0eb519cde02114925f91d2
startedTime: "2026-01-22T16:22:22Z"
state: Completed
verified: false
version: 4.20.0
observedGeneration: 2
versionHash: PF7438UmreY=
kind: List
metadata:
resourceVersion: ""
Hm, seems like this does not have any info regarding that. Can you please check the Infrastructure CR?
nothing like that
rider:Downloads$ oc get infrastructure -o yaml
```yaml
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
  kind: Infrastructure
  metadata:
    creationTimestamp: "2026-01-22T16:21:50Z"
    generation: 1
    name: cluster
    resourceVersion: "543"
    uid: 8bdcd1a4-b9ab-4e5f-87aa-8d2608613de8
  spec:
    cloudConfig:
      name: ""
    platformSpec:
      type: None
  status:
    apiServerInternalURI: https://api-int.2nodehp-test.hubcluster-1.lab.eng.cert.redhat.com:6443
    apiServerURL: https://api.2nodehp-test.hubcluster-1.lab.eng.cert.redhat.com:6443
    controlPlaneTopology: DualReplica
    cpuPartitioning: None
    etcdDiscoveryDomain: ""
    infrastructureName: 2nodehp-test-spxcz
    infrastructureTopology: HighlyAvailable
    platform: None
    platformStatus:
      type: None
kind: List
metadata:
  resourceVersion: ""
```
Maybe you are looking for `controlPlaneTopology: DualReplica`?
Yes, this `controlPlaneTopology: DualReplica` can give us a definite sign of this being a TNF cluster.
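A minimal sketch of such a topology-based check (the helper name is hypothetical; in the real operator the string would come from the Infrastructure CR's `status.controlPlaneTopology` field, as in the `oc get infrastructure` output above):

```go
package main

import "fmt"

// dualReplicaTopology matches the controlPlaneTopology value observed
// in the `oc get infrastructure` output above for a TNF cluster.
const dualReplicaTopology = "DualReplica"

// isTNFCluster (hypothetical name) reports whether the given
// control-plane topology indicates a two-node (TNF) cluster.
func isTNFCluster(controlPlaneTopology string) bool {
	return controlPlaneTopology == dualReplicaTopology
}

func main() {
	fmt.Println(isTNFCluster("DualReplica"))     // true
	fmt.Println(isTNFCluster("HighlyAvailable")) // false
}
```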
```go
// IsTwoNodeDeployment returns true if cluster has only two nodes.
func IsTwoNodeDeployment(nodeCount int) bool {
```
The name of the func could also be more indicative of the actual intention, e.g. `IsOCPTNFDeployment` or something like that, instead of just `IsTwoNodeDeployment`.
```go
// cluster-wide encryption is enabled or any of the device set is encrypted
// ie, sc.Spec.Encryption.ClusterWide/sc.Spec.Encryption.Enable is True or any device is encrypted
// and KMS ConfigMap is available
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated change
```go
    return 1
}
if statusutil.IsTwoNodeDeployment(nodeCount) {
    return 2
```
I don't think it's necessary. AFAIK this reflects the minimum Replica in a deviceSet. In case of TNF these are bare-metal clusters, so the DeviceSet may look like Count: 2, Replica: 1. Please cross-check.
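For illustration, the reviewer's expectation is that on a TNF bare-metal cluster the device set would carry the two-way replication in its count rather than its replica field, roughly like this (hypothetical fragment; the values are this comment's example, not the PR's actual defaults):

```yaml
storageDeviceSets:
- name: default
  count: 2      # one OSD per node on the two-node cluster
  replica: 1    # minimum replica per device set, per the comment above
```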
Part of https://issues.redhat.com/browse/RHSTOR-8071

Solution: run the floating mon alongside the other mon deployments; also set the mgr count to 1 and the default max replica count to 2.