Check for PVC label selector conflicts against the secondary cluster #1855
Conversation
Force-pushed from d8f3da4 to 92a2325.
How do you plan to do the conflict check? Labels can be added and removed at any time. Let's say we run a conflict check at the time a DRPC is added and there is no conflict. Now the user adds a label, or adds a new PVC matching another DRPC's label. The next time we look at the PVCs we will find a new PVC and protect it. How does the conflict check help? The example you are using, an Exists selector for the label, is a very bad way to select PVCs. The only way to protect against such a bad label selector is to reject Exists selectors and allow only In selectors.
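For illustration only (not code from this PR; the key and value come from the PVC labels shown later in this thread), the difference between the two selector forms looks like this in Go:

    package example

    import (
        "fmt"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    func demoSelectors() {
        // An Exists requirement matches every PVC that carries the key at all,
        // regardless of its value, so it can easily pull in another workload's PVCs.
        broad := metav1.LabelSelector{
            MatchExpressions: []metav1.LabelSelectorRequirement{
                {Key: "appname", Operator: metav1.LabelSelectorOpExists},
            },
        }

        // An In requirement matches only the listed values, which is the narrower
        // form argued for above.
        narrow := metav1.LabelSelector{
            MatchExpressions: []metav1.LabelSelectorRequirement{
                {Key: "appname", Operator: metav1.LabelSelectorOpIn, Values: []string{"busybox"}},
            },
        }

        fmt.Println(metav1.FormatLabelSelector(&broad))  // prints: appname
        fmt.Println(metav1.FormatLabelSelector(&narrow)) // prints: appname in (busybox)
    }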
The conflict check is implemented within the VRG reconciliation process, which already watches PVCs for the addition of the VRG owner label. Therefore, updating labels on PVCs does not pose an issue.
I think you are explaining primary cluster behavior. The conflict check applies specifically to the secondary cluster: if any PVC in the secondary cluster matches the primary VRG’s label selector, the check will trigger an error. It does not consider whether the PVC is owned by another VRG or remains unowned.
The above example provides a straightforward, high-level overview to highlight the importance of performing a PVC conflict check in the secondary cluster. Since a PVC can have multiple labels, conflicts can arise in various ways.
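To make the intent concrete, here is a minimal sketch of a secondary-side check assuming a controller-runtime client; the function name and the "expected" set parameter are hypothetical stand-ins, not the PR's actual code:

    package example

    import (
        "context"

        corev1 "k8s.io/api/core/v1"
        "k8s.io/apimachinery/pkg/labels"
        "sigs.k8s.io/controller-runtime/pkg/client"
    )

    // findConflictingPVCs lists the PVCs matched by the VRG's label selector in the
    // given namespaces and returns every match the secondary VRG does not expect to
    // own. The "expected" set (keyed by "namespace/name") stands in for whatever the
    // real implementation uses to decide a PVC is legitimately protected.
    func findConflictingPVCs(ctx context.Context, c client.Client,
        selector labels.Selector, namespaces []string, expected map[string]bool,
    ) ([]string, error) {
        var conflicts []string

        for _, ns := range namespaces {
            pvcList := &corev1.PersistentVolumeClaimList{}
            if err := c.List(ctx, pvcList, client.InNamespace(ns),
                client.MatchingLabelsSelector{Selector: selector}); err != nil {
                return nil, err
            }

            for _, pvc := range pvcList.Items {
                key := pvc.Namespace + "/" + pvc.Name
                if !expected[key] {
                    conflicts = append(conflicts, key)
                }
            }
        }

        return conflicts, nil
    }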
The flow is not clear. Can you describe the state in both clusters and explain when the conflict check happens and what the expected result of the conflict check is? Does it only log an error, or does it prevent failover/relocate? The behavior must be specified and discussed before we work on a solution. An issue is the right place to describe the problem and specify the solution.
This PR is part of the independent VM feature, and I have already discussed the issue with my sub-team. I raised the PR to facilitate discussions with other sub-teams, as I felt continuing the discussion under an issue would delay implementation. I still need help deciding whether to simply log the issue, return an error in the secondary VRG reconciliation, or update a status condition.
Force-pushed from 92a2325 to 05da967.
There is no way to discuss this PR with other people if the problem and the suggested behavior are not specified in public.
I still don’t understand why presenting the problem along with a clear example isn't sufficient for the PR review. While we can continue refining the solution by creating a well-documented issue, drafting extensive documentation, and holding multiple meetings, this approach will only prolong the process. The purpose of the PR is to initiate discussions and collaboratively work toward a solution rather than delaying implementation with excessive formalities.
Force-pushed from f6057dd to f11a4e4.
Volsync Secondary:

    status:
      conditions:
      - lastTransitionTime: '2025-02-28T11:05:28Z'
        message: 'Volsync based PVC protection does not report DataReady condition as Secondary. PVC protection as secondary is complete, or no PVCs needed protection using VolumeReplication scheme'
        observedGeneration: 8
        reason: Unused
        status: 'True'
        type: DataReady
      - lastTransitionTime: '2025-02-28T11:05:28Z'
        message: A PVC that is not a replication destination should not match the label selector.
        observedGeneration: 8
        reason: PVCConflict
        status: 'False'
        type: DataProtected
      - lastTransitionTime: '2025-02-28T11:05:28Z'
        message: A PVC that is not a replication destination should not match the label selector.
        observedGeneration: 8
        reason: PVCConflict
        status: 'False'
        type: ClusterDataProtected
      kubeObjectProtection: {}
      lastUpdateTime: '2025-02-28T11:05:28Z'
      observedGeneration: 8
      state: Secondary

VolRep Secondary:

    status:
      conditions:
      - lastTransitionTime: '2025-02-28T08:02:43Z'
        message: PVCs in the VolumeReplicationGroup group are replicating
        observedGeneration: 3
        reason: Replicating
        status: 'True'
        type: DataReady
      - lastTransitionTime: '2025-02-28T10:18:20Z'
        message: No PVC on the secondary should match the label selector
        observedGeneration: 3
        reason: PVCConflict
        status: 'False'
        type: DataProtected
      - lastTransitionTime: '2025-02-28T06:02:47Z'
        message: Restored 0 volsync PVs/PVCs and 2 volrep PVs/PVCs
        observedGeneration: 2
        reason: Restored
        status: 'True'
        type: ClusterDataReady
      - lastTransitionTime: '2025-02-28T08:00:51Z'
        message: No PVCs are protected using Volsync scheme. Cluster data is not protected as Secondary
        observedGeneration: 3
        reason: Unused
        status: 'True'
        type: ClusterDataProtected
      - lastTransitionTime: '2025-02-28T06:02:47Z'
        message: Kube objects restored
        observedGeneration: 2
        reason: KubeObjectsRestored
        status: 'True'
        type: KubeObjectsReady
      kubeObjectProtection: {}
      lastUpdateTime: '2025-02-28T10:18:20Z'
      observedGeneration: 3
      protectedPVCs:
      - resources:
          requests:
            storage: 1Gi
        replicationID:
          id: 0c7a26e15564eb07d56cda451e0f42e5
          modes:
          - Failover
        storageClassName: ocs-storagecluster-ceph-rbd
        csiProvisioner: openshift-storage.rbd.csi.ceph.com
        name: busybox-pvc
        accessModes:
        - ReadWriteOnce
        storageID:
          id: 92472e3f8bd240a3e24e6faed855071d
        conditions:
        - lastTransitionTime: '2025-02-28T08:02:43Z'
          message: VolumeReplication resource for the pvc is syncing as Secondary
          observedGeneration: 3
          reason: Replicating
          status: 'True'
          type: DataReady
        - lastTransitionTime: '2025-02-28T06:02:59Z'
          message: PV cluster data already protected for PVC busybox-pvc
          observedGeneration: 2
          reason: Uploaded
          status: 'True'
          type: ClusterDataProtected
        - lastTransitionTime: '2025-02-28T08:00:57Z'
          message: VolumeReplication resource for the pvc is syncing as Secondary
          observedGeneration: 3
          reason: Replicating
          status: 'False'
          type: DataProtected
        namespace: busybox-tim1
        volumeMode: Filesystem
        labels:
          app: sapptim1
          app.kubernetes.io/part-of: sapptim1
          appname: busybox
          apps.open-cluster-management.io/reconcile-rate: medium
      state: Secondary
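The PVCConflict entries above are ordinary Kubernetes conditions. For illustration only (Ramen has its own condition helpers, so the function below is a hypothetical sketch using the generic apimachinery utility), such a condition could be recorded like this:

    package example

    import (
        "k8s.io/apimachinery/pkg/api/meta"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // setPVCConflictCondition marks DataProtected as False with reason PVCConflict,
    // mirroring the status snippets shown above.
    func setPVCConflictCondition(conditions *[]metav1.Condition, generation int64, msg string) {
        meta.SetStatusCondition(conditions, metav1.Condition{
            Type:               "DataProtected",
            Status:             metav1.ConditionFalse,
            Reason:             "PVCConflict",
            ObservedGeneration: generation,
            Message:            msg,
        })
    }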
    for _, pvc := range v.volSyncPVCs {
        matchFound := false

        for _, rdSpec := range v.instance.Spec.VolSync.RDSpec {
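A sketch of how the truncated loop above might conclude, assuming each RDSpec entry carries the protected PVC's name and namespace and reusing the receiver fields from the excerpt; this is not the PR's exact code, and the early return only illustrates one way to handle the RDSpec timing concern raised in the next comment:

    // checkVolSyncPVCConflicts reports selected volsync PVCs that have no matching
    // ReplicationDestination spec.
    func (v *VRGInstance) checkVolSyncPVCConflicts() error {
        if len(v.instance.Spec.VolSync.RDSpec) == 0 {
            // RDSpec not filled in yet; skip the check to avoid spurious errors.
            return nil
        }

        for _, pvc := range v.volSyncPVCs {
            matchFound := false

            for _, rdSpec := range v.instance.Spec.VolSync.RDSpec {
                if rdSpec.ProtectedPVC.Name == pvc.Name &&
                    rdSpec.ProtectedPVC.Namespace == pvc.Namespace {
                    matchFound = true

                    break
                }
            }

            if !matchFound {
                return fmt.Errorf("pvc %s/%s matches the label selector but is not a replication destination",
                    pvc.Namespace, pvc.Name)
            }
        }

        return nil
    }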
Need @BenamarMk to check if this is going to be a correct check all the time. There is a certain duration when the VRG is secondary and doesn't have the RDSpec filled in. It might lead to spurious errors.
    @@ -2730,3 +2719,39 @@ func PruneAnnotations(annotations map[string]string) map[string]string {

        return result
    }
There is no conflict checking function for volrep?
Signed-off-by: Gowtham Shanmugasundaram <gshanmug@redhat.com>
Force-pushed from f11a4e4 to 72e109b.
Applying PVC Label Conflict Check on the Secondary Cluster
The PR introduces a mechanism in vrg_volrep.go and vrg_volsync.go that applies the PVCLabelSelector from the secondary VRG and checks whether any PVCs are selected.
Expected behavior: the secondary VRG surfaces the conflict through its status conditions, as in the DataProtected/ClusterDataProtected conditions with reason PVCConflict shown earlier in this thread.
For more: #1861
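Regarding the earlier question about a volrep-side check, a hypothetical counterpart might compare the PVCs selected on the Secondary against the VRG's status.protectedPVCs and flag anything else that matches. The receiver, field names, and ownership test below are assumptions for illustration, not the PR's implementation:

    // checkVolRepPVCConflicts reports an error if a PVC selected on the Secondary is
    // not already tracked as a protected PVC, echoing the "No PVC on the secondary
    // should match the label selector" condition message shown above.
    func (v *VRGInstance) checkVolRepPVCConflicts() error {
        for _, pvc := range v.volRepPVCs {
            protected := false

            for i := range v.instance.Status.ProtectedPVCs {
                p := v.instance.Status.ProtectedPVCs[i]
                if p.Name == pvc.Name && p.Namespace == pvc.Namespace {
                    protected = true

                    break
                }
            }

            if !protected {
                return fmt.Errorf("no PVC on the secondary should match the label selector: %s/%s",
                    pvc.Namespace, pvc.Name)
            }
        }

        return nil
    }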