@christian-deleon
Contributor

Fix race condition in NBRoutingPeer status updates

Description

Problem

The NBRoutingPeer controller was experiencing intermittent reconciliation failures with the error:

Operation cannot be fulfilled on nbroutingpeers.netbird.io "router": the object has been modified; please apply your changes to the latest version and try again

Root Cause

The controller had a race condition in its status update logic:

  1. The NBRoutingPeer object was fetched at the start of reconciliation
  2. Status was modified throughout the reconciliation process
  3. In the deferred function, the status update used the originally fetched object
  4. If the object was modified between the initial fetch and the status update (by another reconcile loop, webhook, or external change), the update would fail due to a stale resourceVersion

This is a classic optimistic concurrency conflict in Kubernetes controllers.

Solution

Re-fetch the object immediately before updating the status to ensure we have the latest resourceVersion. The fix:

  1. Re-fetch the NBRoutingPeer object in the deferred function, immediately before the status update
  2. Copy the computed status to the freshly fetched object
  3. Update the status with the latest resource version

This means the status update is sent with the most recent resourceVersion available, which narrows the window for a conflict to the short interval between the re-fetch and the update.
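
The three steps above can be sketched in the same self-contained style. This is an illustration under stated assumptions rather than the code changed in this PR: `Peer`, `Store`, `ErrConflict`, and this `Reconcile` signature are hypothetical, and the `interfere` callback exists only to simulate a concurrent writer.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrConflict stands in for the API server's "object has been modified" conflict.
var ErrConflict = errors.New("the object has been modified")

// Peer is a hypothetical, simplified stand-in for NBRoutingPeer.
type Peer struct {
	ResourceVersion int
	Status          string
}

// Store imitates the API server's optimistic concurrency check.
type Store struct{ current Peer }

// Get returns a copy of the stored object.
func (s *Store) Get() Peer { return s.current }

// Update succeeds only if the caller's ResourceVersion matches the stored one.
func (s *Store) Update(p Peer) error {
	if p.ResourceVersion != s.current.ResourceVersion {
		return ErrConflict
	}
	p.ResourceVersion++
	s.current = p
	return nil
}

// Reconcile computes a status on its initial copy, then re-fetches in the
// deferred function so the update carries the latest ResourceVersion.
func Reconcile(s *Store, interfere func()) (err error) {
	peer := s.Get() // initial fetch, may go stale during reconciliation

	defer func() {
		latest := s.Get()           // 1. re-fetch right before the update
		latest.Status = peer.Status // 2. copy the computed status across
		if uerr := s.Update(latest); uerr != nil && err == nil {
			err = uerr // 3. surface an update failure to the caller
		}
	}()

	peer.Status = "Ready" // status computed during reconciliation
	interfere()           // simulated concurrent modification
	return nil
}

func main() {
	s := &Store{}
	err := Reconcile(s, func() {
		other := s.Get()
		other.Status = "Pending"
		_ = s.Update(other)
	})
	fmt.Println(err, s.Get().Status, s.Get().ResourceVersion) // prints "<nil> Ready 2"
}
```

A write that lands between the re-fetch and the update can still be rejected, so this pattern is often combined with retrying on conflict, for example via `retry.RetryOnConflict` from `k8s.io/client-go/util/retry`.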

Changes

  • Modified nbroutingpeer_controller.go: Updated the defer function in Reconcile() to re-fetch the object before status updates
