fix: wait for rook ceph migrator pod ready #4571

Merged
merged 1 commit into from
May 31, 2023

@emosbaugh emosbaugh commented May 31, 2023

What this PR does / why we need it:

We run two migrations back to back. The existing code has a race condition: during the second migration it can end up waiting on the pod from the previous migration, which is terminating, rather than on the pod from the new deployment.

https://testgrid.kurl.sh/run/STAGING-daily-storage-migration-ee0da9b-2023-05-25T01:28:20Z?kurlLogsInstanceId=vvcexmochqzfsqvy&nodeId=vvcexmochqzfsqvy-initialprimary#L0

2023-05-25 12:29:56+00:00 ✔ Rook Flex volumes to CSI volumes migrated successfully
2023-05-25 12:29:56+00:00 storageclass.storage.k8s.io "default" deleted
2023-05-25 12:29:57+00:00 storageclass.storage.k8s.io/default created
2023-05-25 12:29:57+00:00 cephblockpool.ceph.rook.io/replicapool unchanged
2023-05-25 12:29:57+00:00 ⚙  Migrating Rook Flex volumes to CSI volumes
2023-05-25 12:29:57+00:00 + ./bin/kurl rook flexvolume-to-csi --source-sc rook-ceph-tmp --destination-sc default --node vvcexmochqzfsqvy-initialprimary --pv-migrator-bin-path /var/lib/kurl/bin/rook-pv-migrator --ceph-migrator-image rook/ceph:v1.7.11
2023-05-25 12:29:58+00:00 Running rook-ceph-migrator deployment ...
2023-05-25 12:29:58+00:00 # Warning: 'patchesStrategicMerge' is deprecated. Please use 'patches' instead. Run 'kustomize edit fix' to update your Kustomization automatically.
2023-05-25 12:29:58+00:00 Waiting for rook-ceph-migrator deployment to be ready ...
2023-05-25 12:30:08+00:00 Deleting flex migrator ...
2023-05-25 12:30:08+00:00 Deleted flex migrator
2023-05-25 12:30:08+00:00 Error: wait for flex migrator pod: wait for rook-ceph-migrator pod: pods "rook-ceph-migrator-8c89874c6-77hvf" not found

Which issue(s) this PR fixes:

Fixes NONE

Special notes for your reviewer:

Steps to reproduce

Does this PR introduce a user-facing change?

Fixes an issue that could cause Rook upgrades from version 1.0.4 to 1.7.x to fail with error rook-ceph-migrator pod not found.

Does this PR require documentation?

NONE

@emosbaugh emosbaugh added the type::bug (Something isn't working) and bug::normal labels May 31, 2023
@emosbaugh emosbaugh requested a review from a team as a code owner May 31, 2023 19:04
Comment on lines +255 to +259
for _, pod := range pods.Items {
    if k8sutil.IsPodReady(pod) {
        return &pod, nil
    }
}

I think you need to have something after this that handles "no pod was ready"

after all, that's why we initially had the call to WaitForPodReady


Seems like this might already be covered by WaitForDeploymentReady.

Member Author


I believe WaitForDeploymentReady above will wait for one replica (pod) to be ready.
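
For readers following the thread, here is a minimal sketch of the selection logic under discussion, with the "no pod was ready" case made explicit. It is not the code merged in this PR: `Pod`, `ErrNoReadyPod`, and `firstReadyPod` are hypothetical names, and `Pod` is a simplified stand-in for the real Kubernetes pod type. Skipping terminating pods is an extra assumption on top of the snippet above, added because a pod that is being deleted may still report ready for a short time.

```go
package main

import "errors"

// Pod is a hypothetical, simplified stand-in for a Kubernetes pod.
// It is not the type used by the actual change.
type Pod struct {
	Name        string
	Ready       bool // stand-in for the pod's Ready condition
	Terminating bool // stand-in for a pod that has a deletion timestamp set
}

// ErrNoReadyPod is a hypothetical sentinel error for the case the first
// review comment raises: the list contained no ready pod.
var ErrNoReadyPod = errors.New("no ready rook-ceph-migrator pod found")

// firstReadyPod returns the first pod that is ready and not terminating.
// It selects by state instead of waiting on a pod name captured earlier,
// which is what lets it ignore a leftover pod from a previous deployment.
func firstReadyPod(pods []Pod) (*Pod, error) {
	for i := range pods {
		// Index into the slice so the returned pointer refers to the
		// slice element rather than to a loop variable copy.
		if pods[i].Ready && !pods[i].Terminating {
			return &pods[i], nil
		}
	}
	return nil, ErrNoReadyPod
}
```

Given a list containing a terminating pod from the first migration followed by a ready pod from the second, `firstReadyPod` returns the second; given only the terminating pod, it returns `ErrNoReadyPod` instead of a nil pod with a nil error. If the deployment wait already guarantees one ready replica, as suggested above, the error branch is a safeguard rather than an expected path.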

@emosbaugh emosbaugh merged commit e39cb05 into main May 31, 2023
@emosbaugh emosbaugh deleted the emosbaugh/sc-74164/rook-ceph-migrator-pod-not-ready branch May 31, 2023 19:13