Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbd: Handle maximum snapshots on a single rbd image #1195

Merged
merged 10 commits into from
Jul 6, 2020

Conversation

Madhu-1
Copy link
Collaborator

@Madhu-1 Madhu-1 commented Jul 2, 2020

We can have only a maximum of 510 snapshots on rbd images, due to the limitations mentioned in #1098. In CephCSI we are setting a limit of maximum 450 snapshots on an rbd image (keeping a buffer to avoid issues) During the SnapshotCreate RPC call we are creating a temporary snapshot and we create a clone from the snapshot and we delete the snapshot, Even if we delete the snapshot it will be in the trash as it's still linked to the cloned images, when we list all the snapshots snap ls --all on an rbd image, it will show all the snapshots which are in the trash also. to remove the snapshots from the trash we need to remove the linking from cloned and the snapshots.

As the cloned image is having the deep-flatten feature and once we flatten the cloned images which are created from the parent image, the temporary snapshots which are in the trash will be garbage collected, This will allow as to create more number of snapshots on a single rbd image.

closes #1098
updates: #1188

@Madhu-1 Madhu-1 marked this pull request as draft July 2, 2020 06:14
@Madhu-1 Madhu-1 added the component/rbd Issues related to RBD label Jul 2, 2020
Copy link
Contributor

@Yuggupta27 Yuggupta27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

goerr complains about error to be wrapped static in place of dynamic. This change will avoid that warning.

@Madhu-1 Madhu-1 force-pushed the snap-limit branch 3 times, most recently from 968b0eb to 9ca19c1 Compare July 2, 2020 07:35
@Madhu-1 Madhu-1 force-pushed the snap-limit branch 3 times, most recently from 1c8076d to 65016c7 Compare July 2, 2020 09:32
@Madhu-1 Madhu-1 marked this pull request as ready for review July 2, 2020 09:33
humblec
humblec previously approved these changes Jul 2, 2020
@mergify mergify bot dismissed humblec’s stale review July 2, 2020 12:37

Pull request has been modified.

humblec
humblec previously approved these changes Jul 2, 2020
@mergify mergify bot dismissed humblec’s stale review July 2, 2020 13:23

Pull request has been modified.

@Madhu-1 Madhu-1 force-pushed the snap-limit branch 3 times, most recently from b6956c5 to 9eae500 Compare July 3, 2020 06:45
added listsnapshots function for an
rbd image to list all the snapshots
created from an rbd images, This will
list the snapshots which are in trash also.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
nixpanic
nixpanic previously approved these changes Jul 3, 2020
@nixpanic
Copy link
Member

nixpanic commented Jul 3, 2020

/retest ci/centos/containerized-tests

humblec
humblec previously approved these changes Jul 3, 2020
Madhu-1 added 3 commits July 3, 2020 15:30
Added maxsnapshotsonimage flag to flatten
the older rbd images on the chain to avoid
issue in krbd.The limit is in krbd since it
only allocate 1 4KiB page to handle all the
snapshot ids for an image.

The max limit is 510 as per
https://github.com/torvalds/linux/blob/
aaa2faab4ed8e5fe0111e04d6e168c028fe2987f/drivers/block/rbd.c#L98
in cephcsi we arekeeping the default to 450 to reserve 10%
to avoid issues.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
flatten cloned images to remove the link
with the snapshot as the parent snapshot can
be removed from the trash.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
as we need to test the maxsnapshotsonimage we
need to set the limit to minimal value which we
can test in CI as the default limit is 450,which
cannot be tested in CI.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
@mergify mergify bot dismissed stale reviews from humblec and nixpanic July 3, 2020 10:06

Pull request has been modified.

@Madhu-1
Copy link
Collaborator Author

Madhu-1 commented Jul 3, 2020

@humblec @nixpanic E2E tests are getting timeout as the default timeout for a Travis CI instance is 50Min and our E2E tests are taking more time. we need to split out cephfs and rbd into separate tests or move to centos CI.
for now, decreased the snapshot count to 5 to allow CI to pass. PTAL

Madhu-1 added 5 commits July 3, 2020 15:38
we need to take lock on parent rbd image when
we are creating a snapshot from it, if the user
tries to delete/resize the rbd image when we are
taking snapshots,we may face issues. if the volume
lock is present on the rbd image, the user cannot
resize the rbd image nor delete the rbd image.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Moved e2e test timeout to build.env

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
moved deploy timeout from travis
scripts to build.env

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
as we are creating more resources
the e2e timeout is not enough we need
to increase the e2e timeout to higher
value for now.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
Reduced the number of pods created
in ROX E2E to save some time in E2E
and changed the waiting time from 2 to 1
min.

Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>
@nixpanic
Copy link
Member

nixpanic commented Jul 3, 2020

split out cephfs and rbd into separate tests

That would be my preference. We can then run tests in parallel and hopefully speed things up.

I am still working to get the tests running in CentOS CI. Not sure when it is stable enough to replace some of the Travis CI jobs.

@Madhu-1 Madhu-1 requested a review from humblec July 6, 2020 05:14
@Madhu-1
Copy link
Collaborator Author

Madhu-1 commented Jul 6, 2020

/retest ci/centos/containerized-tests

@mergify mergify bot merged commit 6838d30 into ceph:master Jul 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/rbd Issues related to RBD
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Limits for RBD create PVC from snapshot
5 participants