Support snapshot merge / rollback in Stratis #597

Closed
bmr-cymru opened this issue Mar 27, 2023 · 3 comments

@bmr-cymru
Member

The LVM2 project provides the capability to "merge" a snapshot volume back into its origin (the device from which the snapshot was originally created). Following the merge operation, the content of the origin reflects the state at the point in time the snapshot was created. This allows the file system state to be rolled back to an earlier point in time, for example to recover from a failed update or other change.

With device-mapper thin snapshots, the merge operation is achieved by changing the mapping from device names to thin identifiers tracked by the pool. For example, if fs1 is a file system from which a snapshot, fs1-snap, was previously taken:

BEFORE
+----------------+  +----------------+
|      fs1       |  |   fs1-snap     |
|   thin_id: 1   |  |  thin_id: 2    |
+----------------+  +----------------+
          |                |
+------------------------------------+
|           DM thin pool             |
+------------------------------------+

AFTER
                     +---------------+
                     |      fs1      |
(thin_id 1 deleted)  |  thin_id: 2   | 
                     +---------------+  
                              |
+------------------------------------+
|           DM thin pool             |
+------------------------------------+
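
The BEFORE/AFTER diagram can be modelled as a change to a name-to-thin-ID table. The following is a minimal sketch, not Stratis code: the `ThinPool` class and its `merge` method are hypothetical names invented for illustration, and a real implementation would also have to update the device-mapper tables and delete the discarded thin device.

```python
from dataclasses import dataclass, field


@dataclass
class ThinPool:
    """Toy model of the name -> thin_id table (hypothetical, not Stratis's real structure)."""

    names: dict[str, int] = field(default_factory=dict)

    def merge(self, origin: str, snapshot: str) -> int:
        """Point `origin` at the snapshot's thin device and drop the snapshot name.

        Returns the thin_id that the origin used before the merge, which the
        caller would then delete from the pool.
        """
        if origin == snapshot:
            raise ValueError("origin and snapshot must differ")
        if origin not in self.names or snapshot not in self.names:
            raise KeyError("both origin and snapshot must exist in the pool")
        old_id = self.names[origin]
        # The origin name now refers to the thin device that backed the snapshot.
        self.names[origin] = self.names.pop(snapshot)
        return old_id


pool = ThinPool({"fs1": 1, "fs1-snap": 2})
deleted = pool.merge("fs1", "fs1-snap")
print(pool.names, "deleted thin_id:", deleted)
```

After the call, only the name fs1 remains and it maps to thin ID 2, matching the AFTER diagram.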

Merging a stopped file system can proceed immediately, since the volume is not in use. For active volumes, a note must be made in the device metadata indicating the intent to merge, which is then applied the next time the file system is started (for example, following a reboot).
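
The immediate and deferred cases could be sketched as follows. This is an illustration under stated assumptions, not a proposed design: the `Filesystem` and `Pool` classes, the `merge_from` metadata field, and the `request_merge`, `start` and `stop` methods are all hypothetical, and only the origin's active state is considered here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Filesystem:
    name: str
    thin_id: int
    active: bool = False
    # Hypothetical "intent to merge" note stored in the metadata.
    merge_from: Optional[str] = None


@dataclass
class Pool:
    filesystems: dict[str, Filesystem] = field(default_factory=dict)

    def _apply_merge(self, origin: Filesystem, snap_name: str) -> None:
        # The origin takes over the snapshot's thin device; the snapshot name goes away.
        snap = self.filesystems.pop(snap_name)
        origin.thin_id = snap.thin_id
        origin.merge_from = None

    def request_merge(self, origin_name: str, snap_name: str) -> str:
        origin = self.filesystems[origin_name]
        if snap_name not in self.filesystems:
            raise KeyError(snap_name)
        if origin.active:
            # Volume is in use: record the intent and apply it on the next start.
            origin.merge_from = snap_name
            return "scheduled"
        self._apply_merge(origin, snap_name)
        return "merged"

    def stop(self, name: str) -> None:
        self.filesystems[name].active = False

    def start(self, name: str) -> None:
        fs = self.filesystems[name]
        if fs.merge_from is not None:
            self._apply_merge(fs, fs.merge_from)
        fs.active = True


pool = Pool({
    "fs1": Filesystem("fs1", 1, active=True),
    "fs1-snap": Filesystem("fs1-snap", 2),
})
status = pool.request_merge("fs1", "fs1-snap")  # origin is active, so the merge is deferred
pool.stop("fs1")
pool.start("fs1")  # the pending merge is applied here
```

In this sketch the pending merge survives a stop/start cycle because it lives in the file system's own record, which stands in for the persistent metadata described above.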

Stratis currently has the ability to take snapshots of file systems but does not yet support automatically merging or rolling back a file system to an earlier state tracked by a snapshot. A similar result can be achieved by deleting or renaming the origin device and then renaming the snapshot to the old origin name, but this is a manual process with a number of drawbacks:

  • No error checking to ensure the correct device is used as the merge target
  • File systems must be de-activated and their corresponding thin ID values changed
  • Potential for data loss exists if the wrong target or thin ID values are used

Automating this process is particularly valuable for snapshots involving the root file system, since the manual approach would require the use of rescue media in order to de-activate, remove and replace the device containing the root file system.

@mulkieran
Member

mulkieran commented Jun 15, 2023

This could be broken down into three separate tasks, which could be done in the order listed:

  1. Storing the relationship between Stratis filesystems. Currently, a snapshot is just another Stratis filesystem, and Stratis does not track a snapshot relationship at all. This relationship would have to be established when a snapshot is created, and patched up when any snapshot is removed. Step (1) is now complete.
  2. Recording the relationship in the pool-level filesystem-level metadata. Step (2) is now complete. However, we need to add some testing to verify that the filesystem metadata is always kept up-to-date in this case.
  3. The actual snapshot merge action. If both snapshot and origin are inactive (to be more precisely defined), a merge command could have immediate effect. Otherwise, the merge would be effected when the filesystem became active again.
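
As a rough illustration of task (1), the origin relationship could be kept as a map from each filesystem to its origin. Everything here is a hypothetical sketch: the `SnapshotTracker` class and its methods are invented names, and the "patch up" policy shown (snapshots of a removed filesystem inherit that filesystem's own origin) is an assumption, since the comment above does not say what the patch-up should do.

```python
from typing import Optional


class SnapshotTracker:
    """Toy origin tracker: maps each filesystem name to its origin's name, or None."""

    def __init__(self) -> None:
        self.origin_of: dict[str, Optional[str]] = {}

    def create(self, name: str) -> None:
        # A plain filesystem has no origin.
        self.origin_of[name] = None

    def snapshot(self, origin: str, name: str) -> None:
        # The relationship is established when the snapshot is created.
        if origin not in self.origin_of:
            raise KeyError(origin)
        self.origin_of[name] = origin

    def remove(self, name: str) -> None:
        parent = self.origin_of.pop(name)
        # Assumed patch-up policy: snapshots of the removed filesystem
        # are re-parented to the removed filesystem's own origin.
        for child, origin in self.origin_of.items():
            if origin == name:
                self.origin_of[child] = parent

    def can_merge(self, origin: str, snapshot: str) -> bool:
        # Guard against picking the wrong merge target.
        return self.origin_of.get(snapshot) == origin


tracker = SnapshotTracker()
tracker.create("fs1")
tracker.snapshot("fs1", "snap-a")
tracker.snapshot("snap-a", "snap-b")
tracker.remove("snap-a")
print(tracker.origin_of)
```

With the relationship recorded, a check like `can_merge` gives the merge action in task (3) a way to reject a target that is not the snapshot's origin, which is the error checking the manual procedure lacks.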

A consideration is the XFS filesystem UUID. It would have to be updated by the merge, which in some cases could be expensive.

@bmr-cymru
Member Author

Filed #643 for the origin tracking and reporting parts.

@mulkieran
Member

Closing, but leaving the sub-issues, to help track the port back from the patch branch.
