Skip to content

Add support in SM Plugin to delete snapshots created manually #1452

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Tarun-kishore
Copy link
Contributor

Description

This PR introduces optional creation workflows and snapshot pattern support for Snapshot Management policies, enabling users to create deletion-only policies and manage external snapshots alongside policy-created ones.

Key Changes

1. Optional Creation Field

  • Made the creation field optional in SMPolicy, allowing deletion-only policies
  • Added version compatibility checks (Version.V_3_2_0) for backward compatibility
  • Updated JSON parsing and serialization to handle nullable creation workflows

2. Snapshot Pattern Support

  • Added snapshotPattern field to SMPolicy.Deletion to include external snapshots in deletion workflows
  • Enhanced deletion states to combine policy-created and pattern-matched snapshots
  • Applied deletion conditions (max_age, min_count) across all matching snapshots

3. Enhanced State Machine Logic

  • Updated DeletingState to retrieve and combine snapshots from both policy and pattern sources
  • Modified DeletionFinishedState to handle completion logic for pattern-based deletions
  • Ensured proper state transitions for deletion-only workflows

4. Index Mapping Updates

  • Added snapshot_pattern field to deletion properties in index mappings
  • Updated both main and test mapping files to support strict mode

Use Cases Enabled

  • Cleanup-only workflows: Delete old snapshots without automatic creation
  • External snapshot management: Include snapshots created by external tools in retention policies
  • Flexible policy configurations: Support creation-only, deletion-only, or combined workflows

Backward Compatibility

All changes are fully backward compatible:

  • Existing policies with required creation fields continue to work unchanged
  • Version checks ensure proper serialization/deserialization across different versions
  • No breaking changes to existing APIs or data structures
  • No data migration required

Testing

Comprehensive test coverage including:

  • Unit tests for optional creation, snapshot patterns, and serialization
  • State machine tests for deletion-only and combined workflows
  • Integration tests for REST API policy creation

Related Issues

Resolves #867

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@Tarun-kishore
Copy link
Contributor Author

Hey @bowenlan-amzn — since you were involved in the original issue, would you be open to reviewing the PR when you get a chance? Appreciate your insights!

@bowenlan-amzn
Copy link
Member

Will find time to do a review

Copy link
Member

@bowenlan-amzn bowenlan-amzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overrall looks pretty good!

I think as long as user only use the deletion-only policy after fully upgraded. There shouldn't be any issue. We can call this out in the documentation.

It's unfortunate we don't have code coverage, for some reason it hasn't been working for a long time. But I can see tests being added.

if (job.creation == null) {
log.warn("Policy creation config becomes null before trying to create snapshot. Reset.")
return SMResult.Fail(
metadataBuilder, WorkflowType.CREATION,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can have a resetCreation just like resetDeletion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, added resetCreation.

Comment on lines 66 to 70
// Get pattern-based snapshots if pattern is specified
val patternSnapshots = if (job.deletion.snapshotPattern != null) {
try {
client.getSnapshots(job.deletion.snapshotPattern, repository)
} catch (ex: Exception) {
log.error("Caught exception while getting snapshots for pattern ${job.deletion.snapshotPattern}.", ex)
metadataBuilder.setLatestExecution(
status = SMMetadata.LatestExecution.Status.RETRYING,
message = "Caught exception while getting snapshots for pattern ${job.deletion.snapshotPattern}.",
cause = ex,
)
return SMResult.Fail(metadataBuilder, WorkflowType.DELETION)
}
} else {
emptyList()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate code, can extract to one method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching that. Moved duplicate code to util class

@bowenlan-amzn bowenlan-amzn self-assigned this Aug 4, 2025
… manually

Signed-off-by: Tarun-kishore <tarun2kishore@gmail.com>
Signed-off-by: Tarun-kishore <tarun2kishore@gmail.com>
@Tarun-kishore
Copy link
Contributor Author

I think as long as user only use the deletion-only policy after fully upgraded. There shouldn't be any issue. We can call this out in the documentation.

Yes, I'll update opensearch documentation after this PR is merged to use deletion-only policy after fully upgrading cluster.

It's unfortunate we don't have code coverage, for some reason it hasn't been working for a long time. But I can see tests being added.

Adding a ci check for code coverage (or preferrably new line changes) would be nice, it would ensure that tests are covering all cases.

@Tarun-kishore Tarun-kishore force-pushed the feature/snapshot-deletion-support branch 2 times, most recently from 9ba8dd0 to 461d4d8 Compare August 4, 2025 08:30
Signed-off-by: Tarun-kishore <tarun2kishore@gmail.com>
@Tarun-kishore Tarun-kishore force-pushed the feature/snapshot-deletion-support branch from 461d4d8 to 9f464fb Compare August 4, 2025 09:04
@Tarun-kishore
Copy link
Contributor Author

These CI failures look unrelated to this change, Other integration tests are failing which are not being reproduced in local environment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] SM to support deletion of snapshots created manually
2 participants