-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(backup): delete backup in the backupstore asynchronously #3038
feat(backup): delete backup in the backupstore asynchronously #3038
Conversation
c809fd8
to
707f44c
Compare
Test PlanNormal Case
Error Case (use nfs)
Controller Crashes Case (use nfs)
|
707f44c
to
4e74975
Compare
There is a regression |
4e74975
to
5e21ded
Compare
Fixed. |
5e21ded
to
cf4419c
Compare
change the backoff time to |
@ChanYiLin Is it ready for review? |
Yes, it is ready for review. |
Hi @derekbit |
cf4419c
to
25dcfe8
Compare
Hi @derekbit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, LGTM
25dcfe8
to
dab29dc
Compare
dab29dc
to
1e10e27
Compare
@c3y1huang Could you take a look at this PR? Thanks. |
This pull request is now in conflict. Could you fix it @ChanYiLin? 🙏 |
@ChanYiLin Could you resolve the conflicts? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general LGTM
@ChanYiLin |
a63073e
to
47f49ce
Compare
cc @ChanYiLin
|
47f49ce
to
e56e419
Compare
WalkthroughThe changes in this pull request enhance the functionality of the backup management system within the application. Key updates include modifications to error handling in the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant BackupController
participant BackupTargetClient
User->>BackupController: Request Backup Deletion
BackupController->>BackupTargetClient: Start Deletion Process
BackupTargetClient->>BackupController: Log Deletion Start
BackupController->>BackupController: Update Backup Status to Deleting
BackupController->>BackupTargetClient: Handle Deletion Asynchronously
BackupTargetClient-->>BackupController: Return Deletion Result
BackupController->>User: Notify Deletion Status
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (6)
k8s/pkg/apis/longhorn/v1beta2/backup.go (1)
16-18
: LGTM: New backup state for deletion processThe addition of
BackupStateDeleting
aligns with the PR objectives for implementing asynchronous backup deletion. The naming and placement are consistent with other backup states.Consider a minor rewording of the comment for clarity:
- // Deleting is also considered as final state - // as it only happens when the backup is being deleting and has deletion timestamp + // Deleting is considered a final state + // as it only occurs when the backup is being deleted and has a deletion timestampcontroller/backup_volume_controller.go (1)
326-326
: Improved error handling for BackupGet operationThis change refines the error logging for the
BackupGet
operation by excluding "not found" errors. This aligns with the modified behavior ofbackupTargetClient.BackupGet
and reduces noise in the logs.Consider adding a debug log for the "not found" case to maintain traceability without cluttering the main log:
if backupInfo, err := backupTargetClient.BackupGet(backupURL, backupTargetClient.Credential); err != nil && !types.ErrorIsNotFound(err) { log.WithError(err).WithFields(logrus.Fields{ "backup": backupName, "backupvolume": backupVolumeName, "backuptarget": backupURL}).Warn("Failed to get backupInfo from remote backup target") +} else if types.ErrorIsNotFound(err) { + log.WithFields(logrus.Fields{ + "backup": backupName, + "backupvolume": backupVolumeName, + "backuptarget": backupURL}).Debug("Backup not found in remote backup target") } else { if backupInfo != nil && backupInfo.Labels != nil { if accessMode, exist := backupInfo.Labels[types.GetLonghornLabelKey(types.LonghornLabelVolumeAccessMode)]; exist { backupLabelMap[types.GetLonghornLabelKey(types.LonghornLabelVolumeAccessMode)] = accessMode } } }This addition will help maintain visibility into "not found" cases without cluttering the main log, which can be useful for debugging and auditing purposes.
engineapi/backups.go (1)
Action Required: Update Callers to Handle Modified
BackupGet
Error BehaviorThe
BackupGet
method inengineapi/backups.go
has changed its error handling. Previously, it returnednil
when a backup was not found, but it now returns an error in such cases. This modification affects the following files:
csi/controller_server.go
controller/backup_volume_controller.go
engineapi/backups.go
client/generated_backup_volume.go
app/recurring_job.go
api/systembackup.go
api/backup.go
Please ensure that each of these callers properly handles the new error scenario, distinguishing between critical errors and cases where a backup is not found to maintain application stability.
🔗 Analysis chain
Line range hint
302-306
: Change in error handling behavior for BackupGet methodThe error handling logic in the
BackupGet
method has been modified. Previously, if the backup configuration was not found, the method would returnnil, nil
. Now, it will return an error wrapped with a message indicating the failure to get the backup configuration.This change might impact any calling code that relies on the previous behavior. Please ensure that all callers of this method are updated to handle the new error case appropriately.
To verify the impact of this change, we can search for all calls to
BackupGet
in the codebase:🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for all calls to BackupGet rg --type go 'BackupGet\(' -A 5 -B 5Length of output: 7229
types/types.go (1)
760-762
: LGTM. Consider case-insensitive error message check.The new
ErrorIsInProgress
function is well-implemented and consistent with other error checking functions in the file. It correctly checks if an error message contains the phrase "in progress".For improved robustness, consider using case-insensitive string comparison:
func ErrorIsInProgress(err error) bool { - return strings.Contains(err.Error(), "in progress") + return strings.Contains(strings.ToLower(err.Error()), "in progress") }This change would catch variations like "In Progress" or "IN PROGRESS" as well.
controller/backup_controller.go (2)
83-86
: Avoid redundant initialization of synchronization primitivesGiven that the zero value of
sync.Mutex
andflowcontrol.Backoff
are usable, the explicit initialization in the constructor may not be necessary. This can simplify the code.Consider removing the explicit initializations:
- deletingMapLock: &sync.Mutex{}, - inProgressDeletingMap: map[string]*DeletingStatus{}, - deletingBackoff: flowcontrol.NewBackOff(DeletionMinInterval, DeletionMaxInterval), + // No need to initialize deletingMapLock and inProgressDeletingMap explicitly + // deletingBackoff will be initialized when first usedAlso applies to: 121-124
1001-1032
: Ensure consistent error handling and loggingIn the
handleBackupDeletionInBackupStore
function, when errors occur during backup deletion, ensure that all error messages provide sufficient context and are logged appropriately. This will aid in debugging and monitoring.Review the error messages and consider enhancing them for clarity.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (5)
- controller/backup_controller.go (9 hunks)
- controller/backup_volume_controller.go (1 hunks)
- engineapi/backups.go (1 hunks)
- k8s/pkg/apis/longhorn/v1beta2/backup.go (1 hunks)
- types/types.go (1 hunks)
🧰 Additional context used
🔇 Additional comments (4)
k8s/pkg/apis/longhorn/v1beta2/backup.go (3)
8-11
: LGTM: Improved comment capitalizationThe capitalization of "Non-final state" improves consistency in the documentation style. This change doesn't affect functionality and enhances readability.
12-15
: LGTM: Consistent comment capitalizationThe capitalization of "Final state" maintains consistency with the previous comment modification. This change enhances readability without affecting functionality.
Line range hint
1-156
: Summary: Changes align with PR objectives and improve code qualityThe modifications in this file successfully introduce the new
BackupStateDeleting
constant, which is crucial for implementing asynchronous backup deletion as outlined in the PR objectives. The comment improvements enhance readability and maintain consistency in the codebase.To ensure the new state is properly integrated, please run the following verification script:
This script will help identify any places where the new state should be handled but might have been overlooked.
✅ Verification successful
BackupStateDeleting integration verified successfully
The
BackupStateDeleting
constant is consistently used across the relevant controller files without any overlaps or conflicts withSystemBackupStateDeleting
. This ensures clear and distinct handling of backup deletion states as intended.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage of the new BackupStateDeleting constant # Test: Search for usages of BackupStateDeleting rg --type go "BackupStateDeleting" # Test: Check if BackupStateDeleting is included in switch statements or if statements handling backup states rg --type go -e "switch.*BackupState" -e "if.*BackupState"Length of output: 3265
Script:
#!/bin/bash # Description: Identify all backup state constants and check for duplicates or inconsistencies # Search for all constants related to BackupState rg --type go "BackupState\w*" | sort | uniq # Search for all constants related to SystemBackupState rg --type go "SystemBackupState\w*" | sort | uniq # Compare the two sets to identify overlaps or discrepancies comm -12 <(rg --type go "BackupState\w*" | sort | uniq) <(rg --type go "SystemBackupState\w*" | sort | uniq)Length of output: 25952
Script:
#!/bin/bash # Description: Identify overlapping constants between BackupState and SystemBackupState without using 'comm' # Extract all BackupState constants rg --type go "BackupState\w*" | sort | uniq > backup_states.txt # Extract all SystemBackupState constants rg --type go "SystemBackupState\w*" | sort | uniq > system_backup_states.txt # Find common constants using grep grep -Fxf system_backup_states.txt backup_states.txtLength of output: 7880
engineapi/backups.go (1)
Line range hint
302-310
: Improved logging in BackupDelete methodThe addition of a log entry at the start of the backup deletion process enhances traceability. This change improves the ability to track and debug backup deletions by providing clear start and end points for the operation.
The actual deletion logic remains unchanged, ensuring that the core functionality is preserved while improving observability.
e56e419
to
40de0f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (3)
types/types.go (1)
760-762
: LGTM! Consider using a constant for the error message.The new
ErrorIsInProgress
function is a good addition, following the existing pattern for error checking functions. It will be useful for identifying "in progress" errors.Consider defining a constant for the "in progress" string to improve maintainability:
+const errorInProgressMessage = "in progress" func ErrorIsInProgress(err error) bool { - return strings.Contains(err.Error(), "in progress") + return strings.Contains(err.Error(), errorInProgressMessage) }controller/backup_controller.go (2)
52-52
: Simplify time duration expressionThe expression
time.Minute * 1
can be simplified totime.Minute
for clarity.Apply this diff to simplify:
- DeletionMinInterval = time.Minute * 1 + DeletionMinInterval = time.Minute
121-121
: Unnecessary initialization ofsync.Mutex
It's unnecessary to explicitly initialize
sync.Mutex
since its zero value is ready for use. Initializing it is redundant.Apply this diff to remove the redundant initialization:
- deletingMapLock: sync.Mutex{},
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (5)
- controller/backup_controller.go (9 hunks)
- controller/backup_volume_controller.go (1 hunks)
- engineapi/backups.go (1 hunks)
- k8s/pkg/apis/longhorn/v1beta2/backup.go (1 hunks)
- types/types.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- controller/backup_volume_controller.go
- engineapi/backups.go
- k8s/pkg/apis/longhorn/v1beta2/backup.go
🧰 Additional context used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, LGTM.
40de0f8
to
a72d8e3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
engineapi/backups.go (1)
Line range hint
302-310
: Enhanced logging in BackupDelete methodThe addition of log statements at the start and end of the backup deletion process significantly improves traceability. This change aligns well with the new asynchronous deletion feature, providing clear indicators of the deletion process status.
Consider adding a log statement for the case when
types.ErrorIsNotFound(err)
is true, to explicitly log when a backup is not found during deletion attempt.func (btc *BackupTargetClient) BackupDelete(backupURL string, credential map[string]string) error { logrus.Infof("Start deleting backup %s", backupURL) _, err := btc.ExecuteEngineBinaryWithoutTimeout("backup", "rm", backupURL) if err != nil { if types.ErrorIsNotFound(err) { + logrus.Infof("Backup %s not found during deletion, considering it as already deleted", backupURL) return nil } return errors.Wrapf(err, "error deleting backup %v", backupURL) } logrus.Infof("Complete deleting backup %s", backupURL) return nil }
types/types.go (1)
760-762
: LGTM! Consider adding an error check for consistency.The new
ErrorIsInProgress
function is well-implemented and consistent with other error checking functions in the file. It efficiently checks for the "in progress" substring in the error message.For consistency with other error checking functions in this file, consider adding a nil check:
func ErrorIsInProgress(err error) bool { + if err == nil { + return false + } return strings.Contains(err.Error(), "in progress") }
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (5)
- controller/backup_controller.go (9 hunks)
- controller/backup_volume_controller.go (1 hunks)
- engineapi/backups.go (1 hunks)
- k8s/pkg/apis/longhorn/v1beta2/backup.go (1 hunks)
- types/types.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- controller/backup_volume_controller.go
- k8s/pkg/apis/longhorn/v1beta2/backup.go
🧰 Additional context used
🔇 Additional comments (11)
engineapi/backups.go (1)
302-305
: Improved error handling in BackupGet methodThe removal of the
ErrorIsNotFound
check enhances the robustness of the backup retrieval process. This change ensures that any failure to retrieve a backup configuration, including "not found" errors, is properly reported. This aligns well with the new asynchronous deletion feature, where a backup in the "Deleting" state should not be treated as non-existent.controller/backup_controller.go (10)
38-39
: LGTM: New message type constants addedThe addition of
MessageTypeReconcileInfo
andMessageTypeReconcileError
constants improves the categorization of messages in the backup status. The naming is clear and consistent with existing conventions.
43-48
: LGTM: New message constants added for backup processThe addition of these message constants improves the clarity and consistency of log messages throughout the backup process. They cover various scenarios including waiting for snapshots, engines, and backup deletions, as well as error cases.
51-54
: LGTM: Deletion interval constants addedThe addition of
DeletionMinInterval
andDeletionMaxInterval
constants provides clear boundaries for the deletion backoff mechanism. The chosen values (1 minute minimum, 24 hours maximum) seem reasonable for managing deletion retries.
56-60
: LGTM: DeletingStatus struct addedThe new
DeletingStatus
struct is a good addition for tracking the state and potential error messages during the backup deletion process. It provides a clear structure for managing deletion status information.
81-86
: LGTM: BackupController struct updated for asynchronous deletionThe additions to the
BackupController
struct (deletingMapLock
,inProgressDeletingMap
, anddeletingBackoff
) provide the necessary components to support asynchronous deletion of backups. The use of a mutex for thread-safe access to the map and a backoff mechanism for managing retries are good practices.
121-124
: LGTM: Proper initialization of new BackupController fieldsThe new fields in the
BackupController
struct are correctly initialized in the constructor. The mutex is initialized to its zero value, the map is created empty, and the backoff mechanism is set up with the defined min and max intervals.
784-795
: LGTM: Prevention of backup creation during deletionThe addition of a check to prevent new backup creation while any backup is in the
Deleting
state is a good safeguard. This helps maintain consistency in the backup store and prevents potential conflicts during the deletion process.
945-947
: LGTM: BackupStateDeleting added to final statesAdding
BackupStateDeleting
to the list of final states in thebackupInFinalState
method is consistent with the new asynchronous deletion feature. This ensures that backups in the deleting state are treated appropriately in the controller's logic.
949-972
: LGTM: Asynchronous backup deletion implementationThe
startDeletingBackupInBackupStore
method is a well-implemented solution for asynchronous backup deletion. It correctly uses mutex locking, includes panic recovery in the goroutine, and properly handles errors by updating the deletion status map. This approach allows for efficient processing of deletion requests without blocking the controller.
1044-1050
: LGTM: Thread-safe update of deletion statusThe
setInprogressDeletionMap
method provides a thread-safe way to update the status of an in-progress deletion. It correctly uses a mutex to protect access to the shared map and updates both the state and error message. This method adheres to the principle of single responsibility and is well-implemented.
a72d8e3
to
f47e12e
Compare
ref: longhorn/longhorn 8746 Signed-off-by: Jack Lin <jack.lin@suse.com>
f47e12e
to
37a4dff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (5)
- controller/backup_controller.go (9 hunks)
- controller/backup_volume_controller.go (1 hunks)
- engineapi/backups.go (1 hunks)
- k8s/pkg/apis/longhorn/v1beta2/backup.go (1 hunks)
- types/types.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- controller/backup_volume_controller.go
- engineapi/backups.go
- k8s/pkg/apis/longhorn/v1beta2/backup.go
- types/types.go
🧰 Additional context used
🔇 Additional comments (4)
controller/backup_controller.go (4)
38-39
: LGTM: New constants and struct for backup deletion processThe new constants and
DeletingStatus
struct are well-defined and support the enhanced backup deletion process. They provide clear message templates and a structured way to track deletion status.Also applies to: 43-48, 51-59
120-124
: LGTM: Proper initialization of new fieldsThe new fields in the BackupController struct are correctly initialized in the NewBackupController function. This ensures that the controller starts with the proper state for managing backup deletions.
1047-1053
: LGTM: Thread-safe update of deletion statusThe
setInprogressDeletionMap
method correctly uses the mutex to ensure thread-safe updates to theinProgressDeletingMap
. This is crucial for maintaining data integrity in a concurrent environment.
Line range hint
1-1053
: Overall assessment: Significant improvements with minor suggestionsThe changes to the BackupController significantly enhance its capabilities, particularly in managing the backup deletion process. The new methods and data structures are well-designed and improve the controller's ability to handle concurrent operations and implement backoff strategies.
Key improvements:
- Introduction of a structured deletion process with clear state management.
- Implementation of a backoff mechanism for retrying deletions.
- Asynchronous deletion to improve performance.
Suggestions for further improvement:
- Consider using
sync.Mutex
directly instead of a pointer.- Optimize the blocking mechanism for new backups during deletions.
- Enhance error handling in the asynchronous deletion process.
- Implement cleanup of the
inProgressDeletingMap
after successful deletions.These changes represent a substantial improvement to the backup management system, with the suggested modifications potentially further enhancing its robustness and efficiency.
ref: longhorn/longhorn#8746
follow LEP to implement: longhorn/longhorn#9152