fix: Avoid error in unpause operation of Cluster resource #1469

AgustinBettati · 2025-10-31T18:17:31Z

Proposed changes

Problem

Cluster resource does not support unpausing. This use case must be handled by defining a separate request only adjusting paused property due to how the API behaves.

Current implement supports pausing a cluster handling a separate request within update callback logic.

Solution

Before sending an update operation information of the complete model (which would result in an error) a check is made to see if the update operation is doing an unpause. Under this scenario an isolated PATCH request is crafted only sending paused property with false. Follow up PATCH request can be sent after this without encountering any errors (in case other changes are present).

Testing

Introduced a new e2e test for verifying successful creation, pause, and unpause operations.
Created stack using private registry verifying pause/unpause operations.

Type of change:

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as
expected)
This change requires a documentation update
If changes include removal or addition of 3rd party GitHub actions, I updated our internal document. Reach out to the APIx Integration slack channel to get access to the internal document.

Manual QA performed:

cfn invoke for each of CRUDL/cfn test
Updated resource in example
Published to AWS private registry
Used the template in example to create and update a stack in AWS
Deleted stack to ensure resources are deleted
Created multiple resources in same stack
Validated in Atlas UI
Included screenshots

Required Checklist:

I have signed the MongoDB CLA
I have added tests that prove my fix is effective or that my feature works
I have checked that this change does not generate any credentials and that they are NOT accidentally logged anywhere.
I have added any necessary documentation (if appropriate)
I have run make fmt and formatted my code
For CFN Resources: I have released by changes in the private registry and proved by change
works in Atlas

Further comments

AgustinBettati · 2025-11-03T16:15:26Z

.github/workflows/e2e-testing.yaml

+        # run idividual test allowing for parallel execution. Due to usage of t.Setenv() in test code, t.Parallel() is not possible
+        run: |
+          cd cfn-resources/test/e2e/cluster
+          go test -timeout 90m -v -run '^TestClusterPauseCFN$' .    


While not ideal this was a solution that did not require a significant refactor to how our CFN e2e are currently executed and allow the new test to run in parallel. Currently e2e tests rely on executing shell scripts and setting env variables during execution.

Copilot

Pull Request Overview

This PR fixes a bug where unpausing a cluster through CloudFormation would fail due to how the Atlas API handles the unpause operation. The fix introduces a separate PATCH request for unpausing that only updates the paused property before applying any other changes.

Adds special handling for unpause operations before the main update request
Introduces comprehensive e2e tests for cluster pause/unpause functionality
Updates CI workflow to run the new pause test independently

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File	Description
cfn-resources/cluster/cmd/resource/resource.go	Implements unpause-specific handling logic to avoid API errors
cfn-resources/test/e2e/cluster/cluster_pause_test.go	Adds e2e test validating create, pause, and unpause operations
cfn-resources/test/e2e/cluster/cluster_pause.json.template	Provides CloudFormation template for pause/unpause testing
.github/workflows/e2e-testing.yaml	Updates workflow to run cluster pause tests separately

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

.github/workflows/e2e-testing.yaml

manupedrozo · 2025-11-03T17:42:56Z

.github/workflows/e2e-testing.yaml

-          go test -timeout 90m -v cluster_test.go
+          go test -timeout 90m -v -run '^TestClusterCFN$' .
+
+  cluster-pause:


Curious why we are introducing a new workflow as opposed to using the cluster one for new tests.

Initial motivation is to add the new test but have them run in parallel (sequentially they take 40min + 20min). Due to how are e2e tests are implemented t.Parallel() cannot be used, Go returns an explicit error panic: testing: test using t.Setenv or t.Chdir can not use t.Parallel.
This is the reasoning behind adding a separate test group so both tests can run in parallel, tried to capture this why in the comment.

Got it. I see that you changed to a single workflow running 2 packages now. Smart! Ty!

single workflow running 2 packages now

Was close but did not work 😞 , publishing script modifies specific files in cluster package (for private publishing), need to run the tests in isolated filesystems to avoid race conditions.

EspenAlbert · 2025-11-03T19:00:08Z

cfn-resources/cluster/cmd/resource/resource.go

+	}
+
+	// Unpausing must be handled separately from other updates to avoid errors from the API.
+	if pe := handleUnpausingUpdate(client, currentCluster, currentModel); pe != nil {


Is there any need to wait for a stable cluster state? Is it ok we immediately send a different update afterwards?

There is not, was able to verify that once a first PATCH is sent doing the unpause, having the cluster in UPDATING allows sending other PATCH requests modifying the cluster

cfn-resources/test/e2e/cluster/cluster_pause_test.go

lantoli

LGTM, let me know if you want to discuss about testing here

…ctory" This reverts commit f620f31.

This reverts commit e6d5b76.

…luster test" This reverts commit ecddf7b.

manupedrozo · 2025-11-04T17:02:30Z

.github/workflows/e2e-testing.yaml

+
+  # Run idividual test in separate test group for parallel execution. 
+  # Due to usage of t.Setenv() in test code t.Parallel() is not possible.
+  # Having both tests run in same `go test` execution is not possible due to scripts used for private registry publishing modifying same files 


Ty for the comment

AgustinBettati added 11 commits October 31, 2025 12:43

logic for handling unpause of cluster

73d97e7

adding e2e test replicating pause and unpause

16ffc23

Merge remote-tracking branch 'origin/master' into CLOUDP-356110

b7cc45d

adding docs clarification

19ae278

fixing CI for paused test cluster

ae4126d

allowing execution to run in parallel

1684889

add separate group for executing pause test

a367cdf

removing parallel flag

b776377

make each test group run specific test

43a556b

allow update in paused and other changes within same update

3903fde

fix linting

f4acd9d

AgustinBettati commented Nov 3, 2025

View reviewed changes

AgustinBettati marked this pull request as ready for review November 3, 2025 16:17

AgustinBettati requested a review from a team as a code owner November 3, 2025 16:17

Copilot AI review requested due to automatic review settings November 3, 2025 16:17

Copilot AI reviewed Nov 3, 2025

View reviewed changes

.github/workflows/e2e-testing.yaml Outdated Show resolved Hide resolved

.github/workflows/e2e-testing.yaml Outdated Show resolved Hide resolved

marcosuma approved these changes Nov 3, 2025

View reviewed changes

manupedrozo reviewed Nov 3, 2025

View reviewed changes

manupedrozo approved these changes Nov 3, 2025

View reviewed changes

EspenAlbert reviewed Nov 3, 2025

View reviewed changes

lantoli reviewed Nov 4, 2025

View reviewed changes

cfn-resources/test/e2e/cluster/cluster_pause_test.go Outdated Show resolved Hide resolved

lantoli reviewed Nov 4, 2025

View reviewed changes

cfn-resources/test/e2e/cluster/cluster_pause_test.go Outdated Show resolved Hide resolved

lantoli approved these changes Nov 4, 2025

View reviewed changes

AgustinBettati added 3 commits November 4, 2025 12:14

having single test group running multiple packages for each cluster test

ecddf7b

simplify asserts

e6d5b76

support e2e test directory being different from resource directory

f620f31

EspenAlbert approved these changes Nov 4, 2025

View reviewed changes

AgustinBettati added 3 commits November 4, 2025 16:49

Revert "support e2e test directory being different from resource dire…

0c3cc15

…ctory" This reverts commit f620f31.

Revert "simplify asserts"

d43392a

This reverts commit e6d5b76.

Revert "having single test group running multiple packages for each c…

cfd5afe

…luster test" This reverts commit ecddf7b.

improving comment of why single test group is challenging

a843f04

AgustinBettati added this pull request to the merge queue Nov 4, 2025

Merged via the queue into master with commit 4f7a75b Nov 4, 2025
40 checks passed

AgustinBettati deleted the CLOUDP-356110 branch November 4, 2025 17:00

manupedrozo reviewed Nov 4, 2025

View reviewed changes

fix: Avoid error in unpause operation of Cluster resource #1469

fix: Avoid error in unpause operation of Cluster resource #1469

Conversation

AgustinBettati commented Oct 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Proposed changes

Problem

Solution

Testing

Type of change:

Manual QA performed:

Required Checklist:

Further comments

Uh oh!

AgustinBettati Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

Uh oh!

manupedrozo Nov 3, 2025

Choose a reason for hiding this comment

Uh oh!

AgustinBettati Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

manupedrozo Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

AgustinBettati Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

EspenAlbert Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

AgustinBettati Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

lantoli left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

manupedrozo Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

AgustinBettati commented Oct 31, 2025 •

edited

Loading

AgustinBettati Nov 3, 2025 •

edited

Loading

EspenAlbert Nov 3, 2025 •

edited

Loading