
Conversation

@djivey
Collaborator

@djivey djivey commented Sep 18, 2025

What does this PR do?

What issues does this PR fix or reference?

Fixes: This PR addresses idempotency issues in the saltext-kubernetes extension by implementing proper patch functionality for Kubernetes resources. Previously, the extension could not intelligently update existing resources, leading to unnecessary recreation operations that could disrupt running workloads. This enhancement enables Salt to detect differences between the desired and current state and apply only the necessary changes through Kubernetes patch operations, ensuring idempotent resource management.

closes #16

Previous Behavior

The state module was not idempotent for most present and absent functions.

New Behavior

This ensures that running salt '*' state.apply multiple times produces consistent results without unintended side effects, which is fundamental to Salt's declarative infrastructure management approach.

  • Added patch_<resource>() functions to support both dictionary patches and source file rendering
  • Functions that previously returned .to_dict() responses now return sanitize_for_serialization() responses. The structure is similar but may have slight differences in nested object handling.
  • Resources are now patched rather than replaced when differences are detected (a rough sketch of the resulting state flow follows this list)
  • State operations correctly report "no changes" when resources already match the desired state
  • Eliminated unnecessary resource recreation that could cause service disruption
  • Enhanced the state logic to properly handle both creation and modification scenarios
  • Added replace as an argument in case someone wants to force a replacement of a resource
  • Added dry_run to help with determining changes
  • Updated tests to reflect the changes
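
For reference, below is a minimal sketch of the patch-first flow described in this list, as it could look inside a deployment_present state function. It is an illustration under assumptions: the exact signatures of kubernetes.show_deployment, kubernetes.create_deployment, kubernetes.replace_deployment, and the new kubernetes.patch_deployment, as well as the changes/comment wording, may differ from the actual implementation in this PR.

# Illustrative sketch only; real signatures and error handling differ in the PR.
def deployment_present(
    name, namespace="default", metadata=None, spec=None,
    source="", template="", replace=False, **kwargs
):
    ret = {"name": name, "changes": {}, "result": True, "comment": ""}

    existing = __salt__["kubernetes.show_deployment"](name, namespace, **kwargs)

    if existing is None:
        # Resource does not exist yet: create it (or only report in test mode).
        if __opts__["test"]:
            ret["result"] = None
            ret["comment"] = "The deployment is going to be created"
            return ret
        res = __salt__["kubernetes.create_deployment"](
            name, namespace, metadata, spec, source, template, __env__, **kwargs
        )
        ret["changes"] = {"deployment": {"old": {}, "new": res}}
        return ret

    if replace:
        # Explicit opt-in: replace the whole resource instead of patching it.
        res = __salt__["kubernetes.replace_deployment"](
            name, metadata, spec, source, template, __env__, namespace, **kwargs
        )
        ret["changes"] = {"deployment": {"old": existing, "new": res}}
        return ret

    # Patch path: use a dry run to render/normalize the desired object
    # (including source files) and compare it to the live resource.
    desired = __salt__["kubernetes.patch_deployment"](
        name, namespace, metadata=metadata, spec=spec, source=source,
        template=template, dry_run=True, **kwargs
    )
    if desired == existing:
        ret["comment"] = "The deployment is already in the desired state"
        return ret

    if __opts__["test"]:
        ret["result"] = None
        ret["changes"] = {"deployment": {"old": existing, "new": desired}}
        ret["comment"] = "The deployment is going to be patched"
        return ret

    res = __salt__["kubernetes.patch_deployment"](
        name, namespace, metadata=metadata, spec=spec, source=source,
        template=template, **kwargs
    )
    ret["changes"] = {"deployment": {"old": existing, "new": res}}
    return ret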

Merge requirements satisfied?

[NOTICE] Bug fixes or new features require tests.

Commits signed with GPG?

Yes

Please review Salt's Contributing Guide for best practices.

See GitHub's page on GPG signing for more information about signing commits with GPG.

@djivey
Collaborator Author

djivey commented Sep 18, 2025

@lkubb Can you review the approach I used here to achieve idempotency? I only made changes to one resource type to keep the diff as small as possible for you.

This is going to end up being another big change, I believe, but if you agree with the approach, it should be easy for me to apply across the rest of the resource types.

I am not sure that some of these tests, like the integration tests, are needed, but I wanted to at least show that the changes work.

@djivey
Collaborator Author

djivey commented Sep 18, 2025

I will have to look at these test failures; they worked locally.

@djivey djivey marked this pull request as draft September 18, 2025 00:57
@lkubb
Member

lkubb commented Sep 18, 2025

Afaict it's just salt-extensions/salt-extension-copier#199. Sorry, forgot about this issue, will try to get a new template release out soon.

Member

@lkubb lkubb left a comment

If you merge #21 and rebase this branch, CI should pass. 👍

This looks pretty good! I added some comments that hopefully help guide you for the rest of the resources.

Once again thanks for your efforts :)

ret["comment"] = "The deployment is going to be created"
# Simulate creation to show changes to account for using the source argument
try:
dry_run_result = __salt__["kubernetes.create_deployment"](
Member

@lkubb lkubb Sep 19, 2025

hint: [Not sure if that's relevant in this specific case, more general guidance]

When running with test=true, dependencies of this object might not exist because previous required states run in test mode as well, which can cause failures in test mode when it would run fine in regular mode.

In those cases, the simplest solution generally is to leave ret["result"] = None and add a hint to the comment: ret["comment"] = f"Dry run failed. If previous states ensure requirements are met, you can ignore this message. The error was: {err}"
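
For illustration, a rough sketch of that pattern (the simulated creation call and the CommandExecutionError type are assumptions here; adapt it to whatever the dry-run code actually raises):

from salt.exceptions import CommandExecutionError

try:
    dry_run_result = __salt__["kubernetes.create_deployment"](
        name, namespace, metadata, spec, source, template, __env__, **kwargs
    )
    ret["result"] = None
    ret["changes"] = {"deployment": {"old": {}, "new": dry_run_result}}
    ret["comment"] = "The deployment is going to be created"
except CommandExecutionError as err:
    # Don't hard-fail in test mode: required objects (secrets, PVCs, ...) may
    # simply not exist yet because earlier states also ran with test=true.
    ret["result"] = None
    ret["comment"] = (
        "Dry run failed. If previous states ensure requirements are met, "
        f"you can ignore this message. The error was: {err}"
    )
return ret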

Collaborator Author

I am not sure I follow what you mean here exactly. The primary reason I added the dry_run here is so that I can get a change result if using source because at this point that file has not been rendered yet. Using dry_run gets that to render and I can return results on the changes.

Before it was only checking and returning on metadata and spec.

Member

@lkubb lkubb Sep 20, 2025

Sorry, I'll try to be clearer (without knowing the specifics of Kubernetes resources, I'll illustrate with hypothetical docker state modules):

Fetch image:
  docker_image.present:
    - name: ghcr.io/foo/bar

Create volume:
  docker_volume.present:
    - name: foo-vol

Create container:
  docker_container.present:
    - name: foo
    - image: ghcr.io/foo/bar
    - mount:
        foo-vol: /app/data
    - require:
      - docker_image: ghcr.io/foo/bar
      - docker_volume: foo-vol

Other stuff:
  file.managed:
    - name: /etc/bar/baz.conf
    - source: salt://foo/bar/baz.conf
    - require:
      - docker_container: foo

When applying this state on a fresh node with test=true, both docker_image and docker_volume will report that their resources will be created, but not actually create them. If docker_container.present relied on a hypothetical docker.create_container(..., dry_run=True) API, Docker could rightfully complain that neither the referenced image nor the referenced volume exist, which would trigger an exception in this case.

If docker_container.present did not have specific test mode handling for this situation, the state would report an error in test mode and dependent states (like Other stuff here) would be marked failed as well, even if the state application as a unit would work just fine without test mode.

The simplest solution is to not fail docker_container.present in test mode (since it can have requirements managed by different states that don't actually ensure the requirement is created in test mode), but instead report ret["result"] = None with a comment explaining that the test failed, but it might just be a result of running in test mode. This allows for the file.managed state to still report on changes it would make.

If a state can expect that it should work in test mode (because it can't really have any requirements, like the docker_image example), this is unnecessary.

Since I'm not familiar with Kubernetes, I can't tell if this is an issue here. Even if it is, it would be a minor usability issue. Just wanted to point it out as a guide. Many state modules don't account for this.

TLDR: In general, failing test mode should be restricted to cases where we're reasonably sure there's something wrong (e.g. the ghcr.io/foo/bar image does not exist* or the connection to the repository failed).

* One could argue that in the general case, an image could be created by a previous docker_image.built state, but that would likely be a rare situation, much rarer than a missing image.

Collaborator Author

@djivey djivey Sep 22, 2025

Ah, I see what you mean. The good thing about Kubernetes is that most of the time all of that is in the spec - see the example below. Persistent volumes would likely be the cause of an issue here unless the cluster is set up to provision persistent volumes when a claim is made. Thank you for clarifying.

EDIT: Secrets and configmaps would also apply here.

spec:
  imagePullSecrets:
    - name: {{ secret }}
  containers:
  - name: {{ name }}
    image: {{ url }}:{{ tag }}
    imagePullPolicy: Always
    ports:
    - containerPort: {{ tgt_port }}
    volumeMounts:
    - name: {{ pvc_name }}
      mountPath: {{ pvc_mount_path }}
    - name: {{ secret_name }}
      mountPath: {{ secret_mount_path }}
      readOnly: true
  volumes:
  - name: {{ pvc_name }}
    persistentVolumeClaim:
      claimName: {{ pvc_name }}
  - name: {{ secret_name }}
    secret:
      secretName: {{ secret_name }}
      defaultMode: 0400

assert deployment_state["spec"]["replicas"] == 3


def test_deployment_present_patch(kubernetes, deployment, kubernetes_exe):
Member

suggestion: I'd test in test mode as well and assert that the resource is not changed. It's often neglected when writing a state, but can have surprising consequences if not handled correctly.
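
A hypothetical sketch of such a test, mirroring the fixtures used by test_deployment_present_patch; the fixture values, the exact way to enable test mode, and the return shape of the states wrapper are assumptions and may differ in this suite:

def test_deployment_present_patch_test_mode(kubernetes, deployment, kubernetes_exe):
    # Apply the same change in test mode: the state should report the pending
    # patch without touching the live resource.
    ret = kubernetes.deployment_present(
        name=deployment,
        spec={"replicas": 4},
        test=True,
    )
    assert ret.result is None
    assert ret.changes

    # The live deployment must be unchanged.
    deployment_state = kubernetes_exe.show_deployment(deployment)
    assert deployment_state["spec"]["replicas"] == 3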

assert deployment_state["spec"]["replicas"] == 4


def test_deployment_present_patch_source(
Member

suggestion: Test mode, see above

Comment on lines +156 to +157
assert result == {"code": 200}
assert kubernetes.kubernetes.client.AppsV1Api().delete_namespaced_deployment.called
Member

@lkubb lkubb Sep 19, 2025

note: This test is an example for my remark on unit tests above (or below if viewed from the conversations tab :)). It only checks that delete_deployment returns the value we made the sanitize_for_serialization mock return and ensures it calls delete_namespaced_deployment. For all this brittle setup, it's quite isolated from reality and redundant since we have the functional tests.

Both assertions should be covered by the functional test. The unit test, however, does not check, for example, that

  • the deployment was actually deleted
  • the functions are called with the correct arguments
  • the client lib still has the delete_namespaced_deployment/sanitize_for_serialization etc. functions
  • the client lib still works with the Kubernetes API
  • etc.

Unit tests, especially with comprehensive client libs like the kubernetes one, often need a lot of surgical patching (which can make changes to the functions under test quite annoying and limits their meaningfulness).

I'd reserve unit tests for hard-to-test behavior like some unhappy paths, e.g. "Does the state module handle situations where an execution module raises a CommandExecutionError" or a specific ApiException.
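
For example, a hypothetical unit test along these lines (the error-handling behavior and the presence of the loader dunders via the suite's existing loader setup are assumptions):

from unittest.mock import MagicMock, patch

from salt.exceptions import CommandExecutionError

from saltext.kubernetes.states import kubernetes as kubernetes_state


def test_deployment_present_reports_failure_on_execution_error():
    # Simulate the execution module failing, e.g. due to a broken API connection.
    show = MagicMock(side_effect=CommandExecutionError("connection refused"))
    with patch.dict(kubernetes_state.__salt__, {"kubernetes.show_deployment": show}):
        ret = kubernetes_state.deployment_present(name="nginx", namespace="default")

    # The state should fail cleanly and surface the error instead of raising.
    assert ret["result"] is False
    assert "connection refused" in ret["comment"]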

Member

note: Imo many of the current unit tests don't make sense since they test the same happy-path behavior as the functional (and integration) ones. I'd scrap the existing redundant ones instead of rewriting them and would advise focusing on comprehensive functional ones instead.

[See related comment on test_delete_deployments]

Member

suggestion: Same comment regarding unit tests here. I'd scrap most of the existing (redundant) ones instead of rewriting them/adding new ones.

@djivey
Collaborator Author

djivey commented Sep 19, 2025

If you merge #21 and rebase this branch, CI should pass. 👍

This looks pretty good! I added some comments that hopefully help guide you for the rest of the resources.

Once again thanks for your efforts :)

Thank you for taking a look. Most of this should be pretty easy to fix.

I will look closer at the unit tests; I am still wrapping my head around what they are actually for. This helps a lot. Functional and integration tests make more sense to me.

@codecov

codecov bot commented Sep 19, 2025

Codecov Report

❌ Patch coverage is 82.90155% with 33 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.49%. Comparing base (70bc313) to head (6b02962).

Files with missing lines Patch % Lines
src/saltext/kubernetes/states/kubernetes.py 61.53% 17 Missing and 3 partials ⚠️
src/saltext/kubernetes/modules/kubernetesmod.py 64.86% 10 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #20      +/-   ##
==========================================
+ Coverage   75.25%   75.49%   +0.23%     
==========================================
  Files          17       17              
  Lines        4130     4284     +154     
  Branches      432      448      +16     
==========================================
+ Hits         3108     3234     +126     
- Misses        883      906      +23     
- Partials      139      144       +5     
Flag Coverage Δ
Linux 75.49% <82.90%> (+0.23%) ⬆️
macOS 42.57% <43.52%> (+0.03%) ⬆️
project 54.85% <62.92%> (+0.22%) ⬆️
py310 75.32% <82.90%> (+0.24%) ⬆️
py39 75.49% <82.90%> (+0.23%) ⬆️
salt_3006_15 75.49% <82.90%> (+0.23%) ⬆️
salt_3007_7 75.32% <82.90%> (+0.24%) ⬆️
tests 94.40% <100.00%> (+0.21%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.


Development

Successfully merging this pull request may close these issues.

Refactor the state module kubernetes.py present functions to be idempotent
