Adding dry_run mode for setting data stream settings #128269

masseyke · 2025-05-21T17:42:41Z

This adds a dry_run parameter to the REST method added in #127858. If dry_run is true, it simulates setting the settings on the data stream and indices, but does not actually change any state. For example:

PUT _data_stream/my-data-stream/_settings?dry_run=true
{
    "index.lifecycle.name" : "new-test-policy",
    "index.number_of_shards": 11
}

returns:

{
  "data_streams": [
    {
      "name": "my-data-stream",
      "applied_to_data_stream": true,
      "settings": {
        "index": {
          "lifecycle": {
            "name": "new-test-policy"
          },
          "number_of_shards": "11"
        }
      },
      "effective_settings": {
        "index": {
          "lifecycle": {
            "name": "new-test-policy"
          },
          "mode": "standard",
          "number_of_shards": "11",
          "number_of_replicas": "0"
        }
      },
      "index_settings_results": {
        "applied_to_data_stream_only": [
          "index.number_of_shards"
        ],
        "applied_to_data_stream_and_backing_indices": [
          "index.lifecycle.name"
        ]
      }
    }
  ]
}

But the settings are not actually applied to the data stream or any indices.
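
The dry-run contract described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `DryRunSettingsSketch`, `update`, and `current` are hypothetical names, and a flat map stands in for the data stream's settings. It is not the code in this PR.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a data stream's settings store; not Elasticsearch code.
class DryRunSettingsSketch {
    private Map<String, String> settings = new TreeMap<>();

    DryRunSettingsSketch(Map<String, String> initial) {
        settings.putAll(initial);
    }

    /**
     * Merges the overrides on top of the current settings and returns the result.
     * The stored settings are replaced only when dryRun is false.
     */
    Map<String, String> update(Map<String, String> overrides, boolean dryRun) {
        Map<String, String> merged = new TreeMap<>(settings);
        merged.putAll(overrides);
        if (dryRun == false) {
            settings = merged;
        }
        return merged;
    }

    /** Returns a copy of what is actually stored. */
    Map<String, String> current() {
        return new TreeMap<>(settings);
    }

    public static void main(String[] args) {
        DryRunSettingsSketch stream = new DryRunSettingsSketch(Map.of("index.number_of_replicas", "0"));
        Map<String, String> preview = stream.update(Map.of("index.number_of_shards", "11"), true);
        System.out.println("preview: " + preview);
        System.out.println("stored:  " + stream.current());
    }
}
```

In this sketch the caller gets the same response shape in both modes, and the only difference is whether the stored state is replaced.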

elasticsearchmachine · 2025-05-21T20:48:35Z

Pinging @elastic/es-data-management (Team:Data Management)

Copilot

Pull Request Overview

This PR adds a dry_run mode to the data stream settings update API so that when enabled, settings are validated and simulated without being applied to the underlying data stream or its indices. Key changes include:

- Updating request classes, transport actions, and REST handlers to accept a dry_run parameter.
- Modifying the metadata update service to support a simulated dry run flow.
- Enhancing tests and YAML-based integration tests to validate the dry_run behavior.

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file

| File | Description |
| --- | --- |
| server/src/test/java/org/elasticsearch/action/datastreams/UpdateDataStreamSettingsActionRequestTests.java | Updated test instance creation and mutation logic to include dry_run. |
| server/src/main/java/org/elasticsearch/cluster/metadata/MetadataDataStreamsService.java | Added dry_run handling in updateSettings and simulation of settings update. |
| server/src/main/java/org/elasticsearch/action/datastreams/UpdateDataStreamSettingsAction.java | Modified request, equals, and hashCode to include dry_run. |
| server/src/main/java/org/elasticsearch/TransportVersions.java | Introduced a new transport version constant for dry_run. |
| modules/data-streams/src/yamlRestTest/resources/rest-api-spec/test/data_stream/240_data_stream_settings.yml | Added tests for verifying proper dry_run behavior. |
| modules/data-streams/src/main/java/org/elasticsearch/datastreams/rest/RestUpdateDataStreamSettingsAction.java | Updated REST handler to parse the dry_run parameter. |
| modules/data-streams/src/main/java/org/elasticsearch/datastreams/action/TransportUpdateDataStreamSettingsAction.java | Adjusted transport action methods to use the dry_run flag in processing logic. |

Files not reviewed (1)

rest-api-spec/src/main/resources/rest-api-spec/api/indices.put_data_stream_settings.json: Language not supported

Comments suppressed due to low confidence (1)

modules/data-streams/src/main/java/org/elasticsearch/datastreams/action/TransportUpdateDataStreamSettingsAction.java:324

[nitpick] Although the code logic in the dry run branch is correct, adding an explicit return statement after listener.onResponse(null) could improve clarity and reduce the risk of future maintenance errors.

if (dryRun) { listener.onResponse(null); }
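
A minimal sketch of the early-return style this nitpick suggests, assuming a simplified callback in place of the real listener. `EarlyReturnSketch`, `updateIndex`, and `applied` are hypothetical names, and `Consumer<String>` is only a stand-in for the listener type:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; names are illustrative and not taken from the PR's code.
class EarlyReturnSketch {
    // Records which indices were really updated.
    final List<String> applied = new ArrayList<>();

    void updateIndex(String index, boolean dryRun, Consumer<String> listener) {
        if (dryRun) {
            listener.accept(null); // report "no error" without touching the index
            return;                // explicit exit: code added below cannot run in dry-run mode
        }
        applied.add(index);
        listener.accept(null);
    }
}
```

With the explicit `return`, the non-dry-run path needs no `else` block, so later additions to it cannot leak into the dry-run branch by accident.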

Copilot · 2025-05-22T09:43:35Z

server/src/main/java/org/elasticsearch/cluster/metadata/MetadataDataStreamsService.java

+                if (response.isAcknowledged()) {
+                    return clusterService.state().projectState(projectId).metadata().dataStreams().get(dataStreamName);
+                } else {
+                    throw new ElasticsearchException("Updating settings not accepted for unknown reasons");


Consider improving the error message here to provide more specific context about why the settings update was not accepted, which can aid in troubleshooting in both dry_run and regular flows.

Suggested change

-                    throw new ElasticsearchException("Updating settings not accepted for unknown reasons");
+                    throw new ElasticsearchException(
+                        "Updating settings not accepted for project [" + projectId + "], data stream [" + dataStreamName +
+                        "], with settings overrides: " + settingsOverrides.toString()
+                    );

Sorry @masseyke, I have been using your PRs as a bit of a test bed for the copilot-trial group. Feel free to ignore these and/or give feedback on how useful they are.

While I don't agree with every comment copilot has made, it has already found several bugs (or at least typos) in my PRs, and has made some good suggestions.

lukewhiting

Underlying logic and approach look sound. Just some outstanding questions about code cleanliness to explore.

lukewhiting · 2025-05-22T09:52:07Z

.../main/java/org/elasticsearch/datastreams/action/TransportUpdateDataStreamSettingsAction.java

+                        @Override
+                        public void onResponse(AcknowledgedResponse response) {
+                            UpdateDataStreamSettingsAction.DataStreamSettingsResponse.IndexSettingError error;
+                            if (response.isAcknowledged() == false) {


We're getting quite deeply nested at this point. Is it worth pulling out some of these blocks into methods / inner classes for readability?

OK I switched everything over to ActionListeners helper methods rather than creating my own ActionListeners. I think it improves readability a bit. Let me know what you think.

Thanks :-) Yeah I think that's a little easier on the eyes 👍🏻

lukewhiting · 2025-05-22T10:01:46Z

server/src/main/java/org/elasticsearch/cluster/metadata/MetadataDataStreamsService.java

+        ProjectMetadata projectMetadata = clusterState.metadata().getProject(projectId);
+        Map<String, DataStream> dataStreamMap = projectMetadata.dataStreams();
+        DataStream dataStream = dataStreamMap.get(dataStreamName);
+        Settings existingSettings = dataStream.getSettings();
+
+        Template.Builder templateBuilder = Template.builder();
+        Settings.Builder mergedSettingsBuilder = Settings.builder().put(existingSettings).put(settingsOverrides);
+        Settings mergedSettings = mergedSettingsBuilder.build();
+
+        final ComposableIndexTemplate template = lookupTemplateForDataStream(dataStreamName, projectMetadata);
+        ComposableIndexTemplate mergedTemplate = template.mergeSettings(mergedSettings);
+        MetadataIndexTemplateService.validateTemplate(
+            mergedTemplate.template().settings(),
+            mergedTemplate.template().mappings(),
+            indicesService
+        );
+
+        templateBuilder.settings(mergedSettingsBuilder);
+        return dataStream.copy().setSettings(mergedSettings).build();


This seems to duplicate a lot of the task implementation starting at line 148. Anything we can do to refactor that to reduce the duplication?

Is it possible to push down the dry run logic all the way to the task execution so there's no diverging path?

> Is it possible to push down the dry run logic all the way to the task execution so there's no diverging path?

I'm on the fence about that. That's the way I had originally written it. In dry-run mode I would just return the original cluster state. But it felt odd running through the cluster state update mechanics just to effectively do template validation. So I pulled the template validation logic out and just ran it locally in dry-run mode. But I could see arguments either way.

> This seems to duplicate a lot of the task implementation starting at line 148. Anything we can do to refactor that to reduce the duplication?

Which parts are you referring to? This code is all code I had moved out of that task logic around line 148 so that I could call it from both places w/o duplication.

> Which parts are you referring to? This code is all code I had moved out of that task logic around line 148 so that I could call it from both places w/o duplication.

Sorry, my bad -.- I cross-referenced the class in my IDE but forgot I had checked out another branch. That looks good 👍🏻

> > Is it possible to push down the dry run logic all the way to the task execution so there's no diverging path?
>
> I'm on the fence about that. That's the way I had originally written it. In dry-run mode I would just return the original cluster state. But it felt odd running through the cluster state update mechanics just to effectively do template validation. So I pulled the template validation logic out and just ran it locally in dry-run mode. But I could see arguments either way.

That's fair enough. If you have tried it and it didn't look noticeably better then I'm happy to stick with it how it is unless anyone else has strong opinions :-)

> If you have tried it and it didn't look noticeably better then I'm happy to stick with it how it is unless anyone else has strong opinions :-)

I tried it again just to be sure. The problem is that when the updateSettingsExecutor is executed, the listener on my UpdateSettingsTask is notified with the updated cluster state. And since I don't want to actually update the cluster state in dry_run mode, that's the old cluster state. So I have no access to the data stream. I could do something like modify the UpdateSettingsTask with the new data stream so that it is available to the listener, but that gets kind of ugly (and those tasks aren't meant to be updated).
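
The design settled on in this thread, one side-effect-free merge-and-validate helper called directly in dry-run mode and from inside the state update task otherwise, can be sketched roughly as follows. Everything here is an assumption made for illustration: `SharedValidationSketch`, `mergeAndValidate`, and `submitTask` are hypothetical names, a map of maps stands in for cluster state, and the shard-count check stands in for template validation.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.UnaryOperator;

// Hypothetical model: "cluster state" is a map of data stream name -> settings.
class SharedValidationSketch {
    private Map<String, Map<String, String>> state = new TreeMap<>();

    /** Shared helper with no side effects: merge the overrides, then validate the result. */
    static Map<String, String> mergeAndValidate(Map<String, String> existing, Map<String, String> overrides) {
        Map<String, String> merged = new TreeMap<>(existing);
        merged.putAll(overrides);
        String shards = merged.get("index.number_of_shards");
        if (shards != null && Integer.parseInt(shards) < 1) {
            throw new IllegalArgumentException("index.number_of_shards must be >= 1");
        }
        return merged;
    }

    /** Stand-in for submitting a state update task: the task maps old state to new state. */
    private void submitTask(UnaryOperator<Map<String, Map<String, String>>> task) {
        state = task.apply(state);
    }

    Map<String, String> update(String name, Map<String, String> overrides, boolean dryRun) {
        if (dryRun) {
            // Run the shared helper locally; no task is submitted and state is untouched.
            return mergeAndValidate(state.getOrDefault(name, Map.of()), overrides);
        }
        submitTask(current -> {
            Map<String, Map<String, String>> next = new TreeMap<>(current);
            next.put(name, mergeAndValidate(current.getOrDefault(name, Map.of()), overrides));
            return next;
        });
        return state.get(name);
    }

    /** Returns what is actually stored for the data stream, or null if nothing is stored. */
    Map<String, String> stored(String name) {
        return state.get(name);
    }
}
```

Had the dry-run branch gone through `submitTask` and returned the state unchanged, the merged settings would exist only inside the task. That mirrors the difficulty described above, where the listener sees only the old cluster state.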

lukewhiting

Changes LGTM 👍🏻 Nice work!

joegallo · 2025-06-10T17:10:06Z

server/src/main/java/org/elasticsearch/action/datastreams/UpdateDataStreamSettingsAction.java

@@ -72,13 +83,21 @@ public Request(StreamInput in) throws IOException {
            super(in);
            this.dataStreamNames = in.readStringArray();
            this.settings = Settings.readSettingsFromStream(in);
+            if (in.getTransportVersion().onOrAfter(TransportVersions.SETTINGS_IN_DATA_STREAMS)) {


Shouldn't this be TransportVersions.SETTINGS_IN_DATA_STREAMS_DRY_RUN?
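
To illustrate why the choice of constant matters here, the following self-contained sketch models a version-gated optional field with plain `java.io` streams. It assumes made-up version numbers (100 and 101) and hypothetical helper names (`VersionGateSketch`, `write`, `readDryRun`). It is not Elasticsearch's `StreamInput`/`StreamOutput` API, and it does not claim to reproduce the PR's actual behavior.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch; the version numbers are invented and are not Elasticsearch's.
class VersionGateSketch {
    static final int SETTINGS_IN_DATA_STREAMS = 100;         // hypothetical: request type introduced
    static final int SETTINGS_IN_DATA_STREAMS_DRY_RUN = 101; // hypothetical: dry_run field introduced

    /** Writer side: the dry_run flag is sent only to peers that know the field. */
    static byte[] write(int peerVersion, String name, boolean dryRun) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF(name);
        if (peerVersion >= SETTINGS_IN_DATA_STREAMS_DRY_RUN) {
            out.writeBoolean(dryRun);
        }
        out.flush();
        return bytes.toByteArray();
    }

    /** Reader side: 'gate' is the version constant used to decide whether the flag is present. */
    static boolean readDryRun(int peerVersion, byte[] payload, int gate) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload));
        in.readUTF(); // skip the data stream name
        return peerVersion >= gate ? in.readBoolean() : false;
    }
}
```

In this model, a reader gated on the older constant tries to read a flag that a version-100 writer never sent and runs off the end of the payload. A reader gated on the dry-run constant falls back to `false` instead.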

Adding dry_run mode for setting data stream settings

f9fdaa3

masseyke added the >non-issue, :Data Management/Data streams, and v9.1.0 labels May 21, 2025

masseyke added 4 commits May 21, 2025 13:16

testing

353d1d1

merging main

83ea386

commenting, improving readability

cc19fbd

adding a comment

f5bdf7b

masseyke marked this pull request as ready for review May 21, 2025 20:48

elasticsearchmachine added the Team:Data Management label May 21, 2025

lukewhiting requested a review from Copilot May 22, 2025 09:42

Copilot AI reviewed May 22, 2025

lukewhiting requested changes May 22, 2025

masseyke added 4 commits May 22, 2025 13:28

Merge branch 'main' into data-stream-settings-dry-run

67c4177

pulling out UpdateSingleIndexSettingsListener into its own class

051d7f0

reverting accidental change

59b8079

using helper methods for action listeners

39cf10e

masseyke requested a review from lukewhiting May 22, 2025 22:11

using wrap rather than both delegateFailure and delegateResponse

6fda776

lukewhiting approved these changes May 23, 2025

merging main

e161504

masseyke merged commit 7207692 into elastic:main May 23, 2025
18 checks passed

masseyke deleted the data-stream-settings-dry-run branch May 23, 2025 16:29

joegallo reviewed Jun 10, 2025

joegallo pushed a commit that referenced this pull request Jun 10, 2025

Adding dry_run mode for setting data stream settings (#128269)

27e9f8e

joegallo mentioned this pull request Jun 10, 2025

[8.19] Backport data stream settings #129213

Merged

joegallo pushed a commit that referenced this pull request Jun 11, 2025

Adding dry_run mode for setting data stream settings (#128269)

a92a55d

masseyke mentioned this pull request Jun 17, 2025

Claiming a transport version for the data stream settings backport #129560

Merged
