Reindexing steps for TSDB enabled data streams for conflicting fields #8085

Open
@ali786XI

Main Issue

Reindexing steps document

Related issues

Description

This issue provides detailed reindexing steps for TSDB-enabled data streams. Follow them when field conflicts are found because of mismatched data types.

For example, if the host.ip field is shown as conflicted under the metrics-* data view, the conflict can be resolved by reindexing the indices of the affected data stream.

To reindex the data, the following steps must be performed.

Step 1
Stop the data stream: go to Integrations -> <integration_name> -> Integration policies, open the configuration of the integration, disable the impacted data stream, and save the integration.

Step 2
Copy the data into a temporary index by running the following request in Dev Tools.

POST _reindex
{
  "source": {
    "index": "<index_name>"
  },
  "dest": {
    "index": "temp_index"
  }
}  

Example:

POST _reindex
{
  "source": {
    "index": "metrics-dummy.cluster-default"
  },
  "dest": {
    "index": "temp_index"
  }
}

Step 3
Note the following values from the backing indices and the index template of the data stream to be reindexed, and adjust a copy of the template accordingly.

  • Copy the index template (under Preview) of the data stream to be reindexed from Stack Management -> Index Management -> Index Templates.
  • Set the index.time_series.start_time and index.time_series.end_time index settings to cover the lowest and highest @timestamp values in the old data stream: one second before the lowest value and one second after the highest value.
  • Set the index.number_of_shards index setting to the sum of all primary shards of all backing indices of the old data stream.
  • Set index.number_of_replicas to zero and unset the index.lifecycle.name index setting.
  • Set the index pattern to match the applicable data streams, for example:
"index_patterns": ["metrics-dummy.cluster-*"]
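The values in the list above can be derived mechanically. The following is a minimal Python sketch, not part of the original procedure. It assumes the lowest and highest @timestamp values were obtained as epoch milliseconds (for example from the "value" field of a min/max aggregation, whose request body is shown as TIMESTAMP_RANGE_QUERY) and that the primary shard count of each backing index is already known. The names TIMESTAMP_RANGE_QUERY, to_iso and clone_template_settings are hypothetical helpers introduced only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Request body for `GET <data_stream>/_search`: returns the lowest and
# highest @timestamp values of the old data stream as epoch milliseconds.
TIMESTAMP_RANGE_QUERY = {
    "size": 0,
    "aggs": {
        "min_ts": {"min": {"field": "@timestamp"}},
        "max_ts": {"max": {"field": "@timestamp"}},
    },
}

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_iso(epoch_millis):
    """Format epoch milliseconds as an ISO 8601 UTC timestamp with millisecond precision."""
    dt = EPOCH + timedelta(milliseconds=int(epoch_millis))
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"


def clone_template_settings(min_ts_millis, max_ts_millis, primary_shards_per_index):
    """Build the index settings described in Step 3 (hypothetical helper)."""
    return {
        # One second before the lowest and after the highest @timestamp.
        "index.time_series.start_time": to_iso(min_ts_millis - 1000),
        "index.time_series.end_time": to_iso(max_ts_millis + 1000),
        # Sum of primary shards across all backing indices.
        "index.number_of_shards": sum(primary_shards_per_index),
        "index.number_of_replicas": 0,
    }
```

For instance, clone_template_settings(1682899200000, 1682902800500, [1, 1]) would describe a data stream with two backing indices of one primary shard each; the resulting settings still have to be copied into the template body by hand.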

Step 4
Create the index template after setting all the parameters mentioned in Step 3. Because this template is a clone of the original, it is named metrics-dummy.cluster-copy here.

POST _index_template/metrics-dummy.cluster-copy
{
  "index_patterns": ["metrics-dummy.cluster-*"],
  "template": {
    "settings": {
      "index": {
        "number_of_shards" : 2,
        "number_of_replicas": 0,
        "mode": "time_series",
        "codec": "best_compression",
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_hot"
            }
          }
        },
….
….
….
….
                    "count": {
                      "type": "long",
                      "time_series_metric": "counter"
                    }
                  }
                },
                "uptime": {
                  "properties": {
                    "sec": {
                      "type": "long",
                      "time_series_metric": "gauge"
                    }
                  }
                }
              }
            }
          }
        }
      }
    },
    "aliases": {}
  }
}

Step 5
Navigate to the newly created index template under Stack Management -> Index Management -> Index Templates and click Manage -> Edit.
Under Logistics, enable Create data stream and set the priority to 300 (it must be higher than the priority of the metrics-dummy.cluster-default index template).

Step 6
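Delete the existing data stream by running the following request in Dev Tools. Deleting a data stream also deletes its backing indices, so make sure the copy in Step 2 has completed first.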

DELETE /_data_stream/<data_stream>

Example:

DELETE /_data_stream/metrics-dummy.cluster-default

Step 7
Copy the data from the temporary index to the new data stream by running the following request in Dev Tools.

POST _reindex
{
  "conflicts": "proceed",
  "source": {
    "index": "temp_index"
  },
  "dest": {
    "index": "<index_name>",
    "op_type": "create"
  }
}

Example:

POST _reindex
{
  "conflicts": "proceed",
  "source": {
    "index": "temp_index"
  },
  "dest": {
    "index": "metrics-dummy.cluster-default",
    "op_type": "create"
  }
}
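Because the request above sets "conflicts": "proceed", documents that hit a version conflict are skipped rather than aborting the reindex, so the response is worth inspecting. The following is a small Python sketch of that check, assuming the response body has been parsed into a dict; summarize_reindex is a hypothetical helper, and the sample values in the usage note are made up.

```python
def summarize_reindex(response):
    """Summarize a _reindex response body (hypothetical helper).

    Reads the `total`, `created`, `version_conflicts`, `timed_out` and
    `failures` fields of the response.
    """
    total = response.get("total", 0)
    created = response.get("created", 0)
    failures = response.get("failures", [])
    return {
        "total": total,
        "created": created,
        "skipped_as_conflicts": response.get("version_conflicts", 0),
        "failed": len(failures),
        # Complete only if nothing failed, nothing timed out, and every
        # source document was created in the destination.
        "complete": (
            not failures
            and not response.get("timed_out", False)
            and created == total
        ),
    }
```

A made-up response such as {"total": 100, "created": 98, "version_conflicts": 2, "failures": []} would be reported as incomplete, which is a cue to investigate the two skipped documents before deleting the temporary index.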

Step 8
Verify data is reindexed completely and the conflicts are resolved.
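One way to check for remaining conflicts is the field capabilities API, for example GET metrics-*/_field_caps?fields=host.ip: a field that is mapped with more than one type across the matched indices appears with more than one type entry in the response. The following Python sketch, assuming the response has been parsed into a dict, lists such fields. conflicting_fields is a hypothetical helper and the sample response in the usage note is made up.

```python
def conflicting_fields(field_caps_response):
    """Return {field: sorted type names} for fields mapped with more than one type."""
    conflicts = {}
    for field, caps_by_type in field_caps_response.get("fields", {}).items():
        # Each key under a field is a type name; more than one key means
        # the field has different types in different indices.
        if len(caps_by_type) > 1:
            conflicts[field] = sorted(caps_by_type)
    return conflicts
```

For a made-up response in which host.ip has both an "ip" and a "keyword" entry, the function returns {"host.ip": ["ip", "keyword"]}; an empty result means no conflicts remain among the requested fields. Document counts can be compared separately with GET temp_index/_count and GET <data_stream>/_count.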

Step 9
Navigate to the newly created index template under Stack Management -> Index Management -> Index Templates and click Manage -> Edit.
Under Logistics, unset the priority that was set in Step 5.

Step 10
Invoke the rollover API on the destination data stream without setting any conditions.

POST /<data_stream>/_rollover

Example:

POST /metrics-dummy.cluster-default/_rollover

Step 11
Delete the temporary index and the cloned index template by running the following requests in Dev Tools.

DELETE temp_index
DELETE _index_template/metrics-dummy.cluster-copy

Step 12
Start the data stream: go to Integrations -> <integration_name> -> Integration policies, open the configuration of the integration, enable the Collect <integration_name> metrics toggle, and save the integration.
