[Fleet] Add namespace-specific index and component templates #121118

Closed as not planned

Description

Fleet and Elastic Agent users need a mechanism to customize how their data is ingested, mapped, and stored, and these customizations must be preserved across Stack and integration package upgrades. This issue outlines how we plan to structure index and component templates to support user customizations to Fleet-managed data streams at namespace-level granularity.

Scope

The goal is to provide a future-proof naming scheme and template structure that will allow users to add the following customizations to Fleet-managed data streams:

  • Mappings (additive and non-additive)
  • ILM policy
  • Number of replicas, primaries, and routing shards
  • Refresh interval
  • Other general index settings (query.*, etc.)

This scheme does not allow customizing:

  • Ingest pipelines
    • Elasticsearch does not support an arbitrary number of ingest node pipelines, so the only existing way to customize ingest pipelines is to modify the one installed by Fleet, which will not be preserved across package upgrades. See "Specify multiple ingest pipelines for a data stream" (elasticsearch#61185).
    • If Elasticsearch were to add a default_pipelines setting that accepts an array of pipelines, this customization scheme would likely be compatible with it.

Design

Existing scheme (as of 8.2)

The existing scheme that we use in Fleet today installs a single index template for each dataset in a package that matches data streams for all namespaces. It has the following properties:

  • name: <type>-<dataset>
  • matches: <type>-<dataset>-*
  • priority: 200
  • Component templates (highest to lowest precedence):
    • .fleet_agent_id_verification-1
      • final pipeline & mappings for agent_id verification (can optionally be disabled in kibana.yml)
    • .fleet_globals-1
      • global settings and mappings applied to every data stream (eg. event.ingested)
    • <type>-<dataset>@custom
      • user-defined customizations (settings and/or mappings - for all namespaces)
    • <type>-<dataset>@package
      • package-defined mappings and settings
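
As a rough illustration, the base template above could be written as a request body for Elasticsearch's composable index template API. This is only a sketch, not Fleet's actual code: the helper name `base_index_template`, the example dataset, and the empty `data_stream` object are assumptions. Elasticsearch merges `composed_of` in order, with later entries taking precedence, so the list above (highest precedence first) appears reversed here.

```python
import json


def base_index_template(type_: str, dataset: str) -> dict:
    """Build a body for PUT _index_template/<type>-<dataset> (hypothetical helper)."""
    name = f"{type_}-{dataset}"
    return {
        # Matches the data streams of every namespace for this dataset.
        "index_patterns": [f"{name}-*"],
        "priority": 200,
        # Assumption: the template creates data streams rather than plain indices.
        "data_stream": {},
        # Later entries override earlier ones, so the lowest precedence comes first.
        "composed_of": [
            f"{name}@package",
            f"{name}@custom",
            ".fleet_globals-1",
            ".fleet_agent_id_verification-1",
        ],
    }


template = base_index_template("logs", "nginx.access")
print(json.dumps(template, indent=2))
```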

New proposed scheme

In order to preserve user customizations across upgrades, it’s important that we store their overrides in a separate component template that Fleet can copy over to new versions of the package’s index template. In this updated scheme, we will add an additional index template that is namespace-specific and of higher priority than the base template:

  • name: <type>-<dataset>-<namespace>
  • matches: <type>-<dataset>-<namespace>
  • priority: 250
  • Components (highest to lowest precedence):
    • .fleet_agent_id_verification-1
      • final pipeline & mappings for agent_id verification (can optionally be disabled in kibana.yml)
    • .fleet_globals-1
      • global settings and mappings applied to every data stream (eg. event.ingested)
    • <type>-<dataset>-<namespace>@custom
      • namespace-specific user-defined customizations
    • <type>-<dataset>@custom
      • user-defined customizations (settings and/or mappings - for all namespaces)
    • <type>-<dataset>@package
      • package-defined mappings and settings
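
To show how the two templates would interact, here is a minimal sketch of priority-based selection: when several index templates match a data stream name, Elasticsearch applies the one with the highest priority. The helper names are hypothetical, and `fnmatchcase` only approximates Elasticsearch's wildcard matching.

```python
from fnmatch import fnmatchcase


def namespace_index_template(type_: str, dataset: str, namespace: str) -> dict:
    """Describe the proposed namespace-specific index template (hypothetical helper)."""
    base = f"{type_}-{dataset}"
    name = f"{base}-{namespace}"
    return {
        "name": name,
        "index_patterns": [name],  # exact match, no wildcard
        "priority": 250,  # higher than the base template's 200
        # Lowest precedence first; later entries override earlier ones.
        "composed_of": [
            f"{base}@package",
            f"{base}@custom",
            f"{name}@custom",
            ".fleet_globals-1",
            ".fleet_agent_id_verification-1",
        ],
    }


def select_template(data_stream: str, templates: list):
    """Return the matching template with the highest priority, or None."""
    matching = [
        t for t in templates
        if any(fnmatchcase(data_stream, p) for p in t["index_patterns"])
    ]
    return max(matching, key=lambda t: t["priority"]) if matching else None


base_template = {
    "name": "logs-nginx.access",
    "index_patterns": ["logs-nginx.access-*"],
    "priority": 200,
}
prod_template = namespace_index_template("logs", "nginx.access", "prod")
templates = [base_template, prod_template]

prod_choice = select_template("logs-nginx.access-prod", templates)
default_choice = select_template("logs-nginx.access-default", templates)
print(prod_choice["name"], default_choice["name"])
```

Data streams in a namespace without its own template still fall back to the wildcard base template, which is why the base template has to stay installed.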

During package upgrades, Fleet would preserve the contents of both the ‘global’ custom template (<type>-<dataset>@custom) and the namespace-specific ones (<type>-<dataset>-<namespace>@custom) while replacing all of the other templates (including the index template). This would allow the user’s customizations to be preserved and to override any package-specific settings and mappings.
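
The upgrade rule could be sketched roughly as follows, treating component templates as a name-to-body mapping. The function name and the sample bodies are hypothetical; the only rule taken from the proposal is that anything ending in @custom is kept and everything else is replaced.

```python
def plan_component_upgrade(installed: dict, incoming: dict) -> dict:
    """Return the component templates that should exist after a package upgrade."""
    result = {}
    # Package-owned templates always come from the new package version.
    for name, body in incoming.items():
        if not name.endswith("@custom"):
            result[name] = body
    # User-owned templates (global and namespace-specific) are carried over untouched.
    for name, body in installed.items():
        if name.endswith("@custom"):
            result[name] = body
    # Create placeholders only for @custom templates that do not exist yet.
    for name, body in incoming.items():
        if name.endswith("@custom") and name not in result:
            result[name] = body
    return result


installed = {
    "logs-nginx.access@package": {"template": {"settings": {"index.refresh_interval": "5s"}}},
    "logs-nginx.access@custom": {"template": {"settings": {"index.number_of_replicas": 2}}},
    "logs-nginx.access-prod@custom": {"template": {"settings": {"index.lifecycle.name": "prod-policy"}}},
}
incoming = {
    "logs-nginx.access@package": {"template": {"settings": {"index.refresh_interval": "10s"}}},
    "logs-nginx.access@custom": {"template": {}},
}

upgraded = plan_component_upgrade(installed, incoming)
for name in sorted(upgraded):
    print(name, upgraded[name])
```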

Like the ‘global’ custom template we offer today, we would allow users to directly edit the namespace-specific templates with arbitrary settings and mappings in order to override those supplied by the package. We would also use the template to store customizations that we plan to support directly in the UI (eg. setting the ILM policy).

We will not remove the base index template we install today that matches a wildcard namespace (<type>-<dataset>-*) because Elastic Agent standalone requires this template to be installed.

Changing a namespace for an existing integration policy

If a user edits an existing integration policy to point to a new namespace, we can offer them the option to copy over any customizations from the previous namespace’s <type>-<dataset>-<namespace>@custom template. We would not delete the old templates since this could affect the existing data streams and indices or any standalone agents ingesting data into this namespace.
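
A minimal sketch of the copy step, assuming component templates are held in a name-to-body mapping. The function name is hypothetical, and leaving an existing target template untouched is an assumption; the proposal treats that case as an open question.

```python
import copy


def copy_namespace_customizations(components: dict, type_: str, dataset: str,
                                  old_ns: str, new_ns: str) -> dict:
    """Copy the old namespace's @custom template to the new namespace."""
    source = f"{type_}-{dataset}-{old_ns}@custom"
    target = f"{type_}-{dataset}-{new_ns}@custom"
    updated = dict(components)
    # Assumption: never overwrite customizations that already exist for the new namespace.
    if source in components and target not in components:
        updated[target] = copy.deepcopy(components[source])
    # The source template is kept; existing data streams may still rely on it.
    return updated


components = {
    "logs-nginx.access-staging@custom": {
        "template": {"settings": {"index.number_of_replicas": 0}}
    },
}
result = copy_namespace_customizations(
    components, "logs", "nginx.access", "staging", "prod"
)
print(sorted(result))
```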

As a separate enhancement, we could offer a ‘cleanup’ UI either in Fleet or Stack Management that shows index templates that are not currently in use.

Customize API

In order to facilitate automated usage of this scheme, we should provide a high-level package customization API in Kibana Fleet that allows admins to make customizations without worrying about the low-level details of how the templates are configured, whether or not a data stream needs to be rolled over, or how to apply the setting changes retroactively to backing indices. The main use case for this is standalone Agent usage. It may also be used to power in-app features for making customizations (e.g. setting the ILM policy).

# Write custom settings and mappings to all namespaces
# Writes to `<type>-<dataset>@custom` templates
PUT /api/fleet/epm/nginx/customize
{
  "settings": { … },
  "mappings": { … }
}

# Add or update a namespace for an integration, creates the namespace-specific templates
# Write custom settings and mappings to namespace
# Writes to `<type>-<dataset>-<namespace>@custom` templates
PUT /api/fleet/epm/nginx/customize/namespace/foo
{
  "settings": { … },
  "mappings": { … }
}

# Removes a namespace, deleting namespace-specific templates
# Does not delete data indices or data streams
DELETE /api/fleet/epm/nginx/customize/namespace/foo

All of the other Fleet APIs should also create these namespaces and their templates automatically. For example, if an integration policy is added for the nginx package on the bar namespace, the POST /api/fleet/package_policies API should also create the appropriate namespace templates if they don't already exist.
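
One way to picture the "create if missing" behavior is an idempotent helper that derives the namespace-specific @custom template names for every data stream in a package and only creates the ones that are absent. The helper names, the in-memory `existing` store, and the list of nginx data streams are all illustrative assumptions.

```python
def custom_template_names(data_streams, namespace=None):
    """List the @custom component templates a customize call would write to."""
    suffix = f"-{namespace}" if namespace else ""
    return [f"{type_}-{dataset}{suffix}@custom" for type_, dataset in data_streams]


def ensure_namespace_templates(existing: dict, data_streams, namespace: str) -> list:
    """Create empty namespace-specific templates that are missing; return their names."""
    created = []
    for name in custom_template_names(data_streams, namespace):
        if name not in existing:
            existing[name] = {"template": {}}  # never overwrite an existing template
            created.append(name)
    return created


# Illustrative data streams for the nginx package.
nginx_streams = [
    ("logs", "nginx.access"),
    ("logs", "nginx.error"),
    ("metrics", "nginx.stubstatus"),
]

store = {"logs-nginx.access-bar@custom": {"template": {"settings": {"index.number_of_replicas": 2}}}}
first_run = ensure_namespace_templates(store, nginx_streams, "bar")
second_run = ensure_namespace_templates(store, nginx_streams, "bar")
print(first_run, second_run)
```

Because the helper skips templates that already exist, calling it from every API that touches a namespace would be safe to repeat.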

There are additional use cases for this API outside of index templates. For example, there have been other requests for namespace-specific transforms. We should design this API to accommodate future use cases easily.

Upgrade considerations

For packages that were installed before this scheme was introduced, Fleet should automatically add the appropriate namespace-specific index and component templates in order to facilitate a consistent experience for end-users. See #121099

For upgrades where any @custom components already exist, they should be retained so that they are still present once the new package version is installed. This also means that existing @custom templates should not be overwritten.

Open questions

  • During normal use of the product, when should namespace-specific templates be deleted?
    • If we're going to support a generic API that doesn't require integration or agent policies to point to namespaces, then I don't think we can do any automated cleanup; otherwise we could delete configuration that is in use by a standalone agent.
  • There are separate @custom component templates for each data stream in an integration package, while the API design proposed here would apply to the entire integration. This can present problems if a user manually edits a single component template: the data streams are then no longer in sync and the source of truth becomes ambiguous. How would we solve this?
    • Have a single, managed component template that is used for customizations that apply to the entire integration. Leave the @custom templates unmanaged and never edit them. (@joshdover votes for this one)
    • Store the customizations set on this API in a Kibana Saved Object (SO) and use this as the source of truth. Manual user edits to @custom templates would then be merged in after the settings from this SO. This would allow manual additions and modifications to @custom templates to be preserved; however, deletions would be lost.
  • How should namespace renames work? If a user renames the namespace field on an integration policy or agent policy, should we attempt to copy any customizations on the previous namespace when creating the new namespace? If not or if the new namespace already exists, should we warn the user that settings/mappings are going to change for this data?
  • How do we handle when a new dataset is added for an existing package? Should we keep a copy of any custom settings/mappings in a Saved Object and automatically apply them to all datasets during package upgrades?
  • Should the management APIs allow changes to mappings? If so, when and how would the user expect these to take effect? For example, would a rollover be automatic?
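
For the Saved Object option raised in the source-of-truth question above, the following sketch shows why additions and modifications survive the merge while deletions do not. The `deep_merge` helper and the sample settings are hypothetical and are not part of any existing Fleet API.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override on top of base without mutating either."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Settings stored by the customize API in the Saved Object (source of truth).
saved_object = {
    "settings": {"index": {"number_of_replicas": 2, "refresh_interval": "30s"}}
}
# The @custom template after manual edits: replicas changed, codec added,
# and refresh_interval deleted by the user.
manual_custom = {
    "settings": {"index": {"number_of_replicas": 1, "codec": "best_compression"}}
}

effective = deep_merge(saved_object, manual_custom)
print(effective)
```

The modified and added keys win, but the deleted refresh_interval comes back from the Saved Object because a merge cannot tell a deleted key from one that was never set.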
Labels: Team:Fleet, enhancement