Snapshot lifecycle management #38461

Closed
@dakrone

ILM has been included in Elasticsearch, which allows us to manage the lifecycle
of an index. However, this lifecycle management does not currently include
periodic snapshots of the index.

In order to provide a full replacement for other periodic cluster management
tools (such as Curator), we should add snapshot management to
Elasticsearch.

Ideally this would fall under the same sort of management that ILM provides. The
difference, however, is that snapshots are multi-index, whereas index lifecycle
policies are applied to a single index (and all actions are executed on a single
index).

We need a way of specifying periodic and/or scheduled snapshots of a given set
of indices using a specific repository, perhaps something like this (all of the
API is made up):

PUT /_slm/policy/snapshot-every-day
{
  // Run this every day at 2:30am
  "schedule": "0 30 2 * * ?",

  // What the snapshot should be named, supporting date-math
  "name": "<production-snap-{now/d}>",

  // Which snapshot repository to use for the snapshot
  "repository": "my-s3-repository",

  // "config" is a map of all the options that the regular snapshot API takes
  "config": {
    "indices": ["foo-*", "important"],
    "ignore_unavailable": true,
    "include_global_state": false
  }
}

Elasticsearch will then manage taking snapshots of the given indices for the
repository on the schedule specified. The status of the snapshots would have to
be stored somewhere, likely in an index (.tasks perhaps?).
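
To make the intent of the made-up policy concrete, here is a minimal Python sketch of how one scheduled run might turn the policy into a request against the existing create-snapshot endpoint (`PUT /_snapshot/<repository>/<snapshot>`). The helper names `resolve_snapshot_name` and `build_snapshot_request` are hypothetical, only the `{now/d}` expression is handled, and the `yyyy.MM.dd` output format is an assumption borrowed from the default for date-math index names.

```python
import re
from datetime import datetime, timezone

# Hypothetical helper: supports only the "{now/d}" expression from the proposal.
_NOW_DAY = re.compile(r"\{now/d\}")


def resolve_snapshot_name(template: str, now: datetime) -> str:
    """Resolve a '<prefix-{now/d}>' style template to a concrete snapshot name."""
    if not (template.startswith("<") and template.endswith(">")):
        return template  # plain name, nothing to resolve
    inner = template[1:-1]
    # Assumption: same default format as date-math index names (e.g. 2019.02.05).
    day = now.astimezone(timezone.utc).strftime("%Y.%m.%d")
    return _NOW_DAY.sub(day, inner)


def build_snapshot_request(policy: dict, now: datetime) -> tuple:
    """Return (path, body) for the create-snapshot call of one policy run."""
    name = resolve_snapshot_name(policy["name"], now)
    path = "/_snapshot/{}/{}".format(policy["repository"], name)
    # "config" is passed through unchanged as the snapshot request body.
    return path, dict(policy["config"])


policy = {
    "schedule": "0 30 2 * * ?",
    "name": "<production-snap-{now/d}>",
    "repository": "my-s3-repository",
    "config": {
        "indices": ["foo-*", "important"],
        "ignore_unavailable": True,
        "include_global_state": False,
    },
}

run_time = datetime(2019, 2, 5, 2, 30, tzinfo=timezone.utc)
path, body = build_snapshot_request(policy, run_time)
```

Passing `config` through untouched keeps the policy format in step with whatever options the regular snapshot API accepts, which matches the comment in the proposed request body.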

Some other things that would be nice (but not required) to support:

  • Snapshots every N minutes, where N only starts counting from the completion of
    the previous snapshot (for example, a snapshot every 30 minutes that takes 4
    minutes to complete would start at 00:00, and the next would start at
    00:34, 30 minutes after the completion of the previous snapshot).
  • Retention of snapshots: specifying something like "max_count": 10 to
    keep the last 10 snapshots, or "max_age": "7d" to keep a week's
    worth of snapshots. Deletion of old snapshots would be managed by ES.
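
Both optional behaviours can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `next_start` and `snapshots_to_delete` are hypothetical, snapshots are represented as plain `(name, start_time)` tuples, and a snapshot is deleted if it violates either `max_count` or `max_age` (the proposal does not say how the two options would combine).

```python
from datetime import datetime, timedelta


def next_start(previous_completed: datetime, interval: timedelta) -> datetime:
    """Interval counts from completion of the previous snapshot, not its start."""
    return previous_completed + interval


def snapshots_to_delete(snapshots, now, max_count=None, max_age=None):
    """Return names of snapshots that fall outside the retention rules.

    snapshots: iterable of (name, start_time) tuples (hypothetical shape).
    Assumption: violating either rule is enough to mark a snapshot for deletion.
    """
    newest_first = sorted(snapshots, key=lambda snap: snap[1], reverse=True)
    doomed = []
    for position, (name, taken_at) in enumerate(newest_first):
        too_many = max_count is not None and position >= max_count
        too_old = max_age is not None and now - taken_at > max_age
        if too_many or too_old:
            doomed.append(name)
    return doomed


# The example from the text: started at 00:00, took 4 minutes, interval of 30.
completed = datetime(2019, 1, 1, 0, 0) + timedelta(minutes=4)
following = next_start(completed, timedelta(minutes=30))

history = [
    ("snap-a", datetime(2019, 1, 1)),
    ("snap-b", datetime(2019, 1, 5)),
    ("snap-c", datetime(2019, 1, 9)),
]
expired = snapshots_to_delete(
    history, now=datetime(2019, 1, 10), max_age=timedelta(days=7)
)
```

Here `following` lands at 00:34, and with a seven-day `max_age` only `snap-a` is selected for deletion.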

Task Checklist
