
[ML] Add an estimate model memory endpoint for anomaly detection #53219

Closed
@droberts195

Description


At present the ML UI has functionality to calculate a rough estimate of the model memory requirement for certain types of anomaly detection jobs. However, it covers neither all detector functions nor population jobs.

The ML API in Elasticsearch should provide an endpoint that encapsulates the various formulas, can be extended to cover all possible configurations, and can be kept up to date when model sizes change.

The inputs to this endpoint will be:

  1. An analysis_config, in the same format as would be provided to the create job endpoint - documented in https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-put-job.html#ml-put-job-path-parms
  2. Overall cardinalities for the by, over and partition fields
  3. Max bucket cardinalities for influencer fields that are not also by, over or partition fields

An example of the proposed request format is:

POST _ml/anomaly_detectors/_estimate_model_memory
{
  "analysis_config": {
    "bucket_span": "10m",
    "detectors": [
      {
        "function": "sum",
        "field_name": "bytes",
        "partition_field_name": "src_ip"
      }
    ],
    "influencers": [ "src_ip", "dest_ip" ]
  },
  "overall_cardinality": {
    "src_ip": 567483
  },
  "max_bucket_cardinality": {
    "dest_ip": 7456
  }
}
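To make the relationship between inputs 2 and 3 and the request body concrete, here is a small illustrative helper (not part of Elasticsearch or any of its clients) that, given an analysis_config, works out which fields the caller would need overall cardinalities for (by/over/partition fields) and which need max bucket cardinalities (influencers that are not also by/over/partition fields):

```python
# Hypothetical helper showing which cardinality maps a caller of the
# proposed endpoint would need to populate. The field-name keys match the
# documented analysis_config format; the function itself is illustrative.

def required_cardinality_fields(analysis_config):
    """Split field names into the two cardinality maps the request expects."""
    # by/over/partition fields need overall cardinalities
    split_fields = set()
    for detector in analysis_config.get("detectors", []):
        for key in ("by_field_name", "over_field_name", "partition_field_name"):
            if key in detector:
                split_fields.add(detector[key])
    # influencers that are not also by/over/partition fields need
    # max bucket cardinalities
    influencer_only = {
        f for f in analysis_config.get("influencers", []) if f not in split_fields
    }
    return {
        "overall_cardinality": sorted(split_fields),
        "max_bucket_cardinality": sorted(influencer_only),
    }

config = {
    "bucket_span": "10m",
    "detectors": [
        {"function": "sum", "field_name": "bytes", "partition_field_name": "src_ip"}
    ],
    "influencers": ["src_ip", "dest_ip"],
}
print(required_cardinality_fields(config))
# {'overall_cardinality': ['src_ip'], 'max_bucket_cardinality': ['dest_ip']}
```

Run against the example config above, this yields src_ip in overall_cardinality and dest_ip in max_bucket_cardinality, matching the example request.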

An example of the proposed response format is:

{
  "model_memory_estimate": "836mb"
}
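Since the estimate would come back as a byte-size string such as "836mb", a client that wants a numeric value has to parse it. A minimal sketch, assuming the standard 1024-based Elasticsearch byte units (b, kb, mb, gb, ...); this helper is illustrative and not part of any Elasticsearch client library:

```python
# Illustrative parser for Elasticsearch-style byte-size strings,
# assuming 1024-based units as used by Elasticsearch's ByteSizeValue.

UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4, "pb": 1024**5}

def parse_byte_size(value):
    """Convert a byte-size string such as '836mb' into a number of bytes."""
    value = value.strip().lower()
    # try longer suffixes first so 'mb' is not mistaken for 'b'
    for suffix in sorted(UNITS, key=len, reverse=True):
        if value.endswith(suffix):
            return int(float(value[: -len(suffix)]) * UNITS[suffix])
    raise ValueError(f"unrecognized byte-size string: {value!r}")

print(parse_byte_size("836mb"))  # 876609536
```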
