[ingest] Per processor metrics #33387

Closed
@jakelandis

Currently, when reviewing metrics for the ingest node, the finest granularity available is per pipeline [1]. Ideally the metrics would also show per-processor information.

The current behavior of GET _nodes/stats/ingest:

"nodes": {
  ...
      "ingest": {
        "total": {
          "count": 25,
          "time_in_millis": 8,
          "current": 0,
          "failed": 0
        },
        "pipelines": {
          "mypipeline3": {
            "count": 0,
            "time_in_millis": 0,
            "current": 0,
            "failed": 0
          },
          "mypipeline2": {
            "count": 6,
            "time_in_millis": 0,
            "current": 0,
            "failed": 0
          },
          "mypipeline": {
            "count": 19,
            "time_in_millis": 8,
            "current": 0,
            "failed": 0
          }
        }
      }
    }

The proposal here is to report the same stats the pipeline has for each processor as well. For example:

{
   "mypipeline":{
      "count":19,
      "time_in_millis":8,
      "current":0,
      "failed":0,
      "processors":[
         {
            "set":{
               "count":19,
               "time_in_millis":4,
               "current":0,
               "failed":0
            }
         },
         {
            "rename":{
               "count":19,
               "time_in_millis":4,
               "current":0,
               "failed":0
            }
         }
      ]
   }
}
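
To illustrate how a client might consume the proposed shape, here is a minimal Python sketch that flattens the per-processor entries of a pipeline. The function name flatten_processor_stats is hypothetical, and the input is the proposed example from this issue, not an existing API response.

```python
import json

# The proposed (not yet implemented) per-pipeline stats from this issue.
PROPOSED = json.loads("""
{
  "mypipeline": {
    "count": 19, "time_in_millis": 8, "current": 0, "failed": 0,
    "processors": [
      {"set": {"count": 19, "time_in_millis": 4, "current": 0, "failed": 0}},
      {"rename": {"count": 19, "time_in_millis": 4, "current": 0, "failed": 0}}
    ]
  }
}
""")


def flatten_processor_stats(pipelines):
    """Yield (pipeline, position, processor_name, stats) for every processor entry."""
    for pipeline_name, pipeline_stats in pipelines.items():
        for position, entry in enumerate(pipeline_stats.get("processors", [])):
            # Each list element is a single-key object: {processor_name: stats}.
            for processor_name, stats in entry.items():
                yield pipeline_name, position, processor_name, stats


rows = list(flatten_processor_stats(PROPOSED))
for pipeline_name, position, processor_name, stats in rows:
    print(pipeline_name, position, processor_name, stats["time_in_millis"])
```

Keeping "processors" as a list rather than an object preserves execution order and allows the same processor type to appear more than once in a pipeline.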

Since each processor can have an on_failure processor, and that on_failure processor can itself contain multiple processors with their own on_failure processors, the resulting JSON can (if pipelines are defined this way) become a quite verbose tree-like structure. However, I suspect that most pipelines don't nest on_failure handlers very deeply.

EDIT: After further discussion, the on_failure processors should be considered part of the parent processor.
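
One way to read "part of the parent processor" is sketched below in Python: the on_failure handlers run inside the parent's timed region and never get stats entries of their own. This is only an illustration, not the Elasticsearch implementation; execute_with_on_failure and the sample handlers are hypothetical names, and counting a recovered failure as "failed" is an assumption.

```python
import time


def execute_with_on_failure(stats, action, on_failure, doc):
    """Run action; if it raises, run the on_failure handlers.

    Everything is recorded against the parent's stats dict, so the
    handlers never appear as separate entries.
    """
    stats["count"] += 1
    start = time.monotonic()
    try:
        try:
            return action(doc)
        except Exception:
            # Assumption: a failure of the main action counts as failed
            # even when the handlers recover from it.
            stats["failed"] += 1
            if not on_failure:
                raise
            for handler in on_failure:
                doc = handler(doc)
            return doc
    finally:
        # Handler time is included in the parent's elapsed time.
        stats["time_in_millis"] += int((time.monotonic() - start) * 1000)


def failing_rename(doc):
    raise KeyError("field to rename is missing")


def mark_error(doc):
    return {**doc, "error": "rename failed"}


rename_stats = {"count": 0, "time_in_millis": 0, "failed": 0}
result = execute_with_on_failure(rename_stats, failing_rename, [mark_error], {"a": 1})
print(result, rename_stats["count"], rename_stats["failed"])
```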

Some recent additions to the ingest node's capabilities should also be addressed here (open for discussion):

  • Conditional if (Ingest: Add conditional per processor #32398). Attempts should be made to hide the implementation details of the if conditional, and to report any metrics for it, including the time taken and/or errors produced by the if conditional, as part of the processor itself.
  • Calling other pipelines via the pipeline processor (INGEST: Add Pipeline Processor #32473). The pipeline processor should show up like any other processor; however, the metrics for both the calling pipeline and the called pipeline should increase.
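
The two points above can be sketched in Python as follows. This is a simplified model for discussion, not the Elasticsearch implementation: Stats, Processor, Pipeline, run_with_stats and pipeline_processor are all hypothetical names. It also assumes that a processor's count increases on every attempt, even when its condition evaluates to false, which the issue does not specify.

```python
import time
from dataclasses import dataclass


@dataclass
class Stats:
    count: int = 0
    time_in_millis: int = 0
    current: int = 0
    failed: int = 0


def run_with_stats(stats, action, doc):
    """Run action(doc) and record count, time, current, and failures on stats."""
    stats.count += 1
    stats.current += 1
    start = time.monotonic()
    try:
        return action(doc)
    except Exception:
        stats.failed += 1
        raise
    finally:
        stats.current -= 1
        stats.time_in_millis += int((time.monotonic() - start) * 1000)


class Processor:
    def __init__(self, name, action, condition=None):
        self.name, self.action, self.condition = name, action, condition
        self.stats = Stats()

    def execute(self, doc):
        def guarded(d):
            # The condition runs inside the timed region, so its time and
            # errors are reported as part of the processor itself.
            if self.condition is None or self.condition(d):
                return self.action(d)
            return d
        return run_with_stats(self.stats, guarded, doc)


class Pipeline:
    def __init__(self, name, processors):
        self.name, self.processors = name, processors
        self.stats = Stats()

    def execute(self, doc):
        def run_all(d):
            for processor in self.processors:
                d = processor.execute(d)
            return d
        return run_with_stats(self.stats, run_all, doc)


def pipeline_processor(target):
    """A processor that delegates to another pipeline; both sets of stats update."""
    return Processor("pipeline", target.execute)


inner = Pipeline("inner", [Processor("set", lambda d: {**d, "a": 1})])
outer = Pipeline("outer", [pipeline_processor(inner)])
result = outer.execute({})
print(result, outer.stats.count, inner.stats.count)
```

Because the pipeline processor simply calls the target pipeline's execute, one document passing through the outer pipeline increments the outer pipeline, the pipeline processor, the inner pipeline, and the inner processors.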

Also, tags should be supported in the naming, likely as a suffix to the main name, for example rename:my_tag if the tag is defined.
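
A minimal sketch of that naming rule in Python, where stat_key is a hypothetical helper name and the type:tag format is the one suggested above:

```python
def stat_key(processor_type, tag=None):
    """Return 'type:tag' when a tag is defined, otherwise just 'type'."""
    return f"{processor_type}:{tag}" if tag else processor_type


print(stat_key("rename", "my_tag"))  # rename:my_tag
print(stat_key("rename"))            # rename
```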

[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html#ingest-stats
