Schema for metrics #474

Open
@axw

In elastic/beats#11836 we identified the need to maintain some kind of central definition of metrics that are implemented by both Beats and Elastic APM. The APM agents now produce a subset of system.cpu.*, system.memory.*, and system.process.* metrics defined by Metricbeat. The APM agents are now starting to define language/runtime-specific metrics, and Metricbeat will also start producing at least some of these.

We would like to propose extending ECS to cover metrics.

Organisation

I propose we introduce new field sets for:

  • JVM (jvm.*)
  • Go (golang.*)
  • Node.js (nodejs.*)
  • .NET (dotnet.*)
  • Python (python.*)
  • Ruby (ruby.*)

These field sets may contain a mixture of metrics (e.g. heap usage, GC pause timings) and runtime details (e.g. JVM implementation name and version).

Existing system metrics should be renamed to fit into the existing ECS field sets, e.g.

  • system.cpu.* -> host.cpu.*
  • system.memory.* -> host.memory.*
  • system.process.* -> process.*
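
A minimal sketch of how the prefix renames listed above could be applied to existing field names. The `RENAMES` table and the `rename_field` helper are hypothetical names used only for illustration; they are not part of Beats or ECS.

```python
# Prefix renames taken from the list above (old prefix -> proposed prefix).
RENAMES = {
    "system.cpu.": "host.cpu.",
    "system.memory.": "host.memory.",
    "system.process.": "process.",
}


def rename_field(name):
    """Return the proposed name for a field, or the name unchanged
    if no rename rule applies to it."""
    # Try longer prefixes first so a more specific rule would win.
    for old in sorted(RENAMES, key=len, reverse=True):
        if name.startswith(old):
            return RENAMES[old] + name[len(old):]
    return name
```

For example, `rename_field("system.memory.used.bytes")` gives `host.memory.used.bytes`, while a field with no rule, such as `system.diskio.read.count`, is returned as-is.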

Naming conventions

Ideally, we would extend the conventions documentation to cover metric naming. Specifically, we should use units consistently in metric names so that metrics are discoverable and self-documenting, both of which are important in Metrics Explorer.

One challenge here is that some existing metrics have inconsistent naming. For example:

  • system.diskio.read.count vs. system.process.cpu.user.ticks (we should pick one style: either .reads and .ticks, or .read.count and .ticks.count)
  • system.process.cpu.total.value (.value isn't meaningful, and there's no unit)
  • system.memory.total vs. system.memory.used.bytes (both are bytes, but only one says so in the name)

We should review https://github.com/elastic/beats/blob/master/docs/devguide/event-conventions.asciidoc#standardised-names, revising it for inclusion in the ECS conventions.

Key points:

  • all metrics must specify a unit. This means finding an alternative to the following rule from the Beats conventions, since "value" is not a unit:

    If a field name matches the namespace used for nested fields, add .value to the field name.

  • units are not necessarily SI; they may be domain-specific, e.g. "objects" and "mallocs" are fine
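
The "every metric must specify a unit" rule could be checked mechanically. The sketch below assumes the unit is always the last dot-separated segment of the name; the `KNOWN_UNITS` allow-list and the `has_unit` helper are hypothetical, and the real list of accepted units would come from the revised conventions.

```python
# Hypothetical allow-list of unit suffixes. It mixes standard units with
# domain-specific ones ("objects", "mallocs"), as the key points permit.
# Whether both "count" and "ticks" belong here is one of the open naming
# questions; both are accepted in this sketch.
KNOWN_UNITS = {
    "bytes", "pct", "ms", "us", "ns",
    "count", "ticks", "objects", "mallocs",
}


def has_unit(name):
    """Return True if the last segment of a metric name is a known unit."""
    last_segment = name.rsplit(".", 1)[-1]
    return last_segment in KNOWN_UNITS
```

Under these assumptions, `system.memory.used.bytes` passes, while `system.memory.total` and `system.process.cpu.total.value` are flagged because neither "total" nor "value" is a unit.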

As an alternative to including units in names, we could wait for elastic/elasticsearch#31244 or elastic/elasticsearch#33267. However, there are existing fields in ECS that include units (network.bytes, http.request.bytes, ...), so it may be best not to wait, and to make that change across the board in a future revision.

Open questions:

  • What do we do about existing non-compliant naming? A few options:
    • Rename in Elastic Stack 8.0. This may mean breaking existing dashboards, or introducing field aliases. This option is implied by the proposal above to split the system metrics into the host.* and process.* field sets.
    • Include existing metrics as-is, with a TODO to make their naming consistent in the future.
    • Don't include these existing metrics in ECS.
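
If the rename option is chosen, field aliases could keep old names queryable. The sketch below builds the `properties` section of an Elasticsearch mapping using the `alias` field type, with one alias per old name pointing at its new name. The `alias_properties` helper and its input format are hypothetical, and the sketch assumes no old name is also a parent of another old name.

```python
def alias_properties(renames):
    """Build a nested mapping 'properties' tree in which each old field
    name is an alias whose path is the corresponding new field name.

    `renames` maps old dotted names to new dotted names.
    """
    root = {}
    for old, new in renames.items():
        *parents, leaf = old.split(".")
        node = root
        # Walk (or create) one object level per parent segment.
        for part in parents:
            node = node.setdefault(part, {"properties": {}})["properties"]
        node[leaf] = {"type": "alias", "path": new}
    return root
```

Passing `{"system.memory.used.bytes": "host.memory.used.bytes"}` yields a tree nested as system, memory, used, with a `bytes` leaf of `{"type": "alias", "path": "host.memory.used.bytes"}`. The alias target must exist as a concrete field in the same index mapping.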

Metrics implementation guide

The definition of a metric is sometimes subtle and platform-specific; without a guide, inconsistencies easily arise between implementations. We should provide a detailed guide to calculating these metrics, either in ECS or in a companion document linked from ECS.
