Generate process metrics with semconv yaml #330

Merged on Jan 23, 2024 (15 commits)
3 changes: 3 additions & 0 deletions .yamllint
@@ -1,5 +1,8 @@
extends: default

ignore-from-file:
  - .gitignore

rules:
  document-start: disable
  octal-values: enable
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -13,6 +13,20 @@ release.
  ([#484](https://github.com/open-telemetry/semantic-conventions/pull/484))
- Depluralize labels for pod (`k8s.pod.labels.*`) and container (`container.labels.*`) resources
  ([#625](https://github.com/open-telemetry/semantic-conventions/pull/625))
- BREAKING: Generate process metrics from YAML
  ([#330](https://github.com/open-telemetry/semantic-conventions/pull/330))
  - Rename `process.threads` to `process.thread.count`
  - Rename `process.open_file_descriptors` to `process.open_file_descriptor.count`
  - Rename attributes for `process.cpu.*`
    - `state` to `process.cpu.state`
  - Change attributes for `process.disk.io`
    - Instead of `direction` use `disk.io.direction` from global registry
  - Change attributes for `process.network.io`
    - Instead of `direction` use `network.io.direction` from global registry
  - Rename attributes for `process.context_switches`
    - `type` to `process.context_switch_type`
  - Rename attributes for `process.paging.faults`
    - `type` to `process.paging.fault_type`
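
For consumers that still receive data under the old names, the renames in this changelog entry can be expressed as a lookup table. The following is a minimal sketch: the tables mirror the entry above, while the `migrate` function and the dictionary-based data point shape are hypothetical and not part of any OpenTelemetry API.

```python
# Hypothetical migration helper. The mapping tables mirror the changelog
# entry; the function name and data shape are illustrative assumptions.

METRIC_RENAMES = {
    "process.threads": "process.thread.count",
    "process.open_file_descriptors": "process.open_file_descriptor.count",
}

# Attribute renames are scoped per metric because `type` and `direction`
# map to different keys depending on the metric they appear on.
ATTRIBUTE_RENAMES = {
    "process.cpu.time": {"state": "process.cpu.state"},
    "process.cpu.utilization": {"state": "process.cpu.state"},
    "process.disk.io": {"direction": "disk.io.direction"},
    "process.network.io": {"direction": "network.io.direction"},
    "process.context_switches": {"type": "process.context_switch_type"},
    "process.paging.faults": {"type": "process.paging.fault_type"},
}


def migrate(metric_name, attributes):
    """Return (new_metric_name, new_attributes) for one data point."""
    renames = ATTRIBUTE_RENAMES.get(metric_name, {})
    new_attributes = {renames.get(key, key): value for key, value in attributes.items()}
    return METRIC_RENAMES.get(metric_name, metric_name), new_attributes
```

Attributes that are not listed for a metric pass through unchanged, so the helper is safe to apply to data points that already use the new names.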

### Features

220 changes: 200 additions & 20 deletions docs/system/process-metrics.md
@@ -19,9 +19,17 @@ metrics](/docs/runtime/README.md#metrics).

<!-- toc -->

- [Metric Instruments](#metric-instruments)
  * [Process](#process)
- [Attributes](#attributes)
- [Process Metrics](#process-metrics)
  * [Metric: `process.cpu.time`](#metric-processcputime)
  * [Metric: `process.cpu.utilization`](#metric-processcpuutilization)
  * [Metric: `process.memory.usage`](#metric-processmemoryusage)
  * [Metric: `process.memory.virtual`](#metric-processmemoryvirtual)
  * [Metric: `process.disk.io`](#metric-processdiskio)
  * [Metric: `process.network.io`](#metric-processnetworkio)
  * [Metric: `process.thread.count`](#metric-processthreadcount)
  * [Metric: `process.open_file_descriptor.count`](#metric-processopen_file_descriptorcount)
  * [Metric: `process.context_switches`](#metric-processcontext_switches)
  * [Metric: `process.paging.faults`](#metric-processpagingfaults)

<!-- tocstop -->

@@ -35,27 +43,199 @@ metrics](/docs/runtime/README.md#metrics).
>   * SHOULD introduce a control mechanism to allow users to opt-in to the new
>     conventions once the migration plan is finalized.

## Metric Instruments
## Process Metrics

### Process
### Metric: `process.cpu.time`

Below is a table of Process metric instruments.
This metric is [recommended][MetricRecommended].

| Name | Instrument Type ([\*](/docs/general/metrics.md#instrument-types)) | Unit | Description | Labels |
|---------------------------------|----------------------------------------------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `process.cpu.time` | Counter | s | Total CPU seconds broken down by different states. | `state`, if specified, SHOULD be one of: `system`, `user`, `wait`. A process SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels. |
| `process.cpu.utilization` | Gauge | 1 | Difference in process.cpu.time since the last measurement, divided by the elapsed time and number of CPUs available to the process. | `state`, if specified, SHOULD be one of: `system`, `user`, `wait`. A process SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels. |
| `process.memory.usage` | UpDownCounter | By | The amount of physical memory in use. | |
| `process.memory.virtual` | UpDownCounter | By | The amount of committed virtual memory. | |
| `process.disk.io` | Counter | By | Disk bytes transferred. | `direction` SHOULD be one of: `read`, `write` |
| `process.network.io` | Counter | By | Network bytes transferred. | `direction` SHOULD be one of: `receive`, `transmit` |
| `process.threads` | UpDownCounter | {thread} | Process threads count. | |
| `process.open_file_descriptors` | UpDownCounter | {count} | Number of file descriptors in use by the process. | |
| `process.context_switches` | Counter | {count} | Number of times the process has been context switched. | `type` SHOULD be one of: `involuntary`, `voluntary` |
| `process.paging.faults` | Counter | {fault} | Number of page faults the process has made. | `type`, if specified, SHOULD be one of: `major` (for major, or hard, page faults), `minor` (for minor, or soft, page faults). |
<!-- semconv metric.process.cpu.time(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.cpu.time` | Counter | `s` | Total CPU seconds broken down by different states. |
<!-- endsemconv -->

## Attributes
<!-- semconv metric.process.cpu.time(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `process.cpu.state` | string | The CPU state for this data point. A process SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels. | `system` | Recommended |

Process metrics SHOULD be associated with a [`process`](/docs/resource/process.md#process) resource whose attributes provide additional context about the process.
`process.cpu.state` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|---|
| `system` | system |
| `user` | user |
| `wait` | wait |
<!-- endsemconv -->
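
As an illustration of the attribute rule above, the sketch below builds `process.cpu.time` data points in which every point carries `process.cpu.state`. It assumes Python's standard `os.times()` as the data source, which reports only user and system time (no `wait`); the dictionary shape of a data point is hypothetical and not an OpenTelemetry SDK type.

```python
import os


def cpu_time_points():
    """Build `process.cpu.time` data points for the current process.

    Every point carries `process.cpu.state`, so the process is described
    only by data points with a state, never by a mix of both styles.
    """
    times = os.times()  # cumulative user and system CPU seconds of this process
    return [
        {
            "name": "process.cpu.time",
            "unit": "s",
            "value": times.user,
            "attributes": {"process.cpu.state": "user"},
        },
        {
            "name": "process.cpu.time",
            "unit": "s",
            "value": times.system,
            "attributes": {"process.cpu.state": "system"},
        },
    ]
```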

### Metric: `process.cpu.utilization`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.cpu.utilization(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.cpu.utilization` | Gauge | `1` | Difference in process.cpu.time since the last measurement, divided by the elapsed time and number of CPUs available to the process. |
<!-- endsemconv -->

<!-- semconv metric.process.cpu.utilization(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `process.cpu.state` | string | The CPU state for this data point. A process SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels. | `system` | Recommended |

`process.cpu.state` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|---|
| `system` | system |
| `user` | user |
| `wait` | wait |
<!-- endsemconv -->
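
The description of `process.cpu.utilization` can be read as a small formula. The sketch below assumes that "divided by the elapsed time and number of CPUs" means dividing by their product; the function and parameter names are hypothetical.

```python
def cpu_utilization(prev_cpu_seconds, curr_cpu_seconds, elapsed_seconds, cpu_count):
    """Fraction of the available CPU capacity used since the last measurement.

    prev_cpu_seconds / curr_cpu_seconds: two readings of process.cpu.time.
    elapsed_seconds: wall-clock time between the two readings.
    cpu_count: number of CPUs available to the process.
    """
    if elapsed_seconds <= 0 or cpu_count <= 0:
        raise ValueError("elapsed_seconds and cpu_count must be positive")
    return (curr_cpu_seconds - prev_cpu_seconds) / (elapsed_seconds * cpu_count)
```

For example, 2 CPU seconds consumed over a 4 second interval on 2 CPUs gives `cpu_utilization(10.0, 12.0, 4.0, 2)`, which evaluates to `0.25`.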

### Metric: `process.memory.usage`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.memory.usage(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.memory.usage` | UpDownCounter | `By` | The amount of physical memory in use. |
<!-- endsemconv -->

<!-- semconv metric.process.memory.usage(full) -->
<!-- endsemconv -->

### Metric: `process.memory.virtual`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.memory.virtual(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.memory.virtual` | UpDownCounter | `By` | The amount of committed virtual memory. |
<!-- endsemconv -->

<!-- semconv metric.process.memory.virtual(full) -->
<!-- endsemconv -->

### Metric: `process.disk.io`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.disk.io(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.disk.io` | Counter | `By` | Disk bytes transferred. |
<!-- endsemconv -->

<!-- semconv metric.process.disk.io(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| [`disk.io.direction`](../attributes-registry/disk.md) | string | The disk IO operation direction. | `read` | Recommended |

`disk.io.direction` MUST be one of the following:

| Value | Description |
|---|---|
| `read` | read |
| `write` | write |
<!-- endsemconv -->

### Metric: `process.network.io`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.network.io(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.network.io` | Counter | `By` | Network bytes transferred. |
<!-- endsemconv -->

<!-- semconv metric.process.network.io(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| [`network.io.direction`](../attributes-registry/network.md) | string | The network IO operation direction. | `transmit` | Recommended |

`network.io.direction` MUST be one of the following:

| Value | Description |
|---|---|
| `transmit` | transmit |
| `receive` | receive |
<!-- endsemconv -->
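
The direction attributes use "MUST be one of" wording, while the process-specific attributes in this document list well-known values and allow custom ones. A minimal sketch of that distinction follows; the value sets come from the tables in this document, and `check_attribute` is a hypothetical helper rather than part of any OpenTelemetry library.

```python
# Closed enums: the value MUST be one of the listed members.
CLOSED_ENUMS = {
    "disk.io.direction": {"read", "write"},
    "network.io.direction": {"transmit", "receive"},
}

# Open enums: well-known values exist, but a custom value MAY be used.
OPEN_ENUMS = {
    "process.cpu.state": {"system", "user", "wait"},
    "process.context_switch_type": {"voluntary", "involuntary"},
    "process.paging.fault_type": {"major", "minor"},
}


def check_attribute(key, value):
    """Return "well-known" or "custom"; raise ValueError if the value is not allowed."""
    if key in CLOSED_ENUMS:
        if value not in CLOSED_ENUMS[key]:
            raise ValueError(f"{key} must be one of {sorted(CLOSED_ENUMS[key])}")
        return "well-known"
    if key in OPEN_ENUMS:
        return "well-known" if value in OPEN_ENUMS[key] else "custom"
    raise ValueError(f"unknown attribute: {key}")
```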

### Metric: `process.thread.count`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.thread.count(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.thread.count` | UpDownCounter | `{thread}` | Process threads count. |
<!-- endsemconv -->

<!-- semconv metric.process.thread.count(full) -->
<!-- endsemconv -->

### Metric: `process.open_file_descriptor.count`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.open_file_descriptor.count(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.open_file_descriptor.count` | UpDownCounter | `{count}` | Number of file descriptors in use by the process. |
<!-- endsemconv -->

<!-- semconv metric.process.open_file_descriptor.count(full) -->
<!-- endsemconv -->

### Metric: `process.context_switches`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.context_switches(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.context_switches` | Counter | `{count}` | Number of times the process has been context switched. |
<!-- endsemconv -->

<!-- semconv metric.process.context_switches(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `process.context_switch_type` | string | Specifies whether the context switches for this data point were voluntary or involuntary. | `voluntary` | Recommended |

`process.context_switch_type` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|---|
| `voluntary` | voluntary |
| `involuntary` | involuntary |
<!-- endsemconv -->

### Metric: `process.paging.faults`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.paging.faults(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.paging.faults` | Counter | `{fault}` | Number of page faults the process has made. |
<!-- endsemconv -->

<!-- semconv metric.process.paging.faults(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `process.paging.fault_type` | string | The type of page fault for this data point. Type `major` is for major/hard page faults, and `minor` is for minor/soft page faults. | `major` | Recommended |

`process.paging.fault_type` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|---|
| `major` | major |
| `minor` | minor |
<!-- endsemconv -->
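
On Unix-like systems, one possible source for `process.context_switches` and `process.paging.faults` is `getrusage`. The sketch below assumes Python's standard `resource` module (not available on Windows); the data point shape and both function names are hypothetical.

```python
def rusage_points(usage):
    """Map getrusage-style counters to data points for the two metrics above.

    `usage` is any object with ru_nvcsw, ru_nivcsw, ru_majflt and ru_minflt
    fields, such as the result of resource.getrusage().
    """
    def point(name, unit, key, kind, value):
        return {"name": name, "unit": unit, "value": value, "attributes": {key: kind}}

    switches = ("process.context_switches", "{count}", "process.context_switch_type")
    faults = ("process.paging.faults", "{fault}", "process.paging.fault_type")
    return [
        point(*switches, "voluntary", usage.ru_nvcsw),
        point(*switches, "involuntary", usage.ru_nivcsw),
        point(*faults, "major", usage.ru_majflt),
        point(*faults, "minor", usage.ru_minflt),
    ]


def collect_self():
    """Collect the counters for the current process (Unix only)."""
    import resource  # imported lazily because the module is Unix-only

    return rusage_points(resource.getrusage(resource.RUSAGE_SELF))
```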

[DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/tree/v1.26.0/specification/document-status.md
[MetricRecommended]: https://github.com/open-telemetry/opentelemetry-specification/tree/v1.26.0/specification/metrics/metric-requirement-level.md#recommended
120 changes: 120 additions & 0 deletions model/metrics/process-metrics.yaml
@@ -0,0 +1,120 @@
groups:
  - id: attributes.process.cpu
    prefix: process.cpu
    type: attribute_group
    brief: "Attributes for process CPU metrics."
    attributes:
      - id: state
        brief: "The CPU state for this data point. A process SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels."
        type:
          allow_custom_values: true
          members:
            - id: system
              value: 'system'
            - id: user
              value: 'user'
            - id: wait
              value: 'wait'

  - id: metric.process.cpu.time
    type: metric
    metric_name: process.cpu.time
    brief: "Total CPU seconds broken down by different states."
    instrument: counter
    unit: "s"
    attributes:
      - ref: process.cpu.state

  - id: metric.process.cpu.utilization
    type: metric
    metric_name: process.cpu.utilization
    brief: "Difference in process.cpu.time since the last measurement, divided by the elapsed time and number of CPUs available to the process."
    instrument: gauge
    unit: "1"
    attributes:
      - ref: process.cpu.state

  - id: metric.process.memory.usage
    type: metric
    metric_name: process.memory.usage
    brief: "The amount of physical memory in use."
    instrument: updowncounter
    unit: "By"
    attributes: []

  - id: metric.process.memory.virtual
    type: metric
    metric_name: process.memory.virtual
    brief: "The amount of committed virtual memory."
    instrument: updowncounter
    unit: "By"
    attributes: []

  - id: metric.process.disk.io
    type: metric
    metric_name: process.disk.io
    prefix: process.disk
    brief: "Disk bytes transferred."
    instrument: counter
    unit: "By"
    attributes:
      - ref: disk.io.direction

  - id: metric.process.network.io
    type: metric
    metric_name: process.network.io
    brief: "Network bytes transferred."
    instrument: counter
    unit: "By"
    attributes:
      - ref: network.io.direction

  - id: metric.process.thread.count
    type: metric
    metric_name: process.thread.count
    brief: "Process threads count."
    instrument: updowncounter
    unit: "{thread}"
    attributes: []

  - id: metric.process.open_file_descriptor.count
    type: metric
    metric_name: process.open_file_descriptor.count
    brief: "Number of file descriptors in use by the process."
    instrument: updowncounter
    unit: "{count}"
    attributes: []

  - id: metric.process.context_switches
    type: metric
    metric_name: process.context_switches
    brief: "Number of times the process has been context switched."
    instrument: counter
    unit: "{count}"
    attributes:
      - id: process.context_switch_type
        brief: "Specifies whether the context switches for this data point were voluntary or involuntary."
        type:
          allow_custom_values: true
          members:
            - id: voluntary
              value: 'voluntary'
            - id: involuntary
              value: 'involuntary'

  - id: metric.process.paging.faults
    type: metric
    metric_name: process.paging.faults
    brief: "Number of page faults the process has made."
    instrument: counter
    unit: "{fault}"
    attributes:
      - id: process.paging.fault_type
        brief: "The type of page fault for this data point. Type `major` is for major/hard page faults, and `minor` is for minor/soft page faults."
        type:
          allow_custom_values: true
          members:
            - id: major
              value: 'major'
            - id: minor
              value: 'minor'
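
To show how a metric group in this YAML relates to the generated `metric_table` rows in `docs/system/process-metrics.md`, here is a minimal sketch. It is not the actual semantic-conventions generator: `render_metric_table` and `INSTRUMENT_LABELS` are hypothetical, and the group is written as a Python dict so that no YAML parser is needed.

```python
# Display names for the instrument values used in the YAML above.
INSTRUMENT_LABELS = {
    "counter": "Counter",
    "gauge": "Gauge",
    "updowncounter": "UpDownCounter",
}

# The `metric.process.thread.count` group, transcribed from the YAML.
THREAD_COUNT = {
    "id": "metric.process.thread.count",
    "type": "metric",
    "metric_name": "process.thread.count",
    "brief": "Process threads count.",
    "instrument": "updowncounter",
    "unit": "{thread}",
    "attributes": [],
}


def render_metric_table(group):
    """Render one metric group as a Markdown table like the generated docs."""
    instrument = INSTRUMENT_LABELS[group["instrument"]]
    return "\n".join([
        "| Name | Instrument Type | Unit (UCUM) | Description |",
        "| -------- | --------------- | ----------- | -------------- |",
        f"| `{group['metric_name']}` | {instrument} | `{group['unit']}` | {group['brief']} |",
    ])
```

Calling `render_metric_table(THREAD_COUNT)` yields a header plus one row whose cells match the `process.thread.count` table in the docs diff.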