CI/CD conventions for metrics #1111
Comments
We can use label area:cicd instead of |
I currently have a pull request open to change the metrics in the Git Provider Receiver component of the OTEL Collector so that they better match the new conventions set in the registry. I think it can provide contextual implementation details for this conversation. |
Let's create additional issues for the separate concerns of metrics:
I.e., let's have smaller PRs to address them separately |
Area(s)
area:cicd
Is your change request related to a problem? Please describe.
This issue is for discussing attributes specific to metrics, as part of the CI/CD Working Group and the Semantic Conventions WG.
A challenge specific to metrics is time series cardinality when a CI/CD system reports metrics for individual builds.
Describe the solution you'd like
Following #1075 (and adjusting the vocabulary below to align with it), we should define metric attributes for
Additionally, it should be possible to opt in to metrics specific to a particular pipelineRun.
These could be metrics about the agent that executes a pipelineRun, the OS, the network, the JVM, the number of failed/total tests …
We need to specify the attribute that links these metrics to the pipelineRun, e.g.
pipeline.run.id
Metrics specific to a pipelineRun are of high cardinality. We should document this as a warning and give guidance on how these metrics can be encoded efficiently in the OTel protocol, i.e. by using resource attributes instead of metric attributes wherever possible.
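To illustrate the encoding point, here is a minimal sketch that compares repeating the run-identifying attributes on every data point with stating them once on the resource. It uses plain dictionaries as a simplified stand-in for the OTLP nesting (one resource carrying many data points), not the real protocol types. Apart from `pipeline.run.id`, all metric names, attribute names and values are hypothetical.

```python
# Simplified stand-in for the OTLP nesting: one resource carries many data points.
# Everything except the `pipeline.run.id` key is a hypothetical name or value.

def encode_with_point_attributes(run_attrs, points):
    """Repeat the run-identifying attributes on every data point."""
    return {
        "resource": {"attributes": {}},
        "data_points": [
            {"metric": name, "value": value, "attributes": dict(run_attrs)}
            for name, value in points
        ],
    }


def encode_with_resource_attributes(run_attrs, points):
    """State the run-identifying attributes once, on the resource."""
    return {
        "resource": {"attributes": dict(run_attrs)},
        "data_points": [
            {"metric": name, "value": value, "attributes": {}}
            for name, value in points
        ],
    }


def attribute_count(payload):
    """Count the attribute key/value pairs carried by the payload."""
    total = len(payload["resource"]["attributes"])
    for point in payload["data_points"]:
        total += len(point["attributes"])
    return total


run_attrs = {"pipeline.run.id": "run-42", "pipeline.name": "build-main"}
points = [
    ("agent.cpu.utilization", 0.71),
    ("agent.memory.usage", 512),
    ("tests.failed", 3),
]

per_point = encode_with_point_attributes(run_attrs, points)
per_resource = encode_with_resource_attributes(run_attrs, points)

print(attribute_count(per_point))     # attributes repeated for each data point
print(attribute_count(per_resource))  # attributes stated once
```

The saving grows with the number of data points reported per pipelineRun. This only concerns the size of the exported payload; it does not change how many distinct series a backend ends up storing.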
Describe alternatives you've considered
Span metrics could be used for the duration and count of pipelineRuns; however, this relies on the pipelineRuns having completed.
This is a limitation inherent in using traces to represent pipelineRuns: a span can only be sent once it is complete.
Because of this limitation, it could be preferable for the CICD system to directly expose metrics about the duration, count and status of pipelineRuns. These metrics could also account for in-progress builds.
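As a rough sketch of the difference, the snippet below models a few pipelineRuns and contrasts what span-derived metrics can see (completed runs only) with what the CICD system could report directly. The `PipelineRun` type, its fields and the status values are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipelineRun:
    # Hypothetical record of a run as the CICD system itself would know it.
    run_id: str
    started_at: float
    finished_at: Optional[float] = None  # None while the run is in progress
    status: str = "running"


def span_visible_runs(runs):
    """Runs that span metrics can cover: a span is only sent once it has ended."""
    return [r for r in runs if r.finished_at is not None]


def direct_metrics(runs, now):
    """Count runs per status and compute durations, including in-progress runs."""
    counts = {}
    durations = {}
    for r in runs:
        counts[r.status] = counts.get(r.status, 0) + 1
        end = r.finished_at if r.finished_at is not None else now
        durations[r.run_id] = end - r.started_at
    return counts, durations


runs = [
    PipelineRun("run-1", started_at=0.0, finished_at=120.0, status="succeeded"),
    PipelineRun("run-2", started_at=30.0, finished_at=90.0, status="failed"),
    PipelineRun("run-3", started_at=100.0),  # still running
]

counts, durations = direct_metrics(runs, now=160.0)
print(len(span_visible_runs(runs)), counts, durations)
```

Here `run-3` contributes to the directly exposed count and elapsed duration, whereas it would stay invisible to span metrics until it finishes.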
Additional context
CICD metrics were discussed at the KubeCon March 2024 SemConv users meeting.
High cardinality was highlighted as an issue for per-build metrics.
Notes on how to deal with cardinality were:
This added information might make it easier to identify pipelineRuns that need investigation,
but backends (e.g. Prometheus) would still have the cardinality issue when storing the time series
(metric / resource attributes would be flattened into time series).
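The flattening concern can be sketched as follows, assuming a backend that copies every resource attribute and every metric attribute onto the stored series. Real exporters and backends differ in how they treat resource attributes. The metric name, the `pipeline.name` attribute and the values are hypothetical.

```python
def flatten_series(metric_name, resource_attrs, point_attrs):
    """Merge resource and metric attributes into one label set, as assumed above."""
    labels = {**resource_attrs, **point_attrs}
    return (metric_name, tuple(sorted(labels.items())))


def count_series(samples):
    """Number of distinct time series the samples would create."""
    return len({flatten_series(*sample) for sample in samples})


# 100 builds, each identified by its own run id on the resource.
per_run_samples = [
    ("agent.cpu.utilization", {"pipeline.run.id": f"run-{i}"}, {"cpu": "0"})
    for i in range(100)
]

# The same 100 builds, identified only by the pipeline they belong to.
per_pipeline_samples = [
    ("agent.cpu.utilization", {"pipeline.name": "build-main"}, {"cpu": "0"})
    for _ in range(100)
]

print(count_series(per_run_samples))       # one series per build
print(count_series(per_pipeline_samples))  # one series for the whole pipeline
```

Under this assumption, it makes no difference to the stored cardinality whether the run id is a resource attribute or a metric attribute: each build still yields its own series.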