determine the UUID's namespace
Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>
jpkrohling committed Sep 13, 2023
1 parent 73c8230 commit 4a17ffa
Showing 2 changed files with 26 additions and 22 deletions.
22 changes: 11 additions & 11 deletions docs/resource/README.md
@@ -105,8 +105,8 @@ as specified in the [Resource SDK specification](https://github.com/open-telemet

**[1]:** A string value having a meaning that helps to distinguish a group of services, for example the team name that owns a group of services. `service.name` is expected to be unique within the same namespace. If `service.namespace` is not specified in the Resource then `service.name` is expected to be unique for all services that have no explicit namespace defined (so the empty/unspecified namespace is simply one more valid namespace). Zero-length namespace string is assumed equal to unspecified namespace.

-**[2]:** MUST be unique for each instance of the same `service.namespace,service.name` pair (in other words `service.namespace,service.name,service.instance.id` triplet MUST be globally unique). The ID helps to distinguish instances of the same service that exist at the same time (e.g. instances of a horizontally scaled service). It is preferable for the ID to be persistent and stay the same for the lifetime of the service instance, however it is acceptable that the ID is ephemeral and changes during important lifetime events for the service (e.g. service restarts). If the service has no inherent unique ID that can be used as the value of this attribute it is recommended to generate a random Version 1 or Version 4 [RFC 4122](https://www.ietf.org/rfc/rfc4122.txt) UUID. Services aiming for reproducible UUIDs may also use Version 5. UUIDs are typically recommended, as we only need an opaque yet reproducible value for the purposes of identifying a service instance. Similar to what can be seen in the man page for the `/etc/machine-id` file, the underlying data, such as pod name and namespace should be treated as confidential by this algorithm, being the user's choice to expose it or not via another resource attribute.
-SDKs are required to follow the following algorithm when generating `service.instance.id`:
+**[2]:** MUST be unique for each instance of the same `service.namespace,service.name` pair (in other words `service.namespace,service.name,service.instance.id` triplet MUST be globally unique). The ID helps to distinguish instances of the same service that exist at the same time (e.g. instances of a horizontally scaled service). It is preferable for the ID to be persistent and stay the same for the lifetime of the service instance, however it is acceptable that the ID is ephemeral and changes during important lifetime events for the service (e.g. service restarts). If the service has no inherent unique ID that can be used as the value of this attribute it is recommended to generate a random Version 1 or Version 4 [RFC 4122](https://www.ietf.org/rfc/rfc4122.txt) UUID. Services aiming for reproducible UUIDs may also use Version 5. UUIDs are typically recommended, as we only need an opaque yet reproducible value for the purposes of identifying a service instance. Similar to what can be seen in the man page for the `/etc/machine-id` file, the underlying data, such as pod name and namespace should be treated as confidential by this algorithm, being the user's choice to expose it or not via another resource attribute. When a UUID v5 is generated, the UUID's namespace MUST be set to: `${telemetry.sdk.name}.${telemetry.sdk.language}.${service.name}`, so that the same service yields the same ID if the same value (`host.id`, `/etc/machine-id`, and so on) remains the same. It would still yield different results for different services on the same host.
+SDKs are required to use the following algorithm when generating `service.instance.id`:
- If the user has provided a `service.instance.id`, via environment
variable, configuration or custom resource detection, this will
always be used and honored over generated IDs.
@@ -117,19 +117,19 @@ SDKs are required to follow the following algorithm when generating `service.ins
* `host.id`
- When the SDK is running in an environment where a `/etc/machine-id`
(see [MACHINE-ID(5)](https://www.freedesktop.org/software/systemd/man/machine-id.html))
-is available, the machine-id should be used as the input for generating a UUID v5 along with
-the `service.name`.
+is available, the machine-id should be used as the input for generating a UUID v5.
- When the SDK is running on a Windows environment and there's a reasonable way to read
registry keys for the SDK, the registry key
`HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Cryptography\MachineGuid` can be used in a
similar way to Linux' machine-id above.
-- When no other source is available the SDK MUST generate an ID. This
-ID SHOULD follow version 1 or 4 of RFC 4122 UUIDs. This would also typically be the case for
-service instances running on Kubernetes, given that the pod's name and namespace are not unique
-enough to determine a service instance's identity: on pods with multiple containers, the
-`service.instance.id` would yield the same results for all containers, which is not desirable.
-And given that the services are ephemeral on Kubernetes, the `service.instance.id` would change
-on each restart, being therefore no different than a completely new UUID per process.
+- When no other source is available the SDK MUST generate a value using UUID v1 or v4.
+This would also typically be the case for service instances running on Kubernetes,
+given that the pod's name and namespace are not unique enough to determine a service
+instance's identity and the container name cannot easily be inferred: on pods with
+multiple containers, the `service.instance.id` would yield the same results for all
+containers, which is not desirable. And given that the services are ephemeral on
+Kubernetes, the `service.instance.id` would change on each restart, being therefore
+no different than a completely new UUID per process.
<!-- endsemconv -->

Note: `service.namespace` and `service.name` are not intended to be concatenated for the purpose of forming a single globally unique name for the service. For example the following 2 sets of attributes actually describe 2 different services (despite the fact that the concatenation would result in the same string):
26 changes: 15 additions & 11 deletions model/resource/service_experimental.yaml
@@ -36,9 +36,13 @@ groups:
use Version 5. UUIDs are typically recommended, as we only need an opaque yet reproducible value for
the purposes of identifying a service instance. Similar to what can be seen in the man page for the
`/etc/machine-id` file, the underlying data, such as pod name and namespace should be treated as
-confidential by this algorithm, being the user's choice to expose it or not via another resource attribute.
+confidential by this algorithm, being the user's choice to expose it or not via another resource attribute.
+When a UUID v5 is generated, the UUID's namespace MUST be set to:
+`${telemetry.sdk.name}.${telemetry.sdk.language}.${service.name}`, so that the same service yields the
+same ID if the same value (`host.id`, `/etc/machine-id`, and so on) remains the same. It would still
+yield different results for different services on the same host.
-SDKs are required to follow the following algorithm when generating `service.instance.id`:
+SDKs are required to use the following algorithm when generating `service.instance.id`:
- If the user has provided a `service.instance.id`, via environment
variable, configuration or custom resource detection, this will
@@ -50,17 +54,17 @@ groups:
* `host.id`
- When the SDK is running in an environment where a `/etc/machine-id`
(see [MACHINE-ID(5)](https://www.freedesktop.org/software/systemd/man/machine-id.html))
-is available, the machine-id should be used as the input for generating a UUID v5 along with
-the `service.name`.
+is available, the machine-id should be used as the input for generating a UUID v5.
- When the SDK is running on a Windows environment and there's a reasonable way to read
registry keys for the SDK, the registry key
`HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Cryptography\MachineGuid` can be used in a
similar way to Linux' machine-id above.
-- When no other source is available the SDK MUST generate an ID. This
-ID SHOULD follow version 1 or 4 of RFC 4122 UUIDs. This would also typically be the case for
-service instances running on Kubernetes, given that the pod's name and namespace are not unique
-enough to determine a service instance's identity: on pods with multiple containers, the
-`service.instance.id` would yield the same results for all containers, which is not desirable.
-And given that the services are ephemeral on Kubernetes, the `service.instance.id` would change
-on each restart, being therefore no different than a completely new UUID per process.
+- When no other source is available the SDK MUST generate a value using UUID v1 or v4.
+This would also typically be the case for service instances running on Kubernetes,
+given that the pod's name and namespace are not unique enough to determine a service
+instance's identity and the container name cannot easily be inferred: on pods with
+multiple containers, the `service.instance.id` would yield the same results for all
+containers, which is not desirable. And given that the services are ephemeral on
+Kubernetes, the `service.instance.id` would change on each restart, being therefore
+no different than a completely new UUID per process.
examples: ["my-k8s-pod-deployment-1", "627cc493-f310-47de-96bd-71410b7dec09"]
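
The selection order described in this commit (user-provided value first, then a UUID v5 over a stable source such as `host.id` or `/etc/machine-id`, then a random UUID) could be sketched roughly as follows. This is only an illustration, not the SDK implementation: the function names and parameters are hypothetical, and because RFC 4122 expects the v5 namespace to itself be a UUID, the sketch assumes the `${telemetry.sdk.name}.${telemetry.sdk.language}.${service.name}` string is first hashed into a namespace UUID under `uuid.NAMESPACE_OID`. The commit text does not specify that derivation.

```python
import uuid
from typing import Optional


def derive_namespace(sdk_name: str, sdk_language: str, service_name: str) -> uuid.UUID:
    """Turn the namespace string from the spec text into a namespace UUID.

    Assumption: hashing the string under uuid.NAMESPACE_OID is a choice made
    for this sketch only; the commit does not say how to do this step.
    """
    label = f"{sdk_name}.{sdk_language}.{service_name}"
    return uuid.uuid5(uuid.NAMESPACE_OID, label)


def generate_service_instance_id(
    service_name: str,
    sdk_name: str,
    sdk_language: str,
    user_provided: Optional[str] = None,
    stable_source: Optional[str] = None,
) -> str:
    """Hypothetical helper following the order of the algorithm above."""
    # 1. A value supplied by the user always wins over generated IDs.
    if user_provided:
        return user_provided
    # 2. A stable input (host.id, machine-id, MachineGuid) gives a
    #    reproducible UUID v5 that still differs per service.
    if stable_source:
        namespace = derive_namespace(sdk_name, sdk_language, service_name)
        return str(uuid.uuid5(namespace, stable_source.strip()))
    # 3. Otherwise fall back to a random UUID v4.
    return str(uuid.uuid4())


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(generate_service_instance_id(
        "checkout", "opentelemetry", "python", stable_source="0123456789abcdef"))
```

Under these assumptions, two processes of the same service on the same host would get the same ID, while a different `service.name` on that host would get a different one, which matches the intent stated in the note.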
