Automate schema-next generation #1426

Open
lmolkova opened this issue Sep 23, 2024 · 5 comments

Comments

@lmolkova
Contributor

lmolkova commented Sep 23, 2024

schema-next contains a list of transformations to apply to attributes, metrics, events, and resources to convert them from the previous version to the current one.

It relies on contributors and reviewers to update it manually. There is no check that verifies that the file was updated or that each transformation is valid (both attributes/metrics/etc. exist, the type is the same, etc.).
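As a rough illustration of the kind of check that is missing today, here is a minimal sketch. The function name `validate_renames` and the flat `{attribute id: type}` registry shape are assumptions made for this example, not how the build tools or Weaver represent the registry.

```python
def validate_renames(renames, registry):
    """Return a list of problems found in a {old_id: new_id} rename map.

    `registry` is a hypothetical flat view of the current registry that maps
    every attribute id (including deprecated ones) to its type name.
    """
    problems = []
    for old, new in renames.items():
        if old not in registry:
            problems.append(f"{old}: source attribute does not exist")
        elif new not in registry:
            problems.append(f"{old}: rename target {new} does not exist")
        elif registry[old] != registry[new]:
            problems.append(
                f"{old}: type {registry[old]} differs from {new} type {registry[new]}"
            )
    return problems


registry = {"foo.id": "string", "foo.unique_id": "string", "foo.count": "int"}
renames = {
    "foo.id": "foo.unique_id",     # valid rename
    "foo.count": "foo.unique_id",  # type mismatch
    "foo.old": "foo.new",          # unknown source attribute
}
problems = validate_renames(renames, registry)
```

A CI step could fail the build whenever the returned list is non-empty.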

We can automate generation of this file if we formalize it in yaml.

  1. Attribute/metric/event rename is applied along with deprecation (Don't require brief for deprecated attributes, auto-generate it in markdown and code #1419). We can formalize deprecation and update schema-next automatically.

    - id: registry.foo.deprecated 
      ...
      attributes:
        - id:  foo.id
          deprecated:
            renamed_to: foo.unique_id
            note: >
              Optional note to provide any other info
        - id:  foo.another_attribute
          deprecated: removed

    This would cover all of the existing cases we support in schema-next.yaml.

  2. There are changes for which we don't yet have a transformation defined and that don't involve deprecation. For those we probably need to analyze the diff between versions:

    • attribute type, metric unit, or instrument has changed
    • default value has changed
    • ...

    For this category we can also look into combining it with the changelog - changelog could be auto-generated based on the actual diff. Or changelog format can be made more specific so that we can generate transformations from it.
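A diff-based approach for the second category could look roughly like the sketch below. It assumes each registry version has already been loaded into a plain `{id: definition}` dictionary; `diff_registries` and the field list are hypothetical names chosen for this example.

```python
def diff_registries(previous, current):
    """List (id, field, old_value, new_value) for definitions present in both versions.

    Both arguments are hypothetical {id: {"type": ..., "unit": ..., ...}} maps.
    Only the fields mentioned above (type, unit, instrument) are compared.
    """
    changes = []
    for item_id in sorted(previous.keys() & current.keys()):
        old, new = previous[item_id], current[item_id]
        for field in ("type", "unit", "instrument"):
            if old.get(field) != new.get(field):
                changes.append((item_id, field, old.get(field), new.get(field)))
    return changes


previous = {
    "foo.count": {"type": "int"},
    "foo.duration": {"type": "double", "unit": "ms"},
}
current = {
    "foo.count": {"type": "double"},
    "foo.duration": {"type": "double", "unit": "s"},
}
changes = diff_registries(previous, current)
```

The resulting tuples could feed either a generated changelog or a future transformation type, whichever direction is chosen.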

Arguably, we need a combination of changelog and semantic conventions to be available to the codegen as well (e.g. if an attribute type has changed, I might want to generate some extra documentation, do something custom like type conversion, or maybe get alerted in some custom way).

@lmolkova lmolkova added the tooling Regarding build, workflows, build-tools, ... label Sep 23, 2024
@lmolkova lmolkova changed the title Yamlify and automate schema-next generation Automate schema-next generation Sep 23, 2024
@lquerel
Contributor

lquerel commented Sep 23, 2024

Thanks, Liudmila, for specifying this. With that in place in semconv, Weaver will be able to automatically update the OTEL schema for any future version of the registry.

I would suggest using a representation similar to renaming for the removal of attributes, as it's possible to include a note for both removal and renaming.

- id: registry.foo.deprecated 
  ...
  attributes:
    - id:  foo.id
      deprecated:
        renamed_to: foo.unique_id
        note: >
          Optional note to provide any other info
    - id:  foo.another_attribute
      deprecated: 
        removed: true
        note: >
          Removed, report shared memory usage with `metric.system.memory.shared` metric

Another option, which could be grammatically more regular, might be…

- id: registry.foo.deprecated 
  ...
  attributes:
    - id: foo.id
      deprecated:
        action: renamed
        renamed_to: foo.unique_id
        note: >
          Optional note to provide any other info
    - id: foo.another_attribute
      deprecated:
        action: removed
        note: >
          Optional note to provide any additional info about the removal

@lmolkova
Contributor Author

We use the following trick for requirement_level:

- id: attr.one 
  requirement_level: recommended
- id: attr.two
  requirement_level: 
    recommended: some optional note

Not sure what we do on the weaver side and if we're able to preserve the note to the codegen. If we can, then we should reuse the same trick. If not, we should fix it in some way.
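For reference, the scalar-or-mapping trick can be normalized into a `(level, note)` pair along these lines. This is only a sketch of the idea; `normalize_requirement_level` is a hypothetical helper and says nothing about what Weaver actually does.

```python
def normalize_requirement_level(value):
    """Turn either form of `requirement_level` into a (level, note) pair."""
    if isinstance(value, str):
        # Short form: `requirement_level: recommended`
        return value, None
    if isinstance(value, dict) and len(value) == 1:
        # Long form: `requirement_level: {recommended: some optional note}`
        (level, note), = value.items()
        return level, note
    raise ValueError(f"unsupported requirement_level: {value!r}")


short_form = normalize_requirement_level("recommended")
long_form = normalize_requirement_level({"recommended": "some optional note"})
```

The same normalization would apply to `deprecated` if it followed the same pattern.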

@lmolkova
Contributor Author

just realized - maybe we don't need to overcomplicate it and can just use note on the attribute?

- id: foo.id
  deprecated:
    renamed_to: foo.unique_id
  note: we can write anything here
- id: foo.another_attribute
  deprecated: 
    removed: true # or deprecated: removed
  note: another optional note

@lmolkova
Contributor Author

lmolkova commented Oct 2, 2024

Based on the discussion in the tooling call, we agreed on:

- id: foo.id
  deprecated:
    action: renamed
    renamed_to: foo.unique_id
  note: we can write anything here
- id: foo.another_attribute
  deprecated: 
    action: removed
  note: another optional note
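With this shape, generating the rename section of schema-next becomes mechanical. The sketch below assumes the attributes are already parsed into dictionaries and that the output mirrors the `all` / `changes` / `rename_attributes` / `attribute_map` layout of the telemetry schema file; `build_schema_next` is a hypothetical name, and how removals should be represented is left open.

```python
def build_schema_next(attributes):
    """Collect `action: renamed` deprecations into a schema-next style section."""
    attribute_map = {}
    for attr in attributes:
        deprecated = attr.get("deprecated")
        if not isinstance(deprecated, dict):
            continue  # not deprecated, or still using the free-text form
        if deprecated.get("action") == "renamed":
            attribute_map[attr["id"]] = deprecated["renamed_to"]
        # `action: removed` produces no rename entry in this sketch.
    changes = []
    if attribute_map:
        changes.append({"rename_attributes": {"attribute_map": attribute_map}})
    return {"next": {"all": {"changes": changes}}}


attributes = [
    {
        "id": "foo.id",
        "deprecated": {"action": "renamed", "renamed_to": "foo.unique_id"},
        "note": "we can write anything here",
    },
    {
        "id": "foo.another_attribute",
        "deprecated": {"action": "removed"},
        "note": "another optional note",
    },
]
schema = build_schema_next(attributes)
```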

@lquerel
Contributor

lquerel commented Oct 5, 2024

@lmolkova @jsuereth Things are a bit more complex than expected regarding the deprecated field…

I found the following “complex/contextual” declarations in the current registry for the deprecated field:

  • Split to url.path and url.query.
  • Replaced by one of server.address, client.address or http.request.header.host, depending on the usage.
  • Replaced by server.address on client spans and client.address on server spans.

These types of deprecation depend on a context that is not fully expressed in free-text form. We could probably encode that in some form, but it will quickly become complex, and I doubt that anyone is really using the OTEL telemetry schema as it is today. I see multiple ways to pursue this task:

  1. We consider the OTEL schema unused and stop updating it, but we keep the free-text form for the deprecated field (mostly for documentation purposes).
  2. We introduce a new field, changes, in groups and attributes to represent renaming, removal, splitting of an attribute, metric, or similar adjustments in context. See below for a basic example.
  3. We create a much more complex deprecated field structure to attach context to the action.

What do you think? In my opinion, option (1) is probably the most reasonable choice at the moment. Option (2) could also work, but transformations like split are complex to express and, more importantly, seem overly complicated for external tools to support.

groups:
  - id: registry.http
    type: attribute_group
    display_name: HTTP Attributes
    brief: 'This document defines semantic convention attributes in the HTTP namespace.'
    attributes:
      - id: http.request.method
        stability: stable
        type:
          members:
            - ...
        brief: 'HTTP request method.'
        examples: ["GET", "POST", "HEAD"]
        note: ""
        changes:
          - 1.25.0:
            - action: renamed  
              rename_from: http.method
    changes:
      - 1.24.0:
        - action: removed_attribute
          name: http.body 
  - ...
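To show how a tool might consume the `changes` field from option (2), here is a small sketch that gathers group-level and attribute-level entries per version. It assumes the YAML above has been parsed into nested dictionaries and lists; `changes_by_version` and the added `target` key are made up for this example.

```python
from collections import defaultdict


def changes_by_version(group):
    """Index all `changes` entries of a group and its attributes by version."""
    by_version = defaultdict(list)

    def collect(owner_id, entries):
        # Each entry is a single-key mapping: {version: [action, ...]}
        for entry in entries:
            for version, actions in entry.items():
                for action in actions:
                    by_version[version].append({"target": owner_id, **action})

    collect(group["id"], group.get("changes", []))
    for attr in group.get("attributes", []):
        collect(attr["id"], attr.get("changes", []))
    return dict(by_version)


group = {
    "id": "registry.http",
    "attributes": [
        {
            "id": "http.request.method",
            "changes": [
                {"1.25.0": [{"action": "renamed", "rename_from": "http.method"}]}
            ],
        }
    ],
    "changes": [{"1.24.0": [{"action": "removed_attribute", "name": "http.body"}]}],
}
indexed = changes_by_version(group)
```

Indexing by version is what a schema generator would need, since the schema file is organized per version rather than per attribute.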

Related GH issues:

Projects
Status: Improve YAML Schema
Development

No branches or pull requests

3 participants