New component: Template Receiver #26312
Comments
As per the discussion in the SIG call on Aug 30, could this functionality be provided either by the existing yaml provider (https://github.com/open-telemetry/opentelemetry-collector/tree/main/confmap/provider/yamlprovider) or by a new template provider? |
I spent a couple hours digging into the For the sake of simplicity, let's suppose that we would have a template file that contains nothing but a "template type" and a single templated receiver configuration. Roughly:
type: my_filelog_template
template: |
  filelog:
    include: {{ .my_log_file }}
A new map[string]any{
  "templates": {
    "my_filelog_template": "filelog:\n  include: {{ .my_log_file }}",
  },
}
From there, this is automatically converted into a Then, the aggregate This converter basically would crawl the
type ConfTemplate struct {
Type string
Parameters map[string]any
}
tmpl := aggregateConf.Sub("templates").Sub("my_filelog_template")
This assumes that the user has correctly used I think this makes some sense, but do I seem to be missing anything? Notably, the above leaves out the ability to template multiple related components together. I think this would require different logic:
type TemplateReceiver struct {
Receivers map[component.ID]confmap.Conf
Processors map[component.ID]confmap.Conf
PartialPipelines []PartialPipeline
}
type PartialPipeline struct {
Receivers []component.ID
Processors []component.ID
}
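To make the simple single-receiver case above concrete, here is a minimal sketch of rendering a stored template body with user-supplied parameters via Go's text/template. The helper name renderTemplate and the example log path are assumptions for illustration only; this is not the provider/converter implementation discussed in this thread.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// renderTemplate (hypothetical helper) renders a template body with the given
// parameters. The "missingkey=error" option makes rendering fail when the
// template references a parameter that was not supplied.
func renderTemplate(name, body string, params map[string]any) (string, error) {
	tmpl, err := template.New(name).Option("missingkey=error").Parse(body)
	if err != nil {
		return "", fmt.Errorf("parse template %q: %w", name, err)
	}
	var out bytes.Buffer
	if err := tmpl.Execute(&out, params); err != nil {
		return "", fmt.Errorf("render template %q: %w", name, err)
	}
	return out.String(), nil
}

func main() {
	// Mirrors the "templates" map sketched above.
	templates := map[string]string{
		"my_filelog_template": "filelog:\n  include: {{ .my_log_file }}",
	}
	rendered, err := renderTemplate(
		"my_filelog_template",
		templates["my_filelog_template"],
		map[string]any{"my_log_file": "/var/log/app.log"}, // assumed example value
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(rendered)
}
```

The rendered string would still need to be parsed as YAML and merged into the aggregate configuration, which is outside the scope of this sketch.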
|
I got inspired and put together a draft of my design as a template provider and converter working in combination. See open-telemetry/opentelemetry-collector#8344 I think it's a better solution in several ways. There are a few aspects to work through still but I believe it's ready for some feedback if anyone can take a look. |
Just so that we know what we currently support: one way of doing something similar today would be something like the following. Given the 'template':
# template.yaml
receivers:
  filelog/general:
    # include: {{ .general_log }}, if you want a default just define this option
    # start_at: {{ .start_at }}, if you want a default just define this option
    multiline:
      line_start_pattern: '\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}.\d+Z|/\w+/\w+/mysqld,'
    operators:
      - type: regex_parser
        regex: '(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}.\d+Z)\s+(?P<tid>\d+)\s+(?P<command>\w+)(\s+(?P<message>(?s).+\S))?'
        timestamp:
          parse_from: attributes.timestamp
          layout: '%Y-%m-%dT%H:%M:%S.%sZ'
  filelog/error:
    # include: {{ .error_log }}, if you want a default just define this option
    # start_at: {{ .start_at }}, if you want a default just define this option
    multiline:
      line_start_pattern: '\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}.\d+Z'
    operators:
      - type: regex_parser
        regex: '(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}.\d+Z)\s+(?P<tid>\d+)\s+\[(?P<mysql_severity>[^\]]+)]\s+(?P<message>[\d\D\s]+)'
        timestamp:
          parse_from: attributes.timestamp
          layout: '%Y-%m-%dT%H:%M:%S.%sZ'
You can provide another 'parameter' file:
# parameters.yaml
receivers:
  filelog/general:
    include: /var/log/mysql/general.log
    start_at: end
  filelog/error:
    include: /var/log/mysqld.log
    start_at: end
or if you want a flatter structure:
# parameters.yaml, second version
receivers::filelog/general::include: /var/log/mysql/general.log
receivers::filelog/general::start_at: end
receivers::filelog/error::include: /var/log/mysqld.log
receivers::filelog/error::start_at: end
and then you merge the files by passing:
IMO we need to think carefully about what it is that we don't like about the current state and proceed from there. With this approach we have type validation, support for defaults, support for any section of the configuration (be it receivers or anything else including e.g. |
@mx-psi, thanks for highlighting the current capabilities. Ultimately, my goal here is to provide a mechanism that unlocks better usability. It's not just a matter of reducing the number of parameters which a user must consider. I think a templating system should allow collector and/or usability experts to create well defined abstractions which can be provided to newer users. In my opinion, the config merging which you've highlighted may be enough in some specific cases but I think there are a lot of limitations to the approach. I'll highlight several others below, but most importantly I think it is not an abstraction for the following reasons:
Compared to the above, a template provider/converter would allow users to reason about templates in a much more abstract manner, based only on inputs and outputs. In addition to the differences noted above, there are several other capabilities of a templating system which cannot currently be achieved via config merging.
receivers:
  template/foo:
    nodes: [a, b, c, d]
    groups: [x, y, z]
    bar: true

type: foo
template: |
  {{ if not .nodes }}
  {{ error "Please specify at least one 'node'!" }}
  {{ end }}
  receivers:
  {{ range $i, $node := .nodes }}
    some_node_receiver/{{ $i }}:
      hostname: {{ $node }}
      include_bar: {{ $.bar }}
  {{ end }}
  {{ if (gt (len .groups) 0) }}
  processors:
  {{ range $i, $group := .groups }}
    some_processor/{{ $i }}:
      do_something_if: attributes.group_name == {{ toUpper $group }}
    another_processor/{{ $i }}:
  {{ end }}
  {{ end }}
  pipelines:
  {{ range $i, $node := .nodes }}
    logs/{{ $i }}:
      receivers: [ some_node_receiver/{{ $i }} ]
  {{ if (gt (len $.groups) 0) }}
      processors:
  {{ range $j, $group := $.groups }}
        - some_processor/{{ $j }}
        - another_processor/{{ $j }}
  {{ end }}
  {{ end }}
  {{ end }}
I see these as a secondary concern for templates because documentation of parameters and simple examples can show users how to use them. That said, I think we could increase usability further by having a well defined parameter type system.
I think we can separate the notion of defaults into two groups:
{{ if not .foo }}
{{ $foo := "bar" }}
{{ end }}
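One caveat when expressing defaults inside a template: in Go's text/template, a variable declared with := inside an if block is scoped to that block and is not visible after the matching end. A hedged sketch of one way around this declares the variable with its default first and then overrides it with = (assignment is supported in Go 1.11 and later). The names withDefault and renderWithDefault are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// withDefault declares $foo with its default before the "if", so the variable
// remains in scope afterwards; "=" overrides it when the parameter is set.
const withDefault = `{{ $foo := "bar" }}{{ if .foo }}{{ $foo = .foo }}{{ end }}value: {{ $foo }}`

// renderWithDefault renders withDefault with the given parameters.
func renderWithDefault(params map[string]any) (string, error) {
	tmpl, err := template.New("defaults").Parse(withDefault)
	if err != nil {
		return "", err
	}
	var out bytes.Buffer
	if err := tmpl.Execute(&out, params); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	for _, params := range []map[string]any{
		{},             // parameter omitted: the default applies
		{"foo": "baz"}, // parameter supplied: the default is overridden
	} {
		rendered, err := renderWithDefault(params)
		if err != nil {
			panic(err)
		}
		fmt.Println(rendered)
	}
}
```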
I think the proposed implementation in open-telemetry/opentelemetry-collector#8344 could be adapted easily enough to support templating of other component types but I agree it is not intended to generalize to all parts of the configuration. To broaden the context of my proposal, I recall the following requests for which I believe templates would be a solution: |
For (1), (2) and (4), I agree that the current configuration resolution system is lacking. I think the solution, if possible, should build upon the existing confmap resolver instead of adding a separate templating system. It may be that we can't do that (I am currently not convinced that is the case), but it is important we know why we are not extending the existing system if we don't do that in the end.
For (3), we can already reuse configuration for multiple components. Something like this works today:
receivers:
  filelog/general: ${file:/path/to/reusable/filelog/definition.yaml}
  filelog/custom: ${file:/path/to/reusable/filelog/definition.yaml}
One would have to specify the parameters twice (and would still suffer from issues (1) & (4)) but this kind of reusability exists today to some extent.
On the other templating capabilities, I think (2) and (4) are supported by the current system. We can support multiple components and pipelines, and we can reuse pipelines in multiple places. What capabilities are lacking?
I can see the appeal of (1) and (3), but I am not convinced about supporting them natively on the Collector. Having custom functions or loops feels definitely like the job of something like Helm, or a configuration management tool like Chef/Puppet/Saltstack/Ansible, not of the Collector. It's unclear where to draw the line, but IMO our role should be to provide configuration capabilities that help basic users and that interact well with configuration management systems, and leave more advanced capabilities to those systems. Personally, I feel like improving abstraction on the current system is still okay in that it helps with this interaction but I would draw the line there (but I would like to see what other community members think). |
Re (2), you can define multiple components and pipelines in a single file with the current system, but can you abstract them so that the user feels they are working with one simple component? My understanding is that you can't really do this. Re (4), can you show me how this is done? I like to think I'm a bit more familiar than the typical user of the collector but I still don't see it.
I see, the file would contain only the parameters, but the context would have to be managed some other way. This seems like a poor usability experience which again requires the user to have a deep understanding of config layering.
The line I am proposing is that our templates would use Go's text/template and support only what that package gives us. Perhaps at a future point someone would provide a clear enough need to draw a new line, but I'm not proposing that now. The loops and such are basically free capabilities which we would get even if the goal was only to support simple parameter substitution. I know for certain that they would be very useful, but it shouldn't be additional maintenance burden if that's the concern.
This issue was originally a proposal for a receiver but given the feedback I received I worked out an implementation that is fully based in the confmap package. It defines a new |
I've rebooted this issue with the updated design here: open-telemetry/opentelemetry-collector#8372 |
Great points raised.
I think this is the sticky part. Is it OK for OTEL to implicitly create dependencies on config management systems? To me, this sounds like a restriction on how I as an OTEL user can make use of the collector. What if an external config management system has no place in my architecture? |
Closing in favor of open-telemetry/opentelemetry-collector#8372 |
Motivation
The process of "developing" a configuration for the Collector often requires
detailed knowledge of one or more collector components,
a sophisticated understanding of how to interface with an external technology,
or just a non-trivial amount of effort working through necessary data manipulations.
We should provide an easy way to capture and share useful portions of configuration.
Proposal
A new template receiver which can present users with simplified configurations. In the simplest case, a template file contains a partially preconfigured receiver. The template receiver allows the user to reference such a template file and provide only the parameters necessary to complete the configuration.
In addition to a templated configuration, the template file may contain machine-readable metadata about the template, such as a title, description, version, a schema describing the parameters accepted by the template, and a schema describing the telemetry emitted by the template. A schema which defines parameters may automatically apply default values and/or enforce type requirements (e.g. must be an int between 1-65535, or must match an enumerated list of strings).
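As an illustration of what such a parameter schema might do, here is a minimal sketch that applies defaults and enforces an integer range and an enumerated list of strings. ParamSpec, applySchema, and the port/protocol parameters are all hypothetical; the proposal does not define a schema format.

```go
package main

import "fmt"

// ParamSpec is a hypothetical description of a single template parameter.
type ParamSpec struct {
	Default  any      // used when the parameter is omitted; nil means "required"
	IntRange *[2]int  // optional inclusive bounds for integer parameters
	Enum     []string // optional allowed values for string parameters
}

// applySchema returns the parameters with defaults filled in, or an error if
// a parameter is unknown, missing, or violates its constraints.
func applySchema(schema map[string]ParamSpec, params map[string]any) (map[string]any, error) {
	out := make(map[string]any, len(schema))
	for name, v := range params {
		if _, ok := schema[name]; !ok {
			return nil, fmt.Errorf("unknown parameter %q", name)
		}
		out[name] = v
	}
	for name, spec := range schema {
		v, ok := out[name]
		if !ok {
			if spec.Default == nil {
				return nil, fmt.Errorf("missing required parameter %q", name)
			}
			v = spec.Default
			out[name] = v
		}
		if spec.IntRange != nil {
			n, isInt := v.(int)
			if !isInt || n < spec.IntRange[0] || n > spec.IntRange[1] {
				return nil, fmt.Errorf("parameter %q must be an int between %d and %d",
					name, spec.IntRange[0], spec.IntRange[1])
			}
		}
		if len(spec.Enum) > 0 {
			s, isString := v.(string)
			allowed := false
			for _, e := range spec.Enum {
				if isString && e == s {
					allowed = true
				}
			}
			if !allowed {
				return nil, fmt.Errorf("parameter %q must be one of %v", name, spec.Enum)
			}
		}
	}
	return out, nil
}

func main() {
	schema := map[string]ParamSpec{
		"port":     {Default: 4317, IntRange: &[2]int{1, 65535}},
		"protocol": {Enum: []string{"grpc", "http"}},
	}
	resolved, err := applySchema(schema, map[string]any{"protocol": "grpc"})
	if err != nil {
		panic(err)
	}
	fmt.Println(resolved["port"], resolved["protocol"])
}
```

Validating before rendering keeps type errors out of the template body itself, so the template only has to deal with well-formed inputs.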
Templating can be achieved using Go's text/template package, which allows for simple string insertions, but also more advanced control flow (e.g. render a section for each item in a slice using range, or only render a section of config using if when a value indicates the need).
In order to ensure that a template file is 1) fully self-defined, and 2) valid yaml, the structure of a template file should be defined such that the templated configuration is a raw string or byte sequence within a valid yaml schema.
Simple Example
config.yaml
my_otlp_template.yaml
Multiple Receivers
In many cases, it would be useful to configure multiple related receivers using a single template. For example, a database may emit logs to several distinct files, each with a different format. In this case, a template may serve as a single solution for the database by encapsulating multiple instances of the filelog receiver, each with appropriate parsing behaviors.
Example
my_mysql_template.yaml
Processors
In some cases it would be helpful to apply predefined processor configurations to data ingested by receivers.
The addition of processors complicates the templated configuration somewhat, in the sense that we may now also need to include partial pipelines in the template. This allows a defined order of processors, as well as explicitly declared relationships between receivers and processors. It should not be necessary (or even allowed) to declare exporters, as it should be understood that the template receiver will "export" according to its position in the top-level service graph.
Example
my_mysql_template.yaml
Multiple Data Types
It may be helpful for a single templated configuration to encapsulate solutions for multiple data types. For example, reading logs and also scraping metrics.
Implementation Details
Templated Pipelines
Partial pipelines should be defined within the templated configuration because in some cases a template may generate entire receivers or processor configs. (e.g. given a list of nodes, generate a receiver config corresponding to each node) In such cases, it would be necessary to also generate the corresponding partial pipelines.
Data Type Support / Validation
The template receiver will have to declare that it supports all data types. However, a given template may support any subset of types. Therefore, it is not possible to validate correct usage of the template receiver until the factory attempts to build the receiver given a specific config.
When a factory function is called, e.g. CreateMetricsReceiver, the factory can do the following:
Internal Collector Instance
Each instance of the template receiver should construct and manage its own internal collector instance. In order to do this, it will render the templated config and complete the partial pipelines by attaching an "exporter" which will simply pass data through to the main service graph.
Open Questions / Future Work
Internal Service Graph
Currently, I believe that such a solution necessarily requires each instance of the template receiver to construct and manage its own instance of otelcol.Collector, but it would be better to reduce the scope of what the component must manage. For example, its own telemetry configuration should not be a concern of the template receiver, as it should inherit these settings from the collector as a whole. However, it's not clear to me that this is currently possible. Possibly, this is a use case that favors open-telemetry/opentelemetry-collector#8111.
Template Processor / Template Exporter
I believe in the future it may also be beneficial to create template processor and template exporter components.
A templated processor would allow for several predefined processing operations to be packaged together. For example, migration of data from one version of semantic conventions to another could be written purely in configuration and easily shared.
A template exporter would allow us to prepend common processing steps onto one or more exporters.
Telemetry data types supported
All
Is this a vendor-specific component?
Code Owner(s)
No response
Sponsor (optional)
No response
Additional context
This proposal is based on observIQ's pluginreceiver.