-
Notifications
You must be signed in to change notification settings - Fork 889
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add isolating log record processor #4062
Conversation
SIG meeting notes.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High level feedback:
(1) I think so far the specification for the SDKs was far more opinionated than this PR, by describing not only capabilities but the desired architecture, i.e. standard components like processor, exporter, and standard ways to wire them together.
I think "multiple pipelines" is a significant capability change that requires a description of the architecture. E.g. it may require bifurcating the processing chain, so how would that be achieved? I don't think this is enough to just say "allow making a copy of log record".
(2) Whose responsibility should it be to decide on parallel pipelines? Should it be happening just because of the implementation of specific components? Or should it be user-driven decision by setting up the parallel pipelines accordingly?
(3) Does this capability only make sense to logs, or to other signals as well? Why or why not? I think we should be consistent.
👍
The biggest problem here is that almost stable Logs SDKs have already different architectures (and different ways of processing logs). I am not sure if e.g. Java, JavaScript .NET would like to say e.g. "before making a change in a processor, make a copy so that other processors are not affected". E.g. in Go we have a comment like this: "Before modifying a Record, the implementation must use Record.Clone to create a copy that shares no state with the original".
I am not sure if I follow. I think it can be both of "because of the implementation of specific components" and "user-driven decision by setting up the parallel pipelines" at the same time 😉
I think it makes sense for all signals. I will look into this. At last, I think that this PR needs more feedback from other SDK maintainers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pellared and @JonasKunz I'm trying to bring together your two proposals. It seems after reading this and #4024, there is a desire for OTel SDKs to support automatic "fanout" processors. I think I would rather see us specify that SDKs should support an idiomatic fanout processor which hides details related to avoiding shared-mutable references, copy-on-write protections, and so on.
My thoughts and perspective from .NET... Originally we had an immutable structure No one that I can recall has complained about that. But one request which has come up a bunch is the ability to filter. ProcessorA(Export records matching ABC) -> ProcessorB(Export all records) There is no good way to accomplish that currently in .NET. I always envisioned some kind of fork processor might help with that. Thinking out loud and combining it with this idea to copy the
PS: When I say "Copy" we do have a pool so it isn't necessarily an allocation. |
I just want to share that making pools for log records may be "dangerous" as the user-defines processor/exporter may want to try to use the log records after |
I just want to add that even for implementation that does not "fully" share the log record between processors (see: #4067) this processor may still be handy. E.g. currently in Go most of the log record data is not shared. But there is an exception: log attributes. This is done for performance optimization to reduce the heap allocations (allows zero-heap allocation). |
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
This has been open for a little while, and albeit many people are on holidays, we are safe as this is a Development addition. Merging. |
Follows #4062 ## Why 1. There is a need to make a deep copy of a `ReadWriteRecord` in order to implement an experimental isolating processor outside of the SDK (as it is experimental, we prefer to land it to contrib repository first). 2. Allow fine-grained control for the users to make a deep copy only when necessary. This allows users to have more control on the processing e.g. pass a clone to asynchronous processor to avoid race conditions or make performance improvements.
## Why Fixes open-telemetry#4010 (read the issue for more details) The SDK SHOULD offer a way to make independent log processing. I find it as a missing feature that should be covered by by the specification. ## What This PR tries to achieve it in the following way: - SDK SHOULD provide an isolating processor allowing independent log processing pipelines. Source: open-telemetry#4010 (comment) This approach should be solving the issue at least for Java, .NET, Python, JavaScript, Go. This proposal should not make any existing log SDKs not compliant with the specification. I think that the same approach could be also adopted in tracing for span processors (but I have not yet investigated it in depth). ## Remarks I created a separate issue for specifying whether (by default) a change to a record made in a processor is propagated to next registered processor. See open-telemetry#4065
Follows open-telemetry#4062 ## Why 1. There is a need to make a deep copy of a `ReadWriteRecord` in order to implement an experimental isolating processor outside of the SDK (as it is experimental, we prefer to land it to contrib repository first). 2. Allow fine-grained control for the users to make a deep copy only when necessary. This allows users to have more control on the processing e.g. pass a clone to asynchronous processor to avoid race conditions or make performance improvements.
Why
Fixes #4010 (read the issue for more details)
The SDK SHOULD offer a way to make independent log processing. I find it as a missing feature that should be covered by by the specification.
What
This PR tries to achieve it in the following way:
This approach should be solving the issue at least for Java, .NET, Python, JavaScript, Go. This proposal should not make any existing log SDKs not compliant with the specification.
I think that the same approach could be also adopted in tracing for span processors (but I have not yet investigated it in depth).
Remarks
I created a separate issue for specifying whether (by default) a change to a record made in a processor is propagated to next registered processor. See #4065