Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why ContextExecutorService#wrap is deprecated? #295

Closed
michaldo opened this issue Sep 23, 2024 · 11 comments
Closed

Why ContextExecutorService#wrap is deprecated? #295

michaldo opened this issue Sep 23, 2024 · 11 comments
Labels
question Further information is requested

Comments

@michaldo
Copy link

I would like to parallelize IO task and I would like pass current observation to tasks. I found solution here: https://stackoverflow.com/a/78765658/2365727

There are 2 signatures of method ContextExecutorService.wrap(). One simple, intuitive and works for me and second is more complex but allows deeper customization.

Up to these point everything is perfect. Library allows simplified usage with default settings and powerful usage with full customization.

But default usage is deprecated. Why?

@chemicL
Copy link
Collaborator

chemicL commented Sep 23, 2024

There are a few configurable items there and your application needs to make informed decisions. Specifically, the deprecated method used the global ContextRegistry. It should be explicitly specified instead. Also an informed decision needs to be made regarding existing ThreadLocal values prior to setting them to the captured ones -> if a snapshot does not contain some values, should the ones for which ThreadLocalAccessors exist be cleaned or not? There's no obvious answer, hence the library doesn't make it easy to shoot oneself in the foot so to speak. Hope this helps.

@chemicL chemicL added question Further information is requested feedback-provided labels Sep 23, 2024
@michaldo
Copy link
Author

the library doesn't make it easy to shoot oneself in the foot, but encourage clients to play with loaded gun.

Actually, when client wants simply pass context to task from executor service, the library encounters him to implement ContextSnapshot. Simple need requires complex work to accomplish.

Then developer digs into library code, scans SO and finds ContextSnapshotFactory.builder().build().captureAll()

There's no obvious answer. No. Obvious answer is ContextSnapshotFactory.builder().build().captureAll(). Let's analyze defaults one by one:

  1. When I do not pass context registry, take global one Makes sense? Absolutely. Is there any well known context registry other than global one to consider? No.
  2. When I do not specify rule to capture key, take all Makes sense? Absolutely.
  3. When I do not mention about clear missing, do not clear. Makes sense? Honestly, I do not know. Someone decided it will be default. I'm fine with it. Especially, I think in real life it does not matter.

Considering all above, my proposal is modify ContextExecutorService#wrap(ExecutorService) to use default snapshot builder and clean up deprecation

@chemicL
Copy link
Collaborator

chemicL commented Sep 24, 2024

IIRC the deprecation came from @rstoyanchev's suggestion in #105 (comment) and I agree.

This API is not for simply wrapping tasks but for wrapping an entire ExecutorService. In this instance it should be an object shared between different tasks, long living and configured once for the lifetime of the app in most cases. You'd probably want to also combine that instantiation with the configuration of the ContextSnapshotFactory which then you can use readily as the Supplier of ContextSnapshot.

Honestly, I do not know. Someone decided it will be default. I'm fine with it. Especially, I think in real life it does not matter.

It actually does matter quite a lot, that's why the configuration is there. A real life fundamental problem in context propagation in reactive applications has dictated the API to evolve in this direction. This should also be of interest as it seems your application is going to run Java's Stream API on a shared executor. One task might have a value and another might not. This can lead to leakages and for instance show unexpected identifiers in unrelated work units. Ignorance of this aspect is something that we specifically want to prevent. As to the loaded gun analogy - correct, you decided to work with a complicated scenario, better understand the considerations. The context-propagation library is a low-level detail mostly meant for use in library code, not necessarily by end users.

Regarding the other enumerated points, I agree. Not on the clearing missing values though.

@michaldo please refrain from opening PRs until the discussion is settled.

Unless @rstoyanchev changed his opinion, I'd leave the API as is.

@michaldo
Copy link
Author

Let's start with this:
The context-propagation library is a low-level detail mostly meant for use in library code

I never want use low-level API in business code. Please answer here or here https://stackoverflow.com/questions/78746378/spring-boot-3-micrometer-tracing-in-mdc/78765658#78765658
https://stackoverflow.com/questions/78889603/traceid-propagation-to-virtual-thread
how solve it with high-level API and I will be totally satisfied and I promise never look again into context-propagation library.

About "clearing missing". I don't know 100% what should be default: true or false. Clearing missing may be important in real life reactive applications. From my perspective, reactive is outdated by virtual threads. For virtual threads clearing missing issue does not exists. It means considering clearing missing is dead end and I'm happy with any choice about default value.
It means also that context propagation is something more than reactive and reactive should nor dictate API evolution.

Finally, I think that one PR is worth more than thousand words I never hesistate clarify discussion with PR

@rstoyanchev
Copy link
Collaborator

rstoyanchev commented Sep 24, 2024

Thanks for raising this. I can see how it's not very obvious what to do, but as @chemicL pointed out, there are important choices to understand. The global ContextRegistry propagates every ThreadLocal it knows about, through registrations made by the app or by 3rd party libraries. That brings overhead, and potential side effects. Many times it's not easy to customize this in code deep on the callstack, but ExecutorService is a shared component that should be easy to configure for the specific needs.

That said, I think there is room for improvement. We could add an alternative wrap method that is a more friendly place to start:

public static ExecutorService wrap(ExecutorService service, ContextSnapshotFactory snapshotFactory) {
    return new ContextExecutorService<>(service, snapshotFactory::captureAll);
}

This should make it more obvious what you need to provide, and the choices you need to consider. We can also update the Javadoc of the other wrap method to explain better what's involved in creating a snapshot, and maybe have some example code snippets.

Also a good time to use this opportunity to review the Javadoc on ContextSnapshotFactory and its Builder to ensure there is guidance around what each choice means. For example, the method for ContextRegistry can mention the choice of using the global instance vs a custom one.

@michaldo
Copy link
Author

Friendly alternative would be for me something like

public static ExecutorService wrapObserve(ExecutorService service) {
    return new ContextExecutorService<>(service, Observation.SNAPSHOT_FACTORY::captureAll);
}

Because what I really want is pass high level Observation abstraction from caller to task.

@rstoyanchev
Copy link
Collaborator

The context-propagation library is at a lower level, and has no knowledge of Observation, so that can't be solved here. However, with the suggested improvement, this looks straight forward:

ContextExecutorService.wrap(executorService, Observation.SNAPSHOT_FACTORY);

@michaldo
Copy link
Author

The context-propagation library is at a lower level, and has no knowledge of Observation, so that can't be solved here. However, with the suggested improvement, this looks straight forward:

ContextExecutorService.wrap(executorService, Observation.SNAPSHOT_FACTORY);

Good idea, but maybe better:

public interface Observation {
   ...
   public static ExecutorService observe/wrap(ExecutorService service) {
        return ContextExecutorService.wrap(service, Observation.SNAPSHOT_FACTORY);
   }

michaldo added a commit to michaldo/micrometer that referenced this issue Sep 26, 2024
As developer, I would like parallelize my code and observe subtasks (https://stackoverflow.com/questions/78746378/spring-boot-3-micrometer-tracing-in-mdc/78765658#78765658 https://stackoverflow.com/questions/78889603/traceid-propagation-to-virtual-thread).

I found solution with `ContextExecutorService`, but according to discussion micrometer-metrics/context-propagation#295, ContextExecutorService is low level API, which should not be used by application code.

`Observation` is enough high level to be used by application code.
@chemicL
Copy link
Collaborator

chemicL commented Oct 2, 2024

@michaldo the ContextSnapshot can carry any ThreadLocal and Map-like structure context capture. While you might be interested only in transporting Observation-related ThreadLocal data, the primitives in this library deal with any registered accessors.

As @rstoyanchev suggested, I added a PR to improve the API to be more intuitive (accepts a ContextSnapshotFactory and captures all ThreadLocal data) and explains in the Javadoc what can be used to obtain a Supplier of ContextSnapshot.

@chemicL chemicL removed the question Further information is requested label Oct 2, 2024
@michaldo
Copy link
Author

I'm courious the following:

As I understand this discussion, default method ContextExecutorService.wrap cannot exists because clearMissing issue is very important.
On the other hand, for similar purpose exists reactor.core.publisher.Hooks#enableAutomaticContextPropagation.
Why default is possible in reactor and not possible in metrics?

@chemicL
Copy link
Collaborator

chemicL commented Oct 10, 2024

The mechanism in reactor-core is specifically designed for reactive chains where Threads are reused to perform unrelated tasks from different chains. That's why the clearMissing flag is set to true - so there are no leaks from other tasks and that there is proper isolation of different layers of a reactive chain.

For instance in one layer a ThreadLocal should be present, while in an outer it should be cleared before any work is done.

Actually you touched on a very important aspect for which this flag exists. The previous default when this flag was not there was to not clear any values for missing keys in a provided context. That led to leaks. Fixing that "sane default" impact was quite an undertaking on both Reactor and Micrometer teams :)

Since context-propagation is agnostic of where it's going to be used it can not come up with a default as each case will be different. In case of Virtual Threads it might not play a role whether you clear missing values, while for wrapping a ForkJoinPool used in java.util.stream usages it will be required as it will be a similar issue as with reactive streams.

I think with the improvements suggested by @rstoyanchev that were added via #305 we can now close this issue. Having in mind the above explanation, I don't feel providing an easier way to propagate generic contexts would be helpful. The risks are more concerning.

@chemicL chemicL closed this as not planned Won't fix, can't repro, duplicate, stale Oct 10, 2024
@chemicL chemicL added question Further information is requested and removed feedback-provided labels Oct 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants