Add log export benchmark to measure the cost paid for async abstraction #2027
Conversation
Codecov Report: all modified and coverable lines are covered by tests ✅

@@ Coverage Diff @@
##            main   #2027     +/- ##
=======================================
- Coverage   76.7%   76.7%    -0.1%
=======================================
  Files        122     122
  Lines      20828   20828
=======================================
- Hits       15993   15992       -1
- Misses      4835    4836       +1

View full report in Codecov by Sentry.
impl LogProcessor for ExportingProcessorWithFuture {
    fn emit(&self, data: &mut LogData) {
        let mut exporter = self.exporter.lock().expect("lock error");
        futures_executor::block_on(exporter.export(vec![data.clone()]));
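The pattern in the snippet above — blocking on the exporter's future from a synchronous `emit` — can be sketched without any external executor crate. The following is a std-only illustration (`noop_export` and `block_on_ready` are hypothetical stand-ins, not the crate's API); it only handles futures that complete on their first poll, which is exactly the no-op exporter case being benchmarked:

```rust
use std::future::Future;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A waker that does nothing: sufficient for futures that are ready on first poll.
fn noop_raw_waker() -> RawWaker {
    fn no_op(_: *const ()) {}
    fn clone(_: *const ()) -> RawWaker {
        noop_raw_waker()
    }
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, no_op, no_op, no_op);
    RawWaker::new(std::ptr::null(), &VTABLE)
}

// Minimal "block_on" for immediately-ready futures, like a no-op exporter's.
fn block_on_ready<F: Future>(fut: F) -> F::Output {
    let waker = unsafe { Waker::from_raw(noop_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(out) => out,
        Poll::Pending => panic!("future did not complete on first poll"),
    }
}

// Hypothetical stand-in for the exporter's async export method.
async fn noop_export(batch: Vec<u32>) -> usize {
    batch.len()
}

fn main() {
    let exported = block_on_ready(noop_export(vec![1, 2, 3]));
    assert_eq!(exported, 3);
}
```

A real executor like `futures_executor::block_on` additionally parks the thread and wakes it when the future becomes ready; for a future that is ready immediately, that machinery is pure overhead, which is what the benchmark tries to isolate.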
Do we block on the runtime in our LogProcessor? It's generally not really performant.
We could try https://docs.rs/tokio/latest/tokio/runtime/struct.Runtime.html#method.block_on and see if there is a difference. Maybe spin up the runtime as part of the Processor so we can reuse it.
+1. This is not testing the async runtime. Better to create the async runtime outside of the timing loop, and then block using tokio runtime.
Modified to use Tokio runtime's block_on, and perf drastically dropped (280 ns to 350 ns).
#[derive(Debug)]
struct ExportingProcessorWithFuture {
    exporter: Mutex<NoOpExporterWithFuture>,
    rt: Runtime,
}

impl ExportingProcessorWithFuture {
    fn new(exporter: NoOpExporterWithFuture) -> Self {
        let rt = Runtime::new().unwrap();
        ExportingProcessorWithFuture {
            exporter: Mutex::new(exporter),
            rt,
        }
    }
}

impl LogProcessor for ExportingProcessorWithFuture {
    fn emit(&self, data: &mut LogData) {
        let mut exporter = self.exporter.lock().expect("lock error");
        // futures_executor::block_on(exporter.export(vec![data.clone()]));
        self.rt.block_on(exporter.export(vec![data.clone()]));
    }

    fn force_flush(&self) -> LogResult<()> {
        Ok(())
    }

    fn shutdown(&self) -> LogResult<()> {
        Ok(())
    }
}
So this is measuring the case when the user doesn't have an async runtime?
It's immaterial whether the user has an async runtime or not. The exporter (e.g. exporters to systems like Windows ETW, Linux user_events) does not need an async runtime, as it executes synchronously.
OK, so it's testing the case where the exporter doesn't need an async runtime. Does it cover the case where the exporter requires an async runtime, e.g. an HTTP or gRPC exporter based on an async HTTP or gRPC crate?
likely related to creating/updating the async state machines etc.
The state machine should be created at compile time, so we shouldn't pay any cost for that at runtime.
Running the state machine does have runtime overhead, but given that the async function doesn't do anything, the state machine probably only has one state. I think the overhead is mostly coming from the runtime setup here rather than from the state machine itself.
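On the compile-time point above: calling an `async fn` does no work by itself; it only constructs the compiler-generated state machine, whose layout is fixed at compile time. A small std-only sketch (the `noop_export` function is a hypothetical stand-in for the exporter, not the crate's API):

```rust
use std::mem::size_of_val;

// Hypothetical stand-in for an exporter whose async body has no .await points.
async fn noop_export(batch: Vec<u64>) -> usize {
    batch.len()
}

fn main() {
    // Calling the async fn only builds the state machine; nothing runs yet.
    let fut = noop_export(vec![1, 2, 3]);
    // The state machine's size is known statically: roughly the captured
    // argument (a Vec here) plus a state discriminant.
    println!("state machine size: {} bytes", size_of_val(&fut));
    drop(fut); // never polled, so the body never executed
}
```

Constructing the state machine is cheap; the measurable cost comes from polling it through an executor, which is what the two benchmark variants differ in.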
It doesn't.
Not sure I understand. There is no runtime here, so the overhead has to be coming from the usage of async. (The benchmarks are both doing the exact same thing, only differing in the use of async for export.)
It doesn't change the fact that async tasks have overhead because of the need for an executor. I probably went too deep into the details here.
@TommyCpp @lalitb The discussion about the runtime etc. is not very relevant (apologies, my knowledge of async runtimes is very limited). I have some workarounds in mind, but the first step was to validate the assumption that introducing async where it is not required causes an unnecessary perf hit.
Yep. Sorry I kind of hijacked the discussion 😞
Would it make sense to offer two versions (one for sync, one for async)? It's not uncommon in the Rust ecosystem. One example would be
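A sketch of what such a split could look like (all names here — `LogExporter`, `AsyncLogExporter`, `EtwLikeExporter` — are hypothetical illustrations, not the crate's actual API): synchronous back-ends implement a plain trait and never touch a future, while async back-ends get an `async fn` in a separate trait (stable since Rust 1.75):

```rust
// Synchronous surface: ETW/user_events-style exporters that never need a runtime.
trait LogExporter {
    fn export(&mut self, batch: Vec<String>) -> Result<(), String>;
}

// Async surface for HTTP/gRPC-style exporters.
// (`async fn` in traits is stable since Rust 1.75.)
#[allow(dead_code)]
trait AsyncLogExporter {
    async fn export(&mut self, batch: Vec<String>) -> Result<(), String>;
}

// A synchronous back-end pays no async cost at all.
struct EtwLikeExporter {
    exported: usize,
}

impl LogExporter for EtwLikeExporter {
    fn export(&mut self, batch: Vec<String>) -> Result<(), String> {
        self.exported += batch.len(); // plain synchronous write, no runtime
        Ok(())
    }
}

fn main() {
    let mut exporter = EtwLikeExporter { exported: 0 };
    exporter.export(vec!["a".into(), "b".into()]).unwrap();
    assert_eq!(exporter.exported, 2);
}
```

The trade-off is API duplication: processors would need to be generic over (or have variants for) the two exporter traits.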
Benchmark looks good. I have a few thoughts on how to set up the runtime / try different runtimes to minimize the cost. I can explore it offline once we merge this.
Opened an issue to track the change: #2031
The current export method is marked async. Even in scenarios where export is done synchronously, the async is causing some unnecessary overhead (likely related to creating/updating the async state machines, etc.).
This PR just shows the overhead is ~10% (250 ns -> 275 ns). Not sure if this is mimicking the actual exporter correctly; would love to get some feedback on this.
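The numbers above come from the PR's Criterion benchmarks. As a rough, self-contained illustration of the measurement structure only (function names are hypothetical; absolute numbers depend on hardware, and real measurements should use Criterion):

```rust
use std::hint::black_box;
use std::time::Instant;

// Time a closure over many iterations and return nanoseconds per call.
fn bench<F: FnMut()>(iters: u32, mut f: F) -> f64 {
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed().as_nanos() as f64 / iters as f64
}

// Hypothetical synchronous stand-in for the exporter body.
fn sync_export(batch: &[u64]) -> usize {
    batch.len()
}

fn main() {
    let batch = vec![1u64, 2, 3];
    // black_box keeps the optimizer from deleting the measured work.
    let ns = bench(1_000_000, || {
        black_box(sync_export(black_box(&batch)));
    });
    println!("sync export: {ns:.1} ns/op");
}
```

Comparing this loop against one that drives the async variant through an executor is what surfaces the per-call abstraction cost being discussed.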