Skip to content

Conversation

cijothomas
Copy link
Member

The current export method is marked async. Even when scenarios where export is done synchronously, the async is causing some unnecessary overheads (likely related to creating/updating the async state machines etc...)
This PR just shows the overhead is ~10% (250 ns ->275 ns). Not sure if this is mimicking the actual exporter correctly, would love to get some feedback on this.

@cijothomas cijothomas requested a review from a team August 14, 2024 17:32
Copy link

codecov bot commented Aug 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.7%. Comparing base (72ac56f) to head (ca96de4).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #2027     +/-   ##
=======================================
- Coverage   76.7%   76.7%   -0.1%     
=======================================
  Files        122     122             
  Lines      20828   20828             
=======================================
- Hits       15993   15992      -1     
- Misses      4835    4836      +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

impl LogProcessor for ExportingProcessorWithFuture {
fn emit(&self, data: &mut LogData) {
let mut exporter = self.exporter.lock().expect("lock error");
futures_executor::block_on(exporter.export(vec![data.clone()]));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we block on runtime our LogProcrssor? It's generally not really performat

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could try https://docs.rs/tokio/latest/tokio/runtime/struct.Runtime.html#method.block_on and see if there is difference. Maybe spin up the runtime as part of the Processor so we can reuse them

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. This is not testing the async runtime. Better to create the async runtime outside of the timing loop, and then block using tokio runtime.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified to use Tokio runtime's block_on, and perf drastically dropped. (280ns to 350 ns)

#[derive(Debug)]
struct ExportingProcessorWithFuture {
    exporter: Mutex<NoOpExporterWithFuture>,
    rt: Runtime,
}

impl ExportingProcessorWithFuture {
    fn new(exporter: NoOpExporterWithFuture) -> Self {
        let rt = Runtime::new().unwrap();
        ExportingProcessorWithFuture {
            exporter: Mutex::new(exporter),
            rt,      
        }
    }
}

impl LogProcessor for ExportingProcessorWithFuture {
    fn emit(&self, data: &mut LogData) {
        let mut exporter = self.exporter.lock().expect("lock error");
        // futures_executor::block_on(exporter.export(vec![data.clone()]));
        self.rt.block_on(exporter.export(vec![data.clone()]));
    }

    fn force_flush(&self) -> LogResult<()> {
        Ok(())
    }

    fn shutdown(&self) -> LogResult<()> {
        Ok(())
    }
}

Copy link
Contributor

@TommyCpp TommyCpp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is measuring the case when the user doesn't have an async runtime?

@cijothomas
Copy link
Member Author

So this is measuring the case when the user doesn't have an async runtime?

Its immaterial if user has async runtime or not. The exporter (eg: exporters to systems like Windows ETW, Linux user_events) does not need async runtime, as it is executing synchronously.

@TommyCpp
Copy link
Contributor

TommyCpp commented Aug 14, 2024

Its immaterial if user has async runtime or not. The exporter (eg: exporters to systems like Windows ETW, Linux user_events) does not need async runtime, as it is executing synchronously.

OK so it's testing the case where the exporter doesn't need async runtime. Does it covers the case where the exporter requires async runtime. e.g HTTP or gRPC exporter based on async HTTP and gRPC crate?

Copy link
Contributor

@TommyCpp TommyCpp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likely related to creating/updating the async state machines etc.

State machine should create during compiling time so we shouldn't pay any cost during runtime.

Running state machine does have runtime overhead, but given the async function doesn't do anything the state machine probably only have one state. I think the overhead is mostly coming from the runtime setup here rather than the state machine itself.

@cijothomas
Copy link
Member Author

Its immaterial if user has async runtime or not. The exporter (eg: exporters to systems like Windows ETW, Linux user_events) does not need async runtime, as it is executing synchronously.

OK so it's testing the case where the exporter doesn't need async runtime. Does it covers the case where the exporter requires async runtime. e.g HTTP or gRPC exporter based on async HTTP and gRPC crate?

It doesn't.

@cijothomas
Copy link
Member Author

likely related to creating/updating the async state machines etc.

State machine should create during compiling time so we shouldn't pay any cost during runtime.

Running state machine does have runtime overhead, but given the async function doesn't do anything the state machine probably only have one state. I think the overhead is mostly coming from the runtime setup here rather than the state machine itself.

Not sure I understand. There is no runtime here. So the overhead has to be coming from the usage of async. (The benchmarks are both doing exact same thing, only differing in the use of async for export)

@TommyCpp
Copy link
Contributor

TommyCpp commented Aug 14, 2024

There is no runtime here

futures_executor::block_on is a runtime(well, maybe async executor will be a better name) as async tasks doesn't run without a execturor

All asynchronous computation occurs within an executor

It doesn't change tha fact async tasks has overhead because of the need of exector. I probably went into the too details sutff here

@cijothomas
Copy link
Member Author

@TommyCpp @lalitb
the PR is just to show that forcing async in places where it is not required has some overhead. The exporter trait is currently forcing async and is only good if the exporter actually need async call like doing http or grpc etc. In other words, our exporter is making assumptions that every exporter will need to do io calls and benefit from async but that is not the case.

The discussion about runtime etc. is not very relevant (apologies my knowledge of async runtime is very limited).

I have some workarounds in mind but first step was to validate the assumption that introducing async where it is not required is causing unnecessary perf hit.

@TommyCpp
Copy link
Contributor

TommyCpp commented Aug 15, 2024

The discussion about runtime etc. is not very relevant (apologies my knowledge of async runtime is very limited).

Yep. Sorry I kind of hijacked the discussion 😞

I have some workarounds

Would it make sense to offer two versions(one for sync, one for async). It's not uncommon in Rust eco system. One example would be reqwest, which offers async and blocking implementation

Copy link
Contributor

@TommyCpp TommyCpp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmark looks good. I have a few thoughts on how to setup runtime/try differnt runtime to minimize the cost. I can explore it offline once we merge it

@cijothomas
Copy link
Member Author

Opened an issue to track the change : #2031
Wasn't planning to merge this benchmark originally, but no harm in keeping it either.

@cijothomas cijothomas merged commit 108161c into open-telemetry:main Aug 16, 2024
@cijothomas cijothomas deleted the cijothomas/future-exporter branch September 4, 2024 00:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants