Async/callback based interface between Span/Log Processor and Span/Log Exporter #1239
To start with, this new interface can be quickly integrated with the Zipkin and OTLP HTTP exporters, as the existing ext::http::client supports a callback mechanism.
This also applies to the Log Processor and Log Exporter.
Hi @lalitb, I would like to explore this, obviously with your guidance, if that is ok.
Thanks, @DebajitDas. Have assigned it to you now. Feel free to ask any questions.
@DebajitDas Also, you may want to look into a related PR #63, which was an attempt to add async support using libevent. The PR adds support for file I/O and timers; networking was to follow. The PR was never merged for various reasons. JFYI, feel free to decide on the best approach for adding the async support :)
Thanks @lalitb. I am looking at the C++ async mechanism to see if it can be used to achieve an asynchronous Export call.
I created another issue #1243 to implement async send. Some frameworks implement their own event and timer management (e.g. Redis). It needs a different poller for each system: IOCP for Windows, epoll for Linux (maybe io_uring for modern Linux kernels), kqueue for BSD, evport for Solaris, etc. I'm thinking of using
@DebajitDas You are right. The abstract method http::client::Session::SendRequest takes a callback function as an argument, but its curl implementation http::client::curl::Session::SendRequest is synchronous, i.e., the method returns only after the request is completed and the callback function has been called. This should probably change once @owent implements #1243. But we can take full advantage of that change only once Exporter::Export() also takes a callback as an argument. Also, once we support multiple HTTP clients along with curl (#1145 and #1146), these clients may support request handling through polling, and it would be good if the SDK is already prepared to handle that.
@owent I agree on this. Any async mechanism requires an event loop to run, and this is missing in C++. std::async potentially uses a separate thread to run the operation, which is not a true async mechanism. That is the reason PR #63, which uses libevent, is something we can think about. It would also be useful for the metrics implementation, where we need to poll metrics from multiple instruments simultaneously. What are your thoughts on using libevent?
In my opinion, we should let the user decide which event loop to use. Should we also implement an adaptor, just like hiredis does? And we can use libevent by default.
I also use libwebsockets in our project; it also supports using libuv, libevent, libev, libubox/uloop, etc. as the event loop. I think libuv and libevent are the most widely used now.
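To make the "pluggable event loop" idea concrete, here is a minimal sketch of what such an adaptor interface could look like (all names here are hypothetical and not part of the SDK); a libevent-, libuv- or application-provided loop would then sit behind this abstraction:

```cpp
#include <chrono>
#include <functional>

// Hypothetical adaptor: the SDK talks only to this abstraction, and users
// plug in libevent, libuv, or their framework's own loop behind it.
class EventLoopAdaptor
{
public:
  virtual ~EventLoopAdaptor() = default;

  // Watch a raw socket fd and invoke the callback when it becomes readable.
  virtual void AddFdWatch(int fd, std::function<void(int)> on_readable) = 0;
  virtual void RemoveFdWatch(int fd) = 0;

  // One-shot timer, e.g. for request timeouts or periodic metric collection.
  virtual void AddTimer(std::chrono::milliseconds delay, std::function<void()> on_timeout) = 0;

  // Run/stop the loop; a libevent-backed adaptor would map these to
  // event_base_dispatch() / event_base_loopbreak().
  virtual void Run()  = 0;
  virtual void Stop() = 0;
};
```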
From the interface perspective of Processor ---> Exporter, a callback mechanism could be achieved along the following lines:
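A rough sketch of one possible shape for that interface, stripped down to standard-library types for brevity (the real signature would use nostd::span, nostd::function_ref and sdk::common::ExportResult, as in the exporter example further down):

```cpp
#include <functional>
#include <memory>
#include <vector>

// Placeholder types standing in for the SDK's Recordable and ExportResult.
enum class ExportResult { kSuccess, kFailure };
struct Recordable {};

// Callback-based exporter interface: instead of blocking and returning an
// ExportResult, Export() receives a callback and invokes it whenever the
// result of the (possibly asynchronous) export becomes known.
class CallbackSpanExporter
{
public:
  virtual ~CallbackSpanExporter() = default;

  virtual void Export(std::vector<std::unique_ptr<Recordable>> &&spans,
                      std::function<bool(ExportResult)> result_callback) noexcept = 0;
};
```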
I might be wrong in my understanding and not fully up to speed on the context. Please correct me if I am missing something.
I would prefer to offload most of the complexity of thread/concurrency management to the Span/Log Processor and keep the exporter logic focused on serialization and transport. This will also ease the work for exporter developers. If we can scope this issue to only the interface change, and think about concurrency separately, the rough changes for ElasticsearchLogExporter would be (not tested):

```cpp
class ResponseHandler : public http_client::EventHandler
{
public:
  /**
   * Creates a response handler that reports the export result through the
   * processor-supplied callback; by default it doesn't log to the console.
   */
  ResponseHandler(nostd::function_ref<bool(ExportResult)> result_callback,
                  std::shared_ptr<Session> session,
                  bool console_debug = false)
      : console_debug_{console_debug},
        result_callback_{result_callback},
        session_{std::move(session)}
  {}

  /**
   * Automatically called when the response is received; finish the session
   * and report the result through the callback.
   */
  void OnResponse(http_client::Response &response) noexcept override
  {
    // Log an error if the response body contains one
    session_->FinishSession();
    result_callback_(sdk::common::ExportResult::kSuccess);
  }

  void OnEvent(http_client::SessionState state, nostd::string_view reason) noexcept override
  {
    // Log the reason/state; in this rough sketch every session event is treated
    // as a failure, a real handler would only fail on error states
    session_->FinishSession();
    result_callback_(sdk::common::ExportResult::kFailure);
  }

private:
  bool console_debug_;
  // function_ref is non-owning; keeping the handler alive until the callback
  // fires is the lifetime question discussed further below
  nostd::function_ref<bool(ExportResult)> result_callback_;
  std::shared_ptr<Session> session_;
};
void ElasticsearchLogExporter::Export(
    const nostd::span<std::unique_ptr<sdklogs::Recordable>> &records,
    nostd::function_ref<bool(ExportResult)> result_callback) noexcept
{
  // Create a connection to the ElasticSearch instance
  auto session = http_client_->CreateSession(options_.host_ + std::to_string(options_.port_));
  auto request = session->CreateRequest();
  request->SetUri(options_.index_ + "/_bulk?pretty");
  request->SetMethod(http_client::Method::Post);
  request->AddHeader("Content-Type", "application/json");
  request->SetTimeoutMs(std::chrono::milliseconds(1000 * options_.response_timeout_));
  auto body_vec = serialise(records);  // serialise() stands in for the existing JSON body construction
  request->SetBody(body_vec);
  // NOTE: keeping the handler alive beyond this scope is the ownership/lifetime
  // question discussed further down in this thread
  std::unique_ptr<ResponseHandler> handler(
      new ResponseHandler(result_callback, session, options_.console_debug_));
  session->SendRequest(*handler);
}
```

Again, it would be good to have more thoughts before finalizing; we can also discuss this in the community meeting.
Agree. Can we also provide an internal event loop and let the exporters decide whether to use it or spawn a new thread to finish the export?
Yes, we definitely need to have an internal event loop. In the case of the HTTP exporter, the decision to use the event loop for I/O should be with the ext::http::client implementation. So if ext::http::curl wants to use curl_multi_poll, it should use the event loop if available. Do you think it would be beneficial for non-HTTP exporters?
Yes. And if other exporters use any SDK that allows users to get the raw socket fd, we can also use our internal event loop to handle it.
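For reference, this is roughly what the curl multi interface looks like when several transfers are driven from a single loop (plain libcurl, requires 7.66+ for curl_multi_poll; the URL is just an example, and this is not the SDK's HTTP client code):

```cpp
#include <curl/curl.h>

// Minimal multi-interface loop: easy handles are driven from one thread;
// curl_multi_poll() sleeps until any transfer has activity.
int main()
{
  curl_global_init(CURL_GLOBAL_DEFAULT);
  CURLM *multi = curl_multi_init();

  CURL *easy = curl_easy_init();
  curl_easy_setopt(easy, CURLOPT_URL, "http://localhost:9411/api/v2/spans");  // example endpoint
  curl_multi_add_handle(multi, easy);

  int still_running = 0;
  do
  {
    curl_multi_perform(multi, &still_running);

    int numfds = 0;
    // Wait (up to 1000 ms) for activity on any of the transfers.
    curl_multi_poll(multi, nullptr, 0, 1000, &numfds);

    // Completed transfers show up here; this is where a per-request
    // callback (e.g. the exporter's result_callback) could be fired.
    CURLMsg *msg;
    int msgs_left = 0;
    while ((msg = curl_multi_info_read(multi, &msgs_left)) != nullptr)
    {
      if (msg->msg == CURLMSG_DONE)
        curl_multi_remove_handle(multi, msg->easy_handle);
    }
  } while (still_running > 0);

  curl_easy_cleanup(easy);
  curl_multi_cleanup(multi);
  curl_global_cleanup();
  return 0;
}
```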
@lalitb Thanks for clarifying the scope of this issue. With the above, we need to ensure that the ResponseHandler object remains valid even when the callback is called on a separate thread. My follow-up question now is: who owns the handler and cleans it up once the callback has been invoked?
It's a good point. Probably we need to pass ownership of the ResponseHandler from the exporter to the HttpClient, and let it clean up after invoking the callback. This would need changes in the HTTP client library.
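A sketch of what an ownership-transferring variant could look like (hypothetical signatures, not the current ext::http::client API):

```cpp
#include <memory>

// Sketch of the relevant parts of the HTTP client session interface.
class EventHandler
{
public:
  virtual ~EventHandler() = default;
  // OnResponse()/OnEvent() as in the existing ext::http::client interface.
};

class Session
{
public:
  virtual ~Session() = default;

  // Current style: the caller must keep the handler alive for the whole request.
  virtual void SendRequest(EventHandler &handler) noexcept = 0;

  // Possible addition: the session takes ownership of the handler and destroys
  // it once the request has finished and the last callback has been delivered,
  // so the exporter's Export() can safely return right away.
  virtual void SendRequest(std::unique_ptr<EventHandler> handler) noexcept = 0;
};
```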
Yes, we need to do that for backward compatibility. Though maintaining ABI compatibility would be tricky, as any external exporter (built against the old SDK) would no longer work with the new SDK, and would need recompilation.
True.
Thanks @lalitb for clarifying the doubts. After the new interface is in place, should we start with one of the exporters?
If it is possible to start with one of the exporters, say Zipkin, that would be good. Zipkin uses the HTTP sync client as of now, so it is a good use case to convert it to use the HTTP async client and see how it works. But if we are modifying the HTTP client library (say, to transfer ownership of the EventHandler and manage callbacks), this may require changes in other exporters.
Have created #1250 for supporting the event loop framework. Feel free to modify the issue or add more data there.
As pointed out in https://github.com/open-telemetry/opentelemetry-specification/issues/2434#issuecomment-1076997385, I use the variable name max_concurrent_requests. What's your suggestion about the default value?
Yes, I had that in mind regarding the default value for max_concurrent_requests; to start with, we can go with 8 as the default value.
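For illustration, the knob could live in the exporter or HTTP-client options along these lines (the struct and the other fields are hypothetical; only the max_concurrent_requests name and the default of 8 come from the discussion above):

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Illustrative options struct (not an existing SDK type): max_concurrent_requests
// bounds how many exports may be in flight at once, with 8 as the tentative default.
struct AsyncHttpExporterOptions
{
  std::string url;
  std::chrono::milliseconds timeout{30000};
  std::size_t max_concurrent_requests = 8;
};
```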
This issue was marked as stale due to lack of activity. It will be closed in 7 days if no further activity occurs.
According to the latest https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk.md, Export() will never be called concurrently.
#1413 fixes this.
As of now, the flow of spans from Span Processor to Exporter is synchronous, which means the processor has to wait for the upload/export status of the ongoing request before invoking a new request.
Application -> Span::OnEnd() -> SimpleProcessor::OnEnd -> Mutex Lock -> Exporter::Export()
The application has to wait for the span to be uploaded to continue further with business logic.
The Batch processor simplifies this by batching/caching the spans in the processor and uploading them in a separate thread.
This allows the application to continue processing business logic without waiting for spans to export, and Span::OnEnd() will not block even while an export is ongoing in a separate thread (there is no lock, thanks to the use of a CircularBuffer to store spans).
With the BatchProcessor, while the interface between application and processor is fast (no wait), the interface between processor and exporter is still synchronous, as the processor needs to wait for the ongoing span export to finish before invoking a new export. This causes a bottleneck if the application creates spans faster than the exporter can export them, and eventually the CircularBuffer gets full. Once the CircularBuffer is full, Span::OnEnd() needs to wait/block until the buffer is available for writing.
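To make the bottleneck concrete, here is a minimal, self-contained sketch of a batch worker under today's blocking Export() contract (placeholder types and a mutex-guarded vector instead of the SDK's lock-free CircularBuffer; this is not the actual BatchSpanProcessor code):

```cpp
#include <atomic>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Placeholder types standing in for sdk::trace::Recordable and SpanExporter.
struct Recordable {};
struct SpanExporter
{
  virtual ~SpanExporter() = default;
  // Blocking contract: returns only once the upload has finished.
  virtual void Export(std::vector<std::unique_ptr<Recordable>> batch) noexcept = 0;
};

// Because Export() blocks, spans leave the buffer only as fast as the exporter
// can upload them; if the application produces spans faster than that, the
// buffer fills up and Span::OnEnd() starts blocking on it.
class BatchWorker
{
public:
  explicit BatchWorker(std::unique_ptr<SpanExporter> exporter)
      : exporter_(std::move(exporter)), worker_([this] { Run(); })
  {}

  ~BatchWorker()
  {
    shutdown_ = true;
    cv_.notify_all();
    worker_.join();
  }

  // Called from Span::OnEnd() on the application thread.
  void Enqueue(std::unique_ptr<Recordable> span)
  {
    std::lock_guard<std::mutex> lk(mu_);
    buffer_.push_back(std::move(span));
    cv_.notify_one();
  }

private:
  void Run()
  {
    while (!shutdown_)
    {
      std::vector<std::unique_ptr<Recordable>> batch;
      {
        std::unique_lock<std::mutex> lk(mu_);
        cv_.wait(lk, [this] { return shutdown_ || !buffer_.empty(); });
        batch.swap(buffer_);
      }
      if (!batch.empty())
        exporter_->Export(std::move(batch));  // worker is blocked here until the upload finishes
    }
  }

  std::unique_ptr<SpanExporter> exporter_;
  std::mutex mu_;
  std::condition_variable cv_;
  std::vector<std::unique_ptr<Recordable>> buffer_;
  std::atomic<bool> shutdown_{false};
  std::thread worker_;
};
```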
The current design of the interface between processor and exporter is driven by the specification as below.
So, the solution could be to return the Exporter::Export() result through a callback or an async mechanism.
This allows the exporter to invoke the result_callback when the export status is available, and the processor to invoke the next export without waiting for the ongoing export to complete.
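A rough sketch of what the processor side could look like with such a callback-based exporter (the class name and the in-flight bookkeeping are illustrative only; the exporter example earlier in the thread uses nostd::function_ref<bool(ExportResult)> for the callback):

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Placeholder types; the real interface would use sdk::common::ExportResult,
// nostd::span and nostd::function_ref.
enum class ExportResult { kSuccess, kFailure };
struct Recordable {};
struct AsyncExporter
{
  virtual ~AsyncExporter() = default;
  // Returns immediately; result_callback fires once the upload has finished.
  virtual void Export(std::vector<std::unique_ptr<Recordable>> batch,
                      std::function<void(ExportResult)> result_callback) noexcept = 0;
};

// Processor-side bookkeeping: new exports start without waiting for the
// previous one, bounded by a hypothetical max_concurrent_requests limit.
class AsyncBatchDispatcher
{
public:
  explicit AsyncBatchDispatcher(std::unique_ptr<AsyncExporter> exporter,
                                std::size_t max_concurrent_requests = 8)
      : exporter_(std::move(exporter)), max_in_flight_(max_concurrent_requests)
  {}

  void ExportBatch(std::vector<std::unique_ptr<Recordable>> batch)
  {
    if (in_flight_.load() >= max_in_flight_)
    {
      // Back-pressure policy (wait, drop, or retry later) is not shown here.
      return;
    }
    ++in_flight_;
    exporter_->Export(std::move(batch), [this](ExportResult /*result*/) {
      // Invoked by the exporter, possibly on another thread, when done.
      --in_flight_;
    });
  }

private:
  std::unique_ptr<AsyncExporter> exporter_;
  std::size_t max_in_flight_;
  std::atomic<std::size_t> in_flight_{0};
};
```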
Also, the interface change should be backward compatible and shouldn't break existing/external exporters. One option would be to have the processor decide which export mechanism to use based on a configurable parameter. The default should be the existing behaviour (i.e., synchronous export).