Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUILD] Improve compilation speeds by avoiding std::regex instantiation in header file #2431

Closed
kylepl opened this issue Dec 5, 2023 · 5 comments
Assignees

Comments

@kylepl
Copy link
Contributor

kylepl commented Dec 5, 2023

As a follow-up to #2430, I realized that std::regex instantiation takes a decent chunk of time (like 500-750ms on my dev laptop). and given it does not need to exist (since it is needed only once, not for every translation unit that users Tracer), seems like a nice thing to improve.

I'm looking for advice on how to do so - basically where to put a definition in a .cc file. All of /api/ is just headers from what I can see, and so looking for guidance about that.

Thanks!

@github-actions github-actions bot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Dec 5, 2023
@marcalff marcalff self-assigned this Dec 12, 2023
@marcalff
Copy link
Member

@kylepl

Thanks for the previous cleanup on regex.

For architectural reasons, the entire opentelemetry-cpp API MUST be header only, this is a very strong constraint. We even managed the not so trivial feat to implement header only singletons in C++.

Regarding the use of regex in the SDK, this part is trivial, and I think all uses are located in a *.cc file already.

About the use of regex in the API, the only place that depends on it comes from trace_state.h.

The following methods:

  • TraceState::FromHeader()
  • TraceState::Get()
  • TraceState::Set()

all depends indirectly on regex.

Now, because trace_state.h is used in span_context.h, and because span_context.h is used in many places, indeed the regex code is compiled in many translation units.

Possible techniques to cut down dependencies like this are:

  • move code in a different helper class, using friends declarations if needed, and using a forward declaration to cut
  • move the inlined, in class implementation of a method to an inlined, outside of class implementation, defined in a separate header file (trace_state_inline.h)

The main challenge here is to do so without breaking the API, so that existing user code calling my_trace_state->Set() still build and run properly.

From a quick investigation on this topic, I can't think of a viable fix to limit the regex exposure from trace_state.h (without breaking changes that is).

A totally different solution can be to also force OPENTELEMETRY_HAVE_WORKING_REGEX to be 0, so that the alternate (non regex) code is used instead, even when a "working" regex is available.

This would require yet another cmake compiling option.

@kylepl
Copy link
Contributor Author

kylepl commented Dec 13, 2023

I appreciate the detailed response!

For my local builds, I am already using OPENTELEMETRY_HAVE_WORKING_REGEX=0, which does do the job.

Can you expand on the rationale for the header only requirement? As in, totally understand it can't be changed now, but I'm curious what benefit it provides other than potentially simpler inclusion.

@marcalff
Copy link
Member

marcalff commented Dec 13, 2023

Can you expand on the rationale for the header only requirement? As in, totally understand it can't be changed now, but I'm curious what benefit it provides other than potentially simpler inclusion.

Please check this discussion:

#1520 (comment)

The main issue is adoption of opentelemetry: someone shipping an instrumented library should not break the link, or force the link to change, for applications using the instrumented library.

@marcalff marcalff removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Dec 13, 2023
@marcalff marcalff changed the title Improve compilation speeds by avoiding std::regex instantiation in header file [BUILD] Improve compilation speeds by avoiding std::regex instantiation in header file Dec 13, 2023
@marcalff
Copy link
Member

Improving build time is desirable, but this issue is not bug.

A work around is to build without std::regex (with minor tweaks to unset OPENTELEMETRY_HAVE_WORKING_REGEX).

Closing as not planned, as there is no known way to improve this while keeping a header only API, which is a strong constraint that must be satisfied.

@marcalff marcalff closed this as not planned Won't fix, can't repro, duplicate, stale Dec 13, 2023
@kylepl
Copy link
Contributor Author

kylepl commented Dec 13, 2023

Thanks for the follow-up information.

Perhaps I misunderstand, and I do not say this too confidently as I am no linker expert, but it seems possible to achieve the goals of:

When instrumenting library X, the library owner has no control, and should not be concerned with, whether the library will be deployed:
   * in a binary using opentelemetry-cpp (SDK) or not
   * in a binary using a given SDK, exporter, processor, etc
   * in a binary using another SDK, exporter, processor, etc

while still having .cc files - they just need to be linked into the library's archive too. Basically my belief is the symbols need to get to the archive, but they do not need to come in from headers in each compilation unit, but can come from linkage with another one.

And I'll re-iterate, not saying to change it now, just trying to understand.

But closing the bug seems fine, as that is a much larger change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants