C++ BMI modules not compiled/linked with the same runtime as ngen likely won't work. #440

Open
hellkite500 opened this issue Sep 1, 2022 · 3 comments
Labels
documentation Improvements or additions to documentation

Comments

@hellkite500
Member

hellkite500 commented Sep 1, 2022

The C++ adapter interface for loading a C++-compiled, dynamically linked (shared) library requires that the library and ngen share a compatible C++ ABI, which isn't exactly trivial (see this post for a good description, as well as this for some additional context).

Current behavior

Compile a C++ BMI dynamic library module with a different compiler than the one ngen is built with. Load the model with ngen, and it will likely fail to even initialize, because the C++ BMI interface passes a std::string value to the Initialize function. The model library won't get the correct string and/or may segfault.

Expected behavior

Must either document C++ compatibility and build requirements for C++-based models, and ensure that builds are ABI compatible (built with the same compiler, flags, runtime, etc.)

OR

Use C-style linkage for the API and pass all the C++ std::string variables as char * for compatibility across dynamic library boundaries. This will require an adaptation of "C++ BMI" to provide a header with this interface that would be used for dynamic linking.
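
A minimal sketch of what the second option could look like. Everything named here (`ExampleModel`, the `example_bmi_*` functions) is hypothetical and not part of ngen or the BMI headers; the point is only that plain C types cross the library boundary and the std::string is built on the module's side.

```cpp
#include <string>

// Hypothetical C++ model; stands in for a real C++ BMI implementation.
class ExampleModel {
public:
    void Initialize(const std::string& config_file) { config_ = config_file; }
    const std::string& config() const { return config_; }
private:
    std::string config_;
};

// C linkage: only pointers and ints cross the boundary, so the caller and
// the module do not need to agree on the layout of std::string.
extern "C" {

// Hypothetical factory; the caller owns the returned opaque handle.
void* example_bmi_create() { return new ExampleModel(); }

// Returns 0 on success and 1 on failure.
int example_bmi_initialize(void* handle, const char* config_file) {
    if (handle == nullptr || config_file == nullptr) return 1;
    try {
        // The std::string is constructed inside the module, with the
        // module's own standard library.
        static_cast<ExampleModel*>(handle)->Initialize(std::string(config_file));
    } catch (...) {
        return 1;  // do not let exceptions escape through a C interface
    }
    return 0;
}

// Returns the stored config path, or nullptr for a null handle.
const char* example_bmi_config(void* handle) {
    if (handle == nullptr) return nullptr;
    return static_cast<ExampleModel*>(handle)->config().c_str();
}

void example_bmi_destroy(void* handle) {
    delete static_cast<ExampleModel*>(handle);
}

}  // extern "C"
```

The engine side would look these names up with dlsym and call them with `const char *` arguments, at the cost of maintaining a second, C-flavored header.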

Steps to replicate behavior (include URLs)

  1. Compile the test_bmi_c++ with clang (specifically reproducible with clang-1316.0.21.2.5)
  2. Compile ngen with gcc (specifically reproducible with 12.2.0)

Run test/test_bmi_cpp and all tests will fail to set up because Initialize fails to get a valid config string.

@mattw-nws mattw-nws added the documentation Improvements or additions to documentation label Nov 21, 2022
@mattw-nws
Contributor

Documentation label covers only part of this issue, but there is a documentation part.

@mattw-nws
Contributor

Since I'm coming through doing triage... yes, we encountered this in passing when we found we had to switch to the legacy ABI (use -D_GLIBCXX_USE_CXX11_ABI=0) with the Intel compilers (2023.3) to link properly with a dependent library--the C++ modules that had already been compiled broke and had to be recompiled with the same option.

I don't particularly like the recommendation to use C linkage for C++ models, though this is always an option for any implementer. While this is kind of a job hazard for C++ development generally, it's worth noting in the documentation for C++ models, perhaps citing this switch as an example case.

@PhilMiller
Contributor

Just adding a deeper explanatory note about the considerations here, since I came across this by chance.

Upshot

For the particular case we care about, of std::string, we may be able to use dlsym to look for an ABI tag in the loaded library that matches what we have compiled in the engine, and error out informatively if that fails. I'll look into this, now that I've thought of it.
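
One possible reading of that idea, as a rough sketch and not necessarily what will be implemented: with GNU libstdc++, a function taking std::string by value has a different mangled name under each string ABI, so the engine could ask dlsym for the name that matches its own build. The probe function `void bmi_probe(std::string)` and the helper names below are hypothetical, and the dlsym call itself is left out so the snippet stays self-contained.

```cpp
#include <string>

// Mangled names of a hypothetical `void bmi_probe(std::string)` under the
// two libstdc++ string ABIs (Itanium C++ ABI mangling).
constexpr const char* kProbeCxx11 =
    "_Z9bmi_probeNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE";
constexpr const char* kProbeLegacy = "_Z9bmi_probeSs";

// True if the mangled name refers to something in libstdc++'s
// std::__cxx11 inline namespace, i.e. the new string ABI.
inline bool has_cxx11_abi_marker(const std::string& mangled) {
    return mangled.find("7__cxx11") != std::string::npos;
}

// The symbol name the engine would hand to dlsym. Only meaningful when the
// engine is built against libstdc++; other standard libraries mangle
// std::string differently and are not handled by this sketch.
inline const char* probe_symbol_for_host() {
#if defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
    return kProbeCxx11;
#else
    return kProbeLegacy;
#endif
}
```

If dlsym returned a null pointer for `probe_symbol_for_host()`, the engine could report an ABI mismatch instead of failing later inside Initialize.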

Deeper discussion follows. Feel free to glaze your eyes and stop reading here.

Compilers

In practice, all C++ compilers in use on any given platform do agree on the language-level ABI where it goes beyond C (e.g. name mangling, dynamic dispatch, exception unwinding, argument passing, etc.). I think there are one or two rare cases where there have been changes, and the compilers involved emit warnings on the affected platforms when they see code that would potentially be affected.

C++ Standard library and compiler runtime library

The bigger differences have been within and across implementations of

  • versions of the language standard [1]
  • the language support library (what LLVM calls compiler-rt, IIRC; containing mostly exception handling support bits),
  • choice of C++ standard library implementation (GNU's libstdc++ vs LLVM's libc++ on Linux or macOS, either of those vs MS's implementation on Windows with WSL or Cygwin)

Language standard of course has to actually be in strict agreement, and different compilers may quietly cause disagreements between different modules because of diverging default settings. None of those that have happened are recent enough to impact us. [1]

The language support library and C++ standard library are generally less problematic. The typical configuration on most systems and package manager environments is that every compiler defaults to referencing the host system's main installation unless told otherwise on the command line.

std::string

The specific case of the C++11 std::string (and std::list) ABI controlled by _GLIBCXX_USE_CXX11_ABI is a bit special. It's only relevant with GNU libstdc++, which is often the default on Linux even when building with Clang, since it's usually the platform default. The standard changed in a way that required an ABI break for some implementations [2]. Many platforms that use GNU's libstdc++ defaulted to everything using the old ABI for a long time. It doesn't surprise me that Intel shipped a compiler configured to diverge from that practice.

  • [1] See https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1654r0.html §3
  • [2] Specifically, GNU's libstdc++ used a reference-counted copy-on-write string implementation, which is no longer permitted because of thread safety considerations. Others had taken a (very-)short-string-optimization approach that remained permissible, and which GNU libstdc++ has since adopted as its new ABI.
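
To make the std::string discussion concrete, here is a small sketch of how a translation unit can report which string ABI it was compiled with. `_GLIBCXX_USE_CXX11_ABI` and `__GLIBCXX__` are real libstdc++ macros (visible once any standard header is included); the enum and function names are made up for illustration.

```cpp
#include <string>  // also pulls in the libstdc++ configuration macros

enum class StdlibStringAbi { LibstdcxxCxx11, LibstdcxxLegacy, Other };

// Which std::string ABI this translation unit was compiled against.
constexpr StdlibStringAbi host_string_abi() {
#if defined(__GLIBCXX__) && defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
    return StdlibStringAbi::LibstdcxxCxx11;   // short-string-optimized layout
#elif defined(__GLIBCXX__)
    return StdlibStringAbi::LibstdcxxLegacy;  // reference-counted COW layout
#else
    return StdlibStringAbi::Other;            // libc++, MSVC, etc.
#endif
}

// Human-readable label, e.g. for a startup log line or an error message.
inline const char* describe(StdlibStringAbi abi) {
    switch (abi) {
        case StdlibStringAbi::LibstdcxxCxx11:  return "libstdc++ (C++11 string ABI)";
        case StdlibStringAbi::LibstdcxxLegacy: return "libstdc++ (legacy COW string ABI)";
        case StdlibStringAbi::Other:           return "non-libstdc++ standard library";
    }
    return "unknown";
}
```

Building the same file with and without `-D_GLIBCXX_USE_CXX11_ABI=0` on a libstdc++ system should flip the reported value, which is the same switch that forced the recompile described in the earlier comment.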


3 participants