Description
Describe the issue
Python wheels containing the MIGraphX and/or ROCm EP fail when built in a pyenv environment. We consistently see errors reporting that libonnxruntime_providers_shared.so cannot be found, and failures on a simple import.
I've built and tested in various containers and Conda/venv environments and don't see the issue there. Builds off 1.19.0 also work without any issue in a pyenv environment.
Currently the only way forward with newer wheels is for the user to hard-set LD_LIBRARY_PATH to the location of the installed wheel files.
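For anyone needing the workaround in the meantime, here is a minimal sketch of how the directory could be located. It assumes the provider libraries sit in the `capi` subdirectory of the installed `onnxruntime` package (worth verifying on your install); `find_capi_dir` and `prepend_ld_library_path` are hypothetical helper names, not part of onnxruntime. It avoids importing onnxruntime itself, since the import is what fails.

```python
import importlib.util
import os
from pathlib import Path


def find_capi_dir(package="onnxruntime"):
    """Return the package's capi directory without importing the package, or None."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    # Assumption: the shared provider libraries are shipped under <package>/capi.
    capi = Path(list(spec.submodule_search_locations)[0]) / "capi"
    return capi if capi.is_dir() else None


def prepend_ld_library_path(directory, current=None):
    """Build an LD_LIBRARY_PATH value with `directory` placed first, without duplicates."""
    if current is None:
        current = os.environ.get("LD_LIBRARY_PATH", "")
    directory = str(directory)
    rest = [p for p in current.split(":") if p and p != directory]
    return ":".join([directory] + rest)


if __name__ == "__main__":
    capi_dir = find_capi_dir()
    if capi_dir is None:
        print("onnxruntime is not installed in this environment")
    else:
        # Print an export line to run in the shell before starting Python.
        print(f'export LD_LIBRARY_PATH="{prepend_ld_library_path(capi_dir)}"')
```

The script prints an `export` line rather than modifying `os.environ`, because the dynamic loader typically reads LD_LIBRARY_PATH at process start, so changing it from inside an already running interpreter is unlikely to help.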
Urgency
This is blocking development and testing off the latest changes for an upcoming ROCm release.
Target platform
Ubuntu 24.04
Build script
Uses the same build script used in CI for both the MIGraphX and ROCm EPs.
Error / output
Example output for the error case:
Python 3.12.8 | packaged by Anaconda, Inc. | (main, Dec 11 2024, 16:31:09) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import onnxruntime
2025-02-05 02:24:29.493499078 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:2318 CreateInferencePybindStateModule] Init provider bridge failed.
>>> s = onnxruntime.InferenceSession('model.onnx', providers=['MIGraphXExecutionProvider'])
2025-02-05 02:24:59.112588500 [W:onnxruntime:, graph.cc:4381 CleanUnusedInitializersAndNodeArgs] Removing initializer 'bert.pooler.dense.bias'. It is not used by any node and should be removed from the model.
2025-02-05 02:24:59.112627071 [W:onnxruntime:, graph.cc:4381 CleanUnusedInitializersAndNodeArgs] Removing initializer 'bert.pooler.dense.weight'. It is not used by any node and should be removed from the model.
*************** EP Error ***************
EP Error /onnxruntime/onnxruntime/core/session/provider_bridge_ort.cc:1499 void onnxruntime::ProviderSharedLibrary::Ensure() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_shared.so with error: libonnxruntime_providers_shared.so: cannot open shared object file: No such file or directory
when using ['MIGraphXExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
****************************************
Ironically, the paths for the wheel are correct and the files exist according to pip:
onnxruntime-rocm 1.21.0 /opt/conda/envs/py_3.12/lib/python3.12/site-packages pip
It appears that an RPATH derived from the pyenv environment is somehow baked into the wheel, and only after 1.21.1. I don't have this issue with earlier wheels (1.19).
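One way to check the RPATH suspicion is to compare the dynamic section of the shared objects in a working wheel against a failing one. The sketch below assumes `readelf` from binutils is on PATH; `parse_runpaths` and `read_runpaths` are hypothetical helper names used only for illustration.

```python
import re
import shutil
import subprocess

# Matches the RPATH / RUNPATH lines printed by `readelf -d`.
_RUNPATH_RE = re.compile(
    r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*?)\]"
)


def parse_runpaths(readelf_output):
    """Extract (tag, [paths]) pairs from the text output of `readelf -d`."""
    return [
        (tag, value.split(":"))
        for tag, value in _RUNPATH_RE.findall(readelf_output)
    ]


def read_runpaths(shared_object):
    """Run `readelf -d` on a shared object and return its RPATH/RUNPATH entries."""
    if shutil.which("readelf") is None:
        raise RuntimeError("readelf (binutils) was not found on PATH")
    result = subprocess.run(
        ["readelf", "-d", str(shared_object)],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_runpaths(result.stdout)
```

Running `read_runpaths` on the `.so` files inside the installed package for both a 1.19 wheel and a newer one should show whether the entries differ. An entry of `$ORIGIN` would let the loader find sibling libraries next to the module, while an absolute path pointing into the build environment would support the theory above.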
Visual Studio Version
No response
GCC / Compiler Version
No response