Hello! As the RedisAI team knows, CrayLabs uses RedisAI within SmartSim. We currently have a few issues that we figured would be best grouped together so they can be tracked as an "epic" of sorts.
## For RedisAI 1.2.3

### General
- RAI doesn't build with GCC 10: "Build error with gcc 10.3.0" #777
- A workaround from @DvirDukhan is in place here: "remove duplicate symbols for real gcc10 compatibility" #825
### TensorFlow CMake
- Currently, if TensorFlow is not downloaded by the `get_deps.sh` script, the `findTensorFlow.cmake` file is used to determine the location of TensorFlow. This file is out of date: for RAI 1.2.x, newer TensorFlow versions should be fine, but the find module throws errors (a sketch of what an updated find module could look like is below).
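
To make the request concrete, here is a minimal sketch of what an up-to-date find module could look like. It only uses standard CMake commands; the `TENSORFLOW_ROOT` hint variable and the result variables it sets are assumptions for illustration, not existing RedisAI options.

```cmake
# Sketch of a find module for a user-supplied libtensorflow.
# TENSORFLOW_ROOT is a hypothetical hint variable, not an existing RedisAI option.
find_path(TensorFlow_INCLUDE_DIR
  NAMES tensorflow/c/c_api.h
  HINTS ${TENSORFLOW_ROOT} ENV TENSORFLOW_ROOT
  PATH_SUFFIXES include)

find_library(TensorFlow_LIBRARY
  NAMES tensorflow
  HINTS ${TENSORFLOW_ROOT} ENV TENSORFLOW_ROOT
  PATH_SUFFIXES lib)

include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(TensorFlow
  REQUIRED_VARS TensorFlow_LIBRARY TensorFlow_INCLUDE_DIR)

if(TensorFlow_FOUND)
  set(TensorFlow_LIBRARIES ${TensorFlow_LIBRARY})
  set(TensorFlow_INCLUDE_DIRS ${TensorFlow_INCLUDE_DIR})
endif()
```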
### ONNX
- Because RedisAI vendors ONNX, it is impossible to build RedisAI with the standard ONNX libraries. This brings a few headaches:
  - Anyone looking to compile on an arch other than Ubuntu latest is forced to download the vendored version and its dependencies, build those, and then manually build RedisAI. There are also no instructions for this that we are aware of.
  - The vendored version, to our knowledge, doesn't compile on OSX (I see this is being worked on now: "onnxruntime 1.7.2 build and documentation" #785).
- At the current moment, the `get_deps` script for ONNX on OSX points to a dead link: "Unable to build RedisAI from Source - bash get_deps.sh" #743.
- The release notes state that ONNX 1.6 is supported, however the `get_deps` script for OSX seems to point to 1.7.1 (which doesn't exist). What version is actually supported?
- When are the standard ONNX shared libraries going to work with RedisAI again? Do y'all plan on maintaining a fork of ONNX forever?
  - Answer from RAI: not forever; in the coming weeks it will revert back to upstream ONNX, now that "Support plugging in custom user-defined allocators for sharing between sessions" (microsoft/onnxruntime#8059) has been merged.
- GLIBC issues when compiling for GPU on SUSE Linux 15.2: "lib64/libm.so GLIBC issue with ONNX GPU backend on Linux" #826
### Backends
- In general, it seems like RedisAI relies on Docker images to perform the builds for the backends. Because of the environments (OS, system libraries, etc.) these Docker containers have, the backends are built against newer versions of GLIBC and other dependencies, which makes them unusable on older OSes even though those backend libraries (libtorch, TensorFlow, etc.) are readily supported there. It seems as if others are running into these issues as well: "`GLIBC_2.27` not found on Xenial GPU Docker image" #724.
- For example, we currently do not use the shipped Torch backend for this reason: we `pip install` the torch library and use the shared libraries that come with it. This works fine by just setting `Torch_DIR`, bypassing torch in the `get_deps` script, and still invoking torch in the build (a sketch of this is below).
- I've documented this for ONNX in ticket "lib64/libm.so GLIBC issue with ONNX GPU backend on Linux" #826.
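
To illustrate the torch workaround above, here is a minimal sketch, assuming a Python environment where `pip install torch` has already been run. The target and source file names are made up for illustration; only `find_package(Torch)` and the variables exported by the wheel's bundled `TorchConfig.cmake` are real.

```cmake
# Minimal sketch (not RedisAI's actual build files): link against the libtorch
# that ships inside a pip-installed torch wheel instead of the one fetched by
# get_deps.sh.  Configure with something like:
#   cmake -DTorch_DIR="$(python -c 'import torch; print(torch.utils.cmake_prefix_path)')/Torch" ..
cmake_minimum_required(VERSION 3.13)
project(torch_backend_sketch LANGUAGES CXX)

find_package(Torch REQUIRED)   # resolved through Torch_DIR -> TorchConfig.cmake
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TORCH_CXX_FLAGS}")

add_library(torch_backend_sketch SHARED torch_backend.cpp)  # placeholder source file
target_link_libraries(torch_backend_sketch PRIVATE ${TORCH_LIBRARIES})
```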
### Key point

In general, I think it would be best if the CMake and build setup let the user pass environment variables that specify the locations of the backends they have already built (see the sketch after the list below).
Currently:

- For PyTorch this works (pass `Torch_DIR`).
- For TensorFlow this does not (see above).
- For ONNX this does not, because of the vendored ONNX version.
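
A rough sketch of what such an interface could look like on the CMake side is below. The `*_ROOT` cache variable names are hypothetical, chosen only to illustrate the idea; they are not existing RedisAI options.

```cmake
# Hypothetical sketch of the proposed interface: each backend gets a cache
# variable pointing at a pre-built tree, and the vendored download from
# get_deps.sh is only used when that variable is left empty.
foreach(backend TORCH TENSORFLOW ONNXRUNTIME)
  set(${backend}_ROOT "" CACHE PATH
      "Location of a pre-built ${backend} to use instead of the vendored copy")
  if(${backend}_ROOT)
    list(APPEND CMAKE_PREFIX_PATH ${${backend}_ROOT})
    message(STATUS "Using user-supplied ${backend} from ${${backend}_ROOT}")
  else()
    message(STATUS "Falling back to the ${backend} fetched by get_deps.sh")
  endif()
endforeach()
```

With something like this in place, a user could configure with e.g. `cmake -DTORCH_ROOT=/path/to/libtorch -DTENSORFLOW_ROOT=/path/to/libtensorflow ..` and never touch the vendored downloads.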
This would also help users who would like to try to build RedisAI for AMD GPUs.
More clarity about the roadmap, in terms of expected dates and versions, would also be much appreciated (e.g. #591).