tensorflow.python.framework.errors_impl.NotFoundError: libtensorflow_framework.so: cannot open shared object file: No such file or directory #41

Open
mandox-yh opened this issue Aug 15, 2019 · 4 comments

@mandox-yh

mandox-yh commented Aug 15, 2019

I ran the sparse attention code on a P100, but it fails at import time.

System information

  • OS Platform and Distribution: Ubuntu 18.04
  • TensorFlow installed from: pip install
  • TensorFlow version: 1.14.0 (GPU)
  • Python version: 3.6.5
  • GCC: 7.4.0
  • CUDA version: 10.0
  • GPU model and memory: P100

Here's the traceback

Traceback (most recent call last):
  File "/home/zyh/sparse_attention-master/attention.py", line 4, in <module>
    from blocksparse import BlocksparseTransformer
  File "/root/anaconda3/lib/python3.6/site-packages/blocksparse/__init__.py", line 3, in <module>
    from blocksparse.utils import (
  File "/root/anaconda3/lib/python3.6/site-packages/blocksparse/utils.py", line 16, in <module>
    _op_module = tf.load_op_library(os.path.join(data_files_path, 'blocksparse_ops.so'))
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: libtensorflow_framework.so: cannot open shared object file: No such file or directory

I searched for libtensorflow_framework.so with find . -name libtensorflow_framework.so, but it doesn't exist. I then found libtensorflow_framework.so.1 in /root/anaconda3/lib/python3.6/site-packages/tensorflow/, so I copied it to libtensorflow_framework.so and appended its directory to LD_LIBRARY_PATH with export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:"path/to/your/libtensorflow".
It still does not work.

Please help.
Thanks in advance
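
For anyone repeating the search described above, here is a minimal Python sketch that lists every file in a directory whose name starts with libtensorflow_framework.so, so that versioned variants show up as well. The helper name find_framework_libs is hypothetical and not part of blocksparse or TensorFlow. The directory to pass in is an assumption too: it would typically be the TensorFlow package directory, for example the value returned by tf.sysconfig.get_lib().

```python
import glob
import os


def find_framework_libs(lib_dir):
    """Return sorted base names of files matching libtensorflow_framework.so* in lib_dir.

    Hypothetical helper for illustration only. It uses a glob pattern so that
    versioned names (for example a ".so.1" suffix) are found in addition to
    the plain ".so" name.
    """
    pattern = os.path.join(lib_dir, "libtensorflow_framework.so*")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


# Example call (assumes TensorFlow is installed; not executed here):
#   import tensorflow as tf
#   print(find_framework_libs(tf.sysconfig.get_lib()))
```

An empty result means no framework library is present under that directory at all. A result that contains only a versioned name would be consistent with the file-name change discussed in this thread.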

@andrewjungdg

I also get the same problem with a very similar environment, except that my TensorFlow was installed from Conda.

@galv

galv commented Aug 21, 2019

I'm fairly certain that you are using a tensorflow version that is too new. See here: tensorflow/tensorflow#30175

Tensorflow changed the name of its shared object file in version 1.14. Maybe 1.13 will work for you, though. You can also install blocksparse from source to avoid this problem. Only Scott or someone from openai can comment on what version of tensorflow the wheel installed by pip was linked against.

My guess is that the SONAME of that file changed as well, which is why your copying may not be working. I am not sure, though, and won't debug this myself.
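
One way to test the SONAME guess is to look at the dynamic section of the libraries involved, for example with readelf -d blocksparse_ops.so and readelf -d on the TensorFlow framework library. The sketch below is an illustration only: parse_dynamic_section is a hypothetical helper, and it assumes the usual GNU readelf text layout, in which entries look like "(NEEDED) Shared library: [name]" and "(SONAME) Library soname: [name]".

```python
import re

# Matches the NEEDED and SONAME rows of `readelf -d` text output and
# captures the tag and the library name inside the square brackets.
_ENTRY = re.compile(r"\((NEEDED|SONAME)\)\s+[^\[]*\[([^\]]+)\]")


def parse_dynamic_section(readelf_output):
    """Extract NEEDED and SONAME entries from `readelf -d` text output.

    Hypothetical helper: it only parses text that the caller has already
    obtained, so it does not require readelf to be installed.
    """
    result = {"NEEDED": [], "SONAME": []}
    for line in readelf_output.splitlines():
        match = _ENTRY.search(line)
        if match:
            result[match.group(1)].append(match.group(2))
    return result


# Example call (assumes GNU binutils is available; not executed here):
#   import subprocess
#   text = subprocess.check_output(["readelf", "-d", "blocksparse_ops.so"]).decode()
#   print(parse_dynamic_section(text))
```

If the NEEDED list of blocksparse_ops.so names libtensorflow_framework.so while the installed TensorFlow only ships a differently named or differently versioned library, that would point toward the version mismatch described above rather than a search-path problem.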

@wasdfghjklr

I installed TensorFlow 1.13.1 and ran the code, but it still fails.

@serser

serser commented Sep 11, 2021

I ran into the same problem. It turned out to be caused by using horovod 0.16 with tensorflow 1.14. After force-reinstalling horovod 0.15, the problem disappeared.
