Compatible versions of Eigen and Dynet for Cuda 11.1 #1652
Getting the same error (in Google Colab). The lines below are working:
and here it fails:
I've managed to set up a working GPU build of DyNet on Google Colab. The Colab notebook below has all the installation steps and some test tutorial snippets from the DyNet docs. I've double-confirmed the GPU is used by opening a Colab terminal and running this command:
The main parts that helped in the source installation script are: using precisely this Eigen version:
and removing the compute_30 architecture with sed:
For having dynet available globally in Python, this is what worked for me:
Several resources helped:
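The compute_30 removal mentioned above can be sketched with a one-line `sed`. This is a self-contained demonstration on a stand-in file; the actual file holding the gencode flags in the DyNet source tree, and the exact flag format, are assumptions here, not taken from the post:

```shell
# CUDA 11.x no longer supports the compute_30 (Kepler) architecture,
# so any gencode entry naming it must be stripped before building.
# Stand-in file; in a real build you would point sed at DyNet's CMake config.
cat > demo_gencode.txt <<'EOF'
-gencode;arch=compute_30,code=sm_30
-gencode;arch=compute_50,code=sm_50
-gencode;arch=compute_70,code=sm_70
EOF

# Delete every line mentioning compute_30, editing the file in place.
sed -i '/compute_30/d' demo_gencode.txt
cat demo_gencode.txt
```

After this, only the compute_50 and compute_70 entries remain, so nvcc 11.x no longer aborts with `Unsupported GPU architecture 'compute_30'`.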
Thanks albertbn! The above versions and steps worked for me.
@albertbn @vahini01 I followed the updated steps you shared for the manual GPU installation, and that resolved the installation issue for me as well, thank you! However, I see the following error when trying to run a training job with GPU enabled (`--dynet-gpu` and `--dynet-mem` set to 3000):
It's not clear to me why I am facing this error. Do you have any ideas? Have you faced this after installation?
For these memory issues you should contact the DyNet team; I think there might be an open issue on this already. I also experienced a memory/segmentation fault with a somewhat larger example than the toy 101 example, and it fails on CPU as well. I had to switch to another framework to complete my project.
On Wed, 26 Oct 2022, Karthik Gopalakrishnan wrote (quoting the comment above; the training job was run with `--dynet-gpu` and `--dynet-mem` set to 3000):
[dynet] random seed: 2959804024
[dynet] allocating memory: 512MB
[dynet] memory allocation done.
[dynet] initializing CUDA
[dynet] CUDA driver/runtime versions are 11.1/11.0
Request for 1 GPU ...
[dynet] Device Number: 0
[dynet] Device name: Tesla V100-SXM2-16GB
[dynet] Memory Clock Rate (KHz): 877000
[dynet] Memory Bus Width (bits): 4096
[dynet] Peak Memory Bandwidth (GB/s): 898.048
CUDA failure in cudaMemGetInfo( &free_bytes, &total_bytes )
the provided PTX was compiled with an unsupported toolchain.
[dynet] FAILED to get free memory
[dynet] Device Number: 1
[dynet] Device name: Tesla V100-SXM2-16GB
[dynet] Memory Clock Rate (KHz): 877000
[dynet] Memory Bus Width (bits): 4096
[dynet] Peak Memory Bandwidth (GB/s): 898.048
CUDA failure in cudaMemGetInfo( &free_bytes, &total_bytes )
the provided PTX was compiled with an unsupported toolchain.
[dynet] FAILED to get free memory
[dynet] Device Number: 2
[dynet] Device name: Tesla V100-SXM2-16GB
[dynet] Memory Clock Rate (KHz): 877000
[dynet] Memory Bus Width (bits): 4096
[dynet] Peak Memory Bandwidth (GB/s): 898.048
CUDA failure in cudaMemGetInfo( &free_bytes, &total_bytes )
the provided PTX was compiled with an unsupported toolchain.
[dynet] FAILED to get free memory
[dynet] Device Number: 3
[dynet] Device name: Tesla V100-SXM2-16GB
[dynet] Memory Clock Rate (KHz): 877000
[dynet] Memory Bus Width (bits): 4096
[dynet] Peak Memory Bandwidth (GB/s): 898.048
CUDA failure in cudaMemGetInfo( &free_bytes, &total_bytes )
the provided PTX was compiled with an unsupported toolchain.
[dynet] FAILED to get free memory
[dynet] Device Number: 4
[dynet] Device name: Tesla V100-SXM2-16GB
[dynet] Memory Clock Rate (KHz): 877000
[dynet] Memory Bus Width (bits): 4096
[dynet] Peak Memory Bandwidth (GB/s): 898.048
CUDA failure in cudaMemGetInfo( &free_bytes, &total_bytes )
the provided PTX was compiled with an unsupported toolchain.
[dynet] FAILED to get free memory
[dynet] Device Number: 5
[dynet] Device name: Tesla V100-SXM2-16GB
[dynet] Memory Clock Rate (KHz): 877000
[dynet] Memory Bus Width (bits): 4096
[dynet] Peak Memory Bandwidth (GB/s): 898.048
CUDA failure in cudaMemGetInfo( &free_bytes, &total_bytes )
the provided PTX was compiled with an unsupported toolchain.
[dynet] FAILED to get free memory
[dynet] Device Number: 6
[dynet] Device name: Tesla V100-SXM2-16GB
[dynet] Memory Clock Rate (KHz): 877000
[dynet] Memory Bus Width (bits): 4096
[dynet] Peak Memory Bandwidth (GB/s): 898.048
CUDA failure in cudaMemGetInfo( &free_bytes, &total_bytes )
the provided PTX was compiled with an unsupported toolchain.
[dynet] FAILED to get free memory
[dynet] Device Number: 7
[dynet] Device name: Tesla V100-SXM2-16GB
[dynet] Memory Clock Rate (KHz): 877000
[dynet] Memory Bus Width (bits): 4096
[dynet] Peak Memory Bandwidth (GB/s): 898.048
CUDA failure in cudaMemGetInfo( &free_bytes, &total_bytes )
the provided PTX was compiled with an unsupported toolchain.
[dynet] FAILED to get free memory
[dynet] Device(s) selected: 0
CUDA failure in cudaMalloc(&ptr, n)
the provided PTX was compiled with an unsupported toolchain.
terminate called after throwing an instance of 'dynet::cuda_exception'
what(): cudaMalloc(&ptr, n)
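For what it's worth, the "provided PTX was compiled with an unsupported toolchain" failure is typically a toolkit-versus-driver mismatch: the binary was built with a CUDA toolkit newer than what the installed driver supports. A quick first check is to compare the two versions the log reports; the comparison below is a generic sketch, not part of DyNet:

```shell
# Values taken from the log above; on a live machine you would read them
# from `nvcc --version` (toolkit) and `nvidia-smi` (driver's CUDA version).
driver=11.1
runtime=11.0

# sort -V orders version strings numerically; the last line is the newer one.
newer=$(printf '%s\n%s\n' "$driver" "$runtime" | sort -V | tail -n1)

if [ "$newer" != "$driver" ]; then
  echo "toolkit/runtime $runtime is newer than driver $driver: PTX mismatch likely"
else
  echo "driver $driver is not older than runtime $runtime: check which toolkit actually compiled the PTX"
fi
```

Here the driver is not older than the reported runtime, which suggests the failing PTX came from some other, newer toolchain (for example a stale build directory or a pre-built wheel); rebuilding from scratch against the local CUDA toolkit is one thing worth trying.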
@neubig could you please help with this? Running on CPU is taking WAY too long and I would really like to use my GPUs :)
Hi,
I couldn't find any compatible versions.
System specifications:
CUDA version: 11.1
Tried:
Build command: to avoid `Unsupported GPU architecture 'compute_30'` at build time, the command below is used.
cmake .. -DEIGEN3_INCLUDE_DIR=../eigen -DPYTHON=$(which python) -DBACKEND=cuda -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-11.1
Does anyone know how to fix the above issue?
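Putting the thread's pieces together, a hedged sketch of the full out-of-source build might look like the following. The Eigen checkout location and the use of `$(which python)` (rather than the literal string `'which python'`, which CMake would treat as a path) are assumptions, not verified against the DyNet build:

```shell
# Hypothetical build sequence for DyNet against CUDA 11.1.
# Paths are assumptions drawn from this thread.
mkdir -p build && cd build
cmake .. \
  -DEIGEN3_INCLUDE_DIR=../eigen \
  -DPYTHON="$(which python)" \
  -DBACKEND=cuda \
  -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-11.1
make -j"$(nproc)"
```

The quoting matters: in POSIX shells, `'which python'` in single quotes is passed to CMake verbatim, while `$(which python)` expands to the interpreter's actual path before CMake sees it.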