could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR #70

@devindrown

I set up Medaka v0.8.1 to run on a GPU, but it consistently crashes at runtime with this error: Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR.

I'm seeing references to gpu_options.allow_growth = True online, but I'm not sure how that would be implemented with this code.
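For reference, here is a rough sketch of how allow_growth is usually switched on from Python. It is not taken from Medaka, and it is not confirmed to resolve this error. The function name enable_gpu_memory_growth and the labels it returns are hypothetical, made up for illustration. It also sets the TF_FORCE_GPU_ALLOW_GROWTH environment variable, on the assumption that the installed TensorFlow honours it (older releases may ignore it). The TensorFlow 1.x branch uses tf.ConfigProto and tf.keras.backend.set_session. The 2.x branch uses tf.config.experimental.set_memory_growth.

```python
import os


def enable_gpu_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand instead of reserving it all.

    Returns a short label (hypothetical, for illustration only) naming the
    code path that was taken.
    """
    # Environment-variable route; it has to be set before TensorFlow touches
    # the GPU. Assumption: the installed TensorFlow honours this variable.
    os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

    try:
        import tensorflow as tf
    except ImportError:
        return "tensorflow-not-installed"

    if hasattr(tf, "ConfigProto"):
        # TensorFlow 1.x: hand Keras a session created with allow_growth.
        config = tf.ConfigProto()
        config.gpu_options.allow_growth = True
        tf.keras.backend.set_session(tf.Session(config=config))
        return "tf1-session"

    # TensorFlow 2.x: enable growth per physical GPU before any is initialised.
    for gpu in tf.config.experimental.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)
    return "tf2-config"
```

This would presumably have to run before the model is built and before the first predict_on_batch call. Where exactly that belongs inside Medaka is an open question here.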

System:
Ubuntu 18.04
CUDA 10.1
tensorflow-gpu 1.12 (also tried 1.14 and 2.0.0-beta1)

2019-08-07 22:35:13.583084: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-08-07 22:35:13.584900: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Traceback (most recent call last):
  File "/home/dmdrown/medaka/venv/bin/medaka", line 11, in <module>
    load_entry_point('medaka==0.8.1', 'console_scripts', 'medaka')()
  File "/home/dmdrown/medaka/venv/lib/python3.6/site-packages/medaka-0.8.1-py3.6-linux-x86_64.egg/medaka/medaka.py", line 363, in main
    args.func(args)
  File "/home/dmdrown/medaka/venv/lib/python3.6/site-packages/medaka-0.8.1-py3.6-linux-x86_64.egg/medaka/inference.py", line 462, in predict
    tag_name=args.tag_name, tag_value=args.tag_value, tag_keep_missing=args.tag_keep_missing
  File "/home/dmdrown/medaka/venv/lib/python3.6/site-packages/medaka-0.8.1-py3.6-linux-x86_64.egg/medaka/inference.py", line 388, in run_prediction
    class_probs = model.predict_on_batch(x_data)
  File "/home/dmdrown/medaka/venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1294, in predict_on_batch
    outputs = self.predict_function(inputs)
  File "/home/dmdrown/medaka/venv/lib/python3.6/site-packages/tensorflow/python/keras/backend.py", line 3292, in __call__
    run_metadata=self.run_metadata)
  File "/home/dmdrown/medaka/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1458, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: Fail to find the dnn implementation.
     [[{{node bidirectional/CudnnRNN_1}}]]
     [[classify/truediv/_123]]
  (1) Unknown: Fail to find the dnn implementation.
     [[{{node bidirectional/CudnnRNN_1}}]]
