
RuntimeError: parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device #794

@JusticeGH


I have been trying to get whisperX to work on my GTX 970, but have been running into a myriad of problems. Please bear with me as I’m a beginner in all things programming.

I followed all the installation instructions to the letter and then ran the following command:
whisperx.exe "C:\Users\Justin\Music\Kraft Punk Soundbites\WAV\Hey what's up_ I'm Kraft Punk.wav" --model large-v2 --device cuda --batch_size 1 --compute_type float32 --output_dir "C:\Users\Justin\Desktop\" --language en --diarize --min_speakers 1 --max_speakers 1 --hf_token XXXXXXXXXXXXXXXXX

I then ran into the following error:

torchvision is not available - cannot save figures
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.2.3. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint C:\Users\Justin\.cache\torch\whisperx-vad-segmentation.bin`
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.0.0. Bad things might happen unless you revert torch to 1.x.
>>Performing transcription...
Traceback (most recent call last):
  File "C:\Users\Justin\miniconda3\envs\whisperx\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Justin\miniconda3\envs\whisperx\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Justin\miniconda3\envs\whisperx\Scripts\whisperx.exe\__main__.py", line 7, in <module>
  File "C:\Users\Justin\miniconda3\envs\whisperx\lib\site-packages\whisperx\transcribe.py", line 176, in cli
    result = model.transcribe(audio, batch_size=batch_size, chunk_size=chunk_size, print_progress=print_progress)
  File "C:\Users\Justin\miniconda3\envs\whisperx\lib\site-packages\whisperx\asr.py", line 218, in transcribe
    for idx, out in enumerate(self.__call__(data(audio, vad_segments), batch_size=batch_size, num_workers=num_workers)):
  File "C:\Users\Justin\miniconda3\envs\whisperx\lib\site-packages\transformers\pipelines\pt_utils.py", line 124, in __next__
    item = next(self.iterator)
  File "C:\Users\Justin\miniconda3\envs\whisperx\lib\site-packages\transformers\pipelines\pt_utils.py", line 125, in __next__
    processed = self.infer(item, **self.params)
  File "C:\Users\Justin\miniconda3\envs\whisperx\lib\site-packages\transformers\pipelines\base.py", line 1112, in forward
    model_outputs = self._forward(model_inputs, **forward_params)
  File "C:\Users\Justin\miniconda3\envs\whisperx\lib\site-packages\whisperx\asr.py", line 152, in _forward
    outputs = self.model.generate_segment_batched(model_inputs['inputs'], self.tokenizer, self.options)
  File "C:\Users\Justin\miniconda3\envs\whisperx\lib\site-packages\whisperx\asr.py", line 47, in generate_segment_batched
    encoder_output = self.encode(features)
  File "C:\Users\Justin\miniconda3\envs\whisperx\lib\site-packages\whisperx\asr.py", line 86, in encode
    return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device

I did run some basic diagnostics to check if CUDA is available:

>>> import torch
>>> import sys
>>> print('A', sys.version)
A 3.10.14 | packaged by Anaconda, Inc. | (main, Mar 21 2024, 16:20:14) [MSC v.1916 64 bit (AMD64)]
>>> print('B', torch.__version__)
B 2.3.0
>>> print('C', torch.cuda.is_available())
C True
>>> print('D', torch.backends.cudnn.enabled)
D True
>>> device = torch.device('cuda')
>>> print('E', torch.cuda.get_device_properties(device))
E _CudaDeviceProperties(name='NVIDIA GeForce GTX 970', major=5, minor=2, total_memory=4095MB, multi_processor_count=13)
>>> print('F', torch.tensor([1.0, 2.0]).cuda())
F tensor([1., 2.], device='cuda:0')
>>> import torch
>>> x = torch.rand(5, 3)
>>> print(x)
tensor([[0.2926, 0.4866, 0.1281],
        [0.6154, 0.8456, 0.5436],
        [0.4880, 0.7883, 0.2404],
        [0.6841, 0.2353, 0.2622],
        [0.9875, 0.0566, 0.4680]])

Also ran nvidia-smi:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 551.23                 Driver Version: 551.23         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 970       WDDM  |   00000000:01:00.0  On |                  N/A |
| 47%   30C    P2             50W /  250W |     980MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

I asked on the NVIDIA Developer Forum, but they say it's not a CUDA setup issue:

One or more of the software stacks (perhaps the whisperx.exe executable) you are using have not been compiled to support a GTX 970. This isn’t a CUDA setup issue (which is what this forum is about) but rather a problem with the software stack.
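One way to test that explanation, at least for the PyTorch part of the stack, might be to compare the GPU's compute capability (5.2 for the GTX 970, per the device properties above) with the architectures the installed PyTorch build was compiled for. The following is only a sketch: `has_kernel_image` and `parse_arch` are made-up helper names, and it assumes `torch.cuda.get_arch_list()` returns strings such as `sm_50` or `compute_90`. It applies the usual CUDA compatibility rules (a binary for X.y runs on X.z with z >= y; PTX for X.y can be JIT-compiled for any equal or newer capability). It says nothing about other compiled libraries in the stack; the traceback ends in `self.model.encode`, which may belong to a separately built library, so a pass here would not rule that out.

```python
def parse_arch(arch):
    """Split a string like 'sm_52' or 'compute_90' into (kind, major, minor).

    Returns None for anything this sketch does not understand
    (for example architecture-specific targets such as 'sm_90a').
    """
    kind, _, num = arch.partition("_")
    if kind not in ("sm", "compute") or not num.isdigit() or len(num) < 2:
        return None
    return kind, int(num[:-1]), int(num[-1])


def has_kernel_image(capability, arch_list):
    """Hypothetical helper: can a build with `arch_list` run on `capability`?

    capability: (major, minor) tuple, e.g. (5, 2) for a GTX 970.
    arch_list:  list of strings as returned by torch.cuda.get_arch_list().
    """
    major, minor = capability
    for arch in arch_list:
        parsed = parse_arch(arch)
        if parsed is None:
            continue
        kind, a_major, a_minor = parsed
        # Binary code (sm_XY) runs on the same major version, equal or higher minor.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX (compute_XY) can be JIT-compiled for any equal or newer capability.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print("device capability:", cap)
        print("PyTorch built for:", archs)
        print("PyTorch has a usable kernel image:", has_kernel_image(cap, archs))
    else:
        print("PyTorch with CUDA is not available in this environment")
```

If this reports a usable kernel image for PyTorch, the missing kernels are more likely in another compiled dependency than in PyTorch itself.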

I'm at a loss and would really appreciate any help :)
