CUTLASS submodule not found. #2047

Closed
sbstratos79 opened this issue Oct 9, 2022 · 22 comments
Labels
bug-report Report of a bug, yet to be confirmed

Comments

@sbstratos79

sbstratos79 commented Oct 9, 2022

Describe the bug
Running webui.sh with the --xformers flag gives the error:
RuntimeError: CUTLASS submodule not found. Did you forget to run `git submodule update --init --recursive` ?

To Reproduce
Steps to reproduce the behavior:

  1. Add --xformers flag to webui-user.sh
  2. Run ./webui.sh
  3. See the error during xformers installation.

Expected behavior
xformers should install.

Additional context
Before the --xformers flag was added, I was using the new cross-attention method in a conda env after manually installing xformers and editing attention.py in the ldm modules. It worked without any issues. I have since uninstalled conda and tried using this new flag with a fresh venv.
By the way, I have already tried git submodule update --init --recursive in the stable-diffusion folder. It didn't help.
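
For reference, a minimal sketch of where that submodule command needs to run: it only helps inside an xformers source checkout, not the stable-diffusion folder (the repositories/xformers path below is an assumption; adjust it to wherever the xformers source actually lives).

cd repositories/xformers                  # must be the xformers checkout itself
git submodule update --init --recursive   # fetches third_party/cutlass
ls third_party/cutlass                    # should be non-empty once the submodule is in place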

@sbstratos79 sbstratos79 added the bug-report Report of a bug, yet to be confirmed label Oct 9, 2022
@C43H66N12O12S2
Collaborator

Does entering pip install xformers outside venv give the same error?

@sbstratos79
Author

sbstratos79 commented Oct 9, 2022

Does entering pip install xformers outside venv give the same error?

Yes.

@C43H66N12O12S2
Collaborator

Try pip install -U -I xformers outside the venv. If that fails as well, open an issue upstream.
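
For clarity, the two commands suggested so far, with pip's flags spelled out (-U is --upgrade, -I is --ignore-installed):

pip install xformers         # plain install
pip install -U -I xformers   # upgrade and reinstall, ignoring any existing copy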

@r3nor

r3nor commented Oct 9, 2022

I am facing the same error. Have you found any solution? I tried all the solutions @C43H66N12O12S2 has commented, but none of them worked for me.

@mifortin

mifortin commented Oct 9, 2022

Try "pip install xformers==0.0.12"? seems 0.0.13 doesn't like being installed on my PC.

@r3nor

r3nor commented Oct 9, 2022

@mifortin Yes, using v0.0.12 did work. v0.0.13 is not working. Thank you.

@trufty
Contributor

trufty commented Oct 9, 2022

0.0.12 was the solution for me using docker as well.

Also, docker users: you have to use the devel NVIDIA CUDA container, or install nvcc manually, for xformers to install correctly.
FROM nvidia/cuda:11.7.x-devel-ubuntu22.04

I chose 22.04 because it comes with python 3.10 by default.
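
A minimal sketch of that base-image choice (the 11.7.x tag is left as written above; pin a concrete patch release):

FROM nvidia/cuda:11.7.x-devel-ubuntu22.04
# -devel images ship nvcc, which xformers needs at build time; -runtime images do not
RUN apt-get update && apt-get install -y python3 python3-pip git
RUN nvcc --version   # sanity check: this step fails on a -runtime base image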

@luckyycode

luckyycode commented Oct 9, 2022

0.0.12 was the solution for me using docker as well.

Also, docker users: you have to use the devel NVIDIA CUDA container, or install nvcc manually, for xformers to install correctly. FROM nvidia/cuda:11.7.x-devel-ubuntu22.04

I chose 22.04 because it comes with python 3.10 by default.

Which dockerized SD version are you using? I get a missing "triton" module if I use this image with xformers==0.0.12. I tried the same CUDA container and installed pip, and I get missing modules "triton", "boto", and "psycopg2".

I also tried to install it manually, but I get exactly the same error as #2073:

# xformers
export DEBIAN_FRONTEND=noninteractive
export NVCC_FLAGS="-allow-unsupported-compiler"
export TORCH_CUDA_ARCH_LIST=8.6
apt install -y g++
git clone https://github.com/facebookresearch/xformers.git repositories/xformers
cd repositories/xformers
git submodule update --init --recursive   # fetches third_party/cutlass
pip install -r requirements.txt
pip install -e .
# note: CUTLASS is vendored via the submodule above; the PyPI "cutlass" package is unrelated

@trufty
Contributor

trufty commented Oct 9, 2022

@luckyycode Build xformers in your Dockerfile like this: facebookresearch/xformers#473 (comment), and comment out the xformers install in launch.py.

That was a pain to work around...
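
The Dockerfile stage amounts to something like this (a sketch assuming the -devel base image discussed above, not the exact snippet from the linked comment; the /xformers path is arbitrary):

RUN git clone https://github.com/facebookresearch/xformers.git /xformers && \
    cd /xformers && \
    git submodule update --init --recursive && \
    pip install -r requirements.txt && \
    pip install -e .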

@slix

slix commented Oct 9, 2022

@luckyycode Build xformers in your Dockerfile like this: facebookresearch/xformers#473 (comment), and comment out the xformers install in launch.py.

That was a pain to work around...

A possible workaround that I'm trying: facebookresearch/xformers#473 (comment)

@BuffMcBigHuge

I was able to solve this by running the following script on Ubuntu:

cd stable-diffusion-webui
python -m venv venv
source venv/bin/activate
cd ..
git clone https://github.com/facebookresearch/xformers/
cd xformers
pip install --verbose --no-deps -e .
pip install -r requirements.txt
pip install functorch==0.2.1 ninja bitsandbytes
pip install -U --pre triton
deactivate
cd ../stable-diffusion-webui
bash webui.sh

@TwistedFromArma

I'm still having this error. It's a pain, as the --no-half option gives RAM problems when running SD 2.1.

@manyhats-mike

manyhats-mike commented Dec 13, 2022

Make sure you have 'setuptools' installed first:

pip install setuptools

Then you can follow the above steps, but you will also need to initialize the submodules to grab CUTLASS, or it will complain:

cd stable-diffusion-webui
python -m venv venv
source venv/bin/activate
cd ..
git clone https://github.com/facebookresearch/xformers/
cd xformers
git submodule update --init --recursive # THIS IS NEEDED TO GET PAST CUTLASS ERROR
pip install --verbose --no-deps -e .
pip install -r requirements.txt
pip install functorch==0.2.1 ninja bitsandbytes
pip install -U --pre triton
deactivate
cd ../stable-diffusion-webui
bash webui.sh
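
To confirm the CUTLASS checkout actually landed before building, it may help to check the submodule path (third_party/cutlass is where the xformers repo vendors it; run this from the directory the script cloned into):

ls xformers/third_party/cutlass   # an empty or missing directory means the submodule fetch failed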

@GregHilston

GregHilston commented Dec 21, 2022

@manyhats-mike I'm running into the following error after following your instructions on a Linux host OS and attempting to generate an image. Any ideas why?

I run the program with $ ./webui.sh --no-half --no-half-vae --medvram --opt-split-attention --xformers

  0%|                                                                                                 | 0/20 [00:00<?, ?it/s]
Error completing request
Arguments: ('An astronaut riding a horse', '', 'None', 'None', 20, 0, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 0, 0, 0, False, False, False, False, '', 1, '', 0, '', True, False, False) {}
Traceback (most recent call last):
  File "/home/ghilston/Git/stable-diffusion-webui/modules/call_queue.py", line 45, in f
    res = list(func(*args, **kwargs))
  File "/home/ghilston/Git/stable-diffusion-webui/modules/call_queue.py", line 28, in f
    res = func(*args, **kwargs)
  File "/home/ghilston/Git/stable-diffusion-webui/modules/txt2img.py", line 49, in txt2img
    processed = process_images(p)
  File "/home/ghilston/Git/stable-diffusion-webui/modules/processing.py", line 464, in process_images
    res = process_images_inner(p)
  File "/home/ghilston/Git/stable-diffusion-webui/modules/processing.py", line 567, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/home/ghilston/Git/stable-diffusion-webui/modules/processing.py", line 699, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/home/ghilston/Git/stable-diffusion-webui/modules/sd_samplers.py", line 507, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/home/ghilston/Git/stable-diffusion-webui/modules/sd_samplers.py", line 422, in launch_sampling
    return func()
  File "/home/ghilston/Git/stable-diffusion-webui/modules/sd_samplers.py", line 507, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/home/ghilston/Git/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/ghilston/Git/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 145, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/home/ghilston/Git/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ghilston/Git/stable-diffusion-webui/modules/sd_samplers.py", line 321, in forward
    x_out[a:b] = self.inner_model(x_in[a:b], sigma_in[a:b], cond={"c_crossattn": [cond_in[a:b]], "c_concat": [image_cond_in[a:b]]})
  File "/home/ghilston/Git/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ghilston/Git/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 167, in forward
    return self.get_v(input * c_in, self.sigma_to_t(sigma), **kwargs) * c_out + input * c_skip
  File "/home/ghilston/Git/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 177, in get_v
    return self.inner_model.apply_model(x, t, cond)
  File "/home/ghilston/Git/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/home/ghilston/Git/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1212, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/home/ghilston/Git/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 1329, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/home/ghilston/Git/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ghilston/Git/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 776, in forward
    h = module(h, emb, context)
  File "/home/ghilston/Git/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ghilston/Git/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
    x = layer(x, context)
  File "/home/ghilston/Git/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ghilston/Git/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 324, in forward
    x = block(x, context=context[i])
  File "/home/ghilston/Git/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ghilston/Git/stable-diffusion-webui/modules/sd_hijack_checkpoint.py", line 4, in BasicTransformerBlock_forward
    return checkpoint(self._forward, x, context)
  File "/home/ghilston/Git/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 249, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/home/ghilston/Git/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 107, in forward
    outputs = run_function(*args)
  File "/home/ghilston/Git/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 262, in _forward
    x = self.attn1(self.norm1(x), context=context if self.disable_self_attn else None) + x
  File "/home/ghilston/Git/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ghilston/Git/stable-diffusion-webui/modules/sd_hijack_optimizations.py", line 227, in xformers_attention_forward
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None)
  File "/home/ghilston/Git/xformers/xformers/ops/fmha/__init__.py", line 193, in memory_efficient_attention
    return _memory_efficient_attention(
  File "/home/ghilston/Git/xformers/xformers/ops/fmha/__init__.py", line 289, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "/home/ghilston/Git/xformers/xformers/ops/fmha/__init__.py", line 311, in _memory_efficient_attention_forward
    out, *_ = op.apply(inp, needs_gradient=False)
  File "/home/ghilston/Git/xformers/xformers/ops/fmha/cutlass.py", line 103, in apply
    out, lse = cls.OPERATOR(
  File "/home/ghilston/Git/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_ops.py", line 442, in __call__
    return self._op(*args, **kwargs or {})
NotImplementedError: Could not run 'xformers::efficient_attention_forward_cutlass' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'xformers::efficient_attention_forward_cutlass' is only available for these backends: [BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PythonDispatcher].

BackendSelect: fallthrough registered at ../aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at ../aten/src/ATen/core/PythonFallbackKernel.cpp:140 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at ../aten/src/ATen/functorch/DynamicLayer.cpp:488 [backend fallback]
Functionalize: registered at ../aten/src/ATen/FunctionalizeFallbackKernel.cpp:291 [backend fallback]
Named: registered at ../aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at ../aten/src/ATen/ConjugateFallback.cpp:18 [backend fallback]
Negative: registered at ../aten/src/ATen/native/NegateFallback.cpp:18 [backend fallback]
ZeroTensor: registered at ../aten/src/ATen/ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:64 [backend fallback]
AutogradOther: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:35 [backend fallback]
AutogradCPU: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:39 [backend fallback]
AutogradCUDA: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:47 [backend fallback]
AutogradXLA: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:51 [backend fallback]
AutogradMPS: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:59 [backend fallback]
AutogradXPU: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:43 [backend fallback]
AutogradHPU: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:68 [backend fallback]
AutogradLazy: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:55 [backend fallback]
Tracer: registered at ../torch/csrc/autograd/TraceTypeManual.cpp:296 [backend fallback]
AutocastCPU: fallthrough registered at ../aten/src/ATen/autocast_mode.cpp:482 [backend fallback]
AutocastCUDA: fallthrough registered at ../aten/src/ATen/autocast_mode.cpp:324 [backend fallback]
FuncTorchBatched: registered at ../aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:743 [backend fallback]
FuncTorchVmapMode: fallthrough registered at ../aten/src/ATen/functorch/VmapModeRegistrations.cpp:28 [backend fallback]
Batched: registered at ../aten/src/ATen/BatchingRegistrations.cpp:1064 [backend fallback]
VmapMode: fallthrough registered at ../aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at ../aten/src/ATen/functorch/TensorWrapper.cpp:189 [backend fallback]
PythonTLSSnapshot: registered at ../aten/src/ATen/core/PythonFallbackKernel.cpp:148 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at ../aten/src/ATen/functorch/DynamicLayer.cpp:484 [backend fallback]
PythonDispatcher: registered at ../aten/src/ATen/core/PythonFallbackKernel.cpp:144 [backend fallback]

If I remove the --xformers argument and uninstall the xformers package, I can successfully generate images.
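
That NotImplementedError usually means the xformers build skipped its CUDA kernels (for example, because nvcc was missing at build time), so the package imports fine but the CUDA op never gets registered. A quick check outside the webui (a minimal sketch; tensor shapes are arbitrary and assume a CUDA-capable torch install):

python -c "import torch, xformers.ops as xops; q = torch.rand(1, 128, 40, device='cuda', dtype=torch.float16); print(xops.memory_efficient_attention(q, q, q).shape)"

If this prints a shape, the CUDA kernels load; if it raises the same NotImplementedError, rebuild xformers with nvcc available.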

@AlexNolasco

Problem still persists on Ubuntu 22.04.1 LTS.

@Lesani

Lesani commented Dec 28, 2022

The problem persists in a fresh Ubuntu 22.10 install as well. Manually installing the 0.0.12 version worked; 0.0.13 does not want to install, complaining about "CUTLASS submodule not found".

I stand corrected: only the installation worked. xformers itself did not, even after installing triton as it requests on startup...

I don't know what to do. xformers worked great on Windows. I switched to Ubuntu because even on my 24 GB VRAM GPU I regularly ran out when Dreambooth requested something like 16 GB, because other stuff was reserving so much, I guess... but installing A1111 on Ubuntu is so much more non-automatic.

@AlexNolasco

Docker may be an alternative
https://github.com/AbdBarho/stable-diffusion-webui-docker

But other things may fail, e.g. CLIP interrogator.

The problem persists in a fresh Ubuntu 22.10 install as well. Manually installing the 0.0.12 version worked; 0.0.13 does not want to install, complaining about "CUTLASS submodule not found".

I stand corrected: only the installation worked. xformers itself did not, even after installing triton as it requests on startup...

I don't know what to do. xformers worked great on Windows. I switched to Ubuntu because even on my 24 GB VRAM GPU I regularly ran out when Dreambooth requested something like 16 GB, because other stuff was reserving so much, I guess... but installing A1111 on Ubuntu is so much more non-automatic.

@GregHilston

I'm still unable to resolve this.

Unrelated to this thread, but @AlexNolasco , which repo have you been using to run Dreambooth?

@elephantpanda

I'm running:
pip install xformers==0.0.12
and I'm getting the error:
running build_ext error: [WinError 2] The system cannot find the file specified
I am running Python 3.8.

@nancy6o6

I have the same error when running "pip install xformers" on Python 3.9.12.
I fixed it by downgrading Python to 3.8.5 and running "python -m pip install --upgrade pip" and "python -m pip install --upgrade setuptools" to match Python 3.8.5. Then I could install xformers 0.0.13 using pip.
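
Collected in one place, the steps above are (the Python downgrade itself depends on how your system manages Python versions):

python -m pip install --upgrade pip
python -m pip install --upgrade setuptools
pip install xformers==0.0.13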

@AlexNolasco

I'm still unable to resolve this.

I simply reinstalled and it has been working since.

Unrelated to this thread, but @AlexNolasco, which repo have you been using to run Dreambooth?

Not running Dreambooth.

Symbiomatrix pushed a commit to Symbiomatrix/stable-diffusion-webui that referenced this issue Sep 17, 2023
@haha889

haha889 commented May 14, 2024

Can xformers only be installed on Linux systems?
