[Bug]: Stable Diffusion model fails to load when using `--precision half` on the command line with FP8 enabled #16122

### Checklist

- [x] The issue exists after disabling all extensions
- [x] The issue exists on a clean installation of webui
- [ ] The issue is caused by an extension, but I believe it is caused by a bug in the webui
- [ ] The issue exists in the current version of the webui
- [x] The issue has not been reported before recently
- [ ] The issue has been reported before but has not been fixed yet

### What happened?

I switched sd webui to the dev branch and ran `git pull` to make sure it was on the latest version. Then I started sd webui with the following command:
```
python launch.py --xformers --precision half
```
I had already enabled FP8 before this, so when sd webui started with the above command, the model failed to load with `RuntimeError: mat1 and mat2 must have the same dtype, but got Half and Float8_e4m3fn`.

### Steps to reproduce the problem

1. Switch sd webui to the dev branch.
2. Run `git pull` to update to the latest version. The current sd webui commit hash: `a30b19d`
3. Start sd webui with this command:
```
python launch.py --xformers --precision half
```
4. Load an SDXL model (FP8 was already enabled beforehand).
5. Loading fails.

### What should have happened?

The SDXL model should load properly.

### What browsers do you use to access the UI ?

_No response_

### Sysinfo

[sysinfo-2024-07-01-15-40.json](https://github.com/user-attachments/files/16056039/sysinfo-2024-07-01-15-40.json)


### Console logs

```Shell
Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
Version: v1.9.4-169-ga30b19dd
Commit hash: a30b19dd5536f463222e484aef2daf466b49ee85
Launching Web UI with arguments: --xformers --precision half
ldm/sgm GroupNorm32 replaced with normal torch.nn.GroupNorm due to `--precision half`.
Loading weights [e3c47aedb0] from E:\Softwares\stable-diffusion-webui\models\Stable-diffusion\animagine-xl-3.1.safetensors
Running on local URL:  http://127.0.0.1:7860
Creating model from config: E:\Softwares\stable-diffusion-webui\repositories\generative-models\configs\inference\sd_xl_base.yaml
E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(

To create a public link, set `share=True` in `launch()`.
Startup time: 13.3s (prepare environment: 2.3s, import torch: 3.9s, import gradio: 0.8s, setup paths: 1.4s, initialize shared: 0.8s, other imports: 0.6s, load scripts: 0.4s, create ui: 0.3s, gradio launch: 2.7s).
Loading VAE weights specified in settings: E:\Softwares\stable-diffusion-webui\models\VAE\sdxl_fp16_fix_vae.safetensors
Applying attention optimization: xformers... done.
loading stable diffusion model: RuntimeError
Traceback (most recent call last):
  File "D:\Softwares\Python310\lib\threading.py", line 973, in _bootstrap
    self._bootstrap_inner()
  File "D:\Softwares\Python310\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "D:\Softwares\Python310\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "E:\Softwares\stable-diffusion-webui\modules\initialize.py", line 149, in load_model
    shared.sd_model  # noqa: B018
  File "E:\Softwares\stable-diffusion-webui\modules\shared_items.py", line 175, in sd_model
    return modules.sd_models.model_data.get_sd_model()
  File "E:\Softwares\stable-diffusion-webui\modules\sd_models.py", line 648, in get_sd_model
    load_model()
  File "E:\Softwares\stable-diffusion-webui\modules\sd_models.py", line 800, in load_model
    sd_model.cond_stage_model_empty_prompt = get_empty_cond(sd_model)
  File "E:\Softwares\stable-diffusion-webui\modules\sd_models.py", line 683, in get_empty_cond
    d = sd_model.get_learned_conditioning([""])
  File "E:\Softwares\stable-diffusion-webui\modules\sd_models_xl.py", line 32, in get_learned_conditioning
    c = self.conditioner(sdxl_conds, force_zero_embeddings=['txt'] if force_zero_negative_prompt else [])
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\repositories\generative-models\sgm\modules\encoders\modules.py", line 141, in forward
    emb_out = embedder(batch[embedder.input_key])
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\modules\sd_hijack_clip.py", line 234, in forward
    z = self.process_tokens(tokens, multipliers)
  File "E:\Softwares\stable-diffusion-webui\modules\sd_hijack_clip.py", line 276, in process_tokens
    z = self.encode_with_transformers(tokens)
  File "E:\Softwares\stable-diffusion-webui\modules\sd_hijack_clip.py", line 354, in encode_with_transformers
    outputs = self.wrapped.transformer(input_ids=tokens, output_hidden_states=self.wrapped.layer == "hidden")
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 822, in forward
    return self.text_model(
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 740, in forward
    encoder_outputs = self.encoder(
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 654, in forward
    layer_outputs = encoder_layer(
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 383, in forward
    hidden_states, attn_weights = self.self_attn(
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 272, in forward
    query_states = self.q_proj(hidden_states) * self.scale
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\Softwares\stable-diffusion-webui\extensions-builtin\Lora\networks.py", line 527, in network_Linear_forward
    return originals.Linear_forward(self, input)
  File "E:\Softwares\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\linear.py", line 116, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 must have the same dtype, but got Half and Float8_e4m3fn


Stable diffusion model failed to load
```
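For triage, the last line of the traceback can be read mechanically. The following is a minimal sketch that assumes the message format shown in the log above; `parse_dtype_mismatch` is a hypothetical helper and not part of the webui or PyTorch. In `F.linear(input, weight, bias)`, `mat1` corresponds to the layer input and `mat2` to the layer weight, so the log suggests half-precision activations reaching a weight stored as FP8.

```python
import re

# Final line of the traceback in the console log above.
ERROR_LINE = (
    "RuntimeError: mat1 and mat2 must have the same dtype, "
    "but got Half and Float8_e4m3fn"
)

# Assumption: the message keeps this exact wording; other PyTorch
# versions or devices may phrase the same check differently.
_DTYPE_MISMATCH = re.compile(
    r"mat1 and mat2 must have the same dtype, "
    r"but got (?P<mat1>\w+) and (?P<mat2>\w+)"
)


def parse_dtype_mismatch(line):
    """Hypothetical triage helper.

    Returns the dtype names of mat1 (the layer input) and mat2 (the
    layer weight) found in a dtype-mismatch error line, or None if the
    line does not match the expected message format.
    """
    match = _DTYPE_MISMATCH.search(line)
    if match is None:
        return None
    return {
        "input_dtype": match.group("mat1"),
        "weight_dtype": match.group("mat2"),
    }


result = parse_dtype_mismatch(ERROR_LINE)
print(result)
```

For the log above, this reports the input as `Half` and the weight as `Float8_e4m3fn`, which matches the combination of `--precision half` and the FP8 setting described in this report.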


### Additional information

_No response_
