Foreword

This repository is a fork of the NVIDIA Stable-Diffusion-WebUI-TensorRT repository.
It adds some features and fixes some bugs. If you have any questions, feel free to open an issue, and please include detailed information about the problem you are facing.
If you find this repository helpful, please consider giving it a star :D

TensorRT Extension for Stable Diffusion

This extension enables the best performance on NVIDIA RTX GPUs for Stable Diffusion with TensorRT. You need to install the extension and generate optimized engines before using it. Please follow the instructions below to set everything up. It supports Stable Diffusion 1.5, 2.1, SDXL, SDXL Turbo, and LCM. For SDXL and SDXL Turbo, we recommend a GPU with 12 GB or more of VRAM for best performance, because these models are large and computationally intensive.

🚧 Forge webui can only run checkpoints and LoRAs that have already been exported! 🚧

The export feature only works on Automatic1111.
If you know how Forge (or an alternative) exposes an endpoint to get (hijack) the checkpoint and LoRA data modules, pull requests are welcome :D
See issue #1 for details and discussion.

DEMO

Fix: the TensorRT LoRA tab can refresh the list of LoRA checkpoints (hot reload).
Add: a LoRA that is not yet available is exported automatically.

Features

TensorRT LoRA tab: hot reload (refresh) of LoRA checkpoints.
When generating with a TensorRT LoRA, the console displays a progress bar for the initial loading of the TensorRT LoRA data.
Fixed TensorRT LoRA being unable to read subdirectories under the LoRA directory and files with parentheses or dots in their names.
Added automatic LoRA export when the selected LoRA is not available.
Supports the Automatic1111 webui and the Forge webui.
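The subdirectory and file-name fix listed above can be illustrated with a minimal sketch. This is not the extension's actual code: the function name `find_lora_files` and the set of extensions are assumptions made for the example. It only shows one way to walk a LoRA directory recursively while keeping dots and parentheses in the names.

```python
from pathlib import Path

# Assumed set of LoRA file extensions; adjust to whatever your setup uses.
LORA_SUFFIXES = {".safetensors", ".pt", ".ckpt"}


def find_lora_files(lora_dir):
    """Return {name: path} for every LoRA file under lora_dir, including subdirectories."""
    found = {}
    for path in sorted(Path(lora_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() in LORA_SUFFIXES:
            # Path.stem strips only the final extension, so a name such as
            # "style.v1.2 (final)" keeps its dots and parentheses.
            found[path.stem] = path
    return found
```

Splitting the name on the first dot, or matching it with an unescaped regular expression, is the kind of approach that tends to break on such file names, which is why the sketch relies on `Path.stem` and `Path.suffix` instead.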

🚧 Forge webui can only run checkpoints and LoRAs that have already been exported! 🚧

Support for non-webui front ends?

Right now only the A1111 webui is supported.

I would be very happy if extensions for other, non-webui front ends could be developed. In practice, though, there is currently little extension development documentation to motivate developers, so this will take time.

Installation

Example instructions for Automatic1111:

Start the webui.bat
Select the Extensions tab and click on Install from URL
Copy the link to this repository and paste it into URL for extension's git repository
Click Install

ISSUE FIXED

If an error message like this pops up:

> The procedure entry point ... in the dynamic link library ... your_sdw_path\venv\Lib\site-packages\nvidia\cudnn\bin\cudnn64_8.dll (or a similar message)

A: In your_sdw_path\venv\Lib\site-packages\nvidia\cudnn\bin\, create a backup folder and move all files into it. Then run webui.bat again.
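The manual fix above can also be scripted. The following is a minimal sketch under the assumption that you simply want every file in that folder moved into a `backup` subfolder; the function name `move_files_to_backup` is made up for this example, and `your_sdw_path` remains a placeholder for your own install location.

```python
import shutil
from pathlib import Path


def move_files_to_backup(bin_dir, backup_name="backup"):
    """Move every file in bin_dir into bin_dir/backup_name and return the moved names."""
    bin_dir = Path(bin_dir)
    backup_dir = bin_dir / backup_name
    backup_dir.mkdir(exist_ok=True)
    moved = []
    # sorted() materialises the listing first, so moving files does not
    # disturb the iteration; subfolders (including backup) are skipped.
    for item in sorted(bin_dir.iterdir()):
        if item.is_file():
            shutil.move(str(item), str(backup_dir / item.name))
            moved.append(item.name)
    return moved
```

For example, `move_files_to_backup(r"your_sdw_path\venv\Lib\site-packages\nvidia\cudnn\bin")` would perform the step described above. Close the webui before running it so that no DLL is in use.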

How to use

Go to Settings → User Interface → Quick Settings List, add sd_unet. Apply these settings, then reload the UI.
Go to Settings → Extra Networks → When adding to prompt, refer to Lora by, and select "Filename". Apply these settings, then reload the UI.
Click on the “Generate Default Engines” button. This step takes 2-10 minutes depending on your GPU. You can generate engines for other combinations.
Back in the main UI, select “Automatic” from the sd_unet dropdown menu at the top of the page if it is not already selected (for example, if it shows None).
You can now start generating images accelerated by TRT. If you need to create more engines, go to the TensorRT tab.

If no engine is available for the selected checkpoint, engines are not generated automatically; you need to generate them manually.

🚧 The first engine generation takes longer because the timing cache has to be built. 🚧

Happy prompting!

LoRA

To use LoRA / LyCORIS checkpoints they first need to be converted to a TensorRT format. This can be done in the TensorRT extension in the Export LoRA tab.

Select a LoRA checkpoint from the dropdown.
Export. (This will not generate an engine but only convert the weights in ~20s)
You can use the exported LoRAs as usual using the prompt embedding.

More Information

TensorRT uses optimized engines for specific resolutions and batch sizes. You can generate as many optimized engines as desired. Types:

The “Export Default Engines” selection adds support for resolutions between 512 x 512 and 768 x 768 for Stable Diffusion 1.5 and 2.1 with batch sizes 1 to 4. For SDXL, this selection generates an engine supporting a resolution of 1024 x 1024 with a batch size of 1.
Static engines support a single specific output resolution and batch size.
Dynamic engines support a range of resolutions and batch sizes, at a small cost in performance. Wider ranges will use more VRAM.
The first time generating an engine for a checkpoint will take longer. Additional engines generated for the same checkpoint will be much faster.
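To make the difference between static and dynamic engines concrete, here is a small sketch. The `EngineProfile` class is hypothetical and is not the extension's own data structure; it only models the ranges quoted above and checks whether a requested image size and batch size fall inside them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineProfile:
    """Hypothetical description of the shapes one engine accepts."""
    min_size: int   # smallest supported width/height in pixels
    max_size: int   # largest supported width/height in pixels
    min_batch: int
    max_batch: int

    @property
    def is_static(self):
        # A static engine accepts exactly one resolution and one batch size.
        return self.min_size == self.max_size and self.min_batch == self.max_batch

    def supports(self, width, height, batch):
        return (self.min_size <= width <= self.max_size
                and self.min_size <= height <= self.max_size
                and self.min_batch <= batch <= self.max_batch)


# Default ranges as stated in the text above.
DEFAULT_SD15 = EngineProfile(min_size=512, max_size=768, min_batch=1, max_batch=4)
DEFAULT_SDXL = EngineProfile(min_size=1024, max_size=1024, min_batch=1, max_batch=1)
```

With these definitions, `DEFAULT_SD15.supports(512, 768, 4)` holds while `DEFAULT_SD15.supports(1024, 1024, 1)` does not, which is the situation where an additional engine has to be generated.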

Each preset can be adjusted with the “Advanced Settings” option. More detailed instructions can be found here.

Common Issues/Limitations

HIRES FIX: If using the hires.fix option in Automatic1111 you must build engines that match both the starting and ending resolutions. For instance, if the initial size is 512 x 512 and hires.fix upscales to 1024 x 1024, you must generate a single dynamic engine that covers the whole range.
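As a rough illustration of the range a dynamic engine has to cover for hires.fix, the sketch below computes the smallest and largest side length from the starting size and the upscale factor. The function name `required_range` is an assumption for this example, and it treats width and height as sharing one range, which may not match how you configure your engine.

```python
def required_range(width, height, upscale):
    """Return (smallest, largest) side length a dynamic engine must cover for hires.fix."""
    final_width = round(width * upscale)
    final_height = round(height * upscale)
    sides = (width, height, final_width, final_height)
    return min(sides), max(sides)
```

For the case described above, `required_range(512, 512, 2)` gives `(512, 1024)`, so a single dynamic engine spanning 512 to 1024 would be needed.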

Resolution: When generating images, the resolution needs to be a multiple of 64. This applies to hires.fix as well, requiring the low and high-res to be divisible by 64.
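The multiple-of-64 rule is easy to check before generating. This is a small helper sketch; the function names are made up for illustration and are not part of the extension.

```python
def is_valid_resolution(width, height, step=64):
    """True when both sides are multiples of step."""
    return width % step == 0 and height % step == 0


def snap_to_step(value, step=64):
    """Round value to the nearest multiple of step (ties round up), never below one step."""
    return max(step, (value + step // 2) // step * step)
```

For example, `is_valid_resolution(512, 700)` is false, and `snap_to_step(700)` returns 704, the nearest usable size. With hires.fix, apply the same check to both the starting and the upscaled resolution.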

Failing CMD arguments:

medvram and lowvram have caused issues when compiling the engine.
api has caused model.json to not be updated, resulting in SD Unets not appearing after compilation.

Failing installation or TensorRT tab not appearing in UI: This is most likely due to a failed install. To resolve this manually use this guide.

Requirements

Driver:

Linux: >= 450.80.02

Windows: >= 452.39

We always recommend keeping the driver up-to-date for system wide performance improvements. (This repository is only tested on Windows 10)
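If you want to check a driver version against the minimums listed above, a simple comparison of the dotted version numbers is enough. The sketch below is an illustration only; the function names are assumptions, and the installed version string is something you supply yourself (for instance from the output of `nvidia-smi --query-gpu=driver_version --format=csv,noheader`).

```python
# Minimum driver versions as listed in the Requirements section.
MIN_DRIVER = {"Linux": "450.80.02", "Windows": "452.39"}


def parse_version(text):
    """Turn '450.80.02' into (450, 80, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_supported(installed, system):
    """Compare an installed driver version with the minimum for 'Linux' or 'Windows'."""
    return parse_version(installed) >= parse_version(MIN_DRIVER[system])
```

The `system` argument uses the same names that Python's `platform.system()` returns on those platforms, so `driver_is_supported(version, platform.system())` is one way to call it.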

Yomisana/Stable-Diffusion-WebUI-TensorRT-Enhanced
