Adds the ability to convert the loaded model's Unet module into TensorRT. Requires a webui version at least as recent as commit 339b5315 (currently, that is the dev
branch after 2023-05-27). Only tested on Windows.
Loras are baked into the converted model. Hypernetwork support is not tested. Controlnet is not supported. Textual inversion works normally.
NVIDIA is also working on releasing their version of TensorRT for webui, which might be more performant, but they can't release it yet.
There seems to be support for quickly replacing the weights of a TensorRT engine without rebuilding it, but this extension does not offer this option yet.
Apart from installing the extension normally, you also need to download the TensorRT zip from NVIDIA.
You need to choose the same version of CUDA as python's torch library is using. For torch 2.0.1 it is CUDA 11.8.
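
If you are unsure which CUDA version your torch build uses, a minimal sketch like the following can report it. It assumes you run it with the same Python environment that webui uses; the helper name `torch_cuda_version` is only illustrative.

```python
def torch_cuda_version():
    """Return the CUDA version torch was built against, or None."""
    try:
        import torch
    except ImportError:
        return None  # torch is not installed in this environment
    # torch.version.cuda is a string such as "11.8", or None for CPU-only builds
    return torch.version.cuda


if __name__ == "__main__":
    print("torch CUDA version:", torch_cuda_version())
```

Pick the TensorRT download whose CUDA version matches the value printed here.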
Extract the zip into the extension directory, so that `TensorRT-8.6.1.6` (or a similarly named dir) exists in the same place as the `scripts` directory and the `trt_path.py` file. Restart webui afterwards.
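
To double-check the layout described above before restarting, something along these lines may help. It is a rough sketch, not part of the extension: the function name `check_trt_layout` is hypothetical, and the `TensorRT-*` glob pattern assumes the extracted folder keeps a name starting with `TensorRT-`.

```python
from pathlib import Path


def check_trt_layout(extension_dir):
    """Return a list of problems found in the extension directory layout."""
    root = Path(extension_dir)
    problems = []
    if not (root / "scripts").is_dir():
        problems.append("missing scripts directory")
    if not (root / "trt_path.py").is_file():
        problems.append("missing trt_path.py")
    # Assumption: the extracted folder name starts with "TensorRT-".
    if not any(p.is_dir() for p in root.glob("TensorRT-*")):
        problems.append("no TensorRT-* directory next to scripts")
    return problems
```

Call it with the path to this extension's directory inside your webui install; an empty list means all three items were found side by side.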
You don't need to install CUDA separately.
- Select the model you want to optimize and make a picture with it, including needed loras and hypernetworks.
- Go to the `TensorRT` tab that appears if the extension loads properly.
- In the `Convert to ONNX` tab, press `Convert Unet to ONNX`.
  - This takes a short while.
  - After the conversion has finished, you will find an `.onnx` file with the model in the `models/Unet-onnx` directory.
- In the `Convert ONNX to TensorRT` tab, configure the necessary parameters (including writing the full path to the onnx model) and press `Convert ONNX to TensorRT`.
  - This takes very long: from 15 minutes to an hour.
  - This takes up a lot of VRAM: you might want to press "Show command for conversion" and run the command yourself after shutting down webui.
  - After the conversion has finished, you will find a `.trt` file with the model in the `models/Unet-trt` directory.
- In settings, on the `Stable Diffusion` page, use the `SD Unet` option to select the newly generated TensorRT model.
- Generate pictures.
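
Since the TensorRT conversion tab asks for the full path to the onnx model, a small helper such as the one below can print it for you. This is only a sketch: `newest_model` is a hypothetical name, and it assumes you pass the path to webui's `models` directory and want the most recently written file.

```python
from pathlib import Path


def newest_model(models_dir, subdir, suffix):
    """Return the absolute path of the most recently modified file, or None."""
    folder = Path(models_dir) / subdir
    if not folder.is_dir():
        return None
    candidates = [p for p in folder.glob(f"*{suffix}") if p.is_file()]
    if not candidates:
        return None
    # Pick the file with the latest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime).resolve()


if __name__ == "__main__":
    print(newest_model("models", "Unet-onnx", ".onnx"))
    print(newest_model("models", "Unet-trt", ".trt"))
```

The first printed path is what you would paste into the conversion tab; the second confirms that a `.trt` file was produced afterwards.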
Stable Diffusion 2.0 conversion should fail for both ONNX and TensorRT because of incompatible shapes, but you may be able to remedy this by changing instances of 768 to 1024 in the code.