
Stable Diffusion inference on TPUs #1227

Open
sayakpaul opened this issue Jan 3, 2023 · 1 comment
@sayakpaul
Contributor

Colab Notebook: https://colab.research.google.com/drive/1-z5bm5WqnPxYp6Hl4oAyC0ZwkTOvtRzd?usp=sharing

I tried running inference with Stable Diffusion (v1) on (Colab) TPUs. It worked; however, I have a couple of questions:

  • Instead of just three images, we get this:

[screenshot: a grid of generated images, more than the three requested]

This likely has to do with how the batch is distributed across the TPU cores. Has anyone looked into filtering the extra images?

  • The inference time also seems slower (~39 seconds to generate three images). Could it be because of the non-TF ops that run when we call model.text_to_image()?
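On the first point: a Colab TPU (v2-8/v3-8) has 8 cores, so a replicated batch of 3 prompts yields 8 × 3 outputs. A minimal NumPy sketch of one way to trim the gathered output back down to the requested batch (assuming the same prompts are replicated across cores; `num_cores`, `batch_size`, and the small stand-in resolution are illustrative assumptions, not values from the notebook):

```python
import numpy as np

num_cores = 8      # Colab TPU v2-8 / v3-8 (assumption)
batch_size = 3     # number of images actually requested
h = w = 64         # small stand-in resolution; SD v1 outputs 512x512

# Stand-in for the gathered per-replica output: each core contributes
# its own batch, so the leading axis is num_cores * batch_size.
gathered = np.random.rand(num_cores * batch_size, h, w, 3)

# If every replica ran the same prompts (and seed), the extra images are
# redundant, so keeping one replica's slice recovers the requested batch.
images = gathered[:batch_size]
print(images.shape)  # (3, 64, 64, 3)
```

If each replica instead sampled different latent noise, the extra images are distinct variations rather than duplicates, and one might want to keep them all.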

To mitigate the second issue, @deep-diver and I created SavedModels to eliminate most of the non-TF ops from the pipeline. Details can be found in this notebook. The tokenizer, though, is still not fully TF. I also couldn't load the SavedModels inside a TPU strategy scope; doing so leads to:

---------------------------------------------------------------------------
ResourceExhaustedError                    Traceback (most recent call last)
<ipython-input-16-b1735bf71429> in <module>
      3     for k in gcs_paths:
      4         predict_fns.append(
----> 5             load_saved_model(gcs_paths[k])
      6         )

11 frames
/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/ops.py in shape(self)
   1261         # `_tensor_shape` is declared and defined in the definition of
   1262         # `EagerTensor`, in C.
-> 1263         self._tensor_shape = tensor_shape.TensorShape(self._shape_tuple())
   1264       except core._NotOkStatusException as e:
   1265         raise core._status_to_exception(e) from None

ResourceExhaustedError: Failed to allocate request for 7.03MiB (7372800B) on device ordinal 0
	Encountered when executing an operation using EagerExecutor. This error cancels all future operations and poisons their output tensors.
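For context, a minimal sketch of the load-inside-a-strategy-scope pattern that fails above. On Colab the strategy would be `tf.distribute.TPUStrategy`; here the default (CPU) strategy stands in so the pattern itself runs anywhere, and a toy `tf.Module` stands in for the exported Stable Diffusion SavedModels (all names below are illustrative, not from the notebook):

```python
import tempfile
import tensorflow as tf

# Toy stand-in for one of the exported SavedModels.
class Toy(tf.Module):
    def __init__(self):
        self.w = tf.Variable(tf.ones((8, 4)))

    @tf.function(input_signature=[tf.TensorSpec([None, 8], tf.float32)])
    def __call__(self, x):
        return tf.matmul(x, self.w)

export_dir = tempfile.mkdtemp()
tf.saved_model.save(Toy(), export_dir)

strategy = tf.distribute.get_strategy()  # stand-in for TPUStrategy
with strategy.scope():
    # On TPU, this eager load is where ResourceExhaustedError was raised:
    # the restored variables are materialized directly on device ordinal 0.
    loaded = tf.saved_model.load(export_dir)

out = loaded(tf.zeros((1, 8)))
print(out.shape)  # (1, 4)
```

This works on CPU; the TPU failure suggests the restored weights (plus intermediate tensors) exhaust a single core's memory during the eager load.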

Ccing @LukeWood @ianstenbit.

@tirthasheshpatel
Contributor

I tried to run your notebook, @sayakpaul, and it just seems to get stuck on the generation step. It looks like Stable Diffusion might be broken on TPU. I'll try to take a look at what's causing this.
