diff --git a/notebooks/stable-diffusion-xl/README.md b/notebooks/stable-diffusion-xl/README.md index 9684f3bc02e..d116f4e1e27 100644 --- a/notebooks/stable-diffusion-xl/README.md +++ b/notebooks/stable-diffusion-xl/README.md @@ -32,8 +32,6 @@ The tutorial consists of the following steps: - Download the Stable Diffusion XL Base model from a public source using the [OpenVINO integration with Hugging Face Optimum](https://huggingface.co/blog/openvino). - Run Text2Image generation pipeline using Stable Diffusion XL base - Run Image2Image generation pipeline using Stable Diffusion XL base -- Download and convert the Stable Diffusion XL Refiner model from a public source using the [OpenVINO integration with Hugging Face Optimum](https://huggingface.co/blog/openvino). -- Run 2-stages Stable Diffusion XL pipeline ## Segmind-VegaRT diff --git a/notebooks/stable-diffusion-xl/stable-diffusion-xl.ipynb b/notebooks/stable-diffusion-xl/stable-diffusion-xl.ipynb index f9fb73ae9d6..bcd5d60ec5c 100644 --- a/notebooks/stable-diffusion-xl/stable-diffusion-xl.ipynb +++ b/notebooks/stable-diffusion-xl/stable-diffusion-xl.ipynb @@ -1,7 +1,6 @@ { "cells": [ { - "attachments": {}, "cell_type": "markdown", "id": "00af7d21-9b28-4cc4-8103-bb46ba1264f0", "metadata": {}, @@ -28,8 +27,6 @@ "- Download the Stable Diffusion XL Base model from a public source using the [OpenVINO integration with Hugging Face Optimum](https://huggingface.co/blog/openvino).\n", "- Run Text2Image generation pipeline using Stable Diffusion XL base\n", "- Run Image2Image generation pipeline using Stable Diffusion XL base\n", - "- Download and convert the Stable Diffusion XL Refiner model from a public source using the [OpenVINO integration with Hugging Face Optimum](https://huggingface.co/blog/openvino).\n", - "- Run 2-stages Stable Diffusion XL pipeline\n", "\n", ">**Note**: Some demonstrated models can require at least 64GB RAM for conversion and running.\n", "\n", @@ -37,7 +34,6 @@ ] }, { - "attachments": {}, "cell_type": "markdown", "id": "786314ec-65e4-4251-8c5a-c62efb2a5769", "metadata": {}, @@ -53,9 +49,6 @@ " - [Run Image2Image generation pipeline](#Run-Image2Image-generation-pipeline)\n", " - [Select inference device SDXL Refiner model](#Select-inference-device-SDXL-Refiner-model)\n", " - [Image2Image Generation Interactive Demo](#Image2Image-Generation-Interactive-Demo)\n", - "- [SDXL Refiner model](#SDXL-Refiner-model)\n", - " - [Select inference device](#Select-inference-device)\n", - " - [Run Text2Image generation with Refinement](#Run-Text2Image-generation-with-Refinement)\n", "\n", "\n", "### Installation Instructions\n", @@ -67,7 +60,6 @@ ] }, { - "attachments": {}, "cell_type": "markdown", "id": "ee62ee05-0388-4b6f-8565-5b8b57f72a09", "metadata": {}, @@ -78,20 +70,19 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 1, "id": "2ecf3e6d-cbc1-4b57-be08-2ded40f182ce", "metadata": { "tags": [] }, "outputs": [], "source": [ - "%pip install -q --extra-index-url https://download.pytorch.org/whl/cpu \"torch>=2.1\" \"torchvision\" \"diffusers>=0.24.0\" \"invisible-watermark>=0.2.0\" \"transformers>=4.33.0\" \"accelerate\" \"onnx!=1.16.2\" \"peft>=0.6.2\"\n", - "%pip install -q \"git+https://github.com/huggingface/optimum-intel.git\"\n", - "%pip install -q \"openvino>=2023.1.0\" \"gradio>=4.19\" \"nncf>=2.9.0\"" + "# %pip install -q --extra-index-url https://download.pytorch.org/whl/cpu \"torch>=2.1\" \"torchvision\" \"diffusers>=0.24.0\" \"invisible-watermark>=0.2.0\" \"transformers>=4.33.0\" \"accelerate\" 
\"onnx!=1.16.2\" \"peft>=0.6.2\"\n", + "# %pip install -q \"git+https://github.com/huggingface/optimum-intel.git\"\n", + "# %pip install -q \"openvino>=2023.1.0\" \"gradio>=4.19\" \"nncf>=2.9.0\"" ] }, { - "attachments": {}, "cell_type": "markdown", "id": "ed9dfe55-8ae7-4b31-a102-b53b1d2d4941", "metadata": {}, @@ -109,10 +100,22 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 2, "id": "e16d2760-85bd-4a5f-be1b-a7313d960c56", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2024-10-17 22:53:35.107765: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n", + "2024-10-17 22:53:35.109501: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.\n", + "2024-10-17 22:53:35.146015: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n", + "To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n", + "2024-10-17 22:53:35.889441: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT\n" + ] + } + ], "source": [ "from pathlib import Path\n", "from optimum.intel.openvino import OVStableDiffusionXLPipeline\n", @@ -123,7 +126,6 @@ ] }, { - "attachments": {}, "cell_type": "markdown", "id": "867f589e-919c-455a-8b60-6c7fc5565ebf", "metadata": {}, @@ -143,12 +145,12 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "078aaf84b3c34bae857c58a6aaea6244", + "model_id": "1a2bb853a5b8444cb05d627e6f789a13", "version_major": 2, "version_minor": 0 }, "text/plain": [ - "Dropdown(description='Device:', index=4, options=('CPU', 'GPU.0', 'GPU.1', 'GPU.2', 'AUTO'), value='AUTO')" + "Dropdown(description='Device:', index=1, options=('CPU', 'AUTO'), value='AUTO')" ] }, "execution_count": 3, @@ -172,7 +174,6 @@ ] }, { - "attachments": {}, "cell_type": "markdown", "id": "318de1b2", "metadata": {}, @@ -182,14 +183,14 @@ }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 4, "id": "6c6cbc44", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "ee1b1540dced43f583af17e2ec584a90", + "model_id": "29e19cfafe5643e484f5376ab941aa2c", "version_major": 2, "version_minor": 0 }, @@ -197,7 +198,7 @@ "Checkbox(value=True, description='Apply weight compression')" ] }, - "execution_count": 22, + "execution_count": 4, "metadata": {}, "output_type": "execute_result" } @@ -215,303 +216,18 @@ }, { "cell_type": "code", - "execution_count": 24, - "id": "534feee4", - "metadata": {}, - "outputs": [], - "source": [ - "def get_quantization_config(compress_weights):\n", - " quantization_config = None\n", - " if compress_weights.value:\n", - " from optimum.intel import OVWeightQuantizationConfig\n", - "\n", - " quantization_config = OVWeightQuantizationConfig(bits=8)\n", - " return quantization_config\n", - "\n", - "\n", - "quantization_config = get_quantization_config(compress_weights)" - ] - }, - { - "cell_type": "code", - "execution_count": 26, + "execution_count": 5, "id": "a4e9bd80-88e7-4f97-a5b3-6274f91a7165", "metadata": {}, - "outputs": [ - { - "name": "stdout", - 
"output_type": "stream", - "text": [ - "INFO:nncf:Statistics of the bitwidth distribution:\n", - "+--------------+---------------------------+-----------------------------------+\n", - "| Num bits (N) | % all parameters (layers) | % ratio-defining parameters |\n", - "| | | (layers) |\n", - "+==============+===========================+===================================+\n", - "| 8 | 100% (794 / 794) | 100% (794 / 794) |\n", - "+--------------+---------------------------+-----------------------------------+\n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "cc757b6789764ee3acf9e7596dc31acc", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "Output()" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "INFO:nncf:Statistics of the bitwidth distribution:\n", - "+--------------+---------------------------+-----------------------------------+\n", - "| Num bits (N) | % all parameters (layers) | % ratio-defining parameters |\n", - "| | | (layers) |\n", - "+==============+===========================+===================================+\n", - "| 8 | 100% (32 / 32) | 100% (32 / 32) |\n", - "+--------------+---------------------------+-----------------------------------+\n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "47050c8337c042ef88d9e699f83c038d", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "Output()" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "INFO:nncf:Statistics of the bitwidth distribution:\n", - "+--------------+---------------------------+-----------------------------------+\n", - "| Num bits (N) | % all parameters (layers) | % ratio-defining parameters |\n", - "| | | (layers) |\n", - "+==============+===========================+===================================+\n", - "| 8 | 100% (40 / 40) | 100% (40 / 40) |\n", - "+--------------+---------------------------+-----------------------------------+\n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "3b39f6a0573d48ef8ebef899dd6e176d", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "Output()" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "INFO:nncf:Statistics of the bitwidth distribution:\n", - "+--------------+---------------------------+-----------------------------------+\n", - "| Num bits (N) | % all parameters (layers) | % ratio-defining parameters |\n", - "| | | (layers) |\n", - "+==============+===========================+===================================+\n", - "| 8 | 100% (74 / 74) | 100% (74 / 74) |\n", - "+--------------+---------------------------+-----------------------------------+\n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "59e52e4c1d1849e8af882e53fc9a278b", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "Output()" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "INFO:nncf:Statistics of the bitwidth distribution:\n", - "+--------------+---------------------------+-----------------------------------+\n", - "| Num bits (N) | % all parameters (layers) | % ratio-defining parameters |\n", - "| | | (layers) |\n", - "+==============+===========================+===================================+\n", - "| 8 | 100% (195 / 195) | 100% (195 / 195) |\n", - "+--------------+---------------------------+-----------------------------------+\n" - ] - }, - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "ffb7e1e954494539af454dec1dd9a2cc", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "Output()" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "data": { - "text/html": [ - "
\n"
-      ],
-      "text/plain": []
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "text/html": [
-       "
\n",
-       "
\n" - ], - "text/plain": [ - "\n" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Compiling the vae_decoder to AUTO ...\n", - "Compiling the unet to AUTO ...\n", - "Compiling the vae_encoder to AUTO ...\n", - "Compiling the text_encoder to AUTO ...\n", - "Compiling the text_encoder_2 to AUTO ...\n" - ] - } - ], + "outputs": [], "source": [ "if not model_dir.exists():\n", - " text2image_pipe = OVStableDiffusionXLPipeline.from_pretrained(model_id, compile=False, device=device.value, quantization_config=quantization_config)\n", - " text2image_pipe.half()\n", - " text2image_pipe.save_pretrained(model_dir)\n", - " text2image_pipe.compile()\n", - "else:\n", - " text2image_pipe = OVStableDiffusionXLPipeline.from_pretrained(model_dir, device=device.value)" + " !optimum-cli export openvino -m stabilityai/stable-diffusion-xl-base-1.0 --weight-format int8 {model_dir}\n", + "\n", + "text2image_pipe = OVStableDiffusionXLPipeline.from_pretrained(model_dir, device=device.value)" ] }, { - "attachments": {}, "cell_type": "markdown", "id": "3417085c-e1da-40b7-bff9-acbfd17b3c02", "metadata": {}, @@ -524,7 +240,7 @@ }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 6, "id": "cf168ab0-8bba-4bb6-8da5-0937b5762ef8", "metadata": { "tags": [] @@ -533,12 +249,12 @@ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "0bd701da84414b7cb6ab8d529d12b293", + "model_id": "40eab39b84504df19070913855752f09", "version_major": 2, "version_minor": 0 }, "text/plain": [ - " 0%| | 0/15 [00:00" ] }, - "execution_count": 27, + "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "import numpy as np\n", + "import torch\n", "\n", - "prompt = \"cute cat 4k, high-res, masterpiece, best quality, soft lighting, dynamic angle\"\n", + "prompt = \"cute cat 4k, high-res, masterpiece, best quality, full hd, extremely detailed, soft lighting, dynamic angle, 35mm\"\n", "image = text2image_pipe(\n", " prompt,\n", - " num_inference_steps=15,\n", + " num_inference_steps=25,\n", " height=512,\n", " width=512,\n", - " generator=np.random.RandomState(314),\n", + " generator=torch.Generator(device=\"cpu\").manual_seed(903512),\n", ").images[0]\n", "image.save(\"cat.png\")\n", "image" ] }, { - "attachments": {}, "cell_type": "markdown", "id": "399ebaaa-74ad-4ef2-a197-bbedb143d1ec", "metadata": {}, @@ -607,9 +322,9 @@ "# Read more in the docs: https://gradio.app/docs/\n", "# if you want create public link for sharing demo, please add share=True\n", "try:\n", - " demo.launch(debug=True)\n", + " demo.launch()\n", "except Exception:\n", - " demo.launch(share=True, debug=True)" + " demo.launch(share=True)" ] }, { @@ -625,7 +340,6 @@ ] }, { - "attachments": {}, "cell_type": "markdown", "id": "0e9a929d-694e-44a9-9f35-e1beca449ad7", "metadata": {}, @@ -637,12 +351,11 @@ ] }, { - "attachments": {}, "cell_type": "markdown", "id": "3993c958-b7ea-47f1-ad10-9d883e9c1860", "metadata": {}, "source": [ - "#### Select inference device SDXL Refiner model\n", + "#### Select inference device SDXL image2image model\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "select device from dropdown list for running inference using OpenVINO" @@ -650,22 +363,22 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 9, "id": "27666906-1318-4e7a-afe5-85144a170c9b", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "078aaf84b3c34bae857c58a6aaea6244", + 
"model_id": "1a2bb853a5b8444cb05d627e6f789a13", "version_major": 2, "version_minor": 0 }, "text/plain": [ - "Dropdown(description='Device:', index=4, options=('CPU', 'GPU.0', 'GPU.1', 'GPU.2', 'AUTO'), value='AUTO')" + "Dropdown(description='Device:', index=1, options=('CPU', 'AUTO'), value='AUTO')" ] }, - "execution_count": 8, + "execution_count": 9, "metadata": {}, "output_type": "execute_result" } @@ -676,22 +389,10 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 10, "id": "35926f53-ffe8-4386-beac-f5ab4e78130a", "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Compiling the vae_decoder to AUTO ...\n", - "Compiling the unet to AUTO ...\n", - "Compiling the vae_encoder to AUTO ...\n", - "Compiling the text_encoder_2 to AUTO ...\n", - "Compiling the text_encoder to AUTO ...\n" - ] - } - ], + "outputs": [], "source": [ "from optimum.intel import OVStableDiffusionXLImg2ImgPipeline\n", "\n", @@ -700,19 +401,19 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 11, "id": "48892114-de29-4289-8c0c-1199f912ee01", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { - "model_id": "c0009ea460d04610aa75bbafafd07963", + "model_id": "7ae07bb1bf54428f8e423f9be4d6d94b", "version_major": 2, "version_minor": 0 }, "text/plain": [ - " 0%| | 0/7 [00:00" ] }, - "execution_count": 10, + "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ + "import torch\n", + "\n", "photo_prompt = \"professional photo of a cat, extremely detailed, hyper realistic, best quality, full hd\"\n", "photo_image = image2image_pipe(\n", " photo_prompt,\n", " image=image,\n", - " num_inference_steps=25,\n", - " generator=np.random.RandomState(356),\n", + " num_inference_steps=50,\n", + " strength=0.75,\n", + " generator=torch.Generator(device=\"cpu\").manual_seed(4891),\n", ").images[0]\n", "photo_image.save(\"photo_cat.png\")\n", "photo_image" ] }, { - "attachments": {}, "cell_type": "markdown", "id": "d163ee59-1228-4f2d-b78f-925a41fffcb8", "metadata": {}, @@ -776,265 +479,40 @@ "# Read more in the docs: https://gradio.app/docs/\n", "# if you want create public link for sharing demo, please add share=True\n", "try:\n", - " demo.launch(debug=True)\n", + " demo.launch()\n", "except Exception:\n", - " demo.launch(share=True, debug=True)" + " demo.launch(share=True)" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 13, "id": "3cc2a9d6-4a39-4690-8089-fd47aecffea0", "metadata": {}, - "outputs": [], - "source": [ - "demo.close()\n", - "del image2image_pipe\n", - "gc.collect()" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "1eca2245-c403-41cb-bf5b-0cc4acfe397e", - "metadata": {}, - "source": [ - "## SDXL Refiner model\n", - "[back to top ⬆️](#Table-of-contents:)\n", - "\n", - "As we discussed above, Stable Diffusion XL can be used in a 2-stages approach: first, the base model is used to generate latents of the desired output size. In the second step, we use a specialized high-resolution model for the refinement of latents generated in the first step, using the same prompt. \n", - "The Stable Diffusion XL Refiner model is designed to transform regular images into stunning masterpieces with the help of user-specified prompt text. It can be used to improve the quality of image generation after the Stable Diffusion XL Base. 
The refiner model accepts latents produced by the SDXL base model and text prompt for improving generated image." - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "dd1d6821", - "metadata": {}, - "source": [ - "select whether you would like to use weight compression to reduce memory footprint" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "aa09681b", - "metadata": {}, - "outputs": [], - "source": [ - "compress_weights" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "cbdf5c54", - "metadata": {}, - "outputs": [], - "source": [ - "quantization_config = get_quantization_config(compress_weights)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c8b95e61-d266-4491-8dfc-d2c8f56093a8", - "metadata": {}, - "outputs": [], - "source": [ - "from optimum.intel import (\n", - " OVStableDiffusionXLImg2ImgPipeline,\n", - " OVStableDiffusionXLPipeline,\n", - ")\n", - "from pathlib import Path\n", - "\n", - "refiner_model_id = \"stabilityai/stable-diffusion-xl-refiner-1.0\"\n", - "refiner_model_dir = Path(\"openvino-sd-xl-refiner-1.0\")\n", - "\n", - "\n", - "if not refiner_model_dir.exists():\n", - " refiner = OVStableDiffusionXLImg2ImgPipeline.from_pretrained(refiner_model_id, export=True, compile=False, quantization_config=quantization_config)\n", - " refiner.half()\n", - " refiner.save_pretrained(refiner_model_dir)\n", - " del refiner\n", - " gc.collect()" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "378664aa-b41c-4ecb-854a-9b2ebb0964e7", - "metadata": {}, - "source": [ - "### Select inference device\n", - "[back to top ⬆️](#Table-of-contents:)\n", - "\n", - "select device from dropdown list for running inference using OpenVINO" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "id": "7c672d74-b566-42dc-8508-df399d1e5a3a", - "metadata": {}, - "outputs": [ - { - "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "078aaf84b3c34bae857c58a6aaea6244", - "version_major": 2, - "version_minor": 0 - }, - "text/plain": [ - "Dropdown(description='Device:', index=4, options=('CPU', 'GPU.0', 'GPU.1', 'GPU.2', 'AUTO'), value='AUTO')" - ] - }, - "execution_count": 14, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "device" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "id": "0d347c7a-ac71-461b-a9ce-5f9471cb5c97", - "metadata": {}, - "source": [ - "### Run Text2Image generation with Refinement\n", - "[back to top ⬆️](#Table-of-contents:)\n" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "id": "0048e46b-201c-4f16-88b3-fa621d1b6e14", - "metadata": {}, "outputs": [ { - "name": "stderr", + "name": "stdout", "output_type": "stream", "text": [ - "Compiling the vae_decoder to AUTO ...\n", - "Compiling the unet to AUTO ...\n", - "Compiling the text_encoder to AUTO ...\n", - "Compiling the text_encoder_2 to AUTO ...\n", - "Compiling the vae_encoder to AUTO ...\n" + "Closing server running on port: 7860\n" ] }, { "data": { - "application/vnd.jupyter.widget-view+json": { - "model_id": "01302fc565244fa9968df4697cc3817a", - "version_major": 2, - "version_minor": 0 - }, "text/plain": [ - " 0%| | 0/15 [00:00" - ] - }, - "execution_count": 17, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "image = refiner(\n", - " prompt=prompt,\n", - " image=np.transpose(latents[None, :], (0, 2, 3, 1)),\n", - " num_inference_steps=15,\n", - " generator=np.random.RandomState(314),\n", - ").images[0]\n", - 
"image.save(\"cat_refined.png\")\n", - "\n", - "image" - ] } ], "metadata": { @@ -1073,7 +551,257 @@ }, "widgets": { "application/vnd.jupyter.widget-state+json": { - "state": {}, + "state": { + "0e23721ad30646b8a68d4736a45cdf0c": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": {} + }, + "1349a4a2f7734504a76922dade9c8e32": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "DescriptionStyleModel", + "state": { + "description_width": "" + } + }, + "1a2bb853a5b8444cb05d627e6f789a13": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "DropdownModel", + "state": { + "_options_labels": [ + "CPU", + "AUTO" + ], + "description": "Device:", + "index": 1, + "layout": "IPY_MODEL_0e23721ad30646b8a68d4736a45cdf0c", + "style": "IPY_MODEL_1349a4a2f7734504a76922dade9c8e32" + } + }, + "1ae83f4fff404ea3acd8818671161db8": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLModel", + "state": { + "layout": "IPY_MODEL_320b9c09680641d795467900f33cca5d", + "style": "IPY_MODEL_e02bd56e95094d0882b5241c5c6735a3", + "value": " 25/25 [01:05<00:00,  2.57s/it]" + } + }, + "29e19cfafe5643e484f5376ab941aa2c": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "CheckboxModel", + "state": { + "description": "Apply weight compression", + "disabled": false, + "layout": "IPY_MODEL_91020ad62a604da0967535b2b03885c5", + "style": "IPY_MODEL_2ee65a25acbe4633a0f936586016a124", + "value": true + } + }, + "2ee65a25acbe4633a0f936586016a124": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "CheckboxStyleModel", + "state": { + "description_width": "" + } + }, + "320b9c09680641d795467900f33cca5d": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": {} + }, + "393c1aaf96864b3d8df01f39fe68d3c7": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "FloatProgressModel", + "state": { + "bar_style": "success", + "layout": "IPY_MODEL_d58d0b5bdb1d4e8aa464dcdead449ddf", + "max": 37, + "style": "IPY_MODEL_a76bd6112d73434cbad7302e78855cae", + "value": 37 + } + }, + "3b99b40d685147a68dd5a65bf031a81c": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLModel", + "state": { + "layout": "IPY_MODEL_6f1726938dbb4cdaa2142d15d6942798", + "style": "IPY_MODEL_c76fcaeec75841e4863d86ac4fc51d6a", + "value": " 37/37 [01:37<00:00,  2.54s/it]" + } + }, + "40eab39b84504df19070913855752f09": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HBoxModel", + "state": { + "children": [ + "IPY_MODEL_afa827072e4b43bdb9d0e116e7f79fb1", + "IPY_MODEL_55658f2641cd42e798565a9d1abfbb1b", + "IPY_MODEL_1ae83f4fff404ea3acd8818671161db8" + ], + "layout": "IPY_MODEL_f6ab47c9ca2c467e833438c1467e1df3" + } + }, + "55658f2641cd42e798565a9d1abfbb1b": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "FloatProgressModel", + "state": { + "bar_style": "success", + "layout": "IPY_MODEL_87f054b2164843c08b412dbdf840f4b8", + "max": 25, + "style": "IPY_MODEL_91a21b8aeda4406f9ece5a3f69ffb97c", + "value": 25 + } + }, + "6f1726938dbb4cdaa2142d15d6942798": { + "model_module": "@jupyter-widgets/base", + 
"model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": {} + }, + "7910778a7d664f7ea5575c6a5bb3460c": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLStyleModel", + "state": { + "description_width": "", + "font_size": null, + "text_color": null + } + }, + "7ae07bb1bf54428f8e423f9be4d6d94b": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HBoxModel", + "state": { + "children": [ + "IPY_MODEL_c543933c42ba4c1faaf2d94fd8f3b457", + "IPY_MODEL_393c1aaf96864b3d8df01f39fe68d3c7", + "IPY_MODEL_3b99b40d685147a68dd5a65bf031a81c" + ], + "layout": "IPY_MODEL_d7b01238d9594534a1c60306c57e62c8" + } + }, + "87f054b2164843c08b412dbdf840f4b8": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": {} + }, + "8881850e9c29471393e07f0267e0012e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLStyleModel", + "state": { + "description_width": "", + "font_size": null, + "text_color": null + } + }, + "91020ad62a604da0967535b2b03885c5": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": {} + }, + "91a21b8aeda4406f9ece5a3f69ffb97c": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "ProgressStyleModel", + "state": { + "description_width": "" + } + }, + "a76bd6112d73434cbad7302e78855cae": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "ProgressStyleModel", + "state": { + "description_width": "" + } + }, + "a8eaf468268041e88241f293c6999584": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": {} + }, + "afa827072e4b43bdb9d0e116e7f79fb1": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLModel", + "state": { + "layout": "IPY_MODEL_a8eaf468268041e88241f293c6999584", + "style": "IPY_MODEL_8881850e9c29471393e07f0267e0012e", + "value": "100%" + } + }, + "c543933c42ba4c1faaf2d94fd8f3b457": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLModel", + "state": { + "layout": "IPY_MODEL_c83be23032084afaaa1581c8b70024d4", + "style": "IPY_MODEL_7910778a7d664f7ea5575c6a5bb3460c", + "value": "100%" + } + }, + "c76fcaeec75841e4863d86ac4fc51d6a": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLStyleModel", + "state": { + "description_width": "", + "font_size": null, + "text_color": null + } + }, + "c83be23032084afaaa1581c8b70024d4": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": {} + }, + "d58d0b5bdb1d4e8aa464dcdead449ddf": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": {} + }, + "d7b01238d9594534a1c60306c57e62c8": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": {} + }, + "e02bd56e95094d0882b5241c5c6735a3": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLStyleModel", + "state": { + "description_width": "", + "font_size": null, + "text_color": null + } + }, + "f6ab47c9ca2c467e833438c1467e1df3": { + 
"model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": {} + } + }, "version_major": 2, "version_minor": 0 }