Notice that we are using a particular CLIP checkpoint, i.e., `openai/clip-vit-large-patch14`. This is because the Stable Diffusion pre-training was performed with this CLIP variant. For more details, refer to the [documentation](https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/pix2pix#diffusers.StableDiffusionInstructPix2PixPipeline.text_encoder).

-Next, we prepare a PyTorch `nn.module` to compute directional similarity:
+Next, we prepare a PyTorch `nn.Module` to compute directional similarity:

```python
import torch.nn as nn
@@ -410,7 +410,7 @@ It should be noted that the `StableDiffusionInstructPix2PixPipeline` exposes t

We can extend the idea of this metric to measure how similar the original image and edited version are. To do that, we can just do `F.cosine_similarity(img_feat_two, img_feat_one)`. For these kinds of edits, we would still want the primary semantics of the images to be preserved as much as possible, i.e., a high similarity score.

-We can use these metrics for similar pipelines such as the[`StableDiffusionPix2PixZeroPipeline`](https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/pix2pix_zero#diffusers.StableDiffusionPix2PixZeroPipeline)`.
+We can use these metrics for similar pipelines such as the[`StableDiffusionPix2PixZeroPipeline`](https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/pix2pix_zero#diffusers.StableDiffusionPix2PixZeroPipeline).

<Tip>

@@ -550,7 +550,7 @@ FID results tend to be fragile as they depend on a lot of factors:
* The image format (not the same if we start from PNGs vs JPGs).

Keeping that in mind, FID is often most useful when comparing similar runs, but it is
-hard to to reproduce paper results unless the authors carefully disclose the FID
+hard to reproduce paper results unless the authors carefully disclose the FID
measurement code.

These points apply to other related metrics too, such as KID and IS.