Commit f99a1f8

committed
fixed some typos
1 parent 6480349 commit f99a1f8

2 files changed: +23 -20 lines changed

Diffusers_library.ipynb

Lines changed: 12 additions & 10 deletions
Original file line number | Diff line number | Diff line change
@@ -293,12 +293,13 @@
293293
"metadata": {},
294294
"outputs": [],
295295
"source": [
296-
"# model"
296+
"# model \n",
297+
"model"
297298
]
298299
},
299300
{
300301
"cell_type": "markdown",
301-
"id": "17d9afc8",
302+
"id": "e34fe6b5",
302303
"metadata": {
303304
"slideshow": {
304305
"slide_type": "subslide"
@@ -499,8 +500,8 @@
499500
}
500501
},
501502
"source": [
502-
"- Run the ``from_config()`` method to load a configuration and instantiate a scheduler \n",
503-
"- Note that for pipes and models we did something similar using ``from_pretrained()`` instead"
503+
"- Run the ``from_pretrained()`` method to load a configuration and instantiate a scheduler \n",
504+
"- Note that we did something similar to load pipes and models"
504505
]
505506
},
506507
{
@@ -624,7 +625,7 @@
624625
],
625626
"source": [
626627
"less_noisy_sample = scheduler.step(\n",
627-
" model_output=noisy_residual, timestep=13, sample=noisy_sample\n",
628+
" model_output=noisy_residual, timestep=12, sample=noisy_sample\n",
628629
")[\"prev_sample\"]\n",
629630
"less_noisy_sample.shape"
630631
]
@@ -1377,7 +1378,7 @@
13771378
}
13781379
},
13791380
"source": [
1380-
"Tokenizer + test-encoder:\n",
1381+
"Tokenizer + text-encoder:\n",
13811382
"\n",
13821383
"- The **text-encoder** is responsible for transforming a text prompt into an embedding space that can be understood by the U-Net \n",
13831384
"- It is usually a transformer-based encoder that maps a sequence of tokens (generated with a **tokenizer**) into a (large fixed size) text-embedding\n",
@@ -1585,7 +1586,7 @@
15851586
}
15861587
},
15871588
"source": [
1588-
"Next, we load the *K-LMS* scheduler instead of the *PNDMScheduler* from the default pipeline"
1589+
"Next, we load an *LMS* scheduler instead of the *PNDMScheduler* from the default pipeline"
15891590
]
15901591
},
15911592
{
@@ -1742,6 +1743,7 @@
17421743
"**Guidance**\n",
17431744
"\n",
17441745
"For classifier-free guidance, we need $\\tilde z = \\tilde z_x + \\gamma \\big( \\tilde z_{x|y} - \\tilde z_x \\big)$.\n",
1746+
"\n",
17451747
"We need two forward passes: \n",
17461748
"- one with the conditioned input (``text_embeddings``) to get $\\tilde z_{x|y}$ (i.e., the score function $\\nabla_x \\log p(x|y)$)\n",
17471749
"- one with the unconditional embeddings (``uncond_embeddings``) to get $\\tilde z_x$ (i.e., the score function $\\nabla_x \\log p(x)$)\n",
@@ -1818,9 +1820,9 @@
18181820
"source": [
18191821
"**Scheduler**\n",
18201822
"\n",
1821-
"- We initialize the *K-LMS* with the ``num_inference_steps`` hyperparameter \n",
1823+
"- We initialize the *LMS* scheduler with the ``num_inference_steps`` hyperparameter \n",
18221824
"- The scheduler will compute the sigmas $\\sigma_t$ to be used during the denoising process\n",
1823-
"- *K-LMS* computes the next latent to be fed in the U-net as $\\tilde x_t = \\frac{\\tilde x_t}{\\sqrt{\\sigma_t^2 +1}}$"
1825+
"- *LMS* computes the next latent to be fed in the U-net as $\\tilde x_t = \\frac{\\tilde x_t}{\\sqrt{\\sigma_t^2 +1}}$"
18241826
]
18251827
},
18261828
{
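The guidance rule and the input scaling quoted in the two hunks above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the notebook's code: `guided_prediction` and `scale_model_input` are hypothetical helper names, the latents are flat lists of floats rather than tensors, and the guidance weight of 7.5 is only an example value.

```python
import math


def guided_prediction(z_uncond, z_cond, gamma):
    """Classifier-free guidance: z = z_x + gamma * (z_{x|y} - z_x)."""
    return [zu + gamma * (zc - zu) for zu, zc in zip(z_uncond, z_cond)]


def scale_model_input(latents, sigma):
    """Scale the latents by 1 / sqrt(sigma^2 + 1) before the U-Net forward pass."""
    factor = 1.0 / math.sqrt(sigma ** 2 + 1.0)
    return [x * factor for x in latents]


# Toy values standing in for the two U-Net outputs (hypothetical numbers).
z_uncond = [0.0, 1.0, -2.0]   # prediction with the unconditional embeddings
z_cond = [1.0, 1.0, 0.0]      # prediction with the text embeddings

guided = guided_prediction(z_uncond, z_cond, gamma=7.5)
scaled = scale_model_input([3.0, -6.0], sigma=math.sqrt(8.0))
```

With `gamma=1.0` the guided prediction reduces to the conditional one, and with `gamma=0.0` to the unconditional one; larger values push the result further along the direction from unconditional to conditional.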
@@ -2045,7 +2047,7 @@
20452047
"name": "python",
20462048
"nbconvert_exporter": "python",
20472049
"pygments_lexer": "ipython3",
2048-
"version": "3.11.8"
2050+
"version": "3.12.7"
20492051
},
20502052
"toc": {
20512053
"base_numbering": "1",

diffusion_from_scratch.ipynb

Lines changed: 11 additions & 10 deletions
Original file line number | Diff line number | Diff line change
@@ -79,7 +79,7 @@
7979
},
8080
{
8181
"cell_type": "code",
82-
"execution_count": 2,
82+
"execution_count": 1,
8383
"id": "91673856",
8484
"metadata": {
8585
"slideshow": {
@@ -111,7 +111,7 @@
111111
},
112112
{
113113
"cell_type": "code",
114-
"execution_count": 3,
114+
"execution_count": 5,
115115
"id": "fa4ddb28",
116116
"metadata": {
117117
"slideshow": {
@@ -628,7 +628,7 @@
628628
"ax.plot(t1.numpy()[0], label='t=10')\n",
629629
"ax.plot(t2.numpy()[0], label='t=12')\n",
630630
"ax.plot(t3.numpy()[0], label='t=30')\n",
631-
"plt.legend()"
631+
"plt.legend();"
632632
]
633633
},
634634
{
@@ -828,7 +828,7 @@
828828
"sqrt_recip_alphas = torch.sqrt(1.0 / alphas)\n",
829829
"alphas_cumprod_prev = F.pad(alphas_cumprod[:-1], (1, 0), value=1.0)\n",
830830
"posterior_variance = betas * (1. - alphas_cumprod_prev) / (1. - alphas_cumprod) # β_t\n",
831-
"\n",
831+
" \n",
832832
"@torch.no_grad()\n",
833833
"def p_sample(model, x, t, t_index):\n",
834834
" \n",
@@ -841,7 +841,7 @@
841841
" \n",
842842
" # Use the NN to predict the mean\n",
843843
" model_mean = sqrt_recip_alphas_t * (\n",
844-
" x - betas_t * model(x, t) / sqrt_one_minus_alphas_cumprod_t)\n",
844+
"        x - betas_t * model(x, t) / sqrt_one_minus_alphas_cumprod_t)\n",
845845
"\n",
846846
" # Draw the next sample\n",
847847
" if t_index == 0:\n",
@@ -924,9 +924,9 @@
924924
}
925925
},
926926
"source": [
927-
"Next, we define a function that applies some basic image preprocessing on-the-fly: random horizontal flips and rescaling in the $[-1,1]$ range.\n",
927+
"Next, we define some basic image preprocessing on-the-fly: random horizontal flips, conversion to tensor, and rescaling to the $[-1,1]$ range.\n",
928928
"\n",
929-
"We use the ``with_transform`` functionality for that. "
929+
"We use ``with_transform`` to apply the transformations to the elements in the dataset."
930930
]
931931
},
932932
{
@@ -1064,12 +1064,13 @@
10641064
" for step, batch in enumerate(dataloader):\n",
10651065
" optimizer.zero_grad()\n",
10661066
"\n",
1067+
" # x0\n",
10671068
" batch_size = batch[\"pixel_values\"].shape[0]\n",
10681069
" batch = batch[\"pixel_values\"].to(device)\n",
10691070
"\n",
10701071
" # sample t from U(0,T)\n",
10711072
" t = torch.randint(0, timesteps, (batch_size,), device=device).long()\n",
1072-
"\n",
1073+
" \n",
10731074
" loss = p_losses(model, batch, t)\n",
10741075
"\n",
10751076
" if step % 100 == 0:\n",
@@ -1148,7 +1149,7 @@
11481149
"grid_img = torchvision.utils.make_grid(last_sample, nrow=16)\n",
11491150
"%matplotlib inline\n",
11501151
"plt.figure(figsize = (20,10))\n",
1151-
"plt.imshow(grid_img.permute(1, 2, 0).cpu().numpy(), cmap='gray')"
1152+
"plt.imshow(grid_img.permute(1, 2, 0).cpu().numpy(), cmap='gray');"
11521153
]
11531154
},
11541155
{
@@ -1454,7 +1455,7 @@
14541455
"name": "python",
14551456
"nbconvert_exporter": "python",
14561457
"pygments_lexer": "ipython3",
1457-
"version": "3.11.8"
1458+
"version": "3.12.7"
14581459
},
14591460
"toc": {
14601461
"base_numbering": 1,
