Editing for module 7
jeffheaton committed Apr 25, 2022
1 parent 0738f30 commit c9ab308
Showing 5 changed files with 1,288 additions and 1,261 deletions.
66 changes: 32 additions & 34 deletions t81_558_class_07_1_gan_intro.ipynb
@@ -79,17 +79,6 @@
" COLAB = False"
]
},
{
"cell_type": "markdown",
"metadata": {
@@ -98,56 +87,58 @@
"source": [
"# Part 7.1: Introduction to GANS for Image and Data Generation\n",
"\n",
"A generative adversarial network (GAN) is a class of machine learning systems invented by Ian Goodfellow in 2014. [[Cite:goodfellow2014generative]](https://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf) Two neural networks contest with each other in a game. Given a training set, this technique learns to generate new data with the same statistics as the training set. For example, a GAN trained on photographs can generate new photographs that look at least superficially authentic to human observers, having many realistic characteristics. Though originally proposed as a form of generative model for unsupervised learning, GANs have also proven useful for semi-supervised learning, fully supervised learning, and reinforcement learning. \n",
"A generative adversarial network (GAN) is a class of machine learning systems invented by Ian Goodfellow in 2014. [[Cite:goodfellow2014generative]](https://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf) Two neural networks compete with each other in a game. The GAN training algorithm starts with a training set and learns to generate new data with the same distributions as the training set. For example, a GAN trained on photographs can generate new photographs that look at least superficially authentic to human observers, having many realistic characteristics. \n",
"\n",
"This paper used neural networks to automatically generate images for several datasets that we've seen previously: MINST and CIFAR. However, it also included the Toronto Face Dataset (a private dataset used by some researchers). These generated images are given in Figure 7.GANS.\n",
"This chapter makes use of the PyTorch framework rather than Keras/TensorFlow. While there are versions of [StyleGAN2-ADA that work with TensorFlow 1.0](https://github.com/jeffheaton/t81_558_deep_learning/blob/5e2528a08c302c82919001a3c3c8364c29c1b999/t81_558_class_07_3_style_gan.ipynb), NVIDIA has switched to PyTorch for StyleGAN. Running this notebook in this notebook in Google CoLab is the most straightforward means of completing this chapter. Because of this, I designed this notebook to run in Google CoLab. It will take some modifications if you wish to run it locally.\n",
"\n",
"This original StyleGAN paper used neural networks to automatically generate images for several previously seen datasets: MINST and CIFAR. However, it also included the Toronto Face Dataset (a private dataset used by some researchers). You can see some of these images in Figure 7.GANS.\n",
"\n",
"**Figure 7.GANS: GAN Generated Images**\n",
"![GAN](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/gan-2.png \"GAN Generated Images\")\n",
"\n",
"Only sub-figure D made use of convolutional neural networks. Figures A-C make use of fully connected neural networks. As we will see in this module, the role of convolutional neural networks with GANs was greatly increased.\n",
"Only sub-figure D made use of convolutional neural networks. Figures A-C make use of fully connected neural networks. As we will see in this module, the researchers significantly increased the role of convolutional neural networks for GANs.\n",
"\n",
"A GAN is called a generative model because it generates new data. The overall process of a GAN is given by the following diagram in Figure 7.GAN-FLOW.\n",
"We call a GAN a generative model because it generates new data. You can see the overall process in Figure 7.GAN-FLOW.\n",
"\n",
"**Figure 7.GAN-FLOW: GAN Structure**\n",
"![GAN Structure](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/gan-1.png \"GAN Structure\")\n",
"\n",
"## Face Generation with StyleGAN and Python\n",
"\n",
"GANs have appeared frequently in the media, showcasing their ability to generate extremely photorealistic faces. One significant step forward for realistic face generation was the NVIDIA StyleGAN series. NVIDIA introduced the origional StyleGAN in 2018. [[Cite:karras2019style]](https://arxiv.org/abs/1812.04948) StyleGAN was followed by StyleGAN2 in 2019, which improved the quality of StyleGAN by removing certian artifacts. [[Cite:karras2019analyzing]](https://arxiv.org/abs/1912.04958) Most recently, in 2020, NVIDIA released StyleGAN2 adaptive discriminator augmentation (ADA), which will be the focus of this module. [[Cite:karras2020training]](https://arxiv.org/abs/2006.06676) We will see both how to train StyleGAN2 ADA on any arbitray set of images; as well as use pretrained weights provided by NVIDIA. The NVIDIA weights allow us to generate high resolution photorealistic looking faces, such seen in Figure 7.STY-GAN.\n",
"GANs have appeared frequently in the media, showcasing their ability to generate highly photorealistic faces. One significant step forward for realistic face generation was the NVIDIA StyleGAN series. NVIDIA introduced the origional StyleGAN in 2018. [[Cite:karras2019style]](https://arxiv.org/abs/1812.04948) StyleGAN was followed by StyleGAN2 in 2019, which improved the quality of StyleGAN by removing certian artifacts. [[Cite:karras2019analyzing]](https://arxiv.org/abs/1912.04958) Most recently, in 2020, NVIDIA released StyleGAN2 adaptive discriminator augmentation (ADA), which will be the focus of this module. [[Cite:karras2020training]](https://arxiv.org/abs/2006.06676) We will see both how to train StyleGAN2 ADA on any arbitray set of images; as well as use pretrained weights provided by NVIDIA. The NVIDIA weights allow us to generate high resolution photorealistic looking faces, such seen in Figure 7.STY-GAN.\n",
"\n",
"**Figure 7.STY-GAN: StyleGAN2 Generated Faces**\n",
"![StyleGAN2 Generated Faces](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/stylegan2_images.jpg \"StyleGAN2 Generated Faces\")\n",
"\n",
"The above images were generated with StyleGAN2, using Google CoLab. Following the instructions in this section, you will be able to create faces like this of your own. StyleGAN2 images are usually 1,024 x 1,024 in resolution. An example of a full resolution StyleGAN image can be [found here](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/stylegan2-hires.jpg). \n",
"The above images were generated with StyleGAN2, using Google CoLab. Following the instructions in this section, you will be able to create faces like this of your own. StyleGAN2 images are usually 1,024 x 1,024 in resolution. An example of a full-resolution StyleGAN image can be [found here](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/stylegan2-hires.jpg). \n",
"\n",
"The primary advanced introduced by the adaptive discriminator augmentation is that the training images are augmented in real time. Image augmentation is a common technique in many convolution neural network applications. Augmentation has the effect of increasing the size of the training set. Where StyleGAN2 previously required over 30K images for an effective to develop an effective neural network; now much fewer are needed. I used 2K images to train the fish generating GAN for this section. Figure 7.STY-GAN-ADA demonstrates the ADA process.\n",
"The primary advancement introduced by the adaptive discriminator augmentation is that the algorithm augments the training images in real-time. Image augmentation is a common technique in many convolution neural network applications. Augmentation has the effect of increasing the size of the training set. Where StyleGAN2 previously required over 30K images for an effective to develop an effective neural network; now much fewer are needed. I used 2K images to train the fish generating GAN for this section. Figure 7.STY-GAN-ADA demonstrates the ADA process.\n",
"\n",
"**Figure 7.STY-GAN-ADA: StyleGAN2 ADA Training**\n",
"![StyleGAN2 Generated Faces](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/stylegan2-ada-teaser-1024x252.jpg \"StyleGAN2 Generated Faces\")\n",
"\n",
"The figure shows the increasing probability of augmentation, as $p$ increases. For small image sets the discriminator will generally memorize the image set unless the training algorithm makes use of augmentation. Once this memorization occurs, the discriminator is no longer providing useful information to the training of the generator.\n",
"The figure shows the increasing probability of augmentation as $p$ increases. For small image sets, the discriminator will generally memorize the image set unless the training algorithm makes use of augmentation. Once this memorization occurs, the discriminator is no longer providing useful information to the training of the generator.\n",
"\n",
"While the above images look much more realistic than images generated earlier in this course, they are not perfect. Look at Figure 7.STYLEGAN2. There are usually a number of tell-tail signs that you are looking at a computer generated image. One of the most obvious is usually the surreal, dream-like backgrounds. The background does not look obviously fake, at first glance; however, upon closer inspection you usually can't quite discern exactly what a GAN generated background actually is. Also look at the image character's left eye. It is slightly unrealistic looking, especially near the eyelashes.\n",
"While the above images look much more realistic than images generated earlier in this course, they are not perfect. Look at Figure 7.STYLEGAN2. There are usually several tell-tail signs that you are looking at a computer-generated image. One of the most obvious is usually the surreal, dream-like backgrounds. The background does not look obviously fake at first glance; however, upon closer inspection, you usually can't quite discern what a GAN-generated background is. Also, look at the image character's left eye. It is slightly unrealistic looking, especially near the eyelashes.\n",
"\n",
"Look at the following GAN face. Can you spot any imperfections?\n",
"Look at the following GAN face. Can you spot any imperfections?\n",
"\n",
"**Figure 7.STYLEGAN2: StyleGAN2 Face**\n",
"![StyleGAN2 Face](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/gan_bad.jpg \"StyleGAN2 Face\")\n",
"\n",
"* Image A demonstrates the very abstract backgrounds usually associated with a GAN generated image.\n",
"* Image A demonstrates the abstract backgrounds usually associated with a GAN-generated image.\n",
"* Image B exhibits issues that earrings often present for GANs. GANs sometimes have problems with symmetry, particularly earrings.\n",
"* Image C contains an abstract background, as well as a highly distorted secondary image.\n",
"* Image C contains an abstract background and a highly distorted secondary image.\n",
"* Image D also contains a highly distorted secondary image that might be a hand.\n",
"\n",
"There are a number of websites that allow you to generate GANs of your own without any software.\n",
"Several websites allow you to generate GANs of your own without any software.\n",
"\n",
"* [This Person Does not Exist](https://www.thispersondoesnotexist.com/)\n",
"* [Which Face is Real](http://www.whichfaceisreal.com/)\n",
"\n",
"The first site generates high resolution images of human faces. The second site presents a quiz to see if you can detect the difference between a real and fake human faceimage.\n",
"The first site generates high-resolution images of human faces. The second site presents a quiz to see if you can detect the difference between a real and fake human face image.\n",
"\n",
"In this module you will learn to create your own StyleGAN2 pictures using Python."
"In this chapter, you will learn to create your own StyleGAN pictures using Python."
]
},
{
@@ -156,9 +147,9 @@
"id": "Rq3dZOg_5GNH"
},
"source": [
"### Generating High Rez GAN Faces with Google CoLab\n",
"## Generating High Rez GAN Faces with Google CoLab\n",
"\n",
"This notebook demonstrates how to run [NVidia StyleGAN2 ADA](https://github.com/NVlabs/stylegan2-ada) inside of a Google CoLab notebook. I suggest you use this to generate GAN faces from a pretrained model. If you try to train your own, you will run into compute limitations of Google CoLab. Make sure to run this code on a GPU instance. GPU is assumed.\n",
"This notebook demonstrates how to run [NVidia StyleGAN2 ADA](https://github.com/NVlabs/stylegan2-ada) inside a Google CoLab notebook. I suggest you use this to generate GAN faces from a pretrained model. If you try to train your own, you will run into compute limitations of Google CoLab. Make sure to run this code on a GPU instance. GPU is assumed.\n",
"\n",
"First, we clone StyleGAN3 from GitHub."
]
@@ -242,8 +233,8 @@
},
"source": [
"## Run StyleGan From Command Line\n",
"Add the StyleGAN folder to Python so that you can import it. The code below is based on code from NVidia. This actually generates your images. When you use StyleGAN you will generally create a GAN from a seed number, such as 6600. GANs are actually created by a latent vector, containing 512 floating point values. The seed is used by the GAN code to generate these 512 values. The seed value is easier to represent in code than a 512 value vector. However, while a small change to the latent vector results in a small change to the image, even a small change to the seed value will produce a radically different image.\n",
"\n"
"\n",
"Add the StyleGAN folder to Python so that you can import it. I based this code below on code from NVidia for the original StyleGAN paper. When you use StyleGAN you will generally create a GAN from a seed number. This seed is an integer, such as 6600, that will generate a unique image. The seed generates a latent vector containing 512 floating-point values. The GAN code uses the seed to generate these 512 values. The seed value is easier to represent in code than a 512 value vector; however, while a small change to the latent vector results in a slight change to the image, even a small change to the integer seed value will produce a radically different image."
]
},
{
@@ -370,7 +361,7 @@
"source": [
"## Run StyleGAN From Python Code\n",
"\n",
"Add the StyleGAN folder to Python so that you can import it. The code below is based on code from NVIDIA. This actually generates your images."
"Add the StyleGAN folder to Python so that you can import it. "
]
},
{
@@ -473,6 +464,13 @@
" G = legacy.load_network_pkl(f)['G_ema'].to(device) # type: ignore"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now generate images from integer seed codes in Python."
]
},
{
"cell_type": "code",
"execution_count": 9,
@@ -566,7 +564,7 @@
"source": [
"## Examining the Latent Vector\n",
"\n",
"Figure 7.LVEC shows the effects of transforming the latent vector between two images. This transformation is accomplished by moving one 512-value latent vector slowly to the other 512 vector. Images that have similar latent vectors will appear similarly to each other. A high-dimension point between two latent vectos will appear similar to both of the two endpoint latent vectors.\n",
"Figure 7.LVEC shows the effects of transforming the latent vector between two images. We accomplish this transformation by slowly moving one 512-value latent vector to another 512 vector. A high-dimension point between two latent vectors will appear similar to both of the two endpoint latent vectors. Images that have similar latent vectors will appear similar to each other.\n",
"\n",
"**Figure 7.LVEC: Transforming the Latent Vector**\n",
"![GAN](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/gan_progression.jpg \"GAN\")"
@@ -627,7 +625,7 @@
"id": "7fCn7OIM6caj"
},
"source": [
"The following code will move between the provided seeds. The constant STEPS specify how many frames there should be between each of the seeds."
"The following code will move between the provided seeds. The constant STEPS specify how many frames there should be between each seed."
]
},
{
@@ -771,7 +769,7 @@
"id": "JKWZQwJP7KDu"
},
"source": [
"Download the generated video."
"You can now download the generated video."
]
},
{