- In this study, we transfer the pattern from a pattern face image (the style image) to a basic face image (the content image) by adjusting the latent space and moving within it.
- We first train a VAE so that its encoder can later be used to extract features while training the main neural network.
- That is, the study does not use a model pre-trained on a dataset such as ImageNet. Instead, a VAE was trained to learn facial expressions and their characteristics, and the latent space resulting from VAE training serves as the input to the GAN.
- Following the approach of this study, we move within the latent space to find facial expressions that carry the basic features of the basic face image together with the facial features of the pattern image. We therefore need to move within the basic latent space to find the features contained in both images.
- Because normalization brings values closer together, we first normalize the values obtained from the latent space each time. This is the first step toward moving within the latent space to hybrid values that carry the features of both images.
- Because normalization brings values closer together and makes them smaller, we can picture it as moving from a latent space with large values (that is, the range of values each dimension takes) to a range of smaller values.
- Later, using the location of the pattern image within the latent space, we can reposition it within the latent space.
- With this methodology, we can reach a hybrid location within the latent space that carries the features of both the pattern image and the basic image.
- Note: This methodology requires a huge number of images. It also requires a lot of training time.
- Note: I trained the GAN for 100,000 epochs; the more epochs we trained for, the better the results.
- As mentioned at the beginning, the VAE gives the location of the basic image within the latent space, and likewise for the pattern image. The proposed method aims to reposition within that latent space.
- In the end, this methodology does not give back the same latent space: the result is an analogue of the first latent space, with some deviations consistent with the concept of pattern transfer.
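
The two latent-space steps described above (normalize, then reach a hybrid location) can be pictured with a small sketch. It is not the method used in this project: the notes do not say which normalization is applied, and here the repositioning is learned by the GAN rather than computed directly. The sketch assumes min-max scaling and a plain linear blend as stand-ins, and the names `normalize`, `hybrid_location` and `alpha` are hypothetical.

```python
from typing import List


def normalize(z: List[float]) -> List[float]:
    """Min-max scale a latent vector into [0, 1].

    Assumption: the notes do not name the normalization; min-max is a stand-in.
    """
    lo, hi = min(z), max(z)
    if hi == lo:
        # A constant vector carries no spread; map it to all zeros.
        return [0.0 for _ in z]
    return [(v - lo) / (hi - lo) for v in z]


def hybrid_location(z_basic: List[float], z_pattern: List[float],
                    alpha: float = 0.5) -> List[float]:
    """Blend two latent vectors linearly.

    Assumption: a linear blend only illustrates the idea of a hybrid
    location; in the project the repositioning is learned by the GAN.
    alpha = 0 returns the basic vector, alpha = 1 the pattern vector.
    """
    if len(z_basic) != len(z_pattern):
        raise ValueError("latent vectors must have the same dimension")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * b + alpha * p for b, p in zip(z_basic, z_pattern)]


# Toy 3-dimensional latent vectors standing in for VAE encoder outputs.
z_basic = normalize([4.0, -2.0, 10.0])
z_pattern = normalize([0.0, 30.0, 15.0])
z_hybrid = hybrid_location(z_basic, z_pattern, alpha=0.5)
```

After normalization every dimension lies in [0, 1], which matches the picture of moving from a range of large values to a range of smaller ones. The blended vector then sits between the two normalized locations.
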
Main Idea |
---|

Dataset link: https://www.kaggle.com/datasets/almightyj/person-face-dataset-thispersondoesnotexist/versions/1

Samples |
---|
The content image is fixed and the style image changes | Both the content image and the style image change each time |
---|---|
In both directions |
---|