diff --git a/0-abstract-and-keywords.tex b/0-abstract-and-keywords.tex
index 0a233cb..b252bf2 100644
--- a/0-abstract-and-keywords.tex
+++ b/0-abstract-and-keywords.tex
@@ -12,8 +12,8 @@
 We propose a method of partition training, making deep neural network easier to train.
 It is validated that the model can validly generate plausible facial images from attributes and be adjusted by attributes.
-Results also suggest that the use-cost of LittleGAN has declined. Compared to DCGAN and pix2pix,
- it can be more effectively applied to the actual production activity.
+Results also suggest that the use-cost of LittleGAN has declined.
+Compared to DCGAN and pix2pix, it can be more effectively applied to the actual production activity.

 \begin{IEEEkeywords}
 Facial Image Generation, Facial Image Adjustment, Generative Adversarial Networks, Machine Learning
diff --git a/1-introduction.tex b/1-introduction.tex
index 8814457..a0e9802 100644
--- a/1-introduction.tex
+++ b/1-introduction.tex
@@ -1,21 +1,27 @@
 \section{Introduction}
-As an important characteristic, face makes a difference in interpersonal activities.
-In daily life, it is necessary to get specific persons' images on various occasions, for example, passing the portrait information of strangers and getting different states of a person.
-In recent years, machine learning has developed deeply in society.
-Facial image generation and adjustment are one of the directions.
+In daily life, it is necessary to get specific persons' images on various occasions,
+ for example, passing the portrait information of strangers and getting different states of a person.
+In recent years, machine learning has spread to all aspects of society.
+Facial image generation and adjustment is an emerging subfield.

-Combined with the technology of facial identification, facial image generation can be used to determine identity.
-The technology mainly applies to identity verification, used to track targets, which is a vital supporting method to combat criminal activities.
-As for social activities, it can also be utilized to the identification in intelligent interaction scene, for example, security check at school gates and the check point of companys.
+Combined with the technology of facial identification,
+ facial image generation can be used to redefine identity.
+More than any other body part, the face defines who we are.
+Facial recognition is often used for identification.
+For Facebook it is a tool of convenience for tagging friends,
+ in China it is a vital method for combatting crime.
+When it comes to social activities, it can also be utilized for the identification.
+For example, identity authentication at a school gate or the reception in an office.

-Nowadays, public safety agency mainly adopt following methods to get facial images.
-According to the description of eyewitness, corporate with painters to get rough facial images,
- search for the image in existing image database and determine the identity.
+The long standing method for authorities to identify a suspect seen by an eyewitness is to have a painter get a rough facial image,
+ search for the figure in the existent image data base and then finally confirm the identity of the figure.
 This method has large shortcomings.
-First, perspective of painters will affect generation of the portrait.
- Second, there will be high requirements for users.
- Third, real-time performance is poor.
-Another method is getting photos by photography. However, photographing angle may also be poor.
+First, the biases of painters will affect generation of the portrait.
+The second is that the cost of the rendering increases sharply with the quality.
+And the third is that real-time performance is poor.
+Another method is getting photos by photography.
+However, there will be problems like poor photographing angle or the slow exposure.
 In order to track and position targets, photos need to be accurate.
-The research on adjusting the attributes of facial image is used to solve this problem.
-This research has certain practical significance in the fields, which aquires facial images more quickly.
\ No newline at end of file
+LittleGAN on adjusting the attributes of facial image is used to solve this problem.
+This research has certain practical significance for the fields that require accurate facial images,
+ such as in the accurate positioning of targets in criminal investigation or the securing of the entrance of a building.
\ No newline at end of file
diff --git a/2-related-work.tex b/2-related-work.tex
index ff71912..cb44d26 100644
--- a/2-related-work.tex
+++ b/2-related-work.tex
@@ -9,17 +9,19 @@ \subsubsection*{Image Generation}
 In StackGAN\upcite{stackgan}, the author also embedded the text description into the text feature map, and initially generated the contour of the object, and further combined the text feature map to generate higher resolution and more detailed images.
- InfoGAN was able to extract attributes from images, enabling conditional generation of images and unsupervised learning.
+InfoGAN\upcite{infogan} was able to extract attributes from images, enabling conditional generation of images and unsupervised learning.
 They also used different discriminators to identify images at different stages, thereby improving the quality of the image.

 \subsubsection*{Facial Image Generation}
-At present, there are several models related to the improvement of image generation quality,
+At present, there are several methods related to the improvement of facial image generation,
 such as BEGAN\upcite{began} and PGGAN\upcite{pggan}.
-BEGAN\upcite{began} used a generator and a discriminator similar to variational auto-encoding.
-It sequentially convolved images. It improved resolution and diversity of images. It also maintained the stability.
-PGGAN\upcite{pggan} first trained low-resolution facial image. It then gradually increased network layer for training,
+BEGAN\upcite{began} used variational auto-encoder as the discriminator.
+It provides a hyperparameter to instruct the balance between image variety and quality.
+It also maintained the stability.
+PGGAN\upcite{pggan} first trained low-resolution facial image.
+It then gradually increased network layer for training,
 therefore improving resolution of output images and stability of the network.
@@ -28,7 +30,6 @@ \subsection{Facial Image Adjustment}
 \subsubsection*{Image Translation}
 Image translation treats the image as outputs.
- The purpose of changing attributes is achieved by converting them.
 Pix2pix\upcite{pix2pix} used U-Net network, L1 Loss and PatchGAN classifiers.
 In generator, U-Net combined Encoder with Decoder, reducing pressure on each network layer and retaining more image information.
@@ -46,11 +47,11 @@ \subsubsection*{Image Translation}
 \subsubsection*{Facial Image Adjustment}
 StarGAN\upcite{stargan} trained across datasets so that the model could learn attributes and features from multiple datasets.
- It could generate images by specifying more attributes.
+It could generate images by specifying more attributes.
 AttGAN\upcite{attgan} extracted the attributes and latent map of images, changed attributes and combined latent map to generate images.
 It realized adjustment of images.
-Pix2pixHD\upcite{pix2pixhd} used multiple discriminators to discriminating images of different resolutions
+Pix2pixHD\upcite{pix2pixhd} used different discriminators to discriminate images of different resolutions
 and finally combined features at different resolutions to generate high-resolution facial images.
 However, this model is huge and difficult to train.
 Apart from that, it is too expensive to deploy and run.
diff --git a/3-target-and-innovation.tex b/3-target-and-innovation.tex
index d58536f..d9a77c6 100644
--- a/3-target-and-innovation.tex
+++ b/3-target-and-innovation.tex
@@ -4,9 +4,9 @@ \subsection{Research Goal}
 \begin{itemize}
 \item Propose a machine learning technology based, workable, convenient and efficient solution, addressing the needs of facial image generation and adjustment.
- It can replace traditional methods which have certain defects,for example, poor real-time performance and time consuming.
+ It can replace traditional methods which have certain defects, for example, poor real-time performance and time consuming.
 \item Combine current demand and most up-to-date work to improve the model and training methods.
- Reduce size and use-cost of model.
+ Reduce size and use-cost of model.
 \item Use new methods to adjust images so as to reduce information loss of original image.
 \end{itemize}
 \subsection{Research Highlight}
diff --git a/4-process.tex b/4-process.tex
index 68d9dd7..3cca2f7 100644
--- a/4-process.tex
+++ b/4-process.tex
@@ -95,7 +95,7 @@ \subsection{Research Process}

 \subsection{Model Overview}
 After the research above, we finally obtained the conditional facial image generation and adjustment model.
- It is as shown in Figure \ref{smliegan}.
+It is as shown in Figure \ref{smliegan}.

 \begin{figure}
 \begin{center}
diff --git a/5-experiments.tex b/5-experiments.tex
index 1e8b9fd..446a55d 100644
--- a/5-experiments.tex
+++ b/5-experiments.tex
@@ -25,7 +25,7 @@ \subsection{Training Details}
 Eecoder network is freezed when training adjustor network.
 The $\lambda$ in both generator and adjustor network is set to 0.02.
 We use 128-dimensional latent vector as input.
-Use 32 images training set for training.
+Use 32 images training set for training.
 A total of 25 training sessions are used.
 In each batch, generator is used to generate the image first.
 Then discriminator is used to identify the image and adjusted by adjustor network.
diff --git a/6-conclusion-and-discuss.tex b/6-conclusion-and-discuss.tex
index f433fc8..aa52e65 100644
--- a/6-conclusion-and-discuss.tex
+++ b/6-conclusion-and-discuss.tex
@@ -4,7 +4,8 @@ \subsection{Conclusion}
 We finally propose a facial image generation and adjustment model and its training method.
 We share decoder and encoder among three networks, which are generator, discriminator and adjustor.
 In that way, we reduce the model size and the calculation of training.
-We adjust facial image in the image space. It reduces the information loss of original image during the adjustment.
+We adjust facial image in the image space.
+It reduces the information loss of original image during the adjustment.
 In the training aspect, we use gradient penalty proposed in WGAN-GP\upcite{wgan-gp}.
 Most importantly, we innovatively put forward partition training.
 We have achieved research targets, which are speeding up convergence and further reducing training calculation.
@@ -12,22 +13,24 @@ \subsection{Conclusion}
 \subsection{Application}
 \subsubsection*{Searching for the target person}
 In daily life, in many cases, it is necessary to obtain images of specific person.
-For example, convey the portrait information of strangers and
- obtain images in different states.
-In the past, general use are verbal descriptions and drawing sketches. They have poor real-time performance.
- Also, requirements for personnel is high. The information conveyed is not intuitive enough to be biased.
+For example, convey the portrait information of strangers and obtain images in different states.
+In the past, general use are verbal descriptions and drawing sketches.
+They have poor real-time performance.
+Also, requirements for personnel is high.
+The information conveyed is not intuitive enough to be biased.
 Our model is small, requires less computing equipment and produces images that are closer to real-world images.
 It can be easily deployed in mobile devices to better improve this reality.
 \subsubsection*{Expand dataset}
 Nowadays, most of machine learning relies on a large amount of data, while in reality, there is less data tagged and tagging costs are high.
-Using LittleGAN, we can expand dataset. Machine learning can get more data for training,
+Using LittleGAN, we can expand dataset.
+Machine learning can get more data for training,
 verification and testing.

 \subsubsection*{Virtual Image Generation}
 On the Internet, for player or non-player characters generation in games and privacy protection,
-individuals or enterprises need personalized avatar generation.
+ individuals or enterprises need personalized avatar generation.
 We believe that using this model can meet the above personalized needs at a lower cost.

 \subsection{Prospect}
@@ -70,7 +73,8 @@ \subsubsection*{Natural Language as Input}

 \subsubsection*{Multi-domain Migration}
-LittleGAN is to provide a solution to the needs of facial image generation and adjustment. It is more suitable for production environment.
+LittleGAN is to provide a solution to the needs of facial image generation and adjustment.
+It is more suitable for production environment.
 The improved method we proposed in the study, like partition training, can also be applied to more fields.
 We hope to transfer and apply the results and experience gained in our research to more areas and reduce use-cost of training, deployment and running.