Towards improving the network architecture of GANs and their evaluation methods
Barua, Sukarna (2019)

Generative Adversarial Networks (GANs) are a powerful class of generative models. GAN models have recently brought significant success in image synthesis tasks. One key issue concerning GANs is the design of a network architecture that results in high training stability and sample quality. GAN models consist of two distinct neural networks known as the generator and discriminator. Conventional practice is to use a deep convolutional architecture for both networks, eliminating fully connected layers or restricting their use to the input and output layers. Our investigation reveals that eliminating fully connected layers from the network architecture of GANs is not the best practice, and that a more effective GAN architecture can be designed by instead exploiting fully connected layers within the conventional convolutional architecture. In this respect, we propose an improved network architecture for GANs that employs multiple fully connected layers in both the generator and discriminator networks. Models based on our proposed architecture learn faster than the conventional architecture and generate higher-quality samples. In addition, our proposed architecture demonstrates higher training stability than the conventional architecture in several experimental settings. We demonstrate the effectiveness of our architecture in generating high-fidelity images on four benchmark image datasets. Another key challenge when using GANs is how to best measure their ability to generate realistic data. In this regard, we demonstrate that an intrinsic dimensional characterization of the data space learned by a GAN model leads to an effective evaluation metric for GAN quality.
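To make the architectural idea concrete, the following is a minimal sketch of a generator that keeps multiple fully connected layers before the reshape to an image. It is an illustrative assumption, not the thesis's actual architecture: the layer widths, activations, and the absence of convolutional blocks are all placeholders chosen for brevity.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def make_fc_generator(noise_dim=100, hidden_dims=(256, 512), img_shape=(28, 28)):
    """Hypothetical generator sketch: several fully connected hidden layers
    map a noise vector to a flattened image, illustrating the idea of
    retaining FC layers rather than using a purely convolutional stack.
    All sizes here are arbitrary, not taken from the thesis."""
    rng = np.random.default_rng(0)
    dims = [noise_dim, *hidden_dims, int(np.prod(img_shape))]
    # One weight matrix per FC layer (biases omitted for brevity).
    weights = [rng.normal(0.0, 0.02, size=(a, b)) for a, b in zip(dims, dims[1:])]

    def generate(z):
        h = z
        for W in weights[:-1]:
            h = relu(h @ W)              # hidden fully connected layers
        out = np.tanh(h @ weights[-1])   # map outputs into the pixel range [-1, 1]
        return out.reshape(-1, *img_shape)

    return generate

gen = make_fc_generator()
imgs = gen(np.random.default_rng(1).normal(size=(4, 100)))
```

In a full model these FC layers would typically be followed by (or interleaved with) convolutional blocks; the sketch only shows the fully connected portion that the conventional architecture omits.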
In particular, we propose a new evaluation measure, CrossLID, that assesses the local intrinsic dimensionality (LID) of real-world data with respect to neighborhoods found in GAN-generated samples. Intuitively, CrossLID measures the degree to which the manifolds of two data distributions coincide with each other. We compare our proposed measure to several state-of-the-art evaluation metrics. Our experiments show that CrossLID is strongly correlated with the progress of GAN training, is sensitive to mode collapse, is robust to small-scale noise and image transformations, and is stable across sample sizes. One key advantage of the proposed CrossLID metric is its ability to assess the mode-wise performance of GAN models. This mode-wise evaluation can be used to assess how well a GAN model has learned the different modes present in the target data distribution. We demonstrate how the proposed mode-wise assessment can be utilized during the GAN training process to detect unlearned modes. This leads us to an effective training strategy for GANs that dynamically mitigates unlearned modes by oversampling them during training. Experiments on benchmark image datasets show that our proposed training approach achieves better performance scores than conventional GAN training. In addition, our training approach demonstrates higher stability against mode failures of GANs compared to conventional training.
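A CrossLID-style score can be sketched with the widely used maximum-likelihood LID estimator applied to cross-set nearest-neighbor distances: for each real point, estimate LID from its k nearest neighbors in the generated sample, then average. This is a simplified assumption about the metric's form (the `cross_lid` function, the choice of Euclidean distance, and k = 10 are all illustrative, not the thesis's exact formulation).

```python
import numpy as np

def cross_lid(real, gen, k=10):
    """Sketch of a CrossLID-style score: average, over real points, of the
    maximum-likelihood LID estimate computed from each real point's k
    nearest neighbours in the generated set. Assumed form, for illustration."""
    scores = []
    for x in real:
        # Distances from the real point to all generated points, k smallest.
        d = np.sort(np.linalg.norm(gen - x, axis=1))[:k]
        d = np.maximum(d, 1e-12)  # guard against zero distances
        # MLE LID estimate: -(mean of log(r_i / r_k))^{-1}.
        lid = -1.0 / np.mean(np.log(d / d[-1]))
        scores.append(lid)
    return float(np.mean(scores))

rng = np.random.default_rng(0)
real = rng.normal(size=(200, 2))
fake = rng.normal(size=(500, 2))
score = cross_lid(real, fake)
```

When the two samples come from the same low-dimensional distribution, as above, the score stays small; a generated sample that misses part of the real manifold would push the cross-estimates up.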