Generative modelling and adversarial learning

Wang, Chaoyue

Publication Type: Thesis
Issue Date: 2018

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Wang, Chaoyue
dc.date.accessioned	2018-10-03T01:59:44Z
dc.date.available	2018-10-03T01:59:44Z
dc.date.issued	2018
dc.identifier.uri	http://hdl.handle.net/10453/127910
dc.description	University of Technology Sydney. Faculty of Engineering and Information Technology.	en_AU
dc.format	Thesis (PhD)
dc.language.iso	en_AU	en_AU
dc.relation	https://opus.lib.uts.edu.au/bitstream/10453/127910/2/02whole.pdf
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	The author owns the copyright in this thesis including all reproduction and reuse rights for the work. The work may not be altered without the permission of the copyright owner. Attribution is essential when quoting or paraphrasing from this thesis.
dc.rights	au.edu.uts.lib/ppc
dc.subject	Generative modelling.	en_AU
dc.subject	Adversarial learning.	en_AU
dc.subject	Generative adversarial networks.	en_AU
dc.subject	Perceptual adversarial loss.	en_AU
dc.title	Generative modelling and adversarial learning	en_AU
dc.type	Thesis	en_AU
utslib.copyright.status	open_access

Abstract:

A central goal of statistics and machine learning is to represent and manipulate high-dimensional probability distributions of real-world data, such as natural images. Generative adversarial networks (GANs), which are based on the adversarial learning paradigm, are one of the main families of methods for learning generative models from complicated real-world data. A GAN and its variants use a generator to synthesise semantically meaningful data from standard noise distributions, and train a discriminator to distinguish real samples in the training dataset from fake samples synthesised by the generator. As the discriminator's opponent, the generator aims to deceive it by producing ever more realistic samples. Through this two-player adversarial game, the generator's distribution comes to approximate the real-world data distribution, so that new samples can be drawn from it.

This thesis aims both to improve the quality of generative modelling and to manipulate generated samples by specifying multiple scene properties. A novel framework for training GANs is proposed to stabilise the training process and produce more realistic samples. Unlike existing GANs, which alternately train a generator and a discriminator using a single pre-defined adversarial objective function, the framework uses different adversarial training objectives as mutation operations and trains a population of generators to adapt to the environment (i.e. the discriminator). The samples produced by the generators at each iteration are evaluated, and only well-performing generators are preserved and used for further training. In this way, the proposed framework overcomes the limitations of any individual adversarial training objective and always preserves the best offspring, which contributes to the progress and success of GAN training.

Building on the GAN framework, this thesis also devises a novel model called the perceptual adversarial network (PAN). The PAN consists of two feed-forward convolutional neural networks: an image transformation network and a discriminative network. In addition to the generative adversarial loss widely used in GANs, the thesis proposes a perceptual adversarial loss, under which the transformation network is trained adversarially against the hidden layers of the discriminative network. The hidden layers and output of the discriminative network are updated to continually and automatically discover discrepancies between a transformed image and the corresponding ground truth, while the image transformation network is trained to minimise the discrepancies identified by the discriminative network.

Furthermore, to extend generative models to more challenging re-rendering tasks, this thesis explores the disentangled representations encoded in real-world samples and proposes a principled tag disentangled generative adversarial network, which re-renders new samples of an object of interest from a single image by specifying multiple scene properties. Specifically, a disentangling network extracts disentangled and interpretable representations from an input sample, and a generative network then uses them to generate new samples. To improve the quality of the disentangled representations, a tag mapping net determines the consistency between the image and its tags. Finally, experiments on several challenging datasets and image synthesis tasks demonstrate the good performance of the proposed frameworks on the problems of interest.
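
The evolutionary training scheme described in the abstract (mutate generators with different adversarial objectives, evaluate the offspring, keep the best) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the three objectives are common generator losses from the GAN literature, the "generator" is reduced to a single number `theta`, the discriminator is a fixed sigmoid, and the names `evolve_step`, `toy_mutate` and `toy_fitness` are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Assumed generator objectives, each a function of d = D(G(z)),
# the discriminator's probability that a generated sample is real.
def minimax_loss(d):
    return math.log(1.0 - d)

def heuristic_loss(d):
    return -math.log(d)

def least_squares_loss(d):
    return (d - 1.0) ** 2

# Each objective acts as one "mutation operation".
MUTATIONS = {
    "minimax": minimax_loss,
    "heuristic": heuristic_loss,
    "least_squares": least_squares_loss,
}

def evolve_step(parents, mutate, fitness, survivors):
    """One generation: mutate every parent with every objective,
    score the offspring, and keep the `survivors` best ones."""
    offspring = []
    for parent in parents:
        for name, loss in MUTATIONS.items():
            child = mutate(parent, loss)
            offspring.append((fitness(child), name, child))
    # Sort on the fitness score only, highest first.
    offspring.sort(key=lambda item: item[0], reverse=True)
    return offspring[:survivors]

# Toy stand-ins: the generator is a scalar theta and the fixed
# discriminator outputs sigmoid(theta) for its samples.
def toy_mutate(theta, loss, lr=0.5, eps=1e-5):
    # One gradient-descent step on the chosen objective,
    # using a central finite-difference gradient.
    grad = (loss(sigmoid(theta + eps)) - loss(sigmoid(theta - eps))) / (2 * eps)
    return theta - lr * grad

def toy_fitness(theta):
    # Higher discriminator score is treated as better sample quality.
    return sigmoid(theta)

population = [0.0]
for _ in range(3):
    best = evolve_step(population, toy_mutate, toy_fitness, survivors=1)
    population = [child for _, _, child in best]
```

In a real setting each generator would be a neural network, the discriminator would be updated between generations, and the fitness function would need to be designed with care; this sketch only shows the mutate, evaluate and select loop.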

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/127910