Advanced Techniques of Cross Domain Translation Learning
- Publication Type: Thesis
- Issue Date: 2020
This item is open access.
Cross-domain translation, such as image captioning, fashion synthesis from text descriptions, and music composition in a particular style, has recently attracted considerable interest in the deep learning community. Despite significant progress in this field, previous methods have several identified drawbacks. First, although the attention mechanism has been widely applied to domain transfer and has achieved remarkable results, using it for cross-domain translation remains an open research question because the source and target domains have different data structures. Second, most domain translation algorithms address only a single pair of domains, so N image domains require 2 × (N choose 2) = N(N − 1) transfer functions; ten domains, for example, would already require 90 translation models, which makes training prohibitively expensive. We propose a set of solutions to these two problems, described in detail in Chapter 3. Third, most generative-model-based domain-transfer algorithms use a single-mode distribution to model the latent space, which works poorly on datasets whose samples are diverse and form multiple clusters. Our study applies mixture models to cross-domain generation; their effects and properties are illustrated in Chapter 4. Finally, cross-domain translation models usually suffer from long training times and are difficult to converge, as is true of most deep neural network training that involves complex network designs and large datasets. Our work in Chapter 5 accelerates deep neural network training with a specially designed mini-batch sampling strategy.
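To make the mixture-model idea concrete, the following is a minimal illustrative sketch, not the thesis's implementation: it samples latent codes from a Gaussian-mixture prior rather than the single-mode Gaussian commonly used in generative domain-transfer models. The number of components K, the latent dimensionality, and the component parameters are hypothetical choices for illustration only.

```python
# Sketch: a Gaussian-mixture latent prior (one mode per data cluster),
# contrasted with the usual single-mode Gaussian prior. All sizes and
# parameters below are assumed values, not those used in the thesis.
import numpy as np

rng = np.random.default_rng(0)

K, latent_dim = 5, 16                                   # assumed number of clusters / latent size
means = rng.normal(scale=3.0, size=(K, latent_dim))     # one mean per mixture component
stds = np.ones((K, latent_dim))                         # per-component standard deviations
weights = np.full(K, 1.0 / K)                           # uniform mixture weights

def sample_latent(n):
    """Draw n latent codes from the mixture prior instead of a single Gaussian."""
    components = rng.choice(K, size=n, p=weights)       # pick a mode for each sample
    noise = rng.normal(size=(n, latent_dim))
    return means[components] + stds[components] * noise

z = sample_latent(8)    # these codes would be fed to a cross-domain decoder
print(z.shape)          # (8, 16)
```

Sampling from such a multi-modal prior lets a generator cover datasets that form several distinct clusters, which a single-mode latent distribution tends to blur together.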