Advanced Techniques of Cross Domain Translation Learning

Publication Type:
Thesis
Issue Date:
2020
Cross-domain translation, such as image captioning, fashion synthesis from text descriptions, and music composition in a particular style, has attracted considerable interest in the deep learning community in recent years. Despite significant progress in this field, previous methods share several drawbacks. First, although the attention mechanism has been widely applied to domain transfer and has achieved remarkable results, attention across domains remains an open research question because the source and target domains often have different data structures. Second, most domain-translation algorithms handle only a single pair of domains, so N image domains require 2 × (N choose 2) = N(N-1) transfer functions; ten domains, for example, already call for 90 separate models, which makes training prohibitively expensive. We propose a set of solutions to these two problems, described in detail in Chapter 3. Third, most generative-model-based domain-transfer algorithms use a single-mode distribution to model the latent space, which works poorly on datasets whose samples are diverse and form multiple clusters. Our study applies mixture models to cross-domain generation; their effects and properties are illustrated in Chapter 4. Finally, cross-domain translation models usually suffer from long training times and slow convergence, as does most deep neural network training involving complex network designs and large datasets. Our work in Chapter 5 accelerates deep neural network training with a specially designed mini-batch sampling strategy.
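
As a point of reference for the first problem, the sketch below shows standard scaled dot-product attention computed across two domains, for instance caption tokens attending over image-region features. It is a generic NumPy illustration, not the model developed in the thesis; the array shapes and the omission of learned projection matrices are simplifying assumptions.

import numpy as np

def cross_attention(queries, keys, values):
    """Minimal scaled dot-product cross-attention: tokens from one domain
    (queries) attend over features from another domain (keys/values).
    Learned projection matrices are omitted for brevity."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)          # (Tq, Tk) affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over source tokens
    return weights @ values                         # (Tq, D) attended features

# Example: 5 caption tokens attending over 49 image-region features
rng = np.random.default_rng(0)
text = rng.normal(size=(5, 32))     # query domain (e.g., words)
image = rng.normal(size=(49, 32))   # source domain (e.g., a 7x7 CNN feature grid)
print(cross_attention(text, image, image).shape)    # (5, 32)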
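
For the mixture-model view of the latent space discussed in Chapter 4, the following minimal sketch samples latent codes from a Gaussian mixture prior instead of a single Gaussian; the component count, dimensionality, and shared isotropic variance are illustrative assumptions rather than the thesis's actual configuration.

import numpy as np

def sample_mixture_latent(n_samples, means, std=1.0, weights=None, rng=None):
    """Draw latent codes from a K-component Gaussian mixture prior.

    means   : (K, D) array of component means
    std     : shared isotropic standard deviation
    weights : (K,) mixing proportions (uniform if None)
    """
    rng = np.random.default_rng() if rng is None else rng
    k, d = means.shape
    weights = np.full(k, 1.0 / k) if weights is None else weights
    # Pick a mixture component per sample, then add Gaussian noise around its mean
    comps = rng.choice(k, size=n_samples, p=weights)
    return means[comps] + std * rng.normal(size=(n_samples, d))

# Example: 4 clusters in a 64-dimensional latent space
means = np.random.default_rng(0).normal(size=(4, 64)) * 3.0
z = sample_mixture_latent(8, means)
print(z.shape)  # (8, 64)

Replacing a unimodal prior with such a mixture lets each cluster of a diverse dataset be generated from its own mode of the latent space.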
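
The abstract does not describe the mini-batch sampling strategy of Chapter 5, so the sketch below is only a hypothetical example of how a custom sampler slots into a training loop: a class-balanced sampler that draws the same number of examples per class in every mini-batch. The batch composition and the balancing rule are assumptions made for illustration.

import numpy as np

class BalancedBatchSampler:
    """Hypothetical sampler: each mini-batch draws an equal number of examples
    from every selected class, so rare clusters are never starved.
    (Illustrative only; the thesis's actual strategy is described in Chapter 5.)"""
    def __init__(self, labels, classes_per_batch, samples_per_class, rng=None):
        self.rng = np.random.default_rng() if rng is None else rng
        self.by_class = {c: np.flatnonzero(labels == c) for c in np.unique(labels)}
        self.classes_per_batch = classes_per_batch
        self.samples_per_class = samples_per_class

    def next_batch(self):
        # Choose which classes appear in this batch, then sample indices per class
        chosen = self.rng.choice(list(self.by_class), size=self.classes_per_batch,
                                 replace=False)
        idx = [self.rng.choice(self.by_class[c], size=self.samples_per_class,
                               replace=False) for c in chosen]
        return np.concatenate(idx)

# Example: 1000 samples over 10 classes, batches of 4 classes x 8 samples each
labels = np.random.default_rng(1).integers(0, 10, size=1000)
sampler = BalancedBatchSampler(labels, classes_per_batch=4, samples_per_class=8)
print(sampler.next_batch().shape)  # (32,)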