Learning Neural Network Architecture from Data: NAS and Dynamic Networks

Publication Type:
Thesis
Issue Date:
2023
A myriad of breakthroughs in neural network architecture have brought significant improvements across a wide range of deep learning tasks. Despite the large advances enabled by network design, manually finding a well-optimized network architecture remains challenging given the enormous number of design choices. Automatically learning neural network architecture from data, e.g., neural architecture search (NAS) and dynamic networks, offers a new approach to architecture design. These emerging data-dependent methods hold great potential but still face critical unsolved problems. On one hand, the efficiency and effectiveness of NAS cannot be guaranteed at the same time, because the large search space leads to inaccurate architecture ratings. On the other hand, for dynamic networks, the dynamic sparse patterns applied to convolutional filters in dynamic pruning methods fail to achieve actual acceleration in real-world implementations, owing to the extra overhead of indexing, weight copying, or zero masking. We therefore propose two novel NAS methods and one dynamic network method to overcome these issues. First, to improve the effectiveness of NAS, we modularize its large search space into blocks and use the block-wise representations of existing models to supervise the architecture search, distilling the neural architecture knowledge from a teacher model; we name this method DNA. Second, to cast off the yoke of the teacher architecture, we further propose an unsupervised NAS method named Block-wisely Self-Supervised Neural Architecture Search (BossNAS). Finally, to address the aforementioned issue of dynamic networks, we explore a dynamic network slimming regime, named Dynamic Slimmable Network (DS-Net), which aims to achieve good hardware efficiency by dynamically adjusting the number of filters at test time with respect to different inputs, while keeping the filters stored statically and contiguously in hardware to avoid that extra overhead.
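To illustrate the contiguous-slicing idea behind DS-Net, the following minimal PyTorch sketch shows a convolution layer whose active output width is chosen per input by taking the first k filters of a single statically stored weight tensor, so no indexing, weight copying, or zero masking is involved. The class name `SlimmableConv2d`, the hand-picked widths, and the omission of the input-dependent width-selection policy are illustrative assumptions, not the thesis's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlimmableConv2d(nn.Module):
    """Convolution whose active output width is chosen per input at test time.

    The full weight tensor is stored once, statically and contiguously; a
    narrower sub-network is obtained by slicing the first `out_active`
    filters, avoiding indexing, weight copying, or zero masking.
    """

    def __init__(self, in_channels, max_out_channels, kernel_size, padding=0):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(max_out_channels, in_channels, kernel_size, kernel_size) * 0.01)
        self.bias = nn.Parameter(torch.zeros(max_out_channels))
        self.padding = padding

    def forward(self, x, out_active):
        # Slice a contiguous block of filters; the slice is a view, not a copy.
        w = self.weight[:out_active]
        b = self.bias[:out_active]
        return F.conv2d(x, w, b, padding=self.padding)


if __name__ == "__main__":
    conv = SlimmableConv2d(in_channels=3, max_out_channels=64, kernel_size=3, padding=1)
    x = torch.randn(1, 3, 32, 32)
    # In DS-Net the width would be predicted from the input; here we pick widths by hand.
    for width in (16, 32, 64):
        y = conv(x, out_active=width)
        print(width, tuple(y.shape))  # e.g. 16 (1, 16, 32, 32)
```

Because each narrower sub-network reuses a contiguous prefix of the stored filters, the sliced weights remain dense and cache-friendly, which is what allows the dynamic width adjustment to translate into real hardware speedups.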