Multidimensional Dynamic Pruning: Exploring Spatial and Channel Fuzzy Sparsity

Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Publication Type:
Journal Article
Citation:
IEEE Transactions on Fuzzy Systems, 2024, 32, (9), pp. 4890-4901
Issue Date:
2024-01-01
File:
1719661.pdf (Published version, 1.92 MB)
Dynamic pruning is an effective model compression method for reducing the computational cost of neural networks. However, existing dynamic pruning methods are limited to pruning along a single dimension (channel, spatial, or depth) and therefore cannot fully exploit the redundancy of the network. Moreover, most current state-of-the-art methods implement dynamic pruning by masking out partial channels and pixels during training, and thus fail to accelerate inference. To address these limitations, we propose a novel fuzzy-based multidimensional dynamic pruning paradigm that dynamically compresses neural networks along both the channel and spatial dimensions. Specifically, we design a multidimensional fuzzy-mask block that simultaneously learns which spatial positions and channels are redundant and should be pruned. The Gumbel-Softmax trick, combined with a sparsity loss, is then introduced to train these mask modules in an end-to-end manner. During the testing stage, we convert the features and convolution kernels into two matrices and implement sparse convolution as a matrix multiplication to accelerate network inference. Extensive experiments demonstrate that our method outperforms existing methods in terms of both accuracy and computational cost. For instance, on the CIFAR-10 dataset, our method prunes 68% of the FLOPs of ResNet-56 with only a 0.07% Top-1 accuracy drop.
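The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch, under assumptions of our own (this is not the authors' code): a Gumbel-Softmax relaxation that produces a differentiable keep/prune mask during training, and a test-time sparse 1x1 convolution expressed as a dense matrix multiplication restricted to the kept channels and spatial positions. All shapes, names, and the temperature value are illustrative.

```python
# Illustrative sketch only: Gumbel-Softmax mask sampling and matmul-based
# sparse convolution. Shapes, names, and hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax_mask(logits, tau=1.0):
    """Soft keep-probability per unit; logits has shape (n, 2) = (prune, keep)."""
    g = -np.log(-np.log(rng.uniform(1e-9, 1.0, logits.shape)))  # Gumbel noise
    y = logits + g
    y = np.exp((y - y.max(axis=1, keepdims=True)) / tau)        # tempered softmax
    y = y / y.sum(axis=1, keepdims=True)
    return y[:, 1]

def sparse_conv1x1(x, w, ch_mask, sp_mask):
    """1x1 convolution as a matmul over only the kept channels/positions.
    x: (C, H, W) features; w: (C_out, C) kernel;
    ch_mask: (C,) bool channel mask; sp_mask: (H*W,) bool position mask."""
    C, H, W = x.shape
    xm = x.reshape(C, H * W)                 # features flattened to a matrix
    cols = xm[np.ix_(ch_mask, sp_mask)]      # keep rows (channels), cols (pixels)
    out = np.zeros((w.shape[0], H * W))
    out[:, sp_mask] = w[:, ch_mask] @ cols   # dense matmul on the kept subset
    return out.reshape(-1, H, W)

# Training time: sample soft, differentiable masks.
logits = rng.normal(size=(8, 2))
soft = gumbel_softmax_mask(logits)

# Test time: hard masks (argmax, no sampling), then accelerated inference.
ch_mask = logits[:, 1] > logits[:, 0]
sp_mask = rng.uniform(size=16) > 0.5
x = rng.normal(size=(8, 4, 4))
w = rng.normal(size=(6, 8))
y = sparse_conv1x1(x, w, ch_mask, sp_mask)
```

In this sketch the pruned spatial positions stay zero in the output, and the matmul only touches the surviving rows and columns, which is the source of the inference-time speedup the abstract claims; a real k x k convolution would use an im2col unfolding in place of the simple reshape.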