T3SRS: Tensor Train Transformer for compressing sequential recommender systems

Publisher:
Elsevier
Publication Type:
Journal Article
Citation:
Expert Systems with Applications, 2024, 238, Article 122260
Issue Date:
2024-03-15
File:
Binder1.pdf (Accepted version, Adobe PDF, 1.34 MB)
Abstract:
In recent years, attention mechanisms have gained popularity in sequential recommender systems (SRSs) because they capture dynamic user preferences efficiently. However, the over-parameterization of these models often increases the risk of overfitting. To address this challenge, we propose a Transformer model based on tensor train networks. First, we propose a tensor train layer (TTL) to represent the original weight matrix, reducing the space complexity of the mapping layer. Building on the TTL, we reconfigure the multi-head attention module and the position-wise feed-forward network. Finally, a tensor train layer replaces the output layer to complete the overall compression. Experimental results show that the proposed model compresses SRS parameters effectively, achieving compression rates of 76.2%–85.0% while maintaining or improving sequential recommendation performance. To our knowledge, the Tensor Train Transformer is the first model compression approach for Transformer-based SRSs, and the model is broadly applicable.
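To illustrate the idea behind the tensor train layer described in the abstract, the following is a minimal sketch, assuming PyTorch; it is not the authors' code, and the class name TTLinear, the mode factorizations, and the TT-ranks are chosen purely for illustration. The dense weight matrix of a mapping layer is stored as a chain of small 4-D TT-cores, which is where the parameter reduction comes from.

```python
# Minimal sketch of a tensor train layer (TTL), assuming PyTorch.
# Names (TTLinear, in_modes, out_modes, ranks) are illustrative, not from the paper.
import math

import torch
import torch.nn as nn


class TTLinear(nn.Module):
    """Linear layer whose weight matrix is stored in tensor train (TT) format.

    in_modes / out_modes factorize the input / output dimensions
    (prod(in_modes) = in_features, prod(out_modes) = out_features);
    ranks are the TT-ranks with ranks[0] = ranks[-1] = 1.
    """

    def __init__(self, in_modes, out_modes, ranks):
        super().__init__()
        assert len(in_modes) == len(out_modes) == len(ranks) - 1
        assert ranks[0] == ranks[-1] == 1
        self.in_modes, self.out_modes = in_modes, out_modes
        # One small 4-D core per mode pair: shape (r_{k-1}, m_k, n_k, r_k).
        self.cores = nn.ParameterList(
            [nn.Parameter(0.1 * torch.randn(ranks[k], in_modes[k], out_modes[k], ranks[k + 1]))
             for k in range(len(in_modes))]
        )
        self.bias = nn.Parameter(torch.zeros(math.prod(out_modes)))

    def weight(self):
        # Contract the TT-cores back into the full (in_features, out_features) matrix.
        w = self.cores[0]                                  # (1, m_1, n_1, r_1)
        for core in self.cores[1:]:
            # Sum over the shared TT-rank, then merge the row and column modes.
            w = torch.einsum('amnr,rkls->amknls', w, core)
            a, m, k, n, l, s = w.shape
            w = w.reshape(a, m * k, n * l, s)
        return w.reshape(math.prod(self.in_modes), math.prod(self.out_modes))

    def forward(self, x):
        # For clarity the full matrix is rebuilt here; a practical TTL contracts
        # the reshaped input with the cores directly to save memory and compute.
        return x @ self.weight() + self.bias


# Example: a compressed 512 -> 2048 position-wise feed-forward mapping.
layer = TTLinear(in_modes=(8, 8, 8), out_modes=(8, 16, 16), ranks=(1, 4, 4, 1))
out = layer(torch.randn(32, 512))          # (batch, 512) -> (batch, 2048)
print(out.shape, sum(p.numel() for p in layer.parameters()))
```

Under these illustrative settings, the 512×2048 mapping is represented by roughly 2.8K core parameters instead of about 1.05M dense weights, which is the kind of storage reduction the paper targets for the attention, feed-forward, and output layers.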