Enhancing Vision-Language Models Incorporating TSK Fuzzy System for Domain Adaptation

Publisher:
IEEE
Publication Type:
Conference Proceeding
Citation:
2024 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2024, 00, pp. 1-8
Issue Date:
2024-01-01
Unsupervised Domain Adaptation (UDA) addresses the challenge of applying knowledge from a labeled source domain to tasks within an unlabeled target domain, where each domain exhibits its own data distribution. To tackle significant uncertainty in the unlabeled target domain, fuzzy domain adaptation methods have been devised. However, existing methods focus heavily on visual information, overlooking the textual information available in class labels. To this end, vision-language models have been developed to exploit information from both visual and textual branches. Nonetheless, adapting vision-language models to UDA encounters two critical issues: 1) current methods tend to optimize only one branch, risking convergence to local optima, and 2) cross-domain relationships are insufficiently exploited. To address these issues and advance UDA, this paper proposes a method called VLM-TSK-DA, which enhances vision-language models by integrating Takagi-Sugeno-Kang (TSK) fuzzy systems. The TSK fuzzy system is employed as an image adapter to manage uncertainty during the transfer process, and its output is combined with image features in a residual manner for performance optimization. Our method integrates the TSK fuzzy system with prompt learning, ensuring simultaneous updates of both visual and textual branches to achieve a global optimum. Furthermore, we introduce a fuzzy c-means clustering loss function designed to leverage inherent cross-domain relationships, significantly reducing the distance between the target domain data and source cluster centers with high membership values, thereby minimizing the distribution discrepancy. Empirical evaluations on real-world datasets validate the efficacy of the proposed method.
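The abstract does not state the exact form of the fuzzy c-means clustering loss, so the following is only a minimal sketch of the general idea, built on the textbook fuzzy c-means membership and objective rather than the paper's own formulation. The function names, the fuzzifier value `m = 2`, and the toy feature vectors are all hypothetical choices made for illustration.

```python
import math

def fcm_memberships(x, centers, m=2.0, eps=1e-12):
    """Textbook fuzzy c-means memberships of one point x to each center.

    u_k = 1 / sum_j (d_k / d_j)^(2 / (m - 1)), so the memberships sum to 1.
    eps avoids division by zero when x coincides with a center.
    """
    dists = [math.dist(x, c) + eps for c in centers]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_k / d_j) ** power for d_j in dists) for d_k in dists]

def fcm_loss(targets, centers, m=2.0):
    """Mean over target points of sum_k u_k^m * ||x - c_k||^2.

    Assumption: source cluster centers are held fixed and target features
    are pulled toward the centers to which they have high membership.
    """
    total = 0.0
    for x in targets:
        u = fcm_memberships(x, centers, m)
        total += sum((u_k ** m) * math.dist(x, c) ** 2
                     for u_k, c in zip(u, centers))
    return total / len(targets)

# Hypothetical 2-D features, for illustration only.
source_centers = [[0.0, 0.0], [4.0, 0.0]]
target_feats = [[1.0, 0.0], [3.5, 0.5]]

print(fcm_memberships(target_feats[0], source_centers))
print(fcm_loss(target_feats, source_centers))
```

Under these assumptions, a target point close to one source center receives a high membership for that center, and the loss shrinks as target features move toward their high-membership centers. In a trainable model the same quantity would be computed on differentiable feature tensors instead of Python lists.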