Speech2EEG: Leveraging Pretrained Speech Model for EEG Signal Recognition.
- Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
- Publication Type: Journal Article
- Citation: IEEE Trans Neural Syst Rehabil Eng, 2023, 31, pp. 2140-2153
- Issue Date: 2023
Open Access
This item is open access.
Full metadata record
Field | Value | Language |
---|---|---|
dc.contributor.author | Zhou, J https://orcid.org/0000-0002-6620-604X | |
dc.contributor.author | Duan, Y https://orcid.org/0000-0003-1517-994X | |
dc.contributor.author | Zou, Y | |
dc.contributor.author | Chang, Y-C | |
dc.contributor.author | Wang, Y-K | |
dc.contributor.author | Lin, C-T | |
dc.date.accessioned | 2024-03-05T03:47:42Z | |
dc.date.available | 2024-03-05T03:47:42Z | |
dc.date.issued | 2023 | |
dc.identifier.citation | IEEE Trans Neural Syst Rehabil Eng, 2023, 31, pp. 2140-2153 | |
dc.identifier.issn | 1534-4320 | |
dc.identifier.issn | 1558-0210 | |
dc.identifier.uri | http://hdl.handle.net/10453/176126 | |
dc.description.abstract | Identifying meaningful brain activities is critical in brain-computer interface (BCI) applications. Recently, an increasing number of neural network approaches have been proposed to recognize EEG signals. However, these approaches depend heavily on using complex network structures to improve the performance of EEG recognition and suffer from the deficit of training data. Inspired by the waveform characteristics and processing methods shared between EEG and speech signals, we propose Speech2EEG, a novel EEG recognition method that leverages pretrained speech features to improve the accuracy of EEG recognition. Specifically, a pretrained speech processing model is adapted to the EEG domain to extract multichannel temporal embeddings. Then, several aggregation methods, including the weighted average, channelwise aggregation, and channel-and-depthwise aggregation, are implemented to exploit and integrate the multichannel temporal embeddings. Finally, a classification network is used to predict EEG categories based on the integrated features. Our work is the first to explore the use of pretrained speech models for EEG signal analysis as well as the effective ways to integrate the multichannel temporal embeddings from the EEG signal. Extensive experimental results suggest that the proposed Speech2EEG method achieves state-of-the-art performance on two challenging motor imagery (MI) datasets, the BCI IV-2a and BCI IV-2b datasets, with accuracies of 89.5% and 84.07%, respectively. Visualization analysis of the multichannel temporal embeddings shows that the Speech2EEG architecture can capture useful patterns related to MI categories, which can provide a novel solution for subsequent research under the constraints of a limited dataset scale. | |
dc.format | Print-Electronic | |
dc.language | eng | |
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | |
dc.relation | http://purl.org/au-research/grants/arc/DP210101093 | |
dc.relation | United States Department of the Navy N629091912058 | |
dc.relation | http://purl.org/au-research/grants/arc/DP220100803 | |
dc.relation.ispartof | IEEE Trans Neural Syst Rehabil Eng | |
dc.relation.isbasedon | 10.1109/TNSRE.2023.3268751 | |
dc.rights | info:eu-repo/semantics/openAccess | |
dc.subject | 0903 Biomedical Engineering, 0906 Electrical and Electronic Engineering | |
dc.subject.classification | Biomedical Engineering | |
dc.subject.classification | 4003 Biomedical engineering | |
dc.subject.classification | 4007 Control engineering, mechatronics and robotics | |
dc.subject.mesh | Humans | |
dc.subject.mesh | Speech | |
dc.subject.mesh | Imagination | |
dc.subject.mesh | Brain-Computer Interfaces | |
dc.subject.mesh | Neural Networks, Computer | |
dc.subject.mesh | Electroencephalography | |
dc.subject.mesh | Algorithms | |
dc.title | Speech2EEG: Leveraging Pretrained Speech Model for EEG Signal Recognition. | |
dc.type | Journal Article | |
utslib.citation.volume | 31 | |
utslib.location.activity | United States | |
utslib.for | 0903 Biomedical Engineering | |
utslib.for | 0906 Electrical and Electronic Engineering | |
pubs.organisational-group | University of Technology Sydney | |
pubs.organisational-group | University of Technology Sydney/Faculty of Engineering and Information Technology | |
pubs.organisational-group | University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute | |
pubs.organisational-group | University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science | |
utslib.copyright.status | open_access | * |
dc.date.updated | 2024-03-05T03:47:36Z | |
pubs.publication-status | Published | |
pubs.volume | 31 | |
Abstract:
Identifying meaningful brain activities is critical in brain-computer interface (BCI) applications. Recently, an increasing number of neural network approaches have been proposed to recognize EEG signals. However, these approaches depend heavily on using complex network structures to improve the performance of EEG recognition and suffer from the deficit of training data. Inspired by the waveform characteristics and processing methods shared between EEG and speech signals, we propose Speech2EEG, a novel EEG recognition method that leverages pretrained speech features to improve the accuracy of EEG recognition. Specifically, a pretrained speech processing model is adapted to the EEG domain to extract multichannel temporal embeddings. Then, several aggregation methods, including the weighted average, channelwise aggregation, and channel-and-depthwise aggregation, are implemented to exploit and integrate the multichannel temporal embeddings. Finally, a classification network is used to predict EEG categories based on the integrated features. Our work is the first to explore the use of pretrained speech models for EEG signal analysis as well as the effective ways to integrate the multichannel temporal embeddings from the EEG signal. Extensive experimental results suggest that the proposed Speech2EEG method achieves state-of-the-art performance on two challenging motor imagery (MI) datasets, the BCI IV-2a and BCI IV-2b datasets, with accuracies of 89.5% and 84.07%, respectively. Visualization analysis of the multichannel temporal embeddings shows that the Speech2EEG architecture can capture useful patterns related to MI categories, which can provide a novel solution for subsequent research under the constraints of a limited dataset scale.
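Illustrative sketch (not taken from the paper): the abstract names a weighted average as one way to integrate the per-channel temporal embeddings but does not give its form. The snippet below shows one plausible reading, in which each EEG channel has a scalar score, the scores are softmax-normalized, and the channel embeddings are summed with those weights. The function names, the channels x time steps x features layout, the softmax normalization, and the toy numbers are all assumptions made for illustration; the published method may differ.

```python
import math
from typing import List

# One channel's embedding: T time steps x D features (assumed layout).
Embedding = List[List[float]]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_average(channels: List[Embedding], logits: List[float]) -> Embedding:
    """Fuse per-channel embeddings (C x T x D) into a single T x D embedding.

    `logits` holds one score per channel. In a trained model these would be
    learned parameters; here they are supplied by hand (an assumption).
    """
    if len(channels) != len(logits):
        raise ValueError("need exactly one logit per channel")
    weights = softmax(logits)
    steps, dims = len(channels[0]), len(channels[0][0])
    fused = [[0.0] * dims for _ in range(steps)]
    for weight, channel in zip(weights, channels):
        for t in range(steps):
            for d in range(dims):
                fused[t][d] += weight * channel[t][d]
    return fused


# Toy input: 2 channels, 2 time steps, 2 features each (made-up numbers).
toy = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[3.0, 4.0], [5.0, 6.0]],
]
fused = weighted_average(toy, [0.0, 0.0])  # equal scores reduce to a plain mean
print(fused)
```

With equal scores the result is the element-wise mean of the channels; raising one channel's score shifts the fused embedding toward that channel. The channelwise and channel-and-depthwise variants mentioned in the abstract are not sketched here, since the record gives no detail on them.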
Please use this identifier to cite or link to this item: