Monaural Speech Enhancement on Drone via Adapter Based Transfer Learning

Chen, X; Bi, H; Lai, W-T; Ma, F

Publisher: IEEE
Publication Type: Conference Proceeding
Citation: 2024 18th International Workshop on Acoustic Signal Enhancement (IWAENC), 2024, 00, pp. 85-89
Issue Date: 2024-10-04

Closed Access

	Filename	Description	Size	Format
	Conference Monaural speech enhancement on drone via Adapter based transfer learning.pdf	Published version	4.62 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Chen, X https://orcid.org/0000-0001-5711-7996
dc.contributor.author	Bi, H
dc.contributor.author	Lai, W-T
dc.contributor.author	Ma, F
dc.date	2024-09-09
dc.date.accessioned	2025-02-03T23:03:37Z
dc.date.available	2025-02-03T23:03:37Z
dc.date.issued	2024-10-04
dc.identifier.citation	2024 18th International Workshop on Acoustic Signal Enhancement (IWAENC), 2024, 00, pp. 85-89
dc.identifier.isbn	979-8-3503-6186-5
dc.identifier.issn	2639-4316
dc.identifier.uri	http://hdl.handle.net/10453/184878
dc.description.abstract	Monaural speech enhancement on drones is challenging because the ego noise from the rotating motors and propellers leads to extremely low signal-to-noise ratios at onboard microphones. Although recent masking-based deep neural network methods excel in monaural speech enhancement, they struggle in the challenging drone noise scenario. Furthermore, existing drone noise datasets are limited, causing models to overfit. Considering the harmonic nature of drone noise, this paper proposes a frequency-domain bottleneck adapter to enable transfer learning. Specifically, the adapter's parameters are trained on drone noise while retaining the parameters of the pre-trained Frequency Recurrent Convolutional Recurrent Network (FRCRN) fixed. Evaluation results demonstrate the proposed method can effectively enhance speech quality. Moreover, it is a more efficient alternative to fine-tuning models for various drone types, which requires substantial computational resources.
dc.language	en
dc.publisher	IEEE
dc.relation.ispartof	2024 18th International Workshop on Acoustic Signal Enhancement (IWAENC)
dc.relation.ispartof	2024 18th International Workshop on Acoustic Signal Enhancement
dc.relation.ispartofseries	International Workshop on Acoustic Signal Enhancement
dc.relation.isbasedon	10.1109/iwaenc61483.2024.10694014
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Monaural Speech Enhancement on Drone via Adapter Based Transfer Learning
dc.type	Conference Proceeding
utslib.citation.volume	00
utslib.location.activity	Aalborg, Denmark
pubs.organisational-group	University of Technology Sydney
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology/School of Mechanical and Mechatronic Engineering
utslib.copyright.status	closed_access	*
dc.date.updated	2025-02-03T23:03:35Z
pubs.finish-date	2024-09-12
pubs.place-of-publication	Piscataway, USA
pubs.publication-status	Published
pubs.start-date	2024-09-09
pubs.volume	00
dc.location	Piscataway, USA

Abstract:

Monaural speech enhancement on drones is challenging because the ego noise from the rotating motors and propellers leads to extremely low signal-to-noise ratios at onboard microphones. Although recent masking-based deep neural network methods excel in monaural speech enhancement, they struggle in the challenging drone noise scenario. Furthermore, existing drone noise datasets are limited, causing models to overfit. Considering the harmonic nature of drone noise, this paper proposes a frequency-domain bottleneck adapter to enable transfer learning. Specifically, the adapter's parameters are trained on drone noise while retaining the parameters of the pre-trained Frequency Recurrent Convolutional Recurrent Network (FRCRN) fixed. Evaluation results demonstrate the proposed method can effectively enhance speech quality. Moreover, it is a more efficient alternative to fine-tuning models for various drone types, which requires substantial computational resources.
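
The abstract does not give the adapter's architecture, so the following is only a minimal sketch of the general idea, assuming a generic residual bottleneck adapter applied along the frequency axis. The class name `FrequencyBottleneckAdapter`, the sizes (257 frequency bins, bottleneck width 16), and the random matrix standing in for a frozen pre-trained layer are all hypothetical and are not taken from the paper or from FRCRN.

```python
import numpy as np


class FrequencyBottleneckAdapter:
    """Generic residual bottleneck over the frequency axis (illustrative only)."""

    def __init__(self, num_freq_bins, bottleneck_dim, rng):
        # Down-projection to a narrow bottleneck, small random init.
        self.w_down = rng.normal(0.0, 0.02, size=(num_freq_bins, bottleneck_dim))
        self.b_down = np.zeros(bottleneck_dim)
        # Up-projection starts at zero, so the adapter is initially an
        # identity map and does not disturb the pre-trained model.
        self.w_up = np.zeros((bottleneck_dim, num_freq_bins))
        self.b_up = np.zeros(num_freq_bins)

    def __call__(self, features):
        # features has shape (time_frames, num_freq_bins).
        hidden = np.maximum(features @ self.w_down + self.b_down, 0.0)  # ReLU
        return features + hidden @ self.w_up + self.b_up  # residual connection

    def num_parameters(self):
        arrays = (self.w_down, self.b_down, self.w_up, self.b_up)
        return sum(a.size for a in arrays)


rng = np.random.default_rng(0)
num_freq_bins, bottleneck_dim, time_frames = 257, 16, 10

# Hypothetical stand-in for one frozen pre-trained layer; it is never updated.
frozen_backbone_weight = rng.normal(size=(num_freq_bins, num_freq_bins))

adapter = FrequencyBottleneckAdapter(num_freq_bins, bottleneck_dim, rng)

spectrogram = rng.normal(size=(time_frames, num_freq_bins))
backbone_out = spectrogram @ frozen_backbone_weight
adapted_out = adapter(backbone_out)

print("adapter parameters:", adapter.num_parameters())
print("stand-in backbone layer parameters:", frozen_backbone_weight.size)
```

In a transfer-learning setup of this kind, only the four adapter arrays would receive gradient updates on drone-noise data, while the backbone weights stay fixed. Because the adapter is much smaller than the layer it follows, a separate adapter could in principle be stored per drone type at low cost.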

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/184878