Enhanced Forensic Speaker Verification Using a Combination of DWT and MFCC Feature Warping in the Presence of Noise and Reverberation Conditions

Al-Ali, AKH; Dean, D; Senadji, B; Chandran, V; Naik, GR

Enhanced Forensic Speaker Verification Using a Combination of DWT and MFCC Feature Warping in the Presence of Noise and Reverberation Conditions

Al-Ali, AKH Dean, D Senadji, B Chandran, V Naik, GR

Permalink

Publication Type:: Journal Article
Citation:: IEEE Access, 2017, 5 pp. 15400 - 15413
Issue Date:: 2017-07-19

Open Access

Copyright Clearance Process

Recently Added
In Progress
Open Access

This item is open access.

Adobe PDF

Download Published VersionAdobe PDF (8.16 MB)

View on publisher's site

View statistics

Full metadata record

Field	Value	Language
dc.contributor.author	Al-Ali, AKH	en_US
dc.contributor.author	Dean, D	en_US
dc.contributor.author	Senadji, B	en_US
dc.contributor.author	Chandran, V	en_US
dc.contributor.author	Naik, GR https://orcid.org/0000-0003-1790-9838	en_US
dc.date.issued	2017-07-19	en_US
dc.identifier.citation	IEEE Access, 2017, 5 pp. 15400 - 15413	en_US
dc.identifier.uri	http://hdl.handle.net/10453/124599
dc.description.abstract	© 2013 IEEE. Environmental noise and reverberation conditions severely degrade the performance of forensic speaker verification. Robust feature extraction plays an important role in improving forensic speaker verification performance. This paper investigates the effectiveness of combining features, mel frequency cepstral coefficients (MFCCs), and MFCC extracted from the discrete wavelet transform (DWT) of the speech, with and without feature warping for improving modern identity-vector (i-vector)-based speaker verification performance in the presence of noise and reverberation. The performance of i-vector speaker verification was evaluated using different feature extraction techniques: MFCC, feature-warped MFCC, DWT-MFCC, feature-warped DWT-MFCC, a fusion of DWT-MFCC and MFCC features, and fusion feature-warped DWT-MFCC and feature-warped MFCC features. We evaluated the performance of i-vector speaker verification using the Australian Forensic Voice Comparison and QUT-NOISE databases in the presence of noise, reverberation, and noisy and reverberation conditions. Our results indicate that the fusion of feature-warped DWT-MFCC and feature-warped MFCC is superior to other feature extraction techniques in the presence of environmental noise under the majority of signal-to-noise ratios (SNRs), reverberation, and noisy and reverberation conditions. At 0-dB SNR, the performance of the fusion of feature-warped DWT-MFCC and feature-warped MFCC approach achieves a reduction in average equal error rate of 21.33%, 20.00%, and 13.28% over feature-warped MFCC, respectively, in the presence of various types of environmental noises only, reverberation, and noisy and reverberation environments. The approach can be used for improving the performance of forensic speaker verification and it may be utilized for preparing legal evidence in court.	en_US
dc.relation.ispartof	IEEE Access	en_US
dc.relation.isbasedon	10.1109/ACCESS.2017.2728801	en_US
dc.title	Enhanced Forensic Speaker Verification Using a Combination of DWT and MFCC Feature Warping in the Presence of Noise and Reverberation Conditions	en_US
dc.type	Journal Article
utslib.citation.volume	5	en_US
utslib.for	0906 Electrical and Electronic Engineering	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	open_access
pubs.publication-status	Published	en_US
pubs.volume	5	en_US

Abstract:

© 2013 IEEE. Environmental noise and reverberation conditions severely degrade the performance of forensic speaker verification. Robust feature extraction plays an important role in improving forensic speaker verification performance. This paper investigates the effectiveness of combining features, mel frequency cepstral coefficients (MFCCs), and MFCC extracted from the discrete wavelet transform (DWT) of the speech, with and without feature warping for improving modern identity-vector (i-vector)-based speaker verification performance in the presence of noise and reverberation. The performance of i-vector speaker verification was evaluated using different feature extraction techniques: MFCC, feature-warped MFCC, DWT-MFCC, feature-warped DWT-MFCC, a fusion of DWT-MFCC and MFCC features, and fusion feature-warped DWT-MFCC and feature-warped MFCC features. We evaluated the performance of i-vector speaker verification using the Australian Forensic Voice Comparison and QUT-NOISE databases in the presence of noise, reverberation, and noisy and reverberation conditions. Our results indicate that the fusion of feature-warped DWT-MFCC and feature-warped MFCC is superior to other feature extraction techniques in the presence of environmental noise under the majority of signal-to-noise ratios (SNRs), reverberation, and noisy and reverberation conditions. At 0-dB SNR, the performance of the fusion of feature-warped DWT-MFCC and feature-warped MFCC approach achieves a reduction in average equal error rate of 21.33%, 20.00%, and 13.28% over feature-warped MFCC, respectively, in the presence of various types of environmental noises only, reverberation, and noisy and reverberation environments. The approach can be used for improving the performance of forensic speaker verification and it may be utilized for preparing legal evidence in court.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/124599