Enhanced forensic speaker verification using multi-run ICA in the presence of environmental noise and reverberation conditions

Publication Type:
Conference Proceeding
Citation:
Proceedings of the 2017 IEEE International Conference on Signal and Image Processing Applications, ICSIPA 2017, 2017, pp. 174 - 179
Issue Date:
2017-01-01
Files in This Item:
Filename      Description        Size       Format
08120601.pdf  Published version  148.79 kB  Adobe PDF
© 2017 IEEE. The performance of forensic speaker verification degrades severely under high levels of environmental noise and reverberation. Multiple-channel speech enhancement algorithms are a possible way to reduce the effect of environmental noise on noisy speech signals. Although multiple-channel speech enhancement algorithms such as multi-run independent component analysis (ICA) have been used in previous studies to improve recognition performance in biosignal applications, the effectiveness of the multi-run ICA algorithm for improving noisy forensic speaker verification under reverberation conditions has not yet been investigated. In this paper, the multi-run ICA algorithm is used to enhance noisy speech signals: the fast ICA algorithm is run several times, each run generating a mixing matrix, and the mixing matrix with the highest signal-to-interference ratio (SIR) is chosen. A wavelet-based mel frequency cepstral coefficient (MFCC) feature warping approach is then applied to the enhanced speech signals to extract features that are robust to environmental noise and reverberation. State-of-the-art intermediate vector (i-vector) and probabilistic linear discriminant analysis (PLDA) are used as the classifier. Experimental results show that forensic speaker verification based on the multi-run ICA algorithm improves the equal error rate (EER) by 60.88%, 51.84% and 66.15% over the baseline noisy speaker verification when the enrolment speech signals are reverberated at 0.15 sec and the test speech signals are mixed with STREET, CAR and HOME noises, respectively, at -10 dB signal-to-noise ratio (SNR).
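The multi-run selection step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how the SIR is computed, so the sketch assumes a simulated setting in which the clean reference signal is known, and it scores each run with a simple projection-based SIR. It also assumes an instantaneous (non-reverberant) two-channel mixture and a basic symmetric FastICA with a tanh nonlinearity. All function names (`whiten`, `fast_ica`, `sir_db`, `multi_run_ica`, `make_demo_mixture`) and the synthetic Laplacian "speech-like" source are hypothetical and chosen only for illustration.

```python
import numpy as np

def whiten(x):
    """Centre and whiten mixtures x of shape (n_channels, n_samples)."""
    x = x - x.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.cov(x))
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T @ x

def sym_decorrelate(w):
    """Symmetric decorrelation: W <- (W W^T)^(-1/2) W."""
    s, u = np.linalg.eigh(w @ w.T)
    return u @ np.diag(s ** -0.5) @ u.T @ w

def fast_ica(z, rng, n_iter=200, tol=1e-6):
    """Symmetric FastICA (tanh nonlinearity) on whitened data z."""
    n = z.shape[0]
    w = sym_decorrelate(rng.standard_normal((n, n)))  # random start per run
    for _ in range(n_iter):
        y = np.tanh(w @ z)
        g_prime = 1.0 - y ** 2
        # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w, row by row.
        w_new = (y @ z.T) / z.shape[1] - np.diag(g_prime.mean(axis=1)) @ w
        w_new = sym_decorrelate(w_new)
        converged = np.max(np.abs(np.abs(np.diag(w_new @ w.T)) - 1.0)) < tol
        w = w_new
        if converged:
            break
    return w @ z  # estimated sources, one per row

def sir_db(estimate, reference):
    """Projection-based SIR in dB (assumes the clean reference is known)."""
    reference = reference - reference.mean()
    estimate = estimate - estimate.mean()
    target = (estimate @ reference) / (reference @ reference) * reference
    interference = estimate - target
    return 10.0 * np.log10(np.sum(target ** 2) / np.sum(interference ** 2))

def multi_run_ica(mixtures, reference, n_runs=5, seed=0):
    """Run FastICA n_runs times and keep the estimate with the highest SIR."""
    z = whiten(mixtures)
    best_sir, best_estimate, sirs = -np.inf, None, []
    for run in range(n_runs):
        rng = np.random.default_rng(seed + run)  # different init each run
        sources = fast_ica(z, rng)
        run_sirs = [sir_db(y, reference) for y in sources]
        k = int(np.argmax(run_sirs))  # component closest to the reference
        sirs.append(run_sirs[k])
        if run_sirs[k] > best_sir:
            best_sir, best_estimate = run_sirs[k], sources[k]
    return best_estimate, best_sir, sirs

def make_demo_mixture(n_samples=10000, seed=0):
    """Synthetic two-channel mixture of a speech-like source and noise."""
    rng = np.random.default_rng(seed)
    speech_like = rng.laplace(size=n_samples)       # super-Gaussian stand-in
    noise = rng.uniform(-3.0, 3.0, size=n_samples)  # stand-in for noise
    mixing = np.array([[1.0, 0.8], [0.6, 1.0]])     # assumed mixing matrix
    return mixing @ np.vstack([speech_like, noise]), speech_like

if __name__ == "__main__":
    mixtures, reference = make_demo_mixture()
    _, best_sir, sirs = multi_run_ica(mixtures, reference)
    print("SIR per run (dB):", np.round(sirs, 2))
    print("Selected run SIR (dB):", round(float(best_sir), 2))
```

In this toy setting the runs differ only in their random initialisation, so their SIRs tend to be close to one another. The selection step matters more when individual FastICA runs converge to different solutions, which is the situation multi-run ICA is meant to address.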