Using Double-Density Dual Tree Wavelet Transform into MFCC for Noisy Speech Recognition

Publisher: IEEE
Publication Type: Conference Proceeding
Citation: 2020 12th International Conference on Information Technology and Electrical Engineering (ICITEE), 2020, 00, pp. 302-306
Issue Date: 2020-12-01
Automatic speech recognition has made significant progress, both in the underlying technology and in its many applications. However, speech fluctuations caused by noise significantly reduce recognition accuracy, and generating correct word sequences is harder on noisy channels than in a clean environment. Extracting meaningful acoustic information from noisy speech utterances remains a challenging task. We therefore present a combination of Mel-frequency cepstral coefficients (MFCC) and a double-density dual-tree wavelet transform denoising algorithm for recognizing noisy speech utterances. A hybrid frame-level cross-entropy deep neural network-hidden Markov model (DNN-HMM) is used for acoustic modeling. In a suite of experiments, the proposed denoising method provides better performance without affecting accuracy at higher sound intensity levels. Experimental results demonstrate that recognition accuracy reaches up to 96.6% at 10 dB, 91.84% at 5 dB, 78.05% at 0 dB, and 49.37% at -5 dB.
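The denoise-then-extract-features idea in the abstract can be illustrated with a small sketch. The abstract does not give the transform's filters or the thresholding rule, so this sketch makes loud assumptions: it swaps in a plain multi-level Haar wavelet transform in place of the double-density dual-tree transform, and it uses soft thresholding with the universal threshold and a median-based noise estimate. All function names are hypothetical, and MFCC extraction, which would run on the denoised signal, is not shown.

```python
import math
import random
import statistics


def haar_dwt(x):
    """One level of the orthonormal Haar DWT; len(x) must be even."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(x, levels=3):
    """Haar soft-threshold denoising (a stand-in, not the paper's transform).

    len(x) must be divisible by 2**levels.
    """
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    # Assumed rule: noise level from the median absolute finest-scale
    # detail coefficient, then the universal threshold sigma*sqrt(2 ln N).
    sigma = statistics.median(abs(c) for c in details[0]) / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(len(x)))
    details = [soft_threshold(d, t) for d in details]
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx


def add_noise_at_snr(clean, snr_db_target, rng):
    """Add white Gaussian noise scaled so the mix has the requested SNR."""
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    p_clean = sum(c * c for c in clean) / len(clean)
    p_noise = sum(n * n for n in noise) / len(noise)
    scale = math.sqrt(p_clean / (p_noise * 10.0 ** (snr_db_target / 10.0)))
    return [c + scale * n for c, n in zip(clean, noise)]


def snr_db(clean, estimate):
    """SNR in dB of an estimate measured against the clean reference."""
    err = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    return 10.0 * math.log10(sum(c * c for c in clean) / err)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic slowly varying tone standing in for a speech frame.
    clean = [math.sin(2 * math.pi * 4 * n / 1024) for n in range(1024)]
    for level in (10.0, 5.0, 0.0, -5.0):  # SNR levels named in the abstract
        noisy = add_noise_at_snr(clean, level, rng)
        print(level, snr_db(clean, noisy), snr_db(clean, denoise(noisy)))
```

The synthetic tone is far smoother than real speech, so the SNR gains this toy prints say nothing about the recognition accuracies reported above; it only shows where a wavelet denoising stage sits ahead of feature extraction.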