Development of novel automated language classification model using pyramid pattern technique with speech signals

Akbal, E; Barua, PD; Tuncer, T; Dogan, S; Acharya, UR

Publisher: Springer Nature
Publication Type: Journal Article
Citation: Neural Computing and Applications, 2022, 34, (23), pp. 21319-21333
Issue Date: 2022-12-01

Closed Access

Filename	Description	Size	Format
s00521-022-07613-7.pdf	Published version	1.21 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Akbal, E
dc.contributor.author	Barua, PD
dc.contributor.author	Tuncer, T
dc.contributor.author	Dogan, S
dc.contributor.author	Acharya, UR
dc.date.accessioned	2023-06-30T04:33:29Z
dc.date.available	2023-06-30T04:33:29Z
dc.date.issued	2022-12-01
dc.identifier.citation	Neural Computing and Applications, 2022, 34, (23), pp. 21319-21333
dc.identifier.issn	0941-0643
dc.identifier.issn	1433-3058
dc.identifier.uri	http://hdl.handle.net/10453/171036
dc.description.abstract	Language classification from speech is a complex problem in machine learning and pattern recognition. Various text- and image-based language classification methods have been presented, but speech-based methods remain limited in the literature. Moreover, previously presented models classified only a limited number of languages, and few addressed accents. This work presents an automated handcrafted language classification model. A novel pyramid pattern is presented for feature extraction, and statistical features and maximum pooling are also used to generate features. We developed our speech-language classification model using two datasets: (i) a new large speech dataset that we created, containing 14,500 speech samples in 29 languages, and (ii) the VoxForge dataset. Neighborhood component analysis is used to select the 1000 most informative of the generated features, and these features are classified using a quadratic support vector machine (QSVM) classifier. The developed method yielded accuracies of 98.87 ± 0.30% and 97.12 ± 1.27% on our dataset and the VoxForge dataset, respectively. Geometric mean, average precision, and F1-score are also calculated and presented in the results section. This paper presents an accurate language classification model developed using two large speech-language datasets. Our results indicate that the proposed pyramid pattern-based method classifies a variety of spoken languages accurately.
dc.language	en
dc.publisher	Springer Nature
dc.relation.ispartof	Neural Computing and Applications
dc.relation.isbasedon	10.1007/s00521-022-07613-7
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0801 Artificial Intelligence and Image Processing, 0906 Electrical and Electronic Engineering, 1702 Cognitive Sciences
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Development of novel automated language classification model using pyramid pattern technique with speech signals
dc.type	Journal Article
utslib.citation.volume	34
utslib.for	0801 Artificial Intelligence and Image Processing
utslib.for	0906 Electrical and Electronic Engineering
utslib.for	1702 Cognitive Sciences
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Civil and Environmental Engineering
utslib.copyright.status	closed_access	*
dc.date.updated	2023-06-30T04:33:28Z
pubs.issue	23
pubs.publication-status	Published
pubs.volume	34
utslib.citation.issue	23

Abstract:

Language classification from speech is a complex problem in machine learning and pattern recognition. Various text- and image-based language classification methods have been presented, but speech-based methods remain limited in the literature. Moreover, previously presented models classified only a limited number of languages, and few addressed accents. This work presents an automated handcrafted language classification model. A novel pyramid pattern is presented for feature extraction, and statistical features and maximum pooling are also used to generate features. We developed our speech-language classification model using two datasets: (i) a new large speech dataset that we created, containing 14,500 speech samples in 29 languages, and (ii) the VoxForge dataset. Neighborhood component analysis is used to select the 1000 most informative of the generated features, and these features are classified using a quadratic support vector machine (QSVM) classifier. The developed method yielded accuracies of 98.87 ± 0.30% and 97.12 ± 1.27% on our dataset and the VoxForge dataset, respectively. Geometric mean, average precision, and F1-score are also calculated and presented in the results section. This paper presents an accurate language classification model developed using two large speech-language datasets. Our results indicate that the proposed pyramid pattern-based method classifies a variety of spoken languages accurately.
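
The abstract mentions that maximum pooling and statistical features are used to generate features, but this record does not define the pyramid pattern itself or list the exact statistics. The following is therefore only a minimal sketch of the generic idea, not the authors' method: a 1-D signal is repeatedly max-pooled, and a few descriptive statistics are computed at each level and concatenated. The function names (`max_pool`, `stat_features`, `multilevel_features`), the window width of 2, the choice of six statistics, and the toy signal are all assumptions made for illustration.

```python
import statistics


def max_pool(signal, width=2):
    """Non-overlapping max pooling: keep the largest value in each full window."""
    return [
        max(signal[i:i + width])
        for i in range(0, len(signal) - width + 1, width)
    ]


def stat_features(signal):
    """A small, hypothetical set of statistical descriptors for one signal."""
    return [
        statistics.fmean(signal),      # mean
        statistics.pstdev(signal),     # population standard deviation
        min(signal),
        max(signal),
        statistics.median(signal),
        sum(x * x for x in signal),    # energy
    ]


def multilevel_features(signal, levels=3):
    """Concatenate statistics of the raw signal and of each pooled level."""
    features = stat_features(signal)
    current = list(signal)
    for _ in range(levels):
        current = max_pool(current)
        if not current:                # signal too short to pool further
            break
        features.extend(stat_features(current))
    return features


# Toy signal (assumed values, not taken from either dataset).
samples = [0.1, -0.4, 0.3, 0.9, -0.2, 0.0, 0.5, -0.7]
vector = multilevel_features(samples, levels=2)
print(len(vector))
```

In a full pipeline of the kind the abstract describes, vectors like this would be produced for every recording, reduced by a feature selector, and passed to a classifier; those stages are omitted here because the record gives no further detail on their configuration.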

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/171036