Classification models combined with Boruta feature selection for heart disease prediction
- Publisher: Elsevier
- Publication Type: Journal Article
- Citation: Informatics in Medicine Unlocked, 2024, 44, pp. 101442
- Issue Date: 2024-01-01
Open Access
This item is open access.
Cardiovascular disease (CVD), commonly called heart disease, is a collective term for conditions that affect the heart and blood vessels. It is a leading cause of death and morbidity worldwide, accounting for 18 million deaths per year. Identifying the people most vulnerable to heart disease and ensuring they receive appropriate care can prevent premature death. Machine learning algorithms are now crucial in the medical field, especially for diagnosing diseases from medical databases. Such algorithms and data processing techniques are applied to predict various diseases and offer considerable potential for accurate heart disease prognosis. This study therefore compares the performance of logistic regression, decision tree, and support vector machine (SVM) methods with and without Boruta feature selection. The investigation used the Cleveland Clinic Heart Disease Dataset acquired from Kaggle, which consists of 14 features and 303 instances. The Boruta feature selection algorithm, which selected six of the most relevant features, improved the results of the algorithms. Among the classification algorithms, logistic regression produced the best result, with an accuracy of 88.52%.
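The shadow-feature idea behind Boruta can be illustrated with a minimal sketch. This is not the study's pipeline. Boruta normally scores features with random-forest importances and decides with a statistical test; the sketch below swaps in absolute Pearson correlation as the importance score and a simple hit-rate threshold as the decision rule, and it runs on synthetic data rather than the Cleveland dataset. The function names, column names, and parameters (`n_rounds`, `min_hit_rate`) are hypothetical and chosen only for illustration.

```python
import random


def abs_corr(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score
    (assumption: real Boruta uses random-forest importances instead)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def shadow_feature_select(columns, labels, n_rounds=50, min_hit_rate=0.9, seed=0):
    """Keep features that beat the best shuffled ("shadow") feature in most rounds."""
    rng = random.Random(seed)
    real_scores = {name: abs_corr(col, labels) for name, col in columns.items()}
    hits = dict.fromkeys(columns, 0)
    for _ in range(n_rounds):
        # Shadow features: shuffled copies whose link to the labels is destroyed.
        shadow_max = 0.0
        for col in columns.values():
            shadow = list(col)
            rng.shuffle(shadow)
            shadow_max = max(shadow_max, abs_corr(shadow, labels))
        # A real feature scores a hit when it outperforms every shadow feature.
        for name, score in real_scores.items():
            if score > shadow_max:
                hits[name] += 1
    # Simplified decision rule (assumption): a fixed hit-rate threshold.
    return [name for name in columns if hits[name] / n_rounds >= min_hit_rate]


# Synthetic stand-in data: two informative columns and three pure-noise columns.
rng = random.Random(42)
n_rows = 303
names = ("signal_a", "signal_b", "noise_1", "noise_2", "noise_3")
columns = {name: [rng.gauss(0, 1) for _ in range(n_rows)] for name in names}
labels = [
    1 if a + b + 0.5 * rng.gauss(0, 1) > 0 else 0
    for a, b in zip(columns["signal_a"], columns["signal_b"])
]

selected = shadow_feature_select(columns, labels)
print(selected)
```

On this synthetic data the two informative columns should be kept, while the noise columns are usually rejected because they rarely outscore the best shadow feature. The retained subset would then be passed to each classifier, and accuracy compared against training on all features.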