Technology and Health Care, 2026 (SCI-Expanded, Scopus)
Thalassemia is a hereditary blood disorder characterized by abnormal hemoglobin production. Common diagnostic methods include complete blood count, high-performance liquid chromatography, and hemoglobin electrophoresis. While physicians make the final diagnosis, advancements in artificial intelligence, specifically machine learning (ML) and deep learning, offer significant potential as auxiliary tools and decision support systems to reduce diagnostic errors. This study investigates ML algorithms for classifying thalassemia and its subtypes, including alpha (α) thalassemia and beta (β) thalassemia (minor, intermedia, and major). A synthetic training dataset of 1534 samples was generated based on the statistical properties and correlation structures of real clinical data. The models were then evaluated using an external real-world dataset of 349 patients from the Hematology Department of Atatürk University Research Hospital. Support Vector Machines (SVM), Logistic Regression (LR), XGBoost, Artificial Neural Networks (ANN), and a hybrid stacking model named ThalP were implemented. The ThalP model integrates the probability outputs of SVM, LR, and XGBoost through a neural network meta-classifier. Experimental results demonstrate that the proposed ThalP model achieved strong performance on the real clinical dataset with an accuracy of 83.1% and a macro-F1 score of 0.80. These findings indicate that ML-based hybrid models can serve as effective decision-support tools for classifying thalassemia subtypes using routine hematological parameters.
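The two reported metrics, accuracy and macro-F1, can be illustrated with a minimal sketch. The class names and the label lists below are hypothetical toy values chosen for illustration; they are not drawn from the study's data, and the helper functions `accuracy` and `macro_f1` are assumptions of this sketch rather than the authors' code. Macro-F1 is the unweighted mean of the per-class F1 scores, so each subtype contributes equally regardless of how many patients it contains.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all labels that occur."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # class p was predicted wrongly
            fn[t] += 1   # class t was missed
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels (illustration only, not study data)
y_true = ["normal", "normal", "alpha", "alpha",
          "beta_minor", "beta_minor", "beta_major", "beta_major"]
y_pred = ["normal", "normal", "alpha", "beta_minor",
          "beta_minor", "beta_minor", "beta_major", "alpha"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro-F1 = {macro_f1(y_true, y_pred):.3f}")
```

Because macro averaging weights every class equally, errors on a rare subtype lower macro-F1 as much as errors on a common one, which is one reason a macro-F1 score can sit below the overall accuracy on an imbalanced clinical dataset.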