Computers and Electrical Engineering, vol. 122, 2025 (SCI-Expanded)
Skin lesions are morphologically diverse, and their classification is challenging because of high inter-class similarity and large intra-class variation. To address this, an involution and soft attention based multimodal hybrid fusion network, ISAFusionNet, is proposed for automatic multi-label skin lesion classification. The proposed method is composed of two feature extraction branches and a hybrid fusion branch. The feature extraction branches use involution modules within multiple residual blocks to improve the visual representation of dermoscopy and clinical images. The hybrid fusion branch fuses the features of the two image modalities in a complementary manner at multiple layers and combines them with metadata features. This branch contains multiple soft attention modules that focus on the most relevant skin lesion areas. The proposed multimodal method is evaluated on the seven-point checklist dataset, where it achieves an average accuracy of 85.6% for multi-label classification, with average sensitivity, specificity, precision and AUC of 74.8%, 89%, 85.2% and 94.3%, respectively. These results indicate that the proposed ISAFusionNet improves the average accuracy by 3.13% compared to the existing state-of-the-art model. The involution and soft attention based deep multimodal hybrid fusion network thus yields satisfactory performance for the multi-label skin lesion classification problem.
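The averaged figures reported above (accuracy, sensitivity, specificity, precision) can be illustrated with a minimal sketch. It assumes each label is reduced to binary decisions and that the per-label scores are averaged with equal weight. The abstract does not state the exact averaging protocol, so the function names `binary_metrics` and `macro_average` and the toy data are hypothetical and do not reproduce the authors' evaluation code.

```python
from typing import Dict, List, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, specificity and precision for one binary label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num: int, den: int) -> float:
        # Return 0.0 when the denominator is empty (an assumed convention).
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # recall on the positive class
        "specificity": ratio(tn, tn + fp),  # recall on the negative class
        "precision": ratio(tp, tp + fp),
    }


def macro_average(per_label: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each metric over labels with equal weight per label."""
    return {
        key: sum(m[key] for m in per_label) / len(per_label)
        for key in per_label[0]
    }


# Toy data for two hypothetical labels (not taken from the dataset).
label_a = binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 0, 1])
label_b = binary_metrics([0, 0, 1, 1], [0, 0, 1, 1])
averaged = macro_average([label_a, label_b])
```

With equal-weight averaging, a label with few positive samples influences the averaged sensitivity as much as a frequent one, which is one reason average sensitivity can sit well below average accuracy on imbalanced data.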