IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, vol. 18, pp. 22513-22529, 2025 (SCI-Expanded)
This research addresses the challenges of feature distortion, poor scale adaptability, and high model complexity in the segmentation of photovoltaic panels from remote sensing images under non-ideal conditions such as rain and fog. To tackle these issues, a DMFA-DeepLab model integrating perception-driven enhancement is proposed. First, a physics-perception degradation data generation method, built on the CycleGAN framework, is developed to simulate the optical degradation caused by rain and fog, thereby enhancing the model's generalization capability in adverse environments. Second, a multiscale convolutional attention module (MCAM) is designed, which captures cross-scale features through heterogeneous convolution branches with receptive fields of 3, 5, and 9; the module further incorporates a channel-spatial dual-attention mechanism to dynamically focus on key regions while suppressing background interference. Based on MCAM, a multilevel feature aggregation network is constructed that sharpens boundary delineation through cross-level feature fusion. To obtain a lightweight model, a two-stage pruning and knowledge-distillation strategy is introduced: first, sparse training is performed and 30% of low-contribution channels are pruned according to their BN scaling factors; a second pruning pass of 20% raises the cumulative compression rate to 44%. Finally, knowledge distillation is applied, with the original model serving as the teacher to guide the pruned student model in recovering performance. Experimental results show that, on the augmented dataset simulating rain and fog environments, the full model achieves an MIoU of 93.17%, an improvement of 7.04% over the baseline DeepLabV3+. The lightweight DMFA-DeepLab model, while reducing the parameter count by 44%, restores the MIoU to 92.96% and increases inference speed by a factor of 2.3. Compared with mainstream models such as U-Net and PSPNet, DMFA-DeepLab delivers significantly better segmentation accuracy and robustness in complex environments, achieving an F1-score of 94.57%, 4.67 percentage points higher than the second-best model.
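The abstract's augmentation step can be pictured as an offline pass of a trained clean-to-degraded CycleGAN generator over the clean training set, with segmentation masks reused unchanged. The sketch below illustrates this idea; `generator` is a hypothetical placeholder for such a trained generator, and the paper's actual data pipeline may differ.

```python
# A minimal sketch of the perception-driven augmentation step: a CycleGAN
# generator trained to map clean scenes to rain/fog-degraded ones is applied
# offline to clean training images. `generator` is a hypothetical stand-in
# for the trained clean-to-degraded generator; it is not from the paper.
import torch
from torch.utils.data import DataLoader

@torch.no_grad()
def synthesize_degraded(generator: torch.nn.Module,
                        loader: DataLoader,
                        device: str = "cuda") -> list:
    """Run the clean->degraded generator over a dataset of (image, mask) pairs."""
    generator.eval().to(device)
    augmented = []
    for images, masks in loader:
        fakes = generator(images.to(device))       # simulated rain/fog imagery
        augmented.extend(zip(fakes.cpu(), masks))  # masks stay valid: the
                                                   # generator alters appearance,
                                                   # not scene geometry
    return augmented
```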
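A plausible reading of the MCAM description is sketched next in PyTorch. The abstract does not say how the receptive fields of 3, 5, and 9 are realized or which dual-attention design is used, so this sketch assumes dilated 3x3 convolutions (dilations 1, 2, 4 give those receptive fields) and a CBAM-style channel-then-spatial gating; the paper's exact configuration may differ.

```python
# A minimal PyTorch sketch of the multiscale convolutional attention module
# (MCAM). Dilated branches and CBAM-style dual attention are assumptions.
import torch
import torch.nn as nn

class MCAM(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Heterogeneous branches: dilations 1/2/4 give receptive fields 3/5/9.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in (1, 2, 4)
        ])
        self.fuse = nn.Conv2d(3 * channels, channels, 1, bias=False)
        # Channel attention: squeeze-and-excitation style gating.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: 7x7 conv over pooled channel statistics.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, 7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        multiscale = self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
        out = multiscale * self.channel_gate(multiscale)          # channel gating
        stats = torch.cat([out.mean(1, keepdim=True),
                           out.amax(1, keepdim=True)], dim=1)     # avg + max maps
        return out * self.spatial_gate(stats) + x                 # residual (assumed)

if __name__ == "__main__":
    y = MCAM(64)(torch.randn(1, 64, 32, 32))
    print(y.shape)  # torch.Size([1, 64, 32, 32])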
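Pruning on BN scaling factors typically follows the network-slimming recipe: an L1 penalty on the BatchNorm gamma parameters during sparse training, then removal of the channels with the smallest |gamma|. The sketch below follows that assumption with the abstract's ratios (30%, then 20% on the survivors); the paper's penalty weight and per-layer vs. global ranking are not specified.

```python
# A minimal sketch of the BN-scaling-factor pruning step (network-slimming
# style). The L1 weight and global channel ranking are assumptions.
import torch
import torch.nn as nn

def bn_l1_penalty(model: nn.Module, lam: float = 1e-4) -> torch.Tensor:
    """L1 sparsity term on BN scale factors; add this to the task loss."""
    return lam * sum(m.weight.abs().sum()
                     for m in model.modules() if isinstance(m, nn.BatchNorm2d))

def select_prune_masks(model: nn.Module, ratio: float) -> dict:
    """Per-BN boolean keep-masks, dropping the lowest-|gamma| channels globally."""
    gammas = torch.cat([m.weight.detach().abs().flatten()
                        for m in model.modules() if isinstance(m, nn.BatchNorm2d)])
    threshold = torch.quantile(gammas, ratio)   # global cut at the given ratio
    return {name: (m.weight.detach().abs() > threshold)
            for name, m in model.named_modules() if isinstance(m, nn.BatchNorm2d)}

# Usage sketch: during sparse training, loss = task_loss + bn_l1_penalty(model);
# then masks = select_prune_masks(model, 0.30) marks channels to remove (the
# matching conv filters must be physically sliced out), followed by fine-tuning
# and a second select_prune_masks(model, 0.20) pass on the pruned model.
```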
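The distillation stage, where the original model teaches the pruned student, is commonly implemented as a weighted sum of the hard-label segmentation loss and a temperature-scaled pixel-wise KL term on the logits. The sketch below assumes that standard formulation; the temperature and loss weight are illustrative values, not from the paper.

```python
# A minimal sketch of logit distillation for segmentation: the unpruned
# teacher's softened per-pixel class distribution supervises the student.
# T and alpha are illustrative assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      T: float = 4.0, alpha: float = 0.5) -> torch.Tensor:
    # Hard-label term: per-pixel cross entropy against ground-truth masks.
    ce = F.cross_entropy(student_logits, labels)
    # Soft-label term: KL between temperature-softened class distributions,
    # taken over the class dimension at every pixel; T*T rescales gradients.
    kl = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    return alpha * ce + (1.0 - alpha) * kl
```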