IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 7597-7610, 2026 (SCI-Expanded, Scopus)
Vegetation indices (VIs), such as the normalized difference vegetation index (NDVI) and the normalized difference red edge index (NDRE), derived from multispectral imagery are crucial for crop health monitoring in precision agriculture. However, the acquisition of key bands, particularly the near-infrared (NIR) and red-edge (RE) bands, relies on expensive, specialized multispectral sensors, which significantly limits their application in resource-constrained scenarios. To address this issue, this study explores cross-band generation methods to produce high-quality NIR and RE bands from widely available RGB images. Existing methods often prioritize pixel-level accuracy while neglecting the semantic utility of the generated bands for downstream tasks, and they are susceptible to background interference. To overcome these limitations, we propose a perception-driven generation framework. First, we perform vegetation segmentation on raw images using the excess green vegetation index to construct a purified dataset of cotton vegetation pixels, which directs the model's focus toward learning the core spectral mapping relationships of vegetation. Leveraging prior knowledge from semantic segmentation, we introduce a selective reconstruction process that prioritizes detailed spectral reconstruction in vegetation regions while simplifying non-critical background areas. This strategy enhances computational efficiency without compromising the quality of key information. Subsequently, we design a transformer-based multi-channel spectral fusion model incorporating a spectral-spatial attention mechanism that uses strip convolution to adaptively fuse channel features and concentrate on key spatial regions. The model is optimized using a composite loss function that combines perceptual loss and pixel-level loss, ensuring the generated bands are both spectrally accurate and semantically rich.
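The excess green (ExG) segmentation step mentioned above can be sketched as follows. This is a minimal illustration, not the paper's pipeline: ExG is the standard index 2g - r - b computed on chromatic (sum-normalized) RGB coordinates, and the fixed threshold here is illustrative rather than a value reported by the authors (many implementations instead use Otsu thresholding).

```python
import numpy as np

def exg_vegetation_mask(rgb, threshold=0.1):
    """Mask vegetation pixels with the excess green index, ExG = 2g - r - b.

    `rgb` is an H x W x 3 float array in [0, 1]. `threshold` is an
    illustrative cutoff (assumption), not a value from the paper.
    """
    total = rgb.sum(axis=-1, keepdims=True) + 1e-8  # avoid division by zero
    r, g, b = np.moveaxis(rgb / total, -1, 0)       # chromatic coordinates
    exg = 2.0 * g - r - b
    return exg > threshold

# Toy example: a green pixel is kept, a neutral gray pixel is rejected.
img = np.array([[[0.1, 0.8, 0.1],
                 [0.5, 0.5, 0.5]]])
mask = exg_vegetation_mask(img)
```

In a purified training set of the kind described, pixels where the mask is False would be excluded (or reconstructed only coarsely) so the generator concentrates on vegetation spectra.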
Experiments on a cotton image dataset collected by a DJI Mavic 3 multispectral drone demonstrate that the proposed method generates more visually realistic NIR and RE bands than existing approaches. Most importantly, the classification accuracy for cotton health status using VIs derived from the generated data is significantly improved, outperforming methods that use only RGB data and approaching the performance achieved with real multispectral data. This study provides a cost-effective solution for high-precision crop monitoring under resource constraints.
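For reference, the two VIs named in the abstract are standard normalized band ratios, so once NIR and RE bands have been generated from RGB they can be computed directly. A minimal sketch (the reflectance values below are illustrative, not taken from the paper's dataset):

```python
import numpy as np

def ndvi(nir, red, eps=1e-8):
    """Normalized difference vegetation index: (NIR - R) / (NIR + R)."""
    return (nir - red) / (nir + red + eps)

def ndre(nir, re, eps=1e-8):
    """Normalized difference red edge index: (NIR - RE) / (NIR + RE)."""
    return (nir - re) / (nir + re + eps)

# Illustrative reflectances for a single canopy pixel (assumed values).
v_ndvi = ndvi(np.float64(0.6), np.float64(0.2))  # -> 0.5
v_ndre = ndre(np.float64(0.6), np.float64(0.3))  # -> ~0.333
```

Both functions accept full band arrays as well as scalars, so per-pixel index maps for the health classifier follow from passing the generated NIR/RE rasters and the measured red band.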