IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, vol. 72, no. 1, pp. 1562-1573, 2026 (SCI-Expanded, Scopus)
Consumer-grade agricultural UAV systems are increasingly adopted for crop monitoring, yet their perception systems remain highly vulnerable to AI-enabled attacks and short-term environmental disturbances. To address this, we propose MTCF-TaGN, an adversarial defense framework that leverages a short-term multi-modal temporal dataset synchronized across RGB, multi-spectral, and thermal infrared sequences. The proposed MTCF-Net captures temporal dependencies and adaptively fuses multi-modal features to mitigate the effects of wind, illumination, and occlusion. To evaluate robustness, the TaGN module generates digital, physical, and cross-modal adversarial samples, while the PANDA module employs reinforcement learning for real-time adversarial detection and filtering. Experiments demonstrate that MTCF-TaGN improves detection robustness, achieving 12.4% higher mAP and 15.7% fewer false positives under strong adversarial conditions, with inference latency below 80 ms, which makes it suitable for real-time UAV deployment. This work highlights a practical solution for enhancing both accuracy and security resilience in agricultural UAVs and, more broadly, provides insights into securing multi-modal consumer devices against adversarial threats.