Environmental Science and Pollution Research, vol. 32, no. 10, pp. 5849-5873, 2025 (SCI-Expanded)
Although many studies develop forecasting models based on machine learning (ML) or statistical techniques, the large majority do not employ feature selection. To fill this gap, this study builds prediction models that incorporate feature selection for one-step-ahead estimation of climatological parameters (i.e., temperature and evapotranspiration). In addition, the best models are used to produce estimates over a long horizon of 30 years. Experiments on three stations located in the Van Lake Closed basin of Turkey showed that the Bayesian Ridge regressor (BRR) often outperforms the other regressors. The respective best models involving BRR achieved R² scores ranging from 0.961 to 0.988. Moreover, feature selection allows any model to match or exceed its baseline performance while using fewer features. Finally, the approach has a limitation: it needs non-sparse, complete time series data to produce satisfactory results, so applying our regression-based ML pipeline to a sparse time series dataset would be challenging.
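To make the one-step-ahead setup concrete, the following minimal sketch shows how a series might be turned into lagged features, how a simple filter-style feature selection could rank those lags, and how an R² score is computed. It is an illustration under stated assumptions, not the study's pipeline: the synthetic monthly series, the 12-lag window, the correlation-based ranking, and the helper names (`make_lagged`, `select_top_k`, `r2_score`) are all hypothetical, and a seasonal-naive predictor stands in for the regressors (such as BRR) that the study actually evaluates.

```python
import math

def make_lagged(series, n_lags):
    """Frame a series for one-step-ahead prediction.

    Each row holds the n_lags previous values (column 0 = lag 1,
    column n_lags - 1 = lag n_lags); the target is the next value.
    """
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append([series[t - k] for k in range(1, n_lags + 1)])
        y.append(series[t])
    return X, y

def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (v - mean_b) for x, v in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def select_top_k(X, y, k):
    """Filter-style selection: keep the k columns most correlated with y."""
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Assumption: a synthetic, complete (non-sparse) monthly series with a
# 12-month cycle, standing in for a temperature record.
series = [10.0 + 8.0 * math.sin(2.0 * math.pi * t / 12.0) for t in range(120)]

X, y = make_lagged(series, n_lags=12)
selected = select_top_k(X, y, k=2)          # column indices of the kept lags
selected_lags = [j + 1 for j in selected]   # convert to lag numbers

# Stand-in predictor (not BRR): repeat the value observed 12 steps earlier.
y_pred = [row[11] for row in X]
score = r2_score(y, y_pred)

print("selected lags:", selected_lags)
print("R2 of seasonal-naive stand-in:", round(score, 3))
```

In a fuller pipeline, the selected columns would be passed to a regressor, and the baseline would be the same regressor trained on all lags, so that the two R² scores can be compared. The sketch also relies on every time step being present, which mirrors the limitation noted above: gaps in the series would leave lag rows incomplete.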