International Journal of Intelligent Transportation Systems Research, 2026 (ESCI, Scopus)
Road traffic accidents remain a major global public safety concern, making accurate prediction of accident severity essential for effective transportation planning and risk mitigation. Traditional statistical approaches often fail to capture complex nonlinear relationships among accident-related factors, while existing machine learning studies frequently lack robust feature engineering and class-imbalance handling strategies. To address these limitations, this study proposes an integrated machine learning framework based on advanced feature engineering techniques and applies it to the UK STATS19 dataset. The proposed approach combines second-order polynomial feature expansion to model nonlinear interactions, Principal Component Analysis (PCA) for dimensionality reduction and multicollinearity mitigation, and the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance. In addition, higher-order statistical features, including skewness and kurtosis, are incorporated to enhance the representation of underlying data distributions. A comprehensive set of machine learning models is evaluated within this framework. Experimental results show that the baseline model achieves an accuracy of 79.63% without any feature engineering. After applying SMOTE, Random Forest accuracy improves significantly to 91.72%, with a Macro F1-score of 91.66% and a Matthews correlation coefficient (MCC) of 0.8764. Sensitivity analysis reveals that the best configuration (SMOTE with k=3) achieves 92.51% accuracy and a 92.51% Macro F1-score. The inclusion of polynomial features further enhances predictive performance by capturing subtle nonlinear relationships not represented in the original feature space. A systematic ablation study, sensitivity analysis, and non-parametric statistical significance tests confirm the robustness of the proposed framework.
Rather than introducing entirely new algorithms, this study demonstrates that the effective integration of well-established techniques can substantially improve accident severity prediction. The proposed framework provides a robust and interpretable approach for transportation safety analysis and offers valuable insights for policymakers to support data-driven decision-making, targeted interventions, and efficient resource allocation.
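
To make the oversampling step concrete, the following is a minimal sketch of the interpolation idea behind SMOTE, written in plain Python. It is an illustration under stated assumptions, not the implementation used in this study: the function name `smote_oversample`, its parameters, and the toy data are hypothetical, and a production pipeline would more likely rely on an established library. Each synthetic sample is placed at a random position on the line segment between a minority-class sample and one of its k nearest minority-class neighbours (k=3 here, mirroring the best configuration reported above).

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration only; not the study's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    # A point cannot have more neighbours than there are other points.
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point somewhere on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two numeric features (illustrative values only).
minority_samples = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_oversample(minority_samples, n_new=6, k=3)
```

Because every synthetic point is a convex combination of two existing minority samples, it always lies within the range already spanned by that class. In practice, oversampling of this kind is normally applied to the training split only, so that synthetic samples do not leak into the evaluation data.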