Analysis of Order Cancellation Rates and Feature Weighting with Machine Learning: An E-Commerce Case Study


IRMAK T.

7th International Conference on Intelligent and Fuzzy Systems, INFUS 2025, İstanbul, Türkiye, 29-31 July 2025, vol. 1531 LNNS, pp. 763-770 (Full-Text Paper)

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume: 1531 LNNS
  • DOI: 10.1007/978-3-031-98304-7_82
  • City of Publication: İstanbul
  • Country of Publication: Türkiye
  • Pages: pp. 763-770
  • Keywords: E-commerce, Feature Importance, Machine Learning
  • Affiliated with Atatürk University: Yes

Abstract

In the e-commerce sector, identifying the weights of the factors that influence cancellation rates, and forecasting those rates in advance, can help businesses make better-informed, data-driven decisions. This study analyzes real sales data from an e-commerce company that operates in Turkey and sells fashion jewelry and silver accessories on major online marketplaces. The weights of the factors affecting product cancellation rates were determined using XGBoost and CatBoost feature importance, Permutation Importance, and SHAP (Shapley Additive Explanations). In addition, several machine learning algorithms, including boosting models, Decision Tree, and Support Vector Regressor, were compared on their ability to predict cancellation rates. The CatBoost model achieved the highest performance across all metrics, providing the most accurate predictions with an R² score of 0.9986. In the feature importance analyses, SHAP, CatBoost, and Permutation Importance all identified Customer_Cancelled_Order_Quantity as the most influential feature, whereas the XGBoost model ranked Net_Sales_Quantity as the most significant. The findings suggest that reducing order cancellations requires optimizing pricing strategies, strengthening inventory management, and improving customer-focused processes.
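To make one of the methods named in the abstract concrete, the following is a minimal sketch of Permutation Importance scored with R². It is not the authors' code. The data is synthetic, the feature names are hypothetical, and `toy_model` is an assumed stand-in for a trained regressor such as the CatBoost model described above. The idea is to shuffle one feature column at a time and record how far the R² score falls.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in R² when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = r2_score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving the other columns untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = r2_score(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Synthetic data (assumption): three hypothetical features, one of them irrelevant.
feature_names = ["feature_a", "feature_b", "feature_c"]
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 10) for _ in feature_names] for _ in range(200)]
y = [3.0 * a + 0.5 * b + data_rng.gauss(0, 0.1) for a, b, _ in X]


def toy_model(row):
    """Hypothetical stand-in for a fitted regressor; it ignores feature_c."""
    return 3.0 * row[0] + 0.5 * row[1]


baseline, importances = permutation_importance(toy_model, X, y)
print(f"baseline R2 = {baseline:.4f}")
for name, value in sorted(zip(feature_names, importances), key=lambda p: -p[1]):
    print(f"{name}: mean R2 drop = {value:.4f}")
```

Because the method only needs a prediction function, the same loop applies to any fitted model. A feature the model never uses, such as `feature_c` here, produces no drop in score, while the feature with the largest effect on predictions produces the largest drop.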