Text classification-based petition recognition and routing system: a Turkish case study


Sancar Y., Karabey Aksakallı I., Karacalı T.

International Journal of Information Technology (Singapore), vol.15, no.4, pp.2139-2146, 2023 (Scopus)

Abstract

Large-scale enterprises comprise many units, and authorized staff manually forward electronic petitions to the relevant units. This delivery process is time-consuming in large institutions, and petitions may not reach the relevant units on time. Categorizing petitions manually, and relaying the responses to the recipients in the same manual way, delays business processes. The fact that administrators cannot track petitions shows the need for a petition recognition system that automatically directs each petition to the relevant unit according to its content. In this study, electronic petitions sent from any unit of the institution are processed through a petition recognition and routing system that directs each petition to the relevant unit according to its subject. In this system, a printed document is scanned using OCR (Optical Character Recognition) techniques, and the characters are extracted from the digitized petitions. After the pre-processing and feature extraction phases, the petitions are categorized using various machine learning classification methods, and the proposed routing system automatically selects the most successful classification method to direct the petitions to the relevant units. The experimental results show that the proposed petition recognition and routing system classifies petitions with an accuracy of 0.951 and a macro-averaged F-score (f-macro) of 0.94 using the Stochastic Gradient Descent classifier with the bag-of-words TF-IDF (BoWtfidf) vectorization method. The performance of the proposed petition classification and routing system is reasonable for the end users. Based on our investigation, this study is the first in its area that contributes a novel petition benchmark dataset and addresses the petition classification problem by combining OCR, natural language processing, and machine learning techniques.
The novel petition dataset is expected to pave the way for further research in petition text classification.
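
The abstract reports accuracy and f-macro and states that the system picks the most successful classifier automatically. The following minimal sketch shows how these two metrics can be computed and how a best candidate might be chosen from them. It is an illustration only: the unit labels, classifier names, and predictions are hypothetical, and ranking candidates by macro-averaged F-score is an assumption, since the abstract does not say which criterion the authors use.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of petitions assigned to the correct unit."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted unit p, but it was wrong
            fn[t] += 1  # true unit t was missed
    # Per-class F1 = 2*TP / (2*TP + FP + FN)
    scores = [2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]) for c in labels]
    return sum(scores) / len(scores)


def select_best(y_true, predictions):
    """Return (name, score) of the candidate with the highest macro-F1 (assumed criterion)."""
    scored = {name: macro_f1(y_true, y_pred) for name, y_pred in predictions.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical held-out labels (target units) and per-classifier predictions.
y_true = ["hr", "it", "finance", "hr", "it", "finance"]
predictions = {
    "sgd": ["hr", "it", "finance", "hr", "it", "hr"],
    "naive_bayes": ["hr", "hr", "finance", "it", "it", "hr"],
}

best_name, best_score = select_best(y_true, predictions)
print(best_name, round(best_score, 3), round(accuracy(y_true, predictions[best_name]), 3))
```

Macro averaging weights every unit equally regardless of how many petitions it receives, which is why it can differ from accuracy when the units are imbalanced.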