Use of Different Variants of Item Response Theory-Based Feature Selection Method for Text Categorization


ÇOBAN Ö.

International Conference on Theoretical and Applied Computer Science and Engineering, İstanbul, Türkiye, 30 September - 1 October 2022

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI Number: 10.1109/ictacse50438.2022.10009653
  • City of Publication: İstanbul
  • Country of Publication: Türkiye
  • Keywords: machine learning, text categorization, item response theory, feature selection
  • Affiliated with Atatürk University: No

Abstract

In this study, we investigate the performance of the item response theory (IRT)-based feature selection (FS) approach on eight text datasets, considering different feature sets and weighting schemes. We also include its recently introduced variants in our evaluation. The results of our extensive experiments show that the IRT-based FS approach often matches or improves on the classification F-score of its well-known peers, doing so by selecting a larger number of features. The recently introduced variants, on the other hand, often fall behind IRT1 and IRT2 on the task of text categorization.