Disturbed-entropy: A simple data quality assessment approach


Li Y., Chao X., ERCİŞLİ S.

ICT EXPRESS, vol.8, no.3, pp.309-312, 2022 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 8 Issue: 3
  • Publication Date: 2022
  • DOI Number: 10.1016/j.icte.2022.01.006
  • Journal Name: ICT EXPRESS
  • Indexes Covering the Journal: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.309-312
  • Keywords: Neural network, Data-centric computing, Information entropy
  • Affiliated with Atatürk Üniversitesi: Yes

Abstract

From the perspective of information value, we propose a simple and effective approach to assessing data quality, called disturbed-entropy. Specifically, for the image classification task, the existing samples of each category are statistically summarized as a pixel prototype, which is used to disturb unseen samples. The entropy of each disturbed image is then calculated from its predicted class probabilities. Both numerical and visual experiments are conducted to show the effect. Under the same data budget, the performance difference between the selected good data and bad data is significant and consistent. This work attempts to gain insight into data quality and redundancy. © 2022 The Author(s). Published by Elsevier B.V. on behalf of The Korean Institute of Communications and Information Sciences.
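
The pipeline described in the abstract (per-category pixel prototype, disturbance of an unseen sample, entropy of the predicted probability) can be illustrated with a minimal sketch. The abstract does not specify the details, so everything below is an assumption made for illustration rather than the authors' method: the prototype is taken to be the pixel-wise mean of a category's images, the disturbance is a convex blend controlled by a hypothetical weight `alpha`, the entropy is the Shannon entropy of the class probabilities, and `predict_proba` stands in for any trained classifier that returns a probability vector. All function names are hypothetical.

```python
import numpy as np


def class_prototypes(images, labels, num_classes):
    """Pixel-wise mean image per category (assumed form of the pixel prototype)."""
    return np.stack(
        [images[labels == c].mean(axis=0) for c in range(num_classes)]
    )


def disturb(image, prototype, alpha=0.5):
    """Blend a sample with a prototype. The convex mix is an assumption;
    the abstract only states that the prototype disturbs the sample."""
    return (1.0 - alpha) * image + alpha * prototype


def shannon_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of a probability vector."""
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    return float(-(probs * np.log(probs)).sum())


def disturbed_entropy(image, prototype, predict_proba, alpha=0.5):
    """Entropy of the classifier's prediction on the disturbed image.
    `predict_proba` is a hypothetical callable: image -> class probabilities."""
    return shannon_entropy(predict_proba(disturb(image, prototype, alpha)))
```

Under these assumptions, each unseen sample receives a scalar score that could be used to rank samples when selecting data under a fixed budget. Whether high or low scores correspond to good data, and which category's prototype is applied to a given sample, are not stated in the abstract and are left open here.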