Detection and Cross-domain Evaluation of Cyberbullying in Facebook Activity Contents for Turkish


Coban Ö., ÖZEL S. A., Inan A.

ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, vol.22, no.4, 2023 (SCI-Expanded)

Abstract

Cyberbullying refers to bullying and harassment of defenseless or vulnerable people such as children, teenagers, and women through any means of communication (e.g., e-mail, text messages, wall posts, tweets) over any online medium (e.g., social media, blogs, online games, virtual reality environments). The effects of cyberbullying may be severe and irreversible, and it has become one of the major problems of cyber-societies in today's electronic world. Prevention of cyberbullying activities, as well as the development of timely response mechanisms, requires automated and accurate detection of cyberbullying acts. This study focuses on the problem of cyberbullying detection over Facebook activity content written in Turkish. Through extensive experiments with various machine learning and deep learning algorithms, the best estimator for the task is chosen and then employed for both cross-domain evaluation and profiling of cyber-aggressive users. The results, obtained with fivefold cross-validation, are evaluated with the macro-averaged F1 score. These results show that BERT is the best estimator, with an average macro F1 of 0.928, and that employing it on various datasets collected from different OSN domains produces highly satisfactory results. This article also reports detailed profiling of cyber-aggressive users, providing more information than is visible to the naked eye.
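
The evaluation metric named in the abstract, macro F1 averaged over cross-validation folds, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `macro_f1` and `mean_macro_f1` are hypothetical, the labels are toy values rather than data from the study, and the sketch assumes that each fold supplies a pair of true and predicted label lists.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mean_macro_f1(folds):
    """Average macro F1 over (y_true, y_pred) pairs, one pair per fold."""
    return sum(macro_f1(t, p) for t, p in folds) / len(folds)


# Toy labels (assumed for illustration): 1 = bullying, 0 = not bullying
toy_folds = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([0, 1], [0, 1]),
]
average_score = mean_macro_f1(toy_folds)
```

Because macro averaging weights every class equally, a classifier that ignores the minority class is penalized even when the majority class dominates the data.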