On Improving Automated Detection of Cyber-Bully in Social Networks with Constrained Datasets: A Hierarchical Deep Learning Approach

Document Type

Conference Proceeding

Publication Title

IEEE International Conference on Communications

Abstract

In recent years, online users, particularly in social networks, have witnessed an upsurge in racism, sexism, and other types of aggressive and cyberbullying content, often manifested through offensive, abusive, or hateful speech and harassment. This can cause severe physical and psychological stress in young children and adolescents, in some cases even leading to suicide, and can negatively affect social policies. Therefore, there is a significant need to identify and regulate harassing content posted on the Internet in a smart, automated, and accurate manner. With this aim, in this paper, we design and develop a hierarchical framework that arranges machine learning algorithms in order of increasing computational complexity and adaptively switches among them to detect hateful and abusive content efficiently. We combine simple machine learning models, such as Naive Bayes and Logistic Regression classifiers, with customized calibration and Expectation-Maximization (EM) algorithms, and compare them with the much stronger deep learning techniques. With a relatively small Twitter dataset, our proposed hierarchical framework significantly improves the automated detection of abusive content in social networks compared with its deep learning-based counterpart, the Bidirectional Encoder Representations from Transformers (BERT) model, whose training typically requires a much larger volume of labeled documents to detect abusive comments. © 2022 IEEE.
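The idea of switching among models ordered by computational cost can be illustrated with a minimal sketch. The abstract does not state the switching rule, so the confidence threshold used here is an assumption, and `cascade_predict`, `cheap_scorer`, `costly_scorer`, and `STAGES` are hypothetical names with toy scorers standing in for trained classifiers. This is not the authors' implementation.

```python
from typing import Callable, Sequence, Tuple

# A stage is (name, scorer, threshold). The scorer returns an estimate of
# P(abusive | text) in [0, 1]; the threshold is the minimum confidence
# needed to accept that stage's answer without escalating further.
Scorer = Callable[[str], float]
Stage = Tuple[str, Scorer, float]


def cascade_predict(text: str, stages: Sequence[Stage]) -> Tuple[str, bool, float]:
    """Run stages cheapest-first and stop at the first confident one.

    Returns (stage_name, is_abusive, probability). The last stage always
    answers, whatever its confidence. The threshold rule is an assumption.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    for index, (name, scorer, threshold) in enumerate(stages):
        prob = scorer(text)
        # Confidence is the probability of the more likely class.
        confidence = max(prob, 1.0 - prob)
        is_last = index == len(stages) - 1
        if confidence >= threshold or is_last:
            return name, prob >= 0.5, prob
    raise AssertionError("unreachable: the last stage always returns")


def cheap_scorer(text: str) -> float:
    """Toy stand-in for a calibrated lightweight classifier (hypothetical)."""
    words = set(text.lower().split())
    if words & {"hate", "idiot"}:
        return 0.9
    if words & {"you", "your"}:
        return 0.6  # ambiguous: low confidence, so the cascade escalates
    return 0.1


def costly_scorer(text: str) -> float:
    """Toy stand-in for an expensive model such as BERT (hypothetical)."""
    return 0.8 if "loser" in text.lower() else 0.2


STAGES = [
    ("cheap", cheap_scorer, 0.85),
    ("costly", costly_scorer, 0.0),
]

if __name__ == "__main__":
    for sample in ["i hate this", "you are a loser", "nice weather today"]:
        print(sample, "->", cascade_predict(sample, STAGES))
```

In this sketch the expensive stage runs only on inputs the cheap stage is unsure about, which is one plausible way to trade accuracy against compute. How the paper's calibration and EM steps feed into the switching decision is not described in the abstract.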

First Page

1746

Last Page

1751

DOI

10.1109/ICC45855.2022.9838544

Publication Date

8-11-2022

Keywords

Bidirectional Encoder Representations from Transformers (BERT), calibration, Cyberbully, Expectation-Maximization (EM), Naive Bayes, racism, Automation, Classifiers, Computational efficiency, Deep learning, Learning algorithms, Learning systems, Machinery, Maximum principle, Signal encoding, Social networking (online)

Comments

IR Deposit conditions: non-described
