Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
We study the problem of profiling news media on the Web with respect to their factuality of reporting and bias. This is an important but under-studied problem related to disinformation and “fake news” detection; it addresses the issue at a coarser granularity than looking at an individual article or an individual claim. This is useful because it allows entire media outlets to be profiled in advance. Unlike previous work, which has focused primarily on text (e.g., on the articles published by the target website, or on the textual description in their social media profiles or in Wikipedia), here we focus on modeling the similarity between media outlets based on the overlap of their audience. This is motivated by homophily considerations, i.e., the tendency of people to have connections to people with similar interests, which we extend to media, hypothesizing that similar types of media would be read by similar kinds of users. In particular, we propose GREENER (GRaph nEural nEtwork for News mEdia pRofiling), a model that builds a graph of inter-media connections based on their audience overlap, and then uses graph neural networks to represent each medium. We find that such representations are quite useful for predicting the factuality and the bias of news media outlets, yielding improvements over state-of-the-art results reported on two datasets. When augmented with conventionally used representations obtained from news articles, Twitter, YouTube, Facebook, and Wikipedia, we improve over previous work by 2.5-27 macro-F1 points absolute for the two tasks and datasets.
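To make the idea of an audience-overlap graph concrete, the following is a minimal sketch and not the GREENER implementation. It assumes Jaccard similarity as the overlap measure and uses a single unlearned, weighted mean-aggregation step as a stand-in for a graph neural network layer; the abstract specifies neither choice. The outlet names, user IDs, feature vectors, and the helper names `jaccard`, `build_overlap_graph`, and `propagate` are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap between two audience sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_overlap_graph(audiences, threshold=0.0):
    """Return {medium: {neighbor: weight}}, with an edge wherever overlap > threshold."""
    graph = {medium: {} for medium in audiences}
    for m1, m2 in combinations(audiences, 2):
        weight = jaccard(audiences[m1], audiences[m2])
        if weight > threshold:
            graph[m1][m2] = weight
            graph[m2][m1] = weight
    return graph


def propagate(graph, features):
    """One weighted mean-aggregation step over each node and its neighbors.

    Each node keeps its own features with weight 1 (a self-loop); this is a
    simplified, unlearned stand-in for a graph neural network layer.
    """
    updated = {}
    for node, neighbors in graph.items():
        total = 1.0 + sum(neighbors.values())
        aggregated = list(features[node])
        for neighbor, weight in neighbors.items():
            for i, value in enumerate(features[neighbor]):
                aggregated[i] += weight * value
        updated[node] = [x / total for x in aggregated]
    return updated


# Hypothetical toy data: which users read which outlet.
audiences = {
    "outlet_a": {"u1", "u2", "u3"},
    "outlet_b": {"u2", "u3", "u4"},
    "outlet_c": {"u9"},
}
# Hypothetical initial feature vectors for each outlet.
features = {
    "outlet_a": [1.0, 0.0],
    "outlet_b": [0.0, 1.0],
    "outlet_c": [0.5, 0.5],
}

graph = build_overlap_graph(audiences)
embeddings = propagate(graph, features)
```

In this toy example, `outlet_a` and `outlet_b` share two of four distinct readers, so they are linked and their representations move toward each other after propagation, while `outlet_c` shares no readers and keeps its original features. In a setup like the one the abstract describes, the resulting node representations would then be passed to a classifier for factuality or bias.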
P. Panayotov, U. Shukla, H.T. Sencar, M. Nabeel, and P. Nakov, "GREENER: Graph Neural Networks for News Media Profiling," in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, ACL, pp. 7470–7480, Dec 2022. doi:10.18653/v1/2022.emnlp-main.506