Document Type

Conference Proceeding

Publication Title

CEUR Workshop Proceedings

Abstract

This paper addresses the challenge of detecting political bias in news articles and media outlets, as posed in CheckThat! Lab Task 3 [1, 2], by proposing an automated method for classifying them as left, center, or right-leaning. As mass media consumption continues to grow, the ability to identify bias in news reporting is crucial, given the potential societal impact of unaddressed political bias. To tackle this issue, we present an approach that employs machine learning techniques to detect political leaning in news media and articles. At the article level, our model, CatBoost, is evaluated on a diverse dataset comprising over 55,000 news articles sourced from AllSides. At the medium level, we aggregate each model's predictions across the news items published by a single medium using a majority voting system. Our dataset, gathered and annotated from over 1,000 popular online platforms as rated by Media Bias/Fact Check, categorizes political bias as left, center, or right. We have approximately ten articles from each of these platforms, yielding over 8,000 articles in total. We employ both CatBoost and CatBoost OF for media-level classification. These models detect political ideology effectively across various media sources, and our CatBoost model proves robust in handling diverse data. Our findings suggest that majority voting at the medium level improves model performance. We also highlight the importance of addressing class imbalance and using balanced data splits to enhance model performance. For article-level classification using CatBoost, we achieve a Mean Absolute Error (MAE) of 0.270, an F1 score of 0.690, and an accuracy of 0.694. For media-level classification, we achieve a competitive MAE of 0.320, and with the majority voting classifier, our model attains an F1 score of 0.727 and an accuracy of 0.725.
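
The medium-level aggregation and the MAE metric mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the ordinal encoding (left = 0, center = 1, right = 2), the function names, the outlet names, and the tie-breaking rule (the smallest tied label wins) are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict

# Assumed ordinal encoding of the three classes (not taken from the paper).
LABELS = {"left": 0, "center": 1, "right": 2}


def majority_vote(article_preds):
    """Aggregate article-level predictions into one label per medium.

    article_preds: iterable of (medium, label) pairs, label being an int.
    Returns a dict mapping each medium to its most frequent label.
    Ties go to the smallest label, which is an arbitrary assumption.
    """
    by_medium = defaultdict(list)
    for medium, label in article_preds:
        by_medium[medium].append(label)

    result = {}
    for medium, labels in by_medium.items():
        counts = Counter(labels)
        top = max(counts.values())
        result[medium] = min(l for l, c in counts.items() if c == top)
    return result


def mean_absolute_error(y_true, y_pred):
    """Average absolute distance between ordinal labels."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical article-level predictions for two made-up outlets.
preds = [
    ("outlet_a", LABELS["left"]),
    ("outlet_a", LABELS["left"]),
    ("outlet_a", LABELS["right"]),
    ("outlet_b", LABELS["center"]),
    ("outlet_b", LABELS["right"]),
    ("outlet_b", LABELS["right"]),
]
gold = {"outlet_a": LABELS["left"], "outlet_b": LABELS["center"]}

medium_preds = majority_vote(preds)  # {"outlet_a": 0, "outlet_b": 2}
media = sorted(gold)
mae = mean_absolute_error(
    [gold[m] for m in media], [medium_preds[m] for m in media]
)  # 0.5
```

Because the labels are ordinal, MAE penalizes a left/right confusion (distance 2) more heavily than a left/center confusion (distance 1), which accuracy and F1 do not distinguish.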

First Page

289

Last Page

305

Publication Date

9-2023

Keywords

Automated methods; F1 scores; Mass media; Mean absolute error; Media consumption; Media outlets; Modeling performance; News articles; News media; Political bias

Comments

Open Access version from ceur-ws

CC-BY

Uploaded on May 30, 2024
