Machine Learning Faculty Publications

Linear Classifier: An Often-Forgotten Baseline for Text Classification

Yu Chen Lin, National Taiwan University
Si An Chen, National Taiwan University
Jie Jyun Liu, National Taiwan University
Chih Jen Lin, National Taiwan University & Mohamed bin Zayed University of Artificial Intelligence

Document Type

Conference Proceeding

Publication Title

Proceedings of the Annual Meeting of the Association for Computational Linguistics

Abstract

Large-scale pre-trained language models such as BERT are popular solutions for text classification. Due to the superior performance of these advanced methods, nowadays, people often directly train them for a few epochs and deploy the obtained model. In this opinion paper, we point out that this way may only sometimes get satisfactory results. We argue the importance of running a simple baseline like linear classifiers on bag-of-words features along with advanced methods. First, for many text data, linear methods show competitive performance, high efficiency, and robustness. Second, advanced models such as BERT may only achieve the best results if properly applied. Simple baselines help to confirm whether the results of advanced models are acceptable. Our experimental results fully support these points.

First Page

1876

Last Page

1888

DOI

10.18653/v1/2023.acl-short.160

Publication Date

7-2023

Keywords

Computational linguistics, Text processing

Comments

Archived with thanks to ACL Anthology

License: CC BY 4.0

Uploaded 23 January 2024

Recommended Citation

Y.C. Lin, S.A. Chen, J.J. Liu, and C.J. Lin, "Linear Classifier: An Often-Forgotten Baseline for Text Classification", In Proceedings of the 61st Annual Meeting of the Assoc. for Comp. Linguistics, ACL, vol 2: Short Papers, pp. 1876–1888, July 2023. doi:10.18653/v1/2023.acl-short.160

Additional Links

DOI link: https://doi.org/10.18653/v1/2023.acl-short.160

Included in

Artificial Intelligence and Robotics Commons
