Natural Language Processing Faculty Publications

PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze Pre-training

Zihui Gu, Renmin University of China
Ju Fan, Renmin University of China
Nan Tang, Qatar Computing Research Institute
Preslav Nakov, Mohamed Bin Zayed University of Artificial Intelligence
Xiaoman Zhao, Renmin University of China
Xiaoyong Du, Renmin University of China

Document Type

Conference Proceeding

Publication Title

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022

Abstract

Fact verification has attracted a lot of research attention recently, e.g., in journalism, marketing, and policymaking, as misinformation and disinformation online can sway one's opinion and affect one's actions. While fact-checking is a hard task in general, in many cases, false statements can be easily debunked based on analytics over tables with reliable information. Hence, table-based fact verification has recently emerged as an important and growing research area. Yet, progress has been limited due to the lack of datasets that can be used to pre-train language models (LMs) to be aware of common table operations, such as aggregating a column or comparing tuples. To bridge this gap, in this paper we introduce PASTA, a novel state-of-the-art framework for table-based fact verification via pre-training with synthesized sentence-table cloze questions. In particular, we design six types of common sentence-table cloze tasks, including Filter, Aggregation, Superlative, Comparative, Ordinal, and Unique, based on which we synthesize a large corpus consisting of 1.2 million sentence-table pairs from WikiTables. PASTA uses a recent pre-trained LM, DeBERTaV3, and further pretrains it on our corpus. Our experimental results show that PASTA achieves new state-of-the-art performance on two table-based fact verification benchmarks: TabFact and SEM-TAB-FACTS. In particular, on the complex set of TabFact, which contains multiple operations, PASTA largely outperforms the previous state of the art by 4.7 points (85.6% vs. 80.9%), and the gap between PASTA and human performance on the small TabFact test set is narrowed to just 1.5 points (90.6% vs. 92.1%).
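To make the idea of a sentence-table cloze question concrete, the following is a minimal sketch, not the authors' synthesis pipeline. It assumes a toy table stored as a list of dictionaries and a "[MASK]" placeholder, and it illustrates only two of the six task types named in the abstract (Aggregation and Superlative). The function names, sentence templates, and table contents are hypothetical and chosen purely for illustration.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical representation: a table is a list of rows, each a column->value dict.
Table = List[Dict[str, Any]]

# Assumed placeholder token for the masked span of the sentence.
MASK = "[MASK]"


def aggregation_cloze(table: Table, column: str) -> Tuple[str, str]:
    """Build a sum-aggregation cloze; the masked answer is the column total."""
    total = sum(row[column] for row in table)
    return f"The sum of {column} is {MASK}.", str(total)


def superlative_cloze(table: Table, key: str, column: str) -> Tuple[str, str]:
    """Build a superlative cloze; the masked answer is the key of the max row."""
    best = max(table, key=lambda row: row[column])
    return f"The {key} with the highest {column} is {MASK}.", str(best[key])


# Toy table invented for this example (not taken from WikiTables).
toy_table: Table = [
    {"team": "A", "points": 10},
    {"team": "B", "points": 25},
    {"team": "C", "points": 7},
]

agg_sentence, agg_answer = aggregation_cloze(toy_table, "points")
sup_sentence, sup_answer = superlative_cloze(toy_table, "team", "points")

print(agg_sentence, "->", agg_answer)
print(sup_sentence, "->", sup_answer)
```

In this sketch, a model that fills in the masked span has to carry out the table operation (summing a column, finding the maximum row) rather than copy a single cell, which is the kind of operation awareness the abstract describes.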

First Page

4971

Last Page

4983

DOI

10.18653/v1/2022.emnlp-main.331

Publication Date

12-2022

Comments

Archived with thanks to ACL Anthology

Preprint License: CC BY 4.0

Uploaded 27 November 2023

Recommended Citation

Z. Gu, J. Fan, N. Tang, P. Nakov, X. Zhao, and X. Du, "PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze Pre-training", in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, ACL, pp. 4971–4983, Dec 2022. doi:10.18653/v1/2022.emnlp-main.331

Additional Links

Publisher version link: https://aclanthology.org/2022.emnlp-main.331/

Included in

Artificial Intelligence and Robotics Commons
