Machine Learning Faculty Publications

BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models

Shibo Hao, University of California, San Diego
Bowen Tan, Carnegie Mellon University
Kaiwen Tang, University of California, San Diego
Bin Ni, University of California, San Diego
Xiyan Shao, University of California, San Diego
Hengzhe Zhang, University of California, San Diego
Eric P. Xing, Carnegie Mellon University
Zhiting Hu, University of California, San Diego

Document Type

Conference Proceeding

Publication Title

Proceedings of the Annual Meeting of the Association for Computational Linguistics

Abstract

It is crucial to automatically construct knowledge graphs (KGs) of diverse new relations to support knowledge discovery and broad applications. Previous KG construction methods, based on either crowdsourcing or text mining, are often limited to a small predefined set of relations due to manual cost or restrictions in the text corpus. Recent research has proposed using pretrained language models (LMs) as implicit knowledge bases that accept knowledge queries with prompts. Yet, the implicit knowledge lacks many desirable properties of a full-scale symbolic KG, such as easy access, navigation, editing, and quality assurance. In this paper, we propose a new approach for harvesting massive KGs of arbitrary relations from pretrained LMs. With minimal input of a relation definition (a prompt and a few example entity pairs), the approach efficiently searches in the vast entity pair space to extract diverse, accurate knowledge of the desired relation. We develop an effective search-and-rescore mechanism for improved efficiency and accuracy. We deploy the approach to harvest KGs of over 400 new relations from different LMs. Extensive human and automatic evaluations show our approach manages to extract diverse, accurate knowledge, including tuples of complex relations (e.g., "A is capable of but not good at B"). The resulting KGs, as a symbolic interpretation of the source LMs, also reveal new insights into the LMs' knowledge capacities.

First Page

5000

Last Page

5015

Publication Date

7-2023

Keywords

Computer aided language translation, Data mining, Harvesting, Knowledge graph, Natural language processing systems, Quality assurance

Comments

Preprint version from arXiv

Uploaded on June 20, 2024

Recommended Citation

S. Hao et al., "BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models," Proceedings of the Annual Meeting of the Association for Computational Linguistics, pp. 5000-5015, Jul. 2023.

Included in

Computer Sciences Commons
