Learning to Co-Embed Queries and Documents

Document Type

Article

Publication Title

Electronics (Switzerland)

Abstract

Learning to Rank (L2R) methods, which use machine learning techniques to solve ranking problems, have been widely studied in the field of information retrieval. Existing methods usually concatenate query and document features as training input without explicitly modeling the relevance between queries and documents, especially in pairwise ranking approaches. Thus, it is an interesting question whether we can devise an algorithm that effectively describes the relation between queries and documents to learn a better ranking model without incurring a large parameter cost. In this paper, we present a Gaussian Embedding model for Ranking (GERank), an architecture that co-embeds queries and documents so that each query or document is represented by a Gaussian distribution with a mean and a variance. GERank optimizes an energy-based loss within the pairwise ranking framework and uses the KL-divergence to measure the relevance between queries and documents. Experimental results on two LETOR datasets and one TREC dataset demonstrate that our model markedly improves ranking performance compared with state-of-the-art retrieval models.
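
The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the exact loss, the covariance structure, or the direction of the KL-divergence, so the code below assumes diagonal covariances, computes KL(document || query), and uses a margin-based hinge as one common form of energy-based pairwise loss. The names `kl_diag_gaussians`, `pairwise_hinge_loss`, and `margin` are hypothetical.

```python
import math
from typing import Sequence, Tuple

# A Gaussian embedding as (mean, variance) per dimension.
# Diagonal covariance is an assumption made for this sketch.
Gaussian = Tuple[Sequence[float], Sequence[float]]


def kl_diag_gaussians(p: Gaussian, q: Gaussian) -> float:
    """Closed-form KL(p || q) for two diagonal Gaussians of equal dimension."""
    mu_p, var_p = p
    mu_q, var_q = q
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp)
    return 0.5 * total


def pairwise_hinge_loss(query: Gaussian, doc_pos: Gaussian,
                        doc_neg: Gaussian, margin: float = 1.0) -> float:
    """Hinge loss on KL 'energies': lower energy means more relevant.

    The KL direction (document || query) and the hinge form are
    assumptions; the abstract only says the loss is energy-based
    and pairwise.
    """
    energy_pos = kl_diag_gaussians(doc_pos, query)
    energy_neg = kl_diag_gaussians(doc_neg, query)
    return max(0.0, margin + energy_pos - energy_neg)


if __name__ == "__main__":
    query = ([0.0, 0.0], [1.0, 1.0])
    relevant = ([0.1, 0.0], [1.0, 1.0])
    irrelevant = ([2.0, 0.0], [1.0, 1.0])
    print(pairwise_hinge_loss(query, relevant, irrelevant))
    print(pairwise_hinge_loss(query, irrelevant, relevant))
```

In this sketch the loss is zero once the relevant document's energy is lower than the irrelevant document's energy by at least `margin`, and positive otherwise. Because KL-divergence is asymmetric, swapping its arguments would give a different relevance score.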

DOI

10.3390/electronics11223694

Publication Date

11-11-2022

Keywords

ad hoc retrieval, Gaussian embedding, learning to rank

Comments

IR Deposit conditions:

OA version: Accepted version

No embargo

Published source must be acknowledged
