Learning to Co-Embed Queries and Documents
Document Type
Article
Publication Title
Electronics (Switzerland)
Abstract
Learning to Rank (L2R) methods, which apply machine learning techniques to ranking problems, have been widely studied in the field of information retrieval. Existing methods usually concatenate query and document features as training input, without explicitly modeling the relevance between queries and documents, especially in pairwise ranking approaches. This raises the question of whether we can devise an algorithm that effectively describes the relation between queries and documents and learns a better ranking model without incurring huge parameter costs. In this paper, we present a Gaussian Embedding model for Ranking (GERank), an architecture for co-embedding queries and documents, in which each query or document is represented by a Gaussian distribution with a mean and a variance. GERank optimizes an energy-based loss within the pairwise ranking framework, and KL divergence is used to measure the relevance between queries and documents. Experimental results on two LETOR datasets and one TREC dataset demonstrate that our model achieves a substantial improvement in ranking performance over state-of-the-art retrieval models.
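To make the abstract's core idea concrete, below is a minimal sketch of how KL divergence between diagonal-covariance Gaussian embeddings can serve as an energy (relevance) score inside a pairwise ranking loss. This is an illustrative assumption, not the paper's exact formulation: the function names, the hinge margin, the 4-dimensional toy embeddings, and the use of diagonal covariances are all hypothetical choices for demonstration.

```python
import numpy as np

def kl_divergence_diag(mu_q, var_q, mu_d, var_d):
    """KL( N(mu_q, var_q) || N(mu_d, var_d) ) for diagonal-covariance Gaussians.

    Lower KL divergence is interpreted here as higher query-document relevance.
    """
    return 0.5 * np.sum(
        var_q / var_d
        + (mu_d - mu_q) ** 2 / var_d
        - 1.0
        + np.log(var_d) - np.log(var_q)
    )

def pairwise_energy_loss(mu_q, var_q, pos_doc, neg_doc, margin=1.0):
    """Hinge-style pairwise loss (an assumed energy-based variant): the
    relevant document should have lower energy (KL from the query) than the
    irrelevant one by at least `margin`."""
    e_pos = kl_divergence_diag(mu_q, var_q, *pos_doc)
    e_neg = kl_divergence_diag(mu_q, var_q, *neg_doc)
    return max(0.0, margin + e_pos - e_neg)

# Toy example with 4-dimensional Gaussian embeddings.
rng = np.random.default_rng(0)
mu_q, var_q = rng.normal(size=4), np.abs(rng.normal(size=4)) + 0.1
doc_pos = (mu_q + 0.1 * rng.normal(size=4), var_q.copy())          # close to the query
doc_neg = (rng.normal(size=4), np.abs(rng.normal(size=4)) + 0.1)   # unrelated document
print(pairwise_energy_loss(mu_q, var_q, doc_pos, doc_neg))
```

In a learned model, the means and variances would be trainable parameters (or outputs of an encoder) updated by minimizing this loss over query/document pairs, whereas here they are fixed random vectors purely to show the computation.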
DOI
10.3390/electronics11223694
Publication Date
11-11-2022
Keywords
ad hoc retrieval, Gaussian embedding, learning to rank
Recommended Citation
Y. Wu, B. Lu, L. Tian, and S. Liang, “Learning to Co-Embed Queries and Documents,” Electronics, vol. 11, no. 22, p. 3694, Nov. 2022, doi: 10.3390/electronics11223694.
Comments
IR Deposit conditions:
OA version: Accepted version
No embargo
Published source must be acknowledged