Profiling Users for Question Answering Communities via Flow-Based Constrained Co-Embedding Model

Shangsong Liang, Sun Yat-Sen University & Mohamed bin Zayed University of Artificial Intelligence
Yupeng Luo, Sun Yat-Sen University
Zaiqiao Meng, University of Glasgow

Abstract

In this article, we study the task of user profiling in question answering communities (QACs). Previous user profiling algorithms suffer from several defects: they treat users and words as atomic units, which leads to a mismatch between them; they are designed for applications other than QACs; and some semantic profiling algorithms do not co-embed users and words, which makes it difficult to measure the affinity between them. To improve profiling performance, we propose a neural Flow-based Constrained Co-embedding Model, abbreviated as FCCM. FCCM jointly co-embeds the vector representations of both users and words in QACs so that the affinities between them can be measured semantically. Specifically, FCCM extends the standard variational auto-encoder model so that the inferred embeddings of users and words are subject to a voting constraint: given a question and the users who answer it in the community, the representations of users whose answers receive more votes are closer to the representations of the words associated with those answers than are the representations of users whose answers receive fewer votes. In addition, FCCM integrates normalizing flows into the variational auto-encoder framework to avoid the assumption that the embeddings follow Gaussian distributions, so that the inferred embeddings fit the real distributions of the data better. Experimental results on a Chinese Zhihu question answering dataset demonstrate the effectiveness of the proposed FCCM model for the task of user profiling in QACs.
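
To make the voting constraint concrete, the following is a minimal sketch of one way such a constraint could be expressed as a pairwise hinge penalty. It is an illustration only and not necessarily the formulation used in FCCM: the function name `voting_constraint_penalty`, the use of Euclidean distance, the `margin` parameter, and the idea of summarizing each answer by a single aggregated word vector are all assumptions introduced here.

```python
import numpy as np

def voting_constraint_penalty(user_vecs, answer_word_vecs, votes, margin=0.0):
    """Hypothetical hinge penalty for users answering the same question.

    user_vecs: (n, d) array, one embedding per answering user.
    answer_word_vecs: (n, d) array, one aggregated word embedding per answer
        (assumed here to be, e.g., the mean of the answer's word embeddings).
    votes: length-n sequence with the vote count of each answer.
    margin: assumed slack; 0.0 only asks for "closer or equal".
    """
    user_vecs = np.asarray(user_vecs, dtype=float)
    answer_word_vecs = np.asarray(answer_word_vecs, dtype=float)
    # Euclidean distance between each user and the words of that user's answer.
    dists = np.linalg.norm(user_vecs - answer_word_vecs, axis=1)
    penalty = 0.0
    n = len(votes)
    for i in range(n):
        for j in range(n):
            if votes[i] > votes[j]:
                # User i received more votes, so user i should be at least
                # `margin` closer to their answer's words than user j is.
                penalty += max(0.0, dists[i] - dists[j] + margin)
    return penalty
```

In this sketch the penalty is zero whenever every higher-voted user is already closer to the words of their answer, and it grows with the size of each violation, so it could in principle be added to a variational objective as a regularizer.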