Automated Generation of Chest X-Ray Reports
Document Type
Dissertation
Abstract
In this work, we focus on (i) understanding the relative importance of the encoder and decoder components, and (ii) developing a new reward for REINFORCE-based model optimization to improve the clinical accuracy of the generated reports. We analyze four image encoding approaches (direct, fine-grained, CLIP-based, and Cluster-CLIP-based encodings) in conjunction with three different decoders on the large-scale MIMIC-CXR dataset. Among these encoders, the Cluster-CLIP visual encoder is a novel approach that aims to generate more discriminative and explainable representations. CLIP-based encoders produce results comparable to traditional CNN-based encoders in terms of NLP metrics, while fine-grained encoding outperforms all other encoders in terms of both NLP and clinical accuracy metrics, thereby validating the importance of image encoders that extract semantic information effectively. We also propose a new reward for REINFORCE-based optimization. The reward relies on question-answering (QA) transformer models: the QA model selects the most relevant spans of the generated reports, and the report generation model is optimized with respect to those spans. The QA-based reward does not perform as well as existing rewards in REINFORCE-based optimization, but we outline its current weaknesses and propose modifications to improve it.
First Page
i
Last Page
46
Publication Date
12-30-2022
Recommended Citation
N. Otabek, "Automated Generation of Chest X-Ray Reports", M.S. Thesis, Machine Learning, MBZUAI, Abu Dhabi, UAE, 2022.
Comments
Thesis submitted to the Deanship of Graduate and Postdoctoral Studies
in partial fulfillment of the requirements for the M.Sc. degree in Machine Learning
Advisors: Dr. Karthik Nandakumar, Mr. Mohammad Yaqub
Online access provided for MBZUAI patrons