Efficient and Accurate Phase Recognition in Videos of Cataract Surgeries
Date of Award
4-30-2024
Document Type
Thesis
Degree Name
Master of Science in Computer Vision
Department
Computer Vision
First Advisor
Dr. Mohammad Yaqub
Second Advisor
Dr. Karthik Nandakumar
Abstract
Cataracts are a leading cause of vision impairment globally, affecting 65.2 million of the estimated 2.2 billion people living with some form of vision impairment. The condition is characterized by gradual clouding of the eye’s lens, primarily due to aging. Phacoemulsification cataract surgery is currently regarded as the standard of care because of its reduced risk of postoperative complications. Despite its benefits, challenges persist: proficient surgeons are scarce, and practical approaches for assessing surgical skill and providing feedback are lacking. Deep learning solutions can address this by providing intraoperative assessment, postoperative analysis, and systematic feedback on surgeon performance. The first step in many of these solutions is accurate and efficient phase recognition, in which a deep learning model assigns each frame of a cataract surgery video to a single surgical phase. However, previous state-of-the-art methods have not thoroughly considered the trade-off between efficiency and effectiveness, and most achieve better performance at the cost of an inefficient architecture. In this research, we introduce a new paradigm for phase recognition that uses selective state spaces to balance efficiency and effectiveness. Our proposed method, CataMamba, is a two-stage architecture that first extracts rich visual features from the frames of the surgery video and then models the temporal relations between them using Mamba blocks. We show that the method balances efficiency and effectiveness on two cataract surgery datasets with different numbers of phases, Cataract-101 and CATARACTS, where it either outperforms or performs comparably to current leading methods. These findings highlight the potential of our method within the emerging field of surgical phase recognition.
Recommended Citation
D. Mohamed, "Efficient and Accurate Phase Recognition in Videos of Cataract Surgeries," Apr 2024.
Comments
Thesis submitted to the Deanship of Graduate and Postdoctoral Studies
in partial fulfilment of the requirements for the M.Sc. degree in Computer Vision
Advisors: Mohammad Yaqub, Karthik Nandakumar
Online access available for MBZUAI patrons