RespiroDynamics Unveiled: A Groundbreaking Multi-Modal Deep Learning and Spiking Neural Network Framework for Revolutionizing Non-Invasive Lung Health Assessment
Date of Award
4-30-2024
Document Type
Thesis
Degree Name
Master of Science in Computer Vision
Department
Computer Vision
First Advisor
Dr. Mohsen Guizani
Second Advisor
Dr. Mohammad Yaqub
Abstract
This thesis investigates non-invasive lung health assessment using deep learning and Spiking Neural Networks (SNNs) to analyze thermal and RGB video data. Traditional respiratory diagnostics often require direct physical interaction, which can cause patient discomfort. This research aims to develop a non-contact, robust, precise, and energy-efficient lung health model that uses thermal or mobile video recordings together with personal data, eliminating the need for traditional spirometry. The study collected a unique dataset from 60 male participants of various demographics and health backgrounds, comprising thermal and RGB videos, heart rate, ECG data, and detailed metadata. The methodology involved creating and testing several neural network models, including Convolutional Neural Networks (CNNs) for classification and regression tasks, and SNNs that process temporal respiratory patterns. Innovations include data augmentation, the Adaptive Precision-Tuned Regression (APTR) loss function, multimodal data integration, attention mechanisms, and ensemble learning to enhance model performance. Results showed high efficacy in both classification and regression tasks. In the FVC Normal vs. Abnormal classification, the thermal model scored 100% and the RGB model scored 99.7%. In the Peak Expiratory Flow (PEF) classification, the thermal model outperformed the RGB model, with 97.14% accuracy versus 96%. After data aggregation, SNN accuracy improved from 91.99% to 99.5% for thermal videos and from 81.17% to 99% for RGB videos. In regression tasks, ensemble learning significantly boosted performance: the thermal model reported a Relative Root Mean Square Error of 0.11, a Relative Mean Absolute Error of 0.09, and a Pearson Correlation of 0.93. By comparison, the RGB model performed worse, with corresponding values of 0.26, 0.21, and 0.79.
These findings highlight the superior performance of thermal imaging over RGB in detecting respiratory patterns and the beneficial impact of integrating metadata into the models, setting new standards in the field.
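The regression metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not define how the "relative" errors are normalized, so the code below assumes division by the mean of the true values; it also assumes a plain average as the ensemble rule. The function names and the sample numbers are hypothetical and are not taken from the thesis.

```python
import math


def relative_rmse(y_true, y_pred):
    """RMSE divided by the mean true value (assumed normalization)."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return math.sqrt(mse) / (sum(y_true) / n)


def relative_mae(y_true, y_pred):
    """MAE divided by the mean true value (assumed normalization)."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return mae / (sum(y_true) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between targets and predictions."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def ensemble_mean(predictions):
    """Average per-sample predictions from several models (assumed rule)."""
    return [sum(col) / len(col) for col in zip(*predictions)]


if __name__ == "__main__":
    # Made-up spirometry-like targets and two made-up model outputs.
    y_true = [3.1, 4.2, 2.8, 5.0]
    model_a = [2.9, 4.6, 2.5, 5.3]
    model_b = [3.4, 4.0, 3.0, 4.6]
    y_ens = ensemble_mean([model_a, model_b])
    print("rRMSE:", relative_rmse(y_true, y_ens))
    print("rMAE:", relative_mae(y_true, y_ens))
    print("Pearson r:", pearson_r(y_true, y_ens))
```

Under this normalization, lower relative errors and a Pearson correlation closer to 1 indicate better agreement with the reference measurements, which is the direction of the thermal versus RGB comparison reported above.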
Recommended Citation
A. Sharshar, "RespiroDynamics Unveiled: A Groundbreaking Multi-Modal Deep Learning and Spiking Neural Network Framework for Revolutionizing Non-Invasive Lung Health Assessment," Apr. 2024.
Comments
Thesis submitted to the Deanship of Graduate and Postdoctoral Studies
in partial fulfilment of the requirements for the M.Sc. degree in Computer Vision
Advisors: Mohsen Guizani, Mohammad Yaqub
With a 1-year embargo period