Object Detection and Semantic Segmentation on Radar Frequency Signals

Document Type



Scene understanding is a long-standing problem in the computer vision community. While annotated datasets of camera images have been the dominant approach in recent literature, several works have tackled the task with alternative sensing modalities such as LiDAR and radar. LiDARs provide high-resolution, descriptive point clouds that capture object shapes with high precision. Radars, on the other hand, are robust to adverse weather conditions, have a long operating range, and cost less than LiDARs. Yet radars have not been a popular choice of sensor for scene understanding, despite providing rich information such as object velocity and offering faster data acquisition, which reduces latency in autonomous driving.

In this work, we explore deep learning techniques, approaches, and challenges for radar perception. Scene-understanding techniques built on radar enable higher standards of safety and precision in autonomous driving, and achieving those standards with a fast-acquisition sensor like radar sets a new benchmark for the next generation of automotive sensing. Specifically, we aim to design and implement efficient and compact models that produce state-of-the-art results in object detection and localization and in semantic segmentation. We do so through a cohesive review of recent approaches, their drawbacks, and how they can be improved using state-of-the-art deep learning modules.

We introduce two models in this work: RadarFormer, for object detection and localization, and TransRadar, for semantic segmentation. RadarFormer achieves state-of-the-art object detection results with one-tenth the model size and twice the inference speed of prior methods. TransRadar is a novel model that exceeds the state of the art in semantic segmentation on two key datasets in the field. We also propose a loss function that addresses the drawbacks of radar data in machine learning. This work aims to set a new standard for efficiency and prediction scores on radar perception datasets, and encourages the computer vision community to tackle radar perception learning in more innovative ways.

Publication Date



Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfillment of the requirements for the M.Sc. degree in Computer Vision

Advisors: Dr. Hisham Cholakkal, Dr. Fahad Khan

With a one-year embargo period.
