An Efficient Violence Detection Approach for Smart Cities Surveillance System

Document Type

Conference Proceeding

Publication Title

Proceedings of 2023 IEEE International Smart Cities Conference, ISC2 2023


Detecting violence in surveillance videos is a crucial task in activity recognition, with wide-ranging applications in unmanned aerial vehicles (UAVs), internet video filtering, and related domains. This study proposes a highly effective deep learning architecture that employs a two-stream approach, combining a 3D convolution network with a merging module for violence detection. One stream analyzes background-suppressed RGB frames, while the other focuses on the optical flow between corresponding frames. These inputs are valuable for identifying violent actions, which are often characterized by distinctive body movements. To ensure robust long-range feature extraction with fewer parameters, we replace the conventional 3D convolution at each layer with a 3D depth-wise convolution. Our model outperforms existing methods on challenging datasets such as RWF2000, Real-Life Violence Situation (RLVS), and Movie Fight, securing state-of-the-art results. Our experiments demonstrate that the proposed model is well-suited for edge devices, offering computational efficiency and precise detection capabilities.
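The parameter saving behind the depth-wise substitution can be illustrated with a short count. This is a minimal sketch and not the authors' implementation: the channel sizes (32 in, 64 out), the 3x3x3 kernel, the absence of bias terms, and the pairing of the depth-wise layer with a 1x1x1 point-wise layer are all assumptions made for illustration, and the function names are hypothetical.

```python
# Illustrative parameter counts for a single 3D convolution layer.
# All sizes below are assumed for the example; the abstract does not state them.

def conv3d_params(c_in, c_out, kernel=(3, 3, 3), bias=False):
    """Weights in a conventional 3D convolution: every output channel
    has its own kernel over every input channel."""
    kt, kh, kw = kernel
    return c_in * c_out * kt * kh * kw + (c_out if bias else 0)

def depthwise_conv3d_params(channels, kernel=(3, 3, 3), bias=False):
    """Weights in a 3D depth-wise convolution: one kernel per input channel,
    with no mixing across channels."""
    kt, kh, kw = kernel
    return channels * kt * kh * kw + (channels if bias else 0)

def pointwise_conv3d_params(c_in, c_out, bias=False):
    """Weights in a 1x1x1 convolution that mixes channels (assumed companion
    layer; the abstract only mentions the depth-wise operation)."""
    return c_in * c_out + (c_out if bias else 0)

c_in, c_out = 32, 64  # hypothetical channel sizes
standard = conv3d_params(c_in, c_out)
depthwise = depthwise_conv3d_params(c_in)
separable = depthwise + pointwise_conv3d_params(c_in, c_out)

print(f"standard 3D conv:       {standard} weights")
print(f"depth-wise only:        {depthwise} weights")
print(f"depth-wise + 1x1x1 mix: {separable} weights")
print(f"reduction factor:       {standard / separable:.1f}x")
```

Under these assumed sizes, the depth-wise variant needs far fewer weights than the conventional layer, which is consistent with the abstract's claim of fewer parameters and suitability for edge devices. The actual saving in the proposed model depends on its real layer configuration, which is not given here.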



Publication Date



Emotion recognition, Three-dimensional display, Smart cities, Convolution, Surveillance, Computational modeling, Speech recognition


IR conditions: non-described