A Multi-Head Approach with Shuffled Segments for Weakly-Supervised Video Anomaly Detection

Document Type

Conference Proceeding

Publication Title

Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision Workshops, WACVW 2024

Abstract

Weakly-supervised video anomaly detection (WS-VAD) is a challenging task because coarse video-level annotations are insufficient to train fine-grained (segment- or frame-level) detection algorithms. Multiple instance learning (MIL) powered by a ranking loss between the highest-scoring segments of normal and anomalous videos has become the de facto standard for WS-VAD. However, the ranking loss is not robust to noisy segment-level labels induced from the video-level labels, and such noise is inherent in WS settings. In this work, we propose a new variant of the MIL method that utilizes a margin loss to achieve WS-VAD. The margin loss enables effective training of an anomaly scoring head based on noisy segment-level labels with high data imbalance (a large number of normal segments and very few anomalous segments). We also introduce a self-supervised learning paradigm via stochastic shuffling of segments from multiple videos to mimic event changes during training. This forces the model to learn the boundaries between different virtual events (through a boundary localization head) and to localize the center of each virtual event (through a center localization head). The efficacy of the proposed multi-head approach in successfully localizing anomalies is demonstrated through experiments on two large-scale VAD datasets (UCF-Crime and XD-Violence).
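
The segment-shuffling idea described in the abstract can be illustrated with a small sketch. The abstract does not specify how chunks are sampled or how the boundary and center targets are encoded, so everything below is an assumption made for illustration: the helper name `build_shuffled_sequence`, its parameters, the uniform sampling of chunk lengths, and the one-hot target encoding are all hypothetical and should not be read as the authors' implementation. The sketch assumes at least two non-empty videos and `min_chunk >= 1`.

```python
import random


def build_shuffled_sequence(videos, total_len, min_chunk, max_chunk, seed=None):
    """Concatenate random contiguous chunks drawn from different videos.

    Hypothetical helper (not from the paper). Each chunk acts as one
    "virtual event". Returns four equal-length lists:
      segments: the shuffled segment sequence
      boundary: 1 where a new virtual event starts (not at index 0), else 0
      center:   1 at the middle segment of each virtual event, else 0
      source:   index of the video each segment was taken from
    """
    rng = random.Random(seed)
    segments, boundary, center, source = [], [], [], []
    prev_vid = None
    while len(segments) < total_len:
        # Never reuse the previous video, so every chunk is a new event.
        candidates = [i for i in range(len(videos)) if i != prev_vid]
        vid = rng.choice(candidates)
        video = videos[vid]
        remaining = total_len - len(segments)
        # Chunk length is capped by the video length and the space left.
        length = min(rng.randint(min_chunk, max_chunk), len(video), remaining)
        start = rng.randint(0, len(video) - length)
        offset = len(segments)
        segments.extend(video[start:start + length])
        source.extend([vid] * length)
        boundary.extend(
            [1 if (k == 0 and offset > 0) else 0 for k in range(length)]
        )
        center.extend([1 if k == length // 2 else 0 for k in range(length)])
        prev_vid = vid
    return segments, boundary, center, source


# Toy usage: three "videos" of ten placeholder segments each.
videos = [[f"v{v}_s{s}" for s in range(10)] for v in range(3)]
segments, boundary, center, source = build_shuffled_sequence(
    videos, total_len=16, min_chunk=3, max_chunk=6, seed=0
)
```

In a setup like this, the `boundary` and `center` lists would serve as self-supervised targets for the boundary and center localization heads, while the anomaly scoring head would still be trained from the video-level labels.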

First Page

132

Last Page

142

DOI

10.1109/WACVW60836.2024.00022

Publication Date

1-1-2024
