Machine Learning Faculty Publications

Beat Transformer: Demixed Beat and Downbeat Tracking with Dilated Self-Attention

Jingwei Zhao, Institute of Data Science, NUS, Singapore & Integrative Sciences and Engineering Programme, NUS Graduate School, Singapore
Gus Xia, Music X Lab, NYU Shanghai, China & Mohamed bin Zayed University of Artificial Intelligence
Ye Wang, School of Computing, NUS, Singapore & Institute of Data Science, NUS, Singapore & Integrative Sciences and Engineering Programme, NUS Graduate School, Singapore

Document Type

Article

Publication Title

arXiv

Abstract

We propose Beat Transformer, a novel Transformer encoder architecture for joint beat and downbeat tracking. Unlike previous models that track beats solely from the spectrogram of an audio mixture, our model operates on demixed spectrograms with multiple instrument channels. This is inspired by the fact that humans perceive metrical structures from richer musical contexts, such as chord progression and instrumentation. To this end, we develop a Transformer model with both time-wise attention and instrument-wise attention to capture deep-buried metrical cues. Moreover, our model adopts a novel dilated self-attention mechanism, which achieves powerful hierarchical modelling with only linear complexity. Experiments demonstrate a significant improvement in demixed beat tracking over the non-demixed version. Beat Transformer also improves downbeat tracking accuracy by up to 4 percentage points over TCN architectures. We further discover an interpretable attention pattern that mirrors our understanding of hierarchical metrical structures. © 2022, CC BY.
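
The linear-complexity claim for dilated self-attention can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `window` parameter, the use of the input itself as query, key, and value (no learned projections, single head), and the doubling of the dilation rate per layer are all assumptions made for illustration. Each frame attends to at most `2 * window + 1` neighbours, so one layer costs O(T * window) rather than O(T^2).

```python
import math

def dilated_self_attention(x, window=2, dilation=1):
    """Frame t attends only to frames t + j * dilation, j in [-window, window].

    x is a list of T feature vectors (lists of floats). Queries, keys and
    values are all x itself: an assumption that keeps the sketch short.
    """
    T, dim = len(x), len(x[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for t in range(T):
        # At most 2 * window + 1 neighbours, regardless of sequence length T.
        idx = [t + j * dilation for j in range(-window, window + 1)
               if 0 <= t + j * dilation < T]
        # Scaled dot-product scores against the selected neighbours only.
        scores = [scale * sum(a * b for a, b in zip(x[t], x[i])) for i in idx]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append([sum(w * x[i][d] for w, i in zip(weights, idx)) / z
                    for d in range(dim)])
    return out

def stacked_dilated_attention(x, window=2, num_layers=3):
    """Stack layers with dilation 1, 2, 4, ... (hypothetical schedule).

    The receptive field grows exponentially with depth while each layer
    stays linear in the sequence length.
    """
    for layer in range(num_layers):
        x = dilated_self_attention(x, window=window, dilation=2 ** layer)
    return x
```

Calling `stacked_dilated_attention(frames, window=2, num_layers=3)` on a list of per-frame feature vectors returns a list of the same shape; the demixed setting described in the abstract would additionally attend across instrument channels, which this sketch omits.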

DOI

10.48550/arXiv.2209.07140

Publication Date

9-15-2022

Keywords

Spectrographs

Comments

Preprint: arXiv

Archived with thanks to arXiv

Preprint License: CC BY 4.0

Uploaded 31 October 2022

Recommended Citation

J. Zhao, G. Xia, and Y. Wang, "Beat Transformer: Demixed Beat and Downbeat Tracking with Dilated Self-Attention", 2022, doi:10.48550/arXiv.2209.07140

Included in

Artificial Intelligence and Robotics Commons
