Low-complexity short-time acoustic scene classification based on causal convolutions

Acoustic scene classification is vital for building smart cities. Most acoustic scene classification studies train convolutional neural networks. Networks built on general convolutions need to account for the time-frequency connections in the audio spectrogram. To address this, this work proposes a bottleneck-structure neural network model for acoustic scene classification based on causal convolution and sub-spectral normalization. To obtain more sample data, a new data augmentation method, mix-SpecAugment, is proposed by drawing on the classical mixup and SpecAugment methods, and all three are combined for sample augmentation. Experiments showed that the proposed bottleneck structure helps improve the recognition accuracy and robustness of the model. In addition, an ablation experiment showed that the model trained with mixup, SpecAugment, and the newly proposed mix-SpecAugment combined had the highest classification accuracy. Finally, experiments with audio spectrograms of different shapes showed that near-square spectrograms give better training results. Compared with the baseline, the proposed model significantly improved accuracy and reduced loss.
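
The abstract names causal convolution and mixup without describing them. The following is a minimal, illustrative sketch of both ideas in plain Python, not the authors' implementation. The function names `causal_conv1d` and `mixup`, the 1-D inputs, and the fixed mixing weight `lam` are assumptions made for illustration only. The paper's model operates on spectrograms and also uses sub-spectral normalization, SpecAugment, and mix-SpecAugment, none of which are shown here.

```python
from typing import List, Sequence, Tuple


def causal_conv1d(x: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Causal 1-D convolution: out[t] depends only on x[t], x[t-1], ...

    kernel[0] weights the current sample, kernel[1] the previous one, etc.
    The input is left-padded with zeros so the output keeps the input length.
    (Hypothetical helper for illustration.)
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(x)
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j in range(k):
            # padded[t + k - 1] is x[t]; larger j reaches further into the past.
            acc += kernel[j] * padded[t + k - 1 - j]
        out.append(acc)
    return out


def mixup(
    a: Sequence[float],
    b: Sequence[float],
    label_a: Sequence[float],
    label_b: Sequence[float],
    lam: float,
) -> Tuple[List[float], List[float]]:
    """Blend two examples and their one-hot labels with weight lam in [0, 1].

    In practice lam is usually sampled at random; it is passed in here
    so the example stays deterministic. (Hypothetical helper.)
    """
    mixed_x = [lam * u + (1.0 - lam) * v for u, v in zip(a, b)]
    mixed_y = [lam * p + (1.0 - lam) * q for p, q in zip(label_a, label_b)]
    return mixed_x, mixed_y
```

With a two-tap kernel, `causal_conv1d` averages each sample with the one before it, so changing a later input sample never alters an earlier output. `mixup` returns a weighted blend of the two inputs together with the same blend of their labels.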

Document Type: Research Article

Affiliations: 1: Department of Computing, Faculty of Engineering, The Hong Kong Polytechnic University 2: School of Communication Engineering, Xidian University

Publication date: 04 October 2024

More about this publication?
  • The Noise-Con conference proceedings are sponsored by INCE/USA and the Inter-Noise proceedings by I-INCE. NOVEM (Noise and Vibration Emerging Methods) conference proceedings are included. All Noise-Con proceedings one year or older are free to download. Inter-Noise proceedings from outside the USA older than 10 years are free to download. Others are free to INCE/USA members and member societies of I-INCE.
