TP-MobNet: A Two-pass Mobile Network for Low-complexity Classification of Acoustic Scene

Soonshin Seo; Junseok Oh; Eunsoo Cho; Hosung Park; Gyujin Kim; Ji-Hwan Kim

doi:10.32604/cmc.2022.026259

ARTICLE

Soonshin Seo¹, Junseok Oh², Eunsoo Cho², Hosung Park², Gyujin Kim², Ji-Hwan Kim²,*

¹ NAVER Corporation, Seongnam, 13561, Korea
² Department of Computer Science and Engineering, Sogang University, Seoul, 04107, Korea

* Corresponding Author: Ji-Hwan Kim. Email: email

Computers, Materials & Continua 2022, 73(2), 3291-3303. https://doi.org/10.32604/cmc.2022.026259

Received 20 December 2021; Accepted 22 February 2022; Issue published 16 June 2022

Abstract

Acoustic scene classification (ASC) is the task of recognizing and classifying environments from their acoustic signals. Various ASC approaches based on deep learning have been developed, with convolutional neural networks (CNNs) proving to be the most reliable and commonly used in ASC systems because they are well suited to building lightweight models. When ASC systems are deployed in the real world, model complexity and device robustness are essential considerations. In this paper, we propose a two-pass mobile network for low-complexity classification of acoustic scenes, named TP-MobNet. TP-MobNet is based on MobileNetV2, with inverted residuals and linear bottlenecks; coordinate attention and two-pass fusion are applied after the mobile blocks. Coordinate attention lets the network learn long-range dependencies and precise position information in feature maps. Two-pass fusion captures more diverse feature resolutions at the end of the network, which also improves generalization. In addition, the model size is reduced by applying weight quantization to the trained model. All experiments were conducted on the TAU Urban Acoustic Scenes 2020 Mobile development set. The proposed model achieves an accuracy of 73.94% with a model size of 219.6 kB.

Keywords

Acoustic scene classification; low-complexity; device robustness; two-pass mobile network; coordinate attention; weight quantization

Cite This Article

APA Style

Seo, S., Oh, J., Cho, E., Park, H., Kim, G. et al. (2022). TP-MobNet: A Two-pass Mobile Network for Low-complexity Classification of Acoustic Scene. Computers, Materials & Continua, 73(2), 3291–3303. https://doi.org/10.32604/cmc.2022.026259

Vancouver Style

Seo S, Oh J, Cho E, Park H, Kim G, Kim J. TP-MobNet: A Two-pass Mobile Network for Low-complexity Classification of Acoustic Scene. Comput Mater Contin. 2022;73(2):3291–3303. https://doi.org/10.32604/cmc.2022.026259

IEEE Style

S. Seo, J. Oh, E. Cho, H. Park, G. Kim, and J. Kim, “TP-MobNet: A Two-pass Mobile Network for Low-complexity Classification of Acoustic Scene,” Comput. Mater. Contin., vol. 73, no. 2, pp. 3291–3303, 2022. https://doi.org/10.32604/cmc.2022.026259

Copyright © 2022 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
