Analyzing the Influence of Diverse Background Noises on Voice Transmission: A Deep Learning Approach to Noise Suppression.

Nogales Moyano, Alberto; Caracuel-Cayuela, Javier; García Tejedor, Álvaro José

doi:10.3390/app14020740

applsci-14-00740-v2.pdf (1.06 MB)

Identifiers

URI: https://hdl.handle.net/10641/4297

ISSN: 2076-3417

DOI: 10.3390/app14020740

Publication date

2024

Authors

Nogales Moyano, Alberto

Caracuel-Cayuela, Javier

García Tejedor, Álvaro José

Abstract

This paper presents an approach to enhancing the clarity and intelligibility of speech in digital communications degraded by background noise. Using deep learning techniques, specifically a Variational Autoencoder (VAE) with 2D convolutional filters, we aim to suppress background noise in audio signals. Our method focuses on four simulated environmental noise scenarios: storms, wind, traffic, and aircraft. The training dataset was built from public sources (the TED-LIUM 3 dataset, which contains audio recordings from the TED Talks series) combined with these background noises. The audio signals were transformed into 2D power spectrograms, on which our VAE model was trained to filter out the noise and reconstruct clean audio. Our results demonstrate that the model outperforms existing state-of-the-art solutions in noise suppression. Although results differed across noise types, no single background noise could be conclusively identified as the most harmful to speech quality. The results were assessed with objective methods (mathematical metrics) and subjective methods (human listeners evaluating a set of audio clips). Notably, wind noise showed the smallest deviation between the noisy and cleaned audio and was perceived subjectively as the most improved scenario. Future work should involve refining the phase calculation of the cleaned audio and creating a more balanced dataset to minimize differences in audio quality across scenarios. Practical applications of the model to real-time streaming audio are also envisaged. This research contributes to the field of audio signal processing by offering a deep learning solution tailored to various noise conditions, enhancing digital communication quality.
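
The data preparation described in the abstract (speech combined with background noise, then converted to a 2D power spectrogram) can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the mixing rule at a target signal-to-noise ratio, the 16 kHz sample rate, the `n_fft` and `hop` values, the Hann window, and the function names `mix_at_snr` and `power_spectrogram` are all assumptions made for illustration, and synthetic signals stand in for TED-LIUM 3 speech and the recorded noises.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled to a target SNR (assumed mixing rule)."""
    # Repeat or trim the noise so it matches the speech length.
    noise = np.resize(noise, speech.shape)
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Gain that makes p_speech / (gain^2 * p_noise) equal 10^(snr_db / 10).
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise

def power_spectrogram(signal, n_fft=512, hop=128):
    """Return |STFT|^2 with shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Squared magnitude of the one-sided FFT of each windowed frame.
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2

# Synthetic stand-ins (assumptions): a tone for speech, white noise for wind.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
speech = np.sin(2 * np.pi * 220 * t)
noise = rng.normal(size=sr // 2)

noisy = mix_at_snr(speech, noise, snr_db=5.0)
spec = power_spectrogram(noisy)  # 2D array a convolutional model could consume
```

The power spectrogram discards phase, which is consistent with the abstract listing phase calculation of the cleaned audio as future work; how the authors actually recover phase is not stated here.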

Keywords

Speech enhancement, Noise suppression, Deep learning, Variational autoencoders

Collections

ESCUELA POLITÉCNICA SUPERIOR

Depósito Digital UFV
