Document detail
ID

oai:arXiv.org:2406.08800

Topic
Computer Science - Sound; Computer Science - Machine Learnin...; Electrical Engineering and Systems...
Author
Feng, Tiantian; Dimitriadis, Dimitrios; Narayanan, Shrikanth
Category

Computer Science

Year

2024

Listing date

9/4/2024

Keywords
science, recognition, modeling, models, audio
Abstract

Recent advances in foundation models have enabled audio-generative models that produce high-fidelity sounds associated with music, events, and human actions.

Despite the success of modern audio-generative models, the conventional approach to assessing generation quality relies heavily on distance metrics such as Fréchet Audio Distance.

In contrast, we aim to evaluate the quality of generated audio by examining how effective it is as training data.

Specifically, we conduct studies to explore the use of synthetic audio for audio recognition.

Moreover, we investigate whether synthetic audio can serve as a resource for data augmentation in speech-related modeling.

Our comprehensive experiments demonstrate the potential of using synthetic audio for audio recognition and speech-related modeling.

Our code is available at https://github.com/usc-sail/SynthAudio.

Comment: Accepted to 2024 INTERSPEECH; corrections to ActivityNet labels

Feng, Tiantian; Dimitriadis, Dimitrios; Narayanan, Shrikanth, 2024, Can Synthetic Audio From Generative Foundation Models Assist Audio Recognition and Speech Modeling?
