Category: Sound
![Unsupervised Audio-Caption Alignment via Correspondence Learning](../../../wp-content/uploads/2024/01/2401.02584-400x200.png)
Unsupervised Audio-Caption Alignment via Correspondence Learning
![Enhancing Speaker Recognition with Gradient Weighting and Noise Suppression](../../../wp-content/uploads/2024/01/2401.02626-400x200.png)
Enhancing Speaker Recognition with Gradient Weighting and Noise Suppression
![Generating Music Tracks with Unified Representation and Diffusion Framework: A Comparative Study](../../../wp-content/uploads/2024/01/2401.02678-400x200.png)
Generating Music Tracks with Unified Representation and Diffusion Framework: A Comparative Study
![Efficient Assessment of Student Music Performances Using Deep Neural Networks](../../../wp-content/uploads/2024/01/2401.02566-400x200.png)
Efficient Assessment of Student Music Performances Using Deep Neural Networks
![Enhancing Speech Emotion Recognition with Pretrained Models](../../../wp-content/uploads/2023/12/2312.16383-400x200.png)
Enhancing Speech Emotion Recognition with Pretrained Models
![Voice Conversion Techniques: An Overview](../../../wp-content/uploads/2023/12/2312.16552-400x200.png)
Voice Conversion Techniques: An Overview
![Self-Supervised Learning for Speech Recognition: A Comparative Study](../../../wp-content/uploads/2023/12/2312.16613-400x200.png)
Self-Supervised Learning for Speech Recognition: A Comparative Study
![Designing Artificial Reverberation Networks with Control of Scattering and Early Reflections](../../../wp-content/uploads/2023/12/2312.14658-400x200.png)
Designing Artificial Reverberation Networks with Control of Scattering and Early Reflections
![Unifying Embeddings for Face Recognition and Clustering](../../../wp-content/uploads/2023/12/2312.14806-400x200.png)
Unifying Embeddings for Face Recognition and Clustering
![High-Fidelity Neural Audio Compression: A Comparative Study of Recent Methods](../../../wp-content/uploads/2023/12/2312.13722-400x200.png)
High-Fidelity Neural Audio Compression: A Comparative Study of Recent Methods
![Improving Audio-Visual Speech Recognition with HuBERT: A Data-Driven Approach](../../../wp-content/uploads/2023/12/2312.13873-400x200.png)
Improving Audio-Visual Speech Recognition with HuBERT: A Data-Driven Approach
![Comparing Deep Learning Models for Music Classification: A Comprehensive Study](../../../wp-content/uploads/2023/12/2312.14005-400x200.png)
Comparing Deep Learning Models for Music Classification: A Comprehensive Study
![Rap Music Evolution: From Early 2000s to Global Dominance](../../../wp-content/uploads/2023/12/2312.14036-400x200.png)
Rap Music Evolution: From Early 2000s to Global Dominance
![Recognizing Underwater Acoustic Signals with Multilevel Cascading and Anonymization](../../../wp-content/uploads/2023/12/2312.13143-400x200.png)
Recognizing Underwater Acoustic Signals with Multilevel Cascading and Anonymization
![Speech Separation Techniques: Transformer, Attention, and Deep Learning](../../../wp-content/uploads/2023/12/2312.11825-400x200.png)
Speech Separation Techniques: Transformer, Attention, and Deep Learning
![Improving Speech Emotion Recognition with Ablation Studies and Multi-Scale DNNs](../../../wp-content/uploads/2023/12/2312.11974-400x200.png)
Improving Speech Emotion Recognition with Ablation Studies and Multi-Scale DNNs
![Semantic VAD: Low-Latency Voice Activity Detection for Speech Interaction](../../../wp-content/uploads/2023/12/2312.14860-400x200.png)
Semantic VAD: Low-Latency Voice Activity Detection for Speech Interaction
![Accelerating Progress in Spoofed and Deepfake Speech Detection](../../../wp-content/uploads/2023/12/2312.09651-400x200.png)
Accelerating Progress in Spoofed and Deepfake Speech Detection
![CHiME Speech Separation and Recognition Challenges: A Comprehensive Overview](../../../wp-content/uploads/2023/12/2312.09746-400x200.png)
CHiME Speech Separation and Recognition Challenges: A Comprehensive Overview
![Unifying Streaming and Non-Streaming ASR with Cascaded Encoders](../../../wp-content/uploads/2023/12/2312.09842-400x200.png)