2/25/2026 | USA | technology | ✓ Verified - arxiv.org

Training-Free Intelligibility-Guided Observation Addition for Noisy ASR

#Automatic Speech Recognition #Speech Enhancement #Observation Addition #Intelligibility-guided #Training-free #Noise reduction #Audio processing #Machine learning

📌 Key Takeaways

The new method fuses noisy and enhanced speech without modifying existing ASR or speech enhancement models
It's training-free, reducing complexity and enhancing generalization compared to previous approaches
Fusion weights are derived from intelligibility estimates obtained directly from the backend ASR
Experiments across diverse SE-ASR combinations show strong robustness and improvements over existing baselines
Additional analyses validate the design through switching-based alternatives and frame vs. utterance-level comparisons
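To make the fusion idea in the takeaways concrete, here is a minimal sketch of observation addition as a weighted sum of the noisy and enhanced waveforms, at both utterance level (one scalar weight) and frame level (one weight per frame). The function names, the convention that the weight applies to the enhanced signal, and the non-overlapping frame layout are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def observation_addition(noisy, enhanced, weight):
    """Fuse two time-aligned waveforms as weight * enhanced + (1 - weight) * noisy.

    `weight` may be a scalar (utterance-level) or an array with one value per
    sample (e.g. frame-level weights expanded to sample resolution).
    Assumption: the weight is applied to the enhanced signal.
    """
    noisy = np.asarray(noisy, dtype=np.float64)
    enhanced = np.asarray(enhanced, dtype=np.float64)
    if noisy.shape != enhanced.shape:
        raise ValueError("noisy and enhanced signals must have the same shape")
    # Keep the weight in [0, 1] so the result stays a convex combination.
    w = np.clip(np.asarray(weight, dtype=np.float64), 0.0, 1.0)
    return w * enhanced + (1.0 - w) * noisy


def expand_frame_weights(frame_weights, frame_len, num_samples):
    """Repeat each frame weight over `frame_len` samples (non-overlapping frames).

    Assumes at least one frame weight; the last weight is extended if the
    frames do not cover the whole signal.
    """
    per_sample = np.repeat(np.asarray(frame_weights, dtype=np.float64), frame_len)
    if per_sample.size < num_samples:
        per_sample = np.pad(per_sample, (0, num_samples - per_sample.size), mode="edge")
    return per_sample[:num_samples]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = rng.standard_normal(16000)      # stand-in for 1 s of noisy audio
    enhanced = 0.5 * noisy                  # stand-in for an SE model's output

    # Utterance-level OA: one weight for the whole signal.
    fused_utt = observation_addition(noisy, enhanced, 0.7)

    # Frame-level OA: one weight per 400-sample frame.
    frame_w = rng.uniform(0.0, 1.0, size=40)
    sample_w = expand_frame_weights(frame_w, 400, noisy.size)
    fused_frame = observation_addition(noisy, enhanced, sample_w)
    print(fused_utt.shape, fused_frame.shape)
```

Because the fusion happens on the waveform before recognition, neither the SE model nor the ASR model needs to change, which matches the first takeaway.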

📖 Full Retelling

Researchers Haoyang Li, Changsong Liu, Wei Rao, Hao Shi, Sakriani Sakti, and Eng Siong Chng have introduced a training-free, intelligibility-guided observation addition (OA) method for improving automatic speech recognition (ASR) in noisy environments. The paper was submitted to arXiv on February 24, 2026, under Electrical Engineering and Systems Science > Audio and Speech Processing. It addresses a two-sided problem: ASR degrades severely when background noise is present, while speech enhancement (SE) front-ends that suppress the noise often introduce artifacts that themselves harm recognition.

Observation addition tackles this by fusing the noisy speech with its SE-enhanced version, without modifying the parameters of the SE or ASR models. Unlike prior OA methods that rely on trained neural predictors, the new technique derives its fusion weights from intelligibility estimates obtained directly from the backend ASR, which removes the need for training, reduces complexity, and improves generalization. Experiments across diverse SE-ASR combinations and datasets showed strong robustness and improvements over existing OA baselines, making the method a promising option for real-world applications where clean audio cannot be guaranteed.
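The summary does not say how intelligibility estimates are turned into fusion weights, so the sketch below is only one plausible mapping, not the paper's formula. It assumes two non-negative intelligibility scores (one for the noisy input, one for the enhanced input) have already been obtained from the backend ASR by some means. The names `fusion_weight` and `switching_weight` are hypothetical. The second function illustrates the hard "switching" alternative that the paper is reported to compare against.

```python
def fusion_weight(score_noisy, score_enhanced, eps=1e-8):
    """Hypothetical soft mapping: weight on the enhanced signal equals its
    share of the total intelligibility score. Not the paper's formula."""
    if score_noisy < 0 or score_enhanced < 0:
        raise ValueError("intelligibility scores are assumed to be non-negative")
    total = score_noisy + score_enhanced
    if total < eps:
        # No evidence either way: fall back to an equal mix.
        return 0.5
    return score_enhanced / total


def switching_weight(score_noisy, score_enhanced):
    """Hard switching: use only whichever input scores as more intelligible."""
    return 1.0 if score_enhanced >= score_noisy else 0.0


if __name__ == "__main__":
    # Made-up scores purely for illustration.
    s_noisy, s_enhanced = 0.2, 0.6
    w_soft = fusion_weight(s_noisy, s_enhanced)
    w_hard = switching_weight(s_noisy, s_enhanced)
    # The fused input would then be w * enhanced + (1 - w) * noisy.
    print(w_soft, w_hard)
```

No predictor is trained in either function, since the weight is computed directly from the scores. That mirrors the training-free property described above, though the real method's score definition and mapping may differ.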

🏷️ Themes

Speech Recognition, Audio Processing, Machine Learning, Noise Reduction

📚 Related People & Topics

Speech recognition

Automatic conversion of spoken language into text

Speech recognition (automatic speech recognition (ASR), computer speech recognition, or speech-to-text (STT)) is a sub-field of computational linguistics concerned with methods and technologies that translate spoken language into text or other interpretable forms. Speech recognition applications inc...


Noise reduction

Process of removing noise from a signal

Noise reduction is the process of removing noise from a signal. Noise reduction techniques exist for audio and images. Noise reduction algorithms may distort the signal to some degree.


Audio processing

Topics referred to by the same term

Audio processing may refer to:



Original Source

Electrical Engineering and Systems Science > Audio and Speech Processing

arXiv:2602.20967 [Submitted on 24 Feb 2026]

Title: Training-Free Intelligibility-Guided Observation Addition for Noisy ASR

Authors: Haoyang Li, Changsong Liu, Wei Rao, Hao Shi, Sakriani Sakti, Eng Siong Chng

Abstract: Automatic speech recognition degrades severely in noisy environments. Although speech enhancement front-ends effectively suppress background noise, they often introduce artifacts that harm recognition. Observation addition addressed this issue by fusing noisy and SE enhanced speech, improving recognition without modifying the parameters of the SE or ASR models. This paper proposes an intelligibility-guided OA method, where fusion weights are derived from intelligibility estimates obtained directly from the backend ASR. Unlike prior OA methods based on trained neural predictors, the proposed method is training-free, reducing complexity and enhances generalization. Extensive experiments across diverse SE-ASR combinations and datasets demonstrate strong robustness and improvements over existing OA baselines. Additional analyses of intelligibility-guided switching-based alternatives and frame versus utterance-level OA further validate the proposed design.

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD)

Cite as: arXiv:2602.20967 [eess.AS] (or arXiv:2602.20967v1 [eess.AS] for this version)

https://doi.org/10.48550/arXiv.2602.20967 (arXiv-issued DOI via DataCite, pending registration)

Submission history: From Haoyang Li. [v1] Tue, 24 Feb 2026 14:46:54 UTC (436 KB)
