Advancing Speech Processing with Efficient, Robust, and Data-Scarce Neural Network Architectures

Lokeshkiran P, Karthikeyan S.

Abstract

Speech processing has become a cornerstone of modern technology, enabling applications such as virtual assistants, real-time transcription, and language translation. Despite these advances, significant challenges continue to hinder the effectiveness and scalability of speech systems. One primary issue is the high computational demand of current neural network models, which limits their deployment on resource-constrained devices such as mobile phones and edge computing platforms and makes real-time speech processing difficult in many practical scenarios. Another critical issue is the linguistic bias inherent in many speech processing models: most current systems are trained predominantly on high-resource languages and therefore perform poorly on underrepresented languages and dialects. This creates a digital divide, leaving a significant portion of the global population without access to reliable speech technologies. Speech processing systems also often lack robustness in noisy environments, where background noise and overlapping speech degrade recognition accuracy and further limit real-world applicability. Furthermore, neural network-based speech models require vast amounts of labeled data to achieve high performance, and such data is often unavailable for low-resource languages; this scarcity is a barrier to developing inclusive systems that serve diverse linguistic contexts. This research addresses these challenges by developing efficient neural network architectures, enhancing robustness under noisy conditions, and exploring data-efficient training strategies. By improving performance in resource-constrained settings and enhancing linguistic inclusivity, this work seeks to advance speech processing technologies, making them more reliable and accessible to a global user base.
