CRIS

Permanent URI for this community: https://scripta.up.edu.mx/handle/20.500.12552/1

  • Item type: Publication
    Deploying Real-Time Speech Recognition on ESP32 Using TinyML and Edge Impulse
    (Springer Nature Switzerland, 2025)
    Gutiérrez, Sebastián
    The emergence of Tiny Machine Learning (TinyML) has enabled real-time on-device inference on ultra-low-power microcontrollers, eliminating reliance on cloud computing while significantly reducing latency, power consumption, and bandwidth requirements. This study explores the deployment of a TinyML-based speech recognition system on an ESP32 microcontroller, using Edge Impulse for model development, Mel-Frequency Cepstral Coefficients (MFCCs) for feature extraction, and TensorFlow Lite for Microcontrollers (TFLM) for efficient inference. The model was trained on a curated subset of the Google Speech Commands Dataset, with background noise augmentation to improve robustness in real-world environments. Using Edge Impulse’s EON Compiler, the model was fully quantized and optimized, reducing RAM usage by 37% and ROM usage by 27%. The final model attained 87.14% accuracy on testing data and 97.1% average classification confidence during real-time inference, with 99.6% noise rejection and a latency of 266 ms. Compared to state-of-the-art systems deployed on more powerful platforms, the proposed approach achieves competitive accuracy while maintaining real-time inference and minimal resource consumption on ultra-low-power hardware. This makes it particularly suitable for battery-powered IoT, robotics, and embedded automation applications where connectivity and energy efficiency are critical. By balancing performance and efficiency, this research highlights the viability of deploying speech recognition systems on constrained microcontrollers. Future work will explore advanced architectures and enhanced feature extraction strategies to further improve recognition accuracy, especially for short or phonetically similar commands. © The authors © Springer.
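
    The abstract mentions background noise augmentation but does not say how it was applied. As a rough illustration only, the sketch below shows one common way to mix a noise clip into a speech clip at a chosen signal-to-noise ratio. The function names (`rms`, `mix_at_snr`), the 16 kHz sample rate, the synthetic tone and Gaussian noise, and the 10 dB target are all assumptions made for this example. None of them come from the paper or from Edge Impulse.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mix has the requested SNR in dB.

    Hypothetical helper for illustration; not the paper's augmentation code.
    """
    if len(noise) < len(speech):
        raise ValueError("noise clip must be at least as long as the speech clip")
    noise = noise[:len(speech)]
    speech_rms, noise_rms = rms(speech), rms(noise)
    if noise_rms == 0.0:
        return list(speech)  # silent noise clip: nothing to add
    # SNR_dB = 20 * log10(speech_rms / scaled_noise_rms), solved for the noise level.
    target_noise_rms = speech_rms / (10 ** (snr_db / 20.0))
    gain = target_noise_rms / noise_rms
    return [s + gain * n for s, n in zip(speech, noise)]


# Synthetic stand-ins (assumed): a 440 Hz tone as "speech", Gaussian noise as background.
rng = random.Random(0)
sample_rate = 16000  # assumed sample rate for a one-second clip
speech = [0.5 * math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 0.1) for _ in range(sample_rate)]
noisy = mix_at_snr(speech, noise, snr_db=10.0)
```

    In a training pipeline, a helper like this would typically be applied with a randomly drawn SNR per clip, so the model sees a range of noise levels instead of a single fixed one.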