This project is built on Espressif’s ESP-IDF.
This project is an open-source initiative primarily for educational purposes. We hope that through this project, we can help more people get started with AI hardware development and understand how to apply rapidly developing large language models to actual hardware devices. Whether you are a student interested in AI or a developer looking to explore new technologies, you can gain valuable learning experience from this project.
Everyone is welcome to contribute to the development and improvement of the project. If you have ideas or suggestions, please open an issue or join the group chat.
Learning and Communication QQ Group: 946599635
Implemented features:
- Wi-Fi configuration
- BOOT key wake-up and interruption
- Offline voice wake-up (Espressif solution)
- Streaming voice dialogue (WebSocket or UDP protocol)
- Speech recognition in 5 languages: Mandarin, Cantonese, English, Japanese, Korean (SenseVoice solution)
- Voiceprint recognition to identify who is calling the AI’s name (3D Speaker project)
- Large-model TTS (Volcano Engine and CosyVoice solution)
- Configurable prompts and tones (custom roles)
- Large language model: Qwen2.5 72B or the Doubao API
- Self-summary after each dialogue round to generate memory
- LCD display extension that shows signal strength
- ML307 Cat.1 4G module support
