forked from xiaozhi/xiaozhi-esp32
Compare commits
7 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | e59be04394 | | |
| | d26e8d25ff | | |
| | 8e9be5abc7 | | |
| | 7fd72aa8e2 | | |
| | 0396b4a91c | | |
| | 53b08843d4 | | |
| | 797f9c2515 | | |
@@ -4,7 +4,7 @@
# CMakeLists in this exact order for cmake to work correctly
cmake_minimum_required(VERSION 3.16)

set(PROJECT_VER "0.2.0")
set(PROJECT_VER "0.3.0")

include($ENV{IDF_PATH}/tools/cmake/project.cmake)
project(xiaozhi)

67
README.md
@@ -2,50 +2,41 @@
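
BiliBili video introduction: [Build your AI chat companion with ESP32 + SenseVoice + Qwen72B!](https://www.bilibili.com/video/BV11msTenEH3/?share_source=copy_web&vd_source=ee1aafe19d6e60cf22e60a93881faeba)

This is Xia Ge's (虾哥) first hardware project.

## Project goals

This project is built on Espressif's ESP-IDF.

This is an open-source project intended mainly for teaching. We hope it helps more people get started with AI hardware development and understand how to apply today's rapidly evolving large language models to real hardware devices. Whether you are a student interested in AI or a developer exploring new technology, you can gain valuable learning experience from this project.

Everyone is welcome to take part in developing and improving the project. If you have any ideas or suggestions, feel free to open an issue or join the group chat.

QQ group for learning and discussion: 946599635

## Implemented features

- Wi-Fi provisioning
- Offline voice wake-up (using Espressif's solution)
- Streaming voice conversation (WebSocket protocol)
- Speech recognition in 5 languages: Mandarin, Cantonese, English, Japanese, and Korean (using SenseVoice)
- Voiceprint recognition (identifies who is calling the AI's name, [3D Speaker project](https://github.com/modelscope/3D-Speaker))
- Large-model TTS (Volcano Engine; Alibaba Cloud integration in progress)
- Configurable prompts and voices (custom characters)
- Free access to Qwen2.5 72B and Doubao models (limited by performance and quota; usage may be capped as the number of users grows)
- Self-summarization after each conversation round to build a memory
- LCD display extension showing signal strength (Chinese subtitles can be added later)
- Support for the ML307 Cat.1 4G module (optional)
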
## Hardware

### Hardware needed for DIY
To make collaboration easier, all hardware documentation currently lives in a Feishu document:

- Development board: ESP32-S3-DevKitC-1
- Microphone: INMP441
- Amplifier: MAX98357
- Speaker: 8Ω 3W
- 400-hole breadboard × 2
- Jumper wires
[Xiaozhi AI Chatbot Encyclopedia (《小智 AI 聊天机器人百科全书》)](https://ccnphfhqs21z.feishu.cn/wiki/F5krwD16viZoF0kKkvDcrZNYnhb?from=from_copylink)

The second-version wiring diagram is shown below:

### GPIO wiring guide

The default wiring is listed below. If your wiring differs from the default, update the project configuration to match.

![](docs/wiring.jpg)

Note that the GND and VIN wires of the MAX98357 are hidden beneath the component. Do not swap VDD and GND on the INMP441, or the microphone will burn out.

#### MAX98357 amplifier

```
LRC -> GPIO 4
BCLK -> GPIO 5
DIN -> GPIO 6
GAIN -> GND (if the volume is too loud, connect GAIN to 3.3V)
SD -> 3.3V
GND -> GND
VIN -> 3.3V or 5V (if your speaker needs 5V, connect VIN to 5V)
```

#### INMP441 microphone

```
L/R -> GND
WS -> GPIO 10
SCK -> GPIO 11
SD -> GPIO 3
VDD -> 3.3V
GND -> GND
```
![](docs/wiring2.jpg)

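## Firmware

@@ -73,7 +64,7 @@ GND -> GND
- After configuration is complete, build the firmware


## Configure Wi-Fi
## Configure Wi-Fi (skip for the 4G version)

Wire the board as described above and flash the firmware. After power-on, the RGB LED on the development board blinks blue (on some boards the RGB LED only lights up after its switch pads are soldered) and the device enters provisioning mode.
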
BIN
docs/wiring2.jpg
Normal file
Binary file not shown.
After size: 72 KiB
@@ -1,6 +1,10 @@
|
||||
#include "Application.h"
|
||||
#include "BuiltinLed.h"
|
||||
#include "TlsTransport.h"
|
||||
#include "Ml307SslTransport.h"
|
||||
#include "WifiConfigurationAp.h"
|
||||
#include "WifiStation.h"
|
||||
|
||||
#include <cstring>
|
||||
#include "esp_log.h"
|
||||
#include "model_path.h"
|
||||
@@ -11,7 +15,19 @@
|
||||
#define TAG "Application"
|
||||
|
||||
|
||||
Application::Application() {
|
||||
Application::Application()
|
||||
#ifdef CONFIG_USE_ML307
|
||||
: ml307_at_modem_(CONFIG_ML307_TX_PIN, CONFIG_ML307_RX_PIN, 4096),
|
||||
http_(ml307_at_modem_),
|
||||
firmware_upgrade_(http_)
|
||||
#else
|
||||
: http_(),
|
||||
firmware_upgrade_(http_)
|
||||
#endif
|
||||
#ifdef CONFIG_USE_DISPLAY
|
||||
, display_(CONFIG_DISPLAY_SDA_PIN, CONFIG_DISPLAY_SCL_PIN)
|
||||
#endif
|
||||
{
|
||||
event_group_ = xEventGroupCreate();
|
||||
audio_encode_queue_ = xQueueCreate(100, sizeof(iovec));
|
||||
audio_decode_queue_ = xQueueCreate(100, sizeof(AudioPacket*));
|
||||
@@ -21,8 +37,6 @@ Application::Application() {
|
||||
ESP_LOGI(TAG, "Model %d: %s", i, models->model_name[i]);
|
||||
if (strstr(models->model_name[i], ESP_WN_PREFIX) != NULL) {
|
||||
wakenet_model_ = models->model_name[i];
|
||||
} else if (strstr(models->model_name[i], ESP_NSNET_PREFIX) != NULL) {
|
||||
nsnet_model_ = models->model_name[i];
|
||||
}
|
||||
}
|
||||
|
||||
@@ -31,6 +45,10 @@ Application::Application() {
|
||||
if (opus_decode_sample_rate_ != CONFIG_AUDIO_OUTPUT_SAMPLE_RATE) {
|
||||
opus_resampler_.Configure(opus_decode_sample_rate_, CONFIG_AUDIO_OUTPUT_SAMPLE_RATE);
|
||||
}
|
||||
|
||||
firmware_upgrade_.SetCheckVersionUrl(CONFIG_OTA_VERSION_URL);
|
||||
firmware_upgrade_.SetHeader("Device-Id", SystemInfo::GetMacAddress().c_str());
|
||||
firmware_upgrade_.SetPostData(SystemInfo::GetJsonString());
|
||||
}
|
||||
|
||||
Application::~Application() {
|
||||
@@ -48,9 +66,6 @@ Application::~Application() {
|
||||
for (auto& pcm : wake_word_pcm_) {
|
||||
free(pcm.iov_base);
|
||||
}
|
||||
for (auto& opus : wake_word_opus_) {
|
||||
free(opus.iov_base);
|
||||
}
|
||||
|
||||
if (opus_decoder_ != nullptr) {
|
||||
opus_decoder_destroy(opus_decoder_);
|
||||
@@ -67,7 +82,128 @@ Application::~Application() {
|
||||
vEventGroupDelete(event_group_);
|
||||
}
|
||||
|
||||
void Application::CheckNewVersion() {
|
||||
// Check if there is a new firmware version available
|
||||
firmware_upgrade_.CheckVersion();
|
||||
if (firmware_upgrade_.HasNewVersion()) {
|
||||
// Wait for the chat state to be idle
|
||||
while (chat_state_ != kChatStateIdle) {
|
||||
vTaskDelay(100);
|
||||
}
|
||||
SetChatState(kChatStateUpgrading);
|
||||
firmware_upgrade_.StartUpgrade([this](int progress, size_t speed) {
|
||||
#ifdef CONFIG_USE_DISPLAY
|
||||
char buffer[64];
|
||||
snprintf(buffer, sizeof(buffer), "Upgrading...\n %d%% %zuKB/s", progress, speed / 1024);
|
||||
display_.SetText(buffer);
|
||||
#endif
|
||||
});
|
||||
// If upgrade success, the device will reboot and never reach here
|
||||
ESP_LOGI(TAG, "Firmware upgrade failed...");
|
||||
SetChatState(kChatStateIdle);
|
||||
} else {
|
||||
firmware_upgrade_.MarkCurrentVersionValid();
|
||||
}
|
||||
}
|
||||
|
||||
#ifdef CONFIG_USE_DISPLAY

#ifdef CONFIG_USE_ML307
static std::string csq_to_string(int csq) {
    if (csq == -1) {
        return "No network";
    } else if (csq >= 0 && csq <= 9) {
        return "Very bad";
    } else if (csq >= 10 && csq <= 14) {
        return "Bad";
    } else if (csq >= 15 && csq <= 19) {
        return "Fair";
    } else if (csq >= 20 && csq <= 24) {
        return "Good";
    } else if (csq >= 25 && csq <= 31) {
        return "Very good";
    }
    return "Invalid";
}
#else
static std::string rssi_to_string(int rssi) {
    if (rssi >= -55) {
        return "Very good";
    } else if (rssi >= -65) {
        return "Good";
    } else if (rssi >= -75) {
        return "Fair";
    } else if (rssi >= -85) {
        return "Poor";
    } else {
        return "No network";
    }
}
#endif
|
||||
|
||||
void Application::UpdateDisplay() {
|
||||
while (true) {
|
||||
if (chat_state_ == kChatStateIdle) {
|
||||
#ifdef CONFIG_USE_ML307
|
||||
std::string network_name = ml307_at_modem_.GetCarrierName();
|
||||
int signal_quality = ml307_at_modem_.GetCsq();
|
||||
if (signal_quality == -1) {
|
||||
network_name = "No network";
|
||||
} else {
|
||||
ESP_LOGI(TAG, "%s CSQ: %d", network_name.c_str(), signal_quality);
|
||||
display_.SetText(network_name + "\n" + csq_to_string(signal_quality) + " (" + std::to_string(signal_quality) + ")");
|
||||
}
|
||||
#else
|
||||
auto& wifi_station = WifiStation::GetInstance();
|
||||
int8_t rssi = wifi_station.GetRssi();
|
||||
display_.SetText(wifi_station.GetSsid() + "\n" + rssi_to_string(rssi) + " (" + std::to_string(rssi) + ")");
|
||||
#endif
|
||||
}
|
||||
vTaskDelay(pdMS_TO_TICKS(10 * 1000));
|
||||
}
|
||||
}
|
||||
#endif
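
Reviewer note: the `csq_to_string` helper added in this hunk maps the modem's CSQ value to a label through a chain of range checks. A table-driven variant is one possible alternative, sketched below. The names `CsqBand`, `kCsqBands`, `CsqLabel` and `CsqToDbm` are hypothetical and do not appear in the commit; only the thresholds and the `-1` "no network" sentinel are taken from the diff. The dBm conversion assumes the usual `AT+CSQ` scale from 3GPP TS 27.007, where 0 means -113 dBm or less and 31 means -51 dBm or greater.

```cpp
#include <array>
#include <optional>
#include <string>

// Hypothetical helper types; only the thresholds mirror csq_to_string in the diff.
struct CsqBand {
    int max_csq;        // inclusive upper bound of the band
    const char* label;  // text shown on the display
};

// Bands in ascending order, matching the ranges used by csq_to_string.
constexpr std::array<CsqBand, 5> kCsqBands = {{
    {9, "Very bad"},
    {14, "Bad"},
    {19, "Fair"},
    {24, "Good"},
    {31, "Very good"},
}};

std::string CsqLabel(int csq) {
    if (csq == -1) return "No network";  // sentinel used in the diff
    if (csq < 0) return "Invalid";
    for (const auto& band : kCsqBands) {
        if (csq <= band.max_csq) return band.label;
    }
    return "Invalid";  // anything above 31, e.g. 99 (unknown)
}

// Approximate signal level in dBm for CSQ values 0..31 (assumed AT+CSQ scale).
std::optional<int> CsqToDbm(int csq) {
    if (csq < 0 || csq > 31) return std::nullopt;
    return -113 + 2 * csq;
}
```

With this shape, changing a threshold means editing one table row, and `CsqToDbm(20)` yields -73, which could be shown next to the label instead of the raw CSQ number.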
|
||||
|
||||
void Application::Start() {
|
||||
auto& builtin_led = BuiltinLed::GetInstance();
|
||||
#ifdef CONFIG_USE_ML307
|
||||
builtin_led.SetBlue();
|
||||
builtin_led.StartContinuousBlink(100);
|
||||
ml307_at_modem_.SetDebug(false);
|
||||
ml307_at_modem_.SetBaudRate(921600);
|
||||
// Print the ML307 modem information
|
||||
std::string module_name = ml307_at_modem_.GetModuleName();
|
||||
ESP_LOGI(TAG, "ML307 Module: %s", module_name.c_str());
|
||||
#ifdef CONFIG_USE_DISPLAY
|
||||
display_.SetText(std::string("Wait for network\n") + module_name);
|
||||
#endif
|
||||
ml307_at_modem_.ResetConnections();
|
||||
ml307_at_modem_.WaitForNetworkReady();
|
||||
|
||||
ESP_LOGI(TAG, "ML307 IMEI: %s", ml307_at_modem_.GetImei().c_str());
|
||||
ESP_LOGI(TAG, "ML307 ICCID: %s", ml307_at_modem_.GetIccid().c_str());
|
||||
#else
|
||||
// Try to connect to WiFi, if failed, launch the WiFi configuration AP
|
||||
auto& wifi_station = WifiStation::GetInstance();
|
||||
#ifdef CONFIG_USE_DISPLAY
|
||||
display_.SetText(std::string("Connect to WiFi\n") + wifi_station.GetSsid());
|
||||
#endif
|
||||
builtin_led.SetBlue();
|
||||
builtin_led.StartContinuousBlink(100);
|
||||
wifi_station.Start();
|
||||
if (!wifi_station.IsConnected()) {
|
||||
builtin_led.SetBlue();
|
||||
builtin_led.Blink(1000, 500);
|
||||
auto& wifi_ap = WifiConfigurationAp::GetInstance();
|
||||
wifi_ap.SetSsidPrefix("Xiaozhi");
|
||||
#ifdef CONFIG_USE_DISPLAY
|
||||
display_.SetText(wifi_ap.GetSsid() + "\n" + wifi_ap.GetWebServerUrl());
|
||||
#endif
|
||||
wifi_ap.Start();
|
||||
return;
|
||||
}
|
||||
#endif
|
||||
|
||||
// Initialize the audio device
|
||||
audio_device_.Start(CONFIG_AUDIO_INPUT_SAMPLE_RATE, CONFIG_AUDIO_OUTPUT_SAMPLE_RATE);
|
||||
audio_device_.OnStateChanged([this]() {
|
||||
@@ -89,31 +225,15 @@ void Application::Start() {
|
||||
xTaskCreateStatic([](void* arg) {
|
||||
Application* app = (Application*)arg;
|
||||
app->AudioEncodeTask();
|
||||
vTaskDelete(NULL);
|
||||
}, "opus_encode", opus_stack_size, this, 1, audio_encode_task_stack_, &audio_encode_task_buffer_);
|
||||
audio_decode_task_stack_ = (StackType_t*)malloc(opus_stack_size);
|
||||
xTaskCreateStatic([](void* arg) {
|
||||
Application* app = (Application*)arg;
|
||||
app->AudioDecodeTask();
|
||||
vTaskDelete(NULL);
|
||||
}, "opus_decode", opus_stack_size, this, 1, audio_decode_task_stack_, &audio_decode_task_buffer_);
|
||||
|
||||
auto& builtin_led = BuiltinLed::GetInstance();
|
||||
// Blink the LED to indicate the device is connecting
|
||||
builtin_led.SetBlue();
|
||||
builtin_led.BlinkOnce();
|
||||
WifiStation::GetInstance().Start();
|
||||
|
||||
// Check if there is a new firmware version available
|
||||
firmware_upgrade_.CheckVersion();
|
||||
if (firmware_upgrade_.HasNewVersion()) {
|
||||
builtin_led.TurnOn();
|
||||
firmware_upgrade_.StartUpgrade();
|
||||
// If upgrade success, the device will reboot and never reach here
|
||||
ESP_LOGI(TAG, "Firmware upgrade failed...");
|
||||
builtin_led.TurnOff();
|
||||
} else {
|
||||
firmware_upgrade_.MarkValid();
|
||||
}
|
||||
|
||||
StartCommunication();
|
||||
StartDetection();
|
||||
|
||||
@@ -121,44 +241,69 @@ void Application::Start() {
|
||||
builtin_led.SetGreen();
|
||||
builtin_led.BlinkOnce();
|
||||
xEventGroupSetBits(event_group_, DETECTION_RUNNING);
|
||||
|
||||
// Launch a task to check for new firmware version
|
||||
xTaskCreate([](void* arg) {
|
||||
Application* app = (Application*)arg;
|
||||
app->CheckNewVersion();
|
||||
vTaskDelete(NULL);
|
||||
}, "check_new_version", 4096 * 2, this, 1, NULL);
|
||||
|
||||
#ifdef CONFIG_USE_DISPLAY
|
||||
// Launch a task to update the display
|
||||
xTaskCreate([](void* arg) {
|
||||
Application* app = (Application*)arg;
|
||||
app->UpdateDisplay();
|
||||
vTaskDelete(NULL);
|
||||
}, "update_display", 4096, this, 1, NULL);
|
||||
#endif
|
||||
}
|
||||
|
||||
void Application::SetChatState(ChatState state) {
|
||||
auto& builtin_led = BuiltinLed::GetInstance();
|
||||
const char* state_str[] = {
|
||||
"idle",
|
||||
"connecting",
|
||||
"listening",
|
||||
"speaking",
|
||||
"wake_word_detected",
|
||||
"testing",
|
||||
"upgrading",
|
||||
"unknown"
|
||||
};
|
||||
chat_state_ = state;
|
||||
ESP_LOGI(TAG, "STATE: %s", state_str[chat_state_]);
|
||||
|
||||
auto& builtin_led = BuiltinLed::GetInstance();
|
||||
switch (chat_state_) {
|
||||
case kChatStateIdle:
|
||||
ESP_LOGI(TAG, "Chat state: idle");
|
||||
builtin_led.TurnOff();
|
||||
break;
|
||||
case kChatStateConnecting:
|
||||
ESP_LOGI(TAG, "Chat state: connecting");
|
||||
builtin_led.SetBlue();
|
||||
builtin_led.TurnOn();
|
||||
break;
|
||||
case kChatStateListening:
|
||||
ESP_LOGI(TAG, "Chat state: listening");
|
||||
builtin_led.SetRed();
|
||||
builtin_led.TurnOn();
|
||||
break;
|
||||
case kChatStateSpeaking:
|
||||
ESP_LOGI(TAG, "Chat state: speaking");
|
||||
builtin_led.SetGreen();
|
||||
builtin_led.TurnOn();
|
||||
break;
|
||||
case kChatStateWakeWordDetected:
|
||||
ESP_LOGI(TAG, "Chat state: wake word detected");
|
||||
builtin_led.SetBlue();
|
||||
builtin_led.TurnOn();
|
||||
break;
|
||||
case kChatStateTesting:
|
||||
ESP_LOGI(TAG, "Chat state: testing");
|
||||
builtin_led.SetRed();
|
||||
builtin_led.TurnOn();
|
||||
break;
|
||||
case kChatStateUpgrading:
|
||||
builtin_led.SetGreen();
|
||||
builtin_led.StartContinuousBlink(100);
|
||||
break;
|
||||
}
|
||||
|
||||
const char* state_str[] = { "idle", "connecting", "listening", "speaking", "wake_word_detected", "testing", "unknown" };
|
||||
std::lock_guard<std::recursive_mutex> lock(mutex_);
|
||||
if (ws_client_ && ws_client_->IsConnected()) {
|
||||
cJSON* root = cJSON_CreateObject();
|
||||
@@ -175,7 +320,7 @@ void Application::StartCommunication() {
|
||||
afe_config_t afe_config = {
|
||||
.aec_init = false,
|
||||
.se_init = true,
|
||||
.vad_init = false,
|
||||
.vad_init = true,
|
||||
.wakenet_init = false,
|
||||
.voice_communication_init = true,
|
||||
.voice_communication_agc_init = true,
|
||||
@@ -209,6 +354,7 @@ void Application::StartCommunication() {
|
||||
xTaskCreate([](void* arg) {
|
||||
Application* app = (Application*)arg;
|
||||
app->AudioCommunicationTask();
|
||||
vTaskDelete(NULL);
|
||||
}, "audio_communication", 4096 * 2, this, 5, NULL);
|
||||
}
|
||||
|
||||
@@ -216,7 +362,7 @@ void Application::StartDetection() {
|
||||
afe_config_t afe_config = {
|
||||
.aec_init = false,
|
||||
.se_init = true,
|
||||
.vad_init = false,
|
||||
.vad_init = true,
|
||||
.wakenet_init = true,
|
||||
.voice_communication_init = false,
|
||||
.voice_communication_agc_init = false,
|
||||
@@ -249,11 +395,13 @@ void Application::StartDetection() {
|
||||
xTaskCreate([](void* arg) {
|
||||
Application* app = (Application*)arg;
|
||||
app->AudioFeedTask();
|
||||
vTaskDelete(NULL);
|
||||
}, "audio_feed", 4096 * 2, this, 5, NULL);
|
||||
|
||||
xTaskCreate([](void* arg) {
|
||||
Application* app = (Application*)arg;
|
||||
app->AudioDetectionTask();
|
||||
vTaskDelete(NULL);
|
||||
}, "audio_detection", 4096 * 2, this, 5, NULL);
|
||||
}
|
||||
|
||||
@@ -277,22 +425,21 @@ void Application::AudioFeedTask() {
|
||||
}
|
||||
|
||||
void Application::StoreWakeWordData(uint8_t* data, size_t size) {
|
||||
// store audio data to detect_packets_
|
||||
// store audio data to wake_word_pcm_
|
||||
auto iov = (iovec){
|
||||
.iov_base = heap_caps_malloc(size, MALLOC_CAP_SPIRAM),
|
||||
.iov_len = size
|
||||
};
|
||||
memcpy(iov.iov_base, data, size);
|
||||
wake_word_pcm_.push_back(iov);
|
||||
// remove the oldest packet if the size is larger than 50, about 2 seconds
|
||||
if (wake_word_pcm_.size() > 50) {
|
||||
// keep about 2 seconds of data, detect duration is 32ms (sample_rate == 16000, chunksize == 512)
|
||||
while (wake_word_pcm_.size() > 2000 / 32) {
|
||||
heap_caps_free(wake_word_pcm_.front().iov_base);
|
||||
wake_word_pcm_.pop_front();
|
||||
}
|
||||
}
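
Reviewer note: the new trimming loop in `StoreWakeWordData` keeps `2000 / 32` chunks. With 512 samples per chunk at 16000 Hz, each chunk lasts 512 / 16000 s = 32 ms, so integer division gives 62 chunks, or about 1.98 seconds of audio. The sketch below shows the same bounded-history idea in isolation. `WakeWordHistory` is a hypothetical class, it stores chunks in `std::vector` on the default heap rather than in SPIRAM via `heap_caps_malloc` as the commit does, and it assumes the chunk duration is at least 1 ms.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical class; mirrors the "keep about N ms" trimming loop in the diff.
class WakeWordHistory {
public:
    // chunk duration in ms = chunk_samples * 1000 / sample_rate (32 ms for 512 @ 16 kHz)
    WakeWordHistory(int sample_rate, int chunk_samples, int keep_ms)
        : max_chunks_(static_cast<std::size_t>(
              keep_ms / (chunk_samples * 1000 / sample_rate))) {}

    // Copies one PCM chunk in and drops the oldest chunks beyond the limit.
    void Push(const int16_t* data, std::size_t samples) {
        chunks_.emplace_back(data, data + samples);
        while (chunks_.size() > max_chunks_) {
            chunks_.pop_front();
        }
    }

    std::size_t size() const { return chunks_.size(); }
    std::size_t max_chunks() const { return max_chunks_; }

private:
    std::size_t max_chunks_;
    std::deque<std::vector<int16_t>> chunks_;
};
```

Constructing `WakeWordHistory(16000, 512, 2000)` gives a limit of 62 chunks, matching the expression in the diff, and owning the samples in a container removes the need for the paired `heap_caps_free` calls.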
|
||||
|
||||
void Application::EncodeWakeWordData() {
|
||||
wake_word_opus_.clear();
|
||||
if (wake_word_encode_task_stack_ == nullptr) {
|
||||
wake_word_encode_task_stack_ = (StackType_t*)malloc(4096 * 8);
|
||||
}
|
||||
@@ -302,37 +449,55 @@ void Application::EncodeWakeWordData() {
|
||||
// encode detect packets
|
||||
OpusEncoder* encoder = new OpusEncoder();
|
||||
encoder->Configure(CONFIG_AUDIO_INPUT_SAMPLE_RATE, 1, 60);
|
||||
encoder->SetComplexity(2);
|
||||
encoder->SetComplexity(0);
|
||||
app->wake_word_opus_.resize(4096 * 4);
|
||||
size_t offset = 0;
|
||||
|
||||
for (auto& pcm: app->wake_word_pcm_) {
|
||||
encoder->Encode(pcm, [app](const iovec opus) {
|
||||
iovec iov = {
|
||||
.iov_base = heap_caps_malloc(opus.iov_len, MALLOC_CAP_SPIRAM),
|
||||
.iov_len = opus.iov_len
|
||||
};
|
||||
memcpy(iov.iov_base, opus.iov_base, opus.iov_len);
|
||||
app->wake_word_opus_.push_back(iov);
|
||||
encoder->Encode(pcm, [app, &offset](const iovec opus) {
|
||||
size_t protocol_size = sizeof(BinaryProtocol) + opus.iov_len;
|
||||
if (offset + protocol_size < app->wake_word_opus_.size()) {
|
||||
auto protocol = (BinaryProtocol*)(&app->wake_word_opus_[offset]);
|
||||
protocol->version = htons(PROTOCOL_VERSION);
|
||||
protocol->type = htons(0);
|
||||
protocol->reserved = 0;
|
||||
protocol->timestamp = htonl(app->audio_device_.playing() ? app->audio_device_.last_timestamp() : 0);
|
||||
protocol->payload_size = htonl(opus.iov_len);
|
||||
memcpy(protocol->payload, opus.iov_base, opus.iov_len);
|
||||
offset += protocol_size;
|
||||
}
|
||||
});
|
||||
heap_caps_free(pcm.iov_base);
|
||||
}
|
||||
app->wake_word_pcm_.clear();
|
||||
app->wake_word_opus_.resize(offset);
|
||||
|
||||
auto end_time = esp_timer_get_time();
|
||||
ESP_LOGI(TAG, "Encode wake word data opus packets: %d in %lld ms", app->wake_word_opus_.size(), (end_time - start_time) / 1000);
|
||||
xEventGroupSetBits(app->event_group_, DETECT_PACKETS_ENCODED);
|
||||
ESP_LOGI(TAG, "Encode wake word opus: %zu bytes in %lld ms", app->wake_word_opus_.size(), (end_time - start_time) / 1000);
|
||||
xEventGroupSetBits(app->event_group_, WAKE_WORD_ENCODED);
|
||||
delete encoder;
|
||||
vTaskDelete(NULL);
|
||||
}, "encode_detect_packets", 4096 * 8, this, 1, wake_word_encode_task_stack_, &wake_word_encode_task_buffer_);
|
||||
}
|
||||
|
||||
void Application::SendWakeWordData() {
|
||||
for (auto& opus: wake_word_opus_) {
|
||||
ws_client_->Send(opus.iov_base, opus.iov_len, true);
|
||||
heap_caps_free(opus.iov_base);
|
||||
}
|
||||
ws_client_->Send(wake_word_opus_.data(), wake_word_opus_.size(), true);
|
||||
wake_word_opus_.clear();
|
||||
}
|
||||
|
||||
BinaryProtocol* Application::AllocateBinaryProtocol(void* payload, size_t payload_size) {
|
||||
auto last_timestamp = audio_device_.playing() ? audio_device_.last_timestamp() : 0;
|
||||
auto protocol = (BinaryProtocol*)heap_caps_malloc(sizeof(BinaryProtocol) + payload_size, MALLOC_CAP_SPIRAM);
|
||||
protocol->version = htons(PROTOCOL_VERSION);
|
||||
protocol->type = htons(0);
|
||||
protocol->reserved = 0;
|
||||
protocol->timestamp = htonl(last_timestamp);
|
||||
protocol->payload_size = htonl(payload_size);
|
||||
assert(sizeof(BinaryProtocol) == 16);
|
||||
memcpy(protocol->payload, payload, payload_size);
|
||||
return protocol;
|
||||
}
|
||||
|
||||
void Application::CheckTestButton() {
|
||||
if (gpio_get_level(GPIO_NUM_1) == 0) {
|
||||
if (chat_state_ == kChatStateIdle) {
|
||||
@@ -385,7 +550,7 @@ void Application::AudioDetectionTask() {
|
||||
|
||||
auto res = esp_afe_sr_v1.fetch(afe_detection_data_);
|
||||
if (res == nullptr || res->ret_value == ESP_FAIL) {
|
||||
ESP_LOGE(TAG, "Error in fetch");
|
||||
ESP_LOGE(TAG, "Error in AudioDetectionTask");
|
||||
if (res != nullptr) {
|
||||
ESP_LOGI(TAG, "Error code: %d", res->ret_value);
|
||||
}
|
||||
@@ -397,16 +562,23 @@ void Application::AudioDetectionTask() {
|
||||
|
||||
CheckTestButton();
|
||||
if (chat_state_ == kChatStateTesting) {
|
||||
iovec iov = {
|
||||
.iov_base = heap_caps_malloc(res->data_size, MALLOC_CAP_SPIRAM),
|
||||
.iov_len = (size_t)res->data_size
|
||||
};
|
||||
memcpy(iov.iov_base, res->data, res->data_size);
|
||||
test_pcm_.push_back(iov);
|
||||
auto& builtin_led = BuiltinLed::GetInstance();
|
||||
if (res->vad_state == AFE_VAD_SPEECH) {
|
||||
iovec iov = {
|
||||
.iov_base = heap_caps_malloc(res->data_size, MALLOC_CAP_SPIRAM),
|
||||
.iov_len = (size_t)res->data_size
|
||||
};
|
||||
memcpy(iov.iov_base, res->data, res->data_size);
|
||||
test_pcm_.push_back(iov);
|
||||
builtin_led.SetRed(128);
|
||||
} else {
|
||||
builtin_led.SetRed(32);
|
||||
}
|
||||
builtin_led.TurnOn();
|
||||
continue;
|
||||
}
|
||||
|
||||
if (res->wakeup_state == WAKENET_DETECTED) {
|
||||
if (chat_state_ == kChatStateIdle && res->wakeup_state == WAKENET_DETECTED) {
|
||||
xEventGroupClearBits(event_group_, DETECTION_RUNNING);
|
||||
SetChatState(kChatStateConnecting);
|
||||
|
||||
@@ -416,7 +588,7 @@ void Application::AudioDetectionTask() {
|
||||
StartWebSocketClient();
|
||||
|
||||
// Here the websocket is done, and we also wait for the wake word data to be encoded
|
||||
xEventGroupWaitBits(event_group_, DETECT_PACKETS_ENCODED, pdTRUE, pdTRUE, portMAX_DELAY);
|
||||
xEventGroupWaitBits(event_group_, WAKE_WORD_ENCODED, pdTRUE, pdTRUE, portMAX_DELAY);
|
||||
|
||||
std::lock_guard<std::recursive_mutex> lock(mutex_);
|
||||
if (ws_client_ && ws_client_->IsConnected()) {
|
||||
@@ -427,8 +599,7 @@ void Application::AudioDetectionTask() {
|
||||
opus_encoder_.ResetState();
|
||||
// If connected, the hello message is already sent, so we can start communication
|
||||
xEventGroupSetBits(event_group_, COMMUNICATION_RUNNING);
|
||||
|
||||
ESP_LOGI(TAG, "Start communication after wake word detected");
|
||||
ESP_LOGI(TAG, "Communication running");
|
||||
} else {
|
||||
SetChatState(kChatStateIdle);
|
||||
xEventGroupSetBits(event_group_, DETECTION_RUNNING);
|
||||
@@ -446,7 +617,7 @@ void Application::AudioCommunicationTask() {
|
||||
|
||||
auto res = esp_afe_vc_v1.fetch(afe_communication_data_);
|
||||
if (res == nullptr || res->ret_value == ESP_FAIL) {
|
||||
ESP_LOGE(TAG, "Error in fetch");
|
||||
ESP_LOGE(TAG, "Error in AudioCommunicationTask");
|
||||
if (res != nullptr) {
|
||||
ESP_LOGI(TAG, "Error code: %d", res->ret_value);
|
||||
}
|
||||
@@ -457,21 +628,30 @@ void Application::AudioCommunicationTask() {
|
||||
{
|
||||
std::lock_guard<std::recursive_mutex> lock(mutex_);
|
||||
if (ws_client_ == nullptr || !ws_client_->IsConnected()) {
|
||||
if (ws_client_ != nullptr) {
|
||||
delete ws_client_;
|
||||
ws_client_ = nullptr;
|
||||
}
|
||||
xEventGroupClearBits(event_group_, COMMUNICATION_RUNNING);
|
||||
if (audio_device_.playing()) {
|
||||
audio_device_.Break();
|
||||
}
|
||||
SetChatState(kChatStateIdle);
|
||||
if (ws_client_ != nullptr) {
|
||||
delete ws_client_;
|
||||
ws_client_ = nullptr;
|
||||
}
|
||||
xEventGroupSetBits(event_group_, DETECTION_RUNNING);
|
||||
xEventGroupClearBits(event_group_, COMMUNICATION_RUNNING);
|
||||
continue;
|
||||
}
|
||||
}
|
||||
|
||||
if (chat_state_ == kChatStateListening) {
|
||||
// Update the LED state based on the VAD state
|
||||
auto& builtin_led = BuiltinLed::GetInstance();
|
||||
if (res->vad_state == AFE_VAD_SPEECH) {
|
||||
builtin_led.SetRed(128);
|
||||
} else {
|
||||
builtin_led.SetRed(32);
|
||||
}
|
||||
builtin_led.TurnOn();
|
||||
|
||||
// Send audio data to server
|
||||
iovec data = {
|
||||
.iov_base = malloc(res->data_size),
|
||||
@@ -491,10 +671,12 @@ void Application::AudioEncodeTask() {
|
||||
|
||||
// Encode audio data
|
||||
opus_encoder_.Encode(pcm, [this](const iovec opus) {
|
||||
auto protocol = AllocateBinaryProtocol(opus.iov_base, opus.iov_len);
|
||||
std::lock_guard<std::recursive_mutex> lock(mutex_);
|
||||
if (ws_client_ && ws_client_->IsConnected()) {
|
||||
ws_client_->Send(opus.iov_base, opus.iov_len, true);
|
||||
ws_client_->Send(protocol, sizeof(BinaryProtocol) + opus.iov_len, true);
|
||||
}
|
||||
heap_caps_free(protocol);
|
||||
});
|
||||
|
||||
free(pcm.iov_base);
|
||||
@@ -518,9 +700,9 @@ void Application::AudioDecodeTask() {
|
||||
}
|
||||
|
||||
if (opus_decode_sample_rate_ != CONFIG_AUDIO_OUTPUT_SAMPLE_RATE) {
|
||||
int target_size = test_resampler_.GetOutputSamples(frame_size);
|
||||
int target_size = opus_resampler_.GetOutputSamples(frame_size);
|
||||
std::vector<int16_t> resampled(target_size);
|
||||
test_resampler_.Process(packet->pcm.data(), frame_size, resampled.data());
|
||||
opus_resampler_.Process(packet->pcm.data(), frame_size, resampled.data());
|
||||
packet->pcm = std::move(resampled);
|
||||
}
|
||||
}
|
||||
@@ -548,9 +730,14 @@ void Application::StartWebSocketClient() {
|
||||
}
|
||||
|
||||
std::string token = "Bearer " + std::string(CONFIG_WEBSOCKET_ACCESS_TOKEN);
|
||||
ws_client_ = new WebSocketClient();
|
||||
#ifdef CONFIG_USE_ML307
|
||||
ws_client_ = new WebSocket(new Ml307SslTransport(ml307_at_modem_, 0));
|
||||
#else
|
||||
ws_client_ = new WebSocket(new TlsTransport());
|
||||
#endif
|
||||
ws_client_->SetHeader("Authorization", token.c_str());
|
||||
ws_client_->SetHeader("Device-Id", SystemInfo::GetMacAddress().c_str());
|
||||
ws_client_->SetHeader("Protocol-Version", std::to_string(PROTOCOL_VERSION).c_str());
|
||||
|
||||
ws_client_->OnConnected([this]() {
|
||||
ESP_LOGI(TAG, "Websocket connected");
|
||||
@@ -558,7 +745,7 @@ void Application::StartWebSocketClient() {
|
||||
// Send hello message to describe the client
|
||||
// keys: message type, version, wakeup_model, audio_params (format, sample_rate, channels)
|
||||
std::string message = "{";
|
||||
message += "\"type\":\"hello\", \"version\":\"1.0\",";
|
||||
message += "\"type\":\"hello\",";
|
||||
message += "\"wakeup_model\":\"" + std::string(wakenet_model_) + "\",";
|
||||
message += "\"audio_params\":{";
|
||||
message += "\"format\":\"opus\", \"sample_rate\":" + std::to_string(CONFIG_AUDIO_INPUT_SAMPLE_RATE) + ", \"channels\":1";
|
||||
@@ -567,21 +754,23 @@ void Application::StartWebSocketClient() {
|
||||
});
|
||||
|
||||
ws_client_->OnData([this](const char* data, size_t len, bool binary) {
|
||||
auto packet = new AudioPacket();
|
||||
if (binary) {
|
||||
auto header = (AudioDataHeader*)data;
|
||||
packet->type = kAudioPacketTypeData;
|
||||
packet->timestamp = ntohl(header->timestamp);
|
||||
auto protocol = (BinaryProtocol*)data;
|
||||
|
||||
auto payload_size = ntohl(header->payload_size);
|
||||
auto packet = new AudioPacket();
|
||||
packet->type = kAudioPacketTypeData;
|
||||
packet->timestamp = ntohl(protocol->timestamp);
|
||||
auto payload_size = ntohl(protocol->payload_size);
|
||||
packet->opus.resize(payload_size);
|
||||
memcpy(packet->opus.data(), data + sizeof(AudioDataHeader), payload_size);
|
||||
memcpy(packet->opus.data(), protocol->payload, payload_size);
|
||||
xQueueSend(audio_decode_queue_, &packet, portMAX_DELAY);
|
||||
} else {
|
||||
// Parse JSON data
|
||||
auto root = cJSON_Parse(data);
|
||||
auto type = cJSON_GetObjectItem(root, "type");
|
||||
if (type != NULL) {
|
||||
if (strcmp(type->valuestring, "tts") == 0) {
|
||||
auto packet = new AudioPacket();
|
||||
auto state = cJSON_GetObjectItem(root, "state");
|
||||
if (strcmp(state->valuestring, "start") == 0) {
|
||||
packet->type = kAudioPacketTypeStart;
|
||||
@@ -597,19 +786,24 @@ void Application::StartWebSocketClient() {
|
||||
packet->type = kAudioPacketTypeSentenceStart;
|
||||
packet->text = cJSON_GetObjectItem(root, "text")->valuestring;
|
||||
}
|
||||
xQueueSend(audio_decode_queue_, &packet, portMAX_DELAY);
|
||||
} else if (strcmp(type->valuestring, "stt") == 0) {
|
||||
auto text = cJSON_GetObjectItem(root, "text");
|
||||
if (text != NULL) {
|
||||
ESP_LOGI(TAG, ">> %s", text->valuestring);
|
||||
}
|
||||
}
|
||||
}
|
||||
cJSON_Delete(root);
|
||||
}
|
||||
xQueueSend(audio_decode_queue_, &packet, portMAX_DELAY);
|
||||
});
|
||||
|
||||
ws_client_->OnError([this](int error) {
|
||||
ESP_LOGE(TAG, "Websocket error: %d", error);
|
||||
});
|
||||
|
||||
ws_client_->OnClosed([this]() {
|
||||
ESP_LOGI(TAG, "Websocket closed");
|
||||
ws_client_->OnDisconnected([this]() {
|
||||
ESP_LOGI(TAG, "Websocket disconnected");
|
||||
});
|
||||
|
||||
if (!ws_client_->Connect(CONFIG_WEBSOCKET_URL)) {
|
||||
|
||||
@@ -4,8 +4,12 @@
#include "AudioDevice.h"
#include "OpusEncoder.h"
#include "OpusResampler.h"
#include "WebSocketClient.h"
#include "WebSocket.h"
#include "Display.h"
#include "Ml307AtModem.h"
#include "FirmwareUpgrade.h"
#include "Ml307Http.h"
#include "EspHttp.h"

#include "opus.h"
#include "resampler_structs.h"
@@ -19,7 +23,17 @@

#define DETECTION_RUNNING 1
#define COMMUNICATION_RUNNING 2
#define DETECT_PACKETS_ENCODED 4
#define WAKE_WORD_ENCODED 4

#define PROTOCOL_VERSION 2
struct BinaryProtocol {
    uint16_t version;
    uint16_t type;
    uint32_t reserved;
    uint32_t timestamp;
    uint32_t payload_size;
    uint8_t payload[];
} __attribute__((packed));


enum ChatState {
|
||||
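
Reviewer note: the `BinaryProtocol` header introduced in this hunk is 16 bytes (2 + 2 + 4 + 4 + 4) followed by the payload, with every multi-byte field in network byte order. The sketch below packs and parses one frame without casting the receive buffer to the struct, and rejects frames whose declared `payload_size` exceeds the bytes actually received. `PackFrame`, `ParseFrame` and `FrameView` are hypothetical names, and the length check is a suggestion rather than something the commit implements: the `OnData` handler in this diff reads `payload_size` without comparing it to `len`.

```cpp
#include <arpa/inet.h>  // htons, htonl, ntohs, ntohl (lwIP offers the same on ESP-IDF)
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>

constexpr uint16_t kProtocolVersion = 2;  // PROTOCOL_VERSION in the diff
constexpr std::size_t kHeaderSize = 16;   // version, type, reserved, timestamp, payload_size

// Hypothetical decoded form of one frame.
struct FrameView {
    uint16_t version;
    uint16_t type;
    uint32_t timestamp;
    std::string payload;
};

// Serializes header fields in network byte order, then appends the payload.
std::string PackFrame(uint32_t timestamp, const std::string& payload) {
    std::string out(kHeaderSize, '\0');
    const uint16_t version = htons(kProtocolVersion);
    const uint16_t type = htons(0);
    const uint32_t reserved = 0;
    const uint32_t ts = htonl(timestamp);
    const uint32_t size = htonl(static_cast<uint32_t>(payload.size()));
    std::memcpy(&out[0], &version, sizeof(version));
    std::memcpy(&out[2], &type, sizeof(type));
    std::memcpy(&out[4], &reserved, sizeof(reserved));
    std::memcpy(&out[8], &ts, sizeof(ts));
    std::memcpy(&out[12], &size, sizeof(size));
    out += payload;
    return out;
}

// Returns std::nullopt when the buffer is shorter than the header or the payload it declares.
std::optional<FrameView> ParseFrame(const char* data, std::size_t len) {
    if (len < kHeaderSize) return std::nullopt;
    uint16_t version = 0, type = 0;
    uint32_t ts = 0, size = 0;
    std::memcpy(&version, data + 0, sizeof(version));
    std::memcpy(&type, data + 2, sizeof(type));
    std::memcpy(&ts, data + 8, sizeof(ts));
    std::memcpy(&size, data + 12, sizeof(size));
    size = ntohl(size);
    if (size > len - kHeaderSize) return std::nullopt;  // truncated or malformed frame
    FrameView frame;
    frame.version = ntohs(version);
    frame.type = ntohs(type);
    frame.timestamp = ntohl(ts);
    frame.payload.assign(data + kHeaderSize, size);
    return frame;
}
```

Because each frame carries its own `payload_size`, several frames can be appended to one buffer and walked back in sequence, which is how the commit batches the encoded wake-word packets into the single `wake_word_opus_` string.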
@@ -28,7 +42,8 @@ enum ChatState {
|
||||
kChatStateListening,
|
||||
kChatStateSpeaking,
|
||||
kChatStateWakeWordDetected,
|
||||
kChatStateTesting
|
||||
kChatStateTesting,
|
||||
kChatStateUpgrading
|
||||
};
|
||||
|
||||
class Application {
|
||||
@@ -49,15 +64,23 @@ private:
|
||||
~Application();
|
||||
|
||||
AudioDevice audio_device_;
|
||||
#ifdef CONFIG_USE_ML307
|
||||
Ml307AtModem ml307_at_modem_;
|
||||
Ml307Http http_;
|
||||
#else
|
||||
EspHttp http_;
|
||||
#endif
|
||||
FirmwareUpgrade firmware_upgrade_;
|
||||
#ifdef CONFIG_USE_DISPLAY
|
||||
Display display_;
|
||||
#endif
|
||||
|
||||
std::recursive_mutex mutex_;
|
||||
WebSocketClient* ws_client_ = nullptr;
|
||||
WebSocket* ws_client_ = nullptr;
|
||||
esp_afe_sr_data_t* afe_detection_data_ = nullptr;
|
||||
esp_afe_sr_data_t* afe_communication_data_ = nullptr;
|
||||
EventGroupHandle_t event_group_;
|
||||
char* wakenet_model_ = NULL;
|
||||
char* nsnet_model_ = NULL;
|
||||
volatile ChatState chat_state_ = kChatStateIdle;
|
||||
|
||||
// Audio encode / decode
|
||||
@@ -84,8 +107,13 @@ private:
|
||||
StaticTask_t wake_word_encode_task_buffer_;
|
||||
StackType_t* wake_word_encode_task_stack_ = nullptr;
|
||||
std::list<iovec> wake_word_pcm_;
|
||||
std::vector<iovec> wake_word_opus_;
|
||||
std::string wake_word_opus_;
|
||||
|
||||
TaskHandle_t check_new_version_task_ = nullptr;
|
||||
StaticTask_t check_new_version_task_buffer_;
|
||||
StackType_t* check_new_version_task_stack_ = nullptr;
|
||||
|
||||
BinaryProtocol* AllocateBinaryProtocol(void* payload, size_t payload_size);
|
||||
void SetDecodeSampleRate(int sample_rate);
|
||||
void SetChatState(ChatState state);
|
||||
void StartDetection();
|
||||
@@ -96,6 +124,8 @@ private:
|
||||
void SendWakeWordData();
|
||||
void CheckTestButton();
|
||||
void PlayTestAudio();
|
||||
void CheckNewVersion();
|
||||
void UpdateDisplay();
|
||||
|
||||
void AudioFeedTask();
|
||||
void AudioDetectionTask();
|
||||
|
||||
@@ -76,10 +76,10 @@ void AudioDevice::CreateDuplexChannels() {
|
||||
},
|
||||
.gpio_cfg = {
|
||||
.mclk = I2S_GPIO_UNUSED,
|
||||
.bclk = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_GPIO_BCLK,
|
||||
.ws = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_GPIO_WS,
|
||||
.dout = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_GPIO_DOUT,
|
||||
.din = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_GPIO_DIN,
|
||||
.bclk = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_MIC_GPIO_BCLK,
|
||||
.ws = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_MIC_GPIO_WS,
|
||||
.dout = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_SPK_GPIO_DOUT,
|
||||
.din = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_MIC_GPIO_DIN,
|
||||
.invert_flags = {
|
||||
.mclk_inv = false,
|
||||
.bclk_inv = false,
|
||||
@@ -127,9 +127,9 @@ void AudioDevice::CreateSimplexChannels() {
|
||||
},
|
||||
.gpio_cfg = {
|
||||
.mclk = I2S_GPIO_UNUSED,
|
||||
.bclk = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_GPIO_BCLK,
|
||||
.ws = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_GPIO_WS,
|
||||
.dout = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_GPIO_DOUT,
|
||||
.bclk = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_SPK_GPIO_BCLK,
|
||||
.ws = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_SPK_GPIO_WS,
|
||||
.dout = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_SPK_GPIO_DOUT,
|
||||
.din = I2S_GPIO_UNUSED,
|
||||
.invert_flags = {
|
||||
.mclk_inv = false,
|
||||
@@ -147,7 +147,7 @@ void AudioDevice::CreateSimplexChannels() {
|
||||
std_cfg.gpio_cfg.bclk = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_MIC_GPIO_BCLK;
|
||||
std_cfg.gpio_cfg.ws = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_MIC_GPIO_WS;
|
||||
std_cfg.gpio_cfg.dout = I2S_GPIO_UNUSED;
|
||||
std_cfg.gpio_cfg.din = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_GPIO_DIN;
|
||||
std_cfg.gpio_cfg.din = (gpio_num_t)CONFIG_AUDIO_DEVICE_I2S_MIC_GPIO_DIN;
|
||||
ESP_ERROR_CHECK(i2s_channel_init_std_mode(rx_handle_, &std_cfg));
|
||||
ESP_LOGI(TAG, "Simplex channels created");
|
||||
}
|
||||
@@ -205,7 +205,7 @@ void AudioDevice::AudioPlayTask() {
|
||||
}
|
||||
break;
|
||||
case kAudioPacketTypeSentenceStart:
|
||||
ESP_LOGI(TAG, "Playing sentence: %s", packet->text.c_str());
|
||||
ESP_LOGI(TAG, "<< %s", packet->text.c_str());
|
||||
break;
|
||||
case kAudioPacketTypeSentenceEnd:
|
||||
if (breaked_) { // Clear the queue
|
||||
@@ -219,6 +219,7 @@ void AudioDevice::AudioPlayTask() {
|
||||
break;
|
||||
case kAudioPacketTypeData:
|
||||
Write(packet->pcm.data(), packet->pcm.size());
|
||||
last_timestamp_ = packet->timestamp;
|
||||
break;
|
||||
default:
|
||||
ESP_LOGE(TAG, "Unknown audio packet type: %d", packet->type);
|
||||
|
||||
@@ -28,13 +28,6 @@ struct AudioPacket {
|
||||
uint32_t timestamp;
|
||||
};
|
||||
|
||||
struct AudioDataHeader {
|
||||
uint32_t version;
|
||||
uint32_t reserved;
|
||||
uint32_t timestamp;
|
||||
uint32_t payload_size;
|
||||
} __attribute__((packed));
|
||||
|
||||
class AudioDevice {
|
||||
public:
|
||||
AudioDevice();
|
||||
@@ -51,6 +44,7 @@ public:
|
||||
int output_sample_rate() const { return output_sample_rate_; }
|
||||
bool duplex() const { return duplex_; }
|
||||
bool playing() const { return playing_; }
|
||||
uint32_t last_timestamp() const { return last_timestamp_; }
|
||||
|
||||
private:
|
||||
bool playing_ = false;
|
||||
@@ -58,6 +52,7 @@ private:
|
||||
bool duplex_ = false;
|
||||
int input_sample_rate_ = 0;
|
||||
int output_sample_rate_ = 0;
|
||||
uint32_t last_timestamp_ = 0;
|
||||
|
||||
i2s_chan_handle_t tx_handle_ = nullptr;
|
||||
i2s_chan_handle_t rx_handle_ = nullptr;
|
||||
|
||||
@@ -1,6 +1,9 @@
|
||||
set(SOURCES "AudioDevice.cc"
|
||||
"FirmwareUpgrade.cc"
|
||||
"SystemInfo.cc"
|
||||
"SystemReset.cc"
|
||||
"Application.cc"
|
||||
"Display.cc"
|
||||
"main.cc"
|
||||
)
|
||||
|
||||
|
||||
139
main/Display.cc
Normal file
@@ -0,0 +1,139 @@
|
||||
|
||||
#include "Display.h"
|
||||
|
||||
#include "esp_log.h"
|
||||
#include "esp_err.h"
|
||||
#include "esp_lcd_panel_ops.h"
|
||||
#include "esp_lcd_panel_vendor.h"
|
||||
#include "esp_lvgl_port.h"
|
||||
#include <string>
|
||||
#include <cstdlib>
|
||||
|
||||
#define TAG "Display"
|
||||
|
||||
#ifdef CONFIG_USE_DISPLAY
|
||||
|
||||
Display::Display(int sda_pin, int scl_pin) : sda_pin_(sda_pin), scl_pin_(scl_pin) {
|
||||
ESP_LOGI(TAG, "Display Pins: %d, %d", sda_pin_, scl_pin_);
|
||||
|
||||
i2c_master_bus_config_t bus_config = {
|
||||
.i2c_port = I2C_NUM_0,
|
||||
.sda_io_num = (gpio_num_t)sda_pin_,
|
||||
.scl_io_num = (gpio_num_t)scl_pin_,
|
||||
.clk_source = I2C_CLK_SRC_DEFAULT,
|
||||
.glitch_ignore_cnt = 7,
|
||||
.intr_priority = 1,
|
||||
.trans_queue_depth = 0,
|
||||
.flags = {
|
||||
.enable_internal_pullup = 1,
|
||||
},
|
||||
};
|
||||
|
||||
ESP_ERROR_CHECK(i2c_new_master_bus(&bus_config, &i2c_bus_));
|
||||
|
||||
// SSD1306 config
|
||||
esp_lcd_panel_io_i2c_config_t io_config = {
|
||||
.dev_addr = 0x3C,
|
||||
.on_color_trans_done = nullptr,
|
||||
.user_ctx = nullptr,
|
||||
.control_phase_bytes = 1,
|
||||
.dc_bit_offset = 6,
|
||||
.lcd_cmd_bits = 8,
|
||||
.lcd_param_bits = 8,
|
||||
.flags = {
|
||||
.dc_low_on_data = 0,
|
||||
.disable_control_phase = 0,
|
||||
},
|
||||
.scl_speed_hz = 400 * 1000,
|
||||
};
|
||||
|
||||
ESP_ERROR_CHECK(esp_lcd_new_panel_io_i2c_v2(i2c_bus_, &io_config, &panel_io_));
|
||||
|
||||
ESP_LOGI(TAG, "Install SSD1306 driver");
|
||||
esp_lcd_panel_dev_config_t panel_config = {};
|
||||
panel_config.reset_gpio_num = -1;
|
||||
panel_config.bits_per_pixel = 1;
|
||||
|
||||
esp_lcd_panel_ssd1306_config_t ssd1306_config = {
|
||||
.height = CONFIG_DISPLAY_HEIGHT
|
||||
};
|
||||
panel_config.vendor_config = &ssd1306_config;
|
||||
|
||||
ESP_ERROR_CHECK(esp_lcd_new_panel_ssd1306(panel_io_, &panel_config, &panel_));
|
||||
ESP_LOGI(TAG, "SSD1306 driver installed");
|
||||
|
||||
// Reset the display
|
||||
ESP_ERROR_CHECK(esp_lcd_panel_reset(panel_));
|
||||
if (esp_lcd_panel_init(panel_) != ESP_OK) {
|
||||
ESP_LOGE(TAG, "Failed to initialize display");
|
||||
return;
|
||||
}
|
||||
|
||||
ESP_LOGI(TAG, "Initialize LVGL");
|
||||
lvgl_port_cfg_t port_cfg = ESP_LVGL_PORT_INIT_CONFIG();
|
||||
lvgl_port_init(&port_cfg);
|
||||
|
||||
const lvgl_port_display_cfg_t display_cfg = {
|
||||
.io_handle = panel_io_,
|
||||
.panel_handle = panel_,
|
||||
.buffer_size = 128 * CONFIG_DISPLAY_HEIGHT,
|
||||
.double_buffer = true,
|
||||
.hres = 128,
|
||||
.vres = CONFIG_DISPLAY_HEIGHT,
|
||||
.monochrome = true,
|
||||
.rotation = {
|
||||
.swap_xy = 0,
|
||||
.mirror_x = 0,
|
||||
.mirror_y = 0,
|
||||
},
|
||||
.flags = {
|
||||
.buff_dma = 0,
|
||||
.buff_spiram = 0,
|
||||
},
|
||||
};
|
||||
disp_ = lvgl_port_add_disp(&display_cfg);
|
||||
lv_disp_set_rotation(disp_, LV_DISP_ROT_180);
|
||||
|
||||
// Set the display to on
|
||||
ESP_LOGI(TAG, "Turning display on");
|
||||
ESP_ERROR_CHECK(esp_lcd_panel_disp_on_off(panel_, true));
|
||||
|
||||
ESP_LOGI(TAG, "Display Loading...");
|
||||
if (lvgl_port_lock(0)) {
|
||||
label_ = lv_label_create(lv_disp_get_scr_act(disp_));
|
||||
lv_label_set_text(label_, "Initializing...");
|
||||
lv_obj_set_width(label_, disp_->driver->hor_res);
|
||||
lv_obj_set_height(label_, disp_->driver->ver_res);
|
||||
lv_obj_set_style_text_line_space(label_, 0, 0);
|
||||
lv_obj_set_style_pad_all(label_, 0, 0);
|
||||
lv_obj_set_style_outline_pad(label_, 0, 0);
|
||||
lvgl_port_unlock();
|
||||
}
|
||||
}
|
||||
|
||||
Display::~Display() {
|
||||
if (label_ != nullptr) {
|
||||
lvgl_port_lock(0);
|
||||
lv_obj_del(label_);
|
||||
lvgl_port_unlock();
|
||||
}
|
||||
|
||||
if (disp_ != nullptr) {
|
||||
lvgl_port_deinit();
|
||||
esp_lcd_panel_del(panel_);
|
||||
esp_lcd_panel_io_del(panel_io_);
|
||||
i2c_master_bus_reset(i2c_bus_);
|
||||
}
|
||||
}
|
||||
|
||||
void Display::SetText(const std::string &text) {
|
||||
if (label_ != nullptr) {
|
||||
text_ = text;
|
||||
lvgl_port_lock(0);
|
||||
// Change the text of the label
|
||||
lv_label_set_text(label_, text_.c_str());
|
||||
lvgl_port_unlock();
|
||||
}
|
||||
}
|
||||
|
||||
#endif
|
||||
32
main/Display.h
Normal file
@@ -0,0 +1,32 @@
#ifndef DISPLAY_H
#define DISPLAY_H

#include "driver/i2c_master.h"
#include "esp_lcd_panel_io.h"
#include "esp_lcd_panel_ops.h"
#include "lvgl.h"

#include <string>

class Display {
public:
    Display(int sda_pin, int scl_pin);
    ~Display();

    void SetText(const std::string &text);

private:
    int sda_pin_;
    int scl_pin_;

    i2c_master_bus_handle_t i2c_bus_ = nullptr;

    esp_lcd_panel_io_handle_t panel_io_ = nullptr;
    esp_lcd_panel_handle_t panel_ = nullptr;
    lv_disp_t *disp_ = nullptr;
    lv_obj_t *label_ = nullptr;

    std::string text_;
};

#endif
259
main/FirmwareUpgrade.cc
Normal file
@@ -0,0 +1,259 @@
|
||||
#include "FirmwareUpgrade.h"
|
||||
#include "SystemInfo.h"
|
||||
#include "cJSON.h"
|
||||
#include "esp_log.h"
|
||||
#include "esp_partition.h"
|
||||
#include "esp_http_client.h"
|
||||
#include "esp_ota_ops.h"
|
||||
#include "esp_app_format.h"
|
||||
#include "Ml307Http.h"
|
||||
#include <vector>
|
||||
#include <sstream>
|
||||
#include <algorithm>
|
||||
|
||||
#define TAG "FirmwareUpgrade"
|
||||
|
||||
|
||||
FirmwareUpgrade::FirmwareUpgrade(Http& http) : http_(http) {
|
||||
}
|
||||
|
||||
FirmwareUpgrade::~FirmwareUpgrade() {
|
||||
}
|
||||
|
||||
void FirmwareUpgrade::SetCheckVersionUrl(std::string check_version_url) {
|
||||
check_version_url_ = check_version_url;
|
||||
}
|
||||
|
||||
void FirmwareUpgrade::SetPostData(const std::string& post_data) {
|
||||
post_data_ = post_data;
|
||||
}
|
||||
|
||||
void FirmwareUpgrade::SetHeader(const std::string& key, const std::string& value) {
|
||||
headers_[key] = value;
|
||||
}
|
||||
|
||||
void FirmwareUpgrade::CheckVersion() {
|
||||
std::string current_version = esp_app_get_description()->version;
|
||||
ESP_LOGI(TAG, "Current version: %s", current_version.c_str());
|
||||
|
||||
if (check_version_url_.length() < 10) {
|
||||
ESP_LOGE(TAG, "Check version URL is not properly set");
|
||||
return;
|
||||
}
|
||||
|
||||
for (const auto& header : headers_) {
|
||||
http_.SetHeader(header.first, header.second);
|
||||
}
|
||||
|
||||
if (post_data_.empty()) {
|
||||
http_.Open("GET", check_version_url_);
|
||||
} else {
|
||||
http_.SetHeader("Content-Type", "application/json");
|
||||
http_.SetContent(post_data_);
|
||||
http_.Open("POST", check_version_url_);
|
||||
}
|
||||
|
||||
auto response = http_.GetBody();
|
||||
http_.Close();
|
||||
|
||||
// Response: { "firmware": { "version": "1.0.0", "url": "http://" } }
|
||||
// Parse the JSON response and check if the version is newer
|
||||
// If it is, set has_new_version_ to true and store the new version and URL
|
||||
|
||||
cJSON *root = cJSON_Parse(response.c_str());
|
||||
if (root == NULL) {
|
||||
ESP_LOGE(TAG, "Failed to parse JSON response");
|
||||
return;
|
||||
}
|
||||
cJSON *firmware = cJSON_GetObjectItem(root, "firmware");
|
||||
if (firmware == NULL) {
|
||||
ESP_LOGE(TAG, "Failed to get firmware object");
|
||||
cJSON_Delete(root);
|
||||
return;
|
||||
}
|
||||
cJSON *version = cJSON_GetObjectItem(firmware, "version");
|
||||
if (version == NULL) {
|
||||
ESP_LOGE(TAG, "Failed to get version object");
|
||||
cJSON_Delete(root);
|
||||
return;
|
||||
}
|
||||
cJSON *url = cJSON_GetObjectItem(firmware, "url");
|
||||
if (url == NULL) {
|
||||
ESP_LOGE(TAG, "Failed to get url object");
|
||||
cJSON_Delete(root);
|
||||
return;
|
||||
}
|
||||
|
||||
firmware_version_ = version->valuestring;
|
||||
firmware_url_ = url->valuestring;
|
||||
cJSON_Delete(root);
|
||||
|
||||
// Check if the version is newer, for example, 0.1.0 is newer than 0.0.1
|
||||
has_new_version_ = IsNewVersionAvailable(current_version, firmware_version_);
|
||||
if (has_new_version_) {
|
||||
ESP_LOGI(TAG, "New version available: %s", firmware_version_.c_str());
|
||||
} else {
|
||||
ESP_LOGI(TAG, "Current is the latest version");
|
||||
}
|
||||
}
|
||||
|
||||
void FirmwareUpgrade::MarkCurrentVersionValid() {
|
||||
auto partition = esp_ota_get_running_partition();
|
||||
if (strcmp(partition->label, "factory") == 0) {
|
||||
ESP_LOGI(TAG, "Running from factory partition, skipping");
|
||||
return;
|
||||
}
|
||||
|
||||
ESP_LOGI(TAG, "Running partition: %s", partition->label);
|
||||
esp_ota_img_states_t state;
|
||||
if (esp_ota_get_state_partition(partition, &state) != ESP_OK) {
|
||||
ESP_LOGE(TAG, "Failed to get state of partition");
|
||||
return;
|
||||
}
|
||||
|
||||
if (state == ESP_OTA_IMG_PENDING_VERIFY) {
|
||||
ESP_LOGI(TAG, "Marking firmware as valid");
|
||||
esp_ota_mark_app_valid_cancel_rollback();
|
||||
}
|
||||
}
|
||||
|
||||
void FirmwareUpgrade::Upgrade(const std::string& firmware_url) {
|
||||
ESP_LOGI(TAG, "Upgrading firmware from %s", firmware_url.c_str());
|
||||
esp_ota_handle_t update_handle = 0;
|
||||
auto update_partition = esp_ota_get_next_update_partition(NULL);
|
||||
if (update_partition == NULL) {
|
||||
ESP_LOGE(TAG, "Failed to get update partition");
|
||||
return;
|
||||
}
|
||||
|
||||
ESP_LOGI(TAG, "Writing to partition %s at offset 0x%lx", update_partition->label, update_partition->address);
|
||||
bool image_header_checked = false;
|
||||
std::string image_header;
|
||||
|
||||
if (!http_.Open("GET", firmware_url)) {
|
||||
ESP_LOGE(TAG, "Failed to open HTTP connection");
|
||||
return;
|
||||
}
|
||||
|
||||
size_t content_length = http_.GetBodyLength();
|
||||
if (content_length == 0) {
|
||||
ESP_LOGE(TAG, "Failed to get content length");
|
||||
http_.Close();
|
||||
return;
|
||||
}
|
||||
|
||||
char buffer[4096];
|
||||
size_t total_read = 0, recent_read = 0;
|
||||
auto last_calc_time = esp_timer_get_time();
|
||||
while (true) {
|
||||
int ret = http_.Read(buffer, sizeof(buffer));
|
||||
if (ret < 0) {
|
||||
ESP_LOGE(TAG, "Failed to read HTTP data: %s", esp_err_to_name(ret));
|
||||
http_.Close();
|
||||
return;
|
||||
}
|
||||
|
||||
// Calculate speed and progress every second
|
||||
recent_read += ret;
|
||||
total_read += ret;
|
||||
if (esp_timer_get_time() - last_calc_time >= 1000000 || ret == 0) {
|
||||
size_t progress = total_read * 100 / content_length;
|
||||
ESP_LOGI(TAG, "Progress: %zu%% (%zu/%zu), Speed: %zuB/s", progress, total_read, content_length, recent_read);
|
||||
if (upgrade_callback_) {
|
||||
upgrade_callback_(progress, recent_read);
|
||||
}
|
||||
last_calc_time = esp_timer_get_time();
|
||||
recent_read = 0;
|
||||
}
|
||||
|
||||
if (ret == 0) {
|
||||
break;
|
||||
}
|
||||
|
||||
|
||||
if (!image_header_checked) {
|
||||
image_header.append(buffer, ret);
|
||||
if (image_header.size() >= sizeof(esp_image_header_t) + sizeof(esp_image_segment_header_t) + sizeof(esp_app_desc_t)) {
|
||||
esp_app_desc_t new_app_info;
|
||||
memcpy(&new_app_info, image_header.data() + sizeof(esp_image_header_t) + sizeof(esp_image_segment_header_t), sizeof(esp_app_desc_t));
|
||||
ESP_LOGI(TAG, "New firmware version: %s", new_app_info.version);
|
||||
|
||||
auto current_version = esp_app_get_description()->version;
|
||||
if (memcmp(new_app_info.version, current_version, sizeof(new_app_info.version)) == 0) {
|
||||
ESP_LOGE(TAG, "Firmware version is the same, skipping upgrade");
|
||||
http_.Close();
|
||||
return;
|
||||
}
|
||||
|
||||
if (esp_ota_begin(update_partition, OTA_WITH_SEQUENTIAL_WRITES, &update_handle)) {
|
||||
esp_ota_abort(update_handle);
|
||||
http_.Close();
|
||||
ESP_LOGE(TAG, "Failed to begin OTA");
|
||||
return;
|
||||
}
|
||||
|
||||
image_header_checked = true;
|
||||
}
|
||||
}
|
||||
auto err = esp_ota_write(update_handle, buffer, ret);
|
||||
if (err != ESP_OK) {
|
||||
ESP_LOGE(TAG, "Failed to write OTA data: %s", esp_err_to_name(err));
|
||||
esp_ota_abort(update_handle);
|
||||
http_.Close();
|
||||
return;
|
||||
}
|
||||
}
|
||||
http_.Close();
|
||||
|
||||
esp_err_t err = esp_ota_end(update_handle);
|
||||
if (err != ESP_OK) {
|
||||
if (err == ESP_ERR_OTA_VALIDATE_FAILED) {
|
||||
ESP_LOGE(TAG, "Image validation failed, image is corrupted");
|
||||
} else {
|
||||
ESP_LOGE(TAG, "Failed to end OTA: %s", esp_err_to_name(err));
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
err = esp_ota_set_boot_partition(update_partition);
|
||||
if (err != ESP_OK) {
|
||||
ESP_LOGE(TAG, "Failed to set boot partition: %s", esp_err_to_name(err));
|
||||
return;
|
||||
}
|
||||
|
||||
ESP_LOGI(TAG, "Firmware upgrade successful, rebooting in 3 seconds...");
|
||||
vTaskDelay(pdMS_TO_TICKS(3000));
|
||||
esp_restart();
|
||||
}
|
||||
|
||||
void FirmwareUpgrade::StartUpgrade(std::function<void(int progress, size_t speed)> callback) {
|
||||
upgrade_callback_ = callback;
|
||||
Upgrade(firmware_url_);
|
||||
}
|
||||
|
||||
std::vector<int> FirmwareUpgrade::ParseVersion(const std::string& version) {
|
||||
std::vector<int> versionNumbers;
|
||||
std::stringstream ss(version);
|
||||
std::string segment;
|
||||
|
||||
while (std::getline(ss, segment, '.')) {
|
||||
versionNumbers.push_back(std::stoi(segment));
|
||||
}
|
||||
|
||||
return versionNumbers;
|
||||
}
|
||||
|
||||
bool FirmwareUpgrade::IsNewVersionAvailable(const std::string& currentVersion, const std::string& newVersion) {
|
||||
std::vector<int> current = ParseVersion(currentVersion);
|
||||
std::vector<int> newer = ParseVersion(newVersion);
|
||||
|
||||
for (size_t i = 0; i < std::min(current.size(), newer.size()); ++i) {
|
||||
if (newer[i] > current[i]) {
|
||||
return true;
|
||||
} else if (newer[i] < current[i]) {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
return newer.size() > current.size();
|
||||
}
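
Review note: the comparison rule in `IsNewVersionAvailable` above can be tried off-device with the stand-alone sketch below. It is not part of the commit. `ParseVersionLenient` and `IsNewer` are hypothetical names, and swapping `std::stoi` for `std::strtol` is an assumption made here so that a non-numeric segment (for example a `-beta` suffix) is read as its leading digits, or 0, rather than raising an exception.

```cpp
#include <algorithm>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-alone variant of ParseVersion: splits on '.' and
// converts each segment with std::strtol, which returns 0 for a segment
// that has no leading digits instead of throwing.
std::vector<int> ParseVersionLenient(const std::string& version) {
    std::vector<int> numbers;
    std::stringstream ss(version);
    std::string segment;
    while (std::getline(ss, segment, '.')) {
        numbers.push_back(static_cast<int>(std::strtol(segment.c_str(), nullptr, 10)));
    }
    return numbers;
}

// Same rule as IsNewVersionAvailable: compare segment by segment; if all
// shared segments are equal, the version with more segments is newer.
bool IsNewer(const std::string& current_version, const std::string& new_version) {
    std::vector<int> current = ParseVersionLenient(current_version);
    std::vector<int> newer = ParseVersionLenient(new_version);
    for (size_t i = 0; i < std::min(current.size(), newer.size()); ++i) {
        if (newer[i] != current[i]) {
            return newer[i] > current[i];
        }
    }
    return newer.size() > current.size();
}
```

With this rule, `IsNewer("0.2.0", "0.3.0")` and `IsNewer("0.9.0", "0.10.0")` are true, because segments are compared as integers rather than as strings, while comparing a version with itself is false.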
|
||||
37
main/FirmwareUpgrade.h
Normal file
@@ -0,0 +1,37 @@
#ifndef _FIRMWARE_UPGRADE_H
#define _FIRMWARE_UPGRADE_H

#include <functional>
#include <string>
#include <map>
#include "Http.h"

class FirmwareUpgrade {
public:
    FirmwareUpgrade(Http& http);
    ~FirmwareUpgrade();

    void SetCheckVersionUrl(std::string check_version_url);
    void SetPostData(const std::string& post_data);
    void SetHeader(const std::string& key, const std::string& value);
    void CheckVersion();
    bool HasNewVersion() { return has_new_version_; }
    void StartUpgrade(std::function<void(int progress, size_t speed)> callback);
    void MarkCurrentVersionValid();

private:
    Http& http_;
    std::string check_version_url_;
    bool has_new_version_ = false;
    std::string firmware_version_;
    std::string firmware_url_;
    std::string post_data_;
    std::map<std::string, std::string> headers_;

    void Upgrade(const std::string& firmware_url);
    std::function<void(int progress, size_t speed)> upgrade_callback_;
    std::vector<int> ParseVersion(const std::string& version);
    bool IsNewVersionAvailable(const std::string& currentVersion, const std::string& newVersion);
};

#endif // _FIRMWARE_UPGRADE_H
@@ -1,14 +1,20 @@
|
||||
menu "Xiaozhi Assistant"
|
||||
|
||||
config OTA_VERSION_URL
|
||||
string "OTA Version URL"
|
||||
default "https://api.tenclass.net/xiaozhi/ota/"
|
||||
help
|
||||
The application will access this URL to check for updates.
|
||||
|
||||
config WEBSOCKET_URL
|
||||
string "Websocket URL"
|
||||
default "wss://"
|
||||
default "wss://api.tenclass.net/xiaozhi/v1/"
|
||||
help
|
||||
Communication with the server through websocket after wake up.
|
||||
|
||||
config WEBSOCKET_ACCESS_TOKEN
|
||||
string "Websocket Access Token"
|
||||
default ""
|
||||
default "test-token"
|
||||
help
|
||||
Access token for websocket communication.
|
||||
|
||||
@@ -24,29 +30,29 @@ config AUDIO_OUTPUT_SAMPLE_RATE
|
||||
help
|
||||
Audio output sample rate.
|
||||
|
||||
config AUDIO_DEVICE_I2S_GPIO_BCLK
|
||||
int "I2S GPIO BCLK"
|
||||
default 5
|
||||
help
|
||||
GPIO number of the I2S BCLK.
|
||||
|
||||
config AUDIO_DEVICE_I2S_GPIO_WS
|
||||
config AUDIO_DEVICE_I2S_MIC_GPIO_WS
|
||||
int "I2S GPIO WS"
|
||||
default 4
|
||||
help
|
||||
GPIO number of the I2S WS.
|
||||
|
||||
config AUDIO_DEVICE_I2S_GPIO_DOUT
|
||||
int "I2S GPIO DOUT"
|
||||
config AUDIO_DEVICE_I2S_MIC_GPIO_BCLK
|
||||
int "I2S GPIO BCLK"
|
||||
default 5
|
||||
help
|
||||
GPIO number of the I2S BCLK.
|
||||
|
||||
config AUDIO_DEVICE_I2S_MIC_GPIO_DIN
|
||||
int "I2S GPIO DIN"
|
||||
default 6
|
||||
help
|
||||
GPIO number of the I2S DOUT.
|
||||
|
||||
config AUDIO_DEVICE_I2S_GPIO_DIN
|
||||
int "I2S GPIO DIN"
|
||||
default 3
|
||||
help
|
||||
GPIO number of the I2S DIN.
|
||||
|
||||
config AUDIO_DEVICE_I2S_SPK_GPIO_DOUT
|
||||
int "I2S GPIO DOUT"
|
||||
default 7
|
||||
help
|
||||
GPIO number of the I2S DOUT.
|
||||
|
||||
config AUDIO_DEVICE_I2S_SIMPLEX
|
||||
bool "I2S Simplex"
|
||||
@@ -54,18 +60,65 @@ config AUDIO_DEVICE_I2S_SIMPLEX
|
||||
help
|
||||
Enable I2S Simplex mode.
|
||||
|
||||
config AUDIO_DEVICE_I2S_MIC_GPIO_BCLK
|
||||
int "I2S MIC GPIO BCLK"
|
||||
default 11
|
||||
config AUDIO_DEVICE_I2S_SPK_GPIO_BCLK
|
||||
int "I2S SPK GPIO BCLK"
|
||||
default 15
|
||||
depends on AUDIO_DEVICE_I2S_SIMPLEX
|
||||
help
|
||||
GPIO number of the I2S MIC BCLK.
|
||||
|
||||
config AUDIO_DEVICE_I2S_MIC_GPIO_WS
|
||||
int "I2S MIC GPIO WS"
|
||||
default 10
|
||||
config AUDIO_DEVICE_I2S_SPK_GPIO_WS
|
||||
int "I2S SPK GPIO WS"
|
||||
default 16
|
||||
depends on AUDIO_DEVICE_I2S_SIMPLEX
|
||||
help
|
||||
GPIO number of the I2S MIC WS.
|
||||
|
||||
config USE_ML307
|
||||
bool "Use ML307"
|
||||
default n
|
||||
help
|
||||
Use ML307 as the modem.
|
||||
|
||||
config ML307_RX_PIN
|
||||
int "ML307 RX Pin"
|
||||
default 11
|
||||
depends on USE_ML307
|
||||
help
|
||||
GPIO number of the ML307 RX.
|
||||
|
||||
config ML307_TX_PIN
|
||||
int "ML307 TX Pin"
|
||||
default 12
|
||||
depends on USE_ML307
|
||||
help
|
||||
GPIO number of the ML307 TX.
|
||||
|
||||
config USE_DISPLAY
|
||||
bool "Use Display"
|
||||
default n
|
||||
help
|
||||
Use Display.
|
||||
|
||||
config DISPLAY_HEIGHT
|
||||
int "Display Height"
|
||||
default 32
|
||||
depends on USE_DISPLAY
|
||||
help
|
||||
Display height in pixels.
|
||||
|
||||
config DISPLAY_SDA_PIN
|
||||
int "Display SDA Pin"
|
||||
default 41
|
||||
depends on USE_DISPLAY
|
||||
help
|
||||
GPIO number of the Display SDA.
|
||||
|
||||
config DISPLAY_SCL_PIN
|
||||
int "Display SCL Pin"
|
||||
default 42
|
||||
depends on USE_DISPLAY
|
||||
help
|
||||
GPIO number of the Display SCL.
|
||||
|
||||
endmenu
|
||||
|
||||
221
main/SystemInfo.cc
Normal file
@@ -0,0 +1,221 @@
|
||||
#include "SystemInfo.h"
|
||||
#include "freertos/task.h"
|
||||
#include "esp_log.h"
|
||||
#include "esp_flash.h"
|
||||
#include "esp_mac.h"
|
||||
#include "esp_chip_info.h"
|
||||
#include "esp_system.h"
|
||||
#include "esp_partition.h"
|
||||
#include "esp_app_desc.h"
|
||||
#include "esp_psram.h"
|
||||
#include "esp_wifi.h"
|
||||
#include "esp_ota_ops.h"
|
||||
|
||||
|
||||
#define TAG "SystemInfo"
|
||||
|
||||
size_t SystemInfo::GetFlashSize() {
|
||||
uint32_t flash_size;
|
||||
if (esp_flash_get_size(NULL, &flash_size) != ESP_OK) {
|
||||
ESP_LOGE(TAG, "Failed to get flash size");
|
||||
return 0;
|
||||
}
|
||||
return (size_t)flash_size;
|
||||
}
|
||||
|
||||
size_t SystemInfo::GetMinimumFreeHeapSize() {
|
||||
return esp_get_minimum_free_heap_size();
|
||||
}
|
||||
|
||||
size_t SystemInfo::GetFreeHeapSize() {
|
||||
return esp_get_free_heap_size();
|
||||
}
|
||||
|
||||
std::string SystemInfo::GetMacAddress() {
|
||||
uint8_t mac[6];
|
||||
esp_read_mac(mac, ESP_MAC_WIFI_STA);
|
||||
char mac_str[18];
|
||||
snprintf(mac_str, sizeof(mac_str), "%02x:%02x:%02x:%02x:%02x:%02x", mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
|
||||
return std::string(mac_str);
|
||||
}
|
||||
|
||||
std::string SystemInfo::GetChipModelName() {
|
||||
return std::string(CONFIG_IDF_TARGET);
|
||||
}
|
||||
|
||||
std::string SystemInfo::GetJsonString() {
|
||||
/*
|
||||
{
|
||||
"flash_size": 4194304,
|
||||
"psram_size": 0,
|
||||
"minimum_free_heap_size": 123456,
|
||||
"mac_address": "00:00:00:00:00:00",
|
||||
"chip_model_name": "esp32s3",
|
||||
"chip_info": {
|
||||
"model": 1,
|
||||
"cores": 2,
|
||||
"revision": 0,
|
||||
"features": 0
|
||||
},
|
||||
"application": {
|
||||
"name": "my-app",
|
||||
"version": "1.0.0",
|
||||
"compile_time": "2021-01-01T00:00:00Z"
|
||||
"idf_version": "4.2-dev"
|
||||
"elf_sha256": ""
|
||||
},
|
||||
"partition_table": [
|
||||
"app": {
|
||||
"label": "app",
|
||||
"type": 1,
|
||||
"subtype": 2,
|
||||
"address": 0x10000,
|
||||
"size": 0x100000
|
||||
}
|
||||
],
|
||||
"ota": {
|
||||
"label": "ota_0"
|
||||
}
|
||||
}
|
||||
*/
|
||||
std::string json = "{";
|
||||
json += "\"flash_size\":" + std::to_string(GetFlashSize()) + ",";
|
||||
json += "\"psram_size\":" + std::to_string(esp_psram_get_size()) + ",";
|
||||
json += "\"minimum_free_heap_size\":" + std::to_string(GetMinimumFreeHeapSize()) + ",";
|
||||
json += "\"mac_address\":\"" + GetMacAddress() + "\",";
|
||||
json += "\"chip_model_name\":\"" + GetChipModelName() + "\",";
|
||||
json += "\"chip_info\":{";
|
||||
|
||||
esp_chip_info_t chip_info;
|
||||
esp_chip_info(&chip_info);
|
||||
json += "\"model\":" + std::to_string(chip_info.model) + ",";
|
||||
json += "\"cores\":" + std::to_string(chip_info.cores) + ",";
|
||||
json += "\"revision\":" + std::to_string(chip_info.revision) + ",";
|
||||
json += "\"features\":" + std::to_string(chip_info.features);
|
||||
json += "},";
|
||||
|
||||
json += "\"application\":{";
|
||||
auto app_desc = esp_app_get_description();
|
||||
json += "\"name\":\"" + std::string(app_desc->project_name) + "\",";
|
||||
json += "\"version\":\"" + std::string(app_desc->version) + "\",";
|
||||
json += "\"compile_time\":\"" + std::string(app_desc->date) + "T" + std::string(app_desc->time) + "Z\",";
|
||||
json += "\"idf_version\":\"" + std::string(app_desc->idf_ver) + "\",";
|
||||
|
||||
char sha256_str[65];
|
||||
for (int i = 0; i < 32; i++) {
|
||||
snprintf(sha256_str + i * 2, sizeof(sha256_str) - i * 2, "%02x", app_desc->app_elf_sha256[i]);
|
||||
}
|
||||
json += "\"elf_sha256\":\"" + std::string(sha256_str) + "\"";
|
||||
json += "},";
|
||||
|
||||
json += "\"partition_table\": [";
|
||||
esp_partition_iterator_t it = esp_partition_find(ESP_PARTITION_TYPE_ANY, ESP_PARTITION_SUBTYPE_ANY, NULL);
|
||||
while (it) {
|
||||
const esp_partition_t *partition = esp_partition_get(it);
|
||||
json += "{";
|
||||
json += "\"label\":\"" + std::string(partition->label) + "\",";
|
||||
json += "\"type\":" + std::to_string(partition->type) + ",";
|
||||
json += "\"subtype\":" + std::to_string(partition->subtype) + ",";
|
||||
json += "\"address\":" + std::to_string(partition->address) + ",";
|
||||
json += "\"size\":" + std::to_string(partition->size);
|
||||
json += "},";
|
||||
it = esp_partition_next(it);
|
||||
}
|
||||
json.pop_back(); // Remove the last comma
|
||||
json += "],";
|
||||
|
||||
json += "\"ota\":{";
|
||||
auto ota_partition = esp_ota_get_running_partition();
|
||||
json += "\"label\":\"" + std::string(ota_partition->label) + "\"";
|
||||
json += "}";
|
||||
|
||||
// Close the JSON object
|
||||
json += "}";
|
||||
return json;
|
||||
}
|
||||
|
||||
esp_err_t SystemInfo::PrintRealTimeStats(TickType_t xTicksToWait) {
|
||||
#define ARRAY_SIZE_OFFSET 5
|
||||
TaskStatus_t *start_array = NULL, *end_array = NULL;
|
||||
UBaseType_t start_array_size, end_array_size;
|
||||
configRUN_TIME_COUNTER_TYPE start_run_time, end_run_time;
|
||||
esp_err_t ret;
|
||||
uint32_t total_elapsed_time;
|
||||
|
||||
//Allocate array to store current task states
|
||||
start_array_size = uxTaskGetNumberOfTasks() + ARRAY_SIZE_OFFSET;
|
||||
start_array = (TaskStatus_t*)malloc(sizeof(TaskStatus_t) * start_array_size);
|
||||
if (start_array == NULL) {
|
||||
ret = ESP_ERR_NO_MEM;
|
||||
goto exit;
|
||||
}
|
||||
//Get current task states
|
||||
start_array_size = uxTaskGetSystemState(start_array, start_array_size, &start_run_time);
|
||||
if (start_array_size == 0) {
|
||||
ret = ESP_ERR_INVALID_SIZE;
|
||||
goto exit;
|
||||
}
|
||||
|
||||
vTaskDelay(xTicksToWait);
|
||||
|
||||
//Allocate array to store tasks states post delay
|
||||
end_array_size = uxTaskGetNumberOfTasks() + ARRAY_SIZE_OFFSET;
|
||||
end_array = (TaskStatus_t*)malloc(sizeof(TaskStatus_t) * end_array_size);
|
||||
if (end_array == NULL) {
|
||||
ret = ESP_ERR_NO_MEM;
|
||||
goto exit;
|
||||
}
|
||||
//Get post delay task states
|
||||
end_array_size = uxTaskGetSystemState(end_array, end_array_size, &end_run_time);
|
||||
if (end_array_size == 0) {
|
||||
ret = ESP_ERR_INVALID_SIZE;
|
||||
goto exit;
|
||||
}
|
||||
|
||||
//Calculate total_elapsed_time in units of run time stats clock period.
|
||||
total_elapsed_time = (end_run_time - start_run_time);
|
||||
if (total_elapsed_time == 0) {
|
||||
ret = ESP_ERR_INVALID_STATE;
|
||||
goto exit;
|
||||
}
|
||||
|
||||
printf("| Task | Run Time | Percentage\n");
|
||||
//Match each task in start_array to those in the end_array
|
||||
for (int i = 0; i < start_array_size; i++) {
|
||||
int k = -1;
|
||||
for (int j = 0; j < end_array_size; j++) {
|
||||
if (start_array[i].xHandle == end_array[j].xHandle) {
|
||||
k = j;
|
||||
//Mark that task have been matched by overwriting their handles
|
||||
start_array[i].xHandle = NULL;
|
||||
end_array[j].xHandle = NULL;
|
||||
break;
|
||||
}
|
||||
}
|
||||
//Check if matching task found
|
||||
if (k >= 0) {
|
||||
uint32_t task_elapsed_time = end_array[k].ulRunTimeCounter - start_array[i].ulRunTimeCounter;
|
||||
uint32_t percentage_time = (task_elapsed_time * 100UL) / (total_elapsed_time * CONFIG_FREERTOS_NUMBER_OF_CORES);
|
||||
printf("| %-16s | %8lu | %4lu%%\n", start_array[i].pcTaskName, task_elapsed_time, percentage_time);
|
||||
}
|
||||
}
|
||||
|
||||
//Print unmatched tasks
|
||||
for (int i = 0; i < start_array_size; i++) {
|
||||
if (start_array[i].xHandle != NULL) {
|
||||
printf("| %s | Deleted\n", start_array[i].pcTaskName);
|
||||
}
|
||||
}
|
||||
for (int i = 0; i < end_array_size; i++) {
|
||||
if (end_array[i].xHandle != NULL) {
|
||||
printf("| %s | Created\n", end_array[i].pcTaskName);
|
||||
}
|
||||
}
|
||||
ret = ESP_OK;
|
||||
|
||||
exit: //Common return path
|
||||
free(start_array);
|
||||
free(end_array);
|
||||
return ret;
|
||||
}
|
||||
|
||||
20
main/SystemInfo.h
Normal file
@@ -0,0 +1,20 @@
#ifndef _SYSTEM_INFO_H_
#define _SYSTEM_INFO_H_

#include <string>

#include "esp_err.h"
#include "freertos/FreeRTOS.h"

class SystemInfo {
public:
    static size_t GetFlashSize();
    static size_t GetMinimumFreeHeapSize();
    static size_t GetFreeHeapSize();
    static std::string GetMacAddress();
    static std::string GetChipModelName();
    static std::string GetJsonString();
    static esp_err_t PrintRealTimeStats(TickType_t xTicksToWait);
};

#endif // _SYSTEM_INFO_H_
@@ -1,24 +1,13 @@
|
||||
## IDF Component Manager Manifest File
|
||||
dependencies:
|
||||
78/esp-builtin-led: "^1.0.0"
|
||||
78/esp-wifi-connect: "^1.0.0"
|
||||
78/esp-ota: "^1.0.0"
|
||||
78/esp-websocket: "^1.0.0"
|
||||
78/esp-builtin-led: "^1.0.1"
|
||||
78/esp-wifi-connect: "^1.1.0"
|
||||
78/esp-opus-encoder: "^1.0.1"
|
||||
78/esp-ml307: "^1.1.0"
|
||||
espressif/esp-sr: "^1.9.0"
|
||||
lvgl/lvgl: "^8.4.0"
|
||||
esp_lvgl_port: "^1.4.0"
|
||||
## Required IDF version
|
||||
idf:
|
||||
version: ">=5.3"
|
||||
# # Put list of dependencies here
|
||||
# # For components maintained by Espressif:
|
||||
# component: "~1.0.0"
|
||||
# # For 3rd party components:
|
||||
# username/component: ">=1.0.0,<2.0.0"
|
||||
# username2/component2:
|
||||
# version: "~1.0.0"
|
||||
# # For transient dependencies `public` flag can be set.
|
||||
# # `public` flag doesn't have an effect dependencies of the `main` component.
|
||||
# # All dependencies of `main` are public by default.
|
||||
# public: true
|
||||
description: "An AI voice assistant for ESP32"
|
||||
url: "https://github.com/78/xiaozhi-esp32"
|
||||
|
||||
|
||||
18
main/main.cc
@@ -5,12 +5,11 @@
|
||||
#include "nvs.h"
|
||||
#include "nvs_flash.h"
|
||||
#include "driver/gpio.h"
|
||||
#include "esp_event.h"
|
||||
|
||||
#include "WifiConfigurationAp.h"
|
||||
#include "Application.h"
|
||||
#include "SystemInfo.h"
|
||||
#include "SystemReset.h"
|
||||
#include "BuiltinLed.h"
|
||||
|
||||
#define TAG "main"
|
||||
#define STATS_TICKS pdMS_TO_TICKS(1000)
|
||||
@@ -32,21 +31,6 @@ extern "C" void app_main(void)
    }
    ESP_ERROR_CHECK(ret);

    // Get the WiFi configuration
    nvs_handle_t nvs_handle;
    ret = nvs_open("wifi", NVS_READONLY, &nvs_handle);

    // If the WiFi configuration is not found, launch the WiFi configuration AP
    if (ret != ESP_OK) {
        auto& builtin_led = BuiltinLed::GetInstance();
        builtin_led.SetBlue();
        builtin_led.Blink(1000, 500);

        WifiConfigurationAp::GetInstance().Start("Xiaozhi");
        return;
    }
    nvs_close(nvs_handle);

    // Otherwise, launch the application
    Application::GetInstance().Start();

@@ -3,7 +3,7 @@
nvs,      data, nvs,     0x9000,   0x4000,
otadata,  data, ota,     0xd000,   0x2000,
phy_init, data, phy,     0xf000,   0x1000,
model,    data, spiffs,  0x100000, 1M,
factory,  app,  factory, 0x200000, 2M,
ota_0,    app,  ota_0,   0x400000, 2M,
ota_1,    app,  ota_1,   0x600000, 2M,
model,    data, spiffs,  0x10000,  0xF0000,
factory,  app,  factory, 0x200000, 4M,
ota_0,    app,  ota_0,   0x600000, 4M,
ota_1,    app,  ota_1,   0xA00000, 4M,

@@ -12,6 +12,7 @@ CONFIG_SPIRAM_MALLOC_ALWAYSINTERNAL=4096
CONFIG_SPIRAM_TRY_ALLOCATE_WIFI_LWIP=y
CONFIG_SPIRAM_MALLOC_RESERVE_INTERNAL=32768
CONFIG_SPIRAM_MEMTEST=n
CONFIG_MBEDTLS_EXTERNAL_MEM_ALLOC=y

CONFIG_HTTPD_MAX_REQ_HDR_LEN=2048
CONFIG_HTTPD_MAX_URI_LEN=2048

160
websocket.md
Normal file
@@ -0,0 +1,160 @@

# AI Voice Interaction Communication Protocol

## 1. Connection Establishment and Authentication

When connecting to the server over WebSocket, the client must include the following HTTP headers:

- `Authorization`: Bearer token, in the format "Bearer <access_token>"
- `Device-Id`: the device's MAC address
- `Protocol-Version`: the protocol version number, currently 2

WebSocket URL: `wss://api.tenclass.net/xiaozhi/v1`
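
As a rough illustration of the header list above, the sketch below collects the three handshake headers as name/value pairs. `BuildHandshakeHeaders` and `HeaderList` are hypothetical names introduced here, not part of the project, and how the pairs are handed to a WebSocket client depends on the library in use.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical container type: ordered list of HTTP header name/value pairs.
using HeaderList = std::vector<std::pair<std::string, std::string>>;

// Hypothetical helper: builds the three headers the protocol asks for.
// access_token and mac_address are supplied by the caller.
HeaderList BuildHandshakeHeaders(const std::string& access_token,
                                 const std::string& mac_address) {
    return {
        {"Authorization", "Bearer " + access_token},
        {"Device-Id", mac_address},
        {"Protocol-Version", "2"},  // current protocol version per this document
    };
}
```

A caller would typically iterate over the returned list and set each pair on the client before opening the connection.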

## 2. Binary Data

Binary data sent by the client uses a protocol with a fixed header format:

```cpp
struct BinaryProtocol {
    uint16_t version;       // Binary protocol version, currently 2
    uint16_t type;          // Message type (0: audio stream data, 1: JSON)
    uint32_t reserved;      // Reserved field
    uint32_t timestamp;     // Timestamp (reserved for echo cancellation; can also be used for ordering over unreliable UDP transport)
    uint32_t payload_size;  // Payload size
    uint8_t payload[];      // Either audio data (Opus-encoded or in the negotiated audio format) or an encapsulated JSON message
} __attribute__((packed));
```

Note: all multi-byte integer fields use network byte order (big-endian).
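
Because the header fields are big-endian on the wire, copying the packed struct directly on a little-endian host such as the ESP32 would produce the wrong byte order. The following is a minimal sketch of explicit encoding and decoding of the 16-byte fixed header; `FrameHeader`, `EncodeHeader`, and `DecodeHeader` are illustrative names assumed here, not functions from the project.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Host-order view of the fixed header (the payload is handled separately).
struct FrameHeader {
    uint16_t version;
    uint16_t type;
    uint32_t reserved;
    uint32_t timestamp;
    uint32_t payload_size;
};

constexpr std::size_t kHeaderSize = 16;  // 2 + 2 + 4 + 4 + 4 bytes

// Write/read integers most significant byte first (network byte order).
static void PutU16(uint8_t* p, uint16_t v) {
    p[0] = static_cast<uint8_t>(v >> 8);
    p[1] = static_cast<uint8_t>(v & 0xFF);
}
static void PutU32(uint8_t* p, uint32_t v) {
    for (int i = 0; i < 4; ++i) {
        p[i] = static_cast<uint8_t>((v >> (24 - 8 * i)) & 0xFF);
    }
}
static uint16_t GetU16(const uint8_t* p) {
    return static_cast<uint16_t>((p[0] << 8) | p[1]);
}
static uint32_t GetU32(const uint8_t* p) {
    return (uint32_t{p[0]} << 24) | (uint32_t{p[1]} << 16) |
           (uint32_t{p[2]} << 8) | uint32_t{p[3]};
}

// Serialize the header into its 16-byte wire form.
std::array<uint8_t, kHeaderSize> EncodeHeader(const FrameHeader& h) {
    std::array<uint8_t, kHeaderSize> out{};
    PutU16(&out[0], h.version);
    PutU16(&out[2], h.type);
    PutU32(&out[4], h.reserved);
    PutU32(&out[8], h.timestamp);
    PutU32(&out[12], h.payload_size);
    return out;
}

// Parse a header from at least kHeaderSize bytes of received data.
FrameHeader DecodeHeader(const uint8_t* p) {
    return FrameHeader{GetU16(p), GetU16(p + 2), GetU32(p + 4),
                       GetU32(p + 8), GetU32(p + 12)};
}
```

Shifting bytes explicitly keeps the code independent of the host's endianness; a receiver should also check that it has at least `kHeaderSize` bytes before calling `DecodeHeader`.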
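
Currently, binary data and JSON share the same WebSocket connection. In a future real-time conversation mode, binary audio data may be sent over UDP; this can be negotiated by extending the hello message.

## 3. Audio Data Transmission

- Client to server: Opus-encoded audio data sent using the binary protocol
- Server to client: Opus-encoded audio data sent using the binary protocol, in the same format the client uses

An audio packet with a `payload_size` of 0 may be used as a sentence boundary marker. Clients can ignore it, but must not treat it as an error.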
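
One way to honor the zero-length rule is to classify each frame before handing it to the decoder. The sketch below assumes the `type` and `payload_size` header fields have already been converted to host byte order; `FrameKind` and `ClassifyFrame` are hypothetical names used only for illustration.

```cpp
#include <cstdint>

enum class FrameKind { kAudio, kJson, kSentenceBoundary, kUnknown };

// Classify a received frame from its header fields (host byte order).
// type 0 is audio stream data and type 1 is JSON, as defined in section 2.
FrameKind ClassifyFrame(uint16_t type, uint32_t payload_size) {
    if (type == 0) {
        // An empty audio packet is a boundary marker, not an error.
        return payload_size == 0 ? FrameKind::kSentenceBoundary
                                 : FrameKind::kAudio;
    }
    if (type == 1) {
        return FrameKind::kJson;
    }
    return FrameKind::kUnknown;
}
```

A receiver can skip `kSentenceBoundary` frames silently and pass only `kAudio` frames to the Opus decoder.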

## 4. Handshake Message

After the connection is established, the client sends a JSON "hello" message to initialize the server-side audio decoder.
The client does not need to wait for a server response and can start sending audio data right away.

```json
{
  "type": "hello",
  "response_mode": "auto",
  "audio_params": {
    "format": "opus",
    "sample_rate": 16000,
    "channels": 1
  }
}
```

The response mode `response_mode` can be `auto` or `manual`.

`auto`: automatic response mode. The server runs voice activity detection (VAD) on the incoming audio in real time and decides on its own when to start responding.

`manual`: manual response mode. The server may respond once the client state changes from `listening` to `idle`.
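
The hello message is small enough to assemble by hand. The sketch below builds it with plain string concatenation; `BuildHelloMessage` is a hypothetical helper, and it assumes `response_mode` is one of the fixed values above, so no JSON escaping is performed. A real client may prefer a JSON library.

```cpp
#include <string>

// Hypothetical helper: returns the compact JSON text of the hello message.
// response_mode is assumed to be a plain identifier such as "auto" or
// "manual"; it is inserted without escaping.
std::string BuildHelloMessage(const std::string& response_mode,
                              int sample_rate, int channels) {
    return std::string("{\"type\":\"hello\",\"response_mode\":\"") +
           response_mode +
           "\",\"audio_params\":{\"format\":\"opus\",\"sample_rate\":" +
           std::to_string(sample_rate) +
           ",\"channels\":" + std::to_string(channels) + "}}";
}
```

For example, `BuildHelloMessage("auto", 16000, 1)` yields the same fields as the JSON shown above, without whitespace.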
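
## 5. State Updates

The client sends a JSON message whenever its state changes:

```json
{
  "type": "state",
  "state": "<new state>"
}
```

Possible state values: `idle`, `wake_word_detected`, `listening`, `speaking`.

Examples:

1. Push-to-talk (`response_mode` is `manual`)

- When the talk button is pressed and the client is not connected, it connects to the server while encoding and buffering the current audio. Once connected, the client sets its state to `listening` and sends the buffered audio data after the hello message.
- When the talk button is pressed and the client is already connected, it sets its state to `listening` and sends audio data.
- When the talk button is released, the state changes to `idle` and the server starts recognition.
- When the server starts responding, it pushes `stt` and `tts` messages.
- When the client starts playing audio, it sets its state to `speaking`.
- When the client finishes playing audio, it sets its state to `idle`.
- In the `speaking` state, pressing the talk button immediately stops the current playback and changes the state to `listening`.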
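
The push-to-talk flow above can be sketched as a small transition function. The `State` and `Event` enums and the function names are assumptions made for illustration; connection handling and audio buffering are left out, and combinations the document does not describe (such as playback starting while listening) simply keep the current state.

```cpp
#include <string>

enum class State { kIdle, kListening, kSpeaking };
enum class Event {
    kButtonPressed,
    kButtonReleased,
    kPlaybackStarted,
    kPlaybackFinished
};

// Hypothetical transition function for response_mode "manual".
State NextStateManual(State s, Event e) {
    switch (e) {
        case Event::kButtonPressed:
            // Pressing the button always starts listening; if the client
            // was speaking, the caller also stops playback.
            return State::kListening;
        case Event::kButtonReleased:
            return s == State::kListening ? State::kIdle : s;
        case Event::kPlaybackStarted:
            return s == State::kIdle ? State::kSpeaking : s;
        case Event::kPlaybackFinished:
            return s == State::kSpeaking ? State::kIdle : s;
    }
    return s;
}

// Value for the "state" field of the state update message.
std::string StateName(State s) {
    switch (s) {
        case State::kIdle: return "idle";
        case State::kListening: return "listening";
        case State::kSpeaking: return "speaking";
    }
    return "idle";
}
```

Whenever `NextStateManual` returns a state different from the current one, the client would send a state update message carrying `StateName` of the new state.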
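
2. Voice wake-up, turn-based conversation (`response_mode` is `auto`)

- The client connects to the server, sends the hello message, sends the wake-word audio data, and then sends the state `wake_word_detected`; the server starts responding.
- When the client starts playing audio, it sets its state to `speaking` and does not send audio data.
- When the client finishes playing audio, it sets its state to `listening` and sends audio data.
- When the server's VAD decides it is time to respond, the server pushes `stt` and `tts` messages.
- When the client receives `tts`.`start`, it starts playing audio and sets its state to `speaking`.
- When the client receives `tts`.`stop`, it stops playing audio and sets its state to `listening`.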
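
The turn-based flow can be sketched in the same way. In the sketch below, `AutoState`, `AutoEvent`, `NextStateAuto`, and `ShouldSendMicAudio` are hypothetical names; the `kDisconnected` event reflects the rule in section 7 that a dropped connection resets the client to idle.

```cpp
#include <string>

enum class AutoState { kIdle, kWakeWordDetected, kListening, kSpeaking };
enum class AutoEvent { kWakeWord, kTtsStart, kTtsStop, kDisconnected };

// Hypothetical transition function for response_mode "auto".
AutoState NextStateAuto(AutoState s, AutoEvent e) {
    switch (e) {
        case AutoEvent::kWakeWord:
            return s == AutoState::kIdle ? AutoState::kWakeWordDetected : s;
        case AutoEvent::kTtsStart:
            return AutoState::kSpeaking;   // start playback
        case AutoEvent::kTtsStop:
            return s == AutoState::kSpeaking ? AutoState::kListening : s;
        case AutoEvent::kDisconnected:
            return AutoState::kIdle;       // stop playback and reset
    }
    return s;
}

// In turn-based mode, microphone audio is only sent while listening.
bool ShouldSendMicAudio(AutoState s) {
    return s == AutoState::kListening;
}

// Value for the "state" field of the state update message.
std::string AutoStateName(AutoState s) {
    switch (s) {
        case AutoState::kIdle: return "idle";
        case AutoState::kWakeWordDetected: return "wake_word_detected";
        case AutoState::kListening: return "listening";
        case AutoState::kSpeaking: return "speaking";
    }
    return "idle";
}
```

Gating the microphone on `ShouldSendMicAudio` keeps the device from streaming its own playback back to the server while it is speaking.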
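
3. Voice wake-up, real-time conversation (`response_mode` is `real_time`)

- The client connects to the server, sends the hello message, sends the wake-word audio data, and then sends the state `wake_word_detected`; the server starts responding.
- When the client starts playing audio, it sets its state to `speaking`.
- When the client finishes playing audio, it sets its state to `listening`.
- The client sends audio data in both the `speaking` and `listening` states.
- When the server's VAD decides it is time to respond, the server pushes `stt` and `tts` messages.
- When the client receives `stt`, it sets its state to `listening`. If audio is currently playing, playback stops after the current sentence ends.
- When the client receives `tts`.`start`, it starts playing audio and sets its state to `speaking`.
- When the client receives `tts`.`stop`, it stops playing audio and sets its state to `listening`.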

## 6. Server-to-Client Messages

### 6.1 Speech Recognition Result (STT)

```json
{
  "type": "stt",
  "text": "<recognized text>"
}
```

### 6.2 Text-to-Speech (TTS)

TTS start:
```json
{
  "type": "tts",
  "state": "start",
  "sample_rate": 24000
}
```

Sentence start:
```json
{
  "type": "tts",
  "state": "sentence_start",
  "text": "你在干什么呀?"
}
```

Sentence end:
```json
{
  "type": "tts",
  "state": "sentence_end"
}
```

TTS end:
```json
{
  "type": "tts",
  "state": "stop"
}
```
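
## 7. Connection Management

- When the client detects that the WebSocket connection has dropped, it should stop audio playback and reset to the idle state.
- After a disconnect, the client reconnects on demand (for example, on a button press or voice wake-up).

This document summarizes the main aspects of the WebSocket communication protocol.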