Files
xiaozhi-esp32/main/audio/processors/afe_audio_processor.h
Xiaoxia 2b025c4ea6 Enhance audio processing and wake word detection (#1739)
* Enhance audio processing and wake word detection

- Set task priority in Application::Run to improve responsiveness.
- Log detected wake words with their state in HandleWakeWordDetectedEvent.
- Streamline audio feeding in AudioService to handle both wake word and audio processor events.
- Implement input buffering in AfeAudioProcessor, AfeWakeWord, CustomWakeWord, and EspWakeWord to manage audio data more efficiently.
- Clear input buffers on stop to prevent residual data issues.

* Refactor audio processing to enhance thread safety and state management

- Implement early return checks in Feed methods of AfeAudioProcessor, AfeWakeWord, CustomWakeWord, and EspWakeWord to prevent processing when not running.
- Introduce std::atomic for running state in CustomWakeWord and EspWakeWord to ensure thread-safe access.
- Consolidate input buffer management with mutex locks to avoid race conditions during Stop and Feed operations.

* Refactor listening mode handling and wake word detection configuration

- Replace direct mode setting logic with a new GetDefaultListeningMode method for improved clarity and maintainability.
- Update HandleToggleChatEvent, HandleWakeWordDetectedEvent, and ContinueWakeWordInvoke to utilize the new method for determining listening mode.
- Introduce Kconfig option WAKE_WORD_DETECTION_IN_LISTENING to enable or disable wake word detection during listening mode, enhancing configurability.
2026-02-04 14:28:21 +08:00

48 lines
1.4 KiB
C++

#ifndef AFE_AUDIO_PROCESSOR_H
#define AFE_AUDIO_PROCESSOR_H
#include <esp_afe_sr_models.h>
#include <freertos/FreeRTOS.h>
#include <freertos/task.h>
#include <freertos/event_groups.h>
#include <string>
#include <vector>
#include <functional>
#include <mutex>
#include "audio_processor.h"
#include "audio_codec.h"
class AfeAudioProcessor : public AudioProcessor {
public:
AfeAudioProcessor();
~AfeAudioProcessor();
void Initialize(AudioCodec* codec, int frame_duration_ms, srmodel_list_t* models_list) override;
void Feed(std::vector<int16_t>&& data) override;
void Start() override;
void Stop() override;
bool IsRunning() override;
void OnOutput(std::function<void(std::vector<int16_t>&& data)> callback) override;
void OnVadStateChange(std::function<void(bool speaking)> callback) override;
size_t GetFeedSize() override;
void EnableDeviceAec(bool enable) override;
private:
EventGroupHandle_t event_group_ = nullptr;
const esp_afe_sr_iface_t* afe_iface_ = nullptr;
esp_afe_sr_data_t* afe_data_ = nullptr;
std::function<void(std::vector<int16_t>&& data)> output_callback_;
std::function<void(bool speaking)> vad_state_change_callback_;
AudioCodec* codec_ = nullptr;
int frame_samples_ = 0;
bool is_speaking_ = false;
std::vector<int16_t> input_buffer_;
std::mutex input_buffer_mutex_;
std::vector<int16_t> output_buffer_;
void AudioProcessorTask();
};
#endif