Documentation
Overview
Package audio provides audio processors for the pipeline: VAD (voice activity detection), an audio filter processor that applies a filter chain to raw PCM audio, and an audio buffer processor that merges user and bot audio with optional turn-based and buffered callbacks.
Functions
func NewAudioFilterProcessorFromOptions
func NewAudioFilterProcessorFromOptions(name string, opts json.RawMessage) processors.Processor
NewAudioFilterProcessorFromOptions builds an AudioFilterProcessor from JSON plugin options.
Types
type AudioBufferProcessor
type AudioBufferProcessor struct {
	*processors.BaseProcessor
	SampleRate          int // 0 = use from StartFrame
	NumChannels         int // 1 = mono mix, 2 = stereo interleave
	BufferSize          int // bytes; 0 = no buffered callbacks, only turn callbacks
	EnableTurnAudio     bool
	OnAudioData         func(merged []byte, sampleRate, numChannels int)
	OnTrackAudioData    func(userBuf, botBuf []byte, sampleRate, numChannels int)
	OnUserTurnAudioData func(buf []byte, sampleRate, numChannels int)
	OnBotTurnAudioData  func(buf []byte, sampleRate, numChannels int)
	// contains filtered or unexported fields
}
AudioBufferProcessor buffers user and bot audio, resamples it to a target rate, keeps the two buffers in sync by padding with silence, and merges them into mono (mixed) or stereo (user on the left channel, bot on the right). The callbacks are optional: OnAudioData receives the merged audio, OnTrackAudioData receives the separate user and bot tracks, and, when EnableTurnAudio is true, OnUserTurnAudioData and OnBotTurnAudioData receive the audio for each turn.
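The sync-and-merge step described above can be pictured with a small sketch. This is not the package's implementation: mergeTracks and padToLen are hypothetical helpers, and the sketch assumes both inputs are 16-bit little-endian mono PCM at the same sample rate, that the shorter track is padded at its end, and that the mono mix is a clamped sum.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// padToLen returns buf extended with zero bytes (silence for signed PCM)
// so that it is at least n bytes long.
func padToLen(buf []byte, n int) []byte {
	if len(buf) >= n {
		return buf
	}
	out := make([]byte, n)
	copy(out, buf)
	return out
}

// mergeTracks is a hypothetical helper. It assumes 16-bit little-endian
// mono PCM on both inputs. numChannels == 1 mixes the tracks; any other
// value interleaves them as stereo (user = left, bot = right).
func mergeTracks(user, bot []byte, numChannels int) []byte {
	n := len(user)
	if len(bot) > n {
		n = len(bot)
	}
	n -= n % 2 // process whole samples only
	user, bot = padToLen(user, n), padToLen(bot, n)

	if numChannels == 1 {
		out := make([]byte, n)
		for i := 0; i < n; i += 2 {
			u := int32(int16(binary.LittleEndian.Uint16(user[i:])))
			b := int32(int16(binary.LittleEndian.Uint16(bot[i:])))
			s := u + b
			// Clamp to the int16 range instead of wrapping around.
			if s > 32767 {
				s = 32767
			} else if s < -32768 {
				s = -32768
			}
			binary.LittleEndian.PutUint16(out[i:], uint16(int16(s)))
		}
		return out
	}

	out := make([]byte, 2*n)
	for i := 0; i < n; i += 2 {
		copy(out[2*i:], user[i:i+2])  // left channel sample
		copy(out[2*i+2:], bot[i:i+2]) // right channel sample
	}
	return out
}

func main() {
	user := []byte{1, 0, 2, 0} // samples 1, 2
	bot := []byte{10, 0}       // sample 10, padded with one silent sample
	fmt.Println(mergeTracks(user, bot, 1)) // [11 0 2 0]
	fmt.Println(mergeTracks(user, bot, 2)) // [1 0 10 0 2 0 0 0]
}
```

A clamped sum keeps loud overlapping speech from wrapping around; averaging the two samples would be an equally plausible choice for the mono mix.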
func NewAudioBufferProcessor
func NewAudioBufferProcessor(name string, sampleRate, numChannels, bufferSize int, enableTurnAudio bool) *AudioBufferProcessor
NewAudioBufferProcessor returns an AudioBufferProcessor configured with the given sample rate, channel count, buffer size, and turn-audio setting.
func (*AudioBufferProcessor) Cleanup
func (p *AudioBufferProcessor) Cleanup(ctx context.Context) error
Cleanup ensures any buffered audio is flushed before the processor is torn down.
func (*AudioBufferProcessor) ProcessFrame
func (p *AudioBufferProcessor) ProcessFrame(ctx context.Context, f frames.Frame, dir processors.Direction) error
ProcessFrame buffers and resamples user/bot audio, syncs buffers, and invokes callbacks.
func (*AudioBufferProcessor) StartRecording
func (p *AudioBufferProcessor) StartRecording()
StartRecording enables buffering and resets buffers.
func (*AudioBufferProcessor) StopRecording
func (p *AudioBufferProcessor) StopRecording(ctx context.Context)
StopRecording flushes remaining audio via callbacks and stops recording.
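The interplay of BufferSize, the buffered callbacks, and the flush on StopRecording might look roughly like the sketch below. The chunker type and its methods are hypothetical stand-ins, not part of this package. The sketch assumes the callback fires once per full BufferSize chunk and that a flush delivers whatever remains; the real processor may deliver data differently.

```go
package main

import "fmt"

// chunker is a hypothetical stand-in for the processor's internal buffer.
type chunker struct {
	bufferSize int // bytes; 0 disables size-triggered callbacks
	buf        []byte
	onData     func(chunk []byte)
}

// write appends audio and fires onData once per full bufferSize chunk
// (assumed behaviour). With bufferSize 0 nothing is buffered here.
func (c *chunker) write(p []byte) {
	if c.bufferSize <= 0 {
		return
	}
	c.buf = append(c.buf, p...)
	for len(c.buf) >= c.bufferSize {
		chunk := make([]byte, c.bufferSize)
		copy(chunk, c.buf)
		c.buf = c.buf[c.bufferSize:]
		c.onData(chunk)
	}
}

// flush delivers any remaining partial chunk, mirroring what
// StopRecording is documented to do with leftover audio.
func (c *chunker) flush() {
	if len(c.buf) == 0 {
		return
	}
	chunk := c.buf
	c.buf = nil
	c.onData(chunk)
}

func main() {
	var sizes []int
	c := &chunker{bufferSize: 4, onData: func(b []byte) {
		sizes = append(sizes, len(b))
	}}
	c.write(make([]byte, 6)) // one full chunk, 2 bytes left over
	c.write(make([]byte, 3)) // 5 bytes buffered: one more chunk, 1 left
	c.flush()                // delivers the final byte
	fmt.Println(sizes)       // [4 4 1]
}
```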
type AudioFilterProcessor
type AudioFilterProcessor struct {
	*processors.BaseProcessor
	Chain *audiofilters.Chain
}
AudioFilterProcessor applies a configured chain of audio filters to raw PCM audio frames before they reach downstream processors (e.g. VAD, STT).
func NewAudioFilterProcessor
func NewAudioFilterProcessor(name string, chain *audiofilters.Chain) *AudioFilterProcessor
NewAudioFilterProcessor constructs an AudioFilterProcessor with the given filter chain. When chain is nil, the processor is a no-op pass-through.
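As an illustration of a filter chain with nil pass-through, here is a minimal sketch. The filter interface, gainFilter, and chain types are hypothetical and only mimic the idea; the actual audiofilters API is not shown in this documentation and may differ. The sketch assumes 16-bit little-endian PCM and treats the gain as a linear multiplier.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// filter is a hypothetical interface; the real audiofilters package may differ.
type filter interface {
	Apply(pcm []byte) []byte
}

// gainFilter scales 16-bit little-endian PCM samples by Gain (assumed to be
// a linear multiplier), clamping results to the int16 range.
type gainFilter struct{ Gain float64 }

func (g gainFilter) Apply(pcm []byte) []byte {
	out := make([]byte, len(pcm))
	copy(out, pcm) // a trailing odd byte, if any, is kept unchanged
	for i := 0; i+1 < len(pcm); i += 2 {
		s := float64(int16(binary.LittleEndian.Uint16(pcm[i:]))) * g.Gain
		if s > 32767 {
			s = 32767
		} else if s < -32768 {
			s = -32768
		}
		binary.LittleEndian.PutUint16(out[i:], uint16(int16(s)))
	}
	return out
}

// chain is a hypothetical ordered list of filters.
type chain struct{ filters []filter }

// apply runs each filter in order. A nil chain returns the input untouched,
// matching the documented no-op pass-through.
func (c *chain) apply(pcm []byte) []byte {
	if c == nil {
		return pcm
	}
	for _, f := range c.filters {
		pcm = f.Apply(pcm)
	}
	return pcm
}

func main() {
	pcm := []byte{0xE8, 0x03, 0x20, 0x4E} // samples 1000 and 20000
	c := &chain{filters: []filter{gainFilter{Gain: 2}}}
	fmt.Println(c.apply(pcm)) // [208 7 255 127], i.e. samples 2000 and 32767 (clamped)

	var none *chain
	fmt.Println(none.apply(pcm)) // [232 3 32 78], unchanged
}
```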
func (*AudioFilterProcessor) ProcessFrame
func (p *AudioFilterProcessor) ProcessFrame(ctx context.Context, f frames.Frame, dir processors.Direction) error
ProcessFrame applies the filter chain to audio-carrying frames and forwards all other frames unchanged.
type AudioFilterProcessorOptions
type AudioFilterProcessorOptions struct {
	Filters []struct {
		// Type selects the filter implementation, e.g. "gain".
		Type string `json:"type"`
		// Gain configures GainFilter when Type is "gain".
		Gain float64 `json:"gain,omitempty"`
	} `json:"filters,omitempty"`
}
AudioFilterProcessorOptions describes JSON options for the "audio_filter" processor when used via plugin_options.
type VADProcessor
type VADProcessor struct {
	*processors.BaseProcessor
	Analyzer             vad.Analyzer
	SpeechActivityPeriod time.Duration
	// contains filtered or unexported fields
}
VADProcessor processes audio frames through voice activity detection and pushes VAD-related frames downstream: VADUserStartedSpeakingFrame, VADUserStoppedSpeakingFrame, and optionally UserSpeakingFrame periodically while speech is detected.
func NewVADProcessor
func NewVADProcessor(name string, analyzer vad.Analyzer, speechActivityPeriod time.Duration) *VADProcessor
NewVADProcessor returns a VADProcessor that uses the given analyzer. speechActivityPeriod is the minimum interval between UserSpeakingFrame pushes while speech is detected; zero defaults to 200ms.
func (*VADProcessor) ProcessFrame
func (p *VADProcessor) ProcessFrame(ctx context.Context, f frames.Frame, dir processors.Direction) error
ProcessFrame forwards the frame downstream first, then runs VAD on audio frames and pushes VAD start/stop/activity frames on state transitions.
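The ordering described above (forward first, then emit VAD frames on transitions) can be illustrated with a toy state machine. The detector type is hypothetical, frames are represented by plain strings, and the per-frame speech decision is passed in as a bool. A real vad.Analyzer is likely to have intermediate states and smoothing that this sketch ignores, and the periodic UserSpeakingFrame is left out.

```go
package main

import "fmt"

// detector is a hypothetical two-state model of the VAD processor.
type detector struct{ speaking bool }

// process returns, in order, the names of the frames pushed downstream
// for one input frame: the input itself first, then any VAD frame
// triggered by a change in the speech state.
func (d *detector) process(frameName string, speech bool) []string {
	out := []string{frameName} // the input frame is forwarded first
	switch {
	case speech && !d.speaking:
		out = append(out, "VADUserStartedSpeakingFrame")
	case !speech && d.speaking:
		out = append(out, "VADUserStoppedSpeakingFrame")
	}
	d.speaking = speech
	return out
}

func main() {
	d := &detector{}
	for i, speech := range []bool{false, true, true, false} {
		fmt.Println(d.process(fmt.Sprintf("audio#%d", i), speech))
	}
	// Output:
	// [audio#0]
	// [audio#1 VADUserStartedSpeakingFrame]
	// [audio#2]
	// [audio#3 VADUserStoppedSpeakingFrame]
}
```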