audio

package
v0.2.0 Latest
Published: May 16, 2026 License: Apache-2.0 Imports: 9 Imported by: 0

Documentation

Overview

Package audio provides audio processors for the pipeline: VAD (voice activity detection), an audio filter processor that applies a filter chain to PCM frames, and an audio buffer processor that merges user and bot audio with optional turn-based and buffered callbacks.

Functions

func NewAudioFilterProcessorFromOptions

func NewAudioFilterProcessorFromOptions(name string, opts json.RawMessage) processors.Processor

NewAudioFilterProcessorFromOptions builds an AudioFilterProcessor from JSON plugin options.

Types

type AudioBufferProcessor

type AudioBufferProcessor struct {
	*processors.BaseProcessor
	SampleRate      int // 0 = use from StartFrame
	NumChannels     int // 1 = mono mix, 2 = stereo interleave
	BufferSize      int // bytes; 0 = no buffered callbacks, only turn callbacks
	EnableTurnAudio bool

	OnAudioData         func(merged []byte, sampleRate, numChannels int)
	OnTrackAudioData    func(userBuf, botBuf []byte, sampleRate, numChannels int)
	OnUserTurnAudioData func(buf []byte, sampleRate, numChannels int)
	OnBotTurnAudioData  func(buf []byte, sampleRate, numChannels int)
	// contains filtered or unexported fields
}

AudioBufferProcessor buffers user and bot audio, resamples both to a target rate, synchronizes the two buffers by padding with silence, and merges them into mono (mixed) or stereo (user on the left channel, bot on the right) output. All callbacks are optional: OnAudioData receives the merged audio, OnTrackAudioData receives the user and bot tracks separately, and, when EnableTurnAudio is true, OnUserTurnAudioData and OnBotTurnAudioData receive the audio of each completed turn.
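
The sync-and-merge step described above can be illustrated with a small standalone sketch. It is not the package's implementation: it assumes already-decoded 16-bit samples, mixes mono by summing with clamping (the real mixing rule is not documented here), and uses hypothetical helper names (padToLen, syncTracks, mixMono, interleaveStereo).

```go
package main

import "fmt"

// padToLen returns buf extended with zero samples (silence) up to n samples.
func padToLen(buf []int16, n int) []int16 {
	if len(buf) >= n {
		return buf
	}
	out := make([]int16, n)
	copy(out, buf)
	return out
}

// syncTracks pads the shorter track with silence so both have equal length.
func syncTracks(user, bot []int16) ([]int16, []int16) {
	n := len(user)
	if len(bot) > n {
		n = len(bot)
	}
	return padToLen(user, n), padToLen(bot, n)
}

// mixMono sums the two tracks sample by sample, clamping to the int16 range.
// Summing with clamping is an assumption made for this sketch.
func mixMono(user, bot []int16) []int16 {
	user, bot = syncTracks(user, bot)
	out := make([]int16, len(user))
	for i := range user {
		s := int32(user[i]) + int32(bot[i])
		if s > 32767 {
			s = 32767
		} else if s < -32768 {
			s = -32768
		}
		out[i] = int16(s)
	}
	return out
}

// interleaveStereo places user samples on the left channel and bot samples
// on the right, producing L R L R ... frames.
func interleaveStereo(user, bot []int16) []int16 {
	user, bot = syncTracks(user, bot)
	out := make([]int16, 0, 2*len(user))
	for i := range user {
		out = append(out, user[i], bot[i])
	}
	return out
}

func main() {
	user := []int16{100, 200, 300}
	bot := []int16{10, 20} // one sample short: padded with silence

	fmt.Println(mixMono(user, bot))          // [110 220 300]
	fmt.Println(interleaveStereo(user, bot)) // [100 10 200 20 300 0]
}
```

With NumChannels set to 1 the merged output corresponds to the mono path, and with 2 to the interleaved path; the padding keeps the two tracks time-aligned in either case.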

func NewAudioBufferProcessor

func NewAudioBufferProcessor(name string, sampleRate, numChannels, bufferSize int, enableTurnAudio bool) *AudioBufferProcessor

NewAudioBufferProcessor returns an AudioBufferProcessor with the given config.

func (*AudioBufferProcessor) Cleanup

func (p *AudioBufferProcessor) Cleanup(ctx context.Context) error

Cleanup ensures any buffered audio is flushed before the processor is torn down.

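func (*AudioBufferProcessor) ProcessFrame

func (p *AudioBufferProcessor) ProcessFrame(ctx context.Context, f frames.Frame, dir processors.Direction) error

ProcessFrame buffers and resamples user/bot audio, syncs buffers, and invokes callbacks.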

func (*AudioBufferProcessor) StartRecording

func (p *AudioBufferProcessor) StartRecording()

StartRecording enables buffering and resets buffers.

func (*AudioBufferProcessor) StopRecording

func (p *AudioBufferProcessor) StopRecording(ctx context.Context)

StopRecording flushes remaining audio via callbacks and stops recording.
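
The recording lifecycle (start resets, buffered callbacks fire while recording, stop flushes the remainder) can be sketched as follows. The recorder type and its methods are hypothetical stand-ins, not the package's code; in particular, how the real processor treats audio received outside a recording, and what it flushes when BufferSize is 0, are assumptions here.

```go
package main

import "fmt"

// recorder is a hypothetical stand-in for the buffering half of
// AudioBufferProcessor; it is not the package's implementation.
type recorder struct {
	bufferSize int          // bytes; 0 disables mid-stream callbacks
	onData     func([]byte) // called with each flushed chunk
	recording  bool
	buf        []byte
}

// Start enables buffering and discards anything left from a previous run.
func (r *recorder) Start() {
	r.recording = true
	r.buf = r.buf[:0]
}

// Write appends audio while recording and flushes once bufferSize is reached.
func (r *recorder) Write(p []byte) {
	if !r.recording {
		return // assumption: audio outside a recording is dropped
	}
	r.buf = append(r.buf, p...)
	if r.bufferSize > 0 && len(r.buf) >= r.bufferSize {
		r.flush()
	}
}

// Stop flushes whatever is still buffered, then disables buffering.
func (r *recorder) Stop() {
	if !r.recording {
		return
	}
	r.flush()
	r.recording = false
}

// flush hands a copy of the buffered bytes to the callback and empties the buffer.
func (r *recorder) flush() {
	if len(r.buf) == 0 || r.onData == nil {
		r.buf = r.buf[:0]
		return
	}
	chunk := append([]byte(nil), r.buf...) // copy so the callback owns its data
	r.buf = r.buf[:0]
	r.onData(chunk)
}

func main() {
	r := &recorder{
		bufferSize: 4,
		onData:     func(b []byte) { fmt.Println("flushed", len(b), "bytes") },
	}
	r.Start()
	r.Write([]byte{1, 2, 3}) // below the threshold, nothing is emitted
	r.Write([]byte{4, 5})    // 5 bytes buffered, threshold reached
	r.Write([]byte{6})       // buffered until Stop
	r.Stop()                 // remaining byte is flushed
}
```

Copying the chunk before invoking the callback is a deliberate choice in this sketch: the internal buffer is reused, so a callback that retains the slice would otherwise see it overwritten.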

type AudioFilterProcessor

type AudioFilterProcessor struct {
	*processors.BaseProcessor

	Chain *audiofilters.Chain
}

AudioFilterProcessor applies a configured chain of audio filters to raw PCM audio frames before they reach downstream processors (e.g. VAD, STT).

func NewAudioFilterProcessor

func NewAudioFilterProcessor(name string, chain *audiofilters.Chain) *AudioFilterProcessor

NewAudioFilterProcessor constructs an AudioFilterProcessor with the given filter chain. When chain is nil, the processor is a no-op pass-through.
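
To illustrate the nil-chain pass-through, here is a minimal sketch using local stand-ins for the audiofilters package. The filter interface, chain type, and gain filter below are hypothetical; the real Chain API is not shown in this documentation and may differ.

```go
package main

import "fmt"

// filter and chain are hypothetical stand-ins for the audiofilters package.
type filter interface {
	Apply(samples []int16) []int16
}

type chain struct{ filters []filter }

// Apply runs each filter in order. A nil chain returns the input untouched,
// mirroring the documented pass-through behavior.
func (c *chain) Apply(samples []int16) []int16 {
	if c == nil {
		return samples
	}
	for _, f := range c.filters {
		samples = f.Apply(samples)
	}
	return samples
}

// gain scales every sample by a constant factor, clamping to the int16 range.
type gain struct{ factor float64 }

func (g gain) Apply(samples []int16) []int16 {
	out := make([]int16, len(samples))
	for i, s := range samples {
		v := float64(s) * g.factor
		if v > 32767 {
			v = 32767
		} else if v < -32768 {
			v = -32768
		}
		out[i] = int16(v)
	}
	return out
}

func main() {
	in := []int16{1000, -2000, 30000}

	var none *chain
	fmt.Println(none.Apply(in)) // [1000 -2000 30000]

	half := &chain{filters: []filter{gain{factor: 0.5}}}
	fmt.Println(half.Apply(in)) // [500 -1000 15000]
}
```

Handling nil inside the method keeps call sites free of nil checks, which is one plausible way to get the no-op behavior described above.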

func (*AudioFilterProcessor) ProcessFrame

ProcessFrame applies the filter chain to audio-carrying frames and forwards all other frames unchanged.

type AudioFilterProcessorOptions

type AudioFilterProcessorOptions struct {
	Filters []struct {
		// Type selects the filter implementation, e.g. "gain".
		Type string `json:"type"`
		// Gain configures GainFilter when Type is "gain".
		Gain float64 `json:"gain,omitempty"`
	} `json:"filters,omitempty"`
}

AudioFilterProcessorOptions describes JSON options for the "audio_filter" processor when used via plugin_options.
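
As a sketch of what such options might look like on the wire, the example below decodes a JSON document into a local struct that mirrors the documented fields. The filterOptions type and parseFilterOptions helper are hypothetical; only the field names and JSON tags come from the type above, and treating empty input as "no filters" is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// filterOptions mirrors the documented AudioFilterProcessorOptions shape.
// It is a local stand-in so the example compiles without the real package.
type filterOptions struct {
	Filters []struct {
		Type string  `json:"type"`
		Gain float64 `json:"gain,omitempty"`
	} `json:"filters,omitempty"`
}

// parseFilterOptions decodes raw plugin options into filterOptions.
// Empty input yields an empty filter list (an assumption of this sketch).
func parseFilterOptions(raw json.RawMessage) (filterOptions, error) {
	var opts filterOptions
	if len(raw) == 0 {
		return opts, nil
	}
	err := json.Unmarshal(raw, &opts)
	return opts, err
}

func main() {
	raw := json.RawMessage(`{"filters":[{"type":"gain","gain":0.5}]}`)
	opts, err := parseFilterOptions(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, f := range opts.Filters {
		fmt.Printf("filter=%s gain=%.2f\n", f.Type, f.Gain)
	}
}
```

A payload of this shape is presumably what NewAudioFilterProcessorFromOptions receives as its opts argument.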

type VADProcessor

type VADProcessor struct {
	*processors.BaseProcessor
	Analyzer             vad.Analyzer
	SpeechActivityPeriod time.Duration
	// contains filtered or unexported fields
}

VADProcessor processes audio frames through voice activity detection and pushes VAD-related frames downstream: VADUserStartedSpeakingFrame, VADUserStoppedSpeakingFrame, and optionally UserSpeakingFrame periodically while speech is detected.

func NewVADProcessor

func NewVADProcessor(name string, analyzer vad.Analyzer, speechActivityPeriod time.Duration) *VADProcessor

NewVADProcessor returns a VADProcessor that uses the given analyzer. speechActivityPeriod is the minimum interval between UserSpeakingFrame pushes while speech is detected; zero defaults to 200ms.
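
The minimum-interval rule can be sketched as a small throttle. This is an illustration only: activityThrottle, newActivityThrottle, and atMillis are hypothetical names, and treating negative periods like zero is an assumption beyond what the documentation states.

```go
package main

import (
	"fmt"
	"time"
)

// defaultSpeechActivityPeriod matches the documented 200ms default.
const defaultSpeechActivityPeriod = 200 * time.Millisecond

// activityThrottle decides whether enough time has passed to emit another
// activity notification. Hypothetical helper, not part of the package.
type activityThrottle struct {
	period time.Duration
	last   time.Time
}

// newActivityThrottle applies the default when period is zero
// (negative values are treated the same way, as an assumption).
func newActivityThrottle(period time.Duration) *activityThrottle {
	if period <= 0 {
		period = defaultSpeechActivityPeriod
	}
	return &activityThrottle{period: period}
}

// Allow reports whether an emission at now is permitted and, if so, records it.
func (t *activityThrottle) Allow(now time.Time) bool {
	if !t.last.IsZero() && now.Sub(t.last) < t.period {
		return false
	}
	t.last = now
	return true
}

// atMillis builds a timestamp ms milliseconds after an arbitrary fixed origin.
func atMillis(ms int) time.Time {
	return time.Unix(0, 0).Add(time.Duration(ms) * time.Millisecond)
}

func main() {
	th := newActivityThrottle(0) // zero selects the 200ms default
	for _, ms := range []int{0, 50, 150, 200, 390, 400} {
		fmt.Printf("t=%dms emit=%v\n", ms, th.Allow(atMillis(ms)))
	}
}
```

Passing the timestamp in, rather than calling time.Now inside Allow, keeps the throttle deterministic and easy to test.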

func (*VADProcessor) ProcessFrame

func (p *VADProcessor) ProcessFrame(ctx context.Context, f frames.Frame, dir processors.Direction) error

ProcessFrame forwards the frame downstream first, then runs VAD on audio frames and pushes VAD start/stop/activity frames on state transitions.
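
The "push on state transitions" part can be pictured as simple edge detection over per-chunk speech decisions. The sketch below covers only that idea; transitionTracker and the event constants are hypothetical, and frame forwarding, the analyzer, and periodic activity frames are left out.

```go
package main

import "fmt"

// event names the transition that was detected. The string values echo the
// documented frame names but are plain labels in this sketch.
type event string

const (
	userStarted event = "VADUserStartedSpeaking"
	userStopped event = "VADUserStoppedSpeaking"
)

// transitionTracker turns a per-chunk speech/no-speech decision into
// start/stop events. Hypothetical sketch of edge detection only.
type transitionTracker struct{ speaking bool }

// Update returns an event and true only when the speaking state changes.
func (t *transitionTracker) Update(speech bool) (event, bool) {
	if speech == t.speaking {
		return "", false
	}
	t.speaking = speech
	if speech {
		return userStarted, true
	}
	return userStopped, true
}

func main() {
	var tracker transitionTracker
	decisions := []bool{false, true, true, false, false, true}
	for i, speech := range decisions {
		if ev, ok := tracker.Update(speech); ok {
			fmt.Printf("chunk %d: %s\n", i, ev)
		}
	}
}
```

Repeated identical decisions produce no events, so downstream processors see one start and one stop per stretch of speech.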
