package audio

Version: v0.2.0 (latest)
Published: May 16, 2026
License: Apache-2.0
Imports: 9
Imported by: 0

Repository

github.com/Voxray-AI/Voxray

Documentation ¶

Overview ¶

Package audio provides audio processors for the pipeline: VAD (voice activity detection), an audio filter processor that applies a filter chain to raw PCM audio, and an audio buffer processor that merges user and bot audio with optional turn-based and buffered callbacks.

Index ¶

func NewAudioFilterProcessorFromOptions(name string, opts json.RawMessage) processors.Processor
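type AudioBufferProcessor
- func NewAudioBufferProcessor(name string, sampleRate, numChannels, bufferSize int, enableTurnAudio bool) *AudioBufferProcessor
- func (p *AudioBufferProcessor) Cleanup(ctx context.Context) error
- func (p *AudioBufferProcessor) ProcessFrame(ctx context.Context, f frames.Frame, dir processors.Direction) error
- func (p *AudioBufferProcessor) StartRecording()
- func (p *AudioBufferProcessor) StopRecording(ctx context.Context)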
type AudioFilterProcessor
- func NewAudioFilterProcessor(name string, chain *audiofilters.Chain) *AudioFilterProcessor
- func (p *AudioFilterProcessor) ProcessFrame(ctx context.Context, f frames.Frame, dir processors.Direction) error
type AudioFilterProcessorOptions
type VADProcessor
- func NewVADProcessor(name string, analyzer vad.Analyzer, speechActivityPeriod time.Duration) *VADProcessor
- func (p *VADProcessor) ProcessFrame(ctx context.Context, f frames.Frame, dir processors.Direction) error

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func NewAudioFilterProcessorFromOptions ¶

func NewAudioFilterProcessorFromOptions(name string, opts json.RawMessage) processors.Processor

NewAudioFilterProcessorFromOptions builds an AudioFilterProcessor from JSON plugin options.

Types ¶

type AudioBufferProcessor ¶

type AudioBufferProcessor struct {
	*processors.BaseProcessor
	SampleRate      int // 0 = use from StartFrame
	NumChannels     int // 1 = mono mix, 2 = stereo interleave
	BufferSize      int // bytes; 0 = no buffered callbacks, only turn callbacks
	EnableTurnAudio bool

	OnAudioData         func(merged []byte, sampleRate, numChannels int)
	OnTrackAudioData    func(userBuf, botBuf []byte, sampleRate, numChannels int)
	OnUserTurnAudioData func(buf []byte, sampleRate, numChannels int)
	OnBotTurnAudioData  func(buf []byte, sampleRate, numChannels int)
	// contains filtered or unexported fields
}

AudioBufferProcessor buffers user and bot audio, resamples both to a target rate, keeps the two buffers in sync by padding with silence, and merges them as mono (mixed) or stereo (user on the left channel, bot on the right). Optional callbacks: OnAudioData receives the merged audio, OnTrackAudioData receives the separate tracks, and, when EnableTurnAudio is true, OnUserTurnAudioData and OnBotTurnAudioData receive the audio for each turn.
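The sync-and-merge step described above can be illustrated with a small standalone sketch. It does not use package audio: mergeTracks, pcm, and decode are hypothetical helpers, the audio is assumed to be 16-bit little-endian PCM, and the mono mix is assumed to be a clamped sum (the package may mix differently).

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// pcm encodes samples as little-endian 16-bit PCM bytes (assumed format).
func pcm(samples ...int16) []byte {
	out := make([]byte, 2*len(samples))
	for i, s := range samples {
		binary.LittleEndian.PutUint16(out[2*i:], uint16(s))
	}
	return out
}

// decode is the inverse of pcm; a trailing odd byte is ignored.
func decode(buf []byte) []int16 {
	out := make([]int16, len(buf)/2)
	for i := range out {
		out[i] = int16(binary.LittleEndian.Uint16(buf[2*i:]))
	}
	return out
}

// mergeTracks is a hypothetical helper, not part of package audio.
// It pads the shorter track with silence, then mixes to mono when
// numChannels is 1 or interleaves user=left, bot=right when it is 2.
func mergeTracks(user, bot []byte, numChannels int) []byte {
	u, b := decode(user), decode(bot)
	n := len(u)
	if len(b) > n {
		n = len(b)
	}
	// Sync: extend the shorter track with zero samples (silence).
	u = append(u, make([]int16, n-len(u))...)
	b = append(b, make([]int16, n-len(b))...)

	if numChannels == 2 {
		out := make([]int16, 0, 2*n)
		for i := 0; i < n; i++ {
			out = append(out, u[i], b[i])
		}
		return pcm(out...)
	}

	out := make([]int16, n)
	for i := 0; i < n; i++ {
		// Assumption: mix by summing and clamping to the int16 range.
		sum := int32(u[i]) + int32(b[i])
		if sum > 32767 {
			sum = 32767
		} else if sum < -32768 {
			sum = -32768
		}
		out[i] = int16(sum)
	}
	return pcm(out...)
}

func main() {
	user := pcm(1000, 2000, 3000)
	bot := pcm(500) // shorter track, padded with silence before merging

	fmt.Println(decode(mergeTracks(user, bot, 1))) // prints [1500 2000 3000]
	fmt.Println(decode(mergeTracks(user, bot, 2))) // prints [1000 500 2000 0 3000 0]
}
```

Stereo output is twice the length of either padded track, which matters when sizing a consumer of the merged callback data.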

func NewAudioBufferProcessor ¶

func NewAudioBufferProcessor(name string, sampleRate, numChannels, bufferSize int, enableTurnAudio bool) *AudioBufferProcessor

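NewAudioBufferProcessor returns an AudioBufferProcessor configured with the given sample rate, channel count, buffer size in bytes, and turn-audio setting. The parameters correspond to the struct fields of the same name: a sampleRate of 0 takes the rate from the StartFrame, and a bufferSize of 0 disables buffered callbacks, leaving only turn callbacks.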

func (*AudioBufferProcessor) Cleanup ¶

func (p *AudioBufferProcessor) Cleanup(ctx context.Context) error

Cleanup ensures any buffered audio is flushed before the processor is torn down.

func (*AudioBufferProcessor) ProcessFrame ¶

func (p *AudioBufferProcessor) ProcessFrame(ctx context.Context, f frames.Frame, dir processors.Direction) error

ProcessFrame buffers and resamples user/bot audio, syncs buffers, and invokes callbacks.

func (*AudioBufferProcessor) StartRecording ¶

func (p *AudioBufferProcessor) StartRecording()

StartRecording enables buffering and resets buffers.

func (*AudioBufferProcessor) StopRecording ¶

func (p *AudioBufferProcessor) StopRecording(ctx context.Context)

StopRecording flushes remaining audio via callbacks and stops recording.
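The recording lifecycle (start resets the buffer, audio accumulates, stop flushes the remainder) can be sketched independently of the package. Everything below is hypothetical: the recorder type and its methods are illustrative names, and delivering the whole accumulated buffer once it reaches bufferSize is an assumption about the chunking behaviour, which the documentation does not specify.

```go
package main

import "fmt"

// recorder is a hypothetical stand-in for the buffering behaviour,
// not the package's implementation.
type recorder struct {
	bufferSize int // bytes; 0 disables buffered callbacks
	recording  bool
	buf        []byte
	onData     func(chunk []byte)
}

// start enables buffering and discards anything previously buffered.
func (r *recorder) start() {
	r.recording = true
	r.buf = r.buf[:0]
}

// write appends audio while recording and fires the callback once the
// buffer holds at least bufferSize bytes (assumed threshold semantics).
func (r *recorder) write(p []byte) {
	if !r.recording {
		return
	}
	r.buf = append(r.buf, p...)
	if r.bufferSize > 0 && len(r.buf) >= r.bufferSize {
		r.flush()
	}
}

// flush hands a copy of the buffered bytes to the callback and resets.
func (r *recorder) flush() {
	if len(r.buf) > 0 && r.onData != nil {
		r.onData(append([]byte(nil), r.buf...))
	}
	r.buf = r.buf[:0]
}

// stop flushes whatever is left and disables buffering.
func (r *recorder) stop() {
	r.flush()
	r.recording = false
}

func main() {
	r := &recorder{
		bufferSize: 4,
		onData:     func(chunk []byte) { fmt.Println("callback bytes:", len(chunk)) },
	}
	r.write([]byte{1, 2}) // ignored: not recording yet
	r.start()
	r.write([]byte{1, 2, 3})
	r.write([]byte{4, 5, 6}) // threshold reached: callback with 6 bytes
	r.write([]byte{7})
	r.stop() // remainder flushed: callback with 1 byte
}
```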

type AudioFilterProcessor ¶

type AudioFilterProcessor struct {
	*processors.BaseProcessor

	Chain *audiofilters.Chain
}

AudioFilterProcessor applies a configured chain of audio filters to raw PCM audio frames before they reach downstream processors (e.g. VAD, STT).

func NewAudioFilterProcessor ¶

func NewAudioFilterProcessor(name string, chain *audiofilters.Chain) *AudioFilterProcessor

NewAudioFilterProcessor constructs an AudioFilterProcessor with the given filter chain. When chain is nil, the processor is a no-op pass-through.

func (*AudioFilterProcessor) ProcessFrame ¶

func (p *AudioFilterProcessor) ProcessFrame(ctx context.Context, f frames.Frame, dir processors.Direction) error

ProcessFrame applies the filter chain to audio-carrying frames and forwards all other frames unchanged.
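As a rough illustration of a filter chain with nil pass-through, the sketch below applies a gain filter to 16-bit samples. The filter interface, gain type, chain type, and applyChain function are all hypothetical; they do not reproduce the audiofilters package, whose API is not shown on this page.

```go
package main

import "fmt"

// filter is a hypothetical interface for in-place sample processing.
type filter interface {
	apply(samples []int16)
}

// gain scales every sample by factor, clamping to the int16 range.
type gain struct {
	factor float64
}

func (g gain) apply(samples []int16) {
	for i, s := range samples {
		v := float64(s) * g.factor
		if v > 32767 {
			v = 32767
		} else if v < -32768 {
			v = -32768
		}
		samples[i] = int16(v)
	}
}

// chain runs its filters in order.
type chain struct {
	filters []filter
}

// applyChain returns the filtered samples. A nil chain is a no-op
// pass-through, mirroring the documented behaviour for a nil Chain.
func applyChain(c *chain, samples []int16) []int16 {
	if c == nil {
		return samples
	}
	out := append([]int16(nil), samples...) // leave the input untouched
	for _, f := range c.filters {
		f.apply(out)
	}
	return out
}

func main() {
	in := []int16{1000, -20000, 20000}

	fmt.Println(applyChain(nil, in)) // prints [1000 -20000 20000]

	c := &chain{filters: []filter{gain{factor: 2}}}
	fmt.Println(applyChain(c, in)) // prints [2000 -32768 32767]
}
```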

type AudioFilterProcessorOptions ¶

type AudioFilterProcessorOptions struct {
	Filters []struct {
		// Type selects the filter implementation, e.g. "gain".
		Type string `json:"type"`
		// Gain configures GainFilter when Type is "gain".
		Gain float64 `json:"gain,omitempty"`
	} `json:"filters,omitempty"`
}

AudioFilterProcessorOptions describes JSON options for the "audio_filter" processor when used via plugin_options.
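To show what a matching JSON payload might look like, the sketch below decodes one into a local copy of the documented struct shape. The filterOptions type and parseFilterOptions function are hypothetical stand-ins so the example runs without the package; the gain value 1.5 is illustrative only, and treating empty input as "no filters" is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// filterOptions mirrors the documented AudioFilterProcessorOptions fields.
// It is a local copy for illustration, not the package type.
type filterOptions struct {
	Filters []struct {
		Type string  `json:"type"`
		Gain float64 `json:"gain,omitempty"`
	} `json:"filters,omitempty"`
}

// parseFilterOptions decodes raw plugin options. Empty input yields
// zero-value options (an assumption about how missing options behave).
func parseFilterOptions(raw json.RawMessage) (filterOptions, error) {
	var opts filterOptions
	if len(raw) == 0 {
		return opts, nil
	}
	err := json.Unmarshal(raw, &opts)
	return opts, err
}

func main() {
	raw := json.RawMessage(`{"filters":[{"type":"gain","gain":1.5}]}`)
	opts, err := parseFilterOptions(raw)
	if err != nil {
		panic(err)
	}
	for _, f := range opts.Filters {
		fmt.Printf("type=%s gain=%.2f\n", f.Type, f.Gain) // prints type=gain gain=1.50
	}
}
```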

type VADProcessor ¶

type VADProcessor struct {
	*processors.BaseProcessor
	Analyzer             vad.Analyzer
	SpeechActivityPeriod time.Duration
	// contains filtered or unexported fields
}

VADProcessor processes audio frames through voice activity detection and pushes VAD-related frames downstream: VADUserStartedSpeakingFrame, VADUserStoppedSpeakingFrame, and optionally UserSpeakingFrame periodically while speech is detected.

func NewVADProcessor ¶

func NewVADProcessor(name string, analyzer vad.Analyzer, speechActivityPeriod time.Duration) *VADProcessor

NewVADProcessor returns a VADProcessor that uses the given analyzer. speechActivityPeriod is the minimum interval between UserSpeakingFrame pushes while speech is detected; zero defaults to 200ms.
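The minimum-interval rule can be sketched as a small throttle. This is not the package's code: activityThrottle, allow, and emittedOffsets are hypothetical names, and the only behaviour taken from the documentation is that a zero period falls back to 200ms.

```go
package main

import (
	"fmt"
	"time"
)

// defaultActivityPeriod matches the documented fallback for a zero period.
const defaultActivityPeriod = 200 * time.Millisecond

// activityThrottle is a hypothetical helper that limits how often an
// activity notification may be emitted.
type activityThrottle struct {
	period time.Duration
	last   time.Time
}

func newActivityThrottle(period time.Duration) *activityThrottle {
	if period == 0 {
		period = defaultActivityPeriod
	}
	return &activityThrottle{period: period}
}

// allow reports whether a notification may be emitted at now, and
// records the emission time when it may.
func (t *activityThrottle) allow(now time.Time) bool {
	if !t.last.IsZero() && now.Sub(t.last) < t.period {
		return false
	}
	t.last = now
	return true
}

// emittedOffsets replays speech detected at the given millisecond
// offsets and returns the offsets at which a notification is emitted.
func emittedOffsets(period time.Duration, offsetsMs []int) []int {
	th := newActivityThrottle(period)
	base := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	var out []int
	for _, ms := range offsetsMs {
		if th.allow(base.Add(time.Duration(ms) * time.Millisecond)) {
			out = append(out, ms)
		}
	}
	return out
}

func main() {
	// Zero period: the 200ms default applies.
	fmt.Println(emittedOffsets(0, []int{0, 100, 200, 250, 400})) // prints [0 200 400]
}
```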

func (*VADProcessor) ProcessFrame ¶

func (p *VADProcessor) ProcessFrame(ctx context.Context, f frames.Frame, dir processors.Direction) error

ProcessFrame forwards the frame downstream first, then runs VAD on audio frames and pushes VAD start/stop/activity frames on state transitions.
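The ordering described here (forward first, then emit on a state transition) can be modelled with a tiny state machine. The detector type and step method are hypothetical, events are represented as plain strings rather than real frame values, and the per-frame speech decision is passed in directly instead of coming from a vad.Analyzer.

```go
package main

import "fmt"

// detector is a hypothetical model of the start/stop transition logic.
type detector struct {
	speaking bool
}

// step returns the events produced for one audio frame, in order:
// the forwarded frame first, then any transition event.
func (d *detector) step(frameID int, isSpeech bool) []string {
	events := []string{fmt.Sprintf("forward frame %d", frameID)}
	switch {
	case isSpeech && !d.speaking:
		d.speaking = true
		events = append(events, "VADUserStartedSpeakingFrame")
	case !isSpeech && d.speaking:
		d.speaking = false
		events = append(events, "VADUserStoppedSpeakingFrame")
	}
	return events
}

func main() {
	d := &detector{}
	decisions := []bool{false, true, true, false}
	for i, speech := range decisions {
		for _, e := range d.step(i, speech) {
			fmt.Println(e)
		}
	}
}
```

Only the second and fourth frames produce a transition event; frames that do not change the speaking state are forwarded with nothing extra.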
