vad

package
v0.30.0
Warning

This package is not in the latest version of its module.

Published: May 7, 2026 License: Apache-2.0 Imports: 3 Imported by: 0

Documentation

Constants

const (
	SampleRate     = 16000
	FrameSize      = 512 // samples per frame; 32 ms at 16 kHz
	BytesPerSample = 2
	FrameBytes     = FrameSize * BytesPerSample
)
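The constants describe one frame as 512 samples of 2 bytes each, so FrameBytes is 1024. A minimal sketch of how a raw byte chunk might be decoded into one frame follows. The helper bytesToFrame is hypothetical and not part of this package, and little-endian byte order is an assumption about the capture source.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Constants copied from the package so the sketch compiles on its own.
const (
	SampleRate     = 16000
	FrameSize      = 512
	BytesPerSample = 2
	FrameBytes     = FrameSize * BytesPerSample
)

// bytesToFrame is a hypothetical helper (not part of the package). It decodes
// exactly FrameBytes bytes of S16 PCM, assumed little-endian, into FrameSize
// samples.
func bytesToFrame(raw []byte) ([]int16, error) {
	if len(raw) != FrameBytes {
		return nil, fmt.Errorf("got %d bytes, want %d", len(raw), FrameBytes)
	}
	frame := make([]int16, FrameSize)
	for i := range frame {
		frame[i] = int16(binary.LittleEndian.Uint16(raw[i*BytesPerSample:]))
	}
	return frame, nil
}

func main() {
	raw := make([]byte, FrameBytes)
	raw[0], raw[1] = 0xFF, 0xFF // first sample is -1 in little-endian S16
	frame, err := bytesToFrame(raw)
	if err != nil {
		panic(err)
	}
	frameMs := FrameSize * 1000 / SampleRate // frame duration in milliseconds
	fmt.Println(len(frame), frame[0], frameMs) // prints "512 -1 32"
}
```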

Variables

This section is empty.

Functions

This section is empty.

Types

type Detector

type Detector interface {
	ProcessFrame([]int16) (float32, error)
	Reset()
}

Detector is the interface that dictation processors use to obtain a speech probability for each audio frame.
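Because Detector is an interface, a consumer can be exercised without loading a model. The sketch below is illustrative only: energyDetector and isSpeech are hypothetical names, the amplitude rule is a crude stand-in for a real detector, and the 0.5 threshold is an assumed value rather than one the package prescribes.

```go
package main

import "fmt"

// Detector mirrors the interface documented above.
type Detector interface {
	ProcessFrame([]int16) (float32, error)
	Reset()
}

// energyDetector is a hypothetical stand-in, not part of the package. It
// reports 1.0 when the mean absolute amplitude of a frame exceeds level.
type energyDetector struct{ level int }

func (d *energyDetector) ProcessFrame(pcm []int16) (float32, error) {
	if len(pcm) == 0 {
		return 0, fmt.Errorf("empty frame")
	}
	sum := 0
	for _, s := range pcm {
		if s < 0 {
			sum -= int(s)
		} else {
			sum += int(s)
		}
	}
	if sum/len(pcm) > d.level {
		return 1.0, nil
	}
	return 0.0, nil
}

// Reset is a no-op because energyDetector keeps no state between frames.
func (d *energyDetector) Reset() {}

// isSpeech is a hypothetical consumer. It depends only on the Detector
// interface and compares the returned probability to a threshold.
func isSpeech(d Detector, frame []int16, threshold float32) (bool, error) {
	p, err := d.ProcessFrame(frame)
	if err != nil {
		return false, err
	}
	return p >= threshold, nil
}

func main() {
	var d Detector = &energyDetector{level: 500}
	loud := make([]int16, 512)
	for i := range loud {
		loud[i] = 2000
	}
	quiet := make([]int16, 512)
	a, _ := isSpeech(d, loud, 0.5)
	b, _ := isSpeech(d, quiet, 0.5)
	fmt.Println(a, b) // prints "true false"
}
```

Since *SileroVAD has the same ProcessFrame and Reset methods, it should be usable wherever the stub is used here.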

type SileroVAD

type SileroVAD struct {
	// contains filtered or unexported fields
}

SileroVAD runs the Silero voice activity detection model through ONNX Runtime. Inference takes under 1 ms per frame, the model is about 2 MB, and no CGo is needed beyond the onnxruntime DLL.

func NewSileroVAD

func NewSileroVAD(modelPath string) (*SileroVAD, error)

NewSileroVAD loads the Silero VAD ONNX model from modelPath and prepares the inference tensors. The onnxruntime shared library must already be on PATH or in the same directory as the executable.

func (*SileroVAD) Close

func (v *SileroVAD) Close()

func (*SileroVAD) ProcessFrame

func (v *SileroVAD) ProcessFrame(pcm []int16) (float32, error)

ProcessFrame returns the speech probability, in the range 0.0 to 1.0, for a single audio frame. pcm must contain exactly FrameSize samples of signed 16-bit (S16) PCM.
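Since ProcessFrame accepts only frames of exactly FrameSize samples, a caller holding a longer buffer has to cut it up first. The following is a minimal sketch under stated assumptions: splitFrames is a hypothetical helper, and zero-padding the final short frame is one possible policy (dropping the tail is another) that the package does not prescribe.

```go
package main

import "fmt"

// FrameSize is copied from the package so the sketch compiles on its own.
const FrameSize = 512

// splitFrames is a hypothetical helper (not part of the package). It cuts
// samples into frames of exactly FrameSize samples. Full frames share memory
// with the input slice; a short final frame is copied and zero-padded.
func splitFrames(samples []int16) [][]int16 {
	var frames [][]int16
	for start := 0; start < len(samples); start += FrameSize {
		end := start + FrameSize
		if end > len(samples) {
			last := make([]int16, FrameSize)
			copy(last, samples[start:])
			frames = append(frames, last)
			break
		}
		frames = append(frames, samples[start:end])
	}
	return frames
}

func main() {
	samples := make([]int16, 1300) // two full frames plus 276 leftover samples
	frames := splitFrames(samples)
	fmt.Println(len(frames), len(frames[2])) // prints "3 512"
}
```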

func (*SileroVAD) Reset

func (v *SileroVAD) Reset()

Reset clears the hidden state for a new recording session.
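To illustrate why a reset between recordings matters, the sketch below uses a deliberately stateful stub. latchDetector and countSpeech are hypothetical names, and the latching behaviour is an exaggerated stand-in for carried-over state, not a description of how SileroVAD behaves internally.

```go
package main

import "fmt"

// Detector mirrors the interface documented above.
type Detector interface {
	ProcessFrame([]int16) (float32, error)
	Reset()
}

// latchDetector is a hypothetical stub. Once it sees a loud sample it keeps
// reporting 1.0 until Reset is called, which exaggerates the effect of state
// leaking from one recording into the next.
type latchDetector struct{ latched bool }

func (d *latchDetector) ProcessFrame(pcm []int16) (float32, error) {
	for _, s := range pcm {
		if s > 1000 || s < -1000 {
			d.latched = true
			break
		}
	}
	if d.latched {
		return 1, nil
	}
	return 0, nil
}

func (d *latchDetector) Reset() { d.latched = false }

// countSpeech is a hypothetical session runner. It resets the detector, then
// counts the frames whose probability is at or above threshold.
func countSpeech(d Detector, frames [][]int16, threshold float32) (int, error) {
	d.Reset()
	n := 0
	for _, f := range frames {
		p, err := d.ProcessFrame(f)
		if err != nil {
			return n, err
		}
		if p >= threshold {
			n++
		}
	}
	return n, nil
}

func main() {
	loud := make([]int16, 512)
	loud[0] = 5000
	quiet := make([]int16, 512)
	d := &latchDetector{}
	first, _ := countSpeech(d, [][]int16{quiet, loud, quiet}, 0.5)
	second, _ := countSpeech(d, [][]int16{quiet, quiet}, 0.5)
	fmt.Println(first, second) // prints "2 0"
}
```

Without the Reset call at the top of countSpeech, the second session in this stub would report 2 instead of 0.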
