package vad

Version: v0.30.0 | Published: May 7, 2026 | License: Apache-2.0 | Imports: 3 | Imported by: 0

Repository

github.com/kombifyio/SpeechKit

Documentation ¶

Index ¶

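Constants
type Detector
type SileroVAD
- func NewSileroVAD(modelPath string) (*SileroVAD, error)
- func (v *SileroVAD) Close()
- func (v *SileroVAD) ProcessFrame(pcm []int16) (float32, error)
- func (v *SileroVAD) Reset()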

Constants ¶

const (
	SampleRate     = 16000
	FrameSize      = 512 // 32ms at 16kHz
	BytesPerSample = 2
	FrameBytes     = FrameSize * BytesPerSample
)
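
The constants above fix the frame geometry: 512 samples of 2 bytes each give 1024 bytes per frame, and 512 samples at 16 kHz last 32 ms. A minimal sketch of cutting a raw byte stream into frames follows. The constants are redeclared locally so the example compiles on its own, `splitFrames` is a hypothetical helper that is not part of the package, and little-endian byte order is an assumption about the capture source.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Local copies of the package constants so this sketch is self-contained.
const (
	SampleRate     = 16000
	FrameSize      = 512 // 32ms at 16kHz
	BytesPerSample = 2
	FrameBytes     = FrameSize * BytesPerSample
)

// splitFrames is a hypothetical helper (not part of the package). It decodes
// raw S16 PCM bytes, assumed little-endian, into complete frames of FrameSize
// samples and returns the undecoded tail for the next call.
func splitFrames(raw []byte) (frames [][]int16, rest []byte) {
	for len(raw) >= FrameBytes {
		frame := make([]int16, FrameSize)
		for i := range frame {
			frame[i] = int16(binary.LittleEndian.Uint16(raw[i*BytesPerSample:]))
		}
		frames = append(frames, frame)
		raw = raw[FrameBytes:]
	}
	return frames, raw
}

func main() {
	raw := make([]byte, 2*FrameBytes+10) // two full frames plus a 10-byte tail
	frames, rest := splitFrames(raw)
	fmt.Println(FrameBytes, len(frames), len(rest)) // 1024 2 10
	fmt.Println(FrameSize*1000/SampleRate, "ms per frame") // 32 ms per frame
}
```

Returning the leftover bytes lets a caller prepend them to the next chunk, so frames never straddle a read boundary.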

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Detector ¶

type Detector interface {
	ProcessFrame([]int16) (float32, error)
	Reset()
}

Detector is the speech-probability contract consumed by dictation processors.
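
Because Detector is a two-method interface, a consumer can be exercised without loading a model. The sketch below redeclares the interface locally and pairs it with `energyStub`, a hypothetical stand-in that is not part of the package. `isSpeech` and the 0.5 threshold are likewise illustrative assumptions, not documented behavior.

```go
package main

import "fmt"

// Detector mirrors the interface documented above.
type Detector interface {
	ProcessFrame([]int16) (float32, error)
	Reset()
}

// energyStub is a hypothetical stand-in for a real detector. It reports 1.0
// when any sample exceeds a fixed amplitude and 0.0 otherwise.
type energyStub struct{}

func (energyStub) ProcessFrame(pcm []int16) (float32, error) {
	for _, s := range pcm {
		if s > 1000 || s < -1000 {
			return 1, nil
		}
	}
	return 0, nil
}

func (energyStub) Reset() {}

// isSpeech is a hypothetical consumer. It turns the probability returned by
// any Detector into a yes/no decision using a caller-chosen threshold.
func isSpeech(d Detector, frame []int16, threshold float32) (bool, error) {
	p, err := d.ProcessFrame(frame)
	if err != nil {
		return false, err
	}
	return p >= threshold, nil
}

func main() {
	loud := make([]int16, 512)
	loud[10] = 5000
	quiet := make([]int16, 512)

	a, _ := isSpeech(energyStub{}, loud, 0.5)
	b, _ := isSpeech(energyStub{}, quiet, 0.5)
	fmt.Println(a, b) // true false
}
```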

type SileroVAD ¶

type SileroVAD struct {
	// contains filtered or unexported fields
}

SileroVAD runs voice activity detection via ONNX Runtime. Inference takes under 1 ms per frame, the model is about 2 MB, and it needs no CGo beyond the onnxruntime DLL.

func NewSileroVAD ¶

func NewSileroVAD(modelPath string) (*SileroVAD, error)

NewSileroVAD loads the Silero VAD ONNX model and prepares inference tensors. The onnxruntime shared library must already be in PATH or beside the executable.

func (*SileroVAD) Close ¶

func (v *SileroVAD) Close()

func (*SileroVAD) ProcessFrame ¶

func (v *SileroVAD) ProcessFrame(pcm []int16) (float32, error)

ProcessFrame returns the speech probability (0.0-1.0) for a single audio frame. pcm must contain exactly FrameSize samples of signed 16-bit (S16) PCM.
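
Since the frame length must be exact, the final chunk of a recording usually needs handling before it is passed in. One possible approach, sketched below, is to zero-pad a short tail. `padFrame` and `errFrameTooLong` are hypothetical names, and zero padding is an assumption made for illustration rather than something the package prescribes.

```go
package main

import (
	"errors"
	"fmt"
)

// FrameSize is a local copy of the package constant.
const FrameSize = 512

// errFrameTooLong is a hypothetical error used only in this sketch.
var errFrameTooLong = errors.New("frame longer than FrameSize")

// padFrame is a hypothetical helper (not part of the package). It returns a
// slice of exactly FrameSize samples, zero-padding a short tail. Input that
// is already FrameSize long is returned unchanged.
func padFrame(pcm []int16) ([]int16, error) {
	if len(pcm) > FrameSize {
		return nil, errFrameTooLong
	}
	if len(pcm) == FrameSize {
		return pcm, nil
	}
	out := make([]int16, FrameSize)
	copy(out, pcm)
	return out, nil
}

func main() {
	frame, err := padFrame([]int16{7, 8, 9})
	fmt.Println(len(frame), frame[:4], err) // 512 [7 8 9 0] <nil>
}
```

Dropping the short tail instead of padding it is an equally reasonable choice when the last few milliseconds do not matter.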

func (*SileroVAD) Reset ¶

func (v *SileroVAD) Reset()

Reset clears the hidden state for a new recording session.
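
To illustrate why resetting between recordings matters, the sketch below uses `smoothingStub`, a hypothetical stateful detector that is not part of the package and does not model Silero's actual hidden state. It carries a running average from frame to frame, so without Reset the first frame of a new recording would inherit the previous recording's score. `runSession` is likewise a hypothetical helper.

```go
package main

import "fmt"

// Detector mirrors the interface documented above.
type Detector interface {
	ProcessFrame([]int16) (float32, error)
	Reset()
}

// smoothingStub is a hypothetical stateful detector. Its output is a running
// average of a per-frame flag (first sample non-zero), so state carries over
// between calls until Reset clears it.
type smoothingStub struct{ state float32 }

func (s *smoothingStub) ProcessFrame(pcm []int16) (float32, error) {
	var x float32
	if len(pcm) > 0 && pcm[0] != 0 {
		x = 1
	}
	s.state = 0.5*s.state + 0.5*x
	return s.state, nil
}

func (s *smoothingStub) Reset() { s.state = 0 }

// runSession is a hypothetical helper. It resets the detector, then scores
// every frame of one recording in order.
func runSession(d Detector, frames [][]int16) ([]float32, error) {
	d.Reset() // start each recording from a clean state
	probs := make([]float32, 0, len(frames))
	for _, f := range frames {
		p, err := d.ProcessFrame(f)
		if err != nil {
			return nil, err
		}
		probs = append(probs, p)
	}
	return probs, nil
}

func main() {
	d := &smoothingStub{}
	loud := make([]int16, 512)
	loud[0] = 1200
	quiet := make([]int16, 512)

	first, _ := runSession(d, [][]int16{loud, loud})
	second, _ := runSession(d, [][]int16{quiet})
	fmt.Println(first, second) // [0.5 0.75] [0]
}
```

Without the Reset call at the top of `runSession`, the quiet frame in the second recording would score 0.375 in this stub, carried over from the first.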

Source Files ¶

silero.go
