Documentation
Constants
const (
	SampleRate     = 16000
	FrameSize      = 512 // 32ms at 16kHz
	BytesPerSample = 2
	FrameBytes     = FrameSize * BytesPerSample
)
Variables
This section is empty.
Functions
This section is empty.
Types
type SileroVAD
type SileroVAD struct {
// contains filtered or unexported fields
}
SileroVAD runs voice activity detection (VAD) via ONNX Runtime. Inference takes under 1 ms per frame, the model is about 2 MB, and the package needs no CGo beyond the onnxruntime DLL.
func NewSileroVAD
NewSileroVAD loads the Silero VAD ONNX model and prepares the inference tensors. The onnxruntime shared library must already be on PATH or beside the executable.
func (*SileroVAD) ProcessFrame
ProcessFrame returns the speech probability (0.0 to 1.0) for a single audio frame. pcm must contain exactly FrameSize samples of signed 16-bit (S16) PCM.
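The exact signature of ProcessFrame is not shown on this page, so the following is only a sketch of how a caller might prepare a frame and threshold the result. It assumes a method of the shape `ProcessFrame(pcm []int16) (float32, error)`, expressed through a hypothetical `frameScorer` interface, and assumes the raw audio is little-endian. `decodeS16LE`, `isSpeech`, and the toy `energyScorer` are illustrative names; `energyScorer` is a stand-in so the example runs without the model and is not the Silero network.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Local copies of the package constants so the sketch is self-contained.
const (
	FrameSize      = 512
	BytesPerSample = 2
	FrameBytes     = FrameSize * BytesPerSample
)

// frameScorer is a hypothetical stand-in for *SileroVAD.
// Assumption: the real ProcessFrame signature may differ.
type frameScorer interface {
	ProcessFrame(pcm []int16) (float32, error)
}

// decodeS16LE converts exactly FrameBytes of little-endian S16 PCM
// into FrameSize samples, rejecting short or long input.
func decodeS16LE(raw []byte) ([]int16, error) {
	if len(raw) != FrameBytes {
		return nil, errors.New("frame must be exactly FrameBytes long")
	}
	pcm := make([]int16, FrameSize)
	for i := range pcm {
		pcm[i] = int16(binary.LittleEndian.Uint16(raw[i*BytesPerSample:]))
	}
	return pcm, nil
}

// isSpeech scores one frame and compares the probability to threshold.
func isSpeech(v frameScorer, raw []byte, threshold float32) (bool, error) {
	pcm, err := decodeS16LE(raw)
	if err != nil {
		return false, err
	}
	p, err := v.ProcessFrame(pcm)
	if err != nil {
		return false, err
	}
	return p >= threshold, nil
}

// energyScorer is a toy scorer (mean absolute amplitude scaled to 0..1).
// It only exists to make the sketch runnable without the ONNX model.
type energyScorer struct{}

func (energyScorer) ProcessFrame(pcm []int16) (float32, error) {
	var sum float64
	for _, s := range pcm {
		if s < 0 {
			sum -= float64(s)
		} else {
			sum += float64(s)
		}
	}
	return float32(sum / float64(len(pcm)) / 32768), nil
}

func main() {
	silent := make([]byte, FrameBytes)
	loud := make([]byte, FrameBytes)
	for i := 1; i < FrameBytes; i += 2 {
		loud[i] = 0x40 // high byte: every sample becomes 0x4000
	}
	a, _ := isSpeech(energyScorer{}, silent, 0.25)
	b, _ := isSpeech(energyScorer{}, loud, 0.25)
	fmt.Println(a, b)
}
```

The threshold of 0.25 is arbitrary here; a suitable value for the real model depends on the application and is not stated in this documentation.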