Documentation ¶
Overview ¶
Package audio normalizes inbound audio payloads to the Framework kernel's canonical PCM format (16 kHz, signed 16-bit little-endian, mono) before they enter the STT router. Every client format that the server accepts is converted here; the kernel itself never sees raw MP3, Opus, or stereo input.
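As an illustration of one normalization step, the sketch below downmixes interleaved stereo S16LE samples to mono by averaging the two channels. The helper name downmixStereoS16LE is hypothetical, and the package may well delegate this work to ffmpeg or use a different mixing rule; this is only a minimal sketch of the idea.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// downmixStereoS16LE averages interleaved left/right 16-bit little-endian
// samples into one mono channel. Hypothetical helper for illustration; it is
// not part of the package's documented API.
func downmixStereoS16LE(stereo []byte) []byte {
	const frameSize = 4               // 2 channels x 2 bytes per sample
	frames := len(stereo) / frameSize // a trailing partial frame is dropped
	mono := make([]byte, frames*2)
	for i := 0; i < frames; i++ {
		l := int16(binary.LittleEndian.Uint16(stereo[i*frameSize:]))
		r := int16(binary.LittleEndian.Uint16(stereo[i*frameSize+2:]))
		// Sum in int32 so the addition cannot overflow int16.
		m := int16((int32(l) + int32(r)) / 2)
		binary.LittleEndian.PutUint16(mono[i*2:], uint16(m))
	}
	return mono
}

func main() {
	// One stereo frame: left = 1000 (E8 03), right = 3000 (B8 0B).
	stereo := []byte{0xE8, 0x03, 0xB8, 0x0B}
	mono := downmixStereoS16LE(stereo)
	fmt.Println(int16(binary.LittleEndian.Uint16(mono))) // 2000
}
```

Averaging in a wider integer type keeps loud, in-phase stereo input from wrapping around when the two channels are summed.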
Constants ¶
const TargetBytesPerSample = 2
TargetBytesPerSample is the width in bytes of one canonical sample (16-bit signed little-endian).
const TargetChannels = 1
TargetChannels is the canonical channel count (mono) used by every STT provider.
const TargetSampleRate = 16000
TargetSampleRate is the canonical rate every STT provider expects.
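Together the three constants fix the byte rate of canonical PCM at 16000 × 1 × 2 = 32000 bytes per second. The sketch below uses that to derive a duration in milliseconds from a buffer length. The lowercase constants mirror the exported ones so the example stands alone, and pcmDurationMs is a hypothetical helper; whether the package computes DurationMs this way is an assumption.

```go
package main

import "fmt"

// Local copies of the package constants so the example is self-contained.
const (
	targetSampleRate     = 16000
	targetChannels       = 1
	targetBytesPerSample = 2
)

// pcmDurationMs returns the playback duration, in milliseconds, of pcmLen
// bytes of canonical PCM. Hypothetical helper for illustration.
func pcmDurationMs(pcmLen int) int64 {
	bytesPerSecond := int64(targetSampleRate * targetChannels * targetBytesPerSample)
	return int64(pcmLen) * 1000 / bytesPerSecond
}

func main() {
	fmt.Println(pcmDurationMs(32000)) // 1000: one second of canonical PCM
}
```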
Variables ¶
ErrFFmpegUnavailable is returned when a payload requires ffmpeg but the binary is not on PATH. Container deployments include ffmpeg; bare-metal dev environments may need to install it.
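A check of this kind can be sketched with exec.LookPath from the standard library, which searches PATH for an executable. The helper requireBinary and the sentinel errToolUnavailable (including its message) are hypothetical stand-ins; the package's actual detection logic is not shown in this documentation.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// errToolUnavailable is a hypothetical stand-in for ErrFFmpegUnavailable.
var errToolUnavailable = errors.New("required binary not found on PATH")

// requireBinary reports errToolUnavailable when name cannot be resolved to an
// executable on PATH. Hypothetical helper for illustration.
func requireBinary(name string) error {
	if _, err := exec.LookPath(name); err != nil {
		return errToolUnavailable
	}
	return nil
}

func main() {
	if err := requireBinary("ffmpeg"); err != nil {
		fmt.Println("ffmpeg missing:", err)
		return
	}
	fmt.Println("ffmpeg found")
}
```

Callers would typically compare against the sentinel with errors.Is and surface an installation hint rather than a generic decode failure.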
var ErrUnsupportedFormat = errors.New("audio: unsupported format (supported: wav, mp3, raw pcm16 @ 16kHz mono, webm/opus, ogg/opus)")
ErrUnsupportedFormat is returned when neither the Content-Type nor the payload's magic bytes identify a format we can decode.
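Magic-byte identification for the container formats named in the error message can be sketched as follows. The signatures are the well-known leading bytes of RIFF/WAVE, Ogg, EBML (WebM), and MP3 (ID3v2 tag or MPEG frame sync); sniffFormat and its return values are hypothetical, and the package's real detection rules may differ. Raw PCM has no signature, so it cannot be recognized this way.

```go
package main

import (
	"bytes"
	"fmt"
)

// sniffFormat guesses a container format from the leading bytes of a payload.
// It returns "" when nothing matches. Hypothetical helper for illustration.
func sniffFormat(b []byte) string {
	switch {
	case len(b) >= 12 && bytes.Equal(b[0:4], []byte("RIFF")) && bytes.Equal(b[8:12], []byte("WAVE")):
		return "wav"
	case bytes.HasPrefix(b, []byte("OggS")):
		return "ogg"
	case bytes.HasPrefix(b, []byte{0x1A, 0x45, 0xDF, 0xA3}):
		return "webm" // EBML header, shared with Matroska
	case bytes.HasPrefix(b, []byte("ID3")):
		return "mp3" // ID3v2 tag preceding the audio frames
	case len(b) >= 2 && b[0] == 0xFF && b[1]&0xE0 == 0xE0:
		return "mp3" // 11-bit MPEG audio frame sync
	}
	return ""
}

func main() {
	fmt.Println(sniffFormat([]byte("OggS\x00\x02"))) // ogg
}
```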
Functions ¶
This section is empty.
Types ¶
type DecodedAudio ¶
type DecodedAudio struct {
	PCM          []byte
	SourceFormat string
	SourceRate   int
	SourceCh     int
	DurationMs   int64
}
DecodedAudio is the canonical 16 kHz, mono, S16LE PCM payload plus the metadata recovered from the source.
func Decode ¶
func Decode(raw []byte, contentType string) (*DecodedAudio, error)
Decode consumes raw bytes and a client-provided Content-Type and returns a canonical DecodedAudio ready for the STT router. contentType may be empty; in that case Decode attempts magic-byte sniffing instead.
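Client Content-Type values often carry parameters (for example "audio/webm; codecs=opus") or mixed case, so a decoder usually normalizes them before choosing a code path. The sketch below does this with mime.ParseMediaType from the standard library. The helper formatFromContentType and the specific MIME-type-to-format mapping are assumptions, not the package's documented behavior.

```go
package main

import (
	"fmt"
	"mime"
)

// formatFromContentType maps a Content-Type header value to a short format
// name, or "" when the header is empty, malformed, or unrecognized.
// Hypothetical helper; the MIME types listed are an assumed mapping.
func formatFromContentType(contentType string) string {
	if contentType == "" {
		return ""
	}
	// ParseMediaType lowercases the media type and splits off parameters.
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return ""
	}
	switch mediaType {
	case "audio/wav", "audio/x-wav", "audio/wave":
		return "wav"
	case "audio/mpeg", "audio/mp3":
		return "mp3"
	case "audio/webm":
		return "webm"
	case "audio/ogg":
		return "ogg"
	}
	return ""
}

func main() {
	fmt.Println(formatFromContentType("audio/webm; codecs=opus")) // webm
}
```

An empty result here is the point at which a caller would fall through to inspecting the payload bytes.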
func DecodeReader ¶
DecodeReader is a convenience wrapper that buffers a reader (capped at maxBytes) and then invokes Decode.
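A capped buffering step can be sketched with io.LimitReader and io.ReadAll. The documentation does not say whether an oversized payload is truncated or rejected; this sketch assumes rejection. The names readCapped, readCappedFromString, and errTooLarge are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// errTooLarge is a hypothetical sentinel for payloads above the cap.
var errTooLarge = errors.New("payload exceeds maxBytes")

// readCapped buffers r into memory and fails if it holds more than maxBytes.
// Hypothetical helper; rejecting (rather than truncating) is an assumption.
func readCapped(r io.Reader, maxBytes int64) ([]byte, error) {
	// Read one byte past the cap so an oversized payload can be told apart
	// from one that fits exactly.
	buf, err := io.ReadAll(io.LimitReader(r, maxBytes+1))
	if err != nil {
		return nil, err
	}
	if int64(len(buf)) > maxBytes {
		return nil, errTooLarge
	}
	return buf, nil
}

// readCappedFromString is a small convenience for demonstrating readCapped.
func readCappedFromString(s string, maxBytes int64) ([]byte, error) {
	return readCapped(strings.NewReader(s), maxBytes)
}

func main() {
	data, err := readCappedFromString("abcdef", 4)
	fmt.Println(len(data), err)
}
```

Limiting the read protects the server from unbounded memory use when a client streams more audio than expected.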