Documentation ¶
Overview ¶
Package dictation implements pause-based segmentation for Dictation Mode: it consumes speech-probability frames from a voice activity detector (VAD) and emits one transcription request per natural pause.
The package is platform-neutral and reusable: Device-Target and Server-Target both call into it. Audio capture and the speech-to-text (STT) call itself live in sibling packages (audio, stt); this package owns only the decision of where an utterance ends.
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct {
	PauseThreshold time.Duration
	MinSegment     time.Duration
	Padding        time.Duration
	Overlap        time.Duration
}
Config controls pause-based dictation segmentation.
type Processor ¶
type Processor struct {
// contains filtered or unexported fields
}
Processor segments PCM audio into speech chunks using a VAD.
func NewProcessor ¶
NewProcessor creates a dictation processor with sane defaults.
func (*Processor) FeedPCM ¶
FeedPCM ingests raw signed 16-bit (S16) mono PCM and returns any segments flushed while processing.