package dictation

Version: v0.22.4 (latest) | Published: Apr 21, 2026 | License: Apache-2.0 | Imports: 5 | Imported by: 0

type Config struct {
	PauseThreshold time.Duration
	MinSegment     time.Duration
	Padding        time.Duration
	Overlap        time.Duration
}

Config controls pause-based dictation segmentation.

type Processor struct {
	// contains filtered or unexported fields
}

Processor segments PCM audio into speech chunks using a VAD.

func NewProcessor(detector vad.Detector, cfg Config) *Processor

NewProcessor creates a dictation processor with sane defaults.

func (p *Processor) FeedPCM(pcm []byte) ([]Segment, error)

FeedPCM ingests raw signed 16-bit (S16) mono PCM and returns any segments flushed while processing it.
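For callers that hold audio as int16 samples, the following is a minimal sketch of packing them into the raw byte form FeedPCM accepts. The helper name encodeS16LE is hypothetical, and little-endian byte order is an assumption; the package documentation states only "S16 mono".

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeS16LE packs mono samples into raw bytes, two bytes per sample.
// Little-endian byte order is an assumption, not something the package
// documentation states.
func encodeS16LE(samples []int16) []byte {
	pcm := make([]byte, 2*len(samples))
	for i, s := range samples {
		binary.LittleEndian.PutUint16(pcm[2*i:], uint16(s))
	}
	return pcm
}

func main() {
	pcm := encodeS16LE([]int16{0, 1, -1})
	fmt.Println(pcm) // [0 0 1 0 255 255]
}
```

The resulting slice could then be passed to FeedPCM in chunks of any convenient size.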

func (p *Processor) Flush() ([]Segment, error)

Flush returns the trailing buffered segment, if any, and resets session state.
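A typical session would feed chunks as they arrive and call Flush once at the end to collect the trailing segment. The sketch below illustrates that loop under stated assumptions: Segment is mirrored locally so the example compiles on its own, segmenter is a hypothetical interface listing only the two documented methods, and stubSegmenter is a stand-in that buffers everything rather than splitting on pauses as the real Processor does.

```go
package main

import (
	"fmt"
	"time"
)

// Segment mirrors the documented struct so this sketch is self-contained.
type Segment struct {
	PCM       []byte
	Duration  time.Duration
	Paragraph bool
	Final     bool
}

// segmenter (hypothetical) lists the two documented Processor methods used here.
type segmenter interface {
	FeedPCM(pcm []byte) ([]Segment, error)
	Flush() ([]Segment, error)
}

// stubSegmenter is a stand-in for illustration only: it buffers all input
// and emits a single segment on Flush.
type stubSegmenter struct{ buf []byte }

func (s *stubSegmenter) FeedPCM(pcm []byte) ([]Segment, error) {
	s.buf = append(s.buf, pcm...)
	return nil, nil
}

func (s *stubSegmenter) Flush() ([]Segment, error) {
	if len(s.buf) == 0 {
		return nil, nil
	}
	seg := Segment{PCM: s.buf}
	s.buf = nil
	return []Segment{seg}, nil
}

// drain feeds every chunk, then flushes, and returns all segments in order.
func drain(p segmenter, chunks [][]byte) ([]Segment, error) {
	var out []Segment
	for _, c := range chunks {
		segs, err := p.FeedPCM(c)
		if err != nil {
			return out, err
		}
		out = append(out, segs...)
	}
	tail, err := p.Flush()
	if err != nil {
		return out, err
	}
	return append(out, tail...), nil
}

func main() {
	segs, err := drain(&stubSegmenter{}, [][]byte{{0, 0}, {1, 0}})
	if err != nil {
		panic(err)
	}
	fmt.Println(len(segs), len(segs[0].PCM)) // 1 4
}
```

Because *Processor has the same two method signatures, it should satisfy an interface like segmenter, which also makes the calling code easy to test with a stub.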

func (p *Processor) Reset()

Reset clears the current dictation session and VAD state.

type Segment struct {
	PCM       []byte
	Duration  time.Duration
	Paragraph bool
	Final     bool
}

Segment is a transcribable utterance extracted from a dictation session.
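For S16 mono audio, the length of a PCM buffer and its playback time are related by the sample rate. The sketch below shows that arithmetic; the helper pcmDuration is hypothetical, the 16 kHz rate is an assumption (the package documentation does not state a sample rate), and whether Segment.Duration is computed exactly this way is not confirmed by the source.

```go
package main

import (
	"fmt"
	"time"
)

// bytesPerSample is 2 because each S16 sample occupies 16 bits.
const bytesPerSample = 2

// pcmDuration returns the playback time of mono S16 PCM at sampleRate Hz.
// A trailing odd byte, if any, is ignored.
func pcmDuration(pcm []byte, sampleRate int) time.Duration {
	samples := len(pcm) / bytesPerSample
	return time.Duration(samples) * time.Second / time.Duration(sampleRate)
}

func main() {
	// 32000 bytes = 16000 samples; the 16 kHz rate is assumed for illustration.
	pcm := make([]byte, 32000)
	fmt.Println(pcmDuration(pcm, 16000)) // 1s
}
```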