Documentation ¶
Overview ¶
Package speechkit provides the public SDK for embedding SpeechKit voice capture and transcription into host applications.
The central type is Runtime, which manages shared state and event delivery. An Engine is the full voice pipeline; RecordingController and TranscriptionWorker can be composed independently for custom pipelines.
Index ¶
- Constants
- Variables
- func FallbackDictationSegments(fullPCM []byte) []dictation.Segment
- type AudioRecorder
- type Command
- type CommandBus
- type CommandType
- type CommitObserver
- type Completion
- type DictationSegmenter
- type Engine
- type Event
- type EventType
- type Hooks
- type JobSubmitter
- type Persistence
- type QuickNoteStore
- type RecordingController
- type RecordingObserver
- type RecordingStartOptions
- type RecordingStopOptions
- type Runtime
- func (r *Runtime) Close()
- func (r *Runtime) Commands() CommandBus
- func (r *Runtime) Events() <-chan Event
- func (r *Runtime) Publish(event Event) bool
- func (r *Runtime) SetState(snapshot Snapshot)
- func (r *Runtime) Start(ctx context.Context) error
- func (r *Runtime) State() Snapshot
- func (r *Runtime) Stop(ctx context.Context) error
- func (r *Runtime) UpdateState(update func(*Snapshot)) Snapshot
- type SegmentCollector
- type SegmentCollectorFactory
- type Snapshot
- type Submission
- type Transcriber
- type Transcript
- type TranscriptInterceptor
- type TranscriptOutput
- type TranscriptionJob
- type TranscriptionObserver
- type TranscriptionRunner
- type TranscriptionStore
- type TranscriptionWorker
- type TranscriptionWorkerConfig
Constants ¶
const (
	DefaultDictationMinSegment = 1200 * time.Millisecond
	DefaultDictationPadding    = 160 * time.Millisecond
	DefaultDictationOverlap    = 200 * time.Millisecond
)
const DefaultMinPCMBytes = 3200
const DefaultProcessingMessage = "Recording stopped · Transcribing"
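To give a sense of scale for DefaultMinPCMBytes, the sketch below converts a raw PCM byte count into a duration. The capture format (16 kHz, 16-bit, mono) is an assumption; this documentation does not state it. pcmDuration is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"time"
)

// Assumed capture format (not stated by the package): 16 kHz, 16-bit, mono.
const (
	sampleRate     = 16000
	bytesPerSample = 2
	channels       = 1
)

// pcmDuration is a hypothetical helper that converts a raw PCM byte count
// into its playback duration under the assumed format.
func pcmDuration(n int) time.Duration {
	bytesPerSecond := sampleRate * bytesPerSample * channels
	return time.Duration(n) * time.Second / time.Duration(bytesPerSecond)
}

func main() {
	// 3200 is the value of DefaultMinPCMBytes.
	fmt.Println(pcmDuration(3200)) // prints "100ms" under the assumed format
}
```

Under a different sample rate or channel count the same byte count maps to a different duration, so the figure above should be treated as illustrative only.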
Variables ¶
var (
	ErrMissingRunner      = errors.New("speechkit: transcription worker requires a runner")
	ErrMissingTranscriber = errors.New("speechkit: transcription runner requires a transcriber")
	ErrWorkerClosed       = errors.New("speechkit: transcription worker is closed")
	ErrWorkerQueueFull    = errors.New("speechkit: transcription worker queue is full")
)
ErrCommandHandlerUnavailable is returned by CommandBus.Dispatch when no command handler has been configured on the Runtime.
Functions ¶
func FallbackDictationSegments ¶
FallbackDictationSegments wraps all of fullPCM in a single segment. Used when VAD-based segmentation is unavailable or produces no output.
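The fallback pattern described above can be sketched as follows. The segment type is a hypothetical stand-in for dictation.Segment, whose fields are not shown in this documentation, and collectOrFallback is an illustrative helper rather than a package function.

```go
package main

import "fmt"

// segment is a hypothetical stand-in for dictation.Segment.
type segment struct {
	PCM []byte
}

// fallbackSegments mirrors the documented behavior: the whole buffer
// becomes exactly one segment.
func fallbackSegments(fullPCM []byte) []segment {
	return []segment{{PCM: fullPCM}}
}

// collectOrFallback tries a segmenter first and falls back to a single
// segment when the segmenter fails or produces no output.
func collectOrFallback(collect func([]byte) ([]segment, error), fullPCM []byte) []segment {
	segs, err := collect(fullPCM)
	if err != nil || len(segs) == 0 {
		return fallbackSegments(fullPCM)
	}
	return segs
}

func main() {
	pcm := make([]byte, 6400)
	// A segmenter that detects nothing, standing in for an unavailable VAD.
	noOutput := func([]byte) ([]segment, error) { return nil, nil }

	segs := collectOrFallback(noOutput, pcm)
	fmt.Println(len(segs), len(segs[0].PCM)) // prints "1 6400"
}
```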
Types ¶
type AudioRecorder ¶
AudioRecorder is the hardware abstraction for microphone capture.
type Command ¶
type Command struct {
Type CommandType
Text string
NoteID int64
Target string
Metadata map[string]string
}
Command is a request dispatched through the CommandBus.
type CommandBus ¶
CommandBus delivers Command values to the registered handler.
type CommandType ¶
type CommandType string
CommandType identifies the action a Command requests.
const (
	CommandShowDashboard           CommandType = "dashboard.show"
	CommandStartDictation          CommandType = "dictation.start"
	CommandStopDictation           CommandType = "dictation.stop"
	CommandSetActiveMode           CommandType = "mode.set_active"
	CommandOpenQuickNote           CommandType = "quicknote.open"
	CommandOpenQuickCapture        CommandType = "quicknote.capture.open"
	CommandCloseQuickCapture       CommandType = "quicknote.capture.close"
	CommandArmQuickNoteRecording   CommandType = "quicknote.record.arm"
	CommandCopyLastTranscription   CommandType = "transcription.copy_last"
	CommandInsertLastTranscription CommandType = "transcription.insert_last"
	CommandSummarizeSelection      CommandType = "selection.summarize"
)
type CommitObserver ¶
type CommitObserver interface {
OnCommit(completion Completion)
}
CommitObserver is notified after each successful TranscriptionRunner.Commit.
type Completion ¶
type Completion struct {
Transcript Transcript
QuickNoteCommitted bool
QuickNoteCreated bool
QuickNoteID int64
TranscriptionPersisted bool
}
Completion describes the outcome of a TranscriptionRunner.Commit call.
type DictationSegmenter ¶
type DictationSegmenter struct {
// contains filtered or unexported fields
}
DictationSegmenter implements SegmentCollector using VAD-based pause detection to split continuous speech into discrete segments.
func NewDictationSegmenter ¶
func NewDictationSegmenter(detector vad.Detector, pauseThreshold time.Duration) *DictationSegmenter
func (*DictationSegmenter) CollectStopSegments ¶
func (s *DictationSegmenter) CollectStopSegments(fullPCM []byte) ([]dictation.Segment, error)
func (*DictationSegmenter) FeedPCM ¶
func (s *DictationSegmenter) FeedPCM(pcm []byte) error
type Engine ¶
type Engine interface {
Start(context.Context) error
Stop(context.Context) error
Events() <-chan Event
Commands() CommandBus
State() Snapshot
}
Engine is the interface implemented by a full SpeechKit voice pipeline.
type Event ¶
type Event struct {
Type EventType
Time time.Time
Message string
Text string
Provider string
QuickNote bool
Err error
Shortcut string
}
Event is a notification published to the event channel returned by Runtime.Events. Consumers should switch on Type and inspect the relevant fields.
type EventType ¶
type EventType string
EventType identifies the kind of event published to the event channel.
const (
	EventStateChanged        EventType = "state.changed"
	EventRecordingStarted    EventType = "recording.started"
	EventProcessingStarted   EventType = "processing.started"
	EventTranscriptionReady  EventType = "transcription.ready"
	EventTranscriptCommitted EventType = "transcription.committed"
	EventQuickNoteModeArmed  EventType = "quicknote.mode_armed"
	EventQuickNoteUpdated    EventType = "quicknote.updated"
	EventWarningRaised       EventType = "warning.raised"
	EventErrorRaised         EventType = "error.raised"
	EventShortcutMatched     EventType = "shortcut.matched"
)
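A consumer loop that switches on Event.Type might look like the sketch below. The types are local copies trimmed from the declarations above so the example runs on its own. Which fields are populated for each event type is an assumption of this sketch; the documentation does not specify it.

```go
package main

import (
	"errors"
	"fmt"
)

// Local, trimmed copies of the documented types.
type EventType string

const (
	EventRecordingStarted   EventType = "recording.started"
	EventTranscriptionReady EventType = "transcription.ready"
	EventErrorRaised        EventType = "error.raised"
)

type Event struct {
	Type    EventType
	Message string
	Text    string
	Err     error
}

// describe switches on the event type and reads the fields this sketch
// assumes are relevant for that type.
func describe(ev Event) string {
	switch ev.Type {
	case EventRecordingStarted:
		return "recording started"
	case EventTranscriptionReady:
		return "transcript: " + ev.Text
	case EventErrorRaised:
		if ev.Err != nil {
			return "error: " + ev.Err.Error()
		}
		return "error: " + ev.Message
	default:
		return "ignored: " + string(ev.Type)
	}
}

func main() {
	// A buffered channel stands in for the channel returned by Runtime.Events.
	events := make(chan Event, 3)
	events <- Event{Type: EventRecordingStarted}
	events <- Event{Type: EventTranscriptionReady, Text: "hello world"}
	events <- Event{Type: EventErrorRaised, Err: errors.New("microphone unavailable")}
	close(events)

	for ev := range events {
		fmt.Println(describe(ev))
	}
}
```

Keeping a default branch means event types added in later versions are ignored rather than mishandled.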
type Hooks ¶
type Hooks struct {
Start func(context.Context) error
Stop func(context.Context) error
HandleCommand func(context.Context, Command) error
}
Hooks are the lifecycle callbacks wired into a Runtime. Nil hooks are silently skipped.
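One plausible reading of "silently skipped" is that a nil hook is treated as a successful no-op. The sketch below illustrates that reading with a local, trimmed copy of Hooks; the call helper is hypothetical and is not the package's implementation.

```go
package main

import (
	"context"
	"fmt"
)

// Hooks mirrors the documented struct, trimmed to the two lifecycle hooks.
type Hooks struct {
	Start func(context.Context) error
	Stop  func(context.Context) error
}

// call runs hook if it is set and reports success for a nil hook.
// This is an assumed reading of "silently skipped".
func call(ctx context.Context, hook func(context.Context) error) error {
	if hook == nil {
		return nil
	}
	return hook(ctx)
}

// demo wires only a Start hook and runs both lifecycle steps.
func demo() (calls int, err error) {
	hooks := Hooks{
		Start: func(context.Context) error { calls++; return nil },
		// Stop is left nil on purpose.
	}
	ctx := context.Background()
	if err = call(ctx, hooks.Start); err != nil {
		return calls, err
	}
	err = call(ctx, hooks.Stop)
	return calls, err
}

func main() {
	calls, err := demo()
	fmt.Println(calls, err) // prints "1 <nil>"
}
```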
type JobSubmitter ¶
type JobSubmitter interface {
Submit(TranscriptionJob) error
}
JobSubmitter accepts a TranscriptionJob for async processing.
type Persistence ¶
type Persistence interface {
QuickNoteStore
TranscriptionStore
}
Persistence combines QuickNoteStore and TranscriptionStore.
type QuickNoteStore ¶
type QuickNoteStore interface {
SaveQuickNote(ctx context.Context, text, language, provider string, durationMs, latencyMs int64, audioData []byte) (int64, error)
GetQuickNoteText(ctx context.Context, id int64) (string, error)
UpdateQuickNote(ctx context.Context, id int64, text string) error
UpdateQuickNoteCapture(ctx context.Context, id int64, text, provider string, durationMs, latencyMs int64, audioData []byte) error
}
QuickNoteStore persists and retrieves Quick Note records.
type RecordingController ¶
type RecordingController struct {
// contains filtered or unexported fields
}
RecordingController manages the start/stop lifecycle of a single recording session and hands audio segments to the submission queue.
func NewRecordingController ¶
func NewRecordingController(recorder AudioRecorder, submitter JobSubmitter, observer RecordingObserver, segmenterFactory SegmentCollectorFactory) *RecordingController
func (*RecordingController) IsRecording ¶
func (c *RecordingController) IsRecording() bool
func (*RecordingController) Start ¶
func (c *RecordingController) Start(opts RecordingStartOptions) error
func (*RecordingController) Stop ¶
func (c *RecordingController) Stop(opts RecordingStopOptions) error
type RecordingObserver ¶
type RecordingStartOptions ¶
type RecordingStopOptions ¶
type RecordingStopOptions struct {
Label string
}
type Runtime ¶
type Runtime struct {
// contains filtered or unexported fields
}
Runtime manages shared observable state and event delivery for a SpeechKit session. Create one with NewRuntime and wire it into the host application via Runtime.Events and Runtime.Commands.
func NewRuntime ¶
func (*Runtime) Commands ¶
func (r *Runtime) Commands() CommandBus
func (*Runtime) UpdateState ¶
func (r *Runtime) UpdateState(update func(*Snapshot)) Snapshot
type SegmentCollector ¶
type SegmentCollector interface {
FeedPCM([]byte) error
CollectStopSegments(fullPCM []byte) ([]dictation.Segment, error)
}
SegmentCollector accumulates real-time PCM frames and splits them into dictation segments when recording stops.
type SegmentCollectorFactory ¶
type SegmentCollectorFactory func() SegmentCollector
type Snapshot ¶
type Snapshot struct {
Status string
Text string
Level float64
Hotkey string
ActiveMode string
Providers []string
ActiveProfiles map[string]string
Transcriptions int
QuickNoteMode bool
QuickCaptureMode bool
LastTranscriptionText string
}
Snapshot is a point-in-time copy of the Runtime's observable state. All slice and map fields are safe to read without holding any lock.
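A copy-on-read store is one way to provide that guarantee. The sketch below is an assumption about how such a guarantee can be built, not a description of Runtime's internals. Snapshot is trimmed to three fields, and store and cloneSnapshot are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// Snapshot is a trimmed local copy of the documented struct.
type Snapshot struct {
	Status         string
	Providers      []string
	ActiveProfiles map[string]string
}

// cloneSnapshot copies the slice and map fields so the result shares no
// mutable memory with the original.
func cloneSnapshot(s Snapshot) Snapshot {
	out := s
	if s.Providers != nil {
		out.Providers = append([]string(nil), s.Providers...)
	}
	if s.ActiveProfiles != nil {
		out.ActiveProfiles = make(map[string]string, len(s.ActiveProfiles))
		for k, v := range s.ActiveProfiles {
			out.ActiveProfiles[k] = v
		}
	}
	return out
}

// store is a hypothetical holder with the same method shapes as
// Runtime.State and Runtime.UpdateState.
type store struct {
	mu   sync.Mutex
	snap Snapshot
}

func (s *store) State() Snapshot {
	s.mu.Lock()
	defer s.mu.Unlock()
	return cloneSnapshot(s.snap)
}

func (s *store) UpdateState(update func(*Snapshot)) Snapshot {
	s.mu.Lock()
	defer s.mu.Unlock()
	update(&s.snap)
	return cloneSnapshot(s.snap)
}

func main() {
	st := &store{}
	st.UpdateState(func(s *Snapshot) { s.Providers = []string{"local"} })

	got := st.State()
	got.Providers[0] = "changed" // mutates only the caller's copy

	fmt.Println(st.State().Providers[0]) // prints "local"
}
```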
type Submission ¶
type Submission struct {
PCM []byte
WAV []byte
DurationSecs float64
Language string
Prefix string
QuickNote bool
QuickNoteID int64
}
Submission carries a single audio segment and its metadata into the transcription pipeline.
type Transcriber ¶
type Transcriber interface {
Transcribe(ctx context.Context, audio []byte, durationSecs float64, language string) (Transcript, error)
}
Transcriber converts raw WAV audio into a Transcript.
type Transcript ¶
type Transcript struct {
Text string
Language string
Duration time.Duration
Provider string
Model string
Confidence float64
}
Transcript holds the result of a single transcription call.
type TranscriptInterceptor ¶
type TranscriptInterceptor interface {
Intercept(ctx context.Context, transcript Transcript, target any) (bool, error)
}
TranscriptInterceptor can handle a transcript before it reaches the normal output path. Return (true, nil) to signal that the transcript was consumed.
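As an illustration of the (true, nil) contract, the sketch below consumes transcripts that begin with a spoken command prefix and lets everything else continue to the normal output path. The interface is copied locally and Transcript is trimmed to one field; prefixInterceptor and the "computer," prefix are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Local copies of the documented types, trimmed for the example.
type Transcript struct{ Text string }

type TranscriptInterceptor interface {
	Intercept(ctx context.Context, transcript Transcript, target any) (bool, error)
}

// prefixInterceptor is a hypothetical interceptor that treats transcripts
// starting with prefix as commands instead of dictation.
type prefixInterceptor struct {
	prefix   string
	commands []string
}

func (p *prefixInterceptor) Intercept(_ context.Context, t Transcript, _ any) (bool, error) {
	text := strings.TrimSpace(t.Text)
	if !strings.HasPrefix(text, p.prefix) {
		return false, nil // not consumed: continue to the normal output
	}
	p.commands = append(p.commands, strings.TrimSpace(strings.TrimPrefix(text, p.prefix)))
	return true, nil // consumed
}

// consumed reports whether a transcript with the given text is intercepted.
func consumed(text string) bool {
	var ic TranscriptInterceptor = &prefixInterceptor{prefix: "computer,"}
	ok, _ := ic.Intercept(context.Background(), Transcript{Text: text}, nil)
	return ok
}

func main() {
	fmt.Println(consumed("computer, open notes")) // prints "true"
	fmt.Println(consumed("plain dictation"))      // prints "false"
}
```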
type TranscriptOutput ¶
type TranscriptOutput interface {
Deliver(ctx context.Context, transcript Transcript, target any) error
}
TranscriptOutput delivers a completed Transcript to the host application (e.g. clipboard injection or text-field paste).
type TranscriptionJob ¶
type TranscriptionJob struct {
Submission
Target any
}
TranscriptionJob pairs a Submission with its delivery target.
func (TranscriptionJob) Clone ¶
func (j TranscriptionJob) Clone() TranscriptionJob
type TranscriptionObserver ¶
type TranscriptionObserver interface {
OnState(status, text string)
OnLog(message, kind string)
OnTranscriptCommitted(transcript Transcript, quickNote bool)
}
TranscriptionObserver receives real-time status and log updates from a TranscriptionWorker during processing.
type TranscriptionRunner ¶
type TranscriptionRunner struct {
// contains filtered or unexported fields
}
TranscriptionRunner transcribes audio submissions and persists results. Create one with NewTranscriptionRunner.
func NewTranscriptionRunner ¶
func NewTranscriptionRunner(transcriber Transcriber, store Persistence) *TranscriptionRunner
NewTranscriptionRunner creates a TranscriptionRunner backed by the given transcriber and persistence store. Either argument may be nil.
func (*TranscriptionRunner) Commit ¶
func (r *TranscriptionRunner) Commit(ctx context.Context, submission Submission, transcript Transcript) (Completion, error)
func (*TranscriptionRunner) WithObserver ¶
func (r *TranscriptionRunner) WithObserver(observer CommitObserver) *TranscriptionRunner
type TranscriptionStore ¶
type TranscriptionStore interface {
SaveTranscription(ctx context.Context, text, language, provider, model string, durationMs, latencyMs int64, audioData []byte) error
}
TranscriptionStore persists completed dictation transcriptions.
type TranscriptionWorker ¶
type TranscriptionWorker struct {
// contains filtered or unexported fields
}
TranscriptionWorker processes TranscriptionJob values from an internal queue on a single goroutine. Start it with TranscriptionWorker.Start and submit work with TranscriptionWorker.Submit.
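The single-goroutine queue model can be sketched as below. This is not the package's implementation: the worker type, its string jobs, and the non-blocking Submit are assumptions chosen to match the ErrWorkerClosed and ErrWorkerQueueFull error names, and the context parameter of the real Start is omitted for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	errClosed    = errors.New("worker is closed")
	errQueueFull = errors.New("worker queue is full")
)

// worker is a hypothetical bounded queue drained by one goroutine.
type worker struct {
	mu     sync.Mutex
	closed bool
	jobs   chan string
	done   chan struct{}
	handle func(string)
}

func newWorker(queueSize int, handle func(string)) *worker {
	return &worker{
		jobs:   make(chan string, queueSize),
		done:   make(chan struct{}),
		handle: handle,
	}
}

// Start launches the single processing goroutine.
func (w *worker) Start() {
	go func() {
		defer close(w.done)
		for job := range w.jobs {
			w.handle(job)
		}
	}()
}

// Submit enqueues without blocking and reports a full or closed queue.
func (w *worker) Submit(job string) error {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.closed {
		return errClosed
	}
	select {
	case w.jobs <- job:
		return nil
	default:
		return errQueueFull
	}
}

// Close stops intake; jobs already queued are still processed.
func (w *worker) Close() {
	w.mu.Lock()
	defer w.mu.Unlock()
	if !w.closed {
		w.closed = true
		close(w.jobs)
	}
}

// Wait blocks until the processing goroutine has drained the queue.
func (w *worker) Wait() { <-w.done }

// demo fills a one-slot queue before starting the worker, so the second
// Submit deterministically observes a full queue.
func demo() (processed []string, fullErr, closedErr error) {
	w := newWorker(1, func(job string) { processed = append(processed, job) })
	_ = w.Submit("a")
	fullErr = w.Submit("b")
	w.Start()
	w.Close()
	w.Wait()
	closedErr = w.Submit("c")
	return processed, fullErr, closedErr
}

func main() {
	processed, fullErr, closedErr := demo()
	fmt.Println(processed, fullErr, closedErr)
}
```

Guarding Submit and Close with the same mutex avoids sending on a closed channel, which would otherwise panic.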
func NewTranscriptionWorker ¶
func NewTranscriptionWorker(cfg TranscriptionWorkerConfig) (*TranscriptionWorker, error)
func (*TranscriptionWorker) Close ¶
func (w *TranscriptionWorker) Close()
func (*TranscriptionWorker) Start ¶
func (w *TranscriptionWorker) Start(ctx context.Context)
func (*TranscriptionWorker) Submit ¶
func (w *TranscriptionWorker) Submit(job TranscriptionJob) error
func (*TranscriptionWorker) Wait ¶
func (w *TranscriptionWorker) Wait()
type TranscriptionWorkerConfig ¶
type TranscriptionWorkerConfig struct {
Timeout time.Duration
QueueSize int
Runner *TranscriptionRunner
Output TranscriptOutput
Interceptor TranscriptInterceptor
Observer TranscriptionObserver
}
TranscriptionWorkerConfig configures a TranscriptionWorker. Runner is required; all other fields are optional.