Documentation ¶
Overview ¶
Package voiceagent is the Voice Agent kernel: a realtime audio-to-audio session manager backed by Gemini Live, with Persona/Role/Sequence resolution from internal/voicebehavior. The local desktop provider (cmd/speechkit) and the server adapter (internal/server/voiceagent) both depend on this package; the kernel itself stays free of OS and HTTP plumbing.
Audit 2026-05-24 maintainability sweep.
Index ¶
- Constants
- func RenderHostInstructionUpdate(cfg LiveConfig) string
- type ActivityDetectionPolicy
- type ActivityHandling
- type Callbacks
- type ContextCompressionPolicy
- type EndSensitivity
- type GeminiLive
- func (g *GeminiLive) Close() error
- func (g *GeminiLive) Connect(ctx context.Context, cfg LiveConfig) error
- func (g *GeminiLive) Name() string
- func (g *GeminiLive) Receive(ctx context.Context) (*LiveMessage, error)
- func (g *GeminiLive) Reconnect(ctx context.Context) error
- func (g *GeminiLive) SendAudio(chunk []byte) error
- func (g *GeminiLive) SendAudioStreamEnd() error
- func (g *GeminiLive) SendText(text string) error
- func (g *GeminiLive) SendToolResponse(response ToolResponse) error
- type IdleConfig
- type IdleTimer
- type LiveConfig
- type LiveInstructionUpdater
- type LiveMessage
- type LivePolicies
- type LiveProvider
- type LiveReconnector
- type LocalVoiceAgentDeps
- type LocalVoiceAgentProvider
- func (p *LocalVoiceAgentProvider) Close() error
- func (p *LocalVoiceAgentProvider) Connect(ctx context.Context, cfg live.LiveConfig) error
- func (p *LocalVoiceAgentProvider) Name() string
- func (p *LocalVoiceAgentProvider) Receive(ctx context.Context) (*live.LiveMessage, error)
- func (p *LocalVoiceAgentProvider) SendAudio(chunk []byte) error
- func (p *LocalVoiceAgentProvider) SendAudioStreamEnd() error
- func (p *LocalVoiceAgentProvider) SendText(text string) error
- func (p *LocalVoiceAgentProvider) SendToolResponse(_ live.ToolResponse) error
- func (p *LocalVoiceAgentProvider) UpdateInstructions(ctx context.Context, cfg live.LiveConfig) error
- type OpenAILive
- func (p *OpenAILive) Close() error
- func (p *OpenAILive) Connect(ctx context.Context, cfg LiveConfig) error
- func (p *OpenAILive) Name() string
- func (p *OpenAILive) Receive(ctx context.Context) (*LiveMessage, error)
- func (p *OpenAILive) SendAudio(chunk []byte) error
- func (p *OpenAILive) SendAudioStreamEnd() error
- func (p *OpenAILive) SendText(text string) error
- func (p *OpenAILive) SendToolResponse(response ToolResponse) error
- func (p *OpenAILive) UpdateInstructions(ctx context.Context, cfg LiveConfig) error
- type Session
- type StartSensitivity
- type State
- type ThinkingLevel
- type ThinkingPolicy
- type ToolBehavior
- type ToolCall
- type ToolDefinition
- type ToolResponse
- type ToolResponseScheduling
- type TurnCoverage
- type WorkflowConfig
- type WorkflowStep
Constants ¶
const (
	StateInactive     = live.StateInactive
	StateConnecting   = live.StateConnecting
	StateListening    = live.StateListening
	StateProcessing   = live.StateProcessing
	StateSpeaking     = live.StateSpeaking
	StateRecovering   = live.StateRecovering
	StateDeactivating = live.StateDeactivating

	ThinkingLevelOff    = live.ThinkingLevelOff
	ThinkingLevelLow    = live.ThinkingLevelLow
	ThinkingLevelMedium = live.ThinkingLevelMedium
	ThinkingLevelHigh   = live.ThinkingLevelHigh

	StartSensitivityLow    = live.StartSensitivityLow
	StartSensitivityMedium = live.StartSensitivityMedium
	StartSensitivityHigh   = live.StartSensitivityHigh

	EndSensitivityLow    = live.EndSensitivityLow
	EndSensitivityMedium = live.EndSensitivityMedium
	EndSensitivityHigh   = live.EndSensitivityHigh

	ActivityHandlingUnspecified               = live.ActivityHandlingUnspecified
	ActivityHandlingNoInterrupt               = live.ActivityHandlingNoInterrupt
	ActivityHandlingStartOfActivityInterrupts = live.ActivityHandlingStartOfActivityInterrupts

	TurnCoverageUnspecified               = live.TurnCoverageUnspecified
	TurnCoverageTurnIncludesOnlyActivity  = live.TurnCoverageTurnIncludesOnlyActivity
	TurnCoverageTurnIncludesAllInput      = live.TurnCoverageTurnIncludesAllInput
	TurnCoverageTurnIncludesAudioActivity = live.TurnCoverageTurnIncludesAudioActivity

	ToolBehaviorUnspecified = live.ToolBehaviorUnspecified
	ToolBehaviorBlocking    = live.ToolBehaviorBlocking
	ToolBehaviorNonBlocking = live.ToolBehaviorNonBlocking

	ToolResponseSchedulingUnspecified = live.ToolResponseSchedulingUnspecified
	ToolResponseSchedulingSilent      = live.ToolResponseSchedulingSilent
	ToolResponseSchedulingWhenIdle    = live.ToolResponseSchedulingWhenIdle
	ToolResponseSchedulingInterrupt   = live.ToolResponseSchedulingInterrupt
)
const DefaultOpenAIRealtimeModel = defaultOpenAIRealtimeModel
DefaultOpenAIRealtimeModel is the public runtime default for OpenAI-backed Voice Agent sessions.
Variables ¶
This section is empty.
Functions ¶
func RenderHostInstructionUpdate ¶ added in v0.28.2
func RenderHostInstructionUpdate(cfg LiveConfig) string
RenderHostInstructionUpdate is a thin wrapper around live.RenderHostInstructionUpdate.
Types ¶
type ActivityDetectionPolicy ¶ added in v0.18.0
type ActivityDetectionPolicy = live.ActivityDetectionPolicy
type ActivityHandling ¶ added in v0.18.0
type ActivityHandling = live.ActivityHandling
type ContextCompressionPolicy ¶ added in v0.18.0
type ContextCompressionPolicy = live.ContextCompressionPolicy
type EndSensitivity ¶ added in v0.18.0
type EndSensitivity = live.EndSensitivity
type GeminiLive ¶
type GeminiLive struct {
// contains filtered or unexported fields
}
GeminiLive implements LiveProvider using the Google GenAI Live API.
func (*GeminiLive) Close ¶
func (g *GeminiLive) Close() error
func (*GeminiLive) Connect ¶
func (g *GeminiLive) Connect(ctx context.Context, cfg LiveConfig) error
func (*GeminiLive) Name ¶
func (g *GeminiLive) Name() string
func (*GeminiLive) Receive ¶
func (g *GeminiLive) Receive(ctx context.Context) (*LiveMessage, error)
func (*GeminiLive) Reconnect ¶ added in v0.18.0
func (g *GeminiLive) Reconnect(ctx context.Context) error
Reconnect re-establishes the session using the stored resumption handle. If the handle has expired (TTL) or been cleared, a fresh session is opened.
func (*GeminiLive) SendAudio ¶
func (g *GeminiLive) SendAudio(chunk []byte) error
func (*GeminiLive) SendAudioStreamEnd ¶ added in v0.22.4
func (g *GeminiLive) SendAudioStreamEnd() error
func (*GeminiLive) SendText ¶
func (g *GeminiLive) SendText(text string) error
func (*GeminiLive) SendToolResponse ¶ added in v0.18.0
func (g *GeminiLive) SendToolResponse(response ToolResponse) error
type IdleConfig ¶
type IdleConfig = live.IdleConfig
func DefaultIdleConfig ¶
func DefaultIdleConfig() IdleConfig
DefaultIdleConfig is a thin wrapper around live.DefaultIdleConfig.
type IdleTimer ¶
func NewIdleTimer ¶
func NewIdleTimer(cfg IdleConfig, session *Session) *IdleTimer
NewIdleTimer is a thin wrapper around live.NewIdleTimer.
type LiveConfig ¶
type LiveConfig = live.LiveConfig
type LiveInstructionUpdater ¶ added in v0.28.2
type LiveInstructionUpdater = live.LiveInstructionUpdater
type LiveMessage ¶
type LiveMessage = live.LiveMessage
type LivePolicies ¶ added in v0.18.0
type LivePolicies = live.LivePolicies
type LiveProvider ¶
type LiveProvider = live.LiveProvider
type LiveReconnector ¶ added in v0.18.0
type LiveReconnector = live.LiveReconnector
type LocalVoiceAgentDeps ¶ added in v0.34.9
type LocalVoiceAgentDeps struct {
STT cascaded.STT
Agent cascaded.Agent
TTS cascaded.TTS
Config cascaded.Config
}
LocalVoiceAgentDeps bundles the kernel-level routers and flows that the Device-Target wires into the local Voice Agent provider.
Why a separate name from cascaded.Deps: a Device-Target caller cares about "I'm wiring the LOCAL provider", not about the underlying cascaded turn-based architecture. The latter is an implementation detail of the local path.
type LocalVoiceAgentProvider ¶ added in v0.34.9
type LocalVoiceAgentProvider struct {
// contains filtered or unexported fields
}
LocalVoiceAgentProvider is the Device-Target-side local Voice Agent provider. It wraps a cascaded.Provider and adapts its minimal SessionConfig / Message types to the public live.LiveConfig / live.LiveMessage interface so a Wails session can drive it the same way it drives Gemini Live.
func NewLocalVoiceAgentProvider ¶ added in v0.34.9
func NewLocalVoiceAgentProvider(deps LocalVoiceAgentDeps) *LocalVoiceAgentProvider
NewLocalVoiceAgentProvider constructs the local provider. Panics if STT or Agent is nil — these are hard requirements; nothing in the Device-Target should ever construct this without them.
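Because the constructor panics on missing dependencies, a caller that prefers an error can wrap it. The sketch below uses hypothetical stand-ins (the STT, Agent and TTS interfaces, their method sets, newProvider, and tryNewProvider are all assumptions for illustration); only the "panic when STT or Agent is nil" rule is taken from the documentation above.

```go
package main

import "fmt"

// Hypothetical minimal interfaces; the real cascaded.STT, cascaded.Agent and
// cascaded.TTS method sets are not shown in this documentation.
type STT interface{ Transcribe(pcm []byte) (string, error) }
type Agent interface{ Reply(prompt string) (string, error) }
type TTS interface{ Speak(text string) ([]byte, error) }

type deps struct {
	STT   STT
	Agent Agent
	TTS   TTS
}

type provider struct{ deps deps }

// newProvider mirrors the documented contract: STT and Agent are required,
// and a missing one is a programming error that panics.
func newProvider(d deps) *provider {
	if d.STT == nil {
		panic("local voice agent: STT is required")
	}
	if d.Agent == nil {
		panic("local voice agent: Agent is required")
	}
	return &provider{deps: d}
}

// tryNewProvider converts the panic into an error for callers that build
// their dependencies from user configuration.
func tryNewProvider(d deps) (p *provider, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("construct local provider: %v", r)
		}
	}()
	return newProvider(d), nil
}

// Trivial fakes so the example runs on its own.
type fakeSTT struct{}

func (fakeSTT) Transcribe([]byte) (string, error) { return "hello", nil }

type fakeAgent struct{}

func (fakeAgent) Reply(prompt string) (string, error) { return "echo: " + prompt, nil }

func main() {
	_, err := tryNewProvider(deps{})
	fmt.Println(err)

	p, err := tryNewProvider(deps{STT: fakeSTT{}, Agent: fakeAgent{}})
	fmt.Println(p != nil, err)
}
```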
func (*LocalVoiceAgentProvider) Close ¶ added in v0.34.9
func (p *LocalVoiceAgentProvider) Close() error
Close delegates to the wrapped cascaded.Provider.
func (*LocalVoiceAgentProvider) Connect ¶ added in v0.34.9
func (p *LocalVoiceAgentProvider) Connect(ctx context.Context, cfg live.LiveConfig) error
Connect translates the rich live.LiveConfig into cascaded.SessionConfig.
func (*LocalVoiceAgentProvider) Name ¶ added in v0.34.9
func (p *LocalVoiceAgentProvider) Name() string
Name overrides cascaded.Provider.Name() so observability tells the Device-Target-embedded path apart from the Server-Target wiring.
func (*LocalVoiceAgentProvider) Receive ¶ added in v0.34.9
func (p *LocalVoiceAgentProvider) Receive(ctx context.Context) (*live.LiveMessage, error)
Receive blocks for the next cascaded.Message and adapts it into a live.LiveMessage.
func (*LocalVoiceAgentProvider) SendAudio ¶ added in v0.34.9
func (p *LocalVoiceAgentProvider) SendAudio(chunk []byte) error
SendAudio delegates to the wrapped cascaded.Provider.
func (*LocalVoiceAgentProvider) SendAudioStreamEnd ¶ added in v0.34.9
func (p *LocalVoiceAgentProvider) SendAudioStreamEnd() error
SendAudioStreamEnd delegates to the wrapped cascaded.Provider.
func (*LocalVoiceAgentProvider) SendText ¶ added in v0.34.9
func (p *LocalVoiceAgentProvider) SendText(text string) error
SendText delegates to the wrapped cascaded.Provider.
func (*LocalVoiceAgentProvider) SendToolResponse ¶ added in v0.34.9
func (p *LocalVoiceAgentProvider) SendToolResponse(_ live.ToolResponse) error
SendToolResponse is a no-op. The local cascaded path is turn-based and does not surface tool calls upstream — clients that ask for tool calls fall back to the cloud Voice Agent.
func (*LocalVoiceAgentProvider) UpdateInstructions ¶ added in v0.34.9
func (p *LocalVoiceAgentProvider) UpdateInstructions(ctx context.Context, cfg live.LiveConfig) error
UpdateInstructions translates the rich live.LiveConfig into a cascaded.SessionConfig for mid-session prompt updates. Implementing this method also satisfies live.LiveInstructionUpdater.
type OpenAILive ¶ added in v0.31.0
type OpenAILive struct {
// contains filtered or unexported fields
}
OpenAILive implements LiveProvider against the OpenAI Realtime API (WebSocket). It mirrors GeminiLive's surface so callers don't need to know which backend is active.
func NewOpenAILive ¶ added in v0.31.0
func NewOpenAILive() *OpenAILive
NewOpenAILive returns a fresh OpenAI Realtime provider.
func (*OpenAILive) Close ¶ added in v0.31.0
func (p *OpenAILive) Close() error
Close terminates the WebSocket. Idempotent.
func (*OpenAILive) Connect ¶ added in v0.31.0
func (p *OpenAILive) Connect(ctx context.Context, cfg LiveConfig) error
Connect dials the OpenAI Realtime WebSocket, sends the configured instructions/voice/tools as a session.update, and waits for the session.updated acknowledgement before returning.
func (*OpenAILive) Name ¶ added in v0.31.0
func (p *OpenAILive) Name() string
Name identifies the provider in Voice Agent logs.
func (*OpenAILive) Receive ¶ added in v0.31.0
func (p *OpenAILive) Receive(ctx context.Context) (*LiveMessage, error)
Receive translates the next server event into a LiveMessage. Server events that don't map to LiveMessage fields (session.created/updated, rate-limit telemetry, etc.) are swallowed and the loop fetches the next frame so callers see an aligned event stream.
func (*OpenAILive) SendAudio ¶ added in v0.31.0
func (p *OpenAILive) SendAudio(chunk []byte) error
SendAudio resamples a 16 kHz mic chunk to 24 kHz and forwards it as a base64-encoded input_audio_buffer.append event. Empty chunks are no-ops.
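A rough sketch of the 16 kHz to 24 kHz step and the resulting event payload follows. Several things here are assumptions rather than facts about OpenAILive: the input is taken to be mono 16-bit little-endian PCM, the resampler uses plain linear interpolation (the package may use a different filter), and resample16to24 and appendEvent are hypothetical names. The input_audio_buffer.append event shape (a type plus base64 audio) follows the OpenAI Realtime API.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"encoding/json"
	"fmt"
)

// resample16to24 upsamples by 3/2 using linear interpolation between
// neighbouring input samples. Output sample i sits at input position i*2/3.
func resample16to24(in []int16) []int16 {
	if len(in) == 0 {
		return nil
	}
	out := make([]int16, len(in)*3/2)
	for i := range out {
		idx, rem := (i*2)/3, (i*2)%3
		a := int32(in[idx])
		b := a // hold the last sample at the right edge
		if idx+1 < len(in) {
			b = int32(in[idx+1])
		}
		out[i] = int16(a + (b-a)*int32(rem)/3)
	}
	return out
}

// appendEvent builds the JSON frame for one mic chunk. Chunks too short to
// hold a sample yield no frame, matching the documented no-op for empty input.
func appendEvent(chunk []byte) ([]byte, error) {
	if len(chunk) < 2 {
		return nil, nil
	}
	samples := make([]int16, len(chunk)/2) // a trailing odd byte is dropped
	for i := range samples {
		samples[i] = int16(binary.LittleEndian.Uint16(chunk[2*i:]))
	}
	resampled := resample16to24(samples)
	pcm := make([]byte, 2*len(resampled))
	for i, s := range resampled {
		binary.LittleEndian.PutUint16(pcm[2*i:], uint16(s))
	}
	return json.Marshal(map[string]string{
		"type":  "input_audio_buffer.append",
		"audio": base64.StdEncoding.EncodeToString(pcm),
	})
}

func main() {
	fmt.Println(resample16to24([]int16{0, 300})) // [0 200 300]

	frame, err := appendEvent([]byte{0x00, 0x00, 0x2c, 0x01}) // samples 0 and 300
	fmt.Println(string(frame), err)
}
```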
func (*OpenAILive) SendAudioStreamEnd ¶ added in v0.31.0
func (p *OpenAILive) SendAudioStreamEnd() error
SendAudioStreamEnd flushes the input audio buffer and triggers a model response. With server VAD enabled OpenAI commits the buffer automatically at end-of-speech; this explicit commit covers push-to-talk style turns where the kernel decides when the user stopped speaking.
func (*OpenAILive) SendText ¶ added in v0.31.0
func (p *OpenAILive) SendText(text string) error
SendText injects a text-only user turn and triggers a response.
func (*OpenAILive) SendToolResponse ¶ added in v0.31.0
func (p *OpenAILive) SendToolResponse(response ToolResponse) error
SendToolResponse delivers the host-side tool result back to the model and triggers a follow-up response.
func (*OpenAILive) UpdateInstructions ¶ added in v0.31.0
func (p *OpenAILive) UpdateInstructions(ctx context.Context, cfg LiveConfig) error
UpdateInstructions sends a fresh session.update with new instructions/tools. Implements LiveInstructionUpdater so the kernel can refresh persona prompts without forcing a reconnect.
type Session ¶
func NewSession ¶
func NewSession(provider LiveProvider, callbacks Callbacks) *Session
NewSession is a thin wrapper around live.NewSession so existing non-OSS call sites can continue to construct sessions through this package.
type StartSensitivity ¶ added in v0.18.0
type StartSensitivity = live.StartSensitivity
type ThinkingLevel ¶ added in v0.18.0
type ThinkingLevel = live.ThinkingLevel
type ThinkingPolicy ¶ added in v0.18.0
type ThinkingPolicy = live.ThinkingPolicy
type ToolBehavior ¶ added in v0.18.0
type ToolBehavior = live.ToolBehavior
type ToolDefinition ¶ added in v0.18.0
type ToolDefinition = live.ToolDefinition
type ToolResponse ¶ added in v0.18.0
type ToolResponse = live.ToolResponse
type ToolResponseScheduling ¶ added in v0.18.0
type ToolResponseScheduling = live.ToolResponseScheduling
type TurnCoverage ¶ added in v0.18.0
type TurnCoverage = live.TurnCoverage
type WorkflowConfig ¶ added in v0.28.2
type WorkflowConfig = live.WorkflowConfig
type WorkflowStep ¶ added in v0.28.2
type WorkflowStep = live.WorkflowStep