Documentation ¶
Overview ¶
Package voiceagent implements Voice Agent Mode: a real-time, bidirectional voice conversation with native audio-to-audio models (Gemini Live API, OpenAI Realtime API) over WebSocket.
Index ¶
- type Callbacks
- type GeminiLive
- func (g *GeminiLive) Close() error
- func (g *GeminiLive) Connect(ctx context.Context, cfg LiveConfig) error
- func (g *GeminiLive) Name() string
- func (g *GeminiLive) Receive(ctx context.Context) (*LiveMessage, error)
- func (g *GeminiLive) SendAudio(chunk []byte) error
- func (g *GeminiLive) SendText(text string) error
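- type IdleConfig
- func DefaultIdleConfig() IdleConfig
- type IdleTimer
- func NewIdleTimer(cfg IdleConfig, session *Session) *IdleTimer
- type LiveConfig
- type LiveMessage
- type LiveProvider
- type Session
- func NewSession(provider LiveProvider, callbacks Callbacks) *Session
- func (s *Session) Start(ctx context.Context, cfg LiveConfig, idleCfg IdleConfig) error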
- type State
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Callbacks ¶
type Callbacks struct {
	OnStateChange func(state State)
	OnAudio       func(audio []byte) // Audio chunk to play
	OnText        func(text string)  // Text for display (speech bubble)
	OnError       func(err error)
}
Callbacks are event handlers for UI integration.
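As a rough illustration of how these handlers might be wired to a UI, the sketch below mirrors the Callbacks struct locally so that it compiles on its own. The `recorder` type is hypothetical, and `State` is assumed to be a string here because its real definition is not shown in this section. Whether the package tolerates nil handlers is not stated, so the sketch sets all four.

```go
package main

import (
	"errors"
	"fmt"
)

// State is a stand-in for the package's State type. Its real definition is
// not shown in this section, so a string is assumed for illustration.
type State string

// Callbacks mirrors the struct documented above.
type Callbacks struct {
	OnStateChange func(state State)
	OnAudio       func(audio []byte)
	OnText        func(text string)
	OnError       func(err error)
}

// recorder is a hypothetical UI stand-in that logs every event it receives.
type recorder struct {
	events []string
}

// callbacks builds a Callbacks value whose handlers append to r.events.
func (r *recorder) callbacks() Callbacks {
	return Callbacks{
		OnStateChange: func(s State) { r.events = append(r.events, "state:"+string(s)) },
		OnAudio:       func(a []byte) { r.events = append(r.events, fmt.Sprintf("audio:%d bytes", len(a))) },
		OnText:        func(t string) { r.events = append(r.events, "text:"+t) },
		OnError:       func(err error) { r.events = append(r.events, "error:"+err.Error()) },
	}
}

func main() {
	r := &recorder{}
	cb := r.callbacks()

	// Simulate the kinds of events a session might emit.
	cb.OnStateChange("listening")
	cb.OnAudio(make([]byte, 480))
	cb.OnText("hello")
	cb.OnError(errors.New("connection lost"))

	for _, e := range r.events {
		fmt.Println(e)
	}
}
```

In a real application the handlers would forward to an audio player and a view layer rather than to a slice. The documentation does not say which goroutine invokes them, so a UI toolkit with thread-affinity rules may need its own dispatch step.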
type GeminiLive ¶
type GeminiLive struct {
	// contains filtered or unexported fields
}
GeminiLive implements LiveProvider using the Google GenAI Live API.
func (*GeminiLive) Close ¶
func (g *GeminiLive) Close() error
func (*GeminiLive) Connect ¶
func (g *GeminiLive) Connect(ctx context.Context, cfg LiveConfig) error
func (*GeminiLive) Name ¶
func (g *GeminiLive) Name() string
func (*GeminiLive) Receive ¶
func (g *GeminiLive) Receive(ctx context.Context) (*LiveMessage, error)
func (*GeminiLive) SendAudio ¶
func (g *GeminiLive) SendAudio(chunk []byte) error
func (*GeminiLive) SendText ¶
func (g *GeminiLive) SendText(text string) error
type IdleConfig ¶
type IdleConfig struct {
	ReminderAfter   time.Duration // Default: 5 minutes
	DeactivateAfter time.Duration // Default: 15 minutes
}
IdleConfig configures the idle timer behavior.
func DefaultIdleConfig ¶
func DefaultIdleConfig() IdleConfig
DefaultIdleConfig returns an IdleConfig populated with the documented defaults: ReminderAfter of 5 minutes and DeactivateAfter of 15 minutes.
type IdleTimer ¶
type IdleTimer struct {
	// contains filtered or unexported fields
}
IdleTimer manages the idle reminder and automatic deactivation of a Voice Agent session.
func NewIdleTimer ¶
func NewIdleTimer(cfg IdleConfig, session *Session) *IdleTimer
NewIdleTimer creates an idle timer bound to a session.
type LiveConfig ¶
type LiveConfig struct {
	Model          string // e.g. "gemini-2.5-flash-native-audio-preview-12-2025"
	APIKey         string
	Voice          string // Voice name
	SystemPrompt   string
	VocabularyHint string
	Locale         string
}
LiveConfig configures a real-time session.
type LiveMessage ¶
type LiveMessage struct {
	Audio []byte // PCM audio chunk (24kHz 16-bit mono)
	Text  string // Text transcript (may be partial or empty)
	Done  bool   // True when the model's turn is complete
}
LiveMessage is a message received from the real-time model.
type LiveProvider ¶
type LiveProvider interface {
	// Connect establishes a WebSocket session to the real-time model.
	Connect(ctx context.Context, cfg LiveConfig) error
	// SendAudio streams PCM audio chunks to the model.
	// Format: 16-bit signed int, little-endian, mono, 16kHz.
	SendAudio(chunk []byte) error
	// Receive blocks until the next server message arrives.
	// Returns audio chunks and/or text from the model.
	Receive(ctx context.Context) (*LiveMessage, error)
	// SendText injects a text prompt into the session (for idle reminders).
	SendText(text string) error
	// Close terminates the WebSocket session.
	Close() error
	// Name returns the provider identifier.
	Name() string
}
LiveProvider abstracts a real-time audio-to-audio model connection.
type Session ¶
type Session struct {
	// contains filtered or unexported fields
}
Session manages a Voice Agent conversation.
func NewSession ¶
func NewSession(provider LiveProvider, callbacks Callbacks) *Session
NewSession creates a Voice Agent session with the given provider.
func (*Session) CurrentState ¶
CurrentState returns the current session state.
func (*Session) Start ¶
func (s *Session) Start(ctx context.Context, cfg LiveConfig, idleCfg IdleConfig) error
Start activates the Voice Agent session.