Documentation ¶
Overview ¶
Package agentkit provides a small Go harness for building SpeechKit Voice Agent hosts.
It wraps the realtime Voice Agent runtime with a session type, a thread-safe tool registry, lifecycle hooks, and a swappable session memory interface. Use it when embedding SpeechKit into an agent host that needs predictable tool registration and prompt/session lifecycle boundaries.
Index ¶
- Variables
- type AgentSession
- func (a *AgentSession) EndAudioStream() error
- func (a *AgentSession) Memory() Memory
- func (a *AgentSession) Registry() *ToolRegistry
- func (a *AgentSession) SendAudio(chunk []byte) error
- func (a *AgentSession) SendText(text string) error
- func (a *AgentSession) SessionID() string
- func (a *AgentSession) Start(ctx context.Context, cfg LiveConfig, idleCfg IdleConfig) error
- func (a *AgentSession) State() live.State
- func (a *AgentSession) Stop()
- type Callbacks
- type FuncTool
- type IdleConfig
- type LifecycleHooks
- type LiveConfig
- type LiveProvider
- type Memory
- type Schema
- type SessionContext
- type Tool
- type ToolBehavior
- type ToolCall
- type ToolDefinition
- type ToolRegistry
- type ToolResponse
Constants ¶
This section is empty.
Variables ¶
var (
	// ErrDuplicateTool is returned by ToolRegistry.Register when a tool with
	// the same name is already registered.
	ErrDuplicateTool = errors.New("agentkit: tool already registered")

	// ErrUnknownTool is returned when a ToolCall references a name that is
	// not present in the registry. Callers normally see this surfaced as a
	// JSON {"error": "..."} response sent back to the model.
	ErrUnknownTool = errors.New("agentkit: unknown tool")

	// ErrNilTool is returned by ToolRegistry.Register when the supplied
	// Tool is nil or has an empty Name.
	ErrNilTool = errors.New("agentkit: tool is nil or has empty name")
)
Functions ¶
This section is empty.
Types ¶
type AgentSession ¶
type AgentSession struct {
// contains filtered or unexported fields
}
AgentSession wraps a live.Session with a tool registry, lifecycle hooks, and a session-scoped memory store. It is the primary entry point for callers building voice agents on top of SpeechKit.
Construction order:
- Build a LiveProvider (e.g. live_gemini.NewLiveProvider(...))
- Define your Callbacks (OnAudio for playback, OnError, ...)
- Build a ToolRegistry and Register each Tool
- Build LifecycleHooks (optional)
- NewAgentSession(provider, callbacks, registry, hooks, memory)
- Start(ctx, cfg, idleCfg)
AgentSession is safe for concurrent calls to SendAudio, SendText, and EndAudioStream. The underlying live.Session enforces single-session activation.
func NewAgentSession ¶
func NewAgentSession(
	provider LiveProvider,
	userCallbacks Callbacks,
	registry *ToolRegistry,
	hooks LifecycleHooks,
	memory Memory,
) *AgentSession
NewAgentSession constructs an AgentSession.
userCallbacks may carry the consumer's own Callbacks (e.g. OnAudio for playback). agentkit wraps OnToolCall, OnInputTranscript, OnOutputTranscript and OnSessionEnd to drive tool dispatch and lifecycle hooks; the caller's originals are still invoked.
If registry is nil, an empty registry is used. If memory is nil, the default in-memory store is used.
func (*AgentSession) EndAudioStream ¶
func (a *AgentSession) EndAudioStream() error
EndAudioStream signals the end of the current microphone stream.
func (*AgentSession) Memory ¶
func (a *AgentSession) Memory() Memory
Memory returns the Memory store bound to this session.
func (*AgentSession) Registry ¶
func (a *AgentSession) Registry() *ToolRegistry
Registry returns the ToolRegistry bound to this session.
func (*AgentSession) SendAudio ¶
func (a *AgentSession) SendAudio(chunk []byte) error
SendAudio forwards a PCM audio chunk to the realtime model.
func (*AgentSession) SendText ¶
func (a *AgentSession) SendText(text string) error
SendText injects a user text turn into the live session.
func (*AgentSession) SessionID ¶
func (a *AgentSession) SessionID() string
SessionID returns the unique id assigned to this session.
func (*AgentSession) Start ¶
func (a *AgentSession) Start(ctx context.Context, cfg LiveConfig, idleCfg IdleConfig) error
Start activates the underlying voice-agent session. It injects the registry's tool definitions into cfg.Tools (preserving any tools the caller already supplied) before forwarding to live.Session.Start.
OnSessionStart fires after a successful Start. If Start returns an error, no hook fires.
func (*AgentSession) State ¶
func (a *AgentSession) State() live.State
State returns the underlying session state.
func (*AgentSession) Stop ¶
func (a *AgentSession) Stop()
Stop deactivates the underlying session. The context of any pending tool dispatch is cancelled through the session ctx.
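Because cancellation reaches tools through their context, a long-running tool body should watch ctx.Done(). The following is a minimal sketch and does not use agentkit itself: slowLookup is a hypothetical function with the same signature as Tool.Invoke, and a plain context.WithCancel stands in for the cancellation that Stop triggers.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowLookup is a hypothetical tool body with the same signature as
// Tool.Invoke. It simulates slow work and returns early when ctx is cancelled.
func slowLookup(ctx context.Context, args map[string]any) (map[string]any, error) {
	select {
	case <-time.After(5 * time.Second): // stand-in for a slow backend call
		return map[string]any{"status": "done"}, nil
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}

// invokeThenCancel runs slowLookup in a goroutine, cancels its context
// (standing in for the cancellation triggered by Stop), and reports whether
// the tool returned context.Canceled.
func invokeThenCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	errCh := make(chan error, 1)
	go func() {
		_, err := slowLookup(ctx, nil)
		errCh <- err
	}()
	cancel()
	return errors.Is(<-errCh, context.Canceled)
}

func main() {
	fmt.Println("tool observed cancellation:", invokeThenCancel())
}
```

A tool that ignores ctx would keep running after the session has been deactivated, so checking ctx.Done() around blocking calls is usually worth the extra select.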
type Callbacks ¶
Callbacks are the realtime session event handlers (audio out, text, transcripts, tool calls, errors, session end).
type FuncTool ¶
type FuncTool struct {
ToolName string
ToolDescription string
ToolSchema Schema
Fn func(ctx context.Context, args map[string]any) (map[string]any, error)
}
FuncTool is a convenience Tool implementation backed by a closure. Useful for inline registration without defining a new struct.
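As an illustration of inline registration, the sketch below rebuilds a local stand-in for FuncTool so that it compiles without the package. The struct fields are copied from the declaration above; the Schema type (assumed to be a map), the method bodies, the nil-Fn guard, and the "echo" tool are assumptions for this example, not the package's implementation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Schema is assumed to be a map-based JSON Schema object; the real
// definition is not shown in this documentation.
type Schema map[string]any

// FuncTool mirrors the documented struct fields.
type FuncTool struct {
	ToolName        string
	ToolDescription string
	ToolSchema      Schema
	Fn              func(ctx context.Context, args map[string]any) (map[string]any, error)
}

// The accessors below simply return the corresponding field (assumed behavior).
func (f *FuncTool) Name() string        { return f.ToolName }
func (f *FuncTool) Description() string { return f.ToolDescription }
func (f *FuncTool) InputSchema() Schema { return f.ToolSchema }

// Invoke delegates to the closure. The nil check is an assumption.
func (f *FuncTool) Invoke(ctx context.Context, args map[string]any) (map[string]any, error) {
	if f.Fn == nil {
		return nil, errors.New("functool: Fn is nil")
	}
	return f.Fn(ctx, args)
}

// newEchoTool builds a hypothetical "echo" tool from a closure.
func newEchoTool() *FuncTool {
	return &FuncTool{
		ToolName:        "echo",
		ToolDescription: "Returns the text it was given.",
		ToolSchema: Schema{
			"type":       "object",
			"properties": map[string]any{"text": map[string]any{"type": "string"}},
			"required":   []string{"text"},
		},
		Fn: func(ctx context.Context, args map[string]any) (map[string]any, error) {
			text, ok := args["text"].(string)
			if !ok {
				return nil, errors.New("echo: text must be a string")
			}
			return map[string]any{"echo": text}, nil
		},
	}
}

// runEcho invokes the echo tool and returns the echoed value.
func runEcho(text string) (string, error) {
	out, err := newEchoTool().Invoke(context.Background(), map[string]any{"text": text})
	if err != nil {
		return "", err
	}
	return out["echo"].(string), nil
}

func main() {
	got, err := runEcho("hello")
	fmt.Println(got, err)
}
```

With the real package, a value like the one returned by newEchoTool would presumably be passed to ToolRegistry.Register.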
func (*FuncTool) Description ¶
func (*FuncTool) InputSchema ¶
type IdleConfig ¶
type IdleConfig = live.IdleConfig
IdleConfig configures the per-session idle reminder and the auto-deactivate timer.
type LifecycleHooks ¶
type LifecycleHooks struct {
OnSessionStart func(ctx context.Context, sc SessionContext)
OnUserMessage func(ctx context.Context, sc SessionContext, text string)
OnToolCall func(ctx context.Context, sc SessionContext, call ToolCall)
OnAgentMessage func(ctx context.Context, sc SessionContext, text string)
OnSessionEnd func(ctx context.Context, sc SessionContext)
}
LifecycleHooks are optional callbacks fired during the agent session. Each hook is invoked synchronously from the dispatch goroutine, so the caller should hand long-running work off to another goroutine. Any hook may be nil.
OnUserMessage and OnAgentMessage are called only when the underlying transcript event signals done=true. The text accumulated since the previous done event is delivered as the message.
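The accumulate-until-done behavior can be pictured with a small buffer. This is only a sketch of the idea: messageBuffer and its methods are hypothetical names, and agentkit's internal implementation may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// messageBuffer is a hypothetical helper that collects transcript fragments
// and delivers them as one message when done is signalled.
type messageBuffer struct {
	parts     strings.Builder
	onMessage func(text string) // may be nil, like any lifecycle hook
}

// add appends a fragment. When done is true it delivers the accumulated
// text to onMessage (if set) and resets the buffer for the next message.
func (b *messageBuffer) add(fragment string, done bool) {
	b.parts.WriteString(fragment)
	if !done {
		return
	}
	text := b.parts.String()
	b.parts.Reset()
	if b.onMessage != nil {
		b.onMessage(text)
	}
}

// collect feeds three transcript events through the buffer and returns the
// messages that were delivered.
func collect() []string {
	var delivered []string
	buf := &messageBuffer{
		onMessage: func(text string) { delivered = append(delivered, text) },
	}
	buf.add("turn on ", false)
	buf.add("the lights", true)
	buf.add("thanks", true)
	return delivered
}

func main() {
	for _, m := range collect() {
		fmt.Printf("%q\n", m)
	}
}
```

The buffer is not safe for concurrent use; the sketch assumes it is only touched from a single dispatch goroutine, matching the synchronous invocation described above.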
type LiveConfig ¶
type LiveConfig = live.LiveConfig
LiveConfig configures a realtime audio session (model, voice, prompts, locale, tools, policies).
type LiveProvider ¶
type LiveProvider = live.LiveProvider
LiveProvider is the WebSocket-backed realtime model adapter (Gemini Live, OpenAI Realtime, ...). Pass it to NewAgentSession.
type Memory ¶
type Memory interface {
Get(ctx context.Context, key string) (value string, ok bool, err error)
Set(ctx context.Context, key, value string) error
List(ctx context.Context, prefix string) ([]string, error)
Delete(ctx context.Context, key string) error
}
Memory is a swappable session-scoped key-value store. The default implementation is in-memory; consumers may provide Postgres-, Redis-, or vector-backed implementations.
Convention: keys are namespaced by session id using the prefix "{sessionID}/" so that List(sessionID + "/") yields all keys for one session. The InMemory implementation does not enforce this; it is a caller-side convention.
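To make the interface and the key convention concrete, here is a minimal sketch of a map-backed store. The Memory interface is copied from the declaration above; mapMemory, newMapMemory, and sessionKeys are hypothetical names, and the sorted order returned by List is an assumption made for deterministic output.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"strings"
	"sync"
)

// Memory is copied from the interface documented above.
type Memory interface {
	Get(ctx context.Context, key string) (value string, ok bool, err error)
	Set(ctx context.Context, key, value string) error
	List(ctx context.Context, prefix string) ([]string, error)
	Delete(ctx context.Context, key string) error
}

// mapMemory is a hypothetical map-backed Memory guarded by a mutex.
type mapMemory struct {
	mu   sync.RWMutex
	data map[string]string
}

func newMapMemory() *mapMemory { return &mapMemory{data: make(map[string]string)} }

func (m *mapMemory) Get(_ context.Context, key string) (string, bool, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	v, ok := m.data[key]
	return v, ok, nil
}

func (m *mapMemory) Set(_ context.Context, key, value string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.data[key] = value
	return nil
}

// List returns the keys that start with prefix, sorted (an assumption).
func (m *mapMemory) List(_ context.Context, prefix string) ([]string, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	var keys []string
	for k := range m.data {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys, nil
}

func (m *mapMemory) Delete(_ context.Context, key string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.data, key)
	return nil
}

// sessionKeys stores keys for two sessions and lists only those that follow
// the "{sessionID}/" convention for the given session.
func sessionKeys(sessionID string) ([]string, error) {
	ctx := context.Background()
	var mem Memory = newMapMemory()
	keys := []string{sessionID + "/user_name", sessionID + "/locale", "other-session/user_name"}
	for _, k := range keys {
		if err := mem.Set(ctx, k, "value"); err != nil {
			return nil, err
		}
	}
	return mem.List(ctx, sessionID+"/")
}

func main() {
	keys, err := sessionKeys("s1")
	fmt.Println(keys, err)
}
```

Since the store itself does not enforce the prefix, a small caller-side helper that builds keys from the session id keeps the convention in one place.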
type Schema ¶
Schema is a JSON Schema object describing a tool's input parameters. Use the standard JSON Schema vocabulary (type, properties, required, ...).
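For orientation, a tool that takes a single required string could be described as follows. The sketch assumes a schema can be expressed as a map[string]any; the concrete definition of Schema is not shown in this documentation, and weatherSchema and the "city" parameter are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// weatherSchema describes the arguments of a hypothetical weather tool using
// the standard JSON Schema keywords type, properties, and required.
func weatherSchema() map[string]any {
	return map[string]any{
		"type": "object",
		"properties": map[string]any{
			"city": map[string]any{
				"type":        "string",
				"description": "City name",
			},
		},
		"required": []string{"city"},
	}
}

// schemaJSON serializes the schema. encoding/json writes map keys in sorted
// order, so the output is deterministic.
func schemaJSON() (string, error) {
	b, err := json.Marshal(weatherSchema())
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := schemaJSON()
	fmt.Println(s, err)
}
```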
type SessionContext ¶
type SessionContext struct {
SessionID string
Memory Memory
Registry *ToolRegistry
}
SessionContext is passed to every lifecycle hook. It exposes the active session id, the bound tool registry, and the session memory store.
type Tool ¶
type Tool interface {
// Name is the unique identifier the model sees. Must match across
// ToolDefinition and ToolCall.Name.
Name() string
// Description is a short natural-language hint shown to the model so it
// can decide when to invoke this tool.
Description() string
// InputSchema is the JSON Schema for the tool's arguments. Returning
// nil is equivalent to "no parameters".
InputSchema() Schema
// Invoke runs the tool. The returned map is sent back to the model as
// the tool response payload. A non-nil error is converted to
// {"error": err.Error()} for the model.
Invoke(ctx context.Context, args map[string]any) (map[string]any, error)
}
Tool is a host-side function the realtime voice agent may invoke.
Implementations must be safe for concurrent calls; the agent runtime dispatches Invoke in a separate goroutine for each ToolCall.
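A sketch of what a concurrency-safe implementation might look like is given below. The Tool interface is copied from the declaration above; Schema is assumed to be a map type, counterTool and toPayload are hypothetical, and the float64 argument type assumes that args originate from decoded JSON.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// Schema is assumed to be a map-based JSON Schema object.
type Schema map[string]any

// Tool is copied from the interface documented above.
type Tool interface {
	Name() string
	Description() string
	InputSchema() Schema
	Invoke(ctx context.Context, args map[string]any) (map[string]any, error)
}

// counterTool is a hypothetical tool whose shared state is guarded by a
// mutex so that Invoke may run from many goroutines at once.
type counterTool struct {
	mu    sync.Mutex
	total int
}

func (c *counterTool) Name() string        { return "add_to_counter" }
func (c *counterTool) Description() string { return "Adds an amount to a running total." }
func (c *counterTool) InputSchema() Schema {
	return Schema{
		"type":       "object",
		"properties": map[string]any{"amount": map[string]any{"type": "number"}},
		"required":   []string{"amount"},
	}
}

func (c *counterTool) Invoke(_ context.Context, args map[string]any) (map[string]any, error) {
	amount, ok := args["amount"].(float64) // JSON numbers decode to float64
	if !ok {
		return nil, errors.New("amount must be a number")
	}
	c.mu.Lock()
	c.total += int(amount)
	total := c.total
	c.mu.Unlock()
	return map[string]any{"total": total}, nil
}

// toPayload mirrors the documented conversion of a non-nil error into
// {"error": err.Error()}.
func toPayload(result map[string]any, err error) map[string]any {
	if err != nil {
		return map[string]any{"error": err.Error()}
	}
	return result
}

// invokeConcurrently calls Invoke from n goroutines and returns the total.
func invokeConcurrently(t Tool, n int) int {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = t.Invoke(context.Background(), map[string]any{"amount": 1.0})
		}()
	}
	wg.Wait()
	out, _ := t.Invoke(context.Background(), map[string]any{"amount": 0.0})
	return out["total"].(int)
}

// badArgsError returns the error payload produced for invalid arguments.
func badArgsError() string {
	res, err := (&counterTool{}).Invoke(context.Background(), map[string]any{"amount": "x"})
	return toPayload(res, err)["error"].(string)
}

func main() {
	fmt.Println(invokeConcurrently(&counterTool{}, 50), badArgsError())
}
```

Without the mutex, the concurrent increments would race; running such code under `go test -race` is a cheap way to confirm a tool meets the concurrency requirement.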
type ToolBehavior ¶
type ToolBehavior = live.ToolBehavior
ToolBehavior controls whether the model waits for the tool result.
type ToolCall ¶
ToolCall is emitted by the model when it wants to invoke a host-side tool. The agentkit ToolRegistry resolves Name to a registered Tool and dispatches Args to its Invoke method.
type ToolDefinition ¶
type ToolDefinition = live.ToolDefinition
ToolDefinition describes a host-side tool to the realtime model. Built from a Tool by ToolRegistry.Definitions.
type ToolRegistry ¶
type ToolRegistry struct {
// contains filtered or unexported fields
}
ToolRegistry is a thread-safe collection of tools indexed by Name.
func (*ToolRegistry) All ¶
func (r *ToolRegistry) All() []Tool
All returns a snapshot of every registered tool, ordered by Name.
func (*ToolRegistry) Definitions ¶
func (r *ToolRegistry) Definitions() []ToolDefinition
Definitions converts every registered tool into a ToolDefinition suitable for placing in LiveConfig.Tools.
func (*ToolRegistry) Len ¶
func (r *ToolRegistry) Len() int
Len returns the number of registered tools.
func (*ToolRegistry) Lookup ¶
func (r *ToolRegistry) Lookup(name string) (Tool, bool)
Lookup returns the registered tool with the given name. The bool reports whether the tool exists.
func (*ToolRegistry) MustRegister ¶
func (r *ToolRegistry) MustRegister(t Tool)
MustRegister calls Register and panics on error. Convenient at init time.
func (*ToolRegistry) Register ¶
func (r *ToolRegistry) Register(t Tool) error
Register adds a tool to the registry. It returns ErrDuplicateTool if a tool with the same Name is already present, or ErrNilTool if the supplied Tool is nil or has an empty Name.
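The error contract of Register lends itself to errors.Is checks on the caller side. The sketch below uses a reduced stand-in registry so that it runs without the package: the sentinel errors reuse the documented messages, while namedTool, simpleTool, registry, and registerAll are hypothetical and do not reflect the actual ToolRegistry internals.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// Sentinel errors with the messages documented in the Variables section.
var (
	ErrDuplicateTool = errors.New("agentkit: tool already registered")
	ErrNilTool       = errors.New("agentkit: tool is nil or has empty name")
)

// namedTool is a reduced stand-in for Tool that only exposes Name.
type namedTool interface{ Name() string }

// simpleTool is a hypothetical tool identified by its string value.
type simpleTool string

func (s simpleTool) Name() string { return string(s) }

// registry is a hypothetical mutex-guarded collection indexed by name.
type registry struct {
	mu    sync.RWMutex
	tools map[string]namedTool
}

func newRegistry() *registry { return &registry{tools: make(map[string]namedTool)} }

// Register follows the documented contract: ErrNilTool for a nil tool or an
// empty name, ErrDuplicateTool when the name is already taken.
func (r *registry) Register(t namedTool) error {
	if t == nil || t.Name() == "" {
		return ErrNilTool
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.tools[t.Name()]; exists {
		return ErrDuplicateTool
	}
	r.tools[t.Name()] = t
	return nil
}

// Names returns a snapshot of the registered names, ordered by name.
func (r *registry) Names() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.tools))
	for name := range r.tools {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// registerAll registers each name and classifies failures with errors.Is.
func registerAll(names ...string) (registered []string, dupes, invalid int) {
	r := newRegistry()
	for _, n := range names {
		err := r.Register(simpleTool(n))
		switch {
		case errors.Is(err, ErrDuplicateTool):
			dupes++
		case errors.Is(err, ErrNilTool):
			invalid++
		}
	}
	return r.Names(), dupes, invalid
}

func main() {
	fmt.Println(registerAll("weather", "", "clock", "weather"))
}
```

Code that prefers to fail fast at startup can rely on MustRegister instead and let the panic surface the misconfiguration.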
type ToolResponse ¶
type ToolResponse = live.ToolResponse
ToolResponse carries the result of a host-side tool invocation back to the model.