agentkit

package
v0.40.7

This package is not in the latest version of its module.

Published: May 28, 2026 | License: Apache-2.0 | Imports: 10 | Imported by: 0

Documentation

Overview

Package agentkit provides a small Go harness for building SpeechKit Voice Agent hosts.

It wraps the realtime Voice Agent runtime with a session type, a thread-safe tool registry, lifecycle hooks, and a swappable session memory interface. Use it when embedding SpeechKit into an agent host that needs predictable tool registration and prompt/session lifecycle boundaries.

Constants

This section is empty.

Variables

var (
	// ErrDuplicateTool is returned by ToolRegistry.Register when a tool with
	// the same name is already registered.
	ErrDuplicateTool = errors.New("agentkit: tool already registered")

	// ErrUnknownTool is returned when a ToolCall references a name that is
	// not present in the registry. Callers normally see this surfaced as a
	// JSON {"error": "..."} response sent back to the model.
	ErrUnknownTool = errors.New("agentkit: unknown tool")

	// ErrNilTool is returned by ToolRegistry.Register when the supplied
	// Tool is nil or has an empty Name.
	ErrNilTool = errors.New("agentkit: tool is nil or has empty name")
)

Functions

This section is empty.

Types

type AgentSession

type AgentSession struct {
	// contains filtered or unexported fields
}

AgentSession wraps a live.Session with a tool registry, lifecycle hooks, and a session-scoped memory store. It is the primary entry point for callers building voice agents on top of SpeechKit.

Construction order:

  1. Build a LiveProvider (e.g. live_gemini.NewLiveProvider(...))
  2. Define your Callbacks (OnAudio for playback, OnError, ...)
  3. Build a ToolRegistry and Register each Tool
  4. Build LifecycleHooks (optional)
  5. NewAgentSession(provider, callbacks, registry, hooks, memory)
  6. Start(ctx, cfg, idleCfg)

AgentSession is safe for concurrent calls to SendAudio, SendText, and EndAudioStream. The underlying live.Session enforces single-session activation.

func NewAgentSession

func NewAgentSession(
	provider LiveProvider,
	userCallbacks Callbacks,
	registry *ToolRegistry,
	hooks LifecycleHooks,
	memory Memory,
) *AgentSession

NewAgentSession constructs an AgentSession.

userCallbacks may carry the consumer's own Callbacks (e.g. OnAudio for playback). agentkit wraps OnToolCall, OnInputTranscript, OnOutputTranscript and OnSessionEnd to drive tool dispatch and lifecycle hooks; the caller's originals are still invoked.

If registry is nil, an empty registry is used. If memory is nil, the default in-memory store is used.
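The callback wrapping described above can be pictured with a minimal, self-contained sketch. The `callbacks` struct, its single `OnSessionEnd` field, the handler signature, and the call order (internal handler first, then the caller's original) are all assumptions made for illustration; the real live.Callbacks fields and ordering are not shown in this documentation.

```go
package main

import "fmt"

// callbacks is a hypothetical, trimmed stand-in for live.Callbacks.
// The real field set and signatures are not documented on this page.
type callbacks struct {
	OnSessionEnd func(reason string)
}

// wrapSessionEnd returns a copy of user whose OnSessionEnd first runs the
// internal handler and then the caller's original handler, if one was set.
func wrapSessionEnd(user callbacks, internal func(reason string)) callbacks {
	orig := user.OnSessionEnd
	user.OnSessionEnd = func(reason string) {
		internal(reason)
		if orig != nil {
			orig(reason)
		}
	}
	return user
}

func main() {
	user := callbacks{
		OnSessionEnd: func(reason string) { fmt.Println("user handler:", reason) },
	}
	wrapped := wrapSessionEnd(user, func(reason string) {
		fmt.Println("lifecycle hook:", reason)
	})
	wrapped.OnSessionEnd("idle timeout")
}
```

The point of the pattern is that the caller never loses its own handler: the wrapper captures the original before replacing the field.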

func (*AgentSession) EndAudioStream

func (a *AgentSession) EndAudioStream() error

EndAudioStream signals the end of the current microphone stream.

func (*AgentSession) Memory

func (a *AgentSession) Memory() Memory

Memory returns the Memory store bound to this session.

func (*AgentSession) Registry

func (a *AgentSession) Registry() *ToolRegistry

Registry returns the ToolRegistry bound to this session.

func (*AgentSession) SendAudio

func (a *AgentSession) SendAudio(chunk []byte) error

SendAudio forwards a PCM audio chunk to the realtime model.

func (*AgentSession) SendText

func (a *AgentSession) SendText(text string) error

SendText injects a user text turn into the live session.

func (*AgentSession) SessionID

func (a *AgentSession) SessionID() string

SessionID returns the unique id assigned to this session.

func (*AgentSession) Start

func (a *AgentSession) Start(ctx context.Context, cfg LiveConfig, idleCfg IdleConfig) error

Start activates the underlying voice-agent session. It injects the registry's tool definitions into cfg.Tools (preserving any tools the caller already supplied) before forwarding to live.Session.Start.

OnSessionStart fires after a successful Start. If Start returns an error, no hook fires.
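One way to picture the tool injection performed by Start is the merge below. It is a sketch under stated assumptions: `toolDef` is a hypothetical stand-in for live.ToolDefinition with only a `Name` field, and skipping registry entries whose name the caller already supplied is a guess at what "preserving" means, not documented behavior.

```go
package main

import "fmt"

// toolDef is a hypothetical stand-in for live.ToolDefinition.
// Only a Name field is assumed here.
type toolDef struct {
	Name string
}

// mergeTools keeps caller-supplied definitions first and appends registry
// definitions whose names are not already present (an assumed policy).
func mergeTools(caller, fromRegistry []toolDef) []toolDef {
	merged := make([]toolDef, 0, len(caller)+len(fromRegistry))
	seen := make(map[string]bool, len(caller)+len(fromRegistry))
	for _, d := range caller {
		merged = append(merged, d)
		seen[d.Name] = true
	}
	for _, d := range fromRegistry {
		if seen[d.Name] {
			continue // the caller's definition wins
		}
		merged = append(merged, d)
		seen[d.Name] = true
	}
	return merged
}

func main() {
	caller := []toolDef{{Name: "search"}}
	registry := []toolDef{{Name: "search"}, {Name: "weather"}}
	for _, d := range mergeTools(caller, registry) {
		fmt.Println(d.Name)
	}
}
```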

func (*AgentSession) State

func (a *AgentSession) State() live.State

State returns the underlying session state.

func (*AgentSession) Stop

func (a *AgentSession) Stop()

Stop deactivates the underlying session. Pending tool dispatches observe the cancellation through the session context.
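Because cancellation reaches tools only through their context, a long-running tool body should watch ctx.Done(). The sketch below uses a function with the same signature as Tool.Invoke; `slowLookup` and `invokeCancelled` are hypothetical names, and an already-cancelled context stands in for the effect of Stop.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// slowLookup is a hypothetical tool body with the Tool.Invoke signature.
// It returns early when the context is cancelled.
func slowLookup(ctx context.Context, args map[string]any) (map[string]any, error) {
	select {
	case <-time.After(5 * time.Second): // simulated slow work
		return map[string]any{"status": "done"}, nil
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}

// invokeCancelled calls slowLookup with a context that is already
// cancelled, which stands in for a session that has been stopped.
func invokeCancelled() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	_, err := slowLookup(ctx, nil)
	return err
}

func main() {
	fmt.Println(invokeCancelled()) // prints "context canceled"
}
```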

type Callbacks

type Callbacks = live.Callbacks

Callbacks are the realtime session event handlers (audio out, text, transcripts, tool calls, errors, session end).

type FuncTool

type FuncTool struct {
	ToolName        string
	ToolDescription string
	ToolSchema      Schema
	Fn              func(ctx context.Context, args map[string]any) (map[string]any, error)
}

FuncTool is a convenience Tool implementation backed by a closure. Useful for inline registration without defining a new struct.
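A minimal sketch of inline registration with a closure follows. So that it compiles on its own, it re-declares Schema and FuncTool locally with the documented fields; the method bodies are assumed to be simple delegations, and the `echo` tool and the `invokeEcho` helper are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Local copies of the documented types so the sketch is self-contained.
type Schema = map[string]any

type FuncTool struct {
	ToolName        string
	ToolDescription string
	ToolSchema      Schema
	Fn              func(ctx context.Context, args map[string]any) (map[string]any, error)
}

// The method bodies below are assumed to be plain delegations.
func (f *FuncTool) Name() string        { return f.ToolName }
func (f *FuncTool) Description() string { return f.ToolDescription }
func (f *FuncTool) InputSchema() Schema { return f.ToolSchema }
func (f *FuncTool) Invoke(ctx context.Context, args map[string]any) (map[string]any, error) {
	return f.Fn(ctx, args)
}

// newEchoTool builds a hypothetical tool that returns its "text" argument.
func newEchoTool() *FuncTool {
	return &FuncTool{
		ToolName:        "echo",
		ToolDescription: "Repeats the supplied text back to the caller.",
		ToolSchema: Schema{
			"type":       "object",
			"properties": map[string]any{"text": map[string]any{"type": "string"}},
			"required":   []string{"text"},
		},
		Fn: func(_ context.Context, args map[string]any) (map[string]any, error) {
			text, ok := args["text"].(string)
			if !ok {
				return nil, errors.New("echo: missing string argument \"text\"")
			}
			return map[string]any{"echo": text}, nil
		},
	}
}

// invokeEcho runs the echo tool once with the given arguments.
func invokeEcho(args map[string]any) (map[string]any, error) {
	return newEchoTool().Invoke(context.Background(), args)
}

func main() {
	out, err := invokeEcho(map[string]any{"text": "hello"})
	fmt.Println(out, err)
}
```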

func (*FuncTool) Description

func (f *FuncTool) Description() string

func (*FuncTool) InputSchema

func (f *FuncTool) InputSchema() Schema

func (*FuncTool) Invoke

func (f *FuncTool) Invoke(ctx context.Context, args map[string]any) (map[string]any, error)

func (*FuncTool) Name

func (f *FuncTool) Name() string

type IdleConfig

type IdleConfig = live.IdleConfig

IdleConfig configures the per-session idle reminder and auto-deactivate timer.

type LifecycleHooks

type LifecycleHooks struct {
	OnSessionStart func(ctx context.Context, sc SessionContext)
	OnUserMessage  func(ctx context.Context, sc SessionContext, text string)
	OnToolCall     func(ctx context.Context, sc SessionContext, call ToolCall)
	OnAgentMessage func(ctx context.Context, sc SessionContext, text string)
	OnSessionEnd   func(ctx context.Context, sc SessionContext)
}

LifecycleHooks are optional callbacks fired during the agent session. Each hook is invoked synchronously from the dispatch goroutine, so the caller should move long-running work to another goroutine. Any hook may be nil.
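Since hooks run synchronously, one common way to keep them short is to hand the event to a worker goroutine through a buffered channel. The sketch below is an illustration only: `SessionContext` is trimmed to its `SessionID` field, and `event`, `enqueue`, and `newUserMessageHook` are hypothetical helpers. Dropping events when the queue is full is a design choice of this sketch, not package behavior.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// SessionContext is trimmed to the one field this sketch needs.
type SessionContext struct {
	SessionID string
}

// event is a hypothetical record handed to the background worker.
type event struct {
	sessionID string
	text      string
}

// enqueue performs a non-blocking send so the hook returns immediately.
// It reports false when the queue is full and the event is dropped.
func enqueue(queue chan<- event, ev event) bool {
	select {
	case queue <- ev:
		return true
	default:
		return false
	}
}

// newUserMessageHook returns a function with the documented OnUserMessage
// shape that only enqueues the message.
func newUserMessageHook(queue chan<- event) func(ctx context.Context, sc SessionContext, text string) {
	return func(_ context.Context, sc SessionContext, text string) {
		enqueue(queue, event{sessionID: sc.SessionID, text: text})
	}
}

func main() {
	queue := make(chan event, 16)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for ev := range queue {
			// Slow work (logging, persistence, ...) would run here.
			fmt.Printf("[%s] %s\n", ev.sessionID, ev.text)
		}
	}()

	hook := newUserMessageHook(queue)
	hook(context.Background(), SessionContext{SessionID: "s1"}, "hello")
	close(queue)
	wg.Wait()
}
```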

OnUserMessage and OnAgentMessage are called only when the underlying transcript event signals done=true. The text accumulated since the previous done event is delivered as the message.
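The accumulate-until-done behavior can be sketched as a small buffer. `transcriptBuffer` is a hypothetical helper written for this illustration and is not part of agentkit; the package's internal implementation may differ.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// transcriptBuffer is a hypothetical helper that collects transcript
// fragments until one arrives with done=true.
type transcriptBuffer struct {
	mu  sync.Mutex
	buf strings.Builder
}

// add appends a fragment. When done is true it returns the accumulated
// text and true, and resets the buffer for the next message.
func (t *transcriptBuffer) add(fragment string, done bool) (string, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.buf.WriteString(fragment)
	if !done {
		return "", false
	}
	msg := t.buf.String()
	t.buf.Reset()
	return msg, true
}

func main() {
	var tb transcriptBuffer
	fragments := []struct {
		text string
		done bool
	}{
		{"What is ", false},
		{"the weather ", false},
		{"today?", true},
	}
	for _, f := range fragments {
		if msg, ok := tb.add(f.text, f.done); ok {
			fmt.Printf("message: %q\n", msg)
		}
	}
}
```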

type LiveConfig

type LiveConfig = live.LiveConfig

LiveConfig configures a realtime audio session (model, voice, prompts, locale, tools, policies).

type LiveProvider

type LiveProvider = live.LiveProvider

LiveProvider is the WebSocket-backed realtime model adapter (Gemini Live, OpenAI Realtime, ...). Pass it to NewAgentSession.

type Memory

type Memory interface {
	Get(ctx context.Context, key string) (value string, ok bool, err error)
	Set(ctx context.Context, key, value string) error
	List(ctx context.Context, prefix string) ([]string, error)
	Delete(ctx context.Context, key string) error
}

Memory is a swappable session-scoped key-value store. The default implementation is in-memory; consumers may provide Postgres-, Redis-, or vector-backed implementations.

Convention: keys are namespaced by session id using the prefix "{sessionID}/" so that List(sessionID + "/") yields all keys for one session. The InMemory implementation does not enforce this; it is a caller-side convention.
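The key convention can be sketched with a small store. The Memory interface is re-declared locally so the example compiles on its own; `mapMemory`, `sessionKey`, and `demoKeys` are hypothetical and are not the package's InMemory implementation. Returning keys in sorted order from List is a choice of this sketch.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"strings"
	"sync"
)

// Memory is a local copy of the documented interface.
type Memory interface {
	Get(ctx context.Context, key string) (value string, ok bool, err error)
	Set(ctx context.Context, key, value string) error
	List(ctx context.Context, prefix string) ([]string, error)
	Delete(ctx context.Context, key string) error
}

// mapMemory is a hypothetical mutex-guarded map store.
type mapMemory struct {
	mu   sync.RWMutex
	data map[string]string
}

func newMapMemory() *mapMemory { return &mapMemory{data: make(map[string]string)} }

func (m *mapMemory) Get(_ context.Context, key string) (string, bool, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	v, ok := m.data[key]
	return v, ok, nil
}

func (m *mapMemory) Set(_ context.Context, key, value string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.data[key] = value
	return nil
}

// List returns the keys that start with prefix, sorted for stable output.
func (m *mapMemory) List(_ context.Context, prefix string) ([]string, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	var keys []string
	for k := range m.data {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys, nil
}

func (m *mapMemory) Delete(_ context.Context, key string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.data, key)
	return nil
}

// sessionKey applies the "{sessionID}/" namespacing convention.
func sessionKey(sessionID, name string) string { return sessionID + "/" + name }

// demoKeys stores keys for two sessions and lists those of session "s1".
func demoKeys() []string {
	ctx := context.Background()
	var m Memory = newMapMemory()
	_ = m.Set(ctx, sessionKey("s1", "lang"), "en")
	_ = m.Set(ctx, sessionKey("s1", "name"), "Ada")
	_ = m.Set(ctx, sessionKey("s10", "lang"), "de")
	keys, _ := m.List(ctx, "s1/")
	return keys
}

func main() {
	fmt.Println(demoKeys()) // prints "[s1/lang s1/name]"
}
```

Including the trailing slash in the prefix matters: listing with "s1/" does not pick up keys that belong to session "s10".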

func NewInMemory

func NewInMemory() Memory

NewInMemory returns a thread-safe in-memory Memory.

type Schema

type Schema = map[string]any

Schema is a JSON Schema object describing a tool's input parameters. Use the standard JSON Schema vocabulary (type, properties, required, ...).
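As an illustration, a Schema for a hypothetical weather tool might look like the following. The property names (`city`, `units`) are made up for this sketch; only the JSON Schema keywords themselves (`type`, `properties`, `required`, `enum`, `description`) are standard.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Schema is a local copy of the documented alias.
type Schema = map[string]any

// weatherSchema describes the arguments of a hypothetical weather tool:
// a required "city" string and an optional "units" enum.
func weatherSchema() Schema {
	return Schema{
		"type": "object",
		"properties": map[string]any{
			"city": map[string]any{
				"type":        "string",
				"description": "City name, for example Berlin",
			},
			"units": map[string]any{
				"type": "string",
				"enum": []string{"metric", "imperial"},
			},
		},
		"required": []string{"city"},
	}
}

func main() {
	// Print the schema as the JSON document a model would receive.
	b, err := json.MarshalIndent(weatherSchema(), "", "  ")
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(b))
}
```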

type SessionContext

type SessionContext struct {
	SessionID string
	Memory    Memory
	Registry  *ToolRegistry
}

SessionContext is passed to every lifecycle hook. It exposes the active session id, the bound tool registry, and the session memory store.

type Tool

type Tool interface {
	// Name is the unique identifier the model sees. Must match across
	// ToolDefinition and ToolCall.Name.
	Name() string

	// Description is a short natural-language hint shown to the model so it
	// can decide when to invoke this tool.
	Description() string

	// InputSchema is the JSON Schema for the tool's arguments. Returning
	// nil is equivalent to "no parameters".
	InputSchema() Schema

	// Invoke runs the tool. The returned map is sent back to the model as
	// the tool response payload. A non-nil error is converted to
	// {"error": err.Error()} for the model.
	Invoke(ctx context.Context, args map[string]any) (map[string]any, error)
}

Tool is a host-side function the realtime voice agent may invoke.

Implementations must be safe for concurrent calls; the agent runtime runs each Invoke in its own goroutine, one per ToolCall.
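The two contracts above (errors become an {"error": ...} payload, and Invoke may run concurrently) can be sketched as follows. `invokeFunc`, `respond`, `counter`, and the helper functions are hypothetical names for this illustration; the runtime's actual dispatch code is not shown in this documentation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// invokeFunc has the same shape as Tool.Invoke.
type invokeFunc func(ctx context.Context, args map[string]any) (map[string]any, error)

// respond runs fn and converts a non-nil error into the
// {"error": err.Error()} payload described for Tool.Invoke.
func respond(ctx context.Context, fn invokeFunc, args map[string]any) map[string]any {
	out, err := fn(ctx, args)
	if err != nil {
		return map[string]any{"error": err.Error()}
	}
	return out
}

// counter is a tool body that is safe for concurrent use because its
// shared state is guarded by a mutex.
type counter struct {
	mu sync.Mutex
	n  int
}

func (c *counter) invoke(_ context.Context, _ map[string]any) (map[string]any, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
	return map[string]any{"count": c.n}, nil
}

// runConcurrently dispatches one goroutine per call and returns the
// final count once every call has finished.
func runConcurrently(calls int) int {
	c := &counter{}
	var wg sync.WaitGroup
	for i := 0; i < calls; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			respond(context.Background(), c.invoke, nil)
		}()
	}
	wg.Wait()
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// failing always returns an error, to show the error payload.
func failing(_ context.Context, _ map[string]any) (map[string]any, error) {
	return nil, errors.New("upstream timeout")
}

func failingPayload() map[string]any {
	return respond(context.Background(), failing, nil)
}

func main() {
	fmt.Println(runConcurrently(50)) // prints "50"
	fmt.Println(failingPayload())    // prints "map[error:upstream timeout]"
}
```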

type ToolBehavior

type ToolBehavior = live.ToolBehavior

ToolBehavior controls whether the model waits for the tool result.

type ToolCall

type ToolCall = live.ToolCall

ToolCall is emitted by the model when it wants to invoke a host-side tool. The agentkit ToolRegistry resolves Name to a registered Tool and dispatches Args to its Invoke method.

type ToolDefinition

type ToolDefinition = live.ToolDefinition

ToolDefinition describes a host-side tool to the realtime model. Built from a Tool by ToolRegistry.Definitions.

type ToolRegistry

type ToolRegistry struct {
	// contains filtered or unexported fields
}

ToolRegistry is a thread-safe collection of tools indexed by Name.

func NewRegistry

func NewRegistry() *ToolRegistry

NewRegistry returns an empty ToolRegistry.

func (*ToolRegistry) All

func (r *ToolRegistry) All() []Tool

All returns a snapshot of every registered tool, ordered by Name.

func (*ToolRegistry) Definitions

func (r *ToolRegistry) Definitions() []ToolDefinition

Definitions converts every registered tool into a ToolDefinition suitable for placing in LiveConfig.Tools.

func (*ToolRegistry) Len

func (r *ToolRegistry) Len() int

Len returns the number of registered tools.

func (*ToolRegistry) Lookup

func (r *ToolRegistry) Lookup(name string) (Tool, bool)

Lookup returns the registered tool with the given name. The bool reports whether the tool exists.

func (*ToolRegistry) MustRegister

func (r *ToolRegistry) MustRegister(t Tool)

MustRegister calls Register and panics on error. Convenient at init time.

func (*ToolRegistry) Register

func (r *ToolRegistry) Register(t Tool) error

Register adds a tool to the registry. It returns ErrDuplicateTool if a tool with the same Name is already present, or ErrNilTool if the supplied Tool is nil or has an empty Name.
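The registration rules can be illustrated with a small stand-in registry. Everything here is a sketch: `registry`, `namedTool`, `stub`, `Names`, and the `isDuplicate` / `isNilTool` helpers are hypothetical, the sentinel errors are local copies of the documented ones, and the real ToolRegistry may be implemented differently. Callers would typically compare against the sentinels with errors.Is.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// Local copies of the documented sentinel errors.
var (
	ErrDuplicateTool = errors.New("agentkit: tool already registered")
	ErrNilTool       = errors.New("agentkit: tool is nil or has empty name")
)

// namedTool is a trimmed stand-in for Tool with only Name.
type namedTool interface {
	Name() string
}

// stub is a hypothetical tool whose only behavior is its name.
type stub string

func (s stub) Name() string { return string(s) }

// registry is a hypothetical mutex-guarded map keyed by tool name.
type registry struct {
	mu    sync.RWMutex
	tools map[string]namedTool
}

func newRegistry() *registry { return &registry{tools: make(map[string]namedTool)} }

// Register follows the documented rules: nil or unnamed tools are
// rejected with ErrNilTool, repeated names with ErrDuplicateTool.
func (r *registry) Register(t namedTool) error {
	if t == nil || t.Name() == "" {
		return ErrNilTool
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.tools[t.Name()]; exists {
		return ErrDuplicateTool
	}
	r.tools[t.Name()] = t
	return nil
}

// Names returns a snapshot of the registered names in sorted order.
func (r *registry) Names() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.tools))
	for name := range r.tools {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func isDuplicate(err error) bool { return errors.Is(err, ErrDuplicateTool) }
func isNilTool(err error) bool   { return errors.Is(err, ErrNilTool) }

func main() {
	r := newRegistry()
	fmt.Println(r.Register(stub("weather")))              // prints "<nil>"
	fmt.Println(isDuplicate(r.Register(stub("weather")))) // prints "true"
	fmt.Println(isNilTool(r.Register(stub(""))))          // prints "true"
	fmt.Println(r.Names())                                // prints "[weather]"
}
```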

type ToolResponse

type ToolResponse = live.ToolResponse

ToolResponse carries the result of a host-side tool invocation back to the model.
