Documentation ¶
Overview ¶
Package providers implements multi-LLM provider support with unified interfaces.
This package provides a common abstraction for predict-based LLM providers including OpenAI, Anthropic Claude, and Google Gemini. It handles:
- Predict completion requests with streaming support
- Tool/function calling with provider-specific formats
- Cost tracking and token usage calculation
- Rate limiting and error handling
All providers implement the Provider interface for basic prediction and the ToolSupport interface for function calling.
Index ¶
- Constants
- func CheckHTTPError(resp *http.Response) error
- func ExtractOrderedEmbeddings[T any](data []T, getIndex func(T) int, getEmbedding func(T) []float32, ...) ([][]float32, error)
- func HasAudioSupport(p Provider) bool
- func HasImageSupport(p Provider) bool
- func HasVideoSupport(p Provider) bool
- func IsFormatSupported(p Provider, contentType, mimeType string) bool
- func IsValidationAbort(err error) bool
- func LoadFileAsBase64(filePath string) (string, error) deprecated
- func LogEmbeddingRequest(provider, model string, textCount int, start time.Time)
- func LogEmbeddingRequestWithTokens(provider, model string, textCount, tokens int, start time.Time)
- func MarshalRequest(req any) ([]byte, error)
- func RegisterProviderFactory(providerType string, factory ProviderFactory)
- func SetErrorResponse(predictResp *PredictionResponse, respBody []byte, start time.Time)
- func StringPtr(s string) *string
- func SupportsMultimodal(p Provider) bool
- func UnmarshalJSON(respBody []byte, v interface{}, predictResp *PredictionResponse, ...) error
- func UnmarshalResponse(body []byte, resp any) error
- func ValidateMultimodalMessage(p Provider, msg types.Message) error
- func ValidateMultimodalRequest(p MultimodalSupport, req PredictionRequest) error
- type AudioStreamingCapabilities
- type BaseEmbeddingProvider
- func (b *BaseEmbeddingProvider) DoEmbeddingRequest(ctx context.Context, cfg HTTPRequestConfig) ([]byte, error)
- func (b *BaseEmbeddingProvider) EmbedWithEmptyCheck(ctx context.Context, req EmbeddingRequest, embedFn EmbedFunc) (EmbeddingResponse, error)
- func (b *BaseEmbeddingProvider) EmbeddingDimensions() int
- func (b *BaseEmbeddingProvider) EmptyResponseForModel(model string) EmbeddingResponse
- func (b *BaseEmbeddingProvider) HandleEmptyRequest(req EmbeddingRequest) (EmbeddingResponse, bool)
- func (b *BaseEmbeddingProvider) ID() string
- func (b *BaseEmbeddingProvider) MaxBatchSize() int
- func (b *BaseEmbeddingProvider) Model() string
- func (b *BaseEmbeddingProvider) ResolveModel(reqModel string) string
- type BaseProvider
- type EmbedFunc
- type EmbeddingProvider
- type EmbeddingRequest
- type EmbeddingResponse
- type EmbeddingUsage
- type ExecutionResult
- type HTTPRequestConfig
- type ImageDetail
- type MediaLoader
- type MediaLoaderConfig
- type MultimodalCapabilities
- type MultimodalSupport
- type MultimodalToolSupport
- type PredictionRequest
- type PredictionResponse
- type Pricing
- type Provider
- type ProviderDefaults
- type ProviderFactory
- type ProviderSpec
- type Registry
- type SSEScanner
- type StreamChunk
- type StreamEvent
- type StreamInputSession
- type StreamInputSupport
- type StreamObserver
- type StreamingCapabilities
- type StreamingInputConfig
- type StreamingToolDefinition
- type ToolDescriptor
- type ToolResponse
- type ToolResponseSupport
- type ToolResult
- type ToolSupport
- type UnsupportedContentError
- type UnsupportedProviderError
- type ValidationAbortError
- type VideoResolution
- type VideoStreamingCapabilities
Constants ¶
const (
	ContentTypeHeader   = "Content-Type"
	AuthorizationHeader = "Authorization"
	ApplicationJSON     = "application/json"
	BearerPrefix        = "Bearer "
)
Common HTTP constants for embedding providers.
Variables ¶
This section is empty.
Functions ¶
func CheckHTTPError ¶ added in v1.1.0
CheckHTTPError checks whether an HTTP response indicates an error and, if so, returns a formatted error that includes the response body.
func ExtractOrderedEmbeddings ¶ added in v1.1.6
func ExtractOrderedEmbeddings[T any]( data []T, getIndex func(T) int, getEmbedding func(T) []float32, expectedCount int, ) ([][]float32, error)
ExtractOrderedEmbeddings extracts embeddings from indexed response data and places them in the correct order. Returns an error if count doesn't match.
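The reordering described above can be sketched as follows. This is an illustrative stand-in written for this page, not the package's implementation: the `item` type and the `orderEmbeddings` name are hypothetical, and the out-of-range check is an assumption about reasonable behavior rather than something the documentation states.

```go
package main

import "fmt"

// item is a hypothetical response element; real provider responses will differ.
type item struct {
	Index     int
	Embedding []float32
}

// orderEmbeddings is an illustrative stand-in for the documented helper.
// It places each embedding at the slot given by its index and rejects
// mismatched counts or out-of-range indices.
func orderEmbeddings[T any](
	data []T,
	getIndex func(T) int,
	getEmbedding func(T) []float32,
	expectedCount int,
) ([][]float32, error) {
	if len(data) != expectedCount {
		return nil, fmt.Errorf("expected %d embeddings, got %d", expectedCount, len(data))
	}
	out := make([][]float32, expectedCount)
	for _, d := range data {
		i := getIndex(d)
		if i < 0 || i >= expectedCount {
			return nil, fmt.Errorf("embedding index %d out of range", i)
		}
		out[i] = getEmbedding(d)
	}
	return out, nil
}

func main() {
	// The API returned the items out of order; the index field restores it.
	data := []item{
		{Index: 1, Embedding: []float32{0.2}},
		{Index: 0, Embedding: []float32{0.1}},
	}
	ordered, err := orderEmbeddings(data,
		func(it item) int { return it.Index },
		func(it item) []float32 { return it.Embedding },
		2)
	fmt.Println(ordered, err)
}
```

Passing accessor functions rather than requiring an interface lets each provider keep its own response struct.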
func HasAudioSupport ¶ added in v1.1.0
HasAudioSupport checks if a provider supports audio inputs
func HasImageSupport ¶ added in v1.1.0
HasImageSupport checks if a provider supports image inputs
func HasVideoSupport ¶ added in v1.1.0
HasVideoSupport checks if a provider supports video inputs
func IsFormatSupported ¶ added in v1.1.0
IsFormatSupported checks if a provider supports a specific media format (MIME type)
func IsValidationAbort ¶
IsValidationAbort checks if an error is a validation abort
func LoadFileAsBase64 ¶ deprecated added in v1.1.0
LoadFileAsBase64 reads a file and returns its content as a base64-encoded string.
Deprecated: Use MediaLoader.GetBase64Data instead for better functionality including storage reference support, URL loading, and proper context handling.
This function is kept for backward compatibility but will be removed in a future version. It now delegates to the new MediaLoader implementation.
func LogEmbeddingRequest ¶ added in v1.1.6
LogEmbeddingRequest logs a completed embedding request with common fields.
func LogEmbeddingRequestWithTokens ¶ added in v1.1.6
LogEmbeddingRequestWithTokens logs a completed embedding request with token count.
func MarshalRequest ¶ added in v1.1.6
MarshalRequest marshals a request body to JSON with standardized error handling.
func RegisterProviderFactory ¶ added in v1.1.0
func RegisterProviderFactory(providerType string, factory ProviderFactory)
RegisterProviderFactory registers a factory function for a provider type
func SetErrorResponse ¶ added in v1.1.0
func SetErrorResponse(predictResp *PredictionResponse, respBody []byte, start time.Time)
SetErrorResponse sets latency and raw body on error responses
func StringPtr ¶ added in v1.1.0
StringPtr is a helper function that returns a pointer to a string. This is commonly used across provider implementations for optional fields.
func SupportsMultimodal ¶ added in v1.1.0
SupportsMultimodal checks if a provider implements multimodal support
func UnmarshalJSON ¶ added in v1.1.0
func UnmarshalJSON(respBody []byte, v interface{}, predictResp *PredictionResponse, start time.Time) error
UnmarshalJSON unmarshals JSON with error recovery that sets latency and raw response
func UnmarshalResponse ¶ added in v1.1.6
UnmarshalResponse unmarshals a response body from JSON with standardized error handling.
func ValidateMultimodalMessage ¶ added in v1.1.0
ValidateMultimodalMessage checks if a message's multimodal content is supported by the provider
func ValidateMultimodalRequest ¶ added in v1.1.0
func ValidateMultimodalRequest(p MultimodalSupport, req PredictionRequest) error
ValidateMultimodalRequest validates all messages in a predict request for multimodal compatibility. This is a helper function to reduce duplication across provider implementations.
Types ¶
type AudioStreamingCapabilities ¶ added in v1.1.0
type AudioStreamingCapabilities struct {
// SupportedEncodings lists supported audio encodings
// Common values: "pcm", "opus", "mp3", "aac"
SupportedEncodings []string `json:"supported_encodings"`
// SupportedSampleRates lists supported sample rates in Hz
// Common values: 8000, 16000, 24000, 44100, 48000
SupportedSampleRates []int `json:"supported_sample_rates"`
// SupportedChannels lists supported channel counts
// Common values: 1 (mono), 2 (stereo)
SupportedChannels []int `json:"supported_channels"`
// SupportedBitDepths lists supported bit depths
// Common values: 16, 24, 32
SupportedBitDepths []int `json:"supported_bit_depths,omitempty"`
// PreferredEncoding is the recommended encoding for best quality/latency
PreferredEncoding string `json:"preferred_encoding"`
// PreferredSampleRate is the recommended sample rate
PreferredSampleRate int `json:"preferred_sample_rate"`
}
AudioStreamingCapabilities describes audio streaming support.
type BaseEmbeddingProvider ¶ added in v1.1.6
type BaseEmbeddingProvider struct {
ProviderModel string
BaseURL string
APIKey string
HTTPClient *http.Client
Dimensions int
ProviderID string
BatchSize int
}
BaseEmbeddingProvider provides common functionality for embedding providers. Embed this struct in provider-specific implementations to reduce duplication.
func NewBaseEmbeddingProvider ¶ added in v1.1.6
func NewBaseEmbeddingProvider( providerID, defaultModel, defaultBaseURL string, defaultDimensions, defaultBatchSize int, defaultTimeout time.Duration, ) *BaseEmbeddingProvider
NewBaseEmbeddingProvider creates a base embedding provider with defaults.
func (*BaseEmbeddingProvider) DoEmbeddingRequest ¶ added in v1.1.6
func (b *BaseEmbeddingProvider) DoEmbeddingRequest( ctx context.Context, cfg HTTPRequestConfig, ) ([]byte, error)
DoEmbeddingRequest performs a common HTTP POST request for embeddings. Returns the response body and any error.
func (*BaseEmbeddingProvider) EmbedWithEmptyCheck ¶ added in v1.1.6
func (b *BaseEmbeddingProvider) EmbedWithEmptyCheck( ctx context.Context, req EmbeddingRequest, embedFn EmbedFunc, ) (EmbeddingResponse, error)
EmbedWithEmptyCheck wraps embedding logic with empty request handling.
func (*BaseEmbeddingProvider) EmbeddingDimensions ¶ added in v1.1.6
func (b *BaseEmbeddingProvider) EmbeddingDimensions() int
EmbeddingDimensions returns the dimensionality of embedding vectors.
func (*BaseEmbeddingProvider) EmptyResponseForModel ¶ added in v1.1.6
func (b *BaseEmbeddingProvider) EmptyResponseForModel(model string) EmbeddingResponse
EmptyResponseForModel returns an empty EmbeddingResponse with the given model. Use this for handling empty input cases.
func (*BaseEmbeddingProvider) HandleEmptyRequest ¶ added in v1.1.6
func (b *BaseEmbeddingProvider) HandleEmptyRequest( req EmbeddingRequest, ) (EmbeddingResponse, bool)
HandleEmptyRequest checks if the request has no texts and returns early if so. Returns (response, true) if empty, (zero, false) if not empty.
func (*BaseEmbeddingProvider) ID ¶ added in v1.1.6
func (b *BaseEmbeddingProvider) ID() string
ID returns the provider identifier.
func (*BaseEmbeddingProvider) MaxBatchSize ¶ added in v1.1.6
func (b *BaseEmbeddingProvider) MaxBatchSize() int
MaxBatchSize returns the maximum texts per single API request.
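A caller that wants to respect this limit might split its inputs before calling Embed. The sketch below is one way to do that; `splitBatches` is a hypothetical helper name, and treating a non-positive limit as "no limit" is an assumption made for the example.

```go
package main

import "fmt"

// splitBatches cuts texts into consecutive slices of at most maxBatch items.
// A non-positive maxBatch is treated as "no limit" (an assumption of this sketch).
func splitBatches(texts []string, maxBatch int) [][]string {
	if len(texts) == 0 {
		return nil
	}
	if maxBatch <= 0 {
		return [][]string{texts}
	}
	var batches [][]string
	for start := 0; start < len(texts); start += maxBatch {
		end := start + maxBatch
		if end > len(texts) {
			end = len(texts)
		}
		batches = append(batches, texts[start:end])
	}
	return batches
}

func main() {
	for i, b := range splitBatches([]string{"a", "b", "c", "d", "e"}, 2) {
		fmt.Println(i, b)
	}
}
```

Because the batches are consecutive, concatenating the per-batch results in order preserves the one-vector-per-input ordering.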
func (*BaseEmbeddingProvider) Model ¶ added in v1.1.6
func (b *BaseEmbeddingProvider) Model() string
Model returns the current embedding model.
func (*BaseEmbeddingProvider) ResolveModel ¶ added in v1.1.6
func (b *BaseEmbeddingProvider) ResolveModel(reqModel string) string
ResolveModel returns the model to use, preferring the request model over the default.
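The embedding pattern these methods are meant for might look like the sketch below. Every type here (`request`, `response`, `base`, `fakeProvider`) is a local stand-in that only mirrors the documented shapes, and the placeholder vectors are not real embeddings; the package's own behavior may differ in details such as what an empty response contains.

```go
package main

import "fmt"

// Local stand-ins that mirror the documented shapes; not the package's own types.
type request struct {
	Texts []string
	Model string
}

type response struct {
	Embeddings [][]float32
	Model      string
}

type base struct {
	ProviderModel string
}

// resolveModel prefers the per-request model over the provider default.
func (b *base) resolveModel(reqModel string) string {
	if reqModel != "" {
		return reqModel
	}
	return b.ProviderModel
}

// handleEmpty returns (response, true) when there is nothing to embed.
func (b *base) handleEmpty(req request) (response, bool) {
	if len(req.Texts) == 0 {
		return response{Embeddings: [][]float32{}, Model: b.resolveModel(req.Model)}, true
	}
	return response{}, false
}

// fakeProvider embeds base, so it inherits both helpers.
type fakeProvider struct {
	base
}

// Embed short-circuits on empty input and otherwise returns placeholder vectors.
func (p *fakeProvider) Embed(req request) (response, error) {
	if resp, empty := p.handleEmpty(req); empty {
		return resp, nil
	}
	vecs := make([][]float32, len(req.Texts))
	for i, t := range req.Texts {
		vecs[i] = []float32{float32(len(t))} // placeholder, not a real embedding
	}
	return response{Embeddings: vecs, Model: p.resolveModel(req.Model)}, nil
}

func main() {
	p := &fakeProvider{base: base{ProviderModel: "default-model"}}
	empty, _ := p.Embed(request{})
	full, _ := p.Embed(request{Texts: []string{"hi"}, Model: "override"})
	fmt.Println(empty.Model, len(empty.Embeddings), full.Model, len(full.Embeddings))
}
```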
type BaseProvider ¶ added in v1.1.0
type BaseProvider struct {
// contains filtered or unexported fields
}
BaseProvider provides common functionality shared across all provider implementations. It should be embedded in concrete provider structs to avoid code duplication.
func NewBaseProvider ¶ added in v1.1.0
func NewBaseProvider(id string, includeRawOutput bool, client *http.Client) BaseProvider
NewBaseProvider creates a new BaseProvider with common fields
func NewBaseProviderWithAPIKey ¶ added in v1.1.0
func NewBaseProviderWithAPIKey(id string, includeRawOutput bool, primaryKey, fallbackKey string) (provider BaseProvider, apiKey string)
NewBaseProviderWithAPIKey creates a BaseProvider and retrieves the API key from the environment. It tries the primary key first, then falls back to the secondary key if the primary is empty.
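The primary/fallback lookup can be sketched in isolation. This assumes that primaryKey and fallbackKey name environment variables, which is how the description reads but is not spelled out; `resolveAPIKey` and both variable names in `main` are invented for the example.

```go
package main

import (
	"fmt"
	"os"
)

// resolveAPIKey looks up primaryKey first and falls back to fallbackKey when
// the primary value is empty. lookup is typically os.Getenv; taking it as a
// parameter keeps the function easy to test.
func resolveAPIKey(lookup func(string) string, primaryKey, fallbackKey string) string {
	if v := lookup(primaryKey); v != "" {
		return v
	}
	return lookup(fallbackKey)
}

func main() {
	// Both variable names are invented for this example.
	os.Setenv("EXAMPLE_FALLBACK_API_KEY", "key-from-fallback")
	fmt.Println(resolveAPIKey(os.Getenv, "EXAMPLE_PRIMARY_API_KEY", "EXAMPLE_FALLBACK_API_KEY"))
}
```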
func (*BaseProvider) Close ¶ added in v1.1.0
func (b *BaseProvider) Close() error
Close closes the HTTP client's idle connections
func (*BaseProvider) GetHTTPClient ¶ added in v1.1.0
func (b *BaseProvider) GetHTTPClient() *http.Client
GetHTTPClient returns the underlying HTTP client for provider-specific use
func (*BaseProvider) ID ¶ added in v1.1.0
func (b *BaseProvider) ID() string
ID returns the provider ID
func (*BaseProvider) ShouldIncludeRawOutput ¶ added in v1.1.0
func (b *BaseProvider) ShouldIncludeRawOutput() bool
ShouldIncludeRawOutput returns whether to include raw API responses in output
func (*BaseProvider) SupportsStreaming ¶ added in v1.1.0
func (b *BaseProvider) SupportsStreaming() bool
SupportsStreaming returns true by default (can be overridden by providers that don't support streaming)
type EmbeddingProvider ¶ added in v1.1.6
type EmbeddingProvider interface {
// Embed generates embeddings for the given texts.
// The response contains one embedding vector per input text, in the same order.
// Implementations should handle batching internally if the request exceeds MaxBatchSize.
Embed(ctx context.Context, req EmbeddingRequest) (EmbeddingResponse, error)
// EmbeddingDimensions returns the dimensionality of embedding vectors.
// Common values: 1536 (OpenAI ada-002/3-small), 768 (Gemini), 3072 (OpenAI 3-large)
EmbeddingDimensions() int
// MaxBatchSize returns the maximum number of texts per single API request.
// Callers should batch requests appropriately, or rely on the provider
// to handle splitting internally.
MaxBatchSize() int
// ID returns the provider identifier (e.g., "openai-embedding", "gemini-embedding")
ID() string
}
EmbeddingProvider generates text embeddings for semantic similarity operations. Implementations exist for OpenAI, Gemini, and other embedding APIs.
Embeddings are dense vector representations of text that capture semantic meaning. Similar texts will have embeddings with high cosine similarity scores.
Example usage:
provider, _ := openai.NewEmbeddingProvider()
resp, err := provider.Embed(ctx, providers.EmbeddingRequest{
	Texts: []string{"Hello world", "Hi there"},
})
if err != nil {
	return err
}
similarity := CosineSimilarity(resp.Embeddings[0], resp.Embeddings[1])
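The usage example calls CosineSimilarity, which does not appear in this package's index. A minimal version might look like the following; the name `cosineSimilarity` and the choice to return 0 for mismatched or zero-length vectors are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns dot(a, b) / (|a| * |b|).
// It returns 0 when the lengths differ, the vectors are empty, or either
// vector has zero magnitude.
func cosineSimilarity(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	fmt.Printf("%.3f\n", cosineSimilarity([]float32{1, 0}, []float32{1, 0}))
	fmt.Printf("%.3f\n", cosineSimilarity([]float32{1, 0}, []float32{0, 1}))
}
```

Accumulating in float64 avoids some rounding loss when the float32 vectors have many dimensions.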
type EmbeddingRequest ¶ added in v1.1.6
type EmbeddingRequest struct {
// Texts to embed (batched for efficiency)
Texts []string
// Model override for embedding model (optional, uses provider default if empty)
Model string
}
EmbeddingRequest represents a request for text embeddings.
type EmbeddingResponse ¶ added in v1.1.6
type EmbeddingResponse struct {
// Embeddings contains one vector per input text, in the same order
Embeddings [][]float32
// Model is the model that was used for embedding
Model string
// Usage contains token consumption information (optional)
Usage *EmbeddingUsage
}
EmbeddingResponse contains the embedding vectors from a provider.
type EmbeddingUsage ¶ added in v1.1.6
type EmbeddingUsage struct {
// TotalTokens is the total number of tokens processed
TotalTokens int
}
EmbeddingUsage tracks token consumption for embedding requests.
type ExecutionResult ¶
type ExecutionResult interface{}
ExecutionResult is a forward declaration to avoid circular import.
type HTTPRequestConfig ¶ added in v1.1.6
type HTTPRequestConfig struct {
URL string
Body []byte
UseAPIKey bool // If true, adds Authorization: Bearer <APIKey> header
ContentType string // Defaults to application/json
}
HTTPRequestConfig configures how to make an HTTP request.
type ImageDetail ¶ added in v1.1.0
type ImageDetail string
ImageDetail specifies the level of detail for image processing
const (
	ImageDetailLow  ImageDetail = "low"  // Faster, less detailed analysis
	ImageDetailHigh ImageDetail = "high" // Slower, more detailed analysis
	ImageDetailAuto ImageDetail = "auto" // Provider chooses automatically
)
Image detail levels for multimodal processing.
type MediaLoader ¶ added in v1.1.2
type MediaLoader struct {
// contains filtered or unexported fields
}
MediaLoader handles loading media content from various sources (inline data, files, URLs, storage). It provides a unified interface for providers to access media regardless of the source.
func NewMediaLoader ¶ added in v1.1.2
func NewMediaLoader(config MediaLoaderConfig) *MediaLoader
NewMediaLoader creates a new MediaLoader with the given configuration.
func (*MediaLoader) GetBase64Data ¶ added in v1.1.2
func (ml *MediaLoader) GetBase64Data(ctx context.Context, media *types.MediaContent) (string, error)
GetBase64Data loads media content and returns it as base64-encoded data. It handles all media sources: inline data, file paths, URLs, and storage references.
type MediaLoaderConfig ¶ added in v1.1.2
type MediaLoaderConfig struct {
// StorageService is optional - required only for loading from storage references
StorageService storage.MediaStorageService
// HTTPTimeout for URL fetching (default: 30s)
HTTPTimeout time.Duration
// MaxURLSizeBytes is the maximum size for URL-based media (default: 50MB)
MaxURLSizeBytes int64
}
MediaLoaderConfig configures the MediaLoader behavior.
type MultimodalCapabilities ¶ added in v1.1.0
type MultimodalCapabilities struct {
SupportsImages bool // Provider can process image inputs
SupportsAudio bool // Provider can process audio inputs
SupportsVideo bool // Provider can process video inputs
ImageFormats []string // Supported image MIME types (e.g., "image/jpeg", "image/png")
AudioFormats []string // Supported audio MIME types (e.g., "audio/mpeg", "audio/wav")
VideoFormats []string // Supported video MIME types (e.g., "video/mp4")
MaxImageSizeMB int // Maximum image size in megabytes (0 = unlimited/unknown)
MaxAudioSizeMB int // Maximum audio size in megabytes (0 = unlimited/unknown)
MaxVideoSizeMB int // Maximum video size in megabytes (0 = unlimited/unknown)
}
MultimodalCapabilities describes what types of multimodal content a provider supports
type MultimodalSupport ¶ added in v1.1.0
type MultimodalSupport interface {
Provider // Extends the base Provider interface
// GetMultimodalCapabilities returns what types of multimodal content this provider supports
GetMultimodalCapabilities() MultimodalCapabilities
// PredictMultimodal performs a predict request with multimodal message content
// Messages in the request can contain Parts with images, audio, or video
PredictMultimodal(ctx context.Context, req PredictionRequest) (PredictionResponse, error)
// PredictMultimodalStream performs a streaming predict request with multimodal content
PredictMultimodalStream(ctx context.Context, req PredictionRequest) (<-chan StreamChunk, error)
}
MultimodalSupport interface for providers that support multimodal inputs
func GetMultimodalProvider ¶ added in v1.1.0
func GetMultimodalProvider(p Provider) MultimodalSupport
GetMultimodalProvider safely casts a provider to MultimodalSupport. Returns nil if the provider doesn't support multimodal content.
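A safe cast of this kind is usually a type assertion with the comma-ok form. The sketch below uses cut-down local interfaces (`provider`, `multimodal`) in place of Provider and MultimodalSupport, so it shows the pattern rather than the package's code; `hasImageSupport` likewise only illustrates how a helper such as HasImageSupport could be layered on top.

```go
package main

import "fmt"

// provider and multimodal are cut-down stand-ins for Provider and MultimodalSupport.
type provider interface {
	ID() string
}

type capabilities struct {
	SupportsImages bool
}

type multimodal interface {
	provider
	GetMultimodalCapabilities() capabilities
}

type textOnly struct{}

func (textOnly) ID() string { return "text-only" }

type vision struct{}

func (vision) ID() string { return "vision" }
func (vision) GetMultimodalCapabilities() capabilities {
	return capabilities{SupportsImages: true}
}

// asMultimodal returns p as a multimodal provider, or nil if p lacks the extra methods.
func asMultimodal(p provider) multimodal {
	if mm, ok := p.(multimodal); ok {
		return mm
	}
	return nil
}

// hasImageSupport combines the assertion with a capability lookup.
func hasImageSupport(p provider) bool {
	mm := asMultimodal(p)
	return mm != nil && mm.GetMultimodalCapabilities().SupportsImages
}

func main() {
	fmt.Println(hasImageSupport(textOnly{}), hasImageSupport(vision{}))
}
```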
type MultimodalToolSupport ¶ added in v1.1.0
type MultimodalToolSupport interface {
MultimodalSupport // Extends multimodal support
ToolSupport // Extends tool support
// PredictMultimodalWithTools performs a predict request with both multimodal content and tools
PredictMultimodalWithTools(ctx context.Context, req PredictionRequest, tools interface{}, toolChoice string) (PredictionResponse, []types.MessageToolCall, error)
}
MultimodalToolSupport interface for providers that support both multimodal and tools
type PredictionRequest ¶ added in v1.1.0
type PredictionRequest struct {
System string `json:"system"`
Messages []types.Message `json:"messages"`
Temperature float32 `json:"temperature"`
TopP float32 `json:"top_p"`
MaxTokens int `json:"max_tokens"`
Seed *int `json:"seed,omitempty"`
Metadata map[string]interface{} `json:"metadata,omitempty"` // Optional metadata for provider-specific context
}
PredictionRequest represents a request to a predict provider
type PredictionResponse ¶ added in v1.1.0
type PredictionResponse struct {
Content string `json:"content"`
Parts []types.ContentPart `json:"parts,omitempty"` // Multimodal content parts (text, image, audio, video)
CostInfo *types.CostInfo `json:"cost_info,omitempty"` // Cost breakdown for this response (includes token counts)
Latency time.Duration `json:"latency"`
Raw []byte `json:"raw,omitempty"`
RawRequest interface{} `json:"raw_request,omitempty"` // Raw API request (for debugging)
ToolCalls []types.MessageToolCall `json:"tool_calls,omitempty"` // Tools called in this response
}
PredictionResponse represents a response from a predict provider
type Provider ¶
type Provider interface {
ID() string
Predict(ctx context.Context, req PredictionRequest) (PredictionResponse, error)
// Streaming support
PredictStream(ctx context.Context, req PredictionRequest) (<-chan StreamChunk, error)
SupportsStreaming() bool
ShouldIncludeRawOutput() bool
Close() error // Close cleans up provider resources (e.g., HTTP connections)
// CalculateCost calculates cost breakdown for given token counts
CalculateCost(inputTokens, outputTokens, cachedTokens int) types.CostInfo
}
Provider interface defines the contract for predict providers
func CreateProviderFromSpec ¶
func CreateProviderFromSpec(spec ProviderSpec) (Provider, error)
CreateProviderFromSpec creates a provider implementation from a spec. Returns an error if the provider type is unsupported.
type ProviderDefaults ¶
ProviderDefaults holds default parameters for providers
type ProviderFactory ¶ added in v1.1.0
type ProviderFactory func(spec ProviderSpec) (Provider, error)
ProviderFactory is a function that creates a provider from a spec
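How RegisterProviderFactory, ProviderFactory, and CreateProviderFromSpec might fit together is sketched below. The map-plus-mutex registry is an assumption about a typical design, not the package's internals, and all types are simplified local stand-ins.

```go
package main

import (
	"fmt"
	"sync"
)

// spec, provider, and factory are simplified stand-ins for ProviderSpec,
// Provider, and ProviderFactory.
type spec struct {
	ID   string
	Type string
}

type provider interface{ ID() string }

type factory func(s spec) (provider, error)

// unsupportedError mirrors the idea behind UnsupportedProviderError.
type unsupportedError struct{ ProviderType string }

func (e *unsupportedError) Error() string {
	return fmt.Sprintf("unsupported provider type: %s", e.ProviderType)
}

var (
	mu        sync.RWMutex
	factories = map[string]factory{}
)

// register stores a factory under its provider type, replacing any earlier one.
func register(providerType string, f factory) {
	mu.Lock()
	defer mu.Unlock()
	factories[providerType] = f
}

// createFromSpec looks up the factory for s.Type and delegates construction to it.
func createFromSpec(s spec) (provider, error) {
	mu.RLock()
	f, ok := factories[s.Type]
	mu.RUnlock()
	if !ok {
		return nil, &unsupportedError{ProviderType: s.Type}
	}
	return f(s)
}

type echo struct{ id string }

func (e echo) ID() string { return e.id }

func main() {
	register("echo", func(s spec) (provider, error) { return echo{id: s.ID}, nil })
	p, err := createFromSpec(spec{ID: "demo", Type: "echo"})
	fmt.Println(p.ID(), err)
	_, err = createFromSpec(spec{Type: "missing"})
	fmt.Println(err)
}
```

With this shape, each provider package can register itself from an init function, so the core package never has to import the concrete implementations.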
type ProviderSpec ¶
type ProviderSpec struct {
ID string
Type string
Model string
BaseURL string
Defaults ProviderDefaults
IncludeRawOutput bool
AdditionalConfig map[string]interface{} // Flexible key-value pairs for provider-specific configuration
}
ProviderSpec holds the configuration needed to create a provider instance
type Registry ¶
type Registry struct {
// contains filtered or unexported fields
}
Registry manages available providers
func (*Registry) Close ¶
Close closes all registered providers and cleans up their resources. Returns the first error encountered, if any.
type SSEScanner ¶
type SSEScanner struct {
// contains filtered or unexported fields
}
SSEScanner scans Server-Sent Events (SSE) streams
func NewSSEScanner ¶
func NewSSEScanner(r io.Reader) *SSEScanner
NewSSEScanner creates a new SSE scanner
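SSEScanner's methods are not listed on this page, so the sketch below does not use it. Instead it shows, with only the standard library, the kind of parsing such a scanner performs. It is deliberately simplified: only `data:` fields are handled, and an event left unterminated at end of stream is still emitted, which is more lenient than the SSE specification. `readSSEData` and `readSSEString` are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readSSEData collects the data payload of each SSE event from r.
// Multi-line data fields are joined with "\n"; event, id, and retry fields
// and comment lines are ignored.
func readSSEData(r io.Reader) ([]string, error) {
	var events []string
	var current []string
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := sc.Text()
		switch {
		case line == "": // a blank line dispatches the pending event
			if len(current) > 0 {
				events = append(events, strings.Join(current, "\n"))
				current = nil
			}
		case strings.HasPrefix(line, "data:"):
			payload := strings.TrimPrefix(line, "data:")
			// A single leading space after the colon is not part of the value.
			current = append(current, strings.TrimPrefix(payload, " "))
		}
	}
	if len(current) > 0 { // tolerate a stream that ends without a blank line
		events = append(events, strings.Join(current, "\n"))
	}
	return events, sc.Err()
}

// readSSEString is a convenience wrapper for in-memory input.
func readSSEString(s string) ([]string, error) {
	return readSSEData(strings.NewReader(s))
}

func main() {
	events, err := readSSEString("data: {\"delta\":\"Hi\"}\n\ndata: [DONE]\n\n")
	fmt.Println(events, err)
}
```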
type StreamChunk ¶
type StreamChunk struct {
// Content is the accumulated content so far
Content string `json:"content"`
// Delta is the new content in this chunk
Delta string `json:"delta"`
// MediaDelta contains new media content in this chunk (audio, video, images)
// Uses the same MediaContent type as non-streaming messages for API consistency.
MediaDelta *types.MediaContent `json:"media_delta,omitempty"`
// TokenCount is the total number of tokens so far
TokenCount int `json:"token_count"`
// DeltaTokens is the number of tokens in this delta
DeltaTokens int `json:"delta_tokens"`
// ToolCalls contains accumulated tool calls (for assistant messages that invoke tools)
ToolCalls []types.MessageToolCall `json:"tool_calls,omitempty"`
// FinishReason is nil until stream is complete
// Values: "stop", "length", "content_filter", "tool_calls", "error", "validation_failed", "cancelled"
FinishReason *string `json:"finish_reason,omitempty"`
// Interrupted indicates the response was interrupted (e.g., user started speaking)
// When true, clients should clear any buffered audio and prepare for a new response
Interrupted bool `json:"interrupted,omitempty"`
// Error is set if an error occurred during streaming
Error error `json:"error,omitempty"`
// Metadata contains provider-specific metadata
Metadata map[string]interface{} `json:"metadata,omitempty"`
// FinalResult contains the complete execution result (only set in the final chunk)
FinalResult ExecutionResult `json:"final_result,omitempty"`
// CostInfo contains cost breakdown (only present in final chunk when FinishReason != nil)
CostInfo *types.CostInfo `json:"cost_info,omitempty"`
}
StreamChunk represents a batch of tokens with metadata
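A consumer of the channel returned by PredictStream might accumulate chunks along these lines. The `chunk` type is a reduced stand-in for StreamChunk that keeps only three of its fields, and `collect` is a hypothetical helper. Since StreamChunk.Content already carries the accumulated text, a real consumer could read that field instead of concatenating Delta values.

```go
package main

import (
	"fmt"
	"strings"
)

// chunk is a reduced stand-in for StreamChunk.
type chunk struct {
	Delta        string
	FinishReason *string
	Error        error
}

// collect concatenates deltas until the channel closes and reports the last
// finish reason seen. It stops at the first chunk that carries an error;
// a real consumer would also cancel the request context at that point so
// the producer can exit.
func collect(stream <-chan chunk) (text, finish string, err error) {
	var sb strings.Builder
	for c := range stream {
		if c.Error != nil {
			return sb.String(), "error", c.Error
		}
		sb.WriteString(c.Delta)
		if c.FinishReason != nil {
			finish = *c.FinishReason
		}
	}
	return sb.String(), finish, nil
}

func main() {
	stream := make(chan chunk)
	go func() {
		defer close(stream)
		stop := "stop"
		stream <- chunk{Delta: "Hel"}
		stream <- chunk{Delta: "lo", FinishReason: &stop}
	}()
	text, finish, err := collect(stream)
	fmt.Println(text, finish, err)
}
```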
type StreamEvent ¶
type StreamEvent struct {
// Type is the event type: "chunk", "complete", "error"
Type string `json:"type"`
// Chunk contains the stream chunk data
Chunk *StreamChunk `json:"chunk,omitempty"`
// Error is set for error events
Error error `json:"error,omitempty"`
// Timestamp is when the event occurred
Timestamp time.Time `json:"timestamp"`
}
StreamEvent is sent to observers for monitoring
type StreamInputSession ¶ added in v1.1.0
type StreamInputSession interface {
// SendChunk sends a media chunk to the provider.
// Returns an error if the chunk cannot be sent or the session is closed.
// This method is safe to call from multiple goroutines.
SendChunk(ctx context.Context, chunk *types.MediaChunk) error
// SendText sends a text message to the provider during the streaming session.
// This is useful for sending text prompts or instructions during audio streaming.
// Note: This marks the turn as complete, triggering a response.
SendText(ctx context.Context, text string) error
// SendSystemContext sends a text message as context without completing the turn.
// Use this for system prompts that provide context but shouldn't trigger an immediate response.
// The audio/text that follows will be processed with this context in mind.
SendSystemContext(ctx context.Context, text string) error
// Response returns a receive-only channel for streaming responses.
// The channel is closed when the session ends or encounters an error.
// Consumers should read from this channel in a separate goroutine.
Response() <-chan StreamChunk
// Close ends the streaming session and releases resources.
// After calling Close, SendChunk and SendText will return errors.
// The Response channel will be closed.
// Close is safe to call multiple times.
Close() error
// Error returns any error that occurred during the session.
// Returns nil if no error has occurred.
Error() error
// Done returns a channel that's closed when the session ends.
// This is useful for select statements to detect session completion.
Done() <-chan struct{}
}
StreamInputSession manages a bidirectional streaming session with a provider. The session allows sending media chunks (e.g., audio from a microphone) and receiving streaming responses from the LLM.
Example usage:
session, err := provider.CreateStreamSession(ctx, &StreamingInputConfig{
	Config: types.StreamingMediaConfig{
		Type:       types.ContentTypeAudio,
		ChunkSize:  8192,
		SampleRate: 16000,
		Encoding:   "pcm",
		Channels:   1,
	},
})
if err != nil {
	return err
}
defer session.Close()
// Send audio chunks in a goroutine
go func() {
	for chunk := range micInput {
		if err := session.SendChunk(ctx, chunk); err != nil {
			log.Printf("send error: %v", err)
			break
		}
	}
}()
// Receive responses
for chunk := range session.Response() {
	if chunk.Error != nil {
		log.Printf("response error: %v", chunk.Error)
		break
	}
	fmt.Print(chunk.Delta)
}
type StreamInputSupport ¶ added in v1.1.0
type StreamInputSupport interface {
Provider // Extends the base Provider interface
// CreateStreamSession creates a new bidirectional streaming session.
// The session remains active until Close() is called or an error occurs.
// Returns an error if the provider doesn't support the requested media type.
CreateStreamSession(ctx context.Context, req *StreamingInputConfig) (StreamInputSession, error)
// SupportsStreamInput returns the media types supported for streaming input.
// Common values: types.ContentTypeAudio, types.ContentTypeVideo
SupportsStreamInput() []string
// GetStreamingCapabilities returns detailed information about streaming support.
// This includes supported codecs, sample rates, and other constraints.
GetStreamingCapabilities() StreamingCapabilities
}
StreamInputSupport extends the Provider interface for bidirectional streaming. Providers that implement this interface can handle streaming media input (e.g., real-time audio) and provide streaming responses.
type StreamObserver ¶
type StreamObserver interface {
OnChunk(chunk StreamChunk)
OnComplete(totalTokens int, duration time.Duration)
OnError(err error)
}
StreamObserver receives stream events for monitoring
type StreamingCapabilities ¶ added in v1.1.0
type StreamingCapabilities struct {
// SupportedMediaTypes lists the media types that can be streamed
// Values: types.ContentTypeAudio, types.ContentTypeVideo
SupportedMediaTypes []string `json:"supported_media_types"`
// Audio capabilities
Audio *AudioStreamingCapabilities `json:"audio,omitempty"`
// Video capabilities
Video *VideoStreamingCapabilities `json:"video,omitempty"`
// BidirectionalSupport indicates if the provider supports full bidirectional streaming
BidirectionalSupport bool `json:"bidirectional_support"`
// MaxSessionDuration is the maximum duration for a streaming session (in seconds)
// Zero means no limit
MaxSessionDuration int `json:"max_session_duration,omitempty"`
// MinChunkSize is the minimum chunk size in bytes
MinChunkSize int `json:"min_chunk_size,omitempty"`
// MaxChunkSize is the maximum chunk size in bytes
MaxChunkSize int `json:"max_chunk_size,omitempty"`
}
StreamingCapabilities describes what streaming features a provider supports.
type StreamingInputConfig ¶ added in v1.1.6
type StreamingInputConfig struct {
// Config specifies the media streaming configuration (codec, sample rate, etc.)
Config types.StreamingMediaConfig `json:"config"`
// SystemInstruction is the system prompt to configure the model's behavior.
// For Gemini Live API, this is included in the setup message.
SystemInstruction string `json:"system_instruction,omitempty"`
// Tools defines functions the model can call during the session.
// When configured, the model returns structured tool calls instead of
// speaking them as text. Supported by Gemini Live API.
Tools []StreamingToolDefinition `json:"tools,omitempty"`
// Metadata contains provider-specific session configuration
// Example: {"response_modalities": ["TEXT", "AUDIO"]} for Gemini
Metadata map[string]interface{} `json:"metadata,omitempty"`
}
StreamingInputConfig configures a new streaming input session.
func (*StreamingInputConfig) Validate ¶ added in v1.1.6
func (r *StreamingInputConfig) Validate() error
Validate checks if the StreamingInputConfig is valid.
type StreamingToolDefinition ¶ added in v1.1.6
type StreamingToolDefinition struct {
Name string `json:"name"`
Description string `json:"description,omitempty"`
Parameters map[string]interface{} `json:"parameters,omitempty"` // JSON Schema
}
StreamingToolDefinition represents a function/tool available in streaming sessions.
type ToolDescriptor ¶
type ToolDescriptor struct {
Name string `json:"name"`
Description string `json:"description"`
InputSchema json.RawMessage `json:"input_schema"`
OutputSchema json.RawMessage `json:"output_schema"`
}
ToolDescriptor represents a tool that can be used by providers
type ToolResponse ¶ added in v1.1.6
type ToolResponse struct {
ToolCallID string `json:"tool_call_id"`
Result string `json:"result"`
IsError bool `json:"is_error,omitempty"` // True if the tool execution failed
}
ToolResponse represents a single tool execution result.
type ToolResponseSupport ¶ added in v1.1.6
type ToolResponseSupport interface {
// SendToolResponse sends the result of a tool execution back to the model.
// The toolCallID must match the ID from the MessageToolCall.
// The result is typically JSON-encoded but the format depends on the tool.
// After receiving the tool response, the model will continue generating.
SendToolResponse(ctx context.Context, toolCallID string, result string) error
// SendToolResponses sends multiple tool results at once (for parallel tool calls).
// This is more efficient than sending individual responses for providers that
// support batched tool responses.
SendToolResponses(ctx context.Context, responses []ToolResponse) error
}
ToolResponseSupport is an optional interface for streaming sessions that support tool calling. When the model returns a tool call, the caller can execute the tool and send the result back using this interface. The session will then continue generating a response based on the tool result.
Use type assertion to check if a StreamInputSession supports this interface:
if toolSession, ok := session.(ToolResponseSupport); ok {
	if err := toolSession.SendToolResponse(ctx, toolCallID, result); err != nil {
		return err
	}
}
type ToolResult ¶
type ToolResult = types.MessageToolResult
ToolResult represents the result of a tool execution. It is an alias for types.MessageToolResult, provided for use in provider-specific contexts.
type ToolSupport ¶
type ToolSupport interface {
Provider // Extends the base Provider interface
// BuildTooling converts tool descriptors to provider-native format
BuildTooling(descriptors []*ToolDescriptor) (interface{}, error)
// PredictWithTools performs a predict request with tool support
PredictWithTools(ctx context.Context, req PredictionRequest, tools interface{}, toolChoice string) (PredictionResponse, []types.MessageToolCall, error)
// PredictStreamWithTools performs a streaming predict request with tool support
PredictStreamWithTools(
ctx context.Context,
req PredictionRequest,
tools interface{},
toolChoice string,
) (<-chan StreamChunk, error)
}
ToolSupport is the interface for providers that support tool/function calling.
type UnsupportedContentError ¶ added in v1.1.0
type UnsupportedContentError struct {
Provider string // Provider ID
ContentType string // "image", "audio", "video", or "multimodal"
Message string // Human-readable error message
PartIndex int // Index of the unsupported content part (if applicable)
MIMEType string // Specific MIME type that's unsupported (if applicable)
}
UnsupportedContentError is returned when a provider does not support a content type present in the request.
func (*UnsupportedContentError) Error ¶ added in v1.1.0
func (e *UnsupportedContentError) Error() string
type UnsupportedProviderError ¶
type UnsupportedProviderError struct {
ProviderType string
}
UnsupportedProviderError is returned when a provider type is not recognized.
func (*UnsupportedProviderError) Error ¶
func (e *UnsupportedProviderError) Error() string
Error returns the error message for this unsupported provider error.
type ValidationAbortError ¶
type ValidationAbortError struct {
Reason string
Chunk StreamChunk
}
ValidationAbortError is returned when a streaming validator aborts a stream.
func (*ValidationAbortError) Error ¶
func (e *ValidationAbortError) Error() string
Error returns the error message for this validation abort error.
type VideoResolution ¶ added in v1.1.0
VideoResolution represents a video resolution.
func (VideoResolution) String ¶ added in v1.1.0
func (r VideoResolution) String() string
String returns a string representation of the resolution (e.g., "1920x1080").
type VideoStreamingCapabilities ¶ added in v1.1.0
type VideoStreamingCapabilities struct {
// SupportedEncodings lists supported video encodings
// Common values: "h264", "vp8", "vp9", "av1"
SupportedEncodings []string `json:"supported_encodings"`
// SupportedResolutions lists supported resolutions (width x height)
SupportedResolutions []VideoResolution `json:"supported_resolutions"`
// SupportedFrameRates lists supported frame rates
// Common values: 15, 24, 30, 60
SupportedFrameRates []int `json:"supported_frame_rates"`
// PreferredEncoding is the recommended encoding
PreferredEncoding string `json:"preferred_encoding"`
// PreferredResolution is the recommended resolution
PreferredResolution VideoResolution `json:"preferred_resolution"`
// PreferredFrameRate is the recommended frame rate
PreferredFrameRate int `json:"preferred_frame_rate"`
}
VideoStreamingCapabilities describes video streaming support.
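A caller might use these capabilities to negotiate stream settings before opening a session. The following sketch is illustrative only: the types are local stand-ins, the Width and Height fields of VideoResolution are assumed (its struct body is not shown here), and pickEncoding and exampleCaps are hypothetical helpers.

```go
package main

import "fmt"

// VideoResolution is a stand-in; the Width and Height fields are assumed.
type VideoResolution struct {
	Width  int
	Height int
}

// String follows the documented "1920x1080" style.
func (r VideoResolution) String() string {
	return fmt.Sprintf("%dx%d", r.Width, r.Height)
}

// VideoStreamingCapabilities mirrors a subset of the documented fields.
type VideoStreamingCapabilities struct {
	SupportedEncodings  []string
	SupportedFrameRates []int
	PreferredEncoding   string
	PreferredResolution VideoResolution
	PreferredFrameRate  int
}

// pickEncoding (hypothetical helper) returns the first encoding in wanted
// that the provider supports, falling back to the provider's preference.
func pickEncoding(caps VideoStreamingCapabilities, wanted ...string) string {
	for _, w := range wanted {
		for _, s := range caps.SupportedEncodings {
			if s == w {
				return w
			}
		}
	}
	return caps.PreferredEncoding
}

// exampleCaps returns made-up capabilities used only for this sketch.
func exampleCaps() VideoStreamingCapabilities {
	return VideoStreamingCapabilities{
		SupportedEncodings:  []string{"h264", "vp9"},
		SupportedFrameRates: []int{15, 30},
		PreferredEncoding:   "h264",
		PreferredResolution: VideoResolution{Width: 1280, Height: 720},
		PreferredFrameRate:  30,
	}
}

func main() {
	caps := exampleCaps()
	fmt.Println(pickEncoding(caps, "av1", "vp9"), caps.PreferredResolution, caps.PreferredFrameRate)
}
```

Falling back to the preferred values keeps the caller on a combination the provider recommends when none of the caller's choices are supported.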
Directories
¶
| Path | Synopsis |
|---|---|
| claude | Package claude provides Anthropic Claude LLM provider integration. |
| gemini | Package gemini provides Gemini Live API streaming support. |
| imagen | Package imagen provides Google Imagen image generation provider integration. |
| mock | Package mock provides mock provider implementation for testing and development. |
| openai | Package openai provides OpenAI LLM provider integration. |
| replay | Package replay provides a provider that replays recorded sessions deterministically. |
| voyageai | Package voyageai provides embedding generation via the Voyage AI API. |