Documentation ¶
Overview ¶
Package provider provides AI provider implementations for text generation and embedding generation. Providers may support one or both capabilities.
Index ¶
- Variables
- type AnthropicConfig
- type AnthropicOption
- func WithAnthropicBackoffFactor(f float64) AnthropicOption
- func WithAnthropicBaseURL(url string) AnthropicOption
- func WithAnthropicInitialDelay(d time.Duration) AnthropicOption
- func WithAnthropicMaxRetries(n int) AnthropicOption
- func WithAnthropicModel(model string) AnthropicOption
- func WithAnthropicTimeout(d time.Duration) AnthropicOption
- type AnthropicProvider
- type CachingTransport
- type ChatCompletionRequest
- func (r ChatCompletionRequest) MaxTokens() int
- func (r ChatCompletionRequest) Messages() []Message
- func (r ChatCompletionRequest) Temperature() float64
- func (r ChatCompletionRequest) WithMaxTokens(n int) ChatCompletionRequest
- func (r ChatCompletionRequest) WithTemperature(t float64) ChatCompletionRequest
- type ChatCompletionResponse
- type HugotEmbedding
- type LocalVisionEmbedding
- type Message
- type OpenAIConfig
- type OpenAIOption
- type OpenAIProvider
- func (p *OpenAIProvider) ChatCompletion(ctx context.Context, req ChatCompletionRequest) (ChatCompletionResponse, error)
- func (p *OpenAIProvider) Close() error
- func (p *OpenAIProvider) Embed(ctx context.Context, items []search.EmbeddingItem) ([][]float64, error)
- func (p *OpenAIProvider) SupportsEmbedding() bool
- func (p *OpenAIProvider) SupportsTextGeneration() bool
- type OpenAIVisionProvider
- type ProviderError
- type TextGenerator
- type Usage
- type VisionModelConfig
Constants ¶
This section is empty.
Variables ¶
var (
	// ErrUnsupportedOperation indicates the provider doesn't support the requested operation.
	ErrUnsupportedOperation = errors.New("operation not supported by this provider")

	// ErrRateLimited indicates the provider rate limited the request.
	ErrRateLimited = errors.New("rate limited")

	// ErrContextTooLong indicates the input exceeded the context window.
	ErrContextTooLong = errors.New("context too long")

	// ErrProviderError indicates a general provider error.
	ErrProviderError = errors.New("provider error")
)
Common errors.
var SigLIP2BaseConfig = VisionModelConfig{
	ModelDir:         "google_siglip2-base-patch16-512",
	VisionOnnx:       "vision_model.onnx",
	TextOnnx:         "text_model.onnx",
	ImageSize:        512,
	ImageMean:        [3]float32{0.5, 0.5, 0.5},
	ImageStd:         [3]float32{0.5, 0.5, 0.5},
	VisionOutputName: "pooler_output",
}
SigLIP2BaseConfig is the configuration for google/siglip2-base-patch16-512.
Functions ¶
This section is empty.
Types ¶
type AnthropicConfig ¶
type AnthropicConfig struct {
APIKey string
BaseURL string
Model string
Timeout time.Duration
MaxRetries int
InitialDelay time.Duration
BackoffFactor float64
}
AnthropicConfig holds configuration for the Anthropic provider.
type AnthropicOption ¶
type AnthropicOption func(*AnthropicProvider)
AnthropicOption is a functional option for AnthropicProvider.
func WithAnthropicBackoffFactor ¶
func WithAnthropicBackoffFactor(f float64) AnthropicOption
WithAnthropicBackoffFactor sets the backoff multiplier.
func WithAnthropicBaseURL ¶
func WithAnthropicBaseURL(url string) AnthropicOption
WithAnthropicBaseURL sets the base URL (for testing or proxies).
func WithAnthropicInitialDelay ¶
func WithAnthropicInitialDelay(d time.Duration) AnthropicOption
WithAnthropicInitialDelay sets the initial retry delay.
func WithAnthropicMaxRetries ¶
func WithAnthropicMaxRetries(n int) AnthropicOption
WithAnthropicMaxRetries sets the maximum retry count.
func WithAnthropicModel ¶
func WithAnthropicModel(model string) AnthropicOption
WithAnthropicModel sets the Claude model.
func WithAnthropicTimeout ¶
func WithAnthropicTimeout(d time.Duration) AnthropicOption
WithAnthropicTimeout sets the HTTP timeout.
type AnthropicProvider ¶
type AnthropicProvider struct {
// contains filtered or unexported fields
}
AnthropicProvider implements text generation using the Anthropic Claude API. Note: Anthropic does not provide embeddings, so this provider supports only text generation.
func NewAnthropicProvider ¶
func NewAnthropicProvider(apiKey string, opts ...AnthropicOption) *AnthropicProvider
NewAnthropicProvider creates a new Anthropic Claude provider.
func NewAnthropicProviderFromConfig ¶
func NewAnthropicProviderFromConfig(cfg AnthropicConfig) *AnthropicProvider
NewAnthropicProviderFromConfig creates a provider from configuration.
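A minimal usage sketch, assuming the package is imported as provider and an API key is in the environment. The model name is illustrative, and the user-role constructor (UserMessage) is an assumption: only SystemMessage and AssistantMessage appear in this index.

```go
// Construct a Claude-backed text generator with functional options.
p := provider.NewAnthropicProvider(
	os.Getenv("ANTHROPIC_API_KEY"),
	provider.WithAnthropicModel("claude-sonnet-4-20250514"), // illustrative model name
	provider.WithAnthropicTimeout(30*time.Second),
	provider.WithAnthropicMaxRetries(3),
)
defer p.Close()

// UserMessage is assumed to exist alongside the documented constructors.
req := provider.NewChatCompletionRequest([]provider.Message{
	provider.SystemMessage("You are a terse release-notes writer."),
	provider.UserMessage("Summarize the changes in one sentence."),
})

resp, err := p.ChatCompletion(context.Background(), req)
if err != nil {
	log.Fatal(err)
}
fmt.Println(resp.Content())
```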
func (*AnthropicProvider) ChatCompletion ¶
func (p *AnthropicProvider) ChatCompletion(ctx context.Context, req ChatCompletionRequest) (ChatCompletionResponse, error)
ChatCompletion generates a chat completion using Claude.
func (*AnthropicProvider) Close ¶
func (p *AnthropicProvider) Close() error
Close is a no-op for the Anthropic provider.
func (*AnthropicProvider) SupportsEmbedding ¶
func (p *AnthropicProvider) SupportsEmbedding() bool
SupportsEmbedding returns false (Anthropic doesn't support embeddings).
func (*AnthropicProvider) SupportsTextGeneration ¶
func (p *AnthropicProvider) SupportsTextGeneration() bool
SupportsTextGeneration returns true.
type CachingTransport ¶
type CachingTransport struct {
// contains filtered or unexported fields
}
CachingTransport is an http.RoundTripper that caches POST request/response pairs in a SQLite database, keyed by the SHA-256 of method + URL + request body. Only 2xx responses are cached. Cache read/write errors are non-fatal — they silently fall through to the inner transport.
func NewCachingTransport ¶
func NewCachingTransport(dir string, inner http.RoundTripper) (*CachingTransport, error)
NewCachingTransport creates a CachingTransport that stores cached responses in a SQLite database under dir/http_cache.db. If inner is nil, http.DefaultTransport is used.
func (*CachingTransport) Close ¶
func (t *CachingTransport) Close() error
Close closes the underlying SQLite database.
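A sketch of wiring the transport into a provider via the documented OpenAIConfig.HTTPClient field; the cache directory path is illustrative.

```go
// Cache identical 2xx POST exchanges in dir/http_cache.db; a nil inner
// transport falls back to http.DefaultTransport.
ct, err := provider.NewCachingTransport("/var/cache/myapp", nil)
if err != nil {
	log.Fatal(err)
}
defer ct.Close()

// Route provider traffic through the cache.
p := provider.NewOpenAIProviderFromConfig(provider.OpenAIConfig{
	APIKey:     os.Getenv("OPENAI_API_KEY"),
	HTTPClient: &http.Client{Transport: ct},
})
defer p.Close()
```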
type ChatCompletionRequest ¶
type ChatCompletionRequest struct {
// contains filtered or unexported fields
}
ChatCompletionRequest represents a request for text generation.
func NewChatCompletionRequest ¶
func NewChatCompletionRequest(messages []Message) ChatCompletionRequest
NewChatCompletionRequest creates a new ChatCompletionRequest.
func (ChatCompletionRequest) MaxTokens ¶
func (r ChatCompletionRequest) MaxTokens() int
MaxTokens returns the max tokens setting.
func (ChatCompletionRequest) Messages ¶
func (r ChatCompletionRequest) Messages() []Message
Messages returns the messages.
func (ChatCompletionRequest) Temperature ¶
func (r ChatCompletionRequest) Temperature() float64
Temperature returns the temperature setting.
func (ChatCompletionRequest) WithMaxTokens ¶
func (r ChatCompletionRequest) WithMaxTokens(n int) ChatCompletionRequest
WithMaxTokens returns a new request with the specified max tokens.
func (ChatCompletionRequest) WithTemperature ¶
func (r ChatCompletionRequest) WithTemperature(t float64) ChatCompletionRequest
WithTemperature returns a new request with the specified temperature.
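Because the With* methods take value receivers and return new requests, settings compose without mutating the original; a sketch:

```go
base := provider.NewChatCompletionRequest(msgs)

// Each With* call returns a copy; base keeps its original settings.
tuned := base.WithTemperature(0.2).WithMaxTokens(1024)

fmt.Println(tuned.Temperature(), tuned.MaxTokens()) // 0.2 1024
fmt.Println(base.MaxTokens())                       // unchanged default
```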
type ChatCompletionResponse ¶
type ChatCompletionResponse struct {
// contains filtered or unexported fields
}
ChatCompletionResponse represents a text generation response.
func NewChatCompletionResponse ¶
func NewChatCompletionResponse(content, finishReason string, usage Usage) ChatCompletionResponse
NewChatCompletionResponse creates a new ChatCompletionResponse.
func (ChatCompletionResponse) Content ¶
func (r ChatCompletionResponse) Content() string
Content returns the generated content.
func (ChatCompletionResponse) FinishReason ¶
func (r ChatCompletionResponse) FinishReason() string
FinishReason returns why generation stopped.
func (ChatCompletionResponse) Usage ¶
func (r ChatCompletionResponse) Usage() Usage
Usage returns token usage information.
type HugotEmbedding ¶
type HugotEmbedding struct {
// contains filtered or unexported fields
}
HugotEmbedding provides local embedding generation using the st-codesearch-distilroberta-base model via the hugot Go backend.
The model can come from two sources (checked in order):
- Model files on disk — a subdirectory of cacheDir containing tokenizer.json.
- Statically embedded in the binary (build tag embed_model), extracted to cacheDir on first use.
All instances share a single ONNX Runtime session because ORT only supports one active session per process.
func NewHugotEmbedding ¶
func NewHugotEmbedding(cacheDir string) *HugotEmbedding
NewHugotEmbedding creates a HugotEmbedding that looks for model files in cacheDir. If no model exists on disk and the embed_model build tag was used, the embedded model is extracted to cacheDir automatically.
func (*HugotEmbedding) Available ¶
func (h *HugotEmbedding) Available() bool
Available reports whether a usable model exists — either compiled into the binary (embed_model build tag) or present on disk in cacheDir.
func (*HugotEmbedding) Close ¶
func (h *HugotEmbedding) Close() error
Close is a no-op. The ONNX Runtime session is process-global and shared across all HugotEmbedding instances; it is cleaned up when the process exits.
func (*HugotEmbedding) Embed ¶
func (h *HugotEmbedding) Embed(ctx context.Context, items []search.EmbeddingItem) ([][]float64, error)
Embed generates embeddings for the given text items using the local model. Items without a text payload return an error — hugot is a text-only model.
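A sketch of the local-embedding flow, with an illustrative cache path; how search.EmbeddingItem values are constructed lives in the search package and is elided here.

```go
cacheDir, err := os.UserCacheDir()
if err != nil {
	log.Fatal(err)
}
h := provider.NewHugotEmbedding(filepath.Join(cacheDir, "myapp", "models"))
if !h.Available() {
	log.Fatal("no local model on disk or compiled in (embed_model build tag)")
}

var items []search.EmbeddingItem // text-only items; see the search package
vecs, err := h.Embed(context.Background(), items)
if err != nil {
	log.Fatal(err)
}
fmt.Println(len(vecs)) // one vector per item
```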
type LocalVisionEmbedding ¶ added in v1.3.0
type LocalVisionEmbedding struct {
// contains filtered or unexported fields
}
LocalVisionEmbedding manages a local ONNX dual-encoder vision-language model. It implements search.Embedder, dispatching each item to either the vision or text encoder based on its payload. Both encoders produce vectors in the same embedding space.
The specific model (SigLIP2, CLIP, etc.) is determined by the VisionModelConfig passed at construction time. All instances share the process-wide ORT session.
func NewLocalVisionEmbedding ¶ added in v1.3.0
func NewLocalVisionEmbedding(config VisionModelConfig, cacheDir string) *LocalVisionEmbedding
NewLocalVisionEmbedding creates a LocalVisionEmbedding for the model described by config, looking for files in cacheDir.
func (*LocalVisionEmbedding) Close ¶ added in v1.3.0
func (l *LocalVisionEmbedding) Close() error
Close is a no-op. The ORT session is process-global.
func (*LocalVisionEmbedding) Embed ¶ added in v1.3.1
func (l *LocalVisionEmbedding) Embed(ctx context.Context, items []search.EmbeddingItem) ([][]float64, error)
Embed dispatches each item to the vision or text ONNX pipeline based on which payload the item carries. Image items go through the vision encoder, text items go through the text encoder; both produce vectors in the same embedding space. Items carrying both payloads use the image encoder (the local SigLIP2 model is a dual encoder and cannot embed a combined input).
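A sketch using the documented SigLIP2BaseConfig; the cache directory is illustrative and item construction is elided.

```go
l := provider.NewLocalVisionEmbedding(provider.SigLIP2BaseConfig, "/var/cache/myapp/models")
defer l.Close() // no-op; the ORT session is process-global

// A mixed batch: image items route to the vision encoder, text items to the
// text encoder, and all resulting vectors share one embedding space.
var items []search.EmbeddingItem
vecs, err := l.Embed(context.Background(), items)
if err != nil {
	log.Fatal(err)
}
_ = vecs
```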
type Message ¶
type Message struct {
// contains filtered or unexported fields
}
Message represents a chat message.
func AssistantMessage ¶
AssistantMessage creates an assistant message.
func SystemMessage ¶
SystemMessage creates a system message.
type OpenAIConfig ¶
type OpenAIConfig struct {
APIKey string
BaseURL string
ChatModel string
EmbeddingModel string
Timeout time.Duration
MaxRetries int
InitialDelay time.Duration
BackoffFactor float64
HTTPClient *http.Client
ExtraParams map[string]any
QueryInstruction string
DocumentInstruction string
}
OpenAIConfig holds configuration for the OpenAI provider.
type OpenAIOption ¶
type OpenAIOption func(*OpenAIProvider)
OpenAIOption is a functional option for OpenAIProvider.
func WithBackoffFactor ¶
func WithBackoffFactor(f float64) OpenAIOption
WithBackoffFactor sets the backoff multiplier.
func WithChatModel ¶
func WithChatModel(model string) OpenAIOption
WithChatModel sets the chat completion model.
func WithEmbeddingModel ¶
func WithEmbeddingModel(model string) OpenAIOption
WithEmbeddingModel sets the embedding model.
func WithInitialDelay ¶
func WithInitialDelay(d time.Duration) OpenAIOption
WithInitialDelay sets the initial retry delay.
func WithMaxRetries ¶
func WithMaxRetries(n int) OpenAIOption
WithMaxRetries sets the maximum retry count.
type OpenAIProvider ¶
type OpenAIProvider struct {
// contains filtered or unexported fields
}
OpenAIProvider implements both text generation and embedding using the OpenAI API.
func NewOpenAIProvider ¶
func NewOpenAIProvider(apiKey string, opts ...OpenAIOption) *OpenAIProvider
NewOpenAIProvider creates a new OpenAI provider.
func NewOpenAIProviderFromConfig ¶
func NewOpenAIProviderFromConfig(cfg OpenAIConfig) *OpenAIProvider
NewOpenAIProviderFromConfig creates a provider from configuration.
func (*OpenAIProvider) ChatCompletion ¶
func (p *OpenAIProvider) ChatCompletion(ctx context.Context, req ChatCompletionRequest) (ChatCompletionResponse, error)
ChatCompletion generates a chat completion.
func (*OpenAIProvider) Close ¶
func (p *OpenAIProvider) Close() error
Close is a no-op for the OpenAI provider.
func (*OpenAIProvider) Embed ¶
func (p *OpenAIProvider) Embed(ctx context.Context, items []search.EmbeddingItem) ([][]float64, error)
Embed generates embeddings for the given text items in a single API call. Items without a text payload return an error — OpenAI text embedding endpoints do not accept image inputs.
func (*OpenAIProvider) SupportsEmbedding ¶
func (p *OpenAIProvider) SupportsEmbedding() bool
SupportsEmbedding returns true.
func (*OpenAIProvider) SupportsTextGeneration ¶
func (p *OpenAIProvider) SupportsTextGeneration() bool
SupportsTextGeneration returns true.
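The capability methods let callers that hold a heterogeneous set of providers branch at runtime instead of type-switching. A sketch with illustrative model names:

```go
p := provider.NewOpenAIProvider(
	os.Getenv("OPENAI_API_KEY"),
	provider.WithChatModel("gpt-4o-mini"),                 // illustrative
	provider.WithEmbeddingModel("text-embedding-3-small"), // illustrative
)
defer p.Close()

// AnthropicProvider would report false here and be skipped.
if p.SupportsEmbedding() {
	var items []search.EmbeddingItem // text items; see the search package
	vecs, err := p.Embed(context.Background(), items)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(vecs))
}
```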
type OpenAIVisionProvider ¶ added in v1.3.1
type OpenAIVisionProvider struct {
// contains filtered or unexported fields
}
OpenAIVisionProvider embeds text or image inputs via an OpenAI-compatible vision-language embedding API (e.g. Qwen3-VL-Embedding). It implements Embedder and uses the vLLM "messages" format for all inputs so that the model's chat template is applied consistently across modalities.
func NewOpenAIVisionProvider ¶ added in v1.3.1
func NewOpenAIVisionProvider(cfg OpenAIConfig) *OpenAIVisionProvider
NewOpenAIVisionProvider creates a provider from configuration.
func (*OpenAIVisionProvider) Close ¶ added in v1.3.1
func (p *OpenAIVisionProvider) Close() error
Close is a no-op for the remote provider.
func (*OpenAIVisionProvider) Embed ¶ added in v1.3.1
func (p *OpenAIVisionProvider) Embed(ctx context.Context, items []search.EmbeddingItem) ([][]float64, error)
Embed sends each item to the remote API using the vLLM "messages" format. Both text and image items are sent as chat messages because Qwen3-VL-Embedding applies a chat template that must be consistent across modalities for cross-modal search to work. Sending text queries via the plain "input" field would bypass the chat template, placing them in a different embedding space than image embeddings.
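A sketch pointing the provider at an OpenAI-compatible endpoint; the base URL and model id are illustrative.

```go
vp := provider.NewOpenAIVisionProvider(provider.OpenAIConfig{
	APIKey:         os.Getenv("EMBED_API_KEY"),
	BaseURL:        "http://localhost:8000/v1", // e.g. a local vLLM server
	EmbeddingModel: "Qwen3-VL-Embedding",       // illustrative model id
})
defer vp.Close()

// Text and image items alike go out in the vLLM "messages" format so the
// model's chat template is applied uniformly across modalities.
var items []search.EmbeddingItem
vecs, err := vp.Embed(context.Background(), items)
if err != nil {
	log.Fatal(err)
}
_ = vecs
```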
type ProviderError ¶
type ProviderError struct {
// contains filtered or unexported fields
}
ProviderError wraps provider errors with additional context.
func NewProviderError ¶
func NewProviderError(operation string, statusCode int, message string, cause error) *ProviderError
NewProviderError creates a new ProviderError.
func (*ProviderError) Error ¶
func (e *ProviderError) Error() string
Error implements the error interface.
func (*ProviderError) IsContextTooLong ¶
func (e *ProviderError) IsContextTooLong() bool
IsContextTooLong returns true if the error is due to context length.
func (*ProviderError) IsRateLimited ¶
func (e *ProviderError) IsRateLimited() bool
IsRateLimited returns true if the error is due to rate limiting.
func (*ProviderError) Message ¶
func (e *ProviderError) Message() string
Message returns the error message.
func (*ProviderError) Operation ¶
func (e *ProviderError) Operation() string
Operation returns the operation that failed.
func (*ProviderError) StatusCode ¶
func (e *ProviderError) StatusCode() int
StatusCode returns the HTTP status code if available.
func (*ProviderError) Unwrap ¶
func (e *ProviderError) Unwrap() error
Unwrap returns the underlying cause.
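An error-classification sketch using only the documented accessors; errors.As reaches a *ProviderError anywhere in the wrap chain.

```go
_, err := p.ChatCompletion(ctx, req)
if err != nil {
	var pe *provider.ProviderError
	if errors.As(err, &pe) {
		switch {
		case pe.IsRateLimited():
			// back off (the provider retries internally up to MaxRetries,
			// so reaching here means retries were exhausted)
		case pe.IsContextTooLong():
			// truncate or summarize the prompt and retry
		default:
			log.Printf("%s failed (HTTP %d): %s",
				pe.Operation(), pe.StatusCode(), pe.Message())
		}
	}
	// Sentinels such as ErrRateLimited can also be matched directly with
	// errors.Is, assuming the provider wraps them into returned errors.
}
```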
type TextGenerator ¶
type TextGenerator interface {
// ChatCompletion generates a text completion for the given messages.
ChatCompletion(ctx context.Context, req ChatCompletionRequest) (ChatCompletionResponse, error)
}
TextGenerator generates text completions.
type Usage ¶
type Usage struct {
// contains filtered or unexported fields
}
Usage represents token usage information.
func (Usage) CompletionTokens ¶
CompletionTokens returns the number of completion tokens.
func (Usage) PromptTokens ¶
PromptTokens returns the number of prompt tokens.
func (Usage) TotalTokens ¶
TotalTokens returns the total number of tokens.
type VisionModelConfig ¶ added in v1.3.0
type VisionModelConfig struct {
// ModelDir is the subdirectory name under the models cache directory
// (e.g. "google_siglip2-base-patch16-512").
ModelDir string
// VisionOnnx is the ONNX filename for the vision encoder
// (e.g. "vision_model.onnx").
VisionOnnx string
// TextOnnx is the ONNX filename for the text encoder
// (e.g. "text_model.onnx").
TextOnnx string
// ImageSize is the target height/width in pixels after resize and crop
// (e.g. 512 for SigLIP2, 224 for CLIP ViT-B/32).
ImageSize int
// ImageMean is the per-channel normalization mean applied after rescaling
// pixel values to [0, 1].
ImageMean [3]float32
// ImageStd is the per-channel normalization standard deviation.
ImageStd [3]float32
// VisionOutputName selects which model output to use for embeddings
// (e.g. "pooler_output"). Empty string uses the first output.
VisionOutputName string
}
VisionModelConfig describes how to load and preprocess images for a specific local ONNX vision-language model. Different models (SigLIP2, CLIP, etc.) provide different configs; the runtime code is shared.
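A hypothetical config for a CLIP ViT-B/32 ONNX export, showing how the fields map to a different model. The directory and file names are illustrative; 224 is CLIP ViT-B/32's input resolution, and the mean/std values are the standard CLIP normalization constants.

```go
// Hypothetical config; directory and ONNX file names are illustrative.
var clipViTB32 = provider.VisionModelConfig{
	ModelDir:   "openai_clip-vit-base-patch32",
	VisionOnnx: "vision_model.onnx",
	TextOnnx:   "text_model.onnx",
	ImageSize:  224, // CLIP ViT-B/32 input resolution
	ImageMean:  [3]float32{0.48145466, 0.4578275, 0.40821073}, // standard CLIP stats
	ImageStd:   [3]float32{0.26862954, 0.26130258, 0.27577711},
	// VisionOutputName left empty: use the model's first output.
}

l := provider.NewLocalVisionEmbedding(clipViTB32, "/var/cache/myapp/models")
defer l.Close()
```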