provider

package
v1.3.6
Published: Apr 21, 2026 License: Apache-2.0 Imports: 29 Imported by: 0

Documentation

Overview

Package provider provides AI provider implementations for text generation and embedding generation. Providers may support one or both capabilities.

Index

Constants

This section is empty.

Variables

var (
	// ErrUnsupportedOperation indicates the provider doesn't support the requested operation.
	ErrUnsupportedOperation = errors.New("operation not supported by this provider")

	// ErrRateLimited indicates the provider rate limited the request.
	ErrRateLimited = errors.New("rate limited")

	// ErrContextTooLong indicates the input exceeded the context window.
	ErrContextTooLong = errors.New("context too long")

	// ErrProviderError indicates a general provider error.
	ErrProviderError = errors.New("provider error")
)

Common errors.

var SigLIP2BaseConfig = VisionModelConfig{
	ModelDir:         "google_siglip2-base-patch16-512",
	VisionOnnx:       "vision_model.onnx",
	TextOnnx:         "text_model.onnx",
	ImageSize:        512,
	ImageMean:        [3]float32{0.5, 0.5, 0.5},
	ImageStd:         [3]float32{0.5, 0.5, 0.5},
	VisionOutputName: "pooler_output",
}

SigLIP2BaseConfig is the configuration for google/siglip2-base-patch16-512.

Functions

This section is empty.

Types

type AnthropicConfig

type AnthropicConfig struct {
	APIKey        string
	BaseURL       string
	Model         string
	Timeout       time.Duration
	MaxRetries    int
	InitialDelay  time.Duration
	BackoffFactor float64
}

AnthropicConfig holds configuration for the Anthropic provider.

type AnthropicOption

type AnthropicOption func(*AnthropicProvider)

AnthropicOption is a functional option for AnthropicProvider.

func WithAnthropicBackoffFactor

func WithAnthropicBackoffFactor(f float64) AnthropicOption

WithAnthropicBackoffFactor sets the backoff multiplier.

func WithAnthropicBaseURL

func WithAnthropicBaseURL(url string) AnthropicOption

WithAnthropicBaseURL sets the base URL (for testing or proxies).

func WithAnthropicInitialDelay

func WithAnthropicInitialDelay(d time.Duration) AnthropicOption

WithAnthropicInitialDelay sets the initial retry delay.

func WithAnthropicMaxRetries

func WithAnthropicMaxRetries(n int) AnthropicOption

WithAnthropicMaxRetries sets the maximum retry count.

func WithAnthropicModel

func WithAnthropicModel(model string) AnthropicOption

WithAnthropicModel sets the Claude model.

func WithAnthropicTimeout

func WithAnthropicTimeout(d time.Duration) AnthropicOption

WithAnthropicTimeout sets the HTTP timeout.

type AnthropicProvider

type AnthropicProvider struct {
	// contains filtered or unexported fields
}

AnthropicProvider implements text generation using Anthropic Claude API. Note: Anthropic does not provide embeddings, so this provider only supports text generation.

func NewAnthropicProvider

func NewAnthropicProvider(apiKey string, opts ...AnthropicOption) *AnthropicProvider

NewAnthropicProvider creates a new Anthropic Claude provider.

func NewAnthropicProviderFromConfig

func NewAnthropicProviderFromConfig(cfg AnthropicConfig) *AnthropicProvider

NewAnthropicProviderFromConfig creates a provider from configuration.

func (*AnthropicProvider) ChatCompletion

func (p *AnthropicProvider) ChatCompletion(ctx context.Context, req ChatCompletionRequest) (ChatCompletionResponse, error)

ChatCompletion generates a chat completion using Claude.

func (*AnthropicProvider) Close

func (p *AnthropicProvider) Close() error

Close is a no-op for the Anthropic provider.

func (*AnthropicProvider) SupportsEmbedding

func (p *AnthropicProvider) SupportsEmbedding() bool

SupportsEmbedding returns false (Anthropic doesn't support embeddings).

func (*AnthropicProvider) SupportsTextGeneration

func (p *AnthropicProvider) SupportsTextGeneration() bool

SupportsTextGeneration returns true.

type CachingTransport

type CachingTransport struct {
	// contains filtered or unexported fields
}

CachingTransport is an http.RoundTripper that caches POST request/response pairs in a SQLite database, keyed by the SHA-256 of method + URL + request body. Only 2xx responses are cached. Cache read/write errors are non-fatal — they silently fall through to the inner transport.

func NewCachingTransport

func NewCachingTransport(dir string, inner http.RoundTripper) (*CachingTransport, error)

NewCachingTransport creates a CachingTransport that stores cached responses in a SQLite database under dir/http_cache.db. If inner is nil, http.DefaultTransport is used.

func (*CachingTransport) Close

func (t *CachingTransport) Close() error

Close closes the underlying SQLite database.

func (*CachingTransport) RoundTrip

func (t *CachingTransport) RoundTrip(req *http.Request) (*http.Response, error)

RoundTrip implements http.RoundTripper.

type ChatCompletionRequest

type ChatCompletionRequest struct {
	// contains filtered or unexported fields
}

ChatCompletionRequest represents a request for text generation.

func NewChatCompletionRequest

func NewChatCompletionRequest(messages []Message) ChatCompletionRequest

NewChatCompletionRequest creates a new ChatCompletionRequest.

func (ChatCompletionRequest) MaxTokens

func (r ChatCompletionRequest) MaxTokens() int

MaxTokens returns the max tokens setting.

func (ChatCompletionRequest) Messages

func (r ChatCompletionRequest) Messages() []Message

Messages returns the messages.

func (ChatCompletionRequest) Temperature

func (r ChatCompletionRequest) Temperature() float64

Temperature returns the temperature setting.

func (ChatCompletionRequest) WithMaxTokens

func (r ChatCompletionRequest) WithMaxTokens(n int) ChatCompletionRequest

WithMaxTokens returns a new request with the specified max tokens.

func (ChatCompletionRequest) WithTemperature

func (r ChatCompletionRequest) WithTemperature(t float64) ChatCompletionRequest

WithTemperature returns a new request with the specified temperature.

type ChatCompletionResponse

type ChatCompletionResponse struct {
	// contains filtered or unexported fields
}

ChatCompletionResponse represents a text generation response.

func NewChatCompletionResponse

func NewChatCompletionResponse(content, finishReason string, usage Usage) ChatCompletionResponse

NewChatCompletionResponse creates a new ChatCompletionResponse.

func (ChatCompletionResponse) Content

func (r ChatCompletionResponse) Content() string

Content returns the generated content.

func (ChatCompletionResponse) FinishReason

func (r ChatCompletionResponse) FinishReason() string

FinishReason returns why generation stopped.

func (ChatCompletionResponse) Usage

func (r ChatCompletionResponse) Usage() Usage

Usage returns token usage information.

type HugotEmbedding

type HugotEmbedding struct {
	// contains filtered or unexported fields
}

HugotEmbedding provides local embedding generation using the st-codesearch-distilroberta-base model via the hugot Go backend.

The model can come from two sources (checked in order):

  1. Model files on disk — a subdirectory of cacheDir containing tokenizer.json.
  2. Statically embedded in the binary (build tag embed_model), extracted to cacheDir on first use.

All instances share a single ONNX Runtime session because ORT only supports one active session per process.

func NewHugotEmbedding

func NewHugotEmbedding(cacheDir string) *HugotEmbedding

NewHugotEmbedding creates a HugotEmbedding that looks for model files in cacheDir. If no model exists on disk and the embed_model build tag was used, the embedded model is extracted to cacheDir automatically.

func (*HugotEmbedding) Available

func (h *HugotEmbedding) Available() bool

Available reports whether a usable model exists — either compiled into the binary (embed_model build tag) or present on disk in cacheDir.

func (*HugotEmbedding) Close

func (h *HugotEmbedding) Close() error

Close is a no-op. The ONNX Runtime session is process-global and shared across all HugotEmbedding instances; it is cleaned up when the process exits.

func (*HugotEmbedding) Embed

func (h *HugotEmbedding) Embed(ctx context.Context, items []search.EmbeddingItem) ([][]float64, error)

Embed generates embeddings for the given text items using the local model. Items without a text payload return an error — the underlying model is text-only.

type LocalVisionEmbedding added in v1.3.0

type LocalVisionEmbedding struct {
	// contains filtered or unexported fields
}

LocalVisionEmbedding manages a local ONNX dual-encoder vision-language model. It implements search.Embedder, dispatching each item to either the vision or text encoder based on its payload. Both encoders produce vectors in the same embedding space.

The specific model (SigLIP2, CLIP, etc.) is determined by the VisionModelConfig passed at construction time. All instances share the process-wide ORT session.

func NewLocalVisionEmbedding added in v1.3.0

func NewLocalVisionEmbedding(config VisionModelConfig, cacheDir string) *LocalVisionEmbedding

NewLocalVisionEmbedding creates a LocalVisionEmbedding for the model described by config, looking for files in cacheDir.

func (*LocalVisionEmbedding) Close added in v1.3.0

func (l *LocalVisionEmbedding) Close() error

Close is a no-op. The ORT session is process-global.

func (*LocalVisionEmbedding) Embed added in v1.3.1

func (l *LocalVisionEmbedding) Embed(ctx context.Context, items []search.EmbeddingItem) ([][]float64, error)

Embed dispatches each item to the vision or text ONNX pipeline based on which payload the item carries. Image items go through the vision encoder, text items go through the text encoder; both produce vectors in the same embedding space. Items carrying both payloads use the image encoder (the local SigLIP2 model is a dual encoder and cannot embed a combined input).

type Message

type Message struct {
	// contains filtered or unexported fields
}

Message represents a chat message.

func AssistantMessage

func AssistantMessage(content string) Message

AssistantMessage creates an assistant message.

func NewMessage

func NewMessage(role, content string) Message

NewMessage creates a new Message.

func SystemMessage

func SystemMessage(content string) Message

SystemMessage creates a system message.

func UserMessage

func UserMessage(content string) Message

UserMessage creates a user message.

func (Message) Content

func (m Message) Content() string

Content returns the message content.

func (Message) Role

func (m Message) Role() string

Role returns the message role (e.g., "system", "user", "assistant").

type OpenAIConfig

type OpenAIConfig struct {
	APIKey              string
	BaseURL             string
	ChatModel           string
	EmbeddingModel      string
	Timeout             time.Duration
	MaxRetries          int
	InitialDelay        time.Duration
	BackoffFactor       float64
	HTTPClient          *http.Client
	ExtraParams         map[string]any
	QueryInstruction    string
	DocumentInstruction string
}

OpenAIConfig holds configuration for the OpenAI provider.

type OpenAIOption

type OpenAIOption func(*OpenAIProvider)

OpenAIOption is a functional option for OpenAIProvider.

func WithBackoffFactor

func WithBackoffFactor(f float64) OpenAIOption

WithBackoffFactor sets the backoff multiplier.

func WithChatModel

func WithChatModel(model string) OpenAIOption

WithChatModel sets the chat completion model.

func WithEmbeddingModel

func WithEmbeddingModel(model string) OpenAIOption

WithEmbeddingModel sets the embedding model.

func WithInitialDelay

func WithInitialDelay(d time.Duration) OpenAIOption

WithInitialDelay sets the initial retry delay.

func WithMaxRetries

func WithMaxRetries(n int) OpenAIOption

WithMaxRetries sets the maximum retry count.

type OpenAIProvider

type OpenAIProvider struct {
	// contains filtered or unexported fields
}

OpenAIProvider implements both text generation and embedding using the OpenAI API.

func NewOpenAIProvider

func NewOpenAIProvider(apiKey string, opts ...OpenAIOption) *OpenAIProvider

NewOpenAIProvider creates a new OpenAI provider.

func NewOpenAIProviderFromConfig

func NewOpenAIProviderFromConfig(cfg OpenAIConfig) *OpenAIProvider

NewOpenAIProviderFromConfig creates a provider from configuration.

func (*OpenAIProvider) ChatCompletion

func (p *OpenAIProvider) ChatCompletion(ctx context.Context, req ChatCompletionRequest) (ChatCompletionResponse, error)

ChatCompletion generates a chat completion.

func (*OpenAIProvider) Close

func (p *OpenAIProvider) Close() error

Close is a no-op for the OpenAI provider.

func (*OpenAIProvider) Embed

func (p *OpenAIProvider) Embed(ctx context.Context, items []search.EmbeddingItem) ([][]float64, error)

Embed generates embeddings for the given text items in a single API call. Items without a text payload return an error — OpenAI text embedding endpoints do not accept image inputs.

func (*OpenAIProvider) SupportsEmbedding

func (p *OpenAIProvider) SupportsEmbedding() bool

SupportsEmbedding returns true.

func (*OpenAIProvider) SupportsTextGeneration

func (p *OpenAIProvider) SupportsTextGeneration() bool

SupportsTextGeneration returns true.

type OpenAIVisionProvider added in v1.3.1

type OpenAIVisionProvider struct {
	// contains filtered or unexported fields
}

OpenAIVisionProvider embeds text or image inputs via an OpenAI-compatible vision-language embedding API (e.g. Qwen3-VL-Embedding). It implements Embedder and uses the vLLM "messages" format for all inputs so that the model's chat template is applied consistently across modalities.

func NewOpenAIVisionProvider added in v1.3.1

func NewOpenAIVisionProvider(cfg OpenAIConfig) *OpenAIVisionProvider

NewOpenAIVisionProvider creates a provider from configuration.

func (*OpenAIVisionProvider) Close added in v1.3.1

func (p *OpenAIVisionProvider) Close() error

Close is a no-op for the remote provider.

func (*OpenAIVisionProvider) Embed added in v1.3.1

func (p *OpenAIVisionProvider) Embed(ctx context.Context, items []search.EmbeddingItem) ([][]float64, error)

Embed sends each item to the remote API using the vLLM "messages" format. Both text and image items are sent as chat messages because Qwen3-VL-Embedding applies a chat template that must be consistent across modalities for cross-modal search to work. Sending text queries via the plain "input" field would bypass the chat template, placing them in a different embedding space than image embeddings.

type ProviderError

type ProviderError struct {
	// contains filtered or unexported fields
}

ProviderError wraps provider errors with additional context.

func NewProviderError

func NewProviderError(operation string, statusCode int, message string, cause error) *ProviderError

NewProviderError creates a new ProviderError.

func (*ProviderError) Error

func (e *ProviderError) Error() string

Error implements the error interface.

func (*ProviderError) IsContextTooLong

func (e *ProviderError) IsContextTooLong() bool

IsContextTooLong returns true if the error is due to context length.

func (*ProviderError) IsRateLimited

func (e *ProviderError) IsRateLimited() bool

IsRateLimited returns true if the error is due to rate limiting.

func (*ProviderError) Message

func (e *ProviderError) Message() string

Message returns the error message.

func (*ProviderError) Operation

func (e *ProviderError) Operation() string

Operation returns the operation that failed.

func (*ProviderError) StatusCode

func (e *ProviderError) StatusCode() int

StatusCode returns the HTTP status code if available.

func (*ProviderError) Unwrap

func (e *ProviderError) Unwrap() error

Unwrap returns the underlying cause.

type TextGenerator

type TextGenerator interface {
	// ChatCompletion generates a text completion for the given messages.
	ChatCompletion(ctx context.Context, req ChatCompletionRequest) (ChatCompletionResponse, error)
}

TextGenerator generates text completions.

type Usage

type Usage struct {
	// contains filtered or unexported fields
}

Usage represents token usage information.

func NewUsage

func NewUsage(prompt, completion, total int) Usage

NewUsage creates a new Usage.

func (Usage) CompletionTokens

func (u Usage) CompletionTokens() int

CompletionTokens returns the number of completion tokens.

func (Usage) PromptTokens

func (u Usage) PromptTokens() int

PromptTokens returns the number of prompt tokens.

func (Usage) TotalTokens

func (u Usage) TotalTokens() int

TotalTokens returns the total number of tokens.

type VisionModelConfig added in v1.3.0

type VisionModelConfig struct {
	// ModelDir is the subdirectory name under the models cache directory
	// (e.g. "google_siglip2-base-patch16-512").
	ModelDir string

	// VisionOnnx is the ONNX filename for the vision encoder
	// (e.g. "vision_model.onnx").
	VisionOnnx string

	// TextOnnx is the ONNX filename for the text encoder
	// (e.g. "text_model.onnx").
	TextOnnx string

	// ImageSize is the target height/width in pixels after resize and crop
	// (e.g. 512 for SigLIP2, 224 for CLIP ViT-B/32).
	ImageSize int

	// ImageMean is the per-channel normalization mean applied after rescaling
	// pixel values to [0, 1].
	ImageMean [3]float32

	// ImageStd is the per-channel normalization standard deviation.
	ImageStd [3]float32

	// VisionOutputName selects which model output to use for embeddings
	// (e.g. "pooler_output"). Empty string uses the first output.
	VisionOutputName string
}

VisionModelConfig describes how to load and preprocess images for a specific local ONNX vision-language model. Different models (SigLIP2, CLIP, etc.) provide different configs; the runtime code is shared.
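A hypothetical config for a CLIP ViT-B/32 ONNX export, mirroring SigLIP2BaseConfig above. The directory and ONNX filenames are placeholders; the mean/std values are the standard CLIP preprocessing constants:

```go
var clipBase32Config = VisionModelConfig{
	ModelDir:   "openai_clip-vit-base-patch32", // placeholder directory name
	VisionOnnx: "vision_model.onnx",            // placeholder filename
	TextOnnx:   "text_model.onnx",              // placeholder filename
	ImageSize:  224,
	ImageMean:  [3]float32{0.48145466, 0.4578275, 0.40821073},
	ImageStd:   [3]float32{0.26862954, 0.26130258, 0.27577711},
	// VisionOutputName left empty: use the model's first output.
}
```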
