modelrepo

package

Version: v0.30.0 (latest) Published: Jun 14, 2026 License: Apache-2.0 Imports: 16 Imported by: 0

Repository

github.com/contenox/runtime

Documentation ¶

Overview ¶

Package modelrepo defines the provider-facing contracts for LLM backends: the Provider interface (capabilities + client factories), the per-capability client interfaces (LLMPromptExecClient, LLMChatClient, LLMEmbedClient, LLMStreamClient), and the shared request/response types (Message, ChatResult, StreamParcel, Tool, ChatArgument).

Concrete providers live in subpackages (for example openai, gemini, vertex, vllm, ollama, and local). Higher-level code such as llmrepo and runtimestate depends only on the interfaces declared here; provider subpackages are imported for their side effects, so that they register their catalogs with runtimestate.

Index ¶

Variables
func ClampMaxOutputTokens(requested, ceiling int) (int, bool)
func ClampMaxOutputTokensPtr(tokens *int, ceiling int) *int
func RegisterCatalogProvider(backendType string, constructor CatalogProviderConstructor)
func SetupOllamaLocalInstance(ctx context.Context, tag string) (string, testcontainers.Container, func(), error)
func SetupVLLMLocalInstance(ctx context.Context, model string, tag string, toolParser string) (string, testcontainers.Container, func(), error)
type BackendSpec
type CapabilityConfig
type CatalogFactory
- func DefaultCatalogFactory() CatalogFactory
type CatalogOption
- func WithCatalogHTTPClient(client *http.Client) CatalogOption
- func WithCatalogTracker(tracker libtracker.ActivityTracker) CatalogOption
type CatalogOptions
type CatalogProvider
- func NewCatalogProvider(spec BackendSpec, opts ...CatalogOption) (CatalogProvider, error)
type CatalogProviderConstructor
type ChatArgument
- func WithMaxTokens(tokens int) ChatArgument
- func WithSeed(seed int) ChatArgument
- func WithTemperature(temp float64) ChatArgument
- func WithTool(tool Tool) ChatArgument
- func WithTools(tools ...Tool) ChatArgument
- func WithTopP(p float64) ChatArgument
type ChatConfig
type ChatResult
type FunctionTool
type LLMChatClient
type LLMEmbedClient
type LLMPromptExecClient
type LLMStreamClient
type Message
type MockChatClient
- func (m *MockChatClient) Chat(ctx context.Context, messages []Message, opts ...ChatArgument) (ChatResult, error)
- func (m *MockChatClient) Close() error
type MockEmbedClient
- func (m *MockEmbedClient) Close() error
- func (m *MockEmbedClient) Embed(ctx context.Context, prompt string) ([]float64, error)
type MockPromptClient
- func (m *MockPromptClient) Close() error
- func (m *MockPromptClient) Prompt(ctx context.Context, systemInstruction string, temperature float32, ...) (string, error)
type MockProvider
- func (m *MockProvider) CanChat() bool
- func (m *MockProvider) CanEmbed() bool
- func (m *MockProvider) CanPrompt() bool
- func (m *MockProvider) CanStream() bool
- func (m *MockProvider) CanThink() bool
- func (m *MockProvider) GetBackendIDs() []string
- func (m *MockProvider) GetChatConnection(ctx context.Context, backendID string) (LLMChatClient, error)
- func (m *MockProvider) GetContextLength() int
- func (m *MockProvider) GetEmbedConnection(ctx context.Context, backendID string) (LLMEmbedClient, error)
- func (m *MockProvider) GetID() string
- func (m *MockProvider) GetMaxOutputTokens() int
- func (m *MockProvider) GetPromptConnection(ctx context.Context, backendID string) (LLMPromptExecClient, error)
- func (m *MockProvider) GetStreamConnection(ctx context.Context, backendID string) (LLMStreamClient, error)
- func (m *MockProvider) GetType() string
- func (m *MockProvider) ModelName() string
type MockStreamClient
- func (m *MockStreamClient) Close() error
- func (m *MockStreamClient) Stream(ctx context.Context, messages []Message, args ...ChatArgument) (<-chan *StreamParcel, error)
type ObservedModel
type Provider
type StreamParcel
type Tool
type ToolCall
type WithShift
- func (WithShift) Apply(cfg *ChatConfig)
type WithThink
- func (w WithThink) Apply(cfg *ChatConfig)

Constants ¶

This section is empty.

Variables ¶

var ErrNotSupported = errors.New("operation not supported")

ErrNotSupported is returned when an operation is not supported.

var ErrRefused = errors.New("model refused the request")

ErrRefused is returned when the model refuses to generate a response (stop_reason == "refusal"), typically due to a safety filter.

Functions ¶

func ClampMaxOutputTokens ¶ added in v0.29.0

func ClampMaxOutputTokens(requested, ceiling int) (int, bool)

ClampMaxOutputTokens returns the effective output-token request after applying a provider ceiling. A ceiling of 0 means unknown, so no clamp is applied.
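The clamping rule described above can be sketched as follows. This is a minimal illustration and not the package's source: the lowercase clampMaxOutputTokens is a hypothetical stand-in, and the meaning given to the boolean result (true when the request was reduced) is an assumption, since the documentation does not describe it.

```go
package main

import "fmt"

// clampMaxOutputTokens is a hypothetical stand-in for ClampMaxOutputTokens.
// Assumption: the bool reports whether the request was reduced.
func clampMaxOutputTokens(requested, ceiling int) (int, bool) {
	// A ceiling of 0 means "unknown" (negative values are treated the
	// same way in this sketch), so the request passes through unchanged.
	if ceiling <= 0 || requested <= ceiling {
		return requested, false
	}
	return ceiling, true
}

func main() {
	fmt.Println(clampMaxOutputTokens(4096, 1024)) // request exceeds the ceiling
	fmt.Println(clampMaxOutputTokens(512, 1024))  // request already within the ceiling
	fmt.Println(clampMaxOutputTokens(4096, 0))    // unknown ceiling, no clamp
}
```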

func ClampMaxOutputTokensPtr ¶ added in v0.29.0

func ClampMaxOutputTokensPtr(tokens *int, ceiling int) *int

ClampMaxOutputTokensPtr copies tokens and applies ClampMaxOutputTokens. Returning a fresh pointer avoids mutating ChatConfig values captured by args.
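The copy-then-clamp behaviour can be illustrated with a small sketch. The helper clampPtr is a hypothetical stand-in written for this example; in particular, passing nil through unchanged is an assumption, not documented behaviour.

```go
package main

import "fmt"

// clampPtr is a hypothetical stand-in for ClampMaxOutputTokensPtr.
func clampPtr(tokens *int, ceiling int) *int {
	if tokens == nil {
		return nil // assumption: no request means nothing to clamp
	}
	v := *tokens // copy, so the caller's value is never mutated
	if ceiling > 0 && v > ceiling {
		v = ceiling
	}
	return &v // fresh pointer, independent of the input
}

func main() {
	requested := 8192
	effective := clampPtr(&requested, 2048)
	// The original value is untouched; only the copy is clamped.
	fmt.Println(requested, *effective)
}
```

Returning a new pointer matters when the same ChatConfig value is reused across calls to providers with different ceilings: each call sees its own clamped copy.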

func RegisterCatalogProvider ¶

func RegisterCatalogProvider(backendType string, constructor CatalogProviderConstructor)

RegisterCatalogProvider registers a backend catalog implementation by type. Vendor packages call this from init(), so modelrepo never has to import the vendor packages, which would create an import cycle.
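The registration pattern described above is the usual Go registry idiom. The sketch below shows the general shape under stated assumptions: every type and function in it (backendSpec, catalogProvider, register, newCatalogProvider, demoCatalog) is a local, hypothetical stand-in, and the mutex-guarded map is one plausible implementation rather than the package's actual one.

```go
package main

import (
	"fmt"
	"sync"
)

// Local stand-ins for the package's types, trimmed for brevity.
type backendSpec struct{ Type, BaseURL string }

type catalogProvider interface{ Type() string }

type constructor func(spec backendSpec) (catalogProvider, error)

var (
	mu       sync.RWMutex
	registry = map[string]constructor{}
)

// register plays the role of RegisterCatalogProvider.
func register(backendType string, c constructor) {
	mu.Lock()
	defer mu.Unlock()
	registry[backendType] = c
}

// newCatalogProvider looks up the constructor registered for spec.Type.
func newCatalogProvider(spec backendSpec) (catalogProvider, error) {
	mu.RLock()
	c, ok := registry[spec.Type]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("no catalog provider registered for %q", spec.Type)
	}
	return c(spec)
}

// demoCatalog is a hypothetical vendor implementation.
type demoCatalog struct{}

func (demoCatalog) Type() string { return "demo" }

// A vendor package would perform this registration in its own init().
func init() {
	register("demo", func(backendSpec) (catalogProvider, error) {
		return demoCatalog{}, nil
	})
}

func main() {
	p, err := newCatalogProvider(backendSpec{Type: "demo"})
	if err == nil {
		fmt.Println("resolved:", p.Type())
	}
	_, err = newCatalogProvider(backendSpec{Type: "missing"})
	fmt.Println("error:", err)
}
```

With this layout the core package only knows the constructor signature; a program opts in to a backend by blank-importing the vendor package so that its init() runs.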

func SetupOllamaLocalInstance ¶

func SetupOllamaLocalInstance(ctx context.Context, tag string) (string, testcontainers.Container, func(), error)

func SetupVLLMLocalInstance ¶

func SetupVLLMLocalInstance(ctx context.Context, model string, tag string, toolParser string) (string, testcontainers.Container, func(), error)

SetupVLLMLocalInstance creates a vLLM container for testing.

Types ¶

type BackendSpec ¶

type BackendSpec struct {
	Type    string
	BaseURL string
	APIKey  string
}

BackendSpec is the runtime-independent input needed to talk to a model catalog. It deliberately excludes DB/KV concerns; callers resolve those before construction.

type CapabilityConfig ¶

type CapabilityConfig struct {
	ContextLength int
	// MaxOutputTokens is the provider's hard ceiling on output tokens.
	// Leave as 0 when unknown; the client will not clamp.
	MaxOutputTokens int
	CanChat         bool
	CanEmbed        bool
	CanStream       bool
	CanPrompt       bool
	CanThink        bool
}

type CatalogFactory ¶

type CatalogFactory interface {
	NewCatalogProvider(spec BackendSpec, opts ...CatalogOption) (CatalogProvider, error)
}

CatalogFactory constructs CatalogProvider implementations from backend specs.

func DefaultCatalogFactory ¶

func DefaultCatalogFactory() CatalogFactory

DefaultCatalogFactory returns the registry-backed factory used by runtimestate.

type CatalogOption ¶

type CatalogOption func(*CatalogOptions)

CatalogOption mutates CatalogOptions before a provider is constructed.

func WithCatalogHTTPClient ¶

func WithCatalogHTTPClient(client *http.Client) CatalogOption

WithCatalogHTTPClient overrides the HTTP client used for observation and Provider construction.

func WithCatalogTracker ¶

func WithCatalogTracker(tracker libtracker.ActivityTracker) CatalogOption

WithCatalogTracker injects the tracker used by ProviderFor when building execution Providers.

type CatalogOptions ¶

type CatalogOptions struct {
	HTTPClient *http.Client
	Tracker    libtracker.ActivityTracker
}

CatalogOptions carries optional construction dependencies used by vendor implementations.

type CatalogProvider ¶

type CatalogProvider interface {
	Type() string
	ListModels(ctx context.Context) ([]ObservedModel, error)
	ProviderFor(model ObservedModel) Provider
}

CatalogProvider observes the models exposed by one backend instance and can turn an observed model into the existing execution Provider abstraction.

func NewCatalogProvider ¶

func NewCatalogProvider(spec BackendSpec, opts ...CatalogOption) (CatalogProvider, error)

NewCatalogProvider constructs a registry-backed catalog provider.

type CatalogProviderConstructor ¶

type CatalogProviderConstructor func(spec BackendSpec, opts CatalogOptions) (CatalogProvider, error)

CatalogProviderConstructor is the constructor signature that vendor packages implement and register through RegisterCatalogProvider.

type ChatArgument ¶

type ChatArgument interface {
	Apply(config *ChatConfig)
}

func WithMaxTokens ¶

func WithMaxTokens(tokens int) ChatArgument

func WithSeed ¶

func WithSeed(seed int) ChatArgument

func WithTemperature ¶

func WithTemperature(temp float64) ChatArgument

func WithTool ¶

func WithTool(tool Tool) ChatArgument

func WithTools ¶

func WithTools(tools ...Tool) ChatArgument

func WithTopP ¶

func WithTopP(p float64) ChatArgument

type ChatConfig ¶

type ChatConfig struct {
	Temperature *float64 `json:"temperature,omitempty"`
	MaxTokens   *int     `json:"max_tokens,omitempty"`
	TopP        *float64 `json:"top_p,omitempty"`
	Seed        *int     `json:"seed,omitempty"`
	Tools       []Tool   `json:"tools,omitempty"`
	// Think controls reasoning-model behaviour. nil = use provider default.
	// Normalized values are auto, off, minimal, low, medium, high, and xhigh.
	Think *string `json:"think,omitempty"`
	// Shift instructs the provider to slide the context window on overflow
	// instead of returning a token-limit error.
	Shift *bool `json:"shift,omitempty"`
	// Truncate instructs the provider to truncate history on overflow.
	Truncate *bool `json:"truncate,omitempty"`
}

type ChatResult ¶

type ChatResult struct {
	Message   Message
	ToolCalls []ToolCall
}

type FunctionTool ¶

type FunctionTool struct {
	Name        string      `json:"name"`
	Description string      `json:"description,omitempty"`
	Parameters  interface{} `json:"parameters,omitempty"`
}

type LLMChatClient ¶

type LLMChatClient interface {
	Chat(ctx context.Context, messages []Message, args ...ChatArgument) (ChatResult, error)
}

LLMChatClient is the first of the per-capability client interfaces.

type LLMEmbedClient ¶

type LLMEmbedClient interface {
	Embed(ctx context.Context, prompt string) ([]float64, error)
}

type LLMPromptExecClient ¶

type LLMPromptExecClient interface {
	Prompt(ctx context.Context, systemInstruction string, temperature float32, prompt string) (string, error)
}

type LLMStreamClient ¶

type LLMStreamClient interface {
	Stream(ctx context.Context, messages []Message, args ...ChatArgument) (<-chan *StreamParcel, error)
}

type Message ¶

type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
	// Thinking contains the model's internal reasoning trace (thinking tokens).
	// Only populated when thinking is enabled. Never sent back to the model.
	Thinking string `json:"thinking,omitempty"`

	// For tool calling (OpenAI / vLLM compatible).
	ToolCalls  []ToolCall `json:"tool_calls,omitempty"`
	ToolCallID string     `json:"tool_call_id,omitempty"`
}

Message now supports OpenAI-style tool calling:
- assistant messages can carry tool_calls
- tool messages can carry tool_call_id

type MockChatClient ¶

type MockChatClient struct{}

MockChatClient is a mock implementation of LLMChatClient for testing.

func (*MockChatClient) Chat ¶

func (m *MockChatClient) Chat(ctx context.Context, messages []Message, opts ...ChatArgument) (ChatResult, error)

Chat returns a mock response.

func (*MockChatClient) Close ¶

func (m *MockChatClient) Close() error

Close is a no-op for the mock client.

type MockEmbedClient ¶

type MockEmbedClient struct{}

MockEmbedClient is a mock implementation of LLMEmbedClient for testing.

func (*MockEmbedClient) Close ¶

func (m *MockEmbedClient) Close() error

Close is a no-op for the mock client.

func (*MockEmbedClient) Embed ¶

func (m *MockEmbedClient) Embed(ctx context.Context, prompt string) ([]float64, error)

Embed returns a mock embedding.

type MockPromptClient ¶

type MockPromptClient struct{}

MockPromptClient is a mock implementation of LLMPromptExecClient for testing.

func (*MockPromptClient) Close ¶

func (m *MockPromptClient) Close() error

Close is a no-op for the mock client.

func (*MockPromptClient) Prompt ¶

func (m *MockPromptClient) Prompt(ctx context.Context, systemInstruction string, temperature float32, prompt string) (string, error)

Prompt returns a mock response.

type MockProvider ¶

type MockProvider struct {
	ID              string
	Name            string
	ContextLength   int
	MaxOutputTokens int
	CanChatFlag     bool
	CanEmbedFlag    bool
	CanStreamFlag   bool
	CanPromptFlag   bool
	Backends        []string
}

MockProvider is a mock implementation of the Provider interface for testing.

func (*MockProvider) CanChat ¶

func (m *MockProvider) CanChat() bool

CanChat returns whether the mock provider can chat.

func (*MockProvider) CanEmbed ¶

func (m *MockProvider) CanEmbed() bool

CanEmbed returns whether the mock provider can embed.

func (*MockProvider) CanPrompt ¶

func (m *MockProvider) CanPrompt() bool

CanPrompt returns whether the mock provider can prompt.

func (*MockProvider) CanStream ¶

func (m *MockProvider) CanStream() bool

CanStream returns whether the mock provider can stream.

func (*MockProvider) CanThink ¶

func (m *MockProvider) CanThink() bool

CanThink returns whether the mock provider can think.

func (*MockProvider) GetBackendIDs ¶

func (m *MockProvider) GetBackendIDs() []string

GetBackendIDs returns the backend IDs for the mock provider.

func (*MockProvider) GetChatConnection ¶

func (m *MockProvider) GetChatConnection(ctx context.Context, backendID string) (LLMChatClient, error)

GetChatConnection returns a mock chat client.

func (*MockProvider) GetContextLength ¶

func (m *MockProvider) GetContextLength() int

GetContextLength returns the context length for the mock provider.

func (*MockProvider) GetEmbedConnection ¶

func (m *MockProvider) GetEmbedConnection(ctx context.Context, backendID string) (LLMEmbedClient, error)

GetEmbedConnection returns a mock embed client.

func (*MockProvider) GetID ¶

func (m *MockProvider) GetID() string

GetID returns the ID for the mock provider.

func (*MockProvider) GetMaxOutputTokens ¶ added in v0.29.0

func (m *MockProvider) GetMaxOutputTokens() int

GetMaxOutputTokens returns the max output tokens ceiling for the mock provider.

func (*MockProvider) GetPromptConnection ¶

func (m *MockProvider) GetPromptConnection(ctx context.Context, backendID string) (LLMPromptExecClient, error)

GetPromptConnection returns a mock prompt client.

func (*MockProvider) GetStreamConnection ¶

func (m *MockProvider) GetStreamConnection(ctx context.Context, backendID string) (LLMStreamClient, error)

GetStreamConnection returns a mock stream client.

func (*MockProvider) GetType ¶

func (m *MockProvider) GetType() string

GetType returns the provider type for the mock provider.

func (*MockProvider) ModelName ¶

func (m *MockProvider) ModelName() string

ModelName returns the model name for the mock provider.

type MockStreamClient ¶

type MockStreamClient struct{}

MockStreamClient is a mock implementation of LLMStreamClient for testing.

func (*MockStreamClient) Close ¶

func (m *MockStreamClient) Close() error

Close is a no-op for the mock client.

func (*MockStreamClient) Stream ¶

func (m *MockStreamClient) Stream(ctx context.Context, messages []Message, args ...ChatArgument) (<-chan *StreamParcel, error)

Stream returns a channel with mock stream parcels.

type ObservedModel ¶

type ObservedModel struct {
	Name          string
	ContextLength int
	ModifiedAt    time.Time
	Size          int64
	Digest        string
	CapabilityConfig
	Meta map[string]string
}

ObservedModel is the normalized result of listing models from a backend. Name is the provider-facing model identifier used for selection and execution.

type Provider ¶

type Provider interface {
	GetBackendIDs() []string
	ModelName() string
	GetID() string
	GetType() string
	GetContextLength() int
	// GetMaxOutputTokens returns the provider's hard ceiling on output tokens
	// (maxOutputTokens / max_tokens / max_completion_tokens in the wire format).
	// Returns 0 when the ceiling is unknown or effectively unlimited.
	GetMaxOutputTokens() int
	CanChat() bool
	CanEmbed() bool
	CanStream() bool
	CanPrompt() bool
	CanThink() bool
	GetChatConnection(ctx context.Context, backendID string) (LLMChatClient, error)
	GetPromptConnection(ctx context.Context, backendID string) (LLMPromptExecClient, error)
	GetEmbedConnection(ctx context.Context, backendID string) (LLMEmbedClient, error)
	GetStreamConnection(ctx context.Context, backendID string) (LLMStreamClient, error)
}

type StreamParcel ¶

type StreamParcel struct {
	Data string
	// Thinking carries a streamed reasoning/thinking delta separate from the
	// visible output text. Like Message.Thinking, it is provider-facing output
	// and must never be sent back as conversation history.
	Thinking string
	Error    error
}

type Tool ¶

type Tool struct {
	Type     string        `json:"type"`
	Function *FunctionTool `json:"function,omitempty"`
}

type ToolCall ¶

type ToolCall struct {
	ID       string `json:"id,omitempty"`
	Type     string `json:"type"` // only "function" for now
	Function struct {
		Name      string `json:"name"`
		Arguments string `json:"arguments"`
	} `json:"function"`
	// ProviderMeta carries opaque provider-specific data that must be
	// round-tripped back on the next turn (e.g. Gemini thought_signature).
	ProviderMeta map[string]string `json:"provider_meta,omitempty"`
}
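Handling a ToolCall usually means decoding its Arguments string into a typed value before dispatching. The sketch below assumes, as in the OpenAI wire format, that Arguments holds a JSON-encoded object; weatherArgs, decodeWeatherArgs, and the get_weather tool are hypothetical, and ProviderMeta is omitted from the local copy of the struct.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall is a local copy of ToolCall without ProviderMeta.
type toolCall struct {
	ID       string `json:"id,omitempty"`
	Type     string `json:"type"`
	Function struct {
		Name      string `json:"name"`
		Arguments string `json:"arguments"`
	} `json:"function"`
}

// weatherArgs is a hypothetical argument schema for a "get_weather" tool.
type weatherArgs struct {
	City string `json:"city"`
}

// decodeWeatherArgs validates the call type and parses the JSON arguments.
func decodeWeatherArgs(tc toolCall) (weatherArgs, error) {
	var args weatherArgs
	if tc.Type != "function" {
		return args, fmt.Errorf("unsupported tool call type %q", tc.Type)
	}
	err := json.Unmarshal([]byte(tc.Function.Arguments), &args)
	return args, err
}

func main() {
	var tc toolCall
	tc.ID, tc.Type = "call_1", "function"
	tc.Function.Name = "get_weather"
	tc.Function.Arguments = `{"city":"Oslo"}`

	args, err := decodeWeatherArgs(tc)
	fmt.Println(args.City, err)
}
```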

type WithShift ¶

type WithShift struct{}

WithShift is a ChatArgument that enables context shift on overflow.

func (WithShift) Apply ¶

func (WithShift) Apply(cfg *ChatConfig)

type WithThink ¶

type WithThink string

WithThink is a ChatArgument that enables/controls reasoning mode.

func (WithThink) Apply ¶

func (w WithThink) Apply(cfg *ChatConfig)

Directories ¶

Path	Synopsis
anthropic	Package anthropic is a direct (non-Vertex) provider for the Anthropic API (api.anthropic.com), which speaks the Messages API.
bedrock	Package bedrock is a provider for AWS Bedrock via the unified Converse API.
codec
codec/chatcompletions	Package chatcompletions is a transport-agnostic codec for the OpenAI Chat Completions wire format (`/chat/completions`-style request/response and SSE streaming).
codec/messages	Package messages is a transport-agnostic codec for Anthropic's Messages API wire format (request, content-block response, and named-SSE-event streaming).
gemini	Package gemini implements the modelrepo.Provider contract against Google's Gemini Generative Language API.
local	Package local implements the modelrepo.Provider contract for in-process inference using llama.cpp via github.com/ollama/ollama/llama (CGo).
mistral	Package mistral is a direct (non-Vertex) provider for the Mistral API (api.mistral.ai), which speaks the OpenAI-compatible chat/completions format.
ollama	Package ollama implements the modelrepo.Provider contract against Ollama HTTP endpoints.
openai	Package openai implements the modelrepo.Provider contract against the OpenAI HTTP API and OpenAI-compatible endpoints.
openrouter	Package openrouter is a catalog provider for OpenRouter (openrouter.ai), which exposes 300+ models from many providers through a single OpenAI-compatible endpoint.
vertex	Package vertex implements the modelrepo.Provider contract against Google Vertex AI publisher endpoints, using OAuth bearer tokens minted from service-account credentials.
vllm	Package vllm implements the modelrepo.Provider contract against vLLM OpenAI-compatible HTTP endpoints.