modeld

package
v0.32.0
Warning

This package is not in the latest version of its module.

Published: Jun 17, 2026 License: Apache-2.0 Imports: 14 Imported by: 0

Documentation

Overview

Package modeld is the split-out, transport-facing owner of the model repository. It defines the provider-facing contracts for LLM backends: the Provider interface (capabilities + client factories), the per-capability client interfaces (LLMPromptExecClient, LLMChatClient, LLMEmbedClient, LLMStreamClient), and the shared request/response types (Message, ChatResult, StreamParcel, Tool, ChatArgument).

Concrete providers live in subpackages (openai, gemini, vertex, vllm, ollama, local). Higher-level code depends only on the interfaces declared here; provider subpackages are imported for their side effects to register catalogs.

Daemon (see daemon.go) is the in-process singleton that owns backend state behind a mutex, and the transport subpackage exposes that state over the wire (gRPC today, HTTP later).

Index

Constants

This section is empty.

Variables

var ErrNotSupported = errors.New("operation not supported")

ErrNotSupported is returned when an operation is not supported.

var ErrRefused = errors.New("model refused the request")

ErrRefused is returned when the model refuses to generate a response (stop_reason == "refusal"), typically due to a safety filter.
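As a rough sketch of how a caller might branch on these sentinels, the example below uses errors.Is. The two variables are local stand-ins that mirror the declarations above, classify is a hypothetical helper, and the assumption that providers wrap the sentinels with %w is not confirmed by this page.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins mirroring the sentinel errors documented above.
var (
	ErrNotSupported = errors.New("operation not supported")
	ErrRefused      = errors.New("model refused the request")
)

// classify is a hypothetical helper that maps an error to a coarse
// handling decision. errors.Is also matches wrapped sentinels.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrRefused):
		return "refused"
	case errors.Is(err, ErrNotSupported):
		return "unsupported"
	default:
		return "failed"
	}
}

func main() {
	// Assumption: a provider may wrap the sentinel with extra context.
	wrapped := fmt.Errorf("chat on backend %q: %w", "b1", ErrRefused)
	fmt.Println(classify(wrapped))         // refused
	fmt.Println(classify(ErrNotSupported)) // unsupported
	fmt.Println(classify(nil))             // ok
}
```

Comparing with errors.Is rather than == keeps the check valid if a provider adds context to the error.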

Functions

func CanonicalBackendType

func CanonicalBackendType(backendType string) string

CanonicalBackendType maps compatibility backend keywords to the implementation type used by the runtime. "local" is the historical embedded GGUF keyword; the implementation now lives under the feature-complete "llama" provider.

func ClampMaxOutputTokens

func ClampMaxOutputTokens(requested, ceiling int) (int, bool)

ClampMaxOutputTokens returns the effective output-token request after applying a provider ceiling. A ceiling of 0 means unknown, so no clamp is applied.
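One plausible reading of this contract is sketched below. It is not the package source: clampMaxOutputTokens is a local stand-in, and the meaning of the boolean (true when the request was reduced) is an assumption.

```go
package main

import "fmt"

// clampMaxOutputTokens sketches the documented behaviour.
// Assumption: the bool reports whether the request was reduced.
func clampMaxOutputTokens(requested, ceiling int) (int, bool) {
	// A ceiling of 0 means "unknown", so the request passes through.
	if ceiling <= 0 || requested <= ceiling {
		return requested, false
	}
	return ceiling, true
}

func main() {
	fmt.Println(clampMaxOutputTokens(8192, 4096)) // 4096 true
	fmt.Println(clampMaxOutputTokens(1024, 4096)) // 1024 false
	fmt.Println(clampMaxOutputTokens(8192, 0))    // 8192 false
}
```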

func ClampMaxOutputTokensPtr

func ClampMaxOutputTokensPtr(tokens *int, ceiling int) *int

ClampMaxOutputTokensPtr copies tokens and applies ClampMaxOutputTokens. Returning a fresh pointer avoids mutating ChatConfig values captured by args.
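The fresh-pointer behaviour can be illustrated with a small sketch. clampMaxOutputTokensPtr here is a local stand-in, and returning nil for a nil input is an assumption rather than documented behaviour.

```go
package main

import "fmt"

// clampMaxOutputTokensPtr sketches the copy-then-clamp idea.
// Assumption: a nil input yields a nil result.
func clampMaxOutputTokensPtr(tokens *int, ceiling int) *int {
	if tokens == nil {
		return nil
	}
	v := *tokens // copy, so the caller's value is never written to
	if ceiling > 0 && v > ceiling {
		v = ceiling
	}
	return &v
}

func main() {
	requested := 8192
	clamped := clampMaxOutputTokensPtr(&requested, 4096)
	fmt.Println(*clamped, requested)                       // 4096 8192
	fmt.Println(clampMaxOutputTokensPtr(nil, 4096) == nil) // true
}
```

Because the result points at a copy, a config value shared between calls keeps its original setting.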

func RegisterCatalogProvider

func RegisterCatalogProvider(backendType string, constructor CatalogProviderConstructor)

RegisterCatalogProvider registers a backend catalog implementation by type. Vendor packages call this from init() to avoid import cycles from modelrepo -> vendor packages.
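The registration pattern described above might look roughly like the following. All names are local stand-ins with simplified signatures (the real constructor also receives CatalogOptions), and the registry and vendor code would normally live in separate packages rather than one file.

```go
package main

import (
	"fmt"
	"sync"
)

// Simplified stand-ins for BackendSpec, CatalogProvider and the constructor type.
type backendSpec struct{ Type, BaseURL, APIKey string }

type catalogProvider interface{ Type() string }

type constructor func(spec backendSpec) (catalogProvider, error)

var (
	mu       sync.RWMutex
	registry = map[string]constructor{}
)

// register stores a constructor under a backend type.
func register(backendType string, c constructor) {
	mu.Lock()
	defer mu.Unlock()
	registry[backendType] = c
}

// newCatalogProvider looks up the constructor for spec.Type and calls it.
func newCatalogProvider(spec backendSpec) (catalogProvider, error) {
	mu.RLock()
	c, ok := registry[spec.Type]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("no catalog provider registered for backend type %q", spec.Type)
	}
	return c(spec)
}

// fakeCatalog plays the role of a vendor implementation.
type fakeCatalog struct{ spec backendSpec }

func (f fakeCatalog) Type() string { return f.spec.Type }

// In a real layout this init would sit in the vendor package, so the
// registry package never has to import the vendor.
func init() {
	register("fake", func(spec backendSpec) (catalogProvider, error) {
		return fakeCatalog{spec: spec}, nil
	})
}

func main() {
	p, err := newCatalogProvider(backendSpec{Type: "fake"})
	fmt.Println(p.Type(), err) // fake <nil>
	_, err = newCatalogProvider(backendSpec{Type: "missing"})
	fmt.Println(err != nil) // true
}
```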

func RegisterShutdownHook

func RegisterShutdownHook(fn func() error)

RegisterShutdownHook registers fn to be run by Shutdown. It is intended to be called from a backend package's init(). A nil fn is ignored.

func SetupVLLMLocalInstance

func SetupVLLMLocalInstance(ctx context.Context, model string, tag string, toolParser string) (string, testcontainers.Container, func(), error)

SetupVLLMLocalInstance creates a vLLM container for testing.

func Shutdown

func Shutdown() error

Shutdown runs every registered shutdown hook and returns the first error, if any. All hooks run even if an earlier one fails. It is safe to call when no hooks are registered.
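The "run everything, report the first failure" behaviour can be sketched as below. hookRegistry is a hypothetical local type (the package itself exposes package-level functions), and running hooks in registration order is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// hookRegistry is a hypothetical stand-in for the package-level hook list.
type hookRegistry struct {
	mu    sync.Mutex
	hooks []func() error
}

// Register adds fn; a nil fn is ignored, as documented.
func (r *hookRegistry) Register(fn func() error) {
	if fn == nil {
		return
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.hooks = append(r.hooks, fn)
}

// Shutdown runs every hook and returns the first error, if any.
func (r *hookRegistry) Shutdown() error {
	r.mu.Lock()
	pending := append([]func() error(nil), r.hooks...)
	r.mu.Unlock()

	var first error
	for _, fn := range pending {
		// Keep going after a failure so later hooks still run.
		if err := fn(); err != nil && first == nil {
			first = err
		}
	}
	return first
}

// Example errors used only by this sketch.
var (
	errDiskBusy   = errors.New("disk busy")
	errCacheFlush = errors.New("cache flush failed")
)

func main() {
	var r hookRegistry
	ran := 0
	r.Register(nil)
	r.Register(func() error { ran++; return nil })
	r.Register(func() error { ran++; return errDiskBusy })
	r.Register(func() error { ran++; return errCacheFlush })
	fmt.Println(ran, r.Shutdown()) // 0 disk busy
}
```

Note that ran is read before Shutdown executes in the final Println call, which is why the sketch prints 0 there; reading it after the call would show 3.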

Types

type BackendSpec

type BackendSpec struct {
	Type    string
	BaseURL string
	APIKey  string
}

BackendSpec is the runtime-independent input needed to talk to a model catalog. It deliberately excludes DB/KV concerns; callers resolve those before construction.

type CapabilityConfig

type CapabilityConfig struct {
	ContextLength int
	// MaxOutputTokens is the provider's hard ceiling on output tokens.
	// Leave as 0 when unknown; the client will not clamp.
	MaxOutputTokens int
	CanChat         bool
	CanEmbed        bool
	CanStream       bool
	CanPrompt       bool
	CanThink        bool
}

type CatalogFactory

type CatalogFactory interface {
	NewCatalogProvider(spec BackendSpec, opts ...CatalogOption) (CatalogProvider, error)
}

CatalogFactory constructs CatalogProvider implementations from backend specs.

func DefaultCatalogFactory

func DefaultCatalogFactory() CatalogFactory

DefaultCatalogFactory returns the registry-backed factory used by runtimestate.

type CatalogOption

type CatalogOption func(*CatalogOptions)

CatalogOption mutates CatalogOptions before a provider is constructed.

func WithCatalogHTTPClient

func WithCatalogHTTPClient(client *http.Client) CatalogOption

WithCatalogHTTPClient overrides the HTTP client used for observation and Provider construction.

func WithCatalogTracker

func WithCatalogTracker(tracker libtracker.ActivityTracker) CatalogOption

WithCatalogTracker injects the tracker used by ProviderFor when building execution Providers.

type CatalogOptions

type CatalogOptions struct {
	HTTPClient *http.Client
	Tracker    libtracker.ActivityTracker
}

CatalogOptions carries optional construction dependencies used by vendor implementations.
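CatalogOption and CatalogOptions follow the functional-options pattern. The sketch below uses local stand-ins and leaves out the Tracker field because libtracker is external; falling back to http.DefaultClient is an assumption, not documented behaviour.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// Local stand-ins for CatalogOptions and CatalogOption.
type catalogOptions struct {
	HTTPClient *http.Client
}

type catalogOption func(*catalogOptions)

// withHTTPClient mirrors the shape of WithCatalogHTTPClient.
func withHTTPClient(c *http.Client) catalogOption {
	return func(o *catalogOptions) { o.HTTPClient = c }
}

// resolve folds the options into one struct, which is the value a
// constructor taking CatalogOptions would receive.
func resolve(opts ...catalogOption) catalogOptions {
	resolved := catalogOptions{HTTPClient: http.DefaultClient} // assumed default
	for _, opt := range opts {
		if opt != nil {
			opt(&resolved)
		}
	}
	return resolved
}

// customClient is an example client used only by this sketch.
var customClient = &http.Client{Timeout: 5 * time.Second}

func main() {
	fmt.Println(resolve().HTTPClient == http.DefaultClient)          // true
	fmt.Println(resolve(withHTTPClient(customClient)).HTTPClient.Timeout) // 5s
}
```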

type CatalogProvider

type CatalogProvider interface {
	Type() string
	ListModels(ctx context.Context) ([]ObservedModel, error)
	ProviderFor(model ObservedModel) Provider
}

CatalogProvider observes the models exposed by one backend instance and can turn an observed model into the existing execution Provider abstraction.

func NewCatalogProvider

func NewCatalogProvider(spec BackendSpec, opts ...CatalogOption) (CatalogProvider, error)

NewCatalogProvider constructs a registry-backed catalog provider.

type CatalogProviderConstructor

type CatalogProviderConstructor func(spec BackendSpec, opts CatalogOptions) (CatalogProvider, error)

CatalogProviderConstructor is the constructor function that vendor packages implement and register with the registry.

type ChatArgument

type ChatArgument interface {
	Apply(config *ChatConfig)
}

func WithMaxTokens

func WithMaxTokens(tokens int) ChatArgument

func WithSeed

func WithSeed(seed int) ChatArgument

func WithTemperature

func WithTemperature(temp float64) ChatArgument

func WithTool

func WithTool(tool Tool) ChatArgument

func WithTools

func WithTools(tools ...Tool) ChatArgument

func WithTopP

func WithTopP(p float64) ChatArgument

type ChatConfig

type ChatConfig struct {
	Temperature *float64 `json:"temperature,omitempty"`
	MaxTokens   *int     `json:"max_tokens,omitempty"`
	TopP        *float64 `json:"top_p,omitempty"`
	Seed        *int     `json:"seed,omitempty"`
	Tools       []Tool   `json:"tools,omitempty"`
	// Think controls reasoning-model behaviour. nil = use provider default.
	// Normalized values are auto, off, minimal, low, medium, high, and xhigh.
	Think *string `json:"think,omitempty"`
	// Shift instructs the provider to slide the context window on overflow
	// instead of returning a token-limit error.
	Shift *bool `json:"shift,omitempty"`
	// Truncate instructs the provider to truncate history on overflow.
	Truncate *bool `json:"truncate,omitempty"`
}
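The pointer fields let a caller distinguish "unset" from an explicit zero. The sketch below uses a trimmed local copy of ChatConfig; argFunc, withTemperature, withMaxTokens and buildConfig are hypothetical stand-ins, since the bodies of the package's own With* helpers are not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed local copy of ChatConfig with the same JSON tags.
type chatConfig struct {
	Temperature *float64 `json:"temperature,omitempty"`
	MaxTokens   *int     `json:"max_tokens,omitempty"`
	Seed        *int     `json:"seed,omitempty"`
}

type chatArgument interface {
	Apply(config *chatConfig)
}

// argFunc adapts a plain function to chatArgument.
type argFunc func(*chatConfig)

func (f argFunc) Apply(c *chatConfig) { f(c) }

func withTemperature(t float64) chatArgument {
	return argFunc(func(c *chatConfig) { c.Temperature = &t })
}

func withMaxTokens(n int) chatArgument {
	return argFunc(func(c *chatConfig) { c.MaxTokens = &n })
}

// buildConfig applies each argument in order to an empty config.
func buildConfig(args ...chatArgument) chatConfig {
	var cfg chatConfig
	for _, a := range args {
		a.Apply(&cfg)
	}
	return cfg
}

func main() {
	cfg := buildConfig(withTemperature(0), withMaxTokens(256))
	out, _ := json.Marshal(cfg)
	// Temperature 0 is kept because the pointer is non-nil; Seed is omitted.
	fmt.Println(string(out)) // {"temperature":0,"max_tokens":256}
}
```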

type ChatResult

type ChatResult struct {
	Message   Message
	ToolCalls []ToolCall
}

type Daemon

type Daemon struct {
	// contains filtered or unexported fields
}

Daemon is the process-wide owner of model-repository state. It is the in-process singleton that a transport (see the transport subpackage) serves: it holds the configured backends and their resolved catalog providers behind a mutex so concurrent requests observe one authoritative, consistent state.

modeld is being split out of the runtime package to expose the modelrepo API over a wire transport and to own backend lifecycle independently; Daemon is the seam for that split.

func Default

func Default() *Daemon

Default returns the process-wide singleton Daemon, constructing it on first use. Production callers use this; tests construct isolated instances with NewDaemon.

func NewDaemon

func NewDaemon(opts ...DaemonOption) *Daemon

NewDaemon constructs a Daemon with no backends registered.

func (*Daemon) ListBackends

func (d *Daemon) ListBackends() []string

ListBackends returns the ids of all registered backends, sorted.

func (*Daemon) ListModels

func (d *Daemon) ListModels(ctx context.Context, id string) ([]ObservedModel, error)

ListModels observes the models exposed by the backend registered under id.

func (*Daemon) ProviderFor

func (d *Daemon) ProviderFor(id string, model ObservedModel) (Provider, error)

ProviderFor turns an observed model from backend id into an execution Provider.

func (*Daemon) RegisterBackend

func (d *Daemon) RegisterBackend(id string, spec BackendSpec, opts ...CatalogOption) error

RegisterBackend resolves spec into a catalog provider and stores it under id, replacing any existing backend with the same id.

func (*Daemon) RemoveBackend

func (d *Daemon) RemoveBackend(id string)

RemoveBackend removes the backend registered under id, if any.
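The register, list and remove semantics above can be pictured as a mutex-guarded map. backendRegistry below is a hypothetical stand-in, not the Daemon implementation, and it stores a plain string where the real type would hold a resolved catalog provider.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// backendRegistry is a hypothetical stand-in for the Daemon's backend state.
type backendRegistry struct {
	mu       sync.RWMutex
	backends map[string]string // id -> backend type
}

func newBackendRegistry() *backendRegistry {
	return &backendRegistry{backends: map[string]string{}}
}

// Register stores backendType under id, replacing any existing entry.
func (r *backendRegistry) Register(id, backendType string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.backends[id] = backendType
}

// Remove deletes id if present; a missing id is a no-op.
func (r *backendRegistry) Remove(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.backends, id)
}

// List returns the registered ids in sorted order.
func (r *backendRegistry) List() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	ids := make([]string, 0, len(r.backends))
	for id := range r.backends {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	r := newBackendRegistry()
	r.Register("b", "ollama")
	r.Register("a", "openai")
	r.Register("b", "vllm") // replaces the earlier "b"
	r.Remove("missing")     // no-op
	fmt.Println(r.List())   // [a b]
}
```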

func (*Daemon) Stop

func (d *Daemon) Stop() error

Stop drains backend resources by running the registered shutdown hooks (see RegisterShutdownHook). It is safe to call when no hooks are registered.

type DaemonOption

type DaemonOption func(*Daemon)

DaemonOption configures a Daemon at construction.

func WithCatalogFactory

func WithCatalogFactory(f CatalogFactory) DaemonOption

WithCatalogFactory overrides the CatalogFactory used to resolve backends. Defaults to DefaultCatalogFactory().

type FunctionTool

type FunctionTool struct {
	Name        string      `json:"name"`
	Description string      `json:"description,omitempty"`
	Parameters  interface{} `json:"parameters,omitempty"`
}

type LLMChatClient

type LLMChatClient interface {
	Chat(ctx context.Context, messages []Message, args ...ChatArgument) (ChatResult, error)
}

LLMChatClient is the per-capability client interface for chat.

type LLMEmbedClient

type LLMEmbedClient interface {
	Embed(ctx context.Context, prompt string) ([]float64, error)
}

type LLMPromptExecClient

type LLMPromptExecClient interface {
	Prompt(ctx context.Context, systemInstruction string, temperature float32, prompt string) (string, error)
}

type LLMStreamClient

type LLMStreamClient interface {
	Stream(ctx context.Context, messages []Message, args ...ChatArgument) (<-chan *StreamParcel, error)
}

type Message

type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
	// Thinking contains the model's internal reasoning trace (thinking tokens).
	// Only populated when thinking is enabled. Never sent back to the model.
	Thinking string `json:"thinking,omitempty"`

	// For tool calling (OpenAI / vLLM compatible).
	ToolCalls  []ToolCall `json:"tool_calls,omitempty"`
	ToolCallID string     `json:"tool_call_id,omitempty"`
}

Message supports OpenAI-style tool calling:

  - assistant messages can carry tool_calls
  - tool messages can carry tool_call_id
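A tool-calling exchange built from these fields might look like the sketch below. The types are trimmed local copies with the same JSON tags; the role strings, the get_weather tool and the toolReply helper are illustrative assumptions following the OpenAI convention.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed local copies of ToolCall and Message with the same JSON tags.
type toolCall struct {
	ID       string `json:"id,omitempty"`
	Type     string `json:"type"`
	Function struct {
		Name      string `json:"name"`
		Arguments string `json:"arguments"`
	} `json:"function"`
}

type message struct {
	Role       string     `json:"role"`
	Content    string     `json:"content"`
	ToolCalls  []toolCall `json:"tool_calls,omitempty"`
	ToolCallID string     `json:"tool_call_id,omitempty"`
}

// toolReply is a hypothetical helper that builds the tool message
// answering a given call, linking it through the call's ID.
func toolReply(call toolCall, result string) message {
	return message{Role: "tool", Content: result, ToolCallID: call.ID}
}

func main() {
	call := toolCall{ID: "call_1", Type: "function"}
	call.Function.Name = "get_weather" // hypothetical tool name
	call.Function.Arguments = `{"city":"Oslo"}`

	history := []message{
		{Role: "user", Content: "Weather in Oslo?"},
		{Role: "assistant", ToolCalls: []toolCall{call}},
		toolReply(call, `{"temp_c":4}`),
	}
	for _, m := range history {
		b, err := json.Marshal(m)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(b))
	}
}
```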

type MockChatClient

type MockChatClient struct{}

MockChatClient is a mock implementation of LLMChatClient for testing.

func (*MockChatClient) Chat

func (m *MockChatClient) Chat(ctx context.Context, messages []Message, opts ...ChatArgument) (ChatResult, error)

Chat returns a mock response.

func (*MockChatClient) Close

func (m *MockChatClient) Close() error

Close is a no-op for the mock client.

type MockEmbedClient

type MockEmbedClient struct{}

MockEmbedClient is a mock implementation of LLMEmbedClient for testing.

func (*MockEmbedClient) Close

func (m *MockEmbedClient) Close() error

Close is a no-op for the mock client.

func (*MockEmbedClient) Embed

func (m *MockEmbedClient) Embed(ctx context.Context, prompt string) ([]float64, error)

Embed returns a mock embedding.

type MockPromptClient

type MockPromptClient struct{}

MockPromptClient is a mock implementation of LLMPromptExecClient for testing.

func (*MockPromptClient) Close

func (m *MockPromptClient) Close() error

Close is a no-op for the mock client.

func (*MockPromptClient) Prompt

func (m *MockPromptClient) Prompt(ctx context.Context, systemInstruction string, temperature float32, prompt string) (string, error)

Prompt returns a mock response.

type MockProvider

type MockProvider struct {
	ID              string
	Name            string
	ContextLength   int
	MaxOutputTokens int
	CanChatFlag     bool
	CanEmbedFlag    bool
	CanStreamFlag   bool
	CanPromptFlag   bool
	Backends        []string
}

MockProvider is a mock implementation of the Provider interface for testing.

func (*MockProvider) CanChat

func (m *MockProvider) CanChat() bool

CanChat returns whether the mock provider can chat.

func (*MockProvider) CanEmbed

func (m *MockProvider) CanEmbed() bool

CanEmbed returns whether the mock provider can embed.

func (*MockProvider) CanPrompt

func (m *MockProvider) CanPrompt() bool

CanPrompt returns whether the mock provider can prompt.

func (*MockProvider) CanStream

func (m *MockProvider) CanStream() bool

CanStream returns whether the mock provider can stream.

func (*MockProvider) CanThink

func (m *MockProvider) CanThink() bool

CanThink returns whether the mock provider can think.

func (*MockProvider) GetBackendIDs

func (m *MockProvider) GetBackendIDs() []string

GetBackendIDs returns the backend IDs for the mock provider.

func (*MockProvider) GetChatConnection

func (m *MockProvider) GetChatConnection(ctx context.Context, backendID string) (LLMChatClient, error)

GetChatConnection returns a mock chat client.

func (*MockProvider) GetContextLength

func (m *MockProvider) GetContextLength() int

GetContextLength returns the context length for the mock provider.

func (*MockProvider) GetEmbedConnection

func (m *MockProvider) GetEmbedConnection(ctx context.Context, backendID string) (LLMEmbedClient, error)

GetEmbedConnection returns a mock embed client.

func (*MockProvider) GetID

func (m *MockProvider) GetID() string

GetID returns the ID for the mock provider.

func (*MockProvider) GetMaxOutputTokens

func (m *MockProvider) GetMaxOutputTokens() int

GetMaxOutputTokens returns the max output tokens ceiling for the mock provider.

func (*MockProvider) GetPromptConnection

func (m *MockProvider) GetPromptConnection(ctx context.Context, backendID string) (LLMPromptExecClient, error)

GetPromptConnection returns a mock prompt client.

func (*MockProvider) GetStreamConnection

func (m *MockProvider) GetStreamConnection(ctx context.Context, backendID string) (LLMStreamClient, error)

GetStreamConnection returns a mock stream client.

func (*MockProvider) GetType

func (m *MockProvider) GetType() string

GetType returns the provider type for the mock provider.

func (*MockProvider) ModelName

func (m *MockProvider) ModelName() string

ModelName returns the model name for the mock provider.

type MockStreamClient

type MockStreamClient struct{}

MockStreamClient is a mock implementation of LLMStreamClient for testing.

func (*MockStreamClient) Close

func (m *MockStreamClient) Close() error

Close is a no-op for the mock client.

func (*MockStreamClient) Stream

func (m *MockStreamClient) Stream(ctx context.Context, messages []Message, args ...ChatArgument) (<-chan *StreamParcel, error)

Stream returns a channel with mock stream parcels.

type ObservedModel

type ObservedModel struct {
	Name          string
	ContextLength int
	ModifiedAt    time.Time
	Size          int64
	Digest        string
	CapabilityConfig
	Meta map[string]string
}

ObservedModel is the normalized result of listing models from a backend. Name is the provider-facing model identifier used for selection and execution.

type Provider

type Provider interface {
	GetBackendIDs() []string
	ModelName() string
	GetID() string
	GetType() string
	GetContextLength() int
	// GetMaxOutputTokens returns the provider's hard ceiling on output tokens
	// (maxOutputTokens / max_tokens / max_completion_tokens in the wire format).
	// Returns 0 when the ceiling is unknown or effectively unlimited.
	GetMaxOutputTokens() int
	CanChat() bool
	CanEmbed() bool
	CanStream() bool
	CanPrompt() bool
	CanThink() bool
	GetChatConnection(ctx context.Context, backendID string) (LLMChatClient, error)
	GetPromptConnection(ctx context.Context, backendID string) (LLMPromptExecClient, error)
	GetEmbedConnection(ctx context.Context, backendID string) (LLMEmbedClient, error)
	GetStreamConnection(ctx context.Context, backendID string) (LLMStreamClient, error)
}
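A caller would typically check a capability flag before asking for the matching connection. The sketch below uses heavily trimmed stand-in interfaces (the chat client takes a single prompt string here, unlike LLMChatClient); chatOnce, tryChat, echoClient and stubProvider are hypothetical, as is picking the first backend ID.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

var errNotSupported = errors.New("operation not supported")

// Trimmed stand-ins; the real interfaces have more methods.
type chatClient interface {
	Chat(ctx context.Context, prompt string) (string, error)
}

type provider interface {
	CanChat() bool
	GetBackendIDs() []string
	GetChatConnection(ctx context.Context, backendID string) (chatClient, error)
}

type echoClient struct{}

func (echoClient) Chat(_ context.Context, prompt string) (string, error) {
	return "echo: " + prompt, nil
}

type stubProvider struct {
	canChat  bool
	backends []string
}

func (s stubProvider) CanChat() bool           { return s.canChat }
func (s stubProvider) GetBackendIDs() []string { return s.backends }
func (s stubProvider) GetChatConnection(_ context.Context, backendID string) (chatClient, error) {
	return echoClient{}, nil
}

// chatOnce checks the capability, picks the first backend, then chats.
func chatOnce(ctx context.Context, p provider, prompt string) (string, error) {
	if !p.CanChat() {
		return "", errNotSupported
	}
	ids := p.GetBackendIDs()
	if len(ids) == 0 {
		return "", errors.New("no backends available")
	}
	client, err := p.GetChatConnection(ctx, ids[0])
	if err != nil {
		return "", err
	}
	return client.Chat(ctx, prompt)
}

// tryChat runs chatOnce with a background context.
func tryChat(p provider, prompt string) (string, error) {
	return chatOnce(context.Background(), p, prompt)
}

func main() {
	reply, err := tryChat(stubProvider{canChat: true, backends: []string{"b1"}}, "hi")
	fmt.Println(reply, err) // echo: hi <nil>
	_, err = tryChat(stubProvider{canChat: false}, "hi")
	fmt.Println(errors.Is(err, errNotSupported)) // true
}
```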

type StreamParcel

type StreamParcel struct {
	Data string
	// Thinking carries a streamed reasoning/thinking delta separate from the
	// visible output text. Like Message.Thinking, it is provider-facing output
	// and must never be sent back as conversation history.
	Thinking string
	Error    error
}
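A consumer of the stream channel could accumulate visible text and thinking deltas separately, roughly as sketched here. streamParcel is a local copy; treating a parcel with a non-nil Error as terminal and relying on the producer to close the channel are assumptions, and collect and feed are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Local copy of StreamParcel.
type streamParcel struct {
	Data     string
	Thinking string
	Error    error
}

// collect drains the channel, keeping visible text and thinking apart.
// Assumption: an Error parcel ends the stream. A real caller that stops
// early should also cancel the context so the producer can exit.
func collect(ch <-chan *streamParcel) (text, thinking string, err error) {
	var out, think strings.Builder
	for p := range ch {
		if p == nil {
			continue
		}
		if p.Error != nil {
			return out.String(), think.String(), p.Error
		}
		out.WriteString(p.Data)
		think.WriteString(p.Thinking)
	}
	return out.String(), think.String(), nil
}

// feed builds a closed, buffered channel holding the given parcels.
func feed(parcels ...*streamParcel) <-chan *streamParcel {
	ch := make(chan *streamParcel, len(parcels))
	for _, p := range parcels {
		ch <- p
	}
	close(ch)
	return ch
}

// errStreamBroken is an example error used only by this sketch.
var errStreamBroken = errors.New("stream interrupted")

func main() {
	text, thinking, err := collect(feed(
		&streamParcel{Thinking: "plan"},
		&streamParcel{Data: "Hello, "},
		&streamParcel{Data: "world"},
	))
	fmt.Printf("%q %q %v\n", text, thinking, err) // "Hello, world" "plan" <nil>
}
```

Keeping the thinking text in its own buffer makes it easy to log or display it without ever appending it to the conversation history.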

type Tool

type Tool struct {
	Type     string        `json:"type"`
	Function *FunctionTool `json:"function,omitempty"`
}

type ToolCall

type ToolCall struct {
	ID       string `json:"id,omitempty"`
	Type     string `json:"type"` // only "function" for now
	Function struct {
		Name      string `json:"name"`
		Arguments string `json:"arguments"`
	} `json:"function"`
	// ProviderMeta carries opaque provider-specific data that must be
	// round-tripped back on the next turn (e.g. Gemini thought_signature).
	ProviderMeta map[string]string `json:"provider_meta,omitempty"`
}

type WithShift

type WithShift struct{}

WithShift is a ChatArgument that enables context shift on overflow.

func (WithShift) Apply

func (WithShift) Apply(cfg *ChatConfig)

type WithThink

type WithThink string

WithThink is a ChatArgument that enables/controls reasoning mode.

func (WithThink) Apply

func (w WithThink) Apply(cfg *ChatConfig)
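Unlike the With* functions, WithShift and WithThink are value types that satisfy ChatArgument directly. The sketch below uses local stand-ins; the Apply bodies are assumptions (the source is not shown on this page), and any normalization of the think level is not modelled.

```go
package main

import "fmt"

// Trimmed local copy of ChatConfig with only the fields used here.
type chatConfig struct {
	Think *string
	Shift *bool
}

type chatArgument interface {
	Apply(config *chatConfig)
}

// withShift mirrors the shape of WithShift: an empty struct used as a flag.
type withShift struct{}

func (withShift) Apply(cfg *chatConfig) {
	on := true
	cfg.Shift = &on // assumed behaviour
}

// withThink mirrors the shape of WithThink: the string value is the level.
type withThink string

func (w withThink) Apply(cfg *chatConfig) {
	level := string(w)
	cfg.Think = &level // assumed behaviour, no normalization
}

func main() {
	var cfg chatConfig
	for _, arg := range []chatArgument{withShift{}, withThink("high")} {
		arg.Apply(&cfg)
	}
	fmt.Println(*cfg.Shift, *cfg.Think) // true high
}
```

At a call site these would be passed as values, for example WithShift{} or WithThink("high"), alongside the function-style arguments.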

Directories

Path Synopsis
Package llama is the graduated local coding-node runtime: a persistent, workspace-scoped inference session that keeps a stable prefix's KV hot and re-prefills only the changed suffix (the live warm-reuse hot path), distinct from the toy fixed-constant `local` provider.
Package openvino contains the modelrepo catalog/provider shell for in-process OpenVINO (Intel) inference.
ovsession
Package ovsession contains the native OpenVINO session/KV bridge used by the openvino modelrepo provider.
Package owner manages lease-based ownership of the local runtime's resident state.
Package transport defines the protocol-agnostic surface that modeld exposes over a wire transport.
grpc
Package grpc is the gRPC transport for modeld.
grpc/modeldpb
Package modeldpb contains the modeld gRPC wire bindings.
leader
Package leader routes follower calls to the current modeld owner advertised in the lease file.
