Documentation ¶
Overview ¶
Package modeld is the split-out, transport-facing owner of the model repository. It defines the provider-facing contracts for LLM backends: the Provider interface (capabilities + client factories), the per-capability client interfaces (LLMPromptExecClient, LLMChatClient, LLMEmbedClient, LLMStreamClient), and the shared request/response types (Message, ChatResult, StreamParcel, Tool, ChatArgument).
Concrete providers live in subpackages (openai, gemini, vertex, vllm, ollama, local). Higher-level code depends only on the interfaces declared here; provider subpackages are imported for their side effects to register catalogs.
Daemon (see daemon.go) is the in-process singleton that owns backend state behind a mutex, and the transport subpackage exposes that state over the wire (gRPC today, HTTP later).
Index ¶
- Variables
- func CanonicalBackendType(backendType string) string
- func ClampMaxOutputTokens(requested, ceiling int) (int, bool)
- func ClampMaxOutputTokensPtr(tokens *int, ceiling int) *int
- func RegisterCatalogProvider(backendType string, constructor CatalogProviderConstructor)
- func RegisterShutdownHook(fn func() error)
- func SetupVLLMLocalInstance(ctx context.Context, model string, tag string, toolParser string) (string, testcontainers.Container, func(), error)
- func Shutdown() error
- type BackendSpec
- type CapabilityConfig
- type CatalogFactory
- type CatalogOption
- type CatalogOptions
- type CatalogProvider
- type CatalogProviderConstructor
- type ChatArgument
- type ChatConfig
- type ChatResult
- type Daemon
- func (d *Daemon) ListBackends() []string
- func (d *Daemon) ListModels(ctx context.Context, id string) ([]ObservedModel, error)
- func (d *Daemon) ProviderFor(id string, model ObservedModel) (Provider, error)
- func (d *Daemon) RegisterBackend(id string, spec BackendSpec, opts ...CatalogOption) error
- func (d *Daemon) RemoveBackend(id string)
- func (d *Daemon) Stop() error
- type DaemonOption
- type FunctionTool
- type LLMChatClient
- type LLMEmbedClient
- type LLMPromptExecClient
- type LLMStreamClient
- type Message
- type MockChatClient
- type MockEmbedClient
- type MockPromptClient
- type MockProvider
- func (m *MockProvider) CanChat() bool
- func (m *MockProvider) CanEmbed() bool
- func (m *MockProvider) CanPrompt() bool
- func (m *MockProvider) CanStream() bool
- func (m *MockProvider) CanThink() bool
- func (m *MockProvider) GetBackendIDs() []string
- func (m *MockProvider) GetChatConnection(ctx context.Context, backendID string) (LLMChatClient, error)
- func (m *MockProvider) GetContextLength() int
- func (m *MockProvider) GetEmbedConnection(ctx context.Context, backendID string) (LLMEmbedClient, error)
- func (m *MockProvider) GetID() string
- func (m *MockProvider) GetMaxOutputTokens() int
- func (m *MockProvider) GetPromptConnection(ctx context.Context, backendID string) (LLMPromptExecClient, error)
- func (m *MockProvider) GetStreamConnection(ctx context.Context, backendID string) (LLMStreamClient, error)
- func (m *MockProvider) GetType() string
- func (m *MockProvider) ModelName() string
- type MockStreamClient
- type ObservedModel
- type Provider
- type StreamParcel
- type Tool
- type ToolCall
- type WithShift
- type WithThink
Constants ¶
This section is empty.
Variables ¶
var ErrNotSupported = errors.New("operation not supported")
ErrNotSupported is returned when an operation is not supported.
var ErrRefused = errors.New("model refused the request")
ErrRefused is returned when the model refuses to generate a response (stop_reason == "refusal"), typically due to a safety filter.
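A caller will usually want to tell these two sentinels apart from transient failures. The sketch below redeclares both errors locally with the messages shown above and assumes providers wrap them with `%w`, which this page does not confirm; `embed` and `classify` are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the package sentinels; messages copied from the docs above.
var (
	ErrNotSupported = errors.New("operation not supported")
	ErrRefused      = errors.New("model refused the request")
)

// embed is a hypothetical backend call that lacks embedding support.
// Assumption: real providers wrap the sentinel so errors.Is still matches.
func embed(text string) ([]float64, error) {
	return nil, fmt.Errorf("embed %q: %w", text, ErrNotSupported)
}

// classify maps an error onto a coarse caller-side action.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrNotSupported):
		return "try another provider"
	case errors.Is(err, ErrRefused):
		return "surface refusal to user"
	default:
		return "retry or fail"
	}
}

func main() {
	_, err := embed("hello")
	fmt.Println(classify(err)) // prints "try another provider"
}
```

Matching with `errors.Is` rather than `==` keeps the check valid if a provider adds context to the error.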
Functions ¶
func CanonicalBackendType ¶
CanonicalBackendType maps compatibility backend keywords to the implementation type used by the runtime. "local" is the historical embedded GGUF keyword; the implementation now lives under the feature-complete "llama" provider.
func ClampMaxOutputTokens ¶
ClampMaxOutputTokens returns the effective output-token request after applying a provider ceiling. A ceiling of 0 means unknown, so no clamp is applied.
func ClampMaxOutputTokensPtr ¶
ClampMaxOutputTokensPtr copies tokens and applies ClampMaxOutputTokens. Returning a fresh pointer avoids mutating ChatConfig values captured by args.
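The clamping rules described above can be sketched as follows. This is an illustrative re-implementation, not the package's code: it assumes the returned bool reports whether the request was lowered, that a non-positive ceiling counts as unknown, and that a nil pointer stays nil.

```go
package main

import "fmt"

// clampMaxOutputTokens applies a provider ceiling to a requested token count.
// Assumption: the bool is true only when the request was actually lowered.
func clampMaxOutputTokens(requested, ceiling int) (int, bool) {
	if ceiling <= 0 || requested <= ceiling {
		return requested, false // unknown ceiling or already within bounds
	}
	return ceiling, true
}

// clampMaxOutputTokensPtr returns a fresh pointer so the caller's value is
// never mutated. Assumption: nil in, nil out.
func clampMaxOutputTokensPtr(tokens *int, ceiling int) *int {
	if tokens == nil {
		return nil
	}
	v, _ := clampMaxOutputTokens(*tokens, ceiling)
	return &v
}

func main() {
	requested := 8192
	clamped := clampMaxOutputTokensPtr(&requested, 4096)
	fmt.Println(*clamped, requested) // prints "4096 8192"
}
```

Because the result is a new pointer, a ChatConfig shared between calls keeps its original MaxTokens value.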
func RegisterCatalogProvider ¶
func RegisterCatalogProvider(backendType string, constructor CatalogProviderConstructor)
RegisterCatalogProvider registers a backend catalog implementation by type. Vendor packages call this from init() to avoid import cycles from modelrepo -> vendor packages.
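The init()-time registration pattern can be sketched with a small local registry. Everything here is a stand-in: `constructor` simplifies CatalogProviderConstructor to a function over a base URL, and canonicalising the key on both register and lookup is an assumption about how the registry treats the "local" keyword.

```go
package main

import (
	"fmt"
	"sync"
)

// constructor is a simplified stand-in for CatalogProviderConstructor.
type constructor func(baseURL string) (string, error)

var (
	mu       sync.RWMutex
	registry = map[string]constructor{}
)

// canonicalType mirrors the documented "local" -> "llama" mapping.
// Assumption: every other keyword passes through unchanged.
func canonicalType(t string) string {
	if t == "local" {
		return "llama"
	}
	return t
}

// register stores a constructor under its canonical backend type.
func register(backendType string, c constructor) {
	mu.Lock()
	defer mu.Unlock()
	registry[canonicalType(backendType)] = c
}

// lookup resolves a backend type, including compatibility keywords.
func lookup(backendType string) (constructor, bool) {
	mu.RLock()
	defer mu.RUnlock()
	c, ok := registry[canonicalType(backendType)]
	return c, ok
}

// In a vendor package this call would live in that package's init().
func init() {
	register("llama", func(baseURL string) (string, error) {
		return "llama catalog at " + baseURL, nil
	})
}

func main() {
	c, ok := lookup("local")
	if !ok {
		fmt.Println("no provider registered")
		return
	}
	desc, _ := c("http://localhost:8080")
	fmt.Println(desc) // prints "llama catalog at http://localhost:8080"
}
```

Registering from init() means the core package never imports a vendor package directly; a blank import of the vendor package is enough to make its type resolvable.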
func RegisterShutdownHook ¶
func RegisterShutdownHook(fn func() error)
RegisterShutdownHook registers fn to be run by Shutdown. It is intended to be called from a backend package's init(). A nil fn is ignored.
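A hook registry with these semantics might look like the sketch below. Only the nil-is-ignored rule comes from the documentation; running every hook even after a failure and combining the errors with `errors.Join` (Go 1.20+) are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	hooksMu sync.Mutex
	hooks   []func() error

	// errSocketClosed is a made-up failure used by the demo hook.
	errSocketClosed = errors.New("socket already closed")
)

// registerShutdownHook queues fn for shutdown; a nil fn is ignored.
func registerShutdownHook(fn func() error) {
	if fn == nil {
		return
	}
	hooksMu.Lock()
	defer hooksMu.Unlock()
	hooks = append(hooks, fn)
}

// shutdown runs all pending hooks in registration order.
// Assumption: every hook runs even if an earlier one fails.
func shutdown() error {
	hooksMu.Lock()
	pending := hooks
	hooks = nil
	hooksMu.Unlock()

	var errs []error
	for _, fn := range pending {
		if err := fn(); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...) // nil when no hook failed
}

func main() {
	registerShutdownHook(nil) // ignored
	registerShutdownHook(func() error { fmt.Println("releasing backend"); return nil })
	registerShutdownHook(func() error { return errSocketClosed })
	fmt.Println("shutdown error:", shutdown())
}
```

The hooks are copied out before they run so that a hook registering another hook cannot deadlock on the mutex.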
Types ¶
type BackendSpec ¶
BackendSpec is the runtime-independent input needed to talk to a model catalog. It deliberately excludes DB/KV concerns; callers resolve those before construction.
type CapabilityConfig ¶
type CatalogFactory ¶
type CatalogFactory interface {
NewCatalogProvider(spec BackendSpec, opts ...CatalogOption) (CatalogProvider, error)
}
CatalogFactory constructs CatalogProvider implementations from backend specs.
func DefaultCatalogFactory ¶
func DefaultCatalogFactory() CatalogFactory
DefaultCatalogFactory returns the registry-backed factory used by runtimestate.
type CatalogOption ¶
type CatalogOption func(*CatalogOptions)
CatalogOption mutates CatalogOptions before a provider is constructed.
func WithCatalogHTTPClient ¶
func WithCatalogHTTPClient(client *http.Client) CatalogOption
WithCatalogHTTPClient overrides the HTTP client used for observation and Provider construction.
func WithCatalogTracker ¶
func WithCatalogTracker(tracker libtracker.ActivityTracker) CatalogOption
WithCatalogTracker injects the tracker used by ProviderFor when building execution Providers.
type CatalogOptions ¶
type CatalogOptions struct {
HTTPClient *http.Client
Tracker libtracker.ActivityTracker
}
CatalogOptions carries optional construction dependencies used by vendor implementations.
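The relationship between the variadic CatalogOption values and the CatalogOptions struct that a constructor receives can be sketched as below. The local types drop the Tracker field because its type lives in another package, and defaulting to `http.DefaultClient` is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// catalogOptions mirrors CatalogOptions minus the Tracker field.
type catalogOptions struct {
	HTTPClient *http.Client
}

// catalogOption mutates catalogOptions before a provider is constructed.
type catalogOption func(*catalogOptions)

// withCatalogHTTPClient overrides the HTTP client.
func withCatalogHTTPClient(c *http.Client) catalogOption {
	return func(o *catalogOptions) { o.HTTPClient = c }
}

// resolve folds the options over a set of defaults.
// Assumption: the default client is http.DefaultClient.
func resolve(opts ...catalogOption) catalogOptions {
	o := catalogOptions{HTTPClient: http.DefaultClient}
	for _, opt := range opts {
		if opt != nil {
			opt(&o)
		}
	}
	return o
}

func main() {
	custom := &http.Client{Timeout: 5 * time.Second}
	o := resolve(withCatalogHTTPClient(custom))
	fmt.Println(o.HTTPClient.Timeout) // prints "5s"
}
```

Resolving options into a plain struct first lets a vendor constructor take the struct by value and stay unaware of the option functions.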
type CatalogProvider ¶
type CatalogProvider interface {
Type() string
ListModels(ctx context.Context) ([]ObservedModel, error)
ProviderFor(model ObservedModel) Provider
}
CatalogProvider observes the models exposed by one backend instance and can turn an observed model into the existing execution Provider abstraction.
func NewCatalogProvider ¶
func NewCatalogProvider(spec BackendSpec, opts ...CatalogOption) (CatalogProvider, error)
NewCatalogProvider constructs a registry-backed catalog provider.
type CatalogProviderConstructor ¶
type CatalogProviderConstructor func(spec BackendSpec, opts CatalogOptions) (CatalogProvider, error)
CatalogProviderConstructor is the constructor signature that vendor packages implement and register through RegisterCatalogProvider.

type ChatArgument ¶
type ChatArgument interface {
Apply(config *ChatConfig)
}
func WithMaxTokens ¶
func WithMaxTokens(tokens int) ChatArgument
func WithSeed ¶
func WithSeed(seed int) ChatArgument
func WithTemperature ¶
func WithTemperature(temp float64) ChatArgument
func WithTool ¶
func WithTool(tool Tool) ChatArgument
func WithTools ¶
func WithTools(tools ...Tool) ChatArgument
func WithTopP ¶
func WithTopP(p float64) ChatArgument
type ChatConfig ¶
type ChatConfig struct {
Temperature *float64 `json:"temperature,omitempty"`
MaxTokens *int `json:"max_tokens,omitempty"`
TopP *float64 `json:"top_p,omitempty"`
Seed *int `json:"seed,omitempty"`
Tools []Tool `json:"tools,omitempty"`
// Think controls reasoning-model behaviour. nil = use provider default.
// Normalized values are auto, off, minimal, low, medium, high, and xhigh.
Think *string `json:"think,omitempty"`
// Shift instructs the provider to slide the context window on overflow
// instead of returning a token-limit error.
Shift *bool `json:"shift,omitempty"`
// Truncate instructs the provider to truncate history on overflow.
Truncate *bool `json:"truncate,omitempty"`
}
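One way a ChatArgument could populate the pointer fields of ChatConfig is sketched here. The local types copy only two fields, and `argFunc` is a hypothetical adapter; the package may implement its With* helpers differently.

```go
package main

import "fmt"

// chatConfig copies two of the ChatConfig fields; nil means "provider default".
type chatConfig struct {
	Temperature *float64
	MaxTokens   *int
}

// chatArgument mirrors the ChatArgument interface.
type chatArgument interface {
	Apply(config *chatConfig)
}

// argFunc is a hypothetical adapter from a plain function to chatArgument.
type argFunc func(*chatConfig)

func (f argFunc) Apply(c *chatConfig) { f(c) }

func withTemperature(t float64) chatArgument {
	return argFunc(func(c *chatConfig) { c.Temperature = &t })
}

func withMaxTokens(n int) chatArgument {
	return argFunc(func(c *chatConfig) { c.MaxTokens = &n })
}

// buildConfig applies arguments in order, so later ones win.
func buildConfig(args ...chatArgument) chatConfig {
	var c chatConfig
	for _, a := range args {
		a.Apply(&c)
	}
	return c
}

func main() {
	c := buildConfig(withTemperature(0.2), withMaxTokens(512))
	fmt.Println(*c.Temperature, *c.MaxTokens) // prints "0.2 512"
}
```

Pointer fields let a provider distinguish "not set" from an explicit zero, such as a temperature of 0.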
type ChatResult ¶
type Daemon ¶
type Daemon struct {
// contains filtered or unexported fields
}
Daemon is the process-wide owner of model-repository state. It is the in-process singleton that a transport (see the transport subpackage) serves: it holds the configured backends and their resolved catalog providers behind a mutex so concurrent requests observe one authoritative, consistent state.
modeld is being split out of the runtime package to expose the modelrepo API over a wire transport and to own backend lifecycle independently; Daemon is the seam for that split.
func Default ¶
func Default() *Daemon
Default returns the process-wide singleton Daemon, constructing it on first use. Production callers use this; tests construct isolated instances with NewDaemon.
func NewDaemon ¶
func NewDaemon(opts ...DaemonOption) *Daemon
NewDaemon constructs a Daemon with no backends registered.
func (*Daemon) ListBackends ¶
ListBackends returns the ids of all registered backends, sorted.
func (*Daemon) ListModels ¶
ListModels observes the models exposed by the backend registered under id.
func (*Daemon) ProviderFor ¶
func (d *Daemon) ProviderFor(id string, model ObservedModel) (Provider, error)
ProviderFor turns an observed model from backend id into an execution Provider.
func (*Daemon) RegisterBackend ¶
func (d *Daemon) RegisterBackend(id string, spec BackendSpec, opts ...CatalogOption) error
RegisterBackend resolves spec into a catalog provider and stores it under id, replacing any existing backend with the same id.
func (*Daemon) RemoveBackend ¶
RemoveBackend removes the backend registered under id, if any.
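The register, replace, remove, and sorted-list behaviour described for Daemon can be sketched with a mutex-guarded map. `backendSpec` is a hypothetical stand-in because the fields of BackendSpec are not shown here, and using a `sync.RWMutex` is an assumption; the real type also resolves each spec into a catalog provider, which this sketch skips.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// backendSpec is a hypothetical stand-in for BackendSpec.
type backendSpec struct{ Type, BaseURL string }

// daemon guards its backend map with a mutex so concurrent callers
// always observe one consistent state.
type daemon struct {
	mu       sync.RWMutex
	backends map[string]backendSpec
}

func newDaemon() *daemon {
	return &daemon{backends: map[string]backendSpec{}}
}

// RegisterBackend stores spec under id, replacing any existing entry.
func (d *daemon) RegisterBackend(id string, spec backendSpec) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.backends[id] = spec
}

// RemoveBackend deletes id if present; unknown ids are a no-op.
func (d *daemon) RemoveBackend(id string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	delete(d.backends, id)
}

// ListBackends returns the registered ids in sorted order.
func (d *daemon) ListBackends() []string {
	d.mu.RLock()
	defer d.mu.RUnlock()
	ids := make([]string, 0, len(d.backends))
	for id := range d.backends {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	d := newDaemon()
	d.RegisterBackend("vllm-b", backendSpec{Type: "vllm", BaseURL: "http://10.0.0.2:8000"})
	d.RegisterBackend("ollama-a", backendSpec{Type: "ollama", BaseURL: "http://10.0.0.1:11434"})
	d.RegisterBackend("vllm-b", backendSpec{Type: "vllm", BaseURL: "http://10.0.0.3:8000"}) // replaces
	d.RemoveBackend("missing")                                                              // no-op
	fmt.Println(d.ListBackends()) // prints "[ollama-a vllm-b]"
}
```

Sorting inside ListBackends keeps the output stable despite Go's randomised map iteration order.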
type DaemonOption ¶
type DaemonOption func(*Daemon)
DaemonOption configures a Daemon at construction.
func WithCatalogFactory ¶
func WithCatalogFactory(f CatalogFactory) DaemonOption
WithCatalogFactory overrides the CatalogFactory used to resolve backends. Defaults to DefaultCatalogFactory().
type FunctionTool ¶
type LLMChatClient ¶
type LLMChatClient interface {
Chat(ctx context.Context, messages []Message, args ...ChatArgument) (ChatResult, error)
}
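LLMChatClient is the chat-capability client interface: Chat sends a message history with optional ChatArgument values and returns a ChatResult.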
type LLMEmbedClient ¶
type LLMPromptExecClient ¶
type LLMStreamClient ¶
type LLMStreamClient interface {
Stream(ctx context.Context, messages []Message, args ...ChatArgument) (<-chan *StreamParcel, error)
}
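Consuming the receive-only channel returned by Stream could look like the sketch below. The fields of StreamParcel are not shown on this page, so `streamParcel` with `Data` and `Error` is purely hypothetical, as is the convention that the producer closes the channel when the response ends.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// streamParcel is a hypothetical shape for one streamed chunk.
type streamParcel struct {
	Data  string
	Error error
}

// fakeStream emits each chunk, then closes the channel. It stops early
// if the context is cancelled so the goroutine cannot leak.
func fakeStream(ctx context.Context, chunks ...string) <-chan *streamParcel {
	out := make(chan *streamParcel)
	go func() {
		defer close(out)
		for _, c := range chunks {
			select {
			case out <- &streamParcel{Data: c}:
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

// collect drains the stream into one string, stopping on the first
// parcel error, on cancellation, or when the channel closes.
func collect(ctx context.Context, ch <-chan *streamParcel) (string, error) {
	var sb strings.Builder
	for {
		select {
		case <-ctx.Done():
			return sb.String(), ctx.Err()
		case p, ok := <-ch:
			if !ok {
				return sb.String(), nil
			}
			if p.Error != nil {
				return sb.String(), p.Error
			}
			sb.WriteString(p.Data)
		}
	}
}

// run wires a cancellable context around a fake stream.
func run(chunks ...string) (string, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	return collect(ctx, fakeStream(ctx, chunks...))
}

func main() {
	text, err := run("Hel", "lo")
	fmt.Println(text, err) // prints "Hello <nil>"
}
```

Selecting on `ctx.Done()` in both the producer and the consumer lets either side abandon the stream without blocking the other.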
type Message ¶
type Message struct {
Role string `json:"role"`
Content string `json:"content"`
// Thinking contains the model's internal reasoning trace (thinking tokens).
// Only populated when thinking is enabled. Never sent back to the model.
Thinking string `json:"thinking,omitempty"`
// For tool calling (OpenAI / vLLM compatible).
ToolCalls []ToolCall `json:"tool_calls,omitempty"`
ToolCallID string `json:"tool_call_id,omitempty"`
}
Message now supports OpenAI-style tool calling:
- assistant messages can carry tool_calls
- tool messages can carry tool_call_id
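A tool-calling exchange built from these fields might look like the following sketch. The struct definitions are copied from this page (comments trimmed); the tool name, arguments, and call ID are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCall and Message are local copies of the types documented on this page.
type ToolCall struct {
	ID       string `json:"id,omitempty"`
	Type     string `json:"type"`
	Function struct {
		Name      string `json:"name"`
		Arguments string `json:"arguments"`
	} `json:"function"`
	ProviderMeta map[string]string `json:"provider_meta,omitempty"`
}

type Message struct {
	Role       string     `json:"role"`
	Content    string     `json:"content"`
	Thinking   string     `json:"thinking,omitempty"`
	ToolCalls  []ToolCall `json:"tool_calls,omitempty"`
	ToolCallID string     `json:"tool_call_id,omitempty"`
}

// toolTurn builds a user question, the assistant's tool request, and the
// tool's reply. The reply echoes the call ID so the model can pair them.
func toolTurn() []Message {
	call := ToolCall{ID: "call_1", Type: "function"}
	call.Function.Name = "get_weather"           // hypothetical tool
	call.Function.Arguments = `{"city":"Berlin"}` // arguments travel as a JSON string
	return []Message{
		{Role: "user", Content: "Weather in Berlin?"},
		{Role: "assistant", ToolCalls: []ToolCall{call}},
		{Role: "tool", Content: `{"temp_c":21}`, ToolCallID: call.ID},
	}
}

func main() {
	out, err := json.MarshalIndent(toolTurn(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Any ProviderMeta returned on a tool call should be sent back unchanged on the next turn, as the ToolCall comment requires.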
type MockChatClient ¶
type MockChatClient struct{}
MockChatClient is a mock implementation of LLMChatClient for testing.
func (*MockChatClient) Chat ¶
func (m *MockChatClient) Chat(ctx context.Context, messages []Message, opts ...ChatArgument) (ChatResult, error)
Chat returns a mock response.
func (*MockChatClient) Close ¶
func (m *MockChatClient) Close() error
Close is a no-op for the mock client.
type MockEmbedClient ¶
type MockEmbedClient struct{}
MockEmbedClient is a mock implementation of LLMEmbedClient for testing.
func (*MockEmbedClient) Close ¶
func (m *MockEmbedClient) Close() error
Close is a no-op for the mock client.
type MockPromptClient ¶
type MockPromptClient struct{}
MockPromptClient is a mock implementation of LLMPromptExecClient for testing.
func (*MockPromptClient) Close ¶
func (m *MockPromptClient) Close() error
Close is a no-op for the mock client.
type MockProvider ¶
type MockProvider struct {
ID string
Name string
ContextLength int
MaxOutputTokens int
CanChatFlag bool
CanEmbedFlag bool
CanStreamFlag bool
CanPromptFlag bool
Backends []string
}
MockProvider is a mock implementation of the Provider interface for testing.
func (*MockProvider) CanChat ¶
func (m *MockProvider) CanChat() bool
CanChat returns whether the mock provider can chat.
func (*MockProvider) CanEmbed ¶
func (m *MockProvider) CanEmbed() bool
CanEmbed returns whether the mock provider can embed.
func (*MockProvider) CanPrompt ¶
func (m *MockProvider) CanPrompt() bool
CanPrompt returns whether the mock provider can prompt.
func (*MockProvider) CanStream ¶
func (m *MockProvider) CanStream() bool
CanStream returns whether the mock provider can stream.
func (*MockProvider) CanThink ¶
func (m *MockProvider) CanThink() bool
CanThink returns whether the mock provider can think.
func (*MockProvider) GetBackendIDs ¶
func (m *MockProvider) GetBackendIDs() []string
GetBackendIDs returns the backend IDs for the mock provider.
func (*MockProvider) GetChatConnection ¶
func (m *MockProvider) GetChatConnection(ctx context.Context, backendID string) (LLMChatClient, error)
GetChatConnection returns a mock chat client.
func (*MockProvider) GetContextLength ¶
func (m *MockProvider) GetContextLength() int
GetContextLength returns the context length for the mock provider.
func (*MockProvider) GetEmbedConnection ¶
func (m *MockProvider) GetEmbedConnection(ctx context.Context, backendID string) (LLMEmbedClient, error)
GetEmbedConnection returns a mock embed client.
func (*MockProvider) GetID ¶
func (m *MockProvider) GetID() string
GetID returns the ID for the mock provider.
func (*MockProvider) GetMaxOutputTokens ¶
func (m *MockProvider) GetMaxOutputTokens() int
GetMaxOutputTokens returns the max output tokens ceiling for the mock provider.
func (*MockProvider) GetPromptConnection ¶
func (m *MockProvider) GetPromptConnection(ctx context.Context, backendID string) (LLMPromptExecClient, error)
GetPromptConnection returns a mock prompt client.
func (*MockProvider) GetStreamConnection ¶
func (m *MockProvider) GetStreamConnection(ctx context.Context, backendID string) (LLMStreamClient, error)
GetStreamConnection returns a mock stream client.
func (*MockProvider) GetType ¶
func (m *MockProvider) GetType() string
GetType returns the provider type for the mock provider.
func (*MockProvider) ModelName ¶
func (m *MockProvider) ModelName() string
ModelName returns the model name for the mock provider.
type MockStreamClient ¶
type MockStreamClient struct{}
MockStreamClient is a mock implementation of LLMStreamClient for testing.
func (*MockStreamClient) Close ¶
func (m *MockStreamClient) Close() error
Close is a no-op for the mock client.
func (*MockStreamClient) Stream ¶
func (m *MockStreamClient) Stream(ctx context.Context, messages []Message, args ...ChatArgument) (<-chan *StreamParcel, error)
Stream returns a channel with mock stream parcels.
type ObservedModel ¶
type ObservedModel struct {
Name string
ContextLength int
ModifiedAt time.Time
Size int64
Digest string
CapabilityConfig
Meta map[string]string
}
ObservedModel is the normalized result of listing models from a backend. Name is the provider-facing model identifier used for selection and execution.
type Provider ¶
type Provider interface {
GetBackendIDs() []string
ModelName() string
GetID() string
GetType() string
GetContextLength() int
// GetMaxOutputTokens returns the provider's hard ceiling on output tokens
// (maxOutputTokens / max_tokens / max_completion_tokens in the wire format).
// Returns 0 when the ceiling is unknown or effectively unlimited.
GetMaxOutputTokens() int
CanChat() bool
CanEmbed() bool
CanStream() bool
CanPrompt() bool
CanThink() bool
GetChatConnection(ctx context.Context, backendID string) (LLMChatClient, error)
GetPromptConnection(ctx context.Context, backendID string) (LLMPromptExecClient, error)
GetEmbedConnection(ctx context.Context, backendID string) (LLMEmbedClient, error)
GetStreamConnection(ctx context.Context, backendID string) (LLMStreamClient, error)
}
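Callers are expected to check a capability before asking for the matching connection. The sketch below shows that flow against a reduced, hypothetical slice of the interface; `chatClient` takes a plain prompt instead of messages, and picking the first backend ID is an arbitrary choice for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

var errNotSupported = errors.New("operation not supported")

// chatClient is a simplified stand-in for LLMChatClient.
type chatClient interface {
	Chat(ctx context.Context, prompt string) (string, error)
}

// provider is a reduced, hypothetical slice of the Provider interface.
type provider interface {
	GetBackendIDs() []string
	CanChat() bool
	GetChatConnection(ctx context.Context, backendID string) (chatClient, error)
}

type echoClient struct{}

func (echoClient) Chat(_ context.Context, prompt string) (string, error) {
	return "echo: " + prompt, nil
}

// stubProvider plays the role MockProvider plays in tests.
type stubProvider struct {
	canChat  bool
	backends []string
}

func (s stubProvider) GetBackendIDs() []string { return s.backends }
func (s stubProvider) CanChat() bool           { return s.canChat }
func (s stubProvider) GetChatConnection(context.Context, string) (chatClient, error) {
	return echoClient{}, nil
}

// chatVia checks the capability flag, picks a backend, then chats.
func chatVia(ctx context.Context, p provider, prompt string) (string, error) {
	if !p.CanChat() {
		return "", errNotSupported
	}
	ids := p.GetBackendIDs()
	if len(ids) == 0 {
		return "", errors.New("provider has no backends")
	}
	c, err := p.GetChatConnection(ctx, ids[0])
	if err != nil {
		return "", fmt.Errorf("connect %s: %w", ids[0], err)
	}
	return c.Chat(ctx, prompt)
}

// demo runs chatVia against a stub with the given capability.
func demo(canChat bool) (string, error) {
	p := stubProvider{canChat: canChat, backends: []string{"backend-1"}}
	return chatVia(context.Background(), p, "hi")
}

func main() {
	fmt.Println(demo(true))  // prints "echo: hi <nil>"
	fmt.Println(demo(false)) // prints " operation not supported"
}
```

Checking the flag first avoids opening a connection that the backend would reject anyway.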
type StreamParcel ¶
type Tool ¶
type Tool struct {
Type string `json:"type"`
Function *FunctionTool `json:"function,omitempty"`
}
type ToolCall ¶
type ToolCall struct {
ID string `json:"id,omitempty"`
Type string `json:"type"` // only "function" for now
Function struct {
Name string `json:"name"`
Arguments string `json:"arguments"`
} `json:"function"`
// ProviderMeta carries opaque provider-specific data that must be
// round-tripped back on the next turn (e.g. Gemini thought_signature).
ProviderMeta map[string]string `json:"provider_meta,omitempty"`
}
Source Files ¶
Directories ¶
| Path | Synopsis |
|---|---|
| llama | Package llama is the graduated local coding-node runtime: a persistent, workspace-scoped inference session that keeps a stable prefix's KV hot and re-prefills only the changed suffix (the live warm-reuse hot path), distinct from the toy fixed-constant `local` provider. |
| openvino | Package openvino contains the modelrepo catalog/provider shell for in-process OpenVINO (Intel) inference. |
| ovsession | Package ovsession contains the native OpenVINO session/KV bridge used by the openvino modelrepo provider. |
| owner | Package owner manages lease-based ownership of the local runtime's resident state. |
| transport | Package transport defines the protocol-agnostic surface that modeld exposes over a wire transport. |
| grpc | Package grpc is the gRPC transport for modeld. |
| grpc/modeldpb | Package modeldpb contains the modeld gRPC wire bindings. |
| leader | Package leader routes follower calls to the current modeld owner advertised in the lease file. |