Documentation ¶
Overview ¶
runtimestate implements the core logic for reconciling the declared state of LLM backends (from dbInstance) with their actual observed state. It provides the functionality for synchronizing models and processing downloads, intended to be executed repeatedly within background tasks managed externally.
Index ¶
Constants ¶
const (
	ProviderKeyPrefix = "cloud-provider:"
	OpenaiKey         = ProviderKeyPrefix + "openai"
	GeminiKey         = ProviderKeyPrefix + "gemini"
)
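Because the provider keys are plain concatenations onto ProviderKeyPrefix, a key can be recognized and split with ordinary string operations. The sketch below redeclares the constants locally so that it compiles on its own; providerName is a hypothetical helper used only for illustration and is not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies of the documented constants so the sketch compiles on its own.
const (
	ProviderKeyPrefix = "cloud-provider:"
	OpenaiKey         = ProviderKeyPrefix + "openai"
	GeminiKey         = ProviderKeyPrefix + "gemini"
)

// providerName is a hypothetical helper, not part of the package.
// It returns the part of key after ProviderKeyPrefix and reports
// whether key carried the prefix at all.
func providerName(key string) (string, bool) {
	if !strings.HasPrefix(key, ProviderKeyPrefix) {
		return "", false
	}
	return strings.TrimPrefix(key, ProviderKeyPrefix), true
}

func main() {
	for _, key := range []string{OpenaiKey, GeminiKey, "backend1"} {
		name, ok := providerName(key)
		fmt.Printf("%q -> %q %v\n", key, name, ok)
	}
}
```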
Variables ¶
This section is empty.
Functions ¶
func LocalProviderAdapter ¶
func LocalProviderAdapter(ctx context.Context, runtime map[string]LLMState) llmresolver.ProviderFromRuntimeState
LocalProviderAdapter creates providers for self-hosted backends (Ollama, vLLM).
Types ¶
type LLMState ¶
type LLMState struct {
	ID           string               `json:"id" example:"backend1"`
	Name         string               `json:"name" example:"Backend Name"`
	Models       []string             `json:"models"`
	PulledModels []ListModelResponse  `json:"pulledModels" oapiinclude:"runtimestate.ListModelResponse"`
	Backend      runtimetypes.Backend `json:"backend"`
	// Error stores a description of the last encountered error when
	// interacting with or reconciling this backend's state, if any.
	Error string `json:"error,omitempty"`
	// contains filtered or unexported fields
}
LLMState represents the observed state of a single LLM backend.
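The struct tags above determine how a state is serialized. In particular, the omitempty tag on Error means the key is absent for a healthy backend. The following sketch illustrates this with a trimmed local mirror of the struct; llmState and encode are hypothetical names, and PulledModels and Backend are left out because their types are defined elsewhere.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// llmState mirrors a subset of the documented LLMState fields.
// It is a stand-in for illustration, not the package type.
type llmState struct {
	ID     string   `json:"id"`
	Name   string   `json:"name"`
	Models []string `json:"models"`
	Error  string   `json:"error,omitempty"`
}

// encode returns the JSON form of a state as a string.
func encode(s llmState) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	healthy := llmState{ID: "backend1", Name: "Backend Name", Models: []string{"model-a"}}
	failing := healthy
	failing.Error = "connection refused"

	for _, s := range []llmState{healthy, failing} {
		out, err := encode(s)
		if err != nil {
			panic(err)
		}
		// The "error" key appears only for the failing backend.
		fmt.Println(out)
	}
}
```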
type ListModelResponse ¶ added in v0.0.17
type ListModelResponse struct {
	Name          string       `json:"name"`
	Model         string       `json:"model"`
	ModifiedAt    time.Time    `json:"modifiedAt"`
	Size          int64        `json:"size"`
	Digest        string       `json:"digest"`
	Details       ModelDetails `json:"details" oapiinclude:"runtimestate.ModelDetails"`
	ContextLength int          `json:"contextLength"`
	CanChat       bool         `json:"canChat"`
	CanEmbed      bool         `json:"canEmbed"`
	CanPrompt     bool         `json:"canPrompt"`
	CanStream     bool         `json:"canStream"`
}
type ModelDetails ¶ added in v0.0.17
type ProviderConfig ¶
func (ProviderConfig) MarshalJSON ¶
func (pc ProviderConfig) MarshalJSON() ([]byte, error)
type State ¶
type State struct {
// contains filtered or unexported fields
}
State manages the runtime status of multiple LLM backends. It synchronizes the desired configuration with the actual state of the backends and provides the mechanism for model downloads through the dwqueue component.
func New ¶
func New(ctx context.Context, dbInstance libdb.DBManager, psInstance libbus.Messenger, options ...Option) (*State, error)
New creates and initializes a new State manager. It requires a database manager (dbInstance) to load the desired configurations and a messenger instance (psInstance) for event handling and progress updates. Options allow enabling experimental features like pool-based reconciliation. Returns an initialized State ready for use.
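The variadic options parameter suggests the functional options pattern. As a minimal sketch of that pattern, assuming an option is a function that mutates internal settings: the settings, option, withPools, and newSettings names below are hypothetical stand-ins, and the actual definitions of Option and WithPools are not shown in this document.

```go
package main

import "fmt"

// settings is a hypothetical stand-in for the manager's internal configuration.
type settings struct {
	pools bool // whether pool-based reconciliation is enabled
}

// option is the assumed shape of a functional option:
// a function that adjusts settings in place.
type option func(*settings)

// withPools is a stand-in for an option that enables
// pool-based reconciliation.
func withPools() option {
	return func(s *settings) { s.pools = true }
}

// newSettings starts from the defaults and applies each option in order.
func newSettings(opts ...option) *settings {
	s := &settings{}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	fmt.Println("default pools:", newSettings().pools)
	fmt.Println("with option:  ", newSettings(withPools()).pools)
}
```

With this pattern, new optional behavior can be added later without changing the constructor's signature.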
func (*State) Get ¶
Get returns a copy of the current observed state for all backends. This provides a safe snapshot for reading state without risking modification of the internal structures.
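A copy-returning getter of this kind can be sketched as follows. The store and backendState types are hypothetical, and the sketch assumes the state is kept in a mutex-guarded map. Whether the real method also copies nested slices is not stated here.

```go
package main

import (
	"fmt"
	"sync"
)

// backendState is a hypothetical, trimmed stand-in for a per-backend state value.
type backendState struct {
	ID     string
	Models []string
}

// store is a hypothetical holder of observed state, guarded by a mutex.
type store struct {
	mu     sync.RWMutex
	states map[string]backendState
}

// snapshot returns a copy of the map and of each Models slice,
// so callers cannot modify the store through the result.
func (s *store) snapshot() map[string]backendState {
	s.mu.RLock()
	defer s.mu.RUnlock()

	out := make(map[string]backendState, len(s.states))
	for id, st := range s.states {
		st.Models = append([]string(nil), st.Models...) // detach the slice
		out[id] = st
	}
	return out
}

func main() {
	s := &store{states: map[string]backendState{
		"backend1": {ID: "backend1", Models: []string{"model-a"}},
	}}

	snap := s.snapshot()
	snap["backend1"].Models[0] = "changed" // affects only the snapshot
	delete(snap, "backend1")               // affects only the snapshot

	fmt.Println(len(s.states), s.states["backend1"].Models[0])
}
```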
func (*State) RunBackendCycle ¶
RunBackendCycle performs a single reconciliation check for all configured LLM backends. It compares the desired state (from configuration) with the observed state (obtained by communicating with the backends) and schedules the actions needed to align them, such as queuing model downloads or removals. When the pool feature is enabled via the WithPools option, it uses pool-aware reconciliation.
DESIGN NOTE: This method executes one complete reconciliation cycle and then returns. It does not manage its own background execution (for example, via internal goroutines or timers). This deliberate design choice delegates execution management (scheduling, concurrency control, lifecycle via context, error handling, circuit breaking, and so on) entirely to the caller. Consequently, an external process responsible for scheduling and lifecycle should call this method periodically.
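A caller-owned scheduling loop for this design might look like the sketch below. It assumes the cycle method has the shape func(ctx context.Context) error, which this document does not confirm. The cycleFunc, runEvery, and countCycles names are hypothetical, and a fake cycle stands in for the real method.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// cycleFunc is the assumed shape of a single-cycle method such as
// RunBackendCycle; the real signature is not shown in this document.
type cycleFunc func(ctx context.Context) error

// runEvery is a hypothetical caller-side loop. It runs cycle once per
// interval until ctx is cancelled, reporting errors through onErr
// without stopping the loop.
func runEvery(ctx context.Context, interval time.Duration, cycle cycleFunc, onErr func(error)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for ctx.Err() == nil {
		if err := cycle(ctx); err != nil && onErr != nil {
			onErr(err)
		}
		// Wait for the next tick or for cancellation, whichever comes first.
		select {
		case <-ctx.Done():
		case <-ticker.C:
		}
	}
}

// countCycles drives runEvery with a fake cycle that cancels the
// context after limit runs, and returns how many cycles executed.
func countCycles(limit int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	runs := 0
	runEvery(ctx, time.Millisecond, func(context.Context) error {
		runs++
		if runs >= limit {
			cancel()
		}
		return nil
	}, nil)
	return runs
}

func main() {
	fmt.Println("cycles executed:", countCycles(3))
}
```

Keeping the loop outside the method lets the caller choose the interval, add backoff or circuit breaking, and stop the work by cancelling the context.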
func (*State) RunDownloadCycle ¶
RunDownloadCycle processes a single pending model download operation, if one exists. It retrieves the next download task, executes the download while providing progress updates, and handles potential cancellation requests. If no download tasks are queued, it returns nil immediately.
DESIGN NOTE: This method performs one unit of work and then returns. The caller is responsible for the execution loop, which allows flexible integration with task management strategies. An external process should call this method periodically to drain the download queue.
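The "one unit of work per call" contract can be illustrated with an in-memory stand-in for the download queue. Everything in this sketch is hypothetical (queue, runOnce, drainAll); it does not use the dwqueue component, and the step that would download a model is replaced by moving a name from one slice to another.

```go
package main

import (
	"context"
	"fmt"
)

// queue is a hypothetical in-memory stand-in for the download queue.
type queue struct {
	pending []string // models waiting to be downloaded
	done    []string // models already processed
}

// runOnce processes at most one pending item and then returns.
// It returns nil immediately when nothing is queued.
func (q *queue) runOnce(ctx context.Context) error {
	if len(q.pending) == 0 {
		return nil
	}
	if err := ctx.Err(); err != nil {
		return err // honor cancellation before starting work
	}
	model := q.pending[0]
	q.pending = q.pending[1:]
	q.done = append(q.done, model) // a real implementation would download here
	return nil
}

// drainAll is the caller-owned loop: it calls runOnce until the queue
// is empty or an error occurs, and reports how many calls did work.
func drainAll(q *queue) (int, error) {
	ctx := context.Background()
	calls := 0
	for len(q.pending) > 0 {
		if err := q.runOnce(ctx); err != nil {
			return calls, err
		}
		calls++
	}
	return calls, nil
}

func main() {
	q := &queue{pending: []string{"model-a", "model-b"}}
	calls, err := drainAll(q)
	fmt.Println(calls, q.done, err)
}
```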