runtimestate

package
v0.0.21
Warning

This package is not in the latest version of its module.

Published: Aug 10, 2025 License: Apache-2.0 Imports: 16 Imported by: 0

Documentation

Overview

runtimestate implements the core logic for reconciling the declared state of LLM backends (from dbInstance) with their actual observed state. It provides the functionality for synchronizing models and processing downloads, intended to be executed repeatedly within background tasks managed externally.

Index

Constants

View Source
const (
	ProviderKeyPrefix = "cloud-provider:"
	OpenaiKey         = ProviderKeyPrefix + "openai"
	GeminiKey         = ProviderKeyPrefix + "gemini"
)
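These constants namespace cloud-provider entries in the runtime state map. A minimal sketch showing the resulting key values (constant values copied from the declarations above; the program wrapper is illustrative):

```go
package main

import "fmt"

// Mirrors the package constants documented above.
const (
	ProviderKeyPrefix = "cloud-provider:"
	OpenaiKey         = ProviderKeyPrefix + "openai"
	GeminiKey         = ProviderKeyPrefix + "gemini"
)

func main() {
	// Keys distinguish cloud providers from self-hosted backend IDs.
	fmt.Println(OpenaiKey)
	fmt.Println(GeminiKey)
}
```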

Variables

This section is empty.

Functions

func LocalProviderAdapter

func LocalProviderAdapter(ctx context.Context, runtime map[string]LLMState) llmresolver.ProviderFromRuntimeState

LocalProviderAdapter creates providers for self-hosted backends (Ollama, vLLM).

Types

type LLMState

type LLMState struct {
	ID           string               `json:"id" example:"backend1"`
	Name         string               `json:"name" example:"Backend Name"`
	Models       []string             `json:"models"`
	PulledModels []ListModelResponse  `json:"pulledModels" oapiinclude:"runtimestate.ListModelResponse"`
	Backend      runtimetypes.Backend `json:"backend"`
	// Error stores a description of the last encountered error when
	// interacting with or reconciling this backend's state, if any.
	Error string `json:"error,omitempty"`
	// contains filtered or unexported fields
}

LLMState represents the observed state of a single LLM backend.

func (*LLMState) GetAPIKey

func (s *LLMState) GetAPIKey() string

type ListModelResponse added in v0.0.17

type ListModelResponse struct {
	Name          string       `json:"name"`
	Model         string       `json:"model"`
	ModifiedAt    time.Time    `json:"modifiedAt"`
	Size          int64        `json:"size"`
	Digest        string       `json:"digest"`
	Details       ModelDetails `json:"details" oapiinclude:"runtimestate.ModelDetails"`
	ContextLength int          `json:"contextLength"`
	CanChat       bool         `json:"canChat"`
	CanEmbed      bool         `json:"canEmbed"`
	CanPrompt     bool         `json:"canPrompt"`
	CanStream     bool         `json:"canStream"`
}

type ModelDetails added in v0.0.17

type ModelDetails struct {
	ParentModel       string   `json:"parentModel"`
	Format            string   `json:"format"`
	Family            string   `json:"family"`
	Families          []string `json:"families"`
	ParameterSize     string   `json:"parameterSize"`
	QuantizationLevel string   `json:"quantizationLevel"`
}

type Option

type Option func(*State)

func WithPools

func WithPools() Option
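Option follows the standard Go functional-options pattern: each Option mutates the State during construction. A minimal self-contained sketch of that pattern, assuming an unexported pools flag (the field name is illustrative, not the package's real internals):

```go
package main

import "fmt"

// State and Option mirror the package's construction pattern;
// poolsEnabled is a hypothetical stand-in for the real field.
type State struct {
	poolsEnabled bool
}

type Option func(*State)

// WithPools enables the experimental pool-aware reconciliation.
func WithPools() Option {
	return func(s *State) { s.poolsEnabled = true }
}

// New applies each option to a freshly constructed State.
func New(options ...Option) *State {
	s := &State{}
	for _, opt := range options {
		opt(s)
	}
	return s
}

func main() {
	s := New(WithPools())
	fmt.Println("pools enabled:", s.poolsEnabled)
}
```

The real constructor additionally takes a context, a libdb.DBManager, and a libbus.Messenger, but the option-application step works the same way.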

type ProviderConfig

type ProviderConfig struct {
	APIKey string
	Type   string
}

func (ProviderConfig) MarshalJSON

func (pc ProviderConfig) MarshalJSON() ([]byte, error)

type State

type State struct {
	// contains filtered or unexported fields
}

State manages the overall runtime status of multiple LLM backends. It orchestrates the synchronization between the desired configuration and the actual state of the backends, including providing the mechanism for model downloads via the dwqueue component.

func New

func New(ctx context.Context, dbInstance libdb.DBManager, psInstance libbus.Messenger, options ...Option) (*State, error)

New creates and initializes a new State manager. It requires a database manager (dbInstance) to load the desired configurations and a messenger instance (psInstance) for event handling and progress updates. Options allow enabling experimental features like pool-based reconciliation. Returns an initialized State ready for use.

func (*State) Get

func (s *State) Get(ctx context.Context) map[string]LLMState

Get returns a copy of the current observed state for all backends. This provides a safe snapshot for reading state without risking modification of the internal structures.
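The safety of this snapshot comes from copying the map before returning it, so callers can read or even mutate the result without touching internal state. A minimal sketch of that copy semantics, using a stand-in map type rather than the real LLMState:

```go
package main

import "fmt"

// snapshot illustrates Get-style copy semantics: the returned map is
// independent of the internal one, so caller mutations are harmless.
func snapshot(internal map[string]string) map[string]string {
	out := make(map[string]string, len(internal))
	for k, v := range internal {
		out[k] = v
	}
	return out
}

func main() {
	internal := map[string]string{"backend1": "ok"}

	s := snapshot(internal)
	s["backend1"] = "mutated" // only the copy changes

	fmt.Println(internal["backend1"])
}
```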

func (*State) RunBackendCycle

func (s *State) RunBackendCycle(ctx context.Context) error

RunBackendCycle performs a single reconciliation check for all configured LLM backends. It compares the desired state (from configuration) with the observed state (by communicating with the backends) and schedules the actions needed to align them, such as queuing model downloads or removals.

DESIGN NOTE: this method executes one complete reconciliation cycle and then returns. It does not manage its own background execution (e.g., via internal goroutines or timers). This deliberate design choice delegates execution management (scheduling, concurrency control, lifecycle via context, error handling, circuit breaking, etc.) entirely to the caller.

Consequently, this method should be called periodically by an external process responsible for its scheduling and lifecycle. When the pool feature is enabled via the WithPools option, it uses pool-aware reconciliation.

func (*State) RunDownloadCycle

func (s *State) RunDownloadCycle(ctx context.Context) error

RunDownloadCycle processes a single pending model download operation, if one exists. It retrieves the next download task, executes the download while providing progress updates, and handles cancellation requests. If no download tasks are queued, it returns nil immediately.

DESIGN NOTE: this method performs one unit of work and returns. The caller is responsible for the execution loop, allowing flexible integration with task management strategies.

This method should be called periodically by an external process to drain the download queue.
