Documentation ¶
Overview ¶
Package serving implements a minimal programmatic serving layer.
Index ¶
- type AgenticLayer
- func (l *AgenticLayer) AddLLMBackend(name string, backend *core.LLMBackend, modelID string)
- func (l *AgenticLayer) CheckBackendAccess(tags map[string]string, backendName string) access.Decision
- func (l *AgenticLayer) CheckDataAccess(tags map[string]string, backendName string, dataLabels []string) access.Decision
- func (l *AgenticLayer) CheckManifestBounds(manifest *core.SkillManifest, backendName string, toolNames []string, ...) access.Decision
- func (l *AgenticLayer) CheckSkillAccess(tags map[string]string, skillName string) access.Decision
- func (l *AgenticLayer) CheckSkillEnvelope(tags map[string]string, skillID string, manifest *core.SkillManifest) access.Decision
- func (l *AgenticLayer) CheckToolAccess(tags map[string]string, tools []*mcp.Tool) access.Decision
- func (l *AgenticLayer) CheckToolAccessByName(tags map[string]string, toolName string) access.Decision
- func (l *AgenticLayer) Execute(ctx context.Context, serverName, stageName string, messages []model.Message, ...) (*model.Response, error)
- func (l *AgenticLayer) ExecuteStream(ctx context.Context, serverName, stageName string, messages []model.Message, ...) (*model.Response, <-chan model.StreamEvent, error)
- func (l *AgenticLayer) GetCostModel(backendName string) *core.CostModel
- func (l *AgenticLayer) GetLLMBackendHealth(ctx context.Context, serverName string) (HealthStatus, error)
- func (l *AgenticLayer) GetModelProvider(ctx context.Context, backendName string) (model.Provider, error)
- func (l *AgenticLayer) ListLLMBackends() []string
- func (l *AgenticLayer) NotifyWorkflowComplete(ctx context.Context, workflowID string, backends []string)
- func (l *AgenticLayer) SelectBackendByAccuracy(accuracy float64, policy string, defaultBackend string) (string, error)
- func (l *AgenticLayer) StartPressureMonitor(ctx context.Context)
- func (l *AgenticLayer) UpdateBackend(name string, update BackendUpdate) error
- func (l *AgenticLayer) ValidateAccess(tags map[string]string, backend string, toolNames []string, ...) (access.Decision, map[string]string)
- type BackendUpdate
- type ChatOptions
- type HealthStatus
- type LLMBackendManager
- func (m *LLMBackendManager) AddLLMBackend(name string, backend *core.LLMBackend, modelID string)
- func (m *LLMBackendManager) GetCostModel(backendName string) *core.CostModel
- func (m *LLMBackendManager) GetHealthStatus(ctx context.Context, backendName string) (HealthStatus, error)
- func (m *LLMBackendManager) GetModelID(backendName string) string
- func (m *LLMBackendManager) GetModelProvider(ctx context.Context, backendName string) (model.Provider, error)
- func (m *LLMBackendManager) ListLLMBackends() []string
- func (m *LLMBackendManager) ScheduleChat(ctx context.Context, backendName, stageName string, messages []model.Message, ...) (*model.Response, <-chan model.StreamEvent, error)
- func (m *LLMBackendManager) SelectBackendByAccuracy(accuracy float64, policy string, defaultBackend string) (string, error)
- func (m *LLMBackendManager) UpdateBackend(name string, update BackendUpdate) error
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type AgenticLayer ¶
type AgenticLayer struct {
MemoryManager *memory.DefaultManager
WorkflowManager *core.WorkflowManager
SkillRegistry *core.SkillRegistry
PolicyStore *access.Store
// contains filtered or unexported fields
}
AgenticLayer is the serving layer that manages LLM backends and executes inference.
func NewAgenticLayer ¶
func NewAgenticLayer() *AgenticLayer
NewAgenticLayer creates a new serving layer.
func (*AgenticLayer) AddLLMBackend ¶
func (l *AgenticLayer) AddLLMBackend(name string, backend *core.LLMBackend, modelID string)
AddLLMBackend registers an LLM backend by name.
func (*AgenticLayer) CheckBackendAccess ¶ added in v1.2.13
func (l *AgenticLayer) CheckBackendAccess(tags map[string]string, backendName string) access.Decision
CheckBackendAccess checks whether the given tags permit access to the named backend.
func (*AgenticLayer) CheckDataAccess ¶ added in v1.2.13
func (l *AgenticLayer) CheckDataAccess(tags map[string]string, backendName string, dataLabels []string) access.Decision
CheckDataAccess checks whether data with the given labels may be sent to the named backend.
func (*AgenticLayer) CheckManifestBounds ¶ added in v1.2.15
func (l *AgenticLayer) CheckManifestBounds(manifest *core.SkillManifest, backendName string, toolNames []string, dataLabels []string) access.Decision
CheckManifestBounds verifies that the actual request resources are within the skill's declared manifest.
func (*AgenticLayer) CheckSkillAccess ¶ added in v1.2.15
func (l *AgenticLayer) CheckSkillAccess(tags map[string]string, skillName string) access.Decision
CheckSkillAccess checks whether the given tags permit invocation of the named skill.
func (*AgenticLayer) CheckSkillEnvelope ¶ added in v1.2.15
func (l *AgenticLayer) CheckSkillEnvelope(tags map[string]string, skillID string, manifest *core.SkillManifest) access.Decision
CheckSkillEnvelope verifies that the skill's manifest is within the intersection of three bounds: the manifest itself, skill-scoped policies, and the invoker's policies.
func (*AgenticLayer) CheckToolAccess ¶ added in v1.2.13
func (l *AgenticLayer) CheckToolAccess(tags map[string]string, tools []*mcp.Tool) access.Decision
CheckToolAccess checks whether the given tags permit all requested tools. Returns the first denial encountered, or an allowed decision if all tools pass.
func (*AgenticLayer) CheckToolAccessByName ¶ added in v1.2.15
func (l *AgenticLayer) CheckToolAccessByName(tags map[string]string, toolName string) access.Decision
CheckToolAccessByName checks whether the given tags permit a single tool by name.
func (*AgenticLayer) Execute ¶
func (l *AgenticLayer) Execute(ctx context.Context, serverName, stageName string, messages []model.Message, tools []*mcp.Tool, opts model.InferenceOptions, chatOpts ...ChatOptions) (*model.Response, error)
Execute runs a single non-streaming inference call against the named LLM backend. For streaming, use ExecuteStream instead. opts.Stream must be false.
func (*AgenticLayer) ExecuteStream ¶
func (l *AgenticLayer) ExecuteStream(ctx context.Context, serverName, stageName string, messages []model.Message, tools []*mcp.Tool, opts model.InferenceOptions, chatOpts ...ChatOptions) (*model.Response, <-chan model.StreamEvent, error)
ExecuteStream runs inference with streaming. It returns the response (filled as the stream is consumed), a channel of stream events, and an error. The caller must consume the channel until closed; the response content, tool_calls, and metrics are populated by the provider's goroutine as the stream completes. opts.Stream must be true.
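The consume-until-closed contract can be sketched with a self-contained example; the streamEvent type and startStream producer here are illustrative stand-ins, not the real model.StreamEvent or the package's provider goroutine:

```go
package main

import (
	"fmt"
	"strings"
)

// streamEvent is a stand-in for model.StreamEvent.
type streamEvent struct {
	Delta string
}

// startStream simulates a provider goroutine that fills the channel
// and closes it when the stream completes.
func startStream(chunks []string) <-chan streamEvent {
	ch := make(chan streamEvent)
	go func() {
		defer close(ch)
		for _, c := range chunks {
			ch <- streamEvent{Delta: c}
		}
	}()
	return ch
}

// consume drains the channel until it is closed, as ExecuteStream
// requires, and returns the accumulated content.
func consume(events <-chan streamEvent) string {
	var b strings.Builder
	for ev := range events { // range exits only when the channel is closed
		b.WriteString(ev.Delta)
	}
	return b.String()
}

func main() {
	events := startStream([]string{"Hello", ", ", "world"})
	fmt.Println(consume(events)) // prints "Hello, world"
}
```

Abandoning the channel early would leave the response fields only partially populated, which is why the caller must always drain it.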
func (*AgenticLayer) GetCostModel ¶ added in v1.2.11
func (l *AgenticLayer) GetCostModel(backendName string) *core.CostModel
GetCostModel returns the CostModel for a registered backend, or nil if not found or unset.
func (*AgenticLayer) GetLLMBackendHealth ¶
func (l *AgenticLayer) GetLLMBackendHealth(ctx context.Context, serverName string) (HealthStatus, error)
GetLLMBackendHealth returns the health status of a named LLM backend.
func (*AgenticLayer) GetModelProvider ¶
func (l *AgenticLayer) GetModelProvider(ctx context.Context, backendName string) (model.Provider, error)
GetModelProvider returns the model provider for a named LLM backend.
func (*AgenticLayer) ListLLMBackends ¶
func (l *AgenticLayer) ListLLMBackends() []string
ListLLMBackends returns all registered LLM backend names.
func (*AgenticLayer) NotifyWorkflowComplete ¶
func (l *AgenticLayer) NotifyWorkflowComplete(ctx context.Context, workflowID string, backends []string)
NotifyWorkflowComplete emits TransitionWorkflowComplete signals for each backend the workflow used, then deregisters the workflow from the tracker.
func (*AgenticLayer) SelectBackendByAccuracy ¶ added in v1.2.11
func (l *AgenticLayer) SelectBackendByAccuracy(accuracy float64, policy string, defaultBackend string) (string, error)
SelectBackendByAccuracy picks the cheapest backend whose Quality >= accuracy. Policy controls fallback behavior (see AccuracyPolicyPrefer, AccuracyPolicyStrict).
func (*AgenticLayer) StartPressureMonitor ¶
func (l *AgenticLayer) StartPressureMonitor(ctx context.Context)
StartPressureMonitor launches the background memory pressure polling loop. It dynamically queries the current set of backends on each tick and stops when ctx is cancelled.
func (*AgenticLayer) UpdateBackend ¶ added in v1.2.11
func (l *AgenticLayer) UpdateBackend(name string, update BackendUpdate) error
UpdateBackend applies a partial update to a registered backend.
func (*AgenticLayer) ValidateAccess ¶ added in v1.2.15
func (l *AgenticLayer) ValidateAccess(tags map[string]string, backend string, toolNames []string, dataLabels []string, skillID string) (access.Decision, map[string]string)
ValidateAccess runs all access control checks for a request and returns the first denial, or an allowed decision if everything passes. Both handleExecute and handleAccessCheck call this method.
If skillID is non-empty, the skill must be registered. The method performs skill visibility, envelope, and manifest bounds checks before the standard backend/tool/data checks. On success with a skill, the returned tags map includes the injected "skill" tag for downstream policy matching.
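The first-denial contract can be illustrated with a minimal sketch; the decision type here stands in for access.Decision, and the check functions are placeholders for the individual skill, backend, tool, and data checks:

```go
package main

import "fmt"

// decision is a stand-in for access.Decision.
type decision struct {
	Allowed bool
	Reason  string
}

// firstDenial runs checks in order and returns the first denial,
// mirroring the ValidateAccess contract described above. Later checks
// are never evaluated once one denies.
func firstDenial(checks []func() decision) decision {
	for _, check := range checks {
		if d := check(); !d.Allowed {
			return d
		}
	}
	return decision{Allowed: true}
}

func main() {
	d := firstDenial([]func() decision{
		func() decision { return decision{Allowed: true} },                        // e.g. backend check
		func() decision { return decision{Allowed: false, Reason: "tool denied"} }, // first denial wins
		func() decision { return decision{Allowed: false, Reason: "data denied"} }, // never reached
	})
	fmt.Println(d.Reason) // prints "tool denied"
}
```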
type BackendUpdate ¶ added in v1.2.11
type BackendUpdate struct {
CostModel *core.CostModel `json:"cost_model,omitempty"`
Quality *float64 `json:"quality,omitempty"`
MaxConcurrency *int `json:"max_concurrency,omitempty"`
}
BackendUpdate holds the optional fields that can be live-updated on a registered backend. Nil fields are left unchanged.
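The nil-means-unchanged semantics can be illustrated with a minimal stand-in; the struct fields here are simplified from the real core types:

```go
package main

import "fmt"

// backend is a simplified stand-in for a registered backend's
// mutable fields.
type backend struct {
	Quality        float64
	MaxConcurrency int
}

// backendUpdate mirrors the pointer-field convention of BackendUpdate:
// nil fields are left unchanged.
type backendUpdate struct {
	Quality        *float64
	MaxConcurrency *int
}

// apply copies only the non-nil fields onto the backend.
func (b *backend) apply(u backendUpdate) {
	if u.Quality != nil {
		b.Quality = *u.Quality
	}
	if u.MaxConcurrency != nil {
		b.MaxConcurrency = *u.MaxConcurrency
	}
}

func main() {
	b := backend{Quality: 0.8, MaxConcurrency: 4}
	q := 0.95
	b.apply(backendUpdate{Quality: &q}) // MaxConcurrency stays at 4
	fmt.Println(b.Quality, b.MaxConcurrency)
}
```

Pointer fields (rather than zero-value checks) are what make it possible to distinguish "leave unchanged" from "set to zero" in a partial update.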
type ChatOptions ¶
ChatOptions carries optional metadata for a scheduled chat request.
type HealthStatus ¶
type HealthStatus string
HealthStatus represents the health status of an LLM backend.
const (
	HealthStatusHealthy  HealthStatus = "healthy"
	HealthStatusDegraded HealthStatus = "degraded"
)
type LLMBackendManager ¶
type LLMBackendManager struct {
// contains filtered or unexported fields
}
LLMBackendManager manages a pool of LLM backend configurations and their providers.
func NewLLMBackendManager ¶
func NewLLMBackendManager(mm *memory.DefaultManager) *LLMBackendManager
NewLLMBackendManager creates a new LLM backend manager.
func (*LLMBackendManager) AddLLMBackend ¶
func (m *LLMBackendManager) AddLLMBackend(name string, backend *core.LLMBackend, modelID string)
AddLLMBackend registers an LLM backend by name.
func (*LLMBackendManager) GetCostModel ¶ added in v1.2.11
func (m *LLMBackendManager) GetCostModel(backendName string) *core.CostModel
GetCostModel returns the CostModel for a registered backend, or nil if not found or unset.
func (*LLMBackendManager) GetHealthStatus ¶
func (m *LLMBackendManager) GetHealthStatus(ctx context.Context, backendName string) (HealthStatus, error)
GetHealthStatus returns the health status of an LLM backend.
func (*LLMBackendManager) GetModelID ¶
func (m *LLMBackendManager) GetModelID(backendName string) string
GetModelID returns the modelID string for a registered backend, or "" if not found.
func (*LLMBackendManager) GetModelProvider ¶
func (m *LLMBackendManager) GetModelProvider(ctx context.Context, backendName string) (model.Provider, error)
GetModelProvider returns a cached provider for an LLM backend, creating it if necessary.
func (*LLMBackendManager) ListLLMBackends ¶
func (m *LLMBackendManager) ListLLMBackends() []string
ListLLMBackends returns a list of all LLM backend names.
func (*LLMBackendManager) ScheduleChat ¶
func (m *LLMBackendManager) ScheduleChat(ctx context.Context, backendName, stageName string, messages []model.Message, tools []*mcp.Tool, opts model.InferenceOptions, chatOpts ...ChatOptions) (*model.Response, <-chan model.StreamEvent, error)
ScheduleChat queues a request for execution under the backend's scheduling policy. stageName identifies the stage queue inside the backend; if empty, the "default" queue is used.
func (*LLMBackendManager) SelectBackendByAccuracy ¶ added in v1.2.11
func (m *LLMBackendManager) SelectBackendByAccuracy(accuracy float64, policy string, defaultBackend string) (string, error)
SelectBackendByAccuracy returns the cheapest registered backend whose Quality >= accuracy and that has a CostModel set. Ties are broken by ascending output cost, then input cost, then backend name.
The policy parameter controls fallback behavior when no backend meets the threshold:
- "strict": return an error.
- "prefer" (or empty, the default): fall back to the cheapest costed backend, or defaultBackend if no backends have cost models.
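The selection and fallback rules above can be sketched in a self-contained form. The costedBackend type is an illustrative stand-in, and using output cost, then input cost, then name as the ordering is a simplification of the cheapness metric for demonstration:

```go
package main

import (
	"fmt"
	"sort"
)

// costedBackend is a simplified stand-in for a registered backend
// that has a CostModel set.
type costedBackend struct {
	Name       string
	Quality    float64
	InputCost  float64
	OutputCost float64
}

// selectByAccuracy picks the cheapest backend with Quality >= accuracy,
// ordering by output cost, then input cost, then name. Under "strict"
// it errors when nothing qualifies; otherwise it falls back to the
// cheapest costed backend, as described above.
func selectByAccuracy(backends []costedBackend, accuracy float64, policy string) (string, error) {
	sort.Slice(backends, func(i, j int) bool {
		if backends[i].OutputCost != backends[j].OutputCost {
			return backends[i].OutputCost < backends[j].OutputCost
		}
		if backends[i].InputCost != backends[j].InputCost {
			return backends[i].InputCost < backends[j].InputCost
		}
		return backends[i].Name < backends[j].Name
	})
	for _, b := range backends {
		if b.Quality >= accuracy {
			return b.Name, nil // cheapest backend meeting the threshold
		}
	}
	if policy == "strict" {
		return "", fmt.Errorf("no backend meets accuracy %.2f", accuracy)
	}
	if len(backends) > 0 {
		return backends[0].Name, nil // fall back to the cheapest costed backend
	}
	return "", fmt.Errorf("no costed backends registered")
}

func main() {
	backends := []costedBackend{
		{Name: "small", Quality: 0.7, InputCost: 0.1, OutputCost: 0.2},
		{Name: "large", Quality: 0.95, InputCost: 1.0, OutputCost: 2.0},
	}
	name, _ := selectByAccuracy(backends, 0.9, "prefer")
	fmt.Println(name) // prints "large": small is cheaper but misses the threshold
}
```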
func (*LLMBackendManager) UpdateBackend ¶ added in v1.2.11
func (m *LLMBackendManager) UpdateBackend(name string, update BackendUpdate) error
UpdateBackend applies a partial update to an existing backend's mutable fields. Returns an error if the backend is not registered.
Directories ¶
| Path | Synopsis |
|---|---|
| access | Package access implements access control policy evaluation for the Orla serving layer. |
| api | Package api provides the HTTP API for the serving layer daemon. |
| cost | Package cost provides helpers for token-based cost estimation. |
| memory | Package memory implements the Memory Manager for Orla's agentic serving layer. |
| metrics | Package metrics provides Prometheus metrics for the Orla serving layer. |