package model

v1.1.1
Published: Jan 30, 2026 License: MIT Imports: 16 Imported by: 0

Documentation

Overview

Package model provides model integration for Orla Agent Mode (RFC 4) and cache control for LLM backends (RFC 5).

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ParseModelIdentifier

func ParseModelIdentifier(modelID string) (provider, modelName string, err error)

ParseModelIdentifier parses a model identifier string (e.g., "ollama:llama3") and returns the provider name and model name
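
A minimal usage sketch (the "ollama:llama3" identifier is only an illustrative value, and the snippet assumes this package is imported as model):

provider, modelName, err := model.ParseModelIdentifier("ollama:llama3")
if err != nil {
	log.Fatalf("invalid model identifier: %v", err)
}
fmt.Printf("provider=%s model=%s\n", provider, modelName)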

Types

type CacheController added in v1.0.0

type CacheController interface {
	// FlushCache flushes the KV cache for the backend
	FlushCache(ctx context.Context) error
	// GetCacheState returns the current cache state
	GetCacheState() CacheState
	// GetMemoryPressure returns the current memory pressure (0.0-1.0)
	// Returns 0.0 if memory pressure cannot be determined
	GetMemoryPressure(ctx context.Context) (float64, error)
}

CacheController is the interface for backend-specific cache control

func NewCacheController added in v1.0.0

func NewCacheController(serverConfig *config.LLMServerConfig) (CacheController, error)

NewCacheController creates a cache controller based on the LLM server configuration
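
A hedged usage sketch: it assumes serverConfig is an already-loaded *config.LLMServerConfig, ctx is a context.Context, and the 0.9 flush threshold is purely illustrative rather than a package default:

ctrl, err := model.NewCacheController(serverConfig)
if err != nil {
	return err
}
pressure, err := ctrl.GetMemoryPressure(ctx)
if err != nil {
	return err
}
// Flush the KV cache once utilization crosses the illustrative threshold.
if pressure > 0.9 {
	if err := ctrl.FlushCache(ctx); err != nil {
		return err
	}
	log.Printf("cache flushed at %d", ctrl.GetCacheState().LastFlushTime)
}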

type CacheState added in v1.0.0

type CacheState struct {
	// IsFlushed indicates whether the cache has been flushed
	IsFlushed bool
	// LastFlushTime is the timestamp of the last flush
	LastFlushTime int64
}

CacheState represents the state of a cache

type ContentEvent

type ContentEvent struct {
	Content string
}

ContentEvent represents a content chunk in the stream

func (*ContentEvent) Type

func (e *ContentEvent) Type() StreamEventType

type Message

type Message struct {
	Role       MessageRole `json:"role"`                   // "user", "assistant", "system", or "tool"
	Content    string      `json:"content"`                // Message content
	ToolName   string      `json:"tool_name,omitempty"`    // Tool name; required when role is "tool" for Ollama
	ToolCallID string      `json:"tool_call_id,omitempty"` // Tool call ID; required when role is "tool" for the OpenAI API
}

Message represents a chat message in a conversation

type MessageRole

type MessageRole string
const (
	MessageRoleUser      MessageRole = "user"
	MessageRoleAssistant MessageRole = "assistant"
	MessageRoleSystem    MessageRole = "system"
	MessageRoleTool      MessageRole = "tool"
)

func (MessageRole) String

func (r MessageRole) String() string
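
For illustration, a conversation history might be assembled like this (the content strings and tool identifiers are placeholders, not values produced by this package):

messages := []model.Message{
	{Role: model.MessageRoleSystem, Content: "You are a helpful assistant."},
	{Role: model.MessageRoleUser, Content: "List the files in the working directory."},
	{Role: model.MessageRoleTool, Content: `["main.go", "go.mod"]`, ToolName: "list_files", ToolCallID: "call-1"},
}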

type OllamaProvider

type OllamaProvider struct {
	// contains filtered or unexported fields
}

OllamaProvider implements the Provider interface for Ollama

func NewOllamaProvider

func NewOllamaProvider(modelName string, cfg *config.OrlaConfig) (*OllamaProvider, error)

NewOllamaProvider creates a new Ollama provider

func (*OllamaProvider) Chat

func (p *OllamaProvider) Chat(ctx context.Context, messages []Message, tools []*mcp.Tool, stream bool, maxTokens int) (*Response, <-chan StreamEvent, error)

Chat sends a chat request to Ollama

func (*OllamaProvider) EnsureReady

func (p *OllamaProvider) EnsureReady(ctx context.Context) error

EnsureReady ensures Ollama is running and ready. It checks whether Ollama is accessible via an HTTP health check.

func (*OllamaProvider) Name

func (p *OllamaProvider) Name() string

Name returns the provider name

func (*OllamaProvider) SetTimeout

func (p *OllamaProvider) SetTimeout(timeout time.Duration)

SetTimeout sets the timeout for the Ollama provider
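
A usage sketch, assuming cfg is an already-loaded *config.OrlaConfig and ctx is a context.Context; the model name and timeout are illustrative:

p, err := model.NewOllamaProvider("llama3", cfg)
if err != nil {
	return err
}
p.SetTimeout(2 * time.Minute) // illustrative value
if err := p.EnsureReady(ctx); err != nil {
	return err
}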

type OpenAIProvider added in v1.0.0

type OpenAIProvider struct {
	// contains filtered or unexported fields
}

OpenAIProvider implements the Provider interface for OpenAI-compatible APIs. It is intended to work with any server that implements the OpenAI Chat Completions API format, such as LM Studio, vLLM, and even Ollama (which also has its own dedicated provider). For Ollama, requests go through Ollama's OpenAI-compatible API [1]. [1] https://docs.ollama.com/api/openai-compatibility

func NewOpenAIProvider added in v1.0.0

func NewOpenAIProvider(modelName string, cfg *config.OrlaConfig) (*OpenAIProvider, error)

NewOpenAIProvider creates a new OpenAI-compatible provider. This works with any server that implements the OpenAI Chat Completions API format.

func (*OpenAIProvider) Chat added in v1.0.0

func (p *OpenAIProvider) Chat(ctx context.Context, messages []Message, tools []*mcp.Tool, stream bool, maxTokens int) (*Response, <-chan StreamEvent, error)

Chat sends a chat request to the OpenAI-compatible API. This works with any server implementing the OpenAI Chat Completions API format.

func (*OpenAIProvider) EnsureReady added in v1.0.0

func (p *OpenAIProvider) EnsureReady(ctx context.Context) error

EnsureReady is a no-op for the OpenAI-compatible provider.

func (*OpenAIProvider) Name added in v1.0.0

func (p *OpenAIProvider) Name() string

Name returns the provider name

type Provider

type Provider interface {
	// Name returns the provider name (e.g., "ollama", "openai", "anthropic")
	Name() string

	// Chat sends a chat request to the model and returns the response
	// messages: conversation history
	// tools: available tools (for tool calling) - uses mcp.Tool for MCP compatibility
	// stream: if true, stream responses via the returned channel
	// maxTokens: maximum number of tokens to generate; 0 means no limit (use provider default)
	Chat(ctx context.Context, messages []Message, tools []*mcp.Tool, stream bool, maxTokens int) (*Response, <-chan StreamEvent, error)

	// EnsureReady ensures the model provider is ready (e.g., starts Ollama if needed)
	// Returns an error if the provider cannot be made ready
	EnsureReady(ctx context.Context) error
}

Provider is the interface that all model providers must implement

func NewProvider

func NewProvider(cfg *config.OrlaConfig) (Provider, error)

NewProvider creates a new model provider based on the configuration
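
A non-streaming usage sketch, assuming cfg is a loaded *config.OrlaConfig, ctx is a context.Context, and messages is a []model.Message built as shown earlier; nil passes no tools and 0 leaves the token limit at the provider default:

p, err := model.NewProvider(cfg)
if err != nil {
	return err
}
if err := p.EnsureReady(ctx); err != nil {
	return err
}
resp, _, err := p.Chat(ctx, messages, nil, false, 0)
if err != nil {
	return err
}
fmt.Println(resp.Content)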

func NewProviderFromLLMServerConfig added in v1.0.0

func NewProviderFromLLMServerConfig(serverConfig *config.LLMServerConfig) (Provider, error)

NewProviderFromLLMServerConfig creates a new model provider from an LLM server configuration (RFC 5)

type Response

type Response struct {
	Content     string             `json:"content"`      // Text content from the model
	Thinking    string             `json:"thinking"`     // Thinking trace from the model (if supported)
	ToolCalls   []ToolCallWithID   `json:"tool_calls"`   // Tool calls requested by the model
	ToolResults []ToolResultWithID `json:"tool_results"` // Tool results returned by the model
	Metrics     *ResponseMetrics   `json:"metrics"`      // Response metrics
}

Response represents a model response

type ResponseMetrics added in v1.1.0

type ResponseMetrics struct {
	// TTFTMs is the time to first token in milliseconds. Only set when the task was executed with streaming.
	TTFTMs int64 `json:"ttft_ms,omitempty"`
	// TPOTMs is the time per output token in milliseconds. Only set when the task was executed with streaming.
	TPOTMs int64 `json:"tpot_ms,omitempty"`
}

type SGLangCacheController added in v1.0.0

type SGLangCacheController struct {
	// contains filtered or unexported fields
}

SGLangCacheController implements cache control for SGLang backends

func NewSGLangCacheController added in v1.0.0

func NewSGLangCacheController(baseURL string, client *http.Client) *SGLangCacheController

NewSGLangCacheController creates a new SGLang cache controller

func (*SGLangCacheController) FlushCache added in v1.0.0

func (c *SGLangCacheController) FlushCache(ctx context.Context) error

FlushCache flushes the KV cache by calling SGLang's /flush_cache endpoint

func (*SGLangCacheController) GetCacheState added in v1.0.0

func (c *SGLangCacheController) GetCacheState() CacheState

GetCacheState returns the current cache state

func (*SGLangCacheController) GetMemoryPressure added in v1.0.2

func (c *SGLangCacheController) GetMemoryPressure(ctx context.Context) (float64, error)

GetMemoryPressure queries SGLang for the current KV cache memory pressure. Returns the KV cache utilization as a fraction (0.0-1.0).
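
A direct-construction sketch; the base URL, client timeout, and 0.8 threshold are illustrative values, and ctx is assumed to be a context.Context:

ctrl := model.NewSGLangCacheController("http://localhost:30000", &http.Client{Timeout: 10 * time.Second})
pressure, err := ctrl.GetMemoryPressure(ctx)
if err != nil {
	return err
}
if pressure > 0.8 {
	if err := ctrl.FlushCache(ctx); err != nil {
		return err
	}
}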

type StreamEvent

type StreamEvent interface {
	// Type returns the type of stream event
	Type() StreamEventType
}

StreamEvent represents a single event in the streaming response

type StreamEventType

type StreamEventType string

StreamEventType represents the type of stream event

const (
	StreamEventTypeContent  StreamEventType = "content"  // Text content chunk
	StreamEventTypeToolCall StreamEventType = "toolcall" // Tool call notification
	StreamEventTypeThinking StreamEventType = "thinking" // Thinking trace chunk
)
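
When Chat is called with stream=true, the returned channel can be drained with a type switch over the concrete event types; a sketch, assuming p, ctx, and messages as in the earlier examples:

_, events, err := p.Chat(ctx, messages, nil, true, 0)
if err != nil {
	return err
}
for ev := range events {
	switch e := ev.(type) {
	case *model.ContentEvent:
		fmt.Print(e.Content)
	case *model.ThinkingEvent:
		fmt.Fprint(os.Stderr, e.Content) // e.g. keep thinking traces out of stdout
	case *model.ToolCallEvent:
		log.Printf("tool call: %s %v", e.Name, e.Arguments)
	}
}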

type StreamWriter

type StreamWriter interface {
	io.Writer
	Flush() error
}

StreamWriter is an interface for writing streaming responses
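
Any writer with a Flush() error method satisfies this interface; for example, *bufio.Writer from the standard library does:

var w model.StreamWriter = bufio.NewWriter(os.Stdout)
if _, err := w.Write([]byte("partial chunk")); err != nil {
	return err
}
if err := w.Flush(); err != nil {
	return err
}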

type ThinkingEvent

type ThinkingEvent struct {
	Content string
}

ThinkingEvent represents a thinking trace chunk in the stream

func (*ThinkingEvent) Type

func (e *ThinkingEvent) Type() StreamEventType

type ToolCallEvent

type ToolCallEvent struct {
	Name      string
	Arguments map[string]any
}

ToolCallEvent represents a tool call notification in the stream

func (*ToolCallEvent) Type

func (e *ToolCallEvent) Type() StreamEventType

type ToolCallWithID

type ToolCallWithID struct {
	ID                string `json:"id"` // Unique identifier for this tool call
	McpCallToolParams mcp.CallToolParams
}

ToolCallWithID represents a tool invocation request from the model. It embeds mcp.CallToolParams for MCP compatibility, and adds an ID for tracking in the agent loop (to match results back to calls).

type ToolResultWithID

type ToolResultWithID struct {
	ID                string `json:"id"` // Tool call ID this result corresponds to
	McpCallToolResult mcp.CallToolResult
}

ToolResultWithID represents the result of a tool execution. It embeds mcp.CallToolResult for MCP compatibility, and adds an ID to match back to the original ToolCall.
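
A sketch of matching results back to their originating calls by ID, assuming resp is a *Response returned by Chat:

resultsByID := make(map[string]model.ToolResultWithID, len(resp.ToolResults))
for _, r := range resp.ToolResults {
	resultsByID[r.ID] = r
}
for _, call := range resp.ToolCalls {
	if result, ok := resultsByID[call.ID]; ok {
		_ = result // pair each call with its result here
	}
}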
