Documentation ¶
Overview ¶
claude.go handles Claude CLI argument construction and MCP bridge configuration.
BuildClaudeArgs is the main entry point — it takes an InferenceRequest and produces the argument list for `claude -p ... --output-format stream-json`. This includes system prompt chaining (TAA context + client prompt), model mapping, tool forwarding (--allowed-tools), and MCP config.
GenerateMCPConfig creates a temporary JSON file for --mcp-config that tells Claude CLI to spawn `cog mcp serve --bridge` as an MCP subprocess.
config.go implements node-level inference configuration.
CogOS resolves inference settings across four configuration layers:

- ~/.cog/etc/inference.yaml: node level (shared across all workspaces)
- $workspace/.cog/conf/inference.yaml: workspace level (overrides node)
- Environment variables: OPENAI_API_KEY, OPENROUTER_API_KEY, etc.
- Compiled defaults: DefaultProviders() in providers.go
LoadInferenceConfig merges node + workspace layers. Environment variables and compiled defaults are handled separately by DefaultProviders().
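The node + workspace merge can be sketched as a simple overlay where later layers win per key. This is an illustrative simplification, not the real implementation (the actual InferenceConfig holds *ProviderConfig values, not plain strings):

```go
package main

import "fmt"

// InferenceConfig is a simplified stand-in for the harness config type.
type InferenceConfig struct {
	DefaultProvider string
	Providers       map[string]string // provider name -> base URL (simplified)
}

// mergeConfigs overlays the workspace layer on the node layer:
// workspace values win wherever both layers set the same key.
func mergeConfigs(node, workspace *InferenceConfig) *InferenceConfig {
	out := &InferenceConfig{Providers: map[string]string{}}
	for _, layer := range []*InferenceConfig{node, workspace} {
		if layer == nil {
			continue // a missing layer is simply skipped
		}
		if layer.DefaultProvider != "" {
			out.DefaultProvider = layer.DefaultProvider
		}
		for k, v := range layer.Providers {
			out.Providers[k] = v
		}
	}
	return out
}

func main() {
	node := &InferenceConfig{DefaultProvider: "claude", Providers: map[string]string{"ollama": "http://localhost:11434"}}
	ws := &InferenceConfig{Providers: map[string]string{"ollama": "http://gpu-box:11434"}}
	merged := mergeConfigs(node, ws)
	fmt.Println(merged.DefaultProvider, merged.Providers["ollama"])
}
```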
Package harness provides the inference execution engine for CogOS.
Architecture ¶
The harness is a separate Go module that owns all inference logic: model routing, Claude CLI execution, HTTP provider dispatch, streaming, retries, and tool pipeline management. It connects to the kernel through a single KernelServices interface — the harness never imports package main.
┌──────────────────────────────────────────────────┐
│ kernel (package main)                            │
│                                                  │
│ serve.go ──► kernel_harness.go ──► harness pkg   │
│ cog.go   ──► kernel_harness.go ──► harness pkg   │
│                      │                           │
│            kernelServicesAdapter                 │
│            implements KernelServices             │
└──────────────────────────────────────────────────┘
Common Paths ¶
HTTP request (POST /v1/chat/completions):
serve.go:handleChatCompletions
→ kernel_harness.go:HarnessRunInference / HarnessRunInferenceStream
→ converts kernel types → harness types
→ harness.Harness.RunInference / RunInferenceStream
→ ParseModelProvider (routes claude vs http)
→ Claude path: BuildClaudeArgs → exec claude CLI
→ HTTP path: runHTTPInference → OpenAI-compatible API
→ converts harness response → kernel types
→ serve.go formats OpenAI-compatible HTTP response
CLI request (cog infer "prompt"):
inference.go:cmdInfer
→ kernel_harness.go:HarnessRunInference
→ (same path as above)
File Layout ¶
harness.go     Harness struct, New(), RunInference, RunInferenceStream, RunInferenceWithRetry
interfaces.go  KernelServices interface — the only bridge contract
types.go       InferenceRequest/Response, ContextState, ChatMessage, API response types
stream.go      StreamChunkInference, Claude CLI wire types, OpenAI wire types
providers.go   ProviderType, ProviderConfig, ParseModelProvider, DefaultProviders
claude.go      BuildClaudeArgs, GenerateMCPConfig, chainSystemPrompt, BuildContextMetrics
http.go        runHTTPInference, runHTTPInferenceStream (OpenAI-compatible providers)
tools.go       MapToolsToCLINames — OpenAI tool defs → Claude CLI --allowed-tools
registry.go    RequestRegistry — tracks in-flight requests for cancellation/listing
retry.go       ErrorType classification, ClassifyError, ClassifyHTTPError
config.go      Node-level config resolution (~/.cog/etc/ → workspace → env → defaults)
otel.go        Package-scoped OTEL tracer
http.go handles inference via OpenAI-compatible HTTP APIs (OpenAI, OpenRouter, Ollama, and custom endpoints).
These functions are called by Harness.RunInference and Harness.RunInferenceStream when ParseModelProvider returns a non-Claude provider. They build an OpenAI chat completion request, send it to the provider's /chat/completions endpoint, and parse the response (sync or SSE stream).
interfaces.go defines the bridge contract between the harness and the kernel.
The harness never imports package main. All kernel dependencies flow through KernelServices, which the kernel implements in kernel_harness.go via kernelServicesAdapter.
To add a new kernel capability to the harness:
- Add a method to KernelServices here
- Implement it on kernelServicesAdapter in kernel_harness.go
- Call it from harness code via h.kernel.NewMethod()
otel.go initializes the OpenTelemetry tracer for the harness package. All harness spans (inference dispatch, CLI execution, HTTP provider calls) are recorded under the "cogos-harness" instrumentation scope.
providers.go defines inference provider types and model routing.
ParseModelProvider is the routing function — it takes a model string from the request and returns which provider to use. This is called early in both RunInference and RunInferenceStream to decide the Claude CLI vs HTTP path.
DefaultProviders returns the built-in provider configs. API keys come from environment variables (OPENAI_API_KEY, OPENROUTER_API_KEY). Ollama and local providers don't require keys.
registry.go tracks in-flight inference requests for visibility and cancellation.
Each Harness owns a RequestRegistry. When a request starts, it's registered with status "running". On completion it moves to "completed", "failed", or "cancelled". The kernel exposes registry contents via GET /v1/requests and supports cancellation via DELETE /v1/requests/:id.
StartRegistryCleanup should be called once at startup to periodically remove stale entries (completed/failed/cancelled older than 1 hour).
retry.go provides error classification and retry constants.
ClassifyError inspects error messages to determine whether an error is retryable. ClassifyHTTPError does the same for HTTP status codes. These are used by RunInferenceWithRetry to decide whether to retry, and by RunInference to tag responses with ErrorType for callers.
stream.go defines streaming types and Claude CLI wire format types.
StreamChunkInference is the harness's unified streaming event. It carries text deltas, tool calls, tool results, usage data, and session metadata through a single channel from RunInferenceStream to the caller.
The Claude* types (ClaudeStreamMessage, ClaudeMessage, ClaudeContent, etc.) represent the JSON wire format emitted by `claude --output-format stream-json`. The OpenAI* types represent the wire format for HTTP provider communication.
tools.go maps OpenAI-format tool definitions to Claude CLI tool names and classifies tool ownership (internal = CogOS executes, external = client executes). This is the plumbing that lets BrowserOS-style clients forward their `browser_*` tool definitions through the kernel without the harness silently dropping them.
Key entry points:
- MapToolsToCLINames — extract Claude CLI `--allowed-tools` names from OpenAI-format tool defs. Only internal tools have CLI names; external tools are not returned (by design — they're registered through the MCP bridge, not `--allowed-tools`).
- ClassifyTool — ownership lookup for a single tool name.
- PartitionTools — split a raw tool list into (internal, external).
- ExtractToolName — pull the function name out of one OpenAI tool def.
The internal-tool set is closed (the CogOS kernel has a fixed set of built-ins: Bash, Read, Write, Edit, Grep, Glob). Anything else is assumed external — the model is told about them via an MCP bridge and any `tool_use` events Claude emits for them are returned to the client as OpenAI-format `tool_calls` rather than executed server-side.
types.go defines the shared data types used across the harness package.
Key types and where they flow:
- InferenceRequest — input to RunInference / RunInferenceStream
- InferenceResponse — output from RunInference
- ContextState — four-tier context pipeline (identity/temporal/present/semantic)
- ChatMessage — OpenAI-format message (used in ChatCompletionRequest)
- ChatCompletionRequest — the HTTP request body for /v1/chat/completions
API response types (ModelListResponse, ProviderListResponse, etc.) are also defined here so the kernel's HTTP handlers can return harness-typed responses in future waves.
Index ¶
- Constants
- func BuildClaudeArgs(req *InferenceRequest) []string
- func BuildCodexArgs(req *InferenceRequest, schemaPath string, kernel KernelServices) ([]string, error)
- func DefaultProviders() map[ProviderType]*ProviderConfig
- func ExtractToolName(raw json.RawMessage) string
- func GenerateMCPConfig(req *InferenceRequest, kernel KernelServices) (string, error)
- func GenerateRequestID(origin string) string
- func MapToolsToCLINames(tools []json.RawMessage) []string
- func ParseModelProvider(model string) (ProviderType, string, *ProviderConfig)
- func PartitionTools(tools []json.RawMessage) (internal, external []json.RawMessage)
- func StartRegistryCleanup(registry *RequestRegistry)
- func StringToRawContent(s string) json.RawMessage
- type AgentToolPolicy
- type ChatCompletionRequest
- type ChatMessage
- type ClaudeContent
- type ClaudeMessage
- type ClaudeStreamMessage
- type ClaudeUsage
- type ContextMetrics
- type ContextState
- type ContextTier
- type ContinuationState
- type ErrorDetail
- type ErrorResponse
- type ErrorType
- type Harness
- func (h *Harness) GetActiveProvider() ProviderType
- func (h *Harness) Registry() *RequestRegistry
- func (h *Harness) RunInference(req *InferenceRequest) (*InferenceResponse, error)
- func (h *Harness) RunInferenceStream(req *InferenceRequest) (<-chan StreamChunkInference, error)
- func (h *Harness) RunInferenceWithRetry(req *InferenceRequest) (*InferenceResponse, error)
- func (h *Harness) SetActiveProvider(pt ProviderType) ProviderType
- type HookResult
- type InferenceConfig
- type InferenceRequest
- type InferenceResponse
- type KernelServices
- type MCPTool
- type ModelInfo
- type ModelListResponse
- type OpenAIChatMessage
- type OpenAIChatRequest
- type OpenAIChatResponse
- type OpenAIStreamChunk
- type OpenAIUsage
- type ProviderConfig
- type ProviderHealth
- type ProviderInfo
- type ProviderListResponse
- type ProviderPublicConfig
- type ProviderType
- type RequestEntry
- type RequestRegistry
- func (r *RequestRegistry) Cancel(id string) bool
- func (r *RequestRegistry) Cleanup(maxAge time.Duration) int
- func (r *RequestRegistry) Complete(id string, status string)
- func (r *RequestRegistry) Get(id string) *RequestEntry
- func (r *RequestRegistry) List() []RequestEntry
- func (r *RequestRegistry) ListRunning() []RequestEntry
- func (r *RequestRegistry) Register(req *InferenceRequest, cancel context.CancelFunc) *RequestEntry
- func (r *RequestRegistry) Remove(id string)
- type ResponseFormat
- type SessionInfo
- type StreamChunkInference
- type StreamOptions
- type ToolCallData
- type ToolOwnership
- type ToolResultData
- type ToolUseResultEx
- type UsageData
Constants ¶
const (
	DefaultMaxRetries = 3
	DefaultTimeout    = 2 * time.Minute
	BaseRetryDelay    = time.Second
)
Default retry configuration.
const ClaudeCommand = "claude"
ClaudeCommand is the name of the Claude CLI binary.
const (
CodexCommand = "codex"
)
Variables ¶
This section is empty.
Functions ¶
func BuildClaudeArgs ¶
func BuildClaudeArgs(req *InferenceRequest) []string
BuildClaudeArgs constructs the Claude CLI arguments from an InferenceRequest. Supports both legacy mode (SystemPrompt) and new context-aware mode (ContextState).
func BuildCodexArgs ¶
func BuildCodexArgs(req *InferenceRequest, schemaPath string, kernel KernelServices) ([]string, error)
func DefaultProviders ¶
func DefaultProviders() map[ProviderType]*ProviderConfig
DefaultProviders returns the default provider configurations. API keys are read from environment variables.
func ExtractToolName ¶
func ExtractToolName(raw json.RawMessage) string
ExtractToolName pulls the function name out of a single OpenAI-format tool definition. Returns "" for malformed input.
func GenerateMCPConfig ¶
func GenerateMCPConfig(req *InferenceRequest, kernel KernelServices) (string, error)
GenerateMCPConfig creates a temporary MCP config JSON file for Claude CLI's --mcp-config flag. The config tells Claude CLI to spawn `cog mcp serve --bridge` as an MCP server, enabling access to both CogOS and OpenClaw tools.
func GenerateRequestID ¶
func GenerateRequestID(origin string) string

GenerateRequestID creates a unique request ID with format: req-{origin}-{timestamp}-{random}
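A sketch of the documented ID shape; the timestamp resolution and random-suffix width here are assumptions, not the real implementation:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// generateRequestID builds req-{origin}-{timestamp}-{random}, using
// unix seconds and a 4-hex-digit random suffix (both assumed details).
func generateRequestID(origin string) string {
	return fmt.Sprintf("req-%s-%d-%04x", origin, time.Now().Unix(), rand.Intn(0x10000))
}

func main() {
	fmt.Println(generateRequestID("cli"))
}
```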
func MapToolsToCLINames ¶
func MapToolsToCLINames(tools []json.RawMessage) []string
MapToolsToCLINames extracts function names from OpenAI-format tool definitions and maps them to Claude CLI built-in tool names where possible.
Only internal tools produce output. External tools are silently dropped — they're registered through `--mcp-config` (or forwarded to the client as `tool_calls`) rather than `--allowed-tools`, so including them here would just make Claude CLI fail to start.
func ParseModelProvider ¶
func ParseModelProvider(model string) (ProviderType, string, *ProviderConfig)
ParseModelProvider extracts the provider and model from a model string. Formats:
- "claude" or "" -> (ProviderClaude, "claude")
- "codex" -> (ProviderCodex, "codex")
- "codex/gpt-5-codex" -> (ProviderCodex, "gpt-5-codex")
- "openai/gpt-4o" -> (ProviderOpenAI, "gpt-4o")
- "openrouter/anthropic/claude-3-haiku" -> (ProviderOpenRouter, "anthropic/claude-3-haiku")
- "ollama/llama3.2" -> (ProviderOllama, "llama3.2")
- "local/claude" -> (ProviderLocal, "claude")
- "http://localhost:8080|model-name" -> (ProviderCustom, model with custom URL)
func PartitionTools ¶
func PartitionTools(tools []json.RawMessage) (internal, external []json.RawMessage)
PartitionTools splits a list of OpenAI-format tool definitions into two disjoint slices preserving input order:
- internal: tools the harness will execute itself (mapped via Claude CLI `--allowed-tools` or the MCP bridge for CogOS built-ins).
- external: tools to be forwarded to the client as `tool_calls`.
Malformed entries (bad JSON, missing function.name) are dropped from both outputs — identical to the old MapToolsToCLINames silent-skip behaviour.
func StartRegistryCleanup ¶
func StartRegistryCleanup(registry *RequestRegistry)
StartRegistryCleanup starts a background goroutine that periodically removes completed/failed/cancelled entries older than 1 hour.
func StringToRawContent ¶
func StringToRawContent(s string) json.RawMessage
StringToRawContent converts a string to json.RawMessage for ChatMessage.Content
Types ¶
type AgentToolPolicy ¶
type AgentToolPolicy struct {
AllowedTools []string // Claude CLI --allowed-tools patterns
DenyTools []string // Tools explicitly denied
DangerouslySkipPermissions bool // Whether to pass --dangerously-skip-permissions
}
AgentToolPolicy contains the resolved tool policy for an agent from its CRD.
type ChatCompletionRequest ¶
type ChatCompletionRequest struct {
Model string `json:"model"`
Messages []ChatMessage `json:"messages"`
Stream bool `json:"stream,omitempty"`
Temperature *float64 `json:"temperature,omitempty"`
MaxTokens *int `json:"max_tokens,omitempty"`
ResponseFormat *ResponseFormat `json:"response_format,omitempty"`
SystemPrompt string `json:"system_prompt,omitempty"` // Extension for explicit system
TAA json.RawMessage `json:"taa,omitempty"` // TAA context: false/absent=none, true=default, "name"=profile
Tools []json.RawMessage `json:"tools,omitempty"` // OpenAI-format tool definitions
}
ChatCompletionRequest represents an OpenAI-compatible chat completion request
func (*ChatCompletionRequest) GetTAAProfile ¶
func (r *ChatCompletionRequest) GetTAAProfile() (string, bool)
GetTAAProfile extracts TAA profile from the request body field
func (*ChatCompletionRequest) GetTAAProfileWithHeader ¶
func (r *ChatCompletionRequest) GetTAAProfileWithHeader(header string) (string, bool)
GetTAAProfileWithHeader extracts TAA profile with header taking precedence
type ChatMessage ¶
type ChatMessage struct {
Role string `json:"role"`
Content json.RawMessage `json:"content"` // Can be string or array
}
ChatMessage represents a message in the chat format. Content is json.RawMessage to handle both string and array-of-parts formats.
func (*ChatMessage) GetContent ¶
func (m *ChatMessage) GetContent() string
GetContent extracts the text content from a ChatMessage. Handles both string format and array-of-parts format (OpenAI SDK).
type ClaudeContent ¶
type ClaudeContent struct {
Type string `json:"type"`
Text string `json:"text,omitempty"`
ID string `json:"id,omitempty"` // For tool_use blocks
Name string `json:"name,omitempty"` // For tool_use blocks
Input json.RawMessage `json:"input,omitempty"` // For tool_use blocks
ToolUseID string `json:"tool_use_id,omitempty"` // For tool_result blocks
Content string `json:"content,omitempty"` // For tool_result blocks
IsError bool `json:"is_error,omitempty"` // For tool_result blocks
}
ClaudeContent represents a content block in the message
type ClaudeMessage ¶
type ClaudeMessage struct {
Content []ClaudeContent `json:"content,omitempty"`
StopReason string `json:"stop_reason,omitempty"`
Usage *ClaudeUsage `json:"usage,omitempty"`
}
ClaudeMessage represents the nested message in assistant responses
type ClaudeStreamMessage ¶
type ClaudeStreamMessage struct {
Type string `json:"type"`
Subtype string `json:"subtype,omitempty"`
Message *ClaudeMessage `json:"message,omitempty"`
Result string `json:"result,omitempty"`
StructuredOutput json.RawMessage `json:"structured_output,omitempty"`
Usage *ClaudeUsage `json:"usage,omitempty"`
ToolUseResult *ToolUseResultEx `json:"tool_use_result,omitempty"`
}
ClaudeStreamMessage represents a message from the Claude CLI stream-json output
type ClaudeUsage ¶
type ClaudeUsage struct {
InputTokens int `json:"input_tokens,omitempty"`
OutputTokens int `json:"output_tokens,omitempty"`
CacheReadTokens int `json:"cache_read_input_tokens,omitempty"`
CacheCreateTokens int `json:"cache_creation_input_tokens,omitempty"`
CostUSD float64 `json:"cost_usd,omitempty"`
}
ClaudeUsage represents token usage info from Claude CLI output. The cache fields are populated in both result messages and message_start events.
type ContextMetrics ¶
type ContextMetrics struct {
TotalTokens int `json:"total_tokens"`
TierBreakdown map[string]int `json:"tier_breakdown"`
CoherenceScore float64 `json:"coherence_score"`
CompressionUsed bool `json:"compression_used"`
}
ContextMetrics captures metrics about context used in inference
func BuildContextMetrics ¶
func BuildContextMetrics(ctx *ContextState) *ContextMetrics
BuildContextMetrics extracts metrics from ContextState for response
type ContextState ¶
type ContextState struct {
// Tier 1: Identity (stable, ~1/3 of budget)
Tier1Identity *ContextTier `json:"tier1_identity,omitempty"`
// Tier 2: Temporal (session state, signals, history)
Tier2Temporal *ContextTier `json:"tier2_temporal,omitempty"`
// Tier 3: Present (current message context)
Tier3Present *ContextTier `json:"tier3_present,omitempty"`
// Tier 4: Semantic (constellation knowledge graph)
Tier4Semantic *ContextTier `json:"tier4_semantic,omitempty"`
// Model selection (optional override)
Model string `json:"model,omitempty"`
// Metadata
TotalTokens int `json:"total_tokens,omitempty"`
CoherenceScore float64 `json:"coherence_score,omitempty"`
ShouldRefresh bool `json:"should_refresh,omitempty"`
// TAA signals (from Tier 2 temporal analysis)
Anchor string `json:"anchor,omitempty"` // Current conversation topic
Goal string `json:"goal,omitempty"` // Detected user intent
}
ContextState represents the full context.cog.json structure. Used for context-aware invocation of the inference engine.
func (*ContextState) BuildContextString ¶
func (cs *ContextState) BuildContextString() string
BuildContextString assembles the full context string from tiers
type ContextTier ¶
type ContextTier struct {
Content string `json:"content"`
Tokens int `json:"tokens"`
Source string `json:"source,omitempty"`
}
ContextTier represents a single tier of context with metadata
type ContinuationState ¶
type ContinuationState struct {
SessionID string `json:"session_id"`
Timestamp string `json:"timestamp"`
Trigger string `json:"trigger"`
Focus string `json:"focus"`
ContinuationPrompt string `json:"continuation_prompt"`
}
ContinuationState represents the eigenfield continuation state
type ErrorDetail ¶
type ErrorDetail struct {
Message string `json:"message"`
Type string `json:"type"`
Code string `json:"code,omitempty"`
}
ErrorDetail contains error information
type ErrorResponse ¶
type ErrorResponse struct {
Error ErrorDetail `json:"error"`
}
ErrorResponse represents an API error
type ErrorType ¶
type ErrorType int
ErrorType classifies inference errors for smart recovery
func ClassifyError ¶
ClassifyError determines the error type from an error message
func ClassifyHTTPError ¶
ClassifyHTTPError maps HTTP status codes to ErrorType
type Harness ¶
type Harness struct {
// Debug mode
DebugMode bool
// contains filtered or unexported fields
}
Harness is the inference execution engine. Create one with New and call Harness.RunInference (sync) or Harness.RunInferenceStream (streaming).
The Harness owns a RequestRegistry for tracking in-flight requests and delegates kernel-specific operations (hooks, events, signals, workspace resolution) through the KernelServices interface passed to New.
func New ¶
func New(kernel KernelServices) *Harness
New creates a new Harness connected to the kernel via the KernelServices interface.
func (*Harness) GetActiveProvider ¶
func (h *Harness) GetActiveProvider() ProviderType
GetActiveProvider returns the currently active provider
func (*Harness) Registry ¶
func (h *Harness) Registry() *RequestRegistry
Registry returns the harness's request registry for external use (e.g., HTTP handlers).
func (*Harness) RunInference ¶
func (h *Harness) RunInference(req *InferenceRequest) (*InferenceResponse, error)
RunInference executes a non-streaming inference request and blocks until complete.
Routing is determined by the Model field in the request:
- "" or "claude" → Claude CLI (default path)
- "openai/gpt-4o" → OpenAI API via HTTP
- "openrouter/claude-3" → OpenRouter API via HTTP
- "ollama/llama3.2" → Local Ollama via HTTP
- "http://host|model" → Custom OpenAI-compatible endpoint
The full lifecycle for each request:
- Inject continuation context (eigenfield persistence)
- Dispatch PreInference hook (may block or inject context)
- Register in RequestRegistry (for /v1/requests visibility)
- Route to Claude CLI or HTTP provider
- Emit INFERENCE_START / INFERENCE_COMPLETE / INFERENCE_ERROR events
- Dispatch PostInference hook (artifact extraction, logging)
- Clean up signal field
func (*Harness) RunInferenceStream ¶
func (h *Harness) RunInferenceStream(req *InferenceRequest) (<-chan StreamChunkInference, error)
RunInferenceStream executes a streaming inference request and returns immediately. The returned channel receives StreamChunkInference values and closes when the inference completes (look for chunk.Done == true as the final message).
Routing follows the same rules as Harness.RunInference. For the Claude CLI path, the stream includes rich events: text deltas, tool_use start/delta/stop, tool_result, session_info, and a final Done chunk with usage data.
The caller should drain the channel fully. Context cancellation is respected — cancelling the request context will terminate the underlying CLI process or HTTP connection and close the channel.
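The consumption pattern looks like this in caller code, with StreamChunk as a minimal stand-in for StreamChunkInference:

```go
package main

import "fmt"

// StreamChunk is a minimal stand-in for StreamChunkInference.
type StreamChunk struct {
	Text string
	Done bool
}

// drain reads until the channel closes, treating the Done chunk as the
// end-of-stream marker (the real final chunk also carries usage data).
func drain(ch <-chan StreamChunk) string {
	var out string
	for chunk := range ch {
		out += chunk.Text
		if chunk.Done {
			break
		}
	}
	return out
}

func main() {
	ch := make(chan StreamChunk, 3)
	ch <- StreamChunk{Text: "hel"}
	ch <- StreamChunk{Text: "lo"}
	ch <- StreamChunk{Done: true}
	close(ch)
	fmt.Println(drain(ch))
}
```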
func (*Harness) RunInferenceWithRetry ¶
func (h *Harness) RunInferenceWithRetry(req *InferenceRequest) (*InferenceResponse, error)
RunInferenceWithRetry wraps Harness.RunInference with automatic retry for transient errors. Uses exponential backoff (1s, 2s, 4s…) capped at 30s. Rate limit errors (429) get 2x longer delays. Auth and fatal errors are never retried. Set InferenceRequest.MaxRetries to override the default (3).
func (*Harness) SetActiveProvider ¶
func (h *Harness) SetActiveProvider(pt ProviderType) ProviderType
SetActiveProvider sets the active provider, returns the previous one
type HookResult ¶
type HookResult struct {
Decision string `json:"decision"` // "allow" or "block"
Reason string `json:"reason,omitempty"` // Why blocked
Message string `json:"message,omitempty"` // Human-readable message
Fallback bool `json:"fallback,omitempty"` // Used default behavior
AdditionalContext string `json:"additionalContext,omitempty"` // Context to inject (for PreInference)
}
HookResult represents the result of a hook dispatch
type InferenceConfig ¶
type InferenceConfig struct {
DefaultProvider string `yaml:"default_provider,omitempty"`
Providers map[string]*ProviderConfig `yaml:"providers,omitempty"`
}
InferenceConfig represents the inference configuration file. Resolution order: node (~/.cog/etc/) → workspace (.cog/conf/) → env → defaults.
func LoadInferenceConfig ¶
func LoadInferenceConfig(workspaceRoot string) *InferenceConfig
LoadInferenceConfig loads inference configuration with four-layer resolution:
- ~/.cog/etc/inference.yaml (node — shared across workspaces)
- $workspaceRoot/.cog/conf/inference.yaml (workspace — overrides)
- Environment variables (OPENAI_API_KEY, OPENROUTER_API_KEY, etc.)
- Compiled defaults (claude, localhost:11434)
type InferenceRequest ¶
type InferenceRequest struct {
ID string // Unique request ID (auto-generated if empty)
Prompt string // User prompt
SystemPrompt string // Optional system prompt
Model string // Model to use (empty = default)
Schema json.RawMessage // Optional JSON schema for structured output
MaxTokens *int // Optional max tokens
Origin string // Where request came from: "cli", "http", "hook", "fleet"
Stream bool // Whether to stream
Context context.Context // For cancellation
// Context pipeline
ContextState *ContextState // Four-tier context for context-aware invocation
// Tool definitions
//
// Tools is the full set of OpenAI-format tool definitions from the
// client. The harness classifies each tool as internal or external via
// ClassifyTool/PartitionTools; internal tools are executed by Claude
// CLI or the CogOS kernel, external tools are forwarded to the client
// as `tool_calls` on the response.
//
// ExternalTools, when non-nil, is a pre-partitioned list of
// client-owned tools (equivalent to the `external` return of
// PartitionTools(Tools)). Callers that have already partitioned can
// populate this directly; otherwise the harness partitions Tools on
// its own. This is the plumbing BrowserOS uses to register
// `browser_*` tools it will execute itself.
Tools []json.RawMessage // OpenAI-format tool definitions from client
ExternalTools []json.RawMessage // Client-owned tool definitions (subset of Tools)
AllowedTools []string // Claude CLI --allowed-tools patterns (e.g. "Bash", "Bash(git:*)")
SkipPermissions bool // Pass --dangerously-skip-permissions to Claude CLI
// Workspace override — when set, Claude CLI runs in this directory
// instead of the kernel's workspace.
WorkspaceRoot string
// MCP bridge configuration
MCPConfig string // Path to generated --mcp-config JSON file
OpenClawURL string // OpenClaw gateway URL for bridge proxy
OpenClawToken string // Auth token for OpenClaw
SessionID string // Session context for tool execution
// Claude CLI session continuity
// When set, the harness passes --resume <ClaudeSessionID> instead of
// starting a new session. Claude Code loads the prior conversation from
// disk and continues where it left off.
ClaudeSessionID string
// Retry configuration
MaxRetries int // Max retry attempts (0 = use default)
Timeout time.Duration // Request timeout (0 = use default)
}
InferenceRequest represents input to the inference engine
type InferenceResponse ¶
type InferenceResponse struct {
ID string `json:"id"`
Content string `json:"content"`
PromptTokens int `json:"prompt_tokens"`
CompletionTokens int `json:"completion_tokens"`
FinishReason string `json:"finish_reason"`
Error error `json:"-"`
ErrorMessage string `json:"error,omitempty"`
// Anthropic cache metrics (zero for non-Claude providers)
CacheReadTokens int `json:"cache_read_input_tokens,omitempty"`
CacheCreateTokens int `json:"cache_creation_input_tokens,omitempty"`
CostUSD float64 `json:"cost_usd,omitempty"`
// Context metrics (from context pipeline)
ContextMetrics *ContextMetrics `json:"context_metrics,omitempty"`
// Error classification (for smart recovery)
ErrorType ErrorType `json:"error_type,omitempty"`
// Claude CLI session ID returned by the process. Callers should store
// this and pass it back as ClaudeSessionID on the next request to
// enable --resume continuity.
ClaudeSessionID string `json:"claude_session_id,omitempty"`
// ToolCalls lists client-owned tool invocations the model asked for
// but the harness did not execute. The caller (typically an HTTP
// handler) is expected to surface these as `tool_calls` on the
// assistant message so a BrowserOS-style client can execute them and
// send back the result on the next turn. FinishReason is set to
// "tool_calls" when this slice is non-empty.
ToolCalls []ToolCallData `json:"tool_calls,omitempty"`
}
InferenceResponse represents output from the inference engine
type KernelServices ¶
type KernelServices interface {
// WorkspaceRoot returns the resolved workspace root path.
// Used for MCP config generation and default working directory.
WorkspaceRoot() string
// DispatchHook dispatches a lifecycle hook event (e.g., "PreInference",
// "PostInference"). Returns nil if no hook matched or the hook allowed
// the action. A "block" decision aborts inference.
DispatchHook(event string, data map[string]any) *HookResult
// EmitEvent writes a timestamped event to .cog/run/events/.
// Event types: INFERENCE_START, INFERENCE_COMPLETE, INFERENCE_ERROR.
EmitEvent(eventType string, data map[string]any) error
// ReadContinuationState reads .cog/run/continuation.json for eigenfield
// persistence across context compaction. Returns (nil, err) if no state.
ReadContinuationState() (*ContinuationState, error)
// DepositSignal places a signal in the signal field at the given location.
// Used to mark inference as active (location="inference", type="active").
DepositSignal(location, signalType, agentID string, halfLife float64, meta map[string]any) error
// RemoveSignal removes a signal. Used to clear the inference-active signal
// when a request completes.
RemoveSignal(location, signalType string) error
// ResolveWorkDir determines the working directory for Claude CLI execution.
// Priority: requestWorkspace → DEFAULT_CLIENT_WORKSPACE env → kernel workspace.
ResolveWorkDir(requestWorkspace string) string
// ConvertOpenAIToolsToMCP converts OpenAI-format tool definitions to MCP
// format for the bridge subprocess. Delegates to mcp.go in the kernel.
ConvertOpenAIToolsToMCP(tools []json.RawMessage) []MCPTool
// GetAgentToolPolicy returns the tool policy for an agent from its CRD.
// Returns nil if no CRD is found (backward-compatible — no restriction).
// Used by the inference path to enforce agent-specific tool restrictions.
GetAgentToolPolicy(agentID string) (*AgentToolPolicy, error)
}
KernelServices is the single interface the harness uses to call back into the kernel. The kernel implements this with kernelServicesAdapter (kernel_harness.go).
Methods are grouped by purpose:
- Workspace: WorkspaceRoot, ResolveWorkDir
- Lifecycle: DispatchHook (PreInference/PostInference)
- Observability: EmitEvent (INFERENCE_START/COMPLETE/ERROR)
- Signal field: DepositSignal, RemoveSignal
- State: ReadContinuationState (eigenfield persistence)
- Tools: ConvertOpenAIToolsToMCP (bridge config generation)
- Agents: GetAgentToolPolicy (CRD-based tool policy enforcement)
type MCPTool ¶
type MCPTool struct {
Name string `json:"name"`
Description string `json:"description,omitempty"`
InputSchema map[string]interface{} `json:"inputSchema,omitempty"`
}
MCPTool represents a tool in MCP format (mirrors kernel's definition)
type ModelInfo ¶
type ModelInfo struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
OwnedBy string `json:"owned_by"`
}
ModelInfo represents a single model entry
type ModelListResponse ¶
ModelListResponse represents the /v1/models response
type OpenAIChatMessage ¶
OpenAIChatMessage is a single message in the chat
type OpenAIChatRequest ¶
type OpenAIChatRequest struct {
Model string `json:"model"`
Messages []OpenAIChatMessage `json:"messages"`
MaxTokens *int `json:"max_tokens,omitempty"`
Temperature *float64 `json:"temperature,omitempty"`
Stream bool `json:"stream"`
StreamOptions *StreamOptions `json:"stream_options,omitempty"`
}
OpenAIChatRequest is the request format for OpenAI-compatible APIs
type OpenAIChatResponse ¶
type OpenAIChatResponse struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
Model string `json:"model"`
Choices []struct {
Index int `json:"index"`
Message struct {
Role string `json:"role"`
Content string `json:"content"`
} `json:"message"`
FinishReason string `json:"finish_reason"`
} `json:"choices"`
Usage struct {
PromptTokens int `json:"prompt_tokens"`
CompletionTokens int `json:"completion_tokens"`
TotalTokens int `json:"total_tokens"`
} `json:"usage"`
}
OpenAIChatResponse is the response format for OpenAI-compatible APIs
type OpenAIStreamChunk ¶
type OpenAIStreamChunk struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
Model string `json:"model"`
Choices []struct {
Index int `json:"index"`
Delta struct {
Role string `json:"role,omitempty"`
Content string `json:"content,omitempty"`
} `json:"delta"`
FinishReason string `json:"finish_reason,omitempty"`
} `json:"choices"`
Usage *OpenAIUsage `json:"usage,omitempty"`
}
OpenAIStreamChunk is a single chunk in a streaming response
type OpenAIUsage ¶
type OpenAIUsage struct {
PromptTokens int `json:"prompt_tokens"`
CompletionTokens int `json:"completion_tokens"`
TotalTokens int `json:"total_tokens"`
}
OpenAIUsage represents the usage object in an OpenAI streaming chunk.
type ProviderConfig ¶
type ProviderConfig struct {
Type ProviderType `json:"type"`
BaseURL string `json:"base_url"`
APIKey string `json:"api_key"`
Model string `json:"model"` // Default model for this provider
}
ProviderConfig holds configuration for an inference provider
type ProviderHealth ¶
type ProviderHealth struct {
LastCheck *string `json:"last_check"` // ISO8601 timestamp or null
LatencyMs *int `json:"latency_ms"` // Latency in ms or null
Error *string `json:"error"` // Error message or null
}
ProviderHealth represents the health status of a provider
type ProviderInfo ¶
type ProviderInfo struct {
ID string `json:"id"`
Name string `json:"name"`
Status string `json:"status"` // "online", "offline", "unknown", "degraded"
Active bool `json:"active"`
Models []string `json:"models"`
Config ProviderPublicConfig `json:"config"`
Health ProviderHealth `json:"health"`
}
ProviderInfo represents a single provider in the API response
type ProviderListResponse ¶
type ProviderListResponse struct {
Object string `json:"object"`
Data []ProviderInfo `json:"data"`
Active string `json:"active"`
FallbackChain []string `json:"fallback_chain"`
}
ProviderListResponse represents the /v1/providers response
type ProviderPublicConfig ¶
type ProviderPublicConfig struct {
BaseURL string `json:"base_url"`
HasAPIKey bool `json:"has_api_key"`
}
ProviderPublicConfig represents publicly-visible provider configuration
type ProviderType ¶
type ProviderType string
ProviderType identifies the inference provider.
const (
	ProviderClaude     ProviderType = "claude"     // Claude CLI (default)
	ProviderCodex      ProviderType = "codex"      // Codex CLI
	ProviderOpenAI     ProviderType = "openai"     // OpenAI API
	ProviderOpenRouter ProviderType = "openrouter" // OpenRouter API
	ProviderOllama     ProviderType = "ollama"     // Ollama (local)
	ProviderLocal      ProviderType = "local"      // Local kernel endpoint (self-reference)
	ProviderCustom     ProviderType = "custom"     // Any OpenAI-compatible endpoint
)
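The overview notes that ParseModelProvider routes requests between Claude and HTTP providers based on the model string. A simplified sketch of that routing, assuming a "provider/model" prefix convention with Claude as the default (the real parsing rules may be richer):

```go
package main

import (
	"fmt"
	"strings"
)

type ProviderType string

const (
	ProviderClaude ProviderType = "claude"
	ProviderOpenAI ProviderType = "openai"
)

// parseModelProvider is a hypothetical sketch: a "provider/model" prefix
// selects the provider; bare model names default to the Claude CLI path.
func parseModelProvider(model string) (ProviderType, string) {
	if prefix, rest, ok := strings.Cut(model, "/"); ok {
		return ProviderType(prefix), rest
	}
	return ProviderClaude, model
}

func main() {
	p, m := parseModelProvider("openai/gpt-4o")
	fmt.Println(p, m) // openai gpt-4o

	p, m = parseModelProvider("sonnet")
	fmt.Println(p, m) // claude sonnet
}
```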
type RequestEntry ¶
type RequestEntry struct {
ID string `json:"id"`
Origin string `json:"origin"`
Model string `json:"model"`
Started time.Time `json:"started"`
Status string `json:"status"` // "running", "completed", "cancelled", "failed"
Cancel context.CancelFunc `json:"-"`
Prompt string `json:"prompt,omitempty"` // First 100 chars for display
}
RequestEntry represents a tracked request in the registry.
type RequestRegistry ¶
type RequestRegistry struct {
// contains filtered or unexported fields
}
RequestRegistry tracks in-flight inference requests
func NewRequestRegistry ¶
func NewRequestRegistry() *RequestRegistry
NewRequestRegistry creates a new request registry
func (*RequestRegistry) Cancel ¶
func (r *RequestRegistry) Cancel(id string) bool
Cancel cancels a request by ID; it returns true if the request was found and cancelled
func (*RequestRegistry) Cleanup ¶
func (r *RequestRegistry) Cleanup(maxAge time.Duration) int
Cleanup removes completed/failed/cancelled requests older than maxAge and returns the number removed
func (*RequestRegistry) Complete ¶
func (r *RequestRegistry) Complete(id string, status string)
Complete marks a request as finished with the given status
func (*RequestRegistry) Get ¶
func (r *RequestRegistry) Get(id string) *RequestEntry
Get retrieves a request entry by ID (returns a copy to prevent data races)
func (*RequestRegistry) List ¶
func (r *RequestRegistry) List() []RequestEntry
List returns all request entries (copies to prevent data races)
func (*RequestRegistry) ListRunning ¶
func (r *RequestRegistry) ListRunning() []RequestEntry
ListRunning returns only running request entries (copies to prevent data races)
func (*RequestRegistry) Register ¶
func (r *RequestRegistry) Register(req *InferenceRequest, cancel context.CancelFunc) *RequestEntry
Register adds a new request to the registry
func (*RequestRegistry) Remove ¶
func (r *RequestRegistry) Remove(id string)
Remove removes a request from the registry
type ResponseFormat ¶
type ResponseFormat struct {
Type string `json:"type"`
JSONSchema json.RawMessage `json:"json_schema,omitempty"`
}
ResponseFormat represents the response_format field in a chat completion request
type SessionInfo ¶
type SessionInfo struct {
SessionID string `json:"session_id"`
Model string `json:"model"`
Tools []string `json:"tools,omitempty"`
ClaudeSessionID string `json:"claude_session_id,omitempty"` // Claude CLI session for --resume
}
SessionInfo represents session metadata in streaming
type StreamChunkInference ¶
type StreamChunkInference struct {
ID string `json:"id"`
Content string `json:"content"`
Done bool `json:"done"`
FinishReason string `json:"finish_reason,omitempty"`
Error error `json:"-"`
// Rich streaming fields
EventType string `json:"event_type,omitempty"` // text, tool_use, tool_result
ToolCall *ToolCallData `json:"tool_call,omitempty"` // Tool call information
ToolResult *ToolResultData `json:"tool_result,omitempty"` // Tool result information
Usage *UsageData `json:"usage,omitempty"` // Token usage data
SessionInfo *SessionInfo `json:"session_info,omitempty"` // Session metadata
// ExternalToolCalls carries client-owned tool invocations the harness
// captured from the model's stream but did NOT execute. Populated on
// the final chunk (Done=true) when the stream contained external tool
// uses — e.g. BrowserOS's `browser_*` tools. The HTTP layer should
// emit these as OpenAI-format `tool_calls` delta events and set
// finish_reason="tool_calls" before terminating the stream.
ExternalToolCalls []ToolCallData `json:"external_tool_calls,omitempty"`
}
StreamChunkInference represents a single chunk in a streaming response.
type StreamOptions ¶
type StreamOptions struct {
IncludeUsage bool `json:"include_usage"`
}
StreamOptions controls streaming behavior (OpenAI extension). Setting IncludeUsage causes the final chunk to contain token usage.
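For illustration, a request enabling usage reporting, and the shape of the resulting final chunk (field values are illustrative, not real output):

```json
{
  "model": "claude",
  "messages": [{"role": "user", "content": "hi"}],
  "stream": true,
  "stream_options": {"include_usage": true}
}
```

```json
{
  "id": "chatcmpl-1",
  "object": "chat.completion.chunk",
  "choices": [],
  "usage": {"prompt_tokens": 12, "completion_tokens": 34, "total_tokens": 46}
}
```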
type ToolCallData ¶
type ToolCallData struct {
ID string `json:"id"`
Name string `json:"name"`
Arguments json.RawMessage `json:"arguments"`
}
ToolCallData represents a tool call in streaming
type ToolOwnership ¶
type ToolOwnership int
ToolOwnership describes who is responsible for executing a tool call.
const (
	// ToolInternal indicates the CogOS kernel or Claude CLI subprocess will
	// execute this tool directly (filesystem, shell, etc).
	ToolInternal ToolOwnership = iota

	// ToolExternal indicates the calling client (e.g. BrowserOS) owns
	// execution. The harness returns `tool_calls` to the client and expects
	// the client to send back a `role: "tool"` message with the result on
	// the next turn.
	ToolExternal
)
func ClassifyTool ¶
func ClassifyTool(name string) ToolOwnership
ClassifyTool reports whether a tool name is internal (executed by CogOS / Claude CLI) or external (executed by the client). Matching is case-insensitive and name-only — schema is irrelevant.
func (ToolOwnership) String ¶
func (o ToolOwnership) String() string
String returns a human-readable ownership label, mainly for logs/telemetry.
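A minimal sketch of name-only, case-insensitive classification follows. Treating every `browser_`-prefixed name as external is an assumption based on the BrowserOS example above; the real ClassifyTool may consult a longer list of known tools:

```go
package main

import (
	"fmt"
	"strings"
)

type ToolOwnership int

const (
	ToolInternal ToolOwnership = iota
	ToolExternal
)

// classifyTool is a hypothetical sketch of ClassifyTool: matching is
// case-insensitive and looks only at the name, never the schema.
// The "browser_" prefix rule is an assumption for this example.
func classifyTool(name string) ToolOwnership {
	if strings.HasPrefix(strings.ToLower(name), "browser_") {
		return ToolExternal
	}
	return ToolInternal
}

func main() {
	fmt.Println(classifyTool("Browser_Click") == ToolExternal) // case-insensitive match
	fmt.Println(classifyTool("Bash") == ToolInternal)          // internal by default
}
```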
type ToolResultData ¶
type ToolResultData struct {
ToolCallID string `json:"tool_call_id"`
Content string `json:"content"`
IsError bool `json:"is_error"`
}
ToolResultData represents a tool result in streaming
type ToolUseResultEx ¶
type ToolUseResultEx struct {
Stdout string `json:"stdout,omitempty"`
Stderr string `json:"stderr,omitempty"`
Interrupted bool `json:"interrupted,omitempty"`
IsImage bool `json:"isImage,omitempty"`
}
ToolUseResultEx contains extended tool result info from Claude CLI
type UsageData ¶
type UsageData struct {
InputTokens int `json:"input_tokens"`
OutputTokens int `json:"output_tokens"`
CacheReadTokens int `json:"cache_read_tokens,omitempty"`
CacheCreateTokens int `json:"cache_create_tokens,omitempty"`
CostUSD float64 `json:"cost_usd,omitempty"`
}
UsageData represents token usage in streaming