harness

package module
v0.0.0-...-dddc2fe
Published: May 4, 2026 License: MIT Imports: 22 Imported by: 0

Documentation

Overview

claude.go handles Claude CLI argument construction and MCP bridge configuration.

BuildClaudeArgs is the main entry point — it takes an InferenceRequest and produces the argument list for `claude -p ... --output-format stream-json`. This includes system prompt chaining (TAA context + client prompt), model mapping, tool forwarding (--allowed-tools), and MCP config.
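
A minimal call-site sketch (the import path is illustrative, and the exact flags in the returned slice depend on the request fields):

package main

import (
	"fmt"

	"example.com/cogos/harness" // illustrative import path
)

func main() {
	req := &harness.InferenceRequest{
		Prompt:       "summarize the repository",
		SystemPrompt: "Be concise.",
		Model:        "claude",
	}
	// Expect something like: [-p <prompt> --output-format stream-json ...]
	fmt.Println(harness.BuildClaudeArgs(req))
}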

GenerateMCPConfig creates a temporary JSON file for --mcp-config that tells Claude CLI to spawn `cog mcp serve --bridge` as an MCP subprocess.

config.go implements node-level inference configuration.

CogOS resolves inference settings across four layers:

~/.cog/etc/inference.yaml           node (shared across all workspaces)
$workspace/.cog/conf/inference.yaml workspace (overrides node)
Environment variables               OPENAI_API_KEY, OPENROUTER_API_KEY, etc.
Compiled defaults                   DefaultProviders() in providers.go

LoadInferenceConfig merges node + workspace layers. Environment variables and compiled defaults are handled separately by DefaultProviders().

Package harness provides the inference execution engine for CogOS.

Architecture

The harness is a separate Go module that owns all inference logic: model routing, Claude CLI execution, HTTP provider dispatch, streaming, retries, and tool pipeline management. It connects to the kernel through a single KernelServices interface — the harness never imports package main.

┌──────────────────────────────────────────────────┐
│  kernel (package main)                           │
│                                                  │
│  serve.go ──► kernel_harness.go ──► harness pkg  │
│  cog.go   ──► kernel_harness.go ──► harness pkg  │
│                     │                            │
│          kernelServicesAdapter                   │
│          implements KernelServices               │
└──────────────────────────────────────────────────┘

Common Paths

HTTP request (POST /v1/chat/completions):

serve.go:handleChatCompletions
  → kernel_harness.go:HarnessRunInference / HarnessRunInferenceStream
    → converts kernel types → harness types
    → harness.Harness.RunInference / RunInferenceStream
      → ParseModelProvider (routes claude vs http)
      → Claude path: BuildClaudeArgs → exec claude CLI
      → HTTP path:   runHTTPInference → OpenAI-compatible API
    → converts harness response → kernel types
  → serve.go formats OpenAI-compatible HTTP response

CLI request (cog infer "prompt"):

inference.go:cmdInfer
  → kernel_harness.go:HarnessRunInference
    → (same path as above)

File Layout

harness.go      Harness struct, New(), RunInference, RunInferenceStream, RunInferenceWithRetry
interfaces.go   KernelServices interface — the only bridge contract
types.go        InferenceRequest/Response, ContextState, ChatMessage, API response types
stream.go       StreamChunkInference, Claude CLI wire types, OpenAI wire types
providers.go    ProviderType, ProviderConfig, ParseModelProvider, DefaultProviders
claude.go       BuildClaudeArgs, GenerateMCPConfig, chainSystemPrompt, BuildContextMetrics
http.go         runHTTPInference, runHTTPInferenceStream (OpenAI-compatible providers)
tools.go        MapToolsToCLINames — OpenAI tool defs → Claude CLI --allowed-tools
registry.go     RequestRegistry — tracks in-flight requests for cancellation/listing
retry.go        ErrorType classification, ClassifyError, ClassifyHTTPError
config.go       Node-level config resolution (~/.cog/etc/ → workspace → env → defaults)
otel.go         Package-scoped OTEL tracer

http.go handles inference via OpenAI-compatible HTTP APIs (OpenAI, OpenRouter, Ollama, and custom endpoints).

These functions are called by Harness.RunInference and Harness.RunInferenceStream when ParseModelProvider returns a non-Claude provider. They build an OpenAI chat completion request, send it to the provider's /chat/completions endpoint, and parse the response (sync or SSE stream).

interfaces.go defines the bridge contract between the harness and the kernel.

The harness never imports package main. All kernel dependencies flow through KernelServices, which the kernel implements in kernel_harness.go via kernelServicesAdapter.

To add a new kernel capability to the harness (sketched in code after the steps):

  1. Add a method to KernelServices here
  2. Implement it on kernelServicesAdapter in kernel_harness.go
  3. Call it from harness code via h.kernel.NewMethod()
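
A sketch of the pattern, using a hypothetical NodeID capability (none of these names exist in the package):

package harness

// Step 1: add the method to KernelServices (shown here as a standalone
// interface so the sketch compiles on its own).
type nodeIDServices interface {
	NodeID() string // hypothetical new capability
}

// Step 2: the kernel's kernelServicesAdapter implements it (stubbed here).
type stubAdapter struct{}

func (stubAdapter) NodeID() string { return "node-1" }

// Step 3: harness code calls it through the interface.
func describeNode(k nodeIDServices) string {
	return "inference running on " + k.NodeID()
}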

otel.go initializes the OpenTelemetry tracer for the harness package. All harness spans (inference dispatch, CLI execution, HTTP provider calls) are recorded under the "cogos-harness" instrumentation scope.

providers.go defines inference provider types and model routing.

ParseModelProvider is the routing function — it takes a model string from the request and returns which provider to use. This is called early in both RunInference and RunInferenceStream to decide the Claude CLI vs HTTP path.

DefaultProviders returns the built-in provider configs. API keys come from environment variables (OPENAI_API_KEY, OPENROUTER_API_KEY). Ollama and local providers don't require keys.

registry.go tracks in-flight inference requests for visibility and cancellation.

Each Harness owns a RequestRegistry. When a request starts, it's registered with status "running". On completion it moves to "completed", "failed", or "cancelled". The kernel exposes registry contents via GET /v1/requests and supports cancellation via DELETE /v1/requests/:id.

StartRegistryCleanup should be called once at startup to periodically remove stale entries (completed/failed/cancelled older than 1 hour).
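
Typical wiring at daemon startup (a sketch; the import path is illustrative):

package main

import "example.com/cogos/harness" // illustrative import path

func startCleanup(h *harness.Harness) {
	// Run the periodic stale-entry sweep for this harness's registry.
	harness.StartRegistryCleanup(h.Registry())
}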

retry.go provides error classification and retry constants.

ClassifyError inspects error messages to determine whether an error is retryable. ClassifyHTTPError does the same for HTTP status codes. These are used by RunInferenceWithRetry to decide whether to retry, and by RunInference to tag responses with ErrorType for callers.
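
A small sketch of the classification (the expected mapping follows the ErrorType constant docs below):

package main

import (
	"fmt"

	"example.com/cogos/harness" // illustrative import path
)

func main() {
	et := harness.ClassifyHTTPError(429)
	// Per the ErrorType docs, 429 is expected to map to ErrorRateLimit.
	fmt.Println(et == harness.ErrorRateLimit, et.String())
}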

stream.go defines streaming types and Claude CLI wire format types.

StreamChunkInference is the harness's unified streaming event. It carries text deltas, tool calls, tool results, usage data, and session metadata through a single channel from RunInferenceStream to the caller.

The Claude* types (ClaudeStreamMessage, ClaudeMessage, ClaudeContent, etc.) represent the JSON wire format emitted by `claude --output-format stream-json`. The OpenAI* types represent the wire format for HTTP provider communication.

tools.go maps OpenAI-format tool definitions to Claude CLI tool names and classifies tool ownership (internal = CogOS executes, external = client executes). This is the plumbing that lets BrowserOS-style clients forward their `browser_*` tool definitions through the kernel without the harness silently dropping them.

Key entry points:

  • MapToolsToCLINames — extract Claude CLI `--allowed-tools` names from OpenAI-format tool defs. Only internal tools have CLI names; external tools are not returned (by design — they're registered through the MCP bridge, not `--allowed-tools`).
  • ClassifyTool — ownership lookup for a single tool name.
  • PartitionTools — split a raw tool list into (internal, external).
  • ExtractToolName — pull the function name out of one OpenAI tool def.

The internal-tool set is closed (the CogOS kernel has a fixed set of built-ins: Bash, Read, Write, Edit, Grep, Glob). Anything else is assumed external — the model is told about them via an MCP bridge and any `tool_use` events Claude emits for them are returned to the client as OpenAI-format `tool_calls` rather than executed server-side.
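
A sketch of the classification in practice (import path illustrative; the expected outputs follow from the docs above):

package main

import (
	"encoding/json"
	"fmt"

	"example.com/cogos/harness" // illustrative import path
)

func main() {
	tools := []json.RawMessage{
		json.RawMessage(`{"type":"function","function":{"name":"Bash","parameters":{}}}`),
		json.RawMessage(`{"type":"function","function":{"name":"browser_click","parameters":{}}}`),
	}
	internal, external := harness.PartitionTools(tools)
	fmt.Println(len(internal), len(external)) // expected: 1 1

	// Only the internal tool surfaces as a CLI name; browser_click is
	// external and is registered through the MCP bridge instead.
	fmt.Println(harness.MapToolsToCLINames(tools)) // expected: [Bash]
}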

types.go defines the shared data types used across the harness package.

Key types and where they flow:

  • InferenceRequest — input to RunInference / RunInferenceStream
  • InferenceResponse — output from RunInference
  • ContextState — four-tier context pipeline (identity/temporal/present/semantic)
  • ChatMessage — OpenAI-format message (used in ChatCompletionRequest)
  • ChatCompletionRequest — the HTTP request body for /v1/chat/completions

API response types (ModelListResponse, ProviderListResponse, etc.) are also defined here so the kernel's HTTP handlers can return harness-typed responses in future waves.

Constants

const (
	DefaultMaxRetries = 3
	DefaultTimeout    = 2 * time.Minute
	BaseRetryDelay    = time.Second
)

Default retry configuration.

const ClaudeCommand = "claude"

ClaudeCommand is the name of the Claude CLI binary.

const (
	CodexCommand = "codex"
)
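
CodexCommand is the name of the Codex CLI binary.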

Variables

This section is empty.

Functions

func BuildClaudeArgs

func BuildClaudeArgs(req *InferenceRequest) []string

BuildClaudeArgs constructs the Claude CLI arguments from an InferenceRequest. Supports both legacy mode (SystemPrompt) and new context-aware mode (ContextState).

func BuildCodexArgs

func BuildCodexArgs(req *InferenceRequest, schemaPath string, kernel KernelServices) ([]string, error)
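
BuildCodexArgs constructs the Codex CLI arguments from an InferenceRequest; it is the Codex analogue of BuildClaudeArgs.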

func DefaultProviders

func DefaultProviders() map[ProviderType]*ProviderConfig

DefaultProviders returns the default provider configurations. API keys are read from environment variables.

func ExtractToolName

func ExtractToolName(raw json.RawMessage) string

ExtractToolName pulls the function name out of a single OpenAI-format tool definition. Returns "" for malformed input.

func GenerateMCPConfig

func GenerateMCPConfig(req *InferenceRequest, kernel KernelServices) (string, error)

GenerateMCPConfig creates a temporary MCP config JSON file for Claude CLI's --mcp-config flag. The config tells Claude CLI to spawn `cog mcp serve --bridge` as an MCP server, enabling access to both CogOS and OpenClaw tools.

func GenerateRequestID

func GenerateRequestID(origin string) string

GenerateRequestID creates a unique request ID with format: req-{origin}-{timestamp}-{random}

func MapToolsToCLINames

func MapToolsToCLINames(tools []json.RawMessage) []string

MapToolsToCLINames extracts function names from OpenAI-format tool definitions and maps them to Claude CLI built-in tool names where possible.

Only internal tools produce output. External tools are silently dropped — they're registered through `--mcp-config` (or forwarded to the client as `tool_calls`) rather than `--allowed-tools`, so including them here would just make Claude CLI fail to start.

func ParseModelProvider

func ParseModelProvider(model string) (ProviderType, string, *ProviderConfig)

ParseModelProvider extracts the provider and model from a model string. Recognized formats (a sketch follows the list):

  • "claude" or "" -> (ProviderClaude, "claude")
  • "codex" -> (ProviderCodex, "codex")
  • "codex/gpt-5-codex" -> (ProviderCodex, "gpt-5-codex")
  • "openai/gpt-4o" -> (ProviderOpenAI, "gpt-4o")
  • "openrouter/anthropic/claude-3-haiku" -> (ProviderOpenRouter, "anthropic/claude-3-haiku")
  • "ollama/llama3.2" -> (ProviderOllama, "llama3.2")
  • "local/claude" -> (ProviderLocal, "claude")
  • "http://localhost:8080|model-name" -> (ProviderCustom, model with custom URL)

func PartitionTools

func PartitionTools(tools []json.RawMessage) (internal, external []json.RawMessage)

PartitionTools splits a list of OpenAI-format tool definitions into two disjoint slices preserving input order:

  • internal: tools the harness will execute itself (mapped via Claude CLI `--allowed-tools` or the MCP bridge for CogOS built-ins).
  • external: tools to be forwarded to the client as `tool_calls`.

Malformed entries (bad JSON, missing function.name) are dropped from both outputs — identical to the old MapToolsToCLINames silent-skip behaviour.

func StartRegistryCleanup

func StartRegistryCleanup(registry *RequestRegistry)

StartRegistryCleanup starts a background goroutine that periodically removes completed/failed/cancelled entries older than 1 hour.

func StringToRawContent

func StringToRawContent(s string) json.RawMessage

StringToRawContent converts a string to json.RawMessage for ChatMessage.Content

Types

type AgentToolPolicy

type AgentToolPolicy struct {
	AllowedTools               []string // Claude CLI --allowed-tools patterns
	DenyTools                  []string // Tools explicitly denied
	DangerouslySkipPermissions bool     // Whether to pass --dangerously-skip-permissions
}

AgentToolPolicy contains the resolved tool policy for an agent from its CRD.

type ChatCompletionRequest

type ChatCompletionRequest struct {
	Model          string            `json:"model"`
	Messages       []ChatMessage     `json:"messages"`
	Stream         bool              `json:"stream,omitempty"`
	Temperature    *float64          `json:"temperature,omitempty"`
	MaxTokens      *int              `json:"max_tokens,omitempty"`
	ResponseFormat *ResponseFormat   `json:"response_format,omitempty"`
	SystemPrompt   string            `json:"system_prompt,omitempty"` // Extension for explicit system
	TAA            json.RawMessage   `json:"taa,omitempty"`           // TAA context: false/absent=none, true=default, "name"=profile
	Tools          []json.RawMessage `json:"tools,omitempty"`         // OpenAI-format tool definitions
}

ChatCompletionRequest represents an OpenAI-compatible chat completion request

func (*ChatCompletionRequest) GetTAAProfile

func (r *ChatCompletionRequest) GetTAAProfile() (string, bool)

GetTAAProfile extracts the TAA profile from the request body field

func (*ChatCompletionRequest) GetTAAProfileWithHeader

func (r *ChatCompletionRequest) GetTAAProfileWithHeader(header string) (string, bool)

GetTAAProfileWithHeader extracts the TAA profile, with the header value taking precedence

type ChatMessage

type ChatMessage struct {
	Role    string          `json:"role"`
	Content json.RawMessage `json:"content"` // Can be string or array
}

ChatMessage represents a message in the chat format. Content is json.RawMessage to handle both string and array-of-parts formats.

func (*ChatMessage) GetContent

func (m *ChatMessage) GetContent() string

GetContent extracts the text content from a ChatMessage. Handles both string format and array-of-parts format (OpenAI SDK).
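
Both content shapes in a sketch (import path illustrative; expected output follows the description above):

package main

import (
	"encoding/json"
	"fmt"

	"example.com/cogos/harness" // illustrative import path
)

func main() {
	plain := harness.ChatMessage{Role: "user", Content: harness.StringToRawContent("hello")}
	parts := harness.ChatMessage{
		Role:    "user",
		Content: json.RawMessage(`[{"type":"text","text":"hello"}]`),
	}
	fmt.Println(plain.GetContent()) // hello
	fmt.Println(parts.GetContent()) // hello (text extracted from the parts array)
}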

type ClaudeContent

type ClaudeContent struct {
	Type      string          `json:"type"`
	Text      string          `json:"text,omitempty"`
	ID        string          `json:"id,omitempty"`          // For tool_use blocks
	Name      string          `json:"name,omitempty"`        // For tool_use blocks
	Input     json.RawMessage `json:"input,omitempty"`       // For tool_use blocks
	ToolUseID string          `json:"tool_use_id,omitempty"` // For tool_result blocks
	Content   string          `json:"content,omitempty"`     // For tool_result blocks
	IsError   bool            `json:"is_error,omitempty"`    // For tool_result blocks
}

ClaudeContent represents a content block in the message

type ClaudeMessage

type ClaudeMessage struct {
	Content    []ClaudeContent `json:"content,omitempty"`
	StopReason string          `json:"stop_reason,omitempty"`
	Usage      *ClaudeUsage    `json:"usage,omitempty"`
}

ClaudeMessage represents the nested message in assistant responses

type ClaudeStreamMessage

type ClaudeStreamMessage struct {
	Type             string           `json:"type"`
	Subtype          string           `json:"subtype,omitempty"`
	Message          *ClaudeMessage   `json:"message,omitempty"`
	Result           string           `json:"result,omitempty"`
	StructuredOutput json.RawMessage  `json:"structured_output,omitempty"`
	Usage            *ClaudeUsage     `json:"usage,omitempty"`
	ToolUseResult    *ToolUseResultEx `json:"tool_use_result,omitempty"`
}

ClaudeStreamMessage represents a message from the Claude CLI stream-json output

type ClaudeUsage

type ClaudeUsage struct {
	InputTokens       int     `json:"input_tokens,omitempty"`
	OutputTokens      int     `json:"output_tokens,omitempty"`
	CacheReadTokens   int     `json:"cache_read_input_tokens,omitempty"`
	CacheCreateTokens int     `json:"cache_creation_input_tokens,omitempty"`
	CostUSD           float64 `json:"cost_usd,omitempty"`
}

ClaudeUsage represents token usage info from Claude CLI output. The cache fields are populated in both result messages and message_start events.

type ContextMetrics

type ContextMetrics struct {
	TotalTokens     int            `json:"total_tokens"`
	TierBreakdown   map[string]int `json:"tier_breakdown"`
	CoherenceScore  float64        `json:"coherence_score"`
	CompressionUsed bool           `json:"compression_used"`
}

ContextMetrics captures metrics about context used in inference

func BuildContextMetrics

func BuildContextMetrics(ctx *ContextState) *ContextMetrics

BuildContextMetrics extracts metrics from ContextState for response

type ContextState

type ContextState struct {
	// Tier 1: Identity (stable, ~1/3 of budget)
	Tier1Identity *ContextTier `json:"tier1_identity,omitempty"`

	// Tier 2: Temporal (session state, signals, history)
	Tier2Temporal *ContextTier `json:"tier2_temporal,omitempty"`

	// Tier 3: Present (current message context)
	Tier3Present *ContextTier `json:"tier3_present,omitempty"`

	// Tier 4: Semantic (constellation knowledge graph)
	Tier4Semantic *ContextTier `json:"tier4_semantic,omitempty"`

	// Model selection (optional override)
	Model string `json:"model,omitempty"`

	// Metadata
	TotalTokens    int     `json:"total_tokens,omitempty"`
	CoherenceScore float64 `json:"coherence_score,omitempty"`
	ShouldRefresh  bool    `json:"should_refresh,omitempty"`

	// TAA signals (from Tier 2 temporal analysis)
	Anchor string `json:"anchor,omitempty"` // Current conversation topic
	Goal   string `json:"goal,omitempty"`   // Detected user intent
}

ContextState represents the full context.cog.json structure. It is used for context-aware invocation of the inference engine.

func (*ContextState) BuildContextString

func (cs *ContextState) BuildContextString() string

BuildContextString assembles the full context string from tiers
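
A sketch assembling a two-tier context (import path illustrative; the exact separators in the assembled string are up to the implementation):

package main

import (
	"fmt"

	"example.com/cogos/harness" // illustrative import path
)

func main() {
	cs := &harness.ContextState{
		Tier1Identity: &harness.ContextTier{Content: "You are the CogOS assistant.", Tokens: 8},
		Tier3Present:  &harness.ContextTier{Content: "User is editing claude.go.", Tokens: 7},
	}
	// Tiers 2 and 4 are nil and are presumably skipped.
	fmt.Println(cs.BuildContextString())
}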

type ContextTier

type ContextTier struct {
	Content string `json:"content"`
	Tokens  int    `json:"tokens"`
	Source  string `json:"source,omitempty"`
}

ContextTier represents a single tier of context with metadata

type ContinuationState

type ContinuationState struct {
	SessionID          string `json:"session_id"`
	Timestamp          string `json:"timestamp"`
	Trigger            string `json:"trigger"`
	Focus              string `json:"focus"`
	ContinuationPrompt string `json:"continuation_prompt"`
}

ContinuationState represents the eigenfield continuation state

type ErrorDetail

type ErrorDetail struct {
	Message string `json:"message"`
	Type    string `json:"type"`
	Code    string `json:"code,omitempty"`
}

ErrorDetail contains error information

type ErrorResponse

type ErrorResponse struct {
	Error ErrorDetail `json:"error"`
}

ErrorResponse represents an API error

type ErrorType

type ErrorType int

ErrorType classifies inference errors for smart recovery

const (
	ErrorNone            ErrorType = iota
	ErrorRateLimit                 // 429 - retry with backoff
	ErrorContextOverflow           // Context too long - compress and retry
	ErrorAuth                      // Authentication failure - fail fast
	ErrorTransient                 // Transient failure - retry with backoff
	ErrorFatal                     // Fatal error - don't retry
)

func ClassifyError

func ClassifyError(err error) ErrorType

ClassifyError determines the error type from an error message

func ClassifyHTTPError

func ClassifyHTTPError(statusCode int) ErrorType

ClassifyHTTPError maps HTTP status codes to ErrorType

func (ErrorType) String

func (e ErrorType) String() string

String returns a human-readable name for the error type

type Harness

type Harness struct {

	// Debug mode
	DebugMode bool
	// contains filtered or unexported fields
}

Harness is the inference execution engine. Create one with New and call Harness.RunInference (sync) or Harness.RunInferenceStream (streaming).

The Harness owns a RequestRegistry for tracking in-flight requests and delegates kernel-specific operations (hooks, events, signals, workspace resolution) through the KernelServices interface passed to New.

func New

func New(kernel KernelServices) *Harness

New creates a new Harness connected to the kernel via the KernelServices interface.

func (*Harness) GetActiveProvider

func (h *Harness) GetActiveProvider() ProviderType

GetActiveProvider returns the currently active provider

func (*Harness) Registry

func (h *Harness) Registry() *RequestRegistry

Registry returns the harness's request registry for external use (e.g., HTTP handlers).

func (*Harness) RunInference

func (h *Harness) RunInference(req *InferenceRequest) (*InferenceResponse, error)

RunInference executes a non-streaming inference request and blocks until complete.

Routing is determined by the Model field in the request:

  • "" or "claude" → Claude CLI (default path)
  • "openai/gpt-4o" → OpenAI API via HTTP
  • "openrouter/claude-3" → OpenRouter API via HTTP
  • "ollama/llama3.2" → Local Ollama via HTTP
  • "http://host|model" → Custom OpenAI-compatible endpoint

The full lifecycle for each request (an end-to-end sketch follows the list):

  1. Inject continuation context (eigenfield persistence)
  2. Dispatch PreInference hook (may block or inject context)
  3. Register in RequestRegistry (for /v1/requests visibility)
  4. Route to Claude CLI or HTTP provider
  5. Emit INFERENCE_START / INFERENCE_COMPLETE / INFERENCE_ERROR events
  6. Dispatch PostInference hook (artifact extraction, logging)
  7. Clean up signal field
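
An end-to-end sketch with a no-op KernelServices stub (import path illustrative; actually running this requires the claude CLI on PATH):

package main

import (
	"encoding/json"
	"fmt"

	"example.com/cogos/harness" // illustrative import path
)

// nopKernel satisfies KernelServices with no-ops, just enough to build a
// Harness. A real kernel wires these to hooks, events, and signals.
type nopKernel struct{}

func (nopKernel) WorkspaceRoot() string                                               { return "." }
func (nopKernel) DispatchHook(string, map[string]any) *harness.HookResult             { return nil }
func (nopKernel) EmitEvent(string, map[string]any) error                              { return nil }
func (nopKernel) ReadContinuationState() (*harness.ContinuationState, error)          { return nil, nil }
func (nopKernel) DepositSignal(string, string, string, float64, map[string]any) error { return nil }
func (nopKernel) RemoveSignal(string, string) error                                   { return nil }
func (nopKernel) ResolveWorkDir(w string) string                                      { return w }
func (nopKernel) ConvertOpenAIToolsToMCP([]json.RawMessage) []harness.MCPTool         { return nil }
func (nopKernel) GetAgentToolPolicy(string) (*harness.AgentToolPolicy, error)         { return nil, nil }

func main() {
	h := harness.New(nopKernel{})
	resp, err := h.RunInference(&harness.InferenceRequest{
		Prompt: "say hello",
		Origin: "cli", // Model "" routes to the Claude CLI default path
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Content)
}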

func (*Harness) RunInferenceStream

func (h *Harness) RunInferenceStream(req *InferenceRequest) (<-chan StreamChunkInference, error)

RunInferenceStream executes a streaming inference request and returns immediately. The returned channel receives StreamChunkInference values and closes when the inference completes; the final chunk has Done == true.

Routing follows the same rules as Harness.RunInference. For the Claude CLI path, the stream includes rich events: text deltas, tool_use start/delta/stop, tool_result, session_info, and a final Done chunk with usage data.

The caller should drain the channel fully. Context cancellation is respected — cancelling the request context will terminate the underlying CLI process or HTTP connection and close the channel.
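
A sketch of draining the stream (import path illustrative):

package main

import (
	"fmt"

	"example.com/cogos/harness" // illustrative import path
)

// drainStream prints text deltas and a final usage summary.
func drainStream(h *harness.Harness, req *harness.InferenceRequest) error {
	ch, err := h.RunInferenceStream(req)
	if err != nil {
		return err
	}
	for chunk := range ch {
		if chunk.Error != nil {
			return chunk.Error
		}
		fmt.Print(chunk.Content)
		if chunk.Done && chunk.Usage != nil {
			fmt.Printf("\n[in=%d out=%d tokens]\n", chunk.Usage.InputTokens, chunk.Usage.OutputTokens)
		}
	}
	return nil
}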

func (*Harness) RunInferenceWithRetry

func (h *Harness) RunInferenceWithRetry(req *InferenceRequest) (*InferenceResponse, error)

RunInferenceWithRetry wraps Harness.RunInference with automatic retry for transient errors. Uses exponential backoff (1s, 2s, 4s…) capped at 30s. Rate limit errors (429) get 2x longer delays. Auth and fatal errors are never retried. Set InferenceRequest.MaxRetries to override the default (3).

func (*Harness) SetActiveProvider

func (h *Harness) SetActiveProvider(pt ProviderType) ProviderType

SetActiveProvider sets the active provider and returns the previous one

type HookResult

type HookResult struct {
	Decision          string `json:"decision"`                    // "allow" or "block"
	Reason            string `json:"reason,omitempty"`            // Why blocked
	Message           string `json:"message,omitempty"`           // Human-readable message
	Fallback          bool   `json:"fallback,omitempty"`          // Used default behavior
	AdditionalContext string `json:"additionalContext,omitempty"` // Context to inject (for PreInference)
}

HookResult represents the result of a hook dispatch

type InferenceConfig

type InferenceConfig struct {
	DefaultProvider string                     `yaml:"default_provider,omitempty"`
	Providers       map[string]*ProviderConfig `yaml:"providers,omitempty"`
}

InferenceConfig represents the inference configuration file. Resolution order: node (~/.cog/etc/) → workspace (.cog/conf/) → env → defaults.

func LoadInferenceConfig

func LoadInferenceConfig(workspaceRoot string) *InferenceConfig

LoadInferenceConfig loads inference configuration with four-tier resolution (a sketch follows the list):

  1. ~/.cog/etc/inference.yaml (node — shared across workspaces)
  2. $workspaceRoot/.cog/conf/inference.yaml (workspace — overrides)
  3. Environment variables (OPENAI_API_KEY, OPENROUTER_API_KEY, etc.)
  4. Compiled defaults (claude, localhost:11434)
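
A sketch exercising the workspace layer (import path illustrative; only default_provider is set, matching the InferenceConfig yaml tags, and the returned config is assumed non-nil):

package main

import (
	"fmt"
	"os"
	"path/filepath"

	"example.com/cogos/harness" // illustrative import path
)

func main() {
	ws, _ := os.MkdirTemp("", "cog-ws")
	confDir := filepath.Join(ws, ".cog", "conf")
	_ = os.MkdirAll(confDir, 0o755)
	_ = os.WriteFile(filepath.Join(confDir, "inference.yaml"),
		[]byte("default_provider: ollama\n"), 0o644)

	cfg := harness.LoadInferenceConfig(ws)
	fmt.Println(cfg.DefaultProvider) // expected: ollama
}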

type InferenceRequest

type InferenceRequest struct {
	ID           string          // Unique request ID (auto-generated if empty)
	Prompt       string          // User prompt
	SystemPrompt string          // Optional system prompt
	Model        string          // Model to use (empty = default)
	Schema       json.RawMessage // Optional JSON schema for structured output
	MaxTokens    *int            // Optional max tokens
	Origin       string          // Where request came from: "cli", "http", "hook", "fleet"
	Stream       bool            // Whether to stream
	Context      context.Context // For cancellation

	// Context pipeline
	ContextState *ContextState // Four-tier context for context-aware invocation

	// Tool definitions
	//
	// Tools is the full set of OpenAI-format tool definitions from the
	// client. The harness classifies each tool as internal or external via
	// ClassifyTool/PartitionTools; internal tools are executed by Claude
	// CLI or the CogOS kernel, external tools are forwarded to the client
	// as `tool_calls` on the response.
	//
	// ExternalTools, when non-nil, is a pre-partitioned list of
	// client-owned tools (equivalent to the `external` return of
	// PartitionTools(Tools)). Callers that have already partitioned can
	// populate this directly; otherwise the harness partitions Tools on
	// its own. This is the plumbing BrowserOS uses to register
	// `browser_*` tools it will execute itself.
	Tools           []json.RawMessage // OpenAI-format tool definitions from client
	ExternalTools   []json.RawMessage // Client-owned tool definitions (subset of Tools)
	AllowedTools    []string          // Claude CLI --allowed-tools patterns (e.g. "Bash", "Bash(git:*)")
	SkipPermissions bool              // Pass --dangerously-skip-permissions to Claude CLI

	// Workspace override — when set, Claude CLI runs in this directory
	// instead of the kernel's workspace.
	WorkspaceRoot string

	// MCP bridge configuration
	MCPConfig     string // Path to generated --mcp-config JSON file
	OpenClawURL   string // OpenClaw gateway URL for bridge proxy
	OpenClawToken string // Auth token for OpenClaw
	SessionID     string // Session context for tool execution

	// Claude CLI session continuity
	// When set, the harness passes --resume <ClaudeSessionID> instead of
	// starting a new session. Claude Code loads the prior conversation from
	// disk and continues where it left off.
	ClaudeSessionID string

	// Retry configuration
	MaxRetries int           // Max retry attempts (0 = use default)
	Timeout    time.Duration // Request timeout (0 = use default)
}

InferenceRequest represents input to the inference engine

type InferenceResponse

type InferenceResponse struct {
	ID               string `json:"id"`
	Content          string `json:"content"`
	PromptTokens     int    `json:"prompt_tokens"`
	CompletionTokens int    `json:"completion_tokens"`
	FinishReason     string `json:"finish_reason"`
	Error            error  `json:"-"`
	ErrorMessage     string `json:"error,omitempty"`

	// Anthropic cache metrics (zero for non-Claude providers)
	CacheReadTokens   int     `json:"cache_read_input_tokens,omitempty"`
	CacheCreateTokens int     `json:"cache_creation_input_tokens,omitempty"`
	CostUSD           float64 `json:"cost_usd,omitempty"`

	// Context metrics (from context pipeline)
	ContextMetrics *ContextMetrics `json:"context_metrics,omitempty"`

	// Error classification (for smart recovery)
	ErrorType ErrorType `json:"error_type,omitempty"`

	// Claude CLI session ID returned by the process. Callers should store
	// this and pass it back as ClaudeSessionID on the next request to
	// enable --resume continuity.
	ClaudeSessionID string `json:"claude_session_id,omitempty"`

	// ToolCalls lists client-owned tool invocations the model asked for
	// but the harness did not execute. The caller (typically an HTTP
	// handler) is expected to surface these as `tool_calls` on the
	// assistant message so a BrowserOS-style client can execute them and
	// send back the result on the next turn. FinishReason is set to
	// "tool_calls" when this slice is non-empty.
	ToolCalls []ToolCallData `json:"tool_calls,omitempty"`
}

InferenceResponse represents output from the inference engine

type KernelServices

type KernelServices interface {
	// WorkspaceRoot returns the resolved workspace root path.
	// Used for MCP config generation and default working directory.
	WorkspaceRoot() string

	// DispatchHook dispatches a lifecycle hook event (e.g., "PreInference",
	// "PostInference"). Returns nil if no hook matched or the hook allowed
	// the action. A "block" decision aborts inference.
	DispatchHook(event string, data map[string]any) *HookResult

	// EmitEvent writes a timestamped event to .cog/run/events/.
	// Event types: INFERENCE_START, INFERENCE_COMPLETE, INFERENCE_ERROR.
	EmitEvent(eventType string, data map[string]any) error

	// ReadContinuationState reads .cog/run/continuation.json for eigenfield
	// persistence across context compaction. Returns (nil, err) if no state.
	ReadContinuationState() (*ContinuationState, error)

	// DepositSignal places a signal in the signal field at the given location.
	// Used to mark inference as active (location="inference", type="active").
	DepositSignal(location, signalType, agentID string, halfLife float64, meta map[string]any) error

	// RemoveSignal removes a signal. Used to clear the inference-active signal
	// when a request completes.
	RemoveSignal(location, signalType string) error

	// ResolveWorkDir determines the working directory for Claude CLI execution.
	// Priority: requestWorkspace → DEFAULT_CLIENT_WORKSPACE env → kernel workspace.
	ResolveWorkDir(requestWorkspace string) string

	// ConvertOpenAIToolsToMCP converts OpenAI-format tool definitions to MCP
	// format for the bridge subprocess. Delegates to mcp.go in the kernel.
	ConvertOpenAIToolsToMCP(tools []json.RawMessage) []MCPTool

	// GetAgentToolPolicy returns the tool policy for an agent from its CRD.
	// Returns nil if no CRD is found (backward-compatible — no restriction).
	// Used by the inference path to enforce agent-specific tool restrictions.
	GetAgentToolPolicy(agentID string) (*AgentToolPolicy, error)
}

KernelServices is the single interface the harness uses to call back into the kernel. The kernel implements this with kernelServicesAdapter (kernel_harness.go).

Methods are grouped by purpose:

  • Workspace: WorkspaceRoot, ResolveWorkDir
  • Lifecycle: DispatchHook (PreInference/PostInference)
  • Observability: EmitEvent (INFERENCE_START/COMPLETE/ERROR)
  • Signal field: DepositSignal, RemoveSignal
  • State: ReadContinuationState (eigenfield persistence)
  • Tools: ConvertOpenAIToolsToMCP (bridge config generation)
  • Agents: GetAgentToolPolicy (CRD-based tool policy enforcement)

type MCPTool

type MCPTool struct {
	Name        string                 `json:"name"`
	Description string                 `json:"description,omitempty"`
	InputSchema map[string]interface{} `json:"inputSchema,omitempty"`
}

MCPTool represents a tool in MCP format (mirrors kernel's definition)

type ModelInfo

type ModelInfo struct {
	ID      string `json:"id"`
	Object  string `json:"object"`
	Created int64  `json:"created"`
	OwnedBy string `json:"owned_by"`
}

ModelInfo represents a single model entry

type ModelListResponse

type ModelListResponse struct {
	Object string      `json:"object"`
	Data   []ModelInfo `json:"data"`
}

ModelListResponse represents the /v1/models response

type OpenAIChatMessage

type OpenAIChatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

OpenAIChatMessage is a single message in the chat

type OpenAIChatRequest

type OpenAIChatRequest struct {
	Model         string              `json:"model"`
	Messages      []OpenAIChatMessage `json:"messages"`
	MaxTokens     *int                `json:"max_tokens,omitempty"`
	Temperature   *float64            `json:"temperature,omitempty"`
	Stream        bool                `json:"stream"`
	StreamOptions *StreamOptions      `json:"stream_options,omitempty"`
}

OpenAIChatRequest is the request format for OpenAI-compatible APIs

type OpenAIChatResponse

type OpenAIChatResponse struct {
	ID      string `json:"id"`
	Object  string `json:"object"`
	Created int64  `json:"created"`
	Model   string `json:"model"`
	Choices []struct {
		Index   int `json:"index"`
		Message struct {
			Role    string `json:"role"`
			Content string `json:"content"`
		} `json:"message"`
		FinishReason string `json:"finish_reason"`
	} `json:"choices"`
	Usage struct {
		PromptTokens     int `json:"prompt_tokens"`
		CompletionTokens int `json:"completion_tokens"`
		TotalTokens      int `json:"total_tokens"`
	} `json:"usage"`
}

OpenAIChatResponse is the response format for OpenAI-compatible APIs

type OpenAIStreamChunk

type OpenAIStreamChunk struct {
	ID      string `json:"id"`
	Object  string `json:"object"`
	Created int64  `json:"created"`
	Model   string `json:"model"`
	Choices []struct {
		Index int `json:"index"`
		Delta struct {
			Role    string `json:"role,omitempty"`
			Content string `json:"content,omitempty"`
		} `json:"delta"`
		FinishReason string `json:"finish_reason,omitempty"`
	} `json:"choices"`
	Usage *OpenAIUsage `json:"usage,omitempty"`
}

OpenAIStreamChunk is a single chunk in a streaming response

type OpenAIUsage

type OpenAIUsage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

OpenAIUsage represents the usage object in an OpenAI streaming chunk.

type ProviderConfig

type ProviderConfig struct {
	Type    ProviderType `json:"type"`
	BaseURL string       `json:"base_url"`
	APIKey  string       `json:"api_key"`
	Model   string       `json:"model"` // Default model for this provider
}

ProviderConfig holds configuration for an inference provider

type ProviderHealth

type ProviderHealth struct {
	LastCheck *string `json:"last_check"` // ISO8601 timestamp or null
	LatencyMs *int    `json:"latency_ms"` // Latency in ms or null
	Error     *string `json:"error"`      // Error message or null
}

ProviderHealth represents the health status of a provider

type ProviderInfo

type ProviderInfo struct {
	ID     string               `json:"id"`
	Name   string               `json:"name"`
	Status string               `json:"status"` // "online", "offline", "unknown", "degraded"
	Active bool                 `json:"active"`
	Models []string             `json:"models"`
	Config ProviderPublicConfig `json:"config"`
	Health ProviderHealth       `json:"health"`
}

ProviderInfo represents a single provider in the API response

type ProviderListResponse

type ProviderListResponse struct {
	Object        string         `json:"object"`
	Data          []ProviderInfo `json:"data"`
	Active        string         `json:"active"`
	FallbackChain []string       `json:"fallback_chain"`
}

ProviderListResponse represents the /v1/providers response

type ProviderPublicConfig

type ProviderPublicConfig struct {
	BaseURL   string `json:"base_url"`
	HasAPIKey bool   `json:"has_api_key"`
}

ProviderPublicConfig represents publicly-visible provider configuration

type ProviderType

type ProviderType string

ProviderType identifies the inference provider.

const (
	ProviderClaude     ProviderType = "claude"     // Claude CLI (default)
	ProviderCodex      ProviderType = "codex"      // Codex CLI
	ProviderOpenAI     ProviderType = "openai"     // OpenAI API
	ProviderOpenRouter ProviderType = "openrouter" // OpenRouter API
	ProviderOllama     ProviderType = "ollama"     // Ollama (local)
	ProviderLocal      ProviderType = "local"      // Local kernel endpoint (self-reference)
	ProviderCustom     ProviderType = "custom"     // Any OpenAI-compatible endpoint
)

type RequestEntry

type RequestEntry struct {
	ID      string             `json:"id"`
	Origin  string             `json:"origin"`
	Model   string             `json:"model"`
	Started time.Time          `json:"started"`
	Status  string             `json:"status"` // "running", "completed", "cancelled", "failed"
	Cancel  context.CancelFunc `json:"-"`
	Prompt  string             `json:"prompt,omitempty"` // First 100 chars for display
}

RequestEntry represents a tracked request in the registry.

type RequestRegistry

type RequestRegistry struct {
	// contains filtered or unexported fields
}

RequestRegistry tracks in-flight inference requests

func NewRequestRegistry

func NewRequestRegistry() *RequestRegistry

NewRequestRegistry creates a new request registry

func (*RequestRegistry) Cancel

func (r *RequestRegistry) Cancel(id string) bool

Cancel cancels a request by ID, returning true if it was found and cancelled

func (*RequestRegistry) Cleanup

func (r *RequestRegistry) Cleanup(maxAge time.Duration) int

Cleanup removes completed/failed/cancelled requests older than maxAge

func (*RequestRegistry) Complete

func (r *RequestRegistry) Complete(id string, status string)

Complete marks a request as finished with the given status

func (*RequestRegistry) Get

func (r *RequestRegistry) Get(id string) *RequestEntry

Get retrieves a request entry by ID (returns a copy to prevent data races)

func (*RequestRegistry) List

func (r *RequestRegistry) List() []RequestEntry

List returns all request entries (copies to prevent data races)

func (*RequestRegistry) ListRunning

func (r *RequestRegistry) ListRunning() []RequestEntry

ListRunning returns only running request entries (copies to prevent data races)

func (*RequestRegistry) Register

func (r *RequestRegistry) Register(req *InferenceRequest, cancel context.CancelFunc) *RequestEntry

Register adds a new request to the registry

func (*RequestRegistry) Remove

func (r *RequestRegistry) Remove(id string)

Remove removes a request from the registry

type ResponseFormat

type ResponseFormat struct {
	Type       string          `json:"type"`
	JSONSchema json.RawMessage `json:"json_schema,omitempty"`
}

ResponseFormat represents the response_format field in a chat completion request

type SessionInfo

type SessionInfo struct {
	SessionID       string   `json:"session_id"`
	Model           string   `json:"model"`
	Tools           []string `json:"tools,omitempty"`
	ClaudeSessionID string   `json:"claude_session_id,omitempty"` // Claude CLI session for --resume
}

SessionInfo represents session metadata in streaming

type StreamChunkInference

type StreamChunkInference struct {
	ID           string `json:"id"`
	Content      string `json:"content"`
	Done         bool   `json:"done"`
	FinishReason string `json:"finish_reason,omitempty"`
	Error        error  `json:"-"`

	// Rich streaming fields
	EventType   string          `json:"event_type,omitempty"`   // text, tool_use, tool_result
	ToolCall    *ToolCallData   `json:"tool_call,omitempty"`    // Tool call information
	ToolResult  *ToolResultData `json:"tool_result,omitempty"`  // Tool result information
	Usage       *UsageData      `json:"usage,omitempty"`        // Token usage data
	SessionInfo *SessionInfo    `json:"session_info,omitempty"` // Session metadata

	// ExternalToolCalls carries client-owned tool invocations the harness
	// captured from the model's stream but did NOT execute. Populated on
	// the final chunk (Done=true) when the stream contained external tool
	// uses — e.g. BrowserOS's `browser_*` tools. The HTTP layer should
	// emit these as OpenAI-format `tool_calls` delta events and set
	// finish_reason="tool_calls" before terminating the stream.
	ExternalToolCalls []ToolCallData `json:"external_tool_calls,omitempty"`
}

StreamChunkInference represents a single chunk in a streaming response.

type StreamOptions

type StreamOptions struct {
	IncludeUsage bool `json:"include_usage"`
}

StreamOptions controls streaming behavior (OpenAI extension). Setting IncludeUsage causes the final chunk to contain token usage.

type ToolCallData

type ToolCallData struct {
	ID        string          `json:"id"`
	Name      string          `json:"name"`
	Arguments json.RawMessage `json:"arguments"`
}

ToolCallData represents a tool call in streaming

type ToolOwnership

type ToolOwnership int

ToolOwnership describes who is responsible for executing a tool call.

const (
	// ToolInternal indicates the CogOS kernel or Claude CLI subprocess will
	// execute this tool directly (filesystem, shell, etc).
	ToolInternal ToolOwnership = iota

	// ToolExternal indicates the calling client (e.g. BrowserOS) owns
	// execution. The harness returns `tool_calls` to the client and expects
	// the client to send back a `role: "tool"` message with the result on
	// the next turn.
	ToolExternal
)

func ClassifyTool

func ClassifyTool(name string) ToolOwnership

ClassifyTool reports whether a tool name is internal (executed by CogOS / Claude CLI) or external (executed by the client). Matching is case-insensitive and name-only — schema is irrelevant.

func (ToolOwnership) String

func (o ToolOwnership) String() string

String returns a human-readable ownership label, mainly for logs/telemetry.

type ToolResultData

type ToolResultData struct {
	ToolCallID string `json:"tool_call_id"`
	Content    string `json:"content"`
	IsError    bool   `json:"is_error"`
}

ToolResultData represents a tool result in streaming

type ToolUseResultEx

type ToolUseResultEx struct {
	Stdout      string `json:"stdout,omitempty"`
	Stderr      string `json:"stderr,omitempty"`
	Interrupted bool   `json:"interrupted,omitempty"`
	IsImage     bool   `json:"isImage,omitempty"`
}

ToolUseResultEx contains extended tool result info from Claude CLI

type UsageData

type UsageData struct {
	InputTokens       int     `json:"input_tokens"`
	OutputTokens      int     `json:"output_tokens"`
	CacheReadTokens   int     `json:"cache_read_tokens,omitempty"`
	CacheCreateTokens int     `json:"cache_create_tokens,omitempty"`
	CostUSD           float64 `json:"cost_usd,omitempty"`
}

UsageData represents token usage in streaming
