harnesses

package
v0.10.9 Latest
Warning

This package is not in the latest version of its module.

Published: May 6, 2026 | License: MIT | Imports: 14 | Imported by: 0

Documentation

Constants

const (
	UsageSourceNativeStream     = "native_stream"
	UsageSourceNativeTokenCount = "native_token_count" // #nosec G101 -- usage source identifier, not a credential.
	UsageSourceTranscript       = "transcript"
	UsageSourceStatusOutput     = "status_output"
	UsageSourceFallback         = "fallback"

	UsageWarningMalformed    = "usage_malformed"
	UsageWarningDisagreement = "usage_source_disagreement"
)

Variables

var PreferenceOrder = []string{"codex", "claude", "opencode", "agent", "pi", "openrouter", "lmstudio", "omlx", "lucebox", "vllm", "gemini"}

PreferenceOrder defines the default harness preference when multiple are available.

Functions

func AdapterReasoningValue

func AdapterReasoningValue(req ExecuteRequest) string

AdapterReasoningValue resolves the public reasoning scalar into the value subprocess harnesses should pass to their native CLI flag. Empty, auto, off, and numeric 0 intentionally emit no flag.
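
The rule above can be sketched as follows. This is a minimal illustration, not the package's implementation: reasoningFlagValue is a hypothetical stand-in that takes the Reasoning string directly, and the trimming and lowercasing are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// reasoningFlagValue is a hypothetical stand-in for AdapterReasoningValue.
// It returns "" for the values the documentation says emit no flag.
// Trimming and lowercasing are assumptions made for this sketch.
func reasoningFlagValue(reasoning string) string {
	v := strings.ToLower(strings.TrimSpace(reasoning))
	switch v {
	case "", "auto", "off", "0":
		return "" // caller should skip the CLI flag entirely
	}
	return v
}

func main() {
	for _, in := range []string{"", "auto", "off", "0", "high", "4096"} {
		fmt.Printf("%q -> %q\n", in, reasoningFlagValue(in))
	}
}
```

A caller would only append the harness's reasoning flag when the returned value is non-empty.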

func BoolPtr

func BoolPtr(v bool) *bool

func HarnessCommand

func HarnessCommand(ctx context.Context, binary string, args ...string) *exec.Cmd

HarnessCommand constructs an *exec.Cmd for a known harness binary.

binary must be a path resolved by the runner from a HarnessConfig.Binary (looked up via LookPathFunc / exec.LookPath against a fixed allowlist of builtin harness names: "codex", "claude", "gemini", "opencode", "pi", ...). args are the harness-specific argument vector assembled from the runner's HarnessConfig + per-request fields.

This helper exists to localize the gosec G204 (subprocess launched with variable) safety contract in one place rather than annotating each caller.
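
A helper of this shape might look like the sketch below. It assumes the wrapper is a thin layer over exec.CommandContext; newHarnessCmd, the binary path, and the flag are hypothetical and chosen only for illustration.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
)

// newHarnessCmd is a hypothetical stand-in for HarnessCommand. The caller is
// expected to pass a binary path that was already resolved from an allowlist.
func newHarnessCmd(ctx context.Context, binary string, args ...string) *exec.Cmd {
	// #nosec G204 -- binary comes from an allowlisted lookup in this sketch.
	return exec.CommandContext(ctx, binary, args...)
}

// exampleArgs builds a command without running it and returns its argv.
// The path and flag are placeholders, not real harness arguments.
func exampleArgs() []string {
	cmd := newHarnessCmd(context.Background(), "/opt/bin/example-harness", "--hypothetical-flag", "value")
	return cmd.Args
}

func main() {
	fmt.Println(exampleArgs())
}
```

Keeping the single exec.CommandContext call in one function means the gosec suppression and its justification live in one place.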

func IntPtr

func IntPtr(v int) *int

func MirrorEvents added in v0.10.8

func MirrorEvents(dst chan<- Event, log io.Writer, ctx context.Context) (chan Event, <-chan struct{})

func OpenProgressLog added in v0.10.8

func OpenProgressLog(sessionLogDir, sessionID, prefix string) (*os.File, error)

func QuotaStateFromUsedPercent

func QuotaStateFromUsedPercent(usedPercent int) string

QuotaStateFromUsedPercent maps a usage percentage to a quota state string.

func ResolveFinalUsage

func ResolveFinalUsage(candidates []UsageCandidate) (*FinalUsage, []FinalWarning)

ResolveFinalUsage applies the documented source precedence: native_stream > transcript > status_output > fallback. It returns nil usage when no source reported a token count, while still returning warnings for malformed sources or source disagreements.
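
The precedence rule can be illustrated with a simplified sketch. The candidate type and pickUsage function are hypothetical, only a single total is compared, and how the real resolver detects disagreement is an assumption here.

```go
package main

import "fmt"

// candidate is a simplified, hypothetical stand-in for UsageCandidate.
type candidate struct {
	Source string
	Total  *int // nil means the source reported no token count
}

// precedence follows the documented order, highest first.
var precedence = []string{"native_stream", "transcript", "status_output", "fallback"}

// pickUsage returns the highest-precedence candidate that reported a count,
// or nil when none did. The bool reports whether a lower-precedence source
// reported a different total (an assumed definition of "disagreement").
func pickUsage(cands []candidate) (*candidate, bool) {
	var winner *candidate
	disagree := false
	for _, src := range precedence {
		for i := range cands {
			c := &cands[i]
			if c.Source != src || c.Total == nil {
				continue
			}
			if winner == nil {
				winner = c
			} else if *c.Total != *winner.Total {
				disagree = true
			}
		}
	}
	return winner, disagree
}

func intPtr(v int) *int { return &v }

func main() {
	w, disagree := pickUsage([]candidate{
		{Source: "fallback", Total: intPtr(90)},
		{Source: "transcript", Total: intPtr(120)},
	})
	fmt.Println(w.Source, *w.Total, disagree) // transcript 120 true
}
```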

func ResolveHarnessAlias

func ResolveHarnessAlias(name string) string

ResolveHarnessAlias returns the canonical harness name for an alias, or the input unchanged if it is not an alias.
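
The lookup-or-passthrough behavior can be sketched as below. The alias table is hypothetical; the package's actual aliases are not listed on this page.

```go
package main

import "fmt"

// aliases is a hypothetical table used only for this sketch.
var aliases = map[string]string{
	"example-alias": "claude",
}

// resolveAlias returns the canonical name for a known alias and returns
// any other input unchanged.
func resolveAlias(name string) string {
	if canonical, ok := aliases[name]; ok {
		return canonical
	}
	return name
}

func main() {
	fmt.Println(resolveAlias("example-alias"), resolveAlias("codex")) // claude codex
}
```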

func WriteProgressEvent added in v0.10.8

func WriteProgressEvent(log io.Writer, ev Event)

Types

type AccountInfo

type AccountInfo struct {
	Email    string `json:"email,omitempty"`
	PlanType string `json:"plan_type,omitempty"`
	OrgName  string `json:"org_name,omitempty"`
}

AccountInfo captures provider account metadata from local auth files.

type ClaudeQuotaRoutingDecision

type ClaudeQuotaRoutingDecision struct {
	Fresh        bool   `json:"fresh"`
	PreferClaude bool   `json:"prefer_claude"`
	Reason       string `json:"reason,omitempty"`
}

ClaudeQuotaRoutingDecision is the result of evaluating the durable Claude quota cache for foreground routing decisions.

type Event

type Event struct {
	Type     EventType         `json:"type"`
	Sequence int64             `json:"sequence"`
	Time     time.Time         `json:"time"`
	Metadata map[string]string `json:"metadata,omitempty"`
	Data     json.RawMessage   `json:"data"`
}

Event is the structured event a harness emits during Execute. It mirrors the shape defined in CONTRACT-003 §"Event JSON shapes". The Data field is a JSON-encoded payload whose schema is determined by Type.
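
Because Data stays a json.RawMessage until Type is known, a consumer typically decodes in two steps. The sketch below uses local copies of the relevant shapes (event, textDelta, toolCall) rather than the package's types, and describe is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-ins for Event, TextDeltaData, and ToolCallData.
type event struct {
	Type string          `json:"type"`
	Data json.RawMessage `json:"data"`
}

type textDelta struct {
	Text string `json:"text"`
}

type toolCall struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// describe decodes the envelope first, then decodes Data according to Type.
func describe(raw string) (string, error) {
	var ev event
	if err := json.Unmarshal([]byte(raw), &ev); err != nil {
		return "", err
	}
	switch ev.Type {
	case "text_delta":
		var d textDelta
		if err := json.Unmarshal(ev.Data, &d); err != nil {
			return "", err
		}
		return "text: " + d.Text, nil
	case "tool_call":
		var d toolCall
		if err := json.Unmarshal(ev.Data, &d); err != nil {
			return "", err
		}
		return "tool: " + d.Name, nil
	default:
		return "unhandled: " + ev.Type, nil
	}
}

func main() {
	s, err := describe(`{"type":"text_delta","sequence":1,"data":{"text":"hi"}}`)
	fmt.Println(s, err) // text: hi <nil>
}
```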

type EventType

type EventType string

EventType identifies the kind of event a harness emits during execution.

The set is the closed union defined by CONTRACT-003 ("Event JSON shapes"): every backend (native + subprocess) emits these identically so the agent loop can multiplex them onto a single channel.

const (
	EventTypeTextDelta       EventType = "text_delta"
	EventTypeToolCall        EventType = "tool_call"
	EventTypeToolResult      EventType = "tool_result"
	EventTypeCompaction      EventType = "compaction"
	EventTypeProgress        EventType = "progress"
	EventTypeRoutingDecision EventType = "routing_decision"
	EventTypeStall           EventType = "stall"
	EventTypeFinal           EventType = "final"
)

type ExecuteRequest

type ExecuteRequest struct {
	// Prompt is the resolved user prompt sent to the model.
	Prompt string

	// SystemPrompt is the resolved system prompt; empty means harness default.
	SystemPrompt string

	// Provider is the resolved provider identifier when applicable. May be
	// empty for harnesses that have no provider concept (e.g. claude CLI).
	Provider string

	// Model is the resolved model identifier; empty means harness default.
	Model string

	// WorkDir is the working directory for tool operations. Required when
	// the chosen harness uses tools.
	WorkDir string

	// Permissions is "safe" | "supervised" | "unrestricted". Empty defaults to "safe".
	Permissions string

	// Temperature is the model sampling temperature requested by the caller.
	// Harness adapters may ignore it when their CLI has no equivalent control.
	Temperature float32

	// Seed is the requested sampling seed. Zero means unset/provider chooses.
	// Harness adapters may ignore it when their CLI has no equivalent control.
	Seed int64

	// Reasoning is the normalized public reasoning scalar. Empty/off means no
	// adapter flag should be emitted.
	Reasoning string

	// Timeout is the wall-clock cap for the entire request. 0 disables.
	Timeout time.Duration

	// IdleTimeout is the streaming-quiet cap. 0 uses harness default.
	IdleTimeout time.Duration

	// SessionLogDir overrides the per-run session-log directory; harness
	// uses this to direct progress traces into a per-bundle evidence dir.
	SessionLogDir string

	// SessionID is a stable identifier for the run, used in progress log
	// filenames and event metadata. Empty means the harness generates one.
	SessionID string

	// Metadata is echoed back into Event.Metadata (e.g. bead_id, attempt_id).
	Metadata map[string]string
}

ExecuteRequest is the internal request carried into Harness.Execute. It is intentionally narrower than the public ExecuteRequest in CONTRACT-003: the agent's routing layer is expected to resolve provider/model/reasoning/permissions/timeouts before invoking a harness, so the harness sees a concrete, ready-to-run request.

type FinalData

type FinalData struct {
	Status         string            `json:"status"` // success|iteration_limit|failed|stalled|timed_out|cancelled
	ExitCode       int               `json:"exit_code"`
	Error          string            `json:"error,omitempty"`
	FinalText      string            `json:"final_text,omitempty"`
	DurationMS     int64             `json:"duration_ms"`
	Usage          *FinalUsage       `json:"usage,omitempty"`
	Warnings       []FinalWarning    `json:"warnings,omitempty"`
	CostUSD        float64           `json:"cost_usd,omitempty"`
	SessionLogPath string            `json:"session_log_path,omitempty"`
	RoutingActual  *RoutingActual    `json:"routing_actual,omitempty"`
	Extra          map[string]string `json:"-"`
}

FinalData is the payload for type=final events.

type FinalUsage

type FinalUsage struct {
	InputTokens      *int                  `json:"input_tokens,omitempty"`
	OutputTokens     *int                  `json:"output_tokens,omitempty"`
	CacheReadTokens  *int                  `json:"cache_read_tokens,omitempty"`
	CacheWriteTokens *int                  `json:"cache_write_tokens,omitempty"`
	CacheTokens      *int                  `json:"cache_tokens,omitempty"`
	ReasoningTokens  *int                  `json:"reasoning_tokens,omitempty"`
	TotalTokens      *int                  `json:"total_tokens,omitempty"`
	Source           string                `json:"source,omitempty"`
	Fresh            *bool                 `json:"fresh,omitempty"`
	CapturedAt       string                `json:"captured_at,omitempty"`
	Sources          []UsageSourceEvidence `json:"sources,omitempty"`
}

FinalUsage carries token totals on a final event. Count fields are pointers so unavailable token dimensions are omitted instead of serialized as zero. A present pointer to 0 means the harness explicitly reported zero usage.
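
The difference between an absent dimension and an explicit zero shows up directly in the encoded JSON. The sketch below uses a two-field local struct (usage) and a local intPtr helper as stand-ins; it relies only on standard encoding/json behavior for pointer fields with omitempty.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usage is a cut-down stand-in for FinalUsage.
type usage struct {
	InputTokens  *int `json:"input_tokens,omitempty"`
	OutputTokens *int `json:"output_tokens,omitempty"`
}

func intPtr(v int) *int { return &v }

// encode marshals u and returns the JSON text, or "" on error.
func encode(u usage) string {
	b, err := json.Marshal(u)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encode(usage{}))                        // {}
	fmt.Println(encode(usage{OutputTokens: intPtr(0)})) // {"output_tokens":0}
}
```

A nil pointer is dropped by omitempty, while a non-nil pointer to 0 is kept, which is what lets consumers tell "unknown" apart from "reported zero".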

type FinalWarning

type FinalWarning struct {
	Code    string                `json:"code"`
	Message string                `json:"message,omitempty"`
	Sources []UsageSourceEvidence `json:"sources,omitempty"`
}

FinalWarning is normalized metadata about non-fatal final-event issues.

type Harness

type Harness interface {
	// Info returns identity + capability metadata for this harness.
	Info() HarnessInfo

	// HealthCheck triggers a fresh probe (binary present, auth ok, etc.)
	// and returns nil if the harness is ready to execute.
	HealthCheck(ctx context.Context) error

	// Execute runs one resolved request. Events stream on the returned
	// channel; a single final event closes the stream. The first error
	// return is reserved for setup failures (binary missing, etc.) — once
	// the channel is returned, all per-run failures are reported via a
	// final event with Status != "success".
	Execute(ctx context.Context, req ExecuteRequest) (<-chan Event, error)
}

Harness is the internal contract every harness implementation in internal/harnesses/<name> satisfies. It is the minimal surface the agent dispatcher needs to route a resolved request into a backend.

A Harness is responsible for emitting events on the returned channel until execution completes; the channel MUST be closed after the final event so downstream consumers can detect end-of-stream. The final event is always of type EventTypeFinal.
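
The channel contract suggests a simple consumer loop: range over the channel until it closes and treat the final event as the outcome. The sketch below uses a local event type and a hypothetical fakeExecute producer in place of a real harness.

```go
package main

import "fmt"

// event is a cut-down stand-in for Event; Text carries either a delta or,
// for the final event, the status string.
type event struct {
	Type string
	Text string
}

// fakeExecute is a hypothetical harness: it streams events, emits one final
// event, and closes the channel afterwards.
func fakeExecute() <-chan event {
	ch := make(chan event, 3)
	go func() {
		defer close(ch)
		ch <- event{Type: "text_delta", Text: "hel"}
		ch <- event{Type: "text_delta", Text: "lo"}
		ch <- event{Type: "final", Text: "success"}
	}()
	return ch
}

// drain reads until the channel is closed, collecting text deltas and the
// status carried by the final event.
func drain(ch <-chan event) (text, status string) {
	for ev := range ch {
		switch ev.Type {
		case "text_delta":
			text += ev.Text
		case "final":
			status = ev.Text
		}
	}
	return text, status
}

func main() {
	text, status := drain(fakeExecute())
	fmt.Println(text, status) // hello success
}
```

Because the producer closes the channel after the final event, the range loop terminates without any extra signalling.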

type HarnessConfig

type HarnessConfig struct {
	Name                string              // e.g. "codex", "claude", "gemini"
	Binary              string              // binary name to exec
	BaseArgs            []string            // args always included regardless of permission level
	PermissionArgs      map[string][]string // extra args keyed by permission level: "safe", "supervised", "unrestricted"
	PromptMode          string              // "arg" (final arg), "stdin" (pipe)
	DefaultModel        string              // built-in model choice when no config override exists
	Models              []string            // known valid models for this harness
	ReasoningLevels     []string            // supported reasoning levels in preference order
	MaxReasoningTokens  int                 // numeric reasoning budget max; 0 = unsupported/unknown
	ModelFlag           string              // flag for model override (e.g. "-m", "--model"), empty if unsupported
	WorkDirFlag         string              // flag for working directory (e.g. "-C", "--cwd"), empty if unsupported
	ReasoningFlag       string              // adapter flag for reasoning control, empty if unsupported
	ReasoningFormat     string              // format string for adapter reasoning value, empty = use value directly
	TokenPattern        string              // regex to extract token count from output, must have one capture group
	Surface             string              // catalog surface identifier: "codex", "claude", "embedded-openai", "embedded-anthropic"
	CostClass           string              // local, cheap, medium, expensive
	IsLocal             bool                // true for embedded/local harnesses (no cloud cost)
	ExactPinSupport     bool                // true if harness can accept an exact concrete model pin
	QuotaCommand        string              // CLI args for non-interactive quota introspection; empty = skip probe
	TUIQuotaCommand     string              // Slash command to send as a prompt when native quota signal is unavailable
	IsHTTPProvider      bool                // true for API-only providers (openrouter, lmstudio) that have no CLI binary
	IsSubscription      bool                // true for fixed-subscription harnesses (codex, claude)
	AutoRoutingEligible bool                // true when this harness has full coverage and may be selected by unattended profile routing
	TestOnly            bool                // true for sentinel/test harnesses that must never be selected by production tier routing
}

HarnessConfig defines a known agent harness's invocation metadata. This is a configuration struct (not an interface) that captures binary, args, flags, and capability metadata for each builtin harness.
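
One way these fields could combine into an argument vector is sketched below. The ordering (base args, then permission args, then model flag, then prompt) is an assumption, as are the config subset, buildArgs, and the example flag values other than "--model".

```go
package main

import "fmt"

// config is a cut-down stand-in for HarnessConfig.
type config struct {
	BaseArgs       []string
	PermissionArgs map[string][]string
	ModelFlag      string
	PromptMode     string
}

// buildArgs assembles an argv tail from the config and per-request values.
// The ordering is an assumption made for this sketch.
func buildArgs(cfg config, permissions, model, prompt string) []string {
	if permissions == "" {
		permissions = "safe" // documented default for ExecuteRequest.Permissions
	}
	args := append([]string{}, cfg.BaseArgs...) // copy so cfg is not mutated
	args = append(args, cfg.PermissionArgs[permissions]...)
	if cfg.ModelFlag != "" && model != "" {
		args = append(args, cfg.ModelFlag, model)
	}
	if cfg.PromptMode == "arg" {
		args = append(args, prompt) // "arg" mode passes the prompt as the final arg
	}
	return args
}

func main() {
	cfg := config{
		BaseArgs:       []string{"--base"},                            // hypothetical flag
		PermissionArgs: map[string][]string{"safe": {"--read-only"}}, // hypothetical flag
		ModelFlag:      "--model",
		PromptMode:     "arg",
	}
	fmt.Println(buildArgs(cfg, "", "m1", "hello")) // [--base --read-only --model m1 hello]
}
```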

type HarnessInfo

type HarnessInfo struct {
	Name                 string
	Type                 string // "native" | "subprocess"
	Available            bool
	Path                 string
	Error                string
	IsLocal              bool
	IsSubscription       bool
	AutoRoutingEligible  bool
	ExactPinSupport      bool
	DefaultModel         string
	SupportedPermissions []string
	SupportedReasoning   []string
	CostClass            string
}

HarnessInfo describes a registered harness. Mirrors the public HarnessInfo type defined in CONTRACT-003. Internal callers use this to implement the public ListHarnesses surface without re-declaring the shape.

type HarnessState

type HarnessState struct {
	Installed           bool                        `json:"installed"`
	Reachable           bool                        `json:"reachable"`
	Authenticated       bool                        `json:"authenticated"`
	QuotaOK             bool                        `json:"quota_ok"`
	QuotaState          string                      `json:"quota_state,omitempty"` // ok, blocked, unknown
	Degraded            bool                        `json:"degraded"`
	PolicyOK            bool                        `json:"policy_ok"`
	LastCheckedUnix     int64                       `json:"last_checked_unix,omitempty"`
	Error               string                      `json:"error,omitempty"`
	Quota               *QuotaInfo                  `json:"quota,omitempty"`
	ClaudeQuotaDecision *ClaudeQuotaRoutingDecision `json:"claude_quota_decision,omitempty"`
}

HarnessState captures the runtime routing-relevant state of a harness.

type HarnessStatus

type HarnessStatus struct {
	Name      string `json:"name"`
	Available bool   `json:"available"`
	Binary    string `json:"binary"`
	Path      string `json:"path,omitempty"` // resolved binary path
	Error     string `json:"error,omitempty"`
}

HarnessStatus reports availability of a harness.

type LookPathFunc

type LookPathFunc func(file string) (string, error)

LookPathFunc abstracts binary discovery for testability.

var DefaultLookPath LookPathFunc = exec.LookPath

DefaultLookPath is the production implementation.
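
In tests, a function of this type can be swapped for a stub so discovery does not depend on what is installed on the machine. The sketch below declares a local lookPathFunc with the same signature; fakeLookPath and the paths are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// lookPathFunc mirrors the LookPathFunc signature.
type lookPathFunc func(file string) (string, error)

// fakeLookPath returns a lookPathFunc that only "finds" the listed binaries.
func fakeLookPath(installed map[string]string) lookPathFunc {
	return func(file string) (string, error) {
		if p, ok := installed[file]; ok {
			return p, nil
		}
		return "", errors.New("not found: " + file)
	}
}

func main() {
	look := fakeLookPath(map[string]string{"claude": "/opt/bin/claude"})

	path, err := look("claude")
	fmt.Println(path, err) // /opt/bin/claude <nil>

	_, err = look("codex")
	fmt.Println(err) // not found: codex
}
```

Since Registry exposes LookPath as a field, a test could presumably assign a stub like this before calling Discover.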

type ModelDiscoverySnapshot

type ModelDiscoverySnapshot struct {
	CapturedAt      time.Time `json:"captured_at"`
	Models          []string  `json:"models,omitempty"`
	ReasoningLevels []string  `json:"reasoning_levels,omitempty"`
	Source          string    `json:"source"`
	FreshnessWindow string    `json:"freshness_window,omitempty"`
	Detail          string    `json:"detail,omitempty"`
}

ModelDiscoverySnapshot captures model and reasoning capability evidence for harnesses whose source of truth is a CLI/TUI surface instead of /v1/models.

type QuotaInfo

type QuotaInfo struct {
	PercentUsed int    `json:"percent_used"`
	LimitWindow string `json:"limit_window,omitempty"` // e.g. "5h", "7 day"
	ResetDate   string `json:"reset_date,omitempty"`   // e.g. "April 12"
}

QuotaInfo holds parsed quota data from CLI introspection.

func ParseQuotaOutput

func ParseQuotaOutput(output string) *QuotaInfo

ParseQuotaOutput parses the text output of a harness quota command. It extracts percent_used, limit_window, and reset_date. Returns nil if no quota data is found.

type QuotaWindow

type QuotaWindow struct {
	Name          string  `json:"name"`               // e.g. "5h", "7d", "spark"
	LimitID       string  `json:"limit_id,omitempty"` // provider limit_id
	LimitName     string  `json:"limit_name,omitempty"`
	WindowMinutes int     `json:"window_minutes"`
	UsedPercent   float64 `json:"used_percent"`
	ResetsAt      string  `json:"resets_at,omitempty"`      // human-readable
	ResetsAtUnix  int64   `json:"resets_at_unix,omitempty"` // unix timestamp
	State         string  `json:"state"`
}

QuotaWindow captures one quota window (e.g. 5h, weekly, model-specific).

type Registry

type Registry struct {
	LookPath LookPathFunc
	// contains filtered or unexported fields
}

Registry manages known harnesses.

func NewRegistry

func NewRegistry() *Registry

NewRegistry creates a registry with builtin harnesses.

func (*Registry) Discover

func (r *Registry) Discover() []HarnessStatus

Discover checks which harnesses are available on the system.

func (*Registry) FirstAvailable

func (r *Registry) FirstAvailable() (string, bool)

FirstAvailable returns the first available harness in preference order.
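
Selection by preference order can be sketched as a first-match scan. This simplification assumes the binary name equals the harness name and ignores harnesses without a CLI binary; firstAvailable and installedOnly are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("executable not found")

// installedOnly returns a lookup function that succeeds only for names.
func installedOnly(names ...string) func(string) (string, error) {
	set := make(map[string]bool, len(names))
	for _, n := range names {
		set[n] = true
	}
	return func(file string) (string, error) {
		if set[file] {
			return "/opt/bin/" + file, nil // placeholder path
		}
		return "", errNotFound
	}
}

// firstAvailable walks order and returns the first name the lookup resolves.
func firstAvailable(order []string, look func(string) (string, error)) (string, bool) {
	for _, name := range order {
		if _, err := look(name); err == nil {
			return name, true
		}
	}
	return "", false
}

func main() {
	order := []string{"codex", "claude", "opencode", "gemini"} // prefix-style subset of PreferenceOrder
	name, ok := firstAvailable(order, installedOnly("gemini", "claude"))
	fmt.Println(name, ok) // claude true
}
```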

func (*Registry) Get

func (r *Registry) Get(name string) (HarnessConfig, bool)

Get returns a harness config by name.

func (*Registry) Has

func (r *Registry) Has(name string) bool

Has returns true if the harness is registered.

func (*Registry) Names

func (r *Registry) Names() []string

Names returns all registered harness names in preference order.

type RoutingActual

type RoutingActual struct {
	Harness            string   `json:"harness"`
	Provider           string   `json:"provider,omitempty"`
	Model              string   `json:"model"`
	FallbackChainFired []string `json:"fallback_chain_fired,omitempty"`
	FailureClass       string   `json:"failure_class,omitempty"`
	// Power is the catalog-projected power of the actually-dispatched
	// Model. 0 means unknown/exact-pin-only/no catalog entry.
	Power int `json:"power,omitempty"`
}

RoutingActual captures the resolved fallback chain on a final event.

type TextDeltaData

type TextDeltaData struct {
	Text string `json:"text"`
}

TextDeltaData is the payload for type=text_delta events.

type ToolCallData

type ToolCallData struct {
	ID    string          `json:"id"`
	Name  string          `json:"name"`
	Input json.RawMessage `json:"input,omitempty"`
}

ToolCallData is the payload for type=tool_call events.

type ToolResultData

type ToolResultData struct {
	ID         string `json:"id"`
	Output     string `json:"output,omitempty"`
	Error      string `json:"error,omitempty"`
	DurationMS int64  `json:"duration_ms,omitempty"`
}

ToolResultData is the payload for type=tool_result events.

type UsageCandidate

type UsageCandidate struct {
	Source     string
	Fresh      *bool
	CapturedAt string
	Counts     UsageTokenCounts
	Warning    string
}

UsageCandidate is one candidate source considered for final token usage.

type UsageSourceEvidence

type UsageSourceEvidence struct {
	Source     string            `json:"source"`
	Fresh      *bool             `json:"fresh,omitempty"`
	CapturedAt string            `json:"captured_at,omitempty"`
	Usage      *UsageTokenCounts `json:"usage,omitempty"`
	Warning    string            `json:"warning,omitempty"`
}

UsageSourceEvidence records one usage source considered by the resolver.

type UsageTokenCounts

type UsageTokenCounts struct {
	InputTokens      *int `json:"input_tokens,omitempty"`
	OutputTokens     *int `json:"output_tokens,omitempty"`
	CacheReadTokens  *int `json:"cache_read_tokens,omitempty"`
	CacheWriteTokens *int `json:"cache_write_tokens,omitempty"`
	CacheTokens      *int `json:"cache_tokens,omitempty"`
	ReasoningTokens  *int `json:"reasoning_tokens,omitempty"`
	TotalTokens      *int `json:"total_tokens,omitempty"`
}

UsageTokenCounts is the normalized token-count vocabulary shared by subprocess harnesses and CONTRACT-003 final metadata.

func ParseUsageJSON

func ParseUsageJSON(raw json.RawMessage) (UsageTokenCounts, error)

ParseUsageJSON normalizes common Claude/Codex/OpenAI-style usage objects. Unknown dimensions remain nil. A present zero remains present.
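
Normalization of this kind can be sketched for two dimensions. The counts type and normalize function are hypothetical; which key spellings the real parser accepts is an assumption, with input_tokens/output_tokens and prompt_tokens/completion_tokens used here because both naming styles are common in provider usage objects.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// counts is a cut-down stand-in for UsageTokenCounts.
type counts struct {
	InputTokens  *int
	OutputTokens *int
}

// normalize maps two common key spellings onto one vocabulary. Missing keys
// stay nil; a key present with value 0 yields a non-nil pointer to 0.
func normalize(raw json.RawMessage) (counts, error) {
	var in struct {
		InputTokens      *int `json:"input_tokens"`
		OutputTokens     *int `json:"output_tokens"`
		PromptTokens     *int `json:"prompt_tokens"`
		CompletionTokens *int `json:"completion_tokens"`
	}
	if err := json.Unmarshal(raw, &in); err != nil {
		return counts{}, err
	}
	c := counts{InputTokens: in.InputTokens, OutputTokens: in.OutputTokens}
	if c.InputTokens == nil {
		c.InputTokens = in.PromptTokens
	}
	if c.OutputTokens == nil {
		c.OutputTokens = in.CompletionTokens
	}
	return c, nil
}

func main() {
	c, err := normalize(json.RawMessage(`{"prompt_tokens":7,"completion_tokens":0}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(*c.InputTokens, *c.OutputTokens) // 7 0
}
```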

func (UsageTokenCounts) Any

func (c UsageTokenCounts) Any() bool

Any reports whether at least one token dimension is known.
