Documentation ¶
Overview ¶
Package llm defines Harbor's LLM-client interface and the runtime-wide invariants that guard every `Complete` call.
The interface is **one method**, `Complete(ctx, req) (resp, error)` (RFC §6.5). Tool dispatch is the runtime's job (RFC §6.4 + brief 07 "code-level tool calling"); the LLM client is reduced to a JSON-producing chat-completion adapter. Provider-native tool-calling shapes (the `tools=` request parameter, the `tool_choice=` mode selector, OpenAI's `function_call`, Anthropic's `tool_use` blocks, Gemini's function-calling protocol, etc.) never appear in this package; the static guard in `scripts/smoke/phase-32.sh` enforces the boundary by grepping for the canonical symbol names.
The message envelope is provider-agnostic: `ChatMessage.Content` is a sum-type that carries either `Text *string` (the common case) or `Parts []ContentPart` for multimodal input (D-021). Multimodal parts (`ImagePart`, `AudioPart`, `FilePart`) each carry one of three supply forms — `URL`, `DataURL`, or `Artifact` — and the runtime auto-materializes inline `DataURL` content above the heavy-output threshold into `ArtifactRef`s before persistence and emit (D-022).
**Context-window safety net (D-026).** Every `Complete` call routes through a catch-all pass at the LLM-client edge that (a) auto-materializes oversize `DataURL` content, (b) asserts no raw heavy content survived ANY producer's normalization step (else `ErrContextLeak`), and (c) estimates token usage against the configured `ModelProfile.ContextWindowTokens` cap and fails with `ErrContextWindowExceeded` when the estimate is within `ContextWindowReserve` of the cap. **V1 fails loudly**; auto-cascading recovery is post-V1 work.
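Step (c) can be read as a comparison of the estimate against the cap minus the reserve fraction. The sketch below is an assumption about that arithmetic, not the package's code: `exceedsBudget` is a hypothetical helper, and the exact comparison the safety pass performs is not shown in this documentation.

```go
package main

import "fmt"

// exceedsBudget is a hypothetical helper. It reports whether an estimated
// token count falls within the reserve fraction of the context-window cap.
// Assumption: "within reserve of the cap" means estimate >= cap * (1 - reserve).
func exceedsBudget(estimatedTokens, contextWindowTokens int, reserve float64) bool {
	limit := float64(contextWindowTokens) * (1 - reserve)
	return float64(estimatedTokens) >= limit
}

func main() {
	// With a 128000-token cap and the 5% default reserve, the limit is
	// roughly 121600 tokens.
	fmt.Println(exceedsBudget(100_000, 128_000, 0.05)) // false
	fmt.Println(exceedsBudget(125_000, 128_000, 0.05)) // true
}
```

Under this reading, a larger reserve leaves more headroom for the model's response at the cost of rejecting requests earlier.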
The safety pass is **mandatory by construction**: `Open` returns a wrapped client (`safetyClient`) that runs the pass before delegating to the underlying `Driver`. Drivers cannot bypass the pass through the registry; a hand-constructed `Driver` would likewise have to compose `enforceContextSafety` to maintain the runtime invariant.
Concurrent-reuse contract (D-025): one `LLMClient` is safe to share across N concurrent goroutines. Mutable state on the client (or the `Driver`) is forbidden; per-call state lives in `ctx` and the request value. The package-level `concurrent_test.go` pins this with N=128 invocations under `-race`.
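The shape of such a concurrency test might look like the following sketch. `statelessClient` is a hypothetical stand-in for an `LLMClient`, with a simplified `Complete` signature; the real `concurrent_test.go` is not reproduced here. Running a program like this under `go run -race` is what would surface shared mutable state.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// statelessClient is a hypothetical stand-in for an LLMClient. It holds no
// mutable fields, so sharing one value across goroutines is safe.
type statelessClient struct{}

// Complete derives its result only from its arguments (per-call state).
func (statelessClient) Complete(ctx context.Context, prompt string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	return "echo:" + prompt, nil
}

// runConcurrent fires n Complete calls against one shared client and
// returns how many produced the expected per-call result.
func runConcurrent(n int) int64 {
	var client statelessClient
	var ok atomic.Int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			prompt := fmt.Sprintf("p%d", i)
			got, err := client.Complete(context.Background(), prompt)
			if err == nil && got == "echo:"+prompt {
				ok.Add(1)
			}
		}(i)
	}
	wg.Wait()
	return ok.Load()
}

func main() {
	fmt.Println(runConcurrent(128)) // 128
}
```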
Index ¶
- Constants
- Variables
- func HasIdentity(ctx context.Context) bool
- func IsInvalidJSONSchemaError(err error) bool
- func Register(name string, factory Factory)
- func RegisterCorrectionsWrapper(fn func(LLMClient, ConfigSnapshot) LLMClient)
- func RegisterDefaultOutputModeResolver(fn func(model string) OutputMode)
- func RegisterDowngradeWrapper(fn func(LLMClient, ConfigSnapshot, Deps) LLMClient)
- func RegisterGovernanceWrapper(fn func(LLMClient, ConfigSnapshot, Deps) LLMClient)
- func RegisterMockModeCaptured(v bool)
- func RegisterRetryWrapper(fn func(LLMClient, ConfigSnapshot, Deps) LLMClient)
- func RegisteredDrivers() []string
- type ArtifactStub
- type AudioPart
- type ChatMessage
- type CompleteRequest
- type CompleteResponse
- type CompletionChunkPayload
- type ConfigSnapshot
- type Content
- type ContentPart
- type ContextLeakPayload
- type ContextWindowExceededPayload
- type CorrectionsProfile
- type Cost
- type CostRecordedPayload
- type CostTable
- type CustomProviderSpec
- type Deps
- type Driver
- type Factory
- type FilePart
- type ImageMaterializedPayload
- type ImagePart
- type LLMClient
- type MessageOrderingPolicy
- type ModeDowngradedPayload
- type ModelProfile
- type NetworkDefaults
- type OutputMode
- type PartType
- type PostureProvider
- type PostureReadAdminPayload
- type PostureSnapshot
- type ReasoningEffort
- type ReasoningRouting
- type ResponseFormat
- type ResponseFormatKind
- type ResponseFormatProfile
- type RetryWithFeedbackPayload
- type Role
- type SchemaSanitizationMode
- type StubFetch
- type ToolCallStructured
- type ToolDeclaration
- type Usage
Constants ¶
const (
	// EventTypeImageMaterialized is emitted when the safety-pass's
	// auto-materialize step rewrites an inline DataURL ≥ heavy-output
	// threshold to an ArtifactRef (D-022). Carries the source
	// CompleteRequest's model name + the new ref's id + size.
	EventTypeImageMaterialized events.EventType = "llm.image.materialized"

	// EventTypeContextLeak is emitted when the safety-pass detects
	// raw heavy content that survived every upstream producer's
	// normalization step (D-026 violation). The bus event lets
	// operators trace the offending producer.
	EventTypeContextLeak events.EventType = "llm.context_leak"

	// EventTypeContextWindowExceeded is emitted when the safety-pass
	// token-budget guard fires (D-026). Payload carries the
	// estimated token count + the model's cap + the reserve
	// fraction so operators can quantify how often planner-side
	// recovery (truncate / summarize) needs to engage.
	EventTypeContextWindowExceeded events.EventType = "llm.context_window_exceeded"

	// EventTypeCostRecorded is emitted by the runtime AFTER a
	// successful Complete. Phase 36a (governance accumulator)
	// subscribes; Phase 32 registers the type + ships the payload
	// shape so Phase 36a's emit site lands clean.
	EventTypeCostRecorded events.EventType = "llm.cost.recorded"

	// EventTypeModeDowngraded is emitted by Phase 35's structured-
	// output downgrade chain (`json_schema → json_object → text`).
	// Phase 32 registers the type as a forward-compat seam; no
	// downgrade logic ships in Phase 32.
	EventTypeModeDowngraded events.EventType = "llm.mode_downgraded"

	// EventTypeRetryWithFeedback (Phase 36) is emitted by the retry
	// wrapper per corrective re-ask. Carries the attempt index and a
	// truncated `Reason` derived from the validator's error.
	EventTypeRetryWithFeedback events.EventType = "llm.retry_with_feedback"

	// EventTypePostureReadAdmin: Phase 72g (D-112). Emitted when an
	// admin-scoped caller reads ANOTHER tenant's LLM posture via the
	// `llm.posture` Protocol method. An own-tenant read does NOT emit.
	// The cross-tenant read is a privileged action and lands on the
	// audit trail per CLAUDE.md §7 + RFC §6.15.
	EventTypePostureReadAdmin events.EventType = "llm.posture_read_admin"

	// EventTypeCompletionChunk: Phase 107 streaming completion event.
	// Emitted per token delta from the LLM provider under the originating
	// run's identity quadruple. The `Done=true` chunk fires exactly once
	// per stream (terminator marker). SafePayload: deltas are per-session
	// operator-visible content.
	EventTypeCompletionChunk events.EventType = "llm.completion.chunk"
)
Phase 32 LLM-edge event types. Registered via init() so the canonical events registry stays the single source of truth (see internal/events/events.go and AGENTS.md §17.6's "wiring gap" lesson — register at declaration time, publish at use time).
All payloads are SafePayload (compose events.SafeSealed): they carry no secret-shaped data. Identity is the Harbor quadruple; content payloads (artifact refs, MIME types, byte counts, model names) are operator-visible by design.
const (
	DefaultContextWindowReserve = 0.05   // 5%
	DefaultHeavyOutputThreshold = 32_768 // 32 KiB; matches D-022 / RFC §6.10

	// DefaultMaxRetries (Phase 36) is the retry-with-feedback bound
	// when `ModelProfile.MaxRetries` is zero. Conservative: one
	// corrective re-ask after the original attempt.
	DefaultMaxRetries = 1
)
Defaults applied when the snapshot's corresponding field is zero. Kept here (not in `validate.go`) so an operator who constructs a snapshot programmatically still gets reasonable behaviour without every test wiring also touching the config layer.
const DefaultDriver = "bifrost"
DefaultDriver names the production LLM driver. Phase 64 (D-089) flipped this constant to `"bifrost"`, the pure-Go LLM gateway shipped by Phase 33. Before Phase 64 it was `"mock"`; the flip closes the §13 "test stubs as production defaults" amendment for the LLM seam.
Operators in production set `llm.driver` explicitly to `"bifrost"` (the same value the config defaults to). The `mock` driver still self-registers via init() — its package init runs when an importer (a test that builds a deterministic LLM stack) blank-imports it — but the production `cmd/harbor` binary never imports the mock package, so a config that lists `driver: mock` in a production build hits `ErrUnknownDriver: "mock" (registered: bifrost)` rather than silently routing through a stub.
Dev-only escape hatch (D-089): the `harbor dev` subcommand reads `HARBOR_DEV_ALLOW_MOCK=1` and, when set, blank-imports the mock driver itself (the conditional blank-import lives at the subcommand boundary, not in `main.go`) AND prints a stderr banner `[DEV-ONLY MOCK LLM — DO NOT USE IN PRODUCTION]` on every boot. Outside that one explicit dev path, the mock is unreachable.
Variables ¶
var (
	// ErrUnknownDriver: Open was asked for a driver name no
	// registered factory handles. The error's message names the
	// registered drivers so misconfigurations are obvious (§4.4).
	ErrUnknownDriver = errors.New("llm: unknown driver")

	// ErrClientClosed: Complete called after Close. The wrapped
	// driver returns this; the safetyClient propagates it verbatim.
	ErrClientClosed = errors.New("llm: client is closed")

	// ErrIdentityMissing: Complete called with a ctx that does not
	// carry an `identity.Identity` (or `identity.Quadruple`).
	// AGENTS.md §6 rule 9: identity is mandatory at every Harbor
	// boundary; the runtime fails closed.
	ErrIdentityMissing = errors.New("llm: identity missing from ctx")

	// ErrInvalidContent: a `ChatMessage.Content` is malformed: both
	// `Text` and `Parts` set, or neither, or a `ContentPart` whose
	// `Type` discriminator doesn't match its payload (e.g. Type=image
	// with `Image == nil`). The safety pass rejects loudly rather than
	// papering over the inconsistency.
	ErrInvalidContent = errors.New("llm: invalid message content")

	// ErrContextLeak: runtime-wide invariant violation (D-026). A
	// raw byte / string / DataURL ≥ heavy-output threshold survived
	// every producer's normalization step and reached the LLM-client
	// edge. The safety pass fails the request; the bus emits
	// `llm.context_leak` so operators can find the offending
	// producer.
	ErrContextLeak = errors.New("llm: raw heavy content reached LLM-client edge — D-026 violation")

	// ErrContextWindowExceeded: the token-budget guard fired (D-026).
	// The assembled `CompleteRequest`'s estimated token count is
	// within `ContextWindowReserve` of the model's configured
	// `ContextWindowTokens` cap. V1 fails loudly; auto-cascade is
	// post-V1 work, so the planner is responsible for recovery (drop
	// older turns, summarize, etc.).
	ErrContextWindowExceeded = errors.New("llm: estimated tokens within reserve of model context window")

	// ErrInvalidConfig: `Open` called with a `ConfigSnapshot` that
	// fails structural validation (driver name empty, model profile
	// missing for the request's model, etc.). Distinct from
	// ErrUnknownDriver: that is a registry miss, this is a
	// configuration miss.
	ErrInvalidConfig = errors.New("llm: invalid configuration")

	// ErrUnsupportedModel: the safety net or driver hit a model
	// name with no matching `ModelProfile`. Required because the
	// token-budget guard depends on a profile's context-window cap.
	ErrUnsupportedModel = errors.New("llm: model has no configured ModelProfile")

	// ErrInvalidJSONSchema (Phase 35): the provider returned a
	// `Complete` whose JSON output did not validate against the
	// requested schema (or rejected the schema itself at the wire
	// layer). The downgrade wrapper observes this via
	// `IsInvalidJSONSchemaError` and steps the request down the chain.
	// Drivers MAY wrap their provider-specific schema errors with this
	// sentinel; the classifier also matches a small allowlist of error
	// substrings to handle providers that surface only a free-form
	// `error` string.
	ErrInvalidJSONSchema = errors.New("llm: response failed JSON-schema validation")

	// ErrDowngradeExhausted (Phase 35): the downgrade wrapper ran
	// every step in the chain and the inner call STILL produced
	// `ErrInvalidJSONSchema`. Surfaces with the wrapped chain history
	// so operators can correlate against `llm.mode_downgraded` events.
	ErrDowngradeExhausted = errors.New("llm: structured-output downgrade chain exhausted")

	// ErrRetryExhausted (Phase 36): the retry wrapper exceeded the
	// per-model `MaxRetries` bound. Wraps the chain of validator
	// failures so operators can see why each attempt failed.
	ErrRetryExhausted = errors.New("llm: retry-with-feedback budget exhausted")

	// ErrValidationFailed (Phase 36): surfaces when the validator
	// returns non-nil AND the retry wrapper is NOT registered (the
	// caller asked for validation without a wrapper to retry). The
	// wrapper-registered path uses the validator's own error verbatim
	// in `RetryWithFeedbackPayload.Reason`.
	ErrValidationFailed = errors.New("llm: response validator rejected output")

	// ErrOrphanToolCall: an assistant message with `ToolCalls` is
	// not followed by the corresponding `RoleTool` messages whose
	// `ToolCallID` matches each `ToolCalls[i].ID`. OpenAI's wire
	// spec requires the pairing; the safety pass rejects loudly so
	// the producer is forced to fix the upstream omission rather
	// than silently shipping an invalid wire shape.
	ErrOrphanToolCall = errors.New("llm: assistant message with ToolCalls is not followed by matching RoleTool messages")
)
Sentinel errors. Callers compare via errors.Is.
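A minimal sketch of that comparison, assuming the producing code wraps the sentinel with `%w` to add detail. `open` and `isUnknownDriver` are hypothetical stand-ins; the sentinel is redeclared locally so the sketch compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnknownDriver mirrors the documented sentinel (redeclared locally).
var ErrUnknownDriver = errors.New("llm: unknown driver")

// open is a hypothetical stand-in for Open. It wraps the sentinel with %w
// so the detail is in the message while errors.Is still matches.
func open(driver string) error {
	if driver != "bifrost" {
		return fmt.Errorf("%w: %q (registered: bifrost)", ErrUnknownDriver, driver)
	}
	return nil
}

// isUnknownDriver shows the caller-side check: errors.Is walks the wrap chain.
func isUnknownDriver(err error) bool {
	return errors.Is(err, ErrUnknownDriver)
}

func main() {
	err := open("mock")
	fmt.Println(isUnknownDriver(err)) // true
	fmt.Println(err == ErrUnknownDriver) // false: the error is wrapped
}
```

Comparing with `==` misses wrapped errors, which is why `errors.Is` is the comparison to reach for.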
Functions ¶
func HasIdentity ¶
func HasIdentity(ctx context.Context) bool
HasIdentity reports whether `ctx` carries a complete Harbor identity. The LLM-client edge MUST validate this before invoking any driver — the runtime fails closed on missing identity (AGENTS.md §6 rule 9, AGENTS.md §13 forbidden-practices).
Used by `safetyClient.Complete`; exposed so test helpers can pin the check at the call site.
func IsInvalidJSONSchemaError ¶
func IsInvalidJSONSchemaError(err error) bool
IsInvalidJSONSchemaError reports whether `err` represents a schema-class failure that the Phase 35 downgrade chain should treat as a signal to step the request down to the next `OutputMode`.
The classifier checks two paths:
- `errors.Is(err, ErrInvalidJSONSchema)` — drivers / wrappers that classify upstream errors and wrap with the sentinel.
- A small case-insensitive substring scan against `invalidJSONSchemaErrorMarkers`. This handles providers that surface only a free-form error string.
The substring allowlist is deliberately narrow to avoid false positives on transient / IO / auth failures. Returns false for nil.
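The two-path check described above could be sketched as follows. The marker strings are hypothetical: the actual contents of `invalidJSONSchemaErrorMarkers` are not shown in this documentation, and the example error values exist only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrInvalidJSONSchema mirrors the documented sentinel (redeclared locally).
var ErrInvalidJSONSchema = errors.New("llm: response failed JSON-schema validation")

// markers is a HYPOTHETICAL allowlist, kept lowercase for the comparison.
var markers = []string{"invalid json schema", "invalid schema for response_format"}

// Example errors used below; both are illustrative values.
var (
	errFreeForm = errors.New("400: Invalid JSON Schema supplied")
	errTimeout  = errors.New("dial tcp: i/o timeout")
)

// isInvalidJSONSchemaError checks the sentinel first, then falls back to a
// case-insensitive substring scan over the error message.
func isInvalidJSONSchemaError(err error) bool {
	if err == nil {
		return false
	}
	if errors.Is(err, ErrInvalidJSONSchema) {
		return true
	}
	msg := strings.ToLower(err.Error())
	for _, m := range markers {
		if strings.Contains(msg, m) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isInvalidJSONSchemaError(nil))                                           // false
	fmt.Println(isInvalidJSONSchemaError(fmt.Errorf("driver: %w", ErrInvalidJSONSchema))) // true
	fmt.Println(isInvalidJSONSchemaError(errFreeForm))                                   // true
	fmt.Println(isInvalidJSONSchemaError(errTimeout))                                    // false
}
```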
func Register ¶
func Register(name string, factory Factory)
Register installs a driver factory under `name`. Drivers self-register from their package `init()`; `cmd/harbor` blank-imports the production driver to trigger registration (Phase 33+).
Re-registering the same name panics — the registration model is write-once-at-init and a duplicate signals a build misconfig.
func RegisterCorrectionsWrapper ¶
func RegisterCorrectionsWrapper(fn func(LLMClient, ConfigSnapshot) LLMClient)
RegisterCorrectionsWrapper installs the Phase 34 corrections wrapper hook. Called once from `internal/llm/corrections.init()`; the production binary picks up the registration by blank-importing the corrections package.
The hook signature mirrors `corrections.Wrap` — given the inner `LLMClient` (the safety wrapper) and the config snapshot, returns the corrections-wrapped client.
Re-registering panics: the registration model is write-once-at-init, and a duplicate signals a build misconfig.
func RegisterDefaultOutputModeResolver ¶
func RegisterDefaultOutputModeResolver(fn func(model string) OutputMode)
RegisterDefaultOutputModeResolver installs the per-known-provider `OutputMode` resolver from `internal/llm/corrections`. Called once from `corrections.init()`; the production binary blank-imports the corrections package so the registration fires at boot. Re-registering panics — write-once-at-init.
func RegisterDowngradeWrapper ¶
func RegisterDowngradeWrapper(fn func(LLMClient, ConfigSnapshot, Deps) LLMClient)
RegisterDowngradeWrapper installs the Phase 35 structured-output downgrade wrapper hook. Called once from `internal/llm/output.init()`; the production binary blank-imports `internal/llm/output` so the registration fires at boot.
The hook receives the inner `LLMClient` (typically `corrections(safety(driver))`), the config snapshot, and the Deps so the wrapper can emit events on the shared bus.
Re-registering panics — write-once-at-init.
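The `json_schema → json_object → text` chain the wrapper walks can be illustrated with a small loop. This is a sketch under assumptions: the `Mode*` constant names, the `completeWithDowngrade` helper, and the simplified call signature are all hypothetical, and event emission is omitted.

```go
package main

import (
	"errors"
	"fmt"
)

type OutputMode string

// Hypothetical constant names; only the string values come from the docs.
const (
	ModeJSONSchema OutputMode = "json_schema"
	ModeJSONObject OutputMode = "json_object"
	ModeText       OutputMode = "text"
)

var (
	ErrInvalidJSONSchema  = errors.New("llm: response failed JSON-schema validation")
	ErrDowngradeExhausted = errors.New("llm: structured-output downgrade chain exhausted")
)

// completeWithDowngrade tries each mode from start downward. It steps down
// only on schema-class failures; any other error is returned immediately.
func completeWithDowngrade(start OutputMode, call func(OutputMode) (string, error)) (string, OutputMode, error) {
	chain := []OutputMode{ModeJSONSchema, ModeJSONObject, ModeText}
	begin := 0
	for i, m := range chain {
		if m == start {
			begin = i
		}
	}
	for _, mode := range chain[begin:] {
		out, err := call(mode)
		if err == nil {
			return out, mode, nil
		}
		if !errors.Is(err, ErrInvalidJSONSchema) {
			return "", mode, err
		}
	}
	return "", ModeText, ErrDowngradeExhausted
}

func main() {
	// Stand-in provider that rejects json_schema but accepts json_object.
	provider := func(m OutputMode) (string, error) {
		if m == ModeJSONSchema {
			return "", ErrInvalidJSONSchema
		}
		return `{"ok":true}`, nil
	}
	out, mode, err := completeWithDowngrade(ModeJSONSchema, provider)
	fmt.Println(out, mode, err) // {"ok":true} json_object <nil>
}
```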
func RegisterGovernanceWrapper ¶
func RegisterGovernanceWrapper(fn func(LLMClient, ConfigSnapshot, Deps) LLMClient)
RegisterGovernanceWrapper installs the Phase 36a/36b governance wrapper hook. Called once from `internal/governance.init()`; the production binary blank-imports the package so the hook lands at boot. Governance composes OUTSIDE the entire downstream chain (D-043 + D-044) — the wrapper sits at the outermost layer in `Open` so `PreCall` fires before retry / downgrade / corrections / safety even reach the driver.
The hook receives the inner `LLMClient` (typically `retry(downgrade(corrections(safety(driver))))`), the config snapshot, and the Deps so the wrapper can build its Subsystem if a factory has been registered via `governance.SetFactory`. Latent default: with no factory set, the hook returns `inner` unchanged.
Re-registering panics — write-once-at-init.
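The layering `governance(retry(downgrade(corrections(safety(driver)))))` means the outermost wrapper sees each call first. The sketch below only demonstrates that ordering with function decorators; `completeFunc`, `layer`, and `buildChain` are hypothetical names, and the real wrappers operate on `LLMClient` values.

```go
package main

import (
	"fmt"
	"strings"
)

// completeFunc is a hypothetical stand-in for LLMClient.Complete that
// records which layers a call passes through.
type completeFunc func(trace []string) []string

// layer wraps inner so that name is recorded before inner runs.
func layer(name string, inner completeFunc) completeFunc {
	return func(trace []string) []string {
		return inner(append(trace, name))
	}
}

// buildChain composes the wrappers innermost-first, as Open is described
// as doing: safety around the driver, governance outermost.
func buildChain() completeFunc {
	driver := func(trace []string) []string { return append(trace, "driver") }
	c := layer("safety", driver)
	c = layer("corrections", c)
	c = layer("downgrade", c)
	c = layer("retry", c)
	c = layer("governance", c)
	return c
}

func main() {
	fmt.Println(strings.Join(buildChain()(nil), " -> "))
	// governance -> retry -> downgrade -> corrections -> safety -> driver
}
```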
func RegisterMockModeCaptured ¶
func RegisterMockModeCaptured(v bool)
RegisterMockModeCaptured records that the runtime booted with `HARBOR_DEV_ALLOW_MOCK=1` (D-089). It is called exactly once from `cmd/harbor/devmock.go::registerMockIfDevAllowMock` at boot — the SAME call site that prints the `[DEV-ONLY MOCK LLM — DO NOT USE IN PRODUCTION]` stderr banner. Calling it with `true` flips the captured flag so `llm.posture` surfaces `MockMode: true`; calling it with `false` (or never calling it — the zero value) leaves the flag false.
A future PR that re-routes the dev-hatch path (e.g. promotes the env var to a CLI flag) MUST keep this call reciprocal with the banner emit — otherwise `LLMPostureResponse.MockMode` silently desyncs from the banner. The Phase 72g integration + smoke tests assert both paths fire together.
func RegisterRetryWrapper ¶
func RegisterRetryWrapper(fn func(LLMClient, ConfigSnapshot, Deps) LLMClient)
RegisterRetryWrapper installs the Phase 36 retry-with-feedback wrapper hook. Called once from `internal/llm/retry.init()`; the production binary blank-imports `internal/llm/retry`.
The hook signature mirrors `RegisterDowngradeWrapper`.
Re-registering panics — write-once-at-init.
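A retry-with-feedback loop of the kind this wrapper implements might be sketched as below. The helper names, the simplified signatures (a string response, a model call that cannot fail), and the way feedback is passed back are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrRetryExhausted = errors.New("llm: retry-with-feedback budget exhausted")

// requireOK is a hypothetical validator that accepts only the literal "ok".
func requireOK(out string) error {
	if out != "ok" {
		return fmt.Errorf("expected \"ok\", got %q", out)
	}
	return nil
}

// completeWithRetry makes the original attempt plus up to maxRetries
// corrective re-asks, feeding the validator's error text into the next call.
// A nil validator makes it a pass-through. It returns the output and the
// number of attempts made.
func completeWithRetry(maxRetries int, call func(feedback string) string, validate func(string) error) (string, int, error) {
	if validate == nil {
		return call(""), 1, nil
	}
	feedback := ""
	var lastErr error
	for attempt := 1; attempt <= maxRetries+1; attempt++ {
		out := call(feedback)
		if lastErr = validate(out); lastErr == nil {
			return out, attempt, nil
		}
		feedback = lastErr.Error()
	}
	return "", maxRetries + 1, fmt.Errorf("%w: %v", ErrRetryExhausted, lastErr)
}

func main() {
	// Stand-in model that only answers correctly once it receives feedback.
	model := func(feedback string) string {
		if feedback == "" {
			return "bad"
		}
		return "ok"
	}
	out, attempts, err := completeWithRetry(1, model, requireOK)
	fmt.Println(out, attempts, err) // ok 2 <nil>
}
```

With a bound of 1 (the documented `DefaultMaxRetries`), the loop makes at most two calls in total.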
func RegisteredDrivers ¶
func RegisteredDrivers() []string
RegisteredDrivers returns a sorted list of driver names. Useful for boot-log emission and for surfacing in error messages.
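The write-once registration model and the sorted listing can be sketched with a small registry type. This is an illustration, not the package's implementation: the real `Register` and `RegisteredDrivers` are package-level functions, and the `registry` type and simplified `Factory` signature here are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Factory is a simplified, hypothetical factory signature.
type Factory func() any

// registry holds driver factories keyed by name.
type registry struct {
	mu      sync.RWMutex
	drivers map[string]Factory
}

func newRegistry() *registry {
	return &registry{drivers: map[string]Factory{}}
}

// Register installs a factory and panics on a duplicate name, mirroring
// the write-once-at-init model.
func (r *registry) Register(name string, f Factory) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, dup := r.drivers[name]; dup {
		panic("llm: Register called twice for driver " + name)
	}
	r.drivers[name] = f
}

// Drivers returns the registered names in sorted order.
func (r *registry) Drivers() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.drivers))
	for n := range r.drivers {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	r := newRegistry()
	r.Register("mock", nil)
	r.Register("bifrost", nil)
	fmt.Println(r.Drivers()) // [bifrost mock]
}
```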
Types ¶
type ArtifactStub ¶
type ArtifactStub struct {
Ref string `json:"artifact_ref"`
MIME string `json:"mime"`
SizeBytes int64 `json:"size_bytes"`
Hash string `json:"hash,omitempty"`
Summary string `json:"summary,omitempty"`
Fetch *StubFetch `json:"fetch,omitempty"`
}
ArtifactStub is the model-agnostic JSON shape the LLM sees in place of heavy content during prompt assembly (RFC §6.5, D-026). The same shape is used whether the substituted content originated from a tool result, a memory turn, or a multimodal input.
Operators can override `Summary` per-producer; the rest is runtime-stamped at materialization time. The stub's JSON rendering is byte-stable across providers — no per-provider swapping.
JSON shape (omitempty on optional fields, no extra fields):
{"artifact_ref":"ref-abc-def","mime":"image/png","size_bytes":65536,
"hash":"sha256:...","summary":"User-uploaded screenshot at turn 3",
"fetch":{"tool":"artifact.fetch","id":"ref-abc-def"}}
func (ArtifactStub) MarshalJSON ¶
func (s ArtifactStub) MarshalJSON() ([]byte, error)
MarshalJSON ensures the canonical render of an `ArtifactStub` — stable field order, `omitempty` honored, no extra fields. The runtime's `ObservationRenderer` and the safety-net materialization both go through this method, so producers and the LLM-side audit see byte-identical output.
Implemented explicitly (rather than relying on Go's default struct marshaling) so the contract is stable across Go version field-ordering changes.
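To see the documented JSON shape concretely, the sketch below marshals a locally redeclared `ArtifactStub` using only the struct tags shown above. It deliberately does not reproduce the explicit `MarshalJSON`; the `StubFetch` fields are an assumption inferred from the `"fetch"` object in the example JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StubFetch fields are assumed from the documented example JSON.
type StubFetch struct {
	Tool string `json:"tool"`
	ID   string `json:"id"`
}

// ArtifactStub is redeclared locally with the documented tags.
type ArtifactStub struct {
	Ref       string     `json:"artifact_ref"`
	MIME      string     `json:"mime"`
	SizeBytes int64      `json:"size_bytes"`
	Hash      string     `json:"hash,omitempty"`
	Summary   string     `json:"summary,omitempty"`
	Fetch     *StubFetch `json:"fetch,omitempty"`
}

// render marshals a stub with encoding/json's default struct encoding,
// which emits fields in declaration order and drops empty omitempty fields.
func render(s ArtifactStub) string {
	b, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(render(ArtifactStub{Ref: "ref-abc-def", MIME: "image/png", SizeBytes: 65536}))
	// {"artifact_ref":"ref-abc-def","mime":"image/png","size_bytes":65536}
}
```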
type AudioPart ¶
type AudioPart struct {
URL string
DataURL string
Artifact *ArtifactStub
MIME string
}
AudioPart is a multimodal audio input. Same supply forms as `ImagePart`; `MIME` is the audio MIME type.
type ChatMessage ¶
type ChatMessage struct {
Role Role
Content Content
Name *string
// ToolCallID (Phase 107c / D-167) is the provider-assigned
// tool-call identifier carried on RoleTool messages. Rendered
// as the native tool-result role with matching call ID when
// the provider supports it; falls back to user-role rendering
// on providers without native tool-result roles.
ToolCallID *string
// ToolCalls (Phase 107c / D-167) is the per-message structured
// tool-call slice carried on RoleAssistant messages that replay
// a prior planner step's CallTool emission into the next turn's
// thread. When non-empty, the bifrost translator emits an
// assistant message with the provider-native `tool_calls`
// block (OpenAI / Anthropic / Gemini all consume this shape).
// The matching tool result is threaded back via a sibling
// RoleTool message whose `ToolCallID` matches `ToolCalls[i].ID`.
// Empty for every non-assistant message and for assistant
// messages whose content is the model's final answer.
ToolCalls []ToolCallStructured
}
ChatMessage is one entry in the chat thread.
`Content` is a sum-type: exactly one of `Text` or `Parts` is set. `Text` is the common case (text-only conversation). `Parts` is set when the message carries multimodal content. `Name` is optional — used by some providers for participant naming.
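The `ToolCalls` / `ToolCallID` pairing described in the field comments, and the `ErrOrphanToolCall` check that guards it, could look roughly like this. The `message` type is a hypothetical flattening of `ChatMessage`, the role strings are illustrative, and the sketch assumes tool results immediately follow the assistant message that requested them.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrOrphanToolCall = errors.New("llm: assistant message with ToolCalls is not followed by matching RoleTool messages")

// message is a hypothetical, flattened stand-in for ChatMessage.
type message struct {
	Role        string   // "assistant", "tool", "user", ...
	ToolCallID  string   // set on tool messages
	ToolCallIDs []string // IDs of the ToolCalls on an assistant message
}

// checkToolPairing verifies that every tool call on an assistant message
// is answered by a tool message in the run that directly follows it.
func checkToolPairing(msgs []message) error {
	for i, m := range msgs {
		if m.Role != "assistant" || len(m.ToolCallIDs) == 0 {
			continue
		}
		pending := map[string]bool{}
		for _, id := range m.ToolCallIDs {
			pending[id] = true
		}
		for _, next := range msgs[i+1:] {
			if next.Role != "tool" {
				break
			}
			delete(pending, next.ToolCallID)
		}
		if len(pending) > 0 {
			return ErrOrphanToolCall
		}
	}
	return nil
}

func main() {
	paired := []message{
		{Role: "assistant", ToolCallIDs: []string{"call-1"}},
		{Role: "tool", ToolCallID: "call-1"},
	}
	orphan := []message{
		{Role: "assistant", ToolCallIDs: []string{"call-1"}},
		{Role: "user"},
	}
	fmt.Println(checkToolPairing(paired) == nil)               // true
	fmt.Println(checkToolPairing(orphan) == ErrOrphanToolCall) // true
}
```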
type CompleteRequest ¶
type CompleteRequest struct {
Model string
Messages []ChatMessage
ResponseFormat *ResponseFormat
Stream bool
OnContent func(delta string, done bool)
OnReasoning func(delta string, done bool)
Temperature *float32
MaxTokens *int
Stops []string
ReasoningEffort ReasoningEffort
Extra map[string]any
// Validator (Phase 36) is the caller-supplied post-response
// validation hook. When non-nil, the retry wrapper invokes it
// after each successful `Complete`; a non-nil return triggers a
// corrective re-ask bounded by `ModelProfile.MaxRetries`. The
// validator is opaque to the wrapper — return any error type;
// the wrapper truncates and includes its `Error()` in the retry
// sub-prompt + the `llm.retry_with_feedback` event payload.
//
// `nil` Validator (the default) disables the retry loop entirely;
// the wrapper is a no-op pass-through. Validators MUST be safe for
// concurrent invocation against the same compiled artifact (the
// wrapper itself enforces D-025; the validator runs once per call).
Validator func(CompleteResponse) error
// Tools (Phase 107c / D-167) is the per-turn tool catalog. When
// nil the driver calls the provider without the tool-calling
// block (text-only completion — preserves non-React planner
// behavior).
Tools []ToolDeclaration
// ToolChoice (Phase 107c / D-167) is the per-provider tool-choice
// passthrough. "" means "do not emit a tool_choice field"; "auto"
// lets the provider decide; "required" forces the model to emit at
// least one tool call; "none" suppresses tool calls entirely.
ToolChoice string
// ParallelToolCalls (Phase 107c / D-167) is the per-turn knob for
// parallel function-calling (default true for supporting providers;
// bifrost maps it per provider). The planner sets this per the
// operator's yaml knob + the runloop executor's capability signal.
ParallelToolCalls bool
}
CompleteRequest is the LLM-call payload. Settled in RFC §6.5; shaped by D-021 (multimodal sum-type), D-026 (safety-net invariants).
`Messages` is the chat thread — role + content only. The system / user / assistant roles are the entire vocabulary; tool-result rendering happens at the `ObservationRenderer` layer as user-role messages (RFC §6.4 + brief 07 §5).
`ResponseFormat` is an optional structured-output hint. `nil` means "plain text"; `json_object` requests provider JSON mode; `json_schema` carries a caller-supplied JSON Schema. Phase 35 owns the per-provider downgrade chain `json_schema → json_object → text`.
`Stream` + `OnContent` / `OnReasoning` cooperate: when `Stream` is true, the driver invokes the callbacks for each delta. `OnReasoning` fires only for thinking-class providers that expose a separate reasoning channel (`o1`, `o3`, `deepseek-reasoner`, etc.).
`Temperature` / `MaxTokens` / `Stops` map directly onto provider sampler controls. Pointer types (`*float32`, `*int`) distinguish "unset (use provider default)" from "set to zero".
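The unset-versus-zero distinction is easiest to see with a tiny example. `describeTemperature` is a hypothetical helper written for this illustration.

```go
package main

import "fmt"

// describeTemperature shows how a nil pointer ("unset") differs from a
// pointer to zero ("explicitly zero").
func describeTemperature(t *float32) string {
	if t == nil {
		return "provider default"
	}
	return fmt.Sprintf("explicit %.1f", *t)
}

func main() {
	zero := float32(0)
	fmt.Println(describeTemperature(nil))   // provider default
	fmt.Println(describeTemperature(&zero)) // explicit 0.0
}
```

With a plain `float32` field, both cases would collapse to `0` and a driver could not tell them apart.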
`ReasoningEffort` is a request-level hint mapped to per-provider reasoning controls (bifrost's `ChatReasoning`). `""` means "do not touch the provider default."
`Extra` is provider-passthrough sanitized by Phase 34's correction layer. Phase 32 stores the field but does not interpret it.
type CompleteResponse ¶
type CompleteResponse struct {
Content string
ToolCalls []ToolCallStructured
Reasoning string
Cost Cost
Usage Usage
}
CompleteResponse is the LLM-call return shape.
`Content` is the full assembled assistant message — for streaming calls the driver concatenates `OnContent` deltas into `Content` before returning. The runtime parses `Content` into a `PlannerAction` per brief 07; the LLM never emits provider-native tool calls.
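How a driver might assemble `Content` from streamed deltas can be sketched as follows. `streamComplete` is a hypothetical stand-in for a driver's streaming path; the convention that the terminating callback carries an empty delta with `done=true` is an assumption, not something this documentation states.

```go
package main

import (
	"fmt"
	"strings"
)

// streamComplete forwards each delta to onContent (when set) and returns
// the concatenation, which plays the role of CompleteResponse.Content.
// Assumption: the terminator is signalled as onContent("", true).
func streamComplete(deltas []string, onContent func(delta string, done bool)) string {
	var b strings.Builder
	for _, d := range deltas {
		b.WriteString(d)
		if onContent != nil {
			onContent(d, false)
		}
	}
	if onContent != nil {
		onContent("", true)
	}
	return b.String()
}

func main() {
	calls := 0
	content := streamComplete([]string{"Hel", "lo", "!"}, func(delta string, done bool) {
		calls++
	})
	fmt.Println(content, calls) // Hello! 4
}
```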
`ToolCalls` (Phase 107c / D-167) carries provider-validated structured tool-call entries. When non-empty, the planner reads ToolCalls as its primary decision discriminator (native tool-calling path). Empty for text-only responses and for providers without native tool-calling support.
`Reasoning` carries the provider-side thinking trace (Anthropic extended thinking, OpenAI o-series, DeepSeek native, Gemini `thought:true` parts) normalised by the driver. It is the canonical captured trace for BOTH unary and streaming calls — distinct from the per-delta `OnReasoning` streaming callback, which exists for live UX. Empty when the provider did not surface reasoning, or when the driver does not read a reasoning channel. Reasoning is captured content, NOT replayed into prompts: the planner persists it on `trajectory.Step.ReasoningTrace` and only re-injects it when an operator opts into replay (D-148). Phase 83e (RFC §6.2 + §6.5).
`Cost` + `Usage` propagate the provider's reported figures. Governance (Phase 36a/36b) subscribes to `llm.cost.recorded` events emitted by the runtime when a `Complete` returns; the event payload re-stamps these shapes.
type CompletionChunkPayload ¶ added in v1.2.0
type CompletionChunkPayload struct {
events.SafePayload
Identity identity.Quadruple
TaskID string
RunID string
Delta string
Done bool
Kind string
OccurredAt time.Time
}
CompletionChunkPayload is the typed payload for EventTypeCompletionChunk (Phase 107). SafePayload — the delta is per-session operator-visible content (the LLM's own output), not a secret. Kind is "content" or "reasoning".
type ConfigSnapshot ¶
type ConfigSnapshot struct {
Driver string
ContextWindowReserve float64
HeavyOutputThreshold int
ModelProfiles map[string]ModelProfile
// DisableCorrections opts OUT of the Phase 34 per-provider
// correction layer. Zero-value (false) = corrections enabled —
// production callers wire `corrections.Wrap(safetyClient(driver))`
// so quirks like NIM message reordering, OpenAI strict-schema
// mode, thinking-class reasoning routing, Anthropic envelope
// translation, and usage backfill all apply automatically. Tests
// that need to exercise the safety pass in isolation set this to
// true.
//
// Inverse-named so the zero-value matches the production default
// — direct callers (tests, programmatic snapshot construction)
// don't have to flip an extra knob to get correct behaviour. The
// config loader resolves the operator-facing `corrections.enabled`
// yaml field (default true) into this inverse.
DisableCorrections bool
// DisableDowngrade opts OUT of the Phase 35 structured-output
// downgrade chain. Zero-value (false) = enabled. Inverse-named so
// production callers get the right behaviour by default.
DisableDowngrade bool
// DisableRetry opts OUT of the Phase 36 retry-with-feedback
// wrapper. Zero-value (false) = enabled. The wrapper is a no-op
// when `CompleteRequest.Validator` is nil, so disabling is only
// useful for tests that need to isolate the downgrade layer.
DisableRetry bool
// DisableGovernance opts OUT of the Phase 36a/36b governance
// wrapper. Zero-value (false) = enabled — but the wrapper is also
// a no-op pass-through when no `governance.Factory` has been
// registered, so the latent default (Wave 7b scoping) requires
// neither flag flip nor factory wiring. Tests that want to bypass
// even a registered factory flip this true.
DisableGovernance bool
// Bifrost-driver knobs (Phase 33).
Provider string
Model string
APIKey string
BaseURL string
Timeout time.Duration
// CustomProviders is the operator-declared registry of
// OpenAI-compatible providers (Phase 33a). When `Provider`
// matches a custom entry's `Name`, the entry's `BaseURL` /
// `APIKeyEnvVar` / `Models` / network knobs apply (legacy
// `APIKey` / `BaseURL` / `Timeout` ignored for that case). The
// list is keyed only by `Name`; the bifrost driver iterates and
// registers all entries with bifrost's `Account`.
CustomProviders []CustomProviderSpec
// NetworkDefaults applies to every provider when the per-provider
// override is absent. Zero-valued fields fall through to
// bifrost's package-level defaults at construction. Restart-
// required.
NetworkDefaults NetworkDefaults
}
ConfigSnapshot is the strict subset of `config.LLMConfig` the LLM package consumes. Keeping a snapshot decouples drivers from the config package's type evolution (mirrors `internal/memory`'s pattern).
- `Driver` selects the §4.4 factory. Empty defaults to `DefaultDriver` (`"bifrost"` since Phase 64 / D-089; `"mock"` before that).
- `ContextWindowReserve` is the safety-net token-budget margin (default 0.05 / 5%). Range [0.0, 1.0); validated at the config layer + at construction.
- `HeavyOutputThreshold` mirrors `config.ArtifactsConfig.HeavyOutputThresholdBytes` so the LLM package does not re-import the artifact-config struct. Default 32 KiB.
- `ModelProfiles` is keyed by canonical model name. The safety net's token-budget guard requires a profile entry for the model in the `CompleteRequest`; missing → `ErrUnsupportedModel`.
`Provider` / `Model` / `APIKey` / `BaseURL` / `Timeout` are the Phase-33 bifrost-driver knobs. Phase 32 stores them so the snapshot's shape is stable across phases; the mock driver ignores them. Phase 33's bifrost driver will read them.
type Content ¶
type Content struct {
Text *string
Parts []ContentPart
}
Content is the multimodal sum-type. Exactly one of `Text` or `Parts` must be set; both-set and both-nil are invalid and rejected by the safety net with `ErrInvalidContent`.
type ContentPart ¶
type ContentPart struct {
Type PartType
Text string // when Type == PartText
Image *ImagePart // when Type == PartImage
Audio *AudioPart // when Type == PartAudio
File *FilePart // when Type == PartFile
}
ContentPart is one element of a multimodal `Content.Parts` slice. Exactly one of `Text` / `Image` / `Audio` / `File` is set per the `Type` discriminator.
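The sum-type rules for `Content` and `ContentPart` can be expressed as a short validation function. The sketch covers only the text and image cases for brevity; the `PartType` string values and the reduced `ImagePart` are assumptions, and `validateContent` is a hypothetical name rather than the safety pass's own function.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrInvalidContent = errors.New("llm: invalid message content")

type PartType string

// The constant names appear in the docs; the string values are assumed.
const (
	PartText  PartType = "text"
	PartImage PartType = "image"
)

type ImagePart struct{ URL string }

type ContentPart struct {
	Type  PartType
	Text  string
	Image *ImagePart
}

type Content struct {
	Text  *string
	Parts []ContentPart
}

// validateContent enforces "exactly one of Text or Parts" and checks that
// each part's payload agrees with its Type discriminator.
func validateContent(c Content) error {
	hasText := c.Text != nil
	hasParts := len(c.Parts) > 0
	if hasText == hasParts { // both set, or neither
		return ErrInvalidContent
	}
	for _, p := range c.Parts {
		switch p.Type {
		case PartText:
			if p.Image != nil {
				return ErrInvalidContent
			}
		case PartImage:
			if p.Image == nil {
				return ErrInvalidContent
			}
		default:
			return ErrInvalidContent
		}
	}
	return nil
}

func main() {
	hello := "hello"
	fmt.Println(validateContent(Content{Text: &hello}) == nil)                                         // true
	fmt.Println(validateContent(Content{}) == ErrInvalidContent)                                       // true
	fmt.Println(validateContent(Content{Parts: []ContentPart{{Type: PartImage}}}) == ErrInvalidContent) // true
}
```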
type ContextLeakPayload ¶
type ContextLeakPayload struct {
events.SafeSealed
Identity identity.Quadruple
Model string
LeakSite string
SizeBytes int64
Threshold int
OccurredAt time.Time
}
ContextLeakPayload is the typed payload for EventTypeContextLeak. SafePayload — the leak-site identifier (a short structural fingerprint like "Messages[2].Content.Text") is operator-visible debug data, not secret-shaped.
`SizeBytes` is the size of the offending payload; `Threshold` is the runtime's configured heavy-output threshold at the time of the emit, so an operator can correlate config-change-time drift.
type ContextWindowExceededPayload ¶
type ContextWindowExceededPayload struct {
events.SafeSealed
Identity identity.Quadruple
Model string
EstimatedTokens int
ContextWindowTokens int
ContextWindowReserve float64
OccurredAt time.Time
}
ContextWindowExceededPayload is the typed payload for EventTypeContextWindowExceeded. SafePayload — token counts + configured cap are operator-visible.
type CorrectionsProfile ¶
type CorrectionsProfile struct {
// MessageOrdering controls how the request's chat-message slice
// is reordered before reaching the driver. Default (zero value)
// passes the slice through unchanged.
MessageOrdering MessageOrderingPolicy
// SchemaMode controls how the request's `ResponseFormat.JSONSchema`
// bytes are mutated before reaching the driver. Default passes
// the schema through unchanged.
SchemaMode SchemaSanitizationMode
// ReasoningEffortRouting controls whether `req.ReasoningEffort` is
// translated to a provider-specific `Extra` key (thinking-class
// models) or passed through as the top-level field (default).
ReasoningEffortRouting ReasoningRouting
// ResponseFormatShape controls the wire-shape translation of
// `req.ResponseFormat`. Default emits the OpenAI envelope; other
// values translate to per-provider envelopes (Anthropic tool-
// schema, `json_only` for providers that reject `json_schema`).
ResponseFormatShape ResponseFormatProfile
// UsageBackfillEnabled, when true, makes the corrections layer
// compute synthetic token counts (and, if `CostOverrides` is set,
// synthetic costs) when the driver returns an all-zeros `Usage`.
// Default false — the response surfaces zeros verbatim.
UsageBackfillEnabled bool
}
CorrectionsProfile carries the per-model quirk flags the Phase 34 `internal/llm/corrections` layer dispatches on. The types live in the `llm` package so the corrections sub-package can consume them without an import cycle (logic lives in `internal/llm/corrections`).
Zero-valued struct means "no quirks declared for this model"; the corrections pass treats each field's zero value as the Harbor-default behaviour (no reorder, no schema mutation, OpenAI-style envelopes, usage backfill off).
Per RFC §6.5 + brief 03 §4: this is the operator-controlled surface for adapting Harbor's neutral `CompleteRequest` shape to per-provider expectations. The corrections layer is the ONLY consumer.
type Cost ¶
type Cost struct {
InputTokensCost float64
OutputTokensCost float64
ReasoningTokensCost float64
TotalCost float64
Currency string // "USD" canonical; reserved for future multi-currency
}
Cost is the provider-reported cost breakdown. Values are USD. Fields are zero when the provider doesn't report a category.
Governance (Phase 36a) subscribes to `llm.cost.recorded` events to drive per-identity accumulators; Phase 36a's payload re-stamps these fields.
type CostRecordedPayload ¶
type CostRecordedPayload struct {
events.SafeSealed
Identity identity.Quadruple
Model string
Cost Cost
Usage Usage
// ContextWindowTokens is the model's input-token window (from the
// model profile), stamped so the Console can render context-used vs
// window (%). Zero when the model has no profile / configured window.
ContextWindowTokens int
OccurredAt time.Time
}
CostRecordedPayload is the typed payload for EventTypeCostRecorded. SafePayload — cost / token counts are operator-visible. Phase 36a subscribes for per-identity accumulator updates.
type CostTable ¶
type CostTable struct {
InputPer1M float64
OutputPer1M float64
ReasoningPer1M float64
Currency string // "USD" canonical
}
CostTable carries fallback per-1M-token rates. Used when the provider's response doesn't include cost. Phase 36a consumes.
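A per-1M-token rate table turns into a `Cost` with simple proportional arithmetic. The sketch below assumes that reasoning tokens are billed separately from completion tokens and that `TotalCost` is the plain sum; `fallbackCost` is a hypothetical helper, and the types are trimmed copies of the documented ones.

```go
package main

import "fmt"

type Usage struct {
	PromptTokens     int
	CompletionTokens int
	ReasoningTokens  int
}

type CostTable struct {
	InputPer1M     float64
	OutputPer1M    float64
	ReasoningPer1M float64
	Currency       string
}

type Cost struct {
	InputTokensCost     float64
	OutputTokensCost    float64
	ReasoningTokensCost float64
	TotalCost           float64
	Currency            string
}

// fallbackCost prices usage from per-1M rates. Assumption: reasoning
// tokens are not already included in CompletionTokens.
func fallbackCost(u Usage, t CostTable) Cost {
	per := func(tokens int, rate float64) float64 {
		return float64(tokens) / 1e6 * rate
	}
	c := Cost{
		InputTokensCost:     per(u.PromptTokens, t.InputPer1M),
		OutputTokensCost:    per(u.CompletionTokens, t.OutputPer1M),
		ReasoningTokensCost: per(u.ReasoningTokens, t.ReasoningPer1M),
		Currency:            t.Currency,
	}
	c.TotalCost = c.InputTokensCost + c.OutputTokensCost + c.ReasoningTokensCost
	return c
}

func main() {
	u := Usage{PromptTokens: 1_000_000, CompletionTokens: 500_000}
	t := CostTable{InputPer1M: 2.0, OutputPer1M: 4.0, Currency: "USD"}
	fmt.Printf("%+v\n", fallbackCost(u, t))
}
```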
type CustomProviderSpec ¶
type CustomProviderSpec struct {
Name string
BaseURL string
APIKeyEnvVar string
Models []string
BaseProviderType string
Timeout time.Duration
MaxRetries int
RetryBackoffInitial time.Duration
RetryBackoffMax time.Duration
Concurrency int
BufferSize int
RequestPathOverrides map[string]string
}
CustomProviderSpec is one operator-declared OpenAI-compatible provider (Phase 33a). The bifrost driver maps each entry to a `bfschemas.ProviderConfig` with `CustomProviderConfig.BaseProviderType = schemas.OpenAI`. Zero-valued network knobs fall through to `ConfigSnapshot.NetworkDefaults`, which itself falls through to bifrost's package-level defaults.
`APIKeyEnvVar` is the environment-variable NAME (no `env.` prefix); the driver resolves `os.Getenv(name)` at construction. Missing → `ErrMissingAPIKey` with the env var named.
`RequestPathOverrides` maps `bfschemas.RequestType` (string-coded at this layer to avoid the import) to a custom URL path; the bifrost driver translates the keys when wiring the config. Used for OpenAI-compatible endpoints that host e.g. `/chat/completions` at the root.
type Deps ¶
type Deps struct {
Artifacts artifacts.ArtifactStore
Bus events.EventBus
}
Deps carries the runtime dependencies the LLM client subsystem consumes. Both are mandatory; construction fails loudly if either is nil.
- `Artifacts` is the auto-materialize target (D-022). Inline `DataURL` content above the heavy-output threshold is rewritten as an `Artifact` whose bytes live in the store.
- `Bus` is the canonical event bus. The safety pass publishes `llm.image.materialized` / `llm.context_leak` / `llm.context_window_exceeded`; the request-emit path (Phase 36a subscriber lands here) publishes `llm.cost.recorded`.
The package does NOT depend on `state.StateStore` — the LLM client is stateless across calls (D-025).
type Driver ¶
type Driver interface {
// Complete receives a `CompleteRequest` whose messages have
// ALREADY passed the safety net (`enforceContextSafety`): no raw
// heavy content survived, the token-budget guard fired or
// passed, oversize `DataURL` content has been materialized to
// `Artifact` form. The driver translates the request into its
// provider's wire shape and returns the typed response.
Complete(ctx context.Context, req CompleteRequest) (CompleteResponse, error)
// Close mirrors `LLMClient.Close`. Idempotent; second call is a
// no-op (returns nil).
Close(ctx context.Context) error
}
Driver is the surface every concrete driver implements. It has the same shape as `LLMClient`; the difference is the contract: a `Driver` may assume the safety net has already run. `Open` wraps a `Driver` in a `safetyClient` so the safety pass is mandatory by construction.
Driver authors implement this; callers consume `LLMClient`.
type Factory ¶
type Factory func(cfg ConfigSnapshot, deps Deps) (Driver, error)
Factory builds a `Driver` from a `ConfigSnapshot` + `Deps`. Drivers expose one `Factory` each via `init()` → `Register`.
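The `init()` → `Register` pattern resembles the one used by `database/sql` drivers. The sketch below is an illustration, not the package's registry: the `Register` signature, the `lookup` helper, the reduced `Driver` interface, and the panic on duplicate names are all assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// Reduced, hypothetical shapes; the real types carry more fields.
type ConfigSnapshot struct{ Driver string }
type Deps struct{}

// Driver is cut down to one method so the sketch stays short.
type Driver interface{ Name() string }

type Factory func(cfg ConfigSnapshot, deps Deps) (Driver, error)

var (
	mu        sync.RWMutex
	factories = make(map[string]Factory)
)

// Register records a factory under a driver name. Panicking on a
// duplicate is an assumption borrowed from database/sql.Register.
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	if _, dup := factories[name]; dup {
		panic("llm: Register called twice for driver " + name)
	}
	factories[name] = f
}

// lookup is a hypothetical helper an Open-like function could call.
func lookup(name string) (Factory, bool) {
	mu.RLock()
	defer mu.RUnlock()
	f, ok := factories[name]
	return f, ok
}

type mockDriver struct{}

func (mockDriver) Name() string { return "mock" }

// A driver package would self-register from its own init().
func init() {
	Register("mock", func(ConfigSnapshot, Deps) (Driver, error) {
		return mockDriver{}, nil
	})
}

func main() {
	f, ok := lookup("mock")
	if !ok {
		fmt.Println("driver not registered")
		return
	}
	d, err := f(ConfigSnapshot{Driver: "mock"}, Deps{})
	fmt.Println(d.Name(), err)
}
```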
type FilePart ¶
type FilePart struct {
URL string
DataURL string
Artifact *ArtifactStub
MIME string
Filename string
}
FilePart is a multimodal file input. Same supply forms as `ImagePart`; `MIME` is the document MIME type. `Filename` is a hint shown to the model when the provider supports it.
type ImageMaterializedPayload ¶
type ImageMaterializedPayload struct {
events.SafeSealed
Identity identity.Quadruple
Model string
ArtifactRef string
MIME string
SizeBytes int64
OccurredAt time.Time
}
ImageMaterializedPayload is the typed payload for EventTypeImageMaterialized. SafePayload — the artifact ref, MIME type, and size are operator-visible content metadata, not secrets.
type ImagePart ¶
type ImagePart struct {
URL string
DataURL string
Artifact *ArtifactStub
MIME string
Detail string
}
ImagePart is a multimodal image input.
Exactly one of `URL` / `DataURL` / `Artifact` is set. `URL` is a provider-fetchable remote URL. `DataURL` is an inline `data:image/...;base64,...` payload — above the heavy-output threshold the runtime materializes it to `Artifact`. `Artifact` is the canonical Harbor reference (D-022).
`MIME` is the image MIME type (`image/jpeg`, `image/png`, `image/webp`, ...). `Detail` is a provider hint (`low` / `high` / `auto`); empty string means "use provider default."
type LLMClient ¶
type LLMClient interface {
Complete(ctx context.Context, req CompleteRequest) (CompleteResponse, error)
// Close releases driver-held resources (HTTP connection pools,
// background goroutines). Subsequent calls return ErrClientClosed.
// Implementations MUST honour ctx during long teardowns.
Close(ctx context.Context) error
}
LLMClient is the single contract callers depend on. It has ONE request method, `Complete`; `Close` exists only to release driver-held resources. Streaming is signalled via `req.Stream` + `req.OnContent` / `req.OnReasoning`; cancellation flows through `ctx`. The runtime owns prompt construction, tool semantics, parsing, and parallel dispatch; see RFC §6.4 + brief 07.
Implementations MUST be safe for N concurrent goroutines against a single shared instance (D-025).
func Open ¶
Open returns the `LLMClient` built by the factory whose name matches `cfg.Driver` (defaults to `DefaultDriver` when empty).
Identity is mandatory at every method on the returned client; the safety pass enforces. Deps are validated at construction — `nil Artifacts` / `nil Bus` return wrapped errors immediately.
The returned client is a `*safetyClient` wrapping the registered driver: every `Complete` runs through `enforceContextSafety` BEFORE the driver sees the request. This is mandatory by construction — drivers cannot bypass it through the registry path.
type MessageOrderingPolicy ¶
type MessageOrderingPolicy string
MessageOrderingPolicy enumerates the message-reordering modes the Phase 34 corrections layer supports. Operator-set in `ModelProfile.Corrections.MessageOrdering`.
const (
// OrderingDefault passes the message slice through unchanged.
OrderingDefault MessageOrderingPolicy = ""
// OrderingSystemFirstStrict collapses all system-role messages
// to the front of the slice and emits an alternating
// user/assistant tail. Required by NIM and some OpenAI-compatible
// proxies that reject mid-thread `system` messages (brief 03 §4).
OrderingSystemFirstStrict MessageOrderingPolicy = "system_first_strict"
)
type ModeDowngradedPayload ¶
type ModeDowngradedPayload struct {
events.SafeSealed
Identity identity.Quadruple
Model string
FromMode OutputMode
ToMode OutputMode
From ResponseFormatKind
To ResponseFormatKind
Reason string
OccurredAt time.Time
}
ModeDowngradedPayload is the typed payload for EventTypeModeDowngraded. Phase 35 fills the From/To/Reason fields. `FromMode` / `ToMode` carry the Harbor-side `OutputMode` (Native / Tools / Prompted / text); `From` / `To` carry the resolved `ResponseFormatKind` for backward visibility.
type ModelProfile ¶
type ModelProfile struct {
// ContextWindowTokens is the model's hard input-token cap.
// Required (> 0); the safety net's token-budget guard uses it.
ContextWindowTokens int
// TokenEstimator selects the estimator the safety net runs.
// "" / "chars_div_4" — default chars/4 + role-overhead.
// Phase 33+ may register tiktoken-equivalent estimators by name.
TokenEstimator string
// JSONSchemaMode — Phase 32-era placeholder; the config loader
// normalises this string into `OutputMode` at snapshot time
// (Phase 35). Direct callers SHOULD set `OutputMode`; this field
// is read only when `OutputMode` is `OutputModeUnset`.
JSONSchemaMode string
// OutputMode (Phase 35) — Harbor-side structured-output strategy.
// Drives the request-shaping in `internal/llm/output` and the
// downgrade chain. See `OutputMode` constants for semantics.
// Zero value (`OutputModeUnset`) falls back to the per-known-
// provider default (see `corrections.DefaultOutputModeFor`).
OutputMode OutputMode
// DefaultMaxTokens — Phase 36b's identity-tier override target.
DefaultMaxTokens *int
// ReasoningEffort — request-level default applied by the
// corrections layer (`corrections.Complete`) when the caller left
// `CompleteRequest.ReasoningEffort` empty; an explicit per-call
// value overrides it. Maps to the provider reasoning param (bifrost
// `ChatReasoning.Effort`, or `Extra["reasoning_effort"]` under
// `ReasoningRouteThinking`).
ReasoningEffort ReasoningEffort
// CostOverrides — per-1M-token rates when the provider doesn't
// report cost (some OpenRouter routes don't). Phase 36a reads.
CostOverrides *CostTable
// Corrections — per-provider quirk flags consumed by the Phase 34
// `internal/llm/corrections` layer. Zero-valued struct means
// "no corrections needed for this model"; the corrections layer
// runs a no-op pass for default-shaped profiles.
Corrections CorrectionsProfile
// MaxRetries (Phase 36) — caps the validator-driven corrective
// re-asks performed by the retry wrapper. Zero (default) maps to
// `DefaultMaxRetries` (1). A negative value is rejected at config
// validation.
MaxRetries int
}
ModelProfile carries per-model knobs. Keyed by canonical model name in `LLMConfig.ModelProfiles`. Phase 32 ships the shape + `ContextWindowTokens` + `TokenEstimator` consumers; Phase 33+ consume the rest.
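The default `chars_div_4` estimator is described as chars/4 plus a role overhead. A minimal sketch follows, with several assumptions called out: the overhead value of 4 tokens per message, rounding the division up, counting runes rather than bytes, and the trimmed `ChatMessage` shape are all illustrative choices.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// Hypothetical, trimmed message shape.
type ChatMessage struct {
	Role string
	Text string
}

// perMessageOverhead is an assumed role overhead, not Harbor's value.
const perMessageOverhead = 4

// estimateTokens approximates token usage as ceil(chars/4) per message
// plus a fixed per-message overhead.
func estimateTokens(msgs []ChatMessage) int {
	total := 0
	for _, m := range msgs {
		chars := utf8.RuneCountInString(m.Text)
		total += (chars+3)/4 + perMessageOverhead
	}
	return total
}

func main() {
	msgs := []ChatMessage{
		{Role: "system", Text: "You are terse."},
		{Role: "user", Text: "abcdefgh"},
	}
	fmt.Println(estimateTokens(msgs))
}
```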
type NetworkDefaults ¶
type NetworkDefaults struct {
Timeout time.Duration
MaxRetries int
RetryBackoffInitial time.Duration
RetryBackoffMax time.Duration
Concurrency int
BufferSize int
}
NetworkDefaults are the operator-tunable defaults bifrost applies to every provider (native + custom) when the per-provider override is absent (Phase 33a). Zero-valued fields fall through to bifrost's package-level defaults.
type OutputMode ¶
type OutputMode string
OutputMode selects the request-shaping strategy for structured output (Phase 35; RFC §6.5). Three modes:
`OutputModeNative` — pass `FormatJSONSchema` through unchanged. The provider validates against the schema natively. Default for OpenAI / Anthropic / Google.
`OutputModeTools` — encode the schema as a *Harbor-side prompted* envelope where the LLM is asked to emit `{"name":"respond_with","arguments":{...}}` as plain output. The runtime parses that locally. Used as a fallback for providers without native `json_schema` support.
IMPORTANT: this is NOT a passthrough to provider-native tool-calling APIs (`tools=` / `tool_choice=` / `function_call` / `tool_use`). Harbor's runtime owns tool dispatch (RFC §6.4 / brief 07); `OutputModeTools` is purely a prompted-output technique. The static guard in `scripts/smoke/phase-35.sh` enforces this boundary.
`OutputModePrompted` — coerce `FormatJSONObject` and inline the schema as a system-prompt instruction. The LLM-side parse is "produce a JSON object matching this schema." Default for NIM / custom OpenAI-compatible / deepseek-reasoner.
The downgrade chain runs `current → next` on `IsInvalidJSONSchemaError` failures, bounded at 3 total attempts (initial + 2 downgrades).
const (
// OutputModeUnset is the zero value: the operator did not declare the
// mode. The downgrade wrapper applies the per-model-prefix default
// (see `internal/llm/corrections.DefaultOutputModeFor`).
OutputModeUnset OutputMode = ""
// OutputModeNative: pass `FormatJSONSchema` through. Provider
// enforces strict schema mode.
OutputModeNative OutputMode = "native"
// OutputModeTools: Harbor-side prompted envelope. NOT provider
// tool-calling APIs.
OutputModeTools OutputMode = "tools"
// OutputModePrompted: `FormatJSONObject` + schema in system prompt.
OutputModePrompted OutputMode = "prompted"
)
type PostureProvider ¶
type PostureProvider struct {
// contains filtered or unexported fields
}
PostureProvider is the Phase 72g read-only accessor over the runtime's bound LLM configuration. Built once per Runtime process via NewPostureProvider; `Posture` is safe for concurrent use by N goroutines (D-025).
func NewPostureProvider ¶
func NewPostureProvider(cfg ConfigSnapshot) *PostureProvider
NewPostureProvider builds a PostureProvider over the LLM ConfigSnapshot the binary resolved at boot. The provider / model / region are read from the snapshot and frozen at construction; the `MockMode` flag is NOT taken from the snapshot — it is read live (but race-free) from the boot-captured atomic, so the posture surface reflects D-089's single capture-path source.
When the snapshot's `Driver` field is empty it is normalised to `DefaultDriver` ("bifrost") — the same default `Open` applies — so the posture surface never reports an empty provider for a default-driver boot.
func (*PostureProvider) Posture ¶
func (p *PostureProvider) Posture(_ context.Context) (PostureSnapshot, error)
Posture returns the read-only PostureSnapshot of the runtime's bound LLM provider for the caller. The `ctx` is accepted for signature symmetry with `governance.PostureProvider.Posture` and so a future per-tenant LLM-routing model can scope the read; V1 ships a single provider per Harbor instance (RFC §6.15 + D-088), so the snapshot is identity-independent at this layer — the Protocol handler is the identity-mandatory gate.
`MockMode` is read from the boot-captured atomic (D-089) — NOT from an `os.Getenv` re-read.
type PostureReadAdminPayload ¶
type PostureReadAdminPayload struct {
events.SafeSealed
// Actor is the identity of the admin-scoped caller that performed
// the cross-tenant read.
Actor identity.Quadruple
// RequestedTenant is the tenant_id the caller asked to read — a
// tenant other than the caller's own.
RequestedTenant string
}
PostureReadAdminPayload is the typed payload for EventTypePostureReadAdmin (Phase 72g). SafePayload — the actor's identity and the requested tenant are operator-visible audit metadata, not secret-shaped. NEVER carries provider API keys — the posture surface reports provider/model/region only. The payload runs through the audit Redactor before the bus publish (CLAUDE.md §7).
type PostureSnapshot ¶
type PostureSnapshot struct {
// Provider is the LLM provider name (e.g. "bifrost", "mock").
Provider string
// Model is the bound model identifier (e.g. "openai/gpt-5.3-chat").
Model string
// Region is the provider endpoint region; "" when not applicable.
Region string
// MockMode is true iff the runtime booted with HARBOR_DEV_ALLOW_MOCK=1
// (D-089). Captured at boot via RegisterMockModeCaptured.
MockMode bool
}
PostureSnapshot is the read-only view of the runtime's bound LLM provider. It is the source the `llm.posture` Protocol handler projects onto the `LLMPostureResponse` wire type.
type ReasoningEffort ¶
type ReasoningEffort string
ReasoningEffort hints at provider-side thinking budget. Empty string means "use provider default" (DO NOT touch the request).
const (
ReasoningOff ReasoningEffort = "off"
ReasoningLow ReasoningEffort = "low"
ReasoningMedium ReasoningEffort = "medium"
ReasoningHigh ReasoningEffort = "high"
)
The ReasoningEffort levels, ascending. The empty string (not listed here) means "use the provider default".
type ReasoningRouting ¶
type ReasoningRouting string
ReasoningRouting enumerates the `ReasoningEffort` routing modes the Phase 34 corrections layer supports. Operator-set in `ModelProfile.Corrections.ReasoningEffortRouting`.
const (
// ReasoningRouteDefault passes the top-level
// `req.ReasoningEffort` through to the driver unchanged.
// Bifrost's `ChatReasoning.Effort` field consumes it.
ReasoningRouteDefault ReasoningRouting = ""
// ReasoningRouteThinking moves the effort hint from the
// top-level field into `req.Extra["reasoning_effort"]`.
// Thinking-class models (`o1`, `o3`, `deepseek-reasoner`)
// interpret the hint via a provider-specific path that bifrost
// passes through opaquely. The top-level field is cleared so the
// regular reasoning channel is not used.
ReasoningRouteThinking ReasoningRouting = "thinking_model"
)
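The `thinking_model` route is a small move-and-clear transformation. The sketch below assumes `Extra` is a `map[string]any` and copies it before writing so the caller's request is left alone; `routeReasoning` and the trimmed `CompleteRequest` are hypothetical.

```go
package main

import "fmt"

type ReasoningEffort string
type ReasoningRouting string

const ReasoningRouteThinking ReasoningRouting = "thinking_model"

// Hypothetical, trimmed request; the Extra type is an assumption.
type CompleteRequest struct {
	ReasoningEffort ReasoningEffort
	Extra           map[string]any
}

// routeReasoning moves the effort hint into Extra under the thinking
// route and clears the top-level field; otherwise it returns req as is.
func routeReasoning(req CompleteRequest, routing ReasoningRouting) CompleteRequest {
	if routing != ReasoningRouteThinking || req.ReasoningEffort == "" {
		return req
	}
	extra := make(map[string]any, len(req.Extra)+1)
	for k, v := range req.Extra {
		extra[k] = v // copy so the caller's map is not mutated
	}
	extra["reasoning_effort"] = string(req.ReasoningEffort)
	req.Extra = extra
	req.ReasoningEffort = ""
	return req
}

func main() {
	req := CompleteRequest{ReasoningEffort: "high"}
	out := routeReasoning(req, ReasoningRouteThinking)
	fmt.Printf("%q %v\n", out.ReasoningEffort, out.Extra)
}
```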
type ResponseFormat ¶
type ResponseFormat struct {
Kind ResponseFormatKind
JSONSchema json.RawMessage
}
ResponseFormat is the optional structured-output hint on `CompleteRequest`. `nil` means "plain text" (equivalent to `Kind: FormatText`).
Phase 35 owns the per-provider downgrade chain `json_schema → json_object → text` on `invalid_json_schema` errors; Phase 32 stores the field and the safety-net pass treats the JSON schema bytes as opaque metadata (no token-estimate contribution).
type ResponseFormatKind ¶
type ResponseFormatKind string
ResponseFormatKind discriminates a `ResponseFormat`.
const (
// FormatText: no structured-output constraint. Default when
// `CompleteRequest.ResponseFormat` is nil.
FormatText ResponseFormatKind = "text"
// FormatJSONObject: provider's "JSON mode" (free-form JSON).
FormatJSONObject ResponseFormatKind = "json_object"
// FormatJSONSchema: caller-supplied JSON Schema (strict mode
// when the provider exposes it).
FormatJSONSchema ResponseFormatKind = "json_schema"
)
type ResponseFormatProfile ¶
type ResponseFormatProfile string
ResponseFormatProfile enumerates the `response_format` envelope shapes the Phase 34 corrections layer can emit. Operator-set in `ModelProfile.Corrections.ResponseFormatShape`.
const (
// ResponseFormatOpenAI emits the OpenAI envelope:
// `{"type":"json_object"}` for `FormatJSONObject` and
// `{"type":"json_schema","json_schema":{...}}` for
// `FormatJSONSchema`. This is the default; bifrost's
// `translateResponseFormat` already produces this shape, so a
// `default`-profile model is a no-op in the corrections layer.
ResponseFormatOpenAI ResponseFormatProfile = ""
// ResponseFormatJSONOnly downgrades `FormatJSONSchema` to
// `FormatJSONObject`. Used for providers that don't support
// `json_schema` natively (e.g. some OpenRouter routes); the
// schema is preserved as `Extra["schema_hint"]` so a prompted
// fallback can reference it.
ResponseFormatJSONOnly ResponseFormatProfile = "json_only"
// ResponseFormatAnthropic packages the schema into Anthropic's
// tool-schema-style envelope, surfaced in
// `req.Extra["anthropic_tool_schema"]`. Phase 33's bifrost
// driver passes `Extra` opaquely; the Anthropic provider
// converter consumes the key (or future Phase 35 logic does).
ResponseFormatAnthropic ResponseFormatProfile = "anthropic"
)
type RetryWithFeedbackPayload ¶
type RetryWithFeedbackPayload struct {
events.SafeSealed
Identity identity.Quadruple
Model string
Attempt int
MaxRetries int
Reason string
OccurredAt time.Time
}
RetryWithFeedbackPayload (Phase 36) is the typed payload for EventTypeRetryWithFeedback. SafePayload — `Attempt` is the 1-based retry index (1 = first re-ask after the original); `Reason` is the validator's truncated `Error()` string. The wrapper truncates Reason at 256 characters to keep audit payloads bounded.
type Role ¶
type Role string
Role is the chat-message role. Settled at the four canonical values; `RoleTool` is the in-Harbor convention for the user-role rendering of tool observations (brief 07 §5 — the rendering itself happens at `ObservationRenderer`, not here; this constant exists so callers that construct an explicit user-message describing a tool result can label it for clarity).
type SchemaSanitizationMode ¶
type SchemaSanitizationMode string
SchemaSanitizationMode enumerates the JSON-Schema-mutation modes the Phase 34 `SchemaSanitizer` supports. Operator-set in `ModelProfile.Corrections.SchemaMode`.
const (
// SchemaDefault passes the operator-supplied schema through
// unchanged.
SchemaDefault SchemaSanitizationMode = ""
// SchemaOpenAIStrict adds `additionalProperties:false` and
// `strict:true` at every nested object schema. OpenAI's
// structured-output mode requires both fields; most schemas
// produced by `tools.RegisterFunc[I, O]` omit them.
SchemaOpenAIStrict SchemaSanitizationMode = "openai_strict"
// SchemaPermissive strips `additionalProperties` and `strict`
// fields wherever they appear. Some providers reject those keys.
SchemaPermissive SchemaSanitizationMode = "permissive"
)
type StubFetch ¶
StubFetch is the optional pointer-to-tool hint on an `ArtifactStub`. When set, an LLM that wants the bytes knows which Harbor tool to call (and with which artifact ID).
type ToolCallStructured ¶ added in v1.2.0
type ToolCallStructured struct {
ID string
Name string
Args json.RawMessage
Index uint16
}
ToolCallStructured is a provider-validated tool-call entry (Phase 107c / D-167). Carries the provider-assigned call ID (round-trips on `ChatMessage.ToolCallID` when the result is threaded back into the next turn), the tool name (matches `tools.Tool.Name`), and provider-validated JSON args.
`Index` is the per-response position of this tool call (0-based) and is the load-bearing discriminator for streaming-delta assembly: per the OpenAI streaming spec, tool-call args arrive across multiple SSE chunks. The first delta carries `ID + Name`; subsequent deltas for the SAME tool call carry empty ID + null Name and an args FRAGMENT to be concatenated onto the prior args. The drivers key on Index to merge fragments correctly; without it, providers like Amazon Bedrock (which streams args one short fragment at a time) produce a trajectory full of half-built ToolCalls. Defaults to 0 for non-streaming responses + tests; the driver layer is the source of truth.
type ToolDeclaration ¶ added in v1.2.0
type ToolDeclaration struct {
Name string
Description string
Schema json.RawMessage
}
ToolDeclaration is the per-turn tool declarator the LLM sees (Phase 107c / D-167). Carries the tool name, operator-facing description, and the args JSON Schema.
type Usage ¶
type Usage struct {
PromptTokens int
CompletionTokens int
ReasoningTokens int
TotalTokens int
LatencyMS int64
// ProviderExtras — opaque provider-specific bag (e.g. cache
// hit/miss). Phase 32 does not interpret these fields; Phase 34+
// may read them for correction-layer decisions.
ProviderExtras map[string]string
}
Usage is the provider-reported token usage.
Directories
¶
| Path | Synopsis |
|---|---|
| corrections | Package corrections is Harbor's provider correction layer (Phase 34, RFC §6.5). |
| drivers | |
| bifrost | Package bifrost is Harbor's bifrost-backed LLM driver. |
| mock | Package mock is Harbor's test-grade LLM driver. |
| output | Package output is Harbor's structured-output strategy + downgrade chain (Phase 35, RFC §6.5). |
| retry | Package retry is Harbor's retry-with-feedback wrapper (Phase 36, RFC §6.5). |
| summarizer | Package summarizer is Harbor's production LLM-backed `memory.Summarizer`: the §13 "test stubs as production defaults" amendment closure for the memory subsystem's `rolling_summary` strategy. |