Documentation
Overview ¶
Package conformance ships the planner conformance pack.
Phase 42 landed the harness shape (the Harness struct + the Run(t, factory) entry point + the §13 import-graph lint test). Phase 49 fills in every scenario body — the top-prompt LLM-round-trip set, the malformed-LLM-output salvage path, the CallParallel atomicity check, the load-bearing wake-mode round-trip (D-032 — binding), the budget-aware finish, the pause-payload bounds, the steering drain-between-steps, and the D-025 concurrent-reuse surface.
The conformance pack is a shared test asset: every concrete `Planner` (Phase 45 ReAct, Phase 48 Deterministic, and every future concrete on the same interface) calls Run against the same scenarios. The pack itself never imports a concrete-planner package; the `internal/planner/conformance.TestImportGraph_PlannerDoesNotImportRuntime` lint test walks the planner subtree and fails if it does.
Per-concrete consumption pattern (Phase 49+):
func TestReact_Conformance(t *testing.T) {
	conformance.Run(t, func() conformance.Harness {
		return conformance.Harness{
			Factory: func() planner.Planner {
				return react.New(mock.New(mock.Options{
					SyntheticContent: scenarioContent,
				}))
			},
			WakeMode:           planner.WakePush,
			RunContextFactory:  conformance.DefaultRunContext,
			Capabilities:       conformance.CapabilitySetReAct,
			ScenarioContentMap: conformance.DefaultReactContentMap(),
		}
	})
}
The harness factory pattern matches the events / tools / tasks conformance suites: each subtest gets a fresh planner instance so internal state can't bleed between scenarios. The planner returned by the harness's `Factory` closure must be safe under D-025 concurrent reuse, because the D-025 scenario runs N=64 concurrent Next calls against one shared instance.
Constants ¶
const CapabilitySetDeterministic = CapabilityCanPause | CapabilityWakeRoundTrip | CapabilityHonoursCancelControl
CapabilitySetDeterministic is the canonical capability set for Phase 48's deterministic planner. Distinct from ReAct: no LLM (Deterministic is programmatic), can emit Pause (via PauseStep), supports the wake-mode poll round-trip.
const CapabilitySetReAct = CapabilityLLMDriven | CapabilityWakeRoundTrip | CapabilityHonoursCancelControl
CapabilitySetReAct is the canonical capability set for Phase 45's LLM-driven ReAct planner.
Functions ¶
func DefaultRunContext ¶
func DefaultRunContext() planner.RunContext
DefaultRunContext is a convenience factory the per-concrete tests can pass as `RunContextFactory`. It stamps a populated identity quadruple and a non-empty goal. Concretes that need extra fields (Trajectory, Catalog, etc.) typically build their own factory; this shape covers the Sanity scenarios and most other scenario subtests.
func Run ¶
Run executes the conformance pack against the planner produced by `factoryFunc`. Phase 49 fills every scenario; the Sanity skeleton scenarios from Phase 42 are preserved verbatim (subtest names are pinned). New scenarios use real drivers at the seam (§17.3 #1).
The factory is called once per subtest so per-scenario planner state can't bleed; the harness's `Cleanup`, when supplied, runs at subtest end.
func SecondStepContent ¶
func SecondStepContent() string
SecondStepContent returns a canned `_finish` envelope used by the wake-mode round-trip scenario's post-resolve Next call. The ScenarioFactory for ReAct supplies a multi-response scripted mock (first response: SpawnTask emission; second: this Finish).
Types ¶
type Capability ¶
type Capability uint32
Capability flags declare which scenarios a concrete planner can execute. The pack honours capability gating so a non-LLM concrete (Deterministic) does not run an LLM-only scenario (e.g. `MalformedLLM_Salvage`).
A concrete planner's per-package conformance test passes a `Capabilities` value built from the constants below; the harness gates each scenario by inspecting the bitmask. A scenario that does NOT match the planner's capabilities reports `t.Skip(...)` with a reason — never a silent skip.
const (
	// CapabilityLLMDriven: the planner uses an LLM client and
	// participates in the LLM-round-trip + malformed-output scenarios.
	// Phase 45 ReAct sets this; Phase 48 Deterministic does not.
	CapabilityLLMDriven Capability = 1 << iota

	// CapabilityCanPause: the planner can emit `RequestPause` under
	// operator configuration; the pause-payload bounds scenario runs.
	// Deterministic sets this via its `PauseStep`; ReAct does not in
	// V1 (Phase 50 wires the planner-side emission path).
	CapabilityCanPause

	// CapabilityWakeRoundTrip: the planner is wired to consume the
	// wake-mode round-trip via real `tasks.TaskRegistry`. Both ReAct
	// (WakePush) and Deterministic (WakePoll) set this in V1.
	CapabilityWakeRoundTrip

	// CapabilityHonoursCancelControl: the planner returns
	// Finish{Cancelled} at the step boundary when
	// `rc.Control.Cancelled` is true. Every concrete in V1 sets this;
	// the cap exists so the steering-drain scenario can fail-loudly
	// if a future concrete forgets the contract.
	CapabilityHonoursCancelControl
)
Capability constants.
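The bitmask gating described above can be sketched as follows. The `Capability` type and constants are copied from this page; the `has` and `gate` helpers are hypothetical names for illustration, and the pack's own gating code may be shaped differently.

```go
package main

import "fmt"

// Capability and its constants mirror the declarations documented above.
type Capability uint32

const (
	CapabilityLLMDriven Capability = 1 << iota
	CapabilityCanPause
	CapabilityWakeRoundTrip
	CapabilityHonoursCancelControl
)

// has (hypothetical helper) reports whether every bit in required is
// present in declared.
func has(declared, required Capability) bool {
	return declared&required == required
}

// gate (hypothetical helper) decides whether a scenario runs. When it
// does not, the returned reason is what a harness would pass to
// t.Skip so the skip is never silent.
func gate(declared, required Capability, scenario string) (bool, string) {
	if has(declared, required) {
		return true, ""
	}
	missing := required &^ declared
	return false, fmt.Sprintf("%s skipped: missing capability bits %#x", scenario, uint32(missing))
}

func main() {
	deterministic := CapabilityCanPause | CapabilityWakeRoundTrip | CapabilityHonoursCancelControl

	// An LLM-only scenario is gated out for the Deterministic set.
	run, reason := gate(deterministic, CapabilityLLMDriven, "MalformedLLM_Salvage")
	fmt.Println(run, reason)

	// The pause-bounds scenario runs, because CapabilityCanPause is set.
	run, _ = gate(deterministic, CapabilityCanPause, "PausePayload_BoundsRespected")
	fmt.Println(run)
}
```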
type Harness ¶
type Harness struct {
	// Factory constructs a fresh planner instance per subtest. Used
	// by scenarios that do NOT need a scenario-specific planner
	// configuration (Sanity, WakeMode_Declared, Sealed_DecisionSum,
	// Steering_DrainBetweenSteps, ConcurrentReuse_D025). When
	// ScenarioFactory is non-nil, scenarios that need a tailored
	// planner consume it instead.
	Factory func() planner.Planner

	// ScenarioFactory, when non-nil, takes a ScenarioName and returns
	// a planner pre-configured for that scenario. Used by scenarios
	// like TopPrompts_LLMRoundTrip (ReAct needs a specific mock-LLM
	// envelope per scenario) and ParallelCall_Atomicity (Deterministic
	// needs a CallParallel-emitting step). Fallback when nil: the
	// scenario uses `Factory()`.
	ScenarioFactory PlannerFactoryFn

	// WakeMode is the wake mode the concrete declares (D-032). The
	// WakeMode_Declared scenario asserts the planner's
	// `ResolveWakeMode` agrees; the WakeMode_RoundTrip scenario
	// drives the corresponding round-trip path (push vs poll).
	WakeMode planner.WakeMode

	// RunContextFactory builds the minimal valid RunContext the
	// concrete needs. Required at Phase 49: every concrete now
	// validates identity at Next boundary (§6 rule 9 + D-001).
	RunContextFactory func() planner.RunContext

	// Capabilities is the planner's declared capability set. The
	// pack uses the bitmask to gate scenarios: a scenario whose
	// required capability is absent skips with a reason.
	Capabilities Capability

	// TaskRegistryFactory, when non-nil, builds the real
	// `tasks.TaskRegistry` (production inprocess driver) the
	// WakeMode_RoundTrip scenario drives. The factory also wires a
	// real `events.EventBus` since the registry needs one (D-032 +
	// §17.3 #1: no mocks at the seam).
	//
	// The pack ships a default factory (`DefaultTaskRegistryFactory`)
	// that opens an inmem bus + inprocess registry + inmem state
	// store; per-concrete tests typically set this to the default.
	TaskRegistryFactory func(t *testing.T) (*WakeRoundTripDeps, func())

	// PrebuiltPlannerFactory is the optional hook the
	// WakeMode_RoundTrip scenario consumes when the concrete planner
	// must be constructed AGAINST a pre-existing TaskRegistry (the
	// Deterministic planner binds its registry at construction time
	// via `deterministic.WithRegistry`). When nil, the scenario falls
	// back to the standard Factory and assumes the planner does NOT
	// need a pre-bound registry (ReAct's case: its emission path is
	// LLM-prompted, no registry binding at construction).
	PrebuiltPlannerFactory func(*WakeRoundTripDeps) planner.Planner

	// Cleanup is called at subtest end. Optional; typical for
	// planner concretes that hold lifecycle resources.
	Cleanup func()
}
Harness is the per-subtest fixture the harness Run loop consumes. Each conformance subtest invokes `factory()` once to obtain a fresh Harness with a fresh planner instance.
Compatibility with Phase 42: the original four fields (Factory, WakeMode, RunContextFactory, Cleanup) are unchanged. Phase 49 adds `ScenarioFactory`, `Capabilities`, `TaskRegistryFactory`, and the scenario-content factories at the bottom. The change is additive only, so existing per-concrete tests continue to compile.
type PlannerFactoryFn ¶
type PlannerFactoryFn func(scenario ScenarioName) planner.Planner
PlannerFactoryFn is the factory shape the per-scenario hooks consume. The pack passes a `ScenarioName` so the factory can return a planner pre-configured for that scenario (e.g. ReAct with a mock LLM that emits the right envelope; Deterministic with a step set that emits the right Decision shape).
Factories MUST be safe to invoke multiple times across subtests — each invocation returns a fresh planner instance so internal state (atomic counters, sync.Map state) can't bleed between scenarios.
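A factory that honours this contract could be sketched as below. The `fakePlanner` type, the `newScenarioFactory` helper, and the envelope strings are hypothetical; the real factory returns a `planner.Planner`, which is not reproduced here.

```go
package main

import "fmt"

// ScenarioName mirrors the type documented on this page.
type ScenarioName string

// fakePlanner is a hypothetical stand-in for a concrete planner; it
// carries per-instance state so bleed between subtests would be visible.
type fakePlanner struct {
	script string
	steps  int
}

func (p *fakePlanner) Next() string {
	p.steps++
	return fmt.Sprintf("%s#%d", p.script, p.steps)
}

// newScenarioFactory (hypothetical) returns a factory in the spirit of
// PlannerFactoryFn: every call builds a NEW planner configured for the
// requested scenario, falling back to a default script otherwise.
func newScenarioFactory(scripts map[ScenarioName]string, fallback string) func(ScenarioName) *fakePlanner {
	return func(name ScenarioName) *fakePlanner {
		script, ok := scripts[name]
		if !ok {
			script = fallback
		}
		return &fakePlanner{script: script}
	}
}

func main() {
	factory := newScenarioFactory(map[ScenarioName]string{
		"TopPrompts_LLMRoundTrip": "finish-envelope",
		"ParallelCall_Atomicity":  "parallel-envelope",
	}, "default-envelope")

	a := factory("TopPrompts_LLMRoundTrip")
	a.Next() // advances only a's step counter
	b := factory("TopPrompts_LLMRoundTrip")
	fmt.Println(a != b, b.Next()) // prints: true finish-envelope#1
}
```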
type ScenarioContentMap ¶
type ScenarioContentMap map[ScenarioName]string
ScenarioContentMap maps a ScenarioName to the synthetic LLM content the harness asks the mock to emit for that scenario. ReAct per-concrete tests construct one via `DefaultReactContentMap` and pass it; the ScenarioFactory consumes the entry to build a fresh mock-LLM driver per subtest.
func DefaultReactContentMap ¶
func DefaultReactContentMap() ScenarioContentMap
DefaultReactContentMap returns the conformance pack's canned ReAct-side LLM responses keyed by scenario name. Per-concrete tests typically pass this map verbatim; operators with bespoke emission shapes can override individual entries.
The content envelope shapes mirror Phase 45's `DefaultSystemPrompt` — JSON-only, the reserved tool names (`_finish`, `_spawn_task`, `_await_task`), arrays for parallel fan-out.
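Overriding individual entries while leaving the default map untouched can be sketched as a copy-then-overlay. The `withOverrides` helper and the placeholder strings are hypothetical; the actual envelope JSON returned by `DefaultReactContentMap` is not reproduced here.

```go
package main

import "fmt"

// ScenarioName and ScenarioContentMap mirror the types documented on
// this page.
type ScenarioName string

type ScenarioContentMap map[ScenarioName]string

// withOverrides (hypothetical helper) copies base and then applies
// overrides, so the caller's base map is never mutated.
func withOverrides(base, overrides ScenarioContentMap) ScenarioContentMap {
	out := make(ScenarioContentMap, len(base)+len(overrides))
	for name, content := range base {
		out[name] = content
	}
	for name, content := range overrides {
		out[name] = content
	}
	return out
}

func main() {
	// Placeholder values stand in for the pack's canned envelopes.
	defaults := ScenarioContentMap{
		"TopPrompts_LLMRoundTrip": "<default finish envelope>",
		"MalformedLLM_Salvage":    "<default malformed output>",
	}
	custom := withOverrides(defaults, ScenarioContentMap{
		"MalformedLLM_Salvage": "<bespoke malformed output>",
	})
	fmt.Println(custom["MalformedLLM_Salvage"])
	fmt.Println(defaults["MalformedLLM_Salvage"]) // base is unchanged
}
```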
type ScenarioName ¶
type ScenarioName string
ScenarioName identifies one scenario in the pack. Stable across phases so per-concrete test reports remain comparable.
const (
	ScenarioTopPrompts        ScenarioName = "TopPrompts_LLMRoundTrip"
	ScenarioMalformedLLM      ScenarioName = "MalformedLLM_Salvage"
	ScenarioParallelAtomicity ScenarioName = "ParallelCall_Atomicity"
	ScenarioWakeRoundTrip     ScenarioName = "WakeMode_RoundTrip"
	ScenarioBudgetAware       ScenarioName = "BudgetAware_FinishDeadlineExceeded"
	ScenarioPauseBounds       ScenarioName = "PausePayload_BoundsRespected"
	ScenarioSteeringDrain     ScenarioName = "Steering_DrainBetweenSteps"
	ScenarioConcurrentReuse   ScenarioName = "ConcurrentReuse_D025"
)
Scenario names. Pinned strings — a rename would break per-concrete suites that may key on subtest names.
type WakeRoundTripDeps ¶
type WakeRoundTripDeps struct {
	Bus      events.EventBus
	Registry tasks.TaskRegistry
	State    state.StateStore
}
WakeRoundTripDeps bundles the real drivers the wake-mode round-trip scenario consumes. Constructed by the harness's `TaskRegistryFactory`; torn down by the returned cleanup.
All fields are real production drivers (§17.3 #1 — no mocks at the seam): inmem `events.EventBus`, inprocess `tasks.TaskRegistry`, inmem `state.StateStore`. The wake-mode round-trip is the load-bearing D-032 scenario; mocks here would defeat its purpose.
func DefaultTaskRegistryFactory ¶
func DefaultTaskRegistryFactory(t *testing.T) (*WakeRoundTripDeps, func())
DefaultTaskRegistryFactory is the harness-shipped factory that opens an inmem bus + inprocess task registry + inmem state store. Per-concrete tests can use it as-is or wrap it for additional instrumentation.