devstack

package
v1.2.0 Latest
Published: May 30, 2026 License: Apache-2.0 Imports: 58 Imported by: 0

Documentation

Overview

Package devstack centralises per-test dev-stack assembly.

Source of truth

This package's `Assemble` function MUST track the production boot stack in `cmd/harbor/cmd_dev.go::bootDevStack` field-for-field. When the production boot order changes — and it will — this helper changes in the same PR. The §17.6 "fix what the integration test finds — no matter where the bug lives" rule requires it: if the production boot is the source of truth, the helper that pretends to be production-shaped MUST stay aligned, or the tests it backs silently drift.

What this package replaces

Before D-094, four integration test files each duplicated ~100–200 LOC of stack assembly (audit + events + state + tasks + steering + protocol + auth + transports + catalog + builder):

  • `test/integration/wave11_test.go::buildWave11Stack`
  • `test/integration/phase64_harbor_dev_helpers_test.go::buildPhase64TestStack`
  • `test/integration/phase64a_catalog_wiring_test.go::buildPhase64aEnv`
  • `test/integration/phase31_approval_gates_test.go::buildPhase31Env`

Each tested a slightly different layer subset. The `AssembleOpts` `Skip*` knobs let a caller opt out of layers it does not exercise (auth / transports / catalog / steering); everything else is always built so the tests prove the layers the production binary composes still compose under the helper.

Real drivers everywhere — no mocks at the seam (CLAUDE.md §17.3)

The helper opens REAL drivers via the registered factories — the patterns audit redactor, the inmem events / state / artifacts / tasks / memory drivers. The four test files MUST blank-import the driver packages so registration fires before Assemble is called; see the helper's godoc on `Assemble` for the canonical import block.

Identity propagation

The helper takes NO identity in its signature. Tests construct their own identities (`identity.Quadruple`) and pass them into individual calls. Every layer the helper wires reads identity from `ctx` per CLAUDE.md §6.

Concurrent reuse (D-025)

The returned `*DevStack` is shaped like a compiled artifact: every field is concurrent-safe under N parallel invocations (the underlying drivers' concurrent-reuse tests already gate this). `DevStack.Close` is idempotent and safe to defer.
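One common way to get an idempotent, defer-safe `Close` that tears layers down in reverse build order is a `sync.Once` around a reversed loop. The sketch below assumes that approach; it is not the helper's actual implementation, and `stack` and `newStack` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// stack is a hypothetical stand-in for DevStack; only the Close
// wiring is modelled here.
type stack struct {
	Close func()
}

// newStack takes closers in the order the layers were built and
// returns a stack whose Close runs them in reverse, exactly once.
func newStack(closers ...func()) *stack {
	var once sync.Once
	return &stack{Close: func() {
		once.Do(func() {
			for i := len(closers) - 1; i >= 0; i-- {
				closers[i]()
			}
		})
	}}
}

func main() {
	var order []string
	s := newStack(
		func() { order = append(order, "bus") },
		func() { order = append(order, "state") },
		func() { order = append(order, "tasks") },
	)
	s.Close()
	s.Close() // second call is a no-op
	fmt.Println(order) // prints [tasks state bus]
}
```

`sync.Once` also makes concurrent `Close` calls safe: later callers block until the first teardown finishes, then return without re-running it.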

Phase 65 (D-099) hot-reload deliberately NOT mirrored

The production `harbor dev` hot-reload supervisor (`cmd/harbor/cmd_dev_hot_reload.go`) wraps `bootDevStack` — it lives at the runDev level, not inside bootDevStack itself. The helper mirrors bootDevStack's field-for-field assembly, NOT the surrounding supervisor: integration tests that need to exercise the hot-reload shape construct their own supervisor against the helper's assembled stack (the supervisor's exported constructor takes the boot opts and the initial stack — both reproducible here). Per D-094's "helper-tracks-production" rule, this is a deliberate scope choice, not drift: a hot-reload "helper" that owned the rebuild loop would duplicate the cmd-side orchestrator with no test using it. When the rebuild orchestrator's shape next changes, both files (this one and `cmd/harbor/cmd_dev_hot_reload.go`) are revisited together.

harbortest/devstack/session_ensurer.go — adapts the concrete sessions.Registry to the protocol.SessionEnsurer seam (D-171).

Mirrors `cmd/harbor/session_ensurer.go` field-for-field (D-094 source-of-truth invariant): the production dev boot and the production-mirroring fixture wire the SAME create-on-first-use behaviour, so an integration test against the devstack exercises the exact path `harbor dev` runs.

Constants

const (
	DefaultDevTenant  = "dev"
	DefaultDevUser    = "dev"
	DefaultDevSession = "dev"

	// DefaultKID is the kid header the in-test ES256 signer stamps
	// on tokens. Matches `cmd/harbor`'s DevKID convention.
	DefaultKID = "harbor-test"

	// DefaultTokenTTL pins the validity of minted dev tokens to one
	// hour — short enough that a forgotten token cannot leak past
	// CI run boundaries, long enough that no test will hit refresh.
	DefaultTokenTTL = 1 * time.Hour
)

DefaultDevTenant / DefaultDevUser / DefaultDevSession match the `cmd/harbor` package-private dev-token constants. The Assemble helper mints a Bearer token under this identity when SkipAuth is false; tests that exercise the wire surface use this triple in their request bodies + JWT-validation expectations.
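The `AssembleOpts.Identity` field documented under Types falls back to these constants whenever one of its fields is empty. A minimal sketch of that per-field fallback follows; `identityTriple`, `orDefault`, and `resolveIdentity` are hypothetical names, and the real helper may resolve the triple differently.

```go
package main

import "fmt"

// Defaults copied from the package's constants block.
const (
	DefaultDevTenant  = "dev"
	DefaultDevUser    = "dev"
	DefaultDevSession = "dev"
)

// identityTriple is a hypothetical named stand-in for the anonymous
// AssembleOpts.Identity struct.
type identityTriple struct {
	Tenant, User, Session string
}

// orDefault returns v unless it is empty, in which case it returns def.
func orDefault(v, def string) string {
	if v == "" {
		return def
	}
	return v
}

// resolveIdentity fills each empty field independently, so a caller
// can override just the tenant and keep the default user and session.
func resolveIdentity(in identityTriple) identityTriple {
	return identityTriple{
		Tenant:  orDefault(in.Tenant, DefaultDevTenant),
		User:    orDefault(in.User, DefaultDevUser),
		Session: orDefault(in.Session, DefaultDevSession),
	}
}

func main() {
	got := resolveIdentity(identityTriple{Tenant: "acme"})
	fmt.Printf("%+v\n", got) // prints {Tenant:acme User:dev Session:dev}
}
```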

Variables

This section is empty.

Functions

This section is empty.

Types

type AssembleOpts

type AssembleOpts struct {
	// SkipAuth disables Validator construction + dev-token minting.
	// `DevStack.Validator` / `DevStack.Token` are nil. Use for tests
	// that exercise the catalog or in-process invariants and never
	// touch the wire.
	SkipAuth bool

	// SkipTransports disables `transports.NewMux` + the HTTP router.
	// `DevStack.Handler` / `DevStack.Mux` are nil. Use when the
	// caller never opens an httptest.Server. The `tools.entries[]`
	// catalog-wiring layer still fires when this is set: the
	// catalog builder does not depend on transports.
	SkipTransports bool

	// SkipCatalog disables `tools.NewCatalog` + the Phase 64a
	// `catalog.Builder` apply path. `DevStack.Catalog` /
	// `DevStack.Coordinator` / `DevStack.Gates` are nil. Use for
	// tests that only need the bus / state / tasks layers.
	SkipCatalog bool

	// SkipSteering disables `steering.NewRegistry` + the
	// ControlSurface. `DevStack.Steering` / `DevStack.Surface` are
	// nil. Implies SkipTransports because a Mux requires a
	// ControlSurface.
	SkipSteering bool

	// SkipRunLoop disables the `steering.RunLoop` construction and
	// the per-task driver that subscribes to `task.spawned` to drive
	// it (D-097, the production wiring that closes #114). When set,
	// `DevStack.RunLoop` / `DevStack.RunLoopDriver` are nil. Tests
	// that don't need the planner-step loop (anything that doesn't
	// drive a `start` request to completion) set this to opt out;
	// `wave11_test.go`'s post-D-097 wire-side approve E2E LEAVES the
	// flag false so the production RunLoop fires.
	//
	// When SkipRunLoop is left false, the in-test bridge for
	// APPROVE/REJECT resolution is no longer needed (the production
	// bridge in `steering.applier.routeThroughGate` fires from the
	// RunLoop's drain), so callers that previously installed
	// `runWave11WireBridge`-shaped goroutines can drop them.
	//
	// SkipRunLoop has no effect when SkipSteering or SkipCatalog is
	// set: the RunLoop requires both the steering Registry and the
	// catalog-applied gates map (the §13 primitive-with-consumer
	// rule applied to the V1 wiring).
	SkipRunLoop bool

	// OAuthProviders pre-populates the OAuth-provider map the
	// catalog Builder consults when an entry declares
	// `tools.entries[].oauth`. Empty by default.
	OAuthProviders map[string]toolauth.OAuthProvider

	// PreRegisterTools is the descriptor list registered with the
	// catalog BEFORE the Builder applies. Use this to register
	// in-test tool fixtures (echo, stub, etc.) that operator config
	// in `cfg.Tools.Entries` then wraps. Ignored when SkipCatalog is
	// true.
	PreRegisterTools []tools.ToolDescriptor

	// LLMConfigSnapshot, when non-nil, overrides the LLM config
	// snapshot the helper would otherwise compute from `cfg.LLM`.
	// Phase 64 / D-089's `HARBOR_DEV_ALLOW_MOCK=1` path drives the
	// production cmd to override `driver` to "mock"; the wave11
	// integration test does the same thing. Pass an explicit
	// snapshot to flip the driver without re-writing the yaml.
	LLMConfigSnapshot *llm.ConfigSnapshot

	// Logger, when non-nil, is threaded through the auth.Middleware
	// wrapper for the draft handler so the helper's auth-rejection
	// log lines match production exactly (D-094 helper-tracks-
	// production rule; audit W2). When nil, the wrapper omits the
	// MWLogger option — silent rejection in tests is fine.
	Logger *slog.Logger

	// PlannerOverride, when non-nil, replaces the registry-resolved
	// planner concrete the helper would otherwise build from
	// `cfg.Planner` (D-103). Tests that need a stub / scripted /
	// pausing planner pass their own instance here; production code
	// never sets this field (the registry path is the only way to
	// reach a planner concrete in `harbor dev`). The override is
	// applied AFTER the LLM client is built so the same `stack.LLMClient`
	// the registry would have used is still available to the test.
	PlannerOverride planner.Planner

	// Identity overrides the dev-token's identity triple. Empty
	// fields fall back to DefaultDev{Tenant,User,Session}.
	Identity struct {
		Tenant  string
		User    string
		Session string
	}

	// Phase 83f (D-149) — mirror the production cmd_dev.go
	// per-run consumer wiring. The four fields are optional; nil /
	// zero leaves the planner's matching wrapper omitted (matching
	// production's behaviour when an operator did not configure the
	// underlying subsystem).
	//
	// `MemoryStore` is the store the per-task driver calls
	// `GetLLMContext(ctx, q)` against — the test passes a real
	// inmem store keyed to the run's identity.
	// `SkillStore` is the store the driver calls `Search(ctx, q, query, cap)`
	// against — the test passes a real localdb store.
	// `SkillsContextMax` caps the Search result count; zero resolves
	// to the package default (5).
	// `PlanningHints`, when non-nil, projects directly onto
	// `RunContext.PlanningHints` for every run the driver spawns.
	MemoryStore      memory.MemoryStore
	SkillStore       skills.SkillStore
	SkillsContextMax int
	PlanningHints    *planner.PlanningHints

	// TopologyAccessor, when non-nil, is wired into the
	// ControlSurface via protocol.WithTopologyAccessor so the Phase 74
	// `topology.snapshot` method returns a real projection (D-114).
	// Production `harbor dev` hosts no engine-graph (its runtime is
	// planner/RunLoop-shaped), so its ControlSurface leaves the
	// accessor nil; the Phase 74 integration test constructs a real
	// `engine.Engine` and passes it here so the topology surface is
	// exercised end-to-end with real drivers (CLAUDE.md §17.6 — the
	// test fixture wires what the test needs; the production absence
	// is documented, not a bug). Ignored when SkipSteering is set.
	TopologyAccessor protocol.TopologyAccessor

	// ScopeChecker, when non-nil, overrides the ControlSurface's
	// admin-cross-tenant scope predicate (Phase 74 / D-114). The
	// integration test injects a deterministic checker to exercise
	// the cross-tenant admin path without standing up an
	// auth.Middleware. Ignored when SkipSteering is set.
	ScopeChecker protocol.ScopeChecker

	// DraftRoot overrides the on-disk root the Phase 66 / D-100
	// draft Store materialises drafts under. Empty falls back to a
	// per-test temp dir (the helper picks one via testing.TempDir).
	// Tests that want to share a root across multiple Assemble calls
	// (rare) supply the same string twice.
	//
	// Cleanup responsibility (audit W5): when DraftRoot is empty, the
	// helper picks the temp dir AND registers an os.RemoveAll cleanup
	// on stack.Close. When DraftRoot is supplied explicitly, the
	// caller OWNS the directory and is responsible for cleanup — the
	// helper does NOT call os.RemoveAll on an operator-supplied path
	// (it would clobber a caller-managed scratch dir). Use t.TempDir
	// + DraftRoot together if you want both control and auto-cleanup.
	DraftRoot string
}

AssembleOpts controls which layers the helper builds. The zero value builds everything the cfg implies — LLM / memory / artifacts / tasks plus auth + transports + catalog + steering.

Each `Skip*` is binary: when set, the corresponding `DevStack` fields are left nil. Tests assert against the fields they exercise.
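The field comments above state two implications between the knobs: SkipSteering implies SkipTransports, and the RunLoop is only built when both steering and the catalog are. The sketch below encodes just those two rules. `opts`, `layers`, and `plan` are hypothetical names, and any other dependencies the real helper enforces are not modelled.

```go
package main

import "fmt"

// opts mirrors only the Skip* knobs of AssembleOpts.
type opts struct {
	SkipAuth, SkipTransports, SkipCatalog, SkipSteering, SkipRunLoop bool
}

// layers reports which optional layers would end up non-nil.
type layers struct {
	Auth, Transports, Catalog, Steering, RunLoop bool
}

// plan applies the documented implications:
//   - SkipSteering implies SkipTransports (a Mux requires a ControlSurface);
//   - the RunLoop needs both the steering Registry and the catalog gates.
func plan(o opts) layers {
	l := layers{
		Auth:     !o.SkipAuth,
		Catalog:  !o.SkipCatalog,
		Steering: !o.SkipSteering,
	}
	l.Transports = !o.SkipTransports && l.Steering
	l.RunLoop = !o.SkipRunLoop && l.Steering && l.Catalog
	return l
}

func main() {
	// Zero value: every optional layer is built.
	fmt.Printf("%+v\n", plan(opts{}))
	// prints {Auth:true Transports:true Catalog:true Steering:true RunLoop:true}

	// Skipping steering also drops transports and the RunLoop.
	fmt.Printf("%+v\n", plan(opts{SkipSteering: true}))
	// prints {Auth:true Transports:false Catalog:true Steering:false RunLoop:false}
}
```

A test that sets `SkipSteering` should therefore expect `Handler`, `Mux`, `RunLoop`, and `RunLoopDriver` to be nil as well, not only `Steering` and `Surface`.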

type DevStack

type DevStack struct {
	// Cfg is the *config.Config the caller passed in. Pinned on the
	// stack so tests can read driver-specific knobs without
	// threading the cfg through their own helpers.
	Cfg *config.Config

	// Audit / Bus / State / Artifacts / Tasks are always non-nil
	// after a successful Assemble — they are the runtime's
	// load-bearing core. The Memory / LLMClient fields are only
	// non-nil when the cfg declared a driver for them.
	Audit     audit.Redactor
	Bus       events.EventBus
	State     state.StateStore
	Artifacts artifacts.ArtifactStore
	Tasks     tasks.TaskRegistry
	LLMClient llm.LLMClient
	Memory    memory.MemoryStore

	// Steering / Surface are nil when SkipSteering is set.
	Steering *steering.Registry
	Surface  *protocol.ControlSurface

	// Sessions is the StateStore-backed SessionRegistry (D-171). Always
	// non-nil after a successful Assemble — it mirrors the production
	// `cmd/harbor` boot path. The ControlSurface is wired with its
	// create-on-first-use ensurer, and (when transports are mounted) the
	// `sessions.*` Protocol routes project over it. Integration tests use
	// it to assert per-request session create-on-first-use + restart
	// re-discovery via the persistent catalog.
	Sessions *sessions.Registry

	// RunLoop / RunLoopDriver are nil when SkipRunLoop is set OR when
	// SkipSteering / SkipCatalog forces the construction to be
	// skipped (the RunLoop needs both the steering Registry and the
	// catalog-applied gates map). Tests that drive a `start` request
	// rely on these — without RunLoop, the spawned task sits at
	// StatusPending forever and the planner never runs.
	RunLoop       *steering.RunLoop
	RunLoopDriver *DevStackRunLoopDriver

	// Catalog / Coordinator / Gates / OAuthProviders are nil when
	// SkipCatalog is set. The Gates map is keyed by tool name and
	// populated by the catalog Builder; tests that drive
	// `gate.ResolveApproval` reach for it.
	Catalog        tools.ToolCatalog
	Coordinator    pauseresume.Coordinator
	Gates          map[string]*toolapproval.ApprovalGate
	OAuthProviders map[string]toolauth.OAuthProvider

	// Phase 83g (D-150): the MCP Registry the dev stack populates
	// from cfg.Tools.MCPServers. Nil when SkipCatalog is set or no
	// servers are configured. Integration tests inspect this
	// directly to assert each configured server reached the Registry.
	MCPRegistry *mcpdrv.Registry

	// Validator / SigningKey / KID / Token are nil/empty when
	// SkipAuth is set. The Token is a signed Bearer the caller
	// stamps on outgoing HTTP requests; SigningKey is the matching
	// private key callers use to mint additional tokens (e.g. a
	// bogus token for the failure-mode test).
	Validator  auth.Validator
	SigningKey *ecdsa.PrivateKey
	KID        string
	Token      string

	// Mux / Handler are nil when SkipTransports is set. Handler is
	// the composed mux that exposes /healthz + /readyz + /v1/*; it
	// is the value tests pass to httptest.NewServer.
	Mux     *http.ServeMux
	Handler http.Handler

	// DraftStore is the Phase 66 / D-100 draft scratchpad. Always
	// non-nil after a successful Assemble — the helper mirrors
	// production (D-094 source-of-truth invariant). Tests that
	// exercise the draft surface read DraftStore.Root() for the on-
	// disk path or drive the HTTP handler mounted at
	// devdraft.RoutePrefix.
	DraftStore *devdraft.Store

	// Close runs every subsystem's Close in reverse dependency
	// order. Idempotent: safe to defer; safe to call multiple
	// times.
	Close func()
	// contains filtered or unexported fields
}

DevStack is the bundle Assemble returns. Fields are nil when the corresponding layer was skipped via AssembleOpts.

func Assemble

func Assemble(t *testing.T, cfg *config.Config, opts AssembleOpts) *DevStack

Assemble builds the dev stack the production `harbor dev` subcommand boots. See package doc for the source-of-truth invariant and the import block tests must blank-import.

The helper is `*testing.T`-flavoured: every failure is a `t.Fatalf` so tests don't need to thread error returns. On success, the caller defers `stack.Close()` immediately.

stack := devstack.Assemble(t, cfg, devstack.AssembleOpts{})
defer stack.Close()

Required blank imports

Assemble does NOT itself blank-import driver packages — that would surface in production binaries that vendor harbortest. Each test file MUST include the driver imports it needs, e.g.:

import (
    _ "github.com/hurtener/Harbor/internal/audit/drivers/patterns"
    _ "github.com/hurtener/Harbor/internal/events/drivers/inmem"
    _ "github.com/hurtener/Harbor/internal/state/drivers/inmem"
    _ "github.com/hurtener/Harbor/internal/artifacts/drivers/inmem"
    _ "github.com/hurtener/Harbor/internal/memory/drivers/inmem"
    _ "github.com/hurtener/Harbor/internal/tasks/drivers/inprocess"
    _ "github.com/hurtener/Harbor/internal/llm/mock"
)

The helper opens the audit redactor by direct construction (the patterns driver is the only V1 redactor; the seam is documented future-proofing). All other layers use the factory `Open`.
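The reason a missing blank import breaks Assemble is the usual Go registry pattern, the same one `database/sql` drivers use: a driver package registers its factory in `init`, and `Open` fails for any name that never registered. The sketch below is self-contained and hypothetical; `Store`, `Factory`, `Register`, and `Open` are illustrative names, not Harbor's actual driver API.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical driver interface.
type Store interface{ Name() string }

// Factory builds a Store on demand.
type Factory func() (Store, error)

var (
	mu        sync.RWMutex
	factories = map[string]Factory{}
)

// Register records a factory under a driver name. Driver packages
// call it from init, so it runs as soon as the package is imported.
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	factories[name] = f
}

// Open looks the driver up by name and fails if nothing registered it.
func Open(name string) (Store, error) {
	mu.RLock()
	f, ok := factories[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("driver %q not registered (missing blank import?)", name)
	}
	return f()
}

type inmem struct{}

func (inmem) Name() string { return "inmem" }

// In a real layout this init would live in the driver's own package
// and fire only when a test blank-imports that package.
func init() {
	Register("inmem", func() (Store, error) { return inmem{}, nil })
}

func main() {
	s, err := Open("inmem")
	if err != nil {
		panic(err)
	}
	fmt.Println(s.Name()) // prints "inmem"

	_, err = Open("sqlite") // never registered
	fmt.Println(err != nil) // prints "true"
}
```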

type DevStackRunLoopDriver

type DevStackRunLoopDriver struct {
	// contains filtered or unexported fields
}

DevStackRunLoopDriver mirrors `cmd/harbor`'s package-private `perTaskRunLoopDriver`. The duplication is intentional per D-094's source-of-truth invariant: both ship the same shape (subscribe to `task.spawned`, launch a goroutine per spawned foreground task, drive the planner via `RunLoop.Run`, drain on Close). When the production shape evolves, both move in the same PR.

The driver is exported as an opaque type that callers hold by pointer: tests inspect it through the `DevStack.RunLoop` field rather than reaching into the driver's internals.
