tooltest

package
v0.3.13 Latest
Warning

This package is not in the latest version of its module.

Published: May 14, 2026 License: MIT Imports: 8 Imported by: 0

Documentation

Overview

Package tooltest provides a generic contract test suite for tool.Tool implementations.

The package mirrors [enginetest]'s shape and intent: any type implementing tool.Tool should pass RunSuite to be considered "contract-compliant", meaning its behaviour matches what sdk/tool/registry.go and the LLM-tool wire protocol expect from it. Built-in tools (askuser, the dispatcher kanban tools) call RunSuite from their own *_test.go files; third-party tool authors should do the same.

What the suite covers

  • Definition() metadata invariants: non-empty name, deterministic return, JSON-marshalable schema (the registry serialises it into the LLM's tool catalogue).
  • Execute() error classification: bad JSON arguments must surface as errdefs.Validation; the tool must NOT panic on empty / nil args when the schema doesn't require any property.
  • Context cancellation propagation: a cancelled ctx given to Execute should make it return promptly (within the suite's bounded deadline) rather than ignore the cancel signal.
  • Concurrent invocation safety: ten goroutines hammering Definition() in parallel must not data-race (we run with -race). Tools that hold per-call mutable state should still expose stable Definition() output.
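As a rough illustration of the four bullets above, the sketch below pairs a small tool with a checker that probes the same invariants. It does not import the real packages: Definition, Tool, errValidation, echoTool and contractViolations are all local, hypothetical stand-ins (the actual shapes of tool.Tool and errdefs.Validation are not shown on this page), and the one-second deadline is an assumed value, not the suite's.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"reflect"
	"sync"
	"time"
)

// Definition is a hypothetical stand-in for a tool's metadata.
type Definition struct {
	Name   string
	Schema map[string]any
}

// Tool is a hypothetical stand-in for tool.Tool.
type Tool interface {
	Definition() Definition
	Execute(ctx context.Context, args json.RawMessage) (string, error)
}

// errValidation stands in for errdefs.Validation.
var errValidation = errors.New("validation")

// echoTool has no required properties, so empty args are acceptable.
type echoTool struct{}

func (echoTool) Definition() Definition {
	return Definition{Name: "echo", Schema: map[string]any{"type": "object"}}
}

func (echoTool) Execute(ctx context.Context, args json.RawMessage) (string, error) {
	// Honour cancellation before doing any work.
	if err := ctx.Err(); err != nil {
		return "", err
	}
	var in struct {
		Text string `json:"text"`
	}
	if len(args) > 0 {
		if err := json.Unmarshal(args, &in); err != nil {
			return "", fmt.Errorf("%w: %v", errValidation, err)
		}
	}
	return in.Text, nil
}

// contractViolations probes the invariants listed above and returns a
// description of each one that does not hold.
func contractViolations(newTool func() Tool) []string {
	var out []string
	bg := context.Background()

	d := newTool().Definition()
	if d.Name == "" {
		out = append(out, "empty name")
	}
	if !reflect.DeepEqual(d, newTool().Definition()) {
		out = append(out, "non-deterministic Definition")
	}
	if _, err := json.Marshal(d.Schema); err != nil {
		out = append(out, "schema is not JSON-marshalable")
	}
	if _, err := newTool().Execute(bg, json.RawMessage(`{not json`)); !errors.Is(err, errValidation) {
		out = append(out, "bad JSON did not surface as validation error")
	}
	if _, err := newTool().Execute(bg, nil); err != nil {
		out = append(out, "nil args rejected")
	}

	// Cancelled context: Execute must return within the (assumed) deadline.
	ctx, cancel := context.WithCancel(bg)
	cancel()
	done := make(chan struct{})
	go func() {
		defer close(done)
		_, _ = newTool().Execute(ctx, json.RawMessage(`{}`))
	}()
	select {
	case <-done:
	case <-time.After(time.Second):
		out = append(out, "Execute ignored cancellation")
	}

	// Ten goroutines calling Definition(); meaningful only under -race.
	shared := newTool()
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = shared.Definition()
		}()
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(len(contractViolations(func() Tool { return echoTool{} }))) // prints 0
}
```

The concurrency probe only detects a data race when the program is built or tested with the -race flag; without it the loop merely exercises the call path.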

What the suite deliberately does NOT cover

  • The semantic correctness of Execute() — that's per-tool unit test territory. The suite has no idea what "correct" means for a search tool vs. a calculator vs. ask_user.
  • Side effects (network, fs). The suite calls Execute with known-bad input; tools that try to reach external systems on bad input are themselves the bug.
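The side-effects point above implies an ordering inside Execute: validate the arguments first, touch external systems second. A minimal sketch of that ordering follows. Everything here is hypothetical: fetchTool, its injected fetch function, and errValidation (a stand-in for errdefs.Validation) are illustrative names, and Execute's signature is simplified by dropping the context parameter.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// errValidation stands in for errdefs.Validation.
var errValidation = errors.New("validation")

// fetchTool is a hypothetical tool whose side effect is injected,
// so a test can observe whether it was reached.
type fetchTool struct {
	fetch func(url string) (string, error)
}

// Execute validates its arguments fully before performing the side effect.
func (f fetchTool) Execute(args json.RawMessage) (string, error) {
	var in struct {
		URL string `json:"url"`
	}
	if err := json.Unmarshal(args, &in); err != nil {
		return "", fmt.Errorf("%w: %v", errValidation, err)
	}
	if in.URL == "" {
		return "", fmt.Errorf("%w: url is required", errValidation)
	}
	return f.fetch(in.URL) // reached only with valid input
}

// fetchCallsOnBadInput feeds several malformed inputs to the tool and
// reports how many times the side effect was invoked.
func fetchCallsOnBadInput() int {
	calls := 0
	t := fetchTool{fetch: func(string) (string, error) {
		calls++
		return "ok", nil
	}}
	for _, bad := range []string{`{not json`, `[]`, `{}`, ``} {
		_, _ = t.Execute(json.RawMessage(bad))
	}
	return calls
}

func main() {
	fmt.Println(fetchCallsOnBadInput()) // prints 0
}
```

Injecting the side effect as a function value is one design choice among several; it keeps the tool testable without a network or filesystem.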

Wiring

func TestAskUser_Contract(t *testing.T) {
    tooltest.RunSuite(t, func() tool.Tool { return askuser.New() })
}

Each subtest constructs a fresh tool via the supplied Factory so per-tool state cannot leak across cases.

Constants

This section is empty.

Variables

This section is empty.

Functions

func RunSuite

func RunSuite(t *testing.T, f Factory, caps ...Capabilities)

RunSuite runs every applicable contract subtest against tools produced by f. Each subtest builds a fresh tool so failures isolate cleanly.

Types

type Capabilities

type Capabilities struct {
	// SkipBadArgsValidation is true when the tool legitimately
	// accepts non-JSON arguments (rare — most tools take a JSON
	// object per the LLM tool-call protocol). When true the suite
	// skips the bad-JSON / unparseable-argument cases.
	SkipBadArgsValidation bool

	// SkipEmptyArgsTolerance is true when the tool's schema
	// declares required properties and so genuinely cannot run
	// with empty args. The suite then asserts the empty-args path
	// returns errdefs.Validation rather than treating an error as
	// a failure.
	SkipEmptyArgsTolerance bool

	// SkipContextCancel is true when the tool's Execute is so
	// cheap that it returns before any ctx-cancel could be observed
	// (pure in-memory transforms with no select). The suite then
	// skips the cancellation responsiveness check.
	SkipContextCancel bool
}

Capabilities lets a tool opt out of subtests that don't apply. Most tools should pass the zero value, under which every subtest runs.
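To make the effect of each flag concrete, the sketch below shows one way a variadic caps parameter could be resolved and mapped to a list of checks. This is an assumption about the mechanism, not the package's implementation: resolveCaps, plannedChecks and the check names are hypothetical, and Capabilities is a local copy of the struct documented above.

```go
package main

import "fmt"

// Capabilities is a local copy of the documented struct.
type Capabilities struct {
	SkipBadArgsValidation  bool
	SkipEmptyArgsTolerance bool
	SkipContextCancel      bool
}

// resolveCaps is hypothetical: it assumes the first supplied value wins
// and that omitting the argument means the zero value.
func resolveCaps(caps ...Capabilities) Capabilities {
	if len(caps) > 0 {
		return caps[0]
	}
	return Capabilities{}
}

// plannedChecks lists which checks would run for the given capabilities.
// The names are illustrative only.
func plannedChecks(caps ...Capabilities) []string {
	c := resolveCaps(caps...)
	checks := []string{"definition", "concurrency"}
	if !c.SkipBadArgsValidation {
		checks = append(checks, "bad-args")
	}
	// Per the field comment, this flag inverts the empty-args
	// expectation rather than dropping the check.
	if c.SkipEmptyArgsTolerance {
		checks = append(checks, "empty-args-must-fail-validation")
	} else {
		checks = append(checks, "empty-args-tolerated")
	}
	if !c.SkipContextCancel {
		checks = append(checks, "context-cancel")
	}
	return checks
}

func main() {
	fmt.Println(len(plannedChecks()))                                      // prints 5
	fmt.Println(len(plannedChecks(Capabilities{SkipContextCancel: true}))) // prints 4
}
```

Against the real package, the equivalent call would presumably take the form tooltest.RunSuite(t, factory, tooltest.Capabilities{SkipContextCancel: true}), which matches the RunSuite signature shown earlier.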

type Factory

type Factory func() tool.Tool

Factory builds a fresh tool.Tool for each subtest. The suite invokes it once per case so subtests do not share tool state.
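The reason for a factory rather than a single tool value can be seen in a small sketch. The names below are hypothetical: counterTool is a made-up tool with per-instance mutable state, and the local Factory returns that concrete type instead of tool.Tool so the example stays self-contained.

```go
package main

import "fmt"

// counterTool is a hypothetical tool that mutates its own state on each call.
type counterTool struct{ calls int }

func (c *counterTool) Execute() int {
	c.calls++
	return c.calls
}

// Factory is a local analogue of tooltest.Factory.
type Factory func() *counterTool

// runCases invokes the factory once per case, as the suite is described
// as doing, and records what each case observed.
func runCases(f Factory, n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, f().Execute())
	}
	return out
}

func main() {
	// A factory that builds a new tool each time: cases are isolated.
	fresh := func() *counterTool { return &counterTool{} }
	fmt.Println(runCases(fresh, 3)) // prints [1 1 1]

	// A factory that hands back one shared instance: state leaks.
	shared := &counterTool{}
	leaky := func() *counterTool { return shared }
	fmt.Println(runCases(leaky, 3)) // prints [1 2 3]
}
```

A factory that closes over a shared instance defeats the isolation, so the function passed to RunSuite should construct its tool inside the closure.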
