conformance

package
v0.10.15 Latest
Warning

This package is not in the latest version of its module.

Published: May 8, 2026 License: MIT Imports: 7 Imported by: 0

Documentation

Overview

Package conformance contains shared provider behavior tests.

To register a new provider, add a package-local test that builds a test-double server shaped like the provider's protocol, returns a Subject from a Factory, and calls Run with a Capabilities descriptor. Keep protocol-specific HTTP fixtures in the provider package so that this package depends only on agent.Provider behavior.

To add a new capability, extend Capabilities and add one subtest in Run. New capability checks should be opt-in when real model behavior is unreliable or provider-specific.

Live smoke tests should reuse the same Run entry point and skip unless their environment is configured. Current live switches:

  • OMLX_URL for omlx OpenAI-compatible endpoints.
  • LMSTUDIO_URL for LM Studio OpenAI-compatible endpoints.
  • OPENROUTER_API_KEY for OpenRouter.
  • OPENAI_API_KEY for OpenAI.
  • OLLAMA_URL for Ollama.
  • ANTHROPIC_API_KEY for Anthropic.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func HTTPStatusError

func HTTPStatusError(status int) error

func Run

func Run(t *testing.T, factory Factory, caps Capabilities)

Run executes the shared provider conformance catalog.

Types

type Capabilities

type Capabilities struct {
	Name              string
	ExpectedModels    []string
	SupportsStreaming bool
	SupportsThinking  bool
	SupportsToolCalls bool
	MaxTokensSlack    int

	ChatContains      string
	StreamContains    string
	ReasoningContains string

	// ChatMaxTokens overrides the default chat/stream max-tokens budget
	// (8 tokens). Thinking-mode providers like luce burn output budget on
	// reasoning_content before the visible content; without headroom they
	// return empty content and the chat assertions fail. 0 = use default.
	ChatMaxTokens int
	// StreamMaxTokensCheck overrides the per-test cap for the
	// "streaming max_tokens honored" subtest. The check still asserts the
	// returned word count is bounded by this value + MaxTokensSlack. 0 =
	// use the existing default of 3.
	StreamMaxTokensCheck int
	// ScenarioTimeout overrides the per-subtest wall-clock budget
	// (default 5 seconds). Local thinking-capable providers can need
	// substantially more — set generously for those. 0 = use default.
	ScenarioTimeout time.Duration
}

Capabilities declares the scenarios that apply to a provider. Add fields as the shared catalog grows rather than baking provider names into Run.

type Factory

type Factory func(t *testing.T) Subject

Factory builds a fresh provider subject for one conformance scenario.

type Subject

type Subject struct {
	Provider agent.Provider

	HealthCheck func(context.Context) error
	ListModels  func(context.Context) ([]string, error)
}

Subject is the provider plus any protocol-specific discovery hooks needed to exercise health and model-discovery capabilities.
