Documentation
¶
Overview ¶
Package conformance contains shared provider behavior tests.
To register a new provider, add a package-local test that builds a shaped double server, returns a Subject from a Factory, and calls Run with a Capabilities descriptor. Keep protocol-specific HTTP fixtures in the provider package so this package only depends on agent.Provider behavior.
To add a new capability, extend Capabilities and add one subtest in Run. New capability checks should be opt-in when real model behavior is unreliable or provider-specific.
Live smoke tests should reuse the same Run entry point and skip unless their environment is configured. Current live switches:
- OMLX_URL for omlx OpenAI-compatible endpoints.
- LMSTUDIO_URL for LM Studio OpenAI-compatible endpoints.
- OPENROUTER_API_KEY for OpenRouter.
- OPENAI_API_KEY for OpenAI.
- OLLAMA_URL for Ollama.
- ANTHROPIC_API_KEY for Anthropic.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func HTTPStatusError ¶
Types ¶
type Capabilities ¶
type Capabilities struct {
Name string
ExpectedModels []string
SupportsStreaming bool
SupportsThinking bool
SupportsToolCalls bool
MaxTokensSlack int
ChatContains string
StreamContains string
ReasoningContains string
// ChatMaxTokens overrides the default chat/stream max-tokens budget
// (8 tokens). Thinking-mode providers like luce burn output budget on
// reasoning_content before the visible content; without headroom they
// return empty content and the chat assertions fail. 0 = use default.
ChatMaxTokens int
// StreamMaxTokensCheck overrides the per-test cap for the
// "streaming max_tokens honored" subtest. The check still asserts the
// returned word count is bounded by this value + MaxTokensSlack. 0 =
// use the existing default of 3.
StreamMaxTokensCheck int
// ScenarioTimeout overrides the per-subtest wall-clock budget
// (default 5 seconds). Local thinking-capable providers can need
// substantially more — set generously for those. 0 = use default.
ScenarioTimeout time.Duration
}
Capabilities declares the scenarios that apply to a provider. Add fields as the shared catalog grows rather than baking provider names into Run.