Documentation ¶
Overview ¶
Package provider is stroma's shared OpenAI-compatible HTTP substrate.
It owns the per-request mechanics common to embed and chat: retry with capped exponential backoff, Retry-After parsing, response-size bounding, and classification of transport / HTTP / decode failures into a public FailureClass taxonomy. Callers (embed.OpenAI, chat.OpenAI, and downstream wrappers) consume provider.Do with their own request shape and response decoder.
Product-specific labels (Runtime, RequestType, etc.) are intentionally out of scope — callers that need those wrap Error on return.
Index ¶
- Constants
- func Classify(statusCode int, err error, message string) string
- func Do[T any](ctx context.Context, client *http.Client, target Target, ...) (T, error)
- func ExtractErrorMessage(body []byte) string
- func ExtractErrorValue(raw json.RawMessage) string
- type DecodeFunc
- type Error
- type FailureDetails
- type Policy
- type Target
Constants ¶
const (
	FailureClassAuth           = "auth"
	FailureClassRateLimit      = "rate_limit"
	FailureClassSchemaMismatch = "schema_mismatch"
	FailureClassServer         = "server"
	FailureClassTimeout        = "timeout"
	FailureClassTransport      = "transport"
)
FailureClass enumerates the substrate-level failure modes a provider call can surface. Callers branch on these to decide whether to retry, degrade, or propagate. The string values are stable and safe to log.
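As a rough illustration of branching on these values, the sketch below maps each class to a next step. The constants are local copies of the documented strings so the example compiles on its own, and `nextStep` with its retry / degrade / propagate mapping is a hypothetical caller-side policy, not something the package prescribes.

```go
package main

import "fmt"

// Local copies of the documented string values, so the sketch compiles
// without importing the package.
const (
	failureClassAuth           = "auth"
	failureClassRateLimit      = "rate_limit"
	failureClassSchemaMismatch = "schema_mismatch"
	failureClassServer         = "server"
	failureClassTimeout        = "timeout"
	failureClassTransport      = "transport"
)

// nextStep is a hypothetical caller-side policy; which classes are worth
// retrying or degrading on is an assumption made for this example.
func nextStep(class string) string {
	switch class {
	case failureClassRateLimit, failureClassServer, failureClassTimeout, failureClassTransport:
		return "retry"
	case failureClassSchemaMismatch:
		return "degrade"
	default: // auth and any class this caller does not recognise
		return "propagate"
	}
}

func main() {
	for _, class := range []string{failureClassAuth, failureClassRateLimit, failureClassSchemaMismatch} {
		fmt.Printf("%s -> %s\n", class, nextStep(class))
	}
}
```

Because the values are plain strings, an unrecognised class falls through to the default branch instead of failing to compile, which is why the sketch treats "propagate" as the safe fallback.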
Variables ¶
This section is empty.
Functions ¶
func Classify ¶
func Classify(statusCode int, err error, message string) string
Classify returns the FailureClass best describing the given HTTP status, Go error, and upstream error message. It is called by provider.Do to label every failure the substrate surfaces; callers may also call it directly when synthesising a FailureDetails for a caller-side error path.
Precedence: the HTTP status mapping takes priority (auth / rate_limit / timeout / server). When the status does not map to a class, the Go error and message are inspected for auth / rate-limit / timeout / transport hints, and the result defaults to dependency_unavailable.
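The precedence described above can be sketched as follows. This is not the package's implementation: `classifySketch` is a hypothetical stand-in, and the specific status codes (401/403, 408/504) and message substrings it checks are assumptions chosen for illustration. Only the ordering (status first, then error and message hints, then the default) comes from the description.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"strings"
)

// classifySketch illustrates the documented precedence. The concrete
// status codes and substrings below are assumptions.
func classifySketch(status int, err error, message string) string {
	// 1. The HTTP status takes priority whenever it maps to a class.
	switch {
	case status == 401 || status == 403:
		return "auth"
	case status == 429:
		return "rate_limit"
	case status == 408 || status == 504:
		return "timeout"
	case status >= 500:
		return "server"
	}

	// 2. Otherwise look for hints in the Go error and the upstream message.
	var netErr net.Error
	if errors.Is(err, context.DeadlineExceeded) || (errors.As(err, &netErr) && netErr.Timeout()) {
		return "timeout"
	}
	hint := strings.ToLower(message)
	if err != nil {
		hint += " " + strings.ToLower(err.Error())
	}
	switch {
	case strings.Contains(hint, "unauthorized"), strings.Contains(hint, "api key"):
		return "auth"
	case strings.Contains(hint, "rate limit"):
		return "rate_limit"
	case strings.Contains(hint, "connection refused"),
		strings.Contains(hint, "connection reset"),
		strings.Contains(hint, "broken pipe"):
		return "transport"
	}

	// 3. Nothing matched: fall back to the documented default.
	return "dependency_unavailable"
}

func main() {
	fmt.Println(classifySketch(429, nil, ""))
	fmt.Println(classifySketch(0, context.DeadlineExceeded, ""))
	fmt.Println(classifySketch(0, errors.New("dial tcp: connection refused"), ""))
	fmt.Println(classifySketch(0, nil, "model not loaded"))
}
```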
func Do ¶
func Do[T any](
	ctx context.Context,
	client *http.Client,
	target Target,
	details FailureDetails,
	policy Policy,
	decode DecodeFunc[T],
) (T, error)
Do sends target, applies retry + Retry-After behaviour per policy, bounds the response body, classifies non-2xx responses and transport failures, and hands the raw body to decode on success. details is the baseline FailureDetails used to enrich every returned *Error; HTTPStatus and FailureClass are populated by Do.
Retries fire on: 429 Too Many Requests, any 5xx, connection refused / reset / broken pipe, and net.Error.Timeout(). Retry-After (integer seconds or HTTP-Date) is honoured when present, bounded by Policy.MaxRetryAfter; otherwise the delay is exponential: 200ms → 400ms → ... → 2s cap. Context cancellation or an exceeded deadline aborts immediately without retry.
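To make the delay rules concrete, here is a minimal sketch of the schedule described above. The helper names (`backoffDelay`, `parseRetryAfter`, `retryDelay`) are hypothetical, and the exact doubling rule is inferred from the 200ms → 400ms → ... → 2s sequence rather than taken from the package source. `http.ParseTime` is the standard-library parser for HTTP-Date values.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// backoffDelay returns the wait before retry number attempt (0-based):
// 200ms, 400ms, 800ms, 1.6s, then capped at 2s.
func backoffDelay(attempt int) time.Duration {
	const base, maxDelay = 200 * time.Millisecond, 2 * time.Second
	d := base
	for i := 0; i < attempt && d < maxDelay; i++ {
		d *= 2
	}
	if d > maxDelay {
		d = maxDelay
	}
	return d
}

// parseRetryAfter accepts either integer seconds or an HTTP-Date and
// reports whether the header value was usable.
func parseRetryAfter(value string, now time.Time) (time.Duration, bool) {
	if secs, err := strconv.Atoi(value); err == nil && secs >= 0 {
		return time.Duration(secs) * time.Second, true
	}
	if t, err := http.ParseTime(value); err == nil {
		if d := t.Sub(now); d > 0 {
			return d, true
		}
		return 0, true // a date in the past means "retry now"
	}
	return 0, false
}

// retryDelay prefers a usable Retry-After value and otherwise falls back
// to the exponential schedule.
func retryDelay(header string, attempt int) time.Duration {
	if d, ok := parseRetryAfter(header, time.Now()); ok {
		return d
	}
	return backoffDelay(attempt)
}

func main() {
	for attempt := 0; attempt < 6; attempt++ {
		fmt.Println(attempt, backoffDelay(attempt))
	}
	fmt.Println(retryDelay("3", 0))
}
```

The sketch leaves out the upper bound on Retry-After; in the package that bound is controlled by Policy.MaxRetryAfter.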
func ExtractErrorMessage ¶
func ExtractErrorMessage(body []byte) string
ExtractErrorMessage returns a human-readable error message from a full OpenAI-compatible response body. It tolerates both `{"error":"..."}` and `{"error":{"message":"..."}}` shapes as well as a top-level `{"message":"..."}` used by some self-hosted gateways.
func ExtractErrorValue ¶
func ExtractErrorValue(raw json.RawMessage) string
ExtractErrorValue returns a human-readable error message from an OpenAI-compatible `error` field that may be either a string or an object with message/error/detail keys.
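The shapes that ExtractErrorMessage and ExtractErrorValue tolerate can be handled with a two-step decode, sketched below. `errorValue` and `errorMessage` are hypothetical stand-ins for the two functions. The order in which the message / error / detail keys are tried, and the fallback to a top-level message only when the error field yields nothing, are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// errorValue handles an `error` field that is either a JSON string or an
// object carrying message / error / detail keys.
func errorValue(raw json.RawMessage) string {
	var s string
	if json.Unmarshal(raw, &s) == nil {
		return s
	}
	var obj map[string]json.RawMessage
	if json.Unmarshal(raw, &obj) != nil {
		return ""
	}
	// Key order is an assumption made for this sketch.
	for _, key := range []string{"message", "error", "detail"} {
		var v string
		if json.Unmarshal(obj[key], &v) == nil && v != "" {
			return v
		}
	}
	return ""
}

// errorMessage reads a full response body and falls back to a top-level
// "message" when the "error" field yields nothing.
func errorMessage(body []byte) string {
	var envelope struct {
		Error   json.RawMessage `json:"error"`
		Message string          `json:"message"`
	}
	if json.Unmarshal(body, &envelope) != nil {
		return ""
	}
	if msg := errorValue(envelope.Error); msg != "" {
		return msg
	}
	return envelope.Message
}

func main() {
	for _, body := range []string{
		`{"error":"boom"}`,
		`{"error":{"message":"invalid api key"}}`,
		`{"message":"gateway unavailable"}`,
	} {
		fmt.Println(errorMessage([]byte(body)))
	}
}
```

Keeping the `error` field as a json.RawMessage defers the string-or-object decision until after the envelope has been parsed, so a single pass over the body covers all three shapes.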
Types ¶
type DecodeFunc ¶
DecodeFunc parses a successful 2xx response body into T. Decoders may return a *Error directly (e.g. for schema_mismatch) to signal a caller-classified failure; other errors are treated as opaque and wrapped as dependency_unavailable.
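A decoder that signals schema_mismatch itself might look like the sketch below. Everything here is a local stand-in: `classifiedError` plays the role of *Error, and `decodeFunc[T]` is an assumed shape for DecodeFunc[T], since its declaration is not shown in this document. The embeddings payload layout is likewise an assumption based on the common OpenAI-compatible response.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// classifiedError stands in for the package's *Error in this sketch.
type classifiedError struct {
	Class   string
	Message string
}

func (e *classifiedError) Error() string { return e.Class + ": " + e.Message }

// decodeFunc is an assumed shape for DecodeFunc[T].
type decodeFunc[T any] func(body []byte) (T, error)

// embedResponse models the part of the payload this decoder needs.
type embedResponse struct {
	Data []struct {
		Embedding []float32 `json:"embedding"`
	} `json:"data"`
}

// decodeEmbeddings returns a classified error when the body does not have
// the expected structure, instead of leaving classification to the caller.
func decodeEmbeddings(body []byte) ([][]float32, error) {
	var resp embedResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, &classifiedError{Class: "schema_mismatch", Message: "invalid JSON: " + err.Error()}
	}
	if len(resp.Data) == 0 {
		return nil, &classifiedError{Class: "schema_mismatch", Message: "response has no data entries"}
	}
	out := make([][]float32, len(resp.Data))
	for i, d := range resp.Data {
		out[i] = d.Embedding
	}
	return out, nil
}

func main() {
	var decode decodeFunc[[][]float32] = decodeEmbeddings
	_, err := decode([]byte(`{"data":[]}`))
	var ce *classifiedError
	if errors.As(err, &ce) {
		fmt.Println("classified by decoder:", ce.Class)
	}
}
```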
type Error ¶
type Error struct {
	Message    string
	HTTPStatus int
	Details    *FailureDetails
}
Error is the classified-failure type returned by every failing provider.Do path. Callers branch on Details.FailureClass, using errors.As to reach the *Error when it has been wrapped, to decide whether to retry, degrade, or propagate.
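The errors.As pattern mentioned above is sketched here with local stand-in types, because this example does not import the package. `providerError` mirrors the documented Error fields, while `withContext` and `failureClassOf` are hypothetical helpers showing how a caller might wrap the error with product-level context and still recover the class.

```go
package main

import (
	"errors"
	"fmt"
)

// failureDetails and providerError mirror the documented field names; they
// are local stand-ins, not the package's own types.
type failureDetails struct {
	FailureClass string
}

type providerError struct {
	Message    string
	HTTPStatus int
	Details    *failureDetails
}

func (e *providerError) Error() string { return e.Message }

// FailureClass returns "" when no details are attached, matching the
// documented behaviour of (*Error).FailureClass.
func (e *providerError) FailureClass() string {
	if e == nil || e.Details == nil {
		return ""
	}
	return e.Details.FailureClass
}

// withContext wraps err with a caller-side label while keeping it
// reachable through errors.As.
func withContext(op string, err error) error {
	return fmt.Errorf("%s: %w", op, err)
}

// failureClassOf unwraps err until it finds a *providerError.
func failureClassOf(err error) string {
	var pe *providerError
	if errors.As(err, &pe) {
		return pe.FailureClass()
	}
	return ""
}

func main() {
	base := &providerError{
		Message:    "upstream returned 429",
		HTTPStatus: 429,
		Details:    &failureDetails{FailureClass: "rate_limit"},
	}
	wrapped := withContext("embed batch 3", base)
	fmt.Println(wrapped)
	fmt.Println(failureClassOf(wrapped))
}
```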
func NewError ¶
func NewError(details FailureDetails, format string, args ...any) *Error
NewError formats a classified failure without HTTP context.
func NewErrorStatus ¶
func NewErrorStatus(details FailureDetails, status int, format string, args ...any) *Error
NewErrorStatus formats a classified failure and records the associated HTTP status code.
func (*Error) DiagnosticFields ¶
DiagnosticFields returns a JSON-friendly diagnostic payload built from the attached details, or nil when none are set.
func (*Error) FailureClass ¶
FailureClass returns the classified failure mode, or "" when the error carries no details.
func (*Error) HTTPStatusCode ¶
HTTPStatusCode returns the associated HTTP status when the failure came from an HTTP response, or zero otherwise.
type FailureDetails ¶
type FailureDetails struct {
	Model        string
	Endpoint     string
	FailureClass string
	HTTPStatus   int
	TimeoutMS    int
	MaxRetries   int
	BatchSize    int
	InputCount   int
}
FailureDetails is the substrate-level diagnostic payload attached to every Error. It carries fields provider.Do can populate from the request and response itself. Callers that need product-layer labels (Runtime, Provider, RequestType) wrap the error on return rather than adding those fields here.
func (FailureDetails) Map ¶
func (d FailureDetails) Map() map[string]any
Map returns the details as a JSON-friendly object with empty fields omitted so log sinks can emit a minimal diagnostic payload.
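One way to produce such a minimal payload is sketched below. The struct is a local copy of the documented fields, `toMap` is a hypothetical equivalent of Map, and the snake_case key names are assumptions, since the actual keys are not listed here.

```go
package main

import "fmt"

// failureDetails is a local copy of the documented fields.
type failureDetails struct {
	Model        string
	Endpoint     string
	FailureClass string
	HTTPStatus   int
	TimeoutMS    int
	MaxRetries   int
	BatchSize    int
	InputCount   int
}

// toMap omits zero-valued fields. The key names are assumptions.
func (d failureDetails) toMap() map[string]any {
	m := map[string]any{}
	putString := func(key, value string) {
		if value != "" {
			m[key] = value
		}
	}
	putInt := func(key string, value int) {
		if value != 0 {
			m[key] = value
		}
	}
	putString("model", d.Model)
	putString("endpoint", d.Endpoint)
	putString("failure_class", d.FailureClass)
	putInt("http_status", d.HTTPStatus)
	putInt("timeout_ms", d.TimeoutMS)
	putInt("max_retries", d.MaxRetries)
	putInt("batch_size", d.BatchSize)
	putInt("input_count", d.InputCount)
	return m
}

func main() {
	d := failureDetails{Model: "text-embed", FailureClass: "rate_limit", HTTPStatus: 429}
	fmt.Println(d.toMap())
}
```

A consequence of omitting zero values is that a field explicitly set to 0 (for example MaxRetries) is indistinguishable from an unset one in the output.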
type Policy ¶
type Policy struct {
	MaxRetries       int
	MaxResponseBytes int64
	// MaxRetryAfter bounds how long waitBeforeRetry will honour a
	// server-supplied Retry-After header. A hostile or misconfigured
	// upstream sending `Retry-After: 86400` (or a far-future HTTP
	// date) would otherwise park the goroutine for that entire
	// duration when the caller's context has no deadline. A zero or
	// negative value selects defaultMaxRetryAfter (30s), which is
	// generous for real rate limits yet tight enough that a
	// pathological header does not silently consume hours of wall
	// time. Callers with a larger budget set this explicitly.
	MaxRetryAfter time.Duration
}
Policy controls retry and response-size behaviour. A zero Policy is valid: MaxRetries 0 means "try once, no retries", and a zero MaxResponseBytes selects defaultMaxResponseBytes (4 MiB) — generous for chat completions, tight enough to avoid OOMing the host on a misconfigured upstream. Embedders with larger expected payloads set MaxResponseBytes explicitly.
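The zero-value behaviour can be illustrated with a small defaults-resolution sketch. The `policy` type is a local copy of the documented fields, and `withDefaults`, `clampRetryAfter`, and `readBounded` are hypothetical helpers. Treating a negative MaxResponseBytes like zero is an assumption, and reading one byte past the limit to detect oversize bodies is one common technique, not necessarily the one the package uses.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
	"time"
)

const (
	defaultMaxResponseBytes int64 = 4 << 20 // 4 MiB, as documented
	defaultMaxRetryAfter          = 30 * time.Second
)

// policy is a local copy of the documented fields.
type policy struct {
	MaxRetries       int
	MaxResponseBytes int64
	MaxRetryAfter    time.Duration
}

// withDefaults fills in the documented defaults for unset fields.
func (p policy) withDefaults() policy {
	if p.MaxResponseBytes <= 0 { // negative handling is an assumption
		p.MaxResponseBytes = defaultMaxResponseBytes
	}
	if p.MaxRetryAfter <= 0 {
		p.MaxRetryAfter = defaultMaxRetryAfter
	}
	return p
}

// clampRetryAfter bounds a server-supplied delay by the policy.
func clampRetryAfter(d time.Duration, p policy) time.Duration {
	if d > p.MaxRetryAfter {
		return p.MaxRetryAfter
	}
	return d
}

var errTooLarge = errors.New("response body exceeds limit")

// readBounded reads at most limit bytes and fails on anything larger.
func readBounded(r io.Reader, limit int64) ([]byte, error) {
	body, err := io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(body)) > limit {
		return nil, errTooLarge
	}
	return body, nil
}

func main() {
	p := policy{}.withDefaults()
	fmt.Println(p.MaxRetries, p.MaxResponseBytes, p.MaxRetryAfter)
	fmt.Println(clampRetryAfter(24*time.Hour, p))
	_, err := readBounded(strings.NewReader("0123456789"), 4)
	fmt.Println(err)
}
```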
type Target ¶
Target describes an idempotent request that provider.Do may resend on retry. Body is stored as a byte slice (rather than an io.Reader) so the transport can rebuild a fresh http.Request per attempt.
An empty Token skips the Authorization header. Header is optional and merged after the core sets Content-Type and Authorization, so callers can set e.g. Accept without clobbering either.
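The reason for storing Body as a byte slice shows up when building one request per attempt, as in the sketch below. The `target` type mirrors the documented Token, Header, and Body fields; the `URL` field, the POST method, the JSON content type, and the Bearer scheme are assumptions for illustration, as are the helper names `newAttempt` and `headerOf`. How the package resolves a caller header that collides with a core header is not stated, so this sketch simply appends caller values.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"net/http"
)

// target mirrors the documented Token, Header, and Body fields. URL is a
// hypothetical field added so the sketch can build a request.
type target struct {
	URL    string
	Token  string
	Header http.Header
	Body   []byte
}

// newAttempt builds a fresh request from the stored bytes, so every retry
// gets an unread body.
func newAttempt(ctx context.Context, t target) (*http.Request, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, t.URL, bytes.NewReader(t.Body))
	if err != nil {
		return nil, err
	}
	// Core headers first.
	req.Header.Set("Content-Type", "application/json")
	if t.Token != "" { // an empty Token skips Authorization
		req.Header.Set("Authorization", "Bearer "+t.Token)
	}
	// Caller headers are merged afterwards.
	for key, values := range t.Header {
		for _, v := range values {
			req.Header.Add(key, v)
		}
	}
	return req, nil
}

// headerOf is a small inspection helper for the example.
func headerOf(t target, name string) string {
	req, err := newAttempt(context.Background(), t)
	if err != nil {
		return ""
	}
	return req.Header.Get(name)
}

func main() {
	t := target{
		URL:    "http://localhost:8080/v1/embeddings",
		Token:  "secret",
		Header: http.Header{"Accept": {"application/json"}},
		Body:   []byte(`{"input":["hello"]}`),
	}
	for attempt := 0; attempt < 2; attempt++ {
		req, err := newAttempt(context.Background(), t)
		if err != nil {
			fmt.Println("build failed:", err)
			return
		}
		fmt.Println(attempt, req.ContentLength, req.Header.Get("Accept"))
	}
}
```

An io.Reader body would be drained by the first attempt, which is why a retrying transport needs either the raw bytes or a way to reopen the body.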