failover

package
v0.5.24
Published: Jun 21, 2026 License: Apache-2.0 Imports: 7 Imported by: 0

Documentation

Overview

Package failover classifies upstream-provider errors into a small, well-defined taxonomy that the cross-provider retry loop dispatches on.

The package is intentionally narrow: it does NOT perform retry, transport, or scoring work. It only answers two questions for the conductor:

  1. What kind of failure is this? (Class)
  2. Should the next provider be tried? (advance vs abort)

Classification reads error shapes that the executor layer already produces: HTTP status via the `StatusCode() int` interface, `*executor.stallError` via `IsStallError`, and `context.Canceled` / `context.DeadlineExceeded` directly. Executors therefore do not need to be modified to participate.

Types

type ErrorClass

type ErrorClass int

ErrorClass is the taxonomy validated by the gemini+opencode round of lope-negotiate (sprint doc, Decision #1).

const (
	// ClassUnknown is the default zero value; treated like ClassTransient by
	// the conductor (advance to next provider) but logged distinctly so we
	// can spot classifier blind spots.
	ClassUnknown ErrorClass = iota

	// ClassTransient — 5xx, conn refused, ctx deadline mid-request,
	// header timeout. Retry next provider (and optionally same once).
	ClassTransient

	// ClassRateLimit — 429. Skip provider for its backoff window, advance.
	ClassRateLimit

	// ClassAuth — 401, 403. Advance; mark credential degraded.
	ClassAuth

	// ClassOutOfCredits — 402, or provider-signaled credit exhaustion.
	// Advance; mark provider unavailable for the session.
	ClassOutOfCredits

	// ClassContextLength — 400 with provider-signaled context-length-exceeded.
	// Advance only to providers with larger window; fail fast otherwise.
	ClassContextLength

	// ClassPermanent — 400 (other), 404, 422. Return to client. Do NOT advance.
	ClassPermanent

	// ClassEmptyContent — 200 OK with empty content/choices.
	// Advance immediately (sick node or safety filter).
	ClassEmptyContent

	// ClassStallPreFirstByte — upstream produced no bytes within
	// firstByteTimeout. Stream context cancelled by watchdog. Advance —
	// nothing has been flushed to the client yet.
	ClassStallPreFirstByte

	// ClassStallMidStream — stall AFTER first chunk was flushed to client.
	// UNRECOVERABLE: we cannot retract a partial SSE response. Abort.
	ClassStallMidStream

	// ClassClientDisconnect — caller cancelled the request. NOT a provider
	// failure; do not retry, do not record failure on health monitor.
	ClassClientDisconnect
)

func Classify

func Classify(ctx context.Context, err error, body []byte) ErrorClass

Classify maps any error returned from an executor (or downstream) into one of the ErrorClass values. body may be nil; when present and non-empty it is consulted only for provider-signaled hints inside an HTTP 400 response (out_of_credits / context_length).

The function never panics on nil err — it returns ClassUnknown.
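The status-to-class mapping described in the constant comments can be sketched roughly as follows. This is not the package's implementation: `class` and `classifyStatus` are hypothetical stand-ins, and the substring match on `out_of_credits` / `context_length` is only an assumption about how the body hints might be detected.

```go
package main

import (
	"bytes"
	"fmt"
)

// class is a string stand-in for ErrorClass, used only in this sketch.
type class string

// classifyStatus maps an HTTP status (plus an optional body) to a class,
// following the per-constant comments. body may be nil.
func classifyStatus(code int, body []byte) class {
	switch {
	case code >= 500 && code <= 599:
		return "transient"
	case code == 429:
		return "rate_limit"
	case code == 401 || code == 403:
		return "auth"
	case code == 402:
		return "out_of_credits"
	case code == 400:
		// Assumption: provider hints appear as plain substrings in the body.
		if bytes.Contains(body, []byte("context_length")) {
			return "context_length"
		}
		if bytes.Contains(body, []byte("out_of_credits")) {
			return "out_of_credits"
		}
		return "permanent"
	case code == 404 || code == 422:
		return "permanent"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(classifyStatus(503, nil))
	fmt.Println(classifyStatus(400, nil))
	fmt.Println(classifyStatus(400, []byte(`{"error":{"code":"context_length_exceeded"}}`)))
}
```

A nil body falls through to the plain 400 case, which matches the documented rule that the body is consulted only for hints.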

func (ErrorClass) ShouldAdvance

func (c ErrorClass) ShouldAdvance() bool

ShouldAdvance reports whether the conductor should try the next provider for this class. ClassPermanent, ClassClientDisconnect, and ClassStallMidStream are terminal — everything else advances.
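A minimal sketch of this rule and of a loop that dispatches on it is shown below. The `ErrorClass` constants are re-declared locally so the example compiles on its own; `attempt` and `run` are hypothetical and are not the conductor's API.

```go
package main

import "fmt"

// ErrorClass is re-declared locally, in the documented order, for this sketch.
type ErrorClass int

const (
	ClassUnknown ErrorClass = iota
	ClassTransient
	ClassRateLimit
	ClassAuth
	ClassOutOfCredits
	ClassContextLength
	ClassPermanent
	ClassEmptyContent
	ClassStallPreFirstByte
	ClassStallMidStream
	ClassClientDisconnect
)

// ShouldAdvance treats the three documented terminal classes as aborts.
func (c ErrorClass) ShouldAdvance() bool {
	switch c {
	case ClassPermanent, ClassClientDisconnect, ClassStallMidStream:
		return false
	default:
		return true
	}
}

// attempt is a hypothetical record of one provider's outcome.
type attempt struct {
	provider string
	ok       bool
	class    ErrorClass // meaningful only when ok is false
}

// run returns the provider that served the request, or "" if the loop
// hit a terminal class or ran out of providers.
func run(attempts []attempt) string {
	for _, a := range attempts {
		if a.ok {
			return a.provider
		}
		if !a.class.ShouldAdvance() {
			return ""
		}
	}
	return ""
}

func main() {
	served := run([]attempt{
		{provider: "a", class: ClassRateLimit},
		{provider: "b", ok: true},
	})
	fmt.Println(served) // b
}
```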

func (ErrorClass) String

func (c ErrorClass) String() string

type FailoverError

type FailoverError struct {
	Class    ErrorClass
	Provider string
	HTTPCode int
	Wrapped  error
}

FailoverError is the typed error the conductor wraps an executor failure in once it has been classified. It is the only error type the structured failover log line (P4) and any future P1 retry policy need to inspect.

It implements:

error           — to flow through normal error paths
Unwrap() error  — so errors.Is / errors.As reach the original cause
StatusCode()    — so the existing statusCodeFromError helper still works

FailoverError MUST NOT be created speculatively by executors; it is the conductor's job to classify after the fact. Executors keep returning the raw upstream error; the conductor wraps once at the boundary.

func AsFailoverError

func AsFailoverError(err error) (*FailoverError, bool)

AsFailoverError extracts a *FailoverError from anywhere in the error chain. Returns (nil, false) if not present.

func (*FailoverError) Error

func (e *FailoverError) Error() string

Error implements the error interface.

func (*FailoverError) StatusCode

func (e *FailoverError) StatusCode() int

StatusCode preserves the HTTP-status interface contract so existing code paths (handlers.go status extraction, conductor cooldown logic) keep behaving identically when the wrapped error is a FailoverError.

func (*FailoverError) Unwrap

func (e *FailoverError) Unwrap() error

Unwrap returns the underlying error so errors.Is and errors.As keep working transparently.

type StallPhaser

type StallPhaser interface {
	error
	StallPhase() string
}

StallPhaser is the cross-package interface that the executor's stall error implements. Stalls are detected through this interface rather than by importing internal/runtime/executor directly, because a direct import would create an import cycle (auth → failover → executor → auth).
