Documentation
Overview ¶
Package quotaheaders parses provider-supplied rate-limit / quota response headers into a structured Signal. The Signal feeds the per-provider quota state machine introduced in fizeau-92b4b823 so dispatch can route around providers whose subscription/daily cap is hit (or imminently will be).
This package is intentionally limited to the subscription/daily exhaustion case. Per-second / per-minute throttling (a 429 with a short Retry-After) stays in the per-request feedback path.
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Signal ¶
type Signal struct {
	// Present is true when the response carried at least one recognized
	// rate-limit header. Consumers must check Present before reading the
	// numeric fields.
	Present bool

	// RemainingTokens is the remaining token budget in the current window.
	// -1 means the provider did not report a token budget on this response.
	RemainingTokens int64

	// RemainingRequests is the remaining request budget in the current
	// window. -1 means the provider did not report a request budget.
	RemainingRequests int64

	// ResetTime is the absolute wall-clock instant when the smaller of the
	// remaining-tokens / remaining-requests window resets. Zero when no
	// reset header was supplied.
	ResetTime time.Time

	// RetryAfter is the duration carried by an explicit Retry-After header.
	// Zero when absent. RetryAfter > 0 always indicates the provider has
	// already returned a 429-class response and wants the caller to wait.
	RetryAfter time.Duration
}
Signal is the structured shape returned by every per-provider parser. A field whose header is missing holds the "not reported" value documented on that field.
The router treats a Signal as actionable only when Present is true; that guards against treating a response with no rate-limit headers (e.g. a 200 from a self-hosted endpoint) as "remaining tokens = 0".
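The guard described above can be sketched as follows. This is a minimal illustration, not the router's actual code: the Signal type is a local copy of the documented struct, and outOfTokens is a hypothetical helper name.

```go
package main

import "time"

// Signal is a local copy of the documented struct so the sketch compiles
// on its own.
type Signal struct {
	Present           bool
	RemainingTokens   int64
	RemainingRequests int64
	ResetTime         time.Time
	RetryAfter        time.Duration
}

// outOfTokens is a hypothetical consumer-side check. It reads
// RemainingTokens only after confirming Present, so a zero-value Signal
// (no rate-limit headers at all) is never mistaken for "0 tokens left".
func outOfTokens(s Signal) bool {
	return s.Present && s.RemainingTokens == 0
}
```

With this check, outOfTokens(Signal{}) is false even though the zero value has RemainingTokens == 0, which is the situation the Present flag exists to disambiguate.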
func ParseAnthropic ¶
ParseAnthropic decodes Anthropic Messages-API rate-limit headers.
Recognized canonical headers (current Anthropic docs):
anthropic-ratelimit-requests-{limit,remaining,reset}
anthropic-ratelimit-tokens-{limit,remaining,reset}
anthropic-ratelimit-input-tokens-{limit,remaining,reset}
anthropic-ratelimit-output-tokens-{limit,remaining,reset}
retry-after
The bead also references the legacy "X-RateLimit-Remaining-*" family; those names are accepted for backward compatibility, but the canonical "anthropic-ratelimit-*" names take precedence when both are present.
Reset values are RFC 3339 timestamps. The parser returns the minimum of the per-axis reset times so the consumer wakes at the earliest recovery boundary.
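The "minimum of the per-axis reset times" rule could look roughly like the sketch below. The helper name earliestReset and the choice to skip malformed values are assumptions made for illustration; the package's real parser may handle errors differently.

```go
package main

import (
	"net/http"
	"time"
)

// earliestReset is a hypothetical helper, not part of the package. It reads
// the RFC 3339 "-reset" header for each axis and returns the earliest
// instant. ok is false when no reset header could be parsed.
func earliestReset(h http.Header) (t time.Time, ok bool) {
	axes := []string{"requests", "tokens", "input-tokens", "output-tokens"}
	for _, axis := range axes {
		v := h.Get("anthropic-ratelimit-" + axis + "-reset")
		if v == "" {
			continue // this axis was not reported
		}
		parsed, err := time.Parse(time.RFC3339, v)
		if err != nil {
			continue // assumption: ignore malformed values
		}
		if !ok || parsed.Before(t) {
			t, ok = parsed, true
		}
	}
	return t, ok
}
```

http.Header.Get canonicalizes the key it is given, so the lowercase names listed above match headers as stored by net/http.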
func ParseOpenAI ¶
ParseOpenAI decodes OpenAI Chat Completions rate-limit headers.
Recognized headers (https://platform.openai.com/docs/guides/rate-limits):
x-ratelimit-limit-requests
x-ratelimit-limit-tokens
x-ratelimit-remaining-requests
x-ratelimit-remaining-tokens
x-ratelimit-reset-requests (duration: "1s", "100ms", "2m30s")
x-ratelimit-reset-tokens (duration)
retry-after
Reset values are durations; this parser converts them to absolute times using `now`.
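As a rough sketch of the duration-to-absolute-time conversion: the example values "1s", "100ms" and "2m30s" are all accepted by time.ParseDuration, so the reset instant is simply now plus the parsed duration. The helper name openAIReset, the trimmed Signal type, and the choice to keep the earlier of the two axes are assumptions for illustration only.

```go
package main

import (
	"net/http"
	"time"
)

// Signal is a trimmed local copy of the documented struct.
type Signal struct {
	Present   bool
	ResetTime time.Time
}

// openAIReset is a hypothetical helper. It converts the duration-valued
// reset headers into absolute times relative to now and keeps the earlier
// one (an assumption of this sketch).
func openAIReset(h http.Header, now time.Time) Signal {
	var s Signal
	names := []string{"x-ratelimit-reset-requests", "x-ratelimit-reset-tokens"}
	for _, name := range names {
		d, err := time.ParseDuration(h.Get(name))
		if err != nil {
			continue // header absent or not a Go-style duration
		}
		at := now.Add(d)
		if !s.Present || at.Before(s.ResetTime) {
			s.Present, s.ResetTime = true, at
		}
	}
	return s
}
```

Passing now explicitly, rather than calling time.Now inside the parser, keeps the conversion deterministic and easy to test.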
func ParseOpenRouter ¶
ParseOpenRouter decodes OpenRouter rate-limit headers.
Recognized headers (https://openrouter.ai/docs/limits):
x-ratelimit-limit
x-ratelimit-remaining
x-ratelimit-reset (Unix ms timestamp)
retry-after
OpenRouter expresses one combined budget rather than separate token / request axes; the parser maps it onto RemainingRequests and leaves RemainingTokens at -1 ("not reported").
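The single-budget mapping might be sketched like this. The helper name openRouterSignal and the trimmed Signal type are hypothetical, and the handling of unparsable values is an assumption; only the header names and the -1 convention come from the documentation above.

```go
package main

import (
	"net/http"
	"strconv"
	"time"
)

// Signal is a trimmed local copy of the documented struct.
type Signal struct {
	Present           bool
	RemainingTokens   int64
	RemainingRequests int64
	ResetTime         time.Time
}

// openRouterSignal is a hypothetical helper. The combined budget lands in
// RemainingRequests; RemainingTokens stays at -1 ("not reported").
func openRouterSignal(h http.Header) Signal {
	s := Signal{RemainingTokens: -1, RemainingRequests: -1}
	if n, err := strconv.ParseInt(h.Get("x-ratelimit-remaining"), 10, 64); err == nil {
		s.Present = true
		s.RemainingRequests = n
	}
	// The reset header is a Unix timestamp in milliseconds.
	if ms, err := strconv.ParseInt(h.Get("x-ratelimit-reset"), 10, 64); err == nil {
		s.Present = true
		s.ResetTime = time.UnixMilli(ms)
	}
	return s
}
```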
func (Signal) IsExhausted ¶
IsExhausted applies the conservative "remaining tokens/requests would not last to the next reset" heuristic from the bead. The current rule:
- explicit Retry-After > 0 (provider already returned 429) — exhausted
- RemainingRequests reported and == 0 — exhausted
- RemainingTokens reported and == 0 — exhausted
Returns the wall-clock time at which to retry (zero when nothing actionable was signaled). The "will-exhaust-before-reset" predicate is intentionally conservative: only an explicit Retry-After or a reported budget of zero triggers exhaustion. That keeps false positives out of the dispatch path while still catching the cases where the provider has authoritatively said "no more headroom in this window."
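The three rules above can be sketched as a free function. The method's real signature is not shown in this documentation, so the name exhausted, the (bool, time.Time) return shape, and the choice of returned instant (now plus RetryAfter, otherwise ResetTime) are all assumptions of this sketch.

```go
package main

import "time"

// Signal is a local copy of the documented struct.
type Signal struct {
	Present           bool
	RemainingTokens   int64
	RemainingRequests int64
	ResetTime         time.Time
	RetryAfter        time.Duration
}

// exhausted is a hypothetical stand-in for Signal.IsExhausted. It reports
// whether the provider should be treated as exhausted and, if so, when the
// caller may retry.
func exhausted(s Signal, now time.Time) (bool, time.Time) {
	if !s.Present {
		return false, time.Time{} // no rate-limit headers: nothing actionable
	}
	if s.RetryAfter > 0 {
		return true, now.Add(s.RetryAfter) // provider already returned a 429
	}
	// -1 means "not reported", so == 0 only matches a reported empty budget.
	if s.RemainingRequests == 0 || s.RemainingTokens == 0 {
		return true, s.ResetTime
	}
	return false, time.Time{}
}
```

Because the unreported sentinel is -1, a plain comparison against zero is enough to express "reported and == 0" without a separate flag per field.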