Documentation
Overview
Package middleware provides reusable model.Client middlewares such as adaptive rate limiting.
Types
type AdaptiveRateLimiter
type AdaptiveRateLimiter struct {
// contains filtered or unexported fields
}
AdaptiveRateLimiter applies an AIMD-style (additive increase, multiplicative decrease) adaptive token bucket on top of a model.Client. It estimates the token cost of each request, blocks callers until capacity is available, and adjusts its effective tokens-per-minute (TPM) budget in response to rate-limiting signals from the provider.
The limiter is designed to sit at the provider client boundary. Callers construct a single instance per process and wrap the underlying model.Client with Middleware before passing it to planners or runtimes. The limiter is process-local unless NewAdaptiveRateLimiter is given a replicated map and key, in which case capacity is coordinated across processes.
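The AIMD adjustment and the blocking behavior described above can be illustrated with a minimal sketch. It is not the package's implementation: the budget type, its fields, and the step and backoff values are all hypothetical, chosen only to show how an additive increase, a multiplicative decrease, and a token bucket wait time might fit together.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// budget is a hypothetical stand-in for the limiter's internal state; the
// real AdaptiveRateLimiter keeps its fields unexported.
type budget struct {
	tpm, minTPM, maxTPM float64 // current, floor, and ceiling tokens per minute
	step                float64 // additive increase applied after a success
	backoff             float64 // multiplicative factor applied on a rate limit signal
	tokens              float64 // tokens currently available in the bucket
}

// onSuccess grows the budget additively, capped at maxTPM.
func (b *budget) onSuccess() { b.tpm = math.Min(b.tpm+b.step, b.maxTPM) }

// onRateLimited shrinks the budget multiplicatively, floored at minTPM.
func (b *budget) onRateLimited() { b.tpm = math.Max(b.tpm*b.backoff, b.minTPM) }

// waitFor reports how long a caller would wait before a request with the
// given estimated token cost fits in the bucket at the current refill rate.
func (b *budget) waitFor(cost float64) time.Duration {
	if cost <= b.tokens {
		return 0
	}
	missing := cost - b.tokens
	return time.Duration(missing * float64(time.Minute) / b.tpm)
}

func main() {
	b := &budget{tpm: 60000, minTPM: 1000, maxTPM: 120000, step: 5000, backoff: 0.5}
	fmt.Println(b.tpm, b.waitFor(1000)) // empty bucket at the initial budget
	b.onRateLimited()                   // provider signalled a rate limit
	fmt.Println(b.tpm, b.waitFor(1000))
	b.onSuccess() // a later request succeeded
	fmt.Println(b.tpm)
}
```

In this sketch a rate limit signal halves the budget, which doubles the wait for the same request, while each success recovers only a fixed step. That asymmetry is what makes AIMD back off quickly and probe for capacity slowly.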
func NewAdaptiveRateLimiter
func NewAdaptiveRateLimiter(ctx context.Context, m *rmap.Map, key string, initialTPM, maxTPM float64) *AdaptiveRateLimiter
NewAdaptiveRateLimiter constructs an AdaptiveRateLimiter with a tokens-per-minute budget. When m and key are set, it coordinates capacity across processes using a Pulse replicated map; otherwise it operates as a process-local limiter.
func (*AdaptiveRateLimiter) Middleware
func (l *AdaptiveRateLimiter) Middleware() func(model.Client) model.Client
Middleware returns a model.Client middleware that enforces the adaptive tokens-per-minute limit for both Complete and Stream calls.
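The func(model.Client) model.Client shape is a standard decorator pattern, sketched below under simplifying assumptions. The Client interface here is a hypothetical, reduced stand-in for model.Client with a single Complete method (Stream is omitted for brevity), and the limiter type, errRateLimited, stubProvider, and the complete helper are likewise invented for illustration. The real limiter also estimates token cost and blocks before forwarding the call.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Client is a hypothetical, simplified stand-in for model.Client.
type Client interface {
	Complete(ctx context.Context, prompt string) (string, error)
}

// errRateLimited is a hypothetical provider rate limit signal.
var errRateLimited = errors.New("rate limited")

// limiter is a hypothetical stand-in for AdaptiveRateLimiter.
type limiter struct {
	tpm float64 // current tokens-per-minute budget
}

// Middleware returns a function that wraps a Client with the limiter.
func (l *limiter) Middleware() func(Client) Client {
	return func(next Client) Client { return &limitedClient{l: l, next: next} }
}

// limitedClient forwards calls to next and reports outcomes to the limiter.
type limitedClient struct {
	l    *limiter
	next Client
}

func (c *limitedClient) Complete(ctx context.Context, prompt string) (string, error) {
	// A real limiter would estimate the request cost and wait for capacity here.
	out, err := c.next.Complete(ctx, prompt)
	if errors.Is(err, errRateLimited) {
		c.l.tpm /= 2 // back off when the provider signals a rate limit
	}
	return out, err
}

// stubProvider is a fake provider client used only for this example.
type stubProvider struct{ fail bool }

func (p stubProvider) Complete(ctx context.Context, prompt string) (string, error) {
	if p.fail {
		return "", errRateLimited
	}
	return "echo: " + prompt, nil
}

// complete is a small helper that calls Complete with a background context.
func complete(c Client, prompt string) (string, error) {
	return c.Complete(context.Background(), prompt)
}

func main() {
	l := &limiter{tpm: 60000}
	wrap := l.Middleware()

	out, err := complete(wrap(stubProvider{}), "hello")
	fmt.Println(out, err, l.tpm)

	_, err = complete(wrap(stubProvider{fail: true}), "hello")
	fmt.Println(err, l.tpm)
}
```

Because the wrapped value satisfies the same interface as the underlying client, downstream code that accepts a client needs no changes, and a single limiter instance can wrap the client once at construction time.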