middleware

package
v1.0.2 Latest
Warning

This package is not in the latest version of its module.
Published: Mar 23, 2026 License: MIT Imports: 8 Imported by: 0

Documentation

Overview

Package middleware provides reusable model.Client middlewares such as adaptive rate limiting.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type AdaptiveRateLimiter

type AdaptiveRateLimiter struct {
	// contains filtered or unexported fields
}

AdaptiveRateLimiter applies an AIMD-style (additive-increase, multiplicative-decrease) adaptive token bucket on top of a model.Client. It estimates the token cost of each request, blocks callers until capacity is available, and adjusts its effective tokens-per-minute budget in response to rate-limiting signals from the provider.

By default the limiter is process-local and is designed to sit at the provider client boundary; NewAdaptiveRateLimiter can optionally coordinate capacity across processes. Callers construct a single instance per process and wrap the underlying model.Client with Middleware before passing it to planners or runtimes.
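
The AIMD adjustment described above can be illustrated with a minimal, standalone sketch. It is not the package's implementation: the aimdBudget type, its fields, the step size (maxTPM/16), the floor (initialTPM/8), and the decrease factor (0.5) are all assumptions chosen for illustration. It only shows the general shape of the technique, where the budget grows by a fixed step after a success and is cut by a constant factor when the provider signals rate limiting.

```go
package main

import (
	"fmt"
	"math"
)

// aimdBudget is a hypothetical, simplified model of an AIMD-adjusted
// tokens-per-minute budget. All field names and constants are assumptions.
type aimdBudget struct {
	tpm    float64 // current effective tokens-per-minute budget
	minTPM float64 // assumed floor so the budget never collapses to zero
	maxTPM float64 // ceiling the budget may never exceed
	step   float64 // assumed additive increase applied on success
	factor float64 // assumed multiplicative decrease applied on rate limiting
}

// newAIMDBudget builds a budget starting at initialTPM and capped at maxTPM.
func newAIMDBudget(initialTPM, maxTPM float64) *aimdBudget {
	return &aimdBudget{
		tpm:    initialTPM,
		minTPM: initialTPM / 8,
		maxTPM: maxTPM,
		step:   maxTPM / 16,
		factor: 0.5,
	}
}

// onSuccess applies the additive increase, clamped to maxTPM.
func (b *aimdBudget) onSuccess() {
	b.tpm = math.Min(b.tpm+b.step, b.maxTPM)
}

// onRateLimited applies the multiplicative decrease, clamped to minTPM.
func (b *aimdBudget) onRateLimited() {
	b.tpm = math.Max(b.tpm*b.factor, b.minTPM)
}

func main() {
	b := newAIMDBudget(1000, 4000)
	b.onRateLimited() // the provider pushed back: halve the budget
	fmt.Println("after rate limit:", b.tpm)
	b.onSuccess() // a request succeeded: recover by one step
	fmt.Println("after success:", b.tpm)
}
```

The asymmetry is the point of AIMD: backing off quickly relieves pressure on the provider, while recovering slowly avoids oscillating straight back into the limit.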

func NewAdaptiveRateLimiter

func NewAdaptiveRateLimiter(ctx context.Context, m *rmap.Map, key string, initialTPM, maxTPM float64) *AdaptiveRateLimiter

NewAdaptiveRateLimiter constructs an AdaptiveRateLimiter with a tokens-per-minute budget. When m and key are set, it coordinates capacity across processes using a Pulse replicated map; otherwise it operates as a process-local limiter.

func (*AdaptiveRateLimiter) Middleware

func (l *AdaptiveRateLimiter) Middleware() func(model.Client) model.Client

Middleware returns a model.Client middleware that enforces the adaptive tokens-per-minute limit for both Complete and Stream calls.
