ratelimit

package

v1.0.62 (latest) Published: Dec 24, 2025 License: MIT Imports: 7 Imported by: 0

Repository

github.com/cecil-the-coder/ai-provider-kit

Documentation ¶

Overview ¶

Package ratelimit provides rate limiting functionality for AI provider API requests. It supports tracking rate limits across multiple providers including Anthropic, OpenAI, Cerebras, OpenRouter, and others.

Index ¶

func FormatDuration(d time.Duration) string
func FormatQwenInfo(info *Info) string
func MustParseTime(s string) time.Time
func WithCredits(limit, remaining float64) func(*Info)
func WithCustomData(data map[string]interface{}) func(*Info)
func WithDailyRequests(limit, remaining int, reset string) func(*Info)
func WithFreeTier(freeTier bool) func(*Info)
func WithInputTokens(limit, remaining int, reset string) func(*Info)
func WithOutputTokens(limit, remaining int, reset string) func(*Info)
func WithRequestID(id string) func(*Info)
func WithRequests(limit, remaining int, reset string) func(*Info)
func WithRetryAfter(d time.Duration) func(*Info)
func WithTokens(limit, remaining int, reset string) func(*Info)
type AnthropicInfo
type AnthropicParser
- func NewAnthropicParser() *AnthropicParser
- func (p *AnthropicParser) Parse(headers http.Header, model string) (*Info, error)
- func (p *AnthropicParser) ParseAndValidate(headers http.Header, model string) (*Info, error)
- func (p *AnthropicParser) ProviderName() string
type BaseInfo
type CerebrasInfo
type CerebrasParser
- func (p *CerebrasParser) Parse(headers http.Header, model string) (*Info, error)
- func (p *CerebrasParser) ProviderName() string
type GeminiParser
- func NewGeminiParser() *GeminiParser
- func (p *GeminiParser) Parse(headers http.Header, model string) (*Info, error)
- func (p *GeminiParser) ProviderName() string
type Info
- func MakeTestInfo(provider, model string, setters ...func(*Info)) *Info
type OpenAIParser
- func NewOpenAIParser() *OpenAIParser
- func (p *OpenAIParser) Parse(headers http.Header, model string) (*Info, error)
- func (p *OpenAIParser) ProviderName() string
type OpenRouterInfo
type OpenRouterParser
- func NewOpenRouterParser() *OpenRouterParser
- func (p *OpenRouterParser) Parse(headers http.Header, model string) (*Info, error)
- func (p *OpenRouterParser) ProviderName() string
type Parser
type QwenParser
- func NewQwenParser(logHeaders bool) *QwenParser
- func (p *QwenParser) Parse(headers http.Header, model string) (*Info, error)
- func (p *QwenParser) ProviderName() string
type Tracker
- func NewTracker() *Tracker
- func (t *Tracker) CanMakeRequest(model string, estimatedTokens int) bool
- func (t *Tracker) Get(model string) (*Info, bool)
- func (t *Tracker) GetWaitTime(model string) time.Duration
- func (t *Tracker) ShouldThrottle(model string, threshold float64) bool
- func (t *Tracker) Update(info *Info)

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func FormatDuration ¶

func FormatDuration(d time.Duration) string

FormatDuration formats a duration as a human-readable string. This is useful for displaying reset times.

func FormatQwenInfo ¶

func FormatQwenInfo(info *Info) string

FormatQwenInfo formats rate limit info for human-readable display. This is useful for logging and debugging.

func MustParseTime ¶ added in v1.0.60

func MustParseTime(s string) time.Time

MustParseTime parses an RFC 3339 timestamp and panics if parsing fails.

func WithCredits ¶ added in v1.0.60

func WithCredits(limit, remaining float64) func(*Info)

WithCredits sets OpenRouter credit fields

func WithCustomData ¶ added in v1.0.60

func WithCustomData(data map[string]interface{}) func(*Info)

WithCustomData sets custom data map

func WithDailyRequests ¶ added in v1.0.60

func WithDailyRequests(limit, remaining int, reset string) func(*Info)

WithDailyRequests sets Cerebras daily request fields

func WithFreeTier ¶ added in v1.0.60

func WithFreeTier(freeTier bool) func(*Info)

WithFreeTier sets OpenRouter free tier flag

func WithInputTokens ¶ added in v1.0.60

func WithInputTokens(limit, remaining int, reset string) func(*Info)

WithInputTokens sets Anthropic input token fields

func WithOutputTokens ¶ added in v1.0.60

func WithOutputTokens(limit, remaining int, reset string) func(*Info)

WithOutputTokens sets Anthropic output token fields

func WithRequestID ¶ added in v1.0.60

func WithRequestID(id string) func(*Info)

WithRequestID sets the request ID

func WithRequests ¶ added in v1.0.60

func WithRequests(limit, remaining int, reset string) func(*Info)

WithRequests sets request-related rate limit fields

func WithRetryAfter ¶ added in v1.0.60

func WithRetryAfter(d time.Duration) func(*Info)

WithRetryAfter sets the retry-after duration

func WithTokens ¶ added in v1.0.60

func WithTokens(limit, remaining int, reset string) func(*Info)

WithTokens sets token-related rate limit fields

Types ¶

type AnthropicInfo ¶ added in v1.0.60

type AnthropicInfo struct {
	// InputTokensLimit is the maximum number of input tokens allowed
	InputTokensLimit int `json:"input_tokens_limit,omitempty"`

	// InputTokensRemaining is the number of input tokens remaining
	InputTokensRemaining int `json:"input_tokens_remaining,omitempty"`

	// InputTokensReset is when the input token limit resets
	InputTokensReset time.Time `json:"input_tokens_reset,omitempty"`

	// OutputTokensLimit is the maximum number of output tokens allowed
	OutputTokensLimit int `json:"output_tokens_limit,omitempty"`

	// OutputTokensRemaining is the number of output tokens remaining
	OutputTokensRemaining int `json:"output_tokens_remaining,omitempty"`

	// OutputTokensReset is when the output token limit resets
	OutputTokensReset time.Time `json:"output_tokens_reset,omitempty"`
}

AnthropicInfo contains Anthropic-specific rate limit fields. Anthropic tracks input and output tokens separately from aggregate tokens.

type AnthropicParser ¶

type AnthropicParser struct{}

AnthropicParser implements the Parser interface for Anthropic's rate limit headers. Anthropic uses RFC 3339 timestamps for reset times and tracks input/output tokens separately.

Header format:

anthropic-ratelimit-requests-limit: Maximum requests allowed
anthropic-ratelimit-requests-remaining: Requests remaining
anthropic-ratelimit-requests-reset: RFC 3339 timestamp when limit resets
anthropic-ratelimit-tokens-limit: Maximum total tokens allowed
anthropic-ratelimit-tokens-remaining: Total tokens remaining
anthropic-ratelimit-tokens-reset: RFC 3339 timestamp when token limit resets
anthropic-ratelimit-input-tokens-limit: Maximum input tokens allowed
anthropic-ratelimit-input-tokens-remaining: Input tokens remaining
anthropic-ratelimit-input-tokens-reset: RFC 3339 timestamp when input token limit resets
anthropic-ratelimit-output-tokens-limit: Maximum output tokens allowed
anthropic-ratelimit-output-tokens-remaining: Output tokens remaining
anthropic-ratelimit-output-tokens-reset: RFC 3339 timestamp when output token limit resets
request-id: Unique request identifier
retry-after: Seconds to wait before retrying (on 429 responses)

func NewAnthropicParser ¶

func NewAnthropicParser() *AnthropicParser

NewAnthropicParser creates a new Anthropic rate limit parser.

func (*AnthropicParser) Parse ¶

func (p *AnthropicParser) Parse(headers http.Header, model string) (*Info, error)

Parse extracts rate limit information from Anthropic API response headers. It handles both standard token limits and separate input/output token limits. Missing headers are handled gracefully by leaving the corresponding fields at zero values.

func (*AnthropicParser) ParseAndValidate ¶

func (p *AnthropicParser) ParseAndValidate(headers http.Header, model string) (*Info, error)

ParseAndValidate is a convenience method that parses headers and validates that at least some rate limit information was found.

func (*AnthropicParser) ProviderName ¶

func (p *AnthropicParser) ProviderName() string

ProviderName returns "anthropic" as the provider identifier.

type BaseInfo ¶ added in v1.0.60

type BaseInfo struct {
	// Provider is the name of the AI provider (e.g., "anthropic", "openai")
	Provider string `json:"provider"`

	// Model is the specific model identifier (e.g., "claude-3-opus-20240229")
	Model string `json:"model"`

	// Timestamp is when this rate limit information was captured
	Timestamp time.Time `json:"timestamp"`

	// RequestsLimit is the maximum number of requests allowed in the current window
	RequestsLimit int `json:"requests_limit"`

	// RequestsRemaining is the number of requests remaining in the current window
	RequestsRemaining int `json:"requests_remaining"`

	// RequestsReset is when the request limit counter will reset
	RequestsReset time.Time `json:"requests_reset"`

	// TokensLimit is the maximum number of tokens allowed in the current window
	TokensLimit int `json:"tokens_limit"`

	// TokensRemaining is the number of tokens remaining in the current window
	TokensRemaining int `json:"tokens_remaining"`

	// TokensReset is when the token limit counter will reset
	TokensReset time.Time `json:"tokens_reset"`

	// RequestID is the unique identifier for the request that generated this info
	RequestID string `json:"request_id,omitempty"`

	// RetryAfter indicates how long to wait before retrying (from Retry-After header)
	RetryAfter time.Duration `json:"retry_after,omitempty"`

	// CustomData holds any additional provider-specific data that doesn't fit standard fields
	CustomData map[string]interface{} `json:"custom_data,omitempty"`
}

BaseInfo contains the common rate limit fields shared across all providers. This provides the foundation for provider-specific rate limit information.

type CerebrasInfo ¶ added in v1.0.60

type CerebrasInfo struct {
	// DailyRequestsLimit is the maximum number of requests per day
	DailyRequestsLimit int `json:"daily_requests_limit,omitempty"`

	// DailyRequestsRemaining is the number of daily requests remaining
	DailyRequestsRemaining int `json:"daily_requests_remaining,omitempty"`

	// DailyRequestsReset is when the daily request limit resets
	DailyRequestsReset time.Time `json:"daily_requests_reset,omitempty"`
}

CerebrasInfo contains Cerebras-specific rate limit fields. Cerebras provides daily request limits in addition to standard rate limits.

type CerebrasParser ¶

type CerebrasParser struct{}

CerebrasParser implements the Parser interface for Cerebras API rate limits. Cerebras tracks both daily requests and per-minute limits for requests and tokens.

Cerebras Rate Limit Headers:

x-ratelimit-limit-requests-day: Daily request limit
x-ratelimit-remaining-requests-day: Remaining daily requests
x-ratelimit-reset-requests-day: Daily limit reset time (float seconds)
x-ratelimit-limit-requests-minute: Per-minute request limit
x-ratelimit-remaining-requests-minute: Remaining per-minute requests
x-ratelimit-reset-requests-minute: Per-minute limit reset time (float seconds)
x-ratelimit-limit-tokens-minute: Per-minute token limit
x-ratelimit-remaining-tokens-minute: Remaining per-minute tokens
x-ratelimit-reset-tokens-minute: Per-minute token reset time (float seconds)

Custom Headers:

cerebras-request-id: Unique request identifier
cerebras-processing-time: Processing time in seconds
cerebras-region: Data center region

func (*CerebrasParser) Parse ¶

func (p *CerebrasParser) Parse(headers http.Header, model string) (*Info, error)

Parse extracts rate limit information from Cerebras API response headers. The model parameter is stored in the Info but doesn't affect parsing logic.

Note: Cerebras uses FLOAT SECONDS for reset times (e.g., "33011.382867"), which are converted to absolute time.Time values by adding to time.Now().

func (*CerebrasParser) ProviderName ¶

func (p *CerebrasParser) ProviderName() string

ProviderName returns "cerebras" as the provider identifier.

type GeminiParser ¶

type GeminiParser struct{}

GeminiParser implements the Parser interface for Google's Gemini API rate limit headers.

IMPORTANT LIMITATION: The Gemini API does NOT provide rate limit information in normal API responses. Unlike OpenAI, Anthropic, and other providers, Gemini does not include headers like x-ratelimit-limit-requests or x-ratelimit-remaining-requests in successful responses (200 OK).

This parser can only extract information from error responses (429 Too Many Requests):

retry-after: Number of seconds to wait before retrying (or HTTP date)

For proactive rate limiting with Gemini, client-side tracking is required:

Track your own request counts and timing
Implement token bucket or leaky bucket algorithms
Use official quota limits from Google Cloud Console
Monitor usage through Google Cloud Console/API

The Gemini API follows these rate limits (as of 2024):

Free tier: 15 RPM (requests per minute), 1 million TPM (tokens per minute)
Pay-as-you-go: 360 RPM, 4 million TPM (varies by model)
Limits are per project and can be viewed in Google Cloud Console

Since these limits are not provided in headers, this parser primarily serves to:

Extract retry-after duration from 429 error responses
Provide a consistent interface with other provider parsers
Return minimal Info with Provider and Model for tracking purposes

func NewGeminiParser ¶

func NewGeminiParser() *GeminiParser

NewGeminiParser creates a new Gemini rate limit parser.

func (*GeminiParser) Parse ¶

func (p *GeminiParser) Parse(headers http.Header, model string) (*Info, error)

Parse extracts rate limit information from Gemini API response headers.

Unlike other providers, Gemini does not return proactive rate limit headers. This parser only extracts the retry-after header when present (typically in 429 responses).

The retry-after header can be in two formats:

Integer seconds: "60" (wait 60 seconds)
HTTP date: "Wed, 21 Oct 2015 07:28:00 GMT" (wait until this time)

For normal successful responses (200 OK), this will return minimal Info with:

Provider: "gemini"
Model: the provided model name
All limit/remaining fields: 0 (unknown)
RetryAfter: 0 (no retry needed)

Parameters:

headers: HTTP response headers from Gemini API
model: The model identifier (e.g., "gemini-pro", "gemini-pro-vision")

Returns:

Info with minimal rate limit information
error is always nil (this parser doesn't fail)

func (*GeminiParser) ProviderName ¶

func (p *GeminiParser) ProviderName() string

ProviderName returns "gemini" as the provider identifier.

type Info ¶

type Info struct {
	BaseInfo

	// Anthropic-specific fields (embedded for direct access)
	AnthropicInfo

	// Cerebras-specific fields (embedded for direct access)
	CerebrasInfo

	// OpenRouter-specific fields (embedded for direct access)
	OpenRouterInfo
}

Info contains rate limit information for a specific model from an AI provider. It uses composition to include common fields along with provider-specific fields. This structure allows for clean separation while maintaining backward compatibility.

func MakeTestInfo ¶ added in v1.0.60

func MakeTestInfo(provider, model string, setters ...func(*Info)) *Info

MakeTestInfo creates a test Info struct with proper initialization
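
The setters accepted by MakeTestInfo have the type func(*Info), which is the functional-options pattern. The sketch below shows how such setters compose, using a local stand-in type (info, withRequests, withRequestID, and makeInfo are all hypothetical, and withRequests omits the reset argument of the real WithRequests). How the real MakeTestInfo initializes its fields is not documented on this page.

```go
package main

import "fmt"

// info is a hypothetical stand-in for ratelimit.Info with only a few fields.
type info struct {
	Provider          string
	Model             string
	RequestsLimit     int
	RequestsRemaining int
	RequestID         string
}

// withRequests mirrors the shape of a setter such as WithRequests: it returns
// a closure that mutates the struct it is given.
func withRequests(limit, remaining int) func(*info) {
	return func(i *info) {
		i.RequestsLimit = limit
		i.RequestsRemaining = remaining
	}
}

// withRequestID mirrors the shape of WithRequestID.
func withRequestID(id string) func(*info) {
	return func(i *info) { i.RequestID = id }
}

// makeInfo applies setters in order, so a later setter overrides an earlier one.
func makeInfo(provider, model string, setters ...func(*info)) *info {
	i := &info{Provider: provider, Model: model}
	for _, set := range setters {
		set(i)
	}
	return i
}

func main() {
	i := makeInfo("openai", "gpt-4", withRequests(60, 58), withRequestID("req_abc123"))
	fmt.Printf("%s %s %d/%d %s\n", i.Provider, i.Model, i.RequestsRemaining, i.RequestsLimit, i.RequestID)
}
```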

type OpenAIParser ¶

type OpenAIParser struct{}

OpenAIParser implements the Parser interface for OpenAI's rate limit headers. OpenAI provides rate limit information in the following format:

x-ratelimit-limit-requests: Maximum requests allowed per time window
x-ratelimit-remaining-requests: Requests remaining in current window
x-ratelimit-reset-requests: Duration until window resets (e.g., "6m0s", "1h30m")
x-ratelimit-limit-tokens: Maximum tokens allowed per time window
x-ratelimit-remaining-tokens: Tokens remaining in current window
x-ratelimit-reset-tokens: Duration until token window resets
x-request-id: Unique identifier for the request
retry-after: Optional retry delay in seconds

Example ¶

ExampleOpenAIParser demonstrates parsing OpenAI rate limit headers

package main

import (
	"fmt"
	"net/http"
	"time"

	"github.com/cecil-the-coder/ai-provider-kit/pkg/ratelimit"
)

func main() {
	// Simulate OpenAI response headers
	headers := http.Header{
		"X-Ratelimit-Limit-Requests":     []string{"60"},
		"X-Ratelimit-Remaining-Requests": []string{"58"},
		"X-Ratelimit-Reset-Requests":     []string{"6m0s"},
		"X-Ratelimit-Limit-Tokens":       []string{"90000"},
		"X-Ratelimit-Remaining-Tokens":   []string{"85000"},
		"X-Ratelimit-Reset-Tokens":       []string{"1m30s"},
		"X-Request-Id":                   []string{"req_abc123"},
	}

	// Create parser and parse headers
	parser := ratelimit.NewOpenAIParser()
	info, err := parser.Parse(headers, "gpt-4")
	if err != nil {
		fmt.Printf("Error parsing headers: %v\n", err)
		return
	}

	// Display parsed information
	fmt.Printf("Provider: %s\n", info.Provider)
	fmt.Printf("Model: %s\n", info.Model)
	fmt.Printf("Requests: %d / %d remaining\n", info.RequestsRemaining, info.RequestsLimit)
	fmt.Printf("Tokens: %d / %d remaining\n", info.TokensRemaining, info.TokensLimit)
	fmt.Printf("Request ID: %s\n", info.RequestID)
	fmt.Printf("Requests reset in: %v\n", time.Until(info.RequestsReset).Round(time.Second))
	fmt.Printf("Tokens reset in: %v\n", time.Until(info.TokensReset).Round(time.Second))

}

Output:
Provider: openai
Model: gpt-4
Requests: 58 / 60 remaining
Tokens: 85000 / 90000 remaining
Request ID: req_abc123
Requests reset in: 6m0s
Tokens reset in: 1m30s

Example (WithTracker) ¶

ExampleOpenAIParser_withTracker demonstrates using the parser with a rate limit tracker

package main

import (
	"fmt"
	"net/http"
	"time"

	"github.com/cecil-the-coder/ai-provider-kit/pkg/ratelimit"
)

func main() {
	// Create a tracker to manage rate limits
	tracker := ratelimit.NewTracker()

	// Simulate parsing response headers
	headers := http.Header{
		"X-Ratelimit-Limit-Requests":     []string{"60"},
		"X-Ratelimit-Remaining-Requests": []string{"5"},
		"X-Ratelimit-Reset-Requests":     []string{"30s"},
		"X-Ratelimit-Limit-Tokens":       []string{"90000"},
		"X-Ratelimit-Remaining-Tokens":   []string{"1000"},
		"X-Ratelimit-Reset-Tokens":       []string{"30s"},
	}

	parser := ratelimit.NewOpenAIParser()
	info, _ := parser.Parse(headers, "gpt-4")

	// Update tracker with parsed info
	tracker.Update(info)

	// Check if we can make a request
	if tracker.CanMakeRequest("gpt-4", 500) {
		fmt.Println("Request allowed")
	} else {
		waitTime := tracker.GetWaitTime("gpt-4")
		fmt.Printf("Rate limited. Retry after: %v\n", waitTime.Round(time.Second))
	}

	// Check if we should throttle (99% threshold)
	if tracker.ShouldThrottle("gpt-4", 0.99) {
		fmt.Println("Approaching rate limits - consider throttling")
	}

}

Output:
Request allowed

func NewOpenAIParser ¶

func NewOpenAIParser() *OpenAIParser

NewOpenAIParser creates a new OpenAI rate limit parser.

func (*OpenAIParser) Parse ¶

func (p *OpenAIParser) Parse(headers http.Header, model string) (*Info, error)

Parse extracts rate limit information from OpenAI response headers. It handles both request-based and token-based rate limits. Reset times are provided as duration strings (e.g., "6m0s") which are parsed and converted to absolute timestamps.

func (*OpenAIParser) ProviderName ¶

func (p *OpenAIParser) ProviderName() string

ProviderName returns "openai" as the provider identifier.

type OpenRouterInfo ¶ added in v1.0.60

type OpenRouterInfo struct {
	// CreditsLimit is the maximum credits available
	CreditsLimit float64 `json:"credits_limit,omitempty"`

	// CreditsRemaining is the number of credits remaining
	CreditsRemaining float64 `json:"credits_remaining,omitempty"`

	// IsFreeTier indicates if the account is on the free tier
	IsFreeTier bool `json:"is_free_tier,omitempty"`
}

OpenRouterInfo contains OpenRouter-specific rate limit fields. OpenRouter uses a credit-based system for rate limiting.

type OpenRouterParser ¶

type OpenRouterParser struct{}

OpenRouterParser implements the Parser interface for OpenRouter's rate limit headers. OpenRouter provides rate limit information in the following format:

x-ratelimit-limit: Maximum credits or requests allowed per time window
x-ratelimit-remaining: Credits or requests remaining in current window
x-ratelimit-reset: Milliseconds since epoch when the limit resets
x-ratelimit-requests: Optional request count limit
x-ratelimit-tokens: Optional token count limit

OpenRouter uses a credit-based system where different models consume different amounts of credits per request. The free tier has different limits than paid tiers.

Note: OpenRouter also provides a proactive rate limit checking endpoint at /api/v1/key which can be used to query rate limits without making actual model requests. This parser only handles rate limit information from response headers.

func NewOpenRouterParser ¶

func NewOpenRouterParser() *OpenRouterParser

NewOpenRouterParser creates a new OpenRouter rate limit parser.

func (*OpenRouterParser) Parse ¶

func (p *OpenRouterParser) Parse(headers http.Header, model string) (*Info, error)

Parse extracts rate limit information from OpenRouter response headers. OpenRouter uses a hybrid system that can include both credit-based and request-based limits.

Key differences from other providers:

Reset time is in MILLISECONDS since epoch (not seconds or duration)
May have credit-based limits (x-ratelimit-limit/remaining as floats)
May have request-based limits (x-ratelimit-requests)
May have token-based limits (x-ratelimit-tokens)
Free tier accounts may have different limits

func (*OpenRouterParser) ProviderName ¶

func (p *OpenRouterParser) ProviderName() string

ProviderName returns "openrouter" as the provider identifier.

type Parser ¶

type Parser interface {
	// Parse extracts rate limit information from HTTP response headers.
	// It takes the response headers and model name as input and returns
	// a populated Info struct or an error if parsing fails.
	Parse(headers http.Header, model string) (*Info, error)

	// ProviderName returns the name of the provider this parser handles.
	// This is used for logging and tracking purposes.
	ProviderName() string
}

Parser is the interface that must be implemented by provider-specific rate limit parsers. Each AI provider has different header formats and schemes for communicating rate limits, so each provider needs its own parser implementation.

type QwenParser ¶

type QwenParser struct {
	// contains filtered or unexported fields
}

QwenParser implements the Parser interface for Qwen's rate limit headers.

Qwen (DashScope API) uses a combination of:

OpenAI-compatible headers in compatible-mode (x-ratelimit-*)
DashScope-specific headers (dashscope-*, x-dashscope-*)
Request tracking headers (x-request-id, req-cost-time)
Standard retry-after headers for rate limit recovery

DISCOVERED HEADERS (through API testing):

x-request-id: Unique request identifier
req-cost-time: Request processing time in milliseconds
req-arrive-time: Request arrival timestamp
resp-start-time: Response start timestamp

LIKELY RATE LIMIT HEADERS (based on OpenAI-compatible mode):

x-ratelimit-limit-requests: Maximum requests allowed per time window
x-ratelimit-remaining-requests: Requests remaining in current window
x-ratelimit-reset-requests: Duration until window resets
x-ratelimit-limit-tokens: Maximum tokens allowed per time window
x-ratelimit-remaining-tokens: Tokens remaining in current window
x-ratelimit-reset-tokens: Duration until token window resets
retry-after: Seconds to wait before retrying (on 429 responses)

POTENTIAL DASHSCOPE-SPECIFIC HEADERS:

dashscope-ratelimit-*: Possible DashScope-specific rate limit headers
x-dashscope-*: Alternative DashScope header prefix

func NewQwenParser ¶

func NewQwenParser(logHeaders bool) *QwenParser

NewQwenParser creates a new Qwen rate limit parser. Set logHeaders to true to enable header logging for debugging and documentation.

func (*QwenParser) Parse ¶

func (p *QwenParser) Parse(headers http.Header, model string) (*Info, error)

Parse extracts rate limit information from Qwen response headers. It attempts multiple parsing strategies to handle undocumented header formats:

Standard x-ratelimit-* headers (OpenAI-compatible)
Qwen-specific qwen-ratelimit-* headers
Retry-After header for backoff timing

The implementation is deliberately flexible to accommodate various possible formats.

func (*QwenParser) ProviderName ¶

func (p *QwenParser) ProviderName() string

ProviderName returns "qwen" as the provider identifier.

type Tracker ¶

type Tracker struct {
	// contains filtered or unexported fields
}

Tracker provides thread-safe tracking of rate limit information across multiple models. It maintains the current rate limit state for each model and provides methods to check if requests can be made and when to retry.

func NewTracker ¶

func NewTracker() *Tracker

NewTracker creates a new Tracker instance for tracking rate limits.

func (*Tracker) CanMakeRequest ¶

func (t *Tracker) CanMakeRequest(model string, estimatedTokens int) bool

CanMakeRequest checks if a request can be made for the given model with the estimated number of tokens. It returns true if the request is likely to succeed based on current rate limits, false otherwise. This method is thread-safe.

func (*Tracker) Get ¶

func (t *Tracker) Get(model string) (*Info, bool)

Get retrieves the rate limit information for a specific model. It returns the Info and a boolean indicating whether the model was found. This method is thread-safe and can be called concurrently.

func (*Tracker) GetWaitTime ¶

func (t *Tracker) GetWaitTime(model string) time.Duration

GetWaitTime returns the duration to wait before the next request can be made for the given model. If no waiting is required, it returns 0. This method is thread-safe.

func (*Tracker) ShouldThrottle ¶

func (t *Tracker) ShouldThrottle(model string, threshold float64) bool

ShouldThrottle determines if requests should be throttled based on the current rate limit usage. The threshold parameter is a fraction between 0 and 1 giving the share of a limit that must be consumed before throttling begins. For example, threshold=0.8 means throttle once 80% of a limit is consumed. This method is thread-safe.
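
The threshold arithmetic can be sketched as follows. This is an interpretation of the description, not the package's code: consumedFraction and shouldThrottle are hypothetical helpers, treating an unknown limit (0) as "no throttling" is an assumption, and whether the real comparison is inclusive at exactly the threshold is not documented.

```go
package main

import "fmt"

// consumedFraction (hypothetical helper) returns the share of a limit already
// used, in the range [0, 1]. A non-positive limit is treated as unknown.
func consumedFraction(limit, remaining int) float64 {
	if limit <= 0 {
		return 0
	}
	used := limit - remaining
	if used < 0 {
		used = 0
	}
	return float64(used) / float64(limit)
}

// shouldThrottle reports whether the consumed share has reached the threshold.
func shouldThrottle(limit, remaining int, threshold float64) bool {
	return consumedFraction(limit, remaining) >= threshold
}

func main() {
	// 90 of 100 requests used: 90% consumed, above an 80% threshold.
	fmt.Println(shouldThrottle(100, 10, 0.8)) // true

	// 89000 of 90000 tokens used: about 98.9% consumed, below a 99% threshold.
	fmt.Println(shouldThrottle(90000, 1000, 0.99)) // false
}
```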

func (*Tracker) Update ¶

func (t *Tracker) Update(info *Info)

Update updates the rate limit information for a model. This method is thread-safe and can be called concurrently.
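
For readers curious how a thread-safe per-model store of this kind is commonly built, here is a minimal sketch using a map guarded by sync.RWMutex. It is a generic pattern, not the Tracker's actual implementation, whose fields are unexported; info, tracker, newTracker, update, and get are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// info is a hypothetical stand-in for ratelimit.Info.
type info struct {
	Model             string
	RequestsRemaining int
}

// tracker keeps the latest info per model behind a read-write mutex.
type tracker struct {
	mu      sync.RWMutex
	byModel map[string]info
}

func newTracker() *tracker {
	return &tracker{byModel: make(map[string]info)}
}

// update takes the write lock and replaces the stored entry for the model.
func (t *tracker) update(i info) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.byModel[i.Model] = i
}

// get takes the read lock, so concurrent readers do not block each other.
func (t *tracker) get(model string) (info, bool) {
	t.mu.RLock()
	defer t.mu.RUnlock()
	i, ok := t.byModel[model]
	return i, ok
}

func main() {
	t := newTracker()

	// Ten goroutines update the same model concurrently.
	var wg sync.WaitGroup
	for n := 0; n < 10; n++ {
		wg.Add(1)
		go func(remaining int) {
			defer wg.Done()
			t.update(info{Model: "gpt-4", RequestsRemaining: remaining})
		}(n)
	}
	wg.Wait()

	_, ok := t.get("gpt-4")
	fmt.Println(ok) // true
}
```

Storing values rather than pointers means a caller cannot mutate the tracked state after the lock is released; the real Tracker returns *Info, so it may make a different trade-off.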
