utils

package

Version: v1.0.62 (latest) | Published: Dec 24, 2025 | License: MIT | Imports: 8 | Imported by: 0

Repository

github.com/cecil-the-coder/ai-provider-kit

Documentation ¶

Overview ¶

Package utils provides utility functions for the AI Provider Kit: token estimation, tool call validation, and embedded error detection. These primitives let consumers make routing decisions and validate API interactions without imposing specific patterns.

Index ¶

Constants
Variables
func ByteThresholdForTokens(tokens int) int
func ContainsAnyPattern(body string, patterns []string) bool
func ContainsCommonErrors(body string) bool
func CountTokens(text, model string) (int, error)
func CountTokensFromMessages(messages []types.ChatMessage, model string) (int, error)
func EstimateTokensFast(text string, maxTokens int) int
func EstimateTokensFromBytes(byteCount int) int
func EstimateTokensFromMessages(messages []types.ChatMessage) int
func EstimateTokensFromString(s string) int
func FixMissingToolResponses(messages []types.ChatMessage, defaultResponse string) []types.ChatMessage
func GetPendingToolCalls(messages []types.ChatMessage) []types.ToolCall
func HasPendingToolCalls(messages []types.ChatMessage) bool
func Min(a, b int) int
func ProcessSlice[T, R any](items []T, mapper func(T) R) []R
func ProcessSliceWithPool[T, R any](pool *WorkerPool, items []T, mapper func(T) R) []R
type EmbeddedError
- func CheckCommonErrors(body string) *EmbeddedError
- func CheckEmbeddedErrors(body string, patterns []string) *EmbeddedError
- func (e *EmbeddedError) Error() string
type TiktokenCache
- func GetTiktokenCache() *TiktokenCache
- func (tc *TiktokenCache) CountMessagesTokens(messages []string, model string) (int, error)
- func (tc *TiktokenCache) CountTokens(text, model string) (int, error)
- func (tc *TiktokenCache) GetEncoder(model string) (*tiktoken.Tiktoken, error)
type ToolCallValidationError
- func ValidateToolCallSequence(messages []types.ChatMessage) []ToolCallValidationError
type WorkerPool
- func NewWorkerPool(maxWorkers int) *WorkerPool
- func (p *WorkerPool) Close()
- func (p *WorkerPool) Start()
- func (p *WorkerPool) Submit(task func())

Constants ¶

const (
	TokenThreshold4K   = 4096
	TokenThreshold8K   = 8192
	TokenThreshold16K  = 16384
	TokenThreshold32K  = 32768
	TokenThreshold128K = 131072
)

The TokenThreshold constants represent common context window sizes, in tokens.

const BytesPerToken = 4.7

BytesPerToken is the empirically derived average number of bytes per token. Consumers can use it for custom calculations.

Variables ¶

var CommonErrorPatterns = []string{
	"token quota is not enough",
	"rate limit exceeded",
	"context length exceeded",
	"insufficient_quota",
	"model_not_found",
	"invalid_api_key",
	"quota exceeded",
	"capacity exceeded",
	"overloaded",
}

CommonErrorPatterns provides default patterns for known provider errors. Consumers can use these or define their own.

Functions ¶

func ByteThresholdForTokens ¶

func ByteThresholdForTokens(tokens int) int

ByteThresholdForTokens converts a token threshold to an approximate byte size. It is useful for quick routing decisions based on content length.
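As a rough illustration of length-based routing, the sketch below reimplements the conversion locally rather than calling the package. The names byteThresholdForTokens and pickWindow are hypothetical, and truncating the product toward zero is an assumption; the package's own rounding may differ.

```go
package main

import "fmt"

// bytesPerToken mirrors the documented BytesPerToken constant.
const bytesPerToken = 4.7

// byteThresholdForTokens is a hypothetical stand-in for ByteThresholdForTokens.
// Truncating toward zero is an assumption; the real rounding may differ.
func byteThresholdForTokens(tokens int) int {
	return int(float64(tokens) * bytesPerToken)
}

// pickWindow returns the smallest listed context window (in tokens) whose
// byte threshold covers a payload of bodyLen bytes, or 0 if none does.
func pickWindow(bodyLen int) int {
	for _, window := range []int{4096, 8192, 16384, 32768, 131072} {
		if bodyLen <= byteThresholdForTokens(window) {
			return window
		}
	}
	return 0
}

func main() {
	fmt.Println(byteThresholdForTokens(4096)) // 19251
	fmt.Println(pickWindow(25_000))           // 8192
}
```

The window sizes are the values of the TokenThreshold constants listed above; a return value of 0 signals that the payload is too large for any of them under this estimate.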

func ContainsAnyPattern ¶

func ContainsAnyPattern(body string, patterns []string) bool

ContainsAnyPattern returns true if body contains any of the patterns. Matching is case-insensitive.
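One plausible way to get case-insensitive substring matching is to lower-case both sides before comparing. The sketch below does that with the standard library; containsAnyPattern is a hypothetical local stand-in, not the package's code.

```go
package main

import (
	"fmt"
	"strings"
)

// containsAnyPattern is a hypothetical stand-in for ContainsAnyPattern.
// Both sides are lower-cased so the comparison ignores case.
func containsAnyPattern(body string, patterns []string) bool {
	lowerBody := strings.ToLower(body)
	for _, p := range patterns {
		if strings.Contains(lowerBody, strings.ToLower(p)) {
			return true
		}
	}
	return false
}

func main() {
	patterns := []string{"rate limit exceeded", "overloaded"}
	fmt.Println(containsAnyPattern(`{"message":"Rate Limit Exceeded"}`, patterns)) // true
	fmt.Println(containsAnyPattern(`{"message":"hello"}`, patterns))               // false
}
```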

func ContainsCommonErrors ¶

func ContainsCommonErrors(body string) bool

ContainsCommonErrors returns true if body contains any common error patterns.

func CountTokens ¶ added in v1.0.60

func CountTokens(text, model string) (int, error)

CountTokens returns an accurate token count using tiktoken-based encoding. It is more accurate than EstimateTokensFromString but slower. The encoding is selected automatically based on the model name, and results are cached for performance.

When the Rust tokenizer is available (via -tags=rusttokenizer), it provides 3-15x faster tokenization compared to the pure Go implementation.

func CountTokensFromMessages ¶ added in v1.0.60

func CountTokensFromMessages(messages []types.ChatMessage, model string) (int, error)

CountTokensFromMessages returns an accurate token count for a slice of ChatMessage values using tiktoken. It encodes multiple messages in parallel and caches results. The encoding is selected automatically based on the model name.

When the Rust tokenizer is available (via -tags=rusttokenizer), it uses batch processing for even better performance.

func EstimateTokensFast ¶ added in v1.0.60

func EstimateTokensFast(text string, maxTokens int) int

EstimateTokensFast provides a fast character-based token estimate. For small payloads, full BPE encoding is unnecessary. This function is optimized for quick estimates on small text sizes.

func EstimateTokensFromBytes ¶

func EstimateTokensFromBytes(byteCount int) int

EstimateTokensFromBytes estimates token count from byte length, based on the empirical observation of roughly 4.7 bytes per token on average. This is a rough estimate, not exact tokenization.
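The arithmetic behind the estimate can be sketched as a single division. Both functions below are hypothetical local stand-ins; truncation toward zero and the idea that the string variant simply passes len(s) are assumptions, not confirmed behavior.

```go
package main

import "fmt"

// bytesPerToken mirrors the documented BytesPerToken constant.
const bytesPerToken = 4.7

// estimateTokensFromBytes is a hypothetical stand-in; truncating the
// quotient is an assumption about rounding.
func estimateTokensFromBytes(byteCount int) int {
	if byteCount <= 0 {
		return 0
	}
	return int(float64(byteCount) / bytesPerToken)
}

// estimateTokensFromString assumes the string variant feeds len(s), which
// counts bytes rather than runes, into the byte-based estimate.
func estimateTokensFromString(s string) int {
	return estimateTokensFromBytes(len(s))
}

func main() {
	fmt.Println(estimateTokensFromBytes(1000))     // 212
	fmt.Println(len("héllo"))                      // 6: é occupies two bytes in UTF-8
	fmt.Println(estimateTokensFromString("héllo")) // 1
}
```

Since the ratio is stated in bytes, non-ASCII text contributes more to the estimate per character than ASCII text does.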

func EstimateTokensFromMessages ¶

func EstimateTokensFromMessages(messages []types.ChatMessage) int

EstimateTokensFromMessages estimates total tokens across all messages. It uses GetTextContent() to extract text from both simple and multimodal messages, and parallelizes the estimation when there is more than one message.

func EstimateTokensFromString ¶

func EstimateTokensFromString(s string) int

EstimateTokensFromString estimates token count from string content.

func FixMissingToolResponses ¶

func FixMissingToolResponses(messages []types.ChatMessage, defaultResponse string) []types.ChatMessage

FixMissingToolResponses returns a new message slice with injected responses for any tool calls that don't have corresponding tool responses. Tool responses are inserted immediately after the assistant message containing the tool_calls, not at the end of the message array.
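The insertion rule described above can be sketched with simplified local types. Everything in the block is an assumption made for illustration: message and toolCall are hypothetical stand-ins for types.ChatMessage and types.ToolCall, whose real fields may differ, and the role strings "assistant" and "tool" follow the common chat-completion convention.

```go
package main

import "fmt"

// toolCall and message are simplified, hypothetical stand-ins for the
// package's types.ToolCall and types.ChatMessage.
type toolCall struct {
	ID   string
	Name string
}

type message struct {
	Role       string
	Content    string
	ToolCalls  []toolCall // set on assistant messages
	ToolCallID string     // set on tool messages
}

// fixMissingToolResponses returns a new slice in which every unanswered
// tool call gets a default tool message right after its assistant message.
func fixMissingToolResponses(messages []message, defaultResponse string) []message {
	// First pass: record which tool call IDs already have a response.
	answered := make(map[string]bool)
	for _, m := range messages {
		if m.Role == "tool" {
			answered[m.ToolCallID] = true
		}
	}
	// Second pass: copy messages, injecting responses where they are missing.
	fixed := make([]message, 0, len(messages))
	for _, m := range messages {
		fixed = append(fixed, m)
		for _, tc := range m.ToolCalls {
			if m.Role == "assistant" && !answered[tc.ID] {
				fixed = append(fixed, message{Role: "tool", Content: defaultResponse, ToolCallID: tc.ID})
			}
		}
	}
	return fixed
}

func main() {
	history := []message{
		{Role: "user", Content: "What's the weather?"},
		{Role: "assistant", ToolCalls: []toolCall{{ID: "call_1", Name: "get_weather"}}},
		{Role: "user", Content: "Never mind."},
	}
	for i, m := range fixMissingToolResponses(history, "tool call was not executed") {
		fmt.Println(i, m.Role, m.ToolCallID)
	}
}
```

In this sketch the injected message lands at index 2, between the assistant message and the following user message, and the input slice is left unmodified.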

func GetPendingToolCalls ¶

func GetPendingToolCalls(messages []types.ChatMessage) []types.ToolCall

GetPendingToolCalls returns tool calls that don't have responses yet.

func HasPendingToolCalls ¶

func HasPendingToolCalls(messages []types.ChatMessage) bool

HasPendingToolCalls returns true if there are tool calls without responses.

func Min ¶ added in v1.0.60

func Min(a, b int) int

Min returns the minimum of two integers.

func ProcessSlice ¶ added in v1.0.60

func ProcessSlice[T, R any](items []T, mapper func(T) R) []R

ProcessSlice concurrently processes a slice of items using the worker pool. It maps each item to a result using the provided mapper function. The results are returned in the same order as the input items.
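Order can be preserved under concurrency by having each task write to its own index of a preallocated result slice. The sketch below demonstrates that idea with one goroutine per item for brevity, which is a simplification: the documented function uses a worker pool, and processSlice here is a hypothetical local stand-in.

```go
package main

import (
	"fmt"
	"sync"
)

// processSlice is a hypothetical stand-in for ProcessSlice. It starts one
// goroutine per item (no bounded pool) and writes each result to the slot
// matching the input index, so output order equals input order.
func processSlice[T, R any](items []T, mapper func(T) R) []R {
	results := make([]R, len(items))
	var wg sync.WaitGroup
	for i, item := range items {
		wg.Add(1)
		go func(i int, item T) {
			defer wg.Done()
			results[i] = mapper(item) // each goroutine owns exactly one slot
		}(i, item)
	}
	wg.Wait()
	return results
}

func main() {
	squares := processSlice([]int{1, 2, 3, 4}, func(n int) int { return n * n })
	fmt.Println(squares) // [1 4 9 16]
}
```

The mapper must be safe to call from multiple goroutines at once, since calls may overlap.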

func ProcessSliceWithPool ¶ added in v1.0.60

func ProcessSliceWithPool[T, R any](pool *WorkerPool, items []T, mapper func(T) R) []R

ProcessSliceWithPool processes a slice using an existing worker pool. This is useful when you want to reuse a pool for multiple operations.

Types ¶

type EmbeddedError ¶

type EmbeddedError struct {
	Pattern string // The pattern that matched
	Context string // Surrounding text for debugging (up to 100 chars)
}

EmbeddedError represents an error found in the body of an otherwise successful response.

func CheckCommonErrors ¶

func CheckCommonErrors(body string) *EmbeddedError

CheckCommonErrors is a convenience function using CommonErrorPatterns.

func CheckEmbeddedErrors ¶

func CheckEmbeddedErrors(body string, patterns []string) *EmbeddedError

CheckEmbeddedErrors scans a response body for error patterns. It returns nil if no pattern matches; otherwise it returns the first match. Matching is case-insensitive.

func (*EmbeddedError) Error ¶

func (e *EmbeddedError) Error() string

Error implements the error interface.
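Because the check functions return a *EmbeddedError rather than an error, callers should keep a general Go rule in mind: a nil pointer stored in an interface value makes the interface non-nil. The sketch below shows the pitfall and one way around it, using hypothetical local stand-ins (embeddedError, check, asError) rather than the package itself.

```go
package main

import "fmt"

// embeddedError is a hypothetical stand-in for EmbeddedError.
type embeddedError struct {
	Pattern string
}

func (e *embeddedError) Error() string { return "embedded error: " + e.Pattern }

// check is a stand-in for a check function that finds nothing.
func check(body string) *embeddedError { return nil }

// asError converts the pointer result into an error value that is nil
// exactly when the pointer is nil.
func asError(e *embeddedError) error {
	if e == nil {
		return nil
	}
	return e
}

func main() {
	var direct error = check("ok")           // interface holding a nil *embeddedError
	fmt.Println(direct == nil)               // false
	fmt.Println(asError(check("ok")) == nil) // true
}
```

Comparing the pointer result against nil before converting it, as asError does, avoids the surprise.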

type TiktokenCache ¶ added in v1.0.60

type TiktokenCache struct {
	// contains filtered or unexported fields
}

TiktokenCache provides cached tiktoken-based token counting.

func GetTiktokenCache ¶ added in v1.0.60

func GetTiktokenCache() *TiktokenCache

GetTiktokenCache returns the singleton TiktokenCache instance.

func (*TiktokenCache) CountMessagesTokens ¶ added in v1.0.60

func (tc *TiktokenCache) CountMessagesTokens(messages []string, model string) (int, error)

CountMessagesTokens returns the token count for multiple messages, encoding them in parallel.

func (*TiktokenCache) CountTokens ¶ added in v1.0.60

func (tc *TiktokenCache) CountTokens(text, model string) (int, error)

CountTokens returns the token count for a single text, with caching.

func (*TiktokenCache) GetEncoder ¶ added in v1.0.60

func (tc *TiktokenCache) GetEncoder(model string) (*tiktoken.Tiktoken, error)

GetEncoder returns a cached tiktoken encoder for the model.

type ToolCallValidationError ¶

type ToolCallValidationError struct {
	ToolCallID   string
	ToolName     string
	MessageIndex int
	Issue        string // "missing_response", "orphan_response", etc.
}

ToolCallValidationError represents a missing or invalid tool response.

func ValidateToolCallSequence ¶

func ValidateToolCallSequence(messages []types.ChatMessage) []ToolCallValidationError

ValidateToolCallSequence checks whether all tool calls have matching responses. It returns nil if the sequence is valid; otherwise it returns a slice of validation errors.
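To make the two documented issue values concrete, the sketch below validates a history built from simplified local types. The types and the function are hypothetical stand-ins, the role strings are assumed, and the order in which errors are reported is a choice made for this example only.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the package's message types.
type toolCall struct{ ID, Name string }

type message struct {
	Role       string
	ToolCalls  []toolCall // set on assistant messages
	ToolCallID string     // set on tool messages
}

type validationError struct {
	ToolCallID   string
	ToolName     string
	MessageIndex int
	Issue        string
}

// validateToolCallSequence reports tool responses with no preceding call
// ("orphan_response") and calls with no response ("missing_response").
// It returns nil when every call is answered and every response is expected.
func validateToolCallSequence(messages []message) []validationError {
	var errs []validationError
	called := make(map[string]bool)
	answered := make(map[string]bool)
	for i, m := range messages {
		switch m.Role {
		case "assistant":
			for _, tc := range m.ToolCalls {
				called[tc.ID] = true
			}
		case "tool":
			if !called[m.ToolCallID] {
				errs = append(errs, validationError{ToolCallID: m.ToolCallID, MessageIndex: i, Issue: "orphan_response"})
				continue
			}
			answered[m.ToolCallID] = true
		}
	}
	for i, m := range messages {
		if m.Role != "assistant" {
			continue
		}
		for _, tc := range m.ToolCalls {
			if !answered[tc.ID] {
				errs = append(errs, validationError{ToolCallID: tc.ID, ToolName: tc.Name, MessageIndex: i, Issue: "missing_response"})
			}
		}
	}
	return errs
}

func main() {
	history := []message{
		{Role: "assistant", ToolCalls: []toolCall{{ID: "call_1", Name: "get_weather"}}},
		{Role: "tool", ToolCallID: "call_x"},
	}
	for _, e := range validateToolCallSequence(history) {
		fmt.Println(e.MessageIndex, e.ToolCallID, e.Issue)
	}
}
```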

type WorkerPool ¶ added in v1.0.60

type WorkerPool struct {
	// contains filtered or unexported fields
}

WorkerPool manages a pool of worker goroutines for concurrent task execution. It automatically scales based on the number of available CPU cores and provides a simple interface for parallelizing workloads.

func NewWorkerPool ¶ added in v1.0.60

func NewWorkerPool(maxWorkers int) *WorkerPool

NewWorkerPool creates a new worker pool with the specified number of workers. If maxWorkers is 0 or negative, it defaults to the number of available CPUs.

func (*WorkerPool) Close ¶ added in v1.0.60

func (p *WorkerPool) Close()

Close gracefully shuts down the worker pool. It waits for all submitted tasks to complete before returning.

func (*WorkerPool) Start ¶ added in v1.0.60

func (p *WorkerPool) Start()

Start initializes the worker pool's goroutines. This method is idempotent and can be called multiple times safely.

func (*WorkerPool) Submit ¶ added in v1.0.60

func (p *WorkerPool) Submit(task func())

Submit adds a task to the worker pool queue. The task will be executed by one of the available workers. If the pool hasn't been started yet, it will be started automatically.
