Documentation ¶
Overview ¶
Package llmproxy provides a pluggable, composable library for proxying requests to upstream LLM providers.
The library uses small, focused interfaces that can be mixed and matched to create custom provider implementations. It supports OpenAI-compatible APIs out of the box and can be extended for provider-specific behaviors.
Core concepts:
- BodyParser: Extracts metadata from request bodies
- RequestEnricher: Modifies outgoing requests (headers, etc.)
- ResponseExtractor: Extracts metadata from responses
- URLResolver: Determines the upstream provider URL
- Provider: Composes the above components
- Interceptor: Wraps the request/response flow for cross-cutting concerns
Basic usage:
provider, _ := openai.New("sk-your-key")
proxy := llmproxy.NewProxy(provider,
    llmproxy.WithInterceptor(interceptors.NewLogging(nil)),
)
resp, meta, _ := proxy.Forward(ctx, req)
Index ¶
- type BaseProvider
- type BillingResult
- type BodyMetadata
- type BodyParser
- type CacheDetail
- type CacheUsage
- type Choice
- type CostInfo
- type CostLookup
- type Interceptor
- type InterceptorChain
- type Logger
- type LoggerFunc
- type MapRegistry
- type Message
- type MetaContextKey
- type MetaContextValue
- type Provider
- type ProviderOption
- type Proxy
- type ProxyOption
- type Registry
- type RequestEnricher
- type ResponseExtractor
- type ResponseMetadata
- type RoundTripFunc
- type URLResolver
- type Usage
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type BaseProvider ¶
type BaseProvider struct {
// contains filtered or unexported fields
}
BaseProvider provides a configurable implementation of Provider. It allows setting individual components via functional options, making it easy to mix and match behaviors.
Use NewBaseProvider with With* options to create a custom provider:
provider := NewBaseProvider("my-provider",
    WithBodyParser(myParser),
    WithRequestEnricher(myEnricher),
)
func NewBaseProvider ¶
func NewBaseProvider(name string, opts ...ProviderOption) *BaseProvider
NewBaseProvider creates a new provider with the given name and options. Unset components will return nil from their accessor methods.
func (*BaseProvider) BodyParser ¶
func (p *BaseProvider) BodyParser() BodyParser
BodyParser returns the configured body parser, or nil if not set.
func (*BaseProvider) RequestEnricher ¶
func (p *BaseProvider) RequestEnricher() RequestEnricher
RequestEnricher returns the configured request enricher, or nil if not set.
func (*BaseProvider) ResponseExtractor ¶
func (p *BaseProvider) ResponseExtractor() ResponseExtractor
ResponseExtractor returns the configured response extractor, or nil if not set.
func (*BaseProvider) URLResolver ¶
func (p *BaseProvider) URLResolver() URLResolver
URLResolver returns the configured URL resolver, or nil if not set.
type BillingResult ¶
type BillingResult struct {
// Provider is the provider name.
Provider string
// Model is the model identifier.
Model string
// PromptTokens is the number of input tokens.
PromptTokens int
// CompletionTokens is the number of output tokens.
CompletionTokens int
// CachedTokens is the number of prompt tokens served from cache.
CachedTokens int
// TotalTokens is the sum of prompt and completion tokens.
TotalTokens int
// InputCost is the calculated input cost in USD (non-cached prompt tokens).
InputCost float64
// CachedInputCost is the cost for cached prompt tokens in USD.
CachedInputCost float64
// OutputCost is the calculated output cost in USD.
OutputCost float64
// TotalCost is the sum of all costs in USD.
TotalCost float64
}
BillingResult contains the calculated cost for a request.
func CalculateCost ¶
func CalculateCost(provider, model string, costInfo CostInfo, promptTokens, completionTokens int, cacheUsage *CacheUsage) BillingResult
CalculateCost computes the billing result from cost info, token usage, and cache usage. Cached tokens are billed at the CacheRead rate (if available), and non-cached prompt tokens are billed at the full Input rate.
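As a sketch of the arithmetic, the documented rule can be expressed as follows. The types are local, simplified copies of CostInfo and BillingResult, the function body is illustrative rather than the package's implementation, and the pricing figures in main are hypothetical:

```go
package main

import "fmt"

// Illustrative local copies of the documented types.
type CostInfo struct {
	Input, Output, CacheRead, CacheWrite float64 // USD per 1M tokens
}

type BillingResult struct {
	InputCost, CachedInputCost, OutputCost, TotalCost float64
}

// calculateCost applies the documented billing rule: cached prompt tokens
// are billed at the CacheRead rate, the remaining prompt tokens at the
// full Input rate, and completion tokens at the Output rate.
func calculateCost(ci CostInfo, promptTokens, completionTokens, cachedTokens int) BillingResult {
	const perMillion = 1_000_000.0
	var r BillingResult
	nonCached := promptTokens - cachedTokens
	r.InputCost = float64(nonCached) / perMillion * ci.Input
	r.CachedInputCost = float64(cachedTokens) / perMillion * ci.CacheRead
	r.OutputCost = float64(completionTokens) / perMillion * ci.Output
	r.TotalCost = r.InputCost + r.CachedInputCost + r.OutputCost
	return r
}

func main() {
	// Hypothetical pricing: $2.50/1M input, $1.25/1M cached reads, $10/1M output.
	ci := CostInfo{Input: 2.50, Output: 10.0, CacheRead: 1.25}
	// 1000 prompt tokens (400 of them cached) and 200 completion tokens.
	r := calculateCost(ci, 1000, 200, 400)
	fmt.Printf("input=%.4f cached=%.4f output=%.4f total=%.4f\n",
		r.InputCost, r.CachedInputCost, r.OutputCost, r.TotalCost)
}
```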
type BodyMetadata ¶
type BodyMetadata struct {
// Model is the requested model identifier (e.g., "gpt-4", "claude-3-opus").
Model string `json:"model"`
// Messages contains the conversation history for chat completions.
Messages []Message `json:"messages,omitempty"`
// MaxTokens is the maximum number of tokens to generate.
MaxTokens int `json:"max_tokens,omitempty"`
// Stream indicates whether streaming is requested.
Stream bool `json:"stream"`
// Custom holds provider-specific fields that don't map to standard fields.
Custom map[string]any `json:"-"`
}
BodyMetadata contains extracted metadata from a parsed request body. It provides a common structure that works across different LLM providers while allowing provider-specific fields via the Custom map.
type BodyParser ¶
type BodyParser interface {
// Parse reads the request body and extracts metadata.
// It returns the parsed metadata, the raw body bytes (for later use),
// and any error encountered during parsing.
//
// The caller is responsible for closing the body ReadCloser.
Parse(body io.ReadCloser) (BodyMetadata, []byte, error)
}
BodyParser extracts metadata from a request body.
Since io.ReadCloser can only be read once, Parse returns both the extracted metadata and the raw body bytes. The caller is responsible for reconstructing the body for the upstream request.
Implementations should handle provider-specific JSON formats and map them to the common BodyMetadata structure.
type CacheDetail ¶ added in v0.0.3
type CacheDetail struct {
// TTL is the time-to-live for the cache entry (e.g., "5m", "1h").
TTL string `json:"ttl,omitempty"`
// CacheWriteTokens is the number of tokens written to cache at this TTL.
CacheWriteTokens int `json:"cache_write_tokens,omitempty"`
}
CacheDetail contains cache details for a checkpoint (Bedrock).
type CacheUsage ¶ added in v0.0.3
type CacheUsage struct {
// CachedTokens is the number of tokens served from cache (OpenAI).
CachedTokens int `json:"cached_tokens,omitempty"`
// CacheCreationInputTokens is the number of tokens written to cache (Anthropic).
CacheCreationInputTokens int `json:"cache_creation_input_tokens,omitempty"`
// CacheReadInputTokens is the number of tokens read from cache (Anthropic).
CacheReadInputTokens int `json:"cache_read_input_tokens,omitempty"`
// Ephemeral5mInputTokens is the number of 5-minute cache write tokens (Anthropic).
Ephemeral5mInputTokens int `json:"ephemeral_5m_input_tokens,omitempty"`
// Ephemeral1hInputTokens is the number of 1-hour cache write tokens (Anthropic).
Ephemeral1hInputTokens int `json:"ephemeral_1h_input_tokens,omitempty"`
// CacheWriteTokens is the number of tokens written to cache (Bedrock).
CacheWriteTokens int `json:"cache_write_tokens,omitempty"`
// CacheDetails contains TTL-based cache write breakdown (Bedrock).
CacheDetails []CacheDetail `json:"cache_details,omitempty"`
}
CacheUsage tracks prompt caching token consumption.
type Choice ¶
type Choice struct {
// Index is the position of this choice in the choices array.
Index int `json:"index"`
// Message contains the completed message (for non-streaming responses).
Message *Message `json:"message,omitempty"`
// Delta contains the partial message (for streaming responses).
Delta *Message `json:"delta,omitempty"`
// FinishReason indicates why the completion stopped (e.g., "stop", "length").
FinishReason string `json:"finish_reason"`
}
Choice represents a single completion choice in the response.
type CostInfo ¶
type CostInfo struct {
// Input is the cost per 1M input tokens in USD.
Input float64
// Output is the cost per 1M output tokens in USD.
Output float64
// CacheRead is the cost per 1M cached input tokens (optional).
CacheRead float64
// CacheWrite is the cost per 1M cache write tokens (optional, Anthropic).
CacheWrite float64
}
CostInfo contains pricing information for a model.
type CostLookup ¶
CostLookup is a function that returns the cost for a given provider and model. It should return the pricing info and true when the model is found, or false when it is not.
The lookup function allows the pricing data to be managed externally, such as downloading from models.dev or using a custom pricing database.
type Interceptor ¶
type Interceptor interface {
// Intercept processes a request through the interceptor chain.
//
// Parameters:
// - req: The HTTP request to send upstream
// - meta: Parsed metadata from the request body
// - rawBody: The original request body bytes
// - next: The next handler in the chain (call this to continue)
//
// Returns:
// - resp: The HTTP response (body will be re-attached from rawRespBody)
// - respMeta: Parsed response metadata
// - rawRespBody: The raw response body bytes
// - error: Any error that occurred
Intercept(req *http.Request, meta BodyMetadata, rawBody []byte, next RoundTripFunc) (resp *http.Response, respMeta ResponseMetadata, rawRespBody []byte, err error)
}
Interceptor wraps the request/response cycle for cross-cutting concerns.
Interceptors form a chain around the actual request execution, allowing behavior to be added before and after the upstream call. Common uses include:
- Logging request/response details
- Collecting metrics (latency, token usage)
- Retrying failed requests
- Rate limiting
- Caching responses
Interceptors must call next(req) to continue the chain. Not calling next will short-circuit the request (useful for caching or mocking).
type InterceptorChain ¶
type InterceptorChain []Interceptor
InterceptorChain is an ordered list of interceptors that are applied in sequence. Interceptors are applied in reverse order during wrapping so that they execute in forward order during request processing.
func (InterceptorChain) Wrap ¶
func (c InterceptorChain) Wrap(final RoundTripFunc) RoundTripFunc
Wrap chains all interceptors around the final RoundTripFunc. Interceptors are wrapped in reverse order so they execute in forward order.
Example: Given interceptors [A, B, C] and final function F:
- Wrapping produces: A(B(C(F)))
- Execution order: A -> B -> C -> F -> C -> B -> A
type Logger ¶
type Logger interface {
// Debug level logging
Debug(msg string, args ...interface{})
// Info level logging
Info(msg string, args ...interface{})
// Warning level logging
Warn(msg string, args ...interface{})
// Error level logging
Error(msg string, args ...interface{})
}
Logger is an interface for logging. It matches the interface from github.com/agentuity/go-common/logger. Any logger implementing this interface can be used with interceptors.
type LoggerFunc ¶
LoggerFunc is an adapter to allow using ordinary functions as loggers.
func (LoggerFunc) Debug ¶
func (f LoggerFunc) Debug(msg string, args ...interface{})
func (LoggerFunc) Error ¶
func (f LoggerFunc) Error(msg string, args ...interface{})
func (LoggerFunc) Info ¶
func (f LoggerFunc) Info(msg string, args ...interface{})
func (LoggerFunc) Warn ¶
func (f LoggerFunc) Warn(msg string, args ...interface{})
type MapRegistry ¶
type MapRegistry struct {
// contains filtered or unexported fields
}
MapRegistry is a simple registry that stores providers by name. It provides thread-safe registration and lookup.
func (*MapRegistry) Get ¶
func (r *MapRegistry) Get(name string) (Provider, bool)
Get retrieves a provider by name.
func (*MapRegistry) Match ¶
func (r *MapRegistry) Match(req *http.Request) (Provider, error)
Match is not implemented for MapRegistry and returns a nil provider. Use a more sophisticated implementation for request-based routing.
func (*MapRegistry) Register ¶
func (r *MapRegistry) Register(p Provider)
Register adds a provider to the registry under its name. If a provider with the same name exists, it is replaced.
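The documented behavior (thread-safe registration keyed by name, last registration winning) can be sketched as follows; the struct and helper names are illustrative, and Provider is reduced to its Name method:

```go
package main

import (
	"fmt"
	"sync"
)

// Provider is reduced to the one method this sketch needs.
type Provider interface{ Name() string }

// named is a trivial Provider for demonstration.
type named string

func (n named) Name() string { return string(n) }

// mapRegistry mirrors the documented MapRegistry behavior.
type mapRegistry struct {
	mu        sync.RWMutex
	providers map[string]Provider
}

func newMapRegistry() *mapRegistry {
	return &mapRegistry{providers: make(map[string]Provider)}
}

// Register stores the provider under its name, replacing any existing entry.
func (r *mapRegistry) Register(p Provider) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.providers[p.Name()] = p
}

// Get retrieves a provider by name under a read lock.
func (r *mapRegistry) Get(name string) (Provider, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	p, ok := r.providers[name]
	return p, ok
}

func main() {
	reg := newMapRegistry()
	reg.Register(named("openai"))
	reg.Register(named("anthropic"))
	if p, ok := reg.Get("openai"); ok {
		fmt.Println("found:", p.Name())
	}
}
```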
type Message ¶
type Message struct {
// Role is the role of the message author (e.g., "user", "assistant", "system").
Role string `json:"role"`
// Content is the text content of the message.
Content string `json:"content"`
}
Message represents a single message in a chat completion request.
type MetaContextKey ¶
type MetaContextKey struct{}
MetaContextKey is the context key for storing request metadata.
type MetaContextValue ¶
type MetaContextValue struct {
Meta BodyMetadata
RawBody []byte
OrgID string
}
MetaContextValue holds the metadata stored in request context.
func GetMetaFromContext ¶
func GetMetaFromContext(ctx context.Context) MetaContextValue
GetMetaFromContext retrieves the metadata stored in a context. Returns an empty MetaContextValue if the context is nil or doesn't contain metadata.
type Provider ¶
type Provider interface {
// Name returns the provider's unique identifier (e.g., "openai", "anthropic").
Name() string
// BodyParser returns the parser for extracting request metadata.
BodyParser() BodyParser
// RequestEnricher returns the enricher for modifying outgoing requests.
RequestEnricher() RequestEnricher
// ResponseExtractor returns the extractor for parsing responses.
ResponseExtractor() ResponseExtractor
// URLResolver returns the resolver for determining upstream URLs.
URLResolver() URLResolver
}
Provider composes all the components needed to handle requests for an LLM provider.
A provider brings together:
- BodyParser: To extract request metadata
- RequestEnricher: To modify outgoing requests
- ResponseExtractor: To parse responses
- URLResolver: To determine upstream URLs
Implementations can use BaseProvider for a configurable default, or implement Provider directly for complete control.
type ProviderOption ¶
type ProviderOption func(*BaseProvider)
ProviderOption configures a BaseProvider during construction.
func WithBodyParser ¶
func WithBodyParser(bp BodyParser) ProviderOption
WithBodyParser sets the body parser for the provider.
func WithRequestEnricher ¶
func WithRequestEnricher(re RequestEnricher) ProviderOption
WithRequestEnricher sets the request enricher for the provider.
func WithResponseExtractor ¶
func WithResponseExtractor(re ResponseExtractor) ProviderOption
WithResponseExtractor sets the response extractor for the provider.
func WithURLResolver ¶
func WithURLResolver(ur URLResolver) ProviderOption
WithURLResolver sets the URL resolver for the provider.
type Proxy ¶
type Proxy struct {
// contains filtered or unexported fields
}
Proxy forwards requests to an upstream LLM provider.
Proxy handles the complete request lifecycle:
- Reads and parses the request body
- Resolves the upstream URL
- Creates and enriches the upstream request
- Executes the request through the interceptor chain
- Extracts metadata from the response
- Re-attaches the raw response body
Use NewProxy with functional options to configure:
proxy := NewProxy(provider,
    WithInterceptor(loggingInterceptor),
    WithHTTPClient(customClient),
)
func NewProxy ¶
func NewProxy(provider Provider, opts ...ProxyOption) *Proxy
NewProxy creates a new proxy for the given provider. Options can be used to add interceptors or customize the HTTP client.
func (*Proxy) Forward ¶
func (p *Proxy) Forward(ctx context.Context, req *http.Request) (*http.Response, ResponseMetadata, error)
Forward sends a request to the upstream provider and returns the response.
The method:
- Reads and parses the request body to extract metadata
- Resolves the upstream URL based on the metadata
- Creates a new request for the upstream, copying headers
- Enriches the request with provider-specific headers
- Executes the request through the interceptor chain
- Extracts metadata from the response
- Re-attaches the raw response body so the caller can read it
The returned response body contains the original raw bytes from the upstream and can be read by the caller. Any custom/unsupported fields in the JSON are preserved.
type ProxyOption ¶
type ProxyOption func(*Proxy)
ProxyOption configures a Proxy during construction.
func WithHTTPClient ¶
func WithHTTPClient(c *http.Client) ProxyOption
WithHTTPClient sets a custom HTTP client for upstream requests. If not set, http.DefaultClient is used.
func WithInterceptor ¶
func WithInterceptor(i Interceptor) ProxyOption
WithInterceptor adds an interceptor to the global chain. Interceptors are applied in the order they are added.
type Registry ¶
type Registry interface {
// Register adds a provider to the registry.
Register(p Provider)
// Get retrieves a provider by name.
// Returns the provider and true if found, nil and false otherwise.
Get(name string) (Provider, bool)
// Match selects a provider for the given request.
// Implementations may parse the request body to determine routing.
// Returns the matched provider, or an error if no match is found.
Match(req *http.Request) (Provider, error)
}
Registry manages a collection of providers and supports routing requests to the appropriate provider based on request characteristics.
type RequestEnricher ¶
type RequestEnricher interface {
// Enrich modifies the request with provider-specific enhancements.
// The meta parameter contains parsed body metadata for decision-making.
// The rawBody contains the original request body bytes.
//
// Implementations should modify req in place (headers, URL, etc.)
// and return nil on success, or an error to abort the request.
Enrich(req *http.Request, meta BodyMetadata, rawBody []byte) error
}
RequestEnricher modifies an outgoing request before it's sent to the upstream provider.
Typical uses include:
- Setting authentication headers (Authorization, X-API-Key, etc.)
- Adding provider-specific headers
- Modifying the request body or URL
The rawBody is provided for cases where the enricher needs to modify the body content.
type ResponseExtractor ¶
type ResponseExtractor interface {
// Extract parses the HTTP response and returns unified metadata.
//
// The method reads and consumes the response body, parses it for metadata,
// and returns both the metadata and the raw body bytes. The proxy will
// re-attach the raw bytes to the response so the caller can read them.
//
// Parameters:
// - resp: The HTTP response from the upstream provider
//
// Returns:
// - metadata: Parsed response metadata (tokens, model, etc.)
// - rawBody: The original response body bytes (must be returned for forwarding)
// - error: Any parsing error
Extract(resp *http.Response) (metadata ResponseMetadata, rawBody []byte, err error)
}
ResponseExtractor parses an upstream provider response and extracts metadata.
Implementations handle provider-specific response formats and map them to the common ResponseMetadata structure. This allows the proxy to track token usage, costs, and other metrics in a provider-agnostic way.
The extractor must return the raw response body bytes so the proxy can re-attach them to the response for the caller. This preserves any custom/unsupported fields in the original JSON.
type ResponseMetadata ¶
type ResponseMetadata struct {
// ID is the unique identifier for the response (provider-specific).
ID string `json:"id,omitempty"`
// Object is the object type (e.g., "chat.completion").
Object string `json:"object,omitempty"`
// Model is the model used for the completion.
Model string `json:"model,omitempty"`
// Usage contains token consumption statistics.
Usage Usage `json:"usage"`
// Choices contains the completion choices.
Choices []Choice `json:"choices,omitempty"`
// Custom holds provider-specific response fields.
Custom map[string]any `json:"-"`
}
ResponseMetadata contains extracted metadata from a provider response. It provides a unified view of response data across different providers.
type RoundTripFunc ¶
RoundTripFunc is the signature for executing a request through the chain. It returns the response, metadata, raw response body, and error.
type URLResolver ¶
type URLResolver interface {
// Resolve returns the upstream URL for the given request metadata.
// The returned URL should be the full endpoint for the completion request.
//
// Implementations can use metadata fields (like Model) to make routing decisions.
Resolve(meta BodyMetadata) (*url.URL, error)
}
URLResolver determines the upstream provider URL for a given request.
This allows routing requests to different endpoints based on the request metadata, such as model name. Some providers may use different endpoints for different models or have region-specific URLs.
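A resolver that routes by model-name prefix illustrates the idea. The base URLs below are hypothetical placeholders, not endpoints the package defines, and BodyMetadata is reduced to the one field consulted:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// BodyMetadata is reduced to the field this sketch consults.
type BodyMetadata struct{ Model string }

// prefixResolver routes by model-name prefix; the URLs are placeholders.
type prefixResolver struct{}

func (prefixResolver) Resolve(meta BodyMetadata) (*url.URL, error) {
	base := "https://api.openai.example/v1/chat/completions"
	if strings.HasPrefix(meta.Model, "claude") {
		base = "https://api.anthropic.example/v1/messages"
	}
	return url.Parse(base)
}

func main() {
	u, err := prefixResolver{}.Resolve(BodyMetadata{Model: "claude-3-opus"})
	if err != nil {
		panic(err)
	}
	fmt.Println(u.Host)
}
```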
type Usage ¶
type Usage struct {
// PromptTokens is the number of tokens in the prompt.
PromptTokens int `json:"prompt_tokens"`
// CompletionTokens is the number of tokens generated in the completion.
CompletionTokens int `json:"completion_tokens"`
// TotalTokens is the sum of prompt and completion tokens.
TotalTokens int `json:"total_tokens"`
}
Usage tracks token consumption for a completion request.
Source Files ¶
Directories ¶

| Path | Synopsis |
|---|---|
| examples | |
| examples/basic (command) | Example_basic demonstrates a basic proxy setup with multiple providers. |
| pricing | |
| pricing/modelsdev | Package modelsdev provides an adapter for loading model pricing data from models.dev (https://models.dev/api.json). |
| providers | |
| providers/anthropic | Package anthropic provides a provider implementation for Anthropic's Claude API. |
| providers/bedrock | Package bedrock provides a provider implementation for AWS Bedrock. |
| providers/fireworks | Package fireworks provides a provider implementation for Fireworks AI's API. |
| providers/googleai | Package googleai provides a provider implementation for Google AI's Gemini API. |
| providers/groq | Package groq provides a provider implementation for Groq's API. |
| providers/openai | Package openai provides a provider implementation for OpenAI's API. |
| providers/openai_compatible | Package openai_compatible provides a reusable implementation for LLM providers that use OpenAI-compatible APIs. |
| providers/xai | Package xai provides a provider implementation for x.AI's Grok API. |