Documentation ¶
Overview ¶
Package llmproxy provides a pluggable, composable library for proxying requests to upstream LLM providers.
The library uses small, focused interfaces that can be mixed and matched to create custom provider implementations. It supports OpenAI-compatible APIs out of the box and can be extended for provider-specific behaviors.
Core concepts:
- BodyParser: Extracts metadata from request bodies
- RequestEnricher: Modifies outgoing requests (headers, etc.)
- ResponseExtractor: Extracts metadata from responses
- URLResolver: Determines the upstream provider URL
- Provider: Composes the above components
- Interceptor: Wraps the request/response flow for cross-cutting concerns
Basic usage:
provider, _ := openai.New("sk-your-key")
proxy := llmproxy.NewProxy(provider,
    llmproxy.WithInterceptor(interceptors.NewLogging(nil)),
)
resp, meta, _ := proxy.Forward(ctx, req)
Index ¶
- type BaseProvider
- type BillingResult
- type BodyMetadata
- type BodyParser
- type CacheDetail
- type CacheUsage
- type Choice
- type CostInfo
- type CostLookup
- type Interceptor
- type InterceptorChain
- type Logger
- type LoggerFunc
- type MapRegistry
- type Message
- type MetaContextKey
- type MetaContextValue
- type Provider
- type ProviderOption
- type Proxy
- type ProxyOption
- type Registry
- type RequestEnricher
- type ResponseExtractor
- type ResponseMetadata
- type RoundTripFunc
- type URLResolver
- type Usage
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type BaseProvider ¶
type BaseProvider struct {
// contains filtered or unexported fields
}
BaseProvider provides a configurable implementation of Provider. It allows setting individual components via functional options, making it easy to mix and match behaviors.
Use NewBaseProvider with With* options to create a custom provider:
provider := NewBaseProvider("my-provider",
    WithBodyParser(myParser),
    WithRequestEnricher(myEnricher),
)
func NewBaseProvider ¶
func NewBaseProvider(name string, opts ...ProviderOption) *BaseProvider
NewBaseProvider creates a new provider with the given name and options. Unset components will return nil from their accessor methods.
func (*BaseProvider) BodyParser ¶
func (p *BaseProvider) BodyParser() BodyParser
BodyParser returns the configured body parser, or nil if not set.
func (*BaseProvider) RequestEnricher ¶
func (p *BaseProvider) RequestEnricher() RequestEnricher
RequestEnricher returns the configured request enricher, or nil if not set.
func (*BaseProvider) ResponseExtractor ¶
func (p *BaseProvider) ResponseExtractor() ResponseExtractor
ResponseExtractor returns the configured response extractor, or nil if not set.
func (*BaseProvider) URLResolver ¶
func (p *BaseProvider) URLResolver() URLResolver
URLResolver returns the configured URL resolver, or nil if not set.
type BillingResult ¶
type BillingResult struct {
// Provider is the provider name.
Provider string
// Model is the model identifier.
Model string
// PromptTokens is the number of input tokens.
PromptTokens int
// CompletionTokens is the number of output tokens.
CompletionTokens int
// CachedTokens is the number of prompt tokens served from cache.
CachedTokens int
// TotalTokens is the sum of prompt and completion tokens.
TotalTokens int
// InputCost is the calculated input cost in USD (non-cached prompt tokens).
InputCost float64
// CachedInputCost is the cost for cached prompt tokens in USD.
CachedInputCost float64
// OutputCost is the calculated output cost in USD.
OutputCost float64
// TotalCost is the sum of all costs in USD.
TotalCost float64
}
BillingResult contains the calculated cost for a request.
func CalculateCost ¶
func CalculateCost(provider, model string, costInfo CostInfo, promptTokens, completionTokens int, cacheUsage *CacheUsage) BillingResult
CalculateCost computes the billing result from cost info, token usage, and cache usage. Cached tokens are billed at the CacheRead rate (if available), and non-cached prompt tokens are billed at the full Input rate.
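As a sketch of the arithmetic, the documented rule can be expressed as follows. The types are local, simplified copies of CostInfo and BillingResult, the function body is illustrative rather than the package's implementation, and the pricing figures in main are hypothetical:

```go
package main

import "fmt"

// Illustrative local copies of the documented types.
type CostInfo struct {
	Input, Output, CacheRead, CacheWrite float64 // USD per 1M tokens
}

type BillingResult struct {
	InputCost, CachedInputCost, OutputCost, TotalCost float64
}

// calculateCost applies the documented billing rule: cached prompt tokens
// are billed at the CacheRead rate, the remaining prompt tokens at the
// full Input rate, and completion tokens at the Output rate.
func calculateCost(ci CostInfo, promptTokens, completionTokens, cachedTokens int) BillingResult {
	const perMillion = 1_000_000.0
	var r BillingResult
	nonCached := promptTokens - cachedTokens
	r.InputCost = float64(nonCached) / perMillion * ci.Input
	r.CachedInputCost = float64(cachedTokens) / perMillion * ci.CacheRead
	r.OutputCost = float64(completionTokens) / perMillion * ci.Output
	r.TotalCost = r.InputCost + r.CachedInputCost + r.OutputCost
	return r
}

func main() {
	// Hypothetical pricing: $2.50/1M input, $1.25/1M cached reads, $10/1M output.
	ci := CostInfo{Input: 2.50, Output: 10.0, CacheRead: 1.25}
	// 1000 prompt tokens (400 of them cached) and 200 completion tokens.
	r := calculateCost(ci, 1000, 200, 400)
	fmt.Printf("input=%.4f cached=%.4f output=%.4f total=%.4f\n",
		r.InputCost, r.CachedInputCost, r.OutputCost, r.TotalCost)
}
```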
type BodyMetadata ¶
type BodyMetadata struct {
// Model is the requested model identifier (e.g., "gpt-4", "claude-3-opus").
Model string `json:"model"`
// Messages contains the conversation history for chat completions.
Messages []Message `json:"messages,omitempty"`
// MaxTokens is the maximum number of tokens to generate.
MaxTokens int `json:"max_tokens,omitempty"`
// Stream indicates whether streaming is requested.
Stream bool `json:"stream"`
// Custom holds provider-specific fields that don't map to standard fields.
Custom map[string]any `json:"-"`
}
BodyMetadata contains extracted metadata from a parsed request body. It provides a common structure that works across different LLM providers while allowing provider-specific fields via the Custom map.
type BodyParser ¶
type BodyParser interface {
// Parse reads the request body and extracts metadata.
// It returns the parsed metadata, the raw body bytes (for later use),
// and any error encountered during parsing.
//
// The caller is responsible for closing the body ReadCloser.
Parse(body io.ReadCloser) (BodyMetadata, []byte, error)
}
BodyParser extracts metadata from a request body.
Since io.ReadCloser can only be read once, Parse returns both the extracted metadata and the raw body bytes. The caller is responsible for reconstructing the body for the upstream request.
Implementations should handle provider-specific JSON formats and map them to the common BodyMetadata structure.
type CacheDetail ¶ added in v0.0.3
type CacheDetail struct {
// TTL is the time-to-live for the cache entry (e.g., "5m", "1h").
TTL string `json:"ttl,omitempty"`
// CacheWriteTokens is the number of tokens written to cache at this TTL.
CacheWriteTokens int `json:"cache_write_tokens,omitempty"`
}
CacheDetail contains cache details for a checkpoint (Bedrock).
type CacheUsage ¶ added in v0.0.3
type CacheUsage struct {
// CachedTokens is the number of tokens served from cache (OpenAI).
CachedTokens int `json:"cached_tokens,omitempty"`
// CacheCreationInputTokens is the number of tokens written to cache (Anthropic).
CacheCreationInputTokens int `json:"cache_creation_input_tokens,omitempty"`
// CacheReadInputTokens is the number of tokens read from cache (Anthropic).
CacheReadInputTokens int `json:"cache_read_input_tokens,omitempty"`
// Ephemeral5mInputTokens is the number of 5-minute cache write tokens (Anthropic).
Ephemeral5mInputTokens int `json:"ephemeral_5m_input_tokens,omitempty"`
// Ephemeral1hInputTokens is the number of 1-hour cache write tokens (Anthropic).
Ephemeral1hInputTokens int `json:"ephemeral_1h_input_tokens,omitempty"`
// CacheWriteTokens is the number of tokens written to cache (Bedrock).
CacheWriteTokens int `json:"cache_write_tokens,omitempty"`
// CacheDetails contains TTL-based cache write breakdown (Bedrock).
CacheDetails []CacheDetail `json:"cache_details,omitempty"`
}
CacheUsage tracks prompt caching token consumption.
type Choice ¶
type Choice struct {
// Index is the position of this choice in the choices array.
Index int `json:"index"`
// Message contains the completed message (for non-streaming responses).
Message *Message `json:"message,omitempty"`
// Delta contains the partial message (for streaming responses).
Delta *Message `json:"delta,omitempty"`
// FinishReason indicates why the completion stopped (e.g., "stop", "length").
FinishReason string `json:"finish_reason"`
}
Choice represents a single completion choice in the response.
type CostInfo ¶
type CostInfo struct {
// Input is the cost per 1M input tokens in USD.
Input float64
// Output is the cost per 1M output tokens in USD.
Output float64
// CacheRead is the cost per 1M cached input tokens (optional).
CacheRead float64
// CacheWrite is the cost per 1M cache write tokens (optional, Anthropic).
CacheWrite float64
}
CostInfo contains pricing information for a model.
type CostLookup ¶
CostLookup is a function that returns the cost for a given provider and model. It should return the pricing info and true when the model is found, or false when it is not.
The lookup function allows the pricing data to be managed externally, such as downloading from models.dev or using a custom pricing database.
type Interceptor ¶
type Interceptor interface {
// Intercept processes a request through the interceptor chain.
//
// Parameters:
// - req: The HTTP request to send upstream
// - meta: Parsed metadata from the request body
// - rawBody: The original request body bytes
// - next: The next handler in the chain (call this to continue)
//
// Returns:
// - resp: The HTTP response (body will be re-attached from rawRespBody)
// - respMeta: Parsed response metadata
// - rawRespBody: The raw response body bytes
// - error: Any error that occurred
Intercept(req *http.Request, meta BodyMetadata, rawBody []byte, next RoundTripFunc) (resp *http.Response, respMeta ResponseMetadata, rawRespBody []byte, err error)
}
Interceptor wraps the request/response cycle for cross-cutting concerns.
Interceptors form a chain around the actual request execution, allowing behavior to be added before and after the upstream call. Common uses include:
- Logging request/response details
- Collecting metrics (latency, token usage)
- Retrying failed requests
- Rate limiting
- Caching responses
Interceptors must call next(req) to continue the chain. Not calling next will short-circuit the request (useful for caching or mocking).
type InterceptorChain ¶
type InterceptorChain []Interceptor
InterceptorChain is an ordered list of interceptors that are applied in sequence. Interceptors are applied in reverse order during wrapping so that they execute in forward order during request processing.
func (InterceptorChain) Wrap ¶
func (c InterceptorChain) Wrap(final RoundTripFunc) RoundTripFunc
Wrap chains all interceptors around the final RoundTripFunc. Interceptors are wrapped in reverse order so they execute in forward order.
Example: Given interceptors [A, B, C] and final function F:
- Wrapping produces: A(B(C(F)))
- Execution order: A -> B -> C -> F -> C -> B -> A
type Logger ¶
type Logger interface {
// Debug level logging
Debug(msg string, args ...interface{})
// Info level logging
Info(msg string, args ...interface{})
// Warning level logging
Warn(msg string, args ...interface{})
// Error level logging
Error(msg string, args ...interface{})
}
Logger is an interface for logging. It matches the interface from github.com/agentuity/go-common/logger. Any logger implementing this interface can be used with interceptors.
type LoggerFunc ¶
LoggerFunc is an adapter to allow using ordinary functions as loggers.
func (LoggerFunc) Debug ¶
func (f LoggerFunc) Debug(msg string, args ...interface{})
func (LoggerFunc) Error ¶
func (f LoggerFunc) Error(msg string, args ...interface{})
func (LoggerFunc) Info ¶
func (f LoggerFunc) Info(msg string, args ...interface{})
func (LoggerFunc) Warn ¶
func (f LoggerFunc) Warn(msg string, args ...interface{})
type MapRegistry ¶
type MapRegistry struct {
// contains filtered or unexported fields
}
MapRegistry is a simple registry that stores providers by name. It provides thread-safe registration and lookup.
func (*MapRegistry) Get ¶
func (r *MapRegistry) Get(name string) (Provider, bool)
Get retrieves a provider by name.
func (*MapRegistry) Match ¶
func (r *MapRegistry) Match(req *http.Request) (Provider, error)
Match is not implemented for MapRegistry and returns a nil provider. Use a more sophisticated implementation for request-based routing.
func (*MapRegistry) Register ¶
func (r *MapRegistry) Register(p Provider)
Register adds a provider to the registry under its name. If a provider with the same name exists, it is replaced.
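The documented behavior (thread-safe registration keyed by name, last registration winning) can be sketched as follows; the struct and helper names are illustrative, and Provider is reduced to its Name method:

```go
package main

import (
	"fmt"
	"sync"
)

// Provider is reduced to the one method this sketch needs.
type Provider interface{ Name() string }

// named is a trivial Provider for demonstration.
type named string

func (n named) Name() string { return string(n) }

// mapRegistry mirrors the documented MapRegistry behavior.
type mapRegistry struct {
	mu        sync.RWMutex
	providers map[string]Provider
}

func newMapRegistry() *mapRegistry {
	return &mapRegistry{providers: make(map[string]Provider)}
}

// Register stores the provider under its name, replacing any existing entry.
func (r *mapRegistry) Register(p Provider) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.providers[p.Name()] = p
}

// Get retrieves a provider by name under a read lock.
func (r *mapRegistry) Get(name string) (Provider, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	p, ok := r.providers[name]
	return p, ok
}

func main() {
	reg := newMapRegistry()
	reg.Register(named("openai"))
	reg.Register(named("anthropic"))
	if p, ok := reg.Get("openai"); ok {
		fmt.Println("found:", p.Name())
	}
}
```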
type Message ¶
type Message struct {
// Role is the role of the message author (e.g., "user", "assistant", "system").
Role string `json:"role"`
// Content is the text content of the message.
Content string `json:"content"`
}
Message represents a single message in a chat completion request.
type MetaContextKey ¶
type MetaContextKey struct{}
MetaContextKey is the context key for storing request metadata.
type MetaContextValue ¶
type MetaContextValue struct {
Meta BodyMetadata
RawBody []byte
OrgID string
}
MetaContextValue holds the metadata stored in request context.
func GetMetaFromContext ¶
func GetMetaFromContext(ctx context.Context) MetaContextValue
GetMetaFromContext retrieves the metadata stored in a context. Returns an empty MetaContextValue if the context is nil or doesn't contain metadata.
type Provider ¶
type Provider interface {
// Name returns the provider's unique identifier (e.g., "openai", "anthropic").
Name() string
// BodyParser returns the parser for extracting request metadata.
BodyParser() BodyParser
// RequestEnricher returns the enricher for modifying outgoing requests.
RequestEnricher() RequestEnricher
// ResponseExtractor returns the extractor for parsing responses.
ResponseExtractor() ResponseExtractor
// URLResolver returns the resolver for determining upstream URLs.
URLResolver() URLResolver
}
Provider composes all the components needed to handle requests for an LLM provider.
A provider brings together:
- BodyParser: To extract request metadata
- RequestEnricher: To modify outgoing requests
- ResponseExtractor: To parse responses
- URLResolver: To determine upstream URLs
Implementations can use BaseProvider for a configurable default, or implement Provider directly for complete control.
type ProviderOption ¶
type ProviderOption func(*BaseProvider)
ProviderOption configures a BaseProvider during construction.
func WithBodyParser ¶
func WithBodyParser(bp BodyParser) ProviderOption
WithBodyParser sets the body parser for the provider.
func WithRequestEnricher ¶
func WithRequestEnricher(re RequestEnricher) ProviderOption
WithRequestEnricher sets the request enricher for the provider.
func WithResponseExtractor ¶
func WithResponseExtractor(re ResponseExtractor) ProviderOption
WithResponseExtractor sets the response extractor for the provider.
func WithURLResolver ¶
func WithURLResolver(ur URLResolver) ProviderOption
WithURLResolver sets the URL resolver for the provider.
type Proxy ¶
type Proxy struct {
// contains filtered or unexported fields
}
Proxy forwards requests to an upstream LLM provider.
Proxy handles the complete request lifecycle:
- Reads and parses the request body
- Resolves the upstream URL
- Creates and enriches the upstream request
- Executes the request through the interceptor chain
- Extracts metadata from the response
- Re-attaches the raw response body
Use NewProxy with functional options to configure:
proxy := NewProxy(provider,
    WithInterceptor(loggingInterceptor),
    WithHTTPClient(customClient),
)
func NewProxy ¶
func NewProxy(provider Provider, opts ...ProxyOption) *Proxy
NewProxy creates a new proxy for the given provider. Options can be used to add interceptors or customize the HTTP client.
func (*Proxy) Forward ¶
func (p *Proxy) Forward(ctx context.Context, req *http.Request) (*http.Response, ResponseMetadata, error)
Forward sends a request to the upstream provider and returns the response.
The method:
- Reads and parses the request body to extract metadata
- Resolves the upstream URL based on the metadata
- Creates a new request for the upstream, copying headers
- Enriches the request with provider-specific headers
- Executes the request through the interceptor chain
- Extracts metadata from the response
- Re-attaches the raw response body so the caller can read it
The returned response body contains the original raw bytes from the upstream and can be read by the caller. Any custom/unsupported fields in the JSON are preserved.
type ProxyOption ¶
type ProxyOption func(*Proxy)
ProxyOption configures a Proxy during construction.
func WithHTTPClient ¶
func WithHTTPClient(c *http.Client) ProxyOption
WithHTTPClient sets a custom HTTP client for upstream requests. If not set, http.DefaultClient is used.
func WithInterceptor ¶
func WithInterceptor(i Interceptor) ProxyOption
WithInterceptor adds an interceptor to the global chain. Interceptors are applied in the order they are added.
type Registry ¶
type Registry interface {
// Register adds a provider to the registry.
Register(p Provider)
// Get retrieves a provider by name.
// Returns the provider and true if found, nil and false otherwise.
Get(name string) (Provider, bool)
// Match selects a provider for the given request.
// Implementations may parse the request body to determine routing.
// Returns the matched provider, or an error if no match is found.
Match(req *http.Request) (Provider, error)
}
Registry manages a collection of providers and supports routing requests to the appropriate provider based on request characteristics.
type RequestEnricher ¶
type RequestEnricher interface {
// Enrich modifies the request with provider-specific enhancements.
// The meta parameter contains parsed body metadata for decision-making.
// The rawBody contains the original request body bytes.
//
// Implementations should modify req in place (headers, URL, etc.)
// and return nil on success, or an error to abort the request.
Enrich(req *http.Request, meta BodyMetadata, rawBody []byte) error
}
RequestEnricher modifies an outgoing request before it's sent to the upstream provider.
Typical uses include:
- Setting authentication headers (Authorization, X-API-Key, etc.)
- Adding provider-specific headers
- Modifying the request body or URL
The rawBody is provided for cases where the enricher needs to modify the body content.
type ResponseExtractor ¶
type ResponseExtractor interface {
// Extract parses the HTTP response and returns unified metadata.
//
// The method reads and consumes the response body, parses it for metadata,
// and returns both the metadata and the raw body bytes. The proxy will
// re-attach the raw bytes to the response so the caller can read them.
//
// Parameters:
// - resp: The HTTP response from the upstream provider
//
// Returns:
// - metadata: Parsed response metadata (tokens, model, etc.)
// - rawBody: The original response body bytes (must be returned for forwarding)
// - error: Any parsing error
Extract(resp *http.Response) (metadata ResponseMetadata, rawBody []byte, err error)
}
ResponseExtractor parses an upstream provider response and extracts metadata.
Implementations handle provider-specific response formats and map them to the common ResponseMetadata structure. This allows the proxy to track token usage, costs, and other metrics in a provider-agnostic way.
The extractor must return the raw response body bytes so the proxy can re-attach them to the response for the caller. This preserves any custom/unsupported fields in the original JSON.
type ResponseMetadata ¶
type ResponseMetadata struct {
// ID is the unique identifier for the response (provider-specific).
ID string `json:"id,omitempty"`
// Object is the object type (e.g., "chat.completion").
Object string `json:"object,omitempty"`
// Model is the model used for the completion.
Model string `json:"model,omitempty"`
// Usage contains token consumption statistics.
Usage Usage `json:"usage"`
// Choices contains the completion choices.
Choices []Choice `json:"choices,omitempty"`
// Custom holds provider-specific response fields.
Custom map[string]any `json:"-"`
}
ResponseMetadata contains extracted metadata from a provider response. It provides a unified view of response data across different providers.
type RoundTripFunc ¶
RoundTripFunc is the signature for executing a request through the chain. It returns the response, metadata, raw response body, and error.
type URLResolver ¶
type URLResolver interface {
// Resolve returns the upstream URL for the given request metadata.
// The returned URL should be the full endpoint for the completion request.
//
// Implementations can use metadata fields (like Model) to make routing decisions.
Resolve(meta BodyMetadata) (*url.URL, error)
}
URLResolver determines the upstream provider URL for a given request.
This allows routing requests to different endpoints based on the request metadata, such as model name. Some providers may use different endpoints for different models or have region-specific URLs.
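A resolver that routes by model-name prefix illustrates the idea. The base URLs below are hypothetical placeholders, not endpoints the package defines, and BodyMetadata is reduced to the one field consulted:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// BodyMetadata is reduced to the field this sketch consults.
type BodyMetadata struct{ Model string }

// prefixResolver routes by model-name prefix; the URLs are placeholders.
type prefixResolver struct{}

func (prefixResolver) Resolve(meta BodyMetadata) (*url.URL, error) {
	base := "https://api.openai.example/v1/chat/completions"
	if strings.HasPrefix(meta.Model, "claude") {
		base = "https://api.anthropic.example/v1/messages"
	}
	return url.Parse(base)
}

func main() {
	u, err := prefixResolver{}.Resolve(BodyMetadata{Model: "claude-3-opus"})
	if err != nil {
		panic(err)
	}
	fmt.Println(u.Host)
}
```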
type Usage ¶
type Usage struct {
// PromptTokens is the number of tokens in the prompt.
PromptTokens int `json:"prompt_tokens"`
// CompletionTokens is the number of tokens generated in the completion.
CompletionTokens int `json:"completion_tokens"`
// TotalTokens is the sum of prompt and completion tokens.
TotalTokens int `json:"total_tokens"`
}
Usage tracks token consumption for a completion request.
Source Files ¶
Directories ¶

| Path | Synopsis |
|---|---|
| examples | |
| examples/basic (command) | Example_basic demonstrates a basic proxy setup with multiple providers. |
| pricing | |
| pricing/modelsdev | Package modelsdev provides an adapter for loading model pricing data from models.dev (https://models.dev/api.json). |
| providers | |
| providers/anthropic | Package anthropic provides a provider implementation for Anthropic's Claude API. |
| providers/bedrock | Package bedrock provides a provider implementation for AWS Bedrock. |
| providers/fireworks | Package fireworks provides a provider implementation for Fireworks AI's API. |
| providers/googleai | Package googleai provides a provider implementation for Google AI's Gemini API. |
| providers/groq | Package groq provides a provider implementation for Groq's API. |
| providers/openai | Package openai provides a provider implementation for OpenAI's API. |
| providers/openai_compatible | Package openai_compatible provides a reusable implementation for LLM providers that use OpenAI-compatible APIs. |
| providers/xai | Package xai provides a provider implementation for x.AI's Grok API. |