llmproxy

package module
v0.0.10
Published: May 4, 2026 License: MIT Imports: 12 Imported by: 0

README

llmproxy

A Go library for proxying requests to upstream LLM providers, built on a pluggable, composable architecture.

Install

go get github.com/agentuity/llmproxy

Quick Start

Simple Proxy
package main

import (
    "io"
    "net/http"

    "github.com/agentuity/llmproxy"
    "github.com/agentuity/llmproxy/interceptors"
    "github.com/agentuity/llmproxy/providers/openai"
)

func main() {
    provider, _ := openai.New("sk-your-key")

    proxy := llmproxy.NewProxy(provider,
        llmproxy.WithInterceptor(interceptors.NewLogging(nil)),
    )

    http.HandleFunc("/v1/chat/completions", func(w http.ResponseWriter, r *http.Request) {
        // Use the request's context so upstream calls are canceled with the client.
        resp, meta, err := proxy.Forward(r.Context(), r)
        if err != nil {
            http.Error(w, err.Error(), 500)
            return
        }
        defer resp.Body.Close()

        // Response includes token usage
        _ = meta.Usage.PromptTokens
        _ = meta.Usage.CompletionTokens

        io.Copy(w, resp.Body)
    })

    http.ListenAndServe(":8080", nil)
}

Auto-Routing Proxy

A single endpoint that auto-detects the provider and API type:

package main

import (
    "net/http"

    "github.com/agentuity/llmproxy"
    "github.com/agentuity/llmproxy/providers/openai"
    "github.com/agentuity/llmproxy/providers/anthropic"
)

func main() {
    openaiProvider, _ := openai.New("sk-openai-key")
    anthropicProvider, _ := anthropic.New("sk-ant-key")

    router := llmproxy.NewAutoRouter(
        llmproxy.WithAutoRouterFallbackProvider(openaiProvider),
    )
    router.RegisterProvider(openaiProvider)
    router.RegisterProvider(anthropicProvider)

    // Single endpoint handles all providers and APIs
    http.Handle("/", router)
    http.ListenAndServe(":8080", nil)
}

POST to / with any model; the provider and API type are auto-detected:

# Auto-detect OpenAI from gpt-4 model name
curl -X POST http://localhost:8080/ \
  -H 'Content-Type: application/json' \
  -d '{"model":"gpt-4","messages":[{"role":"user","content":"Hello"}]}'

# Auto-detect Anthropic from claude model name  
curl -X POST http://localhost:8080/ \
  -H 'Content-Type: application/json' \
  -d '{"model":"claude-3-opus","max_tokens":1024,"messages":[{"role":"user","content":"Hello"}]}'

# Auto-detect Responses API from input field
curl -X POST http://localhost:8080/ \
  -H 'Content-Type: application/json' \
  -d '{"model":"gpt-4o","input":"Hello"}'

Features

  • 9 Provider Implementations: OpenAI, Anthropic, Groq, Fireworks, x.AI, Google AI, AWS Bedrock, Azure OpenAI, OpenAI-compatible base
  • AutoRouter: Single endpoint with automatic provider/API detection
  • Responses API: Full support for OpenAI's Responses API (HTTP streaming and WebSocket mode)
  • WebSocket Mode: Persistent connections for multi-turn Responses API workflows with per-turn billing
  • SSE Streaming: Full streaming support with efficient token usage extraction
  • 8 Built-in Interceptors: Logging, Metrics, Retry, Billing, Tracing (OTel), HeaderBan, AddHeader, PromptCaching
  • Pricing Integration: models.dev adapter with markup support
  • Prompt Caching: built-in support for Anthropic, OpenAI, xAI, Fireworks, and Bedrock
  • Raw Body Preservation: Custom JSON fields pass through unchanged

AutoRouter

The AutoRouter provides automatic routing from a single endpoint:

Detection Order

  1. Path-based - /v1/messages → Messages API, /v1/responses → Responses API
  2. Body + Provider - when the path is / or unknown:
    • input field → Responses API
    • prompt field → Completions API
    • contents field → GenerateContent API
    • messages + Anthropic → Messages API
    • messages + other → Chat Completions

Provider Detection

  1. X-Provider header - explicit override
  2. Model prefix - openai/gpt-4 → OpenAI (strips prefix before forwarding)
  3. Model pattern - gpt-* → OpenAI, claude-* → Anthropic, etc.
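The detection rules above can be sketched with stdlib JSON probing. This is an illustrative stand-in for DetectAPITypeFromBodyAndProvider and DefaultProviderDetector, not the library's actual code:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// detectAPIType mirrors the body-based rules: the presence of a field
// decides the API type, with provider breaking the tie for "messages".
func detectAPIType(body []byte, provider string) string {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(body, &fields); err != nil {
		return ""
	}
	switch {
	case fields["input"] != nil:
		return "responses"
	case fields["prompt"] != nil:
		return "completions"
	case fields["contents"] != nil:
		return "generate_content"
	case fields["messages"] != nil && provider == "anthropic":
		return "messages"
	case fields["messages"] != nil:
		return "chat_completions"
	}
	return ""
}

// detectProvider mirrors the provider rules: explicit prefix wins,
// then model-name patterns.
func detectProvider(model string) string {
	switch {
	case strings.Contains(model, "/"): // explicit prefix, e.g. "openai/gpt-4"
		return strings.SplitN(model, "/", 2)[0]
	case strings.HasPrefix(model, "gpt-"):
		return "openai"
	case strings.HasPrefix(model, "claude-"):
		return "anthropic"
	}
	return ""
}

func main() {
	fmt.Println(detectAPIType([]byte(`{"model":"gpt-4o","input":"Hello"}`), "openai")) // responses
	fmt.Println(detectProvider("anthropic/claude-3-opus"))                             // anthropic
}
```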
Examples
# Explicit provider via header
curl -X POST http://localhost:8080/ \
  -H 'Content-Type: application/json' \
  -H 'X-Provider: anthropic' \
  -d '{"model":"claude-3-opus","max_tokens":1024,"messages":[{"role":"user","content":"Hello"}]}'

# Provider prefix in model (gets stripped)
curl -X POST http://localhost:8080/ \
  -H 'Content-Type: application/json' \
  -d '{"model":"anthropic/claude-3-opus","max_tokens":1024,"messages":[{"role":"user","content":"Hello"}]}'

# Traditional path still works
curl -X POST http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"gpt-4","messages":[{"role":"user","content":"Hello"}]}'

Streaming

SSE streaming is fully supported with automatic token usage extraction for billing:

# Streaming with automatic usage extraction
curl -X POST http://localhost:8080/ \
  -H 'Content-Type: application/json' \
  -d '{"model":"gpt-4","stream":true,"messages":[{"role":"user","content":"Hello"}]}'

Key Features:

  • Efficient flushing: Uses http.ResponseController for immediate SSE delivery
  • Token extraction: Extracts usage from streaming responses for billing
  • Auto stream_options: Automatically injects stream_options.include_usage when billing is configured
  • Works with billing: Billing is calculated after stream completes
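A stdlib-only sketch of the immediate-flush behavior, assuming nothing about the library's internals — each SSE event is pushed with http.ResponseController (Go 1.20+) instead of waiting for the write buffer to fill:

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// streamAndCollect starts a handler that emits SSE events, flushing each
// one immediately, then reads the events back over HTTP.
func streamAndCollect() []string {
	handler := func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/event-stream")
		rc := http.NewResponseController(w)
		for _, chunk := range []string{"Hel", "lo", "[DONE]"} {
			fmt.Fprintf(w, "data: %s\n\n", chunk)
			rc.Flush() // push the event to the client right away
		}
	}
	srv := httptest.NewServer(http.HandlerFunc(handler))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var events []string
	sc := bufio.NewScanner(resp.Body)
	for sc.Scan() {
		if line := sc.Text(); strings.HasPrefix(line, "data: ") {
			events = append(events, strings.TrimPrefix(line, "data: "))
		}
	}
	return events
}

func main() {
	fmt.Println(streamAndCollect())
}
```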

Example with billing:

adapter, _ := modelsdev.LoadFromURL()
billingCallback := func(r llmproxy.BillingResult) {
    log.Printf("Cost: $%.6f (tokens: %d/%d)", r.TotalCost, r.PromptTokens, r.CompletionTokens)
}

router := llmproxy.NewAutoRouter(
    llmproxy.WithAutoRouterBillingCalculator(llmproxy.NewBillingCalculator(adapter.GetCostLookup(), billingCallback)),
)

WebSocket Mode

The Responses API supports persistent WebSocket connections for multi-turn, tool-call-heavy workflows. WebSocket support is opt-in with a zero-dependency adapter pattern — bring your own WebSocket library.

gorilla/websocket Example
package main

import (
    "context"
    "log"
    "net/http"

    "github.com/agentuity/llmproxy"
    "github.com/agentuity/llmproxy/providers/openai"
    "github.com/gorilla/websocket"
)

// Configure allowed origins for WebSocket upgrades.
var trustedOrigins = []string{"https://myapp.example.com"}

// Thin adapters — gorilla's *Conn already satisfies llmproxy.WSConn

type gorillaUpgrader struct{ websocket.Upgrader }

func (u *gorillaUpgrader) Upgrade(w http.ResponseWriter, r *http.Request, h http.Header) (llmproxy.WSConn, error) {
    conn, err := u.Upgrader.Upgrade(w, r, h)
    if err != nil {
        return nil, err // avoid wrapping a typed-nil *websocket.Conn in the interface
    }
    return conn, nil
}

type gorillaDialer struct{ websocket.Dialer }

func (d *gorillaDialer) DialContext(ctx context.Context, urlStr string, h http.Header) (llmproxy.WSConn, *http.Response, error) {
    conn, resp, err := d.Dialer.DialContext(ctx, urlStr, h)
    if err != nil {
        return nil, resp, err // avoid a typed-nil conn in the interface
    }
    return conn, resp, nil
}

func main() {
    // Only allow WebSocket upgrades from the trusted origins configured above.
    upgrader := websocket.Upgrader{
        CheckOrigin: func(r *http.Request) bool {
            origin := r.Header.Get("Origin")
            for _, allowed := range trustedOrigins {
                if origin == allowed {
                    return true
                }
            }
            return false
        },
    }

    provider, _ := openai.New("sk-your-key")

    router := llmproxy.NewAutoRouter(
        llmproxy.WithAutoRouterFallbackProvider(provider),
        llmproxy.WithAutoRouterWebSocket(
            &gorillaUpgrader{upgrader},
            &gorillaDialer{websocket.Dialer{}},
        ),
        llmproxy.WithAutoRouterWSBillingCallback(func(turn int, meta llmproxy.ResponseMetadata, billing *llmproxy.BillingResult) {
            log.Printf("Turn %d: %d prompt + %d completion tokens",
                turn, meta.Usage.PromptTokens, meta.Usage.CompletionTokens)
        }),
    )
    router.RegisterProvider(provider)

    http.Handle("/", router)
    log.Fatal(http.ListenAndServe(":8080", nil))
}

Clients connect with any WebSocket library:

from websocket import create_connection
import json

ws = create_connection("ws://localhost:8080/v1/responses",
    header=["Authorization: Bearer sk-your-key"])

ws.send(json.dumps({
    "type": "response.create",
    "model": "gpt-4o",
    "input": [{"type": "message", "role": "user",
               "content": [{"type": "input_text", "text": "Hello!"}]}],
}))

while True:
    event = json.loads(ws.recv())
    print(event["type"], event.get("delta", ""))
    if event["type"] == "response.completed":
        break

The proxy handles model prefix stripping, auth header forwarding, usage extraction, and per-turn billing automatically. See DESIGN.md for full protocol details.

Providers

| Provider     | Auth                | API Format                             | Notes                              |
|--------------|---------------------|----------------------------------------|------------------------------------|
| OpenAI       | Bearer token        | Chat completions, Responses, WebSocket | HTTP + WebSocket for /v1/responses |
| Anthropic    | x-api-key           | Messages API                           |                                    |
| Groq         | Bearer token        | OpenAI-compatible                      |                                    |
| Fireworks    | Bearer token        | OpenAI-compatible                      |                                    |
| x.AI         | Bearer token        | OpenAI-compatible                      |                                    |
| Google AI    | API key query param | Gemini generateContent                 |                                    |
| AWS Bedrock  | AWS Signature V4    | Converse API                           |                                    |
| Azure OpenAI | api-key or Azure AD | Chat completions (deployments)         |                                    |

Interceptors

// Logging
llmproxy.WithInterceptor(interceptors.NewLogging(logger))

// Metrics (thread-safe)
metrics := &interceptors.Metrics{}
llmproxy.WithInterceptor(interceptors.NewMetrics(metrics))

// Retry on 429/5xx
llmproxy.WithInterceptor(interceptors.NewRetry(3, time.Second))

// Billing with models.dev pricing
adapter, _ := modelsdev.LoadFromURL()
llmproxy.WithInterceptor(interceptors.NewBilling(adapter.GetCostLookup(), func(r llmproxy.BillingResult) {
    log.Printf("Cost: $%.6f", r.TotalCost)
}))

// OTel tracing
llmproxy.WithInterceptor(interceptors.NewTracing(otelExtractor))

// Strip sensitive headers
llmproxy.WithInterceptor(interceptors.NewResponseHeaderBan("Openai-Organization"))

// Add custom headers
llmproxy.WithInterceptor(interceptors.NewAddResponseHeader(
    interceptors.NewHeader("X-Gateway", "llmproxy"),
))

// Anthropic prompt caching (default 5 min, free)
llmproxy.WithInterceptor(interceptors.NewAnthropicPromptCaching(interceptors.CacheRetentionDefault))

// Anthropic prompt caching with 1h retention (costs more)
llmproxy.WithInterceptor(interceptors.NewAnthropicPromptCaching(interceptors.CacheRetention1h))

// OpenAI prompt caching with explicit cache key
llmproxy.WithInterceptor(interceptors.NewOpenAIPromptCaching(interceptors.CacheRetention24h, "my-cache-key"))

// OpenAI prompt caching with auto-derived key and tenant namespace
llmproxy.WithInterceptor(interceptors.NewOpenAIPromptCachingAuto("tenant-123", interceptors.CacheRetentionDefault))

// xAI/Grok prompt caching (uses x-grok-conv-id header)
llmproxy.WithInterceptor(interceptors.NewXAIPromptCaching("conv-abc123"))

// Fireworks prompt caching (uses x-session-affinity and x-prompt-cache-isolation-key headers)
llmproxy.WithInterceptor(interceptors.NewFireworksPromptCaching("session-123"))

Architecture

The library uses small, focused interfaces that compose into providers:

Parse → Enrich → Resolve → Forward → Extract
  • BodyParser — Extract metadata from request body
  • RequestEnricher — Add auth headers
  • URLResolver — Determine upstream URL
  • ResponseExtractor — Parse response metadata
  • Provider — Composes the above
  • Interceptor — Wrap request/response for cross-cutting concerns

See DESIGN.md for full architecture details.

Example

A complete multi-provider proxy server:

cd examples/basic
go run main.go

Environment variables:

| Variable                                               | Provider     |
|--------------------------------------------------------|--------------|
| OPENAI_API_KEY                                         | OpenAI       |
| ANTHROPIC_API_KEY                                      | Anthropic    |
| GROQ_API_KEY                                           | Groq         |
| FIREWORKS_API_KEY                                      | Fireworks    |
| XAI_API_KEY                                            | x.AI         |
| GOOGLE_AI_API_KEY                                      | Google AI    |
| AZURE_OPENAI_RESOURCE                                  | Azure OpenAI |
| AZURE_OPENAI_API_KEY                                   | Azure OpenAI |
| AWS_REGION + AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY | AWS Bedrock  |

License

MIT

Documentation

Overview

Package llmproxy provides a pluggable, composable library for proxying requests to upstream LLM providers.

The library uses small, focused interfaces that can be mixed and matched to create custom provider implementations. It supports OpenAI-compatible APIs out of the box and can be extended for provider-specific behaviors.

Core concepts:

  • BodyParser: Extracts metadata from request bodies
  • RequestEnricher: Modifies outgoing requests (headers, etc.)
  • ResponseExtractor: Extracts metadata from responses
  • URLResolver: Determines the upstream provider URL
  • Provider: Composes the above components
  • Interceptor: Wraps the request/response flow for cross-cutting concerns

Basic usage:

provider, _ := openai.New("sk-your-key")
proxy := llmproxy.NewProxy(provider,
    llmproxy.WithInterceptor(interceptors.NewLogging(nil)),
)
resp, meta, _ := proxy.Forward(ctx, req)

Index

Constants

const (
	TextMessage   = 1
	BinaryMessage = 2
	CloseMessage  = 8
	PingMessage   = 9
	PongMessage   = 10
)

RFC 6455 WebSocket message type constants.

Variables

var DefaultProviderDetector = ProviderDetectorFunc(func(hint ProviderHint) string {

	if hint.Headers != nil {
		if provider := hint.Headers.Get("X-Provider"); provider != "" {
			return provider
		}
	}

	if hint.Model != "" {
		if provider := DetectProviderFromModel(hint.Model); provider != "" {
			return provider
		}
	}

	if hint.Headers != nil {
		return detectProviderFromHeaderHeuristics(hint.Headers)
	}

	return ""
})

DefaultProviderDetector detects the provider from model name patterns and headers. Precedence: X-Provider header > model pattern > other header heuristics.

var ErrNoProvider = &ProviderError{Message: "no provider available for request"}
var (
	ErrStreamComplete = errors.New("stream complete")
)
var ErrWebSocketNotConfigured = errors.New("websocket forwarding is not configured")

Functions

func DetectProviderFromModel added in v0.0.6

func DetectProviderFromModel(model string) string

DetectProviderFromModel returns the provider name based on model naming patterns.

func FormatSSEEvent added in v0.0.6

func FormatSSEEvent(event string, data []byte) []byte

func IsSSEStream added in v0.0.6

func IsSSEStream(contentType string) bool

Types

type APIType added in v0.0.6

type APIType string
const (
	APITypeChatCompletions APIType = "chat_completions"
	APITypeResponses       APIType = "responses"
	APITypeCompletions     APIType = "completions"
	APITypeMessages        APIType = "messages"
	APITypeGenerateContent APIType = "generate_content"
	APITypeConverse        APIType = "converse"
)

func DetectAPIType added in v0.0.6

func DetectAPIType(body []byte) APIType

func DetectAPITypeFromBodyAndProvider added in v0.0.6

func DetectAPITypeFromBodyAndProvider(body []byte, provider string) APIType

func DetectAPITypeFromPath added in v0.0.6

func DetectAPITypeFromPath(path string) APIType

type AnthropicContentBlock added in v0.0.6

type AnthropicContentBlock struct {
	Type string `json:"type"`
	Text string `json:"text,omitempty"`
}

type AnthropicStreamDelta added in v0.0.6

type AnthropicStreamDelta struct {
	Type       string `json:"type,omitempty"`
	Text       string `json:"text,omitempty"`
	StopReason string `json:"stop_reason,omitempty"`
}

type AnthropicStreamEvent added in v0.0.6

type AnthropicStreamEvent struct {
	Type         string                  `json:"type"`
	Index        int                     `json:"index,omitempty"`
	Delta        *AnthropicStreamDelta   `json:"delta,omitempty"`
	ContentBlock *AnthropicContentBlock  `json:"content_block,omitempty"`
	Usage        *AnthropicStreamUsage   `json:"usage,omitempty"`
	Message      *AnthropicStreamMessage `json:"message,omitempty"`
}

func ParseAnthropicSSEEvent added in v0.0.6

func ParseAnthropicSSEEvent(data []byte) (*AnthropicStreamEvent, error)

type AnthropicStreamMessage added in v0.0.6

type AnthropicStreamMessage struct {
	ID         string                  `json:"id,omitempty"`
	Type       string                  `json:"type,omitempty"`
	Role       string                  `json:"role,omitempty"`
	Content    []AnthropicContentBlock `json:"content,omitempty"`
	Model      string                  `json:"model,omitempty"`
	StopReason string                  `json:"stop_reason,omitempty"`
	Usage      *AnthropicStreamUsage   `json:"usage,omitempty"`
}

type AnthropicStreamUsage added in v0.0.6

type AnthropicStreamUsage struct {
	InputTokens              int `json:"input_tokens,omitempty"`
	OutputTokens             int `json:"output_tokens,omitempty"`
	CacheCreationInputTokens int `json:"cache_creation_input_tokens,omitempty"`
	CacheReadInputTokens     int `json:"cache_read_input_tokens,omitempty"`
}

type AutoRouter added in v0.0.6

type AutoRouter struct {
	// contains filtered or unexported fields
}

func NewAutoRouter added in v0.0.6

func NewAutoRouter(opts ...AutoRouterOption) *AutoRouter

func (*AutoRouter) BillingCalculator added in v0.0.6

func (a *AutoRouter) BillingCalculator() *BillingCalculator

func (*AutoRouter) Forward added in v0.0.6

func (*AutoRouter) ForwardStreaming added in v0.0.6

func (a *AutoRouter) ForwardStreaming(ctx context.Context, req *http.Request, w http.ResponseWriter) (ResponseMetadata, error)

func (*AutoRouter) ForwardWebSocket added in v0.0.8

func (a *AutoRouter) ForwardWebSocket(ctx context.Context, w http.ResponseWriter, r *http.Request) error

func (*AutoRouter) GetProvider added in v0.0.6

func (a *AutoRouter) GetProvider(name string) Provider

func (*AutoRouter) RegisterProvider added in v0.0.6

func (a *AutoRouter) RegisterProvider(p Provider)

func (*AutoRouter) ServeHTTP added in v0.0.6

func (a *AutoRouter) ServeHTTP(w http.ResponseWriter, r *http.Request)

type AutoRouterOption added in v0.0.6

type AutoRouterOption func(*AutoRouter)

func WithAutoRouterBillingCalculator added in v0.0.6

func WithAutoRouterBillingCalculator(calculator *BillingCalculator) AutoRouterOption

func WithAutoRouterDetector added in v0.0.6

func WithAutoRouterDetector(d ProviderDetector) AutoRouterOption

func WithAutoRouterFallbackProvider added in v0.0.6

func WithAutoRouterFallbackProvider(p Provider) AutoRouterOption

func WithAutoRouterHTTPClient added in v0.0.6

func WithAutoRouterHTTPClient(c *http.Client) AutoRouterOption

func WithAutoRouterInterceptor added in v0.0.6

func WithAutoRouterInterceptor(i Interceptor) AutoRouterOption

func WithAutoRouterModelProviderLookup added in v0.0.6

func WithAutoRouterModelProviderLookup(lookup ModelProviderLookup) AutoRouterOption

func WithAutoRouterRegistry added in v0.0.6

func WithAutoRouterRegistry(r Registry) AutoRouterOption

func WithAutoRouterWSBillingCallback added in v0.0.8

func WithAutoRouterWSBillingCallback(cb WSBillingCallback) AutoRouterOption

func WithAutoRouterWebSocket added in v0.0.8

func WithAutoRouterWebSocket(upgrader WSUpgrader, dialer WSDialer) AutoRouterOption

type BaseProvider

type BaseProvider struct {
	// contains filtered or unexported fields
}

BaseProvider provides a configurable implementation of Provider. It allows setting individual components via functional options, making it easy to mix and match behaviors.

Use NewBaseProvider with With* options to create a custom provider:

provider := NewBaseProvider("my-provider",
    WithBodyParser(myParser),
    WithRequestEnricher(myEnricher),
)

func NewBaseProvider

func NewBaseProvider(name string, opts ...ProviderOption) *BaseProvider

NewBaseProvider creates a new provider with the given name and options. Unset components will return nil from their accessor methods.

func (*BaseProvider) BodyParser

func (p *BaseProvider) BodyParser() BodyParser

BodyParser returns the configured body parser, or nil if not set.

func (*BaseProvider) Name

func (p *BaseProvider) Name() string

Name returns the provider's name.

func (*BaseProvider) RequestEnricher

func (p *BaseProvider) RequestEnricher() RequestEnricher

RequestEnricher returns the configured request enricher, or nil if not set.

func (*BaseProvider) ResponseExtractor

func (p *BaseProvider) ResponseExtractor() ResponseExtractor

ResponseExtractor returns the configured response extractor, or nil if not set.

func (*BaseProvider) URLResolver

func (p *BaseProvider) URLResolver() URLResolver

URLResolver returns the configured URL resolver, or nil if not set.

type BillingCalculator added in v0.0.6

type BillingCalculator struct {
	// contains filtered or unexported fields
}

func NewBillingCalculator added in v0.0.6

func NewBillingCalculator(lookup CostLookup, onResult func(BillingResult)) *BillingCalculator

func (*BillingCalculator) Calculate added in v0.0.6

func (c *BillingCalculator) Calculate(meta BodyMetadata, respMeta *ResponseMetadata) *BillingResult

func (*BillingCalculator) Lookup added in v0.0.6

func (c *BillingCalculator) Lookup() CostLookup

func (*BillingCalculator) OnResult added in v0.0.6

func (c *BillingCalculator) OnResult() func(BillingResult)

type BillingResult

type BillingResult struct {
	// Provider is the provider name.
	Provider string
	// Model is the model identifier.
	Model string
	// PromptTokens is the number of input tokens.
	PromptTokens int
	// CompletionTokens is the number of output tokens.
	CompletionTokens int
	// CachedTokens is the number of prompt tokens served from cache.
	CachedTokens int
	// TotalTokens is the sum of prompt and completion tokens.
	TotalTokens int
	// InputCost is the calculated input cost in USD (non-cached prompt tokens).
	InputCost float64
	// CachedInputCost is the cost for cached prompt tokens in USD.
	CachedInputCost float64
	// OutputCost is the calculated output cost in USD.
	OutputCost float64
	// TotalCost is the sum of all costs in USD.
	TotalCost float64
}

BillingResult contains the calculated cost for a request.

func CalculateCost

func CalculateCost(provider, model string, costInfo CostInfo, promptTokens, completionTokens int, cacheUsage *CacheUsage) BillingResult

CalculateCost computes the billing result from cost info, token usage, and cache usage. Cached tokens are billed at the CacheRead rate (if available), and non-cached prompt tokens are billed at the full Input rate.
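The rule works out to simple per-1M-token arithmetic. A sketch of the split between cached and non-cached prompt tokens; the rates below are made up for illustration, not real model pricing:

```go
package main

import "fmt"

// totalCost bills cached prompt tokens at the CacheRead rate, the remaining
// prompt tokens at the Input rate, and completion tokens at the Output rate.
// All rates are USD per 1M tokens.
func totalCost(inputRate, outputRate, cacheReadRate float64, promptTok, completionTok, cachedTok int) float64 {
	nonCached := promptTok - cachedTok
	return float64(nonCached)/1e6*inputRate +
		float64(cachedTok)/1e6*cacheReadRate +
		float64(completionTok)/1e6*outputRate
}

func main() {
	// 1000 prompt tokens (400 of them cached) + 500 completion tokens
	// at $3 / $15 / $0.30 per 1M tokens:
	fmt.Printf("$%.6f\n", totalCost(3.0, 15.0, 0.30, 1000, 500, 400)) // $0.009420
}
```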

type BodyMetadata

type BodyMetadata struct {
	// Model is the requested model identifier (e.g., "gpt-4", "claude-3-opus").
	Model string `json:"model"`
	// Messages contains the conversation history for chat completions.
	Messages []Message `json:"messages,omitempty"`
	// MaxTokens is the maximum number of tokens to generate.
	MaxTokens int `json:"max_tokens,omitempty"`
	// Stream indicates whether streaming is requested.
	Stream bool `json:"stream"`
	// Custom holds provider-specific fields that don't map to standard fields.
	Custom map[string]any `json:"-"`
}

BodyMetadata contains extracted metadata from a parsed request body. It provides a common structure that works across different LLM providers while allowing provider-specific fields via the Custom map.

type BodyParser

type BodyParser interface {
	// Parse reads the request body and extracts metadata.
	// It returns the parsed metadata, the raw body bytes (for later use),
	// and any error encountered during parsing.
	//
	// The caller is responsible for closing the body ReadCloser.
	Parse(body io.ReadCloser) (BodyMetadata, []byte, error)
}

BodyParser extracts metadata from a request body.

Since io.ReadCloser can only be read once, Parse returns both the extracted metadata and the raw body bytes. The caller is responsible for reconstructing the body for the upstream request.

Implementations should handle provider-specific JSON formats and map them to the common BodyMetadata structure.

type CacheDetail added in v0.0.3

type CacheDetail struct {
	// TTL is the time-to-live for the cache entry (e.g., "5m", "1h").
	TTL string `json:"ttl,omitempty"`
	// CacheWriteTokens is the number of tokens written to cache at this TTL.
	CacheWriteTokens int `json:"cache_write_tokens,omitempty"`
}

CacheDetail contains cache details for a checkpoint (Bedrock).

type CacheUsage added in v0.0.3

type CacheUsage struct {
	// CachedTokens is the number of tokens served from cache (OpenAI).
	CachedTokens int `json:"cached_tokens,omitempty"`
	// CacheCreationInputTokens is the number of tokens written to cache (Anthropic).
	CacheCreationInputTokens int `json:"cache_creation_input_tokens,omitempty"`
	// CacheReadInputTokens is the number of tokens read from cache (Anthropic).
	CacheReadInputTokens int `json:"cache_read_input_tokens,omitempty"`
	// Ephemeral5mInputTokens is the number of 5-minute cache write tokens (Anthropic).
	Ephemeral5mInputTokens int `json:"ephemeral_5m_input_tokens,omitempty"`
	// Ephemeral1hInputTokens is the number of 1-hour cache write tokens (Anthropic).
	Ephemeral1hInputTokens int `json:"ephemeral_1h_input_tokens,omitempty"`
	// CacheWriteTokens is the number of tokens written to cache (Bedrock).
	CacheWriteTokens int `json:"cache_write_tokens,omitempty"`
	// CacheDetails contains TTL-based cache write breakdown (Bedrock).
	CacheDetails []CacheDetail `json:"cache_details,omitempty"`
}

CacheUsage tracks prompt caching token consumption.

type Choice

type Choice struct {
	// Index is the position of this choice in the choices array.
	Index int `json:"index"`
	// Message contains the completed message (for non-streaming responses).
	Message *Message `json:"message,omitempty"`
	// Delta contains the partial message (for streaming responses).
	Delta *Message `json:"delta,omitempty"`
	// FinishReason indicates why the completion stopped (e.g., "stop", "length").
	FinishReason string `json:"finish_reason"`
}

Choice represents a single completion choice in the response.

type CostInfo

type CostInfo struct {
	// Input is the cost per 1M input tokens in USD.
	Input float64
	// Output is the cost per 1M output tokens in USD.
	Output float64
	// CacheRead is the cost per 1M cached input tokens (optional).
	CacheRead float64
	// CacheWrite is the cost per 1M cache write tokens (optional, Anthropic).
	CacheWrite float64
}

CostInfo contains pricing information for a model.

type CostLookup

type CostLookup func(provider string, model string) (CostInfo, bool)

CostLookup is a function that returns the cost for a given provider and model. It should return the pricing info or false if the model is not found.

The lookup function allows the pricing data to be managed externally, such as downloading from models.dev or using a custom pricing database.
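For static pricing, a map-backed lookup is enough. A sketch — the CostInfo fields mirror the docs above, and the table contents are made-up prices:

```go
package main

import "fmt"

// CostInfo mirrors the documented pricing fields (USD per 1M tokens).
type CostInfo struct {
	Input, Output, CacheRead, CacheWrite float64
}

// CostLookup matches the documented function signature.
type CostLookup func(provider, model string) (CostInfo, bool)

// newStaticLookup keys the table by "provider/model".
func newStaticLookup(table map[string]CostInfo) CostLookup {
	return func(provider, model string) (CostInfo, bool) {
		ci, ok := table[provider+"/"+model]
		return ci, ok
	}
}

func main() {
	lookup := newStaticLookup(map[string]CostInfo{
		"openai/gpt-4": {Input: 30, Output: 60},
	})
	ci, ok := lookup("openai", "gpt-4")
	fmt.Println(ok, ci.Input, ci.Output) // true 30 60
}
```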

type DefaultStreamingHandler added in v0.0.6

type DefaultStreamingHandler struct {
	// contains filtered or unexported fields
}

func NewDefaultStreamingHandler added in v0.0.6

func NewDefaultStreamingHandler(extractor StreamingResponseExtractor) *DefaultStreamingHandler

func (*DefaultStreamingHandler) HandleStream added in v0.0.6

type Interceptor

type Interceptor interface {
	// Intercept processes a request through the interceptor chain.
	//
	// Parameters:
	//   - req: The HTTP request to send upstream
	//   - meta: Parsed metadata from the request body
	//   - rawBody: The original request body bytes
	//   - next: The next handler in the chain (call this to continue)
	//
	// Returns:
	//   - resp: The HTTP response (body will be re-attached from rawRespBody)
	//   - respMeta: Parsed response metadata
	//   - rawRespBody: The raw response body bytes
	//   - error: Any error that occurred
	Intercept(req *http.Request, meta BodyMetadata, rawBody []byte, next RoundTripFunc) (resp *http.Response, respMeta ResponseMetadata, rawRespBody []byte, err error)
}

Interceptor wraps the request/response cycle for cross-cutting concerns.

Interceptors form a chain around the actual request execution, allowing behavior to be added before and after the upstream call. Common uses include:

  • Logging request/response details
  • Collecting metrics (latency, token usage)
  • Retrying failed requests
  • Rate limiting
  • Caching responses

Interceptors must call next(req) to continue the chain. Not calling next will short-circuit the request (useful for caching or mocking).

type InterceptorChain

type InterceptorChain []Interceptor

InterceptorChain is an ordered list of interceptors that are applied in sequence. Interceptors are applied in reverse order during wrapping so that they execute in forward order during request processing.

func (InterceptorChain) Wrap

Wrap chains all interceptors around the final RoundTripFunc. Interceptors are wrapped in reverse order so they execute in forward order.

Example: Given interceptors [A, B, C] and final function F:

  • Wrapping produces: A(B(C(F)))
  • Execution order: A -> B -> C -> F -> C -> B -> A
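The reverse-wrap trick can be sketched with plain function types (stand-ins for the library's Interceptor signatures):

```go
package main

import "fmt"

type roundTrip func() string

type interceptor func(next roundTrip) roundTrip

// wrap applies interceptors in reverse so they execute in forward order.
func wrap(final roundTrip, chain ...interceptor) roundTrip {
	h := final
	for i := len(chain) - 1; i >= 0; i-- {
		h = chain[i](h)
	}
	return h
}

// tag records entry and exit around the rest of the chain.
func tag(name string) interceptor {
	return func(next roundTrip) roundTrip {
		return func() string { return name + ">" + next() + ">" + name }
	}
}

func main() {
	h := wrap(func() string { return "F" }, tag("A"), tag("B"), tag("C"))
	fmt.Println(h()) // A>B>C>F>C>B>A
}
```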

type Logger

type Logger interface {
	// Debug level logging
	Debug(msg string, args ...interface{})
	// Info level logging
	Info(msg string, args ...interface{})
	// Warning level logging
	Warn(msg string, args ...interface{})
	// Error level logging
	Error(msg string, args ...interface{})
}

Logger is an interface for logging. It matches the interface from github.com/agentuity/go-common/logger. Any logger implementing this interface can be used with interceptors.

type LoggerFunc

type LoggerFunc func(level string, msg string, args ...interface{})

LoggerFunc is an adapter to allow using ordinary functions as loggers.
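The adapter pattern in miniature — the types mirror the documented interface, and the level-string dispatch shown here is an assumption about how LoggerFunc routes each method:

```go
package main

import "fmt"

// Logger matches the documented four-method interface.
type Logger interface {
	Debug(msg string, args ...interface{})
	Info(msg string, args ...interface{})
	Warn(msg string, args ...interface{})
	Error(msg string, args ...interface{})
}

// LoggerFunc adapts one function into all four leveled methods.
type LoggerFunc func(level, msg string, args ...interface{})

func (f LoggerFunc) Debug(msg string, args ...interface{}) { f("debug", msg, args...) }
func (f LoggerFunc) Info(msg string, args ...interface{})  { f("info", msg, args...) }
func (f LoggerFunc) Warn(msg string, args ...interface{})  { f("warn", msg, args...) }
func (f LoggerFunc) Error(msg string, args ...interface{}) { f("error", msg, args...) }

func main() {
	var log Logger = LoggerFunc(func(level, msg string, args ...interface{}) {
		fmt.Printf("[%s] %s\n", level, msg)
	})
	log.Info("proxy started") // [info] proxy started
}
```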

func (LoggerFunc) Debug

func (f LoggerFunc) Debug(msg string, args ...interface{})

func (LoggerFunc) Error

func (f LoggerFunc) Error(msg string, args ...interface{})

func (LoggerFunc) Info

func (f LoggerFunc) Info(msg string, args ...interface{})

func (LoggerFunc) Warn

func (f LoggerFunc) Warn(msg string, args ...interface{})

type MapRegistry

type MapRegistry struct {
	// contains filtered or unexported fields
}

MapRegistry is a simple registry that stores providers by name. It provides thread-safe registration and lookup.

func NewRegistry

func NewRegistry() *MapRegistry

NewRegistry creates a new empty registry.

func (*MapRegistry) Get

func (r *MapRegistry) Get(name string) (Provider, bool)

Get retrieves a provider by name.

func (*MapRegistry) Match

func (r *MapRegistry) Match(req *http.Request) (Provider, error)

Match is not implemented for MapRegistry and returns nil. Use a more sophisticated implementation for request-based routing.

func (*MapRegistry) Register

func (r *MapRegistry) Register(p Provider)

Register adds a provider to the registry under its name. If a provider with the same name exists, it is replaced.

type Message

type Message struct {
	// Role is the role of the message author (e.g., "user", "assistant", "system").
	Role string `json:"role"`
	// Content is the content of the message (can be string or array for multimodal).
	Content any `json:"content"`
	// Custom holds provider-specific message fields that don't map to standard fields.
	Custom map[string]any `json:"-"`
}

Message represents a single message in a chat completion request.

func (Message) MarshalJSON added in v0.0.7

func (m Message) MarshalJSON() ([]byte, error)

MarshalJSON implements custom JSON marshaling to include Custom fields.

func (*Message) UnmarshalJSON added in v0.0.7

func (m *Message) UnmarshalJSON(data []byte) error

UnmarshalJSON implements custom JSON unmarshaling to capture unknown fields.

type MetaContextKey

type MetaContextKey struct{}

MetaContextKey is the context key for storing request metadata.

type MetaContextValue

type MetaContextValue struct {
	Meta    BodyMetadata
	RawBody []byte
	OrgID   string
}

MetaContextValue holds the metadata stored in request context.

func GetMetaFromContext

func GetMetaFromContext(ctx context.Context) MetaContextValue

GetMetaFromContext retrieves the metadata stored in a context. Returns an empty MetaContextValue if the context is nil or doesn't contain metadata.

type ModelProviderLookup added in v0.0.6

type ModelProviderLookup func(model string) string

ModelProviderLookup is a function that finds the provider for a given model name. This can be backed by models.dev or another model registry.

type OpenAIStreamChoice added in v0.0.6

type OpenAIStreamChoice struct {
	Index        int                   `json:"index"`
	Delta        *OpenAIStreamDelta    `json:"delta,omitempty"`
	FinishReason string                `json:"finish_reason,omitempty"`
	Logprobs     *OpenAIStreamLogprobs `json:"logprobs,omitempty"`
}

type OpenAIStreamChunk added in v0.0.6

type OpenAIStreamChunk struct {
	ID      string               `json:"id"`
	Object  string               `json:"object"`
	Created int64                `json:"created"`
	Model   string               `json:"model"`
	Choices []OpenAIStreamChoice `json:"choices"`
	Usage   *OpenAIStreamUsage   `json:"usage,omitempty"`
}

func ParseOpenAISSEEvent added in v0.0.6

func ParseOpenAISSEEvent(data []byte) (*OpenAIStreamChunk, error)

type OpenAIStreamCompletionDetails added in v0.0.6

type OpenAIStreamCompletionDetails struct {
	ReasoningTokens          int `json:"reasoning_tokens,omitempty"`
	AudioTokens              int `json:"audio_tokens,omitempty"`
	AcceptedPredictionTokens int `json:"accepted_prediction_tokens,omitempty"`
	RejectedPredictionTokens int `json:"rejected_prediction_tokens,omitempty"`
}

type OpenAIStreamDelta added in v0.0.6

type OpenAIStreamDelta struct {
	Role    string `json:"role,omitempty"`
	Content string `json:"content,omitempty"`
}

type OpenAIStreamLogprobContent added in v0.0.6

type OpenAIStreamLogprobContent struct {
	Token   string  `json:"token"`
	Logprob float64 `json:"logprob"`
}

type OpenAIStreamLogprobs added in v0.0.6

type OpenAIStreamLogprobs struct {
	Content []OpenAIStreamLogprobContent `json:"content,omitempty"`
}

type OpenAIStreamPromptDetails added in v0.0.6

type OpenAIStreamPromptDetails struct {
	CachedTokens int `json:"cached_tokens,omitempty"`
	AudioTokens  int `json:"audio_tokens,omitempty"`
}

type OpenAIStreamUsage added in v0.0.6

type OpenAIStreamUsage struct {
	PromptTokens            int                            `json:"prompt_tokens"`
	CompletionTokens        int                            `json:"completion_tokens"`
	TotalTokens             int                            `json:"total_tokens"`
	PromptTokensDetails     *OpenAIStreamPromptDetails     `json:"prompt_tokens_details,omitempty"`
	CompletionTokensDetails *OpenAIStreamCompletionDetails `json:"completion_tokens_details,omitempty"`
}

type Provider

type Provider interface {
	// Name returns the provider's unique identifier (e.g., "openai", "anthropic").
	Name() string

	// BodyParser returns the parser for extracting request metadata.
	BodyParser() BodyParser

	// RequestEnricher returns the enricher for modifying outgoing requests.
	RequestEnricher() RequestEnricher

	// ResponseExtractor returns the extractor for parsing responses.
	ResponseExtractor() ResponseExtractor

	// URLResolver returns the resolver for determining upstream URLs.
	URLResolver() URLResolver
}

Provider composes all the components needed to handle requests for an LLM provider.

A provider brings together:

  • BodyParser: To extract request metadata
  • RequestEnricher: To modify outgoing requests
  • ResponseExtractor: To parse responses
  • URLResolver: To determine upstream URLs

Implementations can use BaseProvider for a configurable default, or implement Provider directly for complete control.

type ProviderDetector added in v0.0.6

type ProviderDetector interface {
	Detect(hint ProviderHint) string
}

ProviderDetector determines the upstream provider based on request characteristics.

type ProviderDetectorFunc added in v0.0.6

type ProviderDetectorFunc func(hint ProviderHint) string

ProviderDetectorFunc is a function that implements ProviderDetector.

func (ProviderDetectorFunc) Detect added in v0.0.6

func (f ProviderDetectorFunc) Detect(hint ProviderHint) string

type ProviderError added in v0.0.6

type ProviderError struct {
	Message string
}

func (*ProviderError) Error added in v0.0.6

func (e *ProviderError) Error() string

type ProviderHint added in v0.0.6

type ProviderHint struct {
	Model   string
	Headers http.Header
}

ProviderHint contains information that can be used to detect the provider.

type ProviderOption

type ProviderOption func(*BaseProvider)

ProviderOption configures a BaseProvider during construction.

func WithBodyParser

func WithBodyParser(bp BodyParser) ProviderOption

WithBodyParser sets the body parser for the provider.

func WithRequestEnricher

func WithRequestEnricher(re RequestEnricher) ProviderOption

WithRequestEnricher sets the request enricher for the provider.

func WithResponseExtractor

func WithResponseExtractor(re ResponseExtractor) ProviderOption

WithResponseExtractor sets the response extractor for the provider.

func WithURLResolver

func WithURLResolver(ur URLResolver) ProviderOption

WithURLResolver sets the URL resolver for the provider.

type Proxy

type Proxy struct {
	// contains filtered or unexported fields
}

Proxy forwards requests to an upstream LLM provider.

Proxy handles the complete request lifecycle:

  1. Reads and parses the request body
  2. Resolves the upstream URL
  3. Creates and enriches the upstream request
  4. Executes the request through the interceptor chain
  5. Extracts metadata from the response
  6. Re-attaches the raw response body

Use NewProxy with functional options to configure:

proxy := NewProxy(provider,
    WithInterceptor(loggingInterceptor),
    WithHTTPClient(customClient),
)

func NewProxy

func NewProxy(provider Provider, opts ...ProxyOption) *Proxy

NewProxy creates a new proxy for the given provider. Options can be used to add interceptors or customize the HTTP client.

func (*Proxy) Forward

func (p *Proxy) Forward(ctx context.Context, req *http.Request) (*http.Response, ResponseMetadata, error)

Forward sends a request to the upstream provider and returns the response.

The method:

  1. Reads and parses the request body to extract metadata
  2. Resolves the upstream URL based on the metadata
  3. Creates a new request for the upstream, copying headers
  4. Enriches the request with provider-specific headers
  5. Executes the request through the interceptor chain
  6. Extracts metadata from the response
  7. Re-attaches the raw response body so the caller can read it

The returned response body contains the original raw bytes from the upstream and can be read by the caller. Any custom/unsupported fields in the JSON are preserved.

type ProxyOption

type ProxyOption func(*Proxy)

ProxyOption configures a Proxy during construction.

func WithHTTPClient

func WithHTTPClient(c *http.Client) ProxyOption

WithHTTPClient sets a custom HTTP client for upstream requests. If not set, http.DefaultClient is used.

func WithInterceptor

func WithInterceptor(i Interceptor) ProxyOption

WithInterceptor adds an interceptor to the global chain. Interceptors are applied in the order they are added.

type Registry

type Registry interface {
	// Register adds a provider to the registry.
	Register(p Provider)

	// Get retrieves a provider by name.
	// Returns the provider and true if found, nil and false otherwise.
	Get(name string) (Provider, bool)

	// Match selects a provider for the given request.
	// Implementations may parse the request body to determine routing.
	// Returns the matched provider, or an error if no match is found.
	Match(req *http.Request) (Provider, error)
}

Registry manages a collection of providers and supports routing requests to the appropriate provider based on request characteristics.

type RequestEnricher

type RequestEnricher interface {
	// Enrich modifies the request with provider-specific enhancements.
	// The meta parameter contains parsed body metadata for decision-making.
	// The rawBody contains the original request body bytes.
	//
	// Implementations should modify req in place (headers, URL, etc.)
	// and return nil on success, or an error to abort the request.
	Enrich(req *http.Request, meta BodyMetadata, rawBody []byte) error
}

RequestEnricher modifies an outgoing request before it's sent to the upstream provider.

Typical uses include:

  • Setting authentication headers (Authorization, X-API-Key, etc.)
  • Adding provider-specific headers
  • Modifying the request body or URL

The rawBody is provided for cases where the enricher needs to modify the body content.

type ResponseExtractor

type ResponseExtractor interface {
	Extract(resp *http.Response) (metadata ResponseMetadata, rawBody []byte, err error)
}

type ResponseMetadata

type ResponseMetadata struct {
	// ID is the unique identifier for the response (provider-specific).
	ID string `json:"id,omitempty"`
	// Object is the object type (e.g., "chat.completion").
	Object string `json:"object,omitempty"`
	// Model is the model used for the completion.
	Model string `json:"model,omitempty"`
	// Usage contains token consumption statistics.
	Usage Usage `json:"usage"`
	// Choices contains the completion choices.
	Choices []Choice `json:"choices,omitempty"`
	// Custom holds provider-specific response fields.
	Custom map[string]any `json:"-"`
}

ResponseMetadata contains extracted metadata from a provider response. It provides a unified view of response data across different providers.

type ResponsesStreamEvent added in v0.0.8

type ResponsesStreamEvent struct {
	Type     string          `json:"type"`
	Response json.RawMessage `json:"response,omitempty"`
}

func ParseResponsesSSEEvent added in v0.0.8

func ParseResponsesSSEEvent(data []byte) (*ResponsesStreamEvent, error)

type ResponsesStreamInputDetails added in v0.0.8

type ResponsesStreamInputDetails struct {
	CachedTokens int `json:"cached_tokens,omitempty"`
}

type ResponsesStreamOutputDetails added in v0.0.8

type ResponsesStreamOutputDetails struct {
	ReasoningTokens int `json:"reasoning_tokens,omitempty"`
}

type ResponsesStreamResponse added in v0.0.8

type ResponsesStreamResponse struct {
	ID     string                `json:"id"`
	Object string                `json:"object"`
	Model  string                `json:"model"`
	Status string                `json:"status"`
	Usage  *ResponsesStreamUsage `json:"usage,omitempty"`
}

type ResponsesStreamUsage added in v0.0.8

type ResponsesStreamUsage struct {
	InputTokens         int                           `json:"input_tokens"`
	OutputTokens        int                           `json:"output_tokens"`
	TotalTokens         int                           `json:"total_tokens"`
	InputTokensDetails  *ResponsesStreamInputDetails  `json:"input_tokens_details,omitempty"`
	OutputTokensDetails *ResponsesStreamOutputDetails `json:"output_tokens_details,omitempty"`
}

type RoundTripFunc

type RoundTripFunc func(*http.Request) (*http.Response, ResponseMetadata, []byte, error)

RoundTripFunc is the signature for executing a request through the chain. It returns the response, metadata, raw response body, and error.

type SSEEvent added in v0.0.6

type SSEEvent struct {
	ID    []byte
	Event []byte
	Data  []byte
	Retry []byte
}

type SSEParser added in v0.0.6

type SSEParser struct {
	// contains filtered or unexported fields
}

func NewSSEParser added in v0.0.6

func NewSSEParser(r io.Reader) *SSEParser

func (*SSEParser) Next added in v0.0.6

func (p *SSEParser) Next() (*SSEEvent, error)

type StreamingHandler added in v0.0.6

type StreamingHandler interface {
	HandleStream(resp *http.Response, w http.ResponseWriter, meta BodyMetadata) (ResponseMetadata, error)
}

type StreamingResponseExtractor added in v0.0.6

type StreamingResponseExtractor interface {
	ResponseExtractor
	ExtractStreamingWithController(resp *http.Response, w http.ResponseWriter, rc *http.ResponseController) (ResponseMetadata, error)
	IsStreamingResponse(resp *http.Response) bool
}

type StreamingUsage added in v0.0.6

type StreamingUsage struct {
	PromptTokens     int
	CompletionTokens int
	TotalTokens      int
	CacheUsage       *CacheUsage
	ReasoningTokens  int
}

func ExtractUsageFromAnthropicEvent added in v0.0.6

func ExtractUsageFromAnthropicEvent(event *AnthropicStreamEvent) *StreamingUsage

func ExtractUsageFromOpenAIChunk added in v0.0.6

func ExtractUsageFromOpenAIChunk(chunk *OpenAIStreamChunk) *StreamingUsage

func ExtractUsageFromResponsesEvent added in v0.0.8

func ExtractUsageFromResponsesEvent(event *ResponsesStreamEvent) *StreamingUsage

func ExtractWSUsage added in v0.0.8

func ExtractWSUsage(data []byte) (*StreamingUsage, error)

ExtractWSUsage extracts usage from a response.completed WebSocket message. Returns nil, nil for events other than response.completed.

type TeeReader added in v0.0.6

type TeeReader struct {
	// contains filtered or unexported fields
}

func NewTeeReader added in v0.0.6

func NewTeeReader(r io.Reader, w io.Writer) *TeeReader

func (*TeeReader) Read added in v0.0.6

func (t *TeeReader) Read(p []byte) (n int, err error)

type URLResolver

type URLResolver interface {
	// Resolve returns the upstream URL for the given request metadata.
	// The returned URL should be the full endpoint for the completion request.
	//
	// Implementations can use metadata fields (like Model) to make routing decisions.
	Resolve(meta BodyMetadata) (*url.URL, error)
}

URLResolver determines the upstream provider URL for a given request.

This allows routing requests to different endpoints based on the request metadata, such as model name. Some providers may use different endpoints for different models or have region-specific URLs.
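The routing decision inside a Resolve implementation usually boils down to mapping the model name to a base URL. A standalone sketch of that logic — the endpoints below are illustrative stand-ins for a provider's real URLs:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// resolveEndpoint picks a full completion endpoint for a model.
// A Resolve implementation would return the provider's actual
// endpoint, possibly varying by region or model family.
func resolveEndpoint(model string) (*url.URL, error) {
	base := "https://api.openai.com/v1/chat/completions"
	if strings.HasPrefix(model, "claude-") {
		base = "https://api.anthropic.com/v1/messages"
	}
	return url.Parse(base)
}

func main() {
	u, err := resolveEndpoint("claude-3-haiku")
	if err != nil {
		panic(err)
	}
	fmt.Println(u.Host) // prints "api.anthropic.com"
}
```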

type Usage

type Usage struct {
	// PromptTokens is the number of tokens in the prompt.
	PromptTokens int `json:"prompt_tokens"`
	// CompletionTokens is the number of tokens generated in the completion.
	CompletionTokens int `json:"completion_tokens"`
	// TotalTokens is the sum of prompt and completion tokens.
	TotalTokens int `json:"total_tokens"`
}

Usage tracks token consumption for a completion request.

type WSBillingCallback added in v0.0.8

type WSBillingCallback func(turn int, meta ResponseMetadata, billing *BillingResult)

WSBillingCallback is invoked per completed response turn.

type WSConn added in v0.0.8

type WSConn interface {
	ReadMessage() (messageType int, p []byte, err error)
	WriteMessage(messageType int, data []byte) error
	Close() error
}

WSConn abstracts a WebSocket connection for reading and writing messages.

gorilla/websocket's *Conn satisfies this interface directly.

type WSDialer added in v0.0.8

type WSDialer interface {
	DialContext(ctx context.Context, urlStr string, requestHeader http.Header) (WSConn, *http.Response, error)
}

WSDialer dials a WebSocket connection to an upstream server. Consumers wrap their WebSocket library's dialer to implement this.

type WSEventCallback added in v0.0.8

type WSEventCallback func(eventType string, data []byte, usage *StreamingUsage)

WSEventCallback is an optional callback for WebSocket events. usage is non-nil for response.completed events that include usage data.

type WSMessage added in v0.0.8

type WSMessage struct {
	Type               string          `json:"type"`
	Model              string          `json:"model,omitempty"`
	PreviousResponseID string          `json:"previous_response_id,omitempty"`
	Raw                json.RawMessage `json:"-"`
}

WSMessage is a lightweight parsed view of a WebSocket JSON message.

func ParseWSMessage added in v0.0.8

func ParseWSMessage(data []byte) (*WSMessage, error)

ParseWSMessage parses a WebSocket JSON message and extracts commonly used fields.

type WSResponseCompleted added in v0.0.8

type WSResponseCompleted struct {
	Type     string              `json:"type"`
	Response *WSResponseEnvelope `json:"response,omitempty"`
	Usage    *WSResponseUsage    `json:"usage,omitempty"`
}

WSResponseCompleted is the minimal shape needed to extract usage from OpenAI Responses API WebSocket response.completed events.

type WSResponseEnvelope added in v0.0.8

type WSResponseEnvelope struct {
	Usage *WSResponseUsage `json:"usage,omitempty"`
}

type WSResponseInputDetails added in v0.0.8

type WSResponseInputDetails struct {
	CachedTokens int `json:"cached_tokens,omitempty"`
}

type WSResponseOutputDetails added in v0.0.8

type WSResponseOutputDetails struct {
	ReasoningTokens int `json:"reasoning_tokens,omitempty"`
}

type WSResponseUsage added in v0.0.8

type WSResponseUsage struct {
	InputTokens         int                      `json:"input_tokens"`
	OutputTokens        int                      `json:"output_tokens"`
	TotalTokens         int                      `json:"total_tokens"`
	InputTokensDetails  *WSResponseInputDetails  `json:"input_tokens_details,omitempty"`
	OutputTokensDetails *WSResponseOutputDetails `json:"output_tokens_details,omitempty"`
}

type WSUpgrader added in v0.0.8

type WSUpgrader interface {
	Upgrade(w http.ResponseWriter, r *http.Request, responseHeader http.Header) (WSConn, error)
}

WSUpgrader upgrades an HTTP request to a WebSocket connection. Consumers wrap their WebSocket library's upgrader to implement this.

type WebSocketCapableProvider added in v0.0.8

type WebSocketCapableProvider interface {
	Provider
	// WebSocketURL returns the upstream WebSocket URL for this provider.
	WebSocketURL(meta BodyMetadata) (*url.URL, error)
}

WebSocketCapableProvider is implemented by providers that support WebSocket mode.

Directories

Path Synopsis
examples
basic command
internal
pricing
modelsdev
Package modelsdev provides an adapter for loading model pricing data from models.dev (https://models.dev/api.json).
providers
anthropic
Package anthropic provides a provider implementation for Anthropic's Claude API.
bedrock
Package bedrock provides a provider implementation for AWS Bedrock.
fireworks
Package fireworks provides a provider implementation for Fireworks AI's API.
googleai
Package googleai provides a provider implementation for Google AI's Gemini API.
groq
Package groq provides a provider implementation for Groq's API.
openai
Package openai provides a provider implementation for OpenAI's API.
openai_compatible
Package openai_compatible provides a reusable implementation for LLM providers that use OpenAI-compatible APIs.
xai
Package xai provides a provider implementation for x.AI's Grok API.
