Documentation
Types ¶
type AddHeaderInterceptor ¶
func NewAddHeader ¶
func NewAddHeader(requestHeaders, responseHeaders []Header) *AddHeaderInterceptor
func NewAddRequestHeader ¶
func NewAddRequestHeader(headers ...Header) *AddHeaderInterceptor
func NewAddResponseHeader ¶
func NewAddResponseHeader(headers ...Header) *AddHeaderInterceptor
func (*AddHeaderInterceptor) Intercept ¶
func (i *AddHeaderInterceptor) Intercept(req *http.Request, meta llmproxy.BodyMetadata, rawBody []byte, next llmproxy.RoundTripFunc) (*http.Response, llmproxy.ResponseMetadata, []byte, error)
type BillingInterceptor ¶
type BillingInterceptor struct {
// Lookup is the function that returns pricing for a provider/model.
Lookup llmproxy.CostLookup
// OnResult is called with the billing result after each successful request.
// This can be used to log, record to a database, or aggregate metrics.
OnResult func(llmproxy.BillingResult)
}
BillingInterceptor calculates and records the cost of each request. It uses a CostLookup function to determine pricing for each model.
func NewBilling ¶
func NewBilling(lookup llmproxy.CostLookup, onResult func(llmproxy.BillingResult)) *BillingInterceptor
NewBilling creates a new billing interceptor with the given lookup function.
Example:
lookup := func(provider, model string) (llmproxy.CostInfo, bool) {
// Your pricing database lookup
if model == "gpt-4" {
return llmproxy.CostInfo{Input: 30, Output: 60}, true
}
return llmproxy.CostInfo{}, false
}
billing := interceptors.NewBilling(lookup, func(r llmproxy.BillingResult) {
log.Printf("Cost: $%.6f for %s", r.TotalCost, r.Model)
})
func (*BillingInterceptor) Intercept ¶
func (i *BillingInterceptor) Intercept(req *http.Request, meta llmproxy.BodyMetadata, rawBody []byte, next llmproxy.RoundTripFunc) (*http.Response, llmproxy.ResponseMetadata, []byte, error)
Intercept calculates the cost after a successful request and calls OnResult. If the model is not found in the lookup, no billing is recorded.
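The cost calculation described above can be sketched as follows. This is a minimal, self-contained illustration and not the package's implementation: the local costInfo type, the lookup table, and the totalCost helper are hypothetical, and the pricing unit (US dollars per one million tokens) is an assumption that the documentation above does not state.

```go
package main

import "fmt"

// costInfo mirrors the shape used in the NewBilling example.
// ASSUMPTION: prices are US dollars per one million tokens.
type costInfo struct {
	Input  float64
	Output float64
}

// lookup is a hypothetical pricing table keyed by model name.
func lookup(provider, model string) (costInfo, bool) {
	if model == "gpt-4" {
		return costInfo{Input: 30, Output: 60}, true
	}
	return costInfo{}, false
}

// totalCost returns the cost of one request. The boolean is false when the
// model has no pricing entry, in which case nothing should be billed.
func totalCost(provider, model string, promptTokens, completionTokens int) (float64, bool) {
	info, ok := lookup(provider, model)
	if !ok {
		return 0, false
	}
	const perMillion = 1_000_000.0
	cost := float64(promptTokens)*info.Input/perMillion +
		float64(completionTokens)*info.Output/perMillion
	return cost, true
}

func main() {
	if cost, ok := totalCost("openai", "gpt-4", 1000, 500); ok {
		fmt.Printf("Cost: $%.6f\n", cost)
	}
	if _, ok := totalCost("openai", "unknown-model", 1000, 500); !ok {
		fmt.Println("no pricing found, nothing billed")
	}
}
```

Returning a boolean alongside the cost keeps the "model not found" case distinct from a request that legitimately costs zero.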
type HeaderBanInterceptor ¶
func NewHeaderBan ¶
func NewHeaderBan(requestHeaders, responseHeaders []string) *HeaderBanInterceptor
func NewRequestHeaderBan ¶
func NewRequestHeaderBan(headers ...string) *HeaderBanInterceptor
func NewResponseHeaderBan ¶
func NewResponseHeaderBan(headers ...string) *HeaderBanInterceptor
func (*HeaderBanInterceptor) Intercept ¶
func (i *HeaderBanInterceptor) Intercept(req *http.Request, meta llmproxy.BodyMetadata, rawBody []byte, next llmproxy.RoundTripFunc) (*http.Response, llmproxy.ResponseMetadata, []byte, error)
type LoggingInterceptor ¶
type LoggingInterceptor struct {
// Logger is the destination for log output.
// If nil, a default logger is used.
Logger llmproxy.Logger
}
LoggingInterceptor logs request and response details. It records the model, method, URL, latency, and token usage.
func NewLogging ¶
func NewLogging(logger llmproxy.Logger) *LoggingInterceptor
NewLogging creates a new logging interceptor with the given logger. Pass nil to use a default logger that wraps log.Default().
func (*LoggingInterceptor) Intercept ¶
func (i *LoggingInterceptor) Intercept(req *http.Request, meta llmproxy.BodyMetadata, rawBody []byte, next llmproxy.RoundTripFunc) (*http.Response, llmproxy.ResponseMetadata, []byte, error)
Intercept logs the request before execution and the response after. Log format:
- Request: [model] METHOD /path
- Success: [model] OK: tokens=prompt/completion (duration)
- Error: [model] ERROR: err (duration)
type Metrics ¶
type Metrics struct {
// TotalRequests is the total number of requests processed.
TotalRequests int64
// TotalTokens is the sum of all tokens consumed.
TotalTokens int64
// TotalPromptTokens is the sum of all prompt tokens consumed.
TotalPromptTokens int64
// TotalCompletionTokens is the sum of all completion tokens generated.
TotalCompletionTokens int64
// TotalLatency is the cumulative latency in nanoseconds.
TotalLatency int64
// Errors is the count of failed requests.
Errors int64
}
Metrics holds aggregated statistics about proxied requests. All fields are safe for concurrent access via atomic operations.
type MetricsInterceptor ¶
type MetricsInterceptor struct {
// Metrics is the destination for collected metrics.
Metrics *Metrics
}
MetricsInterceptor collects metrics about proxied requests. It tracks request counts, token usage, latency, and errors.
func NewMetrics ¶
func NewMetrics(m *Metrics) *MetricsInterceptor
NewMetrics creates a new metrics interceptor that records to the given Metrics struct. The Metrics struct should be created once and shared across all requests.
Example:
m := &interceptors.Metrics{}
proxy := llmproxy.NewProxy(provider,
llmproxy.WithInterceptor(interceptors.NewMetrics(m)),
)
// Later, read the counters atomically, e.g. atomic.LoadInt64(&m.TotalRequests).
func (*MetricsInterceptor) Intercept ¶
func (i *MetricsInterceptor) Intercept(req *http.Request, meta llmproxy.BodyMetadata, rawBody []byte, next llmproxy.RoundTripFunc) (*http.Response, llmproxy.ResponseMetadata, []byte, error)
Intercept increments metrics counters and measures latency. It records:
- TotalRequests (always)
- TotalLatency (always)
- Errors (on failure)
- Token counts (on success)
type RetryInterceptor ¶
type RetryInterceptor struct {
MaxAttempts int
Delay time.Duration
IsRetryable func(*http.Response, error) bool
UseRateLimitHeaders bool
}
func NewRetryWithPredicate ¶
func NewRetryWithRateLimitHeaders ¶
func NewRetryWithRateLimitHeaders(maxAttempts int, defaultDelay time.Duration) *RetryInterceptor
func (*RetryInterceptor) Intercept ¶
func (i *RetryInterceptor) Intercept(req *http.Request, meta llmproxy.BodyMetadata, rawBody []byte, next llmproxy.RoundTripFunc) (*http.Response, llmproxy.ResponseMetadata, []byte, error)
type TraceExtractor ¶
TraceExtractor extracts trace information from a request context. Implementations should return an empty TraceInfo when no trace context is available.
type TraceInfo ¶
type TraceInfo struct {
// TraceID is the 16-byte trace identifier (32 hex chars).
TraceID [16]byte
// SpanID is the 8-byte span identifier (16 hex chars).
SpanID [8]byte
// Sampled indicates whether the trace is sampled.
Sampled bool
}
TraceInfo holds OpenTelemetry trace context information.
type TracingInterceptor ¶
type TracingInterceptor struct {
// Extract extracts trace info from the incoming request context.
// If nil, no trace headers are added.
Extract TraceExtractor
// ResponseHeader is the header name for the trace ID in the response.
// Defaults to "X-Request-ID" if empty.
ResponseHeader string
}
TracingInterceptor adds OpenTelemetry trace headers to upstream requests and propagates the trace ID back as a response header for correlation.
func NewTracing ¶
func NewTracing(extractor TraceExtractor) *TracingInterceptor
NewTracing creates a tracing interceptor with the given trace extractor.
The extractor function should pull trace context from the incoming request and return TraceInfo. For OpenTelemetry, you can use:
func otelExtractor(ctx context.Context) interceptors.TraceInfo {
span := trace.SpanFromContext(ctx)
if !span.SpanContext().IsValid() {
return interceptors.TraceInfo{}
}
return interceptors.TraceInfo{
TraceID: span.SpanContext().TraceID(),
SpanID: span.SpanContext().SpanID(),
Sampled: span.SpanContext().IsSampled(),
}
}
Example:
tracing := interceptors.NewTracing(otelExtractor)
proxy := llmproxy.NewProxy(provider, llmproxy.WithInterceptor(tracing))
func NewTracingWithHeader ¶
func NewTracingWithHeader(extractor TraceExtractor, responseHeader string) *TracingInterceptor
NewTracingWithHeader creates a tracing interceptor with a custom response header name.
func (*TracingInterceptor) Intercept ¶
func (i *TracingInterceptor) Intercept(req *http.Request, meta llmproxy.BodyMetadata, rawBody []byte, next llmproxy.RoundTripFunc) (*http.Response, llmproxy.ResponseMetadata, []byte, error)
Intercept adds trace headers to the upstream request and sets the response header.
Upstream headers set:
- X-Request-ID: the trace ID (32 hex chars)
- traceparent: W3C Trace Context format (version-traceid-spanid-flags)
Response header set:
- X-Request-ID (or custom ResponseHeader): the trace ID for correlation