metrics

package

v0.6.0 (latest) Published: May 5, 2026 License: Apache-2.0 Imports: 19 Imported by: 0

Repository

github.com/envoyproxy/ai-gateway

Documentation ¶

Index ¶

func NewMeterFromEnv(ctx context.Context, stdout io.Writer, promReader sdkmetric.Reader) (metric.Meter, func(context.Context) error, error)
type Factory
- func NewMetricsFactory(meter metric.Meter, requestHeaderLabelMapping map[string]string, ...) Factory
type GenAIOperation
type MCPErrorType
type MCPMetrics
- func NewMCP(meter metric.Meter, requestHeaderAttributeMapping map[string]string) MCPMetrics
type MCPStatusType
type Metrics
type TokenUsage
- func ExtractTokenUsageFromExplicitCaching(inputTokens, outputTokens int64, cacheReadTokens, cacheCreationTokens *int64) TokenUsage

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func NewMeterFromEnv ¶ added in v0.5.0

func NewMeterFromEnv(ctx context.Context, stdout io.Writer, promReader sdkmetric.Reader) (metric.Meter, func(context.Context) error, error)

NewMeterFromEnv configures an OpenTelemetry MeterProvider based on environment variables, always incorporating the provided Prometheus reader. It optionally includes additional exporters (e.g., console or OTLP) if enabled via environment variables. The function returns a metric.Meter for instrumentation and a shutdown function to gracefully close the provider.

The stdout parameter directs output for the console exporter (use os.Stdout in production). Environment variables checked directly include:

OTEL_SDK_DISABLED: If "true", disables OTEL exporters.
OTEL_METRICS_EXPORTER: Supported values are "none", "console", "prometheus", "otlp".
OTEL_EXPORTER_OTLP_ENDPOINT or OTEL_EXPORTER_OTLP_METRICS_ENDPOINT: Enables OTLP if set.

Prometheus is always enabled via the provided promReader; other exporters are added conditionally.
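The selection rules above can be sketched roughly as follows. This is not the package's implementation: `enabledExporters` is a hypothetical helper, and its precedence rules (that `OTEL_SDK_DISABLED` wins over everything else, and that an OTLP endpoint enables OTLP only when `OTEL_METRICS_EXPORTER` is unset) are assumptions made for illustration.

```go
package main

import "fmt"

// enabledExporters is a hypothetical helper, not part of the metrics package.
// It returns the names of the exporters that would be active for a given
// environment lookup function. Prometheus is always included because the
// caller always supplies a Prometheus reader.
func enabledExporters(getenv func(string) string) []string {
	exporters := []string{"prometheus"}

	// Assumption: a disabled SDK leaves only the Prometheus reader.
	if getenv("OTEL_SDK_DISABLED") == "true" {
		return exporters
	}

	switch getenv("OTEL_METRICS_EXPORTER") {
	case "console":
		exporters = append(exporters, "console")
	case "otlp":
		exporters = append(exporters, "otlp")
	case "none", "prometheus":
		// Nothing to add beyond the Prometheus reader.
	case "":
		// Assumption: with no explicit exporter, an OTLP endpoint enables OTLP.
		if getenv("OTEL_EXPORTER_OTLP_ENDPOINT") != "" ||
			getenv("OTEL_EXPORTER_OTLP_METRICS_ENDPOINT") != "" {
			exporters = append(exporters, "otlp")
		}
	}
	return exporters
}

func main() {
	// A fixed map stands in for the process environment to keep the run deterministic.
	env := map[string]string{"OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317"}
	fmt.Println(enabledExporters(func(k string) string { return env[k] }))
}
```

Passing the lookup function as a parameter keeps the sketch testable without mutating the real process environment.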

Types ¶

type Factory ¶ added in v0.5.0

type Factory interface {
	// NewMetrics creates a new Metrics instance for the operation the Factory was created with.
	NewMetrics() Metrics
}

Factory creates new Metrics instances for the operation it was constructed with.

func NewMetricsFactory ¶ added in v0.5.0

func NewMetricsFactory(meter metric.Meter, requestHeaderLabelMapping map[string]string, operation GenAIOperation) Factory

NewMetricsFactory returns a Factory to create a new Metrics instance.

type GenAIOperation ¶ added in v0.5.0

type GenAIOperation string

GenAIOperation represents the type of generative AI operation, i.e., the endpoint being called.

const (
	GenAIOperationChat            GenAIOperation = "chat"
	GenAIOperationCompletion      GenAIOperation = "completion"
	GenAIOperationEmbedding       GenAIOperation = "embeddings"
	GenAIOperationMessages        GenAIOperation = "messages"
	GenAIOperationImageGeneration GenAIOperation = "image_generation"
	GenAIOperationResponses       GenAIOperation = "responses"
	GenAIOperationSpeech          GenAIOperation = "speech"
	GenAIOperationRerank          GenAIOperation = "rerank"
)

type MCPErrorType ¶ added in v0.4.0

type MCPErrorType string

MCPErrorType defines the type of error that occurred during an MCP request.

const (
	// MCPErrorUnsupportedProtocolVersion indicates that the protocol version is not supported.
	MCPErrorUnsupportedProtocolVersion MCPErrorType = "unsupported_protocol_version"
	// MCPErrorInvalidJSONRPC indicates that the JSON-RPC request is invalid.
	MCPErrorInvalidJSONRPC MCPErrorType = "invalid_json_rpc"
	// MCPErrorUnsupportedMethod indicates that the method is not supported.
	MCPErrorUnsupportedMethod MCPErrorType = "unsupported_method"
	// MCPErrorUnsupportedResponse indicates that the response is not supported.
	MCPErrorUnsupportedResponse MCPErrorType = "unsupported_response"
	// MCPErrorInvalidParam indicates that a parameter is invalid.
	MCPErrorInvalidParam MCPErrorType = "invalid_param"
	// MCPErrorInvalidSessionID indicates that the session ID is invalid.
	MCPErrorInvalidSessionID MCPErrorType = "invalid_session_id"
	// MCPErrorInternal indicates that an internal error occurred.
	MCPErrorInternal MCPErrorType = "internal_error"
)

type MCPMetrics ¶ added in v0.4.0

type MCPMetrics interface {
	// WithRequestAttributes returns a new MCPMetrics instance with default attributes extracted from the HTTP request.
	WithRequestAttributes(req *http.Request) MCPMetrics
	// WithBackend returns a new MCPMetrics instance with the backend attribute set.
	// This allows metrics to be filtered/sorted by the upstream MCP backend that handled the request.
	WithBackend(backend string) MCPMetrics
	// RecordRequestDuration records the duration of a successful MCP request.
	RecordRequestDuration(ctx context.Context, startAt time.Time, meta mcpsdk.Params)
	// RecordRequestErrorDuration records the duration of an MCP request that resulted in an error.
	RecordRequestErrorDuration(ctx context.Context, startAt time.Time, errType MCPErrorType, meta mcpsdk.Params)
	// RecordMethodCount records the count of method invocations.
	RecordMethodCount(ctx context.Context, methodName string, meta mcpsdk.Params)
	// RecordMethodErrorCount records the count of method invocations with error status.
	RecordMethodErrorCount(ctx context.Context, methodName string, meta mcpsdk.Params, status MCPStatusType)
	// RecordInitializationDuration records the duration of MCP initialization.
	RecordInitializationDuration(ctx context.Context, startAt time.Time, meta mcpsdk.Params)
	// RecordClientCapabilities records the negotiated client capabilities.
	RecordClientCapabilities(ctx context.Context, capabilities *mcpsdk.ClientCapabilities, meta mcpsdk.Params)
	// RecordServerCapabilities records the negotiated server capabilities.
	RecordServerCapabilities(ctx context.Context, capabilities *mcpsdk.ServerCapabilities, meta mcpsdk.Params)
	// RecordProgress records a progress notification sent/received.
	RecordProgress(ctx context.Context, meta mcpsdk.Params)
}

MCPMetrics holds metrics for MCP.
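WithRequestAttributes and WithBackend are documented as returning a new MCPMetrics instance rather than mutating the receiver. A minimal sketch of that copy-on-derive pattern is shown below. The `recorder` type, the `withBackend` method, and the `"backend"` attribute key are all hypothetical and do not reflect how the package stores attributes.

```go
package main

import "fmt"

// recorder is a hypothetical stand-in for an MCPMetrics implementation.
// It only carries the attributes that would be attached to each measurement.
type recorder struct {
	attrs map[string]string
}

// withBackend returns a new recorder whose attributes include the backend.
// The receiver's attribute map is copied, so the original stays unchanged.
func (r recorder) withBackend(backend string) recorder {
	attrs := make(map[string]string, len(r.attrs)+1)
	for k, v := range r.attrs {
		attrs[k] = v
	}
	attrs["backend"] = backend // assumed attribute key, for illustration only
	return recorder{attrs: attrs}
}

func main() {
	base := recorder{attrs: map[string]string{"tenant": "acme"}}
	derived := base.withBackend("github-mcp")

	fmt.Println(len(base.attrs), len(derived.attrs)) // prints "1 2"
	fmt.Println(derived.attrs["backend"])            // prints "github-mcp"
}
```

Deriving a fresh value per request keeps a shared base instance safe to reuse across concurrent requests, since no request ever writes to it.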

func NewMCP ¶ added in v0.4.0

func NewMCP(meter metric.Meter, requestHeaderAttributeMapping map[string]string) MCPMetrics

NewMCP creates a new MCP metrics instance.

type MCPStatusType ¶ added in v0.5.0

type MCPStatusType string

MCPStatusType defines the status of an MCP request.

const (
	MCPStatusSuccess MCPStatusType = "success"
	MCPStatusFailed  MCPStatusType = "failed"
	MCPStatusError   MCPStatusType = "error"
)

type Metrics ¶ added in v0.5.0

type Metrics interface {
	// StartRequest initializes timing for a new request.
	StartRequest(headers map[string]string)
	// SetOriginalModel sets the original model from the incoming request body before any virtualization applies.
	// This is usually called after parsing the request body. Example: gpt-5
	SetOriginalModel(originalModel internalapi.OriginalModel)
	// SetRequestModel sets the model from the request. This is usually called after parsing the request body.
	// Example: gpt-5-nano
	SetRequestModel(requestModel internalapi.RequestModel)
	// SetResponseModel sets the model that ultimately generated the response.
	// Example: gpt-5-nano-2025-08-07
	SetResponseModel(responseModel internalapi.ResponseModel)
	// SetBackend sets the selected backend when the routing decision has been made. This is usually called
	// after parsing the request body to determine the model and invoke the routing logic.
	SetBackend(backend *filterapi.Backend)
	// RecordRequestCompletion records the completion of the request, including success status.
	RecordRequestCompletion(ctx context.Context, success bool, requestHeaders map[string]string)
	// RecordTokenUsage records token usage metrics.
	//
	// Depending on the endpoint, some token types are not available and should be passed as OptUint32None.
	RecordTokenUsage(ctx context.Context, usage TokenUsage, requestHeaders map[string]string)

	// GetTimeToFirstTokenMs returns the time to first token in stream mode in milliseconds.
	GetTimeToFirstTokenMs() float64
	// GetInterTokenLatencyMs returns the inter token latency in stream mode in milliseconds.
	GetInterTokenLatencyMs() float64
	// RecordTokenLatency records latency metrics for token generation.
	RecordTokenLatency(ctx context.Context, accumulatedOutputToken uint32, endOfStream bool, requestHeaders map[string]string)
}

Metrics is the interface for the base AI Gateway metrics.

type TokenUsage ¶ added in v0.5.0

type TokenUsage struct {
	// contains filtered or unexported fields
}

TokenUsage represents the token usage, usually reported by the backend API in the response body.

Fields are not exported to control the optionality of each field via the accompanying boolean flags.
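The value-plus-flag design lets callers distinguish "the backend reported zero tokens" from "the backend reported nothing". The following is a minimal sketch of that pattern for a single counter. The `usage` type and its field names are hypothetical; the real TokenUsage fields are unexported and may be laid out differently.

```go
package main

import "fmt"

// usage is a hypothetical stand-in for TokenUsage with a single counter.
type usage struct {
	inputTokens    uint32
	inputTokensSet bool // true once the value has been set or incremented
}

// SetInputTokens overwrites the value and marks the field as set.
func (u *usage) SetInputTokens(n uint32) {
	u.inputTokens = n
	u.inputTokensSet = true
}

// AddInputTokens increments the value and marks the field as set.
func (u *usage) AddInputTokens(n uint32) {
	u.inputTokens += n
	u.inputTokensSet = true
}

// InputTokens returns the value and whether it was ever set.
func (u *usage) InputTokens() (uint32, bool) {
	return u.inputTokens, u.inputTokensSet
}

func main() {
	var u usage
	_, ok := u.InputTokens()
	fmt.Println(ok) // prints "false": nothing reported yet

	u.SetInputTokens(0)
	n, ok := u.InputTokens()
	fmt.Println(n, ok) // prints "0 true": an explicit zero is still "set"

	u.AddInputTokens(7)
	n, _ = u.InputTokens()
	fmt.Println(n) // prints "7"
}
```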

func ExtractTokenUsageFromExplicitCaching ¶ added in v0.5.0

func ExtractTokenUsageFromExplicitCaching(inputTokens, outputTokens int64, cacheReadTokens, cacheCreationTokens *int64) TokenUsage

ExtractTokenUsageFromExplicitCaching extracts the correct token usage from an upstream Anthropic or AWS Bedrock token usage response. The total number of input tokens is the sum input_tokens + cache_creation_input_tokens + cache_read_input_tokens. This unifies the usage response returned by Envoy AI Gateway for both explicit and implicit caching.

This function works for both streaming and non-streaming responses by accepting the common usage fields that exist in both the Anthropic and AWS Bedrock usage structures.
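The summation described above can be illustrated with a small standalone helper. `totalInputTokens` is hypothetical and only mirrors the documented formula; treating a nil pointer as "field absent, contributes zero" is an assumption, not confirmed behavior of ExtractTokenUsageFromExplicitCaching.

```go
package main

import "fmt"

// totalInputTokens is a hypothetical helper mirroring the documented sum:
// input_tokens + cache_creation_input_tokens + cache_read_input_tokens.
// Assumption: a nil pointer means the upstream did not report that field,
// so it contributes nothing to the total.
func totalInputTokens(inputTokens int64, cacheReadTokens, cacheCreationTokens *int64) int64 {
	total := inputTokens
	if cacheReadTokens != nil {
		total += *cacheReadTokens
	}
	if cacheCreationTokens != nil {
		total += *cacheCreationTokens
	}
	return total
}

func main() {
	read, created := int64(300), int64(50)
	fmt.Println(totalInputTokens(100, &read, &created)) // prints "450"
	fmt.Println(totalInputTokens(100, nil, nil))        // prints "100"
}
```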

func (*TokenUsage) AddCacheCreationInputTokens ¶ added in v0.5.0

func (u *TokenUsage) AddCacheCreationInputTokens(tokens uint32)

AddCacheCreationInputTokens increments the recorded cache creation input tokens and marks the field as set.

func (*TokenUsage) AddCachedInputTokens ¶ added in v0.5.0

func (u *TokenUsage) AddCachedInputTokens(tokens uint32)

AddCachedInputTokens increments the recorded cached input tokens and marks the field as set.

func (*TokenUsage) AddInputTokens ¶ added in v0.5.0

func (u *TokenUsage) AddInputTokens(tokens uint32)

AddInputTokens increments the recorded input tokens and marks the field as set.

func (*TokenUsage) AddOutputTokens ¶ added in v0.5.0

func (u *TokenUsage) AddOutputTokens(tokens uint32)

AddOutputTokens increments the recorded output tokens and marks the field as set.

func (*TokenUsage) AddReasoningTokens ¶ added in v0.6.0

func (u *TokenUsage) AddReasoningTokens(tokens uint32)

AddReasoningTokens increments the recorded reasoning tokens and marks the field as set.

func (*TokenUsage) CacheCreationInputTokens ¶ added in v0.5.0

func (u *TokenUsage) CacheCreationInputTokens() (uint32, bool)

CacheCreationInputTokens returns the number of cache creation input tokens and whether it was set.

func (*TokenUsage) CachedInputTokens ¶ added in v0.5.0

func (u *TokenUsage) CachedInputTokens() (uint32, bool)

CachedInputTokens returns the number of cached input tokens and whether it was set.

func (*TokenUsage) InputTokens ¶ added in v0.5.0

func (u *TokenUsage) InputTokens() (uint32, bool)

InputTokens returns the number of input tokens and whether it was set.

func (*TokenUsage) OutputTokens ¶ added in v0.5.0

func (u *TokenUsage) OutputTokens() (uint32, bool)

OutputTokens returns the number of output tokens and whether it was set.

func (*TokenUsage) Override ¶ added in v0.5.0

func (u *TokenUsage) Override(other TokenUsage)

Override updates the TokenUsage fields with values from another TokenUsage instance. Only fields that are marked as set in the other instance will override the current values.
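A minimal sketch of this merge rule, using hypothetical `optUint32` and `usage` types with only two fields, might look like the following. It illustrates the documented semantics (unset fields in the other instance leave the current values alone) and is not the package's code.

```go
package main

import "fmt"

// optUint32 is a hypothetical optional counter: a value plus a "set" flag.
type optUint32 struct {
	value uint32
	set   bool
}

// usage is a hypothetical two-field stand-in for TokenUsage.
type usage struct {
	input, output optUint32
}

// override copies only the fields that are marked as set in other.
func (u *usage) override(other usage) {
	if other.input.set {
		u.input = other.input
	}
	if other.output.set {
		u.output = other.output
	}
}

func main() {
	current := usage{
		input:  optUint32{value: 10, set: true},
		output: optUint32{value: 5, set: true},
	}
	// Only output is set in the update, so input must survive unchanged.
	update := usage{output: optUint32{value: 42, set: true}}

	current.override(update)
	fmt.Println(current.input.value, current.output.value) // prints "10 42"
}
```

This behavior suits streaming responses, where a later chunk may report only some of the counters and should not erase those seen earlier.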

func (*TokenUsage) ReasoningTokens ¶ added in v0.6.0

func (u *TokenUsage) ReasoningTokens() (uint32, bool)

ReasoningTokens returns the number of reasoning tokens and whether it was set.

func (*TokenUsage) SetCacheCreationInputTokens ¶ added in v0.5.0

func (u *TokenUsage) SetCacheCreationInputTokens(tokens uint32)

SetCacheCreationInputTokens sets the number of cache creation input tokens and marks the field as set.

func (*TokenUsage) SetCachedInputTokens ¶ added in v0.5.0

func (u *TokenUsage) SetCachedInputTokens(tokens uint32)

SetCachedInputTokens sets the number of cached input tokens and marks the field as set.

func (*TokenUsage) SetInputTokens ¶ added in v0.5.0

func (u *TokenUsage) SetInputTokens(tokens uint32)

SetInputTokens sets the number of input tokens and marks the field as set.

func (*TokenUsage) SetOutputTokens ¶ added in v0.5.0

func (u *TokenUsage) SetOutputTokens(tokens uint32)

SetOutputTokens sets the number of output tokens and marks the field as set.

func (*TokenUsage) SetReasoningTokens ¶ added in v0.6.0

func (u *TokenUsage) SetReasoningTokens(tokens uint32)

SetReasoningTokens sets the number of reasoning tokens and marks the field as set.

func (*TokenUsage) SetTotalTokens ¶ added in v0.5.0

func (u *TokenUsage) SetTotalTokens(tokens uint32)

SetTotalTokens sets the number of total tokens and marks the field as set.

func (*TokenUsage) TotalTokens ¶ added in v0.5.0

func (u *TokenUsage) TotalTokens() (uint32, bool)

TotalTokens returns the number of total tokens and whether it was set.
