v1alpha1

package

Version: v0.3.0 (latest) | Published: Aug 21, 2025 | License: Apache-2.0 | Imports: 8 | Imported by: 0

Repository

github.com/envoyproxy/ai-gateway

Documentation ¶

Overview ¶

Package v1alpha1 contains API schema definitions for the aigateway.envoyproxy.io API group.

+kubebuilder:object:generate=true
+groupName=aigateway.envoyproxy.io

Index ¶

Constants
Variables
type AIGatewayFilterConfig
- func (in *AIGatewayFilterConfig) DeepCopy() *AIGatewayFilterConfig
- func (in *AIGatewayFilterConfig) DeepCopyInto(out *AIGatewayFilterConfig)
type AIGatewayFilterConfigExternalProcessor
- func (in *AIGatewayFilterConfigExternalProcessor) DeepCopy() *AIGatewayFilterConfigExternalProcessor
- func (in *AIGatewayFilterConfigExternalProcessor) DeepCopyInto(out *AIGatewayFilterConfigExternalProcessor)
type AIGatewayFilterConfigType
type AIGatewayRoute
- func (in *AIGatewayRoute) DeepCopy() *AIGatewayRoute
- func (in *AIGatewayRoute) DeepCopyInto(out *AIGatewayRoute)
- func (in *AIGatewayRoute) DeepCopyObject() runtime.Object
type AIGatewayRouteList
- func (in *AIGatewayRouteList) DeepCopy() *AIGatewayRouteList
- func (in *AIGatewayRouteList) DeepCopyInto(out *AIGatewayRouteList)
- func (in *AIGatewayRouteList) DeepCopyObject() runtime.Object
type AIGatewayRouteRule
- func (in *AIGatewayRouteRule) DeepCopy() *AIGatewayRouteRule
- func (in *AIGatewayRouteRule) DeepCopyInto(out *AIGatewayRouteRule)
- func (r *AIGatewayRouteRule) GetTimeoutsOrDefault() *gwapiv1.HTTPRouteTimeouts
- func (r *AIGatewayRouteRule) HasAIServiceBackends() bool
- func (r *AIGatewayRouteRule) HasInferencePoolBackends() bool
type AIGatewayRouteRuleBackendRef
- func (in *AIGatewayRouteRuleBackendRef) DeepCopy() *AIGatewayRouteRuleBackendRef
- func (in *AIGatewayRouteRuleBackendRef) DeepCopyInto(out *AIGatewayRouteRuleBackendRef)
- func (ref *AIGatewayRouteRuleBackendRef) IsAIServiceBackend() bool
- func (ref *AIGatewayRouteRuleBackendRef) IsInferencePool() bool
type AIGatewayRouteRuleMatch
- func (in *AIGatewayRouteRuleMatch) DeepCopy() *AIGatewayRouteRuleMatch
- func (in *AIGatewayRouteRuleMatch) DeepCopyInto(out *AIGatewayRouteRuleMatch)
type AIGatewayRouteSpec
- func (in *AIGatewayRouteSpec) DeepCopy() *AIGatewayRouteSpec
- func (in *AIGatewayRouteSpec) DeepCopyInto(out *AIGatewayRouteSpec)
type AIGatewayRouteStatus
- func (in *AIGatewayRouteStatus) DeepCopy() *AIGatewayRouteStatus
- func (in *AIGatewayRouteStatus) DeepCopyInto(out *AIGatewayRouteStatus)
type AIServiceBackend
- func (in *AIServiceBackend) DeepCopy() *AIServiceBackend
- func (in *AIServiceBackend) DeepCopyInto(out *AIServiceBackend)
- func (in *AIServiceBackend) DeepCopyObject() runtime.Object
type AIServiceBackendList
- func (in *AIServiceBackendList) DeepCopy() *AIServiceBackendList
- func (in *AIServiceBackendList) DeepCopyInto(out *AIServiceBackendList)
- func (in *AIServiceBackendList) DeepCopyObject() runtime.Object
type AIServiceBackendSpec
- func (in *AIServiceBackendSpec) DeepCopy() *AIServiceBackendSpec
- func (in *AIServiceBackendSpec) DeepCopyInto(out *AIServiceBackendSpec)
type AIServiceBackendStatus
- func (in *AIServiceBackendStatus) DeepCopy() *AIServiceBackendStatus
- func (in *AIServiceBackendStatus) DeepCopyInto(out *AIServiceBackendStatus)
type APISchema
type AWSCredentialsFile
- func (in *AWSCredentialsFile) DeepCopy() *AWSCredentialsFile
- func (in *AWSCredentialsFile) DeepCopyInto(out *AWSCredentialsFile)
type AWSOIDCExchangeToken
- func (in *AWSOIDCExchangeToken) DeepCopy() *AWSOIDCExchangeToken
- func (in *AWSOIDCExchangeToken) DeepCopyInto(out *AWSOIDCExchangeToken)
type AzureOIDCExchangeToken
- func (in *AzureOIDCExchangeToken) DeepCopy() *AzureOIDCExchangeToken
- func (in *AzureOIDCExchangeToken) DeepCopyInto(out *AzureOIDCExchangeToken)
type BackendSecurityPolicy
- func (in *BackendSecurityPolicy) DeepCopy() *BackendSecurityPolicy
- func (in *BackendSecurityPolicy) DeepCopyInto(out *BackendSecurityPolicy)
- func (in *BackendSecurityPolicy) DeepCopyObject() runtime.Object
type BackendSecurityPolicyAPIKey
- func (in *BackendSecurityPolicyAPIKey) DeepCopy() *BackendSecurityPolicyAPIKey
- func (in *BackendSecurityPolicyAPIKey) DeepCopyInto(out *BackendSecurityPolicyAPIKey)
type BackendSecurityPolicyAWSCredentials
- func (in *BackendSecurityPolicyAWSCredentials) DeepCopy() *BackendSecurityPolicyAWSCredentials
- func (in *BackendSecurityPolicyAWSCredentials) DeepCopyInto(out *BackendSecurityPolicyAWSCredentials)
type BackendSecurityPolicyAzureCredentials
- func (in *BackendSecurityPolicyAzureCredentials) DeepCopy() *BackendSecurityPolicyAzureCredentials
- func (in *BackendSecurityPolicyAzureCredentials) DeepCopyInto(out *BackendSecurityPolicyAzureCredentials)
type BackendSecurityPolicyGCPCredentials
- func (in *BackendSecurityPolicyGCPCredentials) DeepCopy() *BackendSecurityPolicyGCPCredentials
- func (in *BackendSecurityPolicyGCPCredentials) DeepCopyInto(out *BackendSecurityPolicyGCPCredentials)
type BackendSecurityPolicyList
- func (in *BackendSecurityPolicyList) DeepCopy() *BackendSecurityPolicyList
- func (in *BackendSecurityPolicyList) DeepCopyInto(out *BackendSecurityPolicyList)
- func (in *BackendSecurityPolicyList) DeepCopyObject() runtime.Object
type BackendSecurityPolicyOIDC
- func (in *BackendSecurityPolicyOIDC) DeepCopy() *BackendSecurityPolicyOIDC
- func (in *BackendSecurityPolicyOIDC) DeepCopyInto(out *BackendSecurityPolicyOIDC)
type BackendSecurityPolicySpec
- func (in *BackendSecurityPolicySpec) DeepCopy() *BackendSecurityPolicySpec
- func (in *BackendSecurityPolicySpec) DeepCopyInto(out *BackendSecurityPolicySpec)
type BackendSecurityPolicyStatus
- func (in *BackendSecurityPolicyStatus) DeepCopy() *BackendSecurityPolicyStatus
- func (in *BackendSecurityPolicyStatus) DeepCopyInto(out *BackendSecurityPolicyStatus)
type BackendSecurityPolicyType
type GCPCredentialsFile
- func (in *GCPCredentialsFile) DeepCopy() *GCPCredentialsFile
- func (in *GCPCredentialsFile) DeepCopyInto(out *GCPCredentialsFile)
type GCPOIDCExchangeToken
- func (in *GCPOIDCExchangeToken) DeepCopy() *GCPOIDCExchangeToken
- func (in *GCPOIDCExchangeToken) DeepCopyInto(out *GCPOIDCExchangeToken)
type GCPServiceAccountImpersonationConfig
- func (in *GCPServiceAccountImpersonationConfig) DeepCopy() *GCPServiceAccountImpersonationConfig
- func (in *GCPServiceAccountImpersonationConfig) DeepCopyInto(out *GCPServiceAccountImpersonationConfig)
type GCPWorkloadIdentityFederationConfig
- func (in *GCPWorkloadIdentityFederationConfig) DeepCopy() *GCPWorkloadIdentityFederationConfig
- func (in *GCPWorkloadIdentityFederationConfig) DeepCopyInto(out *GCPWorkloadIdentityFederationConfig)
type GCPWorkloadIdentityProvider
- func (in *GCPWorkloadIdentityProvider) DeepCopy() *GCPWorkloadIdentityProvider
- func (in *GCPWorkloadIdentityProvider) DeepCopyInto(out *GCPWorkloadIdentityProvider)
type LLMRequestCost
- func (in *LLMRequestCost) DeepCopy() *LLMRequestCost
- func (in *LLMRequestCost) DeepCopyInto(out *LLMRequestCost)
type LLMRequestCostType
type VersionedAPISchema
- func (in *VersionedAPISchema) DeepCopy() *VersionedAPISchema
- func (in *VersionedAPISchema) DeepCopyInto(out *VersionedAPISchema)

Constants ¶

const (
	// ConditionTypeAccepted is a condition type for the reconciliation result
	// where resources are accepted.
	ConditionTypeAccepted = "Accepted"
	// ConditionTypeNotAccepted is a condition type for the reconciliation result
	// where resources are not accepted.
	ConditionTypeNotAccepted = "NotAccepted"
)

const (
	// AIGatewayFilterMetadataNamespace is the namespace for the ai-gateway filter metadata.
	AIGatewayFilterMetadataNamespace = "io.envoy.ai_gateway"
)

const (
	// AIModelHeaderKey is the header key whose value is extracted from the request by the ai-gateway.
	// This can be used to describe the routing behavior in HTTPRoute referenced by AIGatewayRoute.
	AIModelHeaderKey = "x-ai-eg-model"
)

const GroupName = "aigateway.envoyproxy.io"

Variables ¶

var (

	// SchemeBuilder is used to add go types to the GroupVersionKind scheme.
	SchemeBuilder = &scheme.Builder{GroupVersion: schemeGroupVersion}

	// AddToScheme adds the types in this group-version to the given scheme.
	AddToScheme = SchemeBuilder.AddToScheme
)

Functions ¶

This section is empty.

Types ¶

type AIGatewayFilterConfig ¶

type AIGatewayFilterConfig struct {
	// Type specifies the type of the filter configuration.
	//
	// Currently, only ExternalProcessor is supported, and it is the default.
	//
	// +kubebuilder:default=ExternalProcessor
	Type AIGatewayFilterConfigType `json:"type"`

	// ExternalProcessor is the configuration for the external processor filter.
	// This is optional, and if not set, the default values of Deployment spec will be used.
	//
	// +optional
	ExternalProcessor *AIGatewayFilterConfigExternalProcessor `json:"externalProcessor,omitempty"`
}

func (*AIGatewayFilterConfig) DeepCopy ¶

func (in *AIGatewayFilterConfig) DeepCopy() *AIGatewayFilterConfig

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AIGatewayFilterConfig.

func (*AIGatewayFilterConfig) DeepCopyInto ¶

func (in *AIGatewayFilterConfig) DeepCopyInto(out *AIGatewayFilterConfig)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type AIGatewayFilterConfigExternalProcessor ¶

type AIGatewayFilterConfigExternalProcessor struct {
	// Resources required by the external processor container.
	// More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
	//
	// Note: when multiple AIGatewayRoute resources are attached to the same Gateway, and each
	// AIGatewayRoute has a different resource configuration, the ai-gateway will pick one of them
	// to configure the resource requirements of the external processor container.
	//
	// +optional
	Resources *corev1.ResourceRequirements `json:"resources,omitempty"`
}

func (*AIGatewayFilterConfigExternalProcessor) DeepCopy ¶

func (in *AIGatewayFilterConfigExternalProcessor) DeepCopy() *AIGatewayFilterConfigExternalProcessor

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AIGatewayFilterConfigExternalProcessor.

func (*AIGatewayFilterConfigExternalProcessor) DeepCopyInto ¶

func (in *AIGatewayFilterConfigExternalProcessor) DeepCopyInto(out *AIGatewayFilterConfigExternalProcessor)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type AIGatewayFilterConfigType ¶

type AIGatewayFilterConfigType string

AIGatewayFilterConfigType specifies the type of the filter configuration.

+kubebuilder:validation:Enum=ExternalProcessor;DynamicModule

const (
	AIGatewayFilterConfigTypeExternalProcessor AIGatewayFilterConfigType = "ExternalProcessor"
	AIGatewayFilterConfigTypeDynamicModule     AIGatewayFilterConfigType = "DynamicModule" // Reserved for https://github.com/envoyproxy/ai-gateway/issues/90
)

type AIGatewayRoute ¶

type AIGatewayRoute struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	// Spec defines the details of the AIGatewayRoute.
	Spec AIGatewayRouteSpec `json:"spec,omitempty"`
	// Status defines the status details of the AIGatewayRoute.
	Status AIGatewayRouteStatus `json:"status,omitempty"`
}

AIGatewayRoute combines multiple AIServiceBackends and attaches them to one or more Gateway resources.

This serves as a way to define a "unified" AI API for a Gateway which allows downstream clients to use a single schema API to interact with multiple AI backends.

Envoy AI Gateway will generate the following k8s resources corresponding to the AIGatewayRoute:

- HTTPRoute of the Gateway API as a top-level resource to bind all backends. The name of the HTTPRoute is the same as the AIGatewayRoute.
- HTTPRouteFilter of the Envoy Gateway API per namespace for automatic hostname rewrite. The name of the HTTPRouteFilter is `ai-eg-host-rewrite-${AIGatewayRoute.Name}`.

All of these resources are created in the same namespace as the AIGatewayRoute. Note that this is an implementation detail and is subject to change. If you want to customize the default behavior of the Envoy AI Gateway, you can use these resources as a reference and create your own resources. Alternatively, you can use the EnvoyPatchPolicy API of the Envoy Gateway to patch the generated resources. For example, you can configure the retry and fallback behavior by attaching the BackendTrafficPolicy API of Envoy Gateway to the generated HTTPRoute.

+kubebuilder:object:root=true
+kubebuilder:subresource:status
+kubebuilder:printcolumn:name="Status",type=string,JSONPath=`.status.conditions[-1:].type`
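
The naming convention for the generated resources can be sketched as follows. The helper names `generatedHTTPRouteName` and `hostRewriteFilterName` are hypothetical and not part of this package; only the two name patterns come from the description above, and since this is an implementation detail, the patterns may change.

```go
package main

import "fmt"

// generatedHTTPRouteName is a hypothetical helper: per the description above,
// the generated HTTPRoute has the same name as the AIGatewayRoute.
func generatedHTTPRouteName(routeName string) string {
	return routeName
}

// hostRewriteFilterName is a hypothetical helper that applies the documented
// pattern ai-eg-host-rewrite-${AIGatewayRoute.Name}.
func hostRewriteFilterName(routeName string) string {
	return "ai-eg-host-rewrite-" + routeName
}

func main() {
	// "my-route" is an illustrative AIGatewayRoute name.
	fmt.Println(generatedHTTPRouteName("my-route"))
	fmt.Println(hostRewriteFilterName("my-route"))
}
```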

func (*AIGatewayRoute) DeepCopy ¶

func (in *AIGatewayRoute) DeepCopy() *AIGatewayRoute

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AIGatewayRoute.

func (*AIGatewayRoute) DeepCopyInto ¶

func (in *AIGatewayRoute) DeepCopyInto(out *AIGatewayRoute)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*AIGatewayRoute) DeepCopyObject ¶

func (in *AIGatewayRoute) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type AIGatewayRouteList ¶

type AIGatewayRouteList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []AIGatewayRoute `json:"items"`
}

AIGatewayRouteList contains a list of AIGatewayRoute.

+kubebuilder:object:root=true

func (*AIGatewayRouteList) DeepCopy ¶

func (in *AIGatewayRouteList) DeepCopy() *AIGatewayRouteList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AIGatewayRouteList.

func (*AIGatewayRouteList) DeepCopyInto ¶

func (in *AIGatewayRouteList) DeepCopyInto(out *AIGatewayRouteList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*AIGatewayRouteList) DeepCopyObject ¶

func (in *AIGatewayRouteList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type AIGatewayRouteRule ¶

type AIGatewayRouteRule struct {
	// BackendRefs is the list of backends that this rule will route the traffic to.
	// Each backend can have a weight that determines the traffic distribution.
	//
	// The namespace of each backend is "local", i.e. the same namespace as the AIGatewayRoute.
	//
	// BackendRefs can reference either AIServiceBackend resources (default) or InferencePool resources
	// from the Gateway API Inference Extension. When referencing InferencePool resources:
	// - Only one InferencePool backend is allowed per rule
	// - Cannot mix InferencePool with AIServiceBackend references in the same rule
	// - Fallback behavior is handled by the InferencePool's endpoint picker
	//
	// For AIServiceBackend references, you can achieve fallback behavior by configuring multiple backends
	// combined with the BackendTrafficPolicy of Envoy Gateway.
	// Please refer to https://gateway.envoyproxy.io/docs/tasks/traffic/failover/ as well as
	// https://gateway.envoyproxy.io/docs/tasks/traffic/retry/.
	//
	// +optional
	// +kubebuilder:validation:MaxItems=128
	BackendRefs []AIGatewayRouteRuleBackendRef `json:"backendRefs,omitempty"`

	// Matches is the list of AIGatewayRouteRuleMatch that this rule matches traffic against.
	// This is a subset of the HTTPRouteMatch in the Gateway API. For details, see:
	// https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io%2fv1.HTTPRouteMatch
	//
	// +optional
	// +kubebuilder:validation:MaxItems=128
	Matches []AIGatewayRouteRuleMatch `json:"matches,omitempty"`

	// Timeouts defines the timeouts that can be configured for an HTTP request.
	//
	// If this field is not set, or its request timeout (timeouts.request) is nil, Envoy AI Gateway
	// sets the request timeout to 60s, as opposed to the Envoy Gateway default of 15s.
	//
	// For streaming responses (like chat completions with stream=true), consider setting
	// longer timeouts as the response may take time until the completion.
	//
	// +optional
	Timeouts *gwapiv1.HTTPRouteTimeouts `json:"timeouts,omitempty"`

	// ModelsOwnedBy represents the owner of the running models served by the backends,
	// which will be exported as the "OwnedBy" field in the OpenAI-compatible "/models" API.
	//
	// This is used only when this rule contains "x-ai-eg-model" in its header matching
	// where the header value will be recognized as a "model" in "/models" endpoint.
	// All the matched models will share the same owner.
	//
	// Defaults to "Envoy AI Gateway" if not set.
	//
	// +optional
	// +kubebuilder:default="Envoy AI Gateway"
	ModelsOwnedBy *string `json:"modelsOwnedBy,omitempty"`

	// ModelsCreatedAt represents the creation timestamp of the running models served by the backends,
	// which will be exported as the "Created" field in the OpenAI-compatible "/models" API.
	// It follows the format of RFC 3339, for example "2024-05-21T10:00:00Z".
	//
	// This is used only when this rule contains "x-ai-eg-model" in its header matching
	// where the header value will be recognized as a "model" in "/models" endpoint.
	// All the matched models will share the same creation time.
	//
	// Defaults to the creation timestamp of the AIGatewayRoute if not set.
	//
	// +optional
	// +kubebuilder:validation:Format=date-time
	ModelsCreatedAt *metav1.Time `json:"modelsCreatedAt,omitempty"`
}

AIGatewayRouteRule is a rule that defines the routing behavior of the AIGatewayRoute.
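
As an illustration of the ModelsCreatedAt format, the sketch below parses the RFC 3339 example from the field comment with the standard library. The helper `createdUnix` is hypothetical, and converting to Unix seconds is an assumption based on how the OpenAI models API usually reports `created`; the package does not state how the value is serialized.

```go
package main

import (
	"fmt"
	"time"
)

// createdUnix is a hypothetical helper. It parses an RFC 3339 timestamp, the
// format required by ModelsCreatedAt, and returns it as Unix seconds.
// Returning Unix seconds is an assumption, not documented behavior.
func createdUnix(rfc3339 string) (int64, error) {
	t, err := time.Parse(time.RFC3339, rfc3339)
	if err != nil {
		return 0, err
	}
	return t.Unix(), nil
}

func main() {
	sec, err := createdUnix("2024-05-21T10:00:00Z")
	if err != nil {
		panic(err)
	}
	fmt.Println(sec) // 1716285600
}
```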

+kubebuilder:validation:XValidation:rule="!has(self.backendRefs) || size(self.backendRefs) == 0 || (self.backendRefs.all(ref, !has(ref.group) && !has(ref.kind)) || self.backendRefs.all(ref, has(ref.group) && has(ref.kind)))", message="cannot mix InferencePool and AIServiceBackend references in the same rule"
+kubebuilder:validation:XValidation:rule="!has(self.backendRefs) || size(self.backendRefs) == 0 || !self.backendRefs.exists(ref, has(ref.group) && has(ref.kind)) || size(self.backendRefs) == 1", message="only one InferencePool backend is allowed per rule"
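
The two rule-level validation expressions can be restated in plain Go as a rough sketch. The `backendRef` type and the function names are hypothetical stand-ins so the example is self-contained; the API server evaluates the CEL rules themselves, not this code.

```go
package main

import (
	"errors"
	"fmt"
)

// backendRef is a simplified, hypothetical stand-in for
// AIGatewayRouteRuleBackendRef with only the fields the rules inspect.
type backendRef struct {
	Name  string
	Group *string
	Kind  *string
}

// validateRuleBackendRefs restates the two XValidation rules of AIGatewayRouteRule.
func validateRuleBackendRefs(refs []backendRef) error {
	if len(refs) == 0 {
		return nil
	}
	allService, allPool, anyPool := true, true, false
	for _, ref := range refs {
		isPool := ref.Group != nil && ref.Kind != nil    // has(ref.group) && has(ref.kind)
		isService := ref.Group == nil && ref.Kind == nil // !has(ref.group) && !has(ref.kind)
		allService = allService && isService
		allPool = allPool && isPool
		anyPool = anyPool || isPool
	}
	if !allService && !allPool {
		return errors.New("cannot mix InferencePool and AIServiceBackend references in the same rule")
	}
	if anyPool && len(refs) != 1 {
		return errors.New("only one InferencePool backend is allowed per rule")
	}
	return nil
}

func strPtr(s string) *string { return &s }

func main() {
	// The backend names are illustrative.
	svc := backendRef{Name: "some-backend"}
	pool := backendRef{
		Name:  "some-pool",
		Group: strPtr("inference.networking.x-k8s.io"),
		Kind:  strPtr("InferencePool"),
	}
	fmt.Println(validateRuleBackendRefs([]backendRef{svc, svc}))
	fmt.Println(validateRuleBackendRefs([]backendRef{pool}))
	fmt.Println(validateRuleBackendRefs([]backendRef{svc, pool}))
	fmt.Println(validateRuleBackendRefs([]backendRef{pool, pool}))
}
```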

func (*AIGatewayRouteRule) DeepCopy ¶

func (in *AIGatewayRouteRule) DeepCopy() *AIGatewayRouteRule

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AIGatewayRouteRule.

func (*AIGatewayRouteRule) DeepCopyInto ¶

func (in *AIGatewayRouteRule) DeepCopyInto(out *AIGatewayRouteRule)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*AIGatewayRouteRule) GetTimeoutsOrDefault ¶ added in v0.3.0

func (r *AIGatewayRouteRule) GetTimeoutsOrDefault() *gwapiv1.HTTPRouteTimeouts

GetTimeoutsOrDefault returns the timeouts with default values applied when not specified. This ensures that AI Gateway routes have appropriate timeout defaults for AI workloads.
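
A minimal sketch of this defaulting, assuming the 60s request timeout described for the Timeouts field. The `routeTimeouts` type is a hypothetical stand-in for gwapiv1.HTTPRouteTimeouts, and `timeoutsOrDefault` is not the package's implementation; details such as whether the real method copies its input are not documented here.

```go
package main

import "fmt"

// routeTimeouts is a simplified, hypothetical stand-in for
// gwapiv1.HTTPRouteTimeouts; durations are kept as strings such as "60s".
type routeTimeouts struct {
	Request        *string
	BackendRequest *string
}

// defaultRequestTimeout is the 60s default described for AIGatewayRouteRule.Timeouts.
const defaultRequestTimeout = "60s"

// timeoutsOrDefault returns a copy of t in which the request timeout is set
// to 60s when t is nil or t.Request is unset. Other fields are preserved.
func timeoutsOrDefault(t *routeTimeouts) *routeTimeouts {
	d := defaultRequestTimeout
	if t == nil {
		return &routeTimeouts{Request: &d}
	}
	out := *t
	if out.Request == nil {
		out.Request = &d
	}
	return &out
}

func main() {
	fmt.Println(*timeoutsOrDefault(nil).Request)

	// A longer, illustrative timeout for streaming responses is kept as-is.
	custom := "300s"
	fmt.Println(*timeoutsOrDefault(&routeTimeouts{Request: &custom}).Request)
}
```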

func (*AIGatewayRouteRule) HasAIServiceBackends ¶ added in v0.3.0

func (r *AIGatewayRouteRule) HasAIServiceBackends() bool

HasAIServiceBackends returns true if the rule contains any AIServiceBackend references.

func (*AIGatewayRouteRule) HasInferencePoolBackends ¶ added in v0.3.0

func (r *AIGatewayRouteRule) HasInferencePoolBackends() bool

HasInferencePoolBackends returns true if the rule contains any InferencePool backend references.

type AIGatewayRouteRuleBackendRef ¶

type AIGatewayRouteRuleBackendRef struct {
	// Name is the name of the backend resource.
	// When Group and Kind are not specified, this refers to an AIServiceBackend.
	// When Group and Kind are specified, this refers to the resource of the specified type.
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	Name string `json:"name"`

	// Group is the group of the backend resource.
	// When not specified, defaults to aigateway.envoyproxy.io (AIServiceBackend).
	// Currently, only "inference.networking.x-k8s.io" is supported for InferencePool resources.
	//
	// +optional
	// +kubebuilder:validation:MaxLength=253
	// +kubebuilder:validation:Pattern=`^$|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`
	Group *string `json:"group,omitempty"`

	// Kind is the kind of the backend resource.
	// When not specified, defaults to AIServiceBackend.
	// Currently, only "InferencePool" is supported when Group is specified.
	//
	// +optional
	// +kubebuilder:validation:MaxLength=63
	// +kubebuilder:validation:Pattern=`^$|^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$`
	Kind *string `json:"kind,omitempty"`

	// Name of the model in the backend. If provided this will override the name provided in the request.
	// This field is ignored when referencing InferencePool resources.
	//
	// +optional
	ModelNameOverride string `json:"modelNameOverride,omitempty"`

	// Weight is the weight of the backend. This is exactly the same as the weight in
	// the BackendRef in the Gateway API. For details, see:
	// https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io%2fv1.BackendRef
	//
	// Default is 1.
	//
	// +optional
	// +kubebuilder:validation:Minimum=0
	// +kubebuilder:default=1
	Weight *int32 `json:"weight,omitempty"`
	// Priority is the priority of the backend. This sets the priority on the underlying endpoints.
	// See: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/priority
	// Note: This will override the `fallback` property of the underlying Envoy Gateway Backend.
	// This field is ignored when referencing InferencePool resources.
	//
	// Default is 0.
	//
	// +optional
	// +kubebuilder:validation:Minimum=0
	// +kubebuilder:default=0
	Priority *uint32 `json:"priority,omitempty"`
}

AIGatewayRouteRuleBackendRef is a reference to a backend with a weight. It can reference either an AIServiceBackend or an InferencePool resource.

+kubebuilder:validation:XValidation:rule="!has(self.group) && !has(self.kind) || (has(self.group) && has(self.kind))", message="group and kind must be specified together"
+kubebuilder:validation:XValidation:rule="!has(self.group) || (self.group == 'inference.networking.x-k8s.io' && self.kind == 'InferencePool')", message="only InferencePool from inference.networking.x-k8s.io group is supported"
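
The per-reference validation can likewise be sketched in plain Go. `validateBackendRef` is a hypothetical helper that takes the optional Group and Kind values directly; it only restates the two CEL rules and is not code from this package.

```go
package main

import (
	"errors"
	"fmt"
)

// validateBackendRef restates the two XValidation rules of
// AIGatewayRouteRuleBackendRef for optional group and kind values.
func validateBackendRef(group, kind *string) error {
	// Rule 1: group and kind are either both set or both unset.
	if (group == nil) != (kind == nil) {
		return errors.New("group and kind must be specified together")
	}
	// Rule 2: when set, only the InferencePool group/kind pair is accepted.
	if group != nil && (*group != "inference.networking.x-k8s.io" || *kind != "InferencePool") {
		return errors.New("only InferencePool from inference.networking.x-k8s.io group is supported")
	}
	return nil
}

func strPtr(s string) *string { return &s }

func main() {
	// Neither field set: the reference points to an AIServiceBackend.
	fmt.Println(validateBackendRef(nil, nil)) // <nil>

	// Both fields set to the supported pair: an InferencePool reference.
	fmt.Println(validateBackendRef(strPtr("inference.networking.x-k8s.io"), strPtr("InferencePool"))) // <nil>

	// Only one of the two fields set: rejected.
	fmt.Println(validateBackendRef(nil, strPtr("InferencePool")))
}
```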

func (*AIGatewayRouteRuleBackendRef) DeepCopy ¶

func (in *AIGatewayRouteRuleBackendRef) DeepCopy() *AIGatewayRouteRuleBackendRef

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AIGatewayRouteRuleBackendRef.

func (*AIGatewayRouteRuleBackendRef) DeepCopyInto ¶

func (in *AIGatewayRouteRuleBackendRef) DeepCopyInto(out *AIGatewayRouteRuleBackendRef)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*AIGatewayRouteRuleBackendRef) IsAIServiceBackend ¶ added in v0.3.0

func (ref *AIGatewayRouteRuleBackendRef) IsAIServiceBackend() bool

IsAIServiceBackend returns true if the backend reference points to an AIServiceBackend resource.

func (*AIGatewayRouteRuleBackendRef) IsInferencePool ¶ added in v0.3.0

func (ref *AIGatewayRouteRuleBackendRef) IsInferencePool() bool

IsInferencePool returns true if the backend reference points to an InferencePool resource.

type AIGatewayRouteRuleMatch ¶

type AIGatewayRouteRuleMatch struct {
	// Headers specifies HTTP request header matchers. See HTTPHeaderMatch in the Gateway API for details:
	// https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io%2fv1.HTTPHeaderMatch
	//
	// +listType=map
	// +listMapKey=name
	// +optional
	// +kubebuilder:validation:MaxItems=16
	Headers []gwapiv1.HTTPHeaderMatch `json:"headers,omitempty"`
}

func (*AIGatewayRouteRuleMatch) DeepCopy ¶

func (in *AIGatewayRouteRuleMatch) DeepCopy() *AIGatewayRouteRuleMatch

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AIGatewayRouteRuleMatch.

func (*AIGatewayRouteRuleMatch) DeepCopyInto ¶

func (in *AIGatewayRouteRuleMatch) DeepCopyInto(out *AIGatewayRouteRuleMatch)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type AIGatewayRouteSpec ¶

type AIGatewayRouteSpec struct {
	// TargetRefs are the names of the Gateway resources this AIGatewayRoute is being attached to.
	//
	// Deprecated: use the ParentRefs field instead. This field will be dropped in Envoy AI Gateway v0.4.0.
	//
	// +kubebuilder:validation:MaxItems=128
	//
	// +optional
	TargetRefs []gwapiv1a2.LocalPolicyTargetReferenceWithSectionName `json:"targetRefs,omitempty"`

	// ParentRefs are the names of the Gateway resources this AIGatewayRoute is being attached to.
	// Cross namespace references are not supported. In other words, the Gateway resources must be in the
	// same namespace as the AIGatewayRoute. Currently, each reference's Kind must be Gateway.
	//
	// +kubebuilder:validation:MaxItems=16
	// +kubebuilder:validation:XValidation:rule="self.all(match, match.kind == 'Gateway')", message="only Gateway is supported"
	//
	// +optional
	ParentRefs []gwapiv1.ParentReference `json:"parentRefs,omitempty"`

	// APISchema specifies the API schema of the input that the target Gateway(s) will receive.
	// Based on this schema, the ai-gateway will perform the necessary transformation to the
	// output schema specified in the selected AIServiceBackend during the routing process.
	//
	// Currently, the supported schemas are OpenAI and Anthropic as input schemas.
	//
	// Deprecated: this field is no longer used. It will be dropped in Envoy AI Gateway v0.4.0.
	//
	// +optional
	APISchema *VersionedAPISchema `json:"schema,omitempty"`

	// Rules is the list of AIGatewayRouteRule that this AIGatewayRoute will match the traffic to.
	// Each rule is a subset of the HTTPRoute in the Gateway API (https://gateway-api.sigs.k8s.io/api-types/httproute/).
	//
	// AI Gateway controller will generate a HTTPRoute based on the configuration given here with the additional
	// modifications to achieve the necessary jobs, notably inserting the AI Gateway filter responsible for
	// the transformation of the request and response, etc.
	//
	// In the matching conditions in the AIGatewayRouteRule, `x-ai-eg-model` header is available
	// if we want to describe the routing behavior based on the model name. The model name is extracted
	// from the request content before the routing decision.
	//
	// Multiple rules are matched in the same way as in the Gateway API. For details, see:
	// https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io%2fv1.HTTPRoute
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MaxItems=128
	Rules []AIGatewayRouteRule `json:"rules"`

	// FilterConfig is the configuration for the AI Gateway filter inserted in the generated HTTPRoute.
	//
	// An AI Gateway filter is responsible for the transformation of the request and response
	// as well as the routing behavior based on the model name extracted from the request content, etc.
	//
	// Currently, the filter is only implemented as an external processor filter, which might be
	// extended to other types of filters in the future. See https://github.com/envoyproxy/ai-gateway/issues/90
	//
	// +optional
	FilterConfig *AIGatewayFilterConfig `json:"filterConfig,omitempty"`

	// LLMRequestCosts specifies how to capture the cost of the LLM-related request, notably the token usage.
	// The AI Gateway filter will capture each specified number and store it in the Envoy's dynamic
	// metadata per HTTP request. The namespaced key is "io.envoy.ai_gateway".
	//
	// For example, let's say we have the following LLMRequestCosts configuration:
	// ```yaml
	//	llmRequestCosts:
	//	- metadataKey: llm_input_token
	//	  type: InputToken
	//	- metadataKey: llm_output_token
	//	  type: OutputToken
	//	- metadataKey: llm_total_token
	//	  type: TotalToken
	// ```
	// Then, with the following BackendTrafficPolicy of Envoy Gateway, you can have three
	// rate limit buckets for each unique x-user-id header value. One bucket is for the input token,
	// the other is for the output token, and the last one is for the total token.
	// Each bucket will be reduced by the corresponding token usage captured by the AI Gateway filter.
	//
	// ```yaml
	//	apiVersion: gateway.envoyproxy.io/v1alpha1
	//	kind: BackendTrafficPolicy
	//	metadata:
	//	  name: some-example-token-rate-limit
	//	  namespace: default
	//	spec:
	//	  targetRefs:
	//	  - group: gateway.networking.k8s.io
	//	    kind: HTTPRoute
	//	    name: usage-rate-limit
	//	  rateLimit:
	//	    type: Global
	//	    global:
	//	      rules:
	//	        - clientSelectors:
	//	            # Do the rate limiting based on the x-user-id header.
	//	            - headers:
	//	                - name: x-user-id
	//	                  type: Distinct
	//	          limit:
	//	            # Configures the number of "tokens" allowed per hour.
	//	            requests: 10000
	//	            unit: Hour
	//	          cost:
	//	            request:
	//	              from: Number
	//	              # Setting the request cost to zero means the request path only checks the
	//	              # rate limit budget and does not consume it.
	//	              number: 0
	//	            # This specifies the cost of the response retrieved from the dynamic metadata set by the AI Gateway filter.
	//	            # The extracted value will be used to consume the rate limit budget, and subsequent requests will be rate limited
	//	            # if the budget is exhausted.
	//	            response:
	//	              from: Metadata
	//	              metadata:
	//	                namespace: io.envoy.ai_gateway
	//	                key: llm_input_token
	//	        - clientSelectors:
	//	            - headers:
	//	                - name: x-user-id
	//	                  type: Distinct
	//	          limit:
	//	            requests: 10000
	//	            unit: Hour
	//	          cost:
	//	            request:
	//	              from: Number
	//	              number: 0
	//	            response:
	//	              from: Metadata
	//	              metadata:
	//	                namespace: io.envoy.ai_gateway
	//	                key: llm_output_token
	//	        - clientSelectors:
	//	            - headers:
	//	                - name: x-user-id
	//	                  type: Distinct
	//	          limit:
	//	            requests: 10000
	//	            unit: Hour
	//	          cost:
	//	            request:
	//	              from: Number
	//	              number: 0
	//	            response:
	//	              from: Metadata
	//	              metadata:
	//	                namespace: io.envoy.ai_gateway
	//	                key: llm_total_token
	// ```
	//
	// Note that when multiple AIGatewayRoute resources are attached to the same Gateway, and
	// different costs are configured for the same metadata key, the ai-gateway will pick one of them
	// to configure the metadata key in the generated HTTPRoute, and ignore the rest.
	//
	// +optional
	// +kubebuilder:validation:MaxItems=36
	LLMRequestCosts []LLMRequestCost `json:"llmRequestCosts,omitempty"`
}

AIGatewayRouteSpec details the AIGatewayRoute configuration.

+kubebuilder:validation:XValidation:rule="!has(self.parentRefs) || !has(self.targetRefs) || size(self.targetRefs) == 0", message="targetRefs is deprecated, use parentRefs only"
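
To make the LLMRequestCosts description concrete, the sketch below shows how the three costs from the YAML example could be laid out under the `io.envoy.ai_gateway` metadata namespace. The `requestCost` and `tokenUsage` types and the `buildMetadata` function are hypothetical; treating the total as input plus output is an assumption made for illustration, and only the three cost types shown in the YAML are handled. The real filter stores these values in Envoy's dynamic metadata.

```go
package main

import "fmt"

// metadataNamespace has the same value as AIGatewayFilterMetadataNamespace.
const metadataNamespace = "io.envoy.ai_gateway"

// requestCost is a hypothetical stand-in for LLMRequestCost, keeping only the
// two fields used in the YAML example.
type requestCost struct {
	MetadataKey string
	Type        string // "InputToken", "OutputToken", or "TotalToken"
}

// tokenUsage is a hypothetical record of the token usage of one request.
type tokenUsage struct {
	Input  uint32
	Output uint32
}

// buildMetadata maps each configured cost to its captured value and returns
// the result keyed by namespace, then by metadata key.
func buildMetadata(costs []requestCost, u tokenUsage) map[string]map[string]uint32 {
	values := make(map[string]uint32, len(costs))
	for _, c := range costs {
		switch c.Type {
		case "InputToken":
			values[c.MetadataKey] = u.Input
		case "OutputToken":
			values[c.MetadataKey] = u.Output
		case "TotalToken":
			// Assumption for illustration: total = input + output.
			values[c.MetadataKey] = u.Input + u.Output
		}
	}
	return map[string]map[string]uint32{metadataNamespace: values}
}

func main() {
	costs := []requestCost{
		{MetadataKey: "llm_input_token", Type: "InputToken"},
		{MetadataKey: "llm_output_token", Type: "OutputToken"},
		{MetadataKey: "llm_total_token", Type: "TotalToken"},
	}
	md := buildMetadata(costs, tokenUsage{Input: 10, Output: 5})
	fmt.Println(md[metadataNamespace]["llm_input_token"])  // 10
	fmt.Println(md[metadataNamespace]["llm_output_token"]) // 5
	fmt.Println(md[metadataNamespace]["llm_total_token"])  // 15
}
```

Each rate limit rule in the BackendTrafficPolicy example then reads one of these keys through `cost.response.metadata`.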

func (*AIGatewayRouteSpec) DeepCopy ¶

func (in *AIGatewayRouteSpec) DeepCopy() *AIGatewayRouteSpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AIGatewayRouteSpec.

func (*AIGatewayRouteSpec) DeepCopyInto ¶

func (in *AIGatewayRouteSpec) DeepCopyInto(out *AIGatewayRouteSpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type AIGatewayRouteStatus ¶ added in v0.2.0

type AIGatewayRouteStatus struct {
	// Conditions is the list of conditions by the reconciliation result.
	// Currently, at most one condition is set.
	//
	// Known .status.conditions.type are: "Accepted", "NotAccepted".
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}

AIGatewayRouteStatus contains the conditions by the reconciliation result.
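
Because at most one condition is currently set, a client can decide whether a route was accepted by reading the last entry of the list. The sketch below assumes a hypothetical `condition` stand-in for metav1.Condition, and the helper names are illustrative rather than part of this package.

```go
package main

// condition is a hypothetical stand-in for metav1.Condition that keeps only
// the two fields this sketch reads.
type condition struct {
	Type    string
	Message string
}

// latestConditionType returns the type of the last condition, or "" when the
// controller has not reported any condition yet.
func latestConditionType(conds []condition) string {
	if len(conds) == 0 {
		return ""
	}
	return conds[len(conds)-1].Type
}

// isAccepted reports whether the latest condition type is "Accepted".
func isAccepted(conds []condition) bool {
	return latestConditionType(conds) == "Accepted"
}
```

An empty list is treated as "not yet reconciled" here, which is a choice of this sketch and not something the API guarantees.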

func (*AIGatewayRouteStatus) DeepCopy ¶ added in v0.2.0

func (in *AIGatewayRouteStatus) DeepCopy() *AIGatewayRouteStatus

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AIGatewayRouteStatus.

func (*AIGatewayRouteStatus) DeepCopyInto ¶ added in v0.2.0

func (in *AIGatewayRouteStatus) DeepCopyInto(out *AIGatewayRouteStatus)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type AIServiceBackend ¶

type AIServiceBackend struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	// Spec defines the details of AIServiceBackend.
	Spec AIServiceBackendSpec `json:"spec,omitempty"`
	// Status defines the status details of the AIServiceBackend.
	Status AIServiceBackendStatus `json:"status,omitempty"`
}

AIServiceBackend is a resource that represents a single backend for AIGatewayRoute. A backend is a service that handles traffic with a concrete API specification.

An AIServiceBackend is "attached" to a Backend, which is either a k8s Service or a Backend resource of the Envoy Gateway.

When a backend with an attached AIServiceBackend is used as a routing target in the AIGatewayRoute (more precisely, the HTTPRouteSpec defined in the AIGatewayRoute), the ai-gateway will generate the necessary configuration to perform the backend-specific logic in the final HTTPRoute.

+kubebuilder:object:root=true +kubebuilder:subresource:status +kubebuilder:printcolumn:name="Status",type=string,JSONPath=`.status.conditions[-1:].type`

func (*AIServiceBackend) DeepCopy ¶

func (in *AIServiceBackend) DeepCopy() *AIServiceBackend

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AIServiceBackend.

func (*AIServiceBackend) DeepCopyInto ¶

func (in *AIServiceBackend) DeepCopyInto(out *AIServiceBackend)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*AIServiceBackend) DeepCopyObject ¶

func (in *AIServiceBackend) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type AIServiceBackendList ¶

type AIServiceBackendList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []AIServiceBackend `json:"items"`
}

AIServiceBackendList contains a list of AIServiceBackends.

+kubebuilder:object:root=true

func (*AIServiceBackendList) DeepCopy ¶

func (in *AIServiceBackendList) DeepCopy() *AIServiceBackendList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AIServiceBackendList.

func (*AIServiceBackendList) DeepCopyInto ¶

func (in *AIServiceBackendList) DeepCopyInto(out *AIServiceBackendList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*AIServiceBackendList) DeepCopyObject ¶

func (in *AIServiceBackendList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type AIServiceBackendSpec ¶

type AIServiceBackendSpec struct {
	// APISchema specifies the API schema of the output format of requests from
	// Envoy that this AIServiceBackend can accept as incoming requests.
	// Based on this schema, the ai-gateway will perform the necessary transformation for
	// the pair of AIGatewayRouteSpec.APISchema and AIServiceBackendSpec.APISchema.
	//
	// This is required to be set.
	//
	// +kubebuilder:validation:Required
	APISchema VersionedAPISchema `json:"schema"`
	// BackendRef is the reference to the Backend resource that this AIServiceBackend corresponds to.
	//
	// A backend must be a Backend resource of Envoy Gateway. Note that k8s Service will be supported
	// as a backend in the future.
	//
	// This is required to be set.
	//
	// +kubebuilder:validation:Required
	BackendRef gwapiv1.BackendObjectReference `json:"backendRef"`

	// BackendSecurityPolicyRef is the name of the BackendSecurityPolicy resources this backend
	// is being attached to.
	//
	// Deprecated: Use BackendSecurityPolicy.spec.targetRefs instead. This field will be dropped after Envoy AI Gateway v0.3 release.
	// When this field is set, the BackendSecurityPolicy.spec.targetRefs will be ignored. To migrate to the new field,
	// set the targetRefs in the BackendSecurityPolicy to point to this AIServiceBackend first, apply the change,
	// and then remove this field from the AIServiceBackend.
	//
	// +optional
	BackendSecurityPolicyRef *gwapiv1.LocalObjectReference `json:"backendSecurityPolicyRef,omitempty"`
}

AIServiceBackendSpec details the AIServiceBackend configuration.
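
The deprecation note on BackendSecurityPolicyRef describes a precedence: while the deprecated field is set, targetRefs on any BackendSecurityPolicy are ignored for that backend. The following is a rough sketch of that precedence only. The `backend` and `policy` types and `effectivePolicy` are hypothetical and do not reflect how the controller is actually implemented.

```go
package main

// backend and policy are hypothetical stand-ins that keep only the names
// needed to show the precedence; they are not the real API types.
type backend struct {
	Name                     string
	BackendSecurityPolicyRef string // deprecated; "" means unset
}

type policy struct {
	Name       string
	TargetRefs []string // names of AIServiceBackend resources
}

// effectivePolicy returns the name of the policy that applies to b, or ""
// when none does. The deprecated reference wins whenever it is set.
func effectivePolicy(b backend, policies []policy) string {
	if b.BackendSecurityPolicyRef != "" {
		return b.BackendSecurityPolicyRef
	}
	for _, p := range policies {
		for _, target := range p.TargetRefs {
			if target == b.Name {
				return p.Name
			}
		}
	}
	return ""
}
```

This is why the migration order matters: adding targetRefs first has no effect until the deprecated field is removed, so the backend stays covered by the same policy throughout. The sketch returns the first matching policy and does not model the error raised when several policies target one backend.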

func (*AIServiceBackendSpec) DeepCopy ¶

func (in *AIServiceBackendSpec) DeepCopy() *AIServiceBackendSpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AIServiceBackendSpec.

func (*AIServiceBackendSpec) DeepCopyInto ¶

func (in *AIServiceBackendSpec) DeepCopyInto(out *AIServiceBackendSpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
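
For readers unfamiliar with the DeepCopy and DeepCopyInto pair, the sketch below shows the usual shape of such functions on a hypothetical `backendSpec` type with one pointer field. It is an illustration of the pattern, assuming the common output of deep-copy generators, and is not the generated code of this package.

```go
package main

// localRef and backendSpec are hypothetical stand-ins, not the generated types.
type localRef struct {
	Name string
}

type backendSpec struct {
	Schema                   string
	BackendSecurityPolicyRef *localRef
}

// DeepCopyInto copies the receiver into out, allocating a fresh localRef so
// that in and out never share the pointer target. in must be non-nil.
func (in *backendSpec) DeepCopyInto(out *backendSpec) {
	*out = *in
	if in.BackendSecurityPolicyRef != nil {
		out.BackendSecurityPolicyRef = new(localRef)
		*out.BackendSecurityPolicyRef = *in.BackendSecurityPolicyRef
	}
}

// DeepCopy returns a new backendSpec, or nil when the receiver is nil.
func (in *backendSpec) DeepCopy() *backendSpec {
	if in == nil {
		return nil
	}
	out := new(backendSpec)
	in.DeepCopyInto(out)
	return out
}
```

The point of the pattern is that mutating the copy, including anything reachable through its pointers, leaves the original untouched.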

type AIServiceBackendStatus ¶ added in v0.2.0

type AIServiceBackendStatus struct {
	// Conditions is the list of conditions by the reconciliation result.
	// Currently, at most one condition is set.
	//
	// Known .status.conditions.type are: "Accepted", "NotAccepted".
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}

AIServiceBackendStatus contains the conditions by the reconciliation result.

func (*AIServiceBackendStatus) DeepCopy ¶ added in v0.2.0

func (in *AIServiceBackendStatus) DeepCopy() *AIServiceBackendStatus

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AIServiceBackendStatus.

func (*AIServiceBackendStatus) DeepCopyInto ¶ added in v0.2.0

func (in *AIServiceBackendStatus) DeepCopyInto(out *AIServiceBackendStatus)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type APISchema ¶

type APISchema string

APISchema defines the API schema.

const (
	// APISchemaOpenAI is the OpenAI schema.
	//
	// https://github.com/openai/openai-openapi
	APISchemaOpenAI APISchema = "OpenAI"
	// APISchemaAWSBedrock is the AWS Bedrock schema.
	//
	// https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Operations_Amazon_Bedrock_Runtime.html
	APISchemaAWSBedrock APISchema = "AWSBedrock"
	// APISchemaAzureOpenAI is the Azure OpenAI schema.
	//
	// https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#api-specs
	APISchemaAzureOpenAI APISchema = "AzureOpenAI"
	// APISchemaGCPVertexAI is the schema followed by Gemini models hosted on GCP's Vertex AI platform.
	// Note: Using this schema requires a BackendSecurityPolicy to be configured and attached,
	// as the transformation will use the gcp-region and project-name from the BackendSecurityPolicy.
	//
	// https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.endpoints/generateContent?hl=en
	APISchemaGCPVertexAI APISchema = "GCPVertexAI"
	// APISchemaGCPAnthropic is the schema for Anthropic models hosted on GCP's Vertex AI platform.
	// Returns native Anthropic format responses for seamless integration.
	//
	// https://docs.anthropic.com/en/api/claude-on-vertex-ai
	APISchemaGCPAnthropic APISchema = "GCPAnthropic"
)
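
The note on APISchemaGCPVertexAI implies a precondition that a client could check before applying a manifest. The sketch below uses a local `apiSchema` type and a hypothetical `checkSchema` helper. Only the GCPVertexAI requirement is taken from the comments above, and nothing is assumed about the other schemas.

```go
package main

import "fmt"

// apiSchema is a local stand-in for the APISchema string type.
type apiSchema string

const (
	schemaOpenAI       apiSchema = "OpenAI"
	schemaAWSBedrock   apiSchema = "AWSBedrock"
	schemaAzureOpenAI  apiSchema = "AzureOpenAI"
	schemaGCPVertexAI  apiSchema = "GCPVertexAI"
	schemaGCPAnthropic apiSchema = "GCPAnthropic"
)

// knownSchemas lists the APISchema values documented above.
var knownSchemas = map[apiSchema]bool{
	schemaOpenAI:       true,
	schemaAWSBedrock:   true,
	schemaAzureOpenAI:  true,
	schemaGCPVertexAI:  true,
	schemaGCPAnthropic: true,
}

// checkSchema rejects unknown schema names and enforces the documented
// requirement that GCPVertexAI has a BackendSecurityPolicy attached.
func checkSchema(schema apiSchema, hasSecurityPolicy bool) error {
	if !knownSchemas[schema] {
		return fmt.Errorf("unknown API schema %q", schema)
	}
	if schema == schemaGCPVertexAI && !hasSecurityPolicy {
		return fmt.Errorf("schema %q needs an attached BackendSecurityPolicy", schema)
	}
	return nil
}
```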

type AWSCredentialsFile ¶

type AWSCredentialsFile struct {
	// SecretRef is the reference to the credential file.
	//
	// The secret should contain the AWS credentials file keyed on "credentials".
	SecretRef *gwapiv1.SecretObjectReference `json:"secretRef"`

	// Profile is the profile to use in the credentials file.
	//
	// +kubebuilder:default=default
	Profile string `json:"profile,omitempty"`
}

AWSCredentialsFile specifies the credentials file to use for the AWS provider. Envoy reads the secret file, and the profile to use is specified by the Profile field.
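
Two conventions are stated above: the secret holds the credentials file under the "credentials" key, and the profile defaults to "default". A minimal sketch of both lookups follows. The function names are hypothetical, and the map stands in for the data of a Kubernetes Secret.

```go
package main

import "errors"

const (
	credentialsKey = "credentials" // key documented for the referenced secret
	defaultProfile = "default"     // value of the kubebuilder default marker
)

// credentialsFile returns the credentials file stored in the secret data.
func credentialsFile(secretData map[string][]byte) ([]byte, error) {
	content, ok := secretData[credentialsKey]
	if !ok {
		return nil, errors.New(`secret has no "credentials" key`)
	}
	return content, nil
}

// effectiveProfile applies the documented default when no profile is given.
func effectiveProfile(profile string) string {
	if profile == "" {
		return defaultProfile
	}
	return profile
}
```

On a live cluster the default is filled in by the API server from the marker. The helper is only useful for objects built in code before they are submitted.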

func (*AWSCredentialsFile) DeepCopy ¶

func (in *AWSCredentialsFile) DeepCopy() *AWSCredentialsFile

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AWSCredentialsFile.

func (*AWSCredentialsFile) DeepCopyInto ¶

func (in *AWSCredentialsFile) DeepCopyInto(out *AWSCredentialsFile)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type AWSOIDCExchangeToken ¶

type AWSOIDCExchangeToken struct {
	// BackendSecurityPolicyOIDC is the generic OIDC fields.
	BackendSecurityPolicyOIDC `json:",inline"`

	// AwsRoleArn is the AWS IAM Role with the permission to use specific resources in AWS account
	// which maps to the temporary AWS security credentials exchanged using the authentication token issued by OIDC provider.
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	AwsRoleArn string `json:"awsRoleArn"`
}

AWSOIDCExchangeToken specifies credentials to obtain an OIDC token from an SSO server. For AWS, the controller queries STS to obtain an AWS AccessKeyId, SecretAccessKey, and SessionToken, and stores them in a temporary credentials file.

func (*AWSOIDCExchangeToken) DeepCopy ¶

func (in *AWSOIDCExchangeToken) DeepCopy() *AWSOIDCExchangeToken

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AWSOIDCExchangeToken.

func (*AWSOIDCExchangeToken) DeepCopyInto ¶

func (in *AWSOIDCExchangeToken) DeepCopyInto(out *AWSOIDCExchangeToken)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type AzureOIDCExchangeToken ¶ added in v0.2.0

type AzureOIDCExchangeToken struct {
	// BackendSecurityPolicyOIDC is the generic OIDC fields.
	BackendSecurityPolicyOIDC `json:",inline"`
}

AzureOIDCExchangeToken specifies credentials to obtain an OIDC token from an SSO server. For Azure, the controller queries Azure Entra ID to get an Azure access token, and stores it in a secret.

func (*AzureOIDCExchangeToken) DeepCopy ¶ added in v0.2.0

func (in *AzureOIDCExchangeToken) DeepCopy() *AzureOIDCExchangeToken

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new AzureOIDCExchangeToken.

func (*AzureOIDCExchangeToken) DeepCopyInto ¶ added in v0.2.0

func (in *AzureOIDCExchangeToken) DeepCopyInto(out *AzureOIDCExchangeToken)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type BackendSecurityPolicy ¶

type BackendSecurityPolicy struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              BackendSecurityPolicySpec `json:"spec,omitempty"`
	// Status defines the status details of the BackendSecurityPolicy.
	Status BackendSecurityPolicyStatus `json:"status,omitempty"`
}

BackendSecurityPolicy specifies configuration for authentication and authorization rules on the traffic exiting the gateway to the backend.

+kubebuilder:object:root=true +kubebuilder:subresource:status +kubebuilder:printcolumn:name="Status",type=string,JSONPath=`.status.conditions[-1:].type` +kubebuilder:metadata:labels="gateway.networking.k8s.io/policy=direct"

func (*BackendSecurityPolicy) DeepCopy ¶

func (in *BackendSecurityPolicy) DeepCopy() *BackendSecurityPolicy

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendSecurityPolicy.

func (*BackendSecurityPolicy) DeepCopyInto ¶

func (in *BackendSecurityPolicy) DeepCopyInto(out *BackendSecurityPolicy)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*BackendSecurityPolicy) DeepCopyObject ¶

func (in *BackendSecurityPolicy) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type BackendSecurityPolicyAPIKey ¶

type BackendSecurityPolicyAPIKey struct {
	// SecretRef is the reference to the secret containing the API key.
	// ai-gateway must be given the permission to read this secret.
	// The key of the secret should be "apiKey".
	SecretRef *gwapiv1.SecretObjectReference `json:"secretRef"`
}

BackendSecurityPolicyAPIKey specifies the API key.

func (*BackendSecurityPolicyAPIKey) DeepCopy ¶

func (in *BackendSecurityPolicyAPIKey) DeepCopy() *BackendSecurityPolicyAPIKey

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendSecurityPolicyAPIKey.

func (*BackendSecurityPolicyAPIKey) DeepCopyInto ¶

func (in *BackendSecurityPolicyAPIKey) DeepCopyInto(out *BackendSecurityPolicyAPIKey)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type BackendSecurityPolicyAWSCredentials ¶

type BackendSecurityPolicyAWSCredentials struct {
	// Region specifies the AWS region associated with the policy.
	//
	// +kubebuilder:validation:MinLength=1
	Region string `json:"region"`

	// CredentialsFile specifies the credentials file to use for the AWS provider.
	//
	// +optional
	CredentialsFile *AWSCredentialsFile `json:"credentialsFile,omitempty"`

	// OIDCExchangeToken specifies the oidc configurations used to obtain an oidc token. The oidc token will be
	// used to obtain temporary credentials to access AWS.
	//
	// +optional
	OIDCExchangeToken *AWSOIDCExchangeToken `json:"oidcExchangeToken,omitempty"`
}

BackendSecurityPolicyAWSCredentials contains the supported authentication mechanisms to access aws.

func (*BackendSecurityPolicyAWSCredentials) DeepCopy ¶

func (in *BackendSecurityPolicyAWSCredentials) DeepCopy() *BackendSecurityPolicyAWSCredentials

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendSecurityPolicyAWSCredentials.

func (*BackendSecurityPolicyAWSCredentials) DeepCopyInto ¶

func (in *BackendSecurityPolicyAWSCredentials) DeepCopyInto(out *BackendSecurityPolicyAWSCredentials)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type BackendSecurityPolicyAzureCredentials ¶ added in v0.2.0

type BackendSecurityPolicyAzureCredentials struct {
	// ClientID is a unique identifier for an application in Azure.
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	ClientID string `json:"clientID"`

	// TenantID is a unique identifier for an Azure Active Directory instance.
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	TenantID string `json:"tenantID"`

	// ClientSecretRef is the reference to the secret containing the Azure client secret.
	// ai-gateway must be given the permission to read this secret.
	// The key of secret should be "client-secret".
	//
	// +optional
	ClientSecretRef *gwapiv1.SecretObjectReference `json:"clientSecretRef,omitempty"`

	// OIDCExchangeToken specifies the oidc configurations used to obtain an oidc token. The oidc token will be
	// used to obtain temporary credentials to access Azure.
	//
	// +optional
	OIDCExchangeToken *AzureOIDCExchangeToken `json:"oidcExchangeToken,omitempty"`
}

BackendSecurityPolicyAzureCredentials contains the supported authentication mechanisms to access Azure. Exactly one of ClientSecretRef or OIDCExchangeToken must be specified. Credentials will not be generated if neither is set.

+kubebuilder:validation:XValidation:rule="(has(self.clientSecretRef) && !has(self.oidcExchangeToken)) || (!has(self.clientSecretRef) && has(self.oidcExchangeToken))",message="Exactly one of clientSecretRef or oidcExchangeToken must be specified"
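
The rule above is an "exactly one of" constraint on two optional fields. The sketch below expresses it in Go with a hypothetical `azureCredentials` stand-in whose pointer fields model presence. The real enforcement is the CEL expression in the marker.

```go
package main

import "errors"

// azureCredentials is a hypothetical stand-in; a nil pointer means the field
// is absent from the manifest.
type azureCredentials struct {
	ClientSecretRef   *string
	OIDCExchangeToken *string
}

// validateAzure passes only when exactly one of the two mechanisms is set.
func validateAzure(c azureCredentials) error {
	hasSecret := c.ClientSecretRef != nil
	hasOIDC := c.OIDCExchangeToken != nil
	if hasSecret == hasOIDC {
		return errors.New("exactly one of clientSecretRef or oidcExchangeToken must be specified")
	}
	return nil
}
```

Comparing the two booleans for equality covers both failure cases at once: neither set and both set.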

func (*BackendSecurityPolicyAzureCredentials) DeepCopy ¶ added in v0.2.0

func (in *BackendSecurityPolicyAzureCredentials) DeepCopy() *BackendSecurityPolicyAzureCredentials

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendSecurityPolicyAzureCredentials.

func (*BackendSecurityPolicyAzureCredentials) DeepCopyInto ¶ added in v0.2.0

func (in *BackendSecurityPolicyAzureCredentials) DeepCopyInto(out *BackendSecurityPolicyAzureCredentials)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type BackendSecurityPolicyGCPCredentials ¶ added in v0.3.0

type BackendSecurityPolicyGCPCredentials struct {
	// ProjectName is the GCP project name.
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	ProjectName string `json:"projectName"`
	// Region is the GCP region associated with the policy.
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	Region string `json:"region"`

	// CredentialsFile specifies the service account credentials file to use for the GCP provider.
	//
	// +optional
	CredentialsFile *GCPCredentialsFile `json:"credentialsFile,omitempty"`

	// WorkloadIdentityFederationConfig is the configuration for the GCP Workload Identity Federation.
	//
	// +optional
	WorkloadIdentityFederationConfig *GCPWorkloadIdentityFederationConfig `json:"workloadIdentityFederationConfig,omitempty"`
}

BackendSecurityPolicyGCPCredentials contains the supported authentication mechanisms to access GCP.

+kubebuilder:validation:XValidation:rule="(has(self.credentialsFile) && !has(self.workloadIdentityFederationConfig)) || (has(self.workloadIdentityFederationConfig) && !has(self.credentialsFile))",message="Exactly one of GCPWorkloadIdentityFederationConfig or GCPCredentialsFile must be specified"

func (*BackendSecurityPolicyGCPCredentials) DeepCopy ¶ added in v0.3.0

func (in *BackendSecurityPolicyGCPCredentials) DeepCopy() *BackendSecurityPolicyGCPCredentials

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendSecurityPolicyGCPCredentials.

func (*BackendSecurityPolicyGCPCredentials) DeepCopyInto ¶ added in v0.3.0

func (in *BackendSecurityPolicyGCPCredentials) DeepCopyInto(out *BackendSecurityPolicyGCPCredentials)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type BackendSecurityPolicyList ¶

type BackendSecurityPolicyList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []BackendSecurityPolicy `json:"items"`
}

BackendSecurityPolicyList contains a list of BackendSecurityPolicy.

+kubebuilder:object:root=true

func (*BackendSecurityPolicyList) DeepCopy ¶

func (in *BackendSecurityPolicyList) DeepCopy() *BackendSecurityPolicyList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendSecurityPolicyList.

func (*BackendSecurityPolicyList) DeepCopyInto ¶

func (in *BackendSecurityPolicyList) DeepCopyInto(out *BackendSecurityPolicyList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*BackendSecurityPolicyList) DeepCopyObject ¶

func (in *BackendSecurityPolicyList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type BackendSecurityPolicyOIDC ¶ added in v0.2.0

type BackendSecurityPolicyOIDC struct {
	// OIDC is used to obtain oidc tokens via an SSO server which will be used to exchange for provider credentials.
	//
	// +kubebuilder:validation:Required
	OIDC egv1a1.OIDC `json:"oidc"`

	// GrantType is the method by which the application obtains an access token.
	//
	// +optional
	GrantType string `json:"grantType,omitempty"`

	// Aud defines the audience that this ID Token is intended for.
	//
	// +optional
	Aud string `json:"aud,omitempty"`
}

BackendSecurityPolicyOIDC specifies OIDC related fields.

func (*BackendSecurityPolicyOIDC) DeepCopy ¶ added in v0.2.0

func (in *BackendSecurityPolicyOIDC) DeepCopy() *BackendSecurityPolicyOIDC

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendSecurityPolicyOIDC.

func (*BackendSecurityPolicyOIDC) DeepCopyInto ¶ added in v0.2.0

func (in *BackendSecurityPolicyOIDC) DeepCopyInto(out *BackendSecurityPolicyOIDC)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type BackendSecurityPolicySpec ¶

type BackendSecurityPolicySpec struct {
	// TargetRefs are the names of the AIServiceBackend resources this BackendSecurityPolicy is being attached to.
	// Attaching multiple BackendSecurityPolicies to the same AIServiceBackend is invalid and will result in an error
	// during the reconciliation of AIServiceBackend.
	//
	// +optional
	// +kubebuilder:validation:MaxItems=16
	// +kubebuilder:validation:XValidation:rule="self.all(ref, ref.group == 'aigateway.envoyproxy.io' && ref.kind == 'AIServiceBackend')", message="targetRefs must reference AIServiceBackend resources"
	TargetRefs []gwapiv1a2.LocalPolicyTargetReference `json:"targetRefs,omitempty"`

	// Type specifies the type of the backend security policy.
	//
	// +kubebuilder:validation:Enum=APIKey;AWSCredentials;AzureCredentials;GCPCredentials
	Type BackendSecurityPolicyType `json:"type"`

	// APIKey is a mechanism to access a backend(s). The API key will be injected into the Authorization header.
	//
	// +optional
	APIKey *BackendSecurityPolicyAPIKey `json:"apiKey,omitempty"`

	// AWSCredentials is a mechanism to access a backend(s). AWS specific logic will be applied.
	//
	// +optional
	AWSCredentials *BackendSecurityPolicyAWSCredentials `json:"awsCredentials,omitempty"`

	// AzureCredentials is a mechanism to access a backend(s). Azure OpenAI specific logic will be applied.
	//
	// +optional
	AzureCredentials *BackendSecurityPolicyAzureCredentials `json:"azureCredentials,omitempty"`
	// GCPCredentials is a mechanism to access a backend(s). GCP specific logic will be applied.
	//
	// +optional
	GCPCredentials *BackendSecurityPolicyGCPCredentials `json:"gcpCredentials,omitempty"`
}

BackendSecurityPolicySpec specifies the authentication rules for accessing the provider from the Gateway. Only one mechanism to access a backend can be specified, so only one type of BackendSecurityPolicy can be defined.

+kubebuilder:validation:MaxProperties=3 +kubebuilder:validation:XValidation:rule="self.type == 'APIKey' ? (has(self.apiKey) && !has(self.awsCredentials) && !has(self.azureCredentials) && !has(self.gcpCredentials)) : true",message="When type is APIKey, only apiKey field should be set" +kubebuilder:validation:XValidation:rule="self.type == 'AWSCredentials' ? (has(self.awsCredentials) && !has(self.apiKey) && !has(self.azureCredentials) && !has(self.gcpCredentials)) : true",message="When type is AWSCredentials, only awsCredentials field should be set" +kubebuilder:validation:XValidation:rule="self.type == 'AzureCredentials' ? (has(self.azureCredentials) && !has(self.apiKey) && !has(self.awsCredentials) && !has(self.gcpCredentials)) : true",message="When type is AzureCredentials, only azureCredentials field should be set" +kubebuilder:validation:XValidation:rule="self.type == 'GCPCredentials' ? (has(self.gcpCredentials) && !has(self.apiKey) && !has(self.awsCredentials) && !has(self.azureCredentials)) : true",message="When type is GCPCredentials, only gcpCredentials field should be set"
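
The validation rules above all follow one pattern: the credential field matching `type` must be set, and the other three must be absent. A compact Go sketch of that pattern is shown below. The `policySpec` type is a hypothetical stand-in with empty-struct pointers modelling presence, and the real checks remain the CEL rules in the markers.

```go
package main

import "fmt"

// policySpec is a hypothetical stand-in for BackendSecurityPolicySpec; a nil
// pointer means the credential field is absent.
type policySpec struct {
	Type             string
	APIKey           *struct{}
	AWSCredentials   *struct{}
	AzureCredentials *struct{}
	GCPCredentials   *struct{}
}

// validatePolicySpec checks that Type is one of the four documented values
// and that exactly the matching credential field is present.
func validatePolicySpec(s policySpec) error {
	present := map[string]bool{
		"APIKey":           s.APIKey != nil,
		"AWSCredentials":   s.AWSCredentials != nil,
		"AzureCredentials": s.AzureCredentials != nil,
		"GCPCredentials":   s.GCPCredentials != nil,
	}
	if _, known := present[s.Type]; !known {
		return fmt.Errorf("unsupported type %q", s.Type)
	}
	for name, isSet := range present {
		// The matching field must be set; every other field must be unset.
		if isSet != (name == s.Type) {
			return fmt.Errorf("when type is %s, only the matching field should be set", s.Type)
		}
	}
	return nil
}
```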

func (*BackendSecurityPolicySpec) DeepCopy ¶

func (in *BackendSecurityPolicySpec) DeepCopy() *BackendSecurityPolicySpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendSecurityPolicySpec.

func (*BackendSecurityPolicySpec) DeepCopyInto ¶

func (in *BackendSecurityPolicySpec) DeepCopyInto(out *BackendSecurityPolicySpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type BackendSecurityPolicyStatus ¶ added in v0.2.0

type BackendSecurityPolicyStatus struct {
	// Conditions is the list of conditions by the reconciliation result.
	// Currently, at most one condition is set.
	//
	// Known .status.conditions.type are: "Accepted", "NotAccepted".
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}

BackendSecurityPolicyStatus contains the conditions by the reconciliation result.

func (*BackendSecurityPolicyStatus) DeepCopy ¶ added in v0.2.0

func (in *BackendSecurityPolicyStatus) DeepCopy() *BackendSecurityPolicyStatus

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendSecurityPolicyStatus.

func (*BackendSecurityPolicyStatus) DeepCopyInto ¶ added in v0.2.0

func (in *BackendSecurityPolicyStatus) DeepCopyInto(out *BackendSecurityPolicyStatus)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type BackendSecurityPolicyType ¶

type BackendSecurityPolicyType string

BackendSecurityPolicyType specifies the type of auth mechanism used to access a backend.

const (
	BackendSecurityPolicyTypeAPIKey           BackendSecurityPolicyType = "APIKey"
	BackendSecurityPolicyTypeAWSCredentials   BackendSecurityPolicyType = "AWSCredentials"
	BackendSecurityPolicyTypeAzureCredentials BackendSecurityPolicyType = "AzureCredentials"
	BackendSecurityPolicyTypeGCPCredentials   BackendSecurityPolicyType = "GCPCredentials"
)

type GCPCredentialsFile ¶ added in v0.3.0

type GCPCredentialsFile struct {
	// SecretRef is the reference to the credential file.
	//
	// The secret should contain the GCP service account credentials file keyed on "service_account.json".
	SecretRef *gwapiv1.SecretObjectReference `json:"secretRef"`
}

GCPCredentialsFile specifies the service account key json file to authenticate with GCP provider.

func (*GCPCredentialsFile) DeepCopy ¶ added in v0.3.0

func (in *GCPCredentialsFile) DeepCopy() *GCPCredentialsFile

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new GCPCredentialsFile.

func (*GCPCredentialsFile) DeepCopyInto ¶ added in v0.3.0

func (in *GCPCredentialsFile) DeepCopyInto(out *GCPCredentialsFile)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type GCPOIDCExchangeToken ¶ added in v0.3.0

type GCPOIDCExchangeToken struct {
	// BackendSecurityPolicyOIDC is the generic OIDC fields.
	BackendSecurityPolicyOIDC `json:",inline"`
}

func (*GCPOIDCExchangeToken) DeepCopy ¶ added in v0.3.0

func (in *GCPOIDCExchangeToken) DeepCopy() *GCPOIDCExchangeToken

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new GCPOIDCExchangeToken.

func (*GCPOIDCExchangeToken) DeepCopyInto ¶ added in v0.3.0

func (in *GCPOIDCExchangeToken) DeepCopyInto(out *GCPOIDCExchangeToken)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type GCPServiceAccountImpersonationConfig ¶ added in v0.3.0

type GCPServiceAccountImpersonationConfig struct {
	// ServiceAccountName is the name of the service account to impersonate.
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	ServiceAccountName string `json:"serviceAccountName"`
}

func (*GCPServiceAccountImpersonationConfig) DeepCopy ¶ added in v0.3.0

func (in *GCPServiceAccountImpersonationConfig) DeepCopy() *GCPServiceAccountImpersonationConfig

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new GCPServiceAccountImpersonationConfig.

func (*GCPServiceAccountImpersonationConfig) DeepCopyInto ¶ added in v0.3.0

func (in *GCPServiceAccountImpersonationConfig) DeepCopyInto(out *GCPServiceAccountImpersonationConfig)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type GCPWorkloadIdentityFederationConfig ¶ added in v0.3.0

type GCPWorkloadIdentityFederationConfig struct {
	// ProjectID is the GCP project ID.
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	ProjectID string `json:"projectID"`

	// WorkloadIdentityProviderName is the name of the external identity provider as registered on Google Cloud Platform.
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	WorkloadIdentityProviderName string `json:"workloadIdentityProviderName"`

	// OIDCExchangeToken specifies the oidc configurations used to obtain an oidc token. The oidc token will be
	// used to obtain temporary credentials to access GCP.
	//
	// +kubebuilder:validation:Required
	OIDCExchangeToken GCPOIDCExchangeToken `json:"oidcExchangeToken"`

	// WorkloadIdentityPoolName is the name of the workload identity pool defined in GCP.
	// https://cloud.google.com/iam/docs/workload-identity-federation?hl=en
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	WorkloadIdentityPoolName string `json:"workloadIdentityPoolName"`

	// ServiceAccountImpersonation is the service account impersonation configuration.
	// This is used to impersonate a service account when getting access token.
	//
	// +optional
	ServiceAccountImpersonation *GCPServiceAccountImpersonationConfig `json:"serviceAccountImpersonation,omitempty"`
}

func (*GCPWorkloadIdentityFederationConfig) DeepCopy ¶ added in v0.3.0

func (in *GCPWorkloadIdentityFederationConfig) DeepCopy() *GCPWorkloadIdentityFederationConfig

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new GCPWorkloadIdentityFederationConfig.

func (*GCPWorkloadIdentityFederationConfig) DeepCopyInto ¶ added in v0.3.0

func (in *GCPWorkloadIdentityFederationConfig) DeepCopyInto(out *GCPWorkloadIdentityFederationConfig)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type GCPWorkloadIdentityProvider ¶ added in v0.3.0

type GCPWorkloadIdentityProvider struct {
	// Name of the external identity provider as registered on Google Cloud Platform.
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength=1
	Name string `json:"name"`

	// OIDCProvider is the generic OIDCProvider fields.
	//
	// +kubebuilder:validation:Required
	OIDCProvider BackendSecurityPolicyOIDC `json:"OIDCProvider"`
}

GCPWorkloadIdentityProvider specifies the external identity provider to be used to authenticate against GCP. The external identity provider can be AWS, Microsoft, etc., but it must be pre-registered in the GCP project.

https://cloud.google.com/iam/docs/workload-identity-federation

func (*GCPWorkloadIdentityProvider) DeepCopy ¶ added in v0.3.0

func (in *GCPWorkloadIdentityProvider) DeepCopy() *GCPWorkloadIdentityProvider

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new GCPWorkloadIdentityProvider.

func (*GCPWorkloadIdentityProvider) DeepCopyInto ¶ added in v0.3.0

func (in *GCPWorkloadIdentityProvider) DeepCopyInto(out *GCPWorkloadIdentityProvider)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type LLMRequestCost ¶

type LLMRequestCost struct {
	// MetadataKey is the key of the metadata to store this cost of the request.
	//
	// +kubebuilder:validation:Required
	MetadataKey string `json:"metadataKey"`
	// Type specifies the type of the request cost. The default is "OutputToken",
	// and it uses "output token" as the cost. The other types are "InputToken", "TotalToken",
	// and "CEL".
	//
	// +kubebuilder:validation:Enum=OutputToken;InputToken;TotalToken;CEL
	Type LLMRequestCostType `json:"type"`
	// CEL is the CEL expression to calculate the cost of the request.
	// The CEL expression must return a signed or unsigned integer. If the
	// return value is negative, it is treated as an error.
	//
	// The expression can use the following variables:
	//
	//	* model: the model name extracted from the request content. Type: string.
	//	* backend: the backend name in the form of "name.namespace". Type: string.
	//	* input_tokens: the number of input tokens. Type: unsigned integer.
	//	* output_tokens: the number of output tokens. Type: unsigned integer.
	//	* total_tokens: the total number of tokens. Type: unsigned integer.
	//
	// For example, the following expressions are valid:
	//
	//	* "model == 'llama' ?  input_tokens + output_tokens / 2u : total_tokens"
	//	* "backend == 'foo.default' ?  input_tokens + output_tokens : total_tokens"
	//	* "input_tokens + output_tokens + total_tokens"
	//	* "input_tokens * output_tokens"
	//
	// +optional
	CEL *string `json:"cel,omitempty"`
}

LLMRequestCost configures each request cost.

func (*LLMRequestCost) DeepCopy ¶

func (in *LLMRequestCost) DeepCopy() *LLMRequestCost

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new LLMRequestCost.

func (*LLMRequestCost) DeepCopyInto ¶

func (in *LLMRequestCost) DeepCopyInto(out *LLMRequestCost)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type LLMRequestCostType ¶

type LLMRequestCostType string

LLMRequestCostType specifies the type of the LLMRequestCost.

const (
	// LLMRequestCostTypeInputToken is the cost type of the input token.
	LLMRequestCostTypeInputToken LLMRequestCostType = "InputToken"
	// LLMRequestCostTypeOutputToken is the cost type of the output token.
	LLMRequestCostTypeOutputToken LLMRequestCostType = "OutputToken"
	// LLMRequestCostTypeTotalToken is the cost type of the total token.
	LLMRequestCostTypeTotalToken LLMRequestCostType = "TotalToken"
	// LLMRequestCostTypeCEL is for calculating the cost using the CEL expression.
	LLMRequestCostTypeCEL LLMRequestCostType = "CEL"
)
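
The mapping from the three token-based cost types to a token count can be sketched as below. This is a minimal sketch, not the gateway's implementation: `tokenCost` is a hypothetical helper, the constants are redeclared locally, and the CEL case is deliberately left as an error because evaluating an expression needs a CEL runtime that this example omits.

```go
package main

import "fmt"

// Local copies of the documented constants; real code would import v1alpha1.
type LLMRequestCostType string

const (
	LLMRequestCostTypeInputToken  LLMRequestCostType = "InputToken"
	LLMRequestCostTypeOutputToken LLMRequestCostType = "OutputToken"
	LLMRequestCostTypeTotalToken  LLMRequestCostType = "TotalToken"
	LLMRequestCostTypeCEL         LLMRequestCostType = "CEL"
)

// tokenCost is a hypothetical helper that picks the token count matching the
// given cost type. It does not evaluate CEL expressions.
func tokenCost(t LLMRequestCostType, input, output, total uint64) (uint64, error) {
	switch t {
	case LLMRequestCostTypeInputToken:
		return input, nil
	case LLMRequestCostTypeOutputToken:
		return output, nil
	case LLMRequestCostTypeTotalToken:
		return total, nil
	case LLMRequestCostTypeCEL:
		return 0, fmt.Errorf("cost type %q needs a CEL evaluator, which this sketch omits", t)
	default:
		return 0, fmt.Errorf("unknown cost type %q", t)
	}
}

func main() {
	cost, err := tokenCost(LLMRequestCostTypeTotalToken, 12, 30, 42)
	if err != nil {
		panic(err)
	}
	fmt.Println(cost)
}
```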

type VersionedAPISchema ¶

type VersionedAPISchema struct {
	// Name is the name of the API schema of the AIGatewayRoute or AIServiceBackend.
	//
	// +kubebuilder:validation:Enum=OpenAI;AWSBedrock;AzureOpenAI;GCPVertexAI;GCPAnthropic
	Name APISchema `json:"name"`

	// Version is the version of the API schema.
	//
	// When the name is set to "OpenAI", this equals the prefix of the OpenAI API endpoints. This defaults to "v1"
	// if not set or set to an empty string. For example, the "chat completions" API endpoint will be "/v1/chat/completions"
	// if the version is set to "v1".
	//
	// This is especially useful when routing to a backend that has an OpenAI-compatible API but a different
	// versioning scheme. For example, the Gemini OpenAI-compatible API (https://ai.google.dev/gemini-api/docs/openai) uses
	// the "/v1beta/openai" version prefix. Another example is Cohere AI (https://docs.cohere.com/v2/docs/compatibility-api), which
	// uses the "/compatibility/v1" version prefix. On the other hand, DeepSeek (https://api-docs.deepseek.com/) doesn't
	// use a version prefix, so the version can be set to an empty string.
	//
	// When the name is set to AzureOpenAI, this version maps to "API Version" in the
	// Azure OpenAI API documentation (https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#rest-api-versioning).
	Version *string `json:"version,omitempty"`
}

VersionedAPISchema defines the API schema of either AIGatewayRoute (the input) or AIServiceBackend (the output).

This allows the ai-gateway to understand the input and perform the necessary transformation depending on the API schema pair (input, output).

Note that this is vendor-specific: the stability of the API schema is guaranteed not by the ai-gateway but by the vendor, via proper versioning.
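
For the "OpenAI" schema, the way Version acts as an endpoint prefix can be sketched as follows. `chatCompletionsPath` is a hypothetical helper, not part of this package, and the gateway's actual path construction may differ. The sketch assumes the stated default, falling back to "v1" when the version is unset or empty, and it accepts prefixes written with or without a leading slash.

```go
package main

import (
	"fmt"
	"path"
)

// chatCompletionsPath is a hypothetical helper that builds the chat
// completions endpoint path from an optional version prefix.
func chatCompletionsPath(version *string) string {
	v := "v1" // assumed fallback, following the documented default
	if version != nil && *version != "" {
		v = *version
	}
	// path.Join collapses duplicate slashes, so "/v1beta/openai" and
	// "v1beta/openai" produce the same result.
	return path.Join("/", v, "chat/completions")
}

func main() {
	gemini := "/v1beta/openai"
	cohere := "compatibility/v1"
	fmt.Println(chatCompletionsPath(nil))
	fmt.Println(chatCompletionsPath(&gemini))
	fmt.Println(chatCompletionsPath(&cohere))
}
```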

func (*VersionedAPISchema) DeepCopy ¶

func (in *VersionedAPISchema) DeepCopy() *VersionedAPISchema

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new VersionedAPISchema.

func (*VersionedAPISchema) DeepCopyInto ¶

func (in *VersionedAPISchema) DeepCopyInto(out *VersionedAPISchema)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

