v1alpha1

package

v0.1.4 (latest) Published: Jun 10, 2025 License: Apache-2.0 Imports: 9 Imported by: 0

Repository

github.com/inftyai/llmaz

Documentation ¶

Overview ¶

Package v1alpha1 contains API Schema definitions for the inference v1alpha1 API group.

+kubebuilder:object:generate=true
+groupName=inference.llmaz.io

Index ¶

Constants
Variables
func Resource(resource string) schema.GroupResource
type BackendName
type BackendRuntime
- func (in *BackendRuntime) DeepCopy() *BackendRuntime
- func (in *BackendRuntime) DeepCopyInto(out *BackendRuntime)
- func (in *BackendRuntime) DeepCopyObject() runtime.Object
type BackendRuntimeConfig
- func (in *BackendRuntimeConfig) DeepCopy() *BackendRuntimeConfig
- func (in *BackendRuntimeConfig) DeepCopyInto(out *BackendRuntimeConfig)
type BackendRuntimeList
- func (in *BackendRuntimeList) DeepCopy() *BackendRuntimeList
- func (in *BackendRuntimeList) DeepCopyInto(out *BackendRuntimeList)
- func (in *BackendRuntimeList) DeepCopyObject() runtime.Object
type BackendRuntimeSpec
- func (in *BackendRuntimeSpec) DeepCopy() *BackendRuntimeSpec
- func (in *BackendRuntimeSpec) DeepCopyInto(out *BackendRuntimeSpec)
type BackendRuntimeStatus
- func (in *BackendRuntimeStatus) DeepCopy() *BackendRuntimeStatus
- func (in *BackendRuntimeStatus) DeepCopyInto(out *BackendRuntimeStatus)
type ElasticConfig
- func (in *ElasticConfig) DeepCopy() *ElasticConfig
- func (in *ElasticConfig) DeepCopyInto(out *ElasticConfig)
type HPATrigger
- func (in *HPATrigger) DeepCopy() *HPATrigger
- func (in *HPATrigger) DeepCopyInto(out *HPATrigger)
type Playground
- func (in *Playground) DeepCopy() *Playground
- func (in *Playground) DeepCopyInto(out *Playground)
- func (in *Playground) DeepCopyObject() runtime.Object
type PlaygroundList
- func (in *PlaygroundList) DeepCopy() *PlaygroundList
- func (in *PlaygroundList) DeepCopyInto(out *PlaygroundList)
- func (in *PlaygroundList) DeepCopyObject() runtime.Object
type PlaygroundSpec
- func (in *PlaygroundSpec) DeepCopy() *PlaygroundSpec
- func (in *PlaygroundSpec) DeepCopyInto(out *PlaygroundSpec)
type PlaygroundStatus
- func (in *PlaygroundStatus) DeepCopy() *PlaygroundStatus
- func (in *PlaygroundStatus) DeepCopyInto(out *PlaygroundStatus)
type RecommendedConfig
- func (in *RecommendedConfig) DeepCopy() *RecommendedConfig
- func (in *RecommendedConfig) DeepCopyInto(out *RecommendedConfig)
type ResourceRequirements
- func (in *ResourceRequirements) DeepCopy() *ResourceRequirements
- func (in *ResourceRequirements) DeepCopyInto(out *ResourceRequirements)
type ScaleTrigger
- func (in *ScaleTrigger) DeepCopy() *ScaleTrigger
- func (in *ScaleTrigger) DeepCopyInto(out *ScaleTrigger)
type Service
- func (in *Service) DeepCopy() *Service
- func (in *Service) DeepCopyInto(out *Service)
- func (in *Service) DeepCopyObject() runtime.Object
type ServiceList
- func (in *ServiceList) DeepCopy() *ServiceList
- func (in *ServiceList) DeepCopyInto(out *ServiceList)
- func (in *ServiceList) DeepCopyObject() runtime.Object
type ServiceSpec
- func (in *ServiceSpec) DeepCopy() *ServiceSpec
- func (in *ServiceSpec) DeepCopyInto(out *ServiceSpec)
type ServiceStatus
- func (in *ServiceStatus) DeepCopy() *ServiceStatus
- func (in *ServiceStatus) DeepCopyInto(out *ServiceStatus)

Constants ¶

const (
	// PlaygroundProgressing means the Playground is progressing now, such as waiting for the
	// inference service creation, rolling update or scaling up and down.
	PlaygroundProgressing = "Progressing"
	// PlaygroundAvailable indicates the corresponding inference service is available now.
	PlaygroundAvailable string = "Available"
	// SkipModelLoaderAnnoKey indicates whether to skip the model loader,
	// enabling the inference engine to manage model loading directly.
	SkipModelLoaderAnnoKey = "llmaz.io/skip-model-loader"
)

const (
	// ServiceAvailable means the inferenceService is available and all the
	// workloads are running as expected.
	ServiceAvailable = "Available"
	// ServiceProgressing means the inferenceService is progressing now, such as
	// in creation, rolling update or scaling up and down.
	ServiceProgressing = "Progressing"
)
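The condition type constants above are meant to be matched against the Conditions slice in a status. A minimal sketch of that lookup follows; the `condition` type is a local stand-in for metav1.Condition and `isConditionTrue` is a hypothetical helper, neither is part of this package. In real code, `meta.IsStatusConditionTrue` from k8s.io/apimachinery/pkg/api/meta does the same job.

```go
package main

import "fmt"

// Mirrors the constants documented above.
const (
	ServiceAvailable   = "Available"
	ServiceProgressing = "Progressing"
)

// condition is a local stand-in for metav1.Condition, reduced to the two
// fields this sketch reads.
type condition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
}

// isConditionTrue (hypothetical helper) reports whether the named condition
// exists and has status "True".
func isConditionTrue(conds []condition, condType string) bool {
	for _, c := range conds {
		if c.Type == condType {
			return c.Status == "True"
		}
	}
	return false
}

func main() {
	conds := []condition{
		{Type: ServiceProgressing, Status: "False"},
		{Type: ServiceAvailable, Status: "True"},
	}
	fmt.Println(isConditionTrue(conds, ServiceAvailable))
}
```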

const (
	// InferenceServiceFlavorsAnnoKey is the annotation key for the flavors specified
	// in the inference service, the value is a comma-separated list of flavor names.
	InferenceServiceFlavorsAnnoKey = "llmaz.io/inference-service-flavors"
)
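Since the annotation value is a comma-separated list of flavor names, reading it back could look like the sketch below. `parseFlavors` is a hypothetical helper, and trimming whitespace and dropping empty entries are assumptions of this sketch rather than documented behavior; the flavor names are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// Copied from the constant documented above.
const InferenceServiceFlavorsAnnoKey = "llmaz.io/inference-service-flavors"

// parseFlavors (hypothetical helper) splits the annotation value on commas,
// trims surrounding whitespace and drops empty entries.
func parseFlavors(annotations map[string]string) []string {
	raw, ok := annotations[InferenceServiceFlavorsAnnoKey]
	if !ok {
		return nil
	}
	var flavors []string
	for _, name := range strings.Split(raw, ",") {
		if name = strings.TrimSpace(name); name != "" {
			flavors = append(flavors, name)
		}
	}
	return flavors
}

func main() {
	// "a100" and "t4" are illustrative flavor names.
	annos := map[string]string{InferenceServiceFlavorsAnnoKey: "a100, t4"}
	fmt.Println(parseFlavors(annos))
}
```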

Variables ¶

var (
	// GroupVersion is group version used to register these objects
	GroupVersion = schema.GroupVersion{Group: "inference.llmaz.io", Version: "v1alpha1"}

	// SchemeGroupVersion is alias to GroupVersion for client-go libraries.
	// It is required by pkg/client/informers/externalversions/...
	SchemeGroupVersion = GroupVersion

	// SchemeBuilder is used to add go types to the GroupVersionKind scheme
	SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion}

	// AddToScheme adds the types in this group-version to the given scheme.
	AddToScheme = SchemeBuilder.AddToScheme
)

Functions ¶

func Resource ¶

func Resource(resource string) schema.GroupResource

Resource is required by pkg/client/listers/...
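To illustrate what a group-qualified resource looks like, here is a self-contained sketch. The `groupVersion` and `groupResource` types are local stand-ins for schema.GroupVersion and schema.GroupResource, `resource` only imitates what a function like Resource typically does, and the plural name "playgrounds" is an assumption.

```go
package main

import "fmt"

// groupVersion is a local stand-in for schema.GroupVersion.
type groupVersion struct{ Group, Version string }

// String renders the value the way an apiVersion field is written.
func (g groupVersion) String() string { return g.Group + "/" + g.Version }

// groupResource is a local stand-in for schema.GroupResource.
type groupResource struct{ Group, Resource string }

// String renders "<resource>.<group>", or just the resource for an empty group.
func (g groupResource) String() string {
	if g.Group == "" {
		return g.Resource
	}
	return g.Resource + "." + g.Group
}

// Same group and version as the GroupVersion variable documented above.
var apiGroupVersion = groupVersion{Group: "inference.llmaz.io", Version: "v1alpha1"}

// resource qualifies an unqualified resource name with the API group.
func resource(name string) groupResource {
	return groupResource{Group: apiGroupVersion.Group, Resource: name}
}

func main() {
	fmt.Println(apiGroupVersion)
	fmt.Println(resource("playgrounds")) // assumed plural resource name
}
```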

Types ¶

type BackendName ¶

type BackendName string

const (
	DefaultBackend BackendName = "vllm"
)

type BackendRuntime ¶ added in v0.0.7

type BackendRuntime struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   BackendRuntimeSpec   `json:"spec,omitempty"`
	Status BackendRuntimeStatus `json:"status,omitempty"`
}

BackendRuntime is the Schema for the backendRuntime API

func (*BackendRuntime) DeepCopy ¶ added in v0.0.7

func (in *BackendRuntime) DeepCopy() *BackendRuntime

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendRuntime.

func (*BackendRuntime) DeepCopyInto ¶ added in v0.0.7

func (in *BackendRuntime) DeepCopyInto(out *BackendRuntime)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*BackendRuntime) DeepCopyObject ¶ added in v0.0.7

func (in *BackendRuntime) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.
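The DeepCopy family exists so that a copy can be mutated without touching the original, which matters for objects read from a shared cache. The sketch below hand-writes the usual generated pattern for a tiny stand-in type; `runtimeSpec` is hypothetical and far smaller than the real BackendRuntimeSpec.

```go
package main

import "fmt"

// runtimeSpec is a hypothetical, reduced stand-in for a spec type that
// holds a slice field.
type runtimeSpec struct {
	Image   string
	Command []string
}

// DeepCopyInto copies the receiver into out, allocating a fresh slice so the
// two values do not share backing storage. in must be non-nil.
func (in *runtimeSpec) DeepCopyInto(out *runtimeSpec) {
	*out = *in
	if in.Command != nil {
		out.Command = make([]string, len(in.Command))
		copy(out.Command, in.Command)
	}
}

// DeepCopy returns a new, independent copy of the receiver (nil-safe).
func (in *runtimeSpec) DeepCopy() *runtimeSpec {
	if in == nil {
		return nil
	}
	out := new(runtimeSpec)
	in.DeepCopyInto(out)
	return out
}

func main() {
	orig := &runtimeSpec{Image: "example/image", Command: []string{"serve"}}
	cp := orig.DeepCopy()
	cp.Command[0] = "changed"
	fmt.Println(orig.Command[0], cp.Command[0])
}
```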

type BackendRuntimeConfig ¶ added in v0.0.7

type BackendRuntimeConfig struct {
	// BackendName represents the inference backend under the hood, e.g. vLLM.
	// +kubebuilder:default=vllm
	// +optional
	BackendName *BackendName `json:"backendName,omitempty"`
	// Version represents the backend version if you want a different one
	// from the default version.
	// +optional
	Version *string `json:"version,omitempty"`
	// Envs represents the environments set to the container.
	// +optional
	Envs []corev1.EnvVar `json:"envs,omitempty"`
	// ConfigName represents the recommended configuration name for the backend.
	// If not specified, it is inferred from the models at runtime, e.g. default,
	// speculative-decoding.
	ConfigName *string `json:"configName,omitempty"`
	// Args defined here are appended to the args defined in the recommendedConfig,
	// whether that config is set explicitly via configName or inferred at runtime.
	// +optional
	Args []string `json:"args,omitempty"`
	// Resources represents the resource requirements for the backend, such as cpu/mem.
	// Accelerators such as GPUs should not be defined here but in the model flavors;
	// otherwise the values here will be overwritten.
	// Resources defined here will "overwrite" the resources in the recommendedConfig.
	// +optional
	Resources *ResourceRequirements `json:"resources,omitempty"`
	// SharedMemorySize represents the size of /dev/shm required in the runtime of
	// inference workload.
	// SharedMemorySize defined here will "overwrite" the sharedMemorySize in the recommendedConfig.
	// +optional
	SharedMemorySize *resource.Quantity `json:"sharedMemorySize,omitempty"`
}
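The field comments above describe two different merge rules against the recommendedConfig: Args are appended, while Resources and SharedMemorySize overwrite. A rough sketch of those rules follows, using reduced stand-in types (`recommended`, `override`) and a hypothetical `merge` function; the real controller logic may differ in detail, and the argument values are illustrative.

```go
package main

import "fmt"

// recommended is a reduced stand-in for RecommendedConfig.
type recommended struct {
	Args             []string
	SharedMemorySize string
}

// override is a reduced stand-in for BackendRuntimeConfig; a nil pointer
// means "not set by the user".
type override struct {
	Args             []string
	SharedMemorySize *string
}

// merge (hypothetical) appends user args after the recommended ones and lets
// a user-supplied sharedMemorySize replace the recommended value.
func merge(rec recommended, ov override) recommended {
	out := recommended{
		// Copy first so the recommended slice is never modified in place.
		Args:             append(append([]string{}, rec.Args...), ov.Args...),
		SharedMemorySize: rec.SharedMemorySize,
	}
	if ov.SharedMemorySize != nil {
		out.SharedMemorySize = *ov.SharedMemorySize
	}
	return out
}

func main() {
	shm := "4Gi"
	rec := recommended{Args: []string{"--port", "8080"}, SharedMemorySize: "2Gi"}
	got := merge(rec, override{Args: []string{"--max-model-len", "4096"}, SharedMemorySize: &shm})
	fmt.Println(got.Args, got.SharedMemorySize)
}
```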

func (*BackendRuntimeConfig) DeepCopy ¶ added in v0.0.7

func (in *BackendRuntimeConfig) DeepCopy() *BackendRuntimeConfig

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendRuntimeConfig.

func (*BackendRuntimeConfig) DeepCopyInto ¶ added in v0.0.7

func (in *BackendRuntimeConfig) DeepCopyInto(out *BackendRuntimeConfig)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type BackendRuntimeList ¶ added in v0.0.7

type BackendRuntimeList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []BackendRuntime `json:"items"`
}

BackendRuntimeList contains a list of BackendRuntime

func (*BackendRuntimeList) DeepCopy ¶ added in v0.0.7

func (in *BackendRuntimeList) DeepCopy() *BackendRuntimeList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendRuntimeList.

func (*BackendRuntimeList) DeepCopyInto ¶ added in v0.0.7

func (in *BackendRuntimeList) DeepCopyInto(out *BackendRuntimeList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*BackendRuntimeList) DeepCopyObject ¶ added in v0.0.7

func (in *BackendRuntimeList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type BackendRuntimeSpec ¶ added in v0.0.7

type BackendRuntimeSpec struct {
	// Command represents the default command for the backendRuntime.
	// +optional
	Command []string `json:"command,omitempty"`
	// Image represents the default image registry of the backendRuntime.
	// It will work together with version to make up a real image.
	Image string `json:"image"`
	// Version represents the default version of the backendRuntime.
	// It will be appended to the image as a tag.
	Version string `json:"version"`
	// Envs represents the environments set to the container.
	// +optional
	Envs []corev1.EnvVar `json:"envs,omitempty"`
	// Lifecycle represents hooks executed during the lifecycle of the container.
	// +optional
	Lifecycle *corev1.Lifecycle `json:"lifecycle,omitempty"`
	// Periodic probe of backend liveness.
	// Backend will be restarted if the probe fails.
	// Cannot be updated.
	// +optional
	LivenessProbe *corev1.Probe `json:"livenessProbe,omitempty"`
	// Periodic probe of backend readiness.
	// Backend will be removed from service endpoints if the probe fails.
	// +optional
	ReadinessProbe *corev1.Probe `json:"readinessProbe,omitempty"`
	// StartupProbe indicates that the Backend has successfully initialized.
	// If specified, no other probes are executed until this completes successfully.
	// If this probe fails, the backend will be restarted, just as if the livenessProbe failed.
	// This can be used to provide different probe parameters at the beginning of a backend's lifecycle,
	// when it might take a long time to load data or warm a cache, than during steady-state operation.
	// +optional
	StartupProbe *corev1.Probe `json:"startupProbe,omitempty"`
	// RecommendedConfigs represents the recommended configurations for the backendRuntime.
	// +optional
	RecommendedConfigs []RecommendedConfig `json:"recommendedConfigs,omitempty"`
}

BackendRuntimeSpec defines the desired state of BackendRuntime
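According to the field comments, Image and Version combine into the final image reference, with the version appended as a tag, and BackendRuntimeConfig.Version can replace the default. A small sketch under those assumptions; `resolveImage` is a hypothetical helper and the image name and versions are illustrative only.

```go
package main

import "fmt"

// resolveImage (hypothetical) joins the image and version as "image:tag".
// A non-nil, non-empty override takes precedence over the default version.
func resolveImage(image, defaultVersion string, override *string) string {
	version := defaultVersion
	if override != nil && *override != "" {
		version = *override
	}
	return image + ":" + version
}

func main() {
	custom := "v0.7.0" // illustrative version
	fmt.Println(resolveImage("vllm/vllm-openai", "v0.6.0", nil))
	fmt.Println(resolveImage("vllm/vllm-openai", "v0.6.0", &custom))
}
```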

func (*BackendRuntimeSpec) DeepCopy ¶ added in v0.0.7

func (in *BackendRuntimeSpec) DeepCopy() *BackendRuntimeSpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendRuntimeSpec.

func (*BackendRuntimeSpec) DeepCopyInto ¶ added in v0.0.7

func (in *BackendRuntimeSpec) DeepCopyInto(out *BackendRuntimeSpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type BackendRuntimeStatus ¶ added in v0.0.7

type BackendRuntimeStatus struct {
	// Conditions represents the Inference condition.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}

BackendRuntimeStatus defines the observed state of BackendRuntime

func (*BackendRuntimeStatus) DeepCopy ¶ added in v0.0.7

func (in *BackendRuntimeStatus) DeepCopy() *BackendRuntimeStatus

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BackendRuntimeStatus.

func (*BackendRuntimeStatus) DeepCopyInto ¶ added in v0.0.7

func (in *BackendRuntimeStatus) DeepCopyInto(out *BackendRuntimeStatus)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ElasticConfig ¶

type ElasticConfig struct {
	// MinReplicas indicates the minimum number of inference workloads based on the traffic.
	// Defaults to 1.
	// MinReplicas cannot be 0 for now; serverless support is planned for the future.
	// +kubebuilder:default=1
	// +optional
	MinReplicas *int32 `json:"minReplicas,omitempty"`
	// MaxReplicas indicates the maximum number of inference workloads based on the traffic.
	// Default to nil means there's no limit for the instance number.
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:Minimum:=1
	MaxReplicas int32 `json:"maxReplicas,omitempty"`
	// ScaleTrigger defines the rules to scale the workloads.
	// Only one trigger could work at a time, mostly used in Playground.
	// ScaleTrigger defined here will "overwrite" the scaleTrigger in the recommendedConfig.
	// +optional
	ScaleTrigger *ScaleTrigger `json:"scaleTrigger,omitempty"`
}
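The kubebuilder markers above imply a few invariants: minReplicas defaults to 1 and cannot be 0, and maxReplicas must be at least 1. The sketch below checks them on a stand-in type; `effectiveBounds` is hypothetical, and the extra check that minReplicas does not exceed maxReplicas is an assumption not stated in the source.

```go
package main

import (
	"errors"
	"fmt"
)

// elasticConfig is a reduced stand-in for ElasticConfig.
type elasticConfig struct {
	MinReplicas *int32
	MaxReplicas int32
}

// effectiveBounds (hypothetical) applies the documented default and checks
// the documented minimums, returning the bounds to scale within.
func effectiveBounds(c elasticConfig) (lo, hi int32, err error) {
	lo = 1 // documented default for minReplicas
	if c.MinReplicas != nil {
		lo = *c.MinReplicas
	}
	if lo < 1 {
		return 0, 0, errors.New("minReplicas must be at least 1")
	}
	if c.MaxReplicas < 1 {
		return 0, 0, errors.New("maxReplicas must be at least 1")
	}
	// Assumption of this sketch: the lower bound may not exceed the upper one.
	if lo > c.MaxReplicas {
		return 0, 0, errors.New("minReplicas must not exceed maxReplicas")
	}
	return lo, c.MaxReplicas, nil
}

func main() {
	lo, hi, err := effectiveBounds(elasticConfig{MaxReplicas: 5})
	fmt.Println(lo, hi, err)
}
```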

func (*ElasticConfig) DeepCopy ¶

func (in *ElasticConfig) DeepCopy() *ElasticConfig

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ElasticConfig.

func (*ElasticConfig) DeepCopyInto ¶

func (in *ElasticConfig) DeepCopyInto(out *ElasticConfig)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type HPATrigger ¶ added in v0.1.0

type HPATrigger struct {
	// metrics contains the specifications for which to use to calculate the
	// desired replica count (the maximum replica count across all metrics will
	// be used).  The desired replica count is calculated multiplying the
	// ratio between the target value and the current value by the current
	// number of pods.  Ergo, metrics used must decrease as the pod count is
	// increased, and vice-versa.  See the individual metric source types for
	// more information about how each type of metric must respond.
	// +optional
	Metrics []autoscalingv2.MetricSpec `json:"metrics,omitempty"`
	// behavior configures the scaling behavior of the target
	// in both Up and Down directions (scaleUp and scaleDown fields respectively).
	// If not set, the default HPAScalingRules for scale up and scale down are used.
	// +optional
	Behavior *autoscalingv2.HorizontalPodAutoscalerBehavior `json:"behavior,omitempty"`
}

HPATrigger represents the configuration of the HorizontalPodAutoscaler. Inspired by kubernetes.io/pkg/apis/autoscaling/types.go#HorizontalPodAutoscalerSpec. Note: the HPA component must be installed beforehand.
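The replica calculation that the metrics comment refers to follows the formula published in the Kubernetes HPA documentation: desired = ceil(current * currentMetricValue / targetMetricValue). A simplified sketch follows; it ignores the tolerance band, readiness handling and multi-metric maximum that the real autoscaler applies, and `desiredReplicas` and `clamp` are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies the basic HPA formula:
// ceil(currentReplicas * currentValue / targetValue).
func desiredReplicas(current int32, currentValue, targetValue float64) int32 {
	return int32(math.Ceil(float64(current) * currentValue / targetValue))
}

// clamp keeps n within [lo, hi], as min/max replica bounds would.
func clamp(n, lo, hi int32) int32 {
	if n < lo {
		return lo
	}
	if n > hi {
		return hi
	}
	return n
}

func main() {
	// 4 pods at twice the target load ask for 8 pods, capped at 6 here.
	want := desiredReplicas(4, 200, 100)
	fmt.Println(want, clamp(want, 1, 6))
}
```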

func (*HPATrigger) DeepCopy ¶ added in v0.1.0

func (in *HPATrigger) DeepCopy() *HPATrigger

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new HPATrigger.

func (*HPATrigger) DeepCopyInto ¶ added in v0.1.0

func (in *HPATrigger) DeepCopyInto(out *HPATrigger)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type Playground ¶

type Playground struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   PlaygroundSpec   `json:"spec,omitempty"`
	Status PlaygroundStatus `json:"status,omitempty"`
}

Playground is the Schema for the playgrounds API

func (*Playground) DeepCopy ¶

func (in *Playground) DeepCopy() *Playground

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Playground.

func (*Playground) DeepCopyInto ¶

func (in *Playground) DeepCopyInto(out *Playground)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*Playground) DeepCopyObject ¶

func (in *Playground) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type PlaygroundList ¶

type PlaygroundList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []Playground `json:"items"`
}

PlaygroundList contains a list of Playground

func (*PlaygroundList) DeepCopy ¶

func (in *PlaygroundList) DeepCopy() *PlaygroundList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PlaygroundList.

func (*PlaygroundList) DeepCopyInto ¶

func (in *PlaygroundList) DeepCopyInto(out *PlaygroundList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*PlaygroundList) DeepCopyObject ¶

func (in *PlaygroundList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type PlaygroundSpec ¶

type PlaygroundSpec struct {
	// Replicas represents the replica number of inference workloads.
	// +kubebuilder:default=1
	// +optional
	Replicas *int32 `json:"replicas,omitempty"`
	// ModelClaim represents a claim for a single model; it is a simplified use case
	// of modelClaims. Most of the time, modelClaim is enough.
	// ModelClaim and modelClaims are mutually exclusive.
	// +optional
	ModelClaim *coreapi.ModelClaim `json:"modelClaim,omitempty"`
	// ModelClaims represents claims for multiple models, for more complicated
	// use cases like speculative-decoding.
	// ModelClaims and modelClaim are mutually exclusive.
	// +optional
	ModelClaims *coreapi.ModelClaims `json:"modelClaims,omitempty"`
	// BackendRuntimeConfig represents the inference backendRuntime configuration
	// under the hood, e.g. vLLM, which is the default backendRuntime.
	// +optional
	BackendRuntimeConfig *BackendRuntimeConfig `json:"backendRuntimeConfig,omitempty"`
	// ElasticConfig defines the configuration for elastic usage,
	// e.g. the max/min replicas.
	// +optional
	ElasticConfig *ElasticConfig `json:"elasticConfig,omitempty"`
}

PlaygroundSpec defines the desired state of Playground
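Because modelClaim and modelClaims cannot both be set, a client may want to check this before submitting an object. A minimal sketch with empty stand-in types follows; `validateClaims` is hypothetical, and whether leaving both fields unset is acceptable is not stated in the source, so the sketch does not check it.

```go
package main

import (
	"errors"
	"fmt"
)

// modelClaim and modelClaims are empty stand-ins for coreapi.ModelClaim and
// coreapi.ModelClaims; their real fields are not needed for this check.
type modelClaim struct{}
type modelClaims struct{}

// playgroundSpec is a reduced stand-in for PlaygroundSpec.
type playgroundSpec struct {
	ModelClaim  *modelClaim
	ModelClaims *modelClaims
}

// validateClaims (hypothetical) rejects a spec that sets both fields.
func validateClaims(s playgroundSpec) error {
	if s.ModelClaim != nil && s.ModelClaims != nil {
		return errors.New("modelClaim and modelClaims are mutually exclusive")
	}
	return nil
}

func main() {
	both := playgroundSpec{ModelClaim: &modelClaim{}, ModelClaims: &modelClaims{}}
	fmt.Println(validateClaims(both))
}
```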

func (*PlaygroundSpec) DeepCopy ¶

func (in *PlaygroundSpec) DeepCopy() *PlaygroundSpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PlaygroundSpec.

func (*PlaygroundSpec) DeepCopyInto ¶

func (in *PlaygroundSpec) DeepCopyInto(out *PlaygroundSpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type PlaygroundStatus ¶

type PlaygroundStatus struct {
	// Conditions represents the Inference condition.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
	// Replicas tracks the replicas that have been created, whether ready or not.
	Replicas int32 `json:"replicas"`
	// Selector points to the string form of a label selector which will be used by HPA.
	Selector string `json:"selector,omitempty"`
}

PlaygroundStatus defines the observed state of Playground
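The Selector field carries a label selector in its string form, which for simple equality matches is a comma-separated list of key=value pairs. A sketch of producing that form from a label map; `selectorString` is a hypothetical helper and the label keys are made up, not the labels llmaz actually sets.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// selectorString (hypothetical) renders labels as "k1=v1,k2=v2" with keys
// sorted so the output is deterministic.
func selectorString(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+"="+labels[k])
	}
	return strings.Join(pairs, ",")
}

func main() {
	// Illustrative label keys and values.
	fmt.Println(selectorString(map[string]string{
		"example.io/name": "demo",
		"app":             "inference",
	}))
}
```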

func (*PlaygroundStatus) DeepCopy ¶

func (in *PlaygroundStatus) DeepCopy() *PlaygroundStatus

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PlaygroundStatus.

func (*PlaygroundStatus) DeepCopyInto ¶

func (in *PlaygroundStatus) DeepCopyInto(out *PlaygroundStatus)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type RecommendedConfig ¶ added in v0.1.1

type RecommendedConfig struct {
	// Name represents the identifier of the config.
	Name string `json:"name"`
	// Args represents all the arguments for the command.
	// An argument wrapped as {{ .CONFIG }} is a placeholder waiting to be rendered.
	// +optional
	Args []string `json:"args,omitempty"`
	// Resources represents the resource requirements for the backend, such as cpu/mem.
	// Accelerators such as GPUs should not be defined here but in the model flavors;
	// otherwise the values here will be overwritten.
	// +optional
	Resources *ResourceRequirements `json:"resources,omitempty"`
	// SharedMemorySize represents the size of /dev/shm required in the runtime of
	// inference workload.
	// +optional
	SharedMemorySize *resource.Quantity `json:"sharedMemorySize,omitempty"`
	// ScaleTrigger defines the rules to scale the workloads.
	// Only one trigger could work at a time.
	// +optional
	ScaleTrigger *ScaleTrigger `json:"scaleTrigger,omitempty"`
}

RecommendedConfig represents the recommended configurations for the backendRuntime; users can choose one of them to apply.
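The Args comment mentions placeholders of the form {{ .CONFIG }} that are rendered later. The placeholder syntax matches Go's text/template, so one way to picture the rendering is the sketch below. Using text/template, the `renderArgs` helper and the ModelPath key are all assumptions of this sketch, not a description of how llmaz renders arguments.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderArgs (hypothetical) treats every argument as a text/template and
// fills in values by key. An unknown key is reported as an error instead of
// being rendered as "<no value>".
func renderArgs(args []string, values map[string]string) ([]string, error) {
	out := make([]string, 0, len(args))
	for _, arg := range args {
		tmpl, err := template.New("arg").Option("missingkey=error").Parse(arg)
		if err != nil {
			return nil, err
		}
		var sb strings.Builder
		if err := tmpl.Execute(&sb, values); err != nil {
			return nil, err
		}
		out = append(out, sb.String())
	}
	return out, nil
}

func main() {
	// ModelPath is an illustrative placeholder name.
	args := []string{"--model", "{{ .ModelPath }}", "--port", "8080"}
	rendered, err := renderArgs(args, map[string]string{"ModelPath": "/workspace/models/demo"})
	fmt.Println(rendered, err)
}
```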

func (*RecommendedConfig) DeepCopy ¶ added in v0.1.1

func (in *RecommendedConfig) DeepCopy() *RecommendedConfig

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RecommendedConfig.

func (*RecommendedConfig) DeepCopyInto ¶ added in v0.1.1

func (in *RecommendedConfig) DeepCopyInto(out *RecommendedConfig)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ResourceRequirements ¶

type ResourceRequirements struct {
	// Limits describes the maximum amount of compute resources allowed.
	// More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
	// +optional
	Limits corev1.ResourceList `json:"limits,omitempty"`
	// Requests describes the minimum amount of compute resources required.
	// If Requests is omitted for a container, it defaults to Limits if that is explicitly specified,
	// otherwise to an implementation-defined value. Requests cannot exceed Limits.
	// More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
	// +optional
	Requests corev1.ResourceList `json:"requests,omitempty"`
}

TODO: DRA is not supported yet; support can be added once needed.

func (*ResourceRequirements) DeepCopy ¶

func (in *ResourceRequirements) DeepCopy() *ResourceRequirements

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ResourceRequirements.

func (*ResourceRequirements) DeepCopyInto ¶

func (in *ResourceRequirements) DeepCopyInto(out *ResourceRequirements)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ScaleTrigger ¶ added in v0.1.0

type ScaleTrigger struct {
	// HPA represents the trigger configuration of the HorizontalPodAutoscaler.
	HPA *HPATrigger `json:"hpa,omitempty"`
}

ScaleTrigger defines the rules to scale the workloads. Only one trigger could work at a time, mostly used in Playground.

func (*ScaleTrigger) DeepCopy ¶ added in v0.1.0

func (in *ScaleTrigger) DeepCopy() *ScaleTrigger

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScaleTrigger.

func (*ScaleTrigger) DeepCopyInto ¶ added in v0.1.0

func (in *ScaleTrigger) DeepCopyInto(out *ScaleTrigger)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type Service ¶

type Service struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   ServiceSpec   `json:"spec,omitempty"`
	Status ServiceStatus `json:"status,omitempty"`
}

Service is the Schema for the services API

func (*Service) DeepCopy ¶

func (in *Service) DeepCopy() *Service

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Service.

func (*Service) DeepCopyInto ¶

func (in *Service) DeepCopyInto(out *Service)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*Service) DeepCopyObject ¶

func (in *Service) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type ServiceList ¶

type ServiceList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []Service `json:"items"`
}

ServiceList contains a list of Service

func (*ServiceList) DeepCopy ¶

func (in *ServiceList) DeepCopy() *ServiceList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ServiceList.

func (*ServiceList) DeepCopyInto ¶

func (in *ServiceList) DeepCopyInto(out *ServiceList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*ServiceList) DeepCopyObject ¶

func (in *ServiceList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type ServiceSpec ¶

type ServiceSpec struct {
	// ModelClaims represents multiple claims for different models.
	ModelClaims coreapi.ModelClaims `json:"modelClaims,omitempty"`
	// Replicas represents the replica number of inference workloads.
	// +kubebuilder:default=1
	// +optional
	Replicas *int32 `json:"replicas,omitempty"`
	// WorkloadTemplate defines the template for leader/worker pods
	WorkloadTemplate lws.LeaderWorkerTemplate `json:"workloadTemplate"`
	// RolloutStrategy defines the strategy that will be applied to update replicas
	// when a revision is made to the leaderWorkerTemplate.
	// +kubebuilder:default:={type: "RollingUpdate", rollingUpdateConfiguration: {"maxUnavailable": 1, "maxSurge": 0}}
	// +optional
	RolloutStrategy *lws.RolloutStrategy `json:"rolloutStrategy,omitempty"`
}

ServiceSpec defines the desired state of Service. The Service controller maintains multi-flavor workloads with different accelerators for cost or performance considerations.

func (*ServiceSpec) DeepCopy ¶

func (in *ServiceSpec) DeepCopy() *ServiceSpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ServiceSpec.

func (*ServiceSpec) DeepCopyInto ¶

func (in *ServiceSpec) DeepCopyInto(out *ServiceSpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ServiceStatus ¶

type ServiceStatus struct {
	// Conditions represents the Inference condition.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
	// Replicas tracks the replicas that have been created, whether ready or not.
	Replicas int32 `json:"replicas"`
	// Selector points to the string form of a label selector, which the HPA
	// uses to autoscale the resource.
	Selector string `json:"selector,omitempty"`
}

ServiceStatus defines the observed state of Service

func (*ServiceStatus) DeepCopy ¶

func (in *ServiceStatus) DeepCopy() *ServiceStatus

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ServiceStatus.

func (*ServiceStatus) DeepCopyInto ¶

func (in *ServiceStatus) DeepCopyInto(out *ServiceStatus)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
