v1alpha1

package
v0.0.5
Published: Aug 26, 2024 License: Apache-2.0 Imports: 5 Imported by: 1

Documentation

Overview

Package v1alpha1 contains API Schema definitions for the v1alpha1 API group +kubebuilder:object:generate=true +groupName=llmaz.io

Index

Constants

const (
	ModelFamilyNameLabelKey = "llmaz.io/model-family-name"
	ModelNameLabelKey       = "llmaz.io/model-name"

	HUGGING_FACE = "Huggingface"
	MODEL_SCOPE  = "ModelScope"
)

Variables

var (
	// GroupVersion is group version used to register these objects
	GroupVersion = schema.GroupVersion{Group: "llmaz.io", Version: "v1alpha1"}

	// SchemeGroupVersion is alias to GroupVersion for client-go libraries.
	// It is required by pkg/client/informers/externalversions/...
	SchemeGroupVersion = GroupVersion

	// SchemeBuilder is used to add go types to the GroupVersionKind scheme
	SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion}

	// AddToScheme adds the types in this group-version to the given scheme.
	AddToScheme = SchemeBuilder.AddToScheme
)

Functions

func Resource

func Resource(resource string) schema.GroupResource

Resource is required by pkg/client/listers/...

Types

type Flavor

type Flavor struct {
	// Name represents the flavor name, which will be used in model claim.
	Name FlavorName `json:"name"`
	// Requests defines the required accelerators to serve the model, like nvidia.com/gpu: 8.
	// When the GPU number is greater than 8, like 32, multi-host inference is enabled and
	// 32/8=4 hosts will be grouped as a unit; each host will have a resource request of
	// nvidia.com/gpu: 8. This may change in the future if the GPU number limit is broken.
	// Setting the cpu and memory usage here is not recommended.
	// If using a playground, you can define the cpu/mem usage at backendConfig.
	// If using a service, you can define the cpu/mem at the container resources.
	// Note: if you define the same accelerator requests at the playground/service as well,
	// the requests here will be overridden.
	// +optional
	Requests v1.ResourceList `json:"requests,omitempty"`
	// NodeSelector defines the labels to filter specified nodes, like
	// cloud-provider.com/accelerator: nvidia-a100.
	// NodeSelector will be auto injected to the Pods as scheduling primitives.
	// +optional
	NodeSelector map[string]string `json:"nodeSelector,omitempty"`
	// Params stores other useful parameters that will be consumed by autoscaling
	// components like cluster-autoscaler or Karpenter.
	// E.g. when scaling up nodes with 8x Nvidia A100, the parameter can be injected as
	// instance-type: p4d.24xlarge for AWS.
	// +optional
	Params map[string]string `json:"params,omitempty"`
}

Flavor defines the accelerator requirements for a model and the necessary parameters in autoscaling. Right now, it will be used in two places: - Pod scheduling with node selectors specified. - Cluster autoscaling with essential parameters provided.
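For illustration, a Flavor entry could be rendered in YAML like the sketch below; the flavor name, node label, and instance type are hypothetical values, not defaults defined by this package.

```yaml
- name: a100                                      # FlavorName referenced by model claims
  requests:
    nvidia.com/gpu: 8                             # accelerator request per host
  nodeSelector:
    cloud-provider.com/accelerator: nvidia-a100   # injected into Pods for scheduling
  params:
    instance-type: p4d.24xlarge                   # hint consumed by autoscalers, e.g. Karpenter
```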

func (*Flavor) DeepCopy

func (in *Flavor) DeepCopy() *Flavor

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Flavor.

func (*Flavor) DeepCopyInto

func (in *Flavor) DeepCopyInto(out *Flavor)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type FlavorName

type FlavorName string

type ModelClaim

type ModelClaim struct {
	// ModelName represents the name of the claimed model. To claim multiple
	// models, e.g. for state-of-the-art technologies like speculative decoding,
	// use multiModelsClaim instead.
	ModelName ModelName `json:"modelName,omitempty"`
	// InferenceFlavors represents a list of flavors with fungibility support
	// to serve the model. The flavor names should be a subset of the flavors
	// configured on the model. If not set, the model's configured flavors will be used.
	// +optional
	InferenceFlavors []FlavorName `json:"inferenceFlavors,omitempty"`
}

ModelClaim represents a reference to one model. It's a simpler config for most cases compared to multiModelsClaim.
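A minimal modelClaim might look like the following sketch; the surrounding object (e.g. a Playground spec) and the model and flavor names are assumptions for illustration, not defined in this package.

```yaml
modelClaim:
  modelName: llama3-8b     # hypothetical ModelName of an existing OpenModel
  inferenceFlavors:        # optional; must be a subset of the model's flavors
  - a100
```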

func (*ModelClaim) DeepCopy

func (in *ModelClaim) DeepCopy() *ModelClaim

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ModelClaim.

func (*ModelClaim) DeepCopyInto

func (in *ModelClaim) DeepCopyInto(out *ModelClaim)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ModelHub

type ModelHub struct {
	// Name refers to the model registry, such as huggingface.
	// +kubebuilder:default=Huggingface
	// +kubebuilder:validation:Enum={Huggingface,ModelScope}
	// +optional
	Name *string `json:"name,omitempty"`
	// ModelID refers to the model identifier on model hub,
	// such as meta-llama/Meta-Llama-3-8B.
	ModelID string `json:"modelID,omitempty"`
	// Filename refers to a specified model file rather than the whole repo.
	// This is helpful for downloading a specific GGUF model rather than
	// the whole repo, which includes all kinds of quantized models.
	// TODO: this is only supported with Huggingface; add support for ModelScope
	// in the near future.
	// +optional
	Filename *string `json:"filename,omitempty"`
	// Revision refers to a Git revision id which can be a branch name, a tag, or a commit hash.
	// Most of the time, you don't need to specify it.
	// +optional
	Revision *string `json:"revision,omitempty"`
}

ModelHub represents the model registry for model downloads.
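As an illustration, a modelHub source might be declared like this; the model ID matches the doc comment's example, while the GGUF filename is hypothetical.

```yaml
modelHub:
  name: Huggingface                      # default; ModelScope is also accepted
  modelID: meta-llama/Meta-Llama-3-8B    # model identifier on the hub
  filename: model-q4_0.gguf              # hypothetical; Huggingface only for now
```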

func (*ModelHub) DeepCopy

func (in *ModelHub) DeepCopy() *ModelHub

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ModelHub.

func (*ModelHub) DeepCopyInto

func (in *ModelHub) DeepCopyInto(out *ModelHub)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ModelName

type ModelName string

type ModelSource

type ModelSource struct {
	// ModelHub represents the model registry for model downloads.
	// +optional
	ModelHub *ModelHub `json:"modelHub,omitempty"`
	// URI represents various kinds of model sources following the URI protocol, e.g.:
	// - OSS: oss://<bucket>.<endpoint>/<path-to-your-model>
	//
	// +optional
	URI *URIProtocol `json:"uri,omitempty"`
}

ModelSource represents the source of the model. Only one model source will be used.
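As a sketch, a model sourced via the URI protocol might be declared like this; the bucket, endpoint, and path are hypothetical placeholders following the oss:// template from the doc comment.

```yaml
source:
  # hypothetical OSS location, following oss://<bucket>.<endpoint>/<path-to-your-model>
  uri: oss://my-bucket.oss-cn-hangzhou.aliyuncs.com/models/llama3-8b
```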

func (*ModelSource) DeepCopy

func (in *ModelSource) DeepCopy() *ModelSource

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ModelSource.

func (*ModelSource) DeepCopyInto

func (in *ModelSource) DeepCopyInto(out *ModelSource)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ModelSpec

type ModelSpec struct {
	// FamilyName represents the model type, like llama2, which will be automatically
	// injected into the labels with the key `llmaz.io/model-family-name`.
	FamilyName ModelName `json:"familyName"`
	// Source represents the source of the model; there are several ways to load
	// the model, such as loading from huggingface, OCI registry, s3, host path and so on.
	Source ModelSource `json:"source"`
	// InferenceFlavors represents the accelerator requirements to serve the model.
	// Flavors are fungible following the priority of slice order.
	// +optional
	InferenceFlavors []Flavor `json:"inferenceFlavors,omitempty"`
}

ModelSpec defines the desired state of Model

func (*ModelSpec) DeepCopy

func (in *ModelSpec) DeepCopy() *ModelSpec

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ModelSpec.

func (*ModelSpec) DeepCopyInto

func (in *ModelSpec) DeepCopyInto(out *ModelSpec)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type ModelStatus

type ModelStatus struct {
	// Conditions represents the latest observed conditions of the Model.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}

ModelStatus defines the observed state of Model

func (*ModelStatus) DeepCopy

func (in *ModelStatus) DeepCopy() *ModelStatus

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ModelStatus.

func (*ModelStatus) DeepCopyInto

func (in *ModelStatus) DeepCopyInto(out *ModelStatus)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type MultiModelsClaim

type MultiModelsClaim struct {
	// ModelNames represents a list of models; there may be multiple models here
	// to support state-of-the-art technologies like speculative decoding.
	// +kubebuilder:validation:MinItems=1
	ModelNames []ModelName `json:"modelNames,omitempty"`
	// InferenceFlavors represents a list of flavors with fungibility support
	// to serve the model.
	// - If not set, the flavors of the 0-index model always apply by default.
	// - If set, the flavor names will be looked up following the model order.
	// +optional
	InferenceFlavors []FlavorName `json:"inferenceFlavors,omitempty"`
	// Rate takes effect only when multiple claims are declared; it represents the
	// replica rate of the sub-workload, e.g. when claim1.rate:claim2.rate = 1:2 and
	// 3 replicas are defined in the workload, sub-workload1 will have 1 replica and
	// sub-workload2 will have 2 replicas.
	// This is mostly designed for the state-of-the-art technique called splitwise,
	// where the prefill and decode phases are separated and require different accelerators.
	// The number of replicas should be divisible by the sum of the rates.
	// +optional
	Rate *int32 `json:"rate,omitempty"`
}

MultiModelsClaim represents references to multiple models. It's an advanced and more complicated config compared to modelClaim.
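A hedged sketch of two claims with rates, assuming a surrounding workload that accepts a list of such claims; the list field name, model names, and flavor names are hypothetical.

```yaml
multiModelsClaims:        # hypothetical field name for a list of MultiModelsClaim
- modelNames:
  - llama3-70b            # e.g. the decode-phase model
  inferenceFlavors:
  - a100
  rate: 1
- modelNames:
  - llama3-8b             # e.g. the prefill/draft model
  inferenceFlavors:
  - a10
  rate: 2
# With 3 workload replicas and rates 1:2, sub-workload1 gets 1 replica
# and sub-workload2 gets 2 replicas.
```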

func (*MultiModelsClaim) DeepCopy

func (in *MultiModelsClaim) DeepCopy() *MultiModelsClaim

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new MultiModelsClaim.

func (*MultiModelsClaim) DeepCopyInto

func (in *MultiModelsClaim) DeepCopyInto(out *MultiModelsClaim)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

type OpenModel

type OpenModel struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   ModelSpec   `json:"spec,omitempty"`
	Status ModelStatus `json:"status,omitempty"`
}

OpenModel is the Schema for the open models API
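Putting the pieces together, a complete OpenModel manifest might look like the following sketch; all concrete values except the apiVersion, kind, and field names are illustrative.

```yaml
apiVersion: llmaz.io/v1alpha1
kind: OpenModel
metadata:
  name: llama3-8b          # hypothetical object name
spec:
  familyName: llama3       # injected as label llmaz.io/model-family-name
  source:
    modelHub:
      modelID: meta-llama/Meta-Llama-3-8B
  inferenceFlavors:        # fungible, in priority order
  - name: a100
    requests:
      nvidia.com/gpu: 1
```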

func (*OpenModel) DeepCopy

func (in *OpenModel) DeepCopy() *OpenModel

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new OpenModel.

func (*OpenModel) DeepCopyInto

func (in *OpenModel) DeepCopyInto(out *OpenModel)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*OpenModel) DeepCopyObject

func (in *OpenModel) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type OpenModelList

type OpenModelList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []OpenModel `json:"items"`
}

OpenModelList contains a list of OpenModel

func (*OpenModelList) DeepCopy

func (in *OpenModelList) DeepCopy() *OpenModelList

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new OpenModelList.

func (*OpenModelList) DeepCopyInto

func (in *OpenModelList) DeepCopyInto(out *OpenModelList)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*OpenModelList) DeepCopyObject

func (in *OpenModelList) DeepCopyObject() runtime.Object

DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.

type URIProtocol

type URIProtocol string

URIProtocol represents the protocol of the URI.
