Documentation ¶
Index ¶
- Constants
- func DetectArgFrom(playground *inferenceapi.Playground, isMultiNodesInference bool) string
- func FetchModelsByPlayground(ctx context.Context, k8sClient client.Client, ...) (models []*coreapi.OpenModel, err error)
- func FetchModelsByService(ctx context.Context, k8sClient client.Client, service *inferenceapi.Service) (models []*coreapi.OpenModel, err error)
- func FirstAssignedFlavor(model *coreapi.OpenModel, playground *inferenceapi.Playground) []coreapi.Flavor
- func MultiHostInference(model *coreapi.OpenModel, playground *inferenceapi.Playground) (int32, bool)
- type BackendRuntimeParser
- func NewBackendRuntimeParser(backendRuntime *inferenceapi.BackendRuntime) *BackendRuntimeParser
- func (p *BackendRuntimeParser) Args(playground *inferenceapi.Playground, models []*coreapi.OpenModel, ...) ([]string, error)
- func (p *BackendRuntimeParser) Commands() []string
- func (p *BackendRuntimeParser) Envs() []corev1.EnvVar
- func (p *BackendRuntimeParser) Image(version string) string
- func (p *BackendRuntimeParser) LeaderCommands() []string
- func (p *BackendRuntimeParser) Resources() inferenceapi.ResourceRequirements
- func (p *BackendRuntimeParser) Version() string
- func (p *BackendRuntimeParser) WorkerCommands() []string
Constants ¶
const (
	DefaultArg             string = "default"
	SpeculativeDecodingArg string = "speculative-decoding"
	ModelParallelismArg    string = "model-parallelism"
)
SpeculativeDecodingArg and ModelParallelismArg are the two preset modes.
Variables ¶
This section is empty.
Functions ¶
func DetectArgFrom ¶ added in v0.0.9
func DetectArgFrom(playground *inferenceapi.Playground, isMultiNodesInference bool) string
DetectArgFrom will auto-detect the arg from model roles if it is not set explicitly.
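The detection described above might look roughly like the following sketch. It is self-contained and does not use the real package: the `playground` and `modelClaim` types, the `explicitArg` field, the `"draft"` role value, and the precedence order are all assumptions made for illustration, not the actual implementation.

```go
package main

import "fmt"

// Arg names mirror the package constants documented above.
const (
	DefaultArg             = "default"
	SpeculativeDecodingArg = "speculative-decoding"
	ModelParallelismArg    = "model-parallelism"
)

// modelClaim and playground are hypothetical stand-ins,
// not the real inferenceapi types.
type modelClaim struct {
	name string
	role string // assumed role values: "main" or "draft"
}

type playground struct {
	explicitArg string // assumed: non-empty when the arg was set explicitly
	claims      []modelClaim
}

// detectArg sketches one plausible detection order (an assumption):
// an explicit arg wins, then a draft model implies speculative decoding,
// then multi-node inference implies model parallelism.
func detectArg(p playground, isMultiNodesInference bool) string {
	if p.explicitArg != "" {
		return p.explicitArg
	}
	for _, c := range p.claims {
		if c.role == "draft" {
			return SpeculativeDecodingArg
		}
	}
	if isMultiNodesInference {
		return ModelParallelismArg
	}
	return DefaultArg
}

func main() {
	withDraft := playground{claims: []modelClaim{
		{name: "target", role: "main"},
		{name: "small", role: "draft"},
	}}
	fmt.Println(detectArg(withDraft, false))    // speculative-decoding
	fmt.Println(detectArg(playground{}, true))  // model-parallelism
	fmt.Println(detectArg(playground{}, false)) // default
}
```

The real function takes a `*inferenceapi.Playground`; consult the package source for the exact rules.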
func FetchModelsByPlayground ¶
func FetchModelsByPlayground(ctx context.Context, k8sClient client.Client, playground *inferenceapi.Playground) (models []*coreapi.OpenModel, err error)
func FetchModelsByService ¶
func FetchModelsByService(ctx context.Context, k8sClient client.Client, service *inferenceapi.Service) (models []*coreapi.OpenModel, err error)
func FirstAssignedFlavor ¶ added in v0.1.0
func FirstAssignedFlavor(model *coreapi.OpenModel, playground *inferenceapi.Playground) []coreapi.Flavor
FirstAssignedFlavor will return the first assigned flavor of the model, always the 0-index flavor.
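As a rough illustration of what "first assigned flavor" could mean, here is a minimal sketch using hypothetical local types. The `flavor`, `model`, and `playground` types, the `claimedFlavors` field, and the matching rule are assumptions, not the package's real data model.

```go
package main

import "fmt"

// flavor, model and playground are hypothetical stand-ins for
// coreapi.Flavor, coreapi.OpenModel and inferenceapi.Playground.
type flavor struct{ name string }

type model struct{ flavors []flavor }

type playground struct {
	// assumed: flavor names requested by the playground, in priority order
	claimedFlavors []string
}

// firstAssignedFlavor returns a slice holding at most one flavor.
// Assumption: with no claimed flavors, the model's index-0 flavor is used;
// otherwise the model flavor matching the first claimed name is used.
func firstAssignedFlavor(m model, p playground) []flavor {
	if len(m.flavors) == 0 {
		return nil
	}
	if len(p.claimedFlavors) == 0 {
		return []flavor{m.flavors[0]}
	}
	for _, f := range m.flavors {
		if f.name == p.claimedFlavors[0] {
			return []flavor{f}
		}
	}
	return nil
}

func main() {
	m := model{flavors: []flavor{{name: "a100"}, {name: "t4"}}}
	fmt.Println(firstAssignedFlavor(m, playground{}))
	fmt.Println(firstAssignedFlavor(m, playground{claimedFlavors: []string{"t4", "a100"}}))
}
```

Returning a slice rather than a single value matches the documented signature and lets callers treat "no flavor" as an empty result.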
func MultiHostInference ¶ added in v0.1.0
func MultiHostInference(model *coreapi.OpenModel, playground *inferenceapi.Playground) (int32, bool)
MultiHostInference returns two values: the first is the TP (tensor-parallel) size, and the second reports whether this is a multi-host inference.
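The two-value return can be consumed like any Go multi-value result. The sketch below is only an illustration of that shape: the `flavor` type, its `gpusPerHost` and `hosts` fields, and the rule that the TP size equals the GPUs on one host are assumptions, not how the package actually derives these values.

```go
package main

import "fmt"

// flavor is a hypothetical stand-in; the real function takes
// *coreapi.OpenModel and *inferenceapi.Playground instead.
type flavor struct {
	gpusPerHost int32 // assumed: GPUs available on each host
	hosts       int32 // assumed: number of hosts the workload spans
}

// multiHostInference mirrors the documented return shape:
// (TP size, whether the inference spans multiple hosts).
// Assumption: tensor parallelism covers the GPUs of a single host.
func multiHostInference(f flavor) (int32, bool) {
	tpSize := f.gpusPerHost
	return tpSize, f.hosts > 1
}

func main() {
	for _, f := range []flavor{
		{gpusPerHost: 8, hosts: 2},
		{gpusPerHost: 4, hosts: 1},
	} {
		tp, multi := multiHostInference(f)
		if multi {
			fmt.Printf("multi-host inference, TP size %d\n", tp)
		} else {
			fmt.Printf("single-host inference, TP size %d\n", tp)
		}
	}
}
```

A caller could use the boolean to decide between a single command set and separate leader and worker commands, though that wiring is not specified in this documentation.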
Types ¶
type BackendRuntimeParser ¶
type BackendRuntimeParser struct {
// contains filtered or unexported fields
}
TODO: add unit tests.
func NewBackendRuntimeParser ¶
func NewBackendRuntimeParser(backendRuntime *inferenceapi.BackendRuntime) *BackendRuntimeParser
func (*BackendRuntimeParser) Args ¶
func (p *BackendRuntimeParser) Args(playground *inferenceapi.Playground, models []*coreapi.OpenModel, multiNodes bool) ([]string, error)
func (*BackendRuntimeParser) Commands ¶
func (p *BackendRuntimeParser) Commands() []string
func (*BackendRuntimeParser) Envs ¶
func (p *BackendRuntimeParser) Envs() []corev1.EnvVar
func (*BackendRuntimeParser) Image ¶
func (p *BackendRuntimeParser) Image(version string) string
func (*BackendRuntimeParser) LeaderCommands ¶ added in v0.1.0
func (p *BackendRuntimeParser) LeaderCommands() []string
func (*BackendRuntimeParser) Resources ¶
func (p *BackendRuntimeParser) Resources() inferenceapi.ResourceRequirements
func (*BackendRuntimeParser) Version ¶
func (p *BackendRuntimeParser) Version() string
func (*BackendRuntimeParser) WorkerCommands ¶ added in v0.1.0
func (p *BackendRuntimeParser) WorkerCommands() []string