helper

package
v0.1.1
Published: Feb 18, 2025 License: Apache-2.0 Imports: 6 Imported by: 0

Documentation

Constants

const (
	DefaultArg             string = "default"
	SpeculativeDecodingArg string = "speculative-decoding"
	ModelParallelismArg    string = "model-parallelism"
)

SpeculativeDecodingArg and ModelParallelismArg are the two preset modes.

Variables

This section is empty.

Functions

func DetectArgFrom added in v0.0.9

func DetectArgFrom(playground *inferenceapi.Playground, isMultiNodesInference bool) string

DetectArgFrom will auto-detect the arg from the model roles if it is not set explicitly.
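
The role-based detection described above could look roughly like the following sketch. It is not the package's implementation: `detectArg` is a hypothetical stand-in that takes plain role strings instead of an `*inferenceapi.Playground`, and both the `"draft"` role name and the precedence (draft model first, then multi-node, then default) are assumptions made for illustration. Only the three constant values come from this page.

```go
package main

import "fmt"

// Constant values copied from this package's const block.
const (
	DefaultArg             = "default"
	SpeculativeDecodingArg = "speculative-decoding"
	ModelParallelismArg    = "model-parallelism"
)

// detectArg is a hypothetical stand-in for DetectArgFrom.
// Assumption: a model with the role "draft" signals speculative decoding,
// and a multi-node deployment without a draft model signals model parallelism.
func detectArg(roles []string, multiNode bool) string {
	for _, r := range roles {
		if r == "draft" { // assumed role name, not confirmed by this page
			return SpeculativeDecodingArg
		}
	}
	if multiNode {
		return ModelParallelismArg
	}
	return DefaultArg
}

func main() {
	fmt.Println(detectArg([]string{"main"}, false))
	fmt.Println(detectArg([]string{"main", "draft"}, false))
	fmt.Println(detectArg([]string{"main"}, true))
}
```

The real function reads the roles from the Playground's model claims; consult the source for the exact rules.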

func FetchModelsByPlayground

func FetchModelsByPlayground(ctx context.Context, k8sClient client.Client, playground *inferenceapi.Playground) (models []*coreapi.OpenModel, err error)

func FetchModelsByService

func FetchModelsByService(ctx context.Context, k8sClient client.Client, service *inferenceapi.Service) (models []*coreapi.OpenModel, err error)

func FirstAssignedFlavor added in v0.1.0

func FirstAssignedFlavor(model *coreapi.OpenModel, playground *inferenceapi.Playground) []coreapi.Flavor

FirstAssignedFlavor will return the first assigned flavor of the model.

func MultiHostInference added in v0.1.0

func MultiHostInference(model *coreapi.OpenModel, playground *inferenceapi.Playground) (int32, bool)

MultiHostInference returns two values: the pipeline-parallel (PP) size, and whether this is a multi-host inference.
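
A caller would typically check the boolean before relying on the size. The sketch below shows one way to consume such a pair. `describePlacement` is a hypothetical helper, not part of this package, and treating the size as irrelevant in the single-host case is an assumption.

```go
package main

import "fmt"

// describePlacement is a hypothetical caller-side helper that consumes a
// (ppSize, multiHost) pair shaped like the one MultiHostInference returns.
// Assumption: ppSize is only meaningful when multiHost is true.
func describePlacement(ppSize int32, multiHost bool) string {
	if !multiHost {
		return "single-host"
	}
	return fmt.Sprintf("multi-host, PP size %d", ppSize)
}

func main() {
	fmt.Println(describePlacement(2, true))
	fmt.Println(describePlacement(0, false))
}
```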

func RecommendedConfigName added in v0.1.1

func RecommendedConfigName(playground *inferenceapi.Playground, multiNodes bool) string

Types

This section is empty.
