Documentation
Constants ¶
const (
// Name is the backend name.
Name = "vllm"
)
Variables ¶
var StatusNotFound = errors.New("vLLM binary not found")
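A minimal sketch of how a caller might test for this sentinel with errors.Is. The variable is redeclared locally so the example compiles on its own, and locateBinary and isMissing are hypothetical helpers, not part of this package.

```go
package main

import (
	"errors"
	"fmt"
)

// StatusNotFound is redeclared here only so the sketch is self-contained.
var StatusNotFound = errors.New("vLLM binary not found")

// locateBinary is a hypothetical helper that wraps the sentinel with context
// when the binary is reported as missing.
func locateBinary(found bool) error {
	if !found {
		return fmt.Errorf("locating backend: %w", StatusNotFound)
	}
	return nil
}

// isMissing reports whether err is, or wraps, StatusNotFound.
func isMissing(err error) bool {
	return errors.Is(err, StatusNotFound)
}

func main() {
	if err := locateBinary(false); isMissing(err) {
		fmt.Println("backend unavailable:", err)
	}
}
```

Matching with errors.Is rather than == keeps the check working when an intermediate layer wraps the error with %w.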
Functions ¶
func GetMaxModelLen ¶
func GetMaxModelLen(modelCfg types.Config, backendCfg *inference.BackendConfiguration) *uint64
GetMaxModelLen returns the maximum model length (context size) from the model config or the backend config. The model config takes precedence over the backend config. It returns nil if neither specifies a value, in which case vLLM derives the length from the model.
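The precedence rule can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: modelConfig, backendConfig, their ContextSize fields, and pickMaxModelLen are hypothetical stand-ins for types.Config, inference.BackendConfiguration, and GetMaxModelLen, whose real fields are not shown here.

```go
package main

import "fmt"

// modelConfig and backendConfig are hypothetical stand-ins; the real types
// may name or structure the context-size setting differently.
type modelConfig struct {
	ContextSize *uint64
}

type backendConfig struct {
	ContextSize *uint64
}

// pickMaxModelLen follows the documented precedence: the model config wins,
// the backend config is the fallback, and nil means neither set a value.
func pickMaxModelLen(m modelConfig, b *backendConfig) *uint64 {
	if m.ContextSize != nil {
		return m.ContextSize
	}
	if b != nil && b.ContextSize != nil {
		return b.ContextSize
	}
	return nil
}

func main() {
	modelLen, backendLen := uint64(8192), uint64(4096)

	// Both set: the model config value is used.
	both := pickMaxModelLen(modelConfig{ContextSize: &modelLen}, &backendConfig{ContextSize: &backendLen})
	fmt.Println(*both)

	// Only the backend config set: it is used as the fallback.
	onlyBackend := pickMaxModelLen(modelConfig{}, &backendConfig{ContextSize: &backendLen})
	fmt.Println(*onlyBackend)

	// Neither set: nil, leaving the choice to vLLM.
	fmt.Println(pickMaxModelLen(modelConfig{}, nil) == nil)
}
```

Returning a pointer rather than a plain uint64 lets the caller distinguish "not configured" from an explicit value of zero.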
Types ¶
type Config ¶
type Config struct {
// Args are the base arguments that are always included.
Args []string
}
Config is the configuration for the vLLM backend.
func NewDefaultVLLMConfig ¶
func NewDefaultVLLMConfig() *Config
NewDefaultVLLMConfig creates a new Config with default values.
func (*Config) GetArgs ¶
func (c *Config) GetArgs(bundle types.ModelBundle, socket string, mode inference.BackendMode, config *inference.BackendConfiguration) ([]string, error)
GetArgs implements BackendConfig.GetArgs.