package vllm

v1.0.8 (latest) | Published: Dec 11, 2025 | License: Apache-2.0 | Imports: 17 | Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version

Repository

github.com/docker/model-runner

Documentation

Index

Constants
Variables
func GetMaxModelLen(modelCfg types.Config, backendCfg *inference.BackendConfiguration) *int32
func New(log logging.Logger, modelManager *models.Manager, serverLog logging.Logger, ...) (inference.Backend, error)
type Config
- func NewDefaultVLLMConfig() *Config
- func (c *Config) GetArgs(bundle types.ModelBundle, socket string, mode inference.BackendMode, ...) ([]string, error)

Constants

const (
	// Name is the backend name.
	Name = "vllm"
)

Variables

var ErrorNotFound = errors.New("vLLM binary not found")

Functions

func GetMaxModelLen

func GetMaxModelLen(modelCfg types.Config, backendCfg *inference.BackendConfiguration) *int32

GetMaxModelLen returns the maximum model length (context size) from the model config or the backend config. The model config takes precedence over the backend config. It returns nil if neither specifies a value, in which case vLLM derives the length from the model itself.
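
The precedence rule described above can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: modelConfig and backendConfig are hypothetical stand-ins for types.Config and inference.BackendConfiguration, and the ContextSize field name is assumed, since this page does not show the fields of those types.

```go
package main

import "fmt"

// modelConfig and backendConfig are hypothetical stand-ins for types.Config
// and inference.BackendConfiguration. The ContextSize field is an assumption.
type modelConfig struct {
	ContextSize *int32
}

type backendConfig struct {
	ContextSize *int32
}

// resolveMaxModelLen applies the documented precedence: the model config
// wins, then the backend config, and nil means "let vLLM decide".
func resolveMaxModelLen(m modelConfig, b *backendConfig) *int32 {
	if m.ContextSize != nil {
		return m.ContextSize
	}
	if b != nil && b.ContextSize != nil {
		return b.ContextSize
	}
	return nil
}

func main() {
	modelLen, backendLen := int32(8192), int32(4096)
	backend := &backendConfig{ContextSize: &backendLen}

	// Both set: the model config takes precedence.
	fmt.Println(*resolveMaxModelLen(modelConfig{ContextSize: &modelLen}, backend)) // 8192
	// Only the backend config set.
	fmt.Println(*resolveMaxModelLen(modelConfig{}, backend)) // 4096
	// Neither set: nil, so vLLM would derive the length from the model.
	fmt.Println(resolveMaxModelLen(modelConfig{}, nil) == nil) // true
}
```

Returning a pointer rather than a plain int32 lets the caller distinguish "not configured" from any concrete value.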

func New

func New(log logging.Logger, modelManager *models.Manager, serverLog logging.Logger, conf *Config) (inference.Backend, error)

New creates a new vLLM-based backend.

Types

type Config

type Config struct {
	// Args are the base arguments that are always included.
	Args []string
}

Config is the configuration for the vLLM backend.

func NewDefaultVLLMConfig

func NewDefaultVLLMConfig() *Config

NewDefaultVLLMConfig creates a new Config with default values.

func (*Config) GetArgs

func (c *Config) GetArgs(bundle types.ModelBundle, socket string, mode inference.BackendMode, config *inference.BackendConfiguration) ([]string, error)

GetArgs implements BackendConfig.GetArgs.
