create

package
v0.18.2

Warning: This package is not in the latest version of its module.
Published: Mar 18, 2026 | License: MIT | Imports: 17 | Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func CreateImageGenModel

func CreateImageGenModel(modelName, modelDir, quantize string, createLayer LayerCreator, createTensorLayer QuantizingTensorLayerCreator, writeManifest ManifestWriter, fn func(status string)) error

CreateImageGenModel imports an image generation model from a directory. It stores each tensor as a separate blob for fine-grained deduplication. If quantize is non-empty, linear weights in the transformer and text_encoder components are quantized. Supported quantization types are int4, int8, nvfp4, and mxfp8; an empty string disables quantization. Layer creation and manifest writing are done via callbacks to avoid import cycles.

func CreateSafetensorsModel

func CreateSafetensorsModel(modelName, modelDir, quantize string, createLayer LayerCreator, createTensorLayer QuantizingTensorLayerCreator, writeManifest ManifestWriter, fn func(status string), createPackedLayer ...PackedTensorLayerCreator) error

CreateSafetensorsModel imports a standard safetensors model from a directory. It handles Hugging Face-style models with config.json and *.safetensors files, and stores each tensor as a separate blob for fine-grained deduplication. Expert tensors are packed into per-layer blobs when createPackedLayer is non-nil. If quantize is non-empty (e.g., "int8"), eligible tensors are quantized.

func DTypeSize added in v0.18.2

func DTypeSize(dtype string) (int, error)

DTypeSize returns the byte size of a single element for the given dtype string.
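
As a rough illustration of what a per-element size lookup looks like, here is a minimal sketch. The function dtypeSize is a hypothetical stand-in, not the package's implementation, and the dtype strings are assumed to follow the names used in safetensors headers; the set accepted by DTypeSize itself is not documented here.

```go
package main

import "fmt"

// dtypeSize is a hypothetical stand-in for DTypeSize. The dtype names are
// the ones used in safetensors headers (an assumption); the real function
// may accept a different set.
func dtypeSize(dtype string) (int, error) {
	switch dtype {
	case "F64", "I64", "U64":
		return 8, nil
	case "F32", "I32", "U32":
		return 4, nil
	case "F16", "BF16", "I16", "U16":
		return 2, nil
	case "I8", "U8", "BOOL":
		return 1, nil
	default:
		return 0, fmt.Errorf("unsupported dtype %q", dtype)
	}
}

func main() {
	// Raw byte length of a 4096x4096 BF16 tensor:
	// element count times element size.
	size, err := dtypeSize("BF16")
	if err != nil {
		panic(err)
	}
	fmt.Println(4096 * 4096 * size)
}
```

Returning an error for unknown names, rather than zero, keeps a typo in a dtype string from silently producing an empty buffer size.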

func DecodeFloatTensor added in v0.18.2

func DecodeFloatTensor(dtype string, raw []byte) ([]float32, error)

DecodeFloatTensor decodes raw bytes into []float32 according to the given dtype.

func EncodeFloatTensor added in v0.18.2

func EncodeFloatTensor(dtype string, values []float32) ([]byte, error)

EncodeFloatTensor encodes []float32 into raw bytes according to the given dtype.
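
To make the decode/encode pair concrete, the sketch below handles a single dtype, bfloat16, which is the upper 16 bits of an IEEE 754 float32. The helpers decodeBF16 and encodeBF16 are hypothetical and do not reflect the package's code. Two assumptions are made: data is little-endian (as in safetensors files), and narrowing uses plain truncation, whereas a production encoder may round to nearest even.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// decodeBF16 widens little-endian bfloat16 values to float32 by placing
// each 16-bit pattern in the upper half of a 32-bit float.
func decodeBF16(raw []byte) ([]float32, error) {
	if len(raw)%2 != 0 {
		return nil, fmt.Errorf("bf16 data has odd length %d", len(raw))
	}
	out := make([]float32, len(raw)/2)
	for i := range out {
		bits := uint32(binary.LittleEndian.Uint16(raw[2*i:])) << 16
		out[i] = math.Float32frombits(bits)
	}
	return out, nil
}

// encodeBF16 narrows float32 values to bfloat16 by truncating the low
// 16 mantissa bits (assumption: no rounding is applied).
func encodeBF16(values []float32) []byte {
	raw := make([]byte, 2*len(values))
	for i, v := range values {
		binary.LittleEndian.PutUint16(raw[2*i:], uint16(math.Float32bits(v)>>16))
	}
	return raw
}

func main() {
	// These values are exactly representable in bfloat16,
	// so the round trip is lossless.
	in := []float32{1, -2.5, 0.15625}
	raw := encodeBF16(in)
	out, err := decodeBF16(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(raw), out)
}
```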

func ExpertGroupPrefix added in v0.16.0

func ExpertGroupPrefix(tensorName string) string

ExpertGroupPrefix returns the group prefix for expert tensors that should be packed together. For example:

  • "model.layers.1.mlp.experts.0.down_proj.weight" -> "model.layers.1.mlp.experts"
  • "model.layers.1.mlp.shared_experts.down_proj.weight" -> "model.layers.1.mlp.shared_experts"
  • "model.layers.0.mlp.down_proj.weight" -> "" (dense layer, no experts)
  • "model.layers.1.mlp.gate.weight" -> "" (routing gate, not an expert)
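
The four mappings above can be reproduced with a single regular expression. This is only a sketch of one way to get the documented behavior: expertGroupPrefix and its pattern are hypothetical, and the real function may recognize additional naming schemes.

```go
package main

import (
	"fmt"
	"regexp"
)

// expertGroupRe captures everything up to and including ".mlp.experts" or
// ".mlp.shared_experts" when that segment is followed by further path parts.
// The pattern is an assumption derived from the documented examples only.
var expertGroupRe = regexp.MustCompile(`^(.+\.mlp\.(?:shared_experts|experts))\.`)

// expertGroupPrefix is a hypothetical stand-in for ExpertGroupPrefix.
// It returns "" for tensors that do not belong to an expert group.
func expertGroupPrefix(tensorName string) string {
	m := expertGroupRe.FindStringSubmatch(tensorName)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	names := []string{
		"model.layers.1.mlp.experts.0.down_proj.weight",
		"model.layers.1.mlp.shared_experts.down_proj.weight",
		"model.layers.0.mlp.down_proj.weight",
		"model.layers.1.mlp.gate.weight",
	}
	for _, name := range names {
		fmt.Printf("%q -> %q\n", name, expertGroupPrefix(name))
	}
}
```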

func GetModelArchitecture

func GetModelArchitecture(modelName string) (string, error)

GetModelArchitecture returns the architecture from the model's config.json layer.

func GetTensorQuantization added in v0.15.5

func GetTensorQuantization(name string, shape []int32, quantize string) string

GetTensorQuantization returns the appropriate quantization type for a tensor. Returns "" if the tensor should not be quantized. This implements mixed-precision quantization:

  • Attention MLA weights (q_a, q_b, kv_a, kv_b): unquantized (most sensitive)
  • Output projection, gate/up weights: int4 (less sensitive)
  • Down projection weights: int8 (more sensitive; this would be Q6 in GGML, but MLX has no matching kernel)
  • Norms, embeddings, biases, routing gates: no quantization
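
The mixed-precision policy above amounts to a name-based decision table, sketched below. Everything in it is illustrative: pickQuantization is a hypothetical helper, the substrings ("q_a_proj", "o_proj", "gate_proj", and so on) are assumed from common Hugging Face tensor naming, and the sketch ignores the shape and quantize arguments that GetTensorQuantization also takes. Leaving unmatched tensors unquantized is likewise an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// containsAny reports whether s contains at least one of subs.
func containsAny(s string, subs ...string) bool {
	for _, sub := range subs {
		if strings.Contains(s, sub) {
			return true
		}
	}
	return false
}

// pickQuantization is a hypothetical sketch of the documented policy.
// All tensor-name patterns here are assumptions.
func pickQuantization(name string) string {
	switch {
	case !strings.HasSuffix(name, ".weight"):
		return "" // biases and other non-weight tensors
	case containsAny(name, "norm", "embed"):
		return "" // norms and embeddings
	case strings.HasSuffix(name, "mlp.gate.weight"):
		return "" // routing gate
	case containsAny(name, "q_a_proj", "q_b_proj", "kv_a_proj", "kv_b_proj"):
		return "" // MLA attention weights stay unquantized
	case strings.Contains(name, "down_proj"):
		return "int8" // more sensitive
	case containsAny(name, "o_proj", "gate_proj", "up_proj"):
		return "int4" // less sensitive
	default:
		return "" // assumption: anything unmatched is left alone
	}
}

func main() {
	for _, name := range []string{
		"model.layers.0.self_attn.kv_a_proj_with_mqa.weight",
		"model.layers.0.mlp.down_proj.weight",
		"model.layers.0.mlp.up_proj.weight",
		"model.layers.1.mlp.gate.weight",
	} {
		fmt.Printf("%s -> %q\n", name, pickQuantization(name))
	}
}
```

The exclusions are checked before the int8 and int4 cases so that a routing gate or norm is never caught by a broader projection pattern.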

func IsImageGenModel

func IsImageGenModel(modelName string) bool

IsImageGenModel checks if a model is an image generation model (has image capability).

func IsSafetensorsLLMModel

func IsSafetensorsLLMModel(modelName string) bool

IsSafetensorsLLMModel checks if a model is a safetensors LLM model (has completion capability, not image generation).

func IsSafetensorsModel

func IsSafetensorsModel(modelName string) bool

IsSafetensorsModel checks if a model was created with the experimental safetensors builder by checking the model format in the config.

func IsSafetensorsModelDir

func IsSafetensorsModelDir(dir string) bool

IsSafetensorsModelDir checks if the directory contains a standard safetensors model by looking for config.json and at least one .safetensors file.

func IsTensorModelDir

func IsTensorModelDir(dir string) bool

IsTensorModelDir checks if the directory contains a diffusers-style tensor model by looking for model_index.json, which is the standard diffusers pipeline config.
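
The two directory checks (IsSafetensorsModelDir and IsTensorModelDir) can be approximated with the standard library alone. The functions below are hypothetical re-implementations based only on the descriptions above (config.json plus at least one .safetensors file; model_index.json); the package's versions may apply further validation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// fileExists reports whether path names an existing non-directory entry.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && !info.IsDir()
}

// looksLikeSafetensorsDir mirrors the documented rule for
// IsSafetensorsModelDir: config.json plus at least one *.safetensors file.
func looksLikeSafetensorsDir(dir string) bool {
	if !fileExists(filepath.Join(dir, "config.json")) {
		return false
	}
	matches, err := filepath.Glob(filepath.Join(dir, "*.safetensors"))
	return err == nil && len(matches) > 0
}

// looksLikeDiffusersDir mirrors the documented rule for IsTensorModelDir:
// the presence of model_index.json.
func looksLikeDiffusersDir(dir string) bool {
	return fileExists(filepath.Join(dir, "model_index.json"))
}

// demo builds a throwaway directory and records the checks at each stage.
func demo() (configOnly, withWeights, diffusers bool, err error) {
	dir, err := os.MkdirTemp("", "model-*")
	if err != nil {
		return false, false, false, err
	}
	defer os.RemoveAll(dir)

	if err = os.WriteFile(filepath.Join(dir, "config.json"), []byte("{}"), 0o644); err != nil {
		return false, false, false, err
	}
	configOnly = looksLikeSafetensorsDir(dir) // config.json alone is not enough

	if err = os.WriteFile(filepath.Join(dir, "model.safetensors"), nil, 0o644); err != nil {
		return false, false, false, err
	}
	withWeights = looksLikeSafetensorsDir(dir)
	diffusers = looksLikeDiffusersDir(dir) // no model_index.json was written
	return configOnly, withWeights, diffusers, nil
}

func main() {
	fmt.Println(demo())
}
```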

func ShouldQuantize

func ShouldQuantize(name, component string) bool

ShouldQuantize reports whether a tensor should be quantized. For image generation models (component non-empty), it quantizes linear weights and skips VAE, embedding, and norm tensors. For LLM models (component empty), it quantizes linear weights and skips embeddings, norms, and small tensors.

func ShouldQuantizeTensor

func ShouldQuantizeTensor(name string, shape []int32, quantize string) bool

ShouldQuantizeTensor returns true if a tensor should be quantized based on name, shape, and quantize type. This is a more detailed check that also considers tensor dimensions. The quantize parameter specifies the quantization type (e.g., "int4", "nvfp4", "int8", "mxfp8").

Types

type LayerCreator

type LayerCreator func(r io.Reader, mediaType, name string) (LayerInfo, error)

LayerCreator is called to create a blob layer. name is the path-style name (e.g., "tokenizer/tokenizer.json").
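
A caller supplies its own LayerCreator. The sketch below is one possible in-memory implementation, useful mainly for tests. LayerInfo and LayerCreator are re-declared locally to match the documented declarations so the example is self-contained. The "sha256:" digest format, the map-based store, and the "application/json" media type are all assumptions for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// LayerInfo and LayerCreator are local copies of the documented declarations.
type LayerInfo struct {
	Digest    string
	Size      int64
	MediaType string
	Name      string
}

type LayerCreator func(r io.Reader, mediaType, name string) (LayerInfo, error)

// newMemoryLayerCreator returns a LayerCreator that keeps blobs in a map
// keyed by content digest (hypothetical helper; the digest format is assumed).
func newMemoryLayerCreator(blobs map[string][]byte) LayerCreator {
	return func(r io.Reader, mediaType, name string) (LayerInfo, error) {
		data, err := io.ReadAll(r)
		if err != nil {
			return LayerInfo{}, err
		}
		sum := sha256.Sum256(data)
		digest := "sha256:" + hex.EncodeToString(sum[:])
		blobs[digest] = data // identical content overwrites the same key
		return LayerInfo{Digest: digest, Size: int64(len(data)), MediaType: mediaType, Name: name}, nil
	}
}

// storeTwice stores the same content under two names and reports how many
// distinct blobs ended up in the store.
func storeTwice(content string) (a, b LayerInfo, blobCount int, err error) {
	blobs := map[string][]byte{}
	create := newMemoryLayerCreator(blobs)
	// "application/json" is a placeholder media type, not one the package defines.
	if a, err = create(strings.NewReader(content), "application/json", "tokenizer/tokenizer.json"); err != nil {
		return a, b, 0, err
	}
	if b, err = create(strings.NewReader(content), "application/json", "text_encoder/config.json"); err != nil {
		return a, b, 0, err
	}
	return a, b, len(blobs), nil
}

func main() {
	a, b, n, err := storeTwice("{}")
	if err != nil {
		panic(err)
	}
	fmt.Println(a.Digest == b.Digest, n)
}
```

Keying blobs by content digest is what makes per-tensor storage deduplicate: two layers with identical bytes share one blob even when their names differ.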

type LayerInfo

type LayerInfo struct {
	Digest    string
	Size      int64
	MediaType string
	Name      string // Path-style name: "component/tensor" or "path/to/config.json"
}

LayerInfo holds metadata for a created layer.

type Manifest

type Manifest struct {
	SchemaVersion int             `json:"schemaVersion"`
	MediaType     string          `json:"mediaType"`
	Config        ManifestLayer   `json:"config"`
	Layers        []ManifestLayer `json:"layers"`
}

Manifest represents the manifest JSON structure.

type ManifestLayer

type ManifestLayer struct {
	MediaType string `json:"mediaType"`
	Digest    string `json:"digest"`
	Size      int64  `json:"size"`
	Name      string `json:"name,omitempty"`
}

ManifestLayer represents a layer in the manifest.

type ManifestWriter

type ManifestWriter func(modelName string, config LayerInfo, layers []LayerInfo) error

ManifestWriter writes the manifest file.
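
To show how the manifest types fit together, here is a minimal sketch of a ManifestWriter that renders JSON into an in-memory map instead of a file. The types are local copies of the documented declarations. The schema version 2 and the manifest media type string are placeholder assumptions, and newManifestWriter and toManifestLayer are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented declarations.
type LayerInfo struct {
	Digest    string
	Size      int64
	MediaType string
	Name      string
}

type ManifestLayer struct {
	MediaType string `json:"mediaType"`
	Digest    string `json:"digest"`
	Size      int64  `json:"size"`
	Name      string `json:"name,omitempty"`
}

type Manifest struct {
	SchemaVersion int             `json:"schemaVersion"`
	MediaType     string          `json:"mediaType"`
	Config        ManifestLayer   `json:"config"`
	Layers        []ManifestLayer `json:"layers"`
}

type ManifestWriter func(modelName string, config LayerInfo, layers []LayerInfo) error

// toManifestLayer copies the fields that LayerInfo and ManifestLayer share.
func toManifestLayer(l LayerInfo) ManifestLayer {
	return ManifestLayer{MediaType: l.MediaType, Digest: l.Digest, Size: l.Size, Name: l.Name}
}

// newManifestWriter returns a ManifestWriter that stores the encoded
// manifest in out, keyed by model name (hypothetical helper).
func newManifestWriter(out map[string][]byte) ManifestWriter {
	return func(modelName string, config LayerInfo, layers []LayerInfo) error {
		m := Manifest{
			SchemaVersion: 2,                                       // assumption
			MediaType:     "application/vnd.example.manifest+json", // placeholder
			Config:        toManifestLayer(config),
			Layers:        make([]ManifestLayer, 0, len(layers)), // encodes as [] when empty
		}
		for _, l := range layers {
			m.Layers = append(m.Layers, toManifestLayer(l))
		}
		data, err := json.Marshal(m)
		if err != nil {
			return err
		}
		out[modelName] = data
		return nil
	}
}

func main() {
	out := map[string][]byte{}
	write := newManifestWriter(out)
	err := write("demo",
		LayerInfo{Digest: "sha256:aa", Size: 3, MediaType: "c"},
		[]LayerInfo{{Digest: "sha256:bb", Size: 4, MediaType: "t", Name: "n"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out["demo"]))
}
```

Because Name carries the omitempty tag, the config entry above is encoded without a name field.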

type ModelConfig

type ModelConfig struct {
	ModelFormat  string   `json:"model_format"`
	Capabilities []string `json:"capabilities"`
}

ModelConfig represents the config blob stored with a model.
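
Checks such as IsImageGenModel and IsSafetensorsLLMModel are described in terms of this config blob. The sketch below decodes a ModelConfig and tests for a capability. The struct is a local copy of the documented one. The literal values "safetensors", "image", and "completion" are assumptions inferred from the descriptions above, and parseConfig and hasCapability are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ModelConfig is a local copy of the documented declaration.
type ModelConfig struct {
	ModelFormat  string   `json:"model_format"`
	Capabilities []string `json:"capabilities"`
}

// parseConfig decodes a config blob (hypothetical helper).
func parseConfig(data []byte) (ModelConfig, error) {
	var cfg ModelConfig
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

// hasCapability reports whether cfg lists the given capability.
func hasCapability(cfg ModelConfig, want string) bool {
	for _, c := range cfg.Capabilities {
		if c == want {
			return true
		}
	}
	return false
}

func main() {
	// The field values below are illustrative assumptions.
	blob := []byte(`{"model_format":"safetensors","capabilities":["image"]}`)
	cfg, err := parseConfig(blob)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.ModelFormat, hasCapability(cfg, "image"), hasCapability(cfg, "completion"))
}
```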

type PackedTensorInput added in v0.16.0

type PackedTensorInput struct {
	Name     string
	Dtype    string
	Shape    []int32
	Quantize string    // per-tensor quantization type (may differ within group)
	Reader   io.Reader // safetensors-wrapped tensor data
}

PackedTensorInput holds metadata for a tensor that will be packed into a multi-tensor blob.

type PackedTensorLayerCreator added in v0.16.0

type PackedTensorLayerCreator func(groupName string, tensors []PackedTensorInput) (LayerInfo, error)

PackedTensorLayerCreator creates a single blob layer containing multiple packed tensors. groupName is the group prefix (e.g., "model.layers.1.mlp.experts").

type QuantizingTensorLayerCreator

type QuantizingTensorLayerCreator func(r io.Reader, name, dtype string, shape []int32, quantize string) ([]LayerInfo, error)

QuantizingTensorLayerCreator creates tensor layers with optional quantization. When quantize is non-empty (e.g., "int8"), it returns multiple layers (weight + scales + biases).

type TensorLayerCreator

type TensorLayerCreator func(r io.Reader, name, dtype string, shape []int32) (LayerInfo, error)

TensorLayerCreator creates a tensor blob layer with metadata. name is the path-style name including the component (e.g., "text_encoder/model.embed_tokens.weight").

Directories

Path Synopsis
Package client provides client-side model creation for safetensors-based models.
