gguf

package
v0.7.16 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 10, 2026 License: Apache-2.0 Imports: 5 Imported by: 0

Documentation

Overview

Package gguf provides GGUF file format parsing and model loading.

GGUF (GGML Universal Format) is the file format that llama.cpp uses to store quantized LLMs. This package lets Born load and use the 10,000+ pre-quantized models available on Hugging Face.

Specification: https://github.com/ggerganov/ggml/blob/master/docs/gguf.md

Constants

const (
	MagicGGUFLE uint32 = 0x46554747 // "GGUF" little-endian.
	MagicGGUFBE uint32 = 0x47475546 // "GGUF" big-endian (reversed).
)

Magic bytes for GGUF format.
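The two constants are the same four ASCII bytes, "GGUF", decoded with the two byte orders. The following is a minimal sketch using only the standard library; `decodeLE` and `decodeBE` are hypothetical helpers, not part of this package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Values copied from the constants above.
const (
	magicGGUFLE uint32 = 0x46554747
	magicGGUFBE uint32 = 0x47475546
)

// decodeLE and decodeBE are hypothetical helpers that interpret the first
// four bytes of b with the given byte order. b must hold at least 4 bytes.
func decodeLE(b []byte) uint32 { return binary.LittleEndian.Uint32(b[:4]) }
func decodeBE(b []byte) uint32 { return binary.BigEndian.Uint32(b[:4]) }

func main() {
	raw := []byte("GGUF") // 0x47 0x47 0x55 0x46
	fmt.Printf("LE: %#x match=%v\n", decodeLE(raw), decodeLE(raw) == magicGGUFLE)
	fmt.Printf("BE: %#x match=%v\n", decodeBE(raw), decodeBE(raw) == magicGGUFBE)
}
```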

const (
	Version1 uint32 = 1
	Version2 uint32 = 2
	Version3 uint32 = 3 // Current version.
)

Version constants.

const DefaultAlignment = 32

DefaultAlignment is the default alignment for tensor data.
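In the GGUF specification, tensor data starts at an offset that is a multiple of the alignment, and a file may override the default through the `general.alignment` metadata key. The sketch below shows the rounding arithmetic only; `alignUp` is a hypothetical helper, and how this package applies the alignment internally is not shown in these docs.

```go
package main

import "fmt"

// defaultAlignment mirrors DefaultAlignment above.
const defaultAlignment = 32

// alignUp rounds offset up to the next multiple of alignment.
// It assumes offset >= 0 and alignment > 0.
func alignUp(offset, alignment int64) int64 {
	return (offset + alignment - 1) / alignment * alignment
}

func main() {
	for _, off := range []int64{0, 1, 32, 33, 1000} {
		fmt.Println(off, "->", alignUp(off, defaultAlignment))
	}
}
```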

Functions

func Dequantize

func Dequantize(data []byte, dtype GGMLType, numElements int) ([]float32, error)

Dequantize converts quantized data to float32. Supported types: F32, F16, Q4_0, Q4_1, Q5_0, Q5_1, Q8_0, Q8_1, Q4_K, Q5_K, Q6_K.
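As an illustration of what dequantization computes, here is a minimal sketch of the arithmetic for one Q8_0 block, following GGML's layout: each block holds one scale and 32 signed 8-bit values, and every element is `scale * quant`. The helper `dequantizeQ8_0` is hypothetical, and it assumes the block's half-precision scale has already been converted to float32. It is not this package's implementation.

```go
package main

import "fmt"

// q8_0BlockSize is the number of elements per Q8_0 block in GGML.
const q8_0BlockSize = 32

// dequantizeQ8_0 is a hypothetical helper. scale is the block's scale,
// already converted from half precision to float32. quants holds the
// block's 32 signed 8-bit values as raw bytes.
func dequantizeQ8_0(scale float32, quants []byte) ([]float32, error) {
	if len(quants) != q8_0BlockSize {
		return nil, fmt.Errorf("want %d quants, got %d", q8_0BlockSize, len(quants))
	}
	out := make([]float32, q8_0BlockSize)
	for i, q := range quants {
		out[i] = scale * float32(int8(q)) // reinterpret the byte as signed
	}
	return out, nil
}

func main() {
	quants := make([]byte, q8_0BlockSize)
	quants[0] = 4    // +4
	quants[1] = 0xFE // -2 when read as int8
	out, err := dequantizeQ8_0(0.5, quants)
	if err != nil {
		panic(err)
	}
	fmt.Println(out[0], out[1], out[2]) // 2 -1 0
}
```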

func DequantizeBlock

func DequantizeBlock(data []byte, dtype GGMLType) ([]float32, error)

DequantizeBlock dequantizes a single block of data. It is used for streaming dequantization of large tensors.
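Block-at-a-time processing keeps memory use bounded, because only one block's output has to be live at once. The sketch below shows one way to walk raw tensor bytes in fixed-size blocks; `forEachBlock` is a hypothetical helper, and the callback is where a per-block call such as DequantizeBlock would go.

```go
package main

import (
	"errors"
	"fmt"
)

// forEachBlock walks data in fixed-size blocks and passes each one to fn.
// typeSize is the byte size of one block (34 for Q8_0 in GGML).
func forEachBlock(data []byte, typeSize int, fn func(index int, block []byte) error) error {
	if typeSize <= 0 || len(data)%typeSize != 0 {
		return errors.New("data length is not a whole number of blocks")
	}
	for i := 0; i*typeSize < len(data); i++ {
		if err := fn(i, data[i*typeSize:(i+1)*typeSize]); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	raw := make([]byte, 3*34) // three Q8_0-sized blocks of zero bytes
	err := forEachBlock(raw, 34, func(i int, block []byte) error {
		fmt.Printf("block %d: %d bytes\n", i, len(block))
		return nil
	})
	if err != nil {
		panic(err)
	}
}
```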

func Float16ToFloat32

func Float16ToFloat32(h uint16) float32

Float16ToFloat32 converts a half-precision (IEEE 754) value to float32.
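A minimal sketch of such a conversion, written from the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits). The function `float16ToFloat32` here is an illustration and may differ from the package's own implementation.

```go
package main

import (
	"fmt"
	"math"
)

// float16ToFloat32 widens an IEEE 754 binary16 bit pattern to float32.
func float16ToFloat32(h uint16) float32 {
	sign := uint32(h>>15) << 31
	exp := uint32(h>>10) & 0x1F
	frac := uint32(h) & 0x3FF

	switch exp {
	case 0:
		if frac == 0 {
			return math.Float32frombits(sign) // signed zero
		}
		// Subnormal: value = frac * 2^-24, exactly representable in float32.
		v := float32(math.Ldexp(float64(frac), -24))
		if sign != 0 {
			v = -v
		}
		return v
	case 0x1F:
		// Infinity or NaN: set all exponent bits and keep the payload.
		return math.Float32frombits(sign | 0x7F800000 | frac<<13)
	default:
		// Normal: rebias the exponent from 15 to 127 (add 112).
		return math.Float32frombits(sign | (exp+112)<<23 | frac<<13)
	}
}

func main() {
	for _, h := range []uint16{0x3C00, 0xC000, 0x3800, 0x7BFF} {
		fmt.Printf("%#04x -> %v\n", h, float16ToFloat32(h))
	}
}
```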

func LoadAllTensors

func LoadAllTensors(file *File) (map[string][]byte, error)

LoadAllTensors reads all tensors from the file. Returns a map of tensor name to raw data.

func LoadTensorData

func LoadTensorData(file *File, tensorName string) ([]byte, error)

LoadTensorData reads the raw data of a tensor from the file. Returns the tensor data as a byte slice.

Types

type ConvertedTensor

type ConvertedTensor struct {
	Name         string
	Shape        []int
	Data         []float32
	OriginalType GGMLType // Original GGUF type (for debugging).
}

ConvertedTensor holds converted tensor data.

type File

type File struct {
	Header     Header
	Metadata   map[string]interface{}
	TensorInfo []TensorInfo
	Alignment  int

	// Calculated offsets.
	TensorDataOffset int64

	// Source info.
	FilePath string
	FileSize int64
}

File represents a parsed GGUF file.

func Parse

func Parse(r io.ReadSeeker) (*File, error)

Parse reads and parses a GGUF file from the given reader.

func ParseFile

func ParseFile(path string) (*File, error)

ParseFile parses a GGUF file from disk.

func (*File) Architecture

func (f *File) Architecture() string

Architecture returns the model architecture (e.g., "llama", "gpt2").

func (*File) BlockCount

func (f *File) BlockCount() int

BlockCount returns the number of transformer blocks.

func (*File) ContextLength

func (f *File) ContextLength() int

ContextLength returns the maximum context length.

func (*File) EmbeddingLength

func (f *File) EmbeddingLength() int

EmbeddingLength returns the embedding dimension.

func (*File) FeedForwardLength

func (f *File) FeedForwardLength() int

FeedForwardLength returns the FFN intermediate size.

func (*File) GetTensor

func (f *File) GetTensor(name string) *TensorInfo

GetTensor finds a tensor by name.

func (*File) HeadCount

func (f *File) HeadCount() int

HeadCount returns the number of attention heads.

func (*File) HeadCountKV

func (f *File) HeadCountKV() int

HeadCountKV returns the number of KV heads (for GQA).

func (*File) Name

func (f *File) Name() string

Name returns the model name.

func (*File) VocabSize

func (f *File) VocabSize() int

VocabSize returns the vocabulary size.

type GGMLType

type GGMLType uint32

GGMLType represents the data type of tensor elements.

const (
	GGMLTypeF32  GGMLType = 0
	GGMLTypeF16  GGMLType = 1
	GGMLTypeQ4_0 GGMLType = 2
	GGMLTypeQ4_1 GGMLType = 3
	// Types 4, 5 are deprecated (Q4_2, Q4_3).
	GGMLTypeQ5_0    GGMLType = 6
	GGMLTypeQ5_1    GGMLType = 7
	GGMLTypeQ8_0    GGMLType = 8
	GGMLTypeQ8_1    GGMLType = 9
	GGMLTypeQ2_K    GGMLType = 10
	GGMLTypeQ3_K    GGMLType = 11
	GGMLTypeQ4_K    GGMLType = 12
	GGMLTypeQ5_K    GGMLType = 13
	GGMLTypeQ6_K    GGMLType = 14
	GGMLTypeQ8_K    GGMLType = 15
	GGMLTypeIQ2_XXS GGMLType = 16
	GGMLTypeIQ2_XS  GGMLType = 17
	GGMLTypeIQ3_XXS GGMLType = 18
	GGMLTypeIQ1_S   GGMLType = 19
	GGMLTypeIQ4_NL  GGMLType = 20
	GGMLTypeIQ3_S   GGMLType = 21
	GGMLTypeIQ2_S   GGMLType = 22
	GGMLTypeIQ4_XS  GGMLType = 23
	GGMLTypeI8      GGMLType = 24
	GGMLTypeI16     GGMLType = 25
	GGMLTypeI32     GGMLType = 26
	GGMLTypeI64     GGMLType = 27
	GGMLTypeF64     GGMLType = 28
	GGMLTypeBF16    GGMLType = 29
)

GGML tensor types (quantization formats). Note: Names use underscores to match GGML specification exactly (e.g., Q4_K, not Q4K).

func (GGMLType) IsQuantized

func (t GGMLType) IsQuantized() bool

IsQuantized returns true if the type is a quantized format.

func (GGMLType) RowSize

func (t GGMLType) RowSize(elements int) int

RowSize calculates the size in bytes for a row of elements.
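For block-quantized types the row size follows from the block size and bytes per block: `elements / BlockSize * TypeSize`. The sketch below uses a local `typeTrait` struct and trait values taken from GGML's block layouts; `rowSize` is a hypothetical helper and is assumed, not confirmed, to match what RowSize does.

```go
package main

import "fmt"

// typeTrait mirrors the BlockSize and TypeSize fields of TypeTrait.
type typeTrait struct {
	BlockSize int // elements per block
	TypeSize  int // bytes per block
}

// Trait values follow GGML's block layouts.
var (
	traitF32  = typeTrait{BlockSize: 1, TypeSize: 4}
	traitQ4_0 = typeTrait{BlockSize: 32, TypeSize: 18} // 2-byte scale + 16 bytes of 4-bit quants
	traitQ8_0 = typeTrait{BlockSize: 32, TypeSize: 34} // 2-byte scale + 32 int8 quants
)

// rowSize returns the bytes needed to store elements values of the given
// type. elements must be a whole number of blocks.
func rowSize(t typeTrait, elements int) (int, error) {
	if t.BlockSize <= 0 || elements < 0 || elements%t.BlockSize != 0 {
		return 0, fmt.Errorf("%d elements is not a whole number of %d-element blocks", elements, t.BlockSize)
	}
	return elements / t.BlockSize * t.TypeSize, nil
}

func main() {
	for name, t := range map[string]typeTrait{"F32": traitF32, "Q4_0": traitQ4_0, "Q8_0": traitQ8_0} {
		n, err := rowSize(t, 4096)
		fmt.Println(name, n, err)
	}
}
```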

func (GGMLType) String

func (t GGMLType) String() string

String returns the string representation of the GGML type.

func (GGMLType) Trait

func (t GGMLType) Trait() TypeTrait

Trait returns the type trait for this GGML type.

type Header

type Header struct {
	Magic           uint32
	Version         uint32
	TensorCount     uint64
	MetadataKVCount uint64
}

Header represents the GGUF file header.
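The four fields occupy 24 bytes at the start of the file. A minimal sketch of decoding them with `encoding/binary`, assuming a little-endian file of version 2 or later (version 1 used 32-bit counts). `parseHeader` and `sampleHeader` are hypothetical helpers, not this package's API.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// header mirrors the Header struct above.
type header struct {
	Magic           uint32
	Version         uint32
	TensorCount     uint64
	MetadataKVCount uint64
}

// parseHeader decodes the first 24 bytes of b as a little-endian header.
func parseHeader(b []byte) (header, error) {
	var h header
	if err := binary.Read(bytes.NewReader(b), binary.LittleEndian, &h); err != nil {
		return header{}, fmt.Errorf("read header: %w", err)
	}
	if h.Magic != 0x46554747 { // "GGUF" decoded as little-endian
		return header{}, fmt.Errorf("bad magic %#x", h.Magic)
	}
	return h, nil
}

// sampleHeader builds an in-memory header with made-up counts.
func sampleHeader() []byte {
	buf := append(make([]byte, 0, 24), "GGUF"...)
	buf = binary.LittleEndian.AppendUint32(buf, 3)   // version
	buf = binary.LittleEndian.AppendUint64(buf, 291) // tensor count
	buf = binary.LittleEndian.AppendUint64(buf, 24)  // metadata KV count
	return buf
}

func main() {
	h, err := parseHeader(sampleHeader())
	if err != nil {
		panic(err)
	}
	fmt.Printf("version=%d tensors=%d kv=%d\n", h.Version, h.TensorCount, h.MetadataKVCount)
}
```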

type MetadataKV

type MetadataKV struct {
	Key       string
	ValueType ValueType
	Value     interface{}
}

MetadataKV represents a key-value pair in the metadata.

type TensorConverter

type TensorConverter struct {
	// contains filtered or unexported fields
}

TensorConverter converts GGUF tensors to Born tensor format.

func NewTensorConverter

func NewTensorConverter(file *File) (*TensorConverter, error)

NewTensorConverter creates a converter for the given GGUF file.

func (*TensorConverter) Close

func (c *TensorConverter) Close() error

Close releases resources.

func (*TensorConverter) Convert

func (c *TensorConverter) Convert(name string) ([]float32, []int, error)

Convert loads and dequantizes a single tensor. Returns raw float32 data and shape.

GGUF dimensions are stored in reverse order compared to Born's convention: GGUF [dim0, dim1, dim2] → Born [dim2, dim1, dim0].
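The reordering described above amounts to reversing the dimension slice. A minimal sketch, with `toBornShape` as a hypothetical helper name and the dimension values chosen only for illustration:

```go
package main

import "fmt"

// toBornShape reverses GGUF dimensions: [dim0, dim1, dim2] -> [dim2, dim1, dim0].
func toBornShape(dims []uint64) []int {
	shape := make([]int, len(dims))
	for i, d := range dims {
		shape[len(dims)-1-i] = int(d)
	}
	return shape
}

func main() {
	fmt.Println(toBornShape([]uint64{4096, 32000})) // [32000 4096]
}
```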

func (*TensorConverter) ConvertAll

func (c *TensorConverter) ConvertAll() (map[string]ConvertedTensor, error)

ConvertAll loads and dequantizes all tensors.

type TensorInfo

type TensorInfo struct {
	Name       string
	NDims      uint32
	Dimensions []uint64
	Type       GGMLType
	Offset     uint64 // Offset from start of tensor data section.
}

TensorInfo contains metadata about a tensor in the file.

func (*TensorInfo) NumElements

func (t *TensorInfo) NumElements() uint64

NumElements returns the total number of elements in the tensor.
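The element count is the product of the dimensions. The sketch below adds an overflow check, which can be useful when dimensions come from an untrusted file; whether NumElements itself performs such a check is not stated in these docs. `numElements` is a hypothetical helper, and it treats an empty dimension list as a single element.

```go
package main

import (
	"errors"
	"fmt"
	"math/bits"
)

var errOverflow = errors.New("element count overflows uint64")

// numElements multiplies all dimensions, reporting overflow instead of wrapping.
func numElements(dims []uint64) (uint64, error) {
	n := uint64(1)
	for _, d := range dims {
		hi, lo := bits.Mul64(n, d)
		if hi != 0 {
			return 0, errOverflow
		}
		n = lo
	}
	return n, nil
}

func main() {
	n, err := numElements([]uint64{4096, 32000})
	fmt.Println(n, err) // 131072000 <nil>
}
```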

func (*TensorInfo) Size

func (t *TensorInfo) Size() uint64

Size returns the size in bytes of the tensor data.

type TensorReader

type TensorReader struct {
	// contains filtered or unexported fields
}

TensorReader provides streaming access to tensor data for large models.

func NewTensorReader

func NewTensorReader(file *File) (*TensorReader, error)

NewTensorReader creates a new tensor reader for the given file. The caller is responsible for closing the reader when done.

func (*TensorReader) Close

func (r *TensorReader) Close() error

Close closes the underlying file.

func (*TensorReader) ReadTensor

func (r *TensorReader) ReadTensor(name string) ([]byte, error)

ReadTensor reads the raw data of a tensor by name.

func (*TensorReader) ReadTensorInto

func (r *TensorReader) ReadTensorInto(name string, dst []byte) error

ReadTensorInto reads the raw data of a tensor into the provided buffer. The buffer must be large enough to hold the tensor data.
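Reading into a caller-supplied buffer lets one allocation be reused across tensors. The sketch below shows the size check and a bounded read using only the standard library; `readInto` and `demoSource` are hypothetical helpers over an `io.ReaderAt`, and the package's actual implementation is not shown in these docs.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// readInto copies size bytes starting at offset from src into dst.
// It fails if dst is too small or the source ends early.
func readInto(src io.ReaderAt, offset int64, size int, dst []byte) error {
	if size < 0 || len(dst) < size {
		return fmt.Errorf("buffer too small: need %d bytes, have %d", size, len(dst))
	}
	section := io.NewSectionReader(src, offset, int64(size))
	_, err := io.ReadFull(section, dst[:size])
	return err
}

// demoSource stands in for an open file.
func demoSource() io.ReaderAt { return bytes.NewReader([]byte("0123456789")) }

func main() {
	buf := make([]byte, 8) // one buffer, reused for every read
	if err := readInto(demoSource(), 4, 3, buf); err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", buf[:3]) // 456
}
```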

type TypeTrait

type TypeTrait struct {
	BlockSize int // Number of elements per block.
	TypeSize  int // Size in bytes per block.
	Quantized bool
}

TypeTrait contains metadata about a GGML type.

type ValueType

type ValueType uint32

ValueType represents the type of a metadata value.

const (
	ValueTypeUint8   ValueType = 0
	ValueTypeInt8    ValueType = 1
	ValueTypeUint16  ValueType = 2
	ValueTypeInt16   ValueType = 3
	ValueTypeUint32  ValueType = 4
	ValueTypeInt32   ValueType = 5
	ValueTypeFloat32 ValueType = 6
	ValueTypeBool    ValueType = 7
	ValueTypeString  ValueType = 8
	ValueTypeArray   ValueType = 9
	ValueTypeUint64  ValueType = 10
	ValueTypeInt64   ValueType = 11
	ValueTypeFloat64 ValueType = 12
)

Metadata value types as defined in GGUF specification.
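In the GGUF specification each metadata entry is a key string, a 32-bit value type, and then the value; strings are a 64-bit length followed by that many bytes. The sketch below decodes one little-endian entry and handles only the uint32 and string value types. All helper names and the `maxStringLen` limit are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

const (
	valueTypeUint32 uint32 = 4 // ValueTypeUint32
	valueTypeString uint32 = 8 // ValueTypeString
	maxStringLen    uint64 = 1 << 20 // arbitrary sanity limit for this sketch
)

// readString reads a uint64 length followed by that many bytes.
func readString(r io.Reader) (string, error) {
	var n uint64
	if err := binary.Read(r, binary.LittleEndian, &n); err != nil {
		return "", err
	}
	if n > maxStringLen {
		return "", fmt.Errorf("string length %d exceeds limit", n)
	}
	buf := make([]byte, n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

// readKV reads one key, its value type, and the value.
func readKV(r io.Reader) (string, interface{}, error) {
	key, err := readString(r)
	if err != nil {
		return "", nil, err
	}
	var vt uint32
	if err := binary.Read(r, binary.LittleEndian, &vt); err != nil {
		return "", nil, err
	}
	switch vt {
	case valueTypeUint32:
		var v uint32
		if err := binary.Read(r, binary.LittleEndian, &v); err != nil {
			return "", nil, err
		}
		return key, v, nil
	case valueTypeString:
		v, err := readString(r)
		if err != nil {
			return "", nil, err
		}
		return key, v, nil
	default:
		return "", nil, fmt.Errorf("value type %d not handled in this sketch", vt)
	}
}

// appendString encodes s the way readString expects it.
func appendString(buf []byte, s string) []byte {
	buf = binary.LittleEndian.AppendUint64(buf, uint64(len(s)))
	return append(buf, s...)
}

// sampleKV encodes general.architecture = "llama".
func sampleKV() []byte {
	buf := appendString(nil, "general.architecture")
	buf = binary.LittleEndian.AppendUint32(buf, valueTypeString)
	return appendString(buf, "llama")
}

// decodeKV is a convenience wrapper for in-memory data.
func decodeKV(b []byte) (string, interface{}, error) { return readKV(bytes.NewReader(b)) }

func main() {
	key, value, err := decodeKV(sampleKV())
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s = %v\n", key, value) // general.architecture = llama
}
```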

func (ValueType) String

func (t ValueType) String() string

String returns the string representation of the value type.
