Documentation ¶
Overview ¶
Package gguf provides GGUF file format parsing and model loading.
GGUF (GGML Universal Format) is the file format that llama.cpp uses to store quantized large language models. This package lets Born load and use the 10,000+ pre-quantized models available on Hugging Face.
Specification: https://github.com/ggerganov/ggml/blob/master/docs/gguf.md
Index ¶
- Constants
- func Dequantize(data []byte, dtype GGMLType, numElements int) ([]float32, error)
- func DequantizeBlock(data []byte, dtype GGMLType) ([]float32, error)
- func Float16ToFloat32(h uint16) float32
- func LoadAllTensors(file *File) (map[string][]byte, error)
- func LoadTensorData(file *File, tensorName string) ([]byte, error)
- type ConvertedTensor
- type File
- func (f *File) Architecture() string
- func (f *File) BlockCount() int
- func (f *File) ContextLength() int
- func (f *File) EmbeddingLength() int
- func (f *File) FeedForwardLength() int
- func (f *File) GetTensor(name string) *TensorInfo
- func (f *File) HeadCount() int
- func (f *File) HeadCountKV() int
- func (f *File) Name() string
- func (f *File) VocabSize() int
- type GGMLType
- type Header
- type MetadataKV
- type TensorConverter
- type TensorInfo
- type TensorReader
- type TypeTrait
- type ValueType
Constants ¶
const (
	MagicGGUFLE uint32 = 0x46554747 // "GGUF" little-endian.
	MagicGGUFBE uint32 = 0x47475546 // "GGUF" big-endian (reversed).
)
Magic bytes for GGUF format.
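The two constants are the same four ASCII bytes, "GGUF", decoded under each byte order. The following is a minimal sketch of that relationship using only the standard library; hasGGUFMagic is a hypothetical helper written for this illustration and is not part of the package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Local copies of the documented constants so the sketch is self-contained.
const (
	magicGGUFLE uint32 = 0x46554747
	magicGGUFBE uint32 = 0x47475546
)

// hasGGUFMagic is a hypothetical helper (an assumption, not package API).
// It reports whether the first four bytes spell "GGUF" when decoded as a
// little-endian uint32.
func hasGGUFMagic(prefix []byte) bool {
	return len(prefix) >= 4 && binary.LittleEndian.Uint32(prefix[:4]) == magicGGUFLE
}

func main() {
	raw := []byte("GGUF")
	fmt.Println(binary.LittleEndian.Uint32(raw) == magicGGUFLE)
	fmt.Println(binary.BigEndian.Uint32(raw) == magicGGUFBE)
	fmt.Println(hasGGUFMagic(raw), hasGGUFMagic([]byte("GGML")))
}
```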
const (
	Version1 uint32 = 1
	Version2 uint32 = 2
	Version3 uint32 = 3 // Current version.
)
Version constants.
const DefaultAlignment = 32
DefaultAlignment is the default alignment for tensor data.
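Alignment here means that the tensor data section starts at an offset that is a multiple of the alignment value. A minimal sketch of rounding an offset up to that boundary follows; alignOffset is a hypothetical name chosen for this example, and the package may compute this differently.

```go
package main

import "fmt"

const defaultAlignment = 32 // mirrors DefaultAlignment

// alignOffset rounds offset up to the next multiple of alignment.
// Hypothetical helper for illustration only.
func alignOffset(offset, alignment int64) int64 {
	if alignment <= 0 {
		return offset // nothing sensible to align to
	}
	if rem := offset % alignment; rem != 0 {
		return offset + alignment - rem
	}
	return offset
}

func main() {
	for _, off := range []int64{0, 1, 32, 33, 100} {
		fmt.Println(off, "->", alignOffset(off, defaultAlignment))
	}
}
```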
Variables ¶
This section is empty.
Functions ¶
func Dequantize ¶
Dequantize converts quantized data to float32. Supported types: F32, F16, Q4_0, Q4_1, Q5_0, Q5_1, Q8_0, Q8_1, Q4_K, Q5_K, Q6_K.
func DequantizeBlock ¶
DequantizeBlock dequantizes a single block of data. It is used for streaming dequantization of large tensors.
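To make the idea of a block concrete, here is a hedged sketch of dequantizing one Q8_0 block by hand. It assumes the GGML Q8_0 layout (a little-endian float16 scale followed by 32 signed 8-bit values, 34 bytes in total). The function names are hypothetical, and the package's own implementation may differ.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

const (
	q8BlockElems = 32               // elements per Q8_0 block
	q8BlockBytes = 2 + q8BlockElems // float16 scale + 32 int8 values
)

// halfToFloat32 converts IEEE 754 binary16 bits to float32 (sketch).
func halfToFloat32(h uint16) float32 {
	sign := uint32(h>>15) << 31
	exp := uint32(h>>10) & 0x1f
	frac := uint32(h) & 0x3ff
	switch {
	case exp == 0: // zero or subnormal: frac * 2^-24
		v := float32(math.Ldexp(float64(frac), -24))
		if sign != 0 {
			v = -v
		}
		return v
	case exp == 0x1f: // infinity or NaN
		return math.Float32frombits(sign | 0x7f800000 | frac<<13)
	}
	return math.Float32frombits(sign | (exp+112)<<23 | frac<<13)
}

// dequantizeQ8_0Block is a hypothetical stand-in for a single-block
// dequantizer: each output is scale * int8 value.
func dequantizeQ8_0Block(block []byte) ([]float32, error) {
	if len(block) != q8BlockBytes {
		return nil, fmt.Errorf("Q8_0 block must be %d bytes, got %d", q8BlockBytes, len(block))
	}
	scale := halfToFloat32(binary.LittleEndian.Uint16(block[:2]))
	out := make([]float32, q8BlockElems)
	for i := range out {
		out[i] = scale * float32(int8(block[2+i]))
	}
	return out, nil
}

func main() {
	block := make([]byte, q8BlockBytes)
	block[0], block[1] = 0x00, 0x38 // float16 0.5, little-endian
	block[2], block[3] = 2, 0xFE    // quantized values 2 and -2
	out, err := dequantizeQ8_0Block(block)
	fmt.Println(out[:4], err)
}
```

A streaming caller would apply such a function to one 34-byte slice at a time instead of materializing the whole tensor.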
func Float16ToFloat32 ¶
Float16ToFloat32 converts a half-precision (IEEE 754) value to float32.
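The conversion can be sketched with bit manipulation alone. The version below is an illustrative reimplementation under the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits), not the package's source; halfToFloat32 is a hypothetical name.

```go
package main

import (
	"fmt"
	"math"
)

// halfToFloat32 converts IEEE 754 binary16 bits to a float32.
func halfToFloat32(h uint16) float32 {
	sign := uint32(h>>15) << 31
	exp := uint32(h>>10) & 0x1f
	frac := uint32(h) & 0x3ff

	switch {
	case exp == 0:
		// Zero or subnormal: value is frac * 2^-24, which float32
		// represents exactly.
		v := float32(math.Ldexp(float64(frac), -24))
		if sign != 0 {
			v = -v
		}
		return v
	case exp == 0x1f:
		// Infinity (frac == 0) or NaN (frac != 0).
		return math.Float32frombits(sign | 0x7f800000 | frac<<13)
	}
	// Normal number: rebias the exponent (127 - 15 = 112) and widen
	// the fraction from 10 to 23 bits.
	return math.Float32frombits(sign | (exp+112)<<23 | frac<<13)
}

func main() {
	for _, h := range []uint16{0x3C00, 0xC000, 0x3800, 0x7BFF, 0x0001} {
		fmt.Printf("0x%04X -> %g\n", h, halfToFloat32(h))
	}
}
```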
func LoadAllTensors ¶
LoadAllTensors reads all tensors from the file. Returns a map of tensor name to raw data.
Types ¶
type ConvertedTensor ¶
type ConvertedTensor struct {
	Name         string
	Shape        []int
	Data         []float32
	OriginalType GGMLType // Original GGUF type (for debugging).
}
ConvertedTensor holds converted tensor data.
type File ¶
type File struct {
	Header     Header
	Metadata   map[string]interface{}
	TensorInfo []TensorInfo
	Alignment  int

	// Calculated offsets.
	TensorDataOffset int64

	// Source info.
	FilePath string
	FileSize int64
}
File represents a parsed GGUF file.
func Parse ¶
func Parse(r io.ReadSeeker) (*File, error)
Parse reads and parses a GGUF file from the given reader.
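As a rough picture of the first step of parsing, the sketch below reads only the fixed-size header. It assumes the little-endian version 2 and 3 layout from the GGUF specification (uint32 magic, uint32 version, uint64 tensor count, uint64 metadata key-value count). The header struct and function names are hypothetical and may not match the package's Header type.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

const magicGGUFLE uint32 = 0x46554747

// header is a hypothetical mirror of the fixed GGUF header fields.
type header struct {
	Magic           uint32
	Version         uint32
	TensorCount     uint64
	MetadataKVCount uint64
}

// readHeader decodes the 24-byte header and validates magic and version.
func readHeader(r io.Reader) (header, error) {
	var h header
	if err := binary.Read(r, binary.LittleEndian, &h); err != nil {
		return header{}, fmt.Errorf("read header: %w", err)
	}
	if h.Magic != magicGGUFLE {
		return header{}, fmt.Errorf("bad magic 0x%08X", h.Magic)
	}
	if h.Version != 2 && h.Version != 3 {
		return header{}, fmt.Errorf("version %d not handled in this sketch", h.Version)
	}
	return h, nil
}

// headerFromBytes is a convenience wrapper for in-memory data.
func headerFromBytes(b []byte) (header, error) {
	return readHeader(bytes.NewReader(b))
}

func main() {
	raw := []byte{
		'G', 'G', 'U', 'F', // magic
		3, 0, 0, 0, // version 3
		2, 0, 0, 0, 0, 0, 0, 0, // tensor count
		5, 0, 0, 0, 0, 0, 0, 0, // metadata key-value count
	}
	h, err := headerFromBytes(raw)
	fmt.Printf("%+v %v\n", h, err)
}
```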
func (*File) Architecture ¶
Architecture returns the model architecture (e.g., "llama", "gpt2").
func (*File) BlockCount ¶
BlockCount returns the number of transformer blocks.
func (*File) ContextLength ¶
ContextLength returns the maximum context length.
func (*File) EmbeddingLength ¶
EmbeddingLength returns the embedding dimension.
func (*File) FeedForwardLength ¶
FeedForwardLength returns the FFN intermediate size.
func (*File) GetTensor ¶
func (f *File) GetTensor(name string) *TensorInfo
GetTensor finds a tensor by name.
func (*File) HeadCountKV ¶
HeadCountKV returns the number of KV heads (for GQA).
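Accessors of this kind typically read standardized metadata keys. The sketch below shows one plausible lookup scheme using the key names from the GGUF specification ("general.architecture" and "<arch>.block_count", "<arch>.attention.head_count_kv", and so on). The helper names are hypothetical, and returning 0 or "" for a missing key is an assumption of this example, not documented package behavior.

```go
package main

import "fmt"

// architecture reads the model architecture from a metadata map.
func architecture(md map[string]interface{}) string {
	s, _ := md["general.architecture"].(string)
	return s
}

// archInt looks up "<arch>.<suffix>" and converts common unsigned
// integer encodings to int. Missing or mistyped keys yield 0 here.
func archInt(md map[string]interface{}, suffix string) int {
	arch := architecture(md)
	if arch == "" {
		return 0
	}
	switch v := md[arch+"."+suffix].(type) {
	case uint32:
		return int(v)
	case uint64:
		return int(v)
	}
	return 0
}

func main() {
	md := map[string]interface{}{
		"general.architecture":          "llama",
		"llama.block_count":             uint32(32),
		"llama.context_length":          uint32(4096),
		"llama.attention.head_count_kv": uint64(8),
	}
	fmt.Println(architecture(md), archInt(md, "block_count"),
		archInt(md, "context_length"), archInt(md, "attention.head_count_kv"))
}
```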
type GGMLType ¶
type GGMLType uint32
GGMLType represents the data type of tensor elements.
const (
	GGMLTypeF32  GGMLType = 0
	GGMLTypeF16  GGMLType = 1
	GGMLTypeQ4_0 GGMLType = 2
	GGMLTypeQ4_1 GGMLType = 3
	// Types 4, 5 are deprecated (Q4_2, Q4_3).
	GGMLTypeQ5_0    GGMLType = 6
	GGMLTypeQ5_1    GGMLType = 7
	GGMLTypeQ8_0    GGMLType = 8
	GGMLTypeQ8_1    GGMLType = 9
	GGMLTypeQ2_K    GGMLType = 10
	GGMLTypeQ3_K    GGMLType = 11
	GGMLTypeQ4_K    GGMLType = 12
	GGMLTypeQ5_K    GGMLType = 13
	GGMLTypeQ6_K    GGMLType = 14
	GGMLTypeQ8_K    GGMLType = 15
	GGMLTypeIQ2_XXS GGMLType = 16
	GGMLTypeIQ2_XS  GGMLType = 17
	GGMLTypeIQ3_XXS GGMLType = 18
	GGMLTypeIQ1_S   GGMLType = 19
	GGMLTypeIQ4_NL  GGMLType = 20
	GGMLTypeIQ3_S   GGMLType = 21
	GGMLTypeIQ2_S   GGMLType = 22
	GGMLTypeIQ4_XS  GGMLType = 23
	GGMLTypeI8      GGMLType = 24
	GGMLTypeI16     GGMLType = 25
	GGMLTypeI32     GGMLType = 26
	GGMLTypeI64     GGMLType = 27
	GGMLTypeF64     GGMLType = 28
	GGMLTypeBF16    GGMLType = 29
)
GGML tensor types (quantization formats). Note: Names use underscores to match GGML specification exactly (e.g., Q4_K, not Q4K).
func (GGMLType) IsQuantized ¶
IsQuantized returns true if the type is a quantized format.
type MetadataKV ¶
MetadataKV represents a key-value pair in the metadata.
type TensorConverter ¶
type TensorConverter struct {
	// contains filtered or unexported fields
}
TensorConverter converts GGUF tensors to Born tensor format.
func NewTensorConverter ¶
func NewTensorConverter(file *File) (*TensorConverter, error)
NewTensorConverter creates a converter for the given GGUF file.
func (*TensorConverter) Convert ¶
func (c *TensorConverter) Convert(name string) ([]float32, []int, error)
Convert loads and dequantizes a single tensor. Returns raw float32 data and shape.
GGUF dimensions are stored in reverse order compared to Born's convention: GGUF [dim0, dim1, dim2] → Born [dim2, dim1, dim0].
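The reversal itself is a small transformation. A minimal sketch, with reverseDims as a hypothetical helper name:

```go
package main

import "fmt"

// reverseDims converts GGUF dimension order into the reversed order
// described above, narrowing uint64 to int for use as a shape.
func reverseDims(ggufDims []uint64) []int {
	out := make([]int, len(ggufDims))
	for i, d := range ggufDims {
		out[len(ggufDims)-1-i] = int(d)
	}
	return out
}

func main() {
	// A GGUF tensor stored as [4096, 32000] becomes [32000, 4096].
	fmt.Println(reverseDims([]uint64{4096, 32000})) // prints [32000 4096]
}
```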
func (*TensorConverter) ConvertAll ¶
func (c *TensorConverter) ConvertAll() (map[string]ConvertedTensor, error)
ConvertAll loads and dequantizes all tensors.
type TensorInfo ¶
type TensorInfo struct {
	Name       string
	NDims      uint32
	Dimensions []uint64
	Type       GGMLType
	Offset     uint64 // Offset from start of tensor data section.
}
TensorInfo contains metadata about a tensor in the file.
func (*TensorInfo) NumElements ¶
func (t *TensorInfo) NumElements() uint64
NumElements returns the total number of elements in the tensor.
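The element count is the product of the dimensions. The sketch below illustrates that; treating a tensor with no dimensions as having one element is an assumption of this example, and numElements is a hypothetical name.

```go
package main

import "fmt"

// numElements multiplies all dimensions together.
// An empty dimension list yields 1 (assumed scalar convention).
func numElements(dims []uint64) uint64 {
	n := uint64(1)
	for _, d := range dims {
		n *= d
	}
	return n
}

func main() {
	fmt.Println(numElements([]uint64{4096, 32000}))
}
```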
func (*TensorInfo) Size ¶
func (t *TensorInfo) Size() uint64
Size returns the size in bytes of the tensor data.
type TensorReader ¶
type TensorReader struct {
	// contains filtered or unexported fields
}
TensorReader provides streaming access to tensor data for large models.
func NewTensorReader ¶
func NewTensorReader(file *File) (*TensorReader, error)
NewTensorReader creates a new tensor reader for the given file. The caller is responsible for closing the reader when done.
func (*TensorReader) ReadTensor ¶
func (r *TensorReader) ReadTensor(name string) ([]byte, error)
ReadTensor reads the raw data of a tensor by name.
func (*TensorReader) ReadTensorInto ¶
func (r *TensorReader) ReadTensorInto(name string, dst []byte) error
ReadTensorInto reads the raw data of a tensor into the provided buffer. The buffer must be large enough to hold the tensor data.
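The buffer contract can be sketched independently of the package. The example below assumes the absolute file position of a tensor is TensorDataOffset plus the tensor's Offset, as the field comments suggest, and it checks the destination size before reading. All names here are hypothetical stand-ins, and an in-memory byte slice replaces the file.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// newMemFile wraps an in-memory image so the sketch needs no disk I/O.
func newMemFile(b []byte) io.ReaderAt { return bytes.NewReader(b) }

// readTensorInto copies size bytes starting at dataStart+relOffset into
// dst. It fails if dst is too small or the source is truncated.
func readTensorInto(r io.ReaderAt, dataStart int64, relOffset, size uint64, dst []byte) error {
	if uint64(len(dst)) < size {
		return fmt.Errorf("buffer too small: need %d bytes, have %d", size, len(dst))
	}
	_, err := r.ReadAt(dst[:size], dataStart+int64(relOffset))
	return err
}

func main() {
	// 8 bytes of stand-in header, then two 7-byte "tensors".
	file := newMemFile([]byte("HEADER..tensorAtensorB"))

	// One buffer reused across reads avoids a fresh allocation per tensor.
	buf := make([]byte, 16)
	for _, off := range []uint64{0, 7} {
		if err := readTensorInto(file, 8, off, 7, buf); err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(string(buf[:7]))
	}
}
```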
type TypeTrait ¶
type TypeTrait struct {
	BlockSize int // Number of elements per block.
	TypeSize  int // Size in bytes per block.
	Quantized bool
}
TypeTrait contains metadata about a GGML type.
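Block size and type size together determine how many bytes a tensor occupies: the element count divided by the block size, times the bytes per block. The sketch below works through that arithmetic with a few well-known GGML layouts (F32, F16, Q4_0, Q8_0). The table and tensorBytes helper are hypothetical, and the package's own trait table is not shown in this document.

```go
package main

import "fmt"

// typeTrait mirrors the documented TypeTrait fields.
type typeTrait struct {
	BlockSize int // elements per block
	TypeSize  int // bytes per block
	Quantized bool
}

// traits lists a few layouts for illustration only.
var traits = map[string]typeTrait{
	"F32":  {BlockSize: 1, TypeSize: 4},
	"F16":  {BlockSize: 1, TypeSize: 2},
	"Q4_0": {BlockSize: 32, TypeSize: 18, Quantized: true},
	"Q8_0": {BlockSize: 32, TypeSize: 34, Quantized: true},
}

// tensorBytes returns the storage size for numElements of type t.
// The element count must be a whole number of blocks.
func tensorBytes(numElements uint64, t typeTrait) (uint64, error) {
	bs := uint64(t.BlockSize)
	if bs == 0 || numElements%bs != 0 {
		return 0, fmt.Errorf("%d elements is not a multiple of block size %d", numElements, t.BlockSize)
	}
	return numElements / bs * uint64(t.TypeSize), nil
}

func main() {
	const elems = 4096 * 4096
	for _, name := range []string{"F32", "F16", "Q8_0", "Q4_0"} {
		n, err := tensorBytes(elems, traits[name])
		fmt.Println(name, n, err)
	}
}
```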
type ValueType ¶
type ValueType uint32
ValueType represents the type of a metadata value.
const (
	ValueTypeUint8   ValueType = 0
	ValueTypeInt8    ValueType = 1
	ValueTypeUint16  ValueType = 2
	ValueTypeInt16   ValueType = 3
	ValueTypeUint32  ValueType = 4
	ValueTypeInt32   ValueType = 5
	ValueTypeFloat32 ValueType = 6
	ValueTypeBool    ValueType = 7
	ValueTypeString  ValueType = 8
	ValueTypeArray   ValueType = 9
	ValueTypeUint64  ValueType = 10
	ValueTypeInt64   ValueType = 11
	ValueTypeFloat64 ValueType = 12
)
Metadata value types as defined in GGUF specification.
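To show how these tags are used on disk, the sketch below decodes one metadata key-value pair. It assumes the little-endian layout from the GGUF specification: a string is a uint64 length followed by that many bytes, and each pair is a key string, a uint32 value type, and then the value. Only the uint32 and string value types are handled; the function names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	valueTypeUint32 uint32 = 4
	valueTypeString uint32 = 8
)

// readString decodes a length-prefixed string and returns the remainder.
func readString(b []byte) (string, []byte, error) {
	if len(b) < 8 {
		return "", nil, errors.New("truncated string length")
	}
	n := binary.LittleEndian.Uint64(b)
	if n > uint64(len(b)-8) {
		return "", nil, errors.New("string length exceeds buffer")
	}
	return string(b[8 : 8+n]), b[8+n:], nil
}

// decodeKV decodes one key-value pair for the two handled value types.
func decodeKV(b []byte) (string, interface{}, error) {
	key, rest, err := readString(b)
	if err != nil {
		return "", nil, err
	}
	if len(rest) < 4 {
		return "", nil, errors.New("truncated value type")
	}
	vt := binary.LittleEndian.Uint32(rest)
	rest = rest[4:]
	switch vt {
	case valueTypeUint32:
		if len(rest) < 4 {
			return "", nil, errors.New("truncated uint32 value")
		}
		return key, binary.LittleEndian.Uint32(rest), nil
	case valueTypeString:
		s, _, err := readString(rest)
		if err != nil {
			return "", nil, err
		}
		return key, s, nil
	}
	return "", nil, fmt.Errorf("value type %d not handled in this sketch", vt)
}

func main() {
	raw := []byte{
		1, 0, 0, 0, 0, 0, 0, 0, 'a', // key "a"
		4, 0, 0, 0, // value type: uint32
		32, 0, 0, 0, // value: 32
	}
	fmt.Println(decodeKV(raw))
}
```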