local

package
v0.1.1 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 12, 2026 License: MIT Imports: 5 Imported by: 0

Documentation

Overview

Package local provides implementations for local embedding model providers. It supports various model formats and includes a tokenizer for text preprocessing.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Config

type Config struct {
	Model        Model // Local embedding model implementation
	Dimension    int   // Embedding dimension
	MaxBatchSize int   // Maximum batch size for embedding generation
}

Config contains configuration parameters for the local embedding provider.

type Model

type Model interface {
	// Run performs inference on the given inputs and returns the model outputs.
	Run(inputs map[string]interface{}) (map[string]interface{}, error)

	// Close releases any resources associated with the model.
	Close() error
}

Model defines the interface for local embedding models. Implementations should handle model loading, inference, and resource cleanup.
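As an illustration of what an implementation of this interface might look like, here is a minimal stub. It is a sketch only: the type stubModel, the input key "input_ids", and the output key "embeddings" are assumptions made for this example and are not documented by the package. The Model interface is redeclared locally so the snippet compiles on its own.

```go
package main

import "errors"

// Model mirrors the interface documented above so the sketch is self-contained.
type Model interface {
	Run(inputs map[string]interface{}) (map[string]interface{}, error)
	Close() error
}

// stubModel is a hypothetical implementation that returns one zero vector
// of length dim per input row. It performs no real inference.
type stubModel struct {
	dim    int
	closed bool
}

// Run expects inputs["input_ids"] to hold a [][]int64 and returns
// outputs["embeddings"] as a [][]float32. Both keys and both types are
// assumptions for this sketch, not part of the documented contract.
func (m *stubModel) Run(inputs map[string]interface{}) (map[string]interface{}, error) {
	if m.closed {
		return nil, errors.New("model is closed")
	}
	ids, ok := inputs["input_ids"].([][]int64)
	if !ok {
		return nil, errors.New(`missing or invalid "input_ids"`)
	}
	out := make([][]float32, len(ids))
	for i := range out {
		out[i] = make([]float32, m.dim)
	}
	return map[string]interface{}{"embeddings": out}, nil
}

// Close marks the stub as released; a real model would free its resources here.
func (m *stubModel) Close() error {
	m.closed = true
	return nil
}

// Compile-time check that *stubModel satisfies Model.
var _ Model = (*stubModel)(nil)
```

A stub of this kind can be useful in tests, where loading a real model would be slow or unavailable.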

type Provider

type Provider struct {
	// contains filtered or unexported fields
}

Provider implements the embedding.Provider interface for local models. It handles tokenization, batch processing, and model inference.

func New

func New(config Config) (*Provider, error)

New creates a new local embedding provider with the given configuration.

Parameters:
  - config: Configuration parameters for the provider

Returns:
  - *Provider: A new local embedding provider instance
  - error: Error if configuration is invalid or initialization fails
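The documentation does not say which configurations New treats as invalid. The sketch below shows one plausible set of checks, assuming that a nil Model and non-positive Dimension or MaxBatchSize values are rejected. The helper validateConfig and the type nopModel are hypothetical, and Config and Model are redeclared locally so the snippet compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// Model and Config mirror the documented types for a self-contained sketch.
type Model interface {
	Run(inputs map[string]interface{}) (map[string]interface{}, error)
	Close() error
}

type Config struct {
	Model        Model // Local embedding model implementation
	Dimension    int   // Embedding dimension
	MaxBatchSize int   // Maximum batch size for embedding generation
}

// nopModel is a hypothetical placeholder used only to build a non-nil Config.Model.
type nopModel struct{}

func (nopModel) Run(map[string]interface{}) (map[string]interface{}, error) { return nil, nil }
func (nopModel) Close() error                                               { return nil }

// validateConfig is a hypothetical helper. The rules below are assumptions
// about what "invalid" could mean, not the package's actual checks.
func validateConfig(c Config) error {
	if c.Model == nil {
		return errors.New("config: Model must not be nil")
	}
	if c.Dimension <= 0 {
		return fmt.Errorf("config: Dimension must be positive, got %d", c.Dimension)
	}
	if c.MaxBatchSize <= 0 {
		return fmt.Errorf("config: MaxBatchSize must be positive, got %d", c.MaxBatchSize)
	}
	return nil
}
```

Whatever the real rules are, callers should check the error returned by New before using the provider.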

func (*Provider) Dimension

func (p *Provider) Dimension() int

Dimension returns the embedding dimension.

func (*Provider) Embed

func (p *Provider) Embed(ctx context.Context, texts []string) ([][]float32, error)

Embed generates embeddings for the given texts.
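The Provider description mentions batch processing, and Config exposes MaxBatchSize, but the batching strategy itself is not documented. One straightforward approach, shown here as a sketch, is to cut the input into consecutive chunks of at most MaxBatchSize texts and run the model once per chunk. The helper splitBatches is hypothetical, as is the choice to treat a non-positive size as "no limit".

```go
package main

// splitBatches is a hypothetical helper that cuts texts into consecutive
// chunks of at most size elements, preserving input order.
func splitBatches(texts []string, size int) [][]string {
	if size <= 0 {
		// Assumption for this sketch: a non-positive size means a single batch.
		size = len(texts)
	}
	var batches [][]string
	for start := 0; start < len(texts); start += size {
		end := start + size
		if end > len(texts) {
			end = len(texts)
		}
		// Each batch is a sub-slice of texts; no strings are copied.
		batches = append(batches, texts[start:end])
	}
	return batches
}
```

With five texts and a size of 2, this yields batches of 2, 2, and 1 texts. Because order is preserved, per-batch results can be appended in sequence to line up with the original input.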

type Tokenizer

type Tokenizer struct {
	// contains filtered or unexported fields
}

Tokenizer handles text tokenization for embedding models. It converts raw text into token IDs that can be processed by embedding models.

func NewTokenizer

func NewTokenizer() (*Tokenizer, error)

NewTokenizer creates a new tokenizer instance.

Note: This is a simple whitespace tokenizer for demonstration purposes. A production implementation would use a subword tokenizer such as BPE or WordPiece.

Returns:
  - *Tokenizer: A new tokenizer instance
  - error: Error if initialization fails

func (*Tokenizer) TokenizeBatch

func (t *Tokenizer) TokenizeBatch(texts []string) ([][]int64, [][]int64, error)

TokenizeBatch tokenizes a batch of texts.
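The meaning of the two [][]int64 return values is not documented. A common convention for embedding models is to return token IDs alongside attention masks, and the sketch below assumes that reading. It is a hypothetical whitespace tokenizer, not the package's implementation: the type whitespaceTokenizer, the on-the-fly vocabulary, and the use of ID 0 for padding are all choices made for this example.

```go
package main

import "strings"

// whitespaceTokenizer is a hypothetical stand-in for illustration only.
type whitespaceTokenizer struct {
	vocab map[string]int64 // word -> ID; ID 0 is reserved for padding
}

func newWhitespaceTokenizer() *whitespaceTokenizer {
	return &whitespaceTokenizer{vocab: make(map[string]int64)}
}

// tokenizeBatch splits each text on whitespace, assigns IDs in order of
// first appearance (starting at 1), and pads every row to the longest row.
// It returns the token IDs and an attention mask (1 = token, 0 = padding).
func (t *whitespaceTokenizer) tokenizeBatch(texts []string) (ids, mask [][]int64) {
	words := make([][]string, len(texts))
	maxLen := 0
	for i, text := range texts {
		words[i] = strings.Fields(text)
		if len(words[i]) > maxLen {
			maxLen = len(words[i])
		}
	}

	ids = make([][]int64, len(texts))
	mask = make([][]int64, len(texts))
	for i, row := range words {
		ids[i] = make([]int64, maxLen)  // zero value doubles as the padding ID
		mask[i] = make([]int64, maxLen) // zero value marks padding positions
		for j, w := range row {
			id, ok := t.vocab[w]
			if !ok {
				id = int64(len(t.vocab) + 1)
				t.vocab[w] = id
			}
			ids[i][j] = id
			mask[i][j] = 1
		}
	}
	return ids, mask
}
```

Padding to a common length keeps the batch rectangular, which is what most inference runtimes expect; the mask lets the model ignore the padded positions.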
