Documentation ¶
Overview ¶
Package local provides implementations for local embedding model providers. It supports various model formats and includes a tokenizer for text preprocessing.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct {
	Model        Model // Local embedding model implementation
	Dimension    int   // Embedding dimension
	MaxBatchSize int   // Maximum batch size for embedding generation
}
Config contains configuration parameters for the local embedding provider.
type Model ¶
type Model interface {
	// Run performs inference on the given inputs and returns the model outputs.
	Run(inputs map[string]interface{}) (map[string]interface{}, error)
	// Close releases any resources associated with the model.
	Close() error
}
Model defines the interface for local embedding models. Implementations should handle model loading, inference, and resource cleanup.
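As an illustration of the interface contract, the following is a minimal sketch of a stub implementation that could stand in for a real model in tests. The `stubModel` type, the `"input_ids"` and `"embeddings"` map keys, and the `[][]int64` / `[][]float32` value types are assumptions made for this example; the package does not document which keys or types a real model expects.

```go
package main

import "errors"

// Model mirrors the interface documented above.
type Model interface {
	Run(inputs map[string]interface{}) (map[string]interface{}, error)
	Close() error
}

// stubModel is a hypothetical in-memory Model used only for illustration.
// It returns one zero vector of length dim per input sequence.
type stubModel struct {
	dim    int
	closed bool
}

// Run expects inputs["input_ids"] to hold a [][]int64. Both the key name
// and the value type are assumptions, not part of the documented contract.
func (m *stubModel) Run(inputs map[string]interface{}) (map[string]interface{}, error) {
	if m.closed {
		return nil, errors.New("stubModel: model is closed")
	}
	ids, ok := inputs["input_ids"].([][]int64)
	if !ok {
		return nil, errors.New("stubModel: input_ids must be [][]int64")
	}
	out := make([][]float32, len(ids))
	for i := range out {
		out[i] = make([]float32, m.dim)
	}
	return map[string]interface{}{"embeddings": out}, nil
}

// Close marks the model as released. A real implementation would free
// native handles or memory-mapped weights here.
func (m *stubModel) Close() error {
	m.closed = true
	return nil
}

// Compile-time check that stubModel satisfies Model.
var _ Model = (*stubModel)(nil)
```

Because `Run` takes and returns untyped maps, an implementation probably needs to type-check its inputs as above and return a descriptive error rather than panic on an unexpected value.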
type Provider ¶
type Provider struct {
	// contains filtered or unexported fields
}
Provider implements the embedding.Provider interface for local models. It handles tokenization, batch processing, and model inference.
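The batch processing mentioned here is not specified further. One plausible reading is that input texts are split into chunks no larger than Config.MaxBatchSize before inference. The sketch below shows that idea only; `splitBatches` is a hypothetical helper, not a function exported by this package, and treating a non-positive limit as "no limit" is an assumption.

```go
package main

// splitBatches partitions texts into consecutive slices of at most
// maxBatchSize elements. A non-positive maxBatchSize is treated as
// "no limit" (an assumption made for this sketch).
func splitBatches(texts []string, maxBatchSize int) [][]string {
	if len(texts) == 0 {
		return nil
	}
	if maxBatchSize <= 0 {
		return [][]string{texts}
	}
	// Ceiling division gives the number of batches up front.
	batches := make([][]string, 0, (len(texts)+maxBatchSize-1)/maxBatchSize)
	for start := 0; start < len(texts); start += maxBatchSize {
		end := start + maxBatchSize
		if end > len(texts) {
			end = len(texts)
		}
		// Sub-slices share the backing array with texts; no copying occurs.
		batches = append(batches, texts[start:end])
	}
	return batches
}
```

Each batch would then be tokenized and passed to Model.Run in turn, with the per-batch outputs concatenated in input order.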
func New ¶
New creates a new local embedding provider with the given configuration.
Parameters:
  - config: Configuration parameters for the provider
Returns:
  - *Provider: A new local embedding provider instance
  - error: Error if configuration is invalid or initialization fails
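The documentation does not say which configurations count as invalid. As a rough sketch, a constructor of this kind might reject a missing model and non-positive sizes. The `validateConfig` function, the `noopModel` type, and every rule below are assumptions for illustration; the actual checks performed by New may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Model and Config mirror the declarations documented above.
type Model interface {
	Run(inputs map[string]interface{}) (map[string]interface{}, error)
	Close() error
}

type Config struct {
	Model        Model
	Dimension    int
	MaxBatchSize int
}

// noopModel is a hypothetical placeholder so a Config can be populated.
type noopModel struct{}

func (noopModel) Run(map[string]interface{}) (map[string]interface{}, error) {
	return map[string]interface{}{}, nil
}

func (noopModel) Close() error { return nil }

// validateConfig applies assumed validity rules: a non-nil config,
// a non-nil model, and strictly positive Dimension and MaxBatchSize.
func validateConfig(c *Config) error {
	if c == nil {
		return errors.New("local: config is nil")
	}
	if c.Model == nil {
		return errors.New("local: config.Model is nil")
	}
	if c.Dimension <= 0 {
		return fmt.Errorf("local: config.Dimension must be positive, got %d", c.Dimension)
	}
	if c.MaxBatchSize <= 0 {
		return fmt.Errorf("local: config.MaxBatchSize must be positive, got %d", c.MaxBatchSize)
	}
	return nil
}
```

Callers should check the error returned by New before using the provider, whatever rules it actually enforces.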
type Tokenizer ¶
type Tokenizer struct {
	// contains filtered or unexported fields
}
Tokenizer handles text tokenization for embedding models. It converts raw text into token IDs that can be processed by embedding models.
func NewTokenizer ¶
NewTokenizer creates a new tokenizer instance.
Note: This is a simple whitespace tokenizer for demonstration purposes. In a production implementation, you would use a proper tokenizer like BPE or WordPiece.
Returns:
  - *Tokenizer: A new tokenizer instance
  - error: Error if initialization fails
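To make the idea of a whitespace tokenizer concrete, here is a minimal sketch of one. It is not the package's Tokenizer, whose fields and methods are unexported or undocumented here. The `whitespaceTokenizer` type, its `Encode` method, the lowercasing step, and the choice to assign IDs from 1 upward (leaving 0 free, for example for padding) are all assumptions.

```go
package main

import "strings"

// whitespaceTokenizer is a hypothetical stand-in for a simple tokenizer.
// It grows its vocabulary as new tokens are seen.
type whitespaceTokenizer struct {
	vocab map[string]int64
}

func newWhitespaceTokenizer() *whitespaceTokenizer {
	return &whitespaceTokenizer{vocab: make(map[string]int64)}
}

// Encode lowercases text, splits it on runs of whitespace, and returns one
// ID per token. Unseen tokens receive the next free ID, starting at 1.
func (t *whitespaceTokenizer) Encode(text string) []int64 {
	fields := strings.Fields(strings.ToLower(text))
	ids := make([]int64, 0, len(fields))
	for _, f := range fields {
		id, ok := t.vocab[f]
		if !ok {
			id = int64(len(t.vocab) + 1)
			t.vocab[f] = id
		}
		ids = append(ids, id)
	}
	return ids
}
```

A vocabulary built on the fly like this cannot match the fixed vocabulary a pretrained embedding model was trained with, which is one reason a subword tokenizer such as BPE or WordPiece, loaded from the model's own vocabulary file, is the usual choice in production.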