Documentation
Overview ¶
Package embedding provides an ONNX-based embedding engine for semantic matching. It runs a MiniLM model (all-MiniLM-L6-v2 by default) to generate 384-dimensional embeddings from text.
Index ¶
- Constants
- type Config
- type Engine
- func (e *Engine) BatchEmbed(texts []string) ([][]float32, error)
- func (e *Engine) CosineSimilarity(a, b []float32) float64
- func (e *Engine) Embed(text string) ([]float32, error)
- func (e *Engine) GetDimension() int
- func (e *Engine) Initialize(sharedLibPath string) error
- func (e *Engine) IsEnabled() bool
- func (e *Engine) Shutdown() error
- type ModelLocator
- type SimpleTokenizer
- type TokenizedInput
Constants ¶
const (
// DefaultModelName is the default embedding model to use
DefaultModelName = "all-MiniLM-L6-v2"
// EmbeddingDimension is the output dimension of the MiniLM model
EmbeddingDimension = 384
// MaxSequenceLength is the maximum input sequence length
MaxSequenceLength = 256
)
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct {
// ModelPath is the path to the ONNX model file
ModelPath string
// VocabPath is the path to the vocabulary file
VocabPath string
// SharedLibraryPath is the path to the ONNX runtime shared library
SharedLibraryPath string
}
Config holds configuration for the embedding engine.
type Engine ¶
type Engine struct {
// contains filtered or unexported fields
}
Engine provides embedding inference using ONNX runtime. It loads a MiniLM model and provides methods for computing embeddings.
func NewEngine ¶
NewEngine creates a new embedding engine with the given configuration. The engine is not initialized until Initialize() is called.
Parameters:
- cfg: Configuration for the engine
Returns:
- *Engine: A new engine instance
- error: Any error encountered during creation
func (*Engine) BatchEmbed ¶
func (e *Engine) BatchEmbed(texts []string) ([][]float32, error)
BatchEmbed computes embeddings for multiple texts in one call, which is more efficient than calling Embed() once per text.
Parameters:
- texts: The input texts to embed
Returns:
- [][]float32: The embedding vectors for each text
- error: Any error encountered during embedding
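A typical use of batch embeddings is ranking candidate texts against a query. The sketch below assumes the vectors have already been produced (for example by BatchEmbed) and are unit length, so a dot product equals cosine similarity. The names match, dot, and rank are hypothetical helpers, and the 3-dimensional vectors are hand-written stand-ins for real 384-dimensional output.

```go
package main

import (
	"fmt"
	"sort"
)

// match pairs a candidate's index with its similarity score.
type match struct {
	Index int
	Score float64
}

// dot returns the dot product of two equal-length vectors.
// For unit-length vectors this equals the cosine similarity.
func dot(a, b []float32) float64 {
	var s float64
	for i := range a {
		s += float64(a[i]) * float64(b[i])
	}
	return s
}

// rank scores every candidate against the query and sorts best-first.
func rank(query []float32, candidates [][]float32) []match {
	out := make([]match, len(candidates))
	for i, c := range candidates {
		out[i] = match{Index: i, Score: dot(query, c)}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	// Hand-written unit vectors stand in for real embeddings.
	query := []float32{1, 0, 0}
	candidates := [][]float32{
		{0, 1, 0},     // unrelated
		{0.6, 0.8, 0}, // partly related
		{1, 0, 0},     // identical
	}
	for _, m := range rank(query, candidates) {
		fmt.Printf("candidate %d: %.2f\n", m.Index, m.Score)
	}
}
```

A stable sort keeps candidates with equal scores in their input order, which makes results reproducible.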
func (*Engine) CosineSimilarity ¶
func (e *Engine) CosineSimilarity(a, b []float32) float64
CosineSimilarity computes the cosine similarity between two embedding vectors. Both vectors should be L2-normalized (unit length) for accurate results.
Parameters:
- a: First embedding vector
- b: Second embedding vector
Returns:
- float64: Cosine similarity score (-1.0 to 1.0)
func (*Engine) Embed ¶
func (e *Engine) Embed(text string) ([]float32, error)
Embed computes the embedding vector for a single text. The text is tokenized and passed through the model.
Parameters:
- text: The input text to embed
Returns:
- []float32: The embedding vector (384 dimensions)
- error: Any error encountered during embedding
func (*Engine) GetDimension ¶
func (e *Engine) GetDimension() int
GetDimension returns the embedding output dimension.
Returns:
- int: The embedding dimension (384 for MiniLM)
func (*Engine) Initialize ¶
func (e *Engine) Initialize(sharedLibPath string) error
Initialize loads the ONNX model and prepares the engine for inference. This must be called before using Embed() or BatchEmbed().
Parameters:
- sharedLibPath: Path to the ONNX runtime shared library
Returns:
- error: Any error encountered during initialization
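The call order (Initialize, then Embed, then Shutdown) can be sketched against a local interface that mirrors the method signatures listed in the index. The real package is not imported here: embedder, fakeEngine, and embedOnce are hypothetical, and treating IsEnabled as "initialized and ready" is an assumption, since the page does not describe that method.

```go
package main

import (
	"errors"
	"fmt"
)

// embedder mirrors the Engine method signatures from the index.
// It is a local stand-in so the sketch compiles without the real package.
type embedder interface {
	Initialize(sharedLibPath string) error
	IsEnabled() bool
	Embed(text string) ([]float32, error)
	Shutdown() error
}

// fakeEngine is a hypothetical stub, not the real Engine.
type fakeEngine struct{ ready bool }

func (f *fakeEngine) Initialize(sharedLibPath string) error {
	if sharedLibPath == "" {
		return errors.New("shared library path is empty")
	}
	f.ready = true
	return nil
}

// IsEnabled reports readiness here; the real semantics are not documented.
func (f *fakeEngine) IsEnabled() bool { return f.ready }

func (f *fakeEngine) Embed(text string) ([]float32, error) {
	if !f.ready {
		return nil, errors.New("engine not initialized")
	}
	return make([]float32, 384), nil // zero vector placeholder
}

func (f *fakeEngine) Shutdown() error {
	f.ready = false
	return nil
}

// embedOnce initializes the engine, embeds one text, and always shuts down.
func embedOnce(e embedder, libPath, text string) (vec []float32, err error) {
	if err = e.Initialize(libPath); err != nil {
		return nil, fmt.Errorf("initialize: %w", err)
	}
	defer func() {
		// Report a shutdown failure only if nothing else went wrong.
		if cerr := e.Shutdown(); cerr != nil && err == nil {
			err = cerr
		}
	}()
	return e.Embed(text)
}

func main() {
	vec, err := embedOnce(&fakeEngine{}, "/path/to/libonnxruntime.so", "hello world")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(vec))
}
```

In a long-running service the engine would more likely be initialized once and reused across calls; the one-shot helper only illustrates the ordering.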
type ModelLocator ¶
type ModelLocator struct {
// BaseDir is the base directory for model storage
BaseDir string
}
ModelLocator helps find model files and ONNX runtime libraries.
func NewModelLocator ¶
func NewModelLocator() *ModelLocator
NewModelLocator creates a new model locator with default paths.
func (*ModelLocator) EnsureModelDir ¶
func (l *ModelLocator) EnsureModelDir(modelName string) error
EnsureModelDir creates the model directory if it doesn't exist.
Parameters:
- modelName: Name of the model
Returns:
- error: Any error encountered
func (*ModelLocator) GetModelPath ¶
func (l *ModelLocator) GetModelPath(modelName string) string
GetModelPath returns the path to the ONNX model file.
Parameters:
- modelName: Name of the model (e.g., "all-MiniLM-L6-v2")
Returns:
- string: Full path to the model file
func (*ModelLocator) GetSharedLibraryPath ¶
func (l *ModelLocator) GetSharedLibraryPath() string
GetSharedLibraryPath returns the path to the ONNX runtime shared library. It checks common installation locations based on the operating system.
Returns:
- string: Path to the shared library, or empty string if not found
func (*ModelLocator) GetVocabPath ¶
func (l *ModelLocator) GetVocabPath(modelName string) string
GetVocabPath returns the path to the vocabulary file.
Parameters:
- modelName: Name of the model
Returns:
- string: Full path to the vocabulary file
func (*ModelLocator) ModelExists ¶
func (l *ModelLocator) ModelExists(modelName string) bool
ModelExists checks if the model files exist.
Parameters:
- modelName: Name of the model
Returns:
- bool: true if model files exist
type SimpleTokenizer ¶
type SimpleTokenizer struct {
// contains filtered or unexported fields
}
SimpleTokenizer implements a basic WordPiece tokenizer for BERT-style models. This is a simplified implementation that handles common cases.
func NewSimpleTokenizer ¶
func NewSimpleTokenizer(vocabPath string) (*SimpleTokenizer, error)
NewSimpleTokenizer creates a new tokenizer from a vocabulary file. The vocabulary file should have one token per line.
Parameters:
- vocabPath: Path to the vocabulary file
Returns:
- *SimpleTokenizer: A new tokenizer instance
- error: Any error encountered during loading
func (*SimpleTokenizer) GetVocabSize ¶
func (t *SimpleTokenizer) GetVocabSize() int
GetVocabSize returns the size of the vocabulary.
func (*SimpleTokenizer) Tokenize ¶
func (t *SimpleTokenizer) Tokenize(text string, maxLength int) (*TokenizedInput, error)
Tokenize converts text into token IDs for model input. It applies basic preprocessing and WordPiece tokenization.
Parameters:
- text: The input text to tokenize
- maxLength: Maximum sequence length (including special tokens)
Returns:
- *TokenizedInput: The tokenized output
- error: Any error encountered during tokenization
type TokenizedInput ¶
type TokenizedInput struct {
// InputIDs are the token IDs
InputIDs []int64
// AttentionMask indicates which tokens are real (1) vs padding (0)
AttentionMask []int64
// TokenTypeIDs are segment IDs (0 for first segment)
TokenTypeIDs []int64
}
TokenizedInput represents the tokenized output ready for model inference.