Documentation
Constants
This section is empty.
Variables
This section is empty.
Functions
func EnsureModelDownloaded
EnsureModelDownloaded downloads the model files into the local cache if they are missing. Sparse vector generation requires only the tokenizer files.
func GenerateSparseVector
GenerateSparseVector builds a normalized sparse vector from text using the registered provider. If no provider is registered, it uses the default BoWProvider for backward compatibility.
func RegisterProvider
func RegisterProvider(p Provider)
RegisterProvider registers a sparse vector provider, replacing the default.
Types
type BoWProvider
type BoWProvider struct {
// contains filtered or unexported fields
}
BoWProvider implements the Provider interface using a Bag-of-Words approach with a pretrained tokenizer.
func NewBoWProvider
func NewBoWProvider() *BoWProvider
NewBoWProvider creates a new Bag-of-Words sparse provider.
func (*BoWProvider) GenerateSparseVector
func (p *BoWProvider) GenerateSparseVector(ctx context.Context, text string) (*schema.SparseVector, error)
GenerateSparseVector builds a normalized Bag-of-Words sparse vector from text. Special tokens (PAD, CLS, SEP) are filtered out to reduce noise.